netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/4] TC: refactor act_mirred packets re-injection
@ 2018-07-19 13:02 Paolo Abeni
  2018-07-19 13:02 ` [PATCH net-next 1/4] tc/act: user space can't use TC_ACT_REDIRECT directly Paolo Abeni
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Paolo Abeni @ 2018-07-19 13:02 UTC (permalink / raw)
  To: netdev
  Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko, Daniel Borkmann,
	Marcelo Ricardo Leitner, Eyal Birger

This series is aimed at improving the act_mirred redirect performances.
Such action is used by OVS to represent TC S/W flows, and it's current largest
bottle-neck is the need for a skb_clone() for each packet.

The first 2 patches introduce some cleanup and safeguards to allow extending 
tca_result: we will use it to store RCU protected redirect information.
Then a new tca_action value is introduced: TC_ACT_MIRRED, similar to
TC_ACT_REDIRECT, but preserving the mirred semantic. The last patch exploits
the introduced infrastructure in the act_mirred action, to avoid a skb_clone,
when possible.

Overall this the above gives a ~10% performance improvement in forwarding tput,
when using the TC S/W datapath.

v1 -> v2:
 - preserve the rcu lock in act_bpf
 - add and use a new action value to reinject the packets, preserving the mirred
   semantic

Paolo Abeni (4):
  tc/act: user space can't use TC_ACT_REDIRECT directly
  tc/act: remove unneeded RCU lock in action callback
  net/tc: introduce TC_ACT_MIRRED.
  act_mirred: use ACT_MIRRED when possible

 include/net/act_api.h        |  2 +-
 include/net/sch_generic.h    | 21 +++++++++++++++++++++
 include/uapi/linux/pkt_cls.h |  2 ++
 net/core/dev.c               |  4 ++++
 net/sched/act_api.c          |  7 +++++++
 net/sched/act_csum.c         | 12 +++---------
 net/sched/act_ife.c          |  5 +----
 net/sched/act_mirred.c       | 35 +++++++++++++++++++++++------------
 net/sched/act_sample.c       |  4 +---
 net/sched/act_skbedit.c      | 10 +++-------
 net/sched/act_skbmod.c       | 21 +++++++++------------
 net/sched/act_tunnel_key.c   |  6 +-----
 net/sched/act_vlan.c         | 19 +++++++------------
 13 files changed, 83 insertions(+), 65 deletions(-)

-- 
2.17.1

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH net-next 1/4] tc/act: user space can't use TC_ACT_REDIRECT directly
  2018-07-19 13:02 [PATCH net-next 0/4] TC: refactor act_mirred packets re-injection Paolo Abeni
@ 2018-07-19 13:02 ` Paolo Abeni
  2018-07-19 13:02 ` [PATCH net-next 2/4] tc/act: remove unneeded RCU lock in action callback Paolo Abeni
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Paolo Abeni @ 2018-07-19 13:02 UTC (permalink / raw)
  To: netdev
  Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko, Daniel Borkmann,
	Marcelo Ricardo Leitner, Eyal Birger

Only cls_bpf and act_bpf can safely use such value. If a generic
action is configured by user space to return TC_ACT_REDIRECT,
the usually visible behavior is passing the skb up the stack - as
for unknown action, but, with complex configuration, more random
results can be obtained.

This patch forcefully converts TC_ACT_REDIRECT to TC_ACT_LAST + 1
at action init time, making the kernel behavior more consistent.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/uapi/linux/pkt_cls.h | 1 +
 net/sched/act_api.c          | 5 +++++
 2 files changed, 6 insertions(+)

diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index c4262d911596..7cdd62b51106 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -45,6 +45,7 @@ enum {
 				   * the skb and act like everything
 				   * is alright.
 				   */
+#define TC_ACT_LAST		TC_ACT_TRAP
 
 /* There is a special kind of actions called "extended actions",
  * which need a value parameter. These have a local opcode located in
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 148a89ab789b..f6438f246dab 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -895,6 +895,11 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 		}
 	}
 
+	if (a->tcfa_action == TC_ACT_REDIRECT) {
+		net_warn_ratelimited("TC_ACT_REDIRECT can't be used directly");
+		a->tcfa_action = TC_ACT_LAST + 1;
+	}
+
 	return a;
 
 err_mod:
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH net-next 2/4] tc/act: remove unneeded RCU lock in action callback
  2018-07-19 13:02 [PATCH net-next 0/4] TC: refactor act_mirred packets re-injection Paolo Abeni
  2018-07-19 13:02 ` [PATCH net-next 1/4] tc/act: user space can't use TC_ACT_REDIRECT directly Paolo Abeni
@ 2018-07-19 13:02 ` Paolo Abeni
  2018-07-19 13:02 ` [PATCH net-next 3/4] net/tc: introduce TC_ACT_MIRRED Paolo Abeni
  2018-07-19 13:02 ` [PATCH net-next 4/4] act_mirred: use ACT_MIRRED when possible Paolo Abeni
  3 siblings, 0 replies; 13+ messages in thread
From: Paolo Abeni @ 2018-07-19 13:02 UTC (permalink / raw)
  To: netdev
  Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko, Daniel Borkmann,
	Marcelo Ricardo Leitner, Eyal Birger

Each lockless action currently does its own RCU locking in ->act().
This is allows using plain RCU accessor, even if the context
is really RCU BH.

This change drops the per action RCU lock, replace the accessors
with _bh variant, cleans up a bit the surronding code and documents
the RCU status in the relevant header.
No functional nor performance change is intended.

The goal of this patch is clarifying that the RCU critical section
used by the tc actions extends up to the classifier's caller.

v1 -> v2:
 - preserve rcu lock in act_bpf: it's needed by eBPF helpers,
   as pointed out by Daniel

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/net/act_api.h      |  2 +-
 include/net/sch_generic.h  |  2 ++
 net/sched/act_csum.c       | 12 +++---------
 net/sched/act_ife.c        |  5 +----
 net/sched/act_mirred.c     |  4 +---
 net/sched/act_sample.c     |  4 +---
 net/sched/act_skbedit.c    | 10 +++-------
 net/sched/act_skbmod.c     | 21 +++++++++------------
 net/sched/act_tunnel_key.c |  6 +-----
 net/sched/act_vlan.c       | 19 +++++++------------
 10 files changed, 29 insertions(+), 56 deletions(-)

diff --git a/include/net/act_api.h b/include/net/act_api.h
index 683ce41053d9..8c9bc02d05e1 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -85,7 +85,7 @@ struct tc_action_ops {
 	size_t	size;
 	struct module		*owner;
 	int     (*act)(struct sk_buff *, const struct tc_action *,
-		       struct tcf_result *);
+		       struct tcf_result *); /* called under RCU BH lock*/
 	int     (*dump)(struct sk_buff *, struct tc_action *, int, int);
 	void	(*cleanup)(struct tc_action *);
 	int     (*lookup)(struct net *net, struct tc_action **a, u32 index,
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 7432100027b7..056dc1083aa3 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -275,6 +275,8 @@ struct tcf_proto {
 	/* Fast access part */
 	struct tcf_proto __rcu	*next;
 	void __rcu		*root;
+
+	/* called under RCU BH lock*/
 	int			(*classify)(struct sk_buff *,
 					    const struct tcf_proto *,
 					    struct tcf_result *);
diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
index bd232d3bd022..4f092fbd1e20 100644
--- a/net/sched/act_csum.c
+++ b/net/sched/act_csum.c
@@ -561,15 +561,14 @@ static int tcf_csum(struct sk_buff *skb, const struct tc_action *a,
 	u32 update_flags;
 	int action;
 
-	rcu_read_lock();
-	params = rcu_dereference(p->params);
+	params = rcu_dereference_bh(p->params);
 
 	tcf_lastuse_update(&p->tcf_tm);
 	bstats_cpu_update(this_cpu_ptr(p->common.cpu_bstats), skb);
 
 	action = params->action;
 	if (unlikely(action == TC_ACT_SHOT))
-		goto drop_stats;
+		goto drop;
 
 	update_flags = params->update_flags;
 	switch (tc_skb_protocol(skb)) {
@@ -583,16 +582,11 @@ static int tcf_csum(struct sk_buff *skb, const struct tc_action *a,
 		break;
 	}
 
-unlock:
-	rcu_read_unlock();
 	return action;
 
 drop:
-	action = TC_ACT_SHOT;
-
-drop_stats:
 	qstats_drop_inc(this_cpu_ptr(p->common.cpu_qstats));
-	goto unlock;
+	return TC_ACT_SHOT;
 }
 
 static int tcf_csum_dump(struct sk_buff *skb, struct tc_action *a, int bind,
diff --git a/net/sched/act_ife.c b/net/sched/act_ife.c
index 3d6e265758c0..df4060e32d43 100644
--- a/net/sched/act_ife.c
+++ b/net/sched/act_ife.c
@@ -820,14 +820,11 @@ static int tcf_ife_act(struct sk_buff *skb, const struct tc_action *a,
 	struct tcf_ife_params *p;
 	int ret;
 
-	rcu_read_lock();
-	p = rcu_dereference(ife->params);
+	p = rcu_dereference_bh(ife->params);
 	if (p->flags & IFE_ENCODE) {
 		ret = tcf_ife_encode(skb, a, res, p);
-		rcu_read_unlock();
 		return ret;
 	}
-	rcu_read_unlock();
 
 	return tcf_ife_decode(skb, a, res);
 }
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index 6afd89a36c69..eeb335f03102 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -181,11 +181,10 @@ static int tcf_mirred(struct sk_buff *skb, const struct tc_action *a,
 	tcf_lastuse_update(&m->tcf_tm);
 	bstats_cpu_update(this_cpu_ptr(m->common.cpu_bstats), skb);
 
-	rcu_read_lock();
 	m_mac_header_xmit = READ_ONCE(m->tcfm_mac_header_xmit);
 	m_eaction = READ_ONCE(m->tcfm_eaction);
 	retval = READ_ONCE(m->tcf_action);
-	dev = rcu_dereference(m->tcfm_dev);
+	dev = rcu_dereference_bh(m->tcfm_dev);
 	if (unlikely(!dev)) {
 		pr_notice_once("tc mirred: target device is gone\n");
 		goto out;
@@ -236,7 +235,6 @@ static int tcf_mirred(struct sk_buff *skb, const struct tc_action *a,
 		if (tcf_mirred_is_act_redirect(m_eaction))
 			retval = TC_ACT_SHOT;
 	}
-	rcu_read_unlock();
 
 	return retval;
 }
diff --git a/net/sched/act_sample.c b/net/sched/act_sample.c
index 3079e7be5bde..2608ccc83e5e 100644
--- a/net/sched/act_sample.c
+++ b/net/sched/act_sample.c
@@ -140,8 +140,7 @@ static int tcf_sample_act(struct sk_buff *skb, const struct tc_action *a,
 	bstats_cpu_update(this_cpu_ptr(s->common.cpu_bstats), skb);
 	retval = READ_ONCE(s->tcf_action);
 
-	rcu_read_lock();
-	psample_group = rcu_dereference(s->psample_group);
+	psample_group = rcu_dereference_bh(s->psample_group);
 
 	/* randomly sample packets according to rate */
 	if (psample_group && (prandom_u32() % s->rate == 0)) {
@@ -165,7 +164,6 @@ static int tcf_sample_act(struct sk_buff *skb, const struct tc_action *a,
 			skb_pull(skb, skb->mac_len);
 	}
 
-	rcu_read_unlock();
 	return retval;
 }
 
diff --git a/net/sched/act_skbedit.c b/net/sched/act_skbedit.c
index da56e6938c9e..a6db47ebec11 100644
--- a/net/sched/act_skbedit.c
+++ b/net/sched/act_skbedit.c
@@ -43,8 +43,7 @@ static int tcf_skbedit(struct sk_buff *skb, const struct tc_action *a,
 	tcf_lastuse_update(&d->tcf_tm);
 	bstats_cpu_update(this_cpu_ptr(d->common.cpu_bstats), skb);
 
-	rcu_read_lock();
-	params = rcu_dereference(d->params);
+	params = rcu_dereference_bh(d->params);
 	action = READ_ONCE(d->tcf_action);
 
 	if (params->flags & SKBEDIT_F_PRIORITY)
@@ -77,14 +76,11 @@ static int tcf_skbedit(struct sk_buff *skb, const struct tc_action *a,
 	}
 	if (params->flags & SKBEDIT_F_PTYPE)
 		skb->pkt_type = params->ptype;
-
-unlock:
-	rcu_read_unlock();
 	return action;
+
 err:
 	qstats_drop_inc(this_cpu_ptr(d->common.cpu_qstats));
-	action = TC_ACT_SHOT;
-	goto unlock;
+	return TC_ACT_SHOT;
 }
 
 static const struct nla_policy skbedit_policy[TCA_SKBEDIT_MAX + 1] = {
diff --git a/net/sched/act_skbmod.c b/net/sched/act_skbmod.c
index cdc6bacfb190..c437c6d51a71 100644
--- a/net/sched/act_skbmod.c
+++ b/net/sched/act_skbmod.c
@@ -41,20 +41,14 @@ static int tcf_skbmod_run(struct sk_buff *skb, const struct tc_action *a,
 	 * then MAX_EDIT_LEN needs to change appropriately
 	*/
 	err = skb_ensure_writable(skb, MAX_EDIT_LEN);
-	if (unlikely(err)) { /* best policy is to drop on the floor */
-		qstats_overlimit_inc(this_cpu_ptr(d->common.cpu_qstats));
-		return TC_ACT_SHOT;
-	}
+	if (unlikely(err)) /* best policy is to drop on the floor */
+		goto drop;
 
-	rcu_read_lock();
 	action = READ_ONCE(d->tcf_action);
-	if (unlikely(action == TC_ACT_SHOT)) {
-		qstats_overlimit_inc(this_cpu_ptr(d->common.cpu_qstats));
-		rcu_read_unlock();
-		return action;
-	}
+	if (unlikely(action == TC_ACT_SHOT))
+		goto drop;
 
-	p = rcu_dereference(d->skbmod_p);
+	p = rcu_dereference_bh(d->skbmod_p);
 	flags = p->flags;
 	if (flags & SKBMOD_F_DMAC)
 		ether_addr_copy(eth_hdr(skb)->h_dest, p->eth_dst);
@@ -62,7 +56,6 @@ static int tcf_skbmod_run(struct sk_buff *skb, const struct tc_action *a,
 		ether_addr_copy(eth_hdr(skb)->h_source, p->eth_src);
 	if (flags & SKBMOD_F_ETYPE)
 		eth_hdr(skb)->h_proto = p->eth_type;
-	rcu_read_unlock();
 
 	if (flags & SKBMOD_F_SWAPMAC) {
 		u16 tmpaddr[ETH_ALEN / 2]; /* ether_addr_copy() requirement */
@@ -73,6 +66,10 @@ static int tcf_skbmod_run(struct sk_buff *skb, const struct tc_action *a,
 	}
 
 	return action;
+
+drop:
+	qstats_overlimit_inc(this_cpu_ptr(d->common.cpu_qstats));
+	return TC_ACT_SHOT;
 }
 
 static const struct nla_policy skbmod_policy[TCA_SKBMOD_MAX + 1] = {
diff --git a/net/sched/act_tunnel_key.c b/net/sched/act_tunnel_key.c
index 3ec585d58762..1ab959ef3547 100644
--- a/net/sched/act_tunnel_key.c
+++ b/net/sched/act_tunnel_key.c
@@ -31,9 +31,7 @@ static int tunnel_key_act(struct sk_buff *skb, const struct tc_action *a,
 	struct tcf_tunnel_key_params *params;
 	int action;
 
-	rcu_read_lock();
-
-	params = rcu_dereference(t->params);
+	params = rcu_dereference_bh(t->params);
 
 	tcf_lastuse_update(&t->tcf_tm);
 	bstats_cpu_update(this_cpu_ptr(t->common.cpu_bstats), skb);
@@ -53,8 +51,6 @@ static int tunnel_key_act(struct sk_buff *skb, const struct tc_action *a,
 		break;
 	}
 
-	rcu_read_unlock();
-
 	return action;
 }
 
diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c
index ad37f308175a..15a0ee214c9c 100644
--- a/net/sched/act_vlan.c
+++ b/net/sched/act_vlan.c
@@ -40,11 +40,9 @@ static int tcf_vlan(struct sk_buff *skb, const struct tc_action *a,
 	if (skb_at_tc_ingress(skb))
 		skb_push_rcsum(skb, skb->mac_len);
 
-	rcu_read_lock();
-
 	action = READ_ONCE(v->tcf_action);
 
-	p = rcu_dereference(v->vlan_p);
+	p = rcu_dereference_bh(v->vlan_p);
 
 	switch (p->tcfv_action) {
 	case TCA_VLAN_ACT_POP:
@@ -61,7 +59,7 @@ static int tcf_vlan(struct sk_buff *skb, const struct tc_action *a,
 	case TCA_VLAN_ACT_MODIFY:
 		/* No-op if no vlan tag (either hw-accel or in-payload) */
 		if (!skb_vlan_tagged(skb))
-			goto unlock;
+			goto out;
 		/* extract existing tag (and guarantee no hw-accel tag) */
 		if (skb_vlan_tag_present(skb)) {
 			tci = skb_vlan_tag_get(skb);
@@ -86,18 +84,15 @@ static int tcf_vlan(struct sk_buff *skb, const struct tc_action *a,
 		BUG();
 	}
 
-	goto unlock;
-
-drop:
-	action = TC_ACT_SHOT;
-	qstats_drop_inc(this_cpu_ptr(v->common.cpu_qstats));
-
-unlock:
-	rcu_read_unlock();
+out:
 	if (skb_at_tc_ingress(skb))
 		skb_pull_rcsum(skb, skb->mac_len);
 
 	return action;
+
+drop:
+	qstats_drop_inc(this_cpu_ptr(v->common.cpu_qstats));
+	return TC_ACT_SHOT;
 }
 
 static const struct nla_policy vlan_policy[TCA_VLAN_MAX + 1] = {
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH net-next 3/4] net/tc: introduce TC_ACT_MIRRED.
  2018-07-19 13:02 [PATCH net-next 0/4] TC: refactor act_mirred packets re-injection Paolo Abeni
  2018-07-19 13:02 ` [PATCH net-next 1/4] tc/act: user space can't use TC_ACT_REDIRECT directly Paolo Abeni
  2018-07-19 13:02 ` [PATCH net-next 2/4] tc/act: remove unneeded RCU lock in action callback Paolo Abeni
@ 2018-07-19 13:02 ` Paolo Abeni
  2018-07-19 18:07   ` Cong Wang
  2018-07-19 18:56   ` Jiri Pirko
  2018-07-19 13:02 ` [PATCH net-next 4/4] act_mirred: use ACT_MIRRED when possible Paolo Abeni
  3 siblings, 2 replies; 13+ messages in thread
From: Paolo Abeni @ 2018-07-19 13:02 UTC (permalink / raw)
  To: netdev
  Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko, Daniel Borkmann,
	Marcelo Ricardo Leitner, Eyal Birger

This is similar TC_ACT_REDIRECT, but with a slightly different
semantic:
- on ingress the mirred skbs are passed to the target device
network stack without any additional check not scrubbing.
- the rcu-protected stats provided via the tcf_result struct
  are updated on error conditions.

v1 -> v2: do not touch TC_ACT_REDIRECT code path, introduce
 a new action type instead

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/net/sch_generic.h    | 19 +++++++++++++++++++
 include/uapi/linux/pkt_cls.h |  3 ++-
 net/core/dev.c               |  4 ++++
 net/sched/act_api.c          |  6 ++++--
 4 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 056dc1083aa3..667d7b66fee2 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -235,6 +235,12 @@ struct tcf_result {
 			u32		classid;
 		};
 		const struct tcf_proto *goto_tp;
+
+		/* used by the TC_ACT_MIRRED action */
+		struct {
+			bool		ingress;
+			struct gnet_stats_queue *qstats;
+		};
 	};
 };
 
@@ -1091,4 +1097,17 @@ void mini_qdisc_pair_swap(struct mini_Qdisc_pair *miniqp,
 void mini_qdisc_pair_init(struct mini_Qdisc_pair *miniqp, struct Qdisc *qdisc,
 			  struct mini_Qdisc __rcu **p_miniq);
 
+static inline void skb_tc_redirect(struct sk_buff *skb, struct tcf_result *res)
+{
+	struct gnet_stats_queue *stats = res->qstats;
+	int ret;
+
+	if (res->ingress)
+		ret = netif_receive_skb(skb);
+	else
+		ret = dev_queue_xmit(skb);
+	if (ret && stats)
+		qstats_overlimit_inc(res->qstats);
+}
+
 #endif
diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index 7cdd62b51106..1a5e8a3217f3 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -45,7 +45,8 @@ enum {
 				   * the skb and act like everything
 				   * is alright.
 				   */
-#define TC_ACT_LAST		TC_ACT_TRAP
+#define TC_ACT_MIRRED		9
+#define TC_ACT_LAST		TC_ACT_MIRRED
 
 /* There is a special kind of actions called "extended actions",
  * which need a value parameter. These have a local opcode located in
diff --git a/net/core/dev.c b/net/core/dev.c
index 14a748ee8cc9..3822f29d730f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4602,6 +4602,10 @@ sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret,
 		__skb_push(skb, skb->mac_len);
 		skb_do_redirect(skb);
 		return NULL;
+	case TC_ACT_MIRRED:
+		/* this does not scrub the packet, and updates stats on error */
+		skb_tc_redirect(skb, &cl_res);
+		return NULL;
 	default:
 		break;
 	}
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index f6438f246dab..029302e2813e 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -895,8 +895,10 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 		}
 	}
 
-	if (a->tcfa_action == TC_ACT_REDIRECT) {
-		net_warn_ratelimited("TC_ACT_REDIRECT can't be used directly");
+	if (a->tcfa_action == TC_ACT_REDIRECT ||
+	    a->tcfa_action == TC_ACT_MIRRED) {
+		net_warn_ratelimited("action %d can't be used directly",
+				     a->tcfa_action);
 		a->tcfa_action = TC_ACT_LAST + 1;
 	}
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH net-next 4/4] act_mirred: use ACT_MIRRED when possible
  2018-07-19 13:02 [PATCH net-next 0/4] TC: refactor act_mirred packets re-injection Paolo Abeni
                   ` (2 preceding siblings ...)
  2018-07-19 13:02 ` [PATCH net-next 3/4] net/tc: introduce TC_ACT_MIRRED Paolo Abeni
@ 2018-07-19 13:02 ` Paolo Abeni
  2018-07-21 23:29   ` David Miller
  3 siblings, 1 reply; 13+ messages in thread
From: Paolo Abeni @ 2018-07-19 13:02 UTC (permalink / raw)
  To: netdev
  Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko, Daniel Borkmann,
	Marcelo Ricardo Leitner, Eyal Birger

When mirred is invoked from the ingress path, and it wants to redirect
the processed packet, it can now use the ACT_MIRRED action,
filling the tcf_result accordingly, and avoiding a per packet
skb_clone().

Overall this gives a ~10% improvement in forwarding performance for the
TC S/W data path and TC S/W performances are now comparable to the
kernel openswitch datapath.

v1 -> v2: use ACT_MIRRED instead of ACT_REDIRECT

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/sched/act_mirred.c | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index eeb335f03102..fa06c161f139 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -171,10 +171,12 @@ static int tcf_mirred(struct sk_buff *skb, const struct tc_action *a,
 		      struct tcf_result *res)
 {
 	struct tcf_mirred *m = to_mirred(a);
+	struct sk_buff *skb2 = skb;
 	bool m_mac_header_xmit;
 	struct net_device *dev;
-	struct sk_buff *skb2;
 	int retval, err = 0;
+	bool want_ingress;
+	bool is_redirect;
 	int m_eaction;
 	int mac_len;
 
@@ -196,16 +198,19 @@ static int tcf_mirred(struct sk_buff *skb, const struct tc_action *a,
 		goto out;
 	}
 
-	skb2 = skb_clone(skb, GFP_ATOMIC);
-	if (!skb2)
-		goto out;
+	is_redirect = tcf_mirred_is_act_redirect(m_eaction);
+	if (!skb_at_tc_ingress(skb) || !is_redirect) {
+		skb2 = skb_clone(skb, GFP_ATOMIC);
+		if (!skb2)
+			goto out;
+	}
 
 	/* If action's target direction differs than filter's direction,
 	 * and devices expect a mac header on xmit, then mac push/pull is
 	 * needed.
 	 */
-	if (skb_at_tc_ingress(skb) != tcf_mirred_act_wants_ingress(m_eaction) &&
-	    m_mac_header_xmit) {
+	want_ingress = tcf_mirred_act_wants_ingress(m_eaction);
+	if (skb_at_tc_ingress(skb) != want_ingress && m_mac_header_xmit) {
 		if (!skb_at_tc_ingress(skb)) {
 			/* caught at egress, act ingress: pull mac */
 			mac_len = skb_network_header(skb) - skb_mac_header(skb);
@@ -216,14 +221,22 @@ static int tcf_mirred(struct sk_buff *skb, const struct tc_action *a,
 		}
 	}
 
+	skb2->skb_iif = skb->dev->ifindex;
+	skb2->dev = dev;
+
 	/* mirror is always swallowed */
-	if (tcf_mirred_is_act_redirect(m_eaction)) {
+	if (is_redirect) {
 		skb2->tc_redirected = 1;
 		skb2->tc_from_ingress = skb2->tc_at_ingress;
+
+		/* let's the caller reinject the packet, if possible */
+		if (skb_at_tc_ingress(skb)) {
+			res->ingress = want_ingress;
+			res->qstats = this_cpu_ptr(m->common.cpu_qstats);
+			return TC_ACT_MIRRED;
+		}
 	}
 
-	skb2->skb_iif = skb->dev->ifindex;
-	skb2->dev = dev;
 	if (!tcf_mirred_act_wants_ingress(m_eaction))
 		err = dev_queue_xmit(skb2);
 	else
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 3/4] net/tc: introduce TC_ACT_MIRRED.
  2018-07-19 13:02 ` [PATCH net-next 3/4] net/tc: introduce TC_ACT_MIRRED Paolo Abeni
@ 2018-07-19 18:07   ` Cong Wang
  2018-07-20  9:54     ` Paolo Abeni
  2018-07-19 18:56   ` Jiri Pirko
  1 sibling, 1 reply; 13+ messages in thread
From: Cong Wang @ 2018-07-19 18:07 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Linux Kernel Network Developers, Jamal Hadi Salim, Jiri Pirko,
	Daniel Borkmann, Marcelo Ricardo Leitner, Eyal Birger

On Thu, Jul 19, 2018 at 6:03 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> This is similar TC_ACT_REDIRECT, but with a slightly different
> semantic:
> - on ingress the mirred skbs are passed to the target device
> network stack without any additional check not scrubbing.
> - the rcu-protected stats provided via the tcf_result struct
>   are updated on error conditions.

At least its name sucks, it means to skip the skb_clone(),
that is avoid a copy, but you still call it MIRRED...

MIRRED means MIRror and REDirect.

Also, I don't understand why this new TC_ACT code needs
to be visible to user-space, whether to clone or not is purely
internal.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 3/4] net/tc: introduce TC_ACT_MIRRED.
  2018-07-19 13:02 ` [PATCH net-next 3/4] net/tc: introduce TC_ACT_MIRRED Paolo Abeni
  2018-07-19 18:07   ` Cong Wang
@ 2018-07-19 18:56   ` Jiri Pirko
  1 sibling, 0 replies; 13+ messages in thread
From: Jiri Pirko @ 2018-07-19 18:56 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: netdev, Jamal Hadi Salim, Cong Wang, Daniel Borkmann,
	Marcelo Ricardo Leitner, Eyal Birger

Thu, Jul 19, 2018 at 03:02:28PM CEST, pabeni@redhat.com wrote:
>This is similar TC_ACT_REDIRECT, but with a slightly different
>semantic:
>- on ingress the mirred skbs are passed to the target device
>network stack without any additional check not scrubbing.
>- the rcu-protected stats provided via the tcf_result struct
>  are updated on error conditions.
>
>v1 -> v2: do not touch TC_ACT_REDIRECT code path, introduce
> a new action type instead
>
>Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>---
> include/net/sch_generic.h    | 19 +++++++++++++++++++
> include/uapi/linux/pkt_cls.h |  3 ++-
> net/core/dev.c               |  4 ++++
> net/sched/act_api.c          |  6 ++++--
> 4 files changed, 29 insertions(+), 3 deletions(-)
>
>diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
>index 056dc1083aa3..667d7b66fee2 100644
>--- a/include/net/sch_generic.h
>+++ b/include/net/sch_generic.h
>@@ -235,6 +235,12 @@ struct tcf_result {
> 			u32		classid;
> 		};
> 		const struct tcf_proto *goto_tp;
>+
>+		/* used by the TC_ACT_MIRRED action */
>+		struct {
>+			bool		ingress;
>+			struct gnet_stats_queue *qstats;
>+		};
> 	};
> };
> 
>@@ -1091,4 +1097,17 @@ void mini_qdisc_pair_swap(struct mini_Qdisc_pair *miniqp,
> void mini_qdisc_pair_init(struct mini_Qdisc_pair *miniqp, struct Qdisc *qdisc,
> 			  struct mini_Qdisc __rcu **p_miniq);
> 
>+static inline void skb_tc_redirect(struct sk_buff *skb, struct tcf_result *res)
>+{
>+	struct gnet_stats_queue *stats = res->qstats;
>+	int ret;
>+
>+	if (res->ingress)
>+		ret = netif_receive_skb(skb);
>+	else
>+		ret = dev_queue_xmit(skb);
>+	if (ret && stats)
>+		qstats_overlimit_inc(res->qstats);
>+}
>+
> #endif
>diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
>index 7cdd62b51106..1a5e8a3217f3 100644
>--- a/include/uapi/linux/pkt_cls.h
>+++ b/include/uapi/linux/pkt_cls.h
>@@ -45,7 +45,8 @@ enum {
> 				   * the skb and act like everything
> 				   * is alright.
> 				   */
>-#define TC_ACT_LAST		TC_ACT_TRAP
>+#define TC_ACT_MIRRED		9
>+#define TC_ACT_LAST		TC_ACT_MIRRED
> 
> /* There is a special kind of actions called "extended actions",
>  * which need a value parameter. These have a local opcode located in
>diff --git a/net/core/dev.c b/net/core/dev.c
>index 14a748ee8cc9..3822f29d730f 100644
>--- a/net/core/dev.c
>+++ b/net/core/dev.c
>@@ -4602,6 +4602,10 @@ sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret,
> 		__skb_push(skb, skb->mac_len);
> 		skb_do_redirect(skb);
> 		return NULL;
>+	case TC_ACT_MIRRED:
>+		/* this does not scrub the packet, and updates stats on error */
>+		skb_tc_redirect(skb, &cl_res);

REDIRECT does skb_do_redirect and MIRRED does skb_tc_redirect.
Confusing. I agree with Cong that the name is not correct.


>+		return NULL;
> 	default:
> 		break;
> 	}
>diff --git a/net/sched/act_api.c b/net/sched/act_api.c
>index f6438f246dab..029302e2813e 100644
>--- a/net/sched/act_api.c
>+++ b/net/sched/act_api.c
>@@ -895,8 +895,10 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
> 		}
> 	}
> 
>-	if (a->tcfa_action == TC_ACT_REDIRECT) {
>-		net_warn_ratelimited("TC_ACT_REDIRECT can't be used directly");
>+	if (a->tcfa_action == TC_ACT_REDIRECT ||
>+	    a->tcfa_action == TC_ACT_MIRRED) {
>+		net_warn_ratelimited("action %d can't be used directly",
>+				     a->tcfa_action);
> 		a->tcfa_action = TC_ACT_LAST + 1;
> 	}
> 
>-- 
>2.17.1
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 3/4] net/tc: introduce TC_ACT_MIRRED.
  2018-07-19 18:07   ` Cong Wang
@ 2018-07-20  9:54     ` Paolo Abeni
  2018-07-23 21:12       ` Cong Wang
  0 siblings, 1 reply; 13+ messages in thread
From: Paolo Abeni @ 2018-07-20  9:54 UTC (permalink / raw)
  To: Cong Wang, Jiri Pirko
  Cc: Linux Kernel Network Developers, Jamal Hadi Salim,
	Daniel Borkmann, Marcelo Ricardo Leitner, Eyal Birger

Hi,

Jiri, Cong, thank you for the feedback. Please allow me to give a
single reply to both of you, as you rised similar concers.

On Thu, 2018-07-19 at 11:07 -0700, Cong Wang wrote:
> On Thu, Jul 19, 2018 at 6:03 AM Paolo Abeni <pabeni@redhat.com> wrote:
> > 
> > This is similar TC_ACT_REDIRECT, but with a slightly different
> > semantic:
> > - on ingress the mirred skbs are passed to the target device
> > network stack without any additional check not scrubbing.
> > - the rcu-protected stats provided via the tcf_result struct
> >   are updated on error conditions.
> 
> At least its name sucks, it means to skip the skb_clone(),
> that is avoid a copy, but you still call it MIRRED...
> 
> MIRRED means MIRror and REDirect.

I was not satified with the name, too, but I also wanted to collect
some feedback, as the different time zones are not helping here.

Would TC_ACT_REINJECT be a better choice? (renaming skb_tc_redirect as
skb_tc_reinject, too). Do you have some better name?

Thanks!

> Also, I don't understand why this new TC_ACT code needs
> to be visible to user-space, whether to clone or not is purely
> internal.

Note this is what already happens with TC_ACT_REDIRECT: currently the
user space uses it freely, even if only {cls,act}_bpf can return such
value in a meaningful way, and only from the ingress and the egress
hooks.

I think we can add a clear separation between the values accessible
from user-space, and the ones used interanally by the kernel, with
something like the code below (basically unknown actions are explicitly
mapped to TC_ACT_UNSPEC), WDYT?

Note: as TC_ACT_REDIRECT is already part of the uAPI, it will remain
accessible from user-space, so patch 1/4 would be still needed.

Cheers,

Paolo

---
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index e4252a176eec..9079e4ee2bbe 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -7,6 +7,9 @@
 #include <net/sch_generic.h>
 #include <net/act_api.h>
 
+/* TC action not accessible from user space */
+#define TC_ACT_REINJECT		(TC_ACT_MAX + 1)
+
 /* Basic packet classifier frontend definitions. */
 
 struct tcf_walker {
diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index c4262d911596..c8a24861d4c8 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -45,6 +45,7 @@ enum {
 				   * the skb and act like everything
 				   * is alright.
 				   */
+#define TC_ACT_VALUE_MAX	TC_ACT_TRAP
 
 /* There is a special kind of actions called "extended actions",
  * which need a value parameter. These have a local opcode located in
@@ -55,11 +56,12 @@ enum {
 #define __TC_ACT_EXT_SHIFT 28
 #define __TC_ACT_EXT(local) ((local) << __TC_ACT_EXT_SHIFT)
 #define TC_ACT_EXT_VAL_MASK ((1 << __TC_ACT_EXT_SHIFT) - 1)
-#define TC_ACT_EXT_CMP(combined, opcode) \
-	(((combined) & (~TC_ACT_EXT_VAL_MASK)) == opcode)
+#define TC_ACT_EXT_OPCODE(combined) ((combined) & (~TC_ACT_EXT_VAL_MASK))
+#define TC_ACT_EXT_CMP(combined, opcode) (TC_ACT_EXT_OPCODE(combined) == opcode)
 
 #define TC_ACT_JUMP __TC_ACT_EXT(1)
 #define TC_ACT_GOTO_CHAIN __TC_ACT_EXT(2)
+#define TC_ACT_EXT_OPCODE_MAX	TC_ACT_GOTO_CHAIN
 
 /* Action type identifiers*/
 enum {
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 148a89ab789b..657c3d99698d 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -798,6 +798,7 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 	char act_name[IFNAMSIZ];
 	struct nlattr *tb[TCA_ACT_MAX + 1];
 	struct nlattr *kind;
+	int opcode;
 	int err;
 
 	if (name == NULL) {
@@ -895,6 +896,14 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 		}
 	}
 
+	opcode = TC_ACT_EXT_OPCODE(a->tcfa_action);
+	if ((!opcode && a->tcfa_action > TC_ACT_VALUE_MAX) ||
+	    (opcode && opcode > TC_ACT_EXT_OPCODE_MAX)) {
+		net_warn_ratelimited("invalid %d action value",
+				     a->tcfa_action);
+		a->tcfa_action = TC_ACT_UNSPEC;
+	}
+
 	return a;
 
 err_mod:

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 4/4] act_mirred: use ACT_MIRRED when possible
  2018-07-19 13:02 ` [PATCH net-next 4/4] act_mirred: use ACT_MIRRED when possible Paolo Abeni
@ 2018-07-21 23:29   ` David Miller
  2018-07-22 14:32     ` Paolo Abeni
  0 siblings, 1 reply; 13+ messages in thread
From: David Miller @ 2018-07-21 23:29 UTC (permalink / raw)
  To: pabeni
  Cc: netdev, jhs, xiyou.wangcong, jiri, daniel, marcelo.leitner, eyal.birger

From: Paolo Abeni <pabeni@redhat.com>
Date: Thu, 19 Jul 2018 15:02:29 +0200

> kernel openswitch datapath.
         ^^^^^^^^^^

"openvswitch"

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 4/4] act_mirred: use ACT_MIRRED when possible
  2018-07-21 23:29   ` David Miller
@ 2018-07-22 14:32     ` Paolo Abeni
  0 siblings, 0 replies; 13+ messages in thread
From: Paolo Abeni @ 2018-07-22 14:32 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, jhs, xiyou.wangcong, jiri, daniel, marcelo.leitner, eyal.birger

On Sat, 2018-07-21 at 16:29 -0700, David Miller wrote:
> From: Paolo Abeni <pabeni@redhat.com>
> Date: Thu, 19 Jul 2018 15:02:29 +0200
> 
> > kernel openswitch datapath.
>          ^^^^^^^^^^
> 
> "openvswitch"

Oops, typo!

Thank you for the feedback! I do hope this is your major concern ;)

I will address it v3.

Thanks,

Paolo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 3/4] net/tc: introduce TC_ACT_MIRRED.
  2018-07-20  9:54     ` Paolo Abeni
@ 2018-07-23 21:12       ` Cong Wang
  2018-07-24  6:48         ` Paolo Abeni
  0 siblings, 1 reply; 13+ messages in thread
From: Cong Wang @ 2018-07-23 21:12 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Jiri Pirko, Linux Kernel Network Developers, Jamal Hadi Salim,
	Daniel Borkmann, Marcelo Ricardo Leitner, Eyal Birger

On Fri, Jul 20, 2018 at 2:54 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> Hi,
>
> Jiri, Cong, thank you for the feedback. Please allow me to give a
> single reply to both of you, as you rised similar concers.
>
> On Thu, 2018-07-19 at 11:07 -0700, Cong Wang wrote:
> > On Thu, Jul 19, 2018 at 6:03 AM Paolo Abeni <pabeni@redhat.com> wrote:
> > >
> > > This is similar TC_ACT_REDIRECT, but with a slightly different
> > > semantic:
> > > - on ingress the mirred skbs are passed to the target device
> > > network stack without any additional check not scrubbing.
> > > - the rcu-protected stats provided via the tcf_result struct
> > >   are updated on error conditions.
> >
> > At least its name sucks, it means to skip the skb_clone(),
> > that is avoid a copy, but you still call it MIRRED...
> >
> > MIRRED means MIRror and REDirect.
>
> I was not satified with the name, too, but I also wanted to collect
> some feedback, as the different time zones are not helping here.
>
> Would TC_ACT_REINJECT be a better choice? (renaming skb_tc_redirect as
> skb_tc_reinject, too). Do you have some better name?


Any name not implying a copy is better. I don't worry about it, please
see below.


> > Also, I don't understand why this new TC_ACT code needs
> > to be visible to user-space, whether to clone or not is purely
> > internal.
>
> Note this is what already happens with TC_ACT_REDIRECT: currently the
> user space uses it freely, even if only {cls,act}_bpf can return such
> value in a meaningful way, and only from the ingress and the egress
> hooks.
>

Yes, my question is why do we give user such a freedom?

In other words, what do you want users to choose here? To scrub or not
to scrub? To clone or not to clone?

>From my understanding of your whole patchset, your goal is to get rid
of clone, and users definitely don't care about clone or not clone for
redirections, this is why I insist it doesn't need to be visible to user.

If your goal is not just skipping clone, but also, let's say, scrub or not
scrub, then it should be visible to users. However, I don't see why
users care about scrub or not, they have to understand what scrub
is at least, it is a purely kernel-internal behavior.


> I think we can add a clear separation between the values accessible
> from user-space, and the ones used interanally by the kernel, with
> something like the code below (basically unknown actions are explicitly
> mapped to TC_ACT_UNSPEC), WDYT?
>
> Note: as TC_ACT_REDIRECT is already part of the uAPI, it will remain
> accessible from user-space, so patch 1/4 would be still needed.

I think that is doable too, but we should understand whether we
need to do it or not at first.

Thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next 3/4] net/tc: introduce TC_ACT_MIRRED.
  2018-07-23 21:12       ` Cong Wang
@ 2018-07-24  6:48         ` Paolo Abeni
  0 siblings, 0 replies; 13+ messages in thread
From: Paolo Abeni @ 2018-07-24  6:48 UTC (permalink / raw)
  To: Cong Wang
  Cc: Jiri Pirko, Linux Kernel Network Developers, Jamal Hadi Salim,
	Daniel Borkmann, Marcelo Ricardo Leitner, Eyal Birger

Hi,

On Mon, 2018-07-23 at 14:12 -0700, Cong Wang wrote:
> On Fri, Jul 20, 2018 at 2:54 AM Paolo Abeni <pabeni@redhat.com> wrote:
> > Note this is what already happens with TC_ACT_REDIRECT: currently the
> > user space uses it freely, even if only {cls,act}_bpf can return such
> > value in a meaningful way, and only from the ingress and the egress
> > hooks.
>
> Yes, my question is why do we give user such a freedom?
> 
> In other words, what do you want users to choose here? To scrub or not
> to scrub? To clone or not to clone?
> 
> From my understanding of your whole patchset, your goal is to get rid
> of clone, and users definitely don't care about clone or not clone for
> redirections, this is why I insist it doesn't need to be visible to user.

Thank you for your kind reply!

No, my intention is not to expose to the user-space another option. I
added the  additional tcfa_action value in response to concerns exposed
vs the v1 version of this series (it changed the act_mirred behaviour
and possibly broke some use-case).

When assembling the v2 I did not implemented the (deserved) isolation
vs user-space because of the already existing TC_ACT_REDIRECT: its
current implementation fooled me to think such considerations were not
relevant.

> If your goal is not just skipping clone, but also, let's say, scrub or not
> scrub, then it should be visible to users. However, I don't see why
> users care about scrub or not, they have to understand what scrub
> is at least, it is a purely kernel-internal behavior.

I agree to hide TC_ACT_REINJECT and any choice about scrubbing to user-
space, as per the code chunk I  posted before. I'll send a v3
implementing such schema.

Cheers,

Paolo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH net-next 1/4] tc/act: user space can't use TC_ACT_REDIRECT directly
  2018-07-13  9:54 [PATCH net-next 0/4] TC: refactor TC_ACT_REDIRECT action Paolo Abeni
@ 2018-07-13  9:54 ` Paolo Abeni
  0 siblings, 0 replies; 13+ messages in thread
From: Paolo Abeni @ 2018-07-13  9:54 UTC (permalink / raw)
  To: netdev
  Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko, Alexei Starovoitov,
	Daniel Borkmann, Marcelo Ricardo Leitner

Only cls_bpf and act_bpf can safely use such value. If a generic
action is configured by user space to return TC_ACT_REDIRECT,
the usually visible behavior is passing the skb up the stack - as
for unknown action, but, with complex configuration, more random
results can be obtained.

This patch forcefully converts TC_ACT_REDIRECT to TC_ACT_LAST + 1
at action init time, making the kernel behavior more consistent.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/uapi/linux/pkt_cls.h | 1 +
 net/sched/act_api.c          | 5 +++++
 2 files changed, 6 insertions(+)

diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index c4262d911596..7cdd62b51106 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -45,6 +45,7 @@ enum {
 				   * the skb and act like everything
 				   * is alright.
 				   */
+#define TC_ACT_LAST		TC_ACT_TRAP
 
 /* There is a special kind of actions called "extended actions",
  * which need a value parameter. These have a local opcode located in
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 148a89ab789b..f6438f246dab 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -895,6 +895,11 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 		}
 	}
 
+	if (a->tcfa_action == TC_ACT_REDIRECT) {
+		net_warn_ratelimited("TC_ACT_REDIRECT can't be used directly");
+		a->tcfa_action = TC_ACT_LAST + 1;
+	}
+
 	return a;
 
 err_mod:
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2018-07-24  7:53 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-19 13:02 [PATCH net-next 0/4] TC: refactor act_mirred packets re-injection Paolo Abeni
2018-07-19 13:02 ` [PATCH net-next 1/4] tc/act: user space can't use TC_ACT_REDIRECT directly Paolo Abeni
2018-07-19 13:02 ` [PATCH net-next 2/4] tc/act: remove unneeded RCU lock in action callback Paolo Abeni
2018-07-19 13:02 ` [PATCH net-next 3/4] net/tc: introduce TC_ACT_MIRRED Paolo Abeni
2018-07-19 18:07   ` Cong Wang
2018-07-20  9:54     ` Paolo Abeni
2018-07-23 21:12       ` Cong Wang
2018-07-24  6:48         ` Paolo Abeni
2018-07-19 18:56   ` Jiri Pirko
2018-07-19 13:02 ` [PATCH net-next 4/4] act_mirred: use ACT_MIRRED when possible Paolo Abeni
2018-07-21 23:29   ` David Miller
2018-07-22 14:32     ` Paolo Abeni
  -- strict thread matches above, loose matches on Subject: below --
2018-07-13  9:54 [PATCH net-next 0/4] TC: refactor TC_ACT_REDIRECT action Paolo Abeni
2018-07-13  9:54 ` [PATCH net-next 1/4] tc/act: user space can't use TC_ACT_REDIRECT directly Paolo Abeni

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).