All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next,v2 0/2] net: sched: introduce em_ipt ematch
@ 2018-01-26 16:48 Eyal Birger
  2018-01-26 16:48 ` [PATCH net-next,v2 1/2] net: sched: ematch: pass protocol to ematch 'change()' handlers Eyal Birger
  2018-01-26 16:48 ` [PATCH net-next,v2 2/2] net: sched: add em_ipt ematch for calling xtables matches Eyal Birger
  0 siblings, 2 replies; 7+ messages in thread
From: Eyal Birger @ 2018-01-26 16:48 UTC (permalink / raw)
  To: davem, jhs, xiyou.wangcong, netdev, pablo; +Cc: shmulik, Eyal Birger

From: Eyal Birger <eyal@metanetworks.com>

The following patchset introduces a new tc ematch for matching using
netfilter matches.

This allows early classification as well as mirroning/redirecting traffic
based on logic implemented in netfilter extensions.

Example use case is classification based on the incoming IPSec policy used
during decpsulation using the 'policy' iptables extension (xt_policy).

This patchset is an enhancement of a former series ([1]) which allowed only
policy matching following a suggestion by Pablo Neira Ayuso ([2]).

[1] https://patchwork.ozlabs.org/cover/859887/
[2] https://patchwork.ozlabs.org/patch/859888/

v2:
  Remove skb push/pull and limit functionality to ingress

Eyal Birger (2):
  net: sched: ematch: pass protocol to ematch 'change()' handlers
  net: sched: add em_ipt ematch for calling xtables matches

 include/net/pkt_cls.h                    |   2 +-
 include/uapi/linux/pkt_cls.h             |   3 +-
 include/uapi/linux/tc_ematch/tc_em_ipt.h |  19 +++
 net/sched/Kconfig                        |  10 ++
 net/sched/Makefile                       |   1 +
 net/sched/em_canid.c                     |   4 +-
 net/sched/em_ipset.c                     |   4 +-
 net/sched/em_ipt.c                       | 244 +++++++++++++++++++++++++++++++
 net/sched/em_meta.c                      |   2 +-
 net/sched/em_nbyte.c                     |   4 +-
 net/sched/em_text.c                      |   2 +-
 net/sched/ematch.c                       |   3 +-
 12 files changed, 287 insertions(+), 11 deletions(-)
 create mode 100644 include/uapi/linux/tc_ematch/tc_em_ipt.h
 create mode 100644 net/sched/em_ipt.c

-- 
2.7.4

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH net-next,v2 1/2] net: sched: ematch: pass protocol to ematch 'change()' handlers
  2018-01-26 16:48 [PATCH net-next,v2 0/2] net: sched: introduce em_ipt ematch Eyal Birger
@ 2018-01-26 16:48 ` Eyal Birger
  2018-01-26 16:48 ` [PATCH net-next,v2 2/2] net: sched: add em_ipt ematch for calling xtables matches Eyal Birger
  1 sibling, 0 replies; 7+ messages in thread
From: Eyal Birger @ 2018-01-26 16:48 UTC (permalink / raw)
  To: davem, jhs, xiyou.wangcong, netdev, pablo; +Cc: shmulik, Eyal Birger

From: Eyal Birger <eyal@metanetworks.com>

In order to allow ematches to create their internal state based on the
L3 protocol specified when creating the filter.

Signed-off-by: Eyal Birger <eyal@metanetworks.com>
---
 include/net/pkt_cls.h | 2 +-
 net/sched/em_canid.c  | 4 ++--
 net/sched/em_ipset.c  | 4 ++--
 net/sched/em_meta.c   | 2 +-
 net/sched/em_nbyte.c  | 4 ++--
 net/sched/em_text.c   | 2 +-
 net/sched/ematch.c    | 3 ++-
 7 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 8740625..ff5fcb1 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -474,7 +474,7 @@ struct tcf_ematch_tree {
 struct tcf_ematch_ops {
 	int			kind;
 	int			datalen;
-	int			(*change)(struct net *net, void *,
+	int			(*change)(struct net *net, __be16, void *,
 					  int, struct tcf_ematch *);
 	int			(*match)(struct sk_buff *, struct tcf_ematch *,
 					 struct tcf_pkt_info *);
diff --git a/net/sched/em_canid.c b/net/sched/em_canid.c
index ddd883c..445c10d 100644
--- a/net/sched/em_canid.c
+++ b/net/sched/em_canid.c
@@ -120,8 +120,8 @@ static int em_canid_match(struct sk_buff *skb, struct tcf_ematch *m,
 	return match;
 }
 
-static int em_canid_change(struct net *net, void *data, int len,
-			  struct tcf_ematch *m)
+static int em_canid_change(struct net *net, __be16 protocol, void *data,
+			   int len, struct tcf_ematch *m)
 {
 	struct can_filter *conf = data; /* Array with rules */
 	struct canid_match *cm;
diff --git a/net/sched/em_ipset.c b/net/sched/em_ipset.c
index c1b23e3..50f7282 100644
--- a/net/sched/em_ipset.c
+++ b/net/sched/em_ipset.c
@@ -19,8 +19,8 @@
 #include <net/ip.h>
 #include <net/pkt_cls.h>
 
-static int em_ipset_change(struct net *net, void *data, int data_len,
-			   struct tcf_ematch *em)
+static int em_ipset_change(struct net *net, __be16 protocol, void *data,
+			   int data_len, struct tcf_ematch *em)
 {
 	struct xt_set_info *set = data;
 	ip_set_id_t index;
diff --git a/net/sched/em_meta.c b/net/sched/em_meta.c
index d6e9711..6892efc 100644
--- a/net/sched/em_meta.c
+++ b/net/sched/em_meta.c
@@ -904,7 +904,7 @@ static const struct nla_policy meta_policy[TCA_EM_META_MAX + 1] = {
 	[TCA_EM_META_HDR]	= { .len = sizeof(struct tcf_meta_hdr) },
 };
 
-static int em_meta_change(struct net *net, void *data, int len,
+static int em_meta_change(struct net *net, __be16 protocol, void *data, int len,
 			  struct tcf_ematch *m)
 {
 	int err;
diff --git a/net/sched/em_nbyte.c b/net/sched/em_nbyte.c
index 07c10ba..62a7a2e 100644
--- a/net/sched/em_nbyte.c
+++ b/net/sched/em_nbyte.c
@@ -23,8 +23,8 @@ struct nbyte_data {
 	char			pattern[0];
 };
 
-static int em_nbyte_change(struct net *net, void *data, int data_len,
-			   struct tcf_ematch *em)
+static int em_nbyte_change(struct net *net, __be16 protocol, void *data,
+			   int data_len, struct tcf_ematch *em)
 {
 	struct tcf_em_nbyte *nbyte = data;
 
diff --git a/net/sched/em_text.c b/net/sched/em_text.c
index 73e2ed5..b5d9e21 100644
--- a/net/sched/em_text.c
+++ b/net/sched/em_text.c
@@ -44,7 +44,7 @@ static int em_text_match(struct sk_buff *skb, struct tcf_ematch *m,
 	return skb_find_text(skb, from, to, tm->config) != UINT_MAX;
 }
 
-static int em_text_change(struct net *net, void *data, int len,
+static int em_text_change(struct net *net, __be16 protocol, void *data, int len,
 			  struct tcf_ematch *m)
 {
 	struct text_match *tm;
diff --git a/net/sched/ematch.c b/net/sched/ematch.c
index 1331a4c..a69abd8 100644
--- a/net/sched/ematch.c
+++ b/net/sched/ematch.c
@@ -242,7 +242,8 @@ static int tcf_em_validate(struct tcf_proto *tp,
 			goto errout;
 
 		if (em->ops->change) {
-			err = em->ops->change(net, data, data_len, em);
+			err = em->ops->change(net, tp->protocol, data, data_len,
+			                      em);
 			if (err < 0)
 				goto errout;
 		} else if (data_len > 0) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH net-next,v2 2/2] net: sched: add em_ipt ematch for calling xtables matches
  2018-01-26 16:48 [PATCH net-next,v2 0/2] net: sched: introduce em_ipt ematch Eyal Birger
  2018-01-26 16:48 ` [PATCH net-next,v2 1/2] net: sched: ematch: pass protocol to ematch 'change()' handlers Eyal Birger
@ 2018-01-26 16:48 ` Eyal Birger
  2018-01-26 18:50   ` Pablo Neira Ayuso
  1 sibling, 1 reply; 7+ messages in thread
From: Eyal Birger @ 2018-01-26 16:48 UTC (permalink / raw)
  To: davem, jhs, xiyou.wangcong, netdev, pablo; +Cc: shmulik, Eyal Birger

From: Eyal Birger <eyal@metanetworks.com>

This module allows performing tc classification based on data structures
and implementations provided by netfilter extensions.

Example use case is classification based on the incoming IPSec policy used
during decpsulation using the 'policy' iptables extension (xt_policy).

Only tc ingress is supported in order to avoid modifying the skb at egress
to suit xtable matches skb->data expectations.

Signed-off-by: Eyal Birger <eyal@metanetworks.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>

---

This ematch tries its best to provide matches with a similar
environment to the one they are used to; Due to the wide range of criteria
supported by matches, I am not able to test every combination of match and
traffic.
---
 include/uapi/linux/pkt_cls.h             |   3 +-
 include/uapi/linux/tc_ematch/tc_em_ipt.h |  19 +++
 net/sched/Kconfig                        |  10 ++
 net/sched/Makefile                       |   1 +
 net/sched/em_ipt.c                       | 244 +++++++++++++++++++++++++++++++
 5 files changed, 276 insertions(+), 1 deletion(-)
 create mode 100644 include/uapi/linux/tc_ematch/tc_em_ipt.h
 create mode 100644 net/sched/em_ipt.c

diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index 46c5066..7cafb26d 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -555,7 +555,8 @@ enum {
 #define	TCF_EM_VLAN		6
 #define	TCF_EM_CANID		7
 #define	TCF_EM_IPSET		8
-#define	TCF_EM_MAX		8
+#define	TCF_EM_IPT		9
+#define	TCF_EM_MAX		9
 
 enum {
 	TCF_EM_PROG_TC
diff --git a/include/uapi/linux/tc_ematch/tc_em_ipt.h b/include/uapi/linux/tc_ematch/tc_em_ipt.h
new file mode 100644
index 0000000..aecd252
--- /dev/null
+++ b/include/uapi/linux/tc_ematch/tc_em_ipt.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef __LINUX_TC_EM_IPT_H
+#define __LINUX_TC_EM_IPT_H
+
+#include <linux/types.h>
+#include <linux/pkt_cls.h>
+
+enum {
+	TCA_EM_IPT_UNSPEC,
+	TCA_EM_IPT_HOOK,
+	TCA_EM_IPT_MATCH_NAME,
+	TCA_EM_IPT_MATCH_REVISION,
+	TCA_EM_IPT_MATCH_DATA,
+	__TCA_EM_IPT_MAX
+};
+
+#define TCA_EM_IPT_MAX (__TCA_EM_IPT_MAX - 1)
+
+#endif
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index c03d86a..203e7f7 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -658,6 +658,16 @@ config NET_EMATCH_IPSET
 	  To compile this code as a module, choose M here: the
 	  module will be called em_ipset.
 
+config NET_EMATCH_IPT
+	tristate "IPtables Matches"
+	depends on NET_EMATCH && NETFILTER && NETFILTER_XTABLES
+	---help---
+	  Say Y here to be able to classify packets based on iptables
+	  matches.
+
+	  To compile this code as a module, choose M here: the
+	  module will be called em_ipt.
+
 config NET_CLS_ACT
 	bool "Actions"
 	select NET_CLS
diff --git a/net/sched/Makefile b/net/sched/Makefile
index 5b63544..8811d38 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -75,3 +75,4 @@ obj-$(CONFIG_NET_EMATCH_META)	+= em_meta.o
 obj-$(CONFIG_NET_EMATCH_TEXT)	+= em_text.o
 obj-$(CONFIG_NET_EMATCH_CANID)	+= em_canid.o
 obj-$(CONFIG_NET_EMATCH_IPSET)	+= em_ipset.o
+obj-$(CONFIG_NET_EMATCH_IPT)	+= em_ipt.o
diff --git a/net/sched/em_ipt.c b/net/sched/em_ipt.c
new file mode 100644
index 0000000..2103b30
--- /dev/null
+++ b/net/sched/em_ipt.c
@@ -0,0 +1,244 @@
+/*
+ * net/sched/em_ipt.c IPtables matches Ematch
+ *
+ * (c) 2018 Eyal Birger <eyal.birger@gmail.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/gfp.h>
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/skbuff.h>
+#include <linux/ip.h>
+#include <linux/tc_ematch/tc_em_ipt.h>
+#include <linux/netfilter.h>
+#include <linux/netfilter/x_tables.h>
+#include <linux/netfilter_ipv4/ip_tables.h>
+#include <linux/netfilter_ipv6/ip6_tables.h>
+#include <net/pkt_cls.h>
+#include <net/ip.h>
+
+struct em_ipt_match {
+	const struct xt_match *match;
+	u32 hook;
+	u8 nfproto;
+	u8 match_data[0] __aligned(8);
+};
+
+static int check_match(struct net *net, struct em_ipt_match *im, int mdata_len)
+{
+	struct xt_mtchk_param mtpar = {};
+	union {
+		struct ipt_entry e4;
+		struct ip6t_entry e6;
+	} e = {};
+
+	mtpar.net	= net;
+	mtpar.table	= "filter";
+	mtpar.hook_mask	= 1 << im->hook;
+	mtpar.family	= im->nfproto;
+	mtpar.match	= im->match;
+	mtpar.entryinfo = &e;
+	mtpar.matchinfo	= (void *)im->match_data;
+	return xt_check_match(&mtpar, mdata_len, 0, 0);
+}
+
+static const struct nla_policy em_ipt_policy[TCA_EM_IPT_MAX + 1] = {
+	[TCA_EM_IPT_HOOK]		= { .type = NLA_U32 },
+	[TCA_EM_IPT_MATCH_NAME]		= { .type = NLA_STRING,
+					    .len = XT_EXTENSION_MAXNAMELEN },
+	[TCA_EM_IPT_MATCH_REVISION]	= { .type = NLA_U8 },
+	[TCA_EM_IPT_MATCH_DATA]		= { .type = NLA_UNSPEC },
+};
+
+static int em_ipt_change(struct net *net, __be16 protocol, void *data,
+			 int data_len, struct tcf_ematch *em)
+{
+	struct nlattr *tb[TCA_EM_IPT_MAX + 1];
+	struct em_ipt_match *im = NULL;
+	struct xt_match *match;
+	u8 nfproto, mrev = 0;
+	int mdata_len, ret;
+	const char *mname;
+
+	switch (protocol) {
+	case htons(ETH_P_IP):
+		nfproto = NFPROTO_IPV4;
+		break;
+	case htons(ETH_P_IPV6):
+		nfproto = NFPROTO_IPV6;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	ret = nla_parse(tb, TCA_EM_IPT_MAX, data, data_len, em_ipt_policy,
+			NULL);
+	if (ret < 0)
+		return ret;
+
+	if (!tb[TCA_EM_IPT_HOOK] || !tb[TCA_EM_IPT_MATCH_NAME] ||
+	    !tb[TCA_EM_IPT_MATCH_DATA])
+		return -EINVAL;
+
+	mname = nla_data(tb[TCA_EM_IPT_MATCH_NAME]);
+	if (tb[TCA_EM_IPT_MATCH_REVISION])
+		mrev = nla_get_u8(tb[TCA_EM_IPT_MATCH_REVISION]);
+
+	match = xt_request_find_match(nfproto, mname, mrev);
+	if (IS_ERR(match)) {
+		pr_err("unable to find match %s:%d\n", mname, mrev);
+		return PTR_ERR(match);
+	}
+
+	mdata_len = XT_ALIGN(nla_len(tb[TCA_EM_IPT_MATCH_DATA]));
+	im = kzalloc(sizeof(*im) + mdata_len, GFP_KERNEL);
+	if (!im) {
+		ret = -ENOMEM;
+		goto err;
+	}
+
+	im->match = match;
+	im->hook = nla_get_u32(tb[TCA_EM_IPT_HOOK]);
+	im->nfproto = nfproto;
+	nla_memcpy(im->match_data, tb[TCA_EM_IPT_MATCH_DATA], mdata_len);
+
+	ret = check_match(net, im, mdata_len);
+	if (ret)
+		goto err;
+
+	em->datalen = sizeof(*im) + mdata_len;
+	em->data = (unsigned long)im;
+	return 0;
+
+err:
+	kfree(im);
+	module_put(match->me);
+	return ret;
+}
+
+static void em_ipt_destroy(struct tcf_ematch *em)
+{
+	struct em_ipt_match *im = (void *)em->data;
+
+	if (!im)
+		return;
+
+	if (im->match->destroy) {
+		struct xt_mtdtor_param par = {
+			.net = em->net,
+			.match = im->match,
+			.matchinfo = im->match_data,
+			.family = im->nfproto
+		};
+		im->match->destroy(&par);
+	}
+	module_put(im->match->me);
+	kfree((void *)im);
+}
+
+static int em_ipt_match(struct sk_buff *skb, struct tcf_ematch *em,
+			struct tcf_pkt_info *info)
+{
+	const struct em_ipt_match *im = (const void *)em->data;
+	struct xt_action_param acpar = {};
+	struct net_device *indev = NULL;
+	struct nf_hook_state state;
+	int ret;
+
+	if (unlikely(!skb_at_tc_ingress(skb))) {
+		pr_notice_once("ipt match must not be used at egress\n");
+		return 0;
+	}
+
+	switch (tc_skb_protocol(skb)) {
+	case htons(ETH_P_IP):
+		if (im->nfproto != NFPROTO_IPV4)
+			return 0;
+		if (!pskb_network_may_pull(skb, sizeof(struct iphdr)))
+			return 0;
+		acpar.thoff = ip_hdrlen(skb);
+		break;
+	case htons(ETH_P_IPV6):
+		if (im->nfproto != NFPROTO_IPV6)
+			return 0;
+		if (!pskb_network_may_pull(skb, sizeof(struct ipv6hdr)))
+			return 0;
+		if (ipv6_find_hdr(skb, &acpar.thoff, -1,
+				  (unsigned short *)&acpar.fragoff, NULL) < 0)
+			return 0;
+		break;
+	default:
+		return 0;
+	}
+
+	rcu_read_lock();
+
+	if (skb->skb_iif)
+		indev = dev_get_by_index_rcu(em->net, skb->skb_iif);
+
+	nf_hook_state_init(&state, im->hook, im->nfproto, indev ?: skb->dev,
+			   skb->dev, NULL, em->net, NULL);
+
+	acpar.match = im->match;
+	acpar.matchinfo = im->match_data;
+	acpar.state = &state;
+
+	ret = im->match->match(skb, &acpar);
+
+	rcu_read_unlock();
+	return ret;
+}
+
+static int em_ipt_dump(struct sk_buff *skb, struct tcf_ematch *em)
+{
+	struct em_ipt_match *im = (void *)em->data;
+
+	if (nla_put_u32(skb, TCA_EM_IPT_HOOK, im->hook) < 0)
+		return -EMSGSIZE;
+	if (nla_put_string(skb, TCA_EM_IPT_MATCH_NAME, im->match->name) < 0)
+		return -EMSGSIZE;
+	if (nla_put_u8(skb, TCA_EM_IPT_MATCH_REVISION, im->match->revision) < 0)
+		return -EMSGSIZE;
+	if (nla_put(skb, TCA_EM_IPT_MATCH_DATA,
+		    im->match->usersize ?: im->match->matchsize,
+		    im->match_data) < 0)
+		return -EMSGSIZE;
+
+	return 0;
+}
+
+static struct tcf_ematch_ops em_ipt_ops = {
+	.kind	  = TCF_EM_IPT,
+	.change	  = em_ipt_change,
+	.destroy  = em_ipt_destroy,
+	.match	  = em_ipt_match,
+	.dump	  = em_ipt_dump,
+	.owner	  = THIS_MODULE,
+	.link	  = LIST_HEAD_INIT(em_ipt_ops.link)
+};
+
+static int __init init_em_ipt(void)
+{
+	return tcf_em_register(&em_ipt_ops);
+}
+
+static void __exit exit_em_ipt(void)
+{
+	tcf_em_unregister(&em_ipt_ops);
+}
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Eyal Birger <eyal.birger@gmail.com>");
+MODULE_DESCRIPTION("TC extended match for IPtables matches");
+
+module_init(init_em_ipt);
+module_exit(exit_em_ipt);
+
+MODULE_ALIAS_TCF_EMATCH(TCF_EM_IPT);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next,v2 2/2] net: sched: add em_ipt ematch for calling xtables matches
  2018-01-26 16:48 ` [PATCH net-next,v2 2/2] net: sched: add em_ipt ematch for calling xtables matches Eyal Birger
@ 2018-01-26 18:50   ` Pablo Neira Ayuso
  2018-01-26 19:57     ` Eyal Birger
  0 siblings, 1 reply; 7+ messages in thread
From: Pablo Neira Ayuso @ 2018-01-26 18:50 UTC (permalink / raw)
  To: Eyal Birger; +Cc: davem, jhs, xiyou.wangcong, netdev, shmulik, Eyal Birger

On Fri, Jan 26, 2018 at 06:48:53PM +0200, Eyal Birger wrote:
> diff --git a/net/sched/em_ipt.c b/net/sched/em_ipt.c
> new file mode 100644
> index 0000000..2103b30
> --- /dev/null
> +++ b/net/sched/em_ipt.c
[...]
> +static int em_ipt_match(struct sk_buff *skb, struct tcf_ematch *em,
> +			struct tcf_pkt_info *info)
> +{
> +	const struct em_ipt_match *im = (const void *)em->data;
> +	struct xt_action_param acpar = {};
> +	struct net_device *indev = NULL;
> +	struct nf_hook_state state;
> +	int ret;
> +
> +	if (unlikely(!skb_at_tc_ingress(skb))) {
> +		pr_notice_once("ipt match must not be used at egress\n");

Isn't there a way to reject the use of this from ->change()? ie. from
control plane configuration.

> +		return 0;
> +	}

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next,v2 2/2] net: sched: add em_ipt ematch for calling xtables matches
  2018-01-26 18:50   ` Pablo Neira Ayuso
@ 2018-01-26 19:57     ` Eyal Birger
  2018-01-29  3:22       ` Cong Wang
  0 siblings, 1 reply; 7+ messages in thread
From: Eyal Birger @ 2018-01-26 19:57 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: David Miller, Jamal Hadi Salim, Cong Wang,
	Linux Kernel Network Developers, shmulik, Eyal Birger

On Fri, Jan 26, 2018 at 8:50 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Fri, Jan 26, 2018 at 06:48:53PM +0200, Eyal Birger wrote:
>> diff --git a/net/sched/em_ipt.c b/net/sched/em_ipt.c
>> new file mode 100644
>> index 0000000..2103b30
>> --- /dev/null
>> +++ b/net/sched/em_ipt.c
> [...]
>> +static int em_ipt_match(struct sk_buff *skb, struct tcf_ematch *em,
>> +                     struct tcf_pkt_info *info)
>> +{
>> +     const struct em_ipt_match *im = (const void *)em->data;
>> +     struct xt_action_param acpar = {};
>> +     struct net_device *indev = NULL;
>> +     struct nf_hook_state state;
>> +     int ret;
>> +
>> +     if (unlikely(!skb_at_tc_ingress(skb))) {
>> +             pr_notice_once("ipt match must not be used at egress\n");
>
> Isn't there a way to reject the use of this from ->change()? ie. from
> control plane configuration.

I wasn't able to find a simple way of doing so:

- AFAIU tc filters are detached from the qdiscs they operate on via
tcf_block instances
  that may be shared by different qdiscs. I was not able to be sure that filters
  attached to ingress qdiscs via tcf_blocks at configuration time
cannot be later be shared
  with non ingress qdiscs. Nor was I able to find another classifier
making the ingress/egress
  distinction at configuration time.

- ematches are not provided with 'ingress/egress' information at
'change()' invocation, though
  of course the infrastructure could be extended to provide this,
given the distinction is available.

Eyal.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next,v2 2/2] net: sched: add em_ipt ematch for calling xtables matches
  2018-01-26 19:57     ` Eyal Birger
@ 2018-01-29  3:22       ` Cong Wang
  2018-01-30  8:48         ` Eyal Birger
  0 siblings, 1 reply; 7+ messages in thread
From: Cong Wang @ 2018-01-29  3:22 UTC (permalink / raw)
  To: Eyal Birger
  Cc: Pablo Neira Ayuso, David Miller, Jamal Hadi Salim,
	Linux Kernel Network Developers, shmulik, Eyal Birger

On Fri, Jan 26, 2018 at 11:57 AM, Eyal Birger <eyal.birger@gmail.com> wrote:
> On Fri, Jan 26, 2018 at 8:50 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
>> Isn't there a way to reject the use of this from ->change()? ie. from
>> control plane configuration.
>
> I wasn't able to find a simple way of doing so:
>
> - AFAIU tc filters are detached from the qdiscs they operate on via
> tcf_block instances
>   that may be shared by different qdiscs. I was not able to be sure that filters
>   attached to ingress qdiscs via tcf_blocks at configuration time
> cannot be later be shared
>   with non ingress qdiscs. Nor was I able to find another classifier
> making the ingress/egress
>   distinction at configuration time.
>
> - ematches are not provided with 'ingress/egress' information at
> 'change()' invocation, though
>   of course the infrastructure could be extended to provide this,
> given the distinction is available.
>

In the past you can check tp->q, but now we support shared tc filter
block, so it is hard. I think your v1 is okay, which just silently passes
the match on egress side. Or maybe we can just add a pr_info()
unconditionally in em_ipt_change() saying only ingress is supported.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next,v2 2/2] net: sched: add em_ipt ematch for calling xtables matches
  2018-01-29  3:22       ` Cong Wang
@ 2018-01-30  8:48         ` Eyal Birger
  0 siblings, 0 replies; 7+ messages in thread
From: Eyal Birger @ 2018-01-30  8:48 UTC (permalink / raw)
  To: Cong Wang
  Cc: Pablo Neira Ayuso, David Miller, Jamal Hadi Salim,
	Linux Kernel Network Developers, shmulik, Eyal Birger

On Sun, 28 Jan 2018 19:22:12 -0800
Cong Wang <xiyou.wangcong@gmail.com> wrote:

> On Fri, Jan 26, 2018 at 11:57 AM, Eyal Birger <eyal.birger@gmail.com>
> wrote:
> > On Fri, Jan 26, 2018 at 8:50 PM, Pablo Neira Ayuso
> > <pablo@netfilter.org> wrote:  
> >> Isn't there a way to reject the use of this from ->change()? ie.
> >> from control plane configuration.  
> >
> > I wasn't able to find a simple way of doing so:
> >
> > - AFAIU tc filters are detached from the qdiscs they operate on via
> > tcf_block instances
> >   that may be shared by different qdiscs. I was not able to be sure
> > that filters attached to ingress qdiscs via tcf_blocks at
> > configuration time cannot be later be shared
> >   with non ingress qdiscs. Nor was I able to find another classifier
> > making the ingress/egress
> >   distinction at configuration time.
> >
> > - ematches are not provided with 'ingress/egress' information at
> > 'change()' invocation, though
> >   of course the infrastructure could be extended to provide this,
> > given the distinction is available.
> >  
> 
> In the past you can check tp->q, but now we support shared tc filter
> block, so it is hard. I think your v1 is okay, which just silently
> passes the match on egress side. Or maybe we can just add a pr_info()
> unconditionally in em_ipt_change() saying only ingress is supported.

Thanks!

The motivation for allowing only ingress was to avoid skb modifications
on egress as when running the match on egress, skb->data must point to
the L3 header. Looking again at the calling flow e.g. from __dev_queue_xmit(),
I don't see a case where skb may be shared.

Similarly on ingress flow, sch_handle_ingress() modifies the skb, and
tc actions perform skb modification without share checking.

So as far as I can tell skb_pull() on the match is safe.
Is there a different code path I should be looking for?

If that is the case, perhaps the v1 approach supporting both directions
including skb_pull() can be resubmitted without the pr_notice once
net-next is open.

Eyal.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2018-01-30  8:48 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-26 16:48 [PATCH net-next,v2 0/2] net: sched: introduce em_ipt ematch Eyal Birger
2018-01-26 16:48 ` [PATCH net-next,v2 1/2] net: sched: ematch: pass protocol to ematch 'change()' handlers Eyal Birger
2018-01-26 16:48 ` [PATCH net-next,v2 2/2] net: sched: add em_ipt ematch for calling xtables matches Eyal Birger
2018-01-26 18:50   ` Pablo Neira Ayuso
2018-01-26 19:57     ` Eyal Birger
2018-01-29  3:22       ` Cong Wang
2018-01-30  8:48         ` Eyal Birger

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.