All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next 0/6] net: add support for ip_tun_info options setting
@ 2019-09-16  7:10 Xin Long
  2019-09-16  7:10 ` [PATCH net-next 1/6] lwtunnel: add options process for arp request Xin Long
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Xin Long @ 2019-09-16  7:10 UTC (permalink / raw)
  To: network dev; +Cc: davem, Jiri Benc, Thomas Graf, u9012063

With this patchset, users can configure options with LWTUNNEL_IP(6)_OPTS
by ip route encap for ersapn or vxlan lwtunnel. Note that in kernel part
it won't parse the option details but do some check and memcpy only, and
the options will be parsed by iproute in userspace.

We also improve the vxlan and erspan options processing in this patchset.

As an example I also wrote a patch for iproute2 that I will reply on this
mail, with it we can add options for erspan lwtunnel like:

   # ip net a a; ip net a b
   # ip -n a l a eth0 type veth peer name eth0 netns b
   # ip -n a l s eth0 up; ip -n b link set eth0 up
   # ip -n a a a 10.1.0.1/24 dev eth0; ip -n b a a 10.1.0.2/24 dev eth0
   # ip -n b l a erspan1 type erspan key 1 seq erspan 123 \
        local 10.1.0.2 remote 10.1.0.1
   # ip -n b a a 1.1.1.1/24 dev erspan1; ip -n b l s erspan1 up
   # ip -n b r a 2.1.1.0/24 dev erspan1
   # ip -n a l a erspan1 type erspan key 1 seq local 10.1.0.1 external
   # ip -n a a a 2.1.1.1/24 dev erspan1; ip -n a l s erspan1 up
   # ip -n a r a 1.1.1.0/24 encap ip id 1 erspan ver 1 idx 123 \
        dst 10.1.0.2 dev erspan1
   # ip -n a r s; ip net exec a ping 1.1.1.1 -c 1

Xin Long (6):
  lwtunnel: add options process for arp request
  lwtunnel: add LWTUNNEL_IP_OPTS support for lwtunnel_ip
  lwtunnel: add LWTUNNEL_IP6_OPTS support for lwtunnel_ip6
  vxlan: check tun_info options_len properly
  erspan: fix the tun_info options_len check
  erspan: make md work without TUNNEL_ERSPAN_OPT set

 drivers/net/vxlan.c           |  6 +++--
 include/uapi/linux/lwtunnel.h |  2 ++
 net/ipv4/ip_gre.c             | 31 ++++++++++-------------
 net/ipv4/ip_tunnel_core.c     | 59 +++++++++++++++++++++++++++++++++----------
 net/ipv6/ip6_gre.c            | 35 +++++++++++++------------
 5 files changed, 84 insertions(+), 49 deletions(-)

-- 
2.1.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH net-next 1/6] lwtunnel: add options process for arp request
  2019-09-16  7:10 [PATCH net-next 0/6] net: add support for ip_tun_info options setting Xin Long
@ 2019-09-16  7:10 ` Xin Long
  2019-09-16  7:10   ` [PATCH net-next 2/6] lwtunnel: add LWTUNNEL_IP_OPTS support for lwtunnel_ip Xin Long
  2019-09-16  7:17 ` [PATCH net-next 0/6] net: add support for ip_tun_info options setting Xin Long
  2019-09-16  7:20 ` David Miller
  2 siblings, 1 reply; 9+ messages in thread
From: Xin Long @ 2019-09-16  7:10 UTC (permalink / raw)
  To: network dev; +Cc: davem, Jiri Benc, Thomas Graf, u9012063

Without options copied to the dst tun_info in iptunnel_metadata_reply()
called by arp_process for handling arp_request, the generated arp_reply
packet may be dropped or sent out with wrong options for some tunnels
like erspan and vxlan, and the traffic will break.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/ipv4/ip_tunnel_core.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index 1452a97..10f0848 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -126,15 +126,14 @@ struct metadata_dst *iptunnel_metadata_reply(struct metadata_dst *md,
 
 	if (!md || md->type != METADATA_IP_TUNNEL ||
 	    md->u.tun_info.mode & IP_TUNNEL_INFO_TX)
-
 		return NULL;
 
-	res = metadata_dst_alloc(0, METADATA_IP_TUNNEL, flags);
+	src = &md->u.tun_info;
+	res = metadata_dst_alloc(src->options_len, METADATA_IP_TUNNEL, flags);
 	if (!res)
 		return NULL;
 
 	dst = &res->u.tun_info;
-	src = &md->u.tun_info;
 	dst->key.tun_id = src->key.tun_id;
 	if (src->mode & IP_TUNNEL_INFO_IPV6)
 		memcpy(&dst->key.u.ipv6.dst, &src->key.u.ipv6.src,
@@ -143,6 +142,8 @@ struct metadata_dst *iptunnel_metadata_reply(struct metadata_dst *md,
 		dst->key.u.ipv4.dst = src->key.u.ipv4.src;
 	dst->key.tun_flags = src->key.tun_flags;
 	dst->mode = src->mode | IP_TUNNEL_INFO_TX;
+	ip_tunnel_info_opts_set(dst, ip_tunnel_info_opts(src),
+				src->options_len, 0);
 
 	return res;
 }
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net-next 2/6] lwtunnel: add LWTUNNEL_IP_OPTS support for lwtunnel_ip
  2019-09-16  7:10 ` [PATCH net-next 1/6] lwtunnel: add options process for arp request Xin Long
@ 2019-09-16  7:10   ` Xin Long
  2019-09-16  7:10     ` [PATCH net-next 3/6] lwtunnel: add LWTUNNEL_IP6_OPTS support for lwtunnel_ip6 Xin Long
  0 siblings, 1 reply; 9+ messages in thread
From: Xin Long @ 2019-09-16  7:10 UTC (permalink / raw)
  To: network dev; +Cc: davem, Jiri Benc, Thomas Graf, u9012063

This patch is to add LWTUNNEL_IP_OPTS into lwtunnel_ip_t, by which
users will be able to set options for ip_tunnel_info by "ip route
encap" for erspan and vxlan's private metadata. Like one way to go
in iproute is:

  # ip route add 1.1.1.0/24 encap ip id 1 erspan ver 1 idx 123 \
      dst 10.1.0.2 dev erspan1
  # ip route show
    1.1.1.0/24  encap ip id 1 src 0.0.0.0 dst 10.1.0.2 ttl 0 \
      tos 0 erspan ver 1 idx 123 dev erspan1 scope link

Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 include/uapi/linux/lwtunnel.h |  1 +
 net/ipv4/ip_tunnel_core.c     | 30 ++++++++++++++++++++++++------
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/include/uapi/linux/lwtunnel.h b/include/uapi/linux/lwtunnel.h
index de696ca..93f2c05 100644
--- a/include/uapi/linux/lwtunnel.h
+++ b/include/uapi/linux/lwtunnel.h
@@ -27,6 +27,7 @@ enum lwtunnel_ip_t {
 	LWTUNNEL_IP_TOS,
 	LWTUNNEL_IP_FLAGS,
 	LWTUNNEL_IP_PAD,
+	LWTUNNEL_IP_OPTS,
 	__LWTUNNEL_IP_MAX,
 };
 
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index 10f0848..d9b7188 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -218,6 +218,7 @@ static const struct nla_policy ip_tun_policy[LWTUNNEL_IP_MAX + 1] = {
 	[LWTUNNEL_IP_TTL]	= { .type = NLA_U8 },
 	[LWTUNNEL_IP_TOS]	= { .type = NLA_U8 },
 	[LWTUNNEL_IP_FLAGS]	= { .type = NLA_U16 },
+	[LWTUNNEL_IP_OPTS]	= { .type = NLA_BINARY },
 };
 
 static int ip_tun_build_state(struct nlattr *attr,
@@ -228,14 +229,20 @@ static int ip_tun_build_state(struct nlattr *attr,
 	struct ip_tunnel_info *tun_info;
 	struct lwtunnel_state *new_state;
 	struct nlattr *tb[LWTUNNEL_IP_MAX + 1];
-	int err;
+	int err, opts_len = 0;
+	void *opts;
 
 	err = nla_parse_nested_deprecated(tb, LWTUNNEL_IP_MAX, attr,
 					  ip_tun_policy, extack);
 	if (err < 0)
 		return err;
 
-	new_state = lwtunnel_state_alloc(sizeof(*tun_info));
+	if (tb[LWTUNNEL_IP_OPTS]) {
+		opts = nla_data(tb[LWTUNNEL_IP_OPTS]);
+		opts_len = nla_len(tb[LWTUNNEL_IP_OPTS]);
+	}
+
+	new_state = lwtunnel_state_alloc(sizeof(*tun_info) + opts_len);
 	if (!new_state)
 		return -ENOMEM;
 
@@ -269,8 +276,10 @@ static int ip_tun_build_state(struct nlattr *attr,
 	if (tb[LWTUNNEL_IP_FLAGS])
 		tun_info->key.tun_flags = nla_get_be16(tb[LWTUNNEL_IP_FLAGS]);
 
+	if (opts_len)
+		ip_tunnel_info_opts_set(tun_info, opts, opts_len, 0);
+
 	tun_info->mode = IP_TUNNEL_INFO_TX;
-	tun_info->options_len = 0;
 
 	*ts = new_state;
 
@@ -299,6 +308,10 @@ static int ip_tun_fill_encap_info(struct sk_buff *skb,
 	    nla_put_u8(skb, LWTUNNEL_IP_TTL, tun_info->key.ttl) ||
 	    nla_put_be16(skb, LWTUNNEL_IP_FLAGS, tun_info->key.tun_flags))
 		return -ENOMEM;
+	if (tun_info->options_len &&
+	    nla_put(skb, LWTUNNEL_IP_OPTS,
+		    tun_info->options_len, ip_tunnel_info_opts(tun_info)))
+		return -ENOMEM;
 
 	return 0;
 }
@@ -310,13 +323,18 @@ static int ip_tun_encap_nlsize(struct lwtunnel_state *lwtstate)
 		+ nla_total_size(4)	/* LWTUNNEL_IP_SRC */
 		+ nla_total_size(1)	/* LWTUNNEL_IP_TOS */
 		+ nla_total_size(1)	/* LWTUNNEL_IP_TTL */
-		+ nla_total_size(2);	/* LWTUNNEL_IP_FLAGS */
+		+ nla_total_size(2)	/* LWTUNNEL_IP_FLAGS */
+		+ lwt_tun_info(lwtstate)->options_len;	/* LWTUNNEL_IP_OPTS */
 }
 
 static int ip_tun_cmp_encap(struct lwtunnel_state *a, struct lwtunnel_state *b)
 {
-	return memcmp(lwt_tun_info(a), lwt_tun_info(b),
-		      sizeof(struct ip_tunnel_info));
+	struct ip_tunnel_info *info_a = lwt_tun_info(a);
+	struct ip_tunnel_info *info_b = lwt_tun_info(b);
+	u8 opts_len;
+
+	opts_len = min(info_a->options_len, info_b->options_len);
+	return memcmp(info_a, info_b, sizeof(*info_a) + opts_len);
 }
 
 static const struct lwtunnel_encap_ops ip_tun_lwt_ops = {
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net-next 3/6] lwtunnel: add LWTUNNEL_IP6_OPTS support for lwtunnel_ip6
  2019-09-16  7:10   ` [PATCH net-next 2/6] lwtunnel: add LWTUNNEL_IP_OPTS support for lwtunnel_ip Xin Long
@ 2019-09-16  7:10     ` Xin Long
  2019-09-16  7:10       ` [PATCH net-next 4/6] vxlan: check tun_info options_len properly Xin Long
  0 siblings, 1 reply; 9+ messages in thread
From: Xin Long @ 2019-09-16  7:10 UTC (permalink / raw)
  To: network dev; +Cc: davem, Jiri Benc, Thomas Graf, u9012063

Similar to lwtunnel_ip, this patch is to add options set/dump support
for lwtunnel_ip6.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 include/uapi/linux/lwtunnel.h |  1 +
 net/ipv4/ip_tunnel_core.c     | 22 ++++++++++++++++++----
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/include/uapi/linux/lwtunnel.h b/include/uapi/linux/lwtunnel.h
index 93f2c05..4bed5e6 100644
--- a/include/uapi/linux/lwtunnel.h
+++ b/include/uapi/linux/lwtunnel.h
@@ -42,6 +42,7 @@ enum lwtunnel_ip6_t {
 	LWTUNNEL_IP6_TC,
 	LWTUNNEL_IP6_FLAGS,
 	LWTUNNEL_IP6_PAD,
+	LWTUNNEL_IP6_OPTS,
 	__LWTUNNEL_IP6_MAX,
 };
 
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index d9b7188..c8f5375a 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -353,6 +353,7 @@ static const struct nla_policy ip6_tun_policy[LWTUNNEL_IP6_MAX + 1] = {
 	[LWTUNNEL_IP6_HOPLIMIT]		= { .type = NLA_U8 },
 	[LWTUNNEL_IP6_TC]		= { .type = NLA_U8 },
 	[LWTUNNEL_IP6_FLAGS]		= { .type = NLA_U16 },
+	[LWTUNNEL_IP6_OPTS]		= { .type = NLA_BINARY },
 };
 
 static int ip6_tun_build_state(struct nlattr *attr,
@@ -363,14 +364,20 @@ static int ip6_tun_build_state(struct nlattr *attr,
 	struct ip_tunnel_info *tun_info;
 	struct lwtunnel_state *new_state;
 	struct nlattr *tb[LWTUNNEL_IP6_MAX + 1];
-	int err;
+	int err, opts_len = 0;
+	void *opts;
 
 	err = nla_parse_nested_deprecated(tb, LWTUNNEL_IP6_MAX, attr,
 					  ip6_tun_policy, extack);
 	if (err < 0)
 		return err;
 
-	new_state = lwtunnel_state_alloc(sizeof(*tun_info));
+	if (tb[LWTUNNEL_IP6_OPTS]) {
+		opts = nla_data(tb[LWTUNNEL_IP6_OPTS]);
+		opts_len = nla_len(tb[LWTUNNEL_IP6_OPTS]);
+	}
+
+	new_state = lwtunnel_state_alloc(sizeof(*tun_info)  + opts_len);
 	if (!new_state)
 		return -ENOMEM;
 
@@ -396,8 +403,10 @@ static int ip6_tun_build_state(struct nlattr *attr,
 	if (tb[LWTUNNEL_IP6_FLAGS])
 		tun_info->key.tun_flags = nla_get_be16(tb[LWTUNNEL_IP6_FLAGS]);
 
+	if (opts_len)
+		ip_tunnel_info_opts_set(tun_info, opts, opts_len, 0);
+
 	tun_info->mode = IP_TUNNEL_INFO_TX | IP_TUNNEL_INFO_IPV6;
-	tun_info->options_len = 0;
 
 	*ts = new_state;
 
@@ -417,6 +426,10 @@ static int ip6_tun_fill_encap_info(struct sk_buff *skb,
 	    nla_put_u8(skb, LWTUNNEL_IP6_HOPLIMIT, tun_info->key.ttl) ||
 	    nla_put_be16(skb, LWTUNNEL_IP6_FLAGS, tun_info->key.tun_flags))
 		return -ENOMEM;
+	if (tun_info->options_len &&
+	    nla_put(skb, LWTUNNEL_IP6_OPTS,
+		    tun_info->options_len, ip_tunnel_info_opts(tun_info)))
+		return -ENOMEM;
 
 	return 0;
 }
@@ -428,7 +441,8 @@ static int ip6_tun_encap_nlsize(struct lwtunnel_state *lwtstate)
 		+ nla_total_size(16)	/* LWTUNNEL_IP6_SRC */
 		+ nla_total_size(1)	/* LWTUNNEL_IP6_HOPLIMIT */
 		+ nla_total_size(1)	/* LWTUNNEL_IP6_TC */
-		+ nla_total_size(2);	/* LWTUNNEL_IP6_FLAGS */
+		+ nla_total_size(2)	/* LWTUNNEL_IP6_FLAGS */
+		+ lwt_tun_info(lwtstate)->options_len;  /* LWTUNNEL_IP6_OPTS */
 }
 
 static const struct lwtunnel_encap_ops ip6_tun_lwt_ops = {
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net-next 4/6] vxlan: check tun_info options_len properly
  2019-09-16  7:10     ` [PATCH net-next 3/6] lwtunnel: add LWTUNNEL_IP6_OPTS support for lwtunnel_ip6 Xin Long
@ 2019-09-16  7:10       ` Xin Long
  2019-09-16  7:10         ` [PATCH net-next 5/6] erspan: fix the tun_info options_len check Xin Long
  0 siblings, 1 reply; 9+ messages in thread
From: Xin Long @ 2019-09-16  7:10 UTC (permalink / raw)
  To: network dev; +Cc: davem, Jiri Benc, Thomas Graf, u9012063

This patch is to improve the tun_info options_len by dropping
the skb when TUNNEL_VXLAN_OPT is set but options_len is less
than vxlan_metadata. This can void a potential out-of-bounds
access on ip_tun_info.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 drivers/net/vxlan.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 3d9bcc9..e0787286 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2487,9 +2487,11 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 		vni = tunnel_id_to_key32(info->key.tun_id);
 		ifindex = 0;
 		dst_cache = &info->dst_cache;
-		if (info->options_len &&
-		    info->key.tun_flags & TUNNEL_VXLAN_OPT)
+		if (info->key.tun_flags & TUNNEL_VXLAN_OPT) {
+			if (info->options_len < sizeof(*md))
+				goto drop;
 			md = ip_tunnel_info_opts(info);
+		}
 		ttl = info->key.ttl;
 		tos = info->key.tos;
 		label = info->key.label;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net-next 5/6] erspan: fix the tun_info options_len check
  2019-09-16  7:10       ` [PATCH net-next 4/6] vxlan: check tun_info options_len properly Xin Long
@ 2019-09-16  7:10         ` Xin Long
  2019-09-16  7:10           ` [PATCH net-next 6/6] erspan: make md work without TUNNEL_ERSPAN_OPT set Xin Long
  0 siblings, 1 reply; 9+ messages in thread
From: Xin Long @ 2019-09-16  7:10 UTC (permalink / raw)
  To: network dev; +Cc: davem, Jiri Benc, Thomas Graf, u9012063

The check for !md doens't really work for ip_tunnel_info_opts(info) which
only does info + 1. Also to avoid out-of-bounds access on info, it should
ensure options_len is not less than erspan_metadata in both erspan_xmit()
and ip6erspan_tunnel_xmit().

Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/ipv4/ip_gre.c  | 4 ++--
 net/ipv6/ip6_gre.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index a53a543..df7149c 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -509,9 +509,9 @@ static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev)
 	key = &tun_info->key;
 	if (!(tun_info->key.tun_flags & TUNNEL_ERSPAN_OPT))
 		goto err_free_skb;
-	md = ip_tunnel_info_opts(tun_info);
-	if (!md)
+	if (sizeof(*md) > tun_info->options_len)
 		goto err_free_skb;
+	md = ip_tunnel_info_opts(tun_info);
 
 	/* ERSPAN has fixed 8 byte GRE header */
 	version = md->version;
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index dd2d0b96..4aba9e0 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -980,9 +980,9 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
 		dsfield = key->tos;
 		if (!(tun_info->key.tun_flags & TUNNEL_ERSPAN_OPT))
 			goto tx_err;
-		md = ip_tunnel_info_opts(tun_info);
-		if (!md)
+		if (sizeof(*md) > tun_info->options_len)
 			goto tx_err;
+		md = ip_tunnel_info_opts(tun_info);
 
 		tun_id = tunnel_id_to_key32(key->tun_id);
 		if (md->version == 1) {
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net-next 6/6] erspan: make md work without TUNNEL_ERSPAN_OPT set
  2019-09-16  7:10         ` [PATCH net-next 5/6] erspan: fix the tun_info options_len check Xin Long
@ 2019-09-16  7:10           ` Xin Long
  0 siblings, 0 replies; 9+ messages in thread
From: Xin Long @ 2019-09-16  7:10 UTC (permalink / raw)
  To: network dev; +Cc: davem, Jiri Benc, Thomas Graf, u9012063

Now when a skb comes with ip_tun_info but with no TUNNEL_ERSPAN_OPT to
a md erspan device, it will be dropped.

This patch is to allow this skb to go through this erspan device, and
the options (version, index, hwid, dir, etc.) will be filled with
tunnel's params, which can be configured by users.

This can be verified by:

 # ip net a a; ip net a b
 # ip -n a l a eth0 type veth peer name eth0 netns b
 # ip -n a l s eth0 up; ip -n b link set eth0 up
 # ip -n a a a 10.1.0.1/24 dev eth0; ip -n b a a 10.1.0.2/24 dev eth0
 # ip -n b l a erspan1 type erspan key 1 seq local 10.1.0.2 remote 10.1.0.1
 # ip -n b a a 1.1.1.1/24 dev erspan1; ip -n b l s erspan1 up
 # ip -n b r a 2.1.1.0/24 dev erspan1
 # ip -n a l a erspan1 type erspan key 1 seq local 10.1.0.1 external
 # ip -n a a a 2.1.1.1/24 dev erspan1; ip -n a l s erspan1 up
 # ip -n a r a 1.1.1.0/24 encap ip id 1 dst 10.1.0.2 dev erspan1
 # ip net exec a ping 1.1.1.1 -c 1

Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/ipv4/ip_gre.c  | 31 +++++++++++++------------------
 net/ipv6/ip6_gre.c | 35 +++++++++++++++++++----------------
 2 files changed, 32 insertions(+), 34 deletions(-)

diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index df7149c..ac4cbb8 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -491,15 +491,12 @@ static void gre_fb_xmit(struct sk_buff *skb, struct net_device *dev,
 static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct ip_tunnel *tunnel = netdev_priv(dev);
+	struct erspan_metadata *md = NULL;
 	struct ip_tunnel_info *tun_info;
 	const struct ip_tunnel_key *key;
-	struct erspan_metadata *md;
+	int version, nhoff, thoff;
 	bool truncate = false;
 	__be16 proto;
-	int tunnel_hlen;
-	int version;
-	int nhoff;
-	int thoff;
 
 	tun_info = skb_tunnel_info(skb);
 	if (unlikely(!tun_info || !(tun_info->mode & IP_TUNNEL_INFO_TX) ||
@@ -507,15 +504,11 @@ static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev)
 		goto err_free_skb;
 
 	key = &tun_info->key;
-	if (!(tun_info->key.tun_flags & TUNNEL_ERSPAN_OPT))
-		goto err_free_skb;
-	if (sizeof(*md) > tun_info->options_len)
-		goto err_free_skb;
-	md = ip_tunnel_info_opts(tun_info);
-
-	/* ERSPAN has fixed 8 byte GRE header */
-	version = md->version;
-	tunnel_hlen = 8 + erspan_hdr_len(version);
+	if (key->tun_flags & TUNNEL_ERSPAN_OPT) {
+		if (tun_info->options_len < sizeof(*md))
+			goto err_free_skb;
+		md = ip_tunnel_info_opts(tun_info);
+	}
 
 	if (skb_cow_head(skb, dev->needed_headroom))
 		goto err_free_skb;
@@ -538,15 +531,17 @@ static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev)
 	    (ntohs(ipv6_hdr(skb)->payload_len) > skb->len - thoff))
 		truncate = true;
 
+	version = md ? md->version : tunnel->erspan_ver;
 	if (version == 1) {
 		erspan_build_header(skb, ntohl(tunnel_id_to_key32(key->tun_id)),
-				    ntohl(md->u.index), truncate, true);
+				    md ? ntohl(md->u.index) : tunnel->index,
+				    truncate, true);
 		proto = htons(ETH_P_ERSPAN);
 	} else if (version == 2) {
 		erspan_build_header_v2(skb,
 				       ntohl(tunnel_id_to_key32(key->tun_id)),
-				       md->u.md2.dir,
-				       get_hwid(&md->u.md2),
+				       md ? md->u.md2.dir : tunnel->dir,
+				       md ? get_hwid(&md->u.md2) : tunnel->hwid,
 				       truncate, true);
 		proto = htons(ETH_P_ERSPAN2);
 	} else {
@@ -556,7 +551,7 @@ static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev)
 	gre_build_header(skb, 8, TUNNEL_SEQ,
 			 proto, 0, htonl(tunnel->o_seqno++));
 
-	ip_md_tunnel_xmit(skb, dev, IPPROTO_GRE, tunnel_hlen);
+	ip_md_tunnel_xmit(skb, dev, IPPROTO_GRE, 8 + erspan_hdr_len(version));
 
 	return;
 
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 4aba9e0..a48cec3 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -959,10 +959,11 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
 	 * for native mode, call prepare_ip6gre_xmit_{ipv4,ipv6}.
 	 */
 	if (t->parms.collect_md) {
+		struct erspan_metadata *md = NULL;
 		struct ip_tunnel_info *tun_info;
 		const struct ip_tunnel_key *key;
-		struct erspan_metadata *md;
 		__be32 tun_id;
+		int version;
 
 		tun_info = skb_tunnel_info(skb);
 		if (unlikely(!tun_info ||
@@ -978,23 +979,25 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
 		fl6.flowi6_uid = sock_net_uid(dev_net(dev), NULL);
 
 		dsfield = key->tos;
-		if (!(tun_info->key.tun_flags & TUNNEL_ERSPAN_OPT))
-			goto tx_err;
-		if (sizeof(*md) > tun_info->options_len)
-			goto tx_err;
-		md = ip_tunnel_info_opts(tun_info);
+		if (key->tun_flags & TUNNEL_ERSPAN_OPT) {
+			if (tun_info->options_len < sizeof(*md))
+				goto tx_err;
+			md = ip_tunnel_info_opts(tun_info);
+		}
 
 		tun_id = tunnel_id_to_key32(key->tun_id);
-		if (md->version == 1) {
-			erspan_build_header(skb,
-					    ntohl(tun_id),
-					    ntohl(md->u.index), truncate,
-					    false);
-		} else if (md->version == 2) {
-			erspan_build_header_v2(skb,
-					       ntohl(tun_id),
-					       md->u.md2.dir,
-					       get_hwid(&md->u.md2),
+		version = md ? md->version : t->parms.erspan_ver;
+		if (version == 1) {
+			erspan_build_header(skb, ntohl(tun_id),
+					    md ? ntohl(md->u.index)
+					       : t->parms.index,
+					    truncate, false);
+		} else if (version == 2) {
+			erspan_build_header_v2(skb, ntohl(tun_id),
+					       md ? md->u.md2.dir
+						  : t->parms.dir,
+					       md ? get_hwid(&md->u.md2)
+						  : t->parms.hwid,
 					       truncate, false);
 		} else {
 			goto tx_err;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 0/6] net: add support for ip_tun_info options setting
  2019-09-16  7:10 [PATCH net-next 0/6] net: add support for ip_tun_info options setting Xin Long
  2019-09-16  7:10 ` [PATCH net-next 1/6] lwtunnel: add options process for arp request Xin Long
@ 2019-09-16  7:17 ` Xin Long
  2019-09-16  7:20 ` David Miller
  2 siblings, 0 replies; 9+ messages in thread
From: Xin Long @ 2019-09-16  7:17 UTC (permalink / raw)
  To: network dev; +Cc: davem, Jiri Benc, Thomas Graf, William Tu

[-- Attachment #1: Type: text/plain, Size: 2040 bytes --]

On Mon, Sep 16, 2019 at 3:10 PM Xin Long <lucien.xin@gmail.com> wrote:
>
> With this patchset, users can configure options with LWTUNNEL_IP(6)_OPTS
> by ip route encap for ersapn or vxlan lwtunnel. Note that in kernel part
> it won't parse the option details but do some check and memcpy only, and
> the options will be parsed by iproute in userspace.
>
> We also improve the vxlan and erspan options processing in this patchset.
>
> As an example I also wrote a patch for iproute2 that I will reply on this
> mail, with it we can add options for erspan lwtunnel like:
>
>    # ip net a a; ip net a b
>    # ip -n a l a eth0 type veth peer name eth0 netns b
>    # ip -n a l s eth0 up; ip -n b link set eth0 up
>    # ip -n a a a 10.1.0.1/24 dev eth0; ip -n b a a 10.1.0.2/24 dev eth0
>    # ip -n b l a erspan1 type erspan key 1 seq erspan 123 \
>         local 10.1.0.2 remote 10.1.0.1
>    # ip -n b a a 1.1.1.1/24 dev erspan1; ip -n b l s erspan1 up
>    # ip -n b r a 2.1.1.0/24 dev erspan1
>    # ip -n a l a erspan1 type erspan key 1 seq local 10.1.0.1 external
>    # ip -n a a a 2.1.1.1/24 dev erspan1; ip -n a l s erspan1 up
>    # ip -n a r a 1.1.1.0/24 encap ip id 1 erspan ver 1 idx 123 \
>         dst 10.1.0.2 dev erspan1
>    # ip -n a r s; ip net exec a ping 1.1.1.1 -c 1
the iproute2 patch for testing is as attached.

>
> Xin Long (6):
>   lwtunnel: add options process for arp request
>   lwtunnel: add LWTUNNEL_IP_OPTS support for lwtunnel_ip
>   lwtunnel: add LWTUNNEL_IP6_OPTS support for lwtunnel_ip6
>   vxlan: check tun_info options_len properly
>   erspan: fix the tun_info options_len check
>   erspan: make md work without TUNNEL_ERSPAN_OPT set
>
>  drivers/net/vxlan.c           |  6 +++--
>  include/uapi/linux/lwtunnel.h |  2 ++
>  net/ipv4/ip_gre.c             | 31 ++++++++++-------------
>  net/ipv4/ip_tunnel_core.c     | 59 +++++++++++++++++++++++++++++++++----------
>  net/ipv6/ip6_gre.c            | 35 +++++++++++++------------
>  5 files changed, 84 insertions(+), 49 deletions(-)
>
> --
> 2.1.0
>

[-- Attachment #2: 0001-iproute_lwtunnel-add-support-options-for-erspan-meta.patch --]
[-- Type: application/octet-stream, Size: 7224 bytes --]

From c852e8b4c6838b5beed3e477bb6bfb9dbd9b2900 Mon Sep 17 00:00:00 2001
From: Xin Long <lucien.xin@gmail.com>
Date: Mon, 16 Sep 2019 03:14:07 -0400
Subject: [PATCH] iproute_lwtunnel: add support options for erspan metadata

Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 include/uapi/linux/lwtunnel.h |   2 +
 ip/iproute_lwtunnel.c         | 131 ++++++++++++++++++++++++++++++++--
 2 files changed, 129 insertions(+), 4 deletions(-)

diff --git a/include/uapi/linux/lwtunnel.h b/include/uapi/linux/lwtunnel.h
index 3f3fe6f3..c25ff92d 100644
--- a/include/uapi/linux/lwtunnel.h
+++ b/include/uapi/linux/lwtunnel.h
@@ -27,6 +27,7 @@ enum lwtunnel_ip_t {
 	LWTUNNEL_IP_TOS,
 	LWTUNNEL_IP_FLAGS,
 	LWTUNNEL_IP_PAD,
+	LWTUNNEL_IP_OPTS,
 	__LWTUNNEL_IP_MAX,
 };
 
@@ -41,6 +42,7 @@ enum lwtunnel_ip6_t {
 	LWTUNNEL_IP6_TC,
 	LWTUNNEL_IP6_FLAGS,
 	LWTUNNEL_IP6_PAD,
+	LWTUNNEL_IP6_OPTS,
 	__LWTUNNEL_IP6_MAX,
 };
 
diff --git a/ip/iproute_lwtunnel.c b/ip/iproute_lwtunnel.c
index 03217b8f..ff7eb450 100644
--- a/ip/iproute_lwtunnel.c
+++ b/ip/iproute_lwtunnel.c
@@ -32,6 +32,7 @@
 #include <linux/seg6_hmac.h>
 #include <linux/seg6_local.h>
 #include <linux/if_tunnel.h>
+#include <linux/erspan.h>
 
 static const char *format_encap_type(int type)
 {
@@ -294,7 +295,7 @@ static void print_encap_mpls(FILE *fp, struct rtattr *encap)
 static void print_encap_ip(FILE *fp, struct rtattr *encap)
 {
 	struct rtattr *tb[LWTUNNEL_IP_MAX+1];
-	__u16 flags;
+	__u16 flags = 0;
 
 	parse_rtattr_nested(tb, LWTUNNEL_IP_MAX, encap);
 
@@ -329,6 +330,25 @@ static void print_encap_ip(FILE *fp, struct rtattr *encap)
 		if (flags & TUNNEL_SEQ)
 			print_bool(PRINT_ANY, "seq", "seq ", true);
 	}
+
+	if (tb[LWTUNNEL_IP_OPTS]) {
+		if (flags & TUNNEL_ERSPAN_OPT) {
+			struct erspan_metadata *em = RTA_DATA(tb[LWTUNNEL_IP_OPTS]);
+
+			if (em->version == 1) {
+				print_bool(PRINT_ANY, "erspan", "erspan ", true);
+				print_uint(PRINT_ANY, "ver", "ver %u ", 1);
+				print_uint(PRINT_ANY, "idx", "idx %u ",
+					   ntohl(em->u.index));
+			} else if (em->version == 2) {
+				print_bool(PRINT_ANY, "erspan", "erspan ", true);
+				print_color_string(PRINT_ANY, COLOR_INET,
+						   "dir", "dir %s ",
+						   em->u.md2.dir ? "ingress" : "exgress");
+				print_uint(PRINT_ANY, "hwid", "hwid %u ", em->u.md2.hwid);
+			}
+		}
+	}
 }
 
 static void print_encap_ila(FILE *fp, struct rtattr *encap)
@@ -365,7 +385,7 @@ static void print_encap_ila(FILE *fp, struct rtattr *encap)
 static void print_encap_ip6(FILE *fp, struct rtattr *encap)
 {
 	struct rtattr *tb[LWTUNNEL_IP6_MAX+1];
-	__u16 flags;
+	__u16 flags = 0;
 
 	parse_rtattr_nested(tb, LWTUNNEL_IP6_MAX, encap);
 
@@ -401,6 +421,25 @@ static void print_encap_ip6(FILE *fp, struct rtattr *encap)
 		if (flags & TUNNEL_SEQ)
 			print_bool(PRINT_ANY, "seq", "seq ", true);
 	}
+
+	if (tb[LWTUNNEL_IP6_OPTS]) {
+		if (flags & TUNNEL_ERSPAN_OPT) {
+			struct erspan_metadata *em = RTA_DATA(tb[LWTUNNEL_IP6_OPTS]);
+
+			if (em->version == 1) {
+				print_bool(PRINT_ANY, "erspan", "erspan ", true);
+				print_uint(PRINT_ANY, "ver", "ver %u ", 1);
+				print_uint(PRINT_ANY, "idx", "idx %u ",
+					   ntohl(em->u.index));
+			} else if (em->version == 2) {
+				print_bool(PRINT_ANY, "erspan", "erspan ", true);
+				print_color_string(PRINT_ANY, COLOR_INET,
+						   "dir", "dir %s ",
+						   em->u.md2.dir ? "ingress" : "exgress");
+				print_uint(PRINT_ANY, "hwid", "hwid %u ", em->u.md2.hwid);
+			}
+		}
+	}
 }
 
 static void print_encap_bpf(FILE *fp, struct rtattr *encap)
@@ -799,7 +838,7 @@ static int parse_encap_ip(struct rtattr *rta, size_t len,
 			  int *argcp, char ***argvp)
 {
 	int id_ok = 0, dst_ok = 0, src_ok = 0, tos_ok = 0, ttl_ok = 0;
-	int key_ok = 0, csum_ok = 0, seq_ok = 0;
+	int key_ok = 0, csum_ok = 0, seq_ok = 0, erspan_ok = 0;
 	char **argv = *argvp;
 	int argc = *argcp;
 	int ret = 0;
@@ -851,6 +890,48 @@ static int parse_encap_ip(struct rtattr *rta, size_t len,
 			if (get_u8(&ttl, *argv, 0))
 				invarg("\"ttl\" value is invalid\n", *argv);
 			ret = rta_addattr8(rta, len, LWTUNNEL_IP_TTL, ttl);
+		} else if (strcmp(*argv, "erspan") == 0) {
+			struct erspan_metadata em;
+
+			memset(&em, 0, sizeof(em));
+			if (erspan_ok++)
+				duparg2("erspan", *argv);
+			flags |= TUNNEL_ERSPAN_OPT;
+			NEXT_ARG();
+			if (strcmp(*argv, "ver"))
+				duparg2("ver", *argv);
+			NEXT_ARG();
+			if (get_s32(&em.version, *argv, 0) ||
+			    (em.version != 1 && em.version != 2))
+				invarg("\"tos\" value is invalid\n", *argv);
+			NEXT_ARG();
+			if (em.version == 1) {
+				if (strcmp(*argv, "idx"))
+					duparg2("idx", *argv);
+				NEXT_ARG();
+				if (get_be32(&em.u.index, *argv, 0))
+					invarg("\"idx\" value is invalid\n", *argv);
+			} else {
+				__u8 hwid;
+
+				if (strcmp(*argv, "dir"))
+					duparg2("dir", *argv);
+				NEXT_ARG();
+				if (strcmp(*argv, "ingress") == 0)
+					em.u.md2.dir = 0;
+				else if (strcmp(*argv, "exgress") == 0)
+					em.u.md2.dir = 1;
+				else
+					invarg("\"dir\" value is invalid\n", *argv);
+				NEXT_ARG();
+				if (strcmp(*argv, "hwid"))
+					duparg2("hwid", *argv);
+				NEXT_ARG();
+				if (get_u8(&hwid, *argv, 0))
+					invarg("\"hwid\" value is invalid\n", *argv);
+				em.u.md2.hwid = hwid;
+			}
+			rta_addattr_l(rta, len, LWTUNNEL_IP_OPTS, &em, sizeof(em));
 		} else if (strcmp(*argv, "key") == 0) {
 			if (key_ok++)
 				duparg2("key", *argv);
@@ -966,7 +1047,7 @@ static int parse_encap_ip6(struct rtattr *rta, size_t len,
 			   int *argcp, char ***argvp)
 {
 	int id_ok = 0, dst_ok = 0, src_ok = 0, tos_ok = 0, ttl_ok = 0;
-	int key_ok = 0, csum_ok = 0, seq_ok = 0;
+	int key_ok = 0, csum_ok = 0, seq_ok = 0, erspan_ok = 0;
 	char **argv = *argvp;
 	int argc = *argcp;
 	int ret = 0;
@@ -1020,6 +1101,48 @@ static int parse_encap_ip6(struct rtattr *rta, size_t len,
 				       *argv);
 			ret = rta_addattr8(rta, len, LWTUNNEL_IP6_HOPLIMIT,
 					   hoplimit);
+		} else if (strcmp(*argv, "erspan") == 0) {
+			struct erspan_metadata em;
+
+			memset(&em, 0, sizeof(em));
+			if (erspan_ok++)
+				duparg2("erspan", *argv);
+			flags |= TUNNEL_ERSPAN_OPT;
+			NEXT_ARG();
+			if (strcmp(*argv, "ver"))
+				duparg2("ver", *argv);
+			NEXT_ARG();
+			if (get_s32(&em.version, *argv, 0) ||
+			    (em.version != 1 && em.version != 2))
+				invarg("\"tos\" value is invalid\n", *argv);
+			NEXT_ARG();
+			if (em.version == 1) {
+				if (strcmp(*argv, "idx"))
+					duparg2("idx", *argv);
+				NEXT_ARG();
+				if (get_be32(&em.u.index, *argv, 0))
+					invarg("\"idx\" value is invalid\n", *argv);
+			} else {
+				__u8 hwid;
+
+				if (strcmp(*argv, "dir"))
+					duparg2("dir", *argv);
+				NEXT_ARG();
+				if (strcmp(*argv, "ingress") == 0)
+					em.u.md2.dir = 0;
+				else if (strcmp(*argv, "exgress") == 0)
+					em.u.md2.dir = 1;
+				else
+					invarg("\"dir\" value is invalid\n", *argv);
+				NEXT_ARG();
+				if (strcmp(*argv, "hwid"))
+					duparg2("hwid", *argv);
+				NEXT_ARG();
+				if (get_u8(&hwid, *argv, 0))
+					invarg("\"hwid\" value is invalid\n", *argv);
+				em.u.md2.hwid = hwid;
+			}
+			rta_addattr_l(rta, len, LWTUNNEL_IP6_OPTS, &em, sizeof(em));
 		} else if (strcmp(*argv, "key") == 0) {
 			if (key_ok++)
 				duparg2("key", *argv);
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 0/6] net: add support for ip_tun_info options setting
  2019-09-16  7:10 [PATCH net-next 0/6] net: add support for ip_tun_info options setting Xin Long
  2019-09-16  7:10 ` [PATCH net-next 1/6] lwtunnel: add options process for arp request Xin Long
  2019-09-16  7:17 ` [PATCH net-next 0/6] net: add support for ip_tun_info options setting Xin Long
@ 2019-09-16  7:20 ` David Miller
  2 siblings, 0 replies; 9+ messages in thread
From: David Miller @ 2019-09-16  7:20 UTC (permalink / raw)
  To: lucien.xin; +Cc: netdev, jbenc, tgraf, u9012063

From: Xin Long <lucien.xin@gmail.com>
Date: Mon, 16 Sep 2019 15:10:14 +0800

> With this patchset, users can configure options with LWTUNNEL_IP(6)_OPTS
> by ip route encap for ersapn or vxlan lwtunnel. Note that in kernel part
> it won't parse the option details but do some check and memcpy only, and
> the options will be parsed by iproute in userspace.

Sorry, this will have to wait until net-next opens back up.

Thank you.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-09-16  7:20 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-16  7:10 [PATCH net-next 0/6] net: add support for ip_tun_info options setting Xin Long
2019-09-16  7:10 ` [PATCH net-next 1/6] lwtunnel: add options process for arp request Xin Long
2019-09-16  7:10   ` [PATCH net-next 2/6] lwtunnel: add LWTUNNEL_IP_OPTS support for lwtunnel_ip Xin Long
2019-09-16  7:10     ` [PATCH net-next 3/6] lwtunnel: add LWTUNNEL_IP6_OPTS support for lwtunnel_ip6 Xin Long
2019-09-16  7:10       ` [PATCH net-next 4/6] vxlan: check tun_info options_len properly Xin Long
2019-09-16  7:10         ` [PATCH net-next 5/6] erspan: fix the tun_info options_len check Xin Long
2019-09-16  7:10           ` [PATCH net-next 6/6] erspan: make md work without TUNNEL_ERSPAN_OPT set Xin Long
2019-09-16  7:17 ` [PATCH net-next 0/6] net: add support for ip_tun_info options setting Xin Long
2019-09-16  7:20 ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.