* [PATCH net-next 0/5] lwtunnel: add ip and ip6 options setting and dumping
@ 2019-11-06 9:01 Xin Long
2019-11-06 9:01 ` [PATCH net-next 1/5] lwtunnel: add options process for arp request Xin Long
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Xin Long @ 2019-11-06 9:01 UTC
To: network dev; +Cc: davem, simon.horman, Jiri Benc, Thomas Graf, u9012063
With this patchset, users can configure options with ip route encap
for geneve, vxlan and erspan lwtunnels, for example:
# ip r a 1.1.1.0/24 encap ip id 1 geneve class 0 type 0 \
data "1212121234567890" dst 10.1.0.2 dev geneve1
# ip r a 1.1.1.0/24 encap ip id 1 vxlan gbp 456 \
dst 10.1.0.2 dev erspan1
# ip r a 1.1.1.0/24 encap ip id 1 erspan ver 1 idx 123 \
dst 10.1.0.2 dev erspan1
The iproute2-side patch is attached in a reply to this mail.
Thanks to Simon for the good advice.
Xin Long (5):
lwtunnel: add options process for arp request
lwtunnel: add options process for cmp_encap
lwtunnel: add options setting and dumping for geneve
lwtunnel: add options setting and dumping for vxlan
lwtunnel: add options setting and dumping for erspan
include/uapi/linux/lwtunnel.h | 41 +++++
net/ipv4/ip_tunnel_core.c | 382 +++++++++++++++++++++++++++++++++++++++---
2 files changed, 402 insertions(+), 21 deletions(-)
--
2.1.0
* [PATCH net-next 1/5] lwtunnel: add options process for arp request
2019-11-06 9:01 [PATCH net-next 0/5] lwtunnel: add ip and ip6 options setting and dumping Xin Long
@ 2019-11-06 9:01 ` Xin Long
2019-11-06 9:01 ` [PATCH net-next 2/5] lwtunnel: add options process for cmp_encap Xin Long
2019-11-06 9:03 ` [PATCH net-next 0/5] lwtunnel: add ip and ip6 options setting and dumping Xin Long
2019-11-07 5:14 ` David Miller
2 siblings, 1 reply; 13+ messages in thread
From: Xin Long @ 2019-11-06 9:01 UTC
To: network dev; +Cc: davem, simon.horman, Jiri Benc, Thomas Graf, u9012063
iptunnel_metadata_reply(), which arp_process() calls to handle an ARP
request, does not copy the tunnel options into the dst tun_info. As a
result, the generated ARP reply may be dropped or sent out with wrong
options for tunnels such as erspan and vxlan, and traffic breaks.
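
The effect of the change can be sketched in user space. This is a minimal model under stated assumptions, not the kernel code: struct tun_info_model, model_alloc(), model_opts_set() and model_reply() are hypothetical names, and a flexible array member stands in for the options that trail the tunnel info.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_INFO_TX 0x01	/* models IP_TUNNEL_INFO_TX */

/* Hypothetical stand-in for struct ip_tunnel_info; options trail the struct. */
struct tun_info_model {
	uint64_t tun_id;
	uint32_t src;
	uint32_t dst;
	uint8_t mode;
	uint8_t options_len;
	uint8_t opts[];
};

/* Allocate a zeroed model with room for options_len bytes of options. */
static struct tun_info_model *model_alloc(uint8_t options_len)
{
	return calloc(1, sizeof(struct tun_info_model) + options_len);
}

/* Copy len bytes of options into info and record their length. */
static void model_opts_set(struct tun_info_model *info, const void *from,
			   uint8_t len)
{
	assert(info && (from || len == 0));
	if (len)
		memcpy(info->opts, from, len);
	info->options_len = len;
}

/* Build the TX-side reply info from an RX-side info, options included. */
static struct tun_info_model *model_reply(const struct tun_info_model *src)
{
	struct tun_info_model *res;

	if (!src || (src->mode & MODEL_INFO_TX))
		return NULL;

	/* Size the allocation from the source options, not from 0. */
	res = model_alloc(src->options_len);
	if (!res)
		return NULL;

	res->tun_id = src->tun_id;
	res->dst = src->src;	/* the reply goes back to the sender */
	res->mode = src->mode | MODEL_INFO_TX;
	model_opts_set(res, src->opts, src->options_len);
	return res;
}
```

In this model, a reply built from an RX-side info carries the same option bytes as the request, which is the behaviour the patch is after. The caller owns the returned pointer and releases it with free().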
Fixes: 63d008a4e9ee ("ipv4: send arp replies to the correct tunnel")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
net/ipv4/ip_tunnel_core.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index 1452a97..10f0848 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -126,15 +126,14 @@ struct metadata_dst *iptunnel_metadata_reply(struct metadata_dst *md,
if (!md || md->type != METADATA_IP_TUNNEL ||
md->u.tun_info.mode & IP_TUNNEL_INFO_TX)
-
return NULL;
- res = metadata_dst_alloc(0, METADATA_IP_TUNNEL, flags);
+ src = &md->u.tun_info;
+ res = metadata_dst_alloc(src->options_len, METADATA_IP_TUNNEL, flags);
if (!res)
return NULL;
dst = &res->u.tun_info;
- src = &md->u.tun_info;
dst->key.tun_id = src->key.tun_id;
if (src->mode & IP_TUNNEL_INFO_IPV6)
memcpy(&dst->key.u.ipv6.dst, &src->key.u.ipv6.src,
@@ -143,6 +142,8 @@ struct metadata_dst *iptunnel_metadata_reply(struct metadata_dst *md,
dst->key.u.ipv4.dst = src->key.u.ipv4.src;
dst->key.tun_flags = src->key.tun_flags;
dst->mode = src->mode | IP_TUNNEL_INFO_TX;
+ ip_tunnel_info_opts_set(dst, ip_tunnel_info_opts(src),
+ src->options_len, 0);
return res;
}
--
2.1.0
* [PATCH net-next 2/5] lwtunnel: add options process for cmp_encap
2019-11-06 9:01 ` [PATCH net-next 1/5] lwtunnel: add options process for arp request Xin Long
@ 2019-11-06 9:01 ` Xin Long
2019-11-06 9:01 ` [PATCH net-next 3/5] lwtunnel: add options setting and dumping for geneve Xin Long
0 siblings, 1 reply; 13+ messages in thread
From: Xin Long @ 2019-11-06 9:01 UTC
To: network dev; +Cc: davem, simon.horman, Jiri Benc, Thomas Graf, u9012063
When comparing two tun_info structures, the dst_cache member should be
skipped: dst_cache is a per-cpu pointer, so its value always differs,
even between two tun_info with the same keys.
This patch therefore skips dst_cache and compares only the key, mode
and options_len. To prepare for the upcoming options-setting support,
it also compares the options themselves.
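
A user-space sketch of the field-wise comparison follows. It is an illustration only: struct key_model, struct info_model and info_cmp() are hypothetical names, the key is reduced to three fields, and a plain pointer stands in for the per-cpu dst_cache.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reduced, padding-free stand-in for the tunnel key. */
struct key_model {
	uint64_t tun_id;
	uint32_t src;
	uint32_t dst;
};

/* Hypothetical stand-in for struct ip_tunnel_info. */
struct info_model {
	struct key_model key;
	void *dst_cache;	/* models the per-cpu pointer; never compared */
	uint8_t options_len;
	uint8_t mode;
	uint8_t opts[8];
};

/* Return 0 when a and b describe the same encap, non-zero otherwise. */
static int info_cmp(const struct info_model *a, const struct info_model *b)
{
	assert(a && b);
	return memcmp(&a->key, &b->key, sizeof(a->key)) ||
	       a->mode != b->mode ||
	       a->options_len != b->options_len ||
	       memcmp(a->opts, b->opts, a->options_len);
}
```

Two infos that differ only in dst_cache compare equal with info_cmp(), while a memcmp() over the whole structure reports a difference.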
Fixes: 2d79849903e0 ("lwtunnel: ip tunnel: fix multiple routes with different encap")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
net/ipv4/ip_tunnel_core.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index 10f0848..c0b5bad 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -315,8 +315,14 @@ static int ip_tun_encap_nlsize(struct lwtunnel_state *lwtstate)
static int ip_tun_cmp_encap(struct lwtunnel_state *a, struct lwtunnel_state *b)
{
- return memcmp(lwt_tun_info(a), lwt_tun_info(b),
- sizeof(struct ip_tunnel_info));
+ struct ip_tunnel_info *info_a = lwt_tun_info(a);
+ struct ip_tunnel_info *info_b = lwt_tun_info(b);
+
+ return memcmp(info_a, info_b, sizeof(info_a->key)) ||
+ info_a->mode != info_b->mode ||
+ info_a->options_len != info_b->options_len ||
+ memcmp(ip_tunnel_info_opts(info_a),
+ ip_tunnel_info_opts(info_b), info_a->options_len);
}
static const struct lwtunnel_encap_ops ip_tun_lwt_ops = {
--
2.1.0
* [PATCH net-next 3/5] lwtunnel: add options setting and dumping for geneve
2019-11-06 9:01 ` [PATCH net-next 2/5] lwtunnel: add options process for cmp_encap Xin Long
@ 2019-11-06 9:01 ` Xin Long
2019-11-06 9:01 ` [PATCH net-next 4/5] lwtunnel: add options setting and dumping for vxlan Xin Long
0 siblings, 1 reply; 13+ messages in thread
From: Xin Long @ 2019-11-06 9:01 UTC
To: network dev; +Cc: davem, simon.horman, Jiri Benc, Thomas Graf, u9012063
To add options setting and dumping, .build_state(), .fill_encap() and
.get_encap_size() in ip_tun_lwt_ops need to be extended:

  ip_tun_build_state():
    ip_tun_parse_opts():
      ip_tun_parse_opts_geneve()

  ip_tun_fill_encap_info():
    ip_tun_fill_encap_opts():
      ip_tun_fill_encap_opts_geneve()

  ip_tun_encap_nlsize()
    ip_tun_opts_nlsize():
      if (tun_flags & TUNNEL_GENEVE_OPT)

ip_tun_parse_opts(), ip_tun_fill_encap_opts() and ip_tun_opts_nlsize()
process LWTUNNEL_IP_OPTS.
ip_tun_parse_opts_geneve(), ip_tun_fill_encap_opts_geneve() and the
if (tun_flags & TUNNEL_GENEVE_OPT) branch process LWTUNNEL_IP_OPTS_GENEVE.
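
The build path parses the options twice: once with a NULL info to learn how many bytes to allocate, and once more to fill the allocated space. A minimal user-space sketch of that pattern follows. struct geneve_opt_model and model_parse_geneve() are hypothetical names; the model keeps the length in a whole byte for simplicity, and it takes already-extracted values instead of netlink attributes.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define MODEL_GENEVE_MAX_DATA 128	/* mirrors .len in geneve_opt_policy */

/* Simplified stand-in for struct geneve_opt: 4-byte header plus data. */
struct geneve_opt_model {
	uint16_t opt_class;
	uint8_t type;
	uint8_t length;		/* data length in 4-byte words */
	uint8_t opt_data[];
};

static_assert(sizeof(struct geneve_opt_model) == 4, "header must be 4 bytes");

/*
 * Validate the option and return the number of bytes it occupies.
 * With out == NULL only the size is computed (first pass); otherwise
 * the option is also written into out (second pass).
 */
static int model_parse_geneve(uint16_t opt_class, uint8_t type,
			      const uint8_t *data, int data_len,
			      struct geneve_opt_model *out)
{
	if (!data || data_len < 0 || data_len > MODEL_GENEVE_MAX_DATA ||
	    data_len % 4)
		return -EINVAL;

	if (out) {
		memcpy(out->opt_data, data, data_len);
		out->length = data_len / 4;
		out->opt_class = opt_class;
		out->type = type;
	}

	return (int)sizeof(struct geneve_opt_model) + data_len;
}
```

A caller would first invoke model_parse_geneve() with out set to NULL, allocate the returned number of bytes, then call it again with the new buffer. Data whose length is not a multiple of 4 is rejected in both passes, as in the patch.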
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
include/uapi/linux/lwtunnel.h | 20 ++++
net/ipv4/ip_tunnel_core.c | 212 ++++++++++++++++++++++++++++++++++++++----
2 files changed, 216 insertions(+), 16 deletions(-)
diff --git a/include/uapi/linux/lwtunnel.h b/include/uapi/linux/lwtunnel.h
index de696ca..b595ab2 100644
--- a/include/uapi/linux/lwtunnel.h
+++ b/include/uapi/linux/lwtunnel.h
@@ -27,6 +27,7 @@ enum lwtunnel_ip_t {
LWTUNNEL_IP_TOS,
LWTUNNEL_IP_FLAGS,
LWTUNNEL_IP_PAD,
+ LWTUNNEL_IP_OPTS,
__LWTUNNEL_IP_MAX,
};
@@ -41,12 +42,31 @@ enum lwtunnel_ip6_t {
LWTUNNEL_IP6_TC,
LWTUNNEL_IP6_FLAGS,
LWTUNNEL_IP6_PAD,
+ LWTUNNEL_IP6_OPTS,
__LWTUNNEL_IP6_MAX,
};
#define LWTUNNEL_IP6_MAX (__LWTUNNEL_IP6_MAX - 1)
enum {
+ LWTUNNEL_IP_OPTS_UNSPEC,
+ LWTUNNEL_IP_OPTS_GENEVE,
+ __LWTUNNEL_IP_OPTS_MAX,
+};
+
+#define LWTUNNEL_IP_OPTS_MAX (__LWTUNNEL_IP_OPTS_MAX - 1)
+
+enum {
+ LWTUNNEL_IP_OPT_GENEVE_UNSPEC,
+ LWTUNNEL_IP_OPT_GENEVE_CLASS,
+ LWTUNNEL_IP_OPT_GENEVE_TYPE,
+ LWTUNNEL_IP_OPT_GENEVE_DATA,
+ __LWTUNNEL_IP_OPT_GENEVE_MAX,
+};
+
+#define LWTUNNEL_IP_OPT_GENEVE_MAX (__LWTUNNEL_IP_OPT_GENEVE_MAX - 1)
+
+enum {
LWT_BPF_PROG_UNSPEC,
LWT_BPF_PROG_FD,
LWT_BPF_PROG_NAME,
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index c0b5bad..1ec9d94 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -34,6 +34,7 @@
#include <net/netns/generic.h>
#include <net/rtnetlink.h>
#include <net/dst_metadata.h>
+#include <net/geneve.h>
const struct ip_tunnel_encap_ops __rcu *
iptun_encaps[MAX_IPTUN_ENCAP_OPS] __read_mostly;
@@ -218,24 +219,112 @@ static const struct nla_policy ip_tun_policy[LWTUNNEL_IP_MAX + 1] = {
[LWTUNNEL_IP_TTL] = { .type = NLA_U8 },
[LWTUNNEL_IP_TOS] = { .type = NLA_U8 },
[LWTUNNEL_IP_FLAGS] = { .type = NLA_U16 },
+ [LWTUNNEL_IP_OPTS] = { .type = NLA_NESTED },
};
+static const struct nla_policy ip_opts_policy[LWTUNNEL_IP_OPTS_MAX + 1] = {
+ [LWTUNNEL_IP_OPTS_GENEVE] = { .type = NLA_NESTED },
+};
+
+static const struct nla_policy
+geneve_opt_policy[LWTUNNEL_IP_OPT_GENEVE_MAX + 1] = {
+ [LWTUNNEL_IP_OPT_GENEVE_CLASS] = { .type = NLA_U16 },
+ [LWTUNNEL_IP_OPT_GENEVE_TYPE] = { .type = NLA_U8 },
+ [LWTUNNEL_IP_OPT_GENEVE_DATA] = { .type = NLA_BINARY, .len = 128 },
+};
+
+static int ip_tun_parse_opts_geneve(struct nlattr *attr,
+ struct ip_tunnel_info *info,
+ struct netlink_ext_ack *extack)
+{
+ struct nlattr *tb[LWTUNNEL_IP_OPT_GENEVE_MAX + 1];
+ int data_len, err;
+
+ err = nla_parse_nested_deprecated(tb, LWTUNNEL_IP_OPT_GENEVE_MAX,
+ attr, geneve_opt_policy, extack);
+ if (err)
+ return err;
+
+ if (!tb[LWTUNNEL_IP_OPT_GENEVE_CLASS] ||
+ !tb[LWTUNNEL_IP_OPT_GENEVE_TYPE] ||
+ !tb[LWTUNNEL_IP_OPT_GENEVE_DATA])
+ return -EINVAL;
+
+ attr = tb[LWTUNNEL_IP_OPT_GENEVE_DATA];
+ data_len = nla_len(attr);
+ if (data_len % 4)
+ return -EINVAL;
+
+ if (info) {
+ struct geneve_opt *opt = ip_tunnel_info_opts(info);
+
+ memcpy(opt->opt_data, nla_data(attr), data_len);
+ opt->length = data_len / 4;
+ attr = tb[LWTUNNEL_IP_OPT_GENEVE_CLASS];
+ opt->opt_class = nla_get_be16(attr);
+ attr = tb[LWTUNNEL_IP_OPT_GENEVE_TYPE];
+ opt->type = nla_get_u8(attr);
+ info->key.tun_flags |= TUNNEL_GENEVE_OPT;
+ }
+
+ return sizeof(struct geneve_opt) + data_len;
+}
+
+static int ip_tun_parse_opts(struct nlattr *attr, struct ip_tunnel_info *info,
+ struct netlink_ext_ack *extack)
+{
+ struct nlattr *tb[LWTUNNEL_IP_OPTS_MAX + 1];
+ int err;
+
+ if (!attr)
+ return 0;
+
+ err = nla_parse_nested_deprecated(tb, LWTUNNEL_IP_OPTS_MAX, attr,
+ ip_opts_policy, extack);
+ if (err)
+ return err;
+
+ if (tb[LWTUNNEL_IP_OPTS_GENEVE])
+ err = ip_tun_parse_opts_geneve(tb[LWTUNNEL_IP_OPTS_GENEVE],
+ info, extack);
+ else
+ err = -EINVAL;
+
+ return err;
+}
+
+static int ip_tun_get_optlen(struct nlattr *attr,
+ struct netlink_ext_ack *extack)
+{
+ return ip_tun_parse_opts(attr, NULL, extack);
+}
+
+static int ip_tun_set_opts(struct nlattr *attr, struct ip_tunnel_info *info,
+ struct netlink_ext_ack *extack)
+{
+ return ip_tun_parse_opts(attr, info, extack);
+}
+
static int ip_tun_build_state(struct nlattr *attr,
unsigned int family, const void *cfg,
struct lwtunnel_state **ts,
struct netlink_ext_ack *extack)
{
- struct ip_tunnel_info *tun_info;
- struct lwtunnel_state *new_state;
struct nlattr *tb[LWTUNNEL_IP_MAX + 1];
- int err;
+ struct lwtunnel_state *new_state;
+ struct ip_tunnel_info *tun_info;
+ int err, opt_len;
err = nla_parse_nested_deprecated(tb, LWTUNNEL_IP_MAX, attr,
ip_tun_policy, extack);
if (err < 0)
return err;
- new_state = lwtunnel_state_alloc(sizeof(*tun_info));
+ opt_len = ip_tun_get_optlen(tb[LWTUNNEL_IP_OPTS], extack);
+ if (opt_len < 0)
+ return opt_len;
+
+ new_state = lwtunnel_state_alloc(sizeof(*tun_info) + opt_len);
if (!new_state)
return -ENOMEM;
@@ -243,6 +332,12 @@ static int ip_tun_build_state(struct nlattr *attr,
tun_info = lwt_tun_info(new_state);
+ err = ip_tun_set_opts(tb[LWTUNNEL_IP_OPTS], tun_info, extack);
+ if (err < 0) {
+ lwtstate_free(new_state);
+ return err;
+ }
+
#ifdef CONFIG_DST_CACHE
err = dst_cache_init(&tun_info->dst_cache, GFP_KERNEL);
if (err) {
@@ -267,10 +362,10 @@ static int ip_tun_build_state(struct nlattr *attr,
tun_info->key.tos = nla_get_u8(tb[LWTUNNEL_IP_TOS]);
if (tb[LWTUNNEL_IP_FLAGS])
- tun_info->key.tun_flags = nla_get_be16(tb[LWTUNNEL_IP_FLAGS]);
+ tun_info->key.tun_flags |= nla_get_be16(tb[LWTUNNEL_IP_FLAGS]);
tun_info->mode = IP_TUNNEL_INFO_TX;
- tun_info->options_len = 0;
+ tun_info->options_len = opt_len;
*ts = new_state;
@@ -286,6 +381,54 @@ static void ip_tun_destroy_state(struct lwtunnel_state *lwtstate)
#endif
}
+static int ip_tun_fill_encap_opts_geneve(struct sk_buff *skb,
+ struct ip_tunnel_info *tun_info)
+{
+ struct geneve_opt *opt;
+ struct nlattr *nest;
+
+ nest = nla_nest_start_noflag(skb, LWTUNNEL_IP_OPTS_GENEVE);
+ if (!nest)
+ return -ENOMEM;
+
+ opt = ip_tunnel_info_opts(tun_info);
+ if (nla_put_be16(skb, LWTUNNEL_IP_OPT_GENEVE_CLASS, opt->opt_class) ||
+ nla_put_u8(skb, LWTUNNEL_IP_OPT_GENEVE_TYPE, opt->type) ||
+ nla_put(skb, LWTUNNEL_IP_OPT_GENEVE_DATA, opt->length * 4,
+ opt->opt_data)) {
+ nla_nest_cancel(skb, nest);
+ return -ENOMEM;
+ }
+
+ nla_nest_end(skb, nest);
+ return 0;
+}
+
+static int ip_tun_fill_encap_opts(struct sk_buff *skb, int type,
+ struct ip_tunnel_info *tun_info)
+{
+ struct nlattr *nest;
+ int err = 0;
+
+ if (!(tun_info->key.tun_flags & TUNNEL_GENEVE_OPT))
+ return 0;
+
+ nest = nla_nest_start_noflag(skb, type);
+ if (!nest)
+ return -ENOMEM;
+
+ if (tun_info->key.tun_flags & TUNNEL_GENEVE_OPT)
+ err = ip_tun_fill_encap_opts_geneve(skb, tun_info);
+
+ if (err) {
+ nla_nest_cancel(skb, nest);
+ return err;
+ }
+
+ nla_nest_end(skb, nest);
+ return 0;
+}
+
static int ip_tun_fill_encap_info(struct sk_buff *skb,
struct lwtunnel_state *lwtstate)
{
@@ -297,12 +440,34 @@ static int ip_tun_fill_encap_info(struct sk_buff *skb,
nla_put_in_addr(skb, LWTUNNEL_IP_SRC, tun_info->key.u.ipv4.src) ||
nla_put_u8(skb, LWTUNNEL_IP_TOS, tun_info->key.tos) ||
nla_put_u8(skb, LWTUNNEL_IP_TTL, tun_info->key.ttl) ||
- nla_put_be16(skb, LWTUNNEL_IP_FLAGS, tun_info->key.tun_flags))
+ nla_put_be16(skb, LWTUNNEL_IP_FLAGS, tun_info->key.tun_flags) ||
+ ip_tun_fill_encap_opts(skb, LWTUNNEL_IP_OPTS, tun_info))
return -ENOMEM;
return 0;
}
+static int ip_tun_opts_nlsize(struct ip_tunnel_info *info)
+{
+ int opt_len;
+
+ if (!(info->key.tun_flags & TUNNEL_GENEVE_OPT))
+ return 0;
+
+ opt_len = nla_total_size(0); /* LWTUNNEL_IP_OPTS */
+ if (info->key.tun_flags & TUNNEL_GENEVE_OPT) {
+ struct geneve_opt *opt = ip_tunnel_info_opts(info);
+
+ opt_len += nla_total_size(0) /* LWTUNNEL_IP_OPTS_GENEVE */
+ + nla_total_size(2) /* OPT_GENEVE_CLASS */
+ + nla_total_size(1) /* OPT_GENEVE_TYPE */
+ + nla_total_size(opt->length * 4);
+ /* OPT_GENEVE_DATA */
+ }
+
+ return opt_len;
+}
+
static int ip_tun_encap_nlsize(struct lwtunnel_state *lwtstate)
{
return nla_total_size_64bit(8) /* LWTUNNEL_IP_ID */
@@ -310,7 +475,9 @@ static int ip_tun_encap_nlsize(struct lwtunnel_state *lwtstate)
+ nla_total_size(4) /* LWTUNNEL_IP_SRC */
+ nla_total_size(1) /* LWTUNNEL_IP_TOS */
+ nla_total_size(1) /* LWTUNNEL_IP_TTL */
- + nla_total_size(2); /* LWTUNNEL_IP_FLAGS */
+ + nla_total_size(2) /* LWTUNNEL_IP_FLAGS */
+ + ip_tun_opts_nlsize(lwt_tun_info(lwtstate));
+ /* LWTUNNEL_IP_OPTS */
}
static int ip_tun_cmp_encap(struct lwtunnel_state *a, struct lwtunnel_state *b)
@@ -348,17 +515,21 @@ static int ip6_tun_build_state(struct nlattr *attr,
struct lwtunnel_state **ts,
struct netlink_ext_ack *extack)
{
- struct ip_tunnel_info *tun_info;
- struct lwtunnel_state *new_state;
struct nlattr *tb[LWTUNNEL_IP6_MAX + 1];
- int err;
+ struct lwtunnel_state *new_state;
+ struct ip_tunnel_info *tun_info;
+ int err, opt_len;
err = nla_parse_nested_deprecated(tb, LWTUNNEL_IP6_MAX, attr,
ip6_tun_policy, extack);
if (err < 0)
return err;
- new_state = lwtunnel_state_alloc(sizeof(*tun_info));
+ opt_len = ip_tun_get_optlen(tb[LWTUNNEL_IP6_OPTS], extack);
+ if (opt_len < 0)
+ return opt_len;
+
+ new_state = lwtunnel_state_alloc(sizeof(*tun_info) + opt_len);
if (!new_state)
return -ENOMEM;
@@ -366,6 +537,12 @@ static int ip6_tun_build_state(struct nlattr *attr,
tun_info = lwt_tun_info(new_state);
+ err = ip_tun_set_opts(tb[LWTUNNEL_IP6_OPTS], tun_info, extack);
+ if (err < 0) {
+ lwtstate_free(new_state);
+ return err;
+ }
+
if (tb[LWTUNNEL_IP6_ID])
tun_info->key.tun_id = nla_get_be64(tb[LWTUNNEL_IP6_ID]);
@@ -382,10 +559,10 @@ static int ip6_tun_build_state(struct nlattr *attr,
tun_info->key.tos = nla_get_u8(tb[LWTUNNEL_IP6_TC]);
if (tb[LWTUNNEL_IP6_FLAGS])
- tun_info->key.tun_flags = nla_get_be16(tb[LWTUNNEL_IP6_FLAGS]);
+ tun_info->key.tun_flags |= nla_get_be16(tb[LWTUNNEL_IP6_FLAGS]);
tun_info->mode = IP_TUNNEL_INFO_TX | IP_TUNNEL_INFO_IPV6;
- tun_info->options_len = 0;
+ tun_info->options_len = opt_len;
*ts = new_state;
@@ -403,7 +580,8 @@ static int ip6_tun_fill_encap_info(struct sk_buff *skb,
nla_put_in6_addr(skb, LWTUNNEL_IP6_SRC, &tun_info->key.u.ipv6.src) ||
nla_put_u8(skb, LWTUNNEL_IP6_TC, tun_info->key.tos) ||
nla_put_u8(skb, LWTUNNEL_IP6_HOPLIMIT, tun_info->key.ttl) ||
- nla_put_be16(skb, LWTUNNEL_IP6_FLAGS, tun_info->key.tun_flags))
+ nla_put_be16(skb, LWTUNNEL_IP6_FLAGS, tun_info->key.tun_flags) ||
+ ip_tun_fill_encap_opts(skb, LWTUNNEL_IP6_OPTS, tun_info))
return -ENOMEM;
return 0;
@@ -416,7 +594,9 @@ static int ip6_tun_encap_nlsize(struct lwtunnel_state *lwtstate)
+ nla_total_size(16) /* LWTUNNEL_IP6_SRC */
+ nla_total_size(1) /* LWTUNNEL_IP6_HOPLIMIT */
+ nla_total_size(1) /* LWTUNNEL_IP6_TC */
- + nla_total_size(2); /* LWTUNNEL_IP6_FLAGS */
+ + nla_total_size(2) /* LWTUNNEL_IP6_FLAGS */
+ + ip_tun_opts_nlsize(lwt_tun_info(lwtstate));
+ /* LWTUNNEL_IP6_OPTS */
}
static const struct lwtunnel_encap_ops ip6_tun_lwt_ops = {
--
2.1.0
* [PATCH net-next 4/5] lwtunnel: add options setting and dumping for vxlan
2019-11-06 9:01 ` [PATCH net-next 3/5] lwtunnel: add options setting and dumping for geneve Xin Long
@ 2019-11-06 9:01 ` Xin Long
2019-11-06 9:01 ` [PATCH net-next 5/5] lwtunnel: add options setting and dumping for erspan Xin Long
0 siblings, 1 reply; 13+ messages in thread
From: Xin Long @ 2019-11-06 9:01 UTC
To: network dev; +Cc: davem, simon.horman, Jiri Benc, Thomas Graf, u9012063
Building on the framework added by the previous patch, supporting
setting and dumping for vxlan only requires adding
ip_tun_parse_opts_vxlan() for .build_state,
ip_tun_fill_encap_opts_vxlan() for .fill_encap, and an
if (tun_flags & TUNNEL_VXLAN_OPT) branch for .get_encap_size.
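
The size reported by .get_encap_size has to cover both nest headers and the attribute itself. The arithmetic can be sketched in user space as below, assuming the usual netlink layout of a 4-byte attribute header and 4-byte alignment. The MODEL_* macros and model_* functions are hypothetical stand-ins for the kernel helpers.

```c
#include <assert.h>

#define MODEL_NLA_HDRLEN 4			/* aligned size of struct nlattr */
#define MODEL_NLA_ALIGN(len) (((len) + 3) & ~3)	/* round up to 4 bytes */

/* Bytes needed for one attribute with the given payload, padding included. */
static int model_nla_total_size(int payload)
{
	assert(payload >= 0);
	return MODEL_NLA_ALIGN(MODEL_NLA_HDRLEN + payload);
}

/* Bytes needed to dump the vxlan options, mirroring the vxlan branch. */
static int model_vxlan_opts_nlsize(void)
{
	return model_nla_total_size(0)		/* LWTUNNEL_IP_OPTS nest */
	     + model_nla_total_size(0)		/* LWTUNNEL_IP_OPTS_VXLAN nest */
	     + model_nla_total_size(4);		/* OPT_VXLAN_GBP */
}
```

Under these assumptions the vxlan options take 4 + 4 + 8 = 16 bytes in the dump.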
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
include/uapi/linux/lwtunnel.h | 9 ++++++
net/ipv4/ip_tunnel_core.c | 67 +++++++++++++++++++++++++++++++++++++++++--
2 files changed, 74 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/lwtunnel.h b/include/uapi/linux/lwtunnel.h
index b595ab2..638b7b1 100644
--- a/include/uapi/linux/lwtunnel.h
+++ b/include/uapi/linux/lwtunnel.h
@@ -51,6 +51,7 @@ enum lwtunnel_ip6_t {
enum {
LWTUNNEL_IP_OPTS_UNSPEC,
LWTUNNEL_IP_OPTS_GENEVE,
+ LWTUNNEL_IP_OPTS_VXLAN,
__LWTUNNEL_IP_OPTS_MAX,
};
@@ -67,6 +68,14 @@ enum {
#define LWTUNNEL_IP_OPT_GENEVE_MAX (__LWTUNNEL_IP_OPT_GENEVE_MAX - 1)
enum {
+ LWTUNNEL_IP_OPT_VXLAN_UNSPEC,
+ LWTUNNEL_IP_OPT_VXLAN_GBP,
+ __LWTUNNEL_IP_OPT_VXLAN_MAX,
+};
+
+#define LWTUNNEL_IP_OPT_VXLAN_MAX (__LWTUNNEL_IP_OPT_VXLAN_MAX - 1)
+
+enum {
LWT_BPF_PROG_UNSPEC,
LWT_BPF_PROG_FD,
LWT_BPF_PROG_NAME,
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index 1ec9d94..61be2e0 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -35,6 +35,7 @@
#include <net/rtnetlink.h>
#include <net/dst_metadata.h>
#include <net/geneve.h>
+#include <net/vxlan.h>
const struct ip_tunnel_encap_ops __rcu *
iptun_encaps[MAX_IPTUN_ENCAP_OPS] __read_mostly;
@@ -224,6 +225,7 @@ static const struct nla_policy ip_tun_policy[LWTUNNEL_IP_MAX + 1] = {
static const struct nla_policy ip_opts_policy[LWTUNNEL_IP_OPTS_MAX + 1] = {
[LWTUNNEL_IP_OPTS_GENEVE] = { .type = NLA_NESTED },
+ [LWTUNNEL_IP_OPTS_VXLAN] = { .type = NLA_NESTED },
};
static const struct nla_policy
@@ -233,6 +235,11 @@ geneve_opt_policy[LWTUNNEL_IP_OPT_GENEVE_MAX + 1] = {
[LWTUNNEL_IP_OPT_GENEVE_DATA] = { .type = NLA_BINARY, .len = 128 },
};
+static const struct nla_policy
+vxlan_opt_policy[LWTUNNEL_IP_OPT_VXLAN_MAX + 1] = {
+ [LWTUNNEL_IP_OPT_VXLAN_GBP] = { .type = NLA_U32 },
+};
+
static int ip_tun_parse_opts_geneve(struct nlattr *attr,
struct ip_tunnel_info *info,
struct netlink_ext_ack *extack)
@@ -270,6 +277,32 @@ static int ip_tun_parse_opts_geneve(struct nlattr *attr,
return sizeof(struct geneve_opt) + data_len;
}
+static int ip_tun_parse_opts_vxlan(struct nlattr *attr,
+ struct ip_tunnel_info *info,
+ struct netlink_ext_ack *extack)
+{
+ struct nlattr *tb[LWTUNNEL_IP_OPT_VXLAN_MAX + 1];
+ int err;
+
+ err = nla_parse_nested_deprecated(tb, LWTUNNEL_IP_OPT_VXLAN_MAX,
+ attr, vxlan_opt_policy, extack);
+ if (err)
+ return err;
+
+ if (!tb[LWTUNNEL_IP_OPT_VXLAN_GBP])
+ return -EINVAL;
+
+ if (info) {
+ struct vxlan_metadata *md = ip_tunnel_info_opts(info);
+
+ attr = tb[LWTUNNEL_IP_OPT_VXLAN_GBP];
+ md->gbp = nla_get_u32(attr);
+ info->key.tun_flags |= TUNNEL_VXLAN_OPT;
+ }
+
+ return sizeof(struct vxlan_metadata);
+}
+
static int ip_tun_parse_opts(struct nlattr *attr, struct ip_tunnel_info *info,
struct netlink_ext_ack *extack)
{
@@ -287,6 +320,9 @@ static int ip_tun_parse_opts(struct nlattr *attr, struct ip_tunnel_info *info,
if (tb[LWTUNNEL_IP_OPTS_GENEVE])
err = ip_tun_parse_opts_geneve(tb[LWTUNNEL_IP_OPTS_GENEVE],
info, extack);
+ else if (tb[LWTUNNEL_IP_OPTS_VXLAN])
+ err = ip_tun_parse_opts_vxlan(tb[LWTUNNEL_IP_OPTS_VXLAN],
+ info, extack);
else
err = -EINVAL;
@@ -404,13 +440,34 @@ static int ip_tun_fill_encap_opts_geneve(struct sk_buff *skb,
return 0;
}
+static int ip_tun_fill_encap_opts_vxlan(struct sk_buff *skb,
+ struct ip_tunnel_info *tun_info)
+{
+ struct vxlan_metadata *md;
+ struct nlattr *nest;
+
+ nest = nla_nest_start_noflag(skb, LWTUNNEL_IP_OPTS_VXLAN);
+ if (!nest)
+ return -ENOMEM;
+
+ md = ip_tunnel_info_opts(tun_info);
+ if (nla_put_u32(skb, LWTUNNEL_IP_OPT_VXLAN_GBP, md->gbp)) {
+ nla_nest_cancel(skb, nest);
+ return -ENOMEM;
+ }
+
+ nla_nest_end(skb, nest);
+ return 0;
+}
+
static int ip_tun_fill_encap_opts(struct sk_buff *skb, int type,
struct ip_tunnel_info *tun_info)
{
struct nlattr *nest;
int err = 0;
- if (!(tun_info->key.tun_flags & TUNNEL_GENEVE_OPT))
+ if (!(tun_info->key.tun_flags &
+ (TUNNEL_GENEVE_OPT | TUNNEL_VXLAN_OPT)))
return 0;
nest = nla_nest_start_noflag(skb, type);
@@ -419,6 +476,8 @@ static int ip_tun_fill_encap_opts(struct sk_buff *skb, int type,
if (tun_info->key.tun_flags & TUNNEL_GENEVE_OPT)
err = ip_tun_fill_encap_opts_geneve(skb, tun_info);
+ else if (tun_info->key.tun_flags & TUNNEL_VXLAN_OPT)
+ err = ip_tun_fill_encap_opts_vxlan(skb, tun_info);
if (err) {
nla_nest_cancel(skb, nest);
@@ -451,7 +510,8 @@ static int ip_tun_opts_nlsize(struct ip_tunnel_info *info)
{
int opt_len;
- if (!(info->key.tun_flags & TUNNEL_GENEVE_OPT))
+ if (!(info->key.tun_flags &
+ (TUNNEL_GENEVE_OPT | TUNNEL_VXLAN_OPT)))
return 0;
opt_len = nla_total_size(0); /* LWTUNNEL_IP_OPTS */
@@ -463,6 +523,9 @@ static int ip_tun_opts_nlsize(struct ip_tunnel_info *info)
+ nla_total_size(1) /* OPT_GENEVE_TYPE */
+ nla_total_size(opt->length * 4);
/* OPT_GENEVE_DATA */
+ } else if (info->key.tun_flags & TUNNEL_VXLAN_OPT) {
+ opt_len += nla_total_size(0) /* LWTUNNEL_IP_OPTS_VXLAN */
+ + nla_total_size(4); /* OPT_VXLAN_GBP */
}
return opt_len;
--
2.1.0
* [PATCH net-next 5/5] lwtunnel: add options setting and dumping for erspan
2019-11-06 9:01 ` [PATCH net-next 4/5] lwtunnel: add options setting and dumping for vxlan Xin Long
@ 2019-11-06 9:01 ` Xin Long
0 siblings, 0 replies; 13+ messages in thread
From: Xin Long @ 2019-11-06 9:01 UTC
To: network dev; +Cc: davem, simon.horman, Jiri Benc, Thomas Graf, u9012063
Building on the framework added by the earlier patches, supporting
setting and dumping for erspan only requires adding
ip_tun_parse_opts_erspan() for .build_state,
ip_tun_fill_encap_opts_erspan() for .fill_encap, and an
if (tun_flags & TUNNEL_ERSPAN_OPT) branch for .get_encap_size.
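
Which attributes are required depends on the erspan version: version 1 needs an index, version 2 needs both a direction and a hardware ID. A user-space sketch of that dispatch follows. struct erspan_md_model, struct erspan_attrs_model and model_parse_erspan() are hypothetical names, and NULL pointers stand in for absent netlink attributes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct erspan_metadata. */
struct erspan_md_model {
	int version;
	union {
		uint32_t index;		/* version 1 */
		struct {
			uint8_t dir;
			uint8_t hwid;
		} md2;			/* version 2 */
	} u;
};

/* Parsed attributes; a NULL pointer means the attribute is absent. */
struct erspan_attrs_model {
	const uint8_t *ver;
	const uint32_t *index;
	const uint8_t *dir;
	const uint8_t *hwid;
};

/*
 * Return the option size, or -EINVAL on a missing attribute.
 * As in the patch, the per-version checks run only when md is non-NULL,
 * that is, in the second (fill) pass.
 */
static int model_parse_erspan(const struct erspan_attrs_model *tb,
			      struct erspan_md_model *md)
{
	assert(tb);
	if (!tb->ver)
		return -EINVAL;

	if (md) {
		md->version = *tb->ver;
		if (md->version == 1 && tb->index) {
			md->u.index = *tb->index;
		} else if (md->version == 2 && tb->dir && tb->hwid) {
			md->u.md2.dir = *tb->dir;
			md->u.md2.hwid = *tb->hwid;
		} else {
			return -EINVAL;
		}
	}

	return (int)sizeof(struct erspan_md_model);
}
```

One consequence visible in the sketch is that an unsupported version or a missing per-version attribute is only detected after the state has been allocated, so the caller must free it on error, which is what the build_state error path in the series does.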
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
include/uapi/linux/lwtunnel.h | 12 ++++++
net/ipv4/ip_tunnel_core.c | 94 ++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 104 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/lwtunnel.h b/include/uapi/linux/lwtunnel.h
index 638b7b1..f6035f7 100644
--- a/include/uapi/linux/lwtunnel.h
+++ b/include/uapi/linux/lwtunnel.h
@@ -52,6 +52,7 @@ enum {
LWTUNNEL_IP_OPTS_UNSPEC,
LWTUNNEL_IP_OPTS_GENEVE,
LWTUNNEL_IP_OPTS_VXLAN,
+ LWTUNNEL_IP_OPTS_ERSPAN,
__LWTUNNEL_IP_OPTS_MAX,
};
@@ -76,6 +77,17 @@ enum {
#define LWTUNNEL_IP_OPT_VXLAN_MAX (__LWTUNNEL_IP_OPT_VXLAN_MAX - 1)
enum {
+ LWTUNNEL_IP_OPT_ERSPAN_UNSPEC,
+ LWTUNNEL_IP_OPT_ERSPAN_VER,
+ LWTUNNEL_IP_OPT_ERSPAN_INDEX,
+ LWTUNNEL_IP_OPT_ERSPAN_DIR,
+ LWTUNNEL_IP_OPT_ERSPAN_HWID,
+ __LWTUNNEL_IP_OPT_ERSPAN_MAX,
+};
+
+#define LWTUNNEL_IP_OPT_ERSPAN_MAX (__LWTUNNEL_IP_OPT_ERSPAN_MAX - 1)
+
+enum {
LWT_BPF_PROG_UNSPEC,
LWT_BPF_PROG_FD,
LWT_BPF_PROG_NAME,
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index 61be2e0..d4f84bf 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -36,6 +36,7 @@
#include <net/dst_metadata.h>
#include <net/geneve.h>
#include <net/vxlan.h>
+#include <net/erspan.h>
const struct ip_tunnel_encap_ops __rcu *
iptun_encaps[MAX_IPTUN_ENCAP_OPS] __read_mostly;
@@ -226,6 +227,7 @@ static const struct nla_policy ip_tun_policy[LWTUNNEL_IP_MAX + 1] = {
static const struct nla_policy ip_opts_policy[LWTUNNEL_IP_OPTS_MAX + 1] = {
[LWTUNNEL_IP_OPTS_GENEVE] = { .type = NLA_NESTED },
[LWTUNNEL_IP_OPTS_VXLAN] = { .type = NLA_NESTED },
+ [LWTUNNEL_IP_OPTS_ERSPAN] = { .type = NLA_NESTED },
};
static const struct nla_policy
@@ -240,6 +242,14 @@ vxlan_opt_policy[LWTUNNEL_IP_OPT_VXLAN_MAX + 1] = {
[LWTUNNEL_IP_OPT_VXLAN_GBP] = { .type = NLA_U32 },
};
+static const struct nla_policy
+erspan_opt_policy[LWTUNNEL_IP_OPT_ERSPAN_MAX + 1] = {
+ [LWTUNNEL_IP_OPT_ERSPAN_VER] = { .type = NLA_U8 },
+ [LWTUNNEL_IP_OPT_ERSPAN_INDEX] = { .type = NLA_U32 },
+ [LWTUNNEL_IP_OPT_ERSPAN_DIR] = { .type = NLA_U8 },
+ [LWTUNNEL_IP_OPT_ERSPAN_HWID] = { .type = NLA_U8 },
+};
+
static int ip_tun_parse_opts_geneve(struct nlattr *attr,
struct ip_tunnel_info *info,
struct netlink_ext_ack *extack)
@@ -303,6 +313,46 @@ static int ip_tun_parse_opts_vxlan(struct nlattr *attr,
return sizeof(struct vxlan_metadata);
}
+static int ip_tun_parse_opts_erspan(struct nlattr *attr,
+ struct ip_tunnel_info *info,
+ struct netlink_ext_ack *extack)
+{
+ struct nlattr *tb[LWTUNNEL_IP_OPT_ERSPAN_MAX + 1];
+ int err;
+
+ err = nla_parse_nested_deprecated(tb, LWTUNNEL_IP_OPT_ERSPAN_MAX,
+ attr, erspan_opt_policy, extack);
+ if (err)
+ return err;
+
+ if (!tb[LWTUNNEL_IP_OPT_ERSPAN_VER])
+ return -EINVAL;
+
+ if (info) {
+ struct erspan_metadata *md = ip_tunnel_info_opts(info);
+
+ attr = tb[LWTUNNEL_IP_OPT_ERSPAN_VER];
+ md->version = nla_get_u8(attr);
+
+ if (md->version == 1 && tb[LWTUNNEL_IP_OPT_ERSPAN_INDEX]) {
+ attr = tb[LWTUNNEL_IP_OPT_ERSPAN_INDEX];
+ md->u.index = nla_get_be32(attr);
+ } else if (md->version == 2 && tb[LWTUNNEL_IP_OPT_ERSPAN_DIR] &&
+ tb[LWTUNNEL_IP_OPT_ERSPAN_HWID]) {
+ attr = tb[LWTUNNEL_IP_OPT_ERSPAN_DIR];
+ md->u.md2.dir = nla_get_u8(attr);
+ attr = tb[LWTUNNEL_IP_OPT_ERSPAN_HWID];
+ set_hwid(&md->u.md2, nla_get_u8(attr));
+ } else {
+ return -EINVAL;
+ }
+
+ info->key.tun_flags |= TUNNEL_ERSPAN_OPT;
+ }
+
+ return sizeof(struct erspan_metadata);
+}
+
static int ip_tun_parse_opts(struct nlattr *attr, struct ip_tunnel_info *info,
struct netlink_ext_ack *extack)
{
@@ -323,6 +373,9 @@ static int ip_tun_parse_opts(struct nlattr *attr, struct ip_tunnel_info *info,
else if (tb[LWTUNNEL_IP_OPTS_VXLAN])
err = ip_tun_parse_opts_vxlan(tb[LWTUNNEL_IP_OPTS_VXLAN],
info, extack);
+ else if (tb[LWTUNNEL_IP_OPTS_ERSPAN])
+ err = ip_tun_parse_opts_erspan(tb[LWTUNNEL_IP_OPTS_ERSPAN],
+ info, extack);
else
err = -EINVAL;
@@ -460,6 +513,37 @@ static int ip_tun_fill_encap_opts_vxlan(struct sk_buff *skb,
return 0;
}
+static int ip_tun_fill_encap_opts_erspan(struct sk_buff *skb,
+ struct ip_tunnel_info *tun_info)
+{
+ struct erspan_metadata *md;
+ struct nlattr *nest;
+
+ nest = nla_nest_start_noflag(skb, LWTUNNEL_IP_OPTS_ERSPAN);
+ if (!nest)
+ return -ENOMEM;
+
+ md = ip_tunnel_info_opts(tun_info);
+ if (nla_put_u8(skb, LWTUNNEL_IP_OPT_ERSPAN_VER, md->version))
+ goto err;
+
+ if (md->version == 1 &&
+ nla_put_be32(skb, LWTUNNEL_IP_OPT_ERSPAN_INDEX, md->u.index))
+ goto err;
+
+ if (md->version == 2 &&
+ (nla_put_u8(skb, LWTUNNEL_IP_OPT_ERSPAN_DIR, md->u.md2.dir) ||
+ nla_put_u8(skb, LWTUNNEL_IP_OPT_ERSPAN_HWID,
+ get_hwid(&md->u.md2))))
+ goto err;
+
+ nla_nest_end(skb, nest);
+ return 0;
+err:
+ nla_nest_cancel(skb, nest);
+ return -ENOMEM;
+}
+
static int ip_tun_fill_encap_opts(struct sk_buff *skb, int type,
struct ip_tunnel_info *tun_info)
{
@@ -467,7 +551,7 @@ static int ip_tun_fill_encap_opts(struct sk_buff *skb, int type,
int err = 0;
if (!(tun_info->key.tun_flags &
- (TUNNEL_GENEVE_OPT | TUNNEL_VXLAN_OPT)))
+ (TUNNEL_GENEVE_OPT | TUNNEL_VXLAN_OPT | TUNNEL_ERSPAN_OPT)))
return 0;
nest = nla_nest_start_noflag(skb, type);
@@ -478,6 +562,8 @@ static int ip_tun_fill_encap_opts(struct sk_buff *skb, int type,
err = ip_tun_fill_encap_opts_geneve(skb, tun_info);
else if (tun_info->key.tun_flags & TUNNEL_VXLAN_OPT)
err = ip_tun_fill_encap_opts_vxlan(skb, tun_info);
+ else if (tun_info->key.tun_flags & TUNNEL_ERSPAN_OPT)
+ err = ip_tun_fill_encap_opts_erspan(skb, tun_info);
if (err) {
nla_nest_cancel(skb, nest);
@@ -511,7 +597,7 @@ static int ip_tun_opts_nlsize(struct ip_tunnel_info *info)
int opt_len;
if (!(info->key.tun_flags &
- (TUNNEL_GENEVE_OPT | TUNNEL_VXLAN_OPT)))
+ (TUNNEL_GENEVE_OPT | TUNNEL_VXLAN_OPT | TUNNEL_ERSPAN_OPT)))
return 0;
opt_len = nla_total_size(0); /* LWTUNNEL_IP_OPTS */
@@ -526,6 +612,10 @@ static int ip_tun_opts_nlsize(struct ip_tunnel_info *info)
} else if (info->key.tun_flags & TUNNEL_VXLAN_OPT) {
opt_len += nla_total_size(0) /* LWTUNNEL_IP_OPTS_VXLAN */
+ nla_total_size(4); /* OPT_VXLAN_GBP */
+ } else if (info->key.tun_flags & TUNNEL_ERSPAN_OPT) {
+ opt_len += nla_total_size(0) /* LWTUNNEL_IP_OPTS_ERSPAN */
+ + nla_total_size(1) /* OPT_ERSPAN_VER */
+ + nla_total_size(4); /* OPT_ERSPAN_INDEX/DIR/HWID */
}
return opt_len;
--
2.1.0
* Re: [PATCH net-next 0/5] lwtunnel: add ip and ip6 options setting and dumping
2019-11-06 9:01 [PATCH net-next 0/5] lwtunnel: add ip and ip6 options setting and dumping Xin Long
2019-11-06 9:01 ` [PATCH net-next 1/5] lwtunnel: add options process for arp request Xin Long
@ 2019-11-06 9:03 ` Xin Long
2019-11-07 5:14 ` David Miller
2 siblings, 0 replies; 13+ messages in thread
From: Xin Long @ 2019-11-06 9:03 UTC
To: network dev; +Cc: davem, Simon Horman, Jiri Benc, Thomas Graf, William Tu
[-- Attachment #1: Type: text/plain, Size: 1069 bytes --]
[-- Attachment #2: 0001-iproute_lwtunnel-add-support-options-for-geneve-vxla.patch --]
[-- Type: application/octet-stream, Size: 18962 bytes --]
From 6be78778a9046bdc7430eec6908d06c61ea39ea1 Mon Sep 17 00:00:00 2001
From: Xin Long <lucien.xin@gmail.com>
Date: Tue, 5 Nov 2019 23:42:51 -0500
Subject: [PATCH] iproute_lwtunnel: add support options for geneve/vxlan/erspan
metadata
1. example for geneve:
ip net d a; ip net d b; ip net a a; ip net a b
ip -n a l a eth0 type veth peer name eth0 netns b
ip -n a l s eth0 up; ip -n b link set eth0 up
ip -n a a a 10.1.0.1/24 dev eth0; ip -n b a a 10.1.0.2/24 dev eth0
ip -n b l a erspan1 type geneve id 1 remote 10.1.0.1 ttl 64
ip -n b a a 1.1.1.1/24 dev erspan1; ip -n b l s erspan1 up
ip -n b r a 2.1.1.0/24 dev erspan1
ip -n a l a erspan1 type geneve external
ip -n a a a 2.1.1.1/24 dev erspan1; ip -n a l s erspan1 up
ip -n a r a 1.1.1.0/24 encap ip id 1 geneve class 0 type 0 data "1212121234567890" dst 10.1.0.2 dev erspan1
ip -n a r s; ip net exec a ping 1.1.1.1 -c 1
2. example for vxlan:
ip net d a; ip net d b; ip net a a; ip net a b
ip -n a l a eth0 type veth peer name eth0b netns b
ip -n a l s eth0 up; ip -n b link set eth0b up
ip -n a a a 10.1.0.1/24 dev eth0; ip -n b a a 10.1.0.2/24 dev eth0b
ip -n b l a erspan1 type vxlan id 1 local 10.1.0.2 remote 10.1.0.1 dev eth0b ttl 64 gbp
ip -n b a a 1.1.1.1/24 dev erspan1; ip -n b l s erspan1 up
ip -n b r a 2.1.1.0/24 dev erspan1
ip -n a l a erspan1 type vxlan local 10.1.0.1 dev eth0 ttl 64 gbp external
ip -n a a a 2.1.1.1/24 dev erspan1; ip -n a l s erspan1 up
ip -n a r a 1.1.1.0/24 encap ip id 1 vxlan gbp 456 dst 10.1.0.2 dev erspan1
ip -n a r s; ip net exec a ping 1.1.1.1 -c 1
3. example for erspan:
ip net d a; ip net d b; ip net a a; ip net a b
ip -n a l a eth0 type veth peer name eth0 netns b
ip -n a l s eth0 up; ip -n b link set eth0 up
ip -n a a a 10.1.0.1/24 dev eth0; ip -n b a a 10.1.0.2/24 dev eth0
ip -n b l a erspan1 type erspan key 1 seq erspan 123 local 10.1.0.2 remote 10.1.0.1
ip -n b a a 1.1.1.1/24 dev erspan1; ip -n b l s erspan1 up
ip -n b r a 2.1.1.0/24 dev erspan1
ip -n a l a erspan1 type erspan key 1 seq local 10.1.0.1 external
ip -n a a a 2.1.1.1/24 dev erspan1; ip -n a l s erspan1 up
ip -n a r a 1.1.1.0/24 encap ip id 1 erspan ver 1 idx 123 dst 10.1.0.2 dev erspan1
ip -n a r s; ip net exec a ping 1.1.1.1 -c 1
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
include/uapi/linux/lwtunnel.h | 41 ++++
ip/iproute_lwtunnel.c | 355 +++++++++++++++++++++++++++++++++-
2 files changed, 390 insertions(+), 6 deletions(-)
diff --git a/include/uapi/linux/lwtunnel.h b/include/uapi/linux/lwtunnel.h
index 3f3fe6f3..d2ee622d 100644
--- a/include/uapi/linux/lwtunnel.h
+++ b/include/uapi/linux/lwtunnel.h
@@ -27,6 +27,7 @@ enum lwtunnel_ip_t {
LWTUNNEL_IP_TOS,
LWTUNNEL_IP_FLAGS,
LWTUNNEL_IP_PAD,
+ LWTUNNEL_IP_OPTS,
__LWTUNNEL_IP_MAX,
};
@@ -41,6 +42,7 @@ enum lwtunnel_ip6_t {
LWTUNNEL_IP6_TC,
LWTUNNEL_IP6_FLAGS,
LWTUNNEL_IP6_PAD,
+ LWTUNNEL_IP6_OPTS,
__LWTUNNEL_IP6_MAX,
};
@@ -68,4 +70,43 @@ enum {
#define LWT_BPF_MAX_HEADROOM 256
+enum {
+ LWTUNNEL_IP_OPTS_UNSPEC,
+ LWTUNNEL_IP_OPTS_GENEVE,
+ LWTUNNEL_IP_OPTS_VXLAN,
+ LWTUNNEL_IP_OPTS_ERSPAN,
+ __LWTUNNEL_IP_OPTS_MAX,
+};
+
+#define LWTUNNEL_IP_OPTS_MAX (__LWTUNNEL_IP_OPTS_MAX - 1)
+
+enum {
+ LWTUNNEL_IP_OPT_GENEVE_UNSPEC,
+ LWTUNNEL_IP_OPT_GENEVE_CLASS,
+ LWTUNNEL_IP_OPT_GENEVE_TYPE,
+ LWTUNNEL_IP_OPT_GENEVE_DATA,
+ __LWTUNNEL_IP_OPT_GENEVE_MAX,
+};
+
+#define LWTUNNEL_IP_OPT_GENEVE_MAX (__LWTUNNEL_IP_OPT_GENEVE_MAX - 1)
+
+enum {
+ LWTUNNEL_IP_OPT_VXLAN_UNSPEC,
+ LWTUNNEL_IP_OPT_VXLAN_GBP,
+ __LWTUNNEL_IP_OPT_VXLAN_MAX,
+};
+
+#define LWTUNNEL_IP_OPT_VXLAN_MAX (__LWTUNNEL_IP_OPT_VXLAN_MAX - 1)
+
+enum {
+ LWTUNNEL_IP_OPT_ERSPAN_UNSPEC,
+ LWTUNNEL_IP_OPT_ERSPAN_VER,
+ LWTUNNEL_IP_OPT_ERSPAN_INDEX,
+ LWTUNNEL_IP_OPT_ERSPAN_DIR,
+ LWTUNNEL_IP_OPT_ERSPAN_HWID,
+ __LWTUNNEL_IP_OPT_ERSPAN_MAX,
+};
+
+#define LWTUNNEL_IP_OPT_ERSPAN_MAX (__LWTUNNEL_IP_OPT_ERSPAN_MAX - 1)
+
#endif /* _LWTUNNEL_H_ */
diff --git a/ip/iproute_lwtunnel.c b/ip/iproute_lwtunnel.c
index 03217b8f..03c3098a 100644
--- a/ip/iproute_lwtunnel.c
+++ b/ip/iproute_lwtunnel.c
@@ -32,6 +32,7 @@
#include <linux/seg6_hmac.h>
#include <linux/seg6_local.h>
#include <linux/if_tunnel.h>
+#include <linux/erspan.h>
static const char *format_encap_type(int type)
{
@@ -294,7 +295,7 @@ static void print_encap_mpls(FILE *fp, struct rtattr *encap)
static void print_encap_ip(FILE *fp, struct rtattr *encap)
{
struct rtattr *tb[LWTUNNEL_IP_MAX+1];
- __u16 flags;
+ __u16 flags = 0;
parse_rtattr_nested(tb, LWTUNNEL_IP_MAX, encap);
@@ -329,6 +330,60 @@ static void print_encap_ip(FILE *fp, struct rtattr *encap)
if (flags & TUNNEL_SEQ)
print_bool(PRINT_ANY, "seq", "seq ", true);
}
+
+ if (tb[LWTUNNEL_IP_OPTS]) {
+ struct rtattr *tb_opt[LWTUNNEL_IP_OPTS_MAX+1];
+
+ parse_rtattr_nested(tb_opt, LWTUNNEL_IP_OPTS_MAX, tb[LWTUNNEL_IP_OPTS]);
+
+ if ((flags & TUNNEL_ERSPAN_OPT) && tb_opt[LWTUNNEL_IP_OPTS_ERSPAN]) {
+ struct rtattr *tb_esn[LWTUNNEL_IP_OPT_ERSPAN_MAX+1];
+ struct erspan_metadata em;
+
+ memset(&em, 0, sizeof(em));
+ parse_rtattr_nested(tb_esn, LWTUNNEL_IP_OPT_ERSPAN_MAX, tb_opt[LWTUNNEL_IP_OPTS_ERSPAN]);
+ em.version = rta_getattr_u8(tb_esn[LWTUNNEL_IP_OPT_ERSPAN_VER]);
+ if (em.version == 1) {
+ em.u.index = rta_getattr_u32(tb_esn[LWTUNNEL_IP_OPT_ERSPAN_INDEX]);
+ print_bool(PRINT_ANY, "erspan", "erspan ", true);
+ print_uint(PRINT_ANY, "ver", "ver %u ", 1);
+ print_uint(PRINT_ANY, "idx", "idx %u ",
+ ntohl(em.u.index));
+ } else if (em.version == 2) {
+ em.u.md2.dir = rta_getattr_u8(tb_esn[LWTUNNEL_IP_OPT_ERSPAN_DIR]);
+ em.u.md2.hwid = rta_getattr_u16(tb_esn[LWTUNNEL_IP_OPT_ERSPAN_HWID]);
+ print_bool(PRINT_ANY, "erspan", "erspan ", true);
+ print_uint(PRINT_ANY, "ver", "ver %u ", 2);
+ print_color_string(PRINT_ANY, COLOR_INET,
+ "dir", "dir %s ",
+ em.u.md2.dir ? "egress" : "ingress");
+ print_uint(PRINT_ANY, "hwid", "hwid %u ", em.u.md2.hwid);
+ }
+ } else if ((flags & TUNNEL_VXLAN_OPT) && tb_opt[LWTUNNEL_IP_OPTS_VXLAN]) {
+ struct rtattr *tb_esn[LWTUNNEL_IP_OPT_VXLAN_MAX+1];
+ unsigned int gbp;
+
+ parse_rtattr_nested(tb_esn, LWTUNNEL_IP_OPT_VXLAN_MAX, tb_opt[LWTUNNEL_IP_OPTS_VXLAN]);
+ gbp = rta_getattr_u32(tb_esn[LWTUNNEL_IP_OPT_VXLAN_GBP]);
+ print_bool(PRINT_ANY, "vxlan", "vxlan ", true);
+ print_uint(PRINT_ANY, "gbp", "gbp %u ", gbp);
+ } else if ((flags & TUNNEL_GENEVE_OPT) && tb_opt[LWTUNNEL_IP_OPTS_GENEVE]) {
+ struct rtattr *tb_esn[LWTUNNEL_IP_OPT_GENEVE_MAX+1];
+ char data[128] = { 0 };
+ __u16 class;
+ __u8 type;
+
+ parse_rtattr_nested(tb_esn, LWTUNNEL_IP_OPT_GENEVE_MAX, tb_opt[LWTUNNEL_IP_OPTS_GENEVE]);
+ class = rta_getattr_u16(tb_esn[LWTUNNEL_IP_OPT_GENEVE_CLASS]);
+ type = rta_getattr_u8(tb_esn[LWTUNNEL_IP_OPT_GENEVE_TYPE]);
+ hexstring_n2a(RTA_DATA(tb_esn[LWTUNNEL_IP_OPT_GENEVE_DATA]),
+ RTA_PAYLOAD(tb_esn[LWTUNNEL_IP_OPT_GENEVE_DATA]), data, sizeof(data));
+ print_bool(PRINT_ANY, "geneve", "geneve ", true);
+ print_uint(PRINT_ANY, "class", "class %u ", class);
+ print_uint(PRINT_ANY, "type", "type %u ", type);
+ print_color_string(PRINT_ANY, COLOR_INET, "data", "data %s ", data);
+ }
+ }
}
static void print_encap_ila(FILE *fp, struct rtattr *encap)
@@ -365,7 +420,7 @@ static void print_encap_ila(FILE *fp, struct rtattr *encap)
static void print_encap_ip6(FILE *fp, struct rtattr *encap)
{
struct rtattr *tb[LWTUNNEL_IP6_MAX+1];
- __u16 flags;
+ __u16 flags = 0;
parse_rtattr_nested(tb, LWTUNNEL_IP6_MAX, encap);
@@ -401,6 +456,59 @@ static void print_encap_ip6(FILE *fp, struct rtattr *encap)
if (flags & TUNNEL_SEQ)
print_bool(PRINT_ANY, "seq", "seq ", true);
}
+ if (tb[LWTUNNEL_IP6_OPTS]) {
+ struct rtattr *tb_opt[LWTUNNEL_IP_OPTS_MAX+1];
+
+ parse_rtattr_nested(tb_opt, LWTUNNEL_IP_OPTS_MAX, tb[LWTUNNEL_IP6_OPTS]);
+
+ if ((flags & TUNNEL_ERSPAN_OPT) && tb_opt[LWTUNNEL_IP_OPTS_ERSPAN]) {
+ struct rtattr *tb_esn[LWTUNNEL_IP_OPT_ERSPAN_MAX+1];
+ struct erspan_metadata em;
+
+ memset(&em, 0, sizeof(em));
+ parse_rtattr_nested(tb_esn, LWTUNNEL_IP_OPT_ERSPAN_MAX, tb_opt[LWTUNNEL_IP_OPTS_ERSPAN]);
+ em.version = rta_getattr_u8(tb_esn[LWTUNNEL_IP_OPT_ERSPAN_VER]);
+ if (em.version == 1) {
+ em.u.index = rta_getattr_u32(tb_esn[LWTUNNEL_IP_OPT_ERSPAN_INDEX]);
+ print_bool(PRINT_ANY, "erspan", "erspan ", true);
+ print_uint(PRINT_ANY, "ver", "ver %u ", 1);
+ print_uint(PRINT_ANY, "idx", "idx %u ",
+ ntohl(em.u.index));
+ } else if (em.version == 2) {
+ em.u.md2.dir = rta_getattr_u8(tb_esn[LWTUNNEL_IP_OPT_ERSPAN_DIR]);
+ em.u.md2.hwid = rta_getattr_u16(tb_esn[LWTUNNEL_IP_OPT_ERSPAN_HWID]);
+ print_bool(PRINT_ANY, "erspan", "erspan ", true);
+ print_uint(PRINT_ANY, "ver", "ver %u ", 2);
+ print_color_string(PRINT_ANY, COLOR_INET,
+ "dir", "dir %s ",
+ em.u.md2.dir ? "egress" : "ingress");
+ print_uint(PRINT_ANY, "hwid", "hwid %u ", em.u.md2.hwid);
+ }
+ } else if ((flags & TUNNEL_VXLAN_OPT) && tb_opt[LWTUNNEL_IP_OPTS_VXLAN]) {
+ struct rtattr *tb_esn[LWTUNNEL_IP_OPT_VXLAN_MAX+1];
+ unsigned int gbp;
+
+ parse_rtattr_nested(tb_esn, LWTUNNEL_IP_OPT_VXLAN_MAX, tb_opt[LWTUNNEL_IP_OPTS_VXLAN]);
+ gbp = rta_getattr_u32(tb_esn[LWTUNNEL_IP_OPT_VXLAN_GBP]);
+ print_bool(PRINT_ANY, "vxlan", "vxlan ", true);
+ print_uint(PRINT_ANY, "gbp", "gbp %u ", gbp);
+ } else if ((flags & TUNNEL_GENEVE_OPT) && tb_opt[LWTUNNEL_IP_OPTS_GENEVE]) {
+ struct rtattr *tb_esn[LWTUNNEL_IP_OPT_GENEVE_MAX+1];
+ char data[128] = { };
+ __u16 class;
+ __u8 type;
+
+ parse_rtattr_nested(tb_esn, LWTUNNEL_IP_OPT_GENEVE_MAX, tb_opt[LWTUNNEL_IP_OPTS_GENEVE]);
+ class = rta_getattr_u16(tb_esn[LWTUNNEL_IP_OPT_GENEVE_CLASS]);
+ type = rta_getattr_u8(tb_esn[LWTUNNEL_IP_OPT_GENEVE_TYPE]);
+ hexstring_n2a(RTA_DATA(tb_esn[LWTUNNEL_IP_OPT_GENEVE_DATA]),
+ RTA_PAYLOAD(tb_esn[LWTUNNEL_IP_OPT_GENEVE_DATA]), data, sizeof(data));
+ print_bool(PRINT_ANY, "geneve", "geneve ", true);
+ print_uint(PRINT_ANY, "class", "class %u ", class);
+ print_uint(PRINT_ANY, "type", "type %u ", type);
+ print_color_string(PRINT_ANY, COLOR_INET, "data", "data %s ", data);
+ }
+ }
}
static void print_encap_bpf(FILE *fp, struct rtattr *encap)
@@ -799,10 +907,10 @@ static int parse_encap_ip(struct rtattr *rta, size_t len,
int *argcp, char ***argvp)
{
int id_ok = 0, dst_ok = 0, src_ok = 0, tos_ok = 0, ttl_ok = 0;
- int key_ok = 0, csum_ok = 0, seq_ok = 0;
+ int key_ok = 0, csum_ok = 0, seq_ok = 0, erspan_ok = 0;
char **argv = *argvp;
int argc = *argcp;
- int ret = 0;
+ int ret = 0, data_len;
__u16 flags = 0;
while (argc > 0) {
@@ -851,6 +959,124 @@ static int parse_encap_ip(struct rtattr *rta, size_t len,
if (get_u8(&ttl, *argv, 0))
invarg("\"ttl\" value is invalid\n", *argv);
ret = rta_addattr8(rta, len, LWTUNNEL_IP_TTL, ttl);
+ } else if (strcmp(*argv, "erspan") == 0) {
+ struct rtattr *nest, *nest2;
+ struct erspan_metadata em;
+
+ memset(&em, 0, sizeof(em));
+
+ if (erspan_ok++)
+ duparg2("erspan", *argv);
+
+ nest = rta_nest(rta, len, LWTUNNEL_IP_OPTS);
+ nest2 = rta_nest(rta, len, LWTUNNEL_IP_OPTS_ERSPAN);
+ {
+ flags |= TUNNEL_ERSPAN_OPT;
+ NEXT_ARG();
+ if (strcmp(*argv, "ver"))
+ duparg2("ver", *argv);
+ NEXT_ARG();
+ if (get_s32(&em.version, *argv, 0) ||
+ (em.version != 1 && em.version != 2))
+ invarg("\"ver\" value is invalid\n", *argv);
+ NEXT_ARG();
+ if (em.version == 1) {
+ if (strcmp(*argv, "idx"))
+ duparg2("idx", *argv);
+ NEXT_ARG();
+ if (get_be32(&em.u.index, *argv, 0))
+ invarg("\"idx\" value is invalid\n", *argv);
+ } else {
+ __u8 hwid;
+
+ if (strcmp(*argv, "dir"))
+ duparg2("dir", *argv);
+ NEXT_ARG();
+ if (strcmp(*argv, "ingress") == 0)
+ em.u.md2.dir = 0;
+ else if (strcmp(*argv, "egress") == 0)
+ em.u.md2.dir = 1;
+ else
+ invarg("\"dir\" value is invalid\n", *argv);
+ NEXT_ARG();
+ if (strcmp(*argv, "hwid"))
+ duparg2("hwid", *argv);
+ NEXT_ARG();
+ if (get_u8(&hwid, *argv, 0))
+ invarg("\"hwid\" value is invalid\n", *argv);
+ em.u.md2.hwid = hwid;
+ }
+ ret = rta_addattr8(rta, len, LWTUNNEL_IP_OPT_ERSPAN_VER, em.version);
+ if (em.version == 1) {
+ ret = rta_addattr32(rta, len, LWTUNNEL_IP_OPT_ERSPAN_INDEX, em.u.index);
+ } else {
+ ret = rta_addattr8(rta, len, LWTUNNEL_IP_OPT_ERSPAN_DIR, em.u.md2.dir);
+ ret = rta_addattr16(rta, len, LWTUNNEL_IP_OPT_ERSPAN_HWID, em.u.md2.hwid);
+ }
+ }
+ rta_nest_end(rta, nest2);
+ rta_nest_end(rta, nest);
+ } else if (strcmp(*argv, "vxlan") == 0) {
+ struct rtattr *nest, *nest2;
+ unsigned int gbp;
+
+ if (erspan_ok++)
+ duparg2("erspan", *argv);
+
+ nest = rta_nest(rta, len, LWTUNNEL_IP_OPTS);
+ nest2 = rta_nest(rta, len, LWTUNNEL_IP_OPTS_VXLAN);
+ {
+ flags |= TUNNEL_VXLAN_OPT;
+ NEXT_ARG();
+ if (strcmp(*argv, "gbp"))
+ duparg2("gbp", *argv);
+ NEXT_ARG();
+ if (get_u32(&gbp, *argv, 0))
+ invarg("\"gbp\" value is invalid\n", *argv);
+ ret = rta_addattr32(rta, len, LWTUNNEL_IP_OPT_VXLAN_GBP, gbp);
+ }
+ rta_nest_end(rta, nest2);
+ rta_nest_end(rta, nest);
+ } else if (strcmp(*argv, "geneve") == 0) {
+ struct rtattr *nest, *nest2;
+ uint8_t data[128] = { 0 };
+ __u16 class;
+ __u8 type;
+
+ if (erspan_ok++)
+ duparg2("erspan", *argv);
+
+ nest = rta_nest(rta, len, LWTUNNEL_IP_OPTS);
+ nest2 = rta_nest(rta, len, LWTUNNEL_IP_OPTS_GENEVE);
+ {
+ flags |= TUNNEL_GENEVE_OPT;
+ NEXT_ARG();
+ if (strcmp(*argv, "class"))
+ duparg2("class", *argv);
+ NEXT_ARG();
+ if (get_u16(&class, *argv, 0))
+ invarg("\"class\" value is invalid\n", *argv);
+ NEXT_ARG();
+ if (strcmp(*argv, "type"))
+ duparg2("type", *argv);
+ NEXT_ARG();
+ if (get_u8(&type, *argv, 0))
+ invarg("\"type\" value is invalid\n", *argv);
+ NEXT_ARG();
+ if (strcmp(*argv, "data"))
+ duparg2("data", *argv);
+ NEXT_ARG();
+ data_len = strlen(*argv);
+ if (!data_len)
+ break;
+ if (hex2mem(*argv, data, data_len / 2) < 0)
+ invarg("\"data\" value is invalid\n", *argv);
+ ret = rta_addattr_l(rta, len, LWTUNNEL_IP_OPT_GENEVE_DATA, data, data_len / 2);
+ ret = rta_addattr16(rta, len, LWTUNNEL_IP_OPT_GENEVE_CLASS, class);
+ ret = rta_addattr8(rta, len, LWTUNNEL_IP_OPT_GENEVE_TYPE, type);
+ }
+ rta_nest_end(rta, nest2);
+ rta_nest_end(rta, nest);
} else if (strcmp(*argv, "key") == 0) {
if (key_ok++)
duparg2("key", *argv);
@@ -966,10 +1192,10 @@ static int parse_encap_ip6(struct rtattr *rta, size_t len,
int *argcp, char ***argvp)
{
int id_ok = 0, dst_ok = 0, src_ok = 0, tos_ok = 0, ttl_ok = 0;
- int key_ok = 0, csum_ok = 0, seq_ok = 0;
+ int key_ok = 0, csum_ok = 0, seq_ok = 0, erspan_ok = 0;
char **argv = *argvp;
int argc = *argcp;
- int ret = 0;
+ int ret = 0, data_len;
__u16 flags = 0;
while (argc > 0) {
@@ -1024,6 +1250,123 @@ static int parse_encap_ip6(struct rtattr *rta, size_t len,
if (key_ok++)
duparg2("key", *argv);
flags |= TUNNEL_KEY;
+ } else if (strcmp(*argv, "erspan") == 0) {
+ struct rtattr *nest, *nest2;
+ struct erspan_metadata em;
+
+ memset(&em, 0, sizeof(em));
+
+ nest = rta_nest(rta, len, LWTUNNEL_IP6_OPTS);
+ nest2 = rta_nest(rta, len, LWTUNNEL_IP_OPTS_ERSPAN);
+ {
+ if (erspan_ok++)
+ duparg2("erspan", *argv);
+ flags |= TUNNEL_ERSPAN_OPT;
+ NEXT_ARG();
+ if (strcmp(*argv, "ver"))
+ duparg2("ver", *argv);
+ NEXT_ARG();
+ if (get_s32(&em.version, *argv, 0) ||
+ (em.version != 1 && em.version != 2))
+ invarg("\"ver\" value is invalid\n", *argv);
+ NEXT_ARG();
+ if (em.version == 1) {
+ if (strcmp(*argv, "idx"))
+ duparg2("idx", *argv);
+ NEXT_ARG();
+ if (get_be32(&em.u.index, *argv, 0))
+ invarg("\"idx\" value is invalid\n", *argv);
+ } else {
+ __u8 hwid;
+
+ if (strcmp(*argv, "dir"))
+ duparg2("dir", *argv);
+ NEXT_ARG();
+ if (strcmp(*argv, "ingress") == 0)
+ em.u.md2.dir = 0;
+ else if (strcmp(*argv, "egress") == 0)
+ em.u.md2.dir = 1;
+ else
+ invarg("\"dir\" value is invalid\n", *argv);
+ NEXT_ARG();
+ if (strcmp(*argv, "hwid"))
+ duparg2("hwid", *argv);
+ NEXT_ARG();
+ if (get_u8(&hwid, *argv, 0))
+ invarg("\"hwid\" value is invalid\n", *argv);
+ em.u.md2.hwid = hwid;
+ }
+ ret = rta_addattr8(rta, len, LWTUNNEL_IP_OPT_ERSPAN_VER, em.version);
+ if (em.version == 1) {
+ ret = rta_addattr32(rta, len, LWTUNNEL_IP_OPT_ERSPAN_INDEX, em.u.index);
+ } else {
+ ret = rta_addattr8(rta, len, LWTUNNEL_IP_OPT_ERSPAN_DIR, em.u.md2.dir);
+ ret = rta_addattr16(rta, len, LWTUNNEL_IP_OPT_ERSPAN_HWID, em.u.md2.hwid);
+ }
+ }
+ rta_nest_end(rta, nest2);
+ rta_nest_end(rta, nest);
+ } else if (strcmp(*argv, "vxlan") == 0) {
+ struct rtattr *nest, *nest2;
+ unsigned int gbp;
+
+ if (erspan_ok++)
+ duparg2("erspan", *argv);
+
+ nest = rta_nest(rta, len, LWTUNNEL_IP6_OPTS);
+ nest2 = rta_nest(rta, len, LWTUNNEL_IP_OPTS_VXLAN);
+ {
+ flags |= TUNNEL_VXLAN_OPT;
+ NEXT_ARG();
+ if (strcmp(*argv, "gbp"))
+ duparg2("gbp", *argv);
+ NEXT_ARG();
+ if (get_u32(&gbp, *argv, 0))
+ invarg("\"gbp\" value is invalid\n", *argv);
+ ret = rta_addattr32(rta, len, LWTUNNEL_IP_OPT_VXLAN_GBP, gbp);
+ }
+ rta_nest_end(rta, nest2);
+ rta_nest_end(rta, nest);
+ } else if (strcmp(*argv, "geneve") == 0) {
+ struct rtattr *nest, *nest2;
+ uint8_t data[128] = { };
+ __u16 class;
+ __u8 type;
+
+ if (erspan_ok++)
+ duparg2("erspan", *argv);
+
+ nest = rta_nest(rta, len, LWTUNNEL_IP6_OPTS);
+ nest2 = rta_nest(rta, len, LWTUNNEL_IP_OPTS_GENEVE);
+ {
+ flags |= TUNNEL_GENEVE_OPT;
+ NEXT_ARG();
+ if (strcmp(*argv, "class"))
+ duparg2("class", *argv);
+ NEXT_ARG();
+ if (get_u16(&class, *argv, 0))
+ invarg("\"class\" value is invalid\n", *argv);
+ NEXT_ARG();
+ if (strcmp(*argv, "type"))
+ duparg2("type", *argv);
+ NEXT_ARG();
+ if (get_u8(&type, *argv, 0))
+ invarg("\"type\" value is invalid\n", *argv);
+ NEXT_ARG();
+ if (strcmp(*argv, "data"))
+ duparg2("data", *argv);
+ NEXT_ARG();
+ data_len = strlen(*argv);
+ if (!data_len)
+ break;
+ if (hex2mem(*argv, data, data_len / 2) < 0)
+ invarg("\"data\" value is invalid\n", *argv);
+ ret = rta_addattr_l(rta, len, LWTUNNEL_IP_OPT_GENEVE_DATA, data, data_len / 2);
+ ret = rta_addattr16(rta, len, LWTUNNEL_IP_OPT_GENEVE_CLASS, class);
+ ret = rta_addattr8(rta, len, LWTUNNEL_IP_OPT_GENEVE_TYPE, type);
+ }
+ rta_nest_end(rta, nest2);
+ rta_nest_end(rta, nest);
} else if (strcmp(*argv, "csum") == 0) {
if (csum_ok++)
duparg2("csum", *argv);
--
2.18.1
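The geneve "data" handling in the patch above turns a hex string into raw bytes held in a fixed 128-byte buffer. As a minimal sketch of that conversion with an explicit capacity check, assuming hypothetical helper names (hex_nibble and hex_decode_bounded are made up for illustration and are not iproute2 functions), something like the following could be used:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Convert one hex digit to its value, or return -1 if it is not hex. */
static int hex_nibble(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Hypothetical helper: decode a hex string into out[].
 * Returns the number of bytes written, or -1 if the string is empty,
 * has an odd length, contains a non-hex character, or would not fit
 * into cap bytes.
 */
static int hex_decode_bounded(const char *hex, uint8_t *out, size_t cap)
{
	size_t len = strlen(hex);
	size_t i;

	if (len == 0 || len % 2 != 0 || len / 2 > cap)
		return -1;

	for (i = 0; i < len / 2; i++) {
		int hi = hex_nibble(hex[2 * i]);
		int lo = hex_nibble(hex[2 * i + 1]);

		if (hi < 0 || lo < 0)
			return -1;
		out[i] = (uint8_t)((hi << 4) | lo);
	}
	return (int)(len / 2);
}
```

With the example string from the commit message, "1212121234567890", this yields 8 bytes; a destination smaller than that is rejected instead of being overrun.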
* Re: [PATCH net-next 0/5] lwtunnel: add ip and ip6 options setting and dumping
2019-11-06 9:01 [PATCH net-next 0/5] lwtunnel: add ip and ip6 options setting and dumping Xin Long
2019-11-06 9:01 ` [PATCH net-next 1/5] lwtunnel: add options process for arp request Xin Long
2019-11-06 9:03 ` [PATCH net-next 0/5] lwtunnel: add ip and ip6 options setting and dumping Xin Long
@ 2019-11-07 5:14 ` David Miller
2019-11-07 10:50 ` Xin Long
2 siblings, 1 reply; 13+ messages in thread
From: David Miller @ 2019-11-07 5:14 UTC (permalink / raw)
To: lucien.xin; +Cc: netdev, simon.horman, jbenc, tgraf, u9012063
From: Xin Long <lucien.xin@gmail.com>
Date: Wed, 6 Nov 2019 17:01:02 +0800
> With this patchset, users can configure options by ip route encap
> for geneve, vxlan and erspan lwtunnel, like:
>
> # ip r a 1.1.1.0/24 encap ip id 1 geneve class 0 type 0 \
> data "1212121234567890" dst 10.1.0.2 dev geneve1
>
> # ip r a 1.1.1.0/24 encap ip id 1 vxlan gbp 456 \
> dst 10.1.0.2 dev erspan1
>
> # ip r a 1.1.1.0/24 encap ip id 1 erspan ver 1 idx 123 \
> dst 10.1.0.2 dev erspan1
>
> iproute side patch is attached on the reply of this mail.
>
> Thank Simon for good advice.
Series applied, looks good.
Can you comment about how this code is using the deprecated nla
parsers for new options?
Thank you.
* Re: [PATCH net-next 0/5] lwtunnel: add ip and ip6 options setting and dumping
2019-11-07 5:14 ` David Miller
@ 2019-11-07 10:50 ` Xin Long
2019-11-07 16:18 ` David Ahern
2019-11-07 18:34 ` David Miller
0 siblings, 2 replies; 13+ messages in thread
From: Xin Long @ 2019-11-07 10:50 UTC (permalink / raw)
To: David Miller
Cc: network dev, Simon Horman, Jiri Benc, Thomas Graf, William Tu
On Thu, Nov 7, 2019 at 1:15 PM David Miller <davem@davemloft.net> wrote:
>
> From: Xin Long <lucien.xin@gmail.com>
> Date: Wed, 6 Nov 2019 17:01:02 +0800
>
> > With this patchset, users can configure options by ip route encap
> > for geneve, vxlan and erspan lwtunnel, like:
> >
> > # ip r a 1.1.1.0/24 encap ip id 1 geneve class 0 type 0 \
> > data "1212121234567890" dst 10.1.0.2 dev geneve1
> >
> > # ip r a 1.1.1.0/24 encap ip id 1 vxlan gbp 456 \
> > dst 10.1.0.2 dev erspan1
> >
> > # ip r a 1.1.1.0/24 encap ip id 1 erspan ver 1 idx 123 \
> > dst 10.1.0.2 dev erspan1
> >
> > iproute side patch is attached on the reply of this mail.
> >
> > Thank Simon for good advice.
>
> Series applied, looks good.
>
> Can you comment about how this code is using the deprecated nla
> parsers for new options?
I didn't think about it much; I just followed what cls_flower.c and
act_tunnel_key.c do to parse GENEVE options.
Thinking about it again, nla_parse_nested() should always be used for
new options. Should I post a fix for it, since no userspace code
accesses this yet?
* Re: [PATCH net-next 0/5] lwtunnel: add ip and ip6 options setting and dumping
2019-11-07 10:50 ` Xin Long
@ 2019-11-07 16:18 ` David Ahern
2019-11-08 14:08 ` Xin Long
2019-11-07 18:34 ` David Miller
1 sibling, 1 reply; 13+ messages in thread
From: David Ahern @ 2019-11-07 16:18 UTC (permalink / raw)
To: Xin Long, David Miller
Cc: network dev, Simon Horman, Jiri Benc, Thomas Graf, William Tu
On 11/7/19 3:50 AM, Xin Long wrote:
> Now think about it again, nla_parse_nested() should always be used on
> new options, should I post a fix for it? since no code to access this
> from userspace yet.
please do. All new options should use strict parsing from the beginning.
And you should be able to set LWTUNNEL_IP_OPT_GENEVE_UNSPEC to
.strict_start_type = LWTUNNEL_IP_OPT_GENEVE_UNSPEC + 1 in the policy so
that new command using new option on an old kernel throws an error.
* Re: [PATCH net-next 0/5] lwtunnel: add ip and ip6 options setting and dumping
2019-11-07 10:50 ` Xin Long
2019-11-07 16:18 ` David Ahern
@ 2019-11-07 18:34 ` David Miller
1 sibling, 0 replies; 13+ messages in thread
From: David Miller @ 2019-11-07 18:34 UTC (permalink / raw)
To: lucien.xin; +Cc: netdev, simon.horman, jbenc, tgraf, u9012063
From: Xin Long <lucien.xin@gmail.com>
Date: Thu, 7 Nov 2019 18:50:15 +0800
> Now think about it again, nla_parse_nested() should always be used on
> new options, should I post a fix for it? since no code to access this
> from userspace yet.
If that is true, yes you should.
* Re: [PATCH net-next 0/5] lwtunnel: add ip and ip6 options setting and dumping
2019-11-07 16:18 ` David Ahern
@ 2019-11-08 14:08 ` Xin Long
2019-11-08 15:29 ` David Ahern
0 siblings, 1 reply; 13+ messages in thread
From: Xin Long @ 2019-11-08 14:08 UTC (permalink / raw)
To: David Ahern
Cc: David Miller, network dev, Simon Horman, Jiri Benc, Thomas Graf,
William Tu
On Fri, Nov 8, 2019 at 12:18 AM David Ahern <dsahern@gmail.com> wrote:
>
> On 11/7/19 3:50 AM, Xin Long wrote:
> > Now think about it again, nla_parse_nested() should always be used on
> > new options, should I post a fix for it? since no code to access this
> > from userspace yet.
>
> please do. All new options should use strict parsing from the beginning.
> And you should be able to set LWTUNNEL_IP_OPT_GENEVE_UNSPEC to
> .strict_start_type = LWTUNNEL_IP_OPT_GENEVE_UNSPEC + 1 in the policy so
> that new command using new option on an old kernel throws an error.
I'm not sure if strict_start_type is needed when using nla_parse_nested().
.strict_start_type seems only checked in validate_nla():
if (strict_start_type && type >= strict_start_type)
validate |= NL_VALIDATE_STRICT; <------ [1]
But in the path of:
nla_parse_nested() ->
__nla_parse() ->
__nla_validate_parse() ->
validate_nla()
The 'validate' parameter is always NL_VALIDATE_STRICT, whether or not
code [1] is triggered. Or am I missing something here?
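The flag logic can be illustrated with a small userspace model. This is only a sketch: the MODEL_VALIDATE_* constants and both function names are invented for illustration and do not match the kernel's real NL_VALIDATE_* values; only the conditional mirrors the check marked [1].

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel's validation flags (assumption:
 * the real NL_VALIDATE_STRICT is a combination of several bits). */
#define MODEL_VALIDATE_LIBERAL 0u
#define MODEL_VALIDATE_STRICT  1u

/*
 * Mirrors the check marked [1]: an attribute type at or beyond
 * strict_start_type is validated strictly even if the caller asked
 * for liberal validation. A strict_start_type of 0 means "not set".
 */
static unsigned int model_effective_validate(unsigned int validate,
					     int strict_start_type,
					     int type)
{
	if (strict_start_type && type >= strict_start_type)
		validate |= MODEL_VALIDATE_STRICT;
	return validate;
}

/* True if the attribute ends up being validated strictly. */
static bool model_is_strict(unsigned int validate, int strict_start_type,
			    int type)
{
	return (model_effective_validate(validate, strict_start_type, type) &
		MODEL_VALIDATE_STRICT) != 0;
}
```

In this model, a caller that already passes MODEL_VALIDATE_STRICT gets strict validation for every type, with or without strict_start_type; the field only changes the outcome when the caller starts from MODEL_VALIDATE_LIBERAL.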
* Re: [PATCH net-next 0/5] lwtunnel: add ip and ip6 options setting and dumping
2019-11-08 14:08 ` Xin Long
@ 2019-11-08 15:29 ` David Ahern
0 siblings, 0 replies; 13+ messages in thread
From: David Ahern @ 2019-11-08 15:29 UTC (permalink / raw)
To: Xin Long
Cc: David Miller, network dev, Simon Horman, Jiri Benc, Thomas Graf,
William Tu
On 11/8/19 7:08 AM, Xin Long wrote:
> On Fri, Nov 8, 2019 at 12:18 AM David Ahern <dsahern@gmail.com> wrote:
>>
>> On 11/7/19 3:50 AM, Xin Long wrote:
>>> Now think about it again, nla_parse_nested() should always be used on
>>> new options, should I post a fix for it? since no code to access this
>>> from userspace yet.
>>
>> please do. All new options should use strict parsing from the beginning.
>> And you should be able to set LWTUNNEL_IP_OPT_GENEVE_UNSPEC to
>> .strict_start_type = LWTUNNEL_IP_OPT_GENEVE_UNSPEC + 1 in the policy so
>> that new command using new option on an old kernel throws an error.
> I'm not sure if strict_start_type is needed when using nla_parse_nested().
>
> .strict_start_type seems only checked in validate_nla():
>
> if (strict_start_type && type >= strict_start_type)
> validate |= NL_VALIDATE_STRICT; <------ [1]
>
> But in the path of:
> nla_parse_nested() ->
> __nla_parse() ->
> __nla_validate_parse() ->
> validate_nla()
>
> The param 'validate' is always NL_VALIDATE_STRICT, no matter Code [1] is
> triggered or not. or am I missing something here?
>
ok, I missed that.