[net-next,v2,0/5] seg6: add support for SRv6 End.DT4 behavior

* [net-next,v2,0/5] seg6: add support for SRv6 End.DT4 behavior
@ 2020-11-07 15:31 Andrea Mayer
  2020-11-07 15:31 ` [net-next,v2,1/5] vrf: add mac header for tunneled packets when sniffer is attached Andrea Mayer
                   ` (4 more replies)
  0 siblings, 5 replies; 34+ messages in thread
From: Andrea Mayer @ 2020-11-07 15:31 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski, Shuah Khan,
	Shrijeet Mukherjee, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, netdev, linux-kernel, linux-kselftest
  Cc: Stefano Salsano, Paolo Lungaroni, Ahmed Abdelsalam, Andrea Mayer

This patchset provides support for the SRv6 End.DT4 behavior.

The SRv6 End.DT4 is used to implement multi-tenant IPv4 L3 VPN. It
decapsulates the received packets and performs IPv4 routing lookup in the
routing table of the tenant. The SRv6 End.DT4 Linux implementation
leverages a VRF device. The SRv6 End.DT4 is defined in the SRv6 Network
Programming [1].
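
As a rough illustration of the decapsulation step, the userspace sketch
below locates the inner IPv4 header in an IPv6 packet that may carry a
Segment Routing Header. It is not part of this patchset and it is not the
kernel code path: find_inner_ipv4() and the constants are hypothetical names
introduced only for this example, and the header layout is assumed to be the
standard IPv6 / routing header one (next header 43, routing type 4 for the
SRH, next header 4 for an encapsulated IPv4 packet).

```c
#include <stddef.h>
#include <stdint.h>

#define NEXTHDR_ROUTING	43	/* IPv6 Routing header */
#define NEXTHDR_IPIP	4	/* IPv4 encapsulated in IPv6 */
#define SRH_ROUTING_TYPE 4	/* Routing Type used by the SRH */
#define IPV6_HDR_LEN	40
#define IPV4_MIN_HDR_LEN 20

/* Hypothetical helper (assumption, not a kernel API): return the byte offset
 * of the inner IPv4 header, or -1 if the packet is truncated or does not
 * carry an IPv4 payload.
 */
long find_inner_ipv4(const uint8_t *pkt, size_t len)
{
	size_t off = IPV6_HDR_LEN;
	uint8_t nexthdr;

	if (len < IPV6_HDR_LEN || (pkt[0] >> 4) != 6)
		return -1;

	nexthdr = pkt[6];	/* Next Header field of the outer IPv6 header */

	if (nexthdr == NEXTHDR_ROUTING) {
		size_t ext_len;

		if (len < off + 8 || pkt[off + 2] != SRH_ROUTING_TYPE)
			return -1;

		/* Hdr Ext Len counts 8-octet units beyond the first 8 octets */
		ext_len = ((size_t)pkt[off + 1] + 1) * 8;
		if (len < off + ext_len)
			return -1;

		nexthdr = pkt[off];	/* Next Header field of the SRH */
		off += ext_len;
	}

	if (nexthdr != NEXTHDR_IPIP)
		return -1;

	/* the payload must hold at least a minimal IPv4 header */
	if (len < off + IPV4_MIN_HDR_LEN || (pkt[off] >> 4) != 4)
		return -1;

	return (long)off;
}
```

In the kernel the equivalent work is done on the skb; the sketch only shows
which header fields decide whether a packet qualifies for End.DT4
processing.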

- Patch 1/5 is needed to solve a pre-existing issue with tunneled packets
  when a sniffer is attached;

- Patch 2/5 improves the management of the seg6local attributes used by the
  SRv6 behaviors;

- Patch 3/5 introduces two callbacks used for customizing the
  creation/destruction of an SRv6 behavior;

- Patch 4/5 is the core patch that adds support for the SRv6 End.DT4
  behavior;

- Patch 5/5 adds the selftest for SRv6 End.DT4 behavior.

I would like to thank David Ahern for his support during the development of
this patch set.

Comments, suggestions and improvements are very welcome!

Thanks,
Andrea Mayer

v2
 no changes made: resubmitted after false build report.

v1
 improve comments;

 add new patch 2/5 titled: seg6: improve management of behavior attributes

 seg6: add support for the SRv6 End.DT4 behavior
  - remove the inline keyword in the definition of fib6_config_get_net().

 selftests: add selftest for the SRv6 End.DT4 behavior
  - add check for the vrf sysctl

[1] https://tools.ietf.org/html/draft-ietf-spring-srv6-network-programming

Andrea Mayer (5):
  vrf: add mac header for tunneled packets when sniffer is attached
  seg6: improve management of behavior attributes
  seg6: add callbacks for customizing the creation/destruction of a
    behavior
  seg6: add support for the SRv6 End.DT4 behavior
  selftests: add selftest for the SRv6 End.DT4 behavior

 drivers/net/vrf.c                             |  78 ++-
 net/ipv6/seg6_local.c                         | 370 ++++++++++++-
 .../selftests/net/srv6_end_dt4_l3vpn_test.sh  | 494 ++++++++++++++++++
 3 files changed, 927 insertions(+), 15 deletions(-)
 create mode 100755 tools/testing/selftests/net/srv6_end_dt4_l3vpn_test.sh

-- 
2.20.1



* [net-next,v2,1/5] vrf: add mac header for tunneled packets when sniffer is attached
  2020-11-07 15:31 [net-next,v2,0/5] seg6: add support for SRv6 End.DT4 behavior Andrea Mayer
@ 2020-11-07 15:31 ` Andrea Mayer
  2020-11-10 22:50   ` Jakub Kicinski
  2020-11-07 15:31 ` [net-next,v2,2/5] seg6: improve management of behavior attributes Andrea Mayer
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 34+ messages in thread
From: Andrea Mayer @ 2020-11-07 15:31 UTC (permalink / raw)

Before this patch, a sniffer attached to a VRF used as the receiving
interface of L3 tunneled packets sees them as malformed (e.g. tcpdump
shows bogus packets).

The reason is that a tunneled L3 packet does not carry any L2
information: when the VRF is set as the receiving interface of a
decapsulated L3 packet, no MAC header is set or valid.
This patch therefore adds a MAC header to any packet received directly
on the VRF interface ONLY IF:

 i) a sniffer is attached on the VRF and ii) the mac header is not set.

In this case, the MAC address of the VRF is copied into both the
destination and the source address of the Ethernet header. The protocol
type is set to either IPv4 or IPv6, depending on which L3 packet is
received.
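
For readers who want to see the resulting header layout, here is a minimal
userspace sketch of the 14-byte Ethernet header described above. It is an
illustration only: fill_synthetic_eth_header() is a hypothetical name that
does not appear in the patch, and the sketch works on a plain byte array
rather than on an skb.

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN	6
#define ETH_HLEN	14
#define ETH_P_IP	0x0800
#define ETH_P_IPV6	0x86DD

/* Hypothetical userspace helper (assumption, not the kernel function):
 * build an Ethernet header whose destination and source addresses are both
 * the address of the receiving device, followed by the EtherType of the L3
 * packet in network byte order.
 */
void fill_synthetic_eth_header(uint8_t hdr[ETH_HLEN],
			       const uint8_t dev_addr[ETH_ALEN],
			       uint16_t proto)
{
	memcpy(hdr, dev_addr, ETH_ALEN);		/* destination MAC */
	memcpy(hdr + ETH_ALEN, dev_addr, ETH_ALEN);	/* source MAC */
	hdr[12] = (uint8_t)(proto >> 8);		/* EtherType, high byte */
	hdr[13] = (uint8_t)(proto & 0xff);		/* EtherType, low byte */
}
```

Calling it with ETH_P_IP or ETH_P_IPV6 yields the two header variants that a
sniffer on the VRF would see for decapsulated IPv4 and IPv6 packets.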

Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
---
 drivers/net/vrf.c | 78 +++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 72 insertions(+), 6 deletions(-)

diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index 60c1aadece89..26f2ed02a5c1 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -1263,6 +1263,61 @@ static void vrf_ip6_input_dst(struct sk_buff *skb, struct net_device *vrf_dev,
 	skb_dst_set(skb, &rt6->dst);
 }
 
+static int vrf_prepare_mac_header(struct sk_buff *skb,
+				  struct net_device *vrf_dev, u16 proto)
+{
+	struct ethhdr *eth;
+	int err;
+
+	/* in general, we do not know if there is enough space in the head of
+	 * the packet for hosting the mac header.
+	 */
+	err = skb_cow_head(skb, LL_RESERVED_SPACE(vrf_dev));
+	if (unlikely(err))
+		/* no space in the skb head */
+		return -ENOBUFS;
+
+	__skb_push(skb, ETH_HLEN);
+	eth = (struct ethhdr *)skb->data;
+
+	skb_reset_mac_header(skb);
+
+	/* we set the ethernet destination and the source addresses to the
+	 * address of the VRF device.
+	 */
+	ether_addr_copy(eth->h_dest, vrf_dev->dev_addr);
+	ether_addr_copy(eth->h_source, vrf_dev->dev_addr);
+	eth->h_proto = htons(proto);
+
+	/* the destination address of the Ethernet frame corresponds to the
+	 * address set on the VRF interface; therefore, the packet is intended
+	 * to be processed locally.
+	 */
+	skb->protocol = eth->h_proto;
+	skb->pkt_type = PACKET_HOST;
+
+	skb_postpush_rcsum(skb, skb->data, ETH_HLEN);
+
+	skb_pull_inline(skb, ETH_HLEN);
+
+	return 0;
+}
+
+/* prepare and add the mac header to the packet if it was not set previously.
+ * In this way, packet sniffers such as tcpdump can parse the packet correctly.
+ * If the mac header was already set, the original mac header is left
+ * untouched and the function returns immediately.
+ */
+static int vrf_add_mac_header_if_unset(struct sk_buff *skb,
+				       struct net_device *vrf_dev,
+				       u16 proto)
+{
+	if (skb_mac_header_was_set(skb))
+		return 0;
+
+	return vrf_prepare_mac_header(skb, vrf_dev, proto);
+}
+
 static struct sk_buff *vrf_ip6_rcv(struct net_device *vrf_dev,
 				   struct sk_buff *skb)
 {
@@ -1289,9 +1344,15 @@ static struct sk_buff *vrf_ip6_rcv(struct net_device *vrf_dev,
 		skb->skb_iif = vrf_dev->ifindex;
 
 		if (!list_empty(&vrf_dev->ptype_all)) {
-			skb_push(skb, skb->mac_len);
-			dev_queue_xmit_nit(skb, vrf_dev);
-			skb_pull(skb, skb->mac_len);
+			int err;
+
+			err = vrf_add_mac_header_if_unset(skb, vrf_dev,
+							  ETH_P_IPV6);
+			if (likely(!err)) {
+				skb_push(skb, skb->mac_len);
+				dev_queue_xmit_nit(skb, vrf_dev);
+				skb_pull(skb, skb->mac_len);
+			}
 		}
 
 		IP6CB(skb)->flags |= IP6SKB_L3SLAVE;
@@ -1334,9 +1395,14 @@ static struct sk_buff *vrf_ip_rcv(struct net_device *vrf_dev,
 	vrf_rx_stats(vrf_dev, skb->len);
 
 	if (!list_empty(&vrf_dev->ptype_all)) {
-		skb_push(skb, skb->mac_len);
-		dev_queue_xmit_nit(skb, vrf_dev);
-		skb_pull(skb, skb->mac_len);
+		int err;
+
+		err = vrf_add_mac_header_if_unset(skb, vrf_dev, ETH_P_IP);
+		if (likely(!err)) {
+			skb_push(skb, skb->mac_len);
+			dev_queue_xmit_nit(skb, vrf_dev);
+			skb_pull(skb, skb->mac_len);
+		}
 	}
 
 	skb = vrf_rcv_nfhook(NFPROTO_IPV4, NF_INET_PRE_ROUTING, skb, vrf_dev);
-- 
2.20.1



* [net-next,v2,2/5] seg6: improve management of behavior attributes
  2020-11-07 15:31 [net-next,v2,0/5] seg6: add support for SRv6 End.DT4 behavior Andrea Mayer
  2020-11-07 15:31 ` [net-next,v2,1/5] vrf: add mac header for tunneled packets when sniffer is attached Andrea Mayer
@ 2020-11-07 15:31 ` Andrea Mayer
  2020-11-10 22:50   ` Jakub Kicinski
  2020-11-07 15:31 ` [net-next,v2,3/5] seg6: add callbacks for customizing the creation/destruction of a behavior Andrea Mayer
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 34+ messages in thread
From: Andrea Mayer @ 2020-11-07 15:31 UTC (permalink / raw)

Depending on the attribute (e.g. SEG6_LOCAL_SRH, SEG6_LOCAL_TABLE), the
parse() callback performs some validity checks on the provided input
and updates the tunnel state (slwt) with the result of the parsing
operation. However, an attribute may also need to acquire additional
resources (e.g. allocating memory or setting up an eBPF program) in the
parse() callback to complete the parsing operation.

The parse() callbacks are invoked by parse_nla_action() for each
attribute belonging to a specific behavior. Given a behavior with N
attributes, if the parsing of the i-th attribute fails,
parse_nla_action() returns immediately with an error. However, the
resources acquired while parsing the preceding i-1 attributes are not
freed by parse_nla_action().

Attributes that acquire resources must release them *explicitly* in both
seg6_local_build_state() and seg6_local_destroy_state(). As a result,
adding a new attribute of this type requires changing both functions so
that the resources are released correctly.

The seg6local infrastructure still lacks a simple and structured way to
release the resources acquired in the parse() operations.

This patch introduces a new callback in the struct seg6_action_param named
destroy(). This callback releases any resource which may have been acquired
in the parse() counterpart. Each attribute may or may not implement the
destroy() callback depending on whether it needs to free some acquired
resources.

The destroy() callback comes with several advantages:

 1) we can have as many attributes as we want for a given behavior with no
    need to explicitly free the taken resources;

 2) as in the case of seg6_local_build_state(),
    seg6_local_destroy_state() does not need to handle the release of
    resources directly. Indeed, it calls the destroy_attrs() function which
    is in charge of calling the destroy() callback for every set attribute.
    We do not need to patch seg6_local_{build/destroy}_state() anymore as
    we add new attributes;

 3) the code is more readable and better structured. Indeed, all the
    information needed to handle a given attribute is contained in only
    one place;

 4) it facilitates the integration with new features introduced in further
    patches.
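
The rollback scheme described above can be modelled in a few lines of
userspace C. This is a simplified sketch, not the code in the diff below:
struct state, attr_table, parse_attrs() and destroy_attrs_range() are
hypothetical names, each callback takes an attribute id for brevity, and a
"resource" is just a flag in an array. What it keeps from the patch is the
idea of a bitmask of successfully parsed attributes and an optional
per-attribute destroy() callback.

```c
#include <stddef.h>

#define ATTR_MAX 4

struct state {
	int acquired[ATTR_MAX];	/* 1 while attribute i holds a resource */
	int fail_at;		/* id whose parse() fails, or -1 for none */
};

struct attr_ops {
	int  (*parse)(struct state *st, int id);
	void (*destroy)(struct state *st, int id);	/* optional */
};

/* parse() variant that acquires a (fake) resource */
static int parse_acquire(struct state *st, int id)
{
	if (st->fail_at == id)
		return -1;
	st->acquired[id] = 1;
	return 0;
}

/* parse() variant that only validates and holds nothing */
static int parse_plain(struct state *st, int id)
{
	return st->fail_at == id ? -1 : 0;
}

static void destroy_release(struct state *st, int id)
{
	st->acquired[id] = 0;
}

static const struct attr_ops attr_table[ATTR_MAX] = {
	[0] = { .parse = parse_acquire, .destroy = destroy_release },
	[1] = { .parse = parse_plain },
	[2] = { .parse = parse_acquire, .destroy = destroy_release },
	[3] = { .parse = parse_plain },
};

/* call destroy() for every attribute set in @parsed, from @start up to
 * @end excluded; attributes without a destroy() callback are skipped.
 */
void destroy_attrs_range(unsigned long parsed, int start, int end,
			 struct state *st)
{
	for (int i = start; i < end; i++) {
		if (!(parsed & (1UL << i)))
			continue;
		if (attr_table[i].destroy)
			attr_table[i].destroy(st, i);
	}
}

/* parse every attribute set in @wanted; on failure, undo the ones that were
 * already parsed and report the error.
 */
int parse_attrs(unsigned long wanted, struct state *st)
{
	unsigned long parsed = 0;
	int i, err;

	for (i = 0; i < ATTR_MAX; i++) {
		if (!(wanted & (1UL << i)))
			continue;

		err = attr_table[i].parse(st, i);
		if (err < 0) {
			destroy_attrs_range(parsed, 0, i, st);
			return err;
		}

		parsed |= 1UL << i;
	}

	return 0;
}
```

With fail_at set to 2 and attributes 0..2 requested, parse_attrs() fails on
attribute 2 and the resource taken by attribute 0 is released before
returning, which is the behavior the destroy() callback is meant to
guarantee.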

Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
---
 net/ipv6/seg6_local.c | 103 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 93 insertions(+), 10 deletions(-)

diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
index eba23279912d..63a82e2fdea9 100644
--- a/net/ipv6/seg6_local.c
+++ b/net/ipv6/seg6_local.c
@@ -710,6 +710,12 @@ static int cmp_nla_srh(struct seg6_local_lwt *a, struct seg6_local_lwt *b)
 	return memcmp(a->srh, b->srh, len);
 }
 
+static void destroy_attr_srh(struct seg6_local_lwt *slwt)
+{
+	kfree(slwt->srh);
+	slwt->srh = NULL;
+}
+
 static int parse_nla_table(struct nlattr **attrs, struct seg6_local_lwt *slwt)
 {
 	slwt->table = nla_get_u32(attrs[SEG6_LOCAL_TABLE]);
@@ -901,16 +907,33 @@ static int cmp_nla_bpf(struct seg6_local_lwt *a, struct seg6_local_lwt *b)
 	return strcmp(a->bpf.name, b->bpf.name);
 }
 
+static void destroy_attr_bpf(struct seg6_local_lwt *slwt)
+{
+	kfree(slwt->bpf.name);
+	if (slwt->bpf.prog)
+		bpf_prog_put(slwt->bpf.prog);
+
+	slwt->bpf.name = NULL;
+	slwt->bpf.prog = NULL;
+}
+
 struct seg6_action_param {
 	int (*parse)(struct nlattr **attrs, struct seg6_local_lwt *slwt);
 	int (*put)(struct sk_buff *skb, struct seg6_local_lwt *slwt);
 	int (*cmp)(struct seg6_local_lwt *a, struct seg6_local_lwt *b);
+
+	/* optional destroy() callback useful for releasing resources which
+	 * have been previously acquired in the corresponding parse()
+	 * function.
+	 */
+	void (*destroy)(struct seg6_local_lwt *slwt);
 };
 
 static struct seg6_action_param seg6_action_params[SEG6_LOCAL_MAX + 1] = {
 	[SEG6_LOCAL_SRH]	= { .parse = parse_nla_srh,
 				    .put = put_nla_srh,
-				    .cmp = cmp_nla_srh },
+				    .cmp = cmp_nla_srh,
+				    .destroy = destroy_attr_srh },
 
 	[SEG6_LOCAL_TABLE]	= { .parse = parse_nla_table,
 				    .put = put_nla_table,
@@ -934,13 +957,68 @@ static struct seg6_action_param seg6_action_params[SEG6_LOCAL_MAX + 1] = {
 
 	[SEG6_LOCAL_BPF]	= { .parse = parse_nla_bpf,
 				    .put = put_nla_bpf,
-				    .cmp = cmp_nla_bpf },
+				    .cmp = cmp_nla_bpf,
+				    .destroy = destroy_attr_bpf },
 
 };
 
+/* call the destroy() callback (if available) for each set attribute in
+ * @parsed_attrs, starting from attribute index @start up to @end excluded.
+ */
+static void __destroy_attrs(unsigned long parsed_attrs, int start, int end,
+			    struct seg6_local_lwt *slwt)
+{
+	struct seg6_action_param *param;
+	int i;
+
+	/* Every seg6local attribute is identified by an ID which is encoded as
+	 * a flag (i.e: 1 << ID) in the @parsed_attrs bitmask; such bitmask
+	 * keeps track of the attributes parsed so far.
+	 *
+	 * We scan the @parsed_attrs bitmask, starting from the attribute
+	 * identified by @start up to the attribute identified by @end
+	 * excluded. For each set attribute, we retrieve the corresponding
+	 * destroy() callback.
+	 * If the callback is not available, then we skip to the next
+	 * attribute; otherwise, we call the destroy() callback.
+	 */
+	for (i = start; i < end; ++i) {
+		if (!(parsed_attrs & (1 << i)))
+			continue;
+
+		param = &seg6_action_params[i];
+
+		if (param->destroy)
+			param->destroy(slwt);
+	}
+}
+
+/* release all the resources that may have been acquired during parsing
+ * operations.
+ */
+static void destroy_attrs(struct seg6_local_lwt *slwt)
+{
+	struct seg6_action_desc *desc;
+	unsigned long attrs;
+
+	desc = slwt->desc;
+	if (!desc) {
+		WARN_ONCE(1,
+			  "seg6local: seg6_action_desc* for action %d is NULL",
+			  slwt->action);
+		return;
+	}
+
+	/* get the attributes for the current behavior instance */
+	attrs = desc->attrs;
+
+	__destroy_attrs(attrs, 0, SEG6_LOCAL_MAX + 1, slwt);
+}
+
 static int parse_nla_action(struct nlattr **attrs, struct seg6_local_lwt *slwt)
 {
 	struct seg6_action_param *param;
+	unsigned long parsed_attrs = 0;
 	struct seg6_action_desc *desc;
 	int i, err;
 
@@ -963,11 +1041,22 @@ static int parse_nla_action(struct nlattr **attrs, struct seg6_local_lwt *slwt)
 
 			err = param->parse(attrs, slwt);
 			if (err < 0)
-				return err;
+				goto parse_err;
+
+			/* current attribute has been parsed correctly */
+			parsed_attrs |= (1 << i);
 		}
 	}
 
 	return 0;
+
+parse_err:
+	/* release any resource that may have been acquired during the i-1
+	 * parse() operations.
+	 */
+	__destroy_attrs(parsed_attrs, 0, i, slwt);
+
+	return err;
 }
 
 static int seg6_local_build_state(struct net *net, struct nlattr *nla,
@@ -1012,7 +1101,6 @@ static int seg6_local_build_state(struct net *net, struct nlattr *nla,
 	return 0;
 
 out_free:
-	kfree(slwt->srh);
 	kfree(newts);
 	return err;
 }
@@ -1021,12 +1109,7 @@ static void seg6_local_destroy_state(struct lwtunnel_state *lwt)
 {
 	struct seg6_local_lwt *slwt = seg6_local_lwtunnel(lwt);
 
-	kfree(slwt->srh);
-
-	if (slwt->desc->attrs & (1 << SEG6_LOCAL_BPF)) {
-		kfree(slwt->bpf.name);
-		bpf_prog_put(slwt->bpf.prog);
-	}
+	destroy_attrs(slwt);
 
 	return;
 }
-- 
2.20.1



* [net-next,v2,3/5] seg6: add callbacks for customizing the creation/destruction of a behavior
  2020-11-07 15:31 [net-next,v2,0/5] seg6: add support for SRv6 End.DT4 behavior Andrea Mayer
  2020-11-07 15:31 ` [net-next,v2,1/5] vrf: add mac header for tunneled packets when sniffer is attached Andrea Mayer
  2020-11-07 15:31 ` [net-next,v2,2/5] seg6: improve management of behavior attributes Andrea Mayer
@ 2020-11-07 15:31 ` Andrea Mayer
  2020-11-10 22:56   ` Jakub Kicinski
  2020-11-07 15:31 ` [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior Andrea Mayer
  2020-11-07 15:31 ` [net-next,v2,5/5] selftests: add selftest " Andrea Mayer
  4 siblings, 1 reply; 34+ messages in thread
From: Andrea Mayer @ 2020-11-07 15:31 UTC (permalink / raw)

We introduce two callbacks used for customizing the creation/destruction of
an SRv6 behavior. These callbacks are defined in the new struct
seg6_local_lwtunnel_ops and are briefly described below:

 - build_state(...): used for calling the custom constructor of the
   behavior during its initialization phase and after all the attributes
   have been parsed successfully;

 - destroy_state(...): used for calling the custom destructor of the
   behavior before it is completely destroyed.
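
The intended ordering of the two callbacks can be sketched in userspace C as
follows. This is a toy model under stated assumptions, not the kernel code:
struct behavior, behavior_create(), behavior_destroy() and demo_ops are
hypothetical names, and attribute parsing is reduced to a single flag. The
point is that both callbacks are optional, that the constructor runs only
after the attributes have been parsed, and that a constructor failure
releases the attributes again.

```c
#include <stddef.h>

struct behavior;

/* optional per-behavior callbacks (hypothetical model of the new ops) */
struct behavior_ops {
	int  (*build_state)(struct behavior *b, const void *cfg);
	void (*destroy_state)(struct behavior *b);
};

struct behavior {
	const struct behavior_ops *ops;	/* may be NULL */
	int attrs_parsed;	/* stands in for the parsed attributes */
	int built;		/* set by a custom constructor */
};

int behavior_create(struct behavior *b, const void *cfg)
{
	int err = 0;

	b->attrs_parsed = 1;	/* attribute parsing assumed to succeed */

	/* custom constructor, called after all attributes are parsed */
	if (b->ops && b->ops->build_state)
		err = b->ops->build_state(b, cfg);

	if (err < 0)
		b->attrs_parsed = 0;	/* release the attributes on failure */

	return err;
}

void behavior_destroy(struct behavior *b)
{
	/* custom destructor first, then the attributes */
	if (b->ops && b->ops->destroy_state)
		b->ops->destroy_state(b);

	b->attrs_parsed = 0;
}

/* example callbacks: the constructor rejects a missing configuration */
static int demo_build(struct behavior *b, const void *cfg)
{
	if (!cfg)
		return -1;
	b->built = 1;
	return 0;
}

static void demo_destroy(struct behavior *b)
{
	b->built = 0;
}

const struct behavior_ops demo_ops = {
	.build_state	= demo_build,
	.destroy_state	= demo_destroy,
};
```

A behavior with no ops is created and destroyed exactly as before, so
existing behaviors are unaffected by the new hooks.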

Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
---
 net/ipv6/seg6_local.c | 64 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
index 63a82e2fdea9..4b0f155d641d 100644
--- a/net/ipv6/seg6_local.c
+++ b/net/ipv6/seg6_local.c
@@ -33,11 +33,23 @@
 
 struct seg6_local_lwt;
 
+typedef int (*slwt_build_state_t)(struct seg6_local_lwt *slwt, const void *cfg,
+				  struct netlink_ext_ack *extack);
+typedef void (*slwt_destroy_state_t)(struct seg6_local_lwt *slwt);
+
+/* callbacks used for customizing the creation and destruction of a behavior */
+struct seg6_local_lwtunnel_ops {
+	slwt_build_state_t build_state;
+	slwt_destroy_state_t destroy_state;
+};
+
 struct seg6_action_desc {
 	int action;
 	unsigned long attrs;
 	int (*input)(struct sk_buff *skb, struct seg6_local_lwt *slwt);
 	int static_headroom;
+
+	struct seg6_local_lwtunnel_ops slwt_ops;
 };
 
 struct bpf_lwt_prog {
@@ -1015,6 +1027,45 @@ static void destroy_attrs(struct seg6_local_lwt *slwt)
 	__destroy_attrs(attrs, 0, SEG6_LOCAL_MAX + 1, slwt);
 }
 
+/* call the custom constructor of the behavior during its initialization phase
+ * and after that all its attributes have been parsed successfully.
+ */
+static int
+seg6_local_lwtunnel_build_state(struct seg6_local_lwt *slwt, const void *cfg,
+				struct netlink_ext_ack *extack)
+{
+	slwt_build_state_t build_func;
+	struct seg6_action_desc *desc;
+	int err = 0;
+
+	desc = slwt->desc;
+	if (!desc)
+		return -EINVAL;
+
+	build_func = desc->slwt_ops.build_state;
+	if (build_func)
+		err = build_func(slwt, cfg, extack);
+
+	return err;
+}
+
+/* call the custom destructor of the behavior which is invoked before the
+ * tunnel is going to be destroyed.
+ */
+static void seg6_local_lwtunnel_destroy_state(struct seg6_local_lwt *slwt)
+{
+	slwt_destroy_state_t destroy_func;
+	struct seg6_action_desc *desc;
+
+	desc = slwt->desc;
+	if (!desc)
+		return;
+
+	destroy_func = desc->slwt_ops.destroy_state;
+	if (destroy_func)
+		destroy_func(slwt);
+}
+
 static int parse_nla_action(struct nlattr **attrs, struct seg6_local_lwt *slwt)
 {
 	struct seg6_action_param *param;
@@ -1090,8 +1141,16 @@ static int seg6_local_build_state(struct net *net, struct nlattr *nla,
 
 	err = parse_nla_action(tb, slwt);
 	if (err < 0)
+		/* In case of error, the parse_nla_action() takes care of
+		 * releasing resources which have been acquired during the
+		 * processing of attributes.
+		 */
 		goto out_free;
 
+	err = seg6_local_lwtunnel_build_state(slwt, cfg, extack);
+	if (err < 0)
+		goto free_attrs;
+
 	newts->type = LWTUNNEL_ENCAP_SEG6_LOCAL;
 	newts->flags = LWTUNNEL_STATE_INPUT_REDIRECT;
 	newts->headroom = slwt->headroom;
@@ -1100,6 +1159,9 @@ static int seg6_local_build_state(struct net *net, struct nlattr *nla,
 
 	return 0;
 
+free_attrs:
+	destroy_attrs(slwt);
+
 out_free:
 	kfree(newts);
 	return err;
@@ -1109,6 +1171,8 @@ static void seg6_local_destroy_state(struct lwtunnel_state *lwt)
 {
 	struct seg6_local_lwt *slwt = seg6_local_lwtunnel(lwt);
 
+	seg6_local_lwtunnel_destroy_state(slwt);
+
 	destroy_attrs(slwt);
 
 	return;
-- 
2.20.1



* [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-07 15:31 [net-next,v2,0/5] seg6: add support for SRv6 End.DT4 behavior Andrea Mayer
                   ` (2 preceding siblings ...)
  2020-11-07 15:31 ` [net-next,v2,3/5] seg6: add callbacks for customizing the creation/destruction of a behavior Andrea Mayer
@ 2020-11-07 15:31 ` Andrea Mayer
  2020-11-10 23:12   ` Jakub Kicinski
  2020-11-13  9:23   ` kernel test robot
  2020-11-07 15:31 ` [net-next,v2,5/5] selftests: add selftest " Andrea Mayer
  4 siblings, 2 replies; 34+ messages in thread
From: Andrea Mayer @ 2020-11-07 15:31 UTC (permalink / raw)

SRv6 End.DT4 is defined in the SRv6 Network Programming [1].

The SRv6 End.DT4 is used to implement IPv4 L3VPN use cases in
multi-tenant environments. It decapsulates the received packets and
performs an IPv4 routing lookup in the routing table of the tenant.

The SRv6 End.DT4 Linux implementation leverages a VRF device in order to
force the routing lookup into the associated routing table.

For End.DT4 to work properly, the routing table used for routing lookup
operations must be bound to one and only one VRF when the tunnel is
created. This constraint is enforced by enabling the VRF strict_mode
sysctl parameter, i.e.:
 $ sysctl -wq net.vrf.strict_mode=1
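
To make the constraint concrete, the following userspace sketch models the
table-to-VRF lookup performed when an End.DT4 instance is created. It is a
toy model and not the l3mdev implementation: struct vrf_entry,
lookup_vrf_ifindex() and dt4_build_errmsg() are hypothetical names. Only the
error codes (-EPERM, -ENODEV) and the messages attached to them are taken
from the patch below.

```c
#include <errno.h>
#include <stddef.h>

struct vrf_entry {
	unsigned int table_id;	/* routing table bound to the VRF */
	int ifindex;		/* interface index of the VRF device */
};

/* Toy model (assumption): with strict mode off, a table may be shared by
 * several VRFs, so the lookup refuses to pick one; with strict mode on, a
 * table maps to at most one VRF.
 */
int lookup_vrf_ifindex(int strict_mode, const struct vrf_entry *vrfs,
		       size_t n, unsigned int table_id)
{
	if (!strict_mode)
		return -EPERM;

	for (size_t i = 0; i < n; i++)
		if (vrfs[i].table_id == table_id)
			return vrfs[i].ifindex;

	return -ENODEV;
}

/* map a lookup error to the message reported to user space */
const char *dt4_build_errmsg(int err)
{
	switch (err) {
	case -EPERM:
		return "Strict mode for VRF is disabled";
	case -ENODEV:
		return "No such device";
	default:
		return "Unknown error";
	}
}
```

In this model, creating an End.DT4 route for table 100 succeeds only when
strict mode is on and exactly one entry with table_id 100 exists.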

At JANOG44, LINE Corporation presented their multi-tenant DC architecture
using SRv6 [2]. In the slides, they reported that the Linux kernel lacks
support for the SRv6 End.DT4 behavior.

The iproute2 counterpart required for configuring the SRv6 End.DT4
behavior is already implemented along with the other supported SRv6
behaviors [3].

[1] https://tools.ietf.org/html/draft-ietf-spring-srv6-network-programming
[2] https://speakerdeck.com/line_developers/line-data-center-networking-with-srv6
[3] https://patchwork.ozlabs.org/patch/799837/

Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
---
 net/ipv6/seg6_local.c | 205 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 205 insertions(+)

diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
index 4b0f155d641d..a41074acd43e 100644
--- a/net/ipv6/seg6_local.c
+++ b/net/ipv6/seg6_local.c
@@ -57,6 +57,14 @@ struct bpf_lwt_prog {
 	char *name;
 };
 
+struct seg6_end_dt4_info {
+	struct net *net;
+	/* VRF device associated to the routing table used by the SRv6 End.DT4
+	 * behavior for routing IPv4 packets.
+	 */
+	int vrf_ifindex;
+};
+
 struct seg6_local_lwt {
 	int action;
 	struct ipv6_sr_hdr *srh;
@@ -66,6 +74,7 @@ struct seg6_local_lwt {
 	int iif;
 	int oif;
 	struct bpf_lwt_prog bpf;
+	struct seg6_end_dt4_info dt4_info;
 
 	int headroom;
 	struct seg6_action_desc *desc;
@@ -413,6 +422,194 @@ static int input_action_end_dx4(struct sk_buff *skb,
 	return -EINVAL;
 }
 
+#ifdef CONFIG_NET_L3_MASTER_DEV
+
+static struct net *fib6_config_get_net(const struct fib6_config *fib6_cfg)
+{
+	const struct nl_info *nli = &fib6_cfg->fc_nlinfo;
+
+	return nli->nl_net;
+}
+
+static int seg6_end_dt4_build(struct seg6_local_lwt *slwt, const void *cfg,
+			      struct netlink_ext_ack *extack)
+{
+	struct seg6_end_dt4_info *info = &slwt->dt4_info;
+	int vrf_ifindex;
+	struct net *net;
+
+	net = fib6_config_get_net(cfg);
+
+	vrf_ifindex = l3mdev_ifindex_lookup_by_table_id(L3MDEV_TYPE_VRF, net,
+							slwt->table);
+	if (vrf_ifindex < 0) {
+		if (vrf_ifindex == -EPERM) {
+			NL_SET_ERR_MSG(extack,
+				       "Strict mode for VRF is disabled");
+		} else if (vrf_ifindex == -ENODEV) {
+			NL_SET_ERR_MSG(extack, "No such device");
+		} else {
+			NL_SET_ERR_MSG(extack, "Unknown error");
+
+			pr_debug("seg6local: SRv6 End.DT4 creation error=%d\n",
+				 vrf_ifindex);
+		}
+
+		return vrf_ifindex;
+	}
+
+	info->net = net;
+	info->vrf_ifindex = vrf_ifindex;
+
+	return 0;
+}
+
+/* The SRv6 End.DT4 behavior extracts the inner (IPv4) packet and routes the
+ * IPv4 packet by looking at the configured routing table.
+ *
+ * In the SRv6 End.DT4 use case, we can receive traffic (IPv6+Segment Routing
+ * Header packets) from several interfaces and the IPv6 destination address (DA)
+ * is used for retrieving the specific instance of the End.DT4 behavior that
+ * should process the packets.
+ *
+ * However, the inner IPv4 packet is not really bound to any receiving
+ * interface and thus the End.DT4 sets the VRF (associated with the
+ * corresponding routing table) as the *receiving* interface.
+ * In other words, the End.DT4 processes a packet as if it has been received
+ * directly by the VRF (and not by one of its slave devices, if any).
+ * In this way, the VRF interface is used for routing the IPv4 packet
+ * according to the routing table configured by the End.DT4 instance.
+ *
+ * This design allows you to get some interesting features like:
+ *  1) the statistics on rx packets;
+ *  2) the possibility to install a packet sniffer on the receiving interface
+ *     (the VRF one) for looking at the incoming packets;
+ *  3) the possibility to leverage the netfilter prerouting hook for the inner
+ *     IPv4 packet.
+ *
+ * This function returns:
+ *  - the sk_buff* when the VRF rcv handler has processed the packet correctly;
+ *  - NULL when the skb is consumed by the VRF rcv handler;
+ *  - a pointer which encodes a negative error number in case of error.
+ *    Note that in this case, the function takes care of freeing the skb.
+ */
+static struct sk_buff *end_dt4_vrf_rcv(struct sk_buff *skb,
+				       struct net_device *dev)
+{
+	/* based on l3mdev_ip_rcv; we are only interested in the master */
+	if (unlikely(!netif_is_l3_master(dev) && !netif_has_l3_rx_handler(dev)))
+		goto drop;
+
+	if (unlikely(!dev->l3mdev_ops->l3mdev_l3_rcv))
+		goto drop;
+
+	/* the decap packet (IPv4) does not come with any mac header info.
+	 * We must unset the mac header to allow the VRF device to rebuild it,
+	 * just in case there is a sniffer attached on the device.
+	 */
+	skb_unset_mac_header(skb);
+
+	skb = dev->l3mdev_ops->l3mdev_l3_rcv(dev, skb, AF_INET);
+	if (!skb)
+		/* the skb buffer was consumed by the handler */
+		return NULL;
+
+	/* when a packet is received by a VRF or by one of its slaves, the
+	 * master device reference is set into the skb.
+	 */
+	if (unlikely(skb->dev != dev || skb->skb_iif != dev->ifindex))
+		goto drop;
+
+	return skb;
+
+drop:
+	kfree_skb(skb);
+	return ERR_PTR(-EINVAL);
+}
+
+static struct net_device *end_dt4_get_vrf_rcu(struct sk_buff *skb,
+					      struct seg6_end_dt4_info *info)
+{
+	int vrf_ifindex = info->vrf_ifindex;
+	struct net *net = info->net;
+
+	if (unlikely(vrf_ifindex < 0))
+		goto error;
+
+	if (unlikely(!net_eq(dev_net(skb->dev), net)))
+		goto error;
+
+	return dev_get_by_index_rcu(net, vrf_ifindex);
+
+error:
+	return NULL;
+}
+
+static int input_action_end_dt4(struct sk_buff *skb,
+				struct seg6_local_lwt *slwt)
+{
+	struct net_device *vrf;
+	struct iphdr *iph;
+	int err;
+
+	if (!decap_and_validate(skb, IPPROTO_IPIP))
+		goto drop;
+
+	if (!pskb_may_pull(skb, sizeof(struct iphdr)))
+		goto drop;
+
+	vrf = end_dt4_get_vrf_rcu(skb, &slwt->dt4_info);
+	if (unlikely(!vrf))
+		goto drop;
+
+	skb->protocol = htons(ETH_P_IP);
+
+	skb_dst_drop(skb);
+
+	skb_set_transport_header(skb, sizeof(struct iphdr));
+
+	skb = end_dt4_vrf_rcv(skb, vrf);
+	if (!skb)
+		/* packet has been processed and consumed by the VRF */
+		return 0;
+
+	if (IS_ERR(skb)) {
+		err = PTR_ERR(skb);
+		return err;
+	}
+
+	iph = ip_hdr(skb);
+
+	err = ip_route_input(skb, iph->daddr, iph->saddr, 0, skb->dev);
+	if (err)
+		goto drop;
+
+	return dst_input(skb);
+
+drop:
+	kfree_skb(skb);
+	return -EINVAL;
+}
+
+#else
+
+static int seg6_end_dt4_build(struct seg6_local_lwt *slwt, const void *cfg,
+			      struct netlink_ext_ack *extack)
+{
+	NL_SET_ERR_MSG(extack, "Operation is not supported");
+
+	return -EOPNOTSUPP;
+}
+
+static int input_action_end_dt4(struct sk_buff *skb,
+				struct seg6_local_lwt *slwt)
+{
+	kfree_skb(skb);
+	return -EOPNOTSUPP;
+}
+
+#endif
+
 static int input_action_end_dt6(struct sk_buff *skb,
 				struct seg6_local_lwt *slwt)
 {
@@ -601,6 +798,14 @@ static struct seg6_action_desc seg6_action_table[] = {
 		.attrs		= (1 << SEG6_LOCAL_NH4),
 		.input		= input_action_end_dx4,
 	},
+	{
+		.action		= SEG6_LOCAL_ACTION_END_DT4,
+		.attrs		= (1 << SEG6_LOCAL_TABLE),
+		.input		= input_action_end_dt4,
+		.slwt_ops	= {
+					.build_state = seg6_end_dt4_build,
+				  },
+	},
 	{
 		.action		= SEG6_LOCAL_ACTION_END_DT6,
 		.attrs		= (1 << SEG6_LOCAL_TABLE),
-- 
2.20.1



* [net-next,v2,5/5] selftests: add selftest for the SRv6 End.DT4 behavior
  2020-11-07 15:31 [net-next,v2,0/5] seg6: add support for SRv6 End.DT4 behavior Andrea Mayer
                   ` (3 preceding siblings ...)
  2020-11-07 15:31 ` [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior Andrea Mayer
@ 2020-11-07 15:31 ` Andrea Mayer
  4 siblings, 0 replies; 34+ messages in thread
From: Andrea Mayer @ 2020-11-07 15:31 UTC (permalink / raw)

This selftest evaluates the new SRv6 End.DT4 behavior, used in this
example to implement IPv4 L3 VPN use cases.

Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
---
 .../selftests/net/srv6_end_dt4_l3vpn_test.sh  | 494 ++++++++++++++++++
 1 file changed, 494 insertions(+)
 create mode 100755 tools/testing/selftests/net/srv6_end_dt4_l3vpn_test.sh

diff --git a/tools/testing/selftests/net/srv6_end_dt4_l3vpn_test.sh b/tools/testing/selftests/net/srv6_end_dt4_l3vpn_test.sh
new file mode 100755
index 000000000000..a5547fed5048
--- /dev/null
+++ b/tools/testing/selftests/net/srv6_end_dt4_l3vpn_test.sh
@@ -0,0 +1,494 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# author: Andrea Mayer <andrea.mayer@uniroma2.it>
+
+# This test is designed for evaluating the new SRv6 End.DT4 behavior used for
+# implementing IPv4 L3 VPN use cases.
+#
+# Hereafter a network diagram is shown, where two different tenants (named 100
+# and 200) offer IPv4 L3 VPN services allowing hosts to communicate with each
+# other across an IPv6 network.
+#
+# Only hosts belonging to the same tenant (and to the same VPN) can communicate
+# with each other. Instead, the communication among hosts of different tenants
+# is forbidden.
+# In other words, hosts hs-t100-1 and hs-t100-2 are connected through the IPv4
+# L3 VPN of tenant 100 while hs-t200-3 and hs-t200-4 are connected using the
+# IPv4 L3 VPN of tenant 200. Cross connection between tenant 100 and tenant 200
+# is forbidden and thus, for example, hs-t100-1 cannot reach hs-t200-3 and vice
+# versa.
+#
+# Routers rt-1 and rt-2 implement IPv4 L3 VPN services leveraging the SRv6
+# architecture. The key components for such VPNs are: a) SRv6 Encap behavior,
+# b) SRv6 End.DT4 behavior and c) VRF.
+#
+# To explain how an IPv4 L3 VPN based on SRv6 works, let us briefly consider an
+# example where, within the same domain of tenant 100, the host hs-t100-1 pings
+# the host hs-t100-2.
+#
+# First of all, L2 reachability of the host hs-t100-2 is taken into account by
+# the router rt-1 which acts as an arp proxy.
+#
+# When the host hs-t100-1 sends an IPv4 packet destined to hs-t100-2, the
+# router rt-1 receives the packet on the internal veth-t100 interface. Such
+# interface is enslaved to the VRF vrf-100 whose associated table contains the
+# SRv6 Encap route for encapsulating any IPv4 packet in an IPv6 packet with a
+# Segment Routing Header (SRH). This packet is sent through the (IPv6) core
+# network up to the router rt-2 that receives it on veth0 interface.
+#
+# The rt-2 router uses the 'localsid' routing table to process incoming
+# IPv6+SRH packets which belong to the VPN of the tenant 100. For each of these
+# packets, the SRv6 End.DT4 behavior removes the outer IPv6+SRH headers and
+# performs the lookup on the vrf-100 table using the destination address of
+# the decapsulated IPv4 packet. Afterwards, the packet is sent to the host
+# hs-t100-2 through the veth-t100 interface.
+#
+# The ping response follows the same processing, but this time the roles of
+# rt-1 and rt-2 are swapped.
+#
+# Of course, the IPv4 L3 VPN for tenant 200 works exactly as the IPv4 L3 VPN
+# for tenant 100. In this case, only hosts hs-t200-3 and hs-t200-4 are able to
+# connect with each other.
+#
+#
+# +-------------------+                                   +-------------------+
+# |                   |                                   |                   |
+# |  hs-t100-1 netns  |                                   |  hs-t100-2 netns  |
+# |                   |                                   |                   |
+# |  +-------------+  |                                   |  +-------------+  |
+# |  |    veth0    |  |                                   |  |    veth0    |  |
+# |  | 10.0.0.1/24 |  |                                   |  | 10.0.0.2/24 |  |
+# |  +-------------+  |                                   |  +-------------+  |
+# |        .          |                                   |         .         |
+# +-------------------+                                   +-------------------+
+#          .                                                        .
+#          .                                                        .
+#          .                                                        .
+# +-----------------------------------+   +-----------------------------------+
+# |        .                          |   |                         .         |
+# | +---------------+                 |   |                 +---------------+ |
+# | |   veth-t100   |                 |   |                 |   veth-t100   | |
+# | | 10.0.0.254/24 |    +----------+ |   | +----------+    | 10.0.0.254/24 | |
+# | +-------+-------+    | localsid | |   | | localsid |    +-------+-------+ |
+# |         |            |   table  | |   | |   table  |            |         |
+# |    +----+----+       +----------+ |   | +----------+       +----+----+    |
+# |    | vrf-100 |                    |   |                    | vrf-100 |    |
+# |    +---------+     +------------+ |   | +------------+     +---------+    |
+# |                    |   veth0    | |   | |   veth0    |                    |
+# |                    | fd00::1/64 |.|...|.| fd00::2/64 |                    |
+# |    +---------+     +------------+ |   | +------------+     +---------+    |
+# |    | vrf-200 |                    |   |                    | vrf-200 |    |
+# |    +----+----+                    |   |                    +----+----+    |
+# |         |                         |   |                         |         |
+# | +---------------+                 |   |                 +---------------+ |
+# | |   veth-t200   |                 |   |                 |   veth-t200   | |
+# | | 10.0.0.254/24 |                 |   |                 | 10.0.0.254/24 | |
+# | +---------------+      rt-1 netns |   | rt-2 netns      +---------------+ |
+# |        .                          |   |                          .        |
+# +-----------------------------------+   +-----------------------------------+
+#          .                                                         .
+#          .                                                         .
+#          .                                                         .
+#          .                                                         .
+# +-------------------+                                   +-------------------+
+# |        .          |                                   |          .        |
+# |  +-------------+  |                                   |  +-------------+  |
+# |  |    veth0    |  |                                   |  |    veth0    |  |
+# |  | 10.0.0.3/24 |  |                                   |  | 10.0.0.4/24 |  |
+# |  +-------------+  |                                   |  +-------------+  |
+# |                   |                                   |                   |
+# |  hs-t200-3 netns  |                                   |  hs-t200-4 netns  |
+# |                   |                                   |                   |
+# +-------------------+                                   +-------------------+
+#
+#
+# ~~~~~~~~~~~~~~~~~~~~~~~~~
+# | Network configuration |
+# ~~~~~~~~~~~~~~~~~~~~~~~~~
+#
+# rt-1: localsid table (table 90)
+# +----------------------------------------------+
+# |SID              |Action                      |
+# +----------------------------------------------+
+# |fc00:21:100::6004|apply SRv6 End.DT4 table 100|
+# +----------------------------------------------+
+# |fc00:43:200::6004|apply SRv6 End.DT4 table 200|
+# +----------------------------------------------+
+#
+# rt-1: VRF tenant 100 (table 100)
+# +---------------------------------------------------+
+# |host       |Action                                 |
+# +---------------------------------------------------+
+# |10.0.0.2   |apply seg6 encap segs fc00:12:100::6004|
+# +---------------------------------------------------+
+# |10.0.0.0/24|forward to dev veth-t100               |
+# +---------------------------------------------------+
+#
+# rt-1: VRF tenant 200 (table 200)
+# +---------------------------------------------------+
+# |host       |Action                                 |
+# +---------------------------------------------------+
+# |10.0.0.4   |apply seg6 encap segs fc00:34:200::6004|
+# +---------------------------------------------------+
+# |10.0.0.0/24|forward to dev veth-t200               |
+# +---------------------------------------------------+
+#
+#
+# rt-2: localsid table (table 90)
+# +----------------------------------------------+
+# |SID              |Action                      |
+# +----------------------------------------------+
+# |fc00:12:100::6004|apply SRv6 End.DT4 table 100|
+# +----------------------------------------------+
+# |fc00:34:200::6004|apply SRv6 End.DT4 table 200|
+# +----------------------------------------------+
+#
+# rt-2: VRF tenant 100 (table 100)
+# +---------------------------------------------------+
+# |host       |Action                                 |
+# +---------------------------------------------------+
+# |10.0.0.1   |apply seg6 encap segs fc00:21:100::6004|
+# +---------------------------------------------------+
+# |10.0.0.0/24|forward to dev veth-t100               |
+# +---------------------------------------------------+
+#
+# rt-2: VRF tenant 200 (table 200)
+# +---------------------------------------------------+
+# |host       |Action                                 |
+# +---------------------------------------------------+
+# |10.0.0.3   |apply seg6 encap segs fc00:43:200::6004|
+# +---------------------------------------------------+
+# |10.0.0.0/24|forward to dev veth-t200               |
+# +---------------------------------------------------+
+#
+
+readonly LOCALSID_TABLE_ID=90
+readonly IPv6_RT_NETWORK=fd00
+readonly IPv4_HS_NETWORK=10.0.0
+readonly VPN_LOCATOR_SERVICE=fc00
+PING_TIMEOUT_SEC=4
+
+ret=0
+
+PAUSE_ON_FAIL=${PAUSE_ON_FAIL:=no}
+
+log_test()
+{
+	local rc=$1
+	local expected=$2
+	local msg="$3"
+
+	if [ ${rc} -eq ${expected} ]; then
+		nsuccess=$((nsuccess+1))
+		printf "\n    TEST: %-60s  [ OK ]\n" "${msg}"
+	else
+		ret=1
+		nfail=$((nfail+1))
+		printf "\n    TEST: %-60s  [FAIL]\n" "${msg}"
+		if [ "${PAUSE_ON_FAIL}" = "yes" ]; then
+			echo
+			echo "hit enter to continue, 'q' to quit"
+			read a
+			[ "$a" = "q" ] && exit 1
+		fi
+	fi
+}
+
+print_log_test_results()
+{
+	if [ "$TESTS" != "none" ]; then
+		printf "\nTests passed: %3d\n" ${nsuccess}
+		printf "Tests failed: %3d\n"   ${nfail}
+	fi
+}
+
+log_section()
+{
+	echo
+	echo "################################################################################"
+	echo "TEST SECTION: $*"
+	echo "################################################################################"
+}
+
+cleanup()
+{
+	ip link del veth-rt-1 2>/dev/null || true
+	ip link del veth-rt-2 2>/dev/null || true
+
+	# destroy routers rt-* and hosts hs-*
+	for ns in $(ip netns show | grep -oE '^(rt|hs)-[^[:space:]]+'); do
+		ip netns del ${ns} || true
+	done
+}
+
+# Setup the basic networking for the routers
+setup_rt_networking()
+{
+	local rt=$1
+	local nsname=rt-${rt}
+
+	ip netns add ${nsname}
+	ip link set veth-rt-${rt} netns ${nsname}
+	ip -netns ${nsname} link set veth-rt-${rt} name veth0
+
+	ip -netns ${nsname} addr add ${IPv6_RT_NETWORK}::${rt}/64 dev veth0
+	ip -netns ${nsname} link set veth0 up
+	ip -netns ${nsname} link set lo up
+
+	ip netns exec ${nsname} sysctl -wq net.ipv4.ip_forward=1
+	ip netns exec ${nsname} sysctl -wq net.ipv6.conf.all.forwarding=1
+}
+
+setup_hs()
+{
+	local hs=$1
+	local rt=$2
+	local tid=$3
+	local hsname=hs-t${tid}-${hs}
+	local rtname=rt-${rt}
+	local rtveth=veth-t${tid}
+
+	# set the networking for the host
+	ip netns add ${hsname}
+	ip -netns ${hsname} link add veth0 type veth peer name ${rtveth}
+	ip -netns ${hsname} link set ${rtveth} netns ${rtname}
+	ip -netns ${hsname} addr add ${IPv4_HS_NETWORK}.${hs}/24 dev veth0
+	ip -netns ${hsname} link set veth0 up
+	ip -netns ${hsname} link set lo up
+
+	# configure the VRF for the tenant X on the router which is directly
+	# connected to the source host.
+	ip -netns ${rtname} link add vrf-${tid} type vrf table ${tid}
+	ip -netns ${rtname} link set vrf-${tid} up
+
+	# enslave the veth-tX interface to the vrf-X in the access router
+	ip -netns ${rtname} link set ${rtveth} master vrf-${tid}
+	ip -netns ${rtname} addr add ${IPv4_HS_NETWORK}.254/24 dev ${rtveth}
+	ip -netns ${rtname} link set ${rtveth} up
+
+	ip netns exec ${rtname} sysctl -wq net.ipv4.conf.${rtveth}.proxy_arp=1
+
+	# disable the rp_filter otherwise the kernel gets confused about how
+	# to route decap ipv4 packets.
+	ip netns exec ${rtname} sysctl -wq net.ipv4.conf.all.rp_filter=0
+	ip netns exec ${rtname} sysctl -wq net.ipv4.conf.${rtveth}.rp_filter=0
+
+	ip netns exec ${rtname} sh -c "echo 1 > /proc/sys/net/vrf/strict_mode"
+}
+
+setup_vpn_config()
+{
+	local hssrc=$1
+	local rtsrc=$2
+	local hsdst=$3
+	local rtdst=$4
+	local tid=$5
+
+	local hssrc_name=hs-t${tid}-${hssrc}
+	local hsdst_name=hs-t${tid}-${hsdst}
+	local rtsrc_name=rt-${rtsrc}
+	local rtdst_name=rt-${rtdst}
+	local vpn_sid=${VPN_LOCATOR_SERVICE}:${hssrc}${hsdst}:${tid}::6004
+
+	# set the encap route for encapsulating packets which arrive from the
+	# host hssrc at the access router rtsrc and are destined to hsdst.
+	ip -netns ${rtsrc_name} -4 route add ${IPv4_HS_NETWORK}.${hsdst}/32 vrf vrf-${tid} \
+		encap seg6 mode encap segs ${vpn_sid} dev veth0
+	ip -netns ${rtsrc_name} -6 route add ${vpn_sid}/128 vrf vrf-${tid} \
+		via ${IPv6_RT_NETWORK}::${rtdst} dev veth0
+
+	# set the decap route for decapsulating packets which arrive from
+	# the rtsrc router and are destined to the hsdst host.
+	ip -netns ${rtdst_name} -6 route add ${vpn_sid}/128 table ${LOCALSID_TABLE_ID} \
+		encap seg6local action End.DT4 table ${tid} dev vrf-${tid}
+
+	# all sids for VPNs start with a common locator which is fc00::/16.
+	# Routes for handling the SRv6 End.DT4 behavior instances are grouped
+	# together in the 'localsid' table.
+	#
+	# NOTE: added only once
+	if [ -z "$(ip -netns ${rtdst_name} -6 rule show | \
+	    grep "to ${VPN_LOCATOR_SERVICE}::/16 lookup ${LOCALSID_TABLE_ID}")" ]; then
+		ip -netns ${rtdst_name} -6 rule add \
+			to ${VPN_LOCATOR_SERVICE}::/16 \
+			lookup ${LOCALSID_TABLE_ID} prio 999
+	fi
+}
+
+setup()
+{
+	ip link add veth-rt-1 type veth peer name veth-rt-2
+	# setup the networking for router rt-1 and router rt-2
+	setup_rt_networking 1
+	setup_rt_networking 2
+
+	# setup two hosts for the tenant 100.
+	#  - host hs-t100-1 is directly connected to the router rt-1;
+	#  - host hs-t100-2 is directly connected to the router rt-2.
+	setup_hs 1 1 100  #args: host router tenant
+	setup_hs 2 2 100
+
+	# setup two hosts for the tenant 200
+	#  - host hs-t200-3 is directly connected to the router rt-1;
+	#  - host hs-t200-4 is directly connected to the router rt-2.
+	setup_hs 3 1 200
+	setup_hs 4 2 200
+
+	# setup the IPv4 L3 VPN which connects the host hs-t100-1 and host
+	# hs-t100-2 within the same tenant 100.
+	setup_vpn_config 1 1 2 2 100  #args: src_host src_router dst_host dst_router tenant
+	setup_vpn_config 2 2 1 1 100
+
+	# setup the IPv4 L3 VPN which connects the host hs-t200-3 and host
+	# hs-t200-4 within the same tenant 200.
+	setup_vpn_config 3 1 4 2 200
+	setup_vpn_config 4 2 3 1 200
+}
+
+check_rt_connectivity()
+{
+	local rtsrc=$1
+	local rtdst=$2
+
+	ip netns exec rt-${rtsrc} ping -c 1 -W 1 ${IPv6_RT_NETWORK}::${rtdst} \
+		>/dev/null 2>&1
+}
+
+check_and_log_rt_connectivity()
+{
+	local rtsrc=$1
+	local rtdst=$2
+
+	check_rt_connectivity ${rtsrc} ${rtdst}
+	log_test $? 0 "Routers connectivity: rt-${rtsrc} -> rt-${rtdst}"
+}
+
+check_hs_connectivity()
+{
+	local hssrc=$1
+	local hsdst=$2
+	local tid=$3
+
+	ip netns exec hs-t${tid}-${hssrc} ping -c 1 -W ${PING_TIMEOUT_SEC} \
+		${IPv4_HS_NETWORK}.${hsdst} >/dev/null 2>&1
+}
+
+check_and_log_hs_connectivity()
+{
+	local hssrc=$1
+	local hsdst=$2
+	local tid=$3
+
+	check_hs_connectivity ${hssrc} ${hsdst} ${tid}
+	log_test $? 0 "Hosts connectivity: hs-t${tid}-${hssrc} -> hs-t${tid}-${hsdst} (tenant ${tid})"
+}
+
+check_and_log_hs_isolation()
+{
+	local hssrc=$1
+	local tidsrc=$2
+	local hsdst=$3
+	local tiddst=$4
+
+	check_hs_connectivity ${hssrc} ${hsdst} ${tidsrc}
+	# NOTE: ping should fail
+	log_test $? 1 "Hosts isolation: hs-t${tidsrc}-${hssrc} -X-> hs-t${tiddst}-${hsdst}"
+}
+
+
+check_and_log_hs2gw_connectivity()
+{
+	local hssrc=$1
+	local tid=$2
+
+	check_hs_connectivity ${hssrc} 254 ${tid}
+	log_test $? 0 "Hosts connectivity: hs-t${tid}-${hssrc} -> gw (tenant ${tid})"
+}
+
+router_tests()
+{
+	log_section "IPv6 routers connectivity test"
+
+	check_and_log_rt_connectivity 1 2
+	check_and_log_rt_connectivity 2 1
+}
+
+host2gateway_tests()
+{
+	log_section "IPv4 connectivity test among hosts and gateway"
+
+	check_and_log_hs2gw_connectivity 1 100
+	check_and_log_hs2gw_connectivity 2 100
+
+	check_and_log_hs2gw_connectivity 3 200
+	check_and_log_hs2gw_connectivity 4 200
+}
+
+host_vpn_tests()
+{
+	log_section "SRv6 VPN connectivity test among hosts in the same tenant"
+
+	check_and_log_hs_connectivity 1 2 100
+	check_and_log_hs_connectivity 2 1 100
+
+	check_and_log_hs_connectivity 3 4 200
+	check_and_log_hs_connectivity 4 3 200
+}
+
+host_vpn_isolation_tests()
+{
+	local i
+	local j
+	local k
+	local tmp
+	local l1="1 2"
+	local l2="3 4"
+	local t1=100
+	local t2=200
+
+	log_section "SRv6 VPN isolation test among hosts in different tenants"
+
+	for k in 0 1; do
+		for i in ${l1}; do
+			for j in ${l2}; do
+				check_and_log_hs_isolation ${i} ${t1} ${j} ${t2}
+			done
+		done
+
+		# let us test the reverse path
+		tmp="${l1}"; l1="${l2}"; l2="${tmp}"
+		tmp=${t1}; t1=${t2}; t2=${tmp}
+	done
+}
+
+if [ "$(id -u)" -ne 0 ];then
+	echo "SKIP: Need root privileges"
+	exit 0
+fi
+
+if [ ! -x "$(command -v ip)" ]; then
+	echo "SKIP: Could not run test without ip tool"
+	exit 0
+fi
+
+modprobe vrf &>/dev/null
+if [ ! -e /proc/sys/net/vrf/strict_mode ]; then
+	echo "SKIP: vrf sysctl does not exist"
+	exit 0
+fi
+
+cleanup &>/dev/null
+
+setup
+
+router_tests
+host2gateway_tests
+host_vpn_tests
+host_vpn_isolation_tests
+
+print_log_test_results
+
+cleanup &>/dev/null
+
+exit ${ret}
-- 
2.20.1



* Re: [net-next,v2,2/5] seg6: improve management of behavior attributes
  2020-11-07 15:31 ` [net-next,v2,2/5] seg6: improve management of behavior attributes Andrea Mayer
@ 2020-11-10 22:50   ` Jakub Kicinski
  2020-11-13  0:55     ` Andrea Mayer
  0 siblings, 1 reply; 34+ messages in thread
From: Jakub Kicinski @ 2020-11-10 22:50 UTC (permalink / raw)
  To: Andrea Mayer
  Cc: David S. Miller, David Ahern, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Shuah Khan, Shrijeet Mukherjee,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, netdev, linux-kernel, linux-kselftest, Stefano Salsano,
	Paolo Lungaroni, Ahmed Abdelsalam

On Sat,  7 Nov 2020 16:31:36 +0100 Andrea Mayer wrote:
> Depending on the attribute (i.e.: SEG6_LOCAL_SRH, SEG6_LOCAL_TABLE, etc),
> the parse() callback performs some validity checks on the provided input
> and updates the tunnel state (slwt) with the result of the parsing
> operation. However, an attribute may also need to reserve some additional
> resources (i.e.: memory or setting up an eBPF program) in the parse()
> callback to complete the parsing operation.

Looks good, a few nit picks below.

> diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
> index eba23279912d..63a82e2fdea9 100644
> --- a/net/ipv6/seg6_local.c
> +++ b/net/ipv6/seg6_local.c
> @@ -710,6 +710,12 @@ static int cmp_nla_srh(struct seg6_local_lwt *a, struct seg6_local_lwt *b)
>  	return memcmp(a->srh, b->srh, len);
>  }
>  
> +static void destroy_attr_srh(struct seg6_local_lwt *slwt)
> +{
> +	kfree(slwt->srh);
> +	slwt->srh = NULL;

This should never be called twice, right? No need for defensive
programming then.

> +}
> +
>  static int parse_nla_table(struct nlattr **attrs, struct seg6_local_lwt *slwt)
>  {
>  	slwt->table = nla_get_u32(attrs[SEG6_LOCAL_TABLE]);
> @@ -901,16 +907,33 @@ static int cmp_nla_bpf(struct seg6_local_lwt *a, struct seg6_local_lwt *b)
>  	return strcmp(a->bpf.name, b->bpf.name);
>  }
>  
> +static void destroy_attr_bpf(struct seg6_local_lwt *slwt)
> +{
> +	kfree(slwt->bpf.name);
> +	if (slwt->bpf.prog)
> +		bpf_prog_put(slwt->bpf.prog);

Same - why check if prog is NULL? That doesn't seem necessary if the
code is correct.

> +	slwt->bpf.name = NULL;
> +	slwt->bpf.prog = NULL;
> +}
> +
>  struct seg6_action_param {
>  	int (*parse)(struct nlattr **attrs, struct seg6_local_lwt *slwt);
>  	int (*put)(struct sk_buff *skb, struct seg6_local_lwt *slwt);
>  	int (*cmp)(struct seg6_local_lwt *a, struct seg6_local_lwt *b);
> +
> +	/* optional destroy() callback useful for releasing resources which
> +	 * have been previously acquired in the corresponding parse()
> +	 * function.
> +	 */
> +	void (*destroy)(struct seg6_local_lwt *slwt);
>  };
>  
>  static struct seg6_action_param seg6_action_params[SEG6_LOCAL_MAX + 1] = {
>  	[SEG6_LOCAL_SRH]	= { .parse = parse_nla_srh,
>  				    .put = put_nla_srh,
> -				    .cmp = cmp_nla_srh },
> +				    .cmp = cmp_nla_srh,
> +				    .destroy = destroy_attr_srh },
>  
>  	[SEG6_LOCAL_TABLE]	= { .parse = parse_nla_table,
>  				    .put = put_nla_table,
> @@ -934,13 +957,68 @@ static struct seg6_action_param seg6_action_params[SEG6_LOCAL_MAX + 1] = {
>  
>  	[SEG6_LOCAL_BPF]	= { .parse = parse_nla_bpf,
>  				    .put = put_nla_bpf,
> -				    .cmp = cmp_nla_bpf },
> +				    .cmp = cmp_nla_bpf,
> +				    .destroy = destroy_attr_bpf },
>  
>  };
>  
> +/* call the destroy() callback (if available) for each set attribute in
> + * @parsed_attrs, starting from attribute index @start up to @end excluded.
> + */
> +static void __destroy_attrs(unsigned long parsed_attrs, int start, int end,

You always pass 0 as start, no need for that argument.

slwt and max_parsed should be the only args this function needs.

> +			    struct seg6_local_lwt *slwt)
> +{
> +	struct seg6_action_param *param;
> +	int i;
> +
> +	/* Every seg6local attribute is identified by an ID which is encoded as
> +	 * a flag (i.e: 1 << ID) in the @parsed_attrs bitmask; such bitmask
> +	 * keeps track of the attributes parsed so far.
> +
> +	 * We scan the @parsed_attrs bitmask, starting from the attribute
> +	 * identified by @start up to the attribute identified by @end
> +	 * excluded. For each set attribute, we retrieve the corresponding
> +	 * destroy() callback.
> +	 * If the callback is not available, then we skip to the next
> +	 * attribute; otherwise, we call the destroy() callback.
> +	 */
> +	for (i = start; i < end; ++i) {
> +		if (!(parsed_attrs & (1 << i)))
> +			continue;
> +
> +		param = &seg6_action_params[i];
> +
> +		if (param->destroy)
> +			param->destroy(slwt);
> +	}
> +}
> +
> +/* release all the resources that may have been acquired during parsing
> + * operations.
> + */
> +static void destroy_attrs(struct seg6_local_lwt *slwt)
> +{
> +	struct seg6_action_desc *desc;
> +	unsigned long attrs;
> +
> +	desc = slwt->desc;
> +	if (!desc) {
> +		WARN_ONCE(1,
> +			  "seg6local: seg6_action_desc* for action %d is NULL",
> +			  slwt->action);
> +		return;
> +	}

Defensive programming?

> +
> +	/* get the attributes for the current behavior instance */
> +	attrs = desc->attrs;
> +
> +	__destroy_attrs(attrs, 0, SEG6_LOCAL_MAX + 1, slwt);
> +}
> +
>  static int parse_nla_action(struct nlattr **attrs, struct seg6_local_lwt *slwt)
>  {
>  	struct seg6_action_param *param;
> +	unsigned long parsed_attrs = 0;
>  	struct seg6_action_desc *desc;
>  	int i, err;
>  
> @@ -963,11 +1041,22 @@ static int parse_nla_action(struct nlattr **attrs, struct seg6_local_lwt *slwt)
>  
>  			err = param->parse(attrs, slwt);
>  			if (err < 0)
> -				return err;
> +				goto parse_err;
> +
> +			/* current attribute has been parsed correctly */
> +			parsed_attrs |= (1 << i);

Why do you need parsed_attrs? Attributes are not optional. Everything
that's specified in desc->attrs and lower than i must have been parsed.
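
A minimal userspace sketch of the simplification suggested in this review,
assuming that every attribute in the behavior's attribute mask is mandatory
and is parsed in index order. The names attr_lwt, destroy_attrs_upto() and
parse_attrs() are hypothetical stand-ins, not the seg6local code, and the
fail_at field only simulates a failing parse():

```c
#include <errno.h>

/* Hypothetical attribute IDs standing in for SEG6_LOCAL_*. */
enum { ATTR_SRH, ATTR_TABLE, ATTR_BPF, ATTR_MAX = ATTR_BPF };

struct attr_lwt {
	unsigned long attrs;	/* mandatory attributes of the behavior */
	unsigned long released;	/* bookkeeping for this sketch only */
	int fail_at;		/* attribute whose parse fails, or -1 */
};

static void destroy_srh(struct attr_lwt *slwt)
{
	slwt->released |= 1UL << ATTR_SRH;
}

static void destroy_bpf(struct attr_lwt *slwt)
{
	slwt->released |= 1UL << ATTR_BPF;
}

/* Optional destroy() callbacks; ATTR_TABLE holds no resources. */
static void (*const destroy_cb[ATTR_MAX + 1])(struct attr_lwt *) = {
	[ATTR_SRH] = destroy_srh,
	[ATTR_BPF] = destroy_bpf,
};

/* Release the attributes of the behavior with an index below max_parsed. */
static void destroy_attrs_upto(struct attr_lwt *slwt, int max_parsed)
{
	int i;

	for (i = 0; i < max_parsed; i++) {
		if (!(slwt->attrs & (1UL << i)))
			continue;
		if (destroy_cb[i])
			destroy_cb[i](slwt);
	}
}

static int parse_attrs(struct attr_lwt *slwt)
{
	int i;

	for (i = 0; i <= ATTR_MAX; i++) {
		if (!(slwt->attrs & (1UL << i)))
			continue;
		if (i == slwt->fail_at) {
			/* everything in attrs below i was parsed: unwind it */
			destroy_attrs_upto(slwt, i);
			return -EINVAL;
		}
	}
	return 0;
}
```

With this shape the error path needs no separate parsed_attrs bitmask: a
failure at index i unwinds with destroy_attrs_upto(slwt, i), and the normal
teardown passes the maximum attribute index plus one.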

>  		}
>  	}
>  
>  	return 0;
> +
> +parse_err:
> +	/* release any resource that may have been acquired during the i-1
> +	 * parse() operations.
> +	 */
> +	__destroy_attrs(parsed_attrs, 0, i, slwt);
> +
> +	return err;
>  }
>  
>  static int seg6_local_build_state(struct net *net, struct nlattr *nla,




* Re: [net-next,v2,1/5] vrf: add mac header for tunneled packets when sniffer is attached
  2020-11-07 15:31 ` [net-next,v2,1/5] vrf: add mac header for tunneled packets when sniffer is attached Andrea Mayer
@ 2020-11-10 22:50   ` Jakub Kicinski
  2020-11-13  0:37     ` Andrea Mayer
  0 siblings, 1 reply; 34+ messages in thread
From: Jakub Kicinski @ 2020-11-10 22:50 UTC (permalink / raw)
  To: Andrea Mayer
  Cc: David S. Miller, David Ahern, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Shuah Khan, Shrijeet Mukherjee,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, netdev, linux-kernel, linux-kselftest, Stefano Salsano,
	Paolo Lungaroni, Ahmed Abdelsalam

On Sat,  7 Nov 2020 16:31:35 +0100 Andrea Mayer wrote:
> Before this patch, a sniffer attached to a VRF used as the receiving
> interface of L3 tunneled packets detects them as malformed packets and
> it complains about that (i.e.: tcpdump shows bogus packets).
> 
> The reason is that a tunneled L3 packet does not carry any L2
> information and when the VRF is set as the receiving interface of a
> decapsulated L3 packet, no mac header is currently set or valid.
> Therefore, the purpose of this patch consists of adding a MAC header to
> any packet which is directly received on the VRF interface ONLY IF:
> 
>  i) a sniffer is attached on the VRF and ii) the mac header is not set.
> 
> In this case, the mac address of the VRF is copied in both the
> destination and the source address of the ethernet header. The protocol
> type is set either to IPv4 or IPv6, depending on which L3 packet is
> received.
> 
> Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>

Please keep David's review tag since you haven't changed the code.
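
A minimal userspace sketch of the header described in the quoted commit
message: the VRF MAC address is copied into both the destination and the
source field, and the EtherType is set to IPv4 or IPv6. The struct and
function names are hypothetical, and picking the EtherType from the IP
version nibble is an assumption of this sketch. The conditions on the
attached sniffer and on an already valid mac header are not modelled, and
this is not the code from the patch:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_ETH_ALEN   6
#define MODEL_ETH_P_IP   0x0800	/* EtherType for IPv4 */
#define MODEL_ETH_P_IPV6 0x86DD	/* EtherType for IPv6 */

struct model_ethhdr {
	uint8_t dst[MODEL_ETH_ALEN];
	uint8_t src[MODEL_ETH_ALEN];
	uint8_t proto[2];	/* EtherType, network byte order */
};

/* Returns 0 on success, -1 if the payload is neither IPv4 nor IPv6. */
static int model_fill_eth(struct model_ethhdr *eth, const uint8_t *vrf_mac,
			  const uint8_t *l3, size_t l3_len)
{
	uint16_t proto;

	if (l3_len < 1)
		return -1;

	/* the upper nibble of the first byte carries the IP version */
	switch (l3[0] >> 4) {
	case 4:
		proto = MODEL_ETH_P_IP;
		break;
	case 6:
		proto = MODEL_ETH_P_IPV6;
		break;
	default:
		return -1;
	}

	/* the VRF MAC is used for both addresses */
	memcpy(eth->dst, vrf_mac, MODEL_ETH_ALEN);
	memcpy(eth->src, vrf_mac, MODEL_ETH_ALEN);
	eth->proto[0] = proto >> 8;
	eth->proto[1] = proto & 0xff;
	return 0;
}
```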


* Re: [net-next,v2,3/5] seg6: add callbacks for customizing the creation/destruction of a behavior
  2020-11-07 15:31 ` [net-next,v2,3/5] seg6: add callbacks for customizing the creation/destruction of a behavior Andrea Mayer
@ 2020-11-10 22:56   ` Jakub Kicinski
  2020-11-13  1:06     ` Andrea Mayer
  0 siblings, 1 reply; 34+ messages in thread
From: Jakub Kicinski @ 2020-11-10 22:56 UTC (permalink / raw)
  To: Andrea Mayer
  Cc: David S. Miller, David Ahern, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Shuah Khan, Shrijeet Mukherjee,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, netdev, linux-kernel, linux-kselftest, Stefano Salsano,
	Paolo Lungaroni, Ahmed Abdelsalam

On Sat,  7 Nov 2020 16:31:37 +0100 Andrea Mayer wrote:
> We introduce two callbacks used for customizing the creation/destruction of
> a SRv6 behavior. Such callbacks are defined in the new struct
> seg6_local_lwtunnel_ops and hereafter we provide a brief description of
> them:
> 
>  - build_state(...): used for calling the custom constructor of the
>    behavior during its initialization phase and after all the attributes
>    have been parsed successfully;
> 
>  - destroy_state(...): used for calling the custom destructor of the
>    behavior before it is completely destroyed.
> 
> Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>

Looks good, minor nits.

> diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
> index 63a82e2fdea9..4b0f155d641d 100644
> --- a/net/ipv6/seg6_local.c
> +++ b/net/ipv6/seg6_local.c
> @@ -33,11 +33,23 @@
>  
>  struct seg6_local_lwt;
>  
> +typedef int (*slwt_build_state_t)(struct seg6_local_lwt *slwt, const void *cfg,
> +				  struct netlink_ext_ack *extack);
> +typedef void (*slwt_destroy_state_t)(struct seg6_local_lwt *slwt);

Let's avoid the typedefs. Instead of taking a pointer to the op, take a
pointer to the ops struct in seg6_local_lwtunnel_build_state() etc.

> +/* callbacks used for customizing the creation and destruction of a behavior */
> +struct seg6_local_lwtunnel_ops {
> +	slwt_build_state_t build_state;
> +	slwt_destroy_state_t destroy_state;
> +};
> +
>  struct seg6_action_desc {
>  	int action;
>  	unsigned long attrs;
>  	int (*input)(struct sk_buff *skb, struct seg6_local_lwt *slwt);
>  	int static_headroom;
> +
> +	struct seg6_local_lwtunnel_ops slwt_ops;
>  };
>  
>  struct bpf_lwt_prog {
> @@ -1015,6 +1027,45 @@ static void destroy_attrs(struct seg6_local_lwt *slwt)
>  	__destroy_attrs(attrs, 0, SEG6_LOCAL_MAX + 1, slwt);
>  }
>  
> +/* call the custom constructor of the behavior during its initialization phase
> + * and after that all its attributes have been parsed successfully.
> + */
> +static int
> +seg6_local_lwtunnel_build_state(struct seg6_local_lwt *slwt, const void *cfg,
> +				struct netlink_ext_ack *extack)
> +{
> +	slwt_build_state_t build_func;
> +	struct seg6_action_desc *desc;
> +	int err = 0;
> +
> +	desc = slwt->desc;
> +	if (!desc)
> +		return -EINVAL;

This is impossible, right?

> +
> +	build_func = desc->slwt_ops.build_state;
> +	if (build_func)
> +		err = build_func(slwt, cfg, extack);
> +
> +	return err;

no need for err, just use return directly.

	if (!ops->build_state)
		return 0;
	return ops->build_state(...);
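
A minimal userspace sketch of how the two suggestions above could combine:
the callbacks are declared directly in the ops struct without typedefs, and
the wrappers work on a pointer to that struct and return directly. All
names (ops_lwt, ops_lwt_ops, demo_build and so on) are hypothetical, and
the extack argument is left out for brevity; this is not the seg6local
code:

```c
#include <stddef.h>

struct ops_lwt;

/* callbacks declared in place, no typedefs */
struct ops_lwt_ops {
	int (*build_state)(struct ops_lwt *slwt, const void *cfg);
	void (*destroy_state)(struct ops_lwt *slwt);
};

struct ops_lwt {
	const struct ops_lwt_ops *ops;
	int built;	/* bookkeeping for this sketch only */
};

static int ops_lwt_build_state(struct ops_lwt *slwt, const void *cfg)
{
	const struct ops_lwt_ops *ops = slwt->ops;

	if (!ops->build_state)
		return 0;
	return ops->build_state(slwt, cfg);
}

static void ops_lwt_destroy_state(struct ops_lwt *slwt)
{
	const struct ops_lwt_ops *ops = slwt->ops;

	if (ops->destroy_state)
		ops->destroy_state(slwt);
}

/* example behavior providing both callbacks */
static int demo_build(struct ops_lwt *slwt, const void *cfg)
{
	(void)cfg;
	slwt->built = 1;
	return 0;
}

static void demo_destroy(struct ops_lwt *slwt)
{
	slwt->built = 0;
}

static const struct ops_lwt_ops demo_ops = {
	.build_state = demo_build,
	.destroy_state = demo_destroy,
};

/* behavior without custom constructor or destructor */
static const struct ops_lwt_ops empty_ops = { NULL, NULL };
```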

> +}
> +
> +/* call the custom destructor of the behavior which is invoked before the
> + * tunnel is going to be destroyed.
> + */
> +static void seg6_local_lwtunnel_destroy_state(struct seg6_local_lwt *slwt)
> +{
> +	slwt_destroy_state_t destroy_func;
> +	struct seg6_action_desc *desc;
> +
> +	desc = slwt->desc;
> +	if (!desc)
> +		return;
> +
> +	destroy_func = desc->slwt_ops.destroy_state;
> +	if (destroy_func)
> +		destroy_func(slwt);
> +}
> +
>  static int parse_nla_action(struct nlattr **attrs, struct seg6_local_lwt *slwt)
>  {
>  	struct seg6_action_param *param;
> @@ -1090,8 +1141,16 @@ static int seg6_local_build_state(struct net *net, struct nlattr *nla,
>  
>  	err = parse_nla_action(tb, slwt);
>  	if (err < 0)
> +		/* In case of error, the parse_nla_action() takes care of
> +		 * releasing resources which have been acquired during the
> +		 * processing of attributes.
> +		 */

that's the normal behavior for a kernel function, comment is
unnecessary IMO

>  		goto out_free;
>  
> +	err = seg6_local_lwtunnel_build_state(slwt, cfg, extack);
> +	if (err < 0)
> +		goto free_attrs;

The function is called destroy_attrs, call the label out_destroy_attrs,
or err_destroy_attrs.

>  	newts->type = LWTUNNEL_ENCAP_SEG6_LOCAL;
>  	newts->flags = LWTUNNEL_STATE_INPUT_REDIRECT;
>  	newts->headroom = slwt->headroom;
> @@ -1100,6 +1159,9 @@ static int seg6_local_build_state(struct net *net, struct nlattr *nla,
>  
>  	return 0;
>  
> +free_attrs:
> +	destroy_attrs(slwt);
> +

no need for empty lines on error paths
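
A minimal userspace sketch of the error-path layout these two comments point
to: the label is named after the cleanup function it leads to, and the
labels follow each other without blank lines. The types and helpers
(tunnel_model, model_parse_attrs() and so on) are hypothetical, and the
fail_build argument only simulates a failing constructor:

```c
#include <errno.h>
#include <stdlib.h>

struct tunnel_model {
	char *attr_buf;	/* stands in for resources acquired while parsing */
};

static int model_parse_attrs(struct tunnel_model *t)
{
	t->attr_buf = malloc(16);
	return t->attr_buf ? 0 : -ENOMEM;
}

static void model_destroy_attrs(struct tunnel_model *t)
{
	free(t->attr_buf);
}

/* fail_build simulates an error in the behavior-specific constructor */
static int model_build_state(struct tunnel_model **out, int fail_build)
{
	struct tunnel_model *t;
	int err;

	t = calloc(1, sizeof(*t));
	if (!t)
		return -ENOMEM;

	err = model_parse_attrs(t);
	if (err < 0)
		goto out_free;

	if (fail_build) {
		err = -EINVAL;
		goto err_destroy_attrs;
	}

	*out = t;
	return 0;

err_destroy_attrs:
	model_destroy_attrs(t);
out_free:
	free(t);
	return err;
}
```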

>  out_free:
>  	kfree(newts);
>  	return err;
> @@ -1109,6 +1171,8 @@ static void seg6_local_destroy_state(struct lwtunnel_state *lwt)
>  {
>  	struct seg6_local_lwt *slwt = seg6_local_lwtunnel(lwt);
>  
> +	seg6_local_lwtunnel_destroy_state(slwt);
> +
>  	destroy_attrs(slwt);
>  
>  	return;



* Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-07 15:31 ` [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior Andrea Mayer
@ 2020-11-10 23:12   ` Jakub Kicinski
  2020-11-13  1:28     ` Andrea Mayer
  2020-11-13  9:23   ` kernel test robot
  1 sibling, 1 reply; 34+ messages in thread
From: Jakub Kicinski @ 2020-11-10 23:12 UTC (permalink / raw)
  To: Andrea Mayer
  Cc: David S. Miller, David Ahern, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Shuah Khan, Shrijeet Mukherjee,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, netdev, linux-kernel, linux-kselftest, Stefano Salsano,
	Paolo Lungaroni, Ahmed Abdelsalam

On Sat,  7 Nov 2020 16:31:38 +0100 Andrea Mayer wrote:
> SRv6 End.DT4 is defined in the SRv6 Network Programming [1].
> 
> The SRv6 End.DT4 is used to implement IPv4 L3VPN use-cases in
> multi-tenants environments. It decapsulates the received packets and it
> performs IPv4 routing lookup in the routing table of the tenant.
> 
> The SRv6 End.DT4 Linux implementation leverages a VRF device in order to
> force the routing lookup into the associated routing table.

How does the behavior of DT4 compare to DT6?

The implementation looks quite different.

> To make the End.DT4 work properly, it must be guaranteed that the routing
> table used for routing lookup operations is bound to one and only one
> VRF during the tunnel creation. Such constraint has to be enforced by
> enabling the VRF strict_mode sysctl parameter, i.e:
>  $ sysctl -wq net.vrf.strict_mode=1.
> 
> At JANOG44, LINE corporation presented their multi-tenant DC architecture
> using SRv6 [2]. In the slides, they reported that the Linux kernel is
> missing the support of SRv6 End.DT4 behavior.
> 
> The iproute2 counterpart required for configuring the SRv6 End.DT4
> behavior is already implemented along with the other supported SRv6
> behaviors [3].
> 
> [1] https://tools.ietf.org/html/draft-ietf-spring-srv6-network-programming
> [2] https://speakerdeck.com/line_developers/line-data-center-networking-with-srv6
> [3] https://patchwork.ozlabs.org/patch/799837/
> 
> Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
> ---
>  net/ipv6/seg6_local.c | 205 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 205 insertions(+)
> 
> diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
> index 4b0f155d641d..a41074acd43e 100644
> --- a/net/ipv6/seg6_local.c
> +++ b/net/ipv6/seg6_local.c
> @@ -57,6 +57,14 @@ struct bpf_lwt_prog {
>  	char *name;
>  };
>  
> +struct seg6_end_dt4_info {
> +	struct net *net;
> +	/* VRF device associated to the routing table used by the SRv6 End.DT4
> +	 * behavior for routing IPv4 packets.
> +	 */
> +	int vrf_ifindex;
> +};
> +
>  struct seg6_local_lwt {
>  	int action;
>  	struct ipv6_sr_hdr *srh;
> @@ -66,6 +74,7 @@ struct seg6_local_lwt {
>  	int iif;
>  	int oif;
>  	struct bpf_lwt_prog bpf;
> +	struct seg6_end_dt4_info dt4_info;
>  
>  	int headroom;
>  	struct seg6_action_desc *desc;
> @@ -413,6 +422,194 @@ static int input_action_end_dx4(struct sk_buff *skb,
>  	return -EINVAL;
>  }
>  
> +#ifdef CONFIG_NET_L3_MASTER_DEV
> +

no need for this empty line.

> +static struct net *fib6_config_get_net(const struct fib6_config *fib6_cfg)
> +{
> +	const struct nl_info *nli = &fib6_cfg->fc_nlinfo;
> +
> +	return nli->nl_net;
> +}
> +
> +static int seg6_end_dt4_build(struct seg6_local_lwt *slwt, const void *cfg,
> +			      struct netlink_ext_ack *extack)
> +{
> +	struct seg6_end_dt4_info *info = &slwt->dt4_info;
> +	int vrf_ifindex;
> +	struct net *net;
> +
> +	net = fib6_config_get_net(cfg);
> +
> +	vrf_ifindex = l3mdev_ifindex_lookup_by_table_id(L3MDEV_TYPE_VRF, net,
> +							slwt->table);
> +	if (vrf_ifindex < 0) {
> +		if (vrf_ifindex == -EPERM) {
> +			NL_SET_ERR_MSG(extack,
> +				       "Strict mode for VRF is disabled");
> +		} else if (vrf_ifindex == -ENODEV) {
> +			NL_SET_ERR_MSG(extack, "No such device");

That's what -ENODEV already says.

> +		} else {
> +			NL_SET_ERR_MSG(extack, "Unknown error");

Useless error.

> +			pr_debug("seg6local: SRv6 End.DT4 creation error=%d\n",
> +				 vrf_ifindex);
> +		}
> +
> +		return vrf_ifindex;
> +	}
> +
> +	info->net = net;
> +	info->vrf_ifindex = vrf_ifindex;
> +
> +	return 0;
> +}
> +
> +/* The SRv6 End.DT4 behavior extracts the inner (IPv4) packet and routes the
> + * IPv4 packet by looking at the configured routing table.
> + *
> + * In the SRv6 End.DT4 use case, we can receive traffic (IPv6+Segment Routing
> + * Header packets) from several interfaces and the IPv6 destination address (DA)
> + * is used for retrieving the specific instance of the End.DT4 behavior that
> + * should process the packets.
> + *
> + * However, the inner IPv4 packet is not really bound to any receiving
> + * interface and thus the End.DT4 sets the VRF (associated with the
> + * corresponding routing table) as the *receiving* interface.
> + * In other words, the End.DT4 processes a packet as if it has been received
> + * directly by the VRF (and not by one of its slave devices, if any).
> + * In this way, the VRF interface is used for routing the IPv4 packet in
> + * according to the routing table configured by the End.DT4 instance.
> + *
> + * This design allows you to get some interesting features like:
> + *  1) the statistics on rx packets;
> + *  2) the possibility to install a packet sniffer on the receiving interface
> + *     (the VRF one) for looking at the incoming packets;
> + *  3) the possibility to leverage the netfilter prerouting hook for the inner
> + *     IPv4 packet.
> + *
> + * This function returns:
> + *  - the sk_buff* when the VRF rcv handler has processed the packet correctly;
> + *  - NULL when the skb is consumed by the VRF rcv handler;
> + *  - a pointer which encodes a negative error number in case of error.
> + *    Note that in this case, the function takes care of freeing the skb.
> + */
> +static struct sk_buff *end_dt4_vrf_rcv(struct sk_buff *skb,
> +				       struct net_device *dev)
> +{
> +	/* based on l3mdev_ip_rcv; we are only interested in the master */
> +	if (unlikely(!netif_is_l3_master(dev) && !netif_has_l3_rx_handler(dev)))
> +		goto drop;
> +
> +	if (unlikely(!dev->l3mdev_ops->l3mdev_l3_rcv))
> +		goto drop;
> +
> +	/* the decap packet (IPv4) does not come with any mac header info.
> +	 * We must unset the mac header to allow the VRF device to rebuild it,
> +	 * just in case there is a sniffer attached on the device.
> +	 */
> +	skb_unset_mac_header(skb);
> +
> +	skb = dev->l3mdev_ops->l3mdev_l3_rcv(dev, skb, AF_INET);
> +	if (!skb)
> +		/* the skb buffer was consumed by the handler */
> +		return NULL;
> +
> +	/* when a packet is received by a VRF or by one of its slaves, the
> +	 * master device reference is set into the skb.
> +	 */
> +	if (unlikely(skb->dev != dev || skb->skb_iif != dev->ifindex))
> +		goto drop;
> +
> +	return skb;
> +
> +drop:
> +	kfree_skb(skb);
> +	return ERR_PTR(-EINVAL);
> +}
> +
> +static struct net_device *end_dt4_get_vrf_rcu(struct sk_buff *skb,
> +					      struct seg6_end_dt4_info *info)
> +{
> +	int vrf_ifindex = info->vrf_ifindex;
> +	struct net *net = info->net;
> +
> +	if (unlikely(vrf_ifindex < 0))
> +		goto error;
> +
> +	if (unlikely(!net_eq(dev_net(skb->dev), net)))
> +		goto error;
> +
> +	return dev_get_by_index_rcu(net, vrf_ifindex);
> +
> +error:
> +	return NULL;
> +}
> +
> +static int input_action_end_dt4(struct sk_buff *skb,
> +				struct seg6_local_lwt *slwt)
> +{
> +	struct net_device *vrf;
> +	struct iphdr *iph;
> +	int err;
> +
> +	if (!decap_and_validate(skb, IPPROTO_IPIP))
> +		goto drop;
> +
> +	if (!pskb_may_pull(skb, sizeof(struct iphdr)))
> +		goto drop;
> +
> +	vrf = end_dt4_get_vrf_rcu(skb, &slwt->dt4_info);
> +	if (unlikely(!vrf))
> +		goto drop;
> +
> +	skb->protocol = htons(ETH_P_IP);
> +
> +	skb_dst_drop(skb);
> +
> +	skb_set_transport_header(skb, sizeof(struct iphdr));
> +
> +	skb = end_dt4_vrf_rcv(skb, vrf);
> +	if (!skb)
> +		/* packet has been processed and consumed by the VRF */
> +		return 0;
> +
> +	if (IS_ERR(skb)) {
> +		err = PTR_ERR(skb);
> +		return err;

return PTR_ERR(skb)

> +	}
> +
> +	iph = ip_hdr(skb);
> +
> +	err = ip_route_input(skb, iph->daddr, iph->saddr, 0, skb->dev);
> +	if (err)
> +		goto drop;
> +
> +	return dst_input(skb);
> +
> +drop:
> +	kfree_skb(skb);
> +	return -EINVAL;
> +}
> +
> +#else
> +

new line not needed

> +static int seg6_end_dt4_build(struct seg6_local_lwt *slwt, const void *cfg,
> +			      struct netlink_ext_ack *extack)
> +{
> +	NL_SET_ERR_MSG(extack, "Operation is not supported");

This extack message probably could be more helpful. As it stands it's
basically 

> +
> +	return -EOPNOTSUPP;
> +}
> +
> +static int input_action_end_dt4(struct sk_buff *skb,
> +				struct seg6_local_lwt *slwt)

Maybe just ifdef out the part of the action table instead of creating
those stubs?
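
A minimal userspace sketch of that alternative, for illustration only. It is
not the kernel code: HAVE_L3_MASTER_DEV stands in for CONFIG_NET_L3_MASTER_DEV,
and struct action_desc, the input_* handlers and get_action_desc() are
hypothetical names. The table entry and its handler sit under the same ifdef,
so no stub is needed, and the table can be const:

```c
#include <stddef.h>

/* Hypothetical stand-in for CONFIG_NET_L3_MASTER_DEV; remove the define to
 * compile the DT4 entry (and its handler) away entirely.
 */
#define HAVE_L3_MASTER_DEV 1

enum { ACTION_END_DX4 = 1, ACTION_END_DT4, ACTION_END_DT6 };

struct action_desc {
	int action;
	int (*input)(int pkt);
};

static int input_end_dx4(int pkt) { return pkt; }
#ifdef HAVE_L3_MASTER_DEV
static int input_end_dt4(int pkt) { return pkt + 4; }
#endif
static int input_end_dt6(int pkt) { return pkt + 6; }

static const struct action_desc action_table[] = {
	{ .action = ACTION_END_DX4, .input = input_end_dx4 },
#ifdef HAVE_L3_MASTER_DEV
	{ .action = ACTION_END_DT4, .input = input_end_dt4 },
#endif
	{ .action = ACTION_END_DT6, .input = input_end_dt6 },
};

/* Returns the descriptor for @action, or NULL when the action is unknown
 * or was compiled out, so the caller can reject it up front.
 */
static const struct action_desc *get_action_desc(int action)
{
	size_t i;

	for (i = 0; i < sizeof(action_table) / sizeof(action_table[0]); i++)
		if (action_table[i].action == action)
			return &action_table[i];

	return NULL;
}
```

With the define removed, a lookup for ACTION_END_DT4 simply returns NULL and
the request fails the same way as any unknown action.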

> +{
> +	kfree_skb(skb);
> +	return -EOPNOTSUPP;
> +}
> +
> +#endif
> +
>  static int input_action_end_dt6(struct sk_buff *skb,
>  				struct seg6_local_lwt *slwt)
>  {
> @@ -601,6 +798,14 @@ static struct seg6_action_desc seg6_action_table[] = {

BTW any idea why the action table is not marked as const?

Would you mind sending a patch to fix that?

>  		.attrs		= (1 << SEG6_LOCAL_NH4),
>  		.input		= input_action_end_dx4,
>  	},
> +	{
> +		.action		= SEG6_LOCAL_ACTION_END_DT4,
> +		.attrs		= (1 << SEG6_LOCAL_TABLE),
> +		.input		= input_action_end_dt4,
> +		.slwt_ops	= {
> +					.build_state = seg6_end_dt4_build,
> +				  },
> +	},
>  	{
>  		.action		= SEG6_LOCAL_ACTION_END_DT6,
>  		.attrs		= (1 << SEG6_LOCAL_TABLE),



* Re: [net-next,v2,1/5] vrf: add mac header for tunneled packets when sniffer is attached
  2020-11-10 22:50   ` Jakub Kicinski
@ 2020-11-13  0:37     ` Andrea Mayer
  0 siblings, 0 replies; 34+ messages in thread
From: Andrea Mayer @ 2020-11-13  0:37 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S. Miller, David Ahern, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Shuah Khan, Shrijeet Mukherjee,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, netdev, linux-kernel, linux-kselftest, Stefano Salsano,
	Paolo Lungaroni, Ahmed Abdelsalam, Andrea Mayer

Hi Jakub,

On Tue, 10 Nov 2020 14:50:45 -0800
Jakub Kicinski <kuba@kernel.org> wrote:

> On Sat,  7 Nov 2020 16:31:35 +0100 Andrea Mayer wrote:
> > Before this patch, a sniffer attached to a VRF used as the receiving
> > interface of L3 tunneled packets detects them as malformed packets and
> > it complains about that (i.e.: tcpdump shows bogus packets).
> > 
> > The reason is that a tunneled L3 packet does not carry any L2
> > information and when the VRF is set as the receiving interface of a
> > decapsulated L3 packet, no mac header is currently set or valid.
> > Therefore, the purpose of this patch consists of adding a MAC header to
> > any packet which is directly received on the VRF interface ONLY IF:
> > 
> >  i) a sniffer is attached on the VRF and ii) the mac header is not set.
> > 
> > In this case, the mac address of the VRF is copied in both the
> > destination and the source address of the ethernet header. The protocol
> > type is set either to IPv4 or IPv6, depending on which L3 packet is
> > received.
> > 
> > Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
> 
> Please keep David's review tag since you haven't changed the code.

I will keep David's review tag in v3.

Thanks,
Andrea


* Re: [net-next,v2,2/5] seg6: improve management of behavior attributes
  2020-11-10 22:50   ` Jakub Kicinski
@ 2020-11-13  0:55     ` Andrea Mayer
  0 siblings, 0 replies; 34+ messages in thread
From: Andrea Mayer @ 2020-11-13  0:55 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S. Miller, David Ahern, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Shuah Khan, Shrijeet Mukherjee,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, netdev, linux-kernel, linux-kselftest, Stefano Salsano,
	Paolo Lungaroni, Ahmed Abdelsalam, Andrea Mayer

Hi Jakub,
many thanks for your review. Please see my responses inline:

On Tue, 10 Nov 2020 14:50:21 -0800
Jakub Kicinski <kuba@kernel.org> wrote:

> On Sat,  7 Nov 2020 16:31:36 +0100 Andrea Mayer wrote:
> > Depending on the attribute (i.e.: SEG6_LOCAL_SRH, SEG6_LOCAL_TABLE, etc),
> > the parse() callback performs some validity checks on the provided input
> > and updates the tunnel state (slwt) with the result of the parsing
> > operation. However, an attribute may also need to reserve some additional
> > resources (i.e.: memory or setting up an eBPF program) in the parse()
> > callback to complete the parsing operation.
> 
> Looks good, a few nit picks below.
> 
> > diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
> > index eba23279912d..63a82e2fdea9 100644
> > --- a/net/ipv6/seg6_local.c
> > +++ b/net/ipv6/seg6_local.c
> > @@ -710,6 +710,12 @@ static int cmp_nla_srh(struct seg6_local_lwt *a, struct seg6_local_lwt *b)
> >  	return memcmp(a->srh, b->srh, len);
> >  }
> >  
> > +static void destroy_attr_srh(struct seg6_local_lwt *slwt)
> > +{
> > +	kfree(slwt->srh);
> > +	slwt->srh = NULL;
> 
> This should never be called twice, right? No need for defensive
> programming then.
>

Yes, the patch that I wrote does not call the function twice.
When I wrote the code, my only concern was that someone might (in the future)
call destroy_attr_srh() incorrectly or from an inappropriate part of the code.
This choice was driven by an excess of paranoia rather than by a real issue.

Given that, I will remove it in v3.

> > +}
> > +
> >  static int parse_nla_table(struct nlattr **attrs, struct seg6_local_lwt *slwt)
> >  {
> >  	slwt->table = nla_get_u32(attrs[SEG6_LOCAL_TABLE]);
> > @@ -901,16 +907,33 @@ static int cmp_nla_bpf(struct seg6_local_lwt *a, struct seg6_local_lwt *b)
> >  	return strcmp(a->bpf.name, b->bpf.name);
> >  }
> >  
> > +static void destroy_attr_bpf(struct seg6_local_lwt *slwt)
> > +{
> > +	kfree(slwt->bpf.name);
> > +	if (slwt->bpf.prog)
> > +		bpf_prog_put(slwt->bpf.prog);
> 
> Same - why check if prog is NULL? That doesn't seem necessary if the
> code is correct.
> 

Same as above.

> > +	slwt->bpf.name = NULL;
> > +	slwt->bpf.prog = NULL;
> > +}
> > +
> >  struct seg6_action_param {
> >  	int (*parse)(struct nlattr **attrs, struct seg6_local_lwt *slwt);
> >  	int (*put)(struct sk_buff *skb, struct seg6_local_lwt *slwt);
> >  	int (*cmp)(struct seg6_local_lwt *a, struct seg6_local_lwt *b);
> > +
> > +	/* optional destroy() callback useful for releasing resources which
> > +	 * have been previously acquired in the corresponding parse()
> > +	 * function.
> > +	 */
> > +	void (*destroy)(struct seg6_local_lwt *slwt);
> >  };
> >  
> >  static struct seg6_action_param seg6_action_params[SEG6_LOCAL_MAX + 1] = {
> >  	[SEG6_LOCAL_SRH]	= { .parse = parse_nla_srh,
> >  				    .put = put_nla_srh,
> > -				    .cmp = cmp_nla_srh },
> > +				    .cmp = cmp_nla_srh,
> > +				    .destroy = destroy_attr_srh },
> >  
> >  	[SEG6_LOCAL_TABLE]	= { .parse = parse_nla_table,
> >  				    .put = put_nla_table,
> > @@ -934,13 +957,68 @@ static struct seg6_action_param seg6_action_params[SEG6_LOCAL_MAX + 1] = {
> >  
> >  	[SEG6_LOCAL_BPF]	= { .parse = parse_nla_bpf,
> >  				    .put = put_nla_bpf,
> > -				    .cmp = cmp_nla_bpf },
> > +				    .cmp = cmp_nla_bpf,
> > +				    .destroy = destroy_attr_bpf },
> >  
> >  };
> >  
> > +/* call the destroy() callback (if available) for each set attribute in
> > + * @parsed_attrs, starting from attribute index @start up to @end excluded.
> > + */
> > +static void __destroy_attrs(unsigned long parsed_attrs, int start, int end,
> 
> You always pass 0 as start, no need for that argument.
> 
> slwt and max_parsed should be the only args this function needs.
> 

My initial goal was to explicitly pass 'parsed_attrs' as an argument so that
we could reuse this function for further improvements (e.g. the patch for
optional attributes I am working on).

However, for v3 I will keep things straightforward and follow your
suggestion.

> > +			    struct seg6_local_lwt *slwt)
> > +{
> > +	struct seg6_action_param *param;
> > +	int i;
> > +
> > +	/* Every seg6local attribute is identified by an ID which is encoded as
> > +	 * a flag (i.e: 1 << ID) in the @parsed_attrs bitmask; such bitmask
> > +	 * keeps track of the attributes parsed so far.
> > +
> > +	 * We scan the @parsed_attrs bitmask, starting from the attribute
> > +	 * identified by @start up to the attribute identified by @end
> > +	 * excluded. For each set attribute, we retrieve the corresponding
> > +	 * destroy() callback.
> > +	 * If the callback is not available, then we skip to the next
> > +	 * attribute; otherwise, we call the destroy() callback.
> > +	 */
> > +	for (i = start; i < end; ++i) {
> > +		if (!(parsed_attrs & (1 << i)))
> > +			continue;
> > +
> > +		param = &seg6_action_params[i];
> > +
> > +		if (param->destroy)
> > +			param->destroy(slwt);
> > +	}
> > +}
> > +
> > +/* release all the resources that may have been acquired during parsing
> > + * operations.
> > + */
> > +static void destroy_attrs(struct seg6_local_lwt *slwt)
> > +{
> > +	struct seg6_action_desc *desc;
> > +	unsigned long attrs;
> > +
> > +	desc = slwt->desc;
> > +	if (!desc) {
> > +		WARN_ONCE(1,
> > +			  "seg6local: seg6_action_desc* for action %d is NULL",
> > +			  slwt->action);
> > +		return;
> > +	}
> 
> Defensive programming?
> 

Yes, like above. I will remove the check on the 'desc' and consequently the
WARN_ON in v3.

> > +
> > +	/* get the attributes for the current behavior instance */
> > +	attrs = desc->attrs;
> > +
> > +	__destroy_attrs(attrs, 0, SEG6_LOCAL_MAX + 1, slwt);
> > +}
> > +
> >  static int parse_nla_action(struct nlattr **attrs, struct seg6_local_lwt *slwt)
> >  {
> >  	struct seg6_action_param *param;
> > +	unsigned long parsed_attrs = 0;
> >  	struct seg6_action_desc *desc;
> >  	int i, err;
> >  
> > @@ -963,11 +1041,22 @@ static int parse_nla_action(struct nlattr **attrs, struct seg6_local_lwt *slwt)
> >  
> >  			err = param->parse(attrs, slwt);
> >  			if (err < 0)
> > -				return err;
> > +				goto parse_err;
> > +
> > +			/* current attribute has been parsed correctly */
> > +			parsed_attrs |= (1 << i);
> 
> Why do you need parsed_attrs? Attributes are not optional. Everything
> that's specified in desc->attrs and lower than i must have been parsed.
> 

Here, all the attributes are required and none is optional. So in this patch,
parsed_attrs can certainly be dropped. I'll remove it in v3.
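
To make the point concrete, here is a small userspace sketch of the simplified
unwind, under the assumption that every attribute in the behavior's mask is
mandatory. It is not the seg6local code: struct lwt, ATTR_MAX, the 'held'
bitmask and the fail_id hook are hypothetical and exist only so the example
can be exercised. Since parsing walks the IDs in increasing order, every
required attribute below the failing index i has already been parsed, so the
destroy helper only needs the tunnel state and i:

```c
#include <assert.h>

#define ATTR_MAX 8	/* hypothetical; plays the role of SEG6_LOCAL_MAX */

struct lwt {
	unsigned long attrs;	/* attributes required by the behavior */
	unsigned long held;	/* resources currently held, one bit per ID */
	int fail_id;		/* ID whose parse fails, or -1 (demo hook) */
};

static int parse_attr(struct lwt *lwt, int id)
{
	if (id == lwt->fail_id)
		return -1;

	lwt->held |= 1UL << id;	/* pretend a resource was acquired */
	return 0;
}

static void destroy_attr(struct lwt *lwt, int id)
{
	lwt->held &= ~(1UL << id);
}

/* Release every required attribute with an ID in [0, max_parsed). */
static void destroy_attrs_upto(struct lwt *lwt, int max_parsed)
{
	int i;

	assert(max_parsed >= 0 && max_parsed <= ATTR_MAX + 1);

	for (i = 0; i < max_parsed; i++) {
		if (!(lwt->attrs & (1UL << i)))
			continue;

		destroy_attr(lwt, i);
	}
}

static int parse_attrs(struct lwt *lwt)
{
	int i, err;

	for (i = 0; i < ATTR_MAX + 1; i++) {
		if (!(lwt->attrs & (1UL << i)))
			continue;

		err = parse_attr(lwt, i);
		if (err < 0) {
			/* all required attributes below i were parsed */
			destroy_attrs_upto(lwt, i);
			return err;
		}
	}

	return 0;
}
```

The full teardown path is then just destroy_attrs_upto(lwt, ATTR_MAX + 1).
A separate bitmask of parsed attributes would only become necessary once
optional attributes exist.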

> >  		}
> >  	}
> >  
> >  	return 0;
> > +
> > +parse_err:
> > +	/* release any resource that may have been acquired during the i-1
> > +	 * parse() operations.
> > +	 */
> > +	__destroy_attrs(parsed_attrs, 0, i, slwt);
> > +
> > +	return err;
> >  }
> >  
> >  static int seg6_local_build_state(struct net *net, struct nlattr *nla,
> 
> 

Thank you,
Andrea


* Re: [net-next,v2,3/5] seg6: add callbacks for customizing the creation/destruction of a behavior
  2020-11-10 22:56   ` Jakub Kicinski
@ 2020-11-13  1:06     ` Andrea Mayer
  0 siblings, 0 replies; 34+ messages in thread
From: Andrea Mayer @ 2020-11-13  1:06 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S. Miller, David Ahern, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Shuah Khan, Shrijeet Mukherjee,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, netdev, linux-kernel, linux-kselftest, Stefano Salsano,
	Paolo Lungaroni, Ahmed Abdelsalam, Andrea Mayer

Hi Jakub,
many thanks for your review. Please see my responses inline:

On Tue, 10 Nov 2020 14:56:55 -0800
Jakub Kicinski <kuba@kernel.org> wrote:

> On Sat,  7 Nov 2020 16:31:37 +0100 Andrea Mayer wrote:
> > We introduce two callbacks used for customizing the creation/destruction of
> > a SRv6 behavior. Such callbacks are defined in the new struct
> > seg6_local_lwtunnel_ops and hereafter we provide a brief description of
> > them:
> > 
> >  - build_state(...): used for calling the custom constructor of the
> >    behavior during its initialization phase and after all the attributes
> >    have been parsed successfully;
> > 
> >  - destroy_state(...): used for calling the custom destructor of the
> >    behavior before it is completely destroyed.
> > 
> > Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
> 
> Looks good, minor nits.
> 
> > diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
> > index 63a82e2fdea9..4b0f155d641d 100644
> > --- a/net/ipv6/seg6_local.c
> > +++ b/net/ipv6/seg6_local.c
> > @@ -33,11 +33,23 @@
> >  
> >  struct seg6_local_lwt;
> >  
> > +typedef int (*slwt_build_state_t)(struct seg6_local_lwt *slwt, const void *cfg,
> > +				  struct netlink_ext_ack *extack);
> > +typedef void (*slwt_destroy_state_t)(struct seg6_local_lwt *slwt);
> 
> Let's avoid the typedefs. Instead of taking a pointer to the op take a
> pointer to the ops struct in seg6_local_lwtunnel_build_state() etc.
>

Ok, I will do it this way in v3.

> > +/* callbacks used for customizing the creation and destruction of a behavior */
> > +struct seg6_local_lwtunnel_ops {
> > +	slwt_build_state_t build_state;
> > +	slwt_destroy_state_t destroy_state;
> > +};
> > +
> >  struct seg6_action_desc {
> >  	int action;
> >  	unsigned long attrs;
> >  	int (*input)(struct sk_buff *skb, struct seg6_local_lwt *slwt);
> >  	int static_headroom;
> > +
> > +	struct seg6_local_lwtunnel_ops slwt_ops;
> >  };
> >  
> >  struct bpf_lwt_prog {
> > @@ -1015,6 +1027,45 @@ static void destroy_attrs(struct seg6_local_lwt *slwt)
> >  	__destroy_attrs(attrs, 0, SEG6_LOCAL_MAX + 1, slwt);
> >  }
> >  
> > +/* call the custom constructor of the behavior during its initialization phase
> > + * and after that all its attributes have been parsed successfully.
> > + */
> > +static int
> > +seg6_local_lwtunnel_build_state(struct seg6_local_lwt *slwt, const void *cfg,
> > +				struct netlink_ext_ack *extack)
> > +{
> > +	slwt_build_state_t build_func;
> > +	struct seg6_action_desc *desc;
> > +	int err = 0;
> > +
> > +	desc = slwt->desc;
> > +	if (!desc)
> > +		return -EINVAL;
> 
> This is impossible, right?
> 

Yes, it is. I will remove this check in v3.

> > +
> > +	build_func = desc->slwt_ops.build_state;
> > +	if (build_func)
> > +		err = build_func(slwt, cfg, extack);
> > +
> > +	return err;
> 
> no need for err, just use return directly.
> 
> 	if (!ops->build_state)
> 		return 0;
> 	return ops->build_state(...);
> 

Ok, I will do it in this way in v3.
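
A userspace sketch of the shape this could take, covering both this comment
and the one about the typedefs above. It is an assumption about the reworked
code, not the v3 patch: struct lwt_ops, struct action_desc, struct lwt and the
demo_* callbacks are hypothetical names. The callbacks are declared inline in
the ops struct, and the wrappers take a pointer to that struct and return
directly:

```c
#include <stddef.h>

struct lwt;

/* callbacks declared inline, without function-pointer typedefs */
struct lwt_ops {
	int (*build_state)(struct lwt *lwt, const void *cfg);
	void (*destroy_state)(struct lwt *lwt);
};

struct action_desc {
	int action;
	struct lwt_ops ops;
};

struct lwt {
	const struct action_desc *desc;
	int built;
};

static int lwt_build_state(struct lwt *lwt, const void *cfg)
{
	const struct lwt_ops *ops = &lwt->desc->ops;

	if (!ops->build_state)
		return 0;

	return ops->build_state(lwt, cfg);
}

static void lwt_destroy_state(struct lwt *lwt)
{
	const struct lwt_ops *ops = &lwt->desc->ops;

	if (!ops->destroy_state)
		return;

	ops->destroy_state(lwt);
}

/* demo callbacks: a behavior with a custom constructor/destructor */
static int demo_build(struct lwt *lwt, const void *cfg)
{
	if (!cfg)
		return -22;	/* -EINVAL */

	lwt->built = 1;
	return 0;
}

static void demo_destroy(struct lwt *lwt)
{
	lwt->built = 0;
}

static const struct action_desc demo_desc = {
	.action = 1,
	.ops = { .build_state = demo_build, .destroy_state = demo_destroy },
};

/* a behavior that needs no customization leaves both callbacks NULL */
static const struct action_desc plain_desc = { .action = 2 };
```

A behavior such as plain_desc pays nothing for the hooks: both wrappers return
early when the callback is NULL.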

> > +}
> > +
> > +/* call the custom destructor of the behavior which is invoked before the
> > + * tunnel is going to be destroyed.
> > + */
> > +static void seg6_local_lwtunnel_destroy_state(struct seg6_local_lwt *slwt)
> > +{
> > +	slwt_destroy_state_t destroy_func;
> > +	struct seg6_action_desc *desc;
> > +
> > +	desc = slwt->desc;
> > +	if (!desc)
> > +		return;
> > +
> > +	destroy_func = desc->slwt_ops.destroy_state;
> > +	if (destroy_func)
> > +		destroy_func(slwt);
> > +}
> > +
> >  static int parse_nla_action(struct nlattr **attrs, struct seg6_local_lwt *slwt)
> >  {
> >  	struct seg6_action_param *param;
> > @@ -1090,8 +1141,16 @@ static int seg6_local_build_state(struct net *net, struct nlattr *nla,
> >  
> >  	err = parse_nla_action(tb, slwt);
> >  	if (err < 0)
> > +		/* In case of error, the parse_nla_action() takes care of
> > +		 * releasing resources which have been acquired during the
> > +		 * processing of attributes.
> > +		 */
> 
> that's the normal behavior for a kernel function, comment is
> unnecessary IMO
> 

Yes, and this is the way it should be. But before this patch,
parse_nla_action() did not always release all the acquired resources in case
of error. From this patchset onward, parse_nla_action() behaves as we
expect. Therefore, I will remove the comment in v3.

> >  		goto out_free;
> >  
> > +	err = seg6_local_lwtunnel_build_state(slwt, cfg, extack);
> > +	if (err < 0)
> > +		goto free_attrs;
> 
> The function is called destroy_attrs, call the label out_destroy_attrs,
> or err_destroy_attrs.
> 

Fine, in v3 I will go with out_destroy_attrs, to be consistent with the
out_free label.

> >  	newts->type = LWTUNNEL_ENCAP_SEG6_LOCAL;
> >  	newts->flags = LWTUNNEL_STATE_INPUT_REDIRECT;
> >  	newts->headroom = slwt->headroom;
> > @@ -1100,6 +1159,9 @@ static int seg6_local_build_state(struct net *net, struct nlattr *nla,
> >  
> >  	return 0;
> >  
> > +free_attrs:
> > +	destroy_attrs(slwt);
> > +
> 
> no need for empty lines on error paths
> 

Ok.

> >  out_free:
> >  	kfree(newts);
> >  	return err;
> > @@ -1109,6 +1171,8 @@ static void seg6_local_destroy_state(struct lwtunnel_state *lwt)
> >  {
> >  	struct seg6_local_lwt *slwt = seg6_local_lwtunnel(lwt);
> >  
> > +	seg6_local_lwtunnel_destroy_state(slwt);
> > +
> >  	destroy_attrs(slwt);
> >  
> >  	return;
> 

Thank you,
Andrea


* Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-10 23:12   ` Jakub Kicinski
@ 2020-11-13  1:28     ` Andrea Mayer
  2020-11-13  1:49       ` David Ahern
  0 siblings, 1 reply; 34+ messages in thread
From: Andrea Mayer @ 2020-11-13  1:28 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S. Miller, David Ahern, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Shuah Khan, Shrijeet Mukherjee,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, netdev, linux-kernel, linux-kselftest, Stefano Salsano,
	Paolo Lungaroni, Ahmed Abdelsalam, Andrea Mayer

Hi Jakub,
many thanks for your review. Please see my responses inline:

On Tue, 10 Nov 2020 15:12:55 -0800
Jakub Kicinski <kuba@kernel.org> wrote:

> On Sat,  7 Nov 2020 16:31:38 +0100 Andrea Mayer wrote:
> > SRv6 End.DT4 is defined in the SRv6 Network Programming [1].
> > 
> > The SRv6 End.DT4 is used to implement IPv4 L3VPN use-cases in
> > multi-tenants environments. It decapsulates the received packets and it
> > performs IPv4 routing lookup in the routing table of the tenant.
> > 
> > The SRv6 End.DT4 Linux implementation leverages a VRF device in order to
> > force the routing lookup into the associated routing table.
> 
> How does the behavior of DT4 compare to DT6?
> 

The implementation of SRv6 End.DT4 differs from the implementation of SRv6
End.DT6 because of the different *route input* lookup functions. For IPv6 it is
possible to force the routing lookup into a specific routing table through the
ip6_pol_route() function (as is done in seg6_lookup_any_nexthop()).

Conversely, for IPv4 we cannot force the lookup into a specific table with
the functions that are currently exposed by the kernel.

> The implementation looks quite different.
>

Long story short:
A long time ago, we discussed here on the mailing list how best to implement
SRv6 End.DT4. After some time, with the help of David Ahern, we identified the
VRF as the key infrastructure on which to build SRv6 End.DT4. Indeed, using a
VRF lets us avoid touching the core components of the kernel (i.e. the IPv4
routing system) and exploit an already existing infrastructure.

I would say that SRv6 End.DT6 should also leverage the VRF, as we did for
SRv6 End.DT4. We can also try to change the End.DT6 implementation, if needed.

> > To make the End.DT4 work properly, it must be guaranteed that the routing
> > table used for routing lookup operations is bound to one and only one
> > VRF during the tunnel creation. Such constraint has to be enforced by
> > enabling the VRF strict_mode sysctl parameter, i.e:
> >  $ sysctl -wq net.vrf.strict_mode=1.
> > 
> > At JANOG44, LINE corporation presented their multi-tenant DC architecture
> > using SRv6 [2]. In the slides, they reported that the Linux kernel is
> > missing the support of SRv6 End.DT4 behavior.
> > 
> > The iproute2 counterpart required for configuring the SRv6 End.DT4
> > behavior is already implemented along with the other supported SRv6
> > behaviors [3].
> > 
> > [1] https://tools.ietf.org/html/draft-ietf-spring-srv6-network-programming
> > [2] https://speakerdeck.com/line_developers/line-data-center-networking-with-srv6
> > [3] https://patchwork.ozlabs.org/patch/799837/
> > 
> > Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
> > ---
> >  net/ipv6/seg6_local.c | 205 ++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 205 insertions(+)
> > 
> > diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
> > index 4b0f155d641d..a41074acd43e 100644
> > --- a/net/ipv6/seg6_local.c
> > +++ b/net/ipv6/seg6_local.c
> > @@ -57,6 +57,14 @@ struct bpf_lwt_prog {
> >  	char *name;
> >  };
> >  
> > +struct seg6_end_dt4_info {
> > +	struct net *net;
> > +	/* VRF device associated to the routing table used by the SRv6 End.DT4
> > +	 * behavior for routing IPv4 packets.
> > +	 */
> > +	int vrf_ifindex;
> > +};
> > +
> >  struct seg6_local_lwt {
> >  	int action;
> >  	struct ipv6_sr_hdr *srh;
> > @@ -66,6 +74,7 @@ struct seg6_local_lwt {
> >  	int iif;
> >  	int oif;
> >  	struct bpf_lwt_prog bpf;
> > +	struct seg6_end_dt4_info dt4_info;
> >  
> >  	int headroom;
> >  	struct seg6_action_desc *desc;
> > @@ -413,6 +422,194 @@ static int input_action_end_dx4(struct sk_buff *skb,
> >  	return -EINVAL;
> >  }
> >  
> > +#ifdef CONFIG_NET_L3_MASTER_DEV
> > +
> 
> no need for this empty line.
> 

Ok.

> > +static struct net *fib6_config_get_net(const struct fib6_config *fib6_cfg)
> > +{
> > +	const struct nl_info *nli = &fib6_cfg->fc_nlinfo;
> > +
> > +	return nli->nl_net;
> > +}
> > +
> > +static int seg6_end_dt4_build(struct seg6_local_lwt *slwt, const void *cfg,
> > +			      struct netlink_ext_ack *extack)
> > +{
> > +	struct seg6_end_dt4_info *info = &slwt->dt4_info;
> > +	int vrf_ifindex;
> > +	struct net *net;
> > +
> > +	net = fib6_config_get_net(cfg);
> > +
> > +	vrf_ifindex = l3mdev_ifindex_lookup_by_table_id(L3MDEV_TYPE_VRF, net,
> > +							slwt->table);
> > +	if (vrf_ifindex < 0) {
> > +		if (vrf_ifindex == -EPERM) {
> > +			NL_SET_ERR_MSG(extack,
> > +				       "Strict mode for VRF is disabled");
> > +		} else if (vrf_ifindex == -ENODEV) {
> > +			NL_SET_ERR_MSG(extack, "No such device");
> 
> That's what -ENODEV already says.
>

Yes, sorry for this very trivial message. I will improve it in v3.
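
For context, a simplified userspace model of the lookup whose errors are
reported here. It only mirrors what the hunk above shows (-EPERM when strict
mode is disabled, -ENODEV when no VRF is bound to the table); struct netns,
struct vrf_dev and both functions are hypothetical, and the -ENODEV wording is
just one possible example of a more specific message:

```c
#include <errno.h>
#include <stddef.h>

struct vrf_dev {
	int ifindex;
	unsigned int table_id;
};

struct netns {
	int strict_mode;	/* models net.vrf.strict_mode */
	const struct vrf_dev *vrfs;
	size_t n_vrfs;
};

/* With strict mode enabled a table is bound to at most one VRF, so the
 * first match is the only match. Without strict mode the mapping may be
 * ambiguous and the lookup is refused.
 */
static int vrf_ifindex_by_table(const struct netns *net, unsigned int table_id)
{
	size_t i;

	if (!net->strict_mode)
		return -EPERM;

	for (i = 0; i < net->n_vrfs; i++)
		if (net->vrfs[i].table_id == table_id)
			return net->vrfs[i].ifindex;

	return -ENODEV;
}

/* Message reported to user space, or NULL when the errno is left to
 * speak for itself.
 */
static const char *dt4_build_errmsg(int err)
{
	switch (err) {
	case -EPERM:
		return "Strict mode for VRF is disabled";
	case -ENODEV:
		return "Table has no associated VRF device";
	default:
		return NULL;
	}
}
```

The message for -ENODEV names the table-to-VRF binding, which tells the user
something the bare errno string does not.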
 
> > +		} else {
> > +			NL_SET_ERR_MSG(extack, "Unknown error");
> 
> Useless error.
> 

Ok, I will remove it and keep only the pr_debug message in v3.

> > +			pr_debug("seg6local: SRv6 End.DT4 creation error=%d\n",
> > +				 vrf_ifindex);
> > +		}
> > +
> > +		return vrf_ifindex;
> > +	}
> > +
> > +	info->net = net;
> > +	info->vrf_ifindex = vrf_ifindex;
> > +
> > +	return 0;
> > +}
> > +
> > +/* The SRv6 End.DT4 behavior extracts the inner (IPv4) packet and routes the
> > + * IPv4 packet by looking at the configured routing table.
> > + *
> > + * In the SRv6 End.DT4 use case, we can receive traffic (IPv6+Segment Routing
> > + * Header packets) from several interfaces and the IPv6 destination address (DA)
> > + * is used for retrieving the specific instance of the End.DT4 behavior that
> > + * should process the packets.
> > + *
> > + * However, the inner IPv4 packet is not really bound to any receiving
> > + * interface and thus the End.DT4 sets the VRF (associated with the
> > + * corresponding routing table) as the *receiving* interface.
> > + * In other words, the End.DT4 processes a packet as if it has been received
> > + * directly by the VRF (and not by one of its slave devices, if any).
> > + * In this way, the VRF interface is used for routing the IPv4 packet in
> > + * according to the routing table configured by the End.DT4 instance.
> > + *
> > + * This design allows you to get some interesting features like:
> > + *  1) the statistics on rx packets;
> > + *  2) the possibility to install a packet sniffer on the receiving interface
> > + *     (the VRF one) for looking at the incoming packets;
> > + *  3) the possibility to leverage the netfilter prerouting hook for the inner
> > + *     IPv4 packet.
> > + *
> > + * This function returns:
> > + *  - the sk_buff* when the VRF rcv handler has processed the packet correctly;
> > + *  - NULL when the skb is consumed by the VRF rcv handler;
> > + *  - a pointer which encodes a negative error number in case of error.
> > + *    Note that in this case, the function takes care of freeing the skb.
> > + */
> > +static struct sk_buff *end_dt4_vrf_rcv(struct sk_buff *skb,
> > +				       struct net_device *dev)
> > +{
> > +	/* based on l3mdev_ip_rcv; we are only interested in the master */
> > +	if (unlikely(!netif_is_l3_master(dev) && !netif_has_l3_rx_handler(dev)))
> > +		goto drop;
> > +
> > +	if (unlikely(!dev->l3mdev_ops->l3mdev_l3_rcv))
> > +		goto drop;
> > +
> > +	/* the decap packet (IPv4) does not come with any mac header info.
> > +	 * We must unset the mac header to allow the VRF device to rebuild it,
> > +	 * just in case there is a sniffer attached on the device.
> > +	 */
> > +	skb_unset_mac_header(skb);
> > +
> > +	skb = dev->l3mdev_ops->l3mdev_l3_rcv(dev, skb, AF_INET);
> > +	if (!skb)
> > +		/* the skb buffer was consumed by the handler */
> > +		return NULL;
> > +
> > +	/* when a packet is received by a VRF or by one of its slaves, the
> > +	 * master device reference is set into the skb.
> > +	 */
> > +	if (unlikely(skb->dev != dev || skb->skb_iif != dev->ifindex))
> > +		goto drop;
> > +
> > +	return skb;
> > +
> > +drop:
> > +	kfree_skb(skb);
> > +	return ERR_PTR(-EINVAL);
> > +}
> > +
> > +static struct net_device *end_dt4_get_vrf_rcu(struct sk_buff *skb,
> > +					      struct seg6_end_dt4_info *info)
> > +{
> > +	int vrf_ifindex = info->vrf_ifindex;
> > +	struct net *net = info->net;
> > +
> > +	if (unlikely(vrf_ifindex < 0))
> > +		goto error;
> > +
> > +	if (unlikely(!net_eq(dev_net(skb->dev), net)))
> > +		goto error;
> > +
> > +	return dev_get_by_index_rcu(net, vrf_ifindex);
> > +
> > +error:
> > +	return NULL;
> > +}
> > +
> > +static int input_action_end_dt4(struct sk_buff *skb,
> > +				struct seg6_local_lwt *slwt)
> > +{
> > +	struct net_device *vrf;
> > +	struct iphdr *iph;
> > +	int err;
> > +
> > +	if (!decap_and_validate(skb, IPPROTO_IPIP))
> > +		goto drop;
> > +
> > +	if (!pskb_may_pull(skb, sizeof(struct iphdr)))
> > +		goto drop;
> > +
> > +	vrf = end_dt4_get_vrf_rcu(skb, &slwt->dt4_info);
> > +	if (unlikely(!vrf))
> > +		goto drop;
> > +
> > +	skb->protocol = htons(ETH_P_IP);
> > +
> > +	skb_dst_drop(skb);
> > +
> > +	skb_set_transport_header(skb, sizeof(struct iphdr));
> > +
> > +	skb = end_dt4_vrf_rcv(skb, vrf);
> > +	if (!skb)
> > +		/* packet has been processed and consumed by the VRF */
> > +		return 0;
> > +
> > +	if (IS_ERR(skb)) {
> > +		err = PTR_ERR(skb);
> > +		return err;
> 
> return PTR_ERR(skb)
> 

I will fix it in v3.

> > +	}
> > +
> > +	iph = ip_hdr(skb);
> > +
> > +	err = ip_route_input(skb, iph->daddr, iph->saddr, 0, skb->dev);
> > +	if (err)
> > +		goto drop;
> > +
> > +	return dst_input(skb);
> > +
> > +drop:
> > +	kfree_skb(skb);
> > +	return -EINVAL;
> > +}
> > +
> > +#else
> > +
> 
> new line not needed
> 

Ok.

> > +static int seg6_end_dt4_build(struct seg6_local_lwt *slwt, const void *cfg,
> > +			      struct netlink_ext_ack *extack)
> > +{
> > +	NL_SET_ERR_MSG(extack, "Operation is not supported");
> 
> This extack message probably could be more helpful. As it stands it's
> basically 
> 

Please see my reply just below.

> > +
> > +	return -EOPNOTSUPP;
> > +}
> > +
> > +static int input_action_end_dt4(struct sk_buff *skb,
> > +				struct seg6_local_lwt *slwt)
> 
> Maybe just ifdef out the part of the action table instead of creating
> those stubs?
> 

This is a very interesting point and I like your idea. We can eliminate the two
stubs while keeping the "unsupported operation" semantics in this way:

static struct seg6_action_desc seg6_action_table[] = {
   [...]
   {
       .action = SEG6_LOCAL_ACTION_END_DT4,
       .attrs = (1 << SEG6_LOCAL_TABLE),
#ifdef CONFIG_NET_L3_MASTER_DEV
       .input = input_action_end_dt4,
       .slwt_ops = {
           .build_state = seg6_end_dt4_build,
       },
#endif
   },
   [...]
};

When CONFIG_NET_L3_MASTER_DEV is not defined, the behavior cannot be
instantiated because the "input" callback is initialized to NULL. This
forces parse_nla_action() to fail, returning -EOPNOTSUPP to the user
(which is exactly what we want to achieve).

Note that surrounding the entire DT4 action table entry with #ifdef/#endif
would not allow us to distinguish whether DT4 is not implemented at all or
is unsupported because of the way CONFIG_NET_L3_MASTER_DEV was set.
In both cases, when the user tries to instantiate a new DT4 behavior, the
kernel would reply with the -EINVAL error.
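
To make the difference between the two error codes concrete, here is a small
userspace sketch of the lookup logic described above. All names (the model_*
identifiers and MODEL_HAVE_L3_MASTER_DEV) are hypothetical; the real code uses
struct seg6_action_desc and parse_nla_action(), and this is only a simplified
model of them.

```c
#include <errno.h>
#include <stddef.h>

/* Assumption: stand-in for CONFIG_NET_L3_MASTER_DEV; set to 1 to "enable". */
#define MODEL_HAVE_L3_MASTER_DEV 0

enum {
	MODEL_ACTION_END	= 1,
	MODEL_ACTION_END_DT4	= 2,
	MODEL_ACTION_UNKNOWN	= 99,	/* has no entry in the table */
};

struct model_action_desc {
	int action;
	int (*input)(void);
};

static int model_input_end(void)
{
	return 0;
}

#if MODEL_HAVE_L3_MASTER_DEV
static int model_input_end_dt4(void)
{
	return 0;
}
#endif

static const struct model_action_desc model_action_table[] = {
	{
		.action	= MODEL_ACTION_END,
		.input	= model_input_end,
	},
	{
		/* The entry always exists; only the callback is conditional. */
		.action	= MODEL_ACTION_END_DT4,
#if MODEL_HAVE_L3_MASTER_DEV
		.input	= model_input_end_dt4,
#endif
	},
};

/* Returns 0 if the action can be instantiated, a negative errno otherwise. */
static int model_parse_action(int action)
{
	size_t count = sizeof(model_action_table) / sizeof(model_action_table[0]);
	size_t i;

	for (i = 0; i < count; i++) {
		const struct model_action_desc *desc = &model_action_table[i];

		if (desc->action != action)
			continue;

		/* Known action, but compiled without its implementation. */
		if (!desc->input)
			return -EOPNOTSUPP;

		return 0;
	}

	/* No entry at all: the action is unknown to this table. */
	return -EINVAL;
}
```

With the whole entry under #ifdef instead, the MODEL_ACTION_END_DT4 lookup
would fall through to the final return and be indistinguishable from an
unknown action.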

> > +{
> > +	kfree_skb(skb);
> > +	return -EOPNOTSUPP;
> > +}
> > +
> > +#endif
> > +
> >  static int input_action_end_dt6(struct sk_buff *skb,
> >  				struct seg6_local_lwt *slwt)
> >  {
> > @@ -601,6 +798,14 @@ static struct seg6_action_desc seg6_action_table[] = {
> 
> BTW any idea why the action table is not marked as const?
> 

Frankly speaking, I have no idea. I have been working on the seg6 infrastructure
for some time now, and I have never seen a single value changed in
seg6_action_table[] after its initialization (nor any need to carry out an
update operation).

> Would you mind sending a patch to fix that?
> 

Sure, I will send a fix for this issue that adds the 'const' keyword.
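
As a rough illustration of what such a change implies, the sketch below shows
a read-only table together with a lookup helper that returns a pointer to
const. The demo_* names and field values are hypothetical and much simpler
than the real seg6_action_table[] and __get_action_desc().

```c
#include <stddef.h>

/* Hypothetical, reduced descriptor: just an id and an attribute bitmap. */
struct demo_action_desc {
	int action;
	unsigned long attrs;
};

/* A const table can be placed in read-only data by the compiler. */
static const struct demo_action_desc demo_action_table[] = {
	{ .action = 1, .attrs = 0 },
	{ .action = 2, .attrs = 1UL << 3 },
};

/* The return type must also become const, or the build fails
 * with a discarded-qualifier diagnostic.
 */
static const struct demo_action_desc *demo_get_action_desc(int action)
{
	size_t count = sizeof(demo_action_table) / sizeof(demo_action_table[0]);
	size_t i;

	for (i = 0; i < count; i++) {
		if (demo_action_table[i].action == action)
			return &demo_action_table[i];
	}

	return NULL;
}
```

Every caller that stores the returned pointer needs the same const
qualifier, which is usually the bulk of such a patch.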

> >  		.attrs		= (1 << SEG6_LOCAL_NH4),
> >  		.input		= input_action_end_dx4,
> >  	},
> > +	{
> > +		.action		= SEG6_LOCAL_ACTION_END_DT4,
> > +		.attrs		= (1 << SEG6_LOCAL_TABLE),
> > +		.input		= input_action_end_dt4,
> > +		.slwt_ops	= {
> > +					.build_state = seg6_end_dt4_build,
> > +				  },
> > +	},
> >  	{
> >  		.action		= SEG6_LOCAL_ACTION_END_DT6,
> >  		.attrs		= (1 << SEG6_LOCAL_TABLE),
> 

Thank you,
Andrea




* Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-13  1:28     ` Andrea Mayer
@ 2020-11-13  1:49       ` David Ahern
  2020-11-13 16:55         ` Jakub Kicinski
  0 siblings, 1 reply; 34+ messages in thread
From: David Ahern @ 2020-11-13  1:49 UTC (permalink / raw)
  To: Andrea Mayer, Jakub Kicinski
  Cc: David S. Miller, David Ahern, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Shuah Khan, Shrijeet Mukherjee,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, netdev, linux-kernel, linux-kselftest, Stefano Salsano,
	Paolo Lungaroni, Ahmed Abdelsalam

On 11/12/20 6:28 PM, Andrea Mayer wrote:
> The implementation of SRv6 End.DT4 differs from the implementation of SRv6
> End.DT6 due to the different *route input* lookup functions. For IPv6 it is
> possible to force the routing lookup specifying a routing table through the
> ip6_pol_route() function (as it is done in the seg6_lookup_any_nexthop()).

It is unfortunate that the IPv6 variant got in without the VRF piece.


* Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-07 15:31 ` [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior Andrea Mayer
  2020-11-10 23:12   ` Jakub Kicinski
@ 2020-11-13  9:23   ` kernel test robot
  2020-11-13 16:57     ` Jakub Kicinski
  1 sibling, 1 reply; 34+ messages in thread
From: kernel test robot @ 2020-11-13  9:23 UTC (permalink / raw)
  To: Andrea Mayer, David S. Miller, David Ahern, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski, Shuah Khan,
	Shrijeet Mukherjee, Alexei Starovoitov, Daniel Borkmann
  Cc: kbuild-all, clang-built-linux, netdev


Hi Andrea,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on ipvs/master]
[also build test ERROR on linus/master sparc-next/master v5.10-rc3 next-20201112]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Andrea-Mayer/seg6-add-support-for-the-SRv6-End-DT4-behavior/20201109-093019
base:   https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs.git master
config: x86_64-randconfig-a005-20201111 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 874b0a0b9db93f5d3350ffe6b5efda2d908415d0)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install x86_64 cross compiling tool for clang build
        # apt-get install binutils-x86-64-linux-gnu
        # https://github.com/0day-ci/linux/commit/761138e2f757ac64efe97b03311c976db242dc92
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Andrea-Mayer/seg6-add-support-for-the-SRv6-End-DT4-behavior/20201109-093019
        git checkout 761138e2f757ac64efe97b03311c976db242dc92
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> net/ipv6/seg6_local.c:793:4: error: field designator 'slwt_ops' does not refer to any field in type 'struct seg6_action_desc'
                   .slwt_ops       = {
                    ^
>> net/ipv6/seg6_local.c:826:10: error: invalid application of 'sizeof' to an incomplete type 'struct seg6_action_desc []'
           count = ARRAY_SIZE(seg6_action_table);
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/kernel.h:48:32: note: expanded from macro 'ARRAY_SIZE'
   #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
                                  ^~~~~
   2 errors generated.

vim +793 net/ipv6/seg6_local.c

   757	
   758	static struct seg6_action_desc seg6_action_table[] = {
   759		{
   760			.action		= SEG6_LOCAL_ACTION_END,
   761			.attrs		= 0,
   762			.input		= input_action_end,
   763		},
   764		{
   765			.action		= SEG6_LOCAL_ACTION_END_X,
   766			.attrs		= (1 << SEG6_LOCAL_NH6),
   767			.input		= input_action_end_x,
   768		},
   769		{
   770			.action		= SEG6_LOCAL_ACTION_END_T,
   771			.attrs		= (1 << SEG6_LOCAL_TABLE),
   772			.input		= input_action_end_t,
   773		},
   774		{
   775			.action		= SEG6_LOCAL_ACTION_END_DX2,
   776			.attrs		= (1 << SEG6_LOCAL_OIF),
   777			.input		= input_action_end_dx2,
   778		},
   779		{
   780			.action		= SEG6_LOCAL_ACTION_END_DX6,
   781			.attrs		= (1 << SEG6_LOCAL_NH6),
   782			.input		= input_action_end_dx6,
   783		},
   784		{
   785			.action		= SEG6_LOCAL_ACTION_END_DX4,
   786			.attrs		= (1 << SEG6_LOCAL_NH4),
   787			.input		= input_action_end_dx4,
   788		},
   789		{
   790			.action		= SEG6_LOCAL_ACTION_END_DT4,
   791			.attrs		= (1 << SEG6_LOCAL_TABLE),
   792			.input		= input_action_end_dt4,
 > 793			.slwt_ops	= {
   794						.build_state = seg6_end_dt4_build,
   795					  },
   796		},
   797		{
   798			.action		= SEG6_LOCAL_ACTION_END_DT6,
   799			.attrs		= (1 << SEG6_LOCAL_TABLE),
   800			.input		= input_action_end_dt6,
   801		},
   802		{
   803			.action		= SEG6_LOCAL_ACTION_END_B6,
   804			.attrs		= (1 << SEG6_LOCAL_SRH),
   805			.input		= input_action_end_b6,
   806		},
   807		{
   808			.action		= SEG6_LOCAL_ACTION_END_B6_ENCAP,
   809			.attrs		= (1 << SEG6_LOCAL_SRH),
   810			.input		= input_action_end_b6_encap,
   811			.static_headroom	= sizeof(struct ipv6hdr),
   812		},
   813		{
   814			.action		= SEG6_LOCAL_ACTION_END_BPF,
   815			.attrs		= (1 << SEG6_LOCAL_BPF),
   816			.input		= input_action_end_bpf,
   817		},
   818	
   819	};
   820	
   821	static struct seg6_action_desc *__get_action_desc(int action)
   822	{
   823		struct seg6_action_desc *desc;
   824		int i, count;
   825	
 > 826		count = ARRAY_SIZE(seg6_action_table);
   827		for (i = 0; i < count; i++) {
   828			desc = &seg6_action_table[i];
   829			if (desc->action == action)
   830				return desc;
   831		}
   832	
   833		return NULL;
   834	}
   835	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 35348 bytes --]


* Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-13  1:49       ` David Ahern
@ 2020-11-13 16:55         ` Jakub Kicinski
  2020-11-13 17:02           ` Stefano Salsano
  0 siblings, 1 reply; 34+ messages in thread
From: Jakub Kicinski @ 2020-11-13 16:55 UTC (permalink / raw)
  To: David Ahern
  Cc: Andrea Mayer, David S. Miller, David Ahern, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Shuah Khan, Shrijeet Mukherjee,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, netdev, linux-kernel, linux-kselftest, Stefano Salsano,
	Paolo Lungaroni, Ahmed Abdelsalam

On Thu, 12 Nov 2020 18:49:17 -0700 David Ahern wrote:
> On 11/12/20 6:28 PM, Andrea Mayer wrote:
> > The implementation of SRv6 End.DT4 differs from the implementation of SRv6
> > End.DT6 due to the different *route input* lookup functions. For IPv6 it is
> > possible to force the routing lookup specifying a routing table through the
> > ip6_pol_route() function (as it is done in the seg6_lookup_any_nexthop()).  
> 
> It is unfortunate that the IPv6 variant got in without the VRF piece.

Should we make it a requirement for this series to also extend the v6
version to support the preferred VRF-based operation? Given VRF is
better and we require v4 features to be implemented for v6?


* Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-13  9:23   ` kernel test robot
@ 2020-11-13 16:57     ` Jakub Kicinski
  2020-11-13 17:05       ` David Ahern
  2020-11-23  1:13       ` [kbuild-all] " Rong Chen
  0 siblings, 2 replies; 34+ messages in thread
From: Jakub Kicinski @ 2020-11-13 16:57 UTC (permalink / raw)
  To: kernel test robot
  Cc: Andrea Mayer, David S. Miller, David Ahern, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Shuah Khan, Shrijeet Mukherjee,
	Alexei Starovoitov, Daniel Borkmann, kbuild-all,
	clang-built-linux, netdev

Good people of build bot, 

would you mind shedding some light on this one? It was also reported on
v1, and Andrea said it's impossible to repro. Strange that build bot
would make the same mistake twice, tho.

Thanks!



* Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-13 16:55         ` Jakub Kicinski
@ 2020-11-13 17:02           ` Stefano Salsano
  2020-11-13 17:04             ` David Ahern
  0 siblings, 1 reply; 34+ messages in thread
From: Stefano Salsano @ 2020-11-13 17:02 UTC (permalink / raw)
  To: Jakub Kicinski, David Ahern
  Cc: Andrea Mayer, David S. Miller, David Ahern, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Shuah Khan, Shrijeet Mukherjee,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, netdev, linux-kernel, linux-kselftest, Paolo Lungaroni,
	Ahmed Abdelsalam

Il 2020-11-13 17:55, Jakub Kicinski ha scritto:
> On Thu, 12 Nov 2020 18:49:17 -0700 David Ahern wrote:
>> On 11/12/20 6:28 PM, Andrea Mayer wrote:
>>> The implementation of SRv6 End.DT4 differs from the implementation of SRv6
>>> End.DT6 due to the different *route input* lookup functions. For IPv6 it is
>>> possible to force the routing lookup specifying a routing table through the
>>> ip6_pol_route() function (as it is done in the seg6_lookup_any_nexthop()).
>>
>> It is unfortunate that the IPv6 variant got in without the VRF piece.
> 
> Should we make it a requirement for this series to also extend the v6
> version to support the preferred VRF-based operation? Given VRF is
> better and we require v4 features to be implemented for v6?

I think it is better to separate the two aspects... adding a missing 
feature in IPv4 datapath should not depend on improving the quality of 
the implementation of the IPv6 datapath :-)

I think that Andrea is willing to work on improving the IPv6 
implementation, but this should be considered after this patchset...

my 2c

Stefano

-- 
*******************************************************************
Stefano Salsano
Professore Associato
Dipartimento Ingegneria Elettronica
Universita' di Roma Tor Vergata
Viale Politecnico, 1 - 00133 Roma - ITALY

http://netgroup.uniroma2.it/Stefano_Salsano/

E-mail  : stefano.salsano@uniroma2.it
Cell.   : +39 320 4307310
Office  : (Tel.) +39 06 72597770 (Fax.) +39 06 72597435
*******************************************************************



* Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-13 17:02           ` Stefano Salsano
@ 2020-11-13 17:04             ` David Ahern
  2020-11-13 19:40               ` Jakub Kicinski
  0 siblings, 1 reply; 34+ messages in thread
From: David Ahern @ 2020-11-13 17:04 UTC (permalink / raw)
  To: Stefano Salsano, Jakub Kicinski
  Cc: Andrea Mayer, David S. Miller, David Ahern, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Shuah Khan, Shrijeet Mukherjee,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, netdev, linux-kernel, linux-kselftest, Paolo Lungaroni,
	Ahmed Abdelsalam

On 11/13/20 10:02 AM, Stefano Salsano wrote:
> Il 2020-11-13 17:55, Jakub Kicinski ha scritto:
>> On Thu, 12 Nov 2020 18:49:17 -0700 David Ahern wrote:
>>> On 11/12/20 6:28 PM, Andrea Mayer wrote:
>>>> The implementation of SRv6 End.DT4 differs from the
>>>> implementation of SRv6
>>>> End.DT6 due to the different *route input* lookup functions. For
>>>> IPv6 it is
>>>> possible to force the routing lookup specifying a routing table
>>>> through the
>>>> ip6_pol_route() function (as it is done in the
>>>> seg6_lookup_any_nexthop()).
>>>
>>> It is unfortunate that the IPv6 variant got in without the VRF piece.
>>
>> Should we make it a requirement for this series to also extend the v6
>> version to support the preferred VRF-based operation? Given VRF is
>> better and we require v4 features to be implemented for v6?
> 
> I think it is better to separate the two aspects... adding a missing
> feature in IPv4 datapath should not depend on improving the quality of
> the implementation of the IPv6 datapath :-)
> 
> I think that Andrea is willing to work on improving the IPv6
> implementation, but this should be considered after this patchset...
> 

agreed. The v6 variant has existed for a while. The v4 version is
independent.


* Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-13 16:57     ` Jakub Kicinski
@ 2020-11-13 17:05       ` David Ahern
  2020-11-13 19:00         ` Nathan Chancellor
  2020-11-23  1:13       ` [kbuild-all] " Rong Chen
  1 sibling, 1 reply; 34+ messages in thread
From: David Ahern @ 2020-11-13 17:05 UTC (permalink / raw)
  To: Jakub Kicinski, kernel test robot
  Cc: Andrea Mayer, David S. Miller, David Ahern, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Shuah Khan, Shrijeet Mukherjee,
	Alexei Starovoitov, Daniel Borkmann, kbuild-all,
	clang-built-linux, netdev

On 11/13/20 9:57 AM, Jakub Kicinski wrote:
> Good people of build bot, 
> 
> would you mind shedding some light on this one? It was also reported on
> v1, and Andrea said it's impossible to repro. Strange that build bot
> would make the same mistake twice, tho.
> 

I kicked off a build this morning using Andrea's patches and the config
from the build bot; builds fine as long as the first 3 patches are applied.



* Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-13 17:05       ` David Ahern
@ 2020-11-13 19:00         ` Nathan Chancellor
  2020-11-14  3:37           ` David Ahern
  0 siblings, 1 reply; 34+ messages in thread
From: Nathan Chancellor @ 2020-11-13 19:00 UTC (permalink / raw)
  To: David Ahern
  Cc: Jakub Kicinski, kernel test robot, Andrea Mayer, David S. Miller,
	David Ahern, Alexey Kuznetsov, Hideaki YOSHIFUJI, Shuah Khan,
	Shrijeet Mukherjee, Alexei Starovoitov, Daniel Borkmann,
	kbuild-all, clang-built-linux, netdev

On Fri, Nov 13, 2020 at 10:05:56AM -0700, David Ahern wrote:
> On 11/13/20 9:57 AM, Jakub Kicinski wrote:
> > Good people of build bot, 
> > 
> > would you mind shedding some light on this one? It was also reported on
> > v1, and Andrea said it's impossible to repro. Strange that build bot
> > would make the same mistake twice, tho.
> > 
> 
> I kicked off a build this morning using Andrea's patches and the config
> from the build bot; builds fine as long as the first 3 patches are applied.
> 

I can confirm this as well with clang: if I apply the first three
patches and then this one, there is no error, but if I apply only this
one, the errors appear. The GitHub URL shows just this patch applied,
not the first three, which explains it.

For what it's worth, b4 chokes over this series:

$ b4 am -o - 20201107153139.3552-1-andrea.mayer@uniroma2.it | git am
Looking up https://lore.kernel.org/r/20201107153139.3552-1-andrea.mayer%40uniroma2.it
Grabbing thread from lore.kernel.org/linux-kselftest
Analyzing 18 messages in the thread
---
Writing /tmp/tmp8425by7fb4-am-stdout
  [net-next,v2,3/5] seg6: add callbacks for customizing the creation/destruction of a behavior
---
Total patches: 1
---
 Link: https://lore.kernel.org/r/20201107153139.3552-1-andrea.mayer@uniroma2.it
 Base: not found
---
Applying: seg6: add callbacks for customizing the creation/destruction of a behavior
error: patch failed: net/ipv6/seg6_local.c:1015
error: net/ipv6/seg6_local.c: patch does not apply
Patch failed at 0001 seg6: add callbacks for customizing the creation/destruction of a behavior
hint: Use 'git am --show-current-patch=diff' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

Even if I grab the mbox from lore.kernel.org, it tries to do the same
thing and apply the 3rd patch first, which might explain why the 0day
bot got confused.

Cheers,
Nathan


* Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-13 17:04             ` David Ahern
@ 2020-11-13 19:40               ` Jakub Kicinski
  2020-11-13 21:32                 ` Stefano Salsano
  2020-11-13 21:40                 ` Jakub Kicinski
  0 siblings, 2 replies; 34+ messages in thread
From: Jakub Kicinski @ 2020-11-13 19:40 UTC (permalink / raw)
  To: David Ahern
  Cc: Stefano Salsano, Andrea Mayer, David S. Miller, David Ahern,
	Alexey Kuznetsov, Hideaki YOSHIFUJI, Shuah Khan,
	Shrijeet Mukherjee, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, netdev, linux-kernel, linux-kselftest,
	Paolo Lungaroni, Ahmed Abdelsalam

On Fri, 13 Nov 2020 10:04:44 -0700 David Ahern wrote:
> On 11/13/20 10:02 AM, Stefano Salsano wrote:
> > Il 2020-11-13 17:55, Jakub Kicinski ha scritto:  
> >> On Thu, 12 Nov 2020 18:49:17 -0700 David Ahern wrote:  
> >>> On 11/12/20 6:28 PM, Andrea Mayer wrote:  
> >>>> The implementation of SRv6 End.DT4 differs from the
> >>>> implementation of SRv6
> >>>> End.DT6 due to the different *route input* lookup functions. For
> >>>> IPv6 it is
> >>>> possible to force the routing lookup specifying a routing table
> >>>> through the
> >>>> ip6_pol_route() function (as it is done in the
> >>>> seg6_lookup_any_nexthop()).  
> >>>
> >>> It is unfortunate that the IPv6 variant got in without the VRF piece.  
> >>
> >> Should we make it a requirement for this series to also extend the v6
> >> version to support the preferred VRF-based operation? Given VRF is
> >> better and we require v4 features to be implemented for v6?  
> > 
> > I think it is better to separate the two aspects... adding a missing
> > feature in IPv4 datapath should not depend on improving the quality of
> > the implementation of the IPv6 datapath :-)
> > 
> > I think that Andrea is willing to work on improving the IPv6
> > implementation, but this should be considered after this patchset...
>
> agreed. The v6 variant has existed for a while. The v4 version is
> independent.

Okay, I'm not sure what's the right call so I asked DaveM.

TBH I wasn't expecting this reaction, we're talking about a 200 LoC
patch which would probably be 90% reused for v6...


* Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-13 19:40               ` Jakub Kicinski
@ 2020-11-13 21:32                 ` Stefano Salsano
  2020-11-13 21:40                 ` Jakub Kicinski
  1 sibling, 0 replies; 34+ messages in thread
From: Stefano Salsano @ 2020-11-13 21:32 UTC (permalink / raw)
  To: Jakub Kicinski, David Ahern
  Cc: Andrea Mayer, David S. Miller, David Ahern, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Shuah Khan, Shrijeet Mukherjee,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, netdev, linux-kernel, linux-kselftest, Paolo Lungaroni,
	Ahmed Abdelsalam

Il 2020-11-13 20:40, Jakub Kicinski ha scritto:
> On Fri, 13 Nov 2020 10:04:44 -0700 David Ahern wrote:
>> On 11/13/20 10:02 AM, Stefano Salsano wrote:
>>> Il 2020-11-13 17:55, Jakub Kicinski ha scritto:
>>>> On Thu, 12 Nov 2020 18:49:17 -0700 David Ahern wrote:
>>>>> On 11/12/20 6:28 PM, Andrea Mayer wrote:
>>>>>> The implementation of SRv6 End.DT4 differs from the
>>>>>> implementation of SRv6
>>>>>> End.DT6 due to the different *route input* lookup functions. For
>>>>>> IPv6 it is
>>>>>> possible to force the routing lookup specifying a routing table
>>>>>> through the
>>>>>> ip6_pol_route() function (as it is done in the
>>>>>> seg6_lookup_any_nexthop()).
>>>>>
>>>>> It is unfortunate that the IPv6 variant got in without the VRF piece.
>>>>
>>>> Should we make it a requirement for this series to also extend the v6
>>>> version to support the preferred VRF-based operation? Given VRF is
>>>> better and we require v4 features to be implemented for v6?
>>>
>>> I think it is better to separate the two aspects... adding a missing
>>> feature in IPv4 datapath should not depend on improving the quality of
>>> the implementation of the IPv6 datapath :-)
>>>
>>> I think that Andrea is willing to work on improving the IPv6
>>> implementation, but this should be considered after this patchset...
>>
>> agreed. The v6 variant has existed for a while. The v4 version is
>> independent.
> 
> Okay, I'm not sure what's the right call so I asked DaveM.
> 
> TBH I wasn't expecting this reaction, we're talking about a 200 LoC
> patch which would probably be 90% reused for v6...
> 

Jakub, we've considered the possibility to extend the v6 version to 
support the preferred VRF-based operation as you suggested

at first glance, it would break the uAPI compatibility with existing 
scripts that use SRv6 DT6, currently we configure the decap operation in 
this way

ip -6 route add 2001:db8::1/128 encap seg6local action End.DT6 table 100 dev eth0

if the v6 version is extended to support the VRF-based operation, in 
order to configure the decap operation we have to do (like we do in the 
v4 version)

ip link add vrf0 type vrf table 100
sysctl -w net.vrf.strict_mode=1
ip -6 route add 2001:db8::1/128 encap seg6local action End.DT6 table 100 
dev eth0

(of course the sysctl is needed globally once... while the "ip link 
add..." command is needed once for every table X that is used in a script)

considering how much we care of not breaking existing functionality... 
it is not clear IMO if we should go into this direction or we should 
think twice... and maybe look for another design to introduce VRFs into v6

so I would prefer finalizing the DT4 patchset and then start discussing 
the VRF support in v6 version

-- 
*******************************************************************
Stefano Salsano
Professore Associato
Dipartimento Ingegneria Elettronica
Universita' di Roma Tor Vergata
Viale Politecnico, 1 - 00133 Roma - ITALY

http://netgroup.uniroma2.it/Stefano_Salsano/

E-mail  : stefano.salsano@uniroma2.it
Cell.   : +39 320 4307310
Office  : (Tel.) +39 06 72597770 (Fax.) +39 06 72597435
*******************************************************************


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-13 19:40               ` Jakub Kicinski
  2020-11-13 21:32                 ` Stefano Salsano
@ 2020-11-13 21:40                 ` Jakub Kicinski
  2020-11-13 23:00                   ` Andrea Mayer
  1 sibling, 1 reply; 34+ messages in thread
From: Jakub Kicinski @ 2020-11-13 21:40 UTC (permalink / raw)
  To: David Ahern
  Cc: Stefano Salsano, Andrea Mayer, David S. Miller, David Ahern,
	Alexey Kuznetsov, Hideaki YOSHIFUJI, Shuah Khan,
	Shrijeet Mukherjee, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, netdev, linux-kernel, linux-kselftest,
	Paolo Lungaroni, Ahmed Abdelsalam

On Fri, 13 Nov 2020 11:40:36 -0800 Jakub Kicinski wrote:
> > agreed. The v6 variant has existed for a while. The v4 version is
> > independent.  
> 
> Okay, I'm not sure what's the right call so I asked DaveM.

DaveM raised a concern that unless we implement v6 now we can't be sure
the interface we create for v4 is going to fit there.

So Andrea unless it's a major hurdle, could you take a stab at the v6
version with VRFs as part of this series?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-13 21:40                 ` Jakub Kicinski
@ 2020-11-13 23:00                   ` Andrea Mayer
  2020-11-13 23:54                     ` Jakub Kicinski
  0 siblings, 1 reply; 34+ messages in thread
From: Andrea Mayer @ 2020-11-13 23:00 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David Ahern, Stefano Salsano, David S. Miller, David Ahern,
	Alexey Kuznetsov, Hideaki YOSHIFUJI, Shuah Khan,
	Shrijeet Mukherjee, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, netdev, linux-kernel, linux-kselftest,
	Paolo Lungaroni, Ahmed Abdelsalam, Andrea Mayer

Hi Jakub,

On Fri, 13 Nov 2020 13:40:10 -0800
Jakub Kicinski <kuba@kernel.org> wrote:

> On Fri, 13 Nov 2020 11:40:36 -0800 Jakub Kicinski wrote:
> > > agreed. The v6 variant has existed for a while. The v4 version is
> > > independent.  
> > 
> > Okay, I'm not sure what's the right call so I asked DaveM.
> 
> DaveM raised a concern that unless we implement v6 now we can't be sure
> the interface we create for v4 is going to fit there.
> 
> So Andrea unless it's a major hurdle, could you take a stab at the v6
> version with VRFs as part of this series?

I can tackle the v6 version, but how do we address the compatibility issue
raised by Stefano in his message?

If it is OK to implement a uAPI that breaks the existing scripts, it is
relatively easy to replicate the VRF-based approach in v6 as well.

Waiting for your advice!

Thanks,
Andrea

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-13 23:00                   ` Andrea Mayer
@ 2020-11-13 23:54                     ` Jakub Kicinski
  2020-11-14  1:50                       ` Andrea Mayer
  0 siblings, 1 reply; 34+ messages in thread
From: Jakub Kicinski @ 2020-11-13 23:54 UTC (permalink / raw)
  To: Andrea Mayer
  Cc: David Ahern, Stefano Salsano, David S. Miller, David Ahern,
	Alexey Kuznetsov, Hideaki YOSHIFUJI, Shuah Khan,
	Shrijeet Mukherjee, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, netdev, linux-kernel, linux-kselftest,
	Paolo Lungaroni, Ahmed Abdelsalam

On Sat, 14 Nov 2020 00:00:24 +0100 Andrea Mayer wrote:
> On Fri, 13 Nov 2020 13:40:10 -0800
> Jakub Kicinski <kuba@kernel.org> wrote:
> 
> > On Fri, 13 Nov 2020 11:40:36 -0800 Jakub Kicinski wrote:  
> > > > agreed. The v6 variant has existed for a while. The v4 version is
> > > > independent.    
> > > 
> > > Okay, I'm not sure what's the right call so I asked DaveM.  
> > 
> > DaveM raised a concern that unless we implement v6 now we can't be sure
> > the interface we create for v4 is going to fit there.
> > 
> > So Andrea unless it's a major hurdle, could you take a stab at the v6
> > version with VRFs as part of this series?  
> 
> I can tackle the v6 version but how do we face the compatibility issue raised
> by Stefano in his message?
> 
> if it is ok to implement a uAPI that breaks the existing scripts, it is relatively
> easy to replicate the VRF-based approach also in v6.

We need to keep existing End.DT6 as is, and add a separate
implementation.

The way to distinguish between the two could be either by passing via
netlink a flag attribute (which would request use of VRF in both
cases); using a different attribute than SEG6_LOCAL_TABLE for the
table id (or perhaps passing VRF's ifindex instead), e.g.
SEG6_LOCAL_TABLE_VRF; or adding a new command
(SEG6_LOCAL_ACTION_END_DT6_VRF) which would behave like End.DT4.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-13 23:54                     ` Jakub Kicinski
@ 2020-11-14  1:50                       ` Andrea Mayer
  2020-11-14  2:01                         ` Jakub Kicinski
  0 siblings, 1 reply; 34+ messages in thread
From: Andrea Mayer @ 2020-11-14  1:50 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David Ahern, Stefano Salsano, David S. Miller, David Ahern,
	Alexey Kuznetsov, Hideaki YOSHIFUJI, Shuah Khan,
	Shrijeet Mukherjee, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, netdev, linux-kernel, linux-kselftest,
	Paolo Lungaroni, Ahmed Abdelsalam, Andrea Mayer

Hi Jakub,
Please see my responses inline:

On Fri, 13 Nov 2020 15:54:37 -0800
Jakub Kicinski <kuba@kernel.org> wrote:

> On Sat, 14 Nov 2020 00:00:24 +0100 Andrea Mayer wrote:
> > On Fri, 13 Nov 2020 13:40:10 -0800
> > Jakub Kicinski <kuba@kernel.org> wrote:
> > 
> > I can tackle the v6 version but how do we face the compatibility issue raised
> > by Stefano in his message?
> > 
> > if it is ok to implement a uAPI that breaks the existing scripts, it is relatively
> > easy to replicate the VRF-based approach also in v6.
> 
> We need to keep existing End.DT6 as is, and add a separate
> implementation.

ok

>
> The way to distinguish between the two could be either by

> 1) passing via
> netlink a flag attribute (which would request use of VRF in both
> cases);

yes, feasible... see UAPI solution 1

> 2) using a different attribute than SEG6_LOCAL_TABLE for the
> table id (or perhaps passing VRF's ifindex instead), e.g.
> SEG6_LOCAL_TABLE_VRF;

yes, feasible... see UAPI solution 2

> 3) or adding a new command
> (SEG6_LOCAL_ACTION_END_DT6_VRF) which would behave like End.DT4.

no, we prefer not to add a new command, because it is better to keep a 
semantic one-to-one relationship between these commands and the SRv6 
behaviors defined in the draft.

UAPI solution 1

We add a new parameter "vrfmode". DT4 can only be used with the
"vrfmode" parameter (hence it is a required parameter for DT4).
DT6 can be used with "vrfmode" (new VRF-based mode) or without "vrfmode"
(legacy mode), hence "vrfmode" is an optional parameter for DT6.

UAPI solution 1 examples:

ip -6 route add 2001:db8::1/128 encap seg6local action End.DT4 vrfmode table 100 dev eth0
ip -6 route add 2001:db8::1/128 encap seg6local action End.DT6 vrfmode table 100 dev eth0
ip -6 route add 2001:db8::1/128 encap seg6local action End.DT6 table 100 dev eth0

UAPI solution 2

We turn "table" into an optional parameter and we add the optional
"vrftable" parameter. DT4 can only be used with "vrftable" (hence it is a
required parameter for DT4).
DT6 can be used with "vrftable" (new VRF mode) or with "table" (legacy mode),
hence each of the two is an optional parameter for DT6.

UAPI solution 2 examples:

ip -6 route add 2001:db8::1/128 encap seg6local action End.DT4 vrftable 100 dev eth0
ip -6 route add 2001:db8::1/128 encap seg6local action End.DT6 vrftable 100 dev eth0
ip -6 route add 2001:db8::1/128 encap seg6local action End.DT6 table 100 dev eth0

IMO solution 2 is nicer from the UAPI POV because we always have only one
parameter. Solution 1 may be slightly easier to implement. All in all,
we prefer solution 2, but we can go for 1 if you prefer.
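
To make solution 2 concrete, here is a minimal C sketch of the attribute
check it implies: DT4 requires "vrftable", while DT6 accepts exactly one of
"table" (legacy mode) or "vrftable" (VRF mode). All identifiers (demo_*,
DEMO_*) are hypothetical and are not the kernel's uAPI names; this is only
an illustration of the rules stated above, not the code of this patchset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical identifiers for illustration; not the kernel's uAPI values. */
enum demo_attr {
	DEMO_ATTR_TABLE    = 1,	/* legacy "table" parameter */
	DEMO_ATTR_VRFTABLE = 2,	/* proposed "vrftable" parameter */
};

enum demo_action {
	DEMO_ACTION_END_DT4,
	DEMO_ACTION_END_DT6,
};

#define DEMO_BIT(a) (1u << (a))

struct demo_behavior {
	enum demo_action action;
	uint32_t required;	/* attributes that must always be present */
	uint32_t one_of;	/* exactly one of these must be present */
};

static const struct demo_behavior demo_behaviors[] = {
	{ DEMO_ACTION_END_DT4, DEMO_BIT(DEMO_ATTR_VRFTABLE), 0 },
	{ DEMO_ACTION_END_DT6, 0,
	  DEMO_BIT(DEMO_ATTR_TABLE) | DEMO_BIT(DEMO_ATTR_VRFTABLE) },
};

/* Return true if the attribute bitmask is acceptable for the action. */
bool demo_attrs_valid(enum demo_action action, uint32_t attrs)
{
	unsigned int i;

	for (i = 0; i < sizeof(demo_behaviors) / sizeof(demo_behaviors[0]); i++) {
		const struct demo_behavior *b = &demo_behaviors[i];
		uint32_t picked;

		if (b->action != action)
			continue;
		/* Reject attributes the behavior does not know about. */
		if (attrs & ~(b->required | b->one_of))
			return false;
		/* Every required attribute must be supplied. */
		if ((attrs & b->required) != b->required)
			return false;
		/* If alternatives exist, exactly one of them may be set. */
		picked = attrs & b->one_of;
		if (b->one_of && (picked == 0 || (picked & (picked - 1))))
			return false;
		return true;
	}
	return false;	/* unknown action */
}

/* True when the VRF-based lookup was requested. */
bool demo_uses_vrf(uint32_t attrs)
{
	return (attrs & DEMO_BIT(DEMO_ATTR_VRFTABLE)) != 0;
}
```

With this layout the legacy "End.DT6 table 100" form keeps validating as
before, which is the compatibility property requested for existing scripts.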

Waiting for your advice!

Thanks,
Andrea

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-14  1:50                       ` Andrea Mayer
@ 2020-11-14  2:01                         ` Jakub Kicinski
  2020-11-14  2:29                           ` Andrea Mayer
  0 siblings, 1 reply; 34+ messages in thread
From: Jakub Kicinski @ 2020-11-14  2:01 UTC (permalink / raw)
  To: Andrea Mayer
  Cc: David Ahern, Stefano Salsano, David S. Miller, David Ahern,
	Alexey Kuznetsov, Hideaki YOSHIFUJI, Shuah Khan,
	Shrijeet Mukherjee, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, netdev, linux-kernel, linux-kselftest,
	Paolo Lungaroni, Ahmed Abdelsalam

On Sat, 14 Nov 2020 02:50:58 +0100 Andrea Mayer wrote:
> Hi Jakub,
> Please see my responses inline:
> 
> On Fri, 13 Nov 2020 15:54:37 -0800
> Jakub Kicinski <kuba@kernel.org> wrote:
> 
> > On Sat, 14 Nov 2020 00:00:24 +0100 Andrea Mayer wrote:  
> > > On Fri, 13 Nov 2020 13:40:10 -0800
> > > Jakub Kicinski <kuba@kernel.org> wrote:
> > > 
> > > I can tackle the v6 version but how do we face the compatibility issue raised
> > > by Stefano in his message?
> > > 
> > > if it is ok to implement a uAPI that breaks the existing scripts, it is relatively
> > > easy to replicate the VRF-based approach also in v6.  
> > 
> > We need to keep existing End.DT6 as is, and add a separate
> > implementation.  
> 
> ok
> 
> >
> > The way to distinguish between the two could be either by  
> 
> > 1) passing via
> > netlink a flag attribute (which would request use of VRF in both
> > cases);  
> 
> yes, feasible... see UAPI solution 1
> 
> > 2) using a different attribute than SEG6_LOCAL_TABLE for the
> > table id (or perhaps passing VRF's ifindex instead), e.g.
> > SEG6_LOCAL_TABLE_VRF;  
> 
> yes, feasible... see UAPI solution 2
> 
> > 3) or adding a new command
> > (SEG6_LOCAL_ACTION_END_DT6_VRF) which would behave like End.DT4.  
> 
> no, we prefer not to add a new command, because it is better to keep a 
> semantic one-to-one relationship between these commands and the SRv6 
> behaviors defined in the draft.
> 
> 
> UAPI solution 1
> 
> we add a new parameter "vrfmode". DT4 can only be used with the 
> vrfmode parameter (hence it is a required parameter for DT4).
> DT6 can be used with "vrfmode" (new vrf based mode) or without "vrfmode" 
> (legacy mode)(hence "vrfmode" is an optional parameter for DT6)
> 
> UAPI solution 1 examples:
> 
> ip -6 route add 2001:db8::1/128 encap seg6local action End.DT4 vrfmode table 100 dev eth0
> ip -6 route add 2001:db8::1/128 encap seg6local action End.DT6 vrfmode table 100 dev eth0
> ip -6 route add 2001:db8::1/128 encap seg6local action End.DT6 table 100 dev eth0
> 
> UAPI solution 2
> 
> we turn "table" into an optional parameter and we add the "vrftable" optional
> parameter. DT4 can only be used with the "vrftable" (hence it is a required
> parameter for DT4).
> DT6 can be used with "vrftable" (new vrf mode) or with "table" (legacy mode)
> (hence it is an optional parameter for DT6).
> 
> UAPI solution 2 examples:
> 
> ip -6 route add 2001:db8::1/128 encap seg6local action End.DT4 vrftable 100 dev eth0
> ip -6 route add 2001:db8::1/128 encap seg6local action End.DT6 vrftable 100 dev eth0
> ip -6 route add 2001:db8::1/128 encap seg6local action End.DT6 table 100 dev eth0
> 
> IMO solution 2 is nicer from UAPI POV because we always have only one 
> parameter, maybe solution 1 is slightly easier to implement, all in all 
> we prefer solution 2 but we can go for 1 if you prefer.

Agreed, 2 looks better to me as well. But let's not conflate uABI with
iproute2's command line. I'm more concerned about the kernel ABI.

BTW you prefer to operate on tables (and therefore require
net.vrf.strict_mode=1) because that's closer to the spirit of the RFC,
correct? As I said from the implementation perspective passing any VRF
ifindex down from user space to the kernel should be fine?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-14  2:01                         ` Jakub Kicinski
@ 2020-11-14  2:29                           ` Andrea Mayer
  2020-11-14  2:52                             ` David Ahern
  0 siblings, 1 reply; 34+ messages in thread
From: Andrea Mayer @ 2020-11-14  2:29 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David Ahern, Stefano Salsano, David S. Miller, David Ahern,
	Alexey Kuznetsov, Hideaki YOSHIFUJI, Shuah Khan,
	Shrijeet Mukherjee, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, netdev, linux-kernel, linux-kselftest,
	Paolo Lungaroni, Ahmed Abdelsalam, Andrea Mayer

Hi Jakub,

On Fri, 13 Nov 2020 18:01:26 -0800
Jakub Kicinski <kuba@kernel.org> wrote:

> > UAPI solution 2
> > 
> > we turn "table" into an optional parameter and we add the "vrftable" optional
> > parameter. DT4 can only be used with the "vrftable" (hence it is a required
> > parameter for DT4).
> > DT6 can be used with "vrftable" (new vrf mode) or with "table" (legacy mode)
> > (hence it is an optional parameter for DT6).
> > 
> > UAPI solution 2 examples:
> > 
> > ip -6 route add 2001:db8::1/128 encap seg6local action End.DT4 vrftable 100 dev eth0
> > ip -6 route add 2001:db8::1/128 encap seg6local action End.DT6 vrftable 100 dev eth0
> > ip -6 route add 2001:db8::1/128 encap seg6local action End.DT6 table 100 dev eth0
> > 
> > IMO solution 2 is nicer from UAPI POV because we always have only one 
> > parameter, maybe solution 1 is slightly easier to implement, all in all 
> > we prefer solution 2 but we can go for 1 if you prefer.
> 
> Agreed, 2 looks better to me as well. But let's not conflate uABI with
> iproute2's command line. I'm more concerned about the kernel ABI.

Sorry, I was a little imprecise here: I reported only the user command
perspective. From the kernel point of view, in solution 2 the vrftable will be
a new optional parameter, [SEG6_LOCAL_VRFTABLE].

> BTW you prefer to operate on tables (and therefore require
> net.vrf.strict_mode=1) because that's closer to the spirit of the RFC,
> correct? As I said from the implementation perspective passing any VRF
> ifindex down from user space to the kernel should be fine?

Yes, I definitely prefer to operate on tables (and so on the table ID) because
that is closer to the spirit of the RFC. We discussed this design choice in
depth with David Ahern when implementing the DT4 patch, and we are confident
that operating with VRF strict mode is a sound approach for DT6 as well.

Thanks,
Andrea

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-14  2:29                           ` Andrea Mayer
@ 2020-11-14  2:52                             ` David Ahern
  0 siblings, 0 replies; 34+ messages in thread
From: David Ahern @ 2020-11-14  2:52 UTC (permalink / raw)
  To: Andrea Mayer, Jakub Kicinski
  Cc: Stefano Salsano, David S. Miller, David Ahern, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Shuah Khan, Shrijeet Mukherjee,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, netdev, linux-kernel, linux-kselftest, Paolo Lungaroni,
	Ahmed Abdelsalam

On 11/13/20 7:29 PM, Andrea Mayer wrote:
> Hi Jakub,
> 
> On Fri, 13 Nov 2020 18:01:26 -0800
> Jakub Kicinski <kuba@kernel.org> wrote:
> 
>>> UAPI solution 2
>>>
>>> we turn "table" into an optional parameter and we add the "vrftable" optional
>>> parameter. DT4 can only be used with the "vrftable" (hence it is a required
>>> parameter for DT4).
>>> DT6 can be used with "vrftable" (new vrf mode) or with "table" (legacy mode)
>>> (hence it is an optional parameter for DT6).
>>>
>>> UAPI solution 2 examples:
>>>
>>> ip -6 route add 2001:db8::1/128 encap seg6local action End.DT4 vrftable 100 dev eth0
>>> ip -6 route add 2001:db8::1/128 encap seg6local action End.DT6 vrftable 100 dev eth0
>>> ip -6 route add 2001:db8::1/128 encap seg6local action End.DT6 table 100 dev eth0
>>>
>>> IMO solution 2 is nicer from UAPI POV because we always have only one 
>>> parameter, maybe solution 1 is slightly easier to implement, all in all 
>>> we prefer solution 2 but we can go for 1 if you prefer.
>>
>> Agreed, 2 looks better to me as well. But let's not conflate uABI with
>> iproute2's command line. I'm more concerned about the kernel ABI.
> 
> Sorry I was a little imprecise here. I reported only the user command perspective.
> From the kernel point of view in solution 2 the vrftable will be a new
> [SEG6_LOCAL_VRFTABLE] optional parameter.
> 
>> BTW you prefer to operate on tables (and therefore require
>> net.vrf.strict_mode=1) because that's closer to the spirit of the RFC,
>> correct? As I said from the implementation perspective passing any VRF
>> ifindex down from user space to the kernel should be fine?
> 
> Yes, I definitely prefer to operate on tables (and so on the table ID) due to
> the spirit of the RFC. We have discussed in depth this design choice with
> David Ahern when implementing the DT4 patch and we are confident that operating
> with VRF strict mode is a sound approach also for DT6. 
> 

I like the vrftable option. Straightforward extension from current table
argument.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-13 19:00         ` Nathan Chancellor
@ 2020-11-14  3:37           ` David Ahern
  0 siblings, 0 replies; 34+ messages in thread
From: David Ahern @ 2020-11-14  3:37 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Jakub Kicinski, kernel test robot, Andrea Mayer, David S. Miller,
	David Ahern, Alexey Kuznetsov, Hideaki YOSHIFUJI, Shuah Khan,
	Shrijeet Mukherjee, Alexei Starovoitov, Daniel Borkmann,
	kbuild-all, clang-built-linux, netdev

On 11/13/20 12:00 PM, Nathan Chancellor wrote:
> On Fri, Nov 13, 2020 at 10:05:56AM -0700, David Ahern wrote:
>> On 11/13/20 9:57 AM, Jakub Kicinski wrote:
>>> Good people of build bot, 
>>>
>>> would you mind shedding some light on this one? It was also reported on
>>> v1, and Andrea said it's impossible to repro. Strange that build bot
>>> would make the same mistake twice, tho.
>>>
>>
>> I kicked off a build this morning using Andrea's patches and the config
>> from the build bot; builds fine as long as the first 3 patches are applied.
>>
> 
> I can confirm this as well with clang; if I applied the first three
> patches then this one, there is no error but if you just apply this one,
> there will be. If you open the GitHub URL, it shows just this patch
> applied, not the first three, which explains it.
> 
> For what it's worth, b4 chokes over this series:

Thanks, Nathan. I'll forward to Konstantin.

> 
> $ b4 am -o - 20201107153139.3552-1-andrea.mayer@uniroma2.it | git am
> Looking up https://lore.kernel.org/r/20201107153139.3552-1-andrea.mayer%40uniroma2.it
> Grabbing thread from lore.kernel.org/linux-kselftest
> Analyzing 18 messages in the thread
> ---
> Writing /tmp/tmp8425by7fb4-am-stdout
>   [net-next,v2,3/5] seg6: add callbacks for customizing the creation/destruction of a behavior
> ---
> Total patches: 1
> ---
>  Link: https://lore.kernel.org/r/20201107153139.3552-1-andrea.mayer@uniroma2.it
>  Base: not found
> ---
> Applying: seg6: add callbacks for customizing the creation/destruction of a behavior
> error: patch failed: net/ipv6/seg6_local.c:1015
> error: net/ipv6/seg6_local.c: patch does not apply
> Patch failed at 0001 seg6: add callbacks for customizing the creation/destruction of a behavior
> hint: Use 'git am --show-current-patch=diff' to see the failed patch
> When you have resolved this problem, run "git am --continue".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".
> 
> Even if I grab the mbox from lore.kernel.org, it tries to do the same
> thing and apply the 3rd patch first, which might explain why the 0day
> bot got confused.
> 
> Cheers,
> Nathan
> 
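
For reference, the two errors in the robot report are what one would expect
when patch 4/5 is compiled without patch 3/5. Below is a minimal C sketch,
with hypothetical demo_* names standing in for the seg6 structures (it is not
the kernel code). The designated initializer compiles only because the struct
declares slwt_ops. If that member is removed, the initializer is rejected, and
the array may then be left with an incomplete type, which would also account
for the second error at ARRAY_SIZE().

```c
#include <stddef.h>

#define DEMO_ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical stand-in for the per-behavior callbacks. */
struct demo_lwt_ops {
	int (*build_state)(void);
};

struct demo_action_desc {
	int action;
	/*
	 * Stand-in for the member that the earlier patch introduces.
	 * Without it, the ".slwt_ops = { ... }" initializer below
	 * does not compile.
	 */
	struct demo_lwt_ops slwt_ops;
};

static int demo_dt4_build(void)
{
	return 0;
}

static struct demo_action_desc demo_action_table[] = {
	{ .action = 1 },
	{ .action = 2, .slwt_ops = { .build_state = demo_dt4_build } },
};

/* Linear lookup over the table, in the style of __get_action_desc(). */
struct demo_action_desc *demo_get_action_desc(int action)
{
	size_t i;

	for (i = 0; i < DEMO_ARRAY_SIZE(demo_action_table); i++) {
		if (demo_action_table[i].action == action)
			return &demo_action_table[i];
	}

	return NULL;
}
```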


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [kbuild-all] Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-13 16:57     ` Jakub Kicinski
  2020-11-13 17:05       ` David Ahern
@ 2020-11-23  1:13       ` Rong Chen
  2020-11-23 17:19         ` Jakub Kicinski
  1 sibling, 1 reply; 34+ messages in thread
From: Rong Chen @ 2020-11-23  1:13 UTC (permalink / raw)
  To: Jakub Kicinski, kernel test robot
  Cc: Andrea Mayer, David S. Miller, David Ahern, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Shuah Khan, Shrijeet Mukherjee,
	Alexei Starovoitov, Daniel Borkmann, kbuild-all,
	clang-built-linux, netdev

Hi Jakub,

Sorry for the inconvenience. We have optimized the build bot to avoid
this situation.

Best Regards,
Rong Chen

On 11/14/20 12:57 AM, Jakub Kicinski wrote:
> Good people of build bot,
>
> would you mind shedding some light on this one? It was also reported on
> v1, and Andrea said it's impossible to repro. Strange that build bot
> would make the same mistake twice, tho.
>
> Thanks!
>
> On Fri, 13 Nov 2020 17:23:09 +0800 kernel test robot wrote:
>> Hi Andrea,
>>
>> Thank you for the patch! Yet something to improve:
>>
>> [auto build test ERROR on ipvs/master]
>> [also build test ERROR on linus/master sparc-next/master v5.10-rc3 next-20201112]
>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>> And when submitting patch, we suggest to use '--base' as documented in
>> https://git-scm.com/docs/git-format-patch]
>>
>> url:    https://github.com/0day-ci/linux/commits/Andrea-Mayer/seg6-add-support-for-the-SRv6-End-DT4-behavior/20201109-093019
>> base:   https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs.git master
>> config: x86_64-randconfig-a005-20201111 (attached as .config)
>> compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 874b0a0b9db93f5d3350ffe6b5efda2d908415d0)
>> reproduce (this is a W=1 build):
>>          wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>>          chmod +x ~/bin/make.cross
>>          # install x86_64 cross compiling tool for clang build
>>          # apt-get install binutils-x86-64-linux-gnu
>>          # https://github.com/0day-ci/linux/commit/761138e2f757ac64efe97b03311c976db242dc92
>>          git remote add linux-review https://github.com/0day-ci/linux
>>          git fetch --no-tags linux-review Andrea-Mayer/seg6-add-support-for-the-SRv6-End-DT4-behavior/20201109-093019
>>          git checkout 761138e2f757ac64efe97b03311c976db242dc92
>>          # save the attached .config to linux build tree
>>          COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64
>>
>> If you fix the issue, kindly add following tag as appropriate
>> Reported-by: kernel test robot <lkp@intel.com>
>>
>> All errors (new ones prefixed by >>):
>>
>>>> net/ipv6/seg6_local.c:793:4: error: field designator 'slwt_ops' does not refer to any field in type 'struct seg6_action_desc'
>>                     .slwt_ops       = {
>>                      ^
>>>> net/ipv6/seg6_local.c:826:10: error: invalid application of 'sizeof' to an incomplete type 'struct seg6_action_desc []'
>>             count = ARRAY_SIZE(seg6_action_table);
>>                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>     include/linux/kernel.h:48:32: note: expanded from macro 'ARRAY_SIZE'
>>     #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
>>                                    ^~~~~
>>     2 errors generated.
>>
>> vim +793 net/ipv6/seg6_local.c
>>
>>     757	
>>     758	static struct seg6_action_desc seg6_action_table[] = {
>>     759		{
>>     760			.action		= SEG6_LOCAL_ACTION_END,
>>     761			.attrs		= 0,
>>     762			.input		= input_action_end,
>>     763		},
>>     764		{
>>     765			.action		= SEG6_LOCAL_ACTION_END_X,
>>     766			.attrs		= (1 << SEG6_LOCAL_NH6),
>>     767			.input		= input_action_end_x,
>>     768		},
>>     769		{
>>     770			.action		= SEG6_LOCAL_ACTION_END_T,
>>     771			.attrs		= (1 << SEG6_LOCAL_TABLE),
>>     772			.input		= input_action_end_t,
>>     773		},
>>     774		{
>>     775			.action		= SEG6_LOCAL_ACTION_END_DX2,
>>     776			.attrs		= (1 << SEG6_LOCAL_OIF),
>>     777			.input		= input_action_end_dx2,
>>     778		},
>>     779		{
>>     780			.action		= SEG6_LOCAL_ACTION_END_DX6,
>>     781			.attrs		= (1 << SEG6_LOCAL_NH6),
>>     782			.input		= input_action_end_dx6,
>>     783		},
>>     784		{
>>     785			.action		= SEG6_LOCAL_ACTION_END_DX4,
>>     786			.attrs		= (1 << SEG6_LOCAL_NH4),
>>     787			.input		= input_action_end_dx4,
>>     788		},
>>     789		{
>>     790			.action		= SEG6_LOCAL_ACTION_END_DT4,
>>     791			.attrs		= (1 << SEG6_LOCAL_TABLE),
>>     792			.input		= input_action_end_dt4,
>>   > 793			.slwt_ops	= {
>>     794						.build_state = seg6_end_dt4_build,
>>     795					  },
>>     796		},
>>     797		{
>>     798			.action		= SEG6_LOCAL_ACTION_END_DT6,
>>     799			.attrs		= (1 << SEG6_LOCAL_TABLE),
>>     800			.input		= input_action_end_dt6,
>>     801		},
>>     802		{
>>     803			.action		= SEG6_LOCAL_ACTION_END_B6,
>>     804			.attrs		= (1 << SEG6_LOCAL_SRH),
>>     805			.input		= input_action_end_b6,
>>     806		},
>>     807		{
>>     808			.action		= SEG6_LOCAL_ACTION_END_B6_ENCAP,
>>     809			.attrs		= (1 << SEG6_LOCAL_SRH),
>>     810			.input		= input_action_end_b6_encap,
>>     811			.static_headroom	= sizeof(struct ipv6hdr),
>>     812		},
>>     813		{
>>     814			.action		= SEG6_LOCAL_ACTION_END_BPF,
>>     815			.attrs		= (1 << SEG6_LOCAL_BPF),
>>     816			.input		= input_action_end_bpf,
>>     817		},
>>     818	
>>     819	};
>>     820	
>>     821	static struct seg6_action_desc *__get_action_desc(int action)
>>     822	{
>>     823		struct seg6_action_desc *desc;
>>     824		int i, count;
>>     825	
>>   > 826		count = ARRAY_SIZE(seg6_action_table);
>>     827		for (i = 0; i < count; i++) {
>>     828			desc = &seg6_action_table[i];
>>     829			if (desc->action == action)
>>     830				return desc;
>>     831		}
>>     832	
>>     833		return NULL;
>>     834	}
>>     835	
>>
>> ---
>> 0-DAY CI Kernel Test Service, Intel Corporation
>> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
> _______________________________________________
> kbuild-all mailing list -- kbuild-all@lists.01.org
> To unsubscribe send an email to kbuild-all-leave@lists.01.org


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [kbuild-all] Re: [net-next,v2,4/5] seg6: add support for the SRv6 End.DT4 behavior
  2020-11-23  1:13       ` [kbuild-all] " Rong Chen
@ 2020-11-23 17:19         ` Jakub Kicinski
  0 siblings, 0 replies; 34+ messages in thread
From: Jakub Kicinski @ 2020-11-23 17:19 UTC (permalink / raw)
  To: Rong Chen
  Cc: kernel test robot, Andrea Mayer, David S. Miller, David Ahern,
	Alexey Kuznetsov, Hideaki YOSHIFUJI, Shuah Khan,
	Shrijeet Mukherjee, Alexei Starovoitov, Daniel Borkmann,
	kbuild-all, clang-built-linux, netdev

On Mon, 23 Nov 2020 09:13:49 +0800 Rong Chen wrote:
> Sorry for the inconvenience, we have optimized the build bot to avoid 
> this situation.

Great to hear, thank you!

^ permalink raw reply	[flat|nested] 34+ messages in thread

