All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/9] net: add support for IPv6 Segment Routing
@ 2016-10-17 14:42 David Lebrun
  2016-10-17 14:42 ` [PATCH 1/9] ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header) David Lebrun
                   ` (9 more replies)
  0 siblings, 10 replies; 29+ messages in thread
From: David Lebrun @ 2016-10-17 14:42 UTC (permalink / raw)
  To: netdev; +Cc: David Lebrun

Segment Routing (SR) is a source routing paradigm, architecturally
defined in draft-ietf-spring-segment-routing-09 [1]. The IPv6 flavor of
SR is defined in draft-ietf-6man-segment-routing-header-02 [2].

The main idea is that an SR-enabled packet contains a list of segments,
which represent mandatory waypoints. Each waypoint is called a segment
endpoint. The SR-enabled packet is routed normally (e.g. shortest path)
between the segment endpoints. A node that inserts an SRH into a packet
is called an ingress node, and a node that is the last segment endpoint
is called an egress node.

>From an IPv6 viewpoint, an SR-enabled packet contains an IPv6 extension
header, which is a Routing Header type 4, defined as follows:

struct ipv6_sr_hdr {
        __u8    nexthdr;
        __u8    hdrlen;
        __u8    type;
        __u8    segments_left;
        __u8    first_segment;
        __be16  flags;
        __u8    reserved;

        struct in6_addr segments[0];
} __attribute__((packed));

The first 4 bytes of the SRH is consistent with the Routing Header
definition in RFC 2460. The type is set to `4' (SRH).

Each segment is encoded as an IPv6 address. The segments are encoded in
reverse order: segments[0] is the last segment of the path, and
segments[first_segment] is the first segment of the path.

segments[segments_left] points to the currently active segment and
segments_left is decremented at each segment endpoint.

There exist two ways for a packet to receive an SRH, we call them
encap mode and inline mode. In the encap mode, the packet is encapsulated
in an outer IPv6 header that contains the SRH. The inner (original) packet
is not modified. A virtual tunnel is thus created between the ingress node
(the node that encapsulates) and the egress node (the last segment of the path).
Once an encapsulated SR packet reaches the egress node, the node decapsulates
the packet and performs a routing decision on the inner packet. This kind of
SRH insertion is intended to use for routers that encapsulates in-transit
packet.

The second SRH insertion method, the inline mode, acts by directly inserting
the SRH right after the IPv6 header of the original packet. For this method,
if a particular flag (SR6_FLAG_CLEANUP) is set, then the penultimate segment
endpoint must strip the SRH from the packet before forwarding it to the last
segment endpoint. This insertion method is intended to use for endhosts,
however it is also used for in-transit packets by some industry actors.

Finally, the SRH may contain TLVs after the segments list. Several types of
TLVs are defined, but we currently consider only the HMAC TLV. This TLV is
an answer to the deprecation of the RH0 and enables to ensure the authenticity
and integrity of the SRH. The HMAC text contains the flags, the first_segment
index, the full list of segments, and the source address of the packet. While
SR is intended to use mostly within a single administrative domain, the HMAC
TLV allows to verify SR packets coming from an untrusted source.

This patches series implements support for the IPv6 flavor of SR and is
logically divided into the following components:

        (1) Data plane support (patch 01). This patch adds a function
            in net/ipv6/exthdrs.c to handle the Routing Header type 4.
            It enables the kernel to act as a segment endpoint, by supporting
            the following operations: decrementation of the segments_left field,
            cleanup flag support (removal of the SRH if we are the penultimate
            segment endpoint) and decapsulation of the inner packet as an egress
            node.
        (2) Control plane support (patches 02..03 and 07..09). These patches enables
            to insert SRH on locally emitted and/or forwarded packets, both with
            encap mode and with inline mode. The SRH insertion is controlled through
            the lightweight tunnels mechanism. Furthermore, patch 08 enables the
            applications to insert an SRH on a per-socket basis, through the
            setsockopt() system call. The mechanism to specify a per-socket
            Routing Header was already defined for RH0 and no special modification
            was performed on this side. However, the code to actually push the RH
            onto the packets had to be adapted for the SRH specifications.
        (3) HMAC support (patches 04..06). These patches adds the support of the
            HMAC TLV verification for the dataplane part, and generation for
            the control plane part. Two hashing algorithms are supported
            (SHA-1 as legacy and SHA-256 as required by the IETF draft), but
            additional algorithms can be easily supported by simply adding an
            entry into an array.

[1] https://tools.ietf.org/html/draft-ietf-spring-segment-routing-09
[2] https://tools.ietf.org/html/draft-ietf-6man-segment-routing-header-02

David Lebrun (9):
  ipv6: implement dataplane support for rthdr type 4 (Segment Routing
    Header)
  ipv6: sr: add code base for control plane support of SR-IPv6
  ipv6: sr: add support for SRH encapsulation and injection with
    lwtunnels
  ipv6: sr: add core files for SR HMAC support
  ipv6: sr: implement API to control SR HMAC structures
  ipv6: sr: add calls to verify and insert HMAC signatures
  ipv6: add source address argument for ipv6_push_nfrag_opts
  ipv6: sr: add support for SRH injection through setsockopt
  ipv6: sr: add documentation file for per-interface sysctls

 Documentation/networking/seg6-sysctl.txt |  18 ++
 include/linux/ipv6.h                     |   4 +
 include/linux/seg6.h                     |   6 +
 include/linux/seg6_genl.h                |   6 +
 include/linux/seg6_hmac.h                |   6 +
 include/linux/seg6_iptunnel.h            |   6 +
 include/net/ipv6.h                       |   3 +-
 include/net/netns/ipv6.h                 |   3 +
 include/net/seg6.h                       |  51 ++++
 include/net/seg6_hmac.h                  |  62 +++++
 include/uapi/linux/ipv6.h                |   3 +
 include/uapi/linux/lwtunnel.h            |   1 +
 include/uapi/linux/seg6.h                |  46 ++++
 include/uapi/linux/seg6_genl.h           |  33 +++
 include/uapi/linux/seg6_hmac.h           |  20 ++
 include/uapi/linux/seg6_iptunnel.h       |  33 +++
 net/core/lwtunnel.c                      |   2 +
 net/ipv6/Kconfig                         |  13 +
 net/ipv6/Makefile                        |   1 +
 net/ipv6/addrconf.c                      |  28 ++
 net/ipv6/exthdrs.c                       | 217 +++++++++++++++-
 net/ipv6/ip6_output.c                    |   5 +-
 net/ipv6/ip6_tunnel.c                    |   2 +-
 net/ipv6/ipv6_sockglue.c                 |   4 +
 net/ipv6/seg6.c                          | 363 ++++++++++++++++++++++++++
 net/ipv6/seg6_hmac.c                     | 432 +++++++++++++++++++++++++++++++
 net/ipv6/seg6_iptunnel.c                 | 328 +++++++++++++++++++++++
 27 files changed, 1686 insertions(+), 10 deletions(-)
 create mode 100644 Documentation/networking/seg6-sysctl.txt
 create mode 100644 include/linux/seg6.h
 create mode 100644 include/linux/seg6_genl.h
 create mode 100644 include/linux/seg6_hmac.h
 create mode 100644 include/linux/seg6_iptunnel.h
 create mode 100644 include/net/seg6.h
 create mode 100644 include/net/seg6_hmac.h
 create mode 100644 include/uapi/linux/seg6.h
 create mode 100644 include/uapi/linux/seg6_genl.h
 create mode 100644 include/uapi/linux/seg6_hmac.h
 create mode 100644 include/uapi/linux/seg6_iptunnel.h
 create mode 100644 net/ipv6/seg6.c
 create mode 100644 net/ipv6/seg6_hmac.c
 create mode 100644 net/ipv6/seg6_iptunnel.c

-- 
2.7.3

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/9] ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header)
  2016-10-17 14:42 [PATCH 0/9] net: add support for IPv6 Segment Routing David Lebrun
@ 2016-10-17 14:42 ` David Lebrun
  2016-10-17 14:57   ` David Miller
                     ` (2 more replies)
  2016-10-17 14:42 ` [PATCH 2/9] ipv6: sr: add code base for control plane support of SR-IPv6 David Lebrun
                   ` (8 subsequent siblings)
  9 siblings, 3 replies; 29+ messages in thread
From: David Lebrun @ 2016-10-17 14:42 UTC (permalink / raw)
  To: netdev; +Cc: David Lebrun

Implement minimal support for processing of SR-enabled packets
as described in
https://tools.ietf.org/html/draft-ietf-6man-segment-routing-header-02.

This patch implements the following operations:
- Intermediate segment endpoint: incrementation of active segment and rerouting.
- Egress for SR-encapsulated packets: decapsulation of outer IPv6 header + SRH
  and routing of inner packet.
- Cleanup flag support for SR-inlined packets: removal of SRH if we are the
  penultimate segment endpoint.

A per-interface sysctl seg6_enabled is provided, to accept/deny SR-enabled
packets. Default is deny.

This patch does not provide support for HMAC-signed packets.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
---
 include/linux/ipv6.h      |   3 +
 include/linux/seg6.h      |   6 ++
 include/uapi/linux/ipv6.h |   2 +
 include/uapi/linux/seg6.h |  46 +++++++++++++++
 net/ipv6/Kconfig          |  13 +++++
 net/ipv6/addrconf.c       |  18 ++++++
 net/ipv6/exthdrs.c        | 140 ++++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 228 insertions(+)
 create mode 100644 include/linux/seg6.h
 create mode 100644 include/uapi/linux/seg6.h

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 7e9a789..75395ad 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -64,6 +64,9 @@ struct ipv6_devconf {
 	} stable_secret;
 	__s32		use_oif_addrs_only;
 	__s32		keep_addr_on_down;
+#ifdef CONFIG_IPV6_SEG6
+	__s32		seg6_enabled;
+#endif
 
 	struct ctl_table_header *sysctl_header;
 };
diff --git a/include/linux/seg6.h b/include/linux/seg6.h
new file mode 100644
index 0000000..7a66d2b
--- /dev/null
+++ b/include/linux/seg6.h
@@ -0,0 +1,6 @@
+#ifndef _LINUX_SEG6_H
+#define _LINUX_SEG6_H
+
+#include <uapi/linux/seg6.h>
+
+#endif
diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h
index 8c27723..7ff1d65 100644
--- a/include/uapi/linux/ipv6.h
+++ b/include/uapi/linux/ipv6.h
@@ -39,6 +39,7 @@ struct in6_ifreq {
 #define IPV6_SRCRT_STRICT	0x01	/* Deprecated; will be removed */
 #define IPV6_SRCRT_TYPE_0	0	/* Deprecated; will be removed */
 #define IPV6_SRCRT_TYPE_2	2	/* IPv6 type 2 Routing Header	*/
+#define IPV6_SRCRT_TYPE_4	4	/* Segment Routing with IPv6 */
 
 /*
  *	routing header
@@ -178,6 +179,7 @@ enum {
 	DEVCONF_DROP_UNSOLICITED_NA,
 	DEVCONF_KEEP_ADDR_ON_DOWN,
 	DEVCONF_RTR_SOLICIT_MAX_INTERVAL,
+	DEVCONF_SEG6_ENABLED,
 	DEVCONF_MAX
 };
 
diff --git a/include/uapi/linux/seg6.h b/include/uapi/linux/seg6.h
new file mode 100644
index 0000000..9f9e157
--- /dev/null
+++ b/include/uapi/linux/seg6.h
@@ -0,0 +1,46 @@
+/*
+ *  SR-IPv6 implementation
+ *
+ *  Author:
+ *  David Lebrun <david.lebrun@uclouvain.be>
+ *
+ *
+ *  This program is free software; you can redistribute it and/or
+ *      modify it under the terms of the GNU General Public License
+ *      as published by the Free Software Foundation; either version
+ *      2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _UAPI_LINUX_SEG6_H
+#define _UAPI_LINUX_SEG6_H
+
+/*
+ * SRH
+ */
+struct ipv6_sr_hdr {
+	__u8	nexthdr;
+	__u8	hdrlen;
+	__u8	type;
+	__u8	segments_left;
+	__u8	first_segment;
+	__be16	flags;
+	__u8	reserved;
+
+	struct in6_addr segments[0];
+} __attribute__((packed));
+
+#define SR6_FLAG_CLEANUP	(1 << 15)
+#define SR6_FLAG_PROTECTED	(1 << 14)
+#define SR6_FLAG_OAM		(1 << 13)
+#define SR6_FLAG_ALERT		(1 << 12)
+#define SR6_FLAG_HMAC		(1 << 11)
+
+#define SR6_TLV_INGRESS		1
+#define SR6_TLV_EGRESS		2
+#define SR6_TLV_OPAQUE		3
+#define SR6_TLV_PADDING		4
+#define SR6_TLV_HMAC		5
+
+#define sr_get_flags(srh) (be16_to_cpu((srh)->flags))
+
+#endif
diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig
index 2343e4f..691c318 100644
--- a/net/ipv6/Kconfig
+++ b/net/ipv6/Kconfig
@@ -289,4 +289,17 @@ config IPV6_PIMSM_V2
 	  Support for IPv6 PIM multicast routing protocol PIM-SMv2.
 	  If unsure, say N.
 
+config IPV6_SEG6
+	bool "IPv6: Segment Routing support"
+	depends on IPV6
+	select CRYPTO_HMAC
+	select CRYPTO_SHA1
+	select CRYPTO_SHA256
+	---help---
+	  Experimental support for IPv6 Segment Routing dataplane as defined
+	  in IETF draft-ietf-6man-segment-routing-header-02. This option
+	  enables the processing of SR-enabled packets allowing the kernel
+	  to act as a segment endpoint (intermediate or egress). It also
+	  enables an API for the kernel to act as an ingress SR router.
+
 endif # IPV6
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index d8983e1..42c0ffb 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -239,6 +239,9 @@ static struct ipv6_devconf ipv6_devconf __read_mostly = {
 	.use_oif_addrs_only	= 0,
 	.ignore_routes_with_linkdown = 0,
 	.keep_addr_on_down	= 0,
+#ifdef CONFIG_IPV6_SEG6
+	.seg6_enabled		= 0,
+#endif
 };
 
 static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
@@ -285,6 +288,9 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
 	.use_oif_addrs_only	= 0,
 	.ignore_routes_with_linkdown = 0,
 	.keep_addr_on_down	= 0,
+#ifdef CONFIG_IPV6_SEG6
+	.seg6_enabled		= 0,
+#endif
 };
 
 /* Check if a valid qdisc is available */
@@ -4965,6 +4971,9 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
 	array[DEVCONF_DROP_UNICAST_IN_L2_MULTICAST] = cnf->drop_unicast_in_l2_multicast;
 	array[DEVCONF_DROP_UNSOLICITED_NA] = cnf->drop_unsolicited_na;
 	array[DEVCONF_KEEP_ADDR_ON_DOWN] = cnf->keep_addr_on_down;
+#ifdef CONFIG_IPV6_SEG6
+	array[DEVCONF_SEG6_ENABLED] = cnf->seg6_enabled;
+#endif
 }
 
 static inline size_t inet6_ifla6_size(void)
@@ -6056,6 +6065,15 @@ static const struct ctl_table addrconf_sysctl[] = {
 		.proc_handler	= proc_dointvec,
 
 	},
+#ifdef CONFIG_IPV6_SEG6
+	{
+		.procname	= "seg6_enabled",
+		.data		= &ipv6_devconf.seg6_enabled,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
+#endif
 	{
 		/* sentinel */
 	}
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index 139ceb6..b31f811 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -47,6 +47,9 @@
 #if IS_ENABLED(CONFIG_IPV6_MIP6)
 #include <net/xfrm.h>
 #endif
+#ifdef CONFIG_IPV6_SEG6
+#include <linux/seg6.h>
+#endif
 
 #include <linux/uaccess.h>
 
@@ -286,6 +289,137 @@ static int ipv6_destopt_rcv(struct sk_buff *skb)
 	return -1;
 }
 
+#ifdef CONFIG_IPV6_SEG6
+static int ipv6_srh_rcv(struct sk_buff *skb)
+{
+	struct in6_addr *addr = NULL, *last_addr = NULL, *active_addr = NULL;
+	struct inet6_skb_parm *opt = IP6CB(skb);
+	struct net *net = dev_net(skb->dev);
+	struct ipv6_sr_hdr *hdr;
+	struct inet6_dev *idev;
+	int cleanup = 0;
+	int accept_seg6;
+
+	hdr = (struct ipv6_sr_hdr *)skb_transport_header(skb);
+
+	idev = __in6_dev_get(skb->dev);
+
+	accept_seg6 = net->ipv6.devconf_all->seg6_enabled;
+	if (accept_seg6 > idev->cnf.seg6_enabled)
+		accept_seg6 = idev->cnf.seg6_enabled;
+
+	if (!accept_seg6) {
+		kfree_skb(skb);
+		return -1;
+	}
+
+looped_back:
+	last_addr = hdr->segments;
+
+	if (hdr->segments_left > 0) {
+		if (hdr->nexthdr != NEXTHDR_IPV6 && hdr->segments_left == 1 &&
+		    sr_get_flags(hdr) & SR6_FLAG_CLEANUP)
+			cleanup = 1;
+	} else {
+		if (hdr->nexthdr == NEXTHDR_IPV6) {
+			int offset = (hdr->hdrlen + 1) << 3;
+
+			if (!pskb_pull(skb, offset)) {
+				kfree_skb(skb);
+				return -1;
+			}
+			skb_postpull_rcsum(skb, skb_transport_header(skb),
+					   offset);
+
+			skb_reset_network_header(skb);
+			skb_reset_transport_header(skb);
+			skb->encapsulation = 0;
+
+			__skb_tunnel_rx(skb, skb->dev, net);
+
+			netif_rx(skb);
+			return -1;
+		}
+
+		opt->srcrt = skb_network_header_len(skb);
+		opt->lastopt = opt->srcrt;
+		skb->transport_header += (hdr->hdrlen + 1) << 3;
+		opt->nhoff = (&hdr->nexthdr) - skb_network_header(skb);
+
+		return 1;
+	}
+
+	if (skb_cloned(skb)) {
+		if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC)) {
+			__IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)),
+					IPSTATS_MIB_OUTDISCARDS);
+			kfree_skb(skb);
+			return -1;
+		}
+	}
+
+	if (skb->ip_summed == CHECKSUM_COMPLETE)
+		skb->ip_summed = CHECKSUM_NONE;
+
+	hdr = (struct ipv6_sr_hdr *)skb_transport_header(skb);
+
+	active_addr = hdr->segments + hdr->segments_left;
+	hdr->segments_left--;
+	addr = hdr->segments + hdr->segments_left;
+
+	ipv6_hdr(skb)->daddr = *addr;
+
+	skb_push(skb, sizeof(struct ipv6hdr));
+
+	if (cleanup) {
+		int srhlen = (hdr->hdrlen + 1) << 3;
+		int nh = hdr->nexthdr;
+
+		memmove(skb_network_header(skb) + srhlen,
+			skb_network_header(skb),
+			(unsigned char *)hdr - skb_network_header(skb));
+		skb_pull(skb, srhlen);
+		skb->network_header += srhlen;
+		ipv6_hdr(skb)->nexthdr = nh;
+		ipv6_hdr(skb)->payload_len = htons(skb->len -
+						   sizeof(struct ipv6hdr));
+	}
+
+	skb_dst_drop(skb);
+
+	ip6_route_input(skb);
+
+	if (skb_dst(skb)->error) {
+		dst_input(skb);
+		return -1;
+	}
+
+	if (skb_dst(skb)->dev->flags & IFF_LOOPBACK) {
+		if (ipv6_hdr(skb)->hop_limit <= 1) {
+			__IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)),
+					IPSTATS_MIB_INHDRERRORS);
+			icmpv6_send(skb, ICMPV6_TIME_EXCEED,
+				    ICMPV6_EXC_HOPLIMIT, 0);
+			kfree_skb(skb);
+			return -1;
+		}
+		ipv6_hdr(skb)->hop_limit--;
+
+		/* be sure that srh is still present before reinjecting */
+		if (!cleanup) {
+			skb_pull(skb, sizeof(struct ipv6hdr));
+			goto looped_back;
+		}
+		skb_set_transport_header(skb, sizeof(struct ipv6hdr));
+		IP6CB(skb)->nhoff = offsetof(struct ipv6hdr, nexthdr);
+	}
+
+	dst_input(skb);
+
+	return -1;
+}
+#endif
+
 /********************************
   Routing header.
  ********************************/
@@ -326,6 +460,12 @@ static int ipv6_rthdr_rcv(struct sk_buff *skb)
 		return -1;
 	}
 
+#ifdef CONFIG_IPV6_SEG6
+	/* segment routing */
+	if (hdr->type == IPV6_SRCRT_TYPE_4)
+		return ipv6_srh_rcv(skb);
+#endif
+
 looped_back:
 	if (hdr->segments_left == 0) {
 		switch (hdr->type) {
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/9] ipv6: sr: add code base for control plane support of SR-IPv6
  2016-10-17 14:42 [PATCH 0/9] net: add support for IPv6 Segment Routing David Lebrun
  2016-10-17 14:42 ` [PATCH 1/9] ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header) David Lebrun
@ 2016-10-17 14:42 ` David Lebrun
  2016-10-17 15:00   ` David Miller
  2016-10-17 17:07   ` Tom Herbert
  2016-10-17 14:42 ` [PATCH 3/9] ipv6: sr: add support for SRH encapsulation and injection with lwtunnels David Lebrun
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 29+ messages in thread
From: David Lebrun @ 2016-10-17 14:42 UTC (permalink / raw)
  To: netdev; +Cc: David Lebrun

This patch adds the necessary hooks and structures to provide support
for SR-IPv6 control plane, essentially the Generic Netlink commands
that will be used for userspace control over the Segment Routing
kernel structures.

The genetlink commands provide control over two different structures:
tunnel source and HMAC data. The tunnel source is the source address
that will be used by default when encapsulating packets into an
outer IPv6 header + SRH. If the tunnel source is set to :: then an
address of the outgoing interface will be selected as the source.

The HMAC commands currently just return ENOTSUPP and will be implemented
in a future patch.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
---
 include/linux/seg6_genl.h      |   6 ++
 include/net/netns/ipv6.h       |   3 +
 include/net/seg6.h             |  41 ++++++++
 include/uapi/linux/seg6_genl.h |  33 +++++++
 net/ipv6/Makefile              |   1 +
 net/ipv6/seg6.c                | 212 +++++++++++++++++++++++++++++++++++++++++
 6 files changed, 296 insertions(+)
 create mode 100644 include/linux/seg6_genl.h
 create mode 100644 include/net/seg6.h
 create mode 100644 include/uapi/linux/seg6_genl.h
 create mode 100644 net/ipv6/seg6.c

diff --git a/include/linux/seg6_genl.h b/include/linux/seg6_genl.h
new file mode 100644
index 0000000..d6c3fb4f
--- /dev/null
+++ b/include/linux/seg6_genl.h
@@ -0,0 +1,6 @@
+#ifndef _LINUX_SEG6_GENL_H
+#define _LINUX_SEG6_GENL_H
+
+#include <uapi/linux/seg6_genl.h>
+
+#endif
diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
index 10d0848..67825bb 100644
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -85,6 +85,9 @@ struct netns_ipv6 {
 #endif
 	atomic_t		dev_addr_genid;
 	atomic_t		fib6_sernum;
+#ifdef CONFIG_IPV6_SEG6
+	struct seg6_pernet_data *seg6_data;
+#endif
 };
 
 #if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6)
diff --git a/include/net/seg6.h b/include/net/seg6.h
new file mode 100644
index 0000000..89b819e
--- /dev/null
+++ b/include/net/seg6.h
@@ -0,0 +1,41 @@
+/*
+ *  SR-IPv6 implementation
+ *
+ *  Author:
+ *  David Lebrun <david.lebrun@uclouvain.be>
+ *
+ *
+ *  This program is free software; you can redistribute it and/or
+ *      modify it under the terms of the GNU General Public License
+ *      as published by the Free Software Foundation; either version
+ *      2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _NET_SEG6_H
+#define _NET_SEG6_H
+
+#include <linux/net.h>
+#include <linux/ipv6.h>
+
+struct seg6_pernet_data {
+	struct mutex lock;
+	struct in6_addr __rcu *tun_src;
+};
+
+static inline struct seg6_pernet_data *seg6_pernet(struct net *net)
+{
+	return net->ipv6.seg6_data;
+}
+
+static inline void seg6_pernet_lock(struct net *net)
+{
+	mutex_lock(&seg6_pernet(net)->lock);
+}
+
+static inline void seg6_pernet_unlock(struct net *net)
+{
+	mutex_unlock(&seg6_pernet(net)->lock);
+}
+
+#endif
+
diff --git a/include/uapi/linux/seg6_genl.h b/include/uapi/linux/seg6_genl.h
new file mode 100644
index 0000000..4db988b
--- /dev/null
+++ b/include/uapi/linux/seg6_genl.h
@@ -0,0 +1,33 @@
+#ifndef _UAPI_LINUX_SEG6_GENL_H
+#define _UAPI_LINUX_SEG6_GENL_H
+
+#define SEG6_GENL_NAME		"SEG6"
+#define SEG6_GENL_VERSION	0x1
+
+enum {
+	SEG6_ATTR_UNSPEC,
+	SEG6_ATTR_DST,
+	SEG6_ATTR_DSTLEN,
+	SEG6_ATTR_HMACKEYID,
+	SEG6_ATTR_SECRET,
+	SEG6_ATTR_SECRETLEN,
+	SEG6_ATTR_ALGID,
+	SEG6_ATTR_HMACINFO,
+	__SEG6_ATTR_MAX,
+};
+
+#define SEG6_ATTR_MAX (__SEG6_ATTR_MAX - 1)
+
+enum {
+	SEG6_CMD_UNSPEC,
+	SEG6_CMD_SETHMAC,
+	SEG6_CMD_DUMPHMAC,
+	SEG6_CMD_SET_TUNSRC,
+	SEG6_CMD_GET_TUNSRC,
+	__SEG6_CMD_MAX,
+};
+
+#define SEG6_CMD_MAX (__SEG6_CMD_MAX - 1)
+
+#endif
+
diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
index c174ccb..5069257 100644
--- a/net/ipv6/Makefile
+++ b/net/ipv6/Makefile
@@ -44,6 +44,7 @@ obj-$(CONFIG_IPV6_SIT) += sit.o
 obj-$(CONFIG_IPV6_TUNNEL) += ip6_tunnel.o
 obj-$(CONFIG_IPV6_GRE) += ip6_gre.o
 obj-$(CONFIG_IPV6_FOU) += fou6.o
+obj-$(CONFIG_IPV6_SEG6) += seg6.o
 
 obj-y += addrconf_core.o exthdrs_core.o ip6_checksum.o ip6_icmp.o
 obj-$(CONFIG_INET) += output_core.o protocol.o $(ipv6-offload)
diff --git a/net/ipv6/seg6.c b/net/ipv6/seg6.c
new file mode 100644
index 0000000..8dafe7b
--- /dev/null
+++ b/net/ipv6/seg6.c
@@ -0,0 +1,212 @@
+/*
+ *  SR-IPv6 implementation
+ *
+ *  Author:
+ *  David Lebrun <david.lebrun@uclouvain.be>
+ *
+ *
+ *  This program is free software; you can redistribute it and/or
+ *	  modify it under the terms of the GNU General Public License
+ *	  as published by the Free Software Foundation; either version
+ *	  2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/errno.h>
+#include <linux/types.h>
+#include <linux/socket.h>
+#include <linux/net.h>
+#include <linux/in6.h>
+#include <linux/slab.h>
+
+#include <net/ipv6.h>
+#include <net/protocol.h>
+
+#include <net/seg6.h>
+#include <net/genetlink.h>
+#include <linux/seg6.h>
+#include <linux/seg6_genl.h>
+
+static const struct nla_policy seg6_genl_policy[SEG6_ATTR_MAX + 1] = {
+	[SEG6_ATTR_DST]				= { .type = NLA_BINARY,
+		.len = sizeof(struct in6_addr) },
+	[SEG6_ATTR_DSTLEN]			= { .type = NLA_S32, },
+	[SEG6_ATTR_HMACKEYID]		= { .type = NLA_U32, },
+	[SEG6_ATTR_SECRET]			= { .type = NLA_BINARY, },
+	[SEG6_ATTR_SECRETLEN]		= { .type = NLA_U8, },
+	[SEG6_ATTR_ALGID]			= { .type = NLA_U8, },
+	[SEG6_ATTR_HMACINFO]		= { .type = NLA_NESTED, },
+};
+
+static struct genl_family seg6_genl_family = {
+	.id = GENL_ID_GENERATE,
+	.hdrsize = 0,
+	.name = SEG6_GENL_NAME,
+	.version = SEG6_GENL_VERSION,
+	.maxattr = SEG6_ATTR_MAX,
+	.netnsok = true,
+};
+
+static int seg6_genl_sethmac(struct sk_buff *skb, struct genl_info *info)
+{
+	return -ENOTSUPP;
+}
+
+static int seg6_genl_set_tunsrc(struct sk_buff *skb, struct genl_info *info)
+{
+	struct net *net = genl_info_net(info);
+	struct seg6_pernet_data *sdata = seg6_pernet(net);
+	struct in6_addr *val, *t_old, *t_new;
+
+	if (!info->attrs[SEG6_ATTR_DST])
+		return -EINVAL;
+
+	val = (struct in6_addr *)nla_data(info->attrs[SEG6_ATTR_DST]);
+	t_new = kmemdup(val, sizeof(*val), GFP_KERNEL);
+
+	seg6_pernet_lock(net);
+
+	t_old = sdata->tun_src;
+	rcu_assign_pointer(sdata->tun_src, t_new);
+
+	seg6_pernet_unlock(net);
+
+	synchronize_net();
+	kfree(t_old);
+
+	return 0;
+}
+
+static int seg6_genl_get_tunsrc(struct sk_buff *skb, struct genl_info *info)
+{
+	struct net *net = genl_info_net(info);
+	struct sk_buff *msg;
+	void *hdr;
+	struct in6_addr *tun_src;
+
+	msg = genlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	hdr = genlmsg_put(msg, info->snd_portid, info->snd_seq,
+			  &seg6_genl_family, 0, SEG6_CMD_GET_TUNSRC);
+	if (!hdr)
+		goto free_msg;
+
+	rcu_read_lock();
+	tun_src = rcu_dereference(seg6_pernet(net)->tun_src);
+
+	if (nla_put(msg, SEG6_ATTR_DST, sizeof(struct in6_addr), tun_src))
+		goto nla_put_failure;
+
+	rcu_read_unlock();
+
+	genlmsg_end(msg, hdr);
+	genlmsg_reply(msg, info);
+
+	return 0;
+
+nla_put_failure:
+	rcu_read_unlock();
+	genlmsg_cancel(msg, hdr);
+free_msg:
+	nlmsg_free(msg);
+	return -ENOMEM;
+}
+
+static int seg6_genl_dumphmac(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	return -ENOTSUPP;
+}
+
+static const struct genl_ops seg6_genl_ops[] = {
+	{
+		.cmd	= SEG6_CMD_SETHMAC,
+		.doit	= seg6_genl_sethmac,
+		.policy	= seg6_genl_policy,
+		.flags	= GENL_ADMIN_PERM,
+	},
+	{
+		.cmd	= SEG6_CMD_DUMPHMAC,
+		.dumpit	= seg6_genl_dumphmac,
+		.policy	= seg6_genl_policy,
+		.flags	= GENL_ADMIN_PERM,
+	},
+	{
+		.cmd	= SEG6_CMD_SET_TUNSRC,
+		.doit	= seg6_genl_set_tunsrc,
+		.policy	= seg6_genl_policy,
+		.flags	= GENL_ADMIN_PERM,
+	},
+	{
+		.cmd	= SEG6_CMD_GET_TUNSRC,
+		.doit	= seg6_genl_get_tunsrc,
+		.policy = seg6_genl_policy,
+		.flags	= GENL_ADMIN_PERM,
+	},
+};
+
+static int __net_init seg6_net_init(struct net *net)
+{
+	struct seg6_pernet_data *sdata;
+
+	sdata = kzalloc(sizeof(*sdata), GFP_KERNEL);
+	if (!sdata)
+		return -ENOMEM;
+
+	mutex_init(&sdata->lock);
+
+	sdata->tun_src = kzalloc(sizeof(*sdata->tun_src), GFP_KERNEL);
+	if (!sdata->tun_src) {
+		kfree(sdata);
+		return -ENOMEM;
+	}
+
+	net->ipv6.seg6_data = sdata;
+
+	return 0;
+}
+
+static void __net_exit seg6_net_exit(struct net *net)
+{
+	struct seg6_pernet_data *sdata = seg6_pernet(net);
+
+	kfree(sdata->tun_src);
+	kfree(sdata);
+}
+
+static struct pernet_operations ip6_segments_ops = {
+	.init = seg6_net_init,
+	.exit = seg6_net_exit,
+};
+
+static int __init seg6_init(void)
+{
+	int err = -ENOMEM;
+
+	err = genl_register_family_with_ops(&seg6_genl_family, seg6_genl_ops);
+	if (err)
+		goto out;
+
+	err = register_pernet_subsys(&ip6_segments_ops);
+	if (err)
+		goto out_unregister_genl;
+
+	pr_info("Segment Routing with IPv6\n");
+
+out:
+	return err;
+out_unregister_genl:
+	genl_unregister_family(&seg6_genl_family);
+	goto out;
+}
+module_init(seg6_init);
+
+static void __exit seg6_exit(void)
+{
+	unregister_pernet_subsys(&ip6_segments_ops);
+	genl_unregister_family(&seg6_genl_family);
+}
+module_exit(seg6_exit);
+
+MODULE_DESCRIPTION("Segment Routing with IPv6 core");
+MODULE_LICENSE("GPL v2");
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 3/9] ipv6: sr: add support for SRH encapsulation and injection with lwtunnels
  2016-10-17 14:42 [PATCH 0/9] net: add support for IPv6 Segment Routing David Lebrun
  2016-10-17 14:42 ` [PATCH 1/9] ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header) David Lebrun
  2016-10-17 14:42 ` [PATCH 2/9] ipv6: sr: add code base for control plane support of SR-IPv6 David Lebrun
@ 2016-10-17 14:42 ` David Lebrun
  2016-10-17 17:17   ` Tom Herbert
  2016-10-17 14:42 ` [PATCH 4/9] ipv6: sr: add core files for SR HMAC support David Lebrun
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 29+ messages in thread
From: David Lebrun @ 2016-10-17 14:42 UTC (permalink / raw)
  To: netdev; +Cc: David Lebrun

This patch creates a new type of interfaceless lightweight tunnel (SEG6),
enabling the encapsulation and injection of SRH within locally emitted
packets and forwarded packets.

>From a configuration viewpoint, a seg6 tunnel would be configured as follows:

  ip -6 ro ad fc00::1/128 via <gw> encap seg6 mode encap segs fc42::1,fc42::2,fc42::3

Any packet whose destination address is fc00::1 would thus be encapsulated
within an outer IPv6 header containing the SRH with three segments, and would
actually be routed to the first segment of the list. If `mode inline' was
specified instead of `mode encap', then the SRH would be directly inserted
after the IPv6 header without outer encapsulation.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
---
 include/linux/seg6_iptunnel.h      |   6 +
 include/net/seg6.h                 |   7 +
 include/uapi/linux/lwtunnel.h      |   1 +
 include/uapi/linux/seg6_iptunnel.h |  33 ++++
 net/core/lwtunnel.c                |   2 +
 net/ipv6/Makefile                  |   2 +-
 net/ipv6/seg6_iptunnel.c           | 315 +++++++++++++++++++++++++++++++++++++
 7 files changed, 365 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/seg6_iptunnel.h
 create mode 100644 include/uapi/linux/seg6_iptunnel.h
 create mode 100644 net/ipv6/seg6_iptunnel.c

diff --git a/include/linux/seg6_iptunnel.h b/include/linux/seg6_iptunnel.h
new file mode 100644
index 0000000..5377cf6
--- /dev/null
+++ b/include/linux/seg6_iptunnel.h
@@ -0,0 +1,6 @@
+#ifndef _LINUX_SEG6_IPTUNNEL_H
+#define _LINUX_SEG6_IPTUNNEL_H
+
+#include <uapi/linux/seg6_iptunnel.h>
+
+#endif
diff --git a/include/net/seg6.h b/include/net/seg6.h
index 89b819e..228f90f 100644
--- a/include/net/seg6.h
+++ b/include/net/seg6.h
@@ -16,6 +16,7 @@
 
 #include <linux/net.h>
 #include <linux/ipv6.h>
+#include <net/lwtunnel.h>
 
 struct seg6_pernet_data {
 	struct mutex lock;
@@ -37,5 +38,11 @@ static inline void seg6_pernet_unlock(struct net *net)
 	mutex_unlock(&seg6_pernet(net)->lock);
 }
 
+static inline struct seg6_iptunnel_encap *
+seg6_lwtunnel_encap(struct lwtunnel_state *lwtstate)
+{
+	return (struct seg6_iptunnel_encap *)lwtstate->data;
+}
+
 #endif
 
diff --git a/include/uapi/linux/lwtunnel.h b/include/uapi/linux/lwtunnel.h
index a478fe8..453cc62 100644
--- a/include/uapi/linux/lwtunnel.h
+++ b/include/uapi/linux/lwtunnel.h
@@ -9,6 +9,7 @@ enum lwtunnel_encap_types {
 	LWTUNNEL_ENCAP_IP,
 	LWTUNNEL_ENCAP_ILA,
 	LWTUNNEL_ENCAP_IP6,
+	LWTUNNEL_ENCAP_SEG6,
 	__LWTUNNEL_ENCAP_MAX,
 };
 
diff --git a/include/uapi/linux/seg6_iptunnel.h b/include/uapi/linux/seg6_iptunnel.h
new file mode 100644
index 0000000..2794a5f
--- /dev/null
+++ b/include/uapi/linux/seg6_iptunnel.h
@@ -0,0 +1,33 @@
+/*
+ *  SR-IPv6 implementation
+ *
+ *  Author:
+ *  David Lebrun <david.lebrun@uclouvain.be>
+ *
+ *
+ *  This program is free software; you can redistribute it and/or
+ *      modify it under the terms of the GNU General Public License
+ *      as published by the Free Software Foundation; either version
+ *      2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _UAPI_LINUX_SEG6_IPTUNNEL_H
+#define _UAPI_LINUX_SEG6_IPTUNNEL_H
+
+enum {
+	SEG6_IPTUNNEL_UNSPEC,
+	SEG6_IPTUNNEL_SRH,
+	__SEG6_IPTUNNEL_MAX,
+};
+#define SEG6_IPTUNNEL_MAX (__SEG6_IPTUNNEL_MAX - 1)
+
+struct seg6_iptunnel_encap {
+	int flags;
+	struct ipv6_sr_hdr srh[0];
+};
+
+#define SEG6_IPTUN_ENCAP_SIZE(x) (sizeof(*(x)) + (((x)->srh->hdrlen + 1) << 3))
+
+#define SEG6_IPTUN_FLAG_ENCAP   0x1
+
+#endif
diff --git a/net/core/lwtunnel.c b/net/core/lwtunnel.c
index 88fd642..03976e9 100644
--- a/net/core/lwtunnel.c
+++ b/net/core/lwtunnel.c
@@ -39,6 +39,8 @@ static const char *lwtunnel_encap_str(enum lwtunnel_encap_types encap_type)
 		return "MPLS";
 	case LWTUNNEL_ENCAP_ILA:
 		return "ILA";
+	case LWTUNNEL_ENCAP_SEG6:
+		return "SEG6";
 	case LWTUNNEL_ENCAP_IP6:
 	case LWTUNNEL_ENCAP_IP:
 	case LWTUNNEL_ENCAP_NONE:
diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
index 5069257..2fe7b54 100644
--- a/net/ipv6/Makefile
+++ b/net/ipv6/Makefile
@@ -44,7 +44,7 @@ obj-$(CONFIG_IPV6_SIT) += sit.o
 obj-$(CONFIG_IPV6_TUNNEL) += ip6_tunnel.o
 obj-$(CONFIG_IPV6_GRE) += ip6_gre.o
 obj-$(CONFIG_IPV6_FOU) += fou6.o
-obj-$(CONFIG_IPV6_SEG6) += seg6.o
+obj-$(CONFIG_IPV6_SEG6) += seg6.o seg6_iptunnel.o
 
 obj-y += addrconf_core.o exthdrs_core.o ip6_checksum.o ip6_icmp.o
 obj-$(CONFIG_INET) += output_core.o protocol.o $(ipv6-offload)
diff --git a/net/ipv6/seg6_iptunnel.c b/net/ipv6/seg6_iptunnel.c
new file mode 100644
index 0000000..461375c
--- /dev/null
+++ b/net/ipv6/seg6_iptunnel.c
@@ -0,0 +1,315 @@
+/*
+ *  SR-IPv6 implementation
+ *
+ *  Author:
+ *  David Lebrun <david.lebrun@uclouvain.be>
+ *
+ *
+ *  This program is free software; you can redistribute it and/or
+ *        modify it under the terms of the GNU General Public License
+ *        as published by the Free Software Foundation; either version
+ *        2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/types.h>
+#include <linux/skbuff.h>
+#include <linux/net.h>
+#include <linux/module.h>
+#include <net/ip.h>
+#include <net/lwtunnel.h>
+#include <net/netevent.h>
+#include <net/netns/generic.h>
+#include <net/ip6_fib.h>
+#include <net/route.h>
+#include <net/seg6.h>
+#include <linux/seg6.h>
+#include <linux/seg6_iptunnel.h>
+#include <net/addrconf.h>
+#include <net/ip6_route.h>
+
+static const struct nla_policy seg6_iptunnel_policy[SEG6_IPTUNNEL_MAX + 1] = {
+	[SEG6_IPTUNNEL_SRH]	= { .type = NLA_BINARY },
+};
+
+int nla_put_srh(struct sk_buff *skb, int attrtype,
+		struct seg6_iptunnel_encap *tuninfo)
+{
+	struct nlattr *nla;
+	struct seg6_iptunnel_encap *data;
+	int len;
+
+	len = SEG6_IPTUN_ENCAP_SIZE(tuninfo);
+
+	nla = nla_reserve(skb, attrtype, len);
+	if (!nla)
+		return -EMSGSIZE;
+
+	data = nla_data(nla);
+	memcpy(data, tuninfo, len);
+
+	return 0;
+}
+
+static void set_tun_src(struct net *net, struct net_device *dev,
+			struct in6_addr *daddr, struct in6_addr *saddr)
+{
+	struct in6_addr *tun_src;
+	struct seg6_pernet_data *sdata = seg6_pernet(net);
+
+	rcu_read_lock();
+
+	tun_src = rcu_dereference(sdata->tun_src);
+
+	if (!ipv6_addr_any(tun_src)) {
+		memcpy(saddr, tun_src, sizeof(struct in6_addr));
+	} else {
+		ipv6_dev_get_saddr(net, dev, daddr, IPV6_PREFER_SRC_PUBLIC,
+				   saddr);
+	}
+
+	rcu_read_unlock();
+}
+
+/* encapsulate an IPv6 packet within an outer IPv6 header with a given SRH */
+static int seg6_do_srh_encap(struct sk_buff *skb, struct ipv6_sr_hdr *osrh)
+{
+	struct ipv6hdr *hdr, *inner_hdr;
+	struct ipv6_sr_hdr *isrh;
+	struct net *net = dev_net(skb_dst(skb)->dev);
+	int hdrlen, tot_len, err;
+
+	hdrlen = (osrh->hdrlen + 1) << 3;
+	tot_len = hdrlen + sizeof(*hdr);
+
+	err = pskb_expand_head(skb, tot_len, 0, GFP_ATOMIC);
+	if (unlikely(err))
+		return err;
+
+	inner_hdr = ipv6_hdr(skb);
+
+	skb_push(skb, tot_len);
+	skb_reset_network_header(skb);
+	skb_mac_header_rebuild(skb);
+	hdr = ipv6_hdr(skb);
+
+	/* inherit tc, flowlabel and hlim
+	 * hlim will be decremented in ip6_forward() afterwards and
+	 * decapsulation will overwrite inner hlim with outer hlim
+	 */
+	ip6_flow_hdr(hdr, ip6_tclass(ip6_flowinfo(inner_hdr)),
+		     ip6_flowlabel(inner_hdr));
+	hdr->hop_limit = inner_hdr->hop_limit;
+	hdr->nexthdr = NEXTHDR_ROUTING;
+
+	isrh = (void *)hdr + sizeof(*hdr);
+	memcpy(isrh, osrh, hdrlen);
+
+	isrh->nexthdr = NEXTHDR_IPV6;
+
+	hdr->daddr = isrh->segments[isrh->first_segment];
+	set_tun_src(net, skb->dev, &hdr->daddr, &hdr->saddr);
+
+	return 0;
+}
+
+/* insert an SRH within an IPv6 packet, just after the IPv6 header */
+static int seg6_do_srh_inline(struct sk_buff *skb, struct ipv6_sr_hdr *osrh)
+{
+	struct ipv6hdr *hdr, *oldhdr;
+	struct ipv6_sr_hdr *isrh;
+	int hdrlen, err;
+
+	hdrlen = (osrh->hdrlen + 1) << 3;
+
+	err = pskb_expand_head(skb, hdrlen, 0, GFP_ATOMIC);
+	if (unlikely(err))
+		return err;
+
+	oldhdr = ipv6_hdr(skb);
+
+	skb_push(skb, hdrlen);
+	skb_reset_network_header(skb);
+	skb_mac_header_rebuild(skb);
+
+	hdr = ipv6_hdr(skb);
+
+	memmove(hdr, oldhdr, sizeof(*hdr));
+
+	isrh = (void *)hdr + sizeof(*hdr);
+	memcpy(isrh, osrh, hdrlen);
+
+	isrh->nexthdr = hdr->nexthdr;
+	hdr->nexthdr = NEXTHDR_ROUTING;
+
+	isrh->segments[0] = hdr->daddr;
+	hdr->daddr = isrh->segments[isrh->first_segment];
+
+	return 0;
+}
+
+static int seg6_do_srh(struct sk_buff *skb)
+{
+	struct dst_entry *dst = skb_dst(skb);
+	struct seg6_iptunnel_encap *tinfo = seg6_lwtunnel_encap(dst->lwtstate);
+	int err = 0;
+
+	if (likely(!skb->encapsulation)) {
+		skb_reset_inner_headers(skb);
+		skb->encapsulation = 1;
+	}
+
+	if (tinfo->flags & SEG6_IPTUN_FLAG_ENCAP) {
+		err = seg6_do_srh_encap(skb, tinfo->srh);
+	} else {
+		err = seg6_do_srh_inline(skb, tinfo->srh);
+		skb_reset_inner_headers(skb);
+	}
+
+	if (err)
+		return err;
+
+	ipv6_hdr(skb)->payload_len = htons(skb->len - sizeof(struct ipv6hdr));
+	skb_set_transport_header(skb, sizeof(struct ipv6hdr));
+
+	skb_set_inner_protocol(skb, skb->protocol);
+
+	return 0;
+}
+
+int seg6_input(struct sk_buff *skb)
+{
+	int err;
+
+	err = seg6_do_srh(skb);
+	if (unlikely(err))
+		return err;
+
+	skb_dst_drop(skb);
+	ip6_route_input(skb);
+
+	return dst_input(skb);
+}
+
+int seg6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
+{
+	int err;
+	struct dst_entry *dst;
+	struct ipv6hdr *hdr;
+	struct flowi6 fl6;
+
+	err = seg6_do_srh(skb);
+	if (unlikely(err))
+		return err;
+
+	hdr = ipv6_hdr(skb);
+	fl6.daddr = hdr->daddr;
+	fl6.saddr = hdr->saddr;
+	fl6.flowlabel = ip6_flowinfo(hdr);
+	fl6.flowi6_mark = skb->mark;
+	fl6.flowi6_proto = hdr->nexthdr;
+
+	skb_dst_drop(skb);
+
+	err = ip6_dst_lookup(net, sk, &dst, &fl6);
+	if (unlikely(err))
+		return err;
+
+	skb_dst_set(skb, dst);
+
+	return dst_output(net, sk, skb);
+}
+
+static int seg6_build_state(struct net_device *dev, struct nlattr *nla,
+			    unsigned int family, const void *cfg,
+			    struct lwtunnel_state **ts)
+{
+	struct seg6_iptunnel_encap *tuninfo, *tuninfo_new;
+	struct nlattr *tb[SEG6_IPTUNNEL_MAX + 1];
+	struct lwtunnel_state *newts;
+	int tuninfo_len;
+	int err;
+
+	err = nla_parse_nested(tb, SEG6_IPTUNNEL_MAX, nla,
+			       seg6_iptunnel_policy);
+
+	if (err < 0)
+		return err;
+
+	if (!tb[SEG6_IPTUNNEL_SRH])
+		return -EINVAL;
+
+	tuninfo = nla_data(tb[SEG6_IPTUNNEL_SRH]);
+	tuninfo_len = SEG6_IPTUN_ENCAP_SIZE(tuninfo);
+
+	newts = lwtunnel_state_alloc(tuninfo_len);
+	if (!newts)
+		return -ENOMEM;
+
+	newts->len = tuninfo_len;
+	tuninfo_new = seg6_lwtunnel_encap(newts);
+	memcpy(tuninfo_new, tuninfo, tuninfo_len);
+
+	newts->type = LWTUNNEL_ENCAP_SEG6;
+	newts->flags |= LWTUNNEL_STATE_OUTPUT_REDIRECT |
+			LWTUNNEL_STATE_INPUT_REDIRECT;
+	newts->headroom = SEG6_IPTUN_ENCAP_SIZE(tuninfo);
+
+	*ts = newts;
+
+	return 0;
+}
+
+static int seg6_fill_encap_info(struct sk_buff *skb,
+				struct lwtunnel_state *lwtstate)
+{
+	struct seg6_iptunnel_encap *tuninfo = seg6_lwtunnel_encap(lwtstate);
+
+	if (nla_put_srh(skb, SEG6_IPTUNNEL_SRH, tuninfo))
+		return -EMSGSIZE;
+
+	return 0;
+}
+
+static int seg6_encap_nlsize(struct lwtunnel_state *lwtstate)
+{
+	struct seg6_iptunnel_encap *tuninfo = seg6_lwtunnel_encap(lwtstate);
+
+	return nla_total_size(SEG6_IPTUN_ENCAP_SIZE(tuninfo));
+}
+
+static int seg6_encap_cmp(struct lwtunnel_state *a, struct lwtunnel_state *b)
+{
+	struct seg6_iptunnel_encap *a_hdr = seg6_lwtunnel_encap(a);
+	struct seg6_iptunnel_encap *b_hdr = seg6_lwtunnel_encap(b);
+	int len = SEG6_IPTUN_ENCAP_SIZE(a_hdr);
+
+	if (len != SEG6_IPTUN_ENCAP_SIZE(b_hdr))
+		return 1;
+
+	return memcmp(a_hdr, b_hdr, len);
+}
+
+static const struct lwtunnel_encap_ops seg6_iptun_ops = {
+	.build_state = seg6_build_state,
+	.output = seg6_output,
+	.input = seg6_input,
+	.fill_encap = seg6_fill_encap_info,
+	.get_encap_size = seg6_encap_nlsize,
+	.cmp_encap = seg6_encap_cmp,
+};
+
+static int __init seg6_iptunnel_init(void)
+{
+	return lwtunnel_encap_add_ops(&seg6_iptun_ops, LWTUNNEL_ENCAP_SEG6);
+}
+module_init(seg6_iptunnel_init);
+
+static void __exit seg6_iptunnel_exit(void)
+{
+	lwtunnel_encap_del_ops(&seg6_iptun_ops, LWTUNNEL_ENCAP_SEG6);
+}
+module_exit(seg6_iptunnel_exit);
+
+MODULE_DESCRIPTION("Segment Routing with IPv6 IP Tunnels");
+MODULE_LICENSE("GPL v2");
+
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 4/9] ipv6: sr: add core files for SR HMAC support
  2016-10-17 14:42 [PATCH 0/9] net: add support for IPv6 Segment Routing David Lebrun
                   ` (2 preceding siblings ...)
  2016-10-17 14:42 ` [PATCH 3/9] ipv6: sr: add support for SRH encapsulation and injection with lwtunnels David Lebrun
@ 2016-10-17 14:42 ` David Lebrun
  2016-10-17 17:24   ` Tom Herbert
  2016-10-19 13:20   ` zhuyj
  2016-10-17 14:43 ` [PATCH 5/9] ipv6: sr: implement API to control SR HMAC structures David Lebrun
                   ` (5 subsequent siblings)
  9 siblings, 2 replies; 29+ messages in thread
From: David Lebrun @ 2016-10-17 14:42 UTC (permalink / raw)
  To: netdev; +Cc: David Lebrun

This patch adds the necessary functions to compute and check the HMAC signature
of an SR-enabled packet. Two HMAC algorithms are supported: hmac(sha1) and
hmac(sha256).

In order to avoid dynamic memory allocation for each HMAC computation,
a per-cpu ring buffer is allocated for this purpose.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
---
 include/linux/seg6_hmac.h      |   6 +
 include/net/seg6_hmac.h        |  61 ++++++
 include/uapi/linux/seg6_hmac.h |  20 ++
 net/ipv6/seg6_hmac.c           | 432 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 519 insertions(+)
 create mode 100644 include/linux/seg6_hmac.h
 create mode 100644 include/net/seg6_hmac.h
 create mode 100644 include/uapi/linux/seg6_hmac.h
 create mode 100644 net/ipv6/seg6_hmac.c

diff --git a/include/linux/seg6_hmac.h b/include/linux/seg6_hmac.h
new file mode 100644
index 0000000..da437eb
--- /dev/null
+++ b/include/linux/seg6_hmac.h
@@ -0,0 +1,6 @@
+#ifndef _LINUX_SEG6_HMAC_H
+#define _LINUX_SEG6_HMAC_H
+
+#include <uapi/linux/seg6_hmac.h>
+
+#endif
diff --git a/include/net/seg6_hmac.h b/include/net/seg6_hmac.h
new file mode 100644
index 0000000..6e5ee6a
--- /dev/null
+++ b/include/net/seg6_hmac.h
@@ -0,0 +1,61 @@
+/*
+ *  SR-IPv6 implementation
+ *
+ *  Author:
+ *  David Lebrun <david.lebrun@uclouvain.be>
+ *
+ *
+ *  This program is free software; you can redistribute it and/or
+ *      modify it under the terms of the GNU General Public License
+ *      as published by the Free Software Foundation; either version
+ *      2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _NET_SEG6_HMAC_H
+#define _NET_SEG6_HMAC_H
+
+#include <net/flow.h>
+#include <net/ip6_fib.h>
+#include <net/sock.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/route.h>
+#include <net/seg6.h>
+#include <linux/seg6_hmac.h>
+
+#define SEG6_HMAC_MAX_DIGESTSIZE	160
+#define SEG6_HMAC_RING_SIZE		256
+
+struct seg6_hmac_info {
+	struct list_head list;
+
+	u32 hmackeyid;
+	char secret[SEG6_HMAC_SECRET_LEN];
+	u8 slen;
+	u8 alg_id;
+};
+
+struct seg6_hmac_algo {
+	u8 alg_id;
+	char name[64];
+	struct crypto_shash * __percpu *tfms;
+	struct shash_desc * __percpu *shashs;
+};
+
+extern int seg6_hmac_compute(struct seg6_hmac_info *hinfo,
+			     struct ipv6_sr_hdr *hdr, struct in6_addr *saddr,
+			     u8 *output);
+extern struct seg6_hmac_info *seg6_hmac_info_lookup(struct net *net, u32 key);
+extern int seg6_hmac_info_add(struct net *net, u32 key,
+			      struct seg6_hmac_info *hinfo);
+extern int seg6_hmac_info_del(struct net *net, u32 key,
+			      struct seg6_hmac_info *hinfo);
+extern int seg6_push_hmac(struct net *net, struct in6_addr *saddr,
+			  struct ipv6_sr_hdr *srh);
+extern bool seg6_hmac_validate_skb(struct sk_buff *skb);
+extern int seg6_hmac_init(void);
+extern void seg6_hmac_exit(void);
+extern int seg6_hmac_net_init(struct net *net);
+extern void seg6_hmac_net_exit(struct net *net);
+
+#endif
diff --git a/include/uapi/linux/seg6_hmac.h b/include/uapi/linux/seg6_hmac.h
new file mode 100644
index 0000000..0b5eda7
--- /dev/null
+++ b/include/uapi/linux/seg6_hmac.h
@@ -0,0 +1,20 @@
+#ifndef _UAPI_LINUX_SEG6_HMAC_H
+#define _UAPI_LINUX_SEG6_HMAC_H
+
+#define SEG6_HMAC_SECRET_LEN	64
+#define SEG6_HMAC_FIELD_LEN	32
+
+struct sr6_tlv_hmac {
+	__u8 type;
+	__u8 len;
+	__u16 reserved;
+	__be32 hmackeyid;
+	__u8 hmac[SEG6_HMAC_FIELD_LEN];
+} __attribute__((packed));
+
+enum {
+	SEG6_HMAC_ALGO_SHA1 = 1,
+	SEG6_HMAC_ALGO_SHA256 = 2,
+};
+
+#endif
diff --git a/net/ipv6/seg6_hmac.c b/net/ipv6/seg6_hmac.c
new file mode 100644
index 0000000..91bd59a
--- /dev/null
+++ b/net/ipv6/seg6_hmac.c
@@ -0,0 +1,432 @@
+/*
+ *  SR-IPv6 implementation -- HMAC functions
+ *
+ *  Author:
+ *  David Lebrun <david.lebrun@uclouvain.be>
+ *
+ *
+ *  This program is free software; you can redistribute it and/or
+ *      modify it under the terms of the GNU General Public License
+ *      as published by the Free Software Foundation; either version
+ *      2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/errno.h>
+#include <linux/types.h>
+#include <linux/socket.h>
+#include <linux/sockios.h>
+#include <linux/net.h>
+#include <linux/netdevice.h>
+#include <linux/in6.h>
+#include <linux/icmpv6.h>
+#include <linux/mroute6.h>
+#include <linux/slab.h>
+
+#include <linux/netfilter.h>
+#include <linux/netfilter_ipv6.h>
+
+#include <net/sock.h>
+#include <net/snmp.h>
+
+#include <net/ipv6.h>
+#include <net/protocol.h>
+#include <net/transp_v6.h>
+#include <net/rawv6.h>
+#include <net/ndisc.h>
+#include <net/ip6_route.h>
+#include <net/addrconf.h>
+#include <net/xfrm.h>
+
+#include <linux/cryptohash.h>
+#include <crypto/hash.h>
+#include <crypto/sha.h>
+#include <net/seg6.h>
+#include <net/genetlink.h>
+#include <net/seg6_hmac.h>
+#include <linux/random.h>
+
+static char * __percpu *hmac_ring;
+
+static struct seg6_hmac_algo hmac_algos[] = {
+	{
+		.alg_id = SEG6_HMAC_ALGO_SHA1,
+		.name = "hmac(sha1)",
+	},
+	{
+		.alg_id = SEG6_HMAC_ALGO_SHA256,
+		.name = "hmac(sha256)",
+	},
+};
+
+static struct seg6_hmac_algo *__hmac_get_algo(u8 alg_id)
+{
+	int i, alg_count;
+	struct seg6_hmac_algo *algo;
+
+	alg_count = sizeof(hmac_algos) / sizeof(struct seg6_hmac_algo);
+	for (i = 0; i < alg_count; i++) {
+		algo = &hmac_algos[i];
+		if (algo->alg_id == alg_id)
+			return algo;
+	}
+
+	return NULL;
+}
+
+static int __do_hmac(struct seg6_hmac_info *hinfo, const char *text, u8 psize,
+		     u8 *output, int outlen)
+{
+	struct crypto_shash *tfm;
+	struct shash_desc *shash;
+	struct seg6_hmac_algo *algo;
+	int ret, dgsize;
+
+	algo = __hmac_get_algo(hinfo->alg_id);
+	if (!algo)
+		return -ENOENT;
+
+	tfm = *this_cpu_ptr(algo->tfms);
+
+	dgsize = crypto_shash_digestsize(tfm);
+	if (dgsize > outlen) {
+		pr_debug("sr-ipv6: __do_hmac: digest size too big (%d / %d)\n",
+			 dgsize, outlen);
+		return -ENOMEM;
+	}
+
+	ret = crypto_shash_setkey(tfm, hinfo->secret, hinfo->slen);
+	if (ret < 0) {
+		pr_debug("sr-ipv6: crypto_shash_setkey failed: err %d\n", ret);
+		goto failed;
+	}
+
+	shash = *this_cpu_ptr(algo->shashs);
+	shash->tfm = tfm;
+
+	ret = crypto_shash_digest(shash, text, psize, output);
+	if (ret < 0) {
+		pr_debug("sr-ipv6: crypto_shash_digest failed: err %d\n", ret);
+		goto failed;
+	}
+
+	return dgsize;
+
+failed:
+	return ret;
+}
+
+int seg6_hmac_compute(struct seg6_hmac_info *hinfo, struct ipv6_sr_hdr *hdr,
+		      struct in6_addr *saddr, u8 *output)
+{
+	int plen, i, dgsize, wrsize;
+	char *ring, *off;
+	u8 tmp_out[SEG6_HMAC_MAX_DIGESTSIZE];
+	__be32 hmackeyid = cpu_to_be32(hinfo->hmackeyid);
+
+	/* a 160-byte buffer for digest output allows to store highest known
+	 * hash function (RadioGatun) with up to 1216 bits
+	 */
+
+	/* saddr(16) + first_seg(1) + cleanup(1) + keyid(4) + seglist(16n) */
+	plen = 16 + 1 + 1 + 4 + (hdr->first_segment + 1) * 16;
+
+	/* this limit allows for 14 segments */
+	if (plen >= SEG6_HMAC_RING_SIZE)
+		return -EMSGSIZE;
+
+	local_bh_disable();
+	ring = *this_cpu_ptr(hmac_ring);
+	off = ring;
+	memcpy(off, saddr, 16);
+	off += 16;
+	*off++ = hdr->first_segment;
+	*off++ = !!(sr_get_flags(hdr) & SR6_FLAG_CLEANUP) << 7;
+	memcpy(off, &hmackeyid, 4);
+	off += 4;
+
+	for (i = 0; i < hdr->first_segment + 1; i++) {
+		memcpy(off, hdr->segments + i, 16);
+		off += 16;
+	}
+
+	dgsize = __do_hmac(hinfo, ring, plen, tmp_out,
+			   SEG6_HMAC_MAX_DIGESTSIZE);
+	local_bh_enable();
+
+	if (dgsize < 0)
+		return dgsize;
+
+	wrsize = SEG6_HMAC_FIELD_LEN;
+	if (wrsize > dgsize)
+		wrsize = dgsize;
+
+	memset(output, 0, SEG6_HMAC_FIELD_LEN);
+	memcpy(output, tmp_out, wrsize);
+
+	return 0;
+}
+EXPORT_SYMBOL(seg6_hmac_compute);
+
+/* checks if an incoming SR-enabled packet's HMAC status matches
+ * the incoming policy.
+ *
+ * called with rcu_read_lock()
+ */
+bool seg6_hmac_validate_skb(struct sk_buff *skb)
+{
+	struct inet6_dev *idev;
+	struct ipv6_sr_hdr *srh;
+	struct sr6_tlv_hmac *tlv;
+	struct seg6_hmac_info *hinfo;
+	struct net *net = dev_net(skb->dev);
+	u8 hmac_output[SEG6_HMAC_FIELD_LEN];
+
+	idev = __in6_dev_get(skb->dev);
+
+	srh = (struct ipv6_sr_hdr *)skb_transport_header(skb);
+
+	tlv = seg6_get_tlv_hmac(srh);
+
+	/* mandatory check but no tlv */
+	if (idev->cnf.seg6_require_hmac > 0 && !tlv)
+		return false;
+
+	/* no check */
+	if (idev->cnf.seg6_require_hmac < 0)
+		return true;
+
+	/* check only if present */
+	if (idev->cnf.seg6_require_hmac == 0 && !tlv)
+		return true;
+
+	/* now, seg6_require_hmac >= 0 && tlv */
+
+	hinfo = seg6_hmac_info_lookup(net, be32_to_cpu(tlv->hmackeyid));
+	if (!hinfo)
+		return false;
+
+	if (seg6_hmac_compute(hinfo, srh, &ipv6_hdr(skb)->saddr, hmac_output))
+		return false;
+
+	if (memcmp(hmac_output, tlv->hmac, SEG6_HMAC_FIELD_LEN) != 0)
+		return false;
+
+	return true;
+}
+
+/* called with rcu_read_lock() */
+struct seg6_hmac_info *seg6_hmac_info_lookup(struct net *net, u32 key)
+{
+	struct seg6_pernet_data *sdata = seg6_pernet(net);
+	struct seg6_hmac_info *hinfo;
+
+	list_for_each_entry_rcu(hinfo, &sdata->hmac_infos, list) {
+		if (hinfo->hmackeyid == key)
+			return hinfo;
+	}
+
+	return NULL;
+}
+EXPORT_SYMBOL(seg6_hmac_info_lookup);
+
+int seg6_hmac_info_add(struct net *net, u32 key, struct seg6_hmac_info *hinfo)
+{
+	struct seg6_pernet_data *sdata = seg6_pernet(net);
+	struct seg6_hmac_info *old_hinfo;
+
+	old_hinfo = seg6_hmac_info_lookup(net, key);
+	if (old_hinfo)
+		return -EEXIST;
+
+	list_add_rcu(&hinfo->list, &sdata->hmac_infos);
+
+	return 0;
+}
+EXPORT_SYMBOL(seg6_hmac_info_add);
+
+int seg6_hmac_info_del(struct net *net, u32 key, struct seg6_hmac_info *hinfo)
+{
+	struct seg6_hmac_info *tmp;
+
+	tmp = seg6_hmac_info_lookup(net, key);
+	if (!tmp)
+		return -ENOENT;
+
+	/* entry was replaced, ignore deletion */
+	if (tmp != hinfo)
+		return -ENOENT;
+
+	list_del_rcu(&hinfo->list);
+	synchronize_net();
+
+	return 0;
+}
+EXPORT_SYMBOL(seg6_hmac_info_del);
+
+static void seg6_hmac_info_flush(struct net *net)
+{
+	struct seg6_pernet_data *sdata = seg6_pernet(net);
+	struct seg6_hmac_info *hinfo;
+
+	seg6_pernet_lock(net);
+	while ((hinfo = list_first_or_null_rcu(&sdata->hmac_infos,
+					       struct seg6_hmac_info,
+					       list)) != NULL) {
+		list_del_rcu(&hinfo->list);
+		seg6_pernet_unlock(net);
+		synchronize_net();
+		kfree(hinfo);
+		seg6_pernet_lock(net);
+	}
+
+	seg6_pernet_unlock(net);
+}
+
+int seg6_push_hmac(struct net *net, struct in6_addr *saddr,
+		   struct ipv6_sr_hdr *srh)
+{
+	struct seg6_hmac_info *hinfo;
+	int err = -ENOENT;
+	struct sr6_tlv_hmac *tlv;
+
+	tlv = seg6_get_tlv(srh, SR6_TLV_HMAC);
+	if (!tlv)
+		return -EINVAL;
+
+	rcu_read_lock();
+
+	hinfo = seg6_hmac_info_lookup(net, be32_to_cpu(tlv->hmackeyid));
+	if (!hinfo)
+		goto out;
+
+	memset(tlv->hmac, 0, SEG6_HMAC_FIELD_LEN);
+	err = seg6_hmac_compute(hinfo, srh, saddr, tlv->hmac);
+
+out:
+	rcu_read_unlock();
+	return err;
+}
+EXPORT_SYMBOL(seg6_push_hmac);
+
+static int seg6_hmac_init_ring(void)
+{
+	int i;
+
+	hmac_ring = alloc_percpu(char *);
+
+	if (!hmac_ring)
+		return -ENOMEM;
+
+	for_each_possible_cpu(i) {
+		char *ring = kzalloc(SEG6_HMAC_RING_SIZE, GFP_KERNEL);
+
+		if (!ring)
+			return -ENOMEM;
+
+		*per_cpu_ptr(hmac_ring, i) = ring;
+	}
+
+	return 0;
+}
+
+static int seg6_hmac_init_algo(void)
+{
+	int i, alg_count, cpu;
+	struct seg6_hmac_algo *algo;
+	struct crypto_shash *tfm;
+	struct shash_desc *shash;
+
+	alg_count = sizeof(hmac_algos) / sizeof(struct seg6_hmac_algo);
+
+	for (i = 0; i < alg_count; i++) {
+		int shsize;
+		struct crypto_shash **p_tfm;
+
+		algo = &hmac_algos[i];
+		algo->tfms = alloc_percpu(struct crypto_shash *);
+		if (!algo->tfms)
+			return -ENOMEM;
+
+		for_each_possible_cpu(cpu) {
+			tfm = crypto_alloc_shash(algo->name, 0, GFP_KERNEL);
+			if (IS_ERR(tfm))
+				return PTR_ERR(tfm);
+			p_tfm = per_cpu_ptr(algo->tfms, cpu);
+			*p_tfm = tfm;
+		}
+
+		p_tfm = this_cpu_ptr(algo->tfms);
+		tfm = *p_tfm;
+
+		shsize = sizeof(*shash) + crypto_shash_descsize(tfm);
+
+		algo->shashs = alloc_percpu(struct shash_desc *);
+		if (!algo->shashs)
+			return -ENOMEM;
+
+		for_each_possible_cpu(cpu) {
+			shash = kzalloc(shsize, GFP_KERNEL);
+			if (!shash)
+				return -ENOMEM;
+			*per_cpu_ptr(algo->shashs, cpu) = shash;
+		}
+	}
+
+	return 0;
+}
+
+int __init seg6_hmac_init(void)
+{
+	int ret;
+
+	ret = seg6_hmac_init_ring();
+	if (ret < 0)
+		goto out;
+
+	ret = seg6_hmac_init_algo();
+
+out:
+	return ret;
+}
+
+int __net_init seg6_hmac_net_init(struct net *net)
+{
+	struct seg6_pernet_data *sdata = seg6_pernet(net);
+
+	INIT_LIST_HEAD(&sdata->hmac_infos);
+	return 0;
+}
+
+void __exit seg6_hmac_exit(void)
+{
+	int i, alg_count, cpu;
+	struct seg6_hmac_algo *algo = NULL;
+
+	for_each_possible_cpu(i) {
+		char *ring = *per_cpu_ptr(hmac_ring, i);
+
+		kfree(ring);
+	}
+	free_percpu(hmac_ring);
+
+	alg_count = sizeof(hmac_algos) / sizeof(struct seg6_hmac_algo);
+	for (i = 0; i < alg_count; i++) {
+		algo = &hmac_algos[i];
+		for_each_possible_cpu(cpu) {
+			struct crypto_shash *tfm;
+			struct shash_desc *shash;
+
+			shash = *per_cpu_ptr(algo->shashs, cpu);
+			kfree(shash);
+			tfm = *per_cpu_ptr(algo->tfms, cpu);
+			crypto_free_shash(tfm);
+		}
+		free_percpu(algo->tfms);
+		free_percpu(algo->shashs);
+	}
+}
+
+void __net_exit seg6_hmac_net_exit(struct net *net)
+{
+	seg6_hmac_info_flush(net);
+}
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 5/9] ipv6: sr: implement API to control SR HMAC structures
  2016-10-17 14:42 [PATCH 0/9] net: add support for IPv6 Segment Routing David Lebrun
                   ` (3 preceding siblings ...)
  2016-10-17 14:42 ` [PATCH 4/9] ipv6: sr: add core files for SR HMAC support David Lebrun
@ 2016-10-17 14:43 ` David Lebrun
  2016-10-17 14:43 ` [PATCH 6/9] ipv6: sr: add calls to verify and insert HMAC signatures David Lebrun
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 29+ messages in thread
From: David Lebrun @ 2016-10-17 14:43 UTC (permalink / raw)
  To: netdev; +Cc: David Lebrun

This patch provides an implementation of the genetlink commands
to associate a given HMAC key identifier with an hashing algorithm
and a secret. It also provides a per-interface sysctl called
seg6_require_hmac, allowing a user-defined policy for processing
HMAC-signed SR-enabled packets. A value of -1 means that the HMAC
field will always be ignored. A value of 0 means that if an HMAC
field is present, its validity will be enforced (the packet is
dropped is the signature is incorrect). Finally, a value of 1 means
that any SR-enabled packet that does not contain an HMAC signature
or whose signature is incorrect will be dropped.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
---
 include/linux/ipv6.h      |   1 +
 include/net/seg6.h        |   5 +-
 include/net/seg6_hmac.h   |   1 +
 include/uapi/linux/ipv6.h |   1 +
 net/ipv6/Makefile         |   2 +-
 net/ipv6/addrconf.c       |  10 +++
 net/ipv6/seg6.c           | 155 +++++++++++++++++++++++++++++++++++++++++++++-
 net/ipv6/seg6_hmac.c      |   2 +-
 8 files changed, 172 insertions(+), 5 deletions(-)

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 75395ad..4354d55 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -66,6 +66,7 @@ struct ipv6_devconf {
 	__s32		keep_addr_on_down;
 #ifdef CONFIG_IPV6_SEG6
 	__s32		seg6_enabled;
+	__s32		seg6_require_hmac;
 #endif
 
 	struct ctl_table_header *sysctl_header;
diff --git a/include/net/seg6.h b/include/net/seg6.h
index 228f90f..f833c1f 100644
--- a/include/net/seg6.h
+++ b/include/net/seg6.h
@@ -17,10 +17,12 @@
 #include <linux/net.h>
 #include <linux/ipv6.h>
 #include <net/lwtunnel.h>
+#include <net/seg6_hmac.h>
 
 struct seg6_pernet_data {
 	struct mutex lock;
 	struct in6_addr __rcu *tun_src;
+	struct list_head hmac_infos;
 };
 
 static inline struct seg6_pernet_data *seg6_pernet(struct net *net)
@@ -44,5 +46,6 @@ seg6_lwtunnel_encap(struct lwtunnel_state *lwtstate)
 	return (struct seg6_iptunnel_encap *)lwtstate->data;
 }
 
-#endif
+extern struct sr6_tlv_hmac *seg6_get_tlv_hmac(struct ipv6_sr_hdr *srh);
 
+#endif
diff --git a/include/net/seg6_hmac.h b/include/net/seg6_hmac.h
index 6e5ee6a..9646b7e 100644
--- a/include/net/seg6_hmac.h
+++ b/include/net/seg6_hmac.h
@@ -21,6 +21,7 @@
 #include <linux/ipv6.h>
 #include <linux/route.h>
 #include <net/seg6.h>
+#include <linux/seg6.h>
 #include <linux/seg6_hmac.h>
 
 #define SEG6_HMAC_MAX_DIGESTSIZE	160
diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h
index 7ff1d65..53561be 100644
--- a/include/uapi/linux/ipv6.h
+++ b/include/uapi/linux/ipv6.h
@@ -180,6 +180,7 @@ enum {
 	DEVCONF_KEEP_ADDR_ON_DOWN,
 	DEVCONF_RTR_SOLICIT_MAX_INTERVAL,
 	DEVCONF_SEG6_ENABLED,
+	DEVCONF_SEG6_REQUIRE_HMAC,
 	DEVCONF_MAX
 };
 
diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
index 2fe7b54..95dcb89 100644
--- a/net/ipv6/Makefile
+++ b/net/ipv6/Makefile
@@ -44,7 +44,7 @@ obj-$(CONFIG_IPV6_SIT) += sit.o
 obj-$(CONFIG_IPV6_TUNNEL) += ip6_tunnel.o
 obj-$(CONFIG_IPV6_GRE) += ip6_gre.o
 obj-$(CONFIG_IPV6_FOU) += fou6.o
-obj-$(CONFIG_IPV6_SEG6) += seg6.o seg6_iptunnel.o
+obj-$(CONFIG_IPV6_SEG6) += seg6.o seg6_iptunnel.o seg6_hmac.o
 
 obj-y += addrconf_core.o exthdrs_core.o ip6_checksum.o ip6_icmp.o
 obj-$(CONFIG_INET) += output_core.o protocol.o $(ipv6-offload)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 42c0ffb..2b5e036 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -241,6 +241,7 @@ static struct ipv6_devconf ipv6_devconf __read_mostly = {
 	.keep_addr_on_down	= 0,
 #ifdef CONFIG_IPV6_SEG6
 	.seg6_enabled		= 0,
+	.seg6_require_hmac	= 0,
 #endif
 };
 
@@ -290,6 +291,7 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
 	.keep_addr_on_down	= 0,
 #ifdef CONFIG_IPV6_SEG6
 	.seg6_enabled		= 0,
+	.seg6_require_hmac	= 0,
 #endif
 };
 
@@ -4973,6 +4975,7 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
 	array[DEVCONF_KEEP_ADDR_ON_DOWN] = cnf->keep_addr_on_down;
 #ifdef CONFIG_IPV6_SEG6
 	array[DEVCONF_SEG6_ENABLED] = cnf->seg6_enabled;
+	array[DEVCONF_SEG6_REQUIRE_HMAC] = cnf->seg6_require_hmac;
 #endif
 }
 
@@ -6073,6 +6076,13 @@ static const struct ctl_table addrconf_sysctl[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec,
 	},
+	{
+		.procname	= "seg6_require_hmac",
+		.data		= &ipv6_devconf.seg6_enabled,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
 #endif
 	{
 		/* sentinel */
diff --git a/net/ipv6/seg6.c b/net/ipv6/seg6.c
index 8dafe7b..df79b37 100644
--- a/net/ipv6/seg6.c
+++ b/net/ipv6/seg6.c
@@ -25,6 +25,7 @@
 #include <net/genetlink.h>
 #include <linux/seg6.h>
 #include <linux/seg6_genl.h>
+#include <net/seg6_hmac.h>
 
 static const struct nla_policy seg6_genl_policy[SEG6_ATTR_MAX + 1] = {
 	[SEG6_ATTR_DST]				= { .type = NLA_BINARY,
@@ -46,9 +47,93 @@ static struct genl_family seg6_genl_family = {
 	.netnsok = true,
 };
 
+struct sr6_tlv_hmac *seg6_get_tlv_hmac(struct ipv6_sr_hdr *srh)
+{
+	struct sr6_tlv_hmac *tlv;
+
+	if (srh->hdrlen < (srh->first_segment + 1) * 2 + 5)
+		return NULL;
+
+	if (!(sr_get_flags(srh) & SR6_FLAG_HMAC))
+		return NULL;
+
+	tlv = (struct sr6_tlv_hmac *)
+	      ((char *)srh + ((srh->hdrlen + 1) << 3) - 40);
+
+	if (tlv->type != SR6_TLV_HMAC || tlv->len != 38)
+		return NULL;
+
+	return tlv;
+}
+
 static int seg6_genl_sethmac(struct sk_buff *skb, struct genl_info *info)
 {
-	return -ENOTSUPP;
+	struct net *net = genl_info_net(info);
+	char *secret;
+	u32 hmackeyid;
+	u8 algid;
+	u8 slen;
+	struct seg6_hmac_info *hinfo;
+	int err = 0;
+
+	if (!info->attrs[SEG6_ATTR_HMACKEYID] ||
+	    !info->attrs[SEG6_ATTR_SECRETLEN] ||
+	    !info->attrs[SEG6_ATTR_ALGID])
+		return -EINVAL;
+
+	hmackeyid = nla_get_u32(info->attrs[SEG6_ATTR_HMACKEYID]);
+	slen = nla_get_u8(info->attrs[SEG6_ATTR_SECRETLEN]);
+	algid = nla_get_u8(info->attrs[SEG6_ATTR_ALGID]);
+
+	if (hmackeyid == 0)
+		return -EINVAL;
+
+	if (slen > SEG6_HMAC_SECRET_LEN)
+		return -EINVAL;
+
+	seg6_pernet_lock(net);
+	hinfo = seg6_hmac_info_lookup(net, hmackeyid);
+
+	if (!slen) {
+		if (!hinfo || seg6_hmac_info_del(net, hmackeyid, hinfo))
+			err = -ENOENT;
+		else
+			kfree(hinfo);
+
+		goto out_unlock;
+	}
+
+	if (!info->attrs[SEG6_ATTR_SECRET]) {
+		err = -EINVAL;
+		goto out_unlock;
+	}
+
+	if (hinfo) {
+		if (seg6_hmac_info_del(net, hmackeyid, hinfo)) {
+			err = -ENOENT;
+			goto out_unlock;
+		}
+		kfree(hinfo);
+	}
+
+	secret = (char *)nla_data(info->attrs[SEG6_ATTR_SECRET]);
+
+	hinfo = kzalloc(sizeof(*hinfo), GFP_KERNEL);
+	if (!hinfo) {
+		err = -ENOMEM;
+		goto out_unlock;
+	}
+
+	memcpy(hinfo->secret, secret, slen);
+	hinfo->slen = slen;
+	hinfo->alg_id = algid;
+	hinfo->hmackeyid = hmackeyid;
+
+	seg6_hmac_info_add(net, hmackeyid, hinfo);
+
+out_unlock:
+	seg6_pernet_unlock(net);
+	return err;
 }
 
 static int seg6_genl_set_tunsrc(struct sk_buff *skb, struct genl_info *info)
@@ -113,9 +198,63 @@ static int seg6_genl_get_tunsrc(struct sk_buff *skb, struct genl_info *info)
 	return -ENOMEM;
 }
 
+static int __seg6_hmac_fill_info(struct seg6_hmac_info *hinfo,
+				 struct sk_buff *msg)
+{
+	if (nla_put_u32(msg, SEG6_ATTR_HMACKEYID, hinfo->hmackeyid) ||
+	    nla_put_u8(msg, SEG6_ATTR_SECRETLEN, hinfo->slen) ||
+	    nla_put(msg, SEG6_ATTR_SECRET, hinfo->slen, hinfo->secret) ||
+	    nla_put_u8(msg, SEG6_ATTR_ALGID, hinfo->alg_id))
+		return -1;
+
+	return 0;
+}
+
+static int __seg6_genl_dumphmac_element(struct seg6_hmac_info *hinfo,
+					u32 portid, u32 seq, u32 flags,
+					struct sk_buff *skb, u8 cmd)
+{
+	void *hdr;
+
+	hdr = genlmsg_put(skb, portid, seq, &seg6_genl_family, flags, cmd);
+	if (!hdr)
+		return -ENOMEM;
+
+	if (__seg6_hmac_fill_info(hinfo, skb) < 0)
+		goto nla_put_failure;
+
+	genlmsg_end(skb, hdr);
+	return 0;
+
+nla_put_failure:
+	genlmsg_cancel(skb, hdr);
+	return -EMSGSIZE;
+}
+
 static int seg6_genl_dumphmac(struct sk_buff *skb, struct netlink_callback *cb)
 {
-	return -ENOTSUPP;
+	struct net *net = sock_net(skb->sk);
+	struct seg6_pernet_data *sdata = seg6_pernet(net);
+	struct seg6_hmac_info *hinfo;
+	int idx = 0, ret;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(hinfo, &sdata->hmac_infos, list) {
+		if (idx++ < cb->args[0])
+			continue;
+
+		ret = __seg6_genl_dumphmac_element(hinfo,
+						   NETLINK_CB(cb->skb).portid,
+						   cb->nlh->nlmsg_seq,
+						   NLM_F_MULTI,
+						   skb, SEG6_CMD_DUMPHMAC);
+		if (ret)
+			break;
+	}
+	rcu_read_unlock();
+
+	cb->args[0] = idx;
+	return skb->len;
 }
 
 static const struct genl_ops seg6_genl_ops[] = {
@@ -163,6 +302,8 @@ static int __net_init seg6_net_init(struct net *net)
 
 	net->ipv6.seg6_data = sdata;
 
+	seg6_hmac_net_init(net);
+
 	return 0;
 }
 
@@ -170,6 +311,8 @@ static void __net_exit seg6_net_exit(struct net *net)
 {
 	struct seg6_pernet_data *sdata = seg6_pernet(net);
 
+	seg6_hmac_net_exit(net);
+
 	kfree(sdata->tun_src);
 	kfree(sdata);
 }
@@ -191,10 +334,16 @@ static int __init seg6_init(void)
 	if (err)
 		goto out_unregister_genl;
 
+	err = seg6_hmac_init();
+	if (err)
+		goto out_unregister_pernet;
+
 	pr_info("Segment Routing with IPv6\n");
 
 out:
 	return err;
+out_unregister_pernet:
+	unregister_pernet_subsys(&ip6_segments_ops);
 out_unregister_genl:
 	genl_unregister_family(&seg6_genl_family);
 	goto out;
@@ -203,6 +352,8 @@ module_init(seg6_init);
 
 static void __exit seg6_exit(void)
 {
+	seg6_hmac_exit();
+
 	unregister_pernet_subsys(&ip6_segments_ops);
 	genl_unregister_family(&seg6_genl_family);
 }
diff --git a/net/ipv6/seg6_hmac.c b/net/ipv6/seg6_hmac.c
index 91bd59a..99e90c8 100644
--- a/net/ipv6/seg6_hmac.c
+++ b/net/ipv6/seg6_hmac.c
@@ -289,7 +289,7 @@ int seg6_push_hmac(struct net *net, struct in6_addr *saddr,
 	int err = -ENOENT;
 	struct sr6_tlv_hmac *tlv;
 
-	tlv = seg6_get_tlv(srh, SR6_TLV_HMAC);
+	tlv = seg6_get_tlv_hmac(srh);
 	if (!tlv)
 		return -EINVAL;
 
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 6/9] ipv6: sr: add calls to verify and insert HMAC signatures
  2016-10-17 14:42 [PATCH 0/9] net: add support for IPv6 Segment Routing David Lebrun
                   ` (4 preceding siblings ...)
  2016-10-17 14:43 ` [PATCH 5/9] ipv6: sr: implement API to control SR HMAC structures David Lebrun
@ 2016-10-17 14:43 ` David Lebrun
  2016-10-17 14:43 ` [PATCH 7/9] ipv6: add source address argument for ipv6_push_nfrag_opts David Lebrun
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 29+ messages in thread
From: David Lebrun @ 2016-10-17 14:43 UTC (permalink / raw)
  To: netdev; +Cc: David Lebrun

This patch enables the verification of the HMAC signature for transiting
SR-enabled packets, and its insertion on encapsulated/injected SRH.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
---
 net/ipv6/exthdrs.c       |  6 ++++++
 net/ipv6/seg6_iptunnel.c | 13 +++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index b31f811..73b3743 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -49,6 +49,7 @@
 #endif
 #ifdef CONFIG_IPV6_SEG6
 #include <linux/seg6.h>
+#include <net/seg6_hmac.h>
 #endif
 
 #include <linux/uaccess.h>
@@ -313,6 +314,11 @@ static int ipv6_srh_rcv(struct sk_buff *skb)
 		return -1;
 	}
 
+	if (!seg6_hmac_validate_skb(skb)) {
+		kfree_skb(skb);
+		return -1;
+	}
+
 looped_back:
 	last_addr = hdr->segments;
 
diff --git a/net/ipv6/seg6_iptunnel.c b/net/ipv6/seg6_iptunnel.c
index 461375c..a32a7ff 100644
--- a/net/ipv6/seg6_iptunnel.c
+++ b/net/ipv6/seg6_iptunnel.c
@@ -109,12 +109,19 @@ static int seg6_do_srh_encap(struct sk_buff *skb, struct ipv6_sr_hdr *osrh)
 	hdr->daddr = isrh->segments[isrh->first_segment];
 	set_tun_src(net, skb->dev, &hdr->daddr, &hdr->saddr);
 
+	if (sr_get_flags(isrh) & SR6_FLAG_HMAC) {
+		err = seg6_push_hmac(net, &hdr->saddr, isrh);
+		if (unlikely(err))
+			return err;
+	}
+
 	return 0;
 }
 
 /* insert an SRH within an IPv6 packet, just after the IPv6 header */
 static int seg6_do_srh_inline(struct sk_buff *skb, struct ipv6_sr_hdr *osrh)
 {
+	struct net *net = dev_net(skb_dst(skb)->dev);
 	struct ipv6hdr *hdr, *oldhdr;
 	struct ipv6_sr_hdr *isrh;
 	int hdrlen, err;
@@ -144,6 +151,12 @@ static int seg6_do_srh_inline(struct sk_buff *skb, struct ipv6_sr_hdr *osrh)
 	isrh->segments[0] = hdr->daddr;
 	hdr->daddr = isrh->segments[isrh->first_segment];
 
+	if (sr_get_flags(isrh) & SR6_FLAG_HMAC) {
+		err = seg6_push_hmac(net, &hdr->saddr, isrh);
+		if (unlikely(err))
+			return err;
+	}
+
 	return 0;
 }
 
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 7/9] ipv6: add source address argument for ipv6_push_nfrag_opts
  2016-10-17 14:42 [PATCH 0/9] net: add support for IPv6 Segment Routing David Lebrun
                   ` (5 preceding siblings ...)
  2016-10-17 14:43 ` [PATCH 6/9] ipv6: sr: add calls to verify and insert HMAC signatures David Lebrun
@ 2016-10-17 14:43 ` David Lebrun
  2016-10-17 14:43 ` [PATCH 8/9] ipv6: sr: add support for SRH injection through setsockopt David Lebrun
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 29+ messages in thread
From: David Lebrun @ 2016-10-17 14:43 UTC (permalink / raw)
  To: netdev; +Cc: David Lebrun

This patch prepares for insertion of SRH through setsockopt().
The new source address argument is used when an HMAC field is
present in the SRH, which must be filled. The HMAC signature
process requires the source address as input text.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
---
 include/net/ipv6.h    | 3 ++-
 net/ipv6/exthdrs.c    | 6 +++---
 net/ipv6/ip6_output.c | 5 +++--
 net/ipv6/ip6_tunnel.c | 2 +-
 4 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 8fed1cd..0a3622b 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -932,7 +932,8 @@ int ip6_local_out(struct net *net, struct sock *sk, struct sk_buff *skb);
  */
 
 void ipv6_push_nfrag_opts(struct sk_buff *skb, struct ipv6_txoptions *opt,
-			  u8 *proto, struct in6_addr **daddr_p);
+			  u8 *proto, struct in6_addr **daddr_p,
+			  struct in6_addr *saddr);
 void ipv6_push_frag_opts(struct sk_buff *skb, struct ipv6_txoptions *opt,
 			 u8 *proto);
 
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index 73b3743..c8a74a1 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -827,7 +827,7 @@ int ipv6_parse_hopopts(struct sk_buff *skb)
 
 static void ipv6_push_rthdr(struct sk_buff *skb, u8 *proto,
 			    struct ipv6_rt_hdr *opt,
-			    struct in6_addr **addr_p)
+			    struct in6_addr **addr_p, struct in6_addr *saddr)
 {
 	struct rt0_hdr *phdr, *ihdr;
 	int hops;
@@ -861,10 +861,10 @@ static void ipv6_push_exthdr(struct sk_buff *skb, u8 *proto, u8 type, struct ipv
 
 void ipv6_push_nfrag_opts(struct sk_buff *skb, struct ipv6_txoptions *opt,
 			  u8 *proto,
-			  struct in6_addr **daddr)
+			  struct in6_addr **daddr, struct in6_addr *saddr)
 {
 	if (opt->srcrt) {
-		ipv6_push_rthdr(skb, proto, opt->srcrt, daddr);
+		ipv6_push_rthdr(skb, proto, opt->srcrt, daddr, saddr);
 		/*
 		 * IPV6_RTHDRDSTOPTS is ignored
 		 * unless IPV6_RTHDR is set (RFC3542).
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 6001e78..ddc878d 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -203,7 +203,8 @@ int ip6_xmit(const struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6,
 		if (opt->opt_flen)
 			ipv6_push_frag_opts(skb, opt, &proto);
 		if (opt->opt_nflen)
-			ipv6_push_nfrag_opts(skb, opt, &proto, &first_hop);
+			ipv6_push_nfrag_opts(skb, opt, &proto, &first_hop,
+					     &fl6->saddr);
 	}
 
 	skb_push(skb, sizeof(struct ipv6hdr));
@@ -1672,7 +1673,7 @@ struct sk_buff *__ip6_make_skb(struct sock *sk,
 	if (opt && opt->opt_flen)
 		ipv6_push_frag_opts(skb, opt, &proto);
 	if (opt && opt->opt_nflen)
-		ipv6_push_nfrag_opts(skb, opt, &proto, &final_dst);
+		ipv6_push_nfrag_opts(skb, opt, &proto, &final_dst, &fl6->saddr);
 
 	skb_push(skb, sizeof(struct ipv6hdr));
 	skb_reset_network_header(skb);
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 6a66adb..1281e17 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -1155,7 +1155,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
 
 	if (encap_limit >= 0) {
 		init_tel_txopt(&opt, encap_limit);
-		ipv6_push_nfrag_opts(skb, &opt.ops, &proto, NULL);
+		ipv6_push_nfrag_opts(skb, &opt.ops, &proto, NULL, NULL);
 	}
 
 	/* Calculate max headroom for all the headers and adjust
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 8/9] ipv6: sr: add support for SRH injection through setsockopt
  2016-10-17 14:42 [PATCH 0/9] net: add support for IPv6 Segment Routing David Lebrun
                   ` (6 preceding siblings ...)
  2016-10-17 14:43 ` [PATCH 7/9] ipv6: add source address argument for ipv6_push_nfrag_opts David Lebrun
@ 2016-10-17 14:43 ` David Lebrun
  2016-10-17 14:43 ` [PATCH 9/9] ipv6: sr: add documentation file for per-interface sysctls David Lebrun
  2016-10-17 15:09 ` [PATCH 0/9] net: add support for IPv6 Segment Routing David Laight
  9 siblings, 0 replies; 29+ messages in thread
From: David Lebrun @ 2016-10-17 14:43 UTC (permalink / raw)
  To: netdev; +Cc: David Lebrun

This patch adds support for per-socket SRH injection with the setsockopt
system call through the IPPROTO_IPV6, IPV6_RTHDR options.
The SRH is pushed through the ipv6_push_nfrag_opts function.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
---
 net/ipv6/exthdrs.c       | 67 +++++++++++++++++++++++++++++++++++++++++++++---
 net/ipv6/ipv6_sockglue.c |  4 +++
 2 files changed, 67 insertions(+), 4 deletions(-)

diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index c8a74a1..7fb8b3f 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -825,9 +825,9 @@ int ipv6_parse_hopopts(struct sk_buff *skb)
  *	for headers.
  */
 
-static void ipv6_push_rthdr(struct sk_buff *skb, u8 *proto,
-			    struct ipv6_rt_hdr *opt,
-			    struct in6_addr **addr_p, struct in6_addr *saddr)
+static void ipv6_push_rthdr0(struct sk_buff *skb, u8 *proto,
+			     struct ipv6_rt_hdr *opt,
+			     struct in6_addr **addr_p, struct in6_addr *saddr)
 {
 	struct rt0_hdr *phdr, *ihdr;
 	int hops;
@@ -850,6 +850,55 @@ static void ipv6_push_rthdr(struct sk_buff *skb, u8 *proto,
 	*proto = NEXTHDR_ROUTING;
 }
 
+#ifdef CONFIG_IPV6_SEG6
+static void ipv6_push_rthdr4(struct sk_buff *skb, u8 *proto,
+			     struct ipv6_rt_hdr *opt,
+			     struct in6_addr **addr_p, struct in6_addr *saddr)
+{
+	struct ipv6_sr_hdr *sr_phdr, *sr_ihdr;
+	struct net *net = NULL;
+	int plen, hops;
+
+	if (skb->dev)
+		net = dev_net(skb->dev);
+	else if (skb->sk)
+		net = sock_net(skb->sk);
+
+	WARN_ON(!net);
+
+	sr_ihdr = (struct ipv6_sr_hdr *)opt;
+	plen = (sr_ihdr->hdrlen + 1) << 3;
+
+	sr_phdr = (struct ipv6_sr_hdr *)skb_push(skb, plen);
+	memcpy(sr_phdr, sr_ihdr, sizeof(struct ipv6_sr_hdr));
+
+	hops = sr_ihdr->first_segment + 1;
+	memcpy(sr_phdr->segments + 1, sr_ihdr->segments + 1,
+	       (hops - 1) * sizeof(struct in6_addr));
+
+	sr_phdr->segments[0] = **addr_p;
+	*addr_p = &sr_ihdr->segments[hops - 1];
+
+	if (net && (sr_get_flags(sr_phdr) & SR6_FLAG_HMAC))
+		seg6_push_hmac(net, saddr, sr_phdr);
+
+	sr_phdr->nexthdr = *proto;
+	*proto = NEXTHDR_ROUTING;
+}
+#endif
+
+static void ipv6_push_rthdr(struct sk_buff *skb, u8 *proto,
+			    struct ipv6_rt_hdr *opt,
+			    struct in6_addr **addr_p, struct in6_addr *saddr)
+{
+#ifdef CONFIG_IPV6_SEG6
+	if (opt->type == IPV6_SRCRT_TYPE_4)
+		ipv6_push_rthdr4(skb, proto, opt, addr_p, saddr);
+#endif
+	if (opt->type == IPV6_SRCRT_TYPE_0)
+		ipv6_push_rthdr0(skb, proto, opt, addr_p, saddr);
+}
+
 static void ipv6_push_exthdr(struct sk_buff *skb, u8 *proto, u8 type, struct ipv6_opt_hdr *opt)
 {
 	struct ipv6_opt_hdr *h = (struct ipv6_opt_hdr *)skb_push(skb, ipv6_optlen(opt));
@@ -1091,7 +1140,17 @@ struct in6_addr *fl6_update_dst(struct flowi6 *fl6,
 		return NULL;
 
 	*orig = fl6->daddr;
-	fl6->daddr = *((struct rt0_hdr *)opt->srcrt)->addr;
+
+#ifdef CONFIG_IPV6_SEG6
+	if (opt->srcrt->type == IPV6_SRCRT_TYPE_4) {
+		struct ipv6_sr_hdr *srh = (struct ipv6_sr_hdr *)opt->srcrt;
+
+		fl6->daddr = srh->segments[srh->first_segment];
+	}
+#endif
+	if (opt->srcrt->type == IPV6_SRCRT_TYPE_0)
+		fl6->daddr = *((struct rt0_hdr *)opt->srcrt)->addr;
+
 	return orig;
 }
 EXPORT_SYMBOL_GPL(fl6_update_dst);
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 5330262..f81f3db 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -429,6 +429,10 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 
 				break;
 #endif
+#ifdef CONFIG_IPV6_SEG6
+			case IPV6_SRCRT_TYPE_4:
+				break;
+#endif
 			default:
 				goto sticky_done;
 			}
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 9/9] ipv6: sr: add documentation file for per-interface sysctls
  2016-10-17 14:42 [PATCH 0/9] net: add support for IPv6 Segment Routing David Lebrun
                   ` (7 preceding siblings ...)
  2016-10-17 14:43 ` [PATCH 8/9] ipv6: sr: add support for SRH injection through setsockopt David Lebrun
@ 2016-10-17 14:43 ` David Lebrun
  2016-10-17 15:09 ` [PATCH 0/9] net: add support for IPv6 Segment Routing David Laight
  9 siblings, 0 replies; 29+ messages in thread
From: David Lebrun @ 2016-10-17 14:43 UTC (permalink / raw)
  To: netdev; +Cc: David Lebrun

This patch adds documentation for some SR-related per-interface
sysctls.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
---
 Documentation/networking/seg6-sysctl.txt | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)
 create mode 100644 Documentation/networking/seg6-sysctl.txt

diff --git a/Documentation/networking/seg6-sysctl.txt b/Documentation/networking/seg6-sysctl.txt
new file mode 100644
index 0000000..bdbde23
--- /dev/null
+++ b/Documentation/networking/seg6-sysctl.txt
@@ -0,0 +1,18 @@
+/proc/sys/net/conf/<iface>/seg6_* variables:
+
+seg6_enabled - BOOL
+	Accept or drop SR-enabled IPv6 packets on this interface.
+
+	Relevant packets are those with SRH present and DA = local.
+
+	0 - disabled (default)
+	not 0 - enabled
+
+seg6_require_hmac - INTEGER
+	Define HMAC policy for ingress SR-enabled packets on this interface.
+
+	-1 - Ignore HMAC field
+	0 - Accept SR packets without HMAC, validate SR packets with HMAC
+	1 - Drop SR packets without HMAC, validate SR packets with HMAC
+
+	Default is 0.
-- 
2.7.3

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/9] ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header)
  2016-10-17 14:42 ` [PATCH 1/9] ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header) David Lebrun
@ 2016-10-17 14:57   ` David Miller
  2016-10-17 15:04     ` David Lebrun
  2016-10-17 17:01   ` Tom Herbert
  2016-10-17 17:31   ` Tom Herbert
  2 siblings, 1 reply; 29+ messages in thread
From: David Miller @ 2016-10-17 14:57 UTC (permalink / raw)
  To: david.lebrun; +Cc: netdev

From: David Lebrun <david.lebrun@uclouvain.be>
Date: Mon, 17 Oct 2016 16:42:22 +0200

> +/*
> + * SRH
> + */
> +struct ipv6_sr_hdr {
> +	__u8	nexthdr;
> +	__u8	hdrlen;
> +	__u8	type;
> +	__u8	segments_left;
> +	__u8	first_segment;
> +	__be16	flags;
> +	__u8	reserved;
> +
> +	struct in6_addr segments[0];
> +} __attribute__((packed));

Please don't use packed, it results in extremely inefficient code on
several architectures.

You can simply declare the flags as two 8-bit pieces and all will work
out fine.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/9] ipv6: sr: add code base for control plane support of SR-IPv6
  2016-10-17 14:42 ` [PATCH 2/9] ipv6: sr: add code base for control plane support of SR-IPv6 David Lebrun
@ 2016-10-17 15:00   ` David Miller
  2016-10-17 15:06     ` David Lebrun
  2016-10-17 17:07   ` Tom Herbert
  1 sibling, 1 reply; 29+ messages in thread
From: David Miller @ 2016-10-17 15:00 UTC (permalink / raw)
  To: david.lebrun; +Cc: netdev

From: David Lebrun <david.lebrun@uclouvain.be>
Date: Mon, 17 Oct 2016 16:42:23 +0200

> +static int seg6_genl_set_tunsrc(struct sk_buff *skb, struct genl_info *info)
> +{
> +	struct net *net = genl_info_net(info);
> +	struct seg6_pernet_data *sdata = seg6_pernet(net);
> +	struct in6_addr *val, *t_old, *t_new;

Please ordre local variables from longest to shortest line (AKA reverse
christmas tree layout).

Please audit your entire submission for this problem.

> +	val = (struct in6_addr *)nla_data(info->attrs[SEG6_ATTR_DST]);

Please remove all casts from void pointers, they are completely unecessary.
Since nla_data() returns "(void *)", this applies here.

Please audit your entire submission for this problem.

> +	mutex_init(&sdata->lock);
> +
> +	sdata->tun_src = kzalloc(sizeof(*sdata->tun_src), GFP_KERNEL);
> +	if (!sdata->tun_src) {
> +		kfree(sdata);
> +		return -ENOMEM;

Best not to free an object while you still hold a mutex inside of it.

Also taking the mutex makes no sense at all, this object has no global
visibility, therefore no other thread of control can operate upon it.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/9] ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header)
  2016-10-17 14:57   ` David Miller
@ 2016-10-17 15:04     ` David Lebrun
  0 siblings, 0 replies; 29+ messages in thread
From: David Lebrun @ 2016-10-17 15:04 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

[-- Attachment #1: Type: text/plain, Size: 254 bytes --]

On 10/17/2016 04:57 PM, David Miller wrote:
> Please don't use packed, it results in extremely inefficient code on
> several architectures.
> 
> You can simply declare the flags as two 8-bit pieces and all will work
> out fine.

Noted, will do


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/9] ipv6: sr: add code base for control plane support of SR-IPv6
  2016-10-17 15:00   ` David Miller
@ 2016-10-17 15:06     ` David Lebrun
  2016-10-17 15:21       ` David Miller
  0 siblings, 1 reply; 29+ messages in thread
From: David Lebrun @ 2016-10-17 15:06 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

[-- Attachment #1: Type: text/plain, Size: 1026 bytes --]

On 10/17/2016 05:00 PM, David Miller wrote:
> Please ordre local variables from longest to shortest line (AKA reverse
> christmas tree layout).
> 
> Please audit your entire submission for this problem.
> 
>> +	val = (struct in6_addr *)nla_data(info->attrs[SEG6_ATTR_DST]);
> 
> Please remove all casts from void pointers, they are completely unecessary.
> Since nla_data() returns "(void *)", this applies here.
> 
> Please audit your entire submission for this problem.
> 

Will do

>> +	mutex_init(&sdata->lock);
>> +
>> +	sdata->tun_src = kzalloc(sizeof(*sdata->tun_src), GFP_KERNEL);
>> +	if (!sdata->tun_src) {
>> +		kfree(sdata);
>> +		return -ENOMEM;
> 
> Best not to free an object while you still hold a mutex inside of it.
> 
> Also taking the mutex makes no sense at all, this object has no global
> visibility, therefore no other thread of control can operate upon it.
> 

The mutex is not taken, just initialized. Unless mutex_init takes the
lock, which would be quite strange ?


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [PATCH 0/9] net: add support for IPv6 Segment Routing
  2016-10-17 14:42 [PATCH 0/9] net: add support for IPv6 Segment Routing David Lebrun
                   ` (8 preceding siblings ...)
  2016-10-17 14:43 ` [PATCH 9/9] ipv6: sr: add documentation file for per-interface sysctls David Lebrun
@ 2016-10-17 15:09 ` David Laight
  9 siblings, 0 replies; 29+ messages in thread
From: David Laight @ 2016-10-17 15:09 UTC (permalink / raw)
  To: 'David Lebrun', netdev

From: David Lebrun
> Sent: 17 October 2016 15:42
 
> Segment Routing (SR) is a source routing paradigm, architecturally
> defined in draft-ietf-spring-segment-routing-09 [1]. The IPv6 flavor of
> SR is defined in draft-ietf-6man-segment-routing-header-02 [2].
> 
> The main idea is that an SR-enabled packet contains a list of segments,
> which represent mandatory waypoints. Each waypoint is called a segment
> endpoint. The SR-enabled packet is routed normally (e.g. shortest path)
> between the segment endpoints. A node that inserts an SRH into a packet
> is called an ingress node, and a node that is the last segment endpoint
> is called an egress node.

Is this a new source routing definition?

I thought the original IPv6 definition contained source routing
(probably because someone thought it was a good idea (tm))
but no one implemented it because of the security implications.

You really don't want sending systems using source routing to
get packets across routers (etc) in ways that are contrary to
the packets actual addresses.

	David

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/9] ipv6: sr: add code base for control plane support of SR-IPv6
  2016-10-17 15:06     ` David Lebrun
@ 2016-10-17 15:21       ` David Miller
  0 siblings, 0 replies; 29+ messages in thread
From: David Miller @ 2016-10-17 15:21 UTC (permalink / raw)
  To: david.lebrun; +Cc: netdev

From: David Lebrun <david.lebrun@uclouvain.be>
Date: Mon, 17 Oct 2016 17:06:06 +0200

> The mutex is not taken, just initialized. Unless mutex_init takes the
> lock, which would be quite strange ?

My bad, I misread the code.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/9] ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header)
  2016-10-17 14:42 ` [PATCH 1/9] ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header) David Lebrun
  2016-10-17 14:57   ` David Miller
@ 2016-10-17 17:01   ` Tom Herbert
  2016-10-18 11:43     ` David Lebrun
  2016-10-20 13:04     ` David Lebrun
  2016-10-17 17:31   ` Tom Herbert
  2 siblings, 2 replies; 29+ messages in thread
From: Tom Herbert @ 2016-10-17 17:01 UTC (permalink / raw)
  To: David Lebrun; +Cc: Linux Kernel Network Developers

On Mon, Oct 17, 2016 at 7:42 AM, David Lebrun <david.lebrun@uclouvain.be> wrote:
> Implement minimal support for processing of SR-enabled packets
> as described in
> https://tools.ietf.org/html/draft-ietf-6man-segment-routing-header-02.
>
> This patch implements the following operations:
> - Intermediate segment endpoint: incrementation of active segment and rerouting.
> - Egress for SR-encapsulated packets: decapsulation of outer IPv6 header + SRH
>   and routing of inner packet.
> - Cleanup flag support for SR-inlined packets: removal of SRH if we are the
>   penultimate segment endpoint.
>
> A per-interface sysctl seg6_enabled is provided, to accept/deny SR-enabled
> packets. Default is deny.
>
> This patch does not provide support for HMAC-signed packets.
>
> Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
> ---
>  include/linux/ipv6.h      |   3 +
>  include/linux/seg6.h      |   6 ++
>  include/uapi/linux/ipv6.h |   2 +
>  include/uapi/linux/seg6.h |  46 +++++++++++++++
>  net/ipv6/Kconfig          |  13 +++++
>  net/ipv6/addrconf.c       |  18 ++++++
>  net/ipv6/exthdrs.c        | 140 ++++++++++++++++++++++++++++++++++++++++++++++
>  7 files changed, 228 insertions(+)
>  create mode 100644 include/linux/seg6.h
>  create mode 100644 include/uapi/linux/seg6.h
>
> diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
> index 7e9a789..75395ad 100644
> --- a/include/linux/ipv6.h
> +++ b/include/linux/ipv6.h
> @@ -64,6 +64,9 @@ struct ipv6_devconf {
>         } stable_secret;
>         __s32           use_oif_addrs_only;
>         __s32           keep_addr_on_down;
> +#ifdef CONFIG_IPV6_SEG6
> +       __s32           seg6_enabled;
> +#endif
>
>         struct ctl_table_header *sysctl_header;
>  };
> diff --git a/include/linux/seg6.h b/include/linux/seg6.h
> new file mode 100644
> index 0000000..7a66d2b
> --- /dev/null
> +++ b/include/linux/seg6.h
> @@ -0,0 +1,6 @@
> +#ifndef _LINUX_SEG6_H
> +#define _LINUX_SEG6_H
> +
> +#include <uapi/linux/seg6.h>
> +
> +#endif
> diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h
> index 8c27723..7ff1d65 100644
> --- a/include/uapi/linux/ipv6.h
> +++ b/include/uapi/linux/ipv6.h
> @@ -39,6 +39,7 @@ struct in6_ifreq {
>  #define IPV6_SRCRT_STRICT      0x01    /* Deprecated; will be removed */
>  #define IPV6_SRCRT_TYPE_0      0       /* Deprecated; will be removed */
>  #define IPV6_SRCRT_TYPE_2      2       /* IPv6 type 2 Routing Header   */
> +#define IPV6_SRCRT_TYPE_4      4       /* Segment Routing with IPv6 */
>
>  /*
>   *     routing header
> @@ -178,6 +179,7 @@ enum {
>         DEVCONF_DROP_UNSOLICITED_NA,
>         DEVCONF_KEEP_ADDR_ON_DOWN,
>         DEVCONF_RTR_SOLICIT_MAX_INTERVAL,
> +       DEVCONF_SEG6_ENABLED,
>         DEVCONF_MAX
>  };
>
> diff --git a/include/uapi/linux/seg6.h b/include/uapi/linux/seg6.h
> new file mode 100644
> index 0000000..9f9e157
> --- /dev/null
> +++ b/include/uapi/linux/seg6.h
> @@ -0,0 +1,46 @@
> +/*
> + *  SR-IPv6 implementation
> + *
> + *  Author:
> + *  David Lebrun <david.lebrun@uclouvain.be>
> + *
> + *
> + *  This program is free software; you can redistribute it and/or
> + *      modify it under the terms of the GNU General Public License
> + *      as published by the Free Software Foundation; either version
> + *      2 of the License, or (at your option) any later version.
> + */
> +
> +#ifndef _UAPI_LINUX_SEG6_H
> +#define _UAPI_LINUX_SEG6_H
> +
> +/*
> + * SRH
> + */
> +struct ipv6_sr_hdr {
> +       __u8    nexthdr;
> +       __u8    hdrlen;
> +       __u8    type;
> +       __u8    segments_left;
> +       __u8    first_segment;
> +       __be16  flags;

Bad alignment for 16 bit field could be unpleasant on some
architectures. Might be better to split this into to u8's, defined
flags are only in first eight bits anyway.

> +       __u8    reserved;
> +
> +       struct in6_addr segments[0];
> +} __attribute__((packed));
> +
> +#define SR6_FLAG_CLEANUP       (1 << 15)
> +#define SR6_FLAG_PROTECTED     (1 << 14)
> +#define SR6_FLAG_OAM           (1 << 13)
> +#define SR6_FLAG_ALERT         (1 << 12)
> +#define SR6_FLAG_HMAC          (1 << 11)
> +
> +#define SR6_TLV_INGRESS                1
> +#define SR6_TLV_EGRESS         2
> +#define SR6_TLV_OPAQUE         3
> +#define SR6_TLV_PADDING                4
> +#define SR6_TLV_HMAC           5
> +
> +#define sr_get_flags(srh) (be16_to_cpu((srh)->flags))
> +
> +#endif
> diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig
> index 2343e4f..691c318 100644
> --- a/net/ipv6/Kconfig
> +++ b/net/ipv6/Kconfig
> @@ -289,4 +289,17 @@ config IPV6_PIMSM_V2
>           Support for IPv6 PIM multicast routing protocol PIM-SMv2.
>           If unsure, say N.
>
> +config IPV6_SEG6
> +       bool "IPv6: Segment Routing support"
> +       depends on IPV6
> +       select CRYPTO_HMAC
> +       select CRYPTO_SHA1
> +       select CRYPTO_SHA256
> +       ---help---
> +         Experimental support for IPv6 Segment Routing dataplane as defined

I don't think calling this experimental is relevant.

> +         in IETF draft-ietf-6man-segment-routing-header-02. This option
> +         enables the processing of SR-enabled packets allowing the kernel
> +         to act as a segment endpoint (intermediate or egress). It also
> +         enables an API for the kernel to act as an ingress SR router.
> +
>  endif # IPV6
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index d8983e1..42c0ffb 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -239,6 +239,9 @@ static struct ipv6_devconf ipv6_devconf __read_mostly = {
>         .use_oif_addrs_only     = 0,
>         .ignore_routes_with_linkdown = 0,
>         .keep_addr_on_down      = 0,
> +#ifdef CONFIG_IPV6_SEG6
> +       .seg6_enabled           = 0,
> +#endif
>  };
>
>  static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
> @@ -285,6 +288,9 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
>         .use_oif_addrs_only     = 0,
>         .ignore_routes_with_linkdown = 0,
>         .keep_addr_on_down      = 0,
> +#ifdef CONFIG_IPV6_SEG6
> +       .seg6_enabled           = 0,
> +#endif
>  };
>
>  /* Check if a valid qdisc is available */
> @@ -4965,6 +4971,9 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
>         array[DEVCONF_DROP_UNICAST_IN_L2_MULTICAST] = cnf->drop_unicast_in_l2_multicast;
>         array[DEVCONF_DROP_UNSOLICITED_NA] = cnf->drop_unsolicited_na;
>         array[DEVCONF_KEEP_ADDR_ON_DOWN] = cnf->keep_addr_on_down;
> +#ifdef CONFIG_IPV6_SEG6
> +       array[DEVCONF_SEG6_ENABLED] = cnf->seg6_enabled;
> +#endif
>  }
>
>  static inline size_t inet6_ifla6_size(void)
> @@ -6056,6 +6065,15 @@ static const struct ctl_table addrconf_sysctl[] = {
>                 .proc_handler   = proc_dointvec,
>
>         },
> +#ifdef CONFIG_IPV6_SEG6
> +       {
> +               .procname       = "seg6_enabled",
> +               .data           = &ipv6_devconf.seg6_enabled,
> +               .maxlen         = sizeof(int),
> +               .mode           = 0644,
> +               .proc_handler   = proc_dointvec,
> +       },
> +#endif
>         {
>                 /* sentinel */
>         }
> diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
> index 139ceb6..b31f811 100644
> --- a/net/ipv6/exthdrs.c
> +++ b/net/ipv6/exthdrs.c
> @@ -47,6 +47,9 @@
>  #if IS_ENABLED(CONFIG_IPV6_MIP6)
>  #include <net/xfrm.h>
>  #endif
> +#ifdef CONFIG_IPV6_SEG6
> +#include <linux/seg6.h>
> +#endif
>
>  #include <linux/uaccess.h>
>
> @@ -286,6 +289,137 @@ static int ipv6_destopt_rcv(struct sk_buff *skb)
>         return -1;
>  }
>
> +#ifdef CONFIG_IPV6_SEG6
> +static int ipv6_srh_rcv(struct sk_buff *skb)
> +{
> +       struct in6_addr *addr = NULL, *last_addr = NULL, *active_addr = NULL;
> +       struct inet6_skb_parm *opt = IP6CB(skb);
> +       struct net *net = dev_net(skb->dev);
> +       struct ipv6_sr_hdr *hdr;
> +       struct inet6_dev *idev;
> +       int cleanup = 0;
> +       int accept_seg6;
> +
> +       hdr = (struct ipv6_sr_hdr *)skb_transport_header(skb);
> +
> +       idev = __in6_dev_get(skb->dev);
> +
> +       accept_seg6 = net->ipv6.devconf_all->seg6_enabled;
> +       if (accept_seg6 > idev->cnf.seg6_enabled)
> +               accept_seg6 = idev->cnf.seg6_enabled;
> +
> +       if (!accept_seg6) {
> +               kfree_skb(skb);
> +               return -1;
> +       }
> +
> +looped_back:
> +       last_addr = hdr->segments;
> +
> +       if (hdr->segments_left > 0) {
> +               if (hdr->nexthdr != NEXTHDR_IPV6 && hdr->segments_left == 1 &&
> +                   sr_get_flags(hdr) & SR6_FLAG_CLEANUP)
> +                       cleanup = 1;
> +       } else {
> +               if (hdr->nexthdr == NEXTHDR_IPV6) {
> +                       int offset = (hdr->hdrlen + 1) << 3;
> +
> +                       if (!pskb_pull(skb, offset)) {
> +                               kfree_skb(skb);
> +                               return -1;
> +                       }
> +                       skb_postpull_rcsum(skb, skb_transport_header(skb),
> +                                          offset);
> +
> +                       skb_reset_network_header(skb);
> +                       skb_reset_transport_header(skb);
> +                       skb->encapsulation = 0;
> +
> +                       __skb_tunnel_rx(skb, skb->dev, net);
> +
> +                       netif_rx(skb);
> +                       return -1;
> +               }
> +
> +               opt->srcrt = skb_network_header_len(skb);
> +               opt->lastopt = opt->srcrt;
> +               skb->transport_header += (hdr->hdrlen + 1) << 3;
> +               opt->nhoff = (&hdr->nexthdr) - skb_network_header(skb);
> +
> +               return 1;
> +       }
> +
> +       if (skb_cloned(skb)) {
> +               if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC)) {
> +                       __IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)),
> +                                       IPSTATS_MIB_OUTDISCARDS);
> +                       kfree_skb(skb);
> +                       return -1;
> +               }
> +       }
> +
> +       if (skb->ip_summed == CHECKSUM_COMPLETE)
> +               skb->ip_summed = CHECKSUM_NONE;
> +
Because the packet is being changed? Would it make sense to update the
checksum complete value based on the changes being made. Consider the
case that the next hop is local to the host (someone may try to
implement network virtualization this way).

> +       hdr = (struct ipv6_sr_hdr *)skb_transport_header(skb);
> +
> +       active_addr = hdr->segments + hdr->segments_left;
> +       hdr->segments_left--;
> +       addr = hdr->segments + hdr->segments_left;
> +
> +       ipv6_hdr(skb)->daddr = *addr;
> +
> +       skb_push(skb, sizeof(struct ipv6hdr));
> +
> +       if (cleanup) {
> +               int srhlen = (hdr->hdrlen + 1) << 3;
> +               int nh = hdr->nexthdr;
> +
> +               memmove(skb_network_header(skb) + srhlen,
> +                       skb_network_header(skb),
> +                       (unsigned char *)hdr - skb_network_header(skb));
> +               skb_pull(skb, srhlen);
> +               skb->network_header += srhlen;
> +               ipv6_hdr(skb)->nexthdr = nh;
> +               ipv6_hdr(skb)->payload_len = htons(skb->len -
> +                                                  sizeof(struct ipv6hdr));
> +       }
> +
> +       skb_dst_drop(skb);
> +
> +       ip6_route_input(skb);
> +
> +       if (skb_dst(skb)->error) {
> +               dst_input(skb);
> +               return -1;
> +       }
> +
> +       if (skb_dst(skb)->dev->flags & IFF_LOOPBACK) {
> +               if (ipv6_hdr(skb)->hop_limit <= 1) {
> +                       __IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)),
> +                                       IPSTATS_MIB_INHDRERRORS);
> +                       icmpv6_send(skb, ICMPV6_TIME_EXCEED,
> +                                   ICMPV6_EXC_HOPLIMIT, 0);
> +                       kfree_skb(skb);
> +                       return -1;
> +               }
> +               ipv6_hdr(skb)->hop_limit--;
> +
> +               /* be sure that srh is still present before reinjecting */
> +               if (!cleanup) {
> +                       skb_pull(skb, sizeof(struct ipv6hdr));
> +                       goto looped_back;
> +               }
> +               skb_set_transport_header(skb, sizeof(struct ipv6hdr));
> +               IP6CB(skb)->nhoff = offsetof(struct ipv6hdr, nexthdr);
> +       }
> +
> +       dst_input(skb);
> +
> +       return -1;
> +}
> +#endif
> +
>  /********************************
>    Routing header.
>   ********************************/
> @@ -326,6 +460,12 @@ static int ipv6_rthdr_rcv(struct sk_buff *skb)
>                 return -1;
>         }
>
> +#ifdef CONFIG_IPV6_SEG6
> +       /* segment routing */
> +       if (hdr->type == IPV6_SRCRT_TYPE_4)
> +               return ipv6_srh_rcv(skb);
> +#endif

This doesn't belong in one of the switch statements in ipv6_rthdr_rcv?

> +
>  looped_back:
>         if (hdr->segments_left == 0) {
>                 switch (hdr->type) {
> --
> 2.7.3
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/9] ipv6: sr: add code base for control plane support of SR-IPv6
  2016-10-17 14:42 ` [PATCH 2/9] ipv6: sr: add code base for control plane support of SR-IPv6 David Lebrun
  2016-10-17 15:00   ` David Miller
@ 2016-10-17 17:07   ` Tom Herbert
  2016-10-18 11:45     ` David Lebrun
  1 sibling, 1 reply; 29+ messages in thread
From: Tom Herbert @ 2016-10-17 17:07 UTC (permalink / raw)
  To: David Lebrun; +Cc: Linux Kernel Network Developers

On Mon, Oct 17, 2016 at 7:42 AM, David Lebrun <david.lebrun@uclouvain.be> wrote:
> This patch adds the necessary hooks and structures to provide support
> for SR-IPv6 control plane, essentially the Generic Netlink commands
> that will be used for userspace control over the Segment Routing
> kernel structures.
>
> The genetlink commands provide control over two different structures:
> tunnel source and HMAC data. The tunnel source is the source address
> that will be used by default when encapsulating packets into an
> outer IPv6 header + SRH. If the tunnel source is set to :: then an
> address of the outgoing interface will be selected as the source.
>
> The HMAC commands currently just return ENOTSUPP and will be implemented
> in a future patch.
>
> Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
> ---
>  include/linux/seg6_genl.h      |   6 ++
>  include/net/netns/ipv6.h       |   3 +
>  include/net/seg6.h             |  41 ++++++++
>  include/uapi/linux/seg6_genl.h |  33 +++++++
>  net/ipv6/Makefile              |   1 +
>  net/ipv6/seg6.c                | 212 +++++++++++++++++++++++++++++++++++++++++
>  6 files changed, 296 insertions(+)
>  create mode 100644 include/linux/seg6_genl.h
>  create mode 100644 include/net/seg6.h
>  create mode 100644 include/uapi/linux/seg6_genl.h
>  create mode 100644 net/ipv6/seg6.c
>
> diff --git a/include/linux/seg6_genl.h b/include/linux/seg6_genl.h
> new file mode 100644
> index 0000000..d6c3fb4f
> --- /dev/null
> +++ b/include/linux/seg6_genl.h
> @@ -0,0 +1,6 @@
> +#ifndef _LINUX_SEG6_GENL_H
> +#define _LINUX_SEG6_GENL_H
> +
> +#include <uapi/linux/seg6_genl.h>
> +
> +#endif
> diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
> index 10d0848..67825bb 100644
> --- a/include/net/netns/ipv6.h
> +++ b/include/net/netns/ipv6.h
> @@ -85,6 +85,9 @@ struct netns_ipv6 {
>  #endif
>         atomic_t                dev_addr_genid;
>         atomic_t                fib6_sernum;
> +#ifdef CONFIG_IPV6_SEG6
> +       struct seg6_pernet_data *seg6_data;
> +#endif
>  };
>
>  #if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6)
> diff --git a/include/net/seg6.h b/include/net/seg6.h
> new file mode 100644
> index 0000000..89b819e
> --- /dev/null
> +++ b/include/net/seg6.h
> @@ -0,0 +1,41 @@
> +/*
> + *  SR-IPv6 implementation
> + *
> + *  Author:
> + *  David Lebrun <david.lebrun@uclouvain.be>
> + *
> + *
> + *  This program is free software; you can redistribute it and/or
> + *      modify it under the terms of the GNU General Public License
> + *      as published by the Free Software Foundation; either version
> + *      2 of the License, or (at your option) any later version.
> + */
> +
> +#ifndef _NET_SEG6_H
> +#define _NET_SEG6_H
> +
> +#include <linux/net.h>
> +#include <linux/ipv6.h>
> +
> +struct seg6_pernet_data {
> +       struct mutex lock;
> +       struct in6_addr __rcu *tun_src;
> +};
> +
> +static inline struct seg6_pernet_data *seg6_pernet(struct net *net)
> +{
> +       return net->ipv6.seg6_data;
> +}
> +
> +static inline void seg6_pernet_lock(struct net *net)
> +{
> +       mutex_lock(&seg6_pernet(net)->lock);
> +}
> +
> +static inline void seg6_pernet_unlock(struct net *net)
> +{
> +       mutex_unlock(&seg6_pernet(net)->lock);
> +}
> +
IMO it's better not to hide mutex_lock/unlock in a static inline
function. Pairing mutex_lock with mutex_lock is critical and should be
each to see in code.

> +#endif
> +
> diff --git a/include/uapi/linux/seg6_genl.h b/include/uapi/linux/seg6_genl.h
> new file mode 100644
> index 0000000..4db988b
> --- /dev/null
> +++ b/include/uapi/linux/seg6_genl.h
> @@ -0,0 +1,33 @@
> +#ifndef _UAPI_LINUX_SEG6_GENL_H
> +#define _UAPI_LINUX_SEG6_GENL_H
> +
> +#define SEG6_GENL_NAME         "SEG6"
> +#define SEG6_GENL_VERSION      0x1
> +
> +enum {
> +       SEG6_ATTR_UNSPEC,
> +       SEG6_ATTR_DST,
> +       SEG6_ATTR_DSTLEN,
> +       SEG6_ATTR_HMACKEYID,
> +       SEG6_ATTR_SECRET,
> +       SEG6_ATTR_SECRETLEN,
> +       SEG6_ATTR_ALGID,
> +       SEG6_ATTR_HMACINFO,
> +       __SEG6_ATTR_MAX,
> +};
> +
> +#define SEG6_ATTR_MAX (__SEG6_ATTR_MAX - 1)
> +
> +enum {
> +       SEG6_CMD_UNSPEC,
> +       SEG6_CMD_SETHMAC,
> +       SEG6_CMD_DUMPHMAC,
> +       SEG6_CMD_SET_TUNSRC,
> +       SEG6_CMD_GET_TUNSRC,
> +       __SEG6_CMD_MAX,
> +};
> +
> +#define SEG6_CMD_MAX (__SEG6_CMD_MAX - 1)
> +
> +#endif
> +
> diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
> index c174ccb..5069257 100644
> --- a/net/ipv6/Makefile
> +++ b/net/ipv6/Makefile
> @@ -44,6 +44,7 @@ obj-$(CONFIG_IPV6_SIT) += sit.o
>  obj-$(CONFIG_IPV6_TUNNEL) += ip6_tunnel.o
>  obj-$(CONFIG_IPV6_GRE) += ip6_gre.o
>  obj-$(CONFIG_IPV6_FOU) += fou6.o
> +obj-$(CONFIG_IPV6_SEG6) += seg6.o
>
>  obj-y += addrconf_core.o exthdrs_core.o ip6_checksum.o ip6_icmp.o
>  obj-$(CONFIG_INET) += output_core.o protocol.o $(ipv6-offload)
> diff --git a/net/ipv6/seg6.c b/net/ipv6/seg6.c
> new file mode 100644
> index 0000000..8dafe7b
> --- /dev/null
> +++ b/net/ipv6/seg6.c
> @@ -0,0 +1,212 @@
> +/*
> + *  SR-IPv6 implementation
> + *
> + *  Author:
> + *  David Lebrun <david.lebrun@uclouvain.be>
> + *
> + *
> + *  This program is free software; you can redistribute it and/or
> + *       modify it under the terms of the GNU General Public License
> + *       as published by the Free Software Foundation; either version
> + *       2 of the License, or (at your option) any later version.
> + */
> +
> +#include <linux/errno.h>
> +#include <linux/types.h>
> +#include <linux/socket.h>
> +#include <linux/net.h>
> +#include <linux/in6.h>
> +#include <linux/slab.h>
> +
> +#include <net/ipv6.h>
> +#include <net/protocol.h>
> +
> +#include <net/seg6.h>
> +#include <net/genetlink.h>
> +#include <linux/seg6.h>
> +#include <linux/seg6_genl.h>
> +
> +static const struct nla_policy seg6_genl_policy[SEG6_ATTR_MAX + 1] = {
> +       [SEG6_ATTR_DST]                         = { .type = NLA_BINARY,
> +               .len = sizeof(struct in6_addr) },
> +       [SEG6_ATTR_DSTLEN]                      = { .type = NLA_S32, },
> +       [SEG6_ATTR_HMACKEYID]           = { .type = NLA_U32, },
> +       [SEG6_ATTR_SECRET]                      = { .type = NLA_BINARY, },
> +       [SEG6_ATTR_SECRETLEN]           = { .type = NLA_U8, },
> +       [SEG6_ATTR_ALGID]                       = { .type = NLA_U8, },
> +       [SEG6_ATTR_HMACINFO]            = { .type = NLA_NESTED, },
> +};
> +
> +static struct genl_family seg6_genl_family = {
> +       .id = GENL_ID_GENERATE,
> +       .hdrsize = 0,
> +       .name = SEG6_GENL_NAME,
> +       .version = SEG6_GENL_VERSION,
> +       .maxattr = SEG6_ATTR_MAX,
> +       .netnsok = true,
> +};
> +
> +static int seg6_genl_sethmac(struct sk_buff *skb, struct genl_info *info)
> +{
> +       return -ENOTSUPP;

Is the intent to implement this later?

> +}
> +
> +static int seg6_genl_set_tunsrc(struct sk_buff *skb, struct genl_info *info)
> +{
> +       struct net *net = genl_info_net(info);
> +       struct seg6_pernet_data *sdata = seg6_pernet(net);
> +       struct in6_addr *val, *t_old, *t_new;
> +
> +       if (!info->attrs[SEG6_ATTR_DST])
> +               return -EINVAL;
> +
> +       val = (struct in6_addr *)nla_data(info->attrs[SEG6_ATTR_DST]);
> +       t_new = kmemdup(val, sizeof(*val), GFP_KERNEL);
> +
> +       seg6_pernet_lock(net);
> +
> +       t_old = sdata->tun_src;
> +       rcu_assign_pointer(sdata->tun_src, t_new);
> +
> +       seg6_pernet_unlock(net);
> +
> +       synchronize_net();
> +       kfree(t_old);
> +
> +       return 0;
> +}
> +
> +static int seg6_genl_get_tunsrc(struct sk_buff *skb, struct genl_info *info)
> +{
> +       struct net *net = genl_info_net(info);
> +       struct sk_buff *msg;
> +       void *hdr;
> +       struct in6_addr *tun_src;
> +
> +       msg = genlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
> +       if (!msg)
> +               return -ENOMEM;
> +
> +       hdr = genlmsg_put(msg, info->snd_portid, info->snd_seq,
> +                         &seg6_genl_family, 0, SEG6_CMD_GET_TUNSRC);
> +       if (!hdr)
> +               goto free_msg;
> +
> +       rcu_read_lock();
> +       tun_src = rcu_dereference(seg6_pernet(net)->tun_src);
> +
> +       if (nla_put(msg, SEG6_ATTR_DST, sizeof(struct in6_addr), tun_src))
> +               goto nla_put_failure;
> +
> +       rcu_read_unlock();
> +
> +       genlmsg_end(msg, hdr);
> +       genlmsg_reply(msg, info);
> +
> +       return 0;
> +
> +nla_put_failure:
> +       rcu_read_unlock();
> +       genlmsg_cancel(msg, hdr);
> +free_msg:
> +       nlmsg_free(msg);
> +       return -ENOMEM;
> +}
> +
> +static int seg6_genl_dumphmac(struct sk_buff *skb, struct netlink_callback *cb)
> +{
> +       return -ENOTSUPP;
> +}
> +
> +static const struct genl_ops seg6_genl_ops[] = {
> +       {
> +               .cmd    = SEG6_CMD_SETHMAC,
> +               .doit   = seg6_genl_sethmac,
> +               .policy = seg6_genl_policy,
> +               .flags  = GENL_ADMIN_PERM,
> +       },
> +       {
> +               .cmd    = SEG6_CMD_DUMPHMAC,
> +               .dumpit = seg6_genl_dumphmac,
> +               .policy = seg6_genl_policy,
> +               .flags  = GENL_ADMIN_PERM,
> +       },
> +       {
> +               .cmd    = SEG6_CMD_SET_TUNSRC,
> +               .doit   = seg6_genl_set_tunsrc,
> +               .policy = seg6_genl_policy,
> +               .flags  = GENL_ADMIN_PERM,
> +       },
> +       {
> +               .cmd    = SEG6_CMD_GET_TUNSRC,
> +               .doit   = seg6_genl_get_tunsrc,
> +               .policy = seg6_genl_policy,
> +               .flags  = GENL_ADMIN_PERM,
> +       },
> +};
> +
> +static int __net_init seg6_net_init(struct net *net)
> +{
> +       struct seg6_pernet_data *sdata;
> +
> +       sdata = kzalloc(sizeof(*sdata), GFP_KERNEL);
> +       if (!sdata)
> +               return -ENOMEM;
> +
> +       mutex_init(&sdata->lock);
> +
> +       sdata->tun_src = kzalloc(sizeof(*sdata->tun_src), GFP_KERNEL);
> +       if (!sdata->tun_src) {
> +               kfree(sdata);
> +               return -ENOMEM;
> +       }
> +
> +       net->ipv6.seg6_data = sdata;
> +
> +       return 0;
> +}
> +
> +static void __net_exit seg6_net_exit(struct net *net)
> +{
> +       struct seg6_pernet_data *sdata = seg6_pernet(net);
> +
> +       kfree(sdata->tun_src);
> +       kfree(sdata);
> +}
> +
> +static struct pernet_operations ip6_segments_ops = {
> +       .init = seg6_net_init,
> +       .exit = seg6_net_exit,
> +};
> +
> +static int __init seg6_init(void)
> +{
> +       int err = -ENOMEM;
> +
> +       err = genl_register_family_with_ops(&seg6_genl_family, seg6_genl_ops);
> +       if (err)
> +               goto out;
> +
> +       err = register_pernet_subsys(&ip6_segments_ops);
> +       if (err)
> +               goto out_unregister_genl;
> +
> +       pr_info("Segment Routing with IPv6\n");
> +
> +out:
> +       return err;
> +out_unregister_genl:
> +       genl_unregister_family(&seg6_genl_family);
> +       goto out;
> +}
> +module_init(seg6_init);
> +
> +static void __exit seg6_exit(void)
> +{
> +       unregister_pernet_subsys(&ip6_segments_ops);
> +       genl_unregister_family(&seg6_genl_family);
> +}
> +module_exit(seg6_exit);
> +
> +MODULE_DESCRIPTION("Segment Routing with IPv6 core");
> +MODULE_LICENSE("GPL v2");
> --
> 2.7.3
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/9] ipv6: sr: add support for SRH encapsulation and injection with lwtunnels
  2016-10-17 14:42 ` [PATCH 3/9] ipv6: sr: add support for SRH encapsulation and injection with lwtunnels David Lebrun
@ 2016-10-17 17:17   ` Tom Herbert
  2016-10-18 11:47     ` David Lebrun
  0 siblings, 1 reply; 29+ messages in thread
From: Tom Herbert @ 2016-10-17 17:17 UTC (permalink / raw)
  To: David Lebrun; +Cc: Linux Kernel Network Developers

On Mon, Oct 17, 2016 at 7:42 AM, David Lebrun <david.lebrun@uclouvain.be> wrote:
> This patch creates a new type of interfaceless lightweight tunnel (SEG6),
> enabling the encapsulation and injection of SRH within locally emitted
> packets and forwarded packets.
>
> From a configuration viewpoint, a seg6 tunnel would be configured as follows:
>
>   ip -6 ro ad fc00::1/128 via <gw> encap seg6 mode encap segs fc42::1,fc42::2,fc42::3
>
> Any packet whose destination address is fc00::1 would thus be encapsulated
> within an outer IPv6 header containing the SRH with three segments, and would
> actually be routed to the first segment of the list. If `mode inline' was
> specified instead of `mode encap', then the SRH would be directly inserted
> after the IPv6 header without outer encapsulation.
>
> Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
> ---
>  include/linux/seg6_iptunnel.h      |   6 +
>  include/net/seg6.h                 |   7 +
>  include/uapi/linux/lwtunnel.h      |   1 +
>  include/uapi/linux/seg6_iptunnel.h |  33 ++++
>  net/core/lwtunnel.c                |   2 +
>  net/ipv6/Makefile                  |   2 +-
>  net/ipv6/seg6_iptunnel.c           | 315 +++++++++++++++++++++++++++++++++++++
>  7 files changed, 365 insertions(+), 1 deletion(-)
>  create mode 100644 include/linux/seg6_iptunnel.h
>  create mode 100644 include/uapi/linux/seg6_iptunnel.h
>  create mode 100644 net/ipv6/seg6_iptunnel.c
>
> diff --git a/include/linux/seg6_iptunnel.h b/include/linux/seg6_iptunnel.h
> new file mode 100644
> index 0000000..5377cf6
> --- /dev/null
> +++ b/include/linux/seg6_iptunnel.h
> @@ -0,0 +1,6 @@
> +#ifndef _LINUX_SEG6_IPTUNNEL_H
> +#define _LINUX_SEG6_IPTUNNEL_H
> +
> +#include <uapi/linux/seg6_iptunnel.h>
> +
> +#endif
> diff --git a/include/net/seg6.h b/include/net/seg6.h
> index 89b819e..228f90f 100644
> --- a/include/net/seg6.h
> +++ b/include/net/seg6.h
> @@ -16,6 +16,7 @@
>
>  #include <linux/net.h>
>  #include <linux/ipv6.h>
> +#include <net/lwtunnel.h>
>
>  struct seg6_pernet_data {
>         struct mutex lock;
> @@ -37,5 +38,11 @@ static inline void seg6_pernet_unlock(struct net *net)
>         mutex_unlock(&seg6_pernet(net)->lock);
>  }
>
> +static inline struct seg6_iptunnel_encap *
> +seg6_lwtunnel_encap(struct lwtunnel_state *lwtstate)
> +{
> +       return (struct seg6_iptunnel_encap *)lwtstate->data;
> +}
> +
>  #endif
>
> diff --git a/include/uapi/linux/lwtunnel.h b/include/uapi/linux/lwtunnel.h
> index a478fe8..453cc62 100644
> --- a/include/uapi/linux/lwtunnel.h
> +++ b/include/uapi/linux/lwtunnel.h
> @@ -9,6 +9,7 @@ enum lwtunnel_encap_types {
>         LWTUNNEL_ENCAP_IP,
>         LWTUNNEL_ENCAP_ILA,
>         LWTUNNEL_ENCAP_IP6,
> +       LWTUNNEL_ENCAP_SEG6,
>         __LWTUNNEL_ENCAP_MAX,
>  };
>
> diff --git a/include/uapi/linux/seg6_iptunnel.h b/include/uapi/linux/seg6_iptunnel.h
> new file mode 100644
> index 0000000..2794a5f
> --- /dev/null
> +++ b/include/uapi/linux/seg6_iptunnel.h
> @@ -0,0 +1,33 @@
> +/*
> + *  SR-IPv6 implementation
> + *
> + *  Author:
> + *  David Lebrun <david.lebrun@uclouvain.be>
> + *
> + *
> + *  This program is free software; you can redistribute it and/or
> + *      modify it under the terms of the GNU General Public License
> + *      as published by the Free Software Foundation; either version
> + *      2 of the License, or (at your option) any later version.
> + */
> +
> +#ifndef _UAPI_LINUX_SEG6_IPTUNNEL_H
> +#define _UAPI_LINUX_SEG6_IPTUNNEL_H
> +
> +enum {
> +       SEG6_IPTUNNEL_UNSPEC,
> +       SEG6_IPTUNNEL_SRH,
> +       __SEG6_IPTUNNEL_MAX,
> +};
> +#define SEG6_IPTUNNEL_MAX (__SEG6_IPTUNNEL_MAX - 1)
> +
> +struct seg6_iptunnel_encap {
> +       int flags;
> +       struct ipv6_sr_hdr srh[0];
> +};
> +
> +#define SEG6_IPTUN_ENCAP_SIZE(x) (sizeof(*(x)) + (((x)->srh->hdrlen + 1) << 3))
> +
> +#define SEG6_IPTUN_FLAG_ENCAP   0x1
> +
> +#endif
> diff --git a/net/core/lwtunnel.c b/net/core/lwtunnel.c
> index 88fd642..03976e9 100644
> --- a/net/core/lwtunnel.c
> +++ b/net/core/lwtunnel.c
> @@ -39,6 +39,8 @@ static const char *lwtunnel_encap_str(enum lwtunnel_encap_types encap_type)
>                 return "MPLS";
>         case LWTUNNEL_ENCAP_ILA:
>                 return "ILA";
> +       case LWTUNNEL_ENCAP_SEG6:
> +               return "SEG6";
>         case LWTUNNEL_ENCAP_IP6:
>         case LWTUNNEL_ENCAP_IP:
>         case LWTUNNEL_ENCAP_NONE:
> diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
> index 5069257..2fe7b54 100644
> --- a/net/ipv6/Makefile
> +++ b/net/ipv6/Makefile
> @@ -44,7 +44,7 @@ obj-$(CONFIG_IPV6_SIT) += sit.o
>  obj-$(CONFIG_IPV6_TUNNEL) += ip6_tunnel.o
>  obj-$(CONFIG_IPV6_GRE) += ip6_gre.o
>  obj-$(CONFIG_IPV6_FOU) += fou6.o
> -obj-$(CONFIG_IPV6_SEG6) += seg6.o
> +obj-$(CONFIG_IPV6_SEG6) += seg6.o seg6_iptunnel.o
>
>  obj-y += addrconf_core.o exthdrs_core.o ip6_checksum.o ip6_icmp.o
>  obj-$(CONFIG_INET) += output_core.o protocol.o $(ipv6-offload)
> diff --git a/net/ipv6/seg6_iptunnel.c b/net/ipv6/seg6_iptunnel.c
> new file mode 100644
> index 0000000..461375c
> --- /dev/null
> +++ b/net/ipv6/seg6_iptunnel.c
> @@ -0,0 +1,315 @@
> +/*
> + *  SR-IPv6 implementation
> + *
> + *  Author:
> + *  David Lebrun <david.lebrun@uclouvain.be>
> + *
> + *
> + *  This program is free software; you can redistribute it and/or
> + *        modify it under the terms of the GNU General Public License
> + *        as published by the Free Software Foundation; either version
> + *        2 of the License, or (at your option) any later version.
> + */
> +
> +#include <linux/types.h>
> +#include <linux/skbuff.h>
> +#include <linux/net.h>
> +#include <linux/module.h>
> +#include <net/ip.h>
> +#include <net/lwtunnel.h>
> +#include <net/netevent.h>
> +#include <net/netns/generic.h>
> +#include <net/ip6_fib.h>
> +#include <net/route.h>
> +#include <net/seg6.h>
> +#include <linux/seg6.h>
> +#include <linux/seg6_iptunnel.h>
> +#include <net/addrconf.h>
> +#include <net/ip6_route.h>
> +
> +static const struct nla_policy seg6_iptunnel_policy[SEG6_IPTUNNEL_MAX + 1] = {
> +       [SEG6_IPTUNNEL_SRH]     = { .type = NLA_BINARY },
> +};
> +
> +int nla_put_srh(struct sk_buff *skb, int attrtype,
> +               struct seg6_iptunnel_encap *tuninfo)
> +{
> +       struct nlattr *nla;
> +       struct seg6_iptunnel_encap *data;
> +       int len;
> +
> +       len = SEG6_IPTUN_ENCAP_SIZE(tuninfo);
> +
> +       nla = nla_reserve(skb, attrtype, len);
> +       if (!nla)
> +               return -EMSGSIZE;
> +
> +       data = nla_data(nla);
> +       memcpy(data, tuninfo, len);
> +
> +       return 0;
> +}
> +
> +static void set_tun_src(struct net *net, struct net_device *dev,
> +                       struct in6_addr *daddr, struct in6_addr *saddr)
> +{
> +       struct in6_addr *tun_src;
> +       struct seg6_pernet_data *sdata = seg6_pernet(net);
> +
> +       rcu_read_lock();
> +
> +       tun_src = rcu_dereference(sdata->tun_src);
> +
> +       if (!ipv6_addr_any(tun_src)) {
> +               memcpy(saddr, tun_src, sizeof(struct in6_addr));
> +       } else {
> +               ipv6_dev_get_saddr(net, dev, daddr, IPV6_PREFER_SRC_PUBLIC,
> +                                  saddr);
> +       }
> +
> +       rcu_read_unlock();
> +}
> +
> +/* encapsulate an IPv6 packet within an outer IPv6 header with a given SRH */
> +static int seg6_do_srh_encap(struct sk_buff *skb, struct ipv6_sr_hdr *osrh)
> +{
> +       struct ipv6hdr *hdr, *inner_hdr;
> +       struct ipv6_sr_hdr *isrh;
> +       struct net *net = dev_net(skb_dst(skb)->dev);
> +       int hdrlen, tot_len, err;
> +
> +       hdrlen = (osrh->hdrlen + 1) << 3;
> +       tot_len = hdrlen + sizeof(*hdr);
> +
> +       err = pskb_expand_head(skb, tot_len, 0, GFP_ATOMIC);
> +       if (unlikely(err))
> +               return err;
> +
> +       inner_hdr = ipv6_hdr(skb);
> +
> +       skb_push(skb, tot_len);
> +       skb_reset_network_header(skb);
> +       skb_mac_header_rebuild(skb);
> +       hdr = ipv6_hdr(skb);
> +
> +       /* inherit tc, flowlabel and hlim
> +        * hlim will be decremented in ip6_forward() afterwards and
> +        * decapsulation will overwrite inner hlim with outer hlim
> +        */
> +       ip6_flow_hdr(hdr, ip6_tclass(ip6_flowinfo(inner_hdr)),
> +                    ip6_flowlabel(inner_hdr));
> +       hdr->hop_limit = inner_hdr->hop_limit;
> +       hdr->nexthdr = NEXTHDR_ROUTING;
> +
> +       isrh = (void *)hdr + sizeof(*hdr);
> +       memcpy(isrh, osrh, hdrlen);
> +
> +       isrh->nexthdr = NEXTHDR_IPV6;
> +
> +       hdr->daddr = isrh->segments[isrh->first_segment];
> +       set_tun_src(net, skb->dev, &hdr->daddr, &hdr->saddr);
> +
> +       return 0;
> +}
> +
> +/* insert an SRH within an IPv6 packet, just after the IPv6 header */
> +static int seg6_do_srh_inline(struct sk_buff *skb, struct ipv6_sr_hdr *osrh)
> +{
> +       struct ipv6hdr *hdr, *oldhdr;
> +       struct ipv6_sr_hdr *isrh;
> +       int hdrlen, err;
> +
> +       hdrlen = (osrh->hdrlen + 1) << 3;
> +
> +       err = pskb_expand_head(skb, hdrlen, 0, GFP_ATOMIC);
> +       if (unlikely(err))
> +               return err;
> +
> +       oldhdr = ipv6_hdr(skb);
> +
> +       skb_push(skb, hdrlen);
> +       skb_reset_network_header(skb);
> +       skb_mac_header_rebuild(skb);
> +
> +       hdr = ipv6_hdr(skb);
> +
> +       memmove(hdr, oldhdr, sizeof(*hdr));
> +
> +       isrh = (void *)hdr + sizeof(*hdr);
> +       memcpy(isrh, osrh, hdrlen);
> +
> +       isrh->nexthdr = hdr->nexthdr;
> +       hdr->nexthdr = NEXTHDR_ROUTING;
> +
> +       isrh->segments[0] = hdr->daddr;
> +       hdr->daddr = isrh->segments[isrh->first_segment];
> +
> +       return 0;
> +}
> +
> +static int seg6_do_srh(struct sk_buff *skb)
> +{
> +       struct dst_entry *dst = skb_dst(skb);
> +       struct seg6_iptunnel_encap *tinfo = seg6_lwtunnel_encap(dst->lwtstate);
> +       int err = 0;
> +
> +       if (likely(!skb->encapsulation)) {
> +               skb_reset_inner_headers(skb);
> +               skb->encapsulation = 1;
> +       }
> +
> +       if (tinfo->flags & SEG6_IPTUN_FLAG_ENCAP) {
> +               err = seg6_do_srh_encap(skb, tinfo->srh);
> +       } else {
> +               err = seg6_do_srh_inline(skb, tinfo->srh);
> +               skb_reset_inner_headers(skb);
> +       }
> +
> +       if (err)
> +               return err;
> +
> +       ipv6_hdr(skb)->payload_len = htons(skb->len - sizeof(struct ipv6hdr));
> +       skb_set_transport_header(skb, sizeof(struct ipv6hdr));
> +
> +       skb_set_inner_protocol(skb, skb->protocol);
> +
> +       return 0;
> +}
> +
> +int seg6_input(struct sk_buff *skb)
> +{
> +       int err;
> +
> +       err = seg6_do_srh(skb);
> +       if (unlikely(err))
> +               return err;
> +
> +       skb_dst_drop(skb);
> +       ip6_route_input(skb);
> +
> +       return dst_input(skb);
> +}
> +
> +int seg6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
> +{
> +       int err;
> +       struct dst_entry *dst;
> +       struct ipv6hdr *hdr;
> +       struct flowi6 fl6;
> +
> +       err = seg6_do_srh(skb);
> +       if (unlikely(err))
> +               return err;
> +
> +       hdr = ipv6_hdr(skb);
> +       fl6.daddr = hdr->daddr;
> +       fl6.saddr = hdr->saddr;
> +       fl6.flowlabel = ip6_flowinfo(hdr);
> +       fl6.flowi6_mark = skb->mark;
> +       fl6.flowi6_proto = hdr->nexthdr;
> +
> +       skb_dst_drop(skb);
> +
> +       err = ip6_dst_lookup(net, sk, &dst, &fl6);

Please look at the use of dst_cache that I added in ila_lwt.c, I think
the SR has similar properties and might be able to use dst_cache which
is a significant performance improvement when source packets from a
connected socket. The dst_cache will work in LWT as long as the new
destination address is the same and other input to routing (saddr,
flow label, mark, etc.) are either always the same or assumed to be
immaterial to lookup of the second route.

> +       if (unlikely(err))
> +               return err;
> +
> +       skb_dst_set(skb, dst);
> +
> +       return dst_output(net, sk, skb);
> +}
> +
> +static int seg6_build_state(struct net_device *dev, struct nlattr *nla,
> +                           unsigned int family, const void *cfg,
> +                           struct lwtunnel_state **ts)
> +{
> +       struct seg6_iptunnel_encap *tuninfo, *tuninfo_new;
> +       struct nlattr *tb[SEG6_IPTUNNEL_MAX + 1];
> +       struct lwtunnel_state *newts;
> +       int tuninfo_len;
> +       int err;
> +
> +       err = nla_parse_nested(tb, SEG6_IPTUNNEL_MAX, nla,
> +                              seg6_iptunnel_policy);
> +
> +       if (err < 0)
> +               return err;
> +
> +       if (!tb[SEG6_IPTUNNEL_SRH])
> +               return -EINVAL;
> +
> +       tuninfo = nla_data(tb[SEG6_IPTUNNEL_SRH]);
> +       tuninfo_len = SEG6_IPTUN_ENCAP_SIZE(tuninfo);
> +
> +       newts = lwtunnel_state_alloc(tuninfo_len);
> +       if (!newts)
> +               return -ENOMEM;
> +
> +       newts->len = tuninfo_len;
> +       tuninfo_new = seg6_lwtunnel_encap(newts);
> +       memcpy(tuninfo_new, tuninfo, tuninfo_len);
> +
> +       newts->type = LWTUNNEL_ENCAP_SEG6;
> +       newts->flags |= LWTUNNEL_STATE_OUTPUT_REDIRECT |
> +                       LWTUNNEL_STATE_INPUT_REDIRECT;
> +       newts->headroom = SEG6_IPTUN_ENCAP_SIZE(tuninfo);
> +
> +       *ts = newts;
> +
> +       return 0;
> +}
> +
> +static int seg6_fill_encap_info(struct sk_buff *skb,
> +                               struct lwtunnel_state *lwtstate)
> +{
> +       struct seg6_iptunnel_encap *tuninfo = seg6_lwtunnel_encap(lwtstate);
> +
> +       if (nla_put_srh(skb, SEG6_IPTUNNEL_SRH, tuninfo))
> +               return -EMSGSIZE;
> +
> +       return 0;
> +}
> +
> +static int seg6_encap_nlsize(struct lwtunnel_state *lwtstate)
> +{
> +       struct seg6_iptunnel_encap *tuninfo = seg6_lwtunnel_encap(lwtstate);
> +
> +       return nla_total_size(SEG6_IPTUN_ENCAP_SIZE(tuninfo));
> +}
> +
> +static int seg6_encap_cmp(struct lwtunnel_state *a, struct lwtunnel_state *b)
> +{
> +       struct seg6_iptunnel_encap *a_hdr = seg6_lwtunnel_encap(a);
> +       struct seg6_iptunnel_encap *b_hdr = seg6_lwtunnel_encap(b);
> +       int len = SEG6_IPTUN_ENCAP_SIZE(a_hdr);
> +
> +       if (len != SEG6_IPTUN_ENCAP_SIZE(b_hdr))
> +               return 1;
> +
> +       return memcmp(a_hdr, b_hdr, len);
> +}
> +
> +static const struct lwtunnel_encap_ops seg6_iptun_ops = {
> +       .build_state = seg6_build_state,
> +       .output = seg6_output,
> +       .input = seg6_input,
> +       .fill_encap = seg6_fill_encap_info,
> +       .get_encap_size = seg6_encap_nlsize,
> +       .cmp_encap = seg6_encap_cmp,
> +};
> +
> +static int __init seg6_iptunnel_init(void)
> +{
> +       return lwtunnel_encap_add_ops(&seg6_iptun_ops, LWTUNNEL_ENCAP_SEG6);
> +}
> +module_init(seg6_iptunnel_init);
> +
> +static void __exit seg6_iptunnel_exit(void)
> +{
> +       lwtunnel_encap_del_ops(&seg6_iptun_ops, LWTUNNEL_ENCAP_SEG6);
> +}
> +module_exit(seg6_iptunnel_exit);
> +
> +MODULE_DESCRIPTION("Segment Routing with IPv6 IP Tunnels");
> +MODULE_LICENSE("GPL v2");
> +
> --
> 2.7.3
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/9] ipv6: sr: add core files for SR HMAC support
  2016-10-17 14:42 ` [PATCH 4/9] ipv6: sr: add core files for SR HMAC support David Lebrun
@ 2016-10-17 17:24   ` Tom Herbert
  2016-10-18 12:07     ` David Lebrun
  2016-10-19 13:20   ` zhuyj
  1 sibling, 1 reply; 29+ messages in thread
From: Tom Herbert @ 2016-10-17 17:24 UTC (permalink / raw)
  To: David Lebrun; +Cc: Linux Kernel Network Developers

On Mon, Oct 17, 2016 at 7:42 AM, David Lebrun <david.lebrun@uclouvain.be> wrote:
> This patch adds the necessary functions to compute and check the HMAC signature
> of an SR-enabled packet. Two HMAC algorithms are supported: hmac(sha1) and
> hmac(sha256).
>
> In order to avoid dynamic memory allocation for each HMAC computation,
> a per-cpu ring buffer is allocated for this purpose.
>
> Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
> ---
>  include/linux/seg6_hmac.h      |   6 +
>  include/net/seg6_hmac.h        |  61 ++++++
>  include/uapi/linux/seg6_hmac.h |  20 ++
>  net/ipv6/seg6_hmac.c           | 432 +++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 519 insertions(+)
>  create mode 100644 include/linux/seg6_hmac.h
>  create mode 100644 include/net/seg6_hmac.h
>  create mode 100644 include/uapi/linux/seg6_hmac.h
>  create mode 100644 net/ipv6/seg6_hmac.c
>
> diff --git a/include/linux/seg6_hmac.h b/include/linux/seg6_hmac.h
> new file mode 100644
> index 0000000..da437eb
> --- /dev/null
> +++ b/include/linux/seg6_hmac.h
> @@ -0,0 +1,6 @@
> +#ifndef _LINUX_SEG6_HMAC_H
> +#define _LINUX_SEG6_HMAC_H
> +
> +#include <uapi/linux/seg6_hmac.h>
> +
> +#endif
> diff --git a/include/net/seg6_hmac.h b/include/net/seg6_hmac.h
> new file mode 100644
> index 0000000..6e5ee6a
> --- /dev/null
> +++ b/include/net/seg6_hmac.h
> @@ -0,0 +1,61 @@
> +/*
> + *  SR-IPv6 implementation
> + *
> + *  Author:
> + *  David Lebrun <david.lebrun@uclouvain.be>
> + *
> + *
> + *  This program is free software; you can redistribute it and/or
> + *      modify it under the terms of the GNU General Public License
> + *      as published by the Free Software Foundation; either version
> + *      2 of the License, or (at your option) any later version.
> + */
> +
> +#ifndef _NET_SEG6_HMAC_H
> +#define _NET_SEG6_HMAC_H
> +
> +#include <net/flow.h>
> +#include <net/ip6_fib.h>
> +#include <net/sock.h>
> +#include <linux/ip.h>
> +#include <linux/ipv6.h>
> +#include <linux/route.h>
> +#include <net/seg6.h>
> +#include <linux/seg6_hmac.h>
> +
> +#define SEG6_HMAC_MAX_DIGESTSIZE       160
> +#define SEG6_HMAC_RING_SIZE            256
> +
> +struct seg6_hmac_info {
> +       struct list_head list;
> +
> +       u32 hmackeyid;
> +       char secret[SEG6_HMAC_SECRET_LEN];
> +       u8 slen;
> +       u8 alg_id;
> +};
> +
> +struct seg6_hmac_algo {
> +       u8 alg_id;
> +       char name[64];
> +       struct crypto_shash * __percpu *tfms;
> +       struct shash_desc * __percpu *shashs;
> +};

A lot of this looks generic and potentially useful in other cases
where we we want to do HMAC over some headers (I'm thinking GUE can
probably use some of this for header authentication). Might be nice to
split out the generic pieces at some point.

> +
> +extern int seg6_hmac_compute(struct seg6_hmac_info *hinfo,
> +                            struct ipv6_sr_hdr *hdr, struct in6_addr *saddr,
> +                            u8 *output);
> +extern struct seg6_hmac_info *seg6_hmac_info_lookup(struct net *net, u32 key);
> +extern int seg6_hmac_info_add(struct net *net, u32 key,
> +                             struct seg6_hmac_info *hinfo);
> +extern int seg6_hmac_info_del(struct net *net, u32 key,
> +                             struct seg6_hmac_info *hinfo);
> +extern int seg6_push_hmac(struct net *net, struct in6_addr *saddr,
> +                         struct ipv6_sr_hdr *srh);
> +extern bool seg6_hmac_validate_skb(struct sk_buff *skb);
> +extern int seg6_hmac_init(void);
> +extern void seg6_hmac_exit(void);
> +extern int seg6_hmac_net_init(struct net *net);
> +extern void seg6_hmac_net_exit(struct net *net);
> +
> +#endif
> diff --git a/include/uapi/linux/seg6_hmac.h b/include/uapi/linux/seg6_hmac.h
> new file mode 100644
> index 0000000..0b5eda7
> --- /dev/null
> +++ b/include/uapi/linux/seg6_hmac.h
> @@ -0,0 +1,20 @@
> +#ifndef _UAPI_LINUX_SEG6_HMAC_H
> +#define _UAPI_LINUX_SEG6_HMAC_H
> +
> +#define SEG6_HMAC_SECRET_LEN   64
> +#define SEG6_HMAC_FIELD_LEN    32
> +
> +struct sr6_tlv_hmac {
> +       __u8 type;
> +       __u8 len;
> +       __u16 reserved;
> +       __be32 hmackeyid;
> +       __u8 hmac[SEG6_HMAC_FIELD_LEN];
> +} __attribute__((packed));
> +
> +enum {
> +       SEG6_HMAC_ALGO_SHA1 = 1,
> +       SEG6_HMAC_ALGO_SHA256 = 2,
> +};
> +
> +#endif
> diff --git a/net/ipv6/seg6_hmac.c b/net/ipv6/seg6_hmac.c
> new file mode 100644
> index 0000000..91bd59a
> --- /dev/null
> +++ b/net/ipv6/seg6_hmac.c
> @@ -0,0 +1,432 @@
> +/*
> + *  SR-IPv6 implementation -- HMAC functions
> + *
> + *  Author:
> + *  David Lebrun <david.lebrun@uclouvain.be>
> + *
> + *
> + *  This program is free software; you can redistribute it and/or
> + *      modify it under the terms of the GNU General Public License
> + *      as published by the Free Software Foundation; either version
> + *      2 of the License, or (at your option) any later version.
> + */
> +
> +#include <linux/errno.h>
> +#include <linux/types.h>
> +#include <linux/socket.h>
> +#include <linux/sockios.h>
> +#include <linux/net.h>
> +#include <linux/netdevice.h>
> +#include <linux/in6.h>
> +#include <linux/icmpv6.h>
> +#include <linux/mroute6.h>
> +#include <linux/slab.h>
> +
> +#include <linux/netfilter.h>
> +#include <linux/netfilter_ipv6.h>
> +
> +#include <net/sock.h>
> +#include <net/snmp.h>
> +
> +#include <net/ipv6.h>
> +#include <net/protocol.h>
> +#include <net/transp_v6.h>
> +#include <net/rawv6.h>
> +#include <net/ndisc.h>
> +#include <net/ip6_route.h>
> +#include <net/addrconf.h>
> +#include <net/xfrm.h>
> +
> +#include <linux/cryptohash.h>
> +#include <crypto/hash.h>
> +#include <crypto/sha.h>
> +#include <net/seg6.h>
> +#include <net/genetlink.h>
> +#include <net/seg6_hmac.h>
> +#include <linux/random.h>
> +
> +static char * __percpu *hmac_ring;
> +
> +static struct seg6_hmac_algo hmac_algos[] = {
> +       {
> +               .alg_id = SEG6_HMAC_ALGO_SHA1,
> +               .name = "hmac(sha1)",
> +       },
> +       {
> +               .alg_id = SEG6_HMAC_ALGO_SHA256,
> +               .name = "hmac(sha256)",
> +       },
> +};
> +
> +static struct seg6_hmac_algo *__hmac_get_algo(u8 alg_id)
> +{
> +       int i, alg_count;
> +       struct seg6_hmac_algo *algo;
> +
> +       alg_count = sizeof(hmac_algos) / sizeof(struct seg6_hmac_algo);
> +       for (i = 0; i < alg_count; i++) {
> +               algo = &hmac_algos[i];
> +               if (algo->alg_id == alg_id)
> +                       return algo;
> +       }
> +
> +       return NULL;
> +}
> +
> +static int __do_hmac(struct seg6_hmac_info *hinfo, const char *text, u8 psize,
> +                    u8 *output, int outlen)
> +{
> +       struct crypto_shash *tfm;
> +       struct shash_desc *shash;
> +       struct seg6_hmac_algo *algo;
> +       int ret, dgsize;
> +
> +       algo = __hmac_get_algo(hinfo->alg_id);
> +       if (!algo)
> +               return -ENOENT;
> +
> +       tfm = *this_cpu_ptr(algo->tfms);
> +
> +       dgsize = crypto_shash_digestsize(tfm);
> +       if (dgsize > outlen) {
> +               pr_debug("sr-ipv6: __do_hmac: digest size too big (%d / %d)\n",
> +                        dgsize, outlen);
> +               return -ENOMEM;
> +       }
> +
> +       ret = crypto_shash_setkey(tfm, hinfo->secret, hinfo->slen);
> +       if (ret < 0) {
> +               pr_debug("sr-ipv6: crypto_shash_setkey failed: err %d\n", ret);
> +               goto failed;
> +       }
> +
> +       shash = *this_cpu_ptr(algo->shashs);
> +       shash->tfm = tfm;
> +
> +       ret = crypto_shash_digest(shash, text, psize, output);
> +       if (ret < 0) {
> +               pr_debug("sr-ipv6: crypto_shash_digest failed: err %d\n", ret);
> +               goto failed;
> +       }
> +
> +       return dgsize;
> +
> +failed:
> +       return ret;
> +}
> +
> +int seg6_hmac_compute(struct seg6_hmac_info *hinfo, struct ipv6_sr_hdr *hdr,
> +                     struct in6_addr *saddr, u8 *output)
> +{
> +       int plen, i, dgsize, wrsize;
> +       char *ring, *off;
> +       u8 tmp_out[SEG6_HMAC_MAX_DIGESTSIZE];
> +       __be32 hmackeyid = cpu_to_be32(hinfo->hmackeyid);
> +
> +       /* a 160-byte buffer for digest output allows to store highest known
> +        * hash function (RadioGatun) with up to 1216 bits
> +        */
> +
> +       /* saddr(16) + first_seg(1) + cleanup(1) + keyid(4) + seglist(16n) */
> +       plen = 16 + 1 + 1 + 4 + (hdr->first_segment + 1) * 16;
> +
> +       /* this limit allows for 14 segments */
> +       if (plen >= SEG6_HMAC_RING_SIZE)
> +               return -EMSGSIZE;
> +
> +       local_bh_disable();
> +       ring = *this_cpu_ptr(hmac_ring);
> +       off = ring;
> +       memcpy(off, saddr, 16);
> +       off += 16;
> +       *off++ = hdr->first_segment;
> +       *off++ = !!(sr_get_flags(hdr) & SR6_FLAG_CLEANUP) << 7;
> +       memcpy(off, &hmackeyid, 4);
> +       off += 4;
> +
> +       for (i = 0; i < hdr->first_segment + 1; i++) {
> +               memcpy(off, hdr->segments + i, 16);
> +               off += 16;
> +       }
> +
> +       dgsize = __do_hmac(hinfo, ring, plen, tmp_out,
> +                          SEG6_HMAC_MAX_DIGESTSIZE);
> +       local_bh_enable();
> +
> +       if (dgsize < 0)
> +               return dgsize;
> +
> +       wrsize = SEG6_HMAC_FIELD_LEN;
> +       if (wrsize > dgsize)
> +               wrsize = dgsize;
> +
> +       memset(output, 0, SEG6_HMAC_FIELD_LEN);
> +       memcpy(output, tmp_out, wrsize);
> +
> +       return 0;
> +}
> +EXPORT_SYMBOL(seg6_hmac_compute);
> +
> +/* checks if an incoming SR-enabled packet's HMAC status matches
> + * the incoming policy.
> + *
> + * called with rcu_read_lock()
> + */
> +bool seg6_hmac_validate_skb(struct sk_buff *skb)
> +{
> +       struct inet6_dev *idev;
> +       struct ipv6_sr_hdr *srh;
> +       struct sr6_tlv_hmac *tlv;
> +       struct seg6_hmac_info *hinfo;
> +       struct net *net = dev_net(skb->dev);
> +       u8 hmac_output[SEG6_HMAC_FIELD_LEN];
> +
> +       idev = __in6_dev_get(skb->dev);
> +
> +       srh = (struct ipv6_sr_hdr *)skb_transport_header(skb);
> +
> +       tlv = seg6_get_tlv_hmac(srh);
> +
> +       /* mandatory check but no tlv */
> +       if (idev->cnf.seg6_require_hmac > 0 && !tlv)
> +               return false;
> +
> +       /* no check */
> +       if (idev->cnf.seg6_require_hmac < 0)
> +               return true;
> +
> +       /* check only if present */
> +       if (idev->cnf.seg6_require_hmac == 0 && !tlv)
> +               return true;
> +
> +       /* now, seg6_require_hmac >= 0 && tlv */
> +
> +       hinfo = seg6_hmac_info_lookup(net, be32_to_cpu(tlv->hmackeyid));
> +       if (!hinfo)
> +               return false;
> +
> +       if (seg6_hmac_compute(hinfo, srh, &ipv6_hdr(skb)->saddr, hmac_output))
> +               return false;
> +
> +       if (memcmp(hmac_output, tlv->hmac, SEG6_HMAC_FIELD_LEN) != 0)
> +               return false;
> +
> +       return true;
> +}
> +
> +/* called with rcu_read_lock() */
> +struct seg6_hmac_info *seg6_hmac_info_lookup(struct net *net, u32 key)
> +{
> +       struct seg6_pernet_data *sdata = seg6_pernet(net);
> +       struct seg6_hmac_info *hinfo;
> +
> +       list_for_each_entry_rcu(hinfo, &sdata->hmac_infos, list) {
> +               if (hinfo->hmackeyid == key)
> +                       return hinfo;
> +       }
> +
> +       return NULL;
> +}
> +EXPORT_SYMBOL(seg6_hmac_info_lookup);
> +
> +int seg6_hmac_info_add(struct net *net, u32 key, struct seg6_hmac_info *hinfo)
> +{
> +       struct seg6_pernet_data *sdata = seg6_pernet(net);
> +       struct seg6_hmac_info *old_hinfo;
> +
> +       old_hinfo = seg6_hmac_info_lookup(net, key);
> +       if (old_hinfo)
> +               return -EEXIST;
> +
> +       list_add_rcu(&hinfo->list, &sdata->hmac_infos);
> +
> +       return 0;
> +}
> +EXPORT_SYMBOL(seg6_hmac_info_add);
> +
> +int seg6_hmac_info_del(struct net *net, u32 key, struct seg6_hmac_info *hinfo)
> +{
> +       struct seg6_hmac_info *tmp;
> +
> +       tmp = seg6_hmac_info_lookup(net, key);
> +       if (!tmp)
> +               return -ENOENT;
> +
> +       /* entry was replaced, ignore deletion */
> +       if (tmp != hinfo)
> +               return -ENOENT;
> +
> +       list_del_rcu(&hinfo->list);
> +       synchronize_net();
> +
> +       return 0;
> +}
> +EXPORT_SYMBOL(seg6_hmac_info_del);
> +
> +static void seg6_hmac_info_flush(struct net *net)
> +{
> +       struct seg6_pernet_data *sdata = seg6_pernet(net);
> +       struct seg6_hmac_info *hinfo;
> +
> +       seg6_pernet_lock(net);
> +       while ((hinfo = list_first_or_null_rcu(&sdata->hmac_infos,
> +                                              struct seg6_hmac_info,
> +                                              list)) != NULL) {
> +               list_del_rcu(&hinfo->list);
> +               seg6_pernet_unlock(net);
> +               synchronize_net();
> +               kfree(hinfo);
> +               seg6_pernet_lock(net);
> +       }
> +
> +       seg6_pernet_unlock(net);
> +}
> +
> +int seg6_push_hmac(struct net *net, struct in6_addr *saddr,
> +                  struct ipv6_sr_hdr *srh)
> +{
> +       struct seg6_hmac_info *hinfo;
> +       int err = -ENOENT;
> +       struct sr6_tlv_hmac *tlv;
> +
> +       tlv = seg6_get_tlv(srh, SR6_TLV_HMAC);
> +       if (!tlv)
> +               return -EINVAL;
> +
> +       rcu_read_lock();
> +
> +       hinfo = seg6_hmac_info_lookup(net, be32_to_cpu(tlv->hmackeyid));
> +       if (!hinfo)
> +               goto out;
> +
> +       memset(tlv->hmac, 0, SEG6_HMAC_FIELD_LEN);
> +       err = seg6_hmac_compute(hinfo, srh, saddr, tlv->hmac);
> +
> +out:
> +       rcu_read_unlock();
> +       return err;
> +}
> +EXPORT_SYMBOL(seg6_push_hmac);
> +
> +static int seg6_hmac_init_ring(void)
> +{
> +       int i;
> +
> +       hmac_ring = alloc_percpu(char *);
> +
> +       if (!hmac_ring)
> +               return -ENOMEM;
> +
> +       for_each_possible_cpu(i) {
> +               char *ring = kzalloc(SEG6_HMAC_RING_SIZE, GFP_KERNEL);
> +
> +               if (!ring)
> +                       return -ENOMEM;
> +
> +               *per_cpu_ptr(hmac_ring, i) = ring;
> +       }
> +
> +       return 0;
> +}
> +
> +static int seg6_hmac_init_algo(void)
> +{
> +       int i, alg_count, cpu;
> +       struct seg6_hmac_algo *algo;
> +       struct crypto_shash *tfm;
> +       struct shash_desc *shash;
> +
> +       alg_count = sizeof(hmac_algos) / sizeof(struct seg6_hmac_algo);
> +
> +       for (i = 0; i < alg_count; i++) {
> +               int shsize;
> +               struct crypto_shash **p_tfm;
> +
> +               algo = &hmac_algos[i];
> +               algo->tfms = alloc_percpu(struct crypto_shash *);
> +               if (!algo->tfms)
> +                       return -ENOMEM;
> +
> +               for_each_possible_cpu(cpu) {
> +                       tfm = crypto_alloc_shash(algo->name, 0, GFP_KERNEL);
> +                       if (IS_ERR(tfm))
> +                               return PTR_ERR(tfm);
> +                       p_tfm = per_cpu_ptr(algo->tfms, cpu);
> +                       *p_tfm = tfm;
> +               }
> +
> +               p_tfm = this_cpu_ptr(algo->tfms);
> +               tfm = *p_tfm;
> +
> +               shsize = sizeof(*shash) + crypto_shash_descsize(tfm);
> +
> +               algo->shashs = alloc_percpu(struct shash_desc *);
> +               if (!algo->shashs)
> +                       return -ENOMEM;
> +
> +               for_each_possible_cpu(cpu) {
> +                       shash = kzalloc(shsize, GFP_KERNEL);
> +                       if (!shash)
> +                               return -ENOMEM;
> +                       *per_cpu_ptr(algo->shashs, cpu) = shash;
> +               }
> +       }
> +
> +       return 0;
> +}
> +
> +int __init seg6_hmac_init(void)
> +{
> +       int ret;
> +
> +       ret = seg6_hmac_init_ring();
> +       if (ret < 0)
> +               goto out;
> +
> +       ret = seg6_hmac_init_algo();
> +
> +out:
> +       return ret;
> +}
> +
> +int __net_init seg6_hmac_net_init(struct net *net)
> +{
> +       struct seg6_pernet_data *sdata = seg6_pernet(net);
> +
> +       INIT_LIST_HEAD(&sdata->hmac_infos);
> +       return 0;
> +}
> +
> +void __exit seg6_hmac_exit(void)
> +{
> +       int i, alg_count, cpu;
> +       struct seg6_hmac_algo *algo = NULL;
> +
> +       for_each_possible_cpu(i) {
> +               char *ring = *per_cpu_ptr(hmac_ring, i);
> +
> +               kfree(ring);
> +       }
> +       free_percpu(hmac_ring);
> +
> +       alg_count = sizeof(hmac_algos) / sizeof(struct seg6_hmac_algo);
> +       for (i = 0; i < alg_count; i++) {
> +               algo = &hmac_algos[i];
> +               for_each_possible_cpu(cpu) {
> +                       struct crypto_shash *tfm;
> +                       struct shash_desc *shash;
> +
> +                       shash = *per_cpu_ptr(algo->shashs, cpu);
> +                       kfree(shash);
> +                       tfm = *per_cpu_ptr(algo->tfms, cpu);
> +                       crypto_free_shash(tfm);
> +               }
> +               free_percpu(algo->tfms);
> +               free_percpu(algo->shashs);
> +       }
> +}
> +
> +void __net_exit seg6_hmac_net_exit(struct net *net)
> +{
> +       seg6_hmac_info_flush(net);
> +}
> --
> 2.7.3
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/9] ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header)
  2016-10-17 14:42 ` [PATCH 1/9] ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header) David Lebrun
  2016-10-17 14:57   ` David Miller
  2016-10-17 17:01   ` Tom Herbert
@ 2016-10-17 17:31   ` Tom Herbert
  2 siblings, 0 replies; 29+ messages in thread
From: Tom Herbert @ 2016-10-17 17:31 UTC (permalink / raw)
  To: David Lebrun; +Cc: Linux Kernel Network Developers

On Mon, Oct 17, 2016 at 7:42 AM, David Lebrun <david.lebrun@uclouvain.be> wrote:
> Implement minimal support for processing of SR-enabled packets
> as described in
> https://tools.ietf.org/html/draft-ietf-6man-segment-routing-header-02.
>
> This patch implements the following operations:
> - Intermediate segment endpoint: incrementation of active segment and rerouting.
> - Egress for SR-encapsulated packets: decapsulation of outer IPv6 header + SRH
>   and routing of inner packet.
> - Cleanup flag support for SR-inlined packets: removal of SRH if we are the
>   penultimate segment endpoint.
>
> A per-interface sysctl seg6_enabled is provided, to accept/deny SR-enabled
> packets. Default is deny.
>
> This patch does not provide support for HMAC-signed packets.
>
> Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
> ---
>  include/linux/ipv6.h      |   3 +
>  include/linux/seg6.h      |   6 ++
>  include/uapi/linux/ipv6.h |   2 +
>  include/uapi/linux/seg6.h |  46 +++++++++++++++
>  net/ipv6/Kconfig          |  13 +++++
>  net/ipv6/addrconf.c       |  18 ++++++
>  net/ipv6/exthdrs.c        | 140 ++++++++++++++++++++++++++++++++++++++++++++++
>  7 files changed, 228 insertions(+)
>  create mode 100644 include/linux/seg6.h
>  create mode 100644 include/uapi/linux/seg6.h
>
> diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
> index 7e9a789..75395ad 100644
> --- a/include/linux/ipv6.h
> +++ b/include/linux/ipv6.h
> @@ -64,6 +64,9 @@ struct ipv6_devconf {
>         } stable_secret;
>         __s32           use_oif_addrs_only;
>         __s32           keep_addr_on_down;
> +#ifdef CONFIG_IPV6_SEG6
> +       __s32           seg6_enabled;
> +#endif
>
>         struct ctl_table_header *sysctl_header;
>  };
> diff --git a/include/linux/seg6.h b/include/linux/seg6.h
> new file mode 100644
> index 0000000..7a66d2b
> --- /dev/null
> +++ b/include/linux/seg6.h
> @@ -0,0 +1,6 @@
> +#ifndef _LINUX_SEG6_H
> +#define _LINUX_SEG6_H
> +
> +#include <uapi/linux/seg6.h>
> +
> +#endif
> diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h
> index 8c27723..7ff1d65 100644
> --- a/include/uapi/linux/ipv6.h
> +++ b/include/uapi/linux/ipv6.h
> @@ -39,6 +39,7 @@ struct in6_ifreq {
>  #define IPV6_SRCRT_STRICT      0x01    /* Deprecated; will be removed */
>  #define IPV6_SRCRT_TYPE_0      0       /* Deprecated; will be removed */
>  #define IPV6_SRCRT_TYPE_2      2       /* IPv6 type 2 Routing Header   */
> +#define IPV6_SRCRT_TYPE_4      4       /* Segment Routing with IPv6 */
>
>  /*
>   *     routing header
> @@ -178,6 +179,7 @@ enum {
>         DEVCONF_DROP_UNSOLICITED_NA,
>         DEVCONF_KEEP_ADDR_ON_DOWN,
>         DEVCONF_RTR_SOLICIT_MAX_INTERVAL,
> +       DEVCONF_SEG6_ENABLED,
>         DEVCONF_MAX
>  };
>
> diff --git a/include/uapi/linux/seg6.h b/include/uapi/linux/seg6.h
> new file mode 100644
> index 0000000..9f9e157
> --- /dev/null
> +++ b/include/uapi/linux/seg6.h
> @@ -0,0 +1,46 @@
> +/*
> + *  SR-IPv6 implementation
> + *
> + *  Author:
> + *  David Lebrun <david.lebrun@uclouvain.be>
> + *
> + *
> + *  This program is free software; you can redistribute it and/or
> + *      modify it under the terms of the GNU General Public License
> + *      as published by the Free Software Foundation; either version
> + *      2 of the License, or (at your option) any later version.
> + */
> +
> +#ifndef _UAPI_LINUX_SEG6_H
> +#define _UAPI_LINUX_SEG6_H
> +
> +/*
> + * SRH
> + */
> +struct ipv6_sr_hdr {
> +       __u8    nexthdr;
> +       __u8    hdrlen;
> +       __u8    type;
> +       __u8    segments_left;
> +       __u8    first_segment;
> +       __be16  flags;
> +       __u8    reserved;
> +
> +       struct in6_addr segments[0];
> +} __attribute__((packed));
> +
> +#define SR6_FLAG_CLEANUP       (1 << 15)
> +#define SR6_FLAG_PROTECTED     (1 << 14)
> +#define SR6_FLAG_OAM           (1 << 13)
> +#define SR6_FLAG_ALERT         (1 << 12)
> +#define SR6_FLAG_HMAC          (1 << 11)
> +
> +#define SR6_TLV_INGRESS                1
> +#define SR6_TLV_EGRESS         2
> +#define SR6_TLV_OPAQUE         3
> +#define SR6_TLV_PADDING                4
> +#define SR6_TLV_HMAC           5
> +
> +#define sr_get_flags(srh) (be16_to_cpu((srh)->flags))
> +
> +#endif
> diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig
> index 2343e4f..691c318 100644
> --- a/net/ipv6/Kconfig
> +++ b/net/ipv6/Kconfig
> @@ -289,4 +289,17 @@ config IPV6_PIMSM_V2
>           Support for IPv6 PIM multicast routing protocol PIM-SMv2.
>           If unsure, say N.
>
> +config IPV6_SEG6
> +       bool "IPv6: Segment Routing support"
> +       depends on IPV6
> +       select CRYPTO_HMAC
> +       select CRYPTO_SHA1
> +       select CRYPTO_SHA256
> +       ---help---
> +         Experimental support for IPv6 Segment Routing dataplane as defined
> +         in IETF draft-ietf-6man-segment-routing-header-02. This option
> +         enables the processing of SR-enabled packets allowing the kernel
> +         to act as a segment endpoint (intermediate or egress). It also
> +         enables an API for the kernel to act as an ingress SR router.
> +

I suggest that you eliminate IPV6_SEG6 as config, always include SR
with IPv6. But then maybe SR security should be a CONFIG variable
since this would pull in a lot of crypto. I imagine in a closed
environment (e.g. within a datacenter) security might not normally be
enabled.

>  endif # IPV6
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index d8983e1..42c0ffb 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -239,6 +239,9 @@ static struct ipv6_devconf ipv6_devconf __read_mostly = {
>         .use_oif_addrs_only     = 0,
>         .ignore_routes_with_linkdown = 0,
>         .keep_addr_on_down      = 0,
> +#ifdef CONFIG_IPV6_SEG6
> +       .seg6_enabled           = 0,
> +#endif
>  };
>
>  static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
> @@ -285,6 +288,9 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
>         .use_oif_addrs_only     = 0,
>         .ignore_routes_with_linkdown = 0,
>         .keep_addr_on_down      = 0,
> +#ifdef CONFIG_IPV6_SEG6
> +       .seg6_enabled           = 0,
> +#endif
>  };
>
>  /* Check if a valid qdisc is available */
> @@ -4965,6 +4971,9 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
>         array[DEVCONF_DROP_UNICAST_IN_L2_MULTICAST] = cnf->drop_unicast_in_l2_multicast;
>         array[DEVCONF_DROP_UNSOLICITED_NA] = cnf->drop_unsolicited_na;
>         array[DEVCONF_KEEP_ADDR_ON_DOWN] = cnf->keep_addr_on_down;
> +#ifdef CONFIG_IPV6_SEG6
> +       array[DEVCONF_SEG6_ENABLED] = cnf->seg6_enabled;
> +#endif
>  }
>
>  static inline size_t inet6_ifla6_size(void)
> @@ -6056,6 +6065,15 @@ static const struct ctl_table addrconf_sysctl[] = {
>                 .proc_handler   = proc_dointvec,
>
>         },
> +#ifdef CONFIG_IPV6_SEG6
> +       {
> +               .procname       = "seg6_enabled",
> +               .data           = &ipv6_devconf.seg6_enabled,
> +               .maxlen         = sizeof(int),
> +               .mode           = 0644,
> +               .proc_handler   = proc_dointvec,
> +       },
> +#endif
>         {
>                 /* sentinel */
>         }
> diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
> index 139ceb6..b31f811 100644
> --- a/net/ipv6/exthdrs.c
> +++ b/net/ipv6/exthdrs.c
> @@ -47,6 +47,9 @@
>  #if IS_ENABLED(CONFIG_IPV6_MIP6)
>  #include <net/xfrm.h>
>  #endif
> +#ifdef CONFIG_IPV6_SEG6
> +#include <linux/seg6.h>
> +#endif
>
>  #include <linux/uaccess.h>
>
> @@ -286,6 +289,137 @@ static int ipv6_destopt_rcv(struct sk_buff *skb)
>         return -1;
>  }
>
> +#ifdef CONFIG_IPV6_SEG6
> +static int ipv6_srh_rcv(struct sk_buff *skb)
> +{
> +       struct in6_addr *addr = NULL, *last_addr = NULL, *active_addr = NULL;
> +       struct inet6_skb_parm *opt = IP6CB(skb);
> +       struct net *net = dev_net(skb->dev);
> +       struct ipv6_sr_hdr *hdr;
> +       struct inet6_dev *idev;
> +       int cleanup = 0;
> +       int accept_seg6;
> +
> +       hdr = (struct ipv6_sr_hdr *)skb_transport_header(skb);
> +
> +       idev = __in6_dev_get(skb->dev);
> +
> +       accept_seg6 = net->ipv6.devconf_all->seg6_enabled;
> +       if (accept_seg6 > idev->cnf.seg6_enabled)
> +               accept_seg6 = idev->cnf.seg6_enabled;
> +
> +       if (!accept_seg6) {
> +               kfree_skb(skb);
> +               return -1;
> +       }
> +
> +looped_back:
> +       last_addr = hdr->segments;
> +
> +       if (hdr->segments_left > 0) {
> +               if (hdr->nexthdr != NEXTHDR_IPV6 && hdr->segments_left == 1 &&
> +                   sr_get_flags(hdr) & SR6_FLAG_CLEANUP)
> +                       cleanup = 1;
> +       } else {
> +               if (hdr->nexthdr == NEXTHDR_IPV6) {
> +                       int offset = (hdr->hdrlen + 1) << 3;
> +
> +                       if (!pskb_pull(skb, offset)) {
> +                               kfree_skb(skb);
> +                               return -1;
> +                       }
> +                       skb_postpull_rcsum(skb, skb_transport_header(skb),
> +                                          offset);
> +
> +                       skb_reset_network_header(skb);
> +                       skb_reset_transport_header(skb);
> +                       skb->encapsulation = 0;
> +
> +                       __skb_tunnel_rx(skb, skb->dev, net);
> +
> +                       netif_rx(skb);
> +                       return -1;
> +               }
> +
> +               opt->srcrt = skb_network_header_len(skb);
> +               opt->lastopt = opt->srcrt;
> +               skb->transport_header += (hdr->hdrlen + 1) << 3;
> +               opt->nhoff = (&hdr->nexthdr) - skb_network_header(skb);
> +
> +               return 1;
> +       }
> +
> +       if (skb_cloned(skb)) {
> +               if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC)) {
> +                       __IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)),
> +                                       IPSTATS_MIB_OUTDISCARDS);
> +                       kfree_skb(skb);
> +                       return -1;
> +               }
> +       }
> +
> +       if (skb->ip_summed == CHECKSUM_COMPLETE)
> +               skb->ip_summed = CHECKSUM_NONE;
> +
> +       hdr = (struct ipv6_sr_hdr *)skb_transport_header(skb);
> +
> +       active_addr = hdr->segments + hdr->segments_left;
> +       hdr->segments_left--;
> +       addr = hdr->segments + hdr->segments_left;
> +
> +       ipv6_hdr(skb)->daddr = *addr;
> +
> +       skb_push(skb, sizeof(struct ipv6hdr));
> +
> +       if (cleanup) {
> +               int srhlen = (hdr->hdrlen + 1) << 3;
> +               int nh = hdr->nexthdr;
> +
> +               memmove(skb_network_header(skb) + srhlen,
> +                       skb_network_header(skb),
> +                       (unsigned char *)hdr - skb_network_header(skb));
> +               skb_pull(skb, srhlen);
> +               skb->network_header += srhlen;
> +               ipv6_hdr(skb)->nexthdr = nh;
> +               ipv6_hdr(skb)->payload_len = htons(skb->len -
> +                                                  sizeof(struct ipv6hdr));
> +       }
> +
> +       skb_dst_drop(skb);
> +
> +       ip6_route_input(skb);
> +
> +       if (skb_dst(skb)->error) {
> +               dst_input(skb);
> +               return -1;
> +       }
> +
> +       if (skb_dst(skb)->dev->flags & IFF_LOOPBACK) {
> +               if (ipv6_hdr(skb)->hop_limit <= 1) {
> +                       __IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)),
> +                                       IPSTATS_MIB_INHDRERRORS);
> +                       icmpv6_send(skb, ICMPV6_TIME_EXCEED,
> +                                   ICMPV6_EXC_HOPLIMIT, 0);
> +                       kfree_skb(skb);
> +                       return -1;
> +               }
> +               ipv6_hdr(skb)->hop_limit--;
> +
> +               /* be sure that srh is still present before reinjecting */
> +               if (!cleanup) {
> +                       skb_pull(skb, sizeof(struct ipv6hdr));
> +                       goto looped_back;
> +               }
> +               skb_set_transport_header(skb, sizeof(struct ipv6hdr));
> +               IP6CB(skb)->nhoff = offsetof(struct ipv6hdr, nexthdr);
> +       }
> +
> +       dst_input(skb);
> +
> +       return -1;
> +}
> +#endif
> +
>  /********************************
>    Routing header.
>   ********************************/
> @@ -326,6 +460,12 @@ static int ipv6_rthdr_rcv(struct sk_buff *skb)
>                 return -1;
>         }
>
> +#ifdef CONFIG_IPV6_SEG6
> +       /* segment routing */
> +       if (hdr->type == IPV6_SRCRT_TYPE_4)
> +               return ipv6_srh_rcv(skb);
> +#endif
> +
>  looped_back:
>         if (hdr->segments_left == 0) {
>                 switch (hdr->type) {
> --
> 2.7.3
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/9] ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header)
  2016-10-17 17:01   ` Tom Herbert
@ 2016-10-18 11:43     ` David Lebrun
  2016-10-20 13:04     ` David Lebrun
  1 sibling, 0 replies; 29+ messages in thread
From: David Lebrun @ 2016-10-18 11:43 UTC (permalink / raw)
  To: Tom Herbert; +Cc: Linux Kernel Network Developers

[-- Attachment #1: Type: text/plain, Size: 1877 bytes --]

On 10/17/2016 07:01 PM, Tom Herbert wrote:
>> +struct ipv6_sr_hdr {
>> +       __u8    nexthdr;
>> +       __u8    hdrlen;
>> +       __u8    type;
>> +       __u8    segments_left;
>> +       __u8    first_segment;
>> +       __be16  flags;
> 
> Bad alignment for 16 bit field could be unpleasant on some
> architectures. Might be better to split this into to u8's, defined
> flags are only in first eight bits anyway.
> 

Will do

>> +config IPV6_SEG6
>> +       bool "IPv6: Segment Routing support"
>> +       depends on IPV6
>> +       select CRYPTO_HMAC
>> +       select CRYPTO_SHA1
>> +       select CRYPTO_SHA256
>> +       ---help---
>> +         Experimental support for IPv6 Segment Routing dataplane as defined
> 
> I don't think calling this experimental is relevant.

OK

>> +       if (skb->ip_summed == CHECKSUM_COMPLETE)
>> +               skb->ip_summed = CHECKSUM_NONE;
>> +
> Because the packet is being changed? Would it make sense to update the
> checksum complete value based on the changes being made. Consider the
> case that the next hop is local to the host (someone may try to
> implement network virtualization this way).
> 

Seems to make sense, I will try your suggestion

>>
>> +#ifdef CONFIG_IPV6_SEG6
>> +       /* segment routing */
>> +       if (hdr->type == IPV6_SRCRT_TYPE_4)
>> +               return ipv6_srh_rcv(skb);
>> +#endif
> 
> This doesn't belong in one of the switch statements in ipv6_rthdr_rcv?
> 

From what I see, ipv6_rthdr_rcv was initially implemented to support
RH0, and then specific code was added at multiple points to handle MIP6.
The first switch already handles a specific case (i.e. segments_left ==
0), so the call to ipv6_srh_rcv() must happen before that. I choose not
to inline ipv6_srh_rcv into ipv6_rthdr_rcv as it would make the code
quite messy.


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/9] ipv6: sr: add code base for control plane support of SR-IPv6
  2016-10-17 17:07   ` Tom Herbert
@ 2016-10-18 11:45     ` David Lebrun
  0 siblings, 0 replies; 29+ messages in thread
From: David Lebrun @ 2016-10-18 11:45 UTC (permalink / raw)
  To: Tom Herbert; +Cc: Linux Kernel Network Developers

[-- Attachment #1: Type: text/plain, Size: 731 bytes --]

On 10/17/2016 07:07 PM, Tom Herbert wrote:
>> +static inline void seg6_pernet_lock(struct net *net)
>> +{
>> +       mutex_lock(&seg6_pernet(net)->lock);
>> +}
>> +
>> +static inline void seg6_pernet_unlock(struct net *net)
>> +{
>> +       mutex_unlock(&seg6_pernet(net)->lock);
>> +}
>> +
> IMO it's better not to hide mutex_lock/unlock in a static inline
> function. Pairing mutex_lock with mutex_lock is critical and should be
> each to see in code.
> 

OK

>> +
>> +static int seg6_genl_sethmac(struct sk_buff *skb, struct genl_info *info)
>> +{
>> +       return -ENOTSUPP;
> 
> Is the intent to implement this later?
> 

The implementation is in this patch series with the rest of the HMAC code


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/9] ipv6: sr: add support for SRH encapsulation and injection with lwtunnels
  2016-10-17 17:17   ` Tom Herbert
@ 2016-10-18 11:47     ` David Lebrun
  0 siblings, 0 replies; 29+ messages in thread
From: David Lebrun @ 2016-10-18 11:47 UTC (permalink / raw)
  To: Tom Herbert; +Cc: Linux Kernel Network Developers

[-- Attachment #1: Type: text/plain, Size: 603 bytes --]

On 10/17/2016 07:17 PM, Tom Herbert wrote:
>> > +       err = ip6_dst_lookup(net, sk, &dst, &fl6);
> Please look at the use of dst_cache that I added in ila_lwt.c, I think
> the SR has similar properties and might be able to use dst_cache which
> is a significant performance improvement when source packets from a
> connected socket. The dst_cache will work in LWT as long as the new
> destination address is the same and other input to routing (saddr,
> flow label, mark, etc.) are either always the same or assumed to be
> immaterial to lookup of the second route.
> 

Great, thanks :)


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/9] ipv6: sr: add core files for SR HMAC support
  2016-10-17 17:24   ` Tom Herbert
@ 2016-10-18 12:07     ` David Lebrun
  0 siblings, 0 replies; 29+ messages in thread
From: David Lebrun @ 2016-10-18 12:07 UTC (permalink / raw)
  To: Tom Herbert; +Cc: Linux Kernel Network Developers

[-- Attachment #1: Type: text/plain, Size: 486 bytes --]

On 10/17/2016 07:24 PM, Tom Herbert wrote:
> A lot of this looks generic and potentially useful in other cases
> where we we want to do HMAC over some headers (I'm thinking GUE can
> probably use some of this for header authentication). Might be nice to
> split out the generic pieces at some point.

How would you see this integrated generically ? Directly into the crypto
subsystem ? The idea is interesting anyway, I can work on that after
this patch series goes through.


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/9] ipv6: sr: add core files for SR HMAC support
  2016-10-17 14:42 ` [PATCH 4/9] ipv6: sr: add core files for SR HMAC support David Lebrun
  2016-10-17 17:24   ` Tom Herbert
@ 2016-10-19 13:20   ` zhuyj
  2016-10-19 14:43     ` David Laight
  1 sibling, 1 reply; 29+ messages in thread
From: zhuyj @ 2016-10-19 13:20 UTC (permalink / raw)
  To: David Lebrun; +Cc: netdev

 __attribute__((packed)) is potentially unsafe on some systems. The
symptom probably won't show up on an x86, which just makes the problem
more insidious; testing on x86 systems won't reveal the problem. (On
the x86, misaligned accesses are handled in hardware; if you
dereference an int* pointer that points to an odd address, it will be
a little slower than if it were properly aligned, but you'll get the
correct result.)

On some other systems, such as SPARC, attempting to access a
misaligned int object causes a bus error, crashing the program.

On Mon, Oct 17, 2016 at 10:42 PM, David Lebrun
<david.lebrun@uclouvain.be> wrote:
> This patch adds the necessary functions to compute and check the HMAC signature
> of an SR-enabled packet. Two HMAC algorithms are supported: hmac(sha1) and
> hmac(sha256).
>
> In order to avoid dynamic memory allocation for each HMAC computation,
> a per-cpu ring buffer is allocated for this purpose.
>
> Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
> ---
>  include/linux/seg6_hmac.h      |   6 +
>  include/net/seg6_hmac.h        |  61 ++++++
>  include/uapi/linux/seg6_hmac.h |  20 ++
>  net/ipv6/seg6_hmac.c           | 432 +++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 519 insertions(+)
>  create mode 100644 include/linux/seg6_hmac.h
>  create mode 100644 include/net/seg6_hmac.h
>  create mode 100644 include/uapi/linux/seg6_hmac.h
>  create mode 100644 net/ipv6/seg6_hmac.c
>
> diff --git a/include/linux/seg6_hmac.h b/include/linux/seg6_hmac.h
> new file mode 100644
> index 0000000..da437eb
> --- /dev/null
> +++ b/include/linux/seg6_hmac.h
> @@ -0,0 +1,6 @@
> +#ifndef _LINUX_SEG6_HMAC_H
> +#define _LINUX_SEG6_HMAC_H
> +
> +#include <uapi/linux/seg6_hmac.h>
> +
> +#endif
> diff --git a/include/net/seg6_hmac.h b/include/net/seg6_hmac.h
> new file mode 100644
> index 0000000..6e5ee6a
> --- /dev/null
> +++ b/include/net/seg6_hmac.h
> @@ -0,0 +1,61 @@
> +/*
> + *  SR-IPv6 implementation
> + *
> + *  Author:
> + *  David Lebrun <david.lebrun@uclouvain.be>
> + *
> + *
> + *  This program is free software; you can redistribute it and/or
> + *      modify it under the terms of the GNU General Public License
> + *      as published by the Free Software Foundation; either version
> + *      2 of the License, or (at your option) any later version.
> + */
> +
> +#ifndef _NET_SEG6_HMAC_H
> +#define _NET_SEG6_HMAC_H
> +
> +#include <net/flow.h>
> +#include <net/ip6_fib.h>
> +#include <net/sock.h>
> +#include <linux/ip.h>
> +#include <linux/ipv6.h>
> +#include <linux/route.h>
> +#include <net/seg6.h>
> +#include <linux/seg6_hmac.h>
> +
> +#define SEG6_HMAC_MAX_DIGESTSIZE       160
> +#define SEG6_HMAC_RING_SIZE            256
> +
> +struct seg6_hmac_info {
> +       struct list_head list;
> +
> +       u32 hmackeyid;
> +       char secret[SEG6_HMAC_SECRET_LEN];
> +       u8 slen;
> +       u8 alg_id;
> +};
> +
> +struct seg6_hmac_algo {
> +       u8 alg_id;
> +       char name[64];
> +       struct crypto_shash * __percpu *tfms;
> +       struct shash_desc * __percpu *shashs;
> +};
> +
> +extern int seg6_hmac_compute(struct seg6_hmac_info *hinfo,
> +                            struct ipv6_sr_hdr *hdr, struct in6_addr *saddr,
> +                            u8 *output);
> +extern struct seg6_hmac_info *seg6_hmac_info_lookup(struct net *net, u32 key);
> +extern int seg6_hmac_info_add(struct net *net, u32 key,
> +                             struct seg6_hmac_info *hinfo);
> +extern int seg6_hmac_info_del(struct net *net, u32 key,
> +                             struct seg6_hmac_info *hinfo);
> +extern int seg6_push_hmac(struct net *net, struct in6_addr *saddr,
> +                         struct ipv6_sr_hdr *srh);
> +extern bool seg6_hmac_validate_skb(struct sk_buff *skb);
> +extern int seg6_hmac_init(void);
> +extern void seg6_hmac_exit(void);
> +extern int seg6_hmac_net_init(struct net *net);
> +extern void seg6_hmac_net_exit(struct net *net);
> +
> +#endif
> diff --git a/include/uapi/linux/seg6_hmac.h b/include/uapi/linux/seg6_hmac.h
> new file mode 100644
> index 0000000..0b5eda7
> --- /dev/null
> +++ b/include/uapi/linux/seg6_hmac.h
> @@ -0,0 +1,20 @@
> +#ifndef _UAPI_LINUX_SEG6_HMAC_H
> +#define _UAPI_LINUX_SEG6_HMAC_H
> +
> +#define SEG6_HMAC_SECRET_LEN   64
> +#define SEG6_HMAC_FIELD_LEN    32
> +
> +struct sr6_tlv_hmac {
> +       __u8 type;
> +       __u8 len;
> +       __u16 reserved;
> +       __be32 hmackeyid;
> +       __u8 hmac[SEG6_HMAC_FIELD_LEN];
> +} __attribute__((packed));
> +
> +enum {
> +       SEG6_HMAC_ALGO_SHA1 = 1,
> +       SEG6_HMAC_ALGO_SHA256 = 2,
> +};
> +
> +#endif
> diff --git a/net/ipv6/seg6_hmac.c b/net/ipv6/seg6_hmac.c
> new file mode 100644
> index 0000000..91bd59a
> --- /dev/null
> +++ b/net/ipv6/seg6_hmac.c
> @@ -0,0 +1,432 @@
> +/*
> + *  SR-IPv6 implementation -- HMAC functions
> + *
> + *  Author:
> + *  David Lebrun <david.lebrun@uclouvain.be>
> + *
> + *
> + *  This program is free software; you can redistribute it and/or
> + *      modify it under the terms of the GNU General Public License
> + *      as published by the Free Software Foundation; either version
> + *      2 of the License, or (at your option) any later version.
> + */
> +
> +#include <linux/errno.h>
> +#include <linux/types.h>
> +#include <linux/socket.h>
> +#include <linux/sockios.h>
> +#include <linux/net.h>
> +#include <linux/netdevice.h>
> +#include <linux/in6.h>
> +#include <linux/icmpv6.h>
> +#include <linux/mroute6.h>
> +#include <linux/slab.h>
> +
> +#include <linux/netfilter.h>
> +#include <linux/netfilter_ipv6.h>
> +
> +#include <net/sock.h>
> +#include <net/snmp.h>
> +
> +#include <net/ipv6.h>
> +#include <net/protocol.h>
> +#include <net/transp_v6.h>
> +#include <net/rawv6.h>
> +#include <net/ndisc.h>
> +#include <net/ip6_route.h>
> +#include <net/addrconf.h>
> +#include <net/xfrm.h>
> +
> +#include <linux/cryptohash.h>
> +#include <crypto/hash.h>
> +#include <crypto/sha.h>
> +#include <net/seg6.h>
> +#include <net/genetlink.h>
> +#include <net/seg6_hmac.h>
> +#include <linux/random.h>
> +
> +static char * __percpu *hmac_ring;
> +
> +static struct seg6_hmac_algo hmac_algos[] = {
> +       {
> +               .alg_id = SEG6_HMAC_ALGO_SHA1,
> +               .name = "hmac(sha1)",
> +       },
> +       {
> +               .alg_id = SEG6_HMAC_ALGO_SHA256,
> +               .name = "hmac(sha256)",
> +       },
> +};
> +
> +static struct seg6_hmac_algo *__hmac_get_algo(u8 alg_id)
> +{
> +       int i, alg_count;
> +       struct seg6_hmac_algo *algo;
> +
> +       alg_count = sizeof(hmac_algos) / sizeof(struct seg6_hmac_algo);
> +       for (i = 0; i < alg_count; i++) {
> +               algo = &hmac_algos[i];
> +               if (algo->alg_id == alg_id)
> +                       return algo;
> +       }
> +
> +       return NULL;
> +}
> +
> +static int __do_hmac(struct seg6_hmac_info *hinfo, const char *text, u8 psize,
> +                    u8 *output, int outlen)
> +{
> +       struct crypto_shash *tfm;
> +       struct shash_desc *shash;
> +       struct seg6_hmac_algo *algo;
> +       int ret, dgsize;
> +
> +       algo = __hmac_get_algo(hinfo->alg_id);
> +       if (!algo)
> +               return -ENOENT;
> +
> +       tfm = *this_cpu_ptr(algo->tfms);
> +
> +       dgsize = crypto_shash_digestsize(tfm);
> +       if (dgsize > outlen) {
> +               pr_debug("sr-ipv6: __do_hmac: digest size too big (%d / %d)\n",
> +                        dgsize, outlen);
> +               return -ENOMEM;
> +       }
> +
> +       ret = crypto_shash_setkey(tfm, hinfo->secret, hinfo->slen);
> +       if (ret < 0) {
> +               pr_debug("sr-ipv6: crypto_shash_setkey failed: err %d\n", ret);
> +               goto failed;
> +       }
> +
> +       shash = *this_cpu_ptr(algo->shashs);
> +       shash->tfm = tfm;
> +
> +       ret = crypto_shash_digest(shash, text, psize, output);
> +       if (ret < 0) {
> +               pr_debug("sr-ipv6: crypto_shash_digest failed: err %d\n", ret);
> +               goto failed;
> +       }
> +
> +       return dgsize;
> +
> +failed:
> +       return ret;
> +}
> +
> +int seg6_hmac_compute(struct seg6_hmac_info *hinfo, struct ipv6_sr_hdr *hdr,
> +                     struct in6_addr *saddr, u8 *output)
> +{
> +       int plen, i, dgsize, wrsize;
> +       char *ring, *off;
> +       u8 tmp_out[SEG6_HMAC_MAX_DIGESTSIZE];
> +       __be32 hmackeyid = cpu_to_be32(hinfo->hmackeyid);
> +
> +       /* a 160-byte buffer for digest output allows to store highest known
> +        * hash function (RadioGatun) with up to 1216 bits
> +        */
> +
> +       /* saddr(16) + first_seg(1) + cleanup(1) + keyid(4) + seglist(16n) */
> +       plen = 16 + 1 + 1 + 4 + (hdr->first_segment + 1) * 16;
> +
> +       /* this limit allows for 14 segments */
> +       if (plen >= SEG6_HMAC_RING_SIZE)
> +               return -EMSGSIZE;
> +
> +       local_bh_disable();
> +       ring = *this_cpu_ptr(hmac_ring);
> +       off = ring;
> +       memcpy(off, saddr, 16);
> +       off += 16;
> +       *off++ = hdr->first_segment;
> +       *off++ = !!(sr_get_flags(hdr) & SR6_FLAG_CLEANUP) << 7;
> +       memcpy(off, &hmackeyid, 4);
> +       off += 4;
> +
> +       for (i = 0; i < hdr->first_segment + 1; i++) {
> +               memcpy(off, hdr->segments + i, 16);
> +               off += 16;
> +       }
> +
> +       dgsize = __do_hmac(hinfo, ring, plen, tmp_out,
> +                          SEG6_HMAC_MAX_DIGESTSIZE);
> +       local_bh_enable();
> +
> +       if (dgsize < 0)
> +               return dgsize;
> +
> +       wrsize = SEG6_HMAC_FIELD_LEN;
> +       if (wrsize > dgsize)
> +               wrsize = dgsize;
> +
> +       memset(output, 0, SEG6_HMAC_FIELD_LEN);
> +       memcpy(output, tmp_out, wrsize);
> +
> +       return 0;
> +}
> +EXPORT_SYMBOL(seg6_hmac_compute);
> +
> +/* checks if an incoming SR-enabled packet's HMAC status matches
> + * the incoming policy.
> + *
> + * called with rcu_read_lock()
> + */
> +bool seg6_hmac_validate_skb(struct sk_buff *skb)
> +{
> +       struct inet6_dev *idev;
> +       struct ipv6_sr_hdr *srh;
> +       struct sr6_tlv_hmac *tlv;
> +       struct seg6_hmac_info *hinfo;
> +       struct net *net = dev_net(skb->dev);
> +       u8 hmac_output[SEG6_HMAC_FIELD_LEN];
> +
> +       idev = __in6_dev_get(skb->dev);
> +
> +       srh = (struct ipv6_sr_hdr *)skb_transport_header(skb);
> +
> +       tlv = seg6_get_tlv_hmac(srh);
> +
> +       /* mandatory check but no tlv */
> +       if (idev->cnf.seg6_require_hmac > 0 && !tlv)
> +               return false;
> +
> +       /* no check */
> +       if (idev->cnf.seg6_require_hmac < 0)
> +               return true;
> +
> +       /* check only if present */
> +       if (idev->cnf.seg6_require_hmac == 0 && !tlv)
> +               return true;
> +
> +       /* now, seg6_require_hmac >= 0 && tlv */
> +
> +       hinfo = seg6_hmac_info_lookup(net, be32_to_cpu(tlv->hmackeyid));
> +       if (!hinfo)
> +               return false;
> +
> +       if (seg6_hmac_compute(hinfo, srh, &ipv6_hdr(skb)->saddr, hmac_output))
> +               return false;
> +
> +       if (memcmp(hmac_output, tlv->hmac, SEG6_HMAC_FIELD_LEN) != 0)
> +               return false;
> +
> +       return true;
> +}
> +
> +/* called with rcu_read_lock() */
> +struct seg6_hmac_info *seg6_hmac_info_lookup(struct net *net, u32 key)
> +{
> +       struct seg6_pernet_data *sdata = seg6_pernet(net);
> +       struct seg6_hmac_info *hinfo;
> +
> +       list_for_each_entry_rcu(hinfo, &sdata->hmac_infos, list) {
> +               if (hinfo->hmackeyid == key)
> +                       return hinfo;
> +       }
> +
> +       return NULL;
> +}
> +EXPORT_SYMBOL(seg6_hmac_info_lookup);
> +
> +int seg6_hmac_info_add(struct net *net, u32 key, struct seg6_hmac_info *hinfo)
> +{
> +       struct seg6_pernet_data *sdata = seg6_pernet(net);
> +       struct seg6_hmac_info *old_hinfo;
> +
> +       old_hinfo = seg6_hmac_info_lookup(net, key);
> +       if (old_hinfo)
> +               return -EEXIST;
> +
> +       list_add_rcu(&hinfo->list, &sdata->hmac_infos);
> +
> +       return 0;
> +}
> +EXPORT_SYMBOL(seg6_hmac_info_add);
> +
> +int seg6_hmac_info_del(struct net *net, u32 key, struct seg6_hmac_info *hinfo)
> +{
> +       struct seg6_hmac_info *tmp;
> +
> +       tmp = seg6_hmac_info_lookup(net, key);
> +       if (!tmp)
> +               return -ENOENT;
> +
> +       /* entry was replaced, ignore deletion */
> +       if (tmp != hinfo)
> +               return -ENOENT;
> +
> +       list_del_rcu(&hinfo->list);
> +       synchronize_net();
> +
> +       return 0;
> +}
> +EXPORT_SYMBOL(seg6_hmac_info_del);
> +
> +static void seg6_hmac_info_flush(struct net *net)
> +{
> +       struct seg6_pernet_data *sdata = seg6_pernet(net);
> +       struct seg6_hmac_info *hinfo;
> +
> +       seg6_pernet_lock(net);
> +       while ((hinfo = list_first_or_null_rcu(&sdata->hmac_infos,
> +                                              struct seg6_hmac_info,
> +                                              list)) != NULL) {
> +               list_del_rcu(&hinfo->list);
> +               seg6_pernet_unlock(net);
> +               synchronize_net();
> +               kfree(hinfo);
> +               seg6_pernet_lock(net);
> +       }
> +
> +       seg6_pernet_unlock(net);
> +}
> +
> +int seg6_push_hmac(struct net *net, struct in6_addr *saddr,
> +                  struct ipv6_sr_hdr *srh)
> +{
> +       struct seg6_hmac_info *hinfo;
> +       int err = -ENOENT;
> +       struct sr6_tlv_hmac *tlv;
> +
> +       tlv = seg6_get_tlv(srh, SR6_TLV_HMAC);
> +       if (!tlv)
> +               return -EINVAL;
> +
> +       rcu_read_lock();
> +
> +       hinfo = seg6_hmac_info_lookup(net, be32_to_cpu(tlv->hmackeyid));
> +       if (!hinfo)
> +               goto out;
> +
> +       memset(tlv->hmac, 0, SEG6_HMAC_FIELD_LEN);
> +       err = seg6_hmac_compute(hinfo, srh, saddr, tlv->hmac);
> +
> +out:
> +       rcu_read_unlock();
> +       return err;
> +}
> +EXPORT_SYMBOL(seg6_push_hmac);
> +
> +static int seg6_hmac_init_ring(void)
> +{
> +       int i;
> +
> +       hmac_ring = alloc_percpu(char *);
> +
> +       if (!hmac_ring)
> +               return -ENOMEM;
> +
> +       for_each_possible_cpu(i) {
> +               char *ring = kzalloc(SEG6_HMAC_RING_SIZE, GFP_KERNEL);
> +
> +               if (!ring)
> +                       return -ENOMEM;
> +
> +               *per_cpu_ptr(hmac_ring, i) = ring;
> +       }
> +
> +       return 0;
> +}
> +
> +static int seg6_hmac_init_algo(void)
> +{
> +       int i, alg_count, cpu;
> +       struct seg6_hmac_algo *algo;
> +       struct crypto_shash *tfm;
> +       struct shash_desc *shash;
> +
> +       alg_count = sizeof(hmac_algos) / sizeof(struct seg6_hmac_algo);
> +
> +       for (i = 0; i < alg_count; i++) {
> +               int shsize;
> +               struct crypto_shash **p_tfm;
> +
> +               algo = &hmac_algos[i];
> +               algo->tfms = alloc_percpu(struct crypto_shash *);
> +               if (!algo->tfms)
> +                       return -ENOMEM;
> +
> +               for_each_possible_cpu(cpu) {
> +                       tfm = crypto_alloc_shash(algo->name, 0, GFP_KERNEL);
> +                       if (IS_ERR(tfm))
> +                               return PTR_ERR(tfm);
> +                       p_tfm = per_cpu_ptr(algo->tfms, cpu);
> +                       *p_tfm = tfm;
> +               }
> +
> +               p_tfm = this_cpu_ptr(algo->tfms);
> +               tfm = *p_tfm;
> +
> +               shsize = sizeof(*shash) + crypto_shash_descsize(tfm);
> +
> +               algo->shashs = alloc_percpu(struct shash_desc *);
> +               if (!algo->shashs)
> +                       return -ENOMEM;
> +
> +               for_each_possible_cpu(cpu) {
> +                       shash = kzalloc(shsize, GFP_KERNEL);
> +                       if (!shash)
> +                               return -ENOMEM;
> +                       *per_cpu_ptr(algo->shashs, cpu) = shash;
> +               }
> +       }
> +
> +       return 0;
> +}
> +
> +int __init seg6_hmac_init(void)
> +{
> +       int ret;
> +
> +       ret = seg6_hmac_init_ring();
> +       if (ret < 0)
> +               goto out;
> +
> +       ret = seg6_hmac_init_algo();
> +
> +out:
> +       return ret;
> +}
> +
> +int __net_init seg6_hmac_net_init(struct net *net)
> +{
> +       struct seg6_pernet_data *sdata = seg6_pernet(net);
> +
> +       INIT_LIST_HEAD(&sdata->hmac_infos);
> +       return 0;
> +}
> +
> +void __exit seg6_hmac_exit(void)
> +{
> +       int i, alg_count, cpu;
> +       struct seg6_hmac_algo *algo = NULL;
> +
> +       for_each_possible_cpu(i) {
> +               char *ring = *per_cpu_ptr(hmac_ring, i);
> +
> +               kfree(ring);
> +       }
> +       free_percpu(hmac_ring);
> +
> +       alg_count = sizeof(hmac_algos) / sizeof(struct seg6_hmac_algo);
> +       for (i = 0; i < alg_count; i++) {
> +               algo = &hmac_algos[i];
> +               for_each_possible_cpu(cpu) {
> +                       struct crypto_shash *tfm;
> +                       struct shash_desc *shash;
> +
> +                       shash = *per_cpu_ptr(algo->shashs, cpu);
> +                       kfree(shash);
> +                       tfm = *per_cpu_ptr(algo->tfms, cpu);
> +                       crypto_free_shash(tfm);
> +               }
> +               free_percpu(algo->tfms);
> +               free_percpu(algo->shashs);
> +       }
> +}
> +
> +void __net_exit seg6_hmac_net_exit(struct net *net)
> +{
> +       seg6_hmac_info_flush(net);
> +}
> --
> 2.7.3
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [PATCH 4/9] ipv6: sr: add core files for SR HMAC support
  2016-10-19 13:20   ` zhuyj
@ 2016-10-19 14:43     ` David Laight
  0 siblings, 0 replies; 29+ messages in thread
From: David Laight @ 2016-10-19 14:43 UTC (permalink / raw)
  To: 'zhuyj', David Lebrun; +Cc: netdev

From: zhuyj
> Sent: 19 October 2016 14:21
>  __attribute__((packed)) is potentially unsafe on some systems. The
> symptom probably won't show up on an x86, which just makes the problem
> more insidious; testing on x86 systems won't reveal the problem. (On
> the x86, misaligned accesses are handled in hardware; if you
> dereference an int* pointer that points to an odd address, it will be
> a little slower than if it were properly aligned, but you'll get the
> correct result.)
> 
> On some other systems, such as SPARC, attempting to access a
> misaligned int object causes a bus error, crashing the program.

__attribute__((packed)) causes the compiler to generate byte memory
access and shifts to access the unaligned data.
You don't get a fault, just rather more instructions that you had in mind.

You shouldn't really specify 'packed' unless there are real reasons
why the data structure will appear on misaligned addresses.

	David

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/9] ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header)
  2016-10-17 17:01   ` Tom Herbert
  2016-10-18 11:43     ` David Lebrun
@ 2016-10-20 13:04     ` David Lebrun
  2016-10-20 15:47       ` Tom Herbert
  1 sibling, 1 reply; 29+ messages in thread
From: David Lebrun @ 2016-10-20 13:04 UTC (permalink / raw)
  To: Tom Herbert; +Cc: Linux Kernel Network Developers

[-- Attachment #1: Type: text/plain, Size: 911 bytes --]

On 10/17/2016 07:01 PM, Tom Herbert wrote:
>> > +
>> > +       if (skb->ip_summed == CHECKSUM_COMPLETE)
>> > +               skb->ip_summed = CHECKSUM_NONE;
>> > +
> Because the packet is being changed? Would it make sense to update the
> checksum complete value based on the changes being made. Consider the
> case that the next hop is local to the host (someone may try to
> implement network virtualization this way).
> 

Rethinking about that: even if the next hop is local, I am not sure to
see the benefits of updating the checksum instead of setting
CHECKSUM_NONE. For example, if the next and final hop is local and the
packet carries a TCP payload, tcp_checksum_complete() would force the
recomputation of the checksum anyway (unless ip_summed ==
CHECKSUM_UNNECESSARY).

So I fail to see a path where updating the checksum would be beneficial.

Am I missing something ?

David


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 163 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/9] ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header)
  2016-10-20 13:04     ` David Lebrun
@ 2016-10-20 15:47       ` Tom Herbert
  0 siblings, 0 replies; 29+ messages in thread
From: Tom Herbert @ 2016-10-20 15:47 UTC (permalink / raw)
  To: David Lebrun; +Cc: Linux Kernel Network Developers

On Thu, Oct 20, 2016 at 6:04 AM, David Lebrun <david.lebrun@uclouvain.be> wrote:
> On 10/17/2016 07:01 PM, Tom Herbert wrote:
>>> > +
>>> > +       if (skb->ip_summed == CHECKSUM_COMPLETE)
>>> > +               skb->ip_summed = CHECKSUM_NONE;
>>> > +
>> Because the packet is being changed? Would it make sense to update the
>> checksum complete value based on the changes being made. Consider the
>> case that the next hop is local to the host (someone may try to
>> implement network virtualization this way).
>>
>
> Rethinking about that: even if the next hop is local, I am not sure to
> see the benefits of updating the checksum instead of setting
> CHECKSUM_NONE. For example, if the next and final hop is local and the
> packet carries a TCP payload, tcp_checksum_complete() would force the
> recomputation of the checksum anyway (unless ip_summed ==
> CHECKSUM_UNNECESSARY).
>
Or unless skb->csum_valid is set (tcp_checksum_complete calls
skb_csum_unnecessary where the check is done). If the checksum
complete value is correct then skb->csum_valid would be set from
skb_checksum_init which is called early in tcp_v4_rcv and tcp_v6_rcv.
This way if the penultimate and final hops are local and
CHECKSUM_COMPLETE is set computing the packet checksum is avoided for
a TCP packet.

Tom

> So I fail to see a path where updating the checksum would be beneficial.
>
> Am I missing something ?
>
> David
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2016-10-20 15:47 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-17 14:42 [PATCH 0/9] net: add support for IPv6 Segment Routing David Lebrun
2016-10-17 14:42 ` [PATCH 1/9] ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header) David Lebrun
2016-10-17 14:57   ` David Miller
2016-10-17 15:04     ` David Lebrun
2016-10-17 17:01   ` Tom Herbert
2016-10-18 11:43     ` David Lebrun
2016-10-20 13:04     ` David Lebrun
2016-10-20 15:47       ` Tom Herbert
2016-10-17 17:31   ` Tom Herbert
2016-10-17 14:42 ` [PATCH 2/9] ipv6: sr: add code base for control plane support of SR-IPv6 David Lebrun
2016-10-17 15:00   ` David Miller
2016-10-17 15:06     ` David Lebrun
2016-10-17 15:21       ` David Miller
2016-10-17 17:07   ` Tom Herbert
2016-10-18 11:45     ` David Lebrun
2016-10-17 14:42 ` [PATCH 3/9] ipv6: sr: add support for SRH encapsulation and injection with lwtunnels David Lebrun
2016-10-17 17:17   ` Tom Herbert
2016-10-18 11:47     ` David Lebrun
2016-10-17 14:42 ` [PATCH 4/9] ipv6: sr: add core files for SR HMAC support David Lebrun
2016-10-17 17:24   ` Tom Herbert
2016-10-18 12:07     ` David Lebrun
2016-10-19 13:20   ` zhuyj
2016-10-19 14:43     ` David Laight
2016-10-17 14:43 ` [PATCH 5/9] ipv6: sr: implement API to control SR HMAC structures David Lebrun
2016-10-17 14:43 ` [PATCH 6/9] ipv6: sr: add calls to verify and insert HMAC signatures David Lebrun
2016-10-17 14:43 ` [PATCH 7/9] ipv6: add source address argument for ipv6_push_nfrag_opts David Lebrun
2016-10-17 14:43 ` [PATCH 8/9] ipv6: sr: add support for SRH injection through setsockopt David Lebrun
2016-10-17 14:43 ` [PATCH 9/9] ipv6: sr: add documentation file for per-interface sysctls David Lebrun
2016-10-17 15:09 ` [PATCH 0/9] net: add support for IPv6 Segment Routing David Laight

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.