* [PATCH net-next 0/5] ipv6: Rework ext. headers infrastructure
@ 2019-01-23  5:31 Tom Herbert
  2019-01-23  5:31 ` [PATCH net-next 1/5] exthdrs: Create exthdrs_options.c Tom Herbert
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Tom Herbert @ 2019-01-23  5:31 UTC
  To: davem, netdev; +Cc: Tom Herbert

This patch set implements an infrastructure to allow fine grained
control over IPv6 Destination and Hop-by-Hop TLV options. This
includes being able to register handlers for receiving options as
well as a set of parameters that set permissions for who may send a
TLV option and the preferred format of a list of options.

The goal of the patch set is to enable broader use cases of IPv6
options, including those for which non-privileged users can send
options. In order to curtail misuse of options in such cases, a
number of requirements on how options may be sent and formatted are
enforced.

Particular features are:

- A single 256 entry table containing the information about each
  TLV is defined. Lookups are simple offset accesses to the table
  based on TLV type. Both receiving and sending properties of TLVs
  are maintained in the table.
- Allow registration/deregistration of receive handlers for specific
  TLVs.
- Describe the properties of sending different TLVs in a TLV parameter
  table. The parameter table can be managed via netlink.
- Allow non-privileged users to send TLVs for which they have been
  granted permission.
- Provide a deep validation of TLVs and TLV lists to enforce specific
  limits and permissions in order to thwart misuse of TLVs.
- Define a canonical format for sending TLVs that includes a
  preferred order, option alignment, and minimal padding between TLVs.
- Allow individual TLVs to be added or set in the txoptions list on a
  socket.
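
The table lookup and handler registration listed above can be sketched
in userspace roughly as follows. This is only an illustrative model:
tlv_table, tlv_register_rx, tlv_unregister_rx, tlv_receive and
example_ra_handler are hypothetical names and are not the interfaces
added by this series, and the kernel code additionally protects the
table with RCU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; names are illustrative, not the kernel API. */
typedef bool (*tlv_rx_handler)(const uint8_t *opt, size_t len);

struct tlv_entry {
	tlv_rx_handler rx_func;	/* NULL means no handler is registered */
};

/* One slot per possible 8-bit option type. */
static struct tlv_entry tlv_table[256];

/* Example handler: accept only a 2-byte data length (as Router Alert does). */
static bool example_ra_handler(const uint8_t *opt, size_t len)
{
	(void)len;
	return opt[1] == 2;
}

/* Register a handler; returns -1 if the slot is already taken. */
static int tlv_register_rx(uint8_t type, tlv_rx_handler func)
{
	assert(func != NULL);
	if (tlv_table[type].rx_func)
		return -1;
	tlv_table[type].rx_func = func;
	return 0;
}

/* Remove a handler; returns -1 if none was registered. */
static int tlv_unregister_rx(uint8_t type)
{
	if (!tlv_table[type].rx_func)
		return -1;
	tlv_table[type].rx_func = NULL;
	return 0;
}

/* Lookup is a plain array index on the option type, no list walk. */
static bool tlv_receive(const uint8_t *opt, size_t len)
{
	tlv_rx_handler func;

	if (len < 2 || (size_t)opt[1] + 2 > len)
		return false;
	func = tlv_table[opt[0]].rx_func;
	return func ? func(opt, len) : false;
}
```

In this model an option with no registered handler is simply rejected;
the real receive path instead falls back to the unknown-option handling
in ip6_tlvopt_unknown().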

Tested: Wrote and read different TLVs on a socket via setsockopt
and getsockopt. The flow is: add an individual TLV, read back the
options, then set the options that were read on a new socket as a
list. Verified that options are properly sent and received. Used a
modified ip command in iproute2 to test managing the TLV parameter
table via netlink.
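
For readers who want to reproduce a similar test, the following is a
minimal sketch of how a Destination Options header carrying one TLV
might be laid out in a buffer before handing it to setsockopt() with
IPV6_DSTOPTS. It is not the test program used above. build_dstopts is
a hypothetical helper, and the sketch assumes the option has no
alignment requirement of its own, so only the trailing pad to an
8-byte multiple is produced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OPT_PAD1 0x00 /* one-byte pad, has no length field */
#define OPT_PADN 0x01 /* multi-byte pad: type, length, zero bytes */

/*
 * Build a Destination Options header that carries one TLV, padded to a
 * multiple of 8 bytes. Returns the header length, or 0 if buf is too small.
 */
static size_t build_dstopts(uint8_t *buf, size_t buflen, uint8_t type,
			    const uint8_t *data, uint8_t dlen)
{
	size_t used = 2 + 2 + (size_t)dlen;	/* ext hdr fields + TLV */
	size_t total = (used + 7) & ~(size_t)7;	/* round up to 8 bytes */
	size_t pad = total - used;

	assert(buf != NULL && (data != NULL || dlen == 0));
	if (buflen < total)
		return 0;

	memset(buf, 0, total);			/* next header left as zero */
	buf[1] = (uint8_t)(total / 8 - 1);	/* 8-byte units minus one */
	buf[2] = type;
	buf[3] = dlen;
	if (dlen)
		memcpy(&buf[4], data, dlen);

	if (pad == 1) {
		buf[used] = OPT_PAD1;
	} else if (pad >= 2) {
		buf[used] = OPT_PADN;
		buf[used + 1] = (uint8_t)(pad - 2); /* pad data is zero */
	}
	return total;
}
```

The returned length is what would be passed as the option length, for
example setsockopt(fd, IPPROTO_IPV6, IPV6_DSTOPTS, buf, len). Whether
the call succeeds for a non-privileged user depends on the permissions
configured for that TLV type.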


Tom Herbert (5):
  exthdrs: Create exthdrs_options.c
  exthdrs: Registration of TLV handlers and parameters
  ip6tlvs: Add netlink interface
  ip6tlvs: Validation of TX Destination and Hop-by-Hop options
  ip6tlvs: API to set and remove individual TLVs from DO or HBH EH

 include/net/ipv6.h         |   76 ++
 include/uapi/linux/in6.h   |   58 ++
 net/ipv6/Makefile          |    2 +-
 net/ipv6/datagram.c        |   27 +-
 net/ipv6/exthdrs.c         |  389 +---------
 net/ipv6/exthdrs_options.c | 1722 ++++++++++++++++++++++++++++++++++++++++++++
 net/ipv6/ipv6_sockglue.c   |  110 ++-
 7 files changed, 2007 insertions(+), 377 deletions(-)
 create mode 100644 net/ipv6/exthdrs_options.c

-- 
2.7.4



* [PATCH net-next 1/5] exthdrs: Create exthdrs_options.c
  2019-01-23  5:31 [PATCH net-next 0/5] ipv6: Rework ext. headers infrastructure Tom Herbert
@ 2019-01-23  5:31 ` Tom Herbert
  2019-01-23  5:31 ` [PATCH net-next 2/5] exthdrs: Registration of TLV handlers and parameters Tom Herbert
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Tom Herbert @ 2019-01-23  5:31 UTC
  To: davem, netdev; +Cc: Tom Herbert

Create exthdrs_options.c to hold code related to Hop-by-Hop and
Destination extension header options. Move related functions in
exthdrs.c to the new file.
---
 include/net/ipv6.h         |   8 ++
 net/ipv6/Makefile          |   2 +-
 net/ipv6/exthdrs.c         | 342 --------------------------------------------
 net/ipv6/exthdrs_options.c | 346 +++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 355 insertions(+), 343 deletions(-)
 create mode 100644 net/ipv6/exthdrs_options.c

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index daf8086..8abdcdb 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -379,6 +379,14 @@ struct ipv6_txoptions *ipv6_renew_options(struct sock *sk,
 struct ipv6_txoptions *ipv6_fixup_options(struct ipv6_txoptions *opt_space,
 					  struct ipv6_txoptions *opt);
 
+struct tlvtype_proc {
+	int	type;
+	bool	(*func)(struct sk_buff *skb, int offset);
+};
+
+extern const struct tlvtype_proc tlvprocdestopt_lst[];
+extern const struct tlvtype_proc tlvprochopopt_lst[];
+
 bool ipv6_opt_accepted(const struct sock *sk, const struct sk_buff *skb,
 		       const struct inet6_skb_parm *opt);
 struct ipv6_txoptions *ipv6_update_options(struct sock *sk,
diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
index e0026fa..72bd775 100644
--- a/net/ipv6/Makefile
+++ b/net/ipv6/Makefile
@@ -10,7 +10,7 @@ ipv6-objs :=	af_inet6.o anycast.o ip6_output.o ip6_input.o addrconf.o \
 		route.o ip6_fib.o ipv6_sockglue.o ndisc.o udp.o udplite.o \
 		raw.o icmp.o mcast.o reassembly.o tcp_ipv6.o ping.o \
 		exthdrs.o datagram.o ip6_flowlabel.o inet6_connection_sock.o \
-		udp_offload.o seg6.o fib6_notifier.o
+		udp_offload.o seg6.o fib6_notifier.o exthdrs_options.o
 
 ipv6-offload :=	ip6_offload.o tcpv6_offload.o exthdrs_offload.o
 
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index 20291c2..6dbacf1 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -43,7 +43,6 @@
 #include <net/ndisc.h>
 #include <net/ip6_route.h>
 #include <net/addrconf.h>
-#include <net/calipso.h>
 #if IS_ENABLED(CONFIG_IPV6_MIP6)
 #include <net/xfrm.h>
 #endif
@@ -55,19 +54,6 @@
 
 #include <linux/uaccess.h>
 
-/*
- *	Parsing tlv encoded headers.
- *
- *	Parsing function "func" returns true, if parsing succeed
- *	and false, if it failed.
- *	It MUST NOT touch skb->h.
- */
-
-struct tlvtype_proc {
-	int	type;
-	bool	(*func)(struct sk_buff *skb, int offset);
-};
-
 /*********************
   Generic functions
  *********************/
@@ -204,80 +190,6 @@ static bool ip6_parse_tlv(const struct tlvtype_proc *procs,
 	return false;
 }
 
-/*****************************
-  Destination options header.
- *****************************/
-
-#if IS_ENABLED(CONFIG_IPV6_MIP6)
-static bool ipv6_dest_hao(struct sk_buff *skb, int optoff)
-{
-	struct ipv6_destopt_hao *hao;
-	struct inet6_skb_parm *opt = IP6CB(skb);
-	struct ipv6hdr *ipv6h = ipv6_hdr(skb);
-	int ret;
-
-	if (opt->dsthao) {
-		net_dbg_ratelimited("hao duplicated\n");
-		goto discard;
-	}
-	opt->dsthao = opt->dst1;
-	opt->dst1 = 0;
-
-	hao = (struct ipv6_destopt_hao *)(skb_network_header(skb) + optoff);
-
-	if (hao->length != 16) {
-		net_dbg_ratelimited("hao invalid option length = %d\n",
-				    hao->length);
-		goto discard;
-	}
-
-	if (!(ipv6_addr_type(&hao->addr) & IPV6_ADDR_UNICAST)) {
-		net_dbg_ratelimited("hao is not an unicast addr: %pI6\n",
-				    &hao->addr);
-		goto discard;
-	}
-
-	ret = xfrm6_input_addr(skb, (xfrm_address_t *)&ipv6h->daddr,
-			       (xfrm_address_t *)&hao->addr, IPPROTO_DSTOPTS);
-	if (unlikely(ret < 0))
-		goto discard;
-
-	if (skb_cloned(skb)) {
-		if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC))
-			goto discard;
-
-		/* update all variable using below by copied skbuff */
-		hao = (struct ipv6_destopt_hao *)(skb_network_header(skb) +
-						  optoff);
-		ipv6h = ipv6_hdr(skb);
-	}
-
-	if (skb->ip_summed == CHECKSUM_COMPLETE)
-		skb->ip_summed = CHECKSUM_NONE;
-
-	swap(ipv6h->saddr, hao->addr);
-
-	if (skb->tstamp == 0)
-		__net_timestamp(skb);
-
-	return true;
-
- discard:
-	kfree_skb(skb);
-	return false;
-}
-#endif
-
-static const struct tlvtype_proc tlvprocdestopt_lst[] = {
-#if IS_ENABLED(CONFIG_IPV6_MIP6)
-	{
-		.type	= IPV6_TLV_HAO,
-		.func	= ipv6_dest_hao,
-	},
-#endif
-	{-1,			NULL}
-};
-
 static int ipv6_destopt_rcv(struct sk_buff *skb)
 {
 	struct inet6_dev *idev = __in6_dev_get(skb->dev);
@@ -706,122 +618,6 @@ void ipv6_exthdrs_exit(void)
 	inet6_del_protocol(&rthdr_protocol, IPPROTO_ROUTING);
 }
 
-/**********************************
-  Hop-by-hop options.
- **********************************/
-
-/*
- * Note: we cannot rely on skb_dst(skb) before we assign it in ip6_route_input().
- */
-static inline struct inet6_dev *ipv6_skb_idev(struct sk_buff *skb)
-{
-	return skb_dst(skb) ? ip6_dst_idev(skb_dst(skb)) : __in6_dev_get(skb->dev);
-}
-
-static inline struct net *ipv6_skb_net(struct sk_buff *skb)
-{
-	return skb_dst(skb) ? dev_net(skb_dst(skb)->dev) : dev_net(skb->dev);
-}
-
-/* Router Alert as of RFC 2711 */
-
-static bool ipv6_hop_ra(struct sk_buff *skb, int optoff)
-{
-	const unsigned char *nh = skb_network_header(skb);
-
-	if (nh[optoff + 1] == 2) {
-		IP6CB(skb)->flags |= IP6SKB_ROUTERALERT;
-		memcpy(&IP6CB(skb)->ra, nh + optoff + 2, sizeof(IP6CB(skb)->ra));
-		return true;
-	}
-	net_dbg_ratelimited("ipv6_hop_ra: wrong RA length %d\n",
-			    nh[optoff + 1]);
-	kfree_skb(skb);
-	return false;
-}
-
-/* Jumbo payload */
-
-static bool ipv6_hop_jumbo(struct sk_buff *skb, int optoff)
-{
-	const unsigned char *nh = skb_network_header(skb);
-	struct inet6_dev *idev = __in6_dev_get_safely(skb->dev);
-	struct net *net = ipv6_skb_net(skb);
-	u32 pkt_len;
-
-	if (nh[optoff + 1] != 4 || (optoff & 3) != 2) {
-		net_dbg_ratelimited("ipv6_hop_jumbo: wrong jumbo opt length/alignment %d\n",
-				    nh[optoff+1]);
-		__IP6_INC_STATS(net, idev, IPSTATS_MIB_INHDRERRORS);
-		goto drop;
-	}
-
-	pkt_len = ntohl(*(__be32 *)(nh + optoff + 2));
-	if (pkt_len <= IPV6_MAXPLEN) {
-		__IP6_INC_STATS(net, idev, IPSTATS_MIB_INHDRERRORS);
-		icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, optoff+2);
-		return false;
-	}
-	if (ipv6_hdr(skb)->payload_len) {
-		__IP6_INC_STATS(net, idev, IPSTATS_MIB_INHDRERRORS);
-		icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, optoff);
-		return false;
-	}
-
-	if (pkt_len > skb->len - sizeof(struct ipv6hdr)) {
-		__IP6_INC_STATS(net, idev, IPSTATS_MIB_INTRUNCATEDPKTS);
-		goto drop;
-	}
-
-	if (pskb_trim_rcsum(skb, pkt_len + sizeof(struct ipv6hdr)))
-		goto drop;
-
-	IP6CB(skb)->flags |= IP6SKB_JUMBOGRAM;
-	return true;
-
-drop:
-	kfree_skb(skb);
-	return false;
-}
-
-/* CALIPSO RFC 5570 */
-
-static bool ipv6_hop_calipso(struct sk_buff *skb, int optoff)
-{
-	const unsigned char *nh = skb_network_header(skb);
-
-	if (nh[optoff + 1] < 8)
-		goto drop;
-
-	if (nh[optoff + 6] * 4 + 8 > nh[optoff + 1])
-		goto drop;
-
-	if (!calipso_validate(skb, nh + optoff))
-		goto drop;
-
-	return true;
-
-drop:
-	kfree_skb(skb);
-	return false;
-}
-
-static const struct tlvtype_proc tlvprochopopt_lst[] = {
-	{
-		.type	= IPV6_TLV_ROUTERALERT,
-		.func	= ipv6_hop_ra,
-	},
-	{
-		.type	= IPV6_TLV_JUMBO,
-		.func	= ipv6_hop_jumbo,
-	},
-	{
-		.type	= IPV6_TLV_CALIPSO,
-		.func	= ipv6_hop_calipso,
-	},
-	{ -1, }
-};
-
 int ipv6_parse_hopopts(struct sk_buff *skb)
 {
 	struct inet6_skb_parm *opt = IP6CB(skb);
@@ -992,144 +788,6 @@ void ipv6_push_frag_opts(struct sk_buff *skb, struct ipv6_txoptions *opt, u8 *pr
 }
 EXPORT_SYMBOL(ipv6_push_frag_opts);
 
-struct ipv6_txoptions *
-ipv6_dup_options(struct sock *sk, struct ipv6_txoptions *opt)
-{
-	struct ipv6_txoptions *opt2;
-
-	opt2 = sock_kmalloc(sk, opt->tot_len, GFP_ATOMIC);
-	if (opt2) {
-		long dif = (char *)opt2 - (char *)opt;
-		memcpy(opt2, opt, opt->tot_len);
-		if (opt2->hopopt)
-			*((char **)&opt2->hopopt) += dif;
-		if (opt2->dst0opt)
-			*((char **)&opt2->dst0opt) += dif;
-		if (opt2->dst1opt)
-			*((char **)&opt2->dst1opt) += dif;
-		if (opt2->srcrt)
-			*((char **)&opt2->srcrt) += dif;
-		refcount_set(&opt2->refcnt, 1);
-	}
-	return opt2;
-}
-EXPORT_SYMBOL_GPL(ipv6_dup_options);
-
-static void ipv6_renew_option(int renewtype,
-			      struct ipv6_opt_hdr **dest,
-			      struct ipv6_opt_hdr *old,
-			      struct ipv6_opt_hdr *new,
-			      int newtype, char **p)
-{
-	struct ipv6_opt_hdr *src;
-
-	src = (renewtype == newtype ? new : old);
-	if (!src)
-		return;
-
-	memcpy(*p, src, ipv6_optlen(src));
-	*dest = (struct ipv6_opt_hdr *)*p;
-	*p += CMSG_ALIGN(ipv6_optlen(*dest));
-}
-
-/**
- * ipv6_renew_options - replace a specific ext hdr with a new one.
- *
- * @sk: sock from which to allocate memory
- * @opt: original options
- * @newtype: option type to replace in @opt
- * @newopt: new option of type @newtype to replace (user-mem)
- * @newoptlen: length of @newopt
- *
- * Returns a new set of options which is a copy of @opt with the
- * option type @newtype replaced with @newopt.
- *
- * @opt may be NULL, in which case a new set of options is returned
- * containing just @newopt.
- *
- * @newopt may be NULL, in which case the specified option type is
- * not copied into the new set of options.
- *
- * The new set of options is allocated from the socket option memory
- * buffer of @sk.
- */
-struct ipv6_txoptions *
-ipv6_renew_options(struct sock *sk, struct ipv6_txoptions *opt,
-		   int newtype, struct ipv6_opt_hdr *newopt)
-{
-	int tot_len = 0;
-	char *p;
-	struct ipv6_txoptions *opt2;
-
-	if (opt) {
-		if (newtype != IPV6_HOPOPTS && opt->hopopt)
-			tot_len += CMSG_ALIGN(ipv6_optlen(opt->hopopt));
-		if (newtype != IPV6_RTHDRDSTOPTS && opt->dst0opt)
-			tot_len += CMSG_ALIGN(ipv6_optlen(opt->dst0opt));
-		if (newtype != IPV6_RTHDR && opt->srcrt)
-			tot_len += CMSG_ALIGN(ipv6_optlen(opt->srcrt));
-		if (newtype != IPV6_DSTOPTS && opt->dst1opt)
-			tot_len += CMSG_ALIGN(ipv6_optlen(opt->dst1opt));
-	}
-
-	if (newopt)
-		tot_len += CMSG_ALIGN(ipv6_optlen(newopt));
-
-	if (!tot_len)
-		return NULL;
-
-	tot_len += sizeof(*opt2);
-	opt2 = sock_kmalloc(sk, tot_len, GFP_ATOMIC);
-	if (!opt2)
-		return ERR_PTR(-ENOBUFS);
-
-	memset(opt2, 0, tot_len);
-	refcount_set(&opt2->refcnt, 1);
-	opt2->tot_len = tot_len;
-	p = (char *)(opt2 + 1);
-
-	ipv6_renew_option(IPV6_HOPOPTS, &opt2->hopopt,
-			  (opt ? opt->hopopt : NULL),
-			  newopt, newtype, &p);
-	ipv6_renew_option(IPV6_RTHDRDSTOPTS, &opt2->dst0opt,
-			  (opt ? opt->dst0opt : NULL),
-			  newopt, newtype, &p);
-	ipv6_renew_option(IPV6_RTHDR,
-			  (struct ipv6_opt_hdr **)&opt2->srcrt,
-			  (opt ? (struct ipv6_opt_hdr *)opt->srcrt : NULL),
-			  newopt, newtype, &p);
-	ipv6_renew_option(IPV6_DSTOPTS, &opt2->dst1opt,
-			  (opt ? opt->dst1opt : NULL),
-			  newopt, newtype, &p);
-
-	opt2->opt_nflen = (opt2->hopopt ? ipv6_optlen(opt2->hopopt) : 0) +
-			  (opt2->dst0opt ? ipv6_optlen(opt2->dst0opt) : 0) +
-			  (opt2->srcrt ? ipv6_optlen(opt2->srcrt) : 0);
-	opt2->opt_flen = (opt2->dst1opt ? ipv6_optlen(opt2->dst1opt) : 0);
-
-	return opt2;
-}
-
-struct ipv6_txoptions *ipv6_fixup_options(struct ipv6_txoptions *opt_space,
-					  struct ipv6_txoptions *opt)
-{
-	/*
-	 * ignore the dest before srcrt unless srcrt is being included.
-	 * --yoshfuji
-	 */
-	if (opt && opt->dst0opt && !opt->srcrt) {
-		if (opt_space != opt) {
-			memcpy(opt_space, opt, sizeof(*opt_space));
-			opt = opt_space;
-		}
-		opt->opt_nflen -= ipv6_optlen(opt->dst0opt);
-		opt->dst0opt = NULL;
-	}
-
-	return opt;
-}
-EXPORT_SYMBOL_GPL(ipv6_fixup_options);
-
 /**
  * fl6_update_dst - update flowi destination address with info given
  *                  by srcrt option, if any.
diff --git a/net/ipv6/exthdrs_options.c b/net/ipv6/exthdrs_options.c
new file mode 100644
index 0000000..70266a6
--- /dev/null
+++ b/net/ipv6/exthdrs_options.c
@@ -0,0 +1,346 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/errno.h>
+#include <linux/in6.h>
+#include <linux/net.h>
+#include <linux/netdevice.h>
+#include <linux/socket.h>
+#include <linux/types.h>
+#include <net/calipso.h>
+#include <net/ipv6.h>
+#include <net/ip6_route.h>
+#if IS_ENABLED(CONFIG_IPV6_MIP6)
+#include <net/xfrm.h>
+#endif
+
+/*	Parsing tlv encoded headers.
+ *
+ *	Parsing function "func" returns true, if parsing succeed
+ *	and false, if it failed.
+ *	It MUST NOT touch skb->h.
+ */
+
+struct ipv6_txoptions *
+ipv6_dup_options(struct sock *sk, struct ipv6_txoptions *opt)
+{
+	struct ipv6_txoptions *opt2;
+
+	opt2 = sock_kmalloc(sk, opt->tot_len, GFP_ATOMIC);
+	if (opt2) {
+		long dif = (char *)opt2 - (char *)opt;
+
+		memcpy(opt2, opt, opt->tot_len);
+		if (opt2->hopopt)
+			*((char **)&opt2->hopopt) += dif;
+		if (opt2->dst0opt)
+			*((char **)&opt2->dst0opt) += dif;
+		if (opt2->dst1opt)
+			*((char **)&opt2->dst1opt) += dif;
+		if (opt2->srcrt)
+			*((char **)&opt2->srcrt) += dif;
+		refcount_set(&opt2->refcnt, 1);
+	}
+	return opt2;
+}
+EXPORT_SYMBOL_GPL(ipv6_dup_options);
+
+static void ipv6_renew_option(int renewtype,
+			      struct ipv6_opt_hdr **dest,
+			      struct ipv6_opt_hdr *old,
+			      struct ipv6_opt_hdr *new,
+			      int newtype, char **p)
+{
+	struct ipv6_opt_hdr *src;
+
+	src = (renewtype == newtype ? new : old);
+	if (!src)
+		return;
+
+	memcpy(*p, src, ipv6_optlen(src));
+	*dest = (struct ipv6_opt_hdr *)*p;
+	*p += CMSG_ALIGN(ipv6_optlen(*dest));
+}
+
+/**
+ * ipv6_renew_options - replace a specific ext hdr with a new one.
+ *
+ * @sk: sock from which to allocate memory
+ * @opt: original options
+ * @newtype: option type to replace in @opt
+ * @newopt: new option of type @newtype to replace (user-mem)
+ * @newoptlen: length of @newopt
+ *
+ * Returns a new set of options which is a copy of @opt with the
+ * option type @newtype replaced with @newopt.
+ *
+ * @opt may be NULL, in which case a new set of options is returned
+ * containing just @newopt.
+ *
+ * @newopt may be NULL, in which case the specified option type is
+ * not copied into the new set of options.
+ *
+ * The new set of options is allocated from the socket option memory
+ * buffer of @sk.
+ */
+struct ipv6_txoptions *
+ipv6_renew_options(struct sock *sk, struct ipv6_txoptions *opt,
+		   int newtype, struct ipv6_opt_hdr *newopt)
+{
+	int tot_len = 0;
+	char *p;
+	struct ipv6_txoptions *opt2;
+
+	if (opt) {
+		if (newtype != IPV6_HOPOPTS && opt->hopopt)
+			tot_len += CMSG_ALIGN(ipv6_optlen(opt->hopopt));
+		if (newtype != IPV6_RTHDRDSTOPTS && opt->dst0opt)
+			tot_len += CMSG_ALIGN(ipv6_optlen(opt->dst0opt));
+		if (newtype != IPV6_RTHDR && opt->srcrt)
+			tot_len += CMSG_ALIGN(ipv6_optlen(opt->srcrt));
+		if (newtype != IPV6_DSTOPTS && opt->dst1opt)
+			tot_len += CMSG_ALIGN(ipv6_optlen(opt->dst1opt));
+	}
+
+	if (newopt)
+		tot_len += CMSG_ALIGN(ipv6_optlen(newopt));
+
+	if (!tot_len)
+		return NULL;
+
+	tot_len += sizeof(*opt2);
+	opt2 = sock_kmalloc(sk, tot_len, GFP_ATOMIC);
+	if (!opt2)
+		return ERR_PTR(-ENOBUFS);
+
+	memset(opt2, 0, tot_len);
+	refcount_set(&opt2->refcnt, 1);
+	opt2->tot_len = tot_len;
+	p = (char *)(opt2 + 1);
+
+	ipv6_renew_option(IPV6_HOPOPTS, &opt2->hopopt,
+			  (opt ? opt->hopopt : NULL),
+			  newopt, newtype, &p);
+	ipv6_renew_option(IPV6_RTHDRDSTOPTS, &opt2->dst0opt,
+			  (opt ? opt->dst0opt : NULL),
+			  newopt, newtype, &p);
+	ipv6_renew_option(IPV6_RTHDR,
+			  (struct ipv6_opt_hdr **)&opt2->srcrt,
+			  (opt ? (struct ipv6_opt_hdr *)opt->srcrt : NULL),
+			  newopt, newtype, &p);
+	ipv6_renew_option(IPV6_DSTOPTS, &opt2->dst1opt,
+			  (opt ? opt->dst1opt : NULL),
+			  newopt, newtype, &p);
+
+	opt2->opt_nflen = (opt2->hopopt ? ipv6_optlen(opt2->hopopt) : 0) +
+			  (opt2->dst0opt ? ipv6_optlen(opt2->dst0opt) : 0) +
+			  (opt2->srcrt ? ipv6_optlen(opt2->srcrt) : 0);
+	opt2->opt_flen = (opt2->dst1opt ? ipv6_optlen(opt2->dst1opt) : 0);
+
+	return opt2;
+}
+
+struct ipv6_txoptions *ipv6_fixup_options(struct ipv6_txoptions *opt_space,
+					  struct ipv6_txoptions *opt)
+{
+	/* ignore the dest before srcrt unless srcrt is being included.
+	 * --yoshfuji
+	 */
+	if (opt && opt->dst0opt && !opt->srcrt) {
+		if (opt_space != opt) {
+			memcpy(opt_space, opt, sizeof(*opt_space));
+			opt = opt_space;
+		}
+		opt->opt_nflen -= ipv6_optlen(opt->dst0opt);
+		opt->dst0opt = NULL;
+	}
+
+	return opt;
+}
+EXPORT_SYMBOL_GPL(ipv6_fixup_options);
+
+/* Destination options header */
+
+#if IS_ENABLED(CONFIG_IPV6_MIP6)
+static bool ipv6_dest_hao(struct sk_buff *skb, int optoff)
+{
+	struct ipv6_destopt_hao *hao;
+	struct inet6_skb_parm *opt = IP6CB(skb);
+	struct ipv6hdr *ipv6h = ipv6_hdr(skb);
+	int ret;
+
+	if (opt->dsthao) {
+		net_dbg_ratelimited("hao duplicated\n");
+		goto discard;
+	}
+	opt->dsthao = opt->dst1;
+	opt->dst1 = 0;
+
+	hao = (struct ipv6_destopt_hao *)(skb_network_header(skb) + optoff);
+
+	if (hao->length != 16) {
+		net_dbg_ratelimited("hao invalid option length = %d\n",
+				    hao->length);
+		goto discard;
+	}
+
+	if (!(ipv6_addr_type(&hao->addr) & IPV6_ADDR_UNICAST)) {
+		net_dbg_ratelimited("hao is not an unicast addr: %pI6\n",
+				    &hao->addr);
+		goto discard;
+	}
+
+	ret = xfrm6_input_addr(skb, (xfrm_address_t *)&ipv6h->daddr,
+			       (xfrm_address_t *)&hao->addr, IPPROTO_DSTOPTS);
+	if (unlikely(ret < 0))
+		goto discard;
+
+	if (skb_cloned(skb)) {
+		if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC))
+			goto discard;
+
+		/* update all variable using below by copied skbuff */
+		hao = (struct ipv6_destopt_hao *)(skb_network_header(skb) +
+						  optoff);
+		ipv6h = ipv6_hdr(skb);
+	}
+
+	if (skb->ip_summed == CHECKSUM_COMPLETE)
+		skb->ip_summed = CHECKSUM_NONE;
+
+	swap(ipv6h->saddr, hao->addr);
+
+	if (skb->tstamp == 0)
+		__net_timestamp(skb);
+
+	return true;
+
+ discard:
+	kfree_skb(skb);
+	return false;
+}
+#endif
+
+const struct tlvtype_proc tlvprocdestopt_lst[] = {
+#if IS_ENABLED(CONFIG_IPV6_MIP6)
+	{
+		.type	= IPV6_TLV_HAO,
+		.func	= ipv6_dest_hao,
+	},
+#endif
+	{-1,			NULL}
+};
+
+/* Hop-by-hop options */
+
+/* Note: we cannot rely on skb_dst(skb) before we assign it in
+ * ip6_route_input().
+ */
+static inline struct inet6_dev *ipv6_skb_idev(struct sk_buff *skb)
+{
+	return skb_dst(skb) ? ip6_dst_idev(skb_dst(skb)) :
+	    __in6_dev_get(skb->dev);
+}
+
+static inline struct net *ipv6_skb_net(struct sk_buff *skb)
+{
+	return skb_dst(skb) ? dev_net(skb_dst(skb)->dev) : dev_net(skb->dev);
+}
+
+/* Router Alert as of RFC 2711 */
+
+static bool ipv6_hop_ra(struct sk_buff *skb, int optoff)
+{
+	const unsigned char *nh = skb_network_header(skb);
+
+	if (nh[optoff + 1] == 2) {
+		IP6CB(skb)->flags |= IP6SKB_ROUTERALERT;
+		memcpy(&IP6CB(skb)->ra, nh + optoff + 2,
+		       sizeof(IP6CB(skb)->ra));
+		return true;
+	}
+	net_dbg_ratelimited("%s: wrong RA length %d\n",
+			    __func__, nh[optoff + 1]);
+	kfree_skb(skb);
+	return false;
+}
+
+/* Jumbo payload */
+
+static bool ipv6_hop_jumbo(struct sk_buff *skb, int optoff)
+{
+	const unsigned char *nh = skb_network_header(skb);
+	struct inet6_dev *idev = __in6_dev_get_safely(skb->dev);
+	struct net *net = ipv6_skb_net(skb);
+	u32 pkt_len;
+
+	if (nh[optoff + 1] != 4 || (optoff & 3) != 2) {
+		net_dbg_ratelimited("%s: wrong jumbo opt length/alignment %d\n",
+				    __func__, nh[optoff + 1]);
+		__IP6_INC_STATS(net, idev, IPSTATS_MIB_INHDRERRORS);
+		goto drop;
+	}
+
+	pkt_len = ntohl(*(__be32 *)(nh + optoff + 2));
+	if (pkt_len <= IPV6_MAXPLEN) {
+		__IP6_INC_STATS(net, idev, IPSTATS_MIB_INHDRERRORS);
+		icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, optoff + 2);
+		return false;
+	}
+	if (ipv6_hdr(skb)->payload_len) {
+		__IP6_INC_STATS(net, idev, IPSTATS_MIB_INHDRERRORS);
+		icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, optoff);
+		return false;
+	}
+
+	if (pkt_len > skb->len - sizeof(struct ipv6hdr)) {
+		__IP6_INC_STATS(net, idev, IPSTATS_MIB_INTRUNCATEDPKTS);
+		goto drop;
+	}
+
+	if (pskb_trim_rcsum(skb, pkt_len + sizeof(struct ipv6hdr)))
+		goto drop;
+
+	IP6CB(skb)->flags |= IP6SKB_JUMBOGRAM;
+	return true;
+
+drop:
+	kfree_skb(skb);
+	return false;
+}
+
+/* CALIPSO RFC 5570 */
+
+static bool ipv6_hop_calipso(struct sk_buff *skb, int optoff)
+{
+	const unsigned char *nh = skb_network_header(skb);
+
+	if (nh[optoff + 1] < 8)
+		goto drop;
+
+	if (nh[optoff + 6] * 4 + 8 > nh[optoff + 1])
+		goto drop;
+
+	if (!calipso_validate(skb, nh + optoff))
+		goto drop;
+
+	return true;
+
+drop:
+	kfree_skb(skb);
+	return false;
+}
+
+const struct tlvtype_proc tlvprochopopt_lst[] = {
+	{
+		.type	= IPV6_TLV_ROUTERALERT,
+		.func	= ipv6_hop_ra,
+	},
+	{
+		.type	= IPV6_TLV_JUMBO,
+		.func	= ipv6_hop_jumbo,
+	},
+	{
+		.type	= IPV6_TLV_CALIPSO,
+		.func	= ipv6_hop_calipso,
+	},
+	{ -1, }
+};
-- 
2.7.4



* [PATCH net-next 2/5] exthdrs: Registration of TLV handlers and parameters
  2019-01-23  5:31 [PATCH net-next 0/5] ipv6: Rework ext. headers infrastructure Tom Herbert
  2019-01-23  5:31 ` [PATCH net-next 1/5] exthdrs: Create exthdrs_options.c Tom Herbert
@ 2019-01-23  5:31 ` Tom Herbert
  2019-01-23 19:27   ` David Miller
  2019-01-23  5:31 ` [PATCH net-next 3/5] ip6tlvs: Add netlink interface Tom Herbert
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 7+ messages in thread
From: Tom Herbert @ 2019-01-23  5:31 UTC
  To: davem, netdev; +Cc: Tom Herbert

Define a table that contains 256 entries, one for each TLV. Each entry
points to a structure that contains parameters and handler functions
for receiving and transmitting TLVs. The receive and transmit properties
can be managed independently.

TLV transmit properties include a description of limits, alignment,
and preferred ordering. TLV receive properties provide the receive
handler. A class attribute is defined in both receive and transmit
properties that indicates the type of extension header in which the
TLV may be used (e.g. Hop-by-Hop options, Destination options, or
Destination options before the routing header).

Receive TLV lookup and processing is modified to be a lookup in the
TLV table. The tlv_{set,unset}_{rx,tx}_param functions can be used to
set attributes in the TLV table. A table containing parameters for
TLVs supported by the kernel is used to initialize the TLV table.
---
 include/net/ipv6.h         |  58 ++++++++-
 include/uapi/linux/in6.h   |  17 +++
 net/ipv6/exthdrs.c         |  47 ++++---
 net/ipv6/exthdrs_options.c | 314 +++++++++++++++++++++++++++++++++++++++++----
 4 files changed, 389 insertions(+), 47 deletions(-)
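
Note (not part of the commit message): the class check described above
can be modelled in userspace roughly as below. A handler is invoked
only when its class mask intersects the class of the extension header
being parsed; otherwise the option is treated as unknown. The flag
values are taken from the uapi hunk in this patch, while
rx_param_model, rx_table, model_ra, dispatch_tlv and enum tlv_verdict
are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Class flags as defined in the uapi hunk of this patch. */
#define IPV6_TLV_CLASS_FLAG_HOPOPT	0x1
#define IPV6_TLV_CLASS_FLAG_RTRDSTOPT	0x2
#define IPV6_TLV_CLASS_FLAG_DSTOPT	0x4

/* Hypothetical userspace stand-in for struct tlv_rx_param. */
struct rx_param_model {
	unsigned char class;	/* mask of ext. headers the TLV may appear in */
	bool (*func)(unsigned int class, const uint8_t *opt);
};

enum tlv_verdict { TLV_HANDLED, TLV_REJECTED, TLV_UNKNOWN };

static struct rx_param_model rx_table[256];

/* Example handler: accept only a data length of 2. */
static bool model_ra(unsigned int class, const uint8_t *opt)
{
	(void)class;
	return opt[1] == 2;
}

/* Dispatch one option found while parsing a header of the given class. */
static enum tlv_verdict dispatch_tlv(unsigned int class, const uint8_t *opt)
{
	const struct rx_param_model *p;

	assert(opt != NULL);
	p = &rx_table[opt[0]];

	if ((p->class & class) && p->func)
		return p->func(class, opt) ? TLV_HANDLED : TLV_REJECTED;

	/* No handler for this class: caller applies unknown-option rules. */
	return TLV_UNKNOWN;
}
```

With this shape, a TLV registered only for Hop-by-Hop options is
reported as unknown when it shows up in a Destination Options header,
which matches the intent of passing a class to ip6_parse_tlv().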

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 8abdcdb..3d3b5a1 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -379,13 +379,61 @@ struct ipv6_txoptions *ipv6_renew_options(struct sock *sk,
 struct ipv6_txoptions *ipv6_fixup_options(struct ipv6_txoptions *opt_space,
 					  struct ipv6_txoptions *opt);
 
-struct tlvtype_proc {
-	int	type;
-	bool	(*func)(struct sk_buff *skb, int offset);
+struct tlv_tx_param {
+	unsigned char preferred_order;
+	unsigned char admin_perm : 2;
+	unsigned char user_perm : 2;
+	unsigned char class : 3;
+	unsigned char align_mult : 4;
+	unsigned char align_off : 4;
+	unsigned char data_len_mult : 4;
+	unsigned char data_len_off : 4;
+	unsigned char min_data_len;
+	unsigned char max_data_len;
 };
 
-extern const struct tlvtype_proc tlvprocdestopt_lst[];
-extern const struct tlvtype_proc tlvprochopopt_lst[];
+struct tlv_rx_param {
+	unsigned char class: 3;
+	bool (*func)(unsigned int class, struct sk_buff *skb, int offset);
+};
+
+struct tlv_param {
+	struct tlv_tx_param tx_params;
+	struct tlv_rx_param rx_params;
+	struct rcu_head rcu;
+};
+
+extern struct tlv_param __rcu *tlv_param_table[256];
+
+/* Preferred TLV ordering (placed by increasing order) */
+#define TLV_PREF_ORDER_HAO		10
+#define TLV_PREF_ORDER_ROUTERALERT	20
+#define TLV_PREF_ORDER_JUMBO		30
+#define TLV_PREF_ORDER_CALIPSO		40
+
+/* tlv_deref_rx_params assumes rcu_read_lock is held */
+static inline struct tlv_rx_param *tlv_deref_rx_params(unsigned int type)
+{
+	struct tlv_param *tp = rcu_dereference(tlv_param_table[type]);
+
+	return &tp->rx_params;
+}
+
+/* tlv_deref_tx_params assumes rcu_read_lock is held */
+static inline struct tlv_tx_param *tlv_deref_tx_params(unsigned int type)
+{
+	struct tlv_param *tp = rcu_dereference(tlv_param_table[type]);
+
+	return &tp->tx_params;
+}
+
+int tlv_set_param(unsigned char type,
+		  const struct tlv_rx_param *rx_param_tmpl,
+		  const struct tlv_tx_param *tx_param_tmpl);
+int tlv_unset_rx_param(unsigned char type);
+int tlv_set_rx_param(unsigned char type, struct tlv_rx_param *rx_param_tmpl);
+int tlv_unset_tx_param(unsigned char type);
+int tlv_set_tx_param(unsigned char type, struct tlv_tx_param *tx_param_tmpl);
 
 bool ipv6_opt_accepted(const struct sock *sk, const struct sk_buff *skb,
 		       const struct inet6_skb_parm *opt);
diff --git a/include/uapi/linux/in6.h b/include/uapi/linux/in6.h
index 71d82fe..38e8e63 100644
--- a/include/uapi/linux/in6.h
+++ b/include/uapi/linux/in6.h
@@ -296,4 +296,21 @@ struct in6_flowlabel_req {
  * ...
  * MRT6_MAX
  */
+
+/* TLV permissions values */
+enum {
+	IPV6_TLV_PERM_NONE,
+	IPV6_TLV_PERM_WITH_CHECK,
+	IPV6_TLV_PERM_NO_CHECK,
+	IPV6_TLV_PERM_MAX = IPV6_TLV_PERM_NO_CHECK
+};
+
+/* Flags for EH type that can use a TLV option */
+#define IPV6_TLV_CLASS_FLAG_HOPOPT	0x1
+#define IPV6_TLV_CLASS_FLAG_RTRDSTOPT	0x2
+#define IPV6_TLV_CLASS_FLAG_DSTOPT	0x4
+#define IPV6_TLV_CLASS_MAX		0x7
+
+#define IPV6_TLV_CLASS_ANY_DSTOPT	(IPV6_TLV_CLASS_FLAG_RTRDSTOPT | \
+					 IPV6_TLV_CLASS_FLAG_DSTOPT)
 #endif /* _UAPI_LINUX_IN6_H */
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index 6dbacf1..af4152e 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -100,15 +100,14 @@ static bool ip6_tlvopt_unknown(struct sk_buff *skb, int optoff,
 
 /* Parse tlv encoded option header (hop-by-hop or destination) */
 
-static bool ip6_parse_tlv(const struct tlvtype_proc *procs,
-			  struct sk_buff *skb,
+static bool ip6_parse_tlv(unsigned int class, struct sk_buff *skb,
 			  int max_count)
 {
 	int len = (skb_transport_header(skb)[1] + 1) << 3;
 	const unsigned char *nh = skb_network_header(skb);
 	int off = skb_network_header_len(skb);
-	const struct tlvtype_proc *curr;
 	bool disallow_unknowns = false;
+	struct tlv_rx_param *tprx;
 	int tlv_count = 0;
 	int padlen = 0;
 
@@ -117,12 +116,16 @@ static bool ip6_parse_tlv(const struct tlvtype_proc *procs,
 		max_count = -max_count;
 	}
 
-	if (skb_transport_offset(skb) + len > skb_headlen(skb))
-		goto bad;
+	if (skb_transport_offset(skb) + len > skb_headlen(skb)) {
+		kfree_skb(skb);
+		return false;
+	}
 
 	off += 2;
 	len -= 2;
 
+	rcu_read_lock();
+
 	while (len > 0) {
 		int optlen = nh[off + 1] + 2;
 		int i;
@@ -162,19 +165,19 @@ static bool ip6_parse_tlv(const struct tlvtype_proc *procs,
 			if (tlv_count > max_count)
 				goto bad;
 
-			for (curr = procs; curr->type >= 0; curr++) {
-				if (curr->type == nh[off]) {
-					/* type specific length/alignment
-					   checks will be performed in the
-					   func(). */
-					if (curr->func(skb, off) == false)
-						return false;
-					break;
-				}
+			tprx = tlv_deref_rx_params(nh[off]);
+
+			if ((tprx->class & class) && tprx->func) {
+				/* Handler will apply additional checks to
+				 * the TLV
+				 */
+				if (!tprx->func(class, skb, off))
+					goto bad_nofree;
+			} else if (!ip6_tlvopt_unknown(skb, off,
+						       disallow_unknowns)) {
+				/* No appropriate handler, TLV is unknown */
+				goto bad_nofree;
 			}
-			if (curr->type < 0 &&
-			    !ip6_tlvopt_unknown(skb, off, disallow_unknowns))
-				return false;
 
 			padlen = 0;
 			break;
@@ -183,10 +186,14 @@ static bool ip6_parse_tlv(const struct tlvtype_proc *procs,
 		len -= optlen;
 	}
 
-	if (len == 0)
+	if (len == 0) {
+		rcu_read_unlock();
 		return true;
+	}
 bad:
 	kfree_skb(skb);
+bad_nofree:
+	rcu_read_unlock();
 	return false;
 }
 
@@ -220,7 +227,7 @@ static int ipv6_destopt_rcv(struct sk_buff *skb)
 	dstbuf = opt->dst1;
 #endif
 
-	if (ip6_parse_tlv(tlvprocdestopt_lst, skb,
+	if (ip6_parse_tlv(IPV6_TLV_CLASS_FLAG_DSTOPT, skb,
 			  init_net.ipv6.sysctl.max_dst_opts_cnt)) {
 		skb->transport_header += extlen;
 		opt = IP6CB(skb);
@@ -643,7 +650,7 @@ int ipv6_parse_hopopts(struct sk_buff *skb)
 		goto fail_and_free;
 
 	opt->flags |= IP6SKB_HOPBYHOP;
-	if (ip6_parse_tlv(tlvprochopopt_lst, skb,
+	if (ip6_parse_tlv(IPV6_TLV_CLASS_FLAG_HOPOPT, skb,
 			  init_net.ipv6.sysctl.max_hbh_opts_cnt)) {
 		skb->transport_header += extlen;
 		opt = IP6CB(skb);
diff --git a/net/ipv6/exthdrs_options.c b/net/ipv6/exthdrs_options.c
index 70266a6..a1b7a2e 100644
--- a/net/ipv6/exthdrs_options.c
+++ b/net/ipv6/exthdrs_options.c
@@ -11,6 +11,7 @@
 #if IS_ENABLED(CONFIG_IPV6_MIP6)
 #include <net/xfrm.h>
 #endif
+#include <uapi/linux/in.h>
 
 /*	Parsing tlv encoded headers.
  *
@@ -19,6 +20,8 @@
  *	It MUST NOT touch skb->h.
  */
 
+struct tlv_param __rcu *tlv_param_table[256];
+
 struct ipv6_txoptions *
 ipv6_dup_options(struct sock *sk, struct ipv6_txoptions *opt)
 {
@@ -160,7 +163,7 @@ EXPORT_SYMBOL_GPL(ipv6_fixup_options);
 /* Destination options header */
 
 #if IS_ENABLED(CONFIG_IPV6_MIP6)
-static bool ipv6_dest_hao(struct sk_buff *skb, int optoff)
+static bool ipv6_dest_hao(unsigned int class, struct sk_buff *skb, int optoff)
 {
 	struct ipv6_destopt_hao *hao;
 	struct inet6_skb_parm *opt = IP6CB(skb);
@@ -219,16 +222,6 @@ static bool ipv6_dest_hao(struct sk_buff *skb, int optoff)
 }
 #endif
 
-const struct tlvtype_proc tlvprocdestopt_lst[] = {
-#if IS_ENABLED(CONFIG_IPV6_MIP6)
-	{
-		.type	= IPV6_TLV_HAO,
-		.func	= ipv6_dest_hao,
-	},
-#endif
-	{-1,			NULL}
-};
-
 /* Hop-by-hop options */
 
 /* Note: we cannot rely on skb_dst(skb) before we assign it in
@@ -247,7 +240,7 @@ static inline struct net *ipv6_skb_net(struct sk_buff *skb)
 
 /* Router Alert as of RFC 2711 */
 
-static bool ipv6_hop_ra(struct sk_buff *skb, int optoff)
+static bool ipv6_hop_ra(unsigned int class, struct sk_buff *skb, int optoff)
 {
 	const unsigned char *nh = skb_network_header(skb);
 
@@ -265,7 +258,7 @@ static bool ipv6_hop_ra(struct sk_buff *skb, int optoff)
 
 /* Jumbo payload */
 
-static bool ipv6_hop_jumbo(struct sk_buff *skb, int optoff)
+static bool ipv6_hop_jumbo(unsigned int class, struct sk_buff *skb, int optoff)
 {
 	const unsigned char *nh = skb_network_header(skb);
 	struct inet6_dev *idev = __in6_dev_get_safely(skb->dev);
@@ -309,7 +302,8 @@ static bool ipv6_hop_jumbo(struct sk_buff *skb, int optoff)
 
 /* CALIPSO RFC 5570 */
 
-static bool ipv6_hop_calipso(struct sk_buff *skb, int optoff)
+static bool ipv6_hop_calipso(unsigned int class, struct sk_buff *skb,
+			     int optoff)
 {
 	const unsigned char *nh = skb_network_header(skb);
 
@@ -329,18 +323,294 @@ static bool ipv6_hop_calipso(struct sk_buff *skb, int optoff)
 	return false;
 }
 
-const struct tlvtype_proc tlvprochopopt_lst[] = {
+/* TLV parameter table functions and structures */
+
+static void tlv_param_release(struct rcu_head *rcu)
+{
+	struct tlv_param *tp = container_of(rcu, struct tlv_param, rcu);
+
+	vfree(tp);
+}
+
+/* Default (unset) values for TX TLV parameters */
+static const struct tlv_param tlv_default_param = {
+	.tx_params.preferred_order = 0,
+	.tx_params.admin_perm = IPV6_TLV_PERM_NO_CHECK,
+	.tx_params.user_perm = IPV6_TLV_PERM_NONE,
+	.tx_params.class = 0,
+	.tx_params.align_mult = (4 - 1), /* Default alignment: 4n + 2 */
+	.tx_params.align_off = 2,
+	.tx_params.min_data_len = 0,
+	.tx_params.max_data_len = 255,
+	.tx_params.data_len_mult = (1 - 1), /* No default length align */
+	.tx_params.data_len_off = 0,
+};
+
+int __tlv_write_param(unsigned char type, const struct tlv_param *tp)
+{
+	static DEFINE_MUTEX(tlv_mutex);
+	struct tlv_param *old;
+
+	mutex_lock(&tlv_mutex);
+
+	old = rcu_dereference_protected(tlv_param_table[type],
+					lockdep_is_held(&tlv_mutex));
+
+	rcu_assign_pointer(tlv_param_table[type], tp);
+
+	if (old != &tlv_default_param) {
+		/* Old table entry is not default. Assume that it was
+		 * vmalloc'ed so schedule a vfree in rcu.
+		 */
+		call_rcu(&old->rcu, tlv_param_release);
+	}
+
+	mutex_unlock(&tlv_mutex);
+
+	return 0;
+}
+
+int tlv_set_param(unsigned char type,
+		  const struct tlv_rx_param *rx_param_tmpl,
+		  const struct tlv_tx_param *tx_param_tmpl)
+{
+	struct tlv_param *tp;
+	int ret;
+
+	if (type < 2)
+		return -EINVAL;
+
+	/* Need to alloc and copy from templates */
+
+	tp = vmalloc(sizeof(*tp));
+	if (!tp)
+		return -ENOMEM;
+
+	memcpy(&tp->rx_params, rx_param_tmpl, sizeof(tp->rx_params));
+	memcpy(&tp->tx_params, tx_param_tmpl, sizeof(tp->tx_params));
+
+	ret = __tlv_write_param(type, tp);
+	if (ret < 0)
+		vfree(tp);
+
+	return ret;
+}
+EXPORT_SYMBOL(tlv_set_param);
+
+int tlv_unset_rx_param(unsigned char type)
+{
+	struct tlv_tx_param *tptx;
+	int ret;
+
+	if (type < 2)
+		return -EINVAL;
+
+	rcu_read_lock();
+
+	tptx = tlv_deref_tx_params(type);
+
+	if (!tptx->preferred_order)
+		ret = __tlv_write_param(type, &tlv_default_param);
+	else
+		ret = tlv_set_param(type, &tlv_default_param.rx_params, tptx);
+
+	rcu_read_unlock();
+
+	return ret;
+}
+EXPORT_SYMBOL(tlv_unset_rx_param);
+
+int tlv_set_rx_param(unsigned char type, struct tlv_rx_param *rx_param_tmpl)
+{
+	struct tlv_tx_param *tptx;
+	int ret;
+
+	if (type < 2)
+		return -EINVAL;
+
+	rcu_read_lock();
+
+	tptx = tlv_deref_tx_params(type);
+
+	ret = tlv_set_param(type, rx_param_tmpl, tptx);
+
+	rcu_read_unlock();
+
+	return ret;
+}
+EXPORT_SYMBOL(tlv_set_rx_param);
+
+int tlv_unset_tx_param(unsigned char type)
+{
+	struct tlv_rx_param *tprx;
+	int ret;
+
+	if (type < 2)
+		return -EINVAL;
+
+	rcu_read_lock();
+
+	tprx = tlv_deref_rx_params(type);
+
+	if (!tprx->class)
+		ret = __tlv_write_param(type, &tlv_default_param);
+	else
+		ret = tlv_set_param(type, tprx, &tlv_default_param.tx_params);
+
+	rcu_read_unlock();
+
+	return ret;
+}
+EXPORT_SYMBOL(tlv_unset_tx_param);
+
+int tlv_set_tx_param(unsigned char type, struct tlv_tx_param *tx_param_tmpl)
+{
+	struct tlv_rx_param *tprx;
+	int ret;
+
+	if (type < 2)
+		return -EINVAL;
+
+	rcu_read_lock();
+
+	tprx = tlv_deref_rx_params(type);
+
+	ret = tlv_set_param(type, tprx, tx_param_tmpl);
+
+	rcu_read_unlock();
+
+	return ret;
+}
+EXPORT_SYMBOL(tlv_set_tx_param);
+
+struct tlv_init_params {
+	int type;
+	struct tlv_tx_param t;
+	struct tlv_rx_param r;
+};
+
+static const struct tlv_init_params tlv_init_params[] __initconst = {
 	{
-		.type	= IPV6_TLV_ROUTERALERT,
-		.func	= ipv6_hop_ra,
+		.type = IPV6_TLV_HAO,
+
+		.t.preferred_order = TLV_PREF_ORDER_HAO,
+		.t.admin_perm = IPV6_TLV_PERM_NO_CHECK,
+		.t.user_perm = IPV6_TLV_PERM_NONE,
+		.t.class = IPV6_TLV_CLASS_FLAG_DSTOPT,
+		.t.align_mult = (8 - 1), /* Align to 8n + 6 */
+		.t.align_off = 6,
+		.t.min_data_len = 16,
+		.t.max_data_len = 16,
+		.t.data_len_mult = (1 - 1), /* Fixed length */
+		.t.data_len_off = 0,
+
+		.r.func = ipv6_dest_hao,
+		.r.class = IPV6_TLV_CLASS_FLAG_DSTOPT,
 	},
 	{
-		.type	= IPV6_TLV_JUMBO,
-		.func	= ipv6_hop_jumbo,
+		.type = IPV6_TLV_ROUTERALERT,
+
+		.t.preferred_order = TLV_PREF_ORDER_ROUTERALERT,
+		.t.admin_perm = IPV6_TLV_PERM_NO_CHECK,
+		.t.user_perm = IPV6_TLV_PERM_NONE,
+		.t.class = IPV6_TLV_CLASS_FLAG_HOPOPT,
+		.t.align_mult = (2 - 1), /* Align to 2n */
+		.t.align_off = 0,
+		.t.min_data_len = 2,
+		.t.max_data_len = 2,
+		.t.data_len_mult = (1 - 1), /* Fixed length */
+		.t.data_len_off = 0,
+
+		.r.func = ipv6_hop_ra,
+		.r.class = IPV6_TLV_CLASS_FLAG_HOPOPT,
 	},
 	{
-		.type	= IPV6_TLV_CALIPSO,
-		.func	= ipv6_hop_calipso,
+		.type = IPV6_TLV_JUMBO,
+
+		.t.preferred_order = TLV_PREF_ORDER_JUMBO,
+		.t.admin_perm = IPV6_TLV_PERM_NO_CHECK,
+		.t.user_perm = IPV6_TLV_PERM_NONE,
+		.t.class = IPV6_TLV_CLASS_FLAG_HOPOPT,
+		.t.align_mult = (4 - 1), /* Align to 4n + 2 */
+		.t.align_off = 2,
+		.t.min_data_len = 4,
+		.t.max_data_len = 4,
+		.t.data_len_mult = (1 - 1), /* Fixed length */
+		.t.data_len_off = 0,
+
+		.r.func = ipv6_hop_jumbo,
+		.r.class = IPV6_TLV_CLASS_FLAG_HOPOPT,
 	},
-	{ -1, }
+	{
+		.type = IPV6_TLV_CALIPSO,
+
+		.t.preferred_order = TLV_PREF_ORDER_CALIPSO,
+		.t.admin_perm = IPV6_TLV_PERM_NO_CHECK,
+		.t.user_perm = IPV6_TLV_PERM_NONE,
+		.t.class = IPV6_TLV_CLASS_FLAG_HOPOPT,
+		.t.align_mult = (4 - 1), /* Align to 4n + 2 */
+		.t.align_off = 2,
+		.t.min_data_len = 8,
+		.t.max_data_len = 252,
+		.t.data_len_mult = (4 - 1), /* Length is multiple of 4 */
+		.t.data_len_off = 0,
+
+		.r.func = ipv6_hop_calipso,
+		.r.class = IPV6_TLV_CLASS_FLAG_HOPOPT,
+	}
 };
+
+static int __init exthdrs_init(void)
+{
+	unsigned long check_map[BITS_TO_LONGS(256)];
+	const struct tlv_rx_param *rx_params;
+	const struct tlv_tx_param *tx_params;
+	int i, ret;
+
+	memset(check_map, 0, sizeof(check_map));
+
+	for (i = 2; i < 256; i++)
+		RCU_INIT_POINTER(tlv_param_table[i], &tlv_default_param);
+
+	for (i = 0; i < ARRAY_SIZE(tlv_init_params); i++) {
+		const struct tlv_init_params *tpi = &tlv_init_params[i];
+		unsigned int order = tpi->t.preferred_order;
+
+		WARN_ON(tpi->type < 2); /* Padding TLV initialized? */
+
+		if (order) {
+			WARN_ON(test_bit(order, check_map));
+			set_bit(order, check_map);
+			tx_params = &tpi->t;
+		} else {
+			tx_params = &tlv_default_param.tx_params;
+		}
+
+		if (tpi->r.class)
+			rx_params = &tpi->r;
+		else
+			rx_params = &tlv_default_param.rx_params;
+
+		ret = tlv_set_param(tpi->type, rx_params, tx_params);
+		if (ret < 0)
+			goto fail;
+	}
+
+	return 0;
+
+fail:
+	/* Undo anything that was set. */
+	for (i = 0; i < ARRAY_SIZE(tlv_init_params); i++)
+		__tlv_write_param(tlv_init_params[i].type, &tlv_default_param);
+
+	for (i = 2; i < 256; i++)
+		RCU_INIT_POINTER(tlv_param_table[i], NULL);
+
+	return ret;
+}
+module_init(exthdrs_init);
+
+static void __exit exthdrs_fini(void)
+{
+}
+module_exit(exthdrs_fini);
-- 
2.7.4



* [PATCH net-next 3/5] ip6tlvs: Add netlink interface
  2019-01-23  5:31 [PATCH net-next 0/5] ipv6: Rework ext. headers infrastructure Tom Herbert
  2019-01-23  5:31 ` [PATCH net-next 1/5] exthdrs: Create exthdrs_options.c Tom Herbert
  2019-01-23  5:31 ` [PATCH net-next 2/5] exthdrs: Registration of TLV handlers and parameters Tom Herbert
@ 2019-01-23  5:31 ` Tom Herbert
  2019-01-23  5:31 ` [PATCH net-next 4/5] ip6tlvs: Validation of TX Destination and Hop-by-Hop options Tom Herbert
  2019-01-23  5:31 ` [PATCH net-next 5/5] ip6tlvs: API to set and remove individual TLVs from DO or HBH EH Tom Herbert
  4 siblings, 0 replies; 7+ messages in thread
From: Tom Herbert @ 2019-01-23  5:31 UTC (permalink / raw)
  To: davem, netdev; +Cc: Tom Herbert

Add a netlink interface to manage the TX TLV parameters. Managed
parameters include those used to validate and format TLVs being sent,
such as alignment, TLV ordering, length limits, etc.
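
For illustration only: when IPV6_TLV_CMD_SET arrives without
IPV6_TLV_ATTR_ORDER for a type that has no order yet, an order is chosen
automatically as the first value after the greatest order in use. A rough
userspace sketch of that choice follows. It is not the kernel code:
`pick_next_order` and the plain `orders[]` array are hypothetical stand-ins
for the RCU-protected table, and the fallback when 255 is already taken
(highest free value) is an assumption about the intended behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: orders[] holds the preferred order of each TLV
 * type (0 means "unset"). Returns the order to assign to a new TLV,
 * or 0 if every value 1..255 is already taken.
 */
static unsigned int pick_next_order(const unsigned char *orders, size_t n)
{
	unsigned char used[256] = { 0 };
	unsigned int max = 0, v;
	size_t i;

	assert(orders != NULL);

	for (i = 0; i < n; i++) {
		if (!orders[i])
			continue;
		used[orders[i]] = 1;
		if (orders[i] > max)
			max = orders[i];
	}

	/* Common case: first value after the greatest order in use */
	if (max < 255)
		return max + 1;

	/* 255 is taken (assumed fallback): highest free value */
	for (v = 254; v >= 1; v--)
		if (!used[v])
			return v;

	return 0;
}
```

With orders {3, 7} in use the sketch picks 8; with an empty table it
picks 1.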
---
 include/uapi/linux/in6.h   |  32 +++++
 net/ipv6/exthdrs_options.c | 292 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 324 insertions(+)

diff --git a/include/uapi/linux/in6.h b/include/uapi/linux/in6.h
index 38e8e63..a54cf96 100644
--- a/include/uapi/linux/in6.h
+++ b/include/uapi/linux/in6.h
@@ -297,6 +297,38 @@ struct in6_flowlabel_req {
  * MRT6_MAX
  */
 
+/* NETLINK_GENERIC related info for IPv6 TLVs */
+
+#define IPV6_TLV_GENL_NAME	"ipv6-tlv"
+#define IPV6_TLV_GENL_VERSION	0x1
+
+enum {
+	IPV6_TLV_ATTR_UNSPEC,
+	IPV6_TLV_ATTR_TYPE,			/* u8, > 1 */
+	IPV6_TLV_ATTR_ORDER,			/* u8 */
+	IPV6_TLV_ATTR_ADMIN_PERM,		/* u8, perm value */
+	IPV6_TLV_ATTR_USER_PERM,		/* u8, perm value */
+	IPV6_TLV_ATTR_CLASS,			/* u8, 3 bit flags */
+	IPV6_TLV_ATTR_ALIGN_MULT,		/* u8, 1 to 16 */
+	IPV6_TLV_ATTR_ALIGN_OFF,		/* u8, 0 to 15 */
+	IPV6_TLV_ATTR_MIN_DATA_LEN,		/* u8 (option data length) */
+	IPV6_TLV_ATTR_MAX_DATA_LEN,		/* u8 (option data length) */
+	IPV6_TLV_ATTR_DATA_LEN_MULT,		/* u8, 1 to 16 */
+	IPV6_TLV_ATTR_DATA_LEN_OFF,		/* u8, 0 to 15 */
+
+	__IPV6_TLV_ATTR_MAX,
+};
+
+#define IPV6_TLV_ATTR_MAX              (__IPV6_TLV_ATTR_MAX - 1)
+
+enum {
+	IPV6_TLV_CMD_SET,
+	IPV6_TLV_CMD_UNSET,
+	IPV6_TLV_CMD_GET,
+
+	__IPV6_TLV_CMD_MAX,
+};
+
 /* TLV permissions values */
 enum {
 	IPV6_TLV_PERM_NONE,
diff --git a/net/ipv6/exthdrs_options.c b/net/ipv6/exthdrs_options.c
index a1b7a2e..da9e257 100644
--- a/net/ipv6/exthdrs_options.c
+++ b/net/ipv6/exthdrs_options.c
@@ -6,11 +6,13 @@
 #include <linux/socket.h>
 #include <linux/types.h>
 #include <net/calipso.h>
+#include <net/genetlink.h>
 #include <net/ipv6.h>
 #include <net/ip6_route.h>
 #if IS_ENABLED(CONFIG_IPV6_MIP6)
 #include <net/xfrm.h>
 #endif
+#include <uapi/linux/genetlink.h>
 #include <uapi/linux/in.h>
 
 /*	Parsing tlv encoded headers.
@@ -560,6 +562,291 @@ static const struct tlv_init_params tlv_init_params[] __initconst = {
 	}
 };
 
+static struct genl_family tlv_nl_family;
+
+static const struct nla_policy tlv_nl_policy[IPV6_TLV_ATTR_MAX + 1] = {
+	[IPV6_TLV_ATTR_TYPE] =		{ .type = NLA_U8, },
+	[IPV6_TLV_ATTR_ORDER] =		{ .type = NLA_U8, },
+	[IPV6_TLV_ATTR_ADMIN_PERM] =	{ .type = NLA_U8, },
+	[IPV6_TLV_ATTR_USER_PERM] =	{ .type = NLA_U8, },
+	[IPV6_TLV_ATTR_CLASS] =		{ .type = NLA_U8, },
+	[IPV6_TLV_ATTR_ALIGN_MULT] =	{ .type = NLA_U8, },
+	[IPV6_TLV_ATTR_ALIGN_OFF] =	{ .type = NLA_U8, },
+	[IPV6_TLV_ATTR_MIN_DATA_LEN] =	{ .type = NLA_U8, },
+	[IPV6_TLV_ATTR_MAX_DATA_LEN] =	{ .type = NLA_U8, },
+	[IPV6_TLV_ATTR_DATA_LEN_OFF] =	{ .type = NLA_U8, },
+	[IPV6_TLV_ATTR_DATA_LEN_MULT] =	{ .type = NLA_U8, },
+};
+
+static int tlv_nl_cmd_set(struct sk_buff *skb, struct genl_info *info)
+{
+	struct tlv_tx_param new_tx;
+	int retv = -EINVAL, i;
+	struct tlv_param *tp;
+	u8 tlv_type, v;
+
+	if (!info->attrs[IPV6_TLV_ATTR_TYPE])
+		return -EINVAL;
+
+	tlv_type = nla_get_u8(info->attrs[IPV6_TLV_ATTR_TYPE]);
+	if (tlv_type < 2)
+		return -EINVAL;
+
+	rcu_read_lock();
+
+	new_tx = *tlv_deref_tx_params(tlv_type);
+
+	if (info->attrs[IPV6_TLV_ATTR_ORDER]) {
+		v = nla_get_u8(info->attrs[IPV6_TLV_ATTR_ORDER]);
+		if (v) {
+			for (i = 2; i < 256; i++) {
+				/* Preferred orders must be unique */
+				tp = rcu_dereference(tlv_param_table[i]);
+				if (tp->tx_params.preferred_order == v &&
+				    i != tlv_type) {
+					retv = -EALREADY;
+					goto out;
+				}
+			}
+			new_tx.preferred_order = v;
+		}
+	}
+
+	if (!new_tx.preferred_order) {
+		unsigned long check_map[BITS_TO_LONGS(255)];
+		int pos;
+
+		/* Preferred order not specified, automatically set one.
+		 * This is chosen to be the first value after the greatest
+		 * order in use.
+		 */
+		memset(check_map, 0, sizeof(check_map));
+
+		for (i = 2; i < 256; i++) {
+			unsigned int order;
+
+			tp = rcu_dereference(tlv_param_table[i]);
+			order = tp->tx_params.preferred_order;
+
+			if (!order)
+				continue;
+
+			WARN_ON(test_bit(255 - order, check_map));
+			set_bit(255 - order, check_map);
+		}
+
+		pos = find_first_bit(check_map, 255);
+		if (pos)
+			new_tx.preferred_order = 255 - (pos - 1);
+		else
+			new_tx.preferred_order = 255 -
+			    find_first_zero_bit(check_map, 255);
+	}
+
+	if (info->attrs[IPV6_TLV_ATTR_ADMIN_PERM]) {
+		v = nla_get_u8(info->attrs[IPV6_TLV_ATTR_ADMIN_PERM]);
+		if (v > IPV6_TLV_PERM_MAX)
+			goto out;
+		new_tx.admin_perm = v;
+	}
+
+	if (info->attrs[IPV6_TLV_ATTR_USER_PERM]) {
+		v = nla_get_u8(info->attrs[IPV6_TLV_ATTR_USER_PERM]);
+		if (v > IPV6_TLV_PERM_MAX)
+			goto out;
+		new_tx.user_perm = v;
+	}
+
+	if (info->attrs[IPV6_TLV_ATTR_CLASS]) {
+		v = nla_get_u8(info->attrs[IPV6_TLV_ATTR_CLASS]);
+		if (v > IPV6_TLV_CLASS_MAX)
+			goto out;
+		new_tx.class = v;
+	}
+
+	if (info->attrs[IPV6_TLV_ATTR_ALIGN_MULT]) {
+		v = nla_get_u8(info->attrs[IPV6_TLV_ATTR_ALIGN_MULT]);
+		if (v > 16 || v < 1)
+			goto out;
+		new_tx.align_mult = v - 1;
+	}
+
+	if (info->attrs[IPV6_TLV_ATTR_ALIGN_OFF]) {
+		v = nla_get_u8(info->attrs[IPV6_TLV_ATTR_ALIGN_OFF]);
+		if (v > 15)
+			goto out;
+		new_tx.align_off = v;
+	}
+
+	if (info->attrs[IPV6_TLV_ATTR_MAX_DATA_LEN])
+		new_tx.max_data_len =
+		    nla_get_u8(info->attrs[IPV6_TLV_ATTR_MAX_DATA_LEN]);
+
+	if (info->attrs[IPV6_TLV_ATTR_MIN_DATA_LEN])
+		new_tx.min_data_len =
+		    nla_get_u8(info->attrs[IPV6_TLV_ATTR_MIN_DATA_LEN]);
+
+	if (info->attrs[IPV6_TLV_ATTR_DATA_LEN_MULT]) {
+		v = nla_get_u8(info->attrs[IPV6_TLV_ATTR_DATA_LEN_MULT]);
+		if (v > 16 || v < 1)
+			goto out;
+		new_tx.data_len_mult = v - 1;
+	}
+
+	if (info->attrs[IPV6_TLV_ATTR_DATA_LEN_OFF]) {
+		v = nla_get_u8(info->attrs[IPV6_TLV_ATTR_DATA_LEN_OFF]);
+		if (v > 15)
+			goto out;
+		new_tx.data_len_off = v;
+	}
+
+	retv = tlv_set_tx_param(tlv_type, &new_tx);
+
+out:
+	rcu_read_unlock();
+	return retv;
+}
+
+static int tlv_nl_cmd_unset(struct sk_buff *skb, struct genl_info *info)
+{
+	unsigned int tlv_type;
+
+	if (!info->attrs[IPV6_TLV_ATTR_TYPE])
+		return -EINVAL;
+
+	tlv_type = nla_get_u8(info->attrs[IPV6_TLV_ATTR_TYPE]);
+	if (tlv_type < 2)
+		return -EINVAL;
+
+	return tlv_unset_tx_param(tlv_type);
+}
+
+static int tlv_fill_info(int tlv_type, struct sk_buff *msg, bool admin)
+{
+	struct tlv_tx_param *tptx;
+	int ret = 0;
+
+	rcu_read_lock();
+
+	tptx = tlv_deref_tx_params(tlv_type);
+
+	if (nla_put_u8(msg, IPV6_TLV_ATTR_TYPE, tlv_type) ||
+	    nla_put_u8(msg, IPV6_TLV_ATTR_ORDER, tptx->preferred_order) ||
+	    nla_put_u8(msg, IPV6_TLV_ATTR_USER_PERM, tptx->user_perm) ||
+	    (admin && nla_put_u8(msg, IPV6_TLV_ATTR_ADMIN_PERM,
+				 tptx->admin_perm)) ||
+	    nla_put_u8(msg, IPV6_TLV_ATTR_CLASS, tptx->class) ||
+	    nla_put_u8(msg, IPV6_TLV_ATTR_ALIGN_MULT, tptx->align_mult + 1) ||
+	    nla_put_u8(msg, IPV6_TLV_ATTR_ALIGN_OFF, tptx->align_off) ||
+	    nla_put_u8(msg, IPV6_TLV_ATTR_MIN_DATA_LEN, tptx->min_data_len) ||
+	    nla_put_u8(msg, IPV6_TLV_ATTR_MAX_DATA_LEN, tptx->max_data_len) ||
+	    nla_put_u8(msg, IPV6_TLV_ATTR_DATA_LEN_MULT,
+		       tptx->data_len_mult + 1) ||
+	    nla_put_u8(msg, IPV6_TLV_ATTR_DATA_LEN_OFF, tptx->data_len_off))
+		ret = -1;
+
+	rcu_read_unlock();
+
+	return ret;
+}
+
+static int tlv_dump_info(int tlv_type, u32 portid, u32 seq, u32 flags,
+			 struct sk_buff *skb, u8 cmd, bool admin)
+{
+	void *hdr;
+
+	hdr = genlmsg_put(skb, portid, seq, &tlv_nl_family, flags, cmd);
+	if (!hdr)
+		return -ENOMEM;
+
+	if (tlv_fill_info(tlv_type, skb, admin) < 0) {
+		genlmsg_cancel(skb, hdr);
+		return -EMSGSIZE;
+	}
+
+	genlmsg_end(skb, hdr);
+
+	return 0;
+}
+
+static int tlv_nl_cmd_get(struct sk_buff *skb, struct genl_info *info)
+{
+	struct sk_buff *msg;
+	int ret, tlv_type;
+
+	if (!info->attrs[IPV6_TLV_ATTR_TYPE])
+		return -EINVAL;
+
+	tlv_type = nla_get_u8(info->attrs[IPV6_TLV_ATTR_TYPE]);
+	if (tlv_type < 2)
+		return -EINVAL;
+
+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	ret = tlv_dump_info(tlv_type, info->snd_portid, info->snd_seq, 0, msg,
+			    info->genlhdr->cmd,
+			    netlink_capable(skb, CAP_NET_ADMIN));
+	if (ret < 0) {
+		nlmsg_free(msg);
+		return ret;
+	}
+
+	return genlmsg_reply(msg, info);
+}
+
+static int tlv_nl_dump(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	int idx = 0, ret, i;
+
+	for (i = 2; i < 256; i++) {
+		if (idx++ < cb->args[0])
+			continue;
+		ret = tlv_dump_info(i, NETLINK_CB(cb->skb).portid,
+				    cb->nlh->nlmsg_seq, NLM_F_MULTI,
+				    skb, IPV6_TLV_CMD_GET,
+				    netlink_capable(cb->skb, CAP_NET_ADMIN));
+		if (ret)
+			break;
+	}
+
+	cb->args[0] = idx;
+	return skb->len;
+}
+
+static const struct genl_ops tlv_nl_ops[] = {
+{
+	.cmd = IPV6_TLV_CMD_SET,
+	.doit = tlv_nl_cmd_set,
+	.policy = tlv_nl_policy,
+	.flags = GENL_ADMIN_PERM,
+},
+{
+	.cmd = IPV6_TLV_CMD_UNSET,
+	.doit = tlv_nl_cmd_unset,
+	.policy = tlv_nl_policy,
+	.flags = GENL_ADMIN_PERM,
+},
+{
+	.cmd = IPV6_TLV_CMD_GET,
+	.doit = tlv_nl_cmd_get,
+	.dumpit = tlv_nl_dump,
+	.policy = tlv_nl_policy,
+},
+};
+
+static struct genl_family tlv_nl_family __ro_after_init = {
+	.hdrsize	= 0,
+	.name		= IPV6_TLV_GENL_NAME,
+	.version	= IPV6_TLV_GENL_VERSION,
+	.maxattr	= IPV6_TLV_ATTR_MAX,
+	.netnsok	= true,
+	.module		= THIS_MODULE,
+	.ops		= tlv_nl_ops,
+	.n_ops		= ARRAY_SIZE(tlv_nl_ops),
+};
+
 static int __init exthdrs_init(void)
 {
 	unsigned long check_map[BITS_TO_LONGS(256)];
@@ -596,6 +883,10 @@ static int __init exthdrs_init(void)
 			goto fail;
 	}
 
+	ret = genl_register_family(&tlv_nl_family);
+	if (ret < 0)
+		goto fail;
+
 	return 0;
 
 fail:
@@ -612,5 +903,6 @@ module_init(exthdrs_init);
 
 static void __exit exthdrs_fini(void)
 {
+	genl_unregister_family(&tlv_nl_family);
 }
 module_exit(exthdrs_fini);
-- 
2.7.4



* [PATCH net-next 4/5] ip6tlvs: Validation of TX Destination and Hop-by-Hop options
  2019-01-23  5:31 [PATCH net-next 0/5] ipv6: Rework ext. headers infrastructure Tom Herbert
                   ` (2 preceding siblings ...)
  2019-01-23  5:31 ` [PATCH net-next 3/5] ip6tlvs: Add netlink interface Tom Herbert
@ 2019-01-23  5:31 ` Tom Herbert
  2019-01-23  5:31 ` [PATCH net-next 5/5] ip6tlvs: API to set and remove individual TLVs from DO or HBH EH Tom Herbert
  4 siblings, 0 replies; 7+ messages in thread
From: Tom Herbert @ 2019-01-23  5:31 UTC (permalink / raw)
  To: davem, netdev; +Cc: Tom Herbert

Validate Destination and Hop-by-Hop options. This uses the information
in the TLV parameters table to validate various aspects of both
individual TLVs and the list of TLVs in an EH.

There are two levels of validation that can be performed: simple checks
and deep checks. Simple checks validate only the most basic properties,
such as that the TLV list fits into the EH. Deep checks do a fine
grained validation.

With proper permissions in the TLV table, this patch allows
non-privileged users to send TLVs. Given that TLVs are open ended and
potentially a source of DoS attacks, deep checks are performed to
limit the format that a user can send. If deep checks are enabled,
a canonical format for sending TLVs is enforced (in adherence with
the robustness principle). A TLV must be well ordered with respect
to the preferred order for the TLV. Each TLV must be aligned as
described in the parameter table. Minimal padding (at most one padding
TLV) is used to align TLVs. The length of the extension header as well
as the count of non-padding TLVs is checked against max_*_opts_len and
max_*_opts_cnt. For individual TLVs, length and length alignment are
checked.
---
 include/net/ipv6.h         |   7 ++
 net/ipv6/datagram.c        |  27 ++--
 net/ipv6/exthdrs_options.c | 298 +++++++++++++++++++++++++++++++++++++++++++++
 net/ipv6/ipv6_sockglue.c   |  30 ++++-
 4 files changed, 348 insertions(+), 14 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 3d3b5a1..0bd659b 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -379,6 +379,13 @@ struct ipv6_txoptions *ipv6_renew_options(struct sock *sk,
 struct ipv6_txoptions *ipv6_fixup_options(struct ipv6_txoptions *opt_space,
 					  struct ipv6_txoptions *opt);
 
+int ipv6_opt_validate_tlvs(struct net *net, struct ipv6_opt_hdr *opt,
+			   unsigned int optname, bool admin);
+int ipv6_opt_validate_single_tlv(struct net *net, unsigned int optname,
+				 unsigned char *tlv, size_t len,
+				 bool deleting, bool admin);
+int ipv6_opt_check_perm(struct net *net, struct sock *sk,
+			int optname, bool admin);
 struct tlv_tx_param {
 	unsigned char preferred_order;
 	unsigned char admin_perm : 2;
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index ee4a4e5..ef50439 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -853,10 +853,13 @@ int ip6_datagram_send_ctl(struct net *net, struct sock *sk,
 				err = -EINVAL;
 				goto exit_f;
 			}
-			if (!ns_capable(net->user_ns, CAP_NET_RAW)) {
-				err = -EPERM;
+
+			err = ipv6_opt_validate_tlvs(net, hdr, IPV6_HOPOPTS,
+						     ns_capable(net->user_ns,
+								CAP_NET_RAW));
+			if (err < 0)
 				goto exit_f;
-			}
+
 			opt->opt_nflen += len;
 			opt->hopopt = hdr;
 			break;
@@ -873,10 +876,13 @@ int ip6_datagram_send_ctl(struct net *net, struct sock *sk,
 				err = -EINVAL;
 				goto exit_f;
 			}
-			if (!ns_capable(net->user_ns, CAP_NET_RAW)) {
-				err = -EPERM;
+
+			err = ipv6_opt_validate_tlvs(net, hdr, IPV6_DSTOPTS,
+						     ns_capable(net->user_ns,
+								CAP_NET_RAW));
+			if (err < 0)
 				goto exit_f;
-			}
+
 			if (opt->dst1opt) {
 				err = -EINVAL;
 				goto exit_f;
@@ -898,10 +904,13 @@ int ip6_datagram_send_ctl(struct net *net, struct sock *sk,
 				err = -EINVAL;
 				goto exit_f;
 			}
-			if (!ns_capable(net->user_ns, CAP_NET_RAW)) {
-				err = -EPERM;
+
+			err = ipv6_opt_validate_tlvs(net, hdr, cmsg->cmsg_type,
+						     ns_capable(net->user_ns,
+								CAP_NET_RAW));
+			if (err < 0)
 				goto exit_f;
-			}
+
 			if (cmsg->cmsg_type == IPV6_DSTOPTS) {
 				opt->opt_flen += len;
 				opt->dst1opt = hdr;
diff --git a/net/ipv6/exthdrs_options.c b/net/ipv6/exthdrs_options.c
index da9e257..401f8ff 100644
--- a/net/ipv6/exthdrs_options.c
+++ b/net/ipv6/exthdrs_options.c
@@ -162,6 +162,304 @@ struct ipv6_txoptions *ipv6_fixup_options(struct ipv6_txoptions *opt_space,
 }
 EXPORT_SYMBOL_GPL(ipv6_fixup_options);
 
+/* TLV validation functions */
+
+/* Validate a single non-padding TLV */
+static int __ipv6_opt_validate_single_tlv(struct net *net, unsigned char *tlv,
+					  struct tlv_tx_param *tptx,
+					  unsigned int class, bool *deep_check,
+					  bool deleting, bool admin)
+{
+	if (tlv[0] < 2) /* Must be non-padding */
+		return -EINVAL;
+
+	/* Check permissions */
+	switch (admin ? tptx->admin_perm : tptx->user_perm) {
+	case IPV6_TLV_PERM_NO_CHECK:
+		/* Allowed with no deep checks */
+		*deep_check = false;
+		return 0;
+	case IPV6_TLV_PERM_WITH_CHECK:
+		/* Allowed with deep checks */
+		*deep_check = true;
+		break;
+	default:
+		/* No permission */
+		return -EPERM;
+	}
+
+	/* Perform deep checks on the TLV */
+
+	/* Check class */
+	if ((tptx->class & class) != class)
+		return -EINVAL;
+
+	/* Don't bother checking lengths when deleting, the TLV is only
+	 * needed here for lookup
+	 */
+	if (deleting) {
+		/* Don't bother with deep checks when deleting */
+		*deep_check = false;
+	} else {
+		/* Check length */
+		if (tlv[1] < tptx->min_data_len || tlv[1] > tptx->max_data_len)
+			return -EINVAL;
+
+		/* Check length alignment */
+		if ((tlv[1] % (tptx->data_len_mult + 1)) != tptx->data_len_off)
+			return -EINVAL;
+	}
+
+	return 0;
+}
+
+static unsigned int optname_to_tlv_class(int optname)
+{
+	switch (optname) {
+	case IPV6_HOPOPTS:
+		return IPV6_TLV_CLASS_FLAG_HOPOPT;
+	case IPV6_RTHDRDSTOPTS:
+		return IPV6_TLV_CLASS_FLAG_RTRDSTOPT;
+	case IPV6_DSTOPTS:
+		return IPV6_TLV_CLASS_FLAG_DSTOPT;
+	default:
+		return -1U;
+	}
+}
+
+static int __ipv6_opt_validate_tlvs(struct net *net, struct ipv6_opt_hdr *opt,
+				    unsigned int optname, bool deleting,
+				    bool admin)
+{
+	unsigned int max_len = 0, max_cnt = 0, cnt = 0;
+	unsigned char *tlv = (unsigned char *)opt;
+	bool deep_check, did_deep_check = false;
+	unsigned int opt_len, tlv_len, offset;
+	unsigned int padding = 0, numpad = 0;
+	unsigned char prev_tlv_order = 0;
+	struct tlv_tx_param *tptx;
+	int retc, ret = -EINVAL;
+	unsigned int class;
+
+	opt_len = ipv6_optlen(opt);
+	offset = sizeof(*opt);
+
+	class = optname_to_tlv_class(optname);
+
+	switch (optname) {
+	case IPV6_HOPOPTS:
+		max_len = net->ipv6.sysctl.max_hbh_opts_len;
+		max_cnt = net->ipv6.sysctl.max_hbh_opts_cnt;
+		break;
+	case IPV6_RTHDRDSTOPTS:
+	case IPV6_DSTOPTS:
+		max_len = net->ipv6.sysctl.max_dst_opts_len;
+		max_cnt = net->ipv6.sysctl.max_dst_opts_cnt;
+		break;
+	}
+
+	rcu_read_lock();
+
+	while (offset < opt_len) {
+		switch (tlv[offset]) {
+		case IPV6_TLV_PAD1:
+			tlv_len = 1;
+			padding++;
+			numpad++;
+			break;
+		case IPV6_TLV_PADN:
+			if (offset + 1 >= opt_len)
+				goto out;
+
+			tlv_len = tlv[offset + 1] + 2;
+
+			if (offset + tlv_len > opt_len)
+				goto out;
+
+			padding += tlv_len;
+			numpad++;
+			break;
+		default:
+			if (offset + 1 >= opt_len)
+				goto out;
+
+			tlv_len = tlv[offset + 1] + 2;
+
+			if (offset + tlv_len > opt_len)
+				goto out;
+
+			tptx = tlv_deref_tx_params(tlv[offset]);
+			retc = __ipv6_opt_validate_single_tlv(net, &tlv[offset],
+							      tptx, class,
+							      &deep_check,
+							      deleting, admin);
+			if (retc < 0) {
+				ret = retc;
+				goto out;
+			}
+
+			if (deep_check) {
+				/* Check for too many options */
+				if (++cnt > max_cnt) {
+					ret = -E2BIG;
+					goto out;
+				}
+
+				/* Check order */
+				if (tptx->preferred_order < prev_tlv_order)
+					goto out;
+
+				/* Check alignment */
+				if ((offset % (tptx->align_mult + 1)) !=
+				    tptx->align_off)
+					goto out;
+
+				/* Check for right amount of padding */
+				if (numpad > 1 || padding > tptx->align_mult)
+					goto out;
+
+				prev_tlv_order = tptx->preferred_order;
+			}
+
+			padding = 0;
+			numpad = 0;
+			did_deep_check = true;
+		}
+		offset += tlv_len;
+	}
+
+	/* If we did at least one deep check apply length limit */
+	if (did_deep_check && opt_len > max_len) {
+		ret = -EMSGSIZE;
+		goto out;
+	}
+
+	/* All good */
+	ret = 0;
+out:
+	rcu_read_unlock();
+
+	return ret;
+}
+
+/**
+ * ipv6_opt_validate_tlvs - Validate TLVs.
+ * @net: Current net
+ * @opt: The option header
+ * @optname: IPV6_HOPOPTS, IPV6_RTHDRDSTOPTS, or IPV6_DSTOPTS
+ * @admin: Set for privileged user
+ *
+ * Description:
+ * Walks the TLVs in a list to verify that the TLV lengths and other
+ * parameters are in bounds for a Destination or Hop-by-Hop option.
+ * Returns a negative errno if there is a problem, zero otherwise.
+ */
+int ipv6_opt_validate_tlvs(struct net *net, struct ipv6_opt_hdr *opt,
+			   unsigned int optname, bool admin)
+{
+	return __ipv6_opt_validate_tlvs(net, opt, optname, false, admin);
+}
+EXPORT_SYMBOL(ipv6_opt_validate_tlvs);
+
+/**
+ * ipv6_opt_validate_single_tlv - Check that a single TLV is valid.
+ * @net: Current net
+ * @optname: IPV6_HOPOPTS, IPV6_RTHDRDSTOPTS, or IPV6_DSTOPTS
+ * @tlv: The TLV as array of bytes
+ * @len: Length of buffer holding TLV
+ *
+ * Description:
+ * Validates a single TLV. The TLV must be non-padding type. The length
+ * of the TLV (as determined by the second byte that gives length of the
+ * option data) must match @len.
+ */
+int ipv6_opt_validate_single_tlv(struct net *net, unsigned int optname,
+				 unsigned char *tlv, size_t len,
+				 bool deleting, bool admin)
+{
+	struct tlv_tx_param *tptx;
+	unsigned int class;
+	bool deep_check;
+	int ret = 0;
+
+	class = optname_to_tlv_class(optname);
+
+	switch (tlv[0]) {
+	case IPV6_TLV_PAD1:
+	case IPV6_TLV_PADN:
+		return -EINVAL;
+	default:
+		break;
+	}
+
+	if (len < 2)
+		return -EINVAL;
+
+	if (tlv[1] + 2 != len)
+		return -EINVAL;
+
+	rcu_read_lock();
+
+	tptx = tlv_deref_tx_params(tlv[0]);
+
+	ret = __ipv6_opt_validate_single_tlv(net, tlv, tptx, class,
+					     &deep_check, deleting, admin);
+
+	rcu_read_unlock();
+
+	return ret;
+}
+EXPORT_SYMBOL(ipv6_opt_validate_single_tlv);
+
+/**
+ * ipv6_opt_check_perm - Check if current capabilities allow modifying
+ * txopts.
+ * @net: Current net
+ * @sk: the socket
+ * @optname: IPV6_HOPOPTS, IPV6_RTHDRDSTOPTS, or IPV6_DSTOPTS
+ * @admin: Set for privileged user
+ *
+ * Description:
+ *
+ * Checks whether the permissions of the TLVs that are set on a socket
+ * permit modification.
+ *
+ */
+int ipv6_opt_check_perm(struct net *net, struct sock *sk, int optname,
+			bool admin)
+{
+	struct ipv6_txoptions *old = txopt_get(inet6_sk(sk));
+	struct ipv6_opt_hdr *opt;
+	int retv = -EPERM;
+
+	if (!old)
+		return 0;
+
+	switch (optname) {
+	case IPV6_HOPOPTS:
+		opt = old->hopopt;
+		break;
+	case IPV6_RTHDRDSTOPTS:
+		opt = old->dst0opt;
+		break;
+	case IPV6_DSTOPTS:
+		opt = old->dst1opt;
+		break;
+	default:
+		goto out;
+	}
+
+	/* Just call the validate function on the options as being
+	 * deleted.
+	 */
+	retv = __ipv6_opt_validate_tlvs(net, opt, optname, true, admin);
+
+out:
+	txopt_put(old);
+	return retv;
+}
+EXPORT_SYMBOL(ipv6_opt_check_perm);
+
 /* Destination options header */
 
 #if IS_ENABLED(CONFIG_IPV6_MIP6)
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 973e215..009c8a4 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -400,11 +400,6 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		struct ipv6_txoptions *opt;
 		struct ipv6_opt_hdr *new = NULL;
 
-		/* hop-by-hop / destination options are privileged option */
-		retv = -EPERM;
-		if (optname != IPV6_RTHDR && !ns_capable(net->user_ns, CAP_NET_RAW))
-			break;
-
 		/* remove any sticky options header with a zero option
 		 * length, per RFC3542.
 		 */
@@ -427,6 +422,31 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 			}
 		}
 
+		if (optname != IPV6_RTHDR) {
+			bool cap = ns_capable(net->user_ns, CAP_NET_RAW);
+
+			/* First check if we have permission to delete
+			 * the existing options on the socket.
+			 */
+			retv = ipv6_opt_check_perm(net, sk, optname, cap);
+			if (retv < 0) {
+				kfree(new);
+				break;
+			}
+
+			/* Check permissions and other validations on new
+			 * TLVs
+			 */
+			if (new) {
+				retv = ipv6_opt_validate_tlvs(net, new,
+							      optname, cap);
+				if (retv < 0) {
+					kfree(new);
+					break;
+				}
+			}
+		}
+
 		opt = rcu_dereference_protected(np->opt,
 						lockdep_sock_is_held(sk));
 		opt = ipv6_renew_options(sk, opt, optname, new);
-- 
2.7.4



* [PATCH net-next 5/5] ip6tlvs: API to set and remove individual TLVs from DO or HBH EH
  2019-01-23  5:31 [PATCH net-next 0/5] ipv6: Rework ext. headers infrastructure Tom Herbert
                   ` (3 preceding siblings ...)
  2019-01-23  5:31 ` [PATCH net-next 4/5] ip6tlvs: Validation of TX Destination and Hop-by-Hop options Tom Herbert
@ 2019-01-23  5:31 ` Tom Herbert
  4 siblings, 0 replies; 7+ messages in thread
From: Tom Herbert @ 2019-01-23  5:31 UTC (permalink / raw)
  To: davem, netdev; +Cc: Tom Herbert

Add functions and socket options that allow setting and removing
individual TLVs in the Hop-by-Hop, Destination, or Routing Header
Destination options that are set in the txoptions of a socket. When an
individual TLV option is set, it is merged into the existing options
at the position in the list given by the preferred order attribute in
the TLV parameter table.
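
As a rough illustration of the ordering rule described above, the
following user-space sketch picks the insertion point for a new TLV by
comparing preferred-order values. It is a simplification and not the
kernel code: `order_of` and `find_insert_offset` are hypothetical names
standing in for the TLV parameter table and the lookup helper, the
input is assumed to be a well-formed options header, and the handling
of padding around the insertion point is left out.

```c
#include <stddef.h>

#define TLV_PAD1 0	/* Pad1: a single zero byte with no length field */
#define TLV_PADN 1	/* PadN: type, length, then `length` zero bytes */

/* Hypothetical stand-in for the TLV parameter table: preferred order
 * indexed by TLV type. Lower values sort earlier in the list.
 */
static unsigned char order_of[256];

/* Walk the TLVs that follow the two-byte extension header. Returns the
 * offset of an existing TLV of @type (with *found set to 1), or the
 * offset of the first TLV that should come after @type, or @len when
 * the new TLV belongs at the end of the list.
 */
static size_t find_insert_offset(const unsigned char *opts, size_t len,
				 unsigned char type, int *found)
{
	size_t off = 2;

	*found = 0;
	while (off < len) {
		unsigned char t = opts[off];

		if (t == TLV_PAD1) {
			off++;
			continue;
		}
		if (off + 1 >= len)
			break;	/* truncated TLV, stop walking */
		if (t != TLV_PADN) {
			if (t == type) {
				*found = 1;
				return off;
			}
			if (order_of[type] < order_of[t])
				return off;
		}
		off += opts[off + 1] + 2;
	}
	return len;
}
```

With orders 10, 20 and 30 assigned to three TLV types, a header that
already holds the first and third type would place the second type
between them, regardless of the order in which the options were set.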

This code is based in part on the TLV option handling in calipso.c.

Signed-off-by: Tom Herbert <tom@quantonium.net>
---
 include/net/ipv6.h         |  13 ++
 include/uapi/linux/in6.h   |   9 +
 net/ipv6/exthdrs_options.c | 516 +++++++++++++++++++++++++++++++++++++++++++++
 net/ipv6/ipv6_sockglue.c   |  80 +++++++
 4 files changed, 618 insertions(+)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 0bd659b..fb337aa 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -378,6 +378,8 @@ struct ipv6_txoptions *ipv6_renew_options(struct sock *sk,
 					  struct ipv6_opt_hdr *newopt);
 struct ipv6_txoptions *ipv6_fixup_options(struct ipv6_txoptions *opt_space,
 					  struct ipv6_txoptions *opt);
+int ipv6_opt_update(struct sock *sk, struct ipv6_txoptions *opt,
+		    int which, struct ipv6_opt_hdr *new);
 
 int ipv6_opt_validate_tlvs(struct net *net, struct ipv6_opt_hdr *opt,
 			   unsigned int optname, bool admin);
@@ -386,6 +388,17 @@ int ipv6_opt_validate_single_tlv(struct net *net, unsigned int optname,
 				 bool deleting, bool admin);
 int ipv6_opt_check_perm(struct net *net, struct sock *sk,
 			int optname, bool admin);
+
+int ipv6_opt_tlv_find(struct ipv6_opt_hdr *opt, unsigned char *targ_tlv,
+		      unsigned int *start, unsigned int *end);
+struct ipv6_opt_hdr *ipv6_opt_tlv_insert(struct net *net,
+					 struct ipv6_opt_hdr *opt,
+					 int optname, unsigned char *tlv,
+					 bool admin);
+struct ipv6_opt_hdr *ipv6_opt_tlv_delete(struct net *net,
+					 struct ipv6_opt_hdr *opt,
+					 unsigned char *tlv, bool admin);
+
 struct tlv_tx_param {
 	unsigned char preferred_order;
 	unsigned char admin_perm : 2;
diff --git a/include/uapi/linux/in6.h b/include/uapi/linux/in6.h
index a54cf96..f6edf31 100644
--- a/include/uapi/linux/in6.h
+++ b/include/uapi/linux/in6.h
@@ -288,6 +288,15 @@ struct in6_flowlabel_req {
 #define IPV6_RECVFRAGSIZE	77
 #define IPV6_FREEBIND		78
 
+/* API to set single Destination or Hop-by-Hop options */
+
+#define IPV6_HOPOPTS_TLV		79
+#define IPV6_RTHDRDSTOPTS_TLV		80
+#define IPV6_DSTOPTS_TLV		81
+#define IPV6_HOPOPTS_DEL_TLV		82
+#define IPV6_RTHDRDSTOPTS_DEL_TLV	83
+#define IPV6_DSTOPTS_DEL_TLV		84
+
 /*
  * Multicast Routing:
  * see include/uapi/linux/mroute6.h.
diff --git a/net/ipv6/exthdrs_options.c b/net/ipv6/exthdrs_options.c
index 401f8ff..00fdefd 100644
--- a/net/ipv6/exthdrs_options.c
+++ b/net/ipv6/exthdrs_options.c
@@ -162,6 +162,35 @@ struct ipv6_txoptions *ipv6_fixup_options(struct ipv6_txoptions *opt_space,
 }
 EXPORT_SYMBOL_GPL(ipv6_fixup_options);
 
+/**
+ * ipv6_opt_update - Replaces socket's options with a new set
+ * @sk: the socket
+ * @opt: TX options from socket
+ * @which: which set of options
+ * @new: new extension header for the options
+ *
+ * Description:
+ * Replaces @sk's options with @new for type @which.  @new may be NULL to
+ * leave the socket with no options for the given type.
+ *
+ */
+int ipv6_opt_update(struct sock *sk, struct ipv6_txoptions *opt,
+		    int which, struct ipv6_opt_hdr *new)
+{
+	opt = ipv6_renew_options(sk, opt, which, new);
+	if (IS_ERR(opt))
+		return PTR_ERR(opt);
+
+	opt = ipv6_update_options(sk, opt);
+	if (opt) {
+		atomic_sub(opt->tot_len, &sk->sk_omem_alloc);
+		txopt_put(opt);
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(ipv6_opt_update);
+
 /* TLV validation functions */
 
 /* Validate a single non-padding TLV */
@@ -460,6 +489,493 @@ int ipv6_opt_check_perm(struct net *net, struct sock *sk, int optname,
 }
 EXPORT_SYMBOL(ipv6_opt_check_perm);
 
+/* Functions to manage individual TLVs */
+
+/**
+ * __ipv6_opt_tlv_find - Finds a particular TLV in an IPv6 options header
+ * (destination or hop-by-hop options). If TLV is not present, then the
+ * preferred insertion point is determined.
+ * @opt: the options header (an EH header followed by data)
+ * @targ_tlv: Prototype of TLV to find
+ * @start: on return holds the offset of any leading padding if option
+ *       is present, or offset at which option is inserted.
+ * @end: on return holds the offset of the first non-pad TLV after option
+ *       if the option was found, else points to the first TLV after
+ *       padding at insertion point.
+ *
+ * Description:
+ * Finds the space occupied by particular option (including any leading and
+ * trailing padding), or the preferred position for insertion if the
+ * TLV is not present.
+ *
+ * If the option is found then @start and @end are set to the offsets within
+ * @opt of the start of padding before the first found option and the end of
+ * padding after the first found option. In this case the function returns
+ * the offset in @opt of the found option (a value >= 2 since the TLV
+ * must be after the option header).
+ *
+ * In the absence of the searched option, @start is set to offset in @opt at
+ * which the option may be inserted per the ordering and alignment rules
+ * in the TLV parameter table, and @end is set to the end + 1 of any
+ * padding at the @start offset. When the option is not found -ENOENT is
+ * returned.
+ *
+ * rcu_read_lock assumed held.
+ */
+static int __ipv6_opt_tlv_find(struct ipv6_opt_hdr *opt,
+			       unsigned char *targ_tlv,
+			       unsigned int *start, unsigned int *end)
+{
+	unsigned int offset_s = 0, offset_e = 0, last_s = 0;
+	unsigned char *tlv = (unsigned char *)opt;
+	unsigned int pad_e = sizeof(*opt);
+	int ret_val = -ENOENT, tlv_len;
+	unsigned int opt_len, offset;
+	struct tlv_tx_param *tptx;
+	unsigned int targ_order;
+	bool found_cand = false;
+
+	opt_len = ipv6_optlen(opt);
+	offset = sizeof(*opt);
+
+	tptx = tlv_deref_tx_params(targ_tlv[0]);
+
+	targ_order = tptx->preferred_order;
+
+	while (offset < opt_len) {
+		switch (tlv[offset]) {
+		case IPV6_TLV_PAD1:
+			if (offset_e)
+				offset_e = offset;
+			tlv_len = 1;
+			break;
+		case IPV6_TLV_PADN:
+			if (offset_e)
+				offset_e = offset;
+			tlv_len = tlv[offset + 1] + 2;
+			break;
+		default:
+			if (ret_val >= 0)
+				goto out;
+
+			/* Not found yet */
+
+			if (tlv[offset] == targ_tlv[0]) {
+				/* Found it */
+
+				ret_val = offset;
+				offset_e = offset;
+				offset_s = last_s;
+				found_cand = true;
+			} else {
+				struct tlv_tx_param *tptx1;
+
+				tptx1 = tlv_deref_tx_params(tlv[offset]);
+
+				if (targ_order < tptx1->preferred_order &&
+				    !found_cand) {
+					/* Found candidate for insert location
+					 */
+
+					pad_e = offset;
+					offset_s = last_s;
+					found_cand = true;
+				}
+			}
+
+			last_s = offset;
+			tlv_len = tlv[offset + 1] + 2;
+			break;
+		}
+
+		offset += tlv_len;
+	}
+
+	if (!found_cand) {
+		/* Not found and insert point is after all options */
+		offset_s = last_s;
+		pad_e = opt_len;
+	}
+
+out:
+	if (offset_s)
+		*start = offset_s +
+		    (tlv[offset_s] ? tlv[offset_s + 1] + 2 : 1);
+	else
+		*start = sizeof(*opt);
+
+	if (ret_val >= 0)
+		*end = offset_e +
+		    (tlv[offset_e] ? tlv[offset_e + 1] + 2 : 1);
+	else
+		*end = pad_e;
+
+	return ret_val;
+}
+
+int ipv6_opt_tlv_find(struct ipv6_opt_hdr *opt, unsigned char *targ_tlv,
+		      unsigned int *start, unsigned int *end)
+{
+	int ret;
+
+	rcu_read_lock();
+	ret = __ipv6_opt_tlv_find(opt, targ_tlv, start, end);
+	rcu_read_unlock();
+
+	return ret;
+}
+EXPORT_SYMBOL(ipv6_opt_tlv_find);
+
+/**
+ * ipv6_opt_tlv_pad_write - Writes pad bytes in TLV format
+ * @buf: the buffer
+ * @offset: offset from start of buffer to write padding
+ * @count: number of pad bytes to write
+ *
+ * Description:
+ * Write @count bytes of TLV padding into @buf starting at offset @offset.
+ * @count should be less than 8 - see RFC 4942.
+ *
+ */
+static int ipv6_opt_tlv_pad_write(unsigned char *buf, unsigned int offset,
+				  unsigned int count)
+{
+	if (WARN_ON_ONCE(count >= 8))
+		return -EINVAL;
+
+	switch (count) {
+	case 0:
+		break;
+	case 1:
+		buf[offset] = IPV6_TLV_PAD1;
+		break;
+	default:
+		buf[offset] = IPV6_TLV_PADN;
+		buf[offset + 1] = count - 2;
+		if (count > 2)
+			memset(buf + offset + 2, 0, count - 2);
+		break;
+	}
+	return 0;
+}
+
+static unsigned int compute_padding(unsigned int offset, unsigned int mult,
+				    unsigned int moff)
+{
+	return (mult - ((offset - moff) % mult)) % mult;
+}
+
+static int tlv_find_next(unsigned char *tlv, unsigned int offset,
+			 unsigned int optlen)
+{
+	while (offset < optlen) {
+		switch (tlv[offset]) {
+		case IPV6_TLV_PAD1:
+			offset++;
+			break;
+		case IPV6_TLV_PADN:
+			offset += tlv[offset + 1] + 2;
+			break;
+		default:
+			return offset;
+		}
+	}
+
+	return (optlen);
+}
+
+/* __tlv_sum_alignment assumes rcu_read_lock is held */
+static size_t __tlv_sum_alignment(unsigned char *tlv, unsigned int offset,
+				  unsigned int optlen)
+{
+	int sum = 0;
+
+	offset = tlv_find_next(tlv, offset, optlen);
+
+	while (offset < optlen) {
+		struct tlv_tx_param *tptx;
+
+		tptx = tlv_deref_tx_params(tlv[offset]);
+		sum += tptx->align_mult;
+		offset += tlv[offset + 1] + 2;
+		offset = tlv_find_next(tlv, offset, optlen);
+	}
+
+	return sum;
+}
+
+/* __copy_and_align_tlvs assumes rcu_read_lock is held */
+static int __copy_and_align_tlvs(unsigned int src_off, unsigned char *src,
+				 unsigned int dst_off, unsigned char *dst,
+				 unsigned int optlen)
+{
+	unsigned int padding, len;
+	struct tlv_tx_param *tptx;
+
+	if (!src)
+		return dst_off;
+
+	src_off = tlv_find_next(src, src_off, optlen);
+
+	while (src_off < optlen) {
+		tptx = tlv_deref_tx_params(src[src_off]);
+
+		padding = compute_padding(dst_off, tptx->align_mult + 1,
+					  tptx->align_off);
+		ipv6_opt_tlv_pad_write(dst, dst_off, padding);
+		dst_off += padding;
+
+		len = src[src_off + 1] + 2;
+		memcpy(&dst[dst_off], &src[src_off], len);
+
+		src_off += len;
+		dst_off += len;
+		src_off = tlv_find_next(src, src_off, optlen);
+	}
+
+	return dst_off;
+}
+
+static int count_tlvs(struct ipv6_opt_hdr *opt)
+{
+	unsigned char *tlv = (unsigned char *)opt;
+	unsigned int opt_len, tlv_len, offset, cnt = 0;
+
+	opt_len = ipv6_optlen(opt);
+	offset = sizeof(*opt);
+
+	while (offset < opt_len) {
+		switch (tlv[offset]) {
+		case IPV6_TLV_PAD1:
+			tlv_len = 1;
+			break;
+		case IPV6_TLV_PADN:
+			tlv_len = tlv[offset + 1] + 2;
+			break;
+		default:
+			cnt++;
+			tlv_len = tlv[offset + 1] + 2;
+			break;
+		}
+		offset += tlv_len;
+	}
+
+	return cnt;
+}
+
+#define IPV6_OPT_MAX_END_PAD 7
+
+/**
+ * ipv6_opt_tlv_insert - Inserts a TLV into an IPv6 destination options
+ * or Hop-by-Hop options extension header.
+ *
+ * @net: Current net
+ * @opt: the original options extensions header
+ * @optname: IPV6_HOPOPTS, IPV6_RTHDRDSTOPTS, or IPV6_DSTOPTS
+ * @tlv: the new TLV being inserted
+ * @admin: Set for privileged user
+ *
+ * Description:
+ * Creates a new options header based on @opt with the option specified
+ * in @tlv added to it.  If @opt already contains the same type
+ * of TLV, then the TLV is overwritten, otherwise the new TLV is inserted
+ * at its preferred-order position.  If @opt is NULL then the new header
+ * will contain just the new option and any needed padding.
+ *
+ * Assumes option has been validated.
+ */
+struct ipv6_opt_hdr *ipv6_opt_tlv_insert(struct net *net,
+					 struct ipv6_opt_hdr *opt,
+					 int optname, unsigned char *tlv,
+					 bool admin)
+{
+	unsigned int start = 0, end = 0, buf_len, pad, optlen,  max_align;
+	size_t tlv_len = tlv[1] + 2;
+	struct tlv_tx_param *tptx;
+	struct ipv6_opt_hdr *new;
+	int ret_val;
+	u8 perm;
+
+	rcu_read_lock();
+
+	if (opt) {
+		optlen = ipv6_optlen(opt);
+		ret_val = __ipv6_opt_tlv_find(opt, tlv, &start, &end);
+		if (ret_val < 0) {
+			if (ret_val != -ENOENT) {
+				rcu_read_unlock();
+				return ERR_PTR(ret_val);
+			}
+		} else if (((unsigned char *)opt)[ret_val + 1] == tlv[1]) {
+			unsigned int roff = ret_val + tlv[1] + 2;
+
+			/* Replace existing TLV with one of the same length,
+			 * we can fast path this.
+			 */
+
+			rcu_read_unlock();
+
+			new = kmalloc(optlen, GFP_ATOMIC);
+			if (!new)
+				return ERR_PTR(-ENOMEM);
+
+			memcpy((unsigned char *)new,
+			       (unsigned char *)opt, ret_val);
+			memcpy((unsigned char *)new + ret_val, tlv, tlv[1] + 2);
+			memcpy((unsigned char *)new + roff,
+			       (unsigned char *)opt + roff, optlen - roff);
+
+			return new;
+		}
+	} else {
+		optlen = 0;
+		start = sizeof(*opt);
+		end = 0;
+	}
+
+	tptx = tlv_deref_tx_params(tlv[0]);
+
+	/* Maximum buffer size we'll need including possible padding */
+	max_align = __tlv_sum_alignment((unsigned char *)opt, end, optlen) +
+	    tptx->align_mult + IPV6_OPT_MAX_END_PAD;
+
+	buf_len = optlen + start - end + tlv_len + max_align;
+	new = kmalloc(buf_len, GFP_ATOMIC);
+	if (!new) {
+		rcu_read_unlock();
+		return ERR_PTR(-ENOMEM);
+	}
+
+	buf_len = start;
+
+	if (start > sizeof(*opt))
+		memcpy(new, opt, start);
+
+	pad = compute_padding(start, tptx->align_mult + 1, tptx->align_off);
+	ipv6_opt_tlv_pad_write((__u8 *)new, start, pad);
+	buf_len += pad;
+
+	memcpy((__u8 *)new + buf_len, tlv, tlv_len);
+	buf_len += tlv_len;
+
+	buf_len = __copy_and_align_tlvs(end, (__u8 *)opt, buf_len,
+					(__u8 *)new, optlen);
+
+	perm = admin ? tptx->admin_perm : tptx->user_perm;
+
+	rcu_read_unlock();
+
+	/* Trailer pad to 8 byte alignment */
+	pad = (8 - (buf_len & 7)) & 7;
+	ipv6_opt_tlv_pad_write((__u8 *)new, buf_len, pad);
+	buf_len += pad;
+
+	/* Set header */
+	new->nexthdr = 0;
+	new->hdrlen = buf_len / 8 - 1;
+
+	if (perm != IPV6_TLV_PERM_NO_CHECK) {
+		switch (optname) {
+		case IPV6_HOPOPTS:
+			if (buf_len > net->ipv6.sysctl.max_hbh_opts_len)
+				return ERR_PTR(-EMSGSIZE);
+			if (count_tlvs(new) > net->ipv6.sysctl.max_hbh_opts_cnt)
+				return ERR_PTR(-E2BIG);
+			break;
+		case IPV6_RTHDRDSTOPTS:
+		case IPV6_DSTOPTS:
+			if (buf_len > net->ipv6.sysctl.max_dst_opts_len)
+				return ERR_PTR(-EMSGSIZE);
+			if (count_tlvs(new) > net->ipv6.sysctl.max_dst_opts_cnt)
+				return ERR_PTR(-E2BIG);
+			break;
+		}
+	}
+
+	return new;
+}
+EXPORT_SYMBOL(ipv6_opt_tlv_insert);
+
+/* rcu_read_lock assumed held */
+struct ipv6_opt_hdr *__ipv6_opt_tlv_delete(struct ipv6_opt_hdr *opt,
+					   unsigned int start,
+					   unsigned int end)
+{
+	unsigned int pad, optlen, buf_len;
+	struct ipv6_opt_hdr *new;
+	size_t max_align;
+
+	optlen = ipv6_optlen(opt);
+	if (start == sizeof(*opt) && end == optlen) {
+		/* There's no other option in the header so return NULL */
+		return NULL;
+	}
+
+	max_align = __tlv_sum_alignment((unsigned char *)opt, end, optlen) +
+	    IPV6_OPT_MAX_END_PAD;
+
+	new = kmalloc(optlen - (end - start) + max_align, GFP_ATOMIC);
+	if (!new)
+		return ERR_PTR(-ENOMEM); /* DIFF */
+
+	memcpy(new, opt, start);
+
+	buf_len = __copy_and_align_tlvs(end, (__u8 *)opt, start,
+					(__u8 *)new, optlen);
+
+	/* Now set trailer padding, buf_len is at the end of the last TLV at
+	 * this point
+	 */
+	pad = (8 - (buf_len & 7)) & 7;
+	ipv6_opt_tlv_pad_write((__u8 *)new, buf_len, pad);
+	buf_len += pad;
+
+	/* Set new header length */
+	new->hdrlen = buf_len / 8 - 1;
+
+	return new;
+}
+
+/**
+ * ipv6_opt_tlv_delete - Removes the specified option from the destination
+ * or Hop-by-Hop extension header.
+ * @net: Current net
+ * @opt: The original header
+ * @tlv: Prototype of TLV being removed
+ * @admin: Set for privileged user
+ *
+ * Description:
+ * Creates a new header based on @opt without the specified option in
+ * @tlv. A new options header is returned without the option. If @opt
+ * doesn't contain the specified option ERR_PTR(-ENOENT) is returned.
+ * If @opt contains no other non-padding options, NULL is returned.
+ * Otherwise, a new header is created and returned without the option
+ * (and removing as much padding as possible).
+ */
+struct ipv6_opt_hdr *ipv6_opt_tlv_delete(struct net *net,
+					 struct ipv6_opt_hdr *opt,
+					 unsigned char *tlv, bool admin)
+{
+	struct ipv6_opt_hdr *retopt;
+	unsigned int start, end;
+	int ret_val;
+
+	rcu_read_lock();
+
+	ret_val = __ipv6_opt_tlv_find(opt, tlv, &start, &end);
+	if (ret_val < 0) {
+		rcu_read_unlock();
+		return ERR_PTR(ret_val);
+	}
+
+	retopt = __ipv6_opt_tlv_delete(opt, start, end);
+
+	rcu_read_unlock();
+
+	return retopt;
+}
+EXPORT_SYMBOL(ipv6_opt_tlv_delete);
+
 /* Destination options header */
 
 #if IS_ENABLED(CONFIG_IPV6_MIP6)
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 009c8a4..affa46c 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -493,6 +493,86 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		break;
 	}
 
+	case IPV6_HOPOPTS_TLV:
+	case IPV6_RTHDRDSTOPTS_TLV:
+	case IPV6_DSTOPTS_TLV:
+	case IPV6_HOPOPTS_DEL_TLV:
+	case IPV6_RTHDRDSTOPTS_DEL_TLV:
+	case IPV6_DSTOPTS_DEL_TLV:
+	{
+		struct ipv6_opt_hdr *old = NULL, *new = NULL;
+		struct ipv6_txoptions *opt;
+		bool deleting = false;
+		void *new_opt = NULL;
+		int which = -1;
+		bool admin;
+
+		new_opt = memdup_user(optval, optlen);
+		if (IS_ERR(new_opt)) {
+			retv = PTR_ERR(new_opt);
+			break;
+		}
+
+		opt = rcu_dereference_protected(np->opt,
+						lockdep_sock_is_held(sk));
+
+		switch (optname) {
+		case IPV6_HOPOPTS_DEL_TLV:
+			deleting = true;
+			/* Fallthrough */
+		case IPV6_HOPOPTS_TLV:
+			if (opt)
+				old = opt->hopopt;
+			which = IPV6_HOPOPTS;
+			break;
+		case IPV6_RTHDRDSTOPTS_DEL_TLV:
+			deleting = true;
+			/* Fallthrough */
+		case IPV6_RTHDRDSTOPTS_TLV:
+			if (opt)
+				old = opt->dst0opt;
+			which = IPV6_RTHDRDSTOPTS;
+			break;
+		case IPV6_DSTOPTS_DEL_TLV:
+			deleting = true;
+			/* Fallthrough */
+		case IPV6_DSTOPTS_TLV:
+			if (opt)
+				old = opt->dst1opt;
+			which = IPV6_DSTOPTS;
+			break;
+		}
+
+		admin = ns_capable(net->user_ns, CAP_NET_RAW);
+
+		retv = ipv6_opt_validate_single_tlv(net, which, new_opt, optlen,
+						    deleting, admin);
+		if (retv < 0)
+			break;
+
+		if (deleting) {
+			if (!old)
+				break;
+			new = ipv6_opt_tlv_delete(net, old, new_opt, admin);
+		} else {
+			new = ipv6_opt_tlv_insert(net, old, which, new_opt,
+						  admin);
+		}
+
+		kfree(new_opt);
+
+		if (IS_ERR(new)) {
+			retv = PTR_ERR(new);
+			break;
+		}
+
+		retv = ipv6_opt_update(sk, opt, which, new);
+
+		kfree(new);
+
+		break;
+	}
+
 	case IPV6_PKTINFO:
 	{
 		struct in6_pktinfo pkt;
-- 
2.7.4
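
The alignment arithmetic used by compute_padding(),
ipv6_opt_tlv_pad_write() and the 8-byte trailer pad in the patch above
can be tried out in user space. The sketch below is only an
approximation under stated assumptions: the helper names `padding_for`,
`write_pad` and `trailer_pad` are hypothetical, the multiplier is passed
directly (the patch passes align_mult + 1), and the modulo is written so
that an offset smaller than the alignment offset cannot wrap around.

```c
#include <string.h>

#define TLV_PAD1 0	/* Pad1: a single zero byte with no length field */
#define TLV_PADN 1	/* PadN: type, length, then `length` zero bytes */

/* Pad bytes needed so that (offset + pad) % mult == moff % mult, i.e.
 * the "mult * n + moff" alignment requirement of an option.
 */
static unsigned int padding_for(unsigned int offset, unsigned int mult,
				unsigned int moff)
{
	unsigned int rem = (offset + mult - (moff % mult)) % mult;

	return (mult - rem) % mult;
}

/* Write @count bytes of padding at @offset: nothing for 0, Pad1 for 1,
 * PadN otherwise. Returns -1 if @count is 8 or more.
 */
static int write_pad(unsigned char *buf, unsigned int offset,
		     unsigned int count)
{
	if (count >= 8)
		return -1;

	if (count == 1) {
		buf[offset] = TLV_PAD1;
	} else if (count > 1) {
		buf[offset] = TLV_PADN;
		buf[offset + 1] = count - 2;	/* PadN data length */
		memset(buf + offset + 2, 0, count - 2);
	}
	return 0;
}

/* Pad bytes needed to round @len up to a multiple of 8, as required
 * for the total length of an extension header.
 */
static unsigned int trailer_pad(unsigned int len)
{
	return (8 - (len & 7)) & 7;
}
```

For example, an option with an 8n+2 requirement placed at offset 4
needs 6 bytes of padding, which write_pad() emits as one PadN with a
data length of 4.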



* Re: [PATCH net-next 2/5] exthdrs: Registration of TLV handlers and parameters
  2019-01-23  5:31 ` [PATCH net-next 2/5] exthdrs: Registration of TLV handlers and parameters Tom Herbert
@ 2019-01-23 19:27   ` David Miller
  0 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2019-01-23 19:27 UTC (permalink / raw)
  To: tom; +Cc: netdev, tom

From: Tom Herbert <tom@herbertland.com>
Date: Tue, 22 Jan 2019 21:31:20 -0800

> Define a table that contains 256 entries, one for each TLV. Each entry
> points to a structure that contains parameters and handler functions
> for receiving and transmitting TLVs. The receive and transmit properties
> can be managed independently.

A new 2K table of pointers, most of which are empty.

No, thank you.

