All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure
@ 2019-12-26 22:51 Tom Herbert
  2019-12-26 22:51 ` [PATCH v8 net-next 1/9] ipeh: Fix destopts counters on drop Tom Herbert
                   ` (9 more replies)
  0 siblings, 10 replies; 26+ messages in thread
From: Tom Herbert @ 2019-12-26 22:51 UTC (permalink / raw)
  To: davem, netdev, simon.horman, willemdebruijn.kernel; +Cc: Tom Herbert

The fundamental rationale here is to make various TLVs, in particular
Hop-by-Hop and Destination options, usable, robust, scalable, and
extensible to support emerging functionality.

Specifically, this patch set:

1) Allow modules to register support for Hop-by-Hop and Destination
options. This is useful for development and deployment of new options.
2) Allow non-privileged users to set Hop-by-Hop and Destination
options for their packets or connections. This is especially useful
for options like Path MTU and IOAM options where the information in
the options is both sourced and sinked by the application. The
alternative to this would be to create more side interfaces so that
the option can be enabled via the kernel-- such side interfaces would
be overkill IMO.
3) In conjunction with #2, validation of the options being set by an
application is done. The validation for non-privileged users is
purposely strict, but even in the case of privileged user validation
is useful to disallow allow application from sending gibberish (for
instance, now a TLV could be created with a length exceeding the bound
of the extension header).
4) Consolidate various TLV mechanisms. Segment routing should be able
to use the same TLV parsing function, as should UDP options when they
come into the kernel.
5) Option lookup on receive is O(1) instead of list scan.

Subsequent patch sets will include:

6) Allow setting specific (Hop-by-Hop and Destination) options on a
socket. This would also allow some options to be set by application
and some might be set by kernel.
7) Allow options processing to be done in the context of a socket.
This will be useful for FAST and PMTU options.
8) Allow experimental IPv6 options in the same way that experimental
TCP options are allowed.
9) Support a robust means of extension header insertion. Extension
header insertion is a controversial mechanism that some router vendors
are insisting upon (see ongoing discussion in 6man list). The way they
are currently doing it breaks the stack (particularly ICMP and the way
networks are debugged), with proper support we can at least mitigate the
effects of the problems being created by extension header insertion.
10) Support IPv4 extension headers. This again attempts to address
some horrendous and completely non-robust hacks that are currently
being perpetuated by some router vendors. For instance, some middlebox
implementations are currently insert into TCP or UDP payload their own
data with the assumption that a peer device will restore correct data.
If they ever miss fixing up the payload then we now have systematic
silent data corruption (IMO, this is very dangerous in a large scale
deployment!). We can offer a better approach...

Changes in this patch set:

  - Reorganize extension header files to separate out common
    API components
  - Create common TLV handler that will can be used in other use
    cases (e.g. segment routing TLVs, UDP options)
  - Allow registration of TLV handlers
  - Elaborate on the TLV tables to include more characteristics
  - Add a netlink interface to set TLV parameters (such as
    alignment requirements, authorization to send, etc.)
  - Enhance validation of TLVs being sent. Validation is strict
    (unless overridden by admin) following that sending clause
    of the robustness principle
  - Allow non-privileged users to set Hop-by-Hop and Destination
    Options if authorized by the admin

v2:
  - Fix build errors from missing include file.

v3:
  - Fix kbuild issue for ipv6_opt_hdr declared inside parameter list
    in ipeh.h

v4:
  - Resubmit

v5:
  - Fix reverse christmas tree issue

v6:
  - Address comments from Simon Horman
  - Remove new EXTHDRS Kconfig symbol, just use IPV6 for now
  - Split out introduction of parse_error for TLV parsing loop into its
    own patch
  - Fix drop counters in HBH and destination options processing
  - Add extack error messages in netlink code
  - Added range of permissions in include/uapi/linux/ipeh.h
  - Check that min data length is <= max data length when setting
    TLV attributes

v7:
  - Fix incorrect index in checking for nonzero padding
  - Use dev_net(skb->dev) in all cases of __IP6_INC_STATS for hopopts
    and destopts (addresses comment from Willem de Bruijin)

v8:
  - Elaborate on justification for patches in the summary commit log

Tom Herbert (9):
  ipeh: Fix destopts counters on drop
  ipeh: Create exthdrs_options.c and ipeh.h
  ipeh: Move generic EH functions to exthdrs_common.c
  ipeh: Generic TLV parser
  ipeh: Add callback to ipeh_parse_tlv to handle errors
  ip6tlvs: Registration of TLV handlers and parameters
  ip6tlvs: Add TX parameters
  ip6tlvs: Add netlink interface
  ip6tlvs: Validation of TX Destination and Hop-by-Hop options

 include/net/ipeh.h         |  209 ++++++++
 include/net/ipv6.h         |   12 +-
 include/uapi/linux/in6.h   |    6 +
 include/uapi/linux/ipeh.h  |   53 ++
 net/dccp/ipv6.c            |    2 +-
 net/ipv6/Makefile          |    3 +-
 net/ipv6/calipso.c         |    6 +-
 net/ipv6/datagram.c        |   51 +-
 net/ipv6/exthdrs.c         |  514 ++-----------------
 net/ipv6/exthdrs_common.c  | 1216 ++++++++++++++++++++++++++++++++++++++++++++
 net/ipv6/exthdrs_options.c |  342 +++++++++++++
 net/ipv6/ipv6_sockglue.c   |   39 +-
 net/ipv6/raw.c             |    2 +-
 net/ipv6/tcp_ipv6.c        |    2 +-
 net/ipv6/udp.c             |    2 +-
 net/l2tp/l2tp_ip6.c        |    2 +-
 net/sctp/ipv6.c            |    2 +-
 17 files changed, 1942 insertions(+), 521 deletions(-)
 create mode 100644 include/net/ipeh.h
 create mode 100644 include/uapi/linux/ipeh.h
 create mode 100644 net/ipv6/exthdrs_common.c
 create mode 100644 net/ipv6/exthdrs_options.c

-- 
2.7.4


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v8 net-next 1/9] ipeh: Fix destopts counters on drop
  2019-12-26 22:51 [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure Tom Herbert
@ 2019-12-26 22:51 ` Tom Herbert
  2019-12-26 22:51 ` [PATCH v8 net-next 2/9] ipeh: Create exthdrs_options.c and ipeh.h Tom Herbert
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Tom Herbert @ 2019-12-26 22:51 UTC (permalink / raw)
  To: davem, netdev, simon.horman, willemdebruijn.kernel
  Cc: Tom Herbert, Tom Herbert

From: Tom Herbert <tom@quantonium.net>

Bump IPSTATS_MIB_INHDRERRORS when extension header limit is exceeded.

Only take net from skb->dev, don't use dst for counters.

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 net/ipv6/exthdrs.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index ab5add0..e7eacc4 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -276,28 +276,25 @@ static const struct tlvtype_proc tlvprocdestopt_lst[] = {
 
 static int ipv6_destopt_rcv(struct sk_buff *skb)
 {
-	struct inet6_dev *idev = __in6_dev_get(skb->dev);
 	struct inet6_skb_parm *opt = IP6CB(skb);
 #if IS_ENABLED(CONFIG_IPV6_MIP6)
 	__u16 dstbuf;
 #endif
-	struct dst_entry *dst = skb_dst(skb);
 	struct net *net = dev_net(skb->dev);
 	int extlen;
 
 	if (!pskb_may_pull(skb, skb_transport_offset(skb) + 8) ||
 	    !pskb_may_pull(skb, (skb_transport_offset(skb) +
 				 ((skb_transport_header(skb)[1] + 1) << 3)))) {
-		__IP6_INC_STATS(dev_net(dst->dev), idev,
-				IPSTATS_MIB_INHDRERRORS);
-fail_and_free:
 		kfree_skb(skb);
-		return -1;
+		goto fail;
 	}
 
 	extlen = (skb_transport_header(skb)[1] + 1) << 3;
-	if (extlen > net->ipv6.sysctl.max_dst_opts_len)
-		goto fail_and_free;
+	if (extlen > net->ipv6.sysctl.max_dst_opts_len) {
+		kfree_skb(skb);
+		goto fail;
+	}
 
 	opt->lastopt = opt->dst1 = skb_network_header_len(skb);
 #if IS_ENABLED(CONFIG_IPV6_MIP6)
@@ -316,7 +313,9 @@ static int ipv6_destopt_rcv(struct sk_buff *skb)
 		return 1;
 	}
 
-	__IP6_INC_STATS(net, idev, IPSTATS_MIB_INHDRERRORS);
+fail:
+	__IP6_INC_STATS(dev_net(skb->dev), __in6_dev_get(skb->dev),
+			IPSTATS_MIB_INHDRERRORS);
 	return -1;
 }
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v8 net-next 2/9] ipeh: Create exthdrs_options.c and ipeh.h
  2019-12-26 22:51 [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure Tom Herbert
  2019-12-26 22:51 ` [PATCH v8 net-next 1/9] ipeh: Fix destopts counters on drop Tom Herbert
@ 2019-12-26 22:51 ` Tom Herbert
  2019-12-26 22:51 ` [PATCH v8 net-next 3/9] ipeh: Move generic EH functions to exthdrs_common.c Tom Herbert
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Tom Herbert @ 2019-12-26 22:51 UTC (permalink / raw)
  To: davem, netdev, simon.horman, willemdebruijn.kernel
  Cc: Tom Herbert, Tom Herbert

From: Tom Herbert <tom@quantonium.net>

Create exthdrs_options.c to hold code related to specific Hop-by-Hop
and Destination extension header options. Move related functions in
exthdrs.c to the new file.

Create include net/ipeh.h to contain common definitions for IP extension
headers.

Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 include/net/ipeh.h         |  22 +++++
 include/net/ipv6.h         |   1 +
 net/ipv6/Makefile          |   2 +-
 net/ipv6/exthdrs.c         | 204 ---------------------------------------------
 net/ipv6/exthdrs_options.c | 201 ++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 225 insertions(+), 205 deletions(-)
 create mode 100644 include/net/ipeh.h
 create mode 100644 net/ipv6/exthdrs_options.c

diff --git a/include/net/ipeh.h b/include/net/ipeh.h
new file mode 100644
index 0000000..ec2d186
--- /dev/null
+++ b/include/net/ipeh.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _NET_IPEH_H
+#define _NET_IPEH_H
+
+#include <linux/skbuff.h>
+
+/*
+ *     Parsing tlv encoded headers.
+ *
+ *     Parsing function "func" returns true, if parsing succeed
+ *     and false, if it failed.
+ *     It MUST NOT touch skb->h.
+ */
+struct tlvtype_proc {
+	int	type;
+	bool	(*func)(struct sk_buff *skb, int offset);
+};
+
+extern const struct tlvtype_proc tlvprocdestopt_lst[];
+extern const struct tlvtype_proc tlvprochopopt_lst[];
+
+#endif /* _NET_IPEH_H */
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 4e95f6d..1d84394 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -20,6 +20,7 @@
 #include <net/flow_dissector.h>
 #include <net/snmp.h>
 #include <net/netns/hash.h>
+#include <net/ipeh.h>
 
 #define SIN6_LEN_RFC2133	24
 
diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
index 8ccf355..df3919b 100644
--- a/net/ipv6/Makefile
+++ b/net/ipv6/Makefile
@@ -10,7 +10,7 @@ ipv6-objs :=	af_inet6.o anycast.o ip6_output.o ip6_input.o addrconf.o \
 		route.o ip6_fib.o ipv6_sockglue.o ndisc.o udp.o udplite.o \
 		raw.o icmp.o mcast.o reassembly.o tcp_ipv6.o ping.o \
 		exthdrs.o datagram.o ip6_flowlabel.o inet6_connection_sock.o \
-		udp_offload.o seg6.o fib6_notifier.o
+		udp_offload.o seg6.o fib6_notifier.o exthdrs_options.o
 
 ipv6-offload :=	ip6_offload.o tcpv6_offload.o exthdrs_offload.o
 
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index e7eacc4..4709cc1 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -39,7 +39,6 @@
 #include <net/ndisc.h>
 #include <net/ip6_route.h>
 #include <net/addrconf.h>
-#include <net/calipso.h>
 #if IS_ENABLED(CONFIG_IPV6_MIP6)
 #include <net/xfrm.h>
 #endif
@@ -51,19 +50,6 @@
 
 #include <linux/uaccess.h>
 
-/*
- *	Parsing tlv encoded headers.
- *
- *	Parsing function "func" returns true, if parsing succeed
- *	and false, if it failed.
- *	It MUST NOT touch skb->h.
- */
-
-struct tlvtype_proc {
-	int	type;
-	bool	(*func)(struct sk_buff *skb, int offset);
-};
-
 /*********************
   Generic functions
  *********************/
@@ -200,80 +186,6 @@ static bool ip6_parse_tlv(const struct tlvtype_proc *procs,
 	return false;
 }
 
-/*****************************
-  Destination options header.
- *****************************/
-
-#if IS_ENABLED(CONFIG_IPV6_MIP6)
-static bool ipv6_dest_hao(struct sk_buff *skb, int optoff)
-{
-	struct ipv6_destopt_hao *hao;
-	struct inet6_skb_parm *opt = IP6CB(skb);
-	struct ipv6hdr *ipv6h = ipv6_hdr(skb);
-	int ret;
-
-	if (opt->dsthao) {
-		net_dbg_ratelimited("hao duplicated\n");
-		goto discard;
-	}
-	opt->dsthao = opt->dst1;
-	opt->dst1 = 0;
-
-	hao = (struct ipv6_destopt_hao *)(skb_network_header(skb) + optoff);
-
-	if (hao->length != 16) {
-		net_dbg_ratelimited("hao invalid option length = %d\n",
-				    hao->length);
-		goto discard;
-	}
-
-	if (!(ipv6_addr_type(&hao->addr) & IPV6_ADDR_UNICAST)) {
-		net_dbg_ratelimited("hao is not an unicast addr: %pI6\n",
-				    &hao->addr);
-		goto discard;
-	}
-
-	ret = xfrm6_input_addr(skb, (xfrm_address_t *)&ipv6h->daddr,
-			       (xfrm_address_t *)&hao->addr, IPPROTO_DSTOPTS);
-	if (unlikely(ret < 0))
-		goto discard;
-
-	if (skb_cloned(skb)) {
-		if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC))
-			goto discard;
-
-		/* update all variable using below by copied skbuff */
-		hao = (struct ipv6_destopt_hao *)(skb_network_header(skb) +
-						  optoff);
-		ipv6h = ipv6_hdr(skb);
-	}
-
-	if (skb->ip_summed == CHECKSUM_COMPLETE)
-		skb->ip_summed = CHECKSUM_NONE;
-
-	swap(ipv6h->saddr, hao->addr);
-
-	if (skb->tstamp == 0)
-		__net_timestamp(skb);
-
-	return true;
-
- discard:
-	kfree_skb(skb);
-	return false;
-}
-#endif
-
-static const struct tlvtype_proc tlvprocdestopt_lst[] = {
-#if IS_ENABLED(CONFIG_IPV6_MIP6)
-	{
-		.type	= IPV6_TLV_HAO,
-		.func	= ipv6_dest_hao,
-	},
-#endif
-	{-1,			NULL}
-};
-
 static int ipv6_destopt_rcv(struct sk_buff *skb)
 {
 	struct inet6_skb_parm *opt = IP6CB(skb);
@@ -701,122 +613,6 @@ void ipv6_exthdrs_exit(void)
 	inet6_del_protocol(&rthdr_protocol, IPPROTO_ROUTING);
 }
 
-/**********************************
-  Hop-by-hop options.
- **********************************/
-
-/*
- * Note: we cannot rely on skb_dst(skb) before we assign it in ip6_route_input().
- */
-static inline struct inet6_dev *ipv6_skb_idev(struct sk_buff *skb)
-{
-	return skb_dst(skb) ? ip6_dst_idev(skb_dst(skb)) : __in6_dev_get(skb->dev);
-}
-
-static inline struct net *ipv6_skb_net(struct sk_buff *skb)
-{
-	return skb_dst(skb) ? dev_net(skb_dst(skb)->dev) : dev_net(skb->dev);
-}
-
-/* Router Alert as of RFC 2711 */
-
-static bool ipv6_hop_ra(struct sk_buff *skb, int optoff)
-{
-	const unsigned char *nh = skb_network_header(skb);
-
-	if (nh[optoff + 1] == 2) {
-		IP6CB(skb)->flags |= IP6SKB_ROUTERALERT;
-		memcpy(&IP6CB(skb)->ra, nh + optoff + 2, sizeof(IP6CB(skb)->ra));
-		return true;
-	}
-	net_dbg_ratelimited("ipv6_hop_ra: wrong RA length %d\n",
-			    nh[optoff + 1]);
-	kfree_skb(skb);
-	return false;
-}
-
-/* Jumbo payload */
-
-static bool ipv6_hop_jumbo(struct sk_buff *skb, int optoff)
-{
-	const unsigned char *nh = skb_network_header(skb);
-	struct inet6_dev *idev = __in6_dev_get_safely(skb->dev);
-	struct net *net = ipv6_skb_net(skb);
-	u32 pkt_len;
-
-	if (nh[optoff + 1] != 4 || (optoff & 3) != 2) {
-		net_dbg_ratelimited("ipv6_hop_jumbo: wrong jumbo opt length/alignment %d\n",
-				    nh[optoff+1]);
-		__IP6_INC_STATS(net, idev, IPSTATS_MIB_INHDRERRORS);
-		goto drop;
-	}
-
-	pkt_len = ntohl(*(__be32 *)(nh + optoff + 2));
-	if (pkt_len <= IPV6_MAXPLEN) {
-		__IP6_INC_STATS(net, idev, IPSTATS_MIB_INHDRERRORS);
-		icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, optoff+2);
-		return false;
-	}
-	if (ipv6_hdr(skb)->payload_len) {
-		__IP6_INC_STATS(net, idev, IPSTATS_MIB_INHDRERRORS);
-		icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, optoff);
-		return false;
-	}
-
-	if (pkt_len > skb->len - sizeof(struct ipv6hdr)) {
-		__IP6_INC_STATS(net, idev, IPSTATS_MIB_INTRUNCATEDPKTS);
-		goto drop;
-	}
-
-	if (pskb_trim_rcsum(skb, pkt_len + sizeof(struct ipv6hdr)))
-		goto drop;
-
-	IP6CB(skb)->flags |= IP6SKB_JUMBOGRAM;
-	return true;
-
-drop:
-	kfree_skb(skb);
-	return false;
-}
-
-/* CALIPSO RFC 5570 */
-
-static bool ipv6_hop_calipso(struct sk_buff *skb, int optoff)
-{
-	const unsigned char *nh = skb_network_header(skb);
-
-	if (nh[optoff + 1] < 8)
-		goto drop;
-
-	if (nh[optoff + 6] * 4 + 8 > nh[optoff + 1])
-		goto drop;
-
-	if (!calipso_validate(skb, nh + optoff))
-		goto drop;
-
-	return true;
-
-drop:
-	kfree_skb(skb);
-	return false;
-}
-
-static const struct tlvtype_proc tlvprochopopt_lst[] = {
-	{
-		.type	= IPV6_TLV_ROUTERALERT,
-		.func	= ipv6_hop_ra,
-	},
-	{
-		.type	= IPV6_TLV_JUMBO,
-		.func	= ipv6_hop_jumbo,
-	},
-	{
-		.type	= IPV6_TLV_CALIPSO,
-		.func	= ipv6_hop_calipso,
-	},
-	{ -1, }
-};
-
 int ipv6_parse_hopopts(struct sk_buff *skb)
 {
 	struct inet6_skb_parm *opt = IP6CB(skb);
diff --git a/net/ipv6/exthdrs_options.c b/net/ipv6/exthdrs_options.c
new file mode 100644
index 0000000..032e072
--- /dev/null
+++ b/net/ipv6/exthdrs_options.c
@@ -0,0 +1,201 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/errno.h>
+#include <linux/in6.h>
+#include <linux/net.h>
+#include <linux/netdevice.h>
+#include <linux/socket.h>
+#include <linux/types.h>
+#include <net/calipso.h>
+#include <net/ipv6.h>
+#include <net/ip6_route.h>
+#if IS_ENABLED(CONFIG_IPV6_MIP6)
+#include <net/xfrm.h>
+#endif
+
+/* Destination options header */
+
+#if IS_ENABLED(CONFIG_IPV6_MIP6)
+static bool ipv6_dest_hao(struct sk_buff *skb, int optoff)
+{
+	struct ipv6_destopt_hao *hao;
+	struct inet6_skb_parm *opt = IP6CB(skb);
+	struct ipv6hdr *ipv6h = ipv6_hdr(skb);
+	int ret;
+
+	if (opt->dsthao) {
+		net_dbg_ratelimited("hao duplicated\n");
+		goto discard;
+	}
+	opt->dsthao = opt->dst1;
+	opt->dst1 = 0;
+
+	hao = (struct ipv6_destopt_hao *)(skb_network_header(skb) + optoff);
+
+	if (hao->length != 16) {
+		net_dbg_ratelimited("hao invalid option length = %d\n",
+				    hao->length);
+		goto discard;
+	}
+
+	if (!(ipv6_addr_type(&hao->addr) & IPV6_ADDR_UNICAST)) {
+		net_dbg_ratelimited("hao is not an unicast addr: %pI6\n",
+				    &hao->addr);
+		goto discard;
+	}
+
+	ret = xfrm6_input_addr(skb, (xfrm_address_t *)&ipv6h->daddr,
+			       (xfrm_address_t *)&hao->addr, IPPROTO_DSTOPTS);
+	if (unlikely(ret < 0))
+		goto discard;
+
+	if (skb_cloned(skb)) {
+		if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC))
+			goto discard;
+
+		/* update all variable using below by copied skbuff */
+		hao = (struct ipv6_destopt_hao *)(skb_network_header(skb) +
+						  optoff);
+		ipv6h = ipv6_hdr(skb);
+	}
+
+	if (skb->ip_summed == CHECKSUM_COMPLETE)
+		skb->ip_summed = CHECKSUM_NONE;
+
+	swap(ipv6h->saddr, hao->addr);
+
+	if (skb->tstamp == 0)
+		__net_timestamp(skb);
+
+	return true;
+
+ discard:
+	kfree_skb(skb);
+	return false;
+}
+#endif
+
+const struct tlvtype_proc tlvprocdestopt_lst[] = {
+#if IS_ENABLED(CONFIG_IPV6_MIP6)
+	{
+		.type	= IPV6_TLV_HAO,
+		.func	= ipv6_dest_hao,
+	},
+#endif
+	{-1,			NULL}
+};
+
+/* Hop-by-hop options */
+
+/* Note: we cannot rely on skb_dst(skb) before we assign it in
+ * ip6_route_input().
+ */
+static inline struct inet6_dev *ipv6_skb_idev(struct sk_buff *skb)
+{
+	return skb_dst(skb) ? ip6_dst_idev(skb_dst(skb)) :
+	    __in6_dev_get(skb->dev);
+}
+
+static inline struct net *ipv6_skb_net(struct sk_buff *skb)
+{
+	return skb_dst(skb) ? dev_net(skb_dst(skb)->dev) : dev_net(skb->dev);
+}
+
+/* Router Alert as of RFC 2711 */
+
+static bool ipv6_hop_ra(struct sk_buff *skb, int optoff)
+{
+	const unsigned char *nh = skb_network_header(skb);
+
+	if (nh[optoff + 1] == 2) {
+		IP6CB(skb)->flags |= IP6SKB_ROUTERALERT;
+		memcpy(&IP6CB(skb)->ra, nh + optoff + 2,
+		       sizeof(IP6CB(skb)->ra));
+		return true;
+	}
+	net_dbg_ratelimited("%s: wrong RA length %d\n",
+			    __func__, nh[optoff + 1]);
+	kfree_skb(skb);
+	return false;
+}
+
+/* Jumbo payload */
+
+static bool ipv6_hop_jumbo(struct sk_buff *skb, int optoff)
+{
+	const unsigned char *nh = skb_network_header(skb);
+	struct inet6_dev *idev = __in6_dev_get_safely(skb->dev);
+	struct net *net = ipv6_skb_net(skb);
+	u32 pkt_len;
+
+	if (nh[optoff + 1] != 4 || (optoff & 3) != 2) {
+		net_dbg_ratelimited("%s: wrong jumbo opt length/alignment %d\n",
+				    __func__, nh[optoff + 1]);
+		__IP6_INC_STATS(net, idev, IPSTATS_MIB_INHDRERRORS);
+		goto drop;
+	}
+
+	pkt_len = ntohl(*(__be32 *)(nh + optoff + 2));
+	if (pkt_len <= IPV6_MAXPLEN) {
+		__IP6_INC_STATS(net, idev, IPSTATS_MIB_INHDRERRORS);
+		icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, optoff + 2);
+		return false;
+	}
+	if (ipv6_hdr(skb)->payload_len) {
+		__IP6_INC_STATS(net, idev, IPSTATS_MIB_INHDRERRORS);
+		icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, optoff);
+		return false;
+	}
+
+	if (pkt_len > skb->len - sizeof(struct ipv6hdr)) {
+		__IP6_INC_STATS(net, idev, IPSTATS_MIB_INTRUNCATEDPKTS);
+		goto drop;
+	}
+
+	if (pskb_trim_rcsum(skb, pkt_len + sizeof(struct ipv6hdr)))
+		goto drop;
+
+	IP6CB(skb)->flags |= IP6SKB_JUMBOGRAM;
+	return true;
+
+drop:
+	kfree_skb(skb);
+	return false;
+}
+
+/* CALIPSO RFC 5570 */
+
+static bool ipv6_hop_calipso(struct sk_buff *skb, int optoff)
+{
+	const unsigned char *nh = skb_network_header(skb);
+
+	if (nh[optoff + 1] < 8)
+		goto drop;
+
+	if (nh[optoff + 6] * 4 + 8 > nh[optoff + 1])
+		goto drop;
+
+	if (!calipso_validate(skb, nh + optoff))
+		goto drop;
+
+	return true;
+
+drop:
+	kfree_skb(skb);
+	return false;
+}
+
+const struct tlvtype_proc tlvprochopopt_lst[] = {
+	{
+		.type	= IPV6_TLV_ROUTERALERT,
+		.func	= ipv6_hop_ra,
+	},
+	{
+		.type	= IPV6_TLV_JUMBO,
+		.func	= ipv6_hop_jumbo,
+	},
+	{
+		.type	= IPV6_TLV_CALIPSO,
+		.func	= ipv6_hop_calipso,
+	},
+	{ -1, }
+};
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v8 net-next 3/9] ipeh: Move generic EH functions to exthdrs_common.c
  2019-12-26 22:51 [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure Tom Herbert
  2019-12-26 22:51 ` [PATCH v8 net-next 1/9] ipeh: Fix destopts counters on drop Tom Herbert
  2019-12-26 22:51 ` [PATCH v8 net-next 2/9] ipeh: Create exthdrs_options.c and ipeh.h Tom Herbert
@ 2019-12-26 22:51 ` Tom Herbert
  2019-12-26 22:51 ` [PATCH v8 net-next 4/9] ipeh: Generic TLV parser Tom Herbert
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Tom Herbert @ 2019-12-26 22:51 UTC (permalink / raw)
  To: davem, netdev, simon.horman, willemdebruijn.kernel
  Cc: Tom Herbert, Tom Herbert

From: Tom Herbert <tom@quantonium.net>

Move generic functions in exthdrs.c to new exthdrs_common.c so that
exthdrs.c only contains functions that are specific to IPv6 processing,
and exthdrs_common.c contains functions that are generic. These
functions include those that will be used with IPv4 extension headers.
Generic extension header related functions are prefixed by ipeh_.

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 include/net/ipeh.h        |  12 ++++
 include/net/ipv6.h        |   9 ---
 net/dccp/ipv6.c           |   2 +-
 net/ipv6/Makefile         |   1 +
 net/ipv6/calipso.c        |   6 +-
 net/ipv6/exthdrs.c        | 138 --------------------------------------------
 net/ipv6/exthdrs_common.c | 144 ++++++++++++++++++++++++++++++++++++++++++++++
 net/ipv6/ipv6_sockglue.c  |   2 +-
 net/ipv6/raw.c            |   2 +-
 net/ipv6/tcp_ipv6.c       |   2 +-
 net/ipv6/udp.c            |   2 +-
 net/l2tp/l2tp_ip6.c       |   2 +-
 net/sctp/ipv6.c           |   2 +-
 13 files changed, 167 insertions(+), 157 deletions(-)
 create mode 100644 net/ipv6/exthdrs_common.c

diff --git a/include/net/ipeh.h b/include/net/ipeh.h
index ec2d186..3b24831 100644
--- a/include/net/ipeh.h
+++ b/include/net/ipeh.h
@@ -19,4 +19,16 @@ struct tlvtype_proc {
 extern const struct tlvtype_proc tlvprocdestopt_lst[];
 extern const struct tlvtype_proc tlvprochopopt_lst[];
 
+struct ipv6_txoptions;
+struct ipv6_opt_hdr;
+
+struct ipv6_txoptions *ipeh_dup_options(struct sock *sk,
+					struct ipv6_txoptions *opt);
+struct ipv6_txoptions *ipeh_renew_options(struct sock *sk,
+					  struct ipv6_txoptions *opt,
+					  int newtype,
+					  struct ipv6_opt_hdr *newopt);
+struct ipv6_txoptions *ipeh_fixup_options(struct ipv6_txoptions *opt_space,
+					  struct ipv6_txoptions *opt);
+
 #endif /* _NET_IPEH_H */
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 1d84394..6c4c834 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -429,15 +429,6 @@ int ip6_ra_control(struct sock *sk, int sel);
 
 int ipv6_parse_hopopts(struct sk_buff *skb);
 
-struct ipv6_txoptions *ipv6_dup_options(struct sock *sk,
-					struct ipv6_txoptions *opt);
-struct ipv6_txoptions *ipv6_renew_options(struct sock *sk,
-					  struct ipv6_txoptions *opt,
-					  int newtype,
-					  struct ipv6_opt_hdr *newopt);
-struct ipv6_txoptions *ipv6_fixup_options(struct ipv6_txoptions *opt_space,
-					  struct ipv6_txoptions *opt);
-
 bool ipv6_opt_accepted(const struct sock *sk, const struct sk_buff *skb,
 		       const struct inet6_skb_parm *opt);
 struct ipv6_txoptions *ipv6_update_options(struct sock *sk,
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index 1e5e08c..deed6efa 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -515,7 +515,7 @@ static struct sock *dccp_v6_request_recv_sock(const struct sock *sk,
 	if (!opt)
 		opt = rcu_dereference(np->opt);
 	if (opt) {
-		opt = ipv6_dup_options(newsk, opt);
+		opt = ipeh_dup_options(newsk, opt);
 		RCU_INIT_POINTER(newnp->opt, opt);
 	}
 	inet_csk(newsk)->icsk_ext_hdr_len = 0;
diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
index df3919b..3033d3e 100644
--- a/net/ipv6/Makefile
+++ b/net/ipv6/Makefile
@@ -44,6 +44,7 @@ obj-$(CONFIG_IPV6_SIT) += sit.o
 obj-$(CONFIG_IPV6_TUNNEL) += ip6_tunnel.o
 obj-$(CONFIG_IPV6_GRE) += ip6_gre.o
 obj-$(CONFIG_IPV6_FOU) += fou6.o
+obj-$(CONFIG_IPV6) += exthdrs_common.o
 
 obj-y += addrconf_core.o exthdrs_core.o ip6_checksum.o ip6_icmp.o
 obj-$(CONFIG_INET) += output_core.o protocol.o $(ipv6-offload)
diff --git a/net/ipv6/calipso.c b/net/ipv6/calipso.c
index 221c81f..9c84848 100644
--- a/net/ipv6/calipso.c
+++ b/net/ipv6/calipso.c
@@ -785,7 +785,7 @@ static int calipso_opt_update(struct sock *sk, struct ipv6_opt_hdr *hop)
 {
 	struct ipv6_txoptions *old = txopt_get(inet6_sk(sk)), *txopts;
 
-	txopts = ipv6_renew_options(sk, old, IPV6_HOPOPTS, hop);
+	txopts = ipeh_renew_options(sk, old, IPV6_HOPOPTS, hop);
 	txopt_put(old);
 	if (IS_ERR(txopts))
 		return PTR_ERR(txopts);
@@ -1207,7 +1207,7 @@ static int calipso_req_setattr(struct request_sock *req,
 	if (IS_ERR(new))
 		return PTR_ERR(new);
 
-	txopts = ipv6_renew_options(sk, req_inet->ipv6_opt, IPV6_HOPOPTS, new);
+	txopts = ipeh_renew_options(sk, req_inet->ipv6_opt, IPV6_HOPOPTS, new);
 
 	kfree(new);
 
@@ -1244,7 +1244,7 @@ static void calipso_req_delattr(struct request_sock *req)
 	if (calipso_opt_del(req_inet->ipv6_opt->hopopt, &new))
 		return; /* Nothing to do */
 
-	txopts = ipv6_renew_options(sk, req_inet->ipv6_opt, IPV6_HOPOPTS, new);
+	txopts = ipeh_renew_options(sk, req_inet->ipv6_opt, IPV6_HOPOPTS, new);
 
 	if (!IS_ERR(txopts)) {
 		txopts = xchg(&req_inet->ipv6_opt, txopts);
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index 4709cc1..86e562c 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -783,144 +783,6 @@ void ipv6_push_frag_opts(struct sk_buff *skb, struct ipv6_txoptions *opt, u8 *pr
 }
 EXPORT_SYMBOL(ipv6_push_frag_opts);
 
-struct ipv6_txoptions *
-ipv6_dup_options(struct sock *sk, struct ipv6_txoptions *opt)
-{
-	struct ipv6_txoptions *opt2;
-
-	opt2 = sock_kmalloc(sk, opt->tot_len, GFP_ATOMIC);
-	if (opt2) {
-		long dif = (char *)opt2 - (char *)opt;
-		memcpy(opt2, opt, opt->tot_len);
-		if (opt2->hopopt)
-			*((char **)&opt2->hopopt) += dif;
-		if (opt2->dst0opt)
-			*((char **)&opt2->dst0opt) += dif;
-		if (opt2->dst1opt)
-			*((char **)&opt2->dst1opt) += dif;
-		if (opt2->srcrt)
-			*((char **)&opt2->srcrt) += dif;
-		refcount_set(&opt2->refcnt, 1);
-	}
-	return opt2;
-}
-EXPORT_SYMBOL_GPL(ipv6_dup_options);
-
-static void ipv6_renew_option(int renewtype,
-			      struct ipv6_opt_hdr **dest,
-			      struct ipv6_opt_hdr *old,
-			      struct ipv6_opt_hdr *new,
-			      int newtype, char **p)
-{
-	struct ipv6_opt_hdr *src;
-
-	src = (renewtype == newtype ? new : old);
-	if (!src)
-		return;
-
-	memcpy(*p, src, ipv6_optlen(src));
-	*dest = (struct ipv6_opt_hdr *)*p;
-	*p += CMSG_ALIGN(ipv6_optlen(*dest));
-}
-
-/**
- * ipv6_renew_options - replace a specific ext hdr with a new one.
- *
- * @sk: sock from which to allocate memory
- * @opt: original options
- * @newtype: option type to replace in @opt
- * @newopt: new option of type @newtype to replace (user-mem)
- * @newoptlen: length of @newopt
- *
- * Returns a new set of options which is a copy of @opt with the
- * option type @newtype replaced with @newopt.
- *
- * @opt may be NULL, in which case a new set of options is returned
- * containing just @newopt.
- *
- * @newopt may be NULL, in which case the specified option type is
- * not copied into the new set of options.
- *
- * The new set of options is allocated from the socket option memory
- * buffer of @sk.
- */
-struct ipv6_txoptions *
-ipv6_renew_options(struct sock *sk, struct ipv6_txoptions *opt,
-		   int newtype, struct ipv6_opt_hdr *newopt)
-{
-	int tot_len = 0;
-	char *p;
-	struct ipv6_txoptions *opt2;
-
-	if (opt) {
-		if (newtype != IPV6_HOPOPTS && opt->hopopt)
-			tot_len += CMSG_ALIGN(ipv6_optlen(opt->hopopt));
-		if (newtype != IPV6_RTHDRDSTOPTS && opt->dst0opt)
-			tot_len += CMSG_ALIGN(ipv6_optlen(opt->dst0opt));
-		if (newtype != IPV6_RTHDR && opt->srcrt)
-			tot_len += CMSG_ALIGN(ipv6_optlen(opt->srcrt));
-		if (newtype != IPV6_DSTOPTS && opt->dst1opt)
-			tot_len += CMSG_ALIGN(ipv6_optlen(opt->dst1opt));
-	}
-
-	if (newopt)
-		tot_len += CMSG_ALIGN(ipv6_optlen(newopt));
-
-	if (!tot_len)
-		return NULL;
-
-	tot_len += sizeof(*opt2);
-	opt2 = sock_kmalloc(sk, tot_len, GFP_ATOMIC);
-	if (!opt2)
-		return ERR_PTR(-ENOBUFS);
-
-	memset(opt2, 0, tot_len);
-	refcount_set(&opt2->refcnt, 1);
-	opt2->tot_len = tot_len;
-	p = (char *)(opt2 + 1);
-
-	ipv6_renew_option(IPV6_HOPOPTS, &opt2->hopopt,
-			  (opt ? opt->hopopt : NULL),
-			  newopt, newtype, &p);
-	ipv6_renew_option(IPV6_RTHDRDSTOPTS, &opt2->dst0opt,
-			  (opt ? opt->dst0opt : NULL),
-			  newopt, newtype, &p);
-	ipv6_renew_option(IPV6_RTHDR,
-			  (struct ipv6_opt_hdr **)&opt2->srcrt,
-			  (opt ? (struct ipv6_opt_hdr *)opt->srcrt : NULL),
-			  newopt, newtype, &p);
-	ipv6_renew_option(IPV6_DSTOPTS, &opt2->dst1opt,
-			  (opt ? opt->dst1opt : NULL),
-			  newopt, newtype, &p);
-
-	opt2->opt_nflen = (opt2->hopopt ? ipv6_optlen(opt2->hopopt) : 0) +
-			  (opt2->dst0opt ? ipv6_optlen(opt2->dst0opt) : 0) +
-			  (opt2->srcrt ? ipv6_optlen(opt2->srcrt) : 0);
-	opt2->opt_flen = (opt2->dst1opt ? ipv6_optlen(opt2->dst1opt) : 0);
-
-	return opt2;
-}
-
-struct ipv6_txoptions *ipv6_fixup_options(struct ipv6_txoptions *opt_space,
-					  struct ipv6_txoptions *opt)
-{
-	/*
-	 * ignore the dest before srcrt unless srcrt is being included.
-	 * --yoshfuji
-	 */
-	if (opt && opt->dst0opt && !opt->srcrt) {
-		if (opt_space != opt) {
-			memcpy(opt_space, opt, sizeof(*opt_space));
-			opt = opt_space;
-		}
-		opt->opt_nflen -= ipv6_optlen(opt->dst0opt);
-		opt->dst0opt = NULL;
-	}
-
-	return opt;
-}
-EXPORT_SYMBOL_GPL(ipv6_fixup_options);
-
 /**
  * fl6_update_dst - update flowi destination address with info given
  *                  by srcrt option, if any.
diff --git a/net/ipv6/exthdrs_common.c b/net/ipv6/exthdrs_common.c
new file mode 100644
index 0000000..2c68184
--- /dev/null
+++ b/net/ipv6/exthdrs_common.c
@@ -0,0 +1,144 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Extension header and TLV library code that is not specific to IPv6. */
+#include <linux/export.h>
+#include <net/ipv6.h>
+
+struct ipv6_txoptions *
+ipeh_dup_options(struct sock *sk, struct ipv6_txoptions *opt)
+{
+	struct ipv6_txoptions *opt2;
+
+	opt2 = sock_kmalloc(sk, opt->tot_len, GFP_ATOMIC);
+	if (opt2) {
+		long dif = (char *)opt2 - (char *)opt;
+
+		memcpy(opt2, opt, opt->tot_len);
+		if (opt2->hopopt)
+			*((char **)&opt2->hopopt) += dif;
+		if (opt2->dst0opt)
+			*((char **)&opt2->dst0opt) += dif;
+		if (opt2->dst1opt)
+			*((char **)&opt2->dst1opt) += dif;
+		if (opt2->srcrt)
+			*((char **)&opt2->srcrt) += dif;
+		refcount_set(&opt2->refcnt, 1);
+	}
+	return opt2;
+}
+EXPORT_SYMBOL_GPL(ipeh_dup_options);
+
+static void ipeh_renew_option(int renewtype,
+			      struct ipv6_opt_hdr **dest,
+			      struct ipv6_opt_hdr *old,
+			      struct ipv6_opt_hdr *new,
+			      int newtype, char **p)
+{
+	struct ipv6_opt_hdr *src;
+
+	src = (renewtype == newtype ? new : old);
+	if (!src)
+		return;
+
+	memcpy(*p, src, ipv6_optlen(src));
+	*dest = (struct ipv6_opt_hdr *)*p;
+	*p += CMSG_ALIGN(ipv6_optlen(*dest));
+}
+
+/**
+ * ipeh_renew_options - replace a specific ext hdr with a new one.
+ *
+ * @sk: sock from which to allocate memory
+ * @opt: original options
+ * @newtype: option type to replace in @opt
+ * @newopt: new option of type @newtype to replace (user-mem)
+ * @newoptlen: length of @newopt
+ *
+ * Returns a new set of options which is a copy of @opt with the
+ * option type @newtype replaced with @newopt.
+ *
+ * @opt may be NULL, in which case a new set of options is returned
+ * containing just @newopt.
+ *
+ * @newopt may be NULL, in which case the specified option type is
+ * not copied into the new set of options.
+ *
+ * The new set of options is allocated from the socket option memory
+ * buffer of @sk.
+ */
+struct ipv6_txoptions *
+ipeh_renew_options(struct sock *sk, struct ipv6_txoptions *opt,
+		   int newtype, struct ipv6_opt_hdr *newopt)
+{
+	int tot_len = 0;
+	char *p;
+	struct ipv6_txoptions *opt2;
+
+	if (opt) {
+		if (newtype != IPV6_HOPOPTS && opt->hopopt)
+			tot_len += CMSG_ALIGN(ipv6_optlen(opt->hopopt));
+		if (newtype != IPV6_RTHDRDSTOPTS && opt->dst0opt)
+			tot_len += CMSG_ALIGN(ipv6_optlen(opt->dst0opt));
+		if (newtype != IPV6_RTHDR && opt->srcrt)
+			tot_len += CMSG_ALIGN(ipv6_optlen(opt->srcrt));
+		if (newtype != IPV6_DSTOPTS && opt->dst1opt)
+			tot_len += CMSG_ALIGN(ipv6_optlen(opt->dst1opt));
+	}
+
+	if (newopt)
+		tot_len += CMSG_ALIGN(ipv6_optlen(newopt));
+
+	if (!tot_len)
+		return NULL;
+
+	tot_len += sizeof(*opt2);
+	opt2 = sock_kmalloc(sk, tot_len, GFP_ATOMIC);
+	if (!opt2)
+		return ERR_PTR(-ENOBUFS);
+
+	memset(opt2, 0, tot_len);
+	refcount_set(&opt2->refcnt, 1);
+	opt2->tot_len = tot_len;
+	p = (char *)(opt2 + 1);
+
+	ipeh_renew_option(IPV6_HOPOPTS, &opt2->hopopt,
+			  (opt ? opt->hopopt : NULL),
+			  newopt, newtype, &p);
+	ipeh_renew_option(IPV6_RTHDRDSTOPTS, &opt2->dst0opt,
+			  (opt ? opt->dst0opt : NULL),
+			  newopt, newtype, &p);
+	ipeh_renew_option(IPV6_RTHDR,
+			  (struct ipv6_opt_hdr **)&opt2->srcrt,
+			  (opt ? (struct ipv6_opt_hdr *)opt->srcrt : NULL),
+			  newopt, newtype, &p);
+	ipeh_renew_option(IPV6_DSTOPTS, &opt2->dst1opt,
+			  (opt ? opt->dst1opt : NULL),
+			  newopt, newtype, &p);
+
+	opt2->opt_nflen = (opt2->hopopt ? ipv6_optlen(opt2->hopopt) : 0) +
+			  (opt2->dst0opt ? ipv6_optlen(opt2->dst0opt) : 0) +
+			  (opt2->srcrt ? ipv6_optlen(opt2->srcrt) : 0);
+	opt2->opt_flen = (opt2->dst1opt ? ipv6_optlen(opt2->dst1opt) : 0);
+
+	return opt2;
+}
+EXPORT_SYMBOL(ipeh_renew_options);
+
+struct ipv6_txoptions *ipeh_fixup_options(struct ipv6_txoptions *opt_space,
+					  struct ipv6_txoptions *opt)
+{
+	/* ignore the dest before srcrt unless srcrt is being included.
+	 * --yoshfuji
+	 */
+	if (opt && opt->dst0opt && !opt->srcrt) {
+		if (opt_space != opt) {
+			memcpy(opt_space, opt, sizeof(*opt_space));
+			opt = opt_space;
+		}
+		opt->opt_nflen -= ipv6_optlen(opt->dst0opt);
+		opt->dst0opt = NULL;
+	}
+
+	return opt;
+}
+EXPORT_SYMBOL_GPL(ipeh_fixup_options);
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 79fc012..7810988 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -427,7 +427,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 
 		opt = rcu_dereference_protected(np->opt,
 						lockdep_sock_is_held(sk));
-		opt = ipv6_renew_options(sk, opt, optname, new);
+		opt = ipeh_renew_options(sk, opt, optname, new);
 		kfree(new);
 		if (IS_ERR(opt)) {
 			retv = PTR_ERR(opt);
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index dfe5e60..1bffc05 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -889,7 +889,7 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	}
 	if (flowlabel)
 		opt = fl6_merge_options(&opt_space, flowlabel, opt);
-	opt = ipv6_fixup_options(&opt_space, opt);
+	opt = ipeh_fixup_options(&opt_space, opt);
 
 	fl6.flowi6_proto = proto;
 	fl6.flowi6_mark = ipc6.sockc.mark;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index df5fd91..406ffe0 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1250,7 +1250,7 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
 	if (!opt)
 		opt = rcu_dereference(np->opt);
 	if (opt) {
-		opt = ipv6_dup_options(newsk, opt);
+		opt = ipeh_dup_options(newsk, opt);
 		RCU_INIT_POINTER(newnp->opt, opt);
 	}
 	inet_csk(newsk)->icsk_ext_hdr_len = 0;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 9fec580..426fff3 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1391,7 +1391,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	}
 	if (flowlabel)
 		opt = fl6_merge_options(&opt_space, flowlabel, opt);
-	opt = ipv6_fixup_options(&opt_space, opt);
+	opt = ipeh_fixup_options(&opt_space, opt);
 	ipc6.opt = opt;
 
 	fl6.flowi6_proto = sk->sk_protocol;
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index d148766..3d5ac76 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -590,7 +590,7 @@ static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	}
 	if (flowlabel)
 		opt = fl6_merge_options(&opt_space, flowlabel, opt);
-	opt = ipv6_fixup_options(&opt_space, opt);
+	opt = ipeh_fixup_options(&opt_space, opt);
 	ipc6.opt = opt;
 
 	fl6.flowi6_proto = sk->sk_protocol;
diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
index bc734cf..279d2d2 100644
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -441,7 +441,7 @@ static void sctp_v6_copy_ip_options(struct sock *sk, struct sock *newsk)
 	rcu_read_lock();
 	opt = rcu_dereference(np->opt);
 	if (opt) {
-		opt = ipv6_dup_options(newsk, opt);
+		opt = ipeh_dup_options(newsk, opt);
 		if (!opt)
 			pr_err("%s: Failed to copy ip options\n", __func__);
 	}
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v8 net-next 4/9] ipeh: Generic TLV parser
  2019-12-26 22:51 [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure Tom Herbert
                   ` (2 preceding siblings ...)
  2019-12-26 22:51 ` [PATCH v8 net-next 3/9] ipeh: Move generic EH functions to exthdrs_common.c Tom Herbert
@ 2019-12-26 22:51 ` Tom Herbert
  2019-12-26 22:51 ` [PATCH v8 net-next 5/9] ipeh: Add callback to ipeh_parse_tlv to handle errors Tom Herbert
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Tom Herbert @ 2019-12-26 22:51 UTC (permalink / raw)
  To: davem, netdev, simon.horman, willemdebruijn.kernel
  Cc: Tom Herbert, Tom Herbert

From: Tom Herbert <tom@quantonium.net>

Create a generic TLV parser. This will be used with various
extension headers that carry options including Destination,
Hop-by-Hop, Segment Routing TLVs, and other cases of simple
stateless TLV parsing.

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 include/net/ipeh.h        |  11 ++++
 net/ipv6/exthdrs.c        | 142 ++--------------------------------------------
 net/ipv6/exthdrs_common.c | 137 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 154 insertions(+), 136 deletions(-)

diff --git a/include/net/ipeh.h b/include/net/ipeh.h
index 3b24831..81f92f8 100644
--- a/include/net/ipeh.h
+++ b/include/net/ipeh.h
@@ -31,4 +31,15 @@ struct ipv6_txoptions *ipeh_renew_options(struct sock *sk,
 struct ipv6_txoptions *ipeh_fixup_options(struct ipv6_txoptions *opt_space,
 					  struct ipv6_txoptions *opt);
 
+/* The generic TLV parser assumes that the type value of PAD1 is 0, and PADN
+ * is 1. This is true for IPv6 Destination and Hop-by-Hop Options. For Segment
+ * Routing TLVs, PAD1 is also 0, however PADN is 4 so the latter necessitates
+ * some change to the parser to support Segment Routing TLVs.
+ */
+#define IPEH_TLV_PAD1	0
+#define IPEH_TLV_PADN	1
+
+bool ipeh_parse_tlv(const struct tlvtype_proc *procs, struct sk_buff *skb,
+		    int max_count, int off, int len);
+
 #endif /* _NET_IPEH_H */
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index 86e562c..7b4183c 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -54,138 +54,6 @@
   Generic functions
  *********************/
 
-/* An unknown option is detected, decide what to do */
-
-static bool ip6_tlvopt_unknown(struct sk_buff *skb, int optoff,
-			       bool disallow_unknowns)
-{
-	if (disallow_unknowns) {
-		/* If unknown TLVs are disallowed by configuration
-		 * then always silently drop packet. Note this also
-		 * means no ICMP parameter problem is sent which
-		 * could be a good property to mitigate a reflection DOS
-		 * attack.
-		 */
-
-		goto drop;
-	}
-
-	switch ((skb_network_header(skb)[optoff] & 0xC0) >> 6) {
-	case 0: /* ignore */
-		return true;
-
-	case 1: /* drop packet */
-		break;
-
-	case 3: /* Send ICMP if not a multicast address and drop packet */
-		/* Actually, it is redundant check. icmp_send
-		   will recheck in any case.
-		 */
-		if (ipv6_addr_is_multicast(&ipv6_hdr(skb)->daddr))
-			break;
-		/* fall through */
-	case 2: /* send ICMP PARM PROB regardless and drop packet */
-		icmpv6_param_prob(skb, ICMPV6_UNK_OPTION, optoff);
-		return false;
-	}
-
-drop:
-	kfree_skb(skb);
-	return false;
-}
-
-/* Parse tlv encoded option header (hop-by-hop or destination) */
-
-static bool ip6_parse_tlv(const struct tlvtype_proc *procs,
-			  struct sk_buff *skb,
-			  int max_count)
-{
-	int len = (skb_transport_header(skb)[1] + 1) << 3;
-	const unsigned char *nh = skb_network_header(skb);
-	int off = skb_network_header_len(skb);
-	const struct tlvtype_proc *curr;
-	bool disallow_unknowns = false;
-	int tlv_count = 0;
-	int padlen = 0;
-
-	if (unlikely(max_count < 0)) {
-		disallow_unknowns = true;
-		max_count = -max_count;
-	}
-
-	if (skb_transport_offset(skb) + len > skb_headlen(skb))
-		goto bad;
-
-	off += 2;
-	len -= 2;
-
-	while (len > 0) {
-		int optlen = nh[off + 1] + 2;
-		int i;
-
-		switch (nh[off]) {
-		case IPV6_TLV_PAD1:
-			optlen = 1;
-			padlen++;
-			if (padlen > 7)
-				goto bad;
-			break;
-
-		case IPV6_TLV_PADN:
-			/* RFC 2460 states that the purpose of PadN is
-			 * to align the containing header to multiples
-			 * of 8. 7 is therefore the highest valid value.
-			 * See also RFC 4942, Section 2.1.9.5.
-			 */
-			padlen += optlen;
-			if (padlen > 7)
-				goto bad;
-			/* RFC 4942 recommends receiving hosts to
-			 * actively check PadN payload to contain
-			 * only zeroes.
-			 */
-			for (i = 2; i < optlen; i++) {
-				if (nh[off + i] != 0)
-					goto bad;
-			}
-			break;
-
-		default: /* Other TLV code so scan list */
-			if (optlen > len)
-				goto bad;
-
-			tlv_count++;
-			if (tlv_count > max_count)
-				goto bad;
-
-			for (curr = procs; curr->type >= 0; curr++) {
-				if (curr->type == nh[off]) {
-					/* type specific length/alignment
-					   checks will be performed in the
-					   func(). */
-					if (curr->func(skb, off) == false)
-						return false;
-					break;
-				}
-			}
-			if (curr->type < 0 &&
-			    !ip6_tlvopt_unknown(skb, off, disallow_unknowns))
-				return false;
-
-			padlen = 0;
-			break;
-		}
-		off += optlen;
-		len -= optlen;
-	}
-
-	if (len == 0)
-		return true;
-bad:
-	kfree_skb(skb);
-	return false;
-}
-
 static int ipv6_destopt_rcv(struct sk_buff *skb)
 {
 	struct inet6_skb_parm *opt = IP6CB(skb);
@@ -213,8 +81,9 @@ static int ipv6_destopt_rcv(struct sk_buff *skb)
 	dstbuf = opt->dst1;
 #endif
 
-	if (ip6_parse_tlv(tlvprocdestopt_lst, skb,
-			  init_net.ipv6.sysctl.max_dst_opts_cnt)) {
+	if (ipeh_parse_tlv(tlvprocdestopt_lst, skb,
+			   init_net.ipv6.sysctl.max_dst_opts_cnt,
+			   2, extlen - 2)) {
 		skb->transport_header += extlen;
 		opt = IP6CB(skb);
 #if IS_ENABLED(CONFIG_IPV6_MIP6)
@@ -638,8 +507,9 @@ int ipv6_parse_hopopts(struct sk_buff *skb)
 		goto fail_and_free;
 
 	opt->flags |= IP6SKB_HOPBYHOP;
-	if (ip6_parse_tlv(tlvprochopopt_lst, skb,
-			  init_net.ipv6.sysctl.max_hbh_opts_cnt)) {
+	if (ipeh_parse_tlv(tlvprochopopt_lst, skb,
+			   init_net.ipv6.sysctl.max_hbh_opts_cnt,
+			   2, extlen - 2)) {
 		skb->transport_header += extlen;
 		opt = IP6CB(skb);
 		opt->nhoff = sizeof(struct ipv6hdr);
diff --git a/net/ipv6/exthdrs_common.c b/net/ipv6/exthdrs_common.c
index 2c68184..d0c4ec3 100644
--- a/net/ipv6/exthdrs_common.c
+++ b/net/ipv6/exthdrs_common.c
@@ -142,3 +142,140 @@ struct ipv6_txoptions *ipeh_fixup_options(struct ipv6_txoptions *opt_space,
 	return opt;
 }
 EXPORT_SYMBOL_GPL(ipeh_fixup_options);
+
+/* An unknown option is detected, decide what to do */
+
+static bool ip6_tlvopt_unknown(struct sk_buff *skb, int optoff,
+			       bool disallow_unknowns)
+{
+	if (disallow_unknowns) {
+		/* If unknown TLVs are disallowed by configuration
+		 * then always silently drop packet. Note this also
+		 * means no ICMP parameter problem is sent which
+		 * could be a good property to mitigate a reflection DOS
+		 * attack.
+		 */
+
+		goto drop;
+	}
+
+	switch ((skb_network_header(skb)[optoff] & 0xC0) >> 6) {
+	case 0: /* ignore */
+		return true;
+
+	case 1: /* drop packet */
+		break;
+
+	case 3: /* Send ICMP if not a multicast address and drop packet */
+		/* Actually, it is redundant check. icmp_send
+		 * will recheck in any case.
+		 */
+		if (ipv6_addr_is_multicast(&ipv6_hdr(skb)->daddr))
+			break;
+		/* fall through */
+	case 2: /* send ICMP PARM PROB regardless and drop packet */
+		icmpv6_param_prob(skb, ICMPV6_UNK_OPTION, optoff);
+		return false;
+	}
+
+drop:
+	kfree_skb(skb);
+	return false;
+}
+
+/* Generic extension header TLV parser
+ *
+ * Arguments:
+ *   - skb_transport_header points to the extension header containing options
+ *   - off is offset from skb_transport_header where first TLV is
+ *   - len is length of TLV block
+ */
+bool ipeh_parse_tlv(const struct tlvtype_proc *procs, struct sk_buff *skb,
+		    int max_count, int off, int len)
+{
+	const unsigned char *nh = skb_network_header(skb);
+	const struct tlvtype_proc *curr;
+	bool disallow_unknowns = false;
+	int tlv_count = 0;
+	int padlen = 0;
+
+	if (unlikely(max_count < 0)) {
+		disallow_unknowns = true;
+		max_count = -max_count;
+	}
+
+	if (skb_transport_offset(skb) + off + len > skb_headlen(skb))
+		goto bad;
+
+	/* ops function based offset on network header */
+	off += skb_network_header_len(skb);
+
+	while (len > 0) {
+		int optlen = nh[off + 1] + 2;
+		int i;
+
+		switch (nh[off]) {
+		case IPEH_TLV_PAD1:
+			optlen = 1;
+			padlen++;
+			if (padlen > 7)
+				goto bad;
+			break;
+
+		case IPEH_TLV_PADN:
+			/* RFC 2460 states that the purpose of PadN is
+			 * to align the containing header to multiples
+			 * of 8. 7 is therefore the highest valid value.
+			 * See also RFC 4942, Section 2.1.9.5.
+			 */
+			padlen += optlen;
+			if (padlen > 7)
+				goto bad;
+
+			/* RFC 4942 recommends receiving hosts to
+			 * actively check PadN payload to contain
+			 * only zeroes.
+			 */
+			for (i = 2; i < optlen; i++) {
+				if (nh[off + i] != 0)
+					goto bad;
+			}
+			break;
+
+		default: /* Other TLV code so scan list */
+			if (optlen > len)
+				goto bad;
+
+			tlv_count++;
+			if (tlv_count > max_count)
+				goto bad;
+
+			for (curr = procs; curr->type >= 0; curr++) {
+				if (curr->type == nh[off]) {
+					/* type specific length/alignment
+					 * checks will be performed in the
+					 * func().
+					 */
+					if (curr->func(skb, off) == false)
+						return false;
+					break;
+				}
+			}
+			if (curr->type < 0 &&
+			    !ip6_tlvopt_unknown(skb, off, disallow_unknowns))
+				return false;
+
+			padlen = 0;
+			break;
+		}
+		off += optlen;
+		len -= optlen;
+	}
+
+	if (len == 0)
+		return true;
+bad:
+	kfree_skb(skb);
+	return false;
+}
+EXPORT_SYMBOL(ipeh_parse_tlv);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v8 net-next 5/9] ipeh: Add callback to ipeh_parse_tlv to handle errors
  2019-12-26 22:51 [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure Tom Herbert
                   ` (3 preceding siblings ...)
  2019-12-26 22:51 ` [PATCH v8 net-next 4/9] ipeh: Generic TLV parser Tom Herbert
@ 2019-12-26 22:51 ` Tom Herbert
  2019-12-26 22:51 ` [PATCH v8 net-next 6/9] ip6tlvs: Registration of TLV handlers and parameters Tom Herbert
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Tom Herbert @ 2019-12-26 22:51 UTC (permalink / raw)
  To: davem, netdev, simon.horman, willemdebruijn.kernel
  Cc: Tom Herbert, Tom Herbert

From: Tom Herbert <tom@quantonium.net>

Add a callback function to ipeh_parse_tlv so that the caller
can handler parsing errors in an appropriate way.

The enum ipeh_parse_errors contains the various errors and the
particular error is an argument of the callback.

ipv6_parse_error is the callback function for parsing IPv6 TLVs.
This mostly subsumes the functionality of ip6_tlvopt_unknown which
is removed.

The callback is called at various points in ipeh_parse_tlv when an
error is encountered. If the callback returns false then the packet
is discarded, else on true being returned the error is not fatal and
TLV processing continues (for instance, an unknown TLV is to be
ignored).

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 include/net/ipeh.h        | 17 +++++++++-
 net/ipv6/exthdrs.c        | 55 ++++++++++++++++++++++++++++++--
 net/ipv6/exthdrs_common.c | 80 ++++++++++++++++++-----------------------------
 3 files changed, 99 insertions(+), 53 deletions(-)

diff --git a/include/net/ipeh.h b/include/net/ipeh.h
index 81f92f8..3d2bec6 100644
--- a/include/net/ipeh.h
+++ b/include/net/ipeh.h
@@ -31,6 +31,19 @@ struct ipv6_txoptions *ipeh_renew_options(struct sock *sk,
 struct ipv6_txoptions *ipeh_fixup_options(struct ipv6_txoptions *opt_space,
 					  struct ipv6_txoptions *opt);
 
+/* Generic extension header TLV parser */
+
+enum ipeh_parse_errors {
+	IPEH_PARSE_ERR_PAD1,		/* Excessive PAD1 */
+	IPEH_PARSE_ERR_PADN,		/* Excessive PADN */
+	IPEH_PARSE_ERR_PADNZ,		/* Non-zero padding data */
+	IPEH_PARSE_ERR_EH_TOOBIG,	/* Length of EH exceeds limit */
+	IPEH_PARSE_ERR_OPT_TOOBIG,	/* Option size exceeds limit */
+	IPEH_PARSE_ERR_OPT_TOOMANY,	/* Option count exceeds limit */
+	IPEH_PARSE_ERR_OPT_UNK_DISALW,	/* Unknown option disallowed */
+	IPEH_PARSE_ERR_OPT_UNK,		/* Unknown option */
+};
+
 /* The generic TLV parser assumes that the type value of PAD1 is 0, and PADN
  * is 1. This is true for IPv6 Destination and Hop-by-Hop Options. For Segment
  * Routing TLVs, PAD1 is also 0, however PADN is 4 so the latter necessitates
@@ -40,6 +53,8 @@ struct ipv6_txoptions *ipeh_fixup_options(struct ipv6_txoptions *opt_space,
 #define IPEH_TLV_PADN	1
 
 bool ipeh_parse_tlv(const struct tlvtype_proc *procs, struct sk_buff *skb,
-		    int max_count, int off, int len);
+		    int max_count, int off, int len,
+		    bool (*parse_error)(struct sk_buff *skb,
+					int off, enum ipeh_parse_errors error));
 
 #endif /* _NET_IPEH_H */
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index 7b4183c..e23955c 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -54,6 +54,57 @@
   Generic functions
  *********************/
 
+/* Handle parse errors from ipeh generic TLV parser */
+static bool ipv6_parse_error(struct sk_buff *skb, int off,
+			     enum ipeh_parse_errors error)
+{
+	switch (error) {
+	case IPEH_PARSE_ERR_OPT_UNK_DISALW:
+		/* If unknown TLVs are disallowed by configuration
+		 * then always silently drop packet. Note this also
+		 * means no ICMP parameter problem is sent which
+		 * could be a good property to mitigate a reflection DOS
+		 * attack.
+		 */
+
+		break;
+	case IPEH_PARSE_ERR_OPT_UNK:
+		/* High order two bits of option type indicate action to
+		 * take on unrecognized option.
+		 */
+		switch ((skb_network_header(skb)[off] & 0xC0) >> 6) {
+		case 0:
+			/* ignore */
+			return true;
+
+		case 1: /* drop packet */
+			break;
+
+		case 3: /* Send ICMP if not a multicast address and drop packet
+			 *
+			 * Actually, it is redundant check. icmp_send
+			 * will recheck in any case.
+			 */
+			if (ipv6_addr_is_multicast(&ipv6_hdr(skb)->daddr))
+				break;
+
+			/* fall through */
+		case 2: /* send ICMP PARM PROB regardless and drop packet */
+			icmpv6_send(skb, ICMPV6_PARAMPROB,
+				    ICMPV6_UNK_OPTION, off);
+			break;
+		}
+		break;
+	default:
+		/* Silent drop on other errors */
+
+		break;
+	}
+
+	/* Will be dropping packet */
+	return false;
+}
+
 static int ipv6_destopt_rcv(struct sk_buff *skb)
 {
 	struct inet6_skb_parm *opt = IP6CB(skb);
@@ -83,7 +134,7 @@ static int ipv6_destopt_rcv(struct sk_buff *skb)
 
 	if (ipeh_parse_tlv(tlvprocdestopt_lst, skb,
 			   init_net.ipv6.sysctl.max_dst_opts_cnt,
-			   2, extlen - 2)) {
+			   2, extlen - 2, ipv6_parse_error)) {
 		skb->transport_header += extlen;
 		opt = IP6CB(skb);
 #if IS_ENABLED(CONFIG_IPV6_MIP6)
@@ -509,7 +560,7 @@ int ipv6_parse_hopopts(struct sk_buff *skb)
 	opt->flags |= IP6SKB_HOPBYHOP;
 	if (ipeh_parse_tlv(tlvprochopopt_lst, skb,
 			   init_net.ipv6.sysctl.max_hbh_opts_cnt,
-			   2, extlen - 2)) {
+			   2, extlen - 2, ipv6_parse_error)) {
 		skb->transport_header += extlen;
 		opt = IP6CB(skb);
 		opt->nhoff = sizeof(struct ipv6hdr);
diff --git a/net/ipv6/exthdrs_common.c b/net/ipv6/exthdrs_common.c
index d0c4ec3..4d8a799 100644
--- a/net/ipv6/exthdrs_common.c
+++ b/net/ipv6/exthdrs_common.c
@@ -143,55 +143,18 @@ struct ipv6_txoptions *ipeh_fixup_options(struct ipv6_txoptions *opt_space,
 }
 EXPORT_SYMBOL_GPL(ipeh_fixup_options);
 
-/* An unknown option is detected, decide what to do */
-
-static bool ip6_tlvopt_unknown(struct sk_buff *skb, int optoff,
-			       bool disallow_unknowns)
-{
-	if (disallow_unknowns) {
-		/* If unknown TLVs are disallowed by configuration
-		 * then always silently drop packet. Note this also
-		 * means no ICMP parameter problem is sent which
-		 * could be a good property to mitigate a reflection DOS
-		 * attack.
-		 */
-
-		goto drop;
-	}
-
-	switch ((skb_network_header(skb)[optoff] & 0xC0) >> 6) {
-	case 0: /* ignore */
-		return true;
-
-	case 1: /* drop packet */
-		break;
-
-	case 3: /* Send ICMP if not a multicast address and drop packet */
-		/* Actually, it is redundant check. icmp_send
-		 * will recheck in any case.
-		 */
-		if (ipv6_addr_is_multicast(&ipv6_hdr(skb)->daddr))
-			break;
-		/* fall through */
-	case 2: /* send ICMP PARM PROB regardless and drop packet */
-		icmpv6_param_prob(skb, ICMPV6_UNK_OPTION, optoff);
-		return false;
-	}
-
-drop:
-	kfree_skb(skb);
-	return false;
-}
-
 /* Generic extension header TLV parser
  *
  * Arguments:
  *   - skb_transport_header points to the extension header containing options
  *   - off is offset from skb_transport_header where first TLV is
  *   - len is length of TLV block
+ *   - parse_error is protocol specific function to handle parser errors
  */
 bool ipeh_parse_tlv(const struct tlvtype_proc *procs, struct sk_buff *skb,
-		    int max_count, int off, int len)
+		    int max_count, int off, int len,
+		    bool (*parse_error)(struct sk_buff *skb,
+					int off, enum ipeh_parse_errors error))
 {
 	const unsigned char *nh = skb_network_header(skb);
 	const struct tlvtype_proc *curr;
@@ -204,8 +167,15 @@ bool ipeh_parse_tlv(const struct tlvtype_proc *procs, struct sk_buff *skb,
 		max_count = -max_count;
 	}
 
-	if (skb_transport_offset(skb) + off + len > skb_headlen(skb))
-		goto bad;
+	if (skb_transport_offset(skb) + off + len > skb_headlen(skb)) {
+		if (!parse_error(skb, skb_transport_offset(skb),
+				 IPEH_PARSE_ERR_EH_TOOBIG)) {
+			kfree_skb(skb);
+			return false;
+		}
+
+		len = skb_headlen(skb) - skb_transport_offset(skb) - off;
+	}
 
 	/* ops function based offset on network header */
 	off += skb_network_header_len(skb);
@@ -218,8 +188,10 @@ bool ipeh_parse_tlv(const struct tlvtype_proc *procs, struct sk_buff *skb,
 		case IPEH_TLV_PAD1:
 			optlen = 1;
 			padlen++;
-			if (padlen > 7)
+			if (padlen > 7 &&
+			    !parse_error(skb, off, IPEH_PARSE_ERR_PAD1))
 				goto bad;
+
 			break;
 
 		case IPEH_TLV_PADN:
@@ -229,7 +201,8 @@ bool ipeh_parse_tlv(const struct tlvtype_proc *procs, struct sk_buff *skb,
 			 * See also RFC 4942, Section 2.1.9.5.
 			 */
 			padlen += optlen;
-			if (padlen > 7)
+			if (padlen > 7 &&
+			    !parse_error(skb, off, IPEH_PARSE_ERR_PADN))
 				goto bad;
 
 			/* RFC 4942 recommends receiving hosts to
@@ -237,17 +210,21 @@ bool ipeh_parse_tlv(const struct tlvtype_proc *procs, struct sk_buff *skb,
 			 * only zeroes.
 			 */
 			for (i = 2; i < optlen; i++) {
-				if (nh[off + i] != 0)
+				if (nh[off + i] != 0 &&
+				    !parse_error(skb, off + i,
+						 IPEH_PARSE_ERR_PADNZ))
 					goto bad;
 			}
 			break;
 
 		default: /* Other TLV code so scan list */
-			if (optlen > len)
+			if (optlen > len &&
+			    !parse_error(skb, off, IPEH_PARSE_ERR_OPT_TOOBIG))
 				goto bad;
 
 			tlv_count++;
-			if (tlv_count > max_count)
+			if (tlv_count > max_count &&
+			    !parse_error(skb, off, IPEH_PARSE_ERR_OPT_TOOMANY))
 				goto bad;
 
 			for (curr = procs; curr->type >= 0; curr++) {
@@ -262,8 +239,11 @@ bool ipeh_parse_tlv(const struct tlvtype_proc *procs, struct sk_buff *skb,
 				}
 			}
 			if (curr->type < 0 &&
-			    !ip6_tlvopt_unknown(skb, off, disallow_unknowns))
-				return false;
+			    !parse_error(skb, off,
+					 disallow_unknowns ?
+						IPEH_PARSE_ERR_OPT_UNK_DISALW :
+						IPEH_PARSE_ERR_OPT_UNK))
+				goto bad;
 
 			padlen = 0;
 			break;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v8 net-next 6/9] ip6tlvs: Registration of TLV handlers and parameters
  2019-12-26 22:51 [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure Tom Herbert
                   ` (4 preceding siblings ...)
  2019-12-26 22:51 ` [PATCH v8 net-next 5/9] ipeh: Add callback to ipeh_parse_tlv to handle errors Tom Herbert
@ 2019-12-26 22:51 ` Tom Herbert
  2019-12-26 22:51 ` [PATCH v8 net-next 7/9] ip6tlvs: Add TX parameters Tom Herbert
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Tom Herbert @ 2019-12-26 22:51 UTC (permalink / raw)
  To: davem, netdev, simon.horman, willemdebruijn.kernel
  Cc: Tom Herbert, Tom Herbert

From: Tom Herbert <tom@quantonium.net>

Create a single TLV parameter table that holds meta information for IPv6
Hop-by-Hop and Destination TLVs. The data structure is composed of a 256
element array of u8's (one entry for each TLV type to allow O(1)
lookup). Each entry provides an offset into an array of TLV proc data
structures which follows the array of u8s. The TLV proc data structure
contains parameters and handler functions for receiving and transmitting
TLVs. The zero'th element in the TLV proc array provides default
parameters for TLVs.

A class attribute indicates the type of extension header in which the
TLV may be used (e.g. Hop-by-Hop options, Destination options, or
Destination options before the routing header).

Functions are defined to manipulate entries in the TLV parameter table.

* tlv_{set|unset}_proc set a TLV proc entry (ops and parameters)
* tlv_{set|unset}_params set parameters only

Receive TLV lookup and processing is modified to be a lookup in the TLV
parameter table. An init table containing parameters for TLVs supported
by the kernel is used to initialize the TLV table.

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 include/net/ipeh.h         | 107 +++++++++++++++++--
 include/net/ipv6.h         |   3 +
 include/uapi/linux/ipeh.h  |  16 +++
 net/ipv6/exthdrs.c         |  14 ++-
 net/ipv6/exthdrs_common.c  | 261 ++++++++++++++++++++++++++++++++++++++++++---
 net/ipv6/exthdrs_options.c |  63 +++++++----
 6 files changed, 416 insertions(+), 48 deletions(-)
 create mode 100644 include/uapi/linux/ipeh.h

diff --git a/include/net/ipeh.h b/include/net/ipeh.h
index 3d2bec6..fc7d543 100644
--- a/include/net/ipeh.h
+++ b/include/net/ipeh.h
@@ -11,13 +11,105 @@
  *     and false, if it failed.
  *     It MUST NOT touch skb->h.
  */
-struct tlvtype_proc {
-	int	type;
-	bool	(*func)(struct sk_buff *skb, int offset);
+struct tlv_ops {
+	bool	(*func)(unsigned int class, struct sk_buff *skb, int offset);
 };
 
-extern const struct tlvtype_proc tlvprocdestopt_lst[];
-extern const struct tlvtype_proc tlvprochopopt_lst[];
+struct tlv_rx_params {
+	unsigned char class : 4;
+};
+
+struct tlv_tx_params {
+};
+
+struct tlv_params {
+	struct tlv_rx_params r;
+	struct tlv_tx_params t;
+};
+
+struct tlv_proc {
+	struct tlv_ops ops;
+	struct tlv_params params;
+};
+
+struct tlv_type {
+	struct tlv_proc proc;
+};
+
+struct tlv_proc_init {
+	int type;
+	struct tlv_proc proc;
+};
+
+struct tlv_param_table_data {
+	unsigned char entries[256];
+	unsigned char count;
+	struct rcu_head rcu;
+	struct tlv_type types[0];
+};
+
+struct tlv_param_table {
+	struct tlv_param_table_data __rcu *data;
+};
+
+extern struct tlv_param_table ipv6_tlv_param_table;
+
+int __ipeh_tlv_set(struct tlv_param_table *tlv_param_table,
+		   unsigned char type, const struct tlv_params *params,
+		   const struct tlv_ops *ops);
+
+static inline int ipeh_tlv_set_params(struct tlv_param_table *tlv_param_table,
+				      unsigned char type,
+				      const struct tlv_params *params)
+{
+	return __ipeh_tlv_set(tlv_param_table, type, params, NULL);
+}
+
+static inline int ipeh_tlv_set_proc(struct tlv_param_table *tlv_param_table,
+			       unsigned char type,
+			       const struct tlv_proc *proc)
+{
+	return __ipeh_tlv_set(tlv_param_table, type,
+			      &proc->params, &proc->ops);
+}
+
+int __ipeh_tlv_unset(struct tlv_param_table *tlv_param_table,
+		     unsigned char type, bool params_only);
+
+static inline int ipeh_tlv_unset_params(struct tlv_param_table *tlv_param_table,
+					unsigned char type)
+{
+	return __ipeh_tlv_unset(tlv_param_table, type, true);
+}
+
+static inline int ipeh_tlv_unset_proc(struct tlv_param_table *tlv_param_table,
+				      unsigned char type)
+{
+	return __ipeh_tlv_unset(tlv_param_table, type, false);
+}
+
+/* ipeh_tlv_get_proc_by_type assumes rcu_read_lock is held */
+static inline struct tlv_proc *ipeh_tlv_get_proc_by_type(
+		struct tlv_param_table *tlv_param_table, unsigned char type)
+{
+	struct tlv_param_table_data *tpt =
+				rcu_dereference(tlv_param_table->data);
+
+	return &tpt->types[tpt->entries[type]].proc;
+}
+
+/* ipeh_tlv_get_proc assumes rcu_read_lock is held */
+static inline struct tlv_proc *ipeh_tlv_get_proc(
+		struct tlv_param_table *tlv_param_table,
+		const __u8 *tlv)
+{
+	return ipeh_tlv_get_proc_by_type(tlv_param_table, tlv[0]);
+}
+
+int ipeh_exthdrs_init(struct tlv_param_table *tlv_param_table,
+		      const struct tlv_proc_init *init_params,
+		      int num_init_params);
+void ipeh_exthdrs_fini(struct tlv_param_table *tlv_param_table);
 
 struct ipv6_txoptions;
 struct ipv6_opt_hdr;
@@ -52,8 +144,9 @@ enum ipeh_parse_errors {
 #define IPEH_TLV_PAD1	0
 #define IPEH_TLV_PADN	1
 
-bool ipeh_parse_tlv(const struct tlvtype_proc *procs, struct sk_buff *skb,
-		    int max_count, int off, int len,
+bool ipeh_parse_tlv(unsigned int class,
+		    struct tlv_param_table *tlv_param_table,
+		    struct sk_buff *skb, int max_count, int off, int len,
 		    bool (*parse_error)(struct sk_buff *skb,
 					int off, enum ipeh_parse_errors error));
 
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 6c4c834..e290e90 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -429,6 +429,9 @@ int ip6_ra_control(struct sock *sk, int sel);
 
 int ipv6_parse_hopopts(struct sk_buff *skb);
 
+int ipv6_exthdrs_options_init(void);
+void ipv6_exthdrs_options_exit(void);
+
 bool ipv6_opt_accepted(const struct sock *sk, const struct sk_buff *skb,
 		       const struct inet6_skb_parm *opt);
 struct ipv6_txoptions *ipv6_update_options(struct sock *sk,
diff --git a/include/uapi/linux/ipeh.h b/include/uapi/linux/ipeh.h
new file mode 100644
index 0000000..c4302b7
--- /dev/null
+++ b/include/uapi/linux/ipeh.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/* ipeh.h - IP extension header TLV management */
+
+#ifndef _UAPI_LINUX_IPEH_H
+#define _UAPI_LINUX_IPEH_H
+
+/* Flags for EH type that can use a TLV option */
+#define IPEH_TLV_CLASS_FLAG_HOPOPT	BIT(0)
+#define IPEH_TLV_CLASS_FLAG_RTRDSTOPT	BIT(1)
+#define IPEH_TLV_CLASS_FLAG_DSTOPT	BIT(2)
+
+#define IPEH_TLV_CLASS_FLAG_MASK (IPEH_TLV_CLASS_FLAG_HOPOPT |		\
+				  IPEH_TLV_CLASS_FLAG_RTRDSTOPT |	\
+				  IPEH_TLV_CLASS_FLAG_DSTOPT)
+
+#endif /* _UAPI_LINUX_IPEH_H */
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index e23955c..ca7a9c0 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -47,6 +47,7 @@
 #ifdef CONFIG_IPV6_SEG6_HMAC
 #include <net/seg6_hmac.h>
 #endif
+#include <uapi/linux/ipeh.h>
 
 #include <linux/uaccess.h>
 
@@ -132,7 +133,8 @@ static int ipv6_destopt_rcv(struct sk_buff *skb)
 	dstbuf = opt->dst1;
 #endif
 
-	if (ipeh_parse_tlv(tlvprocdestopt_lst, skb,
+	if (ipeh_parse_tlv(IPEH_TLV_CLASS_FLAG_DSTOPT,
+			   &ipv6_tlv_param_table, skb,
 			   init_net.ipv6.sysctl.max_dst_opts_cnt,
 			   2, extlen - 2, ipv6_parse_error)) {
 		skb->transport_header += extlen;
@@ -517,8 +519,13 @@ int __init ipv6_exthdrs_init(void)
 	if (ret)
 		goto out_destopt;
 
+	ret = ipv6_exthdrs_options_init();
+	if (ret)
+		goto out_nodata;
 out:
 	return ret;
+out_nodata:
+	inet6_del_protocol(&nodata_protocol, IPPROTO_NONE);
 out_destopt:
 	inet6_del_protocol(&destopt_protocol, IPPROTO_DSTOPTS);
 out_rthdr:
@@ -528,6 +535,7 @@ int __init ipv6_exthdrs_init(void)
 
 void ipv6_exthdrs_exit(void)
 {
+	ipv6_exthdrs_options_exit();
 	inet6_del_protocol(&nodata_protocol, IPPROTO_NONE);
 	inet6_del_protocol(&destopt_protocol, IPPROTO_DSTOPTS);
 	inet6_del_protocol(&rthdr_protocol, IPPROTO_ROUTING);
@@ -558,8 +566,8 @@ int ipv6_parse_hopopts(struct sk_buff *skb)
 		goto fail_and_free;
 
 	opt->flags |= IP6SKB_HOPBYHOP;
-	if (ipeh_parse_tlv(tlvprochopopt_lst, skb,
-			   init_net.ipv6.sysctl.max_hbh_opts_cnt,
+	if (ipeh_parse_tlv(IPEH_TLV_CLASS_FLAG_HOPOPT, &ipv6_tlv_param_table,
+			   skb, init_net.ipv6.sysctl.max_hbh_opts_cnt,
 			   2, extlen - 2, ipv6_parse_error)) {
 		skb->transport_header += extlen;
 		opt = IP6CB(skb);
diff --git a/net/ipv6/exthdrs_common.c b/net/ipv6/exthdrs_common.c
index 4d8a799..f36513d 100644
--- a/net/ipv6/exthdrs_common.c
+++ b/net/ipv6/exthdrs_common.c
@@ -151,14 +151,15 @@ EXPORT_SYMBOL_GPL(ipeh_fixup_options);
  *   - len is length of TLV block
  *   - parse_error is protocol specific function to handle parser errors
  */
-bool ipeh_parse_tlv(const struct tlvtype_proc *procs, struct sk_buff *skb,
-		    int max_count, int off, int len,
+bool ipeh_parse_tlv(unsigned int class,
+		    struct tlv_param_table *tlv_param_table,
+		    struct sk_buff *skb, int max_count, int off, int len,
 		    bool (*parse_error)(struct sk_buff *skb,
 					int off, enum ipeh_parse_errors error))
 {
 	const unsigned char *nh = skb_network_header(skb);
-	const struct tlvtype_proc *curr;
 	bool disallow_unknowns = false;
+	const struct tlv_proc *curr;
 	int tlv_count = 0;
 	int padlen = 0;
 
@@ -180,6 +181,8 @@ bool ipeh_parse_tlv(const struct tlvtype_proc *procs, struct sk_buff *skb,
 	/* ops function based offset on network header */
 	off += skb_network_header_len(skb);
 
+	rcu_read_lock();
+
 	while (len > 0) {
 		int optlen = nh[off + 1] + 2;
 		int i;
@@ -227,19 +230,17 @@ bool ipeh_parse_tlv(const struct tlvtype_proc *procs, struct sk_buff *skb,
 			    !parse_error(skb, off, IPEH_PARSE_ERR_OPT_TOOMANY))
 				goto bad;
 
-			for (curr = procs; curr->type >= 0; curr++) {
-				if (curr->type == nh[off]) {
-					/* type specific length/alignment
-					 * checks will be performed in the
-					 * func().
-					 */
-					if (curr->func(skb, off) == false)
-						return false;
-					break;
+			curr = ipeh_tlv_get_proc(tlv_param_table, &nh[off]);
+			if ((curr->params.r.class & class) && curr->ops.func) {
+				/* Handler will apply additional checks to
+				 * the TLV
+				 */
+				if (!curr->ops.func(class, skb, off)) {
+					/* Ops function frees skb on failure */
+					rcu_read_unlock();
+					return false;
 				}
-			}
-			if (curr->type < 0 &&
-			    !parse_error(skb, off,
+			} else if (!parse_error(skb, off,
 					 disallow_unknowns ?
 						IPEH_PARSE_ERR_OPT_UNK_DISALW :
 						IPEH_PARSE_ERR_OPT_UNK))
@@ -252,10 +253,238 @@ bool ipeh_parse_tlv(const struct tlvtype_proc *procs, struct sk_buff *skb,
 		len -= optlen;
 	}
 
-	if (len == 0)
+	if (len == 0) {
+		rcu_read_unlock();
 		return true;
+	}
 bad:
+	rcu_read_unlock();
 	kfree_skb(skb);
 	return false;
 }
 EXPORT_SYMBOL(ipeh_parse_tlv);
+
+/* TLV parameter table functions and structures */
+
+/* Default (unset) values for TLV parameters */
+static const struct tlv_proc tlv_default_proc = {
+};
+
+static DEFINE_MUTEX(tlv_mutex);
+
+static size_t tlv_param_table_size(unsigned char count)
+{
+	return sizeof(struct tlv_param_table_data) +
+	    (count * sizeof(struct tlv_type));
+}
+
+static void tlv_param_table_release(struct rcu_head *rcu)
+{
+	struct tlv_param_table_data *tpt =
+	    container_of(rcu, struct tlv_param_table_data, rcu);
+
+	kvfree(tpt);
+}
+
+/* mutex held */
+static int __tlv_set_one(struct tlv_param_table *tlv_param_table,
+			 unsigned char type, const struct tlv_params *params,
+			 const struct tlv_ops *ops)
+{
+	struct tlv_param_table_data *tpt, *told;
+	struct tlv_type *ttype;
+
+	told = rcu_dereference_protected(tlv_param_table->data,
+					 lockdep_is_held(&tlv_mutex));
+
+	/* Create new TLV table. If there is no exsiting entry then we are
+	 * adding a new one to the table, else we're modifying an entry.
+	 */
+	tpt = kvmalloc(tlv_param_table_size(told->count + !told->entries[type]),
+		       GFP_KERNEL);
+	if (!tpt)
+		return -ENOMEM;
+
+	memcpy(tpt, told, tlv_param_table_size(told->count));
+
+	if (!told->entries[type]) {
+		memset(&tpt->types[told->count], 0, sizeof(struct tlv_type));
+		tpt->entries[type] = told->count;
+		tpt->count = told->count + 1;
+	}
+
+	ttype = &tpt->types[tpt->entries[type]];
+
+	ttype->proc.params = *params;
+	ttype->proc.ops = ops ? *ops : tlv_default_proc.ops;
+
+	rcu_assign_pointer(tlv_param_table->data, tpt);
+	call_rcu(&told->rcu, tlv_param_table_release);
+
+	return 0;
+}
+
+int __ipeh_tlv_set(struct tlv_param_table *tlv_param_table, unsigned char type,
+		   const struct tlv_params *params, const struct tlv_ops *ops)
+{
+	int retv;
+
+	if (type < 2)
+		return -EINVAL;
+
+	mutex_lock(&tlv_mutex);
+	retv = __tlv_set_one(tlv_param_table, type, params, ops);
+	mutex_unlock(&tlv_mutex);
+
+	return retv;
+}
+EXPORT_SYMBOL(__ipeh_tlv_set);
+
+/* mutex held */
+static int __tlv_unset_one(struct tlv_param_table *tlv_param_table,
+			   unsigned char type)
+{
+	struct tlv_param_table_data *tpt, *told;
+	unsigned int i, pos;
+
+	told = rcu_dereference_protected(tlv_param_table->data,
+					 lockdep_is_held(&tlv_mutex));
+
+	if (!told->entries[type])
+		return 0;
+
+	tpt = kvmalloc(tlv_param_table_size(told->count - 1),
+		       GFP_KERNEL);
+	if (!tpt)
+		return -ENOMEM;
+
+	pos = told->entries[type];
+
+	memcpy(tpt->types, told->types, pos * sizeof(struct tlv_type));
+	memcpy(&tpt->types[pos], &told->types[pos + 1],
+	       (told->count - pos - 1) * sizeof(struct tlv_type));
+
+	for (i = 0; i < 256; i++) {
+		if (told->entries[i] > pos)
+			tpt->entries[i] = told->entries[i] - 1;
+		else
+			tpt->entries[i] = told->entries[i];
+	}
+
+	/* Clear entry for type being unset (point to default params) */
+	tpt->entries[type] = 0;
+
+	tpt->count = told->count - 1;
+
+	rcu_assign_pointer(tlv_param_table->data, tpt);
+	call_rcu(&told->rcu, tlv_param_table_release);
+
+	return 0;
+}
+
+/* tlv_internal_proc_type is used to check it the TLV proc was set
+ * internally. This is deduced by checking if any operations are defined.
+ */
+static bool tlv_internal_proc_type(struct tlv_proc *proc)
+{
+	return !!proc->ops.func;
+}
+
+int __ipeh_tlv_unset(struct tlv_param_table *tlv_param_table,
+		     unsigned char type, bool params_only)
+{
+	struct tlv_proc *tproc;
+	int retv;
+
+	if (type < 2)
+		return -EINVAL;
+
+	mutex_lock(&tlv_mutex);
+
+	tproc = ipeh_tlv_get_proc_by_type(tlv_param_table, type);
+
+	if (params_only && tlv_internal_proc_type(tproc)) {
+		/* TLV was set by internal source, so maintain the
+		 * non-parameter fields (i.e. the operations).
+		 */
+		retv = __tlv_set_one(tlv_param_table, type,
+				     &tlv_default_proc.params,
+				     &tproc->ops);
+	} else {
+		retv = __tlv_unset_one(tlv_param_table, type);
+	}
+
+	mutex_unlock(&tlv_mutex);
+
+	return retv;
+}
+EXPORT_SYMBOL(__ipeh_tlv_unset);
+
+int ipeh_exthdrs_init(struct tlv_param_table *tlv_param_table,
+		      const struct tlv_proc_init *tlv_init_params,
+		      int num_init_params)
+{
+	struct tlv_param_table_data *tpt;
+	int pos = 0, i;
+	size_t tsize;
+
+	tsize = tlv_param_table_size(num_init_params + 1);
+
+	tpt = kvmalloc(tsize, GFP_KERNEL);
+	if (!tpt)
+		return -ENOMEM;
+
+	memset(tpt, 0, tsize);
+
+	/* Zeroth TLV proc entry is default */
+	tpt->types[pos++].proc = tlv_default_proc;
+
+	for (i = 0; i < num_init_params; i++, pos++) {
+		const struct tlv_proc_init *tpi = &tlv_init_params[i];
+
+		if (WARN_ON(tpi->type < 2)) {
+			 /* Padding TLV initialized? */
+			goto err_inval;
+		}
+		if (WARN_ON(tpt->entries[tpi->type])) {
+			/* TLV type already set */
+			goto err_inval;
+		}
+
+		tpt->types[pos].proc = tpi->proc;
+		tpt->entries[tpi->type] = pos;
+	}
+
+	tpt->count = pos;
+
+	RCU_INIT_POINTER(tlv_param_table->data, tpt);
+
+	return 0;
+
+err_inval:
+	kvfree(tpt);
+	return -EINVAL;
+}
+EXPORT_SYMBOL(ipeh_exthdrs_init);
+
+static void tlv_destroy_param_table(struct tlv_param_table *tlv_param_table)
+{
+	struct tlv_param_table_data *tpt;
+
+	mutex_lock(&tlv_mutex);
+
+	tpt = rcu_dereference_protected(tlv_param_table->data,
+					lockdep_is_held(&tlv_mutex));
+	if (tpt) {
+		rcu_assign_pointer(tlv_param_table->data, NULL);
+		call_rcu(&tpt->rcu, tlv_param_table_release);
+	}
+
+	mutex_unlock(&tlv_mutex);
+}
+
+void ipeh_exthdrs_fini(struct tlv_param_table *tlv_param_table)
+{
+	tlv_destroy_param_table(tlv_param_table);
+}
+EXPORT_SYMBOL(ipeh_exthdrs_fini);
diff --git a/net/ipv6/exthdrs_options.c b/net/ipv6/exthdrs_options.c
index 032e072..d4b373e 100644
--- a/net/ipv6/exthdrs_options.c
+++ b/net/ipv6/exthdrs_options.c
@@ -11,11 +11,12 @@
 #if IS_ENABLED(CONFIG_IPV6_MIP6)
 #include <net/xfrm.h>
 #endif
+#include <uapi/linux/ipeh.h>
 
 /* Destination options header */
 
 #if IS_ENABLED(CONFIG_IPV6_MIP6)
-static bool ipv6_dest_hao(struct sk_buff *skb, int optoff)
+static bool ipv6_dest_hao(unsigned int class, struct sk_buff *skb, int optoff)
 {
 	struct ipv6_destopt_hao *hao;
 	struct inet6_skb_parm *opt = IP6CB(skb);
@@ -74,16 +75,6 @@ static bool ipv6_dest_hao(struct sk_buff *skb, int optoff)
 }
 #endif
 
-const struct tlvtype_proc tlvprocdestopt_lst[] = {
-#if IS_ENABLED(CONFIG_IPV6_MIP6)
-	{
-		.type	= IPV6_TLV_HAO,
-		.func	= ipv6_dest_hao,
-	},
-#endif
-	{-1,			NULL}
-};
-
 /* Hop-by-hop options */
 
 /* Note: we cannot rely on skb_dst(skb) before we assign it in
@@ -102,7 +93,7 @@ static inline struct net *ipv6_skb_net(struct sk_buff *skb)
 
 /* Router Alert as of RFC 2711 */
 
-static bool ipv6_hop_ra(struct sk_buff *skb, int optoff)
+static bool ipv6_hop_ra(unsigned int class, struct sk_buff *skb, int optoff)
 {
 	const unsigned char *nh = skb_network_header(skb);
 
@@ -120,7 +111,7 @@ static bool ipv6_hop_ra(struct sk_buff *skb, int optoff)
 
 /* Jumbo payload */
 
-static bool ipv6_hop_jumbo(struct sk_buff *skb, int optoff)
+static bool ipv6_hop_jumbo(unsigned int class, struct sk_buff *skb, int optoff)
 {
 	const unsigned char *nh = skb_network_header(skb);
 	struct inet6_dev *idev = __in6_dev_get_safely(skb->dev);
@@ -164,7 +155,8 @@ static bool ipv6_hop_jumbo(struct sk_buff *skb, int optoff)
 
 /* CALIPSO RFC 5570 */
 
-static bool ipv6_hop_calipso(struct sk_buff *skb, int optoff)
+static bool ipv6_hop_calipso(unsigned int class, struct sk_buff *skb,
+			     int optoff)
 {
 	const unsigned char *nh = skb_network_header(skb);
 
@@ -184,18 +176,45 @@ static bool ipv6_hop_calipso(struct sk_buff *skb, int optoff)
 	return false;
 }
 
-const struct tlvtype_proc tlvprochopopt_lst[] = {
+static const struct tlv_proc_init tlv_ipv6_init_params[] __initconst = {
+#if IS_ENABLED(CONFIG_IPV6_MIP6)
 	{
-		.type	= IPV6_TLV_ROUTERALERT,
-		.func	= ipv6_hop_ra,
+		.type = IPV6_TLV_HAO,
+
+		.proc.ops.func = ipv6_dest_hao,
+		.proc.params.r.class = IPEH_TLV_CLASS_FLAG_DSTOPT,
 	},
+#endif
 	{
-		.type	= IPV6_TLV_JUMBO,
-		.func	= ipv6_hop_jumbo,
+		.type = IPV6_TLV_ROUTERALERT,
+
+		.proc.ops.func = ipv6_hop_ra,
+		.proc.params.r.class = IPEH_TLV_CLASS_FLAG_HOPOPT,
 	},
 	{
-		.type	= IPV6_TLV_CALIPSO,
-		.func	= ipv6_hop_calipso,
+		.type = IPV6_TLV_JUMBO,
+
+		.proc.ops.func	= ipv6_hop_jumbo,
+		.proc.params.r.class = IPEH_TLV_CLASS_FLAG_HOPOPT,
+	},
+	{
+		.type = IPV6_TLV_CALIPSO,
+
+		.proc.ops.func = ipv6_hop_calipso,
+		.proc.params.r.class = IPEH_TLV_CLASS_FLAG_HOPOPT,
 	},
-	{ -1, }
 };
+
+struct tlv_param_table __rcu ipv6_tlv_param_table;
+EXPORT_SYMBOL(ipv6_tlv_param_table);
+
+int __init ipv6_exthdrs_options_init(void)
+{
+	return ipeh_exthdrs_init(&ipv6_tlv_param_table, tlv_ipv6_init_params,
+				 ARRAY_SIZE(tlv_ipv6_init_params));
+}
+
+void ipv6_exthdrs_options_exit(void)
+{
+	ipeh_exthdrs_fini(&ipv6_tlv_param_table);
+}
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v8 net-next 7/9] ip6tlvs: Add TX parameters
  2019-12-26 22:51 [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure Tom Herbert
                   ` (5 preceding siblings ...)
  2019-12-26 22:51 ` [PATCH v8 net-next 6/9] ip6tlvs: Registration of TLV handlers and parameters Tom Herbert
@ 2019-12-26 22:51 ` Tom Herbert
  2019-12-26 22:51 ` [PATCH v8 net-next 8/9] ip6tlvs: Add netlink interface Tom Herbert
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Tom Herbert @ 2019-12-26 22:51 UTC (permalink / raw)
  To: davem, netdev, simon.horman, willemdebruijn.kernel
  Cc: Tom Herbert, Tom Herbert

From: Tom Herbert <tom@quantonium.net>

Define a number of transmit parameters for TLV Parameter table
definitions. These will be used for validating TLVs that are set
on a socket.

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 include/net/ipeh.h         | 18 ++++++++++++++++
 include/uapi/linux/ipeh.h  |  8 +++++++
 net/ipv6/exthdrs_common.c  | 53 +++++++++++++++++++++++++++++++++++++++++++++-
 net/ipv6/exthdrs_options.c | 45 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 123 insertions(+), 1 deletion(-)

diff --git a/include/net/ipeh.h b/include/net/ipeh.h
index fc7d543..6f46e2c 100644
--- a/include/net/ipeh.h
+++ b/include/net/ipeh.h
@@ -20,6 +20,17 @@ struct tlv_rx_params {
 };
 
 struct tlv_tx_params {
+	unsigned char admin_perm : 2;
+	unsigned char user_perm : 2;
+	unsigned char class : 3;
+	unsigned char rsvd : 1;
+	unsigned char align_mult : 4;
+	unsigned char align_off : 4;
+	unsigned char data_len_mult : 4;
+	unsigned char data_len_off : 4;
+	unsigned char min_data_len;
+	unsigned char max_data_len;
+	unsigned short preferred_order;
 };
 
 struct tlv_params {
@@ -54,6 +65,13 @@ struct tlv_param_table {
 
 extern struct tlv_param_table ipv6_tlv_param_table;
 
+/* Preferred TLV ordering for HBH and Dest options (placed by increasing order)
+ */
+#define IPEH_TLV_PREF_ORDER_HAO			10
+#define IPEH_TLV_PREF_ORDER_ROUTERALERT		20
+#define IPEH_TLV_PREF_ORDER_JUMBO		30
+#define IPEH_TLV_PREF_ORDER_CALIPSO		40
+
 int __ipeh_tlv_set(struct tlv_param_table *tlv_param_table,
 		   unsigned char type, const struct tlv_params *params,
 		   const struct tlv_ops *ops);
diff --git a/include/uapi/linux/ipeh.h b/include/uapi/linux/ipeh.h
index c4302b7..dbf0728 100644
--- a/include/uapi/linux/ipeh.h
+++ b/include/uapi/linux/ipeh.h
@@ -13,4 +13,12 @@
 				  IPEH_TLV_CLASS_FLAG_RTRDSTOPT |	\
 				  IPEH_TLV_CLASS_FLAG_DSTOPT)
 
+/* TLV permissions values */
+enum {
+	IPEH_TLV_PERM_NONE,
+	IPEH_TLV_PERM_WITH_CHECK,
+	IPEH_TLV_PERM_NO_CHECK,
+	IPEH_TLV_PERM_MAX = IPEH_TLV_PERM_NO_CHECK
+};
+
 #endif /* _UAPI_LINUX_IPEH_H */
diff --git a/net/ipv6/exthdrs_common.c b/net/ipv6/exthdrs_common.c
index f36513d..e9309ea 100644
--- a/net/ipv6/exthdrs_common.c
+++ b/net/ipv6/exthdrs_common.c
@@ -3,6 +3,7 @@
 /* Extension header and TLV library code that is not specific to IPv6. */
 #include <linux/export.h>
 #include <net/ipv6.h>
+#include <uapi/linux/ipeh.h>
 
 struct ipv6_txoptions *
 ipeh_dup_options(struct sock *sk, struct ipv6_txoptions *opt)
@@ -268,6 +269,13 @@ EXPORT_SYMBOL(ipeh_parse_tlv);
 
 /* Default (unset) values for TLV parameters */
 static const struct tlv_proc tlv_default_proc = {
+	.params.t = {
+		.admin_perm = IPEH_TLV_PERM_NO_CHECK,
+		.user_perm = IPEH_TLV_PERM_NONE,
+		.align_mult = (4 - 1), /* Default alignment: 4n + 2 */
+		.align_off = 2,
+		.max_data_len = 255,
+	},
 };
 
 static DEFINE_MUTEX(tlv_mutex);
@@ -287,16 +295,45 @@ static void tlv_param_table_release(struct rcu_head *rcu)
 }
 
 /* mutex held */
+static int check_order(struct tlv_param_table_data *tpt, unsigned char type,
+		       unsigned short order)
+{
+	int i;
+
+	if (!order)
+		return -EINVAL;
+
+	for (i = 2; i < 256; i++) {
+		struct tlv_type *ttype = &tpt->types[tpt->entries[i]];
+
+		if (!tpt->entries[i])
+			continue;
+
+		if (order == ttype->proc.params.t.preferred_order &&
+		    i != type)
+			return -EALREADY;
+	}
+
+	return 0;
+}
+
+/* mutex held */
 static int __tlv_set_one(struct tlv_param_table *tlv_param_table,
 			 unsigned char type, const struct tlv_params *params,
 			 const struct tlv_ops *ops)
 {
 	struct tlv_param_table_data *tpt, *told;
 	struct tlv_type *ttype;
+	int retv;
 
 	told = rcu_dereference_protected(tlv_param_table->data,
 					 lockdep_is_held(&tlv_mutex));
 
+	/* Check preferred order */
+	retv = check_order(told, type, params->t.preferred_order);
+	if (retv)
+		return retv;
+
 	/* Create new TLV table. If there is no exsiting entry then we are
 	 * adding a new one to the table, else we're modifying an entry.
 	 */
@@ -425,7 +462,7 @@ int ipeh_exthdrs_init(struct tlv_param_table *tlv_param_table,
 		      int num_init_params)
 {
 	struct tlv_param_table_data *tpt;
-	int pos = 0, i;
+	int pos = 0, i, j;
 	size_t tsize;
 
 	tsize = tlv_param_table_size(num_init_params + 1);
@@ -451,6 +488,20 @@ int ipeh_exthdrs_init(struct tlv_param_table *tlv_param_table,
 			goto err_inval;
 		}
 
+		if (WARN_ON(!tpi->proc.params.t.preferred_order)) {
+			/* Preferred order must be non-zero */
+			goto err_inval;
+		}
+
+		for (j = 0; j < i; j++) {
+			const struct tlv_proc_init *tpix = &tlv_init_params[j];
+
+			if (WARN_ON(tpi->proc.params.t.preferred_order ==
+				    tpix->proc.params.t.preferred_order)) {
+				/* Preferred order must be unique */
+				goto err_inval;
+			}
+		}
 		tpt->types[pos].proc = tpi->proc;
 		tpt->entries[tpi->type] = pos;
 	}
diff --git a/net/ipv6/exthdrs_options.c b/net/ipv6/exthdrs_options.c
index d4b373e..3b50b58 100644
--- a/net/ipv6/exthdrs_options.c
+++ b/net/ipv6/exthdrs_options.c
@@ -183,6 +183,17 @@ static const struct tlv_proc_init tlv_ipv6_init_params[] __initconst = {
 
 		.proc.ops.func = ipv6_dest_hao,
 		.proc.params.r.class = IPEH_TLV_CLASS_FLAG_DSTOPT,
+
+		.proc.params.t = {
+			.preferred_order = IPEH_TLV_PREF_ORDER_HAO,
+			.admin_perm = IPEH_TLV_PERM_NO_CHECK,
+			.user_perm = IPEH_TLV_PERM_NONE,
+			.class = IPEH_TLV_CLASS_FLAG_DSTOPT,
+			.align_mult = (8 - 1), /* Align to 8n + 6 */
+			.align_off = 6,
+			.min_data_len = 16,
+			.max_data_len = 16,
+		},
 	},
 #endif
 	{
@@ -190,18 +201,52 @@ static const struct tlv_proc_init tlv_ipv6_init_params[] __initconst = {
 
 		.proc.ops.func = ipv6_hop_ra,
 		.proc.params.r.class = IPEH_TLV_CLASS_FLAG_HOPOPT,
+
+		.proc.params.t = {
+			.preferred_order = IPEH_TLV_PREF_ORDER_ROUTERALERT,
+			.admin_perm = IPEH_TLV_PERM_NO_CHECK,
+			.user_perm = IPEH_TLV_PERM_NONE,
+			.class = IPEH_TLV_CLASS_FLAG_HOPOPT,
+			.align_mult = (2 - 1), /* Align to 2n */
+			.min_data_len = 2,
+			.max_data_len = 2,
+		},
 	},
 	{
 		.type = IPV6_TLV_JUMBO,
 
 		.proc.ops.func	= ipv6_hop_jumbo,
 		.proc.params.r.class = IPEH_TLV_CLASS_FLAG_HOPOPT,
+
+		.proc.params.t = {
+			.preferred_order = IPEH_TLV_PREF_ORDER_JUMBO,
+			.admin_perm = IPEH_TLV_PERM_NO_CHECK,
+			.user_perm = IPEH_TLV_PERM_NONE,
+			.class = IPEH_TLV_CLASS_FLAG_HOPOPT,
+			.align_mult = (4 - 1), /* Align to 4n + 2 */
+			.align_off = 2,
+			.min_data_len = 4,
+			.max_data_len = 4,
+		},
 	},
 	{
 		.type = IPV6_TLV_CALIPSO,
 
 		.proc.ops.func = ipv6_hop_calipso,
 		.proc.params.r.class = IPEH_TLV_CLASS_FLAG_HOPOPT,
+
+		.proc.params.t = {
+			.preferred_order = IPEH_TLV_PREF_ORDER_CALIPSO,
+			.admin_perm = IPEH_TLV_PERM_NO_CHECK,
+			.user_perm = IPEH_TLV_PERM_NONE,
+			.class = IPEH_TLV_CLASS_FLAG_HOPOPT,
+			.align_mult = (4 - 1), /* Align to 4n + 2 */
+			.align_off = 2,
+			.min_data_len = 8,
+			.max_data_len = 252,
+			.data_len_mult = (4 - 1),
+					/* Length is multiple of 4 */
+		},
 	},
 };
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v8 net-next 8/9] ip6tlvs: Add netlink interface
  2019-12-26 22:51 [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure Tom Herbert
                   ` (6 preceding siblings ...)
  2019-12-26 22:51 ` [PATCH v8 net-next 7/9] ip6tlvs: Add TX parameters Tom Herbert
@ 2019-12-26 22:51 ` Tom Herbert
  2019-12-26 22:51 ` [PATCH v8 net-next 9/9] ip6tlvs: Validation of TX Destination and Hop-by-Hop options Tom Herbert
  2020-01-02 21:41 ` [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure David Miller
  9 siblings, 0 replies; 26+ messages in thread
From: Tom Herbert @ 2019-12-26 22:51 UTC (permalink / raw)
  To: davem, netdev, simon.horman, willemdebruijn.kernel
  Cc: Tom Herbert, Tom Herbert

From: Tom Herbert <tom@quantonium.net>

Add a netlink interface to manage the TX TLV parameters. Managed
parameters include those for validating and sending TLVs being sent
such as alignment, TLV ordering, length limits, etc.

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 include/net/ipeh.h         |  16 +++
 include/net/ipv6.h         |   1 +
 include/uapi/linux/in6.h   |   6 +
 include/uapi/linux/ipeh.h  |  29 +++++
 net/ipv6/exthdrs_common.c  | 279 +++++++++++++++++++++++++++++++++++++++++++++
 net/ipv6/exthdrs_options.c |  81 ++++++++++++-
 6 files changed, 410 insertions(+), 2 deletions(-)

diff --git a/include/net/ipeh.h b/include/net/ipeh.h
index 6f46e2c..7ddbda1 100644
--- a/include/net/ipeh.h
+++ b/include/net/ipeh.h
@@ -3,6 +3,7 @@
 #define _NET_IPEH_H
 
 #include <linux/skbuff.h>
+#include <net/genetlink.h>
 
 /*
  *     Parsing tlv encoded headers.
@@ -106,6 +107,21 @@ static inline int ipeh_tlv_unset_proc(struct tlv_param_table *tlv_param_table,
 	return __ipeh_tlv_unset(tlv_param_table, type, false);
 }
 
+extern const struct nla_policy ipeh_tlv_nl_policy[];
+
+int ipeh_tlv_nl_cmd_set(struct tlv_param_table *tlv_param_table,
+			struct genl_family *tlv_nl_family,
+			struct sk_buff *skb, struct genl_info *info);
+int ipeh_tlv_nl_cmd_unset(struct tlv_param_table *tlv_param_table,
+			  struct genl_family *tlv_nl_family,
+			  struct sk_buff *skb, struct genl_info *info);
+int ipeh_tlv_nl_cmd_get(struct tlv_param_table *tlv_param_table,
+			struct genl_family *tlv_nl_family,
+			struct sk_buff *skb, struct genl_info *info);
+int ipeh_tlv_nl_dump(struct tlv_param_table *tlv_param_table,
+		     struct genl_family *tlv_nl_family,
+		     struct sk_buff *skb, struct netlink_callback *cb);
+
 /* ipeh_tlv_get_proc_by_type assumes rcu_read_lock is held */
 static inline struct tlv_proc *ipeh_tlv_get_proc_by_type(
 		struct tlv_param_table *tlv_param_table, unsigned char type)
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index e290e90..68b7fb8 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -14,6 +14,7 @@
 #include <linux/jhash.h>
 #include <linux/refcount.h>
 #include <linux/jump_label_ratelimit.h>
+#include <net/genetlink.h>
 #include <net/if_inet6.h>
 #include <net/ndisc.h>
 #include <net/flow.h>
diff --git a/include/uapi/linux/in6.h b/include/uapi/linux/in6.h
index 9f2273a..d5fe3d9 100644
--- a/include/uapi/linux/in6.h
+++ b/include/uapi/linux/in6.h
@@ -297,4 +297,10 @@ struct in6_flowlabel_req {
  * ...
  * MRT6_MAX
  */
+
+ /* NETLINK_GENERIC related info for IPv6 TLVs */
+
+#define IPV6_TLV_GENL_NAME		"ipv6-tlv"
+#define IPV6_TLV_GENL_VERSION		0x1
+
 #endif /* _UAPI_LINUX_IN6_H */
diff --git a/include/uapi/linux/ipeh.h b/include/uapi/linux/ipeh.h
index dbf0728..fb1d6e5 100644
--- a/include/uapi/linux/ipeh.h
+++ b/include/uapi/linux/ipeh.h
@@ -21,4 +21,33 @@ enum {
 	IPEH_TLV_PERM_MAX = IPEH_TLV_PERM_NO_CHECK
 };
 
+/* NETLINK_GENERIC related info for IP TLVs */
+
+enum {
+	IPEH_TLV_ATTR_UNSPEC,
+	IPEH_TLV_ATTR_TYPE,			/* u8, > 1 */
+	IPEH_TLV_ATTR_ORDER,			/* u16 */
+	IPEH_TLV_ATTR_ADMIN_PERM,		/* u8, perm value (0 to 2) */
+	IPEH_TLV_ATTR_USER_PERM,		/* u8, perm value (0 to 2) */
+	IPEH_TLV_ATTR_CLASS,			/* u8, 3 bit flags */
+	IPEH_TLV_ATTR_ALIGN_MULT,		/* u8, 1 to 16 */
+	IPEH_TLV_ATTR_ALIGN_OFF,		/* u8, 0 to 15 */
+	IPEH_TLV_ATTR_MIN_DATA_LEN,		/* u8 (option data length) */
+	IPEH_TLV_ATTR_MAX_DATA_LEN,		/* u8 (option data length) */
+	IPEH_TLV_ATTR_DATA_LEN_MULT,		/* u8, 1 to 16 */
+	IPEH_TLV_ATTR_DATA_LEN_OFF,		/* u8, 0 to 15 */
+
+	__IPEH_TLV_ATTR_MAX,
+};
+
+#define IPEH_TLV_ATTR_MAX              (__IPEH_TLV_ATTR_MAX - 1)
+
+enum {
+	IPEH_TLV_CMD_SET,
+	IPEH_TLV_CMD_UNSET,
+	IPEH_TLV_CMD_GET,
+
+	__IPEH_TLV_CMD_MAX,
+};
+
 #endif /* _UAPI_LINUX_IPEH_H */
diff --git a/net/ipv6/exthdrs_common.c b/net/ipv6/exthdrs_common.c
index e9309ea..794f3ae 100644
--- a/net/ipv6/exthdrs_common.c
+++ b/net/ipv6/exthdrs_common.c
@@ -457,6 +457,285 @@ int __ipeh_tlv_unset(struct tlv_param_table *tlv_param_table,
 }
 EXPORT_SYMBOL(__ipeh_tlv_unset);
 
+const struct nla_policy ipeh_tlv_nl_policy[IPEH_TLV_ATTR_MAX + 1] = {
+	[IPEH_TLV_ATTR_TYPE] =		{ .type = NLA_U8, },
+	[IPEH_TLV_ATTR_ORDER] =		{ .type = NLA_U16, },
+	[IPEH_TLV_ATTR_ADMIN_PERM] =	{ .type = NLA_U8, },
+	[IPEH_TLV_ATTR_USER_PERM] =	{ .type = NLA_U8, },
+	[IPEH_TLV_ATTR_CLASS] =		{ .type = NLA_U8, },
+	[IPEH_TLV_ATTR_ALIGN_MULT] =	{ .type = NLA_U8, },
+	[IPEH_TLV_ATTR_ALIGN_OFF] =	{ .type = NLA_U8, },
+	[IPEH_TLV_ATTR_MIN_DATA_LEN] =	{ .type = NLA_U8, },
+	[IPEH_TLV_ATTR_MAX_DATA_LEN] =	{ .type = NLA_U8, },
+	[IPEH_TLV_ATTR_DATA_LEN_OFF] =	{ .type = NLA_U8, },
+	[IPEH_TLV_ATTR_DATA_LEN_MULT] =	{ .type = NLA_U8, },
+};
+EXPORT_SYMBOL(ipeh_tlv_nl_policy);
+
+int ipeh_tlv_nl_cmd_set(struct tlv_param_table *tlv_param_table,
+			struct genl_family *tlv_nl_family,
+			struct sk_buff *skb, struct genl_info *info)
+{
+	struct tlv_params new_params;
+	struct tlv_proc *tproc;
+	unsigned char type;
+	int retv = -EINVAL;
+	unsigned int v;
+
+	if (!info->attrs[IPEH_TLV_ATTR_TYPE]) {
+		NL_SET_ERR_MSG(info->extack, "No TLV type");
+		return -EINVAL;
+	}
+
+	type = nla_get_u8(info->attrs[IPEH_TLV_ATTR_TYPE]);
+	if (type < 2) {
+		NL_SET_ERR_MSG(info->extack,
+			       "Invalid TLV type (less than 2)");
+		return -EINVAL;
+	}
+
+	rcu_read_lock();
+
+	/* Base new parameters on existing ones */
+	tproc = ipeh_tlv_get_proc_by_type(tlv_param_table, type);
+	new_params = tproc->params;
+
+	if (info->attrs[IPEH_TLV_ATTR_ORDER]) {
+		v = nla_get_u16(info->attrs[IPEH_TLV_ATTR_ORDER]);
+		new_params.t.preferred_order = v;
+	}
+
+	if (info->attrs[IPEH_TLV_ATTR_ADMIN_PERM]) {
+		v = nla_get_u8(info->attrs[IPEH_TLV_ATTR_ADMIN_PERM]);
+		if (v > IPEH_TLV_PERM_MAX) {
+			NL_SET_ERR_MSG(info->extack,
+				       "Bad admin perm value");
+			goto out;
+		}
+		new_params.t.admin_perm = v;
+	}
+
+	if (info->attrs[IPEH_TLV_ATTR_USER_PERM]) {
+		v = nla_get_u8(info->attrs[IPEH_TLV_ATTR_USER_PERM]);
+		if (v > IPEH_TLV_PERM_MAX) {
+			NL_SET_ERR_MSG(info->extack,
+				       "Bad user perm value");
+			goto out;
+		}
+		new_params.t.user_perm = v;
+	}
+
+	if (info->attrs[IPEH_TLV_ATTR_CLASS]) {
+		v = nla_get_u8(info->attrs[IPEH_TLV_ATTR_CLASS]);
+		if (!v || (v & ~IPEH_TLV_CLASS_FLAG_MASK)) {
+			NL_SET_ERR_MSG(info->extack, "Bad TLV class");
+			goto out;
+		}
+		new_params.t.class = v;
+	}
+
+	if (info->attrs[IPEH_TLV_ATTR_ALIGN_MULT]) {
+		v = nla_get_u8(info->attrs[IPEH_TLV_ATTR_ALIGN_MULT]);
+		if (v > 16 || v < 1) {
+			NL_SET_ERR_MSG(info->extack,
+				       "Alignment must be < 16 and > 0");
+			goto out;
+		}
+		new_params.t.align_mult = v - 1;
+	}
+
+	if (info->attrs[IPEH_TLV_ATTR_ALIGN_OFF]) {
+		v = nla_get_u8(info->attrs[IPEH_TLV_ATTR_ALIGN_OFF]);
+		if (v > 15) {
+			NL_SET_ERR_MSG(info->extack,
+				       "Alignment offset must be < 16");
+			goto out;
+		}
+		new_params.t.align_off = v;
+	}
+
+	if (info->attrs[IPEH_TLV_ATTR_MAX_DATA_LEN])
+		new_params.t.max_data_len =
+		    nla_get_u8(info->attrs[IPEH_TLV_ATTR_MAX_DATA_LEN]);
+
+	if (info->attrs[IPEH_TLV_ATTR_MIN_DATA_LEN])
+		new_params.t.min_data_len =
+		    nla_get_u8(info->attrs[IPEH_TLV_ATTR_MIN_DATA_LEN]);
+
+	if (new_params.t.min_data_len > new_params.t.max_data_len) {
+		NL_SET_ERR_MSG(info->extack,
+			       "Min data length must be less than or equal to max data length");
+		goto out;
+	}
+
+	if (info->attrs[IPEH_TLV_ATTR_DATA_LEN_MULT]) {
+		v = nla_get_u8(info->attrs[IPEH_TLV_ATTR_DATA_LEN_MULT]);
+		if (v > 16 || v < 1) {
+			NL_SET_ERR_MSG(info->extack,
+				       "Length multiple must be < 16 and > 0");
+			goto out;
+		}
+		new_params.t.data_len_mult = v - 1;
+	}
+
+	if (info->attrs[IPEH_TLV_ATTR_DATA_LEN_OFF]) {
+		v = nla_get_u8(info->attrs[IPEH_TLV_ATTR_DATA_LEN_OFF]);
+		if (v > 15) {
+			NL_SET_ERR_MSG(info->extack,
+				       "Data length offset must be < 16");
+			goto out;
+		}
+		new_params.t.data_len_off = v;
+	}
+
+	retv = ipeh_tlv_set_params(tlv_param_table, type, &new_params);
+
+out:
+	rcu_read_unlock();
+	return retv;
+}
+EXPORT_SYMBOL(ipeh_tlv_nl_cmd_set);
+
+int ipeh_tlv_nl_cmd_unset(struct tlv_param_table *tlv_param_table,
+			  struct genl_family *tlv_nl_family,
+			  struct sk_buff *skb, struct genl_info *info)
+{
+	unsigned char type;
+
+	if (!info->attrs[IPEH_TLV_ATTR_TYPE]) {
+		NL_SET_ERR_MSG(info->extack, "No TLV type");
+		return -EINVAL;
+	}
+
+	type = nla_get_u8(info->attrs[IPEH_TLV_ATTR_TYPE]);
+	if (type < 2) {
+		NL_SET_ERR_MSG(info->extack,
+			       "Invalid TLV type (less than 2)");
+		return -EINVAL;
+	}
+
+	return ipeh_tlv_unset_params(tlv_param_table, type);
+}
+EXPORT_SYMBOL(ipeh_tlv_nl_cmd_unset);
+
+static int tlv_fill_info(struct tlv_proc *tproc, unsigned char type,
+			 struct sk_buff *msg, bool admin)
+{
+	struct tlv_params *tp = &tproc->params;
+	int ret = 0;
+
+	if (nla_put_u8(msg, IPEH_TLV_ATTR_TYPE, type) ||
+	    nla_put_u16(msg, IPEH_TLV_ATTR_ORDER, tp->t.preferred_order) ||
+	    nla_put_u8(msg, IPEH_TLV_ATTR_USER_PERM, tp->t.user_perm) ||
+	    (admin && nla_put_u8(msg, IPEH_TLV_ATTR_ADMIN_PERM,
+				 tp->t.admin_perm)) ||
+	    nla_put_u8(msg, IPEH_TLV_ATTR_CLASS, tp->t.class) ||
+	    nla_put_u8(msg, IPEH_TLV_ATTR_ALIGN_MULT, tp->t.align_mult + 1) ||
+	    nla_put_u8(msg, IPEH_TLV_ATTR_ALIGN_OFF, tp->t.align_off) ||
+	    nla_put_u8(msg, IPEH_TLV_ATTR_MIN_DATA_LEN, tp->t.min_data_len) ||
+	    nla_put_u8(msg, IPEH_TLV_ATTR_MAX_DATA_LEN, tp->t.max_data_len) ||
+	    nla_put_u8(msg, IPEH_TLV_ATTR_DATA_LEN_MULT,
+		       tp->t.data_len_mult + 1) ||
+	    nla_put_u8(msg, IPEH_TLV_ATTR_DATA_LEN_OFF, tp->t.data_len_off))
+		ret = -1;
+
+	return ret;
+}
+
+static int tlv_dump_info(struct tlv_proc *tproc, unsigned char type,
+			 struct genl_family *tlv_nl_family, u32 portid,
+			 u32 seq, u32 flags, struct sk_buff *skb, u8 cmd,
+			 bool admin)
+{
+	void *hdr;
+
+	hdr = genlmsg_put(skb, portid, seq, tlv_nl_family, flags, cmd);
+	if (!hdr)
+		return -ENOMEM;
+
+	if (tlv_fill_info(tproc, type, skb, admin) < 0) {
+		genlmsg_cancel(skb, hdr);
+		return -EMSGSIZE;
+	}
+
+	genlmsg_end(skb, hdr);
+
+	return 0;
+}
+
+int ipeh_tlv_nl_cmd_get(struct tlv_param_table *tlv_param_table,
+			struct genl_family *tlv_nl_family,
+			struct sk_buff *skb, struct genl_info *info)
+{
+	struct tlv_proc *tproc;
+	struct sk_buff *msg;
+	unsigned char type;
+	int ret;
+
+	if (!info->attrs[IPEH_TLV_ATTR_TYPE]) {
+		NL_SET_ERR_MSG(info->extack, "No TLV type");
+		return -EINVAL;
+	}
+
+	type = nla_get_u8(info->attrs[IPEH_TLV_ATTR_TYPE]);
+	if (type < 2) {
+		NL_SET_ERR_MSG(info->extack,
+			       "Invalid TLV type (less than 2)");
+		return -EINVAL;
+	}
+
+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	rcu_read_lock();
+
+	tproc = ipeh_tlv_get_proc_by_type(tlv_param_table, type);
+	ret = tlv_dump_info(tproc, type, tlv_nl_family, info->snd_portid,
+			    info->snd_seq, 0, msg, info->genlhdr->cmd,
+			    netlink_capable(skb, CAP_NET_ADMIN));
+
+	rcu_read_unlock();
+
+	if (ret < 0) {
+		nlmsg_free(msg);
+		return ret;
+	}
+
+	return genlmsg_reply(msg, info);
+}
+EXPORT_SYMBOL(ipeh_tlv_nl_cmd_get);
+
+int ipeh_tlv_nl_dump(struct tlv_param_table *tlv_param_table,
+		     struct genl_family *tlv_nl_family,
+		     struct sk_buff *skb, struct netlink_callback *cb)
+{
+	struct tlv_proc *tproc;
+	int idx = 0, ret, i;
+
+	rcu_read_lock();
+
+	for (i = 2; i < 256; i++) {
+		if (idx++ < cb->args[0])
+			continue;
+
+		tproc = ipeh_tlv_get_proc_by_type(tlv_param_table, i);
+		ret = tlv_dump_info(tproc, i, tlv_nl_family,
+				    NETLINK_CB(cb->skb).portid,
+				    cb->nlh->nlmsg_seq, NLM_F_MULTI,
+				    skb, IPEH_TLV_CMD_GET,
+				    netlink_capable(cb->skb, CAP_NET_ADMIN));
+		if (ret)
+			break;
+	}
+
+	rcu_read_unlock();
+
+	cb->args[0] = idx;
+	return skb->len;
+}
+EXPORT_SYMBOL(ipeh_tlv_nl_dump);
+
 int ipeh_exthdrs_init(struct tlv_param_table *tlv_param_table,
 		      const struct tlv_proc_init *tlv_init_params,
 		      int num_init_params)
diff --git a/net/ipv6/exthdrs_options.c b/net/ipv6/exthdrs_options.c
index 3b50b58..c1889f6 100644
--- a/net/ipv6/exthdrs_options.c
+++ b/net/ipv6/exthdrs_options.c
@@ -6,6 +6,7 @@
 #include <linux/socket.h>
 #include <linux/types.h>
 #include <net/calipso.h>
+#include <net/genetlink.h>
 #include <net/ipv6.h>
 #include <net/ip6_route.h>
 #if IS_ENABLED(CONFIG_IPV6_MIP6)
@@ -253,13 +254,89 @@ static const struct tlv_proc_init tlv_ipv6_init_params[] __initconst = {
 struct tlv_param_table __rcu ipv6_tlv_param_table;
 EXPORT_SYMBOL(ipv6_tlv_param_table);
 
+static int ipv6_tlv_nl_cmd_set(struct sk_buff *skb, struct genl_info *info);
+static int ipv6_tlv_nl_cmd_unset(struct sk_buff *skb, struct genl_info *info);
+static int ipv6_tlv_nl_cmd_get(struct sk_buff *skb, struct genl_info *info);
+static int ipv6_tlv_nl_dump(struct sk_buff *skb, struct netlink_callback *cb);
+
+static const struct genl_ops ipv6_tlv_nl_ops[] = {
+{
+	.cmd = IPEH_TLV_CMD_SET,
+	.doit = ipv6_tlv_nl_cmd_set,
+	.flags = GENL_ADMIN_PERM,
+},
+{
+	.cmd = IPEH_TLV_CMD_UNSET,
+	.doit = ipv6_tlv_nl_cmd_unset,
+	.flags = GENL_ADMIN_PERM,
+},
+{
+	.cmd = IPEH_TLV_CMD_GET,
+	.doit = ipv6_tlv_nl_cmd_get,
+	.dumpit = ipv6_tlv_nl_dump,
+},
+};
+
+struct genl_family ipv6_tlv_nl_family __ro_after_init = {
+	.hdrsize	= 0,
+	.name		= IPV6_TLV_GENL_NAME,
+	.version	= IPV6_TLV_GENL_VERSION,
+	.maxattr	= IPEH_TLV_ATTR_MAX,
+	.policy		= ipeh_tlv_nl_policy,
+	.netnsok	= true,
+	.parallel_ops	= true,
+	.ops		= ipv6_tlv_nl_ops,
+	.n_ops		= ARRAY_SIZE(ipv6_tlv_nl_ops),
+	.module		= THIS_MODULE,
+};
+
+static int ipv6_tlv_nl_cmd_set(struct sk_buff *skb, struct genl_info *info)
+{
+	return ipeh_tlv_nl_cmd_set(&ipv6_tlv_param_table, &ipv6_tlv_nl_family,
+				   skb, info);
+}
+
+static int ipv6_tlv_nl_cmd_unset(struct sk_buff *skb, struct genl_info *info)
+{
+	return ipeh_tlv_nl_cmd_unset(&ipv6_tlv_param_table, &ipv6_tlv_nl_family,
+				     skb, info);
+}
+
+static int ipv6_tlv_nl_cmd_get(struct sk_buff *skb, struct genl_info *info)
+{
+	return ipeh_tlv_nl_cmd_get(&ipv6_tlv_param_table, &ipv6_tlv_nl_family,
+				   skb, info);
+}
+
+static int ipv6_tlv_nl_dump(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	return ipeh_tlv_nl_dump(&ipv6_tlv_param_table, &ipv6_tlv_nl_family,
+				skb, cb);
+}
+
 int __init ipv6_exthdrs_options_init(void)
 {
-	return ipeh_exthdrs_init(&ipv6_tlv_param_table, tlv_ipv6_init_params,
-				 ARRAY_SIZE(tlv_ipv6_init_params));
+	int err;
+
+	err = genl_register_family(&ipv6_tlv_nl_family);
+	if (err)
+		goto genl_fail;
+
+	ipeh_exthdrs_init(&ipv6_tlv_param_table, tlv_ipv6_init_params,
+			  ARRAY_SIZE(tlv_ipv6_init_params));
+	if (err)
+		goto ipv6_fail;
+
+	return 0;
+
+ipv6_fail:
+	genl_unregister_family(&ipv6_tlv_nl_family);
+genl_fail:
+	return err;
 }
 
 void ipv6_exthdrs_options_exit(void)
 {
 	ipeh_exthdrs_fini(&ipv6_tlv_param_table);
+	genl_unregister_family(&ipv6_tlv_nl_family);
 }
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v8 net-next 9/9] ip6tlvs: Validation of TX Destination and Hop-by-Hop options
  2019-12-26 22:51 [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure Tom Herbert
                   ` (7 preceding siblings ...)
  2019-12-26 22:51 ` [PATCH v8 net-next 8/9] ip6tlvs: Add netlink interface Tom Herbert
@ 2019-12-26 22:51 ` Tom Herbert
  2020-01-02 21:41 ` [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure David Miller
  9 siblings, 0 replies; 26+ messages in thread
From: Tom Herbert @ 2019-12-26 22:51 UTC (permalink / raw)
  To: davem, netdev, simon.horman, willemdebruijn.kernel
  Cc: Tom Herbert, Tom Herbert

From: Tom Herbert <tom@quantonium.net>

Validate Destination and Hop-by-Hop options being set for transmit.
This uses the information in the TLV parameters table to validate
various aspects of both individual TLVs as well as a list of TLVs in
an extension header.

There are two levels of validation that can be performed: simple checks
and deep checks. Simple checks validate only the most basic properties
such as that the TLV list fits into the EH. Deep checks do a fine
grained validation that includes perferred ordering, length limits,
and length alignment.

With proper permissions set in the TLV parameter table, this patch
allows non-privileged users to send TLVs. Given that TLVs are open
ended and potentially a source of DOS attack, deep checks are
performed to limit the format that a non-privileged user can send.
If deep checks are enabled, a canonical format for sending TLVs is
enforced (in adherence with the robustness principle). A TLV must
be well ordered with respect to the preferred order for the TLV.
Each TLV must be aligned as described in the parameter table. Minimal
padding (one padding TLV) is used to align TLVs. The length of the
extension header as well as the count of non-padding TLVs is checked
against max_*_opts_len and max_*_opts_cnt. For individual TLVs, length
limits and length alignment is checked.

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 include/net/ipeh.h        |  22 +++
 net/ipv6/datagram.c       |  51 ++++--
 net/ipv6/exthdrs_common.c | 396 ++++++++++++++++++++++++++++++++++++++++++++++
 net/ipv6/ipv6_sockglue.c  |  39 ++---
 4 files changed, 469 insertions(+), 39 deletions(-)

diff --git a/include/net/ipeh.h b/include/net/ipeh.h
index 7ddbda1..bbc6339 100644
--- a/include/net/ipeh.h
+++ b/include/net/ipeh.h
@@ -157,6 +157,28 @@ struct ipv6_txoptions *ipeh_renew_options(struct sock *sk,
 struct ipv6_txoptions *ipeh_fixup_options(struct ipv6_txoptions *opt_space,
 					  struct ipv6_txoptions *opt);
 
+int ipeh_opt_validate_tlvs(struct net *net,
+			   struct tlv_param_table *tlv_param_table,
+			   struct ipv6_opt_hdr *opt,
+			   unsigned int optname, bool admin,
+			   unsigned int max_len, unsigned int max_cnt);
+int ipeh_opt_validate_single_tlv(struct net *net,
+				 struct tlv_param_table *tlv_param_table,
+				 unsigned int optname, const __u8 *tlv,
+				 size_t len, bool deleting, bool admin);
+int ipeh_opt_check_perm(struct net *net,
+			struct tlv_param_table *tlv_param_table,
+			struct ipv6_txoptions *txopt, int optname, bool admin);
+
+struct ipv6_txoptions *ipeh_txopt_from_opt(struct sock *sk,
+					   struct tlv_param_table
+						*tlv_param_table,
+					   struct ipv6_txoptions *opt,
+					   int optname, char __user *optval,
+					   unsigned int optlen,
+					   unsigned int max_len,
+					   unsigned int max_cnt);
+
 /* Generic extension header TLV parser */
 
 enum ipeh_parse_errors {
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 390bedd..fd850f4 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -839,7 +839,10 @@ int ip6_datagram_send_ctl(struct net *net, struct sock *sk,
 			break;
 
 		case IPV6_2292HOPOPTS:
-		case IPV6_HOPOPTS:
+		case IPV6_HOPOPTS: {
+			int max_len = net->ipv6.sysctl.max_hbh_opts_len;
+			int max_cnt = net->ipv6.sysctl.max_hbh_opts_cnt;
+
 			if (opt->hopopt || cmsg->cmsg_len < CMSG_LEN(sizeof(struct ipv6_opt_hdr))) {
 				err = -EINVAL;
 				goto exit_f;
@@ -851,15 +854,24 @@ int ip6_datagram_send_ctl(struct net *net, struct sock *sk,
 				err = -EINVAL;
 				goto exit_f;
 			}
-			if (!ns_capable(net->user_ns, CAP_NET_RAW)) {
-				err = -EPERM;
+
+			err = ipeh_opt_validate_tlvs(net, &ipv6_tlv_param_table,
+						     hdr, IPV6_HOPOPTS,
+						     ns_capable(net->user_ns,
+								CAP_NET_RAW),
+						     max_len, max_cnt);
+			if (err < 0)
 				goto exit_f;
-			}
+
 			opt->opt_nflen += len;
 			opt->hopopt = hdr;
 			break;
+		}
+
+		case IPV6_2292DSTOPTS: {
+			int max_len = net->ipv6.sysctl.max_dst_opts_len;
+			int max_cnt = net->ipv6.sysctl.max_dst_opts_cnt;
 
-		case IPV6_2292DSTOPTS:
 			if (cmsg->cmsg_len < CMSG_LEN(sizeof(struct ipv6_opt_hdr))) {
 				err = -EINVAL;
 				goto exit_f;
@@ -871,10 +883,14 @@ int ip6_datagram_send_ctl(struct net *net, struct sock *sk,
 				err = -EINVAL;
 				goto exit_f;
 			}
-			if (!ns_capable(net->user_ns, CAP_NET_RAW)) {
-				err = -EPERM;
+			err = ipeh_opt_validate_tlvs(net, &ipv6_tlv_param_table,
+						     hdr, IPV6_DSTOPTS,
+						     ns_capable(net->user_ns,
+								CAP_NET_RAW),
+						     max_len, max_cnt);
+			if (err < 0)
 				goto exit_f;
-			}
+
 			if (opt->dst1opt) {
 				err = -EINVAL;
 				goto exit_f;
@@ -882,9 +898,13 @@ int ip6_datagram_send_ctl(struct net *net, struct sock *sk,
 			opt->opt_flen += len;
 			opt->dst1opt = hdr;
 			break;
+		}
 
 		case IPV6_DSTOPTS:
-		case IPV6_RTHDRDSTOPTS:
+		case IPV6_RTHDRDSTOPTS: {
+			int max_len = net->ipv6.sysctl.max_dst_opts_len;
+			int max_cnt = net->ipv6.sysctl.max_dst_opts_cnt;
+
 			if (cmsg->cmsg_len < CMSG_LEN(sizeof(struct ipv6_opt_hdr))) {
 				err = -EINVAL;
 				goto exit_f;
@@ -896,10 +916,15 @@ int ip6_datagram_send_ctl(struct net *net, struct sock *sk,
 				err = -EINVAL;
 				goto exit_f;
 			}
-			if (!ns_capable(net->user_ns, CAP_NET_RAW)) {
-				err = -EPERM;
+
+			err = ipeh_opt_validate_tlvs(net, &ipv6_tlv_param_table,
+						     hdr, IPV6_DSTOPTS,
+						     ns_capable(net->user_ns,
+								CAP_NET_RAW),
+						     max_len, max_cnt);
+			if (err < 0)
 				goto exit_f;
-			}
+
 			if (cmsg->cmsg_type == IPV6_DSTOPTS) {
 				opt->opt_flen += len;
 				opt->dst1opt = hdr;
@@ -908,7 +933,7 @@ int ip6_datagram_send_ctl(struct net *net, struct sock *sk,
 				opt->dst0opt = hdr;
 			}
 			break;
-
+		}
 		case IPV6_2292RTHDR:
 		case IPV6_RTHDR:
 			if (cmsg->cmsg_len < CMSG_LEN(sizeof(struct ipv6_rt_hdr))) {
diff --git a/net/ipv6/exthdrs_common.c b/net/ipv6/exthdrs_common.c
index 794f3ae..5d40198 100644
--- a/net/ipv6/exthdrs_common.c
+++ b/net/ipv6/exthdrs_common.c
@@ -265,6 +265,332 @@ bool ipeh_parse_tlv(unsigned int class,
 }
 EXPORT_SYMBOL(ipeh_parse_tlv);
 
+/* TLV validation functions */
+
+/* Validate a single non-padding TLV */
+static int __ipeh_opt_validate_single_tlv(struct net *net, const __u8 *tlv,
+					  struct tlv_proc *tproc,
+					  unsigned int class, bool *deep_check,
+					  bool deleting, bool admin)
+{
+	struct tlv_tx_params *tptx = &tproc->params.t;
+
+	if (tlv[0] < 2) /* Must be non-padding */
+		return -EINVAL;
+
+	/* Check permissions */
+	switch (admin ? tptx->admin_perm : tptx->user_perm) {
+	case IPEH_TLV_PERM_NO_CHECK:
+		/* Allowed with no deep checks */
+		*deep_check = false;
+		return 0;
+	case IPEH_TLV_PERM_WITH_CHECK:
+		/* Allowed with deep checks */
+		*deep_check = true;
+		break;
+	default:
+		/* No permission */
+		return -EPERM;
+	}
+
+	/* Perform deep checks on the TLV */
+
+	/* Check class */
+	if ((tptx->class & class) != class)
+		return -EINVAL;
+
+	/* Don't bother checking lengths when deleting, the TLV is only
+	 * needed here for lookup
+	 */
+	if (deleting) {
+		/* Don't bother with deep checks when deleting */
+		*deep_check = false;
+	} else {
+		/* Check length */
+		if (tlv[1] < tptx->min_data_len || tlv[1] > tptx->max_data_len)
+			return -EINVAL;
+
+		/* Check length alignment */
+		if ((tlv[1] % (tptx->data_len_mult + 1)) != tptx->data_len_off)
+			return -EINVAL;
+	}
+
+	return 0;
+}
+
+static unsigned int optname_to_tlv_class(int optname)
+{
+	switch (optname) {
+	case IPV6_HOPOPTS:
+		return IPEH_TLV_CLASS_FLAG_HOPOPT;
+	case IPV6_RTHDRDSTOPTS:
+		return IPEH_TLV_CLASS_FLAG_RTRDSTOPT;
+	case IPV6_DSTOPTS:
+		return IPEH_TLV_CLASS_FLAG_DSTOPT;
+	default:
+		return -1U;
+	}
+}
+
+static int __ipeh_opt_validate_tlvs(struct net *net,
+				    struct tlv_param_table *tlv_param_table,
+				    struct ipv6_opt_hdr *opt,
+				    unsigned int optname, bool deleting,
+				    bool admin, unsigned int max_len,
+				    unsigned int max_cnt)
+{
+	bool deep_check = !admin, did_deep_check = false;
+	unsigned int opt_len, tlv_len, offset;
+	unsigned int padding = 0, numpad = 0;
+	unsigned short prev_tlv_order = 0;
+	bool nonzero_padding = false;
+	unsigned int class, cnt = 0;
+	struct tlv_tx_params *tptx;
+	int retc, ret = -EINVAL;
+	__u8 *tlv = (__u8 *)opt;
+	struct tlv_proc *tproc;
+
+	opt_len = ipv6_optlen(opt);
+	offset = sizeof(*opt);
+
+	class = optname_to_tlv_class(optname);
+
+	rcu_read_lock();
+
+	while (offset < opt_len) {
+		switch (tlv[offset]) {
+		case IPV6_TLV_PAD1:
+			tlv_len = 1;
+			padding++;
+			numpad++;
+			break;
+		case IPV6_TLV_PADN: {
+			int i;
+
+			if (offset + 2 > opt_len)
+				goto out;
+
+			tlv_len = tlv[offset + 1] + 2;
+
+			if (offset + tlv_len > opt_len)
+				goto out;
+
+			/* Check for nonzero padding */
+			for (i = 2; i < tlv_len; i++) {
+				if (tlv[offset + i] != 0) {
+					nonzero_padding = true;
+					break;
+				}
+			}
+
+			padding += tlv_len;
+			numpad++;
+			break;
+		}
+		default:
+			if (offset + 2 > opt_len)
+				goto out;
+
+			tlv_len = tlv[offset + 1] + 2;
+
+			if (offset + tlv_len > opt_len)
+				goto out;
+
+			tproc = ipeh_tlv_get_proc(tlv_param_table,
+						  &tlv[offset]);
+			tptx = &tproc->params.t;
+
+			retc = __ipeh_opt_validate_single_tlv(net, &tlv[offset],
+							      tproc, class,
+							      &deep_check,
+							      deleting, admin);
+			if (retc < 0) {
+				ret = retc;
+				goto out;
+			}
+
+			if (deep_check) {
+				/* Check for too many options */
+				if (++cnt > max_cnt) {
+					ret = -E2BIG;
+					goto out;
+				}
+
+				/* Check order */
+				if (tptx->preferred_order < prev_tlv_order)
+					goto out;
+
+				/* Check alignment */
+				if ((offset % (tptx->align_mult + 1)) !=
+				    tptx->align_off)
+					goto out;
+
+				/* Check for right amount of padding */
+				if (numpad > 1 || padding > tptx->align_mult ||
+				    nonzero_padding)
+					goto out;
+
+				prev_tlv_order = tptx->preferred_order;
+
+				did_deep_check = true;
+			}
+			nonzero_padding = false;
+			padding = 0;
+			numpad = 0;
+		}
+		offset += tlv_len;
+	}
+
+	/* Check trailing padding. Note this covers the case option list
+	 * only contains padding.
+	 */
+	if (deep_check && (numpad > 1 || padding > 7 || nonzero_padding))
+		goto out;
+
+	/* If we did at least one deep check apply length limit */
+	if (did_deep_check && opt_len > max_len) {
+		ret = -EMSGSIZE;
+		goto out;
+	}
+
+	/* All good */
+	ret = 0;
+out:
+	rcu_read_unlock();
+
+	return ret;
+}
+
+/**
+ * ipeh_opt_validate_tlvs - Validate TLVs.
+ * @net: Current net
+ * @tlv_param_table: TLV parameter table
+ * @opt: The option header
+ * @optname: IPV6_HOPOPTS, IPV6_RTHDRDSTOPTS, or IPV6_DSTOPTS
+ * @admin: Set for privileged user
+ * @max_len: Maximum length for TLV
+ * @max_cnt: Maximum number of non-padding TLVs
+ *
+ * Description:
+ * Walks the TLVs in a list to verify that the TLV lengths and other
+ * parameters are in bounds for a Destination or Hop-by-Hop option.
+ * Return -EINVAL is there is a problem, zero otherwise.
+ */
+int ipeh_opt_validate_tlvs(struct net *net,
+			   struct tlv_param_table *tlv_param_table,
+			   struct ipv6_opt_hdr *opt, unsigned int optname,
+			   bool admin, unsigned int max_len,
+			   unsigned int max_cnt)
+{
+	return __ipeh_opt_validate_tlvs(net, tlv_param_table, opt, optname,
+					false, admin, max_len, max_cnt);
+}
+EXPORT_SYMBOL(ipeh_opt_validate_tlvs);
+
+/**
+ * ipeh_opt_validate_single_tlv - Check that a single TLV is valid.
+ * @net: Current net
+ * @tlv_param_table: TLV parameter table
+ * @optname: IPV6_HOPOPTS, IPV6_RTHDRDSTOPTS, or IPV6_DSTOPTS
+ * @tlv: The TLV as array of bytes
+ * @len: Length of buffer holding TLV
+ * @deleting: TLV is being deleted
+ * @admin: Set for privileged user
+ *
+ * Description:
+ * Validates a single TLV. The TLV must be non-padding type. The length
+ * of the TLV (as determined by the second byte that gives length of the
+ * option data) must match @len.
+ */
+int ipeh_opt_validate_single_tlv(struct net *net,
+				 struct tlv_param_table *tlv_param_table,
+				 unsigned int optname, const __u8 *tlv,
+				 size_t len, bool deleting, bool admin)
+{
+	struct tlv_proc *tproc;
+	unsigned int class;
+	bool deep_check;
+	int ret = 0;
+
+	class = optname_to_tlv_class(optname);
+
+	if (tlv[0] < 2)
+		return -EINVAL;
+
+	if (len < 2)
+		return -EINVAL;
+
+	if (tlv[1] + 2 != len)
+		return -EINVAL;
+
+	rcu_read_lock();
+
+	tproc = ipeh_tlv_get_proc(tlv_param_table, tlv);
+
+	ret = __ipeh_opt_validate_single_tlv(net, tlv, tproc, class,
+					     &deep_check, deleting, admin);
+
+	rcu_read_unlock();
+
+	return ret;
+}
+EXPORT_SYMBOL(ipeh_opt_validate_single_tlv);
+
+/**
+ * ipeh_opt_check_perm - Check that current capabilities allows modifying
+ * txopts.
+ * @net: Current net
+ * @tlv_param_table: TLV parameter table
+ * @txopts: TX options from the socket
+ * @optname: IPV6_HOPOPTS, IPV6_RTHDRDSTOPTS, or IPV6_DSTOPTS
+ * @admin: Set for privileged user
+ *
+ * Description:
+ *
+ * Checks whether the permissions of TLV that are set on a socket permit
+ * modification.
+ *
+ */
+int ipeh_opt_check_perm(struct net *net,
+			struct tlv_param_table *tlv_param_table,
+			struct ipv6_txoptions *txopt, int optname, bool admin)
+{
+	struct ipv6_opt_hdr *opt;
+	int retv = -EPERM;
+
+	if (!txopt)
+		return 0;
+
+	switch (optname) {
+	case IPV6_HOPOPTS:
+		opt = txopt->hopopt;
+		break;
+	case IPV6_RTHDRDSTOPTS:
+		opt = txopt->dst0opt;
+		break;
+	case IPV6_DSTOPTS:
+		opt = txopt->dst1opt;
+		break;
+	default:
+		goto out;
+	}
+
+	if (!opt) {
+		retv = 0;
+		goto out;
+	}
+
+	/* Just call the validate function on the options as being
+	 * deleted.
+	 */
+	retv = __ipeh_opt_validate_tlvs(net, tlv_param_table, opt, optname,
+					true, admin, -1U, -1U);
+
+out:
+	return retv;
+}
+EXPORT_SYMBOL(ipeh_opt_check_perm);
+
 /* TLV parameter table functions and structures */
 
 /* Default (unset) values for TLV parameters */
@@ -457,6 +783,76 @@ int __ipeh_tlv_unset(struct tlv_param_table *tlv_param_table,
 }
 EXPORT_SYMBOL(__ipeh_tlv_unset);
 
+/* Utility function tp create TX options from a setsockopt that is setting
+ * options on a socket.
+ */
+struct ipv6_txoptions *ipeh_txopt_from_opt(struct sock *sk,
+					   struct tlv_param_table
+							*tlv_param_table,
+					   struct ipv6_txoptions *opt,
+					   int optname, char __user *optval,
+					   unsigned int optlen,
+					   unsigned int max_len,
+					   unsigned int max_cnt)
+{
+	struct ipv6_opt_hdr *new = NULL;
+	struct net *net = sock_net(sk);
+	int retv;
+
+	/* remove any sticky options header with a zero option
+	 * length, per RFC3542.
+	 */
+	if (optlen == 0) {
+		optval = NULL;
+	} else if (!optval) {
+		return ERR_PTR(-EINVAL);
+	} else if (optlen < sizeof(struct ipv6_opt_hdr) ||
+		 optlen & 0x7 || optlen > 8 * 255) {
+		return ERR_PTR(-EINVAL);
+	} else {
+		new = memdup_user(optval, optlen);
+		if (IS_ERR(new))
+			return (struct ipv6_txoptions *)new;
+		if (unlikely(ipv6_optlen(new) > optlen)) {
+			kfree(new);
+			return ERR_PTR(-EINVAL);
+		}
+	}
+
+	if (optname != IPV6_RTHDR) {
+		bool cap = ns_capable(net->user_ns, CAP_NET_RAW);
+
+		/* First check if we have permission to delete
+		 * the existing options on the socket.
+		 */
+		retv = ipeh_opt_check_perm(net, tlv_param_table,
+					   opt, optname, cap);
+		if (retv < 0) {
+			kfree(new);
+			return ERR_PTR(retv);
+		}
+
+		/* Check permissions and other validations on new
+		 * TLVs
+		 */
+		if (new) {
+			retv = ipeh_opt_validate_tlvs(net, tlv_param_table,
+						      new, optname, cap,
+						      max_len, max_cnt);
+			if (retv < 0) {
+				kfree(new);
+				return ERR_PTR(retv);
+			}
+		}
+	}
+
+	opt = ipeh_renew_options(sk, opt, optname, new);
+	kfree(new);
+
+	return opt;
+}
+EXPORT_SYMBOL(ipeh_txopt_from_opt);
+
 const struct nla_policy ipeh_tlv_nl_policy[IPEH_TLV_ATTR_MAX + 1] = {
 	[IPEH_TLV_ATTR_TYPE] =		{ .type = NLA_U8, },
 	[IPEH_TLV_ATTR_ORDER] =		{ .type = NLA_U16, },
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 7810988..d0f7693 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -395,40 +395,27 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 	case IPV6_RTHDR:
 	case IPV6_DSTOPTS:
 	{
+		unsigned int max_len = -1U, max_cnt = -1U;
 		struct ipv6_txoptions *opt;
-		struct ipv6_opt_hdr *new = NULL;
 
-		/* hop-by-hop / destination options are privileged option */
-		retv = -EPERM;
-		if (optname != IPV6_RTHDR && !ns_capable(net->user_ns, CAP_NET_RAW))
+		switch (optname) {
+		case IPV6_HOPOPTS:
+			max_len = net->ipv6.sysctl.max_hbh_opts_len;
+			max_cnt = net->ipv6.sysctl.max_hbh_opts_cnt;
 			break;
-
-		/* remove any sticky options header with a zero option
-		 * length, per RFC3542.
-		 */
-		if (optlen == 0)
-			optval = NULL;
-		else if (!optval)
-			goto e_inval;
-		else if (optlen < sizeof(struct ipv6_opt_hdr) ||
-			 optlen & 0x7 || optlen > 8 * 255)
-			goto e_inval;
-		else {
-			new = memdup_user(optval, optlen);
-			if (IS_ERR(new)) {
-				retv = PTR_ERR(new);
+		case IPV6_RTHDRDSTOPTS:
+		case IPV6_DSTOPTS:
+			max_len = net->ipv6.sysctl.max_dst_opts_len;
+			max_cnt = net->ipv6.sysctl.max_dst_opts_cnt;
 				break;
-			}
-			if (unlikely(ipv6_optlen(new) > optlen)) {
-				kfree(new);
-				goto e_inval;
-			}
 		}
 
 		opt = rcu_dereference_protected(np->opt,
 						lockdep_sock_is_held(sk));
-		opt = ipeh_renew_options(sk, opt, optname, new);
-		kfree(new);
+		opt = ipeh_txopt_from_opt(sk, &ipv6_tlv_param_table, opt,
+					  optname, optval, optlen, max_len,
+					  max_cnt);
+
 		if (IS_ERR(opt)) {
 			retv = PTR_ERR(opt);
 			break;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure
  2019-12-26 22:51 [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure Tom Herbert
                   ` (8 preceding siblings ...)
  2019-12-26 22:51 ` [PATCH v8 net-next 9/9] ip6tlvs: Validation of TX Destination and Hop-by-Hop options Tom Herbert
@ 2020-01-02 21:41 ` David Miller
  2020-01-03  0:42   ` Tom Herbert
  9 siblings, 1 reply; 26+ messages in thread
From: David Miller @ 2020-01-02 21:41 UTC (permalink / raw)
  To: tom; +Cc: netdev, simon.horman, willemdebruijn.kernel

From: Tom Herbert <tom@herbertland.com>
Date: Thu, 26 Dec 2019 14:51:29 -0800

> The fundamental rationale here is to make various TLVs, in particular
> Hop-by-Hop and Destination options, usable, robust, scalable, and
> extensible to support emerging functionality.

So, patch #1 is fine and it seems to structure the code to more easily
enable support for:

https://tools.ietf.org/html/draft-ietf-6man-icmp-limits-07

(I'll note in passing how frustrating it is that, based upon your
handling of things in that past, I know that I have to go out and
explicitly look for draft RFCs containing your name in order to figure
out what your overall long term agenda actually is.  You should be
stating these kinds of things in your commit messages)

But as for the rest of the patch series, what are these "emerging
functionalities" you are talking about?

I've heard some noises about people wanting to do some kind of "kerberos
for packets".  Or even just plain putting app + user ID information into
options.

Is that where this is going?  I have no idea, because you won't say.

And honestly, this stuff sounds so easy to misuse by governments and
other entities.  It could also be used to allow ISPs to limit users
in very undesirable and unfair ways.   And honestly, surveilance and
limiting are the most likely uses for such a facility.  I can't see
it legitimately being promoted as a "security" feature, really.

I think the whole TX socket option can wait.

And because of that the whole consolidation and cleanup of the option
handling code is untenable, because without a use case all it does is
make -stable backports insanely painful.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure
  2020-01-02 21:41 ` [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure David Miller
@ 2020-01-03  0:42   ` Tom Herbert
  2020-01-03  7:11     ` kernel Dev
  0 siblings, 1 reply; 26+ messages in thread
From: Tom Herbert @ 2020-01-03  0:42 UTC (permalink / raw)
  To: David Miller
  Cc: Linux Kernel Network Developers, Simon Horman, Willem de Bruijn

On Thu, Jan 2, 2020 at 1:41 PM David Miller <davem@davemloft.net> wrote:
>
> From: Tom Herbert <tom@herbertland.com>
> Date: Thu, 26 Dec 2019 14:51:29 -0800
>
> > The fundamental rationale here is to make various TLVs, in particular
> > Hop-by-Hop and Destination options, usable, robust, scalable, and
> > extensible to support emerging functionality.
>
> So, patch #1 is fine and it seems to structure the code to more easily
> enable support for:
>
> https://tools.ietf.org/html/draft-ietf-6man-icmp-limits-07
>
> (I'll note in passing how frustrating it is that, based upon your
> handling of things in that past, I know that I have to go out and
> explicitly look for draft RFCs containing your name in order to figure
> out what your overall long term agenda actually is.  You should be
> stating these kinds of things in your commit messages)
>
> But as for the rest of the patch series, what are these "emerging
> functionalities" you are talking about?
>
> I've heard some noises about people wanting to do some kind of "kerberos
> for packets".  Or even just plain putting app + user ID information into
> options.
>
> Is that where this is going?  I have no idea, because you won't say.
>
Yes, there is some of that. Here are some of the use cases for HBH options:

PMTU option: draft-ietf-6man-mtu-option-01. There is a P4
implementation as well as Linux PoC for this that was demonstated
@IETF103 hackathon.
IOAM option: https://tools.ietf.org/html/draft-ietf-ippm-ioam-ipv6-options-00.
There is also P4 implementation and Linux router support demonstrated
at IETF104 hackathon. INT is a related technology that would also use
this.
FAST option: https://datatracker.ietf.org/doc/draft-herbert-fast/. I
have PoC for this. There are some other protocol proposals in the is
are (I know Huawei has something to describe the QoS that should be
applied).

There are others including the whole space especially as a real
solution for host to networking signaling gets fleshed out. There's
also the whole world of segment routing options and where that's
going.

> And honestly, this stuff sounds so easy to misuse by governments and
> other entities.  It could also be used to allow ISPs to limit users
> in very undesirable and unfair ways.   And honestly, surveilance and
> limiting are the most likely uses for such a facility.  I can't see
> it legitimately being promoted as a "security" feature, really.
>
Yes, but the problem isn't unique to IPv6 options nor would abuse be
prevented by not implementing them in Linux. Router vendors will
happily provide the necessary support to allow abuse :-) AH is the
prescribed way to prevent this sort of abuse (aside from encrypting
everything that isn't necessary to route packets, but that's another
story). AH is fully supported by Linux, good luck finding a router
vendor that cares about it :-)

> I think the whole TX socket option can wait.
>
> And because of that the whole consolidation and cleanup of the option
> handling code is untenable, because without a use case all it does is
> make -stable backports insanely painful.

The problem with "wait and see" approach is that Linux is not the only
game in town. There are other players that are pursuing this area
(Cisco and Huawei in particular). They are able to implement protocols
more to appease their short term marketing requirements with little
regard for what is best for the community. This is why Linux is so
critical to networking, it is the only open forum where real scrutiny
is applied to how protocols are implemented. If the alternatives are
given free to lead then it's very likely we'll end up being stuck with
what they do and probably have to follow their lead regardless of how
miserable they make the protocols. We've already seen this in segment
routing, their attempts to kill IP fragmentation, and all the other
examples of protocol ossification that unnecessarily restrict what
hosts are allowed to send in the network and hence reduce the utility
and security we are able to offer the user.

The other data point I will offer is that the current Linux
implementation of IPv6 destination and hop-by-hop options in the
kernel is next to useless. Nobody is using the ones that have been
implemented, and adding support for a new is a major pain-- the
ability for modules to register support for an option seems like an
obvious feature to me. Similarly, the restriction that only admin can
set options is overly restrictive-- allowing to non-privileged users
to send options under tightly controlled constraints set by the admin
also seems reasonable to me.

Tom

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure
  2020-01-03  0:42   ` Tom Herbert
@ 2020-01-03  7:11     ` kernel Dev
  2020-01-03 17:35       ` Tom Herbert
  0 siblings, 1 reply; 26+ messages in thread
From: kernel Dev @ 2020-01-03  7:11 UTC (permalink / raw)
  To: Tom Herbert
  Cc: David Miller, Linux Kernel Network Developers, Simon Horman,
	Willem de Bruijn

Tom, 
Happy new year!!

I believe that these patches cost you great effort. However, we would like to see the 6-10 subsequent patch set to be really able to understand where are you going with these ones.

At some point you mentioned that router vendors make protocol in miserable way. Do you believe the right way is that every individual defines the protocol the way he wants in a single authored IETF draft ? 

Regarding 10) Support IPv4 extension headers.
I see that your drafts describing the idea are expired [1][2]. 
Do you plan to add to the kernel the implementation of expired contents ? or did you abandoned these drafts and described the idea somewhere else that I’m not aware of ?   

[1] https://tools.ietf.org/html/draft-herbert-ipv4-udpencap-eh-01
[2] https://tools.ietf.org/html/draft-herbert-ipv4-eh-01

Ahmed 


On Thu, 2 Jan 2020 16:42:24 -0800
Tom Herbert <tom@herbertland.com> wrote:

> On Thu, Jan 2, 2020 at 1:41 PM David Miller <davem@davemloft.net> wrote:
> >
> > From: Tom Herbert <tom@herbertland.com>
> > Date: Thu, 26 Dec 2019 14:51:29 -0800
> >
> > > The fundamental rationale here is to make various TLVs, in particular
> > > Hop-by-Hop and Destination options, usable, robust, scalable, and
> > > extensible to support emerging functionality.
> >
> > So, patch #1 is fine and it seems to structure the code to more easily
> > enable support for:
> >
> > https://tools.ietf.org/html/draft-ietf-6man-icmp-limits-07
> >
> > (I'll note in passing how frustrating it is that, based upon your
> > handling of things in that past, I know that I have to go out and
> > explicitly look for draft RFCs containing your name in order to figure
> > out what your overall long term agenda actually is.  You should be
> > stating these kinds of things in your commit messages)
> >
> > But as for the rest of the patch series, what are these "emerging
> > functionalities" you are talking about?
> >
> > I've heard some noises about people wanting to do some kind of "kerberos
> > for packets".  Or even just plain putting app + user ID information into
> > options.
> >
> > Is that where this is going?  I have no idea, because you won't say.
> >
> Yes, there is some of that. Here are some of the use cases for HBH options:
> 
> PMTU option: draft-ietf-6man-mtu-option-01. There is a P4
> implementation as well as Linux PoC for this that was demonstated
> @IETF103 hackathon.
> IOAM option: https://tools.ietf.org/html/draft-ietf-ippm-ioam-ipv6-options-00.
> There is also P4 implementation and Linux router support demonstrated
> at IETF104 hackathon. INT is a related technology that would also use
> this.
> FAST option: https://datatracker.ietf.org/doc/draft-herbert-fast/. I
> have PoC for this. There are some other protocol proposals in the is
> are (I know Huawei has something to describe the QoS that should be
> applied).
> 
> There are others including the whole space especially as a real
> solution for host to networking signaling gets fleshed out. There's
> also the whole world of segment routing options and where that's
> going.
> 
> > And honestly, this stuff sounds so easy to misuse by governments and
> > other entities.  It could also be used to allow ISPs to limit users
> > in very undesirable and unfair ways.   And honestly, surveilance and
> > limiting are the most likely uses for such a facility.  I can't see
> > it legitimately being promoted as a "security" feature, really.
> >
> Yes, but the problem isn't unique to IPv6 options nor would abuse be
> prevented by not implementing them in Linux. Router vendors will
> happily provide the necessary support to allow abuse :-) AH is the
> prescribed way to prevent this sort of abuse (aside from encrypting
> everything that isn't necessary to route packets, but that's another
> story). AH is fully supported by Linux, good luck finding a router
> vendor that cares about it :-)
> 
> > I think the whole TX socket option can wait.
> >
> > And because of that the whole consolidation and cleanup of the option
> > handling code is untenable, because without a use case all it does is
> > make -stable backports insanely painful.
> 
> The problem with "wait and see" approach is that Linux is not the only
> game in town. There are other players that are pursuing this area
> (Cisco and Huawei in particular). They are able to implement protocols
> more to appease their short term marketing requirements with little
> regard for what is best for the community. This is why Linux is so
> critical to networking, it is the only open forum where real scrutiny
> is applied to how protocols are implemented. If the alternatives are
> given free to lead then it's very likely we'll end up being stuck with
> what they do and probably have to follow their lead regardless of how
> miserable they make the protocols. We've already seen this in segment
> routing, their attempts to kill IP fragmentation, and all the other
> examples of protocol ossification that unnecessarily restrict what
> hosts are allowed to send in the network and hence reduce the utility
> and security we are able to offer the user.
> 
> The other data point I will offer is that the current Linux
> implementation of IPv6 destination and hop-by-hop options in the
> kernel is next to useless. Nobody is using the ones that have been
> implemented, and adding support for a new is a major pain-- the
> ability for modules to register support for an option seems like an
> obvious feature to me. Similarly, the restriction that only admin can
> set options is overly restrictive-- allowing to non-privileged users
> to send options under tightly controlled constraints set by the admin
> also seems reasonable to me.
> 
> Tom


-- 
kernel Dev <ahabdels.dev@gmail.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure
  2020-01-03  7:11     ` kernel Dev
@ 2020-01-03 17:35       ` Tom Herbert
  2020-01-03 20:45         ` David Miller
  0 siblings, 1 reply; 26+ messages in thread
From: Tom Herbert @ 2020-01-03 17:35 UTC (permalink / raw)
  To: kernel Dev
  Cc: David Miller, Linux Kernel Network Developers, Simon Horman,
	Willem de Bruijn

On Thu, Jan 2, 2020 at 11:11 PM kernel Dev <ahabdels.dev@gmail.com> wrote:
>
> Tom,
> Happy new year!!
>
Happy new year to you!

> I believe that these patches cost you great effort. However, we would like to see the 6-10 subsequent patch set to be really able to understand where are you going with these ones.
>
I can post those as RFC.

> At some point you mentioned that router vendors make protocol in miserable way. Do you believe the right way is that every individual defines the protocol the way he wants in a single authored IETF draft ?
>
No, that defies the whole purpose of standard and interoperable
protocols. The problem we are finding with IETF is that it has no
means to enforce conformance of the protocols it standardizes. Router
vendors openly exploit this, for instance at the last IETF the Cisco
engineer presenting extension header insertion (a clear violation of
RFC8200) plainly said upfront that regardless of any feedback or input
they will continue developing and deploying it the way they want. Note
this is not an indictment of all router vendors and their engineers,
there are many that are trying to do the right thing-- but it's really
pretty easy for a few engineers at large vendors to cheat the system
in different ways. The only pushback IETF can do is to not standardize
these non-conformant quasi-proprietary protocols. The real way to
combat this provide open implementation that demonstrates the correct
use of the protocols and show that's more extensible and secure than
these "hacks".

> Regarding 10) Support IPv4 extension headers.
> I see that your drafts describing the idea are expired [1][2].
> Do you plan to add to the kernel the implementation of expired contents ? or did you abandoned these drafts and described the idea somewhere else that I’m not aware of ?
>
> [1] https://tools.ietf.org/html/draft-herbert-ipv4-udpencap-eh-01
> [2] https://tools.ietf.org/html/draft-herbert-ipv4-eh-01
>
[1] is obsoleted by [2]. I will update [2] shortly. I may also propose
an IETF hackathon project to bring up IOAM over IPv4 if you are
interested.

Tom

> Ahmed
>
>
> On Thu, 2 Jan 2020 16:42:24 -0800
> Tom Herbert <tom@herbertland.com> wrote:
>
> > On Thu, Jan 2, 2020 at 1:41 PM David Miller <davem@davemloft.net> wrote:
> > >
> > > From: Tom Herbert <tom@herbertland.com>
> > > Date: Thu, 26 Dec 2019 14:51:29 -0800
> > >
> > > > The fundamental rationale here is to make various TLVs, in particular
> > > > Hop-by-Hop and Destination options, usable, robust, scalable, and
> > > > extensible to support emerging functionality.
> > >
> > > So, patch #1 is fine and it seems to structure the code to more easily
> > > enable support for:
> > >
> > > https://tools.ietf.org/html/draft-ietf-6man-icmp-limits-07
> > >
> > > (I'll note in passing how frustrating it is that, based upon your
> > > handling of things in that past, I know that I have to go out and
> > > explicitly look for draft RFCs containing your name in order to figure
> > > out what your overall long term agenda actually is.  You should be
> > > stating these kinds of things in your commit messages)
> > >
> > > But as for the rest of the patch series, what are these "emerging
> > > functionalities" you are talking about?
> > >
> > > I've heard some noises about people wanting to do some kind of "kerberos
> > > for packets".  Or even just plain putting app + user ID information into
> > > options.
> > >
> > > Is that where this is going?  I have no idea, because you won't say.
> > >
> > Yes, there is some of that. Here are some of the use cases for HBH options:
> >
> > PMTU option: draft-ietf-6man-mtu-option-01. There is a P4
> > implementation as well as Linux PoC for this that was demonstated
> > @IETF103 hackathon.
> > IOAM option: https://tools.ietf.org/html/draft-ietf-ippm-ioam-ipv6-options-00.
> > There is also P4 implementation and Linux router support demonstrated
> > at IETF104 hackathon. INT is a related technology that would also use
> > this.
> > FAST option: https://datatracker.ietf.org/doc/draft-herbert-fast/. I
> > have PoC for this. There are some other protocol proposals in the is
> > are (I know Huawei has something to describe the QoS that should be
> > applied).
> >
> > There are others including the whole space especially as a real
> > solution for host to networking signaling gets fleshed out. There's
> > also the whole world of segment routing options and where that's
> > going.
> >
> > > And honestly, this stuff sounds so easy to misuse by governments and
> > > other entities.  It could also be used to allow ISPs to limit users
> > > in very undesirable and unfair ways.   And honestly, surveilance and
> > > limiting are the most likely uses for such a facility.  I can't see
> > > it legitimately being promoted as a "security" feature, really.
> > >
> > Yes, but the problem isn't unique to IPv6 options nor would abuse be
> > prevented by not implementing them in Linux. Router vendors will
> > happily provide the necessary support to allow abuse :-) AH is the
> > prescribed way to prevent this sort of abuse (aside from encrypting
> > everything that isn't necessary to route packets, but that's another
> > story). AH is fully supported by Linux, good luck finding a router
> > vendor that cares about it :-)
> >
> > > I think the whole TX socket option can wait.
> > >
> > > And because of that the whole consolidation and cleanup of the option
> > > handling code is untenable, because without a use case all it does is
> > > make -stable backports insanely painful.
> >
> > The problem with "wait and see" approach is that Linux is not the only
> > game in town. There are other players that are pursuing this area
> > (Cisco and Huawei in particular). They are able to implement protocols
> > more to appease their short term marketing requirements with little
> > regard for what is best for the community. This is why Linux is so
> > critical to networking, it is the only open forum where real scrutiny
> > is applied to how protocols are implemented. If the alternatives are
> > given free to lead then it's very likely we'll end up being stuck with
> > what they do and probably have to follow their lead regardless of how
> > miserable they make the protocols. We've already seen this in segment
> > routing, their attempts to kill IP fragmentation, and all the other
> > examples of protocol ossification that unnecessarily restrict what
> > hosts are allowed to send in the network and hence reduce the utility
> > and security we are able to offer the user.
> >
> > The other data point I will offer is that the current Linux
> > implementation of IPv6 destination and hop-by-hop options in the
> > kernel is next to useless. Nobody is using the ones that have been
> > implemented, and adding support for a new is a major pain-- the
> > ability for modules to register support for an option seems like an
> > obvious feature to me. Similarly, the restriction that only admin can
> > set options is overly restrictive-- allowing to non-privileged users
> > to send options under tightly controlled constraints set by the admin
> > also seems reasonable to me.
> >
> > Tom
>
>
> --
> kernel Dev <ahabdels.dev@gmail.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure
  2020-01-03 17:35       ` Tom Herbert
@ 2020-01-03 20:45         ` David Miller
  2020-01-03 22:31           ` Tom Herbert
  0 siblings, 1 reply; 26+ messages in thread
From: David Miller @ 2020-01-03 20:45 UTC (permalink / raw)
  To: tom; +Cc: ahabdels.dev, netdev, simon.horman, willemdebruijn.kernel

From: Tom Herbert <tom@herbertland.com>
Date: Fri, 3 Jan 2020 09:35:08 -0800

> The real way to combat this provide open implementation that
> demonstrates the correct use of the protocols and show that's more
> extensible and secure than these "hacks".

Keep dreaming, this won't stop Cisco from doing whatever it wants to do.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure
  2020-01-03 20:45         ` David Miller
@ 2020-01-03 22:31           ` Tom Herbert
  2020-01-03 22:57             ` David Miller
  0 siblings, 1 reply; 26+ messages in thread
From: Tom Herbert @ 2020-01-03 22:31 UTC (permalink / raw)
  To: David Miller
  Cc: Ahmed Abdelsalam, Linux Kernel Network Developers, Simon Horman,
	Willem de Bruijn

On Fri, Jan 3, 2020 at 12:45 PM David Miller <davem@davemloft.net> wrote:
>
> From: Tom Herbert <tom@herbertland.com>
> Date: Fri, 3 Jan 2020 09:35:08 -0800
>
> > The real way to combat this provide open implementation that
> > demonstrates the correct use of the protocols and show that's more
> > extensible and secure than these "hacks".
>
> Keep dreaming, this won't stop Cisco from doing whatever it wants to do.

See QUIC. See TLS. See TCP fast open. See transport layer encryption.
These are prime examples where we've steered the Internet from host
protocols and implementation to successfully obsolete or at least work
around protocol ossification that was perpetuated by router vendors.
Cisco is not the Internet!

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure
  2020-01-03 22:31           ` Tom Herbert
@ 2020-01-03 22:57             ` David Miller
  2020-01-03 23:48               ` Tom Herbert
  0 siblings, 1 reply; 26+ messages in thread
From: David Miller @ 2020-01-03 22:57 UTC (permalink / raw)
  To: tom; +Cc: ahabdels.dev, netdev, simon.horman, willemdebruijn.kernel

From: Tom Herbert <tom@herbertland.com>
Date: Fri, 3 Jan 2020 14:31:58 -0800

> On Fri, Jan 3, 2020 at 12:45 PM David Miller <davem@davemloft.net> wrote:
>>
>> From: Tom Herbert <tom@herbertland.com>
>> Date: Fri, 3 Jan 2020 09:35:08 -0800
>>
>> > The real way to combat this provide open implementation that
>> > demonstrates the correct use of the protocols and show that's more
>> > extensible and secure than these "hacks".
>>
>> Keep dreaming, this won't stop Cisco from doing whatever it wants to do.
> 
> See QUIC. See TLS. See TCP fast open. See transport layer encryption.
> These are prime examples where we've steered the Internet from host
> protocols and implementation to successfully obsolete or at least work
> around protocol ossification that was perpetuated by router vendors.
> Cisco is not the Internet!

Seriously, I wish you luck stopping the SRv6 header insertion stuff.

It's simply not happening, no matter what transport layer technology
you throw at the situation.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure
  2020-01-03 22:57             ` David Miller
@ 2020-01-03 23:48               ` Tom Herbert
  2020-01-03 23:53                 ` Erik Kline
  0 siblings, 1 reply; 26+ messages in thread
From: Tom Herbert @ 2020-01-03 23:48 UTC (permalink / raw)
  To: David Miller
  Cc: Ahmed Abdelsalam, Linux Kernel Network Developers, Simon Horman,
	Willem de Bruijn

On Fri, Jan 3, 2020 at 2:57 PM David Miller <davem@davemloft.net> wrote:
>
> From: Tom Herbert <tom@herbertland.com>
> Date: Fri, 3 Jan 2020 14:31:58 -0800
>
> > On Fri, Jan 3, 2020 at 12:45 PM David Miller <davem@davemloft.net> wrote:
> >>
> >> From: Tom Herbert <tom@herbertland.com>
> >> Date: Fri, 3 Jan 2020 09:35:08 -0800
> >>
> >> > The real way to combat this provide open implementation that
> >> > demonstrates the correct use of the protocols and show that's more
> >> > extensible and secure than these "hacks".
> >>
> >> Keep dreaming, this won't stop Cisco from doing whatever it wants to do.
> >
> > See QUIC. See TLS. See TCP fast open. See transport layer encryption.
> > These are prime examples where we've steered the Internet from host
> > protocols and implementation to successfully obsolete or at least work
> > around protocol ossification that was perpetuated by router vendors.
> > Cisco is not the Internet!
>
> Seriously, I wish you luck stopping the SRv6 header insertion stuff.
>
Dave,

I agree we can't stop it, but maybe we can steer it to be at least
palatable. There are valid use cases for extension header insertion.
Ironically, SRv6 header insertion isn't one of them; the proponents
have failed to offer even a single reason why the alternative of IPv6
encapsulation isn't sufficient (believe me, we've asked _many_ times
for some justification and only get hand waving!). There are, however,
some interesting uses cases like in IOAM where the operator would like
to annotate packets as they traverse the network. Encapsulation is
insufficient if they don't know what the end point would be or they
don't want the annotation to change the path the packets take (versus
those that aren't annotated).

The salient problem with extension header insertion is lost of
attribution. It is fundamental in the IP protocol that the contents of
a packet are attributed to the source host identified by the source
address. If some intermediate node inserts an extension header that
subsequently breaks the packet downstream then there is no obvious way
to debug this. If an ICMP message is sent because of the receiving
data, then receiving host can't do much with it; it's not the source
of the data in error and nothing in the packet tells who the culprit
is. The Cisco guys have at least conceded one point on SRv6 insertion
due to pushback on this, their latest draft only does SRv6 insertion
on packets that have already been encapsulated in IPIP on ingress into
the domain. This is intended to at least restrict the modified packets
to a controlled domain (I'm note sure if any implementations enforce
this though). My proposal is to require an "attribution" HBH option
that would clearly identify inserted data put in a packet by
middleboxes (draft-herbert-6man-eh-attrib-00). This is a tradeoff to
allow extension header insertion, but require protocol to give
attribution and make it at least somewhat robust and manageable.

Tom

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure
  2020-01-03 23:48               ` Tom Herbert
@ 2020-01-03 23:53                 ` Erik Kline
  2020-01-04  0:37                   ` Tom Herbert
  0 siblings, 1 reply; 26+ messages in thread
From: Erik Kline @ 2020-01-03 23:53 UTC (permalink / raw)
  To: Tom Herbert
  Cc: David Miller, Ahmed Abdelsalam, Linux Kernel Network Developers,
	Simon Horman, Willem de Bruijn

On Fri, 3 Jan 2020 at 15:49, Tom Herbert <tom@herbertland.com> wrote:
>
> On Fri, Jan 3, 2020 at 2:57 PM David Miller <davem@davemloft.net> wrote:
> >
> > From: Tom Herbert <tom@herbertland.com>
> > Date: Fri, 3 Jan 2020 14:31:58 -0800
> >
> > > On Fri, Jan 3, 2020 at 12:45 PM David Miller <davem@davemloft.net> wrote:
> > >>
> > >> From: Tom Herbert <tom@herbertland.com>
> > >> Date: Fri, 3 Jan 2020 09:35:08 -0800
> > >>
> > >> > The real way to combat this provide open implementation that
> > >> > demonstrates the correct use of the protocols and show that's more
> > >> > extensible and secure than these "hacks".
> > >>
> > >> Keep dreaming, this won't stop Cisco from doing whatever it wants to do.
> > >
> > > See QUIC. See TLS. See TCP fast open. See transport layer encryption.
> > > These are prime examples where we've steered the Internet from host
> > > protocols and implementation to successfully obsolete or at least work
> > > around protocol ossification that was perpetuated by router vendors.
> > > Cisco is not the Internet!
> >
> > Seriously, I wish you luck stopping the SRv6 header insertion stuff.
> >
> Dave,
>
> I agree we can't stop it, but maybe we can steer it to be at least
> palatable. There are valid use cases for extension header insertion.
> Ironically, SRv6 header insertion isn't one of them; the proponents
> have failed to offer even a single reason why the alternative of IPv6
> encapsulation isn't sufficient (believe me, we've asked _many_ times
> for some justification and only get hand waving!). There are, however,
> some interesting uses cases like in IOAM where the operator would like
> to annotate packets as they traverse the network. Encapsulation is
> insufficient if they don't know what the end point would be or they
> don't want the annotation to change the path the packets take (versus
> those that aren't annotated).
>
> The salient problem with extension header insertion is lost of

And the problems that can be introduced by changing the effective path MTU...

> attribution. It is fundamental in the IP protocol that the contents of
> a packet are attributed to the source host identified by the source
> address. If some intermediate node inserts an extension header that
> subsequently breaks the packet downstream then there is no obvious way
> to debug this. If an ICMP message is sent because of the receiving
> data, then receiving host can't do much with it; it's not the source
> of the data in error and nothing in the packet tells who the culprit
> is. The Cisco guys have at least conceded one point on SRv6 insertion
> due to pushback on this, their latest draft only does SRv6 insertion
> on packets that have already been encapsulated in IPIP on ingress into
> the domain. This is intended to at least restrict the modified packets
> to a controlled domain (I'm note sure if any implementations enforce
> this though). My proposal is to require an "attribution" HBH option
> that would clearly identify inserted data put in a packet by
> middleboxes (draft-herbert-6man-eh-attrib-00). This is a tradeoff to
> allow extension header insertion, but require protocol to give
> attribution and make it at least somewhat robust and manageable.
>
> Tom

FWIW the SRv6 header insertion stuff is still under discussion in
spring wg (last I knew).  I proposed one option that could be used to
avoid insertion (allow for extra scratch space
https://mailarchive.ietf.org/arch/msg/spring/UhThRTNxbHWNiMGgRi3U0SqLaDA),
but nothing has been conclusively resolved last I checked.

As everyone probably knows, the draft-ietf-* documents are
working-group-adopted documents (though final publication is never
guaranteed).  My current reading of 6man tea leaves is that neither
"ICMP limits" and "MTU option" docs were terribly contentious.
Whether code reorg is important for implementing these I'm not
competent enough to say.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure
  2020-01-03 23:53                 ` Erik Kline
@ 2020-01-04  0:37                   ` Tom Herbert
  2020-01-04  8:05                     ` kernel Dev
  0 siblings, 1 reply; 26+ messages in thread
From: Tom Herbert @ 2020-01-04  0:37 UTC (permalink / raw)
  To: ek
  Cc: David Miller, Ahmed Abdelsalam, Linux Kernel Network Developers,
	Simon Horman, Willem de Bruijn

On Fri, Jan 3, 2020 at 3:53 PM Erik Kline <ek@loon.com> wrote:
>
> On Fri, 3 Jan 2020 at 15:49, Tom Herbert <tom@herbertland.com> wrote:
> >
> > On Fri, Jan 3, 2020 at 2:57 PM David Miller <davem@davemloft.net> wrote:
> > >
> > > From: Tom Herbert <tom@herbertland.com>
> > > Date: Fri, 3 Jan 2020 14:31:58 -0800
> > >
> > > > On Fri, Jan 3, 2020 at 12:45 PM David Miller <davem@davemloft.net> wrote:
> > > >>
> > > >> From: Tom Herbert <tom@herbertland.com>
> > > >> Date: Fri, 3 Jan 2020 09:35:08 -0800
> > > >>
> > > >> > The real way to combat this provide open implementation that
> > > >> > demonstrates the correct use of the protocols and show that's more
> > > >> > extensible and secure than these "hacks".
> > > >>
> > > >> Keep dreaming, this won't stop Cisco from doing whatever it wants to do.
> > > >
> > > > See QUIC. See TLS. See TCP fast open. See transport layer encryption.
> > > > These are prime examples where we've steered the Internet from host
> > > > protocols and implementation to successfully obsolete or at least work
> > > > around protocol ossification that was perpetuated by router vendors.
> > > > Cisco is not the Internet!
> > >
> > > Seriously, I wish you luck stopping the SRv6 header insertion stuff.
> > >
> > Dave,
> >
> > I agree we can't stop it, but maybe we can steer it to be at least
> > palatable. There are valid use cases for extension header insertion.
> > Ironically, SRv6 header insertion isn't one of them; the proponents
> > have failed to offer even a single reason why the alternative of IPv6
> > encapsulation isn't sufficient (believe me, we've asked _many_ times
> > for some justification and only get hand waving!). There are, however,
> > some interesting uses cases like in IOAM where the operator would like
> > to annotate packets as they traverse the network. Encapsulation is
> > insufficient if they don't know what the end point would be or they
> > don't want the annotation to change the path the packets take (versus
> > those that aren't annotated).
> >
> > The salient problem with extension header insertion is lost of
>
> And the problems that can be introduced by changing the effective path MTU...
>
Eric,

Yep, increasing the size of packet in transit potentially wreaks havoc
on PMTU discovery, however I personally think that the issue might be
overblown. We already have the same problem when tunneling is done in
the network since most tunneling implementations and deployments just
assume the operator has set large enough MTUs. As long as all the
overhead inserted into the packet doesn't reduce the end host PMTU
below 1280, PMTU discovery and probably even PTB for a packet with
inserted headers still has right effect.

> > attribution. It is fundamental in the IP protocol that the contents of
> > a packet are attributed to the source host identified by the source
> > address. If some intermediate node inserts an extension header that
> > subsequently breaks the packet downstream then there is no obvious way
> > to debug this. If an ICMP message is sent because of the receiving
> > data, then receiving host can't do much with it; it's not the source
> > of the data in error and nothing in the packet tells who the culprit
> > is. The Cisco guys have at least conceded one point on SRv6 insertion
> > due to pushback on this, their latest draft only does SRv6 insertion
> > on packets that have already been encapsulated in IPIP on ingress into
> > the domain. This is intended to at least restrict the modified packets
> > to a controlled domain (I'm note sure if any implementations enforce
> > this though). My proposal is to require an "attribution" HBH option
> > that would clearly identify inserted data put in a packet by
> > middleboxes (draft-herbert-6man-eh-attrib-00). This is a tradeoff to
> > allow extension header insertion, but require protocol to give
> > attribution and make it at least somewhat robust and manageable.
> >
> > Tom
>
> FWIW the SRv6 header insertion stuff is still under discussion in
> spring wg (last I knew).  I proposed one option that could be used to

It's also under discussion in 6man.

> avoid insertion (allow for extra scratch space
> https://mailarchive.ietf.org/arch/msg/spring/UhThRTNxbHWNiMGgRi3U0SqLaDA),
> but nothing has been conclusively resolved last I checked.
>

I saw your proposal. It's a good idea from POV to be conformant with
RFC8200 and avoid the PMTU problems, but the header insertion
proponents aren't going to like it at all. First, it means that the
source is in control of the insertion policy and host is required to
change-- no way they'll buy into that ;-). Secondly, if the scratch
space isn't used they'll undoubtedly claim that is unnecessary
overhead.

Tom

> As everyone probably knows, the draft-ietf-* documents are
> working-group-adopted documents (though final publication is never
> guaranteed).  My current reading of 6man tea leaves is that neither
> "ICMP limits" and "MTU option" docs were terribly contentious.
> Whether code reorg is important for implementing these I'm not
> competent enough to say.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure
  2020-01-04  0:37                   ` Tom Herbert
@ 2020-01-04  8:05                     ` kernel Dev
  2020-01-04 17:45                       ` Tom Herbert
  0 siblings, 1 reply; 26+ messages in thread
From: kernel Dev @ 2020-01-04  8:05 UTC (permalink / raw)
  To: Tom Herbert
  Cc: ek, David Miller, Linux Kernel Network Developers, Simon Horman,
	Willem de Bruijn

Tom, 

I will not go into whether Tom or router vendors is right from IETF perspective as here is not the place to discuss. 

But it seems to me that the motivation behind these patches is just to pushback on the current IETF proposals. 

The patches timeline is completely aligned with when IETF threads get into tough discussions (May 2019, August 2019, and now). 

I’m not the one to decide, but IMO people should not add stuff to the kernel just to enforce their opinions on other mailers. 

 
On Fri, 3 Jan 2020 16:37:33 -0800
Tom Herbert <tom@herbertland.com> wrote:

> On Fri, Jan 3, 2020 at 3:53 PM Erik Kline <ek@loon.com> wrote:
> >
> > On Fri, 3 Jan 2020 at 15:49, Tom Herbert <tom@herbertland.com> wrote:
> > >
> > > On Fri, Jan 3, 2020 at 2:57 PM David Miller <davem@davemloft.net> wrote:
> > > >
> > > > From: Tom Herbert <tom@herbertland.com>
> > > > Date: Fri, 3 Jan 2020 14:31:58 -0800
> > > >
> > > > > On Fri, Jan 3, 2020 at 12:45 PM David Miller <davem@davemloft.net> wrote:
> > > > >>
> > > > >> From: Tom Herbert <tom@herbertland.com>
> > > > >> Date: Fri, 3 Jan 2020 09:35:08 -0800
> > > > >>
> > > > >> > The real way to combat this provide open implementation that
> > > > >> > demonstrates the correct use of the protocols and show that's more
> > > > >> > extensible and secure than these "hacks".
> > > > >>
> > > > >> Keep dreaming, this won't stop Cisco from doing whatever it wants to do.
> > > > >
> > > > > See QUIC. See TLS. See TCP fast open. See transport layer encryption.
> > > > > These are prime examples where we've steered the Internet from host
> > > > > protocols and implementation to successfully obsolete or at least work
> > > > > around protocol ossification that was perpetuated by router vendors.
> > > > > Cisco is not the Internet!
> > > >
> > > > Seriously, I wish you luck stopping the SRv6 header insertion stuff.
> > > >
> > > Dave,
> > >
> > > I agree we can't stop it, but maybe we can steer it to be at least
> > > palatable. There are valid use cases for extension header insertion.
> > > Ironically, SRv6 header insertion isn't one of them; the proponents
> > > have failed to offer even a single reason why the alternative of IPv6
> > > encapsulation isn't sufficient (believe me, we've asked _many_ times
> > > for some justification and only get hand waving!). There are, however,
> > > some interesting uses cases like in IOAM where the operator would like
> > > to annotate packets as they traverse the network. Encapsulation is
> > > insufficient if they don't know what the end point would be or they
> > > don't want the annotation to change the path the packets take (versus
> > > those that aren't annotated).
> > >
> > > The salient problem with extension header insertion is lost of
> >
> > And the problems that can be introduced by changing the effective path MTU...
> >
> Eric,
> 
> Yep, increasing the size of packet in transit potentially wreaks havoc
> on PMTU discovery, however I personally think that the issue might be
> overblown. We already have the same problem when tunneling is done in
> the network since most tunneling implementations and deployments just
> assume the operator has set large enough MTUs. As long as all the
> overhead inserted into the packet doesn't reduce the end host PMTU
> below 1280, PMTU discovery and probably even PTB for a packet with
> inserted headers still has right effect.
> 
> > > attribution. It is fundamental in the IP protocol that the contents of
> > > a packet are attributed to the source host identified by the source
> > > address. If some intermediate node inserts an extension header that
> > > subsequently breaks the packet downstream then there is no obvious way
> > > to debug this. If an ICMP message is sent because of the receiving
> > > data, then receiving host can't do much with it; it's not the source
> > > of the data in error and nothing in the packet tells who the culprit
> > > is. The Cisco guys have at least conceded one point on SRv6 insertion
> > > due to pushback on this, their latest draft only does SRv6 insertion
> > > on packets that have already been encapsulated in IPIP on ingress into
> > > the domain. This is intended to at least restrict the modified packets
> > > to a controlled domain (I'm note sure if any implementations enforce
> > > this though). My proposal is to require an "attribution" HBH option
> > > that would clearly identify inserted data put in a packet by
> > > middleboxes (draft-herbert-6man-eh-attrib-00). This is a tradeoff to
> > > allow extension header insertion, but require protocol to give
> > > attribution and make it at least somewhat robust and manageable.
> > >
> > > Tom
> >
> > FWIW the SRv6 header insertion stuff is still under discussion in
> > spring wg (last I knew).  I proposed one option that could be used to
> 
> It's also under discussion in 6man.
> 
> > avoid insertion (allow for extra scratch space
> > https://mailarchive.ietf.org/arch/msg/spring/UhThRTNxbHWNiMGgRi3U0SqLaDA),
> > but nothing has been conclusively resolved last I checked.
> >
> 
> I saw your proposal. It's a good idea from POV to be conformant with
> RFC8200 and avoid the PMTU problems, but the header insertion
> proponents aren't going to like it at all. First, it means that the
> source is in control of the insertion policy and host is required to
> change-- no way they'll buy into that ;-). Secondly, if the scratch
> space isn't used they'll undoubtedly claim that is unnecessary
> overhead.
> 
> Tom
> 
> > As everyone probably knows, the draft-ietf-* documents are
> > working-group-adopted documents (though final publication is never
> > guaranteed).  My current reading of 6man tea leaves is that neither
> > "ICMP limits" and "MTU option" docs were terribly contentious.
> > Whether code reorg is important for implementing these I'm not
> > competent enough to say.


-- 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure
  2020-01-04  8:05                     ` kernel Dev
@ 2020-01-04 17:45                       ` Tom Herbert
  2020-01-04 19:02                         ` kernel Dev
  0 siblings, 1 reply; 26+ messages in thread
From: Tom Herbert @ 2020-01-04 17:45 UTC (permalink / raw)
  To: kernel Dev
  Cc: Erik Kline, David Miller, Linux Kernel Network Developers,
	Simon Horman, Willem de Bruijn

On Sat, Jan 4, 2020 at 12:05 AM kernel Dev <ahabdels.dev@gmail.com> wrote:
>
> Tom,
>
> I will not go into whether Tom or router vendors is right from IETF perspective as here is not the place to discuss.
>
> But it seems to me that the motivation behind these patches is just to pushback on the current IETF proposals.
>
Sorry, but that is completely untrue. The patches are a general
improvement. The ability to allow modules to register handlers for
options code points has nothing to do with "pushback on the current
IETF protocols". This sort of registration is a mechanism used all
over the place. Similarly, allowing non-priveledged users to send
options is not for any specific protocol-- it is a *general*
mechanism.

> The patches timeline is completely aligned with when IETF threads get into tough discussions (May 2019, August 2019, and now).
>
Yes, discussion about new protocols in IETF tends to correlate with
development and implementation of the protocols. That shouldn't
surprise anyone. SRv6 for instance was highly controversial in IETF
and yet the patches went in.

> I’m not the one to decide, but IMO people should not add stuff to the kernel just to enforce their opinions on other mailers.

I take exception with your insinuation. Seeing as how you might be new
to Linux kernel development I will ignore it. But, in the future, I
strongly suggest you be careful about accusing people about their
motivations based solely on one interaction.

Tom


>
>
> On Fri, 3 Jan 2020 16:37:33 -0800
> Tom Herbert <tom@herbertland.com> wrote:
>
> > On Fri, Jan 3, 2020 at 3:53 PM Erik Kline <ek@loon.com> wrote:
> > >
> > > On Fri, 3 Jan 2020 at 15:49, Tom Herbert <tom@herbertland.com> wrote:
> > > >
> > > > On Fri, Jan 3, 2020 at 2:57 PM David Miller <davem@davemloft.net> wrote:
> > > > >
> > > > > From: Tom Herbert <tom@herbertland.com>
> > > > > Date: Fri, 3 Jan 2020 14:31:58 -0800
> > > > >
> > > > > > On Fri, Jan 3, 2020 at 12:45 PM David Miller <davem@davemloft.net> wrote:
> > > > > >>
> > > > > >> From: Tom Herbert <tom@herbertland.com>
> > > > > >> Date: Fri, 3 Jan 2020 09:35:08 -0800
> > > > > >>
> > > > > >> > The real way to combat this provide open implementation that
> > > > > >> > demonstrates the correct use of the protocols and show that's more
> > > > > >> > extensible and secure than these "hacks".
> > > > > >>
> > > > > >> Keep dreaming, this won't stop Cisco from doing whatever it wants to do.
> > > > > >
> > > > > > See QUIC. See TLS. See TCP fast open. See transport layer encryption.
> > > > > > These are prime examples where we've steered the Internet from host
> > > > > > protocols and implementation to successfully obsolete or at least work
> > > > > > around protocol ossification that was perpetuated by router vendors.
> > > > > > Cisco is not the Internet!
> > > > >
> > > > > Seriously, I wish you luck stopping the SRv6 header insertion stuff.
> > > > >
> > > > Dave,
> > > >
> > > > I agree we can't stop it, but maybe we can steer it to be at least
> > > > palatable. There are valid use cases for extension header insertion.
> > > > Ironically, SRv6 header insertion isn't one of them; the proponents
> > > > have failed to offer even a single reason why the alternative of IPv6
> > > > encapsulation isn't sufficient (believe me, we've asked _many_ times
> > > > for some justification and only get hand waving!). There are, however,
> > > > some interesting uses cases like in IOAM where the operator would like
> > > > to annotate packets as they traverse the network. Encapsulation is
> > > > insufficient if they don't know what the end point would be or they
> > > > don't want the annotation to change the path the packets take (versus
> > > > those that aren't annotated).
> > > >
> > > > The salient problem with extension header insertion is lost of
> > >
> > > And the problems that can be introduced by changing the effective path MTU...
> > >
> > Eric,
> >
> > Yep, increasing the size of packet in transit potentially wreaks havoc
> > on PMTU discovery, however I personally think that the issue might be
> > overblown. We already have the same problem when tunneling is done in
> > the network since most tunneling implementations and deployments just
> > assume the operator has set large enough MTUs. As long as all the
> > overhead inserted into the packet doesn't reduce the end host PMTU
> > below 1280, PMTU discovery and probably even PTB for a packet with
> > inserted headers still has right effect.
> >
> > > > attribution. It is fundamental in the IP protocol that the contents of
> > > > a packet are attributed to the source host identified by the source
> > > > address. If some intermediate node inserts an extension header that
> > > > subsequently breaks the packet downstream then there is no obvious way
> > > > to debug this. If an ICMP message is sent because of the receiving
> > > > data, then receiving host can't do much with it; it's not the source
> > > > of the data in error and nothing in the packet tells who the culprit
> > > > is. The Cisco guys have at least conceded one point on SRv6 insertion
> > > > due to pushback on this, their latest draft only does SRv6 insertion
> > > > on packets that have already been encapsulated in IPIP on ingress into
> > > > the domain. This is intended to at least restrict the modified packets
> > > > to a controlled domain (I'm note sure if any implementations enforce
> > > > this though). My proposal is to require an "attribution" HBH option
> > > > that would clearly identify inserted data put in a packet by
> > > > middleboxes (draft-herbert-6man-eh-attrib-00). This is a tradeoff to
> > > > allow extension header insertion, but require protocol to give
> > > > attribution and make it at least somewhat robust and manageable.
> > > >
> > > > Tom
> > >
> > > FWIW the SRv6 header insertion stuff is still under discussion in
> > > spring wg (last I knew).  I proposed one option that could be used to
> >
> > It's also under discussion in 6man.
> >
> > > avoid insertion (allow for extra scratch space
> > > https://mailarchive.ietf.org/arch/msg/spring/UhThRTNxbHWNiMGgRi3U0SqLaDA),
> > > but nothing has been conclusively resolved last I checked.
> > >
> >
> > I saw your proposal. It's a good idea from POV to be conformant with
> > RFC8200 and avoid the PMTU problems, but the header insertion
> > proponents aren't going to like it at all. First, it means that the
> > source is in control of the insertion policy and host is required to
> > change-- no way they'll buy into that ;-). Secondly, if the scratch
> > space isn't used they'll undoubtedly claim that is unnecessary
> > overhead.
> >
> > Tom
> >
> > > As everyone probably knows, the draft-ietf-* documents are
> > > working-group-adopted documents (though final publication is never
> > > guaranteed).  My current reading of 6man tea leaves is that neither
> > > "ICMP limits" and "MTU option" docs were terribly contentious.
> > > Whether code reorg is important for implementing these I'm not
> > > competent enough to say.
>
>
> --

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure
  2020-01-04 17:45                       ` Tom Herbert
@ 2020-01-04 19:02                         ` kernel Dev
  2020-01-04 19:27                           ` kernel Dev
  2020-01-04 20:22                           ` Tom Herbert
  0 siblings, 2 replies; 26+ messages in thread
From: kernel Dev @ 2020-01-04 19:02 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Erik Kline, David Miller, Linux Kernel Network Developers,
	Simon Horman, Willem de Bruijn

Hi Tom, 

I wasn’t accusing anyone in my message. It is a personal opinion. 
If my message was misunderstood, then please accept my apologies. 

Also SRv6 is not highly controversial and this is proven by its adoption rate and ecosystem.
I have never seen a highly controversial technology that has (in two years) at least 8 public deployments and is supported by 18 publicly known routing platforms from 8 different vendors. Also has support in the mainline of Linux kernel, FD.io VPP, Wireshark, tcpdump, iptables and nftables.

If you are a network operator, would you deply a highly controversial technology in your network ? 


On Sat, 4 Jan 2020 09:45:59 -0800
Tom Herbert <tom@herbertland.com> wrote:

> On Sat, Jan 4, 2020 at 12:05 AM kernel Dev <ahabdels.dev@gmail.com> wrote:
> >
> > Tom,
> >
> > I will not go into whether Tom or router vendors is right from IETF perspective as here is not the place to discuss.
> >
> > But it seems to me that the motivation behind these patches is just to pushback on the current IETF proposals.
> >
> Sorry, but that is completely untrue. The patches are a general
> improvement. The ability to allow modules to register handlers for
> options code points has nothing to do with "pushback on the current
> IETF protocols". This sort of registration is a mechanism used all
> over the place. Similarly, allowing non-priveledged users to send
> options is not for any specific protocol-- it is a *general*
> mechanism.
> 
> > The patches timeline is completely aligned with when IETF threads get into tough discussions (May 2019, August 2019, and now).
> >
> Yes, discussion about new protocols in IETF tends to correlate with
> development and implementation of the protocols. That shouldn't
> surprise anyone. SRv6 for instance was highly controversial in IETF
> and yet the patches went in.
> 
> > I’m not the one to decide, but IMO people should not add stuff to the kernel just to enforce their opinions on other mailers.
> 
> I take exception with your insinuation. Seeing as how you might be new
> to Linux kernel development I will ignore it. But, in the future, I
> strongly suggest you be careful about accusing people about their
> motivations based solely on one interaction.
> 
> Tom
> 
> 
> >
> >
> > On Fri, 3 Jan 2020 16:37:33 -0800
> > Tom Herbert <tom@herbertland.com> wrote:
> >
> > > On Fri, Jan 3, 2020 at 3:53 PM Erik Kline <ek@loon.com> wrote:
> > > >
> > > > On Fri, 3 Jan 2020 at 15:49, Tom Herbert <tom@herbertland.com> wrote:
> > > > >
> > > > > On Fri, Jan 3, 2020 at 2:57 PM David Miller <davem@davemloft.net> wrote:
> > > > > >
> > > > > > From: Tom Herbert <tom@herbertland.com>
> > > > > > Date: Fri, 3 Jan 2020 14:31:58 -0800
> > > > > >
> > > > > > > On Fri, Jan 3, 2020 at 12:45 PM David Miller <davem@davemloft.net> wrote:
> > > > > > >>
> > > > > > >> From: Tom Herbert <tom@herbertland.com>
> > > > > > >> Date: Fri, 3 Jan 2020 09:35:08 -0800
> > > > > > >>
> > > > > > >> > The real way to combat this provide open implementation that
> > > > > > >> > demonstrates the correct use of the protocols and show that's more
> > > > > > >> > extensible and secure than these "hacks".
> > > > > > >>
> > > > > > >> Keep dreaming, this won't stop Cisco from doing whatever it wants to do.
> > > > > > >
> > > > > > > See QUIC. See TLS. See TCP fast open. See transport layer encryption.
> > > > > > > These are prime examples where we've steered the Internet from host
> > > > > > > protocols and implementation to successfully obsolete or at least work
> > > > > > > around protocol ossification that was perpetuated by router vendors.
> > > > > > > Cisco is not the Internet!
> > > > > >
> > > > > > Seriously, I wish you luck stopping the SRv6 header insertion stuff.
> > > > > >
> > > > > Dave,
> > > > >
> > > > > I agree we can't stop it, but maybe we can steer it to be at least
> > > > > palatable. There are valid use cases for extension header insertion.
> > > > > Ironically, SRv6 header insertion isn't one of them; the proponents
> > > > > have failed to offer even a single reason why the alternative of IPv6
> > > > > encapsulation isn't sufficient (believe me, we've asked _many_ times
> > > > > for some justification and only get hand waving!). There are, however,
> > > > > some interesting uses cases like in IOAM where the operator would like
> > > > > to annotate packets as they traverse the network. Encapsulation is
> > > > > insufficient if they don't know what the end point would be or they
> > > > > don't want the annotation to change the path the packets take (versus
> > > > > those that aren't annotated).
> > > > >
> > > > > The salient problem with extension header insertion is lost of
> > > >
> > > > And the problems that can be introduced by changing the effective path MTU...
> > > >
> > > Eric,
> > >
> > > Yep, increasing the size of packet in transit potentially wreaks havoc
> > > on PMTU discovery, however I personally think that the issue might be
> > > overblown. We already have the same problem when tunneling is done in
> > > the network since most tunneling implementations and deployments just
> > > assume the operator has set large enough MTUs. As long as all the
> > > overhead inserted into the packet doesn't reduce the end host PMTU
> > > below 1280, PMTU discovery and probably even PTB for a packet with
> > > inserted headers still has right effect.
> > >
> > > > > attribution. It is fundamental in the IP protocol that the contents of
> > > > > a packet are attributed to the source host identified by the source
> > > > > address. If some intermediate node inserts an extension header that
> > > > > subsequently breaks the packet downstream then there is no obvious way
> > > > > to debug this. If an ICMP message is sent because of the receiving
> > > > > data, then receiving host can't do much with it; it's not the source
> > > > > of the data in error and nothing in the packet tells who the culprit
> > > > > is. The Cisco guys have at least conceded one point on SRv6 insertion
> > > > > due to pushback on this, their latest draft only does SRv6 insertion
> > > > > on packets that have already been encapsulated in IPIP on ingress into
> > > > > the domain. This is intended to at least restrict the modified packets
> > > > > to a controlled domain (I'm note sure if any implementations enforce
> > > > > this though). My proposal is to require an "attribution" HBH option
> > > > > that would clearly identify inserted data put in a packet by
> > > > > middleboxes (draft-herbert-6man-eh-attrib-00). This is a tradeoff to
> > > > > allow extension header insertion, but require protocol to give
> > > > > attribution and make it at least somewhat robust and manageable.
> > > > >
> > > > > Tom
> > > >
> > > > FWIW the SRv6 header insertion stuff is still under discussion in
> > > > spring wg (last I knew).  I proposed one option that could be used to
> > >
> > > It's also under discussion in 6man.
> > >
> > > > avoid insertion (allow for extra scratch space
> > > > https://mailarchive.ietf.org/arch/msg/spring/UhThRTNxbHWNiMGgRi3U0SqLaDA),
> > > > but nothing has been conclusively resolved last I checked.
> > > >
> > >
> > > I saw your proposal. It's a good idea from POV to be conformant with
> > > RFC8200 and avoid the PMTU problems, but the header insertion
> > > proponents aren't going to like it at all. First, it means that the
> > > source is in control of the insertion policy and host is required to
> > > change-- no way they'll buy into that ;-). Secondly, if the scratch
> > > space isn't used they'll undoubtedly claim that is unnecessary
> > > overhead.
> > >
> > > Tom
> > >
> > > > As everyone probably knows, the draft-ietf-* documents are
> > > > working-group-adopted documents (though final publication is never
> > > > guaranteed).  My current reading of 6man tea leaves is that neither
> > > > "ICMP limits" and "MTU option" docs were terribly contentious.
> > > > Whether code reorg is important for implementing these I'm not
> > > > competent enough to say.
> >
> >
> > --


-- 
kernel Dev <ahabdels.dev@gmail.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure
  2020-01-04 19:02                         ` kernel Dev
@ 2020-01-04 19:27                           ` kernel Dev
  2020-01-04 20:22                           ` Tom Herbert
  1 sibling, 0 replies; 26+ messages in thread
From: kernel Dev @ 2020-01-04 19:27 UTC (permalink / raw)
  To: kernel Dev
  Cc: Tom Herbert, Erik Kline, David Miller,
	Linux Kernel Network Developers, Simon Horman, Willem de Bruijn

sorry forgot to add the link for the public SRv6 deployment. 
https://tools.ietf.org/html/draft-matsushima-spring-srv6-deployment-status-04

On Sat, 4 Jan 2020 21:02:29 +0200
kernel Dev <ahabdels.dev@gmail.com> wrote:

> Hi Tom, 
> 
> I wasn’t accusing anyone in my message. It is a personal opinion. 
> If my message was misunderstood, then please accept my apologies. 
> 
> Also SRv6 is not highly controversial and this is proven by its adoption rate and ecosystem.
> I have never seen a highly controversial technology that has (in two years) at least 8 public deployments and is supported by 18 publicly known routing platforms from 8 different vendors. Also has support in the mainline of Linux kernel, FD.io VPP, Wireshark, tcpdump, iptables and nftables.
> 
> If you are a network operator, would you deply a highly controversial technology in your network ? 
> 
> 
> On Sat, 4 Jan 2020 09:45:59 -0800
> Tom Herbert <tom@herbertland.com> wrote:
> 
> > On Sat, Jan 4, 2020 at 12:05 AM kernel Dev <ahabdels.dev@gmail.com> wrote:
> > >
> > > Tom,
> > >
> > > I will not go into whether Tom or router vendors is right from IETF perspective as here is not the place to discuss.
> > >
> > > But it seems to me that the motivation behind these patches is just to pushback on the current IETF proposals.
> > >
> > Sorry, but that is completely untrue. The patches are a general
> > improvement. The ability to allow modules to register handlers for
> > options code points has nothing to do with "pushback on the current
> > IETF protocols". This sort of registration is a mechanism used all
> > over the place. Similarly, allowing non-priveledged users to send
> > options is not for any specific protocol-- it is a *general*
> > mechanism.
> > 
> > > The patches timeline is completely aligned with when IETF threads get into tough discussions (May 2019, August 2019, and now).
> > >
> > Yes, discussion about new protocols in IETF tends to correlate with
> > development and implementation of the protocols. That shouldn't
> > surprise anyone. SRv6 for instance was highly controversial in IETF
> > and yet the patches went in.
> > 
> > > I’m not the one to decide, but IMO people should not add stuff to the kernel just to enforce their opinions on other mailers.
> > 
> > I take exception with your insinuation. Seeing as how you might be new
> > to Linux kernel development I will ignore it. But, in the future, I
> > strongly suggest you be careful about accusing people about their
> > motivations based solely on one interaction.
> > 
> > Tom
> > 
> > 
> > >
> > >
> > > On Fri, 3 Jan 2020 16:37:33 -0800
> > > Tom Herbert <tom@herbertland.com> wrote:
> > >
> > > > On Fri, Jan 3, 2020 at 3:53 PM Erik Kline <ek@loon.com> wrote:
> > > > >
> > > > > On Fri, 3 Jan 2020 at 15:49, Tom Herbert <tom@herbertland.com> wrote:
> > > > > >
> > > > > > On Fri, Jan 3, 2020 at 2:57 PM David Miller <davem@davemloft.net> wrote:
> > > > > > >
> > > > > > > From: Tom Herbert <tom@herbertland.com>
> > > > > > > Date: Fri, 3 Jan 2020 14:31:58 -0800
> > > > > > >
> > > > > > > > On Fri, Jan 3, 2020 at 12:45 PM David Miller <davem@davemloft.net> wrote:
> > > > > > > >>
> > > > > > > >> From: Tom Herbert <tom@herbertland.com>
> > > > > > > >> Date: Fri, 3 Jan 2020 09:35:08 -0800
> > > > > > > >>
> > > > > > > >> > The real way to combat this provide open implementation that
> > > > > > > >> > demonstrates the correct use of the protocols and show that's more
> > > > > > > >> > extensible and secure than these "hacks".
> > > > > > > >>
> > > > > > > >> Keep dreaming, this won't stop Cisco from doing whatever it wants to do.
> > > > > > > >
> > > > > > > > See QUIC. See TLS. See TCP fast open. See transport layer encryption.
> > > > > > > > These are prime examples where we've steered the Internet from host
> > > > > > > > protocols and implementation to successfully obsolete or at least work
> > > > > > > > around protocol ossification that was perpetuated by router vendors.
> > > > > > > > Cisco is not the Internet!
> > > > > > >
> > > > > > > Seriously, I wish you luck stopping the SRv6 header insertion stuff.
> > > > > > >
> > > > > > Dave,
> > > > > >
> > > > > > I agree we can't stop it, but maybe we can steer it to be at least
> > > > > > palatable. There are valid use cases for extension header insertion.
> > > > > > Ironically, SRv6 header insertion isn't one of them; the proponents
> > > > > > have failed to offer even a single reason why the alternative of IPv6
> > > > > > encapsulation isn't sufficient (believe me, we've asked _many_ times
> > > > > > for some justification and only get hand waving!). There are, however,
> > > > > > some interesting uses cases like in IOAM where the operator would like
> > > > > > to annotate packets as they traverse the network. Encapsulation is
> > > > > > insufficient if they don't know what the end point would be or they
> > > > > > don't want the annotation to change the path the packets take (versus
> > > > > > those that aren't annotated).
> > > > > >
> > > > > > The salient problem with extension header insertion is lost of
> > > > >
> > > > > And the problems that can be introduced by changing the effective path MTU...
> > > > >
> > > > Eric,
> > > >
> > > > Yep, increasing the size of packet in transit potentially wreaks havoc
> > > > on PMTU discovery, however I personally think that the issue might be
> > > > overblown. We already have the same problem when tunneling is done in
> > > > the network since most tunneling implementations and deployments just
> > > > assume the operator has set large enough MTUs. As long as all the
> > > > overhead inserted into the packet doesn't reduce the end host PMTU
> > > > below 1280, PMTU discovery and probably even PTB for a packet with
> > > > inserted headers still has right effect.
> > > >
> > > > > > attribution. It is fundamental in the IP protocol that the contents of
> > > > > > a packet are attributed to the source host identified by the source
> > > > > > address. If some intermediate node inserts an extension header that
> > > > > > subsequently breaks the packet downstream then there is no obvious way
> > > > > > to debug this. If an ICMP message is sent because of the receiving
> > > > > > data, then receiving host can't do much with it; it's not the source
> > > > > > of the data in error and nothing in the packet tells who the culprit
> > > > > > is. The Cisco guys have at least conceded one point on SRv6 insertion
> > > > > > due to pushback on this, their latest draft only does SRv6 insertion
> > > > > > on packets that have already been encapsulated in IPIP on ingress into
> > > > > > the domain. This is intended to at least restrict the modified packets
> > > > > > to a controlled domain (I'm note sure if any implementations enforce
> > > > > > this though). My proposal is to require an "attribution" HBH option
> > > > > > that would clearly identify inserted data put in a packet by
> > > > > > middleboxes (draft-herbert-6man-eh-attrib-00). This is a tradeoff to
> > > > > > allow extension header insertion, but require protocol to give
> > > > > > attribution and make it at least somewhat robust and manageable.
> > > > > >
> > > > > > Tom
> > > > >
> > > > > FWIW the SRv6 header insertion stuff is still under discussion in
> > > > > spring wg (last I knew).  I proposed one option that could be used to
> > > >
> > > > It's also under discussion in 6man.
> > > >
> > > > > avoid insertion (allow for extra scratch space
> > > > > https://mailarchive.ietf.org/arch/msg/spring/UhThRTNxbHWNiMGgRi3U0SqLaDA),
> > > > > but nothing has been conclusively resolved last I checked.
> > > > >
> > > >
> > > > I saw your proposal. It's a good idea from POV to be conformant with
> > > > RFC8200 and avoid the PMTU problems, but the header insertion
> > > > proponents aren't going to like it at all. First, it means that the
> > > > source is in control of the insertion policy and host is required to
> > > > change-- no way they'll buy into that ;-). Secondly, if the scratch
> > > > space isn't used they'll undoubtedly claim that is unnecessary
> > > > overhead.
> > > >
> > > > Tom
> > > >
> > > > > As everyone probably knows, the draft-ietf-* documents are
> > > > > working-group-adopted documents (though final publication is never
> > > > > guaranteed).  My current reading of 6man tea leaves is that neither
> > > > > "ICMP limits" and "MTU option" docs were terribly contentious.
> > > > > Whether code reorg is important for implementing these I'm not
> > > > > competent enough to say.
> > >
> > >
> > > --
> 
> 
> -- 
> kernel Dev <ahabdels.dev@gmail.com>


-- 
kernel Dev <ahabdels.dev@gmail.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure
  2020-01-04 19:02                         ` kernel Dev
  2020-01-04 19:27                           ` kernel Dev
@ 2020-01-04 20:22                           ` Tom Herbert
  2020-01-07 14:27                             ` kernel Dev
  1 sibling, 1 reply; 26+ messages in thread
From: Tom Herbert @ 2020-01-04 20:22 UTC (permalink / raw)
  To: kernel Dev
  Cc: Erik Kline, David Miller, Linux Kernel Network Developers,
	Simon Horman, Willem de Bruijn

On Sat, Jan 4, 2020 at 11:02 AM kernel Dev <ahabdels.dev@gmail.com> wrote:
>
> Hi Tom,
>
> I wasn’t accusing anyone in my message. It is a personal opinion.
> If my message was misunderstood, then please accept my apologies.
>
> Also SRv6 is not highly controversial and this is proven by its adoption rate and ecosystem.

It was controversial in the five years it look to get to WGLC, and
thanks to things like extension header insertion it will remain
controversial.

> I have never seen a highly controversial technology that has (in two years) at least 8 public deployments and is supported by 18 publicly known routing platforms from 8 different vendors. Also has support in the mainline of Linux kernel, FD.io VPP, Wireshark, tcpdump, iptables and nftables.
>

These support only support some subset of the protocol. For instance,
SRv6 TLVs including HMAC TLV are defined in SRH, however only Linux
claims to support them. However, the Linux implementation isn't
protocol conformant: it doesn't support required padding options and
just assumes HMAC is the one and only TLV. Previously, I've posted
patches to fix that based on the generic parser in this patch set.
Those original patches for fixing SRv6 would now be out of date since
it was decided to change the PADN option type in SRv6 to be 5 instead
of 1 thereby breaking compatibility with HBH and Destination Options.
Seeing as how you seem interested in SRv6, it would be nice if you
could look into fixing the SRv6 TLV implementation to be conformant (I
can repost my original patches if that would help).

Thanks,
Tom


> If you are a network operator, would you deply a highly controversial technology in your network ?
>
>
> On Sat, 4 Jan 2020 09:45:59 -0800
> Tom Herbert <tom@herbertland.com> wrote:
>
> > On Sat, Jan 4, 2020 at 12:05 AM kernel Dev <ahabdels.dev@gmail.com> wrote:
> > >
> > > Tom,
> > >
> > > I will not go into whether Tom or router vendors is right from IETF perspective as here is not the place to discuss.
> > >
> > > But it seems to me that the motivation behind these patches is just to pushback on the current IETF proposals.
> > >
> > Sorry, but that is completely untrue. The patches are a general
> > improvement. The ability to allow modules to register handlers for
> > options code points has nothing to do with "pushback on the current
> > IETF protocols". This sort of registration is a mechanism used all
> > over the place. Similarly, allowing non-priveledged users to send
> > options is not for any specific protocol-- it is a *general*
> > mechanism.
> >
> > > The patches timeline is completely aligned with when IETF threads get into tough discussions (May 2019, August 2019, and now).
> > >
> > Yes, discussion about new protocols in IETF tends to correlate with
> > development and implementation of the protocols. That shouldn't
> > surprise anyone. SRv6 for instance was highly controversial in IETF
> > and yet the patches went in.
> >
> > > I’m not the one to decide, but IMO people should not add stuff to the kernel just to enforce their opinions on other mailers.
> >
> > I take exception with your insinuation. Seeing as how you might be new
> > to Linux kernel development I will ignore it. But, in the future, I
> > strongly suggest you be careful about accusing people about their
> > motivations based solely on one interaction.
> >
> > Tom
> >
> >
> > >
> > >
> > > On Fri, 3 Jan 2020 16:37:33 -0800
> > > Tom Herbert <tom@herbertland.com> wrote:
> > >
> > > > On Fri, Jan 3, 2020 at 3:53 PM Erik Kline <ek@loon.com> wrote:
> > > > >
> > > > > On Fri, 3 Jan 2020 at 15:49, Tom Herbert <tom@herbertland.com> wrote:
> > > > > >
> > > > > > On Fri, Jan 3, 2020 at 2:57 PM David Miller <davem@davemloft.net> wrote:
> > > > > > >
> > > > > > > From: Tom Herbert <tom@herbertland.com>
> > > > > > > Date: Fri, 3 Jan 2020 14:31:58 -0800
> > > > > > >
> > > > > > > > On Fri, Jan 3, 2020 at 12:45 PM David Miller <davem@davemloft.net> wrote:
> > > > > > > >>
> > > > > > > >> From: Tom Herbert <tom@herbertland.com>
> > > > > > > >> Date: Fri, 3 Jan 2020 09:35:08 -0800
> > > > > > > >>
> > > > > > > >> > The real way to combat this provide open implementation that
> > > > > > > >> > demonstrates the correct use of the protocols and show that's more
> > > > > > > >> > extensible and secure than these "hacks".
> > > > > > > >>
> > > > > > > >> Keep dreaming, this won't stop Cisco from doing whatever it wants to do.
> > > > > > > >
> > > > > > > > See QUIC. See TLS. See TCP fast open. See transport layer encryption.
> > > > > > > > These are prime examples where we've steered the Internet from host
> > > > > > > > protocols and implementation to successfully obsolete or at least work
> > > > > > > > around protocol ossification that was perpetuated by router vendors.
> > > > > > > > Cisco is not the Internet!
> > > > > > >
> > > > > > > Seriously, I wish you luck stopping the SRv6 header insertion stuff.
> > > > > > >
> > > > > > Dave,
> > > > > >
> > > > > > I agree we can't stop it, but maybe we can steer it to be at least
> > > > > > palatable. There are valid use cases for extension header insertion.
> > > > > > Ironically, SRv6 header insertion isn't one of them; the proponents
> > > > > > have failed to offer even a single reason why the alternative of IPv6
> > > > > > encapsulation isn't sufficient (believe me, we've asked _many_ times
> > > > > > for some justification and only get hand waving!). There are, however,
> > > > > > some interesting uses cases like in IOAM where the operator would like
> > > > > > to annotate packets as they traverse the network. Encapsulation is
> > > > > > insufficient if they don't know what the end point would be or they
> > > > > > don't want the annotation to change the path the packets take (versus
> > > > > > those that aren't annotated).
> > > > > >
> > > > > > The salient problem with extension header insertion is lost of
> > > > >
> > > > > And the problems that can be introduced by changing the effective path MTU...
> > > > >
> > > > Eric,
> > > >
> > > > Yep, increasing the size of packet in transit potentially wreaks havoc
> > > > on PMTU discovery, however I personally think that the issue might be
> > > > overblown. We already have the same problem when tunneling is done in
> > > > the network since most tunneling implementations and deployments just
> > > > assume the operator has set large enough MTUs. As long as all the
> > > > overhead inserted into the packet doesn't reduce the end host PMTU
> > > > below 1280, PMTU discovery and probably even PTB for a packet with
> > > > inserted headers still has right effect.
> > > >
> > > > > > attribution. It is fundamental in the IP protocol that the contents of
> > > > > > a packet are attributed to the source host identified by the source
> > > > > > address. If some intermediate node inserts an extension header that
> > > > > > subsequently breaks the packet downstream then there is no obvious way
> > > > > > to debug this. If an ICMP message is sent because of the receiving
> > > > > > data, then receiving host can't do much with it; it's not the source
> > > > > > of the data in error and nothing in the packet tells who the culprit
> > > > > > is. The Cisco guys have at least conceded one point on SRv6 insertion
> > > > > > due to pushback on this, their latest draft only does SRv6 insertion
> > > > > > on packets that have already been encapsulated in IPIP on ingress into
> > > > > > the domain. This is intended to at least restrict the modified packets
> > > > > > to a controlled domain (I'm note sure if any implementations enforce
> > > > > > this though). My proposal is to require an "attribution" HBH option
> > > > > > that would clearly identify inserted data put in a packet by
> > > > > > middleboxes (draft-herbert-6man-eh-attrib-00). This is a tradeoff to
> > > > > > allow extension header insertion, but require protocol to give
> > > > > > attribution and make it at least somewhat robust and manageable.
> > > > > >
> > > > > > Tom
> > > > >
> > > > > FWIW the SRv6 header insertion stuff is still under discussion in
> > > > > spring wg (last I knew).  I proposed one option that could be used to
> > > >
> > > > It's also under discussion in 6man.
> > > >
> > > > > avoid insertion (allow for extra scratch space
> > > > > https://mailarchive.ietf.org/arch/msg/spring/UhThRTNxbHWNiMGgRi3U0SqLaDA),
> > > > > but nothing has been conclusively resolved last I checked.
> > > > >
> > > >
> > > > I saw your proposal. It's a good idea from POV to be conformant with
> > > > RFC8200 and avoid the PMTU problems, but the header insertion
> > > > proponents aren't going to like it at all. First, it means that the
> > > > source is in control of the insertion policy and host is required to
> > > > change-- no way they'll buy into that ;-). Secondly, if the scratch
> > > > space isn't used they'll undoubtedly claim that is unnecessary
> > > > overhead.
> > > >
> > > > Tom
> > > >
> > > > > As everyone probably knows, the draft-ietf-* documents are
> > > > > working-group-adopted documents (though final publication is never
> > > > > guaranteed).  My current reading of 6man tea leaves is that neither
> > > > > "ICMP limits" and "MTU option" docs were terribly contentious.
> > > > > Whether code reorg is important for implementing these I'm not
> > > > > competent enough to say.
> > >
> > >
> > > --
>
>
> --
> kernel Dev <ahabdels.dev@gmail.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure
  2020-01-04 20:22                           ` Tom Herbert
@ 2020-01-07 14:27                             ` kernel Dev
  0 siblings, 0 replies; 26+ messages in thread
From: kernel Dev @ 2020-01-07 14:27 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Erik Kline, David Miller, Linux Kernel Network Developers,
	Simon Horman, Willem de Bruijn

On Sat, 4 Jan 2020 12:22:17 -0800
Tom Herbert <tom@herbertland.com> wrote:

> On Sat, Jan 4, 2020 at 11:02 AM kernel Dev <ahabdels.dev@gmail.com> wrote:
> >
> > Hi Tom,
> >
> > I wasn’t accusing anyone in my message. It is a personal opinion.
> > If my message was misunderstood, then please accept my apologies.
> >
> > Also SRv6 is not highly controversial and this is proven by its adoption rate and ecosystem.
> 
> It was controversial in the five years it look to get to WGLC, and
> thanks to things like extension header insertion it will remain
> controversial.

But it is not anymore :) 

> 
> > I have never seen a highly controversial technology that has (in two years) at least 8 public deployments and is supported by 18 publicly known routing platforms from 8 different vendors. Also has support in the mainline of Linux kernel, FD.io VPP, Wireshark, tcpdump, iptables and nftables.
> >
> 
> These support only support some subset of the protocol. For instance,
> SRv6 TLVs including HMAC TLV are defined in SRH, however only Linux
> claims to support them. However, the Linux implementation isn't
> protocol conformant: it doesn't support required padding options and
> just assumes HMAC is the one and only TLV. Previously, I've posted
> patches to fix that based on the generic parser in this patch set.
> Those original patches for fixing SRv6 would now be out of date since
> it was decided to change the PADN option type in SRv6 to be 5 instead
> of 1 thereby breaking compatibility with HBH and Destination Options.
> Seeing as how you seem interested in SRv6, it would be nice if you
> could look into fixing the SRv6 TLV implementation to be conformant (I
> can repost my original patches if that would help).

I’m ok and willing to contribute to add missing pieces of SRv6 to the Linux kernel. 

I was just not convinced with the idea of IPv4 extensions headers as we do not need to reinvent the wheel. 

We need IPv6 as we ran out of IPv4, and I believe SRv6 can handle most of IPv6 use-cases. 

> 
> Thanks,
> Tom
> 
> 
> > If you are a network operator, would you deply a highly controversial technology in your network ?
> >
> >
> > On Sat, 4 Jan 2020 09:45:59 -0800
> > Tom Herbert <tom@herbertland.com> wrote:
> >
> > > On Sat, Jan 4, 2020 at 12:05 AM kernel Dev <ahabdels.dev@gmail.com> wrote:
> > > >
> > > > Tom,
> > > >
> > > > I will not go into whether Tom or router vendors is right from IETF perspective as here is not the place to discuss.
> > > >
> > > > But it seems to me that the motivation behind these patches is just to pushback on the current IETF proposals.
> > > >
> > > Sorry, but that is completely untrue. The patches are a general
> > > improvement. The ability to allow modules to register handlers for
> > > options code points has nothing to do with "pushback on the current
> > > IETF protocols". This sort of registration is a mechanism used all
> > > over the place. Similarly, allowing non-priveledged users to send
> > > options is not for any specific protocol-- it is a *general*
> > > mechanism.
> > >
> > > > The patches timeline is completely aligned with when IETF threads get into tough discussions (May 2019, August 2019, and now).
> > > >
> > > Yes, discussion about new protocols in IETF tends to correlate with
> > > development and implementation of the protocols. That shouldn't
> > > surprise anyone. SRv6 for instance was highly controversial in IETF
> > > and yet the patches went in.
> > >
> > > > I’m not the one to decide, but IMO people should not add stuff to the kernel just to enforce their opinions on other mailers.
> > >
> > > I take exception with your insinuation. Seeing as how you might be new
> > > to Linux kernel development I will ignore it. But, in the future, I
> > > strongly suggest you be careful about accusing people about their
> > > motivations based solely on one interaction.
> > >
> > > Tom
> > >
> > >
> > > >
> > > >
> > > > On Fri, 3 Jan 2020 16:37:33 -0800
> > > > Tom Herbert <tom@herbertland.com> wrote:
> > > >
> > > > > On Fri, Jan 3, 2020 at 3:53 PM Erik Kline <ek@loon.com> wrote:
> > > > > >
> > > > > > On Fri, 3 Jan 2020 at 15:49, Tom Herbert <tom@herbertland.com> wrote:
> > > > > > >
> > > > > > > On Fri, Jan 3, 2020 at 2:57 PM David Miller <davem@davemloft.net> wrote:
> > > > > > > >
> > > > > > > > From: Tom Herbert <tom@herbertland.com>
> > > > > > > > Date: Fri, 3 Jan 2020 14:31:58 -0800
> > > > > > > >
> > > > > > > > > On Fri, Jan 3, 2020 at 12:45 PM David Miller <davem@davemloft.net> wrote:
> > > > > > > > >>
> > > > > > > > >> From: Tom Herbert <tom@herbertland.com>
> > > > > > > > >> Date: Fri, 3 Jan 2020 09:35:08 -0800
> > > > > > > > >>
> > > > > > > > >> > The real way to combat this provide open implementation that
> > > > > > > > >> > demonstrates the correct use of the protocols and show that's more
> > > > > > > > >> > extensible and secure than these "hacks".
> > > > > > > > >>
> > > > > > > > >> Keep dreaming, this won't stop Cisco from doing whatever it wants to do.
> > > > > > > > >
> > > > > > > > > See QUIC. See TLS. See TCP fast open. See transport layer encryption.
> > > > > > > > > These are prime examples where we've steered the Internet from host
> > > > > > > > > protocols and implementation to successfully obsolete or at least work
> > > > > > > > > around protocol ossification that was perpetuated by router vendors.
> > > > > > > > > Cisco is not the Internet!
> > > > > > > >
> > > > > > > > Seriously, I wish you luck stopping the SRv6 header insertion stuff.
> > > > > > > >
> > > > > > > Dave,
> > > > > > >
> > > > > > > I agree we can't stop it, but maybe we can steer it to be at least
> > > > > > > palatable. There are valid use cases for extension header insertion.
> > > > > > > Ironically, SRv6 header insertion isn't one of them; the proponents
> > > > > > > have failed to offer even a single reason why the alternative of IPv6
> > > > > > > encapsulation isn't sufficient (believe me, we've asked _many_ times
> > > > > > > for some justification and only get hand waving!). There are, however,
> > > > > > > some interesting uses cases like in IOAM where the operator would like
> > > > > > > to annotate packets as they traverse the network. Encapsulation is
> > > > > > > insufficient if they don't know what the end point would be or they
> > > > > > > don't want the annotation to change the path the packets take (versus
> > > > > > > those that aren't annotated).
> > > > > > >
> > > > > > > The salient problem with extension header insertion is lost of
> > > > > >
> > > > > > And the problems that can be introduced by changing the effective path MTU...
> > > > > >
> > > > > Eric,
> > > > >
> > > > > Yep, increasing the size of packet in transit potentially wreaks havoc
> > > > > on PMTU discovery, however I personally think that the issue might be
> > > > > overblown. We already have the same problem when tunneling is done in
> > > > > the network since most tunneling implementations and deployments just
> > > > > assume the operator has set large enough MTUs. As long as all the
> > > > > overhead inserted into the packet doesn't reduce the end host PMTU
> > > > > below 1280, PMTU discovery and probably even PTB for a packet with
> > > > > inserted headers still has right effect.
> > > > >
> > > > > > > attribution. It is fundamental in the IP protocol that the contents of
> > > > > > > a packet are attributed to the source host identified by the source
> > > > > > > address. If some intermediate node inserts an extension header that
> > > > > > > subsequently breaks the packet downstream then there is no obvious way
> > > > > > > to debug this. If an ICMP message is sent because of the receiving
> > > > > > > data, then receiving host can't do much with it; it's not the source
> > > > > > > of the data in error and nothing in the packet tells who the culprit
> > > > > > > is. The Cisco guys have at least conceded one point on SRv6 insertion
> > > > > > > due to pushback on this, their latest draft only does SRv6 insertion
> > > > > > > on packets that have already been encapsulated in IPIP on ingress into
> > > > > > > the domain. This is intended to at least restrict the modified packets
> > > > > > > to a controlled domain (I'm note sure if any implementations enforce
> > > > > > > this though). My proposal is to require an "attribution" HBH option
> > > > > > > that would clearly identify inserted data put in a packet by
> > > > > > > middleboxes (draft-herbert-6man-eh-attrib-00). This is a tradeoff to
> > > > > > > allow extension header insertion, but require protocol to give
> > > > > > > attribution and make it at least somewhat robust and manageable.
> > > > > > >
> > > > > > > Tom
> > > > > >
> > > > > > FWIW the SRv6 header insertion stuff is still under discussion in
> > > > > > spring wg (last I knew).  I proposed one option that could be used to
> > > > >
> > > > > It's also under discussion in 6man.
> > > > >
> > > > > > avoid insertion (allow for extra scratch space
> > > > > > https://mailarchive.ietf.org/arch/msg/spring/UhThRTNxbHWNiMGgRi3U0SqLaDA),
> > > > > > but nothing has been conclusively resolved last I checked.
> > > > > >
> > > > >
> > > > > I saw your proposal. It's a good idea from POV to be conformant with
> > > > > RFC8200 and avoid the PMTU problems, but the header insertion
> > > > > proponents aren't going to like it at all. First, it means that the
> > > > > source is in control of the insertion policy and host is required to
> > > > > change-- no way they'll buy into that ;-). Secondly, if the scratch
> > > > > space isn't used they'll undoubtedly claim that is unnecessary
> > > > > overhead.
> > > > >
> > > > > Tom
> > > > >
> > > > > > As everyone probably knows, the draft-ietf-* documents are
> > > > > > working-group-adopted documents (though final publication is never
> > > > > > guaranteed).  My current reading of 6man tea leaves is that neither
> > > > > > "ICMP limits" and "MTU option" docs were terribly contentious.
> > > > > > Whether code reorg is important for implementing these I'm not
> > > > > > competent enough to say.
> > > >
> > > >
> > > > --
> >
> >
> > --
> > kernel Dev <ahabdels.dev@gmail.com>


-- 
kernel Dev <ahabdels.dev@gmail.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2020-01-07 14:28 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-26 22:51 [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure Tom Herbert
2019-12-26 22:51 ` [PATCH v8 net-next 1/9] ipeh: Fix destopts counters on drop Tom Herbert
2019-12-26 22:51 ` [PATCH v8 net-next 2/9] ipeh: Create exthdrs_options.c and ipeh.h Tom Herbert
2019-12-26 22:51 ` [PATCH v8 net-next 3/9] ipeh: Move generic EH functions to exthdrs_common.c Tom Herbert
2019-12-26 22:51 ` [PATCH v8 net-next 4/9] ipeh: Generic TLV parser Tom Herbert
2019-12-26 22:51 ` [PATCH v8 net-next 5/9] ipeh: Add callback to ipeh_parse_tlv to handle errors Tom Herbert
2019-12-26 22:51 ` [PATCH v8 net-next 6/9] ip6tlvs: Registration of TLV handlers and parameters Tom Herbert
2019-12-26 22:51 ` [PATCH v8 net-next 7/9] ip6tlvs: Add TX parameters Tom Herbert
2019-12-26 22:51 ` [PATCH v8 net-next 8/9] ip6tlvs: Add netlink interface Tom Herbert
2019-12-26 22:51 ` [PATCH v8 net-next 9/9] ip6tlvs: Validation of TX Destination and Hop-by-Hop options Tom Herbert
2020-01-02 21:41 ` [PATCH v8 net-next 0/9] ipv6: Extension header infrastructure David Miller
2020-01-03  0:42   ` Tom Herbert
2020-01-03  7:11     ` kernel Dev
2020-01-03 17:35       ` Tom Herbert
2020-01-03 20:45         ` David Miller
2020-01-03 22:31           ` Tom Herbert
2020-01-03 22:57             ` David Miller
2020-01-03 23:48               ` Tom Herbert
2020-01-03 23:53                 ` Erik Kline
2020-01-04  0:37                   ` Tom Herbert
2020-01-04  8:05                     ` kernel Dev
2020-01-04 17:45                       ` Tom Herbert
2020-01-04 19:02                         ` kernel Dev
2020-01-04 19:27                           ` kernel Dev
2020-01-04 20:22                           ` Tom Herbert
2020-01-07 14:27                             ` kernel Dev

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.