netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v2 0/5] Route ICMPv6 errors with the flow when ECMP in use
@ 2016-10-28 12:32 Jakub Sitnicki
  2016-10-28 12:32 ` [PATCH net-next v2 1/5] ipv6: Fold rt6_info_hash_nhsfn() into its only caller Jakub Sitnicki
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Jakub Sitnicki @ 2016-10-28 12:32 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Hannes Frederic Sowa, Hideaki YOSHIFUJI, Tom Herbert

The motivation for this series is to route ICMPv6 error messages
together with the flow they belong to when multipath routing is in
use. It intends to bring the ECMP routing in IPv6 stack on par with
IPv4.

This enables the use of tools that rely on ICMP error messages such as
traceroute and makes PMTU discovery work both ways. However, for it to
work IPv6 flow labels have to be same in both directions
(i.e. reflected) or need to be chosen in a manner that ensures that
the flow going in the opposite direction would actually be routed to a
given path.

Even though we generally don't expect this, as receiving and sending
IPv6 are free to choose flow labels at will, we make an assumption
here that the enitity in charge of configuring ECMP routing will also
be in control of the server hosts, and can set up flow label
reflection. However, if this is not the case, the patchset doesn't
make the situation worse.

One potential user of the changes here would be an anycast service
hosted behind an ECMP router(s).

Changes have been tested in a virtual setup with a topology as below:

                  Re1 --- Hs1
                 /
 Hc --- Ri --- Rc
                 \
                  Re1 --- Hs2

 Hc  - client host
 HsX - server host
 Rc  - core router
 ReX - edge router
 Ri  - intermediate router

To test the changes, traceroute in UDP mode to the client host, with
flow label set, has been run from one of the server hosts. Full test
is available at [1].

-Jakub

[1] https://github.com/jsitnicki/tools/blob/master/net/tests/ecmp/test-ecmp-icmpv6-error-routing.sh

v1 -> v2:
 - don't use "extern" in external function declaration in header file,
   pointed out by David Miller;
 - style change, put as many arguments as possible on the first line of
   a function call, and align consecutive lines to the first argument,
   pointed out by David Miller;
 - expand the cover letter based on the feedback from David Miller and
   Hannes Sowa;

Jakub Sitnicki (5):
  ipv6: Fold rt6_info_hash_nhsfn() into its only caller
  net: Extend struct flowi6 with multipath hash
  ipv6: Use multipath hash from flow info if available
  ipv6: Compute multipath hash for sent ICMP errors from offending
    packet
  ipv6: Compute multipath hash for forwarded ICMP errors from offending
    packet

 include/linux/icmpv6.h |  2 ++
 include/net/flow.h     |  1 +
 net/ipv6/icmp.c        | 21 +++++++++++++++++++++
 net/ipv6/route.c       | 40 +++++++++++++++++++++++++++++-----------
 4 files changed, 53 insertions(+), 11 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH net-next v2 1/5] ipv6: Fold rt6_info_hash_nhsfn() into its only caller
  2016-10-28 12:32 [PATCH net-next v2 0/5] Route ICMPv6 errors with the flow when ECMP in use Jakub Sitnicki
@ 2016-10-28 12:32 ` Jakub Sitnicki
  2016-10-28 12:32 ` [PATCH net-next v2 2/5] net: Extend struct flowi6 with multipath hash Jakub Sitnicki
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Jakub Sitnicki @ 2016-10-28 12:32 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Hannes Frederic Sowa, Hideaki YOSHIFUJI, Tom Herbert

Commit 644d0e656958 ("ipv6 Use get_hash_from_flowi6 for rt6 hash") has
turned rt6_info_hash_nhsfn() into a one-liner, so it no longer makes
sense to keep it around.

Also the accompanying documentation comment has become outdated, so just
remove it altogether.

Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 net/ipv6/route.c | 12 +-----------
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index bdbc38e..0514b35 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -425,16 +425,6 @@ static bool rt6_check_expired(const struct rt6_info *rt)
 	return false;
 }
 
-/* Multipath route selection:
- *   Hash based function using packet header and flowlabel.
- * Adapted from fib_info_hashfn()
- */
-static int rt6_info_hash_nhsfn(unsigned int candidate_count,
-			       const struct flowi6 *fl6)
-{
-	return get_hash_from_flowi6(fl6) % candidate_count;
-}
-
 static struct rt6_info *rt6_multipath_select(struct rt6_info *match,
 					     struct flowi6 *fl6, int oif,
 					     int strict)
@@ -442,7 +432,7 @@ static struct rt6_info *rt6_multipath_select(struct rt6_info *match,
 	struct rt6_info *sibling, *next_sibling;
 	int route_choosen;
 
-	route_choosen = rt6_info_hash_nhsfn(match->rt6i_nsiblings + 1, fl6);
+	route_choosen = get_hash_from_flowi6(fl6) % (match->rt6i_nsiblings + 1);
 	/* Don't change the route, if route_choosen == 0
 	 * (siblings does not include ourself)
 	 */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH net-next v2 2/5] net: Extend struct flowi6 with multipath hash
  2016-10-28 12:32 [PATCH net-next v2 0/5] Route ICMPv6 errors with the flow when ECMP in use Jakub Sitnicki
  2016-10-28 12:32 ` [PATCH net-next v2 1/5] ipv6: Fold rt6_info_hash_nhsfn() into its only caller Jakub Sitnicki
@ 2016-10-28 12:32 ` Jakub Sitnicki
  2016-10-28 12:32 ` [PATCH net-next v2 3/5] ipv6: Use multipath hash from flow info if available Jakub Sitnicki
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Jakub Sitnicki @ 2016-10-28 12:32 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Hannes Frederic Sowa, Hideaki YOSHIFUJI, Tom Herbert

Allow for functions that fill out the IPv6 flow info to also pass a hash
computed over the skb contents. The hash value will drive the multipath
routing decisions.

This is intended for special treatment of ICMPv6 errors, where we would
like to make a routing decision based on the flow identifying the
offending IPv6 datagram that triggered the error, rather than the flow
of the ICMP error itself.

Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 include/net/flow.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/net/flow.h b/include/net/flow.h
index 035aa77..73ee3aa 100644
--- a/include/net/flow.h
+++ b/include/net/flow.h
@@ -143,6 +143,7 @@ struct flowi6 {
 #define fl6_ipsec_spi		uli.spi
 #define fl6_mh_type		uli.mht.type
 #define fl6_gre_key		uli.gre_key
+	__u32			mp_hash;
 } __attribute__((__aligned__(BITS_PER_LONG/8)));
 
 struct flowidn {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH net-next v2 3/5] ipv6: Use multipath hash from flow info if available
  2016-10-28 12:32 [PATCH net-next v2 0/5] Route ICMPv6 errors with the flow when ECMP in use Jakub Sitnicki
  2016-10-28 12:32 ` [PATCH net-next v2 1/5] ipv6: Fold rt6_info_hash_nhsfn() into its only caller Jakub Sitnicki
  2016-10-28 12:32 ` [PATCH net-next v2 2/5] net: Extend struct flowi6 with multipath hash Jakub Sitnicki
@ 2016-10-28 12:32 ` Jakub Sitnicki
  2016-10-28 12:32 ` [PATCH net-next v2 4/5] ipv6: Compute multipath hash for sent ICMP errors from offending packet Jakub Sitnicki
  2016-10-28 12:32 ` [PATCH net-next v2 5/5] ipv6: Compute multipath hash for forwarded " Jakub Sitnicki
  4 siblings, 0 replies; 6+ messages in thread
From: Jakub Sitnicki @ 2016-10-28 12:32 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Hannes Frederic Sowa, Hideaki YOSHIFUJI, Tom Herbert

Allow our callers to influence the choice of ECMP link by honoring the
hash passed together with the flow info. This will allow for special
treatment of ICMP errors which we would like to route over the same link
as the IP datagram that triggered the error.

Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 net/ipv6/route.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 0514b35..1184c2b 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -430,9 +430,11 @@ static struct rt6_info *rt6_multipath_select(struct rt6_info *match,
 					     int strict)
 {
 	struct rt6_info *sibling, *next_sibling;
+	unsigned int hash;
 	int route_choosen;
 
-	route_choosen = get_hash_from_flowi6(fl6) % (match->rt6i_nsiblings + 1);
+	hash = fl6->mp_hash ? : get_hash_from_flowi6(fl6);
+	route_choosen = hash % (match->rt6i_nsiblings + 1);
 	/* Don't change the route, if route_choosen == 0
 	 * (siblings does not include ourself)
 	 */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH net-next v2 4/5] ipv6: Compute multipath hash for sent ICMP errors from offending packet
  2016-10-28 12:32 [PATCH net-next v2 0/5] Route ICMPv6 errors with the flow when ECMP in use Jakub Sitnicki
                   ` (2 preceding siblings ...)
  2016-10-28 12:32 ` [PATCH net-next v2 3/5] ipv6: Use multipath hash from flow info if available Jakub Sitnicki
@ 2016-10-28 12:32 ` Jakub Sitnicki
  2016-10-28 12:32 ` [PATCH net-next v2 5/5] ipv6: Compute multipath hash for forwarded " Jakub Sitnicki
  4 siblings, 0 replies; 6+ messages in thread
From: Jakub Sitnicki @ 2016-10-28 12:32 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Hannes Frederic Sowa, Hideaki YOSHIFUJI, Tom Herbert

Improve debuggability with tools like traceroute and make PMTUD work in
setups that make use of ECMP routing by sending ICMP errors down the
same path as the offending packet would travel, if it was going in the
opposite direction.

There is a caveat, flows in both directions need use the same
label. Otherwise packets from flow in the opposite direction and ICMP
errors will not be routed over the same ECMP link.

Export the function for calculating the multipath hash so that we can
use it also on receive side, when forwarding ICMP errors.

v1 -> v2:
 - don't use "extern" in external function declaration in header file,
   pointed out by David Miller

Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 include/linux/icmpv6.h |  2 ++
 net/ipv6/icmp.c        | 21 +++++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/include/linux/icmpv6.h b/include/linux/icmpv6.h
index 57086e9..803c241 100644
--- a/include/linux/icmpv6.h
+++ b/include/linux/icmpv6.h
@@ -45,4 +45,6 @@ extern void				icmpv6_flow_init(struct sock *sk,
 							 const struct in6_addr *saddr,
 							 const struct in6_addr *daddr,
 							 int oif);
+struct ipv6hdr;
+u32					icmpv6_multipath_hash(const struct ipv6hdr *iph);
 #endif
diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index bd59c34..ab376b3d1 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -385,6 +385,26 @@ static struct dst_entry *icmpv6_route_lookup(struct net *net,
 	return ERR_PTR(err);
 }
 
+u32 icmpv6_multipath_hash(const struct ipv6hdr *iph)
+{
+	struct flowi6 fl6;
+
+	/* Calculate the multipath hash from the offending IP datagram that
+	 * triggered the ICMP error. The source and destination addresses are
+	 * swapped as we do our best to route the ICMP message together with the
+	 * flow it belongs to. However, flows in both directions have to have
+	 * the same label (e.g. by using flow label reflection) for it to
+	 * happen.
+	 */
+	memset(&fl6, 0, sizeof(fl6));
+	fl6.daddr = iph->saddr;
+	fl6.saddr = iph->daddr;
+	fl6.flowlabel = ip6_flowinfo(iph);
+	fl6.flowi6_proto = iph->nexthdr;
+
+	return get_hash_from_flowi6(&fl6);
+}
+
 /*
  *	Send an ICMP message in response to a packet in error
  */
@@ -484,6 +504,7 @@ static void icmp6_send(struct sk_buff *skb, u8 type, u8 code, __u32 info,
 	fl6.flowi6_oif = iif;
 	fl6.fl6_icmp_type = type;
 	fl6.fl6_icmp_code = code;
+	fl6.mp_hash = icmpv6_multipath_hash(hdr);
 	security_skb_classify_flow(skb, flowi6_to_flowi(&fl6));
 
 	sk = icmpv6_xmit_lock(net);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH net-next v2 5/5] ipv6: Compute multipath hash for forwarded ICMP errors from offending packet
  2016-10-28 12:32 [PATCH net-next v2 0/5] Route ICMPv6 errors with the flow when ECMP in use Jakub Sitnicki
                   ` (3 preceding siblings ...)
  2016-10-28 12:32 ` [PATCH net-next v2 4/5] ipv6: Compute multipath hash for sent ICMP errors from offending packet Jakub Sitnicki
@ 2016-10-28 12:32 ` Jakub Sitnicki
  4 siblings, 0 replies; 6+ messages in thread
From: Jakub Sitnicki @ 2016-10-28 12:32 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Hannes Frederic Sowa, Hideaki YOSHIFUJI, Tom Herbert

Same as for the transmit path, let's do our best to ensure that received
ICMP errors that may be subject to forwarding will be routed the same
path as flow that triggered the error, if it was going in the opposite
direction.

v1 -> v2:
 - style change, put as many arguments as possible on the first line of
   a function call, and align consecutive lines to the first argument,
   pointed out by David Miller

Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 net/ipv6/route.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 1184c2b..269e30d 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1150,6 +1150,30 @@ struct dst_entry *ip6_route_input_lookup(struct net *net,
 }
 EXPORT_SYMBOL_GPL(ip6_route_input_lookup);
 
+static u32 ip6_multipath_icmp_hash(const struct sk_buff *skb)
+{
+	const struct icmp6hdr *icmph = icmp6_hdr(skb);
+	const struct ipv6hdr *inner_iph;
+	struct ipv6hdr _inner_iph;
+
+	if (icmph->icmp6_type != ICMPV6_DEST_UNREACH &&
+	    icmph->icmp6_type != ICMPV6_PKT_TOOBIG &&
+	    icmph->icmp6_type != ICMPV6_TIME_EXCEED &&
+	    icmph->icmp6_type != ICMPV6_PARAMPROB)
+		goto standard_hash;
+
+	inner_iph = skb_header_pointer(skb,
+				       skb_transport_offset(skb) + sizeof(*icmph),
+				       sizeof(_inner_iph), &_inner_iph);
+	if (!inner_iph)
+		goto standard_hash;
+
+	return icmpv6_multipath_hash(inner_iph);
+
+standard_hash:
+	return 0; /* compute it later, if needed */
+}
+
 void ip6_route_input(struct sk_buff *skb)
 {
 	const struct ipv6hdr *iph = ipv6_hdr(skb);
@@ -1168,6 +1192,8 @@ void ip6_route_input(struct sk_buff *skb)
 	tun_info = skb_tunnel_info(skb);
 	if (tun_info && !(tun_info->mode & IP_TUNNEL_INFO_TX))
 		fl6.flowi6_tun_key.tun_id = tun_info->key.tun_id;
+	if (unlikely(fl6.flowi6_proto == IPPROTO_ICMPV6))
+		fl6.mp_hash = ip6_multipath_icmp_hash(skb);
 	skb_dst_drop(skb);
 	skb_dst_set(skb, ip6_route_input_lookup(net, skb->dev, &fl6, flags));
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-10-28 12:32 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-28 12:32 [PATCH net-next v2 0/5] Route ICMPv6 errors with the flow when ECMP in use Jakub Sitnicki
2016-10-28 12:32 ` [PATCH net-next v2 1/5] ipv6: Fold rt6_info_hash_nhsfn() into its only caller Jakub Sitnicki
2016-10-28 12:32 ` [PATCH net-next v2 2/5] net: Extend struct flowi6 with multipath hash Jakub Sitnicki
2016-10-28 12:32 ` [PATCH net-next v2 3/5] ipv6: Use multipath hash from flow info if available Jakub Sitnicki
2016-10-28 12:32 ` [PATCH net-next v2 4/5] ipv6: Compute multipath hash for sent ICMP errors from offending packet Jakub Sitnicki
2016-10-28 12:32 ` [PATCH net-next v2 5/5] ipv6: Compute multipath hash for forwarded " Jakub Sitnicki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).