* [RFC PATCH net-next v2 00/10] Add support for custom multipath hash
@ 2021-05-09 15:16 Ido Schimmel
  2021-05-09 15:16 ` [RFC PATCH net-next v2 01/10] ipv4: Calculate multipath hash inside switch statement Ido Schimmel
                   ` (9 more replies)
  0 siblings, 10 replies; 18+ messages in thread
From: Ido Schimmel @ 2021-05-09 15:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, dsahern, petrm, roopa, nikolay, ssuryaextr, mlxsw,
	Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

This patchset adds support for custom multipath hash policy for both
IPv4 and IPv6 traffic. The new policy allows user space to control the
outer and inner packet fields used for the hash computation.

Motivation
==========

Linux currently supports different multipath hash policies for IPv4 and
IPv6 traffic:

* Layer 3
* Layer 4
* Layer 3 or inner layer 3, if present

These policies hash on a fixed set of fields. This is inflexible and
runs counter to operators' need to control the hash input: "The ability
to control the inputs to the hash function should be a consideration in
any load-balancing RFP" [1].

For example, none of the current policies allows operators to use both
the standard 5-tuple and the flow label for multipath hash computation.
Such a policy is useful in a real-world data center that carries the
following types of traffic:

* Anycast IPv6 TCP traffic towards layer 4 load balancers. Flow label is
constant (zero) to avoid breaking established connections

* Non-encapsulated IPv6 traffic. Flow label is used to re-route flows
around problematic (congested / failed) paths [2]

* IPv6 encapsulated traffic (IPv4-in-IPv6 or IPv6-in-IPv6). Outer flow
label is generated from encapsulated packet

* UDP encapsulated traffic. Outer source port is generated from
encapsulated packet

In the above example, using the inner flow information for hash
computation in addition to the outer flow information is useful during
failures of the BPF agent that selectively generates the flow label
based on the traffic type. In such cases, the self-healing properties of
the flow label are lost, but encapsulated flows are still load balanced.

Control over the inner fields is even more critical when encapsulation
is performed by hardware routers. For example, the Spectrum ASIC can
only encode 8 bits of entropy in the outer flow label / outer UDP source
port when performing IP / UDP encapsulation. In the case of IPv4 GRE
encapsulation there is no outer field to encode the inner hash in.

User interface
==============

In accordance with existing multipath hash configuration, the new custom
policy is added as a new option (3) to the
net.ipv{4,6}.fib_multipath_hash_policy sysctls. When the new policy is
used, the packet fields used for hash computation are determined by the
net.ipv{4,6}.fib_multipath_hash_fields sysctls. These sysctls accept a
bitmask according to the following table (from ip-sysctl.rst):

	====== ============================
	0x0001 Source IP address
	0x0002 Destination IP address
	0x0004 IP protocol
	0x0008 Flow Label
	0x0010 Source port
	0x0020 Destination port
	0x0040 Inner source IP address
	0x0080 Inner destination IP address
	0x0100 Inner IP protocol
	0x0200 Inner Flow Label
	0x0400 Inner source port
	0x0800 Inner destination port
	====== ============================

For example, to allow IPv6 traffic to be hashed based on standard
5-tuple and flow label:

 # sysctl -wq net.ipv6.fib_multipath_hash_fields=0x003f
 # sysctl -wq net.ipv6.fib_multipath_hash_policy=3
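
The value for a given set of fields is the bitwise OR of the matching
table rows. The following is a user-space sketch only: the HASH_FIELD_*
names and the two helpers are shorthand invented for this illustration,
not kernel identifiers. According to the table, the standard 5-tuple
alone is 0x0037, and adding the flow label (0x0008) gives 0x003f.

```c
#include <assert.h>
#include <stdint.h>

/* Shorthand for rows of the table above (illustrative names only). */
enum {
	HASH_FIELD_SRC_IP    = 0x0001,
	HASH_FIELD_DST_IP    = 0x0002,
	HASH_FIELD_IP_PROTO  = 0x0004,
	HASH_FIELD_FLOWLABEL = 0x0008,
	HASH_FIELD_SRC_PORT  = 0x0010,
	HASH_FIELD_DST_PORT  = 0x0020,
};

/* Standard 5-tuple: addresses, protocol and ports. */
static uint32_t hash_fields_5tuple(void)
{
	return HASH_FIELD_SRC_IP | HASH_FIELD_DST_IP | HASH_FIELD_IP_PROTO |
	       HASH_FIELD_SRC_PORT | HASH_FIELD_DST_PORT;
}

/* Standard 5-tuple plus the IPv6 flow label. */
static uint32_t hash_fields_5tuple_flowlabel(void)
{
	uint32_t fields = hash_fields_5tuple() | HASH_FIELD_FLOWLABEL;

	/* Adding a field must never drop one that was already selected. */
	assert((fields & hash_fields_5tuple()) == hash_fields_5tuple());
	return fields;
}
```

Printing either return value with "0x%04x" yields a string that can be
passed to the fib_multipath_hash_fields sysctl.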

Implementation
==============

As with existing policies, the new policy relies on the flow dissector
to extract the packet fields for the hash computation. However, unlike
existing policies that either use the outer or inner flow, the new
policy might require both flows to be dissected.

To avoid unnecessary invocations of the flow dissector, the data path
skips dissection of the outer flow if no outer fields are required, and
skips dissection of the inner flow if no inner fields are required.

In addition, inner flow dissection is not performed when no
encapsulation was encountered (i.e., 'FLOW_DIS_ENCAPSULATION' not set by
flow dissector) during dissection of the outer flow.
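
The rules above can be modelled in user space by counting how many
dissector passes a given field mask would trigger. This is a rough
sketch under stated assumptions: 'struct model_pkt' and
dissector_passes() are hypothetical, the packet is reduced to a single
"is encapsulated" flag, and the mask values are derived from the table
in the user interface section (bits 0-5 outer, bits 6-11 inner).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define OUTER_MASK 0x003fu	/* bits 0-5 of the table */
#define INNER_MASK 0x0fc0u	/* bits 6-11 of the table */

/* Hypothetical stand-in for a packet: the only modelled property is
 * whether the dissector would report an encapsulation.
 */
struct model_pkt {
	bool encapsulated;
};

/* Return how many flow dissector passes are needed for this mask. */
static unsigned int dissector_passes(uint32_t fields,
				     const struct model_pkt *pkt)
{
	/* Assume an inner flow exists until an outer pass says otherwise. */
	bool has_inner = true;
	unsigned int passes = 0;

	assert(!(fields & ~(OUTER_MASK | INNER_MASK)));

	if (fields & OUTER_MASK) {
		passes++;			/* outer pass */
		has_inner = pkt->encapsulated;
	}

	if (has_inner && (fields & INNER_MASK))
		passes++;			/* inner pass */

	return passes;
}
```

In this model a non-encapsulated packet with mask 0x0fff costs one pass
and an encapsulated one costs two. When only inner fields are selected,
the outer pass is skipped, so the inner pass is the one that discovers
whether an encapsulation is present.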

Testing
=======

Three new selftests are added with three different topologies that allow
testing of the following traffic combinations:

* Non-encapsulated IPv4 / IPv6 traffic
* IPv4 / IPv6 overlay over IPv4 underlay
* IPv4 / IPv6 overlay over IPv6 underlay

All three tests follow the same pattern. Each time a different packet
field is used for hash computation. When the field changes in the packet
stream, traffic is expected to be balanced across the two paths. When
the field does not change, traffic is expected to be unbalanced across
the two paths.
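
The test pattern can be illustrated with a small user-space model.
Everything in it is an assumption made for illustration: mix32() (the
MurmurHash3 finalizer) stands in for the kernel's flow hash, the two
field macros mirror two rows of the sysctl table, the source address is
a placeholder, and the path is picked from the low bit of the hash
rather than by the kernel's actual next-hop selection.

```c
#include <assert.h>
#include <stdint.h>

#define FIELD_SRC_IP	0x0001u
#define FIELD_SRC_PORT	0x0010u

/* MurmurHash3 32-bit finalizer, used here as a stand-in hash. */
static uint32_t mix32(uint32_t h)
{
	h ^= h >> 16;
	h *= 0x85ebca6bu;
	h ^= h >> 13;
	h *= 0xc2b2ae35u;
	h ^= h >> 16;
	return h;
}

/* Hash only the fields that are enabled in the mask. */
static uint32_t masked_hash(uint32_t fields, uint32_t src_ip,
			    uint16_t src_port)
{
	uint32_t h = 0;

	if (fields & FIELD_SRC_IP)
		h = mix32(h ^ src_ip);
	if (fields & FIELD_SRC_PORT)
		h = mix32(h ^ src_port);
	return h;
}

/* Send n packets that differ only in source port and count how many
 * of them take the second of two paths.
 */
static unsigned int second_path_count(uint32_t fields, unsigned int n)
{
	unsigned int i, count = 0;

	assert(n <= 65536);	/* source ports are 16 bits wide */
	for (i = 0; i < n; i++)
		count += masked_hash(fields, 0xc0000201u, (uint16_t)i) & 1;
	return count;
}
```

With only the source IP in the mask, every packet hashes identically
and a single path carries all of the traffic. With the source port in
the mask, the packets are spread over both paths.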

Patchset overview
=================

Patches #1-#3 add custom multipath hash support for IPv4 traffic
Patches #4-#7 do the same for IPv6
Patches #8-#10 add selftests

Future work
===========

mlxsw support can be found here [3].

Changes since RFC v1 [4]:

* Use a bitmask instead of a bitmap (David Ahern)

[1] https://blog.apnic.net/2018/01/11/ipv6-flow-label-misuse-hashing/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3acf3ec3f4b0fd4263989f2e4227bbd1c42b5fe1
[3] https://github.com/idosch/linux/commits/submit/custom_hash_v2
[4] https://lore.kernel.org/netdev/20210502162257.3472453-1-idosch@idosch.org/

Ido Schimmel (10):
  ipv4: Calculate multipath hash inside switch statement
  ipv4: Add a sysctl to control multipath hash fields
  ipv4: Add custom multipath hash policy
  ipv6: Use a more suitable label name
  ipv6: Calculate multipath hash inside switch statement
  ipv6: Add a sysctl to control multipath hash fields
  ipv6: Add custom multipath hash policy
  selftests: forwarding: Add test for custom multipath hash
  selftests: forwarding: Add test for custom multipath hash with IPv4
    GRE
  selftests: forwarding: Add test for custom multipath hash with IPv6
    GRE

 Documentation/networking/ip-sysctl.rst        |  58 +++
 include/net/ip_fib.h                          |  43 ++
 include/net/ipv6.h                            |   8 +
 include/net/netns/ipv4.h                      |   1 +
 include/net/netns/ipv6.h                      |   3 +-
 net/ipv4/fib_frontend.c                       |   6 +
 net/ipv4/route.c                              | 127 ++++-
 net/ipv4/sysctl_net_ipv4.c                    |  14 +-
 net/ipv6/ip6_fib.c                            |   9 +-
 net/ipv6/route.c                              | 131 ++++-
 net/ipv6/sysctl_net_ipv6.c                    |  14 +-
 .../net/forwarding/custom_multipath_hash.sh   | 364 ++++++++++++++
 .../forwarding/gre_custom_multipath_hash.sh   | 456 +++++++++++++++++
 .../ip6gre_custom_multipath_hash.sh           | 458 ++++++++++++++++++
 14 files changed, 1683 insertions(+), 9 deletions(-)
 create mode 100755 tools/testing/selftests/net/forwarding/custom_multipath_hash.sh
 create mode 100755 tools/testing/selftests/net/forwarding/gre_custom_multipath_hash.sh
 create mode 100755 tools/testing/selftests/net/forwarding/ip6gre_custom_multipath_hash.sh

-- 
2.31.1



* [RFC PATCH net-next v2 01/10] ipv4: Calculate multipath hash inside switch statement
  2021-05-09 15:16 [RFC PATCH net-next v2 00/10] Add support for custom multipath hash Ido Schimmel
@ 2021-05-09 15:16 ` Ido Schimmel
  2021-05-09 15:16 ` [RFC PATCH net-next v2 02/10] ipv4: Add a sysctl to control multipath hash fields Ido Schimmel
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Ido Schimmel @ 2021-05-09 15:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, dsahern, petrm, roopa, nikolay, ssuryaextr, mlxsw,
	Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

A subsequent patch will add another multipath hash policy where the
multipath hash is calculated directly by the policy-specific code and
not outside of the switch statement.

Prepare for this change by moving the multipath hash calculation inside
the switch statement.

No functional changes intended.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/ipv4/route.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index f6787c55f6ab..9d61e969446e 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1912,7 +1912,7 @@ int fib_multipath_hash(const struct net *net, const struct flowi4 *fl4,
 {
 	u32 multipath_hash = fl4 ? fl4->flowi4_multipath_hash : 0;
 	struct flow_keys hash_keys;
-	u32 mhash;
+	u32 mhash = 0;
 
 	switch (net->ipv4.sysctl_fib_multipath_hash_policy) {
 	case 0:
@@ -1924,6 +1924,7 @@ int fib_multipath_hash(const struct net *net, const struct flowi4 *fl4,
 			hash_keys.addrs.v4addrs.src = fl4->saddr;
 			hash_keys.addrs.v4addrs.dst = fl4->daddr;
 		}
+		mhash = flow_hash_from_keys(&hash_keys);
 		break;
 	case 1:
 		/* skb is currently provided only when forwarding */
@@ -1957,6 +1958,7 @@ int fib_multipath_hash(const struct net *net, const struct flowi4 *fl4,
 			hash_keys.ports.dst = fl4->fl4_dport;
 			hash_keys.basic.ip_proto = fl4->flowi4_proto;
 		}
+		mhash = flow_hash_from_keys(&hash_keys);
 		break;
 	case 2:
 		memset(&hash_keys, 0, sizeof(hash_keys));
@@ -1987,9 +1989,9 @@ int fib_multipath_hash(const struct net *net, const struct flowi4 *fl4,
 			hash_keys.addrs.v4addrs.src = fl4->saddr;
 			hash_keys.addrs.v4addrs.dst = fl4->daddr;
 		}
+		mhash = flow_hash_from_keys(&hash_keys);
 		break;
 	}
-	mhash = flow_hash_from_keys(&hash_keys);
 
 	if (multipath_hash)
 		mhash = jhash_2words(mhash, multipath_hash, 0);
-- 
2.31.1



* [RFC PATCH net-next v2 02/10] ipv4: Add a sysctl to control multipath hash fields
  2021-05-09 15:16 [RFC PATCH net-next v2 00/10] Add support for custom multipath hash Ido Schimmel
  2021-05-09 15:16 ` [RFC PATCH net-next v2 01/10] ipv4: Calculate multipath hash inside switch statement Ido Schimmel
@ 2021-05-09 15:16 ` Ido Schimmel
  2021-05-11 15:10   ` David Ahern
  2021-05-11 15:49   ` David Ahern
  2021-05-09 15:16 ` [RFC PATCH net-next v2 03/10] ipv4: Add custom multipath hash policy Ido Schimmel
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 18+ messages in thread
From: Ido Schimmel @ 2021-05-09 15:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, dsahern, petrm, roopa, nikolay, ssuryaextr, mlxsw,
	Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

A subsequent patch will add a new multipath hash policy where the packet
fields used for multipath hash calculation are determined by user space.
This patch adds a sysctl that allows user space to set these fields.

The packet fields are represented using a bitmask and are common between
IPv4 and IPv6 to allow user space to use the same numbering across both
protocols. For example, to hash based on standard 5-tuple:

 # sysctl -w net.ipv4.fib_multipath_hash_fields=0x0037
 net.ipv4.fib_multipath_hash_fields = 0x0037

The kernel rejects unknown fields, for example:

 # sysctl -w net.ipv4.fib_multipath_hash_fields=0x1000
 sysctl: setting key "net.ipv4.fib_multipath_hash_fields": Invalid argument

More fields can be added in the future, if needed.
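
The rejection of 0x1000 follows from the upper bound that this patch
installs through '.extra2' for proc_douintvec_minmax. A range check is
enough here because the known fields occupy contiguous bits starting at
bit 0. The user-space sketch below illustrates that equivalence; the
helper names are hypothetical and 0x0fff is taken from the field table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HASH_FIELD_ALL_MASK 0x0fffu	/* bits 0-11 from the table */

/* Range check, as a handler with an upper bound would apply it. */
static bool fields_in_range(uint32_t fields)
{
	return fields <= HASH_FIELD_ALL_MASK;
}

/* Explicit check that no bit outside the known fields is set. */
static bool fields_known(uint32_t fields)
{
	return !(fields & ~HASH_FIELD_ALL_MASK);
}

/* Compare both checks for every value from 0 up to and including limit. */
static bool checks_agree(uint32_t limit)
{
	uint32_t v;

	assert(limit > HASH_FIELD_ALL_MASK && limit < UINT32_MAX);
	for (v = 0; v <= limit; v++)
		if (fields_in_range(v) != fields_known(v))
			return false;
	return true;
}
```

If a future field left a gap in the bit numbering, the two checks would
diverge and an explicit bit test would be needed.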

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 Documentation/networking/ip-sysctl.rst | 27 ++++++++++++++++
 include/net/ip_fib.h                   | 43 ++++++++++++++++++++++++++
 include/net/netns/ipv4.h               |  1 +
 net/ipv4/fib_frontend.c                |  6 ++++
 net/ipv4/sysctl_net_ipv4.c             | 11 +++++++
 5 files changed, 88 insertions(+)

diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index c2ecc9894fd0..15982f830abc 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -100,6 +100,33 @@ fib_multipath_hash_policy - INTEGER
 	- 1 - Layer 4
 	- 2 - Layer 3 or inner Layer 3 if present
 
+fib_multipath_hash_fields - UNSIGNED INTEGER
+	When fib_multipath_hash_policy is set to 3 (custom multipath hash), the
+	fields used for multipath hash calculation are determined by this
+	sysctl.
+
+	This value is a bitmask which enables various fields for multipath hash
+	calculation.
+
+	Possible fields are:
+
+	====== ============================
+	0x0001 Source IP address
+	0x0002 Destination IP address
+	0x0004 IP protocol
+	0x0008 Unused
+	0x0010 Source port
+	0x0020 Destination port
+	0x0040 Inner source IP address
+	0x0080 Inner destination IP address
+	0x0100 Inner IP protocol
+	0x0200 Inner Flow Label
+	0x0400 Inner source port
+	0x0800 Inner destination port
+	====== ============================
+
+	Default: 0x0007 (source IP, destination IP and IP protocol)
+
 fib_sync_mem - UNSIGNED INTEGER
 	Amount of dirty memory from fib entries that can be backlogged before
 	synchronize_rcu is forced.
diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index a914f33f3ed5..3ab2563b1a23 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -466,6 +466,49 @@ int fib_sync_up(struct net_device *dev, unsigned char nh_flags);
 void fib_sync_mtu(struct net_device *dev, u32 orig_mtu);
 void fib_nhc_update_mtu(struct fib_nh_common *nhc, u32 new, u32 orig);
 
+/* Fields used for sysctl_fib_multipath_hash_fields.
+ * Common to IPv4 and IPv6.
+ *
+ * Add new fields at the end. This is user API.
+ */
+#define FIB_MULTIPATH_HASH_FIELD_SRC_IP			BIT(0)
+#define FIB_MULTIPATH_HASH_FIELD_DST_IP			BIT(1)
+#define FIB_MULTIPATH_HASH_FIELD_IP_PROTO		BIT(2)
+#define FIB_MULTIPATH_HASH_FIELD_FLOWLABEL		BIT(3)
+#define FIB_MULTIPATH_HASH_FIELD_SRC_PORT		BIT(4)
+#define FIB_MULTIPATH_HASH_FIELD_DST_PORT		BIT(5)
+#define FIB_MULTIPATH_HASH_FIELD_INNER_SRC_IP		BIT(6)
+#define FIB_MULTIPATH_HASH_FIELD_INNER_DST_IP		BIT(7)
+#define FIB_MULTIPATH_HASH_FIELD_INNER_IP_PROTO		BIT(8)
+#define FIB_MULTIPATH_HASH_FIELD_INNER_FLOWLABEL	BIT(9)
+#define FIB_MULTIPATH_HASH_FIELD_INNER_SRC_PORT		BIT(10)
+#define FIB_MULTIPATH_HASH_FIELD_INNER_DST_PORT		BIT(11)
+
+#define FIB_MULTIPATH_HASH_FIELD_OUTER_MASK		\
+	(FIB_MULTIPATH_HASH_FIELD_SRC_IP |		\
+	 FIB_MULTIPATH_HASH_FIELD_DST_IP |		\
+	 FIB_MULTIPATH_HASH_FIELD_IP_PROTO |		\
+	 FIB_MULTIPATH_HASH_FIELD_FLOWLABEL |		\
+	 FIB_MULTIPATH_HASH_FIELD_SRC_PORT |		\
+	 FIB_MULTIPATH_HASH_FIELD_DST_PORT)
+
+#define FIB_MULTIPATH_HASH_FIELD_INNER_MASK		\
+	(FIB_MULTIPATH_HASH_FIELD_INNER_SRC_IP |	\
+	 FIB_MULTIPATH_HASH_FIELD_INNER_DST_IP |	\
+	 FIB_MULTIPATH_HASH_FIELD_INNER_IP_PROTO |	\
+	 FIB_MULTIPATH_HASH_FIELD_INNER_FLOWLABEL |	\
+	 FIB_MULTIPATH_HASH_FIELD_INNER_SRC_PORT |	\
+	 FIB_MULTIPATH_HASH_FIELD_INNER_DST_PORT)
+
+#define FIB_MULTIPATH_HASH_FIELD_ALL_MASK		\
+	(FIB_MULTIPATH_HASH_FIELD_OUTER_MASK |		\
+	 FIB_MULTIPATH_HASH_FIELD_INNER_MASK)
+
+#define FIB_MULTIPATH_HASH_FIELD_DEFAULT_MASK		\
+	(FIB_MULTIPATH_HASH_FIELD_SRC_IP |		\
+	 FIB_MULTIPATH_HASH_FIELD_DST_IP |		\
+	 FIB_MULTIPATH_HASH_FIELD_IP_PROTO)
+
 #ifdef CONFIG_IP_ROUTE_MULTIPATH
 int fib_multipath_hash(const struct net *net, const struct flowi4 *fl4,
 		       const struct sk_buff *skb, struct flow_keys *flkeys);
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index f6af8d96d3c6..746c80cd4257 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -210,6 +210,7 @@ struct netns_ipv4 {
 #endif
 #endif
 #ifdef CONFIG_IP_ROUTE_MULTIPATH
+	u32 sysctl_fib_multipath_hash_fields;
 	u8 sysctl_fib_multipath_use_neigh;
 	u8 sysctl_fib_multipath_hash_policy;
 #endif
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 84bb707bd88d..129213b7d834 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -1516,6 +1516,12 @@ static int __net_init ip_fib_net_init(struct net *net)
 	if (err)
 		return err;
 
+#ifdef CONFIG_IP_ROUTE_MULTIPATH
+	/* Default to 3-tuple */
+	net->ipv4.sysctl_fib_multipath_hash_fields =
+		FIB_MULTIPATH_HASH_FIELD_DEFAULT_MASK;
+#endif
+
 	/* Avoid false sharing : Use at least a full cache line */
 	size = max_t(size_t, size, L1_CACHE_BYTES);
 
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index a62934b9f15a..da627c4d633a 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -19,6 +19,7 @@
 #include <net/snmp.h>
 #include <net/icmp.h>
 #include <net/ip.h>
+#include <net/ip_fib.h>
 #include <net/route.h>
 #include <net/tcp.h>
 #include <net/udp.h>
@@ -48,6 +49,8 @@ static int ip_ping_group_range_min[] = { 0, 0 };
 static int ip_ping_group_range_max[] = { GID_T_MAX, GID_T_MAX };
 static u32 u32_max_div_HZ = UINT_MAX / HZ;
 static int one_day_secs = 24 * 3600;
+static u32 fib_multipath_hash_fields_all_mask __maybe_unused =
+	FIB_MULTIPATH_HASH_FIELD_ALL_MASK;
 
 /* obsolete */
 static int sysctl_tcp_low_latency __read_mostly;
@@ -1052,6 +1055,14 @@ static struct ctl_table ipv4_net_table[] = {
 		.extra1		= SYSCTL_ZERO,
 		.extra2		= &two,
 	},
+	{
+		.procname	= "fib_multipath_hash_fields",
+		.data		= &init_net.ipv4.sysctl_fib_multipath_hash_fields,
+		.maxlen		= sizeof(u32),
+		.mode		= 0644,
+		.proc_handler	= proc_douintvec_minmax,
+		.extra2		= &fib_multipath_hash_fields_all_mask,
+	},
 #endif
 	{
 		.procname	= "ip_unprivileged_port_start",
-- 
2.31.1



* [RFC PATCH net-next v2 03/10] ipv4: Add custom multipath hash policy
  2021-05-09 15:16 [RFC PATCH net-next v2 00/10] Add support for custom multipath hash Ido Schimmel
  2021-05-09 15:16 ` [RFC PATCH net-next v2 01/10] ipv4: Calculate multipath hash inside switch statement Ido Schimmel
  2021-05-09 15:16 ` [RFC PATCH net-next v2 02/10] ipv4: Add a sysctl to control multipath hash fields Ido Schimmel
@ 2021-05-09 15:16 ` Ido Schimmel
  2021-05-11 15:46   ` David Ahern
  2021-05-09 15:16 ` [RFC PATCH net-next v2 04/10] ipv6: Use a more suitable label name Ido Schimmel
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 18+ messages in thread
From: Ido Schimmel @ 2021-05-09 15:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, dsahern, petrm, roopa, nikolay, ssuryaextr, mlxsw,
	Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Add a new multipath hash policy where the packet fields used for hash
calculation are determined by user space via the
fib_multipath_hash_fields sysctl that was introduced in the previous
patch.

The current set of available packet fields includes both outer and inner
fields, which requires two invocations of the flow dissector. Avoid
unnecessary dissection of the outer or inner flows by skipping
dissection if none of the outer or inner fields are required.

In accordance with the existing policies, when an skb is not available,
packet fields are extracted from the provided flow key. In that case,
only outer fields are considered.
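
For the IPv4 flow key path this means the configured mask is
effectively narrowed before hashing. A minimal sketch, assuming the bit
values from the table in the previous patch; fl4_effective_fields() is
a hypothetical helper written for this note, not a function in the
patch.

```c
#include <assert.h>
#include <stdint.h>

#define HASH_FIELD_ALL_MASK	0x0fffu	/* every bit the sysctl accepts */
#define HASH_FIELD_FL4_MASK	0x0037u	/* addresses, protocol, ports */

/* Fields that can still influence the hash when only an IPv4 flow key
 * is available: inner fields and the flow label bit have no effect.
 */
static uint32_t fl4_effective_fields(uint32_t fields)
{
	assert(!(fields & ~HASH_FIELD_ALL_MASK));
	return fields & HASH_FIELD_FL4_MASK;
}
```

A mask made up only of inner fields therefore contributes nothing when
the hash is computed from a flow key.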

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 Documentation/networking/ip-sysctl.rst |   2 +
 net/ipv4/route.c                       | 121 +++++++++++++++++++++++++
 net/ipv4/sysctl_net_ipv4.c             |   3 +-
 3 files changed, 125 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index 15982f830abc..2c3b7677222e 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -99,6 +99,8 @@ fib_multipath_hash_policy - INTEGER
 	- 0 - Layer 3
 	- 1 - Layer 4
 	- 2 - Layer 3 or inner Layer 3 if present
+	- 3 - Custom multipath hash. Fields used for multipath hash calculation
+	  are determined by fib_multipath_hash_fields sysctl
 
 fib_multipath_hash_fields - UNSIGNED INTEGER
 	When fib_multipath_hash_policy is set to 3 (custom multipath hash), the
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 9d61e969446e..a4c477475f4c 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1906,6 +1906,121 @@ static void ip_multipath_l3_keys(const struct sk_buff *skb,
 	hash_keys->addrs.v4addrs.dst = key_iph->daddr;
 }
 
+static u32 fib_multipath_custom_hash_outer(const struct net *net,
+					   const struct sk_buff *skb,
+					   bool *p_has_inner)
+{
+	u32 hash_fields = net->ipv4.sysctl_fib_multipath_hash_fields;
+	struct flow_keys keys, hash_keys;
+
+	if (!(hash_fields & FIB_MULTIPATH_HASH_FIELD_OUTER_MASK))
+		return 0;
+
+	memset(&hash_keys, 0, sizeof(hash_keys));
+	skb_flow_dissect_flow_keys(skb, &keys, FLOW_DISSECTOR_F_STOP_AT_ENCAP);
+
+	hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_SRC_IP)
+		hash_keys.addrs.v4addrs.src = keys.addrs.v4addrs.src;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_DST_IP)
+		hash_keys.addrs.v4addrs.dst = keys.addrs.v4addrs.dst;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_IP_PROTO)
+		hash_keys.basic.ip_proto = keys.basic.ip_proto;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_SRC_PORT)
+		hash_keys.ports.src = keys.ports.src;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_DST_PORT)
+		hash_keys.ports.dst = keys.ports.dst;
+
+	*p_has_inner = !!(keys.control.flags & FLOW_DIS_ENCAPSULATION);
+	return flow_hash_from_keys(&hash_keys);
+}
+
+static u32 fib_multipath_custom_hash_inner(const struct net *net,
+					   const struct sk_buff *skb,
+					   bool has_inner)
+{
+	u32 hash_fields = net->ipv4.sysctl_fib_multipath_hash_fields;
+	struct flow_keys keys, hash_keys;
+
+	/* We assume the packet carries an encapsulation, but if none was
+	 * encountered during dissection of the outer flow, then there is no
+	 * point in calling the flow dissector again.
+	 */
+	if (!has_inner)
+		return 0;
+
+	if (!(hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_MASK))
+		return 0;
+
+	memset(&hash_keys, 0, sizeof(hash_keys));
+	skb_flow_dissect_flow_keys(skb, &keys, 0);
+
+	if (!(keys.control.flags & FLOW_DIS_ENCAPSULATION))
+		return 0;
+
+	if (keys.control.addr_type == FLOW_DISSECTOR_KEY_IPV4_ADDRS) {
+		hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
+		if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_SRC_IP)
+			hash_keys.addrs.v4addrs.src = keys.addrs.v4addrs.src;
+		if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_DST_IP)
+			hash_keys.addrs.v4addrs.dst = keys.addrs.v4addrs.dst;
+	} else if (keys.control.addr_type == FLOW_DISSECTOR_KEY_IPV6_ADDRS) {
+		hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS;
+		if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_SRC_IP)
+			hash_keys.addrs.v6addrs.src = keys.addrs.v6addrs.src;
+		if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_DST_IP)
+			hash_keys.addrs.v6addrs.dst = keys.addrs.v6addrs.dst;
+		if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_FLOWLABEL)
+			hash_keys.tags.flow_label = keys.tags.flow_label;
+	}
+
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_IP_PROTO)
+		hash_keys.basic.ip_proto = keys.basic.ip_proto;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_SRC_PORT)
+		hash_keys.ports.src = keys.ports.src;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_DST_PORT)
+		hash_keys.ports.dst = keys.ports.dst;
+
+	return flow_hash_from_keys(&hash_keys);
+}
+
+static u32 fib_multipath_custom_hash_skb(const struct net *net,
+					 const struct sk_buff *skb)
+{
+	u32 mhash, mhash_inner;
+	bool has_inner = true;
+
+	mhash = fib_multipath_custom_hash_outer(net, skb, &has_inner);
+	mhash_inner = fib_multipath_custom_hash_inner(net, skb, has_inner);
+
+	return jhash_2words(mhash, mhash_inner, 0);
+}
+
+static u32 fib_multipath_custom_hash_fl4(const struct net *net,
+					 const struct flowi4 *fl4)
+{
+	u32 hash_fields = net->ipv4.sysctl_fib_multipath_hash_fields;
+	struct flow_keys hash_keys;
+
+	if (!(hash_fields & FIB_MULTIPATH_HASH_FIELD_OUTER_MASK))
+		return 0;
+
+	memset(&hash_keys, 0, sizeof(hash_keys));
+	hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_SRC_IP)
+		hash_keys.addrs.v4addrs.src = fl4->saddr;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_DST_IP)
+		hash_keys.addrs.v4addrs.dst = fl4->daddr;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_IP_PROTO)
+		hash_keys.basic.ip_proto = fl4->flowi4_proto;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_SRC_PORT)
+		hash_keys.ports.src = fl4->fl4_sport;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_DST_PORT)
+		hash_keys.ports.dst = fl4->fl4_dport;
+
+	return flow_hash_from_keys(&hash_keys);
+}
+
 /* if skb is set it will be used and fl4 can be NULL */
 int fib_multipath_hash(const struct net *net, const struct flowi4 *fl4,
 		       const struct sk_buff *skb, struct flow_keys *flkeys)
@@ -1991,6 +2106,12 @@ int fib_multipath_hash(const struct net *net, const struct flowi4 *fl4,
 		}
 		mhash = flow_hash_from_keys(&hash_keys);
 		break;
+	case 3:
+		if (skb)
+			mhash = fib_multipath_custom_hash_skb(net, skb);
+		else
+			mhash = fib_multipath_custom_hash_fl4(net, fl4);
+		break;
 	}
 
 	if (multipath_hash)
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index da627c4d633a..90b3b924b761 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -30,6 +30,7 @@
 #include <net/netevent.h>
 
 static int two = 2;
+static int three __maybe_unused = 3;
 static int four = 4;
 static int thousand = 1000;
 static int tcp_retr1_max = 255;
@@ -1053,7 +1054,7 @@ static struct ctl_table ipv4_net_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_fib_multipath_hash_policy,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= &two,
+		.extra2		= &three,
 	},
 	{
 		.procname	= "fib_multipath_hash_fields",
-- 
2.31.1



* [RFC PATCH net-next v2 04/10] ipv6: Use a more suitable label name
  2021-05-09 15:16 [RFC PATCH net-next v2 00/10] Add support for custom multipath hash Ido Schimmel
                   ` (2 preceding siblings ...)
  2021-05-09 15:16 ` [RFC PATCH net-next v2 03/10] ipv4: Add custom multipath hash policy Ido Schimmel
@ 2021-05-09 15:16 ` Ido Schimmel
  2021-05-09 15:16 ` [RFC PATCH net-next v2 05/10] ipv6: Calculate multipath hash inside switch statement Ido Schimmel
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Ido Schimmel @ 2021-05-09 15:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, dsahern, petrm, roopa, nikolay, ssuryaextr, mlxsw,
	Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

The 'out_timer' label was added in commit 63152fc0de4d ("[NETNS][IPV6]
ip6_fib - gc timer per namespace") when the timer was allocated on the
heap.

Commit 417f28bb3407 ("netns: dont alloc ipv6 fib timer list") removed
the allocation, but kept the label name.

Rename it to a more suitable name.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/ipv6/ip6_fib.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 679699e953f1..33d2d6a4e28c 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -2362,7 +2362,7 @@ static int __net_init fib6_net_init(struct net *net)
 
 	net->ipv6.rt6_stats = kzalloc(sizeof(*net->ipv6.rt6_stats), GFP_KERNEL);
 	if (!net->ipv6.rt6_stats)
-		goto out_timer;
+		goto out_notifier;
 
 	/* Avoid false sharing : Use at least a full cache line */
 	size = max_t(size_t, size, L1_CACHE_BYTES);
@@ -2407,7 +2407,7 @@ static int __net_init fib6_net_init(struct net *net)
 	kfree(net->ipv6.fib_table_hash);
 out_rt6_stats:
 	kfree(net->ipv6.rt6_stats);
-out_timer:
+out_notifier:
 	fib6_notifier_exit(net);
 	return -ENOMEM;
 }
-- 
2.31.1



* [RFC PATCH net-next v2 05/10] ipv6: Calculate multipath hash inside switch statement
  2021-05-09 15:16 [RFC PATCH net-next v2 00/10] Add support for custom multipath hash Ido Schimmel
                   ` (3 preceding siblings ...)
  2021-05-09 15:16 ` [RFC PATCH net-next v2 04/10] ipv6: Use a more suitable label name Ido Schimmel
@ 2021-05-09 15:16 ` Ido Schimmel
  2021-05-09 15:16 ` [RFC PATCH net-next v2 06/10] ipv6: Add a sysctl to control multipath hash fields Ido Schimmel
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Ido Schimmel @ 2021-05-09 15:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, dsahern, petrm, roopa, nikolay, ssuryaextr, mlxsw,
	Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

A subsequent patch will add another multipath hash policy where the
multipath hash is calculated directly by the policy-specific code and
not outside of the switch statement.

Prepare for this change by moving the multipath hash calculation inside
the switch statement.

No functional changes intended.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 net/ipv6/route.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index a22822bdbf39..9935e18146e5 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2331,7 +2331,7 @@ u32 rt6_multipath_hash(const struct net *net, const struct flowi6 *fl6,
 		       const struct sk_buff *skb, struct flow_keys *flkeys)
 {
 	struct flow_keys hash_keys;
-	u32 mhash;
+	u32 mhash = 0;
 
 	switch (ip6_multipath_hash_policy(net)) {
 	case 0:
@@ -2345,6 +2345,7 @@ u32 rt6_multipath_hash(const struct net *net, const struct flowi6 *fl6,
 			hash_keys.tags.flow_label = (__force u32)flowi6_get_flowlabel(fl6);
 			hash_keys.basic.ip_proto = fl6->flowi6_proto;
 		}
+		mhash = flow_hash_from_keys(&hash_keys);
 		break;
 	case 1:
 		if (skb) {
@@ -2376,6 +2377,7 @@ u32 rt6_multipath_hash(const struct net *net, const struct flowi6 *fl6,
 			hash_keys.ports.dst = fl6->fl6_dport;
 			hash_keys.basic.ip_proto = fl6->flowi6_proto;
 		}
+		mhash = flow_hash_from_keys(&hash_keys);
 		break;
 	case 2:
 		memset(&hash_keys, 0, sizeof(hash_keys));
@@ -2412,9 +2414,9 @@ u32 rt6_multipath_hash(const struct net *net, const struct flowi6 *fl6,
 			hash_keys.tags.flow_label = (__force u32)flowi6_get_flowlabel(fl6);
 			hash_keys.basic.ip_proto = fl6->flowi6_proto;
 		}
+		mhash = flow_hash_from_keys(&hash_keys);
 		break;
 	}
-	mhash = flow_hash_from_keys(&hash_keys);
 
 	return mhash >> 1;
 }
-- 
2.31.1



* [RFC PATCH net-next v2 06/10] ipv6: Add a sysctl to control multipath hash fields
  2021-05-09 15:16 [RFC PATCH net-next v2 00/10] Add support for custom multipath hash Ido Schimmel
                   ` (4 preceding siblings ...)
  2021-05-09 15:16 ` [RFC PATCH net-next v2 05/10] ipv6: Calculate multipath hash inside switch statement Ido Schimmel
@ 2021-05-09 15:16 ` Ido Schimmel
  2021-05-09 15:16 ` [RFC PATCH net-next v2 07/10] ipv6: Add custom multipath hash policy Ido Schimmel
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Ido Schimmel @ 2021-05-09 15:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, dsahern, petrm, roopa, nikolay, ssuryaextr, mlxsw,
	Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

A subsequent patch will add a new multipath hash policy where the packet
fields used for multipath hash calculation are determined by user space.
This patch adds a sysctl that allows user space to set these fields.

The packet fields are represented using a bitmask and are common between
IPv4 and IPv6 to allow user space to use the same numbering across both
protocols. For example, to hash based on standard 5-tuple:

 # sysctl -w net.ipv6.fib_multipath_hash_fields=0x0037
 net.ipv6.fib_multipath_hash_fields = 0x0037

To avoid introducing holes in 'struct netns_sysctl_ipv6', move the
'bindv6only' field after the multipath hash fields.

The kernel rejects unknown fields, for example:

 # sysctl -w net.ipv6.fib_multipath_hash_fields=0x1000
 sysctl: setting key "net.ipv6.fib_multipath_hash_fields": Invalid argument

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 Documentation/networking/ip-sysctl.rst | 27 ++++++++++++++++++++++++++
 include/net/ipv6.h                     |  8 ++++++++
 include/net/netns/ipv6.h               |  3 ++-
 net/ipv6/ip6_fib.c                     |  5 +++++
 net/ipv6/sysctl_net_ipv6.c             | 11 +++++++++++
 5 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index 2c3b7677222e..f7ae65524ff3 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -1773,6 +1773,33 @@ fib_multipath_hash_policy - INTEGER
 	- 1 - Layer 4 (standard 5-tuple)
 	- 2 - Layer 3 or inner Layer 3 if present
 
+fib_multipath_hash_fields - UNSIGNED INTEGER
+	When fib_multipath_hash_policy is set to 3 (custom multipath hash), the
+	fields used for multipath hash calculation are determined by this
+	sysctl.
+
+	This value is a bitmask which enables various fields for multipath hash
+	calculation.
+
+	Possible fields are:
+
+	====== ============================
+	0x0001 Source IP address
+	0x0002 Destination IP address
+	0x0004 IP protocol
+	0x0008 Flow Label
+	0x0010 Source port
+	0x0020 Destination port
+	0x0040 Inner source IP address
+	0x0080 Inner destination IP address
+	0x0100 Inner IP protocol
+	0x0200 Inner Flow Label
+	0x0400 Inner source port
+	0x0800 Inner destination port
+	====== ============================
+
+	Default: 0x0007 (source IP, destination IP and IP protocol)
+
 anycast_src_echo_reply - BOOLEAN
 	Controls the use of anycast addresses as source addresses for ICMPv6
 	echo reply
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 448bf2b34759..f2d0ecc257bb 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -926,11 +926,19 @@ static inline int ip6_multipath_hash_policy(const struct net *net)
 {
 	return net->ipv6.sysctl.multipath_hash_policy;
 }
+static inline u32 ip6_multipath_hash_fields(const struct net *net)
+{
+	return net->ipv6.sysctl.multipath_hash_fields;
+}
 #else
 static inline int ip6_multipath_hash_policy(const struct net *net)
 {
 	return 0;
 }
+static inline u32 ip6_multipath_hash_fields(const struct net *net)
+{
+	return 0;
+}
 #endif
 
 /*
diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
index 6153c8067009..bde0b7adb4a3 100644
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -28,8 +28,9 @@ struct netns_sysctl_ipv6 {
 	int ip6_rt_gc_elasticity;
 	int ip6_rt_mtu_expires;
 	int ip6_rt_min_advmss;
-	u8 bindv6only;
+	u32 multipath_hash_fields;
 	u8 multipath_hash_policy;
+	u8 bindv6only;
 	u8 flowlabel_consistency;
 	u8 auto_flowlabels;
 	int icmpv6_time;
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 33d2d6a4e28c..2d650dc24349 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -32,6 +32,7 @@
 #include <net/lwtunnel.h>
 #include <net/fib_notifier.h>
 
+#include <net/ip_fib.h>
 #include <net/ip6_fib.h>
 #include <net/ip6_route.h>
 
@@ -2355,6 +2356,10 @@ static int __net_init fib6_net_init(struct net *net)
 	if (err)
 		return err;
 
+	/* Default to 3-tuple */
+	net->ipv6.sysctl.multipath_hash_fields =
+		FIB_MULTIPATH_HASH_FIELD_DEFAULT_MASK;
+
 	spin_lock_init(&net->ipv6.fib6_gc_lock);
 	rwlock_init(&net->ipv6.fib6_walker_lock);
 	INIT_LIST_HEAD(&net->ipv6.fib6_walkers);
diff --git a/net/ipv6/sysctl_net_ipv6.c b/net/ipv6/sysctl_net_ipv6.c
index 27102c3d6e1d..fb73d9839bc8 100644
--- a/net/ipv6/sysctl_net_ipv6.c
+++ b/net/ipv6/sysctl_net_ipv6.c
@@ -17,6 +17,7 @@
 #include <net/addrconf.h>
 #include <net/inet_frag.h>
 #include <net/netevent.h>
+#include <net/ip_fib.h>
 #ifdef CONFIG_NETLABEL
 #include <net/calipso.h>
 #endif
@@ -24,6 +25,8 @@
 static int two = 2;
 static int flowlabel_reflect_max = 0x7;
 static int auto_flowlabels_max = IP6_AUTO_FLOW_LABEL_MAX;
+static u32 rt6_multipath_hash_fields_all_mask =
+	FIB_MULTIPATH_HASH_FIELD_ALL_MASK;
 
 static int proc_rt6_multipath_hash_policy(struct ctl_table *table, int write,
 					  void *buffer, size_t *lenp, loff_t *ppos)
@@ -151,6 +154,14 @@ static struct ctl_table ipv6_table_template[] = {
 		.extra1		= SYSCTL_ZERO,
 		.extra2		= &two,
 	},
+	{
+		.procname	= "fib_multipath_hash_fields",
+		.data		= &init_net.ipv6.sysctl.multipath_hash_fields,
+		.maxlen		= sizeof(u32),
+		.mode		= 0644,
+		.proc_handler	= proc_douintvec_minmax,
+		.extra2		= &rt6_multipath_hash_fields_all_mask,
+	},
 	{
 		.procname	= "seg6_flowlabel",
 		.data		= &init_net.ipv6.sysctl.seg6_flowlabel,
-- 
2.31.1



* [RFC PATCH net-next v2 07/10] ipv6: Add custom multipath hash policy
  2021-05-09 15:16 [RFC PATCH net-next v2 00/10] Add support for custom multipath hash Ido Schimmel
                   ` (5 preceding siblings ...)
  2021-05-09 15:16 ` [RFC PATCH net-next v2 06/10] ipv6: Add a sysctl to control multipath hash fields Ido Schimmel
@ 2021-05-09 15:16 ` Ido Schimmel
  2021-05-09 15:16 ` [RFC PATCH net-next v2 08/10] selftests: forwarding: Add test for custom multipath hash Ido Schimmel
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: Ido Schimmel @ 2021-05-09 15:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, dsahern, petrm, roopa, nikolay, ssuryaextr, mlxsw,
	Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Add a new multipath hash policy where the packet fields used for hash
calculation are determined by user space via the
fib_multipath_hash_fields sysctl that was introduced in the previous
patch.

The current set of available packet fields includes both outer and inner
fields, which requires two invocations of the flow dissector. Avoid
unnecessary dissection by skipping the outer dissection if no outer
fields are required, and the inner dissection if no inner fields are
required.
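
To illustrate which dissections a given sysctl value implies, here is a
small sketch based on the mask alone. The OUTER_MASK and INNER_MASK values
are assumptions derived from the documented bits (0x0001-0x0020 and
0x0040-0x0800); the helper name mhash_dissections is hypothetical. The
kernel code additionally skips the inner dissection when the outer
dissection found no encapsulation, which this sketch does not model:

```shell
#!/bin/bash
# Assumed values: OR of the documented outer and inner field bits.
OUTER_MASK=0x003f
INNER_MASK=0x0fc0

# Hypothetical helper: print which flow dissections a given
# fib_multipath_hash_fields value would require.
mhash_dissections()
{
	local fields=$(($1)) out=""

	(( fields & OUTER_MASK )) && out="outer"
	(( fields & INNER_MASK )) && out="${out:+$out }inner"
	echo "${out:-none}"
}

mhash_dissections 0x0037	# prints "outer"
mhash_dissections 0x0040	# prints "inner"
mhash_dissections 0x0041	# prints "outer inner"
```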

As with the existing policies, when an skb is not available, packet
fields are extracted from the provided flow key, in which case only
outer fields are considered.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 Documentation/networking/ip-sysctl.rst |   2 +
 net/ipv6/route.c                       | 125 +++++++++++++++++++++++++
 net/ipv6/sysctl_net_ipv6.c             |   3 +-
 3 files changed, 129 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index f7ae65524ff3..3d5f17d3c4f6 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -1772,6 +1772,8 @@ fib_multipath_hash_policy - INTEGER
 	- 0 - Layer 3 (source and destination addresses plus flow label)
 	- 1 - Layer 4 (standard 5-tuple)
 	- 2 - Layer 3 or inner Layer 3 if present
+	- 3 - Custom multipath hash. Fields used for multipath hash calculation
+	  are determined by fib_multipath_hash_fields sysctl
 
 fib_multipath_hash_fields - UNSIGNED INTEGER
 	When fib_multipath_hash_policy is set to 3 (custom multipath hash), the
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 9935e18146e5..c46889381ae4 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2326,6 +2326,125 @@ static void ip6_multipath_l3_keys(const struct sk_buff *skb,
 	}
 }
 
+static u32 rt6_multipath_custom_hash_outer(const struct net *net,
+					   const struct sk_buff *skb,
+					   bool *p_has_inner)
+{
+	u32 hash_fields = ip6_multipath_hash_fields(net);
+	struct flow_keys keys, hash_keys;
+
+	if (!(hash_fields & FIB_MULTIPATH_HASH_FIELD_OUTER_MASK))
+		return 0;
+
+	memset(&hash_keys, 0, sizeof(hash_keys));
+	skb_flow_dissect_flow_keys(skb, &keys, FLOW_DISSECTOR_F_STOP_AT_ENCAP);
+
+	hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_SRC_IP)
+		hash_keys.addrs.v6addrs.src = keys.addrs.v6addrs.src;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_DST_IP)
+		hash_keys.addrs.v6addrs.dst = keys.addrs.v6addrs.dst;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_IP_PROTO)
+		hash_keys.basic.ip_proto = keys.basic.ip_proto;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_FLOWLABEL)
+		hash_keys.tags.flow_label = keys.tags.flow_label;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_SRC_PORT)
+		hash_keys.ports.src = keys.ports.src;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_DST_PORT)
+		hash_keys.ports.dst = keys.ports.dst;
+
+	*p_has_inner = !!(keys.control.flags & FLOW_DIS_ENCAPSULATION);
+	return flow_hash_from_keys(&hash_keys);
+}
+
+static u32 rt6_multipath_custom_hash_inner(const struct net *net,
+					   const struct sk_buff *skb,
+					   bool has_inner)
+{
+	u32 hash_fields = ip6_multipath_hash_fields(net);
+	struct flow_keys keys, hash_keys;
+
+	/* We assume the packet carries an encapsulation, but if none was
+	 * encountered during dissection of the outer flow, then there is no
+	 * point in calling the flow dissector again.
+	 */
+	if (!has_inner)
+		return 0;
+
+	if (!(hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_MASK))
+		return 0;
+
+	memset(&hash_keys, 0, sizeof(hash_keys));
+	skb_flow_dissect_flow_keys(skb, &keys, 0);
+
+	if (!(keys.control.flags & FLOW_DIS_ENCAPSULATION))
+		return 0;
+
+	if (keys.control.addr_type == FLOW_DISSECTOR_KEY_IPV4_ADDRS) {
+		hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
+		if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_SRC_IP)
+			hash_keys.addrs.v4addrs.src = keys.addrs.v4addrs.src;
+		if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_DST_IP)
+			hash_keys.addrs.v4addrs.dst = keys.addrs.v4addrs.dst;
+	} else if (keys.control.addr_type == FLOW_DISSECTOR_KEY_IPV6_ADDRS) {
+		hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS;
+		if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_SRC_IP)
+			hash_keys.addrs.v6addrs.src = keys.addrs.v6addrs.src;
+		if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_DST_IP)
+			hash_keys.addrs.v6addrs.dst = keys.addrs.v6addrs.dst;
+		if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_FLOWLABEL)
+			hash_keys.tags.flow_label = keys.tags.flow_label;
+	}
+
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_IP_PROTO)
+		hash_keys.basic.ip_proto = keys.basic.ip_proto;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_SRC_PORT)
+		hash_keys.ports.src = keys.ports.src;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_DST_PORT)
+		hash_keys.ports.dst = keys.ports.dst;
+
+	return flow_hash_from_keys(&hash_keys);
+}
+
+static u32 rt6_multipath_custom_hash_skb(const struct net *net,
+					 const struct sk_buff *skb)
+{
+	u32 mhash, mhash_inner;
+	bool has_inner = true;
+
+	mhash = rt6_multipath_custom_hash_outer(net, skb, &has_inner);
+	mhash_inner = rt6_multipath_custom_hash_inner(net, skb, has_inner);
+
+	return jhash_2words(mhash, mhash_inner, 0);
+}
+
+static u32 rt6_multipath_custom_hash_fl6(const struct net *net,
+					 const struct flowi6 *fl6)
+{
+	u32 hash_fields = ip6_multipath_hash_fields(net);
+	struct flow_keys hash_keys;
+
+	if (!(hash_fields & FIB_MULTIPATH_HASH_FIELD_OUTER_MASK))
+		return 0;
+
+	memset(&hash_keys, 0, sizeof(hash_keys));
+	hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_SRC_IP)
+		hash_keys.addrs.v6addrs.src = fl6->saddr;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_DST_IP)
+		hash_keys.addrs.v6addrs.dst = fl6->daddr;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_IP_PROTO)
+		hash_keys.basic.ip_proto = fl6->flowi6_proto;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_FLOWLABEL)
+		hash_keys.tags.flow_label = (__force u32)flowi6_get_flowlabel(fl6);
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_SRC_PORT)
+		hash_keys.ports.src = fl6->fl6_sport;
+	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_DST_PORT)
+		hash_keys.ports.dst = fl6->fl6_dport;
+
+	return flow_hash_from_keys(&hash_keys);
+}
+
 /* if skb is set it will be used and fl6 can be NULL */
 u32 rt6_multipath_hash(const struct net *net, const struct flowi6 *fl6,
 		       const struct sk_buff *skb, struct flow_keys *flkeys)
@@ -2416,6 +2535,12 @@ u32 rt6_multipath_hash(const struct net *net, const struct flowi6 *fl6,
 		}
 		mhash = flow_hash_from_keys(&hash_keys);
 		break;
+	case 3:
+		if (skb)
+			mhash = rt6_multipath_custom_hash_skb(net, skb);
+		else
+			mhash = rt6_multipath_custom_hash_fl6(net, fl6);
+		break;
 	}
 
 	return mhash >> 1;
diff --git a/net/ipv6/sysctl_net_ipv6.c b/net/ipv6/sysctl_net_ipv6.c
index fb73d9839bc8..fe608f619ffd 100644
--- a/net/ipv6/sysctl_net_ipv6.c
+++ b/net/ipv6/sysctl_net_ipv6.c
@@ -23,6 +23,7 @@
 #endif
 
 static int two = 2;
+static int three = 3;
 static int flowlabel_reflect_max = 0x7;
 static int auto_flowlabels_max = IP6_AUTO_FLOW_LABEL_MAX;
 static u32 rt6_multipath_hash_fields_all_mask =
@@ -152,7 +153,7 @@ static struct ctl_table ipv6_table_template[] = {
 		.mode		= 0644,
 		.proc_handler   = proc_rt6_multipath_hash_policy,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= &two,
+		.extra2		= &three,
 	},
 	{
 		.procname	= "fib_multipath_hash_fields",
-- 
2.31.1



* [RFC PATCH net-next v2 08/10] selftests: forwarding: Add test for custom multipath hash
  2021-05-09 15:16 [RFC PATCH net-next v2 00/10] Add support for custom multipath hash Ido Schimmel
                   ` (6 preceding siblings ...)
  2021-05-09 15:16 ` [RFC PATCH net-next v2 07/10] ipv6: Add custom multipath hash policy Ido Schimmel
@ 2021-05-09 15:16 ` Ido Schimmel
  2021-05-09 15:16 ` [RFC PATCH net-next v2 09/10] selftests: forwarding: Add test for custom multipath hash with IPv4 GRE Ido Schimmel
  2021-05-09 15:16 ` [RFC PATCH net-next v2 10/10] selftests: forwarding: Add test for custom multipath hash with IPv6 GRE Ido Schimmel
  9 siblings, 0 replies; 18+ messages in thread
From: Ido Schimmel @ 2021-05-09 15:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, dsahern, petrm, roopa, nikolay, ssuryaextr, mlxsw,
	Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Test that when the hash policy is set to custom, traffic is distributed
only according to the outer fields set in the fib_multipath_hash_fields
sysctl.

Each time, set a different field and make sure traffic is only
distributed when that field is changed in the packet stream.

The test only verifies the behavior with non-encapsulated IPv4 and IPv6
packets. Subsequent patches will add tests for IPv4/IPv6 overlays on top
of IPv4/IPv6 underlay networks.

Example output:

 # ./custom_multipath_hash.sh
 TEST: ping                                                          [ OK ]
 TEST: ping6                                                         [ OK ]
 INFO: Running IPv4 custom multipath hash tests
 TEST: Multipath hash field: Source IP (balanced)                    [ OK ]
 INFO: Packets sent on path1 / path2: 6353 / 6254
 TEST: Multipath hash field: Source IP (unbalanced)                  [ OK ]
 INFO: Packets sent on path1 / path2: 0 / 12600
 TEST: Multipath hash field: Destination IP (balanced)               [ OK ]
 INFO: Packets sent on path1 / path2: 6102 / 6502
 TEST: Multipath hash field: Destination IP (unbalanced)             [ OK ]
 INFO: Packets sent on path1 / path2: 1 / 12601
 TEST: Multipath hash field: Source port (balanced)                  [ OK ]
 INFO: Packets sent on path1 / path2: 16428 / 16345
 TEST: Multipath hash field: Source port (unbalanced)                [ OK ]
 INFO: Packets sent on path1 / path2: 32770 / 2
 TEST: Multipath hash field: Destination port (balanced)             [ OK ]
 INFO: Packets sent on path1 / path2: 16428 / 16345
 TEST: Multipath hash field: Destination port (unbalanced)           [ OK ]
 INFO: Packets sent on path1 / path2: 32770 / 2
 INFO: Running IPv6 custom multipath hash tests
 TEST: Multipath hash field: Source IP (balanced)                    [ OK ]
 INFO: Packets sent on path1 / path2: 6704 / 5903
 TEST: Multipath hash field: Source IP (unbalanced)                  [ OK ]
 INFO: Packets sent on path1 / path2: 12600 / 0
 TEST: Multipath hash field: Destination IP (balanced)               [ OK ]
 INFO: Packets sent on path1 / path2: 5551 / 7052
 TEST: Multipath hash field: Destination IP (unbalanced)             [ OK ]
 INFO: Packets sent on path1 / path2: 12603 / 0
 TEST: Multipath hash field: Flowlabel (balanced)                    [ OK ]
 INFO: Packets sent on path1 / path2: 8378 / 8080
 TEST: Multipath hash field: Flowlabel (unbalanced)                  [ OK ]
 INFO: Packets sent on path1 / path2: 2 / 12603
 TEST: Multipath hash field: Source port (balanced)                  [ OK ]
 INFO: Packets sent on path1 / path2: 16385 / 16388
 TEST: Multipath hash field: Source port (unbalanced)                [ OK ]
 INFO: Packets sent on path1 / path2: 0 / 32774
 TEST: Multipath hash field: Destination port (balanced)             [ OK ]
 INFO: Packets sent on path1 / path2: 16386 / 16390
 TEST: Multipath hash field: Destination port (unbalanced)           [ OK ]
 INFO: Packets sent on path1 / path2: 32771 / 2
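
The verdicts above follow from a 20% threshold on the relative difference
between the two path counters. A minimal integer-only sketch of that check
is shown below. The function name is_balanced is hypothetical, and the test
script itself computes the percentage with bc rather than with shell
arithmetic:

```shell
#!/bin/bash
# Hypothetical helper: succeed if the two per-path packet counts differ by
# at most 20% of their sum.
is_balanced()
{
	local d1=$1 d2=$2
	local diff=$((d2 - d1)) sum=$((d1 + d2))

	# No traffic at all cannot be classified; report it separately.
	(( sum > 0 )) || return 2

	# |diff| / sum <= 20%, written without division.
	(( diff * 100 <= 20 * sum && diff * 100 >= -20 * sum ))
}

is_balanced 6353 6254 && echo "balanced"
is_balanced 0 12600 || echo "unbalanced"
```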

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 .../net/forwarding/custom_multipath_hash.sh   | 364 ++++++++++++++++++
 1 file changed, 364 insertions(+)
 create mode 100755 tools/testing/selftests/net/forwarding/custom_multipath_hash.sh

diff --git a/tools/testing/selftests/net/forwarding/custom_multipath_hash.sh b/tools/testing/selftests/net/forwarding/custom_multipath_hash.sh
new file mode 100755
index 000000000000..a15d21dc035a
--- /dev/null
+++ b/tools/testing/selftests/net/forwarding/custom_multipath_hash.sh
@@ -0,0 +1,364 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Test traffic distribution between two paths when using custom hash policy.
+#
+# +--------------------------------+
+# | H1                             |
+# |                     $h1 +      |
+# |   198.51.100.{2-253}/24 |      |
+# |   2001:db8:1::{2-fd}/64 |      |
+# +-------------------------|------+
+#                           |
+# +-------------------------|-------------------------+
+# | SW1                     |                         |
+# |                    $rp1 +                         |
+# |         198.51.100.1/24                           |
+# |        2001:db8:1::1/64                           |
+# |                                                   |
+# |                                                   |
+# |            $rp11 +             + $rp12            |
+# |     192.0.2.1/28 |             | 192.0.2.17/28    |
+# | 2001:db8:2::1/64 |             | 2001:db8:3::1/64 |
+# +------------------|-------------|------------------+
+#                    |             |
+# +------------------|-------------|------------------+
+# | SW2              |             |                  |
+# |                  |             |                  |
+# |            $rp21 +             + $rp22            |
+# |     192.0.2.2/28                 192.0.2.18/28    |
+# | 2001:db8:2::2/64                 2001:db8:3::2/64 |
+# |                                                   |
+# |                                                   |
+# |                    $rp2 +                         |
+# |          203.0.113.1/24 |                         |
+# |        2001:db8:4::1/64 |                         |
+# +-------------------------|-------------------------+
+#                           |
+# +-------------------------|------+
+# | H2                      |      |
+# |                     $h2 +      |
+# |    203.0.113.{2-253}/24        |
+# |   2001:db8:4::{2-fd}/64        |
+# +--------------------------------+
+
+ALL_TESTS="
+	ping_ipv4
+	ping_ipv6
+	custom_hash
+"
+
+NUM_NETIFS=8
+source lib.sh
+
+h1_create()
+{
+	simple_if_init $h1 198.51.100.2/24 2001:db8:1::2/64
+	ip route add vrf v$h1 default via 198.51.100.1 dev $h1
+	ip -6 route add vrf v$h1 default via 2001:db8:1::1 dev $h1
+}
+
+h1_destroy()
+{
+	ip -6 route del vrf v$h1 default
+	ip route del vrf v$h1 default
+	simple_if_fini $h1 198.51.100.2/24 2001:db8:1::2/64
+}
+
+sw1_create()
+{
+	simple_if_init $rp1 198.51.100.1/24 2001:db8:1::1/64
+	__simple_if_init $rp11 v$rp1 192.0.2.1/28 2001:db8:2::1/64
+	__simple_if_init $rp12 v$rp1 192.0.2.17/28 2001:db8:3::1/64
+
+	ip route add vrf v$rp1 203.0.113.0/24 \
+		nexthop via 192.0.2.2 dev $rp11 \
+		nexthop via 192.0.2.18 dev $rp12
+
+	ip -6 route add vrf v$rp1 2001:db8:4::/64 \
+		nexthop via 2001:db8:2::2 dev $rp11 \
+		nexthop via 2001:db8:3::2 dev $rp12
+}
+
+sw1_destroy()
+{
+	ip -6 route del vrf v$rp1 2001:db8:4::/64
+
+	ip route del vrf v$rp1 203.0.113.0/24
+
+	__simple_if_fini $rp12 192.0.2.17/28 2001:db8:3::1/64
+	__simple_if_fini $rp11 192.0.2.1/28 2001:db8:2::1/64
+	simple_if_fini $rp1 198.51.100.1/24 2001:db8:1::1/64
+}
+
+sw2_create()
+{
+	simple_if_init $rp2 203.0.113.1/24 2001:db8:4::1/64
+	__simple_if_init $rp21 v$rp2 192.0.2.2/28 2001:db8:2::2/64
+	__simple_if_init $rp22 v$rp2 192.0.2.18/28 2001:db8:3::2/64
+
+	ip route add vrf v$rp2 198.51.100.0/24 \
+		nexthop via 192.0.2.1 dev $rp21 \
+		nexthop via 192.0.2.17 dev $rp22
+
+	ip -6 route add vrf v$rp2 2001:db8:1::/64 \
+		nexthop via 2001:db8:2::1 dev $rp21 \
+		nexthop via 2001:db8:3::1 dev $rp22
+}
+
+sw2_destroy()
+{
+	ip -6 route del vrf v$rp2 2001:db8:1::/64
+
+	ip route del vrf v$rp2 198.51.100.0/24
+
+	__simple_if_fini $rp22 192.0.2.18/28 2001:db8:3::2/64
+	__simple_if_fini $rp21 192.0.2.2/28 2001:db8:2::2/64
+	simple_if_fini $rp2 203.0.113.1/24 2001:db8:4::1/64
+}
+
+h2_create()
+{
+	simple_if_init $h2 203.0.113.2/24 2001:db8:4::2/64
+	ip route add vrf v$h2 default via 203.0.113.1 dev $h2
+	ip -6 route add vrf v$h2 default via 2001:db8:4::1 dev $h2
+}
+
+h2_destroy()
+{
+	ip -6 route del vrf v$h2 default
+	ip route del vrf v$h2 default
+	simple_if_fini $h2 203.0.113.2/24 2001:db8:4::2/64
+}
+
+setup_prepare()
+{
+	h1=${NETIFS[p1]}
+
+	rp1=${NETIFS[p2]}
+
+	rp11=${NETIFS[p3]}
+	rp21=${NETIFS[p4]}
+
+	rp12=${NETIFS[p5]}
+	rp22=${NETIFS[p6]}
+
+	rp2=${NETIFS[p7]}
+
+	h2=${NETIFS[p8]}
+
+	vrf_prepare
+	h1_create
+	sw1_create
+	sw2_create
+	h2_create
+
+	forwarding_enable
+}
+
+cleanup()
+{
+	pre_cleanup
+
+	forwarding_restore
+
+	h2_destroy
+	sw2_destroy
+	sw1_destroy
+	h1_destroy
+	vrf_cleanup
+}
+
+ping_ipv4()
+{
+	ping_test $h1 203.0.113.2
+}
+
+ping_ipv6()
+{
+	ping6_test $h1 2001:db8:4::2
+}
+
+send_src_ipv4()
+{
+	$MZ $h1 -q -p 64 -A "198.51.100.2-198.51.100.253" -B 203.0.113.2 \
+		-d 1msec -c 50 -t udp "sp=20000,dp=30000"
+}
+
+send_dst_ipv4()
+{
+	$MZ $h1 -q -p 64 -A 198.51.100.2 -B "203.0.113.2-203.0.113.253" \
+		-d 1msec -c 50 -t udp "sp=20000,dp=30000"
+}
+
+send_src_udp4()
+{
+	$MZ $h1 -q -p 64 -A 198.51.100.2 -B 203.0.113.2 \
+		-d 1msec -t udp "sp=0-32768,dp=30000"
+}
+
+send_dst_udp4()
+{
+	$MZ $h1 -q -p 64 -A 198.51.100.2 -B 203.0.113.2 \
+		-d 1msec -t udp "sp=20000,dp=0-32768"
+}
+
+send_src_ipv6()
+{
+	$MZ -6 $h1 -q -p 64 -A "2001:db8:1::2-2001:db8:1::fd" -B 2001:db8:4::2 \
+		-d 1msec -c 50 -t udp "sp=20000,dp=30000"
+}
+
+send_dst_ipv6()
+{
+	$MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B "2001:db8:4::2-2001:db8:4::fd" \
+		-d 1msec -c 50 -t udp "sp=20000,dp=30000"
+}
+
+send_flowlabel()
+{
+	# Generate 16384 echo requests, each with a random flow label.
+	for _ in $(seq 1 16384); do
+		ip vrf exec v$h1 \
+			$PING6 2001:db8:4::2 -F 0 -c 1 -q >/dev/null 2>&1
+	done
+}
+
+send_src_udp6()
+{
+	$MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B 2001:db8:4::2 \
+		-d 1msec -t udp "sp=0-32768,dp=30000"
+}
+
+send_dst_udp6()
+{
+	$MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B 2001:db8:4::2 \
+		-d 1msec -t udp "sp=20000,dp=0-32768"
+}
+
+custom_hash_test()
+{
+	local field="$1"; shift
+	local balanced="$1"; shift
+	local send_flows="$@"
+
+	RET=0
+
+	local t0_rp11=$(link_stats_tx_packets_get $rp11)
+	local t0_rp12=$(link_stats_tx_packets_get $rp12)
+
+	$send_flows
+
+	local t1_rp11=$(link_stats_tx_packets_get $rp11)
+	local t1_rp12=$(link_stats_tx_packets_get $rp12)
+
+	local d_rp11=$((t1_rp11 - t0_rp11))
+	local d_rp12=$((t1_rp12 - t0_rp12))
+
+	local diff=$((d_rp12 - d_rp11))
+	local sum=$((d_rp11 + d_rp12))
+
+	local pct=$(echo "$diff / $sum * 100" | bc -l)
+	local is_balanced=$(echo "-20 <= $pct && $pct <= 20" | bc)
+
+	[[ ( $is_balanced -eq 1 && $balanced == "balanced" ) ||
+	   ( $is_balanced -eq 0 && $balanced == "unbalanced" ) ]]
+	check_err $? "Expected traffic to be $balanced, but it is not"
+
+	log_test "Multipath hash field: $field ($balanced)"
+	log_info "Packets sent on path1 / path2: $d_rp11 / $d_rp12"
+}
+
+custom_hash_v4()
+{
+	log_info "Running IPv4 custom multipath hash tests"
+
+	sysctl_set net.ipv4.fib_multipath_hash_policy 3
+
+	# Prevent the neighbour table from overflowing, as different neighbour
+	# entries will be created on $rp2 when using different destination IPs.
+	sysctl_set net.ipv4.neigh.default.gc_thresh1 1024
+	sysctl_set net.ipv4.neigh.default.gc_thresh2 1024
+	sysctl_set net.ipv4.neigh.default.gc_thresh3 1024
+
+	sysctl_set net.ipv4.fib_multipath_hash_fields 0x0001
+	custom_hash_test "Source IP" "balanced" send_src_ipv4
+	custom_hash_test "Source IP" "unbalanced" send_dst_ipv4
+
+	sysctl_set net.ipv4.fib_multipath_hash_fields 0x0002
+	custom_hash_test "Destination IP" "balanced" send_dst_ipv4
+	custom_hash_test "Destination IP" "unbalanced" send_src_ipv4
+
+	sysctl_set net.ipv4.fib_multipath_hash_fields 0x0010
+	custom_hash_test "Source port" "balanced" send_src_udp4
+	custom_hash_test "Source port" "unbalanced" send_dst_udp4
+
+	sysctl_set net.ipv4.fib_multipath_hash_fields 0x0020
+	custom_hash_test "Destination port" "balanced" send_dst_udp4
+	custom_hash_test "Destination port" "unbalanced" send_src_udp4
+
+	sysctl_restore net.ipv4.neigh.default.gc_thresh3
+	sysctl_restore net.ipv4.neigh.default.gc_thresh2
+	sysctl_restore net.ipv4.neigh.default.gc_thresh1
+
+	sysctl_restore net.ipv4.fib_multipath_hash_policy
+}
+
+custom_hash_v6()
+{
+	log_info "Running IPv6 custom multipath hash tests"
+
+	sysctl_set net.ipv6.fib_multipath_hash_policy 3
+
+	# Prevent the neighbour table from overflowing, as different neighbour
+	# entries will be created on $rp2 when using different destination IPs.
+	sysctl_set net.ipv6.neigh.default.gc_thresh1 1024
+	sysctl_set net.ipv6.neigh.default.gc_thresh2 1024
+	sysctl_set net.ipv6.neigh.default.gc_thresh3 1024
+
+	sysctl_set net.ipv6.fib_multipath_hash_fields 0x0001
+	custom_hash_test "Source IP" "balanced" send_src_ipv6
+	custom_hash_test "Source IP" "unbalanced" send_dst_ipv6
+
+	sysctl_set net.ipv6.fib_multipath_hash_fields 0x0002
+	custom_hash_test "Destination IP" "balanced" send_dst_ipv6
+	custom_hash_test "Destination IP" "unbalanced" send_src_ipv6
+
+	sysctl_set net.ipv6.fib_multipath_hash_fields 0x0008
+	custom_hash_test "Flowlabel" "balanced" send_flowlabel
+	custom_hash_test "Flowlabel" "unbalanced" send_src_ipv6
+
+	sysctl_set net.ipv6.fib_multipath_hash_fields 0x0010
+	custom_hash_test "Source port" "balanced" send_src_udp6
+	custom_hash_test "Source port" "unbalanced" send_dst_udp6
+
+	sysctl_set net.ipv6.fib_multipath_hash_fields 0x0020
+	custom_hash_test "Destination port" "balanced" send_dst_udp6
+	custom_hash_test "Destination port" "unbalanced" send_src_udp6
+
+	sysctl_restore net.ipv6.neigh.default.gc_thresh3
+	sysctl_restore net.ipv6.neigh.default.gc_thresh2
+	sysctl_restore net.ipv6.neigh.default.gc_thresh1
+
+	sysctl_restore net.ipv6.fib_multipath_hash_policy
+}
+
+custom_hash()
+{
+	# Test that when the hash policy is set to custom, traffic is
+	# distributed only according to the fields set in the
+	# fib_multipath_hash_fields sysctl.
+	#
+	# Each time set a different field and make sure traffic is only
+	# distributed when the field is changed in the packet stream.
+	custom_hash_v4
+	custom_hash_v6
+}
+
+trap cleanup EXIT
+
+setup_prepare
+setup_wait
+tests_run
+
+exit $EXIT_STATUS
-- 
2.31.1



* [RFC PATCH net-next v2 09/10] selftests: forwarding: Add test for custom multipath hash with IPv4 GRE
  2021-05-09 15:16 [RFC PATCH net-next v2 00/10] Add support for custom multipath hash Ido Schimmel
                   ` (7 preceding siblings ...)
  2021-05-09 15:16 ` [RFC PATCH net-next v2 08/10] selftests: forwarding: Add test for custom multipath hash Ido Schimmel
@ 2021-05-09 15:16 ` Ido Schimmel
  2021-05-09 15:16 ` [RFC PATCH net-next v2 10/10] selftests: forwarding: Add test for custom multipath hash with IPv6 GRE Ido Schimmel
  9 siblings, 0 replies; 18+ messages in thread
From: Ido Schimmel @ 2021-05-09 15:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, dsahern, petrm, roopa, nikolay, ssuryaextr, mlxsw,
	Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Test that when the hash policy is set to custom, traffic is distributed
only according to the inner fields set in the fib_multipath_hash_fields
sysctl.

Each time, set a different field and make sure traffic is only
distributed when that field is changed in the packet stream.

The test only verifies the behavior of IPv4/IPv6 overlays on top of an
IPv4 underlay network. A subsequent patch will do the same with an IPv6
underlay network.

Example output:

 # ./gre_custom_multipath_hash.sh
 TEST: ping                                                          [ OK ]
 TEST: ping6                                                         [ OK ]
 INFO: Running IPv4 overlay custom multipath hash tests
 TEST: Multipath hash field: Inner source IP (balanced)              [ OK ]
 INFO: Packets sent on path1 / path2: 6601 / 6001
 TEST: Multipath hash field: Inner source IP (unbalanced)            [ OK ]
 INFO: Packets sent on path1 / path2: 0 / 12600
 TEST: Multipath hash field: Inner destination IP (balanced)         [ OK ]
 INFO: Packets sent on path1 / path2: 6802 / 5802
 TEST: Multipath hash field: Inner destination IP (unbalanced)       [ OK ]
 INFO: Packets sent on path1 / path2: 12601 / 1
 TEST: Multipath hash field: Inner source port (balanced)            [ OK ]
 INFO: Packets sent on path1 / path2: 16430 / 16344
 TEST: Multipath hash field: Inner source port (unbalanced)          [ OK ]
 INFO: Packets sent on path1 / path2: 0 / 32772
 TEST: Multipath hash field: Inner destination port (balanced)       [ OK ]
 INFO: Packets sent on path1 / path2: 16430 / 16343
 TEST: Multipath hash field: Inner destination port (unbalanced)     [ OK ]
 INFO: Packets sent on path1 / path2: 0 / 32772
 INFO: Running IPv6 overlay custom multipath hash tests
 TEST: Multipath hash field: Inner source IP (balanced)              [ OK ]
 INFO: Packets sent on path1 / path2: 6702 / 5900
 TEST: Multipath hash field: Inner source IP (unbalanced)            [ OK ]
 INFO: Packets sent on path1 / path2: 0 / 12601
 TEST: Multipath hash field: Inner destination IP (balanced)         [ OK ]
 INFO: Packets sent on path1 / path2: 5751 / 6851
 TEST: Multipath hash field: Inner destination IP (unbalanced)       [ OK ]
 INFO: Packets sent on path1 / path2: 12602 / 1
 TEST: Multipath hash field: Inner flowlabel (balanced)              [ OK ]
 INFO: Packets sent on path1 / path2: 8364 / 8065
 TEST: Multipath hash field: Inner flowlabel (unbalanced)            [ OK ]
 INFO: Packets sent on path1 / path2: 12601 / 0
 TEST: Multipath hash field: Inner source port (balanced)            [ OK ]
 INFO: Packets sent on path1 / path2: 16425 / 16349
 TEST: Multipath hash field: Inner source port (unbalanced)          [ OK ]
 INFO: Packets sent on path1 / path2: 1 / 32770
 TEST: Multipath hash field: Inner destination port (balanced)       [ OK ]
 INFO: Packets sent on path1 / path2: 16425 / 16349
 TEST: Multipath hash field: Inner destination port (unbalanced)     [ OK ]
 INFO: Packets sent on path1 / path2: 2 / 32770

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 .../forwarding/gre_custom_multipath_hash.sh   | 456 ++++++++++++++++++
 1 file changed, 456 insertions(+)
 create mode 100755 tools/testing/selftests/net/forwarding/gre_custom_multipath_hash.sh

diff --git a/tools/testing/selftests/net/forwarding/gre_custom_multipath_hash.sh b/tools/testing/selftests/net/forwarding/gre_custom_multipath_hash.sh
new file mode 100755
index 000000000000..a73f52efcb6c
--- /dev/null
+++ b/tools/testing/selftests/net/forwarding/gre_custom_multipath_hash.sh
@@ -0,0 +1,456 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Test traffic distribution when there are multiple paths between the endpoints
+# of an IPv4 GRE tunnel. The tunnel carries IPv4 and IPv6 traffic between
+# multiple hosts. Multiple routes are in the underlay network. With the default
+# multipath policy, SW2 will only look at the outer IP addresses, hence only a
+# single route would be used.
+#
+# +--------------------------------+
+# | H1                             |
+# |                     $h1 +      |
+# |   198.51.100.{2-253}/24 |      |
+# |   2001:db8:1::{2-fd}/64 |      |
+# +-------------------------|------+
+#                           |
+# +-------------------------|------------------+
+# | SW1                     |                  |
+# |                    $ol1 +                  |
+# |         198.51.100.1/24                    |
+# |        2001:db8:1::1/64                    |
+# |                                            |
+# |   + g1 (gre)                               |
+# |     loc=192.0.2.1                          |
+# |     rem=192.0.2.2 --.                      |
+# |     tos=inherit     |                      |
+# |                     v                      |
+# |                     + $ul1                 |
+# |                     | 192.0.2.17/28        |
+# +---------------------|----------------------+
+#                       |
+# +---------------------|----------------------+
+# | SW2                 |                      |
+# |               $ul21 +                      |
+# |       192.0.2.18/28 |                      |
+# |                     |                      |
+# |   __________________+___                   |
+# |  /                      \                  |
+# |  |                      |                  |
+# |  + $ul22.111 (vlan)     + $ul22.222 (vlan) |
+# |  | 192.0.2.33/28        | 192.0.2.49/28    |
+# |  |                      |                  |
+# +--|----------------------|------------------+
+#    |                      |
+# +--|----------------------|------------------+
+# |  |                      |                  |
+# |  + $ul32.111 (vlan)     + $ul32.222 (vlan) |
+# |  | 192.0.2.34/28        | 192.0.2.50/28    |
+# |  |                      |                  |
+# |  \__________________+___/                  |
+# |                     |                      |
+# |                     |                      |
+# |               $ul31 +                      |
+# |       192.0.2.65/28 |                  SW3 |
+# +---------------------|----------------------+
+#                       |
+# +---------------------|----------------------+
+# |                     + $ul4                 |
+# |                     ^ 192.0.2.66/28        |
+# |                     |                      |
+# |   + g2 (gre)        |                      |
+# |     loc=192.0.2.2   |                      |
+# |     rem=192.0.2.1 --'                      |
+# |     tos=inherit                            |
+# |                                            |
+# |                    $ol4 +                  |
+# |          203.0.113.1/24 |                  |
+# |        2001:db8:2::1/64 |              SW4 |
+# +-------------------------|------------------+
+#                           |
+# +-------------------------|------+
+# |                         |      |
+# |                     $h2 +      |
+# |    203.0.113.{2-253}/24        |
+# |   2001:db8:2::{2-fd}/64     H2 |
+# +--------------------------------+
+
+ALL_TESTS="
+	ping_ipv4
+	ping_ipv6
+	custom_hash
+"
+
+NUM_NETIFS=10
+source lib.sh
+
+h1_create()
+{
+	simple_if_init $h1 198.51.100.2/24 2001:db8:1::2/64
+	ip route add vrf v$h1 default via 198.51.100.1 dev $h1
+	ip -6 route add vrf v$h1 default via 2001:db8:1::1 dev $h1
+}
+
+h1_destroy()
+{
+	ip -6 route del vrf v$h1 default
+	ip route del vrf v$h1 default
+	simple_if_fini $h1 198.51.100.2/24 2001:db8:1::2/64
+}
+
+sw1_create()
+{
+	simple_if_init $ol1 198.51.100.1/24 2001:db8:1::1/64
+	__simple_if_init $ul1 v$ol1 192.0.2.17/28
+
+	tunnel_create g1 gre 192.0.2.1 192.0.2.2 tos inherit dev v$ol1
+	__simple_if_init g1 v$ol1 192.0.2.1/32
+	ip route add vrf v$ol1 192.0.2.2/32 via 192.0.2.18
+
+	ip route add vrf v$ol1 203.0.113.0/24 dev g1
+	ip -6 route add vrf v$ol1 2001:db8:2::/64 dev g1
+}
+
+sw1_destroy()
+{
+	ip -6 route del vrf v$ol1 2001:db8:2::/64
+	ip route del vrf v$ol1 203.0.113.0/24
+
+	ip route del vrf v$ol1 192.0.2.2/32
+	__simple_if_fini g1 192.0.2.1/32
+	tunnel_destroy g1
+
+	__simple_if_fini $ul1 192.0.2.17/28
+	simple_if_fini $ol1 198.51.100.1/24 2001:db8:1::1/64
+}
+
+sw2_create()
+{
+	simple_if_init $ul21 192.0.2.18/28
+	__simple_if_init $ul22 v$ul21
+	vlan_create $ul22 111 v$ul21 192.0.2.33/28
+	vlan_create $ul22 222 v$ul21 192.0.2.49/28
+
+	ip route add vrf v$ul21 192.0.2.1/32 via 192.0.2.17
+	ip route add vrf v$ul21 192.0.2.2/32 \
+	   nexthop via 192.0.2.34 \
+	   nexthop via 192.0.2.50
+}
+
+sw2_destroy()
+{
+	ip route del vrf v$ul21 192.0.2.2/32
+	ip route del vrf v$ul21 192.0.2.1/32
+
+	vlan_destroy $ul22 222
+	vlan_destroy $ul22 111
+	__simple_if_fini $ul22
+	simple_if_fini $ul21 192.0.2.18/28
+}
+
+sw3_create()
+{
+	simple_if_init $ul31 192.0.2.65/28
+	__simple_if_init $ul32 v$ul31
+	vlan_create $ul32 111 v$ul31 192.0.2.34/28
+	vlan_create $ul32 222 v$ul31 192.0.2.50/28
+
+	ip route add vrf v$ul31 192.0.2.2/32 via 192.0.2.66
+	ip route add vrf v$ul31 192.0.2.1/32 \
+	   nexthop via 192.0.2.33 \
+	   nexthop via 192.0.2.49
+
+	tc qdisc add dev $ul32 clsact
+	tc filter add dev $ul32 ingress pref 111 prot 802.1Q \
+	   flower vlan_id 111 action pass
+	tc filter add dev $ul32 ingress pref 222 prot 802.1Q \
+	   flower vlan_id 222 action pass
+}
+
+sw3_destroy()
+{
+	tc qdisc del dev $ul32 clsact
+
+	ip route del vrf v$ul31 192.0.2.1/32
+	ip route del vrf v$ul31 192.0.2.2/32
+
+	vlan_destroy $ul32 222
+	vlan_destroy $ul32 111
+	__simple_if_fini $ul32
+	simple_if_fini $ul31 192.0.2.65/28
+}
+
+sw4_create()
+{
+	simple_if_init $ol4 203.0.113.1/24 2001:db8:2::1/64
+	__simple_if_init $ul4 v$ol4 192.0.2.66/28
+
+	tunnel_create g2 gre 192.0.2.2 192.0.2.1 tos inherit dev v$ol4
+	__simple_if_init g2 v$ol4 192.0.2.2/32
+	ip route add vrf v$ol4 192.0.2.1/32 via 192.0.2.65
+
+	ip route add vrf v$ol4 198.51.100.0/24 dev g2
+	ip -6 route add vrf v$ol4 2001:db8:1::/64 dev g2
+}
+
+sw4_destroy()
+{
+	ip -6 route del vrf v$ol4 2001:db8:1::/64
+	ip route del vrf v$ol4 198.51.100.0/24
+
+	ip route del vrf v$ol4 192.0.2.1/32
+	__simple_if_fini g2 192.0.2.2/32
+	tunnel_destroy g2
+
+	__simple_if_fini $ul4 192.0.2.66/28
+	simple_if_fini $ol4 203.0.113.1/24 2001:db8:2::1/64
+}
+
+h2_create()
+{
+	simple_if_init $h2 203.0.113.2/24 2001:db8:2::2/64
+	ip route add vrf v$h2 default via 203.0.113.1 dev $h2
+	ip -6 route add vrf v$h2 default via 2001:db8:2::1 dev $h2
+}
+
+h2_destroy()
+{
+	ip -6 route del vrf v$h2 default
+	ip route del vrf v$h2 default
+	simple_if_fini $h2 203.0.113.2/24 2001:db8:2::2/64
+}
+
+setup_prepare()
+{
+	h1=${NETIFS[p1]}
+
+	ol1=${NETIFS[p2]}
+	ul1=${NETIFS[p3]}
+
+	ul21=${NETIFS[p4]}
+	ul22=${NETIFS[p5]}
+
+	ul32=${NETIFS[p6]}
+	ul31=${NETIFS[p7]}
+
+	ul4=${NETIFS[p8]}
+	ol4=${NETIFS[p9]}
+
+	h2=${NETIFS[p10]}
+
+	vrf_prepare
+	h1_create
+	sw1_create
+	sw2_create
+	sw3_create
+	sw4_create
+	h2_create
+
+	forwarding_enable
+}
+
+cleanup()
+{
+	pre_cleanup
+
+	forwarding_restore
+
+	h2_destroy
+	sw4_destroy
+	sw3_destroy
+	sw2_destroy
+	sw1_destroy
+	h1_destroy
+	vrf_cleanup
+}
+
+ping_ipv4()
+{
+	ping_test $h1 203.0.113.2
+}
+
+ping_ipv6()
+{
+	ping6_test $h1 2001:db8:2::2
+}
+
+send_src_ipv4()
+{
+	$MZ $h1 -q -p 64 -A "198.51.100.2-198.51.100.253" -B 203.0.113.2 \
+		-d 1msec -c 50 -t udp "sp=20000,dp=30000"
+}
+
+send_dst_ipv4()
+{
+	$MZ $h1 -q -p 64 -A 198.51.100.2 -B "203.0.113.2-203.0.113.253" \
+		-d 1msec -c 50 -t udp "sp=20000,dp=30000"
+}
+
+send_src_udp4()
+{
+	$MZ $h1 -q -p 64 -A 198.51.100.2 -B 203.0.113.2 \
+		-d 1msec -t udp "sp=0-32768,dp=30000"
+}
+
+send_dst_udp4()
+{
+	$MZ $h1 -q -p 64 -A 198.51.100.2 -B 203.0.113.2 \
+		-d 1msec -t udp "sp=20000,dp=0-32768"
+}
+
+send_src_ipv6()
+{
+	$MZ -6 $h1 -q -p 64 -A "2001:db8:1::2-2001:db8:1::fd" -B 2001:db8:2::2 \
+		-d 1msec -c 50 -t udp "sp=20000,dp=30000"
+}
+
+send_dst_ipv6()
+{
+	$MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B "2001:db8:2::2-2001:db8:2::fd" \
+		-d 1msec -c 50 -t udp "sp=20000,dp=30000"
+}
+
+send_flowlabel()
+{
+	# Generate 16384 echo requests, each with a random flow label.
+	for _ in $(seq 1 16384); do
+		ip vrf exec v$h1 \
+			$PING6 2001:db8:2::2 -F 0 -c 1 -q >/dev/null 2>&1
+	done
+}
+
+send_src_udp6()
+{
+	$MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B 2001:db8:2::2 \
+		-d 1msec -t udp "sp=0-32768,dp=30000"
+}
+
+send_dst_udp6()
+{
+	$MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B 2001:db8:2::2 \
+		-d 1msec -t udp "sp=20000,dp=0-32768"
+}
+
+custom_hash_test()
+{
+	local field="$1"; shift
+	local balanced="$1"; shift
+	local send_flows="$@"
+
+	RET=0
+
+	local t0_111=$(tc_rule_stats_get $ul32 111 ingress)
+	local t0_222=$(tc_rule_stats_get $ul32 222 ingress)
+
+	$send_flows
+
+	local t1_111=$(tc_rule_stats_get $ul32 111 ingress)
+	local t1_222=$(tc_rule_stats_get $ul32 222 ingress)
+
+	local d111=$((t1_111 - t0_111))
+	local d222=$((t1_222 - t0_222))
+
+	local diff=$((d222 - d111))
+	local sum=$((d111 + d222))
+
+	local pct=$(echo "$diff / $sum * 100" | bc -l)
+	local is_balanced=$(echo "-20 <= $pct && $pct <= 20" | bc)
+
+	[[ ( $is_balanced -eq 1 && $balanced == "balanced" ) ||
+	   ( $is_balanced -eq 0 && $balanced == "unbalanced" ) ]]
+	check_err $? "Expected traffic to be $balanced, but it is not"
+
+	log_test "Multipath hash field: $field ($balanced)"
+	log_info "Packets sent on path1 / path2: $d111 / $d222"
+}
+
+custom_hash_v4()
+{
+	log_info "Running IPv4 overlay custom multipath hash tests"
+
+	# Prevent the neighbour table from overflowing, as different neighbour
+	# entries will be created on $ol4 when using different destination IPs.
+	sysctl_set net.ipv4.neigh.default.gc_thresh1 1024
+	sysctl_set net.ipv4.neigh.default.gc_thresh2 1024
+	sysctl_set net.ipv4.neigh.default.gc_thresh3 1024
+
+	sysctl_set net.ipv4.fib_multipath_hash_fields 0x0040
+	custom_hash_test "Inner source IP" "balanced" send_src_ipv4
+	custom_hash_test "Inner source IP" "unbalanced" send_dst_ipv4
+
+	sysctl_set net.ipv4.fib_multipath_hash_fields 0x0080
+	custom_hash_test "Inner destination IP" "balanced" send_dst_ipv4
+	custom_hash_test "Inner destination IP" "unbalanced" send_src_ipv4
+
+	sysctl_set net.ipv4.fib_multipath_hash_fields 0x0400
+	custom_hash_test "Inner source port" "balanced" send_src_udp4
+	custom_hash_test "Inner source port" "unbalanced" send_dst_udp4
+
+	sysctl_set net.ipv4.fib_multipath_hash_fields 0x0800
+	custom_hash_test "Inner destination port" "balanced" send_dst_udp4
+	custom_hash_test "Inner destination port" "unbalanced" send_src_udp4
+
+	sysctl_restore net.ipv4.neigh.default.gc_thresh3
+	sysctl_restore net.ipv4.neigh.default.gc_thresh2
+	sysctl_restore net.ipv4.neigh.default.gc_thresh1
+}
+
+custom_hash_v6()
+{
+	log_info "Running IPv6 overlay custom multipath hash tests"
+
+	# Prevent the neighbour table from overflowing, as different neighbour
+	# entries will be created on $ol4 when using different destination IPs.
+	sysctl_set net.ipv6.neigh.default.gc_thresh1 1024
+	sysctl_set net.ipv6.neigh.default.gc_thresh2 1024
+	sysctl_set net.ipv6.neigh.default.gc_thresh3 1024
+
+	sysctl_set net.ipv4.fib_multipath_hash_fields 0x0040
+	custom_hash_test "Inner source IP" "balanced" send_src_ipv6
+	custom_hash_test "Inner source IP" "unbalanced" send_dst_ipv6
+
+	sysctl_set net.ipv4.fib_multipath_hash_fields 0x0080
+	custom_hash_test "Inner destination IP" "balanced" send_dst_ipv6
+	custom_hash_test "Inner destination IP" "unbalanced" send_src_ipv6
+
+	sysctl_set net.ipv4.fib_multipath_hash_fields 0x0200
+	custom_hash_test "Inner flowlabel" "balanced" send_flowlabel
+	custom_hash_test "Inner flowlabel" "unbalanced" send_src_ipv6
+
+	sysctl_set net.ipv4.fib_multipath_hash_fields 0x0400
+	custom_hash_test "Inner source port" "balanced" send_src_udp6
+	custom_hash_test "Inner source port" "unbalanced" send_dst_udp6
+
+	sysctl_set net.ipv4.fib_multipath_hash_fields 0x0800
+	custom_hash_test "Inner destination port" "balanced" send_dst_udp6
+	custom_hash_test "Inner destination port" "unbalanced" send_src_udp6
+
+	sysctl_restore net.ipv6.neigh.default.gc_thresh3
+	sysctl_restore net.ipv6.neigh.default.gc_thresh2
+	sysctl_restore net.ipv6.neigh.default.gc_thresh1
+}
+
+custom_hash()
+{
+	# Test that when the hash policy is set to custom, traffic is
+	# distributed only according to the fields set in the
+	# fib_multipath_hash_fields sysctl.
+	#
+	# Each time, set a different field and make sure traffic is only
+	# distributed when that field varies in the packet stream.
+
+	sysctl_set net.ipv4.fib_multipath_hash_policy 3
+
+	custom_hash_v4
+	custom_hash_v6
+
+	sysctl_restore net.ipv4.fib_multipath_hash_policy
+}
+
+trap cleanup EXIT
+
+setup_prepare
+setup_wait
+tests_run
+
+exit $EXIT_STATUS
-- 
2.31.1



* [RFC PATCH net-next v2 10/10] selftests: forwarding: Add test for custom multipath hash with IPv6 GRE
  2021-05-09 15:16 [RFC PATCH net-next v2 00/10] Add support for custom multipath hash Ido Schimmel
                   ` (8 preceding siblings ...)
  2021-05-09 15:16 ` [RFC PATCH net-next v2 09/10] selftests: forwarding: Add test for custom multipath hash with IPv4 GRE Ido Schimmel
@ 2021-05-09 15:16 ` Ido Schimmel
  9 siblings, 0 replies; 18+ messages in thread
From: Ido Schimmel @ 2021-05-09 15:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, kuba, dsahern, petrm, roopa, nikolay, ssuryaextr, mlxsw,
	Ido Schimmel

From: Ido Schimmel <idosch@nvidia.com>

Test that when the hash policy is set to custom, traffic is distributed
only according to the inner fields set in the fib_multipath_hash_fields
sysctl.

Each time, a different field is set, and the test makes sure traffic is
only distributed when that field varies in the packet stream.

The test only verifies the behavior of IPv4/IPv6 overlays on top of an
IPv6 underlay network. The previous patch verified the same with an IPv4
underlay network.

Example output:

 # ./ip6gre_custom_multipath_hash.sh
 TEST: ping                                                          [ OK ]
 TEST: ping6                                                         [ OK ]
 INFO: Running IPv4 overlay custom multipath hash tests
 TEST: Multipath hash field: Inner source IP (balanced)              [ OK ]
 INFO: Packets sent on path1 / path2: 6602 / 6002
 TEST: Multipath hash field: Inner source IP (unbalanced)            [ OK ]
 INFO: Packets sent on path1 / path2: 1 / 12601
 TEST: Multipath hash field: Inner destination IP (balanced)         [ OK ]
 INFO: Packets sent on path1 / path2: 6802 / 5801
 TEST: Multipath hash field: Inner destination IP (unbalanced)       [ OK ]
 INFO: Packets sent on path1 / path2: 12602 / 3
 TEST: Multipath hash field: Inner source port (balanced)            [ OK ]
 INFO: Packets sent on path1 / path2: 16431 / 16344
 TEST: Multipath hash field: Inner source port (unbalanced)          [ OK ]
 INFO: Packets sent on path1 / path2: 0 / 32773
 TEST: Multipath hash field: Inner destination port (balanced)       [ OK ]
 INFO: Packets sent on path1 / path2: 16431 / 16344
 TEST: Multipath hash field: Inner destination port (unbalanced)     [ OK ]
 INFO: Packets sent on path1 / path2: 2 / 32772
 INFO: Running IPv6 overlay custom multipath hash tests
 TEST: Multipath hash field: Inner source IP (balanced)              [ OK ]
 INFO: Packets sent on path1 / path2: 6704 / 5902
 TEST: Multipath hash field: Inner source IP (unbalanced)            [ OK ]
 INFO: Packets sent on path1 / path2: 1 / 12600
 TEST: Multipath hash field: Inner destination IP (balanced)         [ OK ]
 INFO: Packets sent on path1 / path2: 5751 / 6852
 TEST: Multipath hash field: Inner destination IP (unbalanced)       [ OK ]
 INFO: Packets sent on path1 / path2: 12602 / 0
 TEST: Multipath hash field: Inner flowlabel (balanced)              [ OK ]
 INFO: Packets sent on path1 / path2: 8272 / 8181
 TEST: Multipath hash field: Inner flowlabel (unbalanced)            [ OK ]
 INFO: Packets sent on path1 / path2: 3 / 12602
 TEST: Multipath hash field: Inner source port (balanced)            [ OK ]
 INFO: Packets sent on path1 / path2: 16424 / 16351
 TEST: Multipath hash field: Inner source port (unbalanced)          [ OK ]
 INFO: Packets sent on path1 / path2: 3 / 32774
 TEST: Multipath hash field: Inner destination port (balanced)       [ OK ]
 INFO: Packets sent on path1 / path2: 16425 / 16350
 TEST: Multipath hash field: Inner destination port (unbalanced)     [ OK ]
 INFO: Packets sent on path1 / path2: 2 / 32773
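
The verdicts above follow from the per-path packet counts. As a rough
sketch (not part of the patch), the check in custom_hash_test() can be
restated with integer arithmetic only: the split counts as balanced when
the difference between the two paths is at most 20% of their sum. The
helper name classify_split and the handling of an empty run are
assumptions made for illustration; the selftest itself uses bc.

```shell
#!/bin/bash
# Hypothetical helper, not part of the selftest. It restates the
# "-20 <= pct <= 20" check from custom_hash_test() without bc:
# |d222 - d111| * 100 <= 20 * (d111 + d222).
classify_split()
{
	local d111=$1 d222=$2
	local diff=$((d222 - d111))
	local sum=$((d111 + d222))

	# Assumption: a run with no packets at all is reported as
	# unbalanced rather than dividing by zero.
	if [ "$sum" -eq 0 ]; then
		echo unbalanced
		return
	fi

	# Take the absolute value of the difference.
	if [ "$diff" -lt 0 ]; then
		diff=$((-diff))
	fi

	if [ $((diff * 100)) -le $((20 * sum)) ]; then
		echo balanced
	else
		echo unbalanced
	fi
}

# Packet counts taken from the first two IPv4 overlay results above.
classify_split 6602 6002
classify_split 1 12601
```

With the counts from the first test, the difference is 600 out of 12604
packets, well inside the 20% window; in the second, nearly all packets
take path2.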

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 .../ip6gre_custom_multipath_hash.sh           | 458 ++++++++++++++++++
 1 file changed, 458 insertions(+)
 create mode 100755 tools/testing/selftests/net/forwarding/ip6gre_custom_multipath_hash.sh

diff --git a/tools/testing/selftests/net/forwarding/ip6gre_custom_multipath_hash.sh b/tools/testing/selftests/net/forwarding/ip6gre_custom_multipath_hash.sh
new file mode 100755
index 000000000000..8fea2c2e0b25
--- /dev/null
+++ b/tools/testing/selftests/net/forwarding/ip6gre_custom_multipath_hash.sh
@@ -0,0 +1,458 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Test traffic distribution when there are multiple paths between the endpoints
+# of an IPv6 GRE tunnel. The tunnel carries IPv4 and IPv6 traffic between
+# multiple hosts, and the underlay network has multiple routes. With the
+# default multipath policy, SW2 only looks at the outer IP addresses, hence
+# only a single route would be used.
+#
+# +--------------------------------+
+# | H1                             |
+# |                     $h1 +      |
+# |   198.51.100.{2-253}/24 |      |
+# |   2001:db8:1::{2-fd}/64 |      |
+# +-------------------------|------+
+#                           |
+# +-------------------------|-------------------+
+# | SW1                     |                   |
+# |                    $ol1 +                   |
+# |         198.51.100.1/24                     |
+# |        2001:db8:1::1/64                     |
+# |                                             |
+# |+ g1 (ip6gre)                                |
+# |  loc=2001:db8:3::1                          |
+# |  rem=2001:db8:3::2 -.                       |
+# |  tos=inherit        |                       |
+# |                     v                       |
+# |                     + $ul1                  |
+# |                     | 2001:db8:10::1/64     |
+# +---------------------|-----------------------+
+#                       |
+# +---------------------|-----------------------+
+# | SW2                 |                       |
+# |               $ul21 +                       |
+# |   2001:db8:10::2/64 |                       |
+# |                     |                       |
+# |   __________________+___                    |
+# |  /                      \                   |
+# |  |                      |                   |
+# |  + $ul22.111 (vlan)     + $ul22.222 (vlan)  |
+# |  | 2001:db8:11::1/64    | 2001:db8:12::1/64 |
+# |  |                      |                   |
+# +--|----------------------|-------------------+
+#    |                      |
+# +--|----------------------|-------------------+
+# |  |                      |                   |
+# |  + $ul32.111 (vlan)     + $ul32.222 (vlan)  |
+# |  | 2001:db8:11::2/64    | 2001:db8:12::2/64 |
+# |  |                      |                   |
+# |  \__________________+___/                   |
+# |                     |                       |
+# |                     |                       |
+# |               $ul31 +                       |
+# |   2001:db8:13::1/64 |                   SW3 |
+# +---------------------|-----------------------+
+#                       |
+# +---------------------|-----------------------+
+# |                     + $ul4                  |
+# |                     ^ 2001:db8:13::2/64     |
+# |                     |                       |
+# |+ g2 (ip6gre)        |                       |
+# |  loc=2001:db8:3::2  |                       |
+# |  rem=2001:db8:3::1 -'                       |
+# |  tos=inherit                                |
+# |                                             |
+# |                    $ol4 +                   |
+# |          203.0.113.1/24 |                   |
+# |        2001:db8:2::1/64 |               SW4 |
+# +-------------------------|-------------------+
+#                           |
+# +-------------------------|------+
+# |                         |      |
+# |                     $h2 +      |
+# |    203.0.113.{2-253}/24        |
+# |   2001:db8:2::{2-fd}/64     H2 |
+# +--------------------------------+
+
+ALL_TESTS="
+	ping_ipv4
+	ping_ipv6
+	custom_hash
+"
+
+NUM_NETIFS=10
+source lib.sh
+
+h1_create()
+{
+	simple_if_init $h1 198.51.100.2/24 2001:db8:1::2/64
+	ip route add vrf v$h1 default via 198.51.100.1 dev $h1
+	ip -6 route add vrf v$h1 default via 2001:db8:1::1 dev $h1
+}
+
+h1_destroy()
+{
+	ip -6 route del vrf v$h1 default
+	ip route del vrf v$h1 default
+	simple_if_fini $h1 198.51.100.2/24 2001:db8:1::2/64
+}
+
+sw1_create()
+{
+	simple_if_init $ol1 198.51.100.1/24 2001:db8:1::1/64
+	__simple_if_init $ul1 v$ol1 2001:db8:10::1/64
+
+	tunnel_create g1 ip6gre 2001:db8:3::1 2001:db8:3::2 tos inherit \
+		dev v$ol1
+	__simple_if_init g1 v$ol1 2001:db8:3::1/128
+	ip route add vrf v$ol1 2001:db8:3::2/128 via 2001:db8:10::2
+
+	ip route add vrf v$ol1 203.0.113.0/24 dev g1
+	ip -6 route add vrf v$ol1 2001:db8:2::/64 dev g1
+}
+
+sw1_destroy()
+{
+	ip -6 route del vrf v$ol1 2001:db8:2::/64
+	ip route del vrf v$ol1 203.0.113.0/24
+
+	ip route del vrf v$ol1 2001:db8:3::2/128
+	__simple_if_fini g1 2001:db8:3::1/128
+	tunnel_destroy g1
+
+	__simple_if_fini $ul1 2001:db8:10::1/64
+	simple_if_fini $ol1 198.51.100.1/24 2001:db8:1::1/64
+}
+
+sw2_create()
+{
+	simple_if_init $ul21 2001:db8:10::2/64
+	__simple_if_init $ul22 v$ul21
+	vlan_create $ul22 111 v$ul21 2001:db8:11::1/64
+	vlan_create $ul22 222 v$ul21 2001:db8:12::1/64
+
+	ip -6 route add vrf v$ul21 2001:db8:3::1/128 via 2001:db8:10::1
+	ip -6 route add vrf v$ul21 2001:db8:3::2/128 \
+	   nexthop via 2001:db8:11::2 \
+	   nexthop via 2001:db8:12::2
+}
+
+sw2_destroy()
+{
+	ip -6 route del vrf v$ul21 2001:db8:3::2/128
+	ip -6 route del vrf v$ul21 2001:db8:3::1/128
+
+	vlan_destroy $ul22 222
+	vlan_destroy $ul22 111
+	__simple_if_fini $ul22
+	simple_if_fini $ul21 2001:db8:10::2/64
+}
+
+sw3_create()
+{
+	simple_if_init $ul31 2001:db8:13::1/64
+	__simple_if_init $ul32 v$ul31
+	vlan_create $ul32 111 v$ul31 2001:db8:11::2/64
+	vlan_create $ul32 222 v$ul31 2001:db8:12::2/64
+
+	ip -6 route add vrf v$ul31 2001:db8:3::2/128 via 2001:db8:13::2
+	ip -6 route add vrf v$ul31 2001:db8:3::1/128 \
+	   nexthop via 2001:db8:11::1 \
+	   nexthop via 2001:db8:12::1
+
+	tc qdisc add dev $ul32 clsact
+	tc filter add dev $ul32 ingress pref 111 prot 802.1Q \
+	   flower vlan_id 111 action pass
+	tc filter add dev $ul32 ingress pref 222 prot 802.1Q \
+	   flower vlan_id 222 action pass
+}
+
+sw3_destroy()
+{
+	tc qdisc del dev $ul32 clsact
+
+	ip -6 route del vrf v$ul31 2001:db8:3::1/128
+	ip -6 route del vrf v$ul31 2001:db8:3::2/128
+
+	vlan_destroy $ul32 222
+	vlan_destroy $ul32 111
+	__simple_if_fini $ul32
+	simple_if_fini $ul31 2001:db8:13::1/64
+}
+
+sw4_create()
+{
+	simple_if_init $ol4 203.0.113.1/24 2001:db8:2::1/64
+	__simple_if_init $ul4 v$ol4 2001:db8:13::2/64
+
+	tunnel_create g2 ip6gre 2001:db8:3::2 2001:db8:3::1 tos inherit \
+		dev v$ol4
+	__simple_if_init g2 v$ol4 2001:db8:3::2/128
+	ip -6 route add vrf v$ol4 2001:db8:3::1/128 via 2001:db8:13::1
+
+	ip route add vrf v$ol4 198.51.100.0/24 dev g2
+	ip -6 route add vrf v$ol4 2001:db8:1::/64 dev g2
+}
+
+sw4_destroy()
+{
+	ip -6 route del vrf v$ol4 2001:db8:1::/64
+	ip route del vrf v$ol4 198.51.100.0/24
+
+	ip -6 route del vrf v$ol4 2001:db8:3::1/128
+	__simple_if_fini g2 2001:db8:3::2/128
+	tunnel_destroy g2
+
+	__simple_if_fini $ul4 2001:db8:13::2/64
+	simple_if_fini $ol4 203.0.113.1/24 2001:db8:2::1/64
+}
+
+h2_create()
+{
+	simple_if_init $h2 203.0.113.2/24 2001:db8:2::2/64
+	ip route add vrf v$h2 default via 203.0.113.1 dev $h2
+	ip -6 route add vrf v$h2 default via 2001:db8:2::1 dev $h2
+}
+
+h2_destroy()
+{
+	ip -6 route del vrf v$h2 default
+	ip route del vrf v$h2 default
+	simple_if_fini $h2 203.0.113.2/24 2001:db8:2::2/64
+}
+
+setup_prepare()
+{
+	h1=${NETIFS[p1]}
+
+	ol1=${NETIFS[p2]}
+	ul1=${NETIFS[p3]}
+
+	ul21=${NETIFS[p4]}
+	ul22=${NETIFS[p5]}
+
+	ul32=${NETIFS[p6]}
+	ul31=${NETIFS[p7]}
+
+	ul4=${NETIFS[p8]}
+	ol4=${NETIFS[p9]}
+
+	h2=${NETIFS[p10]}
+
+	vrf_prepare
+	h1_create
+	sw1_create
+	sw2_create
+	sw3_create
+	sw4_create
+	h2_create
+
+	forwarding_enable
+}
+
+cleanup()
+{
+	pre_cleanup
+
+	forwarding_restore
+
+	h2_destroy
+	sw4_destroy
+	sw3_destroy
+	sw2_destroy
+	sw1_destroy
+	h1_destroy
+	vrf_cleanup
+}
+
+ping_ipv4()
+{
+	ping_test $h1 203.0.113.2
+}
+
+ping_ipv6()
+{
+	ping6_test $h1 2001:db8:2::2
+}
+
+send_src_ipv4()
+{
+	$MZ $h1 -q -p 64 -A "198.51.100.2-198.51.100.253" -B 203.0.113.2 \
+		-d 1msec -c 50 -t udp "sp=20000,dp=30000"
+}
+
+send_dst_ipv4()
+{
+	$MZ $h1 -q -p 64 -A 198.51.100.2 -B "203.0.113.2-203.0.113.253" \
+		-d 1msec -c 50 -t udp "sp=20000,dp=30000"
+}
+
+send_src_udp4()
+{
+	$MZ $h1 -q -p 64 -A 198.51.100.2 -B 203.0.113.2 \
+		-d 1msec -t udp "sp=0-32768,dp=30000"
+}
+
+send_dst_udp4()
+{
+	$MZ $h1 -q -p 64 -A 198.51.100.2 -B 203.0.113.2 \
+		-d 1msec -t udp "sp=20000,dp=0-32768"
+}
+
+send_src_ipv6()
+{
+	$MZ -6 $h1 -q -p 64 -A "2001:db8:1::2-2001:db8:1::fd" -B 2001:db8:2::2 \
+		-d 1msec -c 50 -t udp "sp=20000,dp=30000"
+}
+
+send_dst_ipv6()
+{
+	$MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B "2001:db8:2::2-2001:db8:2::fd" \
+		-d 1msec -c 50 -t udp "sp=20000,dp=30000"
+}
+
+send_flowlabel()
+{
+	# Generate 16384 echo requests, each with a random flow label.
+	for _ in $(seq 1 16384); do
+		ip vrf exec v$h1 \
+			$PING6 2001:db8:2::2 -F 0 -c 1 -q >/dev/null 2>&1
+	done
+}
+
+send_src_udp6()
+{
+	$MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B 2001:db8:2::2 \
+		-d 1msec -t udp "sp=0-32768,dp=30000"
+}
+
+send_dst_udp6()
+{
+	$MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B 2001:db8:2::2 \
+		-d 1msec -t udp "sp=20000,dp=0-32768"
+}
+
+custom_hash_test()
+{
+	local field="$1"; shift
+	local balanced="$1"; shift
+	local send_flows="$@"
+
+	RET=0
+
+	local t0_111=$(tc_rule_stats_get $ul32 111 ingress)
+	local t0_222=$(tc_rule_stats_get $ul32 222 ingress)
+
+	$send_flows
+
+	local t1_111=$(tc_rule_stats_get $ul32 111 ingress)
+	local t1_222=$(tc_rule_stats_get $ul32 222 ingress)
+
+	local d111=$((t1_111 - t0_111))
+	local d222=$((t1_222 - t0_222))
+
+	local diff=$((d222 - d111))
+	local sum=$((d111 + d222))
+
+	local pct=$(echo "$diff / $sum * 100" | bc -l)
+	local is_balanced=$(echo "-20 <= $pct && $pct <= 20" | bc)
+
+	[[ ( $is_balanced -eq 1 && $balanced == "balanced" ) ||
+	   ( $is_balanced -eq 0 && $balanced == "unbalanced" ) ]]
+	check_err $? "Expected traffic to be $balanced, but it is not"
+
+	log_test "Multipath hash field: $field ($balanced)"
+	log_info "Packets sent on path1 / path2: $d111 / $d222"
+}
+
+custom_hash_v4()
+{
+	log_info "Running IPv4 overlay custom multipath hash tests"
+
+	# Prevent the neighbour table from overflowing, as different neighbour
+	# entries will be created on $ol4 when using different destination IPs.
+	sysctl_set net.ipv4.neigh.default.gc_thresh1 1024
+	sysctl_set net.ipv4.neigh.default.gc_thresh2 1024
+	sysctl_set net.ipv4.neigh.default.gc_thresh3 1024
+
+	sysctl_set net.ipv6.fib_multipath_hash_fields 0x0040
+	custom_hash_test "Inner source IP" "balanced" send_src_ipv4
+	custom_hash_test "Inner source IP" "unbalanced" send_dst_ipv4
+
+	sysctl_set net.ipv6.fib_multipath_hash_fields 0x0080
+	custom_hash_test "Inner destination IP" "balanced" send_dst_ipv4
+	custom_hash_test "Inner destination IP" "unbalanced" send_src_ipv4
+
+	sysctl_set net.ipv6.fib_multipath_hash_fields 0x0400
+	custom_hash_test "Inner source port" "balanced" send_src_udp4
+	custom_hash_test "Inner source port" "unbalanced" send_dst_udp4
+
+	sysctl_set net.ipv6.fib_multipath_hash_fields 0x0800
+	custom_hash_test "Inner destination port" "balanced" send_dst_udp4
+	custom_hash_test "Inner destination port" "unbalanced" send_src_udp4
+
+	sysctl_restore net.ipv4.neigh.default.gc_thresh3
+	sysctl_restore net.ipv4.neigh.default.gc_thresh2
+	sysctl_restore net.ipv4.neigh.default.gc_thresh1
+}
+
+custom_hash_v6()
+{
+	log_info "Running IPv6 overlay custom multipath hash tests"
+
+	# Prevent the neighbour table from overflowing, as different neighbour
+	# entries will be created on $ol4 when using different destination IPs.
+	sysctl_set net.ipv6.neigh.default.gc_thresh1 1024
+	sysctl_set net.ipv6.neigh.default.gc_thresh2 1024
+	sysctl_set net.ipv6.neigh.default.gc_thresh3 1024
+
+	sysctl_set net.ipv6.fib_multipath_hash_fields 0x0040
+	custom_hash_test "Inner source IP" "balanced" send_src_ipv6
+	custom_hash_test "Inner source IP" "unbalanced" send_dst_ipv6
+
+	sysctl_set net.ipv6.fib_multipath_hash_fields 0x0080
+	custom_hash_test "Inner destination IP" "balanced" send_dst_ipv6
+	custom_hash_test "Inner destination IP" "unbalanced" send_src_ipv6
+
+	sysctl_set net.ipv6.fib_multipath_hash_fields 0x0200
+	custom_hash_test "Inner flowlabel" "balanced" send_flowlabel
+	custom_hash_test "Inner flowlabel" "unbalanced" send_src_ipv6
+
+	sysctl_set net.ipv6.fib_multipath_hash_fields 0x0400
+	custom_hash_test "Inner source port" "balanced" send_src_udp6
+	custom_hash_test "Inner source port" "unbalanced" send_dst_udp6
+
+	sysctl_set net.ipv6.fib_multipath_hash_fields 0x0800
+	custom_hash_test "Inner destination port" "balanced" send_dst_udp6
+	custom_hash_test "Inner destination port" "unbalanced" send_src_udp6
+
+	sysctl_restore net.ipv6.neigh.default.gc_thresh3
+	sysctl_restore net.ipv6.neigh.default.gc_thresh2
+	sysctl_restore net.ipv6.neigh.default.gc_thresh1
+}
+
+custom_hash()
+{
+	# Test that when the hash policy is set to custom, traffic is
+	# distributed only according to the fields set in the
+	# fib_multipath_hash_fields sysctl.
+	#
+	# Each time, set a different field and make sure traffic is only
+	# distributed when that field varies in the packet stream.
+
+	sysctl_set net.ipv6.fib_multipath_hash_policy 3
+
+	custom_hash_v4
+	custom_hash_v6
+
+	sysctl_restore net.ipv6.fib_multipath_hash_policy
+}
+
+trap cleanup EXIT
+
+setup_prepare
+setup_wait
+tests_run
+
+exit $EXIT_STATUS
-- 
2.31.1



* Re: [RFC PATCH net-next v2 02/10] ipv4: Add a sysctl to control multipath hash fields
  2021-05-09 15:16 ` [RFC PATCH net-next v2 02/10] ipv4: Add a sysctl to control multipath hash fields Ido Schimmel
@ 2021-05-11 15:10   ` David Ahern
  2021-05-11 19:58     ` Ido Schimmel
  2021-05-11 15:49   ` David Ahern
  1 sibling, 1 reply; 18+ messages in thread
From: David Ahern @ 2021-05-11 15:10 UTC (permalink / raw)
  To: Ido Schimmel, netdev
  Cc: davem, kuba, petrm, roopa, nikolay, ssuryaextr, mlxsw, Ido Schimmel

On 5/9/21 9:16 AM, Ido Schimmel wrote:
> diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
> index c2ecc9894fd0..15982f830abc 100644
> --- a/Documentation/networking/ip-sysctl.rst
> +++ b/Documentation/networking/ip-sysctl.rst
> @@ -100,6 +100,33 @@ fib_multipath_hash_policy - INTEGER
>  	- 1 - Layer 4
>  	- 2 - Layer 3 or inner Layer 3 if present
>  
> +fib_multipath_hash_fields - UNSIGNED INTEGER
> +	When fib_multipath_hash_policy is set to 3 (custom multipath hash), the
> +	fields used for multipath hash calculation are determined by this
> +	sysctl.
> +
> +	This value is a bitmask which enables various fields for multipath hash
> +	calculation.
> +
> +	Possible fields are:
> +
> +	====== ============================
> +	0x0001 Source IP address
> +	0x0002 Destination IP address
> +	0x0004 IP protocol
> +	0x0008 Unused

Document that this bit is the flow label for IPv6 and is ignored for IPv4.
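
To make the suggestion concrete, here is a minimal user-space sketch of how the four bits quoted above could be interpreted per address family. The macro names and the effective_fields() helper are hypothetical and exist only for this illustration; only the bit values 0x0001 through 0x0008 come from the quoted table, and the treatment of 0x0008 follows the review comment, not the patch as posted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values from the quoted documentation table. The names are
 * hypothetical and local to this sketch.
 */
#define HASH_FIELD_SRC_IP	0x0001u
#define HASH_FIELD_DST_IP	0x0002u
#define HASH_FIELD_IP_PROTO	0x0004u
/* Listed as "Unused" in the quoted table; per the review comment it
 * would be the flow label for IPv6 and ignored for IPv4.
 */
#define HASH_FIELD_FLOWLABEL	0x0008u

#define HASH_FIELD_KNOWN_MASK	(HASH_FIELD_SRC_IP | HASH_FIELD_DST_IP | \
				 HASH_FIELD_IP_PROTO | HASH_FIELD_FLOWLABEL)

/* Return the subset of 'fields' that is meaningful for the given
 * address family. Only the four bits above are modelled here.
 */
static uint32_t effective_fields(uint32_t fields, bool is_ipv6)
{
	assert((fields & ~HASH_FIELD_KNOWN_MASK) == 0);

	return is_ipv6 ? fields : (fields & ~HASH_FIELD_FLOWLABEL);
}
```

With this model, a mask of 0x000f behaves like 0x0007 for IPv4 traffic and is used in full for IPv6 traffic.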

* Re: [RFC PATCH net-next v2 03/10] ipv4: Add custom multipath hash policy
  2021-05-09 15:16 ` [RFC PATCH net-next v2 03/10] ipv4: Add custom multipath hash policy Ido Schimmel
@ 2021-05-11 15:46   ` David Ahern
  2021-05-11 20:02     ` Ido Schimmel
  0 siblings, 1 reply; 18+ messages in thread
From: David Ahern @ 2021-05-11 15:46 UTC (permalink / raw)
  To: Ido Schimmel, netdev
  Cc: davem, kuba, petrm, roopa, nikolay, ssuryaextr, mlxsw, Ido Schimmel

On 5/9/21 9:16 AM, Ido Schimmel wrote:
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 9d61e969446e..a4c477475f4c 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -1906,6 +1906,121 @@ static void ip_multipath_l3_keys(const struct sk_buff *skb,
>  	hash_keys->addrs.v4addrs.dst = key_iph->daddr;
>  }
>  
> +static u32 fib_multipath_custom_hash_outer(const struct net *net,
> +					   const struct sk_buff *skb,
> +					   bool *p_has_inner)
> +{
> +	u32 hash_fields = net->ipv4.sysctl_fib_multipath_hash_fields;
> +	struct flow_keys keys, hash_keys;
> +
> +	if (!(hash_fields & FIB_MULTIPATH_HASH_FIELD_OUTER_MASK))
> +		return 0;
> +
> +	memset(&hash_keys, 0, sizeof(hash_keys));
> +	skb_flow_dissect_flow_keys(skb, &keys, FLOW_DISSECTOR_F_STOP_AT_ENCAP);
> +
> +	hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
> +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_SRC_IP)
> +		hash_keys.addrs.v4addrs.src = keys.addrs.v4addrs.src;
> +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_DST_IP)
> +		hash_keys.addrs.v4addrs.dst = keys.addrs.v4addrs.dst;
> +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_IP_PROTO)
> +		hash_keys.basic.ip_proto = keys.basic.ip_proto;
> +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_SRC_PORT)
> +		hash_keys.ports.src = keys.ports.src;
> +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_DST_PORT)
> +		hash_keys.ports.dst = keys.ports.dst;
> +
> +	*p_has_inner = !!(keys.control.flags & FLOW_DIS_ENCAPSULATION);
> +	return flow_hash_from_keys(&hash_keys);
> +}
> +
> +static u32 fib_multipath_custom_hash_inner(const struct net *net,
> +					   const struct sk_buff *skb,
> +					   bool has_inner)
> +{
> +	u32 hash_fields = net->ipv4.sysctl_fib_multipath_hash_fields;
> +	struct flow_keys keys, hash_keys;
> +
> +	/* We assume the packet carries an encapsulation, but if none was
> +	 * encountered during dissection of the outer flow, then there is no
> +	 * point in calling the flow dissector again.
> +	 */
> +	if (!has_inner)
> +		return 0;
> +
> +	if (!(hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_MASK))
> +		return 0;
> +
> +	memset(&hash_keys, 0, sizeof(hash_keys));
> +	skb_flow_dissect_flow_keys(skb, &keys, 0);
> +
> +	if (!(keys.control.flags & FLOW_DIS_ENCAPSULATION))
> +		return 0;
> +
> +	if (keys.control.addr_type == FLOW_DISSECTOR_KEY_IPV4_ADDRS) {
> +		hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
> +		if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_SRC_IP)
> +			hash_keys.addrs.v4addrs.src = keys.addrs.v4addrs.src;
> +		if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_DST_IP)
> +			hash_keys.addrs.v4addrs.dst = keys.addrs.v4addrs.dst;
> +	} else if (keys.control.addr_type == FLOW_DISSECTOR_KEY_IPV6_ADDRS) {
> +		hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS;
> +		if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_SRC_IP)
> +			hash_keys.addrs.v6addrs.src = keys.addrs.v6addrs.src;
> +		if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_DST_IP)
> +			hash_keys.addrs.v6addrs.dst = keys.addrs.v6addrs.dst;
> +		if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_FLOWLABEL)
> +			hash_keys.tags.flow_label = keys.tags.flow_label;
> +	}
> +
> +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_IP_PROTO)
> +		hash_keys.basic.ip_proto = keys.basic.ip_proto;
> +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_SRC_PORT)
> +		hash_keys.ports.src = keys.ports.src;
> +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_DST_PORT)
> +		hash_keys.ports.dst = keys.ports.dst;
> +
> +	return flow_hash_from_keys(&hash_keys);
> +}
> +
> +static u32 fib_multipath_custom_hash_skb(const struct net *net,
> +					 const struct sk_buff *skb)
> +{
> +	u32 mhash, mhash_inner;
> +	bool has_inner = true;
> +

Is it not possible to do the dissect once here and pass keys to outer
and inner functions?

	memset(&hash_keys, 0, sizeof(hash_keys));
	skb_flow_dissect_flow_keys(skb, &keys, flag);


> +	mhash = fib_multipath_custom_hash_outer(net, skb, &has_inner);
> +	mhash_inner = fib_multipath_custom_hash_inner(net, skb, has_inner);
> +
> +	return jhash_2words(mhash, mhash_inner, 0);
> +}
> +
> +static u32 fib_multipath_custom_hash_fl4(const struct net *net,
> +					 const struct flowi4 *fl4)
> +{
> +	u32 hash_fields = net->ipv4.sysctl_fib_multipath_hash_fields;
> +	struct flow_keys hash_keys;
> +
> +	if (!(hash_fields & FIB_MULTIPATH_HASH_FIELD_OUTER_MASK))
> +		return 0;
> +
> +	memset(&hash_keys, 0, sizeof(hash_keys));
> +	hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
> +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_SRC_IP)
> +		hash_keys.addrs.v4addrs.src = fl4->saddr;
> +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_DST_IP)
> +		hash_keys.addrs.v4addrs.dst = fl4->daddr;
> +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_IP_PROTO)
> +		hash_keys.basic.ip_proto = fl4->flowi4_proto;
> +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_SRC_PORT)
> +		hash_keys.ports.src = fl4->fl4_sport;
> +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_DST_PORT)
> +		hash_keys.ports.dst = fl4->fl4_dport;
> +
> +	return flow_hash_from_keys(&hash_keys);
> +}
> +
>  /* if skb is set it will be used and fl4 can be NULL */
>  int fib_multipath_hash(const struct net *net, const struct flowi4 *fl4,
>  		       const struct sk_buff *skb, struct flow_keys *flkeys)

* Re: [RFC PATCH net-next v2 02/10] ipv4: Add a sysctl to control multipath hash fields
  2021-05-09 15:16 ` [RFC PATCH net-next v2 02/10] ipv4: Add a sysctl to control multipath hash fields Ido Schimmel
  2021-05-11 15:10   ` David Ahern
@ 2021-05-11 15:49   ` David Ahern
  2021-05-11 20:05     ` Ido Schimmel
  1 sibling, 1 reply; 18+ messages in thread
From: David Ahern @ 2021-05-11 15:49 UTC (permalink / raw)
  To: Ido Schimmel, netdev
  Cc: davem, kuba, petrm, roopa, nikolay, ssuryaextr, mlxsw, Ido Schimmel

On 5/9/21 9:16 AM, Ido Schimmel wrote:
> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
> index a62934b9f15a..da627c4d633a 100644
> --- a/net/ipv4/sysctl_net_ipv4.c
> +++ b/net/ipv4/sysctl_net_ipv4.c
> @@ -19,6 +19,7 @@
>  #include <net/snmp.h>
>  #include <net/icmp.h>
>  #include <net/ip.h>
> +#include <net/ip_fib.h>
>  #include <net/route.h>
>  #include <net/tcp.h>
>  #include <net/udp.h>
> @@ -48,6 +49,8 @@ static int ip_ping_group_range_min[] = { 0, 0 };
>  static int ip_ping_group_range_max[] = { GID_T_MAX, GID_T_MAX };
>  static u32 u32_max_div_HZ = UINT_MAX / HZ;
>  static int one_day_secs = 24 * 3600;
> +static u32 fib_multipath_hash_fields_all_mask __maybe_unused =
> +	FIB_MULTIPATH_HASH_FIELD_ALL_MASK;
>  
>  /* obsolete */
>  static int sysctl_tcp_low_latency __read_mostly;
> @@ -1052,6 +1055,14 @@ static struct ctl_table ipv4_net_table[] = {
>  		.extra1		= SYSCTL_ZERO,
>  		.extra2		= &two,
>  	},
> +	{
> +		.procname	= "fib_multipath_hash_fields",
> +		.data		= &init_net.ipv4.sysctl_fib_multipath_hash_fields,
> +		.maxlen		= sizeof(u32),
> +		.mode		= 0644,
> +		.proc_handler	= proc_douintvec_minmax,
> +		.extra2		= &fib_multipath_hash_fields_all_mask,

No .extra1 means 0 is allowed, which effectively disables hashing and
multipath selection; only the first leg will be used. Is that intended?



> +	},
>  #endif
>  	{
>  		.procname	= "ip_unprivileged_port_start",
> 


* Re: [RFC PATCH net-next v2 02/10] ipv4: Add a sysctl to control multipath hash fields
  2021-05-11 15:10   ` David Ahern
@ 2021-05-11 19:58     ` Ido Schimmel
  0 siblings, 0 replies; 18+ messages in thread
From: Ido Schimmel @ 2021-05-11 19:58 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, davem, kuba, petrm, roopa, nikolay, ssuryaextr, mlxsw,
	Ido Schimmel

On Tue, May 11, 2021 at 09:10:46AM -0600, David Ahern wrote:
> On 5/9/21 9:16 AM, Ido Schimmel wrote:
> > diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
> > index c2ecc9894fd0..15982f830abc 100644
> > --- a/Documentation/networking/ip-sysctl.rst
> > +++ b/Documentation/networking/ip-sysctl.rst
> > @@ -100,6 +100,33 @@ fib_multipath_hash_policy - INTEGER
> >  	- 1 - Layer 4
> >  	- 2 - Layer 3 or inner Layer 3 if present
> >  
> > +fib_multipath_hash_fields - UNSIGNED INTEGER
> > +	When fib_multipath_hash_policy is set to 3 (custom multipath hash), the
> > +	fields used for multipath hash calculation are determined by this
> > +	sysctl.
> > +
> > +	This value is a bitmask which enables various fields for multipath hash
> > +	calculation.
> > +
> > +	Possible fields are:
> > +
> > +	====== ============================
> > +	0x0001 Source IP address
> > +	0x0002 Destination IP address
> > +	0x0004 IP protocol
> > +	0x0008 Unused
> 
> Document that this bit is the flow label for IPv6 and is ignored for IPv4.

OK

* Re: [RFC PATCH net-next v2 03/10] ipv4: Add custom multipath hash policy
  2021-05-11 15:46   ` David Ahern
@ 2021-05-11 20:02     ` Ido Schimmel
  0 siblings, 0 replies; 18+ messages in thread
From: Ido Schimmel @ 2021-05-11 20:02 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, davem, kuba, petrm, roopa, nikolay, ssuryaextr, mlxsw,
	Ido Schimmel

On Tue, May 11, 2021 at 09:46:27AM -0600, David Ahern wrote:
> On 5/9/21 9:16 AM, Ido Schimmel wrote:
> > diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> > index 9d61e969446e..a4c477475f4c 100644
> > --- a/net/ipv4/route.c
> > +++ b/net/ipv4/route.c
> > @@ -1906,6 +1906,121 @@ static void ip_multipath_l3_keys(const struct sk_buff *skb,
> >  	hash_keys->addrs.v4addrs.dst = key_iph->daddr;
> >  }
> >  
> > +static u32 fib_multipath_custom_hash_outer(const struct net *net,
> > +					   const struct sk_buff *skb,
> > +					   bool *p_has_inner)
> > +{
> > +	u32 hash_fields = net->ipv4.sysctl_fib_multipath_hash_fields;
> > +	struct flow_keys keys, hash_keys;
> > +
> > +	if (!(hash_fields & FIB_MULTIPATH_HASH_FIELD_OUTER_MASK))
> > +		return 0;
> > +
> > +	memset(&hash_keys, 0, sizeof(hash_keys));
> > +	skb_flow_dissect_flow_keys(skb, &keys, FLOW_DISSECTOR_F_STOP_AT_ENCAP);
> > +
> > +	hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
> > +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_SRC_IP)
> > +		hash_keys.addrs.v4addrs.src = keys.addrs.v4addrs.src;
> > +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_DST_IP)
> > +		hash_keys.addrs.v4addrs.dst = keys.addrs.v4addrs.dst;
> > +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_IP_PROTO)
> > +		hash_keys.basic.ip_proto = keys.basic.ip_proto;
> > +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_SRC_PORT)
> > +		hash_keys.ports.src = keys.ports.src;
> > +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_DST_PORT)
> > +		hash_keys.ports.dst = keys.ports.dst;
> > +
> > +	*p_has_inner = !!(keys.control.flags & FLOW_DIS_ENCAPSULATION);
> > +	return flow_hash_from_keys(&hash_keys);
> > +}
> > +
> > +static u32 fib_multipath_custom_hash_inner(const struct net *net,
> > +					   const struct sk_buff *skb,
> > +					   bool has_inner)
> > +{
> > +	u32 hash_fields = net->ipv4.sysctl_fib_multipath_hash_fields;
> > +	struct flow_keys keys, hash_keys;
> > +
> > +	/* We assume the packet carries an encapsulation, but if none was
> > +	 * encountered during dissection of the outer flow, then there is no
> > +	 * point in calling the flow dissector again.
> > +	 */
> > +	if (!has_inner)
> > +		return 0;
> > +
> > +	if (!(hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_MASK))
> > +		return 0;
> > +
> > +	memset(&hash_keys, 0, sizeof(hash_keys));
> > +	skb_flow_dissect_flow_keys(skb, &keys, 0);
> > +
> > +	if (!(keys.control.flags & FLOW_DIS_ENCAPSULATION))
> > +		return 0;
> > +
> > +	if (keys.control.addr_type == FLOW_DISSECTOR_KEY_IPV4_ADDRS) {
> > +		hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
> > +		if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_SRC_IP)
> > +			hash_keys.addrs.v4addrs.src = keys.addrs.v4addrs.src;
> > +		if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_DST_IP)
> > +			hash_keys.addrs.v4addrs.dst = keys.addrs.v4addrs.dst;
> > +	} else if (keys.control.addr_type == FLOW_DISSECTOR_KEY_IPV6_ADDRS) {
> > +		hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS;
> > +		if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_SRC_IP)
> > +			hash_keys.addrs.v6addrs.src = keys.addrs.v6addrs.src;
> > +		if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_DST_IP)
> > +			hash_keys.addrs.v6addrs.dst = keys.addrs.v6addrs.dst;
> > +		if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_FLOWLABEL)
> > +			hash_keys.tags.flow_label = keys.tags.flow_label;
> > +	}
> > +
> > +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_IP_PROTO)
> > +		hash_keys.basic.ip_proto = keys.basic.ip_proto;
> > +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_SRC_PORT)
> > +		hash_keys.ports.src = keys.ports.src;
> > +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_INNER_DST_PORT)
> > +		hash_keys.ports.dst = keys.ports.dst;
> > +
> > +	return flow_hash_from_keys(&hash_keys);
> > +}
> > +
> > +static u32 fib_multipath_custom_hash_skb(const struct net *net,
> > +					 const struct sk_buff *skb)
> > +{
> > +	u32 mhash, mhash_inner;
> > +	bool has_inner = true;
> > +
> 
> Is it not possible to do the dissect once here and pass keys to outer
> and inner functions?
> 
> 	memset(&hash_keys, 0, sizeof(hash_keys));
> 	skb_flow_dissect_flow_keys(skb, &keys, flag);

Not that I'm aware of. For outer flow we need to pass
'FLOW_DISSECTOR_F_STOP_AT_ENCAP'. For inner flow, we shouldn't pass any
flags, but make sure encapsulation was encountered by checking
'keys.control.flags & FLOW_DIS_ENCAPSULATION'.

Also, 'struct flow_keys' has keys for a single flow.

> 
> 
> > +	mhash = fib_multipath_custom_hash_outer(net, skb, &has_inner);
> > +	mhash_inner = fib_multipath_custom_hash_inner(net, skb, has_inner);
> > +
> > +	return jhash_2words(mhash, mhash_inner, 0);
> > +}
> > +
> > +static u32 fib_multipath_custom_hash_fl4(const struct net *net,
> > +					 const struct flowi4 *fl4)
> > +{
> > +	u32 hash_fields = net->ipv4.sysctl_fib_multipath_hash_fields;
> > +	struct flow_keys hash_keys;
> > +
> > +	if (!(hash_fields & FIB_MULTIPATH_HASH_FIELD_OUTER_MASK))
> > +		return 0;
> > +
> > +	memset(&hash_keys, 0, sizeof(hash_keys));
> > +	hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
> > +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_SRC_IP)
> > +		hash_keys.addrs.v4addrs.src = fl4->saddr;
> > +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_DST_IP)
> > +		hash_keys.addrs.v4addrs.dst = fl4->daddr;
> > +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_IP_PROTO)
> > +		hash_keys.basic.ip_proto = fl4->flowi4_proto;
> > +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_SRC_PORT)
> > +		hash_keys.ports.src = fl4->fl4_sport;
> > +	if (hash_fields & FIB_MULTIPATH_HASH_FIELD_DST_PORT)
> > +		hash_keys.ports.dst = fl4->fl4_dport;
> > +
> > +	return flow_hash_from_keys(&hash_keys);
> > +}
> > +
> >  /* if skb is set it will be used and fl4 can be NULL */
> >  int fib_multipath_hash(const struct net *net, const struct flowi4 *fl4,
> >  		       const struct sk_buff *skb, struct flow_keys *flkeys)

* Re: [RFC PATCH net-next v2 02/10] ipv4: Add a sysctl to control multipath hash fields
  2021-05-11 15:49   ` David Ahern
@ 2021-05-11 20:05     ` Ido Schimmel
  2021-05-17 14:47       ` David Ahern
  0 siblings, 1 reply; 18+ messages in thread
From: Ido Schimmel @ 2021-05-11 20:05 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, davem, kuba, petrm, roopa, nikolay, ssuryaextr, mlxsw,
	Ido Schimmel

On Tue, May 11, 2021 at 09:49:30AM -0600, David Ahern wrote:
> On 5/9/21 9:16 AM, Ido Schimmel wrote:
> > diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
> > index a62934b9f15a..da627c4d633a 100644
> > --- a/net/ipv4/sysctl_net_ipv4.c
> > +++ b/net/ipv4/sysctl_net_ipv4.c
> > @@ -19,6 +19,7 @@
> >  #include <net/snmp.h>
> >  #include <net/icmp.h>
> >  #include <net/ip.h>
> > +#include <net/ip_fib.h>
> >  #include <net/route.h>
> >  #include <net/tcp.h>
> >  #include <net/udp.h>
> > @@ -48,6 +49,8 @@ static int ip_ping_group_range_min[] = { 0, 0 };
> >  static int ip_ping_group_range_max[] = { GID_T_MAX, GID_T_MAX };
> >  static u32 u32_max_div_HZ = UINT_MAX / HZ;
> >  static int one_day_secs = 24 * 3600;
> > +static u32 fib_multipath_hash_fields_all_mask __maybe_unused =
> > +	FIB_MULTIPATH_HASH_FIELD_ALL_MASK;
> >  
> >  /* obsolete */
> >  static int sysctl_tcp_low_latency __read_mostly;
> > @@ -1052,6 +1055,14 @@ static struct ctl_table ipv4_net_table[] = {
> >  		.extra1		= SYSCTL_ZERO,
> >  		.extra2		= &two,
> >  	},
> > +	{
> > +		.procname	= "fib_multipath_hash_fields",
> > +		.data		= &init_net.ipv4.sysctl_fib_multipath_hash_fields,
> > +		.maxlen		= sizeof(u32),
> > +		.mode		= 0644,
> > +		.proc_handler	= proc_douintvec_minmax,
> > +		.extra2		= &fib_multipath_hash_fields_all_mask,
> 
> No .extra1 means 0 is allowed, which effectively disables hashing and
> multipath selection; only the first leg will be used. Is that intended?

I didn't see any reason to forbid it, but I don't see any reason to use
it with '0' either. With this patch:

diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 81343037de06..4fa77f182dcb 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -1078,6 +1078,7 @@ static struct ctl_table ipv4_net_table[] = {
 		.maxlen		= sizeof(u32),
 		.mode		= 0644,
 		.proc_handler	= proc_fib_multipath_hash_fields,
+		.extra1		= SYSCTL_ONE,
 		.extra2		= &fib_multipath_hash_fields_all_mask,
 	},
 #endif

We get:

# sysctl -w net.ipv4.fib_multipath_hash_fields=0
sysctl: setting key "net.ipv4.fib_multipath_hash_fields": Invalid argument

I assume you want to see this change in the next version (and for IPv6)?
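
For readers less familiar with the min/max sysctl handlers, the effect of adding a lower bound can be sketched in user space as below. This is a model only, not the kernel handler: set_hash_fields(), the 'hash_fields' variable, its 0x0007 starting value and the 0x0fff upper bound are assumptions made for the illustration. The only behavior taken from the output above is that writing 0 fails with "Invalid argument" (EINVAL).

```c
#include <assert.h>	/* assert() is only needed by callers checking results */
#include <errno.h>
#include <stdint.h>

/* Assumed bounds for this sketch: the lower bound of 1 models
 * '.extra1 = SYSCTL_ONE', the upper bound models an "all fields" mask.
 */
#define HASH_FIELDS_MIN		1u
#define HASH_FIELDS_ALL_MASK	0x0fffu

/* Hypothetical current setting; the starting value is an assumption. */
static uint32_t hash_fields = 0x0007;

/* Accept 'val' only if it lies within [min, max]; otherwise leave the
 * current setting untouched and return -EINVAL.
 */
static int set_hash_fields(uint32_t val)
{
	if (val < HASH_FIELDS_MIN || val > HASH_FIELDS_ALL_MASK)
		return -EINVAL;

	hash_fields = val;
	return 0;
}
```

Calling set_hash_fields(0) returns -EINVAL and keeps the previous mask, matching the rejected write shown above, while any non-zero value up to the assumed upper bound is stored.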

> 
> 
> 
> > +	},
> >  #endif
> >  	{
> >  		.procname	= "ip_unprivileged_port_start",
> > 
> 

* Re: [RFC PATCH net-next v2 02/10] ipv4: Add a sysctl to control multipath hash fields
  2021-05-11 20:05     ` Ido Schimmel
@ 2021-05-17 14:47       ` David Ahern
  0 siblings, 0 replies; 18+ messages in thread
From: David Ahern @ 2021-05-17 14:47 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, davem, kuba, petrm, roopa, nikolay, ssuryaextr, mlxsw,
	Ido Schimmel

On 5/11/21 2:05 PM, Ido Schimmel wrote:
> 
> # sysctl -w net.ipv4.fib_multipath_hash_fields=0
> sysctl: setting key "net.ipv4.fib_multipath_hash_fields": Invalid argument
> 
> I assume you want to see this change in the next version (and for IPv6)?
> 

Yes, I do not know of any reason to allow the hash policy to disable
multipath routes.

Thread overview: 18+ messages
2021-05-09 15:16 [RFC PATCH net-next v2 00/10] Add support for custom multipath hash Ido Schimmel
2021-05-09 15:16 ` [RFC PATCH net-next v2 01/10] ipv4: Calculate multipath hash inside switch statement Ido Schimmel
2021-05-09 15:16 ` [RFC PATCH net-next v2 02/10] ipv4: Add a sysctl to control multipath hash fields Ido Schimmel
2021-05-11 15:10   ` David Ahern
2021-05-11 19:58     ` Ido Schimmel
2021-05-11 15:49   ` David Ahern
2021-05-11 20:05     ` Ido Schimmel
2021-05-17 14:47       ` David Ahern
2021-05-09 15:16 ` [RFC PATCH net-next v2 03/10] ipv4: Add custom multipath hash policy Ido Schimmel
2021-05-11 15:46   ` David Ahern
2021-05-11 20:02     ` Ido Schimmel
2021-05-09 15:16 ` [RFC PATCH net-next v2 04/10] ipv6: Use a more suitable label name Ido Schimmel
2021-05-09 15:16 ` [RFC PATCH net-next v2 05/10] ipv6: Calculate multipath hash inside switch statement Ido Schimmel
2021-05-09 15:16 ` [RFC PATCH net-next v2 06/10] ipv6: Add a sysctl to control multipath hash fields Ido Schimmel
2021-05-09 15:16 ` [RFC PATCH net-next v2 07/10] ipv6: Add custom multipath hash policy Ido Schimmel
2021-05-09 15:16 ` [RFC PATCH net-next v2 08/10] selftests: forwarding: Add test for custom multipath hash Ido Schimmel
2021-05-09 15:16 ` [RFC PATCH net-next v2 09/10] selftests: forwarding: Add test for custom multipath hash with IPv4 GRE Ido Schimmel
2021-05-09 15:16 ` [RFC PATCH net-next v2 10/10] selftests: forwarding: Add test for custom multipath hash with IPv6 GRE Ido Schimmel
