* [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
@ 2016-01-07  9:33 ` Vitaly Kuznetsov
  0 siblings, 0 replies; 43+ messages in thread
From: Vitaly Kuznetsov @ 2016-01-07  9:33 UTC (permalink / raw)
  To: netdev
  Cc: K. Y. Srinivasan, Haiyang Zhang, devel, linux-kernel,
	Eric Dumazet, David Miller

Recent changes to 'struct flow_keys' (e.g. commit d34af823ff40 ("net: Add
VLAN ID to flow_keys")) introduced a performance regression in the netvsc
driver. The problem, however, is not the above-mentioned commit itself but
the fact that the netvsc_set_hash() function made assumptions about the
struct flow_keys data layout, and this is wrong. We need to extract the
data we need (src/dst addresses and ports) after the dissect.

The issue could also be solved in a completely different way: as suggested
by Eric, instead of our own homegrown netvsc_set_hash() we could use
skb_get_hash(), which does more or less the same. Unfortunately, the
testing done by Simon showed that Hyper-V hosts are not happy with our
Jenkins hash; selecting the output queue with the current algorithm based
on the Toeplitz hash works significantly better.

Tested-by: Simon Xiao <sixiao@microsoft.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
This patch is an alternative to Haiyang's "hv_netvsc: Use simple parser
for IPv4 and v6 headers".
---
 drivers/net/hyperv/netvsc_drv.c | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 409b48e..4dea44e 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -238,18 +238,26 @@ static bool netvsc_set_hash(u32 *hash, struct sk_buff *skb)
 {
 	struct flow_keys flow;
 	int data_len;
+	u8 data[sizeof(flow.addrs) + sizeof(flow.ports)];
 
-	if (!skb_flow_dissect_flow_keys(skb, &flow, 0) ||
-	    !(flow.basic.n_proto == htons(ETH_P_IP) ||
-	      flow.basic.n_proto == htons(ETH_P_IPV6)))
+	if (!skb_flow_dissect_flow_keys(skb, &flow, 0))
 		return false;
 
-	if (flow.basic.ip_proto == IPPROTO_TCP)
-		data_len = 12;
+	if (flow.basic.n_proto == htons(ETH_P_IP))
+		data_len = sizeof(flow.addrs.v4addrs);
+	else if (flow.basic.n_proto == htons(ETH_P_IPV6))
+		data_len = sizeof(flow.addrs.v6addrs);
 	else
-		data_len = 8;
+		return false;
+
+	memcpy(data, &flow.addrs, data_len);
+
+	if (flow.basic.ip_proto == IPPROTO_TCP) {
+		memcpy(data + data_len, &flow.ports, sizeof(flow.ports));
+		data_len += sizeof(flow.ports);
+	}
 
-	*hash = comp_hash(netvsc_hash_key, HASH_KEYLEN, &flow, data_len);
+	*hash = comp_hash(netvsc_hash_key, HASH_KEYLEN, data, data_len);
 
 	return true;
 }
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 43+ messages in thread
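To make the layout assumption concrete: the old code handed &flow straight
to the hash function and trusted that the first 12 (or 8) bytes of struct
flow_keys were the src/dst addresses and ports. Once metadata fields were
added in front of (and between) those members, that stopped being true. A
minimal userspace sketch of the failure mode, and of the fix's "copy the
named fields" pattern, follows; the struct and its field names are
placeholders, not the real struct flow_keys definition.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Placeholder layout only -- NOT the real struct flow_keys. */
struct toy_flow_keys {
	uint16_t thoff;		/* metadata added by later refactoring */
	uint8_t  ip_proto;
	uint16_t n_proto;
	uint32_t src, dst;	/* what the driver actually wants to hash */
	uint16_t sport, dport;
};

int main(void)
{
	struct toy_flow_keys keys = {
		.src = 0x0a000001, .dst = 0x0a000002,
		.sport = 1024, .dport = 80,
	};
	uint8_t fragile[12], robust[12];

	/* old pattern: hope the addresses/ports sit at the very start */
	memcpy(fragile, &keys, sizeof(fragile));

	/* new pattern: copy the named fields, whatever the layout is */
	memcpy(robust + 0,  &keys.src,   4);
	memcpy(robust + 4,  &keys.dst,   4);
	memcpy(robust + 8,  &keys.sport, 2);
	memcpy(robust + 10, &keys.dport, 2);

	printf("buffers %s\n",
	       memcmp(fragile, robust, sizeof(robust)) ? "differ" : "match");
	return 0;
}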


* Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-07  9:33 ` Vitaly Kuznetsov
@ 2016-01-07 12:52 ` Eric Dumazet
  2016-01-07 13:28     ` Vitaly Kuznetsov
  2016-01-09  0:17     ` Tom Herbert
  -1 siblings, 2 replies; 43+ messages in thread
From: Eric Dumazet @ 2016-01-07 12:52 UTC (permalink / raw)
  To: Vitaly Kuznetsov, Tom Herbert
  Cc: netdev, K. Y. Srinivasan, Haiyang Zhang, devel, linux-kernel,
	David Miller

On Thu, 2016-01-07 at 10:33 +0100, Vitaly Kuznetsov wrote:
> Recent changes to 'struct flow_keys' (e.g commit d34af823ff40 ("net: Add
>  VLAN ID to flow_keys")) introduced a performance regression in netvsc
> driver. Is problem is, however, not the above mentioned commit but the
> fact that netvsc_set_hash() function did some assumptions on the struct
> flow_keys data layout and this is wrong. We need to extract the data we
> need (src/dst addresses and ports) after the dissect.
> 
> The issue could also be solved in a completely different way: as suggested
> by Eric instead of our own homegrown netvsc_set_hash() we could use
> skb_get_hash() which does more or less the same. Unfortunately, the
> testing done by Simon showed that Hyper-V hosts are not happy with our
> Jenkins hash, selecting the output queue with the current algorithm based
> on Toeplitz hash works significantly better.

Were tests done on IPv6 traffic ?

Toeplitz hash takes at least 100 ns to hash 12 bytes (one iteration per
bit : 96 iterations)

For IPv6 it is 3 times this, since we have to hash 36 bytes.

I do not see how it can compete with skb_get_hash() that directly gives
skb->hash for local TCP flows.

See commits b73c3d0e4f0e1961e15bec18720e48aabebe2109
("net: Save TX flow hash in sock and set in skbuf on xmit")
and 877d1f6291f8e391237e324be58479a3e3a7407c
("net: Set sk_txhash from a random number")

I understand Microsoft loves Toeplitz, but this looks not well placed
here.

I suspect there is another problem.

Please share your numbers and test methodology, and the alternative
patch Simon tested so that we can double check it.

Thanks.

PS: For the time being this patch can probably be applied on -net tree,
as it fixes a real bug.




^ permalink raw reply	[flat|nested] 43+ messages in thread
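For reference, the iteration counts above follow directly from the per-bit
structure of the algorithm: 12 bytes of input is 96 bit iterations, 36
bytes is 288. A self-contained userspace sketch of a generic Toeplitz hash
is below; the key and input bytes are illustrative only, and this is a
reference implementation, not the driver's comp_hash():

#include <stdint.h>
#include <stdio.h>

/* Generic Toeplitz: for every set bit of the input, XOR in the 32-bit
 * window of the key that starts at that bit position. */
static uint32_t toeplitz_hash(const uint8_t *key, int klen,
			      const uint8_t *data, int dlen)
{
	uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
			  ((uint32_t)key[2] << 8)  |  (uint32_t)key[3];
	uint32_t hash = 0;
	int next = 4;				/* next key byte to shift in */

	for (int i = 0; i < dlen; i++) {		/* one pass per input byte */
		for (int bit = 7; bit >= 0; bit--) {	/* 8 iterations per byte */
			if (data[i] & (1u << bit))
				hash ^= window;
			/* slide the key window left by one bit */
			window = (window << 1) |
				 ((key[next % klen] >> bit) & 1);
		}
		next++;
	}
	return hash;
}

int main(void)
{
	/* illustrative 40-byte RSS-style key (mostly zero) and a 12-byte
	 * IPv4 src/dst/ports tuple; an IPv6 tuple would be 36 bytes */
	uint8_t key[40] = { 0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b };
	uint8_t tuple[12] = { 10, 0, 0, 1, 10, 0, 0, 2, 0x04, 0x00, 0x00, 0x50 };

	printf("hash = 0x%08x\n",
	       toeplitz_hash(key, sizeof(key), tuple, sizeof(tuple)));
	return 0;
}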

* Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-07 12:52 ` Eric Dumazet
@ 2016-01-07 13:28     ` Vitaly Kuznetsov
  2016-01-09  0:17     ` Tom Herbert
  1 sibling, 0 replies; 43+ messages in thread
From: Vitaly Kuznetsov @ 2016-01-07 13:28 UTC (permalink / raw)
  To: Simon Xiao, Eric Dumazet
  Cc: Tom Herbert, netdev, K. Y. Srinivasan, Haiyang Zhang, devel,
	linux-kernel, David Miller

[-- Attachment #1: Type: text/plain, Size: 2330 bytes --]

Eric Dumazet <eric.dumazet@gmail.com> writes:

> On Thu, 2016-01-07 at 10:33 +0100, Vitaly Kuznetsov wrote:
>> Recent changes to 'struct flow_keys' (e.g commit d34af823ff40 ("net: Add
>>  VLAN ID to flow_keys")) introduced a performance regression in netvsc
>> driver. Is problem is, however, not the above mentioned commit but the
>> fact that netvsc_set_hash() function did some assumptions on the struct
>> flow_keys data layout and this is wrong. We need to extract the data we
>> need (src/dst addresses and ports) after the dissect.
>> 
>> The issue could also be solved in a completely different way: as suggested
>> by Eric instead of our own homegrown netvsc_set_hash() we could use
>> skb_get_hash() which does more or less the same. Unfortunately, the
>> testing done by Simon showed that Hyper-V hosts are not happy with our
>> Jenkins hash, selecting the output queue with the current algorithm based
>> on Toeplitz hash works significantly better.
>
> Were tests done on IPv6 traffic ?
>

Simon, could you please test this patch for IPv6 and show us the numbers?

> Toeplitz hash takes at least 100 ns to hash 12 bytes (one iteration per
> bit : 96 iterations)
>
> For IPv6 it is 3 times this, since we have to hash 36 bytes.
>
> I do not see how it can compete with skb_get_hash() that directly gives
> skb->hash for local TCP flows.
>

My guess is that this is not the bottleneck; something is happening
behind the scenes with our packets in the Hyper-V host (e.g.
re-distributing them to hardware queues?) but I don't know the
internals; Microsoft folks could probably comment.


> See commits b73c3d0e4f0e1961e15bec18720e48aabebe2109
> ("net: Save TX flow hash in sock and set in skbuf on xmit")
> and 877d1f6291f8e391237e324be58479a3e3a7407c
> ("net: Set sk_txhash from a random number")
>
> I understand Microsoft loves Toeplitz, but this looks not well placed
> here.
>
> I suspect there is another problem.
>
> Please share your numbers and test methodology, and the alternative
> patch Simon tested so that we can double check it.
>

An alternative patch which uses skb_get_hash() is attached. Simon, could you
please share the rest (environment, methodology, numbers) with us here?
Thanks!

> Thanks.
>
> PS: For the time being this patch can probably be applied on -net tree,
> as it fixes a real bug.

-- 
  Vitaly


[-- Attachment #2: 0001-hv_netvsc-use-skb_get_hash-instead-of-a-homegrown-im.patch --]
[-- Type: text/x-patch, Size: 2420 bytes --]

From 0040e79c1303bd225ddbbce679ea944ea11ad0bd Mon Sep 17 00:00:00 2001
From: Vitaly Kuznetsov <vkuznets@redhat.com>
Date: Wed, 6 Jan 2016 12:14:10 +0100
Subject: [PATCH] hv_netvsc: use skb_get_hash() instead of a homegrown
 implementation

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 drivers/net/hyperv/netvsc_drv.c | 67 ++---------------------------------------
 1 file changed, 3 insertions(+), 64 deletions(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 409b48e..038bf4f 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -195,65 +195,6 @@ static void *init_ppi_data(struct rndis_message *msg, u32 ppi_size,
 	return ppi;
 }
 
-union sub_key {
-	u64 k;
-	struct {
-		u8 pad[3];
-		u8 kb;
-		u32 ka;
-	};
-};
-
-/* Toeplitz hash function
- * data: network byte order
- * return: host byte order
- */
-static u32 comp_hash(u8 *key, int klen, void *data, int dlen)
-{
-	union sub_key subk;
-	int k_next = 4;
-	u8 dt;
-	int i, j;
-	u32 ret = 0;
-
-	subk.k = 0;
-	subk.ka = ntohl(*(u32 *)key);
-
-	for (i = 0; i < dlen; i++) {
-		subk.kb = key[k_next];
-		k_next = (k_next + 1) % klen;
-		dt = ((u8 *)data)[i];
-		for (j = 0; j < 8; j++) {
-			if (dt & 0x80)
-				ret ^= subk.ka;
-			dt <<= 1;
-			subk.k <<= 1;
-		}
-	}
-
-	return ret;
-}
-
-static bool netvsc_set_hash(u32 *hash, struct sk_buff *skb)
-{
-	struct flow_keys flow;
-	int data_len;
-
-	if (!skb_flow_dissect_flow_keys(skb, &flow, 0) ||
-	    !(flow.basic.n_proto == htons(ETH_P_IP) ||
-	      flow.basic.n_proto == htons(ETH_P_IPV6)))
-		return false;
-
-	if (flow.basic.ip_proto == IPPROTO_TCP)
-		data_len = 12;
-	else
-		data_len = 8;
-
-	*hash = comp_hash(netvsc_hash_key, HASH_KEYLEN, &flow, data_len);
-
-	return true;
-}
-
 static u16 netvsc_select_queue(struct net_device *ndev, struct sk_buff *skb,
 			void *accel_priv, select_queue_fallback_t fallback)
 {
@@ -266,11 +207,9 @@ static u16 netvsc_select_queue(struct net_device *ndev, struct sk_buff *skb,
 	if (nvsc_dev == NULL || ndev->real_num_tx_queues <= 1)
 		return 0;
 
-	if (netvsc_set_hash(&hash, skb)) {
-		q_idx = nvsc_dev->send_table[hash % VRSS_SEND_TAB_SIZE] %
-			ndev->real_num_tx_queues;
-		skb_set_hash(skb, hash, PKT_HASH_TYPE_L3);
-	}
+	hash = skb_get_hash(skb);
+	q_idx = nvsc_dev->send_table[hash % VRSS_SEND_TAB_SIZE] %
+		ndev->real_num_tx_queues;
 
 	return q_idx;
 }
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 43+ messages in thread
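Both versions of the driver code end up in the same two-level lookup in
netvsc_select_queue(): flow hash -> slot in the host-populated send
indirection table -> actual TX queue. A small userspace sketch of that
arithmetic is below; the table contents are made up, and the 16-entry size
matches the "16 entry table" mentioned later in the thread:

#include <stdint.h>
#include <stdio.h>

#define VRSS_SEND_TAB_SIZE 16	/* host-provided indirection table size */

static uint16_t select_queue(uint32_t hash, const uint32_t *send_table,
			     uint32_t real_num_tx_queues)
{
	/* level 1: the hash picks a slot in the host-populated table */
	uint32_t slot = send_table[hash % VRSS_SEND_TAB_SIZE];

	/* level 2: fold the slot value onto the queues actually opened */
	return slot % real_num_tx_queues;
}

int main(void)
{
	uint32_t send_table[VRSS_SEND_TAB_SIZE] = {
		0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 3, 4, 5, 6, 7
	};

	printf("queue = %u\n", select_queue(0xdeadbeefu, send_table, 8));
	return 0;
}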


* Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-07 13:28     ` Vitaly Kuznetsov
@ 2016-01-08  1:02     ` John Fastabend
  2016-01-08  3:49         ` KY Srinivasan
  -1 siblings, 1 reply; 43+ messages in thread
From: John Fastabend @ 2016-01-08  1:02 UTC (permalink / raw)
  To: Vitaly Kuznetsov, Simon Xiao, Eric Dumazet
  Cc: Tom Herbert, netdev, K. Y. Srinivasan, Haiyang Zhang, devel,
	linux-kernel, David Miller

On 16-01-07 05:28 AM, Vitaly Kuznetsov wrote:
> Eric Dumazet <eric.dumazet@gmail.com> writes:
> 
>> On Thu, 2016-01-07 at 10:33 +0100, Vitaly Kuznetsov wrote:
>>> Recent changes to 'struct flow_keys' (e.g commit d34af823ff40 ("net: Add
>>>  VLAN ID to flow_keys")) introduced a performance regression in netvsc
>>> driver. Is problem is, however, not the above mentioned commit but the
>>> fact that netvsc_set_hash() function did some assumptions on the struct
>>> flow_keys data layout and this is wrong. We need to extract the data we
>>> need (src/dst addresses and ports) after the dissect.
>>>
>>> The issue could also be solved in a completely different way: as suggested
>>> by Eric instead of our own homegrown netvsc_set_hash() we could use
>>> skb_get_hash() which does more or less the same. Unfortunately, the
>>> testing done by Simon showed that Hyper-V hosts are not happy with our
>>> Jenkins hash, selecting the output queue with the current algorithm based
>>> on Toeplitz hash works significantly better.
>>

Also, can I ask a maybe naive question? It looks like the hypervisor
is populating some table via a mailbox msg, and this is used to select
the queues, I guess with some sort of weighting function?

What happens if you just remove select_queue altogether? Or maybe just:
what is this 16-entry table doing? How does this work on my larger
systems with 64+ cores? Can I only use 16 cores? Sorry, I really have
no experience with Hyper-V and this got me curious.

Thanks,
John

>> Were tests done on IPv6 traffic ?
>>
> 
> Simon, could you please test this patch for IPv6 and show us the numbers?
> 
>> Toeplitz hash takes at least 100 ns to hash 12 bytes (one iteration per
>> bit : 96 iterations)
>>
>> For IPv6 it is 3 times this, since we have to hash 36 bytes.
>>
>> I do not see how it can compete with skb_get_hash() that directly gives
>> skb->hash for local TCP flows.
>>
> 
> My guess is that this is not the bottleneck, something is happening
> behind the scene with out packets in Hyper-V host (e.g. re-distributing
> them to hardware queues?) but I don't know the internals, Microsoft
> folks could probably comment.
> 
> 
>> See commits b73c3d0e4f0e1961e15bec18720e48aabebe2109
>> ("net: Save TX flow hash in sock and set in skbuf on xmit")
>> and 877d1f6291f8e391237e324be58479a3e3a7407c
>> ("net: Set sk_txhash from a random number")
>>
>> I understand Microsoft loves Toeplitz, but this looks not well placed
>> here.
>>
>> I suspect there is another problem.
>>
>> Please share your numbers and test methodology, and the alternative
>> patch Simon tested so that we can double check it.
>>
> 
> Alternative patch which uses skb_get_hash() attached. Simon, could you
> please share the rest (environment, metodology, numbers) with us here?
> Thanks!
> 
>> Thanks.
>>
>> PS: For the time being this patch can probably be applied on -net tree,
>> as it fixes a real bug.
> 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-08  1:02     ` John Fastabend
@ 2016-01-08  3:49         ` KY Srinivasan
  0 siblings, 0 replies; 43+ messages in thread
From: KY Srinivasan @ 2016-01-08  3:49 UTC (permalink / raw)
  To: John Fastabend, Vitaly Kuznetsov, Simon Xiao, Eric Dumazet
  Cc: Tom Herbert, netdev, Haiyang Zhang, devel, linux-kernel, David Miller



> -----Original Message-----
> From: John Fastabend [mailto:john.fastabend@gmail.com]
> Sent: Thursday, January 7, 2016 5:02 PM
> To: Vitaly Kuznetsov <vkuznets@redhat.com>; Simon Xiao
> <sixiao@microsoft.com>; Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Tom Herbert <tom@herbertland.com>; netdev@vger.kernel.org; KY
> Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>;
> devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; David Miller
> <davem@davemloft.net>
> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct
> flow_keys layout
> 
> On 16-01-07 05:28 AM, Vitaly Kuznetsov wrote:
> > Eric Dumazet <eric.dumazet@gmail.com> writes:
> >
> >> On Thu, 2016-01-07 at 10:33 +0100, Vitaly Kuznetsov wrote:
> >>> Recent changes to 'struct flow_keys' (e.g commit d34af823ff40 ("net: Add
> >>>  VLAN ID to flow_keys")) introduced a performance regression in netvsc
> >>> driver. Is problem is, however, not the above mentioned commit but the
> >>> fact that netvsc_set_hash() function did some assumptions on the struct
> >>> flow_keys data layout and this is wrong. We need to extract the data we
> >>> need (src/dst addresses and ports) after the dissect.
> >>>
> >>> The issue could also be solved in a completely different way: as suggested
> >>> by Eric instead of our own homegrown netvsc_set_hash() we could use
> >>> skb_get_hash() which does more or less the same. Unfortunately, the
> >>> testing done by Simon showed that Hyper-V hosts are not happy with our
> >>> Jenkins hash, selecting the output queue with the current algorithm based
> >>> on Toeplitz hash works significantly better.
> >>
> 
> Also can I ask the maybe naive question. It looks like the hypervisor
> is populating some table via a mailbox msg and this is used to select
> the queues I guess with some sort of weighting function?
> 
> What happens if you just remove select_queue altogether? Or maybe just
> what is this 16 entry table doing? How does this work on my larger
> systems with 64+ cores can I only use 16 cores? Sorry I really have
> no experience with hyperV and this got me curious.

We will limit the number of VRSS channels to the number of CPUs in
a NUMA node. If the number of CPUs in a NUMA node exceeds 8, we
will only open up 8 VRSS channels. On the host side currently traffic
spreading is done in software and we have found that limiting to 8 CPUs
gives us the best throughput. In Windows Server 2016, we will be 
distributing traffic on the host in hardware; the heuristics in the guest
may change.

Regards,

K. Y
> 
> Thanks,
> John
> 
> >> Were tests done on IPv6 traffic ?
> >>
> >
> > Simon, could you please test this patch for IPv6 and show us the numbers?
> >
> >> Toeplitz hash takes at least 100 ns to hash 12 bytes (one iteration per
> >> bit : 96 iterations)
> >>
> >> For IPv6 it is 3 times this, since we have to hash 36 bytes.
> >>
> >> I do not see how it can compete with skb_get_hash() that directly gives
> >> skb->hash for local TCP flows.
> >>
> >
> > My guess is that this is not the bottleneck, something is happening
> > behind the scene with out packets in Hyper-V host (e.g. re-distributing
> > them to hardware queues?) but I don't know the internals, Microsoft
> > folks could probably comment.
> >
> >
> >> See commits b73c3d0e4f0e1961e15bec18720e48aabebe2109
> >> ("net: Save TX flow hash in sock and set in skbuf on xmit")
> >> and 877d1f6291f8e391237e324be58479a3e3a7407c
> >> ("net: Set sk_txhash from a random number")
> >>
> >> I understand Microsoft loves Toeplitz, but this looks not well placed
> >> here.
> >>
> >> I suspect there is another problem.
> >>
> >> Please share your numbers and test methodology, and the alternative
> >> patch Simon tested so that we can double check it.
> >>
> >
> > Alternative patch which uses skb_get_hash() attached. Simon, could you
> > please share the rest (environment, metodology, numbers) with us here?
> > Thanks!
> >
> >> Thanks.
> >>
> >> PS: For the time being this patch can probably be applied on -net tree,
> >> as it fixes a real bug.
> >

^ permalink raw reply	[flat|nested] 43+ messages in thread
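In code terms, the heuristic described above amounts to clamping the
channel count by the host's offer, the local NUMA node's CPU count, and a
cap of 8. The sketch below is a paraphrase of that description, not the
actual netvsc implementation; the names and structure are illustrative:

#include <linux/cpumask.h>
#include <linux/device.h>
#include <linux/kernel.h>
#include <linux/topology.h>

#define NETVSC_MAX_VRSS_CHANNELS	8	/* cap mentioned above */

/* Illustrative sketch only: pick how many VRSS (sub)channels to open. */
static unsigned int netvsc_num_channels(struct device *dev,
					unsigned int host_offered)
{
	/* NUMA node the device sits on (NUMA_NO_NODE handling elided) */
	int node = dev_to_node(dev);
	unsigned int node_cpus = cpumask_weight(cpumask_of_node(node));

	/* never more than the host offers, the CPUs in the local node,
	 * or the 8-channel cap */
	return min3(host_offered, node_cpus,
		    (unsigned int)NETVSC_MAX_VRSS_CHANNELS);
}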


* Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-08  3:49         ` KY Srinivasan
@ 2016-01-08  6:16           ` John Fastabend
  -1 siblings, 0 replies; 43+ messages in thread
From: John Fastabend @ 2016-01-08  6:16 UTC (permalink / raw)
  To: KY Srinivasan, Vitaly Kuznetsov, Simon Xiao, Eric Dumazet
  Cc: Tom Herbert, netdev, Haiyang Zhang, devel, linux-kernel, David Miller

On 16-01-07 07:49 PM, KY Srinivasan wrote:
> 
> 
>> -----Original Message-----
>> From: John Fastabend [mailto:john.fastabend@gmail.com]
>> Sent: Thursday, January 7, 2016 5:02 PM
>> To: Vitaly Kuznetsov <vkuznets@redhat.com>; Simon Xiao
>> <sixiao@microsoft.com>; Eric Dumazet <eric.dumazet@gmail.com>
>> Cc: Tom Herbert <tom@herbertland.com>; netdev@vger.kernel.org; KY
>> Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>;
>> devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; David Miller
>> <davem@davemloft.net>
>> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct
>> flow_keys layout
>>
>> On 16-01-07 05:28 AM, Vitaly Kuznetsov wrote:
>>> Eric Dumazet <eric.dumazet@gmail.com> writes:
>>>
>>>> On Thu, 2016-01-07 at 10:33 +0100, Vitaly Kuznetsov wrote:
>>>>> Recent changes to 'struct flow_keys' (e.g commit d34af823ff40 ("net: Add
>>>>>  VLAN ID to flow_keys")) introduced a performance regression in netvsc
>>>>> driver. Is problem is, however, not the above mentioned commit but the
>>>>> fact that netvsc_set_hash() function did some assumptions on the struct
>>>>> flow_keys data layout and this is wrong. We need to extract the data we
>>>>> need (src/dst addresses and ports) after the dissect.
>>>>>
>>>>> The issue could also be solved in a completely different way: as suggested
>>>>> by Eric instead of our own homegrown netvsc_set_hash() we could use
>>>>> skb_get_hash() which does more or less the same. Unfortunately, the
>>>>> testing done by Simon showed that Hyper-V hosts are not happy with our
>>>>> Jenkins hash, selecting the output queue with the current algorithm based
>>>>> on Toeplitz hash works significantly better.
>>>>
>>
>> Also can I ask the maybe naive question. It looks like the hypervisor
>> is populating some table via a mailbox msg and this is used to select
>> the queues I guess with some sort of weighting function?
>>
>> What happens if you just remove select_queue altogether? Or maybe just
>> what is this 16 entry table doing? How does this work on my larger
>> systems with 64+ cores can I only use 16 cores? Sorry I really have
>> no experience with hyperV and this got me curious.
> 
> We will limit the number of VRSS channels to the number of CPUs in
> a NUMA node. If the number of CPUs in a NUMA node exceeds 8, we
> will only open up 8 VRSS channels. On the host side currently traffic
> spreading is done in software and we have found that limiting to 8 CPUs
> gives us the best throughput. In Windows Server 2016, we will be 
> distributing traffic on the host in hardware; the heuristics in the guest
> may change.
> 
> Regards,
> 
> K. Y

I think a better way to do this would be to query the numa node when
the interface comes online via dev_to_node() and then use cpu_to_node()
or create/find some better variant to get a list of cpus on the numa
node.

At this point you can use the xps mapping interface
netif_set_xps_queue() to get the right queue to cpu binding. If you want
to cap it to max 8 queues that works as well. I don't think there is
any value to have more tx queues than number of cpus in use.

If you do it this way all the normal mechanisms to setup queue mappings
will work for users who are doing some special configuration and the
default will still be what you want.

I guess I should go do this numa mapping for ixgbe and friends now that
I mention it. Last perf numbers I had showed cross numa affinitizing
was pretty bad.

Thanks,
John

^ permalink raw reply	[flat|nested] 43+ messages in thread
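A rough sketch of the approach John outlines, wired up with the existing
XPS API; error handling and the exact device pointer to use are elided, so
treat this as an outline under the stated assumptions rather than a tested
netvsc change:

#include <linux/cpumask.h>
#include <linux/device.h>
#include <linux/netdevice.h>
#include <linux/topology.h>

/* Illustrative only: bind each TX queue to the CPUs of the device's
 * NUMA node via XPS, as suggested above. */
static void netvsc_set_numa_xps(struct net_device *ndev, struct device *dev)
{
	int node = dev_to_node(dev);		/* NUMA node of the NIC */
	const struct cpumask *mask = cpumask_of_node(node);
	unsigned int i;

	/* program XPS for every TX queue; users can still override the
	 * mapping through the normal sysfs xps_cpus files */
	for (i = 0; i < ndev->real_num_tx_queues; i++)
		netif_set_xps_queue(ndev, mask, i);
}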


* RE: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-08  6:16           ` John Fastabend
@ 2016-01-08 18:01           ` KY Srinivasan
  -1 siblings, 0 replies; 43+ messages in thread
From: KY Srinivasan @ 2016-01-08 18:01 UTC (permalink / raw)
  To: John Fastabend, Vitaly Kuznetsov, Simon Xiao, Eric Dumazet
  Cc: Tom Herbert, netdev, Haiyang Zhang, devel, linux-kernel, David Miller



> -----Original Message-----
> From: John Fastabend [mailto:john.fastabend@gmail.com]
> Sent: Thursday, January 7, 2016 10:17 PM
> To: KY Srinivasan <kys@microsoft.com>; Vitaly Kuznetsov
> <vkuznets@redhat.com>; Simon Xiao <sixiao@microsoft.com>; Eric Dumazet
> <eric.dumazet@gmail.com>
> Cc: Tom Herbert <tom@herbertland.com>; netdev@vger.kernel.org;
> Haiyang Zhang <haiyangz@microsoft.com>; devel@linuxdriverproject.org;
> linux-kernel@vger.kernel.org; David Miller <davem@davemloft.net>
> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct
> flow_keys layout
> 
> On 16-01-07 07:49 PM, KY Srinivasan wrote:
> >
> >
> >> -----Original Message-----
> >> From: John Fastabend [mailto:john.fastabend@gmail.com]
> >> Sent: Thursday, January 7, 2016 5:02 PM
> >> To: Vitaly Kuznetsov <vkuznets@redhat.com>; Simon Xiao
> >> <sixiao@microsoft.com>; Eric Dumazet <eric.dumazet@gmail.com>
> >> Cc: Tom Herbert <tom@herbertland.com>; netdev@vger.kernel.org; KY
> >> Srinivasan <kys@microsoft.com>; Haiyang Zhang
> <haiyangz@microsoft.com>;
> >> devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; David Miller
> >> <davem@davemloft.net>
> >> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on
> struct
> >> flow_keys layout
> >>
> >> On 16-01-07 05:28 AM, Vitaly Kuznetsov wrote:
> >>> Eric Dumazet <eric.dumazet@gmail.com> writes:
> >>>
> >>>> On Thu, 2016-01-07 at 10:33 +0100, Vitaly Kuznetsov wrote:
> >>>>> Recent changes to 'struct flow_keys' (e.g commit d34af823ff40 ("net:
> Add
> >>>>>  VLAN ID to flow_keys")) introduced a performance regression in
> netvsc
> >>>>> driver. Is problem is, however, not the above mentioned commit but
> the
> >>>>> fact that netvsc_set_hash() function did some assumptions on the
> struct
> >>>>> flow_keys data layout and this is wrong. We need to extract the data
> we
> >>>>> need (src/dst addresses and ports) after the dissect.
> >>>>>
> >>>>> The issue could also be solved in a completely different way: as
> suggested
> >>>>> by Eric instead of our own homegrown netvsc_set_hash() we could
> use
> >>>>> skb_get_hash() which does more or less the same. Unfortunately,
> the
> >>>>> testing done by Simon showed that Hyper-V hosts are not happy with
> our
> >>>>> Jenkins hash, selecting the output queue with the current algorithm
> based
> >>>>> on Toeplitz hash works significantly better.
> >>>>
> >>
> >> Also can I ask the maybe naive question. It looks like the hypervisor
> >> is populating some table via a mailbox msg and this is used to select
> >> the queues I guess with some sort of weighting function?
> >>
> >> What happens if you just remove select_queue altogether? Or maybe
> just
> >> what is this 16 entry table doing? How does this work on my larger
> >> systems with 64+ cores can I only use 16 cores? Sorry I really have
> >> no experience with hyperV and this got me curious.
> >
> > We will limit the number of VRSS channels to the number of CPUs in
> > a NUMA node. If the number of CPUs in a NUMA node exceeds 8, we
> > will only open up 8 VRSS channels. On the host side currently traffic
> > spreading is done in software and we have found that limiting to 8 CPUs
> > gives us the best throughput. In Windows Server 2016, we will be
> > distributing traffic on the host in hardware; the heuristics in the guest
> > may change.
> >
> > Regards,
> >
> > K. Y
> 
> I think a better way to do this would be to query the numa node when
> the interface comes online via dev_to_node() and then use cpu_to_node()
> or create/find some better variant to get a list of cpus on the numa
> node.
> 
> At this point you can use the xps mapping interface
> netif_set_xps_queue() to get the right queue to cpu binding. If you want
> to cap it to max 8 queues that works as well. I don't think there is
> any value to have more tx queues than number of cpus in use.
> 
> If you do it this way all the normal mechanisms to setup queue mappings
> will work for users who are doing some special configuration and the
> default will still be what you want.
> 
> I guess I should go do this numa mapping for ixgbe and friends now that
> I mention it. Last perf numbers I had showed cross numa affinitizing
> was pretty bad.
> 
> Thanks,
> John
John,

I am a little confused. In the guest, we need to first open the
sub-channels (VRSS queues) based on what the host is offering. While we
cannot open more sub-channels than what the host is offering, the guest
can certainly open fewer sub-channels. I was describing the heuristics
for how many sub-channels the guest currently opens. This is based on
the NUMA topology presented to the guest and the number of VCPUs
provisioned for the guest. The binding of VCPUs to the channels occurs
at the point of opening these channels.

Regards,

K. Y 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-07 13:28     ` Vitaly Kuznetsov
@ 2016-01-08 21:07       ` Haiyang Zhang
  -1 siblings, 0 replies; 43+ messages in thread
From: Haiyang Zhang @ 2016-01-08 21:07 UTC (permalink / raw)
  To: Vitaly Kuznetsov, Simon Xiao, Eric Dumazet
  Cc: Tom Herbert, netdev, KY Srinivasan, devel, linux-kernel, David Miller



> -----Original Message-----
> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
> Sent: Thursday, January 7, 2016 8:28 AM
> To: Simon Xiao <sixiao@microsoft.com>; Eric Dumazet
> <eric.dumazet@gmail.com>
> Cc: Tom Herbert <tom@herbertland.com>; netdev@vger.kernel.org; KY
> Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>;
> devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; David Miller
> <davem@davemloft.net>
> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on
> struct flow_keys layout
> 
> Eric Dumazet <eric.dumazet@gmail.com> writes:
> 
> > On Thu, 2016-01-07 at 10:33 +0100, Vitaly Kuznetsov wrote:
> >> Recent changes to 'struct flow_keys' (e.g commit d34af823ff40 ("net:
> Add
> >>  VLAN ID to flow_keys")) introduced a performance regression in
> netvsc
> >> driver. Is problem is, however, not the above mentioned commit but
> the
> >> fact that netvsc_set_hash() function did some assumptions on the
> struct
> >> flow_keys data layout and this is wrong. We need to extract the data
> we
> >> need (src/dst addresses and ports) after the dissect.
> >>
> >> The issue could also be solved in a completely different way: as
> suggested
> >> by Eric instead of our own homegrown netvsc_set_hash() we could use
> >> skb_get_hash() which does more or less the same. Unfortunately, the
> >> testing done by Simon showed that Hyper-V hosts are not happy with
> our
> >> Jenkins hash, selecting the output queue with the current algorithm
> based
> >> on Toeplitz hash works significantly better.
> >
> > Were tests done on IPv6 traffic ?
> >
> 
> Simon, could you please test this patch for IPv6 and show us the numbers?
> 
> > Toeplitz hash takes at least 100 ns to hash 12 bytes (one iteration
> per
> > bit : 96 iterations)
> >
> > For IPv6 it is 3 times this, since we have to hash 36 bytes.
> >
> > I do not see how it can compete with skb_get_hash() that directly
> gives
> > skb->hash for local TCP flows.
> >
> 
> My guess is that this is not the bottleneck, something is happening
> behind the scene with out packets in Hyper-V host (e.g. re-distributing
> them to hardware queues?) but I don't know the internals, Microsoft
> folks could probably comment.

The Hyper-V vRSS protocol lets us use the Toeplitz hash algorithm. We are
currently running further tests, including IPv6 too, and will share the 
results when available.

Thanks,
- Haiyang

^ permalink raw reply	[flat|nested] 43+ messages in thread


* Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-07 12:52 ` Eric Dumazet
@ 2016-01-09  0:17     ` Tom Herbert
  2016-01-09  0:17     ` Tom Herbert
  1 sibling, 0 replies; 43+ messages in thread
From: Tom Herbert @ 2016-01-09  0:17 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Vitaly Kuznetsov, Linux Kernel Network Developers,
	K. Y. Srinivasan, Haiyang Zhang, devel, LKML, David Miller

On Thu, Jan 7, 2016 at 4:52 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2016-01-07 at 10:33 +0100, Vitaly Kuznetsov wrote:
>> Recent changes to 'struct flow_keys' (e.g commit d34af823ff40 ("net: Add
>>  VLAN ID to flow_keys")) introduced a performance regression in netvsc
>> driver. Is problem is, however, not the above mentioned commit but the
>> fact that netvsc_set_hash() function did some assumptions on the struct
>> flow_keys data layout and this is wrong. We need to extract the data we
>> need (src/dst addresses and ports) after the dissect.
>>
>> The issue could also be solved in a completely different way: as suggested
>> by Eric instead of our own homegrown netvsc_set_hash() we could use
>> skb_get_hash() which does more or less the same. Unfortunately, the
>> testing done by Simon showed that Hyper-V hosts are not happy with our
>> Jenkins hash, selecting the output queue with the current algorithm based
>> on Toeplitz hash works significantly better.
>
> Were tests done on IPv6 traffic ?
>
> Toeplitz hash takes at least 100 ns to hash 12 bytes (one iteration per
> bit : 96 iterations)
>
> For IPv6 it is 3 times this, since we have to hash 36 bytes.
>
> I do not see how it can compete with skb_get_hash() that directly gives
> skb->hash for local TCP flows.
>
> See commits b73c3d0e4f0e1961e15bec18720e48aabebe2109
> ("net: Save TX flow hash in sock and set in skbuf on xmit")
> and 877d1f6291f8e391237e324be58479a3e3a7407c
> ("net: Set sk_txhash from a random number")
>
> I understand Microsoft loves Toeplitz, but this looks not well placed
> here.
>
+1

We need a little more of an explanation of why "Toeplitz hash works
significantly better" than Jenkins hash. We already know that Toeplitz
is expensive to do in SW, and there have been some proposals to optimize
it which don't seem to have been applied to hv_netvsc (I believe Eric
had a good implementation). Assuming skb_get_hash isn't sufficient, the
Toeplitz hash should be in a common library anyway; we really don't
want drivers or modules inventing new ways to hash packets at this point!

Tom

> I suspect there is another problem.
>
> Please share your numbers and test methodology, and the alternative
> patch Simon tested so that we can double check it.
>
> Thanks.
>
> PS: For the time being this patch can probably be applied on -net tree,
> as it fixes a real bug.
>
>
>

^ permalink raw reply	[flat|nested] 43+ messages in thread


* Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-07  9:33 ` Vitaly Kuznetsov
@ 2016-01-10 22:25   ` David Miller
  -1 siblings, 0 replies; 43+ messages in thread
From: David Miller @ 2016-01-10 22:25 UTC (permalink / raw)
  To: vkuznets; +Cc: netdev, kys, haiyangz, devel, linux-kernel, eric.dumazet

From: Vitaly Kuznetsov <vkuznets@redhat.com>
Date: Thu,  7 Jan 2016 10:33:09 +0100

> Recent changes to 'struct flow_keys' (e.g commit d34af823ff40 ("net: Add
>  VLAN ID to flow_keys")) introduced a performance regression in netvsc
> driver. Is problem is, however, not the above mentioned commit but the
> fact that netvsc_set_hash() function did some assumptions on the struct
> flow_keys data layout and this is wrong. We need to extract the data we
> need (src/dst addresses and ports) after the dissect.
> 
> The issue could also be solved in a completely different way: as suggested
> by Eric instead of our own homegrown netvsc_set_hash() we could use
> skb_get_hash() which does more or less the same. Unfortunately, the
> testing done by Simon showed that Hyper-V hosts are not happy with our
> Jenkins hash, selecting the output queue with the current algorithm based
> on Toeplitz hash works significantly better.
> 
> Tested-by: Simon Xiao <sixiao@microsoft.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>

Stop using this Toeplitz thing and just use the proper hash the stack
is already calculating for you.

There is no way this is faster, and the continued attempts to
shoe-horn Toeplitz usage into this driver are resulting in incredibly
ugly and ridiculous code.

I'm not applying any patches that further the use of Toeplitz as the
hash function in this driver.  You must use the clean, efficient
facilities the kernel already has for packet hashing.

If every driver did what you guys are doing, we'd be in a heap of
trouble, and I'm simply not going to allow this to continue any
longer.

Thanks.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-10 22:25   ` David Miller
@ 2016-01-13 23:10     ` Haiyang Zhang
  -1 siblings, 0 replies; 43+ messages in thread
From: Haiyang Zhang @ 2016-01-13 23:10 UTC (permalink / raw)
  To: David Miller, vkuznets
  Cc: netdev, KY Srinivasan, devel, linux-kernel, eric.dumazet



> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: Sunday, January 10, 2016 5:26 PM
> To: vkuznets@redhat.com
> Cc: netdev@vger.kernel.org; KY Srinivasan <kys@microsoft.com>; Haiyang
> Zhang <haiyangz@microsoft.com>; devel@linuxdriverproject.org; linux-
> kernel@vger.kernel.org; eric.dumazet@gmail.com
> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on
> struct flow_keys layout
> 
> From: Vitaly Kuznetsov <vkuznets@redhat.com>
> Date: Thu,  7 Jan 2016 10:33:09 +0100
> 
> > Recent changes to 'struct flow_keys' (e.g commit d34af823ff40 ("net:
> Add
> >  VLAN ID to flow_keys")) introduced a performance regression in netvsc
> > driver. Is problem is, however, not the above mentioned commit but the
> > fact that netvsc_set_hash() function did some assumptions on the
> struct
> > flow_keys data layout and this is wrong. We need to extract the data
> we
> > need (src/dst addresses and ports) after the dissect.
> >
> > The issue could also be solved in a completely different way: as
> suggested
> > by Eric instead of our own homegrown netvsc_set_hash() we could use
> > skb_get_hash() which does more or less the same. Unfortunately, the
> > testing done by Simon showed that Hyper-V hosts are not happy with our
> > Jenkins hash, selecting the output queue with the current algorithm
> based
> > on Toeplitz hash works significantly better.
> >
> > Tested-by: Simon Xiao <sixiao@microsoft.com>
> > Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> 
> Stop using this Toeplitz thing and just use the proper hash the stack
> is already calculating for you.
> 
> There is no way this is faster, and the continued attempts to
> shoe-horn Toeplitz usage into this driver is resulting in incredibly
> ugly and rediculous code.
> 
> I'm not applying any patches that further the use of Toeplitz as the
> hash function in this driver.  You must use the clean,
> efficient, facilities the kernel has already for packet hashing.
> 
> If every driver did what you guys are doing, we'd be in a heap of
> trouble, and I'm simply not going to allow this to continue any
> longer.
> 
> Thanks.

I have done a comparison of the Toeplitz vs. Jenkins hash algorithms,
and found that Toeplitz provides a much better distribution of the
connections into send-indirection-table entries. See the data below --
it shows how many TCP connections are distributed into each of the
sixteen table entries. The Toeplitz hash distributes the connections
almost perfectly evenly, but the Jenkins hash distributes them unevenly.
For example, in the case of 64 connections, some entries are 0 or 1 while
other entries are 8. This could put too many connections in one VMBus
channel and slow down the throughput. This is consistent with our tests,
which showed slower performance when using the generic skb_get_hash
(Jenkins) than when using the Toeplitz hash (see perf numbers below).


#connections:32:
Toeplitz:2,2,2,2,2,1,2,2,2,2,2,3,2,2,2,2,
Jenkins:3,2,2,4,1,1,0,2,1,1,4,3,2,5,1,0,
#connections:64:
Toeplitz:4,4,5,4,4,3,4,4,4,4,4,4,4,4,4,4,
Jenkins:4,5,4,6,3,5,0,6,1,2,8,3,6,8,2,1,
#connections:128:
Toeplitz:8,8,8,8,8,7,9,8,8,8,8,8,8,8,8,8,
Jenkins:8,12,10,9,7,8,3,10,6,8,9,8,10,11,6,3,

Throughput (Gbps) comparison:
#conn		Toeplitz	Jenkins
32		26.6		23.2
64		32.1		23.4
128		29.1		24.1

For a long-term solution, I think we should add the Toeplitz hash as 
another option alongside the generic hash function in the kernel... But, 
for the time being, can you accept this patch to fix the assumptions on 
struct flow_keys layout?

Thanks,
- Haiyang

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-13 23:10     ` Haiyang Zhang
@ 2016-01-14  4:56       ` David Miller
  -1 siblings, 0 replies; 43+ messages in thread
From: David Miller @ 2016-01-14  4:56 UTC (permalink / raw)
  To: haiyangz; +Cc: vkuznets, netdev, kys, devel, linux-kernel, eric.dumazet

From: Haiyang Zhang <haiyangz@microsoft.com>
Date: Wed, 13 Jan 2016 23:10:57 +0000

> I have done a comparison of the Toeplitz v.s. Jenkins Hash algorithms, 
> and found that the Toeplitz provides much better distribution of the 
> connections into send-indirection-table entries.

This fails to take into consideration how massively more expensive
Toeplitz is to compute.

This also fails to show what the real life performance implications
are.

Just showing distributions is meaningless if it doesn't indicate
what kind of performance distribution A or B achieves.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-13 23:10     ` Haiyang Zhang
@ 2016-01-14 17:14       ` Tom Herbert
  -1 siblings, 0 replies; 43+ messages in thread
From: Tom Herbert @ 2016-01-14 17:14 UTC (permalink / raw)
  To: Haiyang Zhang
  Cc: David Miller, vkuznets, netdev, KY Srinivasan, devel,
	linux-kernel, eric.dumazet

> I have done a comparison of the Toeplitz v.s. Jenkins Hash algorithms,
> and found that the Toeplitz provides much better distribution of the
> connections into send-indirection-table entries. See the data below --
> showing how many TCP connections are distributed into each of the
> sixteen table entries. The Toeplitz hash distributes the connections
> almost perfectly evenly, but the Jenkins hash distributes them unevenly.
> For example, in case of 64 connections, some entries are 0 or 1, some
> other entries are 8. This could cause too many connections in one VMBus
> channel and slow down the throughput. This is consistent to our test
> which showing slower performance while using the generic skb_get_hash
> (Jenkins) than using Toeplitz hash (see perf numbers below).
>
>
> #connections:32:
> Toeplitz:2,2,2,2,2,1,2,2,2,2,2,3,2,2,2,2,
> Jenkins:3,2,2,4,1,1,0,2,1,1,4,3,2,5,1,0,
> #connections:64:
> Toeplitz:4,4,5,4,4,3,4,4,4,4,4,4,4,4,4,4,
> Jenkins:4,5,4,6,3,5,0,6,1,2,8,3,6,8,2,1,
> #connections:128:
> Toeplitz:8,8,8,8,8,7,9,8,8,8,8,8,8,8,8,8,
> Jenkins:8,12,10,9,7,8,3,10,6,8,9,8,10,11,6,3,
>
These results for Toeplitz are not plausible. Given random input you
cannot expect any hash function to produce such uniform results. I
suspect either your input data is biased or the way you're applying the
hash is.

When I run 64 random IPv4 3-tuples through Toeplitz and Jenkins I get
something more reasonable:

Toeplitz
Buckets: 3 7 4 5 3 6 2 6 2 4 4 5 4 3 2 4

Jenkins
Buckets: 6 7 4 4 3 2 6 3 1 4 3 5 5 4 4 3

> Throughput (Gbps) comparison:
> #conn           Toeplitz        Jenkins
> 32              26.6            23.2
> 64              32.1            23.4
> 128             29.1            24.1
>
> For long term solution, I think we should put the Toeplitz hash as
> another option to the generic hash function in kernel... But, for the
> time being, can you accept this patch to fix the assumptions on
> struct flow_keys layout?
>
Toeplitz is about 100x more expensive to compute in the CPU than
Jenkins; we can get that down to 50x by precomputing a bunch of lookup
tables for a given key, but that is at the expense of memory. Besides
that, there is a fair amount of analysis already showing that Jenkins
hash provides a good distribution and has good enough (though not
great) avalanche effect. Probably the only reason we would need
Toeplitz in SW is if we wanted to match a computation being done by
HW.

One hash that might be better than Jenkins is CRC. This seems to have
good uniformity and avalanche effect, and by using the crc32 instruction
it seems to be a little faster than running Jenkins hash.
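
As a rough illustration of that, here is a userspace sketch using the
SSE4.2 crc32 intrinsic (the function name is made up, and the kernel would
use its own crc32c helpers rather than intrinsics):

#include <stdint.h>
#include <stddef.h>
#include <nmmintrin.h>	/* _mm_crc32_u32, compile with -msse4.2 */

/* Fold a flow tuple (as 32-bit words) into a hash with the crc32
 * instruction: one instruction per word instead of one iteration per bit.
 */
static uint32_t crc32c_tuple(const uint32_t *words, size_t n_words,
			     uint32_t seed)
{
	uint32_t crc = seed;
	size_t i;

	for (i = 0; i < n_words; i++)
		crc = _mm_crc32_u32(crc, words[i]);

	return crc;
}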

Tom

> Thanks,
> - Haiyang
>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-14 17:14       ` Tom Herbert
@ 2016-01-14 17:53         ` One Thousand Gnomes
  -1 siblings, 0 replies; 43+ messages in thread
From: One Thousand Gnomes @ 2016-01-14 17:53 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Haiyang Zhang, David Miller, vkuznets, netdev, KY Srinivasan,
	devel, linux-kernel, eric.dumazet

> These results for Toeplitz are not plausible. Given random input you
> cannot expect any hash function to produce such uniform results. I
> suspect either your input data is biased or how your applying the hash
> is.
> 
> When I run 64 random IPv4 3-tuples through Toeplitz and Jenkins I get
> something more reasonable:

IPv4 address patterns are not random. Nothing like it. A long long time
ago we did do a bunch of tuning for network hashes using big porn site
data sets. Random it was not.

It's probably hard to repeat that exercise now with geo-specific routing
and all the front-end caches and redirectors on big sites, but I'd
strongly suggest random input is not a good test, and also that you need
to worry more about hash attacks than perfect distributions.

Alan

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-13 23:10     ` Haiyang Zhang
                       ` (2 preceding siblings ...)
  (?)
@ 2016-01-14 17:53     ` Eric Dumazet
  -1 siblings, 0 replies; 43+ messages in thread
From: Eric Dumazet @ 2016-01-14 17:53 UTC (permalink / raw)
  To: Haiyang Zhang
  Cc: David Miller, vkuznets, netdev, KY Srinivasan, devel, linux-kernel

On Wed, 2016-01-13 at 23:10 +0000, Haiyang Zhang wrote:

> I have done a comparison of the Toeplitz v.s. Jenkins Hash algorithms, 
> and found that the Toeplitz provides much better distribution of the 
> connections into send-indirection-table entries. See the data below -- 
> showing how many TCP connections are distributed into each of the 
> sixteen table entries. The Toeplitz hash distributes the connections 
> almost perfectly evenly, but the Jenkins hash distributes them unevenly. 
> For example, in case of 64 connections, some entries are 0 or 1, some 
> other entries are 8. This could cause too many connections in one VMBus 
> channel and slow down the throughput.

So a VMBus channel has a limit on the number of flows? Why is that?

What happens with 1000 flows?

>  This is consistent to our test 
> which showing slower performance while using the generic skb_get_hash 
> (Jenkins) than using Toeplitz hash (see perf numbers below).
> 
> 
> #connections:32:
> Toeplitz:2,2,2,2,2,1,2,2,2,2,2,3,2,2,2,2,
> Jenkins:3,2,2,4,1,1,0,2,1,1,4,3,2,5,1,0,
> #connections:64:
> Toeplitz:4,4,5,4,4,3,4,4,4,4,4,4,4,4,4,4,
> Jenkins:4,5,4,6,3,5,0,6,1,2,8,3,6,8,2,1,
> #connections:128:
> Toeplitz:8,8,8,8,8,7,9,8,8,8,8,8,8,8,8,8,
> Jenkins:8,12,10,9,7,8,3,10,6,8,9,8,10,11,6,3,
> 
> Throughput (Gbps) comparison:
> #conn		Toeplitz	Jenkins
> 32		26.6		23.2
> 64		32.1		23.4
> 128		29.1		24.1
> 
> For long term solution, I think we should put the Toeplitz hash as 
> another option to the generic hash function in kernel... But, for the 
> time being, can you accept this patch to fix the assumptions on 
> struct flow_keys layout?


I find your Toeplitz distribution has an anomaly.

Having 128 connections distributed almost _perfectly_ into 16 buckets is
telling us something about how the source/destination ports were allocated,
maybe knowing the RSS key or something?

It looks too _perfect_ to be true.

Here is what I get from 20 runs of 128 sessions using a
prandom_u32() hash, distributed to 16 buckets (hash % 16)

: 6,9,9,6,11,8,9,7,7,7,9,8,8,7,9,8
: 6,9,6,6,6,9,8,5,12,10,7,7,9,7,13,8
: 7,4,9,9,10,9,8,7,15,4,8,8,11,10,2,7
: 12,5,10,6,7,4,10,10,6,5,10,14,8,8,5,8
: 4,8,5,13,7,4,7,9,7,6,6,9,6,11,17,9
: 10,10,8,5,7,4,5,14,6,9,9,7,8,9,7,10
: 6,4,9,10,13,8,8,7,6,5,8,9,7,5,15,8
: 11,13,7,4,8,6,6,9,10,8,8,5,6,6,11,10
: 8,8,11,7,12,13,5,8,9,6,8,10,5,4,9,5
: 13,5,5,4,5,11,8,8,11,8,9,10,10,6,9,6
: 13,6,12,6,6,7,4,9,5,14,9,12,9,4,4,8
: 4,9,10,12,10,4,8,6,8,5,14,10,5,8,8,7
: 7,7,6,6,12,13,8,12,7,6,8,9,6,5,12,4
: 4,12,9,10,2,12,10,13,5,8,4,6,8,10,4,11
: 5,6,10,10,10,9,16,8,8,7,4,10,7,6,6,6
: 9,13,10,11,6,9,4,7,7,9,7,6,9,9,7,5
: 8,7,4,8,6,9,9,8,7,10,8,10,17,7,5,5
: 10,5,10,8,9,5,9,6,12,8,5,8,7,9,7,10
: 8,10,10,7,10,7,13,3,9,5,7,2,10,9,12,6
: 4,6,13,6,6,6,12,9,11,5,7,10,9,8,11,5

This looks more 'random' to me, and _if_ I use Jenkins hash I have the
same distribution.

Sure, it is not 'perfectly spread', but who said that all flows are
sending the same amount of traffic in the real world?
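
A sketch of that experiment, with a userspace PRNG standing in for
prandom_u32() (illustrative only):

#include <stdio.h>
#include <stdlib.h>

#define NR_FLOWS	128
#define NR_BUCKETS	16

int main(void)
{
	unsigned int buckets[NR_BUCKETS] = { 0 };
	int i;

	srandom(1);	/* fixed seed so a run is reproducible */
	for (i = 0; i < NR_FLOWS; i++)
		buckets[random() % NR_BUCKETS]++;

	for (i = 0; i < NR_BUCKETS; i++)
		printf("%u%c", buckets[i], i == NR_BUCKETS - 1 ? '\n' : ',');
	return 0;
}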

Using Toeplitz hash is adding a cost of 300 ns per IPV6 packet.

A TCP_RR (small RPC) workload would certainly not want to compute Toeplitz
for every packet.

I would prefer that we do not add complexity just to make some benchmark
look better.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-14 17:53         ` One Thousand Gnomes
@ 2016-01-14 18:24           ` Eric Dumazet
  -1 siblings, 0 replies; 43+ messages in thread
From: Eric Dumazet @ 2016-01-14 18:24 UTC (permalink / raw)
  To: One Thousand Gnomes
  Cc: Tom Herbert, Haiyang Zhang, David Miller, vkuznets, netdev,
	KY Srinivasan, devel, linux-kernel

On Thu, 2016-01-14 at 17:53 +0000, One Thousand Gnomes wrote:
> > These results for Toeplitz are not plausible. Given random input you
> > cannot expect any hash function to produce such uniform results. I
> > suspect either your input data is biased or how your applying the hash
> > is.
> > 
> > When I run 64 random IPv4 3-tuples through Toeplitz and Jenkins I get
> > something more reasonable:
> 
> IPv4 address patterns are not random. Nothing like it. A long long time
> ago we did do a bunch of tuning for network hashes using big porn site
> data sets. Random it was not.
> 

I ran my tests with non-random IPv4 addresses, as I had 2 hosts,
one server, one client (typical benchmark stuff).

The only 'random' part was the ports, so maybe ~20 bits of entropy,
considering how we allocate ports during connect() to a given
destination to avoid port reuse.

> It's probably hard to repeat that exercise now with geo specific routing,
> and all the front end caches and redirectors on big sites but I'd
> strongly suggest random input is not a good test, and also that you need
> to worry more about hash attacks than perfect distributions.

Anyway, the exercise is not to find a hash that exactly splits 128 flows
into 16 buckets, according to the number of flows per bucket.

Maybe only 4 flows are sending at 3Gbits, and others are sending at 100
kbits. There is no way the driver can predict the future.

This is why we prefer to select a queue given the cpu sending the
packet. This permits a natural shift based on actual load, and is the
default on linux (see XPS in Documentation/networking/scaling.txt)

Only this driver has a selection based on a flow 'hash'.
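
For reference, XPS is configured by writing a CPU mask to
/sys/class/net/<dev>/queues/tx-<n>/xps_cpus, as described in the
scaling.txt document above. The same thing expressed as a small C helper,
with the device name and mask as made-up examples:

#include <stdio.h>

static int set_xps_mask(const char *dev, int txq, const char *cpu_mask)
{
	char path[128];
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/class/net/%s/queues/tx-%d/xps_cpus", dev, txq);
	f = fopen(path, "w");
	if (!f)
		return -1;
	fprintf(f, "%s\n", cpu_mask);
	return fclose(f);
}

int main(void)
{
	/* hypothetical: steer tx queue 0 of eth0 to CPUs 0-3 (mask 0xf) */
	return set_xps_mask("eth0", 0, "f") ? 1 : 0;
}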

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-14 18:24           ` Eric Dumazet
@ 2016-01-14 18:35             ` Haiyang Zhang
  -1 siblings, 0 replies; 43+ messages in thread
From: Haiyang Zhang @ 2016-01-14 18:35 UTC (permalink / raw)
  To: Eric Dumazet, One Thousand Gnomes
  Cc: Tom Herbert, David Miller, vkuznets, netdev, KY Srinivasan,
	devel, linux-kernel



> -----Original Message-----
> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> Sent: Thursday, January 14, 2016 1:24 PM
> To: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
> Cc: Tom Herbert <tom@herbertland.com>; Haiyang Zhang
> <haiyangz@microsoft.com>; David Miller <davem@davemloft.net>;
> vkuznets@redhat.com; netdev@vger.kernel.org; KY Srinivasan
> <kys@microsoft.com>; devel@linuxdriverproject.org; linux-
> kernel@vger.kernel.org
> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on
> struct flow_keys layout
> 
> On Thu, 2016-01-14 at 17:53 +0000, One Thousand Gnomes wrote:
> > > These results for Toeplitz are not plausible. Given random input you
> > > cannot expect any hash function to produce such uniform results. I
> > > suspect either your input data is biased or how your applying the
> hash
> > > is.
> > >
> > > When I run 64 random IPv4 3-tuples through Toeplitz and Jenkins I
> get
> > > something more reasonable:
> >
> > IPv4 address patterns are not random. Nothing like it. A long long
> time
> > ago we did do a bunch of tuning for network hashes using big porn site
> > data sets. Random it was not.
> >
> 
> I ran my tests with non random IPV4 addresses, as I had 2 hosts,
> one server, one client. (typical benchmark stuff)
> 
> The only 'random' part was the ports, so maybe ~20 bits of entropy,
> considering how we allocate ports during connect() to a given
> destination to avoid port reuse.
> 
> > It's probably hard to repeat that exercise now with geo specific
> routing,
> > and all the front end caches and redirectors on big sites but I'd
> > strongly suggest random input is not a good test, and also that you
> need
> > to worry more about hash attacks than perfect distributions.
> 
> Anyway, the exercise is not to find a hash that exactly splits 128 flows
> into 16 buckets, according to the number of flows per bucket.
> 
> Maybe only 4 flows are sending at 3Gbits, and others are sending at 100
> kbits. There is no way the driver can predict the future.
> 
> This is why we prefer to select a queue given the cpu sending the
> packet. This permits a natural shift based on actual load, and is the
> default on linux (see XPS in Documentation/networking/scaling.txt)
> 
> Only this driver has a selection based on a flow 'hash'.

Also, the port number selection may not be random either. For example, 
the well-known network throughput test tool iperf uses port numbers with 
equal increments between them. We tested these non-random cases and found 
that the Toeplitz hash distributed the connections evenly, but the Jenkins 
hash did not.

I'm aware of the test from Tom Herbert <tom@herbertland.com>, which 
shows similar results for Toeplitz vs. Jenkins with random inputs.

In summary, Toeplitz performs better in the case of non-random inputs, 
and performs similarly to Jenkins with random inputs (which may not be 
the case in the real world). So we still prefer to use the Toeplitz hash.

To minimize the computational overhead, we may consider putting the hash 
in a per-connection cache in the TCP layer, so it only needs to be 
computed once. But, even with the computation overhead at this moment, 
the throughput based on the Toeplitz hash is better than Jenkins:
Throughput (Gbps) comparison:
#conn		Toeplitz	Jenkins
32		26.6		23.2
64		32.1		23.4
128		29.1		24.1

Also, to the questions from Eric Dumazet <eric.dumazet@gmail.com> -- no, 
there is no limit on the number of connections per VMBus channel. But, 
if one channel has a lot more connections than other channels, the 
unbalanced workload slows down the overall throughput.

The purpose of the send-indirection-table is to shift the workload by 
changing the mapping of table entries to channels. The updated table is 
sent by the host to the guest from time to time. But if the hash function 
puts too many connections into one table entry, they cannot be spread 
across different channels.
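
To make the mechanism concrete, a hedged sketch of that mapping (the
structure and field names below are illustrative, not the actual
hv_netvsc definitions):

#include <stdint.h>

#define SEND_TAB_SIZE	16	/* the sixteen table entries shown above */

struct example_netvsc {
	uint16_t send_table[SEND_TAB_SIZE];	/* entry -> channel, remapped by the host */
	uint16_t num_channels;
};

/* The flow hash only ever picks the table entry; if one entry collects
 * most of the flows, remapping entries to channels cannot rebalance them.
 */
uint16_t example_pick_channel(const struct example_netvsc *nv, uint32_t flow_hash)
{
	return nv->send_table[flow_hash % SEND_TAB_SIZE] % nv->num_channels;
}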

Thanks to everyone who joined the discussion.

Thanks,
- Haiyang

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-14 18:35             ` Haiyang Zhang
  (?)
@ 2016-01-14 18:48             ` Tom Herbert
  2016-01-14 19:15                 ` Haiyang Zhang
  -1 siblings, 1 reply; 43+ messages in thread
From: Tom Herbert @ 2016-01-14 18:48 UTC (permalink / raw)
  To: Haiyang Zhang
  Cc: Eric Dumazet, One Thousand Gnomes, David Miller, vkuznets,
	netdev, KY Srinivasan, devel, linux-kernel

On Thu, Jan 14, 2016 at 10:35 AM, Haiyang Zhang <haiyangz@microsoft.com> wrote:
>
>
>> -----Original Message-----
>> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
>> Sent: Thursday, January 14, 2016 1:24 PM
>> To: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
>> Cc: Tom Herbert <tom@herbertland.com>; Haiyang Zhang
>> <haiyangz@microsoft.com>; David Miller <davem@davemloft.net>;
>> vkuznets@redhat.com; netdev@vger.kernel.org; KY Srinivasan
>> <kys@microsoft.com>; devel@linuxdriverproject.org; linux-
>> kernel@vger.kernel.org
>> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on
>> struct flow_keys layout
>>
>> On Thu, 2016-01-14 at 17:53 +0000, One Thousand Gnomes wrote:
>> > > These results for Toeplitz are not plausible. Given random input you
>> > > cannot expect any hash function to produce such uniform results. I
>> > > suspect either your input data is biased or how your applying the
>> hash
>> > > is.
>> > >
>> > > When I run 64 random IPv4 3-tuples through Toeplitz and Jenkins I
>> get
>> > > something more reasonable:
>> >
>> > IPv4 address patterns are not random. Nothing like it. A long long
>> time
>> > ago we did do a bunch of tuning for network hashes using big porn site
>> > data sets. Random it was not.
>> >
>>
>> I ran my tests with non random IPV4 addresses, as I had 2 hosts,
>> one server, one client. (typical benchmark stuff)
>>
>> The only 'random' part was the ports, so maybe ~20 bits of entropy,
>> considering how we allocate ports during connect() to a given
>> destination to avoid port reuse.
>>
>> > It's probably hard to repeat that exercise now with geo specific
>> routing,
>> > and all the front end caches and redirectors on big sites but I'd
>> > strongly suggest random input is not a good test, and also that you
>> need
>> > to worry more about hash attacks than perfect distributions.
>>
>> Anyway, the exercise is not to find a hash that exactly splits 128 flows
>> into 16 buckets, according to the number of flows per bucket.
>>
>> Maybe only 4 flows are sending at 3Gbits, and others are sending at 100
>> kbits. There is no way the driver can predict the future.
>>
>> This is why we prefer to select a queue given the cpu sending the
>> packet. This permits a natural shift based on actual load, and is the
>> default on linux (see XPS in Documentation/networking/scaling.txt)
>>
>> Only this driver has a selection based on a flow 'hash'.
>
> Also, the port number selection may not be random either. For example,
> the well-known network throughput test tool, iperf, use port numbers with
> equal increment among them. We tested these non-random cases, and found
> the Toeplitz hash has distributed evenly, but Jenkins hash has non-even
> distribution.
>
> I'm aware of the test from Tom Herbert <tom@herbertland.com>, which
> showing similar results of Toeplitz v.s. Jenkins with random inputs.
>
> In summary, the Toeplitz performs better in case of non-random inputs,
> and performs similar to Jenkins in random inputs (which may not be the
> case in real world). So we still prefer to use Toeplitz hash.
>
You are basing your conclusions on one toy benchmark. I don't believe
that a realistically loaded web server is going to consistently give
you tuples that happen to somehow fit into a nice model so that the
bias benefits your load distribution.

> To minimize the computational overhead, we may consider put the hash
> in a per-connection cache in TCP layer, so it only needs one time
> computation. But, even with the computation overhead at this moment,
> the throughput based on Toeplitz hash is better than Jenkins:
> Throughput (Gbps) comparison:
> #conn           Toeplitz        Jenkins
> 32              26.6            23.2
> 64              32.1            23.4
> 128             29.1            24.1
>
You don't need to do that. We already store a random hash value in the
connection context. If you want to make it non-random then just
replace that with a simple global counter. This will have the exact
same effect that you see in your tests without needing any expensive
computation.

> Also, to the questions from Eric Dumazet <eric.dumazet@gmail.com> -- no,
> there is not limit of the number of connections per VMBus channel. But,
> if one channel has a lot more connections than other channels, the
> unbalanced work load slow down the overall throughput.
>
> The purpose of send-indirection-table is to shift the workload by change
> the mapping of table entry v.s. the channel. The updated table is sent
> by host to guest from time to time. But if the hash function distributes
> too many connections into one table entry, it cannot spread them into
> different channels.
>
> Thanks to everyone who joined the discussion.
>
> Thanks,
> - Haiyang
>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-14 18:48             ` Tom Herbert
@ 2016-01-14 19:15                 ` Haiyang Zhang
  0 siblings, 0 replies; 43+ messages in thread
From: Haiyang Zhang @ 2016-01-14 19:15 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Eric Dumazet, One Thousand Gnomes, David Miller, vkuznets,
	netdev, KY Srinivasan, devel, linux-kernel



> -----Original Message-----
> From: Tom Herbert [mailto:tom@herbertland.com]
> Sent: Thursday, January 14, 2016 1:49 PM
> To: Haiyang Zhang <haiyangz@microsoft.com>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>; One Thousand Gnomes
> <gnomes@lxorguk.ukuu.org.uk>; David Miller <davem@davemloft.net>;
> vkuznets@redhat.com; netdev@vger.kernel.org; KY Srinivasan
> <kys@microsoft.com>; devel@linuxdriverproject.org; linux-
> kernel@vger.kernel.org
> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on
> struct flow_keys layout
> 
> On Thu, Jan 14, 2016 at 10:35 AM, Haiyang Zhang <haiyangz@microsoft.com>
> wrote:
> >
> >
> >> -----Original Message-----
> >> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> >> Sent: Thursday, January 14, 2016 1:24 PM
> >> To: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
> >> Cc: Tom Herbert <tom@herbertland.com>; Haiyang Zhang
> >> <haiyangz@microsoft.com>; David Miller <davem@davemloft.net>;
> >> vkuznets@redhat.com; netdev@vger.kernel.org; KY Srinivasan
> >> <kys@microsoft.com>; devel@linuxdriverproject.org; linux-
> >> kernel@vger.kernel.org
> >> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on
> >> struct flow_keys layout
> >>
> >> On Thu, 2016-01-14 at 17:53 +0000, One Thousand Gnomes wrote:
> >> > > These results for Toeplitz are not plausible. Given random input
> you
> >> > > cannot expect any hash function to produce such uniform results.
> I
> >> > > suspect either your input data is biased or how your applying the
> >> hash
> >> > > is.
> >> > >
> >> > > When I run 64 random IPv4 3-tuples through Toeplitz and Jenkins I
> >> get
> >> > > something more reasonable:
> >> >
> >> > IPv4 address patterns are not random. Nothing like it. A long long
> >> time
> >> > ago we did do a bunch of tuning for network hashes using big porn
> site
> >> > data sets. Random it was not.
> >> >
> >>
> >> I ran my tests with non random IPV4 addresses, as I had 2 hosts,
> >> one server, one client. (typical benchmark stuff)
> >>
> >> The only 'random' part was the ports, so maybe ~20 bits of entropy,
> >> considering how we allocate ports during connect() to a given
> >> destination to avoid port reuse.
> >>
> >> > It's probably hard to repeat that exercise now with geo specific
> >> routing,
> >> > and all the front end caches and redirectors on big sites but I'd
> >> > strongly suggest random input is not a good test, and also that you
> >> need
> >> > to worry more about hash attacks than perfect distributions.
> >>
> >> Anyway, the exercise is not to find a hash that exactly splits 128
> flows
> >> into 16 buckets, according to the number of flows per bucket.
> >>
> >> Maybe only 4 flows are sending at 3Gbits, and others are sending at
> 100
> >> kbits. There is no way the driver can predict the future.
> >>
> >> This is why we prefer to select a queue given the cpu sending the
> >> packet. This permits a natural shift based on actual load, and is the
> >> default on linux (see XPS in Documentation/networking/scaling.txt)
> >>
> >> Only this driver has a selection based on a flow 'hash'.
> >
> > Also, the port number selection may not be random either. For example,
> > the well-known network throughput test tool, iperf, use port numbers
> with
> > equal increment among them. We tested these non-random cases, and
> found
> > the Toeplitz hash has distributed evenly, but Jenkins hash has non-
> even
> > distribution.
> >
> > I'm aware of the test from Tom Herbert <tom@herbertland.com>, which
> > showing similar results of Toeplitz v.s. Jenkins with random inputs.
> >
> > In summary, the Toeplitz performs better in case of non-random inputs,
> > and performs similar to Jenkins in random inputs (which may not be the
> > case in real world). So we still prefer to use Toeplitz hash.
> >
> You are basing your conclusions on one toy benchmark. I don't believe
> that an realistically loaded web server is going to consistently give
> you tuples that happen to somehow fit into a nice model so that the
> bias benefits your load distribution.
> 
> > To minimize the computational overhead, we may consider put the hash
> > in a per-connection cache in TCP layer, so it only needs one time
> > computation. But, even with the computation overhead at this moment,
> > the throughput based on Toeplitz hash is better than Jenkins:
> > Throughput (Gbps) comparison:
> > #conn           Toeplitz        Jenkins
> > 32              26.6            23.2
> > 64              32.1            23.4
> > 128             29.1            24.1
> >
> You don't need to do that. We already store a random hash value in the
> connection context. If you want to make it non-random then just
> replace that with a simple global counter. This will have the exact
> same effect that you see in your tests without needing any expensive
> computation.

Could you point me to the data field of the connection context where this 
hash value is stored? Is it computed only once?

Thanks!

- Haiyang

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-14 19:15                 ` Haiyang Zhang
  (?)
@ 2016-01-14 19:41                 ` Tom Herbert
  2016-01-14 20:23                     ` Haiyang Zhang
  -1 siblings, 1 reply; 43+ messages in thread
From: Tom Herbert @ 2016-01-14 19:41 UTC (permalink / raw)
  To: Haiyang Zhang
  Cc: Eric Dumazet, One Thousand Gnomes, David Miller, vkuznets,
	netdev, KY Srinivasan, devel, linux-kernel

On Thu, Jan 14, 2016 at 11:15 AM, Haiyang Zhang <haiyangz@microsoft.com> wrote:
>
>
>> -----Original Message-----
>> From: Tom Herbert [mailto:tom@herbertland.com]
>> Sent: Thursday, January 14, 2016 1:49 PM
>> To: Haiyang Zhang <haiyangz@microsoft.com>
>> Cc: Eric Dumazet <eric.dumazet@gmail.com>; One Thousand Gnomes
>> <gnomes@lxorguk.ukuu.org.uk>; David Miller <davem@davemloft.net>;
>> vkuznets@redhat.com; netdev@vger.kernel.org; KY Srinivasan
>> <kys@microsoft.com>; devel@linuxdriverproject.org; linux-
>> kernel@vger.kernel.org
>> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on
>> struct flow_keys layout
>>
>> On Thu, Jan 14, 2016 at 10:35 AM, Haiyang Zhang <haiyangz@microsoft.com>
>> wrote:
>> >
>> >
>> >> -----Original Message-----
>> >> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
>> >> Sent: Thursday, January 14, 2016 1:24 PM
>> >> To: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
>> >> Cc: Tom Herbert <tom@herbertland.com>; Haiyang Zhang
>> >> <haiyangz@microsoft.com>; David Miller <davem@davemloft.net>;
>> >> vkuznets@redhat.com; netdev@vger.kernel.org; KY Srinivasan
>> >> <kys@microsoft.com>; devel@linuxdriverproject.org; linux-
>> >> kernel@vger.kernel.org
>> >> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on
>> >> struct flow_keys layout
>> >>
>> >> On Thu, 2016-01-14 at 17:53 +0000, One Thousand Gnomes wrote:
>> >> > > These results for Toeplitz are not plausible. Given random input
>> you
>> >> > > cannot expect any hash function to produce such uniform results.
>> I
>> >> > > suspect either your input data is biased or how your applying the
>> >> hash
>> >> > > is.
>> >> > >
>> >> > > When I run 64 random IPv4 3-tuples through Toeplitz and Jenkins I
>> >> get
>> >> > > something more reasonable:
>> >> >
>> >> > IPv4 address patterns are not random. Nothing like it. A long long
>> >> time
>> >> > ago we did do a bunch of tuning for network hashes using big porn
>> site
>> >> > data sets. Random it was not.
>> >> >
>> >>
>> >> I ran my tests with non random IPV4 addresses, as I had 2 hosts,
>> >> one server, one client. (typical benchmark stuff)
>> >>
>> >> The only 'random' part was the ports, so maybe ~20 bits of entropy,
>> >> considering how we allocate ports during connect() to a given
>> >> destination to avoid port reuse.
>> >>
>> >> > It's probably hard to repeat that exercise now with geo specific
>> >> routing,
>> >> > and all the front end caches and redirectors on big sites but I'd
>> >> > strongly suggest random input is not a good test, and also that you
>> >> need
>> >> > to worry more about hash attacks than perfect distributions.
>> >>
>> >> Anyway, the exercise is not to find a hash that exactly splits 128
>> flows
>> >> into 16 buckets, according to the number of flows per bucket.
>> >>
>> >> Maybe only 4 flows are sending at 3Gbits, and others are sending at
>> 100
>> >> kbits. There is no way the driver can predict the future.
>> >>
>> >> This is why we prefer to select a queue given the cpu sending the
>> >> packet. This permits a natural shift based on actual load, and is the
>> >> default on linux (see XPS in Documentation/networking/scaling.txt)
>> >>
>> >> Only this driver has a selection based on a flow 'hash'.
>> >
>> > Also, the port number selection may not be random either. For example,
>> > the well-known network throughput test tool, iperf, use port numbers
>> with
>> > equal increment among them. We tested these non-random cases, and
>> found
>> > the Toeplitz hash has distributed evenly, but Jenkins hash has non-
>> even
>> > distribution.
>> >
>> > I'm aware of the test from Tom Herbert <tom@herbertland.com>, which
>> > showing similar results of Toeplitz v.s. Jenkins with random inputs.
>> >
>> > In summary, the Toeplitz performs better in case of non-random inputs,
>> > and performs similar to Jenkins in random inputs (which may not be the
>> > case in real world). So we still prefer to use Toeplitz hash.
>> >
>> You are basing your conclusions on one toy benchmark. I don't believe
>> that an realistically loaded web server is going to consistently give
>> you tuples that happen to somehow fit into a nice model so that the
>> bias benefits your load distribution.
>>
>> > To minimize the computational overhead, we may consider put the hash
>> > in a per-connection cache in TCP layer, so it only needs one time
>> > computation. But, even with the computation overhead at this moment,
>> > the throughput based on Toeplitz hash is better than Jenkins:
>> > Throughput (Gbps) comparison:
>> > #conn           Toeplitz        Jenkins
>> > 32              26.6            23.2
>> > 64              32.1            23.4
>> > 128             29.1            24.1
>> >
>> You don't need to do that. We already store a random hash value in the
>> connection context. If you want to make it non-random then just
>> replace that with a simple global counter. This will have the exact
>> same effect that you see in your tests without needing any expensive
>> computation.
>
> Could you point me to the data field of connection context where this
> hash value is stored? Is it computed only one time?
>
sk_txhash in struct sock. It is set to a random number on the TCP or UDP
connect call, and it can be reset to a different random value when the
connection is seen to be having trouble (sk_rethink_txhash).
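
Roughly, the relevant helpers in include/net/sock.h look like this
(paraphrased sketch of that era's tree, not a verbatim copy):

	static inline u32 net_tx_rndhash(void)
	{
		u32 v = prandom_u32();

		return v ?: 1;
	}

	static inline void sk_set_txhash(struct sock *sk)
	{
		/* called from the TCP/UDP connect paths */
		sk->sk_txhash = net_tx_rndhash();
	}

	static inline void sk_rethink_txhash(struct sock *sk)
	{
		/* pick a new value only if the flow already has one */
		if (sk->sk_txhash)
			sk_set_txhash(sk);
	}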

Also when you say "Toeplitz performs better in case of non-random
inputs" please quantify exactly how your input data is not random.
What header changes with each connection in your test...

> Thanks!
>
> - Haiyang
>
>
>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-14 19:41                 ` Tom Herbert
@ 2016-01-14 20:23                     ` Haiyang Zhang
  0 siblings, 0 replies; 43+ messages in thread
From: Haiyang Zhang @ 2016-01-14 20:23 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Eric Dumazet, One Thousand Gnomes, David Miller, vkuznets,
	netdev, KY Srinivasan, devel, linux-kernel



> -----Original Message-----
> From: Tom Herbert [mailto:tom@herbertland.com]
> Sent: Thursday, January 14, 2016 2:41 PM
> To: Haiyang Zhang <haiyangz@microsoft.com>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>; One Thousand Gnomes
> <gnomes@lxorguk.ukuu.org.uk>; David Miller <davem@davemloft.net>;
> vkuznets@redhat.com; netdev@vger.kernel.org; KY Srinivasan
> <kys@microsoft.com>; devel@linuxdriverproject.org; linux-
> kernel@vger.kernel.org
> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on
> struct flow_keys layout
> 
> On Thu, Jan 14, 2016 at 11:15 AM, Haiyang Zhang <haiyangz@microsoft.com>
> wrote:
> >
> >
> >> -----Original Message-----
> >> From: Tom Herbert [mailto:tom@herbertland.com]
> >> Sent: Thursday, January 14, 2016 1:49 PM
> >> To: Haiyang Zhang <haiyangz@microsoft.com>
> >> Cc: Eric Dumazet <eric.dumazet@gmail.com>; One Thousand Gnomes
> >> <gnomes@lxorguk.ukuu.org.uk>; David Miller <davem@davemloft.net>;
> >> vkuznets@redhat.com; netdev@vger.kernel.org; KY Srinivasan
> >> <kys@microsoft.com>; devel@linuxdriverproject.org; linux-
> >> kernel@vger.kernel.org
> >> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on
> >> struct flow_keys layout
> >>
> >> On Thu, Jan 14, 2016 at 10:35 AM, Haiyang Zhang
> <haiyangz@microsoft.com>
> >> wrote:
> >> >
> >> >
> >> >> -----Original Message-----
> >> >> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> >> >> Sent: Thursday, January 14, 2016 1:24 PM
> >> >> To: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
> >> >> Cc: Tom Herbert <tom@herbertland.com>; Haiyang Zhang
> >> >> <haiyangz@microsoft.com>; David Miller <davem@davemloft.net>;
> >> >> vkuznets@redhat.com; netdev@vger.kernel.org; KY Srinivasan
> >> >> <kys@microsoft.com>; devel@linuxdriverproject.org; linux-
> >> >> kernel@vger.kernel.org
> >> >> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on
> >> >> struct flow_keys layout
> >> >>
> >> >> On Thu, 2016-01-14 at 17:53 +0000, One Thousand Gnomes wrote:
> >> >> > > These results for Toeplitz are not plausible. Given random
> input
> >> you
> >> >> > > cannot expect any hash function to produce such uniform
> results.
> >> I
> >> >> > > suspect either your input data is biased or how your applying
> the
> >> >> hash
> >> >> > > is.
> >> >> > >
> >> >> > > When I run 64 random IPv4 3-tuples through Toeplitz and
> Jenkins I
> >> >> get
> >> >> > > something more reasonable:
> >> >> >
> >> >> > IPv4 address patterns are not random. Nothing like it. A long
> long
> >> >> time
> >> >> > ago we did do a bunch of tuning for network hashes using big
> porn
> >> site
> >> >> > data sets. Random it was not.
> >> >> >
> >> >>
> >> >> I ran my tests with non random IPV4 addresses, as I had 2 hosts,
> >> >> one server, one client. (typical benchmark stuff)
> >> >>
> >> >> The only 'random' part was the ports, so maybe ~20 bits of entropy,
> >> >> considering how we allocate ports during connect() to a given
> >> >> destination to avoid port reuse.
> >> >>
> >> >> > It's probably hard to repeat that exercise now with geo specific
> >> >> routing,
> >> >> > and all the front end caches and redirectors on big sites but
> I'd
> >> >> > strongly suggest random input is not a good test, and also that
> you
> >> >> need
> >> >> > to worry more about hash attacks than perfect distributions.
> >> >>
> >> >> Anyway, the exercise is not to find a hash that exactly splits 128
> >> flows
> >> >> into 16 buckets, according to the number of flows per bucket.
> >> >>
> >> >> Maybe only 4 flows are sending at 3Gbits, and others are sending
> at
> >> 100
> >> >> kbits. There is no way the driver can predict the future.
> >> >>
> >> >> This is why we prefer to select a queue given the cpu sending the
> >> >> packet. This permits a natural shift based on actual load, and is
> the
> >> >> default on linux (see XPS in Documentation/networking/scaling.txt)
> >> >>
> >> >> Only this driver has a selection based on a flow 'hash'.
> >> >
> >> > Also, the port number selection may not be random either. For
> example,
> >> > the well-known network throughput test tool, iperf, use port
> numbers
> >> with
> >> > equal increment among them. We tested these non-random cases, and
> >> found
> >> > the Toeplitz hash has distributed evenly, but Jenkins hash has non-
> >> even
> >> > distribution.
> >> >
> >> > I'm aware of the test from Tom Herbert <tom@herbertland.com>, which
> >> > showing similar results of Toeplitz v.s. Jenkins with random inputs.
> >> >
> >> > In summary, the Toeplitz performs better in case of non-random
> inputs,
> >> > and performs similar to Jenkins in random inputs (which may not be
> the
> >> > case in real world). So we still prefer to use Toeplitz hash.
> >> >
> >> You are basing your conclusions on one toy benchmark. I don't believe
> >> that an realistically loaded web server is going to consistently give
> >> you tuples that happen to somehow fit into a nice model so that the
> >> bias benefits your load distribution.
> >>
> >> > To minimize the computational overhead, we may consider put the
> hash
> >> > in a per-connection cache in TCP layer, so it only needs one time
> >> > computation. But, even with the computation overhead at this moment,
> >> > the throughput based on Toeplitz hash is better than Jenkins:
> >> > Throughput (Gbps) comparison:
> >> > #conn           Toeplitz        Jenkins
> >> > 32              26.6            23.2
> >> > 64              32.1            23.4
> >> > 128             29.1            24.1
> >> >
> >> You don't need to do that. We already store a random hash value in
> the
> >> connection context. If you want to make it non-random then just
> >> replace that with a simple global counter. This will have the exact
> >> same effect that you see in your tests without needing any expensive
> >> computation.
> >
> > Could you point me to the data field of connection context where this
> > hash value is stored? Is it computed only one time?
> >
> sk_txhash in struct sock. It is set to a random number on TCP or UDP
> connect call, It can be reset to a different random value when
> connection is seen to be have trouble (sk_rethink_txhash).
> 
> Also when you say "Toeplitz performs better in case of non-random
> inputs" please quantify exactly how your input data is not random.
> What header changes with each connection in your test...

Thank you for the info! 

For non-random inputs, I used the port selection of iperf, which
increases the port number by 2 for each connection. Only the send-side
port numbers differ; all other values are the same. I also tested some
other fixed increments, and Toeplitz spread the connections evenly. For
real applications, if the load comes from a local area, the IP/port
combinations are likely to have some non-random patterns.

For our driver, we are thinking of putting the Toeplitz hash into
sk_txhash, so it only needs to be computed once, or again during
sk_rethink_txhash. So the computational overhead is incurred essentially
only once.
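
Something along these lines is what we have in mind -- only a sketch;
netvsc_toeplitz_hash() is a made-up placeholder for the hash
computation, not existing driver code:

	#include <linux/netdevice.h>
	#include <net/sock.h>

	/* hypothetical helper computing the Toeplitz hash over the skb's flow */
	u32 netvsc_toeplitz_hash(struct sk_buff *skb);

	static u16 netvsc_pick_tx_queue(struct net_device *ndev,
					struct sk_buff *skb)
	{
		struct sock *sk = skb->sk;
		u32 hash;

		/* reuse the per-socket cached value when there is one */
		if (sk && sk_fullsock(sk) && sk->sk_txhash)
			return sk->sk_txhash % ndev->real_num_tx_queues;

		hash = netvsc_toeplitz_hash(skb);

		/* cache it so later packets on this flow skip the work */
		if (sk && sk_fullsock(sk))
			sk->sk_txhash = hash;

		return hash % ndev->real_num_tx_queues;
	}

(Writing sk->sk_txhash from a driver is only meant to illustrate the
caching idea; whether that is acceptable is part of this discussion.)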

Thanks,
- Haiyang

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-14 20:23                     ` Haiyang Zhang
@ 2016-01-14 21:44                       ` Tom Herbert
  -1 siblings, 0 replies; 43+ messages in thread
From: Tom Herbert @ 2016-01-14 21:44 UTC (permalink / raw)
  To: Haiyang Zhang
  Cc: Eric Dumazet, One Thousand Gnomes, David Miller, vkuznets,
	netdev, KY Srinivasan, devel, linux-kernel

On Thu, Jan 14, 2016 at 12:23 PM, Haiyang Zhang <haiyangz@microsoft.com> wrote:
>
>
>> -----Original Message-----
>> From: Tom Herbert [mailto:tom@herbertland.com]
>> Sent: Thursday, January 14, 2016 2:41 PM
>> To: Haiyang Zhang <haiyangz@microsoft.com>
>> Cc: Eric Dumazet <eric.dumazet@gmail.com>; One Thousand Gnomes
>> <gnomes@lxorguk.ukuu.org.uk>; David Miller <davem@davemloft.net>;
>> vkuznets@redhat.com; netdev@vger.kernel.org; KY Srinivasan
>> <kys@microsoft.com>; devel@linuxdriverproject.org; linux-
>> kernel@vger.kernel.org
>> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on
>> struct flow_keys layout
>>
>> On Thu, Jan 14, 2016 at 11:15 AM, Haiyang Zhang <haiyangz@microsoft.com>
>> wrote:
>> >
>> >
>> >> -----Original Message-----
>> >> From: Tom Herbert [mailto:tom@herbertland.com]
>> >> Sent: Thursday, January 14, 2016 1:49 PM
>> >> To: Haiyang Zhang <haiyangz@microsoft.com>
>> >> Cc: Eric Dumazet <eric.dumazet@gmail.com>; One Thousand Gnomes
>> >> <gnomes@lxorguk.ukuu.org.uk>; David Miller <davem@davemloft.net>;
>> >> vkuznets@redhat.com; netdev@vger.kernel.org; KY Srinivasan
>> >> <kys@microsoft.com>; devel@linuxdriverproject.org; linux-
>> >> kernel@vger.kernel.org
>> >> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on
>> >> struct flow_keys layout
>> >>
>> >> On Thu, Jan 14, 2016 at 10:35 AM, Haiyang Zhang
>> <haiyangz@microsoft.com>
>> >> wrote:
>> >> >
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
>> >> >> Sent: Thursday, January 14, 2016 1:24 PM
>> >> >> To: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
>> >> >> Cc: Tom Herbert <tom@herbertland.com>; Haiyang Zhang
>> >> >> <haiyangz@microsoft.com>; David Miller <davem@davemloft.net>;
>> >> >> vkuznets@redhat.com; netdev@vger.kernel.org; KY Srinivasan
>> >> >> <kys@microsoft.com>; devel@linuxdriverproject.org; linux-
>> >> >> kernel@vger.kernel.org
>> >> >> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on
>> >> >> struct flow_keys layout
>> >> >>
>> >> >> On Thu, 2016-01-14 at 17:53 +0000, One Thousand Gnomes wrote:
>> >> >> > > These results for Toeplitz are not plausible. Given random
>> input
>> >> you
>> >> >> > > cannot expect any hash function to produce such uniform
>> results.
>> >> I
>> >> >> > > suspect either your input data is biased or how your applying
>> the
>> >> >> hash
>> >> >> > > is.
>> >> >> > >
>> >> >> > > When I run 64 random IPv4 3-tuples through Toeplitz and
>> Jenkins I
>> >> >> get
>> >> >> > > something more reasonable:
>> >> >> >
>> >> >> > IPv4 address patterns are not random. Nothing like it. A long
>> long
>> >> >> time
>> >> >> > ago we did do a bunch of tuning for network hashes using big
>> porn
>> >> site
>> >> >> > data sets. Random it was not.
>> >> >> >
>> >> >>
>> >> >> I ran my tests with non random IPV4 addresses, as I had 2 hosts,
>> >> >> one server, one client. (typical benchmark stuff)
>> >> >>
>> >> >> The only 'random' part was the ports, so maybe ~20 bits of entropy,
>> >> >> considering how we allocate ports during connect() to a given
>> >> >> destination to avoid port reuse.
>> >> >>
>> >> >> > It's probably hard to repeat that exercise now with geo specific
>> >> >> routing,
>> >> >> > and all the front end caches and redirectors on big sites but
>> I'd
>> >> >> > strongly suggest random input is not a good test, and also that
>> you
>> >> >> need
>> >> >> > to worry more about hash attacks than perfect distributions.
>> >> >>
>> >> >> Anyway, the exercise is not to find a hash that exactly splits 128
>> >> flows
>> >> >> into 16 buckets, according to the number of flows per bucket.
>> >> >>
>> >> >> Maybe only 4 flows are sending at 3Gbits, and others are sending
>> at
>> >> 100
>> >> >> kbits. There is no way the driver can predict the future.
>> >> >>
>> >> >> This is why we prefer to select a queue given the cpu sending the
>> >> >> packet. This permits a natural shift based on actual load, and is
>> the
>> >> >> default on linux (see XPS in Documentation/networking/scaling.txt)
>> >> >>
>> >> >> Only this driver has a selection based on a flow 'hash'.
>> >> >
>> >> > Also, the port number selection may not be random either. For
>> example,
>> >> > the well-known network throughput test tool, iperf, use port
>> numbers
>> >> with
>> >> > equal increment among them. We tested these non-random cases, and
>> >> found
>> >> > the Toeplitz hash has distributed evenly, but Jenkins hash has non-
>> >> even
>> >> > distribution.
>> >> >
>> >> > I'm aware of the test from Tom Herbert <tom@herbertland.com>, which
>> >> > showing similar results of Toeplitz v.s. Jenkins with random inputs.
>> >> >
>> >> > In summary, the Toeplitz performs better in case of non-random
>> inputs,
>> >> > and performs similar to Jenkins in random inputs (which may not be
>> the
>> >> > case in real world). So we still prefer to use Toeplitz hash.
>> >> >
>> >> You are basing your conclusions on one toy benchmark. I don't believe
>> >> that an realistically loaded web server is going to consistently give
>> >> you tuples that happen to somehow fit into a nice model so that the
>> >> bias benefits your load distribution.
>> >>
>> >> > To minimize the computational overhead, we may consider put the
>> hash
>> >> > in a per-connection cache in TCP layer, so it only needs one time
>> >> > computation. But, even with the computation overhead at this moment,
>> >> > the throughput based on Toeplitz hash is better than Jenkins:
>> >> > Throughput (Gbps) comparison:
>> >> > #conn           Toeplitz        Jenkins
>> >> > 32              26.6            23.2
>> >> > 64              32.1            23.4
>> >> > 128             29.1            24.1
>> >> >
>> >> You don't need to do that. We already store a random hash value in
>> the
>> >> connection context. If you want to make it non-random then just
>> >> replace that with a simple global counter. This will have the exact
>> >> same effect that you see in your tests without needing any expensive
>> >> computation.
>> >
>> > Could you point me to the data field of connection context where this
>> > hash value is stored? Is it computed only one time?
>> >
>> sk_txhash in struct sock. It is set to a random number on TCP or UDP
>> connect call, It can be reset to a different random value when
>> connection is seen to be have trouble (sk_rethink_txhash).
>>
>> Also when you say "Toeplitz performs better in case of non-random
>> inputs" please quantify exactly how your input data is not random.
>> What header changes with each connection in your test...
>
> Thank you for the info!
>
> For non-random inputs, I used the port selection of iperf that increases
> the port number by 2 for each connection. Only send-port numbers are
> different, other values are the same. I also tested some other fixed
> increment, Toeplitz spreads the connections evenly. For real applications,
> if the load came from local area, then the IP/port combinations are
> likely to have some non-random patterns.
>
Okay, by changing only the source port I can produce the same uniformity:

64 connections with a step of 2 for changing source port gives:

Buckets: 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4

_but_ I can also find steps that severely mess up the load
distribution. A step of 1024 gives:

Buckets: 0 8 8 0 0 8 8 0 8 0 0 8 8 0 0 8

The fact that we can negatively affect the output of Toeplitz so
predictably is actually a liability, not a benefit. This sort of thing
can be the basis of a DoS attack, and it is why we kicked out the XOR
hash in favor of Jenkins.
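
For anyone who wants to reproduce this kind of experiment, a minimal
user-space sketch is below; the 40-byte key and the hash & 0xf bucket
mapping are arbitrary example choices, so the exact counts will vary
with both:

	#include <stdio.h>
	#include <stdint.h>
	#include <string.h>
	#include <arpa/inet.h>

	/* any 40-byte key works for the experiment; this one is just an example */
	static const uint8_t key[40] = {
		0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
		0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
		0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
		0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
		0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa,
	};

	static uint32_t toeplitz(const uint8_t *data, int len)
	{
		/* 32-bit window over the key, slid one bit per input bit */
		uint32_t win = ((uint32_t)key[0] << 24) | (key[1] << 16) |
			       (key[2] << 8) | key[3];
		uint32_t hash = 0;
		int i, b;

		for (i = 0; i < len; i++) {
			for (b = 7; b >= 0; b--) {
				if (data[i] & (1 << b))
					hash ^= win;
				win = (win << 1) | ((key[i + 4] >> b) & 1);
			}
		}
		return hash;
	}

	int main(void)
	{
		uint32_t saddr = inet_addr("192.168.1.1");
		uint32_t daddr = inet_addr("192.168.1.2");
		uint16_t dport = htons(5001);
		int buckets[16] = { 0 };
		int step = 2;		/* try 1024 to see it fall apart */
		int i;

		for (i = 0; i < 64; i++) {
			uint16_t sport = htons(40000 + i * step);
			uint8_t tuple[12];

			/* RSS input order: src addr, dst addr, src port, dst port */
			memcpy(tuple, &saddr, 4);
			memcpy(tuple + 4, &daddr, 4);
			memcpy(tuple + 8, &sport, 2);
			memcpy(tuple + 10, &dport, 2);
			buckets[toeplitz(tuple, 12) & 0xf]++;
		}
		for (i = 0; i < 16; i++)
			printf("%d ", buckets[i]);
		printf("\n");
		return 0;
	}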

> For our driver, we are thinking to put the Toeplitz hash to the sk_txhash,
> so it only needs to be computed only once, or during sk_rethink_txhash.
> So, the computational overhead happens almost only once.
>
> Thanks,
> - Haiyang
>
>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-14 21:44                       ` Tom Herbert
  (?)
@ 2016-01-14 22:06                       ` David Miller
  -1 siblings, 0 replies; 43+ messages in thread
From: David Miller @ 2016-01-14 22:06 UTC (permalink / raw)
  To: tom
  Cc: haiyangz, eric.dumazet, gnomes, vkuznets, netdev, kys, devel,
	linux-kernel

From: Tom Herbert <tom@herbertland.com>
Date: Thu, 14 Jan 2016 13:44:24 -0800

> The fact that we can negatively affect the output of Toeplitz so
> predictably is actually a liability and not a benefit. This sort of
> thing can be the basis of a DOS attack and is why we kicked out XOR
> hash in favor of Jenkins.

+1

Toeplitz should not be used for any software calculated flow hash
whatsoever.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-14 20:23                     ` Haiyang Zhang
@ 2016-01-14 22:08                       ` Eric Dumazet
  -1 siblings, 0 replies; 43+ messages in thread
From: Eric Dumazet @ 2016-01-14 22:08 UTC (permalink / raw)
  To: Haiyang Zhang
  Cc: Tom Herbert, One Thousand Gnomes, David Miller, vkuznets, netdev,
	KY Srinivasan, devel, linux-kernel

On Thu, 2016-01-14 at 20:23 +0000, Haiyang Zhang wrote:
> 


> For non-random inputs, I used the port selection of iperf that increases 
> the port number by 2 for each connection. Only send-port numbers are 
> different, other values are the same. I also tested some other fixed 
> increment, Toeplitz spreads the connections evenly. For real applications, 
> if the load came from local area, then the IP/port combinations are 
> likely to have some non-random patterns.

We are not putting code into the core networking stack that favors
non-secure behavior.

The +2 behavior for connections from A to B:<fixed port> is something
we will eventually remove. It used to be +1 not long ago...

Say we implement the following:

https://tools.ietf.org/html/rfc6056#section-3.3.4
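
(Algorithm 4 there is roughly the double-hash scheme sketched below;
this is only a sketch -- keyed_hash(), the table size and the port
range are placeholders, not what the kernel implements today:)

	#include <stdint.h>

	/* placeholders: any keyed hash over the 3-tuple, and an in-use check */
	uint32_t keyed_hash(uint32_t saddr, uint32_t daddr, uint16_t dport,
			    uint64_t secret);
	int port_in_use(uint16_t port);

	#define MIN_EPHEMERAL	32768
	#define MAX_EPHEMERAL	60999
	#define TABLE_SIZE	256

	static uint16_t perturb[TABLE_SIZE];

	int pick_ephemeral_port(uint32_t saddr, uint32_t daddr, uint16_t dport,
				uint64_t secret1, uint64_t secret2)
	{
		uint32_t num = MAX_EPHEMERAL - MIN_EPHEMERAL + 1;
		uint32_t offset = keyed_hash(saddr, daddr, dport, secret1);
		uint32_t index = keyed_hash(saddr, daddr, dport, secret2) %
				 TABLE_SIZE;
		uint32_t tries;

		for (tries = 0; tries < num; tries++) {
			uint16_t port = MIN_EPHEMERAL +
					(offset + perturb[index]) % num;

			/* the per-bucket counter, not the port itself, advances */
			perturb[index]++;
			if (!port_in_use(port))
				return port;
		}
		return -1;	/* ephemeral range exhausted */
	}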


The fact that the Toeplitz hash has this linear property is not a
valid reason to help attackers exploit vulnerabilities.

In my tests I was using netperf, which randomizes both source &
destination ports.

This is why I could not reproduce your results, which are based on
iperf: it generates 5-tuples in a totally predictable way.

This reminds me that some drivers shipped with a well-known Toeplitz
RSS key, allowing attackers to direct their traffic at a single queue.

I guess we could replace the sk_txhash generator with a simple linear
allocator and, boom, your driver would be pleased.

But this is only for a very specific workload.

diff --git a/include/net/sock.h b/include/net/sock.h
index e830c1006935..949527413cfb 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1689,7 +1689,8 @@ unsigned long sock_i_ino(struct sock *sk);
 
 static inline u32 net_tx_rndhash(void)
 {
-       u32 v = prandom_u32();
+       static u32 last_hash;
+       u32 v = ++last_hash; // do not care about SMP races.
 
        return v ?: 1;
 }

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* RE: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
  2016-01-14 22:08                       ` Eric Dumazet
@ 2016-01-14 22:29                         ` Haiyang Zhang
  -1 siblings, 0 replies; 43+ messages in thread
From: Haiyang Zhang @ 2016-01-14 22:29 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Tom Herbert, One Thousand Gnomes, David Miller, vkuznets, netdev,
	KY Srinivasan, devel, linux-kernel



> -----Original Message-----
> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> Sent: Thursday, January 14, 2016 5:08 PM
> To: Haiyang Zhang <haiyangz@microsoft.com>
> Cc: Tom Herbert <tom@herbertland.com>; One Thousand Gnomes
> <gnomes@lxorguk.ukuu.org.uk>; David Miller <davem@davemloft.net>;
> vkuznets@redhat.com; netdev@vger.kernel.org; KY Srinivasan
> <kys@microsoft.com>; devel@linuxdriverproject.org; linux-
> kernel@vger.kernel.org
> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on
> struct flow_keys layout
> 
> On Thu, 2016-01-14 at 20:23 +0000, Haiyang Zhang wrote:
> >
> 
> 
> > For non-random inputs, I used the port selection of iperf that
> increases
> > the port number by 2 for each connection. Only send-port numbers are
> > different, other values are the same. I also tested some other fixed
> > increment, Toeplitz spreads the connections evenly. For real
> applications,
> > if the load came from local area, then the IP/port combinations are
> > likely to have some non-random patterns.
> 
> We are not putting code in core networking stack favoring non secure
> behavior.
> 
> The +2 behavior for connections from A to B:<fixed port> is something
> that we will eventually remove in the future. It used to be +1 not a
> long time ago...
> 
> Say if we implement the following,
> 
> https://tools.ietf.org/html/rfc6056#section-3.3.4
> 
> 
> The fact that Toeplitz hash has this linear property should not be a
> valid reason to help hackers to exploit vulnerabilities.
> 
> In my tests I was using netperf, which randomizes both source &
> destination ports.
> 
> This is why I could not reproduce your results based on iperf, which
> generates 5-tuple in a totally predictable way.
> 
> This reminds me some drivers had a well known Toeplitz RSS key, allowing
> attackers to direct their attack on a single queue.
> 
> I guess we could replace sk_txhash generator by a simple linear
> allocator and boom, your driver will be pleased.
> 
> But this is only for a very specific workload.
> 
> diff --git a/include/net/sock.h b/include/net/sock.h
> index e830c1006935..949527413cfb 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -1689,7 +1689,8 @@ unsigned long sock_i_ino(struct sock *sk);
> 
>  static inline u32 net_tx_rndhash(void)
>  {
> -       u32 v = prandom_u32();
> +       static u32 last_hash;
> +       u32 v = ++last_hash; // do not care about SMP races.
> 
>         return v ?: 1;
>  }

Tom, thanks for your test -- I was not able to reproduce the 
"0 8 8 0 0 8 8 0 8 0 0 8 8 0 0 8" distribution, but I did see some 
predictable patterns when using increments such as 512... 

Tom, Dave, and Eric -- I share your concerns about a potential DoS 
attack on predictable patterns. We will rethink this.

Thanks,
- Haiyang

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2016-01-14 22:43 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-07  9:33 [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout Vitaly Kuznetsov
2016-01-07  9:33 ` Vitaly Kuznetsov
2016-01-07 12:52 ` Eric Dumazet
2016-01-07 13:28   ` Vitaly Kuznetsov
2016-01-07 13:28     ` Vitaly Kuznetsov
2016-01-08  1:02     ` John Fastabend
2016-01-08  3:49       ` KY Srinivasan
2016-01-08  3:49         ` KY Srinivasan
2016-01-08  6:16         ` John Fastabend
2016-01-08  6:16           ` John Fastabend
2016-01-08 18:01           ` KY Srinivasan
2016-01-08 21:07     ` Haiyang Zhang
2016-01-08 21:07       ` Haiyang Zhang
2016-01-09  0:17   ` Tom Herbert
2016-01-09  0:17     ` Tom Herbert
2016-01-10 22:25 ` David Miller
2016-01-10 22:25   ` David Miller
2016-01-13 23:10   ` Haiyang Zhang
2016-01-13 23:10     ` Haiyang Zhang
2016-01-14  4:56     ` David Miller
2016-01-14  4:56       ` David Miller
2016-01-14 17:14     ` Tom Herbert
2016-01-14 17:14       ` Tom Herbert
2016-01-14 17:53       ` One Thousand Gnomes
2016-01-14 17:53         ` One Thousand Gnomes
2016-01-14 18:24         ` Eric Dumazet
2016-01-14 18:24           ` Eric Dumazet
2016-01-14 18:35           ` Haiyang Zhang
2016-01-14 18:35             ` Haiyang Zhang
2016-01-14 18:48             ` Tom Herbert
2016-01-14 19:15               ` Haiyang Zhang
2016-01-14 19:15                 ` Haiyang Zhang
2016-01-14 19:41                 ` Tom Herbert
2016-01-14 20:23                   ` Haiyang Zhang
2016-01-14 20:23                     ` Haiyang Zhang
2016-01-14 21:44                     ` Tom Herbert
2016-01-14 21:44                       ` Tom Herbert
2016-01-14 22:06                       ` David Miller
2016-01-14 22:08                     ` Eric Dumazet
2016-01-14 22:08                       ` Eric Dumazet
2016-01-14 22:29                       ` Haiyang Zhang
2016-01-14 22:29                         ` Haiyang Zhang
2016-01-14 17:53     ` Eric Dumazet
