From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756699AbcANRxS (ORCPT ); Thu, 14 Jan 2016 12:53:18 -0500
Received: from mail-pa0-f49.google.com ([209.85.220.49]:35693 "EHLO
	mail-pa0-f49.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1755592AbcANRxQ (ORCPT ); Thu, 14 Jan 2016 12:53:16 -0500
Message-ID: <1452793993.1223.102.camel@edumazet-glaptop2.roam.corp.google.com>
Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
From: Eric Dumazet
To: Haiyang Zhang
Cc: David Miller , "vkuznets@redhat.com" , "netdev@vger.kernel.org" ,
	KY Srinivasan , "devel@linuxdriverproject.org" ,
	"linux-kernel@vger.kernel.org"
Date: Thu, 14 Jan 2016 09:53:13 -0800
In-Reply-To:
References: <1452159189-11473-1-git-send-email-vkuznets@redhat.com>
	<20160110.172558.367101858392871618.davem@davemloft.net>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.10.4-0ubuntu2
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2016-01-13 at 23:10 +0000, Haiyang Zhang wrote:

> I have done a comparison of the Toeplitz v.s. Jenkins Hash algorithms,
> and found that the Toeplitz provides much better distribution of the
> connections into send-indirection-table entries. See the data below --
> showing how many TCP connections are distributed into each of the
> sixteen table entries. The Toeplitz hash distributes the connections
> almost perfectly evenly, but the Jenkins hash distributes them unevenly.
> For example, in case of 64 connections, some entries are 0 or 1, some
> other entries are 8. This could cause too many connections in one VMBus
> channel and slow down the throughput.

So a VMBus channel has a limit of number of flows ? Why is it so ?

What happens with 1000 flows ?
> This is consistent to our test
> which showing slower performance while using the generic skb_get_hash
> (Jenkins) than using Toeplitz hash (see perf numbers below).
>
>
> #connections:32:
> Toeplitz:2,2,2,2,2,1,2,2,2,2,2,3,2,2,2,2,
> Jenkins:3,2,2,4,1,1,0,2,1,1,4,3,2,5,1,0,
> #connections:64:
> Toeplitz:4,4,5,4,4,3,4,4,4,4,4,4,4,4,4,4,
> Jenkins:4,5,4,6,3,5,0,6,1,2,8,3,6,8,2,1,
> #connections:128:
> Toeplitz:8,8,8,8,8,7,9,8,8,8,8,8,8,8,8,8,
> Jenkins:8,12,10,9,7,8,3,10,6,8,9,8,10,11,6,3,
>
> Throughput (Gbps) comparison:
> #conn		Toeplitz	Jenkins
> 32		26.6		23.2
> 64		32.1		23.4
> 128		29.1		24.1
>
> For long term solution, I think we should put the Toeplitz hash as
> another option to the generic hash function in kernel... But, for the
> time being, can you accept this patch to fix the assumptions on
> struct flow_keys layout?

I find your Toeplitz distribution has an anomaly.

Having 128 connections distributed almost _perfectly_ into 16 buckets is
telling something about how the source/destination ports were allocated
maybe, knowing the RSS key or something ?

It looks too _perfect_ to be true.
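One rough way to put a number on "too perfect" is to simulate a uniformly random hash and count how often 128 flows land in 16 buckets with every bucket holding between 7 and 9 flows, which is the range of the Toeplitz row quoted above. The sketch below is only an illustration: the function names, trial count, and seed are assumptions, and Python's `random` module stands in for whatever hash is actually used.

```python
import random


def bucket_counts(n_flows, n_buckets, rng):
    """Assign n_flows to buckets with a uniform 32-bit 'hash' taken modulo n_buckets."""
    counts = [0] * n_buckets
    for _ in range(n_flows):
        counts[rng.getrandbits(32) % n_buckets] += 1
    return counts


def fraction_near_perfect(n_flows=128, n_buckets=16, lo=7, hi=9,
                          trials=20000, seed=1):
    """Fraction of trials in which every bucket count lies in [lo, hi]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        counts = bucket_counts(n_flows, n_buckets, rng)
        if lo <= min(counts) and max(counts) <= hi:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Illustrative run: prints the observed fraction of near-perfect spreads.
    print(fraction_near_perfect())
```

If the hash really behaved like a uniform random function over independent flows, this fraction should come out very small, which is why such an even spread suggests the ports or the key were not chosen independently of the hash.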
Here is what I get from 20 runs of 128 sessions using
prandom_u32() hash, distributed to 16 buckets (hash % 16)

: 6,9,9,6,11,8,9,7,7,7,9,8,8,7,9,8
: 6,9,6,6,6,9,8,5,12,10,7,7,9,7,13,8
: 7,4,9,9,10,9,8,7,15,4,8,8,11,10,2,7
: 12,5,10,6,7,4,10,10,6,5,10,14,8,8,5,8
: 4,8,5,13,7,4,7,9,7,6,6,9,6,11,17,9
: 10,10,8,5,7,4,5,14,6,9,9,7,8,9,7,10
: 6,4,9,10,13,8,8,7,6,5,8,9,7,5,15,8
: 11,13,7,4,8,6,6,9,10,8,8,5,6,6,11,10
: 8,8,11,7,12,13,5,8,9,6,8,10,5,4,9,5
: 13,5,5,4,5,11,8,8,11,8,9,10,10,6,9,6
: 13,6,12,6,6,7,4,9,5,14,9,12,9,4,4,8
: 4,9,10,12,10,4,8,6,8,5,14,10,5,8,8,7
: 7,7,6,6,12,13,8,12,7,6,8,9,6,5,12,4
: 4,12,9,10,2,12,10,13,5,8,4,6,8,10,4,11
: 5,6,10,10,10,9,16,8,8,7,4,10,7,6,6,6
: 9,13,10,11,6,9,4,7,7,9,7,6,9,9,7,5
: 8,7,4,8,6,9,9,8,7,10,8,10,17,7,5,5
: 10,5,10,8,9,5,9,6,12,8,5,8,7,9,7,10
: 8,10,10,7,10,7,13,3,9,5,7,2,10,9,12,6
: 4,6,13,6,6,6,12,9,11,5,7,10,9,8,11,5

This looks more 'random' to me, and _if_ I use Jenkins hash I have the
same distribution.

Sure, it is not 'perfectly spread', but who said that all flows are
sending the same amount of traffic in the real world ?

Using Toeplitz hash is adding a cost of 300 ns per IPv6 packet.

TCP_RR (small RPC) workload would certainly not like to compute Toeplitz
for every packet.

I would prefer that we not add complexity just to make some benchmark
better.
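For context on where a per-packet cost can come from, here is a minimal, unoptimized sketch of a 32-bit Toeplitz hash of the kind used for RSS. It is not the hv_netvsc code. The names `toeplitz_hash` and `ipv6_tcp_input` and the example addresses are illustrative assumptions, and the key is the sample key that, as far as I know, appears in Microsoft's RSS verification documentation; any 40-byte key works for the sketch.

```python
import ipaddress
import struct

# Assumed sample key (40 bytes); substitute the key your device actually uses.
RSS_KEY = bytes([
    0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
    0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
    0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
    0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
    0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa,
])


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """XOR together the 32-bit key windows selected by each set input bit."""
    if len(key) < len(data) + 4:
        raise ValueError("key must be at least len(data) + 4 bytes long")
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for i in range(len(data) * 8):
        # Input bits are consumed most-significant bit first.
        if data[i // 8] & (0x80 >> (i % 8)):
            # 32-bit window of the key starting at bit offset i.
            result ^= (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result


def ipv6_tcp_input(src: str, dst: str, sport: int, dport: int) -> bytes:
    """Hash input for IPv6/TCP: source addr, dest addr, source port, dest port."""
    return (ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed
            + struct.pack("!HH", sport, dport))


if __name__ == "__main__":
    data = ipv6_tcp_input("2001:db8::1", "2001:db8::2", 40000, 443)
    print(len(data), hex(toeplitz_hash(RSS_KEY, data)))
```

An IPv6/TCP input is 36 bytes, so this naive form does up to 288 conditional XORs per packet. Real implementations use lookup tables or hardware offload, so the sketch only shows the shape of the work and says nothing precise about the 300 ns figure.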