From: Eric Dumazet <eric.dumazet@gmail.com>
To: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: Tom Herbert <tom@herbertland.com>, Haiyang Zhang <haiyangz@microsoft.com>, David Miller <davem@davemloft.net>, "vkuznets@redhat.com" <vkuznets@redhat.com>, "netdev@vger.kernel.org" <netdev@vger.kernel.org>, KY Srinivasan <kys@microsoft.com>, "devel@linuxdriverproject.org" <devel@linuxdriverproject.org>, "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct flow_keys layout
Date: Thu, 14 Jan 2016 10:24:09 -0800
Message-ID: <1452795849.1223.112.camel@edumazet-glaptop2.roam.corp.google.com>
In-Reply-To: <20160114175304.161ff0af@lxorguk.ukuu.org.uk>

On Thu, 2016-01-14 at 17:53 +0000, One Thousand Gnomes wrote:
> > These results for Toeplitz are not plausible. Given random input you
> > cannot expect any hash function to produce such uniform results. I
> > suspect either your input data is biased or how you're applying the
> > hash is.
> >
> > When I run 64 random IPv4 3-tuples through Toeplitz and Jenkins I get
> > something more reasonable:
>
> IPv4 address patterns are not random. Nothing like it. A long long time
> ago we did do a bunch of tuning for network hashes using big porn site
> data sets. Random it was not.

I ran my tests with non-random IPv4 addresses, as I had 2 hosts, one
server, one client. (typical benchmark stuff)

The only 'random' part was the ports, so maybe ~20 bits of entropy,
considering how we allocate ports during connect() to a given
destination to avoid port reuse.

> It's probably hard to repeat that exercise now with geo specific routing,
> and all the front end caches and redirectors on big sites but I'd
> strongly suggest random input is not a good test, and also that you need
> to worry more about hash attacks than perfect distributions.
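[Editorial note: Tom's point about uniformity is easy to check numerically. The sketch below is a from-scratch Toeplitz implementation in Python; the 40-byte RSS key and the random 3-tuples are arbitrary illustrative inputs, not the data anyone in the thread actually used. With genuinely random input, 128 flows over 16 buckets fluctuate around the mean of 8 rather than splitting perfectly evenly.]

```python
import random

def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Toeplitz hash as used by RSS: for every set bit i of the input,
    XOR in the 32-bit window of the key starting at key bit i."""
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    data_int = int.from_bytes(data, "big")
    data_bits = len(data) * 8
    h = 0
    for i in range(data_bits):
        if (data_int >> (data_bits - 1 - i)) & 1:
            h ^= (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return h

random.seed(1)                                          # reproducible demo
key = bytes(random.randrange(256) for _ in range(40))   # 40-byte RSS key

counts = [0] * 16
for _ in range(128):
    # Hypothetical random 3-tuple: src addr, dst addr, src port.
    data = (random.getrandbits(32).to_bytes(4, "big")
            + random.getrandbits(32).to_bytes(4, "big")
            + random.getrandbits(16).to_bytes(2, "big"))
    counts[toeplitz_hash(key, data) % 16] += 1

print(counts)  # counts hover around 8 per bucket, not exactly 8 each
```

The point of the sketch is only the shape of the distribution: with random input, some spread around the mean is expected from any hash, so a perfectly flat result is itself suspicious.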
Anyway, the exercise is not to find a hash that exactly splits 128 flows
into 16 buckets with an equal number of flows per bucket.

Maybe only 4 flows are sending at 3 Gbit/s, and the others are sending at
100 kbit/s. There is no way the driver can predict the future.

This is why we prefer to select a queue based on the cpu sending the
packet. This permits a natural shift based on actual load, and is the
default on Linux (see XPS in Documentation/networking/scaling.txt)

Only this driver selects the queue based on a flow 'hash'.
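[Editorial note: the elephant-flow example above can be made concrete with a tiny sketch. The flow rates below are made-up numbers echoing Eric's example, not measurements: even when 128 flows are dealt perfectly evenly across 16 queues (8 flows each, an "ideal" hash), the per-queue load is wildly unequal because 4 flows carry almost all the traffic.]

```python
# 128 flows dealt round-robin across 16 queues (exactly 8 flows each),
# mimicking a perfectly uniform flow-hash distribution.
NUM_FLOWS, NUM_QUEUES = 128, 16

# Hypothetical rates from the example: 4 elephant flows at 3 Gbit/s,
# the remaining 124 flows at 100 kbit/s.
rates_bps = [3e9] * 4 + [100e3] * (NUM_FLOWS - 4)

load = [0.0] * NUM_QUEUES
for flow, rate in enumerate(rates_bps):
    load[flow % NUM_QUEUES] += rate   # perfectly even flow placement

print([f"{l / 1e9:.4f}" for l in load])  # Gbit/s per queue
# Queues 0-3 each carry ~3 Gbit/s; queues 4-15 each carry ~0.0008 Gbit/s:
# equal *flow counts* per queue, grossly unequal *load*.
```

Since no static flow-to-queue mapping can anticipate which flows turn out to be elephants, steering by the sending CPU (XPS) lets the mapping follow actual load instead.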