RFC: Established connections hash function

All of lore.kernel.org
 help / color / mirror / Atom feed

* RFC: Established connections hash function
@ 2007-03-22 15:39 Nikolaos D. Bougalis
  2007-03-22 15:52 ` Evgeniy Polyakov
  2007-03-27 14:11 ` Andi Kleen
  0 siblings, 2 replies; 34+ messages in thread
From: Nikolaos D. Bougalis @ 2007-03-22 15:39 UTC (permalink / raw)
  To: netdev

    Hello,

    I have noticed that the hash function that the kernel uses for
established TCP/IP connections is rather simplistic, specifically:

    h = (local address ^ local_port) ^ (remote_address ^ remote_port);
    h ^= h >> 16;
    h ^= h >> 8;

    Now, simple is great, but this has a number of issues, not the least of
which is that an attacker can very easily cause collisions and force
extremely long chain lengths, a situation that becomes worse the more
distinct IP addresses and listening ports a box has.

    Consider, for example, a box that has 20 ports open and 4 consecutive IP
addresses. An attacker that has an entire class C available can create
24,576 connections that hash to the same value, resulting in a ridiculously
overlong chain. With servers that do virtual hosting and have dozens of IPs,
the situation can become much worse very fast.

    This particular hash seems to be the odd-man out, since most other
network related hashes in the kernel seem to be Jenkins-based, and some use
tagged hashing to defeat algorithmic complexity attacks. For example, the
route hash uses this:

static unsigned int rt_hash_rnd;

static unsigned int rt_hash_code(u32 daddr, u32 saddr)
{
        return (jhash_2words(daddr, saddr, rt_hash_rnd)
                & rt_hash_mask);
}

    With this in mind, I propose the following replacement for inet_ehashfn,
which defeats algorithmic complexity attacks and achieves excellent
distribution:

unsigned int inet_ehashfn(const __be32 laddr, const __u16 lport,
                          const __be32 faddr, const __be16 fport)
{
    return jhash_3words((__force __u32)faddr, (__force __u32)laddr,
                        (((__force __u32)fport) << 16) + lport,
                        inet_ehash_rnd);
}

    where inet_ehash_rnd is initialized once in tcp_init to a random 32-bit
value.

    I will be more than happy to provide a patch for this, but I figured I
would solicit some input first.

    Nik B.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-22 15:39 RFC: Established connections hash function Nikolaos D. Bougalis
@ 2007-03-22 15:52 ` Evgeniy Polyakov
  2007-03-22 17:32   ` Nikolaos D. Bougalis
  2007-03-27 14:11 ` Andi Kleen
  1 sibling, 1 reply; 34+ messages in thread
From: Evgeniy Polyakov @ 2007-03-22 15:52 UTC (permalink / raw)
  To: Nikolaos D. Bougalis; +Cc: netdev

On Thu, Mar 22, 2007 at 08:39:04AM -0700, Nikolaos D. Bougalis (nikb@webmaster.com) wrote:
>    This particular hash seems to be the odd-man out, since most other
> network related hashes in the kernel seem to be Jenkins-based, and some use
> tagged hashing to defeat algorithmic complexity attacks. For example, the
> route hash uses this:

It seems you do not know a history...
It is the fastest and actually the best hash for that workloads where it
is used, but unfortunately it is too simple for attacker to predict end
result.

> static unsigned int rt_hash_rnd;
> 
> static unsigned int rt_hash_code(u32 daddr, u32 saddr)
> {
>        return (jhash_2words(daddr, saddr, rt_hash_rnd)
>                & rt_hash_mask);
> }
> 
>    With this in mind, I propose the following replacement for inet_ehashfn,
> which defeats algorithmic complexity attacks and achieves excellent
> distribution:
> 
> unsigned int inet_ehashfn(const __be32 laddr, const __u16 lport,
>                          const __be32 faddr, const __be16 fport)
> {
>    return jhash_3words((__force __u32)faddr, (__force __u32)laddr,
>                        (((__force __u32)fport) << 16) + lport,
>                        inet_ehash_rnd);
> }

And this is utterly broken. For more details please read netdev@
archives and trivial analysis of jhash_3words().

We can use jhash_2words(laddr, faddr, portpair^inet_ehash_rnd) though.

-- 
	Evgeniy Polyakov

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-22 15:52 ` Evgeniy Polyakov
@ 2007-03-22 17:32   ` Nikolaos D. Bougalis
  2007-03-22 18:21     ` Evgeniy Polyakov
  0 siblings, 1 reply; 34+ messages in thread
From: Nikolaos D. Bougalis @ 2007-03-22 17:32 UTC (permalink / raw)
  To: netdev

On Thu, March 22, 2007 at 8:52 AM, Evgeniy Polyakov <johnpol@2ka.mipt.ru> 
wrote:

> It seems you do not know a history...

    I know a lot about history. I may not know the specific history you had 
in mind though.

    I do see now that this has been brought up before. Before posting, I did 
search the archives, but obviously not closely enough. I blame it on the 
early morning hour.


> It is the fastest and actually the best hash for that workloads where it
> is used, but unfortunately it is too simple for attacker to predict end
> result.

    Yes, the distribution of the vanilla function is decent, and yes it's 
very fast. But all that won't help you when chains start having tens of 
thousands of items, and you have to iterate through them constantly. But I 
guess that if it makes you feel better, you can call that "unfortunate."


>> unsigned int inet_ehashfn(const __be32 laddr, const __u16 lport,
>>                          const __be32 faddr, const __be16 fport)
>> {
>>    return jhash_3words((__force __u32)faddr, (__force __u32)laddr,
>>                        (((__force __u32)fport) << 16) + lport,
>>                        inet_ehash_rnd);
>> }
>
> And this is utterly broken. For more details please read netdev@
> archives and trivial analysis of jhash_3words().

    Utterly broken? Nonsense. I have tested the actual function I proposed 
(sans the __force and __u32 stuff, which weren't necessary in my test 
program), against real data, collected from various servers in real-time. It 
has consistently achieved lower average chain lengths than the vanilla 
function and demonstrated no artifacting, and that's trivial to verify.

    The only analysis I could find was this 
http://tservice.net.ru/~s0mbre/blog/2006/05/14#2006_05_14, which uses 
jhash_2words, and not jhash_3words, and which naively attempts to take the 
output of jhash_2words, and to perform the same mixing trick that the 
vanilla inet_ehashfn does and uses artificially generated data sets.

    But please, feel free to point out any other _unfavorable_ analyses of 
jhash_2words or jhash_3words that I may have missed.


> We can use jhash_2words(laddr, faddr, portpair^inet_ehash_rnd) though.

    Please explain to me how jhash_2words solves the issue that you claim 
jhash_3words has, when they both use the same underlying bit-mixer?

    -n 



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-22 17:32   ` Nikolaos D. Bougalis
@ 2007-03-22 18:21     ` Evgeniy Polyakov
  2007-03-22 19:44       ` Nikolaos D. Bougalis
  0 siblings, 1 reply; 34+ messages in thread
From: Evgeniy Polyakov @ 2007-03-22 18:21 UTC (permalink / raw)
  To: Nikolaos D. Bougalis; +Cc: netdev

On Thu, Mar 22, 2007 at 10:32:44AM -0700, Nikolaos D. Bougalis (nikb@webmaster.com) wrote:
>    Utterly broken? Nonsense. I have tested the actual function I proposed 
> (sans the __force and __u32 stuff, which weren't necessary in my test 
> program), against real data, collected from various servers in real-time. 
> It has consistently achieved lower average chain lengths than the vanilla 
> function and demonstrated no artifacting, and that's trivial to verify.

So what?
People test and work with XOR hash for years and they do not strike any
problems. If we talk about specially crafted data, then XOR one is no
worse than Jenkins with 3 words (which is even worse for blind attack of 
constant ports).

>    The only analysis I could find was this 
> http://tservice.net.ru/~s0mbre/blog/2006/05/14#2006_05_14, which uses 
> jhash_2words, and not jhash_3words, and which naively attempts to take the 
> output of jhash_2words, and to perform the same mixing trick that the 
> vanilla inet_ehashfn does and uses artificially generated data sets.

It is outdated, check recent netdev@ archives. Folding used in that test
does not change distribution, and data was presented as it can be
selected by attacker, who can create with any distribution.

>    But please, feel free to point out any other _unfavorable_ analyses of 
> jhash_2words or jhash_3words that I may have missed.
> 
> 
> >We can use jhash_2words(laddr, faddr, portpair^inet_ehash_rnd) though.
> 
>    Please explain to me how jhash_2words solves the issue that you claim 
> jhash_3words has, when they both use the same underlying bit-mixer?

$c value is not properly distributed and significanly breaks overall
distribution. Attacker, which controls $c (and it does it by controlling 
ports), can significantly increase selected hash chains.

But it is only $c, $a and $b are properly distributed, so jhash_2words()
is safer than jhash_3words().
Just create a simple application which does
jhash_3words(a, b, rand(), init) and jhash_2words(a, b, rand()) and see
results.

I want to emphasize: it is not about random generator, but about data,
controlled by attacker, which can select it like in tests. $c in
jhash_3words() is the weakest part, jhash_2words() works much better in
that regard.

-- 
	Evgeniy Polyakov

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-22 18:21     ` Evgeniy Polyakov
@ 2007-03-22 19:44       ` Nikolaos D. Bougalis
  2007-03-22 19:56         ` Evgeniy Polyakov
  2007-03-22 20:58         ` David Miller
  0 siblings, 2 replies; 34+ messages in thread
From: Nikolaos D. Bougalis @ 2007-03-22 19:44 UTC (permalink / raw)
  To: netdev

On Thu, Mar 22, 2007 11:21 AM, Evgeniy Polyakov <johnpol@2ka.mipt.ru> wrote:

>>    Utterly broken? Nonsense. I have tested the actual function I proposed
>> (sans the __force and __u32 stuff, which weren't necessary in my test
>> program), against real data, collected from various servers in real-time.
>> It has consistently achieved lower average chain lengths than the vanilla
>> function and demonstrated no artifacting, and that's trivial to verify.
>
> So what?

    So what? Are you serious?


> People test and work with XOR hash for years and they do not strike any
> problems. If we talk about specially crafted data, then XOR one is no
> worse than Jenkins with 3 words (which is even worse for blind attack of
> constant ports).

    People _have_ had problems. _I_ have had problems. And when someone with 
a few thousand drones under his control hoses your servers because he can do 
math and he leaves you with 20000-item long chains, _you_ will have 
problems. And sticking your head in the sand and saying "people work with 
XOR hash for years and they do not strike any problems" wont help you.


>>    The only analysis I could find was this
>> http://tservice.net.ru/~s0mbre/blog/2006/05/14#2006_05_14, which uses
>> jhash_2words, and not jhash_3words, and which naively attempts to take 
>> the
>> output of jhash_2words, and to perform the same mixing trick that the
>> vanilla inet_ehashfn does and uses artificially generated data sets.
>
> It is outdated, check recent netdev@ archives. Folding used in that test
> does not change distribution, and data was presented as it can be
> selected by attacker, who can create with any distribution.

    Be careful here. If the folding makes no difference, it says something 
very important about __jhash_mix, and that something goes against the very 
thing that you are saying.


>>    But please, feel free to point out any other _unfavorable_ analyses of
>> jhash_2words or jhash_3words that I may have missed.
>>
>>
>> >We can use jhash_2words(laddr, faddr, portpair^inet_ehash_rnd) though.
>>
>>    Please explain to me how jhash_2words solves the issue that you claim
>> jhash_3words has, when they both use the same underlying bit-mixer?
>
> $c value is not properly distributed and significanly breaks overall
> distribution. Attacker, which controls $c (and it does it by controlling
> ports), can significantly increase selected hash chains.

    I've tested the Jenkins hash extensively. I see no evidence of this 
"improper distribution" that you describe. In fact, about the only person 
that I've seen advocate this in the archives of netdev is you, and a lot of 
other very smart people disagree with you, so I consider myself to be in 
good company.


> But it is only $c, $a and $b are properly distributed, so jhash_2words()
> is safer than jhash_3words().
> Just create a simple application which does
> jhash_3words(a, b, rand(), init) and jhash_2words(a, b, rand()) and see
> results.

    What exactly am I supposed to see in these results? Because whatever it 
is, it's not there. Feel free to provide a link to your data and a histogram 
that shows what you find of interest though, and I'll be happy to look at 
it.

    -n
 



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-22 19:44       ` Nikolaos D. Bougalis
@ 2007-03-22 19:56         ` Evgeniy Polyakov
  2007-03-22 20:53           ` Nikolaos D. Bougalis
  2007-03-22 20:58         ` David Miller
  1 sibling, 1 reply; 34+ messages in thread
From: Evgeniy Polyakov @ 2007-03-22 19:56 UTC (permalink / raw)
  To: Nikolaos D. Bougalis; +Cc: netdev

On Thu, Mar 22, 2007 at 12:44:09PM -0700, Nikolaos D. Bougalis (nikb@webmaster.com) wrote:
> On Thu, Mar 22, 2007 11:21 AM, Evgeniy Polyakov <johnpol@2ka.mipt.ru> wrote:
> 
> >>   Utterly broken? Nonsense. I have tested the actual function I proposed
> >>(sans the __force and __u32 stuff, which weren't necessary in my test
> >>program), against real data, collected from various servers in real-time.
> >>It has consistently achieved lower average chain lengths than the vanilla
> >>function and demonstrated no artifacting, and that's trivial to verify.
> >
> >So what?
> 
>    So what? Are you serious?

We started our discussion a bit wrong - let's start it again, ok? :)
 
> >People test and work with XOR hash for years and they do not strike any
> >problems. If we talk about specially crafted data, then XOR one is no
> >worse than Jenkins with 3 words (which is even worse for blind attack of
> >constant ports).
> 
>    People _have_ had problems. _I_ have had problems. And when someone with 
> a few thousand drones under his control hoses your servers because he can 
> do math and he leaves you with 20000-item long chains, _you_ will have 
> problems. And sticking your head in the sand and saying "people work with 
> XOR hash for years and they do not strike any problems" wont help you.

You do not want to read what was written - _if_ we use artificial data,
then attacker can use it too, so if it is possible to break the system
with artificial data, then it is possible it will be broken in a real
life. If we use usual data, then we are ok (although Jenkins with 3
words is not ok).
 
> >>   The only analysis I could find was this
> >>http://tservice.net.ru/~s0mbre/blog/2006/05/14#2006_05_14, which uses
> >>jhash_2words, and not jhash_3words, and which naively attempts to take 
> >>the
> >>output of jhash_2words, and to perform the same mixing trick that the
> >>vanilla inet_ehashfn does and uses artificially generated data sets.
> >
> >It is outdated, check recent netdev@ archives. Folding used in that test
> >does not change distribution, and data was presented as it can be
> >selected by attacker, who can create with any distribution.
> 
>    Be careful here. If the folding makes no difference, it says something 
> very important about __jhash_mix, and that something goes against the very 
> thing that you are saying.

Grrr, I think I pointed several times already, that properly distributed
values do not change distribution after folding. And it can be seen in
all tests (and in that you pointed too).

> >>   But please, feel free to point out any other _unfavorable_ analyses of
> >>jhash_2words or jhash_3words that I may have missed.
> >>
> >>
> >>>We can use jhash_2words(laddr, faddr, portpair^inet_ehash_rnd) though.
> >>
> >>   Please explain to me how jhash_2words solves the issue that you claim
> >>jhash_3words has, when they both use the same underlying bit-mixer?
> >
> >$c value is not properly distributed and significanly breaks overall
> >distribution. Attacker, which controls $c (and it does it by controlling
> >ports), can significantly increase selected hash chains.
> 
>    I've tested the Jenkins hash extensively. I see no evidence of this 
> "improper distribution" that you describe. In fact, about the only person 
> that I've seen advocate this in the archives of netdev is you, and a lot of 
> other very smart people disagree with you, so I consider myself to be in 
> good company.

Hmm, I ran tests to select proper hash for netchannel implementation
(actualy the same as sockets) and showed Jenkin's hash problems - it is
enough to have only problem to state that there is a problem, doesn't
it?
 
> >But it is only $c, $a and $b are properly distributed, so jhash_2words()
> >is safer than jhash_3words().
> >Just create a simple application which does
> >jhash_3words(a, b, rand(), init) and jhash_2words(a, b, rand()) and see
> >results.
> 
>    What exactly am I supposed to see in these results? Because whatever it 
> is, it's not there. Feel free to provide a link to your data and a 

I will try to decipher phrase 'whatever it is, it's not there'...

> histogram that shows what you find of interest though, and I'll be happy to 
> look at it.

This thread for example:
http://marc.info/?t=117057613500001&r=1&w=2

One your test shows thare are no problems, try that one I propose, which
can be even created in userspace - you do not want even to get into
account what I try to say to you.

-- 
	Evgeniy Polyakov

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-22 19:56         ` Evgeniy Polyakov
@ 2007-03-22 20:53           ` Nikolaos D. Bougalis
  2007-03-23  7:52             ` Evgeniy Polyakov
  0 siblings, 1 reply; 34+ messages in thread
From: Nikolaos D. Bougalis @ 2007-03-22 20:53 UTC (permalink / raw)
  To: netdev


> We started our discussion a bit wrong - let's start it again, ok? :)

    Fair enough.


> You do not want to read what was written - _if_ we use artificial data,
> then attacker can use it too, so if it is possible to break the system
> with artificial data, then it is possible it will be broken in a real
> life. If we use usual data, then we are ok (although Jenkins with 3
> words is not ok).

    I'm not saying that an attacker cannot use artificial data. Indeed, 
algorithmic complexity attacks are all about 'crafting' artificial data with 
certain properties. So, yes, I absolutely agree that attackers can and do 
use "artificial data."


> >    Be careful here. If the folding makes no difference, it says 
> > something
> > very important about __jhash_mix, and that something goes against the 
> > very
> > thing that you are saying.
>
> Grrr, I think I pointed several times already, that properly distributed
> values do not change distribution after folding. And it can be seen in
> all tests (and in that you pointed too).

    Yes, I agree that the folding will not be a problem _IF_ the values are 
properly distributed -- although in that case, the folding is unnecessary. 
But that the Jenkins distribution didn't change (according to posts you 
made) after folding says that the output of Jenkins is pretty good to begin 
with ;)


> > >>>We can use jhash_2words(laddr, faddr, portpair^inet_ehash_rnd) 
> > >>>though.
> > >>
> > >>   Please explain to me how jhash_2words solves the issue that you 
> > >> claim
> > >>jhash_3words has, when they both use the same underlying bit-mixer?
> > >
> > >$c value is not properly distributed and significanly breaks overall
> > >distribution. Attacker, which controls $c (and it does it by 
> > >controlling
> > >ports), can significantly increase selected hash chains.

    Even if we assume that $c is not properly distributed, using a secret 
cookie and mixing operations from different algebraic groups changes the 
calculus dramatically. It's no longer straight-forward for the attacker to 
generate collisions (as it is with the current function) because the '$c' 
supplied by the attacker is used in conjunction with the secret cookie 
before __jhash_mix thoroughly mixes the inputs to generate a hash.


> >    I've tested the Jenkins hash extensively. I see no evidence of this
> > "improper distribution" that you describe. In fact, about the only 
> > person
> > that I've seen advocate this in the archives of netdev is you, and a lot 
> > of
> > other very smart people disagree with you, so I consider myself to be in
> > good company.
>
> Hmm, I ran tests to select proper hash for netchannel implementation
> (actualy the same as sockets) and showed Jenkin's hash problems - it is
> enough to have only problem to state that there is a problem, doesn't
> it?

    Again, from what I've seen from your other posts, I don't believe you've 
identified any inherent problems with the Jenkins hash.

    But that aside for a moment, surely you will agree that the ability of 
an attacker with a few dozen machines under his control to trivially mount 
an algorithmic complexity attack causing serious performance drops is also a 
problem with the current code and one that must be addressed.


> I will try to decipher phrase 'whatever it is, it's not there'...

    It meant that I saw nothing particularly interesting running the example 
you suggested and looking at the output.


> This thread for example:
> http://marc.info/?t=117057613500001&r=1&w=2

    I went through most of this thread. I don't see an analysis of the 
Jenkins. Am I missing something?


> One your test shows thare are no problems, try that one I propose, which
> can be even created in userspace - you do not want even to get into
> account what I try to say to you.

    I'm not trying to be obnoxious on purpose here, but I don't see the test 
that you are referring to. Could you be more specific?

    -n



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-22 20:53           ` Nikolaos D. Bougalis
@ 2007-03-23  7:52             ` Evgeniy Polyakov
  0 siblings, 0 replies; 34+ messages in thread
From: Evgeniy Polyakov @ 2007-03-23  7:52 UTC (permalink / raw)
  To: Nikolaos D. Bougalis; +Cc: netdev

On Thu, Mar 22, 2007 at 01:53:03PM -0700, Nikolaos D. Bougalis (nikb@webmaster.com) wrote:
> >Grrr, I think I pointed several times already, that properly distributed
> >values do not change distribution after folding. And it can be seen in
> >all tests (and in that you pointed too).
> 
>    Yes, I agree that the folding will not be a problem _IF_ the values are 
> properly distributed -- although in that case, the folding is unnecessary. 
> But that the Jenkins distribution didn't change (according to posts you 
> made) after folding says that the output of Jenkins is pretty good to begin 
> with ;)

In _some_ cases, but not in all.
 
> >> >>>We can use jhash_2words(laddr, faddr, portpair^inet_ehash_rnd) 
> >> >>>though.
> >> >>
> >> >>   Please explain to me how jhash_2words solves the issue that you 
> >> >> claim
> >> >>jhash_3words has, when they both use the same underlying bit-mixer?
> >> >
> >> >$c value is not properly distributed and significanly breaks overall
> >> >distribution. Attacker, which controls $c (and it does it by 
> >> >controlling
> >> >ports), can significantly increase selected hash chains.
> 
>    Even if we assume that $c is not properly distributed, using a secret 
> cookie and mixing operations from different algebraic groups changes the 
> calculus dramatically. It's no longer straight-forward for the attacker to 
> generate collisions (as it is with the current function) because the '$c' 
> supplied by the attacker is used in conjunction with the secret cookie 
> before __jhash_mix thoroughly mixes the inputs to generate a hash.

With XOR hash attacker can predict end result easily, with jenkins it
can not (easily), but jenkins distribution itself (even for usual data) 
results in too long chains - there are two problems:
1. easily predicted result
2. broken distribution

Xor hash has problems with first one, Jenkins (in some cases) with
second.
 
> >>    I've tested the Jenkins hash extensively. I see no evidence of this
> >> "improper distribution" that you describe. In fact, about the only 
> >> person
> >> that I've seen advocate this in the archives of netdev is you, and a lot 
> >> of
> >> other very smart people disagree with you, so I consider myself to be in
> >> good company.
> >
> >Hmm, I ran tests to select proper hash for netchannel implementation
> >(actualy the same as sockets) and showed Jenkin's hash problems - it is
> >enough to have only problem to state that there is a problem, doesn't
> >it?
> 
>    Again, from what I've seen from your other posts, I don't believe you've 
> identified any inherent problems with the Jenkins hash.
> 
>    But that aside for a moment, surely you will agree that the ability of 
> an attacker with a few dozen machines under his control to trivially mount 
> an algorithmic complexity attack causing serious performance drops is also 
> a problem with the current code and one that must be addressed.

Please refer to above two problems - Jenkins hash does not have problem
with easy end result detection, instead if has distribution problem.
Which means that attacker should not guess hash chains, it should
provide special crafted input and distribution will be shifted to the
higher levels.
 
> >I will try to decipher phrase 'whatever it is, it's not there'...
> 
>    It meant that I saw nothing particularly interesting running the example 
> you suggested and looking at the output.
> 
> 
> >This thread for example:
> >http://marc.info/?t=117057613500001&r=1&w=2
> 
>    I went through most of this thread. I don't see an analysis of the 
> Jenkins. Am I missing something?

There is no full analysis, I just posted results I found when selected
hash for different projects with similar to sockets background.
 
> >One your test shows thare are no problems, try that one I propose, which
> >can be even created in userspace - you do not want even to get into
> >account what I try to say to you.
> 
>    I'm not trying to be obnoxious on purpose here, but I don't see the test 
> that you are referring to. Could you be more specific?

http://marc.info/?l=linux-netdev&m=117199140430104&q=p5

>    -n
-- 
	Evgeniy Polyakov

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-22 19:44       ` Nikolaos D. Bougalis
  2007-03-22 19:56         ` Evgeniy Polyakov
@ 2007-03-22 20:58         ` David Miller
  2007-03-22 22:03           ` Eric Dumazet
  2007-03-23  8:07           ` Evgeniy Polyakov
  1 sibling, 2 replies; 34+ messages in thread
From: David Miller @ 2007-03-22 20:58 UTC (permalink / raw)
  To: nikb; +Cc: netdev

From: "Nikolaos D. Bougalis" <nikb@webmaster.com>
Date: Thu, 22 Mar 2007 12:44:09 -0700

>     People _have_ had problems. _I_ have had problems. And when
> someone with a few thousand drones under his control hoses your
> servers because he can do math and he leaves you with 20000-item
> long chains, _you_ will have problems.

No need to further argue this point, the people that matter
(ie. me :-) understand it, don't worry..

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-22 20:58         ` David Miller
@ 2007-03-22 22:03           ` Eric Dumazet
  2007-03-23  7:11             ` David Miller
  2007-03-23  8:07           ` Evgeniy Polyakov
  1 sibling, 1 reply; 34+ messages in thread
From: Eric Dumazet @ 2007-03-22 22:03 UTC (permalink / raw)
  To: David Miller, nikb; +Cc: netdev

David Miller a écrit :
> From: "Nikolaos D. Bougalis" <nikb@webmaster.com>
> Date: Thu, 22 Mar 2007 12:44:09 -0700
> 
>>     People _have_ had problems. _I_ have had problems. And when
>> someone with a few thousand drones under his control hoses your
>> servers because he can do math and he leaves you with 20000-item
>> long chains, _you_ will have problems.
> 
> No need to further argue this point, the people that matter
> (ie. me :-) understand it, don't worry..

Yes, I recall having one big server hit two years ago by an attack on tcp hash 
function. David sent me the patch to use jhash. It's performing well :)

Welcome to the club :)

===== net/ipv4/tcp_ipv4.c 1.114 vs edited =====
--- 1.114/net/ipv4/tcp_ipv4.c    2005-03-26 15:04:35 -08:00
+++ edited/net/ipv4/tcp_ipv4.c    2005-04-05 13:39:52 -07:00
@@ -103,14 +103,15 @@
   */
  int sysctl_local_port_range[2] = { 1024, 4999 };
  int tcp_port_rover = 1024 - 1;
+static u32 tcp_v4_hash_rand;

  static __inline__ int tcp_hashfn(__u32 laddr, __u16 lport,
                   __u32 faddr, __u16 fport)
  {
-    int h = (laddr ^ lport) ^ (faddr ^ fport);
-    h ^= h >> 16;
-    h ^= h >> 8;
-    return h & (tcp_ehash_size - 1);
+    return jhash_2words(laddr ^ faddr,
+                (lport << 16) | fport,
+                tcp_v4_hash_rand) &
+        (tcp_ehash_size - 1);
  }

 >  static __inline__ int tcp_sk_hashfn(struct sock *sk)
 > @@ -2626,6 +2627,9 @@
 >          panic("Failed to create the TCP control socket.\n");
 >      tcp_socket->sk->sk_allocation   = GFP_ATOMIC;
 >      inet_sk(tcp_socket->sk)->uc_ttl = -1;
 > +
 > +    get_random_bytes(&tcp_v4_hash_rand, 4);
 > +    tcp_v4_hash_rand ^= jiffies;
 >
 >      /* Unhash it so that IP input processing does not even
 >       * see it, we do not wish this socket to see incoming
 >
 >



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-22 22:03           ` Eric Dumazet
@ 2007-03-23  7:11             ` David Miller
  2007-03-23  8:00               ` Eric Dumazet
  0 siblings, 1 reply; 34+ messages in thread
From: David Miller @ 2007-03-23  7:11 UTC (permalink / raw)
  To: dada1; +Cc: nikb, netdev

From: Eric Dumazet <dada1@cosmosbay.com>
Date: Thu, 22 Mar 2007 23:03:04 +0100

> David Miller a écrit :
> > From: "Nikolaos D. Bougalis" <nikb@webmaster.com>
> > Date: Thu, 22 Mar 2007 12:44:09 -0700
> > 
> >>     People _have_ had problems. _I_ have had problems. And when
> >> someone with a few thousand drones under his control hoses your
> >> servers because he can do math and he leaves you with 20000-item
> >> long chains, _you_ will have problems.
> > 
> > No need to further argue this point, the people that matter
> > (ie. me :-) understand it, don't worry..
> 
> Yes, I recall having one big server hit two years ago by an attack on tcp hash 
> function. David sent me the patch to use jhash. It's performing well :)
> 
> Welcome to the club :)

Ok, how about we put something like the following into 2.6.21?

I'm not looking for the hash perfectionist analysis, please bug the
heck off if that's what your reply is going to be about, don't bother
hitting the reply button I will ignore you.

I want to hear instead if this makes attackability markedly _HARDER_
than what we have now and I am sure beyond a shadow of a doubt that it
does.

The secret is inialized when the first ehash-using socket is created,
that's not perfect (bug off!) but it's better than doing the
initialization in inet_init() or {tcp,dccp}_init() as we'll have a
chance of at least some entropy when that first such socket is
created.  We definitely can't do it for the first AF_INET socket
creation, because icmp creates a bunch of SOCK_RAW inet sockets at
init time which would defeat the whole purpose of deferring this. :)

Thanks.

diff --git a/include/net/inet6_hashtables.h b/include/net/inet6_hashtables.h
index c28e424..668056b 100644
--- a/include/net/inet6_hashtables.h
+++ b/include/net/inet6_hashtables.h
@@ -19,6 +19,9 @@
 #include <linux/in6.h>
 #include <linux/ipv6.h>
 #include <linux/types.h>
+#include <linux/jhash.h>
+
+#include <net/inet_sock.h>
 
 #include <net/ipv6.h>
 
@@ -28,12 +31,11 @@ struct inet_hashinfo;
 static inline unsigned int inet6_ehashfn(const struct in6_addr *laddr, const u16 lport,
 				const struct in6_addr *faddr, const __be16 fport)
 {
-	unsigned int hashent = (lport ^ (__force u16)fport);
+	u32 ports = (lport ^ (__force u16)fport);
 
-	hashent ^= (__force u32)(laddr->s6_addr32[3] ^ faddr->s6_addr32[3]);
-	hashent ^= hashent >> 16;
-	hashent ^= hashent >> 8;
-	return hashent;
+	return jhash_3words((__force u32)laddr->s6_addr32[3],
+			    (__force u32)faddr->s6_addr32[3],
+			    ports, inet_ehash_secret);
 }
 
 static inline int inet6_sk_ehashfn(const struct sock *sk)
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index ce6da97..62daf21 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -19,6 +19,7 @@
 
 #include <linux/string.h>
 #include <linux/types.h>
+#include <linux/jhash.h>
 
 #include <net/flow.h>
 #include <net/sock.h>
@@ -167,13 +168,15 @@ static inline void inet_sk_copy_descendant(struct sock *sk_to,
 
 extern int inet_sk_rebuild_header(struct sock *sk);
 
+extern u32 inet_ehash_secret;
+extern void build_ehash_secret(void);
+
 static inline unsigned int inet_ehashfn(const __be32 laddr, const __u16 lport,
 					const __be32 faddr, const __be16 fport)
 {
-	unsigned int h = ((__force __u32)laddr ^ lport) ^ ((__force __u32)faddr ^ (__force __u32)fport);
-	h ^= h >> 16;
-	h ^= h >> 8;
-	return h;
+	return jhash_2words((__force __u32) laddr ^ (__force __u32) faddr,
+			    ((__u32) lport) << 16 | (__force __u32)fport,
+			    inet_ehash_secret);
 }
 
 static inline int inet_sk_ehashfn(const struct sock *sk)
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index cf358c8..308318a 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -87,6 +87,7 @@
 #include <linux/init.h>
 #include <linux/poll.h>
 #include <linux/netfilter_ipv4.h>
+#include <linux/random.h>
 
 #include <asm/uaccess.h>
 #include <asm/system.h>
@@ -217,6 +218,16 @@ out:
 	return err;
 }
 
+u32 inet_ehash_secret;
+EXPORT_SYMBOL(inet_ehash_secret);
+
+void build_ehash_secret(void)
+{
+	while (!inet_ehash_secret)
+		get_random_bytes(&inet_ehash_secret, 4);
+}
+EXPORT_SYMBOL(build_ehash_secret);
+
 /*
  *	Create an inet socket.
  */
@@ -233,6 +244,11 @@ static int inet_create(struct socket *sock, int protocol)
 	int try_loading_module = 0;
 	int err;
 
+	if (sock->type != SOCK_RAW &&
+	    sock->type != SOCK_DGRAM &&
+	    !inet_ehash_secret)
+		build_ehash_secret();
+
 	sock->state = SS_UNCONNECTED;
 
 	/* Look for the requested type/protocol pair. */
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 5cac14a..0de723f 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -98,6 +98,11 @@ static int inet6_create(struct socket *sock, int protocol)
 	int try_loading_module = 0;
 	int err;
 
+	if (sock->type != SOCK_RAW &&
+	    sock->type != SOCK_DGRAM &&
+	    !inet_ehash_secret)
+		build_ehash_secret();
+
 	/* Look for the requested type/protocol pair. */
 	answer = NULL;
 lookup_protocol:

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-23  7:11             ` David Miller
@ 2007-03-23  8:00               ` Eric Dumazet
  2007-03-23 18:46                 ` David Miller
  0 siblings, 1 reply; 34+ messages in thread
From: Eric Dumazet @ 2007-03-23  8:00 UTC (permalink / raw)
  To: David Miller; +Cc: nikb, netdev

David Miller a écrit :
> From: Eric Dumazet <dada1@cosmosbay.com>

>> Welcome to the club :)
> 
> Ok, how about we put something like the following into 2.6.21?

2.6.21 really ?

Just to be clear : I had an attack two years ago, I applied your patch, 
rebooted the machine, and since then the attackers had to find another way to 
hurt the machine. Eventually, when I update the kernel of this machine, I 
forget to appply jhash patch, and attackers dont know they can try again :)

I dont consider this new hash as bug fix at all, ie your patch might enter 
2.6.22 normal dev cycle.

Maybe a *fix*, independant of the hash function (so that no math expert can 
insult us), would be to have a *limit*, say... 1000 (something insane) on the 
length of a hash chain ?

In my case, I saw lengths of about 3000 two years ago under attack, but 
machine was still usable... maybe in half power mode.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-23  8:00               ` Eric Dumazet
@ 2007-03-23 18:46                 ` David Miller
  0 siblings, 0 replies; 34+ messages in thread
From: David Miller @ 2007-03-23 18:46 UTC (permalink / raw)
  To: dada1; +Cc: nikb, netdev

From: Eric Dumazet <dada1@cosmosbay.com>
Date: Fri, 23 Mar 2007 09:00:08 +0100

> I dont consider this new hash as bug fix at all, ie your patch might enter 
> 2.6.22 normal dev cycle.

Ok, I checked the patch into net-2.6.22

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-22 20:58         ` David Miller
  2007-03-22 22:03           ` Eric Dumazet
@ 2007-03-23  8:07           ` Evgeniy Polyakov
  2007-03-23  8:17             ` Eric Dumazet
                               ` (2 more replies)
  1 sibling, 3 replies; 34+ messages in thread
From: Evgeniy Polyakov @ 2007-03-23  8:07 UTC (permalink / raw)
  To: David Miller; +Cc: nikb, netdev

On Thu, Mar 22, 2007 at 01:58:34PM -0700, David Miller (davem@davemloft.net) wrote:
> From: "Nikolaos D. Bougalis" <nikb@webmaster.com>
> Date: Thu, 22 Mar 2007 12:44:09 -0700
> 
> >     People _have_ had problems. _I_ have had problems. And when
> > someone with a few thousand drones under his control hoses your
> > servers because he can do math and he leaves you with 20000-item
> > long chains, _you_ will have problems.
> 
> No need to further argue this point, the people that matter
> (ie. me :-) understand it, don't worry..

Call me a loooser which mail will be deleted on arrival, but...

jhash_2words(const, const, ((const << 16) | $sport) ^ $random)

where $sport is 1-65535 in a loop, and $random is pseudo-random number
obtained on start.

Which is exactly the case of web server and attacker connects to 80 port
from the same IP address and different source ports.

Result with jenkins:
1 23880
2 12108
3 4040
4 1019
5 200
6 30
7 8
8 1

Xor:
1 65536


Please, do not apply patch as is, I will devote this day to find where
jenkins has problems and try to fix distribution. If I will fail, then
it is up to you to decide that above results are bad or good.

Thank you.

-- 
	Evgeniy Polyakov

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-23  8:07           ` Evgeniy Polyakov
@ 2007-03-23  8:17             ` Eric Dumazet
  2007-03-23  8:33               ` Evgeniy Polyakov
  2007-03-23 11:58             ` XOR hash beauty solved [Was: RFC: Established connections hash function] Evgeniy Polyakov
  2007-03-23 12:45             ` RFC: Established connections hash function Nikolaos D. Bougalis
  2 siblings, 1 reply; 34+ messages in thread
From: Eric Dumazet @ 2007-03-23  8:17 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: David Miller, nikb, netdev

Evgeniy Polyakov a ecrit :
> Call me a loooser which mail will be deleted on arrival, but...
> 
> jhash_2words(const, const, ((const << 16) | $sport) ^ $random)
> 
> where $sport is 1-65535 in a loop, and $random is pseudo-random number
> obtained on start.
> 
> Which is exactly the case of web server and attacker connects to 80 port
> from the same IP address and different source ports.
> 
> Result with jenkins:
> 1 23880
> 2 12108
> 3 4040
> 4 1019
> 5 200
> 6 30
> 7 8
> 8 1
> 
> Xor:
> 1 65536

So what ? You still think hash function must be bijective ? Come on !

You have a machine somewhere that allows 65536 concurrent connections coming 
from the same IP address ?

The last problem you have is the nature of tcp hash function.

Dont argue again with your pseudo science.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-23  8:17             ` Eric Dumazet
@ 2007-03-23  8:33               ` Evgeniy Polyakov
  2007-03-23  9:10                 ` Evgeniy Polyakov
  0 siblings, 1 reply; 34+ messages in thread
From: Evgeniy Polyakov @ 2007-03-23  8:33 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, nikb, netdev

[-- Attachment #1: Type: text/plain, Size: 836 bytes --]

On Fri, Mar 23, 2007 at 09:17:19AM +0100, Eric Dumazet (dada1@cosmosbay.com) wrote:
> You have a machine somewhere that allows 65536 concurrent connections 
> coming from the same IP address ?

Attached png file of botnet scenario:
1000 addresses from the same network (class B for example),
each one creates 1024 connections to the same static port.

Eric, I agree, that XOR hash is not perfect, and it should be changed,
but not blindly.

I perfectly know that hash function is not bijective, but it must have
good distribution. 
Function like this
int hash(u32 saddr, u16 sport, u32 daddr, u16 dport, u32 rand)
{
	return ((((rand ^ saddr ^ daddr)>>16)^(dport ^ sport)) >>8);
}

has even worse _distribution_, although you can not predict its end
result due to random value, and attacker will not try to do it.

-- 
	Evgeniy Polyakov

[-- Attachment #2: jhash_botnet.png --]
[-- Type: image/png, Size: 5481 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-23  8:33               ` Evgeniy Polyakov
@ 2007-03-23  9:10                 ` Evgeniy Polyakov
  0 siblings, 0 replies; 34+ messages in thread
From: Evgeniy Polyakov @ 2007-03-23  9:10 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, nikb, netdev

[-- Attachment #1: Type: text/plain, Size: 264 bytes --]

On Fri, Mar 23, 2007 at 11:33:32AM +0300, Evgeniy Polyakov (johnpol@2ka.mipt.ru) wrote:
> Eric, I agree, that XOR hash is not perfect, and it should be changed,
> but not blindly.

Attached case of how broken can be xor in botnet scenario.


-- 
	Evgeniy Polyakov

[-- Attachment #2: jhash_good.png --]
[-- Type: image/png, Size: 7876 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* XOR hash beauty solved [Was: RFC: Established connections hash function]
  2007-03-23  8:07           ` Evgeniy Polyakov
  2007-03-23  8:17             ` Eric Dumazet
@ 2007-03-23 11:58             ` Evgeniy Polyakov
  2007-03-23 12:51               ` Nikolaos D. Bougalis
  2007-03-23 12:45             ` RFC: Established connections hash function Nikolaos D. Bougalis
  2 siblings, 1 reply; 34+ messages in thread
From: Evgeniy Polyakov @ 2007-03-23 11:58 UTC (permalink / raw)
  To: David Miller; +Cc: nikb, netdev, Eric Dumazet

> Please, do not apply patch as is, I will devote this day to find where
> jenkins has problems and try to fix distribution. If I will fail, then
> it is up to you to decide that above results are bad or good.

I need to admit that I was partially wrong in my analysis of the Jenkins
hash distribution - it does _not_ have problems and any kind of artifacts. 
Waves found in tests are results of folding into hash_size boundary,
distribution inside F(32) field is unifirm.
XOR hash does not have such problem, because it uses (u32 ^ u16) as one
round, which results in the uniform (it is not correct to call that
distribution uniform as is, but only getting into account that u16 values 
used in tests were uniformly distributed) distribution inside F(16), which 
does not suffer from hash_size boundary folding. Since XOR hash has 3 rounds, 
only one of them (xor of the final u32 values) will suffer from folding, but 
tests where is it can be determined for sure use constant addresses, so 
problem hides again.

So, briefly saying, jhash_2/3words have safe distribution, but have
higher-number of elements waves as a result of folding which is
unavoidable for general-purpose hash.

Now my conscience is calm :)

-- 
	Evgeniy Polyakov

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: XOR hash beauty solved [Was: RFC: Established connections hash function]
  2007-03-23 11:58             ` XOR hash beauty solved [Was: RFC: Established connections hash function] Evgeniy Polyakov
@ 2007-03-23 12:51               ` Nikolaos D. Bougalis
  0 siblings, 0 replies; 34+ messages in thread
From: Nikolaos D. Bougalis @ 2007-03-23 12:51 UTC (permalink / raw)
  To: netdev

> So, briefly saying, jhash_2/3words have safe distribution, but have
> higher-number of elements waves as a result of folding which is
> unavoidable for general-purpose hash.

    Thanks for the analysis. 

    -n



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-23  8:07           ` Evgeniy Polyakov
  2007-03-23  8:17             ` Eric Dumazet
  2007-03-23 11:58             ` XOR hash beauty solved [Was: RFC: Established connections hash function] Evgeniy Polyakov
@ 2007-03-23 12:45             ` Nikolaos D. Bougalis
  2 siblings, 0 replies; 34+ messages in thread
From: Nikolaos D. Bougalis @ 2007-03-23 12:45 UTC (permalink / raw)
  To: netdev

    Let me start off by saying that I hope I didn't come across as 
condenscending in my previous posts. If I did, then it wasn't intended. Now, 
on to more important things :)


> jhash_2words(const, const, ((const << 16) | $sport) ^ $random)
>
> where $sport is 1-65535 in a loop, and $random is pseudo-random number
> obtained on start.

    If you are correct that jhash_3words doesn't properly distribute the 
bits in 'c' (which I don't believe you are, but let's assume it for a 
second) then this function will also be broken:  jhash_2words calls 
jhash_3words; jhash_3words adds (a linear operation) initval and c before 
calling __jhash_mix. So, if there is a problem with passing values under the 
direct control of the attacker into 'c' both jhash_2words and jhash_3words 
are affected; in other words, this variant would also be flawed.


> Which is exactly the case of web server and attacker connects to 80 port
> from the same IP address and different source ports.
>
> Result with jenkins:
> 1 23880
> 2 12108
> 3 4040
> 4 1019
> 5 200
> 6 30
> 7 8
> 8 1
>
> Xor:
> 1 65536

    I believe that the XOR results, if generated by your test above, are 
somewhat meaningless because you're feeding what is ideal input into the XOR 
hash. Which means that you'll get a perfect distribution. With your input, 
one might as well suggest that using the remote port will give a perfect 
distribution, and it will, but only for that specific input.

    Just for kicks, I went to one of our servers, and did "netstat -n | grep 
ESTABLISHED" and ended up with 31072 distinct ip:port/ip:port 4-tuples which 
I then hashed into a 65536 bucket table. There are the results; feel free to 
draw your own conclusions:

[ I think this should come out looking good; sorry if whitespace is screwy ]

+---+-------+-------+-------+-------+
|   |  xor  | j2w 1 | j2w 2 | j3w 1 |
+---+-------+---------------+-------+
| 0 | 40868 | 40930 | 40767 | 40750 |
| 1 | 19208 | 19119 | 19382 | 19413 |
| 2 |  4636 |  4618 |  4576 |  4554 |
| 3 |   716 |   769 |   715 |   734 |
| 4 |    99 |    91 |    87 |    76 |
| 5 |     7 |     8 |     9 |     9 |
| 6 |     1 |     1 |     0 |     0 |
| 7 |     1 |     0 |     0 |     0 |
| 8 |     0 |     0 |     0 |     0 |
+---+-------+-------+-------+-------+

xor: the vanilla linux function
j2w 1 is my variant: jhash_2words(laddr + rport, raddr + lport, seed)
j2w 2 is your variant: jhash_2words(laddr, raddr, (rport << 16) ^ lport) ^ 
seed)
j3w: jhash_3words(laddr, raddr, (rport << 12) + lport, seed)

    The seed used for all the Jenkins hashes came from the low-order 32-bits 
returned by RDTSC, executed when the program started. It remained constant 
throughout the run. 8 runs where made, to ensure that the seed wasn't 
causing weirdness, all runs giving almost identical results. The Jenkins 
hashes did not use the extra 2 right-shifts to fold high-order bits into the 
low-order bits, that is employed by the XOR hash.

    -n 



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-22 15:39 RFC: Established connections hash function Nikolaos D. Bougalis
  2007-03-22 15:52 ` Evgeniy Polyakov
@ 2007-03-27 14:11 ` Andi Kleen
  2007-03-28  5:01   ` Nikolaos D. Bougalis
  1 sibling, 1 reply; 34+ messages in thread
From: Andi Kleen @ 2007-03-27 14:11 UTC (permalink / raw)
  To: nikb; +Cc: netdev

"Nikolaos D. Bougalis" <nikb@webmaster.com> writes:
> 
>     I will be more than happy to provide a patch for this, but I figured I
> would solicit some input first.

To truly defend against this you would likely need a cryptographic
hash, which would be likely too slow. 

If it's a real problem the better fix would be to switch to some
kind of balanced tree (like Evgeniy is proposing) . 

But I think it can be mostly ignored. 

-Andi

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-27 14:11 ` Andi Kleen
@ 2007-03-28  5:01   ` Nikolaos D. Bougalis
  2007-03-28  6:29     ` David Miller
  2007-03-28  9:29     ` Andi Kleen
  0 siblings, 2 replies; 34+ messages in thread
From: Nikolaos D. Bougalis @ 2007-03-28  5:01 UTC (permalink / raw)
  To: netdev; +Cc: ak

"Andi Kleen" (ak@suse.de) wrote:

> To truly defend against this you would likely need a cryptographic
> hash, which would be likely too slow.

    I do not think that a cryptographically secure hash is necessary for 
this. Using a better hash function (i.e. one which does a good job of 
throroughly mixing the input bits, as jenkins does), is sufficient when 
combined with a secret per-boot salt, something which is easily demonstrable 
by test runs.


> If it's a real problem the better fix would be to switch to some
> kind of balanced tree (like Evgeniy is proposing) .

    I don't have any special attachment to the Jenkins hash, or to a hash 
table even. So, _yes_, by all means if a better solution is available let us 
use it and I'll be the first to cheer. But I don't know if Evgeniy's work is 
ready for prime time, and I consider plugging in an improved hash function 
to be an acceptable solution in the meantime.


> But I think it can be mostly ignored.

    With all due respect, it cannot. An attacker with a small-sized botnet 
(which is ~250 hosts) can create chains that contain well in excess of 3000 
items. A big botnet (and there exist botnets with well over 5000 machines in 
them) can bring a system to its knees. You may not have gotten bitten by 
this, but others, including myself and Eric Dumazet have. And I believe that 
David Miller agrees, although I do not want to put words in his mouth -- or 
his keyboard, as the case may be.

    What this boils down to is, yes, we can keep patching our own kernels to 
use tagged jenkins hashing if this affects us, waiting for something better 
to come along. But isn't it more reasonable to add this into the kernel, at 
least as a non-default compile time option, and allow administrators to 
decide whether this is something they want to use?

    -n



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-28  5:01   ` Nikolaos D. Bougalis
@ 2007-03-28  6:29     ` David Miller
  2007-03-28  9:29     ` Andi Kleen
  1 sibling, 0 replies; 34+ messages in thread
From: David Miller @ 2007-03-28  6:29 UTC (permalink / raw)
  To: nikb; +Cc: netdev, ak

From: "Nikolaos D. Bougalis" <nikb@webmaster.com>
Date: Tue, 27 Mar 2007 22:01:38 -0700

>     What this boils down to is, yes, we can keep patching our own kernels to 
> use tagged jenkins hashing if this affects us, waiting for something better 
> to come along.

You don't have to patch, I put jenkins into the tree and it will
hit 2.6.22, we don't need to even discuss this any more and you
can safely ignore Andi.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-28  5:01   ` Nikolaos D. Bougalis
  2007-03-28  6:29     ` David Miller
@ 2007-03-28  9:29     ` Andi Kleen
  2007-03-28 10:45       ` Evgeniy Polyakov
  1 sibling, 1 reply; 34+ messages in thread
From: Andi Kleen @ 2007-03-28  9:29 UTC (permalink / raw)
  To: nikb; +Cc: netdev


> 
> > But I think it can be mostly ignored.
> 
>     With all due respect, it cannot. An attacker with a small-sized botnet 
> (which is ~250 hosts) can create chains that contain well in excess of 3000 
> items. 

Most likely they can also easily generate enough latency data to crack any simple
hash function then.

-Andi

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-28  9:29     ` Andi Kleen
@ 2007-03-28 10:45       ` Evgeniy Polyakov
  2007-03-28 14:14         ` Andi Kleen
  0 siblings, 1 reply; 34+ messages in thread
From: Evgeniy Polyakov @ 2007-03-28 10:45 UTC (permalink / raw)
  To: Andi Kleen; +Cc: nikb, netdev

On Wed, Mar 28, 2007 at 11:29:43AM +0200, Andi Kleen (ak@suse.de) wrote:
> > > But I think it can be mostly ignored.
> > 
> >     With all due respect, it cannot. An attacker with a small-sized botnet 
> > (which is ~250 hosts) can create chains that contain well in excess of 3000 
> > items. 
> 
> Most likely they can also easily generate enough latency data to crack any simple
> hash function then.

Jenkins hash is far from being simple to crack, although with some
knowledge it can be done faster.

SHA or something else is essentially the same, except it has different
set of rounds - we only hashes 3 32 bit values, so Jenkins result is
really good.

For the hash tables it is a good solution, but we can move further.
I created multidimensional trie with that problem in mind, but it looks
like right now it is not absolutely required solution - I will continue
to work on it to check if we can be faster than properly sized hash
table with additional trie allocation overhead, but likely I should not
force people include it as is, only for information I think.

> -Andi

-- 
	Evgeniy Polyakov

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-28 10:45       ` Evgeniy Polyakov
@ 2007-03-28 14:14         ` Andi Kleen
  2007-03-28 13:50           ` Eric Dumazet
                             ` (2 more replies)
  0 siblings, 3 replies; 34+ messages in thread
From: Andi Kleen @ 2007-03-28 14:14 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: nikb, netdev

Evgeniy Polyakov <johnpol@2ka.mipt.ru> writes:
> 
> Jenkins hash is far from being simple to crack, although with some
> knowledge it can be done faster.

TCP tends to be initialized early before there is anything
good in the entropy pool.

static void init_std_data(struct entropy_store *r)
{
        struct timeval tv;
        unsigned long flags;

        spin_lock_irqsave(&r->lock, flags);
        r->entropy_count = 0;
        spin_unlock_irqrestore(&r->lock, flags);

        do_gettimeofday(&tv);
        add_entropy_words(r, (__u32 *)&tv, sizeof(tv)/4);
        add_entropy_words(r, (__u32 *)utsname(),
                          sizeof(*(utsname()))/4);
}

utsname is useless here because it runs before user space has 
a chance to set it. The only truly variable thing is the 
boot time, which can be guessed with the ns part being brute forced.

To make it secure you would need to do regular rehash like
the routing cache which would pick up true randomness on the first
rehash.

-Andi

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-28 14:14         ` Andi Kleen
@ 2007-03-28 13:50           ` Eric Dumazet
  2007-03-28 14:52             ` Andi Kleen
  2007-03-28 14:17           ` RFC: Established connections hash function II Andi Kleen
  2007-03-28 19:04           ` RFC: Established connections hash function David Miller
  2 siblings, 1 reply; 34+ messages in thread
From: Eric Dumazet @ 2007-03-28 13:50 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Evgeniy Polyakov, nikb, netdev

On 28 Mar 2007 16:14:17 +0200
Andi Kleen <andi@firstfloor.org> wrote:
> TCP tends to be initialized early before there is anything
> good in the entropy pool.
> 
> static void init_std_data(struct entropy_store *r)
> {
>         struct timeval tv;
>         unsigned long flags;
> 
>         spin_lock_irqsave(&r->lock, flags);
>         r->entropy_count = 0;
>         spin_unlock_irqrestore(&r->lock, flags);
> 
>         do_gettimeofday(&tv);
>         add_entropy_words(r, (__u32 *)&tv, sizeof(tv)/4);
>         add_entropy_words(r, (__u32 *)utsname(),
>                           sizeof(*(utsname()))/4);
> }
> 
> utsname is useless here because it runs before user space has 
> a chance to set it. The only truly variable thing is the 
> boot time, which can be guessed with the ns part being brute forced.
> 
> To make it secure you would need to do regular rehash like
> the routing cache which would pick up true randomness on the first
> rehash.

Good point, but :

1) We can now use "struct timespec" to get more bits in init_std_data()

2) tcp ehash salt is initialized at first socket creation, not boot time. Maybe we have more available entropy at this point.

3) We dont want to be 'totally secure'. We only want to raise the level, and eventually see if we have to spend more time on this next year(s). AFAIK we had two different reports from people being hit by the flaw of previous hash. Not really a critical issue.

4) We could add a hard limit on the length of one chain. Even if the bad guys discover a flaw, it wont hurt too much.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-28 13:50           ` Eric Dumazet
@ 2007-03-28 14:52             ` Andi Kleen
  2007-03-29  9:18               ` Evgeniy Polyakov
  0 siblings, 1 reply; 34+ messages in thread
From: Andi Kleen @ 2007-03-28 14:52 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Andi Kleen, Evgeniy Polyakov, nikb, netdev

On Wed, Mar 28, 2007 at 03:50:47PM +0200, Eric Dumazet wrote:
> 1) We can now use "struct timespec" to get more bits in init_std_data()

That would be a good change, but i don't think it would help that much.
If you know the hardware (e.g. webhost farms tend to have quite
predictive kit) and the kernel binary and the boot offset from
the timestamp you can probably guess the range of ns pretty closely
(let's say down to a few ms). With that it's a small range to search. 

> 2) tcp ehash salt is initialized at first socket creation, not boot time. Maybe we have more available entropy at this point.

Sockets are created early too. It would be a little better, but probably
not much. 

The only true random seed is disk, keyboard/mouse and previous state. previous
state is typically a relatively late init script, probably after the first
socket.  Servers tend to have no disk/mouse activity.

Disk may work if you manage to put it after the root mount, but you
lose on diskless systems. e.g. if the nfsd is built in that wouldn't
work though because it would create sockets before that.

Getting entropy from network interrupts would avoid the the diskless
issue, but people are paranoid about that.

> 3) We dont want to be 'totally secure'. We only want to raise the level, and eventually see if we have to spend more time on this next year(s). AFAIK we had two different reports from people being hit by the flaw of previous hash. Not really a critical issue.

Yes, but you probably want a complexity of at least 10^5-10^6 to be any
useful. I don't think you will get that early in boot from random
unless you use hardware support.

> 
> 4) We could add a hard limit on the length of one chain. Even if the bad guys discover a flaw, it wont hurt too much.

Or just use the trie?  It has other advantages too :)

-Andi

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-28 14:52             ` Andi Kleen
@ 2007-03-29  9:18               ` Evgeniy Polyakov
  0 siblings, 0 replies; 34+ messages in thread
From: Evgeniy Polyakov @ 2007-03-29  9:18 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Eric Dumazet, nikb, netdev

On Wed, Mar 28, 2007 at 04:52:55PM +0200, Andi Kleen (andi@firstfloor.org) wrote:
> > 3) We dont want to be 'totally secure'. We only want to raise the level, and eventually see if we have to spend more time on this next year(s). AFAIK we had two different reports from people being hit by the flaw of previous hash. Not really a critical issue.
> 
> Yes, but you probably want a complexity of at least 10^5-10^6 to be any
> useful. I don't think you will get that early in boot from random
> unless you use hardware support.

What we have (had) right now is broken situation, and it must be fixed
no matter what solution is used. Using more secure hash (and breaking
Jenkins hash is a bit harder than XOR one, I say it not only from
theoretical point of view looking into its operations) is a fix.
It is possible that there is even better fix - there is always something
better than one has right now, but right now problem must be fixed.
And David (and Eric, and Nikolaos) fixed that problem in place.
It can be solved (this time it will be called 'improved') further.

> > 
> > 4) We could add a hard limit on the length of one chain. Even if the bad guys discover a flaw, it wont hurt too much.

Hard limit should not be used, since it is exactly what attacker wants -
attacker can get all slots in th hash table and server will not respond
to other clients at all, although it could, but much slower.

> Or just use the trie?  It has other advantages too :)

As an interested party I should not comment, but I can not resist - 
yes, it is cool and can be done better :)

> -Andi

-- 
	Evgeniy Polyakov

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function II
  2007-03-28 14:14         ` Andi Kleen
  2007-03-28 13:50           ` Eric Dumazet
@ 2007-03-28 14:17           ` Andi Kleen
  2007-03-28 19:04           ` RFC: Established connections hash function David Miller
  2 siblings, 0 replies; 34+ messages in thread
From: Andi Kleen @ 2007-03-28 14:17 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Evgeniy Polyakov, nikb, netdev

Andi Kleen <andi@firstfloor.org> writes:

> The only truly variable thing is the 
> boot time, which can be guessed 

Actually you don't need to guess it. It's in any TCP timestamp.

-Andi

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-28 14:14         ` Andi Kleen
  2007-03-28 13:50           ` Eric Dumazet
  2007-03-28 14:17           ` RFC: Established connections hash function II Andi Kleen
@ 2007-03-28 19:04           ` David Miller
  2007-03-28 20:12             ` Andi Kleen
  2 siblings, 1 reply; 34+ messages in thread
From: David Miller @ 2007-03-28 19:04 UTC (permalink / raw)
  To: andi; +Cc: johnpol, nikb, netdev

From: Andi Kleen <andi@firstfloor.org>
Date: 28 Mar 2007 16:14:17 +0200

> Evgeniy Polyakov <johnpol@2ka.mipt.ru> writes:
> > 
> > Jenkins hash is far from being simple to crack, although with some
> > knowledge it can be done faster.
> 
> TCP tends to be initialized early before there is anything
> good in the entropy pool.

Andi, you're being an idiot.

You are spewing endless and uninformed bullshit about this secure hash
topic, and it must stop now!

You obviously didn't even read my patch, because if you did you would
have seen that I don't initialize the random seed until MUCH MUCH
later _EXACTLY_ to deal with this issue.

In my patch the random seed is initialized when the first TCP or DCCP
socket is created, at which point we'll have sufficient entropy.

So please stop talking such nonsense about this topic.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-28 19:04           ` RFC: Established connections hash function David Miller
@ 2007-03-28 20:12             ` Andi Kleen
  0 siblings, 0 replies; 34+ messages in thread
From: Andi Kleen @ 2007-03-28 20:12 UTC (permalink / raw)
  To: David Miller; +Cc: andi, johnpol, nikb, netdev

> In my patch the random seed is initialized when the first TCP or DCCP
> socket is created, at which point we'll have sufficient entropy.

See my discussion of this case in a later mail to Evgeniy 

-Andi

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
@ 2007-03-24 12:26 linux
  2007-03-24 13:29 ` Evgeniy Polyakov
  0 siblings, 1 reply; 34+ messages in thread
From: linux @ 2007-03-24 12:26 UTC (permalink / raw)
  To: johnpol.2ka.mipt.ru, netdev; +Cc: linux

> Result with jenkins:
> 1 23880
> 2 12108
> 3 4040
> 4 1019
> 5 200
> 6 30
> 7 8
> 8 1
> 
> Xor:
> 1 65536

Precisely.  This means that the Xor hash SUCKS, because its output is conspicuously
non-random.

What you expect is a Poisson distribution, where the chance that a chain will contain
k elements is
	P(lambda,k) = e^-lambda * lambda^k / k!
lambda is the average loading per chain.  In your example, it's 1 (65536 inputs, 65536 outputs).
(^ is exponentiation, ! is factorial)

So the distribution I expect to get is:
 0 24109.347656
 1 24109.347656
 2 12054.673828
 3 4018.224609
 4 1004.556152
 5 200.911224
 6 33.485203
 7 4.783601
 8 0.597950
 9 0.066439
10 0.006644

Whick looks a HELL of a lot like what you observed.
(The jenkins result above has 24250 chains with no entries.)

Now, you can sometimes use properties of the inputs to get a distribution
that is more uniform than random, by letting the distribution of the input
"show through" the hash funciton.  Which the xor hash does.  But this
depends on making assumptions about the input distribution, which means
that you're assuming that they're not being chosen maliciously.

If an attacker is choosing maliciously, which is a required assumption
in today's Internet, the best you can do is random.

Now, the core Jenkins hash mix function basically takes three inputs.
What jhash_3words does with it is:
	a += K
	b += K
	c += seed
	__jhash_mix(a, b, c)
	return c;

Now, the ipv4 hash operation fundamentally involves 96 bits of input,
which is a good match for jhash.  If you want to add a salt, perhaps the
simplest thing would be to just replace those constants K with a 96-bit
salt and be done with it:

	a = (lport<<16) + rport + salt[0];
	Xb = laddr + salt[1];
	c = raddr + salt[2];
	__jhash_mix(a,b,c)
	return c;

Regarding control by attackers, let's consider the four inputs and see
how much information an attacker can insert into each one:

remote port: An attacker has complete control over this.  16 bits.
remote address: Depends on the size of the bit-net.  Can vary from 0 bits
	(one machine) to 20 bits for a large bot-net.
local address: Limited to the number of addresses the local machine has.
	Typically 0 bits, rarely more than 2 bits.
	May be much larger (8 bits or more) for stateful firewalls and
	other sorts of proxies.
local port: Limited to the number of open server ports.  Typically 3-6
	bits, but may be lower on heavily firewalled machines.

Certainly combining any two of these in a predictable way without some
non-linear salting makes an attacker's job easier.  While folding the
local and remote addresses before hashing is usually safe because the
local address is usually unique, there are situations in which there are
a large number of possible local addresses.

For example, it allows an attacker with a /24 to attack, say, a stateful
firewall guarding a /24.  If I have my machine at address a.b.c.d connect
to remote machine x.y.z.~d, then they always fold to a^x.b^y.c^z.255,
and I can, by making the local and remote paorts identical for all the
attacks, force 254-entry hash chains without knowing anything else about
the hash function, salt, or whatever.

An interesting question is whether it's better to mix salt into the
bits an attacker has most or least control over.  It's not immediately
obvious to me; does anyone else have insight?  Just mixing in 96 bits
over everything does seem to render the question moot.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: RFC: Established connections hash function
  2007-03-24 12:26 linux
@ 2007-03-24 13:29 ` Evgeniy Polyakov
  0 siblings, 0 replies; 34+ messages in thread
From: Evgeniy Polyakov @ 2007-03-24 13:29 UTC (permalink / raw)
  To: linux; +Cc: johnpol.2ka.mipt.ru, netdev

On Sat, Mar 24, 2007 at 08:26:58AM -0400, linux@horizon.com (linux@horizon.com) wrote:
> > Result with jenkins:
> > 1 23880
> > 2 12108
> > 3 4040
> > 4 1019
> > 5 200
> > 6 30
> > 7 8
> > 8 1
> > 
> > Xor:
> > 1 65536
> 
> Precisely.  This means that the Xor hash SUCKS, because its output is conspicuously
> non-random.

XOR hash was specially designed for network environment, so it has the
ever best result (in this scenario, but it can be too easily controlled by attacker). 
It is not general purpose hash.

-- 
	Evgeniy Polyakov

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2007-03-29  9:19 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-03-22 15:39 RFC: Established connections hash function Nikolaos D. Bougalis
2007-03-22 15:52 ` Evgeniy Polyakov
2007-03-22 17:32   ` Nikolaos D. Bougalis
2007-03-22 18:21     ` Evgeniy Polyakov
2007-03-22 19:44       ` Nikolaos D. Bougalis
2007-03-22 19:56         ` Evgeniy Polyakov
2007-03-22 20:53           ` Nikolaos D. Bougalis
2007-03-23  7:52             ` Evgeniy Polyakov
2007-03-22 20:58         ` David Miller
2007-03-22 22:03           ` Eric Dumazet
2007-03-23  7:11             ` David Miller
2007-03-23  8:00               ` Eric Dumazet
2007-03-23 18:46                 ` David Miller
2007-03-23  8:07           ` Evgeniy Polyakov
2007-03-23  8:17             ` Eric Dumazet
2007-03-23  8:33               ` Evgeniy Polyakov
2007-03-23  9:10                 ` Evgeniy Polyakov
2007-03-23 11:58             ` XOR hash beauty solved [Was: RFC: Established connections hash function] Evgeniy Polyakov
2007-03-23 12:51               ` Nikolaos D. Bougalis
2007-03-23 12:45             ` RFC: Established connections hash function Nikolaos D. Bougalis
2007-03-27 14:11 ` Andi Kleen
2007-03-28  5:01   ` Nikolaos D. Bougalis
2007-03-28  6:29     ` David Miller
2007-03-28  9:29     ` Andi Kleen
2007-03-28 10:45       ` Evgeniy Polyakov
2007-03-28 14:14         ` Andi Kleen
2007-03-28 13:50           ` Eric Dumazet
2007-03-28 14:52             ` Andi Kleen
2007-03-29  9:18               ` Evgeniy Polyakov
2007-03-28 14:17           ` RFC: Established connections hash function II Andi Kleen
2007-03-28 19:04           ` RFC: Established connections hash function David Miller
2007-03-28 20:12             ` Andi Kleen
2007-03-24 12:26 linux
2007-03-24 13:29 ` Evgeniy Polyakov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.