All of lore.kernel.org
 help / color / mirror / Atom feed
* ipv4: Simplify ARP hash function.
@ 2011-07-08 17:10 David Miller
  2011-07-08 17:40 ` Martin Mares
  0 siblings, 1 reply; 21+ messages in thread
From: David Miller @ 2011-07-08 17:10 UTC (permalink / raw)
  To: netdev


Using Jenkins is over the top.

If the premise is that the hash_rnd is a random unpredictable key,
then:

	key ^ dev->ifindex ^ hash_rnd

results in an unpredictable hash result, even if an attacker
controls 'key' and 'dev->ifindex' completely.

Therefore, if this hash result is unpredictable, then the
final fold phase of:

	(val >> 8) ^ (val >> 16) ^ (val >> 24)

is unpredictable as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
---

Someone please check my logic :-) This sames ~100 cycles during a
neigh_lookup() on my Niagara2 box.

diff --git a/include/net/arp.h b/include/net/arp.h
index 91f0568..d570747 100644
--- a/include/net/arp.h
+++ b/include/net/arp.h
@@ -8,6 +8,13 @@
 
 extern struct neigh_table arp_tbl;
 
+static inline u32 arp_hashfn(u32 key, const struct net_device *dev, u32 hash_rnd)
+{
+	u32 val = key ^ dev->ifindex ^ hash_rnd;
+
+	return (val >> 8) ^ (val >> 16) ^ (val >> 24);
+}
+
 extern void	arp_init(void);
 extern int	arp_find(unsigned char *haddr, struct sk_buff *skb);
 extern int	arp_ioctl(struct net *net, unsigned int cmd, void __user *arg);
diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
index 1b74d3b..4412b57 100644
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -97,7 +97,6 @@
 #include <linux/init.h>
 #include <linux/net.h>
 #include <linux/rcupdate.h>
-#include <linux/jhash.h>
 #include <linux/slab.h>
 #ifdef CONFIG_SYSCTL
 #include <linux/sysctl.h>
@@ -232,7 +231,7 @@ static u32 arp_hash(const void *pkey,
 		    const struct net_device *dev,
 		    __u32 hash_rnd)
 {
-	return jhash_2words(*(u32 *)pkey, dev->ifindex, hash_rnd);
+	return arp_hashfn(*(u32 *)pkey, dev, hash_rnd);
 }
 
 static int arp_constructor(struct neighbour *neigh)

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: ipv4: Simplify ARP hash function.
  2011-07-08 17:10 ipv4: Simplify ARP hash function David Miller
@ 2011-07-08 17:40 ` Martin Mares
  2011-07-08 17:47   ` David Miller
  0 siblings, 1 reply; 21+ messages in thread
From: Martin Mares @ 2011-07-08 17:40 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

Hello!

> Using Jenkins is over the top.
> 
> If the premise is that the hash_rnd is a random unpredictable key,
> then:
> 
> 	key ^ dev->ifindex ^ hash_rnd
> 
> results in an unpredictable hash result, even if an attacker
> controls 'key' and 'dev->ifindex' completely.
> 
> Therefore, if this hash result is unpredictable, then the
> final fold phase of:
> 
> 	(val >> 8) ^ (val >> 16) ^ (val >> 24)
> 
> is unpredictable as well.

If I understand the new hash function correctly, it should be very easy
for an outside attacker to force arbitrary collisions.

The hash function is linear, so it can be reduced to:

	a = key ^ dev->ifindex
	return (a >> 8) ^ (a >> 16) ^ (a >> 24)				// (1)
	     ^ (hash_rnd >> 8) ^ (hash_rnd >> 16) ^ (hash_rnd >> 24)	// (2)

Where (1) is under control of the attacker and while (2) is not, the
only effect of (2) is a random permutation on the hash buckets.

I.e., the attacker can generate arbitrarily long collision chains,
although he cannot pick the bucket where the collisions happen :)

Am I right?

				Have a nice fortnight
-- 
Martin `MJ' Mares                          <mj@ucw.cz>   http://mj.ucw.cz/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
If going to a church makes you a Christian, does going to a garage make you a car?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: ipv4: Simplify ARP hash function.
  2011-07-08 17:40 ` Martin Mares
@ 2011-07-08 17:47   ` David Miller
  2011-07-08 17:54     ` David Miller
  2011-07-08 18:03     ` John Heffner
  0 siblings, 2 replies; 21+ messages in thread
From: David Miller @ 2011-07-08 17:47 UTC (permalink / raw)
  To: mj; +Cc: netdev

From: Martin Mares <mj@ucw.cz>
Date: Fri, 8 Jul 2011 19:40:55 +0200

> The hash function is linear, so it can be reduced to:
> 
> 	a = key ^ dev->ifindex
> 	return (a >> 8) ^ (a >> 16) ^ (a >> 24)				// (1)
> 	     ^ (hash_rnd >> 8) ^ (hash_rnd >> 16) ^ (hash_rnd >> 24)	// (2)

Is this really the same?  The inclusion of a full 32-bit xor
with hash_rnd before folding was intentional, so that the
final folding occurs on a completely "random" value.

> Where (1) is under control of the attacker and while (2) is not, the
> only effect of (2) is a random permutation on the hash buckets.
> 
> I.e., the attacker can generate arbitrarily long collision chains,
> although he cannot pick the bucket where the collisions happen :)
> 
> Am I right?

Please give an example :-)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: ipv4: Simplify ARP hash function.
  2011-07-08 17:47   ` David Miller
@ 2011-07-08 17:54     ` David Miller
  2011-07-08 18:03     ` John Heffner
  1 sibling, 0 replies; 21+ messages in thread
From: David Miller @ 2011-07-08 17:54 UTC (permalink / raw)
  To: mj; +Cc: netdev

From: David Miller <davem@davemloft.net>
Date: Fri, 08 Jul 2011 10:47:39 -0700 (PDT)

> From: Martin Mares <mj@ucw.cz>
> Date: Fri, 8 Jul 2011 19:40:55 +0200
> 
>> The hash function is linear, so it can be reduced to:
>> 
>> 	a = key ^ dev->ifindex
>> 	return (a >> 8) ^ (a >> 16) ^ (a >> 24)				// (1)
>> 	     ^ (hash_rnd >> 8) ^ (hash_rnd >> 16) ^ (hash_rnd >> 24)	// (2)
> 
> Is this really the same?  The inclusion of a full 32-bit xor
> with hash_rnd before folding was intentional, so that the
> final folding occurs on a completely "random" value.

For example, try out this test program.  Run as "./x ${RANDOM_VALUE}",
it shows that the attacker cannot simply just iterate by the number of
hash table slots to create collisions, assuming a hash table size of
256 slots:

--------------------
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char **argp)
{
	int i, rnd;

	rnd = atoi(argp[1]);
	for (i = 1; i < (64 * 1024); i += 256) {
		int x = (i ^ rnd);

		x ^= (x >> 8) ^ (x << 16) ^ (x >> 24);

		x &= 0xff;

		printf("%d\n", x);
	}
	return 0;
}


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: ipv4: Simplify ARP hash function.
  2011-07-08 17:47   ` David Miller
  2011-07-08 17:54     ` David Miller
@ 2011-07-08 18:03     ` John Heffner
  2011-07-08 18:06       ` David Miller
  1 sibling, 1 reply; 21+ messages in thread
From: John Heffner @ 2011-07-08 18:03 UTC (permalink / raw)
  To: David Miller; +Cc: mj, netdev

On Fri, Jul 8, 2011 at 1:47 PM, David Miller <davem@davemloft.net> wrote:
> From: Martin Mares <mj@ucw.cz>
> Date: Fri, 8 Jul 2011 19:40:55 +0200
>
>> The hash function is linear, so it can be reduced to:
>>
>>       a = key ^ dev->ifindex
>>       return (a >> 8) ^ (a >> 16) ^ (a >> 24)                         // (1)
>>            ^ (hash_rnd >> 8) ^ (hash_rnd >> 16) ^ (hash_rnd >> 24)    // (2)
>
> Is this really the same?  The inclusion of a full 32-bit xor
> with hash_rnd before folding was intentional, so that the
> final folding occurs on a completely "random" value.

Martin's reduction looks exactly correct to me.

  -John

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: ipv4: Simplify ARP hash function.
  2011-07-08 18:03     ` John Heffner
@ 2011-07-08 18:06       ` David Miller
  2011-07-08 19:26         ` Roland Dreier
  0 siblings, 1 reply; 21+ messages in thread
From: David Miller @ 2011-07-08 18:06 UTC (permalink / raw)
  To: johnwheffner; +Cc: mj, netdev

From: John Heffner <johnwheffner@gmail.com>
Date: Fri, 8 Jul 2011 14:03:45 -0400

> On Fri, Jul 8, 2011 at 1:47 PM, David Miller <davem@davemloft.net> wrote:
>> From: Martin Mares <mj@ucw.cz>
>> Date: Fri, 8 Jul 2011 19:40:55 +0200
>>
>>> The hash function is linear, so it can be reduced to:
>>>
>>>       a = key ^ dev->ifindex
>>>       return (a >> 8) ^ (a >> 16) ^ (a >> 24)                         // (1)
>>>            ^ (hash_rnd >> 8) ^ (hash_rnd >> 16) ^ (hash_rnd >> 24)    // (2)
>>
>> Is this really the same?  The inclusion of a full 32-bit xor
>> with hash_rnd before folding was intentional, so that the
>> final folding occurs on a completely "random" value.
> 
> Martin's reduction looks exactly correct to me.

Ok, there was also an unintended bug in my original patch,
I lost the bottom 8 bits in the fold, the hash function
should instead be:

+static inline u32 arp_hashfn(u32 key, const struct net_device *dev, u32 hash_rnd)
+{
+	u32 val = key ^ dev->ifindex ^ hash_rnd;
+
+	return val ^ (val >> 8) ^ (val >> 16) ^ (val >> 24);
+}

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: ipv4: Simplify ARP hash function.
  2011-07-08 18:06       ` David Miller
@ 2011-07-08 19:26         ` Roland Dreier
  2011-07-08 19:27           ` David Miller
  0 siblings, 1 reply; 21+ messages in thread
From: Roland Dreier @ 2011-07-08 19:26 UTC (permalink / raw)
  To: David Miller; +Cc: johnwheffner, mj, netdev

On Fri, Jul 8, 2011 at 11:06 AM, David Miller <davem@davemloft.net> wrote:
> Ok, there was also an unintended bug in my original patch,
> I lost the bottom 8 bits in the fold, the hash function
> should instead be:
>
> +static inline u32 arp_hashfn(u32 key, const struct net_device *dev, u32 hash_rnd)
> +{
> +       u32 val = key ^ dev->ifindex ^ hash_rnd;
> +
> +       return val ^ (val >> 8) ^ (val >> 16) ^ (val >> 24);
> +}

Doesn't seem to matter much -- this is now equivalent to

      a = key ^ dev->ifindex
       return (a ^ (a >> 8) ^ (a >> 16) ^ (a >> 24))           // (1)
            ^ (rnd ^ (rnd >> 8) ^ (rnd >> 16) ^ (rnd >> 24))   // (2)

where again the attacker controls (1), and (2) is a constant.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: ipv4: Simplify ARP hash function.
  2011-07-08 19:26         ` Roland Dreier
@ 2011-07-08 19:27           ` David Miller
  2011-07-08 19:39             ` Michał Mirosław
  2011-07-08 20:44             ` Roland Dreier
  0 siblings, 2 replies; 21+ messages in thread
From: David Miller @ 2011-07-08 19:27 UTC (permalink / raw)
  To: roland; +Cc: johnwheffner, mj, netdev

From: Roland Dreier <roland@purestorage.com>
Date: Fri, 8 Jul 2011 12:26:17 -0700

> On Fri, Jul 8, 2011 at 11:06 AM, David Miller <davem@davemloft.net> wrote:
>> Ok, there was also an unintended bug in my original patch,
>> I lost the bottom 8 bits in the fold, the hash function
>> should instead be:
>>
>> +static inline u32 arp_hashfn(u32 key, const struct net_device *dev, u32 hash_rnd)
>> +{
>> +       u32 val = key ^ dev->ifindex ^ hash_rnd;
>> +
>> +       return val ^ (val >> 8) ^ (val >> 16) ^ (val >> 24);
>> +}
> 
> Doesn't seem to matter much -- this is now equivalent to
> 
>       a = key ^ dev->ifindex
>        return (a ^ (a >> 8) ^ (a >> 16) ^ (a >> 24))           // (1)
>             ^ (rnd ^ (rnd >> 8) ^ (rnd >> 16) ^ (rnd >> 24))   // (2)
> 
> where again the attacker controls (1), and (2) is a constant.

Right, but how can you attack it?  Show me how you can grow
a hash chain of arbitrary length by modulating the key in
a deterministic way.

Nobody has done this yet.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: ipv4: Simplify ARP hash function.
  2011-07-08 19:27           ` David Miller
@ 2011-07-08 19:39             ` Michał Mirosław
  2011-07-08 19:51               ` David Miller
  2011-07-08 20:44             ` Roland Dreier
  1 sibling, 1 reply; 21+ messages in thread
From: Michał Mirosław @ 2011-07-08 19:39 UTC (permalink / raw)
  To: David Miller; +Cc: roland, johnwheffner, mj, netdev

2011/7/8 David Miller <davem@davemloft.net>:
> From: Roland Dreier <roland@purestorage.com>
> Date: Fri, 8 Jul 2011 12:26:17 -0700
>
>> On Fri, Jul 8, 2011 at 11:06 AM, David Miller <davem@davemloft.net> wrote:
>>> Ok, there was also an unintended bug in my original patch,
>>> I lost the bottom 8 bits in the fold, the hash function
>>> should instead be:
>>>
>>> +static inline u32 arp_hashfn(u32 key, const struct net_device *dev, u32 hash_rnd)
>>> +{
>>> +       u32 val = key ^ dev->ifindex ^ hash_rnd;
>>> +
>>> +       return val ^ (val >> 8) ^ (val >> 16) ^ (val >> 24);
>>> +}
>>
>> Doesn't seem to matter much -- this is now equivalent to
>>
>>       a = key ^ dev->ifindex
>>        return (a ^ (a >> 8) ^ (a >> 16) ^ (a >> 24))           // (1)
>>             ^ (rnd ^ (rnd >> 8) ^ (rnd >> 16) ^ (rnd >> 24))   // (2)
>>
>> where again the attacker controls (1), and (2) is a constant.
>
> Right, but how can you attack it?  Show me how you can grow
> a hash chain of arbitrary length by modulating the key in
> a deterministic way.

For 256 buckets its easy:

hash_index = b[0] ^ b[1] ^ b[2] ^ b[3];
(b[i] are bytes of the key)

With b[3] = b[0] ^ b[1] ^ b[2] you get 2^24 keys that hash to the same bucket.

Best Regards,
Michał Mirosław

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: ipv4: Simplify ARP hash function.
  2011-07-08 19:39             ` Michał Mirosław
@ 2011-07-08 19:51               ` David Miller
  2011-07-08 19:59                 ` David Miller
  0 siblings, 1 reply; 21+ messages in thread
From: David Miller @ 2011-07-08 19:51 UTC (permalink / raw)
  To: mirqus; +Cc: roland, johnwheffner, mj, netdev

From: Michał Mirosław <mirqus@gmail.com>
Date: Fri, 8 Jul 2011 21:39:18 +0200

> With b[3] = b[0] ^ b[1] ^ b[2] you get 2^24 keys that hash to the same bucket.

Ok, I'm convinced, thanks :-)

--------------------
#include <stdlib.h>
#include <stdio.h>

int hashfn(unsigned int key, unsigned int rnd)
{
	unsigned int x = key ^ rnd;

	x ^= (x >> 8) ^ (x >> 16) ^ (x >> 24);

	return x & 0xff;
}

int count[256];

unsigned int collide(unsigned int key)
{
	unsigned int b0 = key >> 24;
	unsigned int b1 = (key >> 16) & 0xff;
	unsigned int b2 = (key >> 8) & 0xff;

	key &= ~0xff;
	key |= (b0 ^ b1 ^ b2);

	return key;
}

int main(int argc, char **argp)
{
	unsigned int rnd = atoi(argp[1]);
	unsigned int i;

	for (i = 0; i < (64 * 1024); i++) {
		unsigned int key = i << 8;
		unsigned int hash;

		key = collide(key);
		hash = hashfn(key, rnd);
		printf("%u: %u\n", key, hash);
		count[hash]++;
	}
	for (i = 0; i < 256; i++)
		printf("COUNT[%3u]=%3u\n", i, count[i]);
	return 0;
}

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: ipv4: Simplify ARP hash function.
  2011-07-08 19:51               ` David Miller
@ 2011-07-08 19:59                 ` David Miller
  2011-07-08 20:10                   ` Michał Mirosław
  0 siblings, 1 reply; 21+ messages in thread
From: David Miller @ 2011-07-08 19:59 UTC (permalink / raw)
  To: mirqus; +Cc: roland, johnwheffner, mj, netdev

From: David Miller <davem@davemloft.net>
Date: Fri, 08 Jul 2011 12:51:18 -0700 (PDT)

> From: Michał Mirosław <mirqus@gmail.com>
> Date: Fri, 8 Jul 2011 21:39:18 +0200
> 
>> With b[3] = b[0] ^ b[1] ^ b[2] you get 2^24 keys that hash to the same bucket.
> 
> Ok, I'm convinced, thanks :-)

Although, actually it's not this simple.  The attack doesn't work.

As they "attack" us, the ARP hash table grows and thus the hash mask
changes to match.  Then his old collisions won't collide any more.

We could even adjust the fold shifts as the table grows to make this
effect even more pronounced.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: ipv4: Simplify ARP hash function.
  2011-07-08 19:59                 ` David Miller
@ 2011-07-08 20:10                   ` Michał Mirosław
  2011-07-08 20:34                     ` Michał Mirosław
  0 siblings, 1 reply; 21+ messages in thread
From: Michał Mirosław @ 2011-07-08 20:10 UTC (permalink / raw)
  To: David Miller; +Cc: roland, johnwheffner, mj, netdev

2011/7/8 David Miller <davem@davemloft.net>:
> From: David Miller <davem@davemloft.net>
> Date: Fri, 08 Jul 2011 12:51:18 -0700 (PDT)
>
>> From: Michał Mirosław <mirqus@gmail.com>
>> Date: Fri, 8 Jul 2011 21:39:18 +0200
>>
>>> With b[3] = b[0] ^ b[1] ^ b[2] you get 2^24 keys that hash to the same bucket.
>>
>> Ok, I'm convinced, thanks :-)
>
> Although, actually it's not this simple.  The attack doesn't work.
>
> As they "attack" us, the ARP hash table grows and thus the hash mask
> changes to match.  Then his old collisions won't collide any more.
>
> We could even adjust the fold shifts as the table grows to make this
> effect even more pronounced.

There will still be 2^32/n_buckets known values that hash to the same
bucket for every n_buckets. So if the attacker knows when and how the
hash size changes, he can adapt accordingly. It should be easier to
see when you get rid of the XOR [random, but] constant part.

Best Regards,
Michał Mirosław

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: ipv4: Simplify ARP hash function.
  2011-07-08 20:10                   ` Michał Mirosław
@ 2011-07-08 20:34                     ` Michał Mirosław
  2011-07-08 20:35                       ` David Miller
  0 siblings, 1 reply; 21+ messages in thread
From: Michał Mirosław @ 2011-07-08 20:34 UTC (permalink / raw)
  To: David Miller; +Cc: roland, johnwheffner, mj, netdev

W dniu 8 lipca 2011 22:10 użytkownik Michał Mirosław <mirqus@gmail.com> napisał:
> 2011/7/8 David Miller <davem@davemloft.net>:
>> From: David Miller <davem@davemloft.net>
>> Date: Fri, 08 Jul 2011 12:51:18 -0700 (PDT)
>>
>>> From: Michał Mirosław <mirqus@gmail.com>
>>> Date: Fri, 8 Jul 2011 21:39:18 +0200
>>>
>>>> With b[3] = b[0] ^ b[1] ^ b[2] you get 2^24 keys that hash to the same bucket.
>>>
>>> Ok, I'm convinced, thanks :-)
>>
>> Although, actually it's not this simple.  The attack doesn't work.
>>
>> As they "attack" us, the ARP hash table grows and thus the hash mask
>> changes to match.  Then his old collisions won't collide any more.
>>
>> We could even adjust the fold shifts as the table grows to make this
>> effect even more pronounced.
>
> There will still be 2^32/n_buckets known values that hash to the same
> bucket for every n_buckets. So if the attacker knows when and how the
> hash size changes, he can adapt accordingly. It should be easier to
> see when you get rid of the XOR [random, but] constant part.

BTW, am I correct, that neighbour hash tables never shrink? Looking at
net/core/neighbour.c it seems that after the table reaches gc_thresh3
capacity, it is never reallocated again.

Best Regards,
Michał Mirosław

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: ipv4: Simplify ARP hash function.
  2011-07-08 20:34                     ` Michał Mirosław
@ 2011-07-08 20:35                       ` David Miller
  0 siblings, 0 replies; 21+ messages in thread
From: David Miller @ 2011-07-08 20:35 UTC (permalink / raw)
  To: mirqus; +Cc: roland, johnwheffner, mj, netdev

From: Michał Mirosław <mirqus@gmail.com>
Date: Fri, 8 Jul 2011 22:34:22 +0200

> BTW, am I correct, that neighbour hash tables never shrink? Looking at
> net/core/neighbour.c it seems that after the table reaches gc_thresh3
> capacity, it is never reallocated again.

Currently, yes, but I plan to remove that limit.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: ipv4: Simplify ARP hash function.
  2011-07-08 19:27           ` David Miller
  2011-07-08 19:39             ` Michał Mirosław
@ 2011-07-08 20:44             ` Roland Dreier
  2011-07-08 22:32               ` David Miller
  1 sibling, 1 reply; 21+ messages in thread
From: Roland Dreier @ 2011-07-08 20:44 UTC (permalink / raw)
  To: David Miller; +Cc: johnwheffner, mj, netdev

>> Doesn't seem to matter much -- this is now equivalent to
>>
>>       a = key ^ dev->ifindex
>>        return (a ^ (a >> 8) ^ (a >> 16) ^ (a >> 24))           // (1)
>>             ^ (rnd ^ (rnd >> 8) ^ (rnd >> 16) ^ (rnd >> 24))   // (2)
>>
>> where again the attacker controls (1), and (2) is a constant.

> Right, but how can you attack it?  Show me how you can grow
> a hash chain of arbitrary length by modulating the key in
> a deterministic way.

Well, if two things hash to different buckets with the full hash
function, then they already hashed to different buckets without
the extra randomness.  So why bother with hash_rnd?

The answer is that you have to mix hash_rnd into the hash
in a nonlinear way, so that an attacker can't know if two values
end up in the same bucket or not.

With your hash function, the attacker can just compute the
hash (without hash_rnd) for all the values of key ^ ifindex
and then use all the values that end up in the same bucket.

 - R.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: ipv4: Simplify ARP hash function.
  2011-07-08 20:44             ` Roland Dreier
@ 2011-07-08 22:32               ` David Miller
  2011-07-08 23:11                 ` Roland Dreier
  2011-07-08 23:41                 ` Stephen Hemminger
  0 siblings, 2 replies; 21+ messages in thread
From: David Miller @ 2011-07-08 22:32 UTC (permalink / raw)
  To: roland; +Cc: johnwheffner, mj, netdev

From: Roland Dreier <roland@purestorage.com>
Date: Fri, 8 Jul 2011 13:44:42 -0700

> The answer is that you have to mix hash_rnd into the hash
> in a nonlinear way, so that an attacker can't know if two values
> end up in the same bucket or not.
> 
> With your hash function, the attacker can just compute the
> hash (without hash_rnd) for all the values of key ^ ifindex
> and then use all the values that end up in the same bucket.

Ok, thanks everyone for explaining things.

So what is the cheapest non-linear function we could use?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: ipv4: Simplify ARP hash function.
  2011-07-08 22:32               ` David Miller
@ 2011-07-08 23:11                 ` Roland Dreier
  2011-07-10 19:07                   ` David Miller
  2011-07-08 23:41                 ` Stephen Hemminger
  1 sibling, 1 reply; 21+ messages in thread
From: Roland Dreier @ 2011-07-08 23:11 UTC (permalink / raw)
  To: David Miller; +Cc: johnwheffner, mj, netdev

On Fri, Jul 8, 2011 at 3:32 PM, David Miller <davem@davemloft.net> wrote:
> So what is the cheapest non-linear function we could use?

I'm not comfortable giving cryptographic advice, but even + (addition
with carry) is nonlinear when combined with ^.  However that seems
like the low-order bits might be too predictable.

Maybe * of hash key with a random odd value is good enough?

 - R.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: ipv4: Simplify ARP hash function.
  2011-07-08 22:32               ` David Miller
  2011-07-08 23:11                 ` Roland Dreier
@ 2011-07-08 23:41                 ` Stephen Hemminger
  2011-07-08 23:47                   ` David Miller
  1 sibling, 1 reply; 21+ messages in thread
From: Stephen Hemminger @ 2011-07-08 23:41 UTC (permalink / raw)
  To: David Miller; +Cc: roland, johnwheffner, mj, netdev

What about using murmur hash which has a four byte pass as well.
  https://sites.google.com/site/murmurhash/
---
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

#define u32 uint32_t

/* Do a one pass murmurhash2 */
static u32 arp_hashfn(u32 key, int ifindex, u32 hash_rnd)
{
	/* murmurhash mixiing constants */
	const unsigned int m = 0x5bd1e995;
	const int r = 24;

	/* Initialize the hash to a 'random' value  */
	unsigned int h = ifindex ^ hash_rnd;
	unsigned int k = key;

	k *= m; 
	k ^= k >> r; 
	k *= m; 
		
	h *= m; 
	h ^= k;

	/* Do a few final mixes of the hash to ensure the last few
	 * bytes are well-incorporated.
	 */

	h ^= h >> 13;
	h *= m;
	h ^= h >> 15;

	return h;
} 

int main(int argc, char **argv)
{
	u32 rnd, key, hash;
	int ifindex;

	key = atoi(argv[1]);
	ifindex = atoi(argv[2]);
	rnd = atoi(argv[3]);

	hash = arp_hashfn(key, ifindex, rnd);
	printf("%u, %d, %u => %u\n", key, ifindex, rnd, hash);
	return 0;
}

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: ipv4: Simplify ARP hash function.
  2011-07-08 23:41                 ` Stephen Hemminger
@ 2011-07-08 23:47                   ` David Miller
  2011-07-09  3:08                     ` Stephen Hemminger
  0 siblings, 1 reply; 21+ messages in thread
From: David Miller @ 2011-07-08 23:47 UTC (permalink / raw)
  To: shemminger; +Cc: roland, johnwheffner, mj, netdev

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Fri, 8 Jul 2011 16:41:28 -0700

> What about using murmur hash which has a four byte pass as well.
>   https://sites.google.com/site/murmurhash/

I'm trying to avoid multiplies that are not done in hardware on some
cpus.

Right now I'm looking at one of Thomas Wang's hashes, referenced on
Bob Jenkin's hash analysis page:

u32 hashint(u32 a)
{
	a += ~(a<<15);
	a ^=  (a>>10);
	a +=  (a<<3);
	a ^=  (a>>6);
	a += ~(a<<11);
	a ^=  (a>>16);

	return a;
}

It's 15 instructions, and produces better entropy in the low bits of
the result than the high bits, which is fine for how we'll use this
thing.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: ipv4: Simplify ARP hash function.
  2011-07-08 23:47                   ` David Miller
@ 2011-07-09  3:08                     ` Stephen Hemminger
  0 siblings, 0 replies; 21+ messages in thread
From: Stephen Hemminger @ 2011-07-09  3:08 UTC (permalink / raw)
  To: David Miller; +Cc: roland, johnwheffner, mj, netdev

On Fri, 08 Jul 2011 16:47:51 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: Stephen Hemminger <shemminger@vyatta.com>
> Date: Fri, 8 Jul 2011 16:41:28 -0700
> 
> > What about using murmur hash which has a four byte pass as well.
> >   https://sites.google.com/site/murmurhash/
> 
> I'm trying to avoid multiplies that are not done in hardware on some
> cpus.
> 
> Right now I'm looking at one of Thomas Wang's hashes, referenced on
> Bob Jenkin's hash analysis page:
> 
> u32 hashint(u32 a)
> {
> 	a += ~(a<<15);
> 	a ^=  (a>>10);
> 	a +=  (a<<3);
> 	a ^=  (a>>6);
> 	a += ~(a<<11);
> 	a ^=  (a>>16);
> 
> 	return a;
> }
> 
> It's 15 instructions, and produces better entropy in the low bits of
> the result than the high bits, which is fine for how we'll use this
> thing.

Ok. but you really have sell those Sparc's while they are still
worth something on Ebay :-)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: ipv4: Simplify ARP hash function.
  2011-07-08 23:11                 ` Roland Dreier
@ 2011-07-10 19:07                   ` David Miller
  0 siblings, 0 replies; 21+ messages in thread
From: David Miller @ 2011-07-10 19:07 UTC (permalink / raw)
  To: roland; +Cc: johnwheffner, mj, netdev

From: Roland Dreier <roland@purestorage.com>
Date: Fri, 8 Jul 2011 16:11:00 -0700

> Maybe * of hash key with a random odd value is good enough?

Yes, from what I've read over the past few days it should
be.  More precisely:

	(key * hash_rnd) >> (32 - hash_table_size_log2)

where "hash_rnd" is odd.

The reason we want the top bits is because multiplies intrinsically
work such that bits in the inputs can only effect the same or higher
bits in the result.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2011-07-10 19:07 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-07-08 17:10 ipv4: Simplify ARP hash function David Miller
2011-07-08 17:40 ` Martin Mares
2011-07-08 17:47   ` David Miller
2011-07-08 17:54     ` David Miller
2011-07-08 18:03     ` John Heffner
2011-07-08 18:06       ` David Miller
2011-07-08 19:26         ` Roland Dreier
2011-07-08 19:27           ` David Miller
2011-07-08 19:39             ` Michał Mirosław
2011-07-08 19:51               ` David Miller
2011-07-08 19:59                 ` David Miller
2011-07-08 20:10                   ` Michał Mirosław
2011-07-08 20:34                     ` Michał Mirosław
2011-07-08 20:35                       ` David Miller
2011-07-08 20:44             ` Roland Dreier
2011-07-08 22:32               ` David Miller
2011-07-08 23:11                 ` Roland Dreier
2011-07-10 19:07                   ` David Miller
2011-07-08 23:41                 ` Stephen Hemminger
2011-07-08 23:47                   ` David Miller
2011-07-09  3:08                     ` Stephen Hemminger

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.