* [PATCH] rhashtable: don't attempt to grow when at max_size
@ 2015-04-23 14:38 Johannes Berg
  2015-04-23 15:59 ` David Miller
  2015-04-23 20:46 ` [PATCH] rhashtable: don't attempt to grow when at max_size Thomas Graf
  0 siblings, 2 replies; 26+ messages in thread
From: Johannes Berg @ 2015-04-23 14:38 UTC (permalink / raw)
  To: netdev; +Cc: Patrick McHardy, Thomas Graf, Johannes Berg

From: Johannes Berg <johannes.berg@intel.com>

The conversion of mac80211's station table to rhashtable had a bug
that I found by accident in code review; it hadn't been caught
earlier because rhashtable apparently managed to keep a maximum
hash chain length of one (!) in all our testing.

To test the bug and verify the fix, I set my rhashtable's max_size
very low (4) to force hash collisions.

At that point, rhashtable WARNed in rhashtable_insert_rehash() but
didn't actually reject the hash table insertion. This caused it to
lose insertions - my master list of stations would have 9 entries,
but the rhashtable only had 5. This may warrant a deeper look, but
that WARN_ON() just shouldn't happen.

Fix this by not returning true from rht_grow_above_100() when the
rhashtable's max_size has been reached - in this case the user is
explicitly configuring it to be at most that big, so even if it's
now above 100% it shouldn't attempt to resize.

This fixes the "lost insertion" issue and consequently allows my
code to display its error (and verify my fix for it.)

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
 include/linux/rhashtable.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index e23d242d1230..dbcbcc59aa92 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -282,7 +282,8 @@ static inline bool rht_shrink_below_30(const struct rhashtable *ht,
 static inline bool rht_grow_above_100(const struct rhashtable *ht,
 				      const struct bucket_table *tbl)
 {
-	return atomic_read(&ht->nelems) > tbl->size;
+	return atomic_read(&ht->nelems) > tbl->size &&
+		(!ht->p.max_size || tbl->size < ht->p.max_size);
 }
 
 /* The bucket lock is selected based on the hash and protects mutations
-- 
2.1.4
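
For illustration, a minimal userspace model of the patched check -
simplified, with hypothetical type and field names, not the kernel
code itself:

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel structures. */
struct params { unsigned int max_size; };
struct table  { unsigned int size; };
struct ht     { struct params p; unsigned int nelems; };

/* Mirrors the patched rht_grow_above_100(): growth is requested only
 * when the table is above 100% utilisation AND the bucket array has
 * not yet reached the configured max_size. */
static bool grow_above_100(const struct ht *ht, const struct table *tbl)
{
	return ht->nelems > tbl->size &&
	       (!ht->p.max_size || tbl->size < ht->p.max_size);
}

int main(void)
{
	struct ht ht = { .p = { .max_size = 4 }, .nelems = 9 };
	struct table tbl = { .size = 4 };

	/* Before the fix the predicate ignored max_size, so a table
	 * pinned at max_size kept requesting growth (and WARNed in
	 * rhashtable_insert_rehash()); now it returns false. */
	printf("grow? %s\n", grow_above_100(&ht, &tbl) ? "yes" : "no");
	return 0;
}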


* Re: [PATCH] rhashtable: don't attempt to grow when at max_size
  2015-04-23 14:38 [PATCH] rhashtable: don't attempt to grow when at max_size Johannes Berg
@ 2015-04-23 15:59 ` David Miller
  2015-04-23 16:09   ` Johannes Berg
  2015-04-24  0:57   ` rhashtable: Add cap on number of elements in hash table Herbert Xu
  2015-04-23 20:46 ` [PATCH] rhashtable: don't attempt to grow when at max_size Thomas Graf
  1 sibling, 2 replies; 26+ messages in thread
From: David Miller @ 2015-04-23 15:59 UTC (permalink / raw)
  To: johannes; +Cc: netdev, kaber, tgraf, johannes.berg, herbert

From: Johannes Berg <johannes@sipsolutions.net>
Date: Thu, 23 Apr 2015 16:38:43 +0200

> From: Johannes Berg <johannes.berg@intel.com>
> 
> The conversion of mac80211's station table to rhashtable had a bug
> that I found by accident in code review; it hadn't been caught
> earlier because rhashtable apparently managed to keep a maximum
> hash chain length of one (!) in all our testing.
> 
> To test the bug and verify the fix, I set my rhashtable's max_size
> very low (4) to force hash collisions.
> 
> At that point, rhashtable WARNed in rhashtable_insert_rehash() but
> didn't actually reject the hash table insertion. This caused it to
> lose insertions - my master list of stations would have 9 entries,
> but the rhashtable only had 5. This may warrant a deeper look, but
> that WARN_ON() just shouldn't happen.
> 
> Fix this by not returning true from rht_grow_above_100() when the
> rhashtable's max_size has been reached - in this case the user is
> explicitly configuring it to be at most that big, so even if it's
> now above 100% it shouldn't attempt to resize.
> 
> This fixes the "lost insertion" issue and consequently allows my
> code to display its error (and verify my fix for it.)
> 
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>

It looks fine to me, but I'll let Herbert and Thomas review this.


* Re: [PATCH] rhashtable: don't attempt to grow when at max_size
  2015-04-23 15:59 ` David Miller
@ 2015-04-23 16:09   ` Johannes Berg
  2015-04-23 16:16     ` Daniel Borkmann
  2015-04-24  0:57   ` rhashtable: Add cap on number of elements in hash table Herbert Xu
  1 sibling, 1 reply; 26+ messages in thread
From: Johannes Berg @ 2015-04-23 16:09 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, kaber, tgraf, herbert

On Thu, 2015-04-23 at 11:59 -0400, David Miller wrote:

> > This fixes the "lost insertion" issue and consequently allows my
> > code to display its error (and verify my fix for it.)

> It looks fine to me, but I'll let Herbert and Thomas review this.

Oh, sorry, didn't know Herbert was also involved with this.

Anyway, that's a good idea. I really went for "this looks obvious" and
patched it, and it started working for me thereafter. I don't know
anything about side effects here.

johannes


* Re: [PATCH] rhashtable: don't attempt to grow when at max_size
  2015-04-23 16:09   ` Johannes Berg
@ 2015-04-23 16:16     ` Daniel Borkmann
  0 siblings, 0 replies; 26+ messages in thread
From: Daniel Borkmann @ 2015-04-23 16:16 UTC (permalink / raw)
  To: Johannes Berg, David Miller; +Cc: netdev, kaber, tgraf, herbert

On 04/23/2015 06:09 PM, Johannes Berg wrote:
> On Thu, 2015-04-23 at 11:59 -0400, David Miller wrote:
>
>>> This fixes the "lost insertion" issue and consequently allows my
>>> code to display its error (and verify my fix for it.)
>
>> It looks fine to me, but I'll let Herbert and Thomas review this.
>
> Oh, sorry, didn't know Herbert was also involved with this.

Would be good to add Fixes tags, commit in question seems to be:
ccd57b1bd324 ("rhashtable: Add immediate rehash during insertion")
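
That is, the resulting tag line in the commit message would be:

  Fixes: ccd57b1bd324 ("rhashtable: Add immediate rehash during insertion")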

> Anyway, that's a good idea. I really went for "this looks obvious" and
> patched it, and it started working for me thereafter. I don't know
> anything about side effects here.


* Re: [PATCH] rhashtable: don't attempt to grow when at max_size
  2015-04-23 14:38 [PATCH] rhashtable: don't attempt to grow when at max_size Johannes Berg
  2015-04-23 15:59 ` David Miller
@ 2015-04-23 20:46 ` Thomas Graf
  2015-04-23 20:49   ` Johannes Berg
  1 sibling, 1 reply; 26+ messages in thread
From: Thomas Graf @ 2015-04-23 20:46 UTC (permalink / raw)
  To: Johannes Berg; +Cc: netdev, Patrick McHardy, Johannes Berg

On 04/23/15 at 04:38pm, Johannes Berg wrote:
> From: Johannes Berg <johannes.berg@intel.com>
> 
> The conversion of mac80211's station table to rhashtable had a bug
> that I found by accident in code review; it hadn't been caught
> earlier because rhashtable apparently managed to keep a maximum
> hash chain length of one (!) in all our testing.

This is the desired chain length ;-)

> To test the bug and verify the fix, I set my rhashtable's max_size
> very low (4) to force hash collisions.
> 
> At that point, rhashtable WARNed in rhashtable_insert_rehash() but
> didn't actually reject the hash table insertion. This caused it to
> lose insertions - my master list of stations would have 9 entries,
> but the rhashtable only had 5. This may warrant a deeper look, but
> that WARN_ON() just shouldn't happen.

The warning got fixed recently (51bb8e331b) and
rhashtable_insert_rehash() now only allows a single rehash if at
max_size already. It will now return -EBUSY.

Insertions may still fail while the table is above 100% utilization
so this fix is absolutely needed though.

> Fix this by not returning true from rht_grow_above_100() when the
> rhashtable's max_size has been reached - in this case the user is
> explicitly configuring it to be at most that big, so even if it's
> now above 100% it shouldn't attempt to resize.

Good catch. I wonder whether we want to trigger a periodic rehash
in this situation or just leave it up to the user to set up a
timer himself.

> This fixes the "lost insertion" issue and consequently allows my
> code to display its error (and verify my fix for it.)
> 
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>

Acked-by: Thomas Graf <tgraf@suug.ch>


* Re: [PATCH] rhashtable: don't attempt to grow when at max_size
  2015-04-23 20:46 ` [PATCH] rhashtable: don't attempt to grow when at max_size Thomas Graf
@ 2015-04-23 20:49   ` Johannes Berg
  0 siblings, 0 replies; 26+ messages in thread
From: Johannes Berg @ 2015-04-23 20:49 UTC (permalink / raw)
  To: Thomas Graf; +Cc: netdev, Patrick McHardy

On Thu, 2015-04-23 at 21:46 +0100, Thomas Graf wrote:
> On 04/23/15 at 04:38pm, Johannes Berg wrote:
> > From: Johannes Berg <johannes.berg@intel.com>
> > 
> > The conversion of mac80211's station table to rhashtable had a bug
> > that I found by accident in code review; it hadn't been caught
> > earlier because rhashtable apparently managed to keep a maximum
> > hash chain length of one (!) in all our testing.
> 
> This is the desired chain length ;-)

Sure. But I had a bug in my handling of collisions, so I explicitly
wanted to test them. After all, they are in some ways expected in a hash
table :)

> > At that point, rhashtable WARNed in rhashtable_insert_rehash() but
> > didn't actually reject the hash table insertion. This caused it to
> > lose insertions - my master list of stations would have 9 entries,
> > but the rhashtable only had 5. This may warrant a deeper look, but
> > that WARN_ON() just shouldn't happen.
> 
> The warning got fixed recently (51bb8e331b) and
> rhashtable_insert_rehash() now only allows a single rehash if at
> max_size already. It will now return -EBUSY.
> 
> Insertions may still fail while the table is above 100% utilization
> so this fix is absolutely needed though.

Yeah just failing would be a bit strange.

> > Fix this by not returning true from rht_grow_above_100() when the
> > rhashtable's max_size has been reached - in this case the user is
> > explicitly configuring it to be at most that big, so even if it's
> > now above 100% it shouldn't attempt to resize.
> 
> Good catch. I wonder whether we want to trigger a periodic rehash
> in this situation or just leave it up to the user to set up a
> timer himself.

You could just document that it's probably useful if max_size is set?
I'm just going to be setting max_size for debug purposes, so don't
really care all that much.

johannes


* rhashtable: Add cap on number of elements in hash table
  2015-04-23 15:59 ` David Miller
  2015-04-23 16:09   ` Johannes Berg
@ 2015-04-24  0:57   ` Herbert Xu
  2015-04-24  7:01     ` Johannes Berg
                       ` (2 more replies)
  1 sibling, 3 replies; 26+ messages in thread
From: Herbert Xu @ 2015-04-24  0:57 UTC (permalink / raw)
  To: David Miller; +Cc: johannes, netdev, kaber, tgraf, johannes.berg

On Thu, Apr 23, 2015 at 11:59:19AM -0400, David Miller wrote:
> From: Johannes Berg <johannes@sipsolutions.net>
> Date: Thu, 23 Apr 2015 16:38:43 +0200
> 
> > From: Johannes Berg <johannes.berg@intel.com>
> > 
> > The conversion of mac80211's station table to rhashtable had a bug
> > that I found by accident in code review; it hadn't been caught
> > earlier because rhashtable apparently managed to keep a maximum
> > hash chain length of one (!) in all our testing.
> > 
> > To test the bug and verify the fix, I set my rhashtable's max_size
> > very low (4) to force hash collisions.
> > 
> > At that point, rhashtable WARNed in rhashtable_insert_rehash() but
> > didn't actually reject the hash table insertion. This caused it to
> > lose insertions - my master list of stations would have 9 entries,
> > but the rhashtable only had 5. This may warrant a deeper look, but
> > that WARN_ON() just shouldn't happen.
> > 
> > Fix this by not returning true from rht_grow_above_100() when the
> > rhashtable's max_size has been reached - in this case the user is
> > explicitly configuring it to be at most that big, so even if it's
> > now above 100% it shouldn't attempt to resize.
> > 
> > This fixes the "lost insertion" issue and consequently allows my
> > code to display its error (and verify my fix for it.)
> > 
> > Signed-off-by: Johannes Berg <johannes.berg@intel.com>
> 
> It looks fine to me, but I'll let Herbert and Thomas review this.

Thanks for the heads up.

It seems that I lost track somewhere along the line.  I meant
to add an explicit limit on the overall number of entries since
that was what users like netlink expected but never got around
to doing it.  Instead it seems that we're currently relying on
the rht_grow_above_100 to protect us.

So here is a patch that adds an explicit limit and fixes the
problem Johannes reported.

---8<---
We currently have no limit on the number of elements in a hash table.
This is very bad especially considering that some rhashtable users
had such a limit before the conversion and relied on it for defence
against DoS attacks.

We already have a maximum hash table size limit but its enforcement
is only by luck and results in a nasty WARN_ON.

This patch adds a new paramater insecure_max_entries which becomes
the cap on the table.  If unset it defaults to max_size.  If it is
also zero it means that there is no cap on the number of elements
in the table.  However, the table will grow whenever the utilisation
hits 100% and if that growth fails, you will get ENOMEM on insertion.

As allowing >100% utilisation is potentially dangerous, the name
contains the word insecure.

Note that the cap is not a hard limit.  This is done for performance
reasons as enforcing a hard limit will result in use of atomic ops
that are heavier than the ones we currently use.

The reasoning is that we're only guarding against a gross over-
subscription of the table, rather than a small breach of the limit.

Reported-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index e23d242..95b5a36 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -17,6 +17,7 @@
 #ifndef _LINUX_RHASHTABLE_H
 #define _LINUX_RHASHTABLE_H
 
+#include <linux/atomic.h>
 #include <linux/compiler.h>
 #include <linux/errno.h>
 #include <linux/jhash.h>
@@ -100,6 +101,7 @@ struct rhashtable;
  * @key_len: Length of key
  * @key_offset: Offset of key in struct to be hashed
  * @head_offset: Offset of rhash_head in struct to be hashed
+ * @insecure_max_entries: Maximum number of entries (may be exceeded)
  * @max_size: Maximum size while expanding
  * @min_size: Minimum size while shrinking
  * @nulls_base: Base value to generate nulls marker
@@ -115,6 +117,7 @@ struct rhashtable_params {
 	size_t			key_len;
 	size_t			key_offset;
 	size_t			head_offset;
+	unsigned int		insecure_max_entries;
 	unsigned int		max_size;
 	unsigned int		min_size;
 	u32			nulls_base;
@@ -282,7 +285,20 @@ static inline bool rht_shrink_below_30(const struct rhashtable *ht,
 static inline bool rht_grow_above_100(const struct rhashtable *ht,
 				      const struct bucket_table *tbl)
 {
-	return atomic_read(&ht->nelems) > tbl->size;
+	return atomic_read(&ht->nelems) > tbl->size &&
+	       (!ht->p.max_size || tbl->size < ht->p.max_size);
+}
+
+/**
+ * rht_grow_above_max - returns true if table is above maximum
+ * @ht:		hash table
+ * @tbl:	current table
+ */
+static inline bool rht_grow_above_max(const struct rhashtable *ht,
+				      const struct bucket_table *tbl)
+{
+	return ht->p.insecure_max_entries &&
+	       atomic_read(&ht->nelems) >= ht->p.insecure_max_entries;
 }
 
 /* The bucket lock is selected based on the hash and protects mutations
@@ -611,6 +627,10 @@ slow_path:
 			goto slow_path;
 	}
 
+	err = -E2BIG;
+	if (unlikely(rht_grow_above_max(ht, tbl)))
+		goto out;
+
 	err = 0;
 
 	head = rht_dereference_bucket(tbl->buckets[hash], tbl, hash);
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 4898442..6716841 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -14,6 +14,7 @@
  * published by the Free Software Foundation.
  */
 
+#include <linux/atomic.h>
 #include <linux/kernel.h>
 #include <linux/init.h>
 #include <linux/log2.h>
@@ -446,6 +447,10 @@ int rhashtable_insert_slow(struct rhashtable *ht, const void *key,
 	    rht_grow_above_100(ht, tbl))
 		goto exit;
 
+	err = -E2BIG;
+	if (unlikely(rht_grow_above_max(ht, tbl)))
+		goto exit;
+
 	err = 0;
 
 	head = rht_dereference_bucket(tbl->buckets[hash], tbl, hash);
@@ -733,6 +738,12 @@ int rhashtable_init(struct rhashtable *ht,
 	if (params->max_size)
 		ht->p.max_size = rounddown_pow_of_two(params->max_size);
 
+	if (params->insecure_max_entries)
+		ht->p.insecure_max_entries =
+			rounddown_pow_of_two(params->insecure_max_entries);
+	else
+		ht->p.insecure_max_entries = ht->p.max_size;
+
 	ht->p.min_size = max(ht->p.min_size, HASH_MIN_SIZE);
 
 	/* The maximum (not average) chain length grows with the
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
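
To sketch how a caller might use the new parameter (the object, the
offsets and the chosen values here are hypothetical; only
insecure_max_entries and the -E2BIG handling come from the patch
above):

#include <linux/kernel.h>
#include <linux/rhashtable.h>
#include <linux/types.h>

/* Hypothetical object: cap the table at 1024 entries while the
 * bucket array itself stops growing at 512 buckets. */
struct my_obj {
	u32 key;
	struct rhash_head node;
};

static const struct rhashtable_params my_params = {
	.key_len	= sizeof(u32),
	.key_offset	= offsetof(struct my_obj, key),
	.head_offset	= offsetof(struct my_obj, node),
	.max_size	= 512,
	.insecure_max_entries = 1024,
};

static int my_add(struct rhashtable *ht, struct my_obj *obj)
{
	int err = rhashtable_insert_fast(ht, &obj->node, my_params);

	/* With this patch, hitting the cap yields -E2BIG rather than
	 * a WARN_ON and a silently lost insertion. */
	if (err == -E2BIG)
		pr_debug("table at capacity, insert rejected\n");
	return err;
}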


* Re: rhashtable: Add cap on number of elements in hash table
  2015-04-24  0:57   ` rhashtable: Add cap on number of elements in hash table Herbert Xu
@ 2015-04-24  7:01     ` Johannes Berg
  2015-04-24  8:04       ` Herbert Xu
  2015-04-24  8:06     ` Thomas Graf
  2015-05-13  8:06     ` Herbert Xu
  2 siblings, 1 reply; 26+ messages in thread
From: Johannes Berg @ 2015-04-24  7:01 UTC (permalink / raw)
  To: Herbert Xu; +Cc: David Miller, netdev, kaber, tgraf

On Fri, 2015-04-24 at 08:57 +0800, Herbert Xu wrote:

> It seems that I lost track somewhere along the line.  I meant
> to add an explicit limit on the overall number of entries since
> that was what users like netlink expected but never got around
> to doing it.  Instead it seems that we're currently relying on
> the rht_grow_above_100 to protect us.

This isn't really what I wanted though :-)

I just wanted to test hash collisions.

> We currently have no limit on the number of elements in a hash table.
> This is very bad especially considering that some rhashtable users
> had such a limit before the conversion and relied on it for defence
> against DoS attacks.
> 
> We already have a maximum hash table size limit but its enforcement
> is only by luck and results in a nasty WARN_ON.

And doesn't actually work, the insertion appears to succeed :-)

> This patch adds a new paramater insecure_max_entries which becomes

typo: parameter

> the cap on the table.  If unset it defaults to max_size.  

So at least for my (admittedly testing only) use case, I wouldn't want
it to default to max_size, since the two at least *seem* to do different
things (max # of chains vs. max # of entries), no?

Anyway - since it's for testing only I guess I could even set max_size
to 4 and insecure_max_entries to something far bigger :)

> If it is
> also zero it means that there is no cap on the number of elements
> in the table.  However, the table will grow whenever the utilisation
> hits 100% and if that growth fails, you will get ENOMEM on insertion.
> 
> As allowing >100% utilisation is potentially dangerous, the name
> contains the word insecure.

Not sure I get this. So rhashtable is trying to actually never have
collisions? How could that possibly work?

> @@ -282,7 +285,20 @@ static inline bool rht_shrink_below_30(const struct rhashtable *ht,
>  static inline bool rht_grow_above_100(const struct rhashtable *ht,
>  				      const struct bucket_table *tbl)
>  {
> -	return atomic_read(&ht->nelems) > tbl->size;
> +	return atomic_read(&ht->nelems) > tbl->size &&
> +	       (!ht->p.max_size || tbl->size < ht->p.max_size);
> +}

Since you're also doing what I did here, would it make sense to apply my
patch to net and this one only to net-next?

For my use case (which was testing/debug) I don't actually care that
much, but perhaps that'd be an easier sell towards the end of the merge
window :) It seems that my patch would mostly fix the *issue*, while
yours adds a new parameter that's also not actually used yet.

The netlink hash table could potentially hit max_size and thus the
warning and the case I was hitting (on a system with >>64k netlink
sockets.)

johannes


* Re: rhashtable: Add cap on number of elements in hash table
  2015-04-24  7:01     ` Johannes Berg
@ 2015-04-24  8:04       ` Herbert Xu
  0 siblings, 0 replies; 26+ messages in thread
From: Herbert Xu @ 2015-04-24  8:04 UTC (permalink / raw)
  To: Johannes Berg; +Cc: David Miller, netdev, kaber, tgraf

On Fri, Apr 24, 2015 at 09:01:10AM +0200, Johannes Berg wrote:
>
> > As allowing >100% utilisation is potentially dangerous, the name
> > contains the word insecure.
> 
> Not sure I get this. So rhashtable is trying to actually never have
> collisions? How could that possibly work?

Of course the point is not to eliminate collisions, but to limit
the chain length to something reasonable.  Using a hashtable at
>100% capacity is usually not reasonable.

> Since you're also doing what I did here, would it make sense to apply my
> patch to net and this one only to net-next?

That would make af_netlink a security hole again because you can now
add unlimited entries to a hash table limited to 64K entries.  Prior
to the rhashtable conversion you weren't allowed to have more than
64K entries in the table.  This was lost in the conversion and I was
trying to restore it.

> For my use case (which was testing/debug) I don't actually care that
> much, but perhaps that'd be an easier sell towards the end of the merge
> window :) It seems that my patch would mostly fix the *issue*, while
> yours actually adds a new parameter that's also not actually used yet.

Well, if my patch is too complex then sure, we can look at coming up
with a simpler fix, but I think we need something that does not let
you add unlimited entries to af_netlink.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: rhashtable: Add cap on number of elements in hash table
  2015-04-24  0:57   ` rhashtable: Add cap on number of elements in hash table Herbert Xu
  2015-04-24  7:01     ` Johannes Berg
@ 2015-04-24  8:06     ` Thomas Graf
  2015-04-24  8:12       ` Herbert Xu
  2015-05-13  8:06     ` Herbert Xu
  2 siblings, 1 reply; 26+ messages in thread
From: Thomas Graf @ 2015-04-24  8:06 UTC (permalink / raw)
  To: Herbert Xu; +Cc: David Miller, johannes, netdev, kaber, johannes.berg

On 04/24/15 at 08:57am, Herbert Xu wrote:
> It seems that I lost track somewhere along the line.  I meant
> to add an explicit limit on the overall number of entries since
> that was what users like netlink expected but never got around
> to doing it.  Instead it seems that we're currently relying on
> the rht_grow_above_100 to protect us.

Can we please just take Johannes's fix as-is first? It fixes
the bug at hand in an isolated manner without introducing any
new knobs. Your patch includes his fix as-is without modification
anyway.

> So here is a patch that adds an explicit limit and fixes the
> problem Johannes reported.
> 
> ---8<---
> We currently have no limit on the number of elements in a hash table.
> This is very bad especially considering that some rhashtable users
> had such a limit before the conversion and relied on it for defence
> against DoS attacks.

Which users are you talking about? Both Netlink and TIPC still
have an upper limit. nft sets are controlled by privileged users.

> We already have a maximum hash table size limit but its enforcement
> is only by luck and results in a nasty WARN_ON.

As I stated earlier, this is no longer the case and thus this
paragraph only confuses the commit message.

> This patch adds a new paramater insecure_max_entries which becomes
> the cap on the table.  If unset it defaults to max_size.  If it is
> also zero it means that there is no cap on the number of elements
> in the table.  However, the table will grow whenever the utilisation
> hits 100% and if that growth fails, you will get ENOMEM on insertion.

Last time we discussed this it was said that the caller should enforce
the limit like Netlink does.

I'm fine with adding an upper max but I'd like to discuss that in the
context of a full series which converts all existing enforcements and
also contains a testing mechanism to verify this. Also, unless you can
show me where this is currently a real bug, this is really net-next
material.


* Re: rhashtable: Add cap on number of elements in hash table
  2015-04-24  8:06     ` Thomas Graf
@ 2015-04-24  8:12       ` Herbert Xu
  2015-04-24  8:15         ` Thomas Graf
  0 siblings, 1 reply; 26+ messages in thread
From: Herbert Xu @ 2015-04-24  8:12 UTC (permalink / raw)
  To: Thomas Graf; +Cc: David Miller, johannes, netdev, kaber, johannes.berg

On Fri, Apr 24, 2015 at 09:06:08AM +0100, Thomas Graf wrote:
>
> Which users are you talking about? Both Netlink and TIPC still
> have an upper limit. nft sets are controlled by privileged users.

There is no limit in netlink apart from UINT_MAX AFAICS.  Allowing
UINT_MAX entries into a hash table limited to 64K is not a good
thing.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: rhashtable: Add cap on number of elements in hash table
  2015-04-24  8:12       ` Herbert Xu
@ 2015-04-24  8:15         ` Thomas Graf
  2015-04-24  8:22           ` Herbert Xu
  0 siblings, 1 reply; 26+ messages in thread
From: Thomas Graf @ 2015-04-24  8:15 UTC (permalink / raw)
  To: Herbert Xu; +Cc: David Miller, johannes, netdev, kaber, johannes.berg

On 04/24/15 at 04:12pm, Herbert Xu wrote:
> On Fri, Apr 24, 2015 at 09:06:08AM +0100, Thomas Graf wrote:
> >
> > Which users are you talking about? Both Netlink and TIPC still
> > have an upper limit. nft sets are controlled by privileged users.
> 
> There is no limit in netlink apart from UINT_MAX AFAICS.  Allowing
> UINT_MAX entries into a hash table limited to 64K is not a good
> thing.

OK, so you are saying that the Netlink limit is too low? Then let's
fix that.

You are claiming that the rhashtable conversion removed a cap. I'm
not seeing such a change. Can you point me to where netlink_insert()
enforced a cap pre-rhashtable?


* Re: rhashtable: Add cap on number of elements in hash table
  2015-04-24  8:15         ` Thomas Graf
@ 2015-04-24  8:22           ` Herbert Xu
  2015-04-24 15:38             ` David Miller
  0 siblings, 1 reply; 26+ messages in thread
From: Herbert Xu @ 2015-04-24  8:22 UTC (permalink / raw)
  To: Thomas Graf; +Cc: David Miller, johannes, netdev, kaber, johannes.berg

On Fri, Apr 24, 2015 at 09:15:39AM +0100, Thomas Graf wrote:
> 
> You are claiming that the rhashtable conversion removed a cap. I'm
> not seeing such a change. Can you point me to where netlink_insert()
> enforced a cap pre-rhashtable?

OK you are right.  We never had such a limit.  In that case I'm
OK with Johannes's original patch.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: rhashtable: Add cap on number of elements in hash table
  2015-04-24  8:22           ` Herbert Xu
@ 2015-04-24 15:38             ` David Miller
  0 siblings, 0 replies; 26+ messages in thread
From: David Miller @ 2015-04-24 15:38 UTC (permalink / raw)
  To: herbert; +Cc: tgraf, johannes, netdev, kaber, johannes.berg

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri, 24 Apr 2015 16:22:11 +0800

> On Fri, Apr 24, 2015 at 09:15:39AM +0100, Thomas Graf wrote:
>> 
>> You are claiming that the rhashtable conversion removed a cap. I'm
>> not seeing such a change. Can you point me to where netlink_insert()
>> enforced a cap pre-rhashtable?
> 
> OK you are right.  We never had such a limit.  In that case I'm
> OK with Johannes's original patch.

Ok, I'm applying Johannes's patch then, thanks.


* Re: rhashtable: Add cap on number of elements in hash table
  2015-04-24  0:57   ` rhashtable: Add cap on number of elements in hash table Herbert Xu
  2015-04-24  7:01     ` Johannes Berg
  2015-04-24  8:06     ` Thomas Graf
@ 2015-05-13  8:06     ` Herbert Xu
  2015-05-15  2:22       ` David Miller
  2015-05-15  3:30       ` [v2 PATCH] " Herbert Xu
  2 siblings, 2 replies; 26+ messages in thread
From: Herbert Xu @ 2015-05-13  8:06 UTC (permalink / raw)
  To: David Miller; +Cc: johannes, netdev, kaber, tgraf, johannes.berg

We currently have no limit on the number of elements in a hash table.
This is a problem because some users (tipc) set a ceiling on the
maximum table size and when that is reached the hash table may
degenerate.  Others may encounter OOM when growing and if we allow
insertions when that happens the hash table performance may also
suffer.

This patch adds a new parameter insecure_max_entries which becomes
the cap on the table.  If unset it defaults to max_size.  If it is
also zero it means that there is no cap on the number of elements
in the table.  However, the table will grow whenever the utilisation
hits 100% and if that growth fails, you will get ENOMEM on insertion.

As allowing >100% utilisation is potentially dangerous, the name
contains the word insecure.

Note that the cap is not a hard limit.  This is done for performance
reasons as enforcing a hard limit will result in use of atomic ops
that are heavier than the ones we currently use.

The reasoning is that we're only guarding against a gross over-
subscription of the table, rather than a small breach of the limit.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index dbcbcc5..f3deeaa 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -17,6 +17,7 @@
 #ifndef _LINUX_RHASHTABLE_H
 #define _LINUX_RHASHTABLE_H
 
+#include <linux/atomic.h>
 #include <linux/compiler.h>
 #include <linux/errno.h>
 #include <linux/jhash.h>
@@ -100,6 +101,7 @@ struct rhashtable;
  * @key_len: Length of key
  * @key_offset: Offset of key in struct to be hashed
  * @head_offset: Offset of rhash_head in struct to be hashed
+ * @insecure_max_entries: Maximum number of entries (may be exceeded)
  * @max_size: Maximum size while expanding
  * @min_size: Minimum size while shrinking
  * @nulls_base: Base value to generate nulls marker
@@ -115,6 +117,7 @@ struct rhashtable_params {
 	size_t			key_len;
 	size_t			key_offset;
 	size_t			head_offset;
+	unsigned int		insecure_max_entries;
 	unsigned int		max_size;
 	unsigned int		min_size;
 	u32			nulls_base;
@@ -286,6 +289,18 @@ static inline bool rht_grow_above_100(const struct rhashtable *ht,
 		(!ht->p.max_size || tbl->size < ht->p.max_size);
 }
 
+/**
+ * rht_grow_above_max - returns true if table is above maximum
+ * @ht:		hash table
+ * @tbl:	current table
+ */
+static inline bool rht_grow_above_max(const struct rhashtable *ht,
+				      const struct bucket_table *tbl)
+{
+	return ht->p.insecure_max_entries &&
+	       atomic_read(&ht->nelems) >= ht->p.insecure_max_entries;
+}
+
 /* The bucket lock is selected based on the hash and protects mutations
  * on a group of hash buckets.
  *
@@ -612,6 +627,10 @@ slow_path:
 			goto slow_path;
 	}
 
+	err = -E2BIG;
+	if (unlikely(rht_grow_above_max(ht, tbl)))
+		goto out;
+
 	err = 0;
 
 	head = rht_dereference_bucket(tbl->buckets[hash], tbl, hash);
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 4936fc4..cfd3408 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -14,6 +14,7 @@
  * published by the Free Software Foundation.
  */
 
+#include <linux/atomic.h>
 #include <linux/kernel.h>
 #include <linux/init.h>
 #include <linux/log2.h>
@@ -451,6 +452,10 @@ int rhashtable_insert_slow(struct rhashtable *ht, const void *key,
 	    rht_grow_above_100(ht, tbl))
 		goto exit;
 
+	err = -E2BIG;
+	if (unlikely(rht_grow_above_max(ht, tbl)))
+		goto exit;
+
 	err = 0;
 
 	head = rht_dereference_bucket(tbl->buckets[hash], tbl, hash);
@@ -734,6 +739,12 @@ int rhashtable_init(struct rhashtable *ht,
 	if (params->max_size)
 		ht->p.max_size = rounddown_pow_of_two(params->max_size);
 
+	if (params->insecure_max_entries)
+		ht->p.insecure_max_entries =
+			rounddown_pow_of_two(params->insecure_max_entries);
+	else
+		ht->p.insecure_max_entries = ht->p.max_size;
+
 	ht->p.min_size = max(ht->p.min_size, HASH_MIN_SIZE);
 
 	/* The maximum (not average) chain length grows with the
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: rhashtable: Add cap on number of elements in hash table
  2015-05-13  8:06     ` Herbert Xu
@ 2015-05-15  2:22       ` David Miller
  2015-05-15  3:06         ` Herbert Xu
  2015-05-15  3:30       ` [v2 PATCH] " Herbert Xu
  1 sibling, 1 reply; 26+ messages in thread
From: David Miller @ 2015-05-15  2:22 UTC (permalink / raw)
  To: herbert; +Cc: johannes, netdev, kaber, tgraf, johannes.berg

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Wed, 13 May 2015 16:06:40 +0800

> We currently have no limit on the number of elements in a hash table.
> This is a problem because some users (tipc) set a ceiling on the
> maximum table size and when that is reached the hash table may
> degenerate.  Others may encounter OOM when growing and if we allow
> insertions when that happens the hash table performance may also
> suffer.
> 
> This patch adds a new parameter insecure_max_entries which becomes
> the cap on the table.  If unset it defaults to max_size.  If it is
> also zero it means that there is no cap on the number of elements
> in the table.  However, the table will grow whenever the utilisation
> hits 100% and if that growth fails, you will get ENOMEM on insertion.
> 
> As allowing >100% utilisation is potentially dangerous, the name
> contains the word insecure.
> 
> Note that the cap is not a hard limit.  This is done for performance
> reasons as enforcing a hard limit will result in use of atomic ops
> that are heavier than the ones we currently use.
> 
> The reasoning is that we're only guarding against a gross over-
> subscription of the table, rather than a small breach of the limit.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

I'm not so sure I can get behind this change.

For the default case where we propagate max_size into this new limit,
if we grow the hash table to its maximum size, and then just go one
past the limit, we're going to start failing inserts.

In my opinion, up to at least 2 X max_size, it's safe to allow the
insert, assuming a well-chosen hash function and a roughly even
distribution.

A two-entry-deep hash chain is not a reason to fail a hash insertion.

We would not fail these hash insertions in pretty much any other
hash implementation in the kernel.  I'd seriously rather have
3-entry-deep hash chains than no new connections at all.

If an administrator has to analyze some situation where this is
happening and they see -E2BIG propagating into their applications,
this is going to be surprising.  And when they do all of the work
to figure out what is causing it, I am pretty sure they will be
somewhat upset that we considered it OK to do this.

So I'd like you to take some time to reconsider hash insert failures,
and only do them when it's going to create an unacceptable, _massive_
hardship for the machine.  And I do not think that 2- or 3-entry-deep
hash chains qualify.

Thanks.


* Re: rhashtable: Add cap on number of elements in hash table
  2015-05-15  2:22       ` David Miller
@ 2015-05-15  3:06         ` Herbert Xu
  2015-05-15  3:46           ` David Miller
  0 siblings, 1 reply; 26+ messages in thread
From: Herbert Xu @ 2015-05-15  3:06 UTC (permalink / raw)
  To: David Miller; +Cc: johannes, netdev, kaber, tgraf, johannes.berg

On Thu, May 14, 2015 at 10:22:17PM -0400, David Miller wrote:
>
> In my opinion, up to at least 2 X max_size, it's safe to allow the
> insert, assuming a well-chosen hash function and a roughly even
> distribution.

OK I can make it 2 x max_size/table size.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* [v2 PATCH] rhashtable: Add cap on number of elements in hash table
  2015-05-13  8:06     ` Herbert Xu
  2015-05-15  2:22       ` David Miller
@ 2015-05-15  3:30       ` Herbert Xu
  2015-05-16 22:08         ` David Miller
  1 sibling, 1 reply; 26+ messages in thread
From: Herbert Xu @ 2015-05-15  3:30 UTC (permalink / raw)
  To: David Miller; +Cc: johannes, netdev, kaber, tgraf, johannes.berg

We currently have no limit on the number of elements in a hash table.
This is a problem because some users (tipc) set a ceiling on the
maximum table size and when that is reached the hash table may
degenerate.  Others may encounter OOM when growing and if we allow
insertions when that happens the hash table performance may also
suffer.

This patch adds a new parameter insecure_max_entries which becomes
the cap on the table.  If unset it defaults to max_size * 2.  If
it is also zero it means that there is no cap on the number of
elements in the table.  However, the table will grow whenever the
utilisation hits 100% and if that growth fails, you will get ENOMEM
on insertion.

As allowing oversubscription is potentially dangerous, the name
contains the word insecure.

Note that the cap is not a hard limit.  This is done for performance
reasons as enforcing a hard limit will result in use of atomic ops
that are heavier than the ones we currently use.

The reasoning is that we're only guarding against a gross over-
subscription of the table, rather than a small breach of the limit.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index dbcbcc5..843ceca 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -17,6 +17,7 @@
 #ifndef _LINUX_RHASHTABLE_H
 #define _LINUX_RHASHTABLE_H
 
+#include <linux/atomic.h>
 #include <linux/compiler.h>
 #include <linux/errno.h>
 #include <linux/jhash.h>
@@ -100,6 +101,7 @@ struct rhashtable;
  * @key_len: Length of key
  * @key_offset: Offset of key in struct to be hashed
  * @head_offset: Offset of rhash_head in struct to be hashed
+ * @insecure_max_entries: Maximum number of entries (may be exceeded)
  * @max_size: Maximum size while expanding
  * @min_size: Minimum size while shrinking
  * @nulls_base: Base value to generate nulls marker
@@ -115,6 +117,7 @@ struct rhashtable_params {
 	size_t			key_len;
 	size_t			key_offset;
 	size_t			head_offset;
+	unsigned int		insecure_max_entries;
 	unsigned int		max_size;
 	unsigned int		min_size;
 	u32			nulls_base;
@@ -286,6 +289,18 @@ static inline bool rht_grow_above_100(const struct rhashtable *ht,
 		(!ht->p.max_size || tbl->size < ht->p.max_size);
 }
 
+/**
+ * rht_grow_above_max - returns true if table is above maximum
+ * @ht:		hash table
+ * @tbl:	current table
+ */
+static inline bool rht_grow_above_max(const struct rhashtable *ht,
+				      const struct bucket_table *tbl)
+{
+	return ht->p.insecure_max_entries &&
+	       atomic_read(&ht->nelems) >= ht->p.insecure_max_entries;
+}
+
 /* The bucket lock is selected based on the hash and protects mutations
  * on a group of hash buckets.
  *
@@ -589,6 +604,10 @@ restart:
 		goto out;
 	}
 
+	err = -E2BIG;
+	if (unlikely(rht_grow_above_max(ht, tbl)))
+		goto out;
+
 	if (unlikely(rht_grow_above_100(ht, tbl))) {
 slow_path:
 		spin_unlock_bh(lock);
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 4936fc4..ca66a0e 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -14,6 +14,7 @@
  * published by the Free Software Foundation.
  */
 
+#include <linux/atomic.h>
 #include <linux/kernel.h>
 #include <linux/init.h>
 #include <linux/log2.h>
@@ -446,6 +447,10 @@ int rhashtable_insert_slow(struct rhashtable *ht, const void *key,
 	if (key && rhashtable_lookup_fast(ht, key, ht->p))
 		goto exit;
 
+	err = -E2BIG;
+	if (unlikely(rht_grow_above_max(ht, tbl)))
+		goto exit;
+
 	err = -EAGAIN;
 	if (rhashtable_check_elasticity(ht, tbl, hash) ||
 	    rht_grow_above_100(ht, tbl))
@@ -734,6 +739,12 @@ int rhashtable_init(struct rhashtable *ht,
 	if (params->max_size)
 		ht->p.max_size = rounddown_pow_of_two(params->max_size);
 
+	if (params->insecure_max_entries)
+		ht->p.insecure_max_entries =
+			rounddown_pow_of_two(params->insecure_max_entries);
+	else
+		ht->p.insecure_max_entries = ht->p.max_size * 2;
+
 	ht->p.min_size = max(ht->p.min_size, HASH_MIN_SIZE);
 
 	/* The maximum (not average) chain length grows with the
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
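
To make the v2 default concrete, a small userspace model (names are
hypothetical, and rounddown_pow_of_two() is omitted for brevity):
with max_size = 512 and insecure_max_entries left unset, the cap
becomes 1024, so inserts only start failing once the average chain
length in a fully grown table exceeds 2 - matching the "2 X max_size"
threshold discussed above.

#include <stdbool.h>
#include <stdio.h>

struct params { unsigned int max_size, insecure_max_entries; };

/* Mirrors the rhashtable_init() default: an unset cap follows
 * max_size at twice the bucket count. */
static void init_defaults(struct params *p)
{
	if (!p->insecure_max_entries)
		p->insecure_max_entries = p->max_size * 2;
}

/* Mirrors rht_grow_above_max(): a zero cap means "no limit". */
static bool grow_above_max(const struct params *p, unsigned int nelems)
{
	return p->insecure_max_entries &&
	       nelems >= p->insecure_max_entries;
}

int main(void)
{
	struct params p = { .max_size = 512 };

	init_defaults(&p);
	printf("cap=%u; with %u elems the next insert is rejected: %s\n",
	       p.insecure_max_entries, 1024u,
	       grow_above_max(&p, 1024) ? "yes" : "no");
	return 0;
}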


* Re: rhashtable: Add cap on number of elements in hash table
  2015-05-15  3:06         ` Herbert Xu
@ 2015-05-15  3:46           ` David Miller
  2015-05-15  6:30             ` Herbert Xu
  0 siblings, 1 reply; 26+ messages in thread
From: David Miller @ 2015-05-15  3:46 UTC (permalink / raw)
  To: herbert; +Cc: johannes, netdev, kaber, tgraf, johannes.berg

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri, 15 May 2015 11:06:23 +0800

> On Thu, May 14, 2015 at 10:22:17PM -0400, David Miller wrote:
>>
>> In my opinion, up to at least 2 X max_size, it's safe to allow the
>> insert, assuming a well-chosen hash function and a roughly even
>> distribution.
> 
> OK I can make it 2 x max_size/table size.

The rest of my email after what you quoted was intended to get one
to consider this issue generally.  :-)

We wouldn't fail these inserts in any other hash table in the kernel.

Would we stop making new TCP sockets if the TCP ehash chains are 3
entries deep?  4?  5?  The answer to all of those is of course no
for any hash chain length of N whatsoever.

This new rhashtable behavior would be the default, and I seriously
doubt that's a behavior people who use a hash table, generally
speaking, desire or want.

Should there perhaps be hard protections for _extremely_ long hash
chains?  Sure, I'm willing to entertain that kind of idea.  But I
would do so at the very far end of the spectrum.  To the point where
the hash table is degenerating into a linked list.


* Re: rhashtable: Add cap on number of elements in hash table
  2015-05-15  3:46           ` David Miller
@ 2015-05-15  6:30             ` Herbert Xu
  2015-05-16 22:09               ` David Miller
  0 siblings, 1 reply; 26+ messages in thread
From: Herbert Xu @ 2015-05-15  6:30 UTC (permalink / raw)
  To: David Miller; +Cc: johannes, netdev, kaber, tgraf, johannes.berg

On Thu, May 14, 2015 at 11:46:15PM -0400, David Miller wrote:
> 
> We wouldn't fail these inserts in any other hash table in the kernel.
> 
> Would we stop making new TCP sockets if the TCP ehash chains are 3
> entries deep?  4?  5?  The answer to all of those is of course no
> for any hash chain length of N whatsoever.

I would agree with you if this were a fixed-size table.  If your
table grows with you (which is the whole point of rhashtable) then
we should never hit this until you run out of memory.

When you are running out of memory whether you fail when the table
growth fails or later when you can't even allocate an entry is
immaterial because failure is inevitable.

In my view everybody should be using rhashtable without a maximum
size.  The only place where it would make sense to have a maximum
size limit is if you also had a limit on the number of entries,
in which case you might as well make that the limit on the hash
table size.

> Should there perhaps be hard protections for _extremely_ long hash
> chains?  Sure, I'm willing to entertain that kind of idea.  But I
> would do so at the very far end of the spectrum.  To the point where
> the hash table is degenerating into a linked list.

Do you have any suggestions of what such a limit should be?

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [v2 PATCH] rhashtable: Add cap on number of elements in hash table
  2015-05-15  3:30       ` [v2 PATCH] " Herbert Xu
@ 2015-05-16 22:08         ` David Miller
  0 siblings, 0 replies; 26+ messages in thread
From: David Miller @ 2015-05-16 22:08 UTC (permalink / raw)
  To: herbert; +Cc: johannes, netdev, kaber, tgraf, johannes.berg

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri, 15 May 2015 11:30:47 +0800

> We currently have no limit on the number of elements in a hash table.
> This is a problem because some users (tipc) set a ceiling on the
> maximum table size and when that is reached the hash table may
> degenerate.  Others may encounter OOM when growing and if we allow
> insertions when that happens the hash table performance may also
> suffer.
> 
> This patch adds a new parameter insecure_max_entries which becomes
> the cap on the table.  If unset it defaults to max_size * 2.  If
> it is also zero it means that there is no cap on the number of
> elements in the table.  However, the table will grow whenever the
> utilisation hits 100% and if that growth fails, you will get ENOMEM
> on insertion.
> 
> As allowing oversubscription is potentially dangerous, the name
> contains the word insecure.
> 
> Note that the cap is not a hard limit.  This is done for performance
> reasons as enforcing a hard limit will result in use of atomic ops
> that are heavier than the ones we currently use.
> 
> The reasoning is that we're only guarding against a gross over-
> subscription of the table, rather than a small breach of the limit.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied, thanks Herbert.


* Re: rhashtable: Add cap on number of elements in hash table
  2015-05-15  6:30             ` Herbert Xu
@ 2015-05-16 22:09               ` David Miller
  2015-05-17  1:38                 ` Herbert Xu
  0 siblings, 1 reply; 26+ messages in thread
From: David Miller @ 2015-05-16 22:09 UTC (permalink / raw)
  To: herbert; +Cc: johannes, netdev, kaber, tgraf, johannes.berg

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri, 15 May 2015 14:30:57 +0800

> On Thu, May 14, 2015 at 11:46:15PM -0400, David Miller wrote:
>> 
>> We wouldn't fail these inserts in any other hash table in the kernel.
>> 
>> Would we stop making new TCP sockets if the TCP ehash chains are 3
>> entries deep?  4?  5?  The answer to all of those is of course no
>> for any hash chain length of N whatsoever.
> 
> I would agree with you if this were a fixed-size table.  If your
> table grows with you (which is the whole point of rhashtable) then
> we should never hit this until you run out of memory.
> 
> When you are running out of memory whether you fail when the table
> growth fails or later when you can't even allocate an entry is
> immaterial because failure is inevitable.
> 
> In my view everybody should be using rhashtable without a maximum
> size.  The only place where it would make sense to have a maximum
> size limit is if you also had a limit on the number of entries,
> in which case you might as well make that the limit on the hash
> table size.

Ok, agreed.

>> Should there perhaps be hard protections for _extremely_ long hash
>> chains?  Sure, I'm willing to entertain that kind of idea.  But I
>> would do so at the very far end of the spectrum.  To the point where
>> the hash table is degenerating into a linked list.
> 
> Do you have any suggestions of what such a limit should be?

Good question.

Obviously something like 50 or 100 is too much.

Perhaps something between 5 and 10.

That's just my gut instinct speaking.


* Re: rhashtable: Add cap on number of elements in hash table
  2015-05-16 22:09               ` David Miller
@ 2015-05-17  1:38                 ` Herbert Xu
  2015-05-18 20:12                   ` David Miller
  0 siblings, 1 reply; 26+ messages in thread
From: Herbert Xu @ 2015-05-17  1:38 UTC (permalink / raw)
  To: David Miller; +Cc: johannes, netdev, kaber, tgraf, johannes.berg

On Sat, May 16, 2015 at 06:09:46PM -0400, David Miller wrote:
>
> Obviously something like 50 or 100 is too much.
> 
> Perhaps something between 5 and 10.

You are even more parsimonious than I :) Because the maximum chain
length grows at a rate of logN / loglogN, I had chosen the number
16, which should be sufficient even if you had a 2^32 table (which
you currently can't as we max out somewhere below that).

FWIW Thomas did some numbers on this and apparently 10 gets breached
in a 2^24 table at 100% utilisation.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
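
For a feel for these numbers, a quick userspace computation of the
classical balls-into-bins estimate ln(n)/ln(ln(n)) for the expected
maximum chain length when n items hash uniformly into n buckets (a
rough model only; Thomas's observed breach of 10 at 2^24 sits in the
upper tail that this estimate ignores):

#include <math.h>
#include <stdio.h>

int main(void)
{
	/* Expected maximum chain length for n items uniformly hashed
	 * into n buckets is roughly ln(n) / ln(ln(n)). */
	for (int bits = 16; bits <= 32; bits += 8) {
		double n = ldexp(1.0, bits);	/* n = 2^bits */
		printf("2^%-2d buckets at 100%%: ~%.1f\n",
		       bits, log(n) / log(log(n)));
	}
	return 0;
}

Built with -lm, this prints roughly 4.6, 5.9 and 7.2 for 2^16, 2^24
and 2^32 respectively.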


* Re: rhashtable: Add cap on number of elements in hash table
  2015-05-17  1:38                 ` Herbert Xu
@ 2015-05-18 20:12                   ` David Miller
  2015-05-18 22:35                     ` Herbert Xu
  0 siblings, 1 reply; 26+ messages in thread
From: David Miller @ 2015-05-18 20:12 UTC (permalink / raw)
  To: herbert; +Cc: johannes, netdev, kaber, tgraf, johannes.berg

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Sun, 17 May 2015 09:38:29 +0800

> On Sat, May 16, 2015 at 06:09:46PM -0400, David Miller wrote:
>>
>> Obviously something like 50 or 100 is too much.
>> 
>> Perhaps something between 5 and 10.
> 
> You are even more parsimonious than I :) Because the maximum chain
> length grows at a rate of logN * loglogN, I had chosen the number
> 16 which should be sufficient even if you had a 2^32 table (which
> you currently can't as we max out somewhere below that).
> 
> FWIW Thomas did some numbers on this and apparent 10 gets breached
> in a 2^24 table at 100% utilisation.

Ok, this of course depends upon the distribution of the input data
and the strength/suitability of the hash function.

I'm a little bit disappointed in what Thomas found.  I would expect
the distribution to be at least a little bit better.


* Re: rhashtable: Add cap on number of elements in hash table
  2015-05-18 20:12                   ` David Miller
@ 2015-05-18 22:35                     ` Herbert Xu
  2015-05-19 10:25                       ` David Laight
  0 siblings, 1 reply; 26+ messages in thread
From: Herbert Xu @ 2015-05-18 22:35 UTC (permalink / raw)
  To: David Miller; +Cc: johannes, netdev, kaber, tgraf, johannes.berg

On Mon, May 18, 2015 at 04:12:01PM -0400, David Miller wrote:
>
> Ok, this of course depends upon the distribution of the input data
> and the strength/suitability of the hash function.
> 
> I'm a little bit disappointed in what Thomas found.  I would expect
> the distribution to be at least a little bit better.

Just to make it clear, this is not a problem with uneven distribution.
The same thing would happen with any hash function.  It's essentially
a generalised birthday paradox.

Of course the average chain length is still what you expect it to
be, it's only the maximum chain length that grows at logN / loglogN.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* RE: rhashtable: Add cap on number of elements in hash table
  2015-05-18 22:35                     ` Herbert Xu
@ 2015-05-19 10:25                       ` David Laight
  0 siblings, 0 replies; 26+ messages in thread
From: David Laight @ 2015-05-19 10:25 UTC (permalink / raw)
  To: 'Herbert Xu', David Miller
  Cc: johannes, netdev, kaber, tgraf, johannes.berg

From: Herbert Xu
> Sent: 18 May 2015 23:35
> On Mon, May 18, 2015 at 04:12:01PM -0400, David Miller wrote:
> >
> > Ok, this of course depends upon the distribution of the input data
> > and the strength/suitability of the hash function.
> >
> > I'm a little bit disappointed in what Thomas found.  I would expect
> > the distribution to be at least a little bit better.
> 
> Just to make it clear, this is not a problem with uneven distribution.
> The same thing would happen with any hash function.  It's essentially
> a generalised birthday paradox.

Which means that with real data and a real hash function it is very
likely to be worse.
Did anyone look at what happens to the longest chains if you put
2 or 4 times as many items into the hash table?
Clearly there will be more short chains, but the longest might not
get proportionally longer (guessing at probability theory).

Personally I don't like the idea of rejecting inserts because chains
might get long, and certainly not because a malloc() fails in the
middle of the resize operation.

I'm sure there are some tables where slightly longer chains don't matter.

	David

