linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: NeilBrown <neilb@suse.com>
To: Thomas Graf <tgraf@suug.ch>, Herbert Xu <herbert@gondor.apana.org.au>
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 5/6] rhashtable: support guaranteed successful insertion.
Date: Tue, 27 Mar 2018 10:33:04 +1100	[thread overview]
Message-ID: <152210718434.11435.6551477417902631683.stgit@noble> (raw)
In-Reply-To: <152210688405.11435.13010923693146415942.stgit@noble>

The current rhashtable will fail an insertion if the hashtable
it "too full", one of:
 - table already has 2^31 elements (-E2BIG)
 - a max_size was specified and table already has that
   many elements (rounded up to power of 2) (-E2BIG)
 - a single chain has more than 16 elements (-EBUSY)
 - table has more elements than the current table size,
   and allocating a new table fails (-ENOMEM)
 - a new page needed to be allocated for a nested table,
   and the memory allocation failed (-ENOMEM).

A traditional hash table does not have a concept of "too full", and
insertion only fails if the key already exists.  Many users of hash
tables have separate means of limiting the total number of entries,
and are not susceptible to an attack which could cause unusually large
hash chains.  For those users, the need to check for errors when
inserting objects to an rhashtable is an unnecessary burden and hence
a potential source of bugs (as these failures are likely to be rare).

This patch adds a "never_fail_insert" configuration parameter which
ensures that insertion will only fail if the key already exists.

When this option is in effect:
 - nelems is capped at INT_MAX and will never decrease once it reaches
   that value
 - max_size is largely ignored
 - elements will be added to a table that is nominally "full", though
   a rehash will be scheduled
 - a new table will never be allocated directly by the insert
   function, that is always left for the worker.
   For this to trigger a rehash when long chains are detected (possibly
   still useful) an extra field in the table records if a long chain
   has been seen.  This shares a word with the 'nest' value.  As
   'nest' is never changed once the table is created, updating the
   new ->long_chain without locking cannot cause any corruption.

Signed-off-by: NeilBrown <neilb@suse.com>
---
 include/linux/rhashtable.h |   18 +++++++++++++++---
 lib/rhashtable.c           |   27 +++++++++++++++++++--------
 2 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index 4ffd96949d4f..abdeb1f3f378 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -77,6 +77,7 @@ struct rhlist_head {
  * struct bucket_table - Table of hash buckets
  * @size: Number of hash buckets
  * @nest: Number of bits of first-level nested table.
+ * @long_chain: %true when a chain longer than RHT_ELASTICITY seen.
  * @rehash: Current bucket being rehashed
  * @hash_rnd: Random seed to fold into hash
  * @locks_mask: Mask to apply before accessing locks[]
@@ -89,7 +90,8 @@ struct rhlist_head {
  */
 struct bucket_table {
 	unsigned int		size;
-	unsigned int		nest;
+	unsigned short		nest;
+	bool			long_chain;
 	unsigned int		rehash;
 	u32			hash_rnd;
 	unsigned int		locks_mask;
@@ -129,6 +131,9 @@ struct rhashtable;
  * @min_size: Minimum size while shrinking
  * @locks_mul: Number of bucket locks to allocate per cpu (default: 32)
  * @automatic_shrinking: Enable automatic shrinking of tables
+ * @never_fail_insert: Insert will always succeed, even if table will become
+ *           unbalanced.  Without this, -E2BIG, -EBUSY, and -ENOMEM are possible
+ *           errors from rhashtable_*insert*()
  * @nulls_base: Base value to generate nulls marker
  * @hashfn: Hash function (default: jhash2 if !(key_len % 4), or jhash)
  * @obj_hashfn: Function to hash object
@@ -142,6 +147,7 @@ struct rhashtable_params {
 	unsigned int		max_size;
 	u16			min_size;
 	bool			automatic_shrinking;
+	bool			never_fail_insert;
 	u8			locks_mul;
 	u32			nulls_base;
 	rht_hashfn_t		hashfn;
@@ -832,7 +838,10 @@ static inline void *__rhashtable_insert_fast(
 
 	rcu_assign_pointer(*pprev, obj);
 
-	atomic_inc(&ht->nelems);
+	if (params.never_fail_insert)
+		atomic_add_unless(&ht->nelems, 1, INT_MAX);
+	else
+		atomic_inc(&ht->nelems);
 	if (rht_grow_above_75(ht, tbl))
 		schedule_work(&ht->run_work);
 
@@ -1104,7 +1113,10 @@ static inline int __rhashtable_remove_fast_one(
 	spin_unlock_bh(lock);
 
 	if (err > 0) {
-		atomic_dec(&ht->nelems);
+		if (params.never_fail_insert)
+			atomic_add_unless(&ht->nelems, -1, INT_MAX);
+		else
+			atomic_dec(&ht->nelems);
 		if (unlikely(ht->p.automatic_shrinking &&
 			     rht_shrink_below_30(ht, tbl)))
 			schedule_work(&ht->run_work);
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index fd6f320b9704..427836aace60 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -424,7 +424,7 @@ static void rht_deferred_worker(struct work_struct *work)
 		err = rhashtable_rehash_alloc(ht, tbl, tbl->size * 2);
 	else if (ht->p.automatic_shrinking && rht_shrink_below_30(ht, tbl))
 		err = rhashtable_shrink(ht);
-	else if (tbl->nest)
+	else if (tbl->nest || tbl->long_chain)
 		err = rhashtable_rehash_alloc(ht, tbl, tbl->size);
 
 	if (!err)
@@ -549,14 +549,22 @@ static struct bucket_table *rhashtable_insert_one(struct rhashtable *ht,
 	if (new_tbl)
 		return new_tbl;
 
-	if (PTR_ERR(data) != -ENOENT)
-		return ERR_CAST(data);
+	if (ht->p.never_fail_insert) {
+		if (PTR_ERR(data) == -EAGAIN &&
+		    atomic_read(&ht->nelems) != INT_MAX) {
+			tbl->long_chain = true;
+			schedule_work(&ht->run_work);
+		}
+	} else {
+		if (PTR_ERR(data) != -ENOENT)
+			return ERR_CAST(data);
 
-	if (unlikely(rht_grow_above_max(ht, tbl)))
-		return ERR_PTR(-E2BIG);
+		if (unlikely(rht_grow_above_max(ht, tbl)))
+			return ERR_PTR(-E2BIG);
 
-	if (unlikely(rht_grow_above_100(ht, tbl)))
-		return ERR_PTR(-EAGAIN);
+		if (unlikely(rht_grow_above_100(ht, tbl)))
+			return ERR_PTR(-EAGAIN);
+	}
 
 	pprev = rht_bucket_insert(ht, tbl, hash);
 	if (!pprev)
@@ -574,7 +582,10 @@ static struct bucket_table *rhashtable_insert_one(struct rhashtable *ht,
 
 	rcu_assign_pointer(*pprev, obj);
 
-	atomic_inc(&ht->nelems);
+	if (ht->p.never_fail_insert)
+		atomic_add_unless(&ht->nelems, 1, INT_MAX);
+	else
+		atomic_inc(&ht->nelems);
 	if (rht_grow_above_75(ht, tbl))
 		schedule_work(&ht->run_work);
 

  parent reply	other threads:[~2018-03-27  1:51 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-26 23:33 [PATCH 0/6] rhashtable: assorted fixes and enhancements NeilBrown
2018-03-26 23:33 ` [PATCH 1/6] rhashtable: improve documentation for rhashtable_walk_peek() NeilBrown
2018-03-27 10:55   ` Sergei Shtylyov
2018-03-27 15:30   ` Herbert Xu
2018-03-27 15:47   ` David Miller
2018-03-27 21:45   ` [PATCH 1/6 v2] " NeilBrown
2018-03-27 22:46   ` [PATCH 1/6] " Andreas Grünbacher
2018-03-28  0:49     ` NeilBrown
2018-03-26 23:33 ` [PATCH 3/6] rhashtable: reset intr when rhashtable_walk_start sees new table NeilBrown
2018-03-27 15:47   ` Herbert Xu
2018-03-26 23:33 ` [PATCH 6/6] rhashtable: allow element counting to be disabled NeilBrown
2018-03-26 23:33 ` NeilBrown [this message]
2018-03-27 15:56   ` [PATCH 5/6] rhashtable: support guaranteed successful insertion Herbert Xu
2018-03-27 21:34     ` NeilBrown
2018-03-28  6:04       ` Herbert Xu
2018-03-28  7:04         ` NeilBrown
2018-03-28  7:27           ` Herbert Xu
2018-03-28 21:26             ` NeilBrown
2018-03-29  5:22               ` Herbert Xu
2018-04-06  3:11                 ` NeilBrown
2018-04-06  4:13                   ` Herbert Xu
2018-03-26 23:33 ` [PATCH 4/6] rhashtable: allow a walk of the hash table without missing objects NeilBrown
2018-03-27 15:49   ` David Miller
2018-03-27 15:54     ` Herbert Xu
2018-03-27 21:50     ` NeilBrown
2018-03-27 15:51   ` Herbert Xu
2018-03-27 21:54     ` NeilBrown
2018-03-28  6:07       ` Herbert Xu
2018-03-28  7:17         ` NeilBrown
2018-03-28  7:30           ` Herbert Xu
2018-03-28 21:34             ` NeilBrown
2018-03-29  1:13               ` NeilBrown
2018-03-26 23:33 ` [PATCH 2/6] rhashtable: remove outdated comments about grow_decision etc NeilBrown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=152210718434.11435.6551477417902631683.stgit@noble \
    --to=neilb@suse.com \
    --cc=herbert@gondor.apana.org.au \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=tgraf@suug.ch \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).