netdev.vger.kernel.org archive mirror
From: Josh Elsasser <jelsasser@appneta.com>
To: "David S . Miller" <davem@davemloft.net>
Cc: josh@elsasser.ca, Josh Elsasser <jelsasser@appneta.com>,
	Thomas Graf <tgraf@suug.ch>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH net] rhashtable: avoid reschedule loop after rapid growth and shrink
Date: Wed, 23 Jan 2019 13:17:58 -0800
Message-ID: <20190123211758.104275-1-jelsasser@appneta.com>

When running workloads with large bursts of fragmented packets, we've seen
a few machines stuck returning -EEXIST from rhashtable_shrink() and
endlessly rescheduling their hash table's deferred work, pegging a CPU
core.

Root cause is commit da20420f83ea ("rhashtable: Add nested tables"), which
stops ignoring the return code of rhashtable_shrink() and the reallocs
used to grow the hashtable. This uncovers a bug in the shrink logic where
the "needs to shrink" check runs against the last table but the actual
shrink operation runs on the first bucket_table in the hashtable (see
below):

 +-------+    +--------------+          +---------------+
 | ht    |    | "first" tbl  |          | "last" tbl    |
 | - tbl ---> | - future_tbl ---------> |  - future_tbl ---> NULL
 +-------+    +--------------+          +---------------+
               ^^^                          ^^^
	       used by rhashtable_shrink()  used by rht_shrink_below_30()
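
As a rough illustration of the chain in the diagram, here is a minimal
userspace sketch of walking from the first table to the last one. The
names model_tbl and model_last_table are hypothetical stand-ins, not
kernel identifiers, and RCU and locking are left out entirely:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct bucket_table. */
struct model_tbl {
	unsigned int size;            /* number of buckets */
	struct model_tbl *future_tbl; /* next table in the chain, or NULL */
};

/* Follow the future_tbl links until the table with no successor. */
struct model_tbl *model_last_table(struct model_tbl *tbl)
{
	assert(tbl != NULL);
	while (tbl->future_tbl)
		tbl = tbl->future_tbl;
	return tbl;
}
```

With a two-table chain as drawn above, the walk starting at the "first"
table ends at the "last" one, which is the table the shrink check sees.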

A rehash then stalls when the last table needs to shrink and the first
table has more elements than the target size, but rhashtable_shrink()
hits a non-NULL future_tbl on the first table and returns -EEXIST. This
skips the item rehashing and kicks off a reschedule loop, as no forward
progress can be made while the rhashtable needs to shrink.

Extend rhashtable_shrink() with a "tbl" param so that the shrink operates
on the same table that rht_shrink_below_30() checked. This avoids the
endless exit-and-reschedule cycle after hitting -EEXIST: the future_tbl
pointer being checked can now actually be NULL, so the worker makes
forward progress when the hashtable needs to shrink.
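
The difference can be sketched with a small userspace model. Everything
prefixed sim_ is hypothetical and heavily simplified: the 30% check
approximates rht_shrink_below_30() without the min_size condition, the
shrink models only the early -EEXIST exit, and the follow-up rehash step
of the worker is not modelled at all:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bucket_table. */
struct sim_tbl {
	unsigned int size;
	struct sim_tbl *future_tbl;
};

/* Hypothetical stand-in for struct rhashtable. */
struct sim_ht {
	struct sim_tbl *tbl;   /* the "first" table */
	unsigned int nelems;
};

struct sim_tbl *sim_last_table(struct sim_tbl *tbl)
{
	assert(tbl != NULL);
	while (tbl->future_tbl)
		tbl = tbl->future_tbl;
	return tbl;
}

/* Approximation of the "needs to shrink" check (min_size omitted). */
bool sim_shrink_below_30(const struct sim_ht *ht, const struct sim_tbl *tbl)
{
	return ht->nelems < tbl->size * 3 / 10;
}

/* Models only the early exit: refuse if a future table already exists. */
int sim_shrink(const struct sim_tbl *old_tbl)
{
	if (old_tbl->future_tbl)
		return -EEXIST;
	return 0;
}

/*
 * One pass of the modelled worker. Returns true if the worker would
 * reschedule itself. With patched == false the shrink runs on the first
 * table, with patched == true on the table that was checked.
 */
bool sim_worker_pass(const struct sim_ht *ht, bool patched)
{
	struct sim_tbl *last = sim_last_table(ht->tbl);
	int err = 0;

	if (sim_shrink_below_30(ht, last))
		err = sim_shrink(patched ? last : ht->tbl);
	return err != 0;
}
```

In this model, a chain of a 1024-bucket first table and a 64-bucket last
table holding 4 elements makes the unpatched pass return -EEXIST and
reschedule every time, while the patched pass does not.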

Fixes: da20420f83ea ("rhashtable: Add nested tables")
Signed-off-by: Josh Elsasser <jelsasser@appneta.com>
---
 lib/rhashtable.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 852ffa5160f1..98e91f9544fa 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -377,9 +377,9 @@ static int rhashtable_rehash_alloc(struct rhashtable *ht,
  * It is valid to have concurrent insertions and deletions protected by per
  * bucket locks or concurrent RCU protected lookups and traversals.
  */
-static int rhashtable_shrink(struct rhashtable *ht)
+static int rhashtable_shrink(struct rhashtable *ht,
+			     struct bucket_table *old_tbl)
 {
-	struct bucket_table *old_tbl = rht_dereference(ht->tbl, ht);
 	unsigned int nelems = atomic_read(&ht->nelems);
 	unsigned int size = 0;
 
@@ -412,7 +412,7 @@ static void rht_deferred_worker(struct work_struct *work)
 	if (rht_grow_above_75(ht, tbl))
 		err = rhashtable_rehash_alloc(ht, tbl, tbl->size * 2);
 	else if (ht->p.automatic_shrinking && rht_shrink_below_30(ht, tbl))
-		err = rhashtable_shrink(ht);
+		err = rhashtable_shrink(ht, tbl);
 	else if (tbl->nest)
 		err = rhashtable_rehash_alloc(ht, tbl, tbl->size);
 
-- 
2.19.1



Thread overview: 9+ messages
2019-01-23 21:17 Josh Elsasser [this message]
2019-01-24  3:08 ` [v2 PATCH] rhashtable: Still do rehash when we get EEXIST Herbert Xu
2019-01-24  3:40   ` Josh Elsasser
2019-01-26 22:02     ` Josh Elsasser
2019-03-20 22:39       ` Josh Hunt
     [not found]       ` <CAKA=qzY4Pzee9BVzRCciW32toeHSz7t0q9LuvQXKLG2fX9fBbg@mail.gmail.com>
2019-03-21  1:39         ` [v3 " Herbert Xu
2019-03-21  1:46           ` Herbert Xu
2019-03-21 20:58             ` David Miller
2019-03-21 20:58           ` David Miller
