linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/4] improve fault-tolerance of rhashtable runtime-test
@ 2015-11-20 17:17 Phil Sutter
  2015-11-20 17:17 ` [PATCH v2 1/4] rhashtable-test: add cond_resched() to thread test Phil Sutter
                   ` (5 more replies)
  0 siblings, 6 replies; 39+ messages in thread
From: Phil Sutter @ 2015-11-20 17:17 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

The following series aims to improve lib/test_rhashtable in different
situations:

Patch 1 allows the kernel to reschedule so the test does not block too
        long on slow systems.
Patch 2 fixes behaviour under pressure, retrying inserts in non-permanent
        error case (-EBUSY).
Patch 3 auto-adjusts the upper table size limit according to the number
        of threads (in concurrency test). In fact, the current default is
	already too small.
Patch 4 makes it possible to retry inserts even in supposedly permanent
        error case (-ENOMEM) to expose rhashtable's remaining problem of
	-ENOMEM being not as permanent as it is expected to be.

Changes since v1:
- Introduce insert_retry() which is then used in single-threaded test as
  well.
- Do not retry inserts by default if -ENOMEM was returned.
- Rename the retry counter to be a bit more verbose about what it
  contains.
- Add patch 4 as a debugging aid.

Phil Sutter (4):
  rhashtable-test: add cond_resched() to thread test
  rhashtable-test: retry insert operations
  rhashtable-test: calculate max_entries value by default
  rhashtable-test: allow to retry even if -ENOMEM was returned

 lib/test_rhashtable.c | 76 +++++++++++++++++++++++++++++++++------------------
 1 file changed, 50 insertions(+), 26 deletions(-)

-- 
2.1.2



* [PATCH v2 1/4] rhashtable-test: add cond_resched() to thread test
  2015-11-20 17:17 [PATCH v2 0/4] improve fault-tolerance of rhashtable runtime-test Phil Sutter
@ 2015-11-20 17:17 ` Phil Sutter
  2015-11-20 17:17 ` [PATCH v2 2/4] rhashtable-test: retry insert operations Phil Sutter
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 39+ messages in thread
From: Phil Sutter @ 2015-11-20 17:17 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

This should fix soft lockup bugs triggered on slow systems.

Signed-off-by: Phil Sutter <phil@nwl.cc>
---
 lib/test_rhashtable.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/lib/test_rhashtable.c b/lib/test_rhashtable.c
index 8c1ad1c..63654e3 100644
--- a/lib/test_rhashtable.c
+++ b/lib/test_rhashtable.c
@@ -236,6 +236,8 @@ static int thread_lookup_test(struct thread_data *tdata)
 			       obj->value, key);
 			err++;
 		}
+
+		cond_resched();
 	}
 	return err;
 }
@@ -251,6 +253,7 @@ static int threadfunc(void *data)
 
 	for (i = 0; i < entries; i++) {
 		tdata->objs[i].value = (tdata->id << 16) | i;
+		cond_resched();
 		err = rhashtable_insert_fast(&ht, &tdata->objs[i].node,
 		                             test_rht_params);
 		if (err == -ENOMEM || err == -EBUSY) {
@@ -285,6 +288,8 @@ static int threadfunc(void *data)
 				goto out;
 			}
 			tdata->objs[i].value = TEST_INSERT_FAIL;
+
+			cond_resched();
 		}
 		err = thread_lookup_test(tdata);
 		if (err) {
-- 
2.1.2



* [PATCH v2 2/4] rhashtable-test: retry insert operations
  2015-11-20 17:17 [PATCH v2 0/4] improve fault-tolerance of rhashtable runtime-test Phil Sutter
  2015-11-20 17:17 ` [PATCH v2 1/4] rhashtable-test: add cond_resched() to thread test Phil Sutter
@ 2015-11-20 17:17 ` Phil Sutter
  2015-11-20 17:17 ` [PATCH v2 3/4] rhashtable-test: calculate max_entries value by default Phil Sutter
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 39+ messages in thread
From: Phil Sutter @ 2015-11-20 17:17 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

After adding cond_resched() calls to threadfunc(), a surprisingly high
rate of insert failures occurred, probably because table resizes now get
a better chance to run in the background. To avoid weakening the
remaining tests, retry inserts until they either succeed or fail
permanently.

Also change the non-threaded test to retry insert operations.

Suggested-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Phil Sutter <phil@nwl.cc>
---
 lib/test_rhashtable.c | 53 ++++++++++++++++++++++++++++-----------------------
 1 file changed, 29 insertions(+), 24 deletions(-)

diff --git a/lib/test_rhashtable.c b/lib/test_rhashtable.c
index 63654e3..cfc3440 100644
--- a/lib/test_rhashtable.c
+++ b/lib/test_rhashtable.c
@@ -76,6 +76,20 @@ static struct rhashtable_params test_rht_params = {
 static struct semaphore prestart_sem;
 static struct semaphore startup_sem = __SEMAPHORE_INITIALIZER(startup_sem, 0);
 
+static int insert_retry(struct rhashtable *ht, struct rhash_head *obj,
+                        const struct rhashtable_params params)
+{
+	int err, retries = -1;
+
+	do {
+		retries++;
+		cond_resched();
+		err = rhashtable_insert_fast(ht, obj, params);
+	} while (err == -EBUSY);
+
+	return err ? : retries;
+}
+
 static int __init test_rht_lookup(struct rhashtable *ht)
 {
 	unsigned int i;
@@ -157,7 +171,7 @@ static s64 __init test_rhashtable(struct rhashtable *ht)
 {
 	struct test_obj *obj;
 	int err;
-	unsigned int i, insert_fails = 0;
+	unsigned int i, insert_retries = 0;
 	s64 start, end;
 
 	/*
@@ -170,22 +184,16 @@ static s64 __init test_rhashtable(struct rhashtable *ht)
 		struct test_obj *obj = &array[i];
 
 		obj->value = i * 2;
-
-		err = rhashtable_insert_fast(ht, &obj->node, test_rht_params);
-		if (err == -ENOMEM || err == -EBUSY) {
-			/* Mark failed inserts but continue */
-			obj->value = TEST_INSERT_FAIL;
-			insert_fails++;
-		} else if (err) {
+		err = insert_retry(ht, &obj->node, test_rht_params);
+		if (err > 0)
+			insert_retries += err;
+		else if (err)
 			return err;
-		}
-
-		cond_resched();
 	}
 
-	if (insert_fails)
-		pr_info("  %u insertions failed due to memory pressure\n",
-			insert_fails);
+	if (insert_retries)
+		pr_info("  %u insertions retried due to memory pressure\n",
+			insert_retries);
 
 	test_bucket_stats(ht);
 	rcu_read_lock();
@@ -244,7 +252,7 @@ static int thread_lookup_test(struct thread_data *tdata)
 
 static int threadfunc(void *data)
 {
-	int i, step, err = 0, insert_fails = 0;
+	int i, step, err = 0, insert_retries = 0;
 	struct thread_data *tdata = data;
 
 	up(&prestart_sem);
@@ -253,21 +261,18 @@ static int threadfunc(void *data)
 
 	for (i = 0; i < entries; i++) {
 		tdata->objs[i].value = (tdata->id << 16) | i;
-		cond_resched();
-		err = rhashtable_insert_fast(&ht, &tdata->objs[i].node,
-		                             test_rht_params);
-		if (err == -ENOMEM || err == -EBUSY) {
-			tdata->objs[i].value = TEST_INSERT_FAIL;
-			insert_fails++;
+		err = insert_retry(&ht, &tdata->objs[i].node, test_rht_params);
+		if (err > 0) {
+			insert_retries += err;
 		} else if (err) {
 			pr_err("  thread[%d]: rhashtable_insert_fast failed\n",
 			       tdata->id);
 			goto out;
 		}
 	}
-	if (insert_fails)
-		pr_info("  thread[%d]: %d insert failures\n",
-		        tdata->id, insert_fails);
+	if (insert_retries)
+		pr_info("  thread[%d]: %u insertions retried due to memory pressure\n",
+			tdata->id, insert_retries);
 
 	err = thread_lookup_test(tdata);
 	if (err) {
-- 
2.1.2



* [PATCH v2 3/4] rhashtable-test: calculate max_entries value by default
  2015-11-20 17:17 [PATCH v2 0/4] improve fault-tolerance of rhashtable runtime-test Phil Sutter
  2015-11-20 17:17 ` [PATCH v2 1/4] rhashtable-test: add cond_resched() to thread test Phil Sutter
  2015-11-20 17:17 ` [PATCH v2 2/4] rhashtable-test: retry insert operations Phil Sutter
@ 2015-11-20 17:17 ` Phil Sutter
  2015-11-20 17:17 ` [PATCH v2 4/4] rhashtable-test: allow to retry even if -ENOMEM was returned Phil Sutter
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 39+ messages in thread
From: Phil Sutter @ 2015-11-20 17:17 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

A maximum table size of 64k entries is insufficient for the multiple
threads test even in the default configuration (10 threads * 50000
objects = 500000 objects in total). Since we know how many objects will
be inserted, calculate the max size unless it is overridden via the
max_size parameter. For the default concurrency test this amounts to
roundup_pow_of_two(10 * 50000) = 524288, well above the old 65536 limit.

Note that specifying the exact number of objects upon table init won't
suffice, as that value is rounded down to the next power of two;
anticipate this by rounding up to the next power of two beforehand.

Signed-off-by: Phil Sutter <phil@nwl.cc>
---
 lib/test_rhashtable.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/test_rhashtable.c b/lib/test_rhashtable.c
index cfc3440..6fa77b3 100644
--- a/lib/test_rhashtable.c
+++ b/lib/test_rhashtable.c
@@ -36,9 +36,9 @@ static int runs = 4;
 module_param(runs, int, 0);
 MODULE_PARM_DESC(runs, "Number of test runs per variant (default: 4)");
 
-static int max_size = 65536;
+static int max_size = 0;
 module_param(max_size, int, 0);
-MODULE_PARM_DESC(runs, "Maximum table size (default: 65536)");
+MODULE_PARM_DESC(runs, "Maximum table size (default: calculated)");
 
 static bool shrinking = false;
 module_param(shrinking, bool, 0);
@@ -321,7 +321,7 @@ static int __init test_rht_init(void)
 	entries = min(entries, MAX_ENTRIES);
 
 	test_rht_params.automatic_shrinking = shrinking;
-	test_rht_params.max_size = max_size;
+	test_rht_params.max_size = max_size ? : roundup_pow_of_two(entries);
 	test_rht_params.nelem_hint = size;
 
 	pr_info("Running rhashtable test nelem=%d, max_size=%d, shrinking=%d\n",
@@ -367,6 +367,8 @@ static int __init test_rht_init(void)
 		return -ENOMEM;
 	}
 
+	test_rht_params.max_size = max_size ? :
+	                           roundup_pow_of_two(tcount * entries);
 	err = rhashtable_init(&ht, &test_rht_params);
 	if (err < 0) {
 		pr_warn("Test failed: Unable to initialize hashtable: %d\n",
-- 
2.1.2



* [PATCH v2 4/4] rhashtable-test: allow to retry even if -ENOMEM was returned
  2015-11-20 17:17 [PATCH v2 0/4] improve fault-tolerance of rhashtable runtime-test Phil Sutter
                   ` (2 preceding siblings ...)
  2015-11-20 17:17 ` [PATCH v2 3/4] rhashtable-test: calculate max_entries value by default Phil Sutter
@ 2015-11-20 17:17 ` Phil Sutter
  2015-11-20 17:28   ` Phil Sutter
  2015-11-23 17:38 ` [PATCH v2 0/4] improve fault-tolerance of rhashtable runtime-test David Miller
  2015-11-30  9:37 ` Herbert Xu
  5 siblings, 1 reply; 39+ messages in thread
From: Phil Sutter @ 2015-11-20 17:17 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

This is rather a hack to expose the current issue with rhashtable:
under high pressure it sometimes returns -ENOMEM even though system
memory is not exhausted and a subsequent insert may succeed.

Signed-off-by: Phil Sutter <phil@nwl.cc>
---
 lib/test_rhashtable.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/lib/test_rhashtable.c b/lib/test_rhashtable.c
index 6fa77b3..270bf72 100644
--- a/lib/test_rhashtable.c
+++ b/lib/test_rhashtable.c
@@ -52,6 +52,10 @@ static int tcount = 10;
 module_param(tcount, int, 0);
 MODULE_PARM_DESC(tcount, "Number of threads to spawn (default: 10)");
 
+static bool enomem_retry = false;
+module_param(enomem_retry, bool, 0);
+MODULE_PARM_DESC(enomem_retry, "Retry insert even if -ENOMEM was returned (default: off)");
+
 struct test_obj {
 	int			value;
 	struct rhash_head	node;
@@ -79,14 +83,22 @@ static struct semaphore startup_sem = __SEMAPHORE_INITIALIZER(startup_sem, 0);
 static int insert_retry(struct rhashtable *ht, struct rhash_head *obj,
                         const struct rhashtable_params params)
 {
-	int err, retries = -1;
+	int err, retries = -1, enomem_retries = 0;
 
 	do {
 		retries++;
 		cond_resched();
 		err = rhashtable_insert_fast(ht, obj, params);
+		if (err == -ENOMEM && enomem_retry) {
+			enomem_retries++;
+			err = -EBUSY;
+		}
 	} while (err == -EBUSY);
 
+	if (enomem_retries)
+		pr_info(" %u insertions retried after -ENOMEM\n",
+			enomem_retries);
+
 	return err ? : retries;
 }
 
-- 
2.1.2



* Re: [PATCH v2 4/4] rhashtable-test: allow to retry even if -ENOMEM was returned
  2015-11-20 17:17 ` [PATCH v2 4/4] rhashtable-test: allow to retry even if -ENOMEM was returned Phil Sutter
@ 2015-11-20 17:28   ` Phil Sutter
  0 siblings, 0 replies; 39+ messages in thread
From: Phil Sutter @ 2015-11-20 17:28 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

On Fri, Nov 20, 2015 at 06:17:20PM +0100, Phil Sutter wrote:
> This is rather a hack to expose the current issue with rhashtable:
> under high pressure it sometimes returns -ENOMEM even though system
> memory is not exhausted and a subsequent insert may succeed.

Please note that this problem does not show up every time the test is
run in its default configuration on my system. With an increased number
of threads, though, it becomes very visible. Load test_rhashtable like so:

modprobe test_rhashtable enomem_retry=1 tcount=20

and grep dmesg for 'insertions retried after -ENOMEM'. In my case:

# dmesg | grep -E '(insertions retried after -ENOMEM|Started)' | tail
[   34.642980]  1 insertions retried after -ENOMEM
[   34.642989]  1 insertions retried after -ENOMEM
[   34.642994]  1 insertions retried after -ENOMEM
[   34.648353]  28294 insertions retried after -ENOMEM
[   34.689687]  31262 insertions retried after -ENOMEM
[   34.714015]  16280 insertions retried after -ENOMEM
[   34.736019]  15327 insertions retried after -ENOMEM
[   34.755100]  39012 insertions retried after -ENOMEM
[   34.769116]  49369 insertions retried after -ENOMEM
[   35.387200] Started 20 threads, 0 failed

Cheers, Phil


* Re: [PATCH v2 0/4] improve fault-tolerance of rhashtable runtime-test
  2015-11-20 17:17 [PATCH v2 0/4] improve fault-tolerance of rhashtable runtime-test Phil Sutter
                   ` (3 preceding siblings ...)
  2015-11-20 17:17 ` [PATCH v2 4/4] rhashtable-test: allow to retry even if -ENOMEM was returned Phil Sutter
@ 2015-11-23 17:38 ` David Miller
  2015-11-30  9:37 ` Herbert Xu
  5 siblings, 0 replies; 39+ messages in thread
From: David Miller @ 2015-11-23 17:38 UTC (permalink / raw)
  To: phil; +Cc: netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

From: Phil Sutter <phil@nwl.cc>
Date: Fri, 20 Nov 2015 18:17:16 +0100

> The following series aims to improve lib/test_rhashtable in different
> situations:
> 
> Patch 1 allows the kernel to reschedule so the test does not block too
>         long on slow systems.
> Patch 2 fixes behaviour under pressure, retrying inserts in non-permanent
>         error case (-EBUSY).
> Patch 3 auto-adjusts the upper table size limit according to the number
>         of threads (in concurrency test). In fact, the current default is
> 	already too small.
> Patch 4 makes it possible to retry inserts even in supposedly permanent
>         error case (-ENOMEM) to expose rhashtable's remaining problem of
> 	-ENOMEM being not as permanent as it is expected to be.
> 
> Changes since v1:
> - Introduce insert_retry() which is then used in single-threaded test as
>   well.
> - Do not retry inserts by default if -ENOMEM was returned.
> - Rename the retry counter to be a bit more verbose about what it
>   contains.
> - Add patch 4 as a debugging aid.

Series applied, thanks Phil.


* Re: [PATCH v2 0/4] improve fault-tolerance of rhashtable runtime-test
  2015-11-20 17:17 [PATCH v2 0/4] improve fault-tolerance of rhashtable runtime-test Phil Sutter
                   ` (4 preceding siblings ...)
  2015-11-23 17:38 ` [PATCH v2 0/4] improve fault-tolerance of rhashtable runtime-test David Miller
@ 2015-11-30  9:37 ` Herbert Xu
  2015-11-30 10:14   ` Phil Sutter
  5 siblings, 1 reply; 39+ messages in thread
From: Herbert Xu @ 2015-11-30  9:37 UTC (permalink / raw)
  To: Phil Sutter; +Cc: davem, netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

Phil Sutter <phil@nwl.cc> wrote:
> The following series aims to improve lib/test_rhashtable in different
> situations:
> 
> Patch 1 allows the kernel to reschedule so the test does not block too
>        long on slow systems.
> Patch 2 fixes behaviour under pressure, retrying inserts in non-permanent
>        error case (-EBUSY).
> Patch 3 auto-adjusts the upper table size limit according to the number
>        of threads (in concurrency test). In fact, the current default is
>        already too small.
> Patch 4 makes it possible to retry inserts even in supposedly permanent
>        error case (-ENOMEM) to expose rhashtable's remaining problem of
>        -ENOMEM being not as permanent as it is expected to be.

I'm sorry but this patch series is simply bogus.

If rhashtable is indeed returning such errors under normal
conditions then rhashtable is broken and we must fix it instead
of working around it in the test code!

FWIW I still haven't been able to reproduce this problem, perhaps
because my machines have too few CPUs?

So can someone please help me reproduce this? Because just loading
test_rhashtable isn't doing it.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH v2 0/4] improve fault-tolerance of rhashtable runtime-test
  2015-11-30  9:37 ` Herbert Xu
@ 2015-11-30 10:14   ` Phil Sutter
  2015-11-30 10:18     ` Herbert Xu
  0 siblings, 1 reply; 39+ messages in thread
From: Phil Sutter @ 2015-11-30 10:14 UTC (permalink / raw)
  To: Herbert Xu; +Cc: davem, netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

On Mon, Nov 30, 2015 at 05:37:55PM +0800, Herbert Xu wrote:
> Phil Sutter <phil@nwl.cc> wrote:
> > The following series aims to improve lib/test_rhashtable in different
> > situations:
> > 
> > Patch 1 allows the kernel to reschedule so the test does not block too
> >        long on slow systems.
> > Patch 2 fixes behaviour under pressure, retrying inserts in non-permanent
> >        error case (-EBUSY).
> > Patch 3 auto-adjusts the upper table size limit according to the number
> >        of threads (in concurrency test). In fact, the current default is
> >        already too small.
> > Patch 4 makes it possible to retry inserts even in supposedly permanent
> >        error case (-ENOMEM) to expose rhashtable's remaining problem of
> >        -ENOMEM being not as permanent as it is expected to be.
> 
> I'm sorry but this patch series is simply bogus.

The whole series?!

> If rhashtable is indeed returning such errors under normal
> conditions then rhashtable is broken and we must fix it instead
> of working around it in the test code!

You're stating the obvious. Remember, the reason I prepared patch 4 was
because you wanted to fix just that bug in rhashtable in the first
place.

Just to make this clear: Patches 1-3 are reasonable on their own, the
only connection to the bug is that patch 2 makes it visible (at least on
my system it wasn't before).

> FWIW I still haven't been able to reproduce this problem, perhaps
> because my machines have too few CPUs?

Did you try with my bogus patch series applied? How many CPUs does your
test system actually have?

> So can someone please help me reproduce this? Because just loading
> test_rhashtable isn't doing it.

As said, maybe you need to increase the number of spawned threads
(tcount=50 or so).

Cheers, Phil


* Re: [PATCH v2 0/4] improve fault-tolerance of rhashtable runtime-test
  2015-11-30 10:14   ` Phil Sutter
@ 2015-11-30 10:18     ` Herbert Xu
  2015-12-03 12:41       ` rhashtable: Prevent spurious EBUSY errors on insertion Herbert Xu
  2015-12-03 12:51       ` rhashtable: ENOMEM errors when hit with a flood of insertions Herbert Xu
  0 siblings, 2 replies; 39+ messages in thread
From: Herbert Xu @ 2015-11-30 10:18 UTC (permalink / raw)
  To: Phil Sutter, davem, netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

On Mon, Nov 30, 2015 at 11:14:01AM +0100, Phil Sutter wrote:
> On Mon, Nov 30, 2015 at 05:37:55PM +0800, Herbert Xu wrote:
> > Phil Sutter <phil@nwl.cc> wrote:
> > > The following series aims to improve lib/test_rhashtable in different
> > > situations:
> > > 
> > > Patch 1 allows the kernel to reschedule so the test does not block too
> > >        long on slow systems.
> > > Patch 2 fixes behaviour under pressure, retrying inserts in non-permanent
> > >        error case (-EBUSY).
> > > Patch 3 auto-adjusts the upper table size limit according to the number
> > >        of threads (in concurrency test). In fact, the current default is
> > >        already too small.
> > > Patch 4 makes it possible to retry inserts even in supposedly permanent
> > >        error case (-ENOMEM) to expose rhashtable's remaining problem of
> > >        -ENOMEM being not as permanent as it is expected to be.
> > 
> > I'm sorry but this patch series is simply bogus.
> 
> The whole series?!

Well, at least patches two and four seem clearly wrong because no
rhashtable user should need to retry insertions.

> Did you try with my bogus patch series applied? How many CPUs does your
> test system actually have?
> 
> > So can someone please help me reproduce this? Because just loading
> > test_rhashtable isn't doing it.
> 
> As said, maybe you need to increase the number of spawned threads
> (tcount=50 or so).

OK that's better.  I think I see the problem.  The test in
rhashtable_insert_rehash is racy and if two threads both try
to grow the table one of them may be tricked into doing a rehash
instead.

I'm working on a fix.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* rhashtable: Prevent spurious EBUSY errors on insertion
  2015-11-30 10:18     ` Herbert Xu
@ 2015-12-03 12:41       ` Herbert Xu
  2015-12-03 15:38         ` Phil Sutter
                           ` (2 more replies)
  2015-12-03 12:51       ` rhashtable: ENOMEM errors when hit with a flood of insertions Herbert Xu
  1 sibling, 3 replies; 39+ messages in thread
From: Herbert Xu @ 2015-12-03 12:41 UTC (permalink / raw)
  To: Phil Sutter, davem, netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

On Mon, Nov 30, 2015 at 06:18:59PM +0800, Herbert Xu wrote:
> 
> OK that's better.  I think I see the problem.  The test in
> rhashtable_insert_rehash is racy and if two threads both try
> to grow the table one of them may be tricked into doing a rehash
> instead.
> 
> I'm working on a fix.

OK this patch fixes the EBUSY problem as far as I can tell.  Please
let me know if you still observe EBUSY with it.  I'll respond to the
ENOMEM problem in another email.

---8<---
Thomas and Phil observed that under stress rhashtable insertion
sometimes failed with EBUSY, even though this error should only
ever be seen when we're under attack and our hash chain length
has grown to an unacceptable level, even after a rehash.

It turns out that the logic for detecting whether there is an
existing rehash is faulty.  In particular, when two threads both
try to grow the same table at the same time, one of them may see
the newly grown table and thus erroneously conclude that it had
been rehashed.  This is what leads to the EBUSY error.

This patch fixes this by remembering the current last table we
used during insertion so that rhashtable_insert_rehash can detect
when another thread has also done a resize/rehash.  When this is
detected we will give up our resize/rehash and simply retry the
insertion with the new table.

Reported-by: Thomas Graf <tgraf@suug.ch>
Reported-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index 843ceca..e50b31d 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -19,6 +19,7 @@
 
 #include <linux/atomic.h>
 #include <linux/compiler.h>
+#include <linux/err.h>
 #include <linux/errno.h>
 #include <linux/jhash.h>
 #include <linux/list_nulls.h>
@@ -339,10 +340,11 @@ static inline int lockdep_rht_bucket_is_held(const struct bucket_table *tbl,
 int rhashtable_init(struct rhashtable *ht,
 		    const struct rhashtable_params *params);
 
-int rhashtable_insert_slow(struct rhashtable *ht, const void *key,
-			   struct rhash_head *obj,
-			   struct bucket_table *old_tbl);
-int rhashtable_insert_rehash(struct rhashtable *ht);
+struct bucket_table *rhashtable_insert_slow(struct rhashtable *ht,
+					    const void *key,
+					    struct rhash_head *obj,
+					    struct bucket_table *old_tbl);
+int rhashtable_insert_rehash(struct rhashtable *ht, struct bucket_table *tbl);
 
 int rhashtable_walk_init(struct rhashtable *ht, struct rhashtable_iter *iter);
 void rhashtable_walk_exit(struct rhashtable_iter *iter);
@@ -598,9 +600,11 @@ restart:
 
 	new_tbl = rht_dereference_rcu(tbl->future_tbl, ht);
 	if (unlikely(new_tbl)) {
-		err = rhashtable_insert_slow(ht, key, obj, new_tbl);
-		if (err == -EAGAIN)
+		tbl = rhashtable_insert_slow(ht, key, obj, new_tbl);
+		if (!IS_ERR_OR_NULL(tbl))
 			goto slow_path;
+
+		err = PTR_ERR(tbl);
 		goto out;
 	}
 
@@ -611,7 +615,7 @@ restart:
 	if (unlikely(rht_grow_above_100(ht, tbl))) {
 slow_path:
 		spin_unlock_bh(lock);
-		err = rhashtable_insert_rehash(ht);
+		err = rhashtable_insert_rehash(ht, tbl);
 		rcu_read_unlock();
 		if (err)
 			return err;
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index a54ff89..2ff7ed9 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -389,33 +389,31 @@ static bool rhashtable_check_elasticity(struct rhashtable *ht,
 	return false;
 }
 
-int rhashtable_insert_rehash(struct rhashtable *ht)
+int rhashtable_insert_rehash(struct rhashtable *ht,
+			     struct bucket_table *tbl)
 {
 	struct bucket_table *old_tbl;
 	struct bucket_table *new_tbl;
-	struct bucket_table *tbl;
 	unsigned int size;
 	int err;
 
 	old_tbl = rht_dereference_rcu(ht->tbl, ht);
-	tbl = rhashtable_last_table(ht, old_tbl);
 
 	size = tbl->size;
 
+	err = -EBUSY;
+
 	if (rht_grow_above_75(ht, tbl))
 		size *= 2;
 	/* Do not schedule more than one rehash */
 	else if (old_tbl != tbl)
-		return -EBUSY;
+		goto fail;
+
+	err = -ENOMEM;
 
 	new_tbl = bucket_table_alloc(ht, size, GFP_ATOMIC);
-	if (new_tbl == NULL) {
-		/* Schedule async resize/rehash to try allocation
-		 * non-atomic context.
-		 */
-		schedule_work(&ht->run_work);
-		return -ENOMEM;
-	}
+	if (new_tbl == NULL)
+		goto fail;
 
 	err = rhashtable_rehash_attach(ht, tbl, new_tbl);
 	if (err) {
@@ -426,12 +424,24 @@ int rhashtable_insert_rehash(struct rhashtable *ht)
 		schedule_work(&ht->run_work);
 
 	return err;
+
+fail:
+	/* Do not fail the insert if someone else did a rehash. */
+	if (likely(rcu_dereference_raw(tbl->future_tbl)))
+		return 0;
+
+	/* Schedule async rehash to retry allocation in process context. */
+	if (err == -ENOMEM)
+		schedule_work(&ht->run_work);
+
+	return err;
 }
 EXPORT_SYMBOL_GPL(rhashtable_insert_rehash);
 
-int rhashtable_insert_slow(struct rhashtable *ht, const void *key,
-			   struct rhash_head *obj,
-			   struct bucket_table *tbl)
+struct bucket_table *rhashtable_insert_slow(struct rhashtable *ht,
+					    const void *key,
+					    struct rhash_head *obj,
+					    struct bucket_table *tbl)
 {
 	struct rhash_head *head;
 	unsigned int hash;
@@ -467,7 +477,12 @@ int rhashtable_insert_slow(struct rhashtable *ht, const void *key,
 exit:
 	spin_unlock(rht_bucket_lock(tbl, hash));
 
-	return err;
+	if (err == 0)
+		return NULL;
+	else if (err == -EAGAIN)
+		return tbl;
+	else
+		return ERR_PTR(err);
 }
 EXPORT_SYMBOL_GPL(rhashtable_insert_slow);
 
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* rhashtable: ENOMEM errors when hit with a flood of insertions
  2015-11-30 10:18     ` Herbert Xu
  2015-12-03 12:41       ` rhashtable: Prevent spurious EBUSY errors on insertion Herbert Xu
@ 2015-12-03 12:51       ` Herbert Xu
  2015-12-03 15:08         ` David Laight
  2015-12-03 16:08         ` Eric Dumazet
  1 sibling, 2 replies; 39+ messages in thread
From: Herbert Xu @ 2015-12-03 12:51 UTC (permalink / raw)
  To: Phil Sutter, davem, netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

On Mon, Nov 30, 2015 at 06:18:59PM +0800, Herbert Xu wrote:
> 
> OK that's better.  I think I see the problem.  The test in
> rhashtable_insert_rehash is racy and if two threads both try
> to grow the table one of them may be tricked into doing a rehash
> instead.
> 
> I'm working on a fix.

While the EBUSY errors are gone for me, I can still see plenty
of ENOMEM errors.  In fact it turns out that the reason is quite
understandable.  When you pound the rhashtable hard so that it
doesn't actually get a chance to grow the table in process context,
then the table will only grow with GFP_ATOMIC allocations.

For me this starts failing regularly at around 2^19 entries, which
requires about 1024 contiguous pages if I'm not mistaken: 2^19 buckets
at 8 bytes per bucket pointer come to 4 MiB, i.e. 1024 physically
contiguous 4 KiB pages on a 64-bit platform.

I've got a fairly straightforward solution for this, but it does
mean that we have to add another level of complexity to the
rhashtable implementation.  So before I go there I want to be
absolutely sure that we need it.

I guess the question is do we care about users that pound rhashtable
in this fashion?

My answer would be yes but I'd like to hear your opinions.

My solution is to use a slightly more complex/less efficient hash
table when we fail the allocation in interrupt context.  Instead
of allocating contiguous pages, we'll simply switch to allocating
individual pages and have a master page that points to them.

On a 64-bit platform, each page can accommodate 512 entries.  So
with a two-level deep setup (meaning one extra access for a hash
lookup), this would accommodate 2^18 entries.  Three levels (two
extra lookups) will give us 2^27 entries, which should be enough.

When we do this we should of course schedule an async rehash so
that as soon as we get a chance we can move the entries into a
normal hash table that needs only a single lookup.
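
As a rough illustration of what such a nested layout could look like
(invented names, ordinary userspace C, explicitly not the actual
rhashtable code), lookups go through a small directory plus one
page-sized leaf per group of buckets, and only page-sized allocations
are ever needed:

/*
 * Hypothetical sketch only -- not rhashtable code.  Leaf pages are
 * allocated one page at a time, so no large physically contiguous
 * allocation is required, at the cost of one extra dereference per
 * bucket lookup.
 */
#include <stdlib.h>

#define PAGE_SZ		4096u
#define SLOTS_PER_PAGE	(PAGE_SZ / sizeof(void *))	/* 512 on 64-bit */

struct nested_table {
	size_t nbuckets;	/* power of two, multiple of SLOTS_PER_PAGE */
	void ***leaves;		/* directory of per-page bucket arrays */
};

static struct nested_table *nested_table_alloc(size_t nbuckets)
{
	size_t i, nleaves = nbuckets / SLOTS_PER_PAGE;
	struct nested_table *tbl = calloc(1, sizeof(*tbl));

	if (!tbl || !(tbl->leaves = calloc(nleaves, sizeof(*tbl->leaves))))
		goto err;
	tbl->nbuckets = nbuckets;
	for (i = 0; i < nleaves; i++) {
		/* one page-sized allocation per leaf, never a huge one */
		tbl->leaves[i] = calloc(SLOTS_PER_PAGE, sizeof(void *));
		if (!tbl->leaves[i])
			goto err_leaves;
	}
	return tbl;

err_leaves:
	while (i--)
		free(tbl->leaves[i]);
	free(tbl->leaves);
err:
	free(tbl);
	return NULL;
}

/* Two-level lookup: directory index first, then slot within the leaf. */
static void **nested_bucket(const struct nested_table *tbl, unsigned int hash)
{
	size_t index = hash & (tbl->nbuckets - 1);

	return &tbl->leaves[index / SLOTS_PER_PAGE][index % SLOTS_PER_PAGE];
}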

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* RE: rhashtable: ENOMEM errors when hit with a flood of insertions
  2015-12-03 12:51       ` rhashtable: ENOMEM errors when hit with a flood of insertions Herbert Xu
@ 2015-12-03 15:08         ` David Laight
  2015-12-03 16:08         ` Eric Dumazet
  1 sibling, 0 replies; 39+ messages in thread
From: David Laight @ 2015-12-03 15:08 UTC (permalink / raw)
  To: 'Herbert Xu',
	Phil Sutter, davem, netdev, linux-kernel, tgraf, fengguang.wu,
	wfg, lkp

From: Herbert Xu
> Sent: 03 December 2015 12:51
> On Mon, Nov 30, 2015 at 06:18:59PM +0800, Herbert Xu wrote:
> >
> > OK that's better.  I think I see the problem.  The test in
> > rhashtable_insert_rehash is racy and if two threads both try
> > to grow the table one of them may be tricked into doing a rehash
> > instead.
> >
> > I'm working on a fix.
> 
> While the EBUSY errors are gone for me, I can still see plenty
> of ENOMEM errors.  In fact it turns out that the reason is quite
> understandable.  When you pound the rhashtable hard so that it
> doesn't actually get a chance to grow the table in process context,
> then the table will only grow with GFP_ATOMIC allocations.
> 
> For me this starts failing regularly at around 2^19 entries, which
> requires about 1024 contiguous pages if I'm not mistaken.

ISTM that you should always let the insert succeed - even if it makes
the average/maximum chain length increase beyond some limit.
Any limit on the number of hashed items should have been enforced earlier
by the calling code.
The slight performance decrease caused by scanning longer chains
is almost certainly more 'user friendly' than an error return.

Hoping to get 1024+ contiguous VA pages does seem over-optimistic.

With a 2-level lookup you could make all the 2nd level tables
a fixed size (maybe 4 or 8 pages?) and extend the first level
table as needed.

	David


* Re: rhashtable: Prevent spurious EBUSY errors on insertion
  2015-12-03 12:41       ` rhashtable: Prevent spurious EBUSY errors on insertion Herbert Xu
@ 2015-12-03 15:38         ` Phil Sutter
  2015-12-04 19:38         ` David Miller
  2015-12-17  8:46         ` Xin Long
  2 siblings, 0 replies; 39+ messages in thread
From: Phil Sutter @ 2015-12-03 15:38 UTC (permalink / raw)
  To: Herbert Xu; +Cc: davem, netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

On Thu, Dec 03, 2015 at 08:41:29PM +0800, Herbert Xu wrote:
> On Mon, Nov 30, 2015 at 06:18:59PM +0800, Herbert Xu wrote:
> > 
> > OK that's better.  I think I see the problem.  The test in
> > rhashtable_insert_rehash is racy and if two threads both try
> > to grow the table one of them may be tricked into doing a rehash
> > instead.
> > 
> > I'm working on a fix.
> 
> OK this patch fixes the EBUSY problem as far as I can tell.  Please
> let me know if you still observe EBUSY with it.  I'll respond to the
> ENOMEM problem in another email.
> 
> ---8<---
> Thomas and Phil observed that under stress rhashtable insertion
> sometimes failed with EBUSY, even though this error should only
> ever been seen when we're under attack and our hash chain length
> has grown to an unacceptable level, even after a rehash.
> 
> It turns out that the logic for detecting whether there is an
> existing rehash is faulty.  In particular, when two threads both
> try to grow the same table at the same time, one of them may see
> the newly grown table and thus erroneously conclude that it had
> been rehashed.  This is what leads to the EBUSY error.
> 
> This patch fixes this by remembering the current last table we
> used during insertion so that rhashtable_insert_rehash can detect
> when another thread has also done a resize/rehash.  When this is
> detected we will give up our resize/rehash and simply retry the
> insertion with the new table.
> 
> Reported-by: Thomas Graf <tgraf@suug.ch>
> Reported-by: Phil Sutter <phil@nwl.cc>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Tested-by: Phil Sutter <phil@nwl.cc>


* Re: rhashtable: ENOMEM errors when hit with a flood of insertions
  2015-12-03 12:51       ` rhashtable: ENOMEM errors when hit with a flood of insertions Herbert Xu
  2015-12-03 15:08         ` David Laight
@ 2015-12-03 16:08         ` Eric Dumazet
  2015-12-04  0:07           ` Herbert Xu
  2015-12-04 14:39           ` rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation Herbert Xu
  1 sibling, 2 replies; 39+ messages in thread
From: Eric Dumazet @ 2015-12-03 16:08 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Phil Sutter, davem, netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

On Thu, 2015-12-03 at 20:51 +0800, Herbert Xu wrote:
> On Mon, Nov 30, 2015 at 06:18:59PM +0800, Herbert Xu wrote:
> > 
> > OK that's better.  I think I see the problem.  The test in
> > rhashtable_insert_rehash is racy and if two threads both try
> > to grow the table one of them may be tricked into doing a rehash
> > instead.
> > 
> > I'm working on a fix.
> 
> While the EBUSY errors are gone for me, I can still see plenty
> of ENOMEM errors.  In fact it turns out that the reason is quite
> understandable.  When you pound the rhashtable hard so that it
> doesn't actually get a chance to grow the table in process context,
> then the table will only grow with GFP_ATOMIC allocations.
> 
> For me this starts failing regularly at around 2^19 entries, which
> requires about 1024 contiguous pages if I'm not mistaken.

Well, it will fail before this point if memory is fragmented.

Anyway, __vmalloc() can be used with GFP_ATOMIC, have you tried this ?

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index a54ff8949f91..9ef5d74963b2 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -120,8 +120,9 @@ static struct bucket_table *bucket_table_alloc(struct rhashtable *ht,
 	if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER) ||
 	    gfp != GFP_KERNEL)
 		tbl = kzalloc(size, gfp | __GFP_NOWARN | __GFP_NORETRY);
-	if (tbl == NULL && gfp == GFP_KERNEL)
-		tbl = vzalloc(size);
+	if (tbl == NULL)
+		tbl = __vmalloc(size, gfp | __GFP_HIGHMEM | __GFP_ZERO,
+				PAGE_KERNEL);
 	if (tbl == NULL)
 		return NULL;
 




* Re: rhashtable: ENOMEM errors when hit with a flood of insertions
  2015-12-03 16:08         ` Eric Dumazet
@ 2015-12-04  0:07           ` Herbert Xu
  2015-12-04 14:39           ` rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation Herbert Xu
  1 sibling, 0 replies; 39+ messages in thread
From: Herbert Xu @ 2015-12-04  0:07 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Phil Sutter, davem, netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

On Thu, Dec 03, 2015 at 08:08:39AM -0800, Eric Dumazet wrote:
>
> Well, it will fail before this point if memory is fragmented.

Indeed, I was surprised that it even worked up to that point,
possibly because the previous resizes might have actually been
done in process context.

> Anyway, __vmalloc() can be used with GFP_ATOMIC, have you tried this ?

Ah I didn't know that.  That would be much simpler.  I'll give it a
try.

Thanks Eric!
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation
  2015-12-03 16:08         ` Eric Dumazet
  2015-12-04  0:07           ` Herbert Xu
@ 2015-12-04 14:39           ` Herbert Xu
  2015-12-04 17:01             ` Phil Sutter
  2015-12-04 21:53             ` David Miller
  1 sibling, 2 replies; 39+ messages in thread
From: Herbert Xu @ 2015-12-04 14:39 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Phil Sutter, davem, netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

On Thu, Dec 03, 2015 at 08:08:39AM -0800, Eric Dumazet wrote:
>
> Anyway, __vmalloc() can be used with GFP_ATOMIC, have you tried this ?

OK I've tried it and I no longer get any ENOMEM errors!

---8<---
When an rhashtable user pounds rhashtable hard with back-to-back
insertions we may end up growing the table in GFP_ATOMIC context.
Unfortunately when the table reaches a certain size this often
fails because we don't have enough physically contiguous pages
to hold the new table.

Eric Dumazet suggested (and in fact wrote this patch) using
__vmalloc instead which can be used in GFP_ATOMIC context.

Reported-by: Phil Sutter <phil@nwl.cc>
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index a54ff89..1c624db 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -120,8 +120,9 @@ static struct bucket_table *bucket_table_alloc(struct rhashtable *ht,
 	if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER) ||
 	    gfp != GFP_KERNEL)
 		tbl = kzalloc(size, gfp | __GFP_NOWARN | __GFP_NORETRY);
-	if (tbl == NULL && gfp == GFP_KERNEL)
-		tbl = vzalloc(size);
+	if (tbl == NULL)
+		tbl = __vmalloc(size, gfp | __GFP_HIGHMEM | __GFP_ZERO,
+				PAGE_KERNEL);
 	if (tbl == NULL)
 		return NULL;
 
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation
  2015-12-04 14:39           ` rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation Herbert Xu
@ 2015-12-04 17:01             ` Phil Sutter
  2015-12-04 17:45               ` Eric Dumazet
  2015-12-04 21:53             ` David Miller
  1 sibling, 1 reply; 39+ messages in thread
From: Phil Sutter @ 2015-12-04 17:01 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Eric Dumazet, davem, netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

[-- Attachment #1: Type: text/plain, Size: 923 bytes --]

On Fri, Dec 04, 2015 at 10:39:56PM +0800, Herbert Xu wrote:
> On Thu, Dec 03, 2015 at 08:08:39AM -0800, Eric Dumazet wrote:
> >
> > Anyway, __vmalloc() can be used with GFP_ATOMIC, have you tried this ?
> 
> OK I've tried it and I no longer get any ENOMEM errors!

I can't confirm this, sadly. With 50 threads, results seem to be stable
and good. But when increasing the number of threads I can provoke the
ENOMEM condition again. See the attached log, which shows a failing test
run with 100 threads.

I tried to extract logs of a test run with as few failing threads as
possible, but wasn't successful. It seems like the error amplifies
itself: while runs with fewer than 70 threads succeed reliably, once I
go beyond a margin I could not pin down exactly, far more threads fail
than expected. For instance, the attached log shows 70 out of 100
threads failing, while for me every single test with 50 threads was
successful.

HTH, Phil

[-- Attachment #2: test_rhashtable_fail.log --]
[-- Type: text/plain, Size: 48552 bytes --]

[ 5196.212230] Running rhashtable test nelem=8, max_size=0, shrinking=0
[ 5196.243846] Test 00:
[ 5196.245990]   Adding 50000 keys
[ 5196.250787] Info: encountered resize
[ 5196.251631] Info: encountered resize
[ 5196.252773] Info: encountered resize
[ 5196.256076]   Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=3
[ 5196.261961]   Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
[ 5196.263282]   Deleting 50000 keys
[ 5196.267054]   Duration of test: 20359448 ns
[ 5196.267762] Test 01:
[ 5196.270278]   Adding 50000 keys
[ 5196.276804] Info: encountered resize
[ 5196.278164]   Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=1
[ 5196.287668]   Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
[ 5196.289246]   Deleting 50000 keys
[ 5196.293725]   Duration of test: 22689015 ns
[ 5196.294902] Test 02:
[ 5196.297878]   Adding 50000 keys
[ 5196.305348] Info: encountered resize
[ 5196.306093] Info: encountered resize
[ 5196.306815] Info: encountered resize
[ 5196.307529] Info: encountered resize
[ 5196.308262] Info: encountered resize
[ 5196.308973] Info: encountered resize
[ 5196.309699] Info: encountered resize
[ 5196.310449] Info: encountered resize
[ 5196.311228] Info: encountered resize
[ 5196.311996] Info: encountered resize
[ 5196.312957] Info: encountered resize
[ 5196.314178] Info: encountered resize
[ 5196.318068]   Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=12
[ 5196.324140]   Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
[ 5196.325661]   Deleting 50000 keys
[ 5196.338875]   Duration of test: 39997796 ns
[ 5196.339718] Test 03:
[ 5196.341610]   Adding 50000 keys
[ 5196.349677]   Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
[ 5196.356153]   Traversal complete: counted=50000, nelems=50000, entries=50000, table-jumps=0
[ 5196.357704]   Deleting 50000 keys
[ 5196.362173]   Duration of test: 19844019 ns
[ 5196.363055] Average test time: 25722569
[ 5196.363815] Testing concurrent rhashtable access from 100 threads
[ 5196.684648] vmalloc: allocation failure, allocated 22126592 of 33562624 bytes
[ 5196.685195]   thread[87]: rhashtable_insert_fast failed
[ 5196.687075] rhashtable_thra: page allocation failure: order:0, mode:0x2088022
[ 5196.687652] vmalloc: allocation failure, allocated 22245376 of 33562624 bytes
[ 5196.687653] rhashtable_thra: page allocation failure: order:0, mode:0x2088022
[ 5196.687655] CPU: 1 PID: 12259 Comm: rhashtable_thra Not tainted 4.4.0-rc1rhashtable+ #141
[ 5196.687656] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[ 5196.687659]  0000000000000000 0000000025caf0f8 ffff88003094bc70 ffffffff81308384
[ 5196.687660]  0000000002088022 ffff88003094bd00 ffffffff8117b18c ffffffff81815d58
[ 5196.687661]  ffff88003094bc90 ffffffff00000018 ffff88003094bd10 ffff88003094bcb0
[ 5196.687661] Call Trace:
[ 5196.687667]  [<ffffffff81308384>] dump_stack+0x44/0x60
[ 5196.687669]  [<ffffffff8117b18c>] warn_alloc_failed+0xfc/0x170
[ 5196.687673]  [<ffffffff811c7b4c>] ? alloc_pages_current+0x8c/0x110
[ 5196.687675]  [<ffffffff811b5eae>] __vmalloc_node_range+0x18e/0x290
[ 5196.687677]  [<ffffffff8131f10a>] ? bucket_table_alloc+0x4a/0xf0
[ 5196.687681]  [<ffffffff811b5ffa>] __vmalloc+0x4a/0x50
[ 5196.687682]  [<ffffffff8131f10a>] ? bucket_table_alloc+0x4a/0xf0
[ 5196.687683]  [<ffffffff8131f10a>] bucket_table_alloc+0x4a/0xf0
[ 5196.687684]  [<ffffffff8131f61d>] rhashtable_insert_rehash+0x5d/0xe0
[ 5196.687687]  [<ffffffffa14e4567>] insert_retry.isra.9.constprop.15+0x177/0x270 [test_rhashtable]
[ 5196.687689]  [<ffffffffa14e4706>] threadfunc+0xa6/0x38e [test_rhashtable]
[ 5196.687691]  [<ffffffff815dd28c>] ? __schedule+0x2ac/0x920
[ 5196.687693]  [<ffffffffa14e4660>] ? insert_retry.isra.9.constprop.15+0x270/0x270 [test_rhashtable]
[ 5196.687695]  [<ffffffffa14e4660>] ? insert_retry.isra.9.constprop.15+0x270/0x270 [test_rhashtable]
[ 5196.687697]  [<ffffffff81093ec8>] kthread+0xd8/0xf0
[ 5196.687698]  [<ffffffff81093df0>] ? kthread_park+0x60/0x60
[ 5196.687700]  [<ffffffff815e1acf>] ret_from_fork+0x3f/0x70
[ 5196.687701]  [<ffffffff81093df0>] ? kthread_park+0x60/0x60
[ 5196.687702] Mem-Info:
[ 5196.687704] active_anon:16125 inactive_anon:16714 isolated_anon:0
 active_file:36543 inactive_file:53278 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 slab_reclaimable:6154 slab_unreclaimable:7835
 mapped:13060 shmem:12833 pagetables:1159 bounce:0
 free:1325 free_pcp:62 free_cma:0
[ 5196.687708] Node 0 DMA free:3924kB min:60kB low:72kB high:88kB active_anon:36kB inactive_anon:100kB active_file:16kB inactive_file:4kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:24kB shmem:16kB slab_reclaimable:364kB slab_unreclaimable:644kB kernel_stack:16kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 5196.687709] lowmem_reserve[]: 0 976 976 976
[ 5196.687712] Node 0 DMA32 free:1376kB min:3828kB low:4784kB high:5740kB active_anon:64464kB inactive_anon:66756kB active_file:146156kB inactive_file:213108kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1032056kB managed:1001068kB mlocked:0kB dirty:0kB writeback:0kB mapped:52216kB shmem:51316kB slab_reclaimable:24252kB slab_unreclaimable:30696kB kernel_stack:3568kB pagetables:4520kB unstable:0kB bounce:0kB free_pcp:248kB local_pcp:160kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 5196.687713] lowmem_reserve[]: 0 0 0 0
[ 5196.687719] Node 0 DMA: 13*4kB (ME) 14*8kB (UME) 13*16kB (ME) 11*32kB (UME) 10*64kB (UME) 6*128kB (ME) 3*256kB (ME) 2*512kB (UE) 0*1024kB 0*2048kB 0*4096kB = 3924kB
[ 5196.687722] Node 0 DMA32: 30*4kB (UM) 37*8kB (UM) 16*16kB (M) 12*32kB (UM) 5*64kB (UM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1376kB
[ 5196.687723] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 5196.687724] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 5196.687724] 102666 total pagecache pages
[ 5196.687726] 12 pages in swap cache
[ 5196.687726] Swap cache stats: add 1016, delete 1004, find 47/51
[ 5196.687726] Free swap  = 516408kB
[ 5196.687727] Total swap = 520188kB
[ 5196.687727] 262012 pages RAM
[ 5196.687728] 0 pages HighMem/MovableOnly
[ 5196.687728] 7768 pages reserved
[ 5196.687728] 0 pages cma reserved
[ 5196.687728] 0 pages hwpoisoned
[ 5196.688896]   thread[88]: rhashtable_insert_fast failed
[ 5196.691994] vmalloc: allocation failure, allocated 22286336 of 33562624 bytes
[ 5196.691995] rhashtable_thra: page allocation failure: order:0, mode:0x2088022
[ 5196.691997] CPU: 1 PID: 12260 Comm: rhashtable_thra Not tainted 4.4.0-rc1rhashtable+ #141
[ 5196.692006] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[ 5196.692024]  0000000000000000 00000000f3c4a7e2 ffff88003094fc70 ffffffff81308384
[ 5196.692025]  0000000002088022 ffff88003094fd00 ffffffff8117b18c ffffffff81815d58
[ 5196.692026]  ffff88003094fc90 ffffffff00000018 ffff88003094fd10 ffff88003094fcb0
[ 5196.692027] Call Trace:
[ 5196.692032]  [<ffffffff81308384>] dump_stack+0x44/0x60
[ 5196.692035]  [<ffffffff8117b18c>] warn_alloc_failed+0xfc/0x170
[ 5196.692038]  [<ffffffff811c7b4c>] ? alloc_pages_current+0x8c/0x110
[ 5196.692040]  [<ffffffff811b5eae>] __vmalloc_node_range+0x18e/0x290
[ 5196.692042]  [<ffffffff8131f10a>] ? bucket_table_alloc+0x4a/0xf0
[ 5196.692046]  [<ffffffff811b5ffa>] __vmalloc+0x4a/0x50
[ 5196.692047]  [<ffffffff8131f10a>] ? bucket_table_alloc+0x4a/0xf0
[ 5196.692048]  [<ffffffff8131f10a>] bucket_table_alloc+0x4a/0xf0
[ 5196.692050]  [<ffffffff8131f61d>] rhashtable_insert_rehash+0x5d/0xe0
[ 5196.692053]  [<ffffffffa14e4567>] insert_retry.isra.9.constprop.15+0x177/0x270 [test_rhashtable]
[ 5196.692054]  [<ffffffffa14e4706>] threadfunc+0xa6/0x38e [test_rhashtable]
[ 5196.692057]  [<ffffffff815dd28c>] ? __schedule+0x2ac/0x920
[ 5196.692058]  [<ffffffffa14e4660>] ? insert_retry.isra.9.constprop.15+0x270/0x270 [test_rhashtable]
[ 5196.692060]  [<ffffffffa14e4660>] ? insert_retry.isra.9.constprop.15+0x270/0x270 [test_rhashtable]
[ 5196.692062]  [<ffffffff81093ec8>] kthread+0xd8/0xf0
[ 5196.692063]  [<ffffffff81093df0>] ? kthread_park+0x60/0x60
[ 5196.692065]  [<ffffffff815e1acf>] ret_from_fork+0x3f/0x70
[ 5196.692066]  [<ffffffff81093df0>] ? kthread_park+0x60/0x60
[ 5196.692067] Mem-Info:
[ 5196.692070] active_anon:16125 inactive_anon:16714 isolated_anon:0
 active_file:36543 inactive_file:53278 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 slab_reclaimable:6154 slab_unreclaimable:7835
 mapped:13060 shmem:12833 pagetables:1159 bounce:0
 free:1325 free_pcp:52 free_cma:0
[ 5196.692073] Node 0 DMA free:3924kB min:60kB low:72kB high:88kB active_anon:36kB inactive_anon:100kB active_file:16kB inactive_file:4kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:24kB shmem:16kB slab_reclaimable:364kB slab_unreclaimable:644kB kernel_stack:16kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 5196.692074] lowmem_reserve[]: 0 976 976 976
[ 5196.692077] Node 0 DMA32 free:1376kB min:3828kB low:4784kB high:5740kB active_anon:64464kB inactive_anon:66756kB active_file:146156kB inactive_file:213108kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1032056kB managed:1001068kB mlocked:0kB dirty:0kB writeback:0kB mapped:52216kB shmem:51316kB slab_reclaimable:24252kB slab_unreclaimable:30696kB kernel_stack:3568kB pagetables:4520kB unstable:0kB bounce:0kB free_pcp:208kB local_pcp:120kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 5196.692078] lowmem_reserve[]: 0 0 0 0
[ 5196.692083] Node 0 DMA: 13*4kB (ME) 14*8kB (UME) 13*16kB (ME) 11*32kB (UME) 10*64kB (UME) 6*128kB (ME) 3*256kB (ME) 2*512kB (UE) 0*1024kB 0*2048kB 0*4096kB = 3924kB
[ 5196.692086] Node 0 DMA32: 30*4kB (UM) 35*8kB (M) 13*16kB (M) 12*32kB (UM) 6*64kB (UM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1376kB
[ 5196.692088] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 5196.692088] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 5196.692089] 102666 total pagecache pages
[ 5196.692090] 12 pages in swap cache
[ 5196.692090] Swap cache stats: add 1016, delete 1004, find 47/51
[ 5196.692091] Free swap  = 516408kB
[ 5196.692091] Total swap = 520188kB
[ 5196.692091] 262012 pages RAM
[ 5196.692092] 0 pages HighMem/MovableOnly
[ 5196.692092] 7768 pages reserved
[ 5196.692092] 0 pages cma reserved
[ 5196.692092] 0 pages hwpoisoned
[ 5196.692689]   thread[89]: rhashtable_insert_fast failed
[ 5196.695065] vmalloc: allocation failure, allocated 22228992 of 33562624 bytes
[ 5196.695066] rhashtable_thra: page allocation failure: order:0, mode:0x2088022
[ 5196.695068] CPU: 1 PID: 12261 Comm: rhashtable_thra Not tainted 4.4.0-rc1rhashtable+ #141
[ 5196.695069] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[ 5196.695071]  0000000000000000 000000000d036016 ffff8800096fbc70 ffffffff81308384
[ 5196.695072]  0000000002088022 ffff8800096fbd00 ffffffff8117b18c ffffffff81815d58
[ 5196.695073]  ffff8800096fbc90 ffffffff00000018 ffff8800096fbd10 ffff8800096fbcb0
[ 5196.695073] Call Trace:
[ 5196.695079]  [<ffffffff81308384>] dump_stack+0x44/0x60
[ 5196.695081]  [<ffffffff8117b18c>] warn_alloc_failed+0xfc/0x170
[ 5196.695085]  [<ffffffff811c7b4c>] ? alloc_pages_current+0x8c/0x110
[ 5196.695087]  [<ffffffff811b5eae>] __vmalloc_node_range+0x18e/0x290
[ 5196.695089]  [<ffffffff8131f10a>] ? bucket_table_alloc+0x4a/0xf0
[ 5196.695093]  [<ffffffff811b5ffa>] __vmalloc+0x4a/0x50
[ 5196.695094]  [<ffffffff8131f10a>] ? bucket_table_alloc+0x4a/0xf0
[ 5196.695095]  [<ffffffff8131f10a>] bucket_table_alloc+0x4a/0xf0
[ 5196.695096]  [<ffffffff8131f61d>] rhashtable_insert_rehash+0x5d/0xe0
[ 5196.695099]  [<ffffffffa14e4567>] insert_retry.isra.9.constprop.15+0x177/0x270 [test_rhashtable]
[ 5196.695101]  [<ffffffffa14e4706>] threadfunc+0xa6/0x38e [test_rhashtable]
[ 5196.695103]  [<ffffffff815dd28c>] ? __schedule+0x2ac/0x920
[ 5196.695105]  [<ffffffffa14e4660>] ? insert_retry.isra.9.constprop.15+0x270/0x270 [test_rhashtable]
[ 5196.695106]  [<ffffffffa14e4660>] ? insert_retry.isra.9.constprop.15+0x270/0x270 [test_rhashtable]
[ 5196.695108]  [<ffffffff81093ec8>] kthread+0xd8/0xf0
[ 5196.695110]  [<ffffffff81093df0>] ? kthread_park+0x60/0x60
[ 5196.695111]  [<ffffffff815e1acf>] ret_from_fork+0x3f/0x70
[ 5196.695112]  [<ffffffff81093df0>] ? kthread_park+0x60/0x60
[ 5196.695113] Mem-Info:
[ 5196.695115] active_anon:16125 inactive_anon:16714 isolated_anon:0
 active_file:36543 inactive_file:53278 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 slab_reclaimable:6154 slab_unreclaimable:7835
 mapped:13060 shmem:12833 pagetables:1159 bounce:0
 free:1325 free_pcp:66 free_cma:0
[ 5196.695119] Node 0 DMA free:3924kB min:60kB low:72kB high:88kB active_anon:36kB inactive_anon:100kB active_file:16kB inactive_file:4kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:24kB shmem:16kB slab_reclaimable:364kB slab_unreclaimable:644kB kernel_stack:16kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 5196.695120] lowmem_reserve[]: 0 976 976 976
[ 5196.695123] Node 0 DMA32 free:1376kB min:3828kB low:4784kB high:5740kB active_anon:64464kB inactive_anon:66756kB active_file:146156kB inactive_file:213108kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1032056kB managed:1001068kB mlocked:0kB dirty:0kB writeback:0kB mapped:52216kB shmem:51316kB slab_reclaimable:24252kB slab_unreclaimable:30696kB kernel_stack:3568kB pagetables:4520kB unstable:0kB bounce:0kB free_pcp:264kB local_pcp:176kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 5196.695124] lowmem_reserve[]: 0 0 0 0
[ 5196.695129] Node 0 DMA: 13*4kB (ME) 14*8kB (UME) 13*16kB (ME) 11*32kB (UME) 10*64kB (UME) 6*128kB (ME) 3*256kB (ME) 2*512kB (UE) 0*1024kB 0*2048kB 0*4096kB = 3924kB
[ 5196.695132] Node 0 DMA32: 30*4kB (UM) 37*8kB (UM) 16*16kB (UM) 10*32kB (M) 6*64kB (UM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1376kB
[ 5196.695133] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 5196.695134] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 5196.695134] 102666 total pagecache pages
[ 5196.695135] 12 pages in swap cache
[ 5196.695136] Swap cache stats: add 1016, delete 1004, find 47/51
[ 5196.695136] Free swap  = 516408kB
[ 5196.695136] Total swap = 520188kB
[ 5196.695137] 262012 pages RAM
[ 5196.695137] 0 pages HighMem/MovableOnly
[ 5196.695137] 7768 pages reserved
[ 5196.695137] 0 pages cma reserved
[ 5196.695138] 0 pages hwpoisoned
[ 5196.696538]   thread[90]: rhashtable_insert_fast failed
[ 5196.699063] vmalloc: allocation failure, allocated 22286336 of 33562624 bytes
[ 5196.699064] rhashtable_thra: page allocation failure: order:0, mode:0x2088022
[ 5196.699066] CPU: 1 PID: 12262 Comm: rhashtable_thra Not tainted 4.4.0-rc1rhashtable+ #141
[ 5196.699067] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[ 5196.699070]  0000000000000000 00000000345b8f15 ffff8800096ffc70 ffffffff81308384
[ 5196.699071]  0000000002088022 ffff8800096ffd00 ffffffff8117b18c ffffffff81815d58
[ 5196.699072]  ffff8800096ffc90 ffffffff00000018 ffff8800096ffd10 ffff8800096ffcb0
[ 5196.699072] Call Trace:
[ 5196.699077]  [<ffffffff81308384>] dump_stack+0x44/0x60
[ 5196.699080]  [<ffffffff8117b18c>] warn_alloc_failed+0xfc/0x170
[ 5196.699083]  [<ffffffff811c7b4c>] ? alloc_pages_current+0x8c/0x110
[ 5196.699085]  [<ffffffff811b5eae>] __vmalloc_node_range+0x18e/0x290
[ 5196.699087]  [<ffffffff8131f10a>] ? bucket_table_alloc+0x4a/0xf0
[ 5196.699091]  [<ffffffff811b5ffa>] __vmalloc+0x4a/0x50
[ 5196.699092]  [<ffffffff8131f10a>] ? bucket_table_alloc+0x4a/0xf0
[ 5196.699093]  [<ffffffff8131f10a>] bucket_table_alloc+0x4a/0xf0
[ 5196.699095]  [<ffffffff8131f61d>] rhashtable_insert_rehash+0x5d/0xe0
[ 5196.699097]  [<ffffffffa14e4567>] insert_retry.isra.9.constprop.15+0x177/0x270 [test_rhashtable]
[ 5196.699099]  [<ffffffffa14e4706>] threadfunc+0xa6/0x38e [test_rhashtable]
[ 5196.699101]  [<ffffffff815dd28c>] ? __schedule+0x2ac/0x920
[ 5196.699103]  [<ffffffffa14e4660>] ? insert_retry.isra.9.constprop.15+0x270/0x270 [test_rhashtable]
[ 5196.699104]  [<ffffffffa14e4660>] ? insert_retry.isra.9.constprop.15+0x270/0x270 [test_rhashtable]
[ 5196.699107]  [<ffffffff81093ec8>] kthread+0xd8/0xf0
[ 5196.699108]  [<ffffffff81093df0>] ? kthread_park+0x60/0x60
[ 5196.699110]  [<ffffffff815e1acf>] ret_from_fork+0x3f/0x70
[ 5196.699111]  [<ffffffff81093df0>] ? kthread_park+0x60/0x60
[ 5196.699111] Mem-Info:
[ 5196.699114] active_anon:16125 inactive_anon:16714 isolated_anon:0
 active_file:36543 inactive_file:53278 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 slab_reclaimable:6154 slab_unreclaimable:7835
 mapped:13060 shmem:12833 pagetables:1159 bounce:0
 free:1325 free_pcp:52 free_cma:0
[ 5196.699118] Node 0 DMA free:3924kB min:60kB low:72kB high:88kB active_anon:36kB inactive_anon:100kB active_file:16kB inactive_file:4kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:24kB shmem:16kB slab_reclaimable:364kB slab_unreclaimable:644kB kernel_stack:16kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 5196.699119] lowmem_reserve[]: 0 976 976 976
[ 5196.699122] Node 0 DMA32 free:1376kB min:3828kB low:4784kB high:5740kB active_anon:64464kB inactive_anon:66756kB active_file:146156kB inactive_file:213108kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1032056kB managed:1001068kB mlocked:0kB dirty:0kB writeback:0kB mapped:52216kB shmem:51316kB slab_reclaimable:24252kB slab_unreclaimable:30696kB kernel_stack:3568kB pagetables:4520kB unstable:0kB bounce:0kB free_pcp:208kB local_pcp:120kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 5196.699123] lowmem_reserve[]: 0 0 0 0
[ 5196.699128] Node 0 DMA: 13*4kB (ME) 14*8kB (UME) 13*16kB (ME) 11*32kB (UME) 10*64kB (UME) 6*128kB (ME) 3*256kB (ME) 2*512kB (UE) 0*1024kB 0*2048kB 0*4096kB = 3924kB
[ 5196.699131] Node 0 DMA32: 30*4kB (UM) 35*8kB (M) 15*16kB (UM) 11*32kB (UM) 4*64kB (M) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1376kB
[ 5196.699132] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 5196.699133] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 5196.699134] 102666 total pagecache pages
[ 5196.699134] 12 pages in swap cache
[ 5196.699135] Swap cache stats: add 1016, delete 1004, find 47/51
[ 5196.699135] Free swap  = 516408kB
[ 5196.699136] Total swap = 520188kB
[ 5196.699136] 262012 pages RAM
[ 5196.699136] 0 pages HighMem/MovableOnly
[ 5196.699136] 7768 pages reserved
[ 5196.699137] 0 pages cma reserved
[ 5196.699137] 0 pages hwpoisoned
[ 5196.699733]   thread[91]: rhashtable_insert_fast failed
[ 5196.702130] vmalloc: allocation failure, allocated 22257664 of 33562624 bytes
[ 5196.702131] rhashtable_thra: page allocation failure: order:0, mode:0x2088022
[ 5196.702132] CPU: 1 PID: 12263 Comm: rhashtable_thra Not tainted 4.4.0-rc1rhashtable+ #141
[ 5196.702133] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[ 5196.702135]  0000000000000000 00000000ada2b9fc ffff880009143c70 ffffffff81308384
[ 5196.702136]  0000000002088022 ffff880009143d00 ffffffff8117b18c ffffffff81815d58
[ 5196.702137]  ffff880009143c90 ffffffff00000018 ffff880009143d10 ffff880009143cb0
[ 5196.702138] Call Trace:
[ 5196.702143]  [<ffffffff81308384>] dump_stack+0x44/0x60
[ 5196.702145]  [<ffffffff8117b18c>] warn_alloc_failed+0xfc/0x170
[ 5196.702148]  [<ffffffff811c7b4c>] ? alloc_pages_current+0x8c/0x110
[ 5196.702151]  [<ffffffff811b5eae>] __vmalloc_node_range+0x18e/0x290
[ 5196.702152]  [<ffffffff8131f10a>] ? bucket_table_alloc+0x4a/0xf0
[ 5196.702156]  [<ffffffff811b5ffa>] __vmalloc+0x4a/0x50
[ 5196.702157]  [<ffffffff8131f10a>] ? bucket_table_alloc+0x4a/0xf0
[ 5196.702159]  [<ffffffff8131f10a>] bucket_table_alloc+0x4a/0xf0
[ 5196.702160]  [<ffffffff8131f61d>] rhashtable_insert_rehash+0x5d/0xe0
[ 5196.702163]  [<ffffffffa14e4567>] insert_retry.isra.9.constprop.15+0x177/0x270 [test_rhashtable]
[ 5196.702164]  [<ffffffffa14e4706>] threadfunc+0xa6/0x38e [test_rhashtable]
[ 5196.702167]  [<ffffffff815dd28c>] ? __schedule+0x2ac/0x920
[ 5196.702168]  [<ffffffffa14e4660>] ? insert_retry.isra.9.constprop.15+0x270/0x270 [test_rhashtable]
[ 5196.702170]  [<ffffffffa14e4660>] ? insert_retry.isra.9.constprop.15+0x270/0x270 [test_rhashtable]
[ 5196.702173]  [<ffffffff81093ec8>] kthread+0xd8/0xf0
[ 5196.702174]  [<ffffffff81093df0>] ? kthread_park+0x60/0x60
[ 5196.702175]  [<ffffffff815e1acf>] ret_from_fork+0x3f/0x70
[ 5196.702176]  [<ffffffff81093df0>] ? kthread_park+0x60/0x60
[ 5196.702177] Mem-Info:
[ 5196.702180] active_anon:16125 inactive_anon:16714 isolated_anon:0
 active_file:36543 inactive_file:53278 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 slab_reclaimable:6154 slab_unreclaimable:7835
 mapped:13060 shmem:12833 pagetables:1159 bounce:0
 free:1325 free_pcp:59 free_cma:0
[ 5196.702183] Node 0 DMA free:3924kB min:60kB low:72kB high:88kB active_anon:36kB inactive_anon:100kB active_file:16kB inactive_file:4kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:24kB shmem:16kB slab_reclaimable:364kB slab_unreclaimable:644kB kernel_stack:16kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 5196.702184] lowmem_reserve[]: 0 976 976 976
[ 5196.702187] Node 0 DMA32 free:1376kB min:3828kB low:4784kB high:5740kB active_anon:64464kB inactive_anon:66756kB active_file:146156kB inactive_file:213108kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1032056kB managed:1001068kB mlocked:0kB dirty:0kB writeback:0kB mapped:52216kB shmem:51316kB slab_reclaimable:24252kB slab_unreclaimable:30696kB kernel_stack:3568kB pagetables:4520kB unstable:0kB bounce:0kB free_pcp:236kB local_pcp:148kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 5196.702188] lowmem_reserve[]: 0 0 0 0
[ 5196.702193] Node 0 DMA: 13*4kB (ME) 14*8kB (UME) 13*16kB (ME) 11*32kB (UME) 10*64kB (UME) 6*128kB (ME) 3*256kB (ME) 2*512kB (UE) 0*1024kB 0*2048kB 0*4096kB = 3924kB
[ 5196.702197] Node 0 DMA32: 30*4kB (UM) 35*8kB (M) 15*16kB (UM) 11*32kB (UM) 4*64kB (M) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1376kB
[ 5196.702198] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 5196.702198] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 5196.702199] 102666 total pagecache pages
[ 5196.702200] 12 pages in swap cache
[ 5196.702200] Swap cache stats: add 1016, delete 1004, find 47/51
[ 5196.702201] Free swap  = 516408kB
[ 5196.702201] Total swap = 520188kB
[ 5196.702201] 262012 pages RAM
[ 5196.702201] 0 pages HighMem/MovableOnly
[ 5196.702202] 7768 pages reserved
[ 5196.702202] 0 pages cma reserved
[ 5196.702202] 0 pages hwpoisoned
[ 5196.703408]   thread[92]: rhashtable_insert_fast failed
[ 5196.705758] vmalloc: allocation failure, allocated 22274048 of 33562624 bytes
[ 5196.705759] rhashtable_thra: page allocation failure: order:0, mode:0x2088022
[ 5196.705761] CPU: 1 PID: 12264 Comm: rhashtable_thra Not tainted 4.4.0-rc1rhashtable+ #141
[ 5196.705761] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[ 5196.705763]  0000000000000000 00000000968be9a7 ffff880009147c70 ffffffff81308384
[ 5196.705764]  0000000002088022 ffff880009147d00 ffffffff8117b18c ffffffff81815d58
[ 5196.705765]  ffff880009147c90 ffffffff00000018 ffff880009147d10 ffff880009147cb0
[ 5196.705766] Call Trace:
[ 5196.705770]  [<ffffffff81308384>] dump_stack+0x44/0x60
[ 5196.705773]  [<ffffffff8117b18c>] warn_alloc_failed+0xfc/0x170
[ 5196.705777]  [<ffffffff811c7b4c>] ? alloc_pages_current+0x8c/0x110
[ 5196.705779]  [<ffffffff811b5eae>] __vmalloc_node_range+0x18e/0x290
[ 5196.705781]  [<ffffffff8131f10a>] ? bucket_table_alloc+0x4a/0xf0
[ 5196.705785]  [<ffffffff811b5ffa>] __vmalloc+0x4a/0x50
[ 5196.705786]  [<ffffffff8131f10a>] ? bucket_table_alloc+0x4a/0xf0
[ 5196.705787]  [<ffffffff8131f10a>] bucket_table_alloc+0x4a/0xf0
[ 5196.705789]  [<ffffffff8131f61d>] rhashtable_insert_rehash+0x5d/0xe0
[ 5196.705791]  [<ffffffffa14e4567>] insert_retry.isra.9.constprop.15+0x177/0x270 [test_rhashtable]
[ 5196.705793]  [<ffffffffa14e4706>] threadfunc+0xa6/0x38e [test_rhashtable]
[ 5196.705795]  [<ffffffff815dd28c>] ? __schedule+0x2ac/0x920
[ 5196.705797]  [<ffffffffa14e4660>] ? insert_retry.isra.9.constprop.15+0x270/0x270 [test_rhashtable]
[ 5196.705798]  [<ffffffffa14e4660>] ? insert_retry.isra.9.constprop.15+0x270/0x270 [test_rhashtable]
[ 5196.705801]  [<ffffffff81093ec8>] kthread+0xd8/0xf0
[ 5196.705802]  [<ffffffff81093df0>] ? kthread_park+0x60/0x60
[ 5196.705804]  [<ffffffff815e1acf>] ret_from_fork+0x3f/0x70
[ 5196.705805]  [<ffffffff81093df0>] ? kthread_park+0x60/0x60
[ 5196.705806] Mem-Info:
[ 5196.705808] active_anon:16125 inactive_anon:16714 isolated_anon:0
 active_file:36543 inactive_file:53278 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 slab_reclaimable:6154 slab_unreclaimable:7835
 mapped:13060 shmem:12833 pagetables:1159 bounce:0
 free:1325 free_pcp:55 free_cma:0
[ 5196.705812] Node 0 DMA free:3924kB min:60kB low:72kB high:88kB active_anon:36kB inactive_anon:100kB active_file:16kB inactive_file:4kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:24kB shmem:16kB slab_reclaimable:364kB slab_unreclaimable:644kB kernel_stack:16kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 5196.705813] lowmem_reserve[]: 0 976 976 976
[ 5196.705816] Node 0 DMA32 free:1376kB min:3828kB low:4784kB high:5740kB active_anon:64464kB inactive_anon:66756kB active_file:146156kB inactive_file:213108kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1032056kB managed:1001068kB mlocked:0kB dirty:0kB writeback:0kB mapped:52216kB shmem:51316kB slab_reclaimable:24252kB slab_unreclaimable:30696kB kernel_stack:3568kB pagetables:4520kB unstable:0kB bounce:0kB free_pcp:220kB local_pcp:132kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 5196.705816] lowmem_reserve[]: 0 0 0 0
[ 5196.705821] Node 0 DMA: 13*4kB (ME) 14*8kB (UME) 13*16kB (ME) 11*32kB (UME) 10*64kB (UME) 6*128kB (ME) 3*256kB (ME) 2*512kB (UE) 0*1024kB 0*2048kB 0*4096kB = 3924kB
[ 5196.705825] Node 0 DMA32: 30*4kB (M) 37*8kB (M) 18*16kB (UM) 13*32kB (UM) 4*64kB (M) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1376kB
[ 5196.705830] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 5196.705831] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 5196.705831] 102666 total pagecache pages
[ 5196.705832] 12 pages in swap cache
[ 5196.705832] Swap cache stats: add 1016, delete 1004, find 47/51
[ 5196.705834] Free swap  = 516408kB
[ 5196.705834] Total swap = 520188kB
[ 5196.705834] 262012 pages RAM
[ 5196.705835] 0 pages HighMem/MovableOnly
[ 5196.705835] 7768 pages reserved
[ 5196.705835] 0 pages cma reserved
[ 5196.705836] 0 pages hwpoisoned
[ 5196.706493]   thread[93]: rhashtable_insert_fast failed
[ 5196.708940] vmalloc: allocation failure, allocated 22282240 of 33562624 bytes
[ 5196.708941] rhashtable_thra: page allocation failure: order:0, mode:0x2088022
[ 5196.708943] CPU: 1 PID: 12265 Comm: rhashtable_thra Not tainted 4.4.0-rc1rhashtable+ #141
[ 5196.708943] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[ 5196.708945]  0000000000000000 00000000f4d21571 ffff88000914bc70 ffffffff81308384
[ 5196.708946]  0000000002088022 ffff88000914bd00 ffffffff8117b18c ffffffff81815d58
[ 5196.708947]  ffff88000914bc90 ffffffff00000018 ffff88000914bd10 ffff88000914bcb0
[ 5196.708947] Call Trace:
[ 5196.708952]  [<ffffffff81308384>] dump_stack+0x44/0x60
[ 5196.708955]  [<ffffffff8117b18c>] warn_alloc_failed+0xfc/0x170
[ 5196.708958]  [<ffffffff811c7b4c>] ? alloc_pages_current+0x8c/0x110
[ 5196.708960]  [<ffffffff811b5eae>] __vmalloc_node_range+0x18e/0x290
[ 5196.708962]  [<ffffffff8131f10a>] ? bucket_table_alloc+0x4a/0xf0
[ 5196.708966]  [<ffffffff811b5ffa>] __vmalloc+0x4a/0x50
[ 5196.708967]  [<ffffffff8131f10a>] ? bucket_table_alloc+0x4a/0xf0
[ 5196.708969]  [<ffffffff8131f10a>] bucket_table_alloc+0x4a/0xf0
[ 5196.708970]  [<ffffffff8131f61d>] rhashtable_insert_rehash+0x5d/0xe0
[ 5196.708972]  [<ffffffffa14e4567>] insert_retry.isra.9.constprop.15+0x177/0x270 [test_rhashtable]
[ 5196.708974]  [<ffffffffa14e4706>] threadfunc+0xa6/0x38e [test_rhashtable]
[ 5196.708977]  [<ffffffff815dd28c>] ? __schedule+0x2ac/0x920
[ 5196.708978]  [<ffffffffa14e4660>] ? insert_retry.isra.9.constprop.15+0x270/0x270 [test_rhashtable]
[ 5196.708979]  [<ffffffffa14e4660>] ? insert_retry.isra.9.constprop.15+0x270/0x270 [test_rhashtable]
[ 5196.708982]  [<ffffffff81093ec8>] kthread+0xd8/0xf0
[ 5196.708983]  [<ffffffff81093df0>] ? kthread_park+0x60/0x60
[ 5196.708985]  [<ffffffff815e1acf>] ret_from_fork+0x3f/0x70
[ 5196.708986]  [<ffffffff81093df0>] ? kthread_park+0x60/0x60
[ 5196.708987] Mem-Info:
[ 5196.708989] active_anon:16125 inactive_anon:16714 isolated_anon:0
 active_file:36543 inactive_file:53278 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 slab_reclaimable:6154 slab_unreclaimable:7835
 mapped:13060 shmem:12833 pagetables:1159 bounce:0
 free:1325 free_pcp:53 free_cma:0
[ 5196.708993] Node 0 DMA free:3924kB min:60kB low:72kB high:88kB active_anon:36kB inactive_anon:100kB active_file:16kB inactive_file:4kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:24kB shmem:16kB slab_reclaimable:364kB slab_unreclaimable:644kB kernel_stack:16kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 5196.708994] lowmem_reserve[]: 0 976 976 976
[ 5196.708997] Node 0 DMA32 free:1376kB min:3828kB low:4784kB high:5740kB active_anon:64464kB inactive_anon:66756kB active_file:146156kB inactive_file:213108kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1032056kB managed:1001068kB mlocked:0kB dirty:0kB writeback:0kB mapped:52216kB shmem:51316kB slab_reclaimable:24252kB slab_unreclaimable:30696kB kernel_stack:3568kB pagetables:4520kB unstable:0kB bounce:0kB free_pcp:212kB local_pcp:124kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 5196.708997] lowmem_reserve[]: 0 0 0 0
[ 5196.709021] Node 0 DMA: 13*4kB (ME) 14*8kB (UME) 13*16kB (ME) 11*32kB (UME) 10*64kB (UME) 6*128kB (ME) 3*256kB (ME) 2*512kB (UE) 0*1024kB 0*2048kB 0*4096kB = 3924kB
[ 5196.709024] Node 0 DMA32: 30*4kB (UM) 35*8kB (M) 13*16kB (M) 14*32kB (UM) 5*64kB (UM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1376kB
[ 5196.709026] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 5196.709026] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 5196.709027] 102666 total pagecache pages
[ 5196.709027] 12 pages in swap cache
[ 5196.709028] Swap cache stats: add 1016, delete 1004, find 47/51
[ 5196.709028] Free swap  = 516408kB
[ 5196.709029] Total swap = 520188kB
[ 5196.709029] 262012 pages RAM
[ 5196.709030] 0 pages HighMem/MovableOnly
[ 5196.709030] 7768 pages reserved
[ 5196.709030] 0 pages cma reserved
[ 5196.709031] 0 pages hwpoisoned
[ 5196.710116]   thread[94]: rhashtable_insert_fast failed
[ 5196.712572] vmalloc: allocation failure, allocated 22233088 of 33562624 bytes
[ 5196.712573] rhashtable_thra: page allocation failure: order:0, mode:0x2088022
[ 5196.712575] CPU: 1 PID: 12266 Comm: rhashtable_thra Not tainted 4.4.0-rc1rhashtable+ #141
[ 5196.712576] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[ 5196.712578]  0000000000000000 00000000b3f81fc2 ffff88000914fc70 ffffffff81308384
[ 5196.712579]  0000000002088022 ffff88000914fd00 ffffffff8117b18c ffffffff81815d58
[ 5196.712580]  ffff88000914fc90 ffffffff00000018 ffff88000914fd10 ffff88000914fcb0
[ 5196.712581] Call Trace:
[ 5196.712587]  [<ffffffff81308384>] dump_stack+0x44/0x60
[ 5196.712590]  [<ffffffff8117b18c>] warn_alloc_failed+0xfc/0x170
[ 5196.712593]  [<ffffffff811c7b4c>] ? alloc_pages_current+0x8c/0x110
[ 5196.712595]  [<ffffffff811b5eae>] __vmalloc_node_range+0x18e/0x290
[ 5196.712597]  [<ffffffff8131f10a>] ? bucket_table_alloc+0x4a/0xf0
[ 5196.712601]  [<ffffffff811b5ffa>] __vmalloc+0x4a/0x50
[ 5196.712603]  [<ffffffff8131f10a>] ? bucket_table_alloc+0x4a/0xf0
[ 5196.712604]  [<ffffffff8131f10a>] bucket_table_alloc+0x4a/0xf0
[ 5196.712605]  [<ffffffff8131f61d>] rhashtable_insert_rehash+0x5d/0xe0
[ 5196.712608]  [<ffffffffa14e4567>] insert_retry.isra.9.constprop.15+0x177/0x270 [test_rhashtable]
[ 5196.712610]  [<ffffffffa14e4706>] threadfunc+0xa6/0x38e [test_rhashtable]
[ 5196.712612]  [<ffffffff815dd28c>] ? __schedule+0x2ac/0x920
[ 5196.712614]  [<ffffffffa14e4660>] ? insert_retry.isra.9.constprop.15+0x270/0x270 [test_rhashtable]
[ 5196.712615]  [<ffffffffa14e4660>] ? insert_retry.isra.9.constprop.15+0x270/0x270 [test_rhashtable]
[ 5196.712618]  [<ffffffff81093ec8>] kthread+0xd8/0xf0
[ 5196.712619]  [<ffffffff81093df0>] ? kthread_park+0x60/0x60
[ 5196.712621]  [<ffffffff815e1acf>] ret_from_fork+0x3f/0x70
[ 5196.712622]  [<ffffffff81093df0>] ? kthread_park+0x60/0x60
[ 5196.712623] Mem-Info:
[ 5196.712625] active_anon:16125 inactive_anon:16714 isolated_anon:0
 active_file:36543 inactive_file:53278 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 slab_reclaimable:6154 slab_unreclaimable:7835
 mapped:13060 shmem:12833 pagetables:1159 bounce:0
 free:1325 free_pcp:65 free_cma:0
[ 5196.712629] Node 0 DMA free:3924kB min:60kB low:72kB high:88kB active_anon:36kB inactive_anon:100kB active_file:16kB inactive_file:4kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:24kB shmem:16kB slab_reclaimable:364kB slab_unreclaimable:644kB kernel_stack:16kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 5196.712630] lowmem_reserve[]: 0 976 976 976
[ 5196.712633] Node 0 DMA32 free:1376kB min:3828kB low:4784kB high:5740kB active_anon:64464kB inactive_anon:66756kB active_file:146156kB inactive_file:213108kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1032056kB managed:1001068kB mlocked:0kB dirty:0kB writeback:0kB mapped:52216kB shmem:51316kB slab_reclaimable:24252kB slab_unreclaimable:30696kB kernel_stack:3568kB pagetables:4520kB unstable:0kB bounce:0kB free_pcp:260kB local_pcp:172kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 5196.712634] lowmem_reserve[]: 0 0 0 0
[ 5196.712639] Node 0 DMA: 13*4kB (ME) 14*8kB (UME) 13*16kB (ME) 11*32kB (UME) 10*64kB (UME) 6*128kB (ME) 3*256kB (ME) 2*512kB (UE) 0*1024kB 0*2048kB 0*4096kB = 3924kB
[ 5196.712642] Node 0 DMA32: 30*4kB (M) 37*8kB (UM) 16*16kB (UM) 10*32kB (M) 6*64kB (UM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1376kB
[ 5196.712643] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 5196.712644] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 5196.712644] 102666 total pagecache pages
[ 5196.712645] 12 pages in swap cache
[ 5196.712646] Swap cache stats: add 1016, delete 1004, find 47/51
[ 5196.712646] Free swap  = 516408kB
[ 5196.712646] Total swap = 520188kB
[ 5196.712647] 262012 pages RAM
[ 5196.712647] 0 pages HighMem/MovableOnly
[ 5196.712647] 7768 pages reserved
[ 5196.712648] 0 pages cma reserved
[ 5196.712648] 0 pages hwpoisoned
[ 5196.713212]   thread[95]: rhashtable_insert_fast failed
[ 5196.716459]   thread[96]: rhashtable_insert_fast failed
[ 5196.719448]   thread[97]: rhashtable_insert_fast failed
[ 5196.723918]   thread[98]: rhashtable_insert_fast failed
[ 5196.727070]   thread[38]: rhashtable_insert_fast failed
[ 5196.732398]   thread[39]: rhashtable_insert_fast failed
[ 5196.735329]   thread[40]: rhashtable_insert_fast failed
[ 5196.740134]   thread[41]: rhashtable_insert_fast failed
[ 5196.743129]   thread[42]: rhashtable_insert_fast failed
[ 5196.748780]   thread[43]: rhashtable_insert_fast failed
[ 5196.751713]   thread[44]: rhashtable_insert_fast failed
[ 5196.756309]   thread[45]: rhashtable_insert_fast failed
[ 5196.759367]   thread[46]: rhashtable_insert_fast failed
[ 5196.762669]   thread[47]: rhashtable_insert_fast failed
[ 5196.765906]   thread[48]: rhashtable_insert_fast failed
[ 5196.769524]   thread[49]: rhashtable_insert_fast failed
[ 5196.772589]   thread[50]: rhashtable_insert_fast failed
[ 5196.776017]   thread[51]: rhashtable_insert_fast failed
[ 5196.779027]   thread[52]: rhashtable_insert_fast failed
[ 5196.782691]   thread[54]: rhashtable_insert_fast failed
[ 5196.786214]   thread[55]: rhashtable_insert_fast failed
[ 5196.790425]   thread[62]: rhashtable_insert_fast failed
[ 5196.793342]   thread[63]: rhashtable_insert_fast failed
[ 5196.796981]   thread[64]: rhashtable_insert_fast failed
[ 5196.800092]   thread[65]: rhashtable_insert_fast failed
[ 5196.803677]   thread[66]: rhashtable_insert_fast failed
[ 5196.806786]   thread[67]: rhashtable_insert_fast failed
[ 5196.811311]   thread[68]: rhashtable_insert_fast failed
[ 5196.814568]   thread[69]: rhashtable_insert_fast failed
[ 5196.819452]   thread[70]: rhashtable_insert_fast failed
[ 5196.824151]   thread[30]: rhashtable_insert_fast failed
[ 5196.828553]   thread[31]: rhashtable_insert_fast failed
[ 5196.831653]   thread[32]: rhashtable_insert_fast failed
[ 5196.834548]   thread[33]: rhashtable_insert_fast failed
[ 5196.837353]   thread[34]: rhashtable_insert_fast failed
[ 5196.840583]   thread[35]: rhashtable_insert_fast failed
[ 5196.843205]   thread[36]: rhashtable_insert_fast failed
[ 5196.846480]   thread[37]: rhashtable_insert_fast failed
[ 5196.849272]   thread[56]: rhashtable_insert_fast failed
[ 5196.853540]   thread[57]: rhashtable_insert_fast failed
[ 5196.867733]   thread[99]: rhashtable_insert_fast failed
[ 5196.882680]   thread[78]: rhashtable_insert_fast failed
[ 5196.885508]   thread[58]: rhashtable_insert_fast failed
[ 5196.888540]   thread[59]: rhashtable_insert_fast failed
[ 5196.891599]   thread[60]: rhashtable_insert_fast failed
[ 5196.895137]   thread[61]: rhashtable_insert_fast failed
[ 5196.906544]   thread[77]: rhashtable_insert_fast failed
[ 5196.912459]   thread[28]: rhashtable_insert_fast failed
[ 5196.915968]   thread[86]: rhashtable_insert_fast failed
[ 5196.920906]   thread[81]: rhashtable_insert_fast failed
[ 5196.924751]   thread[84]: rhashtable_insert_fast failed
[ 5196.929254]   thread[79]: rhashtable_insert_fast failed
[ 5196.933818]   thread[72]: rhashtable_insert_fast failed
[ 5196.938357]   thread[74]: rhashtable_insert_fast failed
[ 5196.941435]   thread[80]: rhashtable_insert_fast failed
[ 5196.944989]   thread[71]: rhashtable_insert_fast failed
[ 5196.948319]   thread[83]: rhashtable_insert_fast failed
[ 5196.952994]   thread[82]: rhashtable_insert_fast failed
[ 5196.955952]   thread[75]: rhashtable_insert_fast failed
[ 5196.959954]   thread[73]: rhashtable_insert_fast failed
[ 5196.962849]   thread[85]: rhashtable_insert_fast failed
[ 5196.967296]   thread[27]: rhashtable_insert_fast failed
[ 5197.388390] CPU: 0 PID: 12200 Comm: rhashtable_thra Not tainted 4.4.0-rc1rhashtable+ #141
[ 5197.390022] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[ 5197.392029]  0000000000000000 0000000067bc472b ffff880005ab7c70 ffffffff81308384
[ 5197.393658]  0000000002088022 ffff880005ab7d00 ffffffff8117b18c ffffffff81815d58
[ 5197.395253]  ffff880005ab7c90 ffffffff00000018 ffff880005ab7d10 ffff880005ab7cb0
[ 5197.396868] Call Trace:
[ 5197.397512]  [<ffffffff81308384>] dump_stack+0x44/0x60
[ 5197.398463]  [<ffffffff8117b18c>] warn_alloc_failed+0xfc/0x170
[ 5197.399488]  [<ffffffff811c7b4c>] ? alloc_pages_current+0x8c/0x110
[ 5197.400587]  [<ffffffff811b5eae>] __vmalloc_node_range+0x18e/0x290
[ 5197.401685]  [<ffffffff8131f10a>] ? bucket_table_alloc+0x4a/0xf0
[ 5197.402729]  [<ffffffff811b5ffa>] __vmalloc+0x4a/0x50
[ 5197.403661]  [<ffffffff8131f10a>] ? bucket_table_alloc+0x4a/0xf0
[ 5197.404702]  [<ffffffff8131f10a>] bucket_table_alloc+0x4a/0xf0
[ 5197.405744]  [<ffffffff8131f61d>] rhashtable_insert_rehash+0x5d/0xe0
[ 5197.406840]  [<ffffffffa14e4567>] insert_retry.isra.9.constprop.15+0x177/0x270 [test_rhashtable]
[ 5197.417389]  [<ffffffffa14e4706>] threadfunc+0xa6/0x38e [test_rhashtable]
[ 5197.418579]  [<ffffffff815dd28c>] ? __schedule+0x2ac/0x920
[ 5197.419574]  [<ffffffffa14e4660>] ? insert_retry.isra.9.constprop.15+0x270/0x270 [test_rhashtable]
[ 5197.421283]  [<ffffffffa14e4660>] ? insert_retry.isra.9.constprop.15+0x270/0x270 [test_rhashtable]
[ 5197.423014]  [<ffffffff81093ec8>] kthread+0xd8/0xf0
[ 5197.423933]  [<ffffffff81093df0>] ? kthread_park+0x60/0x60
[ 5197.424911]  [<ffffffff815e1acf>] ret_from_fork+0x3f/0x70
[ 5197.425935]  [<ffffffff81093df0>] ? kthread_park+0x60/0x60
[ 5197.426920] Mem-Info:
[ 5197.427542] active_anon:16125 inactive_anon:16748 isolated_anon:0
 active_file:36543 inactive_file:49042 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 slab_reclaimable:6154 slab_unreclaimable:7835
 mapped:13110 shmem:12833 pagetables:1159 bounce:0
 free:2567 free_pcp:160 free_cma:0
[ 5197.433457] Node 0 DMA free:3964kB min:60kB low:72kB high:88kB active_anon:36kB inactive_anon:100kB active_file:16kB inactive_file:4kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:24kB shmem:16kB slab_reclaimable:364kB slab_unreclaimable:644kB kernel_stack:16kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 5197.440780] lowmem_reserve[]: 0 976 976 976
[ 5197.441714] Node 0 DMA32 free:6304kB min:3828kB low:4784kB high:5740kB active_anon:64464kB inactive_anon:66892kB active_file:146156kB inactive_file:196164kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1032056kB managed:1001068kB mlocked:0kB dirty:0kB writeback:0kB mapped:52416kB shmem:51316kB slab_reclaimable:24252kB slab_unreclaimable:30696kB kernel_stack:3568kB pagetables:4520kB unstable:0kB bounce:0kB free_pcp:640kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 5197.450137] lowmem_reserve[]: 0 0 0 0
[ 5197.450991] Node 0 DMA: 15*4kB (UME) 18*8kB (UME) 13*16kB (ME) 11*32kB (UME) 10*64kB (UME) 6*128kB (ME) 3*256kB (ME) 2*512kB (UE) 0*1024kB 0*2048kB 0*4096kB = 3964kB
[ 5197.454161] Node 0 DMA32: 224*4kB (UME) 123*8kB (ME) 52*16kB (ME) 36*32kB (UM) 15*64kB (UM) 6*128kB (M) 3*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6360kB
[ 5197.457185] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 5197.458914] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 5197.460587] 98441 total pagecache pages
[ 5197.461382] 19 pages in swap cache
[ 5197.462117] Swap cache stats: add 1062, delete 1043, find 85/100
[ 5197.463488] Free swap  = 517456kB
[ 5197.464266] Total swap = 520188kB
[ 5197.464976] 262012 pages RAM
[ 5197.465652] 0 pages HighMem/MovableOnly
[ 5197.466448] 7768 pages reserved
[ 5197.467153] 0 pages cma reserved
[ 5197.467858] 0 pages hwpoisoned
[ 5199.067127] Test failed: thread 27 returned: -12
[ 5199.067909] Test failed: thread 28 returned: -12
[ 5199.073703] Test failed: thread 30 returned: -12
[ 5199.074493] Test failed: thread 31 returned: -12
[ 5199.075323] Test failed: thread 32 returned: -12
[ 5199.076242] Test failed: thread 33 returned: -12
[ 5199.077090] Test failed: thread 34 returned: -12
[ 5199.077855] Test failed: thread 35 returned: -12
[ 5199.078626] Test failed: thread 36 returned: -12
[ 5199.079478] Test failed: thread 37 returned: -12
[ 5199.080286] Test failed: thread 38 returned: -12
[ 5199.080983] Test failed: thread 39 returned: -12
[ 5199.081750] Test failed: thread 40 returned: -12
[ 5199.082536] Test failed: thread 41 returned: -12
[ 5199.083351] Test failed: thread 42 returned: -12
[ 5199.084135] Test failed: thread 43 returned: -12
[ 5199.084836] Test failed: thread 44 returned: -12
[ 5199.085595] Test failed: thread 45 returned: -12
[ 5199.086621] Test failed: thread 46 returned: -12
[ 5199.087417] Test failed: thread 47 returned: -12
[ 5199.088178] Test failed: thread 48 returned: -12
[ 5199.088917] Test failed: thread 49 returned: -12
[ 5199.089749] Test failed: thread 50 returned: -12
[ 5199.090570] Test failed: thread 51 returned: -12
[ 5199.091392] Test failed: thread 52 returned: -12
[ 5199.092406] Test failed: thread 54 returned: -12
[ 5199.093378] Test failed: thread 55 returned: -12
[ 5199.094385] Test failed: thread 56 returned: -12
[ 5199.095784] Test failed: thread 57 returned: -12
[ 5199.096749] Test failed: thread 58 returned: -12
[ 5199.097722] Test failed: thread 59 returned: -12
[ 5199.098622] Test failed: thread 60 returned: -12
[ 5199.099415] Test failed: thread 61 returned: -12
[ 5199.100230] Test failed: thread 62 returned: -12
[ 5199.100942] Test failed: thread 63 returned: -12
[ 5199.101743] Test failed: thread 64 returned: -12
[ 5199.102568] Test failed: thread 65 returned: -12
[ 5199.103717] Test failed: thread 66 returned: -12
[ 5199.104703] Test failed: thread 67 returned: -12
[ 5199.105510] Test failed: thread 68 returned: -12
[ 5199.106404] Test failed: thread 69 returned: -12
[ 5199.107268] Test failed: thread 70 returned: -12
[ 5199.108522] Test failed: thread 71 returned: -12
[ 5199.109319] Test failed: thread 72 returned: -12
[ 5199.110119] Test failed: thread 73 returned: -12
[ 5199.110822] Test failed: thread 74 returned: -12
[ 5199.111610] Test failed: thread 75 returned: -12
[ 5199.112444] Test failed: thread 77 returned: -12
[ 5199.113243] Test failed: thread 78 returned: -12
[ 5199.113979] Test failed: thread 79 returned: -12
[ 5199.114769] Test failed: thread 80 returned: -12
[ 5199.115564] Test failed: thread 81 returned: -12
[ 5199.116390] Test failed: thread 82 returned: -12
[ 5199.117199] Test failed: thread 83 returned: -12
[ 5199.117912] Test failed: thread 84 returned: -12
[ 5199.118717] Test failed: thread 85 returned: -12
[ 5199.119514] Test failed: thread 86 returned: -12
[ 5199.120727] Test failed: thread 87 returned: -12
[ 5199.121623] Test failed: thread 88 returned: -12
[ 5199.122486] Test failed: thread 89 returned: -12
[ 5199.123291] Test failed: thread 90 returned: -12
[ 5199.124119] Test failed: thread 91 returned: -12
[ 5199.125438] Test failed: thread 92 returned: -12
[ 5199.126270] Test failed: thread 93 returned: -12
[ 5199.127083] Test failed: thread 94 returned: -12
[ 5199.127795] Test failed: thread 95 returned: -12
[ 5199.128602] Test failed: thread 96 returned: -12
[ 5199.129419] Test failed: thread 97 returned: -12
[ 5199.130211] Test failed: thread 98 returned: -12
[ 5199.130913] Test failed: thread 99 returned: -12
[ 5199.131615] Started 100 threads, 70 failed

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation
  2015-12-04 17:01             ` Phil Sutter
@ 2015-12-04 17:45               ` Eric Dumazet
  2015-12-04 18:15                 ` Phil Sutter
  0 siblings, 1 reply; 39+ messages in thread
From: Eric Dumazet @ 2015-12-04 17:45 UTC (permalink / raw)
  To: Phil Sutter
  Cc: Herbert Xu, davem, netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

On Fri, 2015-12-04 at 18:01 +0100, Phil Sutter wrote:
> On Fri, Dec 04, 2015 at 10:39:56PM +0800, Herbert Xu wrote:
> > On Thu, Dec 03, 2015 at 08:08:39AM -0800, Eric Dumazet wrote:
> > >
> > > Anyway, __vmalloc() can be used with GFP_ATOMIC, have you tried this ?
> > 
> > OK I've tried it and I no longer get any ENOMEM errors!
> 
> I can't confirm this, sadly. Using 50 threads, results seem to be stable
> and good. But increasing the number of threads I can provoke ENOMEM
> condition again. See attached log which shows a failing test run with
> 100 threads.
> 
> I tried to extract logs of a test run with as few as possible failing
> threads, but wasn't successful. It seems like the error amplifies
> itself: While having stable success with less than 70 threads, going
> beyond a margin I could not identify exactly, much more threads failed
> than expected. For instance, the attached log shows 70 out of 100
> threads failing, while for me every single test with 50 threads was
> successful.
> 
> HTH, Phil

But this patch is about GFP_ATOMIC allocations, and I doubt your test
is using GFP_ATOMIC.

Threads (process context) should use GFP_KERNEL allocations.

BTW, if 100 threads are simultaneously trying to vmalloc(32 MB), this
might not be very wise :(

Only one should really do this, while others are waiting.

If we really want parallelism (multiple cpus coordinating their effort),
it should be done very differently.
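
A rough sketch of that idea (hypothetical helper names, not actual
rhashtable code) could look like this:

/* Illustration only: a single thread performs the expensive grow while
 * the others wait on the mutex and then re-check whether the grow is
 * still needed before duplicating the work.
 */
static DEFINE_MUTEX(grow_mutex);

static int grow_table_once(struct rhashtable *ht, unsigned int new_size)
{
	int err = 0;

	mutex_lock(&grow_mutex);
	if (table_still_needs_grow(ht, new_size))	/* hypothetical helper */
		err = alloc_and_attach_table(ht, new_size);	/* hypothetical helper */
	mutex_unlock(&grow_mutex);

	return err;
}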




^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation
  2015-12-04 17:45               ` Eric Dumazet
@ 2015-12-04 18:15                 ` Phil Sutter
  2015-12-05  7:06                   ` Herbert Xu
  0 siblings, 1 reply; 39+ messages in thread
From: Phil Sutter @ 2015-12-04 18:15 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Herbert Xu, davem, netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

On Fri, Dec 04, 2015 at 09:45:20AM -0800, Eric Dumazet wrote:
> On Fri, 2015-12-04 at 18:01 +0100, Phil Sutter wrote:
> > On Fri, Dec 04, 2015 at 10:39:56PM +0800, Herbert Xu wrote:
> > > On Thu, Dec 03, 2015 at 08:08:39AM -0800, Eric Dumazet wrote:
> > > >
> > > > Anyway, __vmalloc() can be used with GFP_ATOMIC, have you tried this ?
> > > 
> > > OK I've tried it and I no longer get any ENOMEM errors!
> > 
> > I can't confirm this, sadly. Using 50 threads, results seem to be stable
> > and good. But increasing the number of threads I can provoke ENOMEM
> > condition again. See attached log which shows a failing test run with
> > 100 threads.
> > 
> > I tried to extract logs of a test run with as few as possible failing
> > threads, but wasn't successful. It seems like the error amplifies
> > itself: While having stable success with less than 70 threads, going
> > beyond a margin I could not identify exactly, much more threads failed
> > than expected. For instance, the attached log shows 70 out of 100
> > threads failing, while for me every single test with 50 threads was
> > successful.
> 
> But this patch is about GFP_ATOMIC allocations, I doubt your test is
> using GFP_ATOMIC.
> 
> Threads (process context) should use GFP_KERNEL allocations.

Well, I assumed Herbert did his tests using test_rhashtable, and
therefore fixed whatever code path that test triggers. Maybe I'm wrong,
though.

Looking at the vmalloc allocation failure trace, it does indeed seem
to be using GFP_ATOMIC from inside those threads: if I'm not missing
anything, bucket_table_alloc is called from rhashtable_insert_rehash,
which passes GFP_ATOMIC unconditionally. But then again,
bucket_table_alloc should use kzalloc if 'gfp != GFP_KERNEL', so I'm
probably just cross-eyed right now.
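
For reference, with the __vmalloc patch applied the allocation path
looks roughly like this (simplified sketch from memory, not a verbatim
copy of lib/rhashtable.c):

/* Simplified sketch of bucket_table_alloc(): small tables and
 * non-GFP_KERNEL callers try kzalloc first, but the fallback now uses
 * __vmalloc() with the caller's gfp flags instead of being restricted
 * to GFP_KERNEL, which is why it shows up in the GFP_ATOMIC traces
 * above.
 */
static struct bucket_table *bucket_table_alloc(struct rhashtable *ht,
					       size_t nbuckets, gfp_t gfp)
{
	struct bucket_table *tbl = NULL;
	size_t size;

	size = sizeof(*tbl) + nbuckets * sizeof(tbl->buckets[0]);
	if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER) ||
	    gfp != GFP_KERNEL)
		tbl = kzalloc(size, gfp | __GFP_NOWARN | __GFP_NORETRY);
	if (tbl == NULL)
		tbl = __vmalloc(size, gfp | __GFP_HIGHMEM | __GFP_ZERO,
				PAGE_KERNEL);

	/* ... bucket lock and nulls-marker setup omitted ... */
	return tbl;
}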

> BTW, if 100 threads are simultaneously trying to vmalloc(32 MB), this
> might not be very wise :(
> 
> Only one should really do this, while others are waiting.

Sure, that was my previous understanding of how this thing works.

> If we really want parallelism (multiple cpus coordinating their effort),
> it should be done very differently.

Maybe my approach of stress-testing rhashtable was too naive in the
first place.

Thanks, Phil

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Prevent spurious EBUSY errors on insertion
  2015-12-03 12:41       ` rhashtable: Prevent spurious EBUSY errors on insertion Herbert Xu
  2015-12-03 15:38         ` Phil Sutter
@ 2015-12-04 19:38         ` David Miller
  2015-12-17  8:46         ` Xin Long
  2 siblings, 0 replies; 39+ messages in thread
From: David Miller @ 2015-12-04 19:38 UTC (permalink / raw)
  To: herbert; +Cc: phil, netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu, 3 Dec 2015 20:41:29 +0800

> Thomas and Phil observed that under stress rhashtable insertion
> sometimes failed with EBUSY, even though this error should only
> ever been seen when we're under attack and our hash chain length
> has grown to an unacceptable level, even after a rehash.
> 
> It turns out that the logic for detecting whether there is an
> existing rehash is faulty.  In particular, when two threads both
> try to grow the same table at the same time, one of them may see
> the newly grown table and thus erroneously conclude that it had
> been rehashed.  This is what leads to the EBUSY error.
> 
> This patch fixes this by remembering the current last table we
> used during insertion so that rhashtable_insert_rehash can detect
> when another thread has also done a resize/rehash.  When this is
> detected we will give up our resize/rehash and simply retry the
> insertion with the new table.
> 
> Reported-by: Thomas Graf <tgraf@suug.ch>
> Reported-by: Phil Sutter <phil@nwl.cc>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Looks good, applied, thanks Herbert.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation
  2015-12-04 14:39           ` rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation Herbert Xu
  2015-12-04 17:01             ` Phil Sutter
@ 2015-12-04 21:53             ` David Miller
  2015-12-05  7:03               ` Herbert Xu
  1 sibling, 1 reply; 39+ messages in thread
From: David Miller @ 2015-12-04 21:53 UTC (permalink / raw)
  To: herbert
  Cc: eric.dumazet, phil, netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri, 4 Dec 2015 22:39:56 +0800

> When an rhashtable user pounds rhashtable hard with back-to-back
> insertions we may end up growing the table in GFP_ATOMIC context.
> Unfortunately when the table reaches a certain size this often
> fails because we don't have enough physically contiguous pages
> to hold the new table.
> 
> Eric Dumazet suggested (and in fact wrote this patch) using
> __vmalloc instead which can be used in GFP_ATOMIC context.
> 
> Reported-by: Phil Sutter <phil@nwl.cc>
> Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied, thanks Herbert.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation
  2015-12-04 21:53             ` David Miller
@ 2015-12-05  7:03               ` Herbert Xu
  2015-12-06  3:48                 ` David Miller
  0 siblings, 1 reply; 39+ messages in thread
From: Herbert Xu @ 2015-12-05  7:03 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, phil, netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

On Fri, Dec 04, 2015 at 04:53:34PM -0500, David Miller wrote:
> From: Herbert Xu <herbert@gondor.apana.org.au>
> Date: Fri, 4 Dec 2015 22:39:56 +0800
> 
> > When an rhashtable user pounds rhashtable hard with back-to-back
> > insertions we may end up growing the table in GFP_ATOMIC context.
> > Unfortunately when the table reaches a certain size this often
> > fails because we don't have enough physically contiguous pages
> > to hold the new table.
> > 
> > Eric Dumazet suggested (and in fact wrote this patch) using
> > __vmalloc instead which can be used in GFP_ATOMIC context.
> > 
> > Reported-by: Phil Sutter <phil@nwl.cc>
> > Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
> > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> 
> Applied, thanks Herbert.

Sorry Dave, but you'll have to revert this because I've been able
to trigger the following crash with the patch:

Testing concurrent rhashtable access from 50 threads
------------[ cut here ]------------
kernel BUG at ../mm/vmalloc.c:1337!
invalid opcode: 0000 [#1] PREEMPT SMP 

The reason is that I was testing insertions with BH disabled, and
__vmalloc doesn't like that, even with GFP_ATOMIC.  As we obviously
want to continue to support rhashtable users inserting entries with
BH disabled, we'll have to look for an alternate solution.
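
The BUG is presumably the sanity check at the top of
__get_vm_area_node() in mm/vmalloc.c; note that in_interrupt() also
evaluates true with BH merely disabled, because local_bh_disable()
raises the softirq part of preempt_count:

/* mm/vmalloc.c, __get_vm_area_node() (paraphrased): vmalloc refuses to
 * run in any kind of atomic context, which includes BH-disabled
 * sections.
 */
BUG_ON(in_interrupt());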

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation
  2015-12-04 18:15                 ` Phil Sutter
@ 2015-12-05  7:06                   ` Herbert Xu
  2015-12-07 15:35                     ` Thomas Graf
  2015-12-09  2:18                     ` Thomas Graf
  0 siblings, 2 replies; 39+ messages in thread
From: Herbert Xu @ 2015-12-05  7:06 UTC (permalink / raw)
  To: Phil Sutter, Eric Dumazet, davem, netdev, linux-kernel, tgraf,
	fengguang.wu, wfg, lkp

On Fri, Dec 04, 2015 at 07:15:55PM +0100, Phil Sutter wrote:
>
> > Only one should really do this, while others are waiting.
> 
> Sure, that was my previous understanding of how this thing works.

Yes, that's clearly how it should be.  Unfortunately, while adding
the locking to do this, I found out that you can't actually call
__vmalloc with BH disabled, so this is a no-go.

Unless we can make __vmalloc work with BH disabled, I guess we'll
have to go back to multi-level lookups unless someone has a better
suggestion.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation
  2015-12-05  7:03               ` Herbert Xu
@ 2015-12-06  3:48                 ` David Miller
  0 siblings, 0 replies; 39+ messages in thread
From: David Miller @ 2015-12-06  3:48 UTC (permalink / raw)
  To: herbert
  Cc: eric.dumazet, phil, netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Sat, 5 Dec 2015 15:03:54 +0800

> Sorry Dave but you'll have to revert this because I've been able
> to trigger the following crash with the patch:
> 
> Testing concurrent rhashtable access from 50 threads
> ------------[ cut here ]------------
> kernel BUG at ../mm/vmalloc.c:1337!
> invalid opcode: 0000 [#1] PREEMPT SMP 
> 
> The reason is that because I was testing insertions with BH disabled,
> and __vmalloc doesn't like that, even with GFP_ATOMIC.  As we
> obviously want to continue to support rhashtable users inserting
> entries with BH disabled, we'll have to look for an alternate
> solution.

Ok, reverted, thanks for the heads up.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation
  2015-12-05  7:06                   ` Herbert Xu
@ 2015-12-07 15:35                     ` Thomas Graf
  2015-12-07 19:29                       ` David Miller
  2015-12-09  2:18                     ` Thomas Graf
  1 sibling, 1 reply; 39+ messages in thread
From: Thomas Graf @ 2015-12-07 15:35 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Phil Sutter, Eric Dumazet, davem, netdev, linux-kernel,
	fengguang.wu, wfg, lkp

On 12/05/15 at 03:06pm, Herbert Xu wrote:
> On Fri, Dec 04, 2015 at 07:15:55PM +0100, Phil Sutter wrote:
> >
> > > Only one should really do this, while others are waiting.
> > 
> > Sure, that was my previous understanding of how this thing works.
> 
> Yes that's clearly how it should be.  Unfortunately while adding
> the locking to do this, I found out that you can't actually call
> __vmalloc with BH disabled so this is a no-go.
> 
> Unless we can make __vmalloc work with BH disabled, I guess we'll
> have to go back to multi-level lookups unless someone has a better
> suggestion.

Thanks for fixing the race.

As for the remaining problem, I think we'll have to find a way to
serve a hard-pounding user if we want to convert the TCP hashtables
later on.

Did you look into what prevents __vmalloc from working with BH
disabled?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation
  2015-12-07 15:35                     ` Thomas Graf
@ 2015-12-07 19:29                       ` David Miller
  0 siblings, 0 replies; 39+ messages in thread
From: David Miller @ 2015-12-07 19:29 UTC (permalink / raw)
  To: tgraf
  Cc: herbert, phil, eric.dumazet, netdev, linux-kernel, fengguang.wu,
	wfg, lkp

From: Thomas Graf <tgraf@suug.ch>
Date: Mon, 7 Dec 2015 16:35:24 +0100

> Did you look into what __vmalloc prevents to work with BH disabled?

You can't issue the cross-cpu TLB flushes from atomic contexts.
It's the kernel page table updates that create the restriction.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation
  2015-12-05  7:06                   ` Herbert Xu
  2015-12-07 15:35                     ` Thomas Graf
@ 2015-12-09  2:18                     ` Thomas Graf
  2015-12-09  2:24                       ` Herbert Xu
  1 sibling, 1 reply; 39+ messages in thread
From: Thomas Graf @ 2015-12-09  2:18 UTC (permalink / raw)
  To: Herbert Xu, David Miller
  Cc: Phil Sutter, Eric Dumazet, netdev, linux-kernel, fengguang.wu, wfg, lkp


On 12/05/15 at 03:06pm, Herbert Xu wrote:
> Unless we can make __vmalloc work with BH disabled, I guess we'll
> have to go back to multi-level lookups unless someone has a better
> suggestion.

Assuming that we only encounter this scenario with very large
table sizes, it might be OK to assume that deferring the actual
resize via the worker thread while continuing to insert above
100% utilization in atomic context is safe.

On 12/07/15 at 02:29pm, David Miller wrote:
> You can't issue the cross-cpu TLB flushes from atomic contexts.
> It's the kernel page table updates that create the restriction.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation
  2015-12-09  2:18                     ` Thomas Graf
@ 2015-12-09  2:24                       ` Herbert Xu
  2015-12-09  2:36                         ` Thomas Graf
  0 siblings, 1 reply; 39+ messages in thread
From: Herbert Xu @ 2015-12-09  2:24 UTC (permalink / raw)
  To: Thomas Graf
  Cc: David Miller, Phil Sutter, Eric Dumazet, netdev, linux-kernel,
	fengguang.wu, wfg, lkp

On Wed, Dec 09, 2015 at 03:18:26AM +0100, Thomas Graf wrote:
> 
> Assuming that we only encounter this scenario with very large
> table sizes, it might be OK to assume that deferring the actual
> resize via the worker thread while continuing to insert above
> 100% utilization in atomic context is safe.

As test_rhashtable has already demonstrated, this approach doesn't
work.  There is nothing in the kernel that will ensure that the
worker thread gets to run at all.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation
  2015-12-09  2:24                       ` Herbert Xu
@ 2015-12-09  2:36                         ` Thomas Graf
  2015-12-09  2:38                           ` Herbert Xu
  0 siblings, 1 reply; 39+ messages in thread
From: Thomas Graf @ 2015-12-09  2:36 UTC (permalink / raw)
  To: Herbert Xu
  Cc: David Miller, Phil Sutter, Eric Dumazet, netdev, linux-kernel,
	fengguang.wu, wfg, lkp

On 12/09/15 at 10:24am, Herbert Xu wrote:
> On Wed, Dec 09, 2015 at 03:18:26AM +0100, Thomas Graf wrote:
> > 
> > Assuming that we only encounter this scenario with very large
> > table sizes, it might be OK to assume that deferring the actual
> > resize via the worker thread while continuing to insert above
> > 100% utilization in atomic context is safe.
> 
> As test_rhashtable has demonstrated already this approach doesn't
> work.  There is nothing in the kernel that will ensure that the
> worker thread gets to run at all.

If we define 'working' as meaning that an insertion in atomic context
should never fail, then yes.  I'm not sure you can guarantee that
with a segmented table either.  I agree, though, that the insertion
behaviour is much better defined.

My question is: if we are in a situation in which the worker thread
is never invoked and we've already grown to 2x the original table
size, do we still need entries to be inserted into the table, or can
we fail?

Without knowing your exact implementation plans: introducing an
additional reference indirection for every lookup will have a
huge performance penalty as well.

Is your plan to only introduce the master table after an
allocation has failed?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation
  2015-12-09  2:36                         ` Thomas Graf
@ 2015-12-09  2:38                           ` Herbert Xu
  2015-12-09  2:42                             ` Thomas Graf
  0 siblings, 1 reply; 39+ messages in thread
From: Herbert Xu @ 2015-12-09  2:38 UTC (permalink / raw)
  To: Thomas Graf
  Cc: David Miller, Phil Sutter, Eric Dumazet, netdev, linux-kernel,
	fengguang.wu, wfg, lkp

On Wed, Dec 09, 2015 at 03:36:32AM +0100, Thomas Graf wrote:
> 
> Without knowing your exact implementation plans: introducing an
> additional reference indirection for every lookup will have a
> huge performance penalty as well.
> 
> Is your plan to only introduce the master table after an
> allocation has failed?

Right, obviously the extra indirections would only come into play
after a failed allocation.  As soon as we can run the worker thread
it'll try to remove the extra indirections by doing vmalloc.
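
To illustrate the kind of indirection being discussed (purely
hypothetical layout, not the actual design): a flat table resolves a
bucket with a single dereference, while a segmented fallback table
adds one more:

/* Hypothetical two-level bucket table: the flat array of bucket heads
 * is replaced by an array of small, kzalloc'ed segments, so every
 * bucket access costs one extra pointer dereference.
 */
struct seg_bucket_table {
	unsigned int size;			/* total number of buckets */
	unsigned int seg_shift;			/* buckets per segment = 1 << seg_shift */
	struct rhash_head __rcu **segs[];	/* one small segment per slot */
};

static inline struct rhash_head __rcu **seg_bucket(struct seg_bucket_table *tbl,
						   unsigned int hash)
{
	unsigned int idx = hash & (tbl->size - 1);

	return &tbl->segs[idx >> tbl->seg_shift][idx & ((1U << tbl->seg_shift) - 1)];
}

The extra pointer chase only exists while such a fallback table is in
use, which matches removing it again from the worker thread once
vmalloc succeeds.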

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation
  2015-12-09  2:38                           ` Herbert Xu
@ 2015-12-09  2:42                             ` Thomas Graf
  0 siblings, 0 replies; 39+ messages in thread
From: Thomas Graf @ 2015-12-09  2:42 UTC (permalink / raw)
  To: Herbert Xu
  Cc: David Miller, Phil Sutter, Eric Dumazet, netdev, linux-kernel,
	fengguang.wu, wfg, lkp

On 12/09/15 at 10:38am, Herbert Xu wrote:
> On Wed, Dec 09, 2015 at 03:36:32AM +0100, Thomas Graf wrote:
> > 
> > Without knowing your exact implementation plans: introducing an
> > additional reference indirection for every lookup will have a
> > huge performance penalty as well.
> > 
> > Is your plan to only introduce the master table after an
> > allocation has failed?
> 
> Right, obviously the extra indirections would only come into play
> after a failed allocation.  As soon as we can run the worker thread
> it'll try to remove the extra indirections by doing vmalloc.

OK, this sounds like a good compromise. The penalty is isolated
for the duration of the atomic burst.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Prevent spurious EBUSY errors on insertion
  2015-12-03 12:41       ` rhashtable: Prevent spurious EBUSY errors on insertion Herbert Xu
  2015-12-03 15:38         ` Phil Sutter
  2015-12-04 19:38         ` David Miller
@ 2015-12-17  8:46         ` Xin Long
  2015-12-17  8:48           ` Herbert Xu
  2 siblings, 1 reply; 39+ messages in thread
From: Xin Long @ 2015-12-17  8:46 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Phil Sutter, davem, network dev, linux-kernel, tgraf,
	fengguang.wu, wfg, lkp

On Thu, Dec 3, 2015 at 8:41 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Mon, Nov 30, 2015 at 06:18:59PM +0800, Herbert Xu wrote:
>>
>> OK that's better.  I think I see the problem.  The test in
>> rhashtable_insert_rehash is racy and if two threads both try
>> to grow the table one of them may be tricked into doing a rehash
>> instead.
>>
>> I'm working on a fix.
>
> OK this patch fixes the EBUSY problem as far as I can tell.  Please
> let me know if you still observe EBUSY with it.  I'll respond to the
> ENOMEM problem in another email.
>
> ---8<---
> Thomas and Phil observed that under stress rhashtable insertion
> sometimes failed with EBUSY, even though this error should only
> ever been seen when we're under attack and our hash chain length
> has grown to an unacceptable level, even after a rehash.
>
> It turns out that the logic for detecting whether there is an
> existing rehash is faulty.  In particular, when two threads both
> try to grow the same table at the same time, one of them may see
> the newly grown table and thus erroneously conclude that it had
> been rehashed.  This is what leads to the EBUSY error.
>
> This patch fixes this by remembering the current last table we
> used during insertion so that rhashtable_insert_rehash can detect
> when another thread has also done a resize/rehash.  When this is
> detected we will give up our resize/rehash and simply retry the
> insertion with the new table.
>
> Reported-by: Thomas Graf <tgraf@suug.ch>
> Reported-by: Phil Sutter <phil@nwl.cc>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
> index 843ceca..e50b31d 100644
> --- a/include/linux/rhashtable.h
> +++ b/include/linux/rhashtable.h
> @@ -19,6 +19,7 @@
>
>  #include <linux/atomic.h>
>  #include <linux/compiler.h>
> +#include <linux/err.h>
>  #include <linux/errno.h>
>  #include <linux/jhash.h>
>  #include <linux/list_nulls.h>
> @@ -339,10 +340,11 @@ static inline int lockdep_rht_bucket_is_held(const struct bucket_table *tbl,
>  int rhashtable_init(struct rhashtable *ht,
>                     const struct rhashtable_params *params);
>
> -int rhashtable_insert_slow(struct rhashtable *ht, const void *key,
> -                          struct rhash_head *obj,
> -                          struct bucket_table *old_tbl);
> -int rhashtable_insert_rehash(struct rhashtable *ht);
> +struct bucket_table *rhashtable_insert_slow(struct rhashtable *ht,
> +                                           const void *key,
> +                                           struct rhash_head *obj,
> +                                           struct bucket_table *old_tbl);
> +int rhashtable_insert_rehash(struct rhashtable *ht, struct bucket_table *tbl);
>
>  int rhashtable_walk_init(struct rhashtable *ht, struct rhashtable_iter *iter);
>  void rhashtable_walk_exit(struct rhashtable_iter *iter);
> @@ -598,9 +600,11 @@ restart:
>
>         new_tbl = rht_dereference_rcu(tbl->future_tbl, ht);
>         if (unlikely(new_tbl)) {
> -               err = rhashtable_insert_slow(ht, key, obj, new_tbl);
> -               if (err == -EAGAIN)
> +               tbl = rhashtable_insert_slow(ht, key, obj, new_tbl);
> +               if (!IS_ERR_OR_NULL(tbl))
>                         goto slow_path;
> +
> +               err = PTR_ERR(tbl);
>                 goto out;
>         }
>
> @@ -611,7 +615,7 @@ restart:
>         if (unlikely(rht_grow_above_100(ht, tbl))) {
>  slow_path:
>                 spin_unlock_bh(lock);
> -               err = rhashtable_insert_rehash(ht);
> +               err = rhashtable_insert_rehash(ht, tbl);
>                 rcu_read_unlock();
>                 if (err)
>                         return err;
> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> index a54ff89..2ff7ed9 100644
> --- a/lib/rhashtable.c
> +++ b/lib/rhashtable.c
> @@ -389,33 +389,31 @@ static bool rhashtable_check_elasticity(struct rhashtable *ht,
>         return false;
>  }
>
> -int rhashtable_insert_rehash(struct rhashtable *ht)
> +int rhashtable_insert_rehash(struct rhashtable *ht,
> +                            struct bucket_table *tbl)
>  {
>         struct bucket_table *old_tbl;
>         struct bucket_table *new_tbl;
> -       struct bucket_table *tbl;
>         unsigned int size;
>         int err;
>
>         old_tbl = rht_dereference_rcu(ht->tbl, ht);
> -       tbl = rhashtable_last_table(ht, old_tbl);
>
>         size = tbl->size;
>
> +       err = -EBUSY;
> +
>         if (rht_grow_above_75(ht, tbl))
>                 size *= 2;
>         /* Do not schedule more than one rehash */
>         else if (old_tbl != tbl)
> -               return -EBUSY;
> +               goto fail;
> +
> +       err = -ENOMEM;
>
>         new_tbl = bucket_table_alloc(ht, size, GFP_ATOMIC);
> -       if (new_tbl == NULL) {
> -               /* Schedule async resize/rehash to try allocation
> -                * non-atomic context.
> -                */
> -               schedule_work(&ht->run_work);
> -               return -ENOMEM;
> -       }
> +       if (new_tbl == NULL)
> +               goto fail;
>
>         err = rhashtable_rehash_attach(ht, tbl, new_tbl);
>         if (err) {
> @@ -426,12 +424,24 @@ int rhashtable_insert_rehash(struct rhashtable *ht)
>                 schedule_work(&ht->run_work);
>
>         return err;
> +
> +fail:
> +       /* Do not fail the insert if someone else did a rehash. */
> +       if (likely(rcu_dereference_raw(tbl->future_tbl)))
> +               return 0;
> +
> +       /* Schedule async rehash to retry allocation in process context. */
> +       if (err == -ENOMEM)
> +               schedule_work(&ht->run_work);
> +
> +       return err;
>  }
>  EXPORT_SYMBOL_GPL(rhashtable_insert_rehash);
>
> -int rhashtable_insert_slow(struct rhashtable *ht, const void *key,
> -                          struct rhash_head *obj,
> -                          struct bucket_table *tbl)
> +struct bucket_table *rhashtable_insert_slow(struct rhashtable *ht,
> +                                           const void *key,
> +                                           struct rhash_head *obj,
> +                                           struct bucket_table *tbl)
>  {
>         struct rhash_head *head;
>         unsigned int hash;
> @@ -467,7 +477,12 @@ int rhashtable_insert_slow(struct rhashtable *ht, const void *key,
>  exit:
>         spin_unlock(rht_bucket_lock(tbl, hash));
>
> -       return err;
> +       if (err == 0)
> +               return NULL;
> +       else if (err == -EAGAIN)
> +               return tbl;
> +       else
> +               return ERR_PTR(err);
>  }
>  EXPORT_SYMBOL_GPL(rhashtable_insert_slow);
>

Sorry for the late test, but unfortunately my case with rhashtable still
returns EBUSY.
I added some debug code in rhashtable_insert_rehash(), and found:
*future_tbl is null*

fail:
        /* Do not fail the insert if someone else did a rehash. */
        if (likely(rcu_dereference_raw(tbl->future_tbl))) {
                printk("future_tbl is there\n");
                return 0;
        } else {
                printk("future_tbl is null\n");
        }

any idea why ?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Prevent spurious EBUSY errors on insertion
  2015-12-17  8:46         ` Xin Long
@ 2015-12-17  8:48           ` Herbert Xu
  2015-12-17  9:00             ` Xin Long
  0 siblings, 1 reply; 39+ messages in thread
From: Herbert Xu @ 2015-12-17  8:48 UTC (permalink / raw)
  To: Xin Long
  Cc: Phil Sutter, davem, network dev, linux-kernel, tgraf,
	fengguang.wu, wfg, lkp

On Thu, Dec 17, 2015 at 04:46:00PM +0800, Xin Long wrote:
>
> Sorry for the late test, but unfortunately my case with rhashtable still
> returns EBUSY.
> I added some debug code in rhashtable_insert_rehash(), and found:
> *future_tbl is null*
> 
> fail:
>         /* Do not fail the insert if someone else did a rehash. */
>         if (likely(rcu_dereference_raw(tbl->future_tbl))) {
>                 printk("future_tbl is there\n");
>                 return 0;
>         } else {
>                 printk("future_tbl is null\n");
>         }
> 
> any idea why ?

That's presumably because you got a genuine double rehash.

Until you post your code we can't really help you.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Prevent spurious EBUSY errors on insertion
  2015-12-17  8:48           ` Herbert Xu
@ 2015-12-17  9:00             ` Xin Long
  2015-12-17 16:07               ` Xin Long
  2015-12-17 17:00               ` David Miller
  0 siblings, 2 replies; 39+ messages in thread
From: Xin Long @ 2015-12-17  9:00 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Phil Sutter, davem, network dev, linux-kernel, tgraf,
	fengguang.wu, wfg, lkp

On Thu, Dec 17, 2015 at 4:48 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Thu, Dec 17, 2015 at 04:46:00PM +0800, Xin Long wrote:
>>
>> Sorry for the late test, but unfortunately my case with rhashtable still
>> returns EBUSY.
>> I added some debug code in rhashtable_insert_rehash(), and found:
>> *future_tbl is null*
>>
>> fail:
>>         /* Do not fail the insert if someone else did a rehash. */
>>         if (likely(rcu_dereference_raw(tbl->future_tbl))) {
>>                 printk("future_tbl is there\n");
>>                 return 0;
>>         } else {
>>                 printk("future_tbl is null\n");
>>         }
>>
>> any idea why ?
>
> That's presumably because you got a genuine double rehash.
>
> Until you post your code we can't really help you.
>
I wish I could, but my code is a big patch for SCTP, and this issue
happens in a special stress test based on that patch.
I'm trying to think how I can show it to you. :)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Prevent spurious EBUSY errors on insertion
  2015-12-17  9:00             ` Xin Long
@ 2015-12-17 16:07               ` Xin Long
  2015-12-18  2:26                 ` Herbert Xu
  2015-12-17 17:00               ` David Miller
  1 sibling, 1 reply; 39+ messages in thread
From: Xin Long @ 2015-12-17 16:07 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Phil Sutter, davem, network dev, linux-kernel, tgraf,
	fengguang.wu, wfg, lkp

On Thu, Dec 17, 2015 at 5:00 PM, Xin Long <lucien.xin@gmail.com> wrote:
> On Thu, Dec 17, 2015 at 4:48 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
>> On Thu, Dec 17, 2015 at 04:46:00PM +0800, Xin Long wrote:
>>>
>>> Sorry for the late test, but unfortunately my case with rhashtable still
>>> returns EBUSY.
>>> I added some debug code in rhashtable_insert_rehash(), and found:
>>> *future_tbl is null*
>>>
>>> fail:
>>>         /* Do not fail the insert if someone else did a rehash. */
>>>         if (likely(rcu_dereference_raw(tbl->future_tbl))) {
>>>                 printk("future_tbl is there\n");
>>>                 return 0;
>>>         } else {
>>>                 printk("future_tbl is null\n");
>>>         }
>>>
>>> any idea why ?
>>
>> That's presumably because you got a genuine double rehash.
>>
>> Until you post your code we can't really help you.
>>
> I wish I could, but my code is a big patch for SCTP, and this issue
> happens in a special stress test based on that patch.
> I'm trying to think how I can show it to you. :)

I'm just wondering: why don't we handle the genuine double rehash
issue inside rhashtable? I mean, it's just a temporary error that a
simple retry might fix.
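
For context, the series at the top of this thread leaves that retry to the
caller; below is a minimal sketch of such a caller-side loop. The
insert_retry() name follows patch 2/4 of the series, but the body here is
only illustrative, not a verbatim copy:

#include <linux/rhashtable.h>
#include <linux/sched.h>

/* Retry rhashtable_insert_fast() while it reports transient pressure
 * (-EBUSY); any other error, e.g. -ENOMEM, is returned to the caller.
 * On success, the number of retries that were needed is returned.
 */
static int insert_retry(struct rhashtable *ht, struct rhash_head *obj,
			const struct rhashtable_params params)
{
	int err, retries = -1;

	do {
		retries++;
		cond_resched();
		err = rhashtable_insert_fast(ht, obj, params);
	} while (err == -EBUSY);

	return err ? err : retries;
}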

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Prevent spurious EBUSY errors on insertion
  2015-12-17  9:00             ` Xin Long
  2015-12-17 16:07               ` Xin Long
@ 2015-12-17 17:00               ` David Miller
  1 sibling, 0 replies; 39+ messages in thread
From: David Miller @ 2015-12-17 17:00 UTC (permalink / raw)
  To: lucien.xin
  Cc: herbert, phil, netdev, linux-kernel, tgraf, fengguang.wu, wfg, lkp

From: Xin Long <lucien.xin@gmail.com>
Date: Thu, 17 Dec 2015 17:00:35 +0800

> On Thu, Dec 17, 2015 at 4:48 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
>> On Thu, Dec 17, 2015 at 04:46:00PM +0800, Xin Long wrote:
>>>
>>> Sorry for the late test, but unfortunately my case with rhashtable still
>>> returns EBUSY.
>>> I added some debug code in rhashtable_insert_rehash(), and found:
>>> *future_tbl is null*
>>>
>>> fail:
>>>         /* Do not fail the insert if someone else did a rehash. */
>>>         if (likely(rcu_dereference_raw(tbl->future_tbl))) {
>>>                 printk("future_tbl is there\n");
>>>                 return 0;
>>>         } else {
>>>                 printk("future_tbl is null\n");
>>>         }
>>>
>>> any idea why ?
>>
>> That's presumably because you got a genuine double rehash.
>>
>> Until you post your code we can't really help you.
>>
> I wish I could, but my code is a big patch for SCTP, and this issue
> happens in a special stress test based on that patch.
> I'm trying to think how I can show it to you. :)

Simply post it.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Prevent spurious EBUSY errors on insertion
  2015-12-17 16:07               ` Xin Long
@ 2015-12-18  2:26                 ` Herbert Xu
  2015-12-18  8:18                   ` Xin Long
  0 siblings, 1 reply; 39+ messages in thread
From: Herbert Xu @ 2015-12-18  2:26 UTC (permalink / raw)
  To: Xin Long
  Cc: Phil Sutter, davem, network dev, linux-kernel, tgraf,
	fengguang.wu, wfg, lkp

On Fri, Dec 18, 2015 at 12:07:08AM +0800, Xin Long wrote:
>
> I'm just wondering: why don't we handle the genuine double rehash
> issue inside rhashtable? I mean, it's just a temporary error that a
> simple retry might fix.

Because a double rehash means that someone has cracked your hash
function and there is no point in trying anymore.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: rhashtable: Prevent spurious EBUSY errors on insertion
  2015-12-18  2:26                 ` Herbert Xu
@ 2015-12-18  8:18                   ` Xin Long
  0 siblings, 0 replies; 39+ messages in thread
From: Xin Long @ 2015-12-18  8:18 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Phil Sutter, davem, network dev, linux-kernel, tgraf,
	fengguang.wu, wfg, lkp

On Fri, Dec 18, 2015 at 10:26 AM, Herbert Xu
<herbert@gondor.apana.org.au> wrote:
> On Fri, Dec 18, 2015 at 12:07:08AM +0800, Xin Long wrote:
>>
>> I'm just wondering: why don't we handle the genuine double rehash
>> issue inside rhashtable? I mean, it's just a temporary error that a
>> simple retry might fix.
>
> Because a double rehash means that someone has cracked your hash
> function and there is no point in trying anymore.

OK, I get your point. But could this also be triggered by cases that are all
legal, just under heavy insertion stress? For example, we use rhashtable in
nftables; if there is a big batch of sets to insert, could this issue happen?

>
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2015-12-18  8:18 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-20 17:17 [PATCH v2 0/4] improve fault-tolerance of rhashtable runtime-test Phil Sutter
2015-11-20 17:17 ` [PATCH v2 1/4] rhashtable-test: add cond_resched() to thread test Phil Sutter
2015-11-20 17:17 ` [PATCH v2 2/4] rhashtable-test: retry insert operations Phil Sutter
2015-11-20 17:17 ` [PATCH v2 3/4] rhashtable-test: calculate max_entries value by default Phil Sutter
2015-11-20 17:17 ` [PATCH v2 4/4] rhashtable-test: allow to retry even if -ENOMEM was returned Phil Sutter
2015-11-20 17:28   ` Phil Sutter
2015-11-23 17:38 ` [PATCH v2 0/4] improve fault-tolerance of rhashtable runtime-test David Miller
2015-11-30  9:37 ` Herbert Xu
2015-11-30 10:14   ` Phil Sutter
2015-11-30 10:18     ` Herbert Xu
2015-12-03 12:41       ` rhashtable: Prevent spurious EBUSY errors on insertion Herbert Xu
2015-12-03 15:38         ` Phil Sutter
2015-12-04 19:38         ` David Miller
2015-12-17  8:46         ` Xin Long
2015-12-17  8:48           ` Herbert Xu
2015-12-17  9:00             ` Xin Long
2015-12-17 16:07               ` Xin Long
2015-12-18  2:26                 ` Herbert Xu
2015-12-18  8:18                   ` Xin Long
2015-12-17 17:00               ` David Miller
2015-12-03 12:51       ` rhashtable: ENOMEM errors when hit with a flood of insertions Herbert Xu
2015-12-03 15:08         ` David Laight
2015-12-03 16:08         ` Eric Dumazet
2015-12-04  0:07           ` Herbert Xu
2015-12-04 14:39           ` rhashtable: Use __vmalloc with GFP_ATOMIC for table allocation Herbert Xu
2015-12-04 17:01             ` Phil Sutter
2015-12-04 17:45               ` Eric Dumazet
2015-12-04 18:15                 ` Phil Sutter
2015-12-05  7:06                   ` Herbert Xu
2015-12-07 15:35                     ` Thomas Graf
2015-12-07 19:29                       ` David Miller
2015-12-09  2:18                     ` Thomas Graf
2015-12-09  2:24                       ` Herbert Xu
2015-12-09  2:36                         ` Thomas Graf
2015-12-09  2:38                           ` Herbert Xu
2015-12-09  2:42                             ` Thomas Graf
2015-12-04 21:53             ` David Miller
2015-12-05  7:03               ` Herbert Xu
2015-12-06  3:48                 ` David Miller
