* [PATCH 0/6 net-next] rhashtable fixes
@ 2015-01-30  0:20 Thomas Graf
  2015-01-30  0:20 ` [PATCH 1/6] rhashtable: key_hashfn() must return full hash value Thomas Graf
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Thomas Graf @ 2015-01-30  0:20 UTC (permalink / raw)
  To: davem; +Cc: netdev, ying.xue

This is a series of fixes which have accumulated while tracking
down the race condition reported by Ying Xue. The original
DEBUG_PAGEALLOC splat is resolved.

However, a race that is harder to trigger remains: certain entries
cannot be found when removing them from the table via
netlink_remove(), which later causes a use-after-free.

Regardless, these fixes can go in now.

Thomas Graf (6):
  rhashtable: key_hashfn() must return full hash value
  rhashtable: Use a single bucket lock for sibling buckets
  rhashtable: Wait for RCU readers after final unzip work
  rhashtable: Dump bucket tables on locking violation under
    PROVE_LOCKING
  rhashtable: Add more lock verification
  rhashtable: Avoid bucket cross reference after removal

 lib/rhashtable.c | 301 ++++++++++++++++++++++++++++++-------------------------
 1 file changed, 166 insertions(+), 135 deletions(-)

-- 
1.9.3


* [PATCH 1/6] rhashtable: key_hashfn() must return full hash value
  2015-01-30  0:20 [PATCH 0/6 net-next] rhashtable fixes Thomas Graf
@ 2015-01-30  0:20 ` Thomas Graf
  2015-01-30  0:20 ` [PATCH 2/6] rhashtable: Use a single bucket lock for sibling buckets Thomas Graf
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Thomas Graf @ 2015-01-30  0:20 UTC (permalink / raw)
  To: davem; +Cc: netdev, ying.xue

The value computed by key_hashfn() is used by rhashtable_lookup_compare()
to traverse both tables during a resize. key_hashfn() must therefore
return the hash value without the buckets mask applied so it can be
masked to the size of each individual table.
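
As a rough illustration of why masking too early loses information, the
user-space sketch below models a table as nothing more than a
power-of-two size. The names toy_table, toy_bucket_index() and
toy_index_premasked() are hypothetical and not part of lib/rhashtable.c,
and indexing by (size - 1) is an assumption about how
rht_bucket_index() reduces a hash.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct bucket_table: only the size matters. */
struct toy_table {
	uint32_t size;	/* number of buckets, assumed to be a power of two */
};

/* Reduce a full hash to a bucket index for one specific table. */
static uint32_t toy_bucket_index(const struct toy_table *tbl, uint32_t hash)
{
	assert(tbl->size != 0 && (tbl->size & (tbl->size - 1)) == 0);
	return hash & (tbl->size - 1);
}

/*
 * Model of the problematic pattern: the hash is masked to one table
 * first and the already-masked value is then reused to index another
 * table. Any hash bits above the first table's mask are lost.
 */
static uint32_t toy_index_premasked(const struct toy_table *masked_with,
				    const struct toy_table *lookup_in,
				    uint32_t hash)
{
	uint32_t idx = toy_bucket_index(masked_with, hash);

	return toy_bucket_index(lookup_in, idx);
}
```

With a hash of 0x2d, a 4-bucket table and an 8-bucket table, the full
hash selects bucket 5 in the larger table, while the pre-masked value
selects bucket 1, so a lookup would walk the wrong chain.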

Fixes: 97defe1ecf86 ("rhashtable: Per bucket locks & deferred expansion/shrinking")
Signed-off-by: Thomas Graf <tgraf@suug.ch>
---
 lib/rhashtable.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index bc2d0d8..7413697 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -94,13 +94,7 @@ static u32 obj_raw_hashfn(const struct rhashtable *ht, const void *ptr)
 
 static u32 key_hashfn(struct rhashtable *ht, const void *key, u32 len)
 {
-	struct bucket_table *tbl = rht_dereference_rcu(ht->tbl, ht);
-	u32 hash;
-
-	hash = ht->p.hashfn(key, len, ht->p.hash_rnd);
-	hash >>= HASH_RESERVED_SPACE;
-
-	return rht_bucket_index(tbl, hash);
+	return ht->p.hashfn(key, len, ht->p.hash_rnd) >> HASH_RESERVED_SPACE;
 }
 
 static u32 head_hashfn(const struct rhashtable *ht,
-- 
1.9.3


* [PATCH 2/6] rhashtable: Use a single bucket lock for sibling buckets
  2015-01-30  0:20 [PATCH 0/6 net-next] rhashtable fixes Thomas Graf
  2015-01-30  0:20 ` [PATCH 1/6] rhashtable: key_hashfn() must return full hash value Thomas Graf
@ 2015-01-30  0:20 ` Thomas Graf
  2015-01-31  4:34   ` Herbert Xu
  2015-01-30  0:20 ` [PATCH 3/6] rhashtable: Wait for RCU readers after final unzip work Thomas Graf
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Thomas Graf @ 2015-01-30  0:20 UTC (permalink / raw)
  To: davem; +Cc: netdev, ying.xue

rhashtable currently allows one bucket lock per bucket. This
requires multiple levels of complicated nested locking because, when
resizing, a single bucket of the smaller table maps to two buckets
in the larger table. So far rhashtable has explicitly locked both
buckets in the larger table.

By excluding the highest bit of the hash from the bucket lock map, and
thus allowing at most one lock per two buckets, the locking can be
simplified considerably without losing the benefits of multiple locks.
Larger tables, which benefit from multiple locks, will not have a
single lock per bucket anyway.
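
The effect of capping the lock array at half the table size can be
sketched in user space as follows. toy_lock_count() and
toy_lock_index() are hypothetical helpers, not kernel functions; they
only assume that the lock is chosen by masking the hash with
(number of locks - 1), as bucket_lock() does with locks_mask.

```c
#include <assert.h>
#include <stdint.h>

/* Cap the requested number of locks at half the number of buckets. */
static uint32_t toy_lock_count(uint32_t wanted, uint32_t tbl_size)
{
	uint32_t half = tbl_size >> 1;

	return wanted < half ? wanted : half;
}

/* Select a lock by masking the hash with (nlocks - 1). */
static uint32_t toy_lock_index(uint32_t hash, uint32_t nlocks)
{
	/* nlocks is assumed to be a non-zero power of two. */
	assert(nlocks != 0 && (nlocks & (nlocks - 1)) == 0);
	return hash & (nlocks - 1);
}
```

For an 8-bucket table this yields at most 4 locks, so the sibling
buckets 1 and 5 (which collapse into bucket 1 of a 4-bucket table) are
both covered by lock 1. With one lock per bucket they would use locks 1
and 5 and would have to be locked separately.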

Signed-off-by: Thomas Graf <tgraf@suug.ch>
---
 lib/rhashtable.c | 166 +++++++++++++++++++++++--------------------------------
 1 file changed, 68 insertions(+), 98 deletions(-)

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 7413697..17eeabc 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -1,7 +1,7 @@
 /*
  * Resizable, Scalable, Concurrent Hash Table
  *
- * Copyright (c) 2014 Thomas Graf <tgraf@suug.ch>
+ * Copyright (c) 2014-2015 Thomas Graf <tgraf@suug.ch>
  * Copyright (c) 2008-2014 Patrick McHardy <kaber@trash.net>
  *
  * Based on the following paper:
@@ -34,7 +34,6 @@
 enum {
 	RHT_LOCK_NORMAL,
 	RHT_LOCK_NESTED,
-	RHT_LOCK_NESTED2,
 };
 
 /* The bucket lock is selected based on the hash and protects mutations
@@ -128,8 +127,8 @@ static int alloc_bucket_locks(struct rhashtable *ht, struct bucket_table *tbl)
 	nr_pcpus = min_t(unsigned int, nr_pcpus, 32UL);
 	size = roundup_pow_of_two(nr_pcpus * ht->p.locks_mul);
 
-	/* Never allocate more than one lock per bucket */
-	size = min_t(unsigned int, size, tbl->size);
+	/* Never allocate more than 0.5 locks per bucket */
+	size = min_t(unsigned int, size, tbl->size >> 1);
 
 	if (sizeof(spinlock_t) != 0) {
 #ifdef CONFIG_NUMA
@@ -211,13 +210,36 @@ bool rht_shrink_below_30(const struct rhashtable *ht, size_t new_size)
 }
 EXPORT_SYMBOL_GPL(rht_shrink_below_30);
 
-static void hashtable_chain_unzip(const struct rhashtable *ht,
+static void lock_buckets(struct bucket_table *new_tbl,
+			 struct bucket_table *old_tbl, unsigned int hash)
+	__acquires(old_bucket_lock)
+{
+	spin_lock_bh(bucket_lock(old_tbl, hash));
+	if (new_tbl != old_tbl)
+		spin_lock_bh_nested(bucket_lock(new_tbl, hash),
+				    RHT_LOCK_NESTED);
+}
+
+static void unlock_buckets(struct bucket_table *new_tbl,
+			   struct bucket_table *old_tbl, unsigned int hash)
+	__releases(old_bucket_lock)
+{
+	if (new_tbl != old_tbl)
+		spin_unlock_bh(bucket_lock(new_tbl, hash));
+	spin_unlock_bh(bucket_lock(old_tbl, hash));
+}
+
+/**
+ * Unlink entries on bucket which hash to different bucket.
+ *
+ * Returns true if no more work needs to be performed on the bucket.
+ */
+static bool hashtable_chain_unzip(const struct rhashtable *ht,
 				  const struct bucket_table *new_tbl,
 				  struct bucket_table *old_tbl,
 				  size_t old_hash)
 {
 	struct rhash_head *he, *p, *next;
-	spinlock_t *new_bucket_lock, *new_bucket_lock2 = NULL;
 	unsigned int new_hash, new_hash2;
 
 	ASSERT_BUCKET_LOCK(old_tbl, old_hash);
@@ -226,10 +248,10 @@ static void hashtable_chain_unzip(const struct rhashtable *ht,
 	p = rht_dereference_bucket(old_tbl->buckets[old_hash], old_tbl,
 				   old_hash);
 	if (rht_is_a_nulls(p))
-		return;
+		return false;
 
-	new_hash = new_hash2 = head_hashfn(ht, new_tbl, p);
-	new_bucket_lock = bucket_lock(new_tbl, new_hash);
+	new_hash = head_hashfn(ht, new_tbl, p);
+	ASSERT_BUCKET_LOCK(new_tbl, new_hash);
 
 	/* Advance the old bucket pointer one or more times until it
 	 * reaches a node that doesn't hash to the same bucket as the
@@ -237,22 +259,19 @@ static void hashtable_chain_unzip(const struct rhashtable *ht,
 	 */
 	rht_for_each_continue(he, p->next, old_tbl, old_hash) {
 		new_hash2 = head_hashfn(ht, new_tbl, he);
+
+		/* All entries in a chain must always map to a single
+		 * bucket lock. This is ensured because the bucket lock
+		 * hash map ignores the highest bit.
+		 */
+		ASSERT_BUCKET_LOCK(new_tbl, new_hash2);
+
 		if (new_hash != new_hash2)
 			break;
 		p = he;
 	}
 	rcu_assign_pointer(old_tbl->buckets[old_hash], p->next);
 
-	spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED);
-
-	/* If we have encountered an entry that maps to a different bucket in
-	 * the new table, lock down that bucket as well as we might cut off
-	 * the end of the chain.
-	 */
-	new_bucket_lock2 = bucket_lock(new_tbl, new_hash);
-	if (new_bucket_lock != new_bucket_lock2)
-		spin_lock_bh_nested(new_bucket_lock2, RHT_LOCK_NESTED2);
-
 	/* Find the subsequent node which does hash to the same
 	 * bucket as node P, or NULL if no such node exists.
 	 */
@@ -271,21 +290,16 @@ static void hashtable_chain_unzip(const struct rhashtable *ht,
 	 */
 	rcu_assign_pointer(p->next, next);
 
-	if (new_bucket_lock != new_bucket_lock2)
-		spin_unlock_bh(new_bucket_lock2);
-	spin_unlock_bh(new_bucket_lock);
+	p = rht_dereference_bucket(old_tbl->buckets[old_hash], old_tbl,
+				   old_hash);
+
+	return !rht_is_a_nulls(p);
 }
 
 static void link_old_to_new(struct bucket_table *new_tbl,
 			    unsigned int new_hash, struct rhash_head *entry)
 {
-	spinlock_t *new_bucket_lock;
-
-	new_bucket_lock = bucket_lock(new_tbl, new_hash);
-
-	spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED);
 	rcu_assign_pointer(*bucket_tail(new_tbl, new_hash), entry);
-	spin_unlock_bh(new_bucket_lock);
 }
 
 /**
@@ -308,7 +322,6 @@ int rhashtable_expand(struct rhashtable *ht)
 {
 	struct bucket_table *new_tbl, *old_tbl = rht_dereference(ht->tbl, ht);
 	struct rhash_head *he;
-	spinlock_t *old_bucket_lock;
 	unsigned int new_hash, old_hash;
 	bool complete = false;
 
@@ -338,16 +351,14 @@ int rhashtable_expand(struct rhashtable *ht)
 	 */
 	for (new_hash = 0; new_hash < new_tbl->size; new_hash++) {
 		old_hash = rht_bucket_index(old_tbl, new_hash);
-		old_bucket_lock = bucket_lock(old_tbl, old_hash);
-
-		spin_lock_bh(old_bucket_lock);
+		lock_buckets(new_tbl, old_tbl, new_hash);
 		rht_for_each(he, old_tbl, old_hash) {
 			if (head_hashfn(ht, new_tbl, he) == new_hash) {
 				link_old_to_new(new_tbl, new_hash, he);
 				break;
 			}
 		}
-		spin_unlock_bh(old_bucket_lock);
+		unlock_buckets(new_tbl, old_tbl, new_hash);
 	}
 
 	/* Publish the new table pointer. Lookups may now traverse
@@ -370,18 +381,13 @@ int rhashtable_expand(struct rhashtable *ht)
 		 */
 		complete = true;
 		for (old_hash = 0; old_hash < old_tbl->size; old_hash++) {
-			struct rhash_head *head;
-
-			old_bucket_lock = bucket_lock(old_tbl, old_hash);
-			spin_lock_bh(old_bucket_lock);
+			lock_buckets(new_tbl, old_tbl, old_hash);
 
-			hashtable_chain_unzip(ht, new_tbl, old_tbl, old_hash);
-			head = rht_dereference_bucket(old_tbl->buckets[old_hash],
-						      old_tbl, old_hash);
-			if (!rht_is_a_nulls(head))
+			if (hashtable_chain_unzip(ht, new_tbl, old_tbl,
+						  old_hash))
 				complete = false;
 
-			spin_unlock_bh(old_bucket_lock);
+			unlock_buckets(new_tbl, old_tbl, old_hash);
 		}
 	}
 
@@ -409,7 +415,6 @@ EXPORT_SYMBOL_GPL(rhashtable_expand);
 int rhashtable_shrink(struct rhashtable *ht)
 {
 	struct bucket_table *new_tbl, *tbl = rht_dereference(ht->tbl, ht);
-	spinlock_t *new_bucket_lock, *old_bucket_lock1, *old_bucket_lock2;
 	unsigned int new_hash;
 
 	ASSERT_RHT_MUTEX(ht);
@@ -432,31 +437,14 @@ int rhashtable_shrink(struct rhashtable *ht)
 	 * to lock down both matching buckets in the old table.
 	 */
 	for (new_hash = 0; new_hash < new_tbl->size; new_hash++) {
-		old_bucket_lock1 = bucket_lock(tbl, new_hash);
-		old_bucket_lock2 = bucket_lock(tbl, new_hash + new_tbl->size);
-		new_bucket_lock = bucket_lock(new_tbl, new_hash);
-
-		spin_lock_bh(old_bucket_lock1);
-
-		/* Depending on the lock per buckets mapping, the bucket in
-		 * the lower and upper region may map to the same lock.
-		 */
-		if (old_bucket_lock1 != old_bucket_lock2) {
-			spin_lock_bh_nested(old_bucket_lock2, RHT_LOCK_NESTED);
-			spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED2);
-		} else {
-			spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED);
-		}
+		lock_buckets(new_tbl, tbl, new_hash);
 
 		rcu_assign_pointer(*bucket_tail(new_tbl, new_hash),
 				   tbl->buckets[new_hash]);
 		rcu_assign_pointer(*bucket_tail(new_tbl, new_hash),
 				   tbl->buckets[new_hash + new_tbl->size]);
 
-		spin_unlock_bh(new_bucket_lock);
-		if (old_bucket_lock1 != old_bucket_lock2)
-			spin_unlock_bh(old_bucket_lock2);
-		spin_unlock_bh(old_bucket_lock1);
+		unlock_buckets(new_tbl, tbl, new_hash);
 	}
 
 	/* Publish the new, valid hash table */
@@ -539,19 +527,18 @@ static void __rhashtable_insert(struct rhashtable *ht, struct rhash_head *obj,
  */
 void rhashtable_insert(struct rhashtable *ht, struct rhash_head *obj)
 {
-	struct bucket_table *tbl;
-	spinlock_t *lock;
+	struct bucket_table *tbl, *old_tbl;
 	unsigned hash;
 
 	rcu_read_lock();
 
 	tbl = rht_dereference_rcu(ht->future_tbl, ht);
+	old_tbl = rht_dereference_rcu(ht->tbl, ht);
 	hash = head_hashfn(ht, tbl, obj);
-	lock = bucket_lock(tbl, hash);
 
-	spin_lock_bh(lock);
+	lock_buckets(tbl, old_tbl, hash);
 	__rhashtable_insert(ht, obj, tbl, hash);
-	spin_unlock_bh(lock);
+	unlock_buckets(tbl, old_tbl, hash);
 
 	rcu_read_unlock();
 }
@@ -574,21 +561,20 @@ EXPORT_SYMBOL_GPL(rhashtable_insert);
  */
 bool rhashtable_remove(struct rhashtable *ht, struct rhash_head *obj)
 {
-	struct bucket_table *tbl;
+	struct bucket_table *tbl, *new_tbl, *old_tbl;
 	struct rhash_head __rcu **pprev;
 	struct rhash_head *he;
-	spinlock_t *lock;
-	unsigned int hash;
+	unsigned int hash, new_hash;
 	bool ret = false;
 
 	rcu_read_lock();
-	tbl = rht_dereference_rcu(ht->tbl, ht);
-	hash = head_hashfn(ht, tbl, obj);
-
-	lock = bucket_lock(tbl, hash);
-	spin_lock_bh(lock);
+	tbl = old_tbl = rht_dereference_rcu(ht->tbl, ht);
+	new_tbl = rht_dereference_rcu(ht->future_tbl, ht);
+	new_hash = head_hashfn(ht, new_tbl, obj);
 
+	lock_buckets(new_tbl, old_tbl, new_hash);
 restart:
+	hash = rht_bucket_index(tbl, new_hash);
 	pprev = &tbl->buckets[hash];
 	rht_for_each(he, tbl, hash) {
 		if (he != obj) {
@@ -607,18 +593,12 @@ restart:
 	 * resizing. Thus traversing both is fine and the added cost is
 	 * very rare.
 	 */
-	if (tbl != rht_dereference_rcu(ht->future_tbl, ht)) {
-		spin_unlock_bh(lock);
-
-		tbl = rht_dereference_rcu(ht->future_tbl, ht);
-		hash = head_hashfn(ht, tbl, obj);
-
-		lock = bucket_lock(tbl, hash);
-		spin_lock_bh(lock);
+	if (tbl != new_tbl) {
+		tbl = new_tbl;
 		goto restart;
 	}
 
-	spin_unlock_bh(lock);
+	unlock_buckets(new_tbl, old_tbl, new_hash);
 
 	if (ret) {
 		atomic_dec(&ht->nelems);
@@ -774,24 +754,17 @@ bool rhashtable_lookup_compare_insert(struct rhashtable *ht,
 				      void *arg)
 {
 	struct bucket_table *new_tbl, *old_tbl;
-	spinlock_t *new_bucket_lock, *old_bucket_lock;
-	u32 new_hash, old_hash;
+	u32 new_hash;
 	bool success = true;
 
 	BUG_ON(!ht->p.key_len);
 
 	rcu_read_lock();
-
 	old_tbl = rht_dereference_rcu(ht->tbl, ht);
-	old_hash = head_hashfn(ht, old_tbl, obj);
-	old_bucket_lock = bucket_lock(old_tbl, old_hash);
-	spin_lock_bh(old_bucket_lock);
-
 	new_tbl = rht_dereference_rcu(ht->future_tbl, ht);
 	new_hash = head_hashfn(ht, new_tbl, obj);
-	new_bucket_lock = bucket_lock(new_tbl, new_hash);
-	if (unlikely(old_tbl != new_tbl))
-		spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED);
+
+	lock_buckets(new_tbl, old_tbl, new_hash);
 
 	if (rhashtable_lookup_compare(ht, rht_obj(ht, obj) + ht->p.key_offset,
 				      compare, arg)) {
@@ -802,10 +775,7 @@ bool rhashtable_lookup_compare_insert(struct rhashtable *ht,
 	__rhashtable_insert(ht, obj, new_tbl, new_hash);
 
 exit:
-	if (unlikely(old_tbl != new_tbl))
-		spin_unlock_bh(new_bucket_lock);
-	spin_unlock_bh(old_bucket_lock);
-
+	unlock_buckets(new_tbl, old_tbl, new_hash);
 	rcu_read_unlock();
 
 	return success;
-- 
1.9.3


* [PATCH 3/6] rhashtable: Wait for RCU readers after final unzip work
  2015-01-30  0:20 [PATCH 0/6 net-next] rhashtable fixes Thomas Graf
  2015-01-30  0:20 ` [PATCH 1/6] rhashtable: key_hashfn() must return full hash value Thomas Graf
  2015-01-30  0:20 ` [PATCH 2/6] rhashtable: Use a single bucket lock for sibling buckets Thomas Graf
@ 2015-01-30  0:20 ` Thomas Graf
  2015-01-30  0:20 ` [PATCH 4/6] rhashtable: Dump bucket tables on locking violation under PROVE_LOCKING Thomas Graf
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Thomas Graf @ 2015-01-30  0:20 UTC (permalink / raw)
  To: davem; +Cc: netdev, ying.xue

We need to wait for all RCU readers to complete after the last bit of
unzipping has been completed. Otherwise the old table is freed up
prematurely.

Fixes: 7e1e77636e36 ("lib: Resizable, Scalable, Concurrent Hash Table")
Signed-off-by: Thomas Graf <tgraf@suug.ch>
---
 lib/rhashtable.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 17eeabc..85ec36b 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -391,6 +391,8 @@ int rhashtable_expand(struct rhashtable *ht)
 		}
 	}
 
+	synchronize_rcu();
+
 	bucket_table_free(old_tbl);
 	return 0;
 }
-- 
1.9.3


* [PATCH 4/6] rhashtable: Dump bucket tables on locking violation under PROVE_LOCKING
  2015-01-30  0:20 [PATCH 0/6 net-next] rhashtable fixes Thomas Graf
                   ` (2 preceding siblings ...)
  2015-01-30  0:20 ` [PATCH 3/6] rhashtable: Wait for RCU readers after final unzip work Thomas Graf
@ 2015-01-30  0:20 ` Thomas Graf
  2015-01-30  0:20 ` [PATCH 5/6] rhashtable: Add more lock verification Thomas Graf
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Thomas Graf @ 2015-01-30  0:20 UTC (permalink / raw)
  To: davem; +Cc: netdev, ying.xue

Dump the bucket tables when a bucket lock assertion fails. This
simplifies debugging of locking violations if compiled with
CONFIG_PROVE_LOCKING.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
---
 lib/rhashtable.c | 99 ++++++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 75 insertions(+), 24 deletions(-)

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 85ec36b..fa11a2e 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -48,26 +48,6 @@ static spinlock_t *bucket_lock(const struct bucket_table *tbl, u32 hash)
 	return &tbl->locks[hash & tbl->locks_mask];
 }
 
-#define ASSERT_RHT_MUTEX(HT) BUG_ON(!lockdep_rht_mutex_is_held(HT))
-#define ASSERT_BUCKET_LOCK(TBL, HASH) \
-	BUG_ON(!lockdep_rht_bucket_is_held(TBL, HASH))
-
-#ifdef CONFIG_PROVE_LOCKING
-int lockdep_rht_mutex_is_held(struct rhashtable *ht)
-{
-	return (debug_locks) ? lockdep_is_held(&ht->mutex) : 1;
-}
-EXPORT_SYMBOL_GPL(lockdep_rht_mutex_is_held);
-
-int lockdep_rht_bucket_is_held(const struct bucket_table *tbl, u32 hash)
-{
-	spinlock_t *lock = bucket_lock(tbl, hash);
-
-	return (debug_locks) ? lockdep_is_held(lock) : 1;
-}
-EXPORT_SYMBOL_GPL(lockdep_rht_bucket_is_held);
-#endif
-
 static void *rht_obj(const struct rhashtable *ht, const struct rhash_head *he)
 {
 	return (void *) he - ht->p.head_offset;
@@ -103,6 +83,77 @@ static u32 head_hashfn(const struct rhashtable *ht,
 	return rht_bucket_index(tbl, obj_raw_hashfn(ht, rht_obj(ht, he)));
 }
 
+#ifdef CONFIG_PROVE_LOCKING
+static void debug_dump_buckets(const struct rhashtable *ht,
+			       const struct bucket_table *tbl)
+{
+	struct rhash_head *he;
+	unsigned int i, hash;
+
+	for (i = 0; i < tbl->size; i++) {
+		pr_warn(" [Bucket %d] ", i);
+		rht_for_each_rcu(he, tbl, i) {
+			hash = head_hashfn(ht, tbl, he);
+			pr_cont("[hash = %#x, lock = %p] ",
+				hash, bucket_lock(tbl, hash));
+		}
+		pr_cont("\n");
+	}
+
+}
+
+static void debug_dump_table(struct rhashtable *ht,
+			     const struct bucket_table *tbl,
+			     unsigned int hash)
+{
+	struct bucket_table *old_tbl, *future_tbl;
+
+	pr_emerg("BUG: lock for hash %#x in table %p not held\n",
+		 hash, tbl);
+
+	rcu_read_lock();
+	future_tbl = rht_dereference_rcu(ht->future_tbl, ht);
+	old_tbl = rht_dereference_rcu(ht->tbl, ht);
+	if (future_tbl != old_tbl) {
+		pr_warn("Future table %p (size: %zd)\n",
+			future_tbl, future_tbl->size);
+		debug_dump_buckets(ht, future_tbl);
+	}
+
+	pr_warn("Table %p (size: %zd)\n", old_tbl, old_tbl->size);
+	debug_dump_buckets(ht, old_tbl);
+
+	rcu_read_unlock();
+}
+
+#define ASSERT_RHT_MUTEX(HT) BUG_ON(!lockdep_rht_mutex_is_held(HT))
+#define ASSERT_BUCKET_LOCK(HT, TBL, HASH)				\
+	do {								\
+		if (unlikely(!lockdep_rht_bucket_is_held(TBL, HASH))) {	\
+			debug_dump_table(HT, TBL, HASH);		\
+			BUG();						\
+		}							\
+	} while (0)
+
+int lockdep_rht_mutex_is_held(struct rhashtable *ht)
+{
+	return (debug_locks) ? lockdep_is_held(&ht->mutex) : 1;
+}
+EXPORT_SYMBOL_GPL(lockdep_rht_mutex_is_held);
+
+int lockdep_rht_bucket_is_held(const struct bucket_table *tbl, u32 hash)
+{
+	spinlock_t *lock = bucket_lock(tbl, hash);
+
+	return (debug_locks) ? lockdep_is_held(lock) : 1;
+}
+EXPORT_SYMBOL_GPL(lockdep_rht_bucket_is_held);
+#else
+#define ASSERT_RHT_MUTEX(HT)
+#define ASSERT_BUCKET_LOCK(HT, TBL, HASH)
+#endif
+
+
 static struct rhash_head __rcu **bucket_tail(struct bucket_table *tbl, u32 n)
 {
 	struct rhash_head __rcu **pprev;
@@ -234,7 +285,7 @@ static void unlock_buckets(struct bucket_table *new_tbl,
  *
  * Returns true if no more work needs to be performed on the bucket.
  */
-static bool hashtable_chain_unzip(const struct rhashtable *ht,
+static bool hashtable_chain_unzip(struct rhashtable *ht,
 				  const struct bucket_table *new_tbl,
 				  struct bucket_table *old_tbl,
 				  size_t old_hash)
@@ -242,7 +293,7 @@ static bool hashtable_chain_unzip(const struct rhashtable *ht,
 	struct rhash_head *he, *p, *next;
 	unsigned int new_hash, new_hash2;
 
-	ASSERT_BUCKET_LOCK(old_tbl, old_hash);
+	ASSERT_BUCKET_LOCK(ht, old_tbl, old_hash);
 
 	/* Old bucket empty, no work needed. */
 	p = rht_dereference_bucket(old_tbl->buckets[old_hash], old_tbl,
@@ -251,7 +302,7 @@ static bool hashtable_chain_unzip(const struct rhashtable *ht,
 		return false;
 
 	new_hash = head_hashfn(ht, new_tbl, p);
-	ASSERT_BUCKET_LOCK(new_tbl, new_hash);
+	ASSERT_BUCKET_LOCK(ht, new_tbl, new_hash);
 
 	/* Advance the old bucket pointer one or more times until it
 	 * reaches a node that doesn't hash to the same bucket as the
@@ -264,7 +315,7 @@ static bool hashtable_chain_unzip(const struct rhashtable *ht,
 		 * bucket lock. This is ensured because the bucket lock
 		 * hash map ignores the highest bit.
 		 */
-		ASSERT_BUCKET_LOCK(new_tbl, new_hash2);
+		ASSERT_BUCKET_LOCK(ht, new_tbl, new_hash2);
 
 		if (new_hash != new_hash2)
 			break;
-- 
1.9.3


* [PATCH 5/6] rhashtable: Add more lock verification
  2015-01-30  0:20 [PATCH 0/6 net-next] rhashtable fixes Thomas Graf
                   ` (3 preceding siblings ...)
  2015-01-30  0:20 ` [PATCH 4/6] rhashtable: Dump bucket tables on locking violation under PROVE_LOCKING Thomas Graf
@ 2015-01-30  0:20 ` Thomas Graf
  2015-01-30  0:20 ` [PATCH 6/6] rhashtable: Avoid bucket cross reference after removal Thomas Graf
  2015-01-30  9:10 ` [PATCH 0/6 net-next] rhashtable fixes Ying Xue
  6 siblings, 0 replies; 17+ messages in thread
From: Thomas Graf @ 2015-01-30  0:20 UTC (permalink / raw)
  To: davem; +Cc: netdev, ying.xue

Catch hash miscalculations, which result in race conditions that are
hard to track down.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
---
 lib/rhashtable.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index fa11a2e..f21026a 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -347,9 +347,11 @@ static bool hashtable_chain_unzip(struct rhashtable *ht,
 	return !rht_is_a_nulls(p);
 }
 
-static void link_old_to_new(struct bucket_table *new_tbl,
+static void link_old_to_new(struct rhashtable *ht, struct bucket_table *new_tbl,
 			    unsigned int new_hash, struct rhash_head *entry)
 {
+	ASSERT_BUCKET_LOCK(ht, new_tbl, new_hash);
+
 	rcu_assign_pointer(*bucket_tail(new_tbl, new_hash), entry);
 }
 
@@ -405,7 +407,7 @@ int rhashtable_expand(struct rhashtable *ht)
 		lock_buckets(new_tbl, old_tbl, new_hash);
 		rht_for_each(he, old_tbl, old_hash) {
 			if (head_hashfn(ht, new_tbl, he) == new_hash) {
-				link_old_to_new(new_tbl, new_hash, he);
+				link_old_to_new(ht, new_tbl, new_hash, he);
 				break;
 			}
 		}
@@ -494,6 +496,7 @@ int rhashtable_shrink(struct rhashtable *ht)
 
 		rcu_assign_pointer(*bucket_tail(new_tbl, new_hash),
 				   tbl->buckets[new_hash]);
+		ASSERT_BUCKET_LOCK(ht, tbl, new_hash + new_tbl->size);
 		rcu_assign_pointer(*bucket_tail(new_tbl, new_hash),
 				   tbl->buckets[new_hash + new_tbl->size]);
 
@@ -551,6 +554,8 @@ static void __rhashtable_insert(struct rhashtable *ht, struct rhash_head *obj,
 	struct rhash_head *head = rht_dereference_bucket(tbl->buckets[hash],
 							 tbl, hash);
 
+	ASSERT_BUCKET_LOCK(ht, tbl, hash);
+
 	if (rht_is_a_nulls(head))
 		INIT_RHT_NULLS_HEAD(obj->next, ht, hash);
 	else
@@ -635,6 +640,7 @@ restart:
 			continue;
 		}
 
+		ASSERT_BUCKET_LOCK(ht, tbl, hash);
 		rcu_assign_pointer(*pprev, obj->next);
 
 		ret = true;
-- 
1.9.3


* [PATCH 6/6] rhashtable: Avoid bucket cross reference after removal
  2015-01-30  0:20 [PATCH 0/6 net-next] rhashtable fixes Thomas Graf
                   ` (4 preceding siblings ...)
  2015-01-30  0:20 ` [PATCH 5/6] rhashtable: Add more lock verification Thomas Graf
@ 2015-01-30  0:20 ` Thomas Graf
  2015-01-30  9:10 ` [PATCH 0/6 net-next] rhashtable fixes Ying Xue
  6 siblings, 0 replies; 17+ messages in thread
From: Thomas Graf @ 2015-01-30  0:20 UTC (permalink / raw)
  To: davem; +Cc: netdev, ying.xue

During a resize, two buckets in the larger table map to a single
bucket in the smaller table, and the new table has already been
(partially) linked to the old table. Removing an element may then
leave a bucket in the larger table pointing to entries which all
hash to a value different from the bucket index, causing two buckets
to point to the same sub chain after unzipping. This is legal
*during* the resize phase but not after it has completed.

Keep the old table around until all of the unzipping is done to
allow the removal code to only search for matching hashed entries
during this special period.
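
The removal rule described above can be sketched on a plain
NULL-terminated list. This is a simplified user-space model, not the
kernel code: toy_node and toy_remove_during_resize() are hypothetical,
the real chains end in a nulls marker rather than NULL, and no locking
or RCU is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical chain entry carrying its full hash. */
struct toy_node {
	uint32_t hash;
	struct toy_node *next;
};

/*
 * Unlink obj from the chain of `bucket` in a table with `size` buckets.
 * Instead of pointing the predecessor at obj->next, skip ahead to the
 * next entry that actually hashes to `bucket`, so the bucket never ends
 * up pointing only at entries that belong to its sibling.
 * Returns 1 if obj was found and unlinked, 0 otherwise.
 */
static int toy_remove_during_resize(struct toy_node **head,
				    struct toy_node *obj,
				    uint32_t bucket, uint32_t size)
{
	struct toy_node **pprev = head;
	struct toy_node *he, *he2;

	assert(size != 0 && (size & (size - 1)) == 0);

	for (he = *head; he != NULL; he = he->next) {
		if (he != obj) {
			pprev = &he->next;
			continue;
		}

		/* Find the next entry that hashes to this bucket. */
		for (he2 = he->next; he2 != NULL; he2 = he2->next) {
			if ((he2->hash & (size - 1)) == bucket)
				break;
		}

		/* he2 is NULL if no such entry exists: terminate here. */
		*pprev = he2;
		return 1;
	}

	return 0;
}
```

For a chain 1 -> 5 -> 9 seen through bucket 1 of an 8-bucket table,
removing the first entry leaves the bucket pointing at the entry with
hash 9 and skips the entry with hash 5, which belongs to bucket 5.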

Reported-by: Ying Xue <ying.xue@windriver.com>
Fixes: 97defe1ecf86 ("rhashtable: Per bucket locks & deferred expansion/shrinking")
Signed-off-by: Thomas Graf <tgraf@suug.ch>
---
 lib/rhashtable.c | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index f21026a..74b9284 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -414,12 +414,6 @@ int rhashtable_expand(struct rhashtable *ht)
 		unlock_buckets(new_tbl, old_tbl, new_hash);
 	}
 
-	/* Publish the new table pointer. Lookups may now traverse
-	 * the new table, but they will not benefit from any
-	 * additional efficiency until later steps unzip the buckets.
-	 */
-	rcu_assign_pointer(ht->tbl, new_tbl);
-
 	/* Unzip interleaved hash chains */
 	while (!complete && !ht->being_destroyed) {
 		/* Wait for readers. All new readers will see the new
@@ -445,6 +439,7 @@ int rhashtable_expand(struct rhashtable *ht)
 	}
 
 	synchronize_rcu();
+	rcu_assign_pointer(ht->tbl, new_tbl);
 
 	bucket_table_free(old_tbl);
 	return 0;
@@ -621,7 +616,7 @@ bool rhashtable_remove(struct rhashtable *ht, struct rhash_head *obj)
 {
 	struct bucket_table *tbl, *new_tbl, *old_tbl;
 	struct rhash_head __rcu **pprev;
-	struct rhash_head *he;
+	struct rhash_head *he, *he2;
 	unsigned int hash, new_hash;
 	bool ret = false;
 
@@ -641,8 +636,21 @@ restart:
 		}
 
 		ASSERT_BUCKET_LOCK(ht, tbl, hash);
-		rcu_assign_pointer(*pprev, obj->next);
 
+		if (unlikely(new_tbl != tbl)) {
+			rht_for_each_continue(he2, he->next, tbl, hash) {
+				if (head_hashfn(ht, tbl, he2) == hash) {
+					rcu_assign_pointer(*pprev, he2);
+					goto found;
+				}
+			}
+
+			INIT_RHT_NULLS_HEAD(*pprev, ht, hash);
+		} else {
+			rcu_assign_pointer(*pprev, obj->next);
+		}
+
+found:
 		ret = true;
 		break;
 	}
-- 
1.9.3


* Re: [PATCH 0/6 net-next] rhashtable fixes
  2015-01-30  0:20 [PATCH 0/6 net-next] rhashtable fixes Thomas Graf
                   ` (5 preceding siblings ...)
  2015-01-30  0:20 ` [PATCH 6/6] rhashtable: Avoid bucket cross reference after removal Thomas Graf
@ 2015-01-30  9:10 ` Ying Xue
  2015-01-30  9:29   ` Thomas Graf
  6 siblings, 1 reply; 17+ messages in thread
From: Ying Xue @ 2015-01-30  9:10 UTC (permalink / raw)
  To: Thomas Graf, davem; +Cc: netdev

Hi Thomas,

I made sure that my local net-next tree is synchronized to the latest
version, which already contains commit
fe6a043c535acfec8f8e554536c87923dcb45097 ("rhashtable:
rhashtable_remove() must unlink in both tbl and future_tbl"), and then
manually applied the whole patch series. But when I repeatedly run the
test case I originally posted, a soft lockup happens. Please see the
relevant log:

[  115.776178] ------------[ cut here ]------------
[  115.776548] WARNING: CPU: 4 PID: 0 at net/sched/sch_generic.c:303 dev_watchdog+0x247/0x250()
[  115.777106] NETDEV WATCHDOG: eth0 (e1000): transmit queue 0 timed out
[  115.777533] Modules linked in: tipc
[  115.777790] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 3.19.0-rc6+ #182
[  115.778221] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  115.778602]  000000000000012f ffff880017d03d08 ffffffff8175cd25 0000000000001052
[  115.779133]  ffff880017d03d58 ffff880017d03d48 ffffffff81059717 ffffffff00000000
[  115.779661]  ffff880015c26000 ffff880015c263e0 ffff880015f07000 0000000000000001
[  115.780165] Call Trace:
[  115.780165]  <IRQ>  [<ffffffff8175cd25>] dump_stack+0x4c/0x65
[  115.780165]  [<ffffffff81059717>] warn_slowpath_common+0x97/0xe0
[  115.780165]  [<ffffffff81059816>] warn_slowpath_fmt+0x46/0x50
[  115.780165]  [<ffffffff81654537>] dev_watchdog+0x247/0x250
[  115.780165]  [<ffffffff816542f0>] ? pfifo_fast_dequeue+0xe0/0xe0
[  115.780165]  [<ffffffff816542f0>] ? pfifo_fast_dequeue+0xe0/0xe0
[  115.780165]  [<ffffffff810c5ebc>] call_timer_fn+0x8c/0x1e0
[  115.780165]  [<ffffffff810c5e35>] ? call_timer_fn+0x5/0x1e0
[  115.780165]  [<ffffffff817663b0>] ? _raw_spin_unlock_irq+0x30/0x40
[  115.780165]  [<ffffffff816542f0>] ? pfifo_fast_dequeue+0xe0/0xe0
[  115.780165]  [<ffffffff810c7994>] run_timer_softirq+0x2d4/0x320
[  115.780165]  [<ffffffff810d6214>] ? clockevents_program_event+0x74/0x100
[  115.780165]  [<ffffffff8105d653>] __do_softirq+0x123/0x360
[  115.780165]  [<ffffffff8105db2e>] irq_exit+0x8e/0xb0
[  115.780165]  [<ffffffff8176952a>] smp_apic_timer_interrupt+0x4a/0x60
[  115.780165]  [<ffffffff817678af>] apic_timer_interrupt+0x6f/0x80
[  115.780165]  <EOI>  [<ffffffff8100d174>] ? default_idle+0x24/0x100
[  115.780165]  [<ffffffff8100d172>] ? default_idle+0x22/0x100
[  115.780165]  [<ffffffff8100daaf>] arch_cpu_idle+0xf/0x20
[  115.780165]  [<ffffffff8109a509>] cpu_startup_entry+0x2c9/0x3c0
[  115.780165]  [<ffffffff810d5ea2>] ? clockevents_register_device+0xe2/0x140
[  115.780165]  [<ffffffff810333c1>] start_secondary+0x141/0x150
[  115.780165] ---[ end trace 62da3388fe54379b ]---
[  115.780165] e1000 0000:00:03.0 eth0: Reset adapter
[  116.724005] NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [bind_netlink:631]
[  116.724005] Modules linked in: tipc
[  116.724005] irq event stamp: 331179
[  116.724005] hardirqs last  enabled at (331178): [<ffffffff817675e0>] restore_args+0x0/0x30
[  116.724005] hardirqs last disabled at (331179): [<ffffffff817678aa>] apic_timer_interrupt+0x6a/0x80
[  116.724005] softirqs last  enabled at (318168): [<ffffffff8161897f>] lock_sock_nested+0x4f/0xc0
[  116.724005] softirqs last disabled at (318170): [<ffffffff8139d79a>] lock_buckets+0x3a/0x80
[  116.724005] CPU: 7 PID: 631 Comm: bind_netlink Tainted: G        W    3.19.0-rc6+ #182
[  116.724005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  116.724005] task: ffff880013970000 ti: ffff8800105d0000 task.ti: ffff8800105d0000
[  116.724005] RIP: 0010:[<ffffffff8165d820>]  [<ffffffff8165d820>] netlink_compare+0x10/0x30
[  116.724005] RSP: 0018:ffff8800105d3cf0  EFLAGS: 00000293
[  116.724005] RAX: 0000000000000000 RBX: ffffffff817675e0 RCX: 00000000dbaee169
[  116.724005] RDX: 0000000000001668 RSI: ffff8800105d3db8 RDI: ffff88001651b800
[  116.724005] RBP: ffff8800105d3d48 R08: 00000000df2b8827 R09: ffff880010c07468
[  116.724005] R10: 0000000000000003 R11: 0000000000000001 R12: ffff8800105d3c68
[  116.724005] R13: 0000000000000046 R14: ffff8800105d0000 R15: ffff880013970000
[  116.724005] FS:  0000000000000000(0000) GS:ffff880017dc0000(0063) knlGS:00000000f75ae900
[  116.724005] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  116.724005] CR2: 0000000008760000 CR3: 0000000015121000 CR4: 00000000000006e0
[  116.724005] Stack:
[  116.724005]  ffffffff8139de4e ffffffff8139ddc0 000000028105d9c8 07ca27d400000000
[  116.724005]  ffff8800152d13c0 ffff8800105d3d48 ffff880016500000 ffff8800152d13c0
[  116.724005]  0000000000000004 ffff8800152d13c0 ffff880010c07608 ffff8800105d3da8
[  116.724005] Call Trace:
[  116.724005]  [<ffffffff8139de4e>] ? rhashtable_lookup_compare+0x8e/0x120
[  116.724005]  [<ffffffff8139ddc0>] ? rhashtable_remove+0x200/0x200
[  116.724005]  [<ffffffff8139dfaf>] rhashtable_lookup_compare_insert+0x9f/0x110
[  116.724005]  [<ffffffff8139df45>] ? rhashtable_lookup_compare_insert+0x35/0x110
[  116.724005]  [<ffffffff8165d810>] ? netlink_overrun+0x50/0x50
[  116.724005]  [<ffffffff8165e0f3>] ? netlink_insert+0x43/0xf0
[  116.724005]  [<ffffffff8165e141>] netlink_insert+0x91/0xf0
[  116.724005]  [<ffffffff81660470>] netlink_bind+0x210/0x260
[  116.724005]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  116.724005]  [<ffffffff810a5b10>] ? lock_release_non_nested+0xa0/0x340
[  116.724005]  [<ffffffff81616d24>] SYSC_bind+0xa4/0xc0
[  116.724005]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  116.724005]  [<ffffffff8161701e>] SyS_bind+0xe/0x10
[  116.724005]  [<ffffffff81652318>] compat_SyS_socketcall+0xa8/0x200
[  116.724005]  [<ffffffff81768df3>] sysenter_dispatch+0x7/0x1f
[  116.724005]  [<ffffffff8139660e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  116.724005] Code: f0 ff 83 98 01 00 00 48 83 c4 08 5b 5d c3 66 66 66
66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 31 c0 8b 56 08 39 97 68 04
00 00 <55> 48 89 e5 74 0a 5d c3 0f 1f 84 00 00 00 00 00 48 8b 47 30 48
[  144.724006] NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s!
[bind_netlink:631]
[  144.724006] Modules linked in: tipc
[  144.724006] irq event stamp: 345169
[  144.724006] hardirqs last  enabled at (345168): [<ffffffff817675e0>]
restore_args+0x0/0x30
[  144.724006] hardirqs last disabled at (345169): [<ffffffff817678aa>]
apic_timer_interrupt+0x6a/0x80
[  144.724006] softirqs last  enabled at (318168): [<ffffffff8161897f>]
lock_sock_nested+0x4f/0xc0
[  144.724006] softirqs last disabled at (318170): [<ffffffff8139d79a>]
lock_buckets+0x3a/0x80
[  144.724006] CPU: 7 PID: 631 Comm: bind_netlink Tainted: G        W
 L 3.19.0-rc6+ #182
[  144.724006] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  144.724006] task: ffff880013970000 ti: ffff8800105d0000 task.ti:
ffff8800105d0000
[  144.724006] RIP: 0010:[<ffffffff8165d826>]  [<ffffffff8165d826>]
netlink_compare+0x16/0x30
[  144.724006] RSP: 0018:ffff8800105d3ce8  EFLAGS: 00000293
[  144.724006] RAX: 0000000000000000 RBX: ffff8800105d3c68 RCX:
00000000dbaee169
[  144.724006] RDX: 0000000000001668 RSI: ffff8800105d3db8 RDI:
ffff88001651b800
[  144.724006] RBP: ffff8800105d3ce8 R08: 00000000df2b8827 R09:
ffff880010c07468
[  144.724006] R10: 0000000000000003 R11: 0000000000000001 R12:
ffff8800105d0000
[  144.724006] R13: ffff880013970000 R14: 0000000000000000 R15:
0000000000000001
[  144.724006] FS:  0000000000000000(0000) GS:ffff880017dc0000(0063)
knlGS:00000000f75ae900
[  144.724006] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  144.724006] CR2: 0000000008760000 CR3: 0000000015121000 CR4:
00000000000006e0
[  144.724006] Stack:
[  144.724006]  ffff8800105d3d48 ffffffff8139de4e ffffffff8139ddc0
000000028105d9c8
[  144.724006]  07ca27d400000000 ffff8800152d13c0 ffff8800105d3d48
ffff880016500000
[  144.724006]  ffff8800152d13c0 0000000000000004 ffff8800152d13c0
ffff880010c07608
[  144.724006] Call Trace:
[  144.724006]  [<ffffffff8139de4e>] rhashtable_lookup_compare+0x8e/0x120
[  144.724006]  [<ffffffff8139ddc0>] ? rhashtable_remove+0x200/0x200
[  144.724006]  [<ffffffff8139dfaf>] rhashtable_lookup_compare_insert+0x9f/0x110
[  144.724006]  [<ffffffff8139df45>] ? rhashtable_lookup_compare_insert+0x35/0x110
[  144.724006]  [<ffffffff8165d810>] ? netlink_overrun+0x50/0x50
[  144.724006]  [<ffffffff8165e0f3>] ? netlink_insert+0x43/0xf0
[  144.724006]  [<ffffffff8165e141>] netlink_insert+0x91/0xf0
[  144.724006]  [<ffffffff81660470>] netlink_bind+0x210/0x260
[  144.724006]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  144.724006]  [<ffffffff810a5b10>] ? lock_release_non_nested+0xa0/0x340
[  144.724006]  [<ffffffff81616d24>] SYSC_bind+0xa4/0xc0
[  144.724006]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  144.724006]  [<ffffffff8161701e>] SyS_bind+0xe/0x10
[  144.724006]  [<ffffffff81652318>] compat_SyS_socketcall+0xa8/0x200
[  144.724006]  [<ffffffff81768df3>] sysenter_dispatch+0x7/0x1f
[  144.724006]  [<ffffffff8139660e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  144.724006] Code: 00 48 83 c4 08 5b 5d c3 66 66 66 66 2e 0f 1f 84 00
00 00 00 00 0f 1f 44 00 00 31 c0 8b 56 08 39 97 68 04 00 00 55 48 89 e5
74 0a <5d> c3 0f 1f 84 00 00 00 00 00 48 8b 47 30 48 39 06 5d 0f 94 c0
[  150.732004] INFO: rcu_sched self-detected stall on CPU { 7}  (t=15000
jiffies g=674 c=673 q=18)
[  150.732004] Task dump for CPU 7:
[  150.732004] bind_netlink    R  running task        0   631    561
0x20020008
[  150.732004]  0000000000000231 ffff880017dc3d68 ffffffff81086c26
ffffffff81086b88
[  150.732004]  00000000a5f9a5f8 0000000000000007 ffffffff81c53940
ffff880017dc3d88
[  150.732004]  ffffffff8108a57f ffffffff81c53940 ffffffff81c53940
ffff880017dc3db8
[  150.732004] Call Trace:
[  150.732004]  <IRQ>  [<ffffffff81086c26>] sched_show_task+0x106/0x170
[  150.732004]  [<ffffffff81086b88>] ? sched_show_task+0x68/0x170
[  150.732004]  [<ffffffff8108a57f>] dump_cpu_task+0x3f/0x50
[  150.732004]  [<ffffffff810bfa7b>] rcu_dump_cpu_stacks+0x8b/0xc0
[  150.732004]  [<ffffffff810c33d0>] rcu_check_callbacks+0x480/0x6d0
[  150.732004]  [<ffffffff810a133d>] ? trace_hardirqs_off+0xd/0x10
[  150.732004]  [<ffffffff810c8408>] update_process_times+0x38/0x70
[  150.732004]  [<ffffffff810d8623>] tick_sched_handle.isra.15+0x33/0x70
[  150.732004]  [<ffffffff810d88cb>] tick_sched_timer+0x4b/0x80
[  150.732004]  [<ffffffff810c8d8b>] __run_hrtimer+0x9b/0x290
[  150.732004]  [<ffffffff810d8880>] ? tick_sched_do_timer+0x40/0x40
[  150.732004]  [<ffffffff810c95e4>] ? hrtimer_interrupt+0x74/0x260
[  150.732004]  [<ffffffff810c9677>] hrtimer_interrupt+0x107/0x260
[  150.732004]  [<ffffffff8165d810>] ? netlink_overrun+0x50/0x50
[  150.732004]  [<ffffffff81034cb9>] local_apic_timer_interrupt+0x39/0x60
[  150.732004]  [<ffffffff81769525>] smp_apic_timer_interrupt+0x45/0x60
[  150.732004]  [<ffffffff817678af>] apic_timer_interrupt+0x6f/0x80
[  150.732004]  <EOI>  [<ffffffff817675e0>] ? retint_restore_args+0xe/0xe
[  150.732004]  [<ffffffff8165d817>] ? netlink_compare+0x7/0x30
[  150.732004]  [<ffffffff8139de4e>] ? rhashtable_lookup_compare+0x8e/0x120
[  150.732004]  [<ffffffff8139ddc0>] ? rhashtable_remove+0x200/0x200
[  150.732004]  [<ffffffff8139dfaf>] rhashtable_lookup_compare_insert+0x9f/0x110
[  150.732004]  [<ffffffff8139df45>] ? rhashtable_lookup_compare_insert+0x35/0x110
[  150.732004]  [<ffffffff8165d810>] ? netlink_overrun+0x50/0x50
[  150.732004]  [<ffffffff8165e0f3>] ? netlink_insert+0x43/0xf0
[  150.732004]  [<ffffffff8165e141>] netlink_insert+0x91/0xf0
[  150.732004]  [<ffffffff81660470>] netlink_bind+0x210/0x260
[  150.732004]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  150.732004]  [<ffffffff810a5b10>] ? lock_release_non_nested+0xa0/0x340
[  150.732004]  [<ffffffff81616d24>] SYSC_bind+0xa4/0xc0
[  150.732004]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  150.732004]  [<ffffffff8161701e>] SyS_bind+0xe/0x10
[  150.732004]  [<ffffffff81652318>] compat_SyS_socketcall+0xa8/0x200
[  150.732004]  [<ffffffff81768df3>] sysenter_dispatch+0x7/0x1f
[  150.732004]  [<ffffffff8139660e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  176.724005] NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s!
[bind_netlink:631]
[  176.724005] Modules linked in: tipc
[  176.724005] irq event stamp: 361143
[  176.724005] hardirqs last  enabled at (361142): [<ffffffff817675e0>]
restore_args+0x0/0x30
[  176.724005] hardirqs last disabled at (361143): [<ffffffff817678aa>]
apic_timer_interrupt+0x6a/0x80
[  176.724005] softirqs last  enabled at (318168): [<ffffffff8161897f>]
lock_sock_nested+0x4f/0xc0
[  176.724005] softirqs last disabled at (318170): [<ffffffff8139d79a>]
lock_buckets+0x3a/0x80
[  176.724005] CPU: 7 PID: 631 Comm: bind_netlink Tainted: G        W
 L 3.19.0-rc6+ #182
[  176.724005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  176.724005] task: ffff880013970000 ti: ffff8800105d0000 task.ti:
ffff8800105d0000
[  176.724005] RIP: 0010:[<ffffffff8139de55>]  [<ffffffff8139de55>]
rhashtable_lookup_compare+0x95/0x120
[  176.724005] RSP: 0018:ffff8800105d3cf8  EFLAGS: 00000246
[  176.724005] RAX: 0000000000000000 RBX: 0000000000000003 RCX:
00000000dbaee169
[  176.724005] RDX: 0000000000001668 RSI: ffff8800105d3db8 RDI:
ffff88001651b800
[  176.724005] RBP: ffff8800105d3d48 R08: 00000000df2b8827 R09:
ffff880010c07468
[  176.724005] R10: 0000000000000003 R11: 0000000000000001 R12:
ffffffff817675e0
[  176.724005] R13: ffffffff810c2d18 R14: ffff8800105d3c58 R15:
0000000000000046
[  176.724005] FS:  0000000000000000(0000) GS:ffff880017dc0000(0063)
knlGS:00000000f75ae900
[  176.724005] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  176.724005] CR2: 0000000008760000 CR3: 0000000015121000 CR4:
00000000000006e0
[  176.724005] Stack:
[  176.724005]  ffffffff8139ddc0 000000028105d9c8 07ca27d400000000
ffff8800152d13c0
[  176.724005]  ffff8800105d3d48 ffff880016500000 ffff8800152d13c0
0000000000000004
[  176.724005]  ffff8800152d13c0 ffff880010c07608 ffff8800105d3da8
ffffffff8139dfaf
[  176.724005] Call Trace:
[  176.724005]  [<ffffffff8139ddc0>] ? rhashtable_remove+0x200/0x200
[  176.724005]  [<ffffffff8139dfaf>] rhashtable_lookup_compare_insert+0x9f/0x110
[  176.724005]  [<ffffffff8139df45>] ? rhashtable_lookup_compare_insert+0x35/0x110
[  176.724005]  [<ffffffff8165d810>] ? netlink_overrun+0x50/0x50
[  176.724005]  [<ffffffff8165e0f3>] ? netlink_insert+0x43/0xf0
[  176.724005]  [<ffffffff8165e141>] netlink_insert+0x91/0xf0
[  176.724005]  [<ffffffff81660470>] netlink_bind+0x210/0x260
[  176.724005]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  176.724005]  [<ffffffff810a5b10>] ? lock_release_non_nested+0xa0/0x340
[  176.724005]  [<ffffffff81616d24>] SYSC_bind+0xa4/0xc0
[  176.724005]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  176.724005]  [<ffffffff8161701e>] SyS_bind+0xe/0x10
[  176.724005]  [<ffffffff81652318>] compat_SyS_socketcall+0xa8/0x200
[  176.724005]  [<ffffffff81768df3>] sysenter_dispatch+0x7/0x1f
[  176.724005]  [<ffffffff8139660e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  176.724005] Code: 8b 02 83 e8 01 23 45 c4 48 83 c0 02 4c 8b 74 c2 08
41 f6 c6 01 75 1a 4c 89 f7 48 2b 7b 30 4c 89 ee 41 ff d4 84 c0 75 46 4d
8b 36 <41> f6 c6 01 74 e6 4c 39 7d c8 75 6c 48 c7 c2 61 de 39 81 be 01
[  204.724005] NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s!
[bind_netlink:631]
[  204.724005] Modules linked in: tipc
[  204.724005] irq event stamp: 375131
[  204.724005] hardirqs last  enabled at (375130): [<ffffffff817675e0>]
restore_args+0x0/0x30
[  204.724005] hardirqs last disabled at (375131): [<ffffffff817678aa>]
apic_timer_interrupt+0x6a/0x80
[  204.724005] softirqs last  enabled at (318168): [<ffffffff8161897f>]
lock_sock_nested+0x4f/0xc0
[  204.724005] softirqs last disabled at (318170): [<ffffffff8139d79a>]
lock_buckets+0x3a/0x80
[  204.724005] CPU: 7 PID: 631 Comm: bind_netlink Tainted: G        W
 L 3.19.0-rc6+ #182
[  204.724005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  204.724005] task: ffff880013970000 ti: ffff8800105d0000 task.ti:
ffff8800105d0000
[  204.724005] RIP: 0010:[<ffffffff8139de4b>]  [<ffffffff8139de4b>]
rhashtable_lookup_compare+0x8b/0x120
[  204.724005] RSP: 0018:ffff8800105d3cf8  EFLAGS: 00000286
[  204.724005] RAX: 0000000000000000 RBX: ffffffff817675e0 RCX:
00000000dbaee169
[  204.724005] RDX: 0000000000001668 RSI: ffff8800105d3db8 RDI:
ffff88001651b800
[  204.724005] RBP: ffff8800105d3d48 R08: 00000000df2b8827 R09:
ffff880010c07468
[  204.724005] R10: 0000000000000003 R11: 0000000000000001 R12:
ffff8800105d3c68
[  204.724005] R13: 0000000000000046 R14: ffff8800105d0000 R15:
ffff880013970000
[  204.724005] FS:  0000000000000000(0000) GS:ffff880017dc0000(0063)
knlGS:00000000f75ae900
[  204.724005] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  204.724005] CR2: 0000000008760000 CR3: 0000000015121000 CR4:
00000000000006e0
[  204.724005] Stack:
[  204.724005]  ffffffff8139ddc0 000000028105d9c8 07ca27d400000000
ffff8800152d13c0
[  204.724005]  ffff8800105d3d48 ffff880016500000 ffff8800152d13c0
0000000000000004
[  204.724005]  ffff8800152d13c0 ffff880010c07608 ffff8800105d3da8
ffffffff8139dfaf
[  204.724005] Call Trace:
[  204.724005]  [<ffffffff8139ddc0>] ? rhashtable_remove+0x200/0x200
[  204.724005]  [<ffffffff8139dfaf>] rhashtable_lookup_compare_insert+0x9f/0x110
[  204.724005]  [<ffffffff8139df45>] ? rhashtable_lookup_compare_insert+0x35/0x110
[  204.724005]  [<ffffffff8165d810>] ? netlink_overrun+0x50/0x50
[  204.724005]  [<ffffffff8165e0f3>] ? netlink_insert+0x43/0xf0
[  204.724005]  [<ffffffff8165e141>] netlink_insert+0x91/0xf0
[  204.724005]  [<ffffffff81660470>] netlink_bind+0x210/0x260
[  204.724005]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  204.724005]  [<ffffffff810a5b10>] ? lock_release_non_nested+0xa0/0x340
[  204.724005]  [<ffffffff81616d24>] SYSC_bind+0xa4/0xc0
[  204.724005]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  204.724005]  [<ffffffff8161701e>] SyS_bind+0xe/0x10
[  204.724005]  [<ffffffff81652318>] compat_SyS_socketcall+0xa8/0x200
[  204.724005]  [<ffffffff81768df3>] sysenter_dispatch+0x7/0x1f
[  204.724005]  [<ffffffff8139660e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  204.724005] Code: e8 05 89 45 c4 48 8b 55 c8 48 8b 02 83 e8 01 23 45
c4 48 83 c0 02 4c 8b 74 c2 08 41 f6 c6 01 75 1a 4c 89 f7 48 2b 7b 30 4c
89 ee <41> ff d4 84 c0 75 46 4d 8b 36 41 f6 c6 01 74 e6 4c 39 7d c8 75
[  232.724005] NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s!
[bind_netlink:631]
[  232.724005] Modules linked in: tipc
[  232.724005] irq event stamp: 389119
[  232.724005] hardirqs last  enabled at (389118): [<ffffffff817675e0>]
restore_args+0x0/0x30
[  232.724005] hardirqs last disabled at (389119): [<ffffffff817678aa>]
apic_timer_interrupt+0x6a/0x80
[  232.724005] softirqs last  enabled at (318168): [<ffffffff8161897f>]
lock_sock_nested+0x4f/0xc0
[  232.724005] softirqs last disabled at (318170): [<ffffffff8139d79a>]
lock_buckets+0x3a/0x80
[  232.724005] CPU: 7 PID: 631 Comm: bind_netlink Tainted: G        W
 L 3.19.0-rc6+ #182
[  232.724005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  232.724005] task: ffff880013970000 ti: ffff8800105d0000 task.ti:
ffff8800105d0000
[  232.724005] RIP: 0010:[<ffffffff8165d820>]  [<ffffffff8165d820>]
netlink_compare+0x10/0x30
[  232.724005] RSP: 0018:ffff8800105d3cf0  EFLAGS: 00000293
[  232.724005] RAX: 0000000000000000 RBX: ffffffff817675e0 RCX:
00000000dbaee169
[  232.724005] RDX: 0000000000001668 RSI: ffff8800105d3db8 RDI:
ffff88001651b800
[  232.724005] RBP: ffff8800105d3d48 R08: 00000000df2b8827 R09:
ffff880010c07468
[  232.724005] R10: 0000000000000003 R11: 0000000000000001 R12:
ffff8800105d3c68
[  232.724005] R13: 0000000000000046 R14: ffff8800105d0000 R15:
ffff880013970000
[  232.724005] FS:  0000000000000000(0000) GS:ffff880017dc0000(0063)
knlGS:00000000f75ae900
[  232.724005] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  232.724005] CR2: 0000000008760000 CR3: 0000000015121000 CR4:
00000000000006e0
[  232.724005] Stack:
[  232.724005]  ffffffff8139de4e ffffffff8139ddc0 000000028105d9c8
07ca27d400000000
[  232.724005]  ffff8800152d13c0 ffff8800105d3d48 ffff880016500000
ffff8800152d13c0
[  232.724005]  0000000000000004 ffff8800152d13c0 ffff880010c07608
ffff8800105d3da8
[  232.724005] Call Trace:
[  232.724005]  [<ffffffff8139de4e>] ? rhashtable_lookup_compare+0x8e/0x120
[  232.724005]  [<ffffffff8139ddc0>] ? rhashtable_remove+0x200/0x200
[  232.724005]  [<ffffffff8139dfaf>] rhashtable_lookup_compare_insert+0x9f/0x110
[  232.724005]  [<ffffffff8139df45>] ? rhashtable_lookup_compare_insert+0x35/0x110
[  232.724005]  [<ffffffff8165d810>] ? netlink_overrun+0x50/0x50
[  232.724005]  [<ffffffff8165e0f3>] ? netlink_insert+0x43/0xf0
[  232.724005]  [<ffffffff8165e141>] netlink_insert+0x91/0xf0
[  232.724005]  [<ffffffff81660470>] netlink_bind+0x210/0x260
[  232.724005]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  232.724005]  [<ffffffff810a5b10>] ? lock_release_non_nested+0xa0/0x340
[  232.724005]  [<ffffffff81616d24>] SYSC_bind+0xa4/0xc0
[  232.724005]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  232.724005]  [<ffffffff8161701e>] SyS_bind+0xe/0x10
[  232.724005]  [<ffffffff81652318>] compat_SyS_socketcall+0xa8/0x200
[  232.724005]  [<ffffffff81768df3>] sysenter_dispatch+0x7/0x1f
[  232.724005]  [<ffffffff8139660e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  232.724005] Code: f0 ff 83 98 01 00 00 48 83 c4 08 5b 5d c3 66 66 66
66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 31 c0 8b 56 08 39 97 68 04
00 00 <55> 48 89 e5 74 0a 5d c3 0f 1f 84 00 00 00 00 00 48 8b 47 30 48
[  240.772081] INFO: task kworker/7:1:70 blocked for more than 120 seconds.
[  240.773035]       Tainted: G        W    L 3.19.0-rc6+ #182
[  240.773811] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  240.774897] kworker/7:1     D ffff8800167c3af8     0    70      2
0x00000000
[  240.775940] Workqueue: events rht_deferred_worker
[  240.776676]  ffff8800167c3af8 0000000000001ffb ffff8800167b8000
00000000000139c0
[  240.777763]  ffff8800167c3fd8 0000000000000000 00000000000139c0
ffff880013970000
[  240.778845]  ffff8800167b8000 0000000000000000 ffff8800167c3c58
7fffffffffffffff
[  240.779928] Call Trace:
[  240.780318]  [<ffffffff81760879>] schedule+0x29/0x70
[  240.781014]  [<ffffffff81764de5>] schedule_timeout+0x1d5/0x230
[  240.781826]  [<ffffffff810a68fa>] ? mark_held_locks+0x6a/0x90
[  240.782622]  [<ffffffff817663b0>] ? _raw_spin_unlock_irq+0x30/0x40
[  240.783459]  [<ffffffff810a6a25>] ? trace_hardirqs_on_caller+0x105/0x1d0
[  240.784483]  [<ffffffff81761d1b>] wait_for_completion+0xbb/0x120
[  240.785322]  [<ffffffff81087910>] ? try_to_wake_up+0x3c0/0x3c0
[  240.786179]  [<ffffffff810c12f0>] ? __call_rcu.constprop.61+0x260/0x260
[  240.787122]  [<ffffffff810bea9d>] wait_rcu_gp+0x4d/0x60
[  240.787906]  [<ffffffff810beab0>] ? wait_rcu_gp+0x60/0x60
[  240.788721]  [<ffffffff810c1cfd>] synchronize_sched+0x5d/0x70
[  240.789538]  [<ffffffff8139e653>] rhashtable_shrink+0x113/0x150
[  240.790385]  [<ffffffff8139ea30>] rht_deferred_worker+0x80/0xa0
[  240.791235]  [<ffffffff81073449>] process_one_work+0x1b9/0x530
[  240.792100]  [<ffffffff810733d2>] ? process_one_work+0x142/0x530
[  240.792960]  [<ffffffff81073c5f>] worker_thread+0x11f/0x480
[  240.793740]  [<ffffffff81073b40>] ? rescuer_thread+0x340/0x340
[  240.794552]  [<ffffffff81079b4f>] kthread+0xef/0x110
[  240.795268]  [<ffffffff81079a60>] ? flush_kthread_worker+0xf0/0xf0
[  240.796185]  [<ffffffff817668ec>] ret_from_fork+0x7c/0xb0
[  240.796938]  [<ffffffff81079a60>] ? flush_kthread_worker+0xf0/0xf0
[  240.797785] 3 locks held by kworker/7:1/70:
[  240.798411]  #0:  ("events"){.+.+.+}, at: [<ffffffff810733d2>]
process_one_work+0x142/0x530
[  240.799699]  #1:  ((&ht->run_work)){+.+.+.}, at: [<ffffffff810733d2>]
process_one_work+0x142/0x530
[  240.801082]  #2:  (&ht->mutex){+.+.+.}, at: [<ffffffff8139e9dd>]
rht_deferred_worker+0x2d/0xa0
[  260.724007] NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s!
[bind_netlink:631]
[  260.724007] Modules linked in: tipc
[  260.724007] irq event stamp: 403107
[  260.724007] hardirqs last  enabled at (403106): [<ffffffff817675e0>]
restore_args+0x0/0x30
[  260.724007] hardirqs last disabled at (403107): [<ffffffff817678aa>]
apic_timer_interrupt+0x6a/0x80
[  260.724007] softirqs last  enabled at (318168): [<ffffffff8161897f>]
lock_sock_nested+0x4f/0xc0
[  260.724007] softirqs last disabled at (318170): [<ffffffff8139d79a>]
lock_buckets+0x3a/0x80
[  260.724007] CPU: 7 PID: 631 Comm: bind_netlink Tainted: G        W
 L 3.19.0-rc6+ #182
[  260.724007] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  260.724007] task: ffff880013970000 ti: ffff8800105d0000 task.ti:
ffff8800105d0000
[  260.724007] RIP: 0010:[<ffffffff8165d826>]  [<ffffffff8165d826>]
netlink_compare+0x16/0x30
[  260.724007] RSP: 0018:ffff8800105d3ce8  EFLAGS: 00000293
[  260.724007] RAX: 0000000000000000 RBX: ffff8800105d3c68 RCX:
00000000dbaee169
[  260.724007] RDX: 0000000000001668 RSI: ffff8800105d3db8 RDI:
ffff88001651b800
[  260.724007] RBP: ffff8800105d3ce8 R08: 00000000df2b8827 R09:
ffff880010c07468
[  260.724007] R10: 0000000000000003 R11: 0000000000000001 R12:
ffff8800105d0000
[  260.724007] R13: ffff880013970000 R14: 0000000000000000 R15:
0000000000000001
[  260.724007] FS:  0000000000000000(0000) GS:ffff880017dc0000(0063)
knlGS:00000000f75ae900
[  260.724007] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  260.724007] CR2: 0000000008760000 CR3: 0000000015121000 CR4:
00000000000006e0
[  260.724007] Stack:
[  260.724007]  ffff8800105d3d48 ffffffff8139de4e ffffffff8139ddc0
000000028105d9c8
[  260.724007]  07ca27d400000000 ffff8800152d13c0 ffff8800105d3d48
ffff880016500000
[  260.724007]  ffff8800152d13c0 0000000000000004 ffff8800152d13c0
ffff880010c07608
[  260.724007] Call Trace:
[  260.724007]  [<ffffffff8139de4e>] rhashtable_lookup_compare+0x8e/0x120
[  260.724007]  [<ffffffff8139ddc0>] ? rhashtable_remove+0x200/0x200
[  260.724007]  [<ffffffff8139dfaf>] rhashtable_lookup_compare_insert+0x9f/0x110
[  260.724007]  [<ffffffff8139df45>] ? rhashtable_lookup_compare_insert+0x35/0x110
[  260.724007]  [<ffffffff8165d810>] ? netlink_overrun+0x50/0x50
[  260.724007]  [<ffffffff8165e0f3>] ? netlink_insert+0x43/0xf0
[  260.724007]  [<ffffffff8165e141>] netlink_insert+0x91/0xf0
[  260.724007]  [<ffffffff81660470>] netlink_bind+0x210/0x260
[  260.724007]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  260.724007]  [<ffffffff810a5b10>] ? lock_release_non_nested+0xa0/0x340
[  260.724007]  [<ffffffff81616d24>] SYSC_bind+0xa4/0xc0
[  260.724007]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  260.724007]  [<ffffffff8161701e>] SyS_bind+0xe/0x10
[  260.724007]  [<ffffffff81652318>] compat_SyS_socketcall+0xa8/0x200
[  260.724007]  [<ffffffff81768df3>] sysenter_dispatch+0x7/0x1f
[  260.724007]  [<ffffffff8139660e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  260.724007] Code: 00 48 83 c4 08 5b 5d c3 66 66 66 66 2e 0f 1f 84 00
00 00 00 00 0f 1f 44 00 00 31 c0 8b 56 08 39 97 68 04 00 00 55 48 89 e5
74 0a <5d> c3 0f 1f 84 00 00 00 00 00 48 8b 47 30 48 39 06 5d 0f 94 c0
[  288.724004] NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s!
[bind_netlink:631]
[  288.724004] Modules linked in: tipc
[  288.724004] irq event stamp: 417095
[  288.724004] hardirqs last  enabled at (417094): [<ffffffff817675e0>]
restore_args+0x0/0x30
[  288.724004] hardirqs last disabled at (417095): [<ffffffff817678aa>]
apic_timer_interrupt+0x6a/0x80
[  288.724004] softirqs last  enabled at (318168): [<ffffffff8161897f>]
lock_sock_nested+0x4f/0xc0
[  288.724004] softirqs last disabled at (318170): [<ffffffff8139d79a>]
lock_buckets+0x3a/0x80
[  288.724004] CPU: 7 PID: 631 Comm: bind_netlink Tainted: G        W
 L 3.19.0-rc6+ #182
[  288.724004] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  288.724004] task: ffff880013970000 ti: ffff8800105d0000 task.ti:
ffff8800105d0000
[  288.724004] RIP: 0010:[<ffffffff8139de4b>]  [<ffffffff8139de4b>]
rhashtable_lookup_compare+0x8b/0x120
[  288.724004] RSP: 0018:ffff8800105d3cf8  EFLAGS: 00000286
[  288.724004] RAX: 0000000000000000 RBX: ffffffff817675e0 RCX:
00000000dbaee169
[  288.724004] RDX: 0000000000001668 RSI: ffff8800105d3db8 RDI:
ffff88001651b800
[  288.724004] RBP: ffff8800105d3d48 R08: 00000000df2b8827 R09:
ffff880010c07468
[  288.724004] R10: 0000000000000003 R11: 0000000000000001 R12:
ffff8800105d3c68
[  288.724004] R13: 0000000000000046 R14: ffff8800105d0000 R15:
ffff880013970000
[  288.724004] FS:  0000000000000000(0000) GS:ffff880017dc0000(0063)
knlGS:00000000f75ae900
[  288.724004] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  288.724004] CR2: 0000000008760000 CR3: 0000000015121000 CR4:
00000000000006e0
[  288.724004] Stack:
[  288.724004]  ffffffff8139ddc0 000000028105d9c8 07ca27d400000000
ffff8800152d13c0
[  288.724004]  ffff8800105d3d48 ffff880016500000 ffff8800152d13c0
0000000000000004
[  288.724004]  ffff8800152d13c0 ffff880010c07608 ffff8800105d3da8
ffffffff8139dfaf
[  288.724004] Call Trace:
[  288.724004]  [<ffffffff8139ddc0>] ? rhashtable_remove+0x200/0x200
[  288.724004]  [<ffffffff8139dfaf>] rhashtable_lookup_compare_insert+0x9f/0x110
[  288.724004]  [<ffffffff8139df45>] ? rhashtable_lookup_compare_insert+0x35/0x110
[  288.724004]  [<ffffffff8165d810>] ? netlink_overrun+0x50/0x50
[  288.724004]  [<ffffffff8165e0f3>] ? netlink_insert+0x43/0xf0
[  288.724004]  [<ffffffff8165e141>] netlink_insert+0x91/0xf0
[  288.724004]  [<ffffffff81660470>] netlink_bind+0x210/0x260
[  288.724004]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  288.724004]  [<ffffffff810a5b10>] ? lock_release_non_nested+0xa0/0x340
[  288.724004]  [<ffffffff81616d24>] SYSC_bind+0xa4/0xc0
[  288.724004]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  288.724004]  [<ffffffff8161701e>] SyS_bind+0xe/0x10
[  288.724004]  [<ffffffff81652318>] compat_SyS_socketcall+0xa8/0x200
[  288.724004]  [<ffffffff81768df3>] sysenter_dispatch+0x7/0x1f
[  288.724004]  [<ffffffff8139660e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  288.724004] Code: e8 05 89 45 c4 48 8b 55 c8 48 8b 02 83 e8 01 23 45
c4 48 83 c0 02 4c 8b 74 c2 08 41 f6 c6 01 75 1a 4c 89 f7 48 2b 7b 30 4c
89 ee <41> ff d4 84 c0 75 46 4d 8b 36 41 f6 c6 01 74 e6 4c 39 7d c8 75
[  316.724005] NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s!
[bind_netlink:631]
[  316.724005] Modules linked in: tipc
[  316.724005] irq event stamp: 431083
[  316.724005] hardirqs last  enabled at (431082): [<ffffffff817675e0>]
restore_args+0x0/0x30
[  316.724005] hardirqs last disabled at (431083): [<ffffffff817678aa>]
apic_timer_interrupt+0x6a/0x80
[  316.724005] softirqs last  enabled at (318168): [<ffffffff8161897f>]
lock_sock_nested+0x4f/0xc0
[  316.724005] softirqs last disabled at (318170): [<ffffffff8139d79a>]
lock_buckets+0x3a/0x80
[  316.724005] CPU: 7 PID: 631 Comm: bind_netlink Tainted: G        W
 L 3.19.0-rc6+ #182
[  316.724005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  316.724005] task: ffff880013970000 ti: ffff8800105d0000 task.ti:
ffff8800105d0000
[  316.724005] RIP: 0010:[<ffffffff8139de4b>]  [<ffffffff8139de4b>]
rhashtable_lookup_compare+0x8b/0x120
[  316.724005] RSP: 0018:ffff8800105d3cf8  EFLAGS: 00000286
[  316.724005] RAX: 0000000000000000 RBX: ffffffff817675e0 RCX:
00000000dbaee169
[  316.724005] RDX: 0000000000001668 RSI: ffff8800105d3db8 RDI:
ffff88001651b800
[  316.724005] RBP: ffff8800105d3d48 R08: 00000000df2b8827 R09:
ffff880010c07468
[  316.724005] R10: 0000000000000003 R11: 0000000000000001 R12:
ffff8800105d3c68
[  316.724005] R13: 0000000000000046 R14: ffff8800105d0000 R15:
ffff880013970000
[  316.724005] FS:  0000000000000000(0000) GS:ffff880017dc0000(0063)
knlGS:00000000f75ae900
[  316.724005] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  316.724005] CR2: 0000000008760000 CR3: 0000000015121000 CR4:
00000000000006e0
[  316.724005] Stack:
[  316.724005]  ffffffff8139ddc0 000000028105d9c8 07ca27d400000000
ffff8800152d13c0
[  316.724005]  ffff8800105d3d48 ffff880016500000 ffff8800152d13c0
0000000000000004
[  316.724005]  ffff8800152d13c0 ffff880010c07608 ffff8800105d3da8
ffffffff8139dfaf
[  316.724005] Call Trace:
[  316.724005]  [<ffffffff8139ddc0>] ? rhashtable_remove+0x200/0x200
[  316.724005]  [<ffffffff8139dfaf>] rhashtable_lookup_compare_insert+0x9f/0x110
[  316.724005]  [<ffffffff8139df45>] ? rhashtable_lookup_compare_insert+0x35/0x110
[  316.724005]  [<ffffffff8165d810>] ? netlink_overrun+0x50/0x50
[  316.724005]  [<ffffffff8165e0f3>] ? netlink_insert+0x43/0xf0
[  316.724005]  [<ffffffff8165e141>] netlink_insert+0x91/0xf0
[  316.724005]  [<ffffffff81660470>] netlink_bind+0x210/0x260
[  316.724005]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  316.724005]  [<ffffffff810a5b10>] ? lock_release_non_nested+0xa0/0x340
[  316.724005]  [<ffffffff81616d24>] SYSC_bind+0xa4/0xc0
[  316.724005]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  316.724005]  [<ffffffff8161701e>] SyS_bind+0xe/0x10
[  316.724005]  [<ffffffff81652318>] compat_SyS_socketcall+0xa8/0x200
[  316.724005]  [<ffffffff81768df3>] sysenter_dispatch+0x7/0x1f
[  316.724005]  [<ffffffff8139660e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  316.724005] Code: e8 05 89 45 c4 48 8b 55 c8 48 8b 02 83 e8 01 23 45
c4 48 83 c0 02 4c 8b 74 c2 08 41 f6 c6 01 75 1a 4c 89 f7 48 2b 7b 30 4c
89 ee <41> ff d4 84 c0 75 46 4d 8b 36 41 f6 c6 01 74 e6 4c 39 7d c8 75
[  330.748004] INFO: rcu_sched self-detected stall on CPU { 7}  (t=60004
jiffies g=674 c=673 q=100)
[  330.748004] Task dump for CPU 7:
[  330.748004] bind_netlink    R  running task        0   631    561
0x20020008
[  330.748004]  0000000000000231 ffff880017dc3d68 ffffffff81086c26
ffffffff81086b88
[  330.748004]  00000000f75af759 0000000000000007 ffffffff81c53940
ffff880017dc3d88
[  330.748004]  ffffffff8108a57f ffffffff81c53940 ffffffff81c53940
ffff880017dc3db8
[  330.748004] Call Trace:
[  330.748004]  <IRQ>  [<ffffffff81086c26>] sched_show_task+0x106/0x170
[  330.748004]  [<ffffffff81086b88>] ? sched_show_task+0x68/0x170
[  330.748004]  [<ffffffff8108a57f>] dump_cpu_task+0x3f/0x50
[  330.748004]  [<ffffffff810bfa7b>] rcu_dump_cpu_stacks+0x8b/0xc0
[  330.748004]  [<ffffffff810c33d0>] rcu_check_callbacks+0x480/0x6d0
[  330.748004]  [<ffffffff810a133d>] ? trace_hardirqs_off+0xd/0x10
[  330.748004]  [<ffffffff810c8408>] update_process_times+0x38/0x70
[  330.748004]  [<ffffffff810d8623>] tick_sched_handle.isra.15+0x33/0x70
[  330.748004]  [<ffffffff810d88cb>] tick_sched_timer+0x4b/0x80
[  330.748004]  [<ffffffff810c8d8b>] __run_hrtimer+0x9b/0x290
[  330.748004]  [<ffffffff810d8880>] ? tick_sched_do_timer+0x40/0x40
[  330.748004]  [<ffffffff810c95e4>] ? hrtimer_interrupt+0x74/0x260
[  330.748004]  [<ffffffff810c9677>] hrtimer_interrupt+0x107/0x260
[  330.748004]  [<ffffffff8165d810>] ? netlink_overrun+0x50/0x50
[  330.748004]  [<ffffffff81034cb9>] local_apic_timer_interrupt+0x39/0x60
[  330.748004]  [<ffffffff81769525>] smp_apic_timer_interrupt+0x45/0x60
[  330.748004]  [<ffffffff817678af>] apic_timer_interrupt+0x6f/0x80
[  330.748004]  <EOI>  [<ffffffff817675e0>] ? retint_restore_args+0xe/0xe
[  330.748004]  [<ffffffff8165d817>] ? netlink_compare+0x7/0x30
[  330.748004]  [<ffffffff8139de4e>] ? rhashtable_lookup_compare+0x8e/0x120
[  330.748004]  [<ffffffff8139ddc0>] ? rhashtable_remove+0x200/0x200
[  330.748004]  [<ffffffff8139dfaf>] rhashtable_lookup_compare_insert+0x9f/0x110
[  330.748004]  [<ffffffff8139df45>] ? rhashtable_lookup_compare_insert+0x35/0x110
[  330.748004]  [<ffffffff8165d810>] ? netlink_overrun+0x50/0x50
[  330.748004]  [<ffffffff8165e0f3>] ? netlink_insert+0x43/0xf0
[  330.748004]  [<ffffffff8165e141>] netlink_insert+0x91/0xf0
[  330.748004]  [<ffffffff81660470>] netlink_bind+0x210/0x260
[  330.748004]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  330.748004]  [<ffffffff810a5b10>] ? lock_release_non_nested+0xa0/0x340
[  330.748004]  [<ffffffff81616d24>] SYSC_bind+0xa4/0xc0
[  330.748004]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  330.748004]  [<ffffffff8161701e>] SyS_bind+0xe/0x10
[  330.748004]  [<ffffffff81652318>] compat_SyS_socketcall+0xa8/0x200
[  330.748004]  [<ffffffff81768df3>] sysenter_dispatch+0x7/0x1f
[  330.748004]  [<ffffffff8139660e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  330.771015] INFO: rcu_sched detected stalls on CPUs/tasks: { 7}
(detected by 1, t=60009 jiffies, g=674, c=673, q=100)
[  330.772011] Task dump for CPU 7:
[  330.772011] bind_netlink    R  running task        0   631    561
0x20020008
[  330.772011]  0000000000000000 0000000000000000 ffff8800105d3f28
0000000000000002
[  330.772011]  00000000ffdfca20 000000000000000c ffff8800105d3f28
ffffffff8161701e
[  330.772011]  ffff8800105d3f78 ffffffff81652318 0000000000000001
ffdfca38000013f3
[  330.772011] Call Trace:
[  330.772011]  [<ffffffff8161701e>] ? SyS_bind+0xe/0x10
[  330.772011]  [<ffffffff81652318>] ? compat_SyS_socketcall+0xa8/0x200
[  330.772011]  [<ffffffff81768df3>] ? sysenter_dispatch+0x7/0x1f
[  330.772011]  [<ffffffff8139660e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  356.724013] NMI watchdog: BUG: soft lockup - CPU#7 stuck for 23s!
[bind_netlink:631]
[  356.724013] Modules linked in: tipc
[  356.724013] irq event stamp: 451061
[  356.724013] hardirqs last  enabled at (451060): [<ffffffff817675e0>]
restore_args+0x0/0x30
[  356.724013] hardirqs last disabled at (451061): [<ffffffff817678aa>]
apic_timer_interrupt+0x6a/0x80
[  356.724013] softirqs last  enabled at (318168): [<ffffffff8161897f>]
lock_sock_nested+0x4f/0xc0
[  356.724013] softirqs last disabled at (318170): [<ffffffff8139d79a>]
lock_buckets+0x3a/0x80
[  356.724013] CPU: 7 PID: 631 Comm: bind_netlink Tainted: G        W
 L 3.19.0-rc6+ #182
[  356.724013] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  356.724013] task: ffff880013970000 ti: ffff8800105d0000 task.ti:
ffff8800105d0000
[  356.724013] RIP: 0010:[<ffffffff8139de55>]  [<ffffffff8139de55>]
rhashtable_lookup_compare+0x95/0x120
[  356.724013] RSP: 0018:ffff8800105d3cf8  EFLAGS: 00000246
[  356.724013] RAX: 0000000000000000 RBX: ffffffff817675e0 RCX:
00000000dbaee169
[  356.724013] RDX: 0000000000001668 RSI: ffff8800105d3db8 RDI:
ffff88001651b800
[  356.724013] RBP: ffff8800105d3d48 R08: 00000000df2b8827 R09:
ffff880010c07468
[  356.724013] R10: 0000000000000003 R11: 0000000000000001 R12:
ffff8800105d3c68
[  356.724013] R13: 0000000000000046 R14: ffff8800105d0000 R15:
ffff880013970000
[  356.724013] FS:  0000000000000000(0000) GS:ffff880017dc0000(0063)
knlGS:00000000f75ae900
[  356.724013] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  356.724013] CR2: 0000000008760000 CR3: 0000000015121000 CR4:
00000000000006e0
[  356.724013] Stack:
[  356.724013]  ffffffff8139ddc0 000000028105d9c8 07ca27d400000000
ffff8800152d13c0
[  356.724013]  ffff8800105d3d48 ffff880016500000 ffff8800152d13c0
0000000000000004
[  356.724013]  ffff8800152d13c0 ffff880010c07608 ffff8800105d3da8
ffffffff8139dfaf
[  356.724013] Call Trace:
[  356.724013]  [<ffffffff8139ddc0>] ? rhashtable_remove+0x200/0x200
[  356.724013]  [<ffffffff8139dfaf>]
rhashtable_lookup_compare_insert+0x9f/0x110
[  356.724013]  [<ffffffff8139df45>] ?
rhashtable_lookup_compare_insert+0x35/0x110
[  356.724013]  [<ffffffff8165d810>] ? netlink_overrun+0x50/0x50
[  356.724013]  [<ffffffff8165e0f3>] ? netlink_insert+0x43/0xf0
[  356.724013]  [<ffffffff8165e141>] netlink_insert+0x91/0xf0
[  356.724013]  [<ffffffff81660470>] netlink_bind+0x210/0x260
[  356.724013]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  356.724013]  [<ffffffff810a5b10>] ? lock_release_non_nested+0xa0/0x340
[  356.724013]  [<ffffffff81616d24>] SYSC_bind+0xa4/0xc0
[  356.724013]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  356.724013]  [<ffffffff8161701e>] SyS_bind+0xe/0x10
[  356.724013]  [<ffffffff81652318>] compat_SyS_socketcall+0xa8/0x200
[  356.724013]  [<ffffffff81768df3>] sysenter_dispatch+0x7/0x1f
[  356.724013]  [<ffffffff8139660e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  356.724013] Code: 8b 02 83 e8 01 23 45 c4 48 83 c0 02 4c 8b 74 c2 08
41 f6 c6 01 75 1a 4c 89 f7 48 2b 7b 30 4c 89 ee 41 ff d4 84 c0 75 46 4d
8b 36 <41> f6 c6 01 74 e6 4c 39 7d c8 75 6c 48 c7 c2 61 de 39 81 be 01
[  360.800040] INFO: task kworker/7:1:70 blocked for more than 120 seconds.
[  360.800687]       Tainted: G        W    L 3.19.0-rc6+ #182
[  360.801176] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  360.801858] kworker/7:1     D ffff8800167c3af8     0    70      2
0x00000000
[  360.802495] Workqueue: events rht_deferred_worker
[  360.802943]  ffff8800167c3af8 0000000000001ffb ffff8800167b8000
00000000000139c0
[  360.803624]  ffff8800167c3fd8 0000000000000000 00000000000139c0
ffff880013970000
[  360.804388]  ffff8800167b8000 0000000000000000 ffff8800167c3c58
7fffffffffffffff
[  360.805071] Call Trace:
[  360.805295]  [<ffffffff81760879>] schedule+0x29/0x70
[  360.805751]  [<ffffffff81764de5>] schedule_timeout+0x1d5/0x230
[  360.806262]  [<ffffffff810a68fa>] ? mark_held_locks+0x6a/0x90
[  360.806764]  [<ffffffff817663b0>] ? _raw_spin_unlock_irq+0x30/0x40
[  360.807323]  [<ffffffff810a6a25>] ? trace_hardirqs_on_caller+0x105/0x1d0
[  360.808109]  [<ffffffff81761d1b>] wait_for_completion+0xbb/0x120
[  360.808653]  [<ffffffff81087910>] ? try_to_wake_up+0x3c0/0x3c0
[  360.809178]  [<ffffffff810c12f0>] ? __call_rcu.constprop.61+0x260/0x260
[  360.809772]  [<ffffffff810bea9d>] wait_rcu_gp+0x4d/0x60
[  360.810246]  [<ffffffff810beab0>] ? wait_rcu_gp+0x60/0x60
[  360.810733]  [<ffffffff810c1cfd>] synchronize_sched+0x5d/0x70
[  360.811358]  [<ffffffff8139e653>] rhashtable_shrink+0x113/0x150
[  360.811876]  [<ffffffff8139ea30>] rht_deferred_worker+0x80/0xa0
[  360.812424]  [<ffffffff81073449>] process_one_work+0x1b9/0x530
[  360.812957]  [<ffffffff810733d2>] ? process_one_work+0x142/0x530
[  360.813481]  [<ffffffff81073c5f>] worker_thread+0x11f/0x480
[  360.813969]  [<ffffffff81073b40>] ? rescuer_thread+0x340/0x340
[  360.814480]  [<ffffffff81079b4f>] kthread+0xef/0x110
[  360.814919]  [<ffffffff81079a60>] ? flush_kthread_worker+0xf0/0xf0
[  360.815459]  [<ffffffff817668ec>] ret_from_fork+0x7c/0xb0
[  360.815945]  [<ffffffff81079a60>] ? flush_kthread_worker+0xf0/0xf0
[  360.816523] 3 locks held by kworker/7:1/70:
[  360.816893]  #0:  ("events"){.+.+.+}, at: [<ffffffff810733d2>]
process_one_work+0x142/0x530
[  360.817696]  #1:  ((&ht->run_work)){+.+.+.}, at: [<ffffffff810733d2>]
process_one_work+0x142/0x530
[  360.818527]  #2:  (&ht->mutex){+.+.+.}, at: [<ffffffff8139e9dd>]
rht_deferred_worker+0x2d/0xa0
[  360.819349] INFO: task jbd2/sda-8:116 blocked for more than 120 seconds.
[  360.819943]       Tainted: G        W    L 3.19.0-rc6+ #182
[  360.820472] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  360.821157] jbd2/sda-8      D ffff8800159a3af8     0   116      2
0x00000000
[  360.821832]  ffff8800159a3af8 ffff8800159a3aa8 ffff8800159c2110
00000000000139c0
[  360.822515]  ffff8800159a3fd8 ffff8800159a3ab8 00000000000139c0
ffff880015a64220
[  360.823222]  ffff8800159c2110 ffffffff81118113 ffff880017d14330
ffff8800159c2110
[  360.823907] Call Trace:
[  360.824162]  [<ffffffff81118113>] ? __delayacct_blkio_start+0x23/0x30
[  360.824817]  [<ffffffff81761250>] ? bit_wait_timeout+0x80/0x80
[  360.825397]  [<ffffffff81760879>] schedule+0x29/0x70
[  360.825863]  [<ffffffff8176094e>] io_schedule+0x8e/0xd0
[  360.826406]  [<ffffffff8176127c>] bit_wait_io+0x2c/0x50
[  360.826966]  [<ffffffff81760fe5>] __wait_on_bit+0x65/0x90
[  360.827458]  [<ffffffff811f83bc>] ? _submit_bh+0x11c/0x150
[  360.827952]  [<ffffffff81761250>] ? bit_wait_timeout+0x80/0x80
[  360.828488]  [<ffffffff8176113c>] out_of_line_wait_on_bit+0x7c/0x90
[  360.829056]  [<ffffffff81099f30>] ? wake_atomic_t_function+0x40/0x40
[  360.829628]  [<ffffffff811f64a9>] __wait_on_buffer+0x49/0x50
[  360.830147]  [<ffffffff812bce65>]
jbd2_journal_commit_transaction+0x16c5/0x1be0
[  360.830806]  [<ffffffff810a6afd>] ? trace_hardirqs_on+0xd/0x10
[  360.831319]  [<ffffffff810c6335>] ? del_timer_sync+0x5/0xd0
[  360.831811]  [<ffffffff812c0a21>] kjournald2+0xc1/0x280
[  360.832331]  [<ffffffff81099eb0>] ? prepare_to_wait_event+0x120/0x120
[  360.832897]  [<ffffffff812c0960>] ? commit_timeout+0x10/0x10
[  360.833416]  [<ffffffff81079b4f>] kthread+0xef/0x110
[  360.833854]  [<ffffffff81079a60>] ? flush_kthread_worker+0xf0/0xf0
[  360.834398]  [<ffffffff817668ec>] ret_from_fork+0x7c/0xb0
[  360.834873]  [<ffffffff81079a60>] ? flush_kthread_worker+0xf0/0xf0
[  360.835413] no locks held by jbd2/sda-8/116.
[  384.724008] NMI watchdog: BUG: soft lockup - CPU#7 stuck for 23s!
[bind_netlink:631]
[  384.724008] Modules linked in: tipc
[  384.724008] irq event stamp: 465049
[  384.724008] hardirqs last  enabled at (465048): [<ffffffff817675e0>]
restore_args+0x0/0x30
[  384.724008] hardirqs last disabled at (465049): [<ffffffff817678aa>]
apic_timer_interrupt+0x6a/0x80
[  384.724008] softirqs last  enabled at (318168): [<ffffffff8161897f>]
lock_sock_nested+0x4f/0xc0
[  384.724008] softirqs last disabled at (318170): [<ffffffff8139d79a>]
lock_buckets+0x3a/0x80
[  384.724008] CPU: 7 PID: 631 Comm: bind_netlink Tainted: G        W
 L 3.19.0-rc6+ #182
[  384.724008] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  384.724008] task: ffff880013970000 ti: ffff8800105d0000 task.ti:
ffff8800105d0000
[  384.724008] RIP: 0010:[<ffffffff8165d810>]  [<ffffffff8165d810>]
netlink_compare+0x0/0x30
[  384.724008] RSP: 0018:ffff8800105d3cf0  EFLAGS: 00000286
[  384.724008] RAX: 0000000000000000 RBX: ffffffff817675e0 RCX:
00000000dbaee169
[  384.724008] RDX: 0000000000001668 RSI: ffff8800105d3db8 RDI:
ffff88001651b800
[  384.724008] RBP: ffff8800105d3d48 R08: 00000000df2b8827 R09:
ffff880010c07468
[  384.724008] R10: 0000000000000003 R11: 0000000000000001 R12:
ffff8800105d3c68
[  384.724008] R13: 0000000000000046 R14: ffff8800105d0000 R15:
ffff880013970000
[  384.724008] FS:  0000000000000000(0000) GS:ffff880017dc0000(0063)
knlGS:00000000f75ae900
[  384.724008] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  384.724008] CR2: 0000000008760000 CR3: 0000000015121000 CR4:
00000000000006e0
[  384.724008] Stack:
[  384.724008]  ffffffff8139de4e ffffffff8139ddc0 000000028105d9c8
07ca27d400000000
[  384.724008]  ffff8800152d13c0 ffff8800105d3d48 ffff880016500000
ffff8800152d13c0
[  384.724008]  0000000000000004 ffff8800152d13c0 ffff880010c07608
ffff8800105d3da8
[  384.724008] Call Trace:
[  384.724008]  [<ffffffff8139de4e>] ? rhashtable_lookup_compare+0x8e/0x120
[  384.724008]  [<ffffffff8139ddc0>] ? rhashtable_remove+0x200/0x200
[  384.724008]  [<ffffffff8139dfaf>]
rhashtable_lookup_compare_insert+0x9f/0x110
[  384.724008]  [<ffffffff8139df45>] ?
rhashtable_lookup_compare_insert+0x35/0x110
[  384.724008]  [<ffffffff8165d810>] ? netlink_overrun+0x50/0x50
[  384.724008]  [<ffffffff8165e0f3>] ? netlink_insert+0x43/0xf0
[  384.724008]  [<ffffffff8165e141>] netlink_insert+0x91/0xf0
[  384.724008]  [<ffffffff81660470>] netlink_bind+0x210/0x260
[  384.724008]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  384.724008]  [<ffffffff810a5b10>] ? lock_release_non_nested+0xa0/0x340
[  384.724008]  [<ffffffff81616d24>] SYSC_bind+0xa4/0xc0
[  384.724008]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  384.724008]  [<ffffffff8161701e>] SyS_bind+0xe/0x10
[  384.724008]  [<ffffffff81652318>] compat_SyS_socketcall+0xa8/0x200
[  384.724008]  [<ffffffff81768df3>] sysenter_dispatch+0x7/0x1f
[  384.724008]  [<ffffffff8139660e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  384.724008] Code: c7 87 30 03 00 00 69 00 00 00 ff 93 50 04 00 00 f0
ff 83 98 01 00 00 48 83 c4 08 5b 5d c3 66 66 66 66 2e 0f 1f 84 00 00 00
00 00 <0f> 1f 44 00 00 31 c0 8b 56 08 39 97 68 04 00 00 55 48 89 e5 74
[  412.724006] NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s!
[bind_netlink:631]
[  412.724006] Modules linked in: tipc
[  412.724006] irq event stamp: 479037
[  412.724006] hardirqs last  enabled at (479036): [<ffffffff817675e0>]
restore_args+0x0/0x30
[  412.724006] hardirqs last disabled at (479037): [<ffffffff817678aa>]
apic_timer_interrupt+0x6a/0x80
[  412.724006] softirqs last  enabled at (318168): [<ffffffff8161897f>]
lock_sock_nested+0x4f/0xc0
[  412.724006] softirqs last disabled at (318170): [<ffffffff8139d79a>]
lock_buckets+0x3a/0x80
[  412.724006] CPU: 7 PID: 631 Comm: bind_netlink Tainted: G        W
 L 3.19.0-rc6+ #182
[  412.724006] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  412.724006] task: ffff880013970000 ti: ffff8800105d0000 task.ti:
ffff8800105d0000
[  412.724006] RIP: 0010:[<ffffffff8165d820>]  [<ffffffff8165d820>]
netlink_compare+0x10/0x30
[  412.724006] RSP: 0018:ffff8800105d3cf0  EFLAGS: 00000293
[  412.724006] RAX: 0000000000000000 RBX: 0000000000000003 RCX:
00000000dbaee169
[  412.724006] RDX: 0000000000001668 RSI: ffff8800105d3db8 RDI:
ffff88001651b800
[  412.724006] RBP: ffff8800105d3d48 R08: 00000000df2b8827 R09:
ffff880010c07468
[  412.724006] R10: 0000000000000003 R11: 0000000000000001 R12:
ffffffff817675e0
[  412.724006] R13: ffffffff810c2d18 R14: ffff8800105d3c58 R15:
0000000000000046
[  412.724006] FS:  0000000000000000(0000) GS:ffff880017dc0000(0063)
knlGS:00000000f75ae900
[  412.724006] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  412.724006] CR2: 0000000008760000 CR3: 0000000015121000 CR4:
00000000000006e0
[  412.724006] Stack:
[  412.724006]  ffffffff8139de4e ffffffff8139ddc0 000000028105d9c8
07ca27d400000000
[  412.724006]  ffff8800152d13c0 ffff8800105d3d48 ffff880016500000
ffff8800152d13c0
[  412.724006]  0000000000000004 ffff8800152d13c0 ffff880010c07608
ffff8800105d3da8
[  412.724006] Call Trace:
[  412.724006]  [<ffffffff8139de4e>] ? rhashtable_lookup_compare+0x8e/0x120
[  412.724006]  [<ffffffff8139ddc0>] ? rhashtable_remove+0x200/0x200
[  412.724006]  [<ffffffff8139dfaf>]
rhashtable_lookup_compare_insert+0x9f/0x110
[  412.724006]  [<ffffffff8139df45>] ?
rhashtable_lookup_compare_insert+0x35/0x110
[  412.724006]  [<ffffffff8165d810>] ? netlink_overrun+0x50/0x50
[  412.724006]  [<ffffffff8165e0f3>] ? netlink_insert+0x43/0xf0
[  412.724006]  [<ffffffff8165e141>] netlink_insert+0x91/0xf0
[  412.724006]  [<ffffffff81660470>] netlink_bind+0x210/0x260
[  412.724006]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  412.724006]  [<ffffffff810a5b10>] ? lock_release_non_nested+0xa0/0x340
[  412.724006]  [<ffffffff81616d24>] SYSC_bind+0xa4/0xc0
[  412.724006]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  412.724006]  [<ffffffff8161701e>] SyS_bind+0xe/0x10
[  412.724006]  [<ffffffff81652318>] compat_SyS_socketcall+0xa8/0x200
[  412.724006]  [<ffffffff81768df3>] sysenter_dispatch+0x7/0x1f
[  412.724006]  [<ffffffff8139660e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  412.724006] Code: f0 ff 83 98 01 00 00 48 83 c4 08 5b 5d c3 66 66 66
66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 31 c0 8b 56 08 39 97 68 04
00 00 <55> 48 89 e5 74 0a 5d c3 0f 1f 84 00 00 00 00 00 48 8b 47 30 48
[  440.724005] NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s!
[bind_netlink:631]
[  440.724005] Modules linked in: tipc
[  440.724005] irq event stamp: 493025
[  440.724005] hardirqs last  enabled at (493024): [<ffffffff817675e0>]
restore_args+0x0/0x30
[  440.724005] hardirqs last disabled at (493025): [<ffffffff817678aa>]
apic_timer_interrupt+0x6a/0x80
[  440.724005] softirqs last  enabled at (318168): [<ffffffff8161897f>]
lock_sock_nested+0x4f/0xc0
[  440.724005] softirqs last disabled at (318170): [<ffffffff8139d79a>]
lock_buckets+0x3a/0x80
[  440.724005] CPU: 7 PID: 631 Comm: bind_netlink Tainted: G        W
 L 3.19.0-rc6+ #182
[  440.724005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  440.724005] task: ffff880013970000 ti: ffff8800105d0000 task.ti:
ffff8800105d0000
[  440.724005] RIP: 0010:[<ffffffff8165d820>]  [<ffffffff8165d820>]
netlink_compare+0x10/0x30
[  440.724005] RSP: 0018:ffff8800105d3cf0  EFLAGS: 00000293
[  440.724005] RAX: 0000000000000000 RBX: ffffffff817675e0 RCX:
00000000dbaee169
[  440.724005] RDX: 0000000000001668 RSI: ffff8800105d3db8 RDI:
ffff88001651b800
[  440.724005] RBP: ffff8800105d3d48 R08: 00000000df2b8827 R09:
ffff880010c07468
[  440.724005] R10: 0000000000000003 R11: 0000000000000001 R12:
ffff8800105d3c68
[  440.724005] R13: 0000000000000046 R14: ffff8800105d0000 R15:
ffff880013970000
[  440.724005] FS:  0000000000000000(0000) GS:ffff880017dc0000(0063)
knlGS:00000000f75ae900
[  440.724005] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  440.724005] CR2: 0000000008760000 CR3: 0000000015121000 CR4:
00000000000006e0
[  440.724005] Stack:
[  440.724005]  ffffffff8139de4e ffffffff8139ddc0 000000028105d9c8
07ca27d400000000
[  440.724005]  ffff8800152d13c0 ffff8800105d3d48 ffff880016500000
ffff8800152d13c0
[  440.724005]  0000000000000004 ffff8800152d13c0 ffff880010c07608
ffff8800105d3da8
[  440.724005] Call Trace:
[  440.724005]  [<ffffffff8139de4e>] ? rhashtable_lookup_compare+0x8e/0x120
[  440.724005]  [<ffffffff8139ddc0>] ? rhashtable_remove+0x200/0x200
[  440.724005]  [<ffffffff8139dfaf>]
rhashtable_lookup_compare_insert+0x9f/0x110
[  440.724005]  [<ffffffff8139df45>] ?
rhashtable_lookup_compare_insert+0x35/0x110
[  440.724005]  [<ffffffff8165d810>] ? netlink_overrun+0x50/0x50
[  440.724005]  [<ffffffff8165e0f3>] ? netlink_insert+0x43/0xf0
[  440.724005]  [<ffffffff8165e141>] netlink_insert+0x91/0xf0
[  440.724005]  [<ffffffff81660470>] netlink_bind+0x210/0x260
[  440.724005]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  440.724005]  [<ffffffff810a5b10>] ? lock_release_non_nested+0xa0/0x340
[  440.724005]  [<ffffffff81616d24>] SYSC_bind+0xa4/0xc0
[  440.724005]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  440.724005]  [<ffffffff8161701e>] SyS_bind+0xe/0x10
[  440.724005]  [<ffffffff81652318>] compat_SyS_socketcall+0xa8/0x200
[  440.724005]  [<ffffffff81768df3>] sysenter_dispatch+0x7/0x1f
[  440.724005]  [<ffffffff8139660e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  440.724005] Code: f0 ff 83 98 01 00 00 48 83 c4 08 5b 5d c3 66 66 66
66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 31 c0 8b 56 08 39 97 68 04
00 00 <55> 48 89 e5 74 0a 5d c3 0f 1f 84 00 00 00 00 00 48 8b 47 30 48
[  468.724005] NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s!
[bind_netlink:631]
[  468.724005] Modules linked in: tipc
[  468.724005] irq event stamp: 507013
[  468.724005] hardirqs last  enabled at (507012): [<ffffffff817675e0>]
restore_args+0x0/0x30
[  468.724005] hardirqs last disabled at (507013): [<ffffffff817678aa>]
apic_timer_interrupt+0x6a/0x80
[  468.724005] softirqs last  enabled at (318168): [<ffffffff8161897f>]
lock_sock_nested+0x4f/0xc0
[  468.724005] softirqs last disabled at (318170): [<ffffffff8139d79a>]
lock_buckets+0x3a/0x80
[  468.724005] CPU: 7 PID: 631 Comm: bind_netlink Tainted: G        W
 L 3.19.0-rc6+ #182
[  468.724005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  468.724005] task: ffff880013970000 ti: ffff8800105d0000 task.ti:
ffff8800105d0000
[  468.724005] RIP: 0010:[<ffffffff8165d820>]  [<ffffffff8165d820>]
netlink_compare+0x10/0x30
[  468.724005] RSP: 0018:ffff8800105d3cf0  EFLAGS: 00000293
[  468.724005] RAX: 0000000000000000 RBX: ffffffff817675e0 RCX:
00000000dbaee169
[  468.724005] RDX: 0000000000001668 RSI: ffff8800105d3db8 RDI:
ffff88001651b800
[  468.724005] RBP: ffff8800105d3d48 R08: 00000000df2b8827 R09:
ffff880010c07468
[  468.724005] R10: 0000000000000003 R11: 0000000000000001 R12:
ffff8800105d3c68
[  468.724005] R13: 0000000000000046 R14: ffff8800105d0000 R15:
ffff880013970000
[  468.724005] FS:  0000000000000000(0000) GS:ffff880017dc0000(0063)
knlGS:00000000f75ae900
[  468.724005] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  468.724005] CR2: 0000000008760000 CR3: 0000000015121000 CR4:
00000000000006e0
[  468.724005] Stack:
[  468.724005]  ffffffff8139de4e ffffffff8139ddc0 000000028105d9c8
07ca27d400000000
[  468.724005]  ffff8800152d13c0 ffff8800105d3d48 ffff880016500000
ffff8800152d13c0
[  468.724005]  0000000000000004 ffff8800152d13c0 ffff880010c07608
ffff8800105d3da8
[  468.724005] Call Trace:
[  468.724005]  [<ffffffff8139de4e>] ? rhashtable_lookup_compare+0x8e/0x120
[  468.724005]  [<ffffffff8139ddc0>] ? rhashtable_remove+0x200/0x200
[  468.724005]  [<ffffffff8139dfaf>]
rhashtable_lookup_compare_insert+0x9f/0x110
[  468.724005]  [<ffffffff8139df45>] ?
rhashtable_lookup_compare_insert+0x35/0x110
[  468.724005]  [<ffffffff8165d810>] ? netlink_overrun+0x50/0x50
[  468.724005]  [<ffffffff8165e0f3>] ? netlink_insert+0x43/0xf0
[  468.724005]  [<ffffffff8165e141>] netlink_insert+0x91/0xf0
[  468.724005]  [<ffffffff81660470>] netlink_bind+0x210/0x260
[  468.724005]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  468.724005]  [<ffffffff810a5b10>] ? lock_release_non_nested+0xa0/0x340
[  468.724005]  [<ffffffff81616d24>] SYSC_bind+0xa4/0xc0
[  468.724005]  [<ffffffff81184fc6>] ? might_fault+0x66/0xc0
[  468.724005]  [<ffffffff8161701e>] SyS_bind+0xe/0x10
[  468.724005]  [<ffffffff81652318>] compat_SyS_socketcall+0xa8/0x200
[  468.724005]  [<ffffffff81768df3>] sysenter_dispatch+0x7/0x1f
[  468.724005]  [<ffffffff8139660e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  468.724005] Code: f0 ff 83 98 01 00 00 48 83 c4 08 5b 5d c3 66 66 66
66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 31 c0 8b 56 08 39 97 68 04
00 00 <55> 48 89 e5 74 0a 5d c3 0f 1f 84 00 00 00 00 00 48 8b 47 30 48

Regards,
Ying

On 01/30/2015 08:20 AM, Thomas Graf wrote:
> This is a series of fixes which have accumulated while tracking
> down the race condition reported by Ying Xue. The original
> DEBUG_PAGEALLOC splat is resolved.
> 
> However, there is still a race (harder to trigger) remaining in
> which certain entries are unfindable when removing them from the
> table via netlink_remove() and thus they cause a use after free
> later on.
> 
> Regardless, these fixes can go in now.
> 
> Thomas Graf (6):
>   rhashtable: key_hashfn() must return full hash value
>   rhashtable: Use a single bucket lock for sibling buckets
>   rhashtable: Wait for RCU readers after final unzip work
>   rhashtable: Dump bucket tables on locking violation under
>     PROVE_LOCKING
>   rhashtable: Add more lock verification
>   rhashtable: Avoid bucket cross reference after removal
> 
>  lib/rhashtable.c | 301 ++++++++++++++++++++++++++++++-------------------------
>  1 file changed, 166 insertions(+), 135 deletions(-)
> 

* Re: [PATCH 0/6 net-next] rhashtable fixes
  2015-01-30  9:10 ` [PATCH 0/6 net-next] rhashtable fixes Ying Xue
@ 2015-01-30  9:29   ` Thomas Graf
  2015-01-30  9:56     ` Ying Xue
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Graf @ 2015-01-30  9:29 UTC (permalink / raw)
  To: Ying Xue; +Cc: davem, netdev

On 01/30/15 at 05:10pm, Ying Xue wrote:
> Hi Thomas,
> 
> I made sure that my local net-next tree is synchronized to the latest
> version, which already contains commit
> fe6a043c535acfec8f8e554536c87923dcb45097 ("rhashtable:
> rhashtable_remove() must unlink in both tbl and future_tbl"), and then
> I manually applied the whole patch series. But when I repeatedly run
> the test case I originally posted, a soft lockup happens. Please see
> the relevant log:

Right, I see the same soft lockup. Interestingly I cannot trigger it
with the rht test code. I can only trigger it with your Netlink socket
creation stress test. It is definitely related to the deferred worker,
when I disable growing, then the bug disappears. I think that the
expansion leaves a race open in which remove cannot find certain entries
(I verified this by adding a BUG_ON() when rhashtable_remove() could not
find a match). This then keeps an entry on the list which has already
been freed.

However, I think this was present before these fixes but hidden as the
lockup requires a lot more iterations of your stress test on my
machine.

* Re: [PATCH 0/6 net-next] rhashtable fixes
  2015-01-30  9:29   ` Thomas Graf
@ 2015-01-30  9:56     ` Ying Xue
  2015-02-03 17:21       ` Thomas Graf
  0 siblings, 1 reply; 17+ messages in thread
From: Ying Xue @ 2015-01-30  9:56 UTC (permalink / raw)
  To: Thomas Graf; +Cc: davem, netdev

On 01/30/2015 05:29 PM, Thomas Graf wrote:
> On 01/30/15 at 05:10pm, Ying Xue wrote:
>> Hi Thomas,
>>
>> I made sure that my local net-next tree is synchronized to the latest
>> version, which already contains commit
>> fe6a043c535acfec8f8e554536c87923dcb45097 ("rhashtable:
>> rhashtable_remove() must unlink in both tbl and future_tbl"), and then
>> I manually applied the whole patch series. But when I repeatedly run
>> the test case I originally posted, a soft lockup happens. Please see
>> the relevant log:
> 
> Right, I see the same soft lockup. Interestingly I cannot trigger it
> with the rht test code. I can only trigger it with your Netlink socket
> creation stress test. It is definitely related to the deferred worker,
> when I disable growing, then the bug disappears.

Yes, when I disable expansion, the soft lockup disappears too.

> I think that the
> expansion leaves a race open in which remove cannot find certain entries
> (I verified this by adding a BUG_ON() when rhashtable_remove() could not
> find a match). This then keeps an entry on the list which has already
> been freed.
> 
> However, I think this was present before these fixes but hidden as the
> lockup requires a lot more iterations of your stress test on my
> machine.
> 

If you need any verification of your new patches, or want some
experiments run, please let me know and I can help with them.

Regards,
Ying

* Re: [PATCH 2/6] rhashtable: Use a single bucket lock for sibling buckets
  2015-01-30  0:20 ` [PATCH 2/6] rhashtable: Use a single bucket lock for sibling buckets Thomas Graf
@ 2015-01-31  4:34   ` Herbert Xu
  2015-01-31  8:41     ` Thomas Graf
  0 siblings, 1 reply; 17+ messages in thread
From: Herbert Xu @ 2015-01-31  4:34 UTC (permalink / raw)
  To: Thomas Graf; +Cc: davem, netdev, ying.xue

Thomas Graf <tgraf@suug.ch> wrote:
> rhashtable currently allows to use a bucket lock per bucket. This
> requires multiple levels of complicated nested locking because when
> resizing, a single bucket of the smaller table will map to two
> buckets in the larger table. So far rhashtable has explicitly locked
> both buckets in the larger table.
> 
> By excluding the highest bit of the hash from the bucket lock map and
> thus only allowing locks to buckets in a ratio of 1:2, the locking
> can be simplified a lot without losing the benefits of multiple locks.
> Larger tables which benefit from multiple locks will not have a single
> lock per bucket anyway.
> 
> Signed-off-by: Thomas Graf <tgraf@suug.ch>
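
The 1:2 ratio described in the quoted text can be illustrated with a
small sketch. This is not the code from lib/rhashtable.c; the helper
names (bucket_index, lock_index, siblings_share_lock) are hypothetical,
and it assumes power-of-two table and lock array sizes. It only shows
that when the lock array is no larger than the smaller table, a bucket
in the smaller table and its two sibling buckets in the table of twice
the size always select the same lock:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bucket index of a hash in a table of
 * tbl_size buckets (tbl_size must be a power of two). */
static unsigned int bucket_index(uint32_t hash, unsigned int tbl_size)
{
	assert(tbl_size != 0 && (tbl_size & (tbl_size - 1)) == 0);
	return hash & (tbl_size - 1);
}

/* Hypothetical helper: index of the lock covering a bucket
 * (nlocks must be a power of two). */
static unsigned int lock_index(unsigned int bucket, unsigned int nlocks)
{
	assert(nlocks != 0 && (nlocks & (nlocks - 1)) == 0);
	return bucket & (nlocks - 1);
}

/* Returns 1 if, for every hash, the bucket in the small table and the
 * bucket in the table of twice that size are covered by the same lock;
 * returns 0 as soon as one pair needs two different locks. */
static int siblings_share_lock(unsigned int small_size, unsigned int nlocks)
{
	uint32_t hash;

	for (hash = 0; hash < 2 * small_size; hash++) {
		unsigned int small_bkt = bucket_index(hash, small_size);
		unsigned int large_bkt = bucket_index(hash, 2 * small_size);

		if (lock_index(small_bkt, nlocks) != lock_index(large_bkt, nlocks))
			return 0;
	}
	return 1;
}
```

With 8 buckets growing to 16, siblings_share_lock(8, 8) holds because
the lock index ignores the highest bit of the larger table's bucket
index, while siblings_share_lock(8, 16) does not: buckets 0 and 8 would
then need two separate locks.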

Thomas, could you please hold off on these changes? They totally
conflict with my rehash work, which is going to render these changes
moot anyway since it'll completely change how expansion/shrinking
works.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: [PATCH 2/6] rhashtable: Use a single bucket lock for sibling buckets
  2015-01-31  4:34   ` Herbert Xu
@ 2015-01-31  8:41     ` Thomas Graf
  2015-01-31  9:38       ` Herbert Xu
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Graf @ 2015-01-31  8:41 UTC (permalink / raw)
  To: Herbert Xu; +Cc: davem, netdev, ying.xue

On 01/31/15 at 03:34pm, Herbert Xu wrote:
> Thomas, could you please hold off on these changes? They totally
> conflict with my rehash work, which is going to render these changes
> moot anyway since it'll completely change how expansion/shrinking
> works.

Can you share that work?

* Re: [PATCH 2/6] rhashtable: Use a single bucket lock for sibling buckets
  2015-01-31  8:41     ` Thomas Graf
@ 2015-01-31  9:38       ` Herbert Xu
  2015-01-31 12:50         ` Thomas Graf
  0 siblings, 1 reply; 17+ messages in thread
From: Herbert Xu @ 2015-01-31  9:38 UTC (permalink / raw)
  To: Thomas Graf; +Cc: davem, netdev, ying.xue

On Sat, Jan 31, 2015 at 08:41:27AM +0000, Thomas Graf wrote:
> On 01/31/15 at 03:34pm, Herbert Xu wrote:
> > Thomas, could you please hold off on these changes? They totally
> > conflict with my rehash work, which is going to render these changes
> > moot anyway since it'll completely change how expansion/shrinking
> > works.
> 
> Can you share that work?

I will post it when it's ready.  But if you keep moving the goal
posts then I'll never get there.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: [PATCH 2/6] rhashtable: Use a single bucket lock for sibling buckets
  2015-01-31  9:38       ` Herbert Xu
@ 2015-01-31 12:50         ` Thomas Graf
  0 siblings, 0 replies; 17+ messages in thread
From: Thomas Graf @ 2015-01-31 12:50 UTC (permalink / raw)
  To: Herbert Xu; +Cc: davem, netdev, ying.xue

On 01/31/15 at 08:38pm, Herbert Xu wrote:
> On Sat, Jan 31, 2015 at 08:41:27AM +0000, Thomas Graf wrote:
> > On 01/31/15 at 03:34pm, Herbert Xu wrote:
> > > Thomas, could you please hold off on these changes? They totally
> > > conflict with my rehash work, which is going to render these changes
> > > moot anyway since it'll completely change how expansion/shrinking
> > > works.
> > 
> > Can you share that work?
> 
> I will post it when it's ready.  But if you keep moving the goal
> posts then I'll never get there.

My intent is definitely not to make it harder for you. My intent
is to fix the remaining reported races before net-next gets
merged. I'm happy to hold off if you plan on proposing your rehashing
work in the current net-next cycle.

* Re: [PATCH 0/6 net-next] rhashtable fixes
  2015-01-30  9:56     ` Ying Xue
@ 2015-02-03 17:21       ` Thomas Graf
  2015-02-04  2:32         ` Ying Xue
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Graf @ 2015-02-03 17:21 UTC (permalink / raw)
  To: Ying Xue; +Cc: davem, netdev

On 01/30/15 at 05:56pm, Ying Xue wrote:
> On 01/30/2015 05:29 PM, Thomas Graf wrote:
> > Right, I see the same soft lockup. Interestingly I cannot trigger it
> > with the rht test code. I can only trigger it with your Netlink socket
> > creation stress test. It is definitely related to the deferred worker,
> > when I disable growing, then the bug disappears.
> 
> Yes, when I disable expansion, the soft lockup disappears too.

I have found the last remaining race and can now run your test
program successfully in an endless loop.

I will resubmit a v2 of this series.

* Re: [PATCH 0/6 net-next] rhashtable fixes
  2015-02-03 17:21       ` Thomas Graf
@ 2015-02-04  2:32         ` Ying Xue
  0 siblings, 0 replies; 17+ messages in thread
From: Ying Xue @ 2015-02-04  2:32 UTC (permalink / raw)
  To: Thomas Graf; +Cc: davem, netdev

On 02/04/2015 01:21 AM, Thomas Graf wrote:
> On 01/30/15 at 05:56pm, Ying Xue wrote:
>> On 01/30/2015 05:29 PM, Thomas Graf wrote:
>>> Right, I see the same soft lockup. Interestingly I cannot trigger it
>>> with the rht test code. I can only trigger it with your Netlink socket
>>> creation stress test. It is definitely related to the deferred worker,
>>> when I disable growing, then the bug disappears.
>>
>> Yes, when I disable expansion, the soft lockup disappears too.
> 
> I have found the last remaining race and can now run your test
> program successfully in an endless loop.
> 
> I will resubmit a v2 of this series.
> 
> 

Thanks! Once I receive your updates, I will test them again.

Regards,
Ying

* [PATCH 6/6] rhashtable: Avoid bucket cross reference after removal
  2015-02-05  1:03 [PATCH 0/6 v2 " Thomas Graf
@ 2015-02-05  1:03 ` Thomas Graf
  0 siblings, 0 replies; 17+ messages in thread
From: Thomas Graf @ 2015-02-05  1:03 UTC (permalink / raw)
  To: davem; +Cc: netdev, herbert, ying.xue

During a resize, two buckets in the larger table map to a single
bucket in the smaller table. Once the new table has been (partially)
linked to the old table, removal of an element may leave a bucket in
the larger table pointing to entries which all hash to a different
value than the bucket index. Two buckets then point to the same
sub chain after unzipping. This is not illegal *during* the resize
phase, but it is once the resize has completed.

Keep the old table around until all of the unzipping is done, so
that the removal code needs to search for entries with a matching
hash only during this special period.

Reported-by: Ying Xue <ying.xue@windriver.com>
Fixes: 97defe1ecf86 ("rhashtable: Per bucket locks & deferred expansion/shrinking")
Signed-off-by: Thomas Graf <tgraf@suug.ch>
---
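Note, not part of the commit: the relinking rule described above can be
illustrated with a small userspace toy model. This is only a sketch under
simplifying assumptions, not the kernel code: there is no RCU and no
bucket locking, a plain NULL stands in for the nulls marker, and the names
struct entry, bucket_of() and unlink_entry() are hypothetical and exist
only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy chain entry: a precomputed raw hash and a next pointer. */
struct entry {
	unsigned int raw_hash;
	struct entry *next;
};

/* Bucket index of an entry in a table with `size` buckets (power of two). */
static unsigned int bucket_of(const struct entry *e, unsigned int size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	return e->raw_hash & (size - 1);
}

/*
 * Unlink `victim`, which *pprev currently points to, from the chain of
 * bucket `hash` in a table with `size` buckets.
 *
 * Outside of a resize the chain only holds entries of this bucket, so
 * the successor can be linked in directly. During a resize the chain may
 * still be zipped with entries of the sibling bucket, so skip ahead to
 * the next entry that really hashes to `hash`, and terminate the chain
 * if there is none.
 */
static void unlink_entry(struct entry **pprev, struct entry *victim,
			 unsigned int hash, unsigned int size, int resizing)
{
	struct entry *e;

	if (!resizing) {
		*pprev = victim->next;
		return;
	}

	for (e = victim->next; e != NULL; e = e->next) {
		if (bucket_of(e, size) == hash) {
			*pprev = e;
			return;
		}
	}

	/* No entry of this bucket is left: do not cross reference. */
	*pprev = NULL;
}
```

With a 4-bucket table and a zipped chain whose raw hashes are 1 -> 3 -> 7,
removing the first entry from bucket 1 during a resize leaves bucket 1
empty instead of pointing at the entries that belong to bucket 3. With
the chain 1 -> 3 -> 5, bucket 1 is relinked to the last entry, since
5 & 3 == 1.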
 lib/rhashtable.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index ef0816b..5919d63 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -415,12 +415,6 @@ int rhashtable_expand(struct rhashtable *ht)
 		unlock_buckets(new_tbl, old_tbl, new_hash);
 	}
 
-	/* Publish the new table pointer. Lookups may now traverse
-	 * the new table, but they will not benefit from any
-	 * additional efficiency until later steps unzip the buckets.
-	 */
-	rcu_assign_pointer(ht->tbl, new_tbl);
-
 	/* Unzip interleaved hash chains */
 	while (!complete && !ht->being_destroyed) {
 		/* Wait for readers. All new readers will see the new
@@ -445,6 +439,7 @@ int rhashtable_expand(struct rhashtable *ht)
 		}
 	}
 
+	rcu_assign_pointer(ht->tbl, new_tbl);
 	synchronize_rcu();
 
 	bucket_table_free(old_tbl);
@@ -627,14 +622,14 @@ bool rhashtable_remove(struct rhashtable *ht, struct rhash_head *obj)
 {
 	struct bucket_table *tbl, *new_tbl, *old_tbl;
 	struct rhash_head __rcu **pprev;
-	struct rhash_head *he;
+	struct rhash_head *he, *he2;
 	unsigned int hash, new_hash;
 	bool ret = false;
 
 	rcu_read_lock();
 	tbl = old_tbl = rht_dereference_rcu(ht->tbl, ht);
 	new_tbl = rht_dereference_rcu(ht->future_tbl, ht);
-	new_hash = head_hashfn(ht, new_tbl, obj);
+	new_hash = obj_raw_hashfn(ht, rht_obj(ht, obj));
 
 	lock_buckets(new_tbl, old_tbl, new_hash);
 restart:
@@ -647,8 +642,21 @@ restart:
 		}
 
 		ASSERT_BUCKET_LOCK(ht, tbl, hash);
-		rcu_assign_pointer(*pprev, obj->next);
 
+		if (unlikely(new_tbl != tbl)) {
+			rht_for_each_continue(he2, he->next, tbl, hash) {
+				if (head_hashfn(ht, tbl, he2) == hash) {
+					rcu_assign_pointer(*pprev, he2);
+					goto found;
+				}
+			}
+
+			INIT_RHT_NULLS_HEAD(*pprev, ht, hash);
+		} else {
+			rcu_assign_pointer(*pprev, obj->next);
+		}
+
+found:
 		ret = true;
 		break;
 	}
-- 
1.9.3

