* [PATCH v2] lib: fix data race in rhashtable_rehash_one
@ 2015-09-22 8:51 Dmitry Vyukov
2015-09-22 9:05 ` Eric Dumazet
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Dmitry Vyukov @ 2015-09-22 8:51 UTC
To: eric.dumazet, netdev, linux-kernel, tgraf
Cc: kcc, andreyknvl, glider, ktsan, paulmck, Dmitry Vyukov
rhashtable_rehash_one() uses complex logic to update entry->next field,
after INIT_RHT_NULLS_HEAD and NULLS_MARKER expansion:
    entry->next = 1 | ((base + off) << 1)
This can be compiled along the lines of:
    entry->next = base + off
    entry->next <<= 1
    entry->next |= 1
Which will break concurrent readers.
NULLS value recomputation is not needed here, so just remove
the complex logic.
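To illustrate why the recomputation can be dropped, here is a minimal single-threaded sketch in user-space C. It relies on an assumption that the patch text does not spell out: every empty bucket is initialized with the nulls marker for its own index, so the head read from an empty bucket already holds the value that recomputation would produce. All `link_*` names and `LINK_NBUCKETS` are hypothetical and are not rhashtable API; locking and RCU are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINK_NBUCKETS 4	/* hypothetical toy table size */

/* Toy chain node; next holds either a node address or a nulls marker. */
struct link_node {
	uintptr_t next;
};

/* Toy version of the encoding quoted above: 1 | ((base + hash) << 1). */
static uintptr_t link_nulls(uintptr_t base, uintptr_t hash)
{
	return (uintptr_t)1 | ((base + hash) << 1);
}

/* A value with the low bit set is treated as a nulls marker. */
static int link_is_nulls(uintptr_t v)
{
	return (int)(v & 1);
}

/* ASSUMPTION: each empty bucket starts as the marker for its own index. */
static void link_table_init(uintptr_t *buckets, uintptr_t base)
{
	for (size_t i = 0; i < LINK_NBUCKETS; i++)
		buckets[i] = link_nulls(base, i);
}

/* Old shape: branch on the head and recompute the marker. */
static void link_recompute(struct link_node *e, uintptr_t *buckets,
			   uintptr_t base, size_t hash)
{
	uintptr_t head = buckets[hash];

	if (link_is_nulls(head))
		e->next = link_nulls(base, hash);
	else
		e->next = head;
	buckets[hash] = (uintptr_t)e;
}

/* New shape: copy the head unconditionally. */
static void link_copy(struct link_node *e, uintptr_t *buckets, size_t hash)
{
	e->next = buckets[hash];
	buckets[hash] = (uintptr_t)e;
}
```

Under that assumption both functions leave identical values in `e->next`, whether the bucket was empty or already held a node, which is why the branch adds nothing except a more complex store.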
The data race was found with KernelThreadSanitizer (KTSAN).
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
---
v2: Remove NULLS values recomputation as it is not needed.
Update commit description to clarify that the problem
is not with racy reads/writes per se but rather with
the complex update logic.
KTSAN report for the record:
ThreadSanitizer: data-race in netlink_lookup
Atomic read at 0xffff880480443bd0 of size 8 by thread 2747 on CPU 11:
[< inline >] rhashtable_lookup_fast include/linux/rhashtable.h:543
[< inline >] __netlink_lookup net/netlink/af_netlink.c:1026
[<ffffffff81bd9a84>] netlink_lookup+0x134/0x1c0 net/netlink/af_netlink.c:1046
[< inline >] netlink_getsockbyportid net/netlink/af_netlink.c:1616
[<ffffffff81bdc701>] netlink_unicast+0x111/0x300 net/netlink/af_netlink.c:1812
[<ffffffff81bdcdb9>] netlink_sendmsg+0x4c9/0x5f0 net/netlink/af_netlink.c:2443
[< inline >] sock_sendmsg_nosec net/socket.c:610
[<ffffffff81b5d6f3>] sock_sendmsg+0x83/0x90 net/socket.c:620
[<ffffffff81b5e59f>] ___sys_sendmsg+0x3cf/0x3e0 net/socket.c:1952
[<ffffffff81b5f6ac>] __sys_sendmsg+0x4c/0xb0 net/socket.c:1986
[< inline >] SYSC_sendmsg net/socket.c:1997
[<ffffffff81b5f740>] SyS_sendmsg+0x30/0x50 net/socket.c:1993
[<ffffffff81ee3e11>] entry_SYSCALL_64_fastpath+0x31/0x95 arch/x86/entry/entry_64.S:188
Previous write at 0xffff880480443bd0 of size 8 by thread 213 on CPU 4:
[< inline >] rhashtable_rehash_one lib/rhashtable.c:193
[< inline >] rhashtable_rehash_chain lib/rhashtable.c:213
[< inline >] rhashtable_rehash_table lib/rhashtable.c:257
[<ffffffff8156f7e0>] rht_deferred_worker+0x3b0/0x6d0 lib/rhashtable.c:373
[<ffffffff810b1d6e>] process_one_work+0x47e/0x930 kernel/workqueue.c:2036
[<ffffffff810b22d0>] worker_thread+0xb0/0x900 kernel/workqueue.c:2170
[<ffffffff810bba40>] kthread+0x150/0x170 kernel/kthread.c:209
[<ffffffff81ee420f>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:529
Mutexes locked by thread 213:
Mutex 217217 is locked here:
[<ffffffff81ee0407>] mutex_lock+0x57/0x70 kernel/locking/mutex.c:108
[<ffffffff8156f475>] rht_deferred_worker+0x45/0x6d0 lib/rhashtable.c:363
[<ffffffff810b1d6e>] process_one_work+0x47e/0x930 kernel/workqueue.c:2036
[<ffffffff810b22d0>] worker_thread+0xb0/0x900 kernel/workqueue.c:2170
[<ffffffff810bba40>] kthread+0x150/0x170 kernel/kthread.c:209
[<ffffffff81ee420f>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:529
Mutex 431216 is locked here:
[< inline >] __raw_spin_lock_bh include/linux/spinlock_api_smp.h:149
[<ffffffff81ee3195>] _raw_spin_lock_bh+0x65/0x80 kernel/locking/spinlock.c:175
[< inline >] spin_lock_bh include/linux/spinlock.h:317
[< inline >] rhashtable_rehash_chain lib/rhashtable.c:212
[< inline >] rhashtable_rehash_table lib/rhashtable.c:257
[<ffffffff8156f616>] rht_deferred_worker+0x1e6/0x6d0 lib/rhashtable.c:373
[<ffffffff810b1d6e>] process_one_work+0x47e/0x930 kernel/workqueue.c:2036
[<ffffffff810b22d0>] worker_thread+0xb0/0x900 kernel/workqueue.c:2170
[<ffffffff810bba40>] kthread+0x150/0x170 kernel/kthread.c:209
[<ffffffff81ee420f>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:529
Mutex 432766 is locked here:
[< inline >] __raw_spin_lock include/linux/spinlock_api_smp.h:158
[<ffffffff81ee37d0>] _raw_spin_lock+0x50/0x70 kernel/locking/spinlock.c:151
[< inline >] rhashtable_rehash_one lib/rhashtable.c:186
[< inline >] rhashtable_rehash_chain lib/rhashtable.c:213
[< inline >] rhashtable_rehash_table lib/rhashtable.c:257
[<ffffffff8156f79b>] rht_deferred_worker+0x36b/0x6d0 lib/rhashtable.c:373
[<ffffffff810b1d6e>] process_one_work+0x47e/0x930 kernel/workqueue.c:2036
[<ffffffff810b22d0>] worker_thread+0xb0/0x900 kernel/workqueue.c:2170
[<ffffffff810bba40>] kthread+0x150/0x170 kernel/kthread.c:209
[<ffffffff81ee420f>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:529
---
lib/rhashtable.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index cc0c697..a54ff89 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -187,10 +187,7 @@ static int rhashtable_rehash_one(struct rhashtable *ht, unsigned int old_hash)
 	head = rht_dereference_bucket(new_tbl->buckets[new_hash],
 				      new_tbl, new_hash);
 
-	if (rht_is_a_nulls(head))
-		INIT_RHT_NULLS_HEAD(entry->next, ht, new_hash);
-	else
-		RCU_INIT_POINTER(entry->next, head);
+	RCU_INIT_POINTER(entry->next, head);
 
 	rcu_assign_pointer(new_tbl->buckets[new_hash], entry);
 	spin_unlock(new_bucket_lock);
--
2.6.0.rc0.131.gf624c3d
* Re: [PATCH v2] lib: fix data race in rhashtable_rehash_one
From: Eric Dumazet @ 2015-09-22 9:05 UTC
To: Dmitry Vyukov
Cc: netdev, linux-kernel, tgraf, kcc, andreyknvl, glider, ktsan, paulmck
On Tue, 2015-09-22 at 10:51 +0200, Dmitry Vyukov wrote:
> rhashtable_rehash_one() uses complex logic to update entry->next field,
> after INIT_RHT_NULLS_HEAD and NULLS_MARKER expansion:
>
> entry->next = 1 | ((base + off) << 1)
>
> This can be compiled along the lines of:
>
> entry->next = base + off
> entry->next <<= 1
> entry->next |= 1
>
> Which will break concurrent readers.
>
> NULLS value recomputation is not needed here, so just remove
> the complex logic.
>
> The data race was found with KernelThreadSanitizer (KTSAN).
>
> Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
> ---
Thanks Dmitry
Acked-by: Eric Dumazet <edumazet@google.com>
* Re: [PATCH v2] lib: fix data race in rhashtable_rehash_one
From: Thomas Graf @ 2015-09-22 9:17 UTC
To: Dmitry Vyukov
Cc: eric.dumazet, netdev, linux-kernel, kcc, andreyknvl, glider,
ktsan, paulmck
On 09/22/15 at 10:51am, Dmitry Vyukov wrote:
> rhashtable_rehash_one() uses complex logic to update entry->next field,
> after INIT_RHT_NULLS_HEAD and NULLS_MARKER expansion:
>
> entry->next = 1 | ((base + off) << 1)
>
> This can be compiled along the lines of:
>
> entry->next = base + off
> entry->next <<= 1
> entry->next |= 1
>
> Which will break concurrent readers.
>
> NULLS value recomputation is not needed here, so just remove
> the complex logic.
>
> The data race was found with KernelThreadSanitizer (KTSAN).
>
> Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
* Re: [PATCH v2] lib: fix data race in rhashtable_rehash_one
From: Herbert Xu @ 2015-09-22 15:19 UTC
To: Dmitry Vyukov
Cc: eric.dumazet, netdev, linux-kernel, tgraf, kcc, andreyknvl,
glider, ktsan, paulmck, dvyukov
Dmitry Vyukov <dvyukov@google.com> wrote:
> rhashtable_rehash_one() uses complex logic to update entry->next field,
> after INIT_RHT_NULLS_HEAD and NULLS_MARKER expansion:
>
> entry->next = 1 | ((base + off) << 1)
>
> This can be compiled along the lines of:
>
> entry->next = base + off
> entry->next <<= 1
> entry->next |= 1
>
> Which will break concurrent readers.
>
> NULLS value recomputation is not needed here, so just remove
> the complex logic.
>
> The data race was found with KernelThreadSanitizer (KTSAN).
>
> Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [PATCH v2] lib: fix data race in rhashtable_rehash_one
From: David Miller @ 2015-09-23 0:36 UTC
To: dvyukov
Cc: eric.dumazet, netdev, linux-kernel, tgraf, kcc, andreyknvl,
glider, ktsan, paulmck
From: Dmitry Vyukov <dvyukov@google.com>
Date: Tue, 22 Sep 2015 10:51:52 +0200
> rhashtable_rehash_one() uses complex logic to update entry->next field,
> after INIT_RHT_NULLS_HEAD and NULLS_MARKER expansion:
>
> entry->next = 1 | ((base + off) << 1)
>
> This can be compiled along the lines of:
>
> entry->next = base + off
> entry->next <<= 1
> entry->next |= 1
>
> Which will break concurrent readers.
>
> NULLS value recomputation is not needed here, so just remove
> the complex logic.
>
> The data race was found with KernelThreadSanitizer (KTSAN).
>
> Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Applied, thanks.