* [PATCH] lib: fix data race in rhashtable_rehash_one
@ 2015-09-21  8:08 Dmitry Vyukov
  2015-09-21 13:31 ` Eric Dumazet
  0 siblings, 1 reply; 10+ messages in thread
From: Dmitry Vyukov @ 2015-09-21  8:08 UTC
To: tgraf, netdev, linux-kernel
Cc: kcc, andreyknvl, glider, ktsan, paulmck, Dmitry Vyukov

rhashtable_rehash_one() uses plain writes to update entry->next,
while it is being concurrently accessed by readers.
Unfortunately, the compiler is within its rights to (for example) use
byte-at-a-time writes to update the pointer, which would fatally confuse
concurrent readers.

Use WRITE_ONCE to update entry->next in rhashtable_rehash_one().

The data race was found with KernelThreadSanitizer (KTSAN).

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
---
KTSAN report for the record:

ThreadSanitizer: data-race in netlink_lookup

Atomic read at 0xffff880480443bd0 of size 8 by thread 2747 on CPU 11:
 [<     inline     >] rhashtable_lookup_fast include/linux/rhashtable.h:543
 [<     inline     >] __netlink_lookup net/netlink/af_netlink.c:1026
 [<ffffffff81bd9a84>] netlink_lookup+0x134/0x1c0 net/netlink/af_netlink.c:1046
 [<     inline     >] netlink_getsockbyportid net/netlink/af_netlink.c:1616
 [<ffffffff81bdc701>] netlink_unicast+0x111/0x300 net/netlink/af_netlink.c:1812
 [<ffffffff81bdcdb9>] netlink_sendmsg+0x4c9/0x5f0 net/netlink/af_netlink.c:2443
 [<     inline     >] sock_sendmsg_nosec net/socket.c:610
 [<ffffffff81b5d6f3>] sock_sendmsg+0x83/0x90 net/socket.c:620
 [<ffffffff81b5e59f>] ___sys_sendmsg+0x3cf/0x3e0 net/socket.c:1952
 [<ffffffff81b5f6ac>] __sys_sendmsg+0x4c/0xb0 net/socket.c:1986
 [<     inline     >] SYSC_sendmsg net/socket.c:1997
 [<ffffffff81b5f740>] SyS_sendmsg+0x30/0x50 net/socket.c:1993
 [<ffffffff81ee3e11>] entry_SYSCALL_64_fastpath+0x31/0x95 arch/x86/entry/entry_64.S:188

Previous write at 0xffff880480443bd0 of size 8 by thread 213 on CPU 4:
 [<     inline     >] rhashtable_rehash_one lib/rhashtable.c:193
 [<     inline     >] rhashtable_rehash_chain lib/rhashtable.c:213
 [<     inline     >] rhashtable_rehash_table lib/rhashtable.c:257
 [<ffffffff8156f7e0>] rht_deferred_worker+0x3b0/0x6d0 lib/rhashtable.c:373
 [<ffffffff810b1d6e>] process_one_work+0x47e/0x930 kernel/workqueue.c:2036
 [<ffffffff810b22d0>] worker_thread+0xb0/0x900 kernel/workqueue.c:2170
 [<ffffffff810bba40>] kthread+0x150/0x170 kernel/kthread.c:209
 [<ffffffff81ee420f>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:529

Mutexes locked by thread 213:
Mutex 217217 is locked here:
 [<ffffffff81ee0407>] mutex_lock+0x57/0x70 kernel/locking/mutex.c:108
 [<ffffffff8156f475>] rht_deferred_worker+0x45/0x6d0 lib/rhashtable.c:363
 [<ffffffff810b1d6e>] process_one_work+0x47e/0x930 kernel/workqueue.c:2036
 [<ffffffff810b22d0>] worker_thread+0xb0/0x900 kernel/workqueue.c:2170
 [<ffffffff810bba40>] kthread+0x150/0x170 kernel/kthread.c:209
 [<ffffffff81ee420f>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:529

Mutex 431216 is locked here:
 [<     inline     >] __raw_spin_lock_bh include/linux/spinlock_api_smp.h:149
 [<ffffffff81ee3195>] _raw_spin_lock_bh+0x65/0x80 kernel/locking/spinlock.c:175
 [<     inline     >] spin_lock_bh include/linux/spinlock.h:317
 [<     inline     >] rhashtable_rehash_chain lib/rhashtable.c:212
 [<     inline     >] rhashtable_rehash_table lib/rhashtable.c:257
 [<ffffffff8156f616>] rht_deferred_worker+0x1e6/0x6d0 lib/rhashtable.c:373
 [<ffffffff810b1d6e>] process_one_work+0x47e/0x930 kernel/workqueue.c:2036
 [<ffffffff810b22d0>] worker_thread+0xb0/0x900 kernel/workqueue.c:2170
 [<ffffffff810bba40>] kthread+0x150/0x170 kernel/kthread.c:209
 [<ffffffff81ee420f>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:529

Mutex 432766 is locked here:
 [<     inline     >] __raw_spin_lock include/linux/spinlock_api_smp.h:158
 [<ffffffff81ee37d0>] _raw_spin_lock+0x50/0x70 kernel/locking/spinlock.c:151
 [<     inline     >] rhashtable_rehash_one lib/rhashtable.c:186
 [<     inline     >] rhashtable_rehash_chain lib/rhashtable.c:213
 [<     inline     >] rhashtable_rehash_table lib/rhashtable.c:257
 [<ffffffff8156f79b>] rht_deferred_worker+0x36b/0x6d0 lib/rhashtable.c:373
 [<ffffffff810b1d6e>] process_one_work+0x47e/0x930 kernel/workqueue.c:2036
 [<ffffffff810b22d0>] worker_thread+0xb0/0x900 kernel/workqueue.c:2170
 [<ffffffff810bba40>] kthread+0x150/0x170 kernel/kthread.c:209
 [<ffffffff81ee420f>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:529
---
 lib/rhashtable.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index cc0c697..978624d 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -188,9 +188,12 @@ static int rhashtable_rehash_one(struct rhashtable *ht, unsigned int old_hash)
 				      new_tbl, new_hash);
 
 	if (rht_is_a_nulls(head))
-		INIT_RHT_NULLS_HEAD(entry->next, ht, new_hash);
-	else
-		RCU_INIT_POINTER(entry->next, head);
+		head = (struct rhash_head *)rht_marker(ht, new_hash);
+	/* We don't insert any new nodes that were not previously accessible
+	 * to readers, so we don't need to use rcu_assign_pointer().
+	 * But entry is being concurrently accessed by readers, so we need to
+	 * use at least WRITE_ONCE. */
+	WRITE_ONCE(entry->next, head);
 
 	rcu_assign_pointer(new_tbl->buckets[new_hash], entry);
 	spin_unlock(new_bucket_lock);
-- 
2.6.0.rc0.131.gf624c3d
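
For readers unfamiliar with the macro, the following is a minimal userspace sketch of what a WRITE_ONCE()-style store roughly amounts to: a store through a volatile-qualified lvalue. WRITE_ONCE_SKETCH, READ_ONCE_SKETCH, struct node and link_before are illustrative names, not kernel code; the real macros live in include/linux/compiler.h and handle more cases, and the sketch assumes a compiler with the GNU C __typeof__ extension.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified stand-ins for WRITE_ONCE()/READ_ONCE() (assumption: plain
 * volatile casts; the kernel macros are more elaborate).
 */
#define WRITE_ONCE_SKETCH(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE_SKETCH(x)       (*(volatile __typeof__(x) *)&(x))

/* Hypothetical list node, used only for this illustration. */
struct node {
	struct node *next;
	int key;
};

/* Point entry->next at head using one volatile store. */
static void link_before(struct node *entry, struct node *head)
{
	assert(entry != head);	/* a node must not be linked to itself */
	WRITE_ONCE_SKETCH(entry->next, head);
}
```

The volatile access asks the compiler to emit the store once, with the final value; it adds no memory ordering of its own.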
* Re: [PATCH] lib: fix data race in rhashtable_rehash_one
  2015-09-21  8:08 [PATCH] lib: fix data race in rhashtable_rehash_one Dmitry Vyukov
@ 2015-09-21 13:31 ` Eric Dumazet
  2015-09-21 14:51   ` Eric Dumazet
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2015-09-21 13:31 UTC
To: Dmitry Vyukov
Cc: tgraf, netdev, linux-kernel, kcc, andreyknvl, glider, ktsan, paulmck

On Mon, 2015-09-21 at 10:08 +0200, Dmitry Vyukov wrote:
> rhashtable_rehash_one() uses plain writes to update entry->next,
> while it is being concurrently accessed by readers.
> Unfortunately, the compiler is within its rights to (for example) use
> byte-at-a-time writes to update the pointer, which would fatally confuse
> concurrent readers.
>
> Use WRITE_ONCE to update entry->next in rhashtable_rehash_one().
>
> The data race was found with KernelThreadSanitizer (KTSAN).
>
> Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
> ---
> KTSAN report for the record:
>
> ThreadSanitizer: data-race in netlink_lookup
>
> Atomic read at 0xffff880480443bd0 of size 8 by thread 2747 on CPU 11:
>  [<     inline     >] rhashtable_lookup_fast include/linux/rhashtable.h:543
>  [<     inline     >] __netlink_lookup net/netlink/af_netlink.c:1026
>  [<ffffffff81bd9a84>] netlink_lookup+0x134/0x1c0 net/netlink/af_netlink.c:1046
>  [<     inline     >] netlink_getsockbyportid net/netlink/af_netlink.c:1616
>  [<ffffffff81bdc701>] netlink_unicast+0x111/0x300 net/netlink/af_netlink.c:1812
>  [<ffffffff81bdcdb9>] netlink_sendmsg+0x4c9/0x5f0 net/netlink/af_netlink.c:2443
>  [<     inline     >] sock_sendmsg_nosec net/socket.c:610
>  [<ffffffff81b5d6f3>] sock_sendmsg+0x83/0x90 net/socket.c:620
>  [<ffffffff81b5e59f>] ___sys_sendmsg+0x3cf/0x3e0 net/socket.c:1952
>  [<ffffffff81b5f6ac>] __sys_sendmsg+0x4c/0xb0 net/socket.c:1986
>  [<     inline     >] SYSC_sendmsg net/socket.c:1997
>  [<ffffffff81b5f740>] SyS_sendmsg+0x30/0x50 net/socket.c:1993
>  [<ffffffff81ee3e11>] entry_SYSCALL_64_fastpath+0x31/0x95
> arch/x86/entry/entry_64.S:188
>
> Previous write at 0xffff880480443bd0 of size 8 by thread 213 on CPU 4:
>  [<     inline     >] rhashtable_rehash_one lib/rhashtable.c:193
>  [<     inline     >] rhashtable_rehash_chain lib/rhashtable.c:213
>  [<     inline     >] rhashtable_rehash_table lib/rhashtable.c:257
>  [<ffffffff8156f7e0>] rht_deferred_worker+0x3b0/0x6d0 lib/rhashtable.c:373
>  [<ffffffff810b1d6e>] process_one_work+0x47e/0x930 kernel/workqueue.c:2036
>  [<ffffffff810b22d0>] worker_thread+0xb0/0x900 kernel/workqueue.c:2170
>  [<ffffffff810bba40>] kthread+0x150/0x170 kernel/kthread.c:209
>  [<ffffffff81ee420f>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:529
>
> Mutexes locked by thread 213:
> Mutex 217217 is locked here:
>  [<ffffffff81ee0407>] mutex_lock+0x57/0x70 kernel/locking/mutex.c:108
>  [<ffffffff8156f475>] rht_deferred_worker+0x45/0x6d0 lib/rhashtable.c:363
>  [<ffffffff810b1d6e>] process_one_work+0x47e/0x930 kernel/workqueue.c:2036
>  [<ffffffff810b22d0>] worker_thread+0xb0/0x900 kernel/workqueue.c:2170
>  [<ffffffff810bba40>] kthread+0x150/0x170 kernel/kthread.c:209
>  [<ffffffff81ee420f>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:529
>
> Mutex 431216 is locked here:
>  [<     inline     >] __raw_spin_lock_bh include/linux/spinlock_api_smp.h:149
>  [<ffffffff81ee3195>] _raw_spin_lock_bh+0x65/0x80 kernel/locking/spinlock.c:175
>  [<     inline     >] spin_lock_bh include/linux/spinlock.h:317
>  [<     inline     >] rhashtable_rehash_chain lib/rhashtable.c:212
>  [<     inline     >] rhashtable_rehash_table lib/rhashtable.c:257
>  [<ffffffff8156f616>] rht_deferred_worker+0x1e6/0x6d0 lib/rhashtable.c:373
>  [<ffffffff810b1d6e>] process_one_work+0x47e/0x930 kernel/workqueue.c:2036
>  [<ffffffff810b22d0>] worker_thread+0xb0/0x900 kernel/workqueue.c:2170
>  [<ffffffff810bba40>] kthread+0x150/0x170 kernel/kthread.c:209
>  [<ffffffff81ee420f>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:529
>
> Mutex 432766 is locked here:
>  [<     inline     >] __raw_spin_lock include/linux/spinlock_api_smp.h:158
>  [<ffffffff81ee37d0>] _raw_spin_lock+0x50/0x70 kernel/locking/spinlock.c:151
>  [<     inline     >] rhashtable_rehash_one lib/rhashtable.c:186
>  [<     inline     >] rhashtable_rehash_chain lib/rhashtable.c:213
>  [<     inline     >] rhashtable_rehash_table lib/rhashtable.c:257
>  [<ffffffff8156f79b>] rht_deferred_worker+0x36b/0x6d0 lib/rhashtable.c:373
>  [<ffffffff810b1d6e>] process_one_work+0x47e/0x930 kernel/workqueue.c:2036
>  [<ffffffff810b22d0>] worker_thread+0xb0/0x900 kernel/workqueue.c:2170
>  [<ffffffff810bba40>] kthread+0x150/0x170 kernel/kthread.c:209
>  [<ffffffff81ee420f>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:529
> ---
>  lib/rhashtable.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> index cc0c697..978624d 100644
> --- a/lib/rhashtable.c
> +++ b/lib/rhashtable.c
> @@ -188,9 +188,12 @@ static int rhashtable_rehash_one(struct rhashtable *ht, unsigned int old_hash)
>  				      new_tbl, new_hash);
>  
>  	if (rht_is_a_nulls(head))
> -		INIT_RHT_NULLS_HEAD(entry->next, ht, new_hash);
> -	else
> -		RCU_INIT_POINTER(entry->next, head);
> +		head = (struct rhash_head *)rht_marker(ht, new_hash);
> +	/* We don't insert any new nodes that were not previously accessible
> +	 * to readers, so we don't need to use rcu_assign_pointer().
> +	 * But entry is being concurrently accessed by readers, so we need to
> +	 * use at least WRITE_ONCE. */

This is bogus.

1) Linux is certainly not working if some arch or compiler is not doing
single word writes. WRITE_ONCE() would not help at all to enforce this.

2) If new node is not yet visible, we don't care if we write
entry->next using any kind of operation.

So the WRITE_ONCE() is not needed at all.

> +	WRITE_ONCE(entry->next, head);

The rcu_assign_pointer() immediately following is enough in this case.

We have hundreds of similar cases in the kernel.
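
The pattern point 2) refers to can be sketched in userspace with C11 atomics: a node that no reader can reach yet is filled in with plain stores, and a single release store then publishes it. This is only an analogy under stated assumptions: the release store stands in for rcu_assign_pointer(), a single writer is assumed (no locking shown), and struct node, bucket_head, publish_front and lookup are hypothetical names. It models a node that is genuinely new, which is the premise of point 2).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical list node for this illustration. */
struct node {
	struct node *next;
	int key;
};

/* Head of one chain; readers load it with acquire ordering. */
static _Atomic(struct node *) bucket_head;

/* Insert n at the front of the chain (single writer assumed). */
static void publish_front(struct node *n, int key)
{
	assert(n != NULL);
	/* n is still private here, so plain stores are sufficient. */
	n->key = key;
	n->next = atomic_load_explicit(&bucket_head, memory_order_relaxed);
	/* Release store: makes the fields above visible before n itself. */
	atomic_store_explicit(&bucket_head, n, memory_order_release);
}

/* Reader side: walk the chain starting from the published head. */
static struct node *lookup(int key)
{
	struct node *n = atomic_load_explicit(&bucket_head, memory_order_acquire);

	while (n != NULL && n->key != key)
		n = n->next;
	return n;
}
```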
* Re: [PATCH] lib: fix data race in rhashtable_rehash_one
  2015-09-21 13:31 ` Eric Dumazet
@ 2015-09-21 14:51   ` Eric Dumazet
  2015-09-21 15:10     ` Dmitry Vyukov
  2015-09-21 22:25     ` Thomas Graf
  0 siblings, 2 replies; 10+ messages in thread
From: Eric Dumazet @ 2015-09-21 14:51 UTC
To: Dmitry Vyukov
Cc: tgraf, netdev, linux-kernel, kcc, andreyknvl, glider, ktsan, paulmck

On Mon, 2015-09-21 at 06:31 -0700, Eric Dumazet wrote:
> On Mon, 2015-09-21 at 10:08 +0200, Dmitry Vyukov wrote:
> > rhashtable_rehash_one() uses plain writes to update entry->next,
> > while it is being concurrently accessed by readers.
> > Unfortunately, the compiler is within its rights to (for example) use
> > byte-at-a-time writes to update the pointer, which would fatally confuse
> > concurrent readers.
> >
> This is bogus.
>
> 1) Linux is certainly not working if some arch or compiler is not doing
> single word writes. WRITE_ONCE() would not help at all to enforce this.
>
> 2) If new node is not yet visible, we don't care if we write
> entry->next using any kind of operation.
>
> So the WRITE_ONCE() is not needed at all.
>
>
>
> > +	WRITE_ONCE(entry->next, head);
>
>
> The rcu_assign_pointer() immediately following is enough in this case.
>
> We have hundreds of similar cases in the kernel.
>
>

The changelog and comment are totally confusing.

Please remove the bogus parts in them, and/or rephrase.

The important part here is that we rehash an item, so we need to make
sure to maintain consistent ->next field, and need to prevent compiler
from using ->next as a temporary variable.

ptr->next = 1UL | ((base + offset) << 1);

Is dangerous because compiler could issue :

ptr->next = (base + offset);

ptr->next <<= 1;

ptr->next += 1UL;

Frankly, all this looks like an oversight in this code.

Not sure why the NULLS value is even recomputed.

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index cc0c69710dcf..0a29f07ba45a 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -187,10 +187,7 @@ static int rhashtable_rehash_one(struct rhashtable *ht, unsigned int old_hash)
 	head = rht_dereference_bucket(new_tbl->buckets[new_hash],
 				      new_tbl, new_hash);
 
-	if (rht_is_a_nulls(head))
-		INIT_RHT_NULLS_HEAD(entry->next, ht, new_hash);
-	else
-		RCU_INIT_POINTER(entry->next, head);
+	RCU_INIT_POINTER(entry->next, head);
 
 	rcu_assign_pointer(new_tbl->buckets[new_hash], entry);
 	spin_unlock(new_bucket_lock);
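
The hazard described above, a visible ->next being used as scratch space while the value is built up, can be avoided in a userspace sketch by computing the complete value in a local variable and then writing it with one volatile store. This is a minimal illustration, not the kernel's implementation: struct head_sketch, nulls_value and set_nulls_next are hypothetical names, and the encoding simply mirrors the expression quoted in the message.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in for a chain node; not the kernel's struct rhash_head. */
struct head_sketch {
	struct head_sketch *next;
};

/* Odd value carrying (base + offset) in the upper bits, as in the thread. */
static uintptr_t nulls_value(uintptr_t base, uintptr_t offset)
{
	return (uintptr_t)1 | ((base + offset) << 1);
}

/*
 * Build the whole value in a local first, then store it through a
 * volatile-qualified lvalue so the compiler is asked to write the final
 * value in one access instead of updating entry->next step by step.
 */
static void set_nulls_next(struct head_sketch *entry, uintptr_t base,
			   uintptr_t offset)
{
	uintptr_t v = nulls_value(base, offset);

	assert(v & 1);	/* nulls values are always odd */
	*(struct head_sketch *volatile *)&entry->next = (struct head_sketch *)v;
}
```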
* Re: [PATCH] lib: fix data race in rhashtable_rehash_one
  2015-09-21 14:51   ` Eric Dumazet
@ 2015-09-21 15:10     ` Dmitry Vyukov
  2015-09-21 15:15       ` Eric Dumazet
  2015-09-21 22:25     ` Thomas Graf
  1 sibling, 1 reply; 10+ messages in thread
From: Dmitry Vyukov @ 2015-09-21 15:10 UTC
To: Eric Dumazet
Cc: tgraf, netdev, LKML, Kostya Serebryany, Andrey Konovalov, Alexander Potapenko, ktsan, Paul McKenney

On Mon, Sep 21, 2015 at 4:51 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Mon, 2015-09-21 at 06:31 -0700, Eric Dumazet wrote:
>> On Mon, 2015-09-21 at 10:08 +0200, Dmitry Vyukov wrote:
>> > rhashtable_rehash_one() uses plain writes to update entry->next,
>> > while it is being concurrently accessed by readers.
>> > Unfortunately, the compiler is within its rights to (for example) use
>> > byte-at-a-time writes to update the pointer, which would fatally confuse
>> > concurrent readers.
>> >
>> This is bogus.
>>
>> 1) Linux is certainly not working if some arch or compiler is not doing
>> single word writes. WRITE_ONCE() would not help at all to enforce this.
>>
>> 2) If new node is not yet visible, we don't care if we write
>> entry->next using any kind of operation.
>>
>> So the WRITE_ONCE() is not needed at all.
>>
>>
>>
>> > +	WRITE_ONCE(entry->next, head);
>>
>>
>> The rcu_assign_pointer() immediately following is enough in this case.
>>
>> We have hundreds of similar cases in the kernel.
>>
>>
>
> The changelog and comment are totally confusing.
>
> Please remove the bogus parts in them, and/or rephrase.
>
> The important part here is that we rehash an item, so we need to make
> sure to maintain consistent ->next field, and need to prevent compiler
> from using ->next as a temporary variable.
>
> ptr->next = 1UL | ((base + offset) << 1);
>
> Is dangerous because compiler could issue :
>
> ptr->next = (base + offset);
>
> ptr->next <<= 1;
>
> ptr->next += 1UL;
>
> Frankly, all this looks like an oversight in this code.
>
> Not sure why the NULLS value is even recomputed.

I have not looked in detail yet, but the NULLS recomputation uses
new_hash, which obviously wasn't available when the value was
previously computed. Don't know yet whether it is important or not.

>
> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> index cc0c69710dcf..0a29f07ba45a 100644
> --- a/lib/rhashtable.c
> +++ b/lib/rhashtable.c
> @@ -187,10 +187,7 @@ static int rhashtable_rehash_one(struct rhashtable *ht, unsigned int old_hash)
>  	head = rht_dereference_bucket(new_tbl->buckets[new_hash],
>  				      new_tbl, new_hash);
>  
> -	if (rht_is_a_nulls(head))
> -		INIT_RHT_NULLS_HEAD(entry->next, ht, new_hash);
> -	else
> -		RCU_INIT_POINTER(entry->next, head);
> +	RCU_INIT_POINTER(entry->next, head);
>  
>  	rcu_assign_pointer(new_tbl->buckets[new_hash], entry);
>  	spin_unlock(new_bucket_lock);

-- 
Dmitry Vyukov, Software Engineer, dvyukov@google.com
* Re: [PATCH] lib: fix data race in rhashtable_rehash_one
  2015-09-21 15:10     ` Dmitry Vyukov
@ 2015-09-21 15:15       ` Eric Dumazet
  0 siblings, 0 replies; 10+ messages in thread
From: Eric Dumazet @ 2015-09-21 15:15 UTC
To: Dmitry Vyukov
Cc: tgraf, netdev, LKML, Kostya Serebryany, Andrey Konovalov, Alexander Potapenko, ktsan, Paul McKenney

On Mon, 2015-09-21 at 17:10 +0200, Dmitry Vyukov wrote:
> On Mon, Sep 21, 2015 at 4:51 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Mon, 2015-09-21 at 06:31 -0700, Eric Dumazet wrote:
> >> On Mon, 2015-09-21 at 10:08 +0200, Dmitry Vyukov wrote:
> >> > rhashtable_rehash_one() uses plain writes to update entry->next,
> >> > while it is being concurrently accessed by readers.
> >> > Unfortunately, the compiler is within its rights to (for example) use
> >> > byte-at-a-time writes to update the pointer, which would fatally confuse
> >> > concurrent readers.
> >> >
> >> This is bogus.
> >>
> >> 1) Linux is certainly not working if some arch or compiler is not doing
> >> single word writes. WRITE_ONCE() would not help at all to enforce this.
> >>
> >> 2) If new node is not yet visible, we don't care if we write
> >> entry->next using any kind of operation.
> >>
> >> So the WRITE_ONCE() is not needed at all.
> >>
> >>
> >>
> >> > +	WRITE_ONCE(entry->next, head);
> >>
> >>
> >> The rcu_assign_pointer() immediately following is enough in this case.
> >>
> >> We have hundreds of similar cases in the kernel.
> >>
> >>
> >
> > The changelog and comment are totally confusing.
> >
> > Please remove the bogus parts in them, and/or rephrase.
> >
> > The important part here is that we rehash an item, so we need to make
> > sure to maintain consistent ->next field, and need to prevent compiler
> > from using ->next as a temporary variable.
> >
> > ptr->next = 1UL | ((base + offset) << 1);
> >
> > Is dangerous because compiler could issue :
> >
> > ptr->next = (base + offset);
> >
> > ptr->next <<= 1;
> >
> > ptr->next += 1UL;
> >
> > Frankly, all this looks like an oversight in this code.
> >
> > Not sure why the NULLS value is even recomputed.
>
> I have not looked in detail yet, but the NULLS recomputation uses
> new_hash, which obviously wasn't available when the value was
> previously computed. Don't know yet whether it is important or not.

Well, head already contains the right value, set in bucket_table_alloc()

for (i = 0; i < nbuckets; i++)
	INIT_RHT_NULLS_HEAD(tbl->buckets[i], ht, i);

Think of this nulls value as a special NULL pointer.

If hash table is properly allocated/initialized, all the chains are
correctly ending with a proper NULL pointer.
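
A small single-threaded model may help show why the recomputation is redundant: if every bucket is initialised with its own nulls marker at allocation time, then the head read from an empty bucket is already the correct terminator, and front insertion can copy it unchanged. Everything here is a hedged sketch; struct table_sketch, marker, is_a_nulls, table_alloc, insert_front and table_free are illustrative names, the marker encoding is an assumption modelled on the expression quoted earlier in the thread, and no locking or RCU is represented.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct head_sketch {
	struct head_sketch *next;
};

struct table_sketch {
	size_t nbuckets;
	uintptr_t nulls_base;
	struct head_sketch **buckets;
};

/* Odd end-of-chain value that encodes the bucket index (assumed encoding). */
static uintptr_t marker(const struct table_sketch *t, size_t hash)
{
	return (uintptr_t)1 | ((t->nulls_base + hash) << 1);
}

/* Real node pointers are aligned, so a set low bit identifies a marker. */
static int is_a_nulls(const struct head_sketch *p)
{
	return (int)((uintptr_t)p & 1);
}

/* Allocate a table whose empty buckets each hold their own marker. */
static struct table_sketch *table_alloc(size_t nbuckets, uintptr_t nulls_base)
{
	struct table_sketch *t = malloc(sizeof(*t));
	size_t i;

	if (!t)
		return NULL;
	t->buckets = malloc(nbuckets * sizeof(*t->buckets));
	if (!t->buckets) {
		free(t);
		return NULL;
	}
	t->nbuckets = nbuckets;
	t->nulls_base = nulls_base;
	for (i = 0; i < nbuckets; i++)
		t->buckets[i] = (struct head_sketch *)marker(t, i);
	return t;
}

/* Front insertion: entry inherits the old head, marker or node alike. */
static void insert_front(struct table_sketch *t, size_t hash,
			 struct head_sketch *entry)
{
	struct head_sketch *head = t->buckets[hash];

	/* Sanity check: an empty bucket already holds the right marker. */
	if (is_a_nulls(head))
		assert((uintptr_t)head == marker(t, hash));
	entry->next = head;
	t->buckets[hash] = entry;
}

static void table_free(struct table_sketch *t)
{
	free(t->buckets);
	free(t);
}
```

In this model the first node inserted into bucket 3 ends up with next equal to marker(t, 3) without any special case in insert_front.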
* Re: [PATCH] lib: fix data race in rhashtable_rehash_one
  2015-09-21 14:51   ` Eric Dumazet
  2015-09-21 15:10     ` Dmitry Vyukov
@ 2015-09-21 22:25     ` Thomas Graf
  2015-09-21 23:03       ` Eric Dumazet
  1 sibling, 1 reply; 10+ messages in thread
From: Thomas Graf @ 2015-09-21 22:25 UTC
To: Eric Dumazet
Cc: Dmitry Vyukov, netdev, linux-kernel, kcc, andreyknvl, glider, ktsan, paulmck

On 09/21/15 at 07:51am, Eric Dumazet wrote:
> The important part here is that we rehash an item, so we need to make
> sure to maintain consistent ->next field, and need to prevent compiler
> from using ->next as a temporary variable.
>
> ptr->next = 1UL | ((base + offset) << 1);
>
> Is dangerous because compiler could issue :
>
> ptr->next = (base + offset);
>
> ptr->next <<= 1;
>
> ptr->next += 1UL;
>
> Frankly, all this looks like an oversight in this code.
>
> Not sure why the NULLS value is even recomputed.

The hash of the chain is part of the NULLS value. Since the
entry might have been moved to a different chain, the NULLS
value must be recalculated to contain the proper hash.

However, nobody is using the hash today as far as I can
see so we could as well just remove it and use the base
value only for the nulls marker.
* Re: [PATCH] lib: fix data race in rhashtable_rehash_one
  2015-09-21 22:25     ` Thomas Graf
@ 2015-09-21 23:03       ` Eric Dumazet
  2015-09-22  8:19         ` Thomas Graf
  2015-09-22 15:18         ` Herbert Xu
  0 siblings, 2 replies; 10+ messages in thread
From: Eric Dumazet @ 2015-09-21 23:03 UTC
To: Thomas Graf
Cc: Dmitry Vyukov, netdev, linux-kernel, kcc, andreyknvl, glider, ktsan, paulmck

On Tue, 2015-09-22 at 00:25 +0200, Thomas Graf wrote:
> On 09/21/15 at 07:51am, Eric Dumazet wrote:
> > The important part here is that we rehash an item, so we need to make
> > sure to maintain consistent ->next field, and need to prevent compiler
> > from using ->next as a temporary variable.
> >
> > ptr->next = 1UL | ((base + offset) << 1);
> >
> > Is dangerous because compiler could issue :
> >
> > ptr->next = (base + offset);
> >
> > ptr->next <<= 1;
> >
> > ptr->next += 1UL;
> >
> > Frankly, all this looks like an oversight in this code.
> >
> > Not sure why the NULLS value is even recomputed.
>
> The hash of the chain is part of the NULLS value. Since the
> entry might have been moved to a different chain, the NULLS
> value must be recalculated to contain the proper hash.
>
> However, nobody is using the hash today as far as I can
> see so we could as well just remove it and use the base
> value only for the nulls marker.

What I said is :

In @head you already have the correct nulls value, from hash table.

You do not need to recompute this value, and/or test if hash table chain
is empty.

If hash bucket is empty, it contains the appropriate NULLS value.

If you are paranoiac add this debugging check :

if (rht_is_a_nulls(head))
	BUG_ON(head != (struct rhash_head *)rht_marker(ht, new_hash));


Therefore, simply fix the bug and unnecessary code with :

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index cc0c69710dcf..a54ff8949f91 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -187,10 +187,7 @@ static int rhashtable_rehash_one(struct rhashtable *ht, unsigned int old_hash)
 	head = rht_dereference_bucket(new_tbl->buckets[new_hash],
 				      new_tbl, new_hash);
 
-	if (rht_is_a_nulls(head))
-		INIT_RHT_NULLS_HEAD(entry->next, ht, new_hash);
-	else
-		RCU_INIT_POINTER(entry->next, head);
+	RCU_INIT_POINTER(entry->next, head);
 
 	rcu_assign_pointer(new_tbl->buckets[new_hash], entry);
 	spin_unlock(new_bucket_lock);
* Re: [PATCH] lib: fix data race in rhashtable_rehash_one
  2015-09-21 23:03       ` Eric Dumazet
@ 2015-09-22  8:19         ` Thomas Graf
  2015-09-22  8:52           ` Dmitry Vyukov
  2015-09-22 15:18         ` Herbert Xu
  1 sibling, 1 reply; 10+ messages in thread
From: Thomas Graf @ 2015-09-22  8:19 UTC
To: Eric Dumazet
Cc: Dmitry Vyukov, netdev, linux-kernel, kcc, andreyknvl, glider, ktsan, paulmck

On 09/21/15 at 04:03pm, Eric Dumazet wrote:
> What I said is :
>
> In @head you already have the correct nulls value, from hash table.
>
> You do not need to recompute this value, and/or test if hash table chain
> is empty.
>
> If hash bucket is empty, it contains the appropriate NULLS value.
>
> If you are paranoiac add this debugging check :
>
> if (rht_is_a_nulls(head))
> 	BUG_ON(head != (struct rhash_head *)rht_marker(ht, new_hash));
>
>
> Therefore, simply fix the bug and unnecessary code with :

You are absolutely right Eric. Do you want to revise your patch Dmitry?
Eric's proposed fix is absolutely the best way to fix this.
* Re: [PATCH] lib: fix data race in rhashtable_rehash_one
  2015-09-22  8:19         ` Thomas Graf
@ 2015-09-22  8:52           ` Dmitry Vyukov
  0 siblings, 0 replies; 10+ messages in thread
From: Dmitry Vyukov @ 2015-09-22  8:52 UTC
To: Thomas Graf
Cc: Eric Dumazet, netdev, LKML, Kostya Serebryany, Andrey Konovalov, Alexander Potapenko, ktsan, Paul McKenney

On Tue, Sep 22, 2015 at 10:19 AM, Thomas Graf <tgraf@suug.ch> wrote:
> On 09/21/15 at 04:03pm, Eric Dumazet wrote:
>> What I said is :
>>
>> In @head you already have the correct nulls value, from hash table.
>>
>> You do not need to recompute this value, and/or test if hash table chain
>> is empty.
>>
>> If hash bucket is empty, it contains the appropriate NULLS value.
>>
>> If you are paranoiac add this debugging check :
>>
>> if (rht_is_a_nulls(head))
>> 	BUG_ON(head != (struct rhash_head *)rht_marker(ht, new_hash));
>>
>>
>> Therefore, simply fix the bug and unnecessary code with :
>
> You are absolutely right Eric. Do you want to revise your patch Dmitry?
> Eric's proposed fix is absolutely the best way to fix this.

Mailed v2 of the patch.

-- 
Dmitry Vyukov, Software Engineer, dvyukov@google.com
* Re: [PATCH] lib: fix data race in rhashtable_rehash_one
  2015-09-21 23:03       ` Eric Dumazet
  2015-09-22  8:19         ` Thomas Graf
@ 2015-09-22 15:18         ` Herbert Xu
  1 sibling, 0 replies; 10+ messages in thread
From: Herbert Xu @ 2015-09-22 15:18 UTC
To: Eric Dumazet
Cc: tgraf, dvyukov, netdev, linux-kernel, kcc, andreyknvl, glider, ktsan, paulmck

Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> What I said is :
>
> In @head you already have the correct nulls value, from hash table.
>
> You do not need to recompute this value, and/or test if hash table chain
> is empty.
>
> If hash bucket is empty, it contains the appropriate NULLS value.
>
> If you are paranoiac add this debugging check :
>
> if (rht_is_a_nulls(head))
> 	BUG_ON(head != (struct rhash_head *)rht_marker(ht, new_hash));
>
>
> Therefore, simply fix the bug and unnecessary code with :

Ack.  I remember seeing this when I was working on it but
never got around to removing this bogosity.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
End of thread (~2015-09-22 15:19 UTC)

Thread overview: 10+ messages
2015-09-21  8:08 [PATCH] lib: fix data race in rhashtable_rehash_one Dmitry Vyukov
2015-09-21 13:31 ` Eric Dumazet
2015-09-21 14:51   ` Eric Dumazet
2015-09-21 15:10     ` Dmitry Vyukov
2015-09-21 15:15       ` Eric Dumazet
2015-09-21 22:25     ` Thomas Graf
2015-09-21 23:03       ` Eric Dumazet
2015-09-22  8:19         ` Thomas Graf
2015-09-22  8:52           ` Dmitry Vyukov
2015-09-22 15:18         ` Herbert Xu