From: Eric Dumazet <eric.dumazet@gmail.com>
To: Dmitry Vyukov <dvyukov@google.com>
Cc: tgraf@suug.ch, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, kcc@google.com,
andreyknvl@google.com, glider@google.com, ktsan@googlegroups.com,
paulmck@linux.vnet.ibm.com
Subject: Re: [PATCH] lib: fix data race in rhashtable_rehash_one
Date: Mon, 21 Sep 2015 07:51:48 -0700
Message-ID: <1442847108.29850.56.camel@edumazet-glaptop2.roam.corp.google.com>
In-Reply-To: <1442842315.29850.44.camel@edumazet-glaptop2.roam.corp.google.com>

On Mon, 2015-09-21 at 06:31 -0700, Eric Dumazet wrote:
> On Mon, 2015-09-21 at 10:08 +0200, Dmitry Vyukov wrote:
> > rhashtable_rehash_one() uses plain writes to update entry->next,
> > while it is being concurrently accessed by readers.
> > Unfortunately, the compiler is within its rights to (for example) use
> > byte-at-a-time writes to update the pointer, which would fatally confuse
> > concurrent readers.
> >
> This is bogus.
>
> 1) Linux is certainly not working if some arch or compiler is not doing
> single word writes. WRITE_ONCE() would not help at all to enforce this.
>
> 2) If new node is not yet visible, we don't care if we write
> entry->next using any kind of operation.
>
> So the WRITE_ONCE() is not needed at all.
>
>
>
> > + WRITE_ONCE(entry->next, head);
>
>
> The rcu_assign_pointer() immediately following is enough in this case.
>
> We have hundreds of similar cases in the kernel.
>
>
The changelog and comment are totally confusing.

Please remove the bogus parts in them, and/or rephrase.

The important part here is that we rehash an item, so we need to
maintain a consistent ->next field and prevent the compiler from
using ->next as a temporary variable.

	ptr->next = 1UL | ((base + offset) << 1);

is dangerous because the compiler could issue:

	ptr->next = (base + offset);
	ptr->next <<= 1;
	ptr->next += 1UL;

Frankly, all this looks like an oversight in this code.

I am not sure why the NULLS value is even recomputed.

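To illustrate the point about not using ->next as a scratch variable,
here is a minimal userspace sketch in C. The names struct node,
store_once(), set_nulls() and is_a_nulls() are made up for illustration
and are not the kernel's helpers. The volatile store only approximates
what WRITE_ONCE() is meant to do, under the assumption of an aligned,
word-sized field.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a hash chain node (not struct rhash_head). */
struct node {
	uintptr_t next;	/* a pointer, or a "nulls" marker when bit 0 is set */
};

/*
 * Userspace approximation of a WRITE_ONCE()-style store: the volatile
 * access asks the compiler to emit the store of the final value as
 * written, instead of building the value in place.
 */
static inline void store_once(uintptr_t *p, uintptr_t v)
{
	*(volatile uintptr_t *)p = v;
}

static inline int is_a_nulls(uintptr_t v)
{
	return (int)(v & 1);
}

/*
 * Compute the marker in a local variable first, then publish it with a
 * single store, so a concurrent reader of ->next sees either the old
 * value or the complete new one.
 */
static inline void set_nulls(struct node *n, uintptr_t base, uintptr_t offset)
{
	uintptr_t v = (uintptr_t)1 | ((base + offset) << 1);

	assert(is_a_nulls(v));	/* bit 0 is always set by construction */
	store_once(&n->next, v);
}
```

For example, set_nulls(&n, 4, 3) leaves n.next equal to 15, and
is_a_nulls(n.next) is true. The intermediate values 7 and 14 from the
three-step sequence are never stored to n.next by this code.
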
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index cc0c69710dcf..0a29f07ba45a 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -187,10 +187,7 @@ static int rhashtable_rehash_one(struct rhashtable *ht, unsigned int old_hash)
 	head = rht_dereference_bucket(new_tbl->buckets[new_hash],
 				      new_tbl, new_hash);
 
-	if (rht_is_a_nulls(head))
-		INIT_RHT_NULLS_HEAD(entry->next, ht, new_hash);
-	else
-		RCU_INIT_POINTER(entry->next, head);
+	RCU_INIT_POINTER(entry->next, head);
 
 	rcu_assign_pointer(new_tbl->buckets[new_hash], entry);
 	spin_unlock(new_bucket_lock);
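
The store sequence in the hunk above (set entry->next, then publish the
entry as the new bucket head) can be sketched in userspace with C11
atomics. This is only an analogy under stated assumptions: struct entry,
struct bucket and the helper names are hypothetical, the relaxed store
stands in for a single untorn store to ->next, the release store stands
in for rcu_assign_pointer(), and the acquire load is a stronger
substitute for rcu_dereference(). Locking is left out, and the writer is
assumed to hold the bucket lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical chain entry and bucket head, for illustration only. */
struct entry {
	_Atomic(struct entry *) next;
	int key;
};

struct bucket {
	_Atomic(struct entry *) head;
};

static void bucket_init(struct bucket *b)
{
	atomic_init(&b->head, NULL);
}

/* Writer side; the caller is assumed to hold the bucket lock. */
static void bucket_insert_head(struct bucket *b, struct entry *e)
{
	struct entry *head = atomic_load_explicit(&b->head,
						  memory_order_relaxed);

	/*
	 * One store of the final value. When an entry is being moved, it
	 * may still be reachable through its old chain, so ->next should
	 * go from one valid value to another without intermediate states.
	 */
	atomic_store_explicit(&e->next, head, memory_order_relaxed);

	/* Publish: orders the ->next store before the new head is visible. */
	atomic_store_explicit(&b->head, e, memory_order_release);
}

/* Reader side. */
static struct entry *bucket_first(struct bucket *b)
{
	return atomic_load_explicit(&b->head, memory_order_acquire);
}

static struct entry *entry_next(struct entry *e)
{
	return atomic_load_explicit(&e->next, memory_order_acquire);
}
```

A reader that obtains an entry from bucket_first() and then follows
entry_next() sees the ->next value stored before the entry was
published as the bucket head.
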