From: "Paul E. McKenney" <paulmck@kernel.org>
To: Alan Huang <mmpgouride@gmail.com>
Cc: frederic@kernel.org, quic_neeraju@quicinc.com,
	joel@joelfernandes.org, josh@joshtriplett.org,
	boqun.feng@gmail.com, corbet@lwn.net, rcu@vger.kernel.org,
	linux-doc@vger.kernel.org
Subject: Re: [PATCH v2] docs/RCU: Bring back smp_wmb()
Date: Fri, 14 Jul 2023 16:23:06 -0700	[thread overview]
Message-ID: <9eaf506f-cc14-4da6-9efc-057c0c3e56b0@paulmck-laptop> (raw)
In-Reply-To: <20230711150906.466434-1-mmpgouride@gmail.com>

On Tue, Jul 11, 2023 at 03:09:06PM +0000, Alan Huang wrote:
> The objects are allocated with SLAB_TYPESAFE_BY_RCU, and there is an
> "n->next = first" within hlist_add_head_rcu(), before the
> rcu_assign_pointer(), which modifies obj->obj_node.next. There may be
> readers holding a reference to obj in lockless_lookup(), and when the
> updater modifies ->next, those readers can see the change immediately
> because of SLAB_TYPESAFE_BY_RCU.
> 
> There are two memory orderings required in the insertion algorithm:
> we need to make sure obj->key is updated before both obj->obj_node.next
> and obj->refcnt, and atomic_set_release() is not enough to provide
> the required memory barriers.
> 
> Signed-off-by: Alan Huang <mmpgouride@gmail.com>

This is an interesting one!!!

Now I am having a hard time believing that the smp_rmb() suffices.

> ---
> Changelog:
>   v1 -> v2: Use _ONCE to protect obj->key.
> 
>  Documentation/RCU/rculist_nulls.rst | 21 +++++++++++++--------
>  1 file changed, 13 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/RCU/rculist_nulls.rst b/Documentation/RCU/rculist_nulls.rst
> index 21e40fcc08de..2a9f5a63d334 100644
> --- a/Documentation/RCU/rculist_nulls.rst
> +++ b/Documentation/RCU/rculist_nulls.rst
> @@ -47,7 +47,7 @@ objects, which is having below type.
>      * reuse these object before the RCU grace period, we
>      * must check key after getting the reference on object
>      */
> -    if (obj->key != key) { // not the object we expected
> +    if (READ_ONCE(obj->key) != key) { // not the object we expected
>        put_ref(obj);
>        rcu_read_unlock();
>        goto begin;
> @@ -64,10 +64,10 @@ but a version with an additional memory barrier (smp_rmb())
>    {
>      struct hlist_node *node, *next;
>      for (pos = rcu_dereference((head)->first);
> -         pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) &&
> +         pos && ({ next = READ_ONCE(pos->next); smp_rmb(); prefetch(next); 1; }) &&

Suppose that lockless_lookup() is delayed just before fetching pos->next,
and that there were 17 more nodes to search in the list.

Then consider the following sequence of events:

o	The updater deletes this same node and kmem_cache_free()s it.

o	Another updater kmem_cache_alloc()s that same memory and
	inserts it into an empty hash chain with a different key.

o	Then lockless_lookup() fetches pos->next and sees a NULL pointer,
	thus failing to search the remaining 17 nodes in the list,
	one of which had the desired key value.

o	The lookup algorithm resumes and sees the NULL return from
	lockless_lookup(), and ends up with a NULL obj.

	And this happens even with the strongest possible ordering
	everywhere.

OK, yes, it is late on Friday.  So what am I missing here?

Independent of that, does hlist_add_head_rcu() need to replace its
"n->next = first" with "WRITE_ONCE(n->next, first)"?

						Thanx, Paul

>           ({ obj = hlist_entry(pos, typeof(*obj), obj_node); 1; });
>           pos = rcu_dereference(next))
> -      if (obj->key == key)
> +      if (READ_ONCE(obj->key) == key)
>          return obj;
>      return NULL;
>    }
> @@ -111,8 +111,13 @@ detect the fact that it missed following items in original chain.
>     */
>    obj = kmem_cache_alloc(...);
>    lock_chain(); // typically a spin_lock()
> -  obj->key = key;
> -  atomic_set_release(&obj->refcnt, 1); // key before refcnt
> +  WRITE_ONCE(obj->key, key);
> +  /*
> +   * We need to make sure obj->key is updated before obj->obj_node.next
> +   * and obj->refcnt.
> +   */
> +  smp_wmb();
> +  atomic_set(&obj->refcnt, 1);
>    hlist_add_head_rcu(&obj->obj_node, list);
>    unlock_chain(); // typically a spin_unlock()
>  
> @@ -165,12 +170,12 @@ Note that using hlist_nulls means the type of 'obj_node' field of
>    begin:
>    rcu_read_lock();
>    hlist_nulls_for_each_entry_rcu(obj, node, head, obj_node) {
> -    if (obj->key == key) {
> +    if (READ_ONCE(obj->key) == key) {
>        if (!try_get_ref(obj)) { // might fail for free objects
>  	rcu_read_unlock();
>          goto begin;
>        }
> -      if (obj->key != key) { // not the object we expected
> +      if (READ_ONCE(obj->key) != key) { // not the object we expected
>          put_ref(obj);
>  	rcu_read_unlock();
>          goto begin;
> @@ -206,7 +211,7 @@ hlist_add_head_rcu().
>     */
>    obj = kmem_cache_alloc(cachep);
>    lock_chain(); // typically a spin_lock()
> -  obj->key = key;
> +  WRITE_ONCE(obj->key, key);
>    atomic_set_release(&obj->refcnt, 1); // key before refcnt
>    /*
>     * insert obj in RCU way (readers might be traversing chain)
> -- 
> 2.34.1
> 
