All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Roman Gushchin <klamm@yandex-team.ru>,
	Dipankar Sarma <dipankar@in.ibm.com>,
	zhmurov@yandex-team.ru, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
	Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
	James Morris <jmorris@namei.org>,
	Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
	Patrick McHardy <kaber@trash.net>
Subject: Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro
Date: Wed, 22 May 2013 02:58:39 -0700	[thread overview]
Message-ID: <20130522095839.GC3578@linux.vnet.ibm.com> (raw)
In-Reply-To: <1369188080.3301.268.camel@edumazet-glaptop>

On Tue, May 21, 2013 at 07:01:20PM -0700, Eric Dumazet wrote:
> On Tue, 2013-05-21 at 22:12 +0400, Roman Gushchin wrote:
> > > If other rcu accessors have the same problem, a more complete patch is
> > > needed.
> > 
> > [PATCH] rcu: fix a race in rcu lists traverse macros
> > 
> > Some network functions (udp4_lib_lookup2(), for instance) use the
> > rcu lists traverse macros (hlist_nulls_for_each_entry_rcu, for instance)
> > in a way that assumes restarting of a loop. In this case, it is strictly
> > necessary to reread the head->first value from the memory before each scan.
> > Without additional hints, gcc caches this value in a register. In this case,
> > if a cached node is moved to another chain during the scan, we can loop
> > forever getting wrong nulls values and restarting the loop uninterruptedly.
> > 
> > Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
> > Reported-by: Boris Zhmurov <zhmurov@yandex-team.ru>
> > ---
> >   include/linux/compiler.h      | 6 ++++++
> >   include/linux/rculist.h       | 6 ++++--
> >   include/linux/rculist_nulls.h | 3 ++-
> >   3 files changed, 12 insertions(+), 3 deletions(-)
> > 
> > diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> > index 92669cd..0e05d7c 100644
> > --- a/include/linux/compiler.h
> > +++ b/include/linux/compiler.h
> > @@ -351,6 +351,12 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
> >    */
> >   #define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))
> > 
> > +/*
> > + * Prevent the compiler from optimizing accesses to structure member.
> > + */
> > +#define GET_STRUCT_FIELD_VOLATILE(struct_ptr, field)		\
> > +	(((volatile typeof(*struct_ptr) *)struct_ptr)->field)
> > +
> >   /* Ignore/forbid kprobes attach on very low level functions marked by this attribute: */
> >   #ifdef CONFIG_KPROBES
> >   # define __kprobes	__attribute__((__section__(".kprobes.text")))
> > diff --git a/include/linux/rculist.h b/include/linux/rculist.h
> > index 8089e35..6c3eea7 100644
> > --- a/include/linux/rculist.h
> > +++ b/include/linux/rculist.h
> > @@ -282,7 +282,8 @@ static inline void list_splice_init_rcu(struct list_head *list,
> >    * as long as the traversal is guarded by rcu_read_lock().
> >    */
> >   #define list_for_each_entry_rcu(pos, head, member) \
> > -	for (pos = list_entry_rcu((head)->next, typeof(*pos), member); \
> > +	for (pos = list_entry_rcu(GET_STRUCT_FIELD_VOLATILE(head, next), \
> > +				  typeof(*pos), member); \
> >   		&pos->member != (head); \
> >   		pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
> > 
> > @@ -348,7 +349,8 @@ static inline void hlist_replace_rcu(struct hlist_node *old,
> >   /*
> >    * return the first or the next element in an RCU protected hlist
> >    */
> > -#define hlist_first_rcu(head)	(*((struct hlist_node __rcu **)(&(head)->first)))
> > +#define hlist_first_rcu(head)	(*((struct hlist_node __rcu **)		\
> > +				   (&GET_STRUCT_FIELD_VOLATILE(head, first))))
> >   #define hlist_next_rcu(node)	(*((struct hlist_node __rcu **)(&(node)->next)))
> >   #define hlist_pprev_rcu(node)	(*((struct hlist_node __rcu **)((node)->pprev)))
> > 
> > diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h
> > index 2ae1371..b04fd0a 100644
> > --- a/include/linux/rculist_nulls.h
> > +++ b/include/linux/rculist_nulls.h
> > @@ -38,7 +38,8 @@ static inline void hlist_nulls_del_init_rcu(struct hlist_nulls_node *n)
> >   }
> > 
> >   #define hlist_nulls_first_rcu(head) \
> > -	(*((struct hlist_nulls_node __rcu __force **)&(head)->first))
> > +	(*((struct hlist_nulls_node __rcu __force **)	\
> > +	   &(GET_STRUCT_FIELD_VOLATILE(head, first))))
> > 
> >   #define hlist_nulls_next_rcu(node) \
> >   	(*((struct hlist_nulls_node __rcu __force **)&(node)->next))

Now that I am more awake...

The RCU list macros assume that the list header is either statically
allocated (in which case no ACCESS_ONCE() or whatever is needed) or
that the caller did whatever was necessary to protect the list header,
whether that be holding the right lock, using rcu_dereference() when
traversing the pointer to the list header, or whatever.

Maybe I need to document this assumption...

						Thanx, Paul

> Please use ACCESS_ONCE(), which is the standard way to deal with this,
> and remove the rcu_dereference_raw() in 
> hlist_nulls_for_each_entry_rcu()
> 
> something like : (for the nulls part only)
> 
> diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h
> index 2ae1371..1d485d3 100644
> --- a/include/linux/rculist_nulls.h
> +++ b/include/linux/rculist_nulls.h
> @@ -40,6 +40,9 @@ static inline void hlist_nulls_del_init_rcu(struct hlist_nulls_node *n)
>  #define hlist_nulls_first_rcu(head) \
>  	(*((struct hlist_nulls_node __rcu __force **)&(head)->first))
> 
> +#define hlist_nulls_first_rcu_once(head) \
> +	ACCESS_ONCE(*((struct hlist_nulls_node __rcu __force **)&(head)->first))
> +
>  #define hlist_nulls_next_rcu(node) \
>  	(*((struct hlist_nulls_node __rcu __force **)&(node)->next))
> 
> @@ -107,7 +110,7 @@ static inline void hlist_nulls_add_head_rcu(struct hlist_nulls_node *n,
>   *
>   */
>  #define hlist_nulls_for_each_entry_rcu(tpos, pos, head, member)                        \
> -	for (pos = rcu_dereference_raw(hlist_nulls_first_rcu(head));            \
> +	for (pos = hlist_nulls_first_rcu_once(head);                            \
>  		(!is_a_nulls(pos)) &&                                           \
>  		({ tpos = hlist_nulls_entry(pos, typeof(*tpos), member); 1; }); \
>                 pos = rcu_dereference_raw(hlist_nulls_next_rcu(pos)))
> 
> 
> 


  parent reply	other threads:[~2013-05-22 10:25 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-21  9:05 [PATCH] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro Roman Gushchin
2013-05-21 10:40 ` David Laight
2013-05-21 11:55   ` Roman Gushchin
2013-05-21 13:42     ` David Laight
2013-05-21 12:09 ` Paul E. McKenney
2013-05-21 12:46   ` Roman Gushchin
2013-05-21 12:58     ` Paul E. McKenney
2013-05-21 13:37   ` Eric Dumazet
2013-05-21 13:44   ` Eric Dumazet
2013-05-21 14:47     ` Roman Gushchin
2013-05-21 15:16       ` Eric Dumazet
2013-05-21 15:51         ` Roman Gushchin
2013-05-21 15:38       ` Eric Dumazet
2013-05-21 15:51         ` Roman Gushchin
2013-05-21 18:12         ` [PATCH v2] " Roman Gushchin
2013-05-22  2:01           ` Eric Dumazet
2013-05-22  5:49             ` Eric Dumazet
2013-05-22 11:58               ` Roman Gushchin
2013-05-22 12:30                 ` Eric Dumazet
2013-05-22 13:07                   ` Roman Gushchin
2013-05-22 17:45                     ` Paul E. McKenney
2013-05-22 19:17                       ` Roman Gushchin
2013-05-25 11:37                         ` Paul E. McKenney
2013-05-27 11:34                           ` Roman Gushchin
2013-05-27 17:55                           ` Roman Gushchin
2013-05-28  0:12                             ` Eric Dumazet
2013-05-28  9:10                               ` Roman Gushchin
2013-05-29  0:34                                 ` Eric Dumazet
2013-05-29  1:31                                   ` Paul E. McKenney
2013-05-29  5:08                                     ` Eric Dumazet
2013-05-29 10:09                                       ` Roman Gushchin
2013-05-29 19:06                                         ` Eric Dumazet
2013-05-30  8:25                                           ` Roman Gushchin
2013-06-02 23:31                                             ` Eric Dumazet
2013-06-03  2:58                                               ` David Miller
2013-06-03  3:12                                                 ` Eric Dumazet
2013-06-03  3:27                                                   ` David Miller
2013-06-03  3:42                                                     ` Paul E. McKenney
2013-06-03  3:47                                                       ` Eric Dumazet
2013-06-03  3:49                                                       ` David Miller
2013-06-03  6:05                                                         ` Paul E. McKenney
2013-06-10 18:29                                                         ` Boris B. Zhmurov
2013-06-10 18:51                                                           ` Eric Dumazet
2013-06-03  3:48                                                   ` Paul E. McKenney
2013-06-03  3:42                                                 ` Paul E. McKenney
2013-05-29  9:17                                   ` Roman Gushchin
2013-05-29  1:19                                 ` Paul E. McKenney
2013-05-22 13:27                   ` David Laight
2013-05-22 13:27                     ` David Laight
2013-05-22 13:36                     ` Eric Dumazet
2013-05-22 14:23                       ` David Laight
2013-05-22 14:23                         ` David Laight
2013-05-22 13:55                     ` Roman Gushchin
2013-05-22  9:58             ` Paul E. McKenney [this message]
2013-05-22 12:28               ` Eric Dumazet
2013-05-22 13:00                 ` Paul E. McKenney
2013-05-22 14:16                   ` Eric Dumazet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130522095839.GC3578@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=davem@davemloft.net \
    --cc=dipankar@in.ibm.com \
    --cc=eric.dumazet@gmail.com \
    --cc=jmorris@namei.org \
    --cc=kaber@trash.net \
    --cc=klamm@yandex-team.ru \
    --cc=kuznet@ms2.inr.ac.ru \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=yoshfuji@linux-ipv6.org \
    --cc=zhmurov@yandex-team.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.