* [PATCH] rculist: list_first_or_null_rcu() should use list_entry_rcu()
@ 2013-06-21  0:32 Tejun Heo
  2013-06-25 18:51 ` Tejun Heo
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Tejun Heo @ 2013-06-21  0:32 UTC (permalink / raw)
  To: Paul E. McKenney, Dipankar Sarma
  Cc: Fengguang Wu, David S. Miller, Li Zefan, Patrick McHardy, linux-kernel

list_first_or_null_rcu() should test whether the list is empty and, if
it is not, return a pointer to the first entry in an RCU-safe manner.
It's broken in two ways.

* It compares __kernel @__ptr with __rcu @__next triggering the
  following sparse warning.

  net/core/dev.c:4331:17: error: incompatible types in comparison expression (different address spaces)

* It doesn't perform rcu_dereference*() and computes the entry address
  using container_of() directly from the __rcu pointer, which is
  inconsistent with the other rculist interfaces.  As a result, all
  three in-kernel users - net/core/dev.c, macvlan, cgroup - are buggy.
  They dereference the pointer without going through a read barrier.

Fix it by making list_first_or_null_rcu() dereference ->next directly
and then use list_entry_rcu() on it like other rculist accessors.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: stable@vger.kernel.org
---
 include/linux/rculist.h |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -267,8 +267,9 @@ static inline void list_splice_init_rcu(
  */
 #define list_first_or_null_rcu(ptr, type, member) \
 	({struct list_head *__ptr = (ptr); \
-	  struct list_head __rcu *__next = list_next_rcu(__ptr); \
-	  likely(__ptr != __next) ? container_of(__next, type, member) : NULL; \
+	  struct list_head *__next = __ptr->next; \
+	  likely(__ptr != __next) ? \
+		list_entry_rcu(__next, type, member) : NULL; \
 	})
 
 /**

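For readers unfamiliar with the list layout, the semantics the macro is
meant to provide can be sketched in user space.  This is only an
illustration under stated assumptions: struct item, list_init(),
list_add_tail() and first_item_or_null() are hypothetical stand-ins
written for this note, not the kernel's helpers, and nothing here
involves RCU or memory barriers.  It shows the two ingredients the
description relies on: an empty circular list is one whose head points
at itself, and container_of() turns a pointer to the embedded list_head
back into a pointer to the enclosing entry.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Recover the enclosing structure from a pointer to its embedded member. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical entry type, used only for this illustration. */
struct item {
	int value;
	struct list_head node;
};

/* An empty circular list has a head that points at itself. */
static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Link @entry in just before @head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	assert(entry != NULL && head != NULL);
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Non-RCU version of the check: NULL for an empty list, else the first item. */
static struct item *first_item_or_null(struct list_head *head)
{
	struct list_head *next = head->next;	/* read ->next once */

	return next != head ? container_of(next, struct item, node) : NULL;
}
```

The point of the patch is that the RCU variant must perform the same
comparison and conversion, but through the RCU accessors so that
readers see a properly published entry.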

* Re: [PATCH] rculist: list_first_or_null_rcu() should use list_entry_rcu()
  2013-06-21  0:32 [PATCH] rculist: list_first_or_null_rcu() should use list_entry_rcu() Tejun Heo
@ 2013-06-25 18:51 ` Tejun Heo
  2013-06-25 22:57 ` Paul E. McKenney
  2013-06-26 17:27 ` [PATCH v2] " Tejun Heo
  2 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2013-06-25 18:51 UTC (permalink / raw)
  To: Paul E. McKenney, Dipankar Sarma
  Cc: Fengguang Wu, David S. Miller, Li Zefan, Patrick McHardy, linux-kernel

On Thu, Jun 20, 2013 at 05:32:44PM -0700, Tejun Heo wrote:
> list_first_or_null_rcu() should test whether the list is empty and, if
> it is not, return a pointer to the first entry in an RCU-safe manner.
> It's broken in two ways.
> 
> * It compares __kernel @__ptr with __rcu @__next triggering the
>   following sparse warning.
> 
>   net/core/dev.c:4331:17: error: incompatible types in comparison expression (different address spaces)
> 
> * It doesn't perform rcu_dereference*() and computes the entry address
>   using container_of() directly from the __rcu pointer, which is
>   inconsistent with the other rculist interfaces.  As a result, all
>   three in-kernel users - net/core/dev.c, macvlan, cgroup - are buggy.
>   They dereference the pointer without going through a read barrier.
> 
> Fix it by making list_first_or_null_rcu() dereference ->next directly
> and then use list_entry_rcu() on it like other rculist accessors.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Fengguang Wu <fengguang.wu@intel.com>
> Cc: Dipankar Sarma <dipankar@in.ibm.com>
> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Li Zefan <lizefan@huawei.com>
> Cc: Patrick McHardy <kaber@trash.net>
> Cc: stable@vger.kernel.org

Paul, ping.

-- 
tejun


* Re: [PATCH] rculist: list_first_or_null_rcu() should use list_entry_rcu()
  2013-06-21  0:32 [PATCH] rculist: list_first_or_null_rcu() should use list_entry_rcu() Tejun Heo
  2013-06-25 18:51 ` Tejun Heo
@ 2013-06-25 22:57 ` Paul E. McKenney
  2013-06-25 23:09   ` Tejun Heo
  2013-06-26 17:27 ` [PATCH v2] " Tejun Heo
  2 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2013-06-25 22:57 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Dipankar Sarma, Fengguang Wu, David S. Miller, Li Zefan,
	Patrick McHardy, linux-kernel

On Thu, Jun 20, 2013 at 05:32:44PM -0700, Tejun Heo wrote:
> list_first_or_null_rcu() should test whether the list is empty and, if
> it is not, return a pointer to the first entry in an RCU-safe manner.
> It's broken in two ways.
> 
> * It compares __kernel @__ptr with __rcu @__next triggering the
>   following sparse warning.
> 
>   net/core/dev.c:4331:17: error: incompatible types in comparison expression (different address spaces)
> 
> * It doesn't perform rcu_dereference*() and computes the entry address
>   using container_of() directly from the __rcu pointer, which is
>   inconsistent with the other rculist interfaces.  As a result, all
>   three in-kernel users - net/core/dev.c, macvlan, cgroup - are buggy.
>   They dereference the pointer without going through a read barrier.
> 
> Fix it by making list_first_or_null_rcu() dereference ->next directly
> and then use list_entry_rcu() on it like other rculist accessors.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Fengguang Wu <fengguang.wu@intel.com>
> Cc: Dipankar Sarma <dipankar@in.ibm.com>
> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Li Zefan <lizefan@huawei.com>
> Cc: Patrick McHardy <kaber@trash.net>
> Cc: stable@vger.kernel.org
> ---
>  include/linux/rculist.h |    5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> --- a/include/linux/rculist.h
> +++ b/include/linux/rculist.h
> @@ -267,8 +267,9 @@ static inline void list_splice_init_rcu(
>   */
>  #define list_first_or_null_rcu(ptr, type, member) \
>  	({struct list_head *__ptr = (ptr); \
> -	  struct list_head __rcu *__next = list_next_rcu(__ptr); \
> -	  likely(__ptr != __next) ? container_of(__next, type, member) : NULL; \
> +	  struct list_head *__next = __ptr->next; \
> +	  likely(__ptr != __next) ? \
> +		list_entry_rcu(__next, type, member) : NULL; \
>  	})
> 
>  /**

I am a bit uneasy with this, and would feel better if the volatile
cast was on the very first fetch of the ->next pointer.

Is there some reason why my unease is ill-founded?

							Thanx, Paul



* Re: [PATCH] rculist: list_first_or_null_rcu() should use list_entry_rcu()
  2013-06-25 22:57 ` Paul E. McKenney
@ 2013-06-25 23:09   ` Tejun Heo
  2013-06-26 14:17     ` Paul E. McKenney
  0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2013-06-25 23:09 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Dipankar Sarma, Fengguang Wu, David S. Miller, Li Zefan,
	Patrick McHardy, linux-kernel

Hello, Paul.

On Tue, Jun 25, 2013 at 03:57:59PM -0700, Paul E. McKenney wrote:
> >  #define list_first_or_null_rcu(ptr, type, member) \
> >  	({struct list_head *__ptr = (ptr); \
> > -	  struct list_head __rcu *__next = list_next_rcu(__ptr); \
> > -	  likely(__ptr != __next) ? container_of(__next, type, member) : NULL; \
> > +	  struct list_head *__next = __ptr->next; \
> > +	  likely(__ptr != __next) ? \
> > +		list_entry_rcu(__next, type, member) : NULL; \
> 
> I am a bit uneasy with this, and would feel better if the volatile
> cast was on the very first fetch of the ->next pointer.
> 
> Is there some reason why my unease is ill-founded?

Do you mean something like the following?

	  struct list_head *__next = ACCESS_ONCE(__ptr->next); \
	  likely(__ptr != __next) ? \
		list_entry_rcu(__next, type, member) : NULL; \

Yeah, that looks right to me.

Thanks.

-- 
tejun


* Re: [PATCH] rculist: list_first_or_null_rcu() should use list_entry_rcu()
  2013-06-25 23:09   ` Tejun Heo
@ 2013-06-26 14:17     ` Paul E. McKenney
  2013-06-26 15:25       ` Tejun Heo
  0 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2013-06-26 14:17 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Dipankar Sarma, Fengguang Wu, David S. Miller, Li Zefan,
	Patrick McHardy, linux-kernel

On Tue, Jun 25, 2013 at 04:09:38PM -0700, Tejun Heo wrote:
> Hello, Paul.
> 
> On Tue, Jun 25, 2013 at 03:57:59PM -0700, Paul E. McKenney wrote:
> > >  #define list_first_or_null_rcu(ptr, type, member) \
> > >  	({struct list_head *__ptr = (ptr); \
> > > -	  struct list_head __rcu *__next = list_next_rcu(__ptr); \
> > > -	  likely(__ptr != __next) ? container_of(__next, type, member) : NULL; \
> > > +	  struct list_head *__next = __ptr->next; \
> > > +	  likely(__ptr != __next) ? \
> > > +		list_entry_rcu(__next, type, member) : NULL; \
> > 
> > I am a bit uneasy with this, and would feel better if the volatile
> > cast was on the very first fetch of the ->next pointer.
> > 
> > Is there some reason why my unease is ill-founded?
> 
> Do you mean something like the following?
> 
> 	  struct list_head *__next = ACCESS_ONCE(__ptr->next); \
> 	  likely(__ptr != __next) ? \
> 		list_entry_rcu(__next, type, member) : NULL; \
> 
> Yeah, that looks right to me.

I would feel much better about this!  Does it avoid warnings in your
use cases?

							Thanx, Paul



* Re: [PATCH] rculist: list_first_or_null_rcu() should use list_entry_rcu()
  2013-06-26 14:17     ` Paul E. McKenney
@ 2013-06-26 15:25       ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2013-06-26 15:25 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Dipankar Sarma, Fengguang Wu, David S. Miller, Li Zefan,
	Patrick McHardy, linux-kernel

On Wed, Jun 26, 2013 at 07:17:52AM -0700, Paul E. McKenney wrote:
> > Do you mean something like the following?
> > 
> > 	  struct list_head *__next = ACCESS_ONCE(__ptr->next); \
> > 	  likely(__ptr != __next) ? \
> > 		list_entry_rcu(__next, type, member) : NULL; \
> > 
> > Yeah, that looks right to me.
> 
> I would feel much better about this!  Does it avoid warnings in your
> use cases?

Yeah, it does, and more importantly it adds the missing read barrier
during RCU deref.  I'll give it another test and post the updated
version.

Thanks a lot!

-- 
tejun


* [PATCH v2] rculist: list_first_or_null_rcu() should use list_entry_rcu()
  2013-06-21  0:32 [PATCH] rculist: list_first_or_null_rcu() should use list_entry_rcu() Tejun Heo
  2013-06-25 18:51 ` Tejun Heo
  2013-06-25 22:57 ` Paul E. McKenney
@ 2013-06-26 17:27 ` Tejun Heo
  2013-06-28 17:24   ` Paul E. McKenney
  2013-06-28 17:34   ` [PATCH v3] " Tejun Heo
  2 siblings, 2 replies; 13+ messages in thread
From: Tejun Heo @ 2013-06-26 17:27 UTC (permalink / raw)
  To: Paul E. McKenney, Dipankar Sarma
  Cc: Fengguang Wu, David S. Miller, Li Zefan, Patrick McHardy, linux-kernel

list_first_or_null_rcu() should test whether the list is empty and, if
it is not, return a pointer to the first entry in an RCU-safe manner.
It's broken in several ways.

* It compares __kernel @__ptr with __rcu @__next triggering the
  following sparse warning.

  net/core/dev.c:4331:17: error: incompatible types in comparison expression (different address spaces)

* It doesn't perform rcu_dereference*() and computes the entry address
  using container_of() directly from the __rcu pointer, which is
  inconsistent with the other rculist interfaces.  As a result, all
  three in-kernel users - net/core/dev.c, macvlan, cgroup - are buggy.
  They dereference the pointer without going through a read barrier.

* While ->next dereference passes through list_next_rcu(), the
  compiler is still free to fetch ->next more than once and thus
  nullify the "__ptr != __next" condition check.

Fix it by making list_first_or_null_rcu() dereference ->next directly
using ACCESS_ONCE() and then use list_entry_rcu() on it like other
rculist accessors.

v2: Paul pointed out that the compiler may fetch the pointer more than
    once nullifying the condition check.  ACCESS_ONCE() added on
    ->next dereference.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: stable@vger.kernel.org
---
Paul, I was mistaken.  For list_first_or_null_rcu(), @ptr is constant.
It's a value which can't change and usually isn't even an l-value.
ACCESS_ONCE() is necessary when dereferencing @ptr->next, which may
change while list_first_or_null_rcu() is in progress.

Thanks.

 include/linux/rculist.h |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -266,9 +266,10 @@ static inline void list_splice_init_rcu(
  * primitives such as list_add_rcu() as long as it's guarded by rcu_read_lock().
  */
 #define list_first_or_null_rcu(ptr, type, member) \
-	({struct list_head *__ptr = (ptr); \
-	  struct list_head __rcu *__next = list_next_rcu(__ptr); \
-	  likely(__ptr != __next) ? container_of(__next, type, member) : NULL; \
+	({struct list_head *__ptr = ptr; \
+	  struct list_head *__next = ACCESS_ONCE(__ptr->next); \
+	  likely(__ptr != __next) ? \
+		list_entry_rcu(__next, type, member) : NULL; \
 	})
 
 /**
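The double-fetch hazard named in the third bullet can be made
deterministic in user space.  The sketch below is an assumption-laden
illustration, not kernel code: load_next(), loads_before_removal,
racy_first_node() and snapshot_first_node() are hypothetical names, and
the "concurrent writer" is simulated by a hook that empties the list
after a chosen number of loads.  The racy variant models what a
compiler is allowed to emit when ->next is read through a plain
pointer; the snapshot variant models the single load that
ACCESS_ONCE() forces.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/*
 * Hypothetical test hook: after this many loads of ->next, a simulated
 * concurrent writer unlinks the only entry.  Zero disables the writer.
 */
static int loads_before_removal;

/* Every load of head->next goes through here so the race can be injected. */
static struct list_head *load_next(struct list_head *head)
{
	struct list_head *next;

	assert(head != NULL);
	next = head->next;
	if (loads_before_removal > 0 && --loads_before_removal == 0) {
		/* Simulated writer: the list becomes empty. */
		head->next = head;
		head->prev = head;
	}
	return next;
}

/* Two independent loads: the emptiness check and the use can disagree. */
static struct list_head *racy_first_node(struct list_head *head)
{
	if (load_next(head) == head)
		return NULL;
	return load_next(head);	/* may now be the head itself */
}

/* One load, reused for both the check and the result. */
static struct list_head *snapshot_first_node(struct list_head *head)
{
	struct list_head *next = load_next(head);

	return next != head ? next : NULL;
}
```

With the writer firing between the two loads, racy_first_node()
returns the list head, which is exactly the value the "__ptr != __next"
test was meant to exclude; passing it to container_of() would yield a
bogus entry pointer.  snapshot_first_node() returns either NULL or the
node it actually compared.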


* Re: [PATCH v2] rculist: list_first_or_null_rcu() should use list_entry_rcu()
  2013-06-26 17:27 ` [PATCH v2] " Tejun Heo
@ 2013-06-28 17:24   ` Paul E. McKenney
  2013-06-28 17:31     ` Tejun Heo
  2013-06-28 17:34   ` [PATCH v3] " Tejun Heo
  1 sibling, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2013-06-28 17:24 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Dipankar Sarma, Fengguang Wu, David S. Miller, Li Zefan,
	Patrick McHardy, linux-kernel

On Wed, Jun 26, 2013 at 10:27:53AM -0700, Tejun Heo wrote:
> list_first_or_null_rcu() should test whether the list is empty and, if
> it is not, return a pointer to the first entry in an RCU-safe manner.
> It's broken in several ways.
> 
> * It compares __kernel @__ptr with __rcu @__next triggering the
>   following sparse warning.
> 
>   net/core/dev.c:4331:17: error: incompatible types in comparison expression (different address spaces)
> 
> * It doesn't perform rcu_dereference*() and computes the entry address
>   using container_of() directly from the __rcu pointer, which is
>   inconsistent with the other rculist interfaces.  As a result, all
>   three in-kernel users - net/core/dev.c, macvlan, cgroup - are buggy.
>   They dereference the pointer without going through a read barrier.
> 
> * While ->next dereference passes through list_next_rcu(), the
>   compiler is still free to fetch ->next more than once and thus
>   nullify the "__ptr != __next" condition check.
> 
> Fix it by making list_first_or_null_rcu() dereference ->next directly
> using ACCESS_ONCE() and then use list_entry_rcu() on it like other
> rculist accessors.
> 
> v2: Paul pointed out that the compiler may fetch the pointer more than
>     once nullifying the condition check.  ACCESS_ONCE() added on
>     ->next dereference.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Fengguang Wu <fengguang.wu@intel.com>
> Cc: Dipankar Sarma <dipankar@in.ibm.com>
> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Li Zefan <lizefan@huawei.com>
> Cc: Patrick McHardy <kaber@trash.net>
> Cc: stable@vger.kernel.org
> ---
> Paul, I was mistaken.  For list_first_or_null_rcu(), @ptr is constant.
> It's a value which can't change and usually isn't even an l-value.
> ACCESS_ONCE() is necessary when dereferencing @ptr->next, which may
> change while list_first_or_null_rcu() is in progress.
> 
> Thanks.

Fair enough!

But why drop the parens around "ptr"?

							Thanx, Paul

>  include/linux/rculist.h |    7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> --- a/include/linux/rculist.h
> +++ b/include/linux/rculist.h
> @@ -266,9 +266,10 @@ static inline void list_splice_init_rcu(
>   * primitives such as list_add_rcu() as long as it's guarded by rcu_read_lock().
>   */
>  #define list_first_or_null_rcu(ptr, type, member) \
> -	({struct list_head *__ptr = (ptr); \
> -	  struct list_head __rcu *__next = list_next_rcu(__ptr); \
> -	  likely(__ptr != __next) ? container_of(__next, type, member) : NULL; \
> +	({struct list_head *__ptr = ptr; \
> +	  struct list_head *__next = ACCESS_ONCE(__ptr->next); \
> +	  likely(__ptr != __next) ? \
> +		list_entry_rcu(__next, type, member) : NULL; \
>  	})
> 
>  /**
> 



* Re: [PATCH v2] rculist: list_first_or_null_rcu() should use list_entry_rcu()
  2013-06-28 17:24   ` Paul E. McKenney
@ 2013-06-28 17:31     ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2013-06-28 17:31 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Dipankar Sarma, Fengguang Wu, David S. Miller, Li Zefan,
	Patrick McHardy, linux-kernel

On Fri, Jun 28, 2013 at 10:24:55AM -0700, Paul E. McKenney wrote:
> But why drop the parens around "ptr"?

Oops, when did that happen?  :) Will post an updated version right
away.

Thanks.

-- 
tejun


* [PATCH v3] rculist: list_first_or_null_rcu() should use list_entry_rcu()
  2013-06-26 17:27 ` [PATCH v2] " Tejun Heo
  2013-06-28 17:24   ` Paul E. McKenney
@ 2013-06-28 17:34   ` Tejun Heo
  2013-06-28 19:25     ` Paul E. McKenney
  1 sibling, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2013-06-28 17:34 UTC (permalink / raw)
  To: Paul E. McKenney, Dipankar Sarma
  Cc: Fengguang Wu, David S. Miller, Li Zefan, Patrick McHardy, linux-kernel

list_first_or_null_rcu() should test whether the list is empty and, if
it is not, return a pointer to the first entry in an RCU-safe manner.
It's broken in several ways.

* It compares __kernel @__ptr with __rcu @__next triggering the
  following sparse warning.

  net/core/dev.c:4331:17: error: incompatible types in comparison expression (different address spaces)

* It doesn't perform rcu_dereference*() and computes the entry address
  using container_of() directly from the __rcu pointer, which is
  inconsistent with the other rculist interfaces.  As a result, all
  three in-kernel users - net/core/dev.c, macvlan, cgroup - are buggy.
  They dereference the pointer without going through a read barrier.

* While ->next dereference passes through list_next_rcu(), the
  compiler is still free to fetch ->next more than once and thus
  nullify the "__ptr != __next" condition check.

Fix it by making list_first_or_null_rcu() dereference ->next directly
using ACCESS_ONCE() and then use list_entry_rcu() on it like other
rculist accessors.

v2: Paul pointed out that the compiler may fetch the pointer more than
    once nullifying the condition check.  ACCESS_ONCE() added on
    ->next dereference.

v3: Restored () around macro param which was accidentally removed.
    Spotted by Paul.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: stable@vger.kernel.org
---
 include/linux/rculist.h |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index 8089e35..523f13c 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -267,8 +267,9 @@ static inline void list_splice_init_rcu(struct list_head *list,
  */
 #define list_first_or_null_rcu(ptr, type, member) \
 	({struct list_head *__ptr = (ptr); \
-	  struct list_head __rcu *__next = list_next_rcu(__ptr); \
-	  likely(__ptr != __next) ? container_of(__next, type, member) : NULL; \
+	  struct list_head *__next = ACCESS_ONCE(__ptr->next); \
+	  likely(__ptr != __next) ? \
+		list_entry_rcu(__next, type, member) : NULL; \
 	})
 
 /**
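The shape of the final macro can be tried out in user space with a
rough approximation.  Everything below is a sketch under assumptions:
first_or_null(), struct item, list_init(), list_add_tail() and
first_value() are hypothetical names, container_of() stands in for
list_entry_rcu() (so there is no RCU read barrier here), and the local
variable names avoid the double-underscore prefix because that is
reserved outside the kernel.  ACCESS_ONCE() is written in the
volatile-cast form the kernel used at the time.  The block needs GCC or
Clang for statement expressions and __typeof__.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Volatile cast: the compiler must perform exactly one load of @x. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Approximation of list_first_or_null_rcu(): @ptr is evaluated once,
 * ->next is loaded once, and that one value feeds both the emptiness
 * check and the entry computation.
 */
#define first_or_null(ptr, type, member) \
	({ struct list_head *head_ = (ptr); \
	   struct list_head *next_ = ACCESS_ONCE(head_->next); \
	   head_ != next_ ? container_of(next_, type, member) : NULL; \
	})

/* Hypothetical entry type, used only for this illustration. */
struct item {
	int value;
	struct list_head node;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	assert(entry != NULL && head != NULL);
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Value of the first item, or -1 when the list is empty. */
static int first_value(struct list_head *head)
{
	struct item *it = first_or_null(head, struct item, node);

	return it ? it->value : -1;
}
```

Copying the argument into a local first also means an argument with
side effects, such as an indexed head with a post-increment, is
evaluated only once.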


* Re: [PATCH v3] rculist: list_first_or_null_rcu() should use list_entry_rcu()
  2013-06-28 17:34   ` [PATCH v3] " Tejun Heo
@ 2013-06-28 19:25     ` Paul E. McKenney
  2013-07-23 14:48       ` Tejun Heo
  0 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2013-06-28 19:25 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Dipankar Sarma, Fengguang Wu, David S. Miller, Li Zefan,
	Patrick McHardy, linux-kernel

On Fri, Jun 28, 2013 at 10:34:48AM -0700, Tejun Heo wrote:
> list_first_or_null_rcu() should test whether the list is empty and, if
> it is not, return a pointer to the first entry in an RCU-safe manner.
> It's broken in several ways.
> 
> * It compares __kernel @__ptr with __rcu @__next triggering the
>   following sparse warning.
> 
>   net/core/dev.c:4331:17: error: incompatible types in comparison expression (different address spaces)
> 
> * It doesn't perform rcu_dereference*() and computes the entry address
>   using container_of() directly from the __rcu pointer, which is
>   inconsistent with the other rculist interfaces.  As a result, all
>   three in-kernel users - net/core/dev.c, macvlan, cgroup - are buggy.
>   They dereference the pointer without going through a read barrier.
> 
> * While ->next dereference passes through list_next_rcu(), the
>   compiler is still free to fetch ->next more than once and thus
>   nullify the "__ptr != __next" condition check.
> 
> Fix it by making list_first_or_null_rcu() dereference ->next directly
> using ACCESS_ONCE() and then use list_entry_rcu() on it like other
> rculist accessors.
> 
> v2: Paul pointed out that the compiler may fetch the pointer more than
>     once nullifying the condition check.  ACCESS_ONCE() added on
>     ->next dereference.
> 
> v3: Restored () around macro param which was accidentally removed.
>     Spotted by Paul.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Fengguang Wu <fengguang.wu@intel.com>
> Cc: Dipankar Sarma <dipankar@in.ibm.com>
> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Li Zefan <lizefan@huawei.com>
> Cc: Patrick McHardy <kaber@trash.net>
> Cc: stable@vger.kernel.org

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  include/linux/rculist.h |    5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/rculist.h b/include/linux/rculist.h
> index 8089e35..523f13c 100644
> --- a/include/linux/rculist.h
> +++ b/include/linux/rculist.h
> @@ -267,8 +267,9 @@ static inline void list_splice_init_rcu(struct list_head *list,
>   */
>  #define list_first_or_null_rcu(ptr, type, member) \
>  	({struct list_head *__ptr = (ptr); \
> -	  struct list_head __rcu *__next = list_next_rcu(__ptr); \
> -	  likely(__ptr != __next) ? container_of(__next, type, member) : NULL; \
> +	  struct list_head *__next = ACCESS_ONCE(__ptr->next); \
> +	  likely(__ptr != __next) ? \
> +		list_entry_rcu(__next, type, member) : NULL; \
>  	})
> 
>  /**
> 



* Re: [PATCH v3] rculist: list_first_or_null_rcu() should use list_entry_rcu()
  2013-06-28 19:25     ` Paul E. McKenney
@ 2013-07-23 14:48       ` Tejun Heo
  2013-07-23 15:01         ` Paul E. McKenney
  0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2013-07-23 14:48 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Dipankar Sarma, Fengguang Wu, David S. Miller, Li Zefan,
	Patrick McHardy, linux-kernel

Hello, Paul.

On Fri, Jun 28, 2013 at 12:25:09PM -0700, Paul E. McKenney wrote:
> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

How should this patch be routed?  Probably best to route with other
rcu changes?

Thanks.

-- 
tejun


* Re: [PATCH v3] rculist: list_first_or_null_rcu() should use list_entry_rcu()
  2013-07-23 14:48       ` Tejun Heo
@ 2013-07-23 15:01         ` Paul E. McKenney
  0 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2013-07-23 15:01 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Dipankar Sarma, Fengguang Wu, David S. Miller, Li Zefan,
	Patrick McHardy, linux-kernel

On Tue, Jul 23, 2013 at 10:48:54AM -0400, Tejun Heo wrote:
> Hello, Paul.
> 
> On Fri, Jun 28, 2013 at 12:25:09PM -0700, Paul E. McKenney wrote:
> > Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> How should this patch be routed?  Probably best to route with other
> rcu changes?

Fair point, queued for 3.12.  Thank you, Tejun!

							Thanx, Paul


