linux-kernel.vger.kernel.org archive mirror
* [PATCH] atomic: improve atomic_inc_unless_negative/atomic_dec_unless_positive
@ 2013-03-09 15:38 Ming Lei
  2013-03-11 23:45 ` Andrew Morton
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Ming Lei @ 2013-03-09 15:38 UTC (permalink / raw)
  To: Andrew Morton, linux-kernel; +Cc: Ming Lei, Shaohua Li, Al Viro

Generally, both atomic_inc_unless_negative() and
atomic_dec_unless_positive() need at least two atomic_cmpxchg()
calls to complete the atomic operation. In fact, most of the time
the first atomic_cmpxchg() is only used to read the current value
of the atomic variable.

Considering that memory barriers, bus locking, cache walking, etc.
may be involved in atomic_cmpxchg(), it is much more expensive than
atomic_read(), which is simply:

	(*(volatile int *)&(v)->counter)

so this patch saves one atomic_cmpxchg() in both helpers in the
common case, and should improve them a bit.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Shaohua Li <shli@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 include/linux/atomic.h |   28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index 5b08a85..aa951d8 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -63,26 +63,34 @@ static inline int atomic_inc_not_zero_hint(atomic_t *v, int hint)
 #ifndef atomic_inc_unless_negative
 static inline int atomic_inc_unless_negative(atomic_t *p)
 {
-	int v, v1;
-	for (v = 0; v >= 0; v = v1) {
-		v1 = atomic_cmpxchg(p, v, v + 1);
-		if (likely(v1 == v))
+	int v, t;
+
+	v = atomic_read(p);
+	while (1) {
+		if (unlikely(v < 0))
+			return 0;
+		t = atomic_cmpxchg(p, v, v + 1);
+		if (likely(t == v))
 			return 1;
+		v = t;
 	}
-	return 0;
 }
 #endif
 
 #ifndef atomic_dec_unless_positive
 static inline int atomic_dec_unless_positive(atomic_t *p)
 {
-	int v, v1;
-	for (v = 0; v <= 0; v = v1) {
-		v1 = atomic_cmpxchg(p, v, v - 1);
-		if (likely(v1 == v))
+	int v, t;
+
+	v = atomic_read(p);
+	while (1) {
+		if (unlikely(v > 0))
+			return 0;
+		t = atomic_cmpxchg(p, v, v - 1);
+		if (likely(t == v))
 			return 1;
+		v = t;
 	}
-	return 0;
 }
 #endif
 
-- 
1.7.9.5



* Re: [PATCH] atomic: improve atomic_inc_unless_negative/atomic_dec_unless_positive
  2013-03-09 15:38 [PATCH] atomic: improve atomic_inc_unless_negative/atomic_dec_unless_positive Ming Lei
@ 2013-03-11 23:45 ` Andrew Morton
  2013-03-11 23:59 ` Frederic Weisbecker
  2013-03-12  0:21 ` Paul E. McKenney
  2 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2013-03-11 23:45 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-kernel, Shaohua Li, Al Viro

On Sat,  9 Mar 2013 23:38:21 +0800 Ming Lei <tom.leiming@gmail.com> wrote:

> Generally, both atomic_inc_unless_negative() and
> atomic_dec_unless_positive() need at least two atomic_cmpxchg()
> calls to complete the atomic operation. In fact, most of the time
> the first atomic_cmpxchg() is only used to read the current value
> of the atomic variable.
> 
> Considering that memory barriers, bus locking, cache walking, etc.
> may be involved in atomic_cmpxchg(), it is much more expensive than
> atomic_read(), which is simply:
> 
> 	(*(volatile int *)&(v)->counter)
> 
> so this patch saves one atomic_cmpxchg() in both helpers in the
> common case, and should improve them a bit.
> 
> ...
>
> --- a/include/linux/atomic.h
> +++ b/include/linux/atomic.h
> @@ -63,26 +63,34 @@ static inline int atomic_inc_not_zero_hint(atomic_t *v, int hint)
>  #ifndef atomic_inc_unless_negative
>  static inline int atomic_inc_unless_negative(atomic_t *p)
>  {
> -	int v, v1;
> -	for (v = 0; v >= 0; v = v1) {
> -		v1 = atomic_cmpxchg(p, v, v + 1);
> -		if (likely(v1 == v))
> +	int v, t;
> +
> +	v = atomic_read(p);
> +	while (1) {
> +		if (unlikely(v < 0))
> +			return 0;
> +		t = atomic_cmpxchg(p, v, v + 1);
> +		if (likely(t == v))
>  			return 1;
> +		v = t;
>  	}
> -	return 0;

It looks right to me...


* Re: [PATCH] atomic: improve atomic_inc_unless_negative/atomic_dec_unless_positive
  2013-03-09 15:38 [PATCH] atomic: improve atomic_inc_unless_negative/atomic_dec_unless_positive Ming Lei
  2013-03-11 23:45 ` Andrew Morton
@ 2013-03-11 23:59 ` Frederic Weisbecker
  2013-03-12  2:15   ` Ming Lei
  2013-03-12  0:21 ` Paul E. McKenney
  2 siblings, 1 reply; 12+ messages in thread
From: Frederic Weisbecker @ 2013-03-11 23:59 UTC (permalink / raw)
  To: Ming Lei; +Cc: Andrew Morton, linux-kernel, Shaohua Li, Al Viro, Paul McKenney

2013/3/9 Ming Lei <tom.leiming@gmail.com>:
> Generally, both atomic_inc_unless_negative() and
> atomic_dec_unless_positive() need at least two atomic_cmpxchg()
> calls to complete the atomic operation. In fact, most of the time
> the first atomic_cmpxchg() is only used to read the current value
> of the atomic variable.
>
> Considering that memory barriers, bus locking, cache walking, etc.
> may be involved in atomic_cmpxchg(), it is much more expensive than
> atomic_read(), which is simply:
>
>         (*(volatile int *)&(v)->counter)
>
> so this patch saves one atomic_cmpxchg() in both helpers in the
> common case, and should improve them a bit.
>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Shaohua Li <shli@kernel.org>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> ---
>  include/linux/atomic.h |   28 ++++++++++++++++++----------
>  1 file changed, 18 insertions(+), 10 deletions(-)
>
> diff --git a/include/linux/atomic.h b/include/linux/atomic.h
> index 5b08a85..aa951d8 100644
> --- a/include/linux/atomic.h
> +++ b/include/linux/atomic.h
> @@ -63,26 +63,34 @@ static inline int atomic_inc_not_zero_hint(atomic_t *v, int hint)
>  #ifndef atomic_inc_unless_negative
>  static inline int atomic_inc_unless_negative(atomic_t *p)
>  {
> -       int v, v1;
> -       for (v = 0; v >= 0; v = v1) {
> -               v1 = atomic_cmpxchg(p, v, v + 1);
> -               if (likely(v1 == v))
> +       int v, t;
> +
> +       v = atomic_read(p);
> +       while (1) {
> +               if (unlikely(v < 0))
> +                       return 0;

But atomic_read() lacks the full memory barrier that is needed for
proper atomicity here.

For example, if the initial value of p is -1 and another CPU just did
an atomic_inc() that resulted in the new value being 0, the above
atomic_read() might return -1 because there is no guarantee it's
seeing the recent update on the remote CPU.

atomic_cmpxchg() OTOH provides that guarantee.
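
To make the window concrete, here is a minimal sketch of the interleaving
Frederic describes (the fast-path lines are taken from the patch above; the
initial value and the CPU roles follow his example, the rest is assumed for
illustration):

	/* Initially p->counter == -1. */

	/* CPU 1 */
	atomic_inc(p);			/* -1 -> 0 */

	/* CPU 0, running the patched fast path */
	v = atomic_read(p);		/* plain load: may still observe -1 */
	if (unlikely(v < 0))
		return 0;		/* bails out without the full barrier
					 * that the cmpxchg-based loop used
					 * to provide on this path */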


* Re: [PATCH] atomic: improve atomic_inc_unless_negative/atomic_dec_unless_positive
  2013-03-09 15:38 [PATCH] atomic: improve atomic_inc_unless_negative/atomic_dec_unless_positive Ming Lei
  2013-03-11 23:45 ` Andrew Morton
  2013-03-11 23:59 ` Frederic Weisbecker
@ 2013-03-12  0:21 ` Paul E. McKenney
  2 siblings, 0 replies; 12+ messages in thread
From: Paul E. McKenney @ 2013-03-12  0:21 UTC (permalink / raw)
  To: Ming Lei; +Cc: Andrew Morton, linux-kernel, Shaohua Li, Al Viro

On Sat, Mar 09, 2013 at 11:38:21PM +0800, Ming Lei wrote:
> Generally, both atomic_inc_unless_negative() and
> atomic_dec_unless_positive() need at least two atomic_cmpxchg()
> calls to complete the atomic operation. In fact, most of the time
> the first atomic_cmpxchg() is only used to read the current value
> of the atomic variable.
> 
> Considering that memory barriers, bus locking, cache walking, etc.
> may be involved in atomic_cmpxchg(), it is much more expensive than
> atomic_read(), which is simply:
> 
> 	(*(volatile int *)&(v)->counter)
> 
> so this patch saves one atomic_cmpxchg() in both helpers in the
> common case, and should improve them a bit.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Shaohua Li <shli@kernel.org>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> ---
>  include/linux/atomic.h |   28 ++++++++++++++++++----------
>  1 file changed, 18 insertions(+), 10 deletions(-)
> 
> diff --git a/include/linux/atomic.h b/include/linux/atomic.h
> index 5b08a85..aa951d8 100644
> --- a/include/linux/atomic.h
> +++ b/include/linux/atomic.h
> @@ -63,26 +63,34 @@ static inline int atomic_inc_not_zero_hint(atomic_t *v, int hint)
>  #ifndef atomic_inc_unless_negative
>  static inline int atomic_inc_unless_negative(atomic_t *p)
>  {
> -	int v, v1;
> -	for (v = 0; v >= 0; v = v1) {
> -		v1 = atomic_cmpxchg(p, v, v + 1);
> -		if (likely(v1 == v))
> +	int v, t;
> +
> +	v = atomic_read(p);
> +	while (1) {
> +		if (unlikely(v < 0))

As Frederic noted, you need an smp_mb() right here.  Because
atomic_inc_unless_negative() returns a value, it is required to
act as a full memory barrier.  On the other code paths, atomic_cmpxchg()
supplies the needed memory ordering, but not on this path.
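
Concretely, a minimal sketch of the increment helper with the barrier added
where Paul indicates (the exact placement is an assumption based on his
comment, not necessarily the code that was eventually merged):

	static inline int atomic_inc_unless_negative(atomic_t *p)
	{
		int v, t;

		v = atomic_read(p);
		while (1) {
			if (unlikely(v < 0)) {
				smp_mb();	/* assumed fix: make the
						 * failure path order like
						 * the cmpxchg path */
				return 0;
			}
			t = atomic_cmpxchg(p, v, v + 1);
			if (likely(t == v))
				return 1;
			v = t;
		}
	}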

> +			return 0;
> +		t = atomic_cmpxchg(p, v, v + 1);
> +		if (likely(t == v))
>  			return 1;
> +		v = t;
>  	}
> -	return 0;
>  }
>  #endif
>  
>  #ifndef atomic_dec_unless_positive
>  static inline int atomic_dec_unless_positive(atomic_t *p)
>  {
> -	int v, v1;
> -	for (v = 0; v <= 0; v = v1) {
> -		v1 = atomic_cmpxchg(p, v, v - 1);
> -		if (likely(v1 == v))
> +	int v, t;
> +
> +	v = atomic_read(p);
> +	while (1) {
> +		if (unlikely(v > 0))

Ditto here.

							Thanx, Paul

> +			return 0;
> +		t = atomic_cmpxchg(p, v, v - 1);
> +		if (likely(t == v))
>  			return 1;
> +		v = t;
>  	}
> -	return 0;
>  }
>  #endif
>  
> -- 
> 1.7.9.5
> 



* Re: [PATCH] atomic: improve atomic_inc_unless_negative/atomic_dec_unless_positive
  2013-03-11 23:59 ` Frederic Weisbecker
@ 2013-03-12  2:15   ` Ming Lei
  2013-03-12  3:39     ` Paul E. McKenney
  0 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2013-03-12  2:15 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Andrew Morton, linux-kernel, Shaohua Li, Al Viro, Paul McKenney

On Tue, Mar 12, 2013 at 7:59 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 2013/3/9 Ming Lei <tom.leiming@gmail.com>:
>> Generally, both atomic_inc_unless_negative() and
>> atomic_dec_unless_positive() need at least two atomic_cmpxchg()
>> calls to complete the atomic operation. In fact, most of the time
>> the first atomic_cmpxchg() is only used to read the current value
>> of the atomic variable.
>>
>> Considering that memory barriers, bus locking, cache walking, etc.
>> may be involved in atomic_cmpxchg(), it is much more expensive than
>> atomic_read(), which is simply:
>>
>>         (*(volatile int *)&(v)->counter)
>>
>> so this patch saves one atomic_cmpxchg() in both helpers in the
>> common case, and should improve them a bit.
>>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Shaohua Li <shli@kernel.org>
>> Cc: Al Viro <viro@zeniv.linux.org.uk>
>> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
>> ---
>>  include/linux/atomic.h |   28 ++++++++++++++++++----------
>>  1 file changed, 18 insertions(+), 10 deletions(-)
>>
>> diff --git a/include/linux/atomic.h b/include/linux/atomic.h
>> index 5b08a85..aa951d8 100644
>> --- a/include/linux/atomic.h
>> +++ b/include/linux/atomic.h
>> @@ -63,26 +63,34 @@ static inline int atomic_inc_not_zero_hint(atomic_t *v, int hint)
>>  #ifndef atomic_inc_unless_negative
>>  static inline int atomic_inc_unless_negative(atomic_t *p)
>>  {
>> -       int v, v1;
>> -       for (v = 0; v >= 0; v = v1) {
>> -               v1 = atomic_cmpxchg(p, v, v + 1);
>> -               if (likely(v1 == v))
>> +       int v, t;
>> +
>> +       v = atomic_read(p);
>> +       while (1) {
>> +               if (unlikely(v < 0))
>> +                       return 0;
>
> But atomic_read() lacks the full memory barrier that is needed for
> proper atomicity here.
>
> For example, if the initial value of p is -1 and another CPU just did
> an atomic_inc() that resulted in the new value being 0, the above
> atomic_read() might return -1 because there is no guarantee it's
> seeing the recent update on the remote CPU.

Yes, you are right. Also, it looks like a memory barrier is needed
around atomic_inc() too.

But I have a question: why can a memory barrier guarantee that the
remote CPU sees the recent update?  I understand that a memory
barrier only orders consecutive memory accesses, and I do not see
that kind of pattern here. Sorry for a possibly stupid question.


Thanks,
-- 
Ming Lei


* Re: [PATCH] atomic: improve atomic_inc_unless_negative/atomic_dec_unless_positive
  2013-03-12  2:15   ` Ming Lei
@ 2013-03-12  3:39     ` Paul E. McKenney
  2013-03-12  4:03       ` Ming Lei
  0 siblings, 1 reply; 12+ messages in thread
From: Paul E. McKenney @ 2013-03-12  3:39 UTC (permalink / raw)
  To: Ming Lei
  Cc: Frederic Weisbecker, Andrew Morton, linux-kernel, Shaohua Li, Al Viro

On Tue, Mar 12, 2013 at 10:15:42AM +0800, Ming Lei wrote:
> On Tue, Mar 12, 2013 at 7:59 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > 2013/3/9 Ming Lei <tom.leiming@gmail.com>:
> >> Generally, both atomic_inc_unless_negative() and
> >> atomic_dec_unless_positive() need at least two atomic_cmpxchg()
> >> calls to complete the atomic operation. In fact, most of the time
> >> the first atomic_cmpxchg() is only used to read the current value
> >> of the atomic variable.
> >>
> >> Considering that memory barriers, bus locking, cache walking, etc.
> >> may be involved in atomic_cmpxchg(), it is much more expensive than
> >> atomic_read(), which is simply:
> >>
> >>         (*(volatile int *)&(v)->counter)
> >>
> >> so this patch saves one atomic_cmpxchg() in both helpers in the
> >> common case, and should improve them a bit.
> >>
> >> Cc: Andrew Morton <akpm@linux-foundation.org>
> >> Cc: Shaohua Li <shli@kernel.org>
> >> Cc: Al Viro <viro@zeniv.linux.org.uk>
> >> Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> >> ---
> >>  include/linux/atomic.h |   28 ++++++++++++++++++----------
> >>  1 file changed, 18 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/include/linux/atomic.h b/include/linux/atomic.h
> >> index 5b08a85..aa951d8 100644
> >> --- a/include/linux/atomic.h
> >> +++ b/include/linux/atomic.h
> >> @@ -63,26 +63,34 @@ static inline int atomic_inc_not_zero_hint(atomic_t *v, int hint)
> >>  #ifndef atomic_inc_unless_negative
> >>  static inline int atomic_inc_unless_negative(atomic_t *p)
> >>  {
> >> -       int v, v1;
> >> -       for (v = 0; v >= 0; v = v1) {
> >> -               v1 = atomic_cmpxchg(p, v, v + 1);
> >> -               if (likely(v1 == v))
> >> +       int v, t;
> >> +
> >> +       v = atomic_read(p);
> >> +       while (1) {
> >> +               if (unlikely(v < 0))
> >> +                       return 0;
> >
> > But atomic_read() lacks the full memory barrier that is needed for
> > proper atomicity here.
> >
> > For example, if the initial value of p is -1 and another CPU just did
> > an atomic_inc() that resulted in the new value being 0, the above
> > atomic_read() might return -1 because there is no guarantee it's
> > seeing the recent update on the remote CPU.
> 
> Yes, you are right. Also, it looks like a memory barrier is needed
> around atomic_inc() too.

The atomic_inc() primitive makes no guarantees about ordering (see
Documentation/atomic_ops.txt), so, yes, if the caller needs ordering
around an atomic_inc(), the caller must supply the needed memory
barriers, for example, by using smp_mb__before_atomic_inc() or
smp_mb__after_atomic_inc().
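
As an illustration of the caller-supplied ordering Paul describes, a
minimal sketch (the payload variable and the publish() wrapper are
invented for the example):

	static atomic_t v = ATOMIC_INIT(0);
	static int data;

	void publish(void)
	{
		data = 1;			/* invented payload store */
		smp_mb__before_atomic_inc();	/* order the store before the inc */
		atomic_inc(&v);
		smp_mb__after_atomic_inc();	/* order the inc before later accesses */
	}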

> But I have a question: why can a memory barrier guarantee that the
> remote CPU sees the recent update?  I understand that a memory
> barrier only orders consecutive memory accesses, and I do not see
> that kind of pattern here. Sorry for a possibly stupid question.

Atomic operations that return a value are required to act as full memory
barriers.  This means that code relying on ordering provided by these
atomic operations must also do ordering, either by using an explicit
memory barrier or by relying on guarantees from atomic operations.

For example:

	CPU 0					CPU 1

	X = 1;					r1 = Z;
	if (atomic_inc_unless_negative(&Y))	smp_mb();
		do_something();
	Z = 1;					r2 = X;

Assuming X and Z are initially zero, if r1==1, we are guaranteed
that r2==1.  However, CPU 1 needs its smp_mb() in order to pair with
the barrier implicit in atomic_inc_unless_negative().

Make sense?

							Thanx, Paul
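
For readers who want to experiment, Paul's example can be approximated in
user space with C11 atomics. A rough sketch, with the caveat that seq_cst
operations merely stand in for the kernel's full-barrier guarantees and
that a single run of such a test proves nothing:

	#include <stdatomic.h>
	#include <pthread.h>
	#include <assert.h>
	#include <stddef.h>

	static atomic_int X, Y, Z;	/* all initially zero */
	static int r1, r2;

	static void *cpu0(void *arg)
	{
		(void)arg;
		atomic_store(&X, 1);			/* X = 1; */
		if (atomic_fetch_add(&Y, 1) >= 0) {	/* stand-in for
							 * atomic_inc_unless_negative(&Y) */
			/* do_something(); */
		}
		atomic_store(&Z, 1);			/* Z = 1; */
		return NULL;
	}

	static void *cpu1(void *arg)
	{
		(void)arg;
		r1 = atomic_load(&Z);	/* seq_cst loads subsume the smp_mb() */
		r2 = atomic_load(&X);
		return NULL;
	}

	int main(void)
	{
		pthread_t t0, t1;

		pthread_create(&t0, NULL, cpu0, NULL);
		pthread_create(&t1, NULL, cpu1, NULL);
		pthread_join(t0, NULL);
		pthread_join(t1, NULL);

		assert(!(r1 == 1 && r2 != 1));	/* Paul's guarantee */
		return 0;
	}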



* Re: [PATCH] atomic: improve atomic_inc_unless_negative/atomic_dec_unless_positive
  2013-03-12  3:39     ` Paul E. McKenney
@ 2013-03-12  4:03       ` Ming Lei
  2013-03-12 14:38         ` Paul E. McKenney
  0 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2013-03-12  4:03 UTC (permalink / raw)
  To: paulmck
  Cc: Frederic Weisbecker, Andrew Morton, linux-kernel, Shaohua Li, Al Viro

On Tue, Mar 12, 2013 at 11:39 AM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
>
> Atomic operations that return a value are required to act as full memory
> barriers.  This means that code relying on ordering provided by these
> atomic operations must also do ordering, either by using an explicit
> memory barrier or by relying on guarantees from atomic operations.
>
> For example:
>
>         CPU 0                                   CPU 1
>
>         X = 1;                                  r1 = Z;
>         if (atomic_inc_unless_negative(&Y))     smp_mb();
>                 do_something();
>         Z = 1;                                  r2 = X;
>
> Assuming X and Z are initially zero, if r1==1, we are guaranteed
> that r2==1.  However, CPU 1 needs its smp_mb() in order to pair with
> the barrier implicit in atomic_inc_unless_negative().
>
> Make sense?

Yes, it does, and thanks for the explanation.

But it looks like the above example is not what Frederic described:

"the above atomic_read() might return -1 because there is no
guarantee it's seeing the recent update on the remote CPU."

Even so, I am not sure whether adding one smp_mb() around atomic_read()
can guarantee that either.

Andrew, please ignore the patch, thanks.

Thanks,
-- 
Ming Lei


* Re: [PATCH] atomic: improve atomic_inc_unless_negative/atomic_dec_unless_positive
  2013-03-12  4:03       ` Ming Lei
@ 2013-03-12 14:38         ` Paul E. McKenney
  2013-03-12 15:02           ` Frederic Weisbecker
  0 siblings, 1 reply; 12+ messages in thread
From: Paul E. McKenney @ 2013-03-12 14:38 UTC (permalink / raw)
  To: Ming Lei
  Cc: Frederic Weisbecker, Andrew Morton, linux-kernel, Shaohua Li, Al Viro

On Tue, Mar 12, 2013 at 12:03:23PM +0800, Ming Lei wrote:
> On Tue, Mar 12, 2013 at 11:39 AM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> >
> > Atomic operations that return a value are required to act as full memory
> > barriers.  This means that code relying on ordering provided by these
> > atomic operations must also do ordering, either by using an explicit
> > memory barrier or by relying on guarantees from atomic operations.
> >
> > For example:
> >
> >         CPU 0                                   CPU 1
> >
> >         X = 1;                                  r1 = Z;
> >         if (atomic_inc_unless_negative(&Y))     smp_mb();
> >                 do_something();
> >         Z = 1;                                  r2 = X;
> >
> > Assuming X and Z are initially zero, if r1==1, we are guaranteed
> > that r2==1.  However, CPU 1 needs its smp_mb() in order to pair with
> > the barrier implicit in atomic_inc_unless_negative().
> >
> > Make sense?
> 
> Yes, it does, and thanks for the explanation.
> 
> But it looks like the above example is not what Frederic described:
> 
> "the above atomic_read() might return -1 because there is no
> guarantee it's seeing the recent update on the remote CPU."
> 
> Even so, I am not sure whether adding one smp_mb() around atomic_read()
> can guarantee that either.

Frederic was likely thinking of some other scenario that would be
broken by atomic_inc_unless_negative() failing to act as a full
memory barrier.  Here is another example:


        CPU 0                                   CPU 1

        	                                X = 1;
        if (atomic_inc_unless_negative(&Y))     r1 = atomic_xchg(&Y, -1);
                r2 = X;

If atomic_inc_unless_negative() acts as a full memory barrier, then
if CPU 0 reaches the assignment from X, r2 is guaranteed to be 1.
Otherwise, there is no guarantee.

							Thanx, Paul



* Re: [PATCH] atomic: improve atomic_inc_unless_negative/atomic_dec_unless_positive
  2013-03-12 14:38         ` Paul E. McKenney
@ 2013-03-12 15:02           ` Frederic Weisbecker
  2013-03-12 17:55             ` Paul E. McKenney
  0 siblings, 1 reply; 12+ messages in thread
From: Frederic Weisbecker @ 2013-03-12 15:02 UTC (permalink / raw)
  To: paulmck; +Cc: Ming Lei, Andrew Morton, linux-kernel, Shaohua Li, Al Viro

2013/3/12 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> On Tue, Mar 12, 2013 at 12:03:23PM +0800, Ming Lei wrote:
>> On Tue, Mar 12, 2013 at 11:39 AM, Paul E. McKenney
>> <paulmck@linux.vnet.ibm.com> wrote:
>> >
>> > Atomic operations that return a value are required to act as full memory
>> > barriers.  This means that code relying on ordering provided by these
>> > atomic operations must also do ordering, either by using an explicit
>> > memory barrier or by relying on guarantees from atomic operations.
>> >
>> > For example:
>> >
>> >         CPU 0                                   CPU 1
>> >
>> >         X = 1;                                  r1 = Z;
>> >         if (atomic_inc_unless_negative(&Y))     smp_mb();
>> >                 do_something();
>> >         Z = 1;                                  r2 = X;
>> >
>> > Assuming X and Z are initially zero, if r1==1, we are guaranteed
>> > that r2==1.  However, CPU 1 needs its smp_mb() in order to pair with
>> > the barrier implicit in atomic_inc_unless_negative().
>> >
>> > Make sense?
>>
>> Yes, it does, and thanks for the explanation.
>>
>> But it looks like the above example is not what Frederic described:
>>
>> "the above atomic_read() might return -1 because there is no
>> guarantee it's seeing the recent update on the remote CPU."
>>
>> Even so, I am not sure whether adding one smp_mb() around atomic_read()
>> can guarantee that either.
>
> Frederic was likely thinking of some other scenario that would be
> broken by atomic_inc_unless_negative() failing to act as a full
> memory barrier.  Here is another example:
>
>
>         CPU 0                                   CPU 1
>
>                                                 X = 1;
>         if (atomic_inc_unless_negative(&Y))     r1 = atomic_xchg(&Y, -1);
>                 r2 = X;
>
> If atomic_inc_unless_negative() acts as a full memory barrier, then
> if CPU 0 reaches the assignment from X, r2 is guaranteed to be 1.
> Otherwise, there is no guarantee.

Your scenarios show an interesting guarantee I did not think about.
But my concern was about a situation like this:

  CPU 0                            CPU 1

  atomic_set(&X, -1)
                                       atomic_inc(&X)
  atomic_add_unless_negative(&X, 5)

In the above situation, CPU 0 may still see X == -1 and thus not add
the 5. Of course, all of that only makes sense with data coming along.
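
To flesh out the "data coming along" caveat, a hedged sketch:
atomic_add_unless_negative() is hypothetical here (as Paul notes later in
the thread, it is not a mainline primitive), and the payload variable and
do_something() are invented:

	static atomic_t X = ATOMIC_INIT(-1);
	static int data;

	/* CPU 1 */
	void cpu1(void)
	{
		data = 5;			/* invented payload */
		smp_mb__before_atomic_inc();	/* atomic_inc() itself is unordered */
		atomic_inc(&X);			/* X: -1 -> 0 */
	}

	/* CPU 0 */
	void cpu0(void)
	{
		/* If the helper's fast path reads a stale -1, the 5 is
		 * never added and data is never consumed, even though
		 * CPU 1 already made X non-negative. */
		if (atomic_add_unless_negative(&X, 5))
			do_something(data);
	}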


* Re: [PATCH] atomic: improve atomic_inc_unless_negative/atomic_dec_unless_positive
  2013-03-12 15:02           ` Frederic Weisbecker
@ 2013-03-12 17:55             ` Paul E. McKenney
  2013-03-13  9:33               ` anish singh
  0 siblings, 1 reply; 12+ messages in thread
From: Paul E. McKenney @ 2013-03-12 17:55 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ming Lei, Andrew Morton, linux-kernel, Shaohua Li, Al Viro

On Tue, Mar 12, 2013 at 04:02:47PM +0100, Frederic Weisbecker wrote:
> 2013/3/12 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> > On Tue, Mar 12, 2013 at 12:03:23PM +0800, Ming Lei wrote:
> >> On Tue, Mar 12, 2013 at 11:39 AM, Paul E. McKenney
> >> <paulmck@linux.vnet.ibm.com> wrote:
> >> >
> >> > Atomic operations that return a value are required to act as full memory
> >> > barriers.  This means that code relying on ordering provided by these
> >> > atomic operations must also do ordering, either by using an explicit
> >> > memory barrier or by relying on guarantees from atomic operations.
> >> >
> >> > For example:
> >> >
> >> >         CPU 0                                   CPU 1
> >> >
> >> >         X = 1;                                  r1 = Z;
> >> >         if (atomic_inc_unless_negative(&Y))     smp_mb();
> >> >                 do_something();
> >> >         Z = 1;                                  r2 = X;
> >> >
> >> > Assuming X and Z are initially zero, if r1==1, we are guaranteed
> >> > that r2==1.  However, CPU 1 needs its smp_mb() in order to pair with
> >> > the barrier implicit in atomic_inc_unless_negative().
> >> >
> >> > Make sense?
> >>
> >> Yes, it does, and thanks for the explanation.
> >>
> >> But it looks like the above example is not what Frederic described:
> >>
> >> "the above atomic_read() might return -1 because there is no
> >> guarantee it's seeing the recent update on the remote CPU."
> >>
> >> Even so, I am not sure whether adding one smp_mb() around atomic_read()
> >> can guarantee that either.
> >
> > Frederic was likely thinking of some other scenario that would be
> > broken by atomic_inc_unless_negative() failing to act as a full
> > memory barrier.  Here is another example:
> >
> >
> >         CPU 0                                   CPU 1
> >
> >                                                 X = 1;
> >         if (atomic_inc_unless_negative(&Y))     r1 = atomic_xchg(&Y, -1);
> >                 r2 = X;
> >
> > If atomic_inc_unless_negative() acts as a full memory barrier, then
> > if CPU 0 reaches the assignment from X, r2 is guaranteed to be 1.
> > Otherwise, there is no guarantee.
> 
> Your scenarios show an interesting guarantee I did not think about.
> But my concern was about a situation like this:
> 
>   CPU 0                            CPU 1
> 
>   atomic_set(&X, -1)
>                                        atomic_inc(&X)
>   atomic_add_unless_negative(&X, 5)
> 
> In the above situation, CPU 0 may still see X == -1 and thus not add
> the 5. Of course, all of that only makes sense with data coming along.

That could happen, but you would need CPU 1 to carry out some other
reference for it to be a bug.  Otherwise, CPU 1's atomic_inc() just
happened after all of CPU 0's code.  But yes, it would be possible
to misorder with some larger scenario starting with this example.
Especially given that atomic_inc() does not make any ordering guarantees.

							Thanx, Paul



* Re: [PATCH] atomic: improve atomic_inc_unless_negative/atomic_dec_unless_positive
  2013-03-12 17:55             ` Paul E. McKenney
@ 2013-03-13  9:33               ` anish singh
  2013-03-13 12:40                 ` Paul E. McKenney
  0 siblings, 1 reply; 12+ messages in thread
From: anish singh @ 2013-03-13  9:33 UTC (permalink / raw)
  To: paulmck
  Cc: Frederic Weisbecker, Ming Lei, Andrew Morton, linux-kernel,
	Shaohua Li, Al Viro

On Tue, Mar 12, 2013 at 11:25 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Tue, Mar 12, 2013 at 04:02:47PM +0100, Frederic Weisbecker wrote:
>> 2013/3/12 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
>> > On Tue, Mar 12, 2013 at 12:03:23PM +0800, Ming Lei wrote:
>> >> On Tue, Mar 12, 2013 at 11:39 AM, Paul E. McKenney
>> >> <paulmck@linux.vnet.ibm.com> wrote:
>> >> >
>> >> > Atomic operations that return a value are required to act as full
>> >> > memory
>> >> > barriers.  This means that code relying on ordering provided by
>> >> > these
>> >> > atomic operations must also do ordering, either by using an explicit
>> >> > memory barrier or by relying on guarantees from atomic operations.
>> >> >
>> >> > For example:
>> >> >
>> >> >         CPU 0                                   CPU 1
>> >> >
>> >> >         X = 1;                                  r1 = Z;
>> >> >         if (atomic_inc_unless_negative(&Y))     smp_mb();
>> >> >                 do_something();
>> >> >         Z = 1;                                  r2 = X;
>> >> >
>> >> > Assuming X and Z are initially zero, if r1==1, we are guaranteed
>> >> > that r2==1.  However, CPU 1 needs its smp_mb() in order to pair with
>> >> > the barrier implicit in atomic_inc_unless_negative().
>> >> >
>> >> > Make sense?
>> >>
>> >> Yes, it does, and thanks for the explanation.
>> >>
>> >> But it looks like the above example is not what Frederic described:
>> >>
>> >> "the above atomic_read() might return -1 because there is no
>> >> guarantee it's seeing the recent update on the remote CPU."
>> >>
>> >> Even so, I am not sure whether adding one smp_mb() around atomic_read()
>> >> can guarantee that either.
>> >
>> > Frederic was likely thinking of some other scenario that would be
>> > broken by atomic_inc_unless_negative() failing to act as a full
>> > memory barrier.  Here is another example:
>> >
>> >
>> >         CPU 0                                   CPU 1
>> >
>> >                                                 X = 1;
>> >         if (atomic_inc_unless_negative(&Y))     r1 = atomic_xchg(&Y,
>> > -1);
>> >                 r2 = X;
>> >
>> > If atomic_inc_unless_negative() acts as a full memory barrier, then
>> > if CPU 0 reaches the assignment from X, r2 is guaranteed to be 1.
>> > Otherwise, there is no guarantee.
>>
>> Your scenarios show an interesting guarantee I did not think about.
>> But my concern was about a situation like this:
>>
>>   CPU 0                            CPU 1
>>
>>   atomic_set(&X, -1)
>>                                        atomic_inc(&X)
>>   atomic_add_unless_negative(&X, 5)
>>
>> In the above situation, CPU 0 may still see X == -1 and thus not add
>> the 5. Of course, all of that only makes sense with data coming along.
>
> That could happen, but you would need CPU 1 to carry out some other
> reference for it to be a bug.  Otherwise, CPU 1's atomic_inc() just

  CPU 0                            CPU 1

  atomic_set(&X, -1)
                                        A = 5
                                       &X = A
  atomic_add_unless_negative(&X, 5)

Is this what you meant by "carry out some other reference
for it to be a bug"?

> happened after all of CPU 0's code.  But yes, it would be possible
> to misorder with some larger scenario starting with this example.
> Especially given that atomic_inc() does not make any ordering guarantees.
>
>                                                         Thanx, Paul
>


* Re: [PATCH] atomic: improve atomic_inc_unless_negative/atomic_dec_unless_positive
  2013-03-13  9:33               ` anish singh
@ 2013-03-13 12:40                 ` Paul E. McKenney
  0 siblings, 0 replies; 12+ messages in thread
From: Paul E. McKenney @ 2013-03-13 12:40 UTC (permalink / raw)
  To: anish singh
  Cc: Frederic Weisbecker, Ming Lei, Andrew Morton, linux-kernel,
	Shaohua Li, Al Viro

On Wed, Mar 13, 2013 at 03:03:10PM +0530, anish singh wrote:
> On Tue, Mar 12, 2013 at 11:25 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Tue, Mar 12, 2013 at 04:02:47PM +0100, Frederic Weisbecker wrote:
> >> 2013/3/12 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> >> > On Tue, Mar 12, 2013 at 12:03:23PM +0800, Ming Lei wrote:
> >> >> On Tue, Mar 12, 2013 at 11:39 AM, Paul E. McKenney
> >> >> <paulmck@linux.vnet.ibm.com> wrote:
> >> >> >
> >> >> > Atomic operations that return a value are required to act as full
> >> >> > memory
> >> >> > barriers.  This means that code relying on ordering provided by
> >> >> > these
> >> >> > atomic operations must also do ordering, either by using an explicit
> >> >> > memory barrier or by relying on guarantees from atomic operations.
> >> >> >
> >> >> > For example:
> >> >> >
> >> >> >         CPU 0                                   CPU 1
> >> >> >
> >> >> >         X = 1;                                  r1 = Z;
> >> >> >         if (atomic_inc_unless_negative(&Y))     smp_mb();
> >> >> >                 do_something();
> >> >> >         Z = 1;                                  r2 = X;
> >> >> >
> >> >> > Assuming X and Z are initially zero, if r1==1, we are guaranteed
> >> >> > that r2==1.  However, CPU 1 needs its smp_mb() in order to pair with
> >> >> > the barrier implicit in atomic_inc_unless_negative().
> >> >> >
> >> >> > Make sense?
> >> >>
> >> >> Yes, it does, and thanks for the explanation.
> >> >>
> >> >> But it looks like the above example is not what Frederic described:
> >> >>
> >> >> "the above atomic_read() might return -1 because there is no
> >> >> guarantee it's seeing the recent update on the remote CPU."
> >> >>
> >> >> Even so, I am not sure whether adding one smp_mb() around atomic_read()
> >> >> can guarantee that either.
> >> >
> >> > Frederic was likely thinking of some other scenario that would be
> >> > broken by atomic_inc_unless_negative() failing to act as a full
> >> > memory barrier.  Here is another example:
> >> >
> >> >
> >> >         CPU 0                                   CPU 1
> >> >
> >> >                                                 X = 1;
> >> >         if (atomic_inc_unless_negative(&Y))     r1 = atomic_xchg(&Y,
> >> > -1);
> >> >                 r2 = X;
> >> >
> >> > If atomic_inc_unless_negative() acts as a full memory barrier, then
> >> > if CPU 0 reaches the assignment from X, r2 is guaranteed to be 1.
> >> > Otherwise, there is no guarantee.
> >>
> >> Your scenarios show an interesting guarantee I did not think about.
> >> But my concern was about a situation like this:
> >>
> >>   CPU 0                            CPU 1
> >>
> >>   atomic_set(&X, -1)
> >>                                        atomic_inc(&X)
> >>   atomic_add_unless_negative(&X, 5)
> >>
> >> In the above situation, CPU 0 may still see X == -1 and thus not add
> >> the 5. Of course, all of that only makes sense with data coming along.
> >
> > That could happen, but you would need CPU 1 to carry out some other
> > reference for it to be a bug.  Otherwise, CPU 1's atomic_inc() just
> 
>   CPU 0                            CPU 1
> 
>   atomic_set(&X, -1)
>                                         A = 5
>                                        &X = A
>   atomic_add_unless_negative(&X, 5)
> 
> Is this what you meant by "carry out some other reference
> for it to be a bug"?

Not exactly.  I was thinking more of something like this, with X and
Y both initially zero:

	CPU 0					CPU 1

						Y = 1;
						smp_mb__before_atomic_dec();
	if (atomic_inc_unless_negative(&X))	atomic_dec(&X);
		r1 = 1;
	else
		r1 = Y;

If atomic_inc_unless_negative() does not imply a full memory barrier,
r1 could be equal to zero.  (Not sure where atomic_add_unless_negative()
came from, by the way, not seeing it in mainline.)
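
This last test can also be rendered in user space. A rough C11 sketch,
where the helper is a hand-rolled stand-in and the marked fence models the
full barrier whose absence Paul is pointing at:

	#include <stdatomic.h>
	#include <stdbool.h>

	static atomic_int X, Y;		/* both initially zero */

	/* Stand-in for atomic_inc_unless_negative(). */
	static bool inc_unless_negative(atomic_int *p)
	{
		int v = atomic_load_explicit(p, memory_order_relaxed);

		while (v >= 0) {
			/* success path: seq_cst RMW provides full ordering */
			if (atomic_compare_exchange_weak(p, &v, v + 1))
				return true;
		}
		/* failure path: without this fence, r1 below may be 0 */
		atomic_thread_fence(memory_order_seq_cst);
		return false;
	}

	/* CPU 1 */
	static void cpu1(void)
	{
		atomic_store_explicit(&Y, 1, memory_order_relaxed);
		/* smp_mb__before_atomic_dec() */
		atomic_thread_fence(memory_order_seq_cst);
		/* atomic_dec(&X): unordered, hence relaxed */
		atomic_fetch_sub_explicit(&X, 1, memory_order_relaxed);
	}

	/* CPU 0 */
	static int cpu0(void)
	{
		int r1;

		if (inc_unless_negative(&X))
			r1 = 1;
		else
			r1 = atomic_load_explicit(&Y, memory_order_relaxed);
		return r1;	/* with the fence in the helper, r1 != 0 */
	}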

							Thanx, Paul

> > happened after all of CPU 0's code.  But yes, it would be possible
> > to misorder with some larger scenario starting with this example.
> > Especially given that atomic_inc() does not make any ordering guarantees.
> >
> >                                                         Thanx, Paul
> >
> 


