All of lore.kernel.org
 help / color / mirror / Atom feed
* Definition of cmpxchg()
@ 2018-12-10 15:29 Akira Yokosawa
  2018-12-10 20:17 ` Paul E. McKenney
  0 siblings, 1 reply; 3+ messages in thread
From: Akira Yokosawa @ 2018-12-10 15:29 UTC (permalink / raw)
  To: Paul E. McKenney, Junchang Wang; +Cc: perfbook, Akira Yokosawa

Hi Paul and Junchang,

This is a continuation of the discussion on the definition of
cmpxchg() in CodeSamples/api-pthreads/api-gcc.h.

Current definition is as follows:

#define cmpxchg(ptr, o, n) \
({ \
	typeof(*ptr) _____actual = (o); \
	\
	__atomic_compare_exchange_n(ptr, (void *)&_____actual, (n), 1, \
			__ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST) ? (o) : (o)+1; \
})

As was already pointed out by Paul, this definition has potential issue
when an argument with side-effect is given to "o". Also, (o)+1 can overflow.

As a matter of fact, there is another definition of cmpxchg() in
CodeSamples/formal/litmus/api.h. Currently it is defined as follows:

#define cmpxchg(x, o, n) ({ \
	typeof(o) __old = (o); \
	__atomic_compare_exchange_n((x), &__old, (n), 1, __ATOMIC_SEQ_CST, __ATOMIC_RELAXED); \
	__old; \
})

Apparently, this definition is not good on PPC or ARM.

Following is a simple litmus test of cmpxchg() (to be executed by litmus7):
--------------------------------------------
C C-cmpxchg
{
}

{
#include "api.h"
}

P0(int *x)
{
	int r1;

	r1 = cmpxchg(x, 1, 2);
}

P1(int *x)
{
	int r1;

	r1 = cmpxchg(x, 0, 1);
}

locations [1:r1]
exists (0:r1=1 /\ ~x=2)
---------------------------------

Testing on PPC with the current api.h yields the following:
---------------------------------
Test C-cmpxchg Allowed
Histogram (4 states)
2789  :>0:r1=0; 1:r1=0; x=0;
50452053:>0:r1=0; 1:r1=0; x=1;
36003 *>0:r1=1; 1:r1=0; x=1;
49509155:>0:r1=1; 1:r1=0; x=2;
Ok

Witnesses
Positive: 36003, Negative: 99963997
Condition exists (0:r1=1 /\ not (x=2)) is validated
Hash=3f625d680407ff822fb56f1a80834291
Observation C-cmpxchg Sometimes 36003 99963997
Time C-cmpxchg 43.37
----------------------------------

If we change the definition to suppress spurious failure in
__atomic_compare_exchange_n(), i.e.:

#define cmpxchg(x, o, n) ({ \
	typeof(o) __old = (o); \
	__atomic_compare_exchange_n((x), &__old, (n), 0, __ATOMIC_SEQ_CST, __ATOMIC_RELAXED); \
	__old; \
})

, litmus7 yields the following:
----------------------------------
Test C-cmpxchg Allowed
Histogram (2 states)
50766497:>0:r1=0; 1:r1=0; x=1;
49233503:>0:r1=1; 1:r1=0; x=2;
No

Witnesses
Positive: 0, Negative: 100000000
Condition exists (0:r1=1 /\ not (x=2)) is NOT validated
Hash=3f625d680407ff822fb56f1a80834291
Observation C-cmpxchg Never 0 100000000
Time C-cmpxchg 35.20
----------------------------------

This matches what herd7 says (with the 2nd pair of {} removed):
----------------------------------
Test C-cmpxchg Allowed
States 2
0:r1=0; 1:r1=0; x=1;
0:r1=1; 1:r1=0; x=2;
No
Witnesses
Positive: 0 Negative: 2
Condition exists (0:r1=1 /\ not (x=2))
Observation C-cmpxchg Never 0 2
Time C-cmpxchg 0.01
Hash=1720691890ea69e35615997626680d6f
----------------------------------

Also, klitmus7 on PPC says:
----------------------------------
Test C-cmpxchg Allowed
Histogram (2 states)
2019624 :>0:r1=0; 1:r1=0; x=1;
1980376 :>0:r1=1; 1:r1=0; x=2;
No

Witnesses
Positive: 0, Negative: 4000000
Condition exists (0:r1=1 /\ not (x=2)) is NOT validated
Hash=1720691890ea69e35615997626680d6f
Observation C-cmpxchg Never 0 4000000
Time C-cmpxchg 0.76
----------------------------------

If we use the definition in api-gcc.h, litmus7 says:
----------------------------------
Test C-cmpxchg Allowed
Histogram (3 states)
11388 :>0:r1=2; 1:r1=1; x=0;
50373528:>0:r1=2; 1:r1=0; x=1;
49615084:>0:r1=1; 1:r1=0; x=2;
No

Witnesses
Positive: 0, Negative: 100000000
Condition exists (0:r1=1 /\ not (x=2)) is NOT validated
Hash=3f625d680407ff822fb56f1a80834291
Observation C-cmpxchg Never 0 100000000
Time C-cmpxchg 52.06
----------------------------------
The result is still "Never", but the statistics look quite different.
There is no "0:r1=0" observed!

So, at least for litmus test, cmpxchg() should be defined as follows:

#define cmpxchg(x, o, n) ({ \
	typeof(o) __old = (o); \
	__atomic_compare_exchange_n((x), &__old, (n), 0, __ATOMIC_SEQ_CST, __ATOMIC_RELAXED); \
	__old; \
})

If this is the way to go, I think the definition in api-gcc.h should
be the same way for consistency, even if it would cause a little
performance degradation. Then there would be no worry about overflows.

Thoughts?

        Thanks, Akira


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Definition of cmpxchg()
  2018-12-10 15:29 Definition of cmpxchg() Akira Yokosawa
@ 2018-12-10 20:17 ` Paul E. McKenney
  2018-12-11  1:35   ` Junchang Wang
  0 siblings, 1 reply; 3+ messages in thread
From: Paul E. McKenney @ 2018-12-10 20:17 UTC (permalink / raw)
  To: Akira Yokosawa; +Cc: Junchang Wang, perfbook

On Tue, Dec 11, 2018 at 12:29:08AM +0900, Akira Yokosawa wrote:
> Hi Paul and Junchang,
> 
> This is a continuation of the discussion on the definition of
> cmpxchg() in CodeSamples/api-pthreads/api-gcc.h.
> 
> Current definition is as follows:
> 
> #define cmpxchg(ptr, o, n) \
> ({ \
> 	typeof(*ptr) _____actual = (o); \
> 	\
> 	__atomic_compare_exchange_n(ptr, (void *)&_____actual, (n), 1, \
> 			__ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST) ? (o) : (o)+1; \
> })
> 
> As was already pointed out by Paul, this definition has potential issue
> when an argument with side-effect is given to "o". Also, (o)+1 can overflow.
> 
> As a matter of fact, there is another definition of cmpxchg() in
> CodeSamples/formal/litmus/api.h. Currently it is defined as follows:
> 
> #define cmpxchg(x, o, n) ({ \
> 	typeof(o) __old = (o); \
> 	__atomic_compare_exchange_n((x), &__old, (n), 1, __ATOMIC_SEQ_CST, __ATOMIC_RELAXED); \
> 	__old; \
> })
> 
> Apparently, this definition is not good on PPC or ARM.
> 
> Following is a simple litmus test of cmpxchg() (to be executed by litmus7):
> --------------------------------------------
> C C-cmpxchg
> {
> }
> 
> {
> #include "api.h"
> }
> 
> P0(int *x)
> {
> 	int r1;
> 
> 	r1 = cmpxchg(x, 1, 2);
> }
> 
> P1(int *x)
> {
> 	int r1;
> 
> 	r1 = cmpxchg(x, 0, 1);
> }
> 
> locations [1:r1]
> exists (0:r1=1 /\ ~x=2)
> ---------------------------------
> 
> Testing on PPC with the current api.h yields the following:
> ---------------------------------
> Test C-cmpxchg Allowed
> Histogram (4 states)
> 2789  :>0:r1=0; 1:r1=0; x=0;
> 50452053:>0:r1=0; 1:r1=0; x=1;
> 36003 *>0:r1=1; 1:r1=0; x=1;
> 49509155:>0:r1=1; 1:r1=0; x=2;
> Ok
> 
> Witnesses
> Positive: 36003, Negative: 99963997
> Condition exists (0:r1=1 /\ not (x=2)) is validated
> Hash=3f625d680407ff822fb56f1a80834291
> Observation C-cmpxchg Sometimes 36003 99963997
> Time C-cmpxchg 43.37
> ----------------------------------
> 
> If we change the definition to suppress spurious failure in
> __atomic_compare_exchange_n(), i.e.:
> 
> #define cmpxchg(x, o, n) ({ \
> 	typeof(o) __old = (o); \
> 	__atomic_compare_exchange_n((x), &__old, (n), 0, __ATOMIC_SEQ_CST, __ATOMIC_RELAXED); \
> 	__old; \
> })
> 
> , litmus7 yields the following:
> ----------------------------------
> Test C-cmpxchg Allowed
> Histogram (2 states)
> 50766497:>0:r1=0; 1:r1=0; x=1;
> 49233503:>0:r1=1; 1:r1=0; x=2;
> No
> 
> Witnesses
> Positive: 0, Negative: 100000000
> Condition exists (0:r1=1 /\ not (x=2)) is NOT validated
> Hash=3f625d680407ff822fb56f1a80834291
> Observation C-cmpxchg Never 0 100000000
> Time C-cmpxchg 35.20
> ----------------------------------
> 
> This matches what herd7 says (with the 2nd pair of {} removed):
> ----------------------------------
> Test C-cmpxchg Allowed
> States 2
> 0:r1=0; 1:r1=0; x=1;
> 0:r1=1; 1:r1=0; x=2;
> No
> Witnesses
> Positive: 0 Negative: 2
> Condition exists (0:r1=1 /\ not (x=2))
> Observation C-cmpxchg Never 0 2
> Time C-cmpxchg 0.01
> Hash=1720691890ea69e35615997626680d6f
> ----------------------------------
> 
> Also, klitmus7 on PPC says:
> ----------------------------------
> Test C-cmpxchg Allowed
> Histogram (2 states)
> 2019624 :>0:r1=0; 1:r1=0; x=1;
> 1980376 :>0:r1=1; 1:r1=0; x=2;
> No
> 
> Witnesses
> Positive: 0, Negative: 4000000
> Condition exists (0:r1=1 /\ not (x=2)) is NOT validated
> Hash=1720691890ea69e35615997626680d6f
> Observation C-cmpxchg Never 0 4000000
> Time C-cmpxchg 0.76
> ----------------------------------
> 
> If we use the definition in api-gcc.h, litmus7 says:
> ----------------------------------
> Test C-cmpxchg Allowed
> Histogram (3 states)
> 11388 :>0:r1=2; 1:r1=1; x=0;
> 50373528:>0:r1=2; 1:r1=0; x=1;
> 49615084:>0:r1=1; 1:r1=0; x=2;
> No
> 
> Witnesses
> Positive: 0, Negative: 100000000
> Condition exists (0:r1=1 /\ not (x=2)) is NOT validated
> Hash=3f625d680407ff822fb56f1a80834291
> Observation C-cmpxchg Never 0 100000000
> Time C-cmpxchg 52.06
> ----------------------------------
> The result is still "Never", but the statistics look quite different.
> There is no "0:r1=0" observed!
> 
> So, at least for litmus test, cmpxchg() should be defined as follows:
> 
> #define cmpxchg(x, o, n) ({ \
> 	typeof(o) __old = (o); \
> 	__atomic_compare_exchange_n((x), &__old, (n), 0, __ATOMIC_SEQ_CST, __ATOMIC_RELAXED); \
> 	__old; \
> })
> 
> If this is the way to go, I think the definition in api-gcc.h should
> be the same way for consistency, even if it would cause a little
> performance degradation. Then there would be no worry about overflows.

Makes sense to me!  Longer term, it would be good to characterize
weak mode, but having things work is a very good thing.  ;-)

							Thanx, Paul


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Definition of cmpxchg()
  2018-12-10 20:17 ` Paul E. McKenney
@ 2018-12-11  1:35   ` Junchang Wang
  0 siblings, 0 replies; 3+ messages in thread
From: Junchang Wang @ 2018-12-11  1:35 UTC (permalink / raw)
  To: Paul McKenney, Akira Yokosawa; +Cc: perfbook

On Tue, Dec 11, 2018 at 4:17 AM Paul E. McKenney <paulmck@linux.ibm.com> wrote:
>
> On Tue, Dec 11, 2018 at 12:29:08AM +0900, Akira Yokosawa wrote:
> > Hi Paul and Junchang,
> >
> > This is a continuation of the discussion on the definition of
> > cmpxchg() in CodeSamples/api-pthreads/api-gcc.h.
> >
> > Current definition is as follows:
> >
> > #define cmpxchg(ptr, o, n) \
> > ({ \
> >       typeof(*ptr) _____actual = (o); \
> >       \
> >       __atomic_compare_exchange_n(ptr, (void *)&_____actual, (n), 1, \
> >                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST) ? (o) : (o)+1; \
> > })
> >
> > As was already pointed out by Paul, this definition has potential issue
> > when an argument with side-effect is given to "o". Also, (o)+1 can overflow.
> >
> > As a matter of fact, there is another definition of cmpxchg() in
> > CodeSamples/formal/litmus/api.h. Currently it is defined as follows:
> >
> > #define cmpxchg(x, o, n) ({ \
> >       typeof(o) __old = (o); \
> >       __atomic_compare_exchange_n((x), &__old, (n), 1, __ATOMIC_SEQ_CST, __ATOMIC_RELAXED); \
> >       __old; \
> > })
> >
> > Apparently, this definition is not good on PPC or ARM.
> >
> > Following is a simple litmus test of cmpxchg() (to be executed by litmus7):
> > --------------------------------------------
> > C C-cmpxchg
> > {
> > }
> >
> > {
> > #include "api.h"
> > }
> >
> > P0(int *x)
> > {
> >       int r1;
> >
> >       r1 = cmpxchg(x, 1, 2);
> > }
> >
> > P1(int *x)
> > {
> >       int r1;
> >
> >       r1 = cmpxchg(x, 0, 1);
> > }
> >
> > locations [1:r1]
> > exists (0:r1=1 /\ ~x=2)
> > ---------------------------------
> >
> > Testing on PPC with the current api.h yields the following:
> > ---------------------------------
> > Test C-cmpxchg Allowed
> > Histogram (4 states)
> > 2789  :>0:r1=0; 1:r1=0; x=0;
> > 50452053:>0:r1=0; 1:r1=0; x=1;
> > 36003 *>0:r1=1; 1:r1=0; x=1;
> > 49509155:>0:r1=1; 1:r1=0; x=2;
> > Ok
> >
> > Witnesses
> > Positive: 36003, Negative: 99963997
> > Condition exists (0:r1=1 /\ not (x=2)) is validated
> > Hash=3f625d680407ff822fb56f1a80834291
> > Observation C-cmpxchg Sometimes 36003 99963997
> > Time C-cmpxchg 43.37
> > ----------------------------------
> >
> > If we change the definition to suppress spurious failure in
> > __atomic_compare_exchange_n(), i.e.:
> >
> > #define cmpxchg(x, o, n) ({ \
> >       typeof(o) __old = (o); \
> >       __atomic_compare_exchange_n((x), &__old, (n), 0, __ATOMIC_SEQ_CST, __ATOMIC_RELAXED); \
> >       __old; \
> > })
> >
> > , litmus7 yields the following:
> > ----------------------------------
> > Test C-cmpxchg Allowed
> > Histogram (2 states)
> > 50766497:>0:r1=0; 1:r1=0; x=1;
> > 49233503:>0:r1=1; 1:r1=0; x=2;
> > No
> >
> > Witnesses
> > Positive: 0, Negative: 100000000
> > Condition exists (0:r1=1 /\ not (x=2)) is NOT validated
> > Hash=3f625d680407ff822fb56f1a80834291
> > Observation C-cmpxchg Never 0 100000000
> > Time C-cmpxchg 35.20
> > ----------------------------------
> >
> > This matches what herd7 says (with the 2nd pair of {} removed):
> > ----------------------------------
> > Test C-cmpxchg Allowed
> > States 2
> > 0:r1=0; 1:r1=0; x=1;
> > 0:r1=1; 1:r1=0; x=2;
> > No
> > Witnesses
> > Positive: 0 Negative: 2
> > Condition exists (0:r1=1 /\ not (x=2))
> > Observation C-cmpxchg Never 0 2
> > Time C-cmpxchg 0.01
> > Hash=1720691890ea69e35615997626680d6f
> > ----------------------------------
> >
> > Also, klitmus7 on PPC says:
> > ----------------------------------
> > Test C-cmpxchg Allowed
> > Histogram (2 states)
> > 2019624 :>0:r1=0; 1:r1=0; x=1;
> > 1980376 :>0:r1=1; 1:r1=0; x=2;
> > No
> >
> > Witnesses
> > Positive: 0, Negative: 4000000
> > Condition exists (0:r1=1 /\ not (x=2)) is NOT validated
> > Hash=1720691890ea69e35615997626680d6f
> > Observation C-cmpxchg Never 0 4000000
> > Time C-cmpxchg 0.76
> > ----------------------------------
> >
> > If we use the definition in api-gcc.h, litmus7 says:
> > ----------------------------------
> > Test C-cmpxchg Allowed
> > Histogram (3 states)
> > 11388 :>0:r1=2; 1:r1=1; x=0;
> > 50373528:>0:r1=2; 1:r1=0; x=1;
> > 49615084:>0:r1=1; 1:r1=0; x=2;
> > No
> >
> > Witnesses
> > Positive: 0, Negative: 100000000
> > Condition exists (0:r1=1 /\ not (x=2)) is NOT validated
> > Hash=3f625d680407ff822fb56f1a80834291
> > Observation C-cmpxchg Never 0 100000000
> > Time C-cmpxchg 52.06
> > ----------------------------------
> > The result is still "Never", but the statistics look quite different.
> > There is no "0:r1=0" observed!
> >
> > So, at least for litmus test, cmpxchg() should be defined as follows:
> >
> > #define cmpxchg(x, o, n) ({ \
> >       typeof(o) __old = (o); \
> >       __atomic_compare_exchange_n((x), &__old, (n), 0, __ATOMIC_SEQ_CST, __ATOMIC_RELAXED); \
> >       __old; \
> > })
> >
> > If this is the way to go, I think the definition in api-gcc.h should
> > be the same way for consistency, even if it would cause a little
> > performance degradation. Then there would be no worry about overflows.
>
> Makes sense to me!  Longer term, it would be good to characterize
> weak mode, but having things work is a very good thing.  ;-)
>
>                                                         Thanx, Paul
>

Hi Akira,

Thanks for the detailed analysis. It a very interesting topic.

For correctness, I agree with you on adopting strong variation in
__atomic_compare_exchange_n. But for long-term, the following is my
thinking, please take a look.

For the current implementation (the one in api-gcc.h), the issue of
the side-effect in argument (o) could be removed by adding one another
local variable. The following is one example (It diagrams the idea,
though I didn't test the code yet. But I can test it this night.)

#define cmpxchg(ptr, o, n) \
({ \
        typeof(*ptr) _____actual = (o); \
        typeof(*ptr) __old = _____actual; \
        __atomic_compare_exchange_n(ptr, (void *)&_____actual, (n), 1, \
                        __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST) ? __old : __old+1; \
})

The reason that I stick to enabling weak variation on PPC and ARM is
that it gives us about 10% performance improvement when we ran
count_lim_atomic on a PPC server. When I was trying to collect some
performance data of stress tests, this 10% benefit really means a lot.
For the overflow issue, maybe we can figure out some better ways.
Please let me know what do you think. :-)

Regards,
--Junchang


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2018-12-11  1:35 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-10 15:29 Definition of cmpxchg() Akira Yokosawa
2018-12-10 20:17 ` Paul E. McKenney
2018-12-11  1:35   ` Junchang Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.