All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junchang Wang <junchangwang@gmail.com>
To: Akira Yokosawa <akiyks@gmail.com>
Cc: Paul McKenney <paulmck@linux.ibm.com>, perfbook@vger.kernel.org
Subject: Re: [Possible BUG] count_lim_atomic.c fails on POWER8
Date: Thu, 25 Oct 2018 10:11:18 +0800	[thread overview]
Message-ID: <CABoNC828mNgvvqcUm65d+PFdBVB1Pp6FzEyXniPbxo-dUeLxpw@mail.gmail.com> (raw)
In-Reply-To: <073797d5-67f7-7426-f895-8004428a84ab@gmail.com>

Hi Akira,

Thanks for the mail. My understanding is that PPC uses LL/SC to
emulate CAS by using a tiny loop. Unfortunately, the LL/SC loop itself
could fail (due to, for example, context switches) even if *ptr equals
to old. In such a case, a CAS instruction in actually should return a
success. I think this is what the term "spurious fail" describes. Here
is a reference:
http://liblfds.org/mediawiki/index.php?title=Article:CAS_and_LL/SC_Implementation_Details_by_Processor_family

It seems that __atomic_compare_exchange_n() provides option "weak" for
performance. I tested these two solutions and got the following
results:

                           1      4      8     16     32    64
my patch (ns)     35    34    37    73    142  281
strong (ns)          39    39    41    79    158  313

I tested the performance of count_lim_atomic by varying the number of
updaters (./count_lim_atomic N uperf) on a 8-core PPC server. The
first row in the table is the result when my patch is used, and the
second row is the result when the 4th argument of the function is set
to false(0). It seems performance improves slightly if option "weak"
is used. However, there is no performance boost as we expected. So
your solution sounds good if safety is one major concern because
option "weak" seems risky to me :-)

Another interesting observation is that the performance of LL/SC-based
CAS instruction deteriorates dramatically when the number of working
threads exceeds the number of CPU cores.


Thanks,
--Junchang


On Thu, Oct 25, 2018 at 6:05 AM Akira Yokosawa <akiyks@gmail.com> wrote:
>
> On 2018/10/24 23:53:29 +0800, Junchang Wang wrote:
> > Hi Akira and Paul,
> >
> > I checked the code today and the implementation of cmpxchg() looks
> > very suspicious to me; Current  cmpxchg() first executes function
> > __atomic_compare_exchange_n, and then checks whether the value stored
> > in field __actual (old) has been changed to decide if the CAS
> > instruction has been successfully performed. However, I saw the *weak*
> > field is set, which, as far as I know, means
> > __atomic_compare_exchange_n could fail even if the value of *ptr is
> > equal to __actual (old). Unfortunately, current cmpxchg will treat
> > this case as a success because the value of __actual(old) does not
> > change.
>
> Thanks for looking into this!
>
> I also suspected the use of "weak" semantics of
> __atomic_compare_exchange_n(), but did not quite understand what
> "spurious fail" actually means. Your theory sounds plausible to me.
>
> I've suggested in a private email to Paul to modify the 4th argument
> to false(0) as a workaround, which would prevent such "spurious fail".
>
> Both approaches looks good to me. I'd defer to Paul on the choice.
>
>         Thanks, Akira
>
> >
> > This bug happens in both Power8 and ARMv8. It seems it affects
> > architectures that use LL/SC to emulate CAS. Following patch helps
> > solve this issue on my testbeds. Please take a look. Any thoughts?
> >
> > ---
> >  CodeSamples/api-pthreads/api-gcc.h | 8 +++-----
> >  1 file changed, 3 insertions(+), 5 deletions(-)
> >
> > diff --git a/CodeSamples/api-pthreads/api-gcc.h
> > b/CodeSamples/api-pthreads/api-gcc.h
> > index 1dd26ca..38a16c0 100644
> > --- a/CodeSamples/api-pthreads/api-gcc.h
> > +++ b/CodeSamples/api-pthreads/api-gcc.h
> > @@ -166,11 +166,9 @@ struct __xchg_dummy {
> >
> >  #define cmpxchg(ptr, o, n) \
> >  ({ \
> > -       typeof(*ptr) _____actual = (o); \
> > -       \
> > -       (void)__atomic_compare_exchange_n(ptr, (void *)&_____actual, (n), 1, \
> > -                                         __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST); \
> > -       _____actual; \
> > +       typeof(*ptr) old = (o); \
> > +       (__atomic_compare_exchange_n(ptr, (void *)&old, (n), 1,
> > __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST))? \
> > +                               (o) : (n); \
> >  })
> >
> >  static __inline__ int atomic_cmpxchg(atomic_t *v, int old, int new)
> >
>


  parent reply	other threads:[~2018-10-25  2:11 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-20 15:53 [Possible BUG] count_lim_atomic.c fails on POWER8 Akira Yokosawa
2018-10-20 16:36 ` Paul E. McKenney
2018-10-24 15:53   ` Junchang Wang
2018-10-24 22:05     ` Akira Yokosawa
2018-10-24 22:29       ` Akira Yokosawa
2018-10-25  2:18         ` Junchang Wang
2018-10-25  2:11       ` Junchang Wang [this message]
2018-10-25  9:45         ` Paul E. McKenney
2018-10-25 12:23           ` Akira Yokosawa
2018-10-25 14:09           ` Junchang Wang
2018-10-25 15:17             ` Akira Yokosawa
2018-10-25 22:04               ` Akira Yokosawa
2018-10-26  0:58                 ` Junchang Wang
2018-10-27 14:56                   ` Akira Yokosawa
     [not found]                     ` <20181028001723.GJ4170@linux.ibm.com>
2018-10-28 12:08                       ` Junchang Wang
2018-10-28 13:19                         ` Paul E. McKenney
2018-10-28 13:22                         ` Akira Yokosawa
2018-10-28 14:24                       ` Akira Yokosawa
2018-10-28 16:43                         ` Paul E. McKenney
2018-10-29 14:45                           ` Akira Yokosawa
2018-10-29 15:30                             ` Paul E. McKenney
2018-10-26  1:12                 ` Akira Yokosawa
2018-10-26 11:34                   ` Akira Yokosawa
2018-10-26 16:06                     ` Junchang Wang
2018-10-25 15:24             ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CABoNC828mNgvvqcUm65d+PFdBVB1Pp6FzEyXniPbxo-dUeLxpw@mail.gmail.com \
    --to=junchangwang@gmail.com \
    --cc=akiyks@gmail.com \
    --cc=paulmck@linux.ibm.com \
    --cc=perfbook@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.