Re: perf events ring buffer memory barrier on powerpc

* Re: perf events ring buffer memory barrier on powerpc
@ 2014-05-08 20:46 Mikulas Patocka
       [not found] ` <OF667059AA.7F151BCC-ONC2257CD3.0036CFEB-C2257CD3.003BBF01@il.ibm.com>
  0 siblings, 1 reply; 74+ messages in thread
From: Mikulas Patocka @ 2014-05-08 20:46 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Victor Kaplansky, Paul E. McKenney, linux-kernel

[ I found this in the lkml archvive ]

> On Wed, Oct 30, 2013 at 04:52:05PM +0200, Victor Kaplansky wrote:
>
> > Peter Zijlstra <peterz@infradead.org> wrote on 10/30/2013 01:25:26 PM:
> >
> > > Also, I'm not entirely sure on C, that too seems like a dependency, we
> > > simply cannot read the buffer @tail before we've read the tail itself,
> > > now can we? Similarly we cannot compare tail to head without having the
> > > head read completed.
> >
> > No, this one we cannot omit, because our problem on consumer side is not
> > with @tail, which is written exclusively by consumer, but with @head.
>
> Ah indeed, my argument was flawed in that @head is the important part.
> But we still do a comparison of @tail against @head before we do further
> reads.
>
> Although I suppose speculative reads are allowed -- they don't have the
> destructive behaviour speculative writes have -- and thus we could in
> fact get reorder issues.
>
> But since it is still a dependent load in that we do that @tail vs @head
> comparison before doing other loads, wouldn't a read_barrier_depends()
> be sufficient? Or do we still need a complete rmb?
>
> > BTW, it is why you also don't need ACCESS_ONCE() around @tail, but only
> > around
> > @head read.
>
> Agreed, the ACCESS_ONCE() around tail is superfluous since we're the one
> updating tail, so there's no problem with the value changing
> unexpectedly.

You need ACCESS_ONCE even if you are the only process writing the value. 
Because without ACCESS_ONCE, the compiler may perform store tearing and 
split the store into several smaller stores. Search the file 
"Documentation/memory-barriers.txt" for the term "store tearing", it shows 
an example where one instruction storing 32-bit value may be split to two 
instructions, each storing 16-bit value.

Mikulas

^ permalink raw reply	[flat|nested] 74+ messages in thread