linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Anton Blanchard <anton@samba.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux PPC dev <linuxppc-dev@ozlabs.org>,
	Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>,
	Michael Ellerman <michael@ellerman.id.au>,
	Michael Neuling <mikey@neuling.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: perf events ring buffer memory barrier on powerpc
Date: Thu, 31 Oct 2013 08:25:43 -0700	[thread overview]
Message-ID: <20131031152543.GC4067@linux.vnet.ibm.com> (raw)
In-Reply-To: <OFDF23985D.1D949148-ON42257C15.002DEB25-42257C15.0036DF93@il.ibm.com>

On Thu, Oct 31, 2013 at 11:59:21AM +0200, Victor Kaplansky wrote:
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote on 10/31/2013
> 06:32:58 AM:
> 
> > If you want to play the "omit memory barriers" game, then proving a
> > negative is in fact the task before you.
> 
> Generally it is not fair. Otherwise, anyone could put an smp_mb() at a
> random place, and expect others to "prove" that it is not needed.
> 
> It is not fair also because it should be virtually impossible to prove lack
> of any problem. OTH, if a problem exists, it should be easy for proponents
> of a memory barrier to build a test case or design a scenario demonstrating
> the problem.

I really don't care about "fair" -- I care instead about the kernel
working reliably.

And it should also be easy for proponents of removing memory barriers to
clearly articulate what orderings their code does and does not need.

> Actually, advocates of the memory barrier in our case do have an argument -
> - the rule of thumb saying that barriers should be paired. I consider this
> rule only as a general recommendation to look into potentially risky
> places.
> And indeed, in our case if the store to circular wasn't conditional, it
> would require a memory barrier to prevent the store to be performed before
> the read of @tail. But in our case the store is conditional, so no memory
> barrier is required.

You are assuming control dependencies that the C language does not
provide.  Now, for all I know right now, there might well be some other
reason why a full barrier is not required, but the "if" statement cannot
be that reason.

Please review section 1.10 of the C++11 standard (or the corresponding
section of the C11 standard, if you prefer).  The point is that the
C/C++11 covers only data dependencies, not control dependencies.

> > And the correctness of this code has been called into question.  :-(
> > An embarrassingly long time ago -- I need to get this either proven
> > or fixed.
> 
> I agree.

Glad we agree on something!

> > Before C/C++11, the closest thing to such a prohibition is use of
> > volatile, for example, ACCESS_ONCE().  Even in C/C++11, you have to
> > use atomics to get anything resembing this prohibition.
> >
> > If you just use normal variables, the compiler is within its rights
> > to transform something like the following:
> >
> >    if (a)
> >       b = 1;
> >    else
> >       b = 42;
> >
> > Into:
> >
> >    b = 42;
> >    if (a)
> >       b = 1;
> >
> > Many other similar transformations are permitted.  Some are used to all
> > vector instructions to be used -- the compiler can do a write with an
> > overly wide vector instruction, then clean up the clobbered variables
> > later, if it wishes.  Again, if the variables are not marked volatile,
> > or, in C/C++11, atomic.
> 
> All this can justify only compiler barrier() which is almost free from
> performance point of view, since current gcc anyway doesn't perform store
> hoisting optimization in our case.

If the above example doesn't get you to give up your incorrect assumption
about "if" statements having much effect on ordering, you need more help
than I can give you just now.

> (And I'm not getting into philosophical discussion whether kernel code
> should consider future possible bugs/features in gcc or C/C++11
> standard).

Should you wish to get into that discussion in the near future, you
will need to find someone else to discuss it with.

> > The compilers don't always know as much as they might about the
> underlying
> > hardware's memory model.
> 
> That's correct in general. But can you point out a problem that really
> exists?

We will see.

In the meantime, can you summarize the ordering requirements of your
code?

> > Of course, if this code is architecture specific,
> > it can avoid DEC Alpha's fun and games, which could also violate your
> > assumptions in the above paragraph:
> >
> >    http://www.openvms.compaq.com/wizard/wiz_2637.html
> 
> Are you talking about this paragraph from above link:
> 
> "For instance, your producer must issue a "memory barrier" instruction
>   after writing the data to shared memory and before inserting it on
>   the queue; likewise, your consumer must issue a memory barrier
>   instruction after removing an item from the queue and before reading
>   from its memory.  Otherwise, you risk seeing stale data, since, while
>   the Alpha processor does provide coherent memory, it does not provide
>   implicit ordering of reads and writes.  (That is, the write of the
>   producer's data might reach memory after the write of the queue, such
>   that the consumer might read the new item from the queue but get the
>   previous values from the item's memory."
> 
> If yes, I don't think it explains the need of memory barrier on Alpha
> in our case (we all agree about the need of smp_wmb() right before @head
> update by producer). If not, could you please point to specific paragraph?

Did you miss the following passage in the paragraph you quoted?

	"... likewise, your consumer must issue a memory barrier
	instruction after removing an item from the queue and before
	reading from its memory."

That is why DEC Alpha readers need a read-side memory barrier -- it says
so right there.  And as either you or Peter noted earlier in this thread,
this barrier can be supplied by smp_read_barrier_depends().

I can sympathize if you are having trouble believing this.  After all,
it took the DEC Alpha architects a full hour to convince me, and that was
in a face-to-face meeting instead of over email.  (Just for the record,
it took me even longer to convince them that their earlier documentation
did not clearly indicate the need for these read-side barriers.)  But
regardless of whether or not I sympathize, DEC Alpha is what it is.

> > Anyway, proving or fixing the code in Documentation/circular-buffers.txt
> > has been on my list for too long, so I will take a closer look at it.
> 
> Thanks!
> 
> I'm concerned more about performance overhead imposed by the full memory
> barrier in kfifo circular buffers. Even if it is needed on Alpha (I don't
> understand why) we could try to solve this with some memory barrier which
> is effective only on architectures which really need it.

By exactly how much does the memory barrier slow your code down on some
example system?  (Yes, I can believe that it is a problem, but is it
really a problem in your exact situation?)

							Thanx, Paul


  parent reply	other threads:[~2013-11-01 11:15 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-10-22 23:54 perf events ring buffer memory barrier on powerpc Michael Neuling
2013-10-23  7:39 ` Victor Kaplansky
2013-10-23 14:19 ` Frederic Weisbecker
2013-10-23 14:25   ` Frederic Weisbecker
2013-10-25 17:37   ` Peter Zijlstra
2013-10-25 20:31     ` Michael Neuling
2013-10-27  9:00     ` Victor Kaplansky
2013-10-28  9:22       ` Peter Zijlstra
2013-10-28 10:02     ` Frederic Weisbecker
2013-10-28 12:38       ` Victor Kaplansky
2013-10-28 13:26         ` Peter Zijlstra
2013-10-28 16:34           ` Paul E. McKenney
2013-10-28 20:17             ` Oleg Nesterov
2013-10-28 20:58               ` Victor Kaplansky
2013-10-29 10:21                 ` Peter Zijlstra
2013-10-29 10:30                   ` Peter Zijlstra
2013-10-29 10:35                     ` Peter Zijlstra
2013-10-29 20:15                       ` Oleg Nesterov
2013-10-29 19:27                     ` Vince Weaver
2013-10-30 10:42                       ` Peter Zijlstra
2013-10-30 11:48                         ` James Hogan
2013-10-30 12:48                           ` Peter Zijlstra
2013-11-06 13:19                         ` [tip:perf/core] tools/perf: Add required memory barriers tip-bot for Peter Zijlstra
2013-11-06 13:50                           ` Vince Weaver
2013-11-06 14:00                             ` Peter Zijlstra
2013-11-06 14:28                               ` Peter Zijlstra
2013-11-06 14:55                                 ` Vince Weaver
2013-11-06 15:10                                   ` Peter Zijlstra
2013-11-06 15:23                                     ` Peter Zijlstra
2013-11-06 14:44                               ` Peter Zijlstra
2013-11-06 16:07                                 ` Peter Zijlstra
2013-11-06 17:31                                   ` Vince Weaver
2013-11-06 18:24                                     ` Peter Zijlstra
2013-11-07  8:21                                       ` Ingo Molnar
2013-11-07 14:27                                         ` Vince Weaver
2013-11-07 15:55                                           ` Ingo Molnar
2013-11-11 16:24                                         ` Peter Zijlstra
2013-11-11 21:10                                           ` Ingo Molnar
2013-10-29 21:23                     ` perf events ring buffer memory barrier on powerpc Michael Neuling
2013-10-30  9:27                 ` Paul E. McKenney
2013-10-30 11:25                   ` Peter Zijlstra
2013-10-30 14:52                     ` Victor Kaplansky
2013-10-30 15:39                       ` Peter Zijlstra
2013-10-30 17:14                         ` Victor Kaplansky
2013-10-30 17:44                           ` Peter Zijlstra
2013-10-31  6:16                       ` Paul E. McKenney
2013-11-01 13:12                         ` Victor Kaplansky
2013-11-02 16:36                           ` Paul E. McKenney
2013-11-02 17:26                             ` Paul E. McKenney
2013-10-31  6:40                     ` Paul E. McKenney
2013-11-01 14:25                       ` Victor Kaplansky
2013-11-02 17:28                         ` Paul E. McKenney
2013-11-01 14:56                       ` Peter Zijlstra
2013-11-02 17:32                         ` Paul E. McKenney
2013-11-03 14:40                           ` Paul E. McKenney
2013-11-03 15:17                             ` [RFC] arch: Introduce new TSO memory barrier smp_tmb() Peter Zijlstra
2013-11-03 18:08                               ` Linus Torvalds
2013-11-03 20:01                                 ` Peter Zijlstra
2013-11-03 22:42                                   ` Paul E. McKenney
2013-11-03 23:34                                     ` Linus Torvalds
2013-11-04 10:51                                       ` Paul E. McKenney
2013-11-04 11:22                                         ` Peter Zijlstra
2013-11-04 16:27                                           ` Paul E. McKenney
2013-11-04 16:48                                             ` Peter Zijlstra
2013-11-04 19:11                                             ` Peter Zijlstra
2013-11-04 19:18                                               ` Peter Zijlstra
2013-11-04 20:54                                                 ` Paul E. McKenney
2013-11-04 20:53                                               ` Paul E. McKenney
2013-11-05 14:05                                                 ` Will Deacon
2013-11-05 14:49                                                   ` Paul E. McKenney
2013-11-05 18:49                                                   ` Peter Zijlstra
2013-11-06 11:00                                                     ` Will Deacon
2013-11-06 12:39                                                 ` Peter Zijlstra
2013-11-06 12:51                                                   ` Geert Uytterhoeven
2013-11-06 13:57                                                     ` Peter Zijlstra
2013-11-06 18:48                                                       ` Paul E. McKenney
2013-11-06 19:42                                                         ` Peter Zijlstra
2013-11-07 11:17                                                       ` Will Deacon
2013-11-07 13:36                                                         ` Peter Zijlstra
2013-11-07 23:50                                           ` Mathieu Desnoyers
2013-11-04 11:05                                       ` Will Deacon
2013-11-04 16:34                                         ` Paul E. McKenney
2013-11-03 20:59                               ` Benjamin Herrenschmidt
2013-11-03 22:43                                 ` Paul E. McKenney
2013-11-03 17:07                             ` perf events ring buffer memory barrier on powerpc Will Deacon
2013-11-03 22:47                               ` Paul E. McKenney
2013-11-04  9:57                                 ` Will Deacon
2013-11-04 10:52                                   ` Paul E. McKenney
2013-11-01 16:11                       ` Peter Zijlstra
2013-11-02 17:46                         ` Paul E. McKenney
2013-11-01 16:18                       ` Peter Zijlstra
2013-11-02 17:49                         ` Paul E. McKenney
2013-10-30 13:28                   ` Victor Kaplansky
2013-10-30 15:51                     ` Peter Zijlstra
2013-10-30 18:29                       ` Peter Zijlstra
2013-10-30 19:11                         ` Peter Zijlstra
2013-10-31  4:33                       ` Paul E. McKenney
2013-10-31  4:32                     ` Paul E. McKenney
2013-10-31  9:04                       ` Peter Zijlstra
2013-10-31 15:07                         ` Paul E. McKenney
2013-10-31 15:19                           ` Peter Zijlstra
2013-11-01  9:28                             ` Paul E. McKenney
2013-11-01 10:30                               ` Peter Zijlstra
2013-11-02 15:20                                 ` Paul E. McKenney
2013-11-04  9:07                                   ` Peter Zijlstra
2013-11-04 10:00                                     ` Paul E. McKenney
2013-10-31  9:59                       ` Victor Kaplansky
2013-10-31 12:28                         ` David Laight
2013-10-31 12:55                           ` Victor Kaplansky
2013-10-31 15:25                         ` Paul E. McKenney [this message]
2013-11-01 16:06                           ` Victor Kaplansky
2013-11-01 16:25                             ` David Laight
2013-11-01 16:30                               ` Victor Kaplansky
2013-11-03 20:57                                 ` Benjamin Herrenschmidt
2013-11-02 15:46                             ` Paul E. McKenney
2013-10-28 19:09           ` Oleg Nesterov
2013-10-29 14:06     ` [tip:perf/urgent] perf: Fix perf ring buffer memory ordering tip-bot for Peter Zijlstra
2014-05-08 20:46 perf events ring buffer memory barrier on powerpc Mikulas Patocka
     [not found] ` <OF667059AA.7F151BCC-ONC2257CD3.0036CFEB-C2257CD3.003BBF01@il.ibm.com>
2014-05-09 12:20   ` Mikulas Patocka
2014-05-09 13:47     ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131031152543.GC4067@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=VICTORK@il.ibm.com \
    --cc=anton@samba.org \
    --cc=benh@kernel.crashing.org \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@ozlabs.org \
    --cc=mathieu.desnoyers@polymtl.ca \
    --cc=michael@ellerman.id.au \
    --cc=mikey@neuling.org \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).