From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Anton Blanchard <anton@samba.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux PPC dev <linuxppc-dev@ozlabs.org>,
	Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>,
	Michael Ellerman <michael@ellerman.id.au>,
	Michael Neuling <mikey@neuling.org>
Subject: Re: perf events ring buffer memory barrier on powerpc
Date: Sun, 3 Nov 2013 06:40:17 -0800
Message-ID: <20131103144017.GA25118@linux.vnet.ibm.com>
In-Reply-To: <20131102173239.GB3947@linux.vnet.ibm.com>

On Sat, Nov 02, 2013 at 10:32:39AM -0700, Paul E. McKenney wrote:
> On Fri, Nov 01, 2013 at 03:56:34PM +0100, Peter Zijlstra wrote:
> > On Wed, Oct 30, 2013 at 11:40:15PM -0700, Paul E. McKenney wrote:
> > > > Now the whole crux of the question is if we need barrier A at all, since
> > > > the STORES issued by the @buf writes are dependent on the ubuf->tail
> > > > read.
> > >
> > > The dependency you are talking about is via the "if" statement?
> > > Even C/C++11 is not required to respect control dependencies.
> > >
> > > This one is a bit annoying.  The x86 TSO means that you really only
> > > need barrier(), ARM (recent ARM, anyway) and Power could use a weaker
> > > barrier, and so on -- but smp_mb() emits a full barrier.
> > >
> > > Perhaps a new smp_tmb() for TSO semantics, where reads are ordered
> > > before reads, writes before writes, and reads before writes, but not
> > > writes before reads?  Another approach would be to define a per-arch
> > > barrier for this particular case.
> >
> > I suppose we can only introduce new barrier primitives if there's more
> > than 1 use-case.
>
> There probably are others.

If there was an smp_tmb(), I would likely use it in rcu_assign_pointer().
There are some corner cases that can happen with the current smp_wmb()
that would be prevented by smp_tmb().  These corner cases are a bit
strange, as follows:

	struct foo *gp;

	void P0(void)
	{
		struct foo *p = kmalloc(sizeof(*p), GFP_KERNEL);

		if (!p)
			return;
		ACCESS_ONCE(p->a) = 0;
		BUG_ON(ACCESS_ONCE(p->a));
		rcu_assign_pointer(gp, p);
	}

	void P1(void)
	{
		struct foo *p = rcu_dereference(gp);

		if (!p)
			return;
		ACCESS_ONCE(p->a) = 1;
	}

With smp_wmb(), the BUG_ON() can occur because smp_wmb() does not
prevent the CPU from reordering the read in the BUG_ON() with the
rcu_assign_pointer().  With smp_tmb(), it could not.  Now, I am not
too worried about this because I cannot think of any use for code like
that in P0() and P1().  But if there was an smp_tmb(), it would be
cleaner to make the BUG_ON() impossible.

							Thanx, Paul

> > > > If the read shows no available space, we simply will not issue those
> > > > writes -- therefore we could argue we can avoid the memory barrier.
> > >
> > > Proving that means iterating through the permitted combinations of
> > > compilers and architectures...  There is always hand-coded assembly
> > > language, I suppose.
> >
> > I'm starting to think that while the C/C++ language spec says they can
> > wreck the world by doing these silly optimizations, real world users will
> > push back for breaking their existing code.
> >
> > I'm fairly sure the GCC people _will_ get shouted at _loudly_ when they
> > break the kernel by doing crazy shit like that.
> >
> > Given it's near impossible to write a correct program in C/C++ and
> > tagging the entire kernel with __atomic is equally not going to happen,
> > I think we must find a practical solution.
> >
> > Either that, or we really need to consider forking the language and
> > compiler :-(
>
> Depends on how much benefit the optimizations provide.  If they provide
> little or no benefit, I am with you, otherwise we will need to bite some
> bullet or another.  Keep in mind that there is a lot of code in the
> kernel that runs sequentially (e.g., due to being fully protected by
> locks), and aggressive optimizations for that sort of code are harmless.
>
> Can't say I know the answer at the moment, though.
>
> 							Thanx, Paul
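
For readers following the "barrier A" question, here is a minimal userspace sketch of the pattern under discussion, written with C11 atomics rather than kernel primitives. It is not the perf ring buffer code: struct ring, RB_SIZE, rb_init(), rb_put(), rb_get() and rb_selftest() are hypothetical names made up for illustration. Because C11 is not required to respect the control dependency through the "if", the sketch assumes an acquire load of tail in the producer in place of barrier A, so that the data store cannot be reordered before the check for free space.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RB_SIZE 8u	/* hypothetical capacity; power of two so masking works */

struct ring {
	_Atomic size_t head;	/* advanced only by the producer */
	_Atomic size_t tail;	/* advanced only by the consumer */
	int buf[RB_SIZE];
};

static void rb_init(struct ring *rb)
{
	atomic_init(&rb->head, 0);
	atomic_init(&rb->tail, 0);
}

/* Producer side: returns false, and writes nothing, when the ring is full. */
static bool rb_put(struct ring *rb, int v)
{
	size_t head = atomic_load_explicit(&rb->head, memory_order_relaxed);
	/* Acquire stands in for "barrier A": the buf store below is
	 * ordered after this load of tail without relying on the "if". */
	size_t tail = atomic_load_explicit(&rb->tail, memory_order_acquire);

	if (head - tail >= RB_SIZE)
		return false;	/* no space, so the data store is never issued */

	rb->buf[head & (RB_SIZE - 1)] = v;
	/* Release: the data store is visible before the new head is. */
	atomic_store_explicit(&rb->head, head + 1, memory_order_release);
	return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool rb_get(struct ring *rb, int *out)
{
	size_t tail = atomic_load_explicit(&rb->tail, memory_order_relaxed);
	/* Acquire pairs with the producer's release store of head. */
	size_t head = atomic_load_explicit(&rb->head, memory_order_acquire);

	if (head == tail)
		return false;

	*out = rb->buf[tail & (RB_SIZE - 1)];
	/* Release: the slot is read before the producer may reuse it. */
	atomic_store_explicit(&rb->tail, tail + 1, memory_order_release);
	return true;
}

/* Single-threaded sanity check of the full/empty logic only; it does
 * not exercise any cross-CPU reordering. */
int rb_selftest(void)
{
	struct ring rb;
	int v = -1;
	bool ok;

	rb_init(&rb);
	for (unsigned int i = 0; i < RB_SIZE; i++) {
		ok = rb_put(&rb, (int)i);
		assert(ok);
	}
	ok = rb_put(&rb, 99);		/* ring is full: must be refused */
	assert(!ok);
	ok = rb_get(&rb, &v);		/* oldest element comes out first */
	assert(ok && v == 0);
	ok = rb_put(&rb, 100);		/* one slot was freed by the get */
	assert(ok);
	return 0;
}
```

Calling rb_selftest() from a test harness checks only the bookkeeping. Whether the acquire load could be weakened on a given architecture is exactly the open question in the thread, and this sketch does not attempt to answer it.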