From mboxrd@z Thu Jan 1 00:00:00 1970 From: Peter Zijlstra Subject: Re: [PATCH bpf-next 2/3] tools, perf: use smp_{rmb,mb} barriers instead of {rmb,mb} Date: Fri, 19 Oct 2018 11:44:17 +0200 Message-ID: <20181019094417.GE3121@hirez.programming.kicks-ass.net> References: <20181017144156.16639-1-daniel@iogearbox.net> <20181017144156.16639-3-daniel@iogearbox.net> <20181017155050.GM3121@hirez.programming.kicks-ass.net> <55f86215-44a8-2bb8-b1d0-a77a142dc697@iogearbox.net> <20181018081434.GT3121@hirez.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: alexei.starovoitov@gmail.com, paulmck@linux.vnet.ibm.com, will.deacon@arm.com, acme@redhat.com, yhs@fb.com, john.fastabend@gmail.com, netdev@vger.kernel.org To: Daniel Borkmann Return-path: Received: from bombadil.infradead.org ([198.137.202.133]:33530 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726609AbeJSRtp (ORCPT ); Fri, 19 Oct 2018 13:49:45 -0400 Content-Disposition: inline In-Reply-To: Sender: netdev-owner@vger.kernel.org List-ID: On Thu, Oct 18, 2018 at 05:04:34PM +0200, Daniel Borkmann wrote: > diff --git a/tools/include/linux/ring_buffer.h b/tools/include/linux/ring_buffer.h > new file mode 100644 > index 0000000..48200e0 > --- /dev/null > +++ b/tools/include/linux/ring_buffer.h > @@ -0,0 +1,69 @@ > +#ifndef _TOOLS_LINUX_RING_BUFFER_H_ > +#define _TOOLS_LINUX_RING_BUFFER_H_ > + > +#include > +#include > + > +/* > + * Below barriers pair as follows (kernel/events/ring_buffer.c): > + * > + * Since the mmap() consumer (userspace) can run on a different CPU: > + * > + * kernel user > + * > + * if (LOAD ->data_tail) { LOAD ->data_head > + * (A) smp_rmb() (C) > + * STORE $data LOAD $data > + * smp_wmb() (B) smp_mb() (D) > + * STORE ->data_head STORE ->data_tail > + * } > + * > + * Where A pairs with D, and B pairs with C. > + * > + * In our case A is a control dependency that separates the load > + * of the ->data_tail and the stores of $data. In case ->data_tail > + * indicates there is no room in the buffer to store $data we do not. > + * > + * D needs to be a full barrier since it separates the data READ > + * from the tail WRITE. > + * > + * For B a WMB is sufficient since it separates two WRITEs, and for > + * C an RMB is sufficient since it separates two READs. > + */ > + > +/* > + * Note, instead of B, C, D we could also use smp_store_release() > + * in B and D as well as smp_load_acquire() in C. However, this > + * optimization makes sense not for all architectures since it > + * would resolve into READ_ONCE() + smp_mb() pair for smp_load_acquire() > + * and smp_mb() + WRITE_ONCE() pair for smp_store_release(), thus > + * for those smp_wmb() in B and smp_rmb() in C would still be less > + * expensive. For the case of D this has either the same cost or > + * is less expensive. For example, due to TSO (total store order), > + * x86 can avoid the CPU barrier entirely. > + */ > + > +static inline u64 ring_buffer_read_head(struct perf_event_mmap_page *base) > +{ > +/* > + * Architectures where smp_load_acquire() does not fallback to > + * READ_ONCE() + smp_mb() pair. > + */ > +#if defined(__x86_64__) || defined(__aarch64__) || defined(__powerpc64__) || \ > + defined(__ia64__) || defined(__sparc__) && defined(__arch64__) > + return smp_load_acquire(&base->data_head); > +#else > + u64 head = READ_ONCE(base->data_head); > + > + smp_rmb(); > + return head; > +#endif > +} > + > +static inline void ring_buffer_write_tail(struct perf_event_mmap_page *base, > + u64 tail) > +{ > + smp_store_release(&base->data_tail, tail); > +} > + > +#endif /* _TOOLS_LINUX_RING_BUFFER_H_ */ (for the whole patch, but in particular the above) Acked-by: Peter Zijlstra (Intel)