From: "Paul E. McKenney"
Subject: Re: [PATCH bpf-next 2/3] tools, perf: use smp_{rmb,mb} barriers instead of {rmb,mb}
Date: Fri, 19 Oct 2018 04:56:12 -0700
Message-ID: <20181019115612.GT2674@linux.ibm.com>
References: <20181017144156.16639-1-daniel@iogearbox.net>
 <20181017144156.16639-3-daniel@iogearbox.net>
 <20181017155050.GM3121@hirez.programming.kicks-ass.net>
 <55f86215-44a8-2bb8-b1d0-a77a142dc697@iogearbox.net>
 <20181018081434.GT3121@hirez.programming.kicks-ass.net>
 <20181018153307.ayvmq6du3gnsyvro@ast-mbp.dhcp.thefacebook.com>
 <1e3eea86-f708-f48b-4db4-911a84b94e06@iogearbox.net>
 <20181019035340.ahjocmdj2o2zam4m@ast-mbp.dhcp.thefacebook.com>
 <20181019110243.GC14246@arm.com>
Reply-To: paulmck@linux.ibm.com
Cc: Alexei Starovoitov, Daniel Borkmann, Peter Zijlstra, acme@redhat.com,
 yhs@fb.com, john.fastabend@gmail.com, netdev@vger.kernel.org
To: Will Deacon
In-Reply-To: <20181019110243.GC14246@arm.com>

On Fri, Oct 19, 2018 at 12:02:43PM +0100, Will Deacon wrote:
> On Thu, Oct 18, 2018 at 08:53:42PM -0700, Alexei Starovoitov wrote:
> > On Thu, Oct 18, 2018 at 09:00:46PM +0200, Daniel Borkmann wrote:
> > > On 10/18/2018 05:33 PM, Alexei Starovoitov wrote:
> > > > On Thu, Oct 18, 2018 at 05:04:34PM +0200, Daniel Borkmann wrote:
> > > >>  #endif /* _TOOLS_LINUX_ASM_IA64_BARRIER_H */
> > > >> diff --git a/tools/arch/powerpc/include/asm/barrier.h b/tools/arch/powerpc/include/asm/barrier.h
> > > >> index a634da0..905a2c6 100644
> > > >> --- a/tools/arch/powerpc/include/asm/barrier.h
> > > >> +++ b/tools/arch/powerpc/include/asm/barrier.h
> > > >> @@ -27,4 +27,20 @@
> > > >>  #define rmb() __asm__ __volatile__ ("sync" : : : "memory")
> > > >>  #define wmb() __asm__ __volatile__ ("sync" : : : "memory")
> > > >>
> > > >> +#if defined(__powerpc64__)
> > > >> +#define smp_lwsync() __asm__ __volatile__ ("lwsync" : : : "memory")
> > > >> +
> > > >> +#define smp_store_release(p, v) \
> > > >> +do { \
> > > >> + smp_lwsync(); \
> > > >> + WRITE_ONCE(*p, v); \
> > > >> +} while (0)
> > > >> +
> > > >> +#define smp_load_acquire(p) \
> > > >> +({ \
> > > >> + typeof(*p) ___p1 = READ_ONCE(*p); \
> > > >> + smp_lwsync(); \
> > > >> + ___p1; \
> > > >
> > > > I don't like this proliferation of asm.
> > > > Why do we think that we can do better job than compiler?
> > > > can we please use gcc builtins instead?
> > > > https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
> > > > __atomic_load_n(ptr, __ATOMIC_ACQUIRE);
> > > > __atomic_store_n(ptr, val, __ATOMIC_RELEASE);
> > > > are done specifically for this use case if I'm not mistaken.
> > > > I think it pays to learn what compiler provides.
> > >
> > > But are you sure the C11 memory model matches exact same model as kernel?
> > > Seems like last time Will looked into it [0] it wasn't the case ...
> >
> > I'm only suggesting equivalence of __atomic_load_n(ptr, __ATOMIC_ACQUIRE)
> > with kernel's smp_load_acquire().
> > I've seen a bunch of user space ring buffer implementations implemented
> > with __atomic_load_n() primitives.
> > But let's ask experts who live in both worlds.
>
> One thing to be wary of is if there is an implementation choice between
> how to implement load-acquire and store-release for a given architecture.
> In these situations, it's often important that concurrent software agrees
> on the "mapping", so we'd need to be sure that (a) All userspace compilers
> that we care about have compatible mappings and (b) These mappings are
> compatible with the kernel code.

Agreed!  Mixing and matching can be done, but it does require quite a bit
of care.

							Thanx, Paul
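
For readers following the thread, here is a minimal sketch of the kind of
user-space single-producer/single-consumer ring buffer Alexei mentions,
written with the GCC __atomic builtins. It is not the perf ring buffer code
and not part of the patch; struct ring, RING_SIZE, ring_push(), ring_pop()
and ring_selftest() are hypothetical names chosen for illustration. It
assumes both sides are built with the same compiler mapping for acquire and
release, which is exactly the condition Will raises; whether that mapping
also agrees with the kernel's smp_load_acquire()/smp_store_release() on a
given architecture is the open question and is not settled by this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u	/* hypothetical capacity; must be a power of two */

struct ring {
	uint32_t head;		/* advanced by the producer only */
	uint32_t tail;		/* advanced by the consumer only */
	int data[RING_SIZE];
};

/* Producer side: returns false if the ring is full. */
static bool ring_push(struct ring *r, int v)
{
	uint32_t head = __atomic_load_n(&r->head, __ATOMIC_RELAXED);
	/* Acquire pairs with the consumer's release store to tail. */
	uint32_t tail = __atomic_load_n(&r->tail, __ATOMIC_ACQUIRE);

	if (head - tail == RING_SIZE)
		return false;

	r->data[head & (RING_SIZE - 1)] = v;
	/* Release: the slot write above is ordered before the new head. */
	__atomic_store_n(&r->head, head + 1, __ATOMIC_RELEASE);
	return true;
}

/* Consumer side: returns false if the ring is empty. */
static bool ring_pop(struct ring *r, int *out)
{
	uint32_t tail = __atomic_load_n(&r->tail, __ATOMIC_RELAXED);
	/* Acquire pairs with the producer's release store to head. */
	uint32_t head = __atomic_load_n(&r->head, __ATOMIC_ACQUIRE);

	if (head == tail)
		return false;

	*out = r->data[tail & (RING_SIZE - 1)];
	/* Release: the slot read above is ordered before the new tail. */
	__atomic_store_n(&r->tail, tail + 1, __ATOMIC_RELEASE);
	return true;
}

/*
 * Single-threaded smoke test: fill the ring, check that it reports full,
 * then drain it in FIFO order.  Returns the number of items drained,
 * or -1 on any unexpected result.
 */
static int ring_selftest(void)
{
	struct ring r = {0};
	int v = 0, n = 0;

	for (int i = 0; i < (int)RING_SIZE; i++)
		if (!ring_push(&r, i))
			return -1;
	if (ring_push(&r, 99))
		return -1;
	while (ring_pop(&r, &v)) {
		if (v != n)
			return -1;
		n++;
	}
	return n;
}
```

The smoke test only checks the index arithmetic in one thread. It says
nothing about ordering across CPUs, which depends on how the compiler maps
__ATOMIC_ACQUIRE and __ATOMIC_RELEASE to instructions on the target.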