References: <1453976119-24372-1-git-send-email-alex.bennee@linaro.org>
 <1453976119-24372-4-git-send-email-alex.bennee@linaro.org>
 <87h9fl12zq.fsf@gmail.com>
From: Paolo Bonzini
Message-ID: <57022283.5070400@redhat.com>
Date: Mon, 4 Apr 2016 10:14:59 +0200
In-Reply-To: <87h9fl12zq.fsf@gmail.com>
Subject: Re: [Qemu-devel] [PATCH v1 3/5] include/qemu/atomic.h: default to __atomic functions
To: Pranith Kumar, Alex Bennée
Cc: mttcg@greensocs.com, peter.maydell@linaro.org, mark.burton@greensocs.com,
 a.rigo@virtualopensystems.com, qemu-devel@nongnu.org, stefanha@redhat.com,
 fred.konrad@greensocs.com

On 01/04/2016 22:35, Pranith Kumar wrote:
>> [...]; barrier(); })
>
> I could not really understand why we need to wrap the fence with
> barrier()'s. There are three parts to my confusion. Let me ask one after the
> other.
>
> On x86, __atomic_thread_fence(__ATOMIC_SEQ_CST) will generate an mfence
> instruction. On ARM, this will generate the dmb instruction. Both these
> serializing instructions also act as compiler barriers. Is there any
> architecture which does not generate such a serializing instruction?

(More on this later).

>> +#define smp_wmb()   ({ barrier(); __atomic_thread_fence(__ATOMIC_RELEASE); barrier(); })
>> +#define smp_rmb()   ({ barrier(); __atomic_thread_fence(__ATOMIC_ACQUIRE); barrier(); })
>
> Second, why do you need barrier() on both sides? One barrier() seems to be
> sufficient to prevent the compiler from reordering across the macro. Am I
> missing something?

Yes, that's true.

> Finally, I tried looking at the gcc docs but could find nothing regarding
> __atomic_thread_fence() not being considered as a memory barrier. What I did
> find mentions about it being treated as a function call during the main
> optimization stages and not during later stages:
>
> http://www.spinics.net/lists/gcchelp/msg39798.html
>
> AFAIU, in these later stages, even adding a barrier() as we are doing will
> have no effect.
>
> Can you point me to any docs which talk more about this?

The issue is that atomic_thread_fence() only affects other atomic
operations, while smp_rmb() and smp_wmb() affect normal loads and stores
as well.

In the GCC implementation, atomic operations (even relaxed ones) access
memory as if the pointer was volatile.  By doing this, GCC can remove
the acquire and release fences altogether on TSO architectures.  We
actually observed a case where the compiler subsequently inverted the
order of two writes around a smp_wmb().  It was fixed in commit 3bbf572
("atomics: add explicit compiler fence in __atomic memory barriers",
2015-06-05).

In principle it could do the same on architectures that are sequentially
consistent; even if none exists in practice, keeping the barriers for
smp_mb() is consistent with the other barriers.

Paolo
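
[Editorial sketch, not part of the original message.] The point about
plain loads and stores can be illustrated with a small example. This is
only a sketch under assumptions: the barrier() definition is the usual
GCC/Clang empty-asm idiom rather than a copy of the QEMU header, and
payload, ready, producer_publish() and consumer_try_read() are
hypothetical names invented for illustration. It shows a plain
(non-atomic) store being ordered by smp_wmb() and a plain load by
smp_rmb(), which is the case where the fence alone may not constrain
the compiler.

```c
#include <assert.h>

/* Assumed definition: a compiler-only barrier (GCC/Clang extended asm).
 * It emits no instruction but forbids the compiler from moving memory
 * accesses across it. */
#define barrier()   __asm__ __volatile__("" ::: "memory")

/* The pattern discussed in the thread: the fence wrapped in compiler
 * barriers, so that plain loads and stores are ordered as well. */
#define smp_wmb()   ({ barrier(); __atomic_thread_fence(__ATOMIC_RELEASE); barrier(); })
#define smp_rmb()   ({ barrier(); __atomic_thread_fence(__ATOMIC_ACQUIRE); barrier(); })

/* Hypothetical shared state, accessed with plain loads and stores. */
int payload;
int ready;

/* Hypothetical producer: write the data, then publish the flag. */
void producer_publish(int value)
{
    payload = value;    /* plain store */
    smp_wmb();          /* keep the payload store before the flag store */
    ready = 1;          /* plain store */
}

/* Hypothetical consumer: returns 1 and fills *out once the flag is set. */
int consumer_try_read(int *out)
{
    if (!ready) {       /* plain load */
        return 0;
    }
    smp_rmb();          /* keep the flag load before the payload load */
    *out = payload;     /* plain load */
    return 1;
}
```

Called from a single thread, consumer_try_read() fails before
producer_publish() has run and succeeds afterwards; the barriers only
matter once the two functions run on different threads. Real code would
normally read and write the flag through an atomic accessor rather than
a plain access.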