From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:50035)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <cota@braap.org>) id 1b4sCV-0007Sz-3K
	for qemu-devel@nongnu.org; Mon, 23 May 2016 11:55:28 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <cota@braap.org>) id 1b4sCQ-0008Ob-PQ
	for qemu-devel@nongnu.org; Mon, 23 May 2016 11:55:25 -0400
Received: from out2-smtp.messagingengine.com ([66.111.4.26]:38898)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <cota@braap.org>) id 1b4sCO-0008I9-Fl
	for qemu-devel@nongnu.org; Mon, 23 May 2016 11:55:22 -0400
Date: Mon, 23 May 2016 11:55:10 -0400
From: "Emilio G. Cota" <cota@braap.org>
Message-ID: <20160523155510.GC1768@flamenco>
References: <1463863336-28760-1-git-send-email-cota@braap.org>
	<1463863336-28760-2-git-send-email-cota@braap.org>
	<955e8307-01a5-b2f9-48df-8309bd30c443@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <955e8307-01a5-b2f9-48df-8309bd30c443@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 1/2] atomics: do not use __atomic
 primitives for RCU atomics
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: QEMU Developers <qemu-devel@nongnu.org>, MTTCG Devel <mttcg@greensocs.com>, Alex =?iso-8859-1?Q?Benn=E9e?= <alex.bennee@linaro.org>, Richard Henderson <rth@twiddle.net>, Sergey Fedorov <serge.fdrv@gmail.com>

On Mon, May 23, 2016 at 16:21:36 +0200, Paolo Bonzini wrote:
> On 21/05/2016 22:42, Emilio G. Cota wrote:
> > Commit a0aa44b4 ("include/qemu/atomic.h: default to __atomic functions")
> > set all atomics to default (on recent GCC versions) to __atomic primitives.
> > 
> > In the process, the atomic_rcu_read/set were converted to implement
> > consume/release semantics, respectively. This is inefficient; for
> > correctness and maximum performance we only need an smp_barrier_depends
> > for reads, and an smp_wmb for writes. Fix it by using the original
> > definition of these two primitives for all compilers.
> 
> Indeed most compilers implement consume the same as acquire, which is
> inefficient.
> However, isn't in practice atomic_thread_fence(release) +
> atomic_store(relaxed) the same as atomic_store(release)?

Yes. However this is not the issue I'm addressing with the patch.

The performance regression I measured is due to using load-acquire vs.
load+smp_read_barrier_depends(). In the latter case only Alpha will
emit a fence; in the former we always emit a load-acquire, which
is "stronger" (i.e. more constraining).

A similar thing applies to atomic_rcu_set, although I haven't
measured its impact. We only need smp_wmb+store, yet we emit a
store-release, which is again "stronger".
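
To make the comparison concrete, here is a minimal C11 sketch of the two
read-side and two write-side variants discussed above. All names
(sketch_*, read_*, set_*, struct node) are hypothetical and only for
illustration; this is not the code in include/qemu/atomic.h. Note that
sketch_wmb() is assumed to be a release fence, because C11 has no
store-store-only fence; a real smp_wmb is architecture-specific and can
be cheaper than that.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for smp_read_barrier_depends(): a real fence
 * on Alpha, a compiler-only barrier everywhere else. */
#if defined(__alpha__)
#define sketch_read_barrier_depends() __asm__ __volatile__("mb" ::: "memory")
#else
#define sketch_read_barrier_depends() atomic_signal_fence(memory_order_seq_cst)
#endif

/* Assumption: approximates smp_wmb with a release fence, since C11
 * offers no fence that orders only stores against stores. */
#define sketch_wmb() atomic_thread_fence(memory_order_release)

struct node { int value; };

static _Atomic(struct node *) shared;

/* Reader, variant A: load-acquire (what consume usually degrades to). */
static struct node *read_acquire(void)
{
    return atomic_load_explicit(&shared, memory_order_acquire);
}

/* Reader, variant B: relaxed load followed by a dependency barrier. */
static struct node *read_depends(void)
{
    struct node *p = atomic_load_explicit(&shared, memory_order_relaxed);
    sketch_read_barrier_depends();
    return p;
}

/* Writer, variant A: store-release. */
static void set_release(struct node *p)
{
    atomic_store_explicit(&shared, p, memory_order_release);
}

/* Writer, variant B: write barrier followed by a relaxed store. */
static void set_wmb(struct node *p)
{
    sketch_wmb();
    atomic_store_explicit(&shared, p, memory_order_relaxed);
}

/* Single-threaded smoke check: publish with variant B, read with
 * variant B, and return the value seen through the pointer. */
static int sketch_demo(void)
{
    static struct node n = { 42 };
    set_wmb(&n);
    struct node *p = read_depends();
    assert(p == &n);
    return p->value;
}
```

The sketch only shows where each barrier sits; it does not measure
anything, and the actual cost of each variant depends on the target
and the compiler.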

		E.