Date: Wed, 3 Feb 2016 19:13:42 +0000
From: Will Deacon
To: Linus Torvalds
Cc: Boqun Feng, Paul McKenney, Peter Zijlstra, "Maciej W. Rozycki",
    David Daney, Måns Rullgård, Ralf Baechle, Linux Kernel Mailing List
Subject: Re: [RFC][PATCH] mips: Fix arch_spin_unlock()
Message-ID: <20160203191342.GC15852@arm.com>

On Tue, Feb 02, 2016 at 11:55:57AM -0800, Linus Torvalds wrote:
> On Tue, Feb 2, 2016 at 11:30 AM, Will Deacon wrote:
> >
> > FWIW, and this is by no means conclusive, I hacked that up quickly and
> > ran hackbench a few times on the nearest idle arm64 system. The results
> > were consistently ~4% slower using acquire for rcu_dereference.
>
> Ok, that's *much* more noticeable than I would have expected. I take
> it that load-acquire is really really slow on current arm64
> implementations.

See my reply to Ingo, but it seems a bunch of this was down to rebooting
the system between runs and hackbench being particularly susceptible to
that.

> Just out of interest, is store-release slow too? Because that should
> be easy to make fast.

There's a slight gotcha with arm64's store-release instruction in that
it's RCsc and therefore orders against a subsequent load-acquire. That's
not to say you can't make it fast, but it's potentially more involved
than posting a flag in a store buffer (or whatever you were envisaging :)

Measuring store-release is much more difficult, because you can't replace
it with a dependency or the like, only other barrier constructs.

Will
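
As a rough illustration of the RCsc point above (a sketch only, not
something that was measured in this thread), the ordering in question is
the one probed by a store-buffering litmus test. The code below uses C11
atomics rather than kernel primitives, and the names x, y, r0, r1 and
count_both_zero() are made up for the example. It assumes a compiler that
maps seq_cst stores and loads to stlr and ldar on arm64, which is what GCC
and Clang are generally understood to do. Because the store-release is
ordered against the later load-acquire, the outcome where both threads
read 0 should never be observed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical shared state for the litmus test. */
static atomic_int x, y;
static int r0, r1;

/* Thread 0: store to x, then load from y. */
static void *thread0(void *arg)
{
    (void)arg;
    /* Assumed to compile to stlr on arm64. */
    atomic_store_explicit(&x, 1, memory_order_seq_cst);
    /* Assumed to compile to ldar on arm64. */
    r0 = atomic_load_explicit(&y, memory_order_seq_cst);
    return NULL;
}

/* Thread 1: store to y, then load from x. */
static void *thread1(void *arg)
{
    (void)arg;
    atomic_store_explicit(&y, 1, memory_order_seq_cst);
    r1 = atomic_load_explicit(&x, memory_order_seq_cst);
    return NULL;
}

/*
 * Run the litmus test `iterations` times and return how often both
 * loads saw 0. Returns -1 if a thread could not be created.
 */
int count_both_zero(int iterations)
{
    int hits = 0;

    for (int i = 0; i < iterations; i++) {
        pthread_t t0, t1;

        atomic_store(&x, 0);
        atomic_store(&y, 0);

        if (pthread_create(&t0, NULL, thread0, NULL) != 0)
            return -1;
        if (pthread_create(&t1, NULL, thread1, NULL) != 0) {
            pthread_join(t0, NULL);
            return -1;
        }

        /* Joining makes the plain writes to r0 and r1 visible here. */
        pthread_join(t0, NULL);
        pthread_join(t1, NULL);

        if (r0 == 0 && r1 == 0)
            hits++;
    }
    return hits;
}

#ifdef SB_DEMO_MAIN
int main(void)
{
    int hits = count_both_zero(1000);

    /* Sequential consistency forbids the both-zero outcome. */
    assert(hits == 0);
    printf("both-zero outcomes: %d\n", hits);
    return 0;
}
#endif
```

Build with something like "cc -pthread -DSB_DEMO_MAIN" to get a standalone
program. If the accesses are weakened to memory_order_release and
memory_order_acquire, the C11 model permits the both-zero outcome, even
though an stlr followed by an ldar on arm64 would still rule it out. That
gap is the extra ordering the store-release has to provide, which is why
it is more than posting a flag in a store buffer.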