From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S932337AbbFBTAJ (ORCPT ); Tue, 2 Jun 2015 15:00:09 -0400
Received: from resqmta-ch2-08v.sys.comcast.net ([69.252.207.40]:45863
        "EHLO resqmta-ch2-08v.sys.comcast.net" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1754466AbbFBS7q (ORCPT );
        Tue, 2 Jun 2015 14:59:46 -0400
Message-ID: <556DFD1A.7070802@gentoo.org>
Date: Tue, 02 Jun 2015 14:59:38 -0400
From: Joshua Kinard
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: Ralf Baechle
CC: Leonid Yegoshin, linux-mips@linux-mips.org, will.deacon@arm.com,
        linux-kernel@vger.kernel.org, benh@kernel.crashing.org,
        markos.chandras@imgtec.com, macro@linux-mips.org,
        Steven.Hill@imgtec.com, alexander.h.duyck@redhat.com,
        davem@davemloft.net
Subject: Re: [PATCH 0/3] MIPS: SMP memory barriers: lightweight sync, acquire-release
References: <20150602000818.6668.76632.stgit@ubuntu-yegoshin>
        <556D6C31.3070500@gentoo.org> <20150602095920.GD29986@linux-mips.org>
In-Reply-To: <20150602095920.GD29986@linux-mips.org>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 06/02/2015 05:59, Ralf Baechle wrote:
> On Tue, Jun 02, 2015 at 04:41:21AM -0400, Joshua Kinard wrote:
>
>> On 06/01/2015 20:09, Leonid Yegoshin wrote:
>>> The following series implements lightweight SYNC memory barriers for SMP Linux
>>> and a correct use of SYNCs around atomics, futexes, spinlocks etc LL-SC loops -
>>> the basic building blocks of any atomics in MIPS.
>>>
>>> Historically, a generic MIPS doesn't use memory barriers around LL-SC loops in
>>> atomics, spinlocks etc. However, Architecture documents never specify that LL-SC
>>> loop creates a memory barrier. Some non-generic MIPS vendors already feel
>>> the pain and enforces it.
>>> With introduction in a recent out-of-order superscalar
>>> MIPS processors an aggressive speculative memory read it is a problem now.
>>>
>>> The generic MIPS memory barrier instruction SYNC (aka SYNC 0) is something
>>> very heavy because it was designed for propagating barrier down to memory.
>>> MIPS R2 introduced lightweight SYNC instructions which correspond to smp_*()
>>> set of SMP barriers. The description was very HW-specific and it was never
>>> used, however, it is much less trouble for processor pipelines and can be used
>>> in smp_mb()/smp_rmb()/smp_wmb() as is as in acquire/release barrier semantics.
>>> After prolonged discussions with HW team it became clear that lightweight SYNCs
>>> were designed specifically with smp_*() in mind but description is in timeline
>>> ordering space.
>>>
>>> So, the problem was spotted recently in engineering tests and it was confirmed
>>> with tests that without memory barrier load and store may pass LL/SC
>>> instructions in both directions, even in old MIPS R2 processors.
>>> Aggressive speculation in MIPS R6 and MIPS I5600 processors adds more fire to
>>> this issue.
>>>
>>> 3 patches introduces a configurable control for lightweight SYNCs around LL/SC
>>> loops and for MIPS32 R2 it was allowed to choose an enforcing SYNCs or not
>>> (keep as is) because some old MIPS32 R2 may be happy without that SYNCs.
>>> In MIPS R6 I chose to have SYNC around LL/SC mandatory because all of that
>>> processors have an aggressive speculation and delayed write buffers. In that
>>> processors series it is still possible the use of SYNC 0 instead of
>>> lightweight SYNCs in configuration - just in case of some trouble in
>>> implementation in specific CPU. However, it is considered safe do not implement
>>> some or any lightweight SYNC in specific core because Architecture requires
>>> HW map of unimplemented SYNCs to SYNC 0.
>>
>> How useful might this be for older hardware, such as the R10k CPUs?
>> Just fallbacks to the old sync insn?
>
> The R10000 family is strongly ordered so there is no SYNC instruction
> required in the entire kernel even though some Origin hardware documentation
> incorrectly claims otherwise.

So no benefits even in the speculative execution case on noncoherent hardware
like IP28 and IP32?

--J