From: Måns Rullgård
To: Peter Zijlstra
Cc: ralf@linux-mips.org, ddaney@caviumnetworks.com, linux-kernel@vger.kernel.org,
    Paul McKenney, Will Deacon, torvalds@linux-foundation.org, boqun.feng@gmail.com
Subject: Re: [RFC][PATCH] mips: Fix arch_spin_unlock()
Date: Thu, 12 Nov 2015 13:31:11 +0000
References: <20151112123123.GZ17308@twins.programming.kicks-ass.net>
In-Reply-To: <20151112123123.GZ17308@twins.programming.kicks-ass.net>
    (Peter Zijlstra's message of "Thu, 12 Nov 2015 13:31:23 +0100")

Peter Zijlstra writes:

> Hi
>
> I think the MIPS arch_spin_unlock() is borken.
>
> spin_unlock() must have RELEASE semantics, these require that no LOADs
> nor STOREs leak out from the critical section.
>
> From what I know MIPS has a relaxed memory model which allows reads to
> pass stores, and as implemented arch_spin_unlock() only issues a wmb
> which doesn't order prior reads vs later stores.

This is correct.

> Therefore upgrade the wmb() to smp_mb().
>
> (Also, why the unconditional wmb, as opposed to smp_wmb() ?)

Good question.

The current MIPS asm/barrier.h uses a plain SYNC instruction for all
kinds of barriers (except on Cavium Octeon), which is a bit wasteful.
A MIPS implementation can optionally support partial barriers (load,
store, acquire, release), which all behave like a full barrier if not
implemented, so those really ought to be used.

> Maybe-Signed-off-by: Peter Zijlstra (Intel)
> ---
> diff --git a/arch/mips/include/asm/spinlock.h b/arch/mips/include/asm/spinlock.h
> index 40196bebe849..b2ca13f06152 100644
> --- a/arch/mips/include/asm/spinlock.h
> +++ b/arch/mips/include/asm/spinlock.h
> @@ -140,7 +140,7 @@ static inline void arch_spin_lock(arch_spinlock_t *lock)
>  static inline void arch_spin_unlock(arch_spinlock_t *lock)
>  {
>  	unsigned int serving_now = lock->h.serving_now + 1;
> -	wmb();
> +	smp_mb();
>  	lock->h.serving_now = (u16)serving_now;
>  	nudge_writes();
>  }

All this weirdness was added in commit 500c2e1f:

    MIPS: Optimize spinlocks.

    The current locking mechanism uses a ll/sc sequence to release a
    spinlock.  This is slower than a wmb() followed by a store to unlock.

    The branching forward to .subsection 2 on sc failure slows down the
    contended case.  So we get rid of that part too.

    Since we are now working on naturally aligned u16 values, we can get
    rid of a masking operation as the LHU already does the right thing.
    The ANDI are reversed for better scheduling on multi-issue CPUs.

    On a 12 CPU 750MHz Octeon cn5750 this patch improves ipv4 UDP packet
    forwarding rates from 3.58*10^6 PPS to 3.99*10^6 PPS, or about 11%.

    Signed-off-by: David Daney
    To: linux-mips@linux-mips.org
    Patchwork: http://patchwork.linux-mips.org/patch/937/
    Signed-off-by: Ralf Baechle

-- 
Måns Rullgård
mans@mansr.com
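
To make the partial-barrier suggestion concrete, here is a minimal sketch of
what the unlock path might look like if it used the architecturally defined
SYNC "stype" encodings rather than a bare SYNC or smp_mb().  The names
__sync_stype, smp_release_barrier and arch_spin_unlock_sketch are made up for
illustration and are not proposed kernel interfaces; the stype values (0x11
acquire, 0x12 release) come from the MIPS32/MIPS64 Release 2 architecture
definition, where an unimplemented stype must behave as a full SYNC, so the
sketch is never weaker than a full barrier.

    /*
     * Illustrative sketch only, not the patch under discussion.
     * A release barrier (SYNC 0x12) orders all older loads and stores
     * before younger stores, which is exactly what the store to
     * serving_now needs; an unimplemented stype falls back to SYNC 0.
     */
    #define __sync_stype(stype)					\
    	__asm__ __volatile__(					\
    		".set	push\n\t"				\
    		".set	mips32r2\n\t"				\
    		"sync	" #stype "\n\t"				\
    		".set	pop"					\
    		: : : "memory")

    /* Hypothetical helper: release semantics for an unlock-by-store. */
    #define smp_release_barrier()	__sync_stype(0x12)

    static inline void arch_spin_unlock_sketch(arch_spinlock_t *lock)
    {
    	unsigned int serving_now = lock->h.serving_now + 1;

    	/*
    	 * Unlike wmb(), this also orders the loads done inside the
    	 * critical section before the serving_now store that releases
    	 * the lock, giving spin_unlock() its RELEASE semantics without
    	 * paying for a full SYNC on CPUs that implement stype 0x12.
    	 */
    	smp_release_barrier();
    	lock->h.serving_now = (u16)serving_now;
    	nudge_writes();
    }

The matching acquire side would use SYNC 0x11 after the ticket is obtained in
arch_spin_lock(); on CPUs that implement only stype 0, both collapse back to
the full barrier the code uses today.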