Date: Wed, 18 Feb 2015 16:59:04 +0100
From: Oleg Nesterov
To: "Paul E. McKenney", Manfred Spraul
Cc: Peter Zijlstra, Kirill Tkhai, linux-kernel@vger.kernel.org,
    Ingo Molnar, Josh Poimboeuf
Subject: Re: [PATCH 2/2] [PATCH] sched: Add smp_rmb() in task rq locking cycles
Message-ID: <20150218155904.GA27687@redhat.com>
References: <20150217104516.12144.85911.stgit@tkhai>
    <1424170021.5749.22.camel@tkhai>
    <20150217121258.GM5029@twins.programming.kicks-ass.net>
    <20150217130523.GV24151@twins.programming.kicks-ass.net>
    <20150217160532.GW4166@linux.vnet.ibm.com>
    <20150217183636.GR5029@twins.programming.kicks-ass.net>
    <20150217215231.GK4166@linux.vnet.ibm.com>
In-Reply-To: <20150217215231.GK4166@linux.vnet.ibm.com>

(Forgot to add Manfred, resending)

Thanks Paul and Peter, this was interesting reading ;)

This is almost off-topic (but see below), but perhaps memory-barriers.txt
could also mention spin_unlock_wait() to explain that _most probably_ it
is pointless without the memory barrier(s), and that the barrier
before-or-after unlock_wait() pairs with release-or-acquire.

At the same time, code like

	spin_unlock_wait();
	STORE;

_can_ be correct because this implies the load-store control dependency.

On 02/17, Paul E.
McKenney wrote:
>
>      |   mb  |  wmb  |  rmb  |  rbd  |  acq  |  rel  |  ctl  |
> -----+-------+-------+-------+-------+-------+-------+-------+
>   mb |   Y   |       |   Y   |   y   |   Y   |       |   Y   |
> -----+-------+-------+-------+-------+-------+-------+-------+
>  wmb |   Y   |       |   Y   |   y   |   Y   |       |   Y   |
> -----+-------+-------+-------+-------+-------+-------+-------+
>  rmb |       |       |       |       |       |       |       |
> -----+-------+-------+-------+-------+-------+-------+-------+
>  rbd |       |       |       |       |       |       |       |
> -----+-------+-------+-------+-------+-------+-------+-------+
>  acq |       |       |       |       |       |       |       |
> -----+-------+-------+-------+-------+-------+-------+-------+
>  rel |   Y   |       |   Y   |   y   |   Y   |       |   Y   |
> -----+-------+-------+-------+-------+-------+-------+-------+
>  ctl |       |       |       |       |       |       |       |
> -----+-------+-------+-------+-------+-------+-------+-------+

OK, so "acq" can't pair with "acq", and I am not sure I understand.

First of all, it is not clear to me how you can even try to pair them
unless you do something like spin_unlock_wait(). I would like to see an
example which is not "obviously wrong".

At the same time, if you play with spin_unlock_wait() or spin_is_locked()
then acq can pair with acq?

Let's look at sem_lock(). I never looked at this code before, so I can
easily be wrong. Manfred will correct me. But at first glance we can
write the oversimplified pseudo-code:

	spinlock_t local, global;

	bool my_lock(bool try_local)
	{
		if (try_local) {
			/* fast path: take "local", back off if "global" is held */
			spin_lock(&local);
			if (!spin_is_locked(&global))
				return true;
			spin_unlock(&local);
		}

		/* slow path: take "global", wait for "local" holders to leave */
		spin_lock(&global);
		spin_unlock_wait(&local);
		return false;
	}

	void my_unlock(bool drop_local)
	{
		if (drop_local)
			spin_unlock(&local);
		else
			spin_unlock(&global);
	}

It assumes that the "local" lock is cheaper than "global"; the usage is

	bool xxx = my_lock(condition);
	/* CRITICAL SECTION */
	my_unlock(xxx);

Now. Unless I missed something, my_lock() does NOT need a barrier BEFORE
spin_unlock_wait() (or spin_is_locked()).
Either my_lock(true) should see spin_is_locked(global) == T, or
my_lock(false)->spin_unlock_wait() should see that "local" is locked
and wait.

Doesn't this mean that acq can pair with acq, or am I totally confused?

Another question is whether we need a barrier AFTER spin_unlock_wait().
I do not know what ipc/sem.c actually needs, but in general (I think)
this does need mb(). Otherwise my_lock / my_unlock itself does not have
the proper acq/rel semantics. For example, my_lock(false) can miss the
changes which were done under my_lock(true).

So I think that (in theory) sem_wait_array() needs smp_mb() at the end.
But, given that we have the control dependency, perhaps smp_rmb() is
enough?

Oleg.
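
The "either my_lock(true) sees global locked, or my_lock(false) sees
local locked" argument has the shape of a store-buffering test: each
side stores to its own lock word, then loads the other one. A minimal
userspace sketch of that shape follows. It is not kernel code and not
the ipc/sem.c implementation. The names local_locked, global_locked,
fast_path(), slow_path() and count_double_miss() are hypothetical, and
plain C11 sequentially consistent atomics stand in for the lock
operations, which is an assumption about ordering that the kernel's
spin_lock() may not satisfy.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the "locked" state of the two spinlocks. */
static atomic_int local_locked, global_locked;

/* Results of each side's load; read by the parent only after join. */
static int saw_global, saw_local;

/* Models my_lock(true): take "local", then check "global". */
static void *fast_path(void *unused)
{
	(void)unused;
	atomic_store(&local_locked, 1);            /* ~ spin_lock(&local) */
	saw_global = atomic_load(&global_locked);  /* ~ spin_is_locked(&global) */
	return NULL;
}

/* Models my_lock(false): take "global", then look at "local". */
static void *slow_path(void *unused)
{
	(void)unused;
	atomic_store(&global_locked, 1);           /* ~ spin_lock(&global) */
	saw_local = atomic_load(&local_locked);    /* ~ spin_unlock_wait(&local) */
	return NULL;
}

/*
 * Runs the two paths concurrently "trials" times and returns how often
 * both sides failed to observe the other's store (-1 on thread error).
 */
int count_double_miss(int trials)
{
	int misses = 0;

	for (int i = 0; i < trials; i++) {
		pthread_t a, b;

		atomic_store(&local_locked, 0);
		atomic_store(&global_locked, 0);

		if (pthread_create(&a, NULL, fast_path, NULL))
			return -1;
		if (pthread_create(&b, NULL, slow_path, NULL)) {
			pthread_join(a, NULL);
			return -1;
		}
		pthread_join(a, NULL);
		pthread_join(b, NULL);

		if (!saw_global && !saw_local)
			misses++;
	}
	return misses;
}
```

With the default (seq_cst) atomics, count_double_miss() should return 0,
which is the either/or guarantee the argument relies on. If the stores
are weakened to memory_order_release and the loads to
memory_order_acquire, the C11 model permits both sides to miss, so the
guarantee depends on how strongly the lock acquisition itself is
ordered against the following load.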