From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932715AbbDJRoK (ORCPT <rfc822;w@1wt.eu>);
	Fri, 10 Apr 2015 13:44:10 -0400
Received: from mail-wg0-f44.google.com ([74.125.82.44]:33362 "EHLO
	mail-wg0-f44.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1756285AbbDJRoF (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Fri, 10 Apr 2015 13:44:05 -0400
Date: Fri, 10 Apr 2015 19:44:00 +0200
From: Ingo Molnar <mingo@kernel.org>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
        Jason Low <jason.low2@hp.com>, Peter Zijlstra <peterz@infradead.org>,
        Davidlohr Bueso <dave@stgolabs.net>,
        Tim Chen <tim.c.chen@linux.intel.com>,
        Aswin Chandramouleeswaran <aswin@hp.com>,
        LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mutex: Speed up mutex_spin_on_owner() by not taking the
 RCU lock
Message-ID: <20150410174400.GA6563@gmail.com>
References: <20150409053725.GB13871@gmail.com>
 <1428561611.3506.78.camel@j-VirtualBox>
 <20150409075311.GA4645@gmail.com>
 <CA+55aFz6KKxGVxPAbsmw9GsKJfy85P2C0EmYBrGpn+aJDjZJWw@mail.gmail.com>
 <20150409175652.GI6464@linux.vnet.ibm.com>
 <CA+55aFzXMDjQQ7jTjsPdh1RikXfgV7OCd-+13cz06MOmDBA33w@mail.gmail.com>
 <CA+55aFwZWi6ecDmVsMBQJTrgrW3GD2DaRtpiOspe=5amR1=dNg@mail.gmail.com>
 <20150409183926.GM6464@linux.vnet.ibm.com>
 <20150410090051.GA28549@gmail.com>
 <20150410142024.GY6464@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150410142024.GY6464@linux.vnet.ibm.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org


* Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> > No RCU overhead, and this is the access to owner->on_cpu:
> > 
> >   69:	49 8b 81 10 c0 ff ff 	mov    -0x3ff0(%r9),%rax
> > 
> > Totally untested and all that, I only built the mutex.o.
> > 
> > What do you think? Am I missing anything?
> 
> I suspect it is good, but let's take a look at Linus' summary of the code:
> 
>         rcu_read_lock();
>         while (sem->owner == owner) {
>                 if (!owner->on_cpu || need_resched())
>                         break;
>                 cpu_relax_lowlatency();
>         }
>         rcu_read_unlock();

Note that I patched the mutex case as a prototype, because mutexes are 
more commonly used than rwsem-xadd. The rwsem case is similar, though.

> The cpu_relax_lowlatency() looks to have barrier() semantics, so the 
> sem->owner should get reloaded every time through the loop.  This is 
> needed, because otherwise the task structure could get freed and 
> reallocated as something else that happened to have the field at the 
> ->on_cpu offset always zero, resulting in an infinite loop.

So at least with the get_kernel(..., &owner->on_cpu) approach, the 
get_kernel() copy has barrier semantics as well (it's in assembly), so 
it will naturally be reloaded on every iteration.
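
To make the reload argument concrete, here is a minimal userspace sketch 
of such a spin loop. It is not the kernel code: struct task, struct lock, 
cpu_relax_sketch() and spin_on_owner_sketch() are hypothetical stand-ins, 
the barrier() definition assumes GCC/Clang inline asm, and the 
need_resched() check and all task lifetime handling are left out. It only 
shows that a compiler barrier in the loop body forces lock->owner and 
owner->on_cpu to be re-read from memory on each pass:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the fields used here. */
struct task {
	int on_cpu;		/* nonzero while the task runs on a CPU */
};

struct lock {
	struct task *owner;	/* current holder, NULL when unlocked */
};

/* Compiler-only barrier (GCC/Clang syntax assumed): the compiler may not
 * keep memory values cached in registers across this point. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Userspace analogue of cpu_relax_lowlatency(), assumed to imply barrier(). */
static inline void cpu_relax_sketch(void)
{
	barrier();
}

/*
 * Spin while 'owner' still holds 'lock' and is still running.
 * Returns true if the lock changed hands (retrying the acquire makes sense),
 * false if the owner went off-CPU (the caller should block instead).
 * These return semantics are chosen for the sketch only.
 */
static bool spin_on_owner_sketch(struct lock *lock, struct task *owner)
{
	while (lock->owner == owner) {
		if (!owner->on_cpu)
			return false;
		/* The barrier makes both loads above repeat next iteration. */
		cpu_relax_sketch();
	}
	return true;
}
```

Without the barrier, the compiler would be free to hoist both loads out 
of the loop, and the loop could spin forever on stale values.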

Thanks,

	Ingo