From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932970AbcFISPG (ORCPT <rfc822;w@1wt.eu>);
	Thu, 9 Jun 2016 14:15:06 -0400
Received: from mga03.intel.com ([134.134.136.65]:54397 "EHLO mga03.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S932790AbcFISPE (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Thu, 9 Jun 2016 14:15:04 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.26,446,1459839600"; 
   d="scan'208";a="825017212"
Subject: Re: performance delta after VFS i_mutex=>i_rwsem conversion
To: Ingo Molnar <mingo@kernel.org>, Waiman Long <waiman.long@hpe.com>
References: <5755D671.9070908@intel.com>
 <CA+55aFxH_7wjo_BgUPK5iomWedE2=DaUZVX-yruHOWEk7OTiHQ@mail.gmail.com>
 <5755E782.90800@hpe.com> <20160608085837.GA10792@gmail.com>
 <20160609102552.GA16968@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
        "Chen, Tim C" <tim.c.chen@intel.com>, Ingo Molnar <mingo@redhat.com>,
        Davidlohr Bueso <dbueso@suse.de>,
        "Peter Zijlstra (Intel)" <peterz@infradead.org>,
        Jason Low <jason.low2@hp.com>, Michel Lespinasse <walken@google.com>,
        "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
        Waiman Long <waiman.long@hp.com>, Al Viro <viro@zeniv.linux.org.uk>,
        LKML <linux-kernel@vger.kernel.org>
From: Dave Hansen <dave.hansen@intel.com>
Message-ID: <5759B21E.2030003@intel.com>
Date: Thu, 9 Jun 2016 11:14:54 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.6.0
MIME-Version: 1.0
In-Reply-To: <20160609102552.GA16968@gmail.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 06/09/2016 03:25 AM, Ingo Molnar wrote:
>>> That should eliminate the performance gap between mutex and rwsem wrt
>>> spinning when only writers are present. I am hoping that that patchset can
>>> be queued for 4.8.
>>
>> Yeah, so I actually had this series merged for testing last week, but a 
>> complication with a prereq patch made me unmerge it. But I have no fundamental 
>> objections, at all.
...
> Ok, these enhancements are now in the locking tree and are queued up for v4.8:
> 
>    git pull git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git locking/core
> 
> Dave, you might want to check your numbers with these changes: is rwsem 
> performance still significantly worse than mutex performance?

It's substantially closer than it was, but there's probably a little
work still to do.  The rwsem still looks to be sleeping a lot more than
the mutex.  Here's where we started:

	https://www.sr71.net/~dave/intel/rwsem-vs-mutex.png

The rwsem peaked lower and earlier than the mutex code.  Now, if we
compare the old (4.7-rc1) rwsem code to the newly-patched rwsem code
(from tip/locking):

> https://www.sr71.net/~dave/intel/bb.html?1=4.7.0-rc1&2=4.7.0-rc1-00127-gd4c3be7

We can see the peak is a bit higher and, more importantly, it's more of
a plateau than a sharp peak.  We can also compare the new rwsem code to
the 4.5 code that had the mutex in place:

> https://www.sr71.net/~dave/intel/bb.html?1=4.5.0-rc6&2=4.7.0-rc1-00127-gd4c3be7

The rwsems are still a _bit_ below the mutex code at the peak, and they
also seem to be substantially lower during the tail from 20 CPUs on up.
The rwsems are sleeping less than they were before the tip/locking
updates, but when all the CPUs are contending on the lock, the rwsems
still leave the CPUs idle 90% of the time, while the mutexes leave them
idle 15-20% of the time.
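
For anyone who wants to check idle figures like these on their own
machine, here is a rough sketch.  It assumes the idle percentage is
derived from two samples of the aggregate "cpu" line in /proc/stat; the
message above does not say how its numbers were collected, so treat that
as an assumption.  The function names (parse_cpu_line, idle_percent,
read_cpu_line) are made up for illustration.

```python
def parse_cpu_line(line):
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    ticks = [int(v) for v in fields[1:]]
    if len(ticks) < 4:
        raise ValueError("'cpu' line has too few columns")
    # Column order: user nice system idle iowait irq softirq steal
    # [guest guest_nice].  Guest time is already included in user/nice,
    # so only the first eight columns are summed.
    # Assumption: iowait is counted as idle time here.
    idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
    total = sum(ticks[:8])
    return idle, total


def idle_percent(before, after):
    """Percentage of CPU time spent idle between two 'cpu' line samples."""
    idle0, total0 = parse_cpu_line(before)
    idle1, total1 = parse_cpu_line(after)
    elapsed = total1 - total0
    if elapsed <= 0:
        raise ValueError("samples must be taken at different times")
    return 100.0 * (idle1 - idle0) / elapsed


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (Linux only)."""
    with open(path) as f:
        return f.readline()
```

To use it, call read_cpu_line() once before the benchmark run and once
after, then pass both strings to idle_percent().  This averages over all
CPUs, so it will not show whether particular CPUs are the ones going
idle.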