Date: Wed, 13 Jul 2016 13:52:11 -0700
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra
Cc: Tejun Heo, John Stultz, Ingo Molnar, lkml, Dmitry Shmidt,
	Rom Lemarchand, Colin Cross, Todd Kjos, Oleg Nesterov
Subject: Re: Severe performance regression w/ 4.4+ on Android due to cgroup
	locking changes
Reply-To: paulmck@linux.vnet.ibm.com
References: <20160713182102.GJ4065@mtj.duckdns.org>
	<20160713183347.GK4065@mtj.duckdns.org>
	<20160713201823.GB29670@mtj.duckdns.org>
	<20160713202657.GW30154@twins.programming.kicks-ass.net>
In-Reply-To: <20160713202657.GW30154@twins.programming.kicks-ass.net>
Message-Id: <20160713205211.GN7094@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 13, 2016 at 10:26:57PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 13, 2016 at 04:18:23PM -0400, Tejun Heo wrote:
> > Hello, John.
> >
> > On Wed, Jul 13, 2016 at 01:13:11PM -0700, John Stultz wrote:
> > > On Wed, Jul 13, 2016 at 11:33 AM, Tejun Heo wrote:
> > > > On Wed, Jul 13, 2016 at 02:21:02PM -0400, Tejun Heo wrote:
> > > >> One interesting thing to try would be replacing it with a regular
> > > >> non-percpu rwsem and see how it behaves.  That should easily tell us
> > > >> whether this is from actual contention or artifacts from the
> > > >> percpu_rwsem implementation.
> > > >
> > > > So, something like the following.  Can you please see whether this
> > > > makes any difference?
> > >
> > > Yeah.  So this brings it down for me closer to what we're seeing with
> > > Dmitry's patch reverting the two problematic commits, usually
> > > 10-50us with one early spike at 18ms.
> >
> > So, it's a percpu rwsem issue then.  I haven't really followed the
> > percpu rwsem changes closely.  Oleg, are multi-millisecond delays
> > on down_write expected with the current implementation of
> > percpu_rwsem?
>
> There is a synchronize_sched() in there, so sorta.  That thing is heavily
> geared towards readers, as is the only 'sane' choice for global locks.

Then one diagnostic step to take would be to replace that
synchronize_sched() with synchronize_sched_expedited(), and see if that
gets rid of the delays.  Not a particularly real-time-friendly fix, but
certainly a good check on our various assumptions.
							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
index be922c9f3d37..211acddc7e21 100644
--- a/kernel/rcu/sync.c
+++ b/kernel/rcu/sync.c
@@ -38,19 +38,19 @@ static const struct {
 #endif
 } gp_ops[] = {
 	[RCU_SYNC] = {
-		.sync = synchronize_rcu,
+		.sync = synchronize_rcu_expedited,
 		.call = call_rcu,
 		.wait = rcu_barrier,
 		__INIT_HELD(rcu_read_lock_held)
 	},
 	[RCU_SCHED_SYNC] = {
-		.sync = synchronize_sched,
+		.sync = synchronize_sched_expedited,
 		.call = call_rcu_sched,
 		.wait = rcu_barrier_sched,
 		__INIT_HELD(rcu_read_lock_sched_held)
 	},
 	[RCU_BH_SYNC] = {
-		.sync = synchronize_rcu_bh,
+		.sync = synchronize_rcu_bh_expedited,
 		.call = call_rcu_bh,
 		.wait = rcu_barrier_bh,
 		__INIT_HELD(rcu_read_lock_bh_held)