Date: Wed, 13 Jul 2016 14:03:15 -0700
From: "Paul E. McKenney"
To: Peter Zijlstra
Cc: Tejun Heo, John Stultz, Ingo Molnar, lkml, Dmitry Shmidt, Rom Lemarchand, Colin Cross, Todd Kjos, Oleg Nesterov
Subject: Re: Severe performance regression w/ 4.4+ on Android due to cgroup locking changes
Reply-To: paulmck@linux.vnet.ibm.com
Message-Id: <20160713210315.GO7094@linux.vnet.ibm.com>
In-Reply-To: <20160713205102.GZ30909@twins.programming.kicks-ass.net>
References: <20160713182102.GJ4065@mtj.duckdns.org> <20160713183347.GK4065@mtj.duckdns.org> <20160713201823.GB29670@mtj.duckdns.org> <20160713202657.GW30154@twins.programming.kicks-ass.net> <20160713203944.GC29670@mtj.duckdns.org> <20160713205102.GZ30909@twins.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 13, 2016 at 10:51:02PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 13, 2016 at 04:39:44PM -0400, Tejun Heo wrote:
>
> > > There is a synchronize_sched() in there, so sorta. That thing is heavily
> > > geared towards readers, as is the only 'sane' choice for global locks.
> >
> > It used to use the expedited variant until 001dac627ff3
> > ("locking/percpu-rwsem: Make use of the rcu_sync infrastructure"), so
> > it might have been okay before then.
>
> Right, but expedited stuff sprays IPIs around the entire system. That's
> stuff other people complain about.

Does anyone other than the non-NO_HZ_FULL low-latency guys and the -rt
guys care?

> > The options that I can see are
> >
> > 1. Somehow make percpu_rwsem's write behavior more responsive in a way
> >    which is acceptable for all use cases. This would be great but
> >    probably impossible.
> >
> > 2. Add a fast-writer option to percpu_rwsem so that users which care
> >    about write latency can opt in for higher processing overhead for
> >    lower latency.
>
> So, IIRC, the trade-off is a full memory barrier in read_lock and
> read_unlock() vs sync_sched() in write.
>
> Full memory barriers are expensive and while the combined cost might
> well exceed the cost of the sync_sched() it doesn't suffer the latency
> issues.
>
> Not sure if we can frob the two in a single codebase, but I can have a
> poke if Oleg or Paul doesn't beat me to it.

Take the patch that I just sent out and make the choice of normal
vs. expedited depend on CONFIG_PREEMPT_RT or whatever the -rt guys are
calling it these days. Is there a low-latency Kconfig option other
than CONFIG_NO_HZ_FULL?

The memory-barrier approach can definitely be made to work, but is going
to be more complex due to the need to wait for readers.

							Thanx, Paul
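
For illustration only, the "normal vs. expedited depending on a Kconfig option" idea might look roughly like the userspace sketch below. This is not the patch referred to above. LOW_LATENCY_BUILD, pick_gp_mode(), writer_wait_for_readers() and the stub_*() functions are hypothetical names invented for this sketch; the stubs only record which variant was chosen and stand in for synchronize_sched() and synchronize_sched_expedited(). The direction of the choice (latency-sensitive builds take the normal grace period to avoid IPIs, everything else takes the expedited one) is an assumption read out of the discussion.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a Kconfig symbol such as CONFIG_PREEMPT_RT;
 * a real kernel build would consult the config system, not this macro. */
#ifndef LOW_LATENCY_BUILD
#define LOW_LATENCY_BUILD 0
#endif

enum gp_mode { GP_NONE, GP_NORMAL, GP_EXPEDITED };

/* Records which stub ran last, so the choice can be observed. */
static enum gp_mode last_gp_wait = GP_NONE;

/* Userspace stubs: they stand in for synchronize_sched() and
 * synchronize_sched_expedited() and do not wait for anything. */
static void stub_synchronize_sched(void)
{
	last_gp_wait = GP_NORMAL;
}

static void stub_synchronize_sched_expedited(void)
{
	last_gp_wait = GP_EXPEDITED;
}

/* Assumed policy: latency-sensitive builds avoid the IPIs of the
 * expedited variant; all other builds get the faster writer. */
static enum gp_mode pick_gp_mode(bool low_latency_build)
{
	return low_latency_build ? GP_NORMAL : GP_EXPEDITED;
}

/* Writer-side wait for pre-existing readers. */
static void writer_wait_for_readers(void)
{
	if (pick_gp_mode(LOW_LATENCY_BUILD) == GP_NORMAL)
		stub_synchronize_sched();
	else
		stub_synchronize_sched_expedited();
}
```

Building with -DLOW_LATENCY_BUILD=1 flips writer_wait_for_readers() to the normal variant. In a kernel tree the same test would more likely be written with IS_ENABLED() on whichever config symbol ends up being the right one.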