From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751423AbcGMOmu (ORCPT );
	Wed, 13 Jul 2016 10:42:50 -0400
Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:2298 "EHLO
	mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1751164AbcGMOmm (ORCPT );
	Wed, 13 Jul 2016 10:42:42 -0400
X-IBM-Helo: d03dlp02.boulder.ibm.com
X-IBM-MailFrom: paulmck@linux.vnet.ibm.com
Date: Wed, 13 Jul 2016 07:42:43 -0700
From: "Paul E. McKenney"
To: Peter Zijlstra
Cc: John Stultz, Tejun Heo, Ingo Molnar, lkml, Dmitry Shmidt,
	Rom Lemarchand, Colin Cross, Todd Kjos, Oleg Nesterov
Subject: Re: Severe performance regression w/ 4.4+ on Android due to cgroup locking changes
Reply-To: paulmck@linux.vnet.ibm.com
References: <20160713082112.GR30154@twins.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160713082112.GR30154@twins.programming.kicks-ass.net>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-TM-AS-MML: disable
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 16071314-0020-0000-0000-0000095033E4
X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused
x-cbparentid: 16071314-0021-0000-0000-000053A656CD
Message-Id: <20160713144243.GF7094@linux.vnet.ibm.com>
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,,
	definitions=2016-07-13_06:,, signatures=0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0
	spamscore=0 suspectscore=0 malwarescore=0 phishscore=0 adultscore=0
	bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1
	engine=8.0.1-1604210000 definitions=main-1607130163
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 13, 2016 at 10:21:12AM +0200, Peter Zijlstra wrote:
> On Tue, Jul 12, 2016 at 05:00:04PM -0700, John Stultz wrote:
> > Hey Tejun,
> >
> > So Dmitry Shmidt recently noticed that with 4.4 based systems we're
> > seeing quite a bit of performance overhead from
> > __cgroup_procs_write().
> >
> > With 4.4 tree as it stands, we're seeing __cgroup_procs_write() quite
> > often take 10s of milliseconds to execute (with max times up in the
> > 80ms range).
> >
> > While with 4.1 it was quite often in the single usec range, and max
> > time values still in the sub-millisecond range.
> >
> > The majority of these performance regressions seem to come from the
> > locking changes in:
> >
> > 3014dde762f6 ("cgroup: simplify threadgroup locking")
> > and
> > 1ed1328792ff ("sched, cgroup: replace signal_struct->group_rwsem with
> > a global percpu_rwsem")
> >
> > Dmitry has found that by reverting these two changes (which don't
> > revert easily), we can get back down to the 10-100 usec range for
> > most calls, with max values occasionally spiking to ~18ms.
> >
> > Those two commits do talk about performance regressions, that were
> > supposedly alleviated by percpu_rwsem changes, but I'm not sure we are
> > seeing this.
>
> Do you have 'funny' RCU options that quickly force a grace period when
> you go idle or something?
>
> But yes, it does not surprise me to find this commit is causing
> problems.

Hmmm...  Looks like RCU is present both before and after.  But please
do send along your .config.

Speaking of .config, is CONFIG_PREEMPT=y?  If so, does the workload
feature preemption and migration?  If that is the case, you might be
seeing contention on the per-CPU cgroup_threadgroup_rwsem, given that
the second patch seems to be adding acquisitions.

							Thanx, Paul