From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751656AbcGNQpw (ORCPT ); Thu, 14 Jul 2016 12:45:52 -0400 Received: from merlin.infradead.org ([205.233.59.134]:46337 "EHLO merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751276AbcGNQpu (ORCPT ); Thu, 14 Jul 2016 12:45:50 -0400 Date: Thu, 14 Jul 2016 18:45:47 +0200 From: Peter Zijlstra To: "Paul E. McKenney" Cc: Tejun Heo , John Stultz , Ingo Molnar , lkml , Dmitry Shmidt , Rom Lemarchand , Colin Cross , Todd Kjos , Oleg Nesterov Subject: Re: Severe performance regression w/ 4.4+ on Android due to cgroup locking changes Message-ID: <20160714164547.GG30154@twins.programming.kicks-ass.net> References: <20160713182102.GJ4065@mtj.duckdns.org> <20160713183347.GK4065@mtj.duckdns.org> <20160713201823.GB29670@mtj.duckdns.org> <20160713202657.GW30154@twins.programming.kicks-ass.net> <20160713203944.GC29670@mtj.duckdns.org> <20160713205102.GZ30909@twins.programming.kicks-ass.net> <20160714131809.GO30927@twins.programming.kicks-ass.net> <20160714162355.GW7094@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160714162355.GW7094@linux.vnet.ibm.com> User-Agent: Mutt/1.5.23.1 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jul 14, 2016 at 09:23:55AM -0700, Paul E. McKenney wrote: > Hmmm... How does this handle the following sequence of events for > the case where we are not biased towards the reader? > > o The per-CPU rwsem is set up with RCU_NONE and readers_slow > (as opposed to readers_block). The rcu_sync ->gp_state is > GP_PENDING, which means that rcu_sync_is_idle() will always > return true. /false/, rcu_sync_is_idle() will always be false, to force us into the slowpath. > o Task A on CPU 0 runs percpu_down_read() to completion, and remains > within the critical section. CPU 0's ->refcount is therefore 1. > > o Task B on CPU 1 does percpu_down_write(), which write-acquires > ->rw_sem, does rcu_sync_enter() (which is a no-op due to > RCU_NONE), sets ->state to readers_block, and is just now going > to wait for readers, which first invokes readers_active_check(), > which starts summing the ->refcount for CPUs 0, 1, and 2, > obtaining the value 1 so far. > > o Task C CPU 2 enters percpu_down_read(), disables preemption, > increments CPU 2's ->refcount, sees rcu_sync_is_idle() return > true (so skips __percpu_down_read()), enables preemption, and > enters its critical section. false, so does __percpu_down_read() > > o Task C migrates to CPU 3 and invokes percpu_up_read(), which > disables preemption, sees rcu_sync_is_idle() return true, calls > __this_cpu_dec() on CPU 3's ->refcount, and enables preemption. > The value of CPU 3's ->refcount is thus (unsigned int)-1. __percpu_up_read() > > o Task B on CPU 1 continues execution in readers_active_check(), with > the full sum being zero. > > So it looks to me like we have Task A as a writer at the same time that > Task A is a reader, which would not be so good. > > So what am I missing here? for RCU_NONE we init rsp->gp_state to !0, which makes: static inline rcu_sync_is_idle()'s return !rsp->gp_state (aka. rsp->gp_state == 0) return false. > And a couple of checkpatch nits below. Yes, I had to apply the patch to > figure out what it was doing. ;-) Yah, too much churn to read :-) In any case, rest assured you've already gone over this part of the patch several times. I repurposed an old percpu-rwsem optimization, Oleg recognised it.