Date: Tue, 14 Jun 2016 14:02:28 -0400
From: Steven Rostedt
To: Peter Zijlstra
Cc: LKML, Ingo Molnar, Thomas Gleixner, Clark Williams, Andrew Morton,
 Nick Piggin
Subject: Re: [PATCH] sched: Do not release current rq lock on non contended double_lock_balance()
Message-ID: <20160614140228.0ecf15af@grimm.local.home>
In-Reply-To: <20160614115820.GD30921@twins.programming.kicks-ass.net>
References: <20160613123732.3a8ccc57@gandalf.local.home>
	<20160614115820.GD30921@twins.programming.kicks-ass.net>

On Tue, 14 Jun 2016 13:58:20 +0200
Peter Zijlstra wrote:

> The above puts a strict limit on hold time and is fair because of the
> queueing.
> 
> > +++ b/kernel/sched/sched.h
> > @@ -1548,10 +1548,15 @@ static inline int _double_lock_balance(struct rq *this_rq, struct rq *busiest)
> >  	__acquires(busiest->lock)
> >  	__acquires(this_rq->lock)
> >  {
> > +	int ret = 0;
> > +
> > +	if (unlikely(!raw_spin_trylock(&busiest->lock))) {
> > +		raw_spin_unlock(&this_rq->lock);
> > +		double_rq_lock(this_rq, busiest);
> > +		ret = 1;
> > +	}
> > 
> > +	return ret;
> >  }
> 
> This relies on trylock not being allowed to steal the lock, which I think
> is true for all fair spinlocks (for ticket this must be true, but it is
> possible with qspinlock for example).
> 
> And it does indeed make the hold time harder to analyze.
> 
> For instance; pull_rt_task() does:
> 
> 	for_each_cpu() {
> 		double_lock_balance(this, that);
> 		...
> 		double_unlock_balance(this, that);
> 	}
> 
> Which, with the trylock, ends up with a max possible hold time of
> O(nr_cpus).

Sure, but I think we should try to limit that loop too, because the loop
itself is what is triggering the large latency for me: it constantly
releases a spinlock and then has to wait to re-take it. And this loop is
done with preemption disabled.

> Unlikely, sure, but RT is a game of upper bounds etc.

Sure, but should we force the worst case all the time? We do a lot of
optimization to allow for good throughput as well.
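For reference, the loop shape being discussed looks roughly like the
sketch below. This is only a simplified approximation of the
pull_rt_task() pattern, not the actual kernel code (the real function
walks the root domain's rto_mask and pulls pushable RT tasks); the point
is just where double_lock_balance() sits inside the loop:

/*
 * Simplified sketch (approximation only) of the pull_rt_task()-style
 * loop under discussion.
 */
static void pull_loop_sketch(struct rq *this_rq)
{
	int cpu;

	for_each_online_cpu(cpu) {
		struct rq *src_rq = cpu_rq(cpu);

		if (src_rq == this_rq)
			continue;

		/*
		 * Pre-patch: this always drops this_rq->lock and re-takes
		 * both locks in order, so every iteration can block.
		 * With the trylock patch: this_rq->lock is only dropped
		 * when busiest->lock is contended, so in the worst case it
		 * is held across all nr_cpus iterations.
		 */
		double_lock_balance(this_rq, src_rq);

		/* ... look at src_rq and possibly pull a task ... */

		double_unlock_balance(this_rq, src_rq);
	}
}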
> So should we maybe do something like:
> 
> 	if (unlikely(raw_spin_is_contended(&this_rq->lock) ||
> 		     !raw_spin_trylock(&busiest->lock))) {

Why do we care if this_rq is contended? That's exactly what causes the
large latency. When we let go of this_rq, this fast path becomes much
slower, because now it must wait for whatever was waiting on the lock to
finish. The more CPUs you have, the bigger this issue becomes.

If there's a loop of O(nr_cpus) iterations (which is still technically
O(1)), then another CPU may need to wait for that loop to finish, but the
loop itself is kept tighter. If we always release the lock, we allow the
other CPUs to continue at the expense of preventing the one CPU doing the
loop from continuing: the constant factor of that O(nr_cpus) becomes much
larger, and we consolidate all the latency onto a single CPU, which may be
the one running the highest-priority task in the system.

I'm seeing latencies of hundreds of microseconds because of this. With
this patch, that latency disappears.

-- Steve

> 		raw_spin_unlock(&this_rq->lock);
> 		double_rq_lock(this_rq, busiest);
> 		ret = 1;
> 	}
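Assembled, the variant Peter is suggesting (quoted in pieces above) would
look roughly like this. It is a sketch only, combining the patch hunk
quoted earlier with Peter's suggested test; untested, and shown just to
make explicit which check the disagreement is about:

static inline int _double_lock_balance(struct rq *this_rq, struct rq *busiest)
	__acquires(busiest->lock)
	__acquires(this_rq->lock)
{
	int ret = 0;

	/*
	 * Extra raw_spin_is_contended() test suggested by Peter: also take
	 * the slow path when this_rq->lock itself has waiters, so they are
	 * not made to wait behind the busiest->lock acquisition.  Steve's
	 * patch has only the trylock test.
	 */
	if (unlikely(raw_spin_is_contended(&this_rq->lock) ||
		     !raw_spin_trylock(&busiest->lock))) {
		raw_spin_unlock(&this_rq->lock);
		double_rq_lock(this_rq, busiest);
		ret = 1;
	}

	return ret;
}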