From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752544AbcFNTm1 (ORCPT <rfc822;w@1wt.eu>);
	Tue, 14 Jun 2016 15:42:27 -0400
Received: from bombadil.infradead.org ([198.137.202.9]:33415 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751364AbcFNTm0 (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Tue, 14 Jun 2016 15:42:26 -0400
Date: Tue, 14 Jun 2016 21:42:17 +0200
From: Peter Zijlstra <peterz@infradead.org>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: LKML <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@kernel.org>,
        Thomas Gleixner <tglx@linutronix.de>,
        Clark Williams <williams@redhat.com>,
        Andrew Morton <akpm@linux-foundation.org>,
        Nick Piggin <nickpiggin@yahoo.com.au>
Subject: Re: [PATCH] sched: Do not release current rq lock on non contended
 double_lock_balance()
Message-ID: <20160614194217.GK30921@twins.programming.kicks-ass.net>
References: <20160613123732.3a8ccc57@gandalf.local.home>
 <20160614115820.GD30921@twins.programming.kicks-ass.net>
 <20160614140228.0ecf15af@grimm.local.home>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160614140228.0ecf15af@grimm.local.home>
User-Agent: Mutt/1.5.23.1 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jun 14, 2016 at 02:02:28PM -0400, Steven Rostedt wrote:
> On Tue, 14 Jun 2016 13:58:20 +0200
> Peter Zijlstra <peterz@infradead.org> wrote:
> > And it does indeed make the hold time harder to analyze.
> > 
> > For instance; pull_rt_task() does:
> > 
> > 	for_each_cpu() {
> > 		double_lock_balance(this, that);
> > 		...
> > 		double_unlock_balance(this, that);
> > 	}
> > 
> > Which, with the trylock, ends up with a max possible hold time of
> > O(nr_cpus).
> 
> Sure, but I think we should try to limit that loop too, because that
> loop itself is what is triggering the large latency for me, because
> it constantly releases a spinlock and has to wait. This loop is done
> with preemption disabled.

Much worse, it's done with IRQs disabled. But that affects only the local
CPU. Holding the lock that long affects all other CPUs too.
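
To put a number on that, here is a toy model (plain user-space C, not
kernel code) of how long some other CPU could spin on this_rq->lock while
the pull loop runs. The function name, the "one tick per loop iteration"
unit, and the assumption that the trylock on the remote lock always
succeeds are all made up for illustration; it only mirrors the O(nr_cpus)
argument quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model: one "tick" per iteration of the pull loop.
 * A waiter starts spinning on this_rq->lock during iteration `arrival`.
 * Returns how many iterations it spins before this_rq->lock is dropped.
 *
 * Assumption: the trylock on the remote rq always succeeds, which is the
 * case where a trylock-only scheme never lets go of this_rq->lock.
 */
static int waiter_spin_ticks(int nr_cpus, int arrival, bool drop_if_contended)
{
	int waited = 0;

	assert(arrival >= 0 && arrival < nr_cpus);

	for (int cpu = arrival; cpu < nr_cpus; cpu++) {
		/*
		 * The waiter showed up during iteration `arrival`, so the
		 * first contention check that can see it is the next one.
		 */
		if (drop_if_contended && cpu > arrival)
			break;
		waited++;
	}
	return waited;
}
```

With 64 CPUs and a waiter arriving in the first iteration, the
trylock-only variant makes it wait all 64 iterations, while the variant
that checks for contention lets it in after one.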

> > Unlikely, sure, but RT is a game of upper bounds etc.
> 
> Sure, but should we force worst case all the time?

How is that relevant? Either you have a bounded operation or you don't.

> We do a lot of optimization to allow for good throughput as well.

Only while keeping the upper bounds. The moment you let go of that,
you've destroyed RT.

> > So should we maybe do something like:
> > 
> > 	if (unlikely(raw_spin_is_contended(&this_rq->lock) ||
> > 	             !raw_spin_trylock(&busiest->lock))) {
> 
> Why do we care if this_rq is contended? 

To bound hold time.
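
A rough user-space sketch of that condition, assuming a simple ticket lock
built on C11 atomics as a stand-in for raw_spinlock_t. Every name here
(struct ticket_lock, the tl_*() helpers, double_lock_balance_sketch(), the
cut-down struct rq) is hypothetical, and ordering the slow path by CPU
number is an assumption of the sketch, not a description of what the
kernel does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical ticket lock standing in for raw_spinlock_t. */
struct ticket_lock {
	atomic_uint next;	/* next ticket to hand out */
	atomic_uint owner;	/* ticket currently being served */
};

static void tl_lock(struct ticket_lock *l)
{
	unsigned int me = atomic_fetch_add(&l->next, 1);

	while (atomic_load(&l->owner) != me)
		;	/* spin until our ticket is served */
}

static bool tl_trylock(struct ticket_lock *l)
{
	unsigned int cur = atomic_load(&l->owner);

	/* Only take a ticket if it would be served immediately. */
	return atomic_compare_exchange_strong(&l->next, &cur, cur + 1);
}

static void tl_unlock(struct ticket_lock *l)
{
	atomic_fetch_add(&l->owner, 1);
}

/* Call with the lock held: true if anyone else has taken a ticket. */
static bool tl_is_contended(struct ticket_lock *l)
{
	return atomic_load(&l->next) - atomic_load(&l->owner) > 1;
}

/* Cut-down runqueue: just a lock and a CPU number. */
struct rq {
	struct ticket_lock lock;
	int cpu;
};

/*
 * Called with this_rq->lock held. Returns true if this_rq->lock was
 * dropped and retaken, in which case the caller must revalidate state.
 */
static bool double_lock_balance_sketch(struct rq *this_rq, struct rq *busiest)
{
	assert(this_rq != busiest);

	/* Fast path: nobody waits on this_rq and busiest is free. */
	if (!tl_is_contended(&this_rq->lock) && tl_trylock(&busiest->lock))
		return false;

	/* Slow path: let waiters in, then take both locks in CPU order. */
	tl_unlock(&this_rq->lock);
	if (busiest->cpu < this_rq->cpu) {
		tl_lock(&busiest->lock);
		tl_lock(&this_rq->lock);
	} else {
		tl_lock(&this_rq->lock);
		tl_lock(&busiest->lock);
	}
	return true;
}
```

The point of checking this_rq first is that a waiter on this_rq->lock
forces the slow path even when the trylock on busiest would succeed, so
the uninterrupted hold time stays bounded by one iteration.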

> That's exactly what causes
> large latency to happen. Because when we let go of this_rq, this fast
> path becomes much slower because now it must wait for whatever is
> waiting on it to finish. The more CPUs you have, the bigger this issue
> becomes.

Yes, icky issue.

And while the numbers look pretty, I'm not sure you haven't introduced
another, less likely, issue.