From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752797AbbGBKyQ (ORCPT <rfc822;w@1wt.eu>);
	Thu, 2 Jul 2015 06:54:16 -0400
Received: from casper.infradead.org ([85.118.1.10]:45326 "EHLO
	casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751293AbbGBKyI (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Thu, 2 Jul 2015 06:54:08 -0400
Date: Thu, 2 Jul 2015 12:53:59 +0200
From: Peter Zijlstra <peterz@infradead.org>
To: Yuyang Du <yuyang.du@intel.com>
Cc: Rabin Vincent <rabin.vincent@axis.com>,
        Mike Galbraith <umgwanakikbuti@gmail.com>,
        "mingo@redhat.com" <mingo@redhat.com>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        Paul Turner <pjt@google.com>, Ben Segall <bsegall@google.com>,
        Morten Rasmussen <morten.rasmussen@arm.com>
Subject: Re: [PATCH?] Livelock in pick_next_task_fair() / idle_balance()
Message-ID: <20150702105359.GY19282@twins.programming.kicks-ass.net>
References: <20150630143057.GA31689@axis.com>
 <1435728995.9397.7.camel@gmail.com>
 <20150701145551.GA15690@axis.com>
 <20150701204404.GH25159@twins.programming.kicks-ass.net>
 <20150701232511.GA5197@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150701232511.GA5197@intel.com>
User-Agent: Mutt/1.5.21 (2012-12-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 02, 2015 at 07:25:11AM +0800, Yuyang Du wrote:
> And obviously, the idle balancing livelock SHOULD happen: one CPU pulls
> tasks from the other, makes the other idle, and this iterates...
> 
> That being said, it is also obvious how to prevent the livelock from happening:
> idle pulling until the source rq's nr_running is 1, because otherwise we
> just avoid idleness by making another idleness.

Well, ideally the imbalance calculation would prevent this from
happening in the first place. It's a 'balance' operation, not a
'steal everything'.

We want to take work -- as we have none -- but we want to ensure that
afterwards we have equal work, i.e. we're balanced.
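
As a rough illustration of 'balance, not steal everything', here is a
toy sketch in plain C. It is not the kernel's calculate_imbalance();
struct toy_rq, toy_nr_to_pull() and toy_idle_pull() are hypothetical
names made up for this example, and it counts tasks only, ignoring load
weights and group hierarchy. It assumes the simplest possible rule: pull
half of the difference, and never pull from a source that has only one
runnable task.

```c
#include <assert.h>

/* Hypothetical toy run queue: tracks only the number of runnable tasks. */
struct toy_rq {
	unsigned int nr_running;
};

/*
 * How many tasks should dst pull from src so that both end up with
 * roughly equal work? Returns 0 when src has at most one task or is
 * not busier than dst, so a pull can never make src idle.
 */
static unsigned int toy_nr_to_pull(const struct toy_rq *dst,
				   const struct toy_rq *src)
{
	if (src->nr_running <= 1 || src->nr_running <= dst->nr_running)
		return 0;

	/* Half of the difference, rounded down. */
	return (src->nr_running - dst->nr_running) / 2;
}

/* Move the computed number of tasks from src to dst; returns the count. */
static unsigned int toy_idle_pull(struct toy_rq *dst, struct toy_rq *src)
{
	unsigned int n = toy_nr_to_pull(dst, src);

	src->nr_running -= n;
	dst->nr_running += n;

	/* A pull must never leave the source with nothing to run. */
	assert(n == 0 || src->nr_running >= 1);
	return n;
}
```

With src at 4 tasks and dst at 0, one call moves 2 tasks and a second
call moves nothing, so two CPUs cannot keep stealing from each other
under this rule. The real code works on weighted load rather than task
counts, which is where this simple picture stops applying.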

So clearly that all is hosed. Now Morten was looking into simplifying
calculate_imbalance() recently.

> On Wed, Jul 01, 2015 at 10:44:04PM +0200, Peter Zijlstra wrote:
> > On Wed, Jul 01, 2015 at 04:55:51PM +0200, Rabin Vincent wrote:
> > >  PID: 413    TASK: 8edda408  CPU: 1   COMMAND: "rngd"
> > >   task_h_load():     0 [ = (load_avg_contrib {    0} * cfs_rq->h_load {    0}) / (cfs_rq->runnable_load_avg {    0} + 1) ]
> > >   SE: 8edda450 load_avg_contrib:     0 load.weight:  1024 PARENT: 8fffbd00 GROUPNAME: (null)
> > >   SE: 8fffbd00 load_avg_contrib:     0 load.weight:     2 PARENT: 8f531f80 GROUPNAME: rngd@hwrng.service
> > >   SE: 8f531f80 load_avg_contrib:     0 load.weight:  1024 PARENT: 8f456e00 GROUPNAME: system-rngd.slice
> > >   SE: 8f456e00 load_avg_contrib:   118 load.weight:   911 PARENT: 00000000 GROUPNAME: system.slice
> > 
> > Firstly, a group (parent) load_avg_contrib should never be less than
> > that of its constituent parts, therefore the top 3 SEs should have at
> > least 118 too.
> 
> I think the downward is parent,

Ugh, I cannot read. Let me blame it on the heat.