From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752797AbbGBKyQ (ORCPT ); Thu, 2 Jul 2015 06:54:16 -0400
Received: from casper.infradead.org ([85.118.1.10]:45326 "EHLO
	casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751293AbbGBKyI (ORCPT ); Thu, 2 Jul 2015 06:54:08 -0400
Date: Thu, 2 Jul 2015 12:53:59 +0200
From: Peter Zijlstra
To: Yuyang Du
Cc: Rabin Vincent, Mike Galbraith, "mingo@redhat.com",
	"linux-kernel@vger.kernel.org", Paul Turner, Ben Segall,
	Morten Rasmussen
Subject: Re: [PATCH?] Livelock in pick_next_task_fair() / idle_balance()
Message-ID: <20150702105359.GY19282@twins.programming.kicks-ass.net>
References: <20150630143057.GA31689@axis.com>
	<1435728995.9397.7.camel@gmail.com>
	<20150701145551.GA15690@axis.com>
	<20150701204404.GH25159@twins.programming.kicks-ass.net>
	<20150701232511.GA5197@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150701232511.GA5197@intel.com>
User-Agent: Mutt/1.5.21 (2012-12-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 02, 2015 at 07:25:11AM +0800, Yuyang Du wrote:

> And obviously, the idle balancing livelock SHOULD happen: one CPU pulls
> tasks from the other, makes the other idle, and this iterates...
>
> That being said, it is also obvious to prevent the livelock from happening:
> idle pulling until the source rq's nr_running is 1, because otherwise we
> just avoid idleness by making another idleness.

Well, ideally the imbalance calculation would be so that it would avoid
this from happening in the first place. It's a 'balance' operation, not a
'steal everything'.

We want to take work -- as we have none -- but we want to ensure that
afterwards we have equal work, i.e. we're balanced.

So clearly that all is hosed.
Now Morten was looking into simplifying calculate_imbalance() recently.

> On Wed, Jul 01, 2015 at 10:44:04PM +0200, Peter Zijlstra wrote:
> > On Wed, Jul 01, 2015 at 04:55:51PM +0200, Rabin Vincent wrote:
> > > PID: 413 TASK: 8edda408 CPU: 1 COMMAND: "rngd"
> > > task_h_load(): 0 [ = (load_avg_contrib { 0} * cfs_rq->h_load { 0}) / (cfs_rq->runnable_load_avg { 0} + 1) ]
> > > SE: 8edda450 load_avg_contrib: 0 load.weight: 1024 PARENT: 8fffbd00 GROUPNAME: (null)
> > > SE: 8fffbd00 load_avg_contrib: 0 load.weight: 2 PARENT: 8f531f80 GROUPNAME: rngd@hwrng.service
> > > SE: 8f531f80 load_avg_contrib: 0 load.weight: 1024 PARENT: 8f456e00 GROUPNAME: system-rngd.slice
> > > SE: 8f456e00 load_avg_contrib: 118 load.weight: 911 PARENT: 00000000 GROUPNAME: system.slice
> >
> > Firstly, a group (parent) load_avg_contrib should never be less than
> > that of its constituent parts, therefore the top 3 SEs should have at
> > least 118 too.
>
> I think the downward is parent,

Ugh, I cannot read. Let me blame it on the heat.
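
To make the arithmetic in the quoted dump concrete, here is a minimal
sketch in standalone C. It is not the kernel code: task_h_load_sketch()
only mirrors the formula printed in the dump, and balanced_pull_count() is
a hypothetical helper that counts tasks rather than load, assuming "equal
work" can be approximated by equal nr_running.

```c
/*
 * Hypothetical standalone model of the task_h_load() arithmetic printed
 * in the dump; the parameter names follow the fields shown there.
 */
static unsigned long task_h_load_sketch(unsigned long load_avg_contrib,
					unsigned long h_load,
					unsigned long runnable_load_avg)
{
	/* The +1 keeps the divisor non-zero when the cfs_rq carries no load. */
	return (load_avg_contrib * h_load) / (runnable_load_avg + 1);
}

/*
 * Hypothetical helper (assumption, not an existing kernel function):
 * number of tasks an idle CPU could pull so that both runqueues end up
 * with roughly the same number of tasks, instead of taking everything.
 */
static unsigned int balanced_pull_count(unsigned int src_nr_running,
					unsigned int dst_nr_running)
{
	/* Nothing to take if the source is not busier than we are. */
	if (src_nr_running <= dst_nr_running)
		return 0;

	/* Move half of the difference; integer division rounds down. */
	return (src_nr_running - dst_nr_running) / 2;
}
```

With the values from the dump (all three inputs 0) task_h_load_sketch()
returns 0. balanced_pull_count(1, 0) is also 0, so a source runqueue with
a single task is left alone, and balanced_pull_count(4, 0) is 2, which
leaves both sides with two tasks.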