From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S933030AbcFOPdE (ORCPT );
	Wed, 15 Jun 2016 11:33:04 -0400
Received: from foss.arm.com ([217.140.101.70]:38354 "EHLO foss.arm.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S932432AbcFOPdB (ORCPT );
	Wed, 15 Jun 2016 11:33:01 -0400
Subject: Re: [rfc patch] sched/fair: Use instantaneous load for fork/exec balancing
To: Mike Galbraith , Peter Zijlstra
References: <1465891111.1694.13.camel@gmail.com> <5760115C.7040306@arm.com>
 <1465922407.3626.21.camel@gmail.com>
Cc: Yuyang Du , LKML
From: Dietmar Eggemann
Message-ID: <5761752A.6000606@arm.com>
Date: Wed, 15 Jun 2016 16:32:58 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.8.0
MIME-Version: 1.0
In-Reply-To: <1465922407.3626.21.camel@gmail.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

On 14/06/16 17:40, Mike Galbraith wrote:
> On Tue, 2016-06-14 at 15:14 +0100, Dietmar Eggemann wrote:
>
>> IMHO, the hackbench performance "boost" w/o 0905f04eb21f is due to the
>> fact that a new task gets all its load decayed (making it a small task)
>> in the __update_load_avg() call in remove_entity_load_avg(), because its
>> se->avg.last_update_time value is 0, which creates a huge time difference
>> compared to cfs_rq->avg.last_update_time. The patch 0905f04eb21f avoids
>> this, and thus the task stays big (se->avg.load_avg = 1024).
>
> I don't care much at all about the hackbench "regression" in its own
> right, or what causes it. For me, the bottom line is that there are
> cases we need to be able to resolve, and can't, simply because we're
> looking at a fuzzy (rippling) reflection.

Understood. I just thought it would be nice to know why 0905f04eb21f
makes this problem even more visible. But so far I haven't been able to
figure out why this difference in se->avg.load_avg [1024 versus 0] has
this effect on cfs_rq->runnable_load_avg, making it even less suitable
for find_idlest*. enqueue_entity_load_avg()'s
cfs_rq->runnable_load_* += sa->load_* looks suspicious, though. (Toy
sketches of both the decay and the resulting stacking are appended at
the bottom of this mail.)

> In general, the fuzz helps us to not be so spastic. I'm not sure that
> we really really need to care all that much, because I strongly suspect
> that it's only gonna make any difference at all in corner cases, but
> there are real-world cases that matter. I know for a fact that schbench
> (Facebook), which is at least based on a real-world load, fails early
> due to us stacking tasks because of that fuzzy view of reality. In that
> case, it's because the fuzz consists of a high-amplitude aging
> sawtooth.

... only for fork/exec? That would then be related to the initial value
of se->avg.load_avg. Otherwise we could go back to pre-b92486cbf2aa
("sched: Compute runnable load avg in cpu_load and
cpu_avg_load_per_task").

> find_idlest* sees a collection of pseudo-random numbers, effectively:
> the fates pick idlest via lottery, and get it wrong often enough that a
> big box _never_ reaches full utilization before we stack tasks, putting
> an end to the latency game. For generic loads, the smoothing works, but
> for some corners, it blows chunks. Fork/exec seemed like a spot where
> you really can't go wrong by looking at clear, unadulterated reality.
>
> -Mike
>
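
To make the decay argument above concrete, here is a minimal userspace
sketch of PELT's geometric decay. The constants match mainline (1024us
periods, half-life of 32 periods, i.e. y^32 = 0.5), but decay_load()
below is a floating-point stand-in for the kernel's fixed-point
lookup-table version, so treat it as an illustration only (build with
-lm):

#include <stdio.h>
#include <math.h>

/* Stand-in for the kernel's decay_load(): y^32 = 0.5, period = 1024us. */
static unsigned long decay_load(unsigned long load, unsigned long periods)
{
	return (unsigned long)(load * pow(2.0, -(double)periods / 32.0));
}

int main(void)
{
	unsigned long load = 1024;	/* freshly forked task, NICE_0 weight */

	/* se->avg.last_update_time == 0 makes the delta against
	 * cfs_rq->avg.last_update_time effectively "time since boot",
	 * so the period count is huge and the load decays to ~0. */
	printf("  1 period  (~1ms):  %lu\n", decay_load(load, 1));	/* 1002 */
	printf(" 32 periods (~32ms): %lu\n", decay_load(load, 32));	/*  512 */
	printf("977 periods (~1s):   %lu\n", decay_load(load, 977));	/*    0 */
	return 0;
}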
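
And a toy model of why that [1024 versus 0] difference feeds straight
into placement. The only kernel detail mirrored here is
enqueue_entity_load_avg()'s cfs_rq->runnable_load_avg += sa->load_avg;
pick_idlest(), fork_task() and the rest are made-up names, not
scheduler code, and the real signal is of course fuzzier than this:

#include <stdio.h>

#define NR_CPUS		4
#define NR_FORKS	8

static unsigned long runnable_load_avg[NR_CPUS];

/* Stand-in for find_idlest_cpu(): smallest runnable_load_avg wins. */
static int pick_idlest(void)
{
	int cpu, idlest = 0;

	for (cpu = 1; cpu < NR_CPUS; cpu++)
		if (runnable_load_avg[cpu] < runnable_load_avg[idlest])
			idlest = cpu;
	return idlest;
}

static void fork_task(unsigned long new_task_load)
{
	int cpu = pick_idlest();

	/* mirrors cfs_rq->runnable_load_avg += sa->load_avg at enqueue */
	runnable_load_avg[cpu] += new_task_load;
	printf("task -> CPU%d\n", cpu);
}

int main(void)
{
	int i;

	/* With the new task's load preserved (1024) the forks spread
	 * round-robin; with it decayed to 0 every fork sees the same
	 * "idlest" CPU and all the tasks stack on CPU0. */
	for (i = 0; i < NR_FORKS; i++)
		fork_task(0 /* flip to 1024 to see the spread */);
	return 0;
}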