From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752784AbcFNQkQ (ORCPT ); Tue, 14 Jun 2016 12:40:16 -0400
Received: from mail-wm0-f48.google.com ([74.125.82.48]:35415 "EHLO
        mail-wm0-f48.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751796AbcFNQkM (ORCPT );
        Tue, 14 Jun 2016 12:40:12 -0400
Message-ID: <1465922407.3626.21.camel@gmail.com>
Subject: Re: [rfc patch] sched/fair: Use instantaneous load for fork/exec balancing
From: Mike Galbraith
To: Dietmar Eggemann, Peter Zijlstra
Cc: Yuyang Du, LKML
Date: Tue, 14 Jun 2016 18:40:07 +0200
In-Reply-To: <5760115C.7040306@arm.com>
References: <1465891111.1694.13.camel@gmail.com> <5760115C.7040306@arm.com>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.16.5
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2016-06-14 at 15:14 +0100, Dietmar Eggemann wrote:

> IMHO, the hackbench performance "boost" w/o 0905f04eb21f is due to the
> fact that a new task gets all its load decayed (making it a small task)
> in the __update_load_avg() call in remove_entity_load_avg() because its
> se->avg.last_update_time value is 0 which creates a huge time difference
> comparing it to cfs_rq->avg.last_update_time. The patch 0905f04eb21f
> avoids this and thus the task stays big se->avg.load_avg = 1024.

I don't care much at all about the hackbench "regression" in its own
right, or about what causes it. For me, the bottom line is that there
are cases where we need to be able to resolve detail, and can't, simply
because we're looking at a fuzzy (rippling) reflection. In general, the
fuzz helps us to not be so spastic. I'm not sure that we really need to
care all that much, because I strongly suspect that it's only going to
make any difference at all in corner cases, but there are real world
cases that matter.

I know for a fact that schbench (facebook), which is at least based on a
real world load, fails early because we stack tasks due to that fuzzy
view of reality. In that case, it's because the fuzz consists of a high
amplitude aging sawtooth. find_idlest_*() sees what is effectively a
collection of pseudo-random numbers, so the fates pick idlest via
lottery, and get it wrong often enough that a big box _never_ reaches
full utilization before we stack tasks, putting an end to the latency
game.

For generic loads, the smoothing works, but for some corners, it blows
chunks. Fork/exec seemed like a spot where you really can't go wrong by
looking at clear unadulterated reality.

        -Mike
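
To make the sawtooth point concrete, here is a minimal userspace sketch,
not kernel code. It assumes a PELT-like geometric decay in which a
contribution halves every 32 periods, and it uses the runnable task count
as a stand-in for "instantaneous load". The names cpu_sample,
decay_load(), pick_by_avg() and pick_by_instant() are hypothetical and
exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed per-period decay factor, chosen so that DECAY_Y^32 == 0.5. */
#define DECAY_Y 0.97857206

/* Hypothetical per-CPU snapshot for this sketch; not a kernel structure. */
struct cpu_sample {
	unsigned int nr_running; /* tasks runnable right now */
	double load_avg;         /* smoothed load in [0, 1] */
};

/* Advance a smoothed load by `periods` steps of running or idle time. */
double decay_load(double avg, int running, int periods)
{
	for (int i = 0; i < periods; i++)
		avg = avg * DECAY_Y + (running ? 1.0 - DECAY_Y : 0.0);
	return avg;
}

/* Pick the CPU with the lowest smoothed load (first one wins ties). */
size_t pick_by_avg(const struct cpu_sample *cpus, size_t n)
{
	size_t best = 0;

	assert(n > 0);
	for (size_t i = 1; i < n; i++)
		if (cpus[i].load_avg < cpus[best].load_avg)
			best = i;
	return best;
}

/* Pick the CPU with the fewest runnable tasks (first one wins ties). */
size_t pick_by_instant(const struct cpu_sample *cpus, size_t n)
{
	size_t best = 0;

	assert(n > 0);
	for (size_t i = 1; i < n; i++)
		if (cpus[i].nr_running < cpus[best].nr_running)
			best = i;
	return best;
}
```

Under these assumptions, a CPU that ran for 64 periods and has just gone
idle still carries a smoothed load of about 0.73, while a CPU that sat
idle and has just picked up one task shows only about 0.02.
pick_by_avg() then chooses the busy CPU and stacks the new task on it,
whereas pick_by_instant() chooses the CPU that is actually idle.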