From mboxrd@z Thu Jan  1 00:00:00 1970
Subject: Re: [rfc patch] sched/fair: Use instantaneous load for fork/exec balancing
To: Mike Galbraith, Peter Zijlstra
References: <1465891111.1694.13.camel@gmail.com>
Cc: Yuyang Du, LKML
From: Dietmar Eggemann
Message-ID: <5760115C.7040306@arm.com>
Date: Tue, 14 Jun 2016 15:14:52 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.8.0
MIME-Version: 1.0
In-Reply-To: <1465891111.1694.13.camel@gmail.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Mailing-List: linux-kernel@vger.kernel.org

On 14/06/16 08:58, Mike Galbraith wrote:
> SUSE's regression testing noticed that...
>
> 0905f04eb21f sched/fair: Fix new task's load avg removed from source CPU in wake_up_new_task()
>
> ...introduced a hackbench regression, and indeed it does. I think this
> regression has more to do with randomness than anything else, but in
> general...
>
> While averaging calms down load balancing, helping to keep migrations
> down to a dull roar, it's not completely wonderful when it comes to
> things that live in the here and now, hackbench being one such.
>
> time sh -c 'for i in `seq 1000`; do hackbench -p -P > /dev/null; done'
>
> real    0m55.397s
> user    0m8.320s
> sys     5m40.789s
>
> echo LB_INSTANTANEOUS_LOAD > /sys/kernel/debug/sched_features
>
> real    0m48.049s
> user    0m6.510s
> sys     5m6.291s
>
> Signed-off-by: Mike Galbraith

I see similar values on ARM64 (Juno r0: 2x Cortex-A57, 4x Cortex-A53).
OK, 1000 invocations of hackbench take a little bit longer, but I guess
it's the forks we're after.

- echo NO_LB_INSTANTANEOUS_LOAD > /sys/kernel/debug/sched_features
  time sh -c 'for i in `seq 1000`; do hackbench -p -P > /dev/null; done'

root@juno:~# time sh -c 'for i in `seq 1000`; do hackbench -p -P > /dev/null; done'

real    10m17.155s
user    2m56.976s
sys     38m0.324s

- echo LB_INSTANTANEOUS_LOAD > /sys/kernel/debug/sched_features
  time sh -c 'for i in `seq 1000`; do hackbench -p -P > /dev/null; done'

real    9m49.832s
user    2m42.896s
sys     34m51.452s

- But I get a similar effect if I initialize se->avg.load_avg with 0:

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -680,7 +680,7 @@ void init_entity_runnable_average(struct sched_entity *se)
 	 * will definitely be update (after enqueue).
 	 */
 	sa->period_contrib = 1023;
-	sa->load_avg = scale_load_down(se->load.weight);
+	sa->load_avg = scale_load_down(0);
 	sa->load_sum = sa->load_avg * LOAD_AVG_MAX;

root@juno:~# time sh -c 'for i in `seq 1000`; do hackbench -p -P > /dev/null; done'

real    9m55.396s
user    2m41.192s
sys     35m6.196s

IMHO, the hackbench performance "boost" w/o 0905f04eb21f is due to the
fact that a new task gets all its load decayed (making it a small task)
in the __update_load_avg() call in remove_entity_load_avg(), because its
se->avg.last_update_time value is 0, which creates a huge time difference
when comparing it to cfs_rq->avg.last_update_time. Patch 0905f04eb21f
avoids this, so the task stays big (se->avg.load_avg = 1024).
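
To put rough numbers on that decay, here is a small standalone model (my
own sketch, not kernel code): it just replays the PELT geometric series
(1024us periods, y^32 = 0.5) for the kind of huge delta you get when
se->avg.last_update_time is 0; the ~10s of "elapsed" time is a made-up
example value:

/* Toy PELT decay model, not fair.c: shows what a huge time delta
 * (se->avg.last_update_time == 0) does to a freshly forked task's
 * load_avg of 1024. Build with: gcc decay.c -o decay -lm
 */
#include <math.h>
#include <stdio.h>

int main(void)
{
	double y = pow(0.5, 1.0 / 32.0);  /* per-1024us-period decay, y^32 = 0.5 */
	double load_avg = 1024.0;         /* new task: scale_load_down(se->load.weight) */

	/* assumed example: ~10s of rq clock => ~9765 elapsed 1024us periods */
	unsigned int periods = 10000000 / 1024;

	printf("load_avg after decay: %.10f\n", load_avg * pow(y, periods));
	/* ~0, so the "big" new task suddenly looks tiny to fork/exec balancing */
	return 0;
}

With the early bail-out from 0905f04eb21f that decay never happens for a
brand-new task, so it keeps the full 1024.
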
It can't be a difference in the value of cfs_rq->removed_load_avg: w/o
patch 0905f04eb21f we atomic_long_add() a (fully decayed) 0, and with the
patch we bail out before reaching the atomic_long_add(). [...]
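
To spell that control-flow difference out, a toy sketch of my reading of
remove_entity_load_avg() (again simplified standalone C, not the actual
fair.c code; the helper name and the patched/unpatched flag are made up):

/* Toy model of the remove_entity_load_avg() difference discussed above;
 * names mirror the kernel but this is simplified standalone C, not fair.c.
 */
#include <stdbool.h>
#include <stdio.h>

static long removed_load_avg;  /* stands in for cfs_rq->removed_load_avg */

static void remove_entity_load_avg_model(long *se_load_avg,
					 unsigned long long se_last_update_time,
					 bool patched /* has 0905f04eb21f */)
{
	if (patched && se_last_update_time == 0)
		return;                    /* bail before the atomic_long_add() */

	if (se_last_update_time == 0)
		*se_load_avg = 0;          /* huge delta: __update_load_avg() decays it away */

	removed_load_avg += *se_load_avg;  /* the atomic_long_add(): adds ~0 when unpatched */
}

int main(void)
{
	long load_avg = 1024;

	remove_entity_load_avg_model(&load_avg, 0, false);
	printf("w/o patch: se->avg.load_avg=%ld removed_load_avg=%ld\n",
	       load_avg, removed_load_avg);

	removed_load_avg = 0;
	load_avg = 1024;
	remove_entity_load_avg_model(&load_avg, 0, true);
	printf("w/  patch: se->avg.load_avg=%ld removed_load_avg=%ld\n",
	       load_avg, removed_load_avg);
	return 0;
}

Either way removed_load_avg sees 0; what differs is whether the task
itself still looks big afterwards.
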