From: "Doug Smythies"
To: "'Vik Heyndrickx'"
Cc: "'Damien Wyart'"
Subject: RE: [PATCH] sched: loadavg 0.00, 0.01, 0.05 on idle, 1.00, 0.99, 0.95 on full load
Date: Thu, 28 Apr 2016 17:10:13 -0700
Message-ID: <001b01d1a1ab$81156e80$83404b80$@net>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2016.04.28 11:46 Vik Heyndrickx wrote:

> Hi Peter,
>
> Systems show a minimal load average of 0.00, 0.01, 0.05 even when they
> have no load at all.
>
> Uptime and /proc/loadavg on all systems with kernels released during the
> last five years up until kernel version 4.6-rc5, show a 5- and 15-minute
> minimum loadavg of 0.01 and 0.05 respectively. This should be 0.00 on
> idle systems, but the way the kernel calculates this value prevents it
> from getting lower than the mentioned values.
>
> Likewise but not as obviously noticeable, a fully loaded system with no
> processes waiting, shows a maximum 1/5/15 loadavg of 1.00, 0.99, 0.95
> (multiplied by number of cores).
>
> Once the (old) load becomes 93 or higher, it mathematically can never
> get lower than 93, even when the active (load) remains 0 forever.
> This results in the strange 0.00, 0.01, 0.05 uptime values on idle
> systems. Note: 93/2048 = 0.0454..., which rounds up to 0.05.
>
> It is not correct to add a 0.5 rounding (=1024/2048) here, since the
> result from this function is fed back into the next iteration again,
> so the result of that +0.5 rounding value then gets multiplied by
> (2048-2037), and then rounded again, so there is a virtual "ghost"
> load created, next to the old and active load terms.
>
> By changing the way the internally kept value is rounded, that internal
> value equivalent now can reach 0.00 on idle, and 1.00 on full load. Upon
> increasing load, the internally kept load value is rounded up, when the
> load is decreasing, the load value is rounded down.
>
> The modified code was tested on nohz=off and nohz kernels. It was tested
> on vanilla kernel 4.6-rc5 and on centos 7.1 kernel 3.10.0-327. It was
> tested on single, dual, and octal cores system. It was tested on virtual
> hosts and bare hardware. No unwanted effects have been observed, and the
> problems that the patch intended to fix were indeed gone.
>
> Fixes: 0f004f5a696a ("sched: Cure more NO_HZ load average woes")
> Cc: Doug Smythies
> Tested-by: Damien Wyart
> Signed-off-by: Vik Heyndrickx
>
> --- kernel/sched/loadavg.c.orig 2016-04-25 01:17:05.000000000 +0200
> +++ kernel/sched/loadavg.c 2016-04-28 16:47:47.754266136 +0200
> @@ -99,10 +99,12 @@ long calc_load_fold_active(struct rq *th
>  static unsigned long
>  calc_load(unsigned long load, unsigned long exp, unsigned long active)
>  {
> -	load *= exp;
> -	load += active * (FIXED_1 - exp);
> -	load += 1UL << (FSHIFT - 1);
> -	return load >> FSHIFT;
> +	long unsigned newload;
> +
> +	newload = load * exp + active * (FIXED_1 - exp);
> +	if (active >= load)
> +		newload += FIXED_1-1;
> +	return newload / FIXED_1;
>  }
>
>  #ifdef CONFIG_NO_HZ_COMMON

See also: https://bugzilla.kernel.org/show_bug.cgi?id=45001

I also tested this patch on 2016.01.22. It works fine.