From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Damien Wyart <damien.wyart@free.fr>,
	Vik Heyndrickx <vik.heyndrickx@veribox.net>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Doug Smythies <dsmythies@telus.net>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Mike Galbraith <efault@gmx.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>
Subject: [PATCH 4.5 22/87] sched/loadavg: Fix loadavg artifacts on fully idle and on fully loaded systems
Date: Mon, 30 May 2016 13:49:20 -0700
Message-ID: <20160530204934.209374998@linuxfoundation.org>
In-Reply-To: <20160530204933.149873142@linuxfoundation.org>

4.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Vik Heyndrickx <vik.heyndrickx@veribox.net>

commit 20878232c52329f92423d27a60e48b6a6389e0dd upstream.

Systems show a minimal load average of 0.00, 0.01, 0.05 even when they
have no load at all.

On all systems running kernels released during the last five years, up
to kernel version 4.6-rc5, uptime and /proc/loadavg show minimum 5- and
15-minute load averages of 0.01 and 0.05 respectively. These should be
0.00 on idle systems, but the way the kernel calculates the value
prevents it from dropping below those minimums.

Likewise, though less noticeably, a fully loaded system with no
processes waiting shows a maximum 1/5/15 loadavg of 1.00, 0.99, 0.95
(multiplied by the number of cores).

For the 15-minute average (exp = 2037), once the old load reaches 93 or
higher it can mathematically never drop below 93 again, even when the
active load remains 0 forever. This results in the strange 0.00, 0.01,
0.05 uptime values on idle systems.  Note: 93/2048 = 0.0454..., which
rounds up to 0.05.
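
The floor described above can be reproduced in user space. The sketch
below copies the pre-patch calc_load() from the diff and iterates it with
an active load of 0. FSHIFT and the EXP_* constants are assumed to match
the kernel's fixed-point definitions (FIXED_1 = 2048, EXP_15 = 2037, as
used in this message); settle_old() and shown_hundredths() are
hypothetical helpers written for this illustration, the latter modelled
on how /proc/loadavg is believed to format its fraction.

```c
#include <assert.h>

#define FSHIFT   11                /* bits of fixed-point precision */
#define FIXED_1  (1UL << FSHIFT)   /* 1.0 in fixed point (2048) */
#define EXP_1    1884UL            /* assumed 1-minute decay factor */
#define EXP_5    2014UL            /* assumed 5-minute decay factor */
#define EXP_15   2037UL            /* assumed 15-minute decay factor */

/* calc_load() as it was before this patch: round to nearest. */
static unsigned long calc_load_old(unsigned long load, unsigned long exp,
				   unsigned long active)
{
	load *= exp;
	load += active * (FIXED_1 - exp);
	load += 1UL << (FSHIFT - 1);
	return load >> FSHIFT;
}

/* Hypothetical helper: iterate until the value stops changing. */
static unsigned long settle_old(unsigned long load, unsigned long exp,
				unsigned long active)
{
	unsigned long next;
	int i;

	assert(exp < FIXED_1);
	for (i = 0; i < 100000; i++) {
		next = calc_load_old(load, exp, active);
		if (next == load)
			break;
		load = next;
	}
	return load;
}

/*
 * Hypothetical helper: the two digits after the decimal point for a
 * load below 1.00, including the FIXED_1/200 display rounding offset.
 */
static unsigned long shown_hundredths(unsigned long load)
{
	load += FIXED_1 / 200;
	return ((load & (FIXED_1 - 1)) * 100) >> FSHIFT;
}
```

Starting from a load of 1.00 (FIXED_1) and calling settle_old() with
active = 0, the 1-, 5- and 15-minute values stop at 6, 30 and 93
instead of 0, which shown_hundredths() turns into .00, .01 and .05.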

It is not correct to add a 0.5 rounding term (=1024/2048) here, because
the result of this function is fed back into the next iteration. The
+0.5 rounding value then gets multiplied by (2048-2037) and rounded
again, which creates a virtual "ghost" load next to the old and active
load terms.

By changing the way the internally kept value is rounded, that value
can now reach the equivalent of 0.00 on idle and 1.00 on full load. When
the load is increasing, the internally kept load value is rounded up;
when the load is decreasing, it is rounded down.
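
As a rough user-space check of the directional rounding, the sketch below
copies the patched calc_load() from the diff and iterates it until the
value stops changing. The constants are assumed to match the kernel's
fixed-point definitions (FIXED_1 = 2048, EXP_15 = 2037, as used in this
message); settle_new() is a hypothetical helper written for this
illustration only.

```c
#include <assert.h>

#define FSHIFT   11                /* bits of fixed-point precision */
#define FIXED_1  (1UL << FSHIFT)   /* 1.0 in fixed point (2048) */
#define EXP_1    1884UL            /* assumed 1-minute decay factor */
#define EXP_5    2014UL            /* assumed 5-minute decay factor */
#define EXP_15   2037UL            /* assumed 15-minute decay factor */

/* calc_load() as changed by this patch. */
static unsigned long calc_load_new(unsigned long load, unsigned long exp,
				   unsigned long active)
{
	unsigned long newload;

	newload = load * exp + active * (FIXED_1 - exp);
	/* Round up only while the load is rising towards 'active'. */
	if (active >= load)
		newload += FIXED_1 - 1;

	/* Truncating division rounds down when the load is falling. */
	return newload / FIXED_1;
}

/* Hypothetical helper: iterate until the value stops changing. */
static unsigned long settle_new(unsigned long load, unsigned long exp,
				unsigned long active)
{
	unsigned long next;
	int i;

	assert(exp < FIXED_1);
	for (i = 0; i < 100000; i++) {
		next = calc_load_new(load, exp, active);
		if (next == load)
			break;
		load = next;
	}
	return load;
}
```

With active = 0, settle_new() decays all the way to 0 from any starting
load, and a single step from 93 with EXP_15 yields 92 rather than 93.
With active = FIXED_1 and a starting load of 0, it climbs to exactly
FIXED_1, i.e. 1.00 per fully busy core.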

The modified code was tested on nohz=off and nohz kernels. It was tested
on vanilla kernel 4.6-rc5 and on CentOS 7.1 kernel 3.10.0-327. It was
tested on single-, dual-, and eight-core systems. It was tested on
virtual hosts and bare hardware. No unwanted effects were observed, and
the problems that the patch was intended to fix were indeed gone.

Tested-by: Damien Wyart <damien.wyart@free.fr>
Signed-off-by: Vik Heyndrickx <vik.heyndrickx@veribox.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Doug Smythies <dsmythies@telus.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 0f004f5a696a ("sched: Cure more NO_HZ load average woes")
Link: http://lkml.kernel.org/r/e8d32bff-d544-7748-72b5-3c86cc71f09f@veribox.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/sched/loadavg.c |   11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

--- a/kernel/sched/loadavg.c
+++ b/kernel/sched/loadavg.c
@@ -99,10 +99,13 @@ long calc_load_fold_active(struct rq *th
 static unsigned long
 calc_load(unsigned long load, unsigned long exp, unsigned long active)
 {
-	load *= exp;
-	load += active * (FIXED_1 - exp);
-	load += 1UL << (FSHIFT - 1);
-	return load >> FSHIFT;
+	unsigned long newload;
+
+	newload = load * exp + active * (FIXED_1 - exp);
+	if (active >= load)
+		newload += FIXED_1-1;
+
+	return newload / FIXED_1;
 }
 
 #ifdef CONFIG_NO_HZ_COMMON
