netdev.vger.kernel.org archive mirror
From: Willy Tarreau <w@1wt.eu>
To: David Laight <David.Laight@ACULAB.COM>
Cc: "'Jesper Dangaard Brouer'" <brouer@redhat.com>,
	"'Marek Majkowski'" <marek@cloudflare.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	network dev <netdev@vger.kernel.org>,
	kernel-team <kernel-team@cloudflare.com>,
	Paolo Abeni <pabeni@redhat.com>
Subject: Re: epoll_wait() performance
Date: Thu, 28 Nov 2019 17:52:05 +0100
Message-ID: <20191128165205.GA12629@1wt.eu>
In-Reply-To: <b71441bb2fa14bc7b583de643a1ccf8b@AcuMS.aculab.com>

On Thu, Nov 28, 2019 at 04:37:01PM +0000, David Laight wrote:
> My test system tends to increase its clock rate when busy.
> (The fans speed up immediately, the cpu has a passive heatsink and all the
> case fans are connected (via buffers) to the motherboard 'cpu fan' header.)
> I could probably work out how to lock the frequency, but for some tests I run:
> $ while :; do :; done
> Putting 1 cpu into a userspace infinite loop makes them all run flat out
> (until thermally throttled).

It would be way more efficient to only make the CPUs spin in the idle
loop. I wrote a small module a few years ago for this, which allows me
to do the equivalent of "idle=poll" at runtime. It's very convenient
in VMs as it significantly reduces your latency and jitter by preventing
them from sleeping. It is also quite effective at stabilizing CPUs that
have a large gap between their highest and lowest frequencies.

I'm attaching the patch here. It's straightforward: it was made on 3.14
and still worked unmodified on 4.19, and I'm sure it still does with
more recent kernels.

Hoping this helps,
Willy

---

From 22d67389c2b28d924260b8ced78111111006ed94 Mon Sep 17 00:00:00 2001
From: Willy Tarreau <w@1wt.eu>
Date: Wed, 27 Jan 2016 17:24:54 +0100
Subject: staging: add a new "idle_poll" module to disable idle loop

Sometimes it's convenient to be able to switch to polled mode for the
idle loop. This module does just that, and reverts back to the original
mode once unloaded.
---
 drivers/staging/Kconfig               |  2 ++
 drivers/staging/Makefile              |  1 +
 drivers/staging/idle_poll/Kconfig     |  8 ++++++++
 drivers/staging/idle_poll/Makefile    |  7 +++++++
 drivers/staging/idle_poll/idle_poll.c | 22 ++++++++++++++++++++++
 kernel/cpu/idle.c                     |  1 +
 6 files changed, 41 insertions(+)
 create mode 100644 drivers/staging/idle_poll/Kconfig
 create mode 100644 drivers/staging/idle_poll/Makefile
 create mode 100644 drivers/staging/idle_poll/idle_poll.c

diff --git a/drivers/staging/Kconfig b/drivers/staging/Kconfig
index 9594f204d4fc..936a2721b0f7 100644
--- a/drivers/staging/Kconfig
+++ b/drivers/staging/Kconfig
@@ -148,4 +148,6 @@ source "drivers/staging/dgnc/Kconfig"
 
 source "drivers/staging/dgap/Kconfig"
 
+source "drivers/staging/idle_poll/Kconfig"
+
 endif # STAGING
diff --git a/drivers/staging/Makefile b/drivers/staging/Makefile
index 6ca1cf3dbcd4..d3d45aff73d2 100644
--- a/drivers/staging/Makefile
+++ b/drivers/staging/Makefile
@@ -66,3 +66,4 @@ obj-$(CONFIG_XILLYBUS)		+= xillybus/
 obj-$(CONFIG_DGNC)			+= dgnc/
 obj-$(CONFIG_DGAP)			+= dgap/
 obj-$(CONFIG_MTD_SPINAND_MT29F)	+= mt29f_spinand/
+obj-$(CONFIG_IDLE_POLL)		+= idle_poll/
diff --git a/drivers/staging/idle_poll/Kconfig b/drivers/staging/idle_poll/Kconfig
new file mode 100644
index 000000000000..4c96a21f66aa
--- /dev/null
+++ b/drivers/staging/idle_poll/Kconfig
@@ -0,0 +1,8 @@
+config IDLE_POLL
+	tristate "IDLE_POLL enabler"
+	help
+	    This module automatically enables polling-based idle loop.
+	    It is convenient in certain situations to simply load the
+	    module to disable the idle loop, or unload it to re-enable
+	    it.
+
diff --git a/drivers/staging/idle_poll/Makefile b/drivers/staging/idle_poll/Makefile
new file mode 100644
index 000000000000..60ad176f11f6
--- /dev/null
+++ b/drivers/staging/idle_poll/Makefile
@@ -0,0 +1,7 @@
+# This rule extracts the directory part from the location where this Makefile
+# is found, strips last slash and retrieves the last component which is used
+# to make a file name. It is a generic way of building modules which always
+# have the name of the directory they're located in. $(lastword) could have
+# been used instead of $(word $(words)) but it's bogus on some versions.
+
+obj-m += $(notdir $(patsubst %/,%,$(dir $(word $(words $(MAKEFILE_LIST)),$(MAKEFILE_LIST))))).o
diff --git a/drivers/staging/idle_poll/idle_poll.c b/drivers/staging/idle_poll/idle_poll.c
new file mode 100644
index 000000000000..6f39f85cc61d
--- /dev/null
+++ b/drivers/staging/idle_poll/idle_poll.c
@@ -0,0 +1,22 @@
+#include <linux/module.h>
+#include <linux/cpu.h>
+
+static int __init modinit(void)
+{
+	cpu_idle_poll_ctrl(true);
+	return 0;
+}
+
+static void __exit modexit(void)
+{
+	cpu_idle_poll_ctrl(false);
+	return;
+}
+
+module_init(modinit);
+module_exit(modexit);
+
+MODULE_DESCRIPTION("idle_poll enabler");
+MODULE_AUTHOR("Willy Tarreau");
+MODULE_VERSION("0.0.1");
+MODULE_LICENSE("GPL");
diff --git a/kernel/cpu/idle.c b/kernel/cpu/idle.c
index 277f494c2a9a..fbf648bc52b2 100644
--- a/kernel/cpu/idle.c
+++ b/kernel/cpu/idle.c
@@ -22,6 +22,7 @@ void cpu_idle_poll_ctrl(bool enable)
 		WARN_ON_ONCE(cpu_idle_force_poll < 0);
 	}
 }
+EXPORT_SYMBOL(cpu_idle_poll_ctrl);
 
 #ifdef CONFIG_GENERIC_IDLE_POLL_SETUP
 static int __init cpu_idle_poll_setup(char *__unused)
-- 
2.20.1

