From: Zhiqiang Liu <liuzhiqiang26@huawei.com>
To: <tglx@linutronix.de>, <corbet@lwn.net>,
Kees Cook <keescook@chromium.org>,
Eric Dumazet <edumazet@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
<manfred@colorfullife.com>, <jwilk@jwilk.net>,
<dvyukov@google.com>, <feng.tang@intel.com>,
<sunilmut@microsoft.com>, <quentin.perret@arm.com>,
<linux@leemhuis.info>, <alex.popov@linux.com>,
<linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
<linux-fsdevel@vger.kernel.org>,
"wangxiaogang (F)" <wangxiaogang3@huawei.com>,
"Zhoukang (A)" <zhoukang7@huawei.com>,
Mingfangsen <mingfangsen@huawei.com>, <tedheadster@gmail.com>
Subject: Re: [PATCH next v2] softirq: enable MAX_SOFTIRQ_TIME tuning with sysctl, max_softirq_time_msecs
Date: Wed, 11 Sep 2019 11:17:46 +0800
Message-ID: <5569bcf0-fb5a-d808-2eed-610b3a548630@huawei.com>
In-Reply-To: <53770380-053e-70b6-f75e-a0e00bf35c30@huawei.com>
Friendly ping...
On 2019/6/25 11:13, Zhiqiang Liu wrote:
> From: Zhiqiang liu <liuzhiqiang26@huawei.com>
>
> In __do_softirq(), MAX_SOFTIRQ_TIME was set to 2 ms via experimentation by
> commit c10d73671 ("softirq: reduce latencies") in 2013, which was designed
> to reduce latencies for various network workloads. The key reason is that the
> maximum duration of one NAPI polling cycle in net_rx_action() was set to
> 2 jiffies, so different HZ settings led to different latencies.
>
> However, commit 7acf8a1e8 ("Replace 2 jiffies with sysctl netdev_budget_usecs
> to enable softirq tuning") adopted netdev_budget_usecs to tune the maximum
> number of microseconds in one NAPI polling cycle. So the latency of
> net_rx_action() can be controlled by sysadmins to cope with hardware changes
> over time.
>
> Correspondingly, MAX_SOFTIRQ_TIME should be tunable by sysadmins, who know
> best about hardware performance, to reach the expected tradeoff between
> latency and fairness. Here, we add the sysctl variable max_softirq_time_msecs
> to replace MAX_SOFTIRQ_TIME, with a default value of 2 ms.
>
> Note: max_softirq_time_msecs will be converted to jiffies, and any budget
> value will be rounded up to the next jiffy, which depends on CONFIG_HZ.
> Because of the granularity of jiffies, the actual time limit may
> differ from the configured value of max_softirq_time_msecs by up to
> one jiffy.
>
> Signed-off-by: Zhiqiang liu <liuzhiqiang26@huawei.com>
> ---
> Documentation/sysctl/kernel.txt | 17 +++++++++++++++++
> kernel/softirq.c | 8 +++++---
> kernel/sysctl.c | 9 +++++++++
> 3 files changed, 31 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
> index f0c86fbb3b48..23b36393f150 100644
> --- a/Documentation/sysctl/kernel.txt
> +++ b/Documentation/sysctl/kernel.txt
> @@ -44,6 +44,7 @@ show up in /proc/sys/kernel:
> - kexec_load_disabled
> - kptr_restrict
> - l2cr [ PPC only ]
> +- max_softirq_time_msecs
> - modprobe ==> Documentation/debugging-modules.txt
> - modules_disabled
> - msg_next_id [ sysv ipc ]
> @@ -445,6 +446,22 @@ This flag controls the L2 cache of G3 processor boards. If
>
> ==============================================================
>
> +max_softirq_time_msecs:
> +
> +Maximum number of milliseconds for which __do_softirq() keeps restarting
> +softirq processing (at most MAX_SOFTIRQ_RESTART times) before breaking out.
> +max_softirq_time_msecs will be converted to jiffies, and any budget
> +value will be rounded up to the next jiffy, which depends on CONFIG_HZ.
> +Because of the granularity of jiffies, the actual time limit may
> +differ from the configured value of max_softirq_time_msecs by up to
> +one jiffy.
> +
> +max_softirq_time_msecs is a non-negative integer value; writing a
> +negative value is meaningless and returns an error.
> +Default: 2
> +
> +==============================================================
> +
> modules_disabled:
>
> A toggle value indicating if modules are allowed to be loaded
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index a6b81c6b6bff..1e456db70093 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -199,7 +199,8 @@ EXPORT_SYMBOL(__local_bh_enable_ip);
>
> /*
> * We restart softirq processing for at most MAX_SOFTIRQ_RESTART times,
> - * but break the loop if need_resched() is set or after 2 ms.
> + * but break the loop if need_resched() is set or after
> + * max_softirq_time_msecs msecs.
> * The MAX_SOFTIRQ_TIME provides a nice upper bound in most cases, but in
> * certain cases, such as stop_machine(), jiffies may cease to
> * increment and so we need the MAX_SOFTIRQ_RESTART limit as
> @@ -210,7 +211,7 @@ EXPORT_SYMBOL(__local_bh_enable_ip);
> * we want to handle softirqs as soon as possible, but they
> * should not be able to lock up the box.
> */
> -#define MAX_SOFTIRQ_TIME msecs_to_jiffies(2)
> +unsigned int __read_mostly max_softirq_time_msecs = 2;
> #define MAX_SOFTIRQ_RESTART 10
>
> #ifdef CONFIG_TRACE_IRQFLAGS
> @@ -248,7 +249,8 @@ static inline void lockdep_softirq_end(bool in_hardirq) { }
>
> asmlinkage __visible void __softirq_entry __do_softirq(void)
> {
> - unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
> + unsigned long end = jiffies +
> + msecs_to_jiffies(max_softirq_time_msecs);
> unsigned long old_flags = current->flags;
> int max_restart = MAX_SOFTIRQ_RESTART;
> struct softirq_action *h;
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 1beca96fb625..96ff292ce7f6 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -118,6 +118,7 @@ extern unsigned int sysctl_nr_open_min, sysctl_nr_open_max;
> #ifndef CONFIG_MMU
> extern int sysctl_nr_trim_pages;
> #endif
> +extern unsigned int max_softirq_time_msecs;
>
> /* Constants used for minimum and maximum */
> #ifdef CONFIG_LOCKUP_DETECTOR
> @@ -1276,6 +1277,14 @@ static struct ctl_table kern_table[] = {
> .extra2 = &one,
> },
> #endif
> + {
> + .procname = "max_softirq_time_msecs",
> + .data = &max_softirq_time_msecs,
> + .maxlen = sizeof(unsigned int),
> + .mode = 0644,
> + .proc_handler = proc_dointvec_minmax,
> + .extra1 = &zero,
> + },
> { }
> };
>