* [PATCH net-next] Replace 2 jiffies with sysctl netdev_budget_usecs to enable softirq tuning
@ 2017-04-19 16:37 Matthew Whitehead
2017-04-21 17:22 ` David Miller
0 siblings, 1 reply; 5+ messages in thread
From: Matthew Whitehead @ 2017-04-19 16:37 UTC
To: netdev; +Cc: Matthew Whitehead
Constants used for tuning are generally a bad idea, especially as hardware
changes over time. Replace the constant 2 jiffies with sysctl variable
netdev_budget_usecs to enable sysadmins to tune the softirq processing.
Also document the variable.
For example, a very fast machine might tune this to 1000 microseconds,
while my regression testing 486DX-25 needs it to be 4000 microseconds on
a nearly idle network to prevent time_squeeze from being incremented.
Version 2: changed jiffies to microseconds for predictable units.
Signed-off-by: Matthew Whitehead <tedheadster@gmail.com>
---
Documentation/sysctl/net.txt | 11 ++++++++++-
include/linux/netdevice.h | 1 +
include/uapi/linux/sysctl.h | 1 +
kernel/sysctl_binary.c | 1 +
net/core/dev.c | 4 +++-
net/core/sysctl_net_core.c | 8 ++++++++
6 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/Documentation/sysctl/net.txt b/Documentation/sysctl/net.txt
index 2ebabc9..14db18c 100644
--- a/Documentation/sysctl/net.txt
+++ b/Documentation/sysctl/net.txt
@@ -188,7 +188,16 @@ netdev_budget
Maximum number of packets taken from all interfaces in one polling cycle (NAPI
poll). In one polling cycle interfaces which are registered to polling are
-probed in a round-robin manner.
+probed in a round-robin manner. Also, a polling cycle may not exceed
+netdev_budget_usecs microseconds, even if netdev_budget has not been
+exhausted.
+
+netdev_budget_usecs
+---------------------
+
+Maximum number of microseconds in one NAPI polling cycle. Polling
+will exit when either netdev_budget_usecs have elapsed during the
+poll cycle or the number of packets processed reaches netdev_budget.
netdev_max_backlog
------------------
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 97456b25..2fc72a5 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3305,6 +3305,7 @@ static __always_inline int ____dev_forward_skb(struct net_device *dev,
void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev);
extern int netdev_budget;
+extern unsigned int netdev_budget_usecs;
/* Called by rtnetlink.c:rtnl_unlock() */
void netdev_run_todo(void);
diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h
index d2b1215..a1f1f25 100644
--- a/include/uapi/linux/sysctl.h
+++ b/include/uapi/linux/sysctl.h
@@ -274,6 +274,7 @@ enum
NET_CORE_AEVENT_ETIME=20,
NET_CORE_AEVENT_RSEQTH=21,
NET_CORE_WARNINGS=22,
+ NET_CORE_BUDGET_USECS=23,
};
/* /proc/sys/net/ethernet */
diff --git a/kernel/sysctl_binary.c b/kernel/sysctl_binary.c
index ece4b17..4ee3e49 100644
--- a/kernel/sysctl_binary.c
+++ b/kernel/sysctl_binary.c
@@ -197,6 +197,7 @@ struct bin_table {
{ CTL_INT, NET_CORE_AEVENT_ETIME, "xfrm_aevent_etime" },
{ CTL_INT, NET_CORE_AEVENT_RSEQTH, "xfrm_aevent_rseqth" },
{ CTL_INT, NET_CORE_WARNINGS, "warnings" },
+ { CTL_INT, NET_CORE_BUDGET_USECS, "netdev_budget_usecs" },
{},
};
diff --git a/net/core/dev.c b/net/core/dev.c
index 533a6d6..78627e5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3441,6 +3441,7 @@ int dev_queue_xmit_accel(struct sk_buff *skb, void *accel_priv)
int netdev_tstamp_prequeue __read_mostly = 1;
int netdev_budget __read_mostly = 300;
+unsigned int __read_mostly netdev_budget_usecs = 2000;
int weight_p __read_mostly = 64; /* old backlog weight */
int dev_weight_rx_bias __read_mostly = 1; /* bias for backlog weight */
int dev_weight_tx_bias __read_mostly = 1; /* bias for output_queue quota */
@@ -5310,7 +5311,8 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll)
static __latent_entropy void net_rx_action(struct softirq_action *h)
{
struct softnet_data *sd = this_cpu_ptr(&softnet_data);
- unsigned long time_limit = jiffies + 2;
+ unsigned long time_limit = jiffies +
+ usecs_to_jiffies(netdev_budget_usecs);
int budget = netdev_budget;
LIST_HEAD(list);
LIST_HEAD(repoll);
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 7f9cc40..ea23254 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -452,6 +452,14 @@ static int proc_do_rss_key(struct ctl_table *table, int write,
.extra1 = &one,
.extra2 = &max_skb_frags,
},
+ {
+ .procname = "netdev_budget_usecs",
+ .data = &netdev_budget_usecs,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &zero,
+ },
{ }
};
--
1.8.3.1
* Re: [PATCH net-next] Replace 2 jiffies with sysctl netdev_budget_usecs to enable softirq tuning
2017-04-19 16:37 [PATCH net-next] Replace 2 jiffies with sysctl netdev_budget_usecs to enable softirq tuning Matthew Whitehead
@ 2017-04-21 17:22 ` David Miller
2017-04-21 19:57 ` Eric Dumazet
0 siblings, 1 reply; 5+ messages in thread
From: David Miller @ 2017-04-21 17:22 UTC
To: tedheadster; +Cc: netdev
From: Matthew Whitehead <tedheadster@gmail.com>
Date: Wed, 19 Apr 2017 12:37:10 -0400
> Constants used for tuning are generally a bad idea, especially as hardware
> changes over time. Replace the constant 2 jiffies with sysctl variable
> netdev_budget_usecs to enable sysadmins to tune the softirq processing.
> Also document the variable.
>
> For example, a very fast machine might tune this to 1000 microseconds,
> while my regression testing 486DX-25 needs it to be 4000 microseconds on
> a nearly idle network to prevent time_squeeze from being incremented.
>
> Version 2: changed jiffies to microseconds for predictable units.
>
> Signed-off-by: Matthew Whitehead <tedheadster@gmail.com>
Applied, thanks.
* Re: [PATCH net-next] Replace 2 jiffies with sysctl netdev_budget_usecs to enable softirq tuning
2017-04-21 17:22 ` David Miller
@ 2017-04-21 19:57 ` Eric Dumazet
2017-04-21 20:00 ` David Miller
0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2017-04-21 19:57 UTC
To: David Miller; +Cc: tedheadster, netdev
On Fri, 2017-04-21 at 13:22 -0400, David Miller wrote:
> From: Matthew Whitehead <tedheadster@gmail.com>
> Date: Wed, 19 Apr 2017 12:37:10 -0400
>
> > Constants used for tuning are generally a bad idea, especially as hardware
> > changes over time. Replace the constant 2 jiffies with sysctl variable
> > netdev_budget_usecs to enable sysadmins to tune the softirq processing.
> > Also document the variable.
> >
> > For example, a very fast machine might tune this to 1000 microseconds,
> > while my regression testing 486DX-25 needs it to be 4000 microseconds on
> > a nearly idle network to prevent time_squeeze from being incremented.
> >
> > Version 2: changed jiffies to microseconds for predictable units.
> >
> > Signed-off-by: Matthew Whitehead <tedheadster@gmail.com>
>
> Applied, thanks.
Can we revert the changes in kernel/sysctl_binary.c &
include/uapi/linux/sysctl.h ?
{ CTL_INT, NET_CORE_BUDGET_USECS, "netdev_budget_usecs" },
NET_CORE_BUDGET_USECS=23,
Unless I am missing something, we should not add new binary sysctls.
Thanks.
* Re: [PATCH net-next] Replace 2 jiffies with sysctl netdev_budget_usecs to enable softirq tuning
2017-04-21 19:57 ` Eric Dumazet
@ 2017-04-21 20:00 ` David Miller
2017-04-21 20:07 ` Eric Dumazet
0 siblings, 1 reply; 5+ messages in thread
From: David Miller @ 2017-04-21 20:00 UTC
To: eric.dumazet; +Cc: tedheadster, netdev
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 21 Apr 2017 12:57:02 -0700
> On Fri, 2017-04-21 at 13:22 -0400, David Miller wrote:
>> From: Matthew Whitehead <tedheadster@gmail.com>
>> Date: Wed, 19 Apr 2017 12:37:10 -0400
>>
>> > Constants used for tuning are generally a bad idea, especially as hardware
>> > changes over time. Replace the constant 2 jiffies with sysctl variable
>> > netdev_budget_usecs to enable sysadmins to tune the softirq processing.
>> > Also document the variable.
>> >
>> > For example, a very fast machine might tune this to 1000 microseconds,
>> > while my regression testing 486DX-25 needs it to be 4000 microseconds on
>> > a nearly idle network to prevent time_squeeze from being incremented.
>> >
>> > Version 2: changed jiffies to microseconds for predictable units.
>> >
>> > Signed-off-by: Matthew Whitehead <tedheadster@gmail.com>
>>
>> Applied, thanks.
>
> Can we revert the changes in kernel/sysctl_binary.c &
> include/uapi/linux/sysctl.h ?
>
> { CTL_INT, NET_CORE_BUDGET_USECS, "netdev_budget_usecs" },
>
> NET_CORE_BUDGET_USECS=23,
>
> Unless I am missing something, we should not add new binary sysctls.
That's true, I'll kill this.
====================
[PATCH] net: Remove NET_CORE_BUDGET_USECS from sysctl binary interface.
We are not supposed to add new entries to this thing
any more.
Thanks to Eric Dumazet for noticing this.
Signed-off-by: David S. Miller <davem@davemloft.net>
---
include/uapi/linux/sysctl.h | 1 -
kernel/sysctl_binary.c | 1 -
2 files changed, 2 deletions(-)
diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h
index 177f5f1..e13d480 100644
--- a/include/uapi/linux/sysctl.h
+++ b/include/uapi/linux/sysctl.h
@@ -274,7 +274,6 @@ enum
NET_CORE_AEVENT_ETIME=20,
NET_CORE_AEVENT_RSEQTH=21,
NET_CORE_WARNINGS=22,
- NET_CORE_BUDGET_USECS=23,
};
/* /proc/sys/net/ethernet */
diff --git a/kernel/sysctl_binary.c b/kernel/sysctl_binary.c
index 4ee3e49..ece4b17 100644
--- a/kernel/sysctl_binary.c
+++ b/kernel/sysctl_binary.c
@@ -197,7 +197,6 @@ static const struct bin_table bin_net_core_table[] = {
{ CTL_INT, NET_CORE_AEVENT_ETIME, "xfrm_aevent_etime" },
{ CTL_INT, NET_CORE_AEVENT_RSEQTH, "xfrm_aevent_rseqth" },
{ CTL_INT, NET_CORE_WARNINGS, "warnings" },
- { CTL_INT, NET_CORE_BUDGET_USECS, "netdev_budget_usecs" },
{},
};
--
2.4.11
* Re: [PATCH net-next] Replace 2 jiffies with sysctl netdev_budget_usecs to enable softirq tuning
2017-04-21 20:00 ` David Miller
@ 2017-04-21 20:07 ` Eric Dumazet
0 siblings, 0 replies; 5+ messages in thread
From: Eric Dumazet @ 2017-04-21 20:07 UTC
To: David Miller; +Cc: tedheadster, netdev
On Fri, 2017-04-21 at 16:00 -0400, David Miller wrote:
> That's true, I'll kill this.
>
Thanks !