* [PATCH net-next 0/2] net: snmp: minor optimizations
@ 2021-09-30  1:03 Eric Dumazet
  2021-09-30  1:03 ` [PATCH net-next 1/2] net: snmp: inline snmp_get_cpu_field() Eric Dumazet
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Eric Dumazet @ 2021-09-30  1:03 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Mat Martineau, Matthieu Baerts

From: Eric Dumazet <edumazet@google.com>

Fetching many SNMP counters on hosts with a large number of CPUs
takes a lot of time. mptcp still uses the old non-batched
approach, which is not cache friendly.

Eric Dumazet (2):
  net: snmp: inline snmp_get_cpu_field()
  mptcp: use batch snmp operations in mptcp_seq_show()

 include/net/ip.h   |  6 +++++-
 net/ipv4/af_inet.c |  6 ------
 net/mptcp/mib.c    | 17 +++++++----------
 3 files changed, 12 insertions(+), 17 deletions(-)

-- 
2.33.0.800.g4c38ced690-goog



* [PATCH net-next 1/2] net: snmp: inline snmp_get_cpu_field()
  2021-09-30  1:03 [PATCH net-next 0/2] net: snmp: minor optimizations Eric Dumazet
@ 2021-09-30  1:03 ` Eric Dumazet
  2021-09-30  1:03 ` [PATCH net-next 2/2] mptcp: use batch snmp operations in mptcp_seq_show() Eric Dumazet
  2021-09-30 13:20 ` [PATCH net-next 0/2] net: snmp: minor optimizations patchwork-bot+netdevbpf
  2 siblings, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2021-09-30  1:03 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Mat Martineau, Matthieu Baerts

From: Eric Dumazet <edumazet@google.com>

This trivial function is called ~90,000 times on 256-CPU hosts
when reading /proc/net/netstat, and this number keeps growing.

Inlining it saves many cycles.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/ip.h   | 6 +++++-
 net/ipv4/af_inet.c | 6 ------
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index 9192444f2964ebb59454a7dfa5ddf3b19dea04c9..cf229a53119428307da898af4b0dc23e1cecc053 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -291,7 +291,11 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
 #define NET_ADD_STATS(net, field, adnd)	SNMP_ADD_STATS((net)->mib.net_statistics, field, adnd)
 #define __NET_ADD_STATS(net, field, adnd) __SNMP_ADD_STATS((net)->mib.net_statistics, field, adnd)
 
-u64 snmp_get_cpu_field(void __percpu *mib, int cpu, int offct);
+static inline u64 snmp_get_cpu_field(void __percpu *mib, int cpu, int offt)
+{
+	return  *(((unsigned long *)per_cpu_ptr(mib, cpu)) + offt);
+}
+
 unsigned long snmp_fold_field(void __percpu *mib, int offt);
 #if BITS_PER_LONG==32
 u64 snmp_get_cpu_field64(void __percpu *mib, int cpu, int offct,
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 40558033f857c0ca7d98b778f70487e194f3d066..967926c1bf56cfc915258b0969914b11f24c1e16 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1662,12 +1662,6 @@ int inet_ctl_sock_create(struct sock **sk, unsigned short family,
 }
 EXPORT_SYMBOL_GPL(inet_ctl_sock_create);
 
-u64 snmp_get_cpu_field(void __percpu *mib, int cpu, int offt)
-{
-	return  *(((unsigned long *)per_cpu_ptr(mib, cpu)) + offt);
-}
-EXPORT_SYMBOL_GPL(snmp_get_cpu_field);
-
 unsigned long snmp_fold_field(void __percpu *mib, int offt)
 {
 	unsigned long res = 0;
-- 
2.33.0.800.g4c38ced690-goog
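
To illustrate the accessor that patch 1/2 moves into the header, here is a minimal userspace sketch. It is not kernel code: the `model_*` names, the fixed CPU count, and the plain array standing in for `__percpu` memory are all assumptions made for illustration. It only shows the shape of the helper (pick one CPU's block, index one counter) and why a fold over all CPUs calls it once per CPU per field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; the kernel uses real per-cpu memory. */
#define MODEL_NR_CPUS   4
#define MODEL_NR_FIELDS 3

struct model_mib {
	unsigned long mibs[MODEL_NR_FIELDS];
};

/* One counter block per CPU, standing in for a __percpu allocation. */
static struct model_mib model_percpu[MODEL_NR_CPUS];

/* Stand-in for per_cpu_ptr(): returns the block owned by one CPU. */
static inline void *model_per_cpu_ptr(struct model_mib *base, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return &base[cpu];
}

/* Same shape as the inlined helper in the patch: index one counter. */
static inline unsigned long long model_get_cpu_field(struct model_mib *mib,
						      int cpu, int offt)
{
	return *(((unsigned long *)model_per_cpu_ptr(mib, cpu)) + offt);
}

/* Sum one field over all CPUs; the accessor runs once per CPU. */
static unsigned long model_fold_field(struct model_mib *mib, int offt)
{
	unsigned long res = 0;
	int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		res += model_get_cpu_field(mib, cpu, offt);
	return res;
}
```

With C CPUs and F fields, a full dump makes C * F accessor calls, so a function call per counter adds up quickly on large hosts; a `static inline` definition lets the compiler fold each call into a single load.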



* [PATCH net-next 2/2] mptcp: use batch snmp operations in mptcp_seq_show()
  2021-09-30  1:03 [PATCH net-next 0/2] net: snmp: minor optimizations Eric Dumazet
  2021-09-30  1:03 ` [PATCH net-next 1/2] net: snmp: inline snmp_get_cpu_field() Eric Dumazet
@ 2021-09-30  1:03 ` Eric Dumazet
  2021-09-30 13:20 ` [PATCH net-next 0/2] net: snmp: minor optimizations patchwork-bot+netdevbpf
  2 siblings, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2021-09-30  1:03 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Mat Martineau, Matthieu Baerts

From: Eric Dumazet <edumazet@google.com>

Using snmp_get_cpu_field_batch() allows for better cpu cache
utilization, especially on hosts with a large number of cpus.

Also remove special handling when mptcp mibs were not yet
allocated.

I chose to use temporary storage on the stack to keep this patch simple.
We might in the future use the storage allocated in netstat_seq_show().

Combined with the prior patch (inlining snmp_get_cpu_field()),
the time to fetch and output mptcp counters on a 256 cpu host [1]
goes from 75 usec to 16 usec.

[1] L1 cache size is 32KB, which is not big enough to hold the whole dataset.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/mptcp/mib.c | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/net/mptcp/mib.c b/net/mptcp/mib.c
index b21ff9be04c61772c8a7648d9540b48907de9115..3240b72271a7f3724577907bc99bdfe41a97f9de 100644
--- a/net/mptcp/mib.c
+++ b/net/mptcp/mib.c
@@ -72,6 +72,7 @@ bool mptcp_mib_alloc(struct net *net)
 
 void mptcp_seq_show(struct seq_file *seq)
 {
+	unsigned long sum[ARRAY_SIZE(mptcp_snmp_list) - 1];
 	struct net *net = seq->private;
 	int i;
 
@@ -81,17 +82,13 @@ void mptcp_seq_show(struct seq_file *seq)
 
 	seq_puts(seq, "\nMPTcpExt:");
 
-	if (!net->mib.mptcp_statistics) {
-		for (i = 0; mptcp_snmp_list[i].name; i++)
-			seq_puts(seq, " 0");
-
-		seq_putc(seq, '\n');
-		return;
-	}
+	memset(sum, 0, sizeof(sum));
+	if (net->mib.mptcp_statistics)
+		snmp_get_cpu_field_batch(sum, mptcp_snmp_list,
+					 net->mib.mptcp_statistics);
 
 	for (i = 0; mptcp_snmp_list[i].name; i++)
-		seq_printf(seq, " %lu",
-			   snmp_fold_field(net->mib.mptcp_statistics,
-					   mptcp_snmp_list[i].entry));
+		seq_printf(seq, " %lu", sum[i]);
+
 	seq_putc(seq, '\n');
 }
-- 
2.33.0.800.g4c38ced690-goog
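
The difference between the per-field fold and the batched read in patch 2/2 comes down to loop order. The userspace sketch below models that under stated assumptions: the `sketch_*` names, the field list, and the 2D array standing in for per-cpu storage are hypothetical, and the batched loop order (CPUs outside, fields inside) is what this sketch assumes the batched helper does, not a copy of the kernel macro.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NR_CPUS 4

/* Hypothetical counter list, terminated by a NULL name. */
struct sketch_mib_entry {
	const char *name;
	int entry;	/* index into one CPU's counter array */
};

static const struct sketch_mib_entry sketch_list[] = {
	{ "FieldA", 0 },
	{ "FieldB", 1 },
	{ "FieldC", 2 },
	{ NULL, 0 },
};

/* Number of real entries, excluding the terminator. */
#define SKETCH_NR_FIELDS (sizeof(sketch_list) / sizeof(sketch_list[0]) - 1)

/* One row per CPU, standing in for per-cpu storage. */
static unsigned long sketch_counters[SKETCH_NR_CPUS][SKETCH_NR_FIELDS];

/* Old pattern: for one field, visit every CPU's block. */
static unsigned long sketch_fold_field(int entry)
{
	unsigned long res = 0;
	int cpu;

	assert(entry >= 0 && (size_t)entry < SKETCH_NR_FIELDS);
	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		res += sketch_counters[cpu][entry];
	return res;
}

/* Batched pattern: visit each CPU's block once, adding every field. */
static void sketch_get_batch(unsigned long *sum)
{
	int cpu, i;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		for (i = 0; sketch_list[i].name; i++)
			sum[i] += sketch_counters[cpu][sketch_list[i].entry];
}

/* Zero the temporary sums first, as the patch does with memset(). */
static void sketch_collect(unsigned long *sum)
{
	memset(sum, 0, SKETCH_NR_FIELDS * sizeof(*sum));
	sketch_get_batch(sum);
}
```

Both patterns produce the same totals. The batched form reads each CPU's counters contiguously before moving to the next CPU, whereas folding field by field touches every CPU's block again for each field.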



* Re: [PATCH net-next 0/2] net: snmp: minor optimizations
  2021-09-30  1:03 [PATCH net-next 0/2] net: snmp: minor optimizations Eric Dumazet
  2021-09-30  1:03 ` [PATCH net-next 1/2] net: snmp: inline snmp_get_cpu_field() Eric Dumazet
  2021-09-30  1:03 ` [PATCH net-next 2/2] mptcp: use batch snmp operations in mptcp_seq_show() Eric Dumazet
@ 2021-09-30 13:20 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 4+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-09-30 13:20 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: davem, kuba, netdev, edumazet, mathew.j.martineau, matthieu.baerts

Hello:

This series was applied to netdev/net-next.git (refs/heads/master):

On Wed, 29 Sep 2021 18:03:31 -0700 you wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> Fetching many SNMP counters on hosts with large number of cpus
> takes a lot of time. mptcp still uses the old non-batched
> fashion which is not cache friendly.
> 
> Eric Dumazet (2):
>   net: snmp: inline snmp_get_cpu_field()
>   mptcp: use batch snmp operations in mptcp_seq_show()
> 
> [...]

Here is the summary with links:
  - [net-next,1/2] net: snmp: inline snmp_get_cpu_field()
    https://git.kernel.org/netdev/net-next/c/59f09ae8fac4
  - [net-next,2/2] mptcp: use batch snmp operations in mptcp_seq_show()
    https://git.kernel.org/netdev/net-next/c/acbd0c814413

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


