* [RFC PATCH v3 0/7] Reduce cache miss for snmp_fold_field
@ 2016-09-09  6:33 Jia He
  2016-09-09  6:33 ` [RFC PATCH v3 1/7] net:snmp: Introduce generic batch interfaces for snmp_get_cpu_field{,64} Jia He
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Jia He @ 2016-09-09  6:33 UTC (permalink / raw)
  To: netdev
  Cc: linux-sctp, linux-kernel, davem, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, Vlad Yasevich, Neil Horman,
	Steffen Klassert, Herbert Xu, marcelo.leitner, Jia He

On a PowerPC server with a large number of cpus (160), even after commit
a3a773726c9f ("net: Optimize snmp stat aggregation by walking all
the percpu data at once"), I observed several other snmp_fold_field
call sites that cause a high cache miss rate.

My simple test case, which reads the procfs items endlessly:
/***********************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#define LINELEN  2560
int main(int argc, char **argv)
{
        unsigned long i;
        int fd = -1;
        int rdsize = 0;
        char buf[LINELEN + 1];

        memset(buf, 0, sizeof(buf));

        if (argc <= 1) {
                printf("file name empty\n");
                return -1;
        }

        fd = open(argv[1], O_RDONLY);
        if (fd < 0) {
                printf("open error\n");
                return -2;
        }

        for (i = 0; i < 0xffffffff; i++) {
                while ((rdsize = read(fd, buf, LINELEN)) > 0) {
                        /* discard the data; only the reads matter */
                }

                lseek(fd, 0, SEEK_SET);
        }

        close(fd);
        return 0;
}
/**********************************************************/

compile and run:
gcc test.c -o test

perf stat -d -e cache-misses ./test /proc/net/snmp
perf stat -d -e cache-misses ./test /proc/net/snmp6
perf stat -d -e cache-misses ./test /proc/net/netstat
perf stat -d -e cache-misses ./test /proc/net/sctp/snmp
perf stat -d -e cache-misses ./test /proc/net/xfrm_stat

Before the patch set:
====================
 Performance counter stats for 'system wide':

         355911097      cache-misses                                                 [40.08%]
        2356829300      L1-dcache-loads                                              [60.04%]
         355642645      L1-dcache-load-misses     #   15.09% of all L1-dcache hits   [60.02%]
         346544541      LLC-loads                                                    [59.97%]
            389763      LLC-load-misses           #    0.11% of all LL-cache hits    [40.02%]

       6.245162638 seconds time elapsed

After the patch set:
===================
 Performance counter stats for 'system wide':

         194992476      cache-misses                                                 [40.03%]
        6718051877      L1-dcache-loads                                              [60.07%]
         194871921      L1-dcache-load-misses     #    2.90% of all L1-dcache hits   [60.11%]
         187632232      LLC-loads                                                    [60.04%]
            464466      LLC-load-misses           #    0.25% of all LL-cache hits    [39.89%]

       6.868422769 seconds time elapsed

The L1-dcache miss rate is reduced from 15.09% to 2.90%.

v3:
- introduce generic batch interfaces (suggested by Marcelo Ricardo Leitner)
- use max_t instead of a self-defined macro (suggested by David Miller)
v2:
- fix a bug in the udplite statistics
- split snmp_seq_show into two parts

Jia He (7):
  net:snmp: Introduce generic interfaces for snmp_get_cpu_field{,64}
  proc: Reduce cache miss in {snmp,netstat}_seq_show
  proc: Reduce cache miss in snmp6_seq_show
  proc: Reduce cache miss in sctp_snmp_seq_show
  proc: Reduce cache miss in xfrm_statistics_seq_show
  ipv6: Remove useless parameter in __snmp6_fill_statsdev
  net: Suppress the "Comparison to NULL could be written" warning

 include/net/ip.h     |  23 +++++++++
 net/ipv4/proc.c      | 138 +++++++++++++++++++++++++++++++++------------------
 net/ipv6/addrconf.c  |  12 ++---
 net/ipv6/proc.c      |  32 ++++++++----
 net/sctp/proc.c      |  10 ++--
 net/xfrm/xfrm_proc.c |  10 +++-
 6 files changed, 157 insertions(+), 68 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v3 1/7] net:snmp: Introduce generic batch interfaces for snmp_get_cpu_field{,64}
  2016-09-09  6:33 [RFC PATCH v3 0/7] Reduce cache miss for snmp_fold_field Jia He
@ 2016-09-09  6:33 ` Jia He
  2016-09-09  6:33 ` [RFC PATCH v3 2/7] proc: Reduce cache miss in {snmp,netstat}_seq_show Jia He
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Jia He @ 2016-09-09  6:33 UTC (permalink / raw)
  To: netdev
  Cc: linux-sctp, linux-kernel, davem, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, Vlad Yasevich, Neil Horman,
	Steffen Klassert, Herbert Xu, marcelo.leitner, Jia He

This introduces generic batch interfaces for snmp_get_cpu_field{,64}.
They exchange the order of the two for-loops that collect the percpu
statistics data, so the data is aggregated by going through all the
items of each cpu sequentially.

Signed-off-by: Jia He <hejianet@gmail.com>
Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>

---
 include/net/ip.h | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/include/net/ip.h b/include/net/ip.h
index 9742b92..bc43c0f 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -219,6 +219,29 @@ static inline u64 snmp_fold_field64(void __percpu *mib, int offt, size_t syncp_o
 }
 #endif
 
+#define snmp_get_cpu_field64_batch(buff64, stats_list, mib_statistic, offset) \
+{ \
+	int i, c; \
+	for_each_possible_cpu(c) { \
+		for (i = 0; stats_list[i].name; i++) \
+			buff64[i] += snmp_get_cpu_field64( \
+					mib_statistic, \
+					c, stats_list[i].entry, \
+					offset); \
+	} \
+}
+
+#define snmp_get_cpu_field_batch(buff, stats_list, mib_statistic) \
+{ \
+	int i, c; \
+	for_each_possible_cpu(c) { \
+		for (i = 0; stats_list[i].name; i++) \
+			buff[i] += snmp_get_cpu_field( \
+						mib_statistic, \
+						c, stats_list[i].entry); \
+	} \
+}
+
 void inet_get_local_port_range(struct net *net, int *low, int *high);
 
 #ifdef CONFIG_SYSCTL
-- 
1.8.3.1


* [RFC PATCH v3 2/7] proc: Reduce cache miss in {snmp,netstat}_seq_show
  2016-09-09  6:33 [RFC PATCH v3 0/7] Reduce cache miss for snmp_fold_field Jia He
  2016-09-09  6:33 ` [RFC PATCH v3 1/7] net:snmp: Introduce generic batch interfaces for snmp_get_cpu_field{,64} Jia He
@ 2016-09-09  6:33 ` Jia He
  2016-09-12 18:57   ` Marcelo
  2016-09-09  6:33 ` [RFC PATCH v3 3/7] proc: Reduce cache miss in snmp6_seq_show Jia He
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Jia He @ 2016-09-09  6:33 UTC (permalink / raw)
  To: netdev
  Cc: linux-sctp, linux-kernel, davem, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, Vlad Yasevich, Neil Horman,
	Steffen Klassert, Herbert Xu, marcelo.leitner, Jia He

This uses the generic interfaces snmp_get_cpu_field{,64}_batch to
aggregate the data by going through all the items of each cpu sequentially.
snmp_seq_show and netstat_seq_show are then each split into two parts to
avoid the build warning about "the frame size" being larger than 1024
bytes on s390.

Signed-off-by: Jia He <hejianet@gmail.com>
---
 net/ipv4/proc.c | 106 +++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 74 insertions(+), 32 deletions(-)

diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index 9f665b6..c6fc80e 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -46,6 +46,8 @@
 #include <net/sock.h>
 #include <net/raw.h>
 
+#define TCPUDP_MIB_MAX max_t(u32, UDP_MIB_MAX, TCP_MIB_MAX)
+
 /*
  *	Report socket allocation statistics [mea@utu.fi]
  */
@@ -378,13 +380,15 @@ static void icmp_put(struct seq_file *seq)
 /*
  *	Called from the PROCfs module. This outputs /proc/net/snmp.
  */
-static int snmp_seq_show(struct seq_file *seq, void *v)
+static int snmp_seq_show_ipstats(struct seq_file *seq, void *v)
 {
 	int i;
+	u64 buff64[IPSTATS_MIB_MAX];
 	struct net *net = seq->private;
 
-	seq_puts(seq, "Ip: Forwarding DefaultTTL");
+	memset(buff64, 0, IPSTATS_MIB_MAX * sizeof(u64));
 
+	seq_puts(seq, "Ip: Forwarding DefaultTTL");
 	for (i = 0; snmp4_ipstats_list[i].name != NULL; i++)
 		seq_printf(seq, " %s", snmp4_ipstats_list[i].name);
 
@@ -393,57 +397,77 @@ static int snmp_seq_show(struct seq_file *seq, void *v)
 		   net->ipv4.sysctl_ip_default_ttl);
 
 	BUILD_BUG_ON(offsetof(struct ipstats_mib, mibs) != 0);
+	snmp_get_cpu_field64_batch(buff64, snmp4_ipstats_list,
+				   net->mib.ip_statistics,
+				   offsetof(struct ipstats_mib, syncp));
 	for (i = 0; snmp4_ipstats_list[i].name != NULL; i++)
-		seq_printf(seq, " %llu",
-			   snmp_fold_field64(net->mib.ip_statistics,
-					     snmp4_ipstats_list[i].entry,
-					     offsetof(struct ipstats_mib, syncp)));
+		seq_printf(seq, " %llu", buff64[i]);
 
-	icmp_put(seq);	/* RFC 2011 compatibility */
-	icmpmsg_put(seq);
+	return 0;
+}
+
+static int snmp_seq_show_tcp_udp(struct seq_file *seq, void *v)
+{
+	int i;
+	unsigned long buff[TCPUDP_MIB_MAX];
+	struct net *net = seq->private;
+
+	memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
 
 	seq_puts(seq, "\nTcp:");
 	for (i = 0; snmp4_tcp_list[i].name != NULL; i++)
 		seq_printf(seq, " %s", snmp4_tcp_list[i].name);
 
 	seq_puts(seq, "\nTcp:");
+	snmp_get_cpu_field_batch(buff, snmp4_tcp_list,
+				 net->mib.tcp_statistics);
 	for (i = 0; snmp4_tcp_list[i].name != NULL; i++) {
 		/* MaxConn field is signed, RFC 2012 */
 		if (snmp4_tcp_list[i].entry == TCP_MIB_MAXCONN)
-			seq_printf(seq, " %ld",
-				   snmp_fold_field(net->mib.tcp_statistics,
-						   snmp4_tcp_list[i].entry));
+			seq_printf(seq, " %ld", buff[i]);
 		else
-			seq_printf(seq, " %lu",
-				   snmp_fold_field(net->mib.tcp_statistics,
-						   snmp4_tcp_list[i].entry));
+			seq_printf(seq, " %lu", buff[i]);
 	}
 
+	memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
+
+	snmp_get_cpu_field_batch(buff, snmp4_udp_list,
+				 net->mib.udp_statistics);
 	seq_puts(seq, "\nUdp:");
 	for (i = 0; snmp4_udp_list[i].name != NULL; i++)
 		seq_printf(seq, " %s", snmp4_udp_list[i].name);
-
 	seq_puts(seq, "\nUdp:");
 	for (i = 0; snmp4_udp_list[i].name != NULL; i++)
-		seq_printf(seq, " %lu",
-			   snmp_fold_field(net->mib.udp_statistics,
-					   snmp4_udp_list[i].entry));
+		seq_printf(seq, " %lu", buff[i]);
+
+	memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
 
 	/* the UDP and UDP-Lite MIBs are the same */
 	seq_puts(seq, "\nUdpLite:");
+	snmp_get_cpu_field_batch(buff, snmp4_udp_list,
+				 net->mib.udplite_statistics);
 	for (i = 0; snmp4_udp_list[i].name != NULL; i++)
 		seq_printf(seq, " %s", snmp4_udp_list[i].name);
-
 	seq_puts(seq, "\nUdpLite:");
 	for (i = 0; snmp4_udp_list[i].name != NULL; i++)
-		seq_printf(seq, " %lu",
-			   snmp_fold_field(net->mib.udplite_statistics,
-					   snmp4_udp_list[i].entry));
+		seq_printf(seq, " %lu", buff[i]);
 
 	seq_putc(seq, '\n');
 	return 0;
 }
 
+static int snmp_seq_show(struct seq_file *seq, void *v)
+{
+	snmp_seq_show_ipstats(seq, v);
+
+	icmp_put(seq);	/* RFC 2011 compatibility */
+	icmpmsg_put(seq);
+
+	snmp_seq_show_tcp_udp(seq, v);
+
+	return 0;
+}
+
 static int snmp_seq_open(struct inode *inode, struct file *file)
 {
 	return single_open_net(inode, file, snmp_seq_show);
@@ -457,41 +481,59 @@ static const struct file_operations snmp_seq_fops = {
 	.release = single_release_net,
 };
 
-
-
 /*
  *	Output /proc/net/netstat
  */
-static int netstat_seq_show(struct seq_file *seq, void *v)
+static int netstat_seq_show_tcpext(struct seq_file *seq, void *v)
 {
 	int i;
+	unsigned long buff[LINUX_MIB_MAX];
 	struct net *net = seq->private;
 
+	memset(buff, 0, sizeof(unsigned long) * LINUX_MIB_MAX);
+
 	seq_puts(seq, "TcpExt:");
 	for (i = 0; snmp4_net_list[i].name != NULL; i++)
 		seq_printf(seq, " %s", snmp4_net_list[i].name);
 
 	seq_puts(seq, "\nTcpExt:");
+	snmp_get_cpu_field_batch(buff, snmp4_net_list,
+				 net->mib.net_statistics);
 	for (i = 0; snmp4_net_list[i].name != NULL; i++)
-		seq_printf(seq, " %lu",
-			   snmp_fold_field(net->mib.net_statistics,
-					   snmp4_net_list[i].entry));
+		seq_printf(seq, " %lu", buff[i]);
+
+	return 0;
+}
+
+static int netstat_seq_show_ipext(struct seq_file *seq, void *v)
+{
+	int i;
+	u64 buff64[IPSTATS_MIB_MAX];
+	struct net *net = seq->private;
 
 	seq_puts(seq, "\nIpExt:");
 	for (i = 0; snmp4_ipextstats_list[i].name != NULL; i++)
 		seq_printf(seq, " %s", snmp4_ipextstats_list[i].name);
 
 	seq_puts(seq, "\nIpExt:");
+	snmp_get_cpu_field64_batch(buff64, snmp4_ipextstats_list,
+				   net->mib.ip_statistics,
+				   offsetof(struct ipstats_mib, syncp));
 	for (i = 0; snmp4_ipextstats_list[i].name != NULL; i++)
-		seq_printf(seq, " %llu",
-			   snmp_fold_field64(net->mib.ip_statistics,
-					     snmp4_ipextstats_list[i].entry,
-					     offsetof(struct ipstats_mib, syncp)));
+		seq_printf(seq, " %llu", buff64[i]);
 
 	seq_putc(seq, '\n');
 	return 0;
 }
 
+static int netstat_seq_show(struct seq_file *seq, void *v)
+{
+	netstat_seq_show_tcpext(seq, v);
+	netstat_seq_show_ipext(seq, v);
+
+	return 0;
+}
+
 static int netstat_seq_open(struct inode *inode, struct file *file)
 {
 	return single_open_net(inode, file, netstat_seq_show);
-- 
1.8.3.1


* [RFC PATCH v3 3/7] proc: Reduce cache miss in snmp6_seq_show
  2016-09-09  6:33 [RFC PATCH v3 0/7] Reduce cache miss for snmp_fold_field Jia He
  2016-09-09  6:33 ` [RFC PATCH v3 1/7] net:snmp: Introduce generic batch interfaces for snmp_get_cpu_field{,64} Jia He
  2016-09-09  6:33 ` [RFC PATCH v3 2/7] proc: Reduce cache miss in {snmp,netstat}_seq_show Jia He
@ 2016-09-09  6:33 ` Jia He
  2016-09-12 19:05   ` Marcelo
  2016-09-09  6:33 ` [RFC PATCH v3 4/7] proc: Reduce cache miss in sctp_snmp_seq_show Jia He
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Jia He @ 2016-09-09  6:33 UTC (permalink / raw)
  To: netdev
  Cc: linux-sctp, linux-kernel, davem, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, Vlad Yasevich, Neil Horman,
	Steffen Klassert, Herbert Xu, marcelo.leitner, Jia He

This uses the generic interfaces snmp_get_cpu_field{,64}_batch to
aggregate the data by going through all the items of each cpu sequentially.

Signed-off-by: Jia He <hejianet@gmail.com>
---
 net/ipv6/proc.c | 32 +++++++++++++++++++++++---------
 1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/net/ipv6/proc.c b/net/ipv6/proc.c
index 679253d0..50ba2c3 100644
--- a/net/ipv6/proc.c
+++ b/net/ipv6/proc.c
@@ -30,6 +30,11 @@
 #include <net/transp_v6.h>
 #include <net/ipv6.h>
 
+#define MAX4(a, b, c, d) \
+	max_t(u32, max_t(u32, a, b), max_t(u32, c, d))
+#define SNMP_MIB_MAX MAX4(UDP_MIB_MAX, TCP_MIB_MAX, \
+			IPSTATS_MIB_MAX, ICMP_MIB_MAX)
+
 static int sockstat6_seq_show(struct seq_file *seq, void *v)
 {
 	struct net *net = seq->private;
@@ -192,13 +197,19 @@ static void snmp6_seq_show_item(struct seq_file *seq, void __percpu *pcpumib,
 				const struct snmp_mib *itemlist)
 {
 	int i;
-	unsigned long val;
-
-	for (i = 0; itemlist[i].name; i++) {
-		val = pcpumib ?
-			snmp_fold_field(pcpumib, itemlist[i].entry) :
-			atomic_long_read(smib + itemlist[i].entry);
-		seq_printf(seq, "%-32s\t%lu\n", itemlist[i].name, val);
+	unsigned long buff[SNMP_MIB_MAX];
+
+	memset(buff, 0, sizeof(unsigned long) * SNMP_MIB_MAX);
+
+	if (pcpumib) {
+		snmp_get_cpu_field_batch(buff, itemlist, pcpumib);
+		for (i = 0; itemlist[i].name; i++)
+			seq_printf(seq, "%-32s\t%lu\n",
+				   itemlist[i].name, buff[i]);
+	} else {
+		for (i = 0; itemlist[i].name; i++)
+			seq_printf(seq, "%-32s\t%lu\n", itemlist[i].name,
+				   atomic_long_read(smib + itemlist[i].entry));
 	}
 }
 
@@ -206,10 +217,13 @@ static void snmp6_seq_show_item64(struct seq_file *seq, void __percpu *mib,
 				  const struct snmp_mib *itemlist, size_t syncpoff)
 {
 	int i;
+	u64 buff64[SNMP_MIB_MAX];
+
+	memset(buff64, 0, sizeof(unsigned long) * SNMP_MIB_MAX);
 
+	snmp_get_cpu_field64_batch(buff64, itemlist, mib, syncpoff);
 	for (i = 0; itemlist[i].name; i++)
-		seq_printf(seq, "%-32s\t%llu\n", itemlist[i].name,
-			   snmp_fold_field64(mib, itemlist[i].entry, syncpoff));
+		seq_printf(seq, "%-32s\t%llu\n", itemlist[i].name, buff64[i]);
 }
 
 static int snmp6_seq_show(struct seq_file *seq, void *v)
-- 
1.8.3.1


* [RFC PATCH v3 4/7] proc: Reduce cache miss in sctp_snmp_seq_show
  2016-09-09  6:33 [RFC PATCH v3 0/7] Reduce cache miss for snmp_fold_field Jia He
                   ` (2 preceding siblings ...)
  2016-09-09  6:33 ` [RFC PATCH v3 3/7] proc: Reduce cache miss in snmp6_seq_show Jia He
@ 2016-09-09  6:33 ` Jia He
  2016-09-09  6:34 ` [RFC PATCH v3 5/7] proc: Reduce cache miss in xfrm_statistics_seq_show Jia He
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Jia He @ 2016-09-09  6:33 UTC (permalink / raw)
  To: netdev
  Cc: linux-sctp, linux-kernel, davem, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, Vlad Yasevich, Neil Horman,
	Steffen Klassert, Herbert Xu, marcelo.leitner, Jia He

This uses the generic interfaces snmp_get_cpu_field{,64}_batch to
aggregate the data by going through all the items of each cpu sequentially.

Signed-off-by: Jia He <hejianet@gmail.com>
---
 net/sctp/proc.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/sctp/proc.c b/net/sctp/proc.c
index ef8ba77..0487c01 100644
--- a/net/sctp/proc.c
+++ b/net/sctp/proc.c
@@ -75,11 +75,15 @@ static int sctp_snmp_seq_show(struct seq_file *seq, void *v)
 {
 	struct net *net = seq->private;
 	int i;
+	unsigned long buff[SCTP_MIB_MAX];
 
+	memset(buff, 0, sizeof(unsigned long) * SCTP_MIB_MAX);
+
+	snmp_get_cpu_field_batch(buff, sctp_snmp_list,
+				 net->sctp.sctp_statistics);
 	for (i = 0; sctp_snmp_list[i].name != NULL; i++)
 		seq_printf(seq, "%-32s\t%ld\n", sctp_snmp_list[i].name,
-			   snmp_fold_field(net->sctp.sctp_statistics,
-				      sctp_snmp_list[i].entry));
+						buff[i]);
 
 	return 0;
 }
-- 
1.8.3.1


* [RFC PATCH v3 5/7] proc: Reduce cache miss in xfrm_statistics_seq_show
  2016-09-09  6:33 [RFC PATCH v3 0/7] Reduce cache miss for snmp_fold_field Jia He
                   ` (3 preceding siblings ...)
  2016-09-09  6:33 ` [RFC PATCH v3 4/7] proc: Reduce cache miss in sctp_snmp_seq_show Jia He
@ 2016-09-09  6:34 ` Jia He
  2016-09-09  6:34 ` [RFC PATCH v3 6/7] ipv6: Remove useless parameter in __snmp6_fill_statsdev Jia He
  2016-09-09  6:34 ` [RFC PATCH v3 7/7] net: Suppress the "Comparison to NULL could be written" warning Jia He
  6 siblings, 0 replies; 17+ messages in thread
From: Jia He @ 2016-09-09  6:34 UTC (permalink / raw)
  To: netdev
  Cc: linux-sctp, linux-kernel, davem, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, Vlad Yasevich, Neil Horman,
	Steffen Klassert, Herbert Xu, marcelo.leitner, Jia He

This uses the generic interfaces snmp_get_cpu_field{,64}_batch to
aggregate the data by going through all the items of each cpu sequentially.

Signed-off-by: Jia He <hejianet@gmail.com>
---
 net/xfrm/xfrm_proc.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/xfrm/xfrm_proc.c b/net/xfrm/xfrm_proc.c
index 9c4fbd8..2b280fc 100644
--- a/net/xfrm/xfrm_proc.c
+++ b/net/xfrm/xfrm_proc.c
@@ -52,10 +52,16 @@ static int xfrm_statistics_seq_show(struct seq_file *seq, void *v)
 {
 	struct net *net = seq->private;
 	int i;
+	unsigned long buff[LINUX_MIB_XFRMMAX];
+
+	memset(buff, 0, sizeof(unsigned long) * LINUX_MIB_XFRMMAX);
+
+	snmp_get_cpu_field_batch(buff, xfrm_mib_list,
+				 net->mib.xfrm_statistics);
 	for (i = 0; xfrm_mib_list[i].name; i++)
 		seq_printf(seq, "%-24s\t%lu\n", xfrm_mib_list[i].name,
-			   snmp_fold_field(net->mib.xfrm_statistics,
-					   xfrm_mib_list[i].entry));
+						buff[i]);
+
 	return 0;
 }
 
-- 
1.8.3.1


* [RFC PATCH v3 6/7] ipv6: Remove useless parameter in __snmp6_fill_statsdev
  2016-09-09  6:33 [RFC PATCH v3 0/7] Reduce cache miss for snmp_fold_field Jia He
                   ` (4 preceding siblings ...)
  2016-09-09  6:34 ` [RFC PATCH v3 5/7] proc: Reduce cache miss in xfrm_statistics_seq_show Jia He
@ 2016-09-09  6:34 ` Jia He
  2016-09-09  6:34 ` [RFC PATCH v3 7/7] net: Suppress the "Comparison to NULL could be written" warning Jia He
  6 siblings, 0 replies; 17+ messages in thread
From: Jia He @ 2016-09-09  6:34 UTC (permalink / raw)
  To: netdev
  Cc: linux-sctp, linux-kernel, davem, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, Vlad Yasevich, Neil Horman,
	Steffen Klassert, Herbert Xu, marcelo.leitner, Jia He

The 'items' parameter of __snmp6_fill_statsdev is always ICMP6_MIB_MAX,
so it is useless; remove it and use the constant directly.

Signed-off-by: Jia He <hejianet@gmail.com>
---
 net/ipv6/addrconf.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index f418d2e..e170554 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4952,18 +4952,18 @@ static inline size_t inet6_if_nlmsg_size(void)
 }
 
 static inline void __snmp6_fill_statsdev(u64 *stats, atomic_long_t *mib,
-				      int items, int bytes)
+					int bytes)
 {
 	int i;
-	int pad = bytes - sizeof(u64) * items;
+	int pad = bytes - sizeof(u64) * ICMP6_MIB_MAX;
 	BUG_ON(pad < 0);
 
 	/* Use put_unaligned() because stats may not be aligned for u64. */
-	put_unaligned(items, &stats[0]);
-	for (i = 1; i < items; i++)
+	put_unaligned(ICMP6_MIB_MAX, &stats[0]);
+	for (i = 1; i < ICMP6_MIB_MAX; i++)
 		put_unaligned(atomic_long_read(&mib[i]), &stats[i]);
 
-	memset(&stats[items], 0, pad);
+	memset(&stats[ICMP6_MIB_MAX], 0, pad);
 }
 
 static inline void __snmp6_fill_stats64(u64 *stats, void __percpu *mib,
@@ -4996,7 +4996,7 @@ static void snmp6_fill_stats(u64 *stats, struct inet6_dev *idev, int attrtype,
 				     offsetof(struct ipstats_mib, syncp));
 		break;
 	case IFLA_INET6_ICMP6STATS:
-		__snmp6_fill_statsdev(stats, idev->stats.icmpv6dev->mibs, ICMP6_MIB_MAX, bytes);
+		__snmp6_fill_statsdev(stats, idev->stats.icmpv6dev->mibs, bytes);
 		break;
 	}
 }
-- 
1.8.3.1


* [RFC PATCH v3 7/7] net: Suppress the "Comparison to NULL could be written" warning
  2016-09-09  6:33 [RFC PATCH v3 0/7] Reduce cache miss for snmp_fold_field Jia He
                   ` (5 preceding siblings ...)
  2016-09-09  6:34 ` [RFC PATCH v3 6/7] ipv6: Remove useless parameter in __snmp6_fill_statsdev Jia He
@ 2016-09-09  6:34 ` Jia He
  6 siblings, 0 replies; 17+ messages in thread
From: Jia He @ 2016-09-09  6:34 UTC (permalink / raw)
  To: netdev
  Cc: linux-sctp, linux-kernel, davem, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, Vlad Yasevich, Neil Horman,
	Steffen Klassert, Herbert Xu, marcelo.leitner, Jia He

This suppresses the checkpatch.pl warning "Comparison to NULL
could be written". No functional change.

Signed-off-by: Jia He <hejianet@gmail.com>
---
 net/ipv4/proc.c | 32 ++++++++++++++++----------------
 net/sctp/proc.c |  2 +-
 2 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index c6fc80e..c6ee8a2 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -357,22 +357,22 @@ static void icmp_put(struct seq_file *seq)
 	atomic_long_t *ptr = net->mib.icmpmsg_statistics->mibs;
 
 	seq_puts(seq, "\nIcmp: InMsgs InErrors InCsumErrors");
-	for (i = 0; icmpmibmap[i].name != NULL; i++)
+	for (i = 0; icmpmibmap[i].name; i++)
 		seq_printf(seq, " In%s", icmpmibmap[i].name);
 	seq_puts(seq, " OutMsgs OutErrors");
-	for (i = 0; icmpmibmap[i].name != NULL; i++)
+	for (i = 0; icmpmibmap[i].name; i++)
 		seq_printf(seq, " Out%s", icmpmibmap[i].name);
 	seq_printf(seq, "\nIcmp: %lu %lu %lu",
 		snmp_fold_field(net->mib.icmp_statistics, ICMP_MIB_INMSGS),
 		snmp_fold_field(net->mib.icmp_statistics, ICMP_MIB_INERRORS),
 		snmp_fold_field(net->mib.icmp_statistics, ICMP_MIB_CSUMERRORS));
-	for (i = 0; icmpmibmap[i].name != NULL; i++)
+	for (i = 0; icmpmibmap[i].name; i++)
 		seq_printf(seq, " %lu",
 			   atomic_long_read(ptr + icmpmibmap[i].index));
 	seq_printf(seq, " %lu %lu",
 		snmp_fold_field(net->mib.icmp_statistics, ICMP_MIB_OUTMSGS),
 		snmp_fold_field(net->mib.icmp_statistics, ICMP_MIB_OUTERRORS));
-	for (i = 0; icmpmibmap[i].name != NULL; i++)
+	for (i = 0; icmpmibmap[i].name; i++)
 		seq_printf(seq, " %lu",
 			   atomic_long_read(ptr + (icmpmibmap[i].index | 0x100)));
 }
@@ -389,7 +389,7 @@ static int snmp_seq_show_ipstats(struct seq_file *seq, void *v)
 	memset(buff64, 0, IPSTATS_MIB_MAX * sizeof(u64));
 
 	seq_puts(seq, "Ip: Forwarding DefaultTTL");
-	for (i = 0; snmp4_ipstats_list[i].name != NULL; i++)
+	for (i = 0; snmp4_ipstats_list[i].name; i++)
 		seq_printf(seq, " %s", snmp4_ipstats_list[i].name);
 
 	seq_printf(seq, "\nIp: %d %d",
@@ -400,7 +400,7 @@ static int snmp_seq_show_ipstats(struct seq_file *seq, void *v)
 	snmp_get_cpu_field64_batch(buff64, snmp4_ipstats_list,
 				   net->mib.ip_statistics,
 				   offsetof(struct ipstats_mib, syncp));
-	for (i = 0; snmp4_ipstats_list[i].name != NULL; i++)
+	for (i = 0; snmp4_ipstats_list[i].name; i++)
 		seq_printf(seq, " %llu", buff64[i]);
 
 	return 0;
@@ -415,13 +415,13 @@ static int snmp_seq_show_tcp_udp(struct seq_file *seq, void *v)
 	memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
 
 	seq_puts(seq, "\nTcp:");
-	for (i = 0; snmp4_tcp_list[i].name != NULL; i++)
+	for (i = 0; snmp4_tcp_list[i].name; i++)
 		seq_printf(seq, " %s", snmp4_tcp_list[i].name);
 
 	seq_puts(seq, "\nTcp:");
 	snmp_get_cpu_field_batch(buff, snmp4_tcp_list,
 				 net->mib.tcp_statistics);
-	for (i = 0; snmp4_tcp_list[i].name != NULL; i++) {
+	for (i = 0; snmp4_tcp_list[i].name; i++) {
 		/* MaxConn field is signed, RFC 2012 */
 		if (snmp4_tcp_list[i].entry == TCP_MIB_MAXCONN)
 			seq_printf(seq, " %ld", buff[i]);
@@ -434,10 +434,10 @@ static int snmp_seq_show_tcp_udp(struct seq_file *seq, void *v)
 	snmp_get_cpu_field_batch(buff, snmp4_udp_list,
 				 net->mib.udp_statistics);
 	seq_puts(seq, "\nUdp:");
-	for (i = 0; snmp4_udp_list[i].name != NULL; i++)
+	for (i = 0; snmp4_udp_list[i].name; i++)
 		seq_printf(seq, " %s", snmp4_udp_list[i].name);
 	seq_puts(seq, "\nUdp:");
-	for (i = 0; snmp4_udp_list[i].name != NULL; i++)
+	for (i = 0; snmp4_udp_list[i].name; i++)
 		seq_printf(seq, " %lu", buff[i]);
 
 	memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
@@ -446,10 +446,10 @@ static int snmp_seq_show_tcp_udp(struct seq_file *seq, void *v)
 	seq_puts(seq, "\nUdpLite:");
 	snmp_get_cpu_field_batch(buff, snmp4_udp_list,
 				 net->mib.udplite_statistics);
-	for (i = 0; snmp4_udp_list[i].name != NULL; i++)
+	for (i = 0; snmp4_udp_list[i].name; i++)
 		seq_printf(seq, " %s", snmp4_udp_list[i].name);
 	seq_puts(seq, "\nUdpLite:");
-	for (i = 0; snmp4_udp_list[i].name != NULL; i++)
+	for (i = 0; snmp4_udp_list[i].name; i++)
 		seq_printf(seq, " %lu", buff[i]);
 
 	seq_putc(seq, '\n');
@@ -493,13 +493,13 @@ static int netstat_seq_show_tcpext(struct seq_file *seq, void *v)
 	memset(buff, 0, sizeof(unsigned long) * LINUX_MIB_MAX);
 
 	seq_puts(seq, "TcpExt:");
-	for (i = 0; snmp4_net_list[i].name != NULL; i++)
+	for (i = 0; snmp4_net_list[i].name; i++)
 		seq_printf(seq, " %s", snmp4_net_list[i].name);
 
 	seq_puts(seq, "\nTcpExt:");
 	snmp_get_cpu_field_batch(buff, snmp4_net_list,
 				 net->mib.net_statistics);
-	for (i = 0; snmp4_net_list[i].name != NULL; i++)
+	for (i = 0; snmp4_net_list[i].name; i++)
 		seq_printf(seq, " %lu", buff[i]);
 
 	return 0;
@@ -512,14 +512,14 @@ static int netstat_seq_show_ipext(struct seq_file *seq, void *v)
 	struct net *net = seq->private;
 
 	seq_puts(seq, "\nIpExt:");
-	for (i = 0; snmp4_ipextstats_list[i].name != NULL; i++)
+	for (i = 0; snmp4_ipextstats_list[i].name; i++)
 		seq_printf(seq, " %s", snmp4_ipextstats_list[i].name);
 
 	seq_puts(seq, "\nIpExt:");
 	snmp_get_cpu_field64_batch(buff64, snmp4_ipextstats_list,
 				   net->mib.ip_statistics,
 				   offsetof(struct ipstats_mib, syncp));
-	for (i = 0; snmp4_ipextstats_list[i].name != NULL; i++)
+	for (i = 0; snmp4_ipextstats_list[i].name; i++)
 		seq_printf(seq, " %llu", buff64[i]);
 
 	seq_putc(seq, '\n');
diff --git a/net/sctp/proc.c b/net/sctp/proc.c
index 0487c01..f596407 100644
--- a/net/sctp/proc.c
+++ b/net/sctp/proc.c
@@ -81,7 +81,7 @@ static int sctp_snmp_seq_show(struct seq_file *seq, void *v)
 
 	snmp_get_cpu_field_batch(buff, sctp_snmp_list,
 				 net->sctp.sctp_statistics);
-	for (i = 0; sctp_snmp_list[i].name != NULL; i++)
+	for (i = 0; sctp_snmp_list[i].name; i++)
 		seq_printf(seq, "%-32s\t%ld\n", sctp_snmp_list[i].name,
 						buff[i]);
 
-- 
1.8.3.1


* Re: [RFC PATCH v3 2/7] proc: Reduce cache miss in {snmp,netstat}_seq_show
  2016-09-09  6:33 ` [RFC PATCH v3 2/7] proc: Reduce cache miss in {snmp,netstat}_seq_show Jia He
@ 2016-09-12 18:57   ` Marcelo
  2016-09-14  3:10     ` hejianet
  2016-09-14  5:58     ` hejianet
  0 siblings, 2 replies; 17+ messages in thread
From: Marcelo @ 2016-09-12 18:57 UTC (permalink / raw)
  To: Jia He
  Cc: netdev, linux-sctp, linux-kernel, davem, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy, Vlad Yasevich,
	Neil Horman, Steffen Klassert, Herbert Xu

On Fri, Sep 09, 2016 at 02:33:57PM +0800, Jia He wrote:
> This is to use the generic interface snmp_get_cpu_field{,64}_batch to 
> aggregate the data by going through all the items of each cpu sequentially.
> Then snmp_seq_show and netstat_seq_show are split into 2 parts to avoid build
> warning "the frame size" larger than 1024 on s390.

Yeah about that, did you test it with stack overflow detection?
These arrays can be quite large.

One more below..

> 
> Signed-off-by: Jia He <hejianet@gmail.com>
> ---
>  net/ipv4/proc.c | 106 +++++++++++++++++++++++++++++++++++++++-----------------
>  1 file changed, 74 insertions(+), 32 deletions(-)
> 
> diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
> index 9f665b6..c6fc80e 100644
> --- a/net/ipv4/proc.c
> +++ b/net/ipv4/proc.c
> @@ -46,6 +46,8 @@
>  #include <net/sock.h>
>  #include <net/raw.h>
>  
> +#define TCPUDP_MIB_MAX max_t(u32, UDP_MIB_MAX, TCP_MIB_MAX)
> +
>  /*
>   *	Report socket allocation statistics [mea@utu.fi]
>   */
> @@ -378,13 +380,15 @@ static void icmp_put(struct seq_file *seq)
>  /*
>   *	Called from the PROCfs module. This outputs /proc/net/snmp.
>   */
> -static int snmp_seq_show(struct seq_file *seq, void *v)
> +static int snmp_seq_show_ipstats(struct seq_file *seq, void *v)
>  {
>  	int i;
> +	u64 buff64[IPSTATS_MIB_MAX];
>  	struct net *net = seq->private;
>  
> -	seq_puts(seq, "Ip: Forwarding DefaultTTL");
> +	memset(buff64, 0, IPSTATS_MIB_MAX * sizeof(u64));
>  
> +	seq_puts(seq, "Ip: Forwarding DefaultTTL");
>  	for (i = 0; snmp4_ipstats_list[i].name != NULL; i++)
>  		seq_printf(seq, " %s", snmp4_ipstats_list[i].name);
>  
> @@ -393,57 +397,77 @@ static int snmp_seq_show(struct seq_file *seq, void *v)
>  		   net->ipv4.sysctl_ip_default_ttl);
>  
>  	BUILD_BUG_ON(offsetof(struct ipstats_mib, mibs) != 0);
> +	snmp_get_cpu_field64_batch(buff64, snmp4_ipstats_list,
> +				   net->mib.ip_statistics,
> +				   offsetof(struct ipstats_mib, syncp));
>  	for (i = 0; snmp4_ipstats_list[i].name != NULL; i++)
> -		seq_printf(seq, " %llu",
> -			   snmp_fold_field64(net->mib.ip_statistics,
> -					     snmp4_ipstats_list[i].entry,
> -					     offsetof(struct ipstats_mib, syncp)));
> +		seq_printf(seq, " %llu", buff64[i]);
>  
> -	icmp_put(seq);	/* RFC 2011 compatibility */
> -	icmpmsg_put(seq);
> +	return 0;
> +}
> +
> +static int snmp_seq_show_tcp_udp(struct seq_file *seq, void *v)
> +{
> +	int i;
> +	unsigned long buff[TCPUDP_MIB_MAX];
> +	struct net *net = seq->private;
> +
> +	memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
>  
>  	seq_puts(seq, "\nTcp:");
>  	for (i = 0; snmp4_tcp_list[i].name != NULL; i++)
>  		seq_printf(seq, " %s", snmp4_tcp_list[i].name);
>  
>  	seq_puts(seq, "\nTcp:");
> +	snmp_get_cpu_field_batch(buff, snmp4_tcp_list,
> +				 net->mib.tcp_statistics);
>  	for (i = 0; snmp4_tcp_list[i].name != NULL; i++) {
>  		/* MaxConn field is signed, RFC 2012 */
>  		if (snmp4_tcp_list[i].entry == TCP_MIB_MAXCONN)
> -			seq_printf(seq, " %ld",
> -				   snmp_fold_field(net->mib.tcp_statistics,
> -						   snmp4_tcp_list[i].entry));
> +			seq_printf(seq, " %ld", buff[i]);
>  		else
> -			seq_printf(seq, " %lu",
> -				   snmp_fold_field(net->mib.tcp_statistics,
> -						   snmp4_tcp_list[i].entry));
> +			seq_printf(seq, " %lu", buff[i]);
>  	}
>  
> +	memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
> +
> +	snmp_get_cpu_field_batch(buff, snmp4_udp_list,
> +				 net->mib.udp_statistics);
>  	seq_puts(seq, "\nUdp:");
>  	for (i = 0; snmp4_udp_list[i].name != NULL; i++)
>  		seq_printf(seq, " %s", snmp4_udp_list[i].name);
> -
>  	seq_puts(seq, "\nUdp:");
>  	for (i = 0; snmp4_udp_list[i].name != NULL; i++)
> -		seq_printf(seq, " %lu",
> -			   snmp_fold_field(net->mib.udp_statistics,
> -					   snmp4_udp_list[i].entry));
> +		seq_printf(seq, " %lu", buff[i]);
> +
> +	memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
>  
>  	/* the UDP and UDP-Lite MIBs are the same */
>  	seq_puts(seq, "\nUdpLite:");
> +	snmp_get_cpu_field_batch(buff, snmp4_udp_list,
> +				 net->mib.udplite_statistics);
>  	for (i = 0; snmp4_udp_list[i].name != NULL; i++)
>  		seq_printf(seq, " %s", snmp4_udp_list[i].name);
> -
>  	seq_puts(seq, "\nUdpLite:");
>  	for (i = 0; snmp4_udp_list[i].name != NULL; i++)
> -		seq_printf(seq, " %lu",
> -			   snmp_fold_field(net->mib.udplite_statistics,
> -					   snmp4_udp_list[i].entry));
> +		seq_printf(seq, " %lu", buff[i]);
>  
>  	seq_putc(seq, '\n');
>  	return 0;
>  }
>  
> +static int snmp_seq_show(struct seq_file *seq, void *v)
> +{
> +	snmp_seq_show_ipstats(seq, v);
> +
> +	icmp_put(seq);	/* RFC 2011 compatibility */
> +	icmpmsg_put(seq);
> +
> +	snmp_seq_show_tcp_udp(seq, v);
> +
> +	return 0;
> +}
> +
>  static int snmp_seq_open(struct inode *inode, struct file *file)
>  {
>  	return single_open_net(inode, file, snmp_seq_show);
> @@ -457,41 +481,59 @@ static const struct file_operations snmp_seq_fops = {
>  	.release = single_release_net,
>  };
>  
> -
> -
>  /*
>   *	Output /proc/net/netstat
>   */
> -static int netstat_seq_show(struct seq_file *seq, void *v)
> +static int netstat_seq_show_tcpext(struct seq_file *seq, void *v)
>  {
>  	int i;
> +	unsigned long buff[LINUX_MIB_MAX];
>  	struct net *net = seq->private;
>  
> +	memset(buff, 0, sizeof(unsigned long) * LINUX_MIB_MAX);
> +
>  	seq_puts(seq, "TcpExt:");
>  	for (i = 0; snmp4_net_list[i].name != NULL; i++)
>  		seq_printf(seq, " %s", snmp4_net_list[i].name);
>  
>  	seq_puts(seq, "\nTcpExt:");
> +	snmp_get_cpu_field_batch(buff, snmp4_net_list,
> +				 net->mib.net_statistics);
>  	for (i = 0; snmp4_net_list[i].name != NULL; i++)
> -		seq_printf(seq, " %lu",
> -			   snmp_fold_field(net->mib.net_statistics,
> -					   snmp4_net_list[i].entry));
> +		seq_printf(seq, " %lu", buff[i]);
> +
> +	return 0;
> +}
> +
> +static int netstat_seq_show_ipext(struct seq_file *seq, void *v)
> +{
> +	int i;
> +	u64 buff64[IPSTATS_MIB_MAX];
> +	struct net *net = seq->private;
>  
>  	seq_puts(seq, "\nIpExt:");
>  	for (i = 0; snmp4_ipextstats_list[i].name != NULL; i++)
>  		seq_printf(seq, " %s", snmp4_ipextstats_list[i].name);
>  
>  	seq_puts(seq, "\nIpExt:");

You're missing a memset() call here.

> +	snmp_get_cpu_field64_batch(buff64, snmp4_ipextstats_list,
> +				   net->mib.ip_statistics,
> +				   offsetof(struct ipstats_mib, syncp));
>  	for (i = 0; snmp4_ipextstats_list[i].name != NULL; i++)
> -		seq_printf(seq, " %llu",
> -			   snmp_fold_field64(net->mib.ip_statistics,
> -					     snmp4_ipextstats_list[i].entry,
> -					     offsetof(struct ipstats_mib, syncp)));
> +		seq_printf(seq, " %llu", buff64[i]);
>  
>  	seq_putc(seq, '\n');
>  	return 0;
>  }
>  
> +static int netstat_seq_show(struct seq_file *seq, void *v)
> +{
> +	netstat_seq_show_tcpext(seq, v);
> +	netstat_seq_show_ipext(seq, v);
> +
> +	return 0;
> +}
> +
>  static int netstat_seq_open(struct inode *inode, struct file *file)
>  {
>  	return single_open_net(inode, file, netstat_seq_show);
> -- 
> 1.8.3.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH v3 3/7] proc: Reduce cache miss in snmp6_seq_show
  2016-09-09  6:33 ` [RFC PATCH v3 3/7] proc: Reduce cache miss in snmp6_seq_show Jia He
@ 2016-09-12 19:05   ` Marcelo
  2016-09-14  3:11     ` hejianet
  0 siblings, 1 reply; 17+ messages in thread
From: Marcelo @ 2016-09-12 19:05 UTC (permalink / raw)
  To: Jia He
  Cc: netdev, linux-sctp, linux-kernel, davem, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy, Vlad Yasevich,
	Neil Horman, Steffen Klassert, Herbert Xu

On Fri, Sep 09, 2016 at 02:33:58PM +0800, Jia He wrote:
> This is to use the generic interface snmp_get_cpu_field{,64}_batch to 
> aggregate the data by going through all the items of each cpu sequentially.
> 
> Signed-off-by: Jia He <hejianet@gmail.com>
> ---
>  net/ipv6/proc.c | 32 +++++++++++++++++++++++---------
>  1 file changed, 23 insertions(+), 9 deletions(-)
> 
> diff --git a/net/ipv6/proc.c b/net/ipv6/proc.c
> index 679253d0..50ba2c3 100644
> --- a/net/ipv6/proc.c
> +++ b/net/ipv6/proc.c
> @@ -30,6 +30,11 @@
>  #include <net/transp_v6.h>
>  #include <net/ipv6.h>
>  
> +#define MAX4(a, b, c, d) \
> +	max_t(u32, max_t(u32, a, b), max_t(u32, c, d))
> +#define SNMP_MIB_MAX MAX4(UDP_MIB_MAX, TCP_MIB_MAX, \
> +			IPSTATS_MIB_MAX, ICMP_MIB_MAX)
> +
>  static int sockstat6_seq_show(struct seq_file *seq, void *v)
>  {
>  	struct net *net = seq->private;
> @@ -192,13 +197,19 @@ static void snmp6_seq_show_item(struct seq_file *seq, void __percpu *pcpumib,
>  				const struct snmp_mib *itemlist)
>  {
>  	int i;
> -	unsigned long val;
> -
> -	for (i = 0; itemlist[i].name; i++) {
> -		val = pcpumib ?
> -			snmp_fold_field(pcpumib, itemlist[i].entry) :
> -			atomic_long_read(smib + itemlist[i].entry);
> -		seq_printf(seq, "%-32s\t%lu\n", itemlist[i].name, val);
> +	unsigned long buff[SNMP_MIB_MAX];
> +
> +	memset(buff, 0, sizeof(unsigned long) * SNMP_MIB_MAX);

This memset() could be moved...

> +
> +	if (pcpumib) {

... here, so it's not executed if it hits the else block.

> +		snmp_get_cpu_field_batch(buff, itemlist, pcpumib);
> +		for (i = 0; itemlist[i].name; i++)
> +			seq_printf(seq, "%-32s\t%lu\n",
> +				   itemlist[i].name, buff[i]);
> +	} else {
> +		for (i = 0; itemlist[i].name; i++)
> +			seq_printf(seq, "%-32s\t%lu\n", itemlist[i].name,
> +				   atomic_long_read(smib + itemlist[i].entry));
>  	}
>  }
>  
> @@ -206,10 +217,13 @@ static void snmp6_seq_show_item64(struct seq_file *seq, void __percpu *mib,
>  				  const struct snmp_mib *itemlist, size_t syncpoff)
>  {
>  	int i;
> +	u64 buff64[SNMP_MIB_MAX];
> +
> +	memset(buff64, 0, sizeof(unsigned long) * SNMP_MIB_MAX);
>  
> +	snmp_get_cpu_field64_batch(buff64, itemlist, mib, syncpoff);
>  	for (i = 0; itemlist[i].name; i++)
> -		seq_printf(seq, "%-32s\t%llu\n", itemlist[i].name,
> -			   snmp_fold_field64(mib, itemlist[i].entry, syncpoff));
> +		seq_printf(seq, "%-32s\t%llu\n", itemlist[i].name, buff64[i]);
>  }
>  
>  static int snmp6_seq_show(struct seq_file *seq, void *v)
> -- 
> 1.8.3.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH v3 2/7] proc: Reduce cache miss in {snmp,netstat}_seq_show
  2016-09-12 18:57   ` Marcelo
@ 2016-09-14  3:10     ` hejianet
  2016-09-14  5:58     ` hejianet
  1 sibling, 0 replies; 17+ messages in thread
From: hejianet @ 2016-09-14  3:10 UTC (permalink / raw)
  To: Marcelo
  Cc: netdev, linux-sctp, linux-kernel, davem, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy, Vlad Yasevich,
	Neil Horman, Steffen Klassert, Herbert Xu

Hi Marcelo


On 9/13/16 2:57 AM, Marcelo wrote:
> On Fri, Sep 09, 2016 at 02:33:57PM +0800, Jia He wrote:
>> This is to use the generic interface snmp_get_cpu_field{,64}_batch to
>> aggregate the data by going through all the items of each cpu sequentially.
>> Then snmp_seq_show and netstat_seq_show are split into 2 parts to avoid build
>> warning "the frame size" larger than 1024 on s390.
> Yeah about that, did you test it with stack overflow detection?
> These arrays can be quite large.
>
> One more below..
I found that scripts/checkstack.pl can analyze the stack usage statically.
[root@tian-lp1 kernel]# objdump -d vmlinux | scripts/checkstack.pl ppc64|grep seq
0xc0000000007d4b18 netstat_seq_show_tcpext.isra.7 [vmlinux]:1120
0xc0000000007ccbe8 fib_triestat_seq_show [vmlinux]:     496
0xc00000000083e7a4 tcp6_seq_show [vmlinux]:             480
0xc0000000007d4908 snmp_seq_show_ipstats.isra.6 [vmlinux]:464
0xc0000000007d4d18 netstat_seq_show_ipext.isra.8 [vmlinux]:464
0xc0000000006f5bd8 proto_seq_show [vmlinux]:            416
0xc0000000007f5718 xfrm_statistics_seq_show [vmlinux]:  416
0xc0000000007405b4 dev_seq_printf_stats [vmlinux]:      400

It seems the stack usage in netstat_seq_show_tcpext is too big.
I will consider how to reduce it.

B.R.
Jia
>> Signed-off-by: Jia He <hejianet@gmail.com>
>> ---
>>   net/ipv4/proc.c | 106 +++++++++++++++++++++++++++++++++++++++-----------------
>>   1 file changed, 74 insertions(+), 32 deletions(-)
>>
>> diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
>> index 9f665b6..c6fc80e 100644
>> --- a/net/ipv4/proc.c
>> +++ b/net/ipv4/proc.c
>> @@ -46,6 +46,8 @@
>>   #include <net/sock.h>
>>   #include <net/raw.h>
>>   
>> +#define TCPUDP_MIB_MAX max_t(u32, UDP_MIB_MAX, TCP_MIB_MAX)
>> +
>>   /*
>>    *	Report socket allocation statistics [mea@utu.fi]
>>    */
>> @@ -378,13 +380,15 @@ static void icmp_put(struct seq_file *seq)
>>   /*
>>    *	Called from the PROCfs module. This outputs /proc/net/snmp.
>>    */
>> -static int snmp_seq_show(struct seq_file *seq, void *v)
>> +static int snmp_seq_show_ipstats(struct seq_file *seq, void *v)
>>   {
>>   	int i;
>> +	u64 buff64[IPSTATS_MIB_MAX];
>>   	struct net *net = seq->private;
>>   
>> -	seq_puts(seq, "Ip: Forwarding DefaultTTL");
>> +	memset(buff64, 0, IPSTATS_MIB_MAX * sizeof(u64));
>>   
>> +	seq_puts(seq, "Ip: Forwarding DefaultTTL");
>>   	for (i = 0; snmp4_ipstats_list[i].name != NULL; i++)
>>   		seq_printf(seq, " %s", snmp4_ipstats_list[i].name);
>>   
>> @@ -393,57 +397,77 @@ static int snmp_seq_show(struct seq_file *seq, void *v)
>>   		   net->ipv4.sysctl_ip_default_ttl);
>>   
>>   	BUILD_BUG_ON(offsetof(struct ipstats_mib, mibs) != 0);
>> +	snmp_get_cpu_field64_batch(buff64, snmp4_ipstats_list,
>> +				   net->mib.ip_statistics,
>> +				   offsetof(struct ipstats_mib, syncp));
>>   	for (i = 0; snmp4_ipstats_list[i].name != NULL; i++)
>> -		seq_printf(seq, " %llu",
>> -			   snmp_fold_field64(net->mib.ip_statistics,
>> -					     snmp4_ipstats_list[i].entry,
>> -					     offsetof(struct ipstats_mib, syncp)));
>> +		seq_printf(seq, " %llu", buff64[i]);
>>   
>> -	icmp_put(seq);	/* RFC 2011 compatibility */
>> -	icmpmsg_put(seq);
>> +	return 0;
>> +}
>> +
>> +static int snmp_seq_show_tcp_udp(struct seq_file *seq, void *v)
>> +{
>> +	int i;
>> +	unsigned long buff[TCPUDP_MIB_MAX];
>> +	struct net *net = seq->private;
>> +
>> +	memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
>>   
>>   	seq_puts(seq, "\nTcp:");
>>   	for (i = 0; snmp4_tcp_list[i].name != NULL; i++)
>>   		seq_printf(seq, " %s", snmp4_tcp_list[i].name);
>>   
>>   	seq_puts(seq, "\nTcp:");
>> +	snmp_get_cpu_field_batch(buff, snmp4_tcp_list,
>> +				 net->mib.tcp_statistics);
>>   	for (i = 0; snmp4_tcp_list[i].name != NULL; i++) {
>>   		/* MaxConn field is signed, RFC 2012 */
>>   		if (snmp4_tcp_list[i].entry == TCP_MIB_MAXCONN)
>> -			seq_printf(seq, " %ld",
>> -				   snmp_fold_field(net->mib.tcp_statistics,
>> -						   snmp4_tcp_list[i].entry));
>> +			seq_printf(seq, " %ld", buff[i]);
>>   		else
>> -			seq_printf(seq, " %lu",
>> -				   snmp_fold_field(net->mib.tcp_statistics,
>> -						   snmp4_tcp_list[i].entry));
>> +			seq_printf(seq, " %lu", buff[i]);
>>   	}
>>   
>> +	memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
>> +
>> +	snmp_get_cpu_field_batch(buff, snmp4_udp_list,
>> +				 net->mib.udp_statistics);
>>   	seq_puts(seq, "\nUdp:");
>>   	for (i = 0; snmp4_udp_list[i].name != NULL; i++)
>>   		seq_printf(seq, " %s", snmp4_udp_list[i].name);
>> -
>>   	seq_puts(seq, "\nUdp:");
>>   	for (i = 0; snmp4_udp_list[i].name != NULL; i++)
>> -		seq_printf(seq, " %lu",
>> -			   snmp_fold_field(net->mib.udp_statistics,
>> -					   snmp4_udp_list[i].entry));
>> +		seq_printf(seq, " %lu", buff[i]);
>> +
>> +	memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
>>   
>>   	/* the UDP and UDP-Lite MIBs are the same */
>>   	seq_puts(seq, "\nUdpLite:");
>> +	snmp_get_cpu_field_batch(buff, snmp4_udp_list,
>> +				 net->mib.udplite_statistics);
>>   	for (i = 0; snmp4_udp_list[i].name != NULL; i++)
>>   		seq_printf(seq, " %s", snmp4_udp_list[i].name);
>> -
>>   	seq_puts(seq, "\nUdpLite:");
>>   	for (i = 0; snmp4_udp_list[i].name != NULL; i++)
>> -		seq_printf(seq, " %lu",
>> -			   snmp_fold_field(net->mib.udplite_statistics,
>> -					   snmp4_udp_list[i].entry));
>> +		seq_printf(seq, " %lu", buff[i]);
>>   
>>   	seq_putc(seq, '\n');
>>   	return 0;
>>   }
>>   
>> +static int snmp_seq_show(struct seq_file *seq, void *v)
>> +{
>> +	snmp_seq_show_ipstats(seq, v);
>> +
>> +	icmp_put(seq);	/* RFC 2011 compatibility */
>> +	icmpmsg_put(seq);
>> +
>> +	snmp_seq_show_tcp_udp(seq, v);
>> +
>> +	return 0;
>> +}
>> +
>>   static int snmp_seq_open(struct inode *inode, struct file *file)
>>   {
>>   	return single_open_net(inode, file, snmp_seq_show);
>> @@ -457,41 +481,59 @@ static const struct file_operations snmp_seq_fops = {
>>   	.release = single_release_net,
>>   };
>>   
>> -
>> -
>>   /*
>>    *	Output /proc/net/netstat
>>    */
>> -static int netstat_seq_show(struct seq_file *seq, void *v)
>> +static int netstat_seq_show_tcpext(struct seq_file *seq, void *v)
>>   {
>>   	int i;
>> +	unsigned long buff[LINUX_MIB_MAX];
>>   	struct net *net = seq->private;
>>   
>> +	memset(buff, 0, sizeof(unsigned long) * LINUX_MIB_MAX);
>> +
>>   	seq_puts(seq, "TcpExt:");
>>   	for (i = 0; snmp4_net_list[i].name != NULL; i++)
>>   		seq_printf(seq, " %s", snmp4_net_list[i].name);
>>   
>>   	seq_puts(seq, "\nTcpExt:");
>> +	snmp_get_cpu_field_batch(buff, snmp4_net_list,
>> +				 net->mib.net_statistics);
>>   	for (i = 0; snmp4_net_list[i].name != NULL; i++)
>> -		seq_printf(seq, " %lu",
>> -			   snmp_fold_field(net->mib.net_statistics,
>> -					   snmp4_net_list[i].entry));
>> +		seq_printf(seq, " %lu", buff[i]);
>> +
>> +	return 0;
>> +}
>> +
>> +static int netstat_seq_show_ipext(struct seq_file *seq, void *v)
>> +{
>> +	int i;
>> +	u64 buff64[IPSTATS_MIB_MAX];
>> +	struct net *net = seq->private;
>>   
>>   	seq_puts(seq, "\nIpExt:");
>>   	for (i = 0; snmp4_ipextstats_list[i].name != NULL; i++)
>>   		seq_printf(seq, " %s", snmp4_ipextstats_list[i].name);
>>   
>>   	seq_puts(seq, "\nIpExt:");
> You're missing a memset() call here.
>
>> +	snmp_get_cpu_field64_batch(buff64, snmp4_ipextstats_list,
>> +				   net->mib.ip_statistics,
>> +				   offsetof(struct ipstats_mib, syncp));
>>   	for (i = 0; snmp4_ipextstats_list[i].name != NULL; i++)
>> -		seq_printf(seq, " %llu",
>> -			   snmp_fold_field64(net->mib.ip_statistics,
>> -					     snmp4_ipextstats_list[i].entry,
>> -					     offsetof(struct ipstats_mib, syncp)));
>> +		seq_printf(seq, " %llu", buff64[i]);
>>   
>>   	seq_putc(seq, '\n');
>>   	return 0;
>>   }
>>   
>> +static int netstat_seq_show(struct seq_file *seq, void *v)
>> +{
>> +	netstat_seq_show_tcpext(seq, v);
>> +	netstat_seq_show_ipext(seq, v);
>> +
>> +	return 0;
>> +}
>> +
>>   static int netstat_seq_open(struct inode *inode, struct file *file)
>>   {
>>   	return single_open_net(inode, file, netstat_seq_show);
>> -- 
>> 1.8.3.1
>>
>>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH v3 3/7] proc: Reduce cache miss in snmp6_seq_show
  2016-09-12 19:05   ` Marcelo
@ 2016-09-14  3:11     ` hejianet
  0 siblings, 0 replies; 17+ messages in thread
From: hejianet @ 2016-09-14  3:11 UTC (permalink / raw)
  To: Marcelo
  Cc: netdev, linux-sctp, linux-kernel, davem, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy, Vlad Yasevich,
	Neil Horman, Steffen Klassert, Herbert Xu



On 9/13/16 3:05 AM, Marcelo wrote:
> On Fri, Sep 09, 2016 at 02:33:58PM +0800, Jia He wrote:
>> This is to use the generic interface snmp_get_cpu_field{,64}_batch to
>> aggregate the data by going through all the items of each cpu sequentially.
>>
>> Signed-off-by: Jia He <hejianet@gmail.com>
>> ---
>>   net/ipv6/proc.c | 32 +++++++++++++++++++++++---------
>>   1 file changed, 23 insertions(+), 9 deletions(-)
>>
>> diff --git a/net/ipv6/proc.c b/net/ipv6/proc.c
>> index 679253d0..50ba2c3 100644
>> --- a/net/ipv6/proc.c
>> +++ b/net/ipv6/proc.c
>> @@ -30,6 +30,11 @@
>>   #include <net/transp_v6.h>
>>   #include <net/ipv6.h>
>>   
>> +#define MAX4(a, b, c, d) \
>> +	max_t(u32, max_t(u32, a, b), max_t(u32, c, d))
>> +#define SNMP_MIB_MAX MAX4(UDP_MIB_MAX, TCP_MIB_MAX, \
>> +			IPSTATS_MIB_MAX, ICMP_MIB_MAX)
>> +
>>   static int sockstat6_seq_show(struct seq_file *seq, void *v)
>>   {
>>   	struct net *net = seq->private;
>> @@ -192,13 +197,19 @@ static void snmp6_seq_show_item(struct seq_file *seq, void __percpu *pcpumib,
>>   				const struct snmp_mib *itemlist)
>>   {
>>   	int i;
>> -	unsigned long val;
>> -
>> -	for (i = 0; itemlist[i].name; i++) {
>> -		val = pcpumib ?
>> -			snmp_fold_field(pcpumib, itemlist[i].entry) :
>> -			atomic_long_read(smib + itemlist[i].entry);
>> -		seq_printf(seq, "%-32s\t%lu\n", itemlist[i].name, val);
>> +	unsigned long buff[SNMP_MIB_MAX];
>> +
>> +	memset(buff, 0, sizeof(unsigned long) * SNMP_MIB_MAX);
> This memset() could be moved...
>
>> +
>> +	if (pcpumib) {
> ... here, so it's not executed if it hits the else block.
Thanks for the suggestion
B.R.
Jia
>> +		snmp_get_cpu_field_batch(buff, itemlist, pcpumib);
>> +		for (i = 0; itemlist[i].name; i++)
>> +			seq_printf(seq, "%-32s\t%lu\n",
>> +				   itemlist[i].name, buff[i]);
>> +	} else {
>> +		for (i = 0; itemlist[i].name; i++)
>> +			seq_printf(seq, "%-32s\t%lu\n", itemlist[i].name,
>> +				   atomic_long_read(smib + itemlist[i].entry));
>>   	}
>>   }
>>   
>> @@ -206,10 +217,13 @@ static void snmp6_seq_show_item64(struct seq_file *seq, void __percpu *mib,
>>   				  const struct snmp_mib *itemlist, size_t syncpoff)
>>   {
>>   	int i;
>> +	u64 buff64[SNMP_MIB_MAX];
>> +
>> +	memset(buff64, 0, sizeof(unsigned long) * SNMP_MIB_MAX);
>>   
>> +	snmp_get_cpu_field64_batch(buff64, itemlist, mib, syncpoff);
>>   	for (i = 0; itemlist[i].name; i++)
>> -		seq_printf(seq, "%-32s\t%llu\n", itemlist[i].name,
>> -			   snmp_fold_field64(mib, itemlist[i].entry, syncpoff));
>> +		seq_printf(seq, "%-32s\t%llu\n", itemlist[i].name, buff64[i]);
>>   }
>>   
>>   static int snmp6_seq_show(struct seq_file *seq, void *v)
>> -- 
>> 1.8.3.1
>>
>>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH v3 2/7] proc: Reduce cache miss in {snmp,netstat}_seq_show
  2016-09-12 18:57   ` Marcelo
  2016-09-14  3:10     ` hejianet
@ 2016-09-14  5:58     ` hejianet
  2016-09-14 11:55       ` Marcelo
  1 sibling, 1 reply; 17+ messages in thread
From: hejianet @ 2016-09-14  5:58 UTC (permalink / raw)
  To: Marcelo
  Cc: netdev, linux-sctp, linux-kernel, davem, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy, Vlad Yasevich,
	Neil Horman, Steffen Klassert, Herbert Xu

Hi Marcelo


On 9/13/16 2:57 AM, Marcelo wrote:
> On Fri, Sep 09, 2016 at 02:33:57PM +0800, Jia He wrote:
>> This is to use the generic interface snmp_get_cpu_field{,64}_batch to
>> aggregate the data by going through all the items of each cpu sequentially.
>> Then snmp_seq_show and netstat_seq_show are split into 2 parts to avoid build
>> warning "the frame size" larger than 1024 on s390.
> Yeah about that, did you test it with stack overflow detection?
> These arrays can be quite large.
>
> One more below..
Do you think it is acceptable if the stack usage is a little larger than 1024
bytes, e.g. 1120?
I can't find any other way to reduce the stack usage except putting "static" before
unsigned long buff[TCP_MIB_MAX]

PS. sizeof buff is about TCP_MIB_MAX(116)*8=928
B.R.
>> Signed-off-by: Jia He <hejianet@gmail.com>
>> ---
>>   net/ipv4/proc.c | 106 +++++++++++++++++++++++++++++++++++++++-----------------
>>   1 file changed, 74 insertions(+), 32 deletions(-)
>>
>> diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
>> index 9f665b6..c6fc80e 100644
>> --- a/net/ipv4/proc.c
>> +++ b/net/ipv4/proc.c
>> @@ -46,6 +46,8 @@
>>   #include <net/sock.h>
>>   #include <net/raw.h>
>>   
>> +#define TCPUDP_MIB_MAX max_t(u32, UDP_MIB_MAX, TCP_MIB_MAX)
>> +
>>   /*
>>    *	Report socket allocation statistics [mea@utu.fi]
>>    */
>> @@ -378,13 +380,15 @@ static void icmp_put(struct seq_file *seq)
>>   /*
>>    *	Called from the PROCfs module. This outputs /proc/net/snmp.
>>    */
>> -static int snmp_seq_show(struct seq_file *seq, void *v)
>> +static int snmp_seq_show_ipstats(struct seq_file *seq, void *v)
>>   {
>>   	int i;
>> +	u64 buff64[IPSTATS_MIB_MAX];
>>   	struct net *net = seq->private;
>>   
>> -	seq_puts(seq, "Ip: Forwarding DefaultTTL");
>> +	memset(buff64, 0, IPSTATS_MIB_MAX * sizeof(u64));
>>   
>> +	seq_puts(seq, "Ip: Forwarding DefaultTTL");
>>   	for (i = 0; snmp4_ipstats_list[i].name != NULL; i++)
>>   		seq_printf(seq, " %s", snmp4_ipstats_list[i].name);
>>   
>> @@ -393,57 +397,77 @@ static int snmp_seq_show(struct seq_file *seq, void *v)
>>   		   net->ipv4.sysctl_ip_default_ttl);
>>   
>>   	BUILD_BUG_ON(offsetof(struct ipstats_mib, mibs) != 0);
>> +	snmp_get_cpu_field64_batch(buff64, snmp4_ipstats_list,
>> +				   net->mib.ip_statistics,
>> +				   offsetof(struct ipstats_mib, syncp));
>>   	for (i = 0; snmp4_ipstats_list[i].name != NULL; i++)
>> -		seq_printf(seq, " %llu",
>> -			   snmp_fold_field64(net->mib.ip_statistics,
>> -					     snmp4_ipstats_list[i].entry,
>> -					     offsetof(struct ipstats_mib, syncp)));
>> +		seq_printf(seq, " %llu", buff64[i]);
>>   
>> -	icmp_put(seq);	/* RFC 2011 compatibility */
>> -	icmpmsg_put(seq);
>> +	return 0;
>> +}
>> +
>> +static int snmp_seq_show_tcp_udp(struct seq_file *seq, void *v)
>> +{
>> +	int i;
>> +	unsigned long buff[TCPUDP_MIB_MAX];
>> +	struct net *net = seq->private;
>> +
>> +	memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
>>   
>>   	seq_puts(seq, "\nTcp:");
>>   	for (i = 0; snmp4_tcp_list[i].name != NULL; i++)
>>   		seq_printf(seq, " %s", snmp4_tcp_list[i].name);
>>   
>>   	seq_puts(seq, "\nTcp:");
>> +	snmp_get_cpu_field_batch(buff, snmp4_tcp_list,
>> +				 net->mib.tcp_statistics);
>>   	for (i = 0; snmp4_tcp_list[i].name != NULL; i++) {
>>   		/* MaxConn field is signed, RFC 2012 */
>>   		if (snmp4_tcp_list[i].entry == TCP_MIB_MAXCONN)
>> -			seq_printf(seq, " %ld",
>> -				   snmp_fold_field(net->mib.tcp_statistics,
>> -						   snmp4_tcp_list[i].entry));
>> +			seq_printf(seq, " %ld", buff[i]);
>>   		else
>> -			seq_printf(seq, " %lu",
>> -				   snmp_fold_field(net->mib.tcp_statistics,
>> -						   snmp4_tcp_list[i].entry));
>> +			seq_printf(seq, " %lu", buff[i]);
>>   	}
>>   
>> +	memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
>> +
>> +	snmp_get_cpu_field_batch(buff, snmp4_udp_list,
>> +				 net->mib.udp_statistics);
>>   	seq_puts(seq, "\nUdp:");
>>   	for (i = 0; snmp4_udp_list[i].name != NULL; i++)
>>   		seq_printf(seq, " %s", snmp4_udp_list[i].name);
>> -
>>   	seq_puts(seq, "\nUdp:");
>>   	for (i = 0; snmp4_udp_list[i].name != NULL; i++)
>> -		seq_printf(seq, " %lu",
>> -			   snmp_fold_field(net->mib.udp_statistics,
>> -					   snmp4_udp_list[i].entry));
>> +		seq_printf(seq, " %lu", buff[i]);
>> +
>> +	memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
>>   
>>   	/* the UDP and UDP-Lite MIBs are the same */
>>   	seq_puts(seq, "\nUdpLite:");
>> +	snmp_get_cpu_field_batch(buff, snmp4_udp_list,
>> +				 net->mib.udplite_statistics);
>>   	for (i = 0; snmp4_udp_list[i].name != NULL; i++)
>>   		seq_printf(seq, " %s", snmp4_udp_list[i].name);
>> -
>>   	seq_puts(seq, "\nUdpLite:");
>>   	for (i = 0; snmp4_udp_list[i].name != NULL; i++)
>> -		seq_printf(seq, " %lu",
>> -			   snmp_fold_field(net->mib.udplite_statistics,
>> -					   snmp4_udp_list[i].entry));
>> +		seq_printf(seq, " %lu", buff[i]);
>>   
>>   	seq_putc(seq, '\n');
>>   	return 0;
>>   }
>>   
>> +static int snmp_seq_show(struct seq_file *seq, void *v)
>> +{
>> +	snmp_seq_show_ipstats(seq, v);
>> +
>> +	icmp_put(seq);	/* RFC 2011 compatibility */
>> +	icmpmsg_put(seq);
>> +
>> +	snmp_seq_show_tcp_udp(seq, v);
>> +
>> +	return 0;
>> +}
>> +
>>   static int snmp_seq_open(struct inode *inode, struct file *file)
>>   {
>>   	return single_open_net(inode, file, snmp_seq_show);
>> @@ -457,41 +481,59 @@ static const struct file_operations snmp_seq_fops = {
>>   	.release = single_release_net,
>>   };
>>   
>> -
>> -
>>   /*
>>    *	Output /proc/net/netstat
>>    */
>> -static int netstat_seq_show(struct seq_file *seq, void *v)
>> +static int netstat_seq_show_tcpext(struct seq_file *seq, void *v)
>>   {
>>   	int i;
>> +	unsigned long buff[LINUX_MIB_MAX];
>>   	struct net *net = seq->private;
>>   
>> +	memset(buff, 0, sizeof(unsigned long) * LINUX_MIB_MAX);
>> +
>>   	seq_puts(seq, "TcpExt:");
>>   	for (i = 0; snmp4_net_list[i].name != NULL; i++)
>>   		seq_printf(seq, " %s", snmp4_net_list[i].name);
>>   
>>   	seq_puts(seq, "\nTcpExt:");
>> +	snmp_get_cpu_field_batch(buff, snmp4_net_list,
>> +				 net->mib.net_statistics);
>>   	for (i = 0; snmp4_net_list[i].name != NULL; i++)
>> -		seq_printf(seq, " %lu",
>> -			   snmp_fold_field(net->mib.net_statistics,
>> -					   snmp4_net_list[i].entry));
>> +		seq_printf(seq, " %lu", buff[i]);
>> +
>> +	return 0;
>> +}
>> +
>> +static int netstat_seq_show_ipext(struct seq_file *seq, void *v)
>> +{
>> +	int i;
>> +	u64 buff64[IPSTATS_MIB_MAX];
>> +	struct net *net = seq->private;
>>   
>>   	seq_puts(seq, "\nIpExt:");
>>   	for (i = 0; snmp4_ipextstats_list[i].name != NULL; i++)
>>   		seq_printf(seq, " %s", snmp4_ipextstats_list[i].name);
>>   
>>   	seq_puts(seq, "\nIpExt:");
> You're missing a memset() call here.
>
>> +	snmp_get_cpu_field64_batch(buff64, snmp4_ipextstats_list,
>> +				   net->mib.ip_statistics,
>> +				   offsetof(struct ipstats_mib, syncp));
>>   	for (i = 0; snmp4_ipextstats_list[i].name != NULL; i++)
>> -		seq_printf(seq, " %llu",
>> -			   snmp_fold_field64(net->mib.ip_statistics,
>> -					     snmp4_ipextstats_list[i].entry,
>> -					     offsetof(struct ipstats_mib, syncp)));
>> +		seq_printf(seq, " %llu", buff64[i]);
>>   
>>   	seq_putc(seq, '\n');
>>   	return 0;
>>   }
>>   
>> +static int netstat_seq_show(struct seq_file *seq, void *v)
>> +{
>> +	netstat_seq_show_tcpext(seq, v);
>> +	netstat_seq_show_ipext(seq, v);
>> +
>> +	return 0;
>> +}
>> +
>>   static int netstat_seq_open(struct inode *inode, struct file *file)
>>   {
>>   	return single_open_net(inode, file, netstat_seq_show);
>> -- 
>> 1.8.3.1
>>
>>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH v3 2/7] proc: Reduce cache miss in {snmp,netstat}_seq_show
  2016-09-14  5:58     ` hejianet
@ 2016-09-14 11:55       ` Marcelo
  2016-09-21 16:18         ` hejianet
  0 siblings, 1 reply; 17+ messages in thread
From: Marcelo @ 2016-09-14 11:55 UTC (permalink / raw)
  To: hejianet
  Cc: netdev, linux-sctp, linux-kernel, davem, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy, Vlad Yasevich,
	Neil Horman, Steffen Klassert, Herbert Xu

Hi Jia,

On Wed, Sep 14, 2016 at 01:58:42PM +0800, hejianet wrote:
> Hi Marcelo
> 
> 
> On 9/13/16 2:57 AM, Marcelo wrote:
> > On Fri, Sep 09, 2016 at 02:33:57PM +0800, Jia He wrote:
> > > This is to use the generic interface snmp_get_cpu_field{,64}_batch to
> > > aggregate the data by going through all the items of each cpu sequentially.
> > > Then snmp_seq_show and netstat_seq_show are split into 2 parts to avoid build
> > > warning "the frame size" larger than 1024 on s390.
> > Yeah about that, did you test it with stack overflow detection?
> > These arrays can be quite large.
> > 
> > One more below..
> Do you think it is acceptable if the stack usage is a little larger than 1024?
> e.g. 1120
> I can't find any other way to reduce the stack usage except use "static" before
> unsigned long buff[TCP_MIB_MAX]
> 
> PS. sizeof buff is about TCP_MIB_MAX(116)*8=928
> B.R.

That's pretty much the question. Linux has the option on some archs to
run with 4Kb stacks (the 4KSTACKS option), so this function alone would
be using 25% of the stack in that case, while x86_64 uses 16Kb
(6538b8ea886e ("x86_64: expand kernel stack to 16K")).

Adding static to it is not an option as it actually makes the variable
shared amongst the CPUs (and then you have concurrency issues), plus the
fact that it's always allocated, even while not in use.

Others here certainly know better than me whether it's okay to make such
usage of the stack.
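[The sharing problem with a static buffer can be shown deterministically in userspace C; the "concurrent" second reader is simulated here by a second sequential call, and all names are illustrative:]

```c
#include <assert.h>
#include <string.h>

#define NFIELDS 4

/* one shared "static" buffer: a single copy for all callers */
unsigned long shared_buff[NFIELDS];

/* fill a buffer with a reader-specific tag, standing in for one
 * reader's aggregation pass over the per-cpu counters */
void aggregate_into(unsigned long *buff, unsigned long tag)
{
	int i;

	memset(buff, 0, NFIELDS * sizeof(unsigned long));
	for (i = 0; i < NFIELDS; i++)
		buff[i] = tag;
}

/* with a static buffer, a second reader that starts before the first
 * one finishes printing overwrites the first reader's results */
unsigned long static_buffer_result(void)
{
	aggregate_into(shared_buff, 1);   /* reader A aggregates */
	aggregate_into(shared_buff, 2);   /* reader B runs "concurrently" */
	return shared_buff[0];            /* A now prints B's data */
}

/* with per-call stack buffers, each reader keeps its own copy */
unsigned long stack_buffer_result(void)
{
	unsigned long a[NFIELDS], b[NFIELDS];

	aggregate_into(a, 1);
	aggregate_into(b, 2);
	return a[0];                      /* A's data is intact */
}
```

[In the kernel the interleaving would come from two processes reading /proc/net/snmp at once, and the fix would need locking around the static buffer, which defeats the point of a cheap read path.]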

> > > +static int netstat_seq_show_ipext(struct seq_file *seq, void *v)
> > > +{
> > > +	int i;
> > > +	u64 buff64[IPSTATS_MIB_MAX];
> > > +	struct net *net = seq->private;
> > >   	seq_puts(seq, "\nIpExt:");
> > >   	for (i = 0; snmp4_ipextstats_list[i].name != NULL; i++)
> > >   		seq_printf(seq, " %s", snmp4_ipextstats_list[i].name);
> > >   	seq_puts(seq, "\nIpExt:");
> > You're missing a memset() call here.

Not sure if you missed this one or not..

Thanks,
Marcelo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH v3 2/7] proc: Reduce cache miss in {snmp,netstat}_seq_show
  2016-09-14 11:55       ` Marcelo
@ 2016-09-21 16:18         ` hejianet
  2016-09-21 18:24           ` Marcelo
  0 siblings, 1 reply; 17+ messages in thread
From: hejianet @ 2016-09-21 16:18 UTC (permalink / raw)
  To: Marcelo
  Cc: netdev, linux-sctp, linux-kernel, davem, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy, Vlad Yasevich,
	Neil Horman, Steffen Klassert, Herbert Xu

Hi Marcelo

sorry for the late reply, I just came back from a vacation.

On 9/14/16 7:55 PM, Marcelo wrote:
> Hi Jia,
>
> On Wed, Sep 14, 2016 at 01:58:42PM +0800, hejianet wrote:
>> Hi Marcelo
>>
>>
>> On 9/13/16 2:57 AM, Marcelo wrote:
>>> On Fri, Sep 09, 2016 at 02:33:57PM +0800, Jia He wrote:
>>>> This is to use the generic interface snmp_get_cpu_field{,64}_batch to
>>>> aggregate the data by going through all the items of each cpu sequentially.
>>>> Then snmp_seq_show and netstat_seq_show are split into 2 parts to avoid build
>>>> warning "the frame size" larger than 1024 on s390.
>>> Yeah about that, did you test it with stack overflow detection?
>>> These arrays can be quite large.
>>>
>>> One more below..
>> Do you think it is acceptable if the stack usage is a little larger than 1024?
>> e.g. 1120
>> I can't find any other way to reduce the stack usage except use "static" before
>> unsigned long buff[TCP_MIB_MAX]
>>
>> PS. sizeof buff is about TCP_MIB_MAX(116)*8=928
>> B.R.
> That's pretty much the question. Linux has the option on some archs to
> run with 4Kb (4KSTACKS option), so this function alone would be using
> 25% of it in this last case. While on x86_64, it uses 16Kb (6538b8ea886e
> ("x86_64: expand kernel stack to 16K")).
>
> Adding static to it is not an option as it actually makes the variable
> shared amongst the CPUs (and then you have concurrency issues), plus the
> fact that it's always allocated, even while not in use.
>
> Others here certainly know better than me if it's okay to make such
> usage of the stack.
What about this patch instead?
It is a trade-off: I split the aggregation process into 2 parts. It will
increase cache misses a little, but it reduces the stack usage.
After this, stack usage is 672 bytes:
objdump -d vmlinux | ./scripts/checkstack.pl ppc64 | grep seq_show
0xc0000000007f7cc0 netstat_seq_show_tcpext.isra.3 [vmlinux]:672

diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index c6ee8a2..cc41590 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -486,22 +486,37 @@ static const struct file_operations snmp_seq_fops = {
   */
  static int netstat_seq_show_tcpext(struct seq_file *seq, void *v)
  {
-       int i;
-       unsigned long buff[LINUX_MIB_MAX];
+       int i, c;
+       unsigned long buff[LINUX_MIB_MAX/2 + 1];
         struct net *net = seq->private;

-       memset(buff, 0, sizeof(unsigned long) * LINUX_MIB_MAX);
+       memset(buff, 0, sizeof(unsigned long) * (LINUX_MIB_MAX/2 + 1));

         seq_puts(seq, "TcpExt:");
         for (i = 0; snmp4_net_list[i].name; i++)
                 seq_printf(seq, " %s", snmp4_net_list[i].name);

         seq_puts(seq, "\nTcpExt:");
-       snmp_get_cpu_field_batch(buff, snmp4_net_list,
-                                net->mib.net_statistics);
-       for (i = 0; snmp4_net_list[i].name; i++)
+       for_each_possible_cpu(c) {
+               for (i = 0; i < LINUX_MIB_MAX/2; i++)
+                       buff[i] += snmp_get_cpu_field(
+                                       net->mib.net_statistics,
+                                       c, snmp4_net_list[i].entry);
+       }
+       for (i = 0; i < LINUX_MIB_MAX/2; i++)
                 seq_printf(seq, " %lu", buff[i]);

+       memset(buff, 0, sizeof(unsigned long) * (LINUX_MIB_MAX/2 + 1));
+       for_each_possible_cpu(c) {
+               for (i = LINUX_MIB_MAX/2; snmp4_net_list[i].name; i++)
+                       buff[i - LINUX_MIB_MAX/2] += snmp_get_cpu_field(
+                               net->mib.net_statistics,
+                               c,
+                               snmp4_net_list[i].entry);
+       }
+        for (i = LINUX_MIB_MAX/2; snmp4_net_list[i].name; i++)
+                seq_printf(seq, " %lu", buff[i - LINUX_MIB_MAX/2]);
+
         return 0;
  }

>>>> +static int netstat_seq_show_ipext(struct seq_file *seq, void *v)
>>>> +{
>>>> +	int i;
>>>> +	u64 buff64[IPSTATS_MIB_MAX];
>>>> +	struct net *net = seq->private;
>>>>    	seq_puts(seq, "\nIpExt:");
>>>>    	for (i = 0; snmp4_ipextstats_list[i].name != NULL; i++)
>>>>    		seq_printf(seq, " %s", snmp4_ipextstats_list[i].name);
>>>>    	seq_puts(seq, "\nIpExt:");
>>> You're missing a memset() call here.
> Not sure if you missed this one or not..
indeed, thanks
B.R.
Jia
> Thanks,
> Marcelo
>


* Re: [RFC PATCH v3 2/7] proc: Reduce cache miss in {snmp,netstat}_seq_show
  2016-09-21 16:18         ` hejianet
@ 2016-09-21 18:24           ` Marcelo
  2016-09-22  5:38             ` hejianet
  0 siblings, 1 reply; 17+ messages in thread
From: Marcelo @ 2016-09-21 18:24 UTC (permalink / raw)
  To: hejianet
  Cc: netdev, linux-sctp, linux-kernel, davem, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy, Vlad Yasevich,
	Neil Horman, Steffen Klassert, Herbert Xu

On Thu, Sep 22, 2016 at 12:18:46AM +0800, hejianet wrote:
> Hi Marcelo
> 
> Sorry for the delay, I just came back from vacation.

Hi, no problem. Hope your batteries are recharged now :-)

> 
> On 9/14/16 7:55 PM, Marcelo wrote:
> > Hi Jia,
> > 
> > On Wed, Sep 14, 2016 at 01:58:42PM +0800, hejianet wrote:
> > > Hi Marcelo
> > > 
> > > 
> > > On 9/13/16 2:57 AM, Marcelo wrote:
> > > > On Fri, Sep 09, 2016 at 02:33:57PM +0800, Jia He wrote:
> > > > > This is to use the generic interface snmp_get_cpu_field{,64}_batch to
> > > > > aggregate the data by going through all the items of each cpu sequentially.
> > > > > Then snmp_seq_show and netstat_seq_show are split into 2 parts to avoid build
> > > > > warning "the frame size" larger than 1024 on s390.
> > > > Yeah about that, did you test it with stack overflow detection?
> > > > These arrays can be quite large.
> > > > 
> > > > One more below..
> > > Do you think it is acceptable if the stack usage is a little larger than 1024?
> > > e.g. 1120
> > > I can't find any other way to reduce the stack usage except use "static" before
> > > unsigned long buff[TCP_MIB_MAX]
> > > 
> > > PS. sizeof buff is about TCP_MIB_MAX(116)*8=928
> > > B.R.
> > That's pretty much the question. Linux has the option on some archs to
> > run with 4Kb (4KSTACKS option), so this function alone would be using
> > 25% of it in this last case. While on x86_64, it uses 16Kb (6538b8ea886e
> > ("x86_64: expand kernel stack to 16K")).
> > 
> > Adding static to it is not an option as it actually makes the variable
> > shared amongst the CPUs (and then you have concurrency issues), plus the
> > fact that it's always allocated, even while not in use.
> > 
> > Others here certainly know better than me if it's okay to make such
> > usage of the stack.
> What about this patch instead?
> It is a trade-off: I split the aggregation process into 2 parts. It will
> increase cache misses a little, but it reduces the stack usage.
> After this, stack usage is 672 bytes:
> objdump -d vmlinux | ./scripts/checkstack.pl ppc64 | grep seq_show
> 0xc0000000007f7cc0 netstat_seq_show_tcpext.isra.3 [vmlinux]:672
> 
> diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
> index c6ee8a2..cc41590 100644
> --- a/net/ipv4/proc.c
> +++ b/net/ipv4/proc.c
> @@ -486,22 +486,37 @@ static const struct file_operations snmp_seq_fops = {
>   */
>  static int netstat_seq_show_tcpext(struct seq_file *seq, void *v)
>  {
> -       int i;
> -       unsigned long buff[LINUX_MIB_MAX];
> +       int i, c;
> +       unsigned long buff[LINUX_MIB_MAX/2 + 1];
>         struct net *net = seq->private;
> 
> -       memset(buff, 0, sizeof(unsigned long) * LINUX_MIB_MAX);
> +       memset(buff, 0, sizeof(unsigned long) * (LINUX_MIB_MAX/2 + 1));
> 
>         seq_puts(seq, "TcpExt:");
>         for (i = 0; snmp4_net_list[i].name; i++)
>                 seq_printf(seq, " %s", snmp4_net_list[i].name);
> 
>         seq_puts(seq, "\nTcpExt:");
> -       snmp_get_cpu_field_batch(buff, snmp4_net_list,
> -                                net->mib.net_statistics);
> -       for (i = 0; snmp4_net_list[i].name; i++)
> +       for_each_possible_cpu(c) {
> +               for (i = 0; i < LINUX_MIB_MAX/2; i++)
> +                       buff[i] += snmp_get_cpu_field(
> +                                       net->mib.net_statistics,
> +                                       c, snmp4_net_list[i].entry);
> +       }
> +       for (i = 0; i < LINUX_MIB_MAX/2; i++)
>                 seq_printf(seq, " %lu", buff[i]);
> 
> +       memset(buff, 0, sizeof(unsigned long) * (LINUX_MIB_MAX/2 + 1));
> +       for_each_possible_cpu(c) {
> +               for (i = LINUX_MIB_MAX/2; snmp4_net_list[i].name; i++)
> +                       buff[i - LINUX_MIB_MAX/2] += snmp_get_cpu_field(
> +                               net->mib.net_statistics,
> +                               c,
> +                               snmp4_net_list[i].entry);
> +       }
> +        for (i = LINUX_MIB_MAX/2; snmp4_net_list[i].name; i++)
> +                seq_printf(seq, " %lu", buff[i - LINUX_MIB_MAX/2]);
> +
>         return 0;
>  }

Yep, it halves the stack usage, but it doesn't look good heh

But well, you may try to post the patchset (with or without this last
change, you pick) officially and see how it goes. As you're posting as
RFC, it's not being evaluated as seriously.

FWIW, I tested your patches, using your test and /proc/net/snmp file on
a x86_64 box, Intel(R) Xeon(R) CPU E5-2643 v3.

Before the patches:

 Performance counter stats for './test /proc/net/snmp':

             5.225      cache-misses                                                
    12.708.673.785      L1-dcache-loads                                             
     1.288.450.174      L1-dcache-load-misses     #   10,14% of all L1-dcache hits  
     1.271.857.028      LLC-loads                                                   
             4.122      LLC-load-misses           #    0,00% of all LL-cache hits   

       9,174936524 seconds time elapsed

After:

 Performance counter stats for './test /proc/net/snmp':

             2.865      cache-misses                                                
    30.203.883.807      L1-dcache-loads                                             
     1.215.774.643      L1-dcache-load-misses     #    4,03% of all L1-dcache hits  
     1.181.662.831      LLC-loads                                                   
             2.685      LLC-load-misses           #    0,00% of all LL-cache hits   

      13,374445056 seconds time elapsed

Numbers were steady across multiple runs.

  Marcelo

> 
> > > > > +static int netstat_seq_show_ipext(struct seq_file *seq, void *v)
> > > > > +{
> > > > > +	int i;
> > > > > +	u64 buff64[IPSTATS_MIB_MAX];
> > > > > +	struct net *net = seq->private;
> > > > >    	seq_puts(seq, "\nIpExt:");
> > > > >    	for (i = 0; snmp4_ipextstats_list[i].name != NULL; i++)
> > > > >    		seq_printf(seq, " %s", snmp4_ipextstats_list[i].name);
> > > > >    	seq_puts(seq, "\nIpExt:");
> > > > You're missing a memset() call here.
> > Not sure if you missed this one or not..
> indeed, thanks
> B.R.
> Jia
> > Thanks,
> > Marcelo
> > 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


* Re: [RFC PATCH v3 2/7] proc: Reduce cache miss in {snmp,netstat}_seq_show
  2016-09-21 18:24           ` Marcelo
@ 2016-09-22  5:38             ` hejianet
  0 siblings, 0 replies; 17+ messages in thread
From: hejianet @ 2016-09-22  5:38 UTC (permalink / raw)
  To: Marcelo
  Cc: netdev, linux-sctp, linux-kernel, davem, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy, Vlad Yasevich,
	Neil Horman, Steffen Klassert, Herbert Xu



On 9/22/16 2:24 AM, Marcelo wrote:
> On Thu, Sep 22, 2016 at 12:18:46AM +0800, hejianet wrote:
>> Hi Marcelo
>>
>> Sorry for the delay, I just came back from vacation.
> Hi, no problem. Hope your batteries are recharged now :-)
>
>> On 9/14/16 7:55 PM, Marcelo wrote:
>>> Hi Jia,
>>>
>>> On Wed, Sep 14, 2016 at 01:58:42PM +0800, hejianet wrote:
>>>> Hi Marcelo
>>>>
>>>>
>>>> On 9/13/16 2:57 AM, Marcelo wrote:
>>>>> On Fri, Sep 09, 2016 at 02:33:57PM +0800, Jia He wrote:
>>>>>> This is to use the generic interface snmp_get_cpu_field{,64}_batch to
>>>>>> aggregate the data by going through all the items of each cpu sequentially.
>>>>>> Then snmp_seq_show and netstat_seq_show are split into 2 parts to avoid build
>>>>>> warning "the frame size" larger than 1024 on s390.
>>>>> Yeah about that, did you test it with stack overflow detection?
>>>>> These arrays can be quite large.
>>>>>
>>>>> One more below..
>>>> Do you think it is acceptable if the stack usage is a little larger than 1024?
>>>> e.g. 1120
>>>> I can't find any other way to reduce the stack usage except use "static" before
>>>> unsigned long buff[TCP_MIB_MAX]
>>>>
>>>> PS. sizeof buff is about TCP_MIB_MAX(116)*8=928
>>>> B.R.
>>> That's pretty much the question. Linux has the option on some archs to
>>> run with 4Kb (4KSTACKS option), so this function alone would be using
>>> 25% of it in this last case. While on x86_64, it uses 16Kb (6538b8ea886e
>>> ("x86_64: expand kernel stack to 16K")).
>>>
>>> Adding static to it is not an option as it actually makes the variable
>>> shared amongst the CPUs (and then you have concurrency issues), plus the
>>> fact that it's always allocated, even while not in use.
>>>
>>> Others here certainly know better than me if it's okay to make such
>>> usage of the stack.
>> What about this patch instead?
>> It is a trade-off: I split the aggregation process into 2 parts. It will
>> increase cache misses a little, but it reduces the stack usage.
>> After this, stack usage is 672 bytes:
>> objdump -d vmlinux | ./scripts/checkstack.pl ppc64 | grep seq_show
>> 0xc0000000007f7cc0 netstat_seq_show_tcpext.isra.3 [vmlinux]:672
>>
>> diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
>> index c6ee8a2..cc41590 100644
>> --- a/net/ipv4/proc.c
>> +++ b/net/ipv4/proc.c
>> @@ -486,22 +486,37 @@ static const struct file_operations snmp_seq_fops = {
>>    */
>>   static int netstat_seq_show_tcpext(struct seq_file *seq, void *v)
>>   {
>> -       int i;
>> -       unsigned long buff[LINUX_MIB_MAX];
>> +       int i, c;
>> +       unsigned long buff[LINUX_MIB_MAX/2 + 1];
>>          struct net *net = seq->private;
>>
>> -       memset(buff, 0, sizeof(unsigned long) * LINUX_MIB_MAX);
>> +       memset(buff, 0, sizeof(unsigned long) * (LINUX_MIB_MAX/2 + 1));
>>
>>          seq_puts(seq, "TcpExt:");
>>          for (i = 0; snmp4_net_list[i].name; i++)
>>                  seq_printf(seq, " %s", snmp4_net_list[i].name);
>>
>>          seq_puts(seq, "\nTcpExt:");
>> -       snmp_get_cpu_field_batch(buff, snmp4_net_list,
>> -                                net->mib.net_statistics);
>> -       for (i = 0; snmp4_net_list[i].name; i++)
>> +       for_each_possible_cpu(c) {
>> +               for (i = 0; i < LINUX_MIB_MAX/2; i++)
>> +                       buff[i] += snmp_get_cpu_field(
>> +                                       net->mib.net_statistics,
>> +                                       c, snmp4_net_list[i].entry);
>> +       }
>> +       for (i = 0; i < LINUX_MIB_MAX/2; i++)
>>                  seq_printf(seq, " %lu", buff[i]);
>>
>> +       memset(buff, 0, sizeof(unsigned long) * (LINUX_MIB_MAX/2 + 1));
>> +       for_each_possible_cpu(c) {
>> +               for (i = LINUX_MIB_MAX/2; snmp4_net_list[i].name; i++)
>> +                       buff[i - LINUX_MIB_MAX/2] += snmp_get_cpu_field(
>> +                               net->mib.net_statistics,
>> +                               c,
>> +                               snmp4_net_list[i].entry);
>> +       }
>> +        for (i = LINUX_MIB_MAX/2; snmp4_net_list[i].name; i++)
>> +                seq_printf(seq, " %lu", buff[i - LINUX_MIB_MAX/2]);
>> +
>>          return 0;
>>   }
> Yep, it halves the stack usage, but it doesn't look good heh
>
> But well, you may try to post the patchset (with or without this last
> change, you pick) officially and see how it goes. As you're posting as
> RFC, it's not being evaluated as seriously.
Thanks for the suggestion, I will remove it in the next patch version.
>
> FWIW, I tested your patches, using your test and /proc/net/snmp file on
> a x86_64 box, Intel(R) Xeon(R) CPU E5-2643 v3.
>
> Before the patches:
>
>   Performance counter stats for './test /proc/net/snmp':
>
>               5.225      cache-misses
>      12.708.673.785      L1-dcache-loads
>       1.288.450.174      L1-dcache-load-misses     #   10,14% of all L1-dcache hits
>       1.271.857.028      LLC-loads
>               4.122      LLC-load-misses           #    0,00% of all LL-cache hits
>
>         9,174936524 seconds time elapsed
>
> After:
>
>   Performance counter stats for './test /proc/net/snmp':
>
>               2.865      cache-misses
>      30.203.883.807      L1-dcache-loads
>       1.215.774.643      L1-dcache-load-misses     #    4,03% of all L1-dcache hits
>       1.181.662.831      LLC-loads
>               2.685      LLC-load-misses           #    0,00% of all LL-cache hits
>
>        13,374445056 seconds time elapsed
>
> Numbers were steady across multiple runs.
>
>    Marcelo
Yes, I guess your x86 machine doesn't have as many CPUs as mine (160).
The difference in cache miss rate before and after this patch will be more
significant when the CPU count is large.

B.R.
Jia
>
>>>>>> +static int netstat_seq_show_ipext(struct seq_file *seq, void *v)
>>>>>> +{
>>>>>> +	int i;
>>>>>> +	u64 buff64[IPSTATS_MIB_MAX];
>>>>>> +	struct net *net = seq->private;
>>>>>>     	seq_puts(seq, "\nIpExt:");
>>>>>>     	for (i = 0; snmp4_ipextstats_list[i].name != NULL; i++)
>>>>>>     		seq_printf(seq, " %s", snmp4_ipextstats_list[i].name);
>>>>>>     	seq_puts(seq, "\nIpExt:");
>>>>> You're missing a memset() call here.
>>> Not sure if you missed this one or not..
>> indeed, thanks
>> B.R.
>> Jia
>>> Thanks,
>>> Marcelo
>>>


end of thread, other threads:[~2016-09-22  5:39 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-09  6:33 [RFC PATCH v3 0/7] Reduce cache miss for snmp_fold_field Jia He
2016-09-09  6:33 ` [RFC PATCH v3 1/7] net:snmp: Introduce generic batch interfaces for snmp_get_cpu_field{,64} Jia He
2016-09-09  6:33 ` [RFC PATCH v3 2/7] proc: Reduce cache miss in {snmp,netstat}_seq_show Jia He
2016-09-12 18:57   ` Marcelo
2016-09-14  3:10     ` hejianet
2016-09-14  5:58     ` hejianet
2016-09-14 11:55       ` Marcelo
2016-09-21 16:18         ` hejianet
2016-09-21 18:24           ` Marcelo
2016-09-22  5:38             ` hejianet
2016-09-09  6:33 ` [RFC PATCH v3 3/7] proc: Reduce cache miss in snmp6_seq_show Jia He
2016-09-12 19:05   ` Marcelo
2016-09-14  3:11     ` hejianet
2016-09-09  6:33 ` [RFC PATCH v3 4/7] proc: Reduce cache miss in sctp_snmp_seq_show Jia He
2016-09-09  6:34 ` [RFC PATCH v3 5/7] proc: Reduce cache miss in xfrm_statistics_seq_show Jia He
2016-09-09  6:34 ` [RFC PATCH v3 6/7] ipv6: Remove useless parameter in __snmp6_fill_statsdev Jia He
2016-09-09  6:34 ` [RFC PATCH v3 7/7] net: Suppress the "Comparison to NULL could be written" warning Jia He

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).