* [RFC PATCH v2 0/6] Reduce cache miss for snmp_fold_field
From: Jia He @ 2016-09-06  2:30 UTC
  To: netdev
  Cc: linux-sctp, linux-kernel, davem, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, Vlad Yasevich, Neil Horman,
	Steffen Klassert, Herbert Xu, Jia He

On a PowerPC server with a large number of CPUs (160), besides commit
a3a773726c9f ("net: Optimize snmp stat aggregation by walking all
the percpu data at once"), I found several other snmp_fold_field
call sites that cause a high cache miss rate.
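
For illustration only, the difference between folding one field at a time
and walking each CPU's data once can be sketched in userspace C. This is
not the kernel code: struct cpu_mib, NR_CPUS, NR_FIELDS and the three
function names are hypothetical stand-ins chosen for this sketch, and a
plain array replaces the real percpu allocation.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS   4   /* hypothetical; the server described above has 160 */
#define NR_FIELDS 8   /* hypothetical number of counters in one MIB */

/* Hypothetical stand-in for one CPU's block of counters. */
struct cpu_mib {
	uint64_t mibs[NR_FIELDS];
};

/* Fold a single counter: one pass over every CPU's block. */
static uint64_t fold_field(const struct cpu_mib *percpu, size_t field)
{
	uint64_t sum = 0;

	for (size_t cpu = 0; cpu < NR_CPUS; cpu++)
		sum += percpu[cpu].mibs[field];
	return sum;
}

/* Field-major: NR_FIELDS separate walks over all CPUs' memory. */
static void fold_per_field(const struct cpu_mib *percpu, uint64_t *out)
{
	for (size_t f = 0; f < NR_FIELDS; f++)
		out[f] = fold_field(percpu, f);
}

/* CPU-major: each CPU's block is read once, contiguously. */
static void fold_per_cpu(const struct cpu_mib *percpu, uint64_t *out)
{
	for (size_t f = 0; f < NR_FIELDS; f++)
		out[f] = 0;

	for (size_t cpu = 0; cpu < NR_CPUS; cpu++)
		for (size_t f = 0; f < NR_FIELDS; f++)
			out[f] += percpu[cpu].mibs[f];
}
```

Both variants produce the same totals. The CPU-major loop touches each
CPU's counters only once per read of the file, which is the access
pattern the referenced commit title describes.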

My simple test case, which reads the given procfs file repeatedly:
/***********************************************************/
#include <stdlib.h>
#include <stdio.h>
#include <string.h>     /* memset */
#include <unistd.h>     /* read, lseek, close */
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#define LINELEN  2560
int main(int argc, char **argv)
{
        unsigned int i; /* unsigned: the loop bound below is 0xffffffff */
        int fd = -1 ;
        int rdsize = 0;
        char buf[LINELEN+1];

        buf[LINELEN] = 0;
        memset(buf,0,LINELEN);

        if(1 >= argc) {
                printf("file name empty\n");
                return -1;
        }

        fd = open(argv[1], O_RDONLY); /* the file is only read */
        if(0 > fd){
                printf("open error\n");
                return -2;
        }

        for(i=0;i<0xffffffff;i++) {
                while(0 < (rdsize = read(fd,buf,LINELEN))){
                        //nothing here
                }

                lseek(fd, 0, SEEK_SET);
        }

        close(fd);
        return 0;
}
/**********************************************************/

Compile and run:
gcc test.c -o test

perf stat -d -e cache-misses ./test /proc/net/snmp
perf stat -d -e cache-misses ./test /proc/net/snmp6
perf stat -d -e cache-misses ./test /proc/net/netstat
perf stat -d -e cache-misses ./test /proc/net/sctp/snmp
perf stat -d -e cache-misses ./test /proc/net/xfrm_stat

Before the patch set:
====================
 Performance counter stats for 'system wide':

         355911097      cache-misses                                                 [40.08%]
        2356829300      L1-dcache-loads                                              [60.04%]
         355642645      L1-dcache-load-misses     #   15.09% of all L1-dcache hits   [60.02%]
         346544541      LLC-loads                                                    [59.97%]
            389763      LLC-load-misses           #    0.11% of all LL-cache hits    [40.02%]

       6.245162638 seconds time elapsed

After the patch set:
===================
 Performance counter stats for 'system wide':

         194992476      cache-misses                                                 [40.03%]
        6718051877      L1-dcache-loads                                              [60.07%]
         194871921      L1-dcache-load-misses     #    2.90% of all L1-dcache hits   [60.11%]
         187632232      LLC-loads                                                    [60.04%]
            464466      LLC-load-misses           #    0.25% of all LL-cache hits    [39.89%]

       6.868422769 seconds time elapsed

The L1-dcache load-miss rate is reduced from 15.09% to 2.90%.
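
As a cross-check of the percentages, the rate is just
L1-dcache-load-misses divided by L1-dcache-loads. A minimal sketch,
where miss_rate_pct is a hypothetical helper name and not part of perf
or of this patch set:

```c
#include <stdint.h>

/* Miss rate in percent: misses / loads * 100 (0 if there were no loads). */
static double miss_rate_pct(uint64_t misses, uint64_t loads)
{
	if (loads == 0)
		return 0.0;
	return 100.0 * (double)misses / (double)loads;
}
```

With the counters above, miss_rate_pct(355642645, 2356829300) gives
about 15.09 and miss_rate_pct(194871921, 6718051877) gives about 2.90.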

v2:
- 1/6: fix a bug in the udplite statistics.
- 1/6: split snmp_seq_show into 2 parts.

Jia He (6):
  proc: Reduce cache miss in {snmp,netstat}_seq_show
  proc: Reduce cache miss in snmp6_seq_show
  proc: Reduce cache miss in sctp_snmp_seq_show
  proc: Reduce cache miss in xfrm_statistics_seq_show
  ipv6: Remove useless parameter in __snmp6_fill_statsdev
  net: Suppress the "Comparison to NULL could be written" warning

 net/ipv4/proc.c      | 144 ++++++++++++++++++++++++++++++++++-----------------
 net/ipv6/addrconf.c  |  12 ++---
 net/ipv6/proc.c      |  47 +++++++++++++----
 net/sctp/proc.c      |  15 ++++--
 net/xfrm/xfrm_proc.c |  15 ++++--
 5 files changed, 162 insertions(+), 71 deletions(-)

-- 
1.8.3.1
