* [PATCH] proc: much faster /proc/vmstat
@ 2016-08-06 12:54 Alexey Dobriyan
  2016-08-07  1:35 ` Al Viro
  0 siblings, 1 reply; 3+ messages in thread
From: Alexey Dobriyan @ 2016-08-06 12:54 UTC
  To: akpm; +Cc: linux-kernel

Every current KDE system has a process named ksysguardd that polls the files
below every few seconds:

	$ strace -e trace=open -p $(pidof ksysguardd)
	Process 1812 attached
	open("/etc/mtab", O_RDONLY|O_CLOEXEC)   = 8
	open("/etc/mtab", O_RDONLY|O_CLOEXEC)   = 8
	open("/proc/net/dev", O_RDONLY)         = 8
	open("/proc/net/wireless", O_RDONLY)    = -1 ENOENT (No such file or directory)
	open("/proc/stat", O_RDONLY)            = 8
	open("/proc/vmstat", O_RDONLY)          = 8

Hell knows what it is doing, but speed up reading /proc/vmstat by 33% anyway
(13.69 s -> 9.15 s elapsed in the benchmark below)!

The benchmark is open+read+close of /proc/vmstat, repeated 1,000,000 times.
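
The source of ./proc-vmstat was not posted in this thread. As a rough idea of
what such a benchmark could look like, here is a minimal sketch. The function
name bench_open_read_close(), the buffer size, and the error handling are all
assumptions made for illustration, not the program that produced the numbers
below.

```c
/* Hypothetical reconstruction of an open+read+close benchmark loop.
 * The original ./proc-vmstat program was not posted in this thread. */
#include <fcntl.h>
#include <unistd.h>

/*
 * Open `path`, read it until EOF, and close it, `iterations` times.
 * Returns the total number of bytes read, or -1 on the first error.
 */
long long bench_open_read_close(const char *path, long iterations)
{
	char buf[16384];	/* assumed size; the loop handles larger files too */
	long long total = 0;

	for (long i = 0; i < iterations; i++) {
		int fd = open(path, O_RDONLY);
		if (fd < 0)
			return -1;

		/* Drain the file so the kernel runs the full show path. */
		ssize_t n;
		while ((n = read(fd, buf, sizeof(buf))) > 0)
			total += n;

		close(fd);
		if (n < 0)
			return -1;
	}
	return total;
}
```

Calling bench_open_read_close("/proc/vmstat", 1000000) from main() and running
the binary under "perf stat -r 10 taskset -c 3" should approximate the
measurement setup shown below.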

			BEFORE
$ perf stat -r 10 taskset -c 3 ./proc-vmstat

 Performance counter stats for 'taskset -c 3 ./proc-vmstat' (10 runs):

      13146.768464      task-clock (msec)         #    0.960 CPUs utilized            ( +-  0.60% )
                15      context-switches          #    0.001 K/sec                    ( +-  1.41% )
                 1      cpu-migrations            #    0.000 K/sec                    ( +- 11.11% )
               104      page-faults               #    0.008 K/sec                    ( +-  0.57% )
    45,489,799,349      cycles                    #    3.460 GHz                      ( +-  0.03% )
     9,970,175,743      stalled-cycles-frontend   #   21.92% frontend cycles idle     ( +-  0.10% )
     2,800,298,015      stalled-cycles-backend    #   6.16% backend cycles idle       ( +-  0.32% )
    79,241,190,850      instructions              #    1.74  insn per cycle
                                                  #    0.13  stalled cycles per insn  ( +-  0.00% )
    17,616,096,146      branches                  # 1339.956 M/sec                    ( +-  0.00% )
       176,106,232      branch-misses             #    1.00% of all branches          ( +-  0.18% )

      13.691078109 seconds time elapsed                                          ( +-  0.03% )
      ^^^^^^^^^^^^

			AFTER
$ perf stat -r 10 taskset -c 3 ./proc-vmstat

 Performance counter stats for 'taskset -c 3 ./proc-vmstat' (10 runs):

       8688.353749      task-clock (msec)         #    0.950 CPUs utilized            ( +-  1.25% )
                10      context-switches          #    0.001 K/sec                    ( +-  2.13% )
                 1      cpu-migrations            #    0.000 K/sec
               104      page-faults               #    0.012 K/sec                    ( +-  0.56% )
    30,384,010,730      cycles                    #    3.497 GHz                      ( +-  0.07% )
    12,296,259,407      stalled-cycles-frontend   #   40.47% frontend cycles idle     ( +-  0.13% )
     3,370,668,651      stalled-cycles-backend    #  11.09% backend cycles idle       ( +-  0.69% )
    28,969,052,879      instructions              #    0.95  insn per cycle
                                                  #    0.42  stalled cycles per insn  ( +-  0.01% )
     6,308,245,891      branches                  #  726.058 M/sec                    ( +-  0.00% )
       214,685,502      branch-misses             #    3.40% of all branches          ( +-  0.26% )

       9.146081052 seconds time elapsed                                          ( +-  0.07% )
       ^^^^^^^^^^^

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
---

 mm/vmstat.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1592,7 +1592,10 @@ static int vmstat_show(struct seq_file *m, void *arg)
 {
 	unsigned long *l = arg;
 	unsigned long off = l - (unsigned long *)m->private;
-	seq_printf(m, "%s %lu\n", vmstat_text[off], *l);
+
+	seq_puts(m, vmstat_text[off]);
+	seq_put_decimal_ull(m, ' ', *l);
+	seq_putc(m, '\n');
 	return 0;
 }
 


* Re: [PATCH] proc: much faster /proc/vmstat
  2016-08-06 12:54 [PATCH] proc: much faster /proc/vmstat Alexey Dobriyan
@ 2016-08-07  1:35 ` Al Viro
  2016-08-07  8:42   ` Alexey Dobriyan
  0 siblings, 1 reply; 3+ messages in thread
From: Al Viro @ 2016-08-07  1:35 UTC
  To: Alexey Dobriyan; +Cc: akpm, linux-kernel

On Sat, Aug 06, 2016 at 03:54:56PM +0300, Alexey Dobriyan wrote:

[sprintf sucks, let's convert numbers manually]

> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -1592,7 +1592,10 @@ static int vmstat_show(struct seq_file *m, void *arg)
>  {
>  	unsigned long *l = arg;
>  	unsigned long off = l - (unsigned long *)m->private;
> -	seq_printf(m, "%s %lu\n", vmstat_text[off], *l);
> +
> +	seq_puts(m, vmstat_text[off]);
> +	seq_put_decimal_ull(m, ' ', *l);
> +	seq_putc(m, '\n');
>  	return 0;
>  }

If that manages to be a hotspot, we really should
	* educate the wankers responsible for the userland code in question,
until they repent and cease committing such abominations.
	* look into fixing vsnprintf().  

Seriously, what the hell is vsnprintf() doing that takes so much time?  It's
not as if it was a complex format anyway.  WTF is going on there?  Where is
it spending that much time?


* Re: [PATCH] proc: much faster /proc/vmstat
  2016-08-07  1:35 ` Al Viro
@ 2016-08-07  8:42   ` Alexey Dobriyan
  0 siblings, 0 replies; 3+ messages in thread
From: Alexey Dobriyan @ 2016-08-07  8:42 UTC
  To: Al Viro; +Cc: akpm, linux-kernel

On Sun, Aug 07, 2016 at 02:35:13AM +0100, Al Viro wrote:
> On Sat, Aug 06, 2016 at 03:54:56PM +0300, Alexey Dobriyan wrote:
> 
> [sprintf sucks, let's convert numbers manually]
> 
> > --- a/mm/vmstat.c
> > +++ b/mm/vmstat.c
> > @@ -1592,7 +1592,10 @@ static int vmstat_show(struct seq_file *m, void *arg)
> >  {
> >  	unsigned long *l = arg;
> >  	unsigned long off = l - (unsigned long *)m->private;
> > -	seq_printf(m, "%s %lu\n", vmstat_text[off], *l);
> > +
> > +	seq_puts(m, vmstat_text[off]);
> > +	seq_put_decimal_ull(m, ' ', *l);
> > +	seq_putc(m, '\n');
> >  	return 0;
> >  }
> 
> If that manages to be a hotspot, we really should
> 	* educate the wankers responsible for the userland code in question,
> until they repent and cease committing such abominations.

I'll get right on that.

> 	* look into fixing vsnprintf().  
> 
> Seriously, what the hell is vsnprintf() doing that takes so much time?  It's
> not as if it was a complex format anyway.  WTF is going on there?  Where is
> it spending that much time?

1. format_decode() is busy looking for a format specifier: 2 branches per
   character (not in this case, but in others)

2. approximately a million branches while parsing the format mini-language,
   and everywhere else

3. just look at what string() does;
   /proc/vmstat is a good case because most of its content is strings

But the patch will still be faster.
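
To illustrate why converting the number by hand skips all of the format
parsing listed above, here is a minimal user-space sketch. It is not the
kernel's seq_put_decimal_ull(); the helper name put_delim_decimal() and its
buffer interface are hypothetical, and it assumes a 64-bit unsigned long long.

```c
#include <stddef.h>

/*
 * Write `delimiter` followed by the decimal form of `n` into `buf`.
 * Hypothetical user-space helper, NOT the kernel's seq_put_decimal_ull().
 * Returns the number of bytes written, or 0 if `size` is too small.
 * The output is not NUL-terminated.
 */
size_t put_delim_decimal(char *buf, size_t size, char delimiter,
			 unsigned long long n)
{
	char tmp[20];		/* 2^64 - 1 has 20 decimal digits (64-bit assumed) */
	size_t len = 0;

	/* Produce digits least-significant first; no format string to parse. */
	do {
		tmp[len++] = (char)('0' + n % 10);
		n /= 10;
	} while (n != 0);

	if (size < len + 1)
		return 0;

	*buf++ = delimiter;
	/* Copy the digits out in reverse so the most significant comes first. */
	for (size_t i = 0; i < len; i++)
		buf[i] = tmp[len - 1 - i];

	return len + 1;
}
```

The only data-dependent work is one divide and one store per digit, whereas a
printf-style call first has to walk the format string and decode "%lu" before
it reaches the same conversion.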
