* [PATCH] proc: much faster /proc/vmstat
@ 2016-08-06 12:54 Alexey Dobriyan
2016-08-07 1:35 ` Al Viro
0 siblings, 1 reply; 3+ messages in thread
From: Alexey Dobriyan @ 2016-08-06 12:54 UTC (permalink / raw)
To: akpm; +Cc: linux-kernel
Every current KDE system has a process named ksysguardd that polls the files
below once every few seconds:
$ strace -e trace=open -p $(pidof ksysguardd)
Process 1812 attached
open("/etc/mtab", O_RDONLY|O_CLOEXEC) = 8
open("/etc/mtab", O_RDONLY|O_CLOEXEC) = 8
open("/proc/net/dev", O_RDONLY) = 8
open("/proc/net/wireless", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/proc/stat", O_RDONLY) = 8
open("/proc/vmstat", O_RDONLY) = 8
Hell knows what it is doing, but let's speed up reading /proc/vmstat by 33%!
The benchmark is open+read+close, repeated 1,000,000 times.
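The source of ./proc-vmstat is not included in this message. A minimal sketch of what such a benchmark might look like follows; the function name, the buffer size, and the choice to read until EOF (rather than issue a single read) are assumptions made for illustration.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Open `path`, read it to EOF, and close it, `iterations` times.
 * Returns the total number of bytes read, or -1 on the first error.
 * Hypothetical helper: not the actual ./proc-vmstat program.
 */
long long read_file_repeatedly(const char *path, long iterations)
{
	char buf[16384];	/* assumed large enough for most reads */
	long long total = 0;

	for (long i = 0; i < iterations; i++) {
		int fd = open(path, O_RDONLY);
		if (fd < 0)
			return -1;

		ssize_t n;
		while ((n = read(fd, buf, sizeof(buf))) > 0)
			total += n;

		close(fd);
		if (n < 0)
			return -1;
	}
	return total;
}
```

Calling read_file_repeatedly("/proc/vmstat", 1000000) from main() and running the result under the same "perf stat -r 10 taskset -c 3" command should approximate the measurement shown below.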
BEFORE
$ perf stat -r 10 taskset -c 3 ./proc-vmstat
Performance counter stats for 'taskset -c 3 ./proc-vmstat' (10 runs):
13146.768464 task-clock (msec) # 0.960 CPUs utilized ( +- 0.60% )
15 context-switches # 0.001 K/sec ( +- 1.41% )
1 cpu-migrations # 0.000 K/sec ( +- 11.11% )
104 page-faults # 0.008 K/sec ( +- 0.57% )
45,489,799,349 cycles # 3.460 GHz ( +- 0.03% )
9,970,175,743 stalled-cycles-frontend # 21.92% frontend cycles idle ( +- 0.10% )
2,800,298,015 stalled-cycles-backend # 6.16% backend cycles idle ( +- 0.32% )
79,241,190,850 instructions # 1.74 insn per cycle
# 0.13 stalled cycles per insn ( +- 0.00% )
17,616,096,146 branches # 1339.956 M/sec ( +- 0.00% )
176,106,232 branch-misses # 1.00% of all branches ( +- 0.18% )
13.691078109 seconds time elapsed ( +- 0.03% )
^^^^^^^^^^^^
AFTER
$ perf stat -r 10 taskset -c 3 ./proc-vmstat
Performance counter stats for 'taskset -c 3 ./proc-vmstat' (10 runs):
8688.353749 task-clock (msec) # 0.950 CPUs utilized ( +- 1.25% )
10 context-switches # 0.001 K/sec ( +- 2.13% )
1 cpu-migrations # 0.000 K/sec
104 page-faults # 0.012 K/sec ( +- 0.56% )
30,384,010,730 cycles # 3.497 GHz ( +- 0.07% )
12,296,259,407 stalled-cycles-frontend # 40.47% frontend cycles idle ( +- 0.13% )
3,370,668,651 stalled-cycles-backend # 11.09% backend cycles idle ( +- 0.69% )
28,969,052,879 instructions # 0.95 insn per cycle
# 0.42 stalled cycles per insn ( +- 0.01% )
6,308,245,891 branches # 726.058 M/sec ( +- 0.00% )
214,685,502 branch-misses # 3.40% of all branches ( +- 0.26% )
9.146081052 seconds time elapsed ( +- 0.07% )
^^^^^^^^^^^
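As a cross-check of the 33% figure, the relative reduction can be computed directly from the two runs above. The helper below is an illustrative sketch, not part of the patch.

```c
/*
 * Relative reduction from `before` to `after`, in percent.
 * Illustrative helper for checking the numbers in the perf output.
 */
double percent_reduction(double before, double after)
{
	return (before - after) / before * 100.0;
}
```

With the elapsed times (13.691078109 s and 9.146081052 s) this gives roughly 33.2%; with the instruction counts (79,241,190,850 and 28,969,052,879) it gives roughly 63.4%, so the instruction count falls much further than wall-clock time does.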
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
---
mm/vmstat.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1592,7 +1592,10 @@ static int vmstat_show(struct seq_file *m, void *arg)
 {
 	unsigned long *l = arg;
 	unsigned long off = l - (unsigned long *)m->private;
-	seq_printf(m, "%s %lu\n", vmstat_text[off], *l);
+
+	seq_puts(m, vmstat_text[off]);
+	seq_put_decimal_ull(m, ' ', *l);
+	seq_putc(m, '\n');
 	return 0;
 }
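
The idea behind the change can be tried in userspace. The sketch below is an analogue only: emit_counter() is a hypothetical name, and it does not reproduce how seq_puts(), seq_put_decimal_ull(), or seq_putc() are implemented in the kernel. It writes a name, a space, a decimal number, and a newline without going through a printf-style format parser.

```c
#include <stddef.h>

/*
 * Write `name`, a space, the decimal form of `value`, and '\n' into `buf`.
 * Returns the number of bytes written (no NUL terminator is added),
 * or 0 if `buf` is too small. Userspace illustration only.
 */
size_t emit_counter(char *buf, size_t size, const char *name,
		    unsigned long long value)
{
	char digits[20];	/* 2^64 - 1 has 20 decimal digits */
	size_t ndigits = 0;
	size_t name_len = 0;
	size_t pos = 0;

	/* Generate digits least-significant first. */
	do {
		digits[ndigits++] = (char)('0' + value % 10);
		value /= 10;
	} while (value != 0);

	while (name[name_len] != '\0')
		name_len++;

	if (name_len + 1 + ndigits + 1 > size)
		return 0;

	for (size_t i = 0; i < name_len; i++)
		buf[pos++] = name[i];
	buf[pos++] = ' ';
	/* Copy the digits back out most-significant first. */
	while (ndigits > 0)
		buf[pos++] = digits[--ndigits];
	buf[pos++] = '\n';
	return pos;
}
```

For example, emit_counter(buf, sizeof(buf), "nr_free_pages", 12345) fills buf with "nr_free_pages 12345\n" and returns 20. Comparing this against snprintf(buf, size, "%s %lu\n", ...) in a loop is one way to get a feel for the cost of format parsing in userspace, although libc's printf is a different implementation from the kernel's vsnprintf().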
* Re: [PATCH] proc: much faster /proc/vmstat
From: Al Viro @ 2016-08-07 1:35 UTC (permalink / raw)
To: Alexey Dobriyan; +Cc: akpm, linux-kernel
On Sat, Aug 06, 2016 at 03:54:56PM +0300, Alexey Dobriyan wrote:
[sprintf sucks, let's convert numbers manually]
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -1592,7 +1592,10 @@ static int vmstat_show(struct seq_file *m, void *arg)
> {
> unsigned long *l = arg;
> unsigned long off = l - (unsigned long *)m->private;
> - seq_printf(m, "%s %lu\n", vmstat_text[off], *l);
> +
> + seq_puts(m, vmstat_text[off]);
> + seq_put_decimal_ull(m, ' ', *l);
> + seq_putc(m, '\n');
> return 0;
> }
If that manages to be a hotspot, we really should
* educate the wankers responsible for the userland code in question,
until they repent and cease committing such abominations.
* look into fixing vsnprintf().
Seriously, what the hell is vsnprintf() doing that takes so much time? It's
not as if it was a complex format anyway. WTF is going on there? Where is
it spending that much time?
* Re: [PATCH] proc: much faster /proc/vmstat
From: Alexey Dobriyan @ 2016-08-07 8:42 UTC (permalink / raw)
To: Al Viro; +Cc: akpm, linux-kernel
On Sun, Aug 07, 2016 at 02:35:13AM +0100, Al Viro wrote:
> On Sat, Aug 06, 2016 at 03:54:56PM +0300, Alexey Dobriyan wrote:
>
> [sprintf sucks, let's convert numbers manually]
>
> > --- a/mm/vmstat.c
> > +++ b/mm/vmstat.c
> > @@ -1592,7 +1592,10 @@ static int vmstat_show(struct seq_file *m, void *arg)
> > {
> > unsigned long *l = arg;
> > unsigned long off = l - (unsigned long *)m->private;
> > - seq_printf(m, "%s %lu\n", vmstat_text[off], *l);
> > +
> > + seq_puts(m, vmstat_text[off]);
> > + seq_put_decimal_ull(m, ' ', *l);
> > + seq_putc(m, '\n');
> > return 0;
> > }
>
> If that manages to be a hotspot, we really should
> * educate the wankers responsible for the userland code in question,
> until they repent and cease committing such abominations.
I'll get right on that.
> * look into fixing vsnprintf().
>
> Seriously, what the hell is vsnprintf() doing that takes so much time? It's
> not as if it was a complex format anyway. WTF is going on there? Where is
> it spending that much time?
1. format_decode() is busy looking for a format specifier: 2 branches per
character (not in this case, but in others)
2. approximately a million branches while parsing the format mini-language,
and everywhere else
3. just look at what string() does
/proc/vmstat is a good case because most of its content is strings.
But the patch will still be faster.
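
To make point 1 concrete, here is a toy scanner. It is a hypothetical illustration, not the kernel's format_decode(): it walks the literal text at the start of a format string and counts the comparisons it makes, showing that each literal byte costs one test for end-of-string and one for '%'.

```c
#include <stddef.h>

/*
 * Return the length of the literal text at the start of `fmt`, stopping
 * at the first '%' or at the end of the string. If `checks` is non-NULL,
 * it receives the number of per-character comparisons performed.
 * Toy example only; scan_literal() is not a kernel function.
 */
size_t scan_literal(const char *fmt, unsigned long *checks)
{
	size_t len = 0;
	unsigned long n = 0;

	for (;;) {
		n++;			/* test 1: end of string? */
		if (fmt[len] == '\0')
			break;
		n++;			/* test 2: start of a conversion? */
		if (fmt[len] == '%')
			break;
		len++;
	}
	if (checks)
		*checks = n;
	return len;
}
```

For "%s %lu\n" the scan stops immediately at the leading '%', which matches the remark that this cost does not dominate here; formats with long literal runs pay it on every byte.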