* Significant performance difference for postgres w/o sched_autogroup
From: Andres Freund @ 2015-07-08 15:45 UTC
  To: linux-kernel, Mike Galbraith; +Cc: Ingo Molnar, Peter Zijlstra

Hi,

while debugging a performance issue I noticed the following very odd
bit.

Kernel: 4.1.0-rc7-andres-00049-ge64f638
CPU: 2xE5520

Workload:

postgresql (multi-process via shared memory SQL server) and benchmark
client (pgbench, multi threaded) running on the same server. Connected
using unix sockets.  The statements are relatively simple (~1.5ms on
average), forcing frequent back/forth between server processes and
pgbench threads.
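
For intuition: the scheduling pattern here is essentially two sets of
tasks ping-ponging small messages over unix sockets, so every send wakes
up the peer. A stripped-down Python analogue of that back-and-forth (not
the actual benchmark) looks roughly like this:

import os
import socket
import time

ITERATIONS = 100000

# ping-pong over an AF_UNIX stream socketpair: only one message is ever
# in flight, so each send has to wake up the sleeping peer
client_sock, server_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

if os.fork() == 0:
    # child plays the "server": echo a small reply to every request
    client_sock.close()
    for _ in range(ITERATIONS):
        server_sock.recv(64)
        server_sock.send(b"pong")
    os._exit(0)

# parent plays the "client": issue a request, block until the reply arrives
server_sock.close()
start = time.time()
for _ in range(ITERATIONS):
    client_sock.send(b"ping")
    client_sock.recv(64)
os.wait()
print("%.0f round trips/sec" % (ITERATIONS / (time.time() - start)))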

I found that disabling sched_autogroup *significantly* reduces
throughput. Even when both server and client are started from the same
terminal and thus should be in the same group!
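
For reference: the feature is toggled via
/proc/sys/kernel/sched_autogroup_enabled, and which autogroup a task
belongs to is visible in /proc/<pid>/autogroup. A minimal Python sketch
of that check (the pids are placeholders, not the real ones):

def autogroup_of(pid):
    # /proc/<pid>/autogroup looks like "/autogroup-123 nice 0"
    with open("/proc/%d/autogroup" % pid) as f:
        return f.read().split()[0]

with open("/proc/sys/kernel/sched_autogroup_enabled") as f:
    print("sched_autogroup_enabled = " + f.read().strip())

postgres_pid, pgbench_pid = 12345, 12346  # placeholder pids
print(autogroup_of(postgres_pid), autogroup_of(pgbench_pid))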

There's also a significant difference in %sys time with autogroup
enabled vs. disabled: ~8% vs. ~27%. That seems like too much.

The difference in postgres performance is significant:
enabled:
progress: 1.0 s, 25857.2 tps, lat 1.171 ms stddev 3.314
progress: 2.0 s, 19582.5 tps, lat 1.586 ms stddev 5.349
progress: 3.0 s, 17796.3 tps, lat 1.804 ms stddev 7.364
progress: 4.0 s, 18736.7 tps, lat 1.696 ms stddev 7.612
progress: 5.0 s, 19269.9 tps, lat 1.640 ms stddev 6.835
progress: 6.0 s, 21389.6 tps, lat 1.467 ms stddev 5.663
progress: 7.0 s, 22153.8 tps, lat 1.417 ms stddev 8.417
progress: 8.0 s, 21550.7 tps, lat 1.487 ms stddev 7.460
progress: 9.0 s, 22602.3 tps, lat 1.401 ms stddev 6.738
progress: 10.0 s, 23475.5 tps, lat 1.327 ms stddev 7.111
tps = 21329.219141 (excluding connections establishing)

disabled:
progress: 1.0 s, 25092.5 tps, lat 1.255 ms stddev 0.511
progress: 2.0 s, 13322.0 tps, lat 2.377 ms stddev 5.435
progress: 3.0 s, 13903.0 tps, lat 2.298 ms stddev 5.075
progress: 4.0 s, 13734.0 tps, lat 2.328 ms stddev 5.146
progress: 5.0 s, 14015.0 tps, lat 2.281 ms stddev 5.099
progress: 6.0 s, 14034.0 tps, lat 2.277 ms stddev 5.002
progress: 7.0 s, 13930.0 tps, lat 2.297 ms stddev 5.136
progress: 8.0 s, 14015.0 tps, lat 2.279 ms stddev 5.161
progress: 9.0 s, 14000.0 tps, lat 2.287 ms stddev 5.117
progress: 10.0 s, 13900.6 tps, lat 2.299 ms stddev 5.142
tps = 15006.317841 (excluding connections establishing)

(Note the odd dip after the first second; that's what I was actually debugging.)

A perf diff of two profiles recorded with -g -a -e cycles:k shows:
# Baseline    Delta  Shared Object        Symbol
# ........  .......  ...................  ..............................................
#
    11.70%   +3.87%  [kernel.vmlinux]     [k] unix_stream_recvmsg
     6.74%   +2.49%  [kernel.vmlinux]     [k] __fget
     6.69%   +4.31%  [kernel.vmlinux]     [k] sys_recvfrom
     6.49%   +1.77%  [kernel.vmlinux]     [k] system_call_after_swapgs
     5.37%   +1.39%  [kernel.vmlinux]     [k] _raw_spin_lock
     4.63%   +1.76%  [kernel.vmlinux]     [k] mutex_lock_interruptible
     4.27%   +1.27%  [kernel.vmlinux]     [k] mutex_unlock
     3.43%   +1.22%  [kernel.vmlinux]     [k] fput
     3.41%   -0.21%  [kernel.vmlinux]     [k] __fget_light
     2.96%   +1.33%  libpthread-2.19.so   [.] __libc_recv
     2.56%   +2.35%  [kernel.vmlinux]     [k] system_call_fastpath
     2.36%   +0.80%  [kernel.vmlinux]     [k] import_single_range
     2.30%   +0.82%  [kernel.vmlinux]     [k] sockfd_lookup_light

Looking a bit into the profile shows an apparent difference in how often
recv goes through the fastpath:

enabled:
-    6.55%  pgbench          [kernel.vmlinux]     [k] sys_recvfrom
   - sys_recvfrom
      + 85.88% system_call_fastpath
      + 14.12% __libc_recv

disabled:
-   10.97%  pgbench          [kernel.vmlinux]    [k] sys_recvfrom
   - sys_recvfrom
      + 74.16% system_call_fastpath
      + 25.84% __libc_recv

Is that actually expected or is there a bug somewhere? The difference
seems to be a bit large to me.

If that would be helpful, I can try to script the workload in a
reproducible manner.
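
A rough sketch of what such a script might look like (scale factor,
client/thread counts and the database name are assumptions, not the
settings used for the numbers above):

import subprocess

DB = "pgbench"  # assumed database name

def set_autogroup(enabled):
    # needs root; same effect as the kernel.sched_autogroup_enabled sysctl
    with open("/proc/sys/kernel/sched_autogroup_enabled", "w") as f:
        f.write("1\n" if enabled else "0\n")

# one-time initialisation of the pgbench tables, scale factor assumed
subprocess.check_call(["pgbench", "-i", "-s", "100", DB])

for enabled in (True, False):
    set_autogroup(enabled)
    print("autogroup", "enabled" if enabled else "disabled")
    # readonly (select-only) run: 16 clients / 16 threads, 10 s,
    # progress reported every second
    subprocess.check_call(["pgbench", "-S", "-c", "16", "-j", "16",
                           "-T", "10", "-P", "1", DB])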

Regards,

Andres


* Re: Significant performance difference for postgres w/o sched_autogroup
From: Mike Galbraith @ 2015-07-09  2:45 UTC
  To: Andres Freund; +Cc: linux-kernel, Ingo Molnar, Peter Zijlstra

On Wed, 2015-07-08 at 17:45 +0200, Andres Freund wrote:

> Workload:
> 
> postgresql (multi-process via shared memory SQL server) and benchmark
> client (pgbench, multi threaded) running on the same server. Connected
> using unix sockets.  The statements are relatively simple (~1.5ms on
> average), forcing frequent back/forth between server processes and
> pgbench threads.
> 
> I found that disabling sched_autogroup *significantly* reduces
> throughput. Even when both server and client are started from the same
> terminal and thus should be in the same group!
> 
> There's also a significant difference in %sys time with autogroup
> enabled vs. disabled: ~8% vs. ~27%. That seems like too much.

Seems reasonable to me.  1(tg)/2(tgs) > 1(task)/N(tasks): throughput is
what the server can sustain on its given budget, and a larger budget
means less client blockage, and thus less %sys.

	-Mike



* Re: Significant performance difference for postgres w/o sched_autogroup
From: Andres Freund @ 2015-07-09  9:28 UTC
  To: Mike Galbraith; +Cc: linux-kernel, Ingo Molnar, Peter Zijlstra

On 2015-07-09 04:45:38 +0200, Mike Galbraith wrote:
> On Wed, 2015-07-08 at 17:45 +0200, Andres Freund wrote:
> > Workload:
> > 
> > postgresql (multi-process via shared memory SQL server) and benchmark
> > client (pgbench, multi threaded) running on the same server. Connected
> > using unix sockets.  The statements are relatively simple (~1.5ms on
> > average), forcing frequent back/forth between server processes and
> > pgbench threads.
> > 
> > I found that disabling sched_autogroup *significantly* reduces
> > throughput. Even when both server and client are started from the same
> > terminal and thus should be in the same group!
> > 
> > There's also a significant difference in %sys time with autogroup
> > enabled vs. disabled: ~8% vs. ~27%. That seems like too much.
> 
> Seems reasonable to me.  1(tg)/2(tgs) > 1(task)/N(tasks): throughput is
> what the server can sustain on its given budget, and a larger budget
> means less client blockage, and thus less %sys.

That'd make some sense if there were other stuff going on - but here the
same total budget in both cases leads to a 40% difference in throughput:

autogroup on: tps = 21329.219141; autogroup off: tps = 15006.317841

There's really nothing for the kernel or other tasks to do in the
background. It's a readonly workload, not requiring lots of memory, ...


* Re: Significant performance difference for postgres w/o sched_autogroup
From: Mike Galbraith @ 2015-07-09 10:35 UTC
  To: Andres Freund; +Cc: linux-kernel, Ingo Molnar, Peter Zijlstra

On Thu, 2015-07-09 at 11:28 +0200, Andres Freund wrote:

> That'd make some sense if there were other stuff going on - but here the
> same total budget in both cases leads to a 40% difference in throughput:

When I suspect that what I'm hearing is horse pookey, I instrument the
source and ask my silicon horse pookey detector :)

	-Mike


