From: Mel Gorman <mgorman@suse.de>
To: Jiri Slaby <jslaby@suse.cz>
Cc: Ingo Molnar <mingo@kernel.org>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	live-patching@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andy Lutomirski <luto@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [RFC PATCH 00/10] x86: undwarf unwinder
Date: Fri, 2 Jun 2017 11:40:48 +0100
Message-ID: <20170602104048.jkkzssljsompjdwy@suse.de>
In-Reply-To: <3db1be2a-cc33-89f3-950f-cfe1c21ee7f1@suse.cz>

On Thu, Jun 01, 2017 at 04:08:25PM +0200, Jiri Slaby wrote:
> Ccing Mel who did proper measurements and can hopefully comment on his
> results.
> 
> On 06/01/2017, 03:50 PM, Ingo Molnar wrote:
> > That's not what I meant! The speedup comes from (hopefully) being able to disable 
> > CONFIG_FRAME_POINTER, which:
> > 
> >  - creates simpler/faster function prologues and epilogues - no managing of RBP 
> >    needed
> > 
> >  - gives one more generic purpose register to work from. This matters less on 
> >    64-bit kernels but it's a small effect.
> > 
> > I've seen numbers of 1-2% of instruction count reduction in common kernel 
> > workloads, which would be pretty significant on well cached workloads.
> 

I didn't preserve the data involved, but in a variety of workloads including
netperf, a page allocator microbenchmark, pgbench and sqlite, enabling
framepointer introduced overhead of around the 5-10% mark. According
to an internal report I gave at the time, hackbench-thread-sockets was
around the 5% mark and a perf run showed "3.49% more cache misses with
framepointer enabled and 6.59% more cycles". Additional notes I made at
the time, although again without the original data, are:

---8<---
It looks like a small amount of overhead added everywhere, and the size of
the vmlinux files supports that:

   text    data     bss     dec     hex   filename
8143072 6480614 11153408 25777094 18953c6 vmlinux/decker/vmlinux-4.8.0-disable-fp
8396698 6480614 11153408 26030720 18d3280 vmlinux/decker/vmlinux-4.8.0-enable-fp

I also took a closer look at the pagealloc microbenchmarks because they
rely on so few functions. Profiles were not always captured due to the
short-lived nature of some of the tests, so I looked at batches of 16384
allocations/frees of order-0 pages. Overall it showed a 4.46% decline with
framepointer enabled and profiling, with 3.89% more cycles and 24.94% more
cache misses.

As before, the framepointer cache miss overhead is not that obvious as
the bulk of samples take place elsewhere -- in this case, in checking
whether pages are buddies when merging. It's slightly clearer in
__rmqueue where 17.9% of cache misses are in the function entry point
with framepointer enabled vs 4.04% with framepointer disabled.
---8<---
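
For anyone wanting to put a number on the vmlinux size difference quoted in
the notes above, here is a minimal sketch of the arithmetic. The two text
sizes are copied from the size output; the function and constant names are
hypothetical and only there for illustration.

```python
# Text-segment sizes in bytes, copied from the size(1) output in the notes.
TEXT_DISABLE_FP = 8143072  # vmlinux-4.8.0-disable-fp
TEXT_ENABLE_FP = 8396698   # vmlinux-4.8.0-enable-fp


def text_growth(without_fp: int, with_fp: int) -> tuple[int, float]:
    """Return (extra bytes, percentage growth) of the text segment."""
    delta = with_fp - without_fp
    return delta, 100.0 * delta / without_fp


delta, pct = text_growth(TEXT_DISABLE_FP, TEXT_ENABLE_FP)
print(f"text grew by {delta} bytes ({pct:.2f}%)")
# prints: text grew by 253626 bytes (3.11%)
```

The data and bss columns are identical in both builds, so the whole
difference in the dec column is this growth in text.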

Granted, the check was done back in 4.8, but I've no reason to believe
that 4.12 is any different, and enabling framepointer does impose quite a
substantial hit on workloads that spend a lot of time in the kernel.

-- 
Mel Gorman
SUSE Labs
