From: Josh Poimboeuf <jpoimboe@redhat.com>
To: x86@kernel.org
Cc: Andy Lutomirski <luto@amacapital.net>,
Pavel Machek <pavel@ucw.cz>,
kernel list <linux-kernel@vger.kernel.org>,
Ingo Molnar <mingo@kernel.org>,
Andrew Lutomirski <luto@kernel.org>,
Borislav Petkov <bp@alien8.de>, Brian Gerst <brgerst@gmail.com>,
Denys Vlasenko <dvlasenk@redhat.com>, Peter Anvin <hpa@zytor.com>,
Peter Zijlstra <peterz@infradead.org>,
Thomas Gleixner <tglx@linutronix.de>,
Steven Rostedt <rostedt@goodmis.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH v2] x86: mostly disable '-maccumulate-outgoing-args'
Date: Thu, 16 Mar 2017 14:31:33 -0500
Message-ID: <20170316193133.zrj6gug53766m6nn@treble>
In-Reply-To: <20170316154208.6c3mm6qjus3qtr6w@treble>

The gcc '-maccumulate-outgoing-args' flag is enabled for most configs,
mostly because of issues which are no longer relevant.  For most
configs, and with most recent versions of gcc, it's no longer needed.

Clarify which cases need it, and only enable it for those cases.  Also
produce a compile-time error for the ftrace graph + mcount + '-Os' case,
which will otherwise cause runtime failures.

The main benefit of '-maccumulate-outgoing-args' is that it prevents an
ugly prologue for functions which have aligned stacks.  But removing the
option also has some benefits: more readable argument saves, smaller
text size, and (presumably) slightly improved performance.

Here are the object size savings for 32-bit and 64-bit defconfig
kernels:

      text	   data	    bss	     dec	    hex	filename
  10006710	3543328	1773568	15323606	 e9d1d6	vmlinux.x86-32.before
   9706358	3547424	1773568	15027350	 e54c96	vmlinux.x86-32.after

      text	   data	    bss	     dec	    hex	filename
  10652105	4537576	 843776	16033457	 f4a6b1	vmlinux.x86-64.before
  10639629	4537576	 843776	16020981	 f475f5	vmlinux.x86-64.after

That comes out to a 3% text size improvement on x86-32 and a 0.1% text
size improvement on x86-64.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
v2:
- improve readability of the comments
- add comment about why gcc version needs to be checked
- use cc-option-yn instead of cc-option
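
For reference, the version check mentioned above comes down to one integer
comparison on the six-digit MMmmpp string that cc-fullversion prints.  The
following is only a rough sketch of that comparison outside of make.  The
function name cc_if_fullversion and its argument order are made up for this
illustration and are not part of Kbuild.

```shell
#!/bin/sh
# Hypothetical stand-in for the cc-if-fullversion make helper.
#   $1: compiler version as MMmmpp (e.g. 040502 for gcc 4.5.2); in Kbuild
#       this value would come from $(cc-fullversion)
#   $2: a test(1) integer operator such as -lt or -ge
#   $3: reference version in the same MMmmpp format
#   $4: text to print when the comparison holds
#   $5: text to print otherwise (optional, defaults to empty)
cc_if_fullversion() {
    if [ "$1" "$2" "$3" ]; then
        echo "$4"
    else
        echo "${5:-}"
    fi
}

cc_if_fullversion 040501 -lt 040502 1    # gcc 4.5.1: prints 1
cc_if_fullversion 040502 -lt 040502 1    # gcc 4.5.2: prints an empty line
```

test(1) compares its operands as decimal integers, so the leading zero in
040502 is harmless here.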
 arch/x86/Makefile        | 31 +++++++++++++++++++++++++++----
 arch/x86/Makefile_32.cpu | 18 ------------------
 arch/x86/kernel/ftrace.c |  6 ++++++
 scripts/Kbuild.include   |  4 ++++
 4 files changed, 37 insertions(+), 22 deletions(-)

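
The text size percentages quoted in the changelog can be rechecked from the
size(1) numbers with a few lines of shell.  This is just a sketch: the
function name text_savings_bp is made up for the example, and it assumes a
shell with 64-bit arithmetic (bash, and dash on 64-bit hosts).

```shell
#!/bin/sh
# text_savings_bp BEFORE AFTER
# Prints the reduction from BEFORE to AFTER in hundredths of a percent,
# truncated because shell arithmetic is integer-only.
text_savings_bp() {
    echo $(( ($1 - $2) * 10000 / $1 ))
}

text_savings_bp 10006710 9706358     # x86-32 text: prints 300, i.e. 3.00%
text_savings_bp 10652105 10639629    # x86-64 text: prints 11, i.e. 0.11%
```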
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 2d44933..04c87be 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -120,10 +120,6 @@ else
# -funit-at-a-time shrinks the kernel .text considerably
# unfortunately it makes reading oopses harder.
KBUILD_CFLAGS += $(call cc-option,-funit-at-a-time)
-
- # this works around some issues with generating unwind tables in older gccs
- # newer gccs do it by default
- KBUILD_CFLAGS += $(call cc-option,-maccumulate-outgoing-args)
endif
ifdef CONFIG_X86_X32
@@ -147,6 +143,33 @@ ifeq ($(CONFIG_KMEMCHECK),y)
KBUILD_CFLAGS += $(call cc-option,-fno-builtin-memcpy)
endif
+# If the function graph tracer is used with mcount instead of fentry,
+# '-maccumulate-outgoing-args' is needed to prevent a gcc bug
+# (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42109)
+ifdef CONFIG_FUNCTION_GRAPH_TRACER
+ ifndef CONFIG_HAVE_FENTRY
+ ACCUMULATE_OUTGOING_ARGS := 1
+ else
+ ifeq ($(call cc-option-yn, -mfentry), n)
+ ACCUMULATE_OUTGOING_ARGS := 1
+ endif
+ endif
+endif
+
+# Jump labels need '-maccumulate-outgoing-args' for gcc < 4.5.2 to prevent a
+# gcc bug (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46226). There's no way
+# to test for this bug at compile-time because the test case needs to execute,
+# which is a no-go for cross compilers. So check the gcc version instead.
+ifdef CONFIG_JUMP_LABEL
+ ifneq ($(ACCUMULATE_OUTGOING_ARGS), 1)
+ ACCUMULATE_OUTGOING_ARGS = $(call cc-if-fullversion, -lt, 040502, 1)
+ endif
+endif
+
+ifeq ($(ACCUMULATE_OUTGOING_ARGS), 1)
+ KBUILD_CFLAGS += -maccumulate-outgoing-args
+endif
+
# Stackpointer is addressed different for 32 bit and 64 bit x86
sp-$(CONFIG_X86_32) := esp
sp-$(CONFIG_X86_64) := rsp
diff --git a/arch/x86/Makefile_32.cpu b/arch/x86/Makefile_32.cpu
index 6647ed4..a45eb15 100644
--- a/arch/x86/Makefile_32.cpu
+++ b/arch/x86/Makefile_32.cpu
@@ -45,24 +45,6 @@ cflags-$(CONFIG_MGEODE_LX) += $(call cc-option,-march=geode,-march=pentium-mmx)
# cpu entries
cflags-$(CONFIG_X86_GENERIC) += $(call tune,generic,$(call tune,i686))
-# Work around the pentium-mmx code generator madness of gcc4.4.x which
-# does stack alignment by generating horrible code _before_ the mcount
-# prologue (push %ebp, mov %esp, %ebp) which breaks the function graph
-# tracer assumptions. For i686, generic, core2 this is set by the
-# compiler anyway
-ifeq ($(CONFIG_FUNCTION_GRAPH_TRACER), y)
-ADD_ACCUMULATE_OUTGOING_ARGS := y
-endif
-
-# Work around to a bug with asm goto with first implementations of it
-# in gcc causing gcc to mess up the push and pop of the stack in some
-# uses of asm goto.
-ifeq ($(CONFIG_JUMP_LABEL), y)
-ADD_ACCUMULATE_OUTGOING_ARGS := y
-endif
-
-cflags-$(ADD_ACCUMULATE_OUTGOING_ARGS) += $(call cc-option,-maccumulate-outgoing-args)
-
# Bug fix for binutils: this option is required in order to keep
# binutils from generating NOPL instructions against our will.
ifneq ($(CONFIG_X86_P6_NOP),y)
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 8f3d9cf..59f9b46 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -29,6 +29,12 @@
#include <asm/ftrace.h>
#include <asm/nops.h>
+#if defined(CONFIG_FUNCTION_GRAPH_TRACER) && \
+ !defined(CC_USING_FENTRY) && \
+ !defined(CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE)
+# error Your compiler does not support function graph tracing
+#endif
+
#ifdef CONFIG_DYNAMIC_FTRACE
int ftrace_arch_code_modify_prepare(void)
diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
index d6ca649..afe3fd3 100644
--- a/scripts/Kbuild.include
+++ b/scripts/Kbuild.include
@@ -148,6 +148,10 @@ cc-fullversion = $(shell $(CONFIG_SHELL) \
# Usage: EXTRA_CFLAGS += $(call cc-ifversion, -lt, 0402, -O1)
cc-ifversion = $(shell [ $(cc-version) $(1) $(2) ] && echo $(3) || echo $(4))
+# cc-if-fullversion
+# Usage: EXTRA_CFLAGS += $(call cc-if-fullversion, -lt, 040502, -O1)
+cc-if-fullversion = $(shell [ $(cc-fullversion) $(1) $(2) ] && echo $(3) || echo $(4))
+
# cc-ldoption
# Usage: ldflags += $(call cc-ldoption, -Wl$(comma)--hash-style=both)
cc-ldoption = $(call try-run,\
--
2.7.4
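
As a reading aid, the selection logic added to arch/x86/Makefile above can be
restated as a small shell function.  This is a sketch rather than the real
build logic: the function name and its y/n arguments are hypothetical
stand-ins for the Kconfig symbols and the cc-option-yn result, and it prints
0 where the Makefile simply leaves ACCUMULATE_OUTGOING_ARGS unset.

```shell
#!/bin/sh
# need_accumulate_outgoing_args GRAPH_TRACER HAVE_FENTRY CC_MFENTRY JUMP_LABEL GCC_VER
#   GRAPH_TRACER  y/n, stands in for CONFIG_FUNCTION_GRAPH_TRACER
#   HAVE_FENTRY   y/n, stands in for CONFIG_HAVE_FENTRY
#   CC_MFENTRY    y/n, stands in for $(call cc-option-yn, -mfentry)
#   JUMP_LABEL    y/n, stands in for CONFIG_JUMP_LABEL
#   GCC_VER       MMmmpp, stands in for $(cc-fullversion)
# Prints 1 if '-maccumulate-outgoing-args' would be added, 0 otherwise.
need_accumulate_outgoing_args() {
    graph_tracer=$1 have_fentry=$2 cc_mfentry=$3 jump_label=$4 gcc_ver=$5

    # Function graph tracer falling back to mcount (gcc PR 42109).
    if [ "$graph_tracer" = y ]; then
        if [ "$have_fentry" != y ] || [ "$cc_mfentry" != y ]; then
            echo 1
            return
        fi
    fi

    # Jump labels built with gcc older than 4.5.2 (gcc PR 46226).
    if [ "$jump_label" = y ] && [ "$gcc_ver" -lt 040502 ]; then
        echo 1
        return
    fi

    echo 0
}

need_accumulate_outgoing_args y y y y 060300    # fentry, new gcc: prints 0
need_accumulate_outgoing_args y n n n 060300    # mcount graph tracer: prints 1
need_accumulate_outgoing_args n y y y 040404    # jump labels, gcc 4.4.4: prints 1
```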