linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Josh Poimboeuf <jpoimboe@redhat.com>
To: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	the arch/x86 maintainers <x86@kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Andy Lutomirski <luto@kernel.org>,
	Alexander Potapenko <glider@google.com>,
	Dmitriy Vyukov <dvyukov@google.com>,
	Matthias Kaehlcke <mka@chromium.org>,
	Arnd Bergmann <arnd@arndb.de>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [RFC PATCH 3/4] x86/asm: Make alternative macro interfaces more clear and consistent
Date: Mon, 18 Sep 2017 12:40:31 -0500	[thread overview]
Message-ID: <20170918174031.u5bb6oxpsr6u7gad@treble> (raw)
In-Reply-To: <a8671d1f-01fb-2be4-1036-b27102d6fbc7@virtuozzo.com>

On Sun, Sep 17, 2017 at 01:22:32AM +0300, Andrey Ryabinin wrote:
> On 09/16/2017 02:29 AM, Josh Poimboeuf wrote:
> > On Fri, Sep 15, 2017 at 11:01:19AM -0700, Linus Torvalds wrote:
> >> On Fri, Sep 15, 2017 at 9:53 AM, Andrey Ryabinin
> >> <aryabinin@virtuozzo.com> wrote:
> >>>
> >>> I'm not so sure that this is disabled optimization. I assume that global rsp makes
> >>> changes something in gcc's register allocation logic, or something like that leading
> >>> to subtle changes in generated code.
> >>>
> >>> But what I recently find out, is that this "regression" sometimes is actually improvement in .text size.
> >>> It all depends on .config, e.g:
> >>
> >> Oh, that would be lovely and solve all the issues.
> >>
> >> And looking at the code generation differences for one file
> >> (kernel/futex.c) and one single config (my default config), the thing
> >> that the global stack register seems to change is that it moves some
> >> code - particularly completely unrelated inline asm code - inside the
> >> region protected by frame pointers.
> >>
> >> There are a few register allocation changes too, but they didn't seem
> >> to make code worse, and I think they were just "incidental" from code
> >> movement. And most code movement really seemed to be around inline
> >> asms, I  wonder if the gcc logic simply is something like "if the
> >> stack pointer is visible as a register, don't move any inline asm
> >> across a frame setup".
> >>
> >> In fact, on that one file and one configuration, the resulting
> >> assembler file had three fewer lines of code with that global stack
> >> register declaration than with the local one.
> >>
> >> So at least from just that one case, I can back up Andrey's
> >> observation: it's not that the code gets worse, it just is slightly
> >> different. Sometimes it's better.
> >>
> >> So maybe that simple patch to just make the stack pointer be a global
> >> register declaration really is the fix for this issue.
> >>
> >> It's not *pretty*, and I'd much rather just see some explicit way for
> >> us to say "this asm wants the frame to be set up", but of the
> >> alternatives we've seen, maybe it's the right thing to do?
> > 
> > Ok, here's the (compile tested) patch in case anybody wants to try it
> > out.
> > 
> 
> I already had almost the same patch, so I just used mine. Patch seems to be
> the same as yours except cosmetic details which shouldn't affect the end result.
> But just in case it posted below.
> 
> 
> The patch was applied on top of b38923a068c10fc36ca8f596d650d095ce390b85 commit,
> kernel compiled gcc 7.2.0 
> 
> allnoconfig:   text    data     bss     dec     hex filename
> base           946920  206384 1215752 2369056  242620 allnoconfig/vmlinux
> patched        946501  206384 1215752 2368637  24247d allnoconfig_p/vmlinux
> 
> defconfig:     text    data     bss     dec     hex filename
> base           10729978        4840000  876544 16446522         faf43a defconfig/vmlinux
> patched        10730666        4840000  876544 16447210         faf6ea defconfig_p/vmlinux
> 
> allyesconfig:   text    data     bss     dec     hex filename
> base            161016572       152888347       49303552        363208471       15a61f17        allyesconfig/vmlinux
> patched         161003557       152888347       49303552        363195456       15a5ec40        allyesconfig_p/vmlinux

I get similar results with my config: slightly smaller .text, both with
*and* without frame pointers.  So yeah, this is probably the best
option, or at least, the least-worst option.  I shouldn't have dismissed
the idea so quickly before.

(H.J. Lu suggested another idea, which is to use "rbp" as an input
constraint.  I tried it, but it added 22k of text to my kernel with
frame pointers disabled, so that's definitely worse than this.)

If testing continues to goes well, I'll submit an official patch soon.

-- 
Josh

  reply	other threads:[~2017-09-18 17:40 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-31 14:11 [RFC PATCH 0/4] x86/asm: Add ASM_CALL() macro for inline asms with call instructions Josh Poimboeuf
2017-08-31 14:11 ` [RFC PATCH 1/4] x86/paravirt: Fix output constraint macro names Josh Poimboeuf
2017-08-31 14:11 ` [RFC PATCH 2/4] x86/asm: Convert some inline asm positional operands to named operands Josh Poimboeuf
2017-08-31 14:11 ` [RFC PATCH 3/4] x86/asm: Make alternative macro interfaces more clear and consistent Josh Poimboeuf
2017-08-31 16:11   ` Linus Torvalds
2017-08-31 17:25     ` Josh Poimboeuf
2017-08-31 17:31       ` Josh Poimboeuf
2017-09-02 10:32         ` Ingo Molnar
2017-09-14 14:48           ` Josh Poimboeuf
2017-09-14 17:16             ` Linus Torvalds
2017-09-14 17:26               ` Josh Poimboeuf
2017-09-14 17:33                 ` Josh Poimboeuf
2017-09-14 18:28                   ` Linus Torvalds
2017-09-14 18:45                     ` Josh Poimboeuf
2017-09-15 16:10                       ` Josh Poimboeuf
2017-09-15 16:53       ` Andrey Ryabinin
2017-09-15 17:20         ` Josh Poimboeuf
2017-09-15 18:01         ` Linus Torvalds
2017-09-15 23:29           ` Josh Poimboeuf
2017-09-16 22:22             ` Andrey Ryabinin
2017-09-18 17:40               ` Josh Poimboeuf [this message]
2017-09-19 16:02           ` Josh Poimboeuf
2017-08-31 14:11 ` [RFC PATCH 4/4] x86/asm: Use ASM_CALL() macro for inline asm statements with call instructions Josh Poimboeuf
2017-08-31 14:50   ` Peter Zijlstra
2017-08-31 15:21     ` Josh Poimboeuf
2017-08-31 15:36       ` Dmitry Vyukov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170918174031.u5bb6oxpsr6u7gad@treble \
    --to=jpoimboe@redhat.com \
    --cc=arnd@arndb.de \
    --cc=aryabinin@virtuozzo.com \
    --cc=dvyukov@google.com \
    --cc=glider@google.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@kernel.org \
    --cc=mka@chromium.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).