From: Segher Boessenkool <segher@kernel.crashing.org>
To: David Laight <David.Laight@aculab.com>
Cc: "'Rasmus Villemoes'" <rasmus.villemoes@prevas.dk>,
	Christophe Leroy <christophe.leroy@csgroup.eu>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
	Paul Mackerras <paulus@samba.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] powerpc/vdso32: Add missing _restgpr_31_x to fix build failure
Date: Mon, 15 Mar 2021 18:59:47 -0500
Message-ID: <20210315235947.GD16691@gate.crashing.org>
In-Reply-To: <14e2cfb8c3f141aaba8fe0fb2d8f1885@AcuMS.aculab.com>

On Mon, Mar 15, 2021 at 04:38:52PM +0000, David Laight wrote:
> From: Rasmus Villemoes
> > Sent: 15 March 2021 16:24
> > On 12/03/2021 03.29, Segher Boessenkool wrote:
> > > On Tue, Mar 09, 2021 at 06:19:30AM +0000, Christophe Leroy wrote:
> > >> With some defconfig including CONFIG_CC_OPTIMIZE_FOR_SIZE,
> > >> (for instance mvme5100_defconfig and ps3_defconfig), gcc 5
> > >> generates a call to _restgpr_31_x.
> > >
> > >> I don't know if there is a way to tell GCC not to emit that call,
> > >> because at the end we get more instructions than needed.
> > >
> > > The function is required by the ABI, you need to have it.
> > >
> > > You get *fewer* insns statically, and that is what -Os is about: reduce
> > > the size of the binaries.
> >
> > Is there any reason to not just always build the vdso with -O2? It's one
> > page/one VMA either way, and the vdso is about making certain system
> > calls cheaper, so if unconditional -O2 could save a few cycles compared
> > to -Os, why not? (And if, as it seems, there's only one user within the
> > DSO of _restgpr_31_x, yes, the overall size of the .text segment
> > probably increases slightly).
>
> Sometimes -Os generates such horrid code you really never want to use it.
> A classic is on x86 where it replaces 'load register with byte constant'
> with 'push byte' 'pop register'.
> The code is actually smaller but the execution time is horrid.
>
> There are also cases where -O2 actually generates smaller code.

Yes, as with all heuristics it doesn't always work out.  But usually -Os
is smaller.

> Although you may need to disable loop unrolling (often dubious at best)
> and either force or disable some function inlining.

The cases where GCC does loop unrolling at -O2 always help quite a lot.
Or, do you have a counter-example?  We'd love to see one.

And yup, inlining is hard.  GCC's heuristics there are very good
nowadays, but any single decision has big effects.  Doing the important
spots manually (always_inline or noinline) has good payoff.


Segher
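For readers unfamiliar with the manual inlining control mentioned at the end of the message, here is a minimal sketch using GCC's always_inline and noinline function attributes. The function names (scale_ns, report_fallback, read_clock) and the arithmetic are hypothetical illustrations and are not taken from the powerpc vdso code or from this thread.

```c
#include <stdio.h>

/*
 * Hypothetical hot helper: always_inline asks GCC to inline it at every
 * call site, independent of the -Os or -O2 inlining heuristics.
 */
static inline __attribute__((always_inline))
unsigned int scale_ns(unsigned int cycles, unsigned int mult, unsigned int shift)
{
	return (unsigned int)(((unsigned long long)cycles * mult) >> shift);
}

/*
 * Hypothetical cold path: noinline keeps it out of line so the rarely
 * taken error handling does not grow the fast path of its callers.
 */
static __attribute__((noinline))
int report_fallback(const char *why)
{
	fprintf(stderr, "fallback: %s\n", why);
	return -1;
}

/* Fast path stays small: one branch, then the inlined multiply and shift. */
int read_clock(unsigned int cycles, int valid, unsigned int *out)
{
	if (!valid)
		return report_fallback("clock source not valid");

	*out = scale_ns(cycles, 3, 1);
	return 0;
}
```

Building this once with -Os and once with -O2 and comparing the output of objdump -d is one way to see how much the two attributes pin down, and how much is still left to the compiler's heuristics.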