All of lore.kernel.org
 help / color / mirror / Atom feed
From: Segher Boessenkool <segher@kernel.crashing.org>
To: David Laight <David.Laight@aculab.com>
Cc: "'Rasmus Villemoes'" <rasmus.villemoes@prevas.dk>,
	Christophe Leroy <christophe.leroy@csgroup.eu>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
	Paul Mackerras <paulus@samba.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] powerpc/vdso32: Add missing _restgpr_31_x to fix build failure
Date: Mon, 15 Mar 2021 18:59:47 -0500	[thread overview]
Message-ID: <20210315235947.GD16691@gate.crashing.org> (raw)
In-Reply-To: <14e2cfb8c3f141aaba8fe0fb2d8f1885@AcuMS.aculab.com>

On Mon, Mar 15, 2021 at 04:38:52PM +0000, David Laight wrote:
> From: Rasmus Villemoes
> > Sent: 15 March 2021 16:24
> > On 12/03/2021 03.29, Segher Boessenkool wrote:
> > > On Tue, Mar 09, 2021 at 06:19:30AM +0000, Christophe Leroy wrote:
> > >> With some defconfig including CONFIG_CC_OPTIMIZE_FOR_SIZE,
> > >> (for instance mvme5100_defconfig and ps3_defconfig), gcc 5
> > >> generates a call to _restgpr_31_x.
> > >
> > >> I don't know if there is a way to tell GCC not to emit that call, because at the end we get more
> > instructions than needed.
> > >
> > > The function is required by the ABI, you need to have it.
> > >
> > > You get *fewer* insns statically, and that is what -Os is about: reduce
> > > the size of the binaries.
> > 
> > Is there any reason to not just always build the vdso with -O2? It's one
> > page/one VMA either way, and the vdso is about making certain system
> > calls cheaper, so if unconditional -O2 could save a few cycles compared
> > to -Os, why not? (And if, as it seems, there's only one user within the
> > DSO of _restgpr_31_x, yes, the overall size of the .text segment
> > probably increases slightly).
> 
> Sometimes -Os generates such horrid code you really never want to use it.
> A classic is on x86 where it replaces 'load register with byte constant'
> with 'push byte' 'pop register'.
> The code is actually smaller but the execution time is horrid.
> 
> There are also cases where -O2 actually generates smaller code.

Yes, as with all heuristics it doesn't always work out.  But usually -Os
is smaller.

> Although you may need to disable loop unrolling (often dubious at best)
> and either force or disable some function inlining.

The cases where GCC does loop unrolling at -O2 always help quite a lot.
Or, do you have a counter-example?  We'd love to see one.

And yup, inlining is hard.  GCC's heuristics there are very good
nowadays, but any single decision has big effects.  Doing the important
spots manually (always_inline or noinline) has good payoff.


Segher

WARNING: multiple messages have this Message-ID (diff)
From: Segher Boessenkool <segher@kernel.crashing.org>
To: David Laight <David.Laight@aculab.com>
Cc: Paul Mackerras <paulus@samba.org>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	'Rasmus Villemoes' <rasmus.villemoes@prevas.dk>
Subject: Re: [PATCH] powerpc/vdso32: Add missing _restgpr_31_x to fix build failure
Date: Mon, 15 Mar 2021 18:59:47 -0500	[thread overview]
Message-ID: <20210315235947.GD16691@gate.crashing.org> (raw)
In-Reply-To: <14e2cfb8c3f141aaba8fe0fb2d8f1885@AcuMS.aculab.com>

On Mon, Mar 15, 2021 at 04:38:52PM +0000, David Laight wrote:
> From: Rasmus Villemoes
> > Sent: 15 March 2021 16:24
> > On 12/03/2021 03.29, Segher Boessenkool wrote:
> > > On Tue, Mar 09, 2021 at 06:19:30AM +0000, Christophe Leroy wrote:
> > >> With some defconfig including CONFIG_CC_OPTIMIZE_FOR_SIZE,
> > >> (for instance mvme5100_defconfig and ps3_defconfig), gcc 5
> > >> generates a call to _restgpr_31_x.
> > >
> > >> I don't know if there is a way to tell GCC not to emit that call, because at the end we get more
> > instructions than needed.
> > >
> > > The function is required by the ABI, you need to have it.
> > >
> > > You get *fewer* insns statically, and that is what -Os is about: reduce
> > > the size of the binaries.
> > 
> > Is there any reason to not just always build the vdso with -O2? It's one
> > page/one VMA either way, and the vdso is about making certain system
> > calls cheaper, so if unconditional -O2 could save a few cycles compared
> > to -Os, why not? (And if, as it seems, there's only one user within the
> > DSO of _restgpr_31_x, yes, the overall size of the .text segment
> > probably increases slightly).
> 
> Sometimes -Os generates such horrid code you really never want to use it.
> A classic is on x86 where it replaces 'load register with byte constant'
> with 'push byte' 'pop register'.
> The code is actually smaller but the execution time is horrid.
> 
> There are also cases where -O2 actually generates smaller code.

Yes, as with all heuristics it doesn't always work out.  But usually -Os
is smaller.

> Although you may need to disable loop unrolling (often dubious at best)
> and either force or disable some function inlining.

The cases where GCC does loop unrolling at -O2 always help quite a lot.
Or, do you have a counter-example?  We'd love to see one.

And yup, inlining is hard.  GCC's heuristics there are very good
nowadays, but any single decision has big effects.  Doing the important
spots manually (always_inline or noinline) has good payoff.


Segher

  reply	other threads:[~2021-03-16  0:05 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-09  6:19 [PATCH] powerpc/vdso32: Add missing _restgpr_31_x to fix build failure Christophe Leroy
2021-03-09  6:19 ` Christophe Leroy
2021-03-12  2:29 ` Segher Boessenkool
2021-03-12  2:29   ` Segher Boessenkool
2021-03-15 16:23   ` Rasmus Villemoes
2021-03-15 16:23     ` Rasmus Villemoes
2021-03-15 16:38     ` David Laight
2021-03-15 16:38       ` David Laight
2021-03-15 23:59       ` Segher Boessenkool [this message]
2021-03-15 23:59         ` Segher Boessenkool
2021-03-16  9:35         ` David Laight
2021-03-16  9:35           ` David Laight
2021-03-15 23:47     ` Segher Boessenkool
2021-03-15 23:47       ` Segher Boessenkool
2021-03-12 13:09 ` Christophe Leroy
2021-03-15 13:31 ` Michael Ellerman
2021-03-15 13:31   ` Michael Ellerman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210315235947.GD16691@gate.crashing.org \
    --to=segher@kernel.crashing.org \
    --cc=David.Laight@aculab.com \
    --cc=christophe.leroy@csgroup.eu \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=paulus@samba.org \
    --cc=rasmus.villemoes@prevas.dk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.