Use -fno-unit-at-a-time if gcc supports it

Message ID 20030905004710.GA31000@averell
State New, archived

Commit Message

Andi Kleen Sept. 5, 2003, 12:47 a.m. UTC
Hallo,

gcc 3.4 current has switched to default -fno-unit-at-a-time mode for -O2. 
The 3.3-Hammer branch compiler used in some distributions also does this.

Unfortunately the kernel doesn't compile with unit-at-a-time currently,
it cannot tolerate the reordering of functions in relation to inline
assembly.

This patch just turns it off when gcc supports the option.

I only did it for i386 for now. The problem is actually not i386 specific
(other architectures break too), so it may make sense to move the check_gcc 
stuff into the main Makefile and do it for everybody.

-Andi

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Comments

Aaron Lehmann Sept. 5, 2003, 1:05 a.m. UTC | #1
On Fri, Sep 05, 2003 at 02:47:10AM +0200, Andi Kleen wrote:
> 
> Hallo,
> 
> gcc 3.4 current has switched to default -fno-unit-at-a-time mode for -O2. 
> The 3.3-Hammer branch compiler used in some distributions also does this.
> 
> Unfortunately the kernel doesn't compile with unit-at-a-time currently,

Did you mean -funit-at-a-time, rather than the converse?
Andi Kleen Sept. 5, 2003, 1:24 a.m. UTC | #2
On Thu, Sep 04, 2003 at 06:05:35PM -0700, Aaron Lehmann wrote:
> On Fri, Sep 05, 2003 at 02:47:10AM +0200, Andi Kleen wrote:
> > 
> > Hallo,
> > 
> > gcc 3.4 current has switched to default -fno-unit-at-a-time mode for -O2. 
> > The 3.3-Hammer branch compiler used in some distributions also does this.
> > 
> > Unfortunately the kernel doesn't compile with unit-at-a-time currently,
> 
> Did you mean -funit-at-a-time, rather than the converse?

Yep, sorry for the confusion.

It defaults to -funit-at-a-time now, but the kernel must use
-fno-unit-at-a-time.

-Andi
Jan Hubicka Sept. 5, 2003, 5:37 a.m. UTC | #3
> 
> Hallo,
> 
> gcc 3.4 current has switched to default -fno-unit-at-a-time mode for -O2. 
> The 3.3-Hammer branch compiler used in some distributions also does this.
> 
> Unfortunately the kernel doesn't compile with unit-at-a-time currently,
> it cannot tolerate the reordering of functions in relation to inline
> assembly.

How much work would it be to fix the kernel in this regard?
Are there some cases where this is essential?  The kernel would be a nice
target for whole-program optimization, and GCC is not that far from it
right now.

Honza
> 
> This patch just turns it off when gcc supports the option.
> 
> I only did it for i386 for now. The problem is actually not i386 specific
> (other architectures break too), so it may make sense to move the check_gcc 
> stuff into the main Makefile and do it for everybody.
> 
> -Andi
> 
> --- linux-2.6.0test4-work/arch/i386/Makefile-o	2003-08-23 13:03:08.000000000 +0200
> +++ linux-2.6.0test4-work/arch/i386/Makefile	2003-09-05 02:14:07.000000000 +0200
> @@ -26,6 +26,10 @@
>  # prevent gcc from keeping the stack 16 byte aligned
>  CFLAGS += $(call check_gcc,-mpreferred-stack-boundary=2,)
>  
> +# gcc 3.4/3.3-hammer support -funit-at-a-time mode, but the Kernel is not ready
> +# for it yet
> +CFLAGS += $(call check_gcc,-fno-unit-at-a-time,)
> +
>  align := $(subst -functions=0,,$(call check_gcc,-falign-functions=0,-malign-functions=0))
>  
>  cflags-$(CONFIG_M386)		+= -march=i386
Linus Torvalds Sept. 5, 2003, 2:54 p.m. UTC | #4
On Fri, 5 Sep 2003, Andi Kleen wrote:
> 
> Unfortunately the kernel doesn't compile with unit-at-a-time currently,
> it cannot tolerate the reordering of functions in relation to inline
> assembly.

What is the problem exactly? Is it the exception table getting unordered?  
We _could_ just sort it at boot-time (or, even better, at build time after
the final link) instead...

		Linus

Andreas Jaeger Sept. 5, 2003, 3:17 p.m. UTC | #5
Linus Torvalds <torvalds@osdl.org> writes:

> On Fri, 5 Sep 2003, Andi Kleen wrote:
>> 
>> Unfortunately the kernel doesn't compile with unit-at-a-time currently,
>> it cannot tolerate the reordering of functions in relation to inline
>> assembly.
>
> What is the problem exactly? Is it the exception table getting unordered?  
> We _could_ just sort it at boot-time (or, even better, at build time after
> the final link) instead...

The problem is that unit-at-a-time sees all functions used and finds
some static functions/variables that are not called anywhere and
therefore drops them, making a smaller binary.  Since GCC does not
look into inline assembler, anything referenced from inline assembler
only, will be treated as not used and therefore removed.

You have two options:
- use __attribute__((used)) (implemented since GCC 3.2) to tell GCC that
  a function/variable should never be removed
- use -fno-unit-at-a-time.

Since unit-at-a-time has better inlining heuristics, the better way is
to add the used attribute, but that takes some time.  The short-term
solution would be to add the compiler flag.

Andreas
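
A minimal sketch of the first option, assuming GCC or a compatible compiler. The names `KEEP_SYMBOL` and `asm_only_helper` are made up for illustration and do not come from the kernel. The helper stands in for a static function whose only real caller would be hand-written assembly.

```c
/* Hypothetical macro name: expands to the used attribute where available. */
#if defined(__GNUC__)
#define KEEP_SYMBOL __attribute__((used))
#else
#define KEEP_SYMBOL /* no equivalent assumed for other compilers */
#endif

/*
 * Stand-in for a static function that is entered only from assembly.
 * No C code in this file calls it, so without the attribute a
 * unit-at-a-time compiler is free to discard it.
 */
static KEEP_SYMBOL int asm_only_helper(int x)
{
	return x + 1;
}
```

With the attribute in place the function should still be emitted into the object file even though the compiler sees no caller. Dropping the attribute and building with -funit-at-a-time is one way to check whether a given compiler really removes it.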
Andreas Jaeger Sept. 5, 2003, 4:10 p.m. UTC | #6
Robert Love <rml@tech9.net> writes:

> On Fri, 2003-09-05 at 11:17, Andreas Jaeger wrote:
>
>
>> Since unit-at-a-time has better inlining heuristics the better way is
>> to add the used attribute - but that takes some time.  The short-term
>> solution would be to add the compiler flag,
>
> Won't we get a linker error if a static symbol is used but
> optimized-away?  It shouldn't be hard to fix the n linker errors that
> crop up.

Yes, we would get a linker error.

> And why are we using static symbols in inline assembly outside of the
> compilation scope?

Don't know.

> Anyhow, if it generates an error, this isn't hard to fix.

Just lots of places...

> Here is the start...
>
> 	Robert Love
>
>
> --- linux-rml/include/linux/compiler.h	Fri Sep  5 11:57:56 2003
> +++ linux/include/linux/compiler.h	Fri Sep  5 12:02:02 2003
> @@ -74,6 +74,19 @@
>  #define __attribute_pure__	/* unimplemented */
>  #endif
>  
> +/*
> + * As of gcc 3.2, we can mark a function as 'used' and gcc will assume that,
> + * even if it does not find a reference to it in any compilation unit.  We
> + * need this for gcc 3.4 and beyond, which can optimize on a program-wide
> + * scope, and not just one file at a time, to avoid static symbols being
> + * discarded.
> + */
> +#if (__GNUC__ == 3 && __GNUC_MINOR__ > 1) || __GNUC__ > 3
> +#define __attribute_used__	__attribute__((used))
> +#else
> +#define __attribute_used__	/* unimplemented */

In glibc we have for the else case:
# define __attribute_used__ __attribute__ ((__unused__))

This might reduce warnings about unused functions.  But this change is
not critical IMO, so your patch looks fine!

> +#endif
> +
>  /* This macro obfuscates arithmetic on a variable address so that gcc
>     shouldn't recognize the original var, and make assumptions about it */
>  #define RELOC_HIDE(ptr, off)					\

Andreas
Rob Love Sept. 5, 2003, 4:16 p.m. UTC | #7
On Fri, 2003-09-05 at 11:17, Andreas Jaeger wrote:


> Since unit-at-a-time has better inlining heuristics the better way is
> to add the used attribute - but that takes some time.  The short-term
> solution would be to add the compiler flag,

Won't we get a linker error if a static symbol is used but
optimized-away?  It shouldn't be hard to fix the n linker errors that
crop up.

And why are we using static symbols in inline assembly outside of the
compilation scope?

Anyhow, if it generates an error, this isn't hard to fix.

Here is the start...

	Robert Love


--- linux-rml/include/linux/compiler.h	Fri Sep  5 11:57:56 2003
+++ linux/include/linux/compiler.h	Fri Sep  5 12:02:02 2003
@@ -74,6 +74,19 @@
 #define __attribute_pure__	/* unimplemented */
 #endif
 
+/*
+ * As of gcc 3.2, we can mark a function as 'used' and gcc will assume that,
+ * even if it does not find a reference to it in any compilation unit.  We
+ * need this for gcc 3.4 and beyond, which can optimize on a program-wide
+ * scope, and not just one file at a time, to avoid static symbols being
+ * discarded.
+ */
+#if (__GNUC__ == 3 && __GNUC_MINOR__ > 1) || __GNUC__ > 3
+#define __attribute_used__	__attribute__((used))
+#else
+#define __attribute_used__	/* unimplemented */
+#endif
+
 /* This macro obfuscates arithmetic on a variable address so that gcc
    shouldn't recognize the original var, and make assumptions about it */
 #define RELOC_HIDE(ptr, off)					\


Jakub Jelinek Sept. 5, 2003, 5:19 p.m. UTC | #8
On Fri, Sep 05, 2003 at 05:17:00PM +0200, Andreas Jaeger wrote:
> Linus Torvalds <torvalds@osdl.org> writes:
> 
> > On Fri, 5 Sep 2003, Andi Kleen wrote:
> >> 
> >> Unfortunately the kernel doesn't compile with unit-at-a-time currently,
> >> it cannot tolerate the reordering of functions in relation to inline
> >> assembly.
> >
> > What is the problem exactly? Is it the exception table getting unordered?  
> > We _could_ just sort it at boot-time (or, even better, at build time after
> > the final link) instead...
> 
> The problem is that unit-at-a-time sees all functions used and finds
> some static functions/variables that are not called anywhere and
> therefore drops them, making a smaller binary.  Since GCC does not
> look into inline assembler, anything referenced from inline assembler
> only, will be treated as not used and therefore removed.
> 
> You have two options:
> - use attribute ((used)) (implemented since GCC 3.2) to tell GCC that
>   a function/variable should never be removed

To be precise, implemented since GCC 3.2 for functions and since GCC 3.3
for variables.

	Jakub
Andi Kleen Sept. 5, 2003, 5:27 p.m. UTC | #9
> How much work would be to fix kernel in this regard?

The big problem is that -funit-at-a-time is not widely used yet,
so even if we fix the kernel at some point it would likely
get broken again all the time by people who use older compilers
(= most kernel developers currently).

> Are there some cases where this is essential?  Kernel would be nice
> target to whole program optimization and GCC is not that far from it
> right now.

I'm not sure that is that good an idea. When I was still hacking
TCP I deliberately moved some stuff out-of-line in the fast path to avoid
register pressure. Otherwise gcc would inline rarely used sub functions
and completely mess up the register allocation in the fast path.
Of course just a call alone messes up the registers somewhat because
of its clobbers, but a full inlining is usually worse.

That was a long time ago; of course the code has changed significantly
since then.

I suspect that is true for a lot of core kernel code - everything
that is worth inlining is already inlined and for the rest it doesn't matter.

On the other hand a lot of driver code seems to be written without
manual consideration for inline. For that it may be worth it. But then
I would consider core kernel code to be more important than driver
code.

Also I fear cross module inlining would expose a lot of latent bugs
(missing barriers etc.) when the optimizer becomes more aggressive. 
I'm not saying this would be a bad thing, just that it may be a lot 
of work to fix (both for compiler and kernel people)

-Andi

Andi Kleen Sept. 5, 2003, 5:30 p.m. UTC | #10
> You have two options:
> - use attribute ((used)) (implemented since GCC 3.2) to tell GCC that
>   a function/variable should never be removed
> - use -fno-unit-at-a-time.

Another problem is the way 32bit emulation is implemented in many 
64bit ports (all copying from sparc64) and now unified. This assumes 
an ordering between global functions and global assembly too. 

Not for i386 though. I think Andrew has already done some cleanups
in this area recently too, but it may still be dubious.

-Andi
Jeff Garzik Sept. 5, 2003, 5:59 p.m. UTC | #11
On Fri, Sep 05, 2003 at 07:27:15PM +0200, Andi Kleen wrote:
> I'm not sure that is that good an idea. When I was still hacking 
> TCP I especially moved some stuff out-of-line in the fast path to avoid 
> register pressure. Otherwise gcc would inline rarely used sub functions 
> and completely mess up the register allocation in the fast path.
> Of course just a call alone messes up the registers somewhat because
> of its clobbers, but a full inlining is usually worse.
[...]
> I suspect that is true for a lot of core kernel code - everything
> that is worth inlining is already inlined and for the rest it doesn't matter.

Definitely, agreed.  In fact, we are moving in the opposite direction:
looking into what we can un-inline...


> On the other hand a lot of driver code seems to be written without
> manual consideration for inline. For that it may be worth it. But then
> I would consider core kernel code to be more important than driver
> code.

Modern network drivers seem fairly aware of it ;-)

> Also I fear cross module inlining would expose a lot of latent bugs
> (missing barriers etc.) when the optimizer becomes more aggressive. 
> I'm not saying this would be a bad thing, just that it may be a lot 
> of work to fix (both for compiler and kernel people)

Agreed.

	Jeff



Jan Hubicka Sept. 6, 2003, 7:06 a.m. UTC | #12
> On Fri, 2003-09-05 at 11:17, Andreas Jaeger wrote:
> 
> 
> > Since unit-at-a-time has better inlining heuristics the better way is
> > to add the used attribute - but that takes some time.  The short-term
> > solution would be to add the compiler flag,
> 
> Won't we get a linker error if a static symbol is used but
> optimized-away?  It shouldn't be hard to fix the n linker errors that
> crop up.

Yes, you get a linker error.
You may also run into miscompilation if the function is static, is
called both by hand from asm and by an ordinary function call, and
lacks both the used attribute and an asmlinkage definition.  In that
case GCC may decide to switch it to the register calling convention on
i386, breaking the asm code.

I would expect this to be rare, as functions tend to be used either by
assembly or by normal code but not by both.
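
A rough sketch of guarding against that case, assuming GCC. `DEMO_ASMLINKAGE` is modeled on the i386 kernel's asmlinkage (regparm(0)) and is left empty on other targets. `DEMO_USED`, `demo_entry` and `demo_c_caller` are hypothetical names used only for this example.

```c
/* Modeled on the i386 asmlinkage definition; empty elsewhere (assumption). */
#if defined(__GNUC__) && defined(__i386__)
#define DEMO_ASMLINKAGE __attribute__((regparm(0)))
#else
#define DEMO_ASMLINKAGE
#endif

#if defined(__GNUC__)
#define DEMO_USED __attribute__((used))
#else
#define DEMO_USED
#endif

/*
 * Imagine this function is also entered by name from assembly that
 * passes its arguments on the stack.  Stating the calling convention
 * and marking the function used documents that dependency instead of
 * leaving the choice to the optimizer.
 */
static DEMO_USED DEMO_ASMLINKAGE int demo_entry(int a, int b)
{
	return a - b;
}

/* The ordinary C caller that Jan's scenario also requires. */
int demo_c_caller(void)
{
	return demo_entry(10, 3);
}
```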
> 
> And why are we using static symbols in inline assembly outside of the
> compilation scope?

Toplevel asm statements are a common source of this, at least in glibc.
I didn't look much into the kernel sources.

I would be very happy if someone looked into that.  It may well be
that implementing the tricks you currently do with toplevel asm
statements would need further extensions in GCC now, and it would be nice
to know about that.

For instance, it used to be possible to force a function into a given
section by changing the section by hand, but now you have to use the section
attribute (that is cleaner anyway).
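
For illustration, a small sketch of the section attribute, assuming GCC on an ELF target. The macro `DEMO_SECTION`, the section name ".text.demo" and the function name are invented for this example.

```c
/* Only apply the attribute where ELF-style section names are expected. */
#if defined(__GNUC__) && defined(__ELF__)
#define DEMO_SECTION(name) __attribute__((section(name)))
#else
#define DEMO_SECTION(name)
#endif

/*
 * The compiler places this function in the named section itself, so
 * no toplevel asm section directive has to stay next to it in the
 * output.
 */
DEMO_SECTION(".text.demo") int demo_in_section(void)
{
	return 7;
}
```

On an ELF system, `objdump -h` on the resulting object file is one way to confirm that the section exists.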
> 
> Anyhow, if it generates an error, this isn't hard to fix.
> 
> Here is the start...
> 
> 	Robert Love
> 
> 
> --- linux-rml/include/linux/compiler.h	Fri Sep  5 11:57:56 2003
> +++ linux/include/linux/compiler.h	Fri Sep  5 12:02:02 2003
> @@ -74,6 +74,19 @@
>  #define __attribute_pure__	/* unimplemented */
>  #endif
>  
> +/*
> + * As of gcc 3.2, we can mark a function as 'used' and gcc will assume that,
> + * even if it does not find a reference to it in any compilation unit.  We
> + * need this for gcc 3.4 and beyond, which can optimize on a program-wide
> + * scope, and not just one file at a time, to avoid static symbols being
> + * discarded.
> + */
> +#if (__GNUC__ == 3 && __GNUC_MINOR__ > 1) || __GNUC__ > 3
> +#define __attribute_used__	__attribute__((used))
> +#else
> +#define __attribute_used__	/* unimplemented */
> +#endif
> +
I believe there is a little trick: attribute used works for either
variables or functions.  Functions can be marked as used only for GCC
3.4+ if I am right, so you may need __attribute_used_function__ and
__attribute_used_variable__ macros for that.

Honza
>  /* This macro obfuscates arithmetic on a variable address so that gcc
>     shouldn't recognize the original var, and make assumptions about it */
>  #define RELOC_HIDE(ptr, off)					\
> 
> 
Jan Hubicka Sept. 6, 2003, 7:08 a.m. UTC | #13
> > How much work would be to fix kernel in this regard?
> 
> The big problem is that -funit-at-a-time is not widely used yet,
> so even if we fix the kernel at some point it would likely 
> get broken again all the time by people who use older kernels
> (= most kernel developers currently)
> 
> > Are there some cases where this is essential?  Kernel would be nice
> > target to whole program optimization and GCC is not that far from it
> > right now.
> 
> I'm not sure that is that good an idea. When I was still hacking 
> TCP I especially moved some stuff out-of-line in the fast path to avoid 
> register pressure. Otherwise gcc would inline rarely used sub functions 
> and completely mess up the register allocation in the fast path.
> Of course just a call alone messes up the registers somewhat because
> of its clobbers, but a full inlining is usually worse.

You can use -O2 and rely on inlining done by hand.  I can add an option
-fno-inline-functions-called-once.  That should avoid such problems.
Anyway, it would be nice to mark functions that exist for this reason with
the noinline attribute so the compiler knows about it, but that is a
different story.
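
As a sketch of that last suggestion, assuming GCC: a rarely taken path is marked noinline so it stays out of the hot function regardless of the inliner's heuristics. The names `DEMO_NOINLINE`, `demo_slow_path` and `demo_fast_path` are hypothetical.

```c
#if defined(__GNUC__)
#define DEMO_NOINLINE __attribute__((noinline))
#else
#define DEMO_NOINLINE
#endif

/*
 * Rarely executed branch, deliberately kept out of line so that its
 * register needs do not leak into the fast path.
 */
static DEMO_NOINLINE int demo_slow_path(int len)
{
	/* placeholder for bulky error handling */
	return -len;
}

/* Hot function: the common case is short and calls nothing. */
int demo_fast_path(int len)
{
	if (len < 0)
		return demo_slow_path(len);
	return len * 2;
}
```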
> 
> Also I fear cross module inlining would expose a lot of latent bugs
> (missing barriers etc.) when the optimizer becomes more aggressive. 
> I'm not saying this would be a bad thing, just that it may be a lot 
> of work to fix (both for compiler and kernel people)

Some of this should be already tested by folks using Intel compiler I
would hope.

Honza
> 
> -Andi
> 
Helge Hafting Sept. 8, 2003, 9:48 a.m. UTC | #14
Andreas Jaeger wrote:
[...]
> The problem is that unit-at-a-time sees all functions used and finds
> some static functions/variables that are not called anywhere and
> therefore drops them, making a smaller binary.  Since GCC does not
> look into inline assembler, anything referenced from inline assembler
> only, will be treated as not used and therefore removed.
> 
> You have two options:
> - use attribute ((used)) (implemented since GCC 3.2) to tell GCC that
>   a function/variable should never be removed
> - use -fno-unit-at-a-time.
> 
> Since unit-at-a-time has better inlining heuristics the better way is
> to add the used attribute - but that takes some time.  The short-term
> solution would be to add the compiler flag,
> 
Seems to me that a better solution is to mark the assembly code
in question so gcc knows that it somehow calls the function.
That still allows optimizing away the function whenever the
assembly itself is left out (module not compiled or similar).
Marking the function "used" includes it anyway.

I realize this way probably isn't supported right now,
but people are talking about changing gcc so I
mentioned it as an ideal way.

Helge Hafting
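
One way to approximate this idea, offered as a sketch and assuming GCC-style extended asm: the function's address is handed to the asm statement as an operand, so the compiler sees the reference at the point of use. Here the asm template is empty and C makes the call through the pointer afterwards, whereas real code would branch to it from inside the assembly. All names are hypothetical.

```c
/* Static function that would otherwise look unreferenced. */
static int demo_target(int x)
{
	return x * 3;
}

int demo_call_through_asm(int x)
{
	int (*fn)(int) = demo_target;

#if defined(__GNUC__)
	/*
	 * The "+r" operand tells the compiler that the asm reads and may
	 * modify fn, so demo_target must exist for as long as this
	 * statement is compiled in.
	 */
	__asm__ volatile("" : "+r"(fn));
#endif
	return fn(x);
}
```

If `demo_call_through_asm` itself is compiled out, nothing references `demo_target` any more and the compiler may drop it, which is the behaviour asked for above.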



Patch

--- linux-2.6.0test4-work/arch/i386/Makefile-o	2003-08-23 13:03:08.000000000 +0200
+++ linux-2.6.0test4-work/arch/i386/Makefile	2003-09-05 02:14:07.000000000 +0200
@@ -26,6 +26,10 @@ 
 # prevent gcc from keeping the stack 16 byte aligned
 CFLAGS += $(call check_gcc,-mpreferred-stack-boundary=2,)
 
+# gcc 3.4/3.3-hammer support -funit-at-a-time mode, but the Kernel is not ready
+# for it yet
+CFLAGS += $(call check_gcc,-fno-unit-at-a-time,)
+
 align := $(subst -functions=0,,$(call check_gcc,-falign-functions=0,-malign-functions=0))
 
 cflags-$(CONFIG_M386)		+= -march=i386