* [RESEND PATCH v5 0/4] kernel hacking: GCC optimization for better debug experience (-Og)
@ 2018-06-05  8:13 ` changbin.du at intel.com
  0 siblings, 0 replies; 69+ messages in thread
From: changbin.du @ 2018-06-05  8:13 UTC
  To: mingo, akpm, yamada.masahiro, michal.lkml
  Cc: rostedt, tglx, rdunlap, x86, linux, lgirdwood, broonie, arnd,
	linux-kbuild, linux-kernel, linux-arch, linux-arm-kernel,
	linux-sparse, changbin.du, Changbin Du

From: Changbin Du <changbin.du@intel.com>

Hi all,
I know some kernel developers have been looking for a way to disable GCC
optimizations, most likely by building with GCC's '-O0' option. But the Linux
kernel relies on GCC optimizations to remove some dead code, so '-O0' simply
breaks the build. These developers need such an option because they want to
debug the kernel with qemu, simics, kgtp or kgdb.

Thanks to the '-Og' optimization level introduced in GCC 4.8, there is now an
option that offers a reasonable level of optimization while maintaining fast
compilation and a good debugging experience. It is similar to '-O1' but
prefers debuggability over runtime speed. With '-Og', and after a few simple
changes, we can build a kernel that is easier to debug with only a small
performance drop.

This series first introduces a new config option, CONFIG_NO_AUTO_INLINE,
preceded by the fix it requires. With this option enabled, only functions
explicitly marked with "inline" will be inlined. This allows the function
tracer to trace more functions, because it can only trace functions that the
compiler has not inlined.

It then introduces a new config option, CC_OPTIMIZE_FOR_DEBUGGING, which
applies the '-Og' optimization level to the whole kernel, together with a
simple fix in fix_to_virt(). So far I have only tested this option on x86 and
ARM. Other platforms should also work, but will probably need some build fixes
like the ones in this series. I leave that to whoever wants to try this debug
option on those platforms.

Comparison of vmlinux size: a bit smaller (text -5.1%, data +4.0%, total -2.2%).

    w/o CONFIG_CC_OPTIMIZE_FOR_DEBUGGING
    $ size vmlinux
       text    data     bss     dec     hex filename
    22665554   9709674  2920908 35296136        21a9388 vmlinux

    w/ CONFIG_CC_OPTIMIZE_FOR_DEBUGGING
    $ size vmlinux
       text    data     bss     dec     hex filename
    21499032   10102758 2920908 34522698        20ec64a vmlinux


Comparison of system performance: a small drop (~6%).
    This kernel compilation benchmark was suggested by Ingo Molnar.
    https://lkml.org/lkml/2018/5/2/74

    Preparation: set the cpufreq governor to 'performance'.
    for ((cpu=0; cpu<120; cpu++)); do
      G=/sys/devices/system/cpu/cpu$cpu/cpufreq/scaling_governor
      [ -f "$G" ] && echo performance > "$G"
    done

    w/o CONFIG_CC_OPTIMIZE_FOR_DEBUGGING
    $ perf stat --repeat 5 --null --pre                 '\
        cp -a kernel ../kernel.copy.$(date +%s);         \
        rm -rf *;                                        \
        git checkout .;                                  \
        echo 1 > /proc/sys/vm/drop_caches;               \
        find ../kernel* -type f | xargs cat >/dev/null;  \
        make -j kernel >/dev/null;                       \
        make clean >/dev/null 2>&1;                      \
        sync                                            '\
                                                         \
        make -j8 >/dev/null

     Performance counter stats for 'make -j8' (5 runs):

        219.764246652 seconds time elapsed                   ( +-  0.78% )

    w/ CONFIG_CC_OPTIMIZE_FOR_DEBUGGING
    $ perf stat --repeat 5 --null --pre                 '\
        cp -a kernel ../kernel.copy.$(date +%s);         \
        rm -rf *;                                        \
        git checkout .;                                  \
        echo 1 > /proc/sys/vm/drop_caches;               \
        find ../kernel* -type f | xargs cat >/dev/null;  \
        make -j kernel >/dev/null;                       \
        make clean >/dev/null 2>&1;                      \
        sync                                            '\
                                                         \
        make -j8 >/dev/null

    Performance counter stats for 'make -j8' (5 runs):

         233.574187771 seconds time elapsed                  ( +-  0.19% )

v5:
  o Swap the last two patches to avoid a build error.
v4:
  o Drop the already merged patch "regulator: add dummy function of_find_regulator_by_node".

Changbin Du (4):
  x86/mm: surround level4_kernel_pgt with #ifdef
    CONFIG_X86_5LEVEL...#endif
  kernel hacking: new config NO_AUTO_INLINE to disable compiler
    auto-inline optimizations
  ARM: mm: fix build error in fix_to_virt with
    CONFIG_CC_OPTIMIZE_FOR_DEBUGGING
  kernel hacking: new config CC_OPTIMIZE_FOR_DEBUGGING to apply GCC -Og
    optimization

 Makefile                          | 10 ++++++++++
 arch/arm/mm/mmu.c                 |  2 +-
 arch/x86/include/asm/pgtable_64.h |  2 ++
 arch/x86/kernel/head64.c          | 13 ++++++-------
 include/linux/compiler-gcc.h      |  2 +-
 include/linux/compiler.h          |  2 +-
 init/Kconfig                      | 19 +++++++++++++++++++
 lib/Kconfig.debug                 | 17 +++++++++++++++++
 8 files changed, 57 insertions(+), 10 deletions(-)

-- 
2.7.4

* [PATCH v5 0/4] kernel hacking: GCC optimization for better debug experience (-Og)
@ 2018-05-11  8:09 changbin.du
  2018-05-11  8:09 ` [PATCH v5 2/4] kernel hacking: new config NO_AUTO_INLINE to disable compiler auto-inline optimizations changbin.du
  0 siblings, 1 reply; 69+ messages in thread
From: changbin.du @ 2018-05-11  8:09 UTC
  To: yamada.masahiro, michal.lkml, tglx, mingo, akpm
  Cc: rostedt, rdunlap, x86, lgirdwood, broonie, arnd, linux-kbuild,
	linux-kernel, linux-arch, Changbin Du

From: Changbin Du <changbin.du@intel.com>


Thread overview: 69+ messages
2018-06-05  8:13 [RESEND PATCH v5 0/4] kernel hacking: GCC optimization for better debug experience (-Og) changbin.du
2018-06-05  8:13 ` changbin.du at intel.com
2018-06-05  8:13 ` [PATCH v5 1/4] x86/mm: surround level4_kernel_pgt with #ifdef CONFIG_X86_5LEVEL...#endif changbin.du
2018-06-05  8:13   ` changbin.du at intel.com
2018-06-05  8:13 ` [PATCH v5 2/4] kernel hacking: new config NO_AUTO_INLINE to disable compiler auto-inline optimizations changbin.du
2018-06-05  8:13   ` changbin.du at intel.com
2018-06-05 21:21   ` kbuild test robot
2018-06-05 21:21     ` kbuild test robot
2018-06-05 21:21     ` kbuild test robot
2018-06-05 21:21     ` kbuild test robot
2018-06-06 13:57     ` Steven Rostedt
2018-06-06 13:57       ` Steven Rostedt
2018-06-06 14:26       ` Johan Hovold
2018-06-06 14:26         ` Johan Hovold
2018-06-06 18:26         ` Steven Rostedt
2018-06-06 18:26           ` Steven Rostedt
2018-06-07  4:17           ` Viresh Kumar
2018-06-07  4:17             ` Viresh Kumar
2018-06-07  7:46             ` Du, Changbin
2018-06-07  7:46               ` Du, Changbin
2018-06-07  8:38               ` Viresh Kumar
2018-06-07  8:38                 ` Viresh Kumar
2018-06-07  9:03                 ` Bernd Petrovitsch
2018-06-07  9:03                   ` Bernd Petrovitsch
2018-06-07  9:03                   ` Bernd Petrovitsch
2018-06-07  9:10                   ` Viresh Kumar
2018-06-07  9:10                     ` Viresh Kumar
2018-06-07  9:18                     ` Johan Hovold
2018-06-07  9:18                       ` Johan Hovold
2018-06-07  9:19                       ` Viresh Kumar
2018-06-07  9:19                         ` Viresh Kumar
2018-06-07 10:12                         ` Alex Elder
2018-06-07 10:12                           ` Alex Elder
2018-06-07 10:27                           ` Johan Hovold
2018-06-07 10:27                             ` Johan Hovold
2018-06-07 10:27                             ` Johan Hovold
2018-06-08 20:03                       ` Steven Rostedt
2018-06-08 20:03                         ` Steven Rostedt
2018-06-11 15:46                         ` Johan Hovold
2018-06-11 15:46                           ` Johan Hovold
2018-06-07  8:06             ` Johan Hovold
2018-06-07  8:06               ` Johan Hovold
2018-06-05 21:34   ` kbuild test robot
2018-06-05 21:34     ` kbuild test robot
2018-06-05 21:34     ` kbuild test robot
2018-06-05 21:34     ` kbuild test robot
2018-06-06 14:01     ` Steven Rostedt
2018-06-06 14:01       ` Steven Rostedt
2018-06-05  8:13 ` [PATCH v5 3/4] ARM: mm: fix build error in fix_to_virt with CONFIG_CC_OPTIMIZE_FOR_DEBUGGING changbin.du
2018-06-05  8:13   ` changbin.du at intel.com
2018-06-05  8:13 ` [PATCH v5 4/4] kernel hacking: new config CC_OPTIMIZE_FOR_DEBUGGING to apply GCC -Og optimization changbin.du
2018-06-05  8:13   ` changbin.du at intel.com
2018-06-10 10:44   ` kbuild test robot
2018-06-10 10:44     ` kbuild test robot
2018-06-10 10:44     ` kbuild test robot
2018-06-10 10:44     ` kbuild test robot
2018-06-10 15:49   ` kbuild test robot
2018-06-10 15:49     ` kbuild test robot
2018-06-10 15:49     ` kbuild test robot
2018-06-10 15:49     ` kbuild test robot
2018-06-11 16:00     ` Steven Rostedt
2018-06-11 16:00       ` Steven Rostedt
  -- strict thread matches above, loose matches on Subject: below --
2018-05-11  8:09 [PATCH v5 0/4] kernel hacking: GCC optimization for better debug experience (-Og) changbin.du
2018-05-11  8:09 ` [PATCH v5 2/4] kernel hacking: new config NO_AUTO_INLINE to disable compiler auto-inline optimizations changbin.du
2018-05-17 15:49   ` kbuild test robot
2018-05-17 15:49     ` kbuild test robot
2018-05-17 15:49     ` kbuild test robot
2018-05-17 17:58   ` kbuild test robot
2018-05-17 17:58     ` kbuild test robot
2018-05-17 17:58     ` kbuild test robot
