linux-kernel.vger.kernel.org archive mirror
From: Willy Tarreau <w@1wt.eu>
To: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"David S. Miller" <davem@davemloft.net>,
	Ard Biesheuvel <ardb@kernel.org>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [PATCH 0000/2297] [ANNOUNCE, RFC] "Fast Kernel Headers" Tree -v1: Eliminate the Linux kernel's "Dependency Hell"
Date: Tue, 4 Jan 2022 13:36:16 +0100
Message-ID: <20220104123616.GA1584@1wt.eu>
In-Reply-To: <YdIfz+LMewetSaEB@gmail.com>

Hi Ingo!

First, great work! I'm particularly interested in this work because I
went through a similar process about 6 months ago in haproxy and saved
40-45% build time, and thought how well the same principles could apply
to the kernel if anyone felt brave enough to take that on. I do
appreciate how tedious such work can be and really sympathise with
you on this! A few comments below:

On Sun, Jan 02, 2022 at 10:57:35PM +0100, Ingo Molnar wrote:
>  - Uninlining: there's a number of unnecessary inline functions that also
>    couple otherwise unrelated headers to each other. The fast-headers tree
>    contains over 100 uninlining commits.
> 
>  - Type & API header decoupling. This is one of the most effective techniques
>    to reduce size - but it can rarely be done in a straightforward fashion,
>    and has to be prepared by various decoupling measures, such as the moving
>    of inline functions or the creation of new headers for less frequently used
>    APIs and types.

These were the two main points I went through as well, and I found them
to be extremely effective. Most of the build time in my case came from
the same inline functions being built hundreds of times for nothing, just
because a header file was included for just one type. I had already
decoupled types and API long ago but that separation didn't hold up for a
few files that were included everywhere. What I noticed is that ideally
we'd need to have 3 layers:
  - types alone
  - function prototypes alone, depending on the former if needed
  - inline functions, depending on the two former ones, if needed

Most code doesn't need the inline functions, especially other headers,
and being able to only cross-include type definitions is extremely helpful.
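The three layers above can be sketched in a single file as follows. This is
only an illustration: the file names (buf-t.h, buf-proto.h, buf.h, buf.c) and
the struct buf API are hypothetical and not taken from haproxy or the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ---- buf-t.h (hypothetical): types alone ---- */
struct buf {
    char  *area;   /* storage */
    size_t size;   /* capacity in bytes */
    size_t data;   /* bytes currently used */
};

/* ---- buf-proto.h (hypothetical): prototypes alone, needs buf-t.h ---- */
size_t buf_append(struct buf *b, const char *src, size_t len);

/* ---- buf.h (hypothetical): inline functions, needs both of the above ---- */
static inline size_t buf_room(const struct buf *b)
{
    return b->size - b->data;
}

/* ---- buf.c (hypothetical): the only file here that needs all three ---- */
size_t buf_append(struct buf *b, const char *src, size_t len)
{
    assert(b->data <= b->size);
    if (len > buf_room(b))
        len = buf_room(b);      /* truncate to the remaining room */
    memcpy(b->area + b->data, src, len);
    b->data += len;
    return len;
}
```

Another header that only embeds a struct buf would include the first layer
and never see buf_room() or its dependencies.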

In my case something that further improved this effectiveness was to use a
lot more incomplete types everywhere possible. There's no reason to include
foo.h just to have a definition of "struct foo" from "bar.h" if you're only
using it as a pointer in "struct bar". Just prepend "struct foo;" before
struct bar and be done with it.
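A minimal sketch of that pattern, with the two headers shown inline in one
file. The field names and the bar_attach() helper are hypothetical and only
there to show that passing the pointer around needs no full definition.

```c
#include <assert.h>
#include <stddef.h>

/* ---- bar.h ---- */
struct foo;                 /* incomplete type: no #include "foo.h" needed */

struct bar {
    struct foo *owner;      /* a pointer only needs the tag, not the layout */
    int refcount;
};

/* Code that merely stores or forwards the pointer compiles fine too. */
static inline void bar_attach(struct bar *b, struct foo *f)
{
    assert(b != NULL);
    b->owner = f;
    b->refcount++;
}

/* ---- foo.h: included only by code that dereferences struct foo ---- */
struct foo {
    int id;
};
```

Only the code that accesses members of struct foo has to include foo.h.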

This showed me how horrible typedefs are: there seems to be no way to
create incomplete definitions for them. So I had to create an even lower
level tiny include file for just the few ones I needed (mostly ints).
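Such a low-level file could look roughly like the sketch below. All names
(tiny-types.h, conn_id_t, tick_t, struct session) are hypothetical. The point
is only that an integer typedef has to name its underlying type, so it lives
in one small guarded file that everything else can include cheaply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ---- tiny-types.h (hypothetical): nothing but a few typedefs ---- */
#ifndef TINY_TYPES_H
#define TINY_TYPES_H
typedef uint32_t conn_id_t;   /* hypothetical integer typedefs */
typedef int64_t  tick_t;
#endif

/* ---- session-t.h (hypothetical): types alone, needs tiny-types.h ---- */
struct session {
    conn_id_t id;
    tick_t    expire;
};

/* ---- session.h (hypothetical): inline helpers ---- */
static inline int session_expired(const struct session *s, tick_t now)
{
    assert(s != NULL);
    return now >= s->expire;
}
```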

I hadn't found a perfect way to deal with macros. Sometimes you consider
them as inline functions and they seem to be better placed there, and
sometimes you figure they are used in type declarations and you have to
have them somewhere else. And when a macro is needed between multiple
type definitions (e.g. an array size), it becomes more delicate because
you quickly realize that a dedicated file for all such settings would
make sense, but it can complicate maintenance.

Another point I didn't feel brave enough to experiment with was to put
guards around the #include directives themselves in order to avoid opening
the files at all. In my case the C files are huge so such savings would
have been small. There are definitely savings to be had there but this looked
too complicated to maintain. And I don't think that #pragma once would be
an effective alternative.
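For reference, guarding at the include site looks roughly like the sketch
below. SEEN_STRING_H is a hypothetical marker macro chosen for this example,
and a standard header is used only so that the block builds on its own. Note
that GCC's preprocessor documents a multiple-include optimization that avoids
re-reading headers fully wrapped in an internal guard, so with such compilers
the gain may be small.

```c
#include <assert.h>

/* External guard: the test happens in the including file, so the
 * preprocessor does not even have to open the header a second time.
 * SEEN_STRING_H is a hypothetical marker used by every include site. */
#ifndef SEEN_STRING_H
#define SEEN_STRING_H
#include <string.h>
#endif

/* A second include site, e.g. reached through another header: the
 * marker is already set, so the whole block is skipped. */
#ifndef SEEN_STRING_H
#define SEEN_STRING_H
#include <string.h>
#error "not reached: the external guard skipped this block"
#endif

static inline size_t name_len(const char *name)
{
    assert(name != NULL);
    return strlen(name);
}
```

The maintenance cost is visible even here: every include site has to repeat
the marker name and keep it consistent.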

>  - For the 'reference' subsystem of the scheduler, I also improved build speed by
>    consolidating .c files into roughly equal size build units. Instead of 20+
>    separate .o's, there's now just 4 .o's being built. Obviously this approach
>    does not scale to the over 30,000 .c files in the kernel, but I wanted to
>    demonstrate it because optimizing at that level brings the next level of build
>    performance, and it might be feasible for a handful of other core kernel subsystems.

I tried this as well to avoid reprocessing the same header
files multiple times but it was too difficult and I gave up. I'd be tempted
to encourage developers to write somewhat fewer but larger files, but these can
also become a maintenance nightmare, they tend to be much slower to build
when too big, and they parallelize less well, so a balance has to be
found, and if the headers hell is better addressed, then this becomes less
important.

I noticed that you measured the number of includes per file. I did the
same by counting the references to the include files in the preprocessed
output, but ultimately found an easier metric: the total preprocessed
size. I simply replaced "-c" with "-E" in my makefile, and ran
"find . -name '*.o' | xargs cat | grep '^[^#]' | wc" to observe the output,
since in the end, that's what is really fed to the compiler. Overall, I
found that metric to be a relatively accurate representation of an
expected build time. It's particularly interesting because it's much
faster to obtain than a full build and can easily show you that some
optimizations have absolutely zero effect (typically because most
includes are guarded and what's not included at some place will be at
another one).
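The same measurement can be sketched in C for a buffer holding preprocessed
output. payload_bytes() is a hypothetical helper written for this example. It
drops only the lines starting with '#' (the linemarkers emitted by "-E") and
counts every other byte, including blank lines.

```c
#include <assert.h>
#include <stddef.h>

/* Count the bytes of preprocessed text that are not on lines starting
 * with '#', i.e. roughly what the compiler proper has to parse. */
static size_t payload_bytes(const char *text)
{
    size_t total = 0;
    int at_line_start = 1;
    int skipping = 0;

    assert(text != NULL);
    for (; *text; text++) {
        if (at_line_start)
            skipping = (*text == '#');   /* decide once per line */
        if (!skipping)
            total++;
        at_line_start = (*text == '\n');
    }
    return total;
}
```

For example, payload_bytes("# 1 \"a.c\"\nint x;\n") counts only the 7 bytes
of "int x;\n". Summing this over all preprocessed units before and after a
header cleanup gives a quick estimate without running a full build.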

In my project I noticed that the total preprocessed size was initially
around 50-60 times larger than the total size of the C+H files. After optimizing it
went down to around 20 times, which is roughly in line with the build
time savings.

Just my two cents, kudos for working on this!
Willy
