From: Ingo Molnar <mingo@kernel.org>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Peter Zijlstra <peterz@infradead.org>,
Thomas Gleixner <tglx@linutronix.de>,
"David S. Miller" <davem@davemloft.net>,
Ard Biesheuvel <ardb@kernel.org>,
Josh Poimboeuf <jpoimboe@redhat.com>,
Jonathan Corbet <corbet@lwn.net>,
Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [PATCH 0000/2297] [ANNOUNCE, RFC] "Fast Kernel Headers" Tree -v1: Eliminate the Linux kernel's "Dependency Hell"
Date: Mon, 3 Jan 2022 12:12:50 +0100
Message-ID: <YdLaMvaM9vq4W6f1@gmail.com>
In-Reply-To: <YdLL0kaFhm6rp9NS@kroah.com>
* Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> > Before going into details about how this tree solves 'dependency hell'
> > exactly, here's the current kernel build performance gain with
> > CONFIG_FAST_HEADERS=y enabled, (and with CONFIG_KALLSYMS_FAST=y enabled as
> > well - see below), using a stock x86 Linux distribution's .config with all
> > modules built into the vmlinux:
> >
> > #
> > # Performance counter stats for 'make -j96 vmlinux' (3 runs):
> > #
> > # (Elapsed time in seconds):
> > #
> >
> > v5.16-rc7: 231.34 +- 0.60 secs, 15.5 builds/hour # [ vanilla baseline ]
> > -fast-headers-v1: 129.97 +- 0.51 secs, 27.7 builds/hour # +78.0% improvement
> >
> > Or in terms of CPU time utilized:
> >
> > v5.16-rc7: 11,474,982.05 msec cpu-clock # 49.601 CPUs utilized
> > -fast-headers-v1: 7,100,730.37 msec cpu-clock # 54.635 CPUs utilized # +61.6% improvement
>
> Speed up is very impressive, nice job!
Thanks! :-)
> > Techniques used by the fast-headers tree to reduce header size & dependencies:
> >
> > - Aggressive decoupling of high level headers from each other, starting
> > with <linux/sched.h>. Since 'struct task_struct' is a union of many
> > subsystems, there's a new "per_task" infrastructure modeled after the
> > per_cpu framework, which creates fields in task_struct without having
> > to modify sched.h or the 'struct task_struct' type:
> >
> > DECLARE_PER_TASK(type, name);
> > ...
> > per_task(current, name) = val;
> >
> > The per_task() facility then seamlessly creates an offset into the
> > task_struct->per_task_area[] array, and uses the asm-offsets.h
> > mechanism to create offsets into it early in the build.
> >
> > There's no runtime overhead disadvantage from using per_task() framework,
> > the generated code is functionally equivalent to types embedded in
> > task_struct.
>
> This is "interesting", but how are you going to keep the
> kernel/sched/per_task_area_struct_defs.h and struct task_struct_per_task
> definition in sync?
I have plans to clean this up further - see below - but in general I'd
*discourage* the embedding of new complex types into task_struct.
In practice, most new task_struct fields are either simple types or
pointers to structs, which can be added to task_struct without having to
define a complex type for <linux/sched.h>.
For example here's the list of the last 5 extensions of task_struct, since
November 2020 - I copy & pasted them out of 'git log -p include/linux/sched.h':

  + unsigned in_eventfd_signal:1;
  + cpumask_t *user_cpus_ptr;
  + unsigned int saved_state;
  + unsigned long saved_state_change;
  + struct bpf_run_ctx *bpf_ctx;

All of those new fields are either simple C types or struct pointers - none
of those extensions need per_task() handling per se.
The overall policy to extend task_struct, going forward, would be to:
- Either make simple-type or struct-pointer additions to task_struct, that
don't couple <linux/sched.h> to other subsystems.
- Or, if you absolutely must - and we don't want to forbid this - use the
per_task() machinery to create a simple accessor to a complex embedded
type.
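
The two options above can be sketched in plain user-space C. This is only an
illustration under stated assumptions, not the code from the fast-headers
tree: every demo_* / DEMO_* name is hypothetical, the offsets come from
offsetof() instead of the asm-offsets.h mechanism, and the task is
heap-allocated so that the typed accesses stay well-defined in ISO C (the
kernel itself builds with -fno-strict-aliasing).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical complex type owned by some subsystem. */
struct demo_fs_state {
	int  depth;
	long flags;
};

/* Layout template: the only place that needs the full type definitions. */
struct demo_per_task_layout {
	struct demo_fs_state fs_state;
	unsigned long        counter;
};

/* Stand-ins for the offsets a real build would generate early on. */
enum {
	DEMO_OFFSET_fs_state = offsetof(struct demo_per_task_layout, fs_state),
	DEMO_OFFSET_counter  = offsetof(struct demo_per_task_layout, counter),
	DEMO_PER_TASK_BYTES  = sizeof(struct demo_per_task_layout),
};

struct demo_bpf_ctx;	/* opaque: a forward declaration is enough for a pointer */

/* The task itself only contains simple types, pointers and raw bytes. */
struct demo_task {
	unsigned int         saved_state;	/* option 1: simple C type */
	struct demo_bpf_ctx *bpf_ctx;		/* option 1: struct pointer */
	_Alignas(max_align_t) unsigned char per_task_area[DEMO_PER_TASK_BYTES];
};

/* Ties a type to a field name without touching struct demo_task. */
#define DEMO_DECLARE_PER_TASK(type, name) typedef type demo_per_task_type_##name

/* Option 2: yields an lvalue at the field's offset inside per_task_area[]. */
#define demo_per_task(t, name) \
	(*(demo_per_task_type_##name *)((t)->per_task_area + DEMO_OFFSET_##name))

DEMO_DECLARE_PER_TASK(struct demo_fs_state, fs_state);
DEMO_DECLARE_PER_TASK(unsigned long, counter);

/* Exercises both kinds of fields; returns 0 on success, -1 if allocation fails. */
int demo_policy(void)
{
	struct demo_task *t = calloc(1, sizeof(*t));

	if (!t)
		return -1;

	t->saved_state = 1;
	demo_per_task(t, fs_state).depth = 3;
	demo_per_task(t, counter) += 2;

	int ok = t->bpf_ctx == NULL &&
		 demo_per_task(t, fs_state).depth == 3 &&
		 demo_per_task(t, counter) == 2;

	free(t);
	return ok ? 0 : 1;
}
```

In this sketch the accessor resolves to a constant offset from the task
pointer, so it compiles to the same kind of load/store as a directly
embedded member would.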
> [...] It seems that you manually created this (which is great for
> testing), but over the long-term, trying to manually determine what needs
> to be done here to keep everything lined up properly is going to be a
> major pain.
Note that under the policy above - and even according to the practice of
the last ~1.5 years - it should be exceedingly rare to have to extend the
per_task() facility.
There's one ugly thing about it: the fixed PER_TASK_BYTES limit. I plan to
make ->per_task_area[] the last field of task_struct, i.e. change it to:

	u8 per_task_area[];
This actually became possible through the fixing of the x86 FPU code in the
following fast-headers commit:
4ae0f28bc1c8 headers/deps: x86/fpu: Make task_struct::thread constant size
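
As a rough user-space illustration of why the array has to be the last field:
a C99 flexible array member has no size of its own, so the storage behind it
is chosen when the object is allocated, and nothing can follow it in the
struct. The demo_* names, the u8 typedef and the 64-byte size below are
assumptions made for this sketch; how the kernel would actually size and
allocate the area is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef unsigned char u8;	/* local stand-in for the kernel's u8 */

struct demo_task {
	int pid;
	/* Must stay the last member: its size is chosen at allocation time. */
	_Alignas(max_align_t) u8 per_task_area[];
};

/* Allocates a task with per_task_bytes of zeroed trailing storage. */
struct demo_task *demo_task_alloc(size_t per_task_bytes)
{
	return calloc(1, sizeof(struct demo_task) + per_task_bytes);
}

/* Touches the first and last byte of the area; returns 0 on success. */
int demo_flexible_area(void)
{
	size_t bytes = 64;	/* arbitrary demo size, not a value from the tree */
	struct demo_task *t = demo_task_alloc(bytes);

	if (!t)
		return -1;

	t->per_task_area[0] = 0xaa;
	t->per_task_area[bytes - 1] = 0x55;

	int ok = t->per_task_area[0] == 0xaa &&
		 t->per_task_area[bytes - 1] == 0x55;

	free(t);
	return ok ? 0 : 1;
}
```

With this shape there is no compile-time byte limit left in the struct
definition; the limit moves to whoever computes the allocation size.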
In the ~1 year that the per_task() facility has existed I haven't had any
maintenance troubles with these fields getting out of sync, but we could
also auto-generate kernel/sched/per_task_area_struct_defs.h from
kernel/sched/per_task_area_struct.h via a build-time script, and make
kernel/sched/per_task_area_struct.h the only place to define such fields.
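
One way to picture the "single source of truth" idea is a preprocessor
X-macro list instead of a build-time script. Note that this swaps in a
different technique from the script mentioned above, purely for illustration:
the field list is written once and expanded twice, so the struct members and
their offset constants cannot drift apart. All DEMO_* / demo_* names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical complex type owned by some subsystem. */
struct demo_fs_state {
	int  depth;
	long flags;
};

/* The single template: one X(type, name) entry per field. */
#define DEMO_PER_TASK_FIELDS(X)			\
	X(struct demo_fs_state, fs_state)	\
	X(unsigned long,        counter)

/* Expansion 1: the members of the layout struct. */
#define DEMO_AS_MEMBER(type, name) type name;
struct demo_per_task_layout {
	DEMO_PER_TASK_FIELDS(DEMO_AS_MEMBER)
};

/* Expansion 2: one offset constant per field, plus the total size. */
#define DEMO_AS_OFFSET(type, name) \
	DEMO_OFFSET_##name = offsetof(struct demo_per_task_layout, name),
enum {
	DEMO_PER_TASK_FIELDS(DEMO_AS_OFFSET)
	DEMO_PER_TASK_BYTES = sizeof(struct demo_per_task_layout)
};

/* Both expansions come from the same list, so this always holds. */
_Static_assert(DEMO_OFFSET_counter >= sizeof(struct demo_fs_state),
	       "counter must be laid out after fs_state");
```

Adding a field then means adding one X(...) line; the struct and the offsets
follow automatically. A generated header would give the same guarantee
without the macro indirection.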
> That issue aside, I took a glance at the tree, and overall it looks like
> a lot of nice cleanups. Most of these can probably go through the
> various subsystem trees, after you split them out, for the "major" .h
> cleanups. Is that something you are going to be planning on doing?
Yeah, I absolutely plan on doing that too:
- About 70% of the commits can be split up & parallelized through
maintainer trees.
- The exception is the untangling of sched.h, per_task and the
"Optimize Headers" series, where a lot of patches are dependent on each
other. These are actually needed to get any measurable benefits from this
tree (!). We can do these through the scheduler tree, or through the
dedicated headers tree I posted.
The latter monolithic series is pretty much unavoidable: it's the result of
30 years of coupling a lot of kernel subsystems to task_struct via embedded
structs & other complex types, which needed quite a bit of effort to
untangle, and that untangling needed to happen in-order.
Do these plans sound good to you?
Thanks,
Ingo