* [PATCH/RFC 0/2] background initramfs unpacking, and CONFIG_MODPROBE_PATH @ 2021-02-24 14:29 Rasmus Villemoes 2021-02-24 14:29 ` [PATCH/RFC 1/2] init/initramfs.c: allow asynchronous unpacking Rasmus Villemoes ` (3 more replies) 0 siblings, 4 replies; 21+ messages in thread From: Rasmus Villemoes @ 2021-02-24 14:29 UTC (permalink / raw) To: linux-kernel Cc: Luis Chamberlain, Greg Kroah-Hartman, Jonathan Corbet, linux-doc, Nick Desaulniers, Linus Torvalds, Peter Zijlstra, Rasmus Villemoes These two patches are independent, but better-together. The second is a rather trivial patch that simply allows the developer to change "/sbin/modprobe" to something else - e.g. the empty string, so that all request_module() during early boot return -ENOENT early, without even spawning a usermode helper. The first patch allows delegating decompressing the initramfs to a worker thread, allowing do_initcalls() in main.c to proceed to the device_ and late_ initcalls without waiting for that decompression (and populating of rootfs) to finish. Obviously, some of those later calls may rely on the initramfs being available, so I've added synchronization points in the firmware loader and usermodehelper paths - there might be other places that would need this. There's not much to win if most of the functionality needed during boot is only available as modules. But systems with a custom-made .config and initramfs can boot faster, partly due to utilizing more than one cpu earlier, partly by avoiding known-futile modprobe calls (which would still trigger synchronization with the initramfs unpacking, thus eliminating most of the first benefit). Rasmus Villemoes (2): init/initramfs.c: allow asynchronous unpacking modules: add CONFIG_MODPROBE_PATH .../admin-guide/kernel-parameters.txt | 12 +++++ drivers/base/firmware_loader/main.c | 2 + include/linux/initrd.h | 7 +++ init/Kconfig | 12 +++++ init/initramfs.c | 51 ++++++++++++++++++- init/main.c | 1 + kernel/kmod.c | 2 +- kernel/umh.c | 2 + usr/Kconfig | 10 ++++ 9 files changed, 97 insertions(+), 2 deletions(-) -- 2.29.2 ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH/RFC 1/2] init/initramfs.c: allow asynchronous unpacking 2021-02-24 14:29 [PATCH/RFC 0/2] background initramfs unpacking, and CONFIG_MODPROBE_PATH Rasmus Villemoes @ 2021-02-24 14:29 ` Rasmus Villemoes 2021-02-24 17:17 ` Linus Torvalds 2021-03-02 16:26 ` Luis Chamberlain 2021-02-24 14:29 ` [PATCH/RFC 2/2] modules: add CONFIG_MODPROBE_PATH Rasmus Villemoes ` (2 subsequent siblings) 3 siblings, 2 replies; 21+ messages in thread From: Rasmus Villemoes @ 2021-02-24 14:29 UTC (permalink / raw) To: linux-kernel Cc: Luis Chamberlain, Greg Kroah-Hartman, Jonathan Corbet, linux-doc, Nick Desaulniers, Linus Torvalds, Peter Zijlstra, Rasmus Villemoes This is primarily motivated by an embedded ppc target, where unpacking even the rather modest sized initramfs takes 0.6 seconds, which is long enough that the external watchdog becomes unhappy that it doesn't get enough attention soon enough. But normal desktops might benefit as well. On my mostly stock Ubuntu kernel, my initramfs is a 26M xz-compressed blob, decompressing to around 126M. That takes almost two seconds. So add an initramfs_async= kernel parameter, allowing the main init process to proceed to handling device_initcall()s without waiting for populate_rootfs() to finish. Should one of those initcalls need something from the initramfs (say, a kernel module or a firmware blob), it will simply wait for the initramfs unpacking to be done before proceeding, which should in theory make this completely safe to always enable. But if some driver pokes around in the filesystem directly and not via one of the official kernel interfaces (i.e. request_firmware*(), call_usermodehelper*) that theory may not hold - also, I certainly might have missed a spot when sprinkling wait_for_initramfs(). Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> --- .../admin-guide/kernel-parameters.txt | 12 +++++ drivers/base/firmware_loader/main.c | 2 + include/linux/initrd.h | 7 +++ init/initramfs.c | 51 ++++++++++++++++++- init/main.c | 1 + kernel/umh.c | 2 + usr/Kconfig | 10 ++++ 7 files changed, 84 insertions(+), 1 deletion(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 0ac883777318..e9aca86d429b 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -1820,6 +1820,18 @@ initcall functions. Useful for debugging built-in modules and initcalls. + initramfs_async= [KNL] Normally, the initramfs image is + unpacked synchronously, before most devices + are initialized. When the initramfs is huge, + or on slow CPUs, this can take a significant + amount of time. Setting this option means the + unpacking is instead done in a background + thread, allowing the main init process to + begin calling device_initcall()s while the + initramfs is being unpacked. + Format: <bool> + Default set by CONFIG_INITRAMFS_ASYNC. + initrd= [BOOT] Specify the location of the initial ramdisk initrdmem= [KNL] Specify a physical address and size from which to diff --git a/drivers/base/firmware_loader/main.c b/drivers/base/firmware_loader/main.c index 78355095e00d..4fdb8219cd08 100644 --- a/drivers/base/firmware_loader/main.c +++ b/drivers/base/firmware_loader/main.c @@ -15,6 +15,7 @@ #include <linux/kernel_read_file.h> #include <linux/module.h> #include <linux/init.h> +#include <linux/initrd.h> #include <linux/timer.h> #include <linux/vmalloc.h> #include <linux/interrupt.h> @@ -504,6 +505,7 @@ fw_get_filesystem_firmware(struct device *device, struct fw_priv *fw_priv, if (!path) return -ENOMEM; + wait_for_initramfs(); for (i = 0; i < ARRAY_SIZE(fw_path); i++) { size_t file_size = 0; size_t *file_size_ptr = NULL; diff --git a/include/linux/initrd.h b/include/linux/initrd.h index 8db6f8c8030b..7c915a7b7e26 100644 --- a/include/linux/initrd.h +++ b/include/linux/initrd.h @@ -24,3 +24,10 @@ extern char __initramfs_start[]; extern unsigned long __initramfs_size; void console_on_rootfs(void); + +#ifdef CONFIG_BLK_DEV_INITRD +extern void _wait_for_initramfs(const char *caller); +#define wait_for_initramfs() _wait_for_initramfs(__func__) +#else +static inline void wait_for_initramfs(void) { } +#endif diff --git a/init/initramfs.c b/init/initramfs.c index 55b74d7e5260..3a59fd323314 100644 --- a/init/initramfs.c +++ b/init/initramfs.c @@ -530,6 +530,43 @@ static int __init keepinitrd_setup(char *__unused) __setup("keepinitrd", keepinitrd_setup); #endif +static bool __initdata initramfs_async = CONFIG_INITRAMFS_ASYNC; +static int __init initramfs_async_setup(char *str) +{ + strtobool(str, &initramfs_async); + return 1; +} +__setup("initramfs_async=", initramfs_async_setup); + +static __initdata struct work_struct initramfs_wrk; +static DECLARE_COMPLETION(initramfs_done); +static bool initramfs_unpack_started; + +void _wait_for_initramfs(const char *caller) +{ + unsigned long start; + + if (try_wait_for_completion(&initramfs_done)) + return; + if (!initramfs_unpack_started) { + /* + * Something before rootfs_initcall wants to access + * the filesystem/initramfs. Probably a bug. Make a + * note, avoid deadlocking the machine, and let the + * caller's access fail as it used to. + */ + pr_warn_once("wait_for_initramfs() called by %s" + " before rootfs_initcalls\n", caller); + return; + } + + start = jiffies; + wait_for_completion(&initramfs_done); + pr_info("%s() waited %lu jiffies for initramfs_done\n", + caller, jiffies - start); +} +EXPORT_SYMBOL_GPL(_wait_for_initramfs); + extern char __initramfs_start[]; extern unsigned long __initramfs_size; #include <linux/initrd.h> @@ -602,7 +639,7 @@ static void __init populate_initrd_image(char *err) } #endif /* CONFIG_BLK_DEV_RAM */ -static int __init populate_rootfs(void) +static void __init do_populate_rootfs(struct work_struct *w) { /* Load the built in initramfs */ char *err = unpack_to_rootfs(__initramfs_start, __initramfs_size); @@ -637,6 +674,18 @@ static int __init populate_rootfs(void) initrd_end = 0; flush_delayed_fput(); + complete_all(&initramfs_done); +} + +static int __init populate_rootfs(void) +{ + initramfs_unpack_started = true; + if (initramfs_async) { + INIT_WORK(&initramfs_wrk, do_populate_rootfs); + schedule_work(&initramfs_wrk); + } else { + do_populate_rootfs(NULL); + } return 0; } rootfs_initcall(populate_rootfs); diff --git a/init/main.c b/init/main.c index a626e78dbf06..fecdede7b85c 100644 --- a/init/main.c +++ b/init/main.c @@ -1534,6 +1534,7 @@ static noinline void __init kernel_init_freeable(void) kunit_run_all_tests(); + wait_for_initramfs(); console_on_rootfs(); /* diff --git a/kernel/umh.c b/kernel/umh.c index 3f646613a9d3..61f6b82c354b 100644 --- a/kernel/umh.c +++ b/kernel/umh.c @@ -27,6 +27,7 @@ #include <linux/ptrace.h> #include <linux/async.h> #include <linux/uaccess.h> +#include <linux/initrd.h> #include <trace/events/module.h> @@ -107,6 +108,7 @@ static int call_usermodehelper_exec_async(void *data) commit_creds(new); + wait_for_initramfs(); retval = kernel_execve(sub_info->path, (const char *const *)sub_info->argv, (const char *const *)sub_info->envp); diff --git a/usr/Kconfig b/usr/Kconfig index 2599bc21c1b2..56bb250458e4 100644 --- a/usr/Kconfig +++ b/usr/Kconfig @@ -32,6 +32,16 @@ config INITRAMFS_FORCE and is useful if you cannot or don't want to change the image your bootloader passes to the kernel. +config INITRAMFS_ASYNC + bool "Unpack initramfs asynchronously" + help + This option sets the default value of the initramfs_async= + command line parameter. If that parameter is set, unpacking + of initramfs (both the builtin and one passed from a + bootloader) is done asynchronously. See + <file:Documentation/admin-guide/kernel-parameters.txt> for + details. + config INITRAMFS_ROOT_UID int "User ID to map to 0 (user root)" depends on INITRAMFS_SOURCE!="" -- 2.29.2 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH/RFC 1/2] init/initramfs.c: allow asynchronous unpacking 2021-02-24 14:29 ` [PATCH/RFC 1/2] init/initramfs.c: allow asynchronous unpacking Rasmus Villemoes @ 2021-02-24 17:17 ` Linus Torvalds 2021-02-24 22:13 ` Rasmus Villemoes 2021-03-02 16:26 ` Luis Chamberlain 1 sibling, 1 reply; 21+ messages in thread From: Linus Torvalds @ 2021-02-24 17:17 UTC (permalink / raw) To: Rasmus Villemoes Cc: Linux Kernel Mailing List, Luis Chamberlain, Greg Kroah-Hartman, Jonathan Corbet, open list:DOCUMENTATION, Nick Desaulniers, Peter Zijlstra On Wed, Feb 24, 2021 at 6:29 AM Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote: > > So add an initramfs_async= kernel parameter, allowing the main init > process to proceed to handling device_initcall()s without waiting for > populate_rootfs() to finish. Hmm. This is why we have the whole "async_schedule()" thing (mostly used for things like disk spin-up etc). Is there some reason you didn't use that infrastructure? Linus ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH/RFC 1/2] init/initramfs.c: allow asynchronous unpacking 2021-02-24 17:17 ` Linus Torvalds @ 2021-02-24 22:13 ` Rasmus Villemoes 0 siblings, 0 replies; 21+ messages in thread From: Rasmus Villemoes @ 2021-02-24 22:13 UTC (permalink / raw) To: Linus Torvalds Cc: Linux Kernel Mailing List, Luis Chamberlain, Greg Kroah-Hartman, Jonathan Corbet, open list:DOCUMENTATION, Nick Desaulniers, Peter Zijlstra On 24/02/2021 18.17, Linus Torvalds wrote: > On Wed, Feb 24, 2021 at 6:29 AM Rasmus Villemoes > <linux@rasmusvillemoes.dk> wrote: >> >> So add an initramfs_async= kernel parameter, allowing the main init >> process to proceed to handling device_initcall()s without waiting for >> populate_rootfs() to finish. > > Hmm. This is why we have the whole "async_schedule()" thing (mostly > used for things like disk spin-up etc). Is there some reason you > didn't use that infrastructure? Mostly because I completely forgot it existed, it's not an API you stumble upon in every other source file. I guess I could use that, but it would look very much like what I have now - there'd still be some function to call to make sure the initramfs is ready, only that would then do async_synchronize() instead of wait_for_completion(). Is there some fundamental reason something like this shouldn't be doable? Are there places other than the usermodehelper and firmware loading (and obviously right-before-opening /dev/console and exec'ing /init) that would need to be taught about this? Rasmus ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH/RFC 1/2] init/initramfs.c: allow asynchronous unpacking 2021-02-24 14:29 ` [PATCH/RFC 1/2] init/initramfs.c: allow asynchronous unpacking Rasmus Villemoes 2021-02-24 17:17 ` Linus Torvalds @ 2021-03-02 16:26 ` Luis Chamberlain 1 sibling, 0 replies; 21+ messages in thread From: Luis Chamberlain @ 2021-03-02 16:26 UTC (permalink / raw) To: Rasmus Villemoes, Borislav Petkov, Jessica Yu, Takashi Iwai Cc: linux-kernel, Greg Kroah-Hartman, Jonathan Corbet, linux-doc, Nick Desaulniers, Linus Torvalds, Peter Zijlstra On Wed, Feb 24, 2021 at 03:29:08PM +0100, Rasmus Villemoes wrote: > This is primarily motivated by an embedded ppc target, where unpacking > even the rather modest sized initramfs takes 0.6 seconds, which is > long enough that the external watchdog becomes unhappy that it doesn't > get enough attention soon enough. > > But normal desktops might benefit as well. On my mostly stock Ubuntu > kernel, my initramfs is a 26M xz-compressed blob, decompressing to > around 126M. That takes almost two seconds. > > So add an initramfs_async= kernel parameter, allowing the main init > process to proceed to handling device_initcall()s without waiting for > populate_rootfs() to finish. > > Should one of those initcalls need something from the initramfs (say, > a kernel module or a firmware blob), it will simply wait for the > initramfs unpacking to be done before proceeding, which should in > theory make this completely safe to always enable. But if some driver > pokes around in the filesystem directly and not via one of the > official kernel interfaces (i.e. request_firmware*(), > call_usermodehelper*) that theory may not hold - also, I certainly > might have missed a spot when sprinkling wait_for_initramfs(). > > Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> > --- > .../admin-guide/kernel-parameters.txt | 12 +++++ > drivers/base/firmware_loader/main.c | 2 + > include/linux/initrd.h | 7 +++ > init/initramfs.c | 51 ++++++++++++++++++- > init/main.c | 1 + > kernel/umh.c | 2 + > usr/Kconfig | 10 ++++ > 7 files changed, 84 insertions(+), 1 deletion(-) > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > index 0ac883777318..e9aca86d429b 100644 > --- a/Documentation/admin-guide/kernel-parameters.txt > +++ b/Documentation/admin-guide/kernel-parameters.txt > @@ -1820,6 +1820,18 @@ > initcall functions. Useful for debugging built-in > modules and initcalls. > > + initramfs_async= [KNL] Normally, the initramfs image is > + unpacked synchronously, before most devices > + are initialized. When the initramfs is huge, > + or on slow CPUs, this can take a significant > + amount of time. Setting this option means the > + unpacking is instead done in a background > + thread, allowing the main init process to > + begin calling device_initcall()s while the > + initramfs is being unpacked. > + Format: <bool> > + Default set by CONFIG_INITRAMFS_ASYNC. > + > initrd= [BOOT] Specify the location of the initial ramdisk > > initrdmem= [KNL] Specify a physical address and size from which to > diff --git a/drivers/base/firmware_loader/main.c b/drivers/base/firmware_loader/main.c > index 78355095e00d..4fdb8219cd08 100644 > --- a/drivers/base/firmware_loader/main.c > +++ b/drivers/base/firmware_loader/main.c > @@ -15,6 +15,7 @@ > #include <linux/kernel_read_file.h> > #include <linux/module.h> > #include <linux/init.h> > +#include <linux/initrd.h> > #include <linux/timer.h> > #include <linux/vmalloc.h> > #include <linux/interrupt.h> > @@ -504,6 +505,7 @@ fw_get_filesystem_firmware(struct device *device, struct fw_priv *fw_priv, > if (!path) > return -ENOMEM; > > + wait_for_initramfs(); Some folks might want this to not wait, say for folks who use built-in firmware, but for such use cases a new API which *purposely* only look for builtin-firmware would resolve that. The only case I think think of that folks may explicitly want this today is in arch/x86/kernel/cpu/microcode/, see get_builtin_firmware() calls, those should use a proper API, not a hack-in solution like that. I think Boris was working on this long ago, but he's as usual busy. But since this use the builtin stuff directly it is not affected. And even if it was affected by this delay, it would have been before. Other than what Linus pointed out, I see no reason why folks could experiment with this, in fact I welcome it. Luis ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH/RFC 2/2] modules: add CONFIG_MODPROBE_PATH 2021-02-24 14:29 [PATCH/RFC 0/2] background initramfs unpacking, and CONFIG_MODPROBE_PATH Rasmus Villemoes 2021-02-24 14:29 ` [PATCH/RFC 1/2] init/initramfs.c: allow asynchronous unpacking Rasmus Villemoes @ 2021-02-24 14:29 ` Rasmus Villemoes 2021-02-24 22:19 ` [PATCH/RFC 0/2] background initramfs unpacking, and CONFIG_MODPROBE_PATH Rasmus Villemoes 2021-03-09 21:16 ` [PATCH v2 " Rasmus Villemoes 3 siblings, 0 replies; 21+ messages in thread From: Rasmus Villemoes @ 2021-02-24 14:29 UTC (permalink / raw) To: linux-kernel Cc: Luis Chamberlain, Greg Kroah-Hartman, Jonathan Corbet, linux-doc, Nick Desaulniers, Linus Torvalds, Peter Zijlstra, Rasmus Villemoes Allow the developer to specifiy the initial value of the modprobe_path[] string. This can be used to set it to the empty string initially, thus effectively disabling request_module() during early boot until userspace writes a new value via the /proc/sys/kernel/modprobe interface. [1] When building a custom kernel (often for an embedded target), it's normal to build everything into the kernel that is needed for booting, and indeed the initramfs often contains no modules at all, so every such request_module() done before userspace init has mounted the real rootfs is a waste of time. This is particularly useful when combined with the previous patch, which allowed the initramfs to be unpacked asynchronously - for that to work, it had to make any usermodehelper call wait for the unpacking to finish before attempting to invoke the userspace helper. By eliminating all such (known-to-be-futile) calls of usermodehelper, the initramfs unpacking and the {device,late}_initcalls can proceed in parallel for much longer. [1] __request_module() already has an early -ENOENT return when modprobe_path is the empty string. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> --- init/Kconfig | 12 ++++++++++++ kernel/kmod.c | 2 +- 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/init/Kconfig b/init/Kconfig index ba8bd5256980..e3f61e15445d 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -2272,6 +2272,18 @@ config MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS If unsure, say N. +config MODPROBE_PATH + string "Path to modprobe binary" + default "/sbin/modprobe" + help + When kernel code requests a module, it does so by calling + the "modprobe" userspace utility. This option allows you to + set the path where that binary is found. This can be changed + at runtime via the sysctl file + /proc/sys/kernel/modprobe. Setting this to the empty string + removes the kernel's ability to request modules (but + userspace can still load modules explicitly). + config TRIM_UNUSED_KSYMS bool "Trim unused exported kernel symbols" depends on BROKEN diff --git a/kernel/kmod.c b/kernel/kmod.c index 3cd075ce2a1e..b717134ebe17 100644 --- a/kernel/kmod.c +++ b/kernel/kmod.c @@ -58,7 +58,7 @@ static DECLARE_WAIT_QUEUE_HEAD(kmod_wq); /* modprobe_path is set via /proc/sys. */ -char modprobe_path[KMOD_PATH_LEN] = "/sbin/modprobe"; +char modprobe_path[KMOD_PATH_LEN] = CONFIG_MODPROBE_PATH; static void free_modprobe_argv(struct subprocess_info *info) { -- 2.29.2 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH/RFC 0/2] background initramfs unpacking, and CONFIG_MODPROBE_PATH 2021-02-24 14:29 [PATCH/RFC 0/2] background initramfs unpacking, and CONFIG_MODPROBE_PATH Rasmus Villemoes 2021-02-24 14:29 ` [PATCH/RFC 1/2] init/initramfs.c: allow asynchronous unpacking Rasmus Villemoes 2021-02-24 14:29 ` [PATCH/RFC 2/2] modules: add CONFIG_MODPROBE_PATH Rasmus Villemoes @ 2021-02-24 22:19 ` Rasmus Villemoes 2021-03-09 21:16 ` [PATCH v2 " Rasmus Villemoes 3 siblings, 0 replies; 21+ messages in thread From: Rasmus Villemoes @ 2021-02-24 22:19 UTC (permalink / raw) To: linux-kernel Cc: Luis Chamberlain, Greg Kroah-Hartman, Jonathan Corbet, linux-doc, Nick Desaulniers, Linus Torvalds, Peter Zijlstra On 24/02/2021 15.29, Rasmus Villemoes wrote: > These two patches are independent, but better-together. kernel test robot says I'll have to add a 0/2 patch: linux/initrd.h: grow an include guard [because defining a macro a second time with the same contents is ok, so is doing another extern declaration, but a trivial static inline definition can't be repeated sigh] ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 0/2] background initramfs unpacking, and CONFIG_MODPROBE_PATH 2021-02-24 14:29 [PATCH/RFC 0/2] background initramfs unpacking, and CONFIG_MODPROBE_PATH Rasmus Villemoes ` (2 preceding siblings ...) 2021-02-24 22:19 ` [PATCH/RFC 0/2] background initramfs unpacking, and CONFIG_MODPROBE_PATH Rasmus Villemoes @ 2021-03-09 21:16 ` Rasmus Villemoes 2021-03-09 21:16 ` [PATCH v2 1/2] init/initramfs.c: allow asynchronous unpacking Rasmus Villemoes 2021-03-09 21:17 ` [PATCH v2 2/2] modules: add CONFIG_MODPROBE_PATH Rasmus Villemoes 3 siblings, 2 replies; 21+ messages in thread From: Rasmus Villemoes @ 2021-03-09 21:16 UTC (permalink / raw) To: Luis Chamberlain, Linus Torvalds, Andrew Morton Cc: Jessica Yu, Borislav Petkov, Jonathan Corbet, Greg Kroah-Hartman, linux-kernel, Nick Desaulniers, Rasmus Villemoes These two patches are independent, but better-together. The second is a rather trivial patch that simply allows the developer to change "/sbin/modprobe" to something else - e.g. the empty string, so that all request_module() during early boot return -ENOENT early, without even spawning a usermode helper. The first patch allows delegating decompressing the initramfs to a worker thread, allowing do_initcalls() in main.c to proceed to the device_ and late_ initcalls without waiting for that decompression (and populating of rootfs) to finish. Obviously, some of those later calls may rely on the initramfs being available, so I've added synchronization points in the firmware loader and usermodehelper paths - there might be other places that would need this. There's not much to win if most of the functionality needed during boot is only available as modules. But systems with a custom-made .config and initramfs can boot faster, partly due to utilizing more than one cpu earlier, partly by avoiding known-futile modprobe calls (which would still trigger synchronization with the initramfs unpacking, thus eliminating most of the first benefit). Routing-wise, I hope akpm can handle both patches. Andrew, Luis? Changes in v2: - Rebase on master, piggy-backing on the include guard and #ifdef CONFIG_BLK_DEV_INITRD added to initrd.h in fade5cad93 and c72160fe05. - Use existing async_* API instead of wait_for_completion/complete_all - Drop debug leftovers from wait_for_initramfs(). - Fix initialization of initramfs_async variable. Rasmus Villemoes (2): init/initramfs.c: allow asynchronous unpacking modules: add CONFIG_MODPROBE_PATH .../admin-guide/kernel-parameters.txt | 12 ++++++ drivers/base/firmware_loader/main.c | 2 + include/linux/initrd.h | 2 + init/Kconfig | 12 ++++++ init/initramfs.c | 41 ++++++++++++++++++- init/main.c | 1 + kernel/kmod.c | 2 +- kernel/umh.c | 2 + usr/Kconfig | 10 +++++ 9 files changed, 82 insertions(+), 2 deletions(-) -- 2.29.2 ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 1/2] init/initramfs.c: allow asynchronous unpacking 2021-03-09 21:16 ` [PATCH v2 " Rasmus Villemoes @ 2021-03-09 21:16 ` Rasmus Villemoes 2021-03-09 22:07 ` Linus Torvalds 2021-03-09 22:16 ` Linus Torvalds 2021-03-09 21:17 ` [PATCH v2 2/2] modules: add CONFIG_MODPROBE_PATH Rasmus Villemoes 1 sibling, 2 replies; 21+ messages in thread From: Rasmus Villemoes @ 2021-03-09 21:16 UTC (permalink / raw) To: Luis Chamberlain, Linus Torvalds, Andrew Morton Cc: Jessica Yu, Borislav Petkov, Jonathan Corbet, Greg Kroah-Hartman, linux-kernel, Nick Desaulniers, Rasmus Villemoes This is primarily motivated by an embedded ppc target, where unpacking even the rather modest sized initramfs takes 0.6 seconds, which is long enough that the external watchdog becomes unhappy that it doesn't get attention soon enough. By doing the initramfs decompression in a worker thread, we get to do the device_initcalls and hence start petting the watchdog much sooner. So add an initramfs_async= kernel parameter, allowing the main init process to proceed to handling device_initcall()s without waiting for populate_rootfs() to finish. Should one of those initcalls need something from the initramfs (say, a kernel module or a firmware blob), it will simply wait for the initramfs unpacking to be done before proceeding, which should in theory make this completely safe to always enable. But if some driver pokes around in the filesystem directly and not via one of the official kernel interfaces (i.e. request_firmware*(), call_usermodehelper*) that theory may not hold - also, I certainly might have missed a spot when sprinkling wait_for_initramfs(). Normal desktops might benefit as well. On my mostly stock Ubuntu kernel, my initramfs is a 26M xz-compressed blob, decompressing to around 126M. That takes almost two seconds: [ 0.201454] Trying to unpack rootfs image as initramfs... [ 1.976633] Freeing initrd memory: 29416K Before this patch, or with initramfs_async=0, these lines occur consecutively in dmesg. With initramfs_async=1, the timestamps on these two lines is roughly the same as above, but with 172 lines inbetween - so more than one cpu has been kept busy doing work that would otherwise only happen after the populate_rootfs() finished. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> --- .../admin-guide/kernel-parameters.txt | 12 ++++++ drivers/base/firmware_loader/main.c | 2 + include/linux/initrd.h | 2 + init/initramfs.c | 41 ++++++++++++++++++- init/main.c | 1 + kernel/umh.c | 2 + usr/Kconfig | 10 +++++ 7 files changed, 69 insertions(+), 1 deletion(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 04545725f187..fda9f012c42b 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -1825,6 +1825,18 @@ initcall functions. Useful for debugging built-in modules and initcalls. + initramfs_async= [KNL] Normally, the initramfs image is + unpacked synchronously, before most devices + are initialized. When the initramfs is huge, + or on slow CPUs, this can take a significant + amount of time. Setting this option means the + unpacking is instead done in a background + thread, allowing the main init process to + begin calling device_initcall()s while the + initramfs is being unpacked. + Format: <bool> + Default set by CONFIG_INITRAMFS_ASYNC. + initrd= [BOOT] Specify the location of the initial ramdisk initrdmem= [KNL] Specify a physical address and size from which to diff --git a/drivers/base/firmware_loader/main.c b/drivers/base/firmware_loader/main.c index 78355095e00d..4fdb8219cd08 100644 --- a/drivers/base/firmware_loader/main.c +++ b/drivers/base/firmware_loader/main.c @@ -15,6 +15,7 @@ #include <linux/kernel_read_file.h> #include <linux/module.h> #include <linux/init.h> +#include <linux/initrd.h> #include <linux/timer.h> #include <linux/vmalloc.h> #include <linux/interrupt.h> @@ -504,6 +505,7 @@ fw_get_filesystem_firmware(struct device *device, struct fw_priv *fw_priv, if (!path) return -ENOMEM; + wait_for_initramfs(); for (i = 0; i < ARRAY_SIZE(fw_path); i++) { size_t file_size = 0; size_t *file_size_ptr = NULL; diff --git a/include/linux/initrd.h b/include/linux/initrd.h index 85c15717af34..1bbe9af48dc3 100644 --- a/include/linux/initrd.h +++ b/include/linux/initrd.h @@ -20,8 +20,10 @@ extern void free_initrd_mem(unsigned long, unsigned long); #ifdef CONFIG_BLK_DEV_INITRD extern void __init reserve_initrd_mem(void); +extern void wait_for_initramfs(void); #else static inline void __init reserve_initrd_mem(void) {} +static inline void wait_for_initramfs(void) {} #endif extern phys_addr_t phys_initrd_start; diff --git a/init/initramfs.c b/init/initramfs.c index d677e8e717f1..d33bd98481c2 100644 --- a/init/initramfs.c +++ b/init/initramfs.c @@ -1,5 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 #include <linux/init.h> +#include <linux/async.h> #include <linux/fs.h> #include <linux/slab.h> #include <linux/types.h> @@ -541,6 +542,14 @@ static int __init keepinitrd_setup(char *__unused) __setup("keepinitrd", keepinitrd_setup); #endif +static bool initramfs_async = IS_ENABLED(CONFIG_INITRAMFS_ASYNC); +static int __init initramfs_async_setup(char *str) +{ + strtobool(str, &initramfs_async); + return 1; +} +__setup("initramfs_async=", initramfs_async_setup); + extern char __initramfs_start[]; extern unsigned long __initramfs_size; #include <linux/initrd.h> @@ -658,7 +667,7 @@ static void __init populate_initrd_image(char *err) } #endif /* CONFIG_BLK_DEV_RAM */ -static int __init populate_rootfs(void) +static void __init do_populate_rootfs(void *unused, async_cookie_t cookie) { /* Load the built in initramfs */ char *err = unpack_to_rootfs(__initramfs_start, __initramfs_size); @@ -693,6 +702,36 @@ static int __init populate_rootfs(void) initrd_end = 0; flush_delayed_fput(); +} + +static ASYNC_DOMAIN_EXCLUSIVE(initramfs_domain); +static async_cookie_t initramfs_cookie; + +void wait_for_initramfs(void) +{ + if (!initramfs_async) + return; + if (!initramfs_cookie) { + /* + * Something before rootfs_initcall wants to access + * the filesystem/initramfs. Probably a bug. Make a + * note, avoid deadlocking the machine, and let the + * caller's access fail as it used to. + */ + pr_warn_once("wait_for_initramfs() called before rootfs_initcalls\n"); + return; + } + async_synchronize_cookie_domain(initramfs_cookie + 1, &initramfs_domain); +} +EXPORT_SYMBOL_GPL(wait_for_initramfs); + +static int __init populate_rootfs(void) +{ + if (initramfs_async) + initramfs_cookie = async_schedule_domain(do_populate_rootfs, NULL, + &initramfs_domain); + else + do_populate_rootfs(NULL, 0); return 0; } rootfs_initcall(populate_rootfs); diff --git a/init/main.c b/init/main.c index 53b278845b88..64253b181a84 100644 --- a/init/main.c +++ b/init/main.c @@ -1538,6 +1538,7 @@ static noinline void __init kernel_init_freeable(void) kunit_run_all_tests(); + wait_for_initramfs(); console_on_rootfs(); /* diff --git a/kernel/umh.c b/kernel/umh.c index 3f646613a9d3..61f6b82c354b 100644 --- a/kernel/umh.c +++ b/kernel/umh.c @@ -27,6 +27,7 @@ #include <linux/ptrace.h> #include <linux/async.h> #include <linux/uaccess.h> +#include <linux/initrd.h> #include <trace/events/module.h> @@ -107,6 +108,7 @@ static int call_usermodehelper_exec_async(void *data) commit_creds(new); + wait_for_initramfs(); retval = kernel_execve(sub_info->path, (const char *const *)sub_info->argv, (const char *const *)sub_info->envp); diff --git a/usr/Kconfig b/usr/Kconfig index 8bbcf699fe3b..0f167c9f7eb9 100644 --- a/usr/Kconfig +++ b/usr/Kconfig @@ -32,6 +32,16 @@ config INITRAMFS_FORCE and is useful if you cannot or don't want to change the image your bootloader passes to the kernel. +config INITRAMFS_ASYNC + bool "Unpack initramfs asynchronously" + help + This option sets the default value of the initramfs_async= + command line parameter. If that parameter is set, unpacking + of initramfs (both the builtin and one passed from a + bootloader) is done asynchronously. See + <file:Documentation/admin-guide/kernel-parameters.txt> for + details. + config INITRAMFS_ROOT_UID int "User ID to map to 0 (user root)" depends on INITRAMFS_SOURCE!="" -- 2.29.2 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] init/initramfs.c: allow asynchronous unpacking 2021-03-09 21:16 ` [PATCH v2 1/2] init/initramfs.c: allow asynchronous unpacking Rasmus Villemoes @ 2021-03-09 22:07 ` Linus Torvalds 2021-03-09 22:39 ` Rasmus Villemoes 2021-03-11 17:55 ` Luis Chamberlain 2021-03-09 22:16 ` Linus Torvalds 1 sibling, 2 replies; 21+ messages in thread From: Linus Torvalds @ 2021-03-09 22:07 UTC (permalink / raw) To: Rasmus Villemoes Cc: Luis Chamberlain, Andrew Morton, Jessica Yu, Borislav Petkov, Jonathan Corbet, Greg Kroah-Hartman, Linux Kernel Mailing List, Nick Desaulniers On Tue, Mar 9, 2021 at 1:17 PM Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote: > > So add an initramfs_async= kernel parameter, allowing the main init > process to proceed to handling device_initcall()s without waiting for > populate_rootfs() to finish. I like this smaller second version of the patch, but am wondering why we even need the parameter. It sounds mostly like a "maybe I didn't think of all cases" thing - and one that will mean that this code will not see a lot of actual test coverage.. And because of the lack of test coverage, I'd rather reverse the meaning, and have the async case on by default (without even the Kconfig option), and have the kernel command line purely as a "oops, it's buggy, easy to ask people to test if this is what ails them". What *can* happen early boot outside of firmware loading and usermodehelpers? Hmm? Linus ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] init/initramfs.c: allow asynchronous unpacking 2021-03-09 22:07 ` Linus Torvalds @ 2021-03-09 22:39 ` Rasmus Villemoes 2021-03-11 17:55 ` Luis Chamberlain 1 sibling, 0 replies; 21+ messages in thread From: Rasmus Villemoes @ 2021-03-09 22:39 UTC (permalink / raw) To: Linus Torvalds, Rasmus Villemoes Cc: Luis Chamberlain, Andrew Morton, Jessica Yu, Borislav Petkov, Jonathan Corbet, Greg Kroah-Hartman, Linux Kernel Mailing List, Nick Desaulniers On 09/03/2021 23.07, Linus Torvalds wrote: > On Tue, Mar 9, 2021 at 1:17 PM Rasmus Villemoes > <linux@rasmusvillemoes.dk> wrote: >> >> So add an initramfs_async= kernel parameter, allowing the main init >> process to proceed to handling device_initcall()s without waiting for >> populate_rootfs() to finish. > > I like this smaller second version of the patch, but am wondering why > we even need the parameter. > > It sounds mostly like a "maybe I didn't think of all cases" thing - That's exactly what it is. > and one that will mean that this code will not see a lot of actual > test coverage.. Yeah, that's probably true. > And because of the lack of test coverage, I'd rather reverse the > meaning, and have the async case on by default (without even the > Kconfig option), and have the kernel command line purely as a "oops, > it's buggy, easy to ask people to test if this is what ails them". Well, I wasn't bold enough to make it "default y" by myself, but I can certainly do that and nuke the config option. > What *can* happen early boot outside of firmware loading and usermodehelpers? Well, that was what I tried to get people to tell me when I sent the first version as RFC, and also before that (https://lore.kernel.org/lkml/19574912-44b4-c1dc-44c3-67309968d465@rasmusvillemoes.dk/). That you can't think of anything suggests that I have covered the important cases - which does leave random drivers that poke around the filesystem on their own, but (a) it would probably be a good thing to have this flush those out and (b) there's the command line option to make it boot anyway. Rasmus ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] init/initramfs.c: allow asynchronous unpacking 2021-03-09 22:07 ` Linus Torvalds 2021-03-09 22:39 ` Rasmus Villemoes @ 2021-03-11 17:55 ` Luis Chamberlain 1 sibling, 0 replies; 21+ messages in thread From: Luis Chamberlain @ 2021-03-11 17:55 UTC (permalink / raw) To: Linus Torvalds, tglx, mingo, bp, x86, hpa, dwmw2, baolu.lu, joro, iommu, andreas.noever, michael.jamet, mika.westerberg, YehezkelShB, linux-usb Cc: Rasmus Villemoes, Andrew Morton, Jessica Yu, Borislav Petkov, Jonathan Corbet, Greg Kroah-Hartman, Linux Kernel Mailing List, Nick Desaulniers On Tue, Mar 09, 2021 at 02:07:36PM -0800, Linus Torvalds wrote: > On Tue, Mar 9, 2021 at 1:17 PM Rasmus Villemoes > <linux@rasmusvillemoes.dk> wrote: > > > > So add an initramfs_async= kernel parameter, allowing the main init > > process to proceed to handling device_initcall()s without waiting for > > populate_rootfs() to finish. > > I like this smaller second version of the patch, but am wondering why > we even need the parameter. > > It sounds mostly like a "maybe I didn't think of all cases" thing - > and one that will mean that this code will not see a lot of actual > test coverage.. > > And because of the lack of test coverage, I'd rather reverse the > meaning, and have the async case on by default (without even the > Kconfig option), and have the kernel command line purely as a "oops, > it's buggy, easy to ask people to test if this is what ails them". If we're going to set this as default it might be good to document on init.h that components that need content in initramfs need the wait call. > What *can* happen early boot outside of firmware loading and usermodehelpers? *In practice* the only thing I can think of at this time is races with other rootfs_initcall() calls, granted ordering among these is done at linker time, but I can't think of a issue with them: arch/x86/kernel/pci-dma.c:rootfs_initcall(pci_iommu_init); drivers/iommu/intel/irq_remapping.c:rootfs_initcall(ir_dev_scope_init); drivers/mfd/sta2x11-mfd.c:rootfs_initcall(sta2x11_mfd_init); drivers/thunderbolt/nhi.c:rootfs_initcall(nhi_init); But Cc'ing the maintainers of these components just in case. Luis ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] init/initramfs.c: allow asynchronous unpacking 2021-03-09 21:16 ` [PATCH v2 1/2] init/initramfs.c: allow asynchronous unpacking Rasmus Villemoes 2021-03-09 22:07 ` Linus Torvalds @ 2021-03-09 22:16 ` Linus Torvalds 2021-03-09 22:51 ` Rasmus Villemoes 2021-03-11 0:17 ` Rasmus Villemoes 1 sibling, 2 replies; 21+ messages in thread From: Linus Torvalds @ 2021-03-09 22:16 UTC (permalink / raw) To: Rasmus Villemoes Cc: Luis Chamberlain, Andrew Morton, Jessica Yu, Borislav Petkov, Jonathan Corbet, Greg Kroah-Hartman, Linux Kernel Mailing List, Nick Desaulniers On Tue, Mar 9, 2021 at 1:17 PM Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote: > > So add an initramfs_async= kernel parameter, allowing the main init > process to proceed to handling device_initcall()s without waiting for > populate_rootfs() to finish. Oh, and a completely unrelated second comment about this: some of the initramfs population code seems to be actively written to be slow. For example, I'm not sure why that write_buffer() function uses an array of indirect function pointer actions. Even ignoring the whole "speculation protections make that really slow" issue that came later, it seems to always have been actively (a) slower and (b) more complex. The normal way to do that is with a simple switch() statement, which makes the compiler able to do a much better job. Not just for the state selector - maybe it picks that function pointer approach, but probably these days just direct comparisons - but simply to do things like inline all those "it's used in one place" cases entirely. In fact, at that point, a lot of the state machine changes may end up turning into pure goto's - compilers are sometimes good at that (because state machines have often been very timing-critical). Is that likely to be a big part of the costs? No. I assume it's the decompression and the actual VFS operations. But when the series is about how this all takes a long time, and I go "that code really looks actively pessimally written", maybe it would be a good thing to look into it? Linus ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] init/initramfs.c: allow asynchronous unpacking 2021-03-09 22:16 ` Linus Torvalds @ 2021-03-09 22:51 ` Rasmus Villemoes 2021-03-11 0:17 ` Rasmus Villemoes 1 sibling, 0 replies; 21+ messages in thread From: Rasmus Villemoes @ 2021-03-09 22:51 UTC (permalink / raw) To: Linus Torvalds, Rasmus Villemoes Cc: Luis Chamberlain, Andrew Morton, Jessica Yu, Borislav Petkov, Jonathan Corbet, Greg Kroah-Hartman, Linux Kernel Mailing List, Nick Desaulniers On 09/03/2021 23.16, Linus Torvalds wrote: > On Tue, Mar 9, 2021 at 1:17 PM Rasmus Villemoes > <linux@rasmusvillemoes.dk> wrote: >> >> So add an initramfs_async= kernel parameter, allowing the main init >> process to proceed to handling device_initcall()s without waiting for >> populate_rootfs() to finish. > > Oh, and a completely unrelated second comment about this: some of the > initramfs population code seems to be actively written to be slow. > > For example, I'm not sure why that write_buffer() function uses an > array of indirect function pointer actions. Even ignoring the whole > "speculation protections make that really slow" issue that came later, > it seems to always have been actively (a) slower and (b) more complex. > [...] > Is that likely to be a big part of the costs? No. I assume it's the > decompression and the actual VFS operations. Yes, I have been doing some simple measurements, simply by decompressing the blob in userspace and comparing to the time to that used by populate_rootfs(). For both the 6M lz4-compressed blob on my ppc target and the 26M xz-compressed blob on my laptop, the result is that the decompression itself accounts for the vast majority of the time - and for ppc in particular, I don't think there's any spectre slowdown. So I haven't dared looking into changing the unpack implementation since it doesn't seem it could buy that much. Rasmus ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] init/initramfs.c: allow asynchronous unpacking 2021-03-09 22:16 ` Linus Torvalds 2021-03-09 22:51 ` Rasmus Villemoes @ 2021-03-11 0:17 ` Rasmus Villemoes 2021-03-11 1:45 ` Rasmus Villemoes 1 sibling, 1 reply; 21+ messages in thread From: Rasmus Villemoes @ 2021-03-11 0:17 UTC (permalink / raw) To: Linus Torvalds Cc: Luis Chamberlain, Andrew Morton, Jessica Yu, Borislav Petkov, Jonathan Corbet, Greg Kroah-Hartman, Linux Kernel Mailing List, Nick Desaulniers On 09/03/2021 23.16, Linus Torvalds wrote: > On Tue, Mar 9, 2021 at 1:17 PM Rasmus Villemoes > <linux@rasmusvillemoes.dk> wrote: >> >> So add an initramfs_async= kernel parameter, allowing the main init >> process to proceed to handling device_initcall()s without waiting for >> populate_rootfs() to finish. > > Oh, and a completely unrelated second comment about this: some of the > initramfs population code seems to be actively written to be slow. > > For example, I'm not sure why that write_buffer() function uses an > array of indirect function pointer actions. Even ignoring the whole > "speculation protections make that really slow" issue that came later, > it seems to always have been actively (a) slower and (b) more complex. > > The normal way to do that is with a simple switch() statement, which > makes the compiler able to do a much better job. Not just for the > state selector - maybe it picks that function pointer approach, but > probably these days just direct comparisons - but simply to do things > like inline all those "it's used in one place" cases entirely. In > fact, at that point, a lot of the state machine changes may end up > turning into pure goto's - compilers are sometimes good at that > (because state machines have often been very timing-critical). FWIW, I tried doing --- a/init/initramfs.c +++ b/init/initramfs.c @@ -401,24 +401,26 @@ static int __init do_symlink(void) return 0; } -static __initdata int (*actions[])(void) = { - [Start] = do_start, - [Collect] = do_collect, - [GotHeader] = do_header, - [SkipIt] = do_skip, - [GotName] = do_name, - [CopyFile] = do_copy, - [GotSymlink] = do_symlink, - [Reset] = do_reset, -}; - static long __init write_buffer(char *buf, unsigned long len) { + int ret; + byte_count = len; victim = buf; - while (!actions[state]()) - ; + do { + switch (state) { + case Start: ret = do_start(); break; + case Collect: ret = do_collect(); break; + case GotHeader: ret = do_header(); break; + case SkipIt: ret = do_skip(); break; + case GotName: ret = do_name(); break; + case CopyFile: ret = do_copy(); break; + case GotSymlink: ret = do_symlink(); break; + case Reset: ret = do_reset(); break; + } + } while (!ret); + return len - byte_count; } and yes, all the do_* functions get inlined into write_buffer with some net space saving: add/remove: 0/9 grow/shrink: 1/0 up/down: 1696/-2112 (-416) Function old new delta write_buffer 100 1796 +1696 actions 32 - -32 do_start 52 - -52 do_reset 112 - -112 do_skip 128 - -128 do_collect 144 - -144 do_symlink 172 - -172 do_copy 304 - -304 do_header 572 - -572 do_name 596 - -596 Total: Before=5360, After=4944, chg -7.76% (ppc32 build). But the unpacking still takes the same time. It might be different on x86. Rasmus ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] init/initramfs.c: allow asynchronous unpacking 2021-03-11 0:17 ` Rasmus Villemoes @ 2021-03-11 1:45 ` Rasmus Villemoes 2021-03-11 18:02 ` Linus Torvalds 0 siblings, 1 reply; 21+ messages in thread From: Rasmus Villemoes @ 2021-03-11 1:45 UTC (permalink / raw) To: Linus Torvalds Cc: Luis Chamberlain, Andrew Morton, Jessica Yu, Borislav Petkov, Jonathan Corbet, Greg Kroah-Hartman, Linux Kernel Mailing List, Nick Desaulniers On 11/03/2021 01.17, Rasmus Villemoes wrote: > On 09/03/2021 23.16, Linus Torvalds wrote: >> On Tue, Mar 9, 2021 at 1:17 PM Rasmus Villemoes >> <linux@rasmusvillemoes.dk> wrote: >>> >>> So add an initramfs_async= kernel parameter, allowing the main init >>> process to proceed to handling device_initcall()s without waiting for >>> populate_rootfs() to finish. >> >> Oh, and a completely unrelated second comment about this: some of the >> initramfs population code seems to be actively written to be slow. >> >> For example, I'm not sure why that write_buffer() function uses an >> array of indirect function pointer actions. Even ignoring the whole >> "speculation protections make that really slow" issue that came later, >> it seems to always have been actively (a) slower and (b) more complex. >> >> The normal way to do that is with a simple switch() statement, which >> makes the compiler able to do a much better job. Not just for the >> state selector - maybe it picks that function pointer approach, but >> probably these days just direct comparisons - but simply to do things >> like inline all those "it's used in one place" cases entirely. In >> fact, at that point, a lot of the state machine changes may end up >> turning into pure goto's - compilers are sometimes good at that >> (because state machines have often been very timing-critical). > > FWIW, I tried doing > Hm, gcc does elide the test of the return value, but jumps back to a place where it always loads state from its memory location and does the whole switch(). To get it to jump directly to the code implementing the various do_* helpers it seems one needs to avoid that global variable and instead return the next state explicitly. The below boots, but I still can't see any measurable improvement on ppc. Rasmus Subject: [PATCH] init/initramfs.c: change state machine implementation Instead of having write_buffer() rely on the global variable "state", have each of the do_* helpers return the next state, or the new token Stop. Also, instead of an array of function pointers, use a switch statement. This means all the do_* helpers end up inlined into write_buffer(), and all the places which return a compile-time constant next state now compile to a direct jump to that code. We still need the global variable state for the initial choice within write_buffer, and we also need to preserve the last non-Stop state across calls. --- init/initramfs.c | 90 ++++++++++++++++++++++++------------------------ 1 file changed, 45 insertions(+), 45 deletions(-) diff --git a/init/initramfs.c b/init/initramfs.c index 1d0fdd05e5e9..ad7e04393acb 100644 --- a/init/initramfs.c +++ b/init/initramfs.c @@ -189,7 +189,8 @@ static __initdata enum state { GotName, CopyFile, GotSymlink, - Reset + Reset, + Stop } state, next_state; static __initdata char *victim; @@ -207,17 +208,17 @@ static __initdata char *collected; static long remains __initdata; static __initdata char *collect; -static void __init read_into(char *buf, unsigned size, enum state next) +static int __init read_into(char *buf, unsigned size, enum state next) { if (byte_count >= size) { collected = victim; eat(size); - state = next; + return next; } else { collect = collected = buf; remains = size; next_state = next; - state = Collect; + return Collect; } } @@ -225,8 +226,7 @@ static __initdata char *header_buf, *symlink_buf, *name_buf; static int __init do_start(void) { - read_into(header_buf, 110, GotHeader); - return 0; + return read_into(header_buf, 110, GotHeader); } static int __init do_collect(void) @@ -238,50 +238,46 @@ static int __init do_collect(void) eat(n); collect += n; if ((remains -= n) != 0) - return 1; - state = next_state; - return 0; + return Stop; + return next_state; } static int __init do_header(void) { if (memcmp(collected, "070707", 6)==0) { error("incorrect cpio method used: use -H newc option"); - return 1; + return Stop; } if (memcmp(collected, "070701", 6)) { error("no cpio magic"); - return 1; + return Stop; } parse_header(collected); next_header = this_header + N_ALIGN(name_len) + body_len; next_header = (next_header + 3) & ~3; - state = SkipIt; if (name_len <= 0 || name_len > PATH_MAX) - return 0; + return SkipIt; if (S_ISLNK(mode)) { if (body_len > PATH_MAX) - return 0; + return SkipIt; collect = collected = symlink_buf; remains = N_ALIGN(name_len) + body_len; next_state = GotSymlink; - state = Collect; - return 0; + return Collect; } if (S_ISREG(mode) || !body_len) - read_into(name_buf, N_ALIGN(name_len), GotName); - return 0; + return read_into(name_buf, N_ALIGN(name_len), GotName); + return SkipIt; } static int __init do_skip(void) { if (this_header + byte_count < next_header) { eat(byte_count); - return 1; + return Stop; } else { eat(next_header - this_header); - state = next_state; - return 0; + return next_state; } } @@ -291,7 +287,7 @@ static int __init do_reset(void) eat(1); if (byte_count && (this_header & 3)) error("broken padding"); - return 1; + return Stop; } static void __init clean_path(char *path, umode_t fmode) @@ -324,11 +320,12 @@ static __initdata loff_t wfile_pos; static int __init do_name(void) { - state = SkipIt; + int s = SkipIt; + next_state = Reset; if (strcmp(collected, "TRAILER!!!") == 0) { free_hash(); - return 0; + return s; } clean_path(collected, mode); if (S_ISREG(mode)) { @@ -339,14 +336,14 @@ static int __init do_name(void) openflags |= O_TRUNC; wfile = filp_open(collected, openflags, mode); if (IS_ERR(wfile)) - return 0; + return s; wfile_pos = 0; vfs_fchown(wfile, uid, gid); vfs_fchmod(wfile, mode); if (body_len) vfs_truncate(&wfile->f_path, body_len); - state = CopyFile; + s = CopyFile; } } else if (S_ISDIR(mode)) { init_mkdir(collected, mode); @@ -362,7 +359,7 @@ static int __init do_name(void) do_utime(collected, mtime); } } - return 0; + return s; } static int __init do_copy(void) @@ -378,14 +375,13 @@ static int __init do_copy(void) fput(wfile); eat(body_len); - state = SkipIt; - return 0; + return SkipIt; } else { if (xwrite(wfile, victim, byte_count, &wfile_pos) != byte_count) error("write error"); body_len -= byte_count; eat(byte_count); - return 1; + return Stop; } } @@ -396,29 +392,33 @@ static int __init do_symlink(void) init_symlink(collected + N_ALIGN(name_len), collected); init_chown(collected, uid, gid, AT_SYMLINK_NOFOLLOW); do_utime(collected, mtime); - state = SkipIt; next_state = Reset; - return 0; + return SkipIt; } -static __initdata int (*actions[])(void) = { - [Start] = do_start, - [Collect] = do_collect, - [GotHeader] = do_header, - [SkipIt] = do_skip, - [GotName] = do_name, - [CopyFile] = do_copy, - [GotSymlink] = do_symlink, - [Reset] = do_reset, -}; - static long __init write_buffer(char *buf, unsigned long len) { + int s = state; + int save; + byte_count = len; victim = buf; - while (!actions[state]()) - ; + do { + save = s; + switch (s) { + case Start: s = do_start(); break; + case Collect: s = do_collect(); break; + case GotHeader: s = do_header(); break; + case SkipIt: s = do_skip(); break; + case GotName: s = do_name(); break; + case CopyFile: s = do_copy(); break; + case GotSymlink: s = do_symlink(); break; + case Reset: s = do_reset(); break; + } + } while (s != Stop); + state = save; + return len - byte_count; } -- 2.29.2 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] init/initramfs.c: allow asynchronous unpacking 2021-03-11 1:45 ` Rasmus Villemoes @ 2021-03-11 18:02 ` Linus Torvalds 2021-03-13 13:13 ` Rasmus Villemoes 0 siblings, 1 reply; 21+ messages in thread From: Linus Torvalds @ 2021-03-11 18:02 UTC (permalink / raw) To: Rasmus Villemoes Cc: Luis Chamberlain, Andrew Morton, Jessica Yu, Borislav Petkov, Jonathan Corbet, Greg Kroah-Hartman, Linux Kernel Mailing List, Nick Desaulniers On Wed, Mar 10, 2021 at 5:45 PM Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote: > > Hm, gcc does elide the test of the return value, but jumps back to a > place where it always loads state from its memory location and does the > whole switch(). To get it to jump directly to the code implementing the > various do_* helpers it seems one needs to avoid that global variable > and instead return the next state explicitly. The below boots, but I > still can't see any measurable improvement on ppc. Ok. That's definitely the right way to do efficient statemachines that the compiler can actually generate ok code for, but if you can't measure the difference I guess it isn't even worth doing. Thanks for checking, though. Linus ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/2] init/initramfs.c: allow asynchronous unpacking 2021-03-11 18:02 ` Linus Torvalds @ 2021-03-13 13:13 ` Rasmus Villemoes 0 siblings, 0 replies; 21+ messages in thread From: Rasmus Villemoes @ 2021-03-13 13:13 UTC (permalink / raw) To: Linus Torvalds, Rasmus Villemoes Cc: Luis Chamberlain, Andrew Morton, Jessica Yu, Borislav Petkov, Jonathan Corbet, Greg Kroah-Hartman, Linux Kernel Mailing List, Nick Desaulniers On 11/03/2021 19.02, Linus Torvalds wrote: > On Wed, Mar 10, 2021 at 5:45 PM Rasmus Villemoes > <linux@rasmusvillemoes.dk> wrote: >> >> Hm, gcc does elide the test of the return value, but jumps back to a >> place where it always loads state from its memory location and does the >> whole switch(). To get it to jump directly to the code implementing the >> various do_* helpers it seems one needs to avoid that global variable >> and instead return the next state explicitly. The below boots, but I >> still can't see any measurable improvement on ppc. > > Ok. That's definitely the right way to do efficient statemachines that > the compiler can actually generate ok code for, but if you can't > measure the difference I guess it isn't even worth doing. Just for good measure, I now got around to test on x86 as well, where I thought the speculation stuff might make a difference. However, the indirect calls through the actions[] array don't actually hurt due to __noinitretpoline, and even removing that from the __init definition, I only see about 1.5% difference with that state machine patch applied. So it doesn't seem worth pursuing. I'll send v3 of the async patches shortly. Rasmus ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 2/2] modules: add CONFIG_MODPROBE_PATH 2021-03-09 21:16 ` [PATCH v2 " Rasmus Villemoes 2021-03-09 21:16 ` [PATCH v2 1/2] init/initramfs.c: allow asynchronous unpacking Rasmus Villemoes @ 2021-03-09 21:17 ` Rasmus Villemoes 2021-03-10 9:46 ` Greg Kroah-Hartman 2021-03-11 13:28 ` Jessica Yu 1 sibling, 2 replies; 21+ messages in thread From: Rasmus Villemoes @ 2021-03-09 21:17 UTC (permalink / raw) To: Luis Chamberlain, Linus Torvalds, Andrew Morton Cc: Jessica Yu, Borislav Petkov, Jonathan Corbet, Greg Kroah-Hartman, linux-kernel, Nick Desaulniers, Rasmus Villemoes Allow the developer to specifiy the initial value of the modprobe_path[] string. This can be used to set it to the empty string initially, thus effectively disabling request_module() during early boot until userspace writes a new value via the /proc/sys/kernel/modprobe interface. [1] When building a custom kernel (often for an embedded target), it's normal to build everything into the kernel that is needed for booting, and indeed the initramfs often contains no modules at all, so every such request_module() done before userspace init has mounted the real rootfs is a waste of time. This is particularly useful when combined with the previous patch, which allowed the initramfs to be unpacked asynchronously - for that to work, it had to make any usermodehelper call wait for the unpacking to finish before attempting to invoke the userspace helper. By eliminating all such (known-to-be-futile) calls of usermodehelper, the initramfs unpacking and the {device,late}_initcalls can proceed in parallel for much longer. For a relatively slow ppc board I'm working on, the two patches combined lead to 0.2s faster boot - but more importantly, the fact that the initramfs unpacking proceeds completely in the background while devices get probed means I get to handle the gpio watchdog in time without getting reset. [1] __request_module() already has an early -ENOENT return when modprobe_path is the empty string. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> --- init/Kconfig | 12 ++++++++++++ kernel/kmod.c | 2 +- 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/init/Kconfig b/init/Kconfig index 22946fe5ded9..18b4ec7346d4 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -2264,6 +2264,18 @@ config MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS If unsure, say N. +config MODPROBE_PATH + string "Path to modprobe binary" + default "/sbin/modprobe" + help + When kernel code requests a module, it does so by calling + the "modprobe" userspace utility. This option allows you to + set the path where that binary is found. This can be changed + at runtime via the sysctl file + /proc/sys/kernel/modprobe. Setting this to the empty string + removes the kernel's ability to request modules (but + userspace can still load modules explicitly). + config TRIM_UNUSED_KSYMS bool "Trim unused exported kernel symbols" if EXPERT depends on !COMPILE_TEST diff --git a/kernel/kmod.c b/kernel/kmod.c index 3cd075ce2a1e..b717134ebe17 100644 --- a/kernel/kmod.c +++ b/kernel/kmod.c @@ -58,7 +58,7 @@ static DECLARE_WAIT_QUEUE_HEAD(kmod_wq); /* modprobe_path is set via /proc/sys. */ -char modprobe_path[KMOD_PATH_LEN] = "/sbin/modprobe"; +char modprobe_path[KMOD_PATH_LEN] = CONFIG_MODPROBE_PATH; static void free_modprobe_argv(struct subprocess_info *info) { -- 2.29.2 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/2] modules: add CONFIG_MODPROBE_PATH 2021-03-09 21:17 ` [PATCH v2 2/2] modules: add CONFIG_MODPROBE_PATH Rasmus Villemoes @ 2021-03-10 9:46 ` Greg Kroah-Hartman 2021-03-11 13:28 ` Jessica Yu 1 sibling, 0 replies; 21+ messages in thread From: Greg Kroah-Hartman @ 2021-03-10 9:46 UTC (permalink / raw) To: Rasmus Villemoes Cc: Luis Chamberlain, Linus Torvalds, Andrew Morton, Jessica Yu, Borislav Petkov, Jonathan Corbet, linux-kernel, Nick Desaulniers On Tue, Mar 09, 2021 at 10:17:00PM +0100, Rasmus Villemoes wrote: > Allow the developer to specifiy the initial value of the > modprobe_path[] string. This can be used to set it to the empty string > initially, thus effectively disabling request_module() during early > boot until userspace writes a new value via the > /proc/sys/kernel/modprobe interface. [1] > > When building a custom kernel (often for an embedded target), it's > normal to build everything into the kernel that is needed for booting, > and indeed the initramfs often contains no modules at all, so every > such request_module() done before userspace init has mounted the real > rootfs is a waste of time. > > This is particularly useful when combined with the previous patch, > which allowed the initramfs to be unpacked asynchronously - for that > to work, it had to make any usermodehelper call wait for the unpacking > to finish before attempting to invoke the userspace helper. By > eliminating all such (known-to-be-futile) calls of usermodehelper, the > initramfs unpacking and the {device,late}_initcalls can proceed in > parallel for much longer. > > For a relatively slow ppc board I'm working on, the two patches > combined lead to 0.2s faster boot - but more importantly, the fact > that the initramfs unpacking proceeds completely in the background > while devices get probed means I get to handle the gpio watchdog in > time without getting reset. > > [1] __request_module() already has an early -ENOENT return when > modprobe_path is the empty string. > > Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> > --- > init/Kconfig | 12 ++++++++++++ > kernel/kmod.c | 2 +- > 2 files changed, 13 insertions(+), 1 deletion(-) Looks sane to me, odd we didn't think of doing this before. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 2/2] modules: add CONFIG_MODPROBE_PATH 2021-03-09 21:17 ` [PATCH v2 2/2] modules: add CONFIG_MODPROBE_PATH Rasmus Villemoes 2021-03-10 9:46 ` Greg Kroah-Hartman @ 2021-03-11 13:28 ` Jessica Yu 1 sibling, 0 replies; 21+ messages in thread From: Jessica Yu @ 2021-03-11 13:28 UTC (permalink / raw) To: Rasmus Villemoes Cc: Luis Chamberlain, Linus Torvalds, Andrew Morton, Borislav Petkov, Jonathan Corbet, Greg Kroah-Hartman, linux-kernel, Nick Desaulniers +++ Rasmus Villemoes [09/03/21 22:17 +0100]: >Allow the developer to specifiy the initial value of the >modprobe_path[] string. This can be used to set it to the empty string >initially, thus effectively disabling request_module() during early >boot until userspace writes a new value via the >/proc/sys/kernel/modprobe interface. [1] > >When building a custom kernel (often for an embedded target), it's >normal to build everything into the kernel that is needed for booting, >and indeed the initramfs often contains no modules at all, so every >such request_module() done before userspace init has mounted the real >rootfs is a waste of time. > >This is particularly useful when combined with the previous patch, >which allowed the initramfs to be unpacked asynchronously - for that >to work, it had to make any usermodehelper call wait for the unpacking >to finish before attempting to invoke the userspace helper. By >eliminating all such (known-to-be-futile) calls of usermodehelper, the >initramfs unpacking and the {device,late}_initcalls can proceed in >parallel for much longer. > >For a relatively slow ppc board I'm working on, the two patches >combined lead to 0.2s faster boot - but more importantly, the fact >that the initramfs unpacking proceeds completely in the background >while devices get probed means I get to handle the gpio watchdog in >time without getting reset. > >[1] __request_module() already has an early -ENOENT return when >modprobe_path is the empty string. > >Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Jessica Yu <jeyu@kernel.org> Thanks! ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2021-03-13 13:14 UTC | newest] Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-02-24 14:29 [PATCH/RFC 0/2] background initramfs unpacking, and CONFIG_MODPROBE_PATH Rasmus Villemoes 2021-02-24 14:29 ` [PATCH/RFC 1/2] init/initramfs.c: allow asynchronous unpacking Rasmus Villemoes 2021-02-24 17:17 ` Linus Torvalds 2021-02-24 22:13 ` Rasmus Villemoes 2021-03-02 16:26 ` Luis Chamberlain 2021-02-24 14:29 ` [PATCH/RFC 2/2] modules: add CONFIG_MODPROBE_PATH Rasmus Villemoes 2021-02-24 22:19 ` [PATCH/RFC 0/2] background initramfs unpacking, and CONFIG_MODPROBE_PATH Rasmus Villemoes 2021-03-09 21:16 ` [PATCH v2 " Rasmus Villemoes 2021-03-09 21:16 ` [PATCH v2 1/2] init/initramfs.c: allow asynchronous unpacking Rasmus Villemoes 2021-03-09 22:07 ` Linus Torvalds 2021-03-09 22:39 ` Rasmus Villemoes 2021-03-11 17:55 ` Luis Chamberlain 2021-03-09 22:16 ` Linus Torvalds 2021-03-09 22:51 ` Rasmus Villemoes 2021-03-11 0:17 ` Rasmus Villemoes 2021-03-11 1:45 ` Rasmus Villemoes 2021-03-11 18:02 ` Linus Torvalds 2021-03-13 13:13 ` Rasmus Villemoes 2021-03-09 21:17 ` [PATCH v2 2/2] modules: add CONFIG_MODPROBE_PATH Rasmus Villemoes 2021-03-10 9:46 ` Greg Kroah-Hartman 2021-03-11 13:28 ` Jessica Yu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).