* [PATCH v2] initramfs: Support unpacking directly to tmpfs
@ 2023-11-29  9:00 Emily Shepherd
From: Emily Shepherd @ 2023-11-29  9:00 UTC (permalink / raw)
  To: Andrew Morton
  Cc: initramfs, Thomas Strömberg, Anders Björklund,
	Giuseppe Scrivano, Al Viro, Christoph Hellwig, Jens Axboe,
	Rob Landley

For systems which run directly from initramfs, it is not possible to use
pivot_root without first changing root. This is because of the
intentional design choice that rootfs, which is where initramfs is
unpacked to, cannot be unmounted.

pivot_root is an important feature for creating containers and the
alternative (mounting the new root over the top of the old with MS_MOVE
and then calling chroot) is not favoured by most container runtimes
[1][2] as it does not completely remove the host system mounts from the
mount namespace.

The general workaround, when running directly from initramfs, is to
have init mount a new tmpfs, copy everything out of rootfs, and then
switch_root [3][4]. This is only required when running directly from the
initramfs, as all other methods of acquiring a root device (having the
kernel mount a root device directly via the root= parameter, or using
initramfs to mount and then switch_root to a new root) leave an empty
rootfs at the top of the mount stack.

This commit adds a new build option, EMPTY_ROOTFS, available when
initrd/initramfs is enabled. When selected, rather than unpacking the
built-in or bootloader-provided initramfs directly into rootfs, the
kernel will mount a new tmpfs/ramfs over the top of the rootfs and
unpack to that instead, leaving an empty rootfs at the top of the stack.
This removes the need to have init copy everything as a workaround.

[1]: https://github.com/opencontainers/runc/blob/95a93c132cf179a017312e22a954f137e8237c4e/man/runc-create.8.md?plain=1#L27
[2]: https://github.com/containers/crun/blob/8e8d7972f738f28294cd5c16091d136ca278759e/crun.1.md?plain=1#L103
[3]: https://github.com/tinycorelinux/Core-scripts/blob/dbb24bf42a0a9935b18e66a0b936266b2244251b/init#L13
[4]: https://github.com/kubernetes/minikube/blob/master/deploy/iso/minikube-iso/board/minikube/x86_64/rootfs-overlay/init#L6

Signed-off-by: Emily Shepherd <emily@redcoat.dev>
---
v2:
  - Fix formatting error in patch
  - Update overmount_rootfs() return type to void
  - cc relevant kernel devs based on blame of init files
  - cc OCI container runtime devs who have supported no-pivot options
  - cc small / embedded linux devs who have mitigated this by copying 
    root
  - tweak to changelog: clarify why no-pivot is not recommended
  - tweak to changelog: include missing reference to minikube's rootfs 
    mitigation
---
 init/Kconfig     | 13 +++++++++++++
 init/do_mounts.c | 23 +++++++++++++++++++++++
 init/do_mounts.h |  6 ++++++
 init/initramfs.c |  4 ++++
 4 files changed, 46 insertions(+)

diff --git a/init/Kconfig b/init/Kconfig
index 6d35728b94b2b..bf15bd08abdc2 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1299,6 +1299,19 @@ config BLK_DEV_INITRD
 
 if BLK_DEV_INITRD
 
+config EMPTY_ROOTFS
+	bool "Mount initramfs over empty rootfs"
+	help
+		Normally initramfs is unpacked directly into the rootfs. When this
+		option is enabled, initramfs is instead unpacked into a tmpfs
+		mounted on top of a permanently empty rootfs.
+
+		This is mostly useful for embedded operating systems, running
+		directly from initramfs, which need to make use of pivot_root (for
+		example systems running containers).
+
+		If unsure, say N.
+
 source "usr/Kconfig"
 
 endif
diff --git a/init/do_mounts.c b/init/do_mounts.c
index 5dfd30b13f485..7cf106cf976db 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -514,3 +514,26 @@ void __init init_rootfs(void)
 		(!root_fs_names || strstr(root_fs_names, "tmpfs")))
 		is_tmpfs = true;
 }
+
+#ifdef CONFIG_EMPTY_ROOTFS
+void __init overmount_rootfs(void) {
+	int err;
+
+	err = init_mkdir("/root", 0700);
+	if (err != 0)
+		goto out;
+
+	err = init_mount("rootfs", "/root", is_tmpfs ? "tmpfs" : "ramfs", 0, NULL);
+	if (err != 0)
+		goto out;
+
+	init_chdir("/root");
+	init_mount(".", "/", NULL, MS_MOVE, NULL);
+	init_chroot(".");
+
+	return;
+
+out:
	pr_warn("Failed to mount over rootfs\n");
+}
+#endif /* CONFIG_EMPTY_ROOTFS */
diff --git a/init/do_mounts.h b/init/do_mounts.h
index 15e372b00ce70..3a261f1ae0d64 100644
--- a/init/do_mounts.h
+++ b/init/do_mounts.h
@@ -41,3 +41,9 @@ static inline bool initrd_load(char *root_device_name)
 	}
 
 #endif
+
+#ifdef CONFIG_EMPTY_ROOTFS
+void __init overmount_rootfs(void);
+#else
+static inline void __init overmount_rootfs(void) { return; }
+#endif
diff --git a/init/initramfs.c b/init/initramfs.c
index 8d0fd946cdd2b..76525108a39d2 100644
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -19,6 +19,8 @@
 #include <linux/task_work.h>
 #include <linux/umh.h>
 
+#include "do_mounts.h"
+
 static __initdata bool csum_present;
 static __initdata u32 io_csum;
 
@@ -688,6 +690,8 @@ static void __init populate_initrd_image(char *err)
 
 static void __init do_populate_rootfs(void *unused, async_cookie_t cookie)
 {
+	overmount_rootfs();
+
 	/* Load the built in initramfs */
 	char *err = unpack_to_rootfs(__initramfs_start, __initramfs_size);
 	if (err)
-- 2.42.0


* Re: [PATCH v2] initramfs: Support unpacking directly to tmpfs
@ 2023-12-19 19:22 Askar Safin
From: Askar Safin @ 2023-12-19 19:22 UTC (permalink / raw)
  To: emily; +Cc: initramfs, rob

Hi, Emily.

I propose this solution: at the very beginning of your initramfs's
init, do the equivalent of this:

mkdir root
mount --bind / root
cd root
mount --move . /
chroot .

And then everything else.

This will create a second mount of the initramfs. Everything will look
the same, but pivot_root will work.

Also, you can do this: create a file named, say, "preinit" and put it
in the initramfs. Put the code above in this file, followed by
"exec /init". Of course, "preinit" could be written in shell or C or
any other language. Add "rdinit=/preinit" to the kernel command line.
The kernel will then execute preinit first; preinit will duplicate the
initramfs mount and then execute the initramfs's actual init.
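
As a rough illustration of the C variant, here is an untested sketch of
such a preinit. The function names duplicate_root_mount() and preinit()
and the choice of "/root" as the scratch directory are made up for this
example; the system calls simply mirror the shell commands above one to
one.

```c
/* preinit.c: hypothetical sketch, not tested on a real initramfs. */
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stdio.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Bind-mount "/" onto dir, then move that second mount over "/" and
 * chroot into it. Returns 0 on success, -1 on the first failing step
 * (errno is left as set by that step).
 */
int duplicate_root_mount(const char *dir)
{
	if (mkdir(dir, 0700) != 0 && errno != EEXIST)	/* mkdir root */
		return -1;
	if (mount("/", dir, NULL, MS_BIND, NULL) != 0)	/* mount --bind / root */
		return -1;
	if (chdir(dir) != 0)				/* cd root */
		return -1;
	if (mount(".", "/", NULL, MS_MOVE, NULL) != 0)	/* mount --move . / */
		return -1;
	if (chroot(".") != 0)				/* chroot . */
		return -1;
	return chdir("/");
}

/*
 * Duplicate the root mount, then hand over to the real init.
 * Only returns if exec fails.
 */
int preinit(char *const argv[], char *const envp[])
{
	if (duplicate_root_mount("/root") != 0)
		perror("preinit: duplicating root mount failed");

	execve("/init", argv, envp);			/* exec /init */
	perror("preinit: exec /init");
	return 1;
}
```

A real preinit would call preinit(argv, envp) from its main() and would
need to be statically linked, or have its libraries present in the
initramfs, since it runs before anything else.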

I haven't tested this, but I'm nearly sure it will work. If you want, I
can test it.

You can also put "rdinit=/preinit" in CONFIG_CMDLINE. As far as I
understand, CONFIG_CMDLINE will be merged with the command line
provided by the bootloader, but I'm not sure. You can also link a small
initramfs containing /preinit into the kernel image. Again, as far as I
understand, it will be merged with the initramfs provided by the
bootloader. Thus, a kernel with CONFIG_CMDLINE and a linked-in
initramfs containing /preinit will behave very similarly to a kernel
with your patch. Of course, the initial mount will not be empty, but I
think this is a minor point.

Of course, instead of that "mount --bind" trick you can do the same
with the "cp + rm" solution.

Also, I remember seeing a patch similar to yours on the Linux mailing
lists. It was rejected, too. If you want, I can try to find it.

I suggest the solution described above. But let me also offer some
alternatives. You could implement a patch similar to yours, but one
that works unconditionally. This would solve the problem once and for
all. Ideally the initial mount would be a nullfs, as suggested by Rob,
i.e. a filesystem that has no state at all, to make sure containers
cannot exchange data. I think such a filesystem is easy to create: all
operations would be no-ops. Look at the "fs" directory in the kernel
tree and write something similar.

Another way is to make pivot_root work with the initial mount. I think
this will be hard.

Note that Rob Landley is absent from the MAINTAINERS file (
https://elixir.bootlin.com/linux/latest/source/MAINTAINERS ), so he
doesn't decide whether a patch will be accepted. (I don't decide
either. I'm not a kernel developer; I just happened upon this thread
and decided to answer.)

If you ever want to start a new discussion or send a new patch, please
send it to LKML, not to initramfs@vger.kernel.org . As you can see
here: https://lore.kernel.org/initramfs/ , new discussions are started
only about once a month at initramfs@vger.kernel.org , so I think very
few people will see your message.

Feel free to ask me any questions.

-- 
Askar Safin


