All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/3] init/initramfs.c: make initramfs support pivot_root
@ 2021-06-02 14:46 menglong8.dong
  2021-06-02 14:46 ` [PATCH v4 1/3] init/main.c: introduce function ramdisk_exec_exist() menglong8.dong
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: menglong8.dong @ 2021-06-02 14:46 UTC (permalink / raw)
  To: christian.brauner
  Cc: viro, keescook, samitolvanen, johan, ojeda, akpm, dong.menglong,
	masahiroy, joe, hare, axboe, jack, tj, gregkh, song, neilb, brho,
	mcgrof, palmerdabbelt, arnd, f.fainelli, linux, wangkefeng.wang,
	mhiramat, rostedt, vbabka, pmladek, glider, chris, ebiederm,
	jojing64, mingo, terrelln, geert, linux-fsdevel, linux-kernel,
	jeyu, bhelgaas, josh

From: Menglong Dong <dong.menglong@zte.com.cn>

As Luis Chamberlain suggested, I split the patch:
[init/initramfs.c: make initramfs support pivot_root]
(https://lore.kernel.org/linux-fsdevel/20210520154244.20209-1-dong.menglong@zte.com.cn/)
into three.

The goal of the series patches is to make pivot_root() support initramfs.

In the first patch, I introduce the function ramdisk_exec_exist(), which
is used to check the exist of 'ramdisk_execute_command' in LOOKUP_DOWN
lookup mode.

In the second patch, I create a second mount, which is called
'user root', and make it become the root. Therefore, the root has a
parent mount, and it can be umounted or pivot_root.

In the third patch, I fix rootfs_fs_type with ramfs, as it is not used
directly any more, and it make no sense to switch it between ramfs and
tmpfs, just fix it with ramfs to simplify the code.

Changes since V3:

Do a code cleanup for the second patch, as Christian Brauner suggested:
- remove the concept 'user root', which seems not suitable.
- introduce inline function 'check_tmpfs_enabled()' to avoid duplicated
  code.
- rename function 'mount_user_root' to 'prepare_mount_rootfs'
- rename function 'end_mount_user_root' to 'finish_mount_rootfs'
- join 'init_user_rootfs()' with 'prepare_mount_rootfs()'

Changes since V2:

In the first patch, I use vfs_path_lookup() in init_eaccess() to make the
path lookup follow the mount on '/'. After this, the problem reported by
Masami Hiramatsu is solved. Thanks for your report :/


Changes since V1:

In the first patch, I add the flag LOOKUP_DOWN to init_eaccess(), to make
it support the check of filesystem mounted on '/'.

In the second patch, I control 'user root' with kconfig option
'CONFIG_INITRAMFS_USER_ROOT', and add some comments, as Luis Chamberlain
suggested.

In the third patch, I make 'rootfs_fs_type' in control of
'CONFIG_INITRAMFS_USER_ROOT'.



Menglong Dong (3):
  init/main.c: introduce function ramdisk_exec_exist()
  init/do_mounts.c: create second mount for initramfs
  init/do_mounts.c: fix rootfs_fs_type with ramfs

 fs/init.c            |  11 ++++-
 include/linux/init.h |   4 ++
 init/do_mounts.c     | 101 ++++++++++++++++++++++++++++++++++++++++---
 init/do_mounts.h     |  16 ++++++-
 init/initramfs.c     |   8 ++++
 init/main.c          |   7 ++-
 usr/Kconfig          |  10 +++++
 7 files changed, 146 insertions(+), 11 deletions(-)

-- 
2.32.0.rc0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v4 1/3] init/main.c: introduce function ramdisk_exec_exist()
  2021-06-02 14:46 [PATCH v4 0/3] init/initramfs.c: make initramfs support pivot_root menglong8.dong
@ 2021-06-02 14:46 ` menglong8.dong
  2021-06-02 14:46 ` [PATCH v4 2/3] init/do_mounts.c: create second mount for initramfs menglong8.dong
  2021-06-02 14:46 ` [PATCH v4 3/3] init/do_mounts.c: fix rootfs_fs_type with ramfs menglong8.dong
  2 siblings, 0 replies; 7+ messages in thread
From: menglong8.dong @ 2021-06-02 14:46 UTC (permalink / raw)
  To: christian.brauner
  Cc: viro, keescook, samitolvanen, johan, ojeda, akpm, dong.menglong,
	masahiroy, joe, hare, axboe, jack, tj, gregkh, song, neilb, brho,
	mcgrof, palmerdabbelt, arnd, f.fainelli, linux, wangkefeng.wang,
	mhiramat, rostedt, vbabka, pmladek, glider, chris, ebiederm,
	jojing64, mingo, terrelln, geert, linux-fsdevel, linux-kernel,
	jeyu, bhelgaas, josh

From: Menglong Dong <dong.menglong@zte.com.cn>

Introduce the function ramdisk_exec_exist, which is used to check the
exist of 'ramdisk_execute_command'.

To make path lookup follow the mount on '/', use vfs_path_lookup() in
init_eaccess(), and make the filesystem that mounted on '/' as root
during path lookup.

Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
---
 fs/init.c   | 11 +++++++++--
 init/main.c |  7 ++++++-
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/fs/init.c b/fs/init.c
index 5c36adaa9b44..166356a1f15f 100644
--- a/fs/init.c
+++ b/fs/init.c
@@ -112,14 +112,21 @@ int __init init_chmod(const char *filename, umode_t mode)
 
 int __init init_eaccess(const char *filename)
 {
-	struct path path;
+	struct path path, root;
 	int error;
 
-	error = kern_path(filename, LOOKUP_FOLLOW, &path);
+	error = kern_path("/", LOOKUP_DOWN, &root);
 	if (error)
 		return error;
+	error = vfs_path_lookup(root.dentry, root.mnt, filename,
+				LOOKUP_FOLLOW, &path);
+	if (error)
+		goto on_err;
 	error = path_permission(&path, MAY_ACCESS);
+
 	path_put(&path);
+on_err:
+	path_put(&root);
 	return error;
 }
 
diff --git a/init/main.c b/init/main.c
index eb01e121d2f1..1153571ca977 100644
--- a/init/main.c
+++ b/init/main.c
@@ -1522,6 +1522,11 @@ void __init console_on_rootfs(void)
 	fput(file);
 }
 
+bool __init ramdisk_exec_exist(void)
+{
+	return init_eaccess(ramdisk_execute_command) == 0;
+}
+
 static noinline void __init kernel_init_freeable(void)
 {
 	/*
@@ -1568,7 +1573,7 @@ static noinline void __init kernel_init_freeable(void)
 	 * check if there is an early userspace init.  If yes, let it do all
 	 * the work
 	 */
-	if (init_eaccess(ramdisk_execute_command) != 0) {
+	if (!ramdisk_exec_exist()) {
 		ramdisk_execute_command = NULL;
 		prepare_namespace();
 	}
-- 
2.32.0.rc0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v4 2/3] init/do_mounts.c: create second mount for initramfs
  2021-06-02 14:46 [PATCH v4 0/3] init/initramfs.c: make initramfs support pivot_root menglong8.dong
  2021-06-02 14:46 ` [PATCH v4 1/3] init/main.c: introduce function ramdisk_exec_exist() menglong8.dong
@ 2021-06-02 14:46 ` menglong8.dong
  2021-06-03 13:30   ` Christian Brauner
  2021-06-02 14:46 ` [PATCH v4 3/3] init/do_mounts.c: fix rootfs_fs_type with ramfs menglong8.dong
  2 siblings, 1 reply; 7+ messages in thread
From: menglong8.dong @ 2021-06-02 14:46 UTC (permalink / raw)
  To: christian.brauner
  Cc: viro, keescook, samitolvanen, johan, ojeda, akpm, dong.menglong,
	masahiroy, joe, hare, axboe, jack, tj, gregkh, song, neilb, brho,
	mcgrof, palmerdabbelt, arnd, f.fainelli, linux, wangkefeng.wang,
	mhiramat, rostedt, vbabka, pmladek, glider, chris, ebiederm,
	jojing64, mingo, terrelln, geert, linux-fsdevel, linux-kernel,
	jeyu, bhelgaas, josh

From: Menglong Dong <dong.menglong@zte.com.cn>

If using container platforms such as Docker, upon initialization it
wants to use pivot_root() so that currently mounted devices do not
propagate to containers. An example of value in this is that
a USB device connected prior to the creation of a containers on the
host gets disconnected after a container is created; if the
USB device was mounted on containers, but already removed and
umounted on the host, the mount point will not go away until all
containers unmount the USB device.

Another reason for container platforms such as Docker to use pivot_root
is that upon initialization the net-namspace is mounted under
/var/run/docker/netns/ on the host by dockerd. Without pivot_root
Docker must either wait to create the network namespace prior to
the creation of containers or simply deal with leaking this to each
container.

pivot_root is supported if the rootfs is a initrd or block device, but
it's not supported if the rootfs uses an initramfs (tmpfs). This means
container platforms today must resort to using block devices if
they want to pivot_root from the rootfs. A workaround to use chroot()
is not a clean viable option given every container will have a
duplicate of every mount point on the host.

In order to support using container platforms such as Docker on
all the supported rootfs types we must extend Linux to support
pivot_root on initramfs as well. This patch does the work to do
just that.

pivot_root will unmount the mount of the rootfs from its parent mount
and mount the new root to it. However, when it comes to initramfs, it
donesn't work, because the root filesystem has not parent mount, which
makes initramfs not supported by pivot_root.

In order to make pivot_root supported on initramfs, we create a second
mount with type of rootfs before unpacking cpio, and change root to
this mount after unpacking.

While mounting the second rootfs, 'rootflags' is passed, and it means
that we can set options for the mount of rootfs in boot cmd now.
For example, the size of tmpfs can be set with 'rootflags=size=1024M'.

Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
---
 init/do_mounts.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++--
 init/do_mounts.h | 16 ++++++++-
 init/initramfs.c |  8 +++++
 usr/Kconfig      | 10 ++++++
 4 files changed, 119 insertions(+), 4 deletions(-)

diff --git a/init/do_mounts.c b/init/do_mounts.c
index a78e44ee6adb..5f82db43ac0f 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -617,6 +617,91 @@ void __init prepare_namespace(void)
 	init_chroot(".");
 }
 
+static inline __init bool check_tmpfs_enabled(void)
+{
+	return IS_ENABLED(CONFIG_TMPFS) && !saved_root_name[0] &&
+		(!root_fs_names || strstr(root_fs_names, "tmpfs"));
+}
+
+#ifdef CONFIG_INITRAMFS_MOUNT
+
+static __init bool is_ramfs_enabled(void)
+{
+	return true;
+}
+
+static __init bool is_tmpfs_enabled(void)
+{
+	return check_tmpfs_enabled();
+}
+
+struct fs_rootfs_root {
+	bool (*enabled)(void);
+	char *dev_name;
+	char *fs_name;
+};
+
+static struct fs_rootfs_root rootfs_roots[] __initdata = {
+	{
+		.enabled  = is_tmpfs_enabled,
+		.dev_name = "tmpfs",
+		.fs_name  = "tmpfs",
+	},
+	{
+		.enabled  = is_ramfs_enabled,
+		.dev_name = "ramfs",
+		.fs_name  = "ramfs"
+	}
+};
+
+/*
+ * Give systems running from the initramfs and making use of pivot_root a
+ * proper mount so it can be umounted during pivot_root.
+ */
+int __init prepare_mount_rootfs(void)
+{
+	struct fs_rootfs_root *root = NULL;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(rootfs_roots); i++) {
+		if (rootfs_roots[i].enabled()) {
+			root = &rootfs_roots[i];
+			break;
+		}
+	}
+
+	if (unlikely(!root))
+		return -EFAULT;
+
+	return do_mount_root(root->dev_name,
+			     root->fs_name,
+			     root_mountflags & ~MS_RDONLY,
+			     root_mount_data);
+}
+
+/*
+ * Change root to the new rootfs that mounted in prepare_mount_rootfs()
+ * if cpio is unpacked successfully and 'ramdisk_execute_command' exist.
+ * Otherwise, umount the new rootfs.
+ */
+void __init finish_mount_rootfs(bool success)
+{
+	if (!success)
+		goto on_fail;
+
+	init_mount(".", "/", NULL, MS_MOVE, NULL);
+	if (!ramdisk_exec_exist())
+		goto on_fail;
+
+	init_chroot(".");
+	return;
+
+on_fail:
+	init_chdir("/");
+	init_umount(".", 0);
+}
+#endif
+
 static bool is_tmpfs;
 static int rootfs_init_fs_context(struct fs_context *fc)
 {
@@ -634,7 +719,5 @@ struct file_system_type rootfs_fs_type = {
 
 void __init init_rootfs(void)
 {
-	if (IS_ENABLED(CONFIG_TMPFS) && !saved_root_name[0] &&
-		(!root_fs_names || strstr(root_fs_names, "tmpfs")))
-		is_tmpfs = true;
+	is_tmpfs = check_tmpfs_enabled();
 }
diff --git a/init/do_mounts.h b/init/do_mounts.h
index 7a29ac3e427b..6a878c09a801 100644
--- a/init/do_mounts.h
+++ b/init/do_mounts.h
@@ -10,9 +10,23 @@
 #include <linux/root_dev.h>
 #include <linux/init_syscalls.h>
 
+extern int root_mountflags;
+
 void  mount_block_root(char *name, int flags);
 void  mount_root(void);
-extern int root_mountflags;
+bool  ramdisk_exec_exist(void);
+
+#ifdef CONFIG_INITRAMFS_MOUNT
+
+int   prepare_mount_rootfs(void);
+void  finish_mount_rootfs(bool success);
+
+#else
+
+static inline int   prepare_mount_rootfs(void) { return 0; }
+static inline void  finish_mount_rootfs(bool success) { }
+
+#endif
 
 static inline __init int create_dev(char *name, dev_t dev)
 {
diff --git a/init/initramfs.c b/init/initramfs.c
index af27abc59643..59c8e54bebad 100644
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -16,6 +16,8 @@
 #include <linux/namei.h>
 #include <linux/init_syscalls.h>
 
+#include "do_mounts.h"
+
 static ssize_t __init xwrite(struct file *file, const char *p, size_t count,
 		loff_t *pos)
 {
@@ -682,15 +684,21 @@ static void __init do_populate_rootfs(void *unused, async_cookie_t cookie)
 	else
 		printk(KERN_INFO "Unpacking initramfs...\n");
 
+	if (prepare_mount_rootfs())
+		panic("Failed to mount rootfs");
+
 	err = unpack_to_rootfs((char *)initrd_start, initrd_end - initrd_start);
 	if (err) {
+		finish_mount_rootfs(false);
 #ifdef CONFIG_BLK_DEV_RAM
 		populate_initrd_image(err);
 #else
 		printk(KERN_EMERG "Initramfs unpacking failed: %s\n", err);
 #endif
+		goto done;
 	}
 
+	finish_mount_rootfs(true);
 done:
 	/*
 	 * If the initrd region is overlapped with crashkernel reserved region,
diff --git a/usr/Kconfig b/usr/Kconfig
index 8bbcf699fe3b..4f6ac12eafe9 100644
--- a/usr/Kconfig
+++ b/usr/Kconfig
@@ -52,6 +52,16 @@ config INITRAMFS_ROOT_GID
 
 	  If you are not sure, leave it set to "0".
 
+config INITRAMFS_MOUNT
+	bool "Create second mount to make pivot_root() supported"
+	default y
+	help
+	  Before unpacking cpio, create a second mount and make it become
+	  the root filesystem. Therefore, initramfs will be supported by
+	  pivot_root().
+
+	  If container platforms is used with initramfs, say Y.
+
 config RD_GZIP
 	bool "Support initial ramdisk/ramfs compressed using gzip"
 	default y
-- 
2.32.0.rc0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v4 3/3] init/do_mounts.c: fix rootfs_fs_type with ramfs
  2021-06-02 14:46 [PATCH v4 0/3] init/initramfs.c: make initramfs support pivot_root menglong8.dong
  2021-06-02 14:46 ` [PATCH v4 1/3] init/main.c: introduce function ramdisk_exec_exist() menglong8.dong
  2021-06-02 14:46 ` [PATCH v4 2/3] init/do_mounts.c: create second mount for initramfs menglong8.dong
@ 2021-06-02 14:46 ` menglong8.dong
  2 siblings, 0 replies; 7+ messages in thread
From: menglong8.dong @ 2021-06-02 14:46 UTC (permalink / raw)
  To: christian.brauner
  Cc: viro, keescook, samitolvanen, johan, ojeda, akpm, dong.menglong,
	masahiroy, joe, hare, axboe, jack, tj, gregkh, song, neilb, brho,
	mcgrof, palmerdabbelt, arnd, f.fainelli, linux, wangkefeng.wang,
	mhiramat, rostedt, vbabka, pmladek, glider, chris, ebiederm,
	jojing64, mingo, terrelln, geert, linux-fsdevel, linux-kernel,
	jeyu, bhelgaas, josh

From: Menglong Dong <dong.menglong@zte.com.cn>

As for the existence of second mount which is introduced in previous
patch, 'rootfs_fs_type', which is used as the root of mount tree, is
not used directly any more. So it make no sense to make it tmpfs
while 'CONFIG_INITRAMFS_MOUNT' is enabled.

Make 'rootfs_fs_type' ramfs when 'CONFIG_INITRAMFS_MOUNT' enabled.

Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
---
 include/linux/init.h |  4 ++++
 init/do_mounts.c     | 16 ++++++++++------
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/include/linux/init.h b/include/linux/init.h
index 045ad1650ed1..45ab6970851f 100644
--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -148,7 +148,11 @@ extern unsigned int reset_devices;
 /* used by init/main.c */
 void setup_arch(char **);
 void prepare_namespace(void);
+#ifndef CONFIG_INITRAMFS_MOUNT
 void __init init_rootfs(void);
+#else
+static inline void __init init_rootfs(void) { }
+#endif
 extern struct file_system_type rootfs_fs_type;
 
 #if defined(CONFIG_STRICT_KERNEL_RWX) || defined(CONFIG_STRICT_MODULE_RWX)
diff --git a/init/do_mounts.c b/init/do_mounts.c
index 5f82db43ac0f..fcdc849a102a 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -700,7 +700,10 @@ void __init finish_mount_rootfs(bool success)
 	init_chdir("/");
 	init_umount(".", 0);
 }
-#endif
+
+#define rootfs_init_fs_context ramfs_init_fs_context
+
+#else
 
 static bool is_tmpfs;
 static int rootfs_init_fs_context(struct fs_context *fc)
@@ -711,13 +714,14 @@ static int rootfs_init_fs_context(struct fs_context *fc)
 	return ramfs_init_fs_context(fc);
 }
 
+void __init init_rootfs(void)
+{
+	is_tmpfs = check_tmpfs_enabled();
+}
+#endif
+
 struct file_system_type rootfs_fs_type = {
 	.name		= "rootfs",
 	.init_fs_context = rootfs_init_fs_context,
 	.kill_sb	= kill_litter_super,
 };
-
-void __init init_rootfs(void)
-{
-	is_tmpfs = check_tmpfs_enabled();
-}
-- 
2.32.0.rc0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 2/3] init/do_mounts.c: create second mount for initramfs
  2021-06-02 14:46 ` [PATCH v4 2/3] init/do_mounts.c: create second mount for initramfs menglong8.dong
@ 2021-06-03 13:30   ` Christian Brauner
  2021-06-03 15:05     ` Menglong Dong
  0 siblings, 1 reply; 7+ messages in thread
From: Christian Brauner @ 2021-06-03 13:30 UTC (permalink / raw)
  To: menglong8.dong
  Cc: viro, keescook, samitolvanen, johan, ojeda, akpm, dong.menglong,
	masahiroy, joe, hare, axboe, jack, tj, gregkh, song, neilb, brho,
	mcgrof, palmerdabbelt, arnd, f.fainelli, linux, wangkefeng.wang,
	mhiramat, rostedt, vbabka, pmladek, glider, chris, ebiederm,
	jojing64, mingo, terrelln, geert, linux-fsdevel, linux-kernel,
	jeyu, bhelgaas, josh

On Wed, Jun 02, 2021 at 10:46:29PM +0800, menglong8.dong@gmail.com wrote:
> From: Menglong Dong <dong.menglong@zte.com.cn>
> 
> If using container platforms such as Docker, upon initialization it
> wants to use pivot_root() so that currently mounted devices do not
> propagate to containers. An example of value in this is that
> a USB device connected prior to the creation of a containers on the
> host gets disconnected after a container is created; if the
> USB device was mounted on containers, but already removed and
> umounted on the host, the mount point will not go away until all
> containers unmount the USB device.
> 
> Another reason for container platforms such as Docker to use pivot_root
> is that upon initialization the net-namspace is mounted under
> /var/run/docker/netns/ on the host by dockerd. Without pivot_root
> Docker must either wait to create the network namespace prior to
> the creation of containers or simply deal with leaking this to each
> container.
> 
> pivot_root is supported if the rootfs is a initrd or block device, but
> it's not supported if the rootfs uses an initramfs (tmpfs). This means
> container platforms today must resort to using block devices if
> they want to pivot_root from the rootfs. A workaround to use chroot()
> is not a clean viable option given every container will have a
> duplicate of every mount point on the host.
> 
> In order to support using container platforms such as Docker on
> all the supported rootfs types we must extend Linux to support
> pivot_root on initramfs as well. This patch does the work to do
> just that.
> 
> pivot_root will unmount the mount of the rootfs from its parent mount
> and mount the new root to it. However, when it comes to initramfs, it
> donesn't work, because the root filesystem has not parent mount, which
> makes initramfs not supported by pivot_root.
> 
> In order to make pivot_root supported on initramfs, we create a second
> mount with type of rootfs before unpacking cpio, and change root to
> this mount after unpacking.
> 
> While mounting the second rootfs, 'rootflags' is passed, and it means
> that we can set options for the mount of rootfs in boot cmd now.
> For example, the size of tmpfs can be set with 'rootflags=size=1024M'.
> 
> Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
> ---

Thanks for the reworked version a few more comments.

>  init/do_mounts.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++--
>  init/do_mounts.h | 16 ++++++++-
>  init/initramfs.c |  8 +++++
>  usr/Kconfig      | 10 ++++++
>  4 files changed, 119 insertions(+), 4 deletions(-)
> 
> diff --git a/init/do_mounts.c b/init/do_mounts.c
> index a78e44ee6adb..5f82db43ac0f 100644
> --- a/init/do_mounts.c
> +++ b/init/do_mounts.c
> @@ -617,6 +617,91 @@ void __init prepare_namespace(void)
>  	init_chroot(".");
>  }
>  
> +static inline __init bool check_tmpfs_enabled(void)
> +{
> +	return IS_ENABLED(CONFIG_TMPFS) && !saved_root_name[0] &&
> +		(!root_fs_names || strstr(root_fs_names, "tmpfs"));
> +}
> +
> +#ifdef CONFIG_INITRAMFS_MOUNT
> +
> +static __init bool is_ramfs_enabled(void)
> +{
> +	return true;
> +}
> +
> +static __init bool is_tmpfs_enabled(void)
> +{
> +	return check_tmpfs_enabled();
> +}

This seems unnecessary. You can just introduce rename
check_tmpfs_enabled() to is_tmpfs_enabled() and get rid of this wrapper
around it.

> +
> +struct fs_rootfs_root {
> +	bool (*enabled)(void);
> +	char *dev_name;
> +	char *fs_name;
> +};
> +
> +static struct fs_rootfs_root rootfs_roots[] __initdata = {
> +	{
> +		.enabled  = is_tmpfs_enabled,
> +		.dev_name = "tmpfs",
> +		.fs_name  = "tmpfs",
> +	},
> +	{
> +		.enabled  = is_ramfs_enabled,
> +		.dev_name = "ramfs",
> +		.fs_name  = "ramfs"
> +	}
> +};
> +
> +/*
> + * Give systems running from the initramfs and making use of pivot_root a
> + * proper mount so it can be umounted during pivot_root.
> + */
> +int __init prepare_mount_rootfs(void)
> +{
> +	struct fs_rootfs_root *root = NULL;
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(rootfs_roots); i++) {
> +		if (rootfs_roots[i].enabled()) {
> +			root = &rootfs_roots[i];
> +			break;
> +		}
> +	}
> +
> +	if (unlikely(!root))
> +		return -EFAULT;

The errno value is a weird choice. But in any case the check seems
unnecessary since is_ramfs_enabled() always returns true so is always
available apparently.

In fact you seem to be only using this struct you're introducing in this
single place which makes me think that it's not needed at all. So what's
preventing us from doing:

> +
> +	return do_mount_root(root->dev_name,
> +			     root->fs_name,
> +			     root_mountflags & ~MS_RDONLY,
> +			     root_mount_data);
> +}

int __init prepare_mount_rootfs(void)
{
	if (is_tmpfs_enabled())
		return do_mount_root("tmpfs", "tmpfs",
				     root_mountflags & ~MS_RDONLY,
				     root_mount_data);

	return do_mount_root("ramfs", "ramfs",
			     root_mountflags & ~MS_RDONLY,
			     root_mount_data);
}

> +
> +/*
> + * Change root to the new rootfs that mounted in prepare_mount_rootfs()
> + * if cpio is unpacked successfully and 'ramdisk_execute_command' exist.
> + * Otherwise, umount the new rootfs.
> + */
> +void __init finish_mount_rootfs(bool success)
> +{
> +	if (!success)
> +		goto on_fail;
> +
> +	init_mount(".", "/", NULL, MS_MOVE, NULL);
> +	if (!ramdisk_exec_exist())
> +		goto on_fail;
> +
> +	init_chroot(".");
> +	return;
> +
> +on_fail:
> +	init_chdir("/");
> +	init_umount(".", 0);
> +}
> +#endif

This is convoluted imho. I would simply use two tiny helpers:

void __init finish_mount_rootfs(void)
{
	init_mount(".", "/", NULL, MS_MOVE, NULL);

	if (ramdisk_exec_exist())
		init_chroot(".");
}

void __init revert_mount_rootfs(void)
{
	init_chdir("/");
	init_umount(".", 0);
}

> +
>  static bool is_tmpfs;
>  static int rootfs_init_fs_context(struct fs_context *fc)
>  {
> @@ -634,7 +719,5 @@ struct file_system_type rootfs_fs_type = {
>  
>  void __init init_rootfs(void)
>  {
> -	if (IS_ENABLED(CONFIG_TMPFS) && !saved_root_name[0] &&
> -		(!root_fs_names || strstr(root_fs_names, "tmpfs")))
> -		is_tmpfs = true;
> +	is_tmpfs = check_tmpfs_enabled();
>  }
> diff --git a/init/do_mounts.h b/init/do_mounts.h
> index 7a29ac3e427b..6a878c09a801 100644
> --- a/init/do_mounts.h
> +++ b/init/do_mounts.h
> @@ -10,9 +10,23 @@
>  #include <linux/root_dev.h>
>  #include <linux/init_syscalls.h>
>  
> +extern int root_mountflags;
> +
>  void  mount_block_root(char *name, int flags);
>  void  mount_root(void);
> -extern int root_mountflags;
> +bool  ramdisk_exec_exist(void);

Feels like that declaration belongs with the previous commit?

> +
> +#ifdef CONFIG_INITRAMFS_MOUNT
> +
> +int   prepare_mount_rootfs(void);
> +void  finish_mount_rootfs(bool success);
> +
> +#else
> +
> +static inline int   prepare_mount_rootfs(void) { return 0; }
> +static inline void  finish_mount_rootfs(bool success) { }
> +
> +#endif
>  
>  static inline __init int create_dev(char *name, dev_t dev)
>  {
> diff --git a/init/initramfs.c b/init/initramfs.c
> index af27abc59643..59c8e54bebad 100644
> --- a/init/initramfs.c
> +++ b/init/initramfs.c
> @@ -16,6 +16,8 @@
>  #include <linux/namei.h>
>  #include <linux/init_syscalls.h>
>  
> +#include "do_mounts.h"
> +
>  static ssize_t __init xwrite(struct file *file, const char *p, size_t count,
>  		loff_t *pos)
>  {
> @@ -682,15 +684,21 @@ static void __init do_populate_rootfs(void *unused, async_cookie_t cookie)
>  	else
>  		printk(KERN_INFO "Unpacking initramfs...\n");
>  
> +	if (prepare_mount_rootfs())
> +		panic("Failed to mount rootfs");
> +
>  	err = unpack_to_rootfs((char *)initrd_start, initrd_end - initrd_start);
>  	if (err) {
> +		finish_mount_rootfs(false);

This then becomes
		revert_mount_rootfs();

>  #ifdef CONFIG_BLK_DEV_RAM
>  		populate_initrd_image(err);
>  #else
>  		printk(KERN_EMERG "Initramfs unpacking failed: %s\n", err);
>  #endif
> +		goto done;

This becomes unnecessary if you do:

>  	} else {
		finish_mount_rootfs(true);
	}

So overall you'd end up with something like this which should be way
shorter (__completely untested__):

diff --git a/init/do_mounts.c b/init/do_mounts.c
index a78e44ee6adb..e6c82c7c931f 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -459,7 +459,7 @@ void __init mount_block_root(char *name, int flags)
 out:
 	put_page(page);
 }
  
 #ifdef CONFIG_ROOT_NFS
 
 #define NFSROOT_TIMEOUT_MIN	5
@@ -617,6 +617,50 @@ void __init prepare_namespace(void)
 	init_chroot(".");
 }
 
+static inline __init bool is_tmpfs_enabled(void)
+{
+	return IS_ENABLED(CONFIG_TMPFS) && !saved_root_name[0] &&
+		(!root_fs_names || strstr(root_fs_names, "tmpfs"));
+}
+
+#ifdef CONFIG_INITRAMFS_MOUNT
+
+/*
+ * Give systems running from the initramfs and making use of pivot_root a
+ * proper mount so it can be umounted during pivot_root.
+ */
+int __init prepare_mount_rootfs(void)
+{
+	if (is_tmpfs_enabled())
+		return do_mount_root("tmpfs", "tmpfs",
+				     root_mountflags & ~MS_RDONLY,
+				     root_mount_data);
+
+	return do_mount_root("ramfs", "ramfs",
+			     root_mountflags & ~MS_RDONLY,
+			     root_mount_data);
+}
+
+/*
+ * Change root to the new rootfs that mounted in prepare_mount_rootfs()
+ * if cpio is unpacked successfully and 'ramdisk_execute_command' exist.
+ * Otherwise, umount the new rootfs.
+ */
+void __init finish_mount_rootfs(void)
+{
+	init_mount(".", "/", NULL, MS_MOVE, NULL);
+
+	if (ramdisk_exec_exist())
+		init_chroot(".");
+}
+
+void __init revert_mount_rootfs(void)
+{
+	init_chdir("/");
+	init_umount(".", 0);
+}
+#endif
+
 static bool is_tmpfs;
 static int rootfs_init_fs_context(struct fs_context *fc)
 {
@@ -634,7 +678,5 @@ struct file_system_type rootfs_fs_type = {
 
 void __init init_rootfs(void)
 {
-	if (IS_ENABLED(CONFIG_TMPFS) && !saved_root_name[0] &&
-		(!root_fs_names || strstr(root_fs_names, "tmpfs")))
-		is_tmpfs = true;
+	is_tmpfs = is_tmpfs_enabled();
 }
diff --git a/init/do_mounts.h b/init/do_mounts.h
index 7a29ac3e427b..1724fd072dae 100644
--- a/init/do_mounts.h
+++ b/init/do_mounts.h
@@ -10,9 +10,25 @@
 #include <linux/root_dev.h>
 #include <linux/init_syscalls.h>
 
+extern int root_mountflags;
+
 void  mount_block_root(char *name, int flags);
 void  mount_root(void);
-extern int root_mountflags;
+bool  ramdisk_exec_exist(void);
+
+#ifdef CONFIG_INITRAMFS_MOUNT
+
+int   prepare_mount_rootfs(void);
+void  finish_mount_rootfs(void);
+void __init revert_mount_rootfs(void);
+
+#else
+
+static inline int   prepare_mount_rootfs(void) { return 0; }
+static inline void  finish_mount_rootfs() { }
+static inline void  revert_mount_rootfs() { }
+
+#endif
 
 static inline __init int create_dev(char *name, dev_t dev)
 {
diff --git a/init/initramfs.c b/init/initramfs.c
index af27abc59643..f49d29de4fd9 100644
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -16,6 +16,8 @@
 #include <linux/namei.h>
 #include <linux/init_syscalls.h>
 
+#include "do_mounts.h"
+
 static ssize_t __init xwrite(struct file *file, const char *p, size_t count,
 		loff_t *pos)
 {
@@ -682,15 +684,20 @@ static void __init do_populate_rootfs(void *unused, async_cookie_t cookie)
 	else
 		printk(KERN_INFO "Unpacking initramfs...\n");
 
+	if (prepare_mount_rootfs())
+		panic("Failed to mount rootfs");
+
 	err = unpack_to_rootfs((char *)initrd_start, initrd_end - initrd_start);
 	if (err) {
+		revert_mount_rootfs();
 #ifdef CONFIG_BLK_DEV_RAM
 		populate_initrd_image(err);
 #else
 		printk(KERN_EMERG "Initramfs unpacking failed: %s\n", err);
 #endif
+	} else {
+		finish_mount_rootfs();
 	}
-
 done:
 	/*
 	 * If the initrd region is overlapped with crashkernel reserved region,
diff --git a/usr/Kconfig b/usr/Kconfig
index 8bbcf699fe3b..4f6ac12eafe9 100644
--- a/usr/Kconfig
+++ b/usr/Kconfig
@@ -52,6 +52,16 @@ config INITRAMFS_ROOT_GID
 
 	  If you are not sure, leave it set to "0".
 
+config INITRAMFS_MOUNT
+	bool "Create second mount to make pivot_root() supported"
+	default y
+	help
+	  Before unpacking cpio, create a second mount and make it become
+	  the root filesystem. Therefore, initramfs will be supported by
+	  pivot_root().
+
+	  If container platforms is used with initramfs, say Y.
+
 config RD_GZIP
 	bool "Support initial ramdisk/ramfs compressed using gzip"
 	default y
-- 
2.27.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 2/3] init/do_mounts.c: create second mount for initramfs
  2021-06-03 13:30   ` Christian Brauner
@ 2021-06-03 15:05     ` Menglong Dong
  2021-06-04  9:59       ` Christian Brauner
  0 siblings, 1 reply; 7+ messages in thread
From: Menglong Dong @ 2021-06-03 15:05 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Alexander Viro, Kees Cook, Sami Tolvanen, johan, ojeda,
	Andrew Morton, Menglong Dong, masahiroy, joe, hare, Jens Axboe,
	Jan Kara, tj, gregkh, song, NeilBrown, Barret Rhoden,
	Luis Chamberlain, palmerdabbelt, arnd, f.fainelli,
	Rasmus Villemoes, wangkefeng.wang, Masami Hiramatsu,
	Steven Rostedt, vbabka, pmladek, Alexander Potapenko, Chris Down,
	Eric W. Biederman, jojing64, mingo, terrelln, geert,
	linux-fsdevel, LKML, jeyu, Bjorn Helgaas, Josh Triplett

On Thu, Jun 3, 2021 at 9:30 PM Christian Brauner
<christian.brauner@ubuntu.com> wrote:
>
[...]
>
> In fact you seem to be only using this struct you're introducing in this
> single place which makes me think that it's not needed at all. So what's
> preventing us from doing:
>
> > +
> > +     return do_mount_root(root->dev_name,
> > +                          root->fs_name,
> > +                          root_mountflags & ~MS_RDONLY,
> > +                          root_mount_data);
> > +}
>
> int __init prepare_mount_rootfs(void)
> {
>         if (is_tmpfs_enabled())
>                 return do_mount_root("tmpfs", "tmpfs",
>                                      root_mountflags & ~MS_RDONLY,
>                                      root_mount_data);
>
>         return do_mount_root("ramfs", "ramfs",
>                              root_mountflags & ~MS_RDONLY,
>                              root_mount_data);
> }

It seems to make sense, but I just feel that it is a little hardcode.
What if a new file system
of rootfs arises? Am I too sensitive?

[...]
>
> This is convoluted imho. I would simply use two tiny helpers:
>
> void __init finish_mount_rootfs(void)
> {
>         init_mount(".", "/", NULL, MS_MOVE, NULL);
>
>         if (ramdisk_exec_exist())
>                 init_chroot(".");
> }
>
> void __init revert_mount_rootfs(void)
> {
>         init_chdir("/");
>         init_umount(".", 0);
> }
>

This looks nice.


Thanks!
Menglong Dong

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 2/3] init/do_mounts.c: create second mount for initramfs
  2021-06-03 15:05     ` Menglong Dong
@ 2021-06-04  9:59       ` Christian Brauner
  0 siblings, 0 replies; 7+ messages in thread
From: Christian Brauner @ 2021-06-04  9:59 UTC (permalink / raw)
  To: Menglong Dong
  Cc: Alexander Viro, Kees Cook, Sami Tolvanen, johan, ojeda,
	Andrew Morton, Menglong Dong, masahiroy, joe, hare, Jens Axboe,
	Jan Kara, tj, gregkh, song, NeilBrown, Barret Rhoden,
	Luis Chamberlain, palmerdabbelt, arnd, f.fainelli,
	Rasmus Villemoes, wangkefeng.wang, Masami Hiramatsu,
	Steven Rostedt, vbabka, pmladek, Alexander Potapenko, Chris Down,
	Eric W. Biederman, jojing64, mingo, terrelln, geert,
	linux-fsdevel, LKML, jeyu, Bjorn Helgaas, Josh Triplett

On Thu, Jun 03, 2021 at 11:05:08PM +0800, Menglong Dong wrote:
> On Thu, Jun 3, 2021 at 9:30 PM Christian Brauner
> <christian.brauner@ubuntu.com> wrote:
> >
> [...]
> >
> > In fact you seem to be only using this struct you're introducing in this
> > single place which makes me think that it's not needed at all. So what's
> > preventing us from doing:
> >
> > > +
> > > +     return do_mount_root(root->dev_name,
> > > +                          root->fs_name,
> > > +                          root_mountflags & ~MS_RDONLY,
> > > +                          root_mount_data);
> > > +}
> >
> > int __init prepare_mount_rootfs(void)
> > {
> >         if (is_tmpfs_enabled())
> >                 return do_mount_root("tmpfs", "tmpfs",
> >                                      root_mountflags & ~MS_RDONLY,
> >                                      root_mount_data);
> >
> >         return do_mount_root("ramfs", "ramfs",
> >                              root_mountflags & ~MS_RDONLY,
> >                              root_mount_data);
> > }
> 
> It seems to make sense, but I just feel that it is a little hardcode.
> What if a new file system
> of rootfs arises? Am I too sensitive?

It'sn understandable but premature worry and I don't think it should
justify all that extra code.

Christian

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-06-04  9:59 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-02 14:46 [PATCH v4 0/3] init/initramfs.c: make initramfs support pivot_root menglong8.dong
2021-06-02 14:46 ` [PATCH v4 1/3] init/main.c: introduce function ramdisk_exec_exist() menglong8.dong
2021-06-02 14:46 ` [PATCH v4 2/3] init/do_mounts.c: create second mount for initramfs menglong8.dong
2021-06-03 13:30   ` Christian Brauner
2021-06-03 15:05     ` Menglong Dong
2021-06-04  9:59       ` Christian Brauner
2021-06-02 14:46 ` [PATCH v4 3/3] init/do_mounts.c: fix rootfs_fs_type with ramfs menglong8.dong

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.