linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Menglong Dong <menglong8.dong@gmail.com>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: Jan Kara <jack@suse.cz>, Jens Axboe <axboe@kernel.dk>,
	hare@suse.de, gregkh@linuxfoundation.org, tj@kernel.org,
	Menglong Dong <dong.menglong@zte.com.cn>,
	song@kernel.org, NeilBrown <neilb@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	wangkefeng.wang@huawei.com, f.fainelli@gmail.com, arnd@arndb.de,
	Barret Rhoden <brho@google.com>,
	Rasmus Villemoes <linux@rasmusvillemoes.dk>,
	mhiramat@kernel.org, Steven Rostedt <rostedt@goodmis.org>,
	Kees Cook <keescook@chromium.org>,
	vbabka@suse.cz, Alexander Potapenko <glider@google.com>,
	pmladek@suse.com, Chris Down <chris@chrisdown.name>,
	ebiederm@xmission.com, jojing64@gmail.com,
	LKML <linux-kernel@vger.kernel.org>,
	palmerdabbelt@google.com, linux-fsdevel@vger.kernel.org,
	Alexander Viro <viro@zeniv.linux.org.uk>
Subject: Re: [PATCH RESEND] init/initramfs.c: make initramfs support pivot_root
Date: Sat, 22 May 2021 12:09:30 +0800	[thread overview]
Message-ID: <CADxym3Z7bdEJECEejPqg-15ycghgX3ZEmOGWYwxZ1_HPWLU1NA@mail.gmail.com> (raw)
In-Reply-To: <20210521155020.GW4332@42.do-not-panic.com>

On Fri, May 21, 2021 at 11:50 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> > That's a solution, but I don't think it is feasible. Users may create many
> > containers, and you can't make docker create all the containers first
> > and create network namespace later, as you don't know if there are any
> > containers to create later.
>
> It doesn't seem impossible, but worth noting why inside the commit log
> this was not a preferred option.
>

In fact, the network namespace is just a case for the problem that the
'mount leak' caused. And this kind modification is not friendly to
current docker users, it makes great changes to the usage of docker.

>
> We still have:
>
> start_kernel() --> vfs_caches_init() --> mnt_init() -->
>
> mnt_init()
> {
>         ...
>         shmem_init();
>         init_rootfs();
>         init_mount_tree();
> }
>
> You've now modified init_rootfs() to essentially just set the new user_root,
> and that's it. But we stil call init_mount_tree() even if we did set the
> rootfs to use tmpfs.

The variate of 'is_tmpfs' is only used in 'rootfs_init_fs_context'. I used
ramfs_init_fs_context directly for rootfs, so it is not needed any more
and I just removed it in init_rootfs().

The initialization of 'user_root' in init_rootfs() is used in:
do_populate_rootfs -> mount_user_root, which set the file system(
ramfs or tmpfs) of the second mount.

Seems it's not suitable to place it in init_rootfs()......


> > In do_populate_ro
> > tmpfs, and that's the real rootfs for initramfs. And I call this root
> > as 'user_root',
> > because it is created for user space.
> >
> > int __init mount_user_root(void)
> > {
> >        return do_mount_root(user_root->dev_name,
> >                             user_root->fs_name,
> >                             root_mountflags,
> >                             root_mount_data);
> >  }
> >
> > In other words, I moved the realization of 'rootfs_fs_type' here to
> > do_populate_rootfs(), and fixed this 'rootfs_fs_type' with
> > ramfs_init_fs_context, as it is a fake root now.
>
> do_populate_rootfs() is called from populate_rootfs() and that in turn
> is a:
>
> rootfs_initcall(populate_rootfs);
>
> In fact the latest changes have made this to schedule asynchronously as
> well. And so indeed, init_mount_tree() always kicks off first. So its
> still unclear to me why the first mount now always has a fs context of
> ramfs_init_fs_context, even if we did not care for a ramdisk.
>
> Are you suggesting it can be arbitrary now?

With the existence of the new user_root, the first mount is not directly used
any more, so the filesystem type of it doesn't  matter.

So it makes no sense to make the file system of the first mount selectable.
To simplify the code here, I make it ramfs_init_fs_context directly. In fact,
it's fine to make it shmen_init_fs_context here too.

> > Now, the rootfs that user space used is separated with the init_task,
> > and that's exactly what a block root file system does.
>
> The secondary effort is a bit clearer, its the earlier part that is not
> so clear yet to me at least.
>
> Regardless, to help make the changes easier to review, I wonder if it
> makes sense to split up your work into a few patches. First do what you
> have done for init_rootfs() and the structure you added to replace the
> is_tmpfs bool, and let initialization use it and the context. And then
> as a second patch introduce the second mount effort.

I thinks it is a good idea. I will reorganize the code and split it into several
patches.

Thanks!
Menglong Dong

  reply	other threads:[~2021-05-22  4:09 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-20 15:42 [PATCH RESEND] init/initramfs.c: make initramfs support pivot_root menglong8.dong
2021-05-20 21:41 ` Luis Chamberlain
2021-05-21  0:41   ` Menglong Dong
2021-05-21 15:50     ` Luis Chamberlain
2021-05-22  4:09       ` Menglong Dong [this message]
2021-05-24 22:58         ` Luis Chamberlain
2021-05-25  0:55           ` Menglong Dong
2021-05-25  1:43             ` Luis Chamberlain
2021-05-25  6:09               ` Menglong Dong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CADxym3Z7bdEJECEejPqg-15ycghgX3ZEmOGWYwxZ1_HPWLU1NA@mail.gmail.com \
    --to=menglong8.dong@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=axboe@kernel.dk \
    --cc=brho@google.com \
    --cc=chris@chrisdown.name \
    --cc=dong.menglong@zte.com.cn \
    --cc=ebiederm@xmission.com \
    --cc=f.fainelli@gmail.com \
    --cc=glider@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hare@suse.de \
    --cc=jack@suse.cz \
    --cc=jojing64@gmail.com \
    --cc=keescook@chromium.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@rasmusvillemoes.dk \
    --cc=mcgrof@kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=neilb@suse.de \
    --cc=palmerdabbelt@google.com \
    --cc=pmladek@suse.com \
    --cc=rostedt@goodmis.org \
    --cc=song@kernel.org \
    --cc=tj@kernel.org \
    --cc=vbabka@suse.cz \
    --cc=viro@zeniv.linux.org.uk \
    --cc=wangkefeng.wang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).