linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Menglong Dong <menglong8.dong@gmail.com>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: Jan Kara <jack@suse.cz>, Jens Axboe <axboe@kernel.dk>,
	hare@suse.de, gregkh@linuxfoundation.org, tj@kernel.org,
	Menglong Dong <dong.menglong@zte.com.cn>,
	song@kernel.org, NeilBrown <neilb@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	wangkefeng.wang@huawei.com, f.fainelli@gmail.com, arnd@arndb.de,
	Barret Rhoden <brho@google.com>,
	Rasmus Villemoes <linux@rasmusvillemoes.dk>,
	mhiramat@kernel.org, Steven Rostedt <rostedt@goodmis.org>,
	Kees Cook <keescook@chromium.org>,
	vbabka@suse.cz, Alexander Potapenko <glider@google.com>,
	pmladek@suse.com, Chris Down <chris@chrisdown.name>,
	ebiederm@xmission.com, jojing64@gmail.com,
	LKML <linux-kernel@vger.kernel.org>,
	palmerdabbelt@google.com, linux-fsdevel@vger.kernel.org,
	Alexander Viro <viro@zeniv.linux.org.uk>
Subject: Re: [PATCH RESEND] init/initramfs.c: make initramfs support pivot_root
Date: Tue, 25 May 2021 08:55:48 +0800	[thread overview]
Message-ID: <CADxym3akKEurTTGiBxYZiXKVWUbOg=a8UeuRsJ07tT+DixA8mw@mail.gmail.com> (raw)
In-Reply-To: <20210524225827.GA4332@42.do-not-panic.com>

Hello!

On Tue, May 25, 2021 at 6:58 AM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> > In fact, the network namespace is just a case for the problem that the
> > 'mount leak' caused. And this kind modification is not friendly to
> > current docker users, it makes great changes to the usage of docker.
>
> You mean an upgrade of docker? If so... that does not seem like a
> definitive reason to do something new in the kernel *always*.

No, I means that in this solution, users have to create all the
containers first,
and create the network namespace for them later.

>
> However, if you introduce it as a kconfig option so that users
> who want to use this new feature can enable it, and then use it,
> the its sold as a new feature.
>
> Should this always be enabled, or done this way? Should we never have
> the option to revert back to the old behaviour? If not, why not?
>

This change seems transparent to users, which don't change the behavior
of initramfs. However, it seems more reasonable to make it a kconfig option.
I'll do it in the v2 of the three patches I sended.

>
> I don't see you using any context directly, where are you specifying the
> context directly?
>
> > so it is not needed any more
> > and I just removed it in init_rootfs().
> >
> > The initialization of 'user_root' in init_rootfs() is used in:
> > do_populate_rootfs -> mount_user_root, which set the file system(
> > ramfs or tmpfs) of the second mount.
> >
> > Seems it's not suitable to place it in init_rootfs()......
>
> OK I think I just need to understand how you added the context of the
> first mount explicitly now and where, as I don't see it.
>

......

>
> What do you mean? init_mount_tree() is always called, and it has
> statically:
>
> static void __init init_mount_tree(void)
> {
>         struct vfsmount *mnt;
>         ...
>         mnt = vfs_kern_mount(&rootfs_fs_type, 0, "rootfs", NULL);
>         ...
> }
>
> And as I noted, this is *always* called earlier than
> do_populate_rootfs(). Your changes did not remove the init_mount_tree()
> or modify it, and so why would the context of the above call always
> be OK to be used now with a ramfs context now?
>
> > So it makes no sense to make the file system of the first mount selectable.
>
> Why? I don't see why, nor is it explained, we're always caling
> vfs_kern_mount(&rootfs_fs_type, ...) and you have not changed that
> either.
>
> > To simplify the code here, I make it ramfs_init_fs_context directly. In fact,
> > it's fine to make it shmen_init_fs_context here too.
>
> So indeed you're suggesting its arbitrary now.... I don't see why.
>

So the biggest problem now seems to be the first mount I changed, maybe I didn't
make it clear before.

Let's call the first mount which is created in init_mount_tree() the
'init_mount'.
If the 'root' is a block fs, initrd or nfs, the 'init_mount' will be a
ramfs, that seems
clear, it can be seen from the enable of tmpfs:

void __init init_rootfs(void)
{
    if (IS_ENABLED(CONFIG_TMPFS) && !saved_root_name[0] &&
       (!root_fs_names || strstr(root_fs_names, "tmpfs")))
       is_tmpfs = true;
}

That makes sense, because the 'init_mount' will not be the root file
system, as the
block fs or initrd or nfs that mounted later on '/root' will become
the root by a call of
init_chroot().

If the 'root' is initramfs, cpio will be unpacked init 'init_mount'.
In this scene, the
'init_mount' can be ramfs or tmpfs, and it is the root file system.

In my patch, I create a second mount which is mounted on the
'init_mount', let's call
it 'user_mount'.

cpio is unpacked into 'user_mount', and the 'user_mount' is made the
root by 'init_chroot()'
later. The 'user_mount' can be ramfs or tmpfs. So 'init_mount' is not
the root any more,
and it makes no sense to make it tmpfs, just like why it should be
ramfs when 'root'
is a block fs or initrd.

'init_mount' is created from 'rootfs_fs_type' in init_mount_tree():

static void __init init_mount_tree(void)
{
        struct vfsmount *mnt;
        ...
        mnt = vfs_kern_mount(&rootfs_fs_type, 0, "rootfs", NULL);
        ...
}

I make the 'init_mount' to be ramfs by making
'rootfs_fs_type->init_fs_context' with
'ramfs_init_fs_context':

struct file_system_type rootfs_fs_type = {
.name = "rootfs",
.init_fs_context = ramfs_init_fs_context,
.kill_sb = kill_litter_super,
};

I think the third patch that I sended has made it clear what I do to
the 'init_mount'.

Thanks!
Menglong Dong

  reply	other threads:[~2021-05-25  0:56 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-20 15:42 [PATCH RESEND] init/initramfs.c: make initramfs support pivot_root menglong8.dong
2021-05-20 21:41 ` Luis Chamberlain
2021-05-21  0:41   ` Menglong Dong
2021-05-21 15:50     ` Luis Chamberlain
2021-05-22  4:09       ` Menglong Dong
2021-05-24 22:58         ` Luis Chamberlain
2021-05-25  0:55           ` Menglong Dong [this message]
2021-05-25  1:43             ` Luis Chamberlain
2021-05-25  6:09               ` Menglong Dong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CADxym3akKEurTTGiBxYZiXKVWUbOg=a8UeuRsJ07tT+DixA8mw@mail.gmail.com' \
    --to=menglong8.dong@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=axboe@kernel.dk \
    --cc=brho@google.com \
    --cc=chris@chrisdown.name \
    --cc=dong.menglong@zte.com.cn \
    --cc=ebiederm@xmission.com \
    --cc=f.fainelli@gmail.com \
    --cc=glider@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hare@suse.de \
    --cc=jack@suse.cz \
    --cc=jojing64@gmail.com \
    --cc=keescook@chromium.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@rasmusvillemoes.dk \
    --cc=mcgrof@kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=neilb@suse.de \
    --cc=palmerdabbelt@google.com \
    --cc=pmladek@suse.com \
    --cc=rostedt@goodmis.org \
    --cc=song@kernel.org \
    --cc=tj@kernel.org \
    --cc=vbabka@suse.cz \
    --cc=viro@zeniv.linux.org.uk \
    --cc=wangkefeng.wang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).