linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* support booting of arbitrary non-blockdevice file systems v2
@ 2021-06-21  6:26 Christoph Hellwig
  2021-06-21  6:26 ` [PATCH 1/2] init: split get_fs_names Christoph Hellwig
                   ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: Christoph Hellwig @ 2021-06-21  6:26 UTC (permalink / raw)
  To: viro; +Cc: Vivek Goyal, linux-fsdevel, linux-kernel, virtio-fs

Hi all,

this series adds support to boot off arbitrary non-blockdevice root file
systems, based off an earlier patch from Vivek.

Chances since v1:
 - don't try to mount every registered file system if none is specified
 - fix various null pointer dereferences when certain kernel paramters are
   not set
 - general refactoring.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/2] init: split get_fs_names
  2021-06-21  6:26 support booting of arbitrary non-blockdevice file systems v2 Christoph Hellwig
@ 2021-06-21  6:26 ` Christoph Hellwig
  2021-06-21 14:46   ` Al Viro
  2021-06-21 15:09   ` Matthew Wilcox
  2021-06-21  6:26 ` [PATCH 2/2] init: allow mounting arbitrary non-blockdevice filesystems as root Christoph Hellwig
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 21+ messages in thread
From: Christoph Hellwig @ 2021-06-21  6:26 UTC (permalink / raw)
  To: viro; +Cc: Vivek Goyal, linux-fsdevel, linux-kernel, virtio-fs

Split get_fs_names into one function that splits up the command line
argument, and one that gets the list of all registered file systems.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 init/do_mounts.c | 48 ++++++++++++++++++++++++++----------------------
 1 file changed, 26 insertions(+), 22 deletions(-)

diff --git a/init/do_mounts.c b/init/do_mounts.c
index 74aede860de7..ec32de3ad52b 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -338,30 +338,31 @@ __setup("rootflags=", root_data_setup);
 __setup("rootfstype=", fs_names_setup);
 __setup("rootdelay=", root_delay_setup);
 
-static void __init get_fs_names(char *page)
+static void __init split_fs_names(char *page, char *names)
 {
-	char *s = page;
-
-	if (root_fs_names) {
-		strcpy(page, root_fs_names);
-		while (*s++) {
-			if (s[-1] == ',')
-				s[-1] = '\0';
-		}
-	} else {
-		int len = get_filesystem_list(page);
-		char *p, *next;
+	strcpy(page, root_fs_names);
+	while (*page++) {
+		if (page[-1] == ',')
+			page[-1] = '\0';
+	}
+	*page = '\0';
+}
 
-		page[len] = '\0';
-		for (p = page-1; p; p = next) {
-			next = strchr(++p, '\n');
-			if (*p++ != '\t')
-				continue;
-			while ((*s++ = *p++) != '\n')
-				;
-			s[-1] = '\0';
-		}
+static void __init get_all_fs_names(char *page)
+{
+	int len = get_filesystem_list(page);
+	char *s = page, *p, *next;
+
+	page[len] = '\0';
+	for (p = page - 1; p; p = next) {
+		next = strchr(++p, '\n');
+		if (*p++ != '\t')
+			continue;
+		while ((*s++ = *p++) != '\n')
+			;
+		s[-1] = '\0';
 	}
+
 	*s = '\0';
 }
 
@@ -411,7 +412,10 @@ void __init mount_block_root(char *name, int flags)
 
 	scnprintf(b, BDEVNAME_SIZE, "unknown-block(%u,%u)",
 		  MAJOR(ROOT_DEV), MINOR(ROOT_DEV));
-	get_fs_names(fs_names);
+	if (root_fs_names)
+		split_fs_names(fs_names, root_fs_names);
+	else
+		get_all_fs_names(fs_names);
 retry:
 	for (p = fs_names; *p; p += strlen(p)+1) {
 		int err = do_mount_root(name, p, flags, root_mount_data);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 2/2] init: allow mounting arbitrary non-blockdevice filesystems as root
  2021-06-21  6:26 support booting of arbitrary non-blockdevice file systems v2 Christoph Hellwig
  2021-06-21  6:26 ` [PATCH 1/2] init: split get_fs_names Christoph Hellwig
@ 2021-06-21  6:26 ` Christoph Hellwig
  2021-06-21 13:31 ` [Virtio-fs] support booting of arbitrary non-blockdevice file systems v2 Stefan Hajnoczi
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 21+ messages in thread
From: Christoph Hellwig @ 2021-06-21  6:26 UTC (permalink / raw)
  To: viro; +Cc: Vivek Goyal, linux-fsdevel, linux-kernel, virtio-fs

Currently the only non-blockdevice filesystems that can be used as the
initial root filesystem are NFS and CIFS, which use the magic
"root=/dev/nfs" and "root=/dev/cifs" syntax that requires the root
device file system details to come from filesystem specific kernel
command line options.

Add a little bit of new code that allows to just pass arbitrary
string mount options to any non-blockdevice filesystems so that it can
be mounted as the root file system.

For example a virtiofs root file system can be mounted using the
following syntax:

"root=myfs rootfstype=virtiofs rw"

Based on an earlier patch from Vivek Goyal <vgoyal@redhat.com>.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 init/do_mounts.c | 43 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/init/do_mounts.c b/init/do_mounts.c
index ec32de3ad52b..bdeb90b8d669 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -534,6 +534,45 @@ static int __init mount_cifs_root(void)
 }
 #endif
 
+static bool __init fs_is_nodev(char *fstype)
+{
+	struct file_system_type *fs = get_fs_type(fstype);
+	bool ret = false;
+
+	if (fs) {
+		ret = !(fs->fs_flags & FS_REQUIRES_DEV);
+		put_filesystem(fs);
+	}
+
+	return ret;
+}
+
+static int __init mount_nodev_root(void)
+{
+	char *fs_names, *fstype;
+	int err = -EINVAL;
+
+	fs_names = (void *)__get_free_page(GFP_KERNEL);
+	if (!fs_names)
+		return -EINVAL;
+	split_fs_names(fs_names, root_fs_names);
+
+	for (fstype = fs_names; *fstype; fstype += strlen(fstype) + 1) {
+		if (!fs_is_nodev(fstype))
+			continue;
+		err = do_mount_root(root_device_name, fstype, root_mountflags,
+				    root_mount_data);
+		if (!err)
+			break;
+		if (err != -EACCES && err != -EINVAL)
+			panic("VFS: Unable to mount root \"%s\" (%s), err=%d\n",
+			      root_device_name, fstype, err);
+	}
+
+	free_page((unsigned long)fs_names);
+	return err;
+}
+
 void __init mount_root(void)
 {
 #ifdef CONFIG_ROOT_NFS
@@ -550,6 +589,10 @@ void __init mount_root(void)
 		return;
 	}
 #endif
+	if (ROOT_DEV == 0 && root_device_name && root_fs_names) {
+		if (mount_nodev_root() == 0)
+			return;
+	}
 #ifdef CONFIG_BLOCK
 	{
 		int err = create_dev("/dev/root", ROOT_DEV);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Virtio-fs] support booting of arbitrary non-blockdevice file systems v2
  2021-06-21  6:26 support booting of arbitrary non-blockdevice file systems v2 Christoph Hellwig
  2021-06-21  6:26 ` [PATCH 1/2] init: split get_fs_names Christoph Hellwig
  2021-06-21  6:26 ` [PATCH 2/2] init: allow mounting arbitrary non-blockdevice filesystems as root Christoph Hellwig
@ 2021-06-21 13:31 ` Stefan Hajnoczi
  2021-06-21 14:35 ` Vivek Goyal
  2021-06-22  8:12 ` [PATCH 3/2] fs: simplify get_filesystem_list / get_all_fs_names Christoph Hellwig
  4 siblings, 0 replies; 21+ messages in thread
From: Stefan Hajnoczi @ 2021-06-21 13:31 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: viro, linux-fsdevel, virtio-fs, linux-kernel, Vivek Goyal

[-- Attachment #1: Type: text/plain, Size: 649 bytes --]

On Mon, Jun 21, 2021 at 08:26:55AM +0200, Christoph Hellwig wrote:
> Hi all,
> 
> this series adds support to boot off arbitrary non-blockdevice root file
> systems, based off an earlier patch from Vivek.
> 
> Chances since v1:
>  - don't try to mount every registered file system if none is specified
>  - fix various null pointer dereferences when certain kernel paramters are
>    not set
>  - general refactoring.
> 
> _______________________________________________
> Virtio-fs mailing list
> Virtio-fs@redhat.com
> https://listman.redhat.com/mailman/listinfo/virtio-fs
> 

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: support booting of arbitrary non-blockdevice file systems v2
  2021-06-21  6:26 support booting of arbitrary non-blockdevice file systems v2 Christoph Hellwig
                   ` (2 preceding siblings ...)
  2021-06-21 13:31 ` [Virtio-fs] support booting of arbitrary non-blockdevice file systems v2 Stefan Hajnoczi
@ 2021-06-21 14:35 ` Vivek Goyal
  2021-06-22  8:12 ` [PATCH 3/2] fs: simplify get_filesystem_list / get_all_fs_names Christoph Hellwig
  4 siblings, 0 replies; 21+ messages in thread
From: Vivek Goyal @ 2021-06-21 14:35 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: viro, linux-fsdevel, linux-kernel, virtio-fs

On Mon, Jun 21, 2021 at 08:26:55AM +0200, Christoph Hellwig wrote:
> Hi all,
> 
> this series adds support to boot off arbitrary non-blockdevice root file
> systems, based off an earlier patch from Vivek.
> 
> Chances since v1:
>  - don't try to mount every registered file system if none is specified
>  - fix various null pointer dereferences when certain kernel paramters are
>    not set
>  - general refactoring.

Thanks Christoph. This version looks good to me. I tested both with
virtiofs and 9p and it works for me.

Reviewed-by: Vivek Goyal <vgoyal@redhat.com>

Thanks
Vivek


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/2] init: split get_fs_names
  2021-06-21  6:26 ` [PATCH 1/2] init: split get_fs_names Christoph Hellwig
@ 2021-06-21 14:46   ` Al Viro
  2021-06-21 14:51     ` Al Viro
  2021-06-21 14:53     ` Christoph Hellwig
  2021-06-21 15:09   ` Matthew Wilcox
  1 sibling, 2 replies; 21+ messages in thread
From: Al Viro @ 2021-06-21 14:46 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Vivek Goyal, linux-fsdevel, linux-kernel, virtio-fs

On Mon, Jun 21, 2021 at 08:26:56AM +0200, Christoph Hellwig wrote:
> Split get_fs_names into one function that splits up the command line
> argument, and one that gets the list of all registered file systems.

> +static void __init get_all_fs_names(char *page)
> +{
> +	int len = get_filesystem_list(page);
> +	char *s = page, *p, *next;
> +
> +	page[len] = '\0';
> +	for (p = page - 1; p; p = next) {
> +		next = strchr(++p, '\n');
> +		if (*p++ != '\t')
> +			continue;
> +		while ((*s++ = *p++) != '\n')
> +			;
> +		s[-1] = '\0';
>  	}
> +
>  	*s = '\0';
>  }

TBH, I would rather take that one into fs/filesystems.c.  Rationale:
get_filesystem_list(), for all its resemblance to /proc/filesystems
contents, is used only by init/*.c and it's not a big deal to make
it

int __init get_filesystem_list(char *buf, bool is_dev)
{
	int f = is_dev ? FS_REQUIRES_DEV : 0;
        int left = PAGE_SIZE, count = 0;
        struct file_system_type *p;

        read_lock(&file_systems_lock);
	for (p = file_systems; p; p = p->next) {
		if ((p->fs_flags & FS_REQUIRES_DEV) == f) {
			size_t len = strlen(p->name) + 1;
			if (len > left)
				break;
			memcpy(buf, p->name, len);
			buf += len;
			left -= len;
			count++;
		}
	}
        read_unlock(&file_systems_lock);
	return count;
}

Generates NUL-separated list, returns the number of list elements,
the second argument is "what kind do you want"...

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/2] init: split get_fs_names
  2021-06-21 14:46   ` Al Viro
@ 2021-06-21 14:51     ` Al Viro
  2021-06-21 14:53     ` Christoph Hellwig
  1 sibling, 0 replies; 21+ messages in thread
From: Al Viro @ 2021-06-21 14:51 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Vivek Goyal, linux-fsdevel, linux-kernel, virtio-fs

On Mon, Jun 21, 2021 at 02:46:37PM +0000, Al Viro wrote:

> int __init get_filesystem_list(char *buf, bool is_dev)
> {
> 	int f = is_dev ? FS_REQUIRES_DEV : 0;
>         int left = PAGE_SIZE, count = 0;
>         struct file_system_type *p;
> 
>         read_lock(&file_systems_lock);
> 	for (p = file_systems; p; p = p->next) {
> 		if ((p->fs_flags & FS_REQUIRES_DEV) == f) {
> 			size_t len = strlen(p->name) + 1;
> 			if (len > left)
> 				break;
> 			memcpy(buf, p->name, len);
> 			buf += len;
> 			left -= len;
> 			count++;
> 		}
> 	}
>         read_unlock(&file_systems_lock);
> 	return count;
> }
> 
> Generates NUL-separated list, returns the number of list elements,
> the second argument is "what kind do you want"...

Actually, looking at the rest of the queue, that's only used for
!nodev ones, so might as well drop the flag here...

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/2] init: split get_fs_names
  2021-06-21 14:46   ` Al Viro
  2021-06-21 14:51     ` Al Viro
@ 2021-06-21 14:53     ` Christoph Hellwig
  2021-06-21 14:59       ` Al Viro
  1 sibling, 1 reply; 21+ messages in thread
From: Christoph Hellwig @ 2021-06-21 14:53 UTC (permalink / raw)
  To: Al Viro
  Cc: Christoph Hellwig, Vivek Goyal, linux-fsdevel, linux-kernel, virtio-fs

On Mon, Jun 21, 2021 at 02:46:37PM +0000, Al Viro wrote:
> TBH, I would rather take that one into fs/filesystems.c.  Rationale:
> get_filesystem_list(), for all its resemblance to /proc/filesystems
> contents, is used only by init/*.c and it's not a big deal to make
> it

Yeah, unwinding this mess actually is a good idea.  I didn't really
look outside of do_mounts.c, but once doing that it becomes completely
obvious.

> int __init get_filesystem_list(char *buf, bool is_dev)

As-is we don't even really need the is_dev argument, as the only
callers wants block device file systems anyway.  In fact it would
much rather have a cursor based iteration so that we can skip the
allocation, but that is probaby overengineering the problem.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/2] init: split get_fs_names
  2021-06-21 14:53     ` Christoph Hellwig
@ 2021-06-21 14:59       ` Al Viro
  0 siblings, 0 replies; 21+ messages in thread
From: Al Viro @ 2021-06-21 14:59 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Vivek Goyal, linux-fsdevel, linux-kernel, virtio-fs

On Mon, Jun 21, 2021 at 04:53:09PM +0200, Christoph Hellwig wrote:
> On Mon, Jun 21, 2021 at 02:46:37PM +0000, Al Viro wrote:
> > TBH, I would rather take that one into fs/filesystems.c.  Rationale:
> > get_filesystem_list(), for all its resemblance to /proc/filesystems
> > contents, is used only by init/*.c and it's not a big deal to make
> > it
> 
> Yeah, unwinding this mess actually is a good idea.  I didn't really
> look outside of do_mounts.c, but once doing that it becomes completely
> obvious.
> 
> > int __init get_filesystem_list(char *buf, bool is_dev)
> 
> As-is we don't even really need the is_dev argument, as the only
> callers wants block device file systems anyway.

*nod*

> In fact it would
> much rather have a cursor based iteration so that we can skip the
> allocation, but that is probaby overengineering the problem.

Very much so.

Sigh...  I really wish we had more uniform syntax, though -
e.g.  root=nfs(<options>) or root=xfs(sdb11,noatime), etc.
Oh, well...

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/2] init: split get_fs_names
  2021-06-21  6:26 ` [PATCH 1/2] init: split get_fs_names Christoph Hellwig
  2021-06-21 14:46   ` Al Viro
@ 2021-06-21 15:09   ` Matthew Wilcox
  2021-06-21 15:22     ` Christoph Hellwig
  1 sibling, 1 reply; 21+ messages in thread
From: Matthew Wilcox @ 2021-06-21 15:09 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: viro, Vivek Goyal, linux-fsdevel, linux-kernel, virtio-fs

On Mon, Jun 21, 2021 at 08:26:56AM +0200, Christoph Hellwig wrote:
> -static void __init get_fs_names(char *page)
> +static void __init split_fs_names(char *page, char *names)

If you're going to respin it anyway, can you rename 'page' to 'buf'
or something?  Kind of confusing to have a char * called 'page'.

>  {
> +	strcpy(page, root_fs_names);
> +	while (*page++) {
> +		if (page[-1] == ',')
> +			page[-1] = '\0';
> +	}
> +	*page = '\0';
> +}

is it really worth doing a strcpy() followed by a custom strtok()?
would this work better?

	char c;

	do {
		c =  *root_fs_names++;
		*buf++ = c;
		if (c == ',')
			buf[-1] = '\0';
	} while (c);

> +static void __init get_all_fs_names(char *page)
> +{
> +	int len = get_filesystem_list(page);

it occurs to me that get_filesystem_list() fails silently.  if you build
every linux filesystem in, and want your root on zonefs (assuming
they're alphabetical), we'll fail to find it without a message
indicating that we overflowed the buffer.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/2] init: split get_fs_names
  2021-06-21 15:09   ` Matthew Wilcox
@ 2021-06-21 15:22     ` Christoph Hellwig
  0 siblings, 0 replies; 21+ messages in thread
From: Christoph Hellwig @ 2021-06-21 15:22 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Christoph Hellwig, viro, Vivek Goyal, linux-fsdevel,
	linux-kernel, virtio-fs

On Mon, Jun 21, 2021 at 04:09:16PM +0100, Matthew Wilcox wrote:
> On Mon, Jun 21, 2021 at 08:26:56AM +0200, Christoph Hellwig wrote:
> > -static void __init get_fs_names(char *page)
> > +static void __init split_fs_names(char *page, char *names)
> 
> If you're going to respin it anyway, can you rename 'page' to 'buf'
> or something?  Kind of confusing to have a char * called 'page'.

I was hoping that we did not need a respin.  While Al's suggestion is
nice, and I've added a variant of it to my local tree it really
is just incremental improvements.

I actually thing that page name is not too bad, as it implements the
implicit assumptions that it is a PAGE_SIZE allocation (which is a bad
idea to start with, but we're digging a deeper and deeper hole here..)

> is it really worth doing a strcpy() followed by a custom strtok()?
> would this work better?
> 
> 	char c;
> 
> 	do {
> 		c =  *root_fs_names++;
> 		*buf++ = c;
> 		if (c == ',')
> 			buf[-1] = '\0';
> 	} while (c);

Maybe.  Then again all this is age old rarely used code, so it might be
better to not stirr it too much..

> > +static void __init get_all_fs_names(char *page)
> > +{
> > +	int len = get_filesystem_list(page);
> 
> it occurs to me that get_filesystem_list() fails silently.  if you build
> every linux filesystem in, and want your root on zonefs (assuming
> they're alphabetical), we'll fail to find it without a message
> indicating that we overflowed the buffer.

Yes.  The only sensible way to fix this would be some kind of cursor.
We could use an xarray for the file_systems list, but unless we want
to assume file systems can't go away at init type (which might be an
ok assumption) we'd then need to get into refcount, and and and..

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 3/2] fs: simplify get_filesystem_list / get_all_fs_names
  2021-06-21  6:26 support booting of arbitrary non-blockdevice file systems v2 Christoph Hellwig
                   ` (3 preceding siblings ...)
  2021-06-21 14:35 ` Vivek Goyal
@ 2021-06-22  8:12 ` Christoph Hellwig
  2021-06-22  8:36   ` [Virtio-fs] " Stefan Hajnoczi
  4 siblings, 1 reply; 21+ messages in thread
From: Christoph Hellwig @ 2021-06-22  8:12 UTC (permalink / raw)
  To: viro; +Cc: Vivek Goyal, linux-fsdevel, linux-kernel, virtio-fs

Just output the '\0' separate list of supported file systems for block
devices directly rather than going through a pointless round of string
manipulation.

Based on an earlier patch from Al Viro <viro@zeniv.linux.org.uk>.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/filesystems.c   | 24 ++++++++++++++----------
 include/linux/fs.h |  2 +-
 init/do_mounts.c   | 20 +-------------------
 3 files changed, 16 insertions(+), 30 deletions(-)

diff --git a/fs/filesystems.c b/fs/filesystems.c
index 90b8d879fbaf..7c136251607a 100644
--- a/fs/filesystems.c
+++ b/fs/filesystems.c
@@ -209,21 +209,25 @@ SYSCALL_DEFINE3(sysfs, int, option, unsigned long, arg1, unsigned long, arg2)
 }
 #endif
 
-int __init get_filesystem_list(char *buf)
+void __init list_bdev_fs_names(char *buf, size_t size)
 {
-	int len = 0;
-	struct file_system_type * tmp;
+	struct file_system_type *p;
+	size_t len;
 
 	read_lock(&file_systems_lock);
-	tmp = file_systems;
-	while (tmp && len < PAGE_SIZE - 80) {
-		len += sprintf(buf+len, "%s\t%s\n",
-			(tmp->fs_flags & FS_REQUIRES_DEV) ? "" : "nodev",
-			tmp->name);
-		tmp = tmp->next;
+	for (p = file_systems; p; p = p->next) {
+		if (!(p->fs_flags & FS_REQUIRES_DEV))
+			continue;
+		len = strlen(p->name) + 1;
+		if (len > size) {
+			pr_warn("%s: truncating file system list\n", __func__);
+			break;
+		}
+		memcpy(buf, p->name, len);
+		buf += len;
+		size -= len;
 	}
 	read_unlock(&file_systems_lock);
-	return len;
 }
 
 #ifdef CONFIG_PROC_FS
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f4ed7bf1130d..cdcd7f2a2c3f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3624,7 +3624,7 @@ int proc_nr_dentry(struct ctl_table *table, int write,
 		  void *buffer, size_t *lenp, loff_t *ppos);
 int proc_nr_inodes(struct ctl_table *table, int write,
 		   void *buffer, size_t *lenp, loff_t *ppos);
-int __init get_filesystem_list(char *buf);
+void __init list_bdev_fs_names(char *buf, size_t size);
 
 #define __FMODE_EXEC		((__force int) FMODE_EXEC)
 #define __FMODE_NONOTIFY	((__force int) FMODE_NONOTIFY)
diff --git a/init/do_mounts.c b/init/do_mounts.c
index bdeb90b8d669..ffbe4deeb274 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -348,24 +348,6 @@ static void __init split_fs_names(char *page, char *names)
 	*page = '\0';
 }
 
-static void __init get_all_fs_names(char *page)
-{
-	int len = get_filesystem_list(page);
-	char *s = page, *p, *next;
-
-	page[len] = '\0';
-	for (p = page - 1; p; p = next) {
-		next = strchr(++p, '\n');
-		if (*p++ != '\t')
-			continue;
-		while ((*s++ = *p++) != '\n')
-			;
-		s[-1] = '\0';
-	}
-
-	*s = '\0';
-}
-
 static int __init do_mount_root(const char *name, const char *fs,
 				 const int flags, const void *data)
 {
@@ -415,7 +397,7 @@ void __init mount_block_root(char *name, int flags)
 	if (root_fs_names)
 		split_fs_names(fs_names, root_fs_names);
 	else
-		get_all_fs_names(fs_names);
+		list_bdev_fs_names(fs_names, PAGE_SIZE);
 retry:
 	for (p = fs_names; *p; p += strlen(p)+1) {
 		int err = do_mount_root(name, p, flags, root_mount_data);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Virtio-fs] [PATCH 3/2] fs: simplify get_filesystem_list / get_all_fs_names
  2021-06-22  8:12 ` [PATCH 3/2] fs: simplify get_filesystem_list / get_all_fs_names Christoph Hellwig
@ 2021-06-22  8:36   ` Stefan Hajnoczi
  2021-06-29 20:50     ` Vivek Goyal
  0 siblings, 1 reply; 21+ messages in thread
From: Stefan Hajnoczi @ 2021-06-22  8:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: viro, linux-fsdevel, virtio-fs, linux-kernel, Vivek Goyal

[-- Attachment #1: Type: text/plain, Size: 1791 bytes --]

On Tue, Jun 22, 2021 at 10:12:17AM +0200, Christoph Hellwig wrote:
> Just output the '\0' separate list of supported file systems for block
> devices directly rather than going through a pointless round of string
> manipulation.
> 
> Based on an earlier patch from Al Viro <viro@zeniv.linux.org.uk>.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/filesystems.c   | 24 ++++++++++++++----------
>  include/linux/fs.h |  2 +-
>  init/do_mounts.c   | 20 +-------------------
>  3 files changed, 16 insertions(+), 30 deletions(-)
> 
> diff --git a/fs/filesystems.c b/fs/filesystems.c
> index 90b8d879fbaf..7c136251607a 100644
> --- a/fs/filesystems.c
> +++ b/fs/filesystems.c
> @@ -209,21 +209,25 @@ SYSCALL_DEFINE3(sysfs, int, option, unsigned long, arg1, unsigned long, arg2)
>  }
>  #endif
>  
> -int __init get_filesystem_list(char *buf)
> +void __init list_bdev_fs_names(char *buf, size_t size)
>  {
> -	int len = 0;
> -	struct file_system_type * tmp;
> +	struct file_system_type *p;
> +	size_t len;
>  
>  	read_lock(&file_systems_lock);
> -	tmp = file_systems;
> -	while (tmp && len < PAGE_SIZE - 80) {
> -		len += sprintf(buf+len, "%s\t%s\n",
> -			(tmp->fs_flags & FS_REQUIRES_DEV) ? "" : "nodev",
> -			tmp->name);
> -		tmp = tmp->next;
> +	for (p = file_systems; p; p = p->next) {
> +		if (!(p->fs_flags & FS_REQUIRES_DEV))
> +			continue;
> +		len = strlen(p->name) + 1;
> +		if (len > size) {
> +			pr_warn("%s: truncating file system list\n", __func__);
> +			break;
> +		}
> +		memcpy(buf, p->name, len);
> +		buf += len;
> +		size -= len;
>  	}
>  	read_unlock(&file_systems_lock);
> -	return len;
>  }

I don't see the extra NUL terminator byte being added that's required by
the loop in mount_block_root()?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Virtio-fs] [PATCH 3/2] fs: simplify get_filesystem_list / get_all_fs_names
  2021-06-22  8:36   ` [Virtio-fs] " Stefan Hajnoczi
@ 2021-06-29 20:50     ` Vivek Goyal
  2021-06-30  5:36       ` Christoph Hellwig
  0 siblings, 1 reply; 21+ messages in thread
From: Vivek Goyal @ 2021-06-29 20:50 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Christoph Hellwig, viro, linux-fsdevel, virtio-fs, linux-kernel

On Tue, Jun 22, 2021 at 09:36:33AM +0100, Stefan Hajnoczi wrote:
> On Tue, Jun 22, 2021 at 10:12:17AM +0200, Christoph Hellwig wrote:
> > Just output the '\0' separate list of supported file systems for block
> > devices directly rather than going through a pointless round of string
> > manipulation.
> > 
> > Based on an earlier patch from Al Viro <viro@zeniv.linux.org.uk>.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > ---
> >  fs/filesystems.c   | 24 ++++++++++++++----------
> >  include/linux/fs.h |  2 +-
> >  init/do_mounts.c   | 20 +-------------------
> >  3 files changed, 16 insertions(+), 30 deletions(-)
> > 
> > diff --git a/fs/filesystems.c b/fs/filesystems.c
> > index 90b8d879fbaf..7c136251607a 100644
> > --- a/fs/filesystems.c
> > +++ b/fs/filesystems.c
> > @@ -209,21 +209,25 @@ SYSCALL_DEFINE3(sysfs, int, option, unsigned long, arg1, unsigned long, arg2)
> >  }
> >  #endif
> >  
> > -int __init get_filesystem_list(char *buf)
> > +void __init list_bdev_fs_names(char *buf, size_t size)
> >  {
> > -	int len = 0;
> > -	struct file_system_type * tmp;
> > +	struct file_system_type *p;
> > +	size_t len;
> >  
> >  	read_lock(&file_systems_lock);
> > -	tmp = file_systems;
> > -	while (tmp && len < PAGE_SIZE - 80) {
> > -		len += sprintf(buf+len, "%s\t%s\n",
> > -			(tmp->fs_flags & FS_REQUIRES_DEV) ? "" : "nodev",
> > -			tmp->name);
> > -		tmp = tmp->next;
> > +	for (p = file_systems; p; p = p->next) {
> > +		if (!(p->fs_flags & FS_REQUIRES_DEV))
> > +			continue;
> > +		len = strlen(p->name) + 1;
> > +		if (len > size) {
> > +			pr_warn("%s: truncating file system list\n", __func__);
> > +			break;
> > +		}
> > +		memcpy(buf, p->name, len);
> > +		buf += len;
> > +		size -= len;
> >  	}
> >  	read_unlock(&file_systems_lock);
> > -	return len;
> >  }
> 
> I don't see the extra NUL terminator byte being added that's required by
> the loop in mount_block_root()?

May be we should modify mount_block_root() code so that it does not
require that extra "\0". Possibly zero initialize page and that should
make sure list_bdev_fs_names() does not have to worry about it.

It is possible that a page gets full from the list of filesystems, and
last byte on page is terminating null. In that case just zeroing page
will not help. We can keep track of some sort of end pointer and make
sure we are not searching beyond that for valid filesystem types.

end = page + PAGE_SIZE - 1;

mount_block_root()
{
	for (p = fs_names; p < end && *p; p += strlen(p)+1) {
	}
}

Thanks
Vivek


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Virtio-fs] [PATCH 3/2] fs: simplify get_filesystem_list / get_all_fs_names
  2021-06-29 20:50     ` Vivek Goyal
@ 2021-06-30  5:36       ` Christoph Hellwig
  2021-06-30 17:33         ` Vivek Goyal
  2021-07-07 21:04         ` Vivek Goyal
  0 siblings, 2 replies; 21+ messages in thread
From: Christoph Hellwig @ 2021-06-30  5:36 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Stefan Hajnoczi, Christoph Hellwig, viro, linux-fsdevel,
	virtio-fs, linux-kernel

On Tue, Jun 29, 2021 at 04:50:48PM -0400, Vivek Goyal wrote:
> May be we should modify mount_block_root() code so that it does not
> require that extra "\0". Possibly zero initialize page and that should
> make sure list_bdev_fs_names() does not have to worry about it.
> 
> It is possible that a page gets full from the list of filesystems, and
> last byte on page is terminating null. In that case just zeroing page
> will not help. We can keep track of some sort of end pointer and make
> sure we are not searching beyond that for valid filesystem types.
> 
> end = page + PAGE_SIZE - 1;
> 
> mount_block_root()
> {
> 	for (p = fs_names; p < end && *p; p += strlen(p)+1) {
> 	}
> }

Maybe.  To honest I'd prefer to not even touch this unrelated code given
how full of landmines it is :)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Virtio-fs] [PATCH 3/2] fs: simplify get_filesystem_list / get_all_fs_names
  2021-06-30  5:36       ` Christoph Hellwig
@ 2021-06-30 17:33         ` Vivek Goyal
  2021-07-07 21:04         ` Vivek Goyal
  1 sibling, 0 replies; 21+ messages in thread
From: Vivek Goyal @ 2021-06-30 17:33 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Stefan Hajnoczi, viro, linux-fsdevel, virtio-fs, linux-kernel

On Wed, Jun 30, 2021 at 07:36:01AM +0200, Christoph Hellwig wrote:
> On Tue, Jun 29, 2021 at 04:50:48PM -0400, Vivek Goyal wrote:
> > May be we should modify mount_block_root() code so that it does not
> > require that extra "\0". Possibly zero initialize page and that should
> > make sure list_bdev_fs_names() does not have to worry about it.
> > 
> > It is possible that a page gets full from the list of filesystems, and
> > last byte on page is terminating null. In that case just zeroing page
> > will not help. We can keep track of some sort of end pointer and make
> > sure we are not searching beyond that for valid filesystem types.
> > 
> > end = page + PAGE_SIZE - 1;
> > 
> > mount_block_root()
> > {
> > 	for (p = fs_names; p < end && *p; p += strlen(p)+1) {
> > 	}
> > }
> 
> Maybe.  To honest I'd prefer to not even touch this unrelated code given
> how full of landmines it is :)

Agreed. It probably is better to make such changes incrementally. 

Given third patch is nice to have cleanup kind of thing, can we first
just merge first two patches to support non-block device filesystems as
rootfs.

We will really like to have a method to properly boot virtiofs as rootfs.

Thanks
Vivek


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Virtio-fs] [PATCH 3/2] fs: simplify get_filesystem_list / get_all_fs_names
  2021-06-30  5:36       ` Christoph Hellwig
  2021-06-30 17:33         ` Vivek Goyal
@ 2021-07-07 21:04         ` Vivek Goyal
  2021-07-07 21:06           ` Vivek Goyal
  1 sibling, 1 reply; 21+ messages in thread
From: Vivek Goyal @ 2021-07-07 21:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Stefan Hajnoczi, viro, linux-fsdevel, virtio-fs, linux-kernel

On Wed, Jun 30, 2021 at 07:36:01AM +0200, Christoph Hellwig wrote:
> On Tue, Jun 29, 2021 at 04:50:48PM -0400, Vivek Goyal wrote:
> > May be we should modify mount_block_root() code so that it does not
> > require that extra "\0". Possibly zero initialize page and that should
> > make sure list_bdev_fs_names() does not have to worry about it.
> > 
> > It is possible that a page gets full from the list of filesystems, and
> > last byte on page is terminating null. In that case just zeroing page
> > will not help. We can keep track of some sort of end pointer and make
> > sure we are not searching beyond that for valid filesystem types.
> > 
> > end = page + PAGE_SIZE - 1;
> > 
> > mount_block_root()
> > {
> > 	for (p = fs_names; p < end && *p; p += strlen(p)+1) {
> > 	}
> > }
> 
> Maybe.  To honest I'd prefer to not even touch this unrelated code given
> how full of landmines it is :)

Hi Christoph,

How about following patch. This applies on top of your patches. I noticed
that Al had suggested to return number of filesystems from helper
functions. I just did that and used that to iterate in the loop.

I tested it with a virtual block device (root=/dev/vda1) and it works.
I also filled page with garbage after allocation to make sure natually
occurring null is not there in the middle of page to terminate string.

If you like it, can you please incorporate it in your patches.

Thanks
Vivek

---
 fs/filesystems.c   |    5 ++++-
 include/linux/fs.h |    2 +-
 init/do_mounts.c   |    7 ++++---
 3 files changed, 9 insertions(+), 5 deletions(-)

Index: redhat-linux/fs/filesystems.c
===================================================================
--- redhat-linux.orig/fs/filesystems.c	2021-07-07 16:12:08.890562576 -0400
+++ redhat-linux/fs/filesystems.c	2021-07-07 16:27:51.197620063 -0400
@@ -209,10 +209,11 @@ SYSCALL_DEFINE3(sysfs, int, option, unsi
 }
 #endif
 
-void __init list_bdev_fs_names(char *buf, size_t size)
+int __init list_bdev_fs_names(char *buf, size_t size)
 {
 	struct file_system_type *p;
 	size_t len;
+	int count = 0;
 
 	read_lock(&file_systems_lock);
 	for (p = file_systems; p; p = p->next) {
@@ -226,8 +227,10 @@ void __init list_bdev_fs_names(char *buf
 		memcpy(buf, p->name, len);
 		buf += len;
 		size -= len;
+		count++;
 	}
 	read_unlock(&file_systems_lock);
+	return count;
 }
 
 #ifdef CONFIG_PROC_FS
Index: redhat-linux/include/linux/fs.h
===================================================================
--- redhat-linux.orig/include/linux/fs.h	2021-07-07 15:36:43.224418935 -0400
+++ redhat-linux/include/linux/fs.h	2021-07-07 16:12:18.232949807 -0400
@@ -3622,7 +3622,7 @@ int proc_nr_dentry(struct ctl_table *tab
 		  void *buffer, size_t *lenp, loff_t *ppos);
 int proc_nr_inodes(struct ctl_table *table, int write,
 		   void *buffer, size_t *lenp, loff_t *ppos);
-void __init list_bdev_fs_names(char *buf, size_t size);
+int __init list_bdev_fs_names(char *buf, size_t size);
 
 #define __FMODE_EXEC		((__force int) FMODE_EXEC)
 #define __FMODE_NONOTIFY	((__force int) FMODE_NONOTIFY)
Index: redhat-linux/init/do_mounts.c
===================================================================
--- redhat-linux.orig/init/do_mounts.c	2021-07-07 16:12:08.890562576 -0400
+++ redhat-linux/init/do_mounts.c	2021-07-07 16:23:32.308889444 -0400
@@ -391,15 +391,16 @@ void __init mount_block_root(char *name,
 	char *fs_names = page_address(page);
 	char *p;
 	char b[BDEVNAME_SIZE];
+	int num_fs, i;
 
 	scnprintf(b, BDEVNAME_SIZE, "unknown-block(%u,%u)",
 		  MAJOR(ROOT_DEV), MINOR(ROOT_DEV));
 	if (root_fs_names)
 		split_fs_names(fs_names, root_fs_names);
 	else
-		list_bdev_fs_names(fs_names, PAGE_SIZE);
+		num_fs = list_bdev_fs_names(fs_names, PAGE_SIZE);
 retry:
-	for (p = fs_names; *p; p += strlen(p)+1) {
+	for (p = fs_names, i = 0; i < num_fs; p += strlen(p)+1, i++) {
 		int err = do_mount_root(name, p, flags, root_mount_data);
 		switch (err) {
 			case 0:
@@ -432,7 +433,7 @@ retry:
 	printk("List of all partitions:\n");
 	printk_all_partitions();
 	printk("No filesystem could mount root, tried: ");
-	for (p = fs_names; *p; p += strlen(p)+1)
+	for (p = fs_names, i = 0; i < num_fs; p += strlen(p)+1, i++)
 		printk(" %s", p);
 	printk("\n");
 	panic("VFS: Unable to mount root fs on %s", b);



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Virtio-fs] [PATCH 3/2] fs: simplify get_filesystem_list / get_all_fs_names
  2021-07-07 21:04         ` Vivek Goyal
@ 2021-07-07 21:06           ` Vivek Goyal
  2021-07-08 12:59             ` Vivek Goyal
  0 siblings, 1 reply; 21+ messages in thread
From: Vivek Goyal @ 2021-07-07 21:06 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Stefan Hajnoczi, viro, linux-fsdevel, virtio-fs, linux-kernel

On Wed, Jul 07, 2021 at 05:04:04PM -0400, Vivek Goyal wrote:
> On Wed, Jun 30, 2021 at 07:36:01AM +0200, Christoph Hellwig wrote:
> > On Tue, Jun 29, 2021 at 04:50:48PM -0400, Vivek Goyal wrote:
> > > May be we should modify mount_block_root() code so that it does not
> > > require that extra "\0". Possibly zero initialize page and that should
> > > make sure list_bdev_fs_names() does not have to worry about it.
> > > 
> > > It is possible that a page gets full from the list of filesystems, and
> > > last byte on page is terminating null. In that case just zeroing page
> > > will not help. We can keep track of some sort of end pointer and make
> > > sure we are not searching beyond that for valid filesystem types.
> > > 
> > > end = page + PAGE_SIZE - 1;
> > > 
> > > mount_block_root()
> > > {
> > > 	for (p = fs_names; p < end && *p; p += strlen(p)+1) {
> > > 	}
> > > }
> > 
> > Maybe.  To honest I'd prefer to not even touch this unrelated code given
> > how full of landmines it is :)
> 
> Hi Christoph,
> 
> How about following patch. This applies on top of your patches. I noticed
> that Al had suggested to return number of filesystems from helper
> functions. I just did that and used that to iterate in the loop.
> 
> I tested it with a virtual block device (root=/dev/vda1) and it works.
> I also filled page with garbage after allocation to make sure natually
> occurring null is not there in the middle of page to terminate string.
> 
> If you like it, can you please incorporate it in your patches.

I noticed this will break with "root_fs_names=". Sorry, will have to
fix split_fs_names() as well. Will do.

Vivek

> 
> Thanks
> Vivek
> 
> ---
>  fs/filesystems.c   |    5 ++++-
>  include/linux/fs.h |    2 +-
>  init/do_mounts.c   |    7 ++++---
>  3 files changed, 9 insertions(+), 5 deletions(-)
> 
> Index: redhat-linux/fs/filesystems.c
> ===================================================================
> --- redhat-linux.orig/fs/filesystems.c	2021-07-07 16:12:08.890562576 -0400
> +++ redhat-linux/fs/filesystems.c	2021-07-07 16:27:51.197620063 -0400
> @@ -209,10 +209,11 @@ SYSCALL_DEFINE3(sysfs, int, option, unsi
>  }
>  #endif
>  
> -void __init list_bdev_fs_names(char *buf, size_t size)
> +int __init list_bdev_fs_names(char *buf, size_t size)
>  {
>  	struct file_system_type *p;
>  	size_t len;
> +	int count = 0;
>  
>  	read_lock(&file_systems_lock);
>  	for (p = file_systems; p; p = p->next) {
> @@ -226,8 +227,10 @@ void __init list_bdev_fs_names(char *buf
>  		memcpy(buf, p->name, len);
>  		buf += len;
>  		size -= len;
> +		count++;
>  	}
>  	read_unlock(&file_systems_lock);
> +	return count;
>  }
>  
>  #ifdef CONFIG_PROC_FS
> Index: redhat-linux/include/linux/fs.h
> ===================================================================
> --- redhat-linux.orig/include/linux/fs.h	2021-07-07 15:36:43.224418935 -0400
> +++ redhat-linux/include/linux/fs.h	2021-07-07 16:12:18.232949807 -0400
> @@ -3622,7 +3622,7 @@ int proc_nr_dentry(struct ctl_table *tab
>  		  void *buffer, size_t *lenp, loff_t *ppos);
>  int proc_nr_inodes(struct ctl_table *table, int write,
>  		   void *buffer, size_t *lenp, loff_t *ppos);
> -void __init list_bdev_fs_names(char *buf, size_t size);
> +int __init list_bdev_fs_names(char *buf, size_t size);
>  
>  #define __FMODE_EXEC		((__force int) FMODE_EXEC)
>  #define __FMODE_NONOTIFY	((__force int) FMODE_NONOTIFY)
> Index: redhat-linux/init/do_mounts.c
> ===================================================================
> --- redhat-linux.orig/init/do_mounts.c	2021-07-07 16:12:08.890562576 -0400
> +++ redhat-linux/init/do_mounts.c	2021-07-07 16:23:32.308889444 -0400
> @@ -391,15 +391,16 @@ void __init mount_block_root(char *name,
>  	char *fs_names = page_address(page);
>  	char *p;
>  	char b[BDEVNAME_SIZE];
> +	int num_fs, i;
>  
>  	scnprintf(b, BDEVNAME_SIZE, "unknown-block(%u,%u)",
>  		  MAJOR(ROOT_DEV), MINOR(ROOT_DEV));
>  	if (root_fs_names)
>  		split_fs_names(fs_names, root_fs_names);
>  	else
> -		list_bdev_fs_names(fs_names, PAGE_SIZE);
> +		num_fs = list_bdev_fs_names(fs_names, PAGE_SIZE);
>  retry:
> -	for (p = fs_names; *p; p += strlen(p)+1) {
> +	for (p = fs_names, i = 0; i < num_fs; p += strlen(p)+1, i++) {
>  		int err = do_mount_root(name, p, flags, root_mount_data);
>  		switch (err) {
>  			case 0:
> @@ -432,7 +433,7 @@ retry:
>  	printk("List of all partitions:\n");
>  	printk_all_partitions();
>  	printk("No filesystem could mount root, tried: ");
> -	for (p = fs_names; *p; p += strlen(p)+1)
> +	for (p = fs_names, i = 0; i < num_fs; p += strlen(p)+1, i++)
>  		printk(" %s", p);
>  	printk("\n");
>  	panic("VFS: Unable to mount root fs on %s", b);
> 
> 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Virtio-fs] [PATCH 3/2] fs: simplify get_filesystem_list / get_all_fs_names
  2021-07-07 21:06           ` Vivek Goyal
@ 2021-07-08 12:59             ` Vivek Goyal
  2021-07-12 18:21               ` Vivek Goyal
  0 siblings, 1 reply; 21+ messages in thread
From: Vivek Goyal @ 2021-07-08 12:59 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Stefan Hajnoczi, viro, linux-fsdevel, virtio-fs, linux-kernel

On Wed, Jul 07, 2021 at 05:06:36PM -0400, Vivek Goyal wrote:
> On Wed, Jul 07, 2021 at 05:04:04PM -0400, Vivek Goyal wrote:
> > On Wed, Jun 30, 2021 at 07:36:01AM +0200, Christoph Hellwig wrote:
> > > On Tue, Jun 29, 2021 at 04:50:48PM -0400, Vivek Goyal wrote:
> > > > May be we should modify mount_block_root() code so that it does not
> > > > require that extra "\0". Possibly zero initialize page and that should
> > > > make sure list_bdev_fs_names() does not have to worry about it.
> > > > 
> > > > It is possible that a page gets full from the list of filesystems, and
> > > > last byte on page is terminating null. In that case just zeroing page
> > > > will not help. We can keep track of some sort of end pointer and make
> > > > sure we are not searching beyond that for valid filesystem types.
> > > > 
> > > > end = page + PAGE_SIZE - 1;
> > > > 
> > > > mount_block_root()
> > > > {
> > > > 	for (p = fs_names; p < end && *p; p += strlen(p)+1) {
> > > > 	}
> > > > }
> > > 
> > > Maybe.  To honest I'd prefer to not even touch this unrelated code given
> > > how full of landmines it is :)
> > 
> > Hi Christoph,
> > 
> > How about following patch. This applies on top of your patches. I noticed
> > that Al had suggested to return number of filesystems from helper
> > functions. I just did that and used that to iterate in the loop.
> > 
> > I tested it with a virtual block device (root=/dev/vda1) and it works.
> > I also filled page with garbage after allocation to make sure natually
> > occurring null is not there in the middle of page to terminate string.
> > 
> > If you like it, can you please incorporate it in your patches.
> 
> I noticed this will break with "root_fs_names=". Sorry, will have to
> fix split_fs_names() as well. Will do.

Hi Christoph,

I fixed it. Now both split_fs_names() and list_bdev_fs_names() return
count of fstype strings it placed in the buffer. And callers now
use that count to loop (instead of relying on extra null byte at the
end of the buffer).

I tested both nodev (virtiofs, 9p) and block dev rootfs (ext4) and
it works for me. Please have a look.

Thanks
Vivek


---
 fs/filesystems.c   |    5 ++++-
 include/linux/fs.h |    2 +-
 init/do_mounts.c   |   35 +++++++++++++++++++++++------------
 3 files changed, 28 insertions(+), 14 deletions(-)

Index: redhat-linux/fs/filesystems.c
===================================================================
--- redhat-linux.orig/fs/filesystems.c	2021-07-08 08:02:09.772766786 -0400
+++ redhat-linux/fs/filesystems.c	2021-07-08 08:02:12.044860918 -0400
@@ -209,10 +209,11 @@ SYSCALL_DEFINE3(sysfs, int, option, unsi
 }
 #endif
 
-void __init list_bdev_fs_names(char *buf, size_t size)
+int __init list_bdev_fs_names(char *buf, size_t size)
 {
 	struct file_system_type *p;
 	size_t len;
+	int count = 0;
 
 	read_lock(&file_systems_lock);
 	for (p = file_systems; p; p = p->next) {
@@ -226,8 +227,10 @@ void __init list_bdev_fs_names(char *buf
 		memcpy(buf, p->name, len);
 		buf += len;
 		size -= len;
+		count++;
 	}
 	read_unlock(&file_systems_lock);
+	return count;
 }
 
 #ifdef CONFIG_PROC_FS
Index: redhat-linux/include/linux/fs.h
===================================================================
--- redhat-linux.orig/include/linux/fs.h	2021-07-08 08:02:09.774766869 -0400
+++ redhat-linux/include/linux/fs.h	2021-07-08 08:02:12.046861001 -0400
@@ -3622,7 +3622,7 @@ int proc_nr_dentry(struct ctl_table *tab
 		  void *buffer, size_t *lenp, loff_t *ppos);
 int proc_nr_inodes(struct ctl_table *table, int write,
 		   void *buffer, size_t *lenp, loff_t *ppos);
-void __init list_bdev_fs_names(char *buf, size_t size);
+int __init list_bdev_fs_names(char *buf, size_t size);
 
 #define __FMODE_EXEC		((__force int) FMODE_EXEC)
 #define __FMODE_NONOTIFY	((__force int) FMODE_NONOTIFY)
Index: redhat-linux/init/do_mounts.c
===================================================================
--- redhat-linux.orig/init/do_mounts.c	2021-07-08 08:02:09.774766869 -0400
+++ redhat-linux/init/do_mounts.c	2021-07-08 08:02:12.046861001 -0400
@@ -338,14 +338,22 @@ __setup("rootflags=", root_data_setup);
 __setup("rootfstype=", fs_names_setup);
 __setup("rootdelay=", root_delay_setup);
 
-static void __init split_fs_names(char *page, char *names)
+static int __init split_fs_names(char *page, char *names)
 {
-	strcpy(page, root_fs_names);
-	while (*page++) {
-		if (page[-1] == ',')
-			page[-1] = '\0';
+	int count = 0;
+	char *p = page;
+
+	strcpy(p, root_fs_names);
+	while (*p++) {
+		if (p[-1] == ',')
+			p[-1] = '\0';
 	}
-	*page = '\0';
+	*p = '\0';
+
+	for (p = page; *p; p += strlen(p)+1)
+		count++;
+
+	return count;
 }
 
 static int __init do_mount_root(const char *name, const char *fs,
@@ -391,15 +399,16 @@ void __init mount_block_root(char *name,
 	char *fs_names = page_address(page);
 	char *p;
 	char b[BDEVNAME_SIZE];
+	int num_fs, i;
 
 	scnprintf(b, BDEVNAME_SIZE, "unknown-block(%u,%u)",
 		  MAJOR(ROOT_DEV), MINOR(ROOT_DEV));
 	if (root_fs_names)
-		split_fs_names(fs_names, root_fs_names);
+		num_fs = split_fs_names(fs_names, root_fs_names);
 	else
-		list_bdev_fs_names(fs_names, PAGE_SIZE);
+		num_fs = list_bdev_fs_names(fs_names, PAGE_SIZE);
 retry:
-	for (p = fs_names; *p; p += strlen(p)+1) {
+	for (p = fs_names, i = 0; i < num_fs; p += strlen(p)+1, i++) {
 		int err = do_mount_root(name, p, flags, root_mount_data);
 		switch (err) {
 			case 0:
@@ -432,7 +441,7 @@ retry:
 	printk("List of all partitions:\n");
 	printk_all_partitions();
 	printk("No filesystem could mount root, tried: ");
-	for (p = fs_names; *p; p += strlen(p)+1)
+	for (p = fs_names, i = 0; i < num_fs; p += strlen(p)+1, i++)
 		printk(" %s", p);
 	printk("\n");
 	panic("VFS: Unable to mount root fs on %s", b);
@@ -533,13 +542,15 @@ static int __init mount_nodev_root(void)
 {
 	char *fs_names, *fstype;
 	int err = -EINVAL;
+	int num_fs, i;
 
 	fs_names = (void *)__get_free_page(GFP_KERNEL);
 	if (!fs_names)
 		return -EINVAL;
-	split_fs_names(fs_names, root_fs_names);
+	num_fs = split_fs_names(fs_names, root_fs_names);
 
-	for (fstype = fs_names; *fstype; fstype += strlen(fstype) + 1) {
+	for (fstype = fs_names, i = 0; i < num_fs;
+	     fstype += strlen(fstype) + 1, i++) {
 		if (!fs_is_nodev(fstype))
 			continue;
 		err = do_mount_root(root_device_name, fstype, root_mountflags,


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Virtio-fs] [PATCH 3/2] fs: simplify get_filesystem_list / get_all_fs_names
  2021-07-08 12:59             ` Vivek Goyal
@ 2021-07-12 18:21               ` Vivek Goyal
  2021-07-13  5:40                 ` Christoph Hellwig
  0 siblings, 1 reply; 21+ messages in thread
From: Vivek Goyal @ 2021-07-12 18:21 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Stefan Hajnoczi, viro, linux-fsdevel, virtio-fs, linux-kernel

On Thu, Jul 08, 2021 at 08:59:36AM -0400, Vivek Goyal wrote:
> On Wed, Jul 07, 2021 at 05:06:36PM -0400, Vivek Goyal wrote:
> > On Wed, Jul 07, 2021 at 05:04:04PM -0400, Vivek Goyal wrote:
> > > On Wed, Jun 30, 2021 at 07:36:01AM +0200, Christoph Hellwig wrote:
> > > > On Tue, Jun 29, 2021 at 04:50:48PM -0400, Vivek Goyal wrote:
> > > > > May be we should modify mount_block_root() code so that it does not
> > > > > require that extra "\0". Possibly zero initialize page and that should
> > > > > make sure list_bdev_fs_names() does not have to worry about it.
> > > > > 
> > > > > It is possible that a page gets full from the list of filesystems, and
> > > > > last byte on page is terminating null. In that case just zeroing page
> > > > > will not help. We can keep track of some sort of end pointer and make
> > > > > sure we are not searching beyond that for valid filesystem types.
> > > > > 
> > > > > end = page + PAGE_SIZE - 1;
> > > > > 
> > > > > mount_block_root()
> > > > > {
> > > > > 	for (p = fs_names; p < end && *p; p += strlen(p)+1) {
> > > > > 	}
> > > > > }
> > > > 
> > > > Maybe.  To honest I'd prefer to not even touch this unrelated code given
> > > > how full of landmines it is :)
> > > 
> > > Hi Christoph,
> > > 
> > > How about following patch. This applies on top of your patches. I noticed
> > > that Al had suggested to return number of filesystems from helper
> > > functions. I just did that and used that to iterate in the loop.
> > > 
> > > I tested it with a virtual block device (root=/dev/vda1) and it works.
> > > I also filled page with garbage after allocation to make sure natually
> > > occurring null is not there in the middle of page to terminate string.
> > > 
> > > If you like it, can you please incorporate it in your patches.
> > 
> > I noticed this will break with "root_fs_names=". Sorry, will have to
> > fix split_fs_names() as well. Will do.
> 
> Hi Christoph,
> 
> I fixed it. Now both split_fs_names() and list_bdev_fs_names() return
> count of fstype strings it placed in the buffer. And callers now
> use that count to loop (instead of relying on extra null byte at the
> end of the buffer).
> 
> I tested both nodev (virtiofs, 9p) and block dev rootfs (ext4) and
> it works for me. Please have a look.

Hi Christoph,

In case you are finding it hard to spend some time on these patches, I 
can take those patches, merge my changes and repost them.

Vivek

> 
> 
> ---
>  fs/filesystems.c   |    5 ++++-
>  include/linux/fs.h |    2 +-
>  init/do_mounts.c   |   35 +++++++++++++++++++++++------------
>  3 files changed, 28 insertions(+), 14 deletions(-)
> 
> Index: redhat-linux/fs/filesystems.c
> ===================================================================
> --- redhat-linux.orig/fs/filesystems.c	2021-07-08 08:02:09.772766786 -0400
> +++ redhat-linux/fs/filesystems.c	2021-07-08 08:02:12.044860918 -0400
> @@ -209,10 +209,11 @@ SYSCALL_DEFINE3(sysfs, int, option, unsi
>  }
>  #endif
>  
> -void __init list_bdev_fs_names(char *buf, size_t size)
> +int __init list_bdev_fs_names(char *buf, size_t size)
>  {
>  	struct file_system_type *p;
>  	size_t len;
> +	int count = 0;
>  
>  	read_lock(&file_systems_lock);
>  	for (p = file_systems; p; p = p->next) {
> @@ -226,8 +227,10 @@ void __init list_bdev_fs_names(char *buf
>  		memcpy(buf, p->name, len);
>  		buf += len;
>  		size -= len;
> +		count++;
>  	}
>  	read_unlock(&file_systems_lock);
> +	return count;
>  }
>  
>  #ifdef CONFIG_PROC_FS
> Index: redhat-linux/include/linux/fs.h
> ===================================================================
> --- redhat-linux.orig/include/linux/fs.h	2021-07-08 08:02:09.774766869 -0400
> +++ redhat-linux/include/linux/fs.h	2021-07-08 08:02:12.046861001 -0400
> @@ -3622,7 +3622,7 @@ int proc_nr_dentry(struct ctl_table *tab
>  		  void *buffer, size_t *lenp, loff_t *ppos);
>  int proc_nr_inodes(struct ctl_table *table, int write,
>  		   void *buffer, size_t *lenp, loff_t *ppos);
> -void __init list_bdev_fs_names(char *buf, size_t size);
> +int __init list_bdev_fs_names(char *buf, size_t size);
>  
>  #define __FMODE_EXEC		((__force int) FMODE_EXEC)
>  #define __FMODE_NONOTIFY	((__force int) FMODE_NONOTIFY)
> Index: redhat-linux/init/do_mounts.c
> ===================================================================
> --- redhat-linux.orig/init/do_mounts.c	2021-07-08 08:02:09.774766869 -0400
> +++ redhat-linux/init/do_mounts.c	2021-07-08 08:02:12.046861001 -0400
> @@ -338,14 +338,22 @@ __setup("rootflags=", root_data_setup);
>  __setup("rootfstype=", fs_names_setup);
>  __setup("rootdelay=", root_delay_setup);
>  
> -static void __init split_fs_names(char *page, char *names)
> +static int __init split_fs_names(char *page, char *names)
>  {
> -	strcpy(page, root_fs_names);
> -	while (*page++) {
> -		if (page[-1] == ',')
> -			page[-1] = '\0';
> +	int count = 0;
> +	char *p = page;
> +
> +	strcpy(p, root_fs_names);
> +	while (*p++) {
> +		if (p[-1] == ',')
> +			p[-1] = '\0';
>  	}
> -	*page = '\0';
> +	*p = '\0';
> +
> +	for (p = page; *p; p += strlen(p)+1)
> +		count++;
> +
> +	return count;
>  }
>  
>  static int __init do_mount_root(const char *name, const char *fs,
> @@ -391,15 +399,16 @@ void __init mount_block_root(char *name,
>  	char *fs_names = page_address(page);
>  	char *p;
>  	char b[BDEVNAME_SIZE];
> +	int num_fs, i;
>  
>  	scnprintf(b, BDEVNAME_SIZE, "unknown-block(%u,%u)",
>  		  MAJOR(ROOT_DEV), MINOR(ROOT_DEV));
>  	if (root_fs_names)
> -		split_fs_names(fs_names, root_fs_names);
> +		num_fs = split_fs_names(fs_names, root_fs_names);
>  	else
> -		list_bdev_fs_names(fs_names, PAGE_SIZE);
> +		num_fs = list_bdev_fs_names(fs_names, PAGE_SIZE);
>  retry:
> -	for (p = fs_names; *p; p += strlen(p)+1) {
> +	for (p = fs_names, i = 0; i < num_fs; p += strlen(p)+1, i++) {
>  		int err = do_mount_root(name, p, flags, root_mount_data);
>  		switch (err) {
>  			case 0:
> @@ -432,7 +441,7 @@ retry:
>  	printk("List of all partitions:\n");
>  	printk_all_partitions();
>  	printk("No filesystem could mount root, tried: ");
> -	for (p = fs_names; *p; p += strlen(p)+1)
> +	for (p = fs_names, i = 0; i < num_fs; p += strlen(p)+1, i++)
>  		printk(" %s", p);
>  	printk("\n");
>  	panic("VFS: Unable to mount root fs on %s", b);
> @@ -533,13 +542,15 @@ static int __init mount_nodev_root(void)
>  {
>  	char *fs_names, *fstype;
>  	int err = -EINVAL;
> +	int num_fs, i;
>  
>  	fs_names = (void *)__get_free_page(GFP_KERNEL);
>  	if (!fs_names)
>  		return -EINVAL;
> -	split_fs_names(fs_names, root_fs_names);
> +	num_fs = split_fs_names(fs_names, root_fs_names);
>  
> -	for (fstype = fs_names; *fstype; fstype += strlen(fstype) + 1) {
> +	for (fstype = fs_names, i = 0; i < num_fs;
> +	     fstype += strlen(fstype) + 1, i++) {
>  		if (!fs_is_nodev(fstype))
>  			continue;
>  		err = do_mount_root(root_device_name, fstype, root_mountflags,


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Virtio-fs] [PATCH 3/2] fs: simplify get_filesystem_list / get_all_fs_names
  2021-07-12 18:21               ` Vivek Goyal
@ 2021-07-13  5:40                 ` Christoph Hellwig
  0 siblings, 0 replies; 21+ messages in thread
From: Christoph Hellwig @ 2021-07-13  5:40 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Christoph Hellwig, Stefan Hajnoczi, viro, linux-fsdevel,
	virtio-fs, linux-kernel

On Mon, Jul 12, 2021 at 02:21:30PM -0400, Vivek Goyal wrote:
> In case you are finding it hard to spend some time on these patches, I 
> can take those patches, merge my changes and repost them.

Yes, please do.  Thanks a lot!

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2021-07-13  5:40 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-21  6:26 support booting of arbitrary non-blockdevice file systems v2 Christoph Hellwig
2021-06-21  6:26 ` [PATCH 1/2] init: split get_fs_names Christoph Hellwig
2021-06-21 14:46   ` Al Viro
2021-06-21 14:51     ` Al Viro
2021-06-21 14:53     ` Christoph Hellwig
2021-06-21 14:59       ` Al Viro
2021-06-21 15:09   ` Matthew Wilcox
2021-06-21 15:22     ` Christoph Hellwig
2021-06-21  6:26 ` [PATCH 2/2] init: allow mounting arbitrary non-blockdevice filesystems as root Christoph Hellwig
2021-06-21 13:31 ` [Virtio-fs] support booting of arbitrary non-blockdevice file systems v2 Stefan Hajnoczi
2021-06-21 14:35 ` Vivek Goyal
2021-06-22  8:12 ` [PATCH 3/2] fs: simplify get_filesystem_list / get_all_fs_names Christoph Hellwig
2021-06-22  8:36   ` [Virtio-fs] " Stefan Hajnoczi
2021-06-29 20:50     ` Vivek Goyal
2021-06-30  5:36       ` Christoph Hellwig
2021-06-30 17:33         ` Vivek Goyal
2021-07-07 21:04         ` Vivek Goyal
2021-07-07 21:06           ` Vivek Goyal
2021-07-08 12:59             ` Vivek Goyal
2021-07-12 18:21               ` Vivek Goyal
2021-07-13  5:40                 ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).