Linux-mm Archive on lore.kernel.org
* [RFC v3 1/2] kernel/sysctl: support setting sysctl parameters from kernel command line
@ 2020-03-26 18:16 Vlastimil Babka
  2020-03-26 18:16 ` [RFC v3 2/2] kernel/sysctl: support handling command line aliases Vlastimil Babka
  2020-03-26 20:24 ` [RFC v3 1/2] kernel/sysctl: support setting sysctl parameters from kernel command line Kees Cook
  0 siblings, 2 replies; 7+ messages in thread
From: Vlastimil Babka @ 2020-03-26 18:16 UTC (permalink / raw)
  To: Luis Chamberlain, Kees Cook, Iurii Zaikin
  Cc: linux-kernel, linux-api, linux-mm, Ivan Teterevkov, Michal Hocko,
	David Rientjes, Matthew Wilcox, Eric W . Biederman,
	Guilherme G . Piccoli, Vlastimil Babka

A recently proposed patch to add a vm_swappiness command line parameter in
addition to the existing sysctl [1] made me wonder why we don't have general
support for passing sysctl parameters via the command line. Googling found only
somebody else wondering the same [2], but I haven't found any prior discussion
with reasons why not to do this.

Setting the vm_swappiness issue aside (the underlying issue might be solved in
a different way), a quick search of kernel-parameters.txt shows there are already
some that exist as both sysctl and kernel parameter - hung_task_panic,
nmi_watchdog, numa_zonelist_order, traceoff_on_warning. A general mechanism
would remove the need to add more of those one-offs and might be handy in
situations where configuration by e.g. /etc/sysctl.d/ is impractical.

Hence, this patch adds a new parse_args() pass that looks for parameters
prefixed by 'sysctl.' and tries to interpret them as writes to the
corresponding sys/ files using a temporary in-kernel procfs mount. This
mechanism was suggested by Eric W. Biederman [3], as it handles all dynamically
registered sysctl tables. Errors due to e.g. invalid parameter name or value
are reported in the kernel log.
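
As an illustration only (a userspace sketch, not the kernel code in this
patch), the prefix handling described above could look roughly like the
following. The helper name sysctl_strip_prefix() is hypothetical; it assumes
that anything not starting with "sysctl" plus a '.' or '/' separator is simply
left to other parameter handlers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): return the sysctl name that
 * follows a "sysctl." or "sysctl/" prefix, or NULL if the parameter is not
 * addressed to the sysctl pass at all.
 */
static const char *sysctl_strip_prefix(const char *param)
{
	const size_t prefix_len = sizeof("sysctl") - 1;

	assert(param != NULL);

	if (strncmp(param, "sysctl", prefix_len) != 0)
		return NULL;

	param += prefix_len;

	/* Only '.' or '/' may separate the prefix from the name. */
	if (param[0] != '.' && param[0] != '/')
		return NULL;

	return param + 1;
}
```

With this sketch, "sysctl.vm.swappiness" yields "vm.swappiness", while an
unrelated parameter such as "quiet" yields NULL and is ignored.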

The processing is hooked right before the init process is loaded, as some
handlers might be more complicated than simple setters and might need some
subsystems to be initialized. At the point where the init process can be started
and eventually execute a process writing to /proc/sys/, it should also be fine
to do that from the kernel.

Sysctls registered later at module load time are not set by this mechanism -
it's expected that in such scenarios, setting sysctl values from userspace is
practical enough.

[1] https://lore.kernel.org/r/BL0PR02MB560167492CA4094C91589930E9FC0@BL0PR02MB5601.namprd02.prod.outlook.com/
[2] https://unix.stackexchange.com/questions/558802/how-to-set-sysctl-using-kernel-command-line-parameter
[3] https://lore.kernel.org/r/87bloj2skm.fsf@x220.int.ebiederm.org/

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
Changes in v3:
- use temporary procfs mount as Eric suggested. Seems to be the better option
  after all. Naming wise it simply converts all . to / - according to strace the
  sysctl tool seems to be doing the same.

Given the major change, I'm sending another RFC. If this approach is ok, then
it probably needs just some tweaks to the various error prints, and then
converting the rest of the existing one-off aliases (if I come up with an idea
how to find them all). Thanks for all the feedback so far.

 .../admin-guide/kernel-parameters.txt         |  9 ++
 fs/proc/proc_sysctl.c                         | 90 +++++++++++++++++++
 include/linux/sysctl.h                        |  4 +
 init/main.c                                   |  2 +
 4 files changed, 105 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index c07815d230bc..0c7e032e7c2e 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4793,6 +4793,15 @@
 
 	switches=	[HW,M68k]
 
+	sysctl.*=	[KNL]
+			Set a sysctl parameter, right before loading the init
+			process, as if the value was written to the respective
+			/proc/sys/... file. Unrecognized parameters and invalid
+			values are reported in the kernel log. Sysctls
+			registered later by a loaded module cannot be set this
+			way.
+			Example: sysctl.vm.swappiness=40
+
 	sysfs.deprecated=0|1 [KNL]
 			Enable/disable old style sysfs layout for old udev
 			on older distributions. When this option is enabled
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index c75bb4632ed1..8ee3273e4540 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -14,6 +14,7 @@
 #include <linux/mm.h>
 #include <linux/module.h>
 #include <linux/bpf-cgroup.h>
+#include <linux/mount.h>
 #include "internal.h"
 
 static const struct dentry_operations proc_sys_dentry_operations;
@@ -1725,3 +1726,92 @@ int __init proc_sys_init(void)
 
 	return sysctl_init();
 }
+
+struct vfsmount *proc_mnt = NULL;
+
+/* Set sysctl value passed on kernel command line. */
+static int process_sysctl_arg(char *param, char *val,
+			       const char *unused, void *arg)
+{
+	char *path;
+	struct file_system_type *proc_fs_type;
+	struct file *file;
+	int len;
+	int err;
+	loff_t pos = 0;
+	ssize_t wret;
+
+	if (strncmp(param, "sysctl", sizeof("sysctl") - 1))
+		return 0;
+
+	param += sizeof("sysctl") - 1;
+
+	if (param[0] != '/' && param[0] != '.')
+		return 0;
+
+	param++;
+
+	if (!proc_mnt) {
+		proc_fs_type = get_fs_type("proc");
+		if (!proc_fs_type) {
+			pr_err("Failed to mount procfs to set sysctl from command line");
+			return 0;
+		}
+		proc_mnt = kern_mount(proc_fs_type);
+		put_filesystem(proc_fs_type);
+		if (IS_ERR(proc_mnt)) {
+			pr_err("Failed to mount procfs to set sysctl from command line");
+			proc_mnt = NULL;
+			return 0;
+		}
+	}
+
+	len = 4 + strlen(param) + 1;
+	path = kmalloc(len, GFP_KERNEL);
+	if (!path)
+		panic("%s: Failed to allocate %d bytes\n", __func__, len);
+
+	strcpy(path, "sys/");
+	strcat(path, param);
+	strreplace(path, '.', '/');
+
+	file = file_open_root(proc_mnt->mnt_root, proc_mnt, path, O_WRONLY, 0);
+	if (IS_ERR(file)) {
+		err = PTR_ERR(file);
+		pr_err("Error %d opening proc file %s to set sysctl parameter '%s=%s'",
+			err, path, param, val);
+		goto out;
+	}
+	len = strlen(val);
+	wret = kernel_write(file, val, len, &pos);
+	if (wret < 0) {
+		err = wret;
+		pr_err("Error %d writing to proc file %s to set sysctl parameter '%s=%s'",
+			err, path, param, val);
+	} else if (wret != len) {
+		pr_err("Wrote only %zd bytes of %d writing to proc file %s to set sysctl parameter '%s=%s'",
+			wret, len, path, param, val);
+	}
+
+	filp_close(file, NULL);
+out:
+	kfree(path);
+	return 0;
+}
+
+void do_sysctl_args(void)
+{
+	char *command_line;
+
+	command_line = kstrdup(saved_command_line, GFP_KERNEL);
+	if (!command_line)
+		panic("%s: Failed to allocate copy of command line\n", __func__);
+
+	parse_args("Setting sysctl args", command_line,
+		   NULL, 0, -1, -1, NULL, process_sysctl_arg);
+
+	if (proc_mnt)
+		kern_unmount(proc_mnt);
+
+	kfree(command_line);
+}
diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 02fa84493f23..5f3f2a00d75f 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -206,6 +206,7 @@ struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
 void unregister_sysctl_table(struct ctl_table_header * table);
 
 extern int sysctl_init(void);
+void do_sysctl_args(void);
 
 extern struct ctl_table sysctl_mount_point[];
 
@@ -236,6 +237,9 @@ static inline void setup_sysctl_set(struct ctl_table_set *p,
 {
 }
 
+static inline void do_sysctl_args(void)
+{
+}
 #endif /* CONFIG_SYSCTL */
 
 int sysctl_max_threads(struct ctl_table *table, int write,
diff --git a/init/main.c b/init/main.c
index ee4947af823f..a91ea166a731 100644
--- a/init/main.c
+++ b/init/main.c
@@ -1367,6 +1367,8 @@ static int __ref kernel_init(void *unused)
 
 	rcu_end_inkernel_boot();
 
+	do_sysctl_args();
+
 	if (ramdisk_execute_command) {
 		ret = run_init_process(ramdisk_execute_command);
 		if (!ret)
-- 
2.25.1




* [RFC v3 2/2] kernel/sysctl: support handling command line aliases
  2020-03-26 18:16 [RFC v3 1/2] kernel/sysctl: support setting sysctl parameters from kernel command line Vlastimil Babka
@ 2020-03-26 18:16 ` Vlastimil Babka
  2020-03-26 20:34   ` Kees Cook
  2020-03-26 20:24 ` [RFC v3 1/2] kernel/sysctl: support setting sysctl parameters from kernel command line Kees Cook
  1 sibling, 1 reply; 7+ messages in thread
From: Vlastimil Babka @ 2020-03-26 18:16 UTC (permalink / raw)
  To: Luis Chamberlain, Kees Cook, Iurii Zaikin
  Cc: linux-kernel, linux-api, linux-mm, Ivan Teterevkov, Michal Hocko,
	David Rientjes, Matthew Wilcox, Eric W . Biederman,
	Guilherme G . Piccoli, Vlastimil Babka

We can now handle sysctl parameters on the kernel command line, but historically
some parameters introduced their own command line equivalent, which we don't
want to remove for compatibility reasons. We can however convert them to the
generic infrastructure with a table translating the legacy command line
parameters to their sysctl names, and remove the one-off param handlers.

This patch adds the support and makes the first conversion to demonstrate it,
on the (deprecated) numa_zonelist_order parameter.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
Changes in v3:
- constify some things according to Kees
- expand the comment of sysctl_aliases to note on different timings

 fs/proc/proc_sysctl.c | 48 ++++++++++++++++++++++++++++++++++++-------
 mm/page_alloc.c       |  9 --------
 2 files changed, 41 insertions(+), 16 deletions(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 8ee3273e4540..3a861e0a7c7e 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -1729,6 +1729,37 @@ int __init proc_sys_init(void)
 
 struct vfsmount *proc_mnt = NULL;
 
+struct sysctl_alias {
+	const char *kernel_param;
+	const char *sysctl_param;
+};
+
+/*
+ * Historically some settings had both sysctl and a command line parameter.
+ * With the generic sysctl. parameter support, we can handle them at a single
+ * place and only keep the historical name for compatibility. This is not meant
+ * to add brand new aliases. When adding existing aliases, consider whether
+ * the possibly different moment of changing the value (e.g. from early_param
+ * to the moment do_sysctl_args() is called) is an issue for the specific
+ * parameter.
+ */
+static const struct sysctl_alias sysctl_aliases[] = {
+	{"numa_zonelist_order",		"vm.numa_zonelist_order" },
+	{ }
+};
+
+const char *sysctl_find_alias(char *param)
+{
+	const struct sysctl_alias *alias;
+
+	for (alias = &sysctl_aliases[0]; alias->kernel_param != NULL; alias++) {
+		if (strcmp(alias->kernel_param, param) == 0)
+			return alias->sysctl_param;
+	}
+
+	return NULL;
+}
+
 /* Set sysctl value passed on kernel command line. */
 static int process_sysctl_arg(char *param, char *val,
 			       const char *unused, void *arg)
@@ -1741,15 +1772,18 @@ static int process_sysctl_arg(char *param, char *val,
 	loff_t pos = 0;
 	ssize_t wret;
 
-	if (strncmp(param, "sysctl", sizeof("sysctl") - 1))
-		return 0;
-
-	param += sizeof("sysctl") - 1;
+	if (strncmp(param, "sysctl", sizeof("sysctl") - 1) == 0) {
+		param += sizeof("sysctl") - 1;
 
-	if (param[0] != '/' && param[0] != '.')
-		return 0;
+		if (param[0] != '/' && param[0] != '.')
+			return 0;
 
-	param++;
+		param++;
+	} else {
+		param = (char *) sysctl_find_alias(param);
+		if (!param)
+			return 0;
+	}
 
 	if (!proc_mnt) {
 		proc_fs_type = get_fs_type("proc");
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3c4eb750a199..de7a134b1b8a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5460,15 +5460,6 @@ static int __parse_numa_zonelist_order(char *s)
 	return 0;
 }
 
-static __init int setup_numa_zonelist_order(char *s)
-{
-	if (!s)
-		return 0;
-
-	return __parse_numa_zonelist_order(s);
-}
-early_param("numa_zonelist_order", setup_numa_zonelist_order);
-
 char numa_zonelist_order[] = "Node";
 
 /*
-- 
2.25.1




* Re: [RFC v3 1/2] kernel/sysctl: support setting sysctl parameters from kernel command line
  2020-03-26 18:16 [RFC v3 1/2] kernel/sysctl: support setting sysctl parameters from kernel command line Vlastimil Babka
  2020-03-26 18:16 ` [RFC v3 2/2] kernel/sysctl: support handling command line aliases Vlastimil Babka
@ 2020-03-26 20:24 ` Kees Cook
  2020-03-26 22:08   ` Vlastimil Babka
  1 sibling, 1 reply; 7+ messages in thread
From: Kees Cook @ 2020-03-26 20:24 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Luis Chamberlain, Iurii Zaikin, linux-kernel, linux-api,
	linux-mm, Ivan Teterevkov, Michal Hocko, David Rientjes,
	Matthew Wilcox, Eric W . Biederman, Guilherme G . Piccoli

On Thu, Mar 26, 2020 at 07:16:05PM +0100, Vlastimil Babka wrote:
> A recently proposed patch to add vm_swappiness command line parameter in
> addition to existing sysctl [1] made me wonder why we don't have a general
> support for passing sysctl parameters via command line. Googling found only
> somebody else wondering the same [2], but I haven't found any prior discussion
> with reasons why not to do this.
> 
> Setting the vm_swappiness issue aside (the underlying issue might be solved in
> a different way), quick search of kernel-parameters.txt shows there are already
> some that exist as both sysctl and kernel parameter - hung_task_panic,
> nmi_watchdog, numa_zonelist_order, traceoff_on_warning. A general mechanism
> would remove the need to add more of those one-offs and might be handy in
> situations where configuration by e.g. /etc/sysctl.d/ is impractical.
> 
> Hence, this patch adds a new parse_args() pass that looks for parameters
> prefixed by 'sysctl.' and tries to interpret them as writes to the
> corresponding sys/ files using a temporary in-kernel procfs mount. This
> mechanism was suggested by Eric W. Biederman [3], as it handles all dynamically
> registered sysctl tables. Errors due to e.g. invalid parameter name or value
> are reported in the kernel log.
> 
> The processing is hooked right before the init process is loaded, as some
> handlers might be more complicated than simple setters and might need some
> subsystems to be initialized. At the moment the init process can be started and
> eventually execute a process writing to /proc/sys/ then it should be also fine
> to do that from the kernel.
> 
> Sysctls registered later on module load time are not set by this mechanism -
> it's expected that in such scenarios, setting sysctl values from userspace is
> practical enough.
> 
> [1] https://lore.kernel.org/r/BL0PR02MB560167492CA4094C91589930E9FC0@BL0PR02MB5601.namprd02.prod.outlook.com/
> [2] https://unix.stackexchange.com/questions/558802/how-to-set-sysctl-using-kernel-command-line-parameter
> [3] https://lore.kernel.org/r/87bloj2skm.fsf@x220.int.ebiederm.org/
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> Changes in v3:
> - use temporary procfs mount as Eric suggested. Seems to be the better option
>   after all. Naming wise it simply converts all . to / - according to strace the
>   sysctl tool seems to be doing the same.
> 
> Since the major change, I'm sending another RFC. If this approach is ok, then
> it probably needs just some tweaks to the various error prints, and then
> converting the rest of existing one-off aliases (if I come up with an idea how
> to find them all). Thanks for all the feedback so far.

Yeah, I think you can drop "RFC" from this in the next version -- you're
well into getting this finalized IMO.

> 
>  .../admin-guide/kernel-parameters.txt         |  9 ++
>  fs/proc/proc_sysctl.c                         | 90 +++++++++++++++++++
>  include/linux/sysctl.h                        |  4 +
>  init/main.c                                   |  2 +
>  4 files changed, 105 insertions(+)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index c07815d230bc..0c7e032e7c2e 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4793,6 +4793,15 @@
>  
>  	switches=	[HW,M68k]
>  
> +	sysctl.*=	[KNL]
> +			Set a sysctl parameter, right before loading the init
> +			process, as if the value was written to the respective
> +			/proc/sys/... file. Unrecognized parameters and invalid
> +			values are reported in the kernel log. Sysctls
> +			registered later by a loaded module cannot be set this
> +			way.

Maybe add: "Both '.' and '/' are recognized as separators."

> +			Example: sysctl.vm.swappiness=40
> +
>  	sysfs.deprecated=0|1 [KNL]
>  			Enable/disable old style sysfs layout for old udev
>  			on older distributions. When this option is enabled
> diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
> index c75bb4632ed1..8ee3273e4540 100644
> --- a/fs/proc/proc_sysctl.c
> +++ b/fs/proc/proc_sysctl.c
> @@ -14,6 +14,7 @@
>  #include <linux/mm.h>
>  #include <linux/module.h>
>  #include <linux/bpf-cgroup.h>
> +#include <linux/mount.h>
>  #include "internal.h"
>  
>  static const struct dentry_operations proc_sys_dentry_operations;
> @@ -1725,3 +1726,92 @@ int __init proc_sys_init(void)
>  
>  	return sysctl_init();
>  }
> +
> +struct vfsmount *proc_mnt = NULL;

Er, I had a bunch of comments about how this should be declared static
etc, but decided on a different suggestion entirely. See below...

> +
> +/* Set sysctl value passed on kernel command line. */
> +static int process_sysctl_arg(char *param, char *val,
> +			       const char *unused, void *arg)
> +{
> +	char *path;
> +	struct file_system_type *proc_fs_type;
> +	struct file *file;
> +	int len;
> +	int err;
> +	loff_t pos = 0;
> +	ssize_t wret;
> +
> +	if (strncmp(param, "sysctl", sizeof("sysctl") - 1))
> +		return 0;
> +
> +	param += sizeof("sysctl") - 1;
> +
> +	if (param[0] != '/' && param[0] != '.')
> +		return 0;
> +
> +	param++;
> +
> +	if (!proc_mnt) {
> +		proc_fs_type = get_fs_type("proc");
> +		if (!proc_fs_type) {
> +			pr_err("Failed to mount procfs to set sysctl from command line");
> +			return 0;
> +		}
> +		proc_mnt = kern_mount(proc_fs_type);
> +		put_filesystem(proc_fs_type);
> +		if (IS_ERR(proc_mnt)) {
> +			pr_err("Failed to mount procfs to set sysctl from command line");
> +			proc_mnt = NULL;
> +			return 0;
> +		}
> +	}
> +
> +	len = 4 + strlen(param) + 1;
> +	path = kmalloc(len, GFP_KERNEL);
> +	if (!path)
> +		panic("%s: Failed to allocate %d bytes t\n", __func__, len);
> +
> +	strcpy(path, "sys/");
> +	strcat(path, param);
> +	strreplace(path, '.', '/');

You can do the replacement against the param directly, and also avoid
all the open-coded string manipulations:

	strreplace(param, '.', '/');
	path = kasprintf(GFP_KERNEL, "sys/%s", param);
	if (!path)
		panic("%s: Failed to allocate path for %s\n", __func__, param);
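
For readers following along outside the kernel tree, here is a rough userspace
stand-in for that strreplace() + kasprintf() pair. It is only a sketch: the
function name build_sysctl_path() is made up for illustration, and snprintf()
plus malloc() stand in for kasprintf(). Like the suggestion, it rewrites the
parameter string in place.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Userspace stand-in for the suggested strreplace() + kasprintf() pair.
 * build_sysctl_path() is a hypothetical name. Note that it modifies
 * 'param' in place, as the suggestion does.
 * Returns a malloc()ed "sys/<name>" string, or NULL on failure.
 */
static char *build_sysctl_path(char *param)
{
	char *p, *path;
	int len;

	assert(param != NULL);

	/* Equivalent of strreplace(param, '.', '/'). */
	for (p = param; *p != '\0'; p++)
		if (*p == '.')
			*p = '/';

	/* Equivalent of kasprintf(): size the buffer, then format into it. */
	len = snprintf(NULL, 0, "sys/%s", param);
	if (len < 0)
		return NULL;

	path = malloc((size_t)len + 1);
	if (!path)
		return NULL;

	snprintf(path, (size_t)len + 1, "sys/%s", param);
	return path;
}
```

The caller owns the returned buffer and must free() it. Because the input is
rewritten, any later error message that prints it will show '/' separators.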

> +
> +	file = file_open_root(proc_mnt->mnt_root, proc_mnt, path, O_WRONLY, 0);
> +	if (IS_ERR(file)) {
> +		err = PTR_ERR(file);
> +		pr_err("Error %d opening proc file %s to set sysctl parameter '%s=%s'",
> +			err, path, param, val);
> +		goto out;
> +	}
> +	len = strlen(val);
> +	wret = kernel_write(file, val, len, &pos);
> +	if (wret < 0) {
> +		err = wret;
> +		pr_err("Error %d writing to proc file %s to set sysctl parameter '%s=%s'",
> +			err, path, param, val);
> +	} else if (wret != len) {
> +		pr_err("Wrote only %ld bytes of %d  writing to proc file %s to set sysctl parameter '%s=%s'",
> +			wret, len, path, param, val);
> +	}
> +
> +	filp_close(file, NULL);

Please check the return value of filp_close() and treat that as an error
for this function too.
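
A userspace analogue of that error handling, as a sketch only: open(), write()
and close() stand in for file_open_root(), kernel_write() and filp_close(), and
the function name write_value_checked() is hypothetical. Treating a short write
as -EIO is an assumption made for this example, not something the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical userspace analogue: write 'val' to 'path' and report the
 * first failure, including one from close(). Returns 0 on success or a
 * negative errno value.
 */
static int write_value_checked(const char *path, const char *val)
{
	size_t len;
	ssize_t wret;
	int fd, err = 0;

	assert(path != NULL && val != NULL);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	len = strlen(val);
	wret = write(fd, val, len);
	if (wret < 0)
		err = -errno;
	else if ((size_t)wret != len)
		err = -EIO;	/* short write: treated as an error in this sketch */

	/* Always close, but keep the first error if there was one. */
	if (close(fd) < 0 && err == 0)
		err = -errno;

	return err;
}
```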

> +out:
> +	kfree(path);
> +	return 0;
> +}
> +
> +void do_sysctl_args(void)
> +{
> +	char *command_line;
> +
> +	command_line = kstrdup(saved_command_line, GFP_KERNEL);
> +	if (!command_line)
> +		panic("%s: Failed to allocate copy of command line\n", __func__);
> +
> +	parse_args("Setting sysctl args", command_line,
> +		   NULL, 0, -1, -1, NULL, process_sysctl_arg);
> +
> +	if (proc_mnt)
> +		kern_unmount(proc_mnt);

I don't recommend sharing allocation lifetimes between two functions
(process_sysctl_arg() allocs proc_mnt, and do_sysctl_args() frees it).
And since you have a scoped lifetime, why allocate it or have it as a
global at all? It can be stack-allocated and passed to the handler:

void do_sysctl_args(void)
{
	struct file_system_type *proc_fs_type;
	struct vfsmount *proc_mnt;
	char *command_line;

	proc_fs_type = get_fs_type("proc");
	if (!proc_fs_type) {
		pr_err("Failed to mount procfs to set sysctl from command line");
		return;
	}
	proc_mnt = kern_mount(proc_fs_type);
	put_filesystem(proc_fs_type);
	if (IS_ERR(proc_mnt)) {
		pr_err("Failed to mount procfs to set sysctl from command line");
		return;
	}

	command_line = kstrdup(saved_command_line, GFP_KERNEL);
	if (!command_line)
		panic("%s: Failed to allocate copy of command line\n",
			__func__);

	parse_args("Setting sysctl args", command_line,
		   NULL, 0, -1, -1, proc_mnt, process_sysctl_arg);

	kfree(command_line);
	kern_unmount(proc_mnt);
}

And then pull the mount from (the hilariously overloaded name) "arg":

static int process_sysctl_arg(char *param, char *val,
			       const char *unused, void *arg)
{
	struct vfsmount *proc_mnt = (struct vfsmount *)arg;
	char *path;

	if (!arg)
		...freak out...

	etc
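
The general pattern here, passing caller-owned state to a per-parameter
callback through an opaque pointer, can be sketched in plain C as below. All
names (struct sysctl_ctx, struct kv, for_each_param(), count_param()) are
hypothetical and only illustrate the shape; the counter stands in for the
vfsmount that the real handler would receive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context; in the suggestion above this is the vfsmount. */
struct sysctl_ctx {
	int handled;	/* number of parameters the handler accepted */
};

/* One "param=val" pair, already split. */
struct kv {
	const char *param;
	const char *val;
};

typedef int (*param_handler)(const char *param, const char *val, void *arg);

/* The handler recovers its context from the opaque 'arg' pointer. */
static int count_param(const char *param, const char *val, void *arg)
{
	struct sysctl_ctx *ctx = arg;

	(void)param;
	(void)val;

	if (!ctx)
		return -1;	/* no context: refuse to do anything */

	ctx->handled++;
	return 0;
}

/* Call 'fn' for every pair, stopping at the first non-zero return. */
static int for_each_param(const struct kv *params, size_t n,
			  param_handler fn, void *arg)
{
	size_t i;
	int ret;

	assert(params != NULL && fn != NULL);

	for (i = 0; i < n; i++) {
		ret = fn(params[i].param, params[i].val, arg);
		if (ret)
			return ret;
	}
	return 0;
}
```

The context lives on the caller's stack for exactly the duration of the
iteration, so setup and teardown stay in one function.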

> diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
> index 02fa84493f23..5f3f2a00d75f 100644
> --- a/include/linux/sysctl.h
> +++ b/include/linux/sysctl.h
> @@ -206,6 +206,7 @@ struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
>  void unregister_sysctl_table(struct ctl_table_header * table);
>  
>  extern int sysctl_init(void);
> +void do_sysctl_args(void);
>  
>  extern struct ctl_table sysctl_mount_point[];
>  
> @@ -236,6 +237,9 @@ static inline void setup_sysctl_set(struct ctl_table_set *p,
>  {
>  }
>  
> +void do_sysctl_args(void)
> +{
> +}
>  #endif /* CONFIG_SYSCTL */
>  
>  int sysctl_max_threads(struct ctl_table *table, int write,
> diff --git a/init/main.c b/init/main.c
> index ee4947af823f..a91ea166a731 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -1367,6 +1367,8 @@ static int __ref kernel_init(void *unused)
>  
>  	rcu_end_inkernel_boot();
>  
> +	do_sysctl_args();
> +
>  	if (ramdisk_execute_command) {
>  		ret = run_init_process(ramdisk_execute_command);
>  		if (!ret)
> -- 
> 2.25.1
> 

Looking good! I'm excited to see the next version. :)

-- 
Kees Cook



* Re: [RFC v3 2/2] kernel/sysctl: support handling command line aliases
  2020-03-26 18:16 ` [RFC v3 2/2] kernel/sysctl: support handling command line aliases Vlastimil Babka
@ 2020-03-26 20:34   ` Kees Cook
  2020-03-26 21:29     ` Christian Brauner
  0 siblings, 1 reply; 7+ messages in thread
From: Kees Cook @ 2020-03-26 20:34 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Luis Chamberlain, Iurii Zaikin, linux-kernel, linux-api,
	linux-mm, Ivan Teterevkov, Michal Hocko, David Rientjes,
	Matthew Wilcox, Eric W . Biederman, Guilherme G . Piccoli

On Thu, Mar 26, 2020 at 07:16:06PM +0100, Vlastimil Babka wrote:
> We can now handle sysctl parameters on kernel command line, but historically
> some parameters introduced their own command line equivalent, which we don't
> want to remove for compatibility reasons. We can however convert them to the
> generic infrastructure with a table translating the legacy command line
> parameters to their sysctl names, and removing the one-off param handlers.
> 
> This patch adds the support and makes the first conversion to demonstrate it,
> on the (deprecated) numa_zonelist_order parameter.
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> Changes in v3:
> - constify some things according to Kees
> - expand the comment of sysctl_aliases to note on different timings
> 
>  fs/proc/proc_sysctl.c | 48 ++++++++++++++++++++++++++++++++++++-------
>  mm/page_alloc.c       |  9 --------
>  2 files changed, 41 insertions(+), 16 deletions(-)
> 
> diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
> index 8ee3273e4540..3a861e0a7c7e 100644
> --- a/fs/proc/proc_sysctl.c
> +++ b/fs/proc/proc_sysctl.c
> @@ -1729,6 +1729,37 @@ int __init proc_sys_init(void)
>  
>  struct vfsmount *proc_mnt = NULL;
>  
> +struct sysctl_alias {
> +	const char *kernel_param;
> +	const char *sysctl_param;
> +};
> +
> +/*
> + * Historically some settings had both sysctl and a command line parameter.
> + * With the generic sysctl. parameter support, we can handle them at a single
> + * place and only keep the historical name for compatibility. This is not meant
> + * to add brand new aliases. When adding existing aliases, consider whether
> + * the possibly different moment of changing the value (e.g. from early_param
> + * to the moment do_sysctl_args() is called) is an issue for the specific
> + * parameter.
> + */
> +static const struct sysctl_alias sysctl_aliases[] = {
> +	{"numa_zonelist_order",		"vm.numa_zonelist_order" },
> +	{ }
> +};
> +
> +const char *sysctl_find_alias(char *param)

This should be "static" too.

> +{
> +	const struct sysctl_alias *alias;
> +
> +	for (alias = &sysctl_aliases[0]; alias->kernel_param != NULL; alias++) {
> +		if (strcmp(alias->kernel_param, param) == 0)
> +			return alias->sysctl_param;
> +	}
> +
> +	return NULL;
> +}
> +
>  /* Set sysctl value passed on kernel command line. */
>  static int process_sysctl_arg(char *param, char *val,
>  			       const char *unused, void *arg)
> @@ -1741,15 +1772,18 @@ static int process_sysctl_arg(char *param, char *val,
>  	loff_t pos = 0;
>  	ssize_t wret;
>  
> -	if (strncmp(param, "sysctl", sizeof("sysctl") - 1))
> -		return 0;
> -
> -	param += sizeof("sysctl") - 1;
> +	if (strncmp(param, "sysctl", sizeof("sysctl") - 1) == 0) {
> +		param += sizeof("sysctl") - 1;
>  
> -	if (param[0] != '/' && param[0] != '.')
> -		return 0;
> +		if (param[0] != '/' && param[0] != '.')
> +			return 0;
>  
> -	param++;
> +		param++;
> +	} else {
> +		param = (char *) sysctl_find_alias(param);
> +		if (!param)
> +			return 0;
> +	}
>  
>  	if (!proc_mnt) {
>  		proc_fs_type = get_fs_type("proc");
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3c4eb750a199..de7a134b1b8a 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5460,15 +5460,6 @@ static int __parse_numa_zonelist_order(char *s)
>  	return 0;
>  }
>  
> -static __init int setup_numa_zonelist_order(char *s)
> -{
> -	if (!s)
> -		return 0;
> -
> -	return __parse_numa_zonelist_order(s);
> -}
> -early_param("numa_zonelist_order", setup_numa_zonelist_order);
> -
>  char numa_zonelist_order[] = "Node";
>  
>  /*
> -- 
> 2.25.1
> 

Otherwise, yay, I love it! :)

-- 
Kees Cook



* Re: [RFC v3 2/2] kernel/sysctl: support handling command line aliases
  2020-03-26 20:34   ` Kees Cook
@ 2020-03-26 21:29     ` Christian Brauner
  0 siblings, 0 replies; 7+ messages in thread
From: Christian Brauner @ 2020-03-26 21:29 UTC (permalink / raw)
  To: Kees Cook
  Cc: Vlastimil Babka, Luis Chamberlain, Iurii Zaikin, linux-kernel,
	linux-api, linux-mm, Ivan Teterevkov, Michal Hocko,
	David Rientjes, Matthew Wilcox, Eric W . Biederman,
	Guilherme G . Piccoli

On Thu, Mar 26, 2020 at 01:34:08PM -0700, Kees Cook wrote:
> On Thu, Mar 26, 2020 at 07:16:06PM +0100, Vlastimil Babka wrote:
> > We can now handle sysctl parameters on kernel command line, but historically
> > some parameters introduced their own command line equivalent, which we don't
> > want to remove for compatibility reasons. We can however convert them to the
> > generic infrastructure with a table translating the legacy command line
> > parameters to their sysctl names, and removing the one-off param handlers.
> > 
> > This patch adds the support and makes the first conversion to demonstrate it,
> > on the (deprecated) numa_zonelist_order parameter.
> > 
> > Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> > ---
> > Changes in v3:
> > - constify some things according to Kees
> > - expand the comment of sysctl_aliases to note on different timings
> > 
> >  fs/proc/proc_sysctl.c | 48 ++++++++++++++++++++++++++++++++++++-------
> >  mm/page_alloc.c       |  9 --------
> >  2 files changed, 41 insertions(+), 16 deletions(-)
> > 
> > diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
> > index 8ee3273e4540..3a861e0a7c7e 100644
> > --- a/fs/proc/proc_sysctl.c
> > +++ b/fs/proc/proc_sysctl.c
> > @@ -1729,6 +1729,37 @@ int __init proc_sys_init(void)
> >  
> >  struct vfsmount *proc_mnt = NULL;
> >  
> > +struct sysctl_alias {
> > +	const char *kernel_param;
> > +	const char *sysctl_param;
> > +};
> > +
> > +/*
> > + * Historically some settings had both sysctl and a command line parameter.
> > + * With the generic sysctl. parameter support, we can handle them at a single
> > + * place and only keep the historical name for compatibility. This is not meant
> > + * to add brand new aliases. When adding existing aliases, consider whether
> > + * the possibly different moment of changing the value (e.g. from early_param
> > + * to the moment do_sysctl_args() is called) is an issue for the specific
> > + * parameter.
> > + */
> > +static const struct sysctl_alias sysctl_aliases[] = {
> > +	{"numa_zonelist_order",		"vm.numa_zonelist_order" },
> > +	{ }
> > +};
> > +
> > +const char *sysctl_find_alias(char *param)
> 
> This should be "static" too.
> 
> > +{
> > +	const struct sysctl_alias *alias;
> > +
> > +	for (alias = &sysctl_aliases[0]; alias->kernel_param != NULL; alias++) {
> > +		if (strcmp(alias->kernel_param, param) == 0)
> > +			return alias->sysctl_param;
> > +	}
> > +
> > +	return NULL;
> > +}
> > +
> >  /* Set sysctl value passed on kernel command line. */
> >  static int process_sysctl_arg(char *param, char *val,
> >  			       const char *unused, void *arg)
> > @@ -1741,15 +1772,18 @@ static int process_sysctl_arg(char *param, char *val,
> >  	loff_t pos = 0;
> >  	ssize_t wret;
> >  
> > -	if (strncmp(param, "sysctl", sizeof("sysctl") - 1))

Somewhat off-topic but in some projects I use macro-helpers to make that
more transparent. So this would become a constant expression:

#define STRLITERALLEN(x) (sizeof(""x"") - 1)
or
#define STRLEN(x) (sizeof(""x"") - 1)

But I guess that's a matter of style/taste.
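
A small self-contained sketch of how such a macro might be used for the prefix
check. The helper has_sysctl_prefix() is hypothetical and only there to
exercise the macro; the adjacent empty literals make the macro fail to compile
if it is given anything other than a string literal.

```c
#include <assert.h>
#include <string.h>

/*
 * Length of a string literal as a constant expression. The "" x "" form only
 * compiles when x is itself a string literal, which rejects plain pointers.
 */
#define STRLITERALLEN(x) (sizeof("" x "") - 1)

/* Hypothetical helper: does 'param' start with the "sysctl" prefix? */
static int has_sysctl_prefix(const char *param)
{
	assert(param != NULL);
	return strncmp(param, "sysctl", STRLITERALLEN("sysctl")) == 0;
}
```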

Christian



* Re: [RFC v3 1/2] kernel/sysctl: support setting sysctl parameters from kernel command line
  2020-03-26 20:24 ` [RFC v3 1/2] kernel/sysctl: support setting sysctl parameters from kernel command line Kees Cook
@ 2020-03-26 22:08   ` Vlastimil Babka
  2020-03-27  3:50     ` Kees Cook
  0 siblings, 1 reply; 7+ messages in thread
From: Vlastimil Babka @ 2020-03-26 22:08 UTC (permalink / raw)
  To: Kees Cook
  Cc: Luis Chamberlain, Iurii Zaikin, linux-kernel, linux-api,
	linux-mm, Ivan Teterevkov, Michal Hocko, David Rientjes,
	Matthew Wilcox, Eric W . Biederman, Guilherme G . Piccoli

On 3/26/20 9:24 PM, Kees Cook wrote:
> On Thu, Mar 26, 2020 at 07:16:05PM +0100, Vlastimil Babka wrote:
>> Since the major change, I'm sending another RFC. If this approach is ok, then
>> it probably needs just some tweaks to the various error prints, and then
>> converting the rest of existing one-off aliases (if I come up with an idea how
>> to find them all). Thanks for all the feedback so far.
> 
> Yeah, I think you can drop "RFC" from this in the next version -- you're
> well into getting this finalized IMO.

Thanks!

>>
>>  .../admin-guide/kernel-parameters.txt         |  9 ++
>>  fs/proc/proc_sysctl.c                         | 90 +++++++++++++++++++
>>  include/linux/sysctl.h                        |  4 +
>>  init/main.c                                   |  2 +
>>  4 files changed, 105 insertions(+)
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>> index c07815d230bc..0c7e032e7c2e 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -4793,6 +4793,15 @@
>>  
>>  	switches=	[HW,M68k]
>>  
>> +	sysctl.*=	[KNL]
>> +			Set a sysctl parameter, right before loading the init
>> +			process, as if the value was written to the respective
>> +			/proc/sys/... file. Unrecognized parameters and invalid
>> +			values are reported in the kernel log. Sysctls
>> +			registered later by a loaded module cannot be set this
>> +			way.
> 
> Maybe add: "Both '.' and '/' are recognized as separators."

OK
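
As a rough userspace sketch of what "both separators are recognized" means
for the prefix handling in the patch below (the helper name
sysctl_param_name() is made up for this example, the patch open-codes the
same checks):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: if param is "sysctl.<name>" or "sysctl/<name>",
 * return a pointer to <name>; otherwise return NULL.
 */
static const char *sysctl_param_name(const char *param)
{
	const size_t prefix_len = sizeof("sysctl") - 1;

	assert(param != NULL);

	if (strncmp(param, "sysctl", prefix_len))
		return NULL;	/* some other kernel parameter */

	param += prefix_len;
	if (param[0] != '.' && param[0] != '/')
		return NULL;	/* e.g. "sysctlfoo", not ours either */

	return param + 1;	/* skip the separator */
}
```

With this, "sysctl.vm.swappiness" and "sysctl/vm/swappiness" both name the
same parameter once the remaining '.' characters are turned into '/'.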

>> +
>> +/* Set sysctl value passed on kernel command line. */
>> +static int process_sysctl_arg(char *param, char *val,
>> +			       const char *unused, void *arg)
>> +{
>> +	char *path;
>> +	struct file_system_type *proc_fs_type;
>> +	struct file *file;
>> +	int len;
>> +	int err;
>> +	loff_t pos = 0;
>> +	ssize_t wret;
>> +
>> +	if (strncmp(param, "sysctl", sizeof("sysctl") - 1))
>> +		return 0;
>> +
>> +	param += sizeof("sysctl") - 1;
>> +
>> +	if (param[0] != '/' && param[0] != '.')
>> +		return 0;
>> +
>> +	param++;
>> +
>> +	if (!proc_mnt) {
>> +		proc_fs_type = get_fs_type("proc");
>> +		if (!proc_fs_type) {
>> +			pr_err("Failed to mount procfs to set sysctl from command line");
>> +			return 0;
>> +		}
>> +		proc_mnt = kern_mount(proc_fs_type);
>> +		put_filesystem(proc_fs_type);
>> +		if (IS_ERR(proc_mnt)) {
>> +			pr_err("Failed to mount procfs to set sysctl from command line");
>> +			proc_mnt = NULL;
>> +			return 0;
>> +		}
>> +	}
>> +
>> +	len = 4 + strlen(param) + 1;
>> +	path = kmalloc(len, GFP_KERNEL);
>> +	if (!path)
>> +		panic("%s: Failed to allocate %d bytes\n", __func__, len);
>> +
>> +	strcpy(path, "sys/");
>> +	strcat(path, param);
>> +	strreplace(path, '.', '/');
> 
> You can do the replacement against the param directly, and also avoid
> all the open-coded string manipulations:
> 
> 	strreplace(param, '.', '/');

I didn't want to modify param for the sake of error prints, but perhaps
the replacements won't confuse the system admin too much?

> 	path = kasprintf(GFP_KERNEL, "sys/%s", param);

Ah yeah, that's nicer.

>> +
>> +	file = file_open_root(proc_mnt->mnt_root, proc_mnt, path, O_WRONLY, 0);
>> +	if (IS_ERR(file)) {
>> +		err = PTR_ERR(file);
>> +		pr_err("Error %d opening proc file %s to set sysctl parameter '%s=%s'",
>> +			err, path, param, val);
>> +		goto out;
>> +	}
>> +	len = strlen(val);
>> +	wret = kernel_write(file, val, len, &pos);
>> +	if (wret < 0) {
>> +		err = wret;
>> +		pr_err("Error %d writing to proc file %s to set sysctl parameter '%s=%s'",
>> +			err, path, param, val);
>> +	} else if (wret != len) {
>> +		pr_err("Wrote only %zd bytes of %d writing to proc file %s to set sysctl parameter '%s=%s'",
>> +			wret, len, path, param, val);
>> +	}
>> +
>> +	filp_close(file, NULL);
> 
> Please check the return value of filp_close() and treat that as an error
> for this function too.

Well I could print it, but not much else? The unmount will probably fail
in that case?

>> +out:
>> +	kfree(path);
>> +	return 0;
>> +}
>> +
>> +void do_sysctl_args(void)
>> +{
>> +	char *command_line;
>> +
>> +	command_line = kstrdup(saved_command_line, GFP_KERNEL);
>> +	if (!command_line)
>> +		panic("%s: Failed to allocate copy of command line\n", __func__);
>> +
>> +	parse_args("Setting sysctl args", command_line,
>> +		   NULL, 0, -1, -1, NULL, process_sysctl_arg);
>> +
>> +	if (proc_mnt)
>> +		kern_unmount(proc_mnt);
> 
> I don't recommend sharing allocation lifetimes between two functions
> (process_sysctl_arg() allocs proc_mnt, and do_sysctl_args() frees it).
> And since you have a scoped lifetime, why allocate it or have it as a
> global at all? It can be stack-allocated and passed to the handler:

So the point was that the mount is only done when an applicable sysctl
parameter is found. On the majority of systems there won't be any, at least
for the initial X years :)

> void do_sysctl_args(void)
> {
> 	struct file_system_type *proc_fs_type;
> 	struct vfsmount *proc_mnt;
> 	char *command_line;
> 
> 	proc_fs_type = get_fs_type("proc");
> 	if (!proc_fs_type) {
> 		pr_err("Failed to mount procfs to set sysctl from command line");
> 		return;
> 	}
> 	proc_mnt = kern_mount(proc_fs_type);
> 	put_filesystem(proc_fs_type);
> 	if (IS_ERR(proc_mnt)) {
> 		pr_err("Failed to mount procfs to set sysctl from command line");
> 		return;
> 	}
> 
> 	command_line = kstrdup(saved_command_line, GFP_KERNEL);
> 	if (!command_line)
> 		panic("%s: Failed to allocate copy of command line\n",
> 			__func__);
> 
> 	parse_args("Setting sysctl args", command_line,
> 		   NULL, 0, -1, -1, proc_mnt, process_sysctl_arg);
> 
> 	kfree(command_line);
> 	kern_unmount(proc_mnt);
> }
> 
> And then pull the mount from (the hilariously overloaded name) "arg":

But I guess the "mount on first applicable argument" approach would work
with this scheme as well:

struct vfsmount *proc_mnt = NULL;
parse_args(..., &proc_mnt, ...)

Thanks!



* Re: [RFC v3 1/2] kernel/sysctl: support setting sysctl parameters from kernel command line
  2020-03-26 22:08   ` Vlastimil Babka
@ 2020-03-27  3:50     ` Kees Cook
  0 siblings, 0 replies; 7+ messages in thread
From: Kees Cook @ 2020-03-27  3:50 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Luis Chamberlain, Iurii Zaikin, linux-kernel, linux-api,
	linux-mm, Ivan Teterevkov, Michal Hocko, David Rientjes,
	Matthew Wilcox, Eric W . Biederman, Guilherme G . Piccoli

On Thu, Mar 26, 2020 at 11:08:40PM +0100, Vlastimil Babka wrote:
> On 3/26/20 9:24 PM, Kees Cook wrote:
> I didn't want to modify param for the sake of error prints, but perhaps
> the replacements won't confuse the system admin too much?

Ah, fair enough. Should be fine to do it against "path" then. Ignore
that bit from me. ;)
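
To spell out the combination (format into a fresh buffer, then replace only
in that buffer), here is a small userspace sketch. It uses malloc() and
snprintf() as stand-ins for kasprintf(), and a plain loop as a stand-in for
strreplace(); build_sysctl_path() is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: build "sys/<param>" with every '.' turned into '/'.
 * param itself is never modified, so it can still be printed as typed in
 * error messages. Returns a malloc()ed string or NULL on allocation failure.
 */
static char *build_sysctl_path(const char *param)
{
	size_t len;
	char *path;
	char *p;

	assert(param != NULL);

	len = strlen("sys/") + strlen(param) + 1;
	path = malloc(len);
	if (!path)
		return NULL;

	snprintf(path, len, "sys/%s", param);

	/* Replace separators in the copy only. */
	for (p = path; *p; p++)
		if (*p == '.')
			*p = '/';

	return path;
}
```

The caller owns the returned buffer and frees it after the write, which
matches the kfree(path) at the end of process_sysctl_arg().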

> >> +	filp_close(file, NULL);
> > 
> > Please check the return value of filp_close() and treat that as an error
> > for this function too.
> 
> Well I could print it, but not much else? The unmount will probably fail
> in that case?

Maybe? This is just a nit of mine from tracking horrible bugs that
turned out to be unreported 'close' failures. :)
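
A userspace analogy of what I mean, using POSIX open()/write()/close()
instead of file_open_root()/kernel_write()/filp_close(). The helper name
write_value() and the choice of -EIO for a short write are assumptions made
for this sketch only:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write val to an existing file at path.
 * Returns 0 on success or a negative errno value. A failing close() is
 * reported as well, unless an earlier error is already being returned.
 */
static int write_value(const char *path, const char *val)
{
	size_t len;
	ssize_t wret;
	int err = 0;
	int fd;

	assert(path != NULL && val != NULL);
	len = strlen(val);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	wret = write(fd, val, len);
	if (wret < 0)
		err = -errno;
	else if ((size_t)wret != len)
		err = -EIO;	/* short write */

	/* Do not lose a close failure; keep the first error if there is one. */
	if (close(fd) < 0 && !err)
		err = -errno;

	return err;
}
```

Even if nothing can be done beyond printing, the log entry is what makes
such a failure findable later.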

> But I guess the "mount on first applicable argument" approach would work
> with this scheme as well:
> 
> struct vfsmount *proc_mnt = NULL;
> parse_args(..., &proc_mnt, ...)

Yes please! That would be perfect. (And yeah, it's a sensible
optimization to do it "as needed"; I hadn't thought of that.)
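
Roughly like this, as a userspace sketch of the ownership scheme. All names
here (struct fake_mount, fake_mount_proc(), fake_unmount(), do_args()) are
hypothetical stand-ins for struct vfsmount, kern_mount(), kern_unmount() and
do_sysctl_args(), and the loop stands in for parse_args():

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct vfsmount. */
struct fake_mount { int dummy; };

static struct fake_mount fake_proc;
static int mount_calls, unmount_calls;

static struct fake_mount *fake_mount_proc(void)
{
	mount_calls++;
	return &fake_proc;
}

static void fake_unmount(struct fake_mount *mnt)
{
	(void)mnt;
	unmount_calls++;
}

/* Same shape as a parse_args() callback: the caller's pointer comes via arg. */
static int process_arg(char *param, char *val, const char *unused, void *arg)
{
	struct fake_mount **mnt = arg;

	(void)val;
	(void)unused;
	assert(mnt != NULL);

	if (strncmp(param, "sysctl.", sizeof("sysctl.") - 1))
		return 0;	/* not a sysctl parameter, nothing gets mounted */

	if (!*mnt)
		*mnt = fake_mount_proc();	/* first applicable parameter */

	return 0;
}

/* Owns the mount pointer: it lives on this stack frame and is released here. */
static void do_args(char **params, size_t n)
{
	struct fake_mount *mnt = NULL;
	size_t i;

	for (i = 0; i < n; i++)
		process_arg(params[i], NULL, NULL, &mnt);

	if (mnt)
		fake_unmount(mnt);
}
```

The mount is still created lazily, but the pointer's lifetime is scoped to
one function and no global is needed.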

-- 
Kees Cook

