All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC 00/10] kmod: stress test driver, few fixes and enhancements
@ 2016-12-08 18:47 Luis R. Rodriguez
  2016-12-08 18:47 ` [RFC 01/10] kmod: add test driver to stress test the module loader Luis R. Rodriguez
                   ` (10 more replies)
  0 siblings, 11 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-08 18:47 UTC (permalink / raw)
  To: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, rgoldwyn, subashab, xypron.glpk, keescook,
	atomlin, mbenes, paulmck, dan.j.williams, jpoimboe, davem, mingo,
	akpm, torvalds, linux-kselftest, linux-doc, linux-kernel,
	Luis R. Rodriguez

Upon running into an old kmod v19 issue with mount (get_fs_type()) a few of us
hunted for the cause of the issue. Although the issue ended up being a
userspace issue, a stress test driver was written to help reproduce the issue,
and along the way a few other fixes and sanity checks were implemented.

I've taken the time to generalize the stress test driver as a kselftest driver
with a 9 test cases. The last two test cases reveal an existing issue which
is not yet addressed upstream, even if you have kmod v19 present. A fix is
proposed in the last patch. Orignally we had discarded this patch as too
complex due to the alias handling, but upon further analysis of test cases
and memory pressure issues, it seems worth considering. Other than the
last patch I don't think much of the other patches are controversial, but
sending as RFC first just in case.

If its not clear, an end goal here is to make module loading a bit more
deterministic with stronger sanity checks and stress tests. Please note,
the stress test diver requires 4 GiB of RAM to run all tests without running
out of memory. A lot of this has to do with the memory requirements needed
for a dynamic test for multiple threads, but note that the final memory
pressure and OOMs actually don't come from this allocation, but instead
from many finit_module() calls, this consumes quite a bit of memory, specially
if you have a lot of dependencies which also need to be loaded prior to
your needed module -- as is the case for filesystem drivers.

These patches are available on my linux-next git-tree on my branch
20161208-kmod-test-driver-try2 [0], which is based on linux-next tag
next-20161208. Patches are also available based on v4.9-rc8 [1] for
those looking for a bit more stable tree given x86_64 on linux-next is
hosed at the moment.

Since kmod.c doesn't seem to get much love, and since I've been digging
quite a bit into it for other users (firmware) I suppose I could volunteer
myself to maintain this code as well, unless there are oppositions to this.

[0] https://git.kernel.org/cgit/linux/kernel/git/mcgrof/linux-next.git/log/?h=20161208-kmod-test-driver-try2
[1] https://git.kernel.org/cgit/linux/kernel/git/mcgrof/linux.git/log/?h=20161208-kmod-test-driver

Luis R. Rodriguez (10):
  kmod: add test driver to stress test the module loader
  module: fix memory leak on early load_module() failures
  kmod: add dynamic max concurrent thread count
  kmod: provide wrappers for kmod_concurrent inc/dec
  kmod: return -EBUSY if modprobe limit is reached
  kmod: provide sanity check on kmod_concurrent access
  kmod: use simplified rate limit printk
  sysctl: add support for unsigned int properly
  kmod: add helpers for getting kmod count and limit
  kmod: add a sanity check on module loading

 Documentation/admin-guide/kernel-parameters.txt |    7 +
 include/linux/kmod.h                            |    9 +
 include/linux/sysctl.h                          |    3 +
 init/Kconfig                                    |   23 +
 init/main.c                                     |    1 +
 kernel/kmod.c                                   |  244 ++++-
 kernel/module.c                                 |   12 +-
 kernel/sysctl.c                                 |  198 +++-
 lib/Kconfig.debug                               |   25 +
 lib/Makefile                                    |    1 +
 lib/test_kmod.c                                 | 1248 +++++++++++++++++++++++
 tools/testing/selftests/kmod/Makefile           |   11 +
 tools/testing/selftests/kmod/config             |    7 +
 tools/testing/selftests/kmod/kmod.sh            |  448 ++++++++
 14 files changed, 2199 insertions(+), 38 deletions(-)
 create mode 100644 lib/test_kmod.c
 create mode 100644 tools/testing/selftests/kmod/Makefile
 create mode 100644 tools/testing/selftests/kmod/config
 create mode 100755 tools/testing/selftests/kmod/kmod.sh

-- 
2.10.1

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [RFC 01/10] kmod: add test driver to stress test the module loader
  2016-12-08 18:47 [RFC 00/10] kmod: stress test driver, few fixes and enhancements Luis R. Rodriguez
@ 2016-12-08 18:47 ` Luis R. Rodriguez
  2016-12-08 20:24   ` Kees Cook
  2016-12-08 19:48 ` [RFC 02/10] module: fix memory leak on early load_module() failures Luis R. Rodriguez
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-08 18:47 UTC (permalink / raw)
  To: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, rgoldwyn, subashab, xypron.glpk, keescook,
	atomlin, mbenes, paulmck, dan.j.williams, jpoimboe, davem, mingo,
	akpm, torvalds, linux-kselftest, linux-doc, linux-kernel,
	Luis R. Rodriguez

This adds a new stress test driver for kmod: the kernel module loader.
The new stress test driver, test_kmod, is only enabled as a module right
now. It should be possible to load this as built-in and load tests early
(refer to the force_init_test module parameter), however since a lot of
test can get a system out of memory fast we leave this disabled for now.

Using a system with 1024 MiB of RAM can *easily* get your kernel
OOM fast with this test driver.

The test_kmod driver exposes API knobs for us to fine tune simple
request_module() and get_fs_type() calls. Since these API calls
only allow each one parameter a test driver for these is rather
simple. Other factors that can help out test driver though are
the number of calls we issue and knowing current limitations of
each. This exposes configuration as much as possible through
userspace to be able to build tests directly from userspace.

Since it allows multiple misc devices its will eventually (once we
add a knob to let us create new devices at will) also be possible to
perform more tests in parallel, provided you have enough memory.

We only enable tests we know work as of right now.

Demo screenshots:

 # tools/testing/selftests/kmod/kmod.sh
kmod_test_0001_driver: OK! - loading kmod test
kmod_test_0001_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
kmod_test_0001_fs: OK! - loading kmod test
kmod_test_0001_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
kmod_test_0002_driver: OK! - loading kmod test
kmod_test_0002_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
kmod_test_0002_fs: OK! - loading kmod test
kmod_test_0002_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
kmod_test_0003: OK! - loading kmod test
kmod_test_0003: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0004: OK! - loading kmod test
kmod_test_0004: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0005: OK! - loading kmod test
kmod_test_0005: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0006: OK! - loading kmod test
kmod_test_0006: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0005: OK! - loading kmod test
kmod_test_0005: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0006: OK! - loading kmod test
kmod_test_0006: OK! - Return value: 0 (SUCCESS), expected SUCCESS
Test completed

You can also request for specific tests:

 # tools/testing/selftests/kmod/kmod.sh -t 0001
kmod_test_0001_driver: OK! - loading kmod test
kmod_test_0001_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
kmod_test_0001_fs: OK! - loading kmod test
kmod_test_0001_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
Test completed

Lastly, the current available number of tests:

 # tools/testing/selftests/kmod/kmod.sh --help
Usage: tools/testing/selftests/kmod/kmod.sh [ -t <4-number-digit> ]
Valid tests: 0001-0009

0001 - Simple test - 1 thread  for empty string
0002 - Simple test - 1 thread  for modules/filesystems that do not exist
0003 - Simple test - 1 thread  for get_fs_type() only
0004 - Simple test - 2 threads for get_fs_type() only
0005 - multithreaded tests with default setup - request_module() only
0006 - multithreaded tests with default setup - get_fs_type() only
0007 - multithreaded tests with default setup test request_module() and get_fs_type()
0008 - multithreaded - push kmod_concurrent over max_modprobes for request_module()
0009 - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()

The following test cases currently fail, as such they are not currently
enabled by default:

 # tools/testing/selftests/kmod/kmod.sh -t 0007
 # tools/testing/selftests/kmod/kmod.sh -t 0008
 # tools/testing/selftests/kmod/kmod.sh -t 0009
 # tools/testing/selftests/kmod/kmod.sh -t 0010
 # tools/testing/selftests/kmod/kmod.sh -t 0011

To be sure to run them as intended please unload both of the modules:

  o test_module
  o xfs

And ensure they are not loaded on your system prior to testing them.
If you use these paritions for your rootfs you can change the default
test driver used for get_fs_type() by exporting it into your
environment. For example of other test defaults you can override
refer to kmod.sh allow_user_defaults().

Behind the scenes this is how we fine tune at a test case prior to
hitting a trigger to run it:

cat /sys/devices/virtual/misc/test_kmod0/config
echo -n "2" > /sys/devices/virtual/misc/test_kmod0/config_test_case
echo -n "ext4" > /sys/devices/virtual/misc/test_kmod0/config_test_fs
echo -n "80" > /sys/devices/virtual/misc/test_kmod0/config_num_threads
cat /sys/devices/virtual/misc/test_kmod0/config
echo -n "1" > /sys/devices/virtual/misc/test_kmod0/config_num_threads

Finally to trigger:

echo -n "1" > /sys/devices/virtual/misc/test_kmod0/trigger_config

The kmod.sh script uses the above constructs to build differnt test cases.

A bit of interpretation of the current failures follows, first two
premises:

a) When request_module() is used userspace figures out an optimized version of
module order for us. Once it finds the modules it needs, as per depmod
symbol dep map, it will finit_module() the respective modules which
are needed for the original request_module() request.

b) We have an optimization in place whereby if a kernel uses
request_module() on a module already loaded we never bother
userspace as the module already is loaded. This is all handled by
kernel/kmod.c.

A few things to consider to help identify root causes of issues:

0) kmod 19 has a broken heuristic for modules being assumed to be
built-in to your kernel and will return 0 even though request_module()
failed. Upgrade to a newer version of kmod.

1) A get_fs_type() call for "xfs" will request_module() for
"fs-xfs", not for "xfs". The optimization in kernel described in b)
fails to catch if we have a lot of consecutive get_fs_type() calls.
The reason is the optimization in place does not look for aliases. This
means two consecutive get_fs_type() calls will bump kmod_concurrent, whereas
request_module() will not.

This one explanation why test case 0009 fails at least once for
get_fs_type().

2) If a module fails to load --- for whatever reason (kmod_concurrent
limit reached, file not yet present due to rootfs switch, out of memory)
we have a period of time during which module request for the same name
either with request_module() or get_fs_type() will *also* fail to load
even if the file for the module is ready.

This explains why *multiple* NULLs are possible on test 0009.

3) finit_module() consumes quite a bit of memory.

4) Filesystems typically also have more dependent modules than other
modules, its important to note though that even though a get_fs_type() call
does not incur additional kmod_concurrent bumps, since userspace
loads dependencies it finds it needs via finit_module_fd(), it *will*
take much more memory to load a module with a lot of dependencies.

Because of 3) and 4) we will easily run into out of memory failures
with certain tests. For instance test 0006 fails on qemu with 1024 MiB
of RAM. It panics a box after reaping all userspace processes and still
not having enough memory to reap.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 lib/Kconfig.debug                     |   25 +
 lib/Makefile                          |    1 +
 lib/test_kmod.c                       | 1248 +++++++++++++++++++++++++++++++++
 tools/testing/selftests/kmod/Makefile |   11 +
 tools/testing/selftests/kmod/config   |    7 +
 tools/testing/selftests/kmod/kmod.sh  |  449 ++++++++++++
 6 files changed, 1741 insertions(+)
 create mode 100644 lib/test_kmod.c
 create mode 100644 tools/testing/selftests/kmod/Makefile
 create mode 100644 tools/testing/selftests/kmod/config
 create mode 100755 tools/testing/selftests/kmod/kmod.sh

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 7446097f72bd..6cad548e0682 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1994,6 +1994,31 @@ config BUG_ON_DATA_CORRUPTION
 
 	  If unsure, say N.
 
+config TEST_KMOD
+	tristate "kmod stress tester"
+	default n
+	depends on m
+	select TEST_LKM
+	select XFS_FS
+	select TUN
+	select BTRFS_FS
+	help
+	  Test the kernel's module loading mechanism: kmod. kmod implements
+	  support to load modules using the Linux kernel's usermode helper.
+	  This test provides a series of tests against kmod.
+
+	  Although technically you can either build test_kmod as a module or
+	  into the kernel we disallow building it into the kernel since
+	  it stress tests request_module() and this will very likely cause
+	  some issues by taking over precious threads available from other
+	  module load requests, ultimately this could be fatal.
+
+	  To run tests run:
+
+	  tools/testing/selftests/kmod/kmod.sh --help
+
+	  If unsure, say N.
+
 source "samples/Kconfig"
 
 source "lib/Kconfig.kgdb"
diff --git a/lib/Makefile b/lib/Makefile
index d15e235f72ea..3c5a14821e16 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -55,6 +55,7 @@ obj-$(CONFIG_TEST_STATIC_KEYS) += test_static_key_base.o
 obj-$(CONFIG_TEST_PRINTF) += test_printf.o
 obj-$(CONFIG_TEST_BITMAP) += test_bitmap.o
 obj-$(CONFIG_TEST_UUID) += test_uuid.o
+obj-$(CONFIG_TEST_KMOD) += test_kmod.o
 
 ifeq ($(CONFIG_DEBUG_KOBJECT),y)
 CFLAGS_kobject.o += -DDEBUG
diff --git a/lib/test_kmod.c b/lib/test_kmod.c
new file mode 100644
index 000000000000..63fded83b9b6
--- /dev/null
+++ b/lib/test_kmod.c
@@ -0,0 +1,1248 @@
+/*
+ * kmod stress test driver
+ *
+ * Copyright (C) 2016 Luis R. Rodriguez <mcgrof@kernel.org>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of copyleft-next (version 0.3.1 or later) as published
+ * at http://copyleft-next.org/.
+ */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+/*
+ * This driver provides an interface to trigger and test the kernel's
+ * module loader through a series of configurations and a few triggers.
+ * To test this driver use the following script as root:
+ *
+ * tools/testing/selftests/kmod/kmod.sh --help
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/kmod.h>
+#include <linux/printk.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/fs.h>
+#include <linux/miscdevice.h>
+#include <linux/vmalloc.h>
+#include <linux/slab.h>
+#include <linux/device.h>
+
+#define TEST_START_NUM_THREADS	50
+#define TEST_START_DRIVER	"test_module"
+#define TEST_START_TEST_FS	"xfs"
+#define TEST_START_TEST_CASE	TEST_KMOD_DRIVER
+
+
+static bool force_init_test = false;
+module_param(force_init_test, bool_enable_only, 0644);
+MODULE_PARM_DESC(force_init_test,
+		 "Force kicking a test immediatley after driver loads");
+
+/*
+ * For device allocation / registration
+ */
+static DEFINE_MUTEX(reg_dev_mutex);
+static LIST_HEAD(reg_test_devs);
+
+/*
+ * num_test_devs actually represents the *next* ID of the next
+ * device we will allow to create.
+ */
+static int num_test_devs;
+
+/**
+ * enum kmod_test_case - linker table test case
+ *
+ * If you add a  test case, please be sure to review if you need to se
+ * @need_mod_put for your tests case.
+ *
+ * @TEST_KMOD_DRIVER: stress tests request_module()
+ * @TEST_KMOD_FS_TYPE: stress tests get_fs_type()
+ */
+enum kmod_test_case {
+	__TEST_KMOD_INVALID = 0,
+
+	TEST_KMOD_DRIVER,
+	TEST_KMOD_FS_TYPE,
+
+	__TEST_KMOD_MAX,
+};
+
+struct test_config {
+	char *test_driver;
+	char *test_fs;
+	unsigned int num_threads;
+	enum kmod_test_case test_case;
+	int test_result;
+};
+
+struct kmod_test_device;
+
+/**
+ * kmod_test_device_info - thread info
+ *
+ * @ret_sync: return value if request_module() is used, sync request for
+ * 	@TEST_KMOD_DRIVER
+ * @fs_sync: return value of get_fs_type() for @TEST_KMOD_FS_TYPE
+ * @thread_idx: thread ID
+ * @test_dev: test device test is being performed under
+ * @need_mod_put: Some tests (get_fs_type() is one) requires putting the module
+ *	(module_put(fs_sync->owner)) when done, otherwise you will not be able
+ *	to unload the respective modules and re-test. We use this to keep
+ *	accounting of when we need this and to help out in case we need to
+ *	error out and deal with module_put() on error.
+ */
+struct kmod_test_device_info {
+	int ret_sync;
+	struct file_system_type *fs_sync;
+	struct task_struct *task_sync;
+	unsigned int thread_idx;
+	struct kmod_test_device *test_dev;
+	bool need_mod_put;
+};
+
+/**
+ * kmod_test_device - test device to help test kmod
+ *
+ * @dev_idx: unique ID for test device
+ * @config: configuration for the test
+ * @misc_dev: we use a misc device under the hood
+ * @dev: pointer to misc_dev's own struct device
+ * @config_mutex: protects configuration of test
+ * @trigger_mutex: the test trigger can only be fired once at a time
+ * @thread_lock: protects @done count, and the @info per each thread
+ * @done: number of threads which have completed or failed
+ * @test_is_oom: when we run out of memory, use this to halt moving forward
+ * @kthreads_done: completion used to signal when all work is done
+ * @list: needed to be part of the reg_test_devs
+ * @info: array of info for each thread
+ */
+struct kmod_test_device {
+	int dev_idx;
+	struct test_config config;
+	struct miscdevice misc_dev;
+	struct device *dev;
+	struct mutex config_mutex;
+	struct mutex trigger_mutex;
+	struct mutex thread_mutex;
+
+	unsigned int done;
+
+	bool test_is_oom;
+	struct completion kthreads_done;
+	struct list_head list;
+
+	struct kmod_test_device_info *info;
+};
+
+static const char *test_case_str(enum kmod_test_case test_case)
+{
+	switch (test_case) {
+	case TEST_KMOD_DRIVER:
+		return "TEST_KMOD_DRIVER";
+	case TEST_KMOD_FS_TYPE:
+		return "TEST_KMOD_FS_TYPE";
+	default:
+		return "invalid";
+	}
+}
+
+static struct miscdevice *dev_to_misc_dev(struct device *dev)
+{
+	return dev_get_drvdata(dev);
+}
+
+static struct kmod_test_device *misc_dev_to_test_dev(struct miscdevice *misc_dev)
+{
+	return container_of(misc_dev, struct kmod_test_device, misc_dev);
+}
+
+static struct kmod_test_device *dev_to_test_dev(struct device *dev)
+{
+	struct miscdevice *misc_dev;
+
+	misc_dev = dev_to_misc_dev(dev);
+
+	return misc_dev_to_test_dev(misc_dev);
+}
+
+/* Must run with thread_mutex held */
+static void kmod_test_done_check(struct kmod_test_device *test_dev,
+				 unsigned int idx)
+{
+	struct test_config *config = &test_dev->config;
+
+	test_dev->done++;
+	dev_dbg(test_dev->dev, "Done thread count: %u\n", test_dev->done);
+
+	if (test_dev->done == config->num_threads) {
+		dev_info(test_dev->dev, "Done: %u threads have all run now\n",
+			 test_dev->done);
+		dev_info(test_dev->dev, "Last thread to run: %u\n", idx);
+		complete(&test_dev->kthreads_done);
+	}
+}
+
+static void test_kmod_put_module(struct kmod_test_device_info *info)
+{
+	struct kmod_test_device *test_dev = info->test_dev;
+	struct test_config *config = &test_dev->config;
+
+	if (!info->need_mod_put)
+		return;
+
+	switch (config->test_case) {
+	case TEST_KMOD_DRIVER:
+		break;
+	case TEST_KMOD_FS_TYPE:
+		if (info && info->fs_sync && info->fs_sync->owner)
+			module_put(info->fs_sync->owner);
+		break;
+	default:
+		BUG();
+	}
+
+	info->need_mod_put = true;
+}
+
+static int run_request(void *data)
+{
+	struct kmod_test_device_info *info = data;
+	struct kmod_test_device *test_dev = info->test_dev;
+	struct test_config *config = &test_dev->config;
+
+	switch (config->test_case) {
+	case TEST_KMOD_DRIVER:
+		info->ret_sync = request_module("%s", config->test_driver);
+		break;
+	case TEST_KMOD_FS_TYPE:
+		info->fs_sync = get_fs_type(config->test_fs);
+		info->need_mod_put = true;
+		break;
+	default:
+		/* __trigger_config_run() already checked for test sanity */
+		BUG();
+		return -EINVAL;
+	}
+
+	dev_dbg(test_dev->dev, "Ran thread %u\n", info->thread_idx);
+
+	test_kmod_put_module(info);
+
+	mutex_lock(&test_dev->thread_mutex);
+	info->task_sync = NULL;
+	kmod_test_done_check(test_dev, info->thread_idx);
+	mutex_unlock(&test_dev->thread_mutex);
+
+	return 0;
+}
+
+static int tally_work_test(struct kmod_test_device_info *info)
+{
+	struct kmod_test_device *test_dev = info->test_dev;
+	struct test_config *config = &test_dev->config;
+	int err_ret = 0;
+
+	switch (config->test_case) {
+	case TEST_KMOD_DRIVER:
+		/*
+		 * Only capture errors, if one is found that's
+		 * enough, for now.
+		 */
+		if (info->ret_sync != 0)
+			err_ret = info->ret_sync;
+		dev_info(test_dev->dev,
+			 "Sync thread %d return status: %d\n",
+			 info->thread_idx, info->ret_sync);
+		break;
+	case TEST_KMOD_FS_TYPE:
+		/* For now we make this simple */
+		if (!info->fs_sync)
+			err_ret = -EINVAL;
+		dev_info(test_dev->dev, "Sync thread %u fs: %s\n",
+			 info->thread_idx, info->fs_sync ? config->test_fs :
+			 "NULL");
+		break;
+	default:
+		BUG();
+	}
+
+	return err_ret;
+}
+
+/*
+ * XXX: add result option to display if all errors did not match.
+ * For now we just keep any error code if one was found.
+ *
+ * If this ran it means *all* tasks were created fine and we
+ * are now just collecting results.
+ *
+ * Only propagate errors, do not override with a subsequent sucess case.
+ */
+static void tally_up_work(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+	struct kmod_test_device_info *info;
+	unsigned int idx;
+	int err_ret = 0;
+	int ret = 0;
+
+	mutex_lock(&test_dev->thread_mutex);
+
+	dev_info(test_dev->dev, "Results:\n");
+
+	for (idx=0; idx < config->num_threads; idx++) {
+		info = &test_dev->info[idx];
+		ret = tally_work_test(info);
+		if (ret)
+			err_ret = ret;
+	}
+
+	/*
+	 * Note: request_module() returns 256 for a module not found even
+	 * though modprobe itself returns 1.
+	 */
+	config->test_result = err_ret;
+
+	mutex_unlock(&test_dev->thread_mutex);
+}
+
+static int try_one_request(struct kmod_test_device *test_dev, unsigned int idx)
+{
+	struct kmod_test_device_info *info = &test_dev->info[idx];
+	int fail_ret = -ENOMEM;
+
+	mutex_lock(&test_dev->thread_mutex);
+
+	info->thread_idx = idx;
+	info->test_dev = test_dev;
+	info->task_sync = kthread_run(run_request, info, "%s-%u",
+				      KBUILD_MODNAME, idx);
+
+	if (!info->task_sync || IS_ERR(info->task_sync)) {
+		test_dev->test_is_oom = true;
+		dev_err(test_dev->dev, "Setting up thread %u failed\n", idx);
+		info->task_sync = NULL;
+		goto err_out;
+	} else
+		dev_dbg(test_dev->dev, "Kicked off thread %u\n", idx);
+
+	mutex_unlock(&test_dev->thread_mutex);
+
+	return 0;
+
+err_out:
+	info->ret_sync = fail_ret;
+	mutex_unlock(&test_dev->thread_mutex);
+
+	return fail_ret;
+}
+
+static void test_dev_kmod_stop_tests(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+	struct kmod_test_device_info *info;
+	unsigned int i;
+
+	dev_info(test_dev->dev, "Ending request_module() tests\n");
+
+	mutex_lock(&test_dev->thread_mutex);
+
+	for (i=0; i < config->num_threads; i++) {
+		info = &test_dev->info[i];
+		if (info->task_sync && !IS_ERR(info->task_sync)) {
+			dev_info(test_dev->dev,
+				 "Stopping still-running thread %i\n", i);
+			kthread_stop(info->task_sync);
+		}
+
+		/*
+		 * info->task_sync is well protected, it can only be
+		 * NULL or a pointer to a struct. If its NULL we either
+		 * never ran, or we did and we completed the work. Completed
+		 * tasks *always* put the module for us. This is a sanity
+		 * check -- just in case.
+		 */
+		if (info->task_sync && info->need_mod_put)
+			test_kmod_put_module(info);
+	}
+
+	mutex_unlock(&test_dev->thread_mutex);
+}
+
+/*
+ * Only wait *iff* we did not run into any errors during all of our thread
+ * set up. If run into any issues we stop threads and just bail out with
+ * an error to the trigger. This also means we don't need any tally work
+ * for any threads which fail.
+ */
+static int try_requests(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+	unsigned int idx;
+	int ret;
+	bool any_error = false;
+
+	for (idx=0; idx < config->num_threads; idx++) {
+		if (test_dev->test_is_oom) {
+			any_error = true;
+			break;
+		}
+
+		ret = try_one_request(test_dev, idx);
+		if (ret) {
+			any_error = true;
+			break;
+		}
+	}
+
+	if (!any_error) {
+		test_dev->test_is_oom = false;
+		dev_info(test_dev->dev,
+			 "No errors were found while initializing threads\n");
+		wait_for_completion(&test_dev->kthreads_done);
+		tally_up_work(test_dev);
+	} else {
+		test_dev->test_is_oom = true;
+		dev_info(test_dev->dev,
+			 "At least one thread failed to start, stop all work\n");
+		test_dev_kmod_stop_tests(test_dev);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static int run_test_driver(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+
+	dev_info(test_dev->dev, "Test case: %s (%u)\n",
+		 test_case_str(config->test_case),
+		 config->test_case);
+	dev_info(test_dev->dev, "Test driver to load: %s\n",
+		 config->test_driver);
+	dev_info(test_dev->dev, "Number of threads to run: %u\n",
+		 config->num_threads);
+	dev_info(test_dev->dev, "Thread IDs will range from 0 - %u\n",
+		 config->num_threads - 1);
+
+	return try_requests(test_dev);
+}
+
+static int run_test_fs_type(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+
+	dev_info(test_dev->dev, "Test case: %s (%u)\n",
+		 test_case_str(config->test_case),
+		 config->test_case);
+	dev_info(test_dev->dev, "Test filesystem to load: %s\n",
+		 config->test_fs);
+	dev_info(test_dev->dev, "Number of threads to run: %u\n",
+		 config->num_threads);
+	dev_info(test_dev->dev, "Thread IDs will range from 0 - %u\n",
+		 config->num_threads - 1);
+
+	return try_requests(test_dev);
+}
+
+static ssize_t config_show(struct device *dev,
+			   struct device_attribute *attr,
+			   char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+	int len = 0;
+
+	mutex_lock(&test_dev->config_mutex);
+
+	len += sprintf(buf, "Custom trigger configuration for: %s\n",
+		       dev_name(dev));
+
+	len += sprintf(buf+len, "Number of threads:\t%u\n",
+		       config->num_threads);
+
+	len += sprintf(buf+len, "Test_case:\t%s (%u)\n",
+		       test_case_str(config->test_case),
+		       config->test_case);
+
+	if (config->test_driver)
+		len += sprintf(buf+len, "driver:\t%s\n",
+			       config->test_driver);
+	else
+		len += sprintf(buf+len, "driver:\tEMTPY\n");
+
+	if (config->test_fs)
+		len += sprintf(buf+len, "fs:\t%s\n",
+			       config->test_fs);
+	else
+		len += sprintf(buf+len, "fs:\tEMTPY\n");
+
+
+	mutex_unlock(&test_dev->config_mutex);
+
+	return len;
+}
+static DEVICE_ATTR_RO(config);
+
+/*
+ * This ensures we don't allow kicking threads through if our configuration
+ * is faulty.
+ */
+static int __trigger_config_run(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+
+	test_dev->done = 0;
+
+	switch (config->test_case) {
+	case TEST_KMOD_DRIVER:
+		return run_test_driver(test_dev);
+	case TEST_KMOD_FS_TYPE:
+		return run_test_fs_type(test_dev);
+	default:
+		dev_warn(test_dev->dev,
+			 "Invalid test case requested: %u\n",
+			 config->test_case);
+		return -EINVAL;
+	}
+}
+
+static int trigger_config_run(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+	int ret;
+
+	mutex_lock(&test_dev->trigger_mutex);
+	mutex_lock(&test_dev->config_mutex);
+
+	ret = __trigger_config_run(test_dev);
+	if (ret < 0)
+		goto out;
+	dev_info(test_dev->dev, "General test result: %d\n",
+		 config->test_result);
+
+	/*
+	 * We must return 0 after a trigger even unless something went
+	 * wrong with the setup of the test. If the test setup went fine
+	 * then userspace must just check the result of config->test_result.
+	 * One issue with relying on the return from a call in the kernel
+	 * is if the kernel returns a possitive value using this trigger
+	 * will not return the value to userspace, it would be lost.
+	 *
+	 * By not relying on capturing the return value of tests we are using
+	 * through the trigger it also us to run tests with set -e and only
+	 * fail when something went wrong with the driver upon trigger
+	 * requests.
+	 */
+	ret = 0;
+
+out:
+	mutex_unlock(&test_dev->config_mutex);
+	mutex_unlock(&test_dev->trigger_mutex);
+
+	return ret;
+}
+
+static ssize_t
+trigger_config_store(struct device *dev,
+		     struct device_attribute *attr,
+		     const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	int ret;
+
+	if (test_dev->test_is_oom)
+		return -ENOMEM;
+
+	/* For all intents and purposes we don't care what userspace
+	 * sent this trigger, we care only that we were triggered.
+	 * We treat the return value only for caputuring issues with
+	 * the test setup. At this point all the test variables should
+	 * have been allocated so typically this should never fail.
+	 */
+	ret = trigger_config_run(test_dev);
+	if (unlikely(ret < 0))
+		goto out;
+
+	/*
+	 * Note: any return > 0 will be treated as success
+	 * and the error value will not be available to userspace.
+	 * Do not rely on trying to send to userspace a test value
+	 * return value as possitive return errors will be lost.
+	 */
+	if (WARN_ON(ret > 0))
+		return -EINVAL;
+
+	ret = count;
+out:
+	return ret;
+}
+static DEVICE_ATTR_WO(trigger_config);
+
+/*
+ * XXX: move to kstrncpy() once merged.
+ *
+ * Users should use kfree_const() when freeing these.
+ */
+static int __kstrncpy(char **dst, const char *name, size_t count, gfp_t gfp)
+{
+	*dst = kstrndup(name, count, gfp);
+	if (!*dst)
+		return -ENOSPC;
+	return count;
+}
+
+static int config_copy_test_driver_name(struct test_config *config,
+				    const char *name,
+				    size_t count)
+{
+	return __kstrncpy(&config->test_driver, name, count, GFP_KERNEL);
+}
+
+
+static int config_copy_test_fs(struct test_config *config, const char *name,
+			       size_t count)
+{
+	return __kstrncpy(&config->test_fs, name, count, GFP_KERNEL);
+}
+
+static void __kmod_config_free(struct test_config *config)
+{
+	if (!config)
+		return;
+
+	kfree_const(config->test_driver);
+	config->test_driver = NULL;
+
+	kfree_const(config->test_fs);
+	config->test_driver = NULL;
+}
+
+static void kmod_config_free(struct kmod_test_device *test_dev)
+{
+	struct test_config *config;
+
+	if (!test_dev)
+		return;
+
+	config = &test_dev->config;
+
+	mutex_lock(&test_dev->config_mutex);
+	__kmod_config_free(config);
+	mutex_unlock(&test_dev->config_mutex);
+}
+
+static ssize_t config_test_driver_store(struct device *dev,
+					struct device_attribute *attr,
+					const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+	int copied;
+
+	mutex_lock(&test_dev->config_mutex);
+
+	kfree_const(config->test_driver);
+	config->test_driver = NULL;
+
+	copied = config_copy_test_driver_name(config, buf, count);
+	mutex_unlock(&test_dev->config_mutex);
+
+	return copied;
+}
+
+static ssize_t config_test_driver_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	mutex_lock(&test_dev->config_mutex);
+	strcpy(buf, config->test_driver);
+	strcat(buf, "\n");
+	mutex_unlock(&test_dev->config_mutex);
+
+	return strlen(buf) + 1;
+}
+static DEVICE_ATTR(config_test_driver, 0644, config_test_driver_show,
+		   config_test_driver_store);
+
+static ssize_t config_test_fs_store(struct device *dev,
+				    struct device_attribute *attr,
+				    const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+	int copied;
+
+	mutex_lock(&test_dev->config_mutex);
+
+	kfree_const(config->test_fs);
+	config->test_fs = NULL;
+
+	copied = config_copy_test_fs(config, buf, count);
+	mutex_unlock(&test_dev->config_mutex);
+
+	return copied;
+}
+
+static ssize_t config_test_fs_show(struct device *dev,
+				   struct device_attribute *attr,
+				   char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	mutex_lock(&test_dev->config_mutex);
+	strcpy(buf, config->test_fs);
+	strcat(buf, "\n");
+	mutex_unlock(&test_dev->config_mutex);
+
+	return strlen(buf) + 1;
+}
+static DEVICE_ATTR(config_test_fs, 0644, config_test_fs_show,
+		   config_test_fs_store);
+
+static int trigger_config_run_driver(struct kmod_test_device *test_dev,
+				     const char *test_driver)
+{
+	int copied;
+	struct test_config *config = &test_dev->config;
+
+	mutex_lock(&test_dev->config_mutex);
+
+	config->test_case = TEST_KMOD_DRIVER;
+
+	kfree_const(config->test_driver);
+	config->test_driver = NULL;
+
+	copied = config_copy_test_driver_name(config, test_driver,
+					      strlen(test_driver));
+	mutex_unlock(&test_dev->config_mutex);
+
+	if (copied != strlen(test_driver)) {
+		test_dev->test_is_oom = true;
+		return -EINVAL;
+	}
+
+	test_dev->test_is_oom = false;
+
+	return trigger_config_run(test_dev);
+}
+
+static int trigger_config_run_fs(struct kmod_test_device *test_dev,
+				 const char *fs_type)
+{
+	int copied;
+	struct test_config *config = &test_dev->config;
+
+	mutex_lock(&test_dev->config_mutex);
+	config->test_case = TEST_KMOD_FS_TYPE;
+
+	kfree_const(config->test_fs);
+	config->test_driver = NULL;
+
+	copied = config_copy_test_fs(config, fs_type, strlen(fs_type));
+	mutex_unlock(&test_dev->config_mutex);
+
+	if (copied != strlen(fs_type)) {
+		test_dev->test_is_oom = true;
+		return -EINVAL;
+	}
+
+	test_dev->test_is_oom = false;
+
+	return trigger_config_run(test_dev);
+}
+
+static void free_test_dev_info(struct kmod_test_device *test_dev)
+{
+	if (test_dev->info) {
+		vfree(test_dev->info);
+		test_dev->info = NULL;
+	}
+}
+
+static int kmod_config_sync_info(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+
+	free_test_dev_info(test_dev);
+	test_dev->info = vzalloc(config->num_threads *
+				 sizeof(struct kmod_test_device_info));
+	if (!test_dev->info) {
+		dev_err(test_dev->dev, "Cannot alloc test_dev info\n");
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+/*
+ * Old kernels may not have this, if you want to port this code to
+ * test it on older kernels.
+ */
+#ifdef get_kmod_umh_limit
+static unsigned int kmod_init_test_thread_limit(void)
+{
+	return get_kmod_umh_limit();
+}
+#else
+static unsigned int kmod_init_test_thread_limit(void)
+{
+	return TEST_START_NUM_THREADS;
+}
+#endif
+
+static int __kmod_config_init(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+	int ret = -ENOMEM, copied;
+
+	__kmod_config_free(config);
+
+	copied = config_copy_test_driver_name(config, TEST_START_DRIVER,
+					      strlen(TEST_START_DRIVER));
+	if (copied != strlen(TEST_START_DRIVER))
+		goto err_out;
+
+	copied = config_copy_test_fs(config, TEST_START_TEST_FS,
+				     strlen(TEST_START_TEST_FS));
+	if (copied != strlen(TEST_START_TEST_FS))
+		goto err_out;
+
+	config->num_threads = kmod_init_test_thread_limit();
+	config->test_result = 0;
+	config->test_case = TEST_START_TEST_CASE;
+
+	ret = kmod_config_sync_info(test_dev);
+	if (ret)
+		goto err_out;
+
+	test_dev->test_is_oom = false;
+
+	return 0;
+
+err_out:
+	test_dev->test_is_oom = true;
+	WARN_ON(test_dev->test_is_oom);
+
+	__kmod_config_free(config);
+
+	return ret;
+}
+
+static ssize_t reset_store(struct device *dev,
+			   struct device_attribute *attr,
+			   const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	int ret;
+
+	mutex_lock(&test_dev->trigger_mutex);
+	mutex_lock(&test_dev->config_mutex);
+
+	ret = __kmod_config_init(test_dev);
+	if (ret < 0) {
+		ret = -ENOMEM;
+		dev_err(dev, "could not alloc settings for config trigger: %d\n",
+		       ret);
+		goto out;
+	}
+
+	dev_info(dev, "reset\n");
+	ret = count;
+
+out:
+	mutex_unlock(&test_dev->config_mutex);
+	mutex_unlock(&test_dev->trigger_mutex);
+
+	return ret;
+}
+static DEVICE_ATTR_WO(reset);
+
+static int test_dev_config_update_uint_sync(struct kmod_test_device *test_dev,
+					    const char *buf, size_t size,
+					    unsigned int *config,
+					    int (*test_sync)(struct kmod_test_device *test_dev))
+{
+	int ret;
+	char *end;
+	long new = simple_strtol(buf, &end, 0);
+	unsigned int old_val;
+	if (end == buf || new > UINT_MAX)
+		return -EINVAL;
+
+	mutex_lock(&test_dev->config_mutex);
+
+	old_val = *config;
+	*(unsigned int *)config = new;
+
+	ret = test_sync(test_dev);
+	if (ret) {
+		*(unsigned int *)config = old_val;
+
+		ret = test_sync(test_dev);
+		WARN_ON(ret);
+
+		mutex_unlock(&test_dev->config_mutex);
+		return -EINVAL;
+	}
+
+	mutex_unlock(&test_dev->config_mutex);
+	/* Always return full write size even if we didn't consume all */
+	return size;
+}
+
+static int test_dev_config_update_uint_range(struct kmod_test_device *test_dev,
+					     const char *buf, size_t size,
+					     unsigned int *config,
+					     unsigned int min,
+					     unsigned int max)
+{
+	char *end;
+	long new = simple_strtol(buf, &end, 0);
+	if (end == buf || new < min || new >  max || new > UINT_MAX)
+		return -EINVAL;
+
+	mutex_lock(&test_dev->config_mutex);
+	*(unsigned int *)config = new;
+	mutex_unlock(&test_dev->config_mutex);
+
+	/* Always return full write size even if we didn't consume all */
+	return size;
+}
+
+static int test_dev_config_update_int(struct kmod_test_device *test_dev,
+				      const char *buf, size_t size,
+				      int *config)
+{
+	char *end;
+	long new = simple_strtol(buf, &end, 0);
+	if (end == buf || new > INT_MAX || new < INT_MIN)
+		return -EINVAL;
+	mutex_lock(&test_dev->config_mutex);
+	*(int *)config = new;
+	mutex_unlock(&test_dev->config_mutex);
+	/* Always return full write size even if we didn't consume all */
+	return size;
+}
+
+static ssize_t test_dev_config_show_int(struct kmod_test_device *test_dev,
+					char *buf,
+					int config)
+{
+	int val;
+
+	mutex_lock(&test_dev->config_mutex);
+	val = config;
+	mutex_unlock(&test_dev->config_mutex);
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", val);
+}
+
+static ssize_t test_dev_config_show_uint(struct kmod_test_device *test_dev,
+					 char *buf,
+					 unsigned int config)
+{
+	unsigned int val;
+
+	mutex_lock(&test_dev->config_mutex);
+	val = config;
+	mutex_unlock(&test_dev->config_mutex);
+
+	return snprintf(buf, PAGE_SIZE, "%u\n", val);
+}
+
+static ssize_t test_result_store(struct device *dev,
+				 struct device_attribute *attr,
+				 const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_update_int(test_dev, buf, count,
+					  &config->test_result);
+}
+
+static ssize_t config_num_threads_store(struct device *dev,
+					struct device_attribute *attr,
+					const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_update_uint_sync(test_dev, buf, count,
+						&config->num_threads,
+						kmod_config_sync_info);
+}
+
+static ssize_t config_num_threads_show(struct device *dev,
+				       struct device_attribute *attr,
+				       char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_show_int(test_dev, buf, config->num_threads);
+}
+static DEVICE_ATTR(config_num_threads, 0644, config_num_threads_show,
+		   config_num_threads_store);
+
+static ssize_t config_test_case_store(struct device *dev,
+				      struct device_attribute *attr,
+				      const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_update_uint_range(test_dev, buf, count,
+						 &config->test_case,
+						 __TEST_KMOD_INVALID + 1,
+						 __TEST_KMOD_MAX - 1);
+}
+
+static ssize_t config_test_case_show(struct device *dev,
+				     struct device_attribute *attr,
+				     char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_show_uint(test_dev, buf, config->test_case);
+}
+static DEVICE_ATTR(config_test_case, 0644, config_test_case_show,
+		   config_test_case_store);
+
+static ssize_t test_result_show(struct device *dev,
+				struct device_attribute *attr,
+				char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_show_int(test_dev, buf, config->test_result);
+}
+static DEVICE_ATTR(test_result, 0644, test_result_show, test_result_store);
+
+#define TEST_KMOD_DEV_ATTR(name)		&dev_attr_##name.attr
+
+static struct attribute *test_dev_attrs[] = {
+	TEST_KMOD_DEV_ATTR(trigger_config),
+	TEST_KMOD_DEV_ATTR(config),
+	TEST_KMOD_DEV_ATTR(reset),
+
+	TEST_KMOD_DEV_ATTR(config_test_driver),
+	TEST_KMOD_DEV_ATTR(config_test_fs),
+	TEST_KMOD_DEV_ATTR(config_num_threads),
+	TEST_KMOD_DEV_ATTR(config_test_case),
+	TEST_KMOD_DEV_ATTR(test_result),
+
+	NULL,
+};
+
+ATTRIBUTE_GROUPS(test_dev);
+
+static int kmod_config_init(struct kmod_test_device *test_dev)
+{
+	int ret;
+
+	mutex_lock(&test_dev->config_mutex);
+	ret = __kmod_config_init(test_dev);
+	mutex_unlock(&test_dev->config_mutex);
+
+	return ret;
+}
+
+/*
+ * XXX: this could perhaps be made generic already too, but a hunt
+ * for actual users would be needed first. It could be generic
+ * if other test drivers end up using a similar mechanism.
+ */
+const char *test_dev_get_name(const char *base, int idx, gfp_t gfp)
+{
+	const char *name_const;
+	char *name;
+
+	if (!base)
+		return NULL;
+	if (strlen(base) > 30)
+		return NULL;
+	name = kzalloc(1024, gfp);
+	if (!name)
+		return NULL;
+
+	strncat(name, base, strlen(base));
+	sprintf(name+(strlen(base)), "%d", idx);
+	name_const = kstrdup_const(name, gfp);
+
+	kfree(name);
+
+	return name_const;
+}
+
+static struct kmod_test_device *alloc_test_dev_kmod(int idx)
+{
+	int ret;
+	struct kmod_test_device *test_dev;
+	struct miscdevice *misc_dev;
+
+	test_dev = vzalloc(sizeof(struct kmod_test_device));
+	if (!test_dev) {
+		pr_err("Cannot alloc test_dev\n");
+		goto err_out;
+	}
+
+	mutex_init(&test_dev->config_mutex);
+	mutex_init(&test_dev->trigger_mutex);
+	mutex_init(&test_dev->thread_mutex);
+
+	init_completion(&test_dev->kthreads_done);
+
+	ret = kmod_config_init(test_dev);
+	if (ret < 0) {
+		pr_err("Cannot alloc kmod_config_init()\n");
+		goto err_out_free;
+	}
+
+	test_dev->dev_idx = idx;
+	misc_dev = &test_dev->misc_dev;
+
+	misc_dev->minor = MISC_DYNAMIC_MINOR;
+	misc_dev->name = test_dev_get_name("test_kmod", test_dev->dev_idx,
+					   GFP_KERNEL);
+	if (!misc_dev->name) {
+		pr_err("Cannot alloc misc_dev->name\n");
+		goto err_out_free_config;
+	}
+	misc_dev->groups = test_dev_groups;
+
+	return test_dev;
+
+err_out_free_config:
+	free_test_dev_info(test_dev);
+	kmod_config_free(test_dev);
+err_out_free:
+	vfree(test_dev);
+	test_dev = NULL;
+err_out:
+	return NULL;
+}
+
+static void free_test_dev_kmod(struct kmod_test_device *test_dev)
+{
+	if (test_dev) {
+		kfree_const(test_dev->misc_dev.name);
+		test_dev->misc_dev.name = NULL;
+		free_test_dev_info(test_dev);
+		kmod_config_free(test_dev);
+		vfree(test_dev);
+		test_dev = NULL;
+	}
+}
+
+static struct kmod_test_device *register_test_dev_kmod(void)
+{
+	struct kmod_test_device *test_dev = NULL;
+	int ret;
+
+	mutex_unlock(&reg_dev_mutex);
+
+	/* int should suffice for number of devices, test for wrap */
+	if (unlikely(num_test_devs + 1) < 0) {
+		pr_err("reached limit of number of test devices\n");
+		goto out;
+	}
+
+	test_dev = alloc_test_dev_kmod(num_test_devs);
+	if (!test_dev)
+		goto out;
+
+	ret = misc_register(&test_dev->misc_dev);
+	if (ret) {
+		pr_err("could not register misc device: %d\n", ret);
+		free_test_dev_kmod(test_dev);
+		goto out;
+	}
+
+	test_dev->dev = test_dev->misc_dev.this_device;
+	list_add_tail(&test_dev->list, &reg_test_devs);
+	dev_info(test_dev->dev, "interface ready\n");
+
+	num_test_devs++;
+
+out:
+	mutex_unlock(&reg_dev_mutex);
+
+	return test_dev;
+
+}
+
+static int __init test_kmod_init(void)
+{
+	struct kmod_test_device *test_dev;
+	int ret;
+
+	test_dev = register_test_dev_kmod();
+	if (!test_dev) {
+		pr_err("Cannot add first test kmod device\n");
+		return -ENODEV;
+	}
+
+	/*
+	 * With some work we might be able to gracefully enable
+	 * testing with this driver built-in, for now this seems
+	 * rather risky. For those willing to try have at it,
+	 * and enable the below. Good luck! If that works, try
+	 * lowering the init level for more fun.
+	 */
+	if (force_init_test) {
+		ret = trigger_config_run_driver(test_dev, "tun");
+		if (WARN_ON(ret))
+			return ret;
+		ret = trigger_config_run_fs(test_dev, "btrfs");
+		if (WARN_ON(ret))
+			return ret;
+	}
+
+	return 0;
+}
+late_initcall(test_kmod_init);
+
+static
+void unregister_test_dev_kmod(struct kmod_test_device *test_dev)
+{
+	mutex_lock(&test_dev->trigger_mutex);
+	mutex_lock(&test_dev->config_mutex);
+
+	test_dev_kmod_stop_tests(test_dev);
+
+	dev_info(test_dev->dev, "removing interface\n");
+	misc_deregister(&test_dev->misc_dev);
+
+	mutex_unlock(&test_dev->config_mutex);
+	mutex_unlock(&test_dev->trigger_mutex);
+
+	free_test_dev_kmod(test_dev);
+}
+
+static void __exit test_kmod_exit(void)
+{
+	struct kmod_test_device *test_dev, *tmp;
+
+	mutex_lock(&reg_dev_mutex);
+	list_for_each_entry_safe(test_dev, tmp, &reg_test_devs, list) {
+		list_del(&test_dev->list);
+		unregister_test_dev_kmod(test_dev);
+	}
+	mutex_unlock(&reg_dev_mutex);
+}
+module_exit(test_kmod_exit);
+
+MODULE_AUTHOR("Luis R. Rodriguez <mcgrof@kernel.org>");
+MODULE_LICENSE("GPL");
diff --git a/tools/testing/selftests/kmod/Makefile b/tools/testing/selftests/kmod/Makefile
new file mode 100644
index 000000000000..fa2ccc5fb3de
--- /dev/null
+++ b/tools/testing/selftests/kmod/Makefile
@@ -0,0 +1,11 @@
+# Makefile for kmod loading selftests
+
+# No binaries, but make sure arg-less "make" doesn't trigger "run_tests"
+all:
+
+TEST_PROGS := kmod.sh
+
+include ../lib.mk
+
+# Nothing to clean up.
+clean:
diff --git a/tools/testing/selftests/kmod/config b/tools/testing/selftests/kmod/config
new file mode 100644
index 000000000000..259f4fd6b5e2
--- /dev/null
+++ b/tools/testing/selftests/kmod/config
@@ -0,0 +1,7 @@
+CONFIG_TEST_KMOD=m
+CONFIG_TEST_LKM=m
+CONFIG_XFS_FS=m
+
+# For the module parameter force_init_test is used
+CONFIG_TUN=m
+CONFIG_BTRFS_FS=m
diff --git a/tools/testing/selftests/kmod/kmod.sh b/tools/testing/selftests/kmod/kmod.sh
new file mode 100755
index 000000000000..9ea1864d8bae
--- /dev/null
+++ b/tools/testing/selftests/kmod/kmod.sh
@@ -0,0 +1,449 @@
+#!/bin/bash
+#
+# Copyright (C) 2016 Luis R. Rodriguez <mcgrof@kernel.org>
+#
+# This program is free software; you can redistribute it and/or modify it
+# under the terms of copyleft-next (version 0.3.1 or later) as published
+# at http://copyleft-next.org/.
+
+# This is a stress test script for kmod, the kernel module loader. It uses
+# test_kmod which exposes a series of knobs for the API for us so we can
+# tweak each test in userspace rather than in kernelspace.
+#
+# The way kmod works is it uses the kernel's usermode helper API to eventually
+# call /sbin/modprobe. It has a limit of the number of concurrent calls
+# possible. The kernel interface to load modules is request_module(), however
+# mount uses get_fs_type(). Both behave slightly differently, but the
+# differences are important enough to test each call separately. For this
+# reason test_kmod starts by providing tests for both calls.
+#
+# The test driver test_kmod assumes a series of defaults which you can
+# override by exporting to your environment prior running this script.
+# For instance this script assumes you do not have xfs loaded upon boot.
+# If this is false, export DEFAULT_KMOD_FS="ext4" prior to running this
+# script if the filesyste module you don't have loaded upon bootup
+# is ext4 instead. Refer to allow_user_defaults() for a list of user
+# override variables possible.
+#
+# You'll want at least 4096 GiB of RAM to expect to run these tests
+# without running out of memory on them. For other requirements refer
+# to test_reqs()
+
+set -e
+
+TEST_DRIVER="test_kmod"
+
+function allow_user_defaults()
+{
+	if [ -z $DEFAULT_KMOD_DRIVER ]; then
+		DEFAULT_KMOD_DRIVER="test_module"
+	fi
+
+	if [ -z $DEFAULT_KMOD_FS ]; then
+		DEFAULT_KMOD_FS="xfs"
+	fi
+
+	if [ -z $PROC_DIR ]; then
+		PROC_DIR="/proc/sys/kernel/"
+	fi
+
+	if [ -z $MODPROBE_LIMIT ]; then
+		MODPROBE_LIMIT=50
+	fi
+
+	if [ -z $DIR ]; then
+		DIR="/sys/devices/virtual/misc/${TEST_DRIVER}0/"
+	fi
+
+	MODPROBE_LIMIT_FILE="${PROC_DIR}/kmod-limit"
+}
+
+test_reqs()
+{
+	if ! which modprobe 2> /dev/null > /dev/null; then
+		echo "$0: You need modprobe installed"
+		exit 1
+	fi
+
+	if ! which kmod 2> /dev/null > /dev/null; then
+		echo "$0: You need kmod installed"
+		exit 1
+	fi
+
+	# kmod 19 has a bad bug where it returns 0 when modprobe
+	# gets called *even* if the module was not loaded due to
+	# some bad heuristics. For details see:
+	#
+	# A work around is possible in-kernel but its rather
+	# complex.
+	KMOD_VERSION=$(kmod --version | awk '{print $3}')
+	if [[ $KMOD_VERSION  -le 19 ]]; then
+		echo "$0: You need at least kmod 20"
+		echo "kmod <= 19 is buggy, for details see:"
+		echo "http://git.kernel.org/cgit/utils/kernel/kmod/kmod.git/commit/libkmod/libkmod-module.c?id=fd44a98ae2eb5eb32161088954ab21e58e19dfc4"
+		exit 1
+	fi
+}
+
+function load_req_mod()
+{
+	if [ ! -d $DIR ]; then
+		# Alanis: "Oh isn't it ironic?"
+		modprobe $TEST_DRIVER
+		if [ ! -d $DIR ]; then
+			echo "$0: $DIR not present"
+			echo "You must have the following enabled in your kernel:"
+			cat $PWD/config
+			exit 1
+		fi
+	fi
+}
+
+test_finish()
+{
+	echo "Test completed"
+}
+
+errno_name_to_val()
+{
+	case "$1" in
+	# kmod calls modprobe and upon of a module not found
+	# modprobe returns just 1... However in the kernel we
+	# *sometimes* see 256...
+	MODULE_NOT_FOUND)
+		echo 256;;
+	SUCCESS)
+		echo 0;;
+	-EPERM)
+		echo -1;;
+	-ENOENT)
+		echo -2;;
+	-EINVAL)
+		echo -22;;
+	-ERR_ANY)
+		echo -123456;;
+	*)
+		echo invalid;;
+	esac
+}
+
+errno_val_to_name()
+	case "$1" in
+	256)
+		echo MODULE_NOT_FOUND;;
+	0)
+		echo SUCCESS;;
+	-1)
+		echo -EPERM;;
+	-2)
+		echo -ENOENT;;
+	-22)
+		echo -EINVAL;;
+	-123456)
+		echo -ERR_ANY;;
+	*)
+		echo invalid;;
+	esac
+
+config_set_test_case_driver()
+{
+	if ! echo -n 1 >$DIR/config_test_case; then
+		echo "$0: Unable to set to test case to driver" >&2
+		exit 1
+	fi
+}
+
+config_set_test_case_fs()
+{
+	if ! echo -n 2 >$DIR/config_test_case; then
+		echo "$0: Unable to set to test case to fs" >&2
+		exit 1
+	fi
+}
+
+config_num_threads()
+{
+	if ! echo -n $1 >$DIR/config_num_threads; then
+		echo "$0: Unable to set to number of threads" >&2
+		exit 1
+	fi
+}
+
+config_get_modprobe_limit()
+{
+	if [[ -f ${MODPROBE_LIMIT_FILE} ]] ; then
+		MODPROBE_LIMIT=$(cat $MODPROBE_LIMIT_FILE)
+	fi
+	echo $MODPROBE_LIMIT
+}
+
+config_num_thread_limit_extra()
+{
+	MODPROBE_LIMIT=$(config_get_modprobe_limit)
+	let EXTRA_LIMIT=$MODPROBE_LIMIT+$1
+	config_num_threads $EXTRA_LIMIT
+}
+
+# For special characters use printf directly,
+# refer to kmod_test_0001
+config_set_driver()
+{
+	if ! echo -n $1 >$DIR/config_test_driver; then
+		echo "$0: Unable to set driver" >&2
+		exit 1
+	fi
+}
+
+config_set_fs()
+{
+	if ! echo -n $1 >$DIR/config_test_fs; then
+		echo "$0: Unable to set driver" >&2
+		exit 1
+	fi
+}
+
+config_get_driver()
+{
+	cat $DIR/config_test_driver
+}
+
+config_get_test_result()
+{
+	cat $DIR/test_result
+}
+
+config_reset()
+{
+	if ! echo -n "1" >"$DIR"/reset; then
+		echo "$0: reset shuld have worked" >&2
+		exit 1
+	fi
+}
+
+config_show_config()
+{
+	echo "----------------------------------------------------"
+	cat "$DIR"/config
+	echo "----------------------------------------------------"
+}
+
+config_trigger()
+{
+	if ! echo -n "1" >"$DIR"/trigger_config 2>/dev/null; then
+		echo "$1: FAIL - loading should have worked"
+		config_show_config
+		exit 1
+	fi
+	echo "$1: OK! - loading kmod test"
+}
+
+config_trigger_want_fail()
+{
+	if echo "1" > $DIR/trigger_config 2>/dev/null; then
+		echo "$1: FAIL - test case was expected to fail"
+		config_show_config
+		exit 1
+	fi
+	echo "$1: OK! - kmod test case failed as expected"
+}
+
+config_expect_result()
+{
+	RC=$(config_get_test_result)
+	RC_NAME=$(errno_val_to_name $RC)
+
+	ERRNO_NAME=$2
+	ERRNO=$(errno_name_to_val $ERRNO_NAME)
+
+	if [[ $ERRNO_NAME = "-ERR_ANY" ]]; then
+		if [[ $RC -ge 0 ]]; then
+			echo "$1: FAIL, test expects $ERRNO_NAME - got $RC_NAME ($RC)" >&2
+			config_show_config
+			exit 1
+		fi
+	elif [[ $RC != $ERRNO ]]; then
+		echo "$1: FAIL, test expects $ERRNO_NAME ($ERRNO) - got $RC_NAME ($RC)" >&2
+		config_show_config
+		exit 1
+	fi
+	echo "$1: OK! - Return value: $RC ($RC_NAME), expected $ERRNO_NAME"
+}
+
+kmod_defaults_driver()
+{
+	config_reset
+	modprobe -r $DEFAULT_KMOD_DRIVER
+	config_set_driver $DEFAULT_KMOD_DRIVER
+}
+
+kmod_defaults_fs()
+{
+	config_reset
+	modprobe -r $DEFAULT_KMOD_FS
+	config_set_fs $DEFAULT_KMOD_FS
+	config_set_test_case_fs
+}
+
+kmod_test_0001_driver()
+{
+	NAME='\000'
+
+	kmod_defaults_driver
+	config_num_threads 1
+	printf '\000' >"$DIR"/config_test_driver
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} MODULE_NOT_FOUND
+}
+
+kmod_test_0001_fs()
+{
+	NAME='\000'
+
+	kmod_defaults_fs
+	config_num_threads 1
+	printf '\000' >"$DIR"/config_test_fs
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} -EINVAL
+}
+
+kmod_test_0001()
+{
+	kmod_test_0001_driver
+	kmod_test_0001_fs
+}
+
+kmod_test_0002_driver()
+{
+	NAME="nope-$DEFAULT_KMOD_DRIVER"
+
+	kmod_defaults_driver
+	config_set_driver $NAME
+	config_num_threads 1
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} MODULE_NOT_FOUND
+}
+
+kmod_test_0002_fs()
+{
+	NAME="nope-$DEFAULT_KMOD_FS"
+
+	kmod_defaults_fs
+	config_set_fs $NAME
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} -EINVAL
+}
+
+kmod_test_0002()
+{
+	kmod_test_0002_driver
+	kmod_test_0002_fs
+}
+
+kmod_test_0003()
+{
+	kmod_defaults_fs
+	config_num_threads 1
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+kmod_test_0004()
+{
+	kmod_defaults_fs
+	config_num_threads 2
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+kmod_test_0005()
+{
+	kmod_defaults_driver
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+kmod_test_0006()
+{
+	kmod_defaults_fs
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+kmod_test_0007()
+{
+	kmod_test_0005
+	kmod_test_0006
+}
+
+kmod_test_0008()
+{
+	kmod_defaults_driver
+	MODPROBE_LIMIT=$(config_get_modprobe_limit)
+	let EXTRA=$MODPROBE_LIMIT/2
+	config_num_thread_limit_extra $EXTRA
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} -EINVAL
+}
+
+kmod_test_0009()
+{
+	kmod_defaults_fs
+	#MODPROBE_LIMIT=$(config_get_modprobe_limit)
+	#let EXTRA=$MODPROBE_LIMIT/3
+	config_num_thread_limit_extra 5
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} -EINVAL
+}
+
+trap "test_finish" EXIT
+test_reqs
+allow_user_defaults
+load_req_mod
+
+usage()
+{
+	echo "Usage: $0 [ -t <4-number-digit> ]"
+	echo "Valid tests: 0001-0011"
+	echo
+	echo "0001 - Simple test - 1 thread  for empty string"
+	echo "0002 - Simple test - 1 thread  for modules/filesystems that do not exist"
+	echo "0003 - Simple test - 1 thread  for get_fs_type() only"
+	echo "0004 - Simple test - 2 threads for get_fs_type() only"
+	echo "0005 - multithreaded tests with default setup - request_module() only"
+	echo "0006 - multithreaded tests with default setup - get_fs_type() only"
+	echo "0007 - multithreaded tests with default setup test request_module() and get_fs_type()"
+	echo "0008 - multithreaded - push kmod_concurrent over max_modprobes for request_module()"
+	echo "0009 - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()"
+	exit 1
+}
+
+# You can ask for a specific test:
+if [[ $# > 0 ]] ; then
+	if [[ $1 != "-t" ]]; then
+		usage
+	fi
+
+	re='^[0-9]+$'
+	if ! [[ $2 =~ $re ]]; then
+		usage
+	fi
+
+	RUN_TEST=kmod_test_$2
+	$RUN_TEST
+	exit 0
+fi
+
+# Once tese are enabled please leave them as-is. Write your own test,
+# we have tons of space.
+kmod_test_0001
+kmod_test_0002
+kmod_test_0003
+kmod_test_0004
+kmod_test_0005
+kmod_test_0006
+kmod_test_0007
+
+#kmod_test_0008
+#kmod_test_0009
+
+exit 0
-- 
2.10.1

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [RFC 02/10] module: fix memory leak on early load_module() failures
  2016-12-08 18:47 [RFC 00/10] kmod: stress test driver, few fixes and enhancements Luis R. Rodriguez
  2016-12-08 18:47 ` [RFC 01/10] kmod: add test driver to stress test the module loader Luis R. Rodriguez
@ 2016-12-08 19:48 ` Luis R. Rodriguez
  2016-12-08 20:30   ` Kees Cook
                     ` (2 more replies)
  2016-12-08 19:48 ` [RFC 03/10] kmod: add dynamic max concurrent thread count Luis R. Rodriguez
                   ` (8 subsequent siblings)
  10 siblings, 3 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-08 19:48 UTC (permalink / raw)
  To: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel, Luis R. Rodriguez

While looking for early possible module loading failures I was
able to reproduce a memory leak possible with kmemleak. There
are a few rare ways to trigger a failure:

  o we've run into a failure while processing kernel parameters
    (parse_args() returns an error)
  o mod_sysfs_setup() fails
  o we're a live patch module and copy_module_elf() fails

Chances of running into this issue is really low.

kmemleak splat:

unreferenced object 0xffff9f2c4ada1b00 (size 32):
  comm "kworker/u16:4", pid 82, jiffies 4294897636 (age 681.816s)
  hex dump (first 32 bytes):
    6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00  memstick0.......
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff8c6cfeba>] kmemleak_alloc+0x4a/0xa0
    [<ffffffff8c200046>] __kmalloc_track_caller+0x126/0x230
    [<ffffffff8c1bc581>] kstrdup+0x31/0x60
    [<ffffffff8c1bc5d4>] kstrdup_const+0x24/0x30
    [<ffffffff8c3c23aa>] kvasprintf_const+0x7a/0x90
    [<ffffffff8c3b5481>] kobject_set_name_vargs+0x21/0x90
    [<ffffffff8c4fbdd7>] dev_set_name+0x47/0x50
    [<ffffffffc07819e5>] memstick_check+0x95/0x33c [memstick]
    [<ffffffff8c09c893>] process_one_work+0x1f3/0x4b0
    [<ffffffff8c09cb98>] worker_thread+0x48/0x4e0
    [<ffffffff8c0a2b79>] kthread+0xc9/0xe0
    [<ffffffff8c6dab5f>] ret_from_fork+0x1f/0x40
    [<ffffffffffffffff>] 0xffffffffffffffff

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 kernel/module.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/module.c b/kernel/module.c
index f7482db0f843..e420ed67e533 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3722,6 +3722,7 @@ static int load_module(struct load_info *info, const char __user *uargs,
 	mod_sysfs_teardown(mod);
  coming_cleanup:
 	mod->state = MODULE_STATE_GOING;
+	destroy_params(mod->kp, mod->num_kp);
 	blocking_notifier_call_chain(&module_notify_list,
 				     MODULE_STATE_GOING, mod);
 	klp_module_going(mod);
-- 
2.10.1

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [RFC 03/10] kmod: add dynamic max concurrent thread count
  2016-12-08 18:47 [RFC 00/10] kmod: stress test driver, few fixes and enhancements Luis R. Rodriguez
  2016-12-08 18:47 ` [RFC 01/10] kmod: add test driver to stress test the module loader Luis R. Rodriguez
  2016-12-08 19:48 ` [RFC 02/10] module: fix memory leak on early load_module() failures Luis R. Rodriguez
@ 2016-12-08 19:48 ` Luis R. Rodriguez
  2016-12-08 20:28   ` Kees Cook
  2016-12-14 15:38   ` Petr Mladek
  2016-12-08 19:48 ` [RFC 04/10] kmod: provide wrappers for kmod_concurrent inc/dec Luis R. Rodriguez
                   ` (7 subsequent siblings)
  10 siblings, 2 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-08 19:48 UTC (permalink / raw)
  To: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel, Luis R. Rodriguez

We currently statically limit the number of modprobe threads which
we allow to run concurrently to 50. As per Keith Owens, this was a
completely arbitrary value, and it was set in the 2.3.38 days [0]
over 16 years ago in year 2000.

Although we haven't yet hit our lower limits, experimentation [1]
shows that when and if we hit this limit in the worst case, will be
fatal -- consider get_fs_type() failures upon mount on a system which
has many partitions, some of which might even be with the same
filesystem. Its best to be prudent and increase and set this
value to something more sensible which ensures we're far from hitting
the limit and also allows default build/user run time override.

The worst case is fatal given that once a module fails to load there
is a period of time during which subsequent request for the same module
will fail, so in the case of partitions its not just one request that
could fail, but whole series of partitions. This later issue of a
module request failure domino effect can be addressed later, but
increasing the limit to something more meaninful should at least give us
enough cushion to avoid this for a while.

Set this value up with a bit more meaninful modern limits:

Bump this up to 64  max for small systems (CONFIG_BASE_SMALL)
Bump this up to 128 max for larger systems (!CONFIG_BASE_SMALL)

Also allow the default max limit to be further fine tuned at compile
time and at initialization at run time at boot up using the kernel
parameter: max_modprobes.

[0] https://git.kernel.org/cgit/linux/kernel/git/history/history.git/commit/?id=ab1c4ec7410f6ec64e1511d1a7d850fc99c09b44
[1] https://github.com/mcgrof/test_request_module

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 Documentation/admin-guide/kernel-parameters.txt |  7 ++++
 include/linux/kmod.h                            |  3 +-
 init/Kconfig                                    | 23 +++++++++++++
 init/main.c                                     |  1 +
 kernel/kmod.c                                   | 43 ++++++++++++++++---------
 5 files changed, 61 insertions(+), 16 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index be2d6d0a03a4..92bcccc65ea4 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1700,6 +1700,13 @@
 
 	keepinitrd	[HW,ARM]
 
+	kmod.max_modprobes [KNL]
+			This lets you set the max allowed of concurrent
+			modprobes threads possible on a system overriding the
+			default heuristic of:
+
+				min(max_threads/2, 2 << CONFIG_MAX_KMOD_CONCURRENT)
+
 	kernelcore=	[KNL,X86,IA-64,PPC]
 			Format: nn[KMGTPE] | "mirror"
 			This parameter
diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index fcfd2bf14d3f..15783cd7f056 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -38,13 +38,14 @@ int __request_module(bool wait, const char *name, ...);
 #define request_module_nowait(mod...) __request_module(false, mod)
 #define try_then_request_module(x, mod...) \
 	((x) ?: (__request_module(true, mod), (x)))
+void init_kmod_umh(void);
 #else
 static inline int request_module(const char *name, ...) { return -ENOSYS; }
 static inline int request_module_nowait(const char *name, ...) { return -ENOSYS; }
+static inline void init_kmod_umh(void) { }
 #define try_then_request_module(x, mod...) (x)
 #endif
 
-
 struct cred;
 struct file;
 
diff --git a/init/Kconfig b/init/Kconfig
index 271692a352f1..da2c25746937 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -2111,6 +2111,29 @@ config TRIM_UNUSED_KSYMS
 
 	  If unsure, or if you need to build out-of-tree modules, say N.
 
+config MAX_KMOD_CONCURRENT
+	int "Max allowed concurrent request_module() calls (6=>64, 10=>1024)"
+	range 0 14
+	default 6 if !BASE_SMALL
+	default 7 if BASE_SMALL
+	help
+	  The kernel restricts the number of possible concurrent calls to
+	  request_module() to help avoid a recursive loop possible with
+	  modules. The default maximum number of concurrent threads allowed
+	  to run request_module() will be:
+
+	    max_modprobes = min(max_threads/2, 2 << CONFIG_MAX_KMOD_CONCURRENT);
+
+	  The value set in CONFIG_MAX_KMOD_CONCURRENT represents then the power
+	  of 2 value used at boot time for the above computation. You can
+	  override the default built value using the kernel parameter:
+
+		kmod.max_modprobes=4096
+
+	  We set this to default to 64 (2^6) concurrent modprobe threads for
+	  small systems, for larger systems this defaults to 128 (2^7)
+	  concurrent modprobe threads.
+
 endif # MODULES
 
 config MODULES_TREE_LOOKUP
diff --git a/init/main.c b/init/main.c
index 8161208d4ece..1fa441aa32c6 100644
--- a/init/main.c
+++ b/init/main.c
@@ -638,6 +638,7 @@ asmlinkage __visible void __init start_kernel(void)
 	thread_stack_cache_init();
 	cred_init();
 	fork_init();
+	init_kmod_umh();
 	proc_caches_init();
 	buffer_init();
 	key_init();
diff --git a/kernel/kmod.c b/kernel/kmod.c
index 0277d1216f80..cb6f7ca7b8a5 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -44,6 +44,9 @@
 #include <trace/events/module.h>
 
 extern int max_threads;
+unsigned int max_modprobes;
+module_param(max_modprobes, uint, 0644);
+MODULE_PARM_DESC(max_modprobes, "Max number of allowed concurrent modprobes");
 
 #define CAP_BSET	(void *)1
 #define CAP_PI		(void *)2
@@ -125,10 +128,8 @@ int __request_module(bool wait, const char *fmt, ...)
 {
 	va_list args;
 	char module_name[MODULE_NAME_LEN];
-	unsigned int max_modprobes;
 	int ret;
 	static atomic_t kmod_concurrent = ATOMIC_INIT(0);
-#define MAX_KMOD_CONCURRENT 50	/* Completely arbitrary value - KAO */
 	static int kmod_loop_msg;
 
 	/*
@@ -152,19 +153,6 @@ int __request_module(bool wait, const char *fmt, ...)
 	if (ret)
 		return ret;
 
-	/* If modprobe needs a service that is in a module, we get a recursive
-	 * loop.  Limit the number of running kmod threads to max_threads/2 or
-	 * MAX_KMOD_CONCURRENT, whichever is the smaller.  A cleaner method
-	 * would be to run the parents of this process, counting how many times
-	 * kmod was invoked.  That would mean accessing the internals of the
-	 * process tables to get the command line, proc_pid_cmdline is static
-	 * and it is not worth changing the proc code just to handle this case. 
-	 * KAO.
-	 *
-	 * "trace the ppid" is simple, but will fail if someone's
-	 * parent exits.  I think this is as good as it gets. --RR
-	 */
-	max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
 	atomic_inc(&kmod_concurrent);
 	if (atomic_read(&kmod_concurrent) > max_modprobes) {
 		/* We may be blaming an innocent here, but unlikely */
@@ -186,6 +174,31 @@ int __request_module(bool wait, const char *fmt, ...)
 	return ret;
 }
 EXPORT_SYMBOL(__request_module);
+
+/*
+ * If modprobe needs a service that is in a module, we get a recursive
+ * loop.  Limit the number of running kmod threads to max_threads/2 or
+ * CONFIG_MAX_KMOD_CONCURRENT, whichever is the smaller.  A cleaner method
+ * would be to run the parents of this process, counting how many times
+ * kmod was invoked.  That would mean accessing the internals of the
+ * process tables to get the command line, proc_pid_cmdline is static
+ * and it is not worth changing the proc code just to handle this case.
+ *
+ * "trace the ppid" is simple, but will fail if someone's
+ * parent exits.  I think this is as good as it gets.
+ *
+ * You can override with with a kernel parameter, for instance to allow
+ * 4096 concurrent modprobe instances:
+ *
+ *	kmod.max_modprobes=4096
+ */
+void __init init_kmod_umh(void)
+{
+	if (!max_modprobes)
+		max_modprobes = min(max_threads/2,
+				    2 << CONFIG_MAX_KMOD_CONCURRENT);
+}
+
 #endif /* CONFIG_MODULES */
 
 static void call_usermodehelper_freeinfo(struct subprocess_info *info)
-- 
2.10.1

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [RFC 04/10] kmod: provide wrappers for kmod_concurrent inc/dec
  2016-12-08 18:47 [RFC 00/10] kmod: stress test driver, few fixes and enhancements Luis R. Rodriguez
                   ` (2 preceding siblings ...)
  2016-12-08 19:48 ` [RFC 03/10] kmod: add dynamic max concurrent thread count Luis R. Rodriguez
@ 2016-12-08 19:48 ` Luis R. Rodriguez
  2016-12-08 20:29   ` Kees Cook
  2016-12-22  5:07   ` Jessica Yu
  2016-12-08 19:48 ` [RFC 05/10] kmod: return -EBUSY if modprobe limit is reached Luis R. Rodriguez
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-08 19:48 UTC (permalink / raw)
  To: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel, Luis R. Rodriguez

kmod_concurrent is used as an atomic counter for enabling
the allowed limit of modprobe calls, provide wrappers for it
to enable this to be expanded on more easily. This will be done
later.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 kernel/kmod.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index cb6f7ca7b8a5..049d7eabda38 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -44,6 +44,9 @@
 #include <trace/events/module.h>
 
 extern int max_threads;
+
+static atomic_t kmod_concurrent = ATOMIC_INIT(0);
+
 unsigned int max_modprobes;
 module_param(max_modprobes, uint, 0644);
 MODULE_PARM_DESC(max_modprobes, "Max number of allowed concurrent modprobes");
@@ -108,6 +111,20 @@ static int call_modprobe(char *module_name, int wait)
 	return -ENOMEM;
 }
 
+static int kmod_umh_threads_get(void)
+{
+	atomic_inc(&kmod_concurrent);
+	if (atomic_read(&kmod_concurrent) < max_modprobes)
+		return 0;
+	atomic_dec(&kmod_concurrent);
+	return -ENOMEM;
+}
+
+static void kmod_umh_threads_put(void)
+{
+	atomic_dec(&kmod_concurrent);
+}
+
 /**
  * __request_module - try to load a kernel module
  * @wait: wait (or not) for the operation to complete
@@ -129,7 +146,6 @@ int __request_module(bool wait, const char *fmt, ...)
 	va_list args;
 	char module_name[MODULE_NAME_LEN];
 	int ret;
-	static atomic_t kmod_concurrent = ATOMIC_INIT(0);
 	static int kmod_loop_msg;
 
 	/*
@@ -153,8 +169,8 @@ int __request_module(bool wait, const char *fmt, ...)
 	if (ret)
 		return ret;
 
-	atomic_inc(&kmod_concurrent);
-	if (atomic_read(&kmod_concurrent) > max_modprobes) {
+	ret = kmod_umh_threads_get();
+	if (ret) {
 		/* We may be blaming an innocent here, but unlikely */
 		if (kmod_loop_msg < 5) {
 			printk(KERN_ERR
@@ -162,15 +178,14 @@ int __request_module(bool wait, const char *fmt, ...)
 			       module_name);
 			kmod_loop_msg++;
 		}
-		atomic_dec(&kmod_concurrent);
-		return -ENOMEM;
+		return ret;
 	}
 
 	trace_module_request(module_name, wait, _RET_IP_);
 
 	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
 
-	atomic_dec(&kmod_concurrent);
+	kmod_umh_threads_put();
 	return ret;
 }
 EXPORT_SYMBOL(__request_module);
-- 
2.10.1

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [RFC 05/10] kmod: return -EBUSY if modprobe limit is reached
  2016-12-08 18:47 [RFC 00/10] kmod: stress test driver, few fixes and enhancements Luis R. Rodriguez
                   ` (3 preceding siblings ...)
  2016-12-08 19:48 ` [RFC 04/10] kmod: provide wrappers for kmod_concurrent inc/dec Luis R. Rodriguez
@ 2016-12-08 19:48 ` Luis R. Rodriguez
  2016-12-08 19:48 ` [RFC 06/10] kmod: provide sanity check on kmod_concurrent access Luis R. Rodriguez
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-08 19:48 UTC (permalink / raw)
  To: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel, Luis R. Rodriguez

Running out of our modprobe limit is not a memory limit but
a system specific established limitation set to avoid a possible
recursive issue with modprobe. This gives userspace a better idea
of what happened if we can't load a module, it could use this to
wait and try again.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 kernel/kmod.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index 049d7eabda38..ab38539f7e91 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -117,7 +117,7 @@ static int kmod_umh_threads_get(void)
 	if (atomic_read(&kmod_concurrent) < max_modprobes)
 		return 0;
 	atomic_dec(&kmod_concurrent);
-	return -ENOMEM;
+	return -EBUSY;
 }
 
 static void kmod_umh_threads_put(void)
-- 
2.10.1

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [RFC 06/10] kmod: provide sanity check on kmod_concurrent access
  2016-12-08 18:47 [RFC 00/10] kmod: stress test driver, few fixes and enhancements Luis R. Rodriguez
                   ` (4 preceding siblings ...)
  2016-12-08 19:48 ` [RFC 05/10] kmod: return -EBUSY if modprobe limit is reached Luis R. Rodriguez
@ 2016-12-08 19:48 ` Luis R. Rodriguez
  2016-12-14 16:08   ` Petr Mladek
  2016-12-15 12:57   ` Petr Mladek
  2016-12-08 19:49 ` [RFC 07/10] kmod: use simplified rate limit printk Luis R. Rodriguez
                   ` (4 subsequent siblings)
  10 siblings, 2 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-08 19:48 UTC (permalink / raw)
  To: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel, Luis R. Rodriguez

Only decrement *iff* we're possitive. Warn if we've hit
a situation where the counter is already 0 after we're done
with a modprobe call, this would tell us we have an unaccounted
counter access -- this in theory should not be possible as
only one routine controls the counter, however preemption is
one case that could trigger this situation. Avoid that situation
by disabling preemptiong while we access the counter.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 kernel/kmod.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index ab38539f7e91..09cf35a2075a 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -113,16 +113,28 @@ static int call_modprobe(char *module_name, int wait)
 
 static int kmod_umh_threads_get(void)
 {
+	int ret = 0;
+
+	preempt_disable();
 	atomic_inc(&kmod_concurrent);
 	if (atomic_read(&kmod_concurrent) < max_modprobes)
-		return 0;
-	atomic_dec(&kmod_concurrent);
-	return -EBUSY;
+		goto out;
+
+	atomic_dec_if_positive(&kmod_concurrent);
+	ret = -EBUSY;
+out:
+	preempt_enable();
+	return 0;
 }
 
 static void kmod_umh_threads_put(void)
 {
-	atomic_dec(&kmod_concurrent);
+	int ret;
+
+	preempt_disable();
+	ret = atomic_dec_if_positive(&kmod_concurrent);
+	WARN_ON(ret < 0);
+	preempt_enable();
 }
 
 /**
-- 
2.10.1

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [RFC 07/10] kmod: use simplified rate limit printk
  2016-12-08 18:47 [RFC 00/10] kmod: stress test driver, few fixes and enhancements Luis R. Rodriguez
                   ` (5 preceding siblings ...)
  2016-12-08 19:48 ` [RFC 06/10] kmod: provide sanity check on kmod_concurrent access Luis R. Rodriguez
@ 2016-12-08 19:49 ` Luis R. Rodriguez
  2016-12-14 16:23   ` Petr Mladek
  2016-12-08 19:49 ` [RFC 08/10] sysctl: add support for unsigned int properly Luis R. Rodriguez
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-08 19:49 UTC (permalink / raw)
  To: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel, Luis R. Rodriguez

Just use the simplified rate limit printk when the max modprobe
limit is reached, while at it throw out a bone should the error
be triggered.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 kernel/kmod.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index 09cf35a2075a..ef65f4c3578a 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -158,7 +158,6 @@ int __request_module(bool wait, const char *fmt, ...)
 	va_list args;
 	char module_name[MODULE_NAME_LEN];
 	int ret;
-	static int kmod_loop_msg;
 
 	/*
 	 * We don't allow synchronous module loading from async.  Module
@@ -183,13 +182,8 @@ int __request_module(bool wait, const char *fmt, ...)
 
 	ret = kmod_umh_threads_get();
 	if (ret) {
-		/* We may be blaming an innocent here, but unlikely */
-		if (kmod_loop_msg < 5) {
-			printk(KERN_ERR
-			       "request_module: runaway loop modprobe %s\n",
-			       module_name);
-			kmod_loop_msg++;
-		}
+		pr_err_ratelimited("request_module: modprobe limit (%u) reached with module %s\n",
+				   max_modprobes, module_name);
 		return ret;
 	}
 
-- 
2.10.1

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [RFC 08/10] sysctl: add support for unsigned int properly
  2016-12-08 18:47 [RFC 00/10] kmod: stress test driver, few fixes and enhancements Luis R. Rodriguez
                   ` (6 preceding siblings ...)
  2016-12-08 19:49 ` [RFC 07/10] kmod: use simplified rate limit printk Luis R. Rodriguez
@ 2016-12-08 19:49 ` Luis R. Rodriguez
  2016-12-08 19:49 ` [RFC 09/10] kmod: add helpers for getting kmod count and limit Luis R. Rodriguez
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-08 19:49 UTC (permalink / raw)
  To: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel, Luis R. Rodriguez

Commit e7d316a02f6838 ("sysctl: handle error writing UINT_MAX to u32 fields")
added proc_douintvec() to start help adding support for unsigned int,
this however was only half the work needed, all these issues are present
with the current implementation:

  o Printing the values shows a negative value, this happens
    since do_proc_dointvec() and this uses proc_put_long()
  o We can easily wrap around the int values: UINT_MAX is
    4294967295, if we echo in 4294967295 + 1 we end up with 0,
    using 4294967295 + 2 we end up with 1.
 o We echo negative values in and they are accepted

Fix all these issues by adding our own do_proc_douintvec(). Likewise to
keep parity provide the other typically useful proc_douintvec_minmax().
Adding proc_douintvec_minmax_sysadmin() is easy but we wait for an actual
user for that.

Cc: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: e7d316a02f68 ("sysctl: handle error writing UINT_MAX to u32 fields")
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 include/linux/sysctl.h |   3 +
 kernel/sysctl.c        | 184 +++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 181 insertions(+), 6 deletions(-)

diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index adf4e51cf597..a35d40ecc211 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -47,6 +47,9 @@ extern int proc_douintvec(struct ctl_table *, int,
 			 void __user *, size_t *, loff_t *);
 extern int proc_dointvec_minmax(struct ctl_table *, int,
 				void __user *, size_t *, loff_t *);
+extern int proc_douintvec_minmax(struct ctl_table *table, int write,
+				 void __user *buffer, size_t *lenp,
+				 loff_t *ppos);
 extern int proc_dointvec_jiffies(struct ctl_table *, int,
 				 void __user *, size_t *, loff_t *);
 extern int proc_dointvec_userhz_jiffies(struct ctl_table *, int,
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 1a292ebcbbb6..06711e648fa3 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -2125,12 +2125,12 @@ static int do_proc_dointvec_conv(bool *negp, unsigned long *lvalp,
 	return 0;
 }
 
-static int do_proc_douintvec_conv(bool *negp, unsigned long *lvalp,
-				 int *valp,
-				 int write, void *data)
+static int do_proc_douintvec_conv(unsigned long *lvalp,
+				  unsigned int *valp,
+				  int write, void *data)
 {
 	if (write) {
-		if (*negp)
+		if (*lvalp > (unsigned long) UINT_MAX)
 			return -EINVAL;
 		*valp = *lvalp;
 	} else {
@@ -2243,6 +2243,115 @@ static int do_proc_dointvec(struct ctl_table *table, int write,
 			buffer, lenp, ppos, conv, data);
 }
 
+static int __do_proc_douintvec(void *tbl_data, struct ctl_table *table,
+			       int write, void __user *buffer,
+			       size_t *lenp, loff_t *ppos,
+			       int (*conv)(unsigned long *lvalp,
+					   unsigned int *valp,
+					   int write, void *data),
+			       void *data)
+{
+	unsigned int *i, vleft;
+	bool first = true;
+	int err = 0;
+	size_t left;
+	char *kbuf = NULL, *p;
+
+	if (!tbl_data || !table->maxlen || !*lenp || (*ppos && !write)) {
+		*lenp = 0;
+		return 0;
+	}
+
+	i = (unsigned int *) tbl_data;
+	vleft = table->maxlen / sizeof(*i);
+	left = *lenp;
+
+	if (!conv)
+		conv = do_proc_douintvec_conv;
+
+	if (write) {
+		if (*ppos) {
+			switch (sysctl_writes_strict) {
+			case SYSCTL_WRITES_STRICT:
+				goto out;
+			case SYSCTL_WRITES_WARN:
+				warn_sysctl_write(table);
+				break;
+			default:
+				break;
+			}
+		}
+
+		if (left > PAGE_SIZE - 1)
+			left = PAGE_SIZE - 1;
+		p = kbuf = memdup_user_nul(buffer, left);
+		if (IS_ERR(kbuf))
+			return PTR_ERR(kbuf);
+	}
+
+	for (; left && vleft--; i++, first=false) {
+		unsigned long lval;
+		bool neg;
+
+		if (write) {
+			left -= proc_skip_spaces(&p);
+
+			if (!left)
+				break;
+			err = proc_get_long(&p, &left, &lval, &neg,
+					     proc_wspace_sep,
+					     sizeof(proc_wspace_sep), NULL);
+			if (neg) {
+				err = -EINVAL;
+				break;
+			}
+			if (err)
+				break;
+			if (conv(&lval, i, 1, data)) {
+				err = -EINVAL;
+				break;
+			}
+		} else {
+			if (conv(&lval, i, 0, data)) {
+				err = -EINVAL;
+				break;
+			}
+			if (!first)
+				err = proc_put_char(&buffer, &left, '\t');
+			if (err)
+				break;
+			err = proc_put_long(&buffer, &left, lval, false);
+			if (err)
+				break;
+		}
+	}
+
+	if (!write && !first && left && !err)
+		err = proc_put_char(&buffer, &left, '\n');
+	if (write && !err && left)
+		left -= proc_skip_spaces(&p);
+	if (write) {
+		kfree(kbuf);
+		if (first)
+			return err ? : -EINVAL;
+	}
+	*lenp -= left;
+out:
+	*ppos += *lenp;
+	return err;
+}
+
+static int do_proc_douintvec(struct ctl_table *table, int write,
+			     void __user *buffer, size_t *lenp, loff_t *ppos,
+			     int (*conv)(unsigned long *lvalp,
+					 unsigned int *valp,
+					 int write, void *data),
+			     void *data)
+{
+	return __do_proc_douintvec(table->data, table, write,
+				   buffer, lenp, ppos, conv, data);
+}
+
 /**
  * proc_dointvec - read a vector of integers
  * @table: the sysctl table
@@ -2278,8 +2387,8 @@ int proc_dointvec(struct ctl_table *table, int write,
 int proc_douintvec(struct ctl_table *table, int write,
 		     void __user *buffer, size_t *lenp, loff_t *ppos)
 {
-	return do_proc_dointvec(table, write, buffer, lenp, ppos,
-				do_proc_douintvec_conv, NULL);
+	return do_proc_douintvec(table, write, buffer, lenp, ppos,
+				 do_proc_douintvec_conv, NULL);
 }
 
 /*
@@ -2384,6 +2493,62 @@ int proc_dointvec_minmax(struct ctl_table *table, int write,
 				do_proc_dointvec_minmax_conv, &param);
 }
 
+struct do_proc_douintvec_minmax_conv_param {
+	unsigned int *min;
+	unsigned int *max;
+};
+
+static int do_proc_douintvec_minmax_conv(unsigned long *lvalp,
+					 unsigned int *valp,
+					 int write, void *data)
+{
+	struct do_proc_douintvec_minmax_conv_param *param = data;
+	if (write) {
+		unsigned int val = *lvalp;
+		if ((param->min && *param->min > val) ||
+		    (param->max && *param->max < val))
+			return -ERANGE;
+
+		if (*lvalp > (unsigned long) UINT_MAX)
+			return -EINVAL;
+		*valp = val;
+	} else {
+		unsigned int val = *valp;
+		*lvalp = (unsigned long) val;
+	}
+	return 0;
+}
+
+/**
+ * proc_douintvec_minmax - read a vector of unsigned ints with min/max values
+ * @table: the sysctl table
+ * @write: %TRUE if this is a write to the sysctl file
+ * @buffer: the user buffer
+ * @lenp: the size of the user buffer
+ * @ppos: file position
+ *
+ * Reads/writes up to table->maxlen/sizeof(unsigned int) unsigned integer
+ * values from/to the user buffer, treated as an ASCII string. Negative
+ * strings are not allowed.
+ *
+ * This routine will ensure the values are within the range specified by
+ * table->extra1 (min) and table->extra2 (max). There is a final sanity
+ * check for UINT_MAX to avoid having to support wrap around uses from
+ * userspace.
+ *
+ * Returns 0 on success.
+ */
+int proc_douintvec_minmax(struct ctl_table *table, int write,
+			  void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	struct do_proc_douintvec_minmax_conv_param param = {
+		.min = (unsigned int *) table->extra1,
+		.max = (unsigned int *) table->extra2,
+	};
+	return do_proc_douintvec(table, write, buffer, lenp, ppos,
+				 do_proc_douintvec_minmax_conv, &param);
+}
+
 static void validate_coredump_safety(void)
 {
 #ifdef CONFIG_COREDUMP
@@ -2891,6 +3056,12 @@ int proc_dointvec_minmax(struct ctl_table *table, int write,
 	return -ENOSYS;
 }
 
+int proc_douintvec_minmax(struct ctl_table *table, int write,
+			  void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	return -ENOSYS;
+}
+
 int proc_dointvec_jiffies(struct ctl_table *table, int write,
 		    void __user *buffer, size_t *lenp, loff_t *ppos)
 {
@@ -2933,6 +3104,7 @@ EXPORT_SYMBOL(proc_dointvec);
 EXPORT_SYMBOL(proc_douintvec);
 EXPORT_SYMBOL(proc_dointvec_jiffies);
 EXPORT_SYMBOL(proc_dointvec_minmax);
+EXPORT_SYMBOL_GPL(proc_douintvec_minmax);
 EXPORT_SYMBOL(proc_dointvec_userhz_jiffies);
 EXPORT_SYMBOL(proc_dointvec_ms_jiffies);
 EXPORT_SYMBOL(proc_dostring);
-- 
2.10.1

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [RFC 09/10] kmod: add helpers for getting kmod count and limit
  2016-12-08 18:47 [RFC 00/10] kmod: stress test driver, few fixes and enhancements Luis R. Rodriguez
                   ` (7 preceding siblings ...)
  2016-12-08 19:49 ` [RFC 08/10] sysctl: add support for unsigned int properly Luis R. Rodriguez
@ 2016-12-08 19:49 ` Luis R. Rodriguez
  2016-12-15 16:56   ` Petr Mladek
  2016-12-08 19:49 ` [RFC 10/10] kmod: add a sanity check on module loading Luis R. Rodriguez
  2017-01-11 19:10 ` [RFC 00/10] kmod: stress test driver, few fixes and enhancements Luis R. Rodriguez
  10 siblings, 1 reply; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-08 19:49 UTC (permalink / raw)
  To: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel, Luis R. Rodriguez

This adds helpers for getting access to the kmod count and limit from
userspace. While at it, this also lets userspace fine tune the kmod
limit after boot, it uses the shiny new proc_douintvec_minmax().

These knobs should help userspace more gracefully and deterministically
handle module loading.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 include/linux/kmod.h |  8 +++++
 kernel/kmod.c        | 83 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 kernel/sysctl.c      | 14 +++++++++
 3 files changed, 103 insertions(+), 2 deletions(-)

diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index 15783cd7f056..94c7379cff94 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -39,13 +39,21 @@ int __request_module(bool wait, const char *name, ...);
 #define try_then_request_module(x, mod...) \
 	((x) ?: (__request_module(true, mod), (x)))
 void init_kmod_umh(void);
+unsigned int get_kmod_umh_limit(void);
+int sysctl_kmod_count(struct ctl_table *table, int write,
+		      void __user *buffer, size_t *lenp, loff_t *ppos);
+int sysctl_kmod_limit(struct ctl_table *table, int write,
+		      void __user *buffer, size_t *lenp, loff_t *ppos);
 #else
 static inline int request_module(const char *name, ...) { return -ENOSYS; }
 static inline int request_module_nowait(const char *name, ...) { return -ENOSYS; }
 static inline void init_kmod_umh(void) { }
+static unsigned int get_kmod_umh_limit(void) { return 0; }
 #define try_then_request_module(x, mod...) (x)
 #endif
 
+#define get_kmod_umh_limit get_kmod_umh_limit
+
 struct cred;
 struct file;
 
diff --git a/kernel/kmod.c b/kernel/kmod.c
index ef65f4c3578a..a0f449f77ed7 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -138,6 +138,27 @@ static void kmod_umh_threads_put(void)
 }
 
 /**
+ * get_kmod_umh_limit - get concurrent modprobe thread limit
+ *
+ * Returns the number of allowed concurrent modprobe calls.
+ */
+unsigned int get_kmod_umh_limit(void)
+{
+	return max_modprobes;
+}
+EXPORT_SYMBOL_GPL(get_kmod_umh_limit);
+
+/**
+ * get_kmod_umh_count - get number of concurrent modprobe calls running
+ *
+ * Returns the number of concurrent modprobe calls currently running.
+ */
+int get_kmod_umh_count(void)
+{
+	return atomic_read(&kmod_concurrent);
+}
+
+/**
  * __request_module - try to load a kernel module
  * @wait: wait (or not) for the operation to complete
  * @fmt: printf style format string for the name of the module
@@ -196,6 +217,11 @@ int __request_module(bool wait, const char *fmt, ...)
 }
 EXPORT_SYMBOL(__request_module);
 
+static void __set_max_modprobes(unsigned int suggested)
+{
+	max_modprobes = min((unsigned int) max_threads/2, suggested);
+}
+
 /*
  * If modprobe needs a service that is in a module, we get a recursive
  * loop.  Limit the number of running kmod threads to max_threads/2 or
@@ -212,12 +238,65 @@ EXPORT_SYMBOL(__request_module);
  * 4096 concurrent modprobe instances:
  *
  *	kmod.max_modprobes=4096
+ *
+ * You can also set the limit via sysctl:
+ *
+ * echo 4096 > /proc/sys/kernel/kmod-limit
+ *
+ * You can also set the query the current thread count:
+ *
+ * cat /proc/sys/kernel/kmod-count
+ *
+ * These knobs should enable userspace to more gracefully and
+ * deterministically handle module loading.
  */
 void __init init_kmod_umh(void)
 {
 	if (!max_modprobes)
-		max_modprobes = min(max_threads/2,
-				    2 << CONFIG_MAX_KMOD_CONCURRENT);
+		__set_max_modprobes(2 << CONFIG_MAX_KMOD_CONCURRENT);
+}
+
+int sysctl_kmod_count(struct ctl_table *table, int write,
+		      void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	struct ctl_table t;
+	int ret = 0;
+	int count = get_kmod_umh_count();
+
+	t = *table;
+	t.data = &count;
+
+	if (write)
+		return -EPERM;
+
+	ret = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
+
+	return ret;
+}
+
+int sysctl_kmod_limit(struct ctl_table *table, int write,
+		      void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	struct ctl_table t;
+	int ret;
+	unsigned int local_max_modprobes = max_modprobes;
+	unsigned int min = 0;
+	unsigned int max = max_threads/2;
+
+	t = *table;
+	t.data = &local_max_modprobes;
+	t.extra1 = &min;
+	t.extra2 = &max;
+
+	ret = proc_douintvec_minmax(&t, write, buffer, lenp, ppos);
+	if (ret == -ERANGE)
+		pr_err("modprobe thread valid range: %u - %u\n", min, max);
+	if (ret || !write)
+		return ret;
+
+	__set_max_modprobes((unsigned int) local_max_modprobes);
+
+	return 0;
 }
 
 #endif /* CONFIG_MODULES */
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 06711e648fa3..0ba56001e49b 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -660,6 +660,20 @@ static struct ctl_table kern_table[] = {
 		.extra1		= &one,
 		.extra2		= &one,
 	},
+	{
+		.procname	= "kmod-count",
+		.data		= NULL, /* filled in by handler */
+		.maxlen		= sizeof(int),
+		.mode		= 0444,
+		.proc_handler	= sysctl_kmod_count,
+	},
+	{
+		.procname	= "kmod-limit",
+		.data		= NULL, /* filled in by handler */
+		.maxlen		= sizeof(unsigned int),
+		.mode		= 0644,
+		.proc_handler	= sysctl_kmod_limit,
+	},
 #endif
 #ifdef CONFIG_UEVENT_HELPER
 	{
-- 
2.10.1

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [RFC 10/10] kmod: add a sanity check on module loading
  2016-12-08 18:47 [RFC 00/10] kmod: stress test driver, few fixes and enhancements Luis R. Rodriguez
                   ` (8 preceding siblings ...)
  2016-12-08 19:49 ` [RFC 09/10] kmod: add helpers for getting kmod count and limit Luis R. Rodriguez
@ 2016-12-08 19:49 ` Luis R. Rodriguez
  2016-12-09 20:03   ` Martin Wilck
                     ` (2 more replies)
  2017-01-11 19:10 ` [RFC 00/10] kmod: stress test driver, few fixes and enhancements Luis R. Rodriguez
  10 siblings, 3 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-08 19:49 UTC (permalink / raw)
  To: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel, Luis R. Rodriguez

kmod has an optimization in place whereby if a some kernel code
uses request_module() on a module already loaded we never bother
userspace as the module already is loaded. This is not true for
get_fs_type() though as it uses aliases.

Additionally kmod <= v19 was broken -- it returns 0 to modprobe calls,
assuming the kernel module is built-in, where really we have a race as
the module starts forming. kmod <= v19 has incorrect userspace heuristics,
a userspace kmod fix is available for it:

http://git.kernel.org/cgit/utils/kernel/kmod/kmod.git/commit/libkmod/libkmod-module.c?id=fd44a98ae2eb5eb32161088954ab21e58e19dfc4

This changes kmod to address both:

 o Provides the alias optimization for get_fs_type() so modules already
   loaded do not get re-requested.

 o Provides a sanity test to verify modprobe's work

This is important given how any get_fs_type() users assert success
means we're ready to go, and tests with the new test_kmod stress driver
reveal that request_module() and get_fs_type() might fail for a few
other reasons. You don't need old kmod to fail on request_module() or
get_fs_type(), with the right system setup, these calls *can* fail
today.

Although this does get us in the business of keeping alias maps in
kernel, the the work to support and maintain this is trivial.
Aditionally, since it may be important get_fs_type() should not fail on
certain systems, this tightens things up a bit more.

The TL;DR:

kmod <= v19 will return 0 on modprobe calls if you are built-in,
however its heuristics for checking if you are built-in were broken.

It assumed that having the directory /sys/module/module-name
but not having the file /sys/module/module-name/initstate
is sufficient to assume a module is built-in.

The kernel loads the inittstate attribute *after* it creates the
directory. This is an issue when modprobe returns 0 for kernel calls
which assumes a return of 0 on request_module() can give you the
right to assert the module is loaded and live.

We cannot trust returns of modprobe as 0 in the kernel, we need to
verify that modules are live if modprobe return 0 but only if modules
*are* modules. The kernel heuristic we use to determine if a module is
built-in is that if modprobe returns 0 we know we must be built-in or
a module, but if we are a module clearly we must have a lingering kmod
dangling on our linked list. If there is no modules there we are *somewhat*
certain the module must be built in.

This is not enough though... we cannot easily work around this since the
kernel can use aliases to userspace for modules calls. For instance
fs/namespace.c uses fs-modulename for filesystesms on get_fs_type(), so
these need to be taken into consideration as well.

Using kmod <= 19 will give you a NULL get_fs_type() return even though
the module was loaded... That is a corner case, there are other failures
for request_module() though -- the other failures are not easy to
reproduce though but fortunately we have a stress test driver to help
with that now. Use the following tests:

 # tools/testing/selftests/kmod/kmod.sh -t 0008
 # tools/testing/selftests/kmod/kmod.sh -t 0009

You can more easily see this error if you have kmod <= v19 installed.

You will need to install kmod <= v19, be sure to install its modprobe
into /sbin/ as by default the 'make install' target does not replace
your own.

This test helps cure test_kmod cases 0008 0009 so enable them.

Reported-by: Martin Wilck <martin.wilck@suse.com>
Reported-by: Randy Wright <rwright@hpe.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 kernel/kmod.c                        | 73 ++++++++++++++++++++++++++++++++++++
 kernel/module.c                      | 11 ++++--
 tools/testing/selftests/kmod/kmod.sh |  9 ++---
 3 files changed, 85 insertions(+), 8 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index a0f449f77ed7..6bf0feab41d1 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -61,6 +61,11 @@ static DECLARE_RWSEM(umhelper_sem);
 
 #ifdef CONFIG_MODULES
 
+bool finished_loading(const char *name);
+int module_wait_until_finished(const char *name);
+struct module *find_module_all(const char *name, size_t len,
+			       bool even_unformed);
+
 /*
 	modprobe_path is set via /proc/sys.
 */
@@ -158,6 +163,72 @@ int get_kmod_umh_count(void)
 	return atomic_read(&kmod_concurrent);
 }
 
+static bool kmod_exists(char *name)
+{
+	struct module *mod;
+
+	mutex_lock(&module_mutex);
+	mod = find_module_all(name, strlen(name), true);
+	mutex_unlock(&module_mutex);
+
+	if (mod)
+		return true;
+
+	return false;
+}
+
+/*
+ * The assumption is this must be a module, it could still not be live though
+ * since kmod <= 19 returns 0 even if it was not ready yet.  Allow for force
+ * wait check in case you are stuck on old userspace.
+ */
+static int wait_for_kmod(char *name)
+{
+	int ret = 0;
+
+	if (!finished_loading(name))
+		ret = module_wait_until_finished(name);
+
+	return ret;
+}
+
+/*
+ * kmod <= 19 will tell us modprobe returned 0 even if the module
+ * is not ready yet, it does this because it checks the /sys/module/mod-name
+ * directory and if its created but the /sys/module/mod-name/initstate is not
+ * created it assumes you have a built-in driver. At this point the module
+ * is still unformed, and telling the kernel at any point via request_module()
+ * will cause issues given a lot of places in the kernel assert that the driver
+ * will be present and ready. We need to account for this.
+ *
+ * If we had a module and even if buggy modprobe returned 0, we know we'd at
+ * least have a dangling kmod entry we could fetch.
+ *
+ * If modprobe returned 0 and we cannot find a kmod entry this is a good
+ * indicator your by userspace and kernel space that what you have is built-in.
+ *
+ * If modprobe returned 0 and we can find a kmod entry we should air on the
+ * side of caution and wait for the module to become ready or going.
+ *
+ * In the worst case, for built-in, we have to check on the module list for
+ * as many aliases possible the kernel gives the module, if that is n, that
+ * n traversals on the module list.
+ */
+static int finished_kmod_load(char *name)
+{
+	int ret = 0;
+	bool is_fs = (strlen(name) > 3) && (strncmp(name, "fs-", 3) == 0);
+
+	if (kmod_exists(name)) {
+		ret = wait_for_kmod(name);
+	} else {
+		if (is_fs && kmod_exists(name + 3))
+			ret = wait_for_kmod(name + 3);
+	}
+
+	return ret;
+}
+
 /**
  * __request_module - try to load a kernel module
  * @wait: wait (or not) for the operation to complete
@@ -211,6 +282,8 @@ int __request_module(bool wait, const char *fmt, ...)
 	trace_module_request(module_name, wait, _RET_IP_);
 
 	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
+	if (!ret)
+		ret = finished_kmod_load(module_name);
 
 	kmod_umh_threads_put();
 	return ret;
diff --git a/kernel/module.c b/kernel/module.c
index e420ed67e533..bf854321dca0 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -590,8 +590,8 @@ EXPORT_SYMBOL_GPL(find_symbol);
  * Search for module by name: must hold module_mutex (or preempt disabled
  * for read-only access).
  */
-static struct module *find_module_all(const char *name, size_t len,
-				      bool even_unformed)
+struct module *find_module_all(const char *name, size_t len,
+			       bool even_unformed)
 {
 	struct module *mod;
 
@@ -3325,7 +3325,7 @@ static int post_relocation(struct module *mod, const struct load_info *info)
 }
 
 /* Is this module of this name done loading?  No locks held. */
-static bool finished_loading(const char *name)
+bool finished_loading(const char *name)
 {
 	struct module *mod;
 	bool ret;
@@ -3486,6 +3486,11 @@ static int may_init_module(void)
 	return 0;
 }
 
+int module_wait_until_finished(const char *name)
+{
+	return wait_event_interruptible(module_wq, finished_loading(name));
+}
+
 /*
  * We try to place it in the list now to make sure it's unique before
  * we dedicate too many resources.  In particular, temporary percpu
diff --git a/tools/testing/selftests/kmod/kmod.sh b/tools/testing/selftests/kmod/kmod.sh
index 9ea1864d8bae..ccf35b8d1671 100755
--- a/tools/testing/selftests/kmod/kmod.sh
+++ b/tools/testing/selftests/kmod/kmod.sh
@@ -382,7 +382,7 @@ kmod_test_0008()
 	let EXTRA=$MODPROBE_LIMIT/2
 	config_num_thread_limit_extra $EXTRA
 	config_trigger ${FUNCNAME[0]}
-	config_expect_result ${FUNCNAME[0]} -EINVAL
+	config_expect_result ${FUNCNAME[0]} SUCCESS
 }
 
 kmod_test_0009()
@@ -392,7 +392,7 @@ kmod_test_0009()
 	#let EXTRA=$MODPROBE_LIMIT/3
 	config_num_thread_limit_extra 5
 	config_trigger ${FUNCNAME[0]}
-	config_expect_result ${FUNCNAME[0]} -EINVAL
+	config_expect_result ${FUNCNAME[0]} SUCCESS
 }
 
 trap "test_finish" EXIT
@@ -442,8 +442,7 @@ kmod_test_0004
 kmod_test_0005
 kmod_test_0006
 kmod_test_0007
-
-#kmod_test_0008
-#kmod_test_0009
+kmod_test_0008
+kmod_test_0009
 
 exit 0
-- 
2.10.1

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 01/10] kmod: add test driver to stress test the module loader
  2016-12-08 18:47 ` [RFC 01/10] kmod: add test driver to stress test the module loader Luis R. Rodriguez
@ 2016-12-08 20:24   ` Kees Cook
  2016-12-13 21:10     ` Luis R. Rodriguez
  0 siblings, 1 reply; 65+ messages in thread
From: Kees Cook @ 2016-12-08 20:24 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, Jessica Yu, Rusty Russell, Eric W. Biederman,
	Dmitry Torokhov, Arnaldo Carvalho de Melo, Jonathan Corbet,
	martin.wilck, Michal Marek, Petr Mladek, hare, rwright,
	Jeff Mahoney, DSterba, fdmanana, neilb, rgoldwyn, subashab,
	Heinrich Schuchardt, Aaron Tomlin, mbenes, Paul E. McKenney,
	Dan Williams, Josh Poimboeuf, David S. Miller, Ingo Molnar,
	Andrew Morton, Linus Torvalds, linux-kselftest, linux-doc, LKML

On Thu, Dec 8, 2016 at 10:47 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> This adds a new stress test driver for kmod: the kernel module loader.
> The new stress test driver, test_kmod, is only enabled as a module right
> now. It should be possible to load this as built-in and load tests early
> (refer to the force_init_test module parameter), however since a lot of
> test can get a system out of memory fast we leave this disabled for now.
>
> Using a system with 1024 MiB of RAM can *easily* get your kernel
> OOM fast with this test driver.
>
> The test_kmod driver exposes API knobs for us to fine tune simple
> request_module() and get_fs_type() calls. Since these API calls
> only allow each one parameter a test driver for these is rather
> simple. Other factors that can help out test driver though are
> the number of calls we issue and knowing current limitations of
> each. This exposes configuration as much as possible through
> userspace to be able to build tests directly from userspace.
>
> Since it allows multiple misc devices its will eventually (once we
> add a knob to let us create new devices at will) also be possible to
> perform more tests in parallel, provided you have enough memory.
>
> We only enable tests we know work as of right now.
>
> Demo screenshots:
>
>  # tools/testing/selftests/kmod/kmod.sh
> kmod_test_0001_driver: OK! - loading kmod test
> kmod_test_0001_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
> kmod_test_0001_fs: OK! - loading kmod test
> kmod_test_0001_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
> kmod_test_0002_driver: OK! - loading kmod test
> kmod_test_0002_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
> kmod_test_0002_fs: OK! - loading kmod test
> kmod_test_0002_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
> kmod_test_0003: OK! - loading kmod test
> kmod_test_0003: OK! - Return value: 0 (SUCCESS), expected SUCCESS
> kmod_test_0004: OK! - loading kmod test
> kmod_test_0004: OK! - Return value: 0 (SUCCESS), expected SUCCESS
> kmod_test_0005: OK! - loading kmod test
> kmod_test_0005: OK! - Return value: 0 (SUCCESS), expected SUCCESS
> kmod_test_0006: OK! - loading kmod test
> kmod_test_0006: OK! - Return value: 0 (SUCCESS), expected SUCCESS
> kmod_test_0005: OK! - loading kmod test
> kmod_test_0005: OK! - Return value: 0 (SUCCESS), expected SUCCESS
> kmod_test_0006: OK! - loading kmod test
> kmod_test_0006: OK! - Return value: 0 (SUCCESS), expected SUCCESS
> Test completed
>
> You can also request for specific tests:
>
>  # tools/testing/selftests/kmod/kmod.sh -t 0001
> kmod_test_0001_driver: OK! - loading kmod test
> kmod_test_0001_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
> kmod_test_0001_fs: OK! - loading kmod test
> kmod_test_0001_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
> Test completed
>
> Lastly, the current available number of tests:
>
>  # tools/testing/selftests/kmod/kmod.sh --help
> Usage: tools/testing/selftests/kmod/kmod.sh [ -t <4-number-digit> ]
> Valid tests: 0001-0009
>
> 0001 - Simple test - 1 thread  for empty string
> 0002 - Simple test - 1 thread  for modules/filesystems that do not exist
> 0003 - Simple test - 1 thread  for get_fs_type() only
> 0004 - Simple test - 2 threads for get_fs_type() only
> 0005 - multithreaded tests with default setup - request_module() only
> 0006 - multithreaded tests with default setup - get_fs_type() only
> 0007 - multithreaded tests with default setup test request_module() and get_fs_type()
> 0008 - multithreaded - push kmod_concurrent over max_modprobes for request_module()
> 0009 - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()
>
> The following test cases currently fail, as such they are not currently
> enabled by default:
>
>  # tools/testing/selftests/kmod/kmod.sh -t 0007
>  # tools/testing/selftests/kmod/kmod.sh -t 0008
>  # tools/testing/selftests/kmod/kmod.sh -t 0009
>  # tools/testing/selftests/kmod/kmod.sh -t 0010
>  # tools/testing/selftests/kmod/kmod.sh -t 0011
>
> To be sure to run them as intended please unload both of the modules:
>
>   o test_module
>   o xfs
>
> And ensure they are not loaded on your system prior to testing them.
> If you use these paritions for your rootfs you can change the default
> test driver used for get_fs_type() by exporting it into your
> environment. For example of other test defaults you can override
> refer to kmod.sh allow_user_defaults().
>
> Behind the scenes this is how we fine tune at a test case prior to
> hitting a trigger to run it:
>
> cat /sys/devices/virtual/misc/test_kmod0/config
> echo -n "2" > /sys/devices/virtual/misc/test_kmod0/config_test_case
> echo -n "ext4" > /sys/devices/virtual/misc/test_kmod0/config_test_fs
> echo -n "80" > /sys/devices/virtual/misc/test_kmod0/config_num_threads
> cat /sys/devices/virtual/misc/test_kmod0/config
> echo -n "1" > /sys/devices/virtual/misc/test_kmod0/config_num_threads
>
> Finally to trigger:
>
> echo -n "1" > /sys/devices/virtual/misc/test_kmod0/trigger_config
>
> The kmod.sh script uses the above constructs to build differnt test cases.

Typo: different

> A bit of interpretation of the current failures follows, first two
> premises:
>
> a) When request_module() is used userspace figures out an optimized version of
> module order for us. Once it finds the modules it needs, as per depmod
> symbol dep map, it will finit_module() the respective modules which
> are needed for the original request_module() request.
>
> b) We have an optimization in place whereby if a kernel uses
> request_module() on a module already loaded we never bother
> userspace as the module already is loaded. This is all handled by
> kernel/kmod.c.
>
> A few things to consider to help identify root causes of issues:
>
> 0) kmod 19 has a broken heuristic for modules being assumed to be
> built-in to your kernel and will return 0 even though request_module()
> failed. Upgrade to a newer version of kmod.
>
> 1) A get_fs_type() call for "xfs" will request_module() for
> "fs-xfs", not for "xfs". The optimization in kernel described in b)
> fails to catch if we have a lot of consecutive get_fs_type() calls.
> The reason is the optimization in place does not look for aliases. This
> means two consecutive get_fs_type() calls will bump kmod_concurrent, whereas
> request_module() will not.
>
> This one explanation why test case 0009 fails at least once for
> get_fs_type().
>
> 2) If a module fails to load --- for whatever reason (kmod_concurrent
> limit reached, file not yet present due to rootfs switch, out of memory)
> we have a period of time during which module request for the same name
> either with request_module() or get_fs_type() will *also* fail to load
> even if the file for the module is ready.
>
> This explains why *multiple* NULLs are possible on test 0009.
>
> 3) finit_module() consumes quite a bit of memory.

Is this due to reading the module into kernel memory or something else?

> 4) Filesystems typically also have more dependent modules than other
> modules, its important to note though that even though a get_fs_type() call
> does not incur additional kmod_concurrent bumps, since userspace
> loads dependencies it finds it needs via finit_module_fd(), it *will*
> take much more memory to load a module with a lot of dependencies.
>
> Because of 3) and 4) we will easily run into out of memory failures
> with certain tests. For instance test 0006 fails on qemu with 1024 MiB
> of RAM. It panics a box after reaping all userspace processes and still
> not having enough memory to reap.

Are the buffers not released until after all the dependent modules are
loaded? I thought it would load one by one?

> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>

This is a great selftest, thanks for working on it!

Notes below...

> ---
>  lib/Kconfig.debug                     |   25 +
>  lib/Makefile                          |    1 +
>  lib/test_kmod.c                       | 1248 +++++++++++++++++++++++++++++++++
>  tools/testing/selftests/kmod/Makefile |   11 +
>  tools/testing/selftests/kmod/config   |    7 +
>  tools/testing/selftests/kmod/kmod.sh  |  449 ++++++++++++
>  6 files changed, 1741 insertions(+)
>  create mode 100644 lib/test_kmod.c
>  create mode 100644 tools/testing/selftests/kmod/Makefile
>  create mode 100644 tools/testing/selftests/kmod/config
>  create mode 100755 tools/testing/selftests/kmod/kmod.sh
>
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 7446097f72bd..6cad548e0682 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1994,6 +1994,31 @@ config BUG_ON_DATA_CORRUPTION
>
>           If unsure, say N.
>
> +config TEST_KMOD
> +       tristate "kmod stress tester"
> +       default n
> +       depends on m
> +       select TEST_LKM
> +       select XFS_FS
> +       select TUN
> +       select BTRFS_FS

Since the desired FS can be changed at runtime, maybe these selects
aren't needed?

> +       help
> +         Test the kernel's module loading mechanism: kmod. kmod implements
> +         support to load modules using the Linux kernel's usermode helper.
> +         This test provides a series of tests against kmod.
> +
> +         Although technically you can either build test_kmod as a module or
> +         into the kernel we disallow building it into the kernel since
> +         it stress tests request_module() and this will very likely cause
> +         some issues by taking over precious threads available from other
> +         module load requests, ultimately this could be fatal.
> +
> +         To run tests run:
> +
> +         tools/testing/selftests/kmod/kmod.sh --help
> +
> +         If unsure, say N.
> +
>  source "samples/Kconfig"
>
>  source "lib/Kconfig.kgdb"
> diff --git a/lib/Makefile b/lib/Makefile
> index d15e235f72ea..3c5a14821e16 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -55,6 +55,7 @@ obj-$(CONFIG_TEST_STATIC_KEYS) += test_static_key_base.o
>  obj-$(CONFIG_TEST_PRINTF) += test_printf.o
>  obj-$(CONFIG_TEST_BITMAP) += test_bitmap.o
>  obj-$(CONFIG_TEST_UUID) += test_uuid.o
> +obj-$(CONFIG_TEST_KMOD) += test_kmod.o
>
>  ifeq ($(CONFIG_DEBUG_KOBJECT),y)
>  CFLAGS_kobject.o += -DDEBUG
> diff --git a/lib/test_kmod.c b/lib/test_kmod.c
> new file mode 100644
> index 000000000000..63fded83b9b6
> --- /dev/null
> +++ b/lib/test_kmod.c
> @@ -0,0 +1,1248 @@
> +/*
> + * kmod stress test driver
> + *
> + * Copyright (C) 2016 Luis R. Rodriguez <mcgrof@kernel.org>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of copyleft-next (version 0.3.1 or later) as published
> + * at http://copyleft-next.org/.
> + */
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +/*
> + * This driver provides an interface to trigger and test the kernel's
> + * module loader through a series of configurations and a few triggers.
> + * To test this driver use the following script as root:
> + *
> + * tools/testing/selftests/kmod/kmod.sh --help
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/kmod.h>
> +#include <linux/printk.h>
> +#include <linux/kthread.h>
> +#include <linux/sched.h>
> +#include <linux/fs.h>
> +#include <linux/miscdevice.h>
> +#include <linux/vmalloc.h>
> +#include <linux/slab.h>
> +#include <linux/device.h>
> +
> +#define TEST_START_NUM_THREADS 50
> +#define TEST_START_DRIVER      "test_module"
> +#define TEST_START_TEST_FS     "xfs"
> +#define TEST_START_TEST_CASE   TEST_KMOD_DRIVER
> +
> +
> +static bool force_init_test = false;
> +module_param(force_init_test, bool_enable_only, 0644);
> +MODULE_PARM_DESC(force_init_test,
> +                "Force kicking a test immediatley after driver loads");

Typo: immediately

> +
> +/*
> + * For device allocation / registration
> + */
> +static DEFINE_MUTEX(reg_dev_mutex);
> +static LIST_HEAD(reg_test_devs);
> +
> +/*
> + * num_test_devs actually represents the *next* ID of the next
> + * device we will allow to create.
> + */
> +static int num_test_devs;
> +
> +/**
> + * enum kmod_test_case - linker table test case
> + *
> + * If you add a  test case, please be sure to review if you need to se
> + * @need_mod_put for your tests case.
> + *
> + * @TEST_KMOD_DRIVER: stress tests request_module()
> + * @TEST_KMOD_FS_TYPE: stress tests get_fs_type()
> + */
> +enum kmod_test_case {
> +       __TEST_KMOD_INVALID = 0,
> +
> +       TEST_KMOD_DRIVER,
> +       TEST_KMOD_FS_TYPE,
> +
> +       __TEST_KMOD_MAX,
> +};
> +
> +struct test_config {
> +       char *test_driver;
> +       char *test_fs;
> +       unsigned int num_threads;
> +       enum kmod_test_case test_case;
> +       int test_result;
> +};
> +
> +struct kmod_test_device;
> +
> +/**
> + * kmod_test_device_info - thread info
> + *
> + * @ret_sync: return value if request_module() is used, sync request for
> + *     @TEST_KMOD_DRIVER
> + * @fs_sync: return value of get_fs_type() for @TEST_KMOD_FS_TYPE
> + * @thread_idx: thread ID
> + * @test_dev: test device test is being performed under
> + * @need_mod_put: Some tests (get_fs_type() is one) requires putting the module
> + *     (module_put(fs_sync->owner)) when done, otherwise you will not be able
> + *     to unload the respective modules and re-test. We use this to keep
> + *     accounting of when we need this and to help out in case we need to
> + *     error out and deal with module_put() on error.
> + */
> +struct kmod_test_device_info {
> +       int ret_sync;
> +       struct file_system_type *fs_sync;
> +       struct task_struct *task_sync;
> +       unsigned int thread_idx;
> +       struct kmod_test_device *test_dev;
> +       bool need_mod_put;
> +};
> +
> +/**
> + * kmod_test_device - test device to help test kmod
> + *
> + * @dev_idx: unique ID for test device
> + * @config: configuration for the test
> + * @misc_dev: we use a misc device under the hood
> + * @dev: pointer to misc_dev's own struct device
> + * @config_mutex: protects configuration of test
> + * @trigger_mutex: the test trigger can only be fired once at a time
> + * @thread_lock: protects @done count, and the @info per each thread
> + * @done: number of threads which have completed or failed
> + * @test_is_oom: when we run out of memory, use this to halt moving forward
> + * @kthreads_done: completion used to signal when all work is done
> + * @list: needed to be part of the reg_test_devs
> + * @info: array of info for each thread
> + */
> +struct kmod_test_device {
> +       int dev_idx;
> +       struct test_config config;
> +       struct miscdevice misc_dev;
> +       struct device *dev;
> +       struct mutex config_mutex;
> +       struct mutex trigger_mutex;
> +       struct mutex thread_mutex;
> +
> +       unsigned int done;
> +
> +       bool test_is_oom;
> +       struct completion kthreads_done;
> +       struct list_head list;
> +
> +       struct kmod_test_device_info *info;
> +};
> +
> +static const char *test_case_str(enum kmod_test_case test_case)
> +{
> +       switch (test_case) {
> +       case TEST_KMOD_DRIVER:
> +               return "TEST_KMOD_DRIVER";
> +       case TEST_KMOD_FS_TYPE:
> +               return "TEST_KMOD_FS_TYPE";
> +       default:
> +               return "invalid";
> +       }
> +}
> +
> +static struct miscdevice *dev_to_misc_dev(struct device *dev)
> +{
> +       return dev_get_drvdata(dev);
> +}
> +
> +static struct kmod_test_device *misc_dev_to_test_dev(struct miscdevice *misc_dev)
> +{
> +       return container_of(misc_dev, struct kmod_test_device, misc_dev);
> +}
> +
> +static struct kmod_test_device *dev_to_test_dev(struct device *dev)
> +{
> +       struct miscdevice *misc_dev;
> +
> +       misc_dev = dev_to_misc_dev(dev);
> +
> +       return misc_dev_to_test_dev(misc_dev);
> +}
> +
> +/* Must run with thread_mutex held */
> +static void kmod_test_done_check(struct kmod_test_device *test_dev,
> +                                unsigned int idx)
> +{
> +       struct test_config *config = &test_dev->config;
> +
> +       test_dev->done++;
> +       dev_dbg(test_dev->dev, "Done thread count: %u\n", test_dev->done);
> +
> +       if (test_dev->done == config->num_threads) {
> +               dev_info(test_dev->dev, "Done: %u threads have all run now\n",
> +                        test_dev->done);
> +               dev_info(test_dev->dev, "Last thread to run: %u\n", idx);
> +               complete(&test_dev->kthreads_done);
> +       }
> +}
> +
> +static void test_kmod_put_module(struct kmod_test_device_info *info)
> +{
> +       struct kmod_test_device *test_dev = info->test_dev;
> +       struct test_config *config = &test_dev->config;
> +
> +       if (!info->need_mod_put)
> +               return;
> +
> +       switch (config->test_case) {
> +       case TEST_KMOD_DRIVER:
> +               break;
> +       case TEST_KMOD_FS_TYPE:
> +               if (info && info->fs_sync && info->fs_sync->owner)
> +                       module_put(info->fs_sync->owner);
> +               break;
> +       default:
> +               BUG();
> +       }
> +
> +       info->need_mod_put = true;
> +}
> +
> +static int run_request(void *data)
> +{
> +       struct kmod_test_device_info *info = data;
> +       struct kmod_test_device *test_dev = info->test_dev;
> +       struct test_config *config = &test_dev->config;
> +
> +       switch (config->test_case) {
> +       case TEST_KMOD_DRIVER:
> +               info->ret_sync = request_module("%s", config->test_driver);
> +               break;
> +       case TEST_KMOD_FS_TYPE:
> +               info->fs_sync = get_fs_type(config->test_fs);
> +               info->need_mod_put = true;
> +               break;
> +       default:
> +               /* __trigger_config_run() already checked for test sanity */
> +               BUG();
> +               return -EINVAL;
> +       }
> +
> +       dev_dbg(test_dev->dev, "Ran thread %u\n", info->thread_idx);
> +
> +       test_kmod_put_module(info);
> +
> +       mutex_lock(&test_dev->thread_mutex);
> +       info->task_sync = NULL;
> +       kmod_test_done_check(test_dev, info->thread_idx);
> +       mutex_unlock(&test_dev->thread_mutex);
> +
> +       return 0;
> +}
> +
> +static int tally_work_test(struct kmod_test_device_info *info)
> +{
> +       struct kmod_test_device *test_dev = info->test_dev;
> +       struct test_config *config = &test_dev->config;
> +       int err_ret = 0;
> +
> +       switch (config->test_case) {
> +       case TEST_KMOD_DRIVER:
> +               /*
> +                * Only capture errors, if one is found that's
> +                * enough, for now.
> +                */
> +               if (info->ret_sync != 0)
> +                       err_ret = info->ret_sync;
> +               dev_info(test_dev->dev,
> +                        "Sync thread %d return status: %d\n",
> +                        info->thread_idx, info->ret_sync);
> +               break;
> +       case TEST_KMOD_FS_TYPE:
> +               /* For now we make this simple */
> +               if (!info->fs_sync)
> +                       err_ret = -EINVAL;
> +               dev_info(test_dev->dev, "Sync thread %u fs: %s\n",
> +                        info->thread_idx, info->fs_sync ? config->test_fs :
> +                        "NULL");
> +               break;
> +       default:
> +               BUG();
> +       }
> +
> +       return err_ret;
> +}
> +
> +/*
> + * XXX: add result option to display if all errors did not match.
> + * For now we just keep any error code if one was found.
> + *
> + * If this ran it means *all* tasks were created fine and we
> + * are now just collecting results.
> + *
> + * Only propagate errors, do not override with a subsequent sucess case.
> + */
> +static void tally_up_work(struct kmod_test_device *test_dev)
> +{
> +       struct test_config *config = &test_dev->config;
> +       struct kmod_test_device_info *info;
> +       unsigned int idx;
> +       int err_ret = 0;
> +       int ret = 0;
> +
> +       mutex_lock(&test_dev->thread_mutex);
> +
> +       dev_info(test_dev->dev, "Results:\n");
> +
> +       for (idx=0; idx < config->num_threads; idx++) {
> +               info = &test_dev->info[idx];
> +               ret = tally_work_test(info);
> +               if (ret)
> +                       err_ret = ret;
> +       }
> +
> +       /*
> +        * Note: request_module() returns 256 for a module not found even
> +        * though modprobe itself returns 1.
> +        */
> +       config->test_result = err_ret;
> +
> +       mutex_unlock(&test_dev->thread_mutex);
> +}
> +
> +static int try_one_request(struct kmod_test_device *test_dev, unsigned int idx)
> +{
> +       struct kmod_test_device_info *info = &test_dev->info[idx];
> +       int fail_ret = -ENOMEM;
> +
> +       mutex_lock(&test_dev->thread_mutex);
> +
> +       info->thread_idx = idx;
> +       info->test_dev = test_dev;
> +       info->task_sync = kthread_run(run_request, info, "%s-%u",
> +                                     KBUILD_MODNAME, idx);
> +
> +       if (!info->task_sync || IS_ERR(info->task_sync)) {
> +               test_dev->test_is_oom = true;
> +               dev_err(test_dev->dev, "Setting up thread %u failed\n", idx);
> +               info->task_sync = NULL;
> +               goto err_out;
> +       } else
> +               dev_dbg(test_dev->dev, "Kicked off thread %u\n", idx);
> +
> +       mutex_unlock(&test_dev->thread_mutex);
> +
> +       return 0;
> +
> +err_out:
> +       info->ret_sync = fail_ret;
> +       mutex_unlock(&test_dev->thread_mutex);
> +
> +       return fail_ret;
> +}
> +
> +static void test_dev_kmod_stop_tests(struct kmod_test_device *test_dev)
> +{
> +       struct test_config *config = &test_dev->config;
> +       struct kmod_test_device_info *info;
> +       unsigned int i;
> +
> +       dev_info(test_dev->dev, "Ending request_module() tests\n");
> +
> +       mutex_lock(&test_dev->thread_mutex);
> +
> +       for (i=0; i < config->num_threads; i++) {
> +               info = &test_dev->info[i];
> +               if (info->task_sync && !IS_ERR(info->task_sync)) {
> +                       dev_info(test_dev->dev,
> +                                "Stopping still-running thread %i\n", i);
> +                       kthread_stop(info->task_sync);
> +               }
> +
> +               /*
> +                * info->task_sync is well protected, it can only be
> +                * NULL or a pointer to a struct. If its NULL we either
> +                * never ran, or we did and we completed the work. Completed
> +                * tasks *always* put the module for us. This is a sanity
> +                * check -- just in case.
> +                */
> +               if (info->task_sync && info->need_mod_put)
> +                       test_kmod_put_module(info);
> +       }
> +
> +       mutex_unlock(&test_dev->thread_mutex);
> +}
> +
> +/*
> + * Only wait *iff* we did not run into any errors during all of our thread
> + * set up. If run into any issues we stop threads and just bail out with
> + * an error to the trigger. This also means we don't need any tally work
> + * for any threads which fail.
> + */
> +static int try_requests(struct kmod_test_device *test_dev)
> +{
> +       struct test_config *config = &test_dev->config;
> +       unsigned int idx;
> +       int ret;
> +       bool any_error = false;
> +
> +       for (idx=0; idx < config->num_threads; idx++) {
> +               if (test_dev->test_is_oom) {
> +                       any_error = true;
> +                       break;
> +               }
> +
> +               ret = try_one_request(test_dev, idx);
> +               if (ret) {
> +                       any_error = true;
> +                       break;
> +               }
> +       }
> +
> +       if (!any_error) {
> +               test_dev->test_is_oom = false;
> +               dev_info(test_dev->dev,
> +                        "No errors were found while initializing threads\n");
> +               wait_for_completion(&test_dev->kthreads_done);
> +               tally_up_work(test_dev);
> +       } else {
> +               test_dev->test_is_oom = true;
> +               dev_info(test_dev->dev,
> +                        "At least one thread failed to start, stop all work\n");
> +               test_dev_kmod_stop_tests(test_dev);
> +               return -ENOMEM;
> +       }
> +
> +       return 0;
> +}
> +
> +static int run_test_driver(struct kmod_test_device *test_dev)
> +{
> +       struct test_config *config = &test_dev->config;
> +
> +       dev_info(test_dev->dev, "Test case: %s (%u)\n",
> +                test_case_str(config->test_case),
> +                config->test_case);
> +       dev_info(test_dev->dev, "Test driver to load: %s\n",
> +                config->test_driver);
> +       dev_info(test_dev->dev, "Number of threads to run: %u\n",
> +                config->num_threads);
> +       dev_info(test_dev->dev, "Thread IDs will range from 0 - %u\n",
> +                config->num_threads - 1);
> +
> +       return try_requests(test_dev);
> +}
> +
> +static int run_test_fs_type(struct kmod_test_device *test_dev)
> +{
> +       struct test_config *config = &test_dev->config;
> +
> +       dev_info(test_dev->dev, "Test case: %s (%u)\n",
> +                test_case_str(config->test_case),
> +                config->test_case);
> +       dev_info(test_dev->dev, "Test filesystem to load: %s\n",
> +                config->test_fs);
> +       dev_info(test_dev->dev, "Number of threads to run: %u\n",
> +                config->num_threads);
> +       dev_info(test_dev->dev, "Thread IDs will range from 0 - %u\n",
> +                config->num_threads - 1);
> +
> +       return try_requests(test_dev);
> +}
> +
> +static ssize_t config_show(struct device *dev,
> +                          struct device_attribute *attr,
> +                          char *buf)
> +{
> +       struct kmod_test_device *test_dev = dev_to_test_dev(dev);
> +       struct test_config *config = &test_dev->config;
> +       int len = 0;
> +
> +       mutex_lock(&test_dev->config_mutex);
> +
> +       len += sprintf(buf, "Custom trigger configuration for: %s\n",
> +                      dev_name(dev));
> +
> +       len += sprintf(buf+len, "Number of threads:\t%u\n",
> +                      config->num_threads);
> +
> +       len += sprintf(buf+len, "Test_case:\t%s (%u)\n",
> +                      test_case_str(config->test_case),
> +                      config->test_case);
> +
> +       if (config->test_driver)
> +               len += sprintf(buf+len, "driver:\t%s\n",
> +                              config->test_driver);
> +       else
> +               len += sprintf(buf+len, "driver:\tEMTPY\n");
> +
> +       if (config->test_fs)
> +               len += sprintf(buf+len, "fs:\t%s\n",
> +                              config->test_fs);
> +       else
> +               len += sprintf(buf+len, "fs:\tEMTPY\n");

These should all use snprintf...

> +
> +
> +       mutex_unlock(&test_dev->config_mutex);
> +
> +       return len;
> +}
> +static DEVICE_ATTR_RO(config);
> +
> +/*
> + * This ensures we don't allow kicking threads through if our configuration
> + * is faulty.
> + */
> +static int __trigger_config_run(struct kmod_test_device *test_dev)
> +{
> +       struct test_config *config = &test_dev->config;
> +
> +       test_dev->done = 0;
> +
> +       switch (config->test_case) {
> +       case TEST_KMOD_DRIVER:
> +               return run_test_driver(test_dev);
> +       case TEST_KMOD_FS_TYPE:
> +               return run_test_fs_type(test_dev);
> +       default:
> +               dev_warn(test_dev->dev,
> +                        "Invalid test case requested: %u\n",
> +                        config->test_case);
> +               return -EINVAL;
> +       }
> +}
> +
> +static int trigger_config_run(struct kmod_test_device *test_dev)
> +{
> +       struct test_config *config = &test_dev->config;
> +       int ret;
> +
> +       mutex_lock(&test_dev->trigger_mutex);
> +       mutex_lock(&test_dev->config_mutex);
> +
> +       ret = __trigger_config_run(test_dev);
> +       if (ret < 0)
> +               goto out;
> +       dev_info(test_dev->dev, "General test result: %d\n",
> +                config->test_result);
> +
> +       /*
> +        * We must return 0 after a trigger even unless something went
> +        * wrong with the setup of the test. If the test setup went fine
> +        * then userspace must just check the result of config->test_result.
> +        * One issue with relying on the return from a call in the kernel
> +        * is if the kernel returns a possitive value using this trigger
> +        * will not return the value to userspace, it would be lost.
> +        *
> +        * By not relying on capturing the return value of tests we are using
> +        * through the trigger it also us to run tests with set -e and only
> +        * fail when something went wrong with the driver upon trigger
> +        * requests.
> +        */
> +       ret = 0;
> +
> +out:
> +       mutex_unlock(&test_dev->config_mutex);
> +       mutex_unlock(&test_dev->trigger_mutex);
> +
> +       return ret;
> +}
> +
> +static ssize_t
> +trigger_config_store(struct device *dev,
> +                    struct device_attribute *attr,
> +                    const char *buf, size_t count)
> +{
> +       struct kmod_test_device *test_dev = dev_to_test_dev(dev);
> +       int ret;
> +
> +       if (test_dev->test_is_oom)
> +               return -ENOMEM;
> +
> +       /* For all intents and purposes we don't care what userspace
> +        * sent this trigger, we care only that we were triggered.
> +        * We treat the return value only for caputuring issues with
> +        * the test setup. At this point all the test variables should
> +        * have been allocated so typically this should never fail.
> +        */
> +       ret = trigger_config_run(test_dev);
> +       if (unlikely(ret < 0))
> +               goto out;
> +
> +       /*
> +        * Note: any return > 0 will be treated as success
> +        * and the error value will not be available to userspace.
> +        * Do not rely on trying to send to userspace a test value
> +        * return value as possitive return errors will be lost.
> +        */
> +       if (WARN_ON(ret > 0))
> +               return -EINVAL;
> +
> +       ret = count;
> +out:
> +       return ret;
> +}
> +static DEVICE_ATTR_WO(trigger_config);
> +
> +/*
> + * XXX: move to kstrncpy() once merged.
> + *
> + * Users should use kfree_const() when freeing these.
> + */
> +static int __kstrncpy(char **dst, const char *name, size_t count, gfp_t gfp)
> +{
> +       *dst = kstrndup(name, count, gfp);
> +       if (!*dst)
> +               return -ENOSPC;
> +       return count;
> +}
> +
> +static int config_copy_test_driver_name(struct test_config *config,
> +                                   const char *name,
> +                                   size_t count)
> +{
> +       return __kstrncpy(&config->test_driver, name, count, GFP_KERNEL);
> +}
> +
> +
> +static int config_copy_test_fs(struct test_config *config, const char *name,
> +                              size_t count)
> +{
> +       return __kstrncpy(&config->test_fs, name, count, GFP_KERNEL);
> +}
> +
> +static void __kmod_config_free(struct test_config *config)
> +{
> +       if (!config)
> +               return;
> +
> +       kfree_const(config->test_driver);
> +       config->test_driver = NULL;
> +
> +       kfree_const(config->test_fs);
> +       config->test_driver = NULL;
> +}
> +
> +static void kmod_config_free(struct kmod_test_device *test_dev)
> +{
> +       struct test_config *config;
> +
> +       if (!test_dev)
> +               return;
> +
> +       config = &test_dev->config;
> +
> +       mutex_lock(&test_dev->config_mutex);
> +       __kmod_config_free(config);
> +       mutex_unlock(&test_dev->config_mutex);
> +}
> +
> +static ssize_t config_test_driver_store(struct device *dev,
> +                                       struct device_attribute *attr,
> +                                       const char *buf, size_t count)
> +{
> +       struct kmod_test_device *test_dev = dev_to_test_dev(dev);
> +       struct test_config *config = &test_dev->config;
> +       int copied;
> +
> +       mutex_lock(&test_dev->config_mutex);
> +
> +       kfree_const(config->test_driver);
> +       config->test_driver = NULL;
> +
> +       copied = config_copy_test_driver_name(config, buf, count);
> +       mutex_unlock(&test_dev->config_mutex);
> +
> +       return copied;
> +}
> +
> +static ssize_t config_test_driver_show(struct device *dev,
> +                                       struct device_attribute *attr,
> +                                       char *buf)
> +{
> +       struct kmod_test_device *test_dev = dev_to_test_dev(dev);
> +       struct test_config *config = &test_dev->config;
> +
> +       mutex_lock(&test_dev->config_mutex);
> +       strcpy(buf, config->test_driver);
> +       strcat(buf, "\n");

IIUC, the show/store API uses a max size of PAGE_SIZE. If that's
correct, it's possible that this show routine could write past the end
of buf, due to the end newline, etc. Best to use snprintf like you do
below for the other shows.

> +       mutex_unlock(&test_dev->config_mutex);
> +
> +       return strlen(buf) + 1;
> +}
> +static DEVICE_ATTR(config_test_driver, 0644, config_test_driver_show,
> +                  config_test_driver_store);
> +
> +static ssize_t config_test_fs_store(struct device *dev,
> +                                   struct device_attribute *attr,
> +                                   const char *buf, size_t count)
> +{
> +       struct kmod_test_device *test_dev = dev_to_test_dev(dev);
> +       struct test_config *config = &test_dev->config;
> +       int copied;
> +
> +       mutex_lock(&test_dev->config_mutex);
> +
> +       kfree_const(config->test_fs);
> +       config->test_fs = NULL;
> +
> +       copied = config_copy_test_fs(config, buf, count);
> +       mutex_unlock(&test_dev->config_mutex);
> +
> +       return copied;
> +}
> +
> +static ssize_t config_test_fs_show(struct device *dev,
> +                                  struct device_attribute *attr,
> +                                  char *buf)
> +{
> +       struct kmod_test_device *test_dev = dev_to_test_dev(dev);
> +       struct test_config *config = &test_dev->config;
> +
> +       mutex_lock(&test_dev->config_mutex);
> +       strcpy(buf, config->test_fs);
> +       strcat(buf, "\n");
> +       mutex_unlock(&test_dev->config_mutex);

Same here... (which, btw, could likely use to be a helper function,
the show and store functions here are identical except for test_driver
vs test_fs).

> +
> +       return strlen(buf) + 1;
> +}
> +static DEVICE_ATTR(config_test_fs, 0644, config_test_fs_show,
> +                  config_test_fs_store);
> +
> +static int trigger_config_run_driver(struct kmod_test_device *test_dev,
> +                                    const char *test_driver)
> +{
> +       int copied;
> +       struct test_config *config = &test_dev->config;
> +
> +       mutex_lock(&test_dev->config_mutex);
> +
> +       config->test_case = TEST_KMOD_DRIVER;
> +
> +       kfree_const(config->test_driver);
> +       config->test_driver = NULL;
> +
> +       copied = config_copy_test_driver_name(config, test_driver,
> +                                             strlen(test_driver));
> +       mutex_unlock(&test_dev->config_mutex);
> +
> +       if (copied != strlen(test_driver)) {

Can't these copied tests just check < 0? (i.e. avoid the repeated
strlen which can be fragile.)

> +               test_dev->test_is_oom = true;
> +               return -EINVAL;
> +       }
> +
> +       test_dev->test_is_oom = false;
> +
> +       return trigger_config_run(test_dev);
> +}
> +
> +static int trigger_config_run_fs(struct kmod_test_device *test_dev,
> +                                const char *fs_type)
> +{
> +       int copied;
> +       struct test_config *config = &test_dev->config;
> +
> +       mutex_lock(&test_dev->config_mutex);
> +       config->test_case = TEST_KMOD_FS_TYPE;
> +
> +       kfree_const(config->test_fs);
> +       config->test_driver = NULL;
> +
> +       copied = config_copy_test_fs(config, fs_type, strlen(fs_type));
> +       mutex_unlock(&test_dev->config_mutex);
> +
> +       if (copied != strlen(fs_type)) {
> +               test_dev->test_is_oom = true;
> +               return -EINVAL;
> +       }
> +
> +       test_dev->test_is_oom = false;
> +
> +       return trigger_config_run(test_dev);
> +}

These two functions are almost identical too. Only test_case and the
copy function change...

> +
> +static void free_test_dev_info(struct kmod_test_device *test_dev)
> +{
> +       if (test_dev->info) {
> +               vfree(test_dev->info);
> +               test_dev->info = NULL;
> +       }
> +}

vfree() already checks for NULL, you can drop the if.

> +
> +static int kmod_config_sync_info(struct kmod_test_device *test_dev)
> +{
> +       struct test_config *config = &test_dev->config;
> +
> +       free_test_dev_info(test_dev);
> +       test_dev->info = vzalloc(config->num_threads *
> +                                sizeof(struct kmod_test_device_info));
> +       if (!test_dev->info) {
> +               dev_err(test_dev->dev, "Cannot alloc test_dev info\n");
> +               return -ENOMEM;
> +       }
> +
> +       return 0;
> +}
> +
> +/*
> + * Old kernels may not have this, if you want to port this code to
> + * test it on older kernels.
> + */
> +#ifdef get_kmod_umh_limit
> +static unsigned int kmod_init_test_thread_limit(void)
> +{
> +       return get_kmod_umh_limit();
> +}
> +#else
> +static unsigned int kmod_init_test_thread_limit(void)
> +{
> +       return TEST_START_NUM_THREADS;
> +}
> +#endif
> +
> +static int __kmod_config_init(struct kmod_test_device *test_dev)
> +{
> +       struct test_config *config = &test_dev->config;
> +       int ret = -ENOMEM, copied;
> +
> +       __kmod_config_free(config);
> +
> +       copied = config_copy_test_driver_name(config, TEST_START_DRIVER,
> +                                             strlen(TEST_START_DRIVER));
> +       if (copied != strlen(TEST_START_DRIVER))
> +               goto err_out;
> +
> +       copied = config_copy_test_fs(config, TEST_START_TEST_FS,
> +                                    strlen(TEST_START_TEST_FS));
> +       if (copied != strlen(TEST_START_TEST_FS))
> +               goto err_out;
> +
> +       config->num_threads = kmod_init_test_thread_limit();
> +       config->test_result = 0;
> +       config->test_case = TEST_START_TEST_CASE;
> +
> +       ret = kmod_config_sync_info(test_dev);
> +       if (ret)
> +               goto err_out;
> +
> +       test_dev->test_is_oom = false;
> +
> +       return 0;
> +
> +err_out:
> +       test_dev->test_is_oom = true;
> +       WARN_ON(test_dev->test_is_oom);
> +
> +       __kmod_config_free(config);
> +
> +       return ret;
> +}
> +
> +static ssize_t reset_store(struct device *dev,
> +                          struct device_attribute *attr,
> +                          const char *buf, size_t count)
> +{
> +       struct kmod_test_device *test_dev = dev_to_test_dev(dev);
> +       int ret;
> +
> +       mutex_lock(&test_dev->trigger_mutex);
> +       mutex_lock(&test_dev->config_mutex);
> +
> +       ret = __kmod_config_init(test_dev);
> +       if (ret < 0) {
> +               ret = -ENOMEM;
> +               dev_err(dev, "could not alloc settings for config trigger: %d\n",
> +                      ret);
> +               goto out;
> +       }
> +
> +       dev_info(dev, "reset\n");
> +       ret = count;
> +
> +out:
> +       mutex_unlock(&test_dev->config_mutex);
> +       mutex_unlock(&test_dev->trigger_mutex);
> +
> +       return ret;
> +}
> +static DEVICE_ATTR_WO(reset);
> +
> +static int test_dev_config_update_uint_sync(struct kmod_test_device *test_dev,
> +                                           const char *buf, size_t size,
> +                                           unsigned int *config,
> +                                           int (*test_sync)(struct kmod_test_device *test_dev))
> +{
> +       int ret;
> +       char *end;
> +       long new = simple_strtol(buf, &end, 0);
> +       unsigned int old_val;
> +       if (end == buf || new > UINT_MAX)
> +               return -EINVAL;
> +
> +       mutex_lock(&test_dev->config_mutex);
> +
> +       old_val = *config;
> +       *(unsigned int *)config = new;
> +
> +       ret = test_sync(test_dev);
> +       if (ret) {
> +               *(unsigned int *)config = old_val;
> +
> +               ret = test_sync(test_dev);
> +               WARN_ON(ret);
> +
> +               mutex_unlock(&test_dev->config_mutex);
> +               return -EINVAL;
> +       }
> +
> +       mutex_unlock(&test_dev->config_mutex);
> +       /* Always return full write size even if we didn't consume all */
> +       return size;
> +}
> +
> +static int test_dev_config_update_uint_range(struct kmod_test_device *test_dev,
> +                                            const char *buf, size_t size,
> +                                            unsigned int *config,
> +                                            unsigned int min,
> +                                            unsigned int max)
> +{
> +       char *end;
> +       long new = simple_strtol(buf, &end, 0);
> +       if (end == buf || new < min || new >  max || new > UINT_MAX)
> +               return -EINVAL;
> +
> +       mutex_lock(&test_dev->config_mutex);
> +       *(unsigned int *)config = new;

config is already an unsigned int *, why cast?

> +       mutex_unlock(&test_dev->config_mutex);
> +
> +       /* Always return full write size even if we didn't consume all */
> +       return size;
> +}
> +
> +static int test_dev_config_update_int(struct kmod_test_device *test_dev,
> +                                     const char *buf, size_t size,
> +                                     int *config)
> +{
> +       char *end;
> +       long new = simple_strtol(buf, &end, 0);
> +       if (end == buf || new > INT_MAX || new < INT_MIN)
> +               return -EINVAL;
> +       mutex_lock(&test_dev->config_mutex);
> +       *(int *)config = new;

config is already an int *, why cast?

> +       mutex_unlock(&test_dev->config_mutex);
> +       /* Always return full write size even if we didn't consume all */
> +       return size;
> +}
> +
> +static ssize_t test_dev_config_show_int(struct kmod_test_device *test_dev,
> +                                       char *buf,
> +                                       int config)
> +{
> +       int val;
> +
> +       mutex_lock(&test_dev->config_mutex);
> +       val = config;
> +       mutex_unlock(&test_dev->config_mutex);
> +
> +       return snprintf(buf, PAGE_SIZE, "%d\n", val);
> +}
> +
> +static ssize_t test_dev_config_show_uint(struct kmod_test_device *test_dev,
> +                                        char *buf,
> +                                        unsigned int config)
> +{
> +       unsigned int val;
> +
> +       mutex_lock(&test_dev->config_mutex);
> +       val = config;
> +       mutex_unlock(&test_dev->config_mutex);
> +
> +       return snprintf(buf, PAGE_SIZE, "%u\n", val);
> +}
> +
> +static ssize_t test_result_store(struct device *dev,
> +                                struct device_attribute *attr,
> +                                const char *buf, size_t count)
> +{
> +       struct kmod_test_device *test_dev = dev_to_test_dev(dev);
> +       struct test_config *config = &test_dev->config;
> +
> +       return test_dev_config_update_int(test_dev, buf, count,
> +                                         &config->test_result);
> +}
> +
> +static ssize_t config_num_threads_store(struct device *dev,
> +                                       struct device_attribute *attr,
> +                                       const char *buf, size_t count)
> +{
> +       struct kmod_test_device *test_dev = dev_to_test_dev(dev);
> +       struct test_config *config = &test_dev->config;
> +
> +       return test_dev_config_update_uint_sync(test_dev, buf, count,
> +                                               &config->num_threads,
> +                                               kmod_config_sync_info);
> +}
> +
> +static ssize_t config_num_threads_show(struct device *dev,
> +                                      struct device_attribute *attr,
> +                                      char *buf)
> +{
> +       struct kmod_test_device *test_dev = dev_to_test_dev(dev);
> +       struct test_config *config = &test_dev->config;
> +
> +       return test_dev_config_show_int(test_dev, buf, config->num_threads);
> +}
> +static DEVICE_ATTR(config_num_threads, 0644, config_num_threads_show,
> +                  config_num_threads_store);
> +
> +static ssize_t config_test_case_store(struct device *dev,
> +                                     struct device_attribute *attr,
> +                                     const char *buf, size_t count)
> +{
> +       struct kmod_test_device *test_dev = dev_to_test_dev(dev);
> +       struct test_config *config = &test_dev->config;
> +
> +       return test_dev_config_update_uint_range(test_dev, buf, count,
> +                                                &config->test_case,
> +                                                __TEST_KMOD_INVALID + 1,
> +                                                __TEST_KMOD_MAX - 1);
> +}
> +
> +static ssize_t config_test_case_show(struct device *dev,
> +                                    struct device_attribute *attr,
> +                                    char *buf)
> +{
> +       struct kmod_test_device *test_dev = dev_to_test_dev(dev);
> +       struct test_config *config = &test_dev->config;
> +
> +       return test_dev_config_show_uint(test_dev, buf, config->test_case);
> +}
> +static DEVICE_ATTR(config_test_case, 0644, config_test_case_show,
> +                  config_test_case_store);
> +
> +static ssize_t test_result_show(struct device *dev,
> +                               struct device_attribute *attr,
> +                               char *buf)
> +{
> +       struct kmod_test_device *test_dev = dev_to_test_dev(dev);
> +       struct test_config *config = &test_dev->config;
> +
> +       return test_dev_config_show_int(test_dev, buf, config->test_result);
> +}
> +static DEVICE_ATTR(test_result, 0644, test_result_show, test_result_store);
> +
> +#define TEST_KMOD_DEV_ATTR(name)               &dev_attr_##name.attr
> +
> +static struct attribute *test_dev_attrs[] = {
> +       TEST_KMOD_DEV_ATTR(trigger_config),
> +       TEST_KMOD_DEV_ATTR(config),
> +       TEST_KMOD_DEV_ATTR(reset),
> +
> +       TEST_KMOD_DEV_ATTR(config_test_driver),
> +       TEST_KMOD_DEV_ATTR(config_test_fs),
> +       TEST_KMOD_DEV_ATTR(config_num_threads),
> +       TEST_KMOD_DEV_ATTR(config_test_case),
> +       TEST_KMOD_DEV_ATTR(test_result),
> +
> +       NULL,
> +};
> +
> +ATTRIBUTE_GROUPS(test_dev);
> +
> +static int kmod_config_init(struct kmod_test_device *test_dev)
> +{
> +       int ret;
> +
> +       mutex_lock(&test_dev->config_mutex);
> +       ret = __kmod_config_init(test_dev);
> +       mutex_unlock(&test_dev->config_mutex);
> +
> +       return ret;
> +}
> +
> +/*
> + * XXX: this could perhaps be made generic already too, but a hunt
> + * for actual users would be needed first. It could be generic
> + * if other test drivers end up using a similar mechanism.
> + */
> +const char *test_dev_get_name(const char *base, int idx, gfp_t gfp)
> +{
> +       const char *name_const;
> +       char *name;
> +
> +       if (!base)
> +               return NULL;
> +       if (strlen(base) > 30)
> +               return NULL;

why?

> +       name = kzalloc(1024, gfp);
> +       if (!name)
> +               return NULL;
> +
> +       strncat(name, base, strlen(base));
> +       sprintf(name+(strlen(base)), "%d", idx);
> +       name_const = kstrdup_const(name, gfp);
> +
> +       kfree(name);
> +
> +       return name_const;
> +}

What is going on here? Why not just:
    return kasprintf(gfp, "%s%d", base, idx);

For all of that code? And kstrdup_const is pointless here since it'll
always just do the dup (as the kmalloc source isn't in rodata).

> +
> +static struct kmod_test_device *alloc_test_dev_kmod(int idx)
> +{
> +       int ret;
> +       struct kmod_test_device *test_dev;
> +       struct miscdevice *misc_dev;
> +
> +       test_dev = vzalloc(sizeof(struct kmod_test_device));
> +       if (!test_dev) {
> +               pr_err("Cannot alloc test_dev\n");
> +               goto err_out;
> +       }
> +
> +       mutex_init(&test_dev->config_mutex);
> +       mutex_init(&test_dev->trigger_mutex);
> +       mutex_init(&test_dev->thread_mutex);
> +
> +       init_completion(&test_dev->kthreads_done);
> +
> +       ret = kmod_config_init(test_dev);
> +       if (ret < 0) {
> +               pr_err("Cannot alloc kmod_config_init()\n");
> +               goto err_out_free;
> +       }
> +
> +       test_dev->dev_idx = idx;
> +       misc_dev = &test_dev->misc_dev;
> +
> +       misc_dev->minor = MISC_DYNAMIC_MINOR;
> +       misc_dev->name = test_dev_get_name("test_kmod", test_dev->dev_idx,
> +                                          GFP_KERNEL);
> +       if (!misc_dev->name) {
> +               pr_err("Cannot alloc misc_dev->name\n");
> +               goto err_out_free_config;
> +       }
> +       misc_dev->groups = test_dev_groups;
> +
> +       return test_dev;
> +
> +err_out_free_config:
> +       free_test_dev_info(test_dev);
> +       kmod_config_free(test_dev);
> +err_out_free:
> +       vfree(test_dev);
> +       test_dev = NULL;
> +err_out:
> +       return NULL;
> +}
> +
> +static void free_test_dev_kmod(struct kmod_test_device *test_dev)
> +{
> +       if (test_dev) {
> +               kfree_const(test_dev->misc_dev.name);
> +               test_dev->misc_dev.name = NULL;
> +               free_test_dev_info(test_dev);
> +               kmod_config_free(test_dev);
> +               vfree(test_dev);
> +               test_dev = NULL;
> +       }
> +}
> +
> +static struct kmod_test_device *register_test_dev_kmod(void)
> +{
> +       struct kmod_test_device *test_dev = NULL;
> +       int ret;
> +
> +       mutex_unlock(&reg_dev_mutex);
> +
> +       /* int should suffice for number of devices, test for wrap */
> +       if (unlikely(num_test_devs + 1) < 0) {
> +               pr_err("reached limit of number of test devices\n");
> +               goto out;
> +       }
> +
> +       test_dev = alloc_test_dev_kmod(num_test_devs);
> +       if (!test_dev)
> +               goto out;
> +
> +       ret = misc_register(&test_dev->misc_dev);
> +       if (ret) {
> +               pr_err("could not register misc device: %d\n", ret);
> +               free_test_dev_kmod(test_dev);
> +               goto out;
> +       }
> +
> +       test_dev->dev = test_dev->misc_dev.this_device;
> +       list_add_tail(&test_dev->list, &reg_test_devs);
> +       dev_info(test_dev->dev, "interface ready\n");
> +
> +       num_test_devs++;
> +
> +out:
> +       mutex_unlock(&reg_dev_mutex);
> +
> +       return test_dev;
> +
> +}
> +
> +static int __init test_kmod_init(void)
> +{
> +       struct kmod_test_device *test_dev;
> +       int ret;
> +
> +       test_dev = register_test_dev_kmod();
> +       if (!test_dev) {
> +               pr_err("Cannot add first test kmod device\n");
> +               return -ENODEV;
> +       }
> +
> +       /*
> +        * With some work we might be able to gracefully enable
> +        * testing with this driver built-in, for now this seems
> +        * rather risky. For those willing to try have at it,
> +        * and enable the below. Good luck! If that works, try
> +        * lowering the init level for more fun.
> +        */
> +       if (force_init_test) {
> +               ret = trigger_config_run_driver(test_dev, "tun");
> +               if (WARN_ON(ret))
> +                       return ret;
> +               ret = trigger_config_run_fs(test_dev, "btrfs");
> +               if (WARN_ON(ret))
> +                       return ret;
> +       }
> +
> +       return 0;
> +}
> +late_initcall(test_kmod_init);
> +
> +static
> +void unregister_test_dev_kmod(struct kmod_test_device *test_dev)
> +{
> +       mutex_lock(&test_dev->trigger_mutex);
> +       mutex_lock(&test_dev->config_mutex);
> +
> +       test_dev_kmod_stop_tests(test_dev);
> +
> +       dev_info(test_dev->dev, "removing interface\n");
> +       misc_deregister(&test_dev->misc_dev);
> +
> +       mutex_unlock(&test_dev->config_mutex);
> +       mutex_unlock(&test_dev->trigger_mutex);
> +
> +       free_test_dev_kmod(test_dev);
> +}
> +
> +static void __exit test_kmod_exit(void)
> +{
> +       struct kmod_test_device *test_dev, *tmp;
> +
> +       mutex_lock(&reg_dev_mutex);
> +       list_for_each_entry_safe(test_dev, tmp, &reg_test_devs, list) {
> +               list_del(&test_dev->list);
> +               unregister_test_dev_kmod(test_dev);
> +       }
> +       mutex_unlock(&reg_dev_mutex);
> +}
> +module_exit(test_kmod_exit);
> +
> +MODULE_AUTHOR("Luis R. Rodriguez <mcgrof@kernel.org>");
> +MODULE_LICENSE("GPL");
> diff --git a/tools/testing/selftests/kmod/Makefile b/tools/testing/selftests/kmod/Makefile
> new file mode 100644
> index 000000000000..fa2ccc5fb3de
> --- /dev/null
> +++ b/tools/testing/selftests/kmod/Makefile
> @@ -0,0 +1,11 @@
> +# Makefile for kmod loading selftests
> +
> +# No binaries, but make sure arg-less "make" doesn't trigger "run_tests"
> +all:
> +
> +TEST_PROGS := kmod.sh
> +
> +include ../lib.mk
> +
> +# Nothing to clean up.
> +clean:
> diff --git a/tools/testing/selftests/kmod/config b/tools/testing/selftests/kmod/config
> new file mode 100644
> index 000000000000..259f4fd6b5e2
> --- /dev/null
> +++ b/tools/testing/selftests/kmod/config
> @@ -0,0 +1,7 @@
> +CONFIG_TEST_KMOD=m
> +CONFIG_TEST_LKM=m
> +CONFIG_XFS_FS=m
> +
> +# For the module parameter force_init_test is used
> +CONFIG_TUN=m
> +CONFIG_BTRFS_FS=m
> diff --git a/tools/testing/selftests/kmod/kmod.sh b/tools/testing/selftests/kmod/kmod.sh
> new file mode 100755
> index 000000000000..9ea1864d8bae
> --- /dev/null
> +++ b/tools/testing/selftests/kmod/kmod.sh
> @@ -0,0 +1,449 @@
> +#!/bin/bash
> +#
> +# Copyright (C) 2016 Luis R. Rodriguez <mcgrof@kernel.org>
> +#
> +# This program is free software; you can redistribute it and/or modify it
> +# under the terms of copyleft-next (version 0.3.1 or later) as published
> +# at http://copyleft-next.org/.
> +
> +# This is a stress test script for kmod, the kernel module loader. It uses
> +# test_kmod which exposes a series of knobs for the API for us so we can
> +# tweak each test in userspace rather than in kernelspace.
> +#
> +# The way kmod works is it uses the kernel's usermode helper API to eventually
> +# call /sbin/modprobe. It has a limit of the number of concurrent calls
> +# possible. The kernel interface to load modules is request_module(), however
> +# mount uses get_fs_type(). Both behave slightly differently, but the
> +# differences are important enough to test each call separately. For this
> +# reason test_kmod starts by providing tests for both calls.
> +#
> +# The test driver test_kmod assumes a series of defaults which you can
> +# override by exporting to your environment prior running this script.
> +# For instance this script assumes you do not have xfs loaded upon boot.
> +# If this is false, export DEFAULT_KMOD_FS="ext4" prior to running this
> +# script if the filesyste module you don't have loaded upon bootup
> +# is ext4 instead. Refer to allow_user_defaults() for a list of user
> +# override variables possible.
> +#
> +# You'll want at least 4096 GiB of RAM to expect to run these tests

4TiB of RAM? I assume this was meant to be 4 GiB not 4096?

> +# without running out of memory on them. For other requirements refer
> +# to test_reqs()
> +
> +set -e
> +
> +TEST_DRIVER="test_kmod"
> +
> +function allow_user_defaults()
> +{
> +       if [ -z $DEFAULT_KMOD_DRIVER ]; then
> +               DEFAULT_KMOD_DRIVER="test_module"
> +       fi
> +
> +       if [ -z $DEFAULT_KMOD_FS ]; then
> +               DEFAULT_KMOD_FS="xfs"
> +       fi
> +
> +       if [ -z $PROC_DIR ]; then
> +               PROC_DIR="/proc/sys/kernel/"
> +       fi
> +
> +       if [ -z $MODPROBE_LIMIT ]; then
> +               MODPROBE_LIMIT=50
> +       fi
> +
> +       if [ -z $DIR ]; then
> +               DIR="/sys/devices/virtual/misc/${TEST_DRIVER}0/"
> +       fi
> +
> +       MODPROBE_LIMIT_FILE="${PROC_DIR}/kmod-limit"
> +}
> +
> +test_reqs()
> +{
> +       if ! which modprobe 2> /dev/null > /dev/null; then
> +               echo "$0: You need modprobe installed"

While not a huge deal, I prefer that error messages end up on stderr,
so adding >&2 to all the failure echos (or providing an err function)
would be nice. (This happens in later places...)

> +               exit 1
> +       fi
> +
> +       if ! which kmod 2> /dev/null > /dev/null; then
> +               echo "$0: You need kmod installed"
> +               exit 1
> +       fi
> +
> +       # kmod 19 has a bad bug where it returns 0 when modprobe
> +       # gets called *even* if the module was not loaded due to
> +       # some bad heuristics. For details see:
> +       #
> +       # A work around is possible in-kernel but its rather
> +       # complex.
> +       KMOD_VERSION=$(kmod --version | awk '{print $3}')
> +       if [[ $KMOD_VERSION  -le 19 ]]; then
> +               echo "$0: You need at least kmod 20"
> +               echo "kmod <= 19 is buggy, for details see:"
> +               echo "http://git.kernel.org/cgit/utils/kernel/kmod/kmod.git/commit/libkmod/libkmod-module.c?id=fd44a98ae2eb5eb32161088954ab21e58e19dfc4"
> +               exit 1
> +       fi
> +}
> +
> +function load_req_mod()
> +{
> +       if [ ! -d $DIR ]; then
> +               # Alanis: "Oh isn't it ironic?"
> +               modprobe $TEST_DRIVER
> +               if [ ! -d $DIR ]; then
> +                       echo "$0: $DIR not present"
> +                       echo "You must have the following enabled in your kernel:"
> +                       cat $PWD/config

I like this (minimum config in the test directory). Are other tests
doing this too?

> +                       exit 1
> +               fi
> +       fi
> +}
> +
> +test_finish()
> +{
> +       echo "Test completed"
> +}
> +
> +errno_name_to_val()
> +{
> +       case "$1" in
> +       # kmod calls modprobe and upon of a module not found
> +       # modprobe returns just 1... However in the kernel we
> +       # *sometimes* see 256...
> +       MODULE_NOT_FOUND)
> +               echo 256;;
> +       SUCCESS)
> +               echo 0;;
> +       -EPERM)
> +               echo -1;;
> +       -ENOENT)
> +               echo -2;;
> +       -EINVAL)
> +               echo -22;;
> +       -ERR_ANY)
> +               echo -123456;;
> +       *)
> +               echo invalid;;
> +       esac
> +}
> +
> +errno_val_to_name()
> +       case "$1" in
> +       256)
> +               echo MODULE_NOT_FOUND;;
> +       0)
> +               echo SUCCESS;;
> +       -1)
> +               echo -EPERM;;
> +       -2)
> +               echo -ENOENT;;
> +       -22)
> +               echo -EINVAL;;
> +       -123456)
> +               echo -ERR_ANY;;
> +       *)
> +               echo invalid;;
> +       esac
> +
> +config_set_test_case_driver()
> +{
> +       if ! echo -n 1 >$DIR/config_test_case; then
> +               echo "$0: Unable to set to test case to driver" >&2
> +               exit 1
> +       fi
> +}
> +
> +config_set_test_case_fs()
> +{
> +       if ! echo -n 2 >$DIR/config_test_case; then
> +               echo "$0: Unable to set to test case to fs" >&2
> +               exit 1
> +       fi
> +}
> +
> +config_num_threads()
> +{
> +       if ! echo -n $1 >$DIR/config_num_threads; then
> +               echo "$0: Unable to set to number of threads" >&2
> +               exit 1
> +       fi
> +}
> +
> +config_get_modprobe_limit()
> +{
> +       if [[ -f ${MODPROBE_LIMIT_FILE} ]] ; then
> +               MODPROBE_LIMIT=$(cat $MODPROBE_LIMIT_FILE)
> +       fi
> +       echo $MODPROBE_LIMIT
> +}
> +
> +config_num_thread_limit_extra()
> +{
> +       MODPROBE_LIMIT=$(config_get_modprobe_limit)
> +       let EXTRA_LIMIT=$MODPROBE_LIMIT+$1
> +       config_num_threads $EXTRA_LIMIT
> +}
> +
> +# For special characters use printf directly,
> +# refer to kmod_test_0001
> +config_set_driver()
> +{
> +       if ! echo -n $1 >$DIR/config_test_driver; then
> +               echo "$0: Unable to set driver" >&2
> +               exit 1
> +       fi
> +}
> +
> +config_set_fs()
> +{
> +       if ! echo -n $1 >$DIR/config_test_fs; then
> +               echo "$0: Unable to set driver" >&2
> +               exit 1
> +       fi
> +}
> +
> +config_get_driver()
> +{
> +       cat $DIR/config_test_driver
> +}
> +
> +config_get_test_result()
> +{
> +       cat $DIR/test_result
> +}
> +
> +config_reset()
> +{
> +       if ! echo -n "1" >"$DIR"/reset; then
> +               echo "$0: reset shuld have worked" >&2
> +               exit 1
> +       fi
> +}
> +
> +config_show_config()
> +{
> +       echo "----------------------------------------------------"
> +       cat "$DIR"/config
> +       echo "----------------------------------------------------"
> +}
> +
> +config_trigger()
> +{
> +       if ! echo -n "1" >"$DIR"/trigger_config 2>/dev/null; then
> +               echo "$1: FAIL - loading should have worked"
> +               config_show_config
> +               exit 1
> +       fi
> +       echo "$1: OK! - loading kmod test"
> +}
> +
> +config_trigger_want_fail()
> +{
> +       if echo "1" > $DIR/trigger_config 2>/dev/null; then
> +               echo "$1: FAIL - test case was expected to fail"
> +               config_show_config
> +               exit 1
> +       fi
> +       echo "$1: OK! - kmod test case failed as expected"
> +}
> +
> +config_expect_result()
> +{
> +       RC=$(config_get_test_result)
> +       RC_NAME=$(errno_val_to_name $RC)
> +
> +       ERRNO_NAME=$2
> +       ERRNO=$(errno_name_to_val $ERRNO_NAME)
> +
> +       if [[ $ERRNO_NAME = "-ERR_ANY" ]]; then
> +               if [[ $RC -ge 0 ]]; then
> +                       echo "$1: FAIL, test expects $ERRNO_NAME - got $RC_NAME ($RC)" >&2
> +                       config_show_config
> +                       exit 1
> +               fi
> +       elif [[ $RC != $ERRNO ]]; then
> +               echo "$1: FAIL, test expects $ERRNO_NAME ($ERRNO) - got $RC_NAME ($RC)" >&2
> +               config_show_config
> +               exit 1
> +       fi
> +       echo "$1: OK! - Return value: $RC ($RC_NAME), expected $ERRNO_NAME"
> +}
> +
> +kmod_defaults_driver()
> +{
> +       config_reset
> +       modprobe -r $DEFAULT_KMOD_DRIVER
> +       config_set_driver $DEFAULT_KMOD_DRIVER
> +}
> +
> +kmod_defaults_fs()
> +{
> +       config_reset
> +       modprobe -r $DEFAULT_KMOD_FS
> +       config_set_fs $DEFAULT_KMOD_FS
> +       config_set_test_case_fs
> +}
> +
> +kmod_test_0001_driver()
> +{
> +       NAME='\000'
> +
> +       kmod_defaults_driver
> +       config_num_threads 1
> +       printf '\000' >"$DIR"/config_test_driver
> +       config_trigger ${FUNCNAME[0]}
> +       config_expect_result ${FUNCNAME[0]} MODULE_NOT_FOUND
> +}
> +
> +kmod_test_0001_fs()
> +{
> +       NAME='\000'
> +
> +       kmod_defaults_fs
> +       config_num_threads 1
> +       printf '\000' >"$DIR"/config_test_fs
> +       config_trigger ${FUNCNAME[0]}
> +       config_expect_result ${FUNCNAME[0]} -EINVAL
> +}
> +
> +kmod_test_0001()
> +{
> +       kmod_test_0001_driver
> +       kmod_test_0001_fs
> +}
> +
> +kmod_test_0002_driver()
> +{
> +       NAME="nope-$DEFAULT_KMOD_DRIVER"
> +
> +       kmod_defaults_driver
> +       config_set_driver $NAME
> +       config_num_threads 1
> +       config_trigger ${FUNCNAME[0]}
> +       config_expect_result ${FUNCNAME[0]} MODULE_NOT_FOUND
> +}
> +
> +kmod_test_0002_fs()
> +{
> +       NAME="nope-$DEFAULT_KMOD_FS"
> +
> +       kmod_defaults_fs
> +       config_set_fs $NAME
> +       config_trigger ${FUNCNAME[0]}
> +       config_expect_result ${FUNCNAME[0]} -EINVAL
> +}
> +
> +kmod_test_0002()
> +{
> +       kmod_test_0002_driver
> +       kmod_test_0002_fs
> +}
> +
> +kmod_test_0003()
> +{
> +       kmod_defaults_fs
> +       config_num_threads 1
> +       config_trigger ${FUNCNAME[0]}
> +       config_expect_result ${FUNCNAME[0]} SUCCESS
> +}
> +
> +kmod_test_0004()
> +{
> +       kmod_defaults_fs
> +       config_num_threads 2
> +       config_trigger ${FUNCNAME[0]}
> +       config_expect_result ${FUNCNAME[0]} SUCCESS
> +}
> +
> +kmod_test_0005()
> +{
> +       kmod_defaults_driver
> +       config_trigger ${FUNCNAME[0]}
> +       config_expect_result ${FUNCNAME[0]} SUCCESS
> +}
> +
> +kmod_test_0006()
> +{
> +       kmod_defaults_fs
> +       config_trigger ${FUNCNAME[0]}
> +       config_expect_result ${FUNCNAME[0]} SUCCESS
> +}
> +
> +kmod_test_0007()
> +{
> +       kmod_test_0005
> +       kmod_test_0006
> +}
> +
> +kmod_test_0008()
> +{
> +       kmod_defaults_driver
> +       MODPROBE_LIMIT=$(config_get_modprobe_limit)
> +       let EXTRA=$MODPROBE_LIMIT/2
> +       config_num_thread_limit_extra $EXTRA
> +       config_trigger ${FUNCNAME[0]}
> +       config_expect_result ${FUNCNAME[0]} -EINVAL
> +}
> +
> +kmod_test_0009()
> +{
> +       kmod_defaults_fs
> +       #MODPROBE_LIMIT=$(config_get_modprobe_limit)
> +       #let EXTRA=$MODPROBE_LIMIT/3
> +       config_num_thread_limit_extra 5
> +       config_trigger ${FUNCNAME[0]}
> +       config_expect_result ${FUNCNAME[0]} -EINVAL
> +}
> +
> +trap "test_finish" EXIT
> +test_reqs
> +allow_user_defaults
> +load_req_mod
> +
> +usage()
> +{
> +       echo "Usage: $0 [ -t <4-number-digit> ]"
> +       echo "Valid tests: 0001-0011"
> +       echo
> +       echo "0001 - Simple test - 1 thread  for empty string"
> +       echo "0002 - Simple test - 1 thread  for modules/filesystems that do not exist"
> +       echo "0003 - Simple test - 1 thread  for get_fs_type() only"
> +       echo "0004 - Simple test - 2 threads for get_fs_type() only"
> +       echo "0005 - multithreaded tests with default setup - request_module() only"
> +       echo "0006 - multithreaded tests with default setup - get_fs_type() only"
> +       echo "0007 - multithreaded tests with default setup test request_module() and get_fs_type()"
> +       echo "0008 - multithreaded - push kmod_concurrent over max_modprobes for request_module()"
> +       echo "0009 - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()"
> +       exit 1
> +}
> +
> +# You can ask for a specific test:
> +if [[ $# > 0 ]] ; then
> +       if [[ $1 != "-t" ]]; then
> +               usage
> +       fi
> +
> +       re='^[0-9]+$'
> +       if ! [[ $2 =~ $re ]]; then
> +               usage
> +       fi
> +
> +       RUN_TEST=kmod_test_$2
> +       $RUN_TEST
> +       exit 0
> +fi
> +
> +# Once tese are enabled please leave them as-is. Write your own test,
> +# we have tons of space.
> +kmod_test_0001
> +kmod_test_0002
> +kmod_test_0003
> +kmod_test_0004
> +kmod_test_0005
> +kmod_test_0006
> +kmod_test_0007
> +
> +#kmod_test_0008
> +#kmod_test_0009

While it's documented in the commit log, I think a short note for each
disabled test should be added here too.

> +
> +exit 0
> --
> 2.10.1
>

-Kees

-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 03/10] kmod: add dynamic max concurrent thread count
  2016-12-08 19:48 ` [RFC 03/10] kmod: add dynamic max concurrent thread count Luis R. Rodriguez
@ 2016-12-08 20:28   ` Kees Cook
  2016-12-08 21:00     ` Luis R. Rodriguez
  2016-12-14 15:38   ` Petr Mladek
  1 sibling, 1 reply; 65+ messages in thread
From: Kees Cook @ 2016-12-08 20:28 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, Jessica Yu, Rusty Russell, Eric W. Biederman,
	Dmitry Torokhov, Arnaldo Carvalho de Melo, Jonathan Corbet,
	martin.wilck, Michal Marek, Petr Mladek, hare, rwright,
	Jeff Mahoney, DSterba, fdmanana, neilb, Guenter Roeck, rgoldwyn,
	subashab, Heinrich Schuchardt, Aaron Tomlin, mbenes,
	Paul E. McKenney, Dan Williams, Josh Poimboeuf, David S. Miller,
	Ingo Molnar, Andrew Morton, Linus Torvalds, linux-kselftest,
	linux-doc, LKML

On Thu, Dec 8, 2016 at 11:48 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> We currently statically limit the number of modprobe threads which
> we allow to run concurrently to 50. As per Keith Owens, this was a
> completely arbitrary value, and it was set in the 2.3.38 days [0]
> over 16 years ago in year 2000.
>
> Although we haven't yet hit our lower limits, experimentation [1]
> shows that when and if we hit this limit in the worst case, will be
> fatal -- consider get_fs_type() failures upon mount on a system which
> has many partitions, some of which might even be with the same
> filesystem. Its best to be prudent and increase and set this
> value to something more sensible which ensures we're far from hitting
> the limit and also allows default build/user run time override.
>
> The worst case is fatal given that once a module fails to load there
> is a period of time during which subsequent request for the same module
> will fail, so in the case of partitions its not just one request that
> could fail, but whole series of partitions. This later issue of a
> module request failure domino effect can be addressed later, but
> increasing the limit to something more meaninful should at least give us
> enough cushion to avoid this for a while.
>
> Set this value up with a bit more meaninful modern limits:
>
> Bump this up to 64  max for small systems (CONFIG_BASE_SMALL)
> Bump this up to 128 max for larger systems (!CONFIG_BASE_SMALL)
>
> Also allow the default max limit to be further fine tuned at compile
> time and at initialization at run time at boot up using the kernel
> parameter: max_modprobes.
>
> [0] https://git.kernel.org/cgit/linux/kernel/git/history/history.git/commit/?id=ab1c4ec7410f6ec64e1511d1a7d850fc99c09b44
> [1] https://github.com/mcgrof/test_request_module
>
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> ---
>  Documentation/admin-guide/kernel-parameters.txt |  7 ++++
>  include/linux/kmod.h                            |  3 +-
>  init/Kconfig                                    | 23 +++++++++++++
>  init/main.c                                     |  1 +
>  kernel/kmod.c                                   | 43 ++++++++++++++++---------
>  5 files changed, 61 insertions(+), 16 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index be2d6d0a03a4..92bcccc65ea4 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -1700,6 +1700,13 @@
>
>         keepinitrd      [HW,ARM]
>
> +       kmod.max_modprobes [KNL]
> +                       This lets you set the max allowed of concurrent
> +                       modprobes threads possible on a system overriding the
> +                       default heuristic of:
> +
> +                               min(max_threads/2, 2 << CONFIG_MAX_KMOD_CONCURRENT)
> +
>         kernelcore=     [KNL,X86,IA-64,PPC]
>                         Format: nn[KMGTPE] | "mirror"
>                         This parameter
> diff --git a/include/linux/kmod.h b/include/linux/kmod.h
> index fcfd2bf14d3f..15783cd7f056 100644
> --- a/include/linux/kmod.h
> +++ b/include/linux/kmod.h
> @@ -38,13 +38,14 @@ int __request_module(bool wait, const char *name, ...);
>  #define request_module_nowait(mod...) __request_module(false, mod)
>  #define try_then_request_module(x, mod...) \
>         ((x) ?: (__request_module(true, mod), (x)))
> +void init_kmod_umh(void);
>  #else
>  static inline int request_module(const char *name, ...) { return -ENOSYS; }
>  static inline int request_module_nowait(const char *name, ...) { return -ENOSYS; }
> +static inline void init_kmod_umh(void) { }
>  #define try_then_request_module(x, mod...) (x)
>  #endif
>
> -
>  struct cred;
>  struct file;
>
> diff --git a/init/Kconfig b/init/Kconfig
> index 271692a352f1..da2c25746937 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -2111,6 +2111,29 @@ config TRIM_UNUSED_KSYMS
>
>           If unsure, or if you need to build out-of-tree modules, say N.
>
> +config MAX_KMOD_CONCURRENT
> +       int "Max allowed concurrent request_module() calls (6=>64, 10=>1024)"
> +       range 0 14
> +       default 6 if !BASE_SMALL
> +       default 7 if BASE_SMALL
> +       help
> +         The kernel restricts the number of possible concurrent calls to
> +         request_module() to help avoid a recursive loop possible with
> +         modules. The default maximum number of concurrent threads allowed
> +         to run request_module() will be:
> +
> +           max_modprobes = min(max_threads/2, 2 << CONFIG_MAX_KMOD_CONCURRENT);
> +
> +         The value set in CONFIG_MAX_KMOD_CONCURRENT represents then the power
> +         of 2 value used at boot time for the above computation. You can
> +         override the default built value using the kernel parameter:
> +
> +               kmod.max_modprobes=4096
> +
> +         We set this to default to 64 (2^6) concurrent modprobe threads for
> +         small systems, for larger systems this defaults to 128 (2^7)
> +         concurrent modprobe threads.
> +
>  endif # MODULES
>
>  config MODULES_TREE_LOOKUP
> diff --git a/init/main.c b/init/main.c
> index 8161208d4ece..1fa441aa32c6 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -638,6 +638,7 @@ asmlinkage __visible void __init start_kernel(void)
>         thread_stack_cache_init();
>         cred_init();
>         fork_init();
> +       init_kmod_umh();
>         proc_caches_init();
>         buffer_init();
>         key_init();
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index 0277d1216f80..cb6f7ca7b8a5 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -44,6 +44,9 @@
>  #include <trace/events/module.h>
>
>  extern int max_threads;
> +unsigned int max_modprobes;
> +module_param(max_modprobes, uint, 0644);
> +MODULE_PARM_DESC(max_modprobes, "Max number of allowed concurrent modprobes");
>
>  #define CAP_BSET       (void *)1
>  #define CAP_PI         (void *)2
> @@ -125,10 +128,8 @@ int __request_module(bool wait, const char *fmt, ...)
>  {
>         va_list args;
>         char module_name[MODULE_NAME_LEN];
> -       unsigned int max_modprobes;
>         int ret;
>         static atomic_t kmod_concurrent = ATOMIC_INIT(0);
> -#define MAX_KMOD_CONCURRENT 50 /* Completely arbitrary value - KAO */
>         static int kmod_loop_msg;
>
>         /*
> @@ -152,19 +153,6 @@ int __request_module(bool wait, const char *fmt, ...)
>         if (ret)
>                 return ret;
>
> -       /* If modprobe needs a service that is in a module, we get a recursive
> -        * loop.  Limit the number of running kmod threads to max_threads/2 or
> -        * MAX_KMOD_CONCURRENT, whichever is the smaller.  A cleaner method
> -        * would be to run the parents of this process, counting how many times
> -        * kmod was invoked.  That would mean accessing the internals of the
> -        * process tables to get the command line, proc_pid_cmdline is static
> -        * and it is not worth changing the proc code just to handle this case.
> -        * KAO.
> -        *
> -        * "trace the ppid" is simple, but will fail if someone's
> -        * parent exits.  I think this is as good as it gets. --RR
> -        */
> -       max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
>         atomic_inc(&kmod_concurrent);
>         if (atomic_read(&kmod_concurrent) > max_modprobes) {
>                 /* We may be blaming an innocent here, but unlikely */
> @@ -186,6 +174,31 @@ int __request_module(bool wait, const char *fmt, ...)
>         return ret;
>  }
>  EXPORT_SYMBOL(__request_module);
> +
> +/*
> + * If modprobe needs a service that is in a module, we get a recursive
> + * loop.  Limit the number of running kmod threads to max_threads/2 or
> + * CONFIG_MAX_KMOD_CONCURRENT, whichever is the smaller.  A cleaner method
> + * would be to run the parents of this process, counting how many times
> + * kmod was invoked.  That would mean accessing the internals of the
> + * process tables to get the command line, proc_pid_cmdline is static
> + * and it is not worth changing the proc code just to handle this case.
> + *
> + * "trace the ppid" is simple, but will fail if someone's
> + * parent exits.  I think this is as good as it gets.
> + *
> + * You can override with with a kernel parameter, for instance to allow
> + * 4096 concurrent modprobe instances:
> + *
> + *     kmod.max_modprobes=4096
> + */
> +void __init init_kmod_umh(void)

What does umh mean?

> +{
> +       if (!max_modprobes)
> +               max_modprobes = min(max_threads/2,
> +                                   2 << CONFIG_MAX_KMOD_CONCURRENT);
> +}
> +
>  #endif /* CONFIG_MODULES */
>
>  static void call_usermodehelper_freeinfo(struct subprocess_info *info)
> --
> 2.10.1
>



-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 04/10] kmod: provide wrappers for kmod_concurrent inc/dec
  2016-12-08 19:48 ` [RFC 04/10] kmod: provide wrappers for kmod_concurrent inc/dec Luis R. Rodriguez
@ 2016-12-08 20:29   ` Kees Cook
  2016-12-08 21:08     ` Luis R. Rodriguez
  2016-12-22  5:07   ` Jessica Yu
  1 sibling, 1 reply; 65+ messages in thread
From: Kees Cook @ 2016-12-08 20:29 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, Jessica Yu, Rusty Russell, Eric W. Biederman,
	Dmitry Torokhov, Arnaldo Carvalho de Melo, Jonathan Corbet,
	martin.wilck, Michal Marek, Petr Mladek, hare, rwright,
	Jeff Mahoney, DSterba, fdmanana, neilb, Guenter Roeck, rgoldwyn,
	subashab, Heinrich Schuchardt, Aaron Tomlin, mbenes,
	Paul E. McKenney, Dan Williams, Josh Poimboeuf, David S. Miller,
	Ingo Molnar, Andrew Morton, Linus Torvalds, linux-kselftest,
	linux-doc, LKML

On Thu, Dec 8, 2016 at 11:48 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> kmod_concurrent is used as an atomic counter for enabling
> the allowed limit of modprobe calls, provide wrappers for it
> to enable this to be expanded on more easily. This will be done
> later.
>
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> ---
>  kernel/kmod.c | 27 +++++++++++++++++++++------
>  1 file changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index cb6f7ca7b8a5..049d7eabda38 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -44,6 +44,9 @@
>  #include <trace/events/module.h>
>
>  extern int max_threads;
> +
> +static atomic_t kmod_concurrent = ATOMIC_INIT(0);
> +
>  unsigned int max_modprobes;
>  module_param(max_modprobes, uint, 0644);
>  MODULE_PARM_DESC(max_modprobes, "Max number of allowed concurrent modprobes");
> @@ -108,6 +111,20 @@ static int call_modprobe(char *module_name, int wait)
>         return -ENOMEM;
>  }
>
> +static int kmod_umh_threads_get(void)
> +{
> +       atomic_inc(&kmod_concurrent);
> +       if (atomic_read(&kmod_concurrent) < max_modprobes)
> +               return 0;
> +       atomic_dec(&kmod_concurrent);
> +       return -ENOMEM;
> +}
> +
> +static void kmod_umh_threads_put(void)
> +{
> +       atomic_dec(&kmod_concurrent);
> +}

Can you use a kref here instead? We're trying to kill raw use of
atomic_t for reference counting...

> +
>  /**
>   * __request_module - try to load a kernel module
>   * @wait: wait (or not) for the operation to complete
> @@ -129,7 +146,6 @@ int __request_module(bool wait, const char *fmt, ...)
>         va_list args;
>         char module_name[MODULE_NAME_LEN];
>         int ret;
> -       static atomic_t kmod_concurrent = ATOMIC_INIT(0);
>         static int kmod_loop_msg;
>
>         /*
> @@ -153,8 +169,8 @@ int __request_module(bool wait, const char *fmt, ...)
>         if (ret)
>                 return ret;
>
> -       atomic_inc(&kmod_concurrent);
> -       if (atomic_read(&kmod_concurrent) > max_modprobes) {
> +       ret = kmod_umh_threads_get();
> +       if (ret) {
>                 /* We may be blaming an innocent here, but unlikely */
>                 if (kmod_loop_msg < 5) {
>                         printk(KERN_ERR
> @@ -162,15 +178,14 @@ int __request_module(bool wait, const char *fmt, ...)
>                                module_name);
>                         kmod_loop_msg++;
>                 }
> -               atomic_dec(&kmod_concurrent);
> -               return -ENOMEM;
> +               return ret;
>         }
>
>         trace_module_request(module_name, wait, _RET_IP_);
>
>         ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
>
> -       atomic_dec(&kmod_concurrent);
> +       kmod_umh_threads_put();
>         return ret;
>  }
>  EXPORT_SYMBOL(__request_module);
> --
> 2.10.1
>



-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 02/10] module: fix memory leak on early load_module() failures
  2016-12-08 19:48 ` [RFC 02/10] module: fix memory leak on early load_module() failures Luis R. Rodriguez
@ 2016-12-08 20:30   ` Kees Cook
  2016-12-08 21:10     ` Luis R. Rodriguez
  2016-12-09 17:06   ` Miroslav Benes
  2016-12-15 18:46   ` Aaron Tomlin
  2 siblings, 1 reply; 65+ messages in thread
From: Kees Cook @ 2016-12-08 20:30 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, Jessica Yu, Rusty Russell, Eric W. Biederman,
	Dmitry Torokhov, Arnaldo Carvalho de Melo, Jonathan Corbet,
	martin.wilck, Michal Marek, Petr Mladek, hare, rwright,
	Jeff Mahoney, DSterba, fdmanana, neilb, Guenter Roeck, rgoldwyn,
	subashab, Heinrich Schuchardt, Aaron Tomlin, mbenes,
	Paul E. McKenney, Dan Williams, Josh Poimboeuf, David S. Miller,
	Ingo Molnar, Andrew Morton, Linus Torvalds, linux-kselftest,
	linux-doc, LKML

On Thu, Dec 8, 2016 at 11:48 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> While looking for early possible module loading failures I was
> able to reproduce a memory leak possible with kmemleak. There
> are a few rare ways to trigger a failure:
>
>   o we've run into a failure while processing kernel parameters
>     (parse_args() returns an error)
>   o mod_sysfs_setup() fails
>   o we're a live patch module and copy_module_elf() fails
>
> Chances of running into this issue is really low.
>
> kmemleak splat:
>
> unreferenced object 0xffff9f2c4ada1b00 (size 32):
>   comm "kworker/u16:4", pid 82, jiffies 4294897636 (age 681.816s)
>   hex dump (first 32 bytes):
>     6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00  memstick0.......
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<ffffffff8c6cfeba>] kmemleak_alloc+0x4a/0xa0
>     [<ffffffff8c200046>] __kmalloc_track_caller+0x126/0x230
>     [<ffffffff8c1bc581>] kstrdup+0x31/0x60
>     [<ffffffff8c1bc5d4>] kstrdup_const+0x24/0x30
>     [<ffffffff8c3c23aa>] kvasprintf_const+0x7a/0x90
>     [<ffffffff8c3b5481>] kobject_set_name_vargs+0x21/0x90
>     [<ffffffff8c4fbdd7>] dev_set_name+0x47/0x50
>     [<ffffffffc07819e5>] memstick_check+0x95/0x33c [memstick]
>     [<ffffffff8c09c893>] process_one_work+0x1f3/0x4b0
>     [<ffffffff8c09cb98>] worker_thread+0x48/0x4e0
>     [<ffffffff8c0a2b79>] kthread+0xc9/0xe0
>     [<ffffffff8c6dab5f>] ret_from_fork+0x1f/0x40
>     [<ffffffffffffffff>] 0xffffffffffffffff
>
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>

Acked-by: Kees Cook <keescook@chromium.org>

Is this worth sending through -stable too?

-Kees

> ---
>  kernel/module.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/kernel/module.c b/kernel/module.c
> index f7482db0f843..e420ed67e533 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -3722,6 +3722,7 @@ static int load_module(struct load_info *info, const char __user *uargs,
>         mod_sysfs_teardown(mod);
>   coming_cleanup:
>         mod->state = MODULE_STATE_GOING;
> +       destroy_params(mod->kp, mod->num_kp);
>         blocking_notifier_call_chain(&module_notify_list,
>                                      MODULE_STATE_GOING, mod);
>         klp_module_going(mod);
> --
> 2.10.1
>



-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 03/10] kmod: add dynamic max concurrent thread count
  2016-12-08 20:28   ` Kees Cook
@ 2016-12-08 21:00     ` Luis R. Rodriguez
  0 siblings, 0 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-08 21:00 UTC (permalink / raw)
  To: Kees Cook
  Cc: Luis R. Rodriguez, shuah, Jessica Yu, Rusty Russell,
	Eric W. Biederman, Dmitry Torokhov, Arnaldo Carvalho de Melo,
	Jonathan Corbet, martin.wilck, Michal Marek, Petr Mladek, hare,
	rwright, Jeff Mahoney, DSterba, fdmanana, neilb, Guenter Roeck,
	rgoldwyn, subashab, Heinrich Schuchardt, Aaron Tomlin, mbenes,
	Paul E. McKenney, Dan Williams, Josh Poimboeuf, David S. Miller,
	Ingo Molnar, Andrew Morton, Linus Torvalds, linux-kselftest,
	linux-doc, LKML

On Thu, Dec 08, 2016 at 12:28:07PM -0800, Kees Cook wrote:
> On Thu, Dec 8, 2016 at 11:48 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > index 0277d1216f80..cb6f7ca7b8a5 100644
> > --- a/kernel/kmod.c
> > +++ b/kernel/kmod.c
> > @@ -44,6 +44,9 @@
> > @@ -186,6 +174,31 @@ int __request_module(bool wait, const char *fmt, ...)
> >         return ret;
> >  }
> >  EXPORT_SYMBOL(__request_module);
> > +
> > +/*
> > + * If modprobe needs a service that is in a module, we get a recursive
> > + * loop.  Limit the number of running kmod threads to max_threads/2 or
> > + * CONFIG_MAX_KMOD_CONCURRENT, whichever is the smaller.  A cleaner method
> > + * would be to run the parents of this process, counting how many times
> > + * kmod was invoked.  That would mean accessing the internals of the
> > + * process tables to get the command line, proc_pid_cmdline is static
> > + * and it is not worth changing the proc code just to handle this case.
> > + *
> > + * "trace the ppid" is simple, but will fail if someone's
> > + * parent exits.  I think this is as good as it gets.
> > + *
> > + * You can override with with a kernel parameter, for instance to allow
> > + * 4096 concurrent modprobe instances:
> > + *
> > + *     kmod.max_modprobes=4096
> > + */
> > +void __init init_kmod_umh(void)
> 
> What does umh mean?

umh is user mode helper. kmod.c actually implements the kernel's umh code.
A subsequent series I will want to move all that to umh.c and keep module
loading separate in kmod.c But that's for later as a cleanup.

BTW any chance I can have you trim replies to file name and hunk for changes
you reply to ? As an example I did that here :)

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 04/10] kmod: provide wrappers for kmod_concurrent inc/dec
  2016-12-08 20:29   ` Kees Cook
@ 2016-12-08 21:08     ` Luis R. Rodriguez
  2016-12-15 12:46       ` Petr Mladek
  0 siblings, 1 reply; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-08 21:08 UTC (permalink / raw)
  To: Kees Cook
  Cc: Luis R. Rodriguez, shuah, Jessica Yu, Rusty Russell,
	Eric W. Biederman, Dmitry Torokhov, Arnaldo Carvalho de Melo,
	Jonathan Corbet, martin.wilck, Michal Marek, Petr Mladek, hare,
	rwright, Jeff Mahoney, DSterba, fdmanana, neilb, Guenter Roeck,
	rgoldwyn, subashab, Heinrich Schuchardt, Aaron Tomlin, mbenes,
	Paul E. McKenney, Dan Williams, Josh Poimboeuf, David S. Miller,
	Ingo Molnar, Andrew Morton, Linus Torvalds, linux-kselftest,
	linux-doc, LKML

On Thu, Dec 08, 2016 at 12:29:42PM -0800, Kees Cook wrote:
> On Thu, Dec 8, 2016 at 11:48 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > kmod_concurrent is used as an atomic counter for enabling
> > the allowed limit of modprobe calls, provide wrappers for it
> > to enable this to be expanded on more easily. This will be done
> > later.
> >
> > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > ---
> >  kernel/kmod.c | 27 +++++++++++++++++++++------
> >  1 file changed, 21 insertions(+), 6 deletions(-)
> >
> > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > index cb6f7ca7b8a5..049d7eabda38 100644
> > --- a/kernel/kmod.c
> > +++ b/kernel/kmod.c
> > @@ -44,6 +44,9 @@
> >  #include <trace/events/module.h>
> >
> >  extern int max_threads;
> > +
> > +static atomic_t kmod_concurrent = ATOMIC_INIT(0);
> > +
> >  unsigned int max_modprobes;
> >  module_param(max_modprobes, uint, 0644);
> >  MODULE_PARM_DESC(max_modprobes, "Max number of allowed concurrent modprobes");
> > @@ -108,6 +111,20 @@ static int call_modprobe(char *module_name, int wait)
> >         return -ENOMEM;
> >  }
> >
> > +static int kmod_umh_threads_get(void)
> > +{
> > +       atomic_inc(&kmod_concurrent);
> > +       if (atomic_read(&kmod_concurrent) < max_modprobes)
> > +               return 0;
> > +       atomic_dec(&kmod_concurrent);
> > +       return -ENOMEM;
> > +}
> > +
> > +static void kmod_umh_threads_put(void)
> > +{
> > +       atomic_dec(&kmod_concurrent);
> > +}
> 
> Can you use a kref here instead? We're trying to kill raw use of
> atomic_t for reference counting...

That's a much broader functional change than I was looking for, but I am up for
it. Can you describe the benefit of using kref you expect or why this is an
ongoing crusade? Since its a larger functional change how about doing this
change later, and we can test impact with the tress test driver. In theory if
there are benefits can't we add a test case to prove the gains?

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 02/10] module: fix memory leak on early load_module() failures
  2016-12-08 20:30   ` Kees Cook
@ 2016-12-08 21:10     ` Luis R. Rodriguez
  2016-12-08 21:17       ` Kees Cook
  0 siblings, 1 reply; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-08 21:10 UTC (permalink / raw)
  To: Kees Cook
  Cc: shuah, Jessica Yu, Rusty Russell, Eric W. Biederman,
	Dmitry Torokhov, Arnaldo Carvalho de Melo, Jonathan Corbet,
	martin.wilck, Michal Marek, Petr Mladek, hare, rwright,
	Jeff Mahoney, DSterba, Filipe Manana, NeilBrown, Guenter Roeck,
	rgoldwyn, subashab, Heinrich Schuchardt, Aaron Tomlin,
	Miroslav Benes, Paul E. McKenney, Dan Williams, Josh Poimboeuf,
	David S. Miller, Ingo Molnar, Andrew Morton, Linus Torvalds,
	linux-kselftest, linux-doc, LKML

On Thu, Dec 8, 2016 at 2:30 PM, Kees Cook <keescook@chromium.org> wrote:
> On Thu, Dec 8, 2016 at 11:48 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>> While looking for early possible module loading failures I was
>> able to reproduce a memory leak possible with kmemleak. There
>> are a few rare ways to trigger a failure:
>>
>>   o we've run into a failure while processing kernel parameters
>>     (parse_args() returns an error)
>>   o mod_sysfs_setup() fails
>>   o we're a live patch module and copy_module_elf() fails
>>
>> Chances of running into this issue is really low.
>>
>> kmemleak splat:
>>
>> unreferenced object 0xffff9f2c4ada1b00 (size 32):
>>   comm "kworker/u16:4", pid 82, jiffies 4294897636 (age 681.816s)
>>   hex dump (first 32 bytes):
>>     6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00  memstick0.......
>>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>   backtrace:
>>     [<ffffffff8c6cfeba>] kmemleak_alloc+0x4a/0xa0
>>     [<ffffffff8c200046>] __kmalloc_track_caller+0x126/0x230
>>     [<ffffffff8c1bc581>] kstrdup+0x31/0x60
>>     [<ffffffff8c1bc5d4>] kstrdup_const+0x24/0x30
>>     [<ffffffff8c3c23aa>] kvasprintf_const+0x7a/0x90
>>     [<ffffffff8c3b5481>] kobject_set_name_vargs+0x21/0x90
>>     [<ffffffff8c4fbdd7>] dev_set_name+0x47/0x50
>>     [<ffffffffc07819e5>] memstick_check+0x95/0x33c [memstick]
>>     [<ffffffff8c09c893>] process_one_work+0x1f3/0x4b0
>>     [<ffffffff8c09cb98>] worker_thread+0x48/0x4e0
>>     [<ffffffff8c0a2b79>] kthread+0xc9/0xe0
>>     [<ffffffff8c6dab5f>] ret_from_fork+0x1f/0x40
>>     [<ffffffffffffffff>] 0xffffffffffffffff
>>
>> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
>
> Acked-by: Kees Cook <keescook@chromium.org>
>
> Is this worth sending through -stable too?

Yes, for some reason git-send e-mail complained to me about
stable@kernel.org not being a valid local address, so I had to remove
it, but indeed. I'll try to fix this e-mail issue later and add your
tag.

 Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 02/10] module: fix memory leak on early load_module() failures
  2016-12-08 21:10     ` Luis R. Rodriguez
@ 2016-12-08 21:17       ` Kees Cook
  0 siblings, 0 replies; 65+ messages in thread
From: Kees Cook @ 2016-12-08 21:17 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, Jessica Yu, Rusty Russell, Eric W. Biederman,
	Dmitry Torokhov, Arnaldo Carvalho de Melo, Jonathan Corbet,
	martin.wilck, Michal Marek, Petr Mladek, hare, rwright,
	Jeff Mahoney, DSterba, Filipe Manana, NeilBrown, Guenter Roeck,
	rgoldwyn, subashab, Heinrich Schuchardt, Aaron Tomlin,
	Miroslav Benes, Paul E. McKenney, Dan Williams, Josh Poimboeuf,
	David S. Miller, Ingo Molnar, Andrew Morton, Linus Torvalds,
	linux-kselftest, linux-doc, LKML

On Thu, Dec 8, 2016 at 1:10 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Thu, Dec 8, 2016 at 2:30 PM, Kees Cook <keescook@chromium.org> wrote:
>> On Thu, Dec 8, 2016 at 11:48 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>>> While looking for early possible module loading failures I was
>>> able to reproduce a memory leak possible with kmemleak. There
>>> are a few rare ways to trigger a failure:
>>>
>>>   o we've run into a failure while processing kernel parameters
>>>     (parse_args() returns an error)
>>>   o mod_sysfs_setup() fails
>>>   o we're a live patch module and copy_module_elf() fails
>>>
>>> Chances of running into this issue is really low.
>>>
>>> kmemleak splat:
>>>
>>> unreferenced object 0xffff9f2c4ada1b00 (size 32):
>>>   comm "kworker/u16:4", pid 82, jiffies 4294897636 (age 681.816s)
>>>   hex dump (first 32 bytes):
>>>     6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00  memstick0.......
>>>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>>   backtrace:
>>>     [<ffffffff8c6cfeba>] kmemleak_alloc+0x4a/0xa0
>>>     [<ffffffff8c200046>] __kmalloc_track_caller+0x126/0x230
>>>     [<ffffffff8c1bc581>] kstrdup+0x31/0x60
>>>     [<ffffffff8c1bc5d4>] kstrdup_const+0x24/0x30
>>>     [<ffffffff8c3c23aa>] kvasprintf_const+0x7a/0x90
>>>     [<ffffffff8c3b5481>] kobject_set_name_vargs+0x21/0x90
>>>     [<ffffffff8c4fbdd7>] dev_set_name+0x47/0x50
>>>     [<ffffffffc07819e5>] memstick_check+0x95/0x33c [memstick]
>>>     [<ffffffff8c09c893>] process_one_work+0x1f3/0x4b0
>>>     [<ffffffff8c09cb98>] worker_thread+0x48/0x4e0
>>>     [<ffffffff8c0a2b79>] kthread+0xc9/0xe0
>>>     [<ffffffff8c6dab5f>] ret_from_fork+0x1f/0x40
>>>     [<ffffffffffffffff>] 0xffffffffffffffff
>>>
>>> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
>>
>> Acked-by: Kees Cook <keescook@chromium.org>
>>
>> Is this worth sending through -stable too?
>
> Yes, for some reason git-send e-mail complained to me about
> stable@kernel.org not being a valid local address, so I had to remove
> it, but indeed. I'll try to fix this e-mail issue later and add your
> tag.

Yup, you want stable@vger.kernel.org. :)

-Kees

-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 02/10] module: fix memory leak on early load_module() failures
  2016-12-08 19:48 ` [RFC 02/10] module: fix memory leak on early load_module() failures Luis R. Rodriguez
  2016-12-08 20:30   ` Kees Cook
@ 2016-12-09 17:06   ` Miroslav Benes
  2016-12-16  8:51     ` Luis R. Rodriguez
  2016-12-15 18:46   ` Aaron Tomlin
  2 siblings, 1 reply; 65+ messages in thread
From: Miroslav Benes @ 2016-12-09 17:06 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet,
	martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, paulmck, dan.j.williams, jpoimboe, davem,
	mingo, akpm, torvalds, linux-kselftest, linux-doc, linux-kernel

On Thu, 8 Dec 2016, Luis R. Rodriguez wrote:

> While looking for early possible module loading failures I was
> able to reproduce a memory leak possible with kmemleak. There
> are a few rare ways to trigger a failure:
> 
>   o we've run into a failure while processing kernel parameters
>     (parse_args() returns an error)
>   o mod_sysfs_setup() fails
>   o we're a live patch module and copy_module_elf() fails
> 
> Chances of running into this issue is really low.
> 
> kmemleak splat:
> 
> unreferenced object 0xffff9f2c4ada1b00 (size 32):
>   comm "kworker/u16:4", pid 82, jiffies 4294897636 (age 681.816s)
>   hex dump (first 32 bytes):
>     6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00  memstick0.......
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<ffffffff8c6cfeba>] kmemleak_alloc+0x4a/0xa0
>     [<ffffffff8c200046>] __kmalloc_track_caller+0x126/0x230
>     [<ffffffff8c1bc581>] kstrdup+0x31/0x60
>     [<ffffffff8c1bc5d4>] kstrdup_const+0x24/0x30
>     [<ffffffff8c3c23aa>] kvasprintf_const+0x7a/0x90
>     [<ffffffff8c3b5481>] kobject_set_name_vargs+0x21/0x90
>     [<ffffffff8c4fbdd7>] dev_set_name+0x47/0x50
>     [<ffffffffc07819e5>] memstick_check+0x95/0x33c [memstick]
>     [<ffffffff8c09c893>] process_one_work+0x1f3/0x4b0
>     [<ffffffff8c09cb98>] worker_thread+0x48/0x4e0
>     [<ffffffff8c0a2b79>] kthread+0xc9/0xe0
>     [<ffffffff8c6dab5f>] ret_from_fork+0x1f/0x40
>     [<ffffffffffffffff>] 0xffffffffffffffff
> 
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

What about

Fixes: e180a6b7759a ("param: fix charp parameters set via sysfs")

?

Miroslav

> ---
>  kernel/module.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/kernel/module.c b/kernel/module.c
> index f7482db0f843..e420ed67e533 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -3722,6 +3722,7 @@ static int load_module(struct load_info *info, const char __user *uargs,
>  	mod_sysfs_teardown(mod);
>   coming_cleanup:
>  	mod->state = MODULE_STATE_GOING;
> +	destroy_params(mod->kp, mod->num_kp);
>  	blocking_notifier_call_chain(&module_notify_list,
>  				     MODULE_STATE_GOING, mod);
>  	klp_module_going(mod);
> -- 
> 2.10.1
> 

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 10/10] kmod: add a sanity check on module loading
  2016-12-08 19:49 ` [RFC 10/10] kmod: add a sanity check on module loading Luis R. Rodriguez
@ 2016-12-09 20:03   ` Martin Wilck
  2016-12-09 20:56     ` Linus Torvalds
  2016-12-15  0:27   ` Rusty Russell
  2017-01-04  2:47   ` Jessica Yu
  2 siblings, 1 reply; 65+ messages in thread
From: Martin Wilck @ 2016-12-09 20:03 UTC (permalink / raw)
  To: Luis R. Rodriguez, shuah, jeyu, rusty, ebiederm, dmitry.torokhov,
	acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel

On Thu, 2016-12-08 at 11:49 -0800, Luis R. Rodriguez wrote:
> 
> Although this does get us in the business of keeping alias maps in
> kernel, the the work to support and maintain this is trivial.

You've implemented a special treatment for request_module("fs-$X")in
finished_kmod_load(), but there are many more aliases defined (and
used) in the kernel. Do you plan to implement special code for "char-
major-$X", "crypto-$X", "binfmt-$X" etc. later? 

Regards
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 10/10] kmod: add a sanity check on module loading
  2016-12-09 20:03   ` Martin Wilck
@ 2016-12-09 20:56     ` Linus Torvalds
  2016-12-15 18:08       ` Luis R. Rodriguez
  0 siblings, 1 reply; 65+ messages in thread
From: Linus Torvalds @ 2016-12-09 20:56 UTC (permalink / raw)
  To: Martin Wilck
  Cc: Luis R. Rodriguez, shuah, Jessica Yu, Rusty Russell,
	Eric W. Biederman, Dmitry Torokhov, Arnaldo Carvalho de Melo,
	Jonathan Corbet, martin.wilck, Michal Marek, Petr Mladek,
	Hannes Reinecke, rwright, Jeff Mahoney, David Sterba, fdmanana,
	NeilBrown, Guenter Roeck, Goldwyn Rodrigues, subashab,
	Heinrich Schuchardt, Kees Cook, atomlin, mbenes, Paul McKenney,
	Dan Williams, Josh Poimboeuf, David Miller, Ingo Molnar,
	Andrew Morton, linux-kselftest, open list:DOCUMENTATION,
	Linux Kernel Mailing List

On Fri, Dec 9, 2016 at 12:03 PM, Martin Wilck <mwilck@suse.com> wrote:
> On Thu, 2016-12-08 at 11:49 -0800, Luis R. Rodriguez wrote:
>>
>> Although this does get us in the business of keeping alias maps in
>> kernel, the the work to support and maintain this is trivial.
>
> You've implemented a special treatment for request_module("fs-$X")in
> finished_kmod_load(), but there are many more aliases defined (and
> used) in the kernel. Do you plan to implement special code for "char-
> major-$X", "crypto-$X", "binfmt-$X" etc. later?

Yeah, no, that is just complete garbage.

Those module aliases already exist in the module info section. We just
don't parse the alias tags in the kernel.

So the real fix is to make find_module_all() just do that.

Doing random ad-hoc "let's prefix with 'fs-xyz'" games are completely
unacceptable. That's just pure shit. Stop this idiocy.

                Linus

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 01/10] kmod: add test driver to stress test the module loader
  2016-12-08 20:24   ` Kees Cook
@ 2016-12-13 21:10     ` Luis R. Rodriguez
  2016-12-16  7:41       ` Luis R. Rodriguez
  0 siblings, 1 reply; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-13 21:10 UTC (permalink / raw)
  To: Kees Cook
  Cc: Luis R. Rodriguez, shuah, Jessica Yu, Rusty Russell,
	Arnd Bergmann, Eric W. Biederman, Dmitry Torokhov,
	Arnaldo Carvalho de Melo, Jonathan Corbet, martin.wilck,
	Michal Marek, Petr Mladek, hare, rwright, Jeff Mahoney, DSterba,
	fdmanana, neilb, rgoldwyn, subashab, Heinrich Schuchardt,
	Aaron Tomlin, mbenes, Paul E. McKenney, Dan Williams,
	Josh Poimboeuf, David S. Miller, Ingo Molnar, Andrew Morton,
	Linus Torvalds, linux-kselftest, linux-doc, LKML

On Thu, Dec 08, 2016 at 12:24:35PM -0800, Kees Cook wrote:
> On Thu, Dec 8, 2016 at 10:47 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > The kmod.sh script uses the above constructs to build differnt test cases.
> 
> Typo: different

Fixed.

> > 3) finit_module() consumes quite a bit of memory.
> 
> Is this due to reading the module into kernel memory or something else?

Very likely yes, but to be honest I have not had chance to instrument too
carefully, its TODO work :)

> > 4) Filesystems typically also have more dependent modules than other
> > modules, its important to note though that even though a get_fs_type() call
> > does not incur additional kmod_concurrent bumps, since userspace
> > loads dependencies it finds it needs via finit_module_fd(), it *will*
> > take much more memory to load a module with a lot of dependencies.
> >
> > Because of 3) and 4) we will easily run into out of memory failures
> > with certain tests. For instance test 0006 fails on qemu with 1024 MiB
> > of RAM. It panics a box after reaping all userspace processes and still
> > not having enough memory to reap.
> 
> Are the buffers not released until after all the dependent modules are
> loaded? I thought it would load one by one?

kmod.c allows up to kmod_concurrent concurrent requests out to userspace,
how it handles this is up to userspace, but note that prior to the knobs
exposed in this patch set userspace neither knew what kmod_concurrent
was nor how many concurrent threads are active at any point in time.

> > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> 
> This is a great selftest, thanks for working on it!

My pleasure.

> Notes below...
> 
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index 7446097f72bd..6cad548e0682 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -1994,6 +1994,31 @@ config BUG_ON_DATA_CORRUPTION
> >
> >           If unsure, say N.
> >
> > +config TEST_KMOD
> > +       tristate "kmod stress tester"
> > +       default n
> > +       depends on m
> > +       select TEST_LKM
> > +       select XFS_FS
> > +       select TUN
> > +       select BTRFS_FS
> 
> Since the desired FS can be changed at runtime, maybe these selects
> aren't needed?

Well yes and no, yes because its the defaults built-in. No, because as you note
we can alter the defaults in userspace. Without the alternatives being set the
driver will not really work at all though. Here is an example where Arnd's
kconfig "suggests" for kconfig could come in handy. Until we have that I think
I'd prefer to keep it this way.

> > diff --git a/lib/test_kmod.c b/lib/test_kmod.c
> > new file mode 100644
> > index 000000000000..63fded83b9b6
> > --- /dev/null
> > +++ b/lib/test_kmod.c
> > @@ -0,0 +1,1248 @@

> > +static bool force_init_test = false;
> > +module_param(force_init_test, bool_enable_only, 0644);
> > +MODULE_PARM_DESC(force_init_test,
> > +                "Force kicking a test immediatley after driver loads");
> 
> Typo: immediately

Fixed.

> > +static ssize_t config_show(struct device *dev,
> > +                          struct device_attribute *attr,
> > +                          char *buf)
> > +{
> > +       struct kmod_test_device *test_dev = dev_to_test_dev(dev);
> > +       struct test_config *config = &test_dev->config;
> > +       int len = 0;
> > +
> > +       mutex_lock(&test_dev->config_mutex);
> > +
> > +       len += sprintf(buf, "Custom trigger configuration for: %s\n",
> > +                      dev_name(dev));
> > +
> > +       len += sprintf(buf+len, "Number of threads:\t%u\n",
> > +                      config->num_threads);
> > +
> > +       len += sprintf(buf+len, "Test_case:\t%s (%u)\n",
> > +                      test_case_str(config->test_case),
> > +                      config->test_case);
> > +
> > +       if (config->test_driver)
> > +               len += sprintf(buf+len, "driver:\t%s\n",
> > +                              config->test_driver);
> > +       else
> > +               len += sprintf(buf+len, "driver:\tEMTPY\n");
> > +
> > +       if (config->test_fs)
> > +               len += sprintf(buf+len, "fs:\t%s\n",
> > +                              config->test_fs);
> > +       else
> > +               len += sprintf(buf+len, "fs:\tEMTPY\n");
> 
> These should all use snprintf...

Fixed. If the caller is sysfs_kf_seq_show() then max is PAGE_SIZE, will
use that as the limit to start with.

> > +static ssize_t config_test_driver_show(struct device *dev,
> > +                                       struct device_attribute *attr,
> > +                                       char *buf)
> > +{
> > +       struct kmod_test_device *test_dev = dev_to_test_dev(dev);
> > +       struct test_config *config = &test_dev->config;
> > +
> > +       mutex_lock(&test_dev->config_mutex);
> > +       strcpy(buf, config->test_driver);
> > +       strcat(buf, "\n");
> 
> IIUC, the show/store API uses a max size of PAGE_SIZE. If that's
> correct, it's possible that this show routine could write past the end
> of buf, due to the end newline, etc. Best to use snprintf like you do
> below for the other shows.

Sure.

> > +static ssize_t config_test_fs_show(struct device *dev,
> > +                                  struct device_attribute *attr,
> > +                                  char *buf)
> > +{
> > +       struct kmod_test_device *test_dev = dev_to_test_dev(dev);
> > +       struct test_config *config = &test_dev->config;
> > +
> > +       mutex_lock(&test_dev->config_mutex);
> > +       strcpy(buf, config->test_fs);
> > +       strcat(buf, "\n");
> > +       mutex_unlock(&test_dev->config_mutex);
> 
> Same here... (which, btw, could likely use to be a helper function,
> the show and store functions here are identical except for test_driver
> vs test_fs).

Sure, I'm starting to think a lot of test boiler plate for setup and show of
config stuff could be shared. We can consider this more once we have a few more
test drivers like this. I have 3 total now in the pipeline.

> > +
> > +       return strlen(buf) + 1;
> > +}
> > +static DEVICE_ATTR(config_test_fs, 0644, config_test_fs_show,
> > +                  config_test_fs_store);
> > +
> > +static int trigger_config_run_driver(struct kmod_test_device *test_dev,
> > +                                    const char *test_driver)
> > +{
> > +       int copied;
> > +       struct test_config *config = &test_dev->config;
> > +
> > +       mutex_lock(&test_dev->config_mutex);
> > +
> > +       config->test_case = TEST_KMOD_DRIVER;
> > +
> > +       kfree_const(config->test_driver);
> > +       config->test_driver = NULL;
> > +
> > +       copied = config_copy_test_driver_name(config, test_driver,
> > +                                             strlen(test_driver));
> > +       mutex_unlock(&test_dev->config_mutex);
> > +
> > +       if (copied != strlen(test_driver)) {
> 
> Can't these copied tests just check < 0? (i.e. avoid the repeated
> strlen which can be fragile.)

Sure, it can be:

if (copied <= 0 || copied != strlen(test_driver)) {

That way its both a negative check and also that something
non-empty was passed.

> > +               test_dev->test_is_oom = true;
> > +               return -EINVAL;

And come to think of it, these should return -ENOMEM;
> > +       }
> > +
> > +       test_dev->test_is_oom = false;
> > +
> > +       return trigger_config_run(test_dev);
> > +}
> > +
> > +static int trigger_config_run_fs(struct kmod_test_device *test_dev,
> > +                                const char *fs_type)
> > +{
> > +       int copied;
> > +       struct test_config *config = &test_dev->config;
> > +
> > +       mutex_lock(&test_dev->config_mutex);
> > +       config->test_case = TEST_KMOD_FS_TYPE;
> > +
> > +       kfree_const(config->test_fs);
> > +       config->test_driver = NULL;
> > +
> > +       copied = config_copy_test_fs(config, fs_type, strlen(fs_type));
> > +       mutex_unlock(&test_dev->config_mutex);
> > +
> > +       if (copied != strlen(fs_type)) {
> > +               test_dev->test_is_oom = true;
> > +               return -EINVAL;
> > +       }
> > +
> > +       test_dev->test_is_oom = false;
> > +
> > +       return trigger_config_run(test_dev);
> > +}
> 
> These two functions are almost identical too. Only test_case and the
> copy function change...

They are now shared.

> > +static void free_test_dev_info(struct kmod_test_device *test_dev)
> > +{
> > +       if (test_dev->info) {
> > +               vfree(test_dev->info);
> > +               test_dev->info = NULL;
> > +       }
> > +}
> 
> vfree() already checks for NULL, you can drop the if.

Fixed.

> > +static int test_dev_config_update_uint_range(struct kmod_test_device *test_dev,
> > +                                            const char *buf, size_t size,
> > +                                            unsigned int *config,
> > +                                            unsigned int min,
> > +                                            unsigned int max)
> > +{
> > +       char *end;
> > +       long new = simple_strtol(buf, &end, 0);
> > +       if (end == buf || new < min || new >  max || new > UINT_MAX)
> > +               return -EINVAL;
> > +
> > +       mutex_lock(&test_dev->config_mutex);
> > +       *(unsigned int *)config = new;
> 
> config is already an unsigned int *, why cast?

Fixed.

> > +static int test_dev_config_update_int(struct kmod_test_device *test_dev,
> > +                                     const char *buf, size_t size,
> > +                                     int *config)
> > +{
> > +       char *end;
> > +       long new = simple_strtol(buf, &end, 0);
> > +       if (end == buf || new > INT_MAX || new < INT_MIN)
> > +               return -EINVAL;
> > +       mutex_lock(&test_dev->config_mutex);
> > +       *(int *)config = new;
> 
> config is already an int *, why cast?

Fixed.

> > +/*
> > + * XXX: this could perhaps be made generic already too, but a hunt
> > + * for actual users would be needed first. It could be generic
> > + * if other test drivers end up using a similar mechanism.
> > + */
> > +const char *test_dev_get_name(const char *base, int idx, gfp_t gfp)
> > +{
> > +       const char *name_const;
> > +       char *name;
> > +
> > +       if (!base)
> > +               return NULL;
> > +       if (strlen(base) > 30)
> > +               return NULL;
> 
> why?

It was an arbitrary limit, will use PAGE_SIZE. But I'll just remove the
entire routine (see below).
> 
> > +       name = kzalloc(1024, gfp);
> > +       if (!name)
> > +               return NULL;
> > +
> > +       strncat(name, base, strlen(base));
> > +       sprintf(name+(strlen(base)), "%d", idx);
> > +       name_const = kstrdup_const(name, gfp);
> > +
> > +       kfree(name);
> > +
> > +       return name_const;
> > +}
> 
> What is going on here? Why not just:
>     return kasprintf(gfp, "%s%d", base, idx);
>
> For all of that code? And kstrdup_const is pointless here since it'll
> always just do the dup (as the kmalloc source isn't in rodata).

Heh, yeah, true, nuked.

> > diff --git a/tools/testing/selftests/kmod/config b/tools/testing/selftests/kmod/config
> > new file mode 100644
> > index 000000000000..259f4fd6b5e2
> > --- /dev/null
> > +++ b/tools/testing/selftests/kmod/config
> > @@ -0,0 +1,7 @@
> > +CONFIG_TEST_KMOD=m
> > +CONFIG_TEST_LKM=m
> > +CONFIG_XFS_FS=m
> > +
> > +# For the module parameter force_init_test is used
> > +CONFIG_TUN=m
> > +CONFIG_BTRFS_FS=m
> > diff --git a/tools/testing/selftests/kmod/kmod.sh b/tools/testing/selftests/kmod/kmod.sh
> > new file mode 100755
> > index 000000000000..9ea1864d8bae
> > --- /dev/null
> > +++ b/tools/testing/selftests/kmod/kmod.sh
> > @@ -0,0 +1,449 @@
> > +#!/bin/bash
> > +#

<-- snip -->

> > +# You'll want at least 4096 GiB of RAM to expect to run these tests
> 
> 4TiB of RAM? I assume this was meant to be 4 GiB not 4096?

Whoops, yeah sorry 4 GiB only.

> > +# without running out of memory on them. For other requirements refer
> > +# to test_reqs()
> > +
> > +set -e
> > +
> > +TEST_DRIVER="test_kmod"
> > +
> > +function allow_user_defaults()
> > +{
> > +       if [ -z $DEFAULT_KMOD_DRIVER ]; then
> > +               DEFAULT_KMOD_DRIVER="test_module"
> > +       fi
> > +
> > +       if [ -z $DEFAULT_KMOD_FS ]; then
> > +               DEFAULT_KMOD_FS="xfs"
> > +       fi
> > +
> > +       if [ -z $PROC_DIR ]; then
> > +               PROC_DIR="/proc/sys/kernel/"
> > +       fi
> > +
> > +       if [ -z $MODPROBE_LIMIT ]; then
> > +               MODPROBE_LIMIT=50
> > +       fi
> > +
> > +       if [ -z $DIR ]; then
> > +               DIR="/sys/devices/virtual/misc/${TEST_DRIVER}0/"
> > +       fi
> > +
> > +       MODPROBE_LIMIT_FILE="${PROC_DIR}/kmod-limit"
> > +}
> > +
> > +test_reqs()
> > +{
> > +       if ! which modprobe 2> /dev/null > /dev/null; then
> > +               echo "$0: You need modprobe installed"
> 
> While not a huge deal, I prefer that error messages end up on stderr,
> so adding >&2 to all the failure echos (or providing an err function)
> would be nice. (This happens in later places...)

Addressed.

> > +function load_req_mod()
> > +{
> > +       if [ ! -d $DIR ]; then
> > +               # Alanis: "Oh isn't it ironic?"
> > +               modprobe $TEST_DRIVER
> > +               if [ ! -d $DIR ]; then
> > +                       echo "$0: $DIR not present"
> > +                       echo "You must have the following enabled in your kernel:"
> > +                       cat $PWD/config
> 
> I like this (minimum config in the test directory). Are other tests
> doing this too?

mcgrof@ergon ~/linux-next (git::(no branch, rebasing 20161213-kmod-test-driver))$ find tools/testing/selftests/ -name config
tools/testing/selftests/static_keys/config
tools/testing/selftests/cpu-hotplug/config
tools/testing/selftests/ipc/config
tools/testing/selftests/mount/config
tools/testing/selftests/zram/config
tools/testing/selftests/seccomp/config
tools/testing/selftests/memory-hotplug/config
tools/testing/selftests/vm/config
tools/testing/selftests/ftrace/config
tools/testing/selftests/pstore/config
tools/testing/selftests/firmware/config
tools/testing/selftests/net/config
tools/testing/selftests/bpf/config
tools/testing/selftests/user/config
tools/testing/selftests/kmod/config

Seems like a hipster trend.

> > +# Once tese are enabled please leave them as-is. Write your own test,
> > +# we have tons of space.
> > +kmod_test_0001
> > +kmod_test_0002
> > +kmod_test_0003
> > +kmod_test_0004
> > +kmod_test_0005
> > +kmod_test_0006
> > +kmod_test_0007
> > +
> > +#kmod_test_0008
> > +#kmod_test_0009
> 
> While it's documented in the commit log, I think a short note for each
> disabled test should be added here too.

Will do, thanks so much for the review!

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 03/10] kmod: add dynamic max concurrent thread count
  2016-12-08 19:48 ` [RFC 03/10] kmod: add dynamic max concurrent thread count Luis R. Rodriguez
  2016-12-08 20:28   ` Kees Cook
@ 2016-12-14 15:38   ` Petr Mladek
  2016-12-16  8:39     ` Luis R. Rodriguez
  1 sibling, 1 reply; 65+ messages in thread
From: Petr Mladek @ 2016-12-14 15:38 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet,
	martin.wilck, mmarek, hare, rwright, jeffm, DSterba, fdmanana,
	neilb, linux, rgoldwyn, subashab, xypron.glpk, keescook, atomlin,
	mbenes, paulmck, dan.j.williams, jpoimboe, davem, mingo, akpm,
	torvalds, linux-kselftest, linux-doc, linux-kernel

On Thu 2016-12-08 11:48:14, Luis R. Rodriguez wrote:
> We currently statically limit the number of modprobe threads which
> we allow to run concurrently to 50. As per Keith Owens, this was a
> completely arbitrary value, and it was set in the 2.3.38 days [0]
> over 16 years ago in year 2000.
> 
> Although we haven't yet hit our lower limits, experimentation [1]
> shows that when and if we hit this limit in the worst case, will be
> fatal -- consider get_fs_type() failures upon mount on a system which
> has many partitions, some of which might even be with the same
> filesystem. Its best to be prudent and increase and set this
> value to something more sensible which ensures we're far from hitting
> the limit and also allows default build/user run time override.
> 
> The worst case is fatal given that once a module fails to load there
> is a period of time during which subsequent request for the same module
> will fail, so in the case of partitions its not just one request that
> could fail, but whole series of partitions. This later issue of a
> module request failure domino effect can be addressed later, but
> increasing the limit to something more meaninful should at least give us
> enough cushion to avoid this for a while.
> 
> Set this value up with a bit more meaninful modern limits:
> 
> Bump this up to 64  max for small systems (CONFIG_BASE_SMALL)
> Bump this up to 128 max for larger systems (!CONFIG_BASE_SMALL)
> 
> diff --git a/init/Kconfig b/init/Kconfig
> index 271692a352f1..da2c25746937 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -2111,6 +2111,29 @@ config TRIM_UNUSED_KSYMS
>  
>  	  If unsure, or if you need to build out-of-tree modules, say N.
>  
> +config MAX_KMOD_CONCURRENT
> +	int "Max allowed concurrent request_module() calls (6=>64, 10=>1024)"
> +	range 0 14

Would not too small range break loading module dependencies?
I am not sure how it is implemented but it might require having
some more module loads in progress.

I would give 6 as minimum. Nobody has troubles with the current limit.

> +	default 6 if !BASE_SMALL
> +	default 7 if BASE_SMALL

Aren't the conditions inversed?

> diff --git a/init/main.c b/init/main.c
> index 8161208d4ece..1fa441aa32c6 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -638,6 +638,7 @@ asmlinkage __visible void __init start_kernel(void)
>  	thread_stack_cache_init();
>  	cred_init();
>  	fork_init();
> +	init_kmod_umh();
>  	proc_caches_init();
>  	buffer_init();
>  	key_init();
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index 0277d1216f80..cb6f7ca7b8a5 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -186,6 +174,31 @@ int __request_module(bool wait, const char *fmt, ...)
>  	return ret;
>  }
>  EXPORT_SYMBOL(__request_module);
> +
> +/*
> + * If modprobe needs a service that is in a module, we get a recursive
> + * loop.  Limit the number of running kmod threads to max_threads/2 or
> + * CONFIG_MAX_KMOD_CONCURRENT, whichever is the smaller.  A cleaner method
> + * would be to run the parents of this process, counting how many times
> + * kmod was invoked.  That would mean accessing the internals of the
> + * process tables to get the command line, proc_pid_cmdline is static
> + * and it is not worth changing the proc code just to handle this case.
> + *
> + * "trace the ppid" is simple, but will fail if someone's
> + * parent exits.  I think this is as good as it gets.
> + *
> + * You can override with with a kernel parameter, for instance to allow
> + * 4096 concurrent modprobe instances:
> + *
> + *	kmod.max_modprobes=4096
> + */
> +void __init init_kmod_umh(void)
> +{
> +	if (!max_modprobes)
> +		max_modprobes = min(max_threads/2,
> +				    2 << CONFIG_MAX_KMOD_CONCURRENT);

This should be

	1 << CONFIG_MAX_KMOD_CONCURRENT);

1 << 1 = 2;

Note that this calculation is mentioned also some comments and
documentation.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 06/10] kmod: provide sanity check on kmod_concurrent access
  2016-12-08 19:48 ` [RFC 06/10] kmod: provide sanity check on kmod_concurrent access Luis R. Rodriguez
@ 2016-12-14 16:08   ` Petr Mladek
  2016-12-14 17:12     ` Luis R. Rodriguez
  2016-12-15 12:57   ` Petr Mladek
  1 sibling, 1 reply; 65+ messages in thread
From: Petr Mladek @ 2016-12-14 16:08 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet,
	martin.wilck, mmarek, hare, rwright, jeffm, DSterba, fdmanana,
	neilb, linux, rgoldwyn, subashab, xypron.glpk, keescook, atomlin,
	mbenes, paulmck, dan.j.williams, jpoimboe, davem, mingo, akpm,
	torvalds, linux-kselftest, linux-doc, linux-kernel

On Thu 2016-12-08 11:48:50, Luis R. Rodriguez wrote:
> Only decrement *iff* we're possitive. Warn if we've hit
> a situation where the counter is already 0 after we're done
> with a modprobe call, this would tell us we have an unaccounted
> counter access -- this in theory should not be possible as
> only one routine controls the counter, however preemption is
> one case that could trigger this situation. Avoid that situation
> by disabling preemptiong while we access the counter.

I am curious about it. How could enabled preemption cause that
the counter will get negative?

Unaccounted access would be possible if put() is called
without get() or if put() is called before get().

I do not see a way how the value might get negative when
the calls are paired and ordered.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 07/10] kmod: use simplified rate limit printk
  2016-12-08 19:49 ` [RFC 07/10] kmod: use simplified rate limit printk Luis R. Rodriguez
@ 2016-12-14 16:23   ` Petr Mladek
  2016-12-14 16:41     ` Joe Perches
  2016-12-16  8:44     ` Luis R. Rodriguez
  0 siblings, 2 replies; 65+ messages in thread
From: Petr Mladek @ 2016-12-14 16:23 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet,
	martin.wilck, mmarek, hare, rwright, jeffm, DSterba, fdmanana,
	neilb, linux, rgoldwyn, subashab, xypron.glpk, keescook, atomlin,
	mbenes, paulmck, dan.j.williams, jpoimboe, davem, mingo, akpm,
	torvalds, linux-kselftest, linux-doc, linux-kernel

On Thu 2016-12-08 11:49:01, Luis R. Rodriguez wrote:
> Just use the simplified rate limit printk when the max modprobe
> limit is reached, while at it throw out a bone should the error
> be triggered.
> 
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> ---
>  kernel/kmod.c | 10 ++--------
>  1 file changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index 09cf35a2075a..ef65f4c3578a 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -158,7 +158,6 @@ int __request_module(bool wait, const char *fmt, ...)
>  	va_list args;
>  	char module_name[MODULE_NAME_LEN];
>  	int ret;
> -	static int kmod_loop_msg;
>  
>  	/*
>  	 * We don't allow synchronous module loading from async.  Module
> @@ -183,13 +182,8 @@ int __request_module(bool wait, const char *fmt, ...)
>  
>  	ret = kmod_umh_threads_get();
>  	if (ret) {
> -		/* We may be blaming an innocent here, but unlikely */
> -		if (kmod_loop_msg < 5) {
> -			printk(KERN_ERR
> -			       "request_module: runaway loop modprobe %s\n",
> -			       module_name);
> -			kmod_loop_msg++;
> -		}
> +		pr_err_ratelimited("request_module: modprobe limit (%u) reached with module %s\n",
> +				   max_modprobes, module_name);

I like this change. I would only be even more descriptive in which
limit is reached. Something like

		pr_err_ratelimited("request_module: module \"%s\" reached limit (%u) of concurrent modprobe calls\n",
				   module_name, max_modprobes);

Either way, feel free to add:

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 07/10] kmod: use simplified rate limit printk
  2016-12-14 16:23   ` Petr Mladek
@ 2016-12-14 16:41     ` Joe Perches
  2016-12-16  8:44     ` Luis R. Rodriguez
  1 sibling, 0 replies; 65+ messages in thread
From: Joe Perches @ 2016-12-14 16:41 UTC (permalink / raw)
  To: Petr Mladek, Luis R. Rodriguez
  Cc: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet,
	martin.wilck, mmarek, hare, rwright, jeffm, DSterba, fdmanana,
	neilb, linux, rgoldwyn, subashab, xypron.glpk, keescook, atomlin,
	mbenes, paulmck, dan.j.williams, jpoimboe, davem, mingo, akpm,
	torvalds, linux-kselftest, linux-doc, linux-kernel

On Wed, 2016-12-14 at 17:23 +0100, Petr Mladek wrote:
> On Thu 2016-12-08 11:49:01, Luis R. Rodriguez wrote:
> > Just use the simplified rate limit printk when the max modprobe
> > limit is reached, while at it throw out a bone should the error
> > be triggered.
[]
> > diff --git a/kernel/kmod.c b/kernel/kmod.c
[]
> > @@ -183,13 +182,8 @@ int __request_module(bool wait, const char *fmt, ...)
> >  
> >  	ret = kmod_umh_threads_get();
> >  	if (ret) {
> > -		/* We may be blaming an innocent here, but unlikely */
> > -		if (kmod_loop_msg < 5) {
> > -			printk(KERN_ERR
> > -			       "request_module: runaway loop modprobe %s\n",
> > -			       module_name);
> > -			kmod_loop_msg++;
> > -		}
> > +		pr_err_ratelimited("request_module: modprobe limit (%u) reached with module %s\n",
> > +				   max_modprobes, module_name);
> 
> I like this change. I would only be even more descriptive in which
> limit is reached. Something like
> 
> 		pr_err_ratelimited("request_module: module \"%s\" reached limit (%u) of concurrent modprobe calls\n",
> 				   module_name, max_modprobes);
> 
> Either way, feel free to add:
> 
> Reviewed-by: Petr Mladek <pmladek@suse.com>

Seems sensible.

I suggest using "%s: ", __func__ instead of embedding
the function name.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 06/10] kmod: provide sanity check on kmod_concurrent access
  2016-12-14 16:08   ` Petr Mladek
@ 2016-12-14 17:12     ` Luis R. Rodriguez
  0 siblings, 0 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-14 17:12 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Luis R. Rodriguez, shuah, jeyu, rusty, ebiederm, dmitry.torokhov,
	acme, corbet, martin.wilck, mmarek, hare, rwright, jeffm,
	DSterba, fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel

On Wed, Dec 14, 2016 at 05:08:58PM +0100, Petr Mladek wrote:
> On Thu 2016-12-08 11:48:50, Luis R. Rodriguez wrote:
> > Only decrement *iff* we're possitive. Warn if we've hit
> > a situation where the counter is already 0 after we're done
> > with a modprobe call, this would tell us we have an unaccounted
> > counter access -- this in theory should not be possible as
> > only one routine controls the counter, however preemption is
> > one case that could trigger this situation. Avoid that situation
> > by disabling preemptiong while we access the counter.
> 
> I am curious about it. How could enabled preemption cause that
> the counter will get negative?

As the commit log describes today in theory this is not possible
was we have only have one routine controlling the counter. If we
were to expand this then such possibilities become more real.

> Unaccounted access would be possible if put() is called
> without get() or if put() is called before get().

Exactly, so buggy users of the get/put calls in future calls.
I can just drop the preemption disable / enable for now as it
should not be an issue now.

> I do not see a way how the value might get negative when
> the calls are paired and ordered.

Right, this just matches parity with module_put(), its perhaps
*preemptively* too cautious though so I could just drop the
preemption enable/disable for now as that would slow down
things a bit.

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 10/10] kmod: add a sanity check on module loading
  2016-12-08 19:49 ` [RFC 10/10] kmod: add a sanity check on module loading Luis R. Rodriguez
  2016-12-09 20:03   ` Martin Wilck
@ 2016-12-15  0:27   ` Rusty Russell
  2016-12-16  8:31     ` Luis R. Rodriguez
  2017-01-04  2:47   ` Jessica Yu
  2 siblings, 1 reply; 65+ messages in thread
From: Rusty Russell @ 2016-12-15  0:27 UTC (permalink / raw)
  To: Luis R. Rodriguez, shuah, jeyu, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel, Luis R. Rodriguez

"Luis R. Rodriguez" <mcgrof@kernel.org> writes:
> kmod has an optimization in place whereby if a some kernel code
> uses request_module() on a module already loaded we never bother
> userspace as the module already is loaded. This is not true for
> get_fs_type() though as it uses aliases.

Well, the obvious thing to do here is block kmod if we're currently
loading the same module.  Otherwise it has to do some weird spinning
thing in userspace anyway.

We already have module_wq for this, we just need a bit more code to
share the return value; and there's a weird corner case there where we
have "modprobe foo param=invalid" then "modprobe foo param=valid" and we
fail both with -EINVAL, but it's probably not worth fixing.

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 04/10] kmod: provide wrappers for kmod_concurrent inc/dec
  2016-12-08 21:08     ` Luis R. Rodriguez
@ 2016-12-15 12:46       ` Petr Mladek
  2016-12-16  8:05         ` Luis R. Rodriguez
  0 siblings, 1 reply; 65+ messages in thread
From: Petr Mladek @ 2016-12-15 12:46 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Kees Cook, shuah, Jessica Yu, Rusty Russell, Eric W. Biederman,
	Dmitry Torokhov, Arnaldo Carvalho de Melo, Jonathan Corbet,
	martin.wilck, Michal Marek, hare, rwright, Jeff Mahoney, DSterba,
	fdmanana, neilb, Guenter Roeck, rgoldwyn, subashab,
	Heinrich Schuchardt, Aaron Tomlin, mbenes, Paul E. McKenney,
	Dan Williams, Josh Poimboeuf, David S. Miller, Ingo Molnar,
	Andrew Morton, Linus Torvalds, linux-kselftest, linux-doc, LKML

On Thu 2016-12-08 22:08:59, Luis R. Rodriguez wrote:
> On Thu, Dec 08, 2016 at 12:29:42PM -0800, Kees Cook wrote:
> > On Thu, Dec 8, 2016 at 11:48 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > kmod_concurrent is used as an atomic counter for enabling
> > > the allowed limit of modprobe calls, provide wrappers for it
> > > to enable this to be expanded on more easily. This will be done
> > > later.
> > >
> > > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > > ---
> > >  kernel/kmod.c | 27 +++++++++++++++++++++------
> > >  1 file changed, 21 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > > index cb6f7ca7b8a5..049d7eabda38 100644
> > > --- a/kernel/kmod.c
> > > +++ b/kernel/kmod.c
> > > @@ -108,6 +111,20 @@ static int call_modprobe(char *module_name, int wait)
> > >         return -ENOMEM;
> > >  }
> > >
> > > +static int kmod_umh_threads_get(void)
> > > +{
> > > +       atomic_inc(&kmod_concurrent);

This approach might actually cause false failures. If we
are on the limit and more processes do this increment
in parallel, it makes the number bigger that it should be.

> > > +       if (atomic_read(&kmod_concurrent) < max_modprobes)
> > > +               return 0;
> > > +       atomic_dec(&kmod_concurrent);
> > > +       return -ENOMEM;
> > > +}
> > > +
> > > +static void kmod_umh_threads_put(void)
> > > +{
> > > +       atomic_dec(&kmod_concurrent);
> > > +}
> > 
> > Can you use a kref here instead? We're trying to kill raw use of
> > atomic_t for reference counting...
> 
> That's a much broader functional change than I was looking for, but I am up for
> it. Can you describe the benefit of using kref you expect or why this is an
> ongoing crusade? Since its a larger functional change how about doing this
> change later, and we can test impact with the tress test driver. In theory if
> there are benefits can't we add a test case to prove the gains?

Kees probably refers to the kref improvements that Peter Zijlstra
is working on, see
https://lkml.kernel.org/r/20161114174446.832175072@infradead.org

The advantage is that the new refcount API handles over and
underflow.

Another advantage is that it increments/decrements the value
only when it is safe. It uses cmpxchg to make sure that
the checks are valid.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 06/10] kmod: provide sanity check on kmod_concurrent access
  2016-12-08 19:48 ` [RFC 06/10] kmod: provide sanity check on kmod_concurrent access Luis R. Rodriguez
  2016-12-14 16:08   ` Petr Mladek
@ 2016-12-15 12:57   ` Petr Mladek
  2017-01-10 20:00     ` Luis R. Rodriguez
  1 sibling, 1 reply; 65+ messages in thread
From: Petr Mladek @ 2016-12-15 12:57 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet,
	martin.wilck, mmarek, hare, rwright, jeffm, DSterba, fdmanana,
	neilb, linux, rgoldwyn, subashab, xypron.glpk, keescook, atomlin,
	mbenes, paulmck, dan.j.williams, jpoimboe, davem, mingo, akpm,
	torvalds, linux-kselftest, linux-doc, linux-kernel

On Thu 2016-12-08 11:48:50, Luis R. Rodriguez wrote:
> Only decrement *iff* we're possitive. Warn if we've hit
> a situation where the counter is already 0 after we're done
> with a modprobe call, this would tell us we have an unaccounted
> counter access -- this in theory should not be possible as
> only one routine controls the counter, however preemption is
> one case that could trigger this situation. Avoid that situation
> by disabling preemptiong while we access the counter.
> 
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> ---
>  kernel/kmod.c | 20 ++++++++++++++++----
>  1 file changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index ab38539f7e91..09cf35a2075a 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -113,16 +113,28 @@ static int call_modprobe(char *module_name, int wait)
>  
>  static int kmod_umh_threads_get(void)
>  {
> +	int ret = 0;
> +
> +	preempt_disable();
>  	atomic_inc(&kmod_concurrent);
>  	if (atomic_read(&kmod_concurrent) < max_modprobes)
> -		return 0;
> -	atomic_dec(&kmod_concurrent);
> -	return -EBUSY;
> +		goto out;

I though more about it and the disabled preemtion might make
sense here. It makes sure that we are not rescheduled here
and that kmod_concurrent is not increased by mistake for too long.

Well, it still would make sense to increment the value
only when it is under the limit and set the incremented
value using cmpxchg to avoid races.

I mean to use similar trick that is used by refcount_inc(), see
https://lkml.kernel.org/r/20161114174446.832175072@infradead.org


> +	atomic_dec_if_positive(&kmod_concurrent);
> +	ret = -EBUSY;
> +out:
> +	preempt_enable();
> +	return 0;
>  }
>  
>  static void kmod_umh_threads_put(void)
>  {
> -	atomic_dec(&kmod_concurrent);
> +	int ret;
> +
> +	preempt_disable();
> +	ret = atomic_dec_if_positive(&kmod_concurrent);
> +	WARN_ON(ret < 0);
> +	preempt_enable();

The disabled preemption does not make much sense here.
We do not need to tie the atomic operation and the WARN
together so tightly.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 09/10] kmod: add helpers for getting kmod count and limit
  2016-12-08 19:49 ` [RFC 09/10] kmod: add helpers for getting kmod count and limit Luis R. Rodriguez
@ 2016-12-15 16:56   ` Petr Mladek
  2016-12-16  7:57     ` Luis R. Rodriguez
  0 siblings, 1 reply; 65+ messages in thread
From: Petr Mladek @ 2016-12-15 16:56 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet,
	martin.wilck, mmarek, hare, rwright, jeffm, DSterba, fdmanana,
	neilb, linux, rgoldwyn, subashab, xypron.glpk, keescook, atomlin,
	mbenes, paulmck, dan.j.williams, jpoimboe, davem, mingo, akpm,
	torvalds, linux-kselftest, linux-doc, linux-kernel

On Thu 2016-12-08 11:49:20, Luis R. Rodriguez wrote:
> This adds helpers for getting access to the kmod count and limit from
> userspace. While at it, this also lets userspace fine tune the kmod
> limit after boot, it uses the shiny new proc_douintvec_minmax().
> 
> These knobs should help userspace more gracefully and deterministically
> handle module loading.
>
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> ---
>  include/linux/kmod.h |  8 +++++
>  kernel/kmod.c        | 83 ++++++++++++++++++++++++++++++++++++++++++++++++++--
>  kernel/sysctl.c      | 14 +++++++++
>  3 files changed, 103 insertions(+), 2 deletions(-)

I am not sure if it is worth it. As you say in the 3rd patch,
there was rather low limit for 16 years and nobody probably had
problems with it.

Anyway, it seems that such know should also get documented in
Documentation/sysctl/kernel.txt

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 10/10] kmod: add a sanity check on module loading
  2016-12-09 20:56     ` Linus Torvalds
@ 2016-12-15 18:08       ` Luis R. Rodriguez
  0 siblings, 0 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-15 18:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Martin Wilck, Luis R. Rodriguez, shuah, Jessica Yu,
	Rusty Russell, Eric W. Biederman, Dmitry Torokhov,
	Arnaldo Carvalho de Melo, Jonathan Corbet, martin.wilck,
	Michal Marek, Petr Mladek, Hannes Reinecke, rwright,
	Jeff Mahoney, David Sterba, fdmanana, NeilBrown, Guenter Roeck,
	Goldwyn Rodrigues, subashab, Heinrich Schuchardt, Kees Cook,
	atomlin, mbenes, Paul McKenney, Dan Williams, Josh Poimboeuf,
	David Miller, Ingo Molnar, Andrew Morton, linux-kselftest,
	open list:DOCUMENTATION, Linux Kernel Mailing List

On Fri, Dec 09, 2016 at 12:56:21PM -0800, Linus Torvalds wrote:
> On Fri, Dec 9, 2016 at 12:03 PM, Martin Wilck <mwilck@suse.com> wrote:
> > On Thu, 2016-12-08 at 11:49 -0800, Luis R. Rodriguez wrote:
> >>
> >> Although this does get us in the business of keeping alias maps in
> >> kernel, the the work to support and maintain this is trivial.
> >
> > You've implemented a special treatment for request_module("fs-$X")in
> > finished_kmod_load(), but there are many more aliases defined (and
> > used) in the kernel. Do you plan to implement special code for "char-
> > major-$X", "crypto-$X", "binfmt-$X" etc. later?
> 
> Yeah, no, that is just complete garbage.
> 
> Those module aliases already exist in the module info section. We just
> don't parse the alias tags in the kernel.
> 
> So the real fix is to make find_module_all() just do that.

Ah yes, that is much sexier, this is now done and it works nicely, thanks
for the suggestion.

> Doing random ad-hoc "let's prefix with 'fs-xyz'" games are completely
> unacceptable. That's just pure shit. Stop this idiocy.

Look at that fin DNA in action :)

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 02/10] module: fix memory leak on early load_module() failures
  2016-12-08 19:48 ` [RFC 02/10] module: fix memory leak on early load_module() failures Luis R. Rodriguez
  2016-12-08 20:30   ` Kees Cook
  2016-12-09 17:06   ` Miroslav Benes
@ 2016-12-15 18:46   ` Aaron Tomlin
  2 siblings, 0 replies; 65+ messages in thread
From: Aaron Tomlin @ 2016-12-15 18:46 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet,
	martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, mbenes, paulmck, dan.j.williams, jpoimboe, davem,
	mingo, akpm, torvalds, linux-kselftest, linux-doc, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1735 bytes --]

On Thu 2016-12-08 11:48 -0800, Luis R. Rodriguez wrote:
> While looking for early possible module loading failures I was
> able to reproduce a memory leak possible with kmemleak. There
> are a few rare ways to trigger a failure:
> 
>   o we've run into a failure while processing kernel parameters
>     (parse_args() returns an error)
>   o mod_sysfs_setup() fails
>   o we're a live patch module and copy_module_elf() fails
> 
> Chances of running into this issue is really low.
> 
> kmemleak splat:
> 
> unreferenced object 0xffff9f2c4ada1b00 (size 32):
>   comm "kworker/u16:4", pid 82, jiffies 4294897636 (age 681.816s)
>   hex dump (first 32 bytes):
>     6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00  memstick0.......
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<ffffffff8c6cfeba>] kmemleak_alloc+0x4a/0xa0
>     [<ffffffff8c200046>] __kmalloc_track_caller+0x126/0x230
>     [<ffffffff8c1bc581>] kstrdup+0x31/0x60
>     [<ffffffff8c1bc5d4>] kstrdup_const+0x24/0x30
>     [<ffffffff8c3c23aa>] kvasprintf_const+0x7a/0x90
>     [<ffffffff8c3b5481>] kobject_set_name_vargs+0x21/0x90
>     [<ffffffff8c4fbdd7>] dev_set_name+0x47/0x50
>     [<ffffffffc07819e5>] memstick_check+0x95/0x33c [memstick]
>     [<ffffffff8c09c893>] process_one_work+0x1f3/0x4b0
>     [<ffffffff8c09cb98>] worker_thread+0x48/0x4e0
>     [<ffffffff8c0a2b79>] kthread+0xc9/0xe0
>     [<ffffffff8c6dab5f>] ret_from_fork+0x1f/0x40
>     [<ffffffffffffffff>] 0xffffffffffffffff
> 
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> ---
>  kernel/module.c | 1 +
>  1 file changed, 1 insertion(+)

Reviewed-by: Aaron Tomlin <atomlin@redhat.com>

-- 
Aaron Tomlin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 01/10] kmod: add test driver to stress test the module loader
  2016-12-13 21:10     ` Luis R. Rodriguez
@ 2016-12-16  7:41       ` Luis R. Rodriguez
  0 siblings, 0 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-16  7:41 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Kees Cook, shuah, Jessica Yu, Rusty Russell, Arnd Bergmann,
	Eric W. Biederman, Dmitry Torokhov, Arnaldo Carvalho de Melo,
	Jonathan Corbet, martin.wilck, Michal Marek, Petr Mladek, hare,
	rwright, Jeff Mahoney, DSterba, fdmanana, neilb, rgoldwyn,
	subashab, Heinrich Schuchardt, Aaron Tomlin, mbenes,
	Paul E. McKenney, Dan Williams, Josh Poimboeuf, David S. Miller,
	Ingo Molnar, Andrew Morton, Linus Torvalds, linux-kselftest,
	linux-doc, LKML

On Tue, Dec 13, 2016 at 10:10:41PM +0100, Luis R. Rodriguez wrote:
> On Thu, Dec 08, 2016 at 12:24:35PM -0800, Kees Cook wrote:
> > On Thu, Dec 8, 2016 at 10:47 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > 3) finit_module() consumes quite a bit of memory.
> > 
> > Is this due to reading the module into kernel memory or something else?
> 
> Very likely yes, but to be honest I have not had chance to instrument too
> carefully, its TODO work :)

I've checked and the issue is since get_fs_type() does not check for
aliases we end up hammering tons of module requests, this in turn is
an analysis on load_module(). Within there layout_and_allocate()
uses first a local copy of the passed user data and mapping it into
a struct module, after a bit of sanity checks it finally allocates a
copy for us, so its struct module size * however many requests were
allowed to get in for load_module(). We could simply avoid an allocation
if the module is already present. I have this as another optimization
now but am running many other tests to compare performance.

> > > +# Once tese are enabled please leave them as-is. Write your own test,
> > > +# we have tons of space.
> > > +kmod_test_0001
> > > +kmod_test_0002
> > > +kmod_test_0003
> > > +kmod_test_0004
> > > +kmod_test_0005
> > > +kmod_test_0006
> > > +kmod_test_0007
> > > +
> > > +#kmod_test_0008
> > > +#kmod_test_0009
> > 
> > While it's documented in the commit log, I think a short note for each
> > disabled test should be added here too.
> 
> Will do, thanks so much for the review!

As I added test 0008's reason for why I think it fails I realized that the reason the test
can sometimes fail is very different than test 0009 which is for get_fs_type(). You see
get_fs_type() hammers kmod concurrent since we don't have an alias check and moprobe
calling fs-xfs for instance does not catch that the module is already loaded so it
delays the get_fs_type() call and so the __request_module() call, hogging up its
kmod concurrent increment.

For direct request_module() calls we don't have the alias issue, but since
we don't check if a module is loaded prior to calling userspace (I now have a fix
for this, reducing this latency does help) it means there are often times the
chances we will pour in tons of requests without them getting processed and
go over the concurrent limit.

I've added a clutch into __request_module() then so instead of just failing
we first check if we're at a threshold (say about 1/4 away from limit) and
if so we let a few threads breath, until they are done. This fixes *both*
test cases without much code changes, however as I've noted in other threads,
this is not the only issue to address.

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 09/10] kmod: add helpers for getting kmod count and limit
  2016-12-15 16:56   ` Petr Mladek
@ 2016-12-16  7:57     ` Luis R. Rodriguez
  2017-01-11 18:27       ` Luis R. Rodriguez
  0 siblings, 1 reply; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-16  7:57 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Luis R. Rodriguez, shuah, jeyu, rusty, ebiederm, dmitry.torokhov,
	acme, corbet, martin.wilck, mmarek, hare, rwright, jeffm,
	DSterba, fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel

On Thu, Dec 15, 2016 at 05:56:19PM +0100, Petr Mladek wrote:
> On Thu 2016-12-08 11:49:20, Luis R. Rodriguez wrote:
> > This adds helpers for getting access to the kmod count and limit from
> > userspace. While at it, this also lets userspace fine tune the kmod
> > limit after boot, it uses the shiny new proc_douintvec_minmax().
> > 
> > These knobs should help userspace more gracefully and deterministically
> > handle module loading.
> >
> > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > ---
> >  include/linux/kmod.h |  8 +++++
> >  kernel/kmod.c        | 83 ++++++++++++++++++++++++++++++++++++++++++++++++++--
> >  kernel/sysctl.c      | 14 +++++++++
> >  3 files changed, 103 insertions(+), 2 deletions(-)
> 
> I am not sure if it is worth it. As you say in the 3rd patch,
> there was rather low limit for 16 years and nobody probably had
> problems with it.

Note, *probably* - ie, this could have gone unreported for a while, and
to be frank how can we know for sure a pesky module just did not load due
to this? In the case of get_fs_type() issue this can be fatal for a partition
mount, not a good example to wait to look forward to before we take this
serious.

I added the sysctl value mostly for read purposes, the count is probably
useless for any accounting to be done in userspace due to delays this
reading and making this value useful in userspace can have, I can nuke
that. The kmod-limit however seems very useful so that userspace knows
how to properly thread *safely* modprobe calls more deterministically.

Adding write support to let one bump the limit was just an easy convenience
possible given the read support was being added, but its use should
really only be useful for testing purposes post bootup given that the
real value in the limit will be important at boot time prior to the sysctl
parsing. The real know tweak which should be used in case of issues is
the module parameter added earlier.

So I could drop the kmod-count, and just make the kmod-limit read-only.
Thoughts?

> Anyway, it seems that such know should also get documented in
> Documentation/sysctl/kernel.txt

Will do if we keep them, thanks.

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 04/10] kmod: provide wrappers for kmod_concurrent inc/dec
  2016-12-15 12:46       ` Petr Mladek
@ 2016-12-16  8:05         ` Luis R. Rodriguez
  2016-12-22  4:48           ` Jessica Yu
  2017-01-10 18:57           ` [RFC 04/10] " Luis R. Rodriguez
  0 siblings, 2 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-16  8:05 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Luis R. Rodriguez, Kees Cook, shuah, Jessica Yu, Rusty Russell,
	Eric W. Biederman, Dmitry Torokhov, Arnaldo Carvalho de Melo,
	Jonathan Corbet, martin.wilck, Michal Marek, hare, rwright,
	Jeff Mahoney, DSterba, fdmanana, neilb, Guenter Roeck, rgoldwyn,
	subashab, Heinrich Schuchardt, Aaron Tomlin, mbenes,
	Paul E. McKenney, Dan Williams, Josh Poimboeuf, David S. Miller,
	Ingo Molnar, Andrew Morton, Linus Torvalds, linux-kselftest,
	linux-doc, LKML

On Thu, Dec 15, 2016 at 01:46:25PM +0100, Petr Mladek wrote:
> On Thu 2016-12-08 22:08:59, Luis R. Rodriguez wrote:
> > On Thu, Dec 08, 2016 at 12:29:42PM -0800, Kees Cook wrote:
> > > On Thu, Dec 8, 2016 at 11:48 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > > kmod_concurrent is used as an atomic counter for enabling
> > > > the allowed limit of modprobe calls, provide wrappers for it
> > > > to enable this to be expanded on more easily. This will be done
> > > > later.
> > > >
> > > > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > > > ---
> > > >  kernel/kmod.c | 27 +++++++++++++++++++++------
> > > >  1 file changed, 21 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > > > index cb6f7ca7b8a5..049d7eabda38 100644
> > > > --- a/kernel/kmod.c
> > > > +++ b/kernel/kmod.c
> > > > @@ -108,6 +111,20 @@ static int call_modprobe(char *module_name, int wait)
> > > >         return -ENOMEM;
> > > >  }
> > > >
> > > > +static int kmod_umh_threads_get(void)
> > > > +{
> > > > +       atomic_inc(&kmod_concurrent);
> 
> This approach might actually cause false failures. If we
> are on the limit and more processes do this increment
> in parallel, it makes the number bigger that it should be.

This approach is *exactly* what the existing code does :P
I just provided wrappers. I agree with the old approach though,
reason is it acts as a lock in for the bump. What seems rather
stupid though is to just reject with an error on limit without first
taking a breather. I've now added a little clutch so that we first
take some fresh air when close to the limit, this reduces the chances
of going fatal.

With a clutch in place we can still go over the limit, its just we'd
have a few threads waiting until previous calls clear out. If there
is enough calls waiting eventually we'll fail.

Note though that __request_module() can wait, but here is an option
to not wait so such a clutch can only wait if we're allowed to.

> > > > +       if (atomic_read(&kmod_concurrent) < max_modprobes)
> > > > +               return 0;
> > > > +       atomic_dec(&kmod_concurrent);
> > > > +       return -ENOMEM;
> > > > +}
> > > > +
> > > > +static void kmod_umh_threads_put(void)
> > > > +{
> > > > +       atomic_dec(&kmod_concurrent);
> > > > +}
> > > 
> > > Can you use a kref here instead? We're trying to kill raw use of
> > > atomic_t for reference counting...
> > 
> > That's a much broader functional change than I was looking for, but I am up for
> > it. Can you describe the benefit of using kref you expect or why this is an
> > ongoing crusade? Since its a larger functional change how about doing this
> > change later, and we can test impact with the tress test driver. In theory if
> > there are benefits can't we add a test case to prove the gains?
> 
> Kees probably refers to the kref improvements that Peter Zijlstra
> is working on, see
> https://lkml.kernel.org/r/20161114174446.832175072@infradead.org
> 
> The advantage is that the new refcount API handles over and
> underflow.
> 
> Another advantage is that it increments/decrements the value
> only when it is safe. It uses cmpxchg to make sure that
> the checks are valid.

Great thanks, will look into that.

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 10/10] kmod: add a sanity check on module loading
  2016-12-15  0:27   ` Rusty Russell
@ 2016-12-16  8:31     ` Luis R. Rodriguez
  2016-12-17  3:54       ` Rusty Russell
  0 siblings, 1 reply; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-16  8:31 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Luis R. Rodriguez, shuah, jeyu, ebiederm, dmitry.torokhov, acme,
	corbet, martin.wilck, mmarek, pmladek, hare, rwright, jeffm,
	DSterba, fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel

On Thu, Dec 15, 2016 at 10:57:42AM +1030, Rusty Russell wrote:
> "Luis R. Rodriguez" <mcgrof@kernel.org> writes:
> > kmod has an optimization in place whereby if a some kernel code
> > uses request_module() on a module already loaded we never bother
> > userspace as the module already is loaded. This is not true for
> > get_fs_type() though as it uses aliases.
> 
> Well, the obvious thing to do here is block kmod if we're currently
> loading the same module.

OK thanks, I've now added this, it sure helps. Test cases 0008 and 0009 require
hammering on the test over and over to see a failure on vanilla kernels,
an upper bound I found was about 150 times each test. Running test 0008
150 times with this enhancement you mentioned shaves off ~4 seconds.
For test 0009 it shaves off ~16 seconds, but as I note below the alias support
was needed as well.

> Otherwise it has to do some weird spinning
> thing in userspace anyway.

Right, but note that the get_fs_type() tests would still fail given
module.c was not alias-aware yet. I have the patches to add support
for the aliases now though and this is part of what helped shave
off time from the tests.

> We already have module_wq for this, we just need a bit more code to
> share the return value; and there's a weird corner case there where we
> have "modprobe foo param=invalid" then "modprobe foo param=valid" and we
> fail both with -EINVAL, but it's probably not worth fixing.

Hm OK. Although the set of patches I have fix and optimize now some
of these corner cases one issue that I still didn't quite yet figure
out was that a failure propagates secondary failures. That is,
say a module fails and you have loaded 4 request for the same module,
if the first request failed the last 3 *could* also fail. You can
trigger and see this with the latest script:

http://drvbp1.linux-foundation.org/~mcgrof/2016/12/16/kmod.sh

The latest version of the test_kmod driver:

http://drvbp1.linux-foundation.org/~mcgrof/2016/12/16/test_kmod.patch

./kmod.sh -t 0008
./kmod.sh -t 0009

When either of these fail you'll on dmesg that either a few NULL or
errors were found. It may not be worth fixing this race... given
that after apply all of my patches I no longer see this at all,
but I'm pretty sure a test case can be created to replicate more
easily.

FWIW a few things did occur to me:

a) list_add_rcu() is used so new modules get added first
b) find_module_all() returns the last module which was added as it traverses
   the module list

Because of a) and b) if two modules for the same driver can be on
the list at the same time then we'll get very likely a module which
is unformed or going than a live module. Changing module addition
to use list_add_tail_rcu() should mean we typically get the first
module added to the list for the module name I think, but other
than that I could not think clearly of the root case to allowing
multiple errors.

BTW should find_module_all() use rcu to traverse?

--- a/kernel/module.c
+++ b/kernel/module.c
@@ -594,7 +594,7 @@ static struct module *find_module_all(const char *name, size_t len,
 
 	module_assert_mutex_or_preempt();
 
-	list_for_each_entry(mod, &modules, list) {
+	list_for_each_entry_rcu(mod, &modules, list) {
 		if (!even_unformed && mod->state == MODULE_STATE_UNFORMED)
 			continue;
 		if (strlen(mod->name) == len && !memcmp(mod->name, name, len))
@@ -3532,7 +3532,7 @@ static int add_unformed_module(struct module *mod)
 		goto out;
 	}
 	mod_update_bounds(mod);
-	list_add_rcu(&mod->list, &modules);
+	list_add_tail_rcu(&mod->list, &modules);
 	mod_tree_insert(mod);
 	err = 0;
 

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 03/10] kmod: add dynamic max concurrent thread count
  2016-12-14 15:38   ` Petr Mladek
@ 2016-12-16  8:39     ` Luis R. Rodriguez
  2017-01-10 19:24       ` Luis R. Rodriguez
  0 siblings, 1 reply; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-16  8:39 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Luis R. Rodriguez, shuah, jeyu, rusty, ebiederm, dmitry.torokhov,
	acme, corbet, martin.wilck, mmarek, hare, rwright, jeffm,
	DSterba, fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel

On Wed, Dec 14, 2016 at 04:38:27PM +0100, Petr Mladek wrote:
> On Thu 2016-12-08 11:48:14, Luis R. Rodriguez wrote:
> > diff --git a/init/Kconfig b/init/Kconfig
> > index 271692a352f1..da2c25746937 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -2111,6 +2111,29 @@ config TRIM_UNUSED_KSYMS
> >  
> >  	  If unsure, or if you need to build out-of-tree modules, say N.
> >  
> > +config MAX_KMOD_CONCURRENT
> > +	int "Max allowed concurrent request_module() calls (6=>64, 10=>1024)"
> > +	range 0 14
> 
> Would not too small range break loading module dependencies?

No, dependencies are resolved by depmod, so userspace looks at the list and
just finit_module() the depenencies, skipping kmod. So the limit is
really only for kernel acting like a boss.

> I am not sure how it is implemented but it might require having
> some more module loads in progress.

Dependencies should be OK, a more serious concern with dependencies is
the aggregate memory it takes to load all dep modules for one required
module since finit_module() ends up allocating the struct module to copy
over data from userspace.

> I would give 6 as minimum. Nobody has troubles with the current limit.

Fair enough! Although disabling modprobe calls all together seemed like
a fun test, that should we allow that via the module parameter at least?

> > +	default 6 if !BASE_SMALL
> > +	default 7 if BASE_SMALL
> 
> Aren't the conditions inversed?

Whoops yes, sorry.

> > +void __init init_kmod_umh(void)
> > +{
> > +	if (!max_modprobes)
> > +		max_modprobes = min(max_threads/2,
> > +				    2 << CONFIG_MAX_KMOD_CONCURRENT);
> 
> This should be
> 
> 	1 << CONFIG_MAX_KMOD_CONCURRENT);
> 
> 1 << 1 = 2;
> 
> Note that this calculation is mentioned also some comments and
> documentation.

Heh sorry, yes fixed! Good thing I had still tested all along with the
value I intended though :P

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 07/10] kmod: use simplified rate limit printk
  2016-12-14 16:23   ` Petr Mladek
  2016-12-14 16:41     ` Joe Perches
@ 2016-12-16  8:44     ` Luis R. Rodriguez
  1 sibling, 0 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-16  8:44 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Luis R. Rodriguez, shuah, jeyu, rusty, ebiederm, dmitry.torokhov,
	acme, corbet, martin.wilck, mmarek, hare, rwright, jeffm,
	DSterba, fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel

On Wed, Dec 14, 2016 at 05:23:50PM +0100, Petr Mladek wrote:
> On Thu 2016-12-08 11:49:01, Luis R. Rodriguez wrote:
> > Just use the simplified rate limit printk when the max modprobe
> > limit is reached, while at it throw out a bone should the error
> > be triggered.
> > 
> > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > ---
> >  kernel/kmod.c | 10 ++--------
> >  1 file changed, 2 insertions(+), 8 deletions(-)
> > 
> > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > index 09cf35a2075a..ef65f4c3578a 100644
> > --- a/kernel/kmod.c
> > +++ b/kernel/kmod.c
> > @@ -158,7 +158,6 @@ int __request_module(bool wait, const char *fmt, ...)
> >  	va_list args;
> >  	char module_name[MODULE_NAME_LEN];
> >  	int ret;
> > -	static int kmod_loop_msg;
> >  
> >  	/*
> >  	 * We don't allow synchronous module loading from async.  Module
> > @@ -183,13 +182,8 @@ int __request_module(bool wait, const char *fmt, ...)
> >  
> >  	ret = kmod_umh_threads_get();
> >  	if (ret) {
> > -		/* We may be blaming an innocent here, but unlikely */
> > -		if (kmod_loop_msg < 5) {
> > -			printk(KERN_ERR
> > -			       "request_module: runaway loop modprobe %s\n",
> > -			       module_name);
> > -			kmod_loop_msg++;
> > -		}
> > +		pr_err_ratelimited("request_module: modprobe limit (%u) reached with module %s\n",
> > +				   max_modprobes, module_name);
> 
> I like this change. I would only be even more descriptive in which
> limit is reached. Something like
> 
> 		pr_err_ratelimited("request_module: module \"%s\" reached limit (%u) of concurrent modprobe calls\n",
> 				   module_name, max_modprobes);

Sure, changed.

> Either way, feel free to add:
> 
> Reviewed-by: Petr Mladek <pmladek@suse.com>

Thanks!

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 02/10] module: fix memory leak on early load_module() failures
  2016-12-09 17:06   ` Miroslav Benes
@ 2016-12-16  8:51     ` Luis R. Rodriguez
  0 siblings, 0 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-16  8:51 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Luis R. Rodriguez, shuah, jeyu, rusty, ebiederm, dmitry.torokhov,
	acme, corbet, martin.wilck, mmarek, pmladek, hare, rwright,
	jeffm, DSterba, fdmanana, neilb, linux, rgoldwyn, subashab,
	xypron.glpk, keescook, atomlin, paulmck, dan.j.williams,
	jpoimboe, davem, mingo, akpm, torvalds, linux-kselftest,
	linux-doc, linux-kernel

On Fri, Dec 09, 2016 at 06:06:44PM +0100, Miroslav Benes wrote:
> 
> Reviewed-by: Miroslav Benes <mbenes@suse.cz>
> 
> What about
> 
> Fixes: e180a6b7759a ("param: fix charp parameters set via sysfs")
> 
> ?

Sure thing, added thanks!

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 10/10] kmod: add a sanity check on module loading
  2016-12-16  8:31     ` Luis R. Rodriguez
@ 2016-12-17  3:54       ` Rusty Russell
       [not found]         ` <CAB=NE6VvuA9a6hf6yoopGfUxVJQM5HyV5bNzUdsEtUV0UhbG-g@mail.gmail.com>
  0 siblings, 1 reply; 65+ messages in thread
From: Rusty Russell @ 2016-12-17  3:54 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Luis R. Rodriguez, shuah, jeyu, ebiederm, dmitry.torokhov, acme,
	corbet, martin.wilck, mmarek, pmladek, hare, rwright, jeffm,
	DSterba, fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel

"Luis R. Rodriguez" <mcgrof@kernel.org> writes:
> On Thu, Dec 15, 2016 at 10:57:42AM +1030, Rusty Russell wrote:
>> "Luis R. Rodriguez" <mcgrof@kernel.org> writes:
>> > kmod has an optimization in place whereby if a some kernel code
>> > uses request_module() on a module already loaded we never bother
>> > userspace as the module already is loaded. This is not true for
>> > get_fs_type() though as it uses aliases.
>> 
>> Well, the obvious thing to do here is block kmod if we're currently
>> loading the same module.
>
> OK thanks, I've now added this, it sure helps. Test cases 0008 and 0009 require
> hammering on the test over and over to see a failure on vanilla kernels,
> an upper bound I found was about 150 times each test. Running test 0008
> 150 times with this enhancement you mentioned shaves off ~4 seconds.
> For test 0009 it shaves off ~16 seconds, but as I note below the alias support
> was needed as well.
>
>> Otherwise it has to do some weird spinning
>> thing in userspace anyway.
>
> Right, but note that the get_fs_type() tests would still fail given
> module.c was not alias-aware yet.

AFAICT the mistake here is that kmod is returning "done, OK" when the
module it is trying to load is already loading (but not finished
loading).  That's the root problem; it's an attempt at optimization by
kmod which goes awry.

Looking at the code in the kernel, we *already* get this right: block if
a module is still loading anyway.  Once it succeeds we return -EBUSY; if
it fails we'll proceed to try to load it again.

I don't understand what you're trying to fix with adding aliases
in-kernel?

> FWIW a few things did occur to me:
>
> a) list_add_rcu() is used so new modules get added first

Only after we're sure that there are no duplicates.

> b) find_module_all() returns the last module which was added as it traverses
>    the module list

> BTW should find_module_all() use rcu to traverse?

Yes; the kallsyms code does this on Oops.  Not really a big issue in
practice, but a nice fix.

Thanks,
Rusty.

>
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -594,7 +594,7 @@ static struct module *find_module_all(const char *name, size_t len,
>  
>  	module_assert_mutex_or_preempt();
>  
> -	list_for_each_entry(mod, &modules, list) {
> +	list_for_each_entry_rcu(mod, &modules, list) {
>  		if (!even_unformed && mod->state == MODULE_STATE_UNFORMED)
>  			continue;
>  		if (strlen(mod->name) == len && !memcmp(mod->name, name, len))
> @@ -3532,7 +3532,7 @@ static int add_unformed_module(struct module *mod)
>  		goto out;
>  	}
>  	mod_update_bounds(mod);
> -	list_add_rcu(&mod->list, &modules);
> +	list_add_tail_rcu(&mod->list, &modules);
>  	mod_tree_insert(mod);
>  	err = 0;
>  

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 10/10] kmod: add a sanity check on module loading
       [not found]         ` <CAB=NE6VvuA9a6hf6yoopGfUxVJQM5HyV5bNzUdsEtUV0UhbG-g@mail.gmail.com>
@ 2016-12-20  0:53           ` Rusty Russell
  2016-12-20 18:52             ` Luis R. Rodriguez
  0 siblings, 1 reply; 65+ messages in thread
From: Rusty Russell @ 2016-12-20  0:53 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Filipe Manana, Paul E. McKenney, linux-doc, rgoldwyn, hare,
	Jonathan Corbet, Linus Torvalds, linux-kselftest, Andrew Morton,
	Dan Williams, Aaron Tomlin, rwright, Heinrich Schuchardt,
	Michal Marek, martin.wilck, Jeff Mahoney, Ingo Molnar,
	Petr Mladek, Dmitry Torokhov, Guenter Roeck, Eric W. Biederman,
	shuah, DSterba, Kees Cook, Josh Poimboeuf,
	Arnaldo Carvalho de Melo, Miroslav Benes, NeilBrown,
	linux-kernel, David Miller, Jessica Yu,
	Subash Abhinov Kasiviswanathan, Julia Lawall

"Luis R. Rodriguez" <mcgrof@kernel.org> writes:
> On Dec 16, 2016 9:54 PM, "Rusty Russell" <rusty@rustcorp.com.au> wrote:
> > AFAICT the mistake here is that kmod is returning "done, OK" when the
> > module it is trying to load is already loading (but not finished
> > loading).  That's the root problem; it's an attempt at optimization by
> > kmod which goes awry.
>
> This is true! To be precise though the truth of the matter is that kmod'd
> respective usermode helper: modprobe can be buggy and may lie to us. It may
> allow request_module() to return 0 but since we don't validate it, any
> assumption we make can be deadly. In the case of get_fs_type() its a null
> dereference.

Wait, what??  I can't see that in get_fs_type, which hasn't changed
since 2013.  If a caller is assuming get_fs_type() doesn't return NULL,
they're broken and need fixing of course:

        struct file_system_type *get_fs_type(const char *name)
        {
        	struct file_system_type *fs;
        	const char *dot = strchr(name, '.');
        	int len = dot ? dot - name : strlen(name);

        	fs = __get_fs_type(name, len);
        	if (!fs && (request_module("fs-%.*s", len, name) == 0))
        		fs = __get_fs_type(name, len);

        	if (dot && fs && !(fs->fs_flags & FS_HAS_SUBTYPE)) {
        		put_filesystem(fs);
        		fs = NULL;
        	}
        	return fs;
        }

Where does this NULL-deref is the module isn't correctly loaded?

> *Iff* we want a sanity check to verify kmod's umh is not lying to us we
> need to verify after 0 was returned that it was not lying to us. Since kmod
> accepts aliases but find_modules_all() only works on the real module name a
> validation check cannot happen when all you have are aliases.

request_module() should block until resolution, but that's fundamentally
a userspace problem.  Let's not paper over it in kernelspace.

> *Iff* we are sure we don't want a validation (or another earlier
> optimization to avoid calling out to modrobe if the alias requested is
> already present, which does the time shaving I mentioned on the tests) then
> naturally no request_module() calls returning 0 can assert information
> about the requested module. I think we might need to change more code if we
> accept we cannot trust request_module() calls, or we accept userspace
> telling the kernel something may mean we sometimes crash. This later
> predicament seems rather odd so hence the patch.
>
> Perhaps in some cases validation of work from a umh is not critical in
> kernel but for request_module() I can tell you that today get_fs_type code
> currently asserts the module found can never be NULL.

OK, what am I missing in the code above?  

> > Looking at the code in the kernel, we *already* get this right: block if
> > a module is still loading anyway.  Once it succeeds we return -EBUSY if
> >
> > it fails we'll proceed to try to load it again.
> >
> > I don't understand what you're trying to fix with adding aliases
> > in-kernel?
>
> Two fold now:
>
> a) validation on request_module() work when an alias is used

But why?

> b) since kmod accepts aliaes, if we get aliases support, it means we could
> *also* preemptively avoid calling out to userspace for modules already
> present.

No, because once we have a module we don't request it: requesting is the
fallback case.

> >> FWIW a few things did occur to me:
> >>
> >> a) list_add_rcu() is used so new modules get added first
> >
> > Only after we're sure that there are no duplicates.
> >
> >
> OK! This is a very critical assertion. I should be able to add a debug
> WARN_ON() should two modules be on the modules list for the same module
> then ?

Yes, names must be unique.

>> b) find_module_all() returns the last module which was added as it
> traverses
>>    the module list
>
>> BTW should find_module_all() use rcu to traverse?
>
> Yes; the kallsyms code does this on Oops.  Not really a big issue in
> practice, but a nice fix.
>
> Ok, will bundle into my queue.

Please submit to Jessica for her module queue, as it's orthogonal
AFAICT.

> I will note though that I still think there's a bug in this code --
> upon a failure other "spinning" requests can fail, I believe this may
> be due to not having another state or informing pending modules too
> early of a failure but I haven't been able to prove this conjecture
> yet.

That's possible, but I can't see it from quickly re-checking the code.

The module should be fully usable at this point; the module's init has
been called successfully, so in the case of __get_fs_type() it should
now succeed.  The module cleans up its init section, but that should be
independent.

If there is a race, it's likely to be when some other caller wakes the
queue.  Moving the wakeup as soon as possible should make it easier to
trigger:

diff --git a/kernel/module.c b/kernel/module.c
index f57dd63186e6..78bd89d41a22 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3397,6 +3397,7 @@ static noinline int do_init_module(struct module *mod)
 
 	/* Now it's a first class citizen! */
 	mod->state = MODULE_STATE_LIVE;
+	wake_up_all(&module_wq);
 	blocking_notifier_call_chain(&module_notify_list,
 				     MODULE_STATE_LIVE, mod);
 
@@ -3445,7 +3446,6 @@ static noinline int do_init_module(struct module *mod)
 	 */
 	call_rcu_sched(&freeinit->rcu, do_free_init);
 	mutex_unlock(&module_mutex);
-	wake_up_all(&module_wq);
 
 	return 0;
 

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 10/10] kmod: add a sanity check on module loading
  2016-12-20  0:53           ` Rusty Russell
@ 2016-12-20 18:52             ` Luis R. Rodriguez
  2016-12-21  2:21               ` Rusty Russell
  0 siblings, 1 reply; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-20 18:52 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Filipe Manana, Paul E. McKenney, linux-doc, rgoldwyn, hare,
	Jonathan Corbet, Linus Torvalds, linux-kselftest, Andrew Morton,
	Dan Williams, Aaron Tomlin, rwright, Heinrich Schuchardt,
	Michal Marek, martin.wilck, Jeff Mahoney, Ingo Molnar,
	Petr Mladek, Dmitry Torokhov, Guenter Roeck, Eric W. Biederman,
	shuah, DSterba, Kees Cook, Josh Poimboeuf,
	Arnaldo Carvalho de Melo, Miroslav Benes, NeilBrown,
	linux-kernel, David Miller, Jessica Yu,
	Subash Abhinov Kasiviswanathan, Julia Lawall

On Mon, Dec 19, 2016 at 6:53 PM, Rusty Russell <rusty@rustcorp.com.au> wrote:
> Where does this NULL-deref is the module isn't correctly loaded?

No you are right, sorry -- I had confused a failure to mount over null
deref, my mistake.

>> *Iff* we want a sanity check to verify kmod's umh is not lying to us we
>> need to verify after 0 was returned that it was not lying to us. Since kmod
>> accepts aliases but find_modules_all() only works on the real module name a
>> validation check cannot happen when all you have are aliases.
>
> request_module() should block until resolution, but that's fundamentally
> a userspace problem.  Let's not paper over it in kernelspace.

OK -- if userspace messes up again it may be a bit hard to prove
unless we have a validation debug thing in place, would such a thing
in debug form be reasonable ?

>> Yes; the kallsyms code does this on Oops.  Not really a big issue in
>> practice, but a nice fix.
>>
>> Ok, will bundle into my queue.
>
> Please submit to Jessica for her module queue, as it's orthogonal
> AFAICT.

Will do.

>> I will note though that I still think there's a bug in this code --
>> upon a failure other "spinning" requests can fail, I believe this may
>> be due to not having another state or informing pending modules too
>> early of a failure but I haven't been able to prove this conjecture
>> yet.
>
> That's possible, but I can't see it from quickly re-checking the code.
>
> The module should be fully usable at this point; the module's init has
> been called successfully, so in the case of __get_fs_type() it should
> now succeed.  The module cleans up its init section, but that should be
> independent.
>
> If there is a race, it's likely to be when some other caller wakes the
> queue.  Moving the wakeup as soon as possible should make it easier to
> trigger:
>
> diff --git a/kernel/module.c b/kernel/module.c
> index f57dd63186e6..78bd89d41a22 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -3397,6 +3397,7 @@ static noinline int do_init_module(struct module *mod)
>
>         /* Now it's a first class citizen! */
>         mod->state = MODULE_STATE_LIVE;
> +       wake_up_all(&module_wq);
>         blocking_notifier_call_chain(&module_notify_list,
>                                      MODULE_STATE_LIVE, mod);
>
> @@ -3445,7 +3446,6 @@ static noinline int do_init_module(struct module *mod)
>          */
>         call_rcu_sched(&freeinit->rcu, do_free_init);
>         mutex_unlock(&module_mutex);
> -       wake_up_all(&module_wq);
>
>         return 0;
>

Will give this a shot, thanks!

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 10/10] kmod: add a sanity check on module loading
  2016-12-20 18:52             ` Luis R. Rodriguez
@ 2016-12-21  2:21               ` Rusty Russell
  2016-12-21 13:08                 ` Luis R. Rodriguez
  0 siblings, 1 reply; 65+ messages in thread
From: Rusty Russell @ 2016-12-21  2:21 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Filipe Manana, Paul E. McKenney, linux-doc, rgoldwyn, hare,
	Jonathan Corbet, Linus Torvalds, linux-kselftest, Andrew Morton,
	Dan Williams, Aaron Tomlin, rwright, Heinrich Schuchardt,
	Michal Marek, martin.wilck, Jeff Mahoney, Ingo Molnar,
	Petr Mladek, Dmitry Torokhov, Guenter Roeck, Eric W. Biederman,
	shuah, DSterba, Kees Cook, Josh Poimboeuf,
	Arnaldo Carvalho de Melo, Miroslav Benes, NeilBrown,
	linux-kernel@vger.kernel.o rg, David Miller, Jessica Yu,
	Subash Abhinov Kasiviswanathan, Julia Lawall

"Luis R. Rodriguez" <mcgrof@kernel.org> writes:
> OK -- if userspace messes up again it may be a bit hard to prove
> unless we have a validation debug thing in place, would such a thing
> in debug form be reasonable ?

That makes perfect sense.  Untested hack:

diff --git a/fs/filesystems.c b/fs/filesystems.c
index c5618db110be..e5c90e80c7d3 100644
--- a/fs/filesystems.c
+++ b/fs/filesystems.c
@@ -275,9 +275,10 @@ struct file_system_type *get_fs_type(const char *name)
 	int len = dot ? dot - name : strlen(name);
 
 	fs = __get_fs_type(name, len);
-	if (!fs && (request_module("fs-%.*s", len, name) == 0))
+	if (!fs && (request_module("fs-%.*s", len, name) == 0)) {
 		fs = __get_fs_type(name, len);
-
+		WARN_ONCE(!fs, "request_module fs-%.*s succeeded, but still no fs?\n", len, name);
+	}
 	if (dot && fs && !(fs->fs_flags & FS_HAS_SUBTYPE)) {
 		put_filesystem(fs);
 		fs = NULL;

Maybe a similar hack for try_then_request_module(), but many places seem
to open-code request_module() so it's not as trivial...

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 10/10] kmod: add a sanity check on module loading
  2016-12-21  2:21               ` Rusty Russell
@ 2016-12-21 13:08                 ` Luis R. Rodriguez
  2017-01-03  0:04                   ` Rusty Russell
  0 siblings, 1 reply; 65+ messages in thread
From: Luis R. Rodriguez @ 2016-12-21 13:08 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Filipe Manana, Paul E. McKenney, linux-doc, rgoldwyn, hare,
	Jonathan Corbet, Linus Torvalds, linux-kselftest, Andrew Morton,
	Dan Williams, Aaron Tomlin, rwright, Heinrich Schuchardt,
	Michal Marek, martin.wilck, Jeff Mahoney, Ingo Molnar,
	Petr Mladek, Dmitry Torokhov, Guenter Roeck, Eric W. Biederman,
	shuah, DSterba, Kees Cook, Josh Poimboeuf,
	Arnaldo Carvalho de Melo, Miroslav Benes, NeilBrown,
	linux-kernel@vger.kernel.o rg, David Miller, Jessica Yu,
	Subash Abhinov Kasiviswanathan, Julia Lawall

On Tue, Dec 20, 2016 at 8:21 PM, Rusty Russell <rusty@rustcorp.com.au> wrote:
> "Luis R. Rodriguez" <mcgrof@kernel.org> writes:
>> OK -- if userspace messes up again it may be a bit hard to prove
>> unless we have a validation debug thing in place, would such a thing
>> in debug form be reasonable ?
>
> That makes perfect sense.  Untested hack:
>
> diff --git a/fs/filesystems.c b/fs/filesystems.c
> index c5618db110be..e5c90e80c7d3 100644
> --- a/fs/filesystems.c
> +++ b/fs/filesystems.c
> @@ -275,9 +275,10 @@ struct file_system_type *get_fs_type(const char *name)
>         int len = dot ? dot - name : strlen(name);
>
>         fs = __get_fs_type(name, len);
> -       if (!fs && (request_module("fs-%.*s", len, name) == 0))
> +       if (!fs && (request_module("fs-%.*s", len, name) == 0)) {
>                 fs = __get_fs_type(name, len);
> -
> +               WARN_ONCE(!fs, "request_module fs-%.*s succeeded, but still no fs?\n", len, name);
> +       }
>         if (dot && fs && !(fs->fs_flags & FS_HAS_SUBTYPE)) {
>                 put_filesystem(fs);
>                 fs = NULL;

This is precisely a type of debug patch we had added first to verify "WTF".

> Maybe a similar hack for try_then_request_module(), but many places seem
> to open-code request_module() so it's not as trivial...

Right, out of ~350 request_module() calls (not included try requests)
only ~46 check the return value. Hence a validation check, and come to
think of it, *this* was the issue that originally had me believing
that in some places we might end up in a null deref --if those open
coded request_module() calls assume the driver is loaded there could
be many places where a NULL is inevitable. Granted, I agree they
should be fixed, we could add a grammar rule to start nagging at
driver developers for started, but it does beg the question also of
what a tightly knit validation for modprobe might look like, and hence
this patch and now the completed not-yet-posted alias work.

Would it be worthy as a kconfig kmod debugging aide for now? I can
follow up with a semantic patch to nag about checking the return value
of request_module(), and we can  have 0-day then also complain about
new invalid uses.

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: kmod: provide wrappers for kmod_concurrent inc/dec
  2016-12-16  8:05         ` Luis R. Rodriguez
@ 2016-12-22  4:48           ` Jessica Yu
  2017-01-06 20:54             ` Luis R. Rodriguez
  2017-01-10 18:57           ` [RFC 04/10] " Luis R. Rodriguez
  1 sibling, 1 reply; 65+ messages in thread
From: Jessica Yu @ 2016-12-22  4:48 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Petr Mladek, Kees Cook, shuah, Rusty Russell, Eric W. Biederman,
	Dmitry Torokhov, Arnaldo Carvalho de Melo, Jonathan Corbet,
	martin.wilck, Michal Marek, hare, rwright, Jeff Mahoney, DSterba,
	fdmanana, neilb, Guenter Roeck, rgoldwyn, subashab,
	Heinrich Schuchardt, Aaron Tomlin, mbenes, Paul E. McKenney,
	Dan Williams, Josh Poimboeuf, David S. Miller, Ingo Molnar,
	Andrew Morton, Linus Torvalds, linux-kselftest, linux-doc, LKML

+++ Luis R. Rodriguez [16/12/16 09:05 +0100]:
>On Thu, Dec 15, 2016 at 01:46:25PM +0100, Petr Mladek wrote:
>> On Thu 2016-12-08 22:08:59, Luis R. Rodriguez wrote:
>> > On Thu, Dec 08, 2016 at 12:29:42PM -0800, Kees Cook wrote:
>> > > On Thu, Dec 8, 2016 at 11:48 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>> > > > kmod_concurrent is used as an atomic counter for enabling
>> > > > the allowed limit of modprobe calls, provide wrappers for it
>> > > > to enable this to be expanded on more easily. This will be done
>> > > > later.
>> > > >
>> > > > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
>> > > > ---
>> > > >  kernel/kmod.c | 27 +++++++++++++++++++++------
>> > > >  1 file changed, 21 insertions(+), 6 deletions(-)
>> > > >
>> > > > diff --git a/kernel/kmod.c b/kernel/kmod.c
>> > > > index cb6f7ca7b8a5..049d7eabda38 100644
>> > > > --- a/kernel/kmod.c
>> > > > +++ b/kernel/kmod.c
>> > > > @@ -108,6 +111,20 @@ static int call_modprobe(char *module_name, int wait)
>> > > >         return -ENOMEM;
>> > > >  }
>> > > >
>> > > > +static int kmod_umh_threads_get(void)
>> > > > +{
>> > > > +       atomic_inc(&kmod_concurrent);
>>
>> This approach might actually cause false failures. If we
>> are on the limit and more processes do this increment
>> in parallel, it makes the number bigger that it should be.
>
>This approach is *exactly* what the existing code does :P
>I just provided wrappers. I agree with the old approach though,
>reason is it acts as a lock in for the bump. 

I think what Petr meant was that we could run into false failures when multiple
atomic increments happen between the first increment and the subsequent
atomic_read.

Say max_modprobes is 64 -

       atomic_inc(&kmod_concurrent); // thread 1: kmod_concurrent is 63
            atomic_inc(&kmod_concurrent); // thread 2: kmod_concurrent is 64
                 atomic_inc(&kmod_concurrent); // thread 3: kmod_concurrent is 65
       if (atomic_read(&kmod_concurrent) < max_modprobes) // if all threads read 65 here, then all will error out
               return 0;                                  // when the first two should have succeeded (false failures)
       atomic_dec(&kmod_concurrent);
       return -ENOMEM;

But yeah, I think this issue was already in the existing kmod code..

Jessica

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: kmod: provide wrappers for kmod_concurrent inc/dec
  2016-12-08 19:48 ` [RFC 04/10] kmod: provide wrappers for kmod_concurrent inc/dec Luis R. Rodriguez
  2016-12-08 20:29   ` Kees Cook
@ 2016-12-22  5:07   ` Jessica Yu
  2017-01-10 20:28     ` Luis R. Rodriguez
  1 sibling, 1 reply; 65+ messages in thread
From: Jessica Yu @ 2016-12-22  5:07 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, rusty, ebiederm, dmitry.torokhov, acme, corbet,
	martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel

+++ Luis R. Rodriguez [08/12/16 11:48 -0800]:
>kmod_concurrent is used as an atomic counter for enabling
>the allowed limit of modprobe calls, provide wrappers for it
>to enable this to be expanded on more easily. This will be done
>later.
>
>Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
>---
> kernel/kmod.c | 27 +++++++++++++++++++++------
> 1 file changed, 21 insertions(+), 6 deletions(-)
>
>diff --git a/kernel/kmod.c b/kernel/kmod.c
>index cb6f7ca7b8a5..049d7eabda38 100644
>--- a/kernel/kmod.c
>+++ b/kernel/kmod.c
>@@ -44,6 +44,9 @@
> #include <trace/events/module.h>
>
> extern int max_threads;
>+
>+static atomic_t kmod_concurrent = ATOMIC_INIT(0);
>+
> unsigned int max_modprobes;
> module_param(max_modprobes, uint, 0644);
> MODULE_PARM_DESC(max_modprobes, "Max number of allowed concurrent modprobes");
>@@ -108,6 +111,20 @@ static int call_modprobe(char *module_name, int wait)
> 	return -ENOMEM;
> }
>
>+static int kmod_umh_threads_get(void)
>+{
>+	atomic_inc(&kmod_concurrent);
>+	if (atomic_read(&kmod_concurrent) < max_modprobes)

Should this not be <=? I think this only allows up to max_modprobes-1 concurrent threads.

>+		return 0;
>+	atomic_dec(&kmod_concurrent);
>+	return -ENOMEM;
>+}
>+
>+static void kmod_umh_threads_put(void)
>+{
>+	atomic_dec(&kmod_concurrent);
>+}
>+
> /**
>  * __request_module - try to load a kernel module
>  * @wait: wait (or not) for the operation to complete
>@@ -129,7 +146,6 @@ int __request_module(bool wait, const char *fmt, ...)
> 	va_list args;
> 	char module_name[MODULE_NAME_LEN];
> 	int ret;
>-	static atomic_t kmod_concurrent = ATOMIC_INIT(0);
> 	static int kmod_loop_msg;
>
> 	/*
>@@ -153,8 +169,8 @@ int __request_module(bool wait, const char *fmt, ...)
> 	if (ret)
> 		return ret;
>
>-	atomic_inc(&kmod_concurrent);
>-	if (atomic_read(&kmod_concurrent) > max_modprobes) {
>+	ret = kmod_umh_threads_get();
>+	if (ret) {
> 		/* We may be blaming an innocent here, but unlikely */
> 		if (kmod_loop_msg < 5) {
> 			printk(KERN_ERR
>@@ -162,15 +178,14 @@ int __request_module(bool wait, const char *fmt, ...)
> 			       module_name);
> 			kmod_loop_msg++;
> 		}
>-		atomic_dec(&kmod_concurrent);
>-		return -ENOMEM;
>+		return ret;
> 	}
>
> 	trace_module_request(module_name, wait, _RET_IP_);
>
> 	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
>
>-	atomic_dec(&kmod_concurrent);
>+	kmod_umh_threads_put();
> 	return ret;
> }
> EXPORT_SYMBOL(__request_module);
>-- 
>2.10.1
>

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 10/10] kmod: add a sanity check on module loading
  2016-12-21 13:08                 ` Luis R. Rodriguez
@ 2017-01-03  0:04                   ` Rusty Russell
  2017-01-06 20:36                     ` Luis R. Rodriguez
  2017-01-06 21:03                     ` Jessica Yu
  0 siblings, 2 replies; 65+ messages in thread
From: Rusty Russell @ 2017-01-03  0:04 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Filipe Manana, Paul E. McKenney, linux-doc, rgoldwyn, hare,
	Jonathan Corbet, Linus Torvalds, linux-kselftest, Andrew Morton,
	Dan Williams, Aaron Tomlin, rwright, Heinrich Schuchardt,
	Michal Marek, martin.wilck, Jeff Mahoney, Ingo Molnar,
	Petr Mladek, Dmitry Torokhov, Guenter Roeck, Eric W. Biederman,
	shuah, DSterba, Kees Cook, Josh Poimboeuf,
	Arnaldo Carvalho de Melo, Miroslav Benes, NeilBrown,
	linux-kernel@vger.kernel.o rg, David Miller, Jessica Yu,
	Subash Abhinov Kasiviswanathan, Julia Lawall

"Luis R. Rodriguez" <mcgrof@kernel.org> writes:
>> Maybe a similar hack for try_then_request_module(), but many places seem
>> to open-code request_module() so it's not as trivial...

Hi Luis, Jessica (who is the main module maintainer now),

        Back from break, sorry about delay.

> Right, out of ~350 request_module() calls (not included try requests)
> only ~46 check the return value. Hence a validation check, and come to
> think of it, *this* was the issue that originally had me believing
> that in some places we might end up in a null deref --if those open
> coded request_module() calls assume the driver is loaded there could
> be many places where a NULL is inevitable.

Yes, assuming success == module loade is simply a bug.  I wrote
try_then_request_module() to attempt to encapsulate the correct logic
into a single place; maybe we need other helpers to cover (most of?) the
remaining cases?

> Granted, I agree they
> should be fixed, we could add a grammar rule to start nagging at
> driver developers for started, but it does beg the question also of
> what a tightly knit validation for modprobe might look like, and hence
> this patch and now the completed not-yet-posted alias work.

I really think aliases-in-kernel is too heavy a hammer, but a warning
when modprobe "succeeds" and the module still isn't found would be
a Good Thing.

> Would it be worthy as a kconfig kmod debugging aide for now? I can
> follow up with a semantic patch to nag about checking the return value
> of request_module(), and we can  have 0-day then also complain about
> new invalid uses.

Yeah, a warning about this would be win for sure.

BTW, I wrote the original "check-for-module-before-loading" in
module-init-tools, but I'm starting to wonder if it was a premature
optimization.  Have you thought about simply removing it and always
trying to load the module?  If it doesn't slow things down, perhaps
simplicity FTW?

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: kmod: add a sanity check on module loading
  2016-12-08 19:49 ` [RFC 10/10] kmod: add a sanity check on module loading Luis R. Rodriguez
  2016-12-09 20:03   ` Martin Wilck
  2016-12-15  0:27   ` Rusty Russell
@ 2017-01-04  2:47   ` Jessica Yu
  2 siblings, 0 replies; 65+ messages in thread
From: Jessica Yu @ 2017-01-04  2:47 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, rusty, ebiederm, dmitry.torokhov, acme, corbet,
	martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel

+++ Luis R. Rodriguez [08/12/16 11:49 -0800]:
>kmod has an optimization in place whereby if a some kernel code
>uses request_module() on a module already loaded we never bother
>userspace as the module already is loaded. This is not true for
>get_fs_type() though as it uses aliases.
>
>Additionally kmod <= v19 was broken -- it returns 0 to modprobe calls,
>assuming the kernel module is built-in, where really we have a race as
>the module starts forming. kmod <= v19 has incorrect userspace heuristics,
>a userspace kmod fix is available for it:
>
>http://git.kernel.org/cgit/utils/kernel/kmod/kmod.git/commit/libkmod/libkmod-module.c?id=fd44a98ae2eb5eb32161088954ab21e58e19dfc4
>
>This changes kmod to address both:
>
> o Provides the alias optimization for get_fs_type() so modules already
>   loaded do not get re-requested.
>
> o Provides a sanity test to verify modprobe's work
>
>This is important given how any get_fs_type() users assert success
>means we're ready to go, and tests with the new test_kmod stress driver
>reveal that request_module() and get_fs_type() might fail for a few
>other reasons. You don't need old kmod to fail on request_module() or
>get_fs_type(), with the right system setup, these calls *can* fail
>today.
>
>Although this does get us in the business of keeping alias maps in
>kernel, the the work to support and maintain this is trivial.
>Aditionally, since it may be important get_fs_type() should not fail on
>certain systems, this tightens things up a bit more.
>
>The TL;DR:
>
>kmod <= v19 will return 0 on modprobe calls if you are built-in,
>however its heuristics for checking if you are built-in were broken.
>
>It assumed that having the directory /sys/module/module-name
>but not having the file /sys/module/module-name/initstate
>is sufficient to assume a module is built-in.
>
>The kernel loads the inittstate attribute *after* it creates the
>directory. This is an issue when modprobe returns 0 for kernel calls
>which assumes a return of 0 on request_module() can give you the
>right to assert the module is loaded and live.
>
>We cannot trust returns of modprobe as 0 in the kernel, we need to
>verify that modules are live if modprobe return 0 but only if modules
>*are* modules. The kernel heuristic we use to determine if a module is
>built-in is that if modprobe returns 0 we know we must be built-in or
>a module, but if we are a module clearly we must have a lingering kmod
>dangling on our linked list. If there is no modules there we are *somewhat*
>certain the module must be built in.
>
>This is not enough though... we cannot easily work around this since the
>kernel can use aliases to userspace for modules calls. For instance
>fs/namespace.c uses fs-modulename for filesystesms on get_fs_type(), so
>these need to be taken into consideration as well.
>
>Using kmod <= 19 will give you a NULL get_fs_type() return even though
>the module was loaded... That is a corner case, there are other failures
>for request_module() though -- the other failures are not easy to
>reproduce though but fortunately we have a stress test driver to help
>with that now. Use the following tests:
>
> # tools/testing/selftests/kmod/kmod.sh -t 0008
> # tools/testing/selftests/kmod/kmod.sh -t 0009
>
>You can more easily see this error if you have kmod <= v19 installed.
>
>You will need to install kmod <= v19, be sure to install its modprobe
>into /sbin/ as by default the 'make install' target does not replace
>your own.
>
>This test helps cure test_kmod cases 0008 0009 so enable them.
>
>Reported-by: Martin Wilck <martin.wilck@suse.com>
>Reported-by: Randy Wright <rwright@hpe.com>
>Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>

Back from travel today, apologies for the delay. Will be able to give
this a proper look this week.

Jessica

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 10/10] kmod: add a sanity check on module loading
  2017-01-03  0:04                   ` Rusty Russell
@ 2017-01-06 20:36                     ` Luis R. Rodriguez
  2017-01-06 21:53                       ` Jessica Yu
       [not found]                       ` <87bmvgax51.fsf@rustcorp.com.au>
  2017-01-06 21:03                     ` Jessica Yu
  1 sibling, 2 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2017-01-06 20:36 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Luis R. Rodriguez, Filipe Manana, Paul E. McKenney, linux-doc,
	rgoldwyn, hare, Jonathan Corbet, Linus Torvalds, linux-kselftest,
	Andrew Morton, Dan Williams, Aaron Tomlin, rwright,
	Heinrich Schuchardt, Michal Marek, martin.wilck, Jeff Mahoney,
	Ingo Molnar, Petr Mladek, Dmitry Torokhov, Guenter Roeck,
	Eric W. Biederman, shuah, DSterba, Kees Cook, Josh Poimboeuf,
	Arnaldo Carvalho de Melo, Miroslav Benes, NeilBrown,
	linux-kernel@vger.kernel.o rg, David Miller, Jessica Yu,
	Subash Abhinov Kasiviswanathan, Julia Lawall

On Tue, Jan 03, 2017 at 10:34:53AM +1030, Rusty Russell wrote:
> "Luis R. Rodriguez" <mcgrof@kernel.org> writes:
> > Right, out of ~350 request_module() calls (not included try requests)
> > only ~46 check the return value. Hence a validation check, and come to
> > think of it, *this* was the issue that originally had me believing
> > that in some places we might end up in a null deref --if those open
> > coded request_module() calls assume the driver is loaded there could
> > be many places where a NULL is inevitable.
> 
> Yes, assuming success == module loade is simply a bug.  I wrote
> try_then_request_module() to attempt to encapsulate the correct logic
> into a single place; maybe we need other helpers to cover (most of?) the
> remaining cases?

I see...

OK so indeed we have a few possible changes to kernel given the above:

a) Add SmPL rule to nag about incorrect uses of request_module() which
   never check for the return value, and fix 86% of calls (304 call sites)
   which are buggy

b) Add a new API call, perhaps request_module_assert() which would
   BUG_ON() if the requested module didn't load, and change the callers
   which do not check for the return value to this.

Make request_module() do the assert and changing all proper callers of
request_module() to a new API call which *does* let you check for the
return value is another option but tasteless.

b) seems to be what you allude to, and while it may seem also of bad taste,
in practice it may be hard to get callers to properly check for the return
value. I actually just favor a) even though its more work.

> > Granted, I agree they
> > should be fixed, we could add a grammar rule to start nagging at
> > driver developers for started, but it does beg the question also of
> > what a tightly knit validation for modprobe might look like, and hence
> > this patch and now the completed not-yet-posted alias work.
> 
> I really think aliases-in-kernel is too heavy a hammer, but a warning
> when modprobe "succeeds" and the module still isn't found would be
> a Good Thing.

OK -- such a warning can really only happen if we had alias support though.
So one option is to add this and alias parsing support as a debug option.

> > Would it be worthy as a kconfig kmod debugging aide for now? I can
> > follow up with a semantic patch to nag about checking the return value
> > of request_module(), and we can  have 0-day then also complain about
> > new invalid uses.
> 
> Yeah, a warning about this would be win for sure.

OK will work on such SmPL patch into the next patch series for this patch set.

> BTW, I wrote the original "check-for-module-before-loading" in
> module-init-tools, but I'm starting to wonder if it was a premature
> optimization.  Have you thought about simply removing it and always
> trying to load the module?  If it doesn't slow things down, perhaps
> simplicity FTW?

I've given this some thought as I tried to blow up request_module() with
the new kmod stress test driver and given the small changes I made -- I'm of the
mind set it should be based on numbers: if a change improves the time it takes
to load modules while also not regressing all the other test cases then we 
should go with it. The only issue is we don't yet have enough test cases
to cover the typical distribution setup: load tons of modules, and only
sometimes try to load a few of the same modules.

The early module-init-tools check seems fair gain to me given a bounce back to
the kernel and back to userspace should incur a bit more work than just checking
for a few files on the filesystem. As I noted though, I can't prove this for most
cases for now, but its a hunch.

So I'd advocate leaving the "check-for-module-before-loading" on kmod for now.

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: kmod: provide wrappers for kmod_concurrent inc/dec
  2016-12-22  4:48           ` Jessica Yu
@ 2017-01-06 20:54             ` Luis R. Rodriguez
  0 siblings, 0 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2017-01-06 20:54 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Luis R. Rodriguez, Petr Mladek, Kees Cook, shuah, Rusty Russell,
	Eric W. Biederman, Dmitry Torokhov, Arnaldo Carvalho de Melo,
	Jonathan Corbet, martin.wilck, Michal Marek, hare, rwright,
	Jeff Mahoney, DSterba, fdmanana, neilb, Guenter Roeck, rgoldwyn,
	subashab, Heinrich Schuchardt, Aaron Tomlin, mbenes,
	Paul E. McKenney, Dan Williams, Josh Poimboeuf, David S. Miller,
	Ingo Molnar, Andrew Morton, Linus Torvalds, linux-kselftest,
	linux-doc, LKML

On Wed, Dec 21, 2016 at 08:48:06PM -0800, Jessica Yu wrote:
> +++ Luis R. Rodriguez [16/12/16 09:05 +0100]:
> > On Thu, Dec 15, 2016 at 01:46:25PM +0100, Petr Mladek wrote:
> > > On Thu 2016-12-08 22:08:59, Luis R. Rodriguez wrote:
> > > > On Thu, Dec 08, 2016 at 12:29:42PM -0800, Kees Cook wrote:
> > > > > On Thu, Dec 8, 2016 at 11:48 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > > > > kmod_concurrent is used as an atomic counter for enabling
> > > > > > the allowed limit of modprobe calls, provide wrappers for it
> > > > > > to enable this to be expanded on more easily. This will be done
> > > > > > later.
> > > > > >
> > > > > > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > > > > > ---
> > > > > >  kernel/kmod.c | 27 +++++++++++++++++++++------
> > > > > >  1 file changed, 21 insertions(+), 6 deletions(-)
> > > > > >
> > > > > > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > > > > > index cb6f7ca7b8a5..049d7eabda38 100644
> > > > > > --- a/kernel/kmod.c
> > > > > > +++ b/kernel/kmod.c
> > > > > > @@ -108,6 +111,20 @@ static int call_modprobe(char *module_name, int wait)
> > > > > >         return -ENOMEM;
> > > > > >  }
> > > > > >
> > > > > > +static int kmod_umh_threads_get(void)
> > > > > > +{
> > > > > > +       atomic_inc(&kmod_concurrent);
> > > 
> > > This approach might actually cause false failures. If we
> > > are on the limit and more processes do this increment
> > > in parallel, it makes the number bigger that it should be.
> > 
> > This approach is *exactly* what the existing code does :P
> > I just provided wrappers. I agree with the old approach though,
> > reason is it acts as a lock in for the bump.
> 
> I think what Petr meant was that we could run into false failures when multiple
> atomic increments happen between the first increment and the subsequent
> atomic_read.
> 
> Say max_modprobes is 64 -
> 
>       atomic_inc(&kmod_concurrent); // thread 1: kmod_concurrent is 63
>            atomic_inc(&kmod_concurrent); // thread 2: kmod_concurrent is 64
>                 atomic_inc(&kmod_concurrent); // thread 3: kmod_concurrent is 65
>       if (atomic_read(&kmod_concurrent) < max_modprobes) // if all threads read 65 here, then all will error out
>               return 0;                                  // when the first two should have succeeded (false failures)
>       atomic_dec(&kmod_concurrent);
>       return -ENOMEM;
> 
> But yeah, I think this issue was already in the existing kmod code..

Ah right, but the code was very simple and there is only one operation
in between which we'd race against given the old code just incremented
first nd immediately checked for the limit. The more code we have the
more chances for what you describe to happen.

I've added another change into my series, a clutch, its at the end of this
email. With this we change we check for the limit right away and put on
hold any items reaching the limit, while other requests passing the limit
will be bumped. We have then:
 
        if (!kmod_concurrent_sane()) {                                          
                pr_warn_ratelimited("request_module: kmod_concurrent (%u) close to critical levels (max_modprobes: %u) for module %s\n, backing off for a bit",
                                    atomic_read(&kmod_concurrent), max_modprobes, module_name);
                wait_event_interruptible(kmod_wq, kmod_concurrent_sane());      
        }                                                                       
                                                                                
        ret = kmod_umh_threads_get();                                           
        if (ret) {                                                              
                pr_err_ratelimited("%s: module \"%s\" reached limit (%u) of concurrent modprobe calls\n",
                                   __func__, module_name, max_modprobes);       
                return ret;                                                     
        }  

The same race you describe is possible -- but we now would at least use
a clutch immediately as we approach the limit. Maybe it makes sense to
post a new series after I fold the alias code and sanity check into a
debug kconfig option ?

  Luis

commit 95c55552283cf99e2a48b84dc766d5fa547f046e
Author: Luis R. Rodriguez <mcgrof@kernel.org>
Date:   Thu Dec 15 23:24:22 2016 -0600

    kmod: add a clutch around 1/4 of modprobe thread limit
    
    If we reach the limit of modprobe_limit threads running the next
    request_module() call will fail. The original reason for adding
    a kill was to do away with possible issues with in old circumstances
    which would create a recursive series of request_module() calls.
    We can do better than just be super aggressive and reject calls
    once we've reached the limit by adding a clutch so that if we're
    1/4th of the way close to the limit we make these new calls wait
    until pending threads complete.
    
    There is still a chance you can fail new incomming requests which
    can bump kmod_concurrent beyond the limit, however the clutch helps
    with a bit of breathing room to allow the system to process pending
    requests before activating the upper last 1/4th of the limit requests.
    
    This fixes test cases 0008 and 0009 of the selftest for kmod:
    
    tools/testing/selftests/kmod/kmod.sh -t 0008
    tools/testing/selftests/kmod/kmod.sh -t 0009
    
    Both tests reveal the clutch in action:
    
    Dec 15 16:12:14 piggy kernel: request_module: kmod_concurrent (96) close critical levels (max_modprobes: 128) for module test_module
    ...
    Dec 15 16:12:23 piggy kernel: request_module: kmod_concurrent (96) close critical levels (max_modprobes: 128) for module test_module
    ...
    
    The only difference is the clutch helps with avoiding making
    request_module() requests fatal more often. With x86_64 qemu,
    with 4 cores, 4 GiB of RAM it takes the following run time to
    run both tests:
    
    time kmod.sh -t 0008
    real    0m22.247s
    user    0m0.084s
    sys     0m11.328s
    
    time kmod.sh -t 0009
    real    0m58.785s
    user    0m0.492s
    sys     0m10.852s
    
    Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>

diff --git a/kernel/kmod.c b/kernel/kmod.c
index d6595d2de209..f8c880bbf658 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -58,6 +58,7 @@ static DECLARE_RWSEM(umhelper_sem);
 
 #ifdef CONFIG_MODULES
 static atomic_t kmod_concurrent = ATOMIC_INIT(0);
+static DECLARE_WAIT_QUEUE_HEAD(kmod_wq);
 
 /*
 	modprobe_path is set via /proc/sys.
@@ -156,6 +157,16 @@ int get_kmod_umh_count(void)
 	return atomic_read(&kmod_concurrent);
 }
 
+static bool kmod_concurrent_sane(void)
+{
+	unsigned int clutch;
+
+	clutch = get_kmod_umh_limit() - (get_kmod_umh_limit()/4);
+	if (get_kmod_umh_count() < clutch)
+		return true;
+	return false;
+}
+
 /**
  * __request_module - try to load a kernel module
  * @wait: wait (or not) for the operation to complete
@@ -199,6 +210,12 @@ int __request_module(bool wait, const char *fmt, ...)
 	if (ret)
 		return ret;
 
+	if (!kmod_concurrent_sane()) {
+		pr_warn_ratelimited("request_module: kmod_concurrent (%u) close to critical levels (max_modprobes: %u) for module %s\n, backing off for a bit",
+				    atomic_read(&kmod_concurrent), max_modprobes, module_name);
+		wait_event_interruptible(kmod_wq, kmod_concurrent_sane());
+	}
+
 	ret = kmod_umh_threads_get();
 	if (ret) {
 		pr_err_ratelimited("%s: module \"%s\" reached limit (%u) of concurrent modprobe calls\n",
@@ -211,6 +228,7 @@ int __request_module(bool wait, const char *fmt, ...)
 	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
 
 	kmod_umh_threads_put();
+	wake_up_all(&kmod_wq);
 	return ret;
 }
 EXPORT_SYMBOL(__request_module);
diff --git a/tools/testing/selftests/kmod/kmod.sh b/tools/testing/selftests/kmod/kmod.sh
index f8ccc938e0fb..08d9bea4bade 100755
--- a/tools/testing/selftests/kmod/kmod.sh
+++ b/tools/testing/selftests/kmod/kmod.sh
@@ -52,28 +52,8 @@ ALL_TESTS="$ALL_TESTS 0004:1:1"
 ALL_TESTS="$ALL_TESTS 0005:10:1"
 ALL_TESTS="$ALL_TESTS 0006:10:1"
 ALL_TESTS="$ALL_TESTS 0007:5:1"
-
-# Disabled tests:
-#
-# 0008 x 150 -  multithreaded - push kmod_concurrent over max_modprobes for request_module()"
-# Current best-effort failure interpretation:
-# Enough module requests get loaded in place fast enough to reach over the
-# max_modprobes limit and trigger a failure -- before we're even able to
-# start processing pending requests.
-ALL_TESTS="$ALL_TESTS 0008:150:0"
-
-# 0009 x 150 - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()"
-# Current best-effort failure interpretation:
-#
-# get_fs_type() requests modules using aliases as such the optimization in
-# place today to look for already loaded modules will not take effect and
-# we end up requesting a new module to load, this bumps the kmod_concurrent,
-# and in certain circumstances can lead to pushing the kmod_concurrent over
-# the max_modprobe limit.
-#
-# This test fails much easier than test 0008 since the alias optimizations
-# are not in place.
-ALL_TESTS="$ALL_TESTS 0009:150:0"
+ALL_TESTS="$ALL_TESTS 0008:150:1"
+ALL_TESTS="$ALL_TESTS 0009:150:1"
 
 test_modprobe()
 {

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: kmod: add a sanity check on module loading
  2017-01-03  0:04                   ` Rusty Russell
  2017-01-06 20:36                     ` Luis R. Rodriguez
@ 2017-01-06 21:03                     ` Jessica Yu
  1 sibling, 0 replies; 65+ messages in thread
From: Jessica Yu @ 2017-01-06 21:03 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Luis R. Rodriguez, Filipe Manana, Paul E. McKenney, linux-doc,
	rgoldwyn, hare, Jonathan Corbet, Linus Torvalds, linux-kselftest,
	Andrew Morton, Dan Williams, Aaron Tomlin, rwright,
	Heinrich Schuchardt, Michal Marek, martin.wilck, Jeff Mahoney,
	Ingo Molnar, Petr Mladek, Dmitry Torokhov, Guenter Roeck,
	Eric W. Biederman, shuah, DSterba, Kees Cook, Josh Poimboeuf,
	Arnaldo Carvalho de Melo, Miroslav Benes, NeilBrown,
	linux-kernel@vger.kernel.o rg, David Miller,
	Subash Abhinov Kasiviswanathan, Julia Lawall

+++ Rusty Russell [03/01/17 10:34 +1030]:
>"Luis R. Rodriguez" <mcgrof@kernel.org> writes:
>>> Maybe a similar hack for try_then_request_module(), but many places seem
>>> to open-code request_module() so it's not as trivial...
>
>Hi Luis, Jessica (who is the main module maintainer now),
>
>        Back from break, sorry about delay.
>
>> Right, out of ~350 request_module() calls (not included try requests)
>> only ~46 check the return value. Hence a validation check, and come to
>> think of it, *this* was the issue that originally had me believing
>> that in some places we might end up in a null deref --if those open
>> coded request_module() calls assume the driver is loaded there could
>> be many places where a NULL is inevitable.
>
>Yes, assuming success == module loade is simply a bug.  I wrote
>try_then_request_module() to attempt to encapsulate the correct logic
>into a single place; maybe we need other helpers to cover (most of?) the
>remaining cases?
>
>> Granted, I agree they
>> should be fixed, we could add a grammar rule to start nagging at
>> driver developers for started, but it does beg the question also of
>> what a tightly knit validation for modprobe might look like, and hence
>> this patch and now the completed not-yet-posted alias work.
>
>I really think aliases-in-kernel is too heavy a hammer, but a warning
>when modprobe "succeeds" and the module still isn't found would be
>a Good Thing.

I was under the impression that aliases were a userspace concern. i.e., we let
kmod tools take care of alias resolution and bookkeeping. I'm getting the
feeling we're bending over backwards here to accommodate buggy/untrustworthy
userspace (modprobe). If I understand correctly, we're performing this
validation work - we're proposing to make the kernel alias-aware - because we
can't even trust modprobe's return value, and the proposal is to double check
this work ourselves in-kernel.

But I thought that request_module() wasn't written to provide these "module is
now live and loaded" guarantees in the first place. This seems to be documented
in kernel/kmod.c - "Callers must check that the service they requested is now
available not blindly invoke it." Isn't it the caller's responsibility to
(indirectly) validate request_module's work, to check that the service they want is
now there? If a caller doesn't do this, then this is a bug on their side. If it
is crucial for get_fs_type() to not fail, then perhaps we should be tightening
get_fs_type() instead, be that WARNing if the requested filesystem is still not
there (as suggested earlier), or maybe even trying the request again.

>> Would it be worthy as a kconfig kmod debugging aide for now? I can
>> follow up with a semantic patch to nag about checking the return value
>> of request_module(), and we can  have 0-day then also complain about
>> new invalid uses.
>
>Yeah, a warning about this would be win for sure.
>
>BTW, I wrote the original "check-for-module-before-loading" in
>module-init-tools, but I'm starting to wonder if it was a premature
>optimization.  Have you thought about simply removing it and always
>trying to load the module?  If it doesn't slow things down, perhaps
>simplicity FTW?
>
>Thanks,
>Rusty.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: kmod: add a sanity check on module loading
  2017-01-06 20:36                     ` Luis R. Rodriguez
@ 2017-01-06 21:53                       ` Jessica Yu
  2017-01-09 20:27                         ` Luis R. Rodriguez
       [not found]                       ` <87bmvgax51.fsf@rustcorp.com.au>
  1 sibling, 1 reply; 65+ messages in thread
From: Jessica Yu @ 2017-01-06 21:53 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Rusty Russell, Filipe Manana, Paul E. McKenney, linux-doc,
	rgoldwyn, hare, Jonathan Corbet, Linus Torvalds, linux-kselftest,
	Andrew Morton, Dan Williams, Aaron Tomlin, rwright,
	Heinrich Schuchardt, Michal Marek, martin.wilck, Jeff Mahoney,
	Ingo Molnar, Petr Mladek, Dmitry Torokhov, Guenter Roeck,
	Eric W. Biederman, shuah, DSterba, Kees Cook, Josh Poimboeuf,
	Arnaldo Carvalho de Melo, Miroslav Benes, NeilBrown,
	linux-kernel@vger.kernel.o rg, David Miller,
	Subash Abhinov Kasiviswanathan, Julia Lawall

+++ Luis R. Rodriguez [06/01/17 21:36 +0100]:
>On Tue, Jan 03, 2017 at 10:34:53AM +1030, Rusty Russell wrote:
>> "Luis R. Rodriguez" <mcgrof@kernel.org> writes:
>> > Right, out of ~350 request_module() calls (not included try requests)
>> > only ~46 check the return value. Hence a validation check, and come to
>> > think of it, *this* was the issue that originally had me believing
>> > that in some places we might end up in a null deref --if those open
>> > coded request_module() calls assume the driver is loaded there could
>> > be many places where a NULL is inevitable.
>>
>> Yes, assuming success == module loade is simply a bug.  I wrote
>> try_then_request_module() to attempt to encapsulate the correct logic
>> into a single place; maybe we need other helpers to cover (most of?) the
>> remaining cases?
>
>I see...
>
>OK so indeed we have a few possible changes to kernel given the above:
>
>a) Add SmPL rule to nag about incorrect uses of request_module() which
>   never check for the return value, and fix 86% of calls (304 call sites)
>   which are buggy
>
>b) Add a new API call, perhaps request_module_assert() which would
>   BUG_ON() if the requested module didn't load, and change the callers
>   which do not check for the return value to this.

It is probably not a good idea to panic/BUG() because a requested
module didn't load. IMO callers should already be accounting for the
fact that request_module() doesn't provide these guarantees. I haven't
looked yet to see if the majority of these callers actually do the the
responsible thing, though.

>Make request_module() do the assert and changing all proper callers of
>request_module() to a new API call which *does* let you check for the
>return value is another option but tasteless.
>
>b) seems to be what you allude to, and while it may seem also of bad taste,
>in practice it may be hard to get callers to properly check for the return
>value. I actually just favor a) even though its more work.
>
>> > Granted, I agree they
>> > should be fixed, we could add a grammar rule to start nagging at
>> > driver developers for started, but it does beg the question also of
>> > what a tightly knit validation for modprobe might look like, and hence
>> > this patch and now the completed not-yet-posted alias work.
>>
>> I really think aliases-in-kernel is too heavy a hammer, but a warning
>> when modprobe "succeeds" and the module still isn't found would be
>> a Good Thing.
>
>OK -- such a warning can really only happen if we had alias support though.
>So one option is to add this and alias parsing support as a debug option.

Hm, I see what you're saying..

To clarify the problem (if anyone was confused, as I was..): we can
verify a module is loaded by using find_module_all() and looking at
its state. However, find_module_all() operates on real module names,
and we can't verify a module has successfully loaded if all we have is
the name of the alias (eg, "fs-*" aliases in get_fs_type), because we
have no alias->real_module_name mappings in the kernel.

However, in Rusty's sample get_fs_type WARN() code, we indirectly
validated request_module()'s work by verifying that the
file_system_type has actually registered, which is what should happen
if a filesystem module successfully loads. So in this case, the caller
(get_fs_type) indirectly checks if the service it requested is now
available, which is what I *thought* callers were supposed to do in
the first place (and we didn't need the help of aliases to do that).
I think the main question we have to answer is, should the burden of
validation be on the callers, or on request_module? I am currently
leaning towards the former, but I'm still thinking.

>> > Would it be worthy as a kconfig kmod debugging aide for now? I can
>> > follow up with a semantic patch to nag about checking the return value
>> > of request_module(), and we can  have 0-day then also complain about
>> > new invalid uses.
>>
>> Yeah, a warning about this would be win for sure.
>
>OK will work on such SmPL patch into the next patch series for this patch set.
>
>> BTW, I wrote the original "check-for-module-before-loading" in
>> module-init-tools, but I'm starting to wonder if it was a premature
>> optimization.  Have you thought about simply removing it and always
>> trying to load the module?  If it doesn't slow things down, perhaps
>> simplicity FTW?
>
>I've given this some thought as I tried to blow up request_module() with
>the new kmod stress test driver and given the small changes I made -- I'm of the
>mind set it should be based on numbers: if a change improves the time it takes
>to load modules while also not regressing all the other test cases then we
>should go with it. The only issue is we don't yet have enough test cases
>to cover the typical distribution setup: load tons of modules, and only
>sometimes try to load a few of the same modules.
>
>The early module-init-tools check seems fair gain to me given a bounce back to
>the kernel and back to userspace should incur a bit more work than just checking
>for a few files on the filesystem. As I noted though, I can't prove this for most
>cases for now, but its a hunch.
>
>So I'd advocate leaving the "check-for-module-before-loading" on kmod for now.
>
>  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 10/10] kmod: add a sanity check on module loading
       [not found]                       ` <87bmvgax51.fsf@rustcorp.com.au>
@ 2017-01-09 19:56                         ` Luis R. Rodriguez
  0 siblings, 0 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2017-01-09 19:56 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Luis R. Rodriguez, Filipe Manana, Paul E. McKenney, linux-doc,
	rgoldwyn, hare, Jonathan Corbet, Linus Torvalds, linux-kselftest,
	Andrew Morton, Dan Williams, Aaron Tomlin, rwright,
	Heinrich Schuchardt, Michal Marek, martin.wilck, Jeff Mahoney,
	Ingo Molnar, Petr Mladek, Dmitry Torokhov, Guenter Roeck,
	Eric W. Biederman, shuah, DSterba, Kees Cook, Josh Poimboeuf,
	Arnaldo Carvalho de Melo, Miroslav Benes, NeilBrown,
	linux-kernel@vger.kernel.o rg, David Miller, Jessica Yu,
	Subash Abhinov Kasiviswanathan, Julia Lawall

On Tue, Jan 10, 2017 at 05:17:22AM +1030, Rusty Russell wrote:
> "Luis R. Rodriguez" <mcgrof@kernel.org> writes:
> > On Tue, Jan 03, 2017 at 10:34:53AM +1030, Rusty Russell wrote:
> >> "Luis R. Rodriguez" <mcgrof@kernel.org> writes:
> >> > Right, out of ~350 request_module() calls (not included try requests)
> >> > only ~46 check the return value. Hence a validation check, and come to
> >> > think of it, *this* was the issue that originally had me believing
> >> > that in some places we might end up in a null deref --if those open
> >> > coded request_module() calls assume the driver is loaded there could
> >> > be many places where a NULL is inevitable.
> >> 
> >> Yes, assuming success == module loade is simply a bug.  I wrote
> >> try_then_request_module() to attempt to encapsulate the correct logic
> >> into a single place; maybe we need other helpers to cover (most of?) the
> >> remaining cases?
> >
> > I see...
> >
> > OK so indeed we have a few possible changes to kernel given the above:
> >
> > a) Add SmPL rule to nag about incorrect uses of request_module() which
> >    never check for the return value, and fix 86% of calls (304 call sites)
> >    which are buggy
> 
> Well, checking the return value is merely an optimization.  The bug
> is not re-checking for registrations and *assuming existence*.

An optimization, I see.. I was going with using the return value from
request_module() -- clearly that is not proper form. Might as well make this
void ?

> I glanced through the first 100, and they're fine.  You are supposed to
> do "request_module()" then "re-check if it's there", and that seems to
> the pattern.

OK I then understand now why you added try_then_request_module() and hinted
to more similar forms. If try_then_request_module() was capturing the
required effort properly then I will note that my grammar rule now finds
one invalid use on drivers/media/usb/as102/as102_drv.c, although its
use seems invalid though the module is loaded for firmware loading
purposes and it seems that is optional at that point in time as such
it does not seem invalid. Not sure if its worth adding a separate API
call for this to annotate its fine to ignore the return value.

One thing I am sure of at this point though is that the loose required
checks for proper form makes it pretty hard to validate the callers.

> > b) Add a new API call, perhaps request_module_assert() which would
> >    BUG_ON() if the requested module didn't load, and change the callers
> >    which do not check for the return value to this.
> >
> > Make request_module() do the assert and changing all proper callers of
> > request_module() to a new API call which *does* let you check for the
> > return value is another option but tasteless.
> >
> > b) seems to be what you allude to, and while it may seem also of bad taste,
> > in practice it may be hard to get callers to properly check for the return
> > value. I actually just favor a) even though its more work.
> 
> No, I meant to look for patterns to see if we could create helpers.  But
> I've revised that, since I don't actually see any problems.
> 
> In fact, you've yet to identify a single problem user.

Indeed, its actually hard to verify proper "form" for this API given
how different each caller verifies the requested module is present or
loaded.

> >> BTW, I wrote the original "check-for-module-before-loading" in
> >> module-init-tools, but I'm starting to wonder if it was a premature
> >> optimization.  Have you thought about simply removing it and always
> >> trying to load the module?  If it doesn't slow things down, perhaps
> >> simplicity FTW?
> >
> > I've given this some thought as I tried to blow up request_module() with
> > the new kmod stress test driver and given the small changes I made -- I'm of the
> > mind set it should be based on numbers: if a change improves the time it takes
> > to load modules while also not regressing all the other test cases then we 
> > should go with it. The only issue is we don't yet have enough test cases
> > to cover the typical distribution setup: load tons of modules, and only
> > sometimes try to load a few of the same modules.
> 
> Just benchmark boot time.  That's a pretty good test.

Alright.

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: kmod: add a sanity check on module loading
  2017-01-06 21:53                       ` Jessica Yu
@ 2017-01-09 20:27                         ` Luis R. Rodriguez
  0 siblings, 0 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2017-01-09 20:27 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Luis R. Rodriguez, Rusty Russell, Filipe Manana,
	Paul E. McKenney, linux-doc, rgoldwyn, hare, Jonathan Corbet,
	Linus Torvalds, linux-kselftest, Andrew Morton, Dan Williams,
	Aaron Tomlin, rwright, Heinrich Schuchardt, Michal Marek,
	martin.wilck, Jeff Mahoney, Ingo Molnar, Petr Mladek,
	Dmitry Torokhov, Guenter Roeck, Eric W. Biederman, shuah,
	DSterba, Kees Cook, Josh Poimboeuf, Arnaldo Carvalho de Melo,
	Miroslav Benes, NeilBrown, linux-kernel@vger.kernel.o rg,
	David Miller, Subash Abhinov Kasiviswanathan, Julia Lawall

On Fri, Jan 06, 2017 at 04:53:54PM -0500, Jessica Yu wrote:
> +++ Luis R. Rodriguez [06/01/17 21:36 +0100]:
> > On Tue, Jan 03, 2017 at 10:34:53AM +1030, Rusty Russell wrote:
> > > "Luis R. Rodriguez" <mcgrof@kernel.org> writes:
> > > > Right, out of ~350 request_module() calls (not included try requests)
> > > > only ~46 check the return value. Hence a validation check, and come to
> > > > think of it, *this* was the issue that originally had me believing
> > > > that in some places we might end up in a null deref --if those open
> > > > coded request_module() calls assume the driver is loaded there could
> > > > be many places where a NULL is inevitable.
> > > 
> > > Yes, assuming success == module loade is simply a bug.  I wrote
> > > try_then_request_module() to attempt to encapsulate the correct logic
> > > into a single place; maybe we need other helpers to cover (most of?) the
> > > remaining cases?
> > 
> > I see...
> > 
> > OK so indeed we have a few possible changes to kernel given the above:
> > 
> > a) Add SmPL rule to nag about incorrect uses of request_module() which
> >   never check for the return value, and fix 86% of calls (304 call sites)
> >   which are buggy
> > 
> > b) Add a new API call, perhaps request_module_assert() which would
> >   BUG_ON() if the requested module didn't load, and change the callers
> >   which do not check for the return value to this.
> 
> It is probably not a good idea to panic/BUG() because a requested
> module didn't load. IMO callers should already be accounting for the
> fact that request_module() doesn't provide these guarantees. I haven't
> looked yet to see if the majority of these callers actually do the the
> responsible thing, though.

It seems proper form is hard to vet for, and the return value actually
doesn't really give us much useful information.

> > Make request_module() do the assert and changing all proper callers of
> > request_module() to a new API call which *does* let you check for the
> > return value is another option but tasteless.
> > 
> > b) seems to be what you allude to, and while it may seem also of bad taste,
> > in practice it may be hard to get callers to properly check for the return
> > value. I actually just favor a) even though its more work.
> > 
> > > > Granted, I agree they
> > > > should be fixed, we could add a grammar rule to start nagging at
> > > > driver developers for started, but it does beg the question also of
> > > > what a tightly knit validation for modprobe might look like, and hence
> > > > this patch and now the completed not-yet-posted alias work.
> > > 
> > > I really think aliases-in-kernel is too heavy a hammer, but a warning
> > > when modprobe "succeeds" and the module still isn't found would be
> > > a Good Thing.
> > 
> > OK -- such a warning can really only happen if we had alias support though.
> > So one option is to add this and alias parsing support as a debug option.
> 
> Hm, I see what you're saying..
> 
> To clarify the problem (if anyone was confused, as I was..): we can
> verify a module is loaded by using find_module_all() and looking at
> its state. However, find_module_all() operates on real module names,
> and we can't verify a module has successfully loaded if all we have is
> the name of the alias (eg, "fs-*" aliases in get_fs_type), because we
> have no alias->real_module_name mappings in the kernel.

Yup!

> However, in Rusty's sample get_fs_type WARN() code, we indirectly
> validated request_module()'s work by verifying that the
> file_system_type has actually registered, which is what should happen
> if a filesystem module successfully loads. So in this case, the caller
> (get_fs_type) indirectly checks if the service it requested is now
> available, which is what I *thought* callers were supposed to do in
> the first place (and we didn't need the help of aliases to do that).
> I think the main question we have to answer is, should the burden of
> validation be on the callers, or on request_module? I am currently
> leaning towards the former, but I'm still thinking.

Validation check on the caller makes sense *but* what makes this a bit hard
is as I have found, request_module() *call* can fail for some reasons
other than the module not being available on the system -- races, and inherent
design decisions (kmod concurrent). In my patch series I address kmod concurrent
limit to be more graceful, the clutch I mentioned is another addition
to help make failures be less aggressive.

Because some issues can creep up with request_module() -- checking its return
value seems desirable -- but as Rusty notes its currently only seen as
an optimization to check for the return value. Its not really clear what
the best path forward is. Here are a bit of my current thoughts:

  o The debug check Rusty suggested seems fair for upstream get_fs_type() in
    retropsect.

  o Although callers should validate a module was loaded and that should
    in theory suffice to cover most API failures on request_module(),
    once an issue does creep up its rather hard to confirm where an
    issue came from exactly, adding some debug code to aid review on
    issues seems fair and useful.

  o We should stress test the module loader further with more tests and
    fix any other pending issues

If you agree with this the validation code I proposed would just be folded
under a debug Kconfig entry, that would also mean the alias stuff is kept
only under that debug Kconfig.

The kmod stress test driver and small fixes would be sent as the first series.
I'd split off the validation stuff into a separate series to make it clearer.

Not sure on why the return value for request_module() is kept then still though.

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 04/10] kmod: provide wrappers for kmod_concurrent inc/dec
  2016-12-16  8:05         ` Luis R. Rodriguez
  2016-12-22  4:48           ` Jessica Yu
@ 2017-01-10 18:57           ` Luis R. Rodriguez
  2017-01-11 20:08             ` Luis R. Rodriguez
  1 sibling, 1 reply; 65+ messages in thread
From: Luis R. Rodriguez @ 2017-01-10 18:57 UTC (permalink / raw)
  To: Petr Mladek, Kees Cook, Peter Zijlstra
  Cc: mcgrof, shuah, Jessica Yu, Rusty Russell, Eric W. Biederman,
	Dmitry Torokhov, Arnaldo Carvalho de Melo, Jonathan Corbet,
	martin.wilck, Michal Marek, hare, rwright, Jeff Mahoney, DSterba,
	fdmanana, neilb, Guenter Roeck, rgoldwyn, subashab,
	Heinrich Schuchardt, Aaron Tomlin, mbenes, Paul E. McKenney,
	Dan Williams, Josh Poimboeuf, David S. Miller, Ingo Molnar,
	Andrew Morton, Linus Torvalds, linux-kselftest, linux-doc, LKML

On Fri, Dec 16, 2016 at 09:05:00AM +0100, Luis R. Rodriguez wrote:
> On Thu, Dec 15, 2016 at 01:46:25PM +0100, Petr Mladek wrote:
> > On Thu 2016-12-08 22:08:59, Luis R. Rodriguez wrote:
> > > On Thu, Dec 08, 2016 at 12:29:42PM -0800, Kees Cook wrote:
> > > > On Thu, Dec 8, 2016 at 11:48 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > > > +       if (atomic_read(&kmod_concurrent) < max_modprobes)
> > > > > +               return 0;
> > > > > +       atomic_dec(&kmod_concurrent);
> > > > > +       return -ENOMEM;
> > > > > +}
> > > > > +
> > > > > +static void kmod_umh_threads_put(void)
> > > > > +{
> > > > > +       atomic_dec(&kmod_concurrent);
> > > > > +}
> > > > 
> > > > Can you use a kref here instead? We're trying to kill raw use of
> > > > atomic_t for reference counting...
> > > 
> > > That's a much broader functional change than I was looking for, but I am up for
> > > it. Can you describe the benefit of using kref you expect or why this is an
> > > ongoing crusade? Since its a larger functional change how about doing this
> > > change later, and we can test impact with the tress test driver. In theory if
> > > there are benefits can't we add a test case to prove the gains?
> > 
> > Kees probably refers to the kref improvements that Peter Zijlstra
> > is working on, see
> > https://lkml.kernel.org/r/20161114174446.832175072@infradead.org
> > 
> > The advantage is that the new refcount API handles over and
> > underflow.
> > 
> > Another advantage is that it increments/decrements the value
> > only when it is safe. It uses cmpxchg to make sure that
> > the checks are valid.
> 
> Great thanks, will look into that.

OK I've done the conversion now, the only thing is linux-next as of today lacks
KREF_INIT() so I've open coded it for now. Once Peter's changes get merged the
only thing we'dneed is to change the open code line to KREF_INIT().

I'll annotate this as Suggested-by Kees and Petr, I did this as a separate atomic
step after this to make it easier for review.

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 03/10] kmod: add dynamic max concurrent thread count
  2016-12-16  8:39     ` Luis R. Rodriguez
@ 2017-01-10 19:24       ` Luis R. Rodriguez
  0 siblings, 0 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2017-01-10 19:24 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Petr Mladek, shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme,
	corbet, martin.wilck, mmarek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel

On Fri, Dec 16, 2016 at 09:39:56AM +0100, Luis R. Rodriguez wrote:
> On Wed, Dec 14, 2016 at 04:38:27PM +0100, Petr Mladek wrote:
> > On Thu 2016-12-08 11:48:14, Luis R. Rodriguez wrote:
> > > diff --git a/init/Kconfig b/init/Kconfig
> > > index 271692a352f1..da2c25746937 100644
> > > --- a/init/Kconfig
> > > +++ b/init/Kconfig
> > > @@ -2111,6 +2111,29 @@ config TRIM_UNUSED_KSYMS
> > >  
> > >  	  If unsure, or if you need to build out-of-tree modules, say N.
> > >  
> > > +config MAX_KMOD_CONCURRENT
> > > +	int "Max allowed concurrent request_module() calls (6=>64, 10=>1024)"
> > > +	range 0 14
> > 
> > Would not too small range break loading module dependencies?
> 
> No, dependencies are resolved by depmod, so userspace looks at the list and
> just finit_module() the depenencies, skipping kmod. So the limit is
> really only for kernel acting like a boss.
> 
> > I am not sure how it is implemented but it might require having
> > some more module loads in progress.
> 
> Dependencies should be OK, a more serious concern with dependencies is
> the aggregate memory it takes to load all dep modules for one required
> module since finit_module() ends up allocating the struct module to copy
> over data from userspace.

A simple change can enable us to bail out on finit_module() if a module
is already present by looking at the passed userspace data. I have this
change now but as discussed, whether or not its desirable should be a
matter of whether or not in the typical case (bootup time) things improve.
>From some initial tests it would seem this doesn't help much but it does
help with trying to load the same module over and over again, the explanation
I can think of for this is by introducing a lookup on finit_module() we also
delay module loading by the lookup time, in the general case we would not need
this, so this is likely not worth merging. Will run some final tests to
confirm.

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 06/10] kmod: provide sanity check on kmod_concurrent access
  2016-12-15 12:57   ` Petr Mladek
@ 2017-01-10 20:00     ` Luis R. Rodriguez
  0 siblings, 0 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2017-01-10 20:00 UTC (permalink / raw)
  To: Petr Mladek, Peter Zijlstra
  Cc: Luis R. Rodriguez, shuah, jeyu, rusty, ebiederm, dmitry.torokhov,
	acme, corbet, martin.wilck, mmarek, hare, rwright, jeffm,
	DSterba, fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel

On Thu, Dec 15, 2016 at 01:57:48PM +0100, Petr Mladek wrote:
> On Thu 2016-12-08 11:48:50, Luis R. Rodriguez wrote:
> > Only decrement *iff* we're possitive. Warn if we've hit
> > a situation where the counter is already 0 after we're done
> > with a modprobe call, this would tell us we have an unaccounted
> > counter access -- this in theory should not be possible as
> > only one routine controls the counter, however preemption is
> > one case that could trigger this situation. Avoid that situation
> > by disabling preemptiong while we access the counter.
> > 
> > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > ---
> >  kernel/kmod.c | 20 ++++++++++++++++----
> >  1 file changed, 16 insertions(+), 4 deletions(-)
> > 
> > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > index ab38539f7e91..09cf35a2075a 100644
> > --- a/kernel/kmod.c
> > +++ b/kernel/kmod.c
> > @@ -113,16 +113,28 @@ static int call_modprobe(char *module_name, int wait)
> >  
> >  static int kmod_umh_threads_get(void)
> >  {
> > +	int ret = 0;
> > +
> > +	preempt_disable();
> >  	atomic_inc(&kmod_concurrent);
> >  	if (atomic_read(&kmod_concurrent) < max_modprobes)
> > -		return 0;
> > -	atomic_dec(&kmod_concurrent);
> > -	return -EBUSY;
> > +		goto out;
> 
> I though more about it and the disabled preemtion might make
> sense here. It makes sure that we are not rescheduled here
> and that kmod_concurrent is not increased by mistake for too long.

I think its good to add a comment here about this.

> Well, it still would make sense to increment the value
> only when it is under the limit and set the incremented
> value using cmpxchg to avoid races.
> 
> I mean to use similar trick that is used by refcount_inc(), see
> https://lkml.kernel.org/r/20161114174446.832175072@infradead.org

Right, I see now. Since we are converting this to kref though we would
immediately get the advantages of kref_get() using the new refcount_inc() once
that goes in, so I think its best we just sit tight to get that benefit given
as Jessica acknowledged the existing code has has this issue for ages, waiting
a bit longer should not hurt.  The preemption should help in the meantime as
well.

The note I've made then is:

        /*                                                                      
         * Disabling preemption makes sure that we are not rescheduled here.    
         *                                                                      
         * Also preemption helps kmod_concurrent is not increased by mistake    
         * for too long given in theory two concurrent threads could race on    
         * kref_get() before we kref_read().                                    
         *                                                                      
         * XXX: once Peter's refcount_t gets merged kref's kref_get() will use  
         * the new refcount_inc() and then each inc will be atomic with respect 
         * to each thread, as such when Peter's refcount_t gets merged          
         * the above comment "Also preemption ..." can be removed.              
         */  

Come to think of it, once Peter's changes go in at first glance it may seem
preemption would be pointless then but but I think that just mitigates a few
of the refcount_inc() instances where (old != val), that is -- when two threads
got the same bump, so think it can be kept even after Peter's refcount_t work.

> > +	atomic_dec_if_positive(&kmod_concurrent);
> > +	ret = -EBUSY;
> > +out:
> > +	preempt_enable();
> > +	return 0;
> >  }
> >  
> >  static void kmod_umh_threads_put(void)
> >  {
> > -	atomic_dec(&kmod_concurrent);
> > +	int ret;
> > +
> > +	preempt_disable();
> > +	ret = atomic_dec_if_positive(&kmod_concurrent);
> > +	WARN_ON(ret < 0);
> > +	preempt_enable();
> 
> The disabled preemption does not make much sense here.
> We do not need to tie the atomic operation and the WARN
> together so tightly.

Makes sense, will add a note.

kref also lacks such a mnemonic as atomic_dec_if_positive()
and since I've now converted this to kref I've dropped this.

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: kmod: provide wrappers for kmod_concurrent inc/dec
  2016-12-22  5:07   ` Jessica Yu
@ 2017-01-10 20:28     ` Luis R. Rodriguez
  0 siblings, 0 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2017-01-10 20:28 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Luis R. Rodriguez, shuah, rusty, ebiederm, dmitry.torokhov, acme,
	corbet, martin.wilck, mmarek, pmladek, hare, rwright, jeffm,
	DSterba, fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel

On Wed, Dec 21, 2016 at 09:07:21PM -0800, Jessica Yu wrote:
> +++ Luis R. Rodriguez [08/12/16 11:48 -0800]:
> > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > index cb6f7ca7b8a5..049d7eabda38 100644
> > --- a/kernel/kmod.c
> > +++ b/kernel/kmod.c
> > @@ -108,6 +111,20 @@ static int call_modprobe(char *module_name, int wait)
> > 	return -ENOMEM;
> > }
> > 
> > +static int kmod_umh_threads_get(void)
> > +{
> > +	atomic_inc(&kmod_concurrent);
> > +	if (atomic_read(&kmod_concurrent) < max_modprobes)
> 
> Should this not be <=? I think this only allows up to max_modprobes-1 concurrent threads.

True, fixed!

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 09/10] kmod: add helpers for getting kmod count and limit
  2016-12-16  7:57     ` Luis R. Rodriguez
@ 2017-01-11 18:27       ` Luis R. Rodriguez
  0 siblings, 0 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2017-01-11 18:27 UTC (permalink / raw)
  To: Luis R. Rodriguez, Tom Gundersen
  Cc: Petr Mladek, shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme,
	corbet, martin.wilck, mmarek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, linux-kselftest, linux-doc,
	linux-kernel

On Fri, Dec 16, 2016 at 08:57:26AM +0100, Luis R. Rodriguez wrote:
> On Thu, Dec 15, 2016 at 05:56:19PM +0100, Petr Mladek wrote:
> > On Thu 2016-12-08 11:49:20, Luis R. Rodriguez wrote:
> > > This adds helpers for getting access to the kmod count and limit from
> > > userspace. While at it, this also lets userspace fine tune the kmod
> > > limit after boot, it uses the shiny new proc_douintvec_minmax().
> > > 
> > > These knobs should help userspace more gracefully and deterministically
> > > handle module loading.
> > >
> > > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > > ---
> > >  include/linux/kmod.h |  8 +++++
> > >  kernel/kmod.c        | 83 ++++++++++++++++++++++++++++++++++++++++++++++++++--
> > >  kernel/sysctl.c      | 14 +++++++++
> > >  3 files changed, 103 insertions(+), 2 deletions(-)
> > 
> > I am not sure if it is worth it. As you say in the 3rd patch,
> > there was rather low limit for 16 years and nobody probably had
> > problems with it.
> 
> Note, *probably* - ie, this could have gone unreported for a while, and
> to be frank how can we know for sure a pesky module just did not load due
> to this? In the case of get_fs_type() issue this can be fatal for a partition
> mount, not a good example to wait to look forward to before we take this
> serious.
> 
> I added the sysctl value mostly for read purposes, the count is probably
> useless for any accounting to be done in userspace due to delays this
> reading and making this value useful in userspace can have, I can nuke
> that. The kmod-limit however seems very useful so that userspace knows
> how to properly thread *safely* modprobe calls more deterministically.
> 
> Adding write support to let one bump the limit was just an easy convenience
> possible given the read support was being added, but its use should
> really only be useful for testing purposes post bootup given that the
> real value in the limit will be important at boot time prior to the sysctl
> parsing. The real know tweak which should be used in case of issues is
> the module parameter added earlier.
> 
> So I could drop the kmod-count, and just make the kmod-limit read-only.
> Thoughts?

OK I've done this and also since there was confusion about dependencies
possibly affecting kmod_concurrent I've added a note about this on the
Documentation/sysctl/kernel.txt documentation. This documentation also
clarifies the intent behind exposing this interface, which is to help
enable userspace make using modprobe more deterministic (its why I've
Cc'd Tom). The following changes have been made, and I'll fold this into this
patch and rename the title.

> > Anyway, it seems that such know should also get documented in
> > Documentation/sysctl/kernel.txt
> 
> Will do if we keep them, thanks.

Below are the changes I've made:

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index a32b4b748644..c82aeaf60ca7 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -370,6 +370,26 @@ with the "modules_disabled" sysctl.
 
 ==============================================================
 
+kmod-limit:
+
+Get the max amount of concurrent requests (kmod_concurrent) the kernel can
+make out to userspace to call 'modprobe'. This limit is known internally to the
+kernel as max_modprobes. This interface is designed to enable userspace to
+query the kernel for the max_modprobes limit so userspace can more
+deterministically handle module loading by only enabling max_modprobes
+'modprobe' calls at a time.
+
+Dependencies are resolved in userspace through depmod, so one modprobe
+call only bumps the number of concurrent threads (kmod_concurrent) by one.
+Dependencies for a module then are loaded directly in userspace using
+init_module() / finit_module() skipping bumping kmod_concurrent or being
+affected by max_modprobes.
+
+The max_modprobes value is set at build time with CONFIG_MAX_KMOD_CONCURRENT.
+You can override at initialization with the module parameter max_modprobes.
+
+==============================================================
+
 kptr_restrict:
 
 This toggle indicates whether restrictions are placed on
diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index c30d797fe4d3..1ee833e5896d 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -40,8 +40,6 @@ int __request_module(bool wait, const char *name, ...);
 	((x) ?: (__request_module(true, mod), (x)))
 void init_kmod_umh(void);
 unsigned int get_kmod_umh_limit(void);
-int sysctl_kmod_count(struct ctl_table *table, int write,
-		      void __user *buffer, size_t *lenp, loff_t *ppos);
 int sysctl_kmod_limit(struct ctl_table *table, int write,
 		      void __user *buffer, size_t *lenp, loff_t *ppos);
 #else
diff --git a/kernel/kmod.c b/kernel/kmod.c
index f2fd9f088278..0303bce326b8 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -158,16 +158,6 @@ unsigned int get_kmod_umh_limit(void)
 EXPORT_SYMBOL_GPL(get_kmod_umh_limit);
 
 /**
- * get_kmod_umh_count - get number of concurrent modprobe calls running
- *
- * Returns the number of concurrent modprobe calls currently running.
- */
-int get_kmod_umh_count(void)
-{
-	return atomic_read(&kmod_concurrent);
-}
-
-/**
  * __request_module - try to load a kernel module
  * @wait: wait (or not) for the operation to complete
  * @fmt: printf style format string for the name of the module
@@ -226,11 +216,6 @@ int __request_module(bool wait, const char *fmt, ...)
 }
 EXPORT_SYMBOL(__request_module);
 
-static void __set_max_modprobes(unsigned int suggested)
-{
-	max_modprobes = min((unsigned int) max_threads/2, suggested);
-}
-
 /*
  * If modprobe needs a service that is in a module, we get a recursive
  * loop.  Limit the number of running kmod threads to max_threads/2 or
@@ -247,40 +232,12 @@ static void __set_max_modprobes(unsigned int suggested)
  * 4096 concurrent modprobe instances:
  *
  *	kmod.max_modprobes=4096
- *
- * You can also set the limit via sysctl:
- *
- * echo 4096 > /proc/sys/kernel/kmod-limit
- *
- * You can also set the query the current thread count:
- *
- * cat /proc/sys/kernel/kmod-count
- *
- * These knobs should enable userspace to more gracefully and
- * deterministically handle module loading.
  */
 void __init init_kmod_umh(void)
 {
 	if (!max_modprobes)
-		__set_max_modprobes(1 << CONFIG_MAX_KMOD_CONCURRENT);
-}
-
-int sysctl_kmod_count(struct ctl_table *table, int write,
-		      void __user *buffer, size_t *lenp, loff_t *ppos)
-{
-	struct ctl_table t;
-	int ret = 0;
-	int count = get_kmod_umh_count();
-
-	t = *table;
-	t.data = &count;
-
-	if (write)
-		return -EPERM;
-
-	ret = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
-
-	return ret;
+		max_modprobes = min(max_threads/2,
+				    1 << CONFIG_MAX_KMOD_CONCURRENT);
 }
 
 int sysctl_kmod_limit(struct ctl_table *table, int write,
@@ -297,15 +254,12 @@ int sysctl_kmod_limit(struct ctl_table *table, int write,
 	t.extra1 = &min;
 	t.extra2 = &max;
 
-	ret = proc_douintvec_minmax(&t, write, buffer, lenp, ppos);
-	if (ret == -ERANGE)
-		pr_err("modprobe thread valid range: %u - %u\n", min, max);
-	if (ret || !write)
-		return ret;
+	if (write)
+		return -EPERM;
 
-	__set_max_modprobes((unsigned int) local_max_modprobes);
+	ret = proc_douintvec_minmax(&t, write, buffer, lenp, ppos);
 
-	return 0;
+	return ret;
 }
 
 #endif /* CONFIG_MODULES */
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index d59cca78417a..52cf84131f74 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -661,17 +661,10 @@ static struct ctl_table kern_table[] = {
 		.extra2		= &one,
 	},
 	{
-		.procname	= "kmod-count",
-		.data		= NULL, /* filled in by handler */
-		.maxlen		= sizeof(int),
-		.mode		= 0444,
-		.proc_handler	= sysctl_kmod_count,
-	},
-	{
 		.procname	= "kmod-limit",
 		.data		= NULL, /* filled in by handler */
 		.maxlen		= sizeof(unsigned int),
-		.mode		= 0644,
+		.mode		= 0444,
 		.proc_handler	= sysctl_kmod_limit,
 	},
 #endif

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 00/10] kmod: stress test driver, few fixes and enhancements
  2016-12-08 18:47 [RFC 00/10] kmod: stress test driver, few fixes and enhancements Luis R. Rodriguez
                   ` (9 preceding siblings ...)
  2016-12-08 19:49 ` [RFC 10/10] kmod: add a sanity check on module loading Luis R. Rodriguez
@ 2017-01-11 19:10 ` Luis R. Rodriguez
  10 siblings, 0 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2017-01-11 19:10 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet,
	martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, rgoldwyn, subashab, xypron.glpk, keescook,
	atomlin, mbenes, paulmck, dan.j.williams, jpoimboe, davem, mingo,
	akpm, torvalds, linux-kselftest, linux-doc, linux-kernel

On Thu, Dec 08, 2016 at 10:47:51AM -0800, Luis R. Rodriguez wrote:
> Upon running into an old kmod v19 issue with mount (get_fs_type()) a few of us
> hunted for the cause of the issue. Although the issue ended up being a
> userspace issue, a stress test driver was written to help reproduce the issue,
> and along the way a few other fixes and sanity checks were implemented.
> 
> I've taken the time to generalize the stress test driver as a kselftest driver
> with a 9 test cases. The last two test cases reveal an existing issue which
> is not yet addressed upstream, even if you have kmod v19 present. A fix is
> proposed in the last patch. Orignally we had discarded this patch as too
> complex due to the alias handling, but upon further analysis of test cases
> and memory pressure issues, it seems worth considering. Other than the
> last patch I don't think much of the other patches are controversial, but
> sending as RFC first just in case.
> 
> If its not clear, an end goal here is to make module loading a bit more
> deterministic with stronger sanity checks and stress tests. Please note,
> the stress test diver requires 4 GiB of RAM to run all tests without running
> out of memory. A lot of this has to do with the memory requirements needed
> for a dynamic test for multiple threads, but note that the final memory
> pressure and OOMs actually don't come from this allocation, but instead
> from many finit_module() calls, this consumes quite a bit of memory, specially
> if you have a lot of dependencies which also need to be loaded prior to
> your needed module -- as is the case for filesystem drivers.
> 
> These patches are available on my linux-next git-tree on my branch
> 20161208-kmod-test-driver-try2 [0], which is based on linux-next tag
> next-20161208. Patches are also available based on v4.9-rc8 [1] for
> those looking for a bit more stable tree given x86_64 on linux-next is
> hosed at the moment.
> 
> Since kmod.c doesn't seem to get much love, and since I've been digging
> quite a bit into it for other users (firmware) I suppose I could volunteer
> myself to maintain this code as well, unless there are oppositions to this.
> 
> [0] https://git.kernel.org/cgit/linux/kernel/git/mcgrof/linux-next.git/log/?h=20161208-kmod-test-driver-try2
> [1] https://git.kernel.org/cgit/linux/kernel/git/mcgrof/linux.git/log/?h=20161208-kmod-test-driver
> 
> Luis R. Rodriguez (10):
>   kmod: add test driver to stress test the module loader
>   module: fix memory leak on early load_module() failures
>   kmod: add dynamic max concurrent thread count
>   kmod: provide wrappers for kmod_concurrent inc/dec
>   kmod: return -EBUSY if modprobe limit is reached
>   kmod: provide sanity check on kmod_concurrent access
>   kmod: use simplified rate limit printk
>   sysctl: add support for unsigned int properly
>   kmod: add helpers for getting kmod count and limit
>   kmod: add a sanity check on module loading
> 

A lot of good discussions have come up form this, and so also
a few more patches. I'm going to split up the work into changes
which make sense now and leave debug work for a follow up later.

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 04/10] kmod: provide wrappers for kmod_concurrent inc/dec
  2017-01-10 18:57           ` [RFC 04/10] " Luis R. Rodriguez
@ 2017-01-11 20:08             ` Luis R. Rodriguez
  2017-05-16 18:02               ` Luis R. Rodriguez
  0 siblings, 1 reply; 65+ messages in thread
From: Luis R. Rodriguez @ 2017-01-11 20:08 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Petr Mladek, Kees Cook, Peter Zijlstra, shuah, Jessica Yu,
	Rusty Russell, Eric W. Biederman, Dmitry Torokhov,
	Arnaldo Carvalho de Melo, Jonathan Corbet, martin.wilck,
	Michal Marek, hare, rwright, Jeff Mahoney, DSterba, fdmanana,
	neilb, Guenter Roeck, rgoldwyn, subashab, Heinrich Schuchardt,
	Aaron Tomlin, mbenes, Paul E. McKenney, Dan Williams,
	Josh Poimboeuf, David S. Miller, Ingo Molnar, Andrew Morton,
	Linus Torvalds, linux-kselftest, linux-doc, LKML

On Tue, Jan 10, 2017 at 07:57:10PM +0100, Luis R. Rodriguez wrote:
> On Fri, Dec 16, 2016 at 09:05:00AM +0100, Luis R. Rodriguez wrote:
> > On Thu, Dec 15, 2016 at 01:46:25PM +0100, Petr Mladek wrote:
> > > On Thu 2016-12-08 22:08:59, Luis R. Rodriguez wrote:
> > > > On Thu, Dec 08, 2016 at 12:29:42PM -0800, Kees Cook wrote:
> > > > > On Thu, Dec 8, 2016 at 11:48 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > > > > +       if (atomic_read(&kmod_concurrent) < max_modprobes)
> > > > > > +               return 0;
> > > > > > +       atomic_dec(&kmod_concurrent);
> > > > > > +       return -ENOMEM;
> > > > > > +}
> > > > > > +
> > > > > > +static void kmod_umh_threads_put(void)
> > > > > > +{
> > > > > > +       atomic_dec(&kmod_concurrent);
> > > > > > +}
> > > > > 
> > > > > Can you use a kref here instead? We're trying to kill raw use of
> > > > > atomic_t for reference counting...
> > > > 
> > > > That's a much broader functional change than I was looking for, but I am up for
> > > > it. Can you describe the benefit of using kref you expect or why this is an
> > > > ongoing crusade? Since its a larger functional change how about doing this
> > > > change later, and we can test impact with the tress test driver. In theory if
> > > > there are benefits can't we add a test case to prove the gains?
> > > 
> > > Kees probably refers to the kref improvements that Peter Zijlstra
> > > is working on, see
> > > https://lkml.kernel.org/r/20161114174446.832175072@infradead.org
> > > 
> > > The advantage is that the new refcount API handles over and
> > > underflow.
> > > 
> > > Another advantage is that it increments/decrements the value
> > > only when it is safe. It uses cmpxchg to make sure that
> > > the checks are valid.
> > 
> > Great thanks, will look into that.
> 
> OK I've done the conversion now, the only thing is linux-next as of today lacks
> KREF_INIT() so I've open coded it for now. Once Peter's changes get merged the
> only thing we'dneed is to change the open code line to KREF_INIT().
> 
> I'll annotate this as Suggested-by Kees and Petr, I did this as a separate atomic
> step after this to make it easier for review.

Spoke too soon, kref_read() is not upstream yet either, so I can hold conversion
over until Peter's work is merged. Peter please Cc me on those patches if possible
:D

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 04/10] kmod: provide wrappers for kmod_concurrent inc/dec
  2017-01-11 20:08             ` Luis R. Rodriguez
@ 2017-05-16 18:02               ` Luis R. Rodriguez
  2017-05-18  2:37                 ` Luis R. Rodriguez
  0 siblings, 1 reply; 65+ messages in thread
From: Luis R. Rodriguez @ 2017-05-16 18:02 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Petr Mladek, Kees Cook, Peter Zijlstra, shuah, Jessica Yu,
	Rusty Russell, Eric W. Biederman, Dmitry Torokhov,
	Arnaldo Carvalho de Melo, Jonathan Corbet, martin.wilck,
	Michal Marek, hare, rwright, Jeff Mahoney, DSterba, fdmanana,
	neilb, Guenter Roeck, rgoldwyn, subashab, Heinrich Schuchardt,
	Aaron Tomlin, mbenes, Paul E. McKenney, Dan Williams,
	Josh Poimboeuf, David S. Miller, Ingo Molnar, Andrew Morton,
	Linus Torvalds, linux-kselftest, linux-doc, LKML

On Wed, Jan 11, 2017 at 09:08:57PM +0100, Luis R. Rodriguez wrote:
> On Tue, Jan 10, 2017 at 07:57:10PM +0100, Luis R. Rodriguez wrote:
> > On Fri, Dec 16, 2016 at 09:05:00AM +0100, Luis R. Rodriguez wrote:
> > > On Thu, Dec 15, 2016 at 01:46:25PM +0100, Petr Mladek wrote:
> > > > On Thu 2016-12-08 22:08:59, Luis R. Rodriguez wrote:
> > > > > On Thu, Dec 08, 2016 at 12:29:42PM -0800, Kees Cook wrote:
> > > > > > On Thu, Dec 8, 2016 at 11:48 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > > > > > +       if (atomic_read(&kmod_concurrent) < max_modprobes)
> > > > > > > +               return 0;
> > > > > > > +       atomic_dec(&kmod_concurrent);
> > > > > > > +       return -ENOMEM;
> > > > > > > +}
> > > > > > > +
> > > > > > > +static void kmod_umh_threads_put(void)
> > > > > > > +{
> > > > > > > +       atomic_dec(&kmod_concurrent);
> > > > > > > +}
> > > > > > 
> > > > > > Can you use a kref here instead? We're trying to kill raw use of
> > > > > > atomic_t for reference counting...
> > > > > 
> > > > > That's a much broader functional change than I was looking for, but I am up for
> > > > > it. Can you describe the benefit of using kref you expect or why this is an
> > > > > ongoing crusade? Since its a larger functional change how about doing this
> > > > > change later, and we can test impact with the tress test driver. In theory if
> > > > > there are benefits can't we add a test case to prove the gains?
> > > > 
> > > > Kees probably refers to the kref improvements that Peter Zijlstra
> > > > is working on, see
> > > > https://lkml.kernel.org/r/20161114174446.832175072@infradead.org
> > > > 
> > > > The advantage is that the new refcount API handles over and
> > > > underflow.
> > > > 
> > > > Another advantage is that it increments/decrements the value
> > > > only when it is safe. It uses cmpxchg to make sure that
> > > > the checks are valid.
> > > 
> > > Great thanks, will look into that.
> > 
> > OK I've done the conversion now, the only thing is linux-next as of today lacks
> > KREF_INIT() so I've open coded it for now. Once Peter's changes get merged the
> > only thing we'dneed is to change the open code line to KREF_INIT().
> > 
> > I'll annotate this as Suggested-by Kees and Petr, I did this as a separate atomic
> > step after this to make it easier for review.
> 
> Spoke too soon, kref_read() is not upstream yet either, so I can hold conversion
> over until Peter's work is merged. Peter please Cc me on those patches if possible
> :D

All the needed kref stuff is upstream now, however, kref is overkill for
kmod_concurrent given this is just a counter, it is not used to release
any object, and kref_put() requires such mechanism. The lightweight
refcount_t is much more appropriate here so will use that and respin
this series, finally.

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC 04/10] kmod: provide wrappers for kmod_concurrent inc/dec
  2017-05-16 18:02               ` Luis R. Rodriguez
@ 2017-05-18  2:37                 ` Luis R. Rodriguez
  0 siblings, 0 replies; 65+ messages in thread
From: Luis R. Rodriguez @ 2017-05-18  2:37 UTC (permalink / raw)
  To: Petr Mladek, Kees Cook, Peter Zijlstra
  Cc: Luis R. Rodriguez, shuah, Jessica Yu, Rusty Russell,
	Eric W. Biederman, Dmitry Torokhov, Arnaldo Carvalho de Melo,
	Jonathan Corbet, martin.wilck, Michal Marek, hare, rwright,
	Jeff Mahoney, DSterba, fdmanana, neilb, Guenter Roeck, rgoldwyn,
	subashab, Heinrich Schuchardt, Aaron Tomlin, mbenes,
	Paul E. McKenney, Dan Williams, Josh Poimboeuf, David S. Miller,
	Ingo Molnar, Andrew Morton, Linus Torvalds, linux-kselftest,
	linux-doc, LKML

On Tue, May 16, 2017 at 08:02:17PM +0200, Luis R. Rodriguez wrote:
> On Wed, Jan 11, 2017 at 09:08:57PM +0100, Luis R. Rodriguez wrote:
> > On Tue, Jan 10, 2017 at 07:57:10PM +0100, Luis R. Rodriguez wrote:
> > > On Fri, Dec 16, 2016 at 09:05:00AM +0100, Luis R. Rodriguez wrote:
> > > > On Thu, Dec 15, 2016 at 01:46:25PM +0100, Petr Mladek wrote:
> > > > > On Thu 2016-12-08 22:08:59, Luis R. Rodriguez wrote:
> > > > > > On Thu, Dec 08, 2016 at 12:29:42PM -0800, Kees Cook wrote:
> > > > > > > On Thu, Dec 8, 2016 at 11:48 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > > > > > > +       if (atomic_read(&kmod_concurrent) < max_modprobes)
> > > > > > > > +               return 0;
> > > > > > > > +       atomic_dec(&kmod_concurrent);
> > > > > > > > +       return -ENOMEM;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static void kmod_umh_threads_put(void)
> > > > > > > > +{
> > > > > > > > +       atomic_dec(&kmod_concurrent);
> > > > > > > > +}
> > > > > > > 
> > > > > > > Can you use a kref here instead? We're trying to kill raw use of
> > > > > > > atomic_t for reference counting...
> > > > > > 
> > > > > > That's a much broader functional change than I was looking for, but I am up for
> > > > > > it. Can you describe the benefit of using kref you expect or why this is an
> > > > > > ongoing crusade? Since its a larger functional change how about doing this
> > > > > > change later, and we can test impact with the tress test driver. In theory if
> > > > > > there are benefits can't we add a test case to prove the gains?
> > > > > 
> > > > > Kees probably refers to the kref improvements that Peter Zijlstra
> > > > > is working on, see
> > > > > https://lkml.kernel.org/r/20161114174446.832175072@infradead.org
> > > > > 
> > > > > The advantage is that the new refcount API handles over and
> > > > > underflow.
> > > > > 
> > > > > Another advantage is that it increments/decrements the value
> > > > > only when it is safe. It uses cmpxchg to make sure that
> > > > > the checks are valid.
> > > > 
> > > > Great thanks, will look into that.
> > > 
> > > OK I've done the conversion now, the only thing is linux-next as of today lacks
> > > KREF_INIT() so I've open coded it for now. Once Peter's changes get merged the
> > > only thing we'dneed is to change the open code line to KREF_INIT().
> > > 
> > > I'll annotate this as Suggested-by Kees and Petr, I did this as a separate atomic
> > > step after this to make it easier for review.
> > 
> > Spoke too soon, kref_read() is not upstream yet either, so I can hold conversion
> > over until Peter's work is merged. Peter please Cc me on those patches if possible
> > :D
> 
> All the needed kref stuff is upstream now, however, kref is overkill for
> kmod_concurrent given this is just a counter, it is not used to release
> any object, and kref_put() requires such mechanism. The lightweight
> refcount_t is much more appropriate here so will use that and respin
> this series, finally.

And... even the refcount_t is overkill here given even with preemption stuff on
inc we still run into the warnings implemented by the recount stuff right away.
The only way to properly fix this is with a proper lock and I don't think this is
worth it at this point.

This would be an issue if the accounting here was for an object but since its
not and its just a loose estimate for a subjective "reasonable threshold" this
is all just overkill.

Lesson: (unless I hear otherwise)

As such I see no real strong motivation for a change here now. Counters, used
without any object references or any real critical stuff is left best with the
old atomic counters.

  Luis

^ permalink raw reply	[flat|nested] 65+ messages in thread

end of thread, other threads:[~2017-05-18  2:37 UTC | newest]

Thread overview: 65+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-08 18:47 [RFC 00/10] kmod: stress test driver, few fixes and enhancements Luis R. Rodriguez
2016-12-08 18:47 ` [RFC 01/10] kmod: add test driver to stress test the module loader Luis R. Rodriguez
2016-12-08 20:24   ` Kees Cook
2016-12-13 21:10     ` Luis R. Rodriguez
2016-12-16  7:41       ` Luis R. Rodriguez
2016-12-08 19:48 ` [RFC 02/10] module: fix memory leak on early load_module() failures Luis R. Rodriguez
2016-12-08 20:30   ` Kees Cook
2016-12-08 21:10     ` Luis R. Rodriguez
2016-12-08 21:17       ` Kees Cook
2016-12-09 17:06   ` Miroslav Benes
2016-12-16  8:51     ` Luis R. Rodriguez
2016-12-15 18:46   ` Aaron Tomlin
2016-12-08 19:48 ` [RFC 03/10] kmod: add dynamic max concurrent thread count Luis R. Rodriguez
2016-12-08 20:28   ` Kees Cook
2016-12-08 21:00     ` Luis R. Rodriguez
2016-12-14 15:38   ` Petr Mladek
2016-12-16  8:39     ` Luis R. Rodriguez
2017-01-10 19:24       ` Luis R. Rodriguez
2016-12-08 19:48 ` [RFC 04/10] kmod: provide wrappers for kmod_concurrent inc/dec Luis R. Rodriguez
2016-12-08 20:29   ` Kees Cook
2016-12-08 21:08     ` Luis R. Rodriguez
2016-12-15 12:46       ` Petr Mladek
2016-12-16  8:05         ` Luis R. Rodriguez
2016-12-22  4:48           ` Jessica Yu
2017-01-06 20:54             ` Luis R. Rodriguez
2017-01-10 18:57           ` [RFC 04/10] " Luis R. Rodriguez
2017-01-11 20:08             ` Luis R. Rodriguez
2017-05-16 18:02               ` Luis R. Rodriguez
2017-05-18  2:37                 ` Luis R. Rodriguez
2016-12-22  5:07   ` Jessica Yu
2017-01-10 20:28     ` Luis R. Rodriguez
2016-12-08 19:48 ` [RFC 05/10] kmod: return -EBUSY if modprobe limit is reached Luis R. Rodriguez
2016-12-08 19:48 ` [RFC 06/10] kmod: provide sanity check on kmod_concurrent access Luis R. Rodriguez
2016-12-14 16:08   ` Petr Mladek
2016-12-14 17:12     ` Luis R. Rodriguez
2016-12-15 12:57   ` Petr Mladek
2017-01-10 20:00     ` Luis R. Rodriguez
2016-12-08 19:49 ` [RFC 07/10] kmod: use simplified rate limit printk Luis R. Rodriguez
2016-12-14 16:23   ` Petr Mladek
2016-12-14 16:41     ` Joe Perches
2016-12-16  8:44     ` Luis R. Rodriguez
2016-12-08 19:49 ` [RFC 08/10] sysctl: add support for unsigned int properly Luis R. Rodriguez
2016-12-08 19:49 ` [RFC 09/10] kmod: add helpers for getting kmod count and limit Luis R. Rodriguez
2016-12-15 16:56   ` Petr Mladek
2016-12-16  7:57     ` Luis R. Rodriguez
2017-01-11 18:27       ` Luis R. Rodriguez
2016-12-08 19:49 ` [RFC 10/10] kmod: add a sanity check on module loading Luis R. Rodriguez
2016-12-09 20:03   ` Martin Wilck
2016-12-09 20:56     ` Linus Torvalds
2016-12-15 18:08       ` Luis R. Rodriguez
2016-12-15  0:27   ` Rusty Russell
2016-12-16  8:31     ` Luis R. Rodriguez
2016-12-17  3:54       ` Rusty Russell
     [not found]         ` <CAB=NE6VvuA9a6hf6yoopGfUxVJQM5HyV5bNzUdsEtUV0UhbG-g@mail.gmail.com>
2016-12-20  0:53           ` Rusty Russell
2016-12-20 18:52             ` Luis R. Rodriguez
2016-12-21  2:21               ` Rusty Russell
2016-12-21 13:08                 ` Luis R. Rodriguez
2017-01-03  0:04                   ` Rusty Russell
2017-01-06 20:36                     ` Luis R. Rodriguez
2017-01-06 21:53                       ` Jessica Yu
2017-01-09 20:27                         ` Luis R. Rodriguez
     [not found]                       ` <87bmvgax51.fsf@rustcorp.com.au>
2017-01-09 19:56                         ` [RFC 10/10] " Luis R. Rodriguez
2017-01-06 21:03                     ` Jessica Yu
2017-01-04  2:47   ` Jessica Yu
2017-01-11 19:10 ` [RFC 00/10] kmod: stress test driver, few fixes and enhancements Luis R. Rodriguez

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.