* [PATCH 0/6] kmod: few simple enhancements
@ 2017-05-19  3:24 Luis R. Rodriguez
  2017-05-19  3:24 ` [PATCH 1/6] kmod: add dynamic max concurrent thread count Luis R. Rodriguez
                   ` (6 more replies)
  0 siblings, 7 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-19  3:24 UTC (permalink / raw)
  To: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, gregkh, linux-kselftest, linux-doc,
	linux-kernel, Luis R. Rodriguez

I've been working on making module loading more deterministic. These are some
of the more straightforward changes I've come up with so far. The others
depend on some further sysctl changes, so I'll wait to introduce those, and also
on a new kmod stress driver loader, which I'll also hold off on introducing.

I believe these are pretty straightforward, but please let me know if there
are any questions. In this re-spin I've dropped the kref / refcount proposals
given they are just overkill for what we need here.

Luis R. Rodriguez (6):
  kmod: add dynamic max concurrent thread count
  module: use list_for_each_entry_rcu() on find_module_all()
  kmod: provide wrappers for kmod_concurrent inc/dec
  kmod: return -EBUSY if modprobe limit is reached
  kmod: preempt on kmod_umh_threads_get()
  kmod: use simplified rate limit printk

 Documentation/admin-guide/kernel-parameters.txt |  7 ++
 include/linux/kmod.h                            |  3 +-
 init/Kconfig                                    | 23 ++++++
 init/main.c                                     |  1 +
 kernel/kmod.c                                   | 98 +++++++++++++++++--------
 kernel/module.c                                 |  2 +-
 6 files changed, 103 insertions(+), 31 deletions(-)

-- 
2.11.0


* [PATCH 1/6] kmod: add dynamic max concurrent thread count
  2017-05-19  3:24 [PATCH 0/6] kmod: few simple enhancements Luis R. Rodriguez
@ 2017-05-19  3:24 ` Luis R. Rodriguez
  2017-05-19 20:44   ` Dmitry Torokhov
  2017-05-19  3:24 ` [PATCH 2/6] module: use list_for_each_entry_rcu() on find_module_all() Luis R. Rodriguez
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-19  3:24 UTC (permalink / raw)
  To: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, gregkh, linux-kselftest, linux-doc,
	linux-kernel, Luis R. Rodriguez

We currently statically limit the number of modprobe threads which
we allow to run concurrently to 50. As per Keith Owens, this was a
completely arbitrary value, and it was set in the 2.3.38 days [0]
over 16 years ago, in the year 2000.

Although we haven't yet hit our low limit, experimentation [1]
shows that when and if we do hit it, the worst case will be
fatal -- consider get_fs_type() failures upon mount on a system which
has many partitions, some of which might even use the same
filesystem. It's best to be prudent and raise this
value to something more sensible which ensures we're far from hitting
the limit, and which also allows a build time default and a user run
time override.

The worst case is fatal given that once a module fails to load there
is a period of time during which subsequent requests for the same module
will fail, so in the case of partitions it's not just one request that
could fail, but a whole series of partitions. This latter issue of a
module request failure domino effect can be addressed later, but
increasing the limit to something more meaningful should at least give us
enough cushion to avoid this for a while.

Set this value up with more meaningful modern limits:

Bump this up to 64  max for small systems (CONFIG_BASE_SMALL)
Bump this up to 128 max for larger systems (!CONFIG_BASE_SMALL)

Also allow the default max limit to be further fine tuned at compile
time, and at initialization at boot using the kernel
parameter: kmod.max_modprobes.

[0] https://git.kernel.org/cgit/linux/kernel/git/history/history.git/commit/?id=ab1c4ec7410f6ec64e1511d1a7d850fc99c09b44
[1] https://github.com/mcgrof/test_request_module

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 Documentation/admin-guide/kernel-parameters.txt |  7 ++++
 include/linux/kmod.h                            |  3 +-
 init/Kconfig                                    | 23 +++++++++++++
 init/main.c                                     |  1 +
 kernel/kmod.c                                   | 43 ++++++++++++++++---------
 5 files changed, 61 insertions(+), 16 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 15f79c27748d..1314a85c10c9 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1748,6 +1748,13 @@
 
 	keepinitrd	[HW,ARM]
 
+	kmod.max_modprobes [KNL]
+			Sets the maximum number of concurrent modprobe
+			threads allowed on a system, overriding the
+			default heuristic of:
+
+				min(max_threads/2, 1 << CONFIG_MAX_KMOD_CONCURRENT)
+
 	kernelcore=	[KNL,X86,IA-64,PPC]
 			Format: nn[KMGTPE] | "mirror"
 			This parameter
diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index c4e441e00db5..302c9dfdcda9 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -38,13 +38,14 @@ int __request_module(bool wait, const char *name, ...);
 #define request_module_nowait(mod...) __request_module(false, mod)
 #define try_then_request_module(x, mod...) \
 	((x) ?: (__request_module(true, mod), (x)))
+void init_kmod_umh(void);
 #else
 static inline int request_module(const char *name, ...) { return -ENOSYS; }
 static inline int request_module_nowait(const char *name, ...) { return -ENOSYS; }
+static inline void init_kmod_umh(void) { }
 #define try_then_request_module(x, mod...) (x)
 #endif
 
-
 struct cred;
 struct file;
 
diff --git a/init/Kconfig b/init/Kconfig
index 1d3475fc9496..5974ad7c9d0a 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -2191,6 +2191,29 @@ config TRIM_UNUSED_KSYMS
 
 	  If unsure, or if you need to build out-of-tree modules, say N.
 
+config MAX_KMOD_CONCURRENT
+	int "Max allowed concurrent request_module() calls (6=>64, 10=>1024)"
+	range 6 14
+	default 7 if !BASE_SMALL
+	default 6 if BASE_SMALL
+	help
+	  The kernel restricts the number of possible concurrent calls to
+	  request_module() to help avoid a recursive loop possible with
+	  modules. The default maximum number of concurrent threads allowed
+	  to run request_module() will be:
+
+	    max_modprobes = min(max_threads/2, 1 << CONFIG_MAX_KMOD_CONCURRENT);
+
+	  The value set in CONFIG_MAX_KMOD_CONCURRENT represents then the power
+	  of 2 value used at boot time for the above computation. You can
+	  override the default built value using the kernel parameter:
+
+		kmod.max_modprobes=4096
+
+	  This defaults to 64 (2^6) concurrent modprobe threads for
+	  small systems and to 128 (2^7) concurrent modprobe threads for
+	  larger systems.
+
 endif # MODULES
 
 config MODULES_TREE_LOOKUP
diff --git a/init/main.c b/init/main.c
index f866510472d7..25669dfded42 100644
--- a/init/main.c
+++ b/init/main.c
@@ -650,6 +650,7 @@ asmlinkage __visible void __init start_kernel(void)
 	thread_stack_cache_init();
 	cred_init();
 	fork_init();
+	init_kmod_umh();
 	proc_caches_init();
 	buffer_init();
 	key_init();
diff --git a/kernel/kmod.c b/kernel/kmod.c
index 563f97e2be36..6fe6848787b2 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -46,6 +46,9 @@
 #include <trace/events/module.h>
 
 extern int max_threads;
+unsigned int max_modprobes;
+module_param(max_modprobes, uint, 0644);
+MODULE_PARM_DESC(max_modprobes, "Max number of allowed concurrent modprobes");
 
 #define CAP_BSET	(void *)1
 #define CAP_PI		(void *)2
@@ -127,10 +130,8 @@ int __request_module(bool wait, const char *fmt, ...)
 {
 	va_list args;
 	char module_name[MODULE_NAME_LEN];
-	unsigned int max_modprobes;
 	int ret;
 	static atomic_t kmod_concurrent = ATOMIC_INIT(0);
-#define MAX_KMOD_CONCURRENT 50	/* Completely arbitrary value - KAO */
 	static int kmod_loop_msg;
 
 	/*
@@ -154,19 +155,6 @@ int __request_module(bool wait, const char *fmt, ...)
 	if (ret)
 		return ret;
 
-	/* If modprobe needs a service that is in a module, we get a recursive
-	 * loop.  Limit the number of running kmod threads to max_threads/2 or
-	 * MAX_KMOD_CONCURRENT, whichever is the smaller.  A cleaner method
-	 * would be to run the parents of this process, counting how many times
-	 * kmod was invoked.  That would mean accessing the internals of the
-	 * process tables to get the command line, proc_pid_cmdline is static
-	 * and it is not worth changing the proc code just to handle this case. 
-	 * KAO.
-	 *
-	 * "trace the ppid" is simple, but will fail if someone's
-	 * parent exits.  I think this is as good as it gets. --RR
-	 */
-	max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
 	atomic_inc(&kmod_concurrent);
 	if (atomic_read(&kmod_concurrent) > max_modprobes) {
 		/* We may be blaming an innocent here, but unlikely */
@@ -188,6 +176,31 @@ int __request_module(bool wait, const char *fmt, ...)
 	return ret;
 }
 EXPORT_SYMBOL(__request_module);
+
+/*
+ * If modprobe needs a service that is in a module, we get a recursive
+ * loop.  Limit the number of running kmod threads to max_threads/2 or
+ * CONFIG_MAX_KMOD_CONCURRENT, whichever is the smaller.  A cleaner method
+ * would be to run the parents of this process, counting how many times
+ * kmod was invoked.  That would mean accessing the internals of the
+ * process tables to get the command line, proc_pid_cmdline is static
+ * and it is not worth changing the proc code just to handle this case.
+ *
+ * "trace the ppid" is simple, but will fail if someone's
+ * parent exits.  I think this is as good as it gets.
+ *
+ * You can override this with a kernel parameter, for instance to allow
+ * 4096 concurrent modprobe instances:
+ *
+ *	kmod.max_modprobes=4096
+ */
+void __init init_kmod_umh(void)
+{
+	if (!max_modprobes)
+		max_modprobes = min(max_threads/2,
+				    1 << CONFIG_MAX_KMOD_CONCURRENT);
+}
+
 #endif /* CONFIG_MODULES */
 
 static void call_usermodehelper_freeinfo(struct subprocess_info *info)
-- 
2.11.0
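
Editor's sketch of the default computation described in patch 1, written
as ordinary userspace C rather than kernel code. The function name
pick_max_modprobes() and its parameters are hypothetical stand-ins: in the
patch the inputs are the kernel's max_threads, CONFIG_MAX_KMOD_CONCURRENT
and the kmod.max_modprobes parameter. It only illustrates the arithmetic,
assuming a non-zero override is taken as is.

```c
/*
 * Userspace model of the heuristic in init_kmod_umh():
 *
 *   max_modprobes = min(max_threads/2, 1 << CONFIG_MAX_KMOD_CONCURRENT)
 *
 * "override" models kmod.max_modprobes; 0 means "not set on the
 * command line". All names here are illustrative, not kernel symbols.
 */
unsigned int pick_max_modprobes(unsigned int override,
				int max_threads,
				unsigned int shift)
{
	unsigned int half = (unsigned int)(max_threads / 2);
	unsigned int cap = 1U << shift;

	/* A non-zero override skips the heuristic entirely. */
	if (override)
		return override;

	return half < cap ? half : cap;
}
```

With a shift of 7 the power-of-two cap is 128, but a system whose thread
limit is only 100 would still end up with 50, since max_threads/2 is the
smaller of the two values.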


* [PATCH 2/6] module: use list_for_each_entry_rcu() on find_module_all()
  2017-05-19  3:24 [PATCH 0/6] kmod: few simple enhancements Luis R. Rodriguez
  2017-05-19  3:24 ` [PATCH 1/6] kmod: add dynamic max concurrent thread count Luis R. Rodriguez
@ 2017-05-19  3:24 ` Luis R. Rodriguez
  2017-05-23 14:46   ` Miroslav Benes
  2017-05-19  3:24 ` [PATCH 3/6] kmod: provide wrappers for kmod_concurrent inc/dec Luis R. Rodriguez
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-19  3:24 UTC (permalink / raw)
  To: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, gregkh, linux-kselftest, linux-doc,
	linux-kernel, Luis R. Rodriguez

The module list has been using RCU in a lot of other calls
for a while now; we just overlooked changing this one over to
use RCU.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 kernel/module.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/module.c b/kernel/module.c
index 4a3665f8f837..70f494638974 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -602,7 +602,7 @@ static struct module *find_module_all(const char *name, size_t len,
 
 	module_assert_mutex_or_preempt();
 
-	list_for_each_entry(mod, &modules, list) {
+	list_for_each_entry_rcu(mod, &modules, list) {
 		if (!even_unformed && mod->state == MODULE_STATE_UNFORMED)
 			continue;
 		if (strlen(mod->name) == len && !memcmp(mod->name, name, len))
-- 
2.11.0


* [PATCH 3/6] kmod: provide wrappers for kmod_concurrent inc/dec
  2017-05-19  3:24 [PATCH 0/6] kmod: few simple enhancements Luis R. Rodriguez
  2017-05-19  3:24 ` [PATCH 1/6] kmod: add dynamic max concurrent thread count Luis R. Rodriguez
  2017-05-19  3:24 ` [PATCH 2/6] module: use list_for_each_entry_rcu() on find_module_all() Luis R. Rodriguez
@ 2017-05-19  3:24 ` Luis R. Rodriguez
  2017-05-19  3:24 ` [PATCH 4/6] kmod: return -EBUSY if modprobe limit is reached Luis R. Rodriguez
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-19  3:24 UTC (permalink / raw)
  To: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, gregkh, linux-kselftest, linux-doc,
	linux-kernel, Luis R. Rodriguez

kmod_concurrent is used as an atomic counter for enforcing
the allowed limit of modprobe calls. Provide wrappers for it
so that this can be expanded on more easily; that will be done
later.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 kernel/kmod.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index 6fe6848787b2..7635915dc91c 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -59,6 +59,7 @@ static DEFINE_SPINLOCK(umh_sysctl_lock);
 static DECLARE_RWSEM(umhelper_sem);
 
 #ifdef CONFIG_MODULES
+static atomic_t kmod_concurrent = ATOMIC_INIT(0);
 
 /*
 	modprobe_path is set via /proc/sys.
@@ -110,6 +111,20 @@ static int call_modprobe(char *module_name, int wait)
 	return -ENOMEM;
 }
 
+static int kmod_umh_threads_get(void)
+{
+	atomic_inc(&kmod_concurrent);
+	if (atomic_read(&kmod_concurrent) <= max_modprobes)
+		return 0;
+	atomic_dec(&kmod_concurrent);
+	return -ENOMEM;
+}
+
+static void kmod_umh_threads_put(void)
+{
+	atomic_dec(&kmod_concurrent);
+}
+
 /**
  * __request_module - try to load a kernel module
  * @wait: wait (or not) for the operation to complete
@@ -131,7 +146,6 @@ int __request_module(bool wait, const char *fmt, ...)
 	va_list args;
 	char module_name[MODULE_NAME_LEN];
 	int ret;
-	static atomic_t kmod_concurrent = ATOMIC_INIT(0);
 	static int kmod_loop_msg;
 
 	/*
@@ -155,8 +169,8 @@ int __request_module(bool wait, const char *fmt, ...)
 	if (ret)
 		return ret;
 
-	atomic_inc(&kmod_concurrent);
-	if (atomic_read(&kmod_concurrent) > max_modprobes) {
+	ret = kmod_umh_threads_get();
+	if (ret) {
 		/* We may be blaming an innocent here, but unlikely */
 		if (kmod_loop_msg < 5) {
 			printk(KERN_ERR
@@ -164,15 +178,14 @@ int __request_module(bool wait, const char *fmt, ...)
 			       module_name);
 			kmod_loop_msg++;
 		}
-		atomic_dec(&kmod_concurrent);
-		return -ENOMEM;
+		return ret;
 	}
 
 	trace_module_request(module_name, wait, _RET_IP_);
 
 	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
 
-	atomic_dec(&kmod_concurrent);
+	kmod_umh_threads_put();
 	return ret;
 }
 EXPORT_SYMBOL(__request_module);
-- 
2.11.0
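
Editor's sketch of the get/put pattern patch 3 introduces, modelled in
userspace with C11 <stdatomic.h> instead of the kernel's atomic_t API. The
names slot_get() and slot_put() and the limit of 2 are hypothetical, chosen
only so the behaviour is easy to follow; the error code matches this patch
(-ENOMEM), before patch 4 changes it.

```c
#include <errno.h>
#include <stdatomic.h>

/* Illustrative stand-ins for kmod_concurrent and max_modprobes. */
static atomic_uint concurrent;
static unsigned int limit = 2;

/* Take a slot; fail if that would exceed the limit. */
int slot_get(void)
{
	atomic_fetch_add(&concurrent, 1);
	if (atomic_load(&concurrent) <= limit)
		return 0;

	/* Over the limit: undo our increment and report failure. */
	atomic_fetch_sub(&concurrent, 1);
	return -ENOMEM;
}

/* Release a slot taken with a successful slot_get(). */
void slot_put(void)
{
	atomic_fetch_sub(&concurrent, 1);
}
```

Every successful slot_get() must be paired with one slot_put(); a failed
slot_get() has already undone its own increment and needs no put.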


* [PATCH 4/6] kmod: return -EBUSY if modprobe limit is reached
  2017-05-19  3:24 [PATCH 0/6] kmod: few simple enhancements Luis R. Rodriguez
                   ` (2 preceding siblings ...)
  2017-05-19  3:24 ` [PATCH 3/6] kmod: provide wrappers for kmod_concurrent inc/dec Luis R. Rodriguez
@ 2017-05-19  3:24 ` Luis R. Rodriguez
  2017-05-19  3:24 ` [PATCH 5/6] kmod: preempt on kmod_umh_threads_get() Luis R. Rodriguez
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-19  3:24 UTC (permalink / raw)
  To: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, gregkh, linux-kselftest, linux-doc,
	linux-kernel, Luis R. Rodriguez

Hitting our modprobe limit is not a memory failure but
a system-specific limitation established to avoid a possible
recursive issue with modprobe. Returning -EBUSY gives userspace a
better idea of what happened if we can't load a module; it could use
this to wait and try again.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 kernel/kmod.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index 7635915dc91c..563600fc9bb1 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -117,7 +117,7 @@ static int kmod_umh_threads_get(void)
 	if (atomic_read(&kmod_concurrent) <= max_modprobes)
 		return 0;
 	atomic_dec(&kmod_concurrent);
-	return -ENOMEM;
+	return -EBUSY;
 }
 
 static void kmod_umh_threads_put(void)
-- 
2.11.0
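
Editor's sketch of the "wait and try again" idea mentioned in patch 4.
Nothing here is part of the series: load_with_retry() and the fake loader
busy_twice_then_ok() are hypothetical, and a real caller would sleep or
back off between attempts where the comment indicates.

```c
#include <errno.h>

/* Any function that attempts a module load and returns 0 or -errno. */
typedef int (*load_fn)(const char *name);

/*
 * Retry only while the loader reports -EBUSY; any other result,
 * success or failure, is returned immediately.
 */
int load_with_retry(load_fn load, const char *name, int max_tries)
{
	int ret = -EBUSY;

	for (int i = 0; i < max_tries && ret == -EBUSY; i++) {
		/* A real caller would wait here before retrying. */
		ret = load(name);
	}
	return ret;
}

/* Fake loader for illustration: busy on the first two calls. */
static int fake_calls;

int busy_twice_then_ok(const char *name)
{
	(void)name;
	return ++fake_calls <= 2 ? -EBUSY : 0;
}
```

Distinguishing -EBUSY from -ENOMEM is what makes such a loop reasonable:
a transient limit is worth retrying, while other errors are not.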


* [PATCH 5/6] kmod: preempt on kmod_umh_threads_get()
  2017-05-19  3:24 [PATCH 0/6] kmod: few simple enhancements Luis R. Rodriguez
                   ` (3 preceding siblings ...)
  2017-05-19  3:24 ` [PATCH 4/6] kmod: return -EBUSY if modprobe limit is reached Luis R. Rodriguez
@ 2017-05-19  3:24 ` Luis R. Rodriguez
  2017-05-19 22:27   ` Dmitry Torokhov
  2017-05-19  3:24 ` [PATCH 6/6] kmod: use simplified rate limit printk Luis R. Rodriguez
  2017-05-26  0:16 ` [PATCH v2 0/5] kmod: help make deterministic Luis R. Rodriguez
  6 siblings, 1 reply; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-19  3:24 UTC (permalink / raw)
  To: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, gregkh, linux-kselftest, linux-doc,
	linux-kernel, Luis R. Rodriguez

In theory multiple concurrent threads can call
kmod_umh_threads_get(), and as such atomic_inc(&kmod_concurrent), at
the same time, leaving a small window during which we've bumped
kmod_concurrent but have not really enabled work. By disabling
preemption we mitigate this a bit.

Disabling preemption is not needed in kmod_umh_threads_put().

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 kernel/kmod.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index 563600fc9bb1..7ea11dbc7564 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -113,15 +113,35 @@ static int call_modprobe(char *module_name, int wait)
 
 static int kmod_umh_threads_get(void)
 {
+	int ret = 0;
+
+	/*
+	 * Disabling preemption makes sure that we are not rescheduled here.
+	 *
+	 * It also helps ensure kmod_concurrent is not left incremented by
+	 * mistake for too long: in theory two concurrent threads could race
+	 * on atomic_inc() before we atomic_read(). We know that's possible
+	 * but we don't care, this is not used for object accounting and is
+	 * just a subjective threshold. The alternative is a lock.
+	 */
+	preempt_disable();
 	atomic_inc(&kmod_concurrent);
 	if (atomic_read(&kmod_concurrent) <= max_modprobes)
-		return 0;
+		goto out;
+
 	atomic_dec(&kmod_concurrent);
-	return -EBUSY;
+	ret = -EBUSY;
+out:
+	preempt_enable();
+	return ret;
 }
 
 static void kmod_umh_threads_put(void)
 {
+	/*
+	 * Preemption is not needed given once work is done we can
+	 * pace ourselves on our way out.
+	 */
 	atomic_dec(&kmod_concurrent);
 }
 
-- 
2.11.0


* [PATCH 6/6] kmod: use simplified rate limit printk
  2017-05-19  3:24 [PATCH 0/6] kmod: few simple enhancements Luis R. Rodriguez
                   ` (4 preceding siblings ...)
  2017-05-19  3:24 ` [PATCH 5/6] kmod: preempt on kmod_umh_threads_get() Luis R. Rodriguez
@ 2017-05-19  3:24 ` Luis R. Rodriguez
  2017-05-19 22:23   ` Dmitry Torokhov
  2017-05-26  0:16 ` [PATCH v2 0/5] kmod: help make deterministic Luis R. Rodriguez
  6 siblings, 1 reply; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-19  3:24 UTC (permalink / raw)
  To: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, gregkh, linux-kselftest, linux-doc,
	linux-kernel, Luis R. Rodriguez

Just use the simplified rate limit printk when the max modprobe
limit is reached. While at it, throw out a bone should the error
be triggered.

Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 kernel/kmod.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index 7ea11dbc7564..56cd2a16e7ac 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -166,7 +166,6 @@ int __request_module(bool wait, const char *fmt, ...)
 	va_list args;
 	char module_name[MODULE_NAME_LEN];
 	int ret;
-	static int kmod_loop_msg;
 
 	/*
 	 * We don't allow synchronous module loading from async.  Module
@@ -191,13 +190,8 @@ int __request_module(bool wait, const char *fmt, ...)
 
 	ret = kmod_umh_threads_get();
 	if (ret) {
-		/* We may be blaming an innocent here, but unlikely */
-		if (kmod_loop_msg < 5) {
-			printk(KERN_ERR
-			       "request_module: runaway loop modprobe %s\n",
-			       module_name);
-			kmod_loop_msg++;
-		}
+		pr_err_ratelimited("%s: module \"%s\" reached limit (%u) of concurrent modprobe calls\n",
+				   __func__, module_name, max_modprobes);
 		return ret;
 	}
 
-- 
2.11.0


* Re: [PATCH 1/6] kmod: add dynamic max concurrent thread count
  2017-05-19  3:24 ` [PATCH 1/6] kmod: add dynamic max concurrent thread count Luis R. Rodriguez
@ 2017-05-19 20:44   ` Dmitry Torokhov
       [not found]     ` <CAB=NE6XGL24O+JfTNUG0HO4obhDc-v+HyL0SCrQELiZrj2-qNw@mail.gmail.com>
  0 siblings, 1 reply; 69+ messages in thread
From: Dmitry Torokhov @ 2017-05-19 20:44 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, jeyu, rusty, ebiederm, acme, corbet, martin.wilck, mmarek,
	pmladek, hare, rwright, jeffm, DSterba, fdmanana, neilb, linux,
	rgoldwyn, subashab, xypron.glpk, keescook, atomlin, mbenes,
	paulmck, dan.j.williams, jpoimboe, davem, mingo, akpm, torvalds,
	gregkh, linux-kselftest, linux-doc, linux-kernel

On Thu, May 18, 2017 at 08:24:39PM -0700, Luis R. Rodriguez wrote:
> We currently statically limit the number of modprobe threads which
> we allow to run concurrently to 50. As per Keith Owens, this was a
> completely arbitrary value, and it was set in the 2.3.38 days [0]
> over 16 years ago, in the year 2000.
> 
> Although we haven't yet hit our low limit, experimentation [1]
> shows that when and if we do hit it, the worst case will be
> fatal -- consider get_fs_type() failures upon mount on a system which
> has many partitions, some of which might even use the same
> filesystem. It's best to be prudent and raise this
> value to something more sensible which ensures we're far from hitting
> the limit, and which also allows a build time default and a user run
> time override.
> 
> The worst case is fatal given that once a module fails to load there
> is a period of time during which subsequent requests for the same module
> will fail, so in the case of partitions it's not just one request that
> could fail, but a whole series of partitions. This latter issue of a
> module request failure domino effect can be addressed later, but
> increasing the limit to something more meaningful should at least give us
> enough cushion to avoid this for a while.
> 
> Set this value up with more meaningful modern limits:
> 
> Bump this up to 64  max for small systems (CONFIG_BASE_SMALL)
> Bump this up to 128 max for larger systems (!CONFIG_BASE_SMALL)
> 
> Also allow the default max limit to be further fine tuned at compile
> time, and at initialization at boot using the kernel
> parameter: kmod.max_modprobes.
> 
> [0] https://git.kernel.org/cgit/linux/kernel/git/history/history.git/commit/?id=ab1c4ec7410f6ec64e1511d1a7d850fc99c09b44
> [1] https://github.com/mcgrof/test_request_module

If we actually run into this issue, instead of slamming the system with
a bazillion concurrent requests, can we wait for the other modprobes to
finish and then continue?

Thanks.

-- 
Dmitry


* Re: [PATCH 1/6] kmod: add dynamic max concurrent thread count
       [not found]           ` <CAB=NE6Vqmx=y6muenpuQKynTP=pGWMF8tzoCA0BXD6d63q9wPg@mail.gmail.com>
@ 2017-05-19 21:58             ` Dmitry Torokhov
  2017-05-25 16:22               ` Luis R. Rodriguez
  0 siblings, 1 reply; 69+ messages in thread
From: Dmitry Torokhov @ 2017-05-19 21:58 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Filipe Manana, Paul E. McKenney, linux-doc, rgoldwyn, hare,
	corbet, torvalds, linux-kselftest, akpm, dan.j.williams, atomlin,
	rwright, xypron.glpk, mmarek, martin.wilck, rusty, jeffm, mingo,
	pmladek, linux, ebiederm, shuah, DSterba, keescook, gregkh,
	jpoimboe, acme, mbenes, neilb, linux-kernel, davem, jeyu,
	subashab

On Fri, May 19, 2017 at 02:45:29PM -0700, Luis R. Rodriguez wrote:
> On May 19, 2017 1:45 PM, "Dmitry Torokhov" <dmitry.torokhov@gmail.com>
> wrote:
> 
> On Thu, May 18, 2017 at 08:24:39PM -0700, Luis R. Rodriguez wrote:
> > We currently statically limit the number of modprobe threads which
> > we allow to run concurrently to 50. As per Keith Owens, this was a
> > completely arbitrary value, and it was set in the 2.3.38 days [0]
> > over 16 years ago, in the year 2000.
> >
> > Although we haven't yet hit our low limit, experimentation [1]
> > shows that when and if we do hit it, the worst case will be
> > fatal -- consider get_fs_type() failures upon mount on a system which
> > has many partitions, some of which might even use the same
> > filesystem. It's best to be prudent and raise this
> > value to something more sensible which ensures we're far from hitting
> > the limit, and which also allows a build time default and a user run
> > time override.
> >
> > The worst case is fatal given that once a module fails to load there
> > is a period of time during which subsequent requests for the same module
> > will fail, so in the case of partitions it's not just one request that
> > could fail, but a whole series of partitions. This latter issue of a
> > module request failure domino effect can be addressed later, but
> > increasing the limit to something more meaningful should at least give us
> > enough cushion to avoid this for a while.
> >
> > Set this value up with more meaningful modern limits:
> >
> > Bump this up to 64  max for small systems (CONFIG_BASE_SMALL)
> > Bump this up to 128 max for larger systems (!CONFIG_BASE_SMALL)
> >
> > Also allow the default max limit to be further fine tuned at compile
> > time, and at initialization at boot using the kernel
> > parameter: kmod.max_modprobes.
> >
> > [0] https://git.kernel.org/cgit/linux/kernel/git/history/history.git/commit/?id=ab1c4ec7410f6ec64e1511d1a7d850fc99c09b44
> > [1] https://github.com/mcgrof/test_request_module
> 
> If we actually run into this issue, instead of slamming the system with
> a bazillion concurrent requests, can we wait for the other modprobes to
> finish and then continue?
> 
> 
> Yes! I have a patch that does precisely that! That is actually still
> *not enough* to avoid failing fatally, but it would be the subject of
> another series with more debatable approaches.
> 

Then please post it.

> This at least pushes us closer to safer limits for now while also making it
> configurable.

Making it configurable depending on how big or small the box is makes no
sense, especially if the above is implemented, as the depth of modprobe
invocations depends on configuration and not on the computing power of
the hardware the system is running on.

Thanks.

-- 
Dmitry


* Re: [PATCH 6/6] kmod: use simplified rate limit printk
  2017-05-19  3:24 ` [PATCH 6/6] kmod: use simplified rate limit printk Luis R. Rodriguez
@ 2017-05-19 22:23   ` Dmitry Torokhov
  2017-05-23  9:00     ` Petr Mladek
  0 siblings, 1 reply; 69+ messages in thread
From: Dmitry Torokhov @ 2017-05-19 22:23 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, jeyu, rusty, ebiederm, acme, corbet, martin.wilck, mmarek,
	pmladek, hare, rwright, jeffm, DSterba, fdmanana, neilb, linux,
	rgoldwyn, subashab, xypron.glpk, keescook, atomlin, mbenes,
	paulmck, dan.j.williams, jpoimboe, davem, mingo, akpm, torvalds,
	gregkh, linux-kselftest, linux-doc, linux-kernel

On Thu, May 18, 2017 at 08:24:44PM -0700, Luis R. Rodriguez wrote:
> Just use the simplified rate limit printk when the max modprobe
> limit is reached. While at it, throw out a bone should the error
> be triggered.
> 
> Reviewed-by: Petr Mladek <pmladek@suse.com>
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> ---
>  kernel/kmod.c | 10 ++--------
>  1 file changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index 7ea11dbc7564..56cd2a16e7ac 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -166,7 +166,6 @@ int __request_module(bool wait, const char *fmt, ...)
>  	va_list args;
>  	char module_name[MODULE_NAME_LEN];
>  	int ret;
> -	static int kmod_loop_msg;
>  
>  	/*
>  	 * We don't allow synchronous module loading from async.  Module
> @@ -191,13 +190,8 @@ int __request_module(bool wait, const char *fmt, ...)
>  
>  	ret = kmod_umh_threads_get();
>  	if (ret) {
> -		/* We may be blaming an innocent here, but unlikely */
> -		if (kmod_loop_msg < 5) {
> -			printk(KERN_ERR
> -			       "request_module: runaway loop modprobe %s\n",
> -			       module_name);
> -			kmod_loop_msg++;
> -		}
> +		pr_err_ratelimited("%s: module \"%s\" reached limit (%u) of concurrent modprobe calls\n",
> +				   __func__, module_name, max_modprobes);

This is completely different behavior, isn't it? Instead of reporting
the first 5 occurrences we are now reporting every once in a while. Why
is this new behavior better?

-- 
Dmitry
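
Editor's sketch of the behavioural difference raised above: a "first N
occurrences" counter, as in the removed kmod_loop_msg code, versus a
windowed limiter in the spirit of pr_err_ratelimited(). This is a
simplified userspace model, not the kernel's rate limit implementation;
the struct and function names, the tick-based clock and the interval and
burst values used to exercise it are all assumptions for illustration.

```c
#include <stdbool.h>

/* Old behaviour: report the first "limit" events, then stay silent. */
struct first_n {
	int printed;
	int limit;
};

bool first_n_allow(struct first_n *s)
{
	if (s->printed >= s->limit)
		return false;
	s->printed++;
	return true;
}

/* New behaviour: up to "burst" reports per "interval" ticks, forever. */
struct window_limit {
	long interval;	/* window length, in ticks */
	int burst;	/* reports allowed per window */
	long begin;	/* tick at which the current window started */
	int printed;	/* reports emitted in the current window */
};

bool window_allow(struct window_limit *s, long now)
{
	/* Start a fresh window once the current one has expired. */
	if (now - s->begin >= s->interval) {
		s->begin = now;
		s->printed = 0;
	}
	if (s->printed >= s->burst)
		return false;
	s->printed++;
	return true;
}
```

The first limiter never reports again once its budget is spent, however
long the system runs; the second can keep reporting indefinitely, but
never more than a burst per window.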


* Re: [PATCH 5/6] kmod: preempt on kmod_umh_threads_get()
  2017-05-19  3:24 ` [PATCH 5/6] kmod: preempt on kmod_umh_threads_get() Luis R. Rodriguez
@ 2017-05-19 22:27   ` Dmitry Torokhov
  2017-05-25  0:14     ` Luis R. Rodriguez
  0 siblings, 1 reply; 69+ messages in thread
From: Dmitry Torokhov @ 2017-05-19 22:27 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, jeyu, rusty, ebiederm, acme, corbet, martin.wilck, mmarek,
	pmladek, hare, rwright, jeffm, DSterba, fdmanana, neilb, linux,
	rgoldwyn, subashab, xypron.glpk, keescook, atomlin, mbenes,
	paulmck, dan.j.williams, jpoimboe, davem, mingo, akpm, torvalds,
	gregkh, linux-kselftest, linux-doc, linux-kernel

On Thu, May 18, 2017 at 08:24:43PM -0700, Luis R. Rodriguez wrote:
> In theory multiple concurrent threads can call
> kmod_umh_threads_get(), and as such atomic_inc(&kmod_concurrent), at
> the same time, leaving a small window during which we've bumped
> kmod_concurrent but have not really enabled work. By disabling
> preemption we mitigate this a bit.
> 
> Disabling preemption is not needed in kmod_umh_threads_put().
> 
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> ---
>  kernel/kmod.c | 24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index 563600fc9bb1..7ea11dbc7564 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -113,15 +113,35 @@ static int call_modprobe(char *module_name, int wait)
>  
>  static int kmod_umh_threads_get(void)
>  {
> +	int ret = 0;
> +
> +	/*
> +	 * Disabling preemption makes sure that we are not rescheduled here
> +	 *
> +	 * Also preemption helps kmod_concurrent is not increased by mistake
> +	 * for too long given in theory two concurrent threads could race on
> +	 * atomic_inc() before we atomic_read() -- we know that's possible
> +	 * and but we don't care, this is not used for object accounting and
> +	 * is just a subjective threshold. The alternative is a lock.
> +	 */
> +	preempt_disable();
>  	atomic_inc(&kmod_concurrent);
>  	if (atomic_read(&kmod_concurrent) <= max_modprobes)

That is a very "fancy" way to basically say:

	if (atomic_inc_return(&kmod_concurrent) <= max_modprobes)
		...

> -		return 0;
> +		goto out;
> +
>  	atomic_dec(&kmod_concurrent);
> -	return -EBUSY;
> +	ret = -EBUSY;
> +out:
> +	preempt_enable();
> +	return ret;
>  }
>  
>  static void kmod_umh_threads_put(void)
>  {
> +	/*
> +	 * Preemption is not needed given once work is done we can
> +	 * pace ourselves on our way out.
> +	 */
>  	atomic_dec(&kmod_concurrent);
>  }
>  
> -- 
> 2.11.0
> 

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 6/6] kmod: use simplified rate limit printk
  2017-05-19 22:23   ` Dmitry Torokhov
@ 2017-05-23  9:00     ` Petr Mladek
  0 siblings, 0 replies; 69+ messages in thread
From: Petr Mladek @ 2017-05-23  9:00 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Luis R. Rodriguez, shuah, jeyu, rusty, ebiederm, acme, corbet,
	martin.wilck, mmarek, hare, rwright, jeffm, DSterba, fdmanana,
	neilb, linux, rgoldwyn, subashab, xypron.glpk, keescook, atomlin,
	mbenes, paulmck, dan.j.williams, jpoimboe, davem, mingo, akpm,
	torvalds, gregkh, linux-kselftest, linux-doc, linux-kernel

On Fri 2017-05-19 15:23:27, Dmitry Torokhov wrote:
> On Thu, May 18, 2017 at 08:24:44PM -0700, Luis R. Rodriguez wrote:
> > Just use the simplified rate limit printk when the max modprobe
> > limit is reached, while at it throw out a bone should the error
> > be triggered.
> > 
> > Reviewed-by: Petr Mladek <pmladek@suse.com>
> > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > ---
> >  kernel/kmod.c | 10 ++--------
> >  1 file changed, 2 insertions(+), 8 deletions(-)
> > 
> > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > index 7ea11dbc7564..56cd2a16e7ac 100644
> > --- a/kernel/kmod.c
> > +++ b/kernel/kmod.c
> > @@ -166,7 +166,6 @@ int __request_module(bool wait, const char *fmt, ...)
> >  	va_list args;
> >  	char module_name[MODULE_NAME_LEN];
> >  	int ret;
> > -	static int kmod_loop_msg;
> >  
> >  	/*
> >  	 * We don't allow synchronous module loading from async.  Module
> > @@ -191,13 +190,8 @@ int __request_module(bool wait, const char *fmt, ...)
> >  
> >  	ret = kmod_umh_threads_get();
> >  	if (ret) {
> > -		/* We may be blaming an innocent here, but unlikely */
> > -		if (kmod_loop_msg < 5) {
> > -			printk(KERN_ERR
> > -			       "request_module: runaway loop modprobe %s\n",
> > -			       module_name);
> > -			kmod_loop_msg++;
> > -		}
> > +		pr_err_ratelimited("%s: module \"%s\" reached limit (%u) of concurrent modprobe calls\n",
> > +				   __func__, module_name, max_modprobes);
> 
> This is completely different behavior, isn't it? Instead of reporting
> first 5 occurrences we now reporting every once in a while. Why is this
> new behavior better?

pr_err_ratelimited() shows the first 10 messages by default,
see DEFAULT_RATELIMIT_BURST. In addition, it lets you see
the messages again after some time (5 sec by default, see
DEFAULT_RATELIMIT_INTERVAL). Therefore you could see if the bad
situation persists or if the limit was reached more times during
the system lifetime.
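
To make the burst/interval behavior concrete, here is a minimal userspace
sketch of that kind of rate limiting. It is only a model: the names
rl_state and rl_allow() are made up for illustration and are not the
kernel's ratelimit API, time is passed in as plain seconds, and the
window check is simplified.

```c
#include <stdbool.h>

/* Hypothetical userspace model; not the kernel's struct ratelimit_state. */
struct rl_state {
	long interval;	/* window length, in seconds in this model */
	int burst;	/* messages allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* messages emitted in the current window */
	bool started;	/* false until the first call opens a window */
};

/* Returns true if a message may be emitted at time 'now'. */
static bool rl_allow(struct rl_state *rs, long now)
{
	/* Open a fresh window on first use or once the interval has elapsed. */
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->started = true;
	}

	/* Burst used up: suppress until the next window. */
	if (rs->printed >= rs->burst)
		return false;

	rs->printed++;
	return true;
}
```

With interval = 5 and burst = 10, the first 10 calls in a window pass,
later ones are suppressed, and messages become visible again once the
window rolls over, unlike a fixed "first 5 only" counter.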

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 2/6] module: use list_for_each_entry_rcu() on find_module_all()
  2017-05-19  3:24 ` [PATCH 2/6] module: use list_for_each_entry_rcu() on find_module_all() Luis R. Rodriguez
@ 2017-05-23 14:46   ` Miroslav Benes
  0 siblings, 0 replies; 69+ messages in thread
From: Miroslav Benes @ 2017-05-23 14:46 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, jeyu, rusty, ebiederm, dmitry.torokhov, acme, corbet,
	martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, paulmck, dan.j.williams, jpoimboe, davem,
	mingo, akpm, torvalds, gregkh, linux-kselftest, linux-doc,
	linux-kernel

On Thu, 18 May 2017, Luis R. Rodriguez wrote:

> The module list has been using RCU in a lot of other calls
> for a while now, we just overlooked changing this one over to
> use RCU.
> 
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> ---
>  kernel/module.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/module.c b/kernel/module.c
> index 4a3665f8f837..70f494638974 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -602,7 +602,7 @@ static struct module *find_module_all(const char *name, size_t len,
>  
>  	module_assert_mutex_or_preempt();
>  
> -	list_for_each_entry(mod, &modules, list) {
> +	list_for_each_entry_rcu(mod, &modules, list) {
>  		if (!even_unformed && mod->state == MODULE_STATE_UNFORMED)
>  			continue;
>  		if (strlen(mod->name) == len && !memcmp(mod->name, name, len))

This makes sense. It would not be an issue if all callers of 
find_module_all() held module_mutex, but module_kallsyms_lookup_name() 
does not.

There is one more occurrence of just list_for_each_entry(mod, &modules, 
list) in kernel/module.c -- in module_kallsyms_on_each_symbol(). But that 
one is safe because we have to hold the mutex there.

Miroslav

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 5/6] kmod: preempt on kmod_umh_threads_get()
  2017-05-19 22:27   ` Dmitry Torokhov
@ 2017-05-25  0:14     ` Luis R. Rodriguez
  2017-05-25  0:45       ` Dmitry Torokhov
  0 siblings, 1 reply; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-25  0:14 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Luis R. Rodriguez, shuah, jeyu, rusty, ebiederm, acme, corbet,
	martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, gregkh, linux-kselftest, linux-doc,
	linux-kernel

On Fri, May 19, 2017 at 03:27:12PM -0700, Dmitry Torokhov wrote:
> On Thu, May 18, 2017 at 08:24:43PM -0700, Luis R. Rodriguez wrote:
> > In theory it is possible multiple concurrent threads will try to
> > kmod_umh_threads_get() and as such atomic_inc(&kmod_concurrent) at
> > the same time, therefore enabling a small time during which we've
> > bumped kmod_concurrent but have not really enabled work. By using
> > preemption we mitigate this a bit.
> > 
> > Preemption is not needed when we kmod_umh_threads_put().
> > 
> > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > ---
> >  kernel/kmod.c | 24 ++++++++++++++++++++++--
> >  1 file changed, 22 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > index 563600fc9bb1..7ea11dbc7564 100644
> > --- a/kernel/kmod.c
> > +++ b/kernel/kmod.c
> > @@ -113,15 +113,35 @@ static int call_modprobe(char *module_name, int wait)
> >  
> >  static int kmod_umh_threads_get(void)
> >  {
> > +	int ret = 0;
> > +
> > +	/*
> > +	 * Disabling preemption makes sure that we are not rescheduled here
> > +	 *
> > +	 * Also preemption helps kmod_concurrent is not increased by mistake
> > +	 * for too long given in theory two concurrent threads could race on
> > +	 * atomic_inc() before we atomic_read() -- we know that's possible
> > +	 * and but we don't care, this is not used for object accounting and
> > +	 * is just a subjective threshold. The alternative is a lock.
> > +	 */
> > +	preempt_disable();
> >  	atomic_inc(&kmod_concurrent);
> >  	if (atomic_read(&kmod_concurrent) <= max_modprobes)
> 
> That is very "fancy" way to basically say:
> 
> 	if (atomic_inc_return(&kmod_concurrent) <= max_modprobes)

Do you mean to combine the atomic_inc() and atomic_read() into one as you
noted (that is not a change in this patch), *or* that using a memory barrier
here with atomic_inc_return() should suffice to address the same issue and
avoid an explicit preemption enable / disable?

  Luis

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 5/6] kmod: preempt on kmod_umh_threads_get()
  2017-05-25  0:14     ` Luis R. Rodriguez
@ 2017-05-25  0:45       ` Dmitry Torokhov
  2017-05-25  1:00         ` Luis R. Rodriguez
  0 siblings, 1 reply; 69+ messages in thread
From: Dmitry Torokhov @ 2017-05-25  0:45 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, jeyu, rusty, ebiederm, acme, corbet, martin.wilck, mmarek,
	pmladek, hare, rwright, jeffm, DSterba, fdmanana, neilb, linux,
	rgoldwyn, subashab, xypron.glpk, keescook, atomlin, mbenes,
	paulmck, dan.j.williams, jpoimboe, davem, mingo, akpm, torvalds,
	gregkh, linux-kselftest, linux-doc, linux-kernel

On Thu, May 25, 2017 at 02:14:52AM +0200, Luis R. Rodriguez wrote:
> On Fri, May 19, 2017 at 03:27:12PM -0700, Dmitry Torokhov wrote:
> > On Thu, May 18, 2017 at 08:24:43PM -0700, Luis R. Rodriguez wrote:
> > > In theory it is possible multiple concurrent threads will try to
> > > kmod_umh_threads_get() and as such atomic_inc(&kmod_concurrent) at
> > > the same time, therefore enabling a small time during which we've
> > > bumped kmod_concurrent but have not really enabled work. By using
> > > preemption we mitigate this a bit.
> > > 
> > > Preemption is not needed when we kmod_umh_threads_put().
> > > 
> > > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > > ---
> > >  kernel/kmod.c | 24 ++++++++++++++++++++++--
> > >  1 file changed, 22 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > > index 563600fc9bb1..7ea11dbc7564 100644
> > > --- a/kernel/kmod.c
> > > +++ b/kernel/kmod.c
> > > @@ -113,15 +113,35 @@ static int call_modprobe(char *module_name, int wait)
> > >  
> > >  static int kmod_umh_threads_get(void)
> > >  {
> > > +	int ret = 0;
> > > +
> > > +	/*
> > > +	 * Disabling preemption makes sure that we are not rescheduled here
> > > +	 *
> > > +	 * Also preemption helps kmod_concurrent is not increased by mistake
> > > +	 * for too long given in theory two concurrent threads could race on
> > > +	 * atomic_inc() before we atomic_read() -- we know that's possible
> > > +	 * and but we don't care, this is not used for object accounting and
> > > +	 * is just a subjective threshold. The alternative is a lock.
> > > +	 */
> > > +	preempt_disable();
> > >  	atomic_inc(&kmod_concurrent);
> > >  	if (atomic_read(&kmod_concurrent) <= max_modprobes)
> > 
> > That is very "fancy" way to basically say:
> > 
> > 	if (atomic_inc_return(&kmod_concurrent) <= max_modprobes)
> 
> Do you mean to combine the atomic_inc() and atomic_read() in one as you noted
> (as that is not a change in this patch), *or* that using a memory barrier here
> with atomic_inc_return() should suffice to address the same and avoid an
> explicit preemption  enable / disable ?

I am saying that atomic_inc_return() will avoid the situation where more
than one thread increments the counter and each of them believes that it
is [not] allowed to start modprobe.

I have no idea why you think preempt_disable() would help here. It only
ensures that the current thread will not be preempted between the point
where you update the counter and where you check the result. It does not
stop interrupts nor does it affect other threads that might be updating
the same counter.
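
As a rough userspace illustration of the point, using C11 atomics as a
stand-in for the kernel's atomic_t: each caller tests the value produced
by its own increment, so a concurrent increment cannot slip in between
the update and the check. The names threads_get(), threads_put(),
nr_running and max_running are hypothetical, and the limit of 2 is
arbitrary.

```c
#include <errno.h>
#include <stdatomic.h>

static atomic_int nr_running;		/* stand-in for kmod_concurrent */
static const int max_running = 2;	/* illustrative limit only */

static int threads_get(void)
{
	/*
	 * atomic_fetch_add() returns the old value, so old + 1 is the
	 * value this caller's own increment produced.
	 */
	if (atomic_fetch_add(&nr_running, 1) + 1 <= max_running)
		return 0;

	/* Over the limit: undo our increment and report busy. */
	atomic_fetch_sub(&nr_running, 1);
	return -EBUSY;
}

static void threads_put(void)
{
	atomic_fetch_sub(&nr_running, 1);
}
```

Note that the counter can still exceed the limit for a moment between
the increment and the undo on the failure path.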

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 5/6] kmod: preempt on kmod_umh_threads_get()
  2017-05-25  0:45       ` Dmitry Torokhov
@ 2017-05-25  1:00         ` Luis R. Rodriguez
  2017-05-25  2:27           ` Dmitry Torokhov
  0 siblings, 1 reply; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-25  1:00 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Luis R. Rodriguez, shuah, jeyu, rusty, ebiederm, acme, corbet,
	martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, gregkh, linux-kselftest, linux-doc,
	linux-kernel

On Wed, May 24, 2017 at 05:45:37PM -0700, Dmitry Torokhov wrote:
> On Thu, May 25, 2017 at 02:14:52AM +0200, Luis R. Rodriguez wrote:
> > On Fri, May 19, 2017 at 03:27:12PM -0700, Dmitry Torokhov wrote:
> > > On Thu, May 18, 2017 at 08:24:43PM -0700, Luis R. Rodriguez wrote:
> > > > In theory it is possible multiple concurrent threads will try to
> > > > kmod_umh_threads_get() and as such atomic_inc(&kmod_concurrent) at
> > > > the same time, therefore enabling a small time during which we've
> > > > bumped kmod_concurrent but have not really enabled work. By using
> > > > preemption we mitigate this a bit.
> > > > 
> > > > Preemption is not needed when we kmod_umh_threads_put().
> > > > 
> > > > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > > > ---
> > > >  kernel/kmod.c | 24 ++++++++++++++++++++++--
> > > >  1 file changed, 22 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > > > index 563600fc9bb1..7ea11dbc7564 100644
> > > > --- a/kernel/kmod.c
> > > > +++ b/kernel/kmod.c
> > > > @@ -113,15 +113,35 @@ static int call_modprobe(char *module_name, int wait)
> > > >  
> > > >  static int kmod_umh_threads_get(void)
> > > >  {
> > > > +	int ret = 0;
> > > > +
> > > > +	/*
> > > > +	 * Disabling preemption makes sure that we are not rescheduled here
> > > > +	 *
> > > > +	 * Also preemption helps kmod_concurrent is not increased by mistake
> > > > +	 * for too long given in theory two concurrent threads could race on
> > > > +	 * atomic_inc() before we atomic_read() -- we know that's possible
> > > > +	 * and but we don't care, this is not used for object accounting and
> > > > +	 * is just a subjective threshold. The alternative is a lock.
> > > > +	 */
> > > > +	preempt_disable();
> > > >  	atomic_inc(&kmod_concurrent);
> > > >  	if (atomic_read(&kmod_concurrent) <= max_modprobes)
> > > 
> > > That is very "fancy" way to basically say:
> > > 
> > > 	if (atomic_inc_return(&kmod_concurrent) <= max_modprobes)
> > 
> > Do you mean to combine the atomic_inc() and atomic_read() in one as you noted
> > (as that is not a change in this patch), *or* that using a memory barrier here
> > with atomic_inc_return() should suffice to address the same and avoid an
> > explicit preemption  enable / disable ?
> 
> I am saying that atomic_inc_return() will avoid situation where you have
> more than one threads incrementing the counter and believing that they
> are [not] allowed to start modprobe.
> 
> I have no idea why you think preempt_disable() would help here. It only
> ensures that current thread will not be preempted between the point
> where you update the counter and where you check the result. It does not
> stop interrupts nor does it affect other threads that might be updating
> the same counter.

The preemption was inspired by __module_get() and try_module_get(), was that
rather silly ?

  Luis

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 5/6] kmod: preempt on kmod_umh_threads_get()
  2017-05-25  1:00         ` Luis R. Rodriguez
@ 2017-05-25  2:27           ` Dmitry Torokhov
  2017-05-25 11:19             ` Petr Mladek
  2017-05-25 15:18             ` Jessica Yu
  0 siblings, 2 replies; 69+ messages in thread
From: Dmitry Torokhov @ 2017-05-25  2:27 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: shuah, jeyu, rusty, ebiederm, acme, corbet, martin.wilck, mmarek,
	pmladek, hare, rwright, jeffm, DSterba, fdmanana, neilb, linux,
	rgoldwyn, subashab, xypron.glpk, keescook, atomlin, mbenes,
	paulmck, dan.j.williams, jpoimboe, davem, mingo, akpm, torvalds,
	gregkh, linux-kselftest, linux-doc, linux-kernel

On Thu, May 25, 2017 at 03:00:17AM +0200, Luis R. Rodriguez wrote:
> On Wed, May 24, 2017 at 05:45:37PM -0700, Dmitry Torokhov wrote:
> > On Thu, May 25, 2017 at 02:14:52AM +0200, Luis R. Rodriguez wrote:
> > > On Fri, May 19, 2017 at 03:27:12PM -0700, Dmitry Torokhov wrote:
> > > > On Thu, May 18, 2017 at 08:24:43PM -0700, Luis R. Rodriguez wrote:
> > > > > In theory it is possible multiple concurrent threads will try to
> > > > > kmod_umh_threads_get() and as such atomic_inc(&kmod_concurrent) at
> > > > > the same time, therefore enabling a small time during which we've
> > > > > bumped kmod_concurrent but have not really enabled work. By using
> > > > > preemption we mitigate this a bit.
> > > > > 
> > > > > Preemption is not needed when we kmod_umh_threads_put().
> > > > > 
> > > > > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > > > > ---
> > > > >  kernel/kmod.c | 24 ++++++++++++++++++++++--
> > > > >  1 file changed, 22 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > > > > index 563600fc9bb1..7ea11dbc7564 100644
> > > > > --- a/kernel/kmod.c
> > > > > +++ b/kernel/kmod.c
> > > > > @@ -113,15 +113,35 @@ static int call_modprobe(char *module_name, int wait)
> > > > >  
> > > > >  static int kmod_umh_threads_get(void)
> > > > >  {
> > > > > +	int ret = 0;
> > > > > +
> > > > > +	/*
> > > > > +	 * Disabling preemption makes sure that we are not rescheduled here
> > > > > +	 *
> > > > > +	 * Also preemption helps kmod_concurrent is not increased by mistake
> > > > > +	 * for too long given in theory two concurrent threads could race on
> > > > > +	 * atomic_inc() before we atomic_read() -- we know that's possible
> > > > > +	 * and but we don't care, this is not used for object accounting and
> > > > > +	 * is just a subjective threshold. The alternative is a lock.
> > > > > +	 */
> > > > > +	preempt_disable();
> > > > >  	atomic_inc(&kmod_concurrent);
> > > > >  	if (atomic_read(&kmod_concurrent) <= max_modprobes)
> > > > 
> > > > That is very "fancy" way to basically say:
> > > > 
> > > > 	if (atomic_inc_return(&kmod_concurrent) <= max_modprobes)
> > > 
> > > Do you mean to combine the atomic_inc() and atomic_read() in one as you noted
> > > (as that is not a change in this patch), *or* that using a memory barrier here
> > > with atomic_inc_return() should suffice to address the same and avoid an
> > > explicit preemption  enable / disable ?
> > 
> > I am saying that atomic_inc_return() will avoid situation where you have
> > more than one threads incrementing the counter and believing that they
> > are [not] allowed to start modprobe.
> > 
> > I have no idea why you think preempt_disable() would help here. It only
> > ensures that current thread will not be preempted between the point
> > where you update the counter and where you check the result. It does not
> > stop interrupts nor does it affect other threads that might be updating
> > the same counter.
> 
> The preemption was inspired by __module_get() and try_module_get(), was that
> rather silly ?

As far as I can see preempt_disable() was needed in __module_get() when
modules used per-cpu refcounts: you did not want to move away from the CPU
while manipulating the refcount.

Now that modules use simple atomics for refcounting I think these
preempt_disable() and preempt_enable() can be removed.

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 5/6] kmod: preempt on kmod_umh_threads_get()
  2017-05-25  2:27           ` Dmitry Torokhov
@ 2017-05-25 11:19             ` Petr Mladek
  2017-05-25 15:38               ` Luis R. Rodriguez
  2017-05-25 16:42               ` Dmitry Torokhov
  2017-05-25 15:18             ` Jessica Yu
  1 sibling, 2 replies; 69+ messages in thread
From: Petr Mladek @ 2017-05-25 11:19 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Luis R. Rodriguez, shuah, jeyu, rusty, ebiederm, acme, corbet,
	martin.wilck, mmarek, hare, rwright, jeffm, DSterba, fdmanana,
	neilb, linux, rgoldwyn, subashab, xypron.glpk, keescook, atomlin,
	mbenes, paulmck, dan.j.williams, jpoimboe, davem, mingo, akpm,
	torvalds, gregkh, linux-kselftest, linux-doc, linux-kernel

On Wed 2017-05-24 19:27:38, Dmitry Torokhov wrote:
> On Thu, May 25, 2017 at 03:00:17AM +0200, Luis R. Rodriguez wrote:
> > On Wed, May 24, 2017 at 05:45:37PM -0700, Dmitry Torokhov wrote:
> > > On Thu, May 25, 2017 at 02:14:52AM +0200, Luis R. Rodriguez wrote:
> > > > On Fri, May 19, 2017 at 03:27:12PM -0700, Dmitry Torokhov wrote:
> > > > > On Thu, May 18, 2017 at 08:24:43PM -0700, Luis R. Rodriguez wrote:
> > > > > > In theory it is possible multiple concurrent threads will try to
> > > > > > kmod_umh_threads_get() and as such atomic_inc(&kmod_concurrent) at
> > > > > > the same time, therefore enabling a small time during which we've
> > > > > > bumped kmod_concurrent but have not really enabled work. By using
> > > > > > preemption we mitigate this a bit.
> > > > > > 
> > > > > > Preemption is not needed when we kmod_umh_threads_put().
> > > > > > 
> > > > > > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > > > > > ---
> > > > > >  kernel/kmod.c | 24 ++++++++++++++++++++++--
> > > > > >  1 file changed, 22 insertions(+), 2 deletions(-)
> > > > > > 
> > > > > > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > > > > > index 563600fc9bb1..7ea11dbc7564 100644
> > > > > > --- a/kernel/kmod.c
> > > > > > +++ b/kernel/kmod.c
> > > > > > @@ -113,15 +113,35 @@ static int call_modprobe(char *module_name, int wait)
> > > > > >  
> > > > > >  static int kmod_umh_threads_get(void)
> > > > > >  {
> > > > > > +	int ret = 0;
> > > > > > +
> > > > > > +	/*
> > > > > > +	 * Disabling preemption makes sure that we are not rescheduled here
> > > > > > +	 *
> > > > > > +	 * Also preemption helps kmod_concurrent is not increased by mistake
> > > > > > +	 * for too long given in theory two concurrent threads could race on
> > > > > > +	 * atomic_inc() before we atomic_read() -- we know that's possible
> > > > > > +	 * and but we don't care, this is not used for object accounting and
> > > > > > +	 * is just a subjective threshold. The alternative is a lock.
> > > > > > +	 */
> > > > > > +	preempt_disable();
> > > > > >  	atomic_inc(&kmod_concurrent);
> > > > > >  	if (atomic_read(&kmod_concurrent) <= max_modprobes)
> > > > > 
> > > > > That is very "fancy" way to basically say:
> > > > > 
> > > > > 	if (atomic_inc_return(&kmod_concurrent) <= max_modprobes)
> > > > 
> > > > Do you mean to combine the atomic_inc() and atomic_read() in one as you noted
> > > > (as that is not a change in this patch), *or* that using a memory barrier here
> > > > with atomic_inc_return() should suffice to address the same and avoid an
> > > > explicit preemption  enable / disable ?
> > > 
> > > I am saying that atomic_inc_return() will avoid situation where you have
> > > more than one threads incrementing the counter and believing that they
> > > are [not] allowed to start modprobe.
> > > 
> > > I have no idea why you think preempt_disable() would help here. It only
> > > ensures that current thread will not be preempted between the point
> > > where you update the counter and where you check the result. It does not
> > > stop interrupts nor does it affect other threads that might be updating
> > > the same counter.
> > 
> > The preemption was inspired by __module_get() and try_module_get(), was that
> > rather silly ?
> 
> As far as I can see preempt_disable() was needed in __module_get() when
> modules used per-cpu refcounts: you did not want to move away from CPU
> while manipulating refcount.
> 
> Now that modules use simple atomics for refcounting I think these
> preempt_disable() and preempt_enable() can be removed.

preempt_disable() still might be useful because you do the
atomic_dec() when you reach the limit.

In other words, you have three operations that should be atomic:
inc, read, and dec. atomic_inc_return() covers only two of them.

Hmm, a solution might be to use atomic_dec_if_positive().
I would rename kmod_concurrent to something like kmod_concurrent_allowed
and initialize it with the maximum allowed number. Then you could do:

static int kmod_umh_threads_get(void)
{
	if (atomic_dec_if_positive(&kmod_concurrent_allowed) < 0)
		return -EBUSY;
	return 0;
}
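
For reference, a userspace sketch of how such a "decrement only if the
result stays non-negative" operation can be built from a compare-and-swap
loop, here with C11 atomics. The helper dec_if_positive() and the names
slots_free, slots_get() and slots_put() are hypothetical; the helper is
meant to mirror the return convention of atomic_dec_if_positive() (old
value minus one, whether or not the decrement happened), and the initial
value of 2 is arbitrary.

```c
#include <errno.h>
#include <stdatomic.h>

static atomic_int slots_free = 2;	/* initialized to the allowed maximum */

/*
 * Decrement *v only if the result would not be negative.
 * Returns the old value minus one in either case.
 */
static int dec_if_positive(atomic_int *v)
{
	int old = atomic_load(v);

	while (old > 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old - 1))
			break;
	}
	return old - 1;
}

static int slots_get(void)
{
	/* A negative result means no slot was taken. */
	return dec_if_positive(&slots_free) < 0 ? -EBUSY : 0;
}

static void slots_put(void)
{
	atomic_fetch_add(&slots_free, 1);
}
```

Because the check and the decrement happen in one atomic step, there is
no failure path that has to undo an earlier increment.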

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 5/6] kmod: preempt on kmod_umh_threads_get()
  2017-05-25  2:27           ` Dmitry Torokhov
  2017-05-25 11:19             ` Petr Mladek
@ 2017-05-25 15:18             ` Jessica Yu
  1 sibling, 0 replies; 69+ messages in thread
From: Jessica Yu @ 2017-05-25 15:18 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Luis R. Rodriguez, shuah, rusty, ebiederm, acme, corbet,
	martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, gregkh, linux-kselftest, linux-doc,
	linux-kernel

+++ Dmitry Torokhov [24/05/17 19:27 -0700]:
>On Thu, May 25, 2017 at 03:00:17AM +0200, Luis R. Rodriguez wrote:
>> On Wed, May 24, 2017 at 05:45:37PM -0700, Dmitry Torokhov wrote:
>> > On Thu, May 25, 2017 at 02:14:52AM +0200, Luis R. Rodriguez wrote:
>> > > On Fri, May 19, 2017 at 03:27:12PM -0700, Dmitry Torokhov wrote:
>> > > > On Thu, May 18, 2017 at 08:24:43PM -0700, Luis R. Rodriguez wrote:
>> > > > > In theory it is possible multiple concurrent threads will try to
>> > > > > kmod_umh_threads_get() and as such atomic_inc(&kmod_concurrent) at
>> > > > > the same time, therefore enabling a small time during which we've
>> > > > > bumped kmod_concurrent but have not really enabled work. By using
>> > > > > preemption we mitigate this a bit.
>> > > > >
>> > > > > Preemption is not needed when we kmod_umh_threads_put().
>> > > > >
>> > > > > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
>> > > > > ---
>> > > > >  kernel/kmod.c | 24 ++++++++++++++++++++++--
>> > > > >  1 file changed, 22 insertions(+), 2 deletions(-)
>> > > > >
>> > > > > diff --git a/kernel/kmod.c b/kernel/kmod.c
>> > > > > index 563600fc9bb1..7ea11dbc7564 100644
>> > > > > --- a/kernel/kmod.c
>> > > > > +++ b/kernel/kmod.c
>> > > > > @@ -113,15 +113,35 @@ static int call_modprobe(char *module_name, int wait)
>> > > > >
>> > > > >  static int kmod_umh_threads_get(void)
>> > > > >  {
>> > > > > +	int ret = 0;
>> > > > > +
>> > > > > +	/*
>> > > > > +	 * Disabling preemption makes sure that we are not rescheduled here
>> > > > > +	 *
>> > > > > +	 * Also preemption helps kmod_concurrent is not increased by mistake
>> > > > > +	 * for too long given in theory two concurrent threads could race on
>> > > > > +	 * atomic_inc() before we atomic_read() -- we know that's possible
>> > > > > +	 * and but we don't care, this is not used for object accounting and
>> > > > > +	 * is just a subjective threshold. The alternative is a lock.
>> > > > > +	 */
>> > > > > +	preempt_disable();
>> > > > >  	atomic_inc(&kmod_concurrent);
>> > > > >  	if (atomic_read(&kmod_concurrent) <= max_modprobes)
>> > > >
>> > > > That is very "fancy" way to basically say:
>> > > >
>> > > > 	if (atomic_inc_return(&kmod_concurrent) <= max_modprobes)
>> > >
>> > > Do you mean to combine the atomic_inc() and atomic_read() in one as you noted
>> > > (as that is not a change in this patch), *or* that using a memory barrier here
>> > > with atomic_inc_return() should suffice to address the same and avoid an
>> > > explicit preemption  enable / disable ?
>> >
>> > I am saying that atomic_inc_return() will avoid situation where you have
>> > more than one threads incrementing the counter and believing that they
>> > are [not] allowed to start modprobe.
>> >
>> > I have no idea why you think preempt_disable() would help here. It only
>> > ensures that current thread will not be preempted between the point
>> > where you update the counter and where you check the result. It does not
>> > stop interrupts nor does it affect other threads that might be updating
>> > the same counter.
>>
>> The preemption was inspired by __module_get() and try_module_get(), was that
>> rather silly ?
>
>As far as I can see preempt_disable() was needed in __module_get() when
>modules used per-cpu refcounts: you did not want to move away from CPU
>while manipulating refcount.
>
>Now that modules use simple atomics for refcounting I think these
>preempt_disable() and preempt_enable() can be removed.

Yup, preempt_disable/enable was originally used for percpu module
refcounting. AFAIK they are artifacts that remained from commit
e1783a240f4 "use this_cpu_xx to dynamically allocate counters" and
subsequently commit 2f35c41f58a "Replace module_ref with atomic_t
refcnt" removed the need for it.

Jessica

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 5/6] kmod: preempt on kmod_umh_threads_get()
  2017-05-25 11:19             ` Petr Mladek
@ 2017-05-25 15:38               ` Luis R. Rodriguez
  2017-05-25 16:42               ` Dmitry Torokhov
  1 sibling, 0 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-25 15:38 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Dmitry Torokhov, Luis R. Rodriguez, shuah, jeyu, rusty, ebiederm,
	acme, corbet, martin.wilck, mmarek, hare, rwright, jeffm,
	DSterba, fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, akpm, torvalds, gregkh, linux-kselftest, linux-doc,
	linux-kernel

On Thu, May 25, 2017 at 01:19:31PM +0200, Petr Mladek wrote:
> On Wed 2017-05-24 19:27:38, Dmitry Torokhov wrote:
> > On Thu, May 25, 2017 at 03:00:17AM +0200, Luis R. Rodriguez wrote:
> > > On Wed, May 24, 2017 at 05:45:37PM -0700, Dmitry Torokhov wrote:
> > > > On Thu, May 25, 2017 at 02:14:52AM +0200, Luis R. Rodriguez wrote:
> > > > > On Fri, May 19, 2017 at 03:27:12PM -0700, Dmitry Torokhov wrote:
> > > > > > On Thu, May 18, 2017 at 08:24:43PM -0700, Luis R. Rodriguez wrote:
> > > > > > > In theory it is possible multiple concurrent threads will try to
> > > > > > > kmod_umh_threads_get() and as such atomic_inc(&kmod_concurrent) at
> > > > > > > the same time, therefore enabling a small time during which we've
> > > > > > > bumped kmod_concurrent but have not really enabled work. By using
> > > > > > > preemption we mitigate this a bit.
> > > > > > > 
> > > > > > > Preemption is not needed when we kmod_umh_threads_put().
> > > > > > > 
> > > > > > > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > > > > > > ---
> > > > > > >  kernel/kmod.c | 24 ++++++++++++++++++++++--
> > > > > > >  1 file changed, 22 insertions(+), 2 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > > > > > > index 563600fc9bb1..7ea11dbc7564 100644
> > > > > > > --- a/kernel/kmod.c
> > > > > > > +++ b/kernel/kmod.c
> > > > > > > @@ -113,15 +113,35 @@ static int call_modprobe(char *module_name, int wait)
> > > > > > >  
> > > > > > >  static int kmod_umh_threads_get(void)
> > > > > > >  {
> > > > > > > +	int ret = 0;
> > > > > > > +
> > > > > > > +	/*
> > > > > > > +	 * Disabling preemption makes sure that we are not rescheduled here
> > > > > > > +	 *
> > > > > > > > +	 * Disabling preemption also keeps kmod_concurrent from staying bumped
> > > > > > > > +	 * for too long, given in theory two concurrent threads could race on
> > > > > > > > +	 * atomic_inc() before we atomic_read() -- we know that's possible
> > > > > > > > +	 * but we don't care, this is not used for object accounting and
> > > > > > > > +	 * is just a subjective threshold. The alternative is a lock.
> > > > > > > +	 */
> > > > > > > +	preempt_disable();
> > > > > > >  	atomic_inc(&kmod_concurrent);
> > > > > > >  	if (atomic_read(&kmod_concurrent) <= max_modprobes)
> > > > > > 
> > > > > > That is very "fancy" way to basically say:
> > > > > > 
> > > > > > 	if (atomic_inc_return(&kmod_concurrent) <= max_modprobes)
> > > > > 
> > > > > Do you mean to combine the atomic_inc() and atomic_read() in one as you noted
> > > > > (as that is not a change in this patch), *or* that using a memory barrier here
> > > > > with atomic_inc_return() should suffice to address the same and avoid an
> > > > > explicit preemption  enable / disable ?
> > > > 
> > > > I am saying that atomic_inc_return() will avoid the situation where you
> > > > have more than one thread incrementing the counter and each believing
> > > > that it is [not] allowed to start modprobe.
> > > > 
> > > > I have no idea why you think preempt_disable() would help here. It only
> > > > ensures that current thread will not be preempted between the point
> > > > where you update the counter and where you check the result. It does not
> > > > stop interrupts nor does it affect other threads that might be updating
> > > > the same counter.
> > > 
> > > The preemption was inspired by __module_get() and try_module_get(), was that
> > > rather silly ?
> > 
> > As far as I can see preempt_disable() was needed in __module_get() when
> > modules used per-cpu refcounts: you did not want to move away from the CPU
> > while manipulating the refcount.
> > 
> > Now that modules use simple atomics for refcounting I think these
> > preempt_disable() and preempt_enable() can be removed.
> 
> preempt_disable() still might be useful because you do the
> atomic_dec() when you reach the limit.
> 
> In other words, you have three operations that should be atomic:
> inc, read, and dec. atomic_inc_return() covers only two of them.
> 
> Hmm, a solution might be to use atomic_dec_if_positive().
> I would rename kmod_concurrent to something like kmod_concurrent_available
> and initialize it with the maximum allowed number. Then you could do:
> 
> static int kmod_umh_threads_get(void)
> {
> 	if (atomic_dec_if_positive(&kmod_concurrent_available) < 0)
> 		return -EBUSY;
> 	return 0;
> }

I like this much more, thanks!

 Luis
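
The slot-counter idea above can be tried out in userspace. What follows is
only a minimal sketch under stated assumptions, not kernel code: it uses C11
<stdatomic.h> in place of the kernel's atomic_t, MAX_MODPROBES is a made-up
limit, dec_if_positive() is a hand-written stand-in for
atomic_dec_if_positive(), and kmod_umh_threads_put() is assumed to simply
hand the slot back.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical limit for illustration; the thread discusses 50, 64 and 128. */
#define MAX_MODPROBES 2

/* Number of free slots: starts at the limit and counts down. */
static atomic_int kmod_concurrent_available = MAX_MODPROBES;

/*
 * Userspace stand-in for the kernel's atomic_dec_if_positive():
 * decrement *v only if the result would stay >= 0, and return the
 * old value minus one whether or not the decrement happened.
 */
static int dec_if_positive(atomic_int *v)
{
	int old = atomic_load(v);

	/* A failed compare-exchange reloads 'old', so just retry. */
	while (old > 0 &&
	       !atomic_compare_exchange_weak(v, &old, old - 1))
		;

	return old - 1;
}

/* Take a slot, or report -EBUSY without touching the counter. */
static int kmod_umh_threads_get(void)
{
	if (dec_if_positive(&kmod_concurrent_available) < 0)
		return -EBUSY;
	return 0;
}

/* Assumed counterpart: give the slot back. */
static void kmod_umh_threads_put(void)
{
	atomic_fetch_add(&kmod_concurrent_available, 1);
}
```

With a limit of 2, two calls to kmod_umh_threads_get() succeed and a third
returns -EBUSY while leaving the counter at 0. A failed caller never modifies
the counter, so there is nothing to undo afterwards.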


* Re: [PATCH 1/6] kmod: add dynamic max concurrent thread count
  2017-05-19 21:58             ` Dmitry Torokhov
@ 2017-05-25 16:22               ` Luis R. Rodriguez
  2017-05-25 16:38                 ` Dmitry Torokhov
  0 siblings, 1 reply; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-25 16:22 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Luis R. Rodriguez, Filipe Manana, Paul E. McKenney, linux-doc,
	rgoldwyn, hare, corbet, torvalds, linux-kselftest, akpm,
	dan.j.williams, atomlin, rwright, xypron.glpk, mmarek,
	martin.wilck, rusty, jeffm, mingo, pmladek, linux, ebiederm,
	shuah, DSterba, keescook, gregkh, jpoimboe, acme, mbenes, neilb,
	linux-kernel, davem, jeyu, subashab

On Fri, May 19, 2017 at 02:58:29PM -0700, Dmitry Torokhov wrote:
> On Fri, May 19, 2017 at 02:45:29PM -0700, Luis R. Rodriguez wrote:
> > On May 19, 2017 1:45 PM, "Dmitry Torokhov" <dmitry.torokhov@gmail.com>
> > wrote:
> > 
> > On Thu, May 18, 2017 at 08:24:39PM -0700, Luis R. Rodriguez wrote:
> > > We currently statically limit the number of modprobe threads which
> > > we allow to run concurrently to 50. As per Keith Owens, this was a
> > > completely arbitrary value, and it was set in the 2.3.38 days [0]
> > > over 16 years ago in year 2000.
> > >
> > > Although we haven't yet hit our lower limits, experimentation [1]
> > > shows that when and if we hit this limit, the worst case will be
> > > fatal -- consider get_fs_type() failures upon mount on a system which
> > > has many partitions, some of which might even be with the same
> > > filesystem. It's best to be prudent and increase and set this
> > > value to something more sensible which ensures we're far from hitting
> > > the limit and also allows default build/user run time override.
> > >
> > > The worst case is fatal given that once a module fails to load there
> > > is a period of time during which subsequent requests for the same module
> > > will fail, so in the case of partitions it's not just one request that
> > > could fail, but a whole series of partitions. This latter issue of a
> > > module request failure domino effect can be addressed later, but
> > > increasing the limit to something more meaningful should at least give us
> > > enough cushion to avoid this for a while.
> > >
> > > Set this value up with a bit more meaningful modern limits:
> > >
> > > Bump this up to 64  max for small systems (CONFIG_BASE_SMALL)
> > > Bump this up to 128 max for larger systems (!CONFIG_BASE_SMALL)
> > >
> > > Also allow the default max limit to be further fine tuned at compile
> > > time and at initialization at run time at boot up using the kernel
> > > parameter: max_modprobes.
> > >
> > > [0] https://git.kernel.org/cgit/linux/kernel/git/history/
> > history.git/commit/?id=ab1c4ec7410f6ec64e1511d1a7d850fc99c09b44
> > > [1] https://github.com/mcgrof/test_request_module
> > 
> > If we actually run into this issue, instead of slamming the system with
> > bazillion concurrent requests, can we wait for the other modprobes to
> > finish and then continue?
> > 
> > 
> > Yes! I have a patch that does precisely that! That is actually still
> > *not enough* to avoid failing fatally, but this would be the subject of
> > another series with more debatable approaches.
> > 
> 
> Then please post it.

Will do.

> > This at least pushes us to closer safer limits for now while also making it
> > configurable.
> 
> Making it configurable depending on how big/little box is makes no
> sense,

If we set a hard limit then we need to patch a system if we need to raise
it. This is rather stupid given we have no current heuristics to make kmod
loading deterministic from userspace, and in the worst case this can be fatal.
General system size is a good first guess, but making it configurable is
really key given current limitations. I'll post further patches which reveal
some of these issues more clearly.

> especially if the above is implemented, as depth of modprobe
> invocations depends on configuration and not computing power of the
> hardware the system is running on.

You seem to agree making it configurable is sensible, but not depending on
the system size?

  Luis


* Re: [PATCH 1/6] kmod: add dynamic max concurrent thread count
  2017-05-25 16:22               ` Luis R. Rodriguez
@ 2017-05-25 16:38                 ` Dmitry Torokhov
  2017-05-25 16:50                   ` Luis R. Rodriguez
  0 siblings, 1 reply; 69+ messages in thread
From: Dmitry Torokhov @ 2017-05-25 16:38 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Filipe Manana, Paul E. McKenney, linux-doc, rgoldwyn, hare,
	corbet, torvalds, linux-kselftest, akpm, dan.j.williams, atomlin,
	rwright, xypron.glpk, mmarek, martin.wilck, rusty, jeffm, mingo,
	pmladek, linux, ebiederm, shuah, DSterba, keescook, gregkh,
	jpoimboe, acme, mbenes, neilb, linux-kernel, davem, jeyu,
	subashab

On Thu, May 25, 2017 at 06:22:01PM +0200, Luis R. Rodriguez wrote:
> On Fri, May 19, 2017 at 02:58:29PM -0700, Dmitry Torokhov wrote:
> > On Fri, May 19, 2017 at 02:45:29PM -0700, Luis R. Rodriguez wrote:
> > > On May 19, 2017 1:45 PM, "Dmitry Torokhov" <dmitry.torokhov@gmail.com>
> > > wrote:
> > > 
> > > On Thu, May 18, 2017 at 08:24:39PM -0700, Luis R. Rodriguez wrote:
> > > > We currently statically limit the number of modprobe threads which
> > > > we allow to run concurrently to 50. As per Keith Owens, this was a
> > > > completely arbitrary value, and it was set in the 2.3.38 days [0]
> > > > over 16 years ago in year 2000.
> > > >
> > > > Although we haven't yet hit our lower limits, experimentation [1]
> > > > shows that when and if we hit this limit, the worst case will be
> > > > fatal -- consider get_fs_type() failures upon mount on a system which
> > > > has many partitions, some of which might even be with the same
> > > > filesystem. It's best to be prudent and increase and set this
> > > > value to something more sensible which ensures we're far from hitting
> > > > the limit and also allows default build/user run time override.
> > > >
> > > > The worst case is fatal given that once a module fails to load there
> > > > is a period of time during which subsequent requests for the same module
> > > > will fail, so in the case of partitions it's not just one request that
> > > > could fail, but a whole series of partitions. This latter issue of a
> > > > module request failure domino effect can be addressed later, but
> > > > increasing the limit to something more meaningful should at least give us
> > > > enough cushion to avoid this for a while.
> > > >
> > > > Set this value up with a bit more meaningful modern limits:
> > > >
> > > > Bump this up to 64  max for small systems (CONFIG_BASE_SMALL)
> > > > Bump this up to 128 max for larger systems (!CONFIG_BASE_SMALL)
> > > >
> > > > Also allow the default max limit to be further fine tuned at compile
> > > > time and at initialization at run time at boot up using the kernel
> > > > parameter: max_modprobes.
> > > >
> > > > [0] https://git.kernel.org/cgit/linux/kernel/git/history/
> > > history.git/commit/?id=ab1c4ec7410f6ec64e1511d1a7d850fc99c09b44
> > > > [1] https://github.com/mcgrof/test_request_module
> > > 
> > > If we actually run into this issue, instead of slamming the system with
> > > bazillion concurrent requests, can we wait for the other modprobes to
> > > finish and then continue?
> > > 
> > > 
> > > Yes! I have a patch that does precisely that! That is actually still
> > > *not enough* to avoid failing fatally, but this would be the subject of
> > > another series with more debatable approaches.
> > > 
> > 
> > Then please post it.
> 
> Will do.
> 
> > > This at least pushes us to closer safer limits for now while also making it
> > > configurable.
> > 
> > Making it configurable depending on how big/little box is makes no
> > sense,
> 
> If we set a hard limit then we need to patch a system if we need to raise
> it. This is rather stupid given we have no current heuristics to make kmod
> loading deterministic from userspace, and in the worst case this can be fatal.
> General system size is a good first guess, but making it configurable is
> really key given current limitations. I'll post further patches which reveal
> some of these issues more clearly.
> 
> > especially if the above is implemented, as depth of modprobe
> > invocations depends on configuration and not computing power of the
> > hardware the system is running on.
> 
> You seem to agree making it configurable is sensible, but not depending on
> the system size?

No, I am saying that making it configurable based on system size makes
no sense at all, and making it configurable given you already have
patches removing hard failures gives no benefit.

Thanks.

-- 
Dmitry


* Re: [PATCH 5/6] kmod: preempt on kmod_umh_threads_get()
  2017-05-25 11:19             ` Petr Mladek
  2017-05-25 15:38               ` Luis R. Rodriguez
@ 2017-05-25 16:42               ` Dmitry Torokhov
  1 sibling, 0 replies; 69+ messages in thread
From: Dmitry Torokhov @ 2017-05-25 16:42 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Luis R. Rodriguez, shuah, jeyu, rusty, ebiederm, acme, corbet,
	martin.wilck, mmarek, hare, rwright, jeffm, DSterba, fdmanana,
	neilb, linux, rgoldwyn, subashab, xypron.glpk, keescook, atomlin,
	mbenes, paulmck, dan.j.williams, jpoimboe, davem, mingo, akpm,
	torvalds, gregkh, linux-kselftest, linux-doc, linux-kernel

On Thu, May 25, 2017 at 01:19:31PM +0200, Petr Mladek wrote:
> On Wed 2017-05-24 19:27:38, Dmitry Torokhov wrote:
> > On Thu, May 25, 2017 at 03:00:17AM +0200, Luis R. Rodriguez wrote:
> > > On Wed, May 24, 2017 at 05:45:37PM -0700, Dmitry Torokhov wrote:
> > > > On Thu, May 25, 2017 at 02:14:52AM +0200, Luis R. Rodriguez wrote:
> > > > > On Fri, May 19, 2017 at 03:27:12PM -0700, Dmitry Torokhov wrote:
> > > > > > On Thu, May 18, 2017 at 08:24:43PM -0700, Luis R. Rodriguez wrote:
> > > > > > > In theory it is possible multiple concurrent threads will try to
> > > > > > > kmod_umh_threads_get() and as such atomic_inc(&kmod_concurrent) at
> > > > > > > the same time, therefore enabling a small time during which we've
> > > > > > > bumped kmod_concurrent but have not really enabled work. By using
> > > > > > > preemption we mitigate this a bit.
> > > > > > > 
> > > > > > > Preemption is not needed when we kmod_umh_threads_put().
> > > > > > > 
> > > > > > > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > > > > > > ---
> > > > > > >  kernel/kmod.c | 24 ++++++++++++++++++++++--
> > > > > > >  1 file changed, 22 insertions(+), 2 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > > > > > > index 563600fc9bb1..7ea11dbc7564 100644
> > > > > > > --- a/kernel/kmod.c
> > > > > > > +++ b/kernel/kmod.c
> > > > > > > @@ -113,15 +113,35 @@ static int call_modprobe(char *module_name, int wait)
> > > > > > >  
> > > > > > >  static int kmod_umh_threads_get(void)
> > > > > > >  {
> > > > > > > +	int ret = 0;
> > > > > > > +
> > > > > > > +	/*
> > > > > > > +	 * Disabling preemption makes sure that we are not rescheduled here
> > > > > > > +	 *
> > > > > > > > +	 * Disabling preemption also keeps kmod_concurrent from staying bumped
> > > > > > > > +	 * for too long, given in theory two concurrent threads could race on
> > > > > > > > +	 * atomic_inc() before we atomic_read() -- we know that's possible
> > > > > > > > +	 * but we don't care, this is not used for object accounting and
> > > > > > > > +	 * is just a subjective threshold. The alternative is a lock.
> > > > > > > +	 */
> > > > > > > +	preempt_disable();
> > > > > > >  	atomic_inc(&kmod_concurrent);
> > > > > > >  	if (atomic_read(&kmod_concurrent) <= max_modprobes)
> > > > > > 
> > > > > > That is very "fancy" way to basically say:
> > > > > > 
> > > > > > 	if (atomic_inc_return(&kmod_concurrent) <= max_modprobes)
> > > > > 
> > > > > Do you mean to combine the atomic_inc() and atomic_read() in one as you noted
> > > > > (as that is not a change in this patch), *or* that using a memory barrier here
> > > > > with atomic_inc_return() should suffice to address the same and avoid an
> > > > > explicit preemption  enable / disable ?
> > > > 
> > > > I am saying that atomic_inc_return() will avoid the situation where you
> > > > have more than one thread incrementing the counter and each believing
> > > > that it is [not] allowed to start modprobe.
> > > > 
> > > > I have no idea why you think preempt_disable() would help here. It only
> > > > ensures that current thread will not be preempted between the point
> > > > where you update the counter and where you check the result. It does not
> > > > stop interrupts nor does it affect other threads that might be updating
> > > > the same counter.
> > > 
> > > The preemption was inspired by __module_get() and try_module_get(), was that
> > > rather silly ?
> > 
> > As far as I can see preempt_disable() was needed in __module_get() when
> > modules used per-cpu refcounts: you did not want to move away from the CPU
> > while manipulating the refcount.
> > 
> > Now that modules use simple atomics for refcounting I think these
> > preempt_disable() and preempt_enable() can be removed.
> 
> preempt_disable() still might be useful because you do the
> atomic_dec() when you reach the limit.

No, not really, because even if you prevent the process from migrating to
another CPU, that will not help with another thread modifying the counter.

> 
> In other words, you have three operations that should be atomic:
> inc, read, and dec. atomic_inc_return() covers only two of them.
> 
> Hmm, a solution might be to use atomic_dec_if_positive().
> I would rename kmod_concurrent to something like kmod_concurrent_available
> and initialize it with the maximum allowed number. Then you could do:
> 
> static int kmod_umh_threads_get(void)
> {
> 	if (atomic_dec_if_positive(&kmod_concurrent_available) < 0)
> 		return -EBUSY;
> 	return 0;
> }

Yes, this looks like the optimal solution. And we won't need the
kmod_umh_threads_get() wrapper anymore, I think.

-- 
Dmitry
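
The point about inc, read and dec needing to be one atomic step can be made
concrete by replaying one unlucky interleaving on a single thread. This is a
hedged userspace sketch, not the patch itself: it assumes C11 <stdatomic.h>
instead of the kernel's atomic_t, MAX_MODPROBES is a made-up limit, and
get_begin(), undo() and replay_spurious_reject() are hypothetical helper
names chosen for this illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical limit for illustration only. */
#define MAX_MODPROBES 2

/* Counts up, as in the inc/dec scheme under discussion. */
static atomic_int kmod_concurrent;

/*
 * First half of a "get": bump the counter and check the value this
 * caller produced. atomic_fetch_add() returns the old value, so adding
 * one mimics the kernel's atomic_inc_return().
 */
static bool get_begin(void)
{
	return atomic_fetch_add(&kmod_concurrent, 1) + 1 <= MAX_MODPROBES;
}

/* Second half of a failed "get", and also what a "put" does. */
static void undo(void)
{
	atomic_fetch_sub(&kmod_concurrent, 1);
}

/*
 * Replay one interleaving step by step. Returns whether caller B was
 * admitted.
 */
static bool replay_spurious_reject(void)
{
	bool a, b;

	atomic_store(&kmod_concurrent, MAX_MODPROBES); /* limit reached */
	a = get_begin(); /* A bumps to 3, is over the limit, has not undone yet */
	undo();          /* a running modprobe finishes: counter back to 2 */
	b = get_begin(); /* B bumps to 3 and is rejected, yet a slot is free */
	undo();          /* A undoes its bump */
	undo();          /* B undoes its bump */

	(void)a;
	return b;
}
```

In this replay B is turned away even though only one modprobe is still
running, because A's temporary bump was visible in between. A counter that is
only decremented when the result stays non-negative never overshoots, so a
rejected caller leaves no trace for others to trip over.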


* Re: [PATCH 1/6] kmod: add dynamic max concurrent thread count
  2017-05-25 16:38                 ` Dmitry Torokhov
@ 2017-05-25 16:50                   ` Luis R. Rodriguez
  2017-05-25 17:30                     ` Dmitry Torokhov
  0 siblings, 1 reply; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-25 16:50 UTC (permalink / raw)
  To: Dmitry Torokhov, Tom Gundersen
  Cc: Filipe Manana, Paul E. McKenney, linux-doc, rgoldwyn, hare,
	Jonathan Corbet, Linus Torvalds, linux-kselftest, Andrew Morton,
	Dan Williams, Aaron Tomlin, rwright, Heinrich Schuchardt,
	Michal Marek, martin.wilck, Rusty Russell, Jeff Mahoney,
	Ingo Molnar, Petr Mladek, Guenter Roeck, Eric W. Biederman,
	shuah, DSterba, Kees Cook, Greg Kroah-Hartman, Josh Poimboeuf,
	Arnaldo Carvalho de Melo, Miroslav Benes, NeilBrown,
	linux-kernel, David Miller, Jessica Yu,
	Subash Abhinov Kasiviswanathan

On Thu, May 25, 2017 at 9:38 AM, Dmitry Torokhov
<dmitry.torokhov@gmail.com> wrote:
> On Thu, May 25, 2017 at 06:22:01PM +0200, Luis R. Rodriguez wrote:
>> On Fri, May 19, 2017 at 02:58:29PM -0700, Dmitry Torokhov wrote:
>> > On Fri, May 19, 2017 at 02:45:29PM -0700, Luis R. Rodriguez wrote:
>> > > On May 19, 2017 1:45 PM, "Dmitry Torokhov" <dmitry.torokhov@gmail.com>
>> > > wrote:
>> > >
>> > > On Thu, May 18, 2017 at 08:24:39PM -0700, Luis R. Rodriguez wrote:
>> > > > We currently statically limit the number of modprobe threads which
>> > > > we allow to run concurrently to 50. As per Keith Owens, this was a
>> > > > completely arbitrary value, and it was set in the 2.3.38 days [0]
>> > > > over 16 years ago in year 2000.
>> > > >
>> > > > Although we haven't yet hit our lower limits, experimentation [1]
>> > > > shows that when and if we hit this limit, the worst case will be
>> > > > fatal -- consider get_fs_type() failures upon mount on a system which
>> > > > has many partitions, some of which might even be with the same
>> > > > filesystem. It's best to be prudent and increase and set this
>> > > > value to something more sensible which ensures we're far from hitting
>> > > > the limit and also allows default build/user run time override.
>> > > >
>> > > > The worst case is fatal given that once a module fails to load there
>> > > > is a period of time during which subsequent requests for the same module
>> > > > will fail, so in the case of partitions it's not just one request that
>> > > > could fail, but a whole series of partitions. This latter issue of a
>> > > > module request failure domino effect can be addressed later, but
>> > > > increasing the limit to something more meaningful should at least give us
>> > > > enough cushion to avoid this for a while.
>> > > >
>> > > > Set this value up with a bit more meaningful modern limits:
>> > > >
>> > > > Bump this up to 64  max for small systems (CONFIG_BASE_SMALL)
>> > > > Bump this up to 128 max for larger systems (!CONFIG_BASE_SMALL)
>> > > >
>> > > > Also allow the default max limit to be further fine tuned at compile
>> > > > time and at initialization at run time at boot up using the kernel
>> > > > parameter: max_modprobes.
>> > > >
>> > > > [0] https://git.kernel.org/cgit/linux/kernel/git/history/
>> > > history.git/commit/?id=ab1c4ec7410f6ec64e1511d1a7d850fc99c09b44
>> > > > [1] https://github.com/mcgrof/test_request_module
>> > >
>> > > If we actually run into this issue, instead of slamming the system with
>> > > bazillion concurrent requests, can we wait for the other modprobes to
>> > > finish and then continue?
>> > >
>> > >
>> > > Yes! I have a patch that does precisely that! That is actually still
>> > > *not enough* to avoid failing fatally, but this would be the subject of
>> > > another series with more debatable approaches.
>> > >
>> >
>> > Then please post it.
>>
>> Will do.
>>
>> > > This at least pushes us to closer safer limits for now while also making it
>> > > configurable.
>> >
>> > Making it configurable depending on how big/little box is makes no
>> > sense,
>>
>> If we set a hard limit then we need to patch a system if we need to raise
>> it. This is rather stupid given we have no current heuristics to make kmod
>> loading deterministic from userspace, and in the worst case this can be fatal.
>> General system size is a good first guess, but making it configurable is
>> really key given current limitations. I'll post further patches which reveal
>> some of these issues more clearly.
>>
>> > especially if the above is implemented, as depth of modprobe
>> > invocations depends on configuration and not computing power of the
>> > hardware the system is running on.
>>
>> You seem to agree making it configurable is sensible, but not depending on
>> the system size?
>
> No, I am saying that making it configurable based on system size makes
> no sense at all, and making it configurable given you already have
> patches removing hard failures gives no benefit.

Ah no, the problem is that hard failures are not yet removed in this
patch set at all! This series only contains the things I thought were
non-radical really.

In fact even with the subsequent patches from my 2nd series, which I'll
eventually post, these fatal issues are not cured at all unless
we dance with userspace a bit, or unless as you suggest we have *all*
pending threads wait without killing any.

*This* small patch should enable folks to move the needle to a more
*fair* limit, it's also useful to backport a simple fix even if the
other stuff is not merged, *but* it *also* provides a way for systems
to move away from the slippery slope if they know what they are doing.

I'll post the 2nd series next.

 Luis


* Re: [PATCH 1/6] kmod: add dynamic max concurrent thread count
  2017-05-25 16:50                   ` Luis R. Rodriguez
@ 2017-05-25 17:30                     ` Dmitry Torokhov
  2017-05-25 17:38                       ` Luis R. Rodriguez
  0 siblings, 1 reply; 69+ messages in thread
From: Dmitry Torokhov @ 2017-05-25 17:30 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Tom Gundersen, Filipe Manana, Paul E. McKenney, linux-doc,
	rgoldwyn, hare, Jonathan Corbet, Linus Torvalds, linux-kselftest,
	Andrew Morton, Dan Williams, Aaron Tomlin, rwright,
	Heinrich Schuchardt, Michal Marek, martin.wilck, Rusty Russell,
	Jeff Mahoney, Ingo Molnar, Petr Mladek, Guenter Roeck,
	Eric W. Biederman, shuah, DSterba, Kees Cook, Greg Kroah-Hartman,
	Josh Poimboeuf, Arnaldo Carvalho de Melo, Miroslav Benes,
	NeilBrown, linux-kernel, David Miller, Jessica Yu,
	Subash Abhinov Kasiviswanathan

On Thu, May 25, 2017 at 09:50:15AM -0700, Luis R. Rodriguez wrote:
> On Thu, May 25, 2017 at 9:38 AM, Dmitry Torokhov
> <dmitry.torokhov@gmail.com> wrote:
> > On Thu, May 25, 2017 at 06:22:01PM +0200, Luis R. Rodriguez wrote:
> >> On Fri, May 19, 2017 at 02:58:29PM -0700, Dmitry Torokhov wrote:
> >> > On Fri, May 19, 2017 at 02:45:29PM -0700, Luis R. Rodriguez wrote:
> >> > > On May 19, 2017 1:45 PM, "Dmitry Torokhov" <dmitry.torokhov@gmail.com>
> >> > > wrote:
> >> > >
> >> > > On Thu, May 18, 2017 at 08:24:39PM -0700, Luis R. Rodriguez wrote:
> >> > > > We currently statically limit the number of modprobe threads which
> >> > > > we allow to run concurrently to 50. As per Keith Owens, this was a
> >> > > > completely arbitrary value, and it was set in the 2.3.38 days [0]
> >> > > > over 16 years ago in year 2000.
> >> > > >
> >> > > > Although we haven't yet hit our lower limits, experimentation [1]
> >> > > > shows that when and if we hit this limit, the worst case will be
> >> > > > fatal -- consider get_fs_type() failures upon mount on a system which
> >> > > > has many partitions, some of which might even be with the same
> >> > > > filesystem. It's best to be prudent and increase and set this
> >> > > > value to something more sensible which ensures we're far from hitting
> >> > > > the limit and also allows default build/user run time override.
> >> > > >
> >> > > > The worst case is fatal given that once a module fails to load there
> >> > > > is a period of time during which subsequent requests for the same module
> >> > > > will fail, so in the case of partitions it's not just one request that
> >> > > > could fail, but a whole series of partitions. This latter issue of a
> >> > > > module request failure domino effect can be addressed later, but
> >> > > > increasing the limit to something more meaningful should at least give us
> >> > > > enough cushion to avoid this for a while.
> >> > > >
> >> > > > Set this value up with a bit more meaningful modern limits:
> >> > > >
> >> > > > Bump this up to 64  max for small systems (CONFIG_BASE_SMALL)
> >> > > > Bump this up to 128 max for larger systems (!CONFIG_BASE_SMALL)
> >> > > >
> >> > > > Also allow the default max limit to be further fine tuned at compile
> >> > > > time and at initialization at run time at boot up using the kernel
> >> > > > parameter: max_modprobes.
> >> > > >
> >> > > > [0] https://git.kernel.org/cgit/linux/kernel/git/history/
> >> > > history.git/commit/?id=ab1c4ec7410f6ec64e1511d1a7d850fc99c09b44
> >> > > > [1] https://github.com/mcgrof/test_request_module
> >> > >
> >> > > If we actually run into this issue, instead of slamming the system with
> >> > > bazillion concurrent requests, can we wait for the other modprobes to
> >> > > finish and then continue?
> >> > >
> >> > >
> >> > > Yes! I have a patch that does precisely that! That is actually still
> >> > > *not enough* to avoid failing fatally, but this would be the subject of
> >> > > another series with more debatable approaches.
> >> > >
> >> >
> >> > Then please post it.
> >>
> >> Will do.
> >>
> >> > > This at least pushes us to closer safer limits for now while also making it
> >> > > configurable.
> >> >
> >> > Making it configurable depending on how big/little box is makes no
> >> > sense,
> >>
> >> If we set a hard limit then we need to patch a system if we need to raise
> >> it. This is rather stupid given we have no current heuristics to make kmod
> >> loading deterministic from userspace, and in the worst case this can be fatal.
> >> General system size is a good first guess, but making it configurable is
> >> really key given current limitations. I'll post further patches which reveal
> >> some of these issues more clearly.
> >>
> >> > especially if the above is implemented, as depth of modprobe
> >> > invocations depends on configuration and not computing power of the
> >> > hardware the system is running on.
> >>
> >> You seem to agree making it configurable is sensible, but not depending on
> >> the system size?
> >
> > No, I am saying that making it configurable based on system size makes
> > no sense at all, and making it configurable given you already have
> > patches removing hard failures gives no benefit.
> 
> Ah no, the problem is that hard failures are not yet removed in this
> patch set at all! This series only contains the things I thought were
> non-radical really.

I know they are not removed in this patch set.

> 
> In fact even with the subsequent patches from my 2nd series, which I'll
> eventually post, these fatal issues are not cured at all unless
> we dance with userspace a bit, or unless as you suggest we have *all*
> pending threads wait without killing any.

Well, that is too bad, I understood that you had already implemented what I
suggested.

> 
> *This* small patch should enable folks to move the needle to a more
> *fair* limit, it's also useful to backport a simple fix even if the
> other stuff is not merged, *but* it *also* provides a way for systems
> to move away from the slippery slope if they know what they are doing.

Look, you are trying to push a band-aid solution for a problem that is
purely theoretical (as you say in your patch description, we are not
hitting this problem in practice, only your test does). There is
no slippery slope for systems to move away from, and no need to backport
anything. We seem to agree that a better solution is possible (throttle the
number of concurrently running modprobes without killing requesters),
and with that solution the band-aid will no longer be needed.

So please implement and post the proper fix for the issue.

Thanks.

-- 
Dmitry
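
Throttling without killing requesters, as suggested above, can be sketched in
userspace with a mutex and a condition variable. This is only an illustration
under stated assumptions and not the patch being asked for: it uses POSIX
threads rather than kernel primitives (in the kernel a wait queue would
presumably play this role), and MAX_MODPROBES, NR_REQUESTS, throttle_get(),
throttle_put(), request() and run_requests() are all hypothetical names.

```c
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical numbers for illustration only. */
#define MAX_MODPROBES 2
#define NR_REQUESTS   8

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t slot_freed = PTHREAD_COND_INITIALIZER;
static int available = MAX_MODPROBES;   /* free slots, protected by 'lock' */
static int running, peak, completed;    /* bookkeeping, protected by 'lock' */

/* Sleep until a slot is free, then take it. Never fails. */
static void throttle_get(void)
{
	pthread_mutex_lock(&lock);
	while (available == 0)
		pthread_cond_wait(&slot_freed, &lock);
	available--;
	pthread_mutex_unlock(&lock);
}

/* Return the slot and wake one waiter, if any. */
static void throttle_put(void)
{
	pthread_mutex_lock(&lock);
	available++;
	pthread_cond_signal(&slot_freed);
	pthread_mutex_unlock(&lock);
}

/* One requester: wait for a slot, "run modprobe", release the slot. */
static void *request(void *unused)
{
	(void)unused;
	throttle_get();

	pthread_mutex_lock(&lock);
	if (++running > peak)
		peak = running;
	pthread_mutex_unlock(&lock);

	sched_yield(); /* placeholder for the real work */

	pthread_mutex_lock(&lock);
	running--;
	completed++;
	pthread_mutex_unlock(&lock);

	throttle_put();
	return NULL;
}

/* Start NR_REQUESTS requesters, wait for them, return how many started. */
static int run_requests(void)
{
	pthread_t tid[NR_REQUESTS];
	int i, started = 0;

	for (i = 0; i < NR_REQUESTS; i++)
		if (pthread_create(&tid[started], NULL, request, NULL) == 0)
			started++;
	for (i = 0; i < started; i++)
		pthread_join(tid[i], NULL);

	return started;
}
```

Every requester that starts also completes, and the number running at once
never exceeds MAX_MODPROBES, so no request is failed just because the limit
was reached. The trade-off is that callers may sleep, which only works for
requesters that are allowed to block.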


* Re: [PATCH 1/6] kmod: add dynamic max concurrent thread count
  2017-05-25 17:30                     ` Dmitry Torokhov
@ 2017-05-25 17:38                       ` Luis R. Rodriguez
  2017-05-25 18:06                         ` Luis R. Rodriguez
  0 siblings, 1 reply; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-25 17:38 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Tom Gundersen, Filipe Manana, Paul E. McKenney, linux-doc,
	rgoldwyn, hare, Jonathan Corbet, Linus Torvalds, linux-kselftest,
	Andrew Morton, Dan Williams, Aaron Tomlin, rwright,
	Heinrich Schuchardt, Michal Marek, martin.wilck, Rusty Russell,
	Jeff Mahoney, Ingo Molnar, Petr Mladek, Guenter Roeck,
	Eric W. Biederman, shuah, DSterba, Kees Cook, Greg Kroah-Hartman,
	Josh Poimboeuf, Arnaldo Carvalho de Melo, Miroslav Benes,
	NeilBrown, linux-kernel, David Miller, Jessica Yu,
	Subash Abhinov Kasiviswanathan, Luis R. Rodriguez

On Thu, May 25, 2017 at 10:30 AM, Dmitry Torokhov
<dmitry.torokhov@gmail.com> wrote:
> On Thu, May 25, 2017 at 09:50:15AM -0700, Luis R. Rodriguez wrote:
>> On Thu, May 25, 2017 at 9:38 AM, Dmitry Torokhov
>> <dmitry.torokhov@gmail.com> wrote:
>> > On Thu, May 25, 2017 at 06:22:01PM +0200, Luis R. Rodriguez wrote:
>> >> On Fri, May 19, 2017 at 02:58:29PM -0700, Dmitry Torokhov wrote:
>> >> > On Fri, May 19, 2017 at 02:45:29PM -0700, Luis R. Rodriguez wrote:
>> >> > > On May 19, 2017 1:45 PM, "Dmitry Torokhov" <dmitry.torokhov@gmail.com>
>> >> > > wrote:
>> >> > >
>> >> > > On Thu, May 18, 2017 at 08:24:39PM -0700, Luis R. Rodriguez wrote:
>> >> > > > We currently statically limit the number of modprobe threads which
>> >> > > > we allow to run concurrently to 50. As per Keith Owens, this was a
>> >> > > > completely arbitrary value, and it was set in the 2.3.38 days [0]
>> >> > > > over 16 years ago in year 2000.
>> >> > > >
>> >> > > > Although we haven't yet hit our lower limits, experimentation [1]
>> >> > > > shows that when and if we hit this limit, the worst case will be
>> >> > > > fatal -- consider get_fs_type() failures upon mount on a system which
>> >> > > > has many partitions, some of which might even be with the same
>> >> > > > filesystem. It's best to be prudent and increase and set this
>> >> > > > value to something more sensible which ensures we're far from hitting
>> >> > > > the limit and also allows default build/user run time override.
>> >> > > >
>> >> > > > The worst case is fatal given that once a module fails to load there
>> >> > > > is a period of time during which subsequent requests for the same module
>> >> > > > will fail, so in the case of partitions it's not just one request that
>> >> > > > could fail, but a whole series of partitions. This latter issue of a
>> >> > > > module request failure domino effect can be addressed later, but
>> >> > > > increasing the limit to something more meaningful should at least give us
>> >> > > > enough cushion to avoid this for a while.
>> >> > > >
>> >> > > > Set this value up with a bit more meaningful modern limits:
>> >> > > >
>> >> > > > Bump this up to 64  max for small systems (CONFIG_BASE_SMALL)
>> >> > > > Bump this up to 128 max for larger systems (!CONFIG_BASE_SMALL)
>> >> > > >
>> >> > > > Also allow the default max limit to be further fine tuned at compile
>> >> > > > time and at initialization at run time at boot up using the kernel
>> >> > > > parameter: max_modprobes.
>> >> > > >
>> >> > > > [0] https://git.kernel.org/cgit/linux/kernel/git/history/
>> >> > > history.git/commit/?id=ab1c4ec7410f6ec64e1511d1a7d850fc99c09b44
>> >> > > > [1] https://github.com/mcgrof/test_request_module
>> >> > >
>> >> > > If we actually run into this issue, instead of slamming the system with
>> >> > > bazillion concurrent requests, can we wait for the other modprobes to
>> >> > > finish and then continue?
>> >> > >
>> >> > >
>> >> > > Yes ! That I have a patch that does precisely that ! That is actually still
>> >> > > *not enough* to not fail fatally but this would be subject of another
>> >> > > series with more debatable approaches.
>> >> > >
>> >> >
>> >> > Then please post it.
>> >>
>> >> Will do.
>> >>
>> >> > > This at least pushes us to closer safer limits for now while also making it
>> >> > > configurable.
>> >> >
>> >> > Making it configurable depending on how big/little box is makes no
>> >> > sense,
>> >>
>> >> If we set a hard limit then we need to patch a system if we need to increment
>> >> it. This is rather stupid given we have no current heuristics to make kmod
>> >> loading deterministic from userspace, and in the worst case this can be fatal.
>> >> General system size is a good first guess, but making it configurable is
>> >> really key given current limitations. I'll post further patches which reveals
>> >> some of these issues more clearly.
>> >>
>> >> > especially if the above is implemented, as depth of modprobe
>> >> > invocations depends on configuration and not computing power of the
>> >> > hardware the system is running on.
>> >>
>> >> You seem to agree making it configurable is sensible , but not depending on
>> >> the system size ?
>> >
>> > No, I am saying that making it configurable based on system size makes
>> > no sense at all, and making it configurable given you already have
>> > patches removing hard failures gives no benefit.
>>
>> Ah no, the problem is that hard failures are not yet removed in this
>> patch set at all! This series only contains the things I thought were
>> non-radical really.
>
> I know they are not removed in this patch set.
>
>>
>> In fact even with the subsequent patches from my 2nd series I'll
>> eventually post post -- these fatal issues are not cured at all unless
>> we dance with userspace a bit, or unless as you suggest we have *all*
>> pending theads wait without killing any.
>
> Well, that is too bad, I understood you already implemented what I
> suggested.

We seem to want the same so that is good actually.

>> *This* small patch should enable folks to move the needle to a more
>> *fair* limit, its also useful to backport a simple fix even if the
>> other stuff is not merged, *but* it *also* provides a way for systems
>> to move away from the slippery slope if they know what they are doing.
>
> Look, you are trying to push a band-aid solution for a problem that is
> purely theoretical (as you say in your patch description we are not
> hitting this problem in practice, only your test does).

I have a stress test driver which reveals we can easily hit it. In
practice the only way to know if we hit the limit is a
"request_module: runaway loop modprobe %s" message in dmesg. Hitting it
is fatal, though, and I'm not sure how often people inspect the kernel
log to see if that came up. So a module could fail to load and we may
not realize it.

> There is
> no slippery slope for systems to move away, no need to backport
> anything. We seem to agree that a better solution is possible (throttle
> number of concurrently running modprobes without killing requesters),
> and with that solution the band-aid will no longer be needed.
>
> So please implement and post the proper fix for the issue.

Alright, will do away with this patch and just go for the jugular of the issue.

 Luis


* Re: [PATCH 1/6] kmod: add dynamic max concurrent thread count
  2017-05-25 17:38                       ` Luis R. Rodriguez
@ 2017-05-25 18:06                         ` Luis R. Rodriguez
  2017-05-25 18:26                           ` Dmitry Torokhov
  0 siblings, 1 reply; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-25 18:06 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Dmitry Torokhov, Tom Gundersen, Filipe Manana, Paul E. McKenney,
	linux-doc, rgoldwyn, hare, Jonathan Corbet, Linus Torvalds,
	linux-kselftest, Andrew Morton, Dan Williams, Aaron Tomlin,
	rwright, Heinrich Schuchardt, Michal Marek, martin.wilck,
	Rusty Russell, Jeff Mahoney, Ingo Molnar, Petr Mladek,
	Guenter Roeck, Eric W. Biederman, shuah, DSterba, Kees Cook,
	Greg Kroah-Hartman, Josh Poimboeuf, Arnaldo Carvalho de Melo,
	Miroslav Benes, NeilBrown, linux-kernel, David Miller,
	Jessica Yu, Subash Abhinov Kasiviswanathan

On Thu, May 25, 2017 at 10:38:40AM -0700, Luis R. Rodriguez wrote:
> On Thu, May 25, 2017 at 10:30 AM, Dmitry Torokhov
> > There is
> > no slippery slope for systems to move away, no need to backport
> > anything. We seem to agree that a better solution is possible (throttle
> > number of concurrently running modprobes without killing requesters),
> > and with that solution the band-aid will no longer be needed.
> >
> > So please implement and post the proper fix for the issue.
> 
> Alright, will do away with this patch and just go for the jugular of the issue.

I gave this some more thought. Even if we go with the throttling right away, in
practice you'll end up with a dmesg notice of a throttle kicking in once you *do*
reach this. We are forcing only 50 concurrent threads and making this a static
limit with no better reason than a 2.3.38 days evaluation from 16 years ago (2000).
If we throttle we are going to throttle with a 2.3.38 days limit. And you
advocate that?

  Luis


* Re: [PATCH 1/6] kmod: add dynamic max concurrent thread count
  2017-05-25 18:06                         ` Luis R. Rodriguez
@ 2017-05-25 18:26                           ` Dmitry Torokhov
  2017-05-25 19:01                             ` Luis R. Rodriguez
  0 siblings, 1 reply; 69+ messages in thread
From: Dmitry Torokhov @ 2017-05-25 18:26 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Tom Gundersen, Filipe Manana, Paul E. McKenney, linux-doc,
	rgoldwyn, hare, Jonathan Corbet, Linus Torvalds, linux-kselftest,
	Andrew Morton, Dan Williams, Aaron Tomlin, rwright,
	Heinrich Schuchardt, Michal Marek, martin.wilck, Rusty Russell,
	Jeff Mahoney, Ingo Molnar, Petr Mladek, Guenter Roeck,
	Eric W. Biederman, shuah, DSterba, Kees Cook, Greg Kroah-Hartman,
	Josh Poimboeuf, Arnaldo Carvalho de Melo, Miroslav Benes,
	NeilBrown, linux-kernel, David Miller, Jessica Yu,
	Subash Abhinov Kasiviswanathan

On Thu, May 25, 2017 at 08:06:03PM +0200, Luis R. Rodriguez wrote:
> On Thu, May 25, 2017 at 10:38:40AM -0700, Luis R. Rodriguez wrote:
> > On Thu, May 25, 2017 at 10:30 AM, Dmitry Torokhov
> > > There is
> > > no slippery slope for systems to move away, no need to backport
> > > anything. We seem to agree that a better solution is possible (throttle
> > > number of concurrently running modprobes without killing requesters),
> > > and with that solution the band-aid will no longer be needed.
> > >
> > > So please implement and post the proper fix for the issue.
> > 
> > Alright, will do away with this patch and just go for the jugular of the issue.
> 
> I gave this some more thought, even if we go with the throttling right away in
> practice you'll end up with a dmesg notice of a throttle kicking in once you *do*

So remove it. The warning was meaningful when we rejected requests, now
it is not.

> reach this. We are forcing only 50 concurrent threads and making this a static
> limit with no good reason than 2.3.38 days evaluation from 16 years ago (2000).
> If we throttle we are going to throttle with a 2.3.38 days limit. And you
> advocate that ?

Yes. Can you give me a reason why slamming the system with more than 50
modprobes is a good idea in 4.12 days? Does the increased limit
decrease boot time? By how much?

-- 
Dmitry


* Re: [PATCH 1/6] kmod: add dynamic max concurrent thread count
  2017-05-25 18:26                           ` Dmitry Torokhov
@ 2017-05-25 19:01                             ` Luis R. Rodriguez
  2017-05-25 21:38                               ` Luis R. Rodriguez
  0 siblings, 1 reply; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-25 19:01 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Tom Gundersen, Filipe Manana, Paul E. McKenney, linux-doc,
	rgoldwyn, hare, Jonathan Corbet, Linus Torvalds, linux-kselftest,
	Andrew Morton, Dan Williams, Aaron Tomlin, rwright,
	Heinrich Schuchardt, Michal Marek, martin.wilck, Rusty Russell,
	Jeff Mahoney, Ingo Molnar, Petr Mladek, Guenter Roeck,
	Eric W. Biederman, shuah, DSterba, Kees Cook, Greg Kroah-Hartman,
	Josh Poimboeuf, Arnaldo Carvalho de Melo, Miroslav Benes,
	NeilBrown, linux-kernel, David Miller, Jessica Yu,
	Subash Abhinov Kasiviswanathan

On Thu, May 25, 2017 at 11:26 AM, Dmitry Torokhov
<dmitry.torokhov@gmail.com> wrote:
> On Thu, May 25, 2017 at 08:06:03PM +0200, Luis R. Rodriguez wrote:
>> On Thu, May 25, 2017 at 10:38:40AM -0700, Luis R. Rodriguez wrote:
>> > On Thu, May 25, 2017 at 10:30 AM, Dmitry Torokhov
>> > > There is
>> > > no slippery slope for systems to move away, no need to backport
>> > > anything. We seem to agree that a better solution is possible (throttle
>> > > number of concurrently running modprobes without killing requesters),
>> > > and with that solution the band-aid will no longer be needed.
>> > >
>> > > So please implement and post the proper fix for the issue.
>> >
>> > Alright, will do away with this patch and just go for the jugular of the issue.
>>
>> I gave this some more thought, even if we go with the throttling right away in
>> practice you'll end up with a dmesg notice of a throttle kicking in once you *do*
>
> So remove it. The warning was meaningful when we rejected requests, now
> it is not.

Great.

>> reach this. We are forcing only 50 concurrent threads and making this a static
>> limit with no good reason than 2.3.38 days evaluation from 16 years ago (2000).
>> If we throttle we are going to throttle with a 2.3.38 days limit. And you
>> advocate that ?
>
> Yes. Can you give me reason why slamming the system with more than 50
> modprobes is a good idea in 4.12 days? Does the increased limit
> decreases boot time? By how much?

If in practice we are not hitting the limit the point is moot, and
when we do I agree we can re-evaluate. With my stress test driver on a
test case we can push as hard as bringing out the OOM killer even if
we throttle, fun.

  Luis


* Re: [PATCH 1/6] kmod: add dynamic max concurrent thread count
  2017-05-25 19:01                             ` Luis R. Rodriguez
@ 2017-05-25 21:38                               ` Luis R. Rodriguez
  0 siblings, 0 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-25 21:38 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Tom Gundersen, Filipe Manana, Paul E. McKenney, linux-doc,
	rgoldwyn, hare, Jonathan Corbet, Linus Torvalds, linux-kselftest,
	Andrew Morton, Dan Williams, Aaron Tomlin, rwright,
	Heinrich Schuchardt, Michal Marek, martin.wilck, Rusty Russell,
	Jeff Mahoney, Ingo Molnar, Petr Mladek, Guenter Roeck,
	Eric W. Biederman, shuah, DSterba, Kees Cook, Greg Kroah-Hartman,
	Josh Poimboeuf, Arnaldo Carvalho de Melo, Miroslav Benes,
	NeilBrown, linux-kernel, David Miller, Jessica Yu,
	Subash Abhinov Kasiviswanathan

On Thu, May 25, 2017 at 12:01 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Thu, May 25, 2017 at 11:26 AM, Dmitry Torokhov
> <dmitry.torokhov@gmail.com> wrote:
>> On Thu, May 25, 2017 at 08:06:03PM +0200, Luis R. Rodriguez wrote:
>>> On Thu, May 25, 2017 at 10:38:40AM -0700, Luis R. Rodriguez wrote:
>>> > On Thu, May 25, 2017 at 10:30 AM, Dmitry Torokhov
>>> > > There is
>>> > > no slippery slope for systems to move away, no need to backport
>>> > > anything. We seem to agree that a better solution is possible (throttle
>>> > > number of concurrently running modprobes without killing requesters),
>>> > > and with that solution the band-aid will no longer be needed.
>>> > >
>>> > > So please implement and post the proper fix for the issue.
>>> >
>>> > Alright, will do away with this patch and just go for the jugular of the issue.
>>>
>>> I gave this some more thought, even if we go with the throttling right away in
>>> practice you'll end up with a dmesg notice of a throttle kicking in once you *do*
>>
>> So remove it. The warning was meaningful when we rejected requests, now
>> it is not.
>
> Great.
>
>>> reach this. We are forcing only 50 concurrent threads and making this a static
>>> limit with no good reason than 2.3.38 days evaluation from 16 years ago (2000).
>>> If we throttle we are going to throttle with a 2.3.38 days limit. And you
>>> advocate that ?
>>
>> Yes. Can you give me reason why slamming the system with more than 50
>> modprobes is a good idea in 4.12 days? Does the increased limit
>> decreases boot time? By how much?
>
> If in practice we are not hitting the limit the point is moot, and
> when we do I agree we can re-evaluate. With my stress test driver on a
> test case we can push as hard as bringing out the OOM killer even if
> we throttle, fun.

Alright, I don't see these OOMs anymore *after* I actually nuked that
patch which incremented the kmod limit. The reason we can OOM is that
finit_module() can consume gobs of memory. The current value then
seems fair, and works well for my tests provided we do use the proper
throttle. Will respin.

  Luis


* [PATCH v2 0/5] kmod: help make deterministic
  2017-05-19  3:24 [PATCH 0/6] kmod: few simple enhancements Luis R. Rodriguez
                   ` (5 preceding siblings ...)
  2017-05-19  3:24 ` [PATCH 6/6] kmod: use simplified rate limit printk Luis R. Rodriguez
@ 2017-05-26  0:16 ` Luis R. Rodriguez
  2017-05-26  0:16   ` [PATCH v2 1/5] module: use list_for_each_entry_rcu() on find_module_all() Luis R. Rodriguez
                     ` (5 more replies)
  6 siblings, 6 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-26  0:16 UTC (permalink / raw)
  To: akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel, Luis R. Rodriguez

In this v2 I'm following Dmitry Torokhov's recommendation from the v1 series
[0]: I stop dancing around with workarounds and go straight for what
I think should be a proper fix for kmod: throttle instead of failing when the
kmod max concurrent threshold is reached.
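
To make "throttle instead of failing" concrete, here is a rough userspace
sketch using pthreads. It is not the code from this series: the struct and
helper names (kmod_slots, kmod_slot_get(), kmod_slot_put()) are made up for
illustration, and the kernel would use its own waiting primitives rather
than a mutex and condition variable. The point is only that a caller which
finds no free slot sleeps until one is returned, instead of getting an
error back.

```c
#include <stddef.h>
#include <pthread.h>

/* Hypothetical slot pool; not a structure from kernel/kmod.c. */
struct kmod_slots {
	pthread_mutex_t lock;
	pthread_cond_t  freed;	/* signalled whenever a slot is returned */
	unsigned int    avail;	/* slots left before callers must wait */
};

static void kmod_slots_init(struct kmod_slots *s, unsigned int max)
{
	pthread_mutex_init(&s->lock, NULL);
	pthread_cond_init(&s->freed, NULL);
	s->avail = max;
}

/* Take a slot; sleep until one is free instead of returning an error. */
static void kmod_slot_get(struct kmod_slots *s)
{
	pthread_mutex_lock(&s->lock);
	while (s->avail == 0)
		pthread_cond_wait(&s->freed, &s->lock);
	s->avail--;
	pthread_mutex_unlock(&s->lock);
}

/* Return a slot and wake one waiter, if any. */
static void kmod_slot_put(struct kmod_slots *s)
{
	pthread_mutex_lock(&s->lock);
	s->avail++;
	pthread_cond_signal(&s->freed);
	pthread_mutex_unlock(&s->lock);
}
```

A caller would wrap the modprobe invocation between kmod_slot_get() and
kmod_slot_put(); with the pool initialized to 50, the 51st concurrent
caller waits rather than fails.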

This patch series depends on the unsigned int range proc sysctl changes which
Andrew Morton recently merged into his -mm tree [1].

The kmod stress test driver uses a new license (GPL on Linux, copyleft-next
outside of Linux). Linus was fine with copyleft-next so long as it was
clear that the GPL applies to Linux [2] and an "or" clause was used if I wanted
to use copyleft-next. Later, discussions with Alan and Ted ironed out the "or"
language clause to use [3].

All code is also available on my 20170525-kmod-throttle branch of my linux-next
tree based on tag next-20170525 [4].

If there are any questions please let me know.

[0] https://lkml.kernel.org/r/20170519032444.18416-1-mcgrof@kernel.org
[1] https://lkml.kernel.org/r/20170519033554.18592-1-mcgrof@kernel.org
[2] https://lkml.kernel.org/r/CA+55aFyhxcvD+q7tp+-yrSFDKfR0mOHgyEAe=f_94aKLsOu0Og@mail.gmail.com
[3] https://lkml.kernel.org/r/1495234558.7848.122.camel@linux.intel.com
[4] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20170525-kmod-throttle

  Luis

Luis R. Rodriguez (5):
  module: use list_for_each_entry_rcu() on find_module_all()
  kmod: reduce atomic operations on kmod_concurrent
  kmod: add test driver to stress test the module loader
  kmod: add helpers for getting kmod limit
  kmod: throttle kmod thread limit

 Documentation/sysctl/kernel.txt       |   20 +
 include/linux/kmod.h                  |    7 +
 init/main.c                           |    1 +
 kernel/kmod.c                         |   92 ++-
 kernel/module.c                       |    2 +-
 kernel/sysctl.c                       |    7 +
 lib/Kconfig.debug                     |   25 +
 lib/Makefile                          |    1 +
 lib/test_kmod.c                       | 1246 +++++++++++++++++++++++++++++++++
 tools/testing/selftests/kmod/Makefile |   11 +
 tools/testing/selftests/kmod/config   |    7 +
 tools/testing/selftests/kmod/kmod.sh  |  615 ++++++++++++++++
 12 files changed, 2004 insertions(+), 30 deletions(-)
 create mode 100644 lib/test_kmod.c
 create mode 100644 tools/testing/selftests/kmod/Makefile
 create mode 100644 tools/testing/selftests/kmod/config
 create mode 100755 tools/testing/selftests/kmod/kmod.sh

-- 
2.11.0


* [PATCH v2 1/5] module: use list_for_each_entry_rcu() on find_module_all()
  2017-05-26  0:16 ` [PATCH v2 0/5] kmod: help make deterministic Luis R. Rodriguez
@ 2017-05-26  0:16   ` Luis R. Rodriguez
  2017-05-26  0:16   ` [PATCH v2 2/5] kmod: reduce atomic operations on kmod_concurrent Luis R. Rodriguez
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-26  0:16 UTC (permalink / raw)
  To: akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel, Luis R. Rodriguez

The module list has been using RCU in a lot of other calls
for a while now, we just overlooked changing this one over to
use RCU.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 kernel/module.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/module.c b/kernel/module.c
index 3803449ca219..2df38d45ca37 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -603,7 +603,7 @@ static struct module *find_module_all(const char *name, size_t len,
 
 	module_assert_mutex_or_preempt();
 
-	list_for_each_entry(mod, &modules, list) {
+	list_for_each_entry_rcu(mod, &modules, list) {
 		if (!even_unformed && mod->state == MODULE_STATE_UNFORMED)
 			continue;
 		if (strlen(mod->name) == len && !memcmp(mod->name, name, len))
-- 
2.11.0


* [PATCH v2 2/5] kmod: reduce atomic operations on kmod_concurrent
  2017-05-26  0:16 ` [PATCH v2 0/5] kmod: help make deterministic Luis R. Rodriguez
  2017-05-26  0:16   ` [PATCH v2 1/5] module: use list_for_each_entry_rcu() on find_module_all() Luis R. Rodriguez
@ 2017-05-26  0:16   ` Luis R. Rodriguez
  2017-05-26  1:11     ` Dmitry Torokhov
  2017-05-26  0:16   ` [PATCH v2 3/5] kmod: add test driver to stress test the module loader Luis R. Rodriguez
                     ` (3 subsequent siblings)
  5 siblings, 1 reply; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-26  0:16 UTC (permalink / raw)
  To: akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel, Luis R. Rodriguez

When checking if we want to allow a kmod thread to kick off we increment,
then read to see if we should enable a thread. If we were over the allowed
limit we decrement. Splitting the increment far apart from the decrement
means there could be a time where two increments happen, potentially
giving a false failure on a thread which should have been allowed.

CPU1			CPU2
atomic_inc()
			atomic_inc()
atomic_read()
			atomic_read()
atomic_dec()
			atomic_dec()

In this case the read on CPU1 sees both atomic_inc()'s and we could deny
it a kmod thread. We could try to prevent this with a lock
or preemption but that is overkill. We can fix this by reducing the number of
atomic operations. We do this by inverting the logic of the enabler:
instead of incrementing kmod_concurrent as we get new kmod users, define the
variable kmod_concurrent_max as the max number of currently allowed kmod
users and as we get new kmod users just decrement it if it's still positive.
This combines the dec and read in one atomic operation.

In this case we no longer get the same false failure:

CPU1			CPU2
atomic_dec_if_positive()
			atomic_dec_if_positive()
atomic_inc()
			atomic_inc()
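
For readers who want to poke at the idea outside the kernel, the following
is a minimal userspace sketch with C11 atomics. It is only an approximation
of what atomic_dec_if_positive() does, written as a compare-and-swap loop;
dec_if_positive(), slot_get() and slot_put() are illustrative names, not
kernel functions. The check and the decrement happen in one atomic step, so
a second caller can never push the counter past the limit and make the
first caller fail.

```c
#include <stdatomic.h>

/*
 * Userspace stand-in for the kernel's atomic_dec_if_positive():
 * decrement *v only if the result stays >= 0. Returns the old value
 * minus one either way, so a negative return means "no slot left".
 */
static int dec_if_positive(atomic_int *v)
{
	int old = atomic_load(v);

	for (;;) {
		int next = old - 1;

		if (next < 0)
			return next;	/* nothing left, *v is untouched */
		/* On failure 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, next))
			return next;
	}
}

/* Hypothetical helper: returns 1 if a slot was taken, 0 otherwise. */
static int slot_get(atomic_int *slots)
{
	return dec_if_positive(slots) >= 0;
}

/* Hypothetical helper: give the slot back. */
static void slot_put(atomic_int *slots)
{
	atomic_fetch_add(slots, 1);
}
```

With the counter initialized to the maximum, a caller that fails
slot_get() has not modified the counter at all, so there is no window in
which its own increment can be observed by another CPU.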

Suggested-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 include/linux/kmod.h |  2 ++
 init/main.c          |  1 +
 kernel/kmod.c        | 44 +++++++++++++++++++++++++-------------------
 3 files changed, 28 insertions(+), 19 deletions(-)

diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index c4e441e00db5..8e2f302b214a 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -38,10 +38,12 @@ int __request_module(bool wait, const char *name, ...);
 #define request_module_nowait(mod...) __request_module(false, mod)
 #define try_then_request_module(x, mod...) \
 	((x) ?: (__request_module(true, mod), (x)))
+void init_kmod_umh(void);
 #else
 static inline int request_module(const char *name, ...) { return -ENOSYS; }
 static inline int request_module_nowait(const char *name, ...) { return -ENOSYS; }
 #define try_then_request_module(x, mod...) (x)
+static inline void init_kmod_umh(void) { }
 #endif
 
 
diff --git a/init/main.c b/init/main.c
index 9ec09ff8a930..9b20be716cf7 100644
--- a/init/main.c
+++ b/init/main.c
@@ -650,6 +650,7 @@ asmlinkage __visible void __init start_kernel(void)
 	thread_stack_cache_init();
 	cred_init();
 	fork_init();
+	init_kmod_umh();
 	proc_caches_init();
 	buffer_init();
 	key_init();
diff --git a/kernel/kmod.c b/kernel/kmod.c
index 563f97e2be36..cafd27b92d19 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -46,6 +46,7 @@
 #include <trace/events/module.h>
 
 extern int max_threads;
+unsigned int max_modprobes;
 
 #define CAP_BSET	(void *)1
 #define CAP_PI		(void *)2
@@ -56,6 +57,8 @@ static DEFINE_SPINLOCK(umh_sysctl_lock);
 static DECLARE_RWSEM(umhelper_sem);
 
 #ifdef CONFIG_MODULES
+static atomic_t kmod_concurrent_max = ATOMIC_INIT(0);
+#define MAX_KMOD_CONCURRENT 50	/* Completely arbitrary value - KAO */
 
 /*
 	modprobe_path is set via /proc/sys.
@@ -127,10 +130,7 @@ int __request_module(bool wait, const char *fmt, ...)
 {
 	va_list args;
 	char module_name[MODULE_NAME_LEN];
-	unsigned int max_modprobes;
 	int ret;
-	static atomic_t kmod_concurrent = ATOMIC_INIT(0);
-#define MAX_KMOD_CONCURRENT 50	/* Completely arbitrary value - KAO */
 	static int kmod_loop_msg;
 
 	/*
@@ -154,21 +154,7 @@ int __request_module(bool wait, const char *fmt, ...)
 	if (ret)
 		return ret;
 
-	/* If modprobe needs a service that is in a module, we get a recursive
-	 * loop.  Limit the number of running kmod threads to max_threads/2 or
-	 * MAX_KMOD_CONCURRENT, whichever is the smaller.  A cleaner method
-	 * would be to run the parents of this process, counting how many times
-	 * kmod was invoked.  That would mean accessing the internals of the
-	 * process tables to get the command line, proc_pid_cmdline is static
-	 * and it is not worth changing the proc code just to handle this case. 
-	 * KAO.
-	 *
-	 * "trace the ppid" is simple, but will fail if someone's
-	 * parent exits.  I think this is as good as it gets. --RR
-	 */
-	max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
-	atomic_inc(&kmod_concurrent);
-	if (atomic_read(&kmod_concurrent) > max_modprobes) {
+	if (atomic_dec_if_positive(&kmod_concurrent_max) < 0) {
 		/* We may be blaming an innocent here, but unlikely */
 		if (kmod_loop_msg < 5) {
 			printk(KERN_ERR
@@ -184,10 +170,30 @@ int __request_module(bool wait, const char *fmt, ...)
 
 	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
 
-	atomic_dec(&kmod_concurrent);
+	atomic_inc(&kmod_concurrent_max);
+
 	return ret;
 }
 EXPORT_SYMBOL(__request_module);
+
+/*
+ * If modprobe needs a service that is in a module, we get a recursive
+ * loop.  Limit the number of running kmod threads to max_threads/2 or
+ * MAX_KMOD_CONCURRENT, whichever is the smaller.  A cleaner method
+ * would be to run the parents of this process, counting how many times
+ * kmod was invoked.  That would mean accessing the internals of the
+ * process tables to get the command line, proc_pid_cmdline is static
+ * and it is not worth changing the proc code just to handle this case.
+ *
+ * "trace the ppid" is simple, but will fail if someone's
+ * parent exits.  I think this is as good as it gets.
+ */
+void __init init_kmod_umh(void)
+{
+	max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
+	atomic_set(&kmod_concurrent_max, max_modprobes);
+}
+
 #endif /* CONFIG_MODULES */
 
 static void call_usermodehelper_freeinfo(struct subprocess_info *info)
-- 
2.11.0


* [PATCH v2 3/5] kmod: add test driver to stress test the module loader
  2017-05-26  0:16 ` [PATCH v2 0/5] kmod: help make deterministic Luis R. Rodriguez
  2017-05-26  0:16   ` [PATCH v2 1/5] module: use list_for_each_entry_rcu() on find_module_all() Luis R. Rodriguez
  2017-05-26  0:16   ` [PATCH v2 2/5] kmod: reduce atomic operations on kmod_concurrent Luis R. Rodriguez
@ 2017-05-26  0:16   ` Luis R. Rodriguez
  2017-05-26  0:16   ` [PATCH v2 4/5] kmod: add helpers for getting kmod limit Luis R. Rodriguez
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-26  0:16 UTC (permalink / raw)
  To: akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel, Luis R. Rodriguez

This adds a new stress test driver for kmod: the kernel module loader.
The new stress test driver, test_kmod, is only enabled as a module right
now. It should be possible to build this in and run tests early
(refer to the force_init_test module parameter), however since a lot of
tests can run a system out of memory fast we leave this disabled for now.

On a system with 1024 MiB of RAM this test driver can *easily* OOM your
kernel fast.

The test_kmod driver exposes API knobs for us to fine tune simple
request_module() and get_fs_type() calls. Since each of these API calls
takes only one parameter, a test driver for them is rather
simple. Other factors that can help the test driver though are
the number of calls we issue and knowing the current limitations of
each. This exposes as much configuration as possible to
userspace to be able to build tests directly from userspace.

Since it allows multiple misc devices it will eventually (once we
add a knob to let us create new devices at will) also be possible to
perform more tests in parallel, provided you have enough memory.

We only enable tests we know work as of right now.

Demo screenshots:

 # tools/testing/selftests/kmod/kmod.sh
kmod_test_0001_driver: OK! - loading kmod test
kmod_test_0001_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
kmod_test_0001_fs: OK! - loading kmod test
kmod_test_0001_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
kmod_test_0002_driver: OK! - loading kmod test
kmod_test_0002_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
kmod_test_0002_fs: OK! - loading kmod test
kmod_test_0002_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
kmod_test_0003: OK! - loading kmod test
kmod_test_0003: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0004: OK! - loading kmod test
kmod_test_0004: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0005: OK! - loading kmod test
kmod_test_0005: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0006: OK! - loading kmod test
kmod_test_0006: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0005: OK! - loading kmod test
kmod_test_0005: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0006: OK! - loading kmod test
kmod_test_0006: OK! - Return value: 0 (SUCCESS), expected SUCCESS
XXX: add test restult for 0007
Test completed

You can also request specific tests:

 # tools/testing/selftests/kmod/kmod.sh -t 0001
kmod_test_0001_driver: OK! - loading kmod test
kmod_test_0001_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
kmod_test_0001_fs: OK! - loading kmod test
kmod_test_0001_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
Test completed

Lastly, the currently available tests:

 # tools/testing/selftests/kmod/kmod.sh --help
Usage: tools/testing/selftests/kmod/kmod.sh [ -t <4-number-digit> ]
Valid tests: 0001-0009

0001 - Simple test - 1 thread  for empty string
0002 - Simple test - 1 thread  for modules/filesystems that do not exist
0003 - Simple test - 1 thread  for get_fs_type() only
0004 - Simple test - 2 threads for get_fs_type() only
0005 - multithreaded tests with default setup - request_module() only
0006 - multithreaded tests with default setup - get_fs_type() only
0007 - multithreaded tests with default setup test request_module() and get_fs_type()
0008 - multithreaded - push kmod_concurrent over max_modprobes for request_module()
0009 - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()

The following test cases currently fail, so they are not enabled by
default:

 # tools/testing/selftests/kmod/kmod.sh -t 0008
 # tools/testing/selftests/kmod/kmod.sh -t 0009

To be sure to run them as intended please unload both of the modules:

  o test_module
  o xfs

And ensure they are not loaded on your system prior to testing them.
If you use these partitions for your rootfs you can change the default
test driver used for get_fs_type() by exporting it into your
environment. For examples of other test defaults you can override,
refer to allow_user_defaults() in kmod.sh.

Behind the scenes this is how we fine tune a test case prior to
hitting a trigger to run it:

cat /sys/devices/virtual/misc/test_kmod0/config
echo -n "2" > /sys/devices/virtual/misc/test_kmod0/config_test_case
echo -n "ext4" > /sys/devices/virtual/misc/test_kmod0/config_test_fs
echo -n "80" > /sys/devices/virtual/misc/test_kmod0/config_num_threads
cat /sys/devices/virtual/misc/test_kmod0/config
echo -n "1" > /sys/devices/virtual/misc/test_kmod0/config_num_threads

Finally to trigger:

echo -n "1" > /sys/devices/virtual/misc/test_kmod0/trigger_config

The kmod.sh script uses the above constructs to build different test cases.

A bit of interpretation of the current failures follows. First, two
premises:

a) When request_module() is used userspace figures out an optimized version of
module order for us. Once it finds the modules it needs, as per depmod
symbol dep map, it will finit_module() the respective modules which
are needed for the original request_module() request.

b) We have an optimization in place whereby if a kernel uses
request_module() on a module already loaded we never bother
userspace as the module already is loaded. This is all handled by
kernel/kmod.c.

A few things to consider to help identify root causes of issues:

0) kmod 19 has a broken heuristic for modules being assumed to be
built-in to your kernel and will return 0 even though request_module()
failed. Upgrade to a newer version of kmod.

1) A get_fs_type() call for "xfs" will request_module() for
"fs-xfs", not for "xfs". The optimization in kernel described in b)
fails to catch if we have a lot of consecutive get_fs_type() calls.
The reason is the optimization in place does not look for aliases. This
means two consecutive get_fs_type() calls will bump kmod_concurrent, whereas
request_module() will not.

This is one explanation of why test case 0009 fails at least once for
get_fs_type().

2) If a module fails to load --- for whatever reason (kmod_concurrent
limit reached, file not yet present due to rootfs switch, out of memory)
we have a period of time during which module requests for the same name
either with request_module() or get_fs_type() will *also* fail to load
even if the file for the module is ready.

This explains why *multiple* NULLs are possible on test 0009.

3) finit_module() consumes quite a bit of memory.

4) Filesystems typically also have more dependent modules than other
modules. It's important to note though that even though a get_fs_type() call
does not incur additional kmod_concurrent bumps, since userspace
loads dependencies it finds it needs via finit_module(), it *will*
take much more memory to load a module with a lot of dependencies.

Because of 3) and 4) we will easily run into out of memory failures
with certain tests. For instance test 0006 fails on qemu with 1024 MiB
of RAM. It panics a box after reaping all userspace processes and still
not having enough memory to reap.
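
To illustrate point 1) above: an exact-name lookup against a /proc/modules
style list finds "xfs" but not the alias "fs-xfs" that get_fs_type() asks
for. This is a userspace analogy only, not the kernel's actual check, and
module_listed() is a hypothetical helper.

```shell
# Analogy only: succeed if NAME is the first field of a line in a
# /proc/modules style LIST. Aliases such as "fs-xfs" never appear there.
module_listed()
{
	local name="$1" list="${2:-/proc/modules}"

	awk -v n="$name" '$1 == n { found = 1 } END { exit !found }' "$list"
}
```

With xfs loaded, "module_listed xfs" succeeds while "module_listed fs-xfs"
does not, even though both names refer to the same module.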

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 lib/Kconfig.debug                     |   25 +
 lib/Makefile                          |    1 +
 lib/test_kmod.c                       | 1246 +++++++++++++++++++++++++++++++++
 tools/testing/selftests/kmod/Makefile |   11 +
 tools/testing/selftests/kmod/config   |    7 +
 tools/testing/selftests/kmod/kmod.sh  |  635 +++++++++++++++++
 6 files changed, 1925 insertions(+)
 create mode 100644 lib/test_kmod.c
 create mode 100644 tools/testing/selftests/kmod/Makefile
 create mode 100644 tools/testing/selftests/kmod/config
 create mode 100755 tools/testing/selftests/kmod/kmod.sh

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 650250aec2d0..2453d8997510 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1833,6 +1833,31 @@ config BUG_ON_DATA_CORRUPTION
 
 	  If unsure, say N.
 
+config TEST_KMOD
+	tristate "kmod stress tester"
+	default n
+	depends on m
+	select TEST_LKM
+	select XFS_FS
+	select TUN
+	select BTRFS_FS
+	help
+	  Test the kernel's module loading mechanism: kmod. kmod implements
+	  support to load modules using the Linux kernel's usermode helper.
+	  This test provides a series of tests against kmod.
+
+	  Although technically test_kmod could be built either as a module or
+	  into the kernel, we disallow building it into the kernel: it stress
+	  tests request_module(), which will very likely cause issues by
+	  taking over precious threads otherwise available to other module
+	  load requests, and ultimately this could be fatal.
+
+	  To run tests run:
+
+	  tools/testing/selftests/kmod/kmod.sh --help
+
+	  If unsure, say N.
+
 source "samples/Kconfig"
 
 source "lib/Kconfig.kgdb"
diff --git a/lib/Makefile b/lib/Makefile
index 1935a97171db..07921956506f 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -61,6 +61,7 @@ obj-$(CONFIG_TEST_PRINTF) += test_printf.o
 obj-$(CONFIG_TEST_BITMAP) += test_bitmap.o
 obj-$(CONFIG_TEST_UUID) += test_uuid.o
 obj-$(CONFIG_TEST_PARMAN) += test_parman.o
+obj-$(CONFIG_TEST_KMOD) += test_kmod.o
 
 ifeq ($(CONFIG_DEBUG_KOBJECT),y)
 CFLAGS_kobject.o += -DDEBUG
diff --git a/lib/test_kmod.c b/lib/test_kmod.c
new file mode 100644
index 000000000000..6c1d678bcf8b
--- /dev/null
+++ b/lib/test_kmod.c
@@ -0,0 +1,1246 @@
+/*
+ * kmod stress test driver
+ *
+ * Copyright (C) 2017 Luis R. Rodriguez <mcgrof@kernel.org>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or at your option any
+ * later version; or, when distributed separately from the Linux kernel or
+ * when incorporated into other software packages, subject to the following
+ * license:
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of copyleft-next (version 0.3.1 or later) as published
+ * at http://copyleft-next.org/.
+ */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+/*
+ * This driver provides an interface to trigger and test the kernel's
+ * module loader through a series of configurations and a few triggers.
+ * To test this driver use the following script as root:
+ *
+ * tools/testing/selftests/kmod/kmod.sh --help
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/kmod.h>
+#include <linux/printk.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/fs.h>
+#include <linux/miscdevice.h>
+#include <linux/vmalloc.h>
+#include <linux/slab.h>
+#include <linux/device.h>
+
+#define TEST_START_NUM_THREADS	50
+#define TEST_START_DRIVER	"test_module"
+#define TEST_START_TEST_FS	"xfs"
+#define TEST_START_TEST_CASE	TEST_KMOD_DRIVER
+
+
+static bool force_init_test = false;
+module_param(force_init_test, bool_enable_only, 0644);
+MODULE_PARM_DESC(force_init_test,
+		 "Force kicking a test immediately after driver loads");
+
+/*
+ * For device allocation / registration
+ */
+static DEFINE_MUTEX(reg_dev_mutex);
+static LIST_HEAD(reg_test_devs);
+
+/*
+ * num_test_devs actually represents the *next* ID of the next
+ * device we will allow to create.
+ */
+static int num_test_devs;
+
+/**
+ * enum kmod_test_case - linker table test case
+ *
+ * If you add a test case, please be sure to review if you need to set
+ * @need_mod_put for your test case.
+ *
+ * @TEST_KMOD_DRIVER: stress tests request_module()
+ * @TEST_KMOD_FS_TYPE: stress tests get_fs_type()
+ */
+enum kmod_test_case {
+	__TEST_KMOD_INVALID = 0,
+
+	TEST_KMOD_DRIVER,
+	TEST_KMOD_FS_TYPE,
+
+	__TEST_KMOD_MAX,
+};
+
+struct test_config {
+	char *test_driver;
+	char *test_fs;
+	unsigned int num_threads;
+	enum kmod_test_case test_case;
+	int test_result;
+};
+
+struct kmod_test_device;
+
+/**
+ * kmod_test_device_info - thread info
+ *
+ * @ret_sync: return value if request_module() is used, sync request for
+ * 	@TEST_KMOD_DRIVER
+ * @fs_sync: return value of get_fs_type() for @TEST_KMOD_FS_TYPE
+ * @thread_idx: thread ID
+ * @test_dev: test device test is being performed under
+ * @need_mod_put: Some tests (get_fs_type() is one) require putting the module
+ *	(module_put(fs_sync->owner)) when done, otherwise you will not be able
+ *	to unload the respective modules and re-test. We use this to keep
+ *	accounting of when we need this and to help out in case we need to
+ *	error out and deal with module_put() on error.
+ */
+struct kmod_test_device_info {
+	int ret_sync;
+	struct file_system_type *fs_sync;
+	struct task_struct *task_sync;
+	unsigned int thread_idx;
+	struct kmod_test_device *test_dev;
+	bool need_mod_put;
+};
+
+/**
+ * kmod_test_device - test device to help test kmod
+ *
+ * @dev_idx: unique ID for test device
+ * @config: configuration for the test
+ * @misc_dev: we use a misc device under the hood
+ * @dev: pointer to misc_dev's own struct device
+ * @config_mutex: protects configuration of test
+ * @trigger_mutex: the test trigger can only be fired once at a time
+ * @thread_mutex: protects @done count, and the @info for each thread
+ * @done: number of threads which have completed or failed
+ * @test_is_oom: when we run out of memory, use this to halt moving forward
+ * @kthreads_done: completion used to signal when all work is done
+ * @list: needed to be part of the reg_test_devs
+ * @info: array of info for each thread
+ */
+struct kmod_test_device {
+	int dev_idx;
+	struct test_config config;
+	struct miscdevice misc_dev;
+	struct device *dev;
+	struct mutex config_mutex;
+	struct mutex trigger_mutex;
+	struct mutex thread_mutex;
+
+	unsigned int done;
+
+	bool test_is_oom;
+	struct completion kthreads_done;
+	struct list_head list;
+
+	struct kmod_test_device_info *info;
+};
+
+static const char *test_case_str(enum kmod_test_case test_case)
+{
+	switch (test_case) {
+	case TEST_KMOD_DRIVER:
+		return "TEST_KMOD_DRIVER";
+	case TEST_KMOD_FS_TYPE:
+		return "TEST_KMOD_FS_TYPE";
+	default:
+		return "invalid";
+	}
+}
+
+static struct miscdevice *dev_to_misc_dev(struct device *dev)
+{
+	return dev_get_drvdata(dev);
+}
+
+static struct kmod_test_device *misc_dev_to_test_dev(struct miscdevice *misc_dev)
+{
+	return container_of(misc_dev, struct kmod_test_device, misc_dev);
+}
+
+static struct kmod_test_device *dev_to_test_dev(struct device *dev)
+{
+	struct miscdevice *misc_dev;
+
+	misc_dev = dev_to_misc_dev(dev);
+
+	return misc_dev_to_test_dev(misc_dev);
+}
+
+/* Must run with thread_mutex held */
+static void kmod_test_done_check(struct kmod_test_device *test_dev,
+				 unsigned int idx)
+{
+	struct test_config *config = &test_dev->config;
+
+	test_dev->done++;
+	dev_dbg(test_dev->dev, "Done thread count: %u\n", test_dev->done);
+
+	if (test_dev->done == config->num_threads) {
+		dev_info(test_dev->dev, "Done: %u threads have all run now\n",
+			 test_dev->done);
+		dev_info(test_dev->dev, "Last thread to run: %u\n", idx);
+		complete(&test_dev->kthreads_done);
+	}
+}
+
+static void test_kmod_put_module(struct kmod_test_device_info *info)
+{
+	struct kmod_test_device *test_dev = info->test_dev;
+	struct test_config *config = &test_dev->config;
+
+	if (!info->need_mod_put)
+		return;
+
+	switch (config->test_case) {
+	case TEST_KMOD_DRIVER:
+		break;
+	case TEST_KMOD_FS_TYPE:
+		if (info && info->fs_sync && info->fs_sync->owner)
+			module_put(info->fs_sync->owner);
+		break;
+	default:
+		BUG();
+	}
+
+	info->need_mod_put = false;
+}
+
+static int run_request(void *data)
+{
+	struct kmod_test_device_info *info = data;
+	struct kmod_test_device *test_dev = info->test_dev;
+	struct test_config *config = &test_dev->config;
+
+	switch (config->test_case) {
+	case TEST_KMOD_DRIVER:
+		info->ret_sync = request_module("%s", config->test_driver);
+		break;
+	case TEST_KMOD_FS_TYPE:
+		info->fs_sync = get_fs_type(config->test_fs);
+		info->need_mod_put = true;
+		break;
+	default:
+		/* __trigger_config_run() already checked for test sanity */
+		BUG();
+		return -EINVAL;
+	}
+
+	dev_dbg(test_dev->dev, "Ran thread %u\n", info->thread_idx);
+
+	test_kmod_put_module(info);
+
+	mutex_lock(&test_dev->thread_mutex);
+	info->task_sync = NULL;
+	kmod_test_done_check(test_dev, info->thread_idx);
+	mutex_unlock(&test_dev->thread_mutex);
+
+	return 0;
+}
+
+static int tally_work_test(struct kmod_test_device_info *info)
+{
+	struct kmod_test_device *test_dev = info->test_dev;
+	struct test_config *config = &test_dev->config;
+	int err_ret = 0;
+
+	switch (config->test_case) {
+	case TEST_KMOD_DRIVER:
+		/*
+		 * Only capture errors, if one is found that's
+		 * enough, for now.
+		 */
+		if (info->ret_sync != 0)
+			err_ret = info->ret_sync;
+		dev_info(test_dev->dev,
+			 "Sync thread %u return status: %d\n",
+			 info->thread_idx, info->ret_sync);
+		break;
+	case TEST_KMOD_FS_TYPE:
+		/* For now we make this simple */
+		if (!info->fs_sync)
+			err_ret = -EINVAL;
+		dev_info(test_dev->dev, "Sync thread %u fs: %s\n",
+			 info->thread_idx, info->fs_sync ? config->test_fs :
+			 "NULL");
+		break;
+	default:
+		BUG();
+	}
+
+	return err_ret;
+}
+
+/*
+ * XXX: add result option to display if all errors did not match.
+ * For now we just keep any error code if one was found.
+ *
+ * If this ran it means *all* tasks were created fine and we
+ * are now just collecting results.
+ *
+ * Only propagate errors, do not override with a subsequent success case.
+ */
+static void tally_up_work(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+	struct kmod_test_device_info *info;
+	unsigned int idx;
+	int err_ret = 0;
+	int ret = 0;
+
+	mutex_lock(&test_dev->thread_mutex);
+
+	dev_info(test_dev->dev, "Results:\n");
+
+	for (idx = 0; idx < config->num_threads; idx++) {
+		info = &test_dev->info[idx];
+		ret = tally_work_test(info);
+		if (ret)
+			err_ret = ret;
+	}
+
+	/*
+	 * Note: request_module() returns 256 for a module not found even
+	 * though modprobe itself returns 1.
+	 */
+	config->test_result = err_ret;
+
+	mutex_unlock(&test_dev->thread_mutex);
+}
+
+static int try_one_request(struct kmod_test_device *test_dev, unsigned int idx)
+{
+	struct kmod_test_device_info *info = &test_dev->info[idx];
+	int fail_ret = -ENOMEM;
+
+	mutex_lock(&test_dev->thread_mutex);
+
+	info->thread_idx = idx;
+	info->test_dev = test_dev;
+	info->task_sync = kthread_run(run_request, info, "%s-%u",
+				      KBUILD_MODNAME, idx);
+
+	if (!info->task_sync || IS_ERR(info->task_sync)) {
+		test_dev->test_is_oom = true;
+		dev_err(test_dev->dev, "Setting up thread %u failed\n", idx);
+		info->task_sync = NULL;
+		goto err_out;
+	} else
+		dev_dbg(test_dev->dev, "Kicked off thread %u\n", idx);
+
+	mutex_unlock(&test_dev->thread_mutex);
+
+	return 0;
+
+err_out:
+	info->ret_sync = fail_ret;
+	mutex_unlock(&test_dev->thread_mutex);
+
+	return fail_ret;
+}
+
+static void test_dev_kmod_stop_tests(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+	struct kmod_test_device_info *info;
+	unsigned int i;
+
+	dev_info(test_dev->dev, "Ending request_module() tests\n");
+
+	mutex_lock(&test_dev->thread_mutex);
+
+	for (i = 0; i < config->num_threads; i++) {
+		info = &test_dev->info[i];
+		if (info->task_sync && !IS_ERR(info->task_sync)) {
+			dev_info(test_dev->dev,
+				 "Stopping still-running thread %u\n", i);
+			kthread_stop(info->task_sync);
+		}
+
+		/*
+		 * info->task_sync is well protected, it can only be
+		 * NULL or a pointer to a struct. If it's NULL we either
+		 * never ran, or we did and we completed the work. Completed
+		 * tasks *always* put the module for us. This is a sanity
+		 * check -- just in case.
+		 */
+		if (info->task_sync && info->need_mod_put)
+			test_kmod_put_module(info);
+	}
+
+	mutex_unlock(&test_dev->thread_mutex);
+}
+
+/*
+ * Only wait *iff* we did not run into any errors during all of our thread
+ * set up. If we run into any issues we stop threads and just bail out with
+ * an error to the trigger. This also means we don't need any tally work
+ * for any threads which fail.
+ */
+static int try_requests(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+	unsigned int idx;
+	int ret;
+	bool any_error = false;
+
+	for (idx = 0; idx < config->num_threads; idx++) {
+		if (test_dev->test_is_oom) {
+			any_error = true;
+			break;
+		}
+
+		ret = try_one_request(test_dev, idx);
+		if (ret) {
+			any_error = true;
+			break;
+		}
+	}
+
+	if (!any_error) {
+		test_dev->test_is_oom = false;
+		dev_info(test_dev->dev,
+			 "No errors were found while initializing threads\n");
+		wait_for_completion(&test_dev->kthreads_done);
+		tally_up_work(test_dev);
+	} else {
+		test_dev->test_is_oom = true;
+		dev_info(test_dev->dev,
+			 "At least one thread failed to start, stop all work\n");
+		test_dev_kmod_stop_tests(test_dev);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static int run_test_driver(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+
+	dev_info(test_dev->dev, "Test case: %s (%u)\n",
+		 test_case_str(config->test_case),
+		 config->test_case);
+	dev_info(test_dev->dev, "Test driver to load: %s\n",
+		 config->test_driver);
+	dev_info(test_dev->dev, "Number of threads to run: %u\n",
+		 config->num_threads);
+	dev_info(test_dev->dev, "Thread IDs will range from 0 - %u\n",
+		 config->num_threads - 1);
+
+	return try_requests(test_dev);
+}
+
+static int run_test_fs_type(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+
+	dev_info(test_dev->dev, "Test case: %s (%u)\n",
+		 test_case_str(config->test_case),
+		 config->test_case);
+	dev_info(test_dev->dev, "Test filesystem to load: %s\n",
+		 config->test_fs);
+	dev_info(test_dev->dev, "Number of threads to run: %u\n",
+		 config->num_threads);
+	dev_info(test_dev->dev, "Thread IDs will range from 0 - %u\n",
+		 config->num_threads - 1);
+
+	return try_requests(test_dev);
+}
+
+static ssize_t config_show(struct device *dev,
+			   struct device_attribute *attr,
+			   char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+	int len = 0;
+
+	mutex_lock(&test_dev->config_mutex);
+
+	len += snprintf(buf, PAGE_SIZE,
+			"Custom trigger configuration for: %s\n",
+			dev_name(dev));
+
+	len += snprintf(buf+len, PAGE_SIZE - len,
+			"Number of threads:\t%u\n",
+			config->num_threads);
+
+	len += snprintf(buf+len, PAGE_SIZE - len,
+			"Test_case:\t%s (%u)\n",
+			test_case_str(config->test_case),
+			config->test_case);
+
+	if (config->test_driver)
+		len += snprintf(buf+len, PAGE_SIZE - len,
+				"driver:\t%s\n",
+				config->test_driver);
+	else
+		len += snprintf(buf+len, PAGE_SIZE - len,
+				"driver:\tEMPTY\n");
+
+	if (config->test_fs)
+		len += snprintf(buf+len, PAGE_SIZE - len,
+				"fs:\t%s\n",
+				config->test_fs);
+	else
+		len += snprintf(buf+len, PAGE_SIZE - len,
+				"fs:\tEMPTY\n");
+
+	mutex_unlock(&test_dev->config_mutex);
+
+	return len;
+}
+static DEVICE_ATTR_RO(config);
+
+/*
+ * This ensures we don't allow kicking threads through if our configuration
+ * is faulty.
+ */
+static int __trigger_config_run(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+
+	test_dev->done = 0;
+
+	switch (config->test_case) {
+	case TEST_KMOD_DRIVER:
+		return run_test_driver(test_dev);
+	case TEST_KMOD_FS_TYPE:
+		return run_test_fs_type(test_dev);
+	default:
+		dev_warn(test_dev->dev,
+			 "Invalid test case requested: %u\n",
+			 config->test_case);
+		return -EINVAL;
+	}
+}
+
+static int trigger_config_run(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+	int ret;
+
+	mutex_lock(&test_dev->trigger_mutex);
+	mutex_lock(&test_dev->config_mutex);
+
+	ret = __trigger_config_run(test_dev);
+	if (ret < 0)
+		goto out;
+	dev_info(test_dev->dev, "General test result: %d\n",
+		 config->test_result);
+
+	/*
+	 * We must return 0 after a trigger unless something went
+	 * wrong with the setup of the test. If the test setup went fine
+	 * then userspace must just check the result of config->test_result.
+	 * One issue with relying on the return from a call in the kernel
+	 * is if the kernel returns a positive value using this trigger
+	 * will not return the value to userspace, it would be lost.
+	 *
+	 * By not relying on capturing the return value of tests we are using
+	 * through the trigger it also allows us to run tests with set -e and
+	 * only fail when something went wrong with the driver upon trigger
+	 * requests.
+	 */
+	ret = 0;
+
+out:
+	mutex_unlock(&test_dev->config_mutex);
+	mutex_unlock(&test_dev->trigger_mutex);
+
+	return ret;
+}
+
+static ssize_t
+trigger_config_store(struct device *dev,
+		     struct device_attribute *attr,
+		     const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	int ret;
+
+	if (test_dev->test_is_oom)
+		return -ENOMEM;
+
+	/* For all intents and purposes we don't care what userspace
+	 * sent this trigger, we care only that we were triggered.
+	 * We treat the return value only for capturing issues with
+	 * the test setup. At this point all the test variables should
+	 * have been allocated so typically this should never fail.
+	 */
+	ret = trigger_config_run(test_dev);
+	if (unlikely(ret < 0))
+		goto out;
+
+	/*
+	 * Note: any return > 0 will be treated as success
+	 * and the error value will not be available to userspace.
+	 * Do not rely on trying to send to userspace a test value
+	 * return value as positive return errors will be lost.
+	 */
+	if (WARN_ON(ret > 0))
+		return -EINVAL;
+
+	ret = count;
+out:
+	return ret;
+}
+static DEVICE_ATTR_WO(trigger_config);
+
+/*
+ * XXX: move to kstrncpy() once merged.
+ *
+ * Users should use kfree_const() when freeing these.
+ */
+static int __kstrncpy(char **dst, const char *name, size_t count, gfp_t gfp)
+{
+	*dst = kstrndup(name, count, gfp);
+	if (!*dst)
+		return -ENOSPC;
+	return count;
+}
+
+static int config_copy_test_driver_name(struct test_config *config,
+				    const char *name,
+				    size_t count)
+{
+	return __kstrncpy(&config->test_driver, name, count, GFP_KERNEL);
+}
+
+
+static int config_copy_test_fs(struct test_config *config, const char *name,
+			       size_t count)
+{
+	return __kstrncpy(&config->test_fs, name, count, GFP_KERNEL);
+}
+
+static void __kmod_config_free(struct test_config *config)
+{
+	if (!config)
+		return;
+
+	kfree_const(config->test_driver);
+	config->test_driver = NULL;
+
+	kfree_const(config->test_fs);
+	config->test_fs = NULL;
+}
+
+static void kmod_config_free(struct kmod_test_device *test_dev)
+{
+	struct test_config *config;
+
+	if (!test_dev)
+		return;
+
+	config = &test_dev->config;
+
+	mutex_lock(&test_dev->config_mutex);
+	__kmod_config_free(config);
+	mutex_unlock(&test_dev->config_mutex);
+}
+
+static ssize_t config_test_driver_store(struct device *dev,
+					struct device_attribute *attr,
+					const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+	int copied;
+
+	mutex_lock(&test_dev->config_mutex);
+
+	kfree_const(config->test_driver);
+	config->test_driver = NULL;
+
+	copied = config_copy_test_driver_name(config, buf, count);
+	mutex_unlock(&test_dev->config_mutex);
+
+	return copied;
+}
+
+/*
+ * As per sysfs_kf_seq_show() the buf is max PAGE_SIZE.
+ */
+static ssize_t config_test_show_str(struct mutex *config_mutex,
+				    char *dst,
+				    char *src)
+{
+	int len;
+
+	mutex_lock(config_mutex);
+	len = snprintf(dst, PAGE_SIZE, "%s\n", src);
+	mutex_unlock(config_mutex);
+
+	return len;
+}
+
+static ssize_t config_test_driver_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return config_test_show_str(&test_dev->config_mutex, buf,
+				    config->test_driver);
+}
+static DEVICE_ATTR(config_test_driver, 0644, config_test_driver_show,
+		   config_test_driver_store);
+
+static ssize_t config_test_fs_store(struct device *dev,
+				    struct device_attribute *attr,
+				    const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+	int copied;
+
+	mutex_lock(&test_dev->config_mutex);
+
+	kfree_const(config->test_fs);
+	config->test_fs = NULL;
+
+	copied = config_copy_test_fs(config, buf, count);
+	mutex_unlock(&test_dev->config_mutex);
+
+	return copied;
+}
+
+static ssize_t config_test_fs_show(struct device *dev,
+				   struct device_attribute *attr,
+				   char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return config_test_show_str(&test_dev->config_mutex, buf,
+				    config->test_fs);
+}
+static DEVICE_ATTR(config_test_fs, 0644, config_test_fs_show,
+		   config_test_fs_store);
+
+static int trigger_config_run_type(struct kmod_test_device *test_dev,
+				   enum kmod_test_case test_case,
+				   const char *test_str)
+{
+	int copied = 0;
+	struct test_config *config = &test_dev->config;
+
+	mutex_lock(&test_dev->config_mutex);
+
+	switch (test_case) {
+	case TEST_KMOD_DRIVER:
+		kfree_const(config->test_driver);
+		config->test_driver = NULL;
+		copied = config_copy_test_driver_name(config, test_str,
+						      strlen(test_str));
+		break;
+	case TEST_KMOD_FS_TYPE:
+		kfree_const(config->test_fs);
+		config->test_fs = NULL;
+		copied = config_copy_test_fs(config, test_str,
+					     strlen(test_str));
+		break;
+	default:
+		mutex_unlock(&test_dev->config_mutex);
+		return -EINVAL;
+	}
+
+	config->test_case = test_case;
+
+	mutex_unlock(&test_dev->config_mutex);
+
+	if (copied <= 0 || copied != strlen(test_str)) {
+		test_dev->test_is_oom = true;
+		return -ENOMEM;
+	}
+
+	test_dev->test_is_oom = false;
+
+	return trigger_config_run(test_dev);
+}
+
+static void free_test_dev_info(struct kmod_test_device *test_dev)
+{
+	vfree(test_dev->info);
+	test_dev->info = NULL;
+}
+
+static int kmod_config_sync_info(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+
+	free_test_dev_info(test_dev);
+	test_dev->info = vzalloc(config->num_threads *
+				 sizeof(struct kmod_test_device_info));
+	if (!test_dev->info) {
+		dev_err(test_dev->dev, "Cannot alloc test_dev info\n");
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+/*
+ * Old kernels may not have this, if you want to port this code to
+ * test it on older kernels.
+ */
+#ifdef get_kmod_umh_limit
+static unsigned int kmod_init_test_thread_limit(void)
+{
+	return get_kmod_umh_limit();
+}
+#else
+static unsigned int kmod_init_test_thread_limit(void)
+{
+	return TEST_START_NUM_THREADS;
+}
+#endif
+
+static int __kmod_config_init(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+	int ret = -ENOMEM, copied;
+
+	__kmod_config_free(config);
+
+	copied = config_copy_test_driver_name(config, TEST_START_DRIVER,
+					      strlen(TEST_START_DRIVER));
+	if (copied != strlen(TEST_START_DRIVER))
+		goto err_out;
+
+	copied = config_copy_test_fs(config, TEST_START_TEST_FS,
+				     strlen(TEST_START_TEST_FS));
+	if (copied != strlen(TEST_START_TEST_FS))
+		goto err_out;
+
+	config->num_threads = kmod_init_test_thread_limit();
+	config->test_result = 0;
+	config->test_case = TEST_START_TEST_CASE;
+
+	ret = kmod_config_sync_info(test_dev);
+	if (ret)
+		goto err_out;
+
+	test_dev->test_is_oom = false;
+
+	return 0;
+
+err_out:
+	test_dev->test_is_oom = true;
+	WARN_ON(test_dev->test_is_oom);
+
+	__kmod_config_free(config);
+
+	return ret;
+}
+
+static ssize_t reset_store(struct device *dev,
+			   struct device_attribute *attr,
+			   const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	int ret;
+
+	mutex_lock(&test_dev->trigger_mutex);
+	mutex_lock(&test_dev->config_mutex);
+
+	ret = __kmod_config_init(test_dev);
+	if (ret < 0) {
+		ret = -ENOMEM;
+		dev_err(dev, "could not alloc settings for config trigger: %d\n",
+		       ret);
+		goto out;
+	}
+
+	dev_info(dev, "reset\n");
+	ret = count;
+
+out:
+	mutex_unlock(&test_dev->config_mutex);
+	mutex_unlock(&test_dev->trigger_mutex);
+
+	return ret;
+}
+static DEVICE_ATTR_WO(reset);
+
+static int test_dev_config_update_uint_sync(struct kmod_test_device *test_dev,
+					    const char *buf, size_t size,
+					    unsigned int *config,
+					    int (*test_sync)(struct kmod_test_device *test_dev))
+{
+	int ret;
+	long new;
+	unsigned int old_val;
+
+	ret = kstrtol(buf, 10, &new);
+	if (ret)
+		return ret;
+
+	if (new < 0 || new > UINT_MAX)
+		return -EINVAL;
+
+	mutex_lock(&test_dev->config_mutex);
+
+	old_val = *config;
+	*(unsigned int *)config = new;
+
+	ret = test_sync(test_dev);
+	if (ret) {
+		*(unsigned int *)config = old_val;
+
+		ret = test_sync(test_dev);
+		WARN_ON(ret);
+
+		mutex_unlock(&test_dev->config_mutex);
+		return -EINVAL;
+	}
+
+	mutex_unlock(&test_dev->config_mutex);
+	/* Always return full write size even if we didn't consume all */
+	return size;
+}
+
+static int test_dev_config_update_uint_range(struct kmod_test_device *test_dev,
+					     const char *buf, size_t size,
+					     unsigned int *config,
+					     unsigned int min,
+					     unsigned int max)
+{
+	int ret;
+	long new;
+
+	ret = kstrtol(buf, 10, &new);
+	if (ret)
+		return ret;
+
+	if (new < min || new > max || new > UINT_MAX)
+		return -EINVAL;
+
+	mutex_lock(&test_dev->config_mutex);
+	*config = new;
+	mutex_unlock(&test_dev->config_mutex);
+
+	/* Always return full write size even if we didn't consume all */
+	return size;
+}
+
+static int test_dev_config_update_int(struct kmod_test_device *test_dev,
+				      const char *buf, size_t size,
+				      int *config)
+{
+	int ret;
+	long new;
+
+	ret = kstrtol(buf, 10, &new);
+	if (ret)
+		return ret;
+
+	if (new > INT_MAX || new < INT_MIN)
+		return -EINVAL;
+
+	mutex_lock(&test_dev->config_mutex);
+	*config = new;
+	mutex_unlock(&test_dev->config_mutex);
+	/* Always return full write size even if we didn't consume all */
+	return size;
+}
+
+static ssize_t test_dev_config_show_int(struct kmod_test_device *test_dev,
+					char *buf,
+					int config)
+{
+	int val;
+
+	mutex_lock(&test_dev->config_mutex);
+	val = config;
+	mutex_unlock(&test_dev->config_mutex);
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", val);
+}
+
+static ssize_t test_dev_config_show_uint(struct kmod_test_device *test_dev,
+					 char *buf,
+					 unsigned int config)
+{
+	unsigned int val;
+
+	mutex_lock(&test_dev->config_mutex);
+	val = config;
+	mutex_unlock(&test_dev->config_mutex);
+
+	return snprintf(buf, PAGE_SIZE, "%u\n", val);
+}
+
+static ssize_t test_result_store(struct device *dev,
+				 struct device_attribute *attr,
+				 const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_update_int(test_dev, buf, count,
+					  &config->test_result);
+}
+
+static ssize_t config_num_threads_store(struct device *dev,
+					struct device_attribute *attr,
+					const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_update_uint_sync(test_dev, buf, count,
+						&config->num_threads,
+						kmod_config_sync_info);
+}
+
+static ssize_t config_num_threads_show(struct device *dev,
+				       struct device_attribute *attr,
+				       char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_show_uint(test_dev, buf, config->num_threads);
+}
+static DEVICE_ATTR(config_num_threads, 0644, config_num_threads_show,
+		   config_num_threads_store);
+
+static ssize_t config_test_case_store(struct device *dev,
+				      struct device_attribute *attr,
+				      const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_update_uint_range(test_dev, buf, count,
+						 &config->test_case,
+						 __TEST_KMOD_INVALID + 1,
+						 __TEST_KMOD_MAX - 1);
+}
+
+static ssize_t config_test_case_show(struct device *dev,
+				     struct device_attribute *attr,
+				     char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_show_uint(test_dev, buf, config->test_case);
+}
+static DEVICE_ATTR(config_test_case, 0644, config_test_case_show,
+		   config_test_case_store);
+
+static ssize_t test_result_show(struct device *dev,
+				struct device_attribute *attr,
+				char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_show_int(test_dev, buf, config->test_result);
+}
+static DEVICE_ATTR(test_result, 0644, test_result_show, test_result_store);
+
+#define TEST_KMOD_DEV_ATTR(name)		&dev_attr_##name.attr
+
+static struct attribute *test_dev_attrs[] = {
+	TEST_KMOD_DEV_ATTR(trigger_config),
+	TEST_KMOD_DEV_ATTR(config),
+	TEST_KMOD_DEV_ATTR(reset),
+
+	TEST_KMOD_DEV_ATTR(config_test_driver),
+	TEST_KMOD_DEV_ATTR(config_test_fs),
+	TEST_KMOD_DEV_ATTR(config_num_threads),
+	TEST_KMOD_DEV_ATTR(config_test_case),
+	TEST_KMOD_DEV_ATTR(test_result),
+
+	NULL,
+};
+
+ATTRIBUTE_GROUPS(test_dev);
+
+static int kmod_config_init(struct kmod_test_device *test_dev)
+{
+	int ret;
+
+	mutex_lock(&test_dev->config_mutex);
+	ret = __kmod_config_init(test_dev);
+	mutex_unlock(&test_dev->config_mutex);
+
+	return ret;
+}
+
+static struct kmod_test_device *alloc_test_dev_kmod(int idx)
+{
+	int ret;
+	struct kmod_test_device *test_dev;
+	struct miscdevice *misc_dev;
+
+	test_dev = vzalloc(sizeof(struct kmod_test_device));
+	if (!test_dev) {
+		pr_err("Cannot alloc test_dev\n");
+		goto err_out;
+	}
+
+	mutex_init(&test_dev->config_mutex);
+	mutex_init(&test_dev->trigger_mutex);
+	mutex_init(&test_dev->thread_mutex);
+
+	init_completion(&test_dev->kthreads_done);
+
+	ret = kmod_config_init(test_dev);
+	if (ret < 0) {
+		pr_err("kmod_config_init() failed\n");
+		goto err_out_free;
+	}
+
+	test_dev->dev_idx = idx;
+	misc_dev = &test_dev->misc_dev;
+
+	misc_dev->minor = MISC_DYNAMIC_MINOR;
+	misc_dev->name = kasprintf(GFP_KERNEL, "test_kmod%d", idx);
+	if (!misc_dev->name) {
+		pr_err("Cannot alloc misc_dev->name\n");
+		goto err_out_free_config;
+	}
+	misc_dev->groups = test_dev_groups;
+
+	return test_dev;
+
+err_out_free_config:
+	free_test_dev_info(test_dev);
+	kmod_config_free(test_dev);
+err_out_free:
+	vfree(test_dev);
+	test_dev = NULL;
+err_out:
+	return NULL;
+}
+
+static void free_test_dev_kmod(struct kmod_test_device *test_dev)
+{
+	if (test_dev) {
+		kfree_const(test_dev->misc_dev.name);
+		test_dev->misc_dev.name = NULL;
+		free_test_dev_info(test_dev);
+		kmod_config_free(test_dev);
+		vfree(test_dev);
+		test_dev = NULL;
+	}
+}
+
+static struct kmod_test_device *register_test_dev_kmod(void)
+{
+	struct kmod_test_device *test_dev = NULL;
+	int ret;
+
+	mutex_lock(&reg_dev_mutex);
+
+	/* int should suffice for number of devices, test for wrap */
+	if (unlikely(num_test_devs + 1 < 0)) {
+		pr_err("reached limit of number of test devices\n");
+		goto out;
+	}
+
+	test_dev = alloc_test_dev_kmod(num_test_devs);
+	if (!test_dev)
+		goto out;
+
+	ret = misc_register(&test_dev->misc_dev);
+	if (ret) {
+		pr_err("could not register misc device: %d\n", ret);
+		free_test_dev_kmod(test_dev);
+		goto out;
+	}
+
+	test_dev->dev = test_dev->misc_dev.this_device;
+	list_add_tail(&test_dev->list, &reg_test_devs);
+	dev_info(test_dev->dev, "interface ready\n");
+
+	num_test_devs++;
+
+out:
+	mutex_unlock(&reg_dev_mutex);
+
+	return test_dev;
+
+}
+
+static int __init test_kmod_init(void)
+{
+	struct kmod_test_device *test_dev;
+	int ret;
+
+	test_dev = register_test_dev_kmod();
+	if (!test_dev) {
+		pr_err("Cannot add first test kmod device\n");
+		return -ENODEV;
+	}
+
+	/*
+	 * With some work we might be able to gracefully enable
+	 * testing with this driver built-in, for now this seems
+	 * rather risky. For those willing to try have at it,
+	 * and enable the below. Good luck! If that works, try
+	 * lowering the init level for more fun.
+	 */
+	if (force_init_test) {
+		ret = trigger_config_run_type(test_dev,
+					      TEST_KMOD_DRIVER, "tun");
+		if (WARN_ON(ret))
+			return ret;
+		ret = trigger_config_run_type(test_dev,
+					      TEST_KMOD_FS_TYPE, "btrfs");
+		if (WARN_ON(ret))
+			return ret;
+	}
+
+	return 0;
+}
+late_initcall(test_kmod_init);
+
+static
+void unregister_test_dev_kmod(struct kmod_test_device *test_dev)
+{
+	mutex_lock(&test_dev->trigger_mutex);
+	mutex_lock(&test_dev->config_mutex);
+
+	test_dev_kmod_stop_tests(test_dev);
+
+	dev_info(test_dev->dev, "removing interface\n");
+	misc_deregister(&test_dev->misc_dev);
+	/* misc_dev.name is freed by free_test_dev_kmod() below */
+
+	mutex_unlock(&test_dev->config_mutex);
+	mutex_unlock(&test_dev->trigger_mutex);
+
+	free_test_dev_kmod(test_dev);
+}
+
+static void __exit test_kmod_exit(void)
+{
+	struct kmod_test_device *test_dev, *tmp;
+
+	mutex_lock(&reg_dev_mutex);
+	list_for_each_entry_safe(test_dev, tmp, &reg_test_devs, list) {
+		list_del(&test_dev->list);
+		unregister_test_dev_kmod(test_dev);
+	}
+	mutex_unlock(&reg_dev_mutex);
+}
+module_exit(test_kmod_exit);
+
+MODULE_AUTHOR("Luis R. Rodriguez <mcgrof@kernel.org>");
+MODULE_LICENSE("GPL");
diff --git a/tools/testing/selftests/kmod/Makefile b/tools/testing/selftests/kmod/Makefile
new file mode 100644
index 000000000000..fa2ccc5fb3de
--- /dev/null
+++ b/tools/testing/selftests/kmod/Makefile
@@ -0,0 +1,11 @@
+# Makefile for kmod loading selftests
+
+# No binaries, but make sure arg-less "make" doesn't trigger "run_tests"
+all:
+
+TEST_PROGS := kmod.sh
+
+include ../lib.mk
+
+# Nothing to clean up.
+clean:
diff --git a/tools/testing/selftests/kmod/config b/tools/testing/selftests/kmod/config
new file mode 100644
index 000000000000..259f4fd6b5e2
--- /dev/null
+++ b/tools/testing/selftests/kmod/config
@@ -0,0 +1,7 @@
+CONFIG_TEST_KMOD=m
+CONFIG_TEST_LKM=m
+CONFIG_XFS_FS=m
+
+# Needed when the module parameter force_init_test is used
+CONFIG_TUN=m
+CONFIG_BTRFS_FS=m
diff --git a/tools/testing/selftests/kmod/kmod.sh b/tools/testing/selftests/kmod/kmod.sh
new file mode 100755
index 000000000000..10196a62ed09
--- /dev/null
+++ b/tools/testing/selftests/kmod/kmod.sh
@@ -0,0 +1,635 @@
+#!/bin/bash
+#
+# Copyright (C) 2017 Luis R. Rodriguez <mcgrof@kernel.org>
+#
+# This program is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by the Free
+# Software Foundation; either version 2 of the License, or at your option any
+# later version; or, when distributed separately from the Linux kernel or
+# when incorporated into other software packages, subject to the following
+# license:
+#
+# This program is free software; you can redistribute it and/or modify it
+# under the terms of copyleft-next (version 0.3.1 or later) as published
+# at http://copyleft-next.org/.
+
+# This is a stress test script for kmod, the kernel module loader. It uses
+# test_kmod which exposes a series of knobs for the API for us so we can
+# tweak each test in userspace rather than in kernelspace.
+#
+# The way kmod works is it uses the kernel's usermode helper API to eventually
+# call /sbin/modprobe. It has a limit of the number of concurrent calls
+# possible. The kernel interface to load modules is request_module(), however
+# mount uses get_fs_type(). Both behave slightly differently, but the
+# differences are important enough to test each call separately. For this
+# reason test_kmod starts by providing tests for both calls.
+#
+# The test driver test_kmod assumes a series of defaults which you can
+# override by exporting to your environment prior to running this script.
+# For instance this script assumes you do not have xfs loaded upon boot.
+# If this is false, export DEFAULT_KMOD_FS="ext4" prior to running this
+# script if the filesystem module you don't have loaded upon bootup
+# is ext4 instead. Refer to allow_user_defaults() for a list of the
+# variables a user can override.
+#
+# You'll want at least 4 GiB of RAM to expect to run these tests
+# without running out of memory on them. For other requirements refer
+# to test_reqs()
+
+set -e
+
+TEST_NAME="kmod"
+TEST_DRIVER="test_${TEST_NAME}"
+TEST_DIR=$(dirname $0)
+
+# This represents
+#
+# TEST_ID:TEST_COUNT:ENABLED
+#
+# TEST_ID: is the test id number
+# TEST_COUNT: number of times we should run the test
+# ENABLED: 1 if enabled, 0 otherwise
+#
+# Once these are enabled please leave them as-is. Write your own test,
+# we have tons of space.
+ALL_TESTS="0001:3:1"
+ALL_TESTS="$ALL_TESTS 0002:3:1"
+ALL_TESTS="$ALL_TESTS 0003:1:1"
+ALL_TESTS="$ALL_TESTS 0004:1:1"
+ALL_TESTS="$ALL_TESTS 0005:10:1"
+ALL_TESTS="$ALL_TESTS 0006:10:1"
+ALL_TESTS="$ALL_TESTS 0007:5:1"
+
+# Disabled tests:
+#
+# 0008 x 150 -  multithreaded - push kmod_concurrent over max_modprobes for request_module()"
+# Current best-effort failure interpretation:
+# Enough module requests get loaded in place fast enough to reach over the
+# max_modprobes limit and trigger a failure -- before we're even able to
+# start processing pending requests.
+ALL_TESTS="$ALL_TESTS 0008:150:0"
+
+# 0009 x 150 - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()"
+# Current best-effort failure interpretation:
+#
+# get_fs_type() requests modules using aliases as such the optimization in
+# place today to look for already loaded modules will not take effect and
+# we end up requesting a new module to load, this bumps the kmod_concurrent,
+# and in certain circumstances can lead to pushing the kmod_concurrent over
+# the max_modprobe limit.
+#
+# This test fails much easier than test 0008 since the alias optimizations
+# are not in place.
+ALL_TESTS="$ALL_TESTS 0009:150:0"
+
+test_modprobe()
+{
+       if [ ! -d $DIR ]; then
+               echo "$0: $DIR not present" >&2
+               echo "You must have the following enabled in your kernel:" >&2
+               cat $TEST_DIR/config >&2
+               exit 1
+       fi
+}
+
+function allow_user_defaults()
+{
+	if [ -z $DEFAULT_KMOD_DRIVER ]; then
+		DEFAULT_KMOD_DRIVER="test_module"
+	fi
+
+	if [ -z $DEFAULT_KMOD_FS ]; then
+		DEFAULT_KMOD_FS="xfs"
+	fi
+
+	if [ -z $PROC_DIR ]; then
+		PROC_DIR="/proc/sys/kernel/"
+	fi
+
+	if [ -z $MODPROBE_LIMIT ]; then
+		MODPROBE_LIMIT=50
+	fi
+
+	if [ -z $DIR ]; then
+		DIR="/sys/devices/virtual/misc/${TEST_DRIVER}0/"
+	fi
+
+	if [ -z $DEFAULT_NUM_TESTS ]; then
+		DEFAULT_NUM_TESTS=150
+	fi
+
+	MODPROBE_LIMIT_FILE="${PROC_DIR}/kmod-limit"
+}
+
+test_reqs()
+{
+	if ! which modprobe 2> /dev/null > /dev/null; then
+		echo "$0: You need modprobe installed" >&2
+		exit 1
+	fi
+
+	if ! which kmod 2> /dev/null > /dev/null; then
+		echo "$0: You need kmod installed" >&2
+		exit 1
+	fi
+
+	# kmod 19 has a bad bug where it returns 0 when modprobe
+	# gets called *even* if the module was not loaded due to
+	# some bad heuristics. For details see:
+	#
+	# A work around is possible in-kernel but its rather
+	# complex.
+	KMOD_VERSION=$(kmod --version | awk '{print $3}')
+	if [[ $KMOD_VERSION  -le 19 ]]; then
+		echo "$0: You need at least kmod 20" >&2
+		echo "kmod <= 19 is buggy, for details see:" >&2
+		echo "http://git.kernel.org/cgit/utils/kernel/kmod/kmod.git/commit/libkmod/libkmod-module.c?id=fd44a98ae2eb5eb32161088954ab21e58e19dfc4" >&2
+		exit 1
+	fi
+
+	uid=$(id -u)
+	if [ $uid -ne 0 ]; then
+		echo "$0: must be run as root" >&2
+		exit 0
+	fi
+}
+
+function load_req_mod()
+{
+	trap "test_modprobe" EXIT
+
+	if [ ! -d $DIR ]; then
+		# Alanis: "Oh isn't it ironic?"
+		modprobe $TEST_DRIVER
+	fi
+}
+
+test_finish()
+{
+	echo "Test completed"
+}
+
+errno_name_to_val()
+{
+	case "$1" in
+	# kmod calls modprobe and upon a module not being found
+	# modprobe returns just 1... However in the kernel we
+	# *sometimes* see 256...
+	MODULE_NOT_FOUND)
+		echo 256;;
+	SUCCESS)
+		echo 0;;
+	-EPERM)
+		echo -1;;
+	-ENOENT)
+		echo -2;;
+	-EINVAL)
+		echo -22;;
+	-ERR_ANY)
+		echo -123456;;
+	*)
+		echo invalid;;
+	esac
+}
+
+errno_val_to_name()
+	case "$1" in
+	256)
+		echo MODULE_NOT_FOUND;;
+	0)
+		echo SUCCESS;;
+	-1)
+		echo -EPERM;;
+	-2)
+		echo -ENOENT;;
+	-22)
+		echo -EINVAL;;
+	-123456)
+		echo -ERR_ANY;;
+	*)
+		echo invalid;;
+	esac
+
+config_set_test_case_driver()
+{
+	if ! echo -n 1 >$DIR/config_test_case; then
+		echo "$0: Unable to set test case to driver" >&2
+		exit 1
+	fi
+}
+
+config_set_test_case_fs()
+{
+	if ! echo -n 2 >$DIR/config_test_case; then
+		echo "$0: Unable to set test case to fs" >&2
+		exit 1
+	fi
+}
+
+config_num_threads()
+{
+	if ! echo -n $1 >$DIR/config_num_threads; then
+		echo "$0: Unable to set number of threads" >&2
+		exit 1
+	fi
+}
+
+config_get_modprobe_limit()
+{
+	if [[ -f ${MODPROBE_LIMIT_FILE} ]] ; then
+		MODPROBE_LIMIT=$(cat $MODPROBE_LIMIT_FILE)
+	fi
+	echo $MODPROBE_LIMIT
+}
+
+config_num_thread_limit_extra()
+{
+	MODPROBE_LIMIT=$(config_get_modprobe_limit)
+	let EXTRA_LIMIT=$MODPROBE_LIMIT+$1
+	config_num_threads $EXTRA_LIMIT
+}
+
+# For special characters use printf directly,
+# refer to kmod_test_0001
+config_set_driver()
+{
+	if ! echo -n $1 >$DIR/config_test_driver; then
+		echo "$0: Unable to set driver" >&2
+		exit 1
+	fi
+}
+
+config_set_fs()
+{
+	if ! echo -n $1 >$DIR/config_test_fs; then
+		echo "$0: Unable to set fs" >&2
+		exit 1
+	fi
+}
+
+config_get_driver()
+{
+	cat $DIR/config_test_driver
+}
+
+config_get_test_result()
+{
+	cat $DIR/test_result
+}
+
+config_reset()
+{
+	if ! echo -n "1" >"$DIR"/reset; then
+		echo "$0: reset should have worked" >&2
+		exit 1
+	fi
+}
+
+config_show_config()
+{
+	echo "----------------------------------------------------"
+	cat "$DIR"/config
+	echo "----------------------------------------------------"
+}
+
+config_trigger()
+{
+	if ! echo -n "1" >"$DIR"/trigger_config 2>/dev/null; then
+		echo "$1: FAIL - loading should have worked"
+		config_show_config
+		exit 1
+	fi
+	echo "$1: OK! - loading kmod test"
+}
+
+config_trigger_want_fail()
+{
+	if echo "1" > $DIR/trigger_config 2>/dev/null; then
+		echo "$1: FAIL - test case was expected to fail"
+		config_show_config
+		exit 1
+	fi
+	echo "$1: OK! - kmod test case failed as expected"
+}
+
+config_expect_result()
+{
+	RC=$(config_get_test_result)
+	RC_NAME=$(errno_val_to_name $RC)
+
+	ERRNO_NAME=$2
+	ERRNO=$(errno_name_to_val $ERRNO_NAME)
+
+	if [[ $ERRNO_NAME = "-ERR_ANY" ]]; then
+		if [[ $RC -ge 0 ]]; then
+			echo "$1: FAIL, test expects $ERRNO_NAME - got $RC_NAME ($RC)" >&2
+			config_show_config
+			exit 1
+		fi
+	elif [[ $RC != $ERRNO ]]; then
+		echo "$1: FAIL, test expects $ERRNO_NAME ($ERRNO) - got $RC_NAME ($RC)" >&2
+		config_show_config
+		exit 1
+	fi
+	echo "$1: OK! - Return value: $RC ($RC_NAME), expected $ERRNO_NAME"
+}
+
+kmod_defaults_driver()
+{
+	config_reset
+	modprobe -r $DEFAULT_KMOD_DRIVER
+	config_set_driver $DEFAULT_KMOD_DRIVER
+}
+
+kmod_defaults_fs()
+{
+	config_reset
+	modprobe -r $DEFAULT_KMOD_FS
+	config_set_fs $DEFAULT_KMOD_FS
+	config_set_test_case_fs
+}
+
+kmod_test_0001_driver()
+{
+	NAME='\000'
+
+	kmod_defaults_driver
+	config_num_threads 1
+	printf '\000' >"$DIR"/config_test_driver
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} MODULE_NOT_FOUND
+}
+
+kmod_test_0001_fs()
+{
+	NAME='\000'
+
+	kmod_defaults_fs
+	config_num_threads 1
+	printf '\000' >"$DIR"/config_test_fs
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} -EINVAL
+}
+
+kmod_test_0001()
+{
+	kmod_test_0001_driver
+	kmod_test_0001_fs
+}
+
+kmod_test_0002_driver()
+{
+	NAME="nope-$DEFAULT_KMOD_DRIVER"
+
+	kmod_defaults_driver
+	config_set_driver $NAME
+	config_num_threads 1
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} MODULE_NOT_FOUND
+}
+
+kmod_test_0002_fs()
+{
+	NAME="nope-$DEFAULT_KMOD_FS"
+
+	kmod_defaults_fs
+	config_set_fs $NAME
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} -EINVAL
+}
+
+kmod_test_0002()
+{
+	kmod_test_0002_driver
+	kmod_test_0002_fs
+}
+
+kmod_test_0003()
+{
+	kmod_defaults_fs
+	config_num_threads 1
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+kmod_test_0004()
+{
+	kmod_defaults_fs
+	config_num_threads 2
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+kmod_test_0005()
+{
+	kmod_defaults_driver
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+kmod_test_0006()
+{
+	kmod_defaults_fs
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+kmod_test_0007()
+{
+	kmod_test_0005
+	kmod_test_0006
+}
+
+kmod_test_0008()
+{
+	kmod_defaults_driver
+	MODPROBE_LIMIT=$(config_get_modprobe_limit)
+	let EXTRA=$MODPROBE_LIMIT/6
+	config_num_thread_limit_extra $EXTRA
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+kmod_test_0009()
+{
+	kmod_defaults_fs
+	MODPROBE_LIMIT=$(config_get_modprobe_limit)
+	let EXTRA=$MODPROBE_LIMIT/4
+	config_num_thread_limit_extra $EXTRA
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+list_tests()
+{
+	echo "Test ID list:"
+	echo
+	echo "TEST_ID x NUM_TESTS"
+	echo "TEST_ID:   Test ID"
+	echo "NUM_TESTS: Number of recommended times to run the test"
+	echo
+	echo "0001 x $(get_test_count 0001) - Simple test - 1 thread  for empty string"
+	echo "0002 x $(get_test_count 0002) - Simple test - 1 thread  for modules/filesystems that do not exist"
+	echo "0003 x $(get_test_count 0003) - Simple test - 1 thread  for get_fs_type() only"
+	echo "0004 x $(get_test_count 0004) - Simple test - 2 threads for get_fs_type() only"
+	echo "0005 x $(get_test_count 0005) - multithreaded tests with default setup - request_module() only"
+	echo "0006 x $(get_test_count 0006) - multithreaded tests with default setup - get_fs_type() only"
+	echo "0007 x $(get_test_count 0007) - multithreaded tests with default setup test request_module() and get_fs_type()"
+	echo "0008 x $(get_test_count 0008) - multithreaded - push kmod_concurrent over max_modprobes for request_module()"
+	echo "0009 x $(get_test_count 0009) - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()"
+}
+
+usage()
+{
+	NUM_TESTS=$(grep -o ' ' <<<"$ALL_TESTS" | grep -c .)
+	let NUM_TESTS=$NUM_TESTS+1
+	MAX_TEST=$(printf "%04d\n" $NUM_TESTS)
+	echo "Usage: $0 [ -t <4-number-digit> ] | [ -w <4-number-digit> ] |"
+	echo "		 [ -s <4-number-digit> ] | [ -c <4-number-digit> <test-count> ]"
+	echo "           [ all ] [ -h | --help ] [ -l ]"
+	echo ""
+	echo "Valid tests: 0001-$MAX_TEST"
+	echo ""
+	echo "    all     Runs all tests (default)"
+	echo "    -t      Run test ID the recommended number of times"
+	echo "    -w      Watch test ID run until it runs into an error"
+	echo "    -s      Run test ID once"
+	echo "    -c      Run test ID x test-count number of times"
+	echo "    -l      List all test IDs"
+	echo " -h|--help  Help"
+	echo
+	echo "If an error ever occurs execution will immediately terminate."
+	echo "If you are adding a new test try using -w <test-ID> first to"
+	echo "make sure the test passes a series of tests."
+	echo
+	echo Example uses:
+	echo
+	echo "${TEST_NAME}.sh		-- executes all tests"
+	echo "${TEST_NAME}.sh -t 0008	-- Executes test ID 0008 the recommended number of times"
+	echo "${TEST_NAME}.sh -w 0008	-- Watch test ID 0008 run until an error occurs"
+	echo "${TEST_NAME}.sh -s 0008	-- Run test ID 0008 once"
+	echo "${TEST_NAME}.sh -c 0008 3	-- Run test ID 0008 three times"
+	echo
+	list_tests
+	exit 1
+}
+
+function test_num()
+{
+	re='^[0-9]+$'
+	if ! [[ $1 =~ $re ]]; then
+		usage
+	fi
+}
+
+function get_test_count()
+{
+	test_num $1
+	TEST_DATA=$(echo $ALL_TESTS | awk '{print $'$1'}')
+	LAST_TWO=${TEST_DATA#*:*}
+	echo ${LAST_TWO%:*}
+}
+
+function get_test_enabled()
+{
+	test_num $1
+	TEST_DATA=$(echo $ALL_TESTS | awk '{print $'$1'}')
+	echo ${TEST_DATA#*:*:}
+}
+
+function run_all_tests()
+{
+	for i in $ALL_TESTS ; do
+		TEST_ID=${i%:*:*}
+		ENABLED=$(get_test_enabled $TEST_ID)
+		TEST_COUNT=$(get_test_count $TEST_ID)
+		if [[ $ENABLED -eq "1" ]]; then
+			test_case $TEST_ID $TEST_COUNT
+		fi
+	done
+}
+
+function watch_log()
+{
+	if [ $# -ne 3 ]; then
+		clear
+	fi
+	date
+	echo "Running test: $2 - run #$1"
+}
+
+function watch_case()
+{
+	i=0
+	while [ 1 ]; do
+
+		if [ $# -eq 1 ]; then
+			test_num $1
+			watch_log $i ${TEST_NAME}_test_$1
+			${TEST_NAME}_test_$1
+		else
+			watch_log $i all
+			run_all_tests
+		fi
+		let i=$i+1
+	done
+}
+
+function test_case()
+{
+	NUM_TESTS=$DEFAULT_NUM_TESTS
+	if [ $# -eq 2 ]; then
+		NUM_TESTS=$2
+	fi
+
+	i=0
+	while [ $i -lt $NUM_TESTS ]; do
+		test_num $1
+		watch_log $i ${TEST_NAME}_test_$1 noclear
+		RUN_TEST=${TEST_NAME}_test_$1
+		$RUN_TEST
+		let i=$i+1
+	done
+}
+
+function parse_args()
+{
+	if [ $# -eq 0 ]; then
+		run_all_tests
+	else
+		if [[ "$1" = "all" ]]; then
+			run_all_tests
+		elif [[ "$1" = "-w" ]]; then
+			shift
+			watch_case $@
+		elif [[ "$1" = "-t" ]]; then
+			shift
+			test_num $1
+			test_case $1 $(get_test_count $1)
+		elif [[ "$1" = "-c" ]]; then
+			shift
+			test_num $1
+			test_num $2
+			test_case $1 $2
+		elif [[ "$1" = "-s" ]]; then
+			shift
+			test_case $1 1
+		elif [[ "$1" = "-l" ]]; then
+			list_tests
+		elif [[ "$1" = "-h" || "$1" = "--help" ]]; then
+			usage
+		else
+			usage
+		fi
+	fi
+}
+
+test_reqs
+allow_user_defaults
+load_req_mod
+
+trap "test_finish" EXIT
+
+parse_args $@
+
+exit 0
-- 
2.11.0

^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH v2 4/5] kmod: add helpers for getting kmod limit
  2017-05-26  0:16 ` [PATCH v2 0/5] kmod: help make deterministic Luis R. Rodriguez
                     ` (2 preceding siblings ...)
  2017-05-26  0:16   ` [PATCH v2 3/5] kmod: add test driver to stress test the module loader Luis R. Rodriguez
@ 2017-05-26  0:16   ` Luis R. Rodriguez
  2017-05-26  0:56     ` Dmitry Torokhov
  2017-05-26  0:16   ` [PATCH v2 5/5] kmod: throttle kmod thread limit Luis R. Rodriguez
  2017-05-26 21:12   ` [PATCH v3 0/4] kmod: help make deterministic Luis R. Rodriguez
  5 siblings, 1 reply; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-26  0:16 UTC (permalink / raw)
  To: akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel, Luis R. Rodriguez, Tom Gundersen

This adds helpers for getting access to the kmod limit from
userspace. This knob should help userspace more gracefully and
deterministically handle module loading.

Cc: Tom Gundersen <teg@jklm.no>
Cc: Petr Mladek <pmladek@suse.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
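A note for reviewers, not part of the commit: the snippet below is a
minimal sketch of how userspace might query this knob. It assumes the
sysctl shows up as /proc/sys/kernel/kmod-limit, as added by this patch.
The helper name read_kmod_limit, its two optional arguments and the
fallback of 50 (the value of MAX_KMOD_CONCURRENT) are made up for
illustration only.

```shell
# Hypothetical helper: print the kmod limit.
# $1: path to the sysctl file (assumed default: /proc/sys/kernel/kmod-limit)
# $2: fallback used when the file is not readable (assumed default: 50)
read_kmod_limit()
{
	limit_file="${1:-/proc/sys/kernel/kmod-limit}"
	fallback="${2:-50}"

	if [ -r "$limit_file" ]; then
		# Kernel exposes the knob: report its current value
		cat "$limit_file"
	else
		# Older kernel or knob missing: use the assumed default
		echo "$fallback"
	fi
}
```

A caller could then do LIMIT=$(read_kmod_limit) and try to keep at most
$LIMIT 'modprobe' calls in flight at a time.
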
 Documentation/sysctl/kernel.txt | 20 ++++++++++++++++++++
 include/linux/kmod.h            |  5 +++++
 kernel/kmod.c                   | 33 +++++++++++++++++++++++++++++++++
 kernel/sysctl.c                 |  7 +++++++
 4 files changed, 65 insertions(+)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index bac23c198360..080ccdca1f10 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -370,6 +370,26 @@ with the "modules_disabled" sysctl.
 
 ==============================================================
 
+kmod-limit:
+
+Get the max amount of concurrent requests (kmod_concurrent) the kernel can
+make out to userspace to call 'modprobe'. This limit is known internally to the
+kernel as max_modprobes. This interface is designed to enable userspace to
+query the kernel for the max_modprobes limit so userspace can more
+deterministically handle module loading by only enabling max_modprobes
+'modprobe' calls at a time.
+
+Dependencies are resolved in userspace through depmod, so one modprobe
+call only bumps the number of concurrent threads (kmod_concurrent) by one.
+Dependencies for a module then are loaded directly in userspace using
+init_module() / finit_module() skipping bumping kmod_concurrent or being
+affected by max_modprobes.
+
+The max_modprobes value is set at build time with CONFIG_MAX_KMOD_CONCURRENT.
+You can override at initialization with the module parameter max_modprobes.
+
+==============================================================
+
 kptr_restrict:
 
 This toggle indicates whether restrictions are placed on
diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index 8e2f302b214a..bbc6d59190aa 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -39,13 +39,18 @@ int __request_module(bool wait, const char *name, ...);
 #define try_then_request_module(x, mod...) \
 	((x) ?: (__request_module(true, mod), (x)))
 void init_kmod_umh(void);
+unsigned int get_kmod_umh_limit(void);
+int sysctl_kmod_limit(struct ctl_table *table, int write,
+		      void __user *buffer, size_t *lenp, loff_t *ppos);
 #else
 static inline int request_module(const char *name, ...) { return -ENOSYS; }
 static inline int request_module_nowait(const char *name, ...) { return -ENOSYS; }
 #define try_then_request_module(x, mod...) (x)
 static inline void init_kmod_umh(void) { }
+static inline unsigned int get_kmod_umh_limit(void) { return 0; }
 #endif
 
+#define get_kmod_umh_limit get_kmod_umh_limit
 
 struct cred;
 struct file;
diff --git a/kernel/kmod.c b/kernel/kmod.c
index cafd27b92d19..17de776cf368 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -111,6 +111,17 @@ static int call_modprobe(char *module_name, int wait)
 }
 
 /**
+ * get_kmod_umh_limit - get concurrent modprobe thread limit
+ *
+ * Returns the number of allowed concurrent modprobe calls.
+ */
+unsigned int get_kmod_umh_limit(void)
+{
+	return max_modprobes;
+}
+EXPORT_SYMBOL_GPL(get_kmod_umh_limit);
+
+/**
  * __request_module - try to load a kernel module
  * @wait: wait (or not) for the operation to complete
  * @fmt: printf style format string for the name of the module
@@ -194,6 +205,28 @@ void __init init_kmod_umh(void)
 	atomic_set(&kmod_concurrent_max, max_modprobes);
 }
 
+int sysctl_kmod_limit(struct ctl_table *table, int write,
+		      void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	struct ctl_table t;
+	int ret;
+	unsigned int local_max_modprobes = max_modprobes;
+	unsigned int min = 0;
+	unsigned int max = max_threads/2;
+
+	t = *table;
+	t.data = &local_max_modprobes;
+	t.extra1 = &min;
+	t.extra2 = &max;
+
+	if (write)
+		return -EPERM;
+
+	ret = proc_douintvec_minmax(&t, write, buffer, lenp, ppos);
+
+	return ret;
+}
+
 #endif /* CONFIG_MODULES */
 
 static void call_usermodehelper_freeinfo(struct subprocess_info *info)
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index df9f2a367882..d1c1d1999bb1 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -682,6 +682,13 @@ static struct ctl_table kern_table[] = {
 		.extra1		= &one,
 		.extra2		= &one,
 	},
+	{
+		.procname	= "kmod-limit",
+		.data		= NULL, /* filled in by handler */
+		.maxlen		= sizeof(unsigned int),
+		.mode		= 0444,
+		.proc_handler	= sysctl_kmod_limit,
+	},
 #endif
 #ifdef CONFIG_UEVENT_HELPER
 	{
-- 
2.11.0

^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH v2 5/5] kmod: throttle kmod thread limit
  2017-05-26  0:16 ` [PATCH v2 0/5] kmod: help make deterministic Luis R. Rodriguez
                     ` (3 preceding siblings ...)
  2017-05-26  0:16   ` [PATCH v2 4/5] kmod: add helpers for getting kmod limit Luis R. Rodriguez
@ 2017-05-26  0:16   ` Luis R. Rodriguez
  2017-05-26 21:12   ` [PATCH v3 0/4] kmod: help make deterministic Luis R. Rodriguez
  5 siblings, 0 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-26  0:16 UTC (permalink / raw)
  To: akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme, corbet
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel, Luis R. Rodriguez

If we reach the limit of max_modprobes threads running, the next
request_module() call will fail. The original reason for adding
a kill was to do away with possible issues in old circumstances
which would create a recursive series of request_module() calls.
We can do better than being super aggressive and rejecting calls
once we've reached the limit: simply make pending callers wait
until the number of running threads drops back below the limit.

The only difference is that the clutch helps avoid making
request_module() requests fatal more often. On x86_64 qemu
with 4 cores and 4 GiB of RAM, the two tests take the following
time to run:

time kmod.sh -t 0008
real    0m14.066s
user    0m1.403s
sys     0m5.837s

time kmod.sh -t 0009
real    0m53.928s
user    0m1.271s
sys     0m7.343s

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
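A note for reviewers, not part of the commit: tests 0008 and 0009 in
kmod.sh push the thread count past the limit by adding a fraction of it
(limit/6 and limit/4 respectively). The sketch below only restates that
arithmetic. threads_over_limit is a hypothetical name, not a helper
that exists in kmod.sh.

```shell
# Hypothetical helper mirroring kmod_test_0008 / kmod_test_0009:
# $1: modprobe limit, $2: divisor for the extra threads.
# Prints limit + limit/divisor, using integer division.
threads_over_limit()
{
	echo $(( $1 + $1 / $2 ))
}
```

With the default limit of 50 this works out to 58 threads for test 0008
(divisor 6) and 62 threads for test 0009 (divisor 4).
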
 kernel/kmod.c                        | 15 +++++----------
 tools/testing/selftests/kmod/kmod.sh | 24 ++----------------------
 2 files changed, 7 insertions(+), 32 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index 17de776cf368..6cd4c88ab98d 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -59,6 +59,7 @@ static DECLARE_RWSEM(umhelper_sem);
 #ifdef CONFIG_MODULES
 static atomic_t kmod_concurrent_max = ATOMIC_INIT(0);
 #define MAX_KMOD_CONCURRENT 50	/* Completely arbitrary value - KAO */
+static DECLARE_WAIT_QUEUE_HEAD(kmod_wq);
 
 /*
 	modprobe_path is set via /proc/sys.
@@ -142,7 +143,6 @@ int __request_module(bool wait, const char *fmt, ...)
 	va_list args;
 	char module_name[MODULE_NAME_LEN];
 	int ret;
-	static int kmod_loop_msg;
 
 	/*
 	 * We don't allow synchronous module loading from async.  Module
@@ -166,15 +166,9 @@ int __request_module(bool wait, const char *fmt, ...)
 		return ret;
 
 	if (atomic_dec_if_positive(&kmod_concurrent_max) < 0) {
-		/* We may be blaming an innocent here, but unlikely */
-		if (kmod_loop_msg < 5) {
-			printk(KERN_ERR
-			       "request_module: runaway loop modprobe %s\n",
-			       module_name);
-			kmod_loop_msg++;
-		}
-		atomic_dec(&kmod_concurrent);
-		return -ENOMEM;
+		pr_warn_ratelimited("request_module: kmod_concurrent_max (%d) close to 0 (max_modprobes: %u), for module %s, throttling...\n",
+				    atomic_read(&kmod_concurrent_max), max_modprobes, module_name);
+		wait_event_interruptible(kmod_wq, atomic_dec_if_positive(&kmod_concurrent_max) >= 0);
 	}
 
 	trace_module_request(module_name, wait, _RET_IP_);
@@ -182,6 +176,7 @@ int __request_module(bool wait, const char *fmt, ...)
 	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
 
 	atomic_inc(&kmod_concurrent_max);
+	wake_up_all(&kmod_wq);
 
 	return ret;
 }
diff --git a/tools/testing/selftests/kmod/kmod.sh b/tools/testing/selftests/kmod/kmod.sh
index 10196a62ed09..8cecae9a8bca 100755
--- a/tools/testing/selftests/kmod/kmod.sh
+++ b/tools/testing/selftests/kmod/kmod.sh
@@ -59,28 +59,8 @@ ALL_TESTS="$ALL_TESTS 0004:1:1"
 ALL_TESTS="$ALL_TESTS 0005:10:1"
 ALL_TESTS="$ALL_TESTS 0006:10:1"
 ALL_TESTS="$ALL_TESTS 0007:5:1"
-
-# Disabled tests:
-#
-# 0008 x 150 -  multithreaded - push kmod_concurrent over max_modprobes for request_module()"
-# Current best-effort failure interpretation:
-# Enough module requests get loaded in place fast enough to reach over the
-# max_modprobes limit and trigger a failure -- before we're even able to
-# start processing pending requests.
-ALL_TESTS="$ALL_TESTS 0008:150:0"
-
-# 0009 x 150 - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()"
-# Current best-effort failure interpretation:
-#
-# get_fs_type() requests modules using aliases as such the optimization in
-# place today to look for already loaded modules will not take effect and
-# we end up requesting a new module to load, this bumps the kmod_concurrent,
-# and in certain circumstances can lead to pushing the kmod_concurrent over
-# the max_modprobe limit.
-#
-# This test fails much easier than test 0008 since the alias optimizations
-# are not in place.
-ALL_TESTS="$ALL_TESTS 0009:150:0"
+ALL_TESTS="$ALL_TESTS 0008:150:1"
+ALL_TESTS="$ALL_TESTS 0009:150:1"
 
 test_modprobe()
 {
-- 
2.11.0

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH v2 4/5] kmod: add helpers for getting kmod limit
  2017-05-26  0:16   ` [PATCH v2 4/5] kmod: add helpers for getting kmod limit Luis R. Rodriguez
@ 2017-05-26  0:56     ` Dmitry Torokhov
  2017-05-26 20:27       ` Luis R. Rodriguez
  0 siblings, 1 reply; 69+ messages in thread
From: Dmitry Torokhov @ 2017-05-26  0:56 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: akpm, jeyu, shuah, rusty, ebiederm, acme, corbet, martin.wilck,
	mmarek, pmladek, hare, rwright, jeffm, DSterba, fdmanana, neilb,
	linux, rgoldwyn, subashab, xypron.glpk, keescook, atomlin,
	mbenes, paulmck, dan.j.williams, jpoimboe, davem, mingo, alan,
	tytso, gregkh, torvalds, linux-kselftest, linux-doc,
	linux-kernel, Tom Gundersen

On Thu, May 25, 2017 at 05:16:29PM -0700, Luis R. Rodriguez wrote:
> This adds helpers for getting access to the kmod limit from
> userspace. This knob should help userspace more gracefully and
> deterministically handle module loading.

I think more details are needed before we add a new ABI to the kernel.
Why can't userspace submit as much as it wants and the kernel decide how
much it will service at once?

> 
> Cc: Tom Gundersen <teg@jklm.no>
> Cc: Petr Mladek <pmladek@suse.com>
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> ---
>  Documentation/sysctl/kernel.txt | 20 ++++++++++++++++++++
>  include/linux/kmod.h            |  5 +++++
>  kernel/kmod.c                   | 33 +++++++++++++++++++++++++++++++++
>  kernel/sysctl.c                 |  7 +++++++
>  4 files changed, 65 insertions(+)
> 
> diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
> index bac23c198360..080ccdca1f10 100644
> --- a/Documentation/sysctl/kernel.txt
> +++ b/Documentation/sysctl/kernel.txt
> @@ -370,6 +370,26 @@ with the "modules_disabled" sysctl.
>  
>  ==============================================================
>  
> +kmod-limit:
> +
> +Get the max amount of concurrent requests (kmod_concurrent) the kernel can
> +make out to userspace to call 'modprobe'. This limit is known internally to the
> +kernel as max_modprobes. This interface is designed to enable userspace to
> +query the kernel for the max_modprobes limit so userspace can more
> +deterministically handle module loading by only enabling max_modprobes
> +'modprobe' calls at a time.
> +
> +Dependencies are resolved in userspace through depmod, so one modprobe
> +call only bumps the number of concurrent threads (kmod_concurrent) by one.
> +Dependencies for a module then are loaded directly in userspace using
> +init_module() / finit_module() skipping bumping kmod_concurrent or being
> +affected by max_modprobes.
> +
> +The max_modprobes value is set at build time with CONFIG_MAX_KMOD_CONCURRENT.
> +You can override at initialization with the module parameter max_modprobes.
> +
> +==============================================================
> +
>  kptr_restrict:
>  
>  This toggle indicates whether restrictions are placed on
> diff --git a/include/linux/kmod.h b/include/linux/kmod.h
> index 8e2f302b214a..bbc6d59190aa 100644
> --- a/include/linux/kmod.h
> +++ b/include/linux/kmod.h
> @@ -39,13 +39,18 @@ int __request_module(bool wait, const char *name, ...);
>  #define try_then_request_module(x, mod...) \
>  	((x) ?: (__request_module(true, mod), (x)))
>  void init_kmod_umh(void);
> +unsigned int get_kmod_umh_limit(void);
> +int sysctl_kmod_limit(struct ctl_table *table, int write,
> +		      void __user *buffer, size_t *lenp, loff_t *ppos);
>  #else
>  static inline int request_module(const char *name, ...) { return -ENOSYS; }
>  static inline int request_module_nowait(const char *name, ...) { return -ENOSYS; }
>  #define try_then_request_module(x, mod...) (x)
>  static inline void init_kmod_umh(void) { }
> +static inline unsigned int get_kmod_umh_limit(void) { return 0; }
>  #endif
>  
> +#define get_kmod_umh_limit get_kmod_umh_limit
>  
>  struct cred;
>  struct file;
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index cafd27b92d19..17de776cf368 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -111,6 +111,17 @@ static int call_modprobe(char *module_name, int wait)
>  }
>  
>  /**
> + * get_kmod_umh_limit - get concurrent modprobe thread limit
> + *
> + * Returns the number of allowed concurrent modprobe calls.
> + */
> +unsigned int get_kmod_umh_limit(void)
> +{
> +	return max_modprobes;
> +}
> +EXPORT_SYMBOL_GPL(get_kmod_umh_limit);
> +
> +/**
>   * __request_module - try to load a kernel module
>   * @wait: wait (or not) for the operation to complete
>   * @fmt: printf style format string for the name of the module
> @@ -194,6 +205,28 @@ void __init init_kmod_umh(void)
>  	atomic_set(&kmod_concurrent_max, max_modprobes);
>  }
>  
> +int sysctl_kmod_limit(struct ctl_table *table, int write,
> +		      void __user *buffer, size_t *lenp, loff_t *ppos)
> +{
> +	struct ctl_table t;
> +	int ret;
> +	unsigned int local_max_modprobes = max_modprobes;
> +	unsigned int min = 0;
> +	unsigned int max = max_threads/2;
> +
> +	t = *table;
> +	t.data = &local_max_modprobes;
> +	t.extra1 = &min;
> +	t.extra2 = &max;
> +
> +	if (write)
> +		return -EPERM;
> +
> +	ret = proc_douintvec_minmax(&t, write, buffer, lenp, ppos);
> +
> +	return ret;
> +}
> +
>  #endif /* CONFIG_MODULES */
>  
>  static void call_usermodehelper_freeinfo(struct subprocess_info *info)
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index df9f2a367882..d1c1d1999bb1 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -682,6 +682,13 @@ static struct ctl_table kern_table[] = {
>  		.extra1		= &one,
>  		.extra2		= &one,
>  	},
> +	{
> +		.procname	= "kmod-limit",
> +		.data		= NULL, /* filled in by handler */
> +		.maxlen		= sizeof(unsigned int),
> +		.mode		= 0444,
> +		.proc_handler	= sysctl_kmod_limit,
> +	},
>  #endif
>  #ifdef CONFIG_UEVENT_HELPER
>  	{
> -- 
> 2.11.0
> 

-- 
Dmitry


* Re: [PATCH v2 2/5] kmod: reduce atomic operations on kmod_concurrent
  2017-05-26  0:16   ` [PATCH v2 2/5] kmod: reduce atomic operations on kmod_concurrent Luis R. Rodriguez
@ 2017-05-26  1:11     ` Dmitry Torokhov
  2017-05-26 20:03       ` Luis R. Rodriguez
  0 siblings, 1 reply; 69+ messages in thread
From: Dmitry Torokhov @ 2017-05-26  1:11 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: akpm, jeyu, shuah, rusty, ebiederm, acme, corbet, martin.wilck,
	mmarek, pmladek, hare, rwright, jeffm, DSterba, fdmanana, neilb,
	linux, rgoldwyn, subashab, xypron.glpk, keescook, atomlin,
	mbenes, paulmck, dan.j.williams, jpoimboe, davem, mingo, alan,
	tytso, gregkh, torvalds, linux-kselftest, linux-doc,
	linux-kernel

On Thu, May 25, 2017 at 05:16:27PM -0700, Luis R. Rodriguez wrote:
> When checking if we want to allow a kmod thread to kick off we increment,
> then read to see if we should enable a thread. If we were over the allowed
> limit we decrement. Splitting the increment far apart from decrement
> means there could be a time where two increments happen potentially
> giving a false failure on a thread which should have been allowed.
> 
> CPU1			CPU2
> atomic_inc()
> 			atomic_inc()
> atomic_read()
> 			atomic_read()
> atomic_dec()
> 			atomic_dec()
> 
> In this case a read on CPU1 gets the atomic_inc()'s and we could negate
> it from getting a kmod thread. We could try to prevent this with a lock
> or preemption but that is overkill. We can fix by reducing the number of
> atomic operations. We do this by inverting the logic of the enabler,
> instead of incrementing kmod_concurrent as we get new kmod users, define the
> variable kmod_concurrent_max as the max number of currently allowed kmod
> users and as we get new kmod users just decrement it if it's still positive.
> This combines the dec and read in one atomic operation.
> 
> In this case we no longer get the same false failure:
> 
> CPU1			CPU2
> atomic_dec_if_positive()
> 			atomic_dec_if_positive()
> atomic_inc()
> 			atomic_inc()
> 
> Suggested-by: Petr Mladek <pmladek@suse.com>
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> ---
>  include/linux/kmod.h |  2 ++
>  init/main.c          |  1 +
>  kernel/kmod.c        | 44 +++++++++++++++++++++++++-------------------
>  3 files changed, 28 insertions(+), 19 deletions(-)
> 
> diff --git a/include/linux/kmod.h b/include/linux/kmod.h
> index c4e441e00db5..8e2f302b214a 100644
> --- a/include/linux/kmod.h
> +++ b/include/linux/kmod.h
> @@ -38,10 +38,12 @@ int __request_module(bool wait, const char *name, ...);
>  #define request_module_nowait(mod...) __request_module(false, mod)
>  #define try_then_request_module(x, mod...) \
>  	((x) ?: (__request_module(true, mod), (x)))
> +void init_kmod_umh(void);
>  #else
>  static inline int request_module(const char *name, ...) { return -ENOSYS; }
>  static inline int request_module_nowait(const char *name, ...) { return -ENOSYS; }
>  #define try_then_request_module(x, mod...) (x)
> +static inline void init_kmod_umh(void) { }
>  #endif
>  
>  
> diff --git a/init/main.c b/init/main.c
> index 9ec09ff8a930..9b20be716cf7 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -650,6 +650,7 @@ asmlinkage __visible void __init start_kernel(void)
>  	thread_stack_cache_init();
>  	cred_init();
>  	fork_init();
> +	init_kmod_umh();
>  	proc_caches_init();
>  	buffer_init();
>  	key_init();
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index 563f97e2be36..cafd27b92d19 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -46,6 +46,7 @@
>  #include <trace/events/module.h>
>  
>  extern int max_threads;
> +unsigned int max_modprobes;
>  
>  #define CAP_BSET	(void *)1
>  #define CAP_PI		(void *)2
> @@ -56,6 +57,8 @@ static DEFINE_SPINLOCK(umh_sysctl_lock);
>  static DECLARE_RWSEM(umhelper_sem);
>  
>  #ifdef CONFIG_MODULES
> +static atomic_t kmod_concurrent_max = ATOMIC_INIT(0);
> +#define MAX_KMOD_CONCURRENT 50	/* Completely arbitrary value - KAO */
>  
>  /*
>  	modprobe_path is set via /proc/sys.
> @@ -127,10 +130,7 @@ int __request_module(bool wait, const char *fmt, ...)
>  {
>  	va_list args;
>  	char module_name[MODULE_NAME_LEN];
> -	unsigned int max_modprobes;
>  	int ret;
> -	static atomic_t kmod_concurrent = ATOMIC_INIT(0);
> -#define MAX_KMOD_CONCURRENT 50	/* Completely arbitrary value - KAO */
>  	static int kmod_loop_msg;
>  
>  	/*
> @@ -154,21 +154,7 @@ int __request_module(bool wait, const char *fmt, ...)
>  	if (ret)
>  		return ret;
>  
> -	/* If modprobe needs a service that is in a module, we get a recursive
> -	 * loop.  Limit the number of running kmod threads to max_threads/2 or
> -	 * MAX_KMOD_CONCURRENT, whichever is the smaller.  A cleaner method
> -	 * would be to run the parents of this process, counting how many times
> -	 * kmod was invoked.  That would mean accessing the internals of the
> -	 * process tables to get the command line, proc_pid_cmdline is static
> -	 * and it is not worth changing the proc code just to handle this case. 
> -	 * KAO.
> -	 *
> -	 * "trace the ppid" is simple, but will fail if someone's
> -	 * parent exits.  I think this is as good as it gets. --RR
> -	 */
> -	max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
> -	atomic_inc(&kmod_concurrent);
> -	if (atomic_read(&kmod_concurrent) > max_modprobes) {
> +	if (atomic_dec_if_positive(&kmod_concurrent_max) < 0) {
>  		/* We may be blaming an innocent here, but unlikely */
>  		if (kmod_loop_msg < 5) {
>  			printk(KERN_ERR
> @@ -184,10 +170,30 @@ int __request_module(bool wait, const char *fmt, ...)
>  
>  	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
>  
> -	atomic_dec(&kmod_concurrent);
> +	atomic_inc(&kmod_concurrent_max);
> +
>  	return ret;
>  }
>  EXPORT_SYMBOL(__request_module);
> +
> +/*
> + * If modprobe needs a service that is in a module, we get a recursive
> + * loop.  Limit the number of running kmod threads to max_threads/2 or
> + * MAX_KMOD_CONCURRENT, whichever is the smaller.  A cleaner method
> + * would be to run the parents of this process, counting how many times
> + * kmod was invoked.  That would mean accessing the internals of the
> + * process tables to get the command line, proc_pid_cmdline is static
> + * and it is not worth changing the proc code just to handle this case.
> + *
> + * "trace the ppid" is simple, but will fail if someone's
> + * parent exits.  I think this is as good as it gets.
> + */
> +void __init init_kmod_umh(void)
> +{
> +	max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
> +	atomic_set(&kmod_concurrent_max, max_modprobes);

I would love it if we could initialize the atomic statically. So the
trouble we are trying to solve here is that we create more threads than
the kernel supports, with thread count being calculated as:

	threads = div64_u64((u64) totalram_pages * (u64) PAGE_SIZE,
			    (u64) THREAD_SIZE * 8UL);

So for us to be unable to serve 50 threads we would need to be dealing
with a system smaller than 3200 pages, or ~13M of memory (assuming thread
size is 8 pages - 64 bit with kasan; smaller page sizes reduce memory even
more). Can you run 4.12 with modules support on a machine with so little
memory?
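
For anyone who wants to plug in their own numbers, here is a small
userspace sketch of the arithmetic above. It only evaluates the quoted
expression and the min() from the patch; it ignores any clamping the
kernel applies to the result afterwards, and threads_for() and
modprobe_limit() are made-up helper names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_KMOD_CONCURRENT 50

/*
 * Evaluate the quoted expression:
 *   totalram_pages * PAGE_SIZE / (THREAD_SIZE * 8)
 * All three inputs are supplied by the caller; nothing is probed.
 */
static uint64_t threads_for(uint64_t totalram_pages, uint64_t page_size,
			    uint64_t thread_size)
{
	assert(thread_size > 0);
	return (totalram_pages * page_size) / (thread_size * 8);
}

/* The limit computed by the patch: min(max_threads / 2, 50). */
static uint64_t modprobe_limit(uint64_t max_threads)
{
	uint64_t half = max_threads / 2;

	return half < MAX_KMOD_CONCURRENT ? half : MAX_KMOD_CONCURRENT;
}
```

Assuming 4 KiB pages and a thread size of 8 pages,
threads_for(3200, 4096, 8 * 4096) evaluates to 50, and modprobe_limit()
then halves whatever thread count it is given before comparing it
against 50.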

So maybe we should simply say:

static atomic_t kmod_concurrent_max = ATOMIC_INIT(MAX_KMOD_CONCURRENT);

and call it a day? So we do not need init_kmod_umh() and don't need to
call it from init/main.c.

Thanks.

-- 
Dmitry


* Re: [PATCH v2 2/5] kmod: reduce atomic operations on kmod_concurrent
  2017-05-26  1:11     ` Dmitry Torokhov
@ 2017-05-26 20:03       ` Luis R. Rodriguez
  0 siblings, 0 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-26 20:03 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Luis R. Rodriguez, akpm, jeyu, shuah, rusty, ebiederm, acme,
	corbet, martin.wilck, mmarek, pmladek, hare, rwright, jeffm,
	DSterba, fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel

On Thu, May 25, 2017 at 06:11:00PM -0700, Dmitry Torokhov wrote:
> On Thu, May 25, 2017 at 05:16:27PM -0700, Luis R. Rodriguez wrote:
> > When checking if we want to allow a kmod thread to kick off we increment,
> > then read to see if we should enable a thread. If we were over the allowed
> > limit we decrement. Splitting the increment far apart from decrement
> > means there could be a time where two increments happen potentially
> > giving a false failure on a thread which should have been allowed.
> > 
> > CPU1			CPU2
> > atomic_inc()
> > 			atomic_inc()
> > atomic_read()
> > 			atomic_read()
> > atomic_dec()
> > 			atomic_dec()
> > 
> > In this case a read on CPU1 gets the atomic_inc()'s and we could negate
> > it from getting a kmod thread. We could try to prevent this with a lock
> > or preemption but that is overkill. We can fix by reducing the number of
> > atomic operations. We do this by inverting the logic of the enabler,
> > instead of incrementing kmod_concurrent as we get new kmod users, define the
> > variable kmod_concurrent_max as the max number of currently allowed kmod
> > users and as we get new kmod users just decrement it if it's still positive.
> > This combines the dec and read in one atomic operation.
> > 
> > In this case we no longer get the same false failure:
> > 
> > CPU1			CPU2
> > atomic_dec_if_positive()
> > 			atomic_dec_if_positive()
> > atomic_inc()
> > 			atomic_inc()
> > 
> > Suggested-by: Petr Mladek <pmladek@suse.com>
> > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > ---
> >  include/linux/kmod.h |  2 ++
> >  init/main.c          |  1 +
> >  kernel/kmod.c        | 44 +++++++++++++++++++++++++-------------------
> >  3 files changed, 28 insertions(+), 19 deletions(-)
> > 
> > diff --git a/include/linux/kmod.h b/include/linux/kmod.h
> > index c4e441e00db5..8e2f302b214a 100644
> > --- a/include/linux/kmod.h
> > +++ b/include/linux/kmod.h
> > @@ -38,10 +38,12 @@ int __request_module(bool wait, const char *name, ...);
> >  #define request_module_nowait(mod...) __request_module(false, mod)
> >  #define try_then_request_module(x, mod...) \
> >  	((x) ?: (__request_module(true, mod), (x)))
> > +void init_kmod_umh(void);
> >  #else
> >  static inline int request_module(const char *name, ...) { return -ENOSYS; }
> >  static inline int request_module_nowait(const char *name, ...) { return -ENOSYS; }
> >  #define try_then_request_module(x, mod...) (x)
> > +static inline void init_kmod_umh(void) { }
> >  #endif
> >  
> >  
> > diff --git a/init/main.c b/init/main.c
> > index 9ec09ff8a930..9b20be716cf7 100644
> > --- a/init/main.c
> > +++ b/init/main.c
> > @@ -650,6 +650,7 @@ asmlinkage __visible void __init start_kernel(void)
> >  	thread_stack_cache_init();
> >  	cred_init();
> >  	fork_init();
> > +	init_kmod_umh();
> >  	proc_caches_init();
> >  	buffer_init();
> >  	key_init();
> > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > index 563f97e2be36..cafd27b92d19 100644
> > --- a/kernel/kmod.c
> > +++ b/kernel/kmod.c
> > @@ -46,6 +46,7 @@
> >  #include <trace/events/module.h>
> >  
> >  extern int max_threads;
> > +unsigned int max_modprobes;
> >  
> >  #define CAP_BSET	(void *)1
> >  #define CAP_PI		(void *)2
> > @@ -56,6 +57,8 @@ static DEFINE_SPINLOCK(umh_sysctl_lock);
> >  static DECLARE_RWSEM(umhelper_sem);
> >  
> >  #ifdef CONFIG_MODULES
> > +static atomic_t kmod_concurrent_max = ATOMIC_INIT(0);
> > +#define MAX_KMOD_CONCURRENT 50	/* Completely arbitrary value - KAO */
> >  
> >  /*
> >  	modprobe_path is set via /proc/sys.
> > @@ -127,10 +130,7 @@ int __request_module(bool wait, const char *fmt, ...)
> >  {
> >  	va_list args;
> >  	char module_name[MODULE_NAME_LEN];
> > -	unsigned int max_modprobes;
> >  	int ret;
> > -	static atomic_t kmod_concurrent = ATOMIC_INIT(0);
> > -#define MAX_KMOD_CONCURRENT 50	/* Completely arbitrary value - KAO */
> >  	static int kmod_loop_msg;
> >  
> >  	/*
> > @@ -154,21 +154,7 @@ int __request_module(bool wait, const char *fmt, ...)
> >  	if (ret)
> >  		return ret;
> >  
> > -	/* If modprobe needs a service that is in a module, we get a recursive
> > -	 * loop.  Limit the number of running kmod threads to max_threads/2 or
> > -	 * MAX_KMOD_CONCURRENT, whichever is the smaller.  A cleaner method
> > -	 * would be to run the parents of this process, counting how many times
> > -	 * kmod was invoked.  That would mean accessing the internals of the
> > -	 * process tables to get the command line, proc_pid_cmdline is static
> > -	 * and it is not worth changing the proc code just to handle this case. 
> > -	 * KAO.
> > -	 *
> > -	 * "trace the ppid" is simple, but will fail if someone's
> > -	 * parent exits.  I think this is as good as it gets. --RR
> > -	 */
> > -	max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
> > -	atomic_inc(&kmod_concurrent);
> > -	if (atomic_read(&kmod_concurrent) > max_modprobes) {
> > +	if (atomic_dec_if_positive(&kmod_concurrent_max) < 0) {
> >  		/* We may be blaming an innocent here, but unlikely */
> >  		if (kmod_loop_msg < 5) {
> >  			printk(KERN_ERR
> > @@ -184,10 +170,30 @@ int __request_module(bool wait, const char *fmt, ...)
> >  
> >  	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
> >  
> > -	atomic_dec(&kmod_concurrent);
> > +	atomic_inc(&kmod_concurrent_max);
> > +
> >  	return ret;
> >  }
> >  EXPORT_SYMBOL(__request_module);
> > +
> > +/*
> > + * If modprobe needs a service that is in a module, we get a recursive
> > + * loop.  Limit the number of running kmod threads to max_threads/2 or
> > + * MAX_KMOD_CONCURRENT, whichever is the smaller.  A cleaner method
> > + * would be to run the parents of this process, counting how many times
> > + * kmod was invoked.  That would mean accessing the internals of the
> > + * process tables to get the command line, proc_pid_cmdline is static
> > + * and it is not worth changing the proc code just to handle this case.
> > + *
> > + * "trace the ppid" is simple, but will fail if someone's
> > + * parent exits.  I think this is as good as it gets.
> > + */
> > +void __init init_kmod_umh(void)
> > +{
> > +	max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
> > +	atomic_set(&kmod_concurrent_max, max_modprobes);
> 
> I would love it if we could initialize the atomic statically. So the
> trouble we are trying to solve here is that we create more threads than
> the kernel supports, with thread count being calculated as:
> 
> 	threads = div64_u64((u64) totalram_pages * (u64) PAGE_SIZE,
> 			    (u64) THREAD_SIZE * 8UL);
> 
> So for us to be unable to serve 50 threads we would need to be dealing
> with a system smaller than 3200 pages, or ~13M of memory (assuming thread
> size is 8 pages - 64 bit with kasan; smaller page sizes reduce memory even
> more). Can you run 4.12 with modules support on a machine with so little
> memory?
> 
> So maybe we should simply say:
> 
> static atomic_t kmod_concurrent_max = ATOMIC_INIT(MAX_KMOD_CONCURRENT);
> 
> and call it a day? So we do not need init_kmod_umh() and don't need to
> call it from init/main.c.

I like this very much. If shit blows up we can simply use the kconfig thing
I had proposed earlier, however it would have to accept less than 50 threads
given we'd be computing this at build time, not at run time.

  Luis


* Re: [PATCH v2 4/5] kmod: add helpers for getting kmod limit
  2017-05-26  0:56     ` Dmitry Torokhov
@ 2017-05-26 20:27       ` Luis R. Rodriguez
  0 siblings, 0 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-26 20:27 UTC (permalink / raw)
  To: Dmitry Torokhov, Tom Gundersen
  Cc: Luis R. Rodriguez, akpm, jeyu, shuah, rusty, ebiederm, acme,
	corbet, martin.wilck, mmarek, pmladek, hare, rwright, jeffm,
	DSterba, fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel

On Thu, May 25, 2017 at 05:56:51PM -0700, Dmitry Torokhov wrote:
> On Thu, May 25, 2017 at 05:16:29PM -0700, Luis R. Rodriguez wrote:
> > This adds helpers for getting access to the kmod limit from
> > userspace. This knob should help userspace more gracefully and
> > deterministically handle module loading.
> 
> I think more details are needed before we add a new ABI to the kernel.
> Why can't userspace submit as much as it wants and the kernel decide how
> much it will service at once?

I suppose I should clarify in the commit log then that, without this heuristic
of #ifdef get_kmod_umh_limit, *iff* userspace today is allowing more than 50
threads it can mean userspace is allowing a modprobe to fail. Check the results
of tests 0008 and 0009 in dmesg; you can end up with really unexpected results.

This knob enables userspace to gracefully send in as many requests as allowed.
I guess this could just be a kernel revision check...  given that once
throttling goes in it can be as crazy as it wants.

Also I suppose one could argue that *since* this has not been a real issue
*yet* (we assume), the old userspace behaviour of sending as many requests as
it needs is fine.

Will drop this!

  Luis


* [PATCH v3 0/4] kmod: help make deterministic
  2017-05-26  0:16 ` [PATCH v2 0/5] kmod: help make deterministic Luis R. Rodriguez
                     ` (4 preceding siblings ...)
  2017-05-26  0:16   ` [PATCH v2 5/5] kmod: throttle kmod thread limit Luis R. Rodriguez
@ 2017-05-26 21:12   ` Luis R. Rodriguez
  2017-05-26 21:12     ` [PATCH v3 1/4] module: use list_for_each_entry_rcu() on find_module_all() Luis R. Rodriguez
                       ` (5 more replies)
  5 siblings, 6 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-26 21:12 UTC (permalink / raw)
  To: akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme, corbet, josh
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel, Luis R. Rodriguez

This v3 nukes the proc sysctl interface in favor of just letting userspace
check the kernel revision. On kernels from before whenever this is merged,
userspace should try to avoid hammering with more than 50 kmod threads, as
requests can fail with -ENOMEM.
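
A rough sketch of what such a revision check could look like in
userspace follows. This is only an illustration: parse_release() and
release_at_least() are made-up helper names, and the release to compare
against is deliberately left as a parameter since it depends on when
(and if) this gets merged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Parse "major.minor" from a release string such as "4.12.0-rc3". */
static bool parse_release(const char *release, int *major, int *minor)
{
	assert(release && major && minor);
	return sscanf(release, "%d.%d", major, minor) == 2;
}

/*
 * True if release is at least want_major.want_minor.  An unparsable
 * string is treated as "old", so the caller stays conservative.
 */
static bool release_at_least(const char *release, int want_major,
			     int want_minor)
{
	int major, minor;

	if (!parse_release(release, &major, &minor))
		return false;
	return major > want_major ||
	       (major == want_major && minor >= want_minor);
}
```

The release string would typically come from the release field that
uname(2) fills in; when the check fails, userspace would keep itself to
50 or fewer concurrent requests.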

We do away with the old heuristic which assumed max_threads/2 could end up
below 50 threads; as Dmitry notes, this would mean having a system with
16 MiB of RAM with modules enabled. It simplifies our patch
"kmod: reduce atomic operations on kmod_concurrent" considerably.

Since the sysctl interface is gone, this no longer depends on any
other patches; the series is independent. As usual the series is
available on my linux-next 20170526-kmod-only branch [0], which is based
on next-20170526.

[0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20170526-kmod-only

  Luis

Luis R. Rodriguez (4):
  module: use list_for_each_entry_rcu() on find_module_all()
  kmod: reduce atomic operations on kmod_concurrent and simplify
  kmod: add test driver to stress test the module loader
  kmod: throttle kmod thread limit

 kernel/kmod.c                         |   55 +-
 kernel/module.c                       |    2 +-
 lib/Kconfig.debug                     |   25 +
 lib/Makefile                          |    1 +
 lib/test_kmod.c                       | 1246 +++++++++++++++++++++++++++++++++
 tools/testing/selftests/kmod/Makefile |   11 +
 tools/testing/selftests/kmod/config   |    7 +
 tools/testing/selftests/kmod/kmod.sh  |  615 ++++++++++++++++
 8 files changed, 1930 insertions(+), 32 deletions(-)
 create mode 100644 lib/test_kmod.c
 create mode 100644 tools/testing/selftests/kmod/Makefile
 create mode 100644 tools/testing/selftests/kmod/config
 create mode 100755 tools/testing/selftests/kmod/kmod.sh

-- 
2.11.0


* [PATCH v3 1/4] module: use list_for_each_entry_rcu() on find_module_all()
  2017-05-26 21:12   ` [PATCH v3 0/4] kmod: help make deterministic Luis R. Rodriguez
@ 2017-05-26 21:12     ` Luis R. Rodriguez
  2017-05-26 21:12     ` [PATCH v3 2/4] kmod: reduce atomic operations on kmod_concurrent and simplify Luis R. Rodriguez
                       ` (4 subsequent siblings)
  5 siblings, 0 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-26 21:12 UTC (permalink / raw)
  To: akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme, corbet, josh
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel, Luis R. Rodriguez

The module list has been using RCU in a lot of other calls
for a while now, we just overlooked changing this one over to
use RCU.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 kernel/module.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/module.c b/kernel/module.c
index 3803449ca219..2df38d45ca37 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -603,7 +603,7 @@ static struct module *find_module_all(const char *name, size_t len,
 
 	module_assert_mutex_or_preempt();
 
-	list_for_each_entry(mod, &modules, list) {
+	list_for_each_entry_rcu(mod, &modules, list) {
 		if (!even_unformed && mod->state == MODULE_STATE_UNFORMED)
 			continue;
 		if (strlen(mod->name) == len && !memcmp(mod->name, name, len))
-- 
2.11.0


* [PATCH v3 2/4] kmod: reduce atomic operations on kmod_concurrent and simplify
  2017-05-26 21:12   ` [PATCH v3 0/4] kmod: help make deterministic Luis R. Rodriguez
  2017-05-26 21:12     ` [PATCH v3 1/4] module: use list_for_each_entry_rcu() on find_module_all() Luis R. Rodriguez
@ 2017-05-26 21:12     ` Luis R. Rodriguez
  2017-06-23 19:19       ` [PATCH v4 " Luis R. Rodriguez
  2017-06-26 11:36       ` [PATCH v3 " Petr Mladek
  2017-05-26 21:12     ` [PATCH v3 3/4] kmod: add test driver to stress test the module loader Luis R. Rodriguez
                       ` (3 subsequent siblings)
  5 siblings, 2 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-26 21:12 UTC (permalink / raw)
  To: akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme, corbet, josh
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel, Luis R. Rodriguez

When checking if we want to allow a kmod thread to kick off we increment,
then read to see if we should enable a thread. If we were over the allowed
limit we decrement. Splitting the increment far apart from the decrement
means there could be a time where two increments happen, potentially
giving a false failure on a thread which should have been allowed.

CPU1			CPU2
atomic_inc()
			atomic_inc()
atomic_read()
			atomic_read()
atomic_dec()
			atomic_dec()

In this case the read on CPU1 sees both atomic_inc()'s and we could deny
it a kmod thread. We could try to prevent this with a lock
or preemption but that is overkill. We can fix this by reducing the number
of atomic operations. We do this by inverting the logic of the enabler:
instead of incrementing kmod_concurrent as we get new kmod users, define the
variable kmod_concurrent_max as the max number of currently allowed kmod
users and as we get new kmod users just decrement it if it's still positive.
This combines the dec and read in one atomic operation.

In this case we no longer get the same false failure:

CPU1			CPU2
atomic_dec_if_positive()
			atomic_dec_if_positive()
atomic_inc()
			atomic_inc()
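
As a rough userspace illustration of the combined decrement-and-test,
here is a sketch using C11 atomics. It is not the kernel implementation:
slots_free, try_get_slot() and put_slot() are hypothetical names standing
in for kmod_concurrent_max and the code around call_modprobe(), and
dec_if_positive() only approximates what atomic_dec_if_positive() does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for kmod_concurrent_max: free slots remaining. */
static atomic_int slots_free = 50;

/*
 * Decrement *v only if the result stays >= 0.  Returns the old value
 * minus one in both cases, so a negative return means "not decremented".
 */
static int dec_if_positive(atomic_int *v)
{
	int old;

	assert(v != NULL);
	old = atomic_load(v);
	do {
		if (old <= 0)
			break;	/* nothing left, leave *v untouched */
		/* on failure the CAS reloads old and we re-check it */
	} while (!atomic_compare_exchange_weak(v, &old, old - 1));

	return old - 1;
}

/* Test and take a slot in one atomic step: no inc/read/dec window. */
static bool try_get_slot(void)
{
	return dec_if_positive(&slots_free) >= 0;
}

/* Give the slot back once the helper has finished. */
static void put_slot(void)
{
	atomic_fetch_add(&slots_free, 1);
}
```

A caller would bail out when try_get_slot() returns false and otherwise
call put_slot() after the work is done. Since a failed attempt never
modifies the counter, a concurrent caller cannot be refused because of
someone else's transient increment.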

The number of threads is computed at init, and since the current computation
of kmod_concurrent includes the thread count we can avoid setting
kmod_concurrent_max later in boot through an init call by simply sticking to
50 as the kmod_concurrent_max. The assumption here is a system with modules
must at least have ~16 MiB of RAM.

Suggested-by: Petr Mladek <pmladek@suse.com>
Suggested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 kernel/kmod.c | 39 +++++++++++++++++----------------------
 1 file changed, 17 insertions(+), 22 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index 563f97e2be36..3e346c700e80 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -45,8 +45,6 @@
 
 #include <trace/events/module.h>
 
-extern int max_threads;
-
 #define CAP_BSET	(void *)1
 #define CAP_PI		(void *)2
 
@@ -56,6 +54,19 @@ static DEFINE_SPINLOCK(umh_sysctl_lock);
 static DECLARE_RWSEM(umhelper_sem);
 
 #ifdef CONFIG_MODULES
+/*
+ * Assuming:
+ *
+ * threads = div64_u64((u64) totalram_pages * (u64) PAGE_SIZE,
+ *		       (u64) THREAD_SIZE * 8UL);
+ *
+ * Needing less than 50 threads would mean we're dealing with systems
+ * smaller than 3200 pages. This assumes you are capable of having ~13M memory,
+ * and this would only be an upper limit, after which the OOM killer
+ * would take effect. Systems like these are very unlikely if modules are
+ * enabled.
+ */
+static atomic_t kmod_concurrent_max = ATOMIC_INIT(50);
 
 /*
 	modprobe_path is set via /proc/sys.
@@ -127,10 +138,7 @@ int __request_module(bool wait, const char *fmt, ...)
 {
 	va_list args;
 	char module_name[MODULE_NAME_LEN];
-	unsigned int max_modprobes;
 	int ret;
-	static atomic_t kmod_concurrent = ATOMIC_INIT(0);
-#define MAX_KMOD_CONCURRENT 50	/* Completely arbitrary value - KAO */
 	static int kmod_loop_msg;
 
 	/*
@@ -154,21 +162,7 @@ int __request_module(bool wait, const char *fmt, ...)
 	if (ret)
 		return ret;
 
-	/* If modprobe needs a service that is in a module, we get a recursive
-	 * loop.  Limit the number of running kmod threads to max_threads/2 or
-	 * MAX_KMOD_CONCURRENT, whichever is the smaller.  A cleaner method
-	 * would be to run the parents of this process, counting how many times
-	 * kmod was invoked.  That would mean accessing the internals of the
-	 * process tables to get the command line, proc_pid_cmdline is static
-	 * and it is not worth changing the proc code just to handle this case. 
-	 * KAO.
-	 *
-	 * "trace the ppid" is simple, but will fail if someone's
-	 * parent exits.  I think this is as good as it gets. --RR
-	 */
-	max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
-	atomic_inc(&kmod_concurrent);
-	if (atomic_read(&kmod_concurrent) > max_modprobes) {
+	if (atomic_dec_if_positive(&kmod_concurrent_max) < 0) {
 		/* We may be blaming an innocent here, but unlikely */
 		if (kmod_loop_msg < 5) {
 			printk(KERN_ERR
@@ -176,7 +170,6 @@ int __request_module(bool wait, const char *fmt, ...)
 			       module_name);
 			kmod_loop_msg++;
 		}
-		atomic_dec(&kmod_concurrent);
 		return -ENOMEM;
 	}
 
@@ -184,10 +177,12 @@ int __request_module(bool wait, const char *fmt, ...)
 
 	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
 
-	atomic_dec(&kmod_concurrent);
+	atomic_inc(&kmod_concurrent_max);
+
 	return ret;
 }
 EXPORT_SYMBOL(__request_module);
+
 #endif /* CONFIG_MODULES */
 
 static void call_usermodehelper_freeinfo(struct subprocess_info *info)
-- 
2.11.0

^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH v3 3/4] kmod: add test driver to stress test the module loader
  2017-05-26 21:12   ` [PATCH v3 0/4] kmod: help make deterministic Luis R. Rodriguez
  2017-05-26 21:12     ` [PATCH v3 1/4] module: use list_for_each_entry_rcu() on find_module_all() Luis R. Rodriguez
  2017-05-26 21:12     ` [PATCH v3 2/4] kmod: reduce atomic operations on kmod_concurrent and simplify Luis R. Rodriguez
@ 2017-05-26 21:12     ` Luis R. Rodriguez
  2017-05-26 21:12     ` [PATCH v3 4/4] kmod: throttle kmod thread limit Luis R. Rodriguez
                       ` (2 subsequent siblings)
  5 siblings, 0 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-26 21:12 UTC (permalink / raw)
  To: akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme, corbet, josh
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel, Luis R. Rodriguez

This adds a new stress test driver for kmod, the kernel module loader.
The new stress test driver, test_kmod, can only be built as a module right
now. It should be possible to build it in and run tests early
(refer to the force_init_test module parameter), however since many of
the tests can drive a system out of memory fast we leave this disabled for now.

On a system with 1024 MiB of RAM this test driver can *easily* drive
your kernel out of memory fast.

The test_kmod driver exposes API knobs for us to fine tune simple
request_module() and get_fs_type() calls. Since each of these API calls
takes only one parameter, a test driver for them is rather simple.
Other factors that can help the test driver, though, are the number of
calls we issue and knowing the current limitations of each. This exposes
as much configuration as possible to userspace so that tests can be
built directly from userspace.

Since the driver allows multiple misc devices, it will eventually (once we
add a knob to let us create new devices at will) also be possible to
perform more tests in parallel, provided you have enough memory.

We only enable tests we know work as of right now.

Demo screenshots:

 # tools/testing/selftests/kmod/kmod.sh
kmod_test_0001_driver: OK! - loading kmod test
kmod_test_0001_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
kmod_test_0001_fs: OK! - loading kmod test
kmod_test_0001_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
kmod_test_0002_driver: OK! - loading kmod test
kmod_test_0002_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
kmod_test_0002_fs: OK! - loading kmod test
kmod_test_0002_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
kmod_test_0003: OK! - loading kmod test
kmod_test_0003: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0004: OK! - loading kmod test
kmod_test_0004: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0005: OK! - loading kmod test
kmod_test_0005: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0006: OK! - loading kmod test
kmod_test_0006: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0005: OK! - loading kmod test
kmod_test_0005: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0006: OK! - loading kmod test
kmod_test_0006: OK! - Return value: 0 (SUCCESS), expected SUCCESS
XXX: add test result for 0007
Test completed
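
The 256 reported above for MODULE_NOT_FOUND lines up with the note in
lib/test_kmod.c that request_module() returns 256 for a module not found
even though modprobe itself returns 1. That is consistent with the value
being a wait(2)-style status, where the exit code sits in bits 8-15. The
userspace sketch below only illustrates that encoding; the helper names
raw_wait_status() and exit_code_of() are made up for this example and are
not part of the patch.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits with exit_code and return the raw wait status,
 * or -1 if fork() or waitpid() fails. The child is only a stand-in for a
 * modprobe process that fails to find a module.
 */
static int raw_wait_status(int exit_code)
{
	int status = 0;
	pid_t pid;

	assert(exit_code >= 0 && exit_code <= 255);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(exit_code);
	if (waitpid(pid, &status, 0) < 0)
		return -1;

	return status;
}

/* Decode the exit code; returns -1 if the child did not exit normally. */
static int exit_code_of(int status)
{
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

On Linux, feeding the raw status of a child that exited with 1 through
exit_code_of() gives back 1, which is why the test script maps 256 to
MODULE_NOT_FOUND rather than treating it as an errno.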

You can also request specific tests:

 # tools/testing/selftests/kmod/kmod.sh -t 0001
kmod_test_0001_driver: OK! - loading kmod test
kmod_test_0001_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
kmod_test_0001_fs: OK! - loading kmod test
kmod_test_0001_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
Test completed

Lastly, the tests currently available:

 # tools/testing/selftests/kmod/kmod.sh --help
Usage: tools/testing/selftests/kmod/kmod.sh [ -t <4-number-digit> ]
Valid tests: 0001-0009

0001 - Simple test - 1 thread  for empty string
0002 - Simple test - 1 thread  for modules/filesystems that do not exist
0003 - Simple test - 1 thread  for get_fs_type() only
0004 - Simple test - 2 threads for get_fs_type() only
0005 - multithreaded tests with default setup - request_module() only
0006 - multithreaded tests with default setup - get_fs_type() only
0007 - multithreaded tests with default setup test request_module() and get_fs_type()
0008 - multithreaded - push kmod_concurrent over max_modprobes for request_module()
0009 - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()

The following test cases currently fail, as such they are not currently
enabled by default:

 # tools/testing/selftests/kmod/kmod.sh -t 0008
 # tools/testing/selftests/kmod/kmod.sh -t 0009

To be sure to run them as intended please unload both of the modules:

  o test_module
  o xfs

And ensure they are not loaded on your system prior to testing them.
If you use an xfs partition for your rootfs you can change the default
test filesystem used for get_fs_type() by exporting it into your
environment. For examples of other test defaults you can override
refer to kmod.sh allow_user_defaults().

Behind the scenes this is how we fine tune a test case prior to
hitting a trigger to run it:

cat /sys/devices/virtual/misc/test_kmod0/config
echo -n "2" > /sys/devices/virtual/misc/test_kmod0/config_test_case
echo -n "ext4" > /sys/devices/virtual/misc/test_kmod0/config_test_fs
echo -n "80" > /sys/devices/virtual/misc/test_kmod0/config_num_threads
cat /sys/devices/virtual/misc/test_kmod0/config
echo -n "1" > /sys/devices/virtual/misc/test_kmod0/config_num_threads

Finally to trigger:

echo -n "1" > /sys/devices/virtual/misc/test_kmod0/trigger_config

The kmod.sh script uses the above constructs to build different test cases.
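
The same configure-then-trigger sequence can be driven from a small C
program instead of echo. This is only a sketch: write_knob() and
run_fs_test() are hypothetical helper names, the directory is passed in by
the caller, and the value "2" for config_test_case assumes it maps to
TEST_KMOD_FS_TYPE as in the enum in lib/test_kmod.c.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write value (no trailing newline, like "echo -n") to <dir>/<knob>.
 * Returns 0 on success, -1 on any failure.
 */
static int write_knob(const char *dir, const char *knob, const char *value)
{
	char path[512];
	FILE *f;
	int n, ok;

	assert(dir && knob && value);

	n = snprintf(path, sizeof(path), "%s/%s", dir, knob);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fputs(value, f) >= 0;
	/* stdio may defer the real write(2) until fclose(), so check it too */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}

/*
 * Configure a get_fs_type() test and fire it. dir is the sysfs directory
 * of the test device, e.g. /sys/devices/virtual/misc/test_kmod0.
 */
static int run_fs_test(const char *dir, const char *fs, const char *threads)
{
	if (write_knob(dir, "config_test_case", "2") ||
	    write_knob(dir, "config_test_fs", fs) ||
	    write_knob(dir, "config_num_threads", threads))
		return -1;

	return write_knob(dir, "trigger_config", "1");
}
```

A caller would then do something like run_fs_test(dir, "ext4", "80") and
read the test result back from the device's configuration afterwards, since
the trigger itself only reports setup failures.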

A bit of interpretation of the current failures follows, first two
premises:

a) When request_module() is used userspace figures out an optimized version of
module order for us. Once it finds the modules it needs, as per depmod
symbol dep map, it will finit_module() the respective modules which
are needed for the original request_module() request.

b) We have an optimization in place whereby if a kernel uses
request_module() on a module already loaded we never bother
userspace as the module already is loaded. This is all handled by
kernel/kmod.c.

A few things to consider to help identify root causes of issues:

0) kmod 19 has a broken heuristic for modules being assumed to be
built-in to your kernel and will return 0 even though request_module()
failed. Upgrade to a newer version of kmod.

1) A get_fs_type() call for "xfs" will request_module() for
"fs-xfs", not for "xfs". The optimization in kernel described in b)
fails to catch if we have a lot of consecutive get_fs_type() calls.
The reason is the optimization in place does not look for aliases. This
means two consecutive get_fs_type() calls will bump kmod_concurrent, whereas
request_module() will not.

This is one explanation for why test case 0009 fails at least once for
get_fs_type().
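
The effect described in b) and 1) can be shown with a toy userspace model.
This is not kernel code: the loaded[] table, the userspace_trips counter
(a stand-in for kmod_concurrent bumps) and the model_*() helpers are all
invented for illustration, and the model deliberately ignores the
filesystem registry lookup that get_fs_type() performs before it requests
a module. It isolates only the exact-name check that misses aliases.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's list of loaded module names. */
static const char *const loaded[] = { "xfs", "test_module" };

/* Counts requests that would have gone out to userspace. */
static unsigned int userspace_trips;

/* Exact-name lookup only; aliases such as "fs-xfs" are never consulted. */
static int is_loaded(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(loaded) / sizeof(loaded[0]); i++)
		if (strcmp(loaded[i], name) == 0)
			return 1;
	return 0;
}

/* Model of premise b): skip userspace when the name is already loaded. */
static void model_request_module(const char *name)
{
	assert(name);

	if (is_loaded(name))
		return;
	userspace_trips++;
}

/* Model of point 1): the request is issued under the "fs-" alias. */
static void model_get_fs_type(const char *fs)
{
	char alias[64] = "fs-";

	assert(fs);

	strncat(alias, fs, sizeof(alias) - strlen(alias) - 1);
	model_request_module(alias);
}
```

In this model two model_request_module("xfs") calls leave userspace_trips
at 0, while two model_get_fs_type("xfs") calls raise it to 2, because
"fs-xfs" never matches the name "xfs".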

2) If a module fails to load, for whatever reason (kmod_concurrent
limit reached, file not yet present due to rootfs switch, out of memory),
there is a period of time during which module requests for the same name,
either with request_module() or get_fs_type(), will *also* fail to load
even if the file for the module is ready.

This explains why *multiple* NULLs are possible on test 0009.

3) finit_module() consumes quite a bit of memory.

4) Filesystems typically also have more dependent modules than other
modules. It's important to note that even though a get_fs_type() call
does not incur additional kmod_concurrent bumps for dependencies, since
userspace loads the dependencies it needs via finit_module_fd(), it *will*
take much more memory to load a module with a lot of dependencies.

Because of 3) and 4) we will easily run into out of memory failures
with certain tests. For instance test 0006 fails on qemu with 1024 MiB
of RAM. The box panics after all userspace processes have been reaped
and there is still not enough memory.

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 lib/Kconfig.debug                     |   25 +
 lib/Makefile                          |    1 +
 lib/test_kmod.c                       | 1246 +++++++++++++++++++++++++++++++++
 tools/testing/selftests/kmod/Makefile |   11 +
 tools/testing/selftests/kmod/config   |    7 +
 tools/testing/selftests/kmod/kmod.sh  |  635 +++++++++++++++++
 6 files changed, 1925 insertions(+)
 create mode 100644 lib/test_kmod.c
 create mode 100644 tools/testing/selftests/kmod/Makefile
 create mode 100644 tools/testing/selftests/kmod/config
 create mode 100755 tools/testing/selftests/kmod/kmod.sh

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index f6aece3b8098..148cd822a08f 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1822,6 +1822,31 @@ config BUG_ON_DATA_CORRUPTION
 
 	  If unsure, say N.
 
+config TEST_KMOD
+	tristate "kmod stress tester"
+	default n
+	depends on m
+	select TEST_LKM
+	select XFS_FS
+	select TUN
+	select BTRFS_FS
+	help
+	  Test the kernel's module loading mechanism: kmod. kmod implements
+	  support to load modules using the Linux kernel's usermode helper.
+	  This test provides a series of tests against kmod.
+
+	  Although technically you could build test_kmod either as a module
+	  or into the kernel, we disallow building it into the kernel since
+	  it stress tests request_module(). This will very likely cause
+	  issues by taking over precious threads needed by other module
+	  load requests, which ultimately could be fatal.
+
+	  To run tests run:
+
+	  tools/testing/selftests/kmod/kmod.sh --help
+
+	  If unsure, say N.
+
 source "samples/Kconfig"
 
 source "lib/Kconfig.kgdb"
diff --git a/lib/Makefile b/lib/Makefile
index 07fbe6a75692..ee78d3da33eb 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -60,6 +60,7 @@ obj-$(CONFIG_TEST_PRINTF) += test_printf.o
 obj-$(CONFIG_TEST_BITMAP) += test_bitmap.o
 obj-$(CONFIG_TEST_UUID) += test_uuid.o
 obj-$(CONFIG_TEST_PARMAN) += test_parman.o
+obj-$(CONFIG_TEST_KMOD) += test_kmod.o
 
 ifeq ($(CONFIG_DEBUG_KOBJECT),y)
 CFLAGS_kobject.o += -DDEBUG
diff --git a/lib/test_kmod.c b/lib/test_kmod.c
new file mode 100644
index 000000000000..6c1d678bcf8b
--- /dev/null
+++ b/lib/test_kmod.c
@@ -0,0 +1,1246 @@
+/*
+ * kmod stress test driver
+ *
+ * Copyright (C) 2017 Luis R. Rodriguez <mcgrof@kernel.org>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or at your option any
+ * later version; or, when distributed separately from the Linux kernel or
+ * when incorporated into other software packages, subject to the following
+ * license:
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of copyleft-next (version 0.3.1 or later) as published
+ * at http://copyleft-next.org/.
+ */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+/*
+ * This driver provides an interface to trigger and test the kernel's
+ * module loader through a series of configurations and a few triggers.
+ * To test this driver use the following script as root:
+ *
+ * tools/testing/selftests/kmod/kmod.sh --help
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/kmod.h>
+#include <linux/printk.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/fs.h>
+#include <linux/miscdevice.h>
+#include <linux/vmalloc.h>
+#include <linux/slab.h>
+#include <linux/device.h>
+
+#define TEST_START_NUM_THREADS	50
+#define TEST_START_DRIVER	"test_module"
+#define TEST_START_TEST_FS	"xfs"
+#define TEST_START_TEST_CASE	TEST_KMOD_DRIVER
+
+
+static bool force_init_test = false;
+module_param(force_init_test, bool_enable_only, 0644);
+MODULE_PARM_DESC(force_init_test,
+		 "Force kicking a test immediately after driver loads");
+
+/*
+ * For device allocation / registration
+ */
+static DEFINE_MUTEX(reg_dev_mutex);
+static LIST_HEAD(reg_test_devs);
+
+/*
+ * num_test_devs actually represents the *next* ID of the next
+ * device we will allow to create.
+ */
+static int num_test_devs;
+
+/**
+ * enum kmod_test_case - linker table test case
+ *
+ * If you add a test case, please be sure to review if you need to set
+ * @need_mod_put for your test case.
+ *
+ * @TEST_KMOD_DRIVER: stress tests request_module()
+ * @TEST_KMOD_FS_TYPE: stress tests get_fs_type()
+ */
+enum kmod_test_case {
+	__TEST_KMOD_INVALID = 0,
+
+	TEST_KMOD_DRIVER,
+	TEST_KMOD_FS_TYPE,
+
+	__TEST_KMOD_MAX,
+};
+
+struct test_config {
+	char *test_driver;
+	char *test_fs;
+	unsigned int num_threads;
+	enum kmod_test_case test_case;
+	int test_result;
+};
+
+struct kmod_test_device;
+
+/**
+ * kmod_test_device_info - thread info
+ *
+ * @ret_sync: return value if request_module() is used, sync request for
+ * 	@TEST_KMOD_DRIVER
+ * @fs_sync: return value of get_fs_type() for @TEST_KMOD_FS_TYPE
+ * @thread_idx: thread ID
+ * @test_dev: test device test is being performed under
+ * @need_mod_put: Some tests (get_fs_type() is one) require putting the module
+ *	(module_put(fs_sync->owner)) when done, otherwise you will not be able
+ *	to unload the respective modules and re-test. We use this to keep
+ *	accounting of when we need this and to help out in case we need to
+ *	error out and deal with module_put() on error.
+ */
+struct kmod_test_device_info {
+	int ret_sync;
+	struct file_system_type *fs_sync;
+	struct task_struct *task_sync;
+	unsigned int thread_idx;
+	struct kmod_test_device *test_dev;
+	bool need_mod_put;
+};
+
+/**
+ * kmod_test_device - test device to help test kmod
+ *
+ * @dev_idx: unique ID for test device
+ * @config: configuration for the test
+ * @misc_dev: we use a misc device under the hood
+ * @dev: pointer to misc_dev's own struct device
+ * @config_mutex: protects configuration of test
+ * @trigger_mutex: the test trigger can only be fired once at a time
+ * @thread_mutex: protects @done count, and the @info per each thread
+ * @done: number of threads which have completed or failed
+ * @test_is_oom: when we run out of memory, use this to halt moving forward
+ * @kthreads_done: completion used to signal when all work is done
+ * @list: needed to be part of the reg_test_devs
+ * @info: array of info for each thread
+ */
+struct kmod_test_device {
+	int dev_idx;
+	struct test_config config;
+	struct miscdevice misc_dev;
+	struct device *dev;
+	struct mutex config_mutex;
+	struct mutex trigger_mutex;
+	struct mutex thread_mutex;
+
+	unsigned int done;
+
+	bool test_is_oom;
+	struct completion kthreads_done;
+	struct list_head list;
+
+	struct kmod_test_device_info *info;
+};
+
+static const char *test_case_str(enum kmod_test_case test_case)
+{
+	switch (test_case) {
+	case TEST_KMOD_DRIVER:
+		return "TEST_KMOD_DRIVER";
+	case TEST_KMOD_FS_TYPE:
+		return "TEST_KMOD_FS_TYPE";
+	default:
+		return "invalid";
+	}
+}
+
+static struct miscdevice *dev_to_misc_dev(struct device *dev)
+{
+	return dev_get_drvdata(dev);
+}
+
+static struct kmod_test_device *misc_dev_to_test_dev(struct miscdevice *misc_dev)
+{
+	return container_of(misc_dev, struct kmod_test_device, misc_dev);
+}
+
+static struct kmod_test_device *dev_to_test_dev(struct device *dev)
+{
+	struct miscdevice *misc_dev;
+
+	misc_dev = dev_to_misc_dev(dev);
+
+	return misc_dev_to_test_dev(misc_dev);
+}
+
+/* Must run with thread_mutex held */
+static void kmod_test_done_check(struct kmod_test_device *test_dev,
+				 unsigned int idx)
+{
+	struct test_config *config = &test_dev->config;
+
+	test_dev->done++;
+	dev_dbg(test_dev->dev, "Done thread count: %u\n", test_dev->done);
+
+	if (test_dev->done == config->num_threads) {
+		dev_info(test_dev->dev, "Done: %u threads have all run now\n",
+			 test_dev->done);
+		dev_info(test_dev->dev, "Last thread to run: %u\n", idx);
+		complete(&test_dev->kthreads_done);
+	}
+}
+
+static void test_kmod_put_module(struct kmod_test_device_info *info)
+{
+	struct kmod_test_device *test_dev = info->test_dev;
+	struct test_config *config = &test_dev->config;
+
+	if (!info->need_mod_put)
+		return;
+
+	switch (config->test_case) {
+	case TEST_KMOD_DRIVER:
+		break;
+	case TEST_KMOD_FS_TYPE:
+		if (info && info->fs_sync && info->fs_sync->owner)
+			module_put(info->fs_sync->owner);
+		break;
+	default:
+		BUG();
+	}
+
+	info->need_mod_put = false;
+}
+
+static int run_request(void *data)
+{
+	struct kmod_test_device_info *info = data;
+	struct kmod_test_device *test_dev = info->test_dev;
+	struct test_config *config = &test_dev->config;
+
+	switch (config->test_case) {
+	case TEST_KMOD_DRIVER:
+		info->ret_sync = request_module("%s", config->test_driver);
+		break;
+	case TEST_KMOD_FS_TYPE:
+		info->fs_sync = get_fs_type(config->test_fs);
+		info->need_mod_put = true;
+		break;
+	default:
+		/* __trigger_config_run() already checked for test sanity */
+		BUG();
+		return -EINVAL;
+	}
+
+	dev_dbg(test_dev->dev, "Ran thread %u\n", info->thread_idx);
+
+	test_kmod_put_module(info);
+
+	mutex_lock(&test_dev->thread_mutex);
+	info->task_sync = NULL;
+	kmod_test_done_check(test_dev, info->thread_idx);
+	mutex_unlock(&test_dev->thread_mutex);
+
+	return 0;
+}
+
+static int tally_work_test(struct kmod_test_device_info *info)
+{
+	struct kmod_test_device *test_dev = info->test_dev;
+	struct test_config *config = &test_dev->config;
+	int err_ret = 0;
+
+	switch (config->test_case) {
+	case TEST_KMOD_DRIVER:
+		/*
+		 * Only capture errors, if one is found that's
+		 * enough, for now.
+		 */
+		if (info->ret_sync != 0)
+			err_ret = info->ret_sync;
+		dev_info(test_dev->dev,
+			 "Sync thread %d return status: %d\n",
+			 info->thread_idx, info->ret_sync);
+		break;
+	case TEST_KMOD_FS_TYPE:
+		/* For now we make this simple */
+		if (!info->fs_sync)
+			err_ret = -EINVAL;
+		dev_info(test_dev->dev, "Sync thread %u fs: %s\n",
+			 info->thread_idx, info->fs_sync ? config->test_fs :
+			 "NULL");
+		break;
+	default:
+		BUG();
+	}
+
+	return err_ret;
+}
+
+/*
+ * XXX: add result option to display if all errors did not match.
+ * For now we just keep any error code if one was found.
+ *
+ * If this ran it means *all* tasks were created fine and we
+ * are now just collecting results.
+ *
+ * Only propagate errors, do not override with a subsequent success case.
+ */
+static void tally_up_work(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+	struct kmod_test_device_info *info;
+	unsigned int idx;
+	int err_ret = 0;
+	int ret = 0;
+
+	mutex_lock(&test_dev->thread_mutex);
+
+	dev_info(test_dev->dev, "Results:\n");
+
+	for (idx=0; idx < config->num_threads; idx++) {
+		info = &test_dev->info[idx];
+		ret = tally_work_test(info);
+		if (ret)
+			err_ret = ret;
+	}
+
+	/*
+	 * Note: request_module() returns 256 for a module not found even
+	 * though modprobe itself returns 1.
+	 */
+	config->test_result = err_ret;
+
+	mutex_unlock(&test_dev->thread_mutex);
+}
+
+static int try_one_request(struct kmod_test_device *test_dev, unsigned int idx)
+{
+	struct kmod_test_device_info *info = &test_dev->info[idx];
+	int fail_ret = -ENOMEM;
+
+	mutex_lock(&test_dev->thread_mutex);
+
+	info->thread_idx = idx;
+	info->test_dev = test_dev;
+	info->task_sync = kthread_run(run_request, info, "%s-%u",
+				      KBUILD_MODNAME, idx);
+
+	if (!info->task_sync || IS_ERR(info->task_sync)) {
+		test_dev->test_is_oom = true;
+		dev_err(test_dev->dev, "Setting up thread %u failed\n", idx);
+		info->task_sync = NULL;
+		goto err_out;
+	} else
+		dev_dbg(test_dev->dev, "Kicked off thread %u\n", idx);
+
+	mutex_unlock(&test_dev->thread_mutex);
+
+	return 0;
+
+err_out:
+	info->ret_sync = fail_ret;
+	mutex_unlock(&test_dev->thread_mutex);
+
+	return fail_ret;
+}
+
+static void test_dev_kmod_stop_tests(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+	struct kmod_test_device_info *info;
+	unsigned int i;
+
+	dev_info(test_dev->dev, "Ending request_module() tests\n");
+
+	mutex_lock(&test_dev->thread_mutex);
+
+	for (i=0; i < config->num_threads; i++) {
+		info = &test_dev->info[i];
+		if (info->task_sync && !IS_ERR(info->task_sync)) {
+			dev_info(test_dev->dev,
+				 "Stopping still-running thread %i\n", i);
+			kthread_stop(info->task_sync);
+		}
+
+		/*
+		 * info->task_sync is well protected, it can only be
+		 * NULL or a pointer to a struct. If it's NULL we either
+		 * never ran, or we did and we completed the work. Completed
+		 * tasks *always* put the module for us. This is a sanity
+		 * check -- just in case.
+		 */
+		if (info->task_sync && info->need_mod_put)
+			test_kmod_put_module(info);
+	}
+
+	mutex_unlock(&test_dev->thread_mutex);
+}
+
+/*
+ * Only wait *iff* we did not run into any errors during all of our thread
+ * set up. If we run into any issues we stop threads and just bail out with
+ * an error to the trigger. This also means we don't need any tally work
+ * for any threads which fail.
+ */
+static int try_requests(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+	unsigned int idx;
+	int ret;
+	bool any_error = false;
+
+	for (idx=0; idx < config->num_threads; idx++) {
+		if (test_dev->test_is_oom) {
+			any_error = true;
+			break;
+		}
+
+		ret = try_one_request(test_dev, idx);
+		if (ret) {
+			any_error = true;
+			break;
+		}
+	}
+
+	if (!any_error) {
+		test_dev->test_is_oom = false;
+		dev_info(test_dev->dev,
+			 "No errors were found while initializing threads\n");
+		wait_for_completion(&test_dev->kthreads_done);
+		tally_up_work(test_dev);
+	} else {
+		test_dev->test_is_oom = true;
+		dev_info(test_dev->dev,
+			 "At least one thread failed to start, stop all work\n");
+		test_dev_kmod_stop_tests(test_dev);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static int run_test_driver(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+
+	dev_info(test_dev->dev, "Test case: %s (%u)\n",
+		 test_case_str(config->test_case),
+		 config->test_case);
+	dev_info(test_dev->dev, "Test driver to load: %s\n",
+		 config->test_driver);
+	dev_info(test_dev->dev, "Number of threads to run: %u\n",
+		 config->num_threads);
+	dev_info(test_dev->dev, "Thread IDs will range from 0 - %u\n",
+		 config->num_threads - 1);
+
+	return try_requests(test_dev);
+}
+
+static int run_test_fs_type(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+
+	dev_info(test_dev->dev, "Test case: %s (%u)\n",
+		 test_case_str(config->test_case),
+		 config->test_case);
+	dev_info(test_dev->dev, "Test filesystem to load: %s\n",
+		 config->test_fs);
+	dev_info(test_dev->dev, "Number of threads to run: %u\n",
+		 config->num_threads);
+	dev_info(test_dev->dev, "Thread IDs will range from 0 - %u\n",
+		 config->num_threads - 1);
+
+	return try_requests(test_dev);
+}
+
+static ssize_t config_show(struct device *dev,
+			   struct device_attribute *attr,
+			   char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+	int len = 0;
+
+	mutex_lock(&test_dev->config_mutex);
+
+	len += snprintf(buf, PAGE_SIZE,
+			"Custom trigger configuration for: %s\n",
+			dev_name(dev));
+
+	len += snprintf(buf+len, PAGE_SIZE - len,
+			"Number of threads:\t%u\n",
+			config->num_threads);
+
+	len += snprintf(buf+len, PAGE_SIZE - len,
+			"Test_case:\t%s (%u)\n",
+			test_case_str(config->test_case),
+			config->test_case);
+
+	if (config->test_driver)
+		len += snprintf(buf+len, PAGE_SIZE - len,
+				"driver:\t%s\n",
+				config->test_driver);
+	else
+		len += snprintf(buf+len, PAGE_SIZE - len,
+				"driver:\tEMPTY\n");
+
+	if (config->test_fs)
+		len += snprintf(buf+len, PAGE_SIZE - len,
+				"fs:\t%s\n",
+				config->test_fs);
+	else
+		len += snprintf(buf+len, PAGE_SIZE - len,
+				"fs:\tEMPTY\n");
+
+	mutex_unlock(&test_dev->config_mutex);
+
+	return len;
+}
+static DEVICE_ATTR_RO(config);
+
+/*
+ * This ensures we don't allow kicking threads through if our configuration
+ * is faulty.
+ */
+static int __trigger_config_run(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+
+	test_dev->done = 0;
+
+	switch (config->test_case) {
+	case TEST_KMOD_DRIVER:
+		return run_test_driver(test_dev);
+	case TEST_KMOD_FS_TYPE:
+		return run_test_fs_type(test_dev);
+	default:
+		dev_warn(test_dev->dev,
+			 "Invalid test case requested: %u\n",
+			 config->test_case);
+		return -EINVAL;
+	}
+}
+
+static int trigger_config_run(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+	int ret;
+
+	mutex_lock(&test_dev->trigger_mutex);
+	mutex_lock(&test_dev->config_mutex);
+
+	ret = __trigger_config_run(test_dev);
+	if (ret < 0)
+		goto out;
+	dev_info(test_dev->dev, "General test result: %d\n",
+		 config->test_result);
+
+	/*
+	 * We must return 0 after a trigger unless something went wrong
+	 * with the setup of the test. If the test setup went fine
+	 * then userspace must just check the result of config->test_result.
+	 * One issue with relying on the return value of a call in the kernel
+	 * is that if the kernel returns a positive value, this trigger
+	 * will not return the value to userspace; it would be lost.
+	 *
+	 * By not relying on capturing the return value of the tests we run
+	 * through the trigger, this also allows us to run tests with set -e
+	 * and only fail when something went wrong with the driver upon
+	 * trigger requests.
+	 */
+	ret = 0;
+
+out:
+	mutex_unlock(&test_dev->config_mutex);
+	mutex_unlock(&test_dev->trigger_mutex);
+
+	return ret;
+}
+
+static ssize_t
+trigger_config_store(struct device *dev,
+		     struct device_attribute *attr,
+		     const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	int ret;
+
+	if (test_dev->test_is_oom)
+		return -ENOMEM;
+
+	/* For all intents and purposes we don't care what userspace
+	 * wrote to this trigger, we care only that we were triggered.
+	 * We use the return value only for capturing issues with
+	 * the test setup. At this point all the test variables should
+	 * have been allocated so typically this should never fail.
+	 */
+	ret = trigger_config_run(test_dev);
+	if (unlikely(ret < 0))
+		goto out;
+
+	/*
+	 * Note: any return > 0 will be treated as success
+	 * and the error value will not be available to userspace.
+	 * Do not rely on sending a test return value to userspace
+	 * this way, as positive return values will be lost.
+	 */
+	if (WARN_ON(ret > 0))
+		return -EINVAL;
+
+	ret = count;
+out:
+	return ret;
+}
+static DEVICE_ATTR_WO(trigger_config);
+
+/*
+ * XXX: move to kstrncpy() once merged.
+ *
+ * Users should use kfree_const() when freeing these.
+ */
+static int __kstrncpy(char **dst, const char *name, size_t count, gfp_t gfp)
+{
+	*dst = kstrndup(name, count, gfp);
+	if (!*dst)
+		return -ENOSPC;
+	return count;
+}
+
+static int config_copy_test_driver_name(struct test_config *config,
+				    const char *name,
+				    size_t count)
+{
+	return __kstrncpy(&config->test_driver, name, count, GFP_KERNEL);
+}
+
+
+static int config_copy_test_fs(struct test_config *config, const char *name,
+			       size_t count)
+{
+	return __kstrncpy(&config->test_fs, name, count, GFP_KERNEL);
+}
+
+static void __kmod_config_free(struct test_config *config)
+{
+	if (!config)
+		return;
+
+	kfree_const(config->test_driver);
+	config->test_driver = NULL;
+
+	kfree_const(config->test_fs);
+	config->test_fs = NULL;
+}
+
+static void kmod_config_free(struct kmod_test_device *test_dev)
+{
+	struct test_config *config;
+
+	if (!test_dev)
+		return;
+
+	config = &test_dev->config;
+
+	mutex_lock(&test_dev->config_mutex);
+	__kmod_config_free(config);
+	mutex_unlock(&test_dev->config_mutex);
+}
+
+static ssize_t config_test_driver_store(struct device *dev,
+					struct device_attribute *attr,
+					const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+	int copied;
+
+	mutex_lock(&test_dev->config_mutex);
+
+	kfree_const(config->test_driver);
+	config->test_driver = NULL;
+
+	copied = config_copy_test_driver_name(config, buf, count);
+	mutex_unlock(&test_dev->config_mutex);
+
+	return copied;
+}
+
+/*
+ * As per sysfs_kf_seq_show() the buf is max PAGE_SIZE.
+ */
+static ssize_t config_test_show_str(struct mutex *config_mutex,
+				    char *dst,
+				    char *src)
+{
+	int len;
+
+	mutex_lock(config_mutex);
+	len = snprintf(dst, PAGE_SIZE, "%s\n", src);
+	mutex_unlock(config_mutex);
+
+	return len;
+}
+
+static ssize_t config_test_driver_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return config_test_show_str(&test_dev->config_mutex, buf,
+				    config->test_driver);
+}
+static DEVICE_ATTR(config_test_driver, 0644, config_test_driver_show,
+		   config_test_driver_store);
+
+static ssize_t config_test_fs_store(struct device *dev,
+				    struct device_attribute *attr,
+				    const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+	int copied;
+
+	mutex_lock(&test_dev->config_mutex);
+
+	kfree_const(config->test_fs);
+	config->test_fs = NULL;
+
+	copied = config_copy_test_fs(config, buf, count);
+	mutex_unlock(&test_dev->config_mutex);
+
+	return copied;
+}
+
+static ssize_t config_test_fs_show(struct device *dev,
+				   struct device_attribute *attr,
+				   char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return config_test_show_str(&test_dev->config_mutex, buf,
+				    config->test_fs);
+}
+static DEVICE_ATTR(config_test_fs, 0644, config_test_fs_show,
+		   config_test_fs_store);
+
+static int trigger_config_run_type(struct kmod_test_device *test_dev,
+				   enum kmod_test_case test_case,
+				   const char *test_str)
+{
+	int copied = 0;
+	struct test_config *config = &test_dev->config;
+
+	mutex_lock(&test_dev->config_mutex);
+
+	switch (test_case) {
+	case TEST_KMOD_DRIVER:
+		kfree_const(config->test_driver);
+		config->test_driver = NULL;
+		copied = config_copy_test_driver_name(config, test_str,
+						      strlen(test_str));
+		break;
+	case TEST_KMOD_FS_TYPE:
+		kfree_const(config->test_fs);
+		config->test_fs = NULL;
+		copied = config_copy_test_fs(config, test_str,
+					     strlen(test_str));
+		break;
+	default:
+		mutex_unlock(&test_dev->config_mutex);
+		return -EINVAL;
+	}
+
+	config->test_case = test_case;
+
+	mutex_unlock(&test_dev->config_mutex);
+
+	if (copied <= 0 || copied != strlen(test_str)) {
+		test_dev->test_is_oom = true;
+		return -ENOMEM;
+	}
+
+	test_dev->test_is_oom = false;
+
+	return trigger_config_run(test_dev);
+}
+
+static void free_test_dev_info(struct kmod_test_device *test_dev)
+{
+	vfree(test_dev->info);
+	test_dev->info = NULL;
+}
+
+static int kmod_config_sync_info(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+
+	free_test_dev_info(test_dev);
+	test_dev->info = vzalloc(config->num_threads *
+				 sizeof(struct kmod_test_device_info));
+	if (!test_dev->info) {
+		dev_err(test_dev->dev, "Cannot alloc test_dev info\n");
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+/*
+ * Old kernels may not have this, if you want to port this code to
+ * test it on older kernels.
+ */
+#ifdef get_kmod_umh_limit
+static unsigned int kmod_init_test_thread_limit(void)
+{
+	return get_kmod_umh_limit();
+}
+#else
+static unsigned int kmod_init_test_thread_limit(void)
+{
+	return TEST_START_NUM_THREADS;
+}
+#endif
+
+static int __kmod_config_init(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+	int ret = -ENOMEM, copied;
+
+	__kmod_config_free(config);
+
+	copied = config_copy_test_driver_name(config, TEST_START_DRIVER,
+					      strlen(TEST_START_DRIVER));
+	if (copied != strlen(TEST_START_DRIVER))
+		goto err_out;
+
+	copied = config_copy_test_fs(config, TEST_START_TEST_FS,
+				     strlen(TEST_START_TEST_FS));
+	if (copied != strlen(TEST_START_TEST_FS))
+		goto err_out;
+
+	config->num_threads = kmod_init_test_thread_limit();
+	config->test_result = 0;
+	config->test_case = TEST_START_TEST_CASE;
+
+	ret = kmod_config_sync_info(test_dev);
+	if (ret)
+		goto err_out;
+
+	test_dev->test_is_oom = false;
+
+	return 0;
+
+err_out:
+	test_dev->test_is_oom = true;
+	WARN_ON(test_dev->test_is_oom);
+
+	__kmod_config_free(config);
+
+	return ret;
+}
+
+static ssize_t reset_store(struct device *dev,
+			   struct device_attribute *attr,
+			   const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	int ret;
+
+	mutex_lock(&test_dev->trigger_mutex);
+	mutex_lock(&test_dev->config_mutex);
+
+	ret = __kmod_config_init(test_dev);
+	if (ret < 0) {
+		ret = -ENOMEM;
+		dev_err(dev, "could not alloc settings for config trigger: %d\n",
+		       ret);
+		goto out;
+	}
+
+	dev_info(dev, "reset\n");
+	ret = count;
+
+out:
+	mutex_unlock(&test_dev->config_mutex);
+	mutex_unlock(&test_dev->trigger_mutex);
+
+	return ret;
+}
+static DEVICE_ATTR_WO(reset);
+
+static int test_dev_config_update_uint_sync(struct kmod_test_device *test_dev,
+					    const char *buf, size_t size,
+					    unsigned int *config,
+					    int (*test_sync)(struct kmod_test_device *test_dev))
+{
+	int ret;
+	long new;
+	unsigned int old_val;
+
+	ret = kstrtol(buf, 10, &new);
+	if (ret)
+		return ret;
+
+	if (new < 0 || new > UINT_MAX)
+		return -EINVAL;
+
+	mutex_lock(&test_dev->config_mutex);
+
+	old_val = *config;
+	*(unsigned int *)config = new;
+
+	ret = test_sync(test_dev);
+	if (ret) {
+		*(unsigned int *)config = old_val;
+
+		ret = test_sync(test_dev);
+		WARN_ON(ret);
+
+		mutex_unlock(&test_dev->config_mutex);
+		return -EINVAL;
+	}
+
+	mutex_unlock(&test_dev->config_mutex);
+	/* Always return full write size even if we didn't consume all */
+	return size;
+}
+
+static int test_dev_config_update_uint_range(struct kmod_test_device *test_dev,
+					     const char *buf, size_t size,
+					     unsigned int *config,
+					     unsigned int min,
+					     unsigned int max)
+{
+	int ret;
+	long new;
+
+	ret = kstrtol(buf, 10, &new);
+	if (ret)
+		return ret;
+
+	if (new < min || new > max || new > UINT_MAX)
+		return -EINVAL;
+
+	mutex_lock(&test_dev->config_mutex);
+	*config = new;
+	mutex_unlock(&test_dev->config_mutex);
+
+	/* Always return full write size even if we didn't consume all */
+	return size;
+}
+
+static int test_dev_config_update_int(struct kmod_test_device *test_dev,
+				      const char *buf, size_t size,
+				      int *config)
+{
+	int ret;
+	long new;
+
+	ret = kstrtol(buf, 10, &new);
+	if (ret)
+		return ret;
+
+	if (new > INT_MAX || new < INT_MIN)
+		return -EINVAL;
+
+	mutex_lock(&test_dev->config_mutex);
+	*config = new;
+	mutex_unlock(&test_dev->config_mutex);
+	/* Always return full write size even if we didn't consume all */
+	return size;
+}
+
+static ssize_t test_dev_config_show_int(struct kmod_test_device *test_dev,
+					char *buf,
+					int config)
+{
+	int val;
+
+	mutex_lock(&test_dev->config_mutex);
+	val = config;
+	mutex_unlock(&test_dev->config_mutex);
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", val);
+}
+
+static ssize_t test_dev_config_show_uint(struct kmod_test_device *test_dev,
+					 char *buf,
+					 unsigned int config)
+{
+	unsigned int val;
+
+	mutex_lock(&test_dev->config_mutex);
+	val = config;
+	mutex_unlock(&test_dev->config_mutex);
+
+	return snprintf(buf, PAGE_SIZE, "%u\n", val);
+}
+
+static ssize_t test_result_store(struct device *dev,
+				 struct device_attribute *attr,
+				 const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_update_int(test_dev, buf, count,
+					  &config->test_result);
+}
+
+static ssize_t config_num_threads_store(struct device *dev,
+					struct device_attribute *attr,
+					const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_update_uint_sync(test_dev, buf, count,
+						&config->num_threads,
+						kmod_config_sync_info);
+}
+
+static ssize_t config_num_threads_show(struct device *dev,
+				       struct device_attribute *attr,
+				       char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_show_uint(test_dev, buf, config->num_threads);
+}
+static DEVICE_ATTR(config_num_threads, 0644, config_num_threads_show,
+		   config_num_threads_store);
+
+static ssize_t config_test_case_store(struct device *dev,
+				      struct device_attribute *attr,
+				      const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_update_uint_range(test_dev, buf, count,
+						 &config->test_case,
+						 __TEST_KMOD_INVALID + 1,
+						 __TEST_KMOD_MAX - 1);
+}
+
+static ssize_t config_test_case_show(struct device *dev,
+				     struct device_attribute *attr,
+				     char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_show_uint(test_dev, buf, config->test_case);
+}
+static DEVICE_ATTR(config_test_case, 0644, config_test_case_show,
+		   config_test_case_store);
+
+static ssize_t test_result_show(struct device *dev,
+				struct device_attribute *attr,
+				char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_show_int(test_dev, buf, config->test_result);
+}
+static DEVICE_ATTR(test_result, 0644, test_result_show, test_result_store);
+
+#define TEST_KMOD_DEV_ATTR(name)		&dev_attr_##name.attr
+
+static struct attribute *test_dev_attrs[] = {
+	TEST_KMOD_DEV_ATTR(trigger_config),
+	TEST_KMOD_DEV_ATTR(config),
+	TEST_KMOD_DEV_ATTR(reset),
+
+	TEST_KMOD_DEV_ATTR(config_test_driver),
+	TEST_KMOD_DEV_ATTR(config_test_fs),
+	TEST_KMOD_DEV_ATTR(config_num_threads),
+	TEST_KMOD_DEV_ATTR(config_test_case),
+	TEST_KMOD_DEV_ATTR(test_result),
+
+	NULL,
+};
+
+ATTRIBUTE_GROUPS(test_dev);
+
+static int kmod_config_init(struct kmod_test_device *test_dev)
+{
+	int ret;
+
+	mutex_lock(&test_dev->config_mutex);
+	ret = __kmod_config_init(test_dev);
+	mutex_unlock(&test_dev->config_mutex);
+
+	return ret;
+}
+
+static struct kmod_test_device *alloc_test_dev_kmod(int idx)
+{
+	int ret;
+	struct kmod_test_device *test_dev;
+	struct miscdevice *misc_dev;
+
+	test_dev = vzalloc(sizeof(struct kmod_test_device));
+	if (!test_dev) {
+		pr_err("Cannot alloc test_dev\n");
+		goto err_out;
+	}
+
+	mutex_init(&test_dev->config_mutex);
+	mutex_init(&test_dev->trigger_mutex);
+	mutex_init(&test_dev->thread_mutex);
+
+	init_completion(&test_dev->kthreads_done);
+
+	ret = kmod_config_init(test_dev);
+	if (ret < 0) {
+		pr_err("Cannot alloc kmod_config_init()\n");
+		goto err_out_free;
+	}
+
+	test_dev->dev_idx = idx;
+	misc_dev = &test_dev->misc_dev;
+
+	misc_dev->minor = MISC_DYNAMIC_MINOR;
+	misc_dev->name = kasprintf(GFP_KERNEL, "test_kmod%d", idx);
+	if (!misc_dev->name) {
+		pr_err("Cannot alloc misc_dev->name\n");
+		goto err_out_free_config;
+	}
+	misc_dev->groups = test_dev_groups;
+
+	return test_dev;
+
+err_out_free_config:
+	free_test_dev_info(test_dev);
+	kmod_config_free(test_dev);
+err_out_free:
+	vfree(test_dev);
+	test_dev = NULL;
+err_out:
+	return NULL;
+}
+
+static void free_test_dev_kmod(struct kmod_test_device *test_dev)
+{
+	if (test_dev) {
+		kfree_const(test_dev->misc_dev.name);
+		test_dev->misc_dev.name = NULL;
+		free_test_dev_info(test_dev);
+		kmod_config_free(test_dev);
+		vfree(test_dev);
+		test_dev = NULL;
+	}
+}
+
+static struct kmod_test_device *register_test_dev_kmod(void)
+{
+	struct kmod_test_device *test_dev = NULL;
+	int ret;
+
+	mutex_lock(&reg_dev_mutex);
+
+	/* int should suffice for number of devices, test for wrap */
+	if (unlikely(num_test_devs + 1 < 0)) {
+		pr_err("reached limit of number of test devices\n");
+		goto out;
+	}
+
+	test_dev = alloc_test_dev_kmod(num_test_devs);
+	if (!test_dev)
+		goto out;
+
+	ret = misc_register(&test_dev->misc_dev);
+	if (ret) {
+		pr_err("could not register misc device: %d\n", ret);
+		free_test_dev_kmod(test_dev);
+		goto out;
+	}
+
+	test_dev->dev = test_dev->misc_dev.this_device;
+	list_add_tail(&test_dev->list, &reg_test_devs);
+	dev_info(test_dev->dev, "interface ready\n");
+
+	num_test_devs++;
+
+out:
+	mutex_unlock(&reg_dev_mutex);
+
+	return test_dev;
+
+}
+
+static int __init test_kmod_init(void)
+{
+	struct kmod_test_device *test_dev;
+	int ret;
+
+	test_dev = register_test_dev_kmod();
+	if (!test_dev) {
+		pr_err("Cannot add first test kmod device\n");
+		return -ENODEV;
+	}
+
+	/*
+	 * With some work we might be able to gracefully enable
+	 * testing with this driver built-in, for now this seems
+	 * rather risky. For those willing to try have at it,
+	 * and enable the below. Good luck! If that works, try
+	 * lowering the init level for more fun.
+	 */
+	if (force_init_test) {
+		ret = trigger_config_run_type(test_dev,
+					      TEST_KMOD_DRIVER, "tun");
+		if (WARN_ON(ret))
+			return ret;
+		ret = trigger_config_run_type(test_dev,
+					      TEST_KMOD_FS_TYPE, "btrfs");
+		if (WARN_ON(ret))
+			return ret;
+	}
+
+	return 0;
+}
+late_initcall(test_kmod_init);
+
+static
+void unregister_test_dev_kmod(struct kmod_test_device *test_dev)
+{
+	mutex_lock(&test_dev->trigger_mutex);
+	mutex_lock(&test_dev->config_mutex);
+
+	test_dev_kmod_stop_tests(test_dev);
+
+	dev_info(test_dev->dev, "removing interface\n");
+	misc_deregister(&test_dev->misc_dev);
+	/* misc_dev.name is freed later by free_test_dev_kmod() */
+
+	mutex_unlock(&test_dev->config_mutex);
+	mutex_unlock(&test_dev->trigger_mutex);
+
+	free_test_dev_kmod(test_dev);
+}
+
+static void __exit test_kmod_exit(void)
+{
+	struct kmod_test_device *test_dev, *tmp;
+
+	mutex_lock(&reg_dev_mutex);
+	list_for_each_entry_safe(test_dev, tmp, &reg_test_devs, list) {
+		list_del(&test_dev->list);
+		unregister_test_dev_kmod(test_dev);
+	}
+	mutex_unlock(&reg_dev_mutex);
+}
+module_exit(test_kmod_exit);
+
+MODULE_AUTHOR("Luis R. Rodriguez <mcgrof@kernel.org>");
+MODULE_LICENSE("GPL");
diff --git a/tools/testing/selftests/kmod/Makefile b/tools/testing/selftests/kmod/Makefile
new file mode 100644
index 000000000000..fa2ccc5fb3de
--- /dev/null
+++ b/tools/testing/selftests/kmod/Makefile
@@ -0,0 +1,11 @@
+# Makefile for kmod loading selftests
+
+# No binaries, but make sure arg-less "make" doesn't trigger "run_tests"
+all:
+
+TEST_PROGS := kmod.sh
+
+include ../lib.mk
+
+# Nothing to clean up.
+clean:
diff --git a/tools/testing/selftests/kmod/config b/tools/testing/selftests/kmod/config
new file mode 100644
index 000000000000..259f4fd6b5e2
--- /dev/null
+++ b/tools/testing/selftests/kmod/config
@@ -0,0 +1,7 @@
+CONFIG_TEST_KMOD=m
+CONFIG_TEST_LKM=m
+CONFIG_XFS_FS=m
+
+# For the module parameter force_init_test is used
+CONFIG_TUN=m
+CONFIG_BTRFS_FS=m
diff --git a/tools/testing/selftests/kmod/kmod.sh b/tools/testing/selftests/kmod/kmod.sh
new file mode 100755
index 000000000000..10196a62ed09
--- /dev/null
+++ b/tools/testing/selftests/kmod/kmod.sh
@@ -0,0 +1,635 @@
+#!/bin/bash
+#
+# Copyright (C) 2017 Luis R. Rodriguez <mcgrof@kernel.org>
+#
+# This program is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by the Free
+# Software Foundation; either version 2 of the License, or at your option any
+# later version; or, when distributed separately from the Linux kernel or
+# when incorporated into other software packages, subject to the following
+# license:
+#
+# This program is free software; you can redistribute it and/or modify it
+# under the terms of copyleft-next (version 0.3.1 or later) as published
+# at http://copyleft-next.org/.
+
+# This is a stress test script for kmod, the kernel module loader. It uses
+# test_kmod which exposes a series of knobs for the API for us so we can
+# tweak each test in userspace rather than in kernelspace.
+#
+# The way kmod works is it uses the kernel's usermode helper API to eventually
+# call /sbin/modprobe. It has a limit of the number of concurrent calls
+# possible. The kernel interface to load modules is request_module(), however
+# mount uses get_fs_type(). Both behave slightly differently, but the
+# differences are important enough to test each call separately. For this
+# reason test_kmod starts by providing tests for both calls.
+#
+# The test driver test_kmod assumes a series of defaults which you can
+# override by exporting to your environment prior to running this script.
+# For instance this script assumes you do not have xfs loaded upon boot.
+# If this is false, export DEFAULT_KMOD_FS="ext4" prior to running this
+# script if the filesystem module you don't have loaded upon bootup
+# is ext4 instead. Refer to allow_user_defaults() for a list of user
+# override variables possible.
+#
+# You'll want at least 4 GiB of RAM to expect to run these tests
+# without running out of memory on them. For other requirements refer
+# to test_reqs()
+
+set -e
+
+TEST_NAME="kmod"
+TEST_DRIVER="test_${TEST_NAME}"
+TEST_DIR=$(dirname $0)
+
+# This represents
+#
+# TEST_ID:TEST_COUNT:ENABLED
+#
+# TEST_ID: is the test id number
+# TEST_COUNT: number of times we should run the test
+# ENABLED: 1 if enabled, 0 otherwise
+#
+# Once these are enabled please leave them as-is. Write your own test,
+# we have tons of space.
+ALL_TESTS="0001:3:1"
+ALL_TESTS="$ALL_TESTS 0002:3:1"
+ALL_TESTS="$ALL_TESTS 0003:1:1"
+ALL_TESTS="$ALL_TESTS 0004:1:1"
+ALL_TESTS="$ALL_TESTS 0005:10:1"
+ALL_TESTS="$ALL_TESTS 0006:10:1"
+ALL_TESTS="$ALL_TESTS 0007:5:1"
+
+# Disabled tests:
+#
+# 0008 x 150 -  multithreaded - push kmod_concurrent over max_modprobes for request_module()"
+# Current best-effort failure interpretation:
+# Enough module requests get loaded in place fast enough to reach over the
+# max_modprobes limit and trigger a failure -- before we're even able to
+# start processing pending requests.
+ALL_TESTS="$ALL_TESTS 0008:150:0"
+
+# 0009 x 150 - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()"
+# Current best-effort failure interpretation:
+#
+# get_fs_type() requests modules using aliases as such the optimization in
+# place today to look for already loaded modules will not take effect and
+# we end up requesting a new module to load, this bumps the kmod_concurrent,
+# and in certain circumstances can lead to pushing the kmod_concurrent over
+# the max_modprobe limit.
+#
+# This test fails much easier than test 0008 since the alias optimizations
+# are not in place.
+ALL_TESTS="$ALL_TESTS 0009:150:0"
+
+test_modprobe()
+{
+       if [ ! -d $DIR ]; then
+               echo "$0: $DIR not present" >&2
+               echo "You must have the following enabled in your kernel:" >&2
+               cat $TEST_DIR/config >&2
+               exit 1
+       fi
+}
+
+function allow_user_defaults()
+{
+	if [ -z $DEFAULT_KMOD_DRIVER ]; then
+		DEFAULT_KMOD_DRIVER="test_module"
+	fi
+
+	if [ -z $DEFAULT_KMOD_FS ]; then
+		DEFAULT_KMOD_FS="xfs"
+	fi
+
+	if [ -z $PROC_DIR ]; then
+		PROC_DIR="/proc/sys/kernel/"
+	fi
+
+	if [ -z $MODPROBE_LIMIT ]; then
+		MODPROBE_LIMIT=50
+	fi
+
+	if [ -z $DIR ]; then
+		DIR="/sys/devices/virtual/misc/${TEST_DRIVER}0/"
+	fi
+
+	if [ -z $DEFAULT_NUM_TESTS ]; then
+		DEFAULT_NUM_TESTS=150
+	fi
+
+	MODPROBE_LIMIT_FILE="${PROC_DIR}/kmod-limit"
+}
+
+test_reqs()
+{
+	if ! which modprobe 2> /dev/null > /dev/null; then
+		echo "$0: You need modprobe installed" >&2
+		exit 1
+	fi
+
+	if ! which kmod 2> /dev/null > /dev/null; then
+		echo "$0: You need kmod installed" >&2
+		exit 1
+	fi
+
+	# kmod 19 has a bad bug where it returns 0 when modprobe
+	# gets called *even* if the module was not loaded due to
+	# some bad heuristics. For details see the commit URL printed
+	# below.
+	# A work around is possible in-kernel but it's rather
+	# complex.
+	KMOD_VERSION=$(kmod --version | awk '{print $3}')
+	if [[ $KMOD_VERSION  -le 19 ]]; then
+		echo "$0: You need at least kmod 20" >&2
+		echo "kmod <= 19 is buggy, for details see:" >&2
+		echo "http://git.kernel.org/cgit/utils/kernel/kmod/kmod.git/commit/libkmod/libkmod-module.c?id=fd44a98ae2eb5eb32161088954ab21e58e19dfc4" >&2
+		exit 1
+	fi
+
+	uid=$(id -u)
+	if [ $uid -ne 0 ]; then
+		echo "$0: must be run as root" >&2
+		exit 0
+	fi
+}
+
+function load_req_mod()
+{
+	trap "test_modprobe" EXIT
+
+	if [ ! -d $DIR ]; then
+		# Alanis: "Oh isn't it ironic?"
+		modprobe $TEST_DRIVER
+	fi
+}
+
+test_finish()
+{
+	echo "Test completed"
+}
+
+errno_name_to_val()
+{
+	case "$1" in
+	# kmod calls modprobe and, when a module is not found,
+	# modprobe returns just 1... However in the kernel we
+	# *sometimes* see 256...
+	MODULE_NOT_FOUND)
+		echo 256;;
+	SUCCESS)
+		echo 0;;
+	-EPERM)
+		echo -1;;
+	-ENOENT)
+		echo -2;;
+	-EINVAL)
+		echo -22;;
+	-ERR_ANY)
+		echo -123456;;
+	*)
+		echo invalid;;
+	esac
+}
+
+errno_val_to_name()
+	case "$1" in
+	256)
+		echo MODULE_NOT_FOUND;;
+	0)
+		echo SUCCESS;;
+	-1)
+		echo -EPERM;;
+	-2)
+		echo -ENOENT;;
+	-22)
+		echo -EINVAL;;
+	-123456)
+		echo -ERR_ANY;;
+	*)
+		echo invalid;;
+	esac
+
+config_set_test_case_driver()
+{
+	if ! echo -n 1 >$DIR/config_test_case; then
+		echo "$0: Unable to set test case to driver" >&2
+		exit 1
+	fi
+}
+
+config_set_test_case_fs()
+{
+	if ! echo -n 2 >$DIR/config_test_case; then
+		echo "$0: Unable to set test case to fs" >&2
+		exit 1
+	fi
+}
+
+config_num_threads()
+{
+	if ! echo -n $1 >$DIR/config_num_threads; then
+		echo "$0: Unable to set number of threads" >&2
+		exit 1
+	fi
+}
+
+config_get_modprobe_limit()
+{
+	if [[ -f ${MODPROBE_LIMIT_FILE} ]] ; then
+		MODPROBE_LIMIT=$(cat $MODPROBE_LIMIT_FILE)
+	fi
+	echo $MODPROBE_LIMIT
+}
+
+config_num_thread_limit_extra()
+{
+	MODPROBE_LIMIT=$(config_get_modprobe_limit)
+	let EXTRA_LIMIT=$MODPROBE_LIMIT+$1
+	config_num_threads $EXTRA_LIMIT
+}
+
+# For special characters use printf directly,
+# refer to kmod_test_0001
+config_set_driver()
+{
+	if ! echo -n $1 >$DIR/config_test_driver; then
+		echo "$0: Unable to set driver" >&2
+		exit 1
+	fi
+}
+
+config_set_fs()
+{
+	if ! echo -n $1 >$DIR/config_test_fs; then
+		echo "$0: Unable to set fs" >&2
+		exit 1
+	fi
+}
+
+config_get_driver()
+{
+	cat $DIR/config_test_driver
+}
+
+config_get_test_result()
+{
+	cat $DIR/test_result
+}
+
+config_reset()
+{
+	if ! echo -n "1" >"$DIR"/reset; then
+		echo "$0: reset should have worked" >&2
+		exit 1
+	fi
+}
+
+config_show_config()
+{
+	echo "----------------------------------------------------"
+	cat "$DIR"/config
+	echo "----------------------------------------------------"
+}
+
+config_trigger()
+{
+	if ! echo -n "1" >"$DIR"/trigger_config 2>/dev/null; then
+		echo "$1: FAIL - loading should have worked"
+		config_show_config
+		exit 1
+	fi
+	echo "$1: OK! - loading kmod test"
+}
+
+config_trigger_want_fail()
+{
+	if echo "1" > $DIR/trigger_config 2>/dev/null; then
+		echo "$1: FAIL - test case was expected to fail"
+		config_show_config
+		exit 1
+	fi
+	echo "$1: OK! - kmod test case failed as expected"
+}
+
+config_expect_result()
+{
+	RC=$(config_get_test_result)
+	RC_NAME=$(errno_val_to_name $RC)
+
+	ERRNO_NAME=$2
+	ERRNO=$(errno_name_to_val $ERRNO_NAME)
+
+	if [[ $ERRNO_NAME = "-ERR_ANY" ]]; then
+		if [[ $RC -ge 0 ]]; then
+			echo "$1: FAIL, test expects $ERRNO_NAME - got $RC_NAME ($RC)" >&2
+			config_show_config
+			exit 1
+		fi
+	elif [[ $RC != $ERRNO ]]; then
+		echo "$1: FAIL, test expects $ERRNO_NAME ($ERRNO) - got $RC_NAME ($RC)" >&2
+		config_show_config
+		exit 1
+	fi
+	echo "$1: OK! - Return value: $RC ($RC_NAME), expected $ERRNO_NAME"
+}
+
+kmod_defaults_driver()
+{
+	config_reset
+	modprobe -r $DEFAULT_KMOD_DRIVER
+	config_set_driver $DEFAULT_KMOD_DRIVER
+}
+
+kmod_defaults_fs()
+{
+	config_reset
+	modprobe -r $DEFAULT_KMOD_FS
+	config_set_fs $DEFAULT_KMOD_FS
+	config_set_test_case_fs
+}
+
+kmod_test_0001_driver()
+{
+	NAME='\000'
+
+	kmod_defaults_driver
+	config_num_threads 1
+	printf '\000' >"$DIR"/config_test_driver
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} MODULE_NOT_FOUND
+}
+
+kmod_test_0001_fs()
+{
+	NAME='\000'
+
+	kmod_defaults_fs
+	config_num_threads 1
+	printf '\000' >"$DIR"/config_test_fs
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} -EINVAL
+}
+
+kmod_test_0001()
+{
+	kmod_test_0001_driver
+	kmod_test_0001_fs
+}
+
+kmod_test_0002_driver()
+{
+	NAME="nope-$DEFAULT_KMOD_DRIVER"
+
+	kmod_defaults_driver
+	config_set_driver $NAME
+	config_num_threads 1
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} MODULE_NOT_FOUND
+}
+
+kmod_test_0002_fs()
+{
+	NAME="nope-$DEFAULT_KMOD_FS"
+
+	kmod_defaults_fs
+	config_set_fs $NAME
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} -EINVAL
+}
+
+kmod_test_0002()
+{
+	kmod_test_0002_driver
+	kmod_test_0002_fs
+}
+
+kmod_test_0003()
+{
+	kmod_defaults_fs
+	config_num_threads 1
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+kmod_test_0004()
+{
+	kmod_defaults_fs
+	config_num_threads 2
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+kmod_test_0005()
+{
+	kmod_defaults_driver
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+kmod_test_0006()
+{
+	kmod_defaults_fs
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+kmod_test_0007()
+{
+	kmod_test_0005
+	kmod_test_0006
+}
+
+kmod_test_0008()
+{
+	kmod_defaults_driver
+	MODPROBE_LIMIT=$(config_get_modprobe_limit)
+	let EXTRA=$MODPROBE_LIMIT/6
+	config_num_thread_limit_extra $EXTRA
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+kmod_test_0009()
+{
+	kmod_defaults_fs
+	MODPROBE_LIMIT=$(config_get_modprobe_limit)
+	let EXTRA=$MODPROBE_LIMIT/4
+	config_num_thread_limit_extra $EXTRA
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+list_tests()
+{
+	echo "Test ID list:"
+	echo
+	echo "TEST_ID x NUM_TESTS"
+	echo "TEST_ID:   Test ID"
+	echo "NUM_TESTS: Number of recommended times to run the test"
+	echo
+	echo "0001 x $(get_test_count 0001) - Simple test - 1 thread  for empty string"
+	echo "0002 x $(get_test_count 0002) - Simple test - 1 thread  for modules/filesystems that do not exist"
+	echo "0003 x $(get_test_count 0003) - Simple test - 1 thread  for get_fs_type() only"
+	echo "0004 x $(get_test_count 0004) - Simple test - 2 threads for get_fs_type() only"
+	echo "0005 x $(get_test_count 0005) - multithreaded tests with default setup - request_module() only"
+	echo "0006 x $(get_test_count 0006) - multithreaded tests with default setup - get_fs_type() only"
+	echo "0007 x $(get_test_count 0007) - multithreaded tests with default setup test request_module() and get_fs_type()"
+	echo "0008 x $(get_test_count 0008) - multithreaded - push kmod_concurrent over max_modprobes for request_module()"
+	echo "0009 x $(get_test_count 0009) - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()"
+}
+
+usage()
+{
+	NUM_TESTS=$(grep -o ' ' <<<"$ALL_TESTS" | grep -c .)
+	let NUM_TESTS=$NUM_TESTS+1
+	MAX_TEST=$(printf "%04d\n" $NUM_TESTS)
+	echo "Usage: $0 [ -t <4-number-digit> ] | [ -w <4-number-digit> ] |"
+	echo "		 [ -s <4-number-digit> ] | [ -c <4-number-digit> <test-count> ] |"
+	echo "           [ all ] [ -h | --help ] [ -l ]"
+	echo ""
+	echo "Valid tests: 0001-$MAX_TEST"
+	echo ""
+	echo "    all     Runs all tests (default)"
+	echo "    -t      Run test ID the recommended number of times"
+	echo "    -w      Watch test ID run until it runs into an error"
+	echo "    -s      Run test ID once"
+	echo "    -c      Run test ID x test-count number of times"
+	echo "    -l      List all test IDs"
+	echo " -h|--help  Help"
+	echo
+	echo "If an error ever occurs execution will immediately terminate."
+	echo "If you are adding a new test try using -w <test-ID> first to"
+	echo "make sure the test passes a series of tests."
+	echo
+	echo Example uses:
+	echo
+	echo "${TEST_NAME}.sh		-- executes all tests"
+	echo "${TEST_NAME}.sh -t 0008	-- Executes test ID 0008 the recommended number of times"
+	echo "${TEST_NAME}.sh -w 0008	-- Watch test ID 0008 run until an error occurs"
+	echo "${TEST_NAME}.sh -s 0008	-- Run test ID 0008 once"
+	echo "${TEST_NAME}.sh -c 0008 3	-- Run test ID 0008 three times"
+	echo
+	list_tests
+	exit 1
+}
+
+function test_num()
+{
+	re='^[0-9]+$'
+	if ! [[ $1 =~ $re ]]; then
+		usage
+	fi
+}
+
+function get_test_count()
+{
+	test_num $1
+	TEST_DATA=$(echo $ALL_TESTS | awk '{print $'$1'}')
+	LAST_TWO=${TEST_DATA#*:*}
+	echo ${LAST_TWO%:*}
+}
+
+function get_test_enabled()
+{
+	test_num $1
+	TEST_DATA=$(echo $ALL_TESTS | awk '{print $'$1'}')
+	echo ${TEST_DATA#*:*:}
+}
+
+function run_all_tests()
+{
+	for i in $ALL_TESTS ; do
+		TEST_ID=${i%:*:*}
+		ENABLED=$(get_test_enabled $TEST_ID)
+		TEST_COUNT=$(get_test_count $TEST_ID)
+		if [[ $ENABLED -eq "1" ]]; then
+			test_case $TEST_ID $TEST_COUNT
+		fi
+	done
+}
+
+function watch_log()
+{
+	if [ $# -ne 3 ]; then
+		clear
+	fi
+	date
+	echo "Running test: $2 - run #$1"
+}
+
+function watch_case()
+{
+	i=0
+	while [ 1 ]; do
+
+		if [ $# -eq 1 ]; then
+			test_num $1
+			watch_log $i ${TEST_NAME}_test_$1
+			${TEST_NAME}_test_$1
+		else
+			watch_log $i all
+			run_all_tests
+		fi
+		let i=$i+1
+	done
+}
+
+function test_case()
+{
+	NUM_TESTS=$DEFAULT_NUM_TESTS
+	if [ $# -eq 2 ]; then
+		NUM_TESTS=$2
+	fi
+
+	i=0
+	while [ $i -lt $NUM_TESTS ]; do
+		test_num $1
+		watch_log $i ${TEST_NAME}_test_$1 noclear
+		RUN_TEST=${TEST_NAME}_test_$1
+		$RUN_TEST
+		let i=$i+1
+	done
+}
+
+function parse_args()
+{
+	if [ $# -eq 0 ]; then
+		run_all_tests
+	else
+		if [[ "$1" = "all" ]]; then
+			run_all_tests
+		elif [[ "$1" = "-w" ]]; then
+			shift
+			watch_case $@
+		elif [[ "$1" = "-t" ]]; then
+			shift
+			test_num $1
+			test_case $1 $(get_test_count $1)
+		elif [[ "$1" = "-c" ]]; then
+			shift
+			test_num $1
+			test_num $2
+			test_case $1 $2
+		elif [[ "$1" = "-s" ]]; then
+			shift
+			test_case $1 1
+		elif [[ "$1" = "-l" ]]; then
+			list_tests
+		elif [[ "$1" = "-h" || "$1" = "--help" ]]; then
+			usage
+		else
+			usage
+		fi
+	fi
+}
+
+test_reqs
+allow_user_defaults
+load_req_mod
+
+trap "test_finish" EXIT
+
+parse_args $@
+
+exit 0
-- 
2.11.0

^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH v3 4/4] kmod: throttle kmod thread limit
  2017-05-26 21:12   ` [PATCH v3 0/4] kmod: help make deterministic Luis R. Rodriguez
                       ` (2 preceding siblings ...)
  2017-05-26 21:12     ` [PATCH v3 3/4] kmod: add test driver to stress test the module loader Luis R. Rodriguez
@ 2017-05-26 21:12     ` Luis R. Rodriguez
  2017-06-22 15:19       ` Petr Mladek
  2017-06-23 19:20       ` [PATCH v4 " Luis R. Rodriguez
  2017-06-20 20:56     ` [PATCH v3 0/4] kmod: help make deterministic Luis R. Rodriguez
  2017-06-28 22:31     ` [PATCH v4 0/3] " Luis R. Rodriguez
  5 siblings, 2 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-05-26 21:12 UTC (permalink / raw)
  To: akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme, corbet, josh
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel, Luis R. Rodriguez

If we reach the limit of modprobe_limit threads running, the next
request_module() call will fail. The original reason for adding
a kill was to do away with possible issues in old circumstances
which would create a recursive series of request_module() calls.
We can do better than just be super aggressive and reject calls
once we've reached the limit by simply making pending callers wait
until the threshold has been reduced.
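
As a rough illustration of waiting rather than rejecting, here is a
minimal user-space sketch. It is not the kernel code from this patch:
it assumes POSIX threads, with a mutex and condition variable standing
in for the atomic counter and the wait queue, and the names
slot_limiter, slot_get(), slot_put(), run_workers() and MAX_WORKERS
are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_WORKERS 64	/* hypothetical cap for this sketch */

/* Hypothetical user-space stand-in for kmod_concurrent_max plus kmod_wq. */
struct slot_limiter {
	pthread_mutex_t lock;
	pthread_cond_t wq;
	int free_slots;		/* counts down as slots are taken */
	int completed;		/* callers that got a slot and finished */
};

/* Take a slot, sleeping instead of failing while none are free. */
static void slot_get(struct slot_limiter *l)
{
	pthread_mutex_lock(&l->lock);
	while (l->free_slots == 0)
		pthread_cond_wait(&l->wq, &l->lock);
	l->free_slots--;
	pthread_mutex_unlock(&l->lock);
}

/* Give the slot back and wake every waiter so each rechecks the count. */
static void slot_put(struct slot_limiter *l)
{
	pthread_mutex_lock(&l->lock);
	l->free_slots++;
	l->completed++;
	pthread_cond_broadcast(&l->wq);
	pthread_mutex_unlock(&l->lock);
}

static void *worker(void *arg)
{
	struct slot_limiter *l = arg;

	slot_get(l);
	/* The throttled work (modprobe, in the patch) would run here. */
	slot_put(l);
	return NULL;
}

/* Start nthreads callers against nslots slots; return how many finished. */
int run_workers(int nslots, int nthreads)
{
	struct slot_limiter l = { .free_slots = nslots, .completed = 0 };
	pthread_t tids[MAX_WORKERS];
	int i;

	assert(nslots > 0 && nthreads > 0 && nthreads <= MAX_WORKERS);
	pthread_mutex_init(&l.lock, NULL);
	pthread_cond_init(&l.wq, NULL);

	for (i = 0; i < nthreads; i++)
		if (pthread_create(&tids[i], NULL, worker, &l) != 0)
			break;
	nthreads = i;	/* join only the threads that actually started */
	for (i = 0; i < nthreads; i++)
		pthread_join(tids[i], NULL);

	pthread_cond_destroy(&l.wq);
	pthread_mutex_destroy(&l.lock);
	return l.completed;
}
```

With more callers than slots, for example run_workers(2, 8), every
caller eventually gets a slot, so the return value equals the number
of threads started. Under a reject-at-the-limit policy some of those
callers would have returned an error instead.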

The only difference is that the clutch helps avoid making
request_module() requests fatal as often. With x86_64 qemu,
with 4 cores and 4 GiB of RAM, it takes the following run time to
run both tests:

time ./kmod.sh -t 0008
real    0m12.364s
user    0m0.704s
sys     0m5.373s

time ./kmod.sh -t 0009
real    0m47.638s
user    0m1.033s
sys     0m5.425s

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 kernel/kmod.c                        | 16 +++++++---------
 tools/testing/selftests/kmod/kmod.sh | 24 ++----------------------
 2 files changed, 9 insertions(+), 31 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index 3e346c700e80..46b12fed6fd0 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -67,6 +67,7 @@ static DECLARE_RWSEM(umhelper_sem);
  * enabled.
  * */
 static atomic_t kmod_concurrent_max = ATOMIC_INIT(50);
+static DECLARE_WAIT_QUEUE_HEAD(kmod_wq);
 
 /*
 	modprobe_path is set via /proc/sys.
@@ -139,7 +140,6 @@ int __request_module(bool wait, const char *fmt, ...)
 	va_list args;
 	char module_name[MODULE_NAME_LEN];
 	int ret;
-	static int kmod_loop_msg;
 
 	/*
 	 * We don't allow synchronous module loading from async.  Module
@@ -163,14 +163,11 @@ int __request_module(bool wait, const char *fmt, ...)
 		return ret;
 
 	if (atomic_dec_if_positive(&kmod_concurrent_max) < 0) {
-		/* We may be blaming an innocent here, but unlikely */
-		if (kmod_loop_msg < 5) {
-			printk(KERN_ERR
-			       "request_module: runaway loop modprobe %s\n",
-			       module_name);
-			kmod_loop_msg++;
-		}
-		return -ENOMEM;
+		pr_warn_ratelimited("request_module: kmod_concurrent_max (%u) close to 0 (max_modprobes: %u), for module %s, throttling...\n",
+				    atomic_read(&kmod_concurrent_max),
+				    50, module_name);
+		wait_event_interruptible(kmod_wq,
+					 atomic_dec_if_positive(&kmod_concurrent_max) >= 0);
 	}
 
 	trace_module_request(module_name, wait, _RET_IP_);
@@ -178,6 +175,7 @@ int __request_module(bool wait, const char *fmt, ...)
 	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
 
 	atomic_inc(&kmod_concurrent_max);
+	wake_up_all(&kmod_wq);
 
 	return ret;
 }
diff --git a/tools/testing/selftests/kmod/kmod.sh b/tools/testing/selftests/kmod/kmod.sh
index 10196a62ed09..8cecae9a8bca 100755
--- a/tools/testing/selftests/kmod/kmod.sh
+++ b/tools/testing/selftests/kmod/kmod.sh
@@ -59,28 +59,8 @@ ALL_TESTS="$ALL_TESTS 0004:1:1"
 ALL_TESTS="$ALL_TESTS 0005:10:1"
 ALL_TESTS="$ALL_TESTS 0006:10:1"
 ALL_TESTS="$ALL_TESTS 0007:5:1"
-
-# Disabled tests:
-#
-# 0008 x 150 -  multithreaded - push kmod_concurrent over max_modprobes for request_module()"
-# Current best-effort failure interpretation:
-# Enough module requests get loaded in place fast enough to reach over the
-# max_modprobes limit and trigger a failure -- before we're even able to
-# start processing pending requests.
-ALL_TESTS="$ALL_TESTS 0008:150:0"
-
-# 0009 x 150 - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()"
-# Current best-effort failure interpretation:
-#
-# get_fs_type() requests modules using aliases as such the optimization in
-# place today to look for already loaded modules will not take effect and
-# we end up requesting a new module to load, this bumps the kmod_concurrent,
-# and in certain circumstances can lead to pushing the kmod_concurrent over
-# the max_modprobe limit.
-#
-# This test fails much easier than test 0008 since the alias optimizations
-# are not in place.
-ALL_TESTS="$ALL_TESTS 0009:150:0"
+ALL_TESTS="$ALL_TESTS 0008:150:1"
+ALL_TESTS="$ALL_TESTS 0009:150:1"
 
 test_modprobe()
 {
-- 
2.11.0

* Re: [PATCH v3 0/4] kmod: help make deterministic
  2017-05-26 21:12   ` [PATCH v3 0/4] kmod: help make deterministic Luis R. Rodriguez
                       ` (3 preceding siblings ...)
  2017-05-26 21:12     ` [PATCH v3 4/4] kmod: throttle kmod thread limit Luis R. Rodriguez
@ 2017-06-20 20:56     ` Luis R. Rodriguez
  2017-06-21  0:23       ` Kees Cook
  2017-06-28 22:31     ` [PATCH v4 0/3] " Luis R. Rodriguez
  5 siblings, 1 reply; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-06-20 20:56 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme,
	corbet, josh, martin.wilck, mmarek, pmladek, hare, rwright,
	jeffm, DSterba, fdmanana, neilb, linux, rgoldwyn, subashab,
	xypron.glpk, keescook, atomlin, mbenes, paulmck, dan.j.williams,
	jpoimboe, davem, mingo, alan, tytso, gregkh, torvalds,
	linux-kselftest, linux-doc, linux-kernel

On Fri, May 26, 2017 at 02:12:24PM -0700, Luis R. Rodriguez wrote:
> This v3 nukes the proc sysctl interface in favor of just letting userspace
> check the kernel revision. Prior to whenever this is merged userspace should
> try to avoid hammering more than 50 kmod threads as they can fail and it'd
> get -ENOMEM.
> 
> We do away with the old heuristics on assuming you could end up with
> less than max_threads/2 < 50 threads as Dmitry notes this would mean having
> a system with 16 MiB of RAM with modules enabled. It simplifies our patch
> "kmod: reduce atomic operations on kmod_concurrent" considerably.
> 
> Since the sysctl interface is gone, this no longer depends on any
> other patches, the series is independent. As usual the series is
> available on my linux-next 20170526-kmod-only branch which is based
> on next-20170526.
> 
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20170526-kmod-only
> 
>   Luis
> 
> Luis R. Rodriguez (4):
>   module: use list_for_each_entry_rcu() on find_module_all()
>   kmod: reduce atomic operations on kmod_concurrent and simplify
>   kmod: add test driver to stress test the module loader
>   kmod: throttle kmod thread limit

It has been about a month now with no further nitpicks. Which tree should these
changes go through if there are no issues: Andrew's or Jessica's?

  Luis

* Re: [PATCH v3 0/4] kmod: help make deterministic
  2017-06-20 20:56     ` [PATCH v3 0/4] kmod: help make deterministic Luis R. Rodriguez
@ 2017-06-21  0:23       ` Kees Cook
  2017-06-26 21:37         ` Jessica Yu
  0 siblings, 1 reply; 69+ messages in thread
From: Kees Cook @ 2017-06-21  0:23 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Andrew Morton, Jessica Yu, Shuah Khan, Rusty Russell,
	Eric W. Biederman, Dmitry Torokhov, Arnaldo Carvalho de Melo,
	Jonathan Corbet, Josh Triplett, martin.wilck, Michal Marek,
	Petr Mladek, hare, rwright, Jeff Mahoney, David Sterba,
	Filipe Manana, NeilBrown, Guenter Roeck, rgoldwyn,
	Subash Abhinov Kasiviswanathan, Heinrich Schuchardt,
	Aaron Tomlin, Miroslav Benes, Paul E. McKenney, Dan Williams,
	Josh Poimboeuf, David S. Miller, Ingo Molnar, Alan Cox,
	Ted Ts'o, Greg KH, Linus Torvalds, linux-kselftest,
	linux-doc, LKML

On Tue, Jun 20, 2017 at 1:56 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Fri, May 26, 2017 at 02:12:24PM -0700, Luis R. Rodriguez wrote:
>> This v3 nukes the proc sysctl interface in favor of just letting userspace
>> check the kernel revision. Prior to whenever this is merged userspace should
>> try to avoid hammering more than 50 kmod threads as they can fail and it'd
>> get -ENOMEM.
>>
>> We do away with the old heuristics on assuming you could end up with
>> less than max_threads/2 < 50 threads as Dmitry notes this would mean having
>> a system with 16 MiB of RAM with modules enabled. It simplifies our patch
>> "kmod: reduce atomic operations on kmod_concurrent" considerably.
>>
>> Since the sysctl interface is gone, this no longer depends on any
>> other patches, the series is independent. As usual the series is
>> available on my linux-next 20170526-kmod-only branch which is based
>> on next-20170526.
>>
>> [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20170526-kmod-only
>>
>>   Luis
>>
>> Luis R. Rodriguez (4):
>>   module: use list_for_each_entry_rcu() on find_module_all()
>>   kmod: reduce atomic operations on kmod_concurrent and simplify
>>   kmod: add test driver to stress test the module loader
>>   kmod: throttle kmod thread limit
>
> About a month now with no further nitpicks. What tree should these changes
> go through if there are no issues? Andrew's, Jessica's ?

Seems like going through Jessica's would make the most sense?

-Kees

-- 
Kees Cook
Pixel Security

* Re: [PATCH v3 4/4] kmod: throttle kmod thread limit
  2017-05-26 21:12     ` [PATCH v3 4/4] kmod: throttle kmod thread limit Luis R. Rodriguez
@ 2017-06-22 15:19       ` Petr Mladek
  2017-06-23 16:16         ` Luis R. Rodriguez
  2017-06-23 19:20       ` [PATCH v4 " Luis R. Rodriguez
  1 sibling, 1 reply; 69+ messages in thread
From: Petr Mladek @ 2017-06-22 15:19 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme,
	corbet, josh, martin.wilck, mmarek, hare, rwright, jeffm,
	DSterba, fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel

On Fri 2017-05-26 14:12:28, Luis R. Rodriguez wrote:
> If we reach the limit of modprobe_limit threads running the next
> request_module() call will fail. The original reason for adding
> a kill was to do away with possible issues with in old circumstances
> which would create a recursive series of request_module() calls.
> We can do better than just be super aggressive and reject calls
> once we've reached the limit by simply making pending callers wait
> until the threshold has been reduced.
> 
> The only difference is the clutch helps with avoiding making
> request_module() requests fatal more often. With x86_64 qemu,
> with 4 cores, 4 GiB of RAM it takes the following run time to
> run both tests:
> 
> time ./kmod.sh -t 0008
> real    0m12.364s
> user    0m0.704s
> sys     0m5.373s
> 
> time ./kmod.sh -t 0009
> real    0m47.638s
> user    0m1.033s
> sys     0m5.425s
> 
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> ---
>  kernel/kmod.c                        | 16 +++++++---------
>  tools/testing/selftests/kmod/kmod.sh | 24 ++----------------------
>  2 files changed, 9 insertions(+), 31 deletions(-)
> 
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index 3e346c700e80..46b12fed6fd0 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -163,14 +163,11 @@ int __request_module(bool wait, const char *fmt, ...)
>  		return ret;
>  
>  	if (atomic_dec_if_positive(&kmod_concurrent_max) < 0) {
> -		/* We may be blaming an innocent here, but unlikely */
> -		if (kmod_loop_msg < 5) {
> -			printk(KERN_ERR
> -			       "request_module: runaway loop modprobe %s\n",
> -			       module_name);
> -			kmod_loop_msg++;
> -		}
> -		return -ENOMEM;
> +		pr_warn_ratelimited("request_module: kmod_concurrent_max (%u) close to 0 (max_modprobes: %u), for module %s\n, throttling...",
> +				    atomic_read(&kmod_concurrent_max),
> +				    50, module_name);

It is weird to pass the constant '50' via %s. Also a #define should be
used to keep it in sync with the kmod_concurrent_max initialization.


> +		wait_event_interruptible(kmod_wq,
> +					 atomic_dec_if_positive(&kmod_concurrent_max) >= 0);
>  	}
>  
>  	trace_module_request(module_name, wait, _RET_IP_);
> @@ -178,6 +175,7 @@ int __request_module(bool wait, const char *fmt, ...)
>  	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
>  
>  	atomic_inc(&kmod_concurrent_max);
> +	wake_up_all(&kmod_wq);

Does it make sense to wake up all waiters when we released the resource
only for one? IMHO, a simple wake_up() should be here.

I am sorry for the late review. The month ran really fast.

Best Regards,
Petr

* Re: [PATCH v3 4/4] kmod: throttle kmod thread limit
  2017-06-22 15:19       ` Petr Mladek
@ 2017-06-23 16:16         ` Luis R. Rodriguez
  2017-06-23 17:56           ` Luis R. Rodriguez
  2017-06-26  9:55           ` Petr Mladek
  0 siblings, 2 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-06-23 16:16 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Luis R. Rodriguez, akpm, jeyu, shuah, rusty, ebiederm,
	dmitry.torokhov, acme, corbet, josh, martin.wilck, mmarek, hare,
	rwright, jeffm, DSterba, fdmanana, neilb, linux, rgoldwyn,
	subashab, xypron.glpk, keescook, atomlin, mbenes, paulmck,
	dan.j.williams, jpoimboe, davem, mingo, alan, tytso, gregkh,
	torvalds, linux-kselftest, linux-doc, linux-kernel

On Thu, Jun 22, 2017 at 05:19:36PM +0200, Petr Mladek wrote:
> On Fri 2017-05-26 14:12:28, Luis R. Rodriguez wrote:
> > --- a/kernel/kmod.c
> > +++ b/kernel/kmod.c
> > @@ -163,14 +163,11 @@ int __request_module(bool wait, const char *fmt, ...)
> >  		return ret;
> >  
> >  	if (atomic_dec_if_positive(&kmod_concurrent_max) < 0) {
> > -		/* We may be blaming an innocent here, but unlikely */
> > -		if (kmod_loop_msg < 5) {
> > -			printk(KERN_ERR
> > -			       "request_module: runaway loop modprobe %s\n",
> > -			       module_name);
> > -			kmod_loop_msg++;
> > -		}
> > -		return -ENOMEM;
> > +		pr_warn_ratelimited("request_module: kmod_concurrent_max (%u) close to 0 (max_modprobes: %u), for module %s\n, throttling...",
> > +				    atomic_read(&kmod_concurrent_max),
> > +				    50, module_name);
> 
> It is weird to pass the constant '50' via %s.

The 50 was passed with %u, so I take it you meant it is odd to use a parameter
for it.

> Also a #define should be
> used to keep it in sync with the kmod_concurrent_max initialization.

OK.

> > +		wait_event_interruptible(kmod_wq,
> > +					 atomic_dec_if_positive(&kmod_concurrent_max) >= 0);
> >  	}
> >  
> >  	trace_module_request(module_name, wait, _RET_IP_);
> > @@ -178,6 +175,7 @@ int __request_module(bool wait, const char *fmt, ...)
> >  	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
> >  
> >  	atomic_inc(&kmod_concurrent_max);
> > +	wake_up_all(&kmod_wq);
> 
> Does it make sense to wake up all waiters when we released the resource
> only for one? IMHO, a simple wake_up() should be here.

Then we should wake_up() also on failure, otherwise we have the potential
to not wake some in a proper time.

> I am sorry for the late review. The month ran really fast.

No worries!

  Luis

* Re: [PATCH v3 4/4] kmod: throttle kmod thread limit
  2017-06-23 16:16         ` Luis R. Rodriguez
@ 2017-06-23 17:56           ` Luis R. Rodriguez
  2017-06-23 19:16             ` Luis R. Rodriguez
  2017-06-26  9:55           ` Petr Mladek
  1 sibling, 1 reply; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-06-23 17:56 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Petr Mladek, akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov,
	acme, corbet, josh, martin.wilck, mmarek, hare, rwright, jeffm,
	DSterba, fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel

On Fri, Jun 23, 2017 at 06:16:19PM +0200, Luis R. Rodriguez wrote:
> On Thu, Jun 22, 2017 at 05:19:36PM +0200, Petr Mladek wrote:
> > On Fri 2017-05-26 14:12:28, Luis R. Rodriguez wrote:
> > > --- a/kernel/kmod.c
> > > +++ b/kernel/kmod.c
> > > @@ -178,6 +175,7 @@ int __request_module(bool wait, const char *fmt, ...)
> > >  	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
> > >  
> > >  	atomic_inc(&kmod_concurrent_max);
> > > +	wake_up_all(&kmod_wq);
> > 
> > Does it make sense to wake up all waiters when we released the resource
> > only for one? IMHO, a simple wake_up() should be here.
> 
> Then we should wake_up() also on failure, otherwise we have the potential
> to not wake some in a proper time.

I checked and it turns out we have no error paths after we consume a kmod
ticket, if you will. Once we bump with atomic_dec_if_positive() we assume
we're moving forward with an attempt, and the only failure path is already
bundled with a wake at the end of the __request_module() call.

Then the next question would be *who* exactly gets woken up next if we just
use wake_up() ? The common core wake up code varies depending on use and
all this reminded me of the complexity we just don't need, so I have now
converted to use swait. swait uses list_add() if empty and then iterates
with list_first_entry() on wakeup, so that should get the first item added
to the wait list.

Works for me. I will run a test before v4 is sent, but since only 2 patches
are modified I will only send updates for those 2 patches.

  Luis

* Re: [PATCH v3 4/4] kmod: throttle kmod thread limit
  2017-06-23 17:56           ` Luis R. Rodriguez
@ 2017-06-23 19:16             ` Luis R. Rodriguez
  2017-06-26 10:03               ` Petr Mladek
  0 siblings, 1 reply; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-06-23 19:16 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Petr Mladek, akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov,
	acme, corbet, josh, martin.wilck, mmarek, hare, rwright, jeffm,
	DSterba, fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel

On Fri, Jun 23, 2017 at 07:56:11PM +0200, Luis R. Rodriguez wrote:
> On Fri, Jun 23, 2017 at 06:16:19PM +0200, Luis R. Rodriguez wrote:
> > On Thu, Jun 22, 2017 at 05:19:36PM +0200, Petr Mladek wrote:
> > > On Fri 2017-05-26 14:12:28, Luis R. Rodriguez wrote:
> > > > --- a/kernel/kmod.c
> > > > +++ b/kernel/kmod.c
> > > > @@ -178,6 +175,7 @@ int __request_module(bool wait, const char *fmt, ...)
> > > >  	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
> > > >  
> > > >  	atomic_inc(&kmod_concurrent_max);
> > > > +	wake_up_all(&kmod_wq);
> > > 
> > > Does it make sense to wake up all waiters when we released the resource
> > > only for one? IMHO, a simple wake_up() should be here.
> > 
> > Then we should wake_up() also on failure, otherwise we have the potential
> > to not wake some in a proper time.
> 
> I checked and it turns out we have no error paths after we consume a kmod
> ticket, if you will. Once we bump with atomic_dec_if_positive() we assume
> we're moving forward with an attempt, and the only failure path is already
> bundled with a wake at the end of the __request_module() call.
> 
> Then the next question would be *who* exactly gets woken up next if we just
> use wake_up() ? The common core wake up code varies depending on use and
> all this reminded me of the complexity we just don't need, so I have now
> converted to use swait. swait uses list_add() if empty and then iterates
> with list_first_entry() on wakeup, so that should get the first item added
> to the wait list.
> 
> Works for me. I will run a test before v4 is sent, but since only 2 patches
> are modified I will only send updates for those 2 patches.

Alright, this worked out well! It's just a tiny bit slower on test cases 0008
and 0009 (a few seconds) but that's fine, it's natural due to the lack of the
swake_up_all().

  Luis

* [PATCH v4 2/4] kmod: reduce atomic operations on kmod_concurrent and simplify
  2017-05-26 21:12     ` [PATCH v3 2/4] kmod: reduce atomic operations on kmod_concurrent and simplify Luis R. Rodriguez
@ 2017-06-23 19:19       ` Luis R. Rodriguez
  2017-06-26 11:36       ` [PATCH v3 " Petr Mladek
  1 sibling, 0 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-06-23 19:19 UTC (permalink / raw)
  To: akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme, corbet, josh
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel, Luis R. Rodriguez

When checking if we want to allow a kmod thread to kick off we increment,
then read to see if we should enable a thread. If we were over the allowed
limit we decrement. Splitting the increment far apart from the decrement
means there could be a time where two increments happen, potentially
giving a false failure on a thread which should have been allowed.

CPU1			CPU2
atomic_inc()
			atomic_inc()
atomic_read()
			atomic_read()
atomic_dec()
			atomic_dec()

In this case the read on CPU1 sees both atomic_inc() calls, and we could
deny it a kmod thread. We could try to prevent this with a lock or
preemption but that is overkill. We can fix this by reducing the number of
atomic operations. We do this by inverting the logic of the enabler:
instead of incrementing kmod_concurrent as we get new kmod users, define the
variable kmod_concurrent_max as the max number of currently allowed kmod
users and as we get new kmod users just decrement it if it's still positive.
This combines the dec and read in one atomic operation.

In this case we no longer get the same false failure:

CPU1			CPU2
atomic_dec_if_positive()
			atomic_dec_if_positive()
atomic_inc()
			atomic_inc()
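
The combined check-and-decrement described above can be sketched in userspace
with C11 atomics. This is only an illustration, not the kernel implementation:
dec_if_positive(), slot_try_get(), slot_put() and MAX_SLOTS are hypothetical
names, and the compare-and-swap loop is an assumed stand-in for what the
kernel's atomic_dec_if_positive() provides.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_SLOTS 50	/* hypothetical; mirrors the limit of 50 used above */

static atomic_int slots_left = MAX_SLOTS;

/*
 * Userspace stand-in for atomic_dec_if_positive(): decrement *v only if
 * the result stays >= 0. Returns the would-be new value, so a negative
 * return means the counter was left untouched.
 */
static int dec_if_positive(atomic_int *v)
{
	int old = atomic_load(v);

	/* On a failed exchange 'old' is refreshed with the current value. */
	while (old > 0 &&
	       !atomic_compare_exchange_weak(v, &old, old - 1))
		;
	return old - 1;
}

/* Take a slot; the check and the decrement are one atomic step. */
static bool slot_try_get(void)
{
	return dec_if_positive(&slots_left) >= 0;
}

/* Give the slot back once the helper has finished. */
static void slot_put(void)
{
	int prev = atomic_fetch_add(&slots_left, 1);

	/* Returning more slots than were taken would be a caller bug. */
	assert(prev < MAX_SLOTS);
	(void)prev;
}
```

Because a caller never increments before it checks, a second caller cannot
observe a transient over-the-limit value, which is what caused the false
failure in the first diagram.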

The number of threads is computed at init, and since the current computation
of kmod_concurrent includes the thread count we can avoid setting
kmod_concurrent_max later in boot through an init call by simply sticking to
50 as the kmod_concurrent_max. The assumption here is a system with modules
must at least have ~16 MiB of RAM.
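
As a rough check of the memory assumption, the thread formula quoted in the
code comment of this patch can be evaluated directly. The sketch below assumes
4 KiB pages and a 16 KiB THREAD_SIZE (x86_64-like values picked only for
illustration); the EX_* constants and ex_* helpers are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; both vary by architecture. */
#define EX_PAGE_SIZE	4096ULL
#define EX_THREAD_SIZE	16384ULL

/* threads = (totalram_pages * PAGE_SIZE) / (THREAD_SIZE * 8) */
static uint64_t ex_max_threads(uint64_t totalram_pages)
{
	/* Guard against overflow in the multiplication below. */
	assert(totalram_pages <= UINT64_MAX / EX_PAGE_SIZE);
	return (totalram_pages * EX_PAGE_SIZE) / (EX_THREAD_SIZE * 8);
}

/* The limit used before this patch: min(max_threads / 2, 50). */
static uint64_t ex_old_max_modprobes(uint64_t totalram_pages)
{
	uint64_t half = ex_max_threads(totalram_pages) / 2;

	return half < 50 ? half : 50;
}
```

Under these assumptions 3200 pages (12.5 MiB) is the smallest size for which
the old computation still yields 50, so only systems below that size would
ever have seen a lower limit.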

Suggested-by: Petr Mladek <pmladek@suse.com>
Suggested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 kernel/kmod.c | 40 ++++++++++++++++++----------------------
 1 file changed, 18 insertions(+), 22 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index 563f97e2be36..ff68198fe83b 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -45,8 +45,6 @@
 
 #include <trace/events/module.h>
 
-extern int max_threads;
-
 #define CAP_BSET	(void *)1
 #define CAP_PI		(void *)2
 
@@ -56,6 +54,20 @@ static DEFINE_SPINLOCK(umh_sysctl_lock);
 static DECLARE_RWSEM(umhelper_sem);
 
 #ifdef CONFIG_MODULES
+/*
+ * Assuming:
+ *
+ * threads = div64_u64((u64) totalram_pages * (u64) PAGE_SIZE,
+ *		       (u64) THREAD_SIZE * 8UL);
+ *
+ * Needing fewer than 50 threads would mean we're dealing with systems
+ * smaller than 3200 pages. This assumes you are capable of having ~13M memory,
+ * and this would only be an upper limit, after which the OOM killer
+ * would take effect. Systems like these are very unlikely if modules are
+ * enabled.
+ */
+#define MAX_KMOD_CONCURRENT 50
+static atomic_t kmod_concurrent_max = ATOMIC_INIT(MAX_KMOD_CONCURRENT);
 
 /*
 	modprobe_path is set via /proc/sys.
@@ -127,10 +139,7 @@ int __request_module(bool wait, const char *fmt, ...)
 {
 	va_list args;
 	char module_name[MODULE_NAME_LEN];
-	unsigned int max_modprobes;
 	int ret;
-	static atomic_t kmod_concurrent = ATOMIC_INIT(0);
-#define MAX_KMOD_CONCURRENT 50	/* Completely arbitrary value - KAO */
 	static int kmod_loop_msg;
 
 	/*
@@ -154,21 +163,7 @@ int __request_module(bool wait, const char *fmt, ...)
 	if (ret)
 		return ret;
 
-	/* If modprobe needs a service that is in a module, we get a recursive
-	 * loop.  Limit the number of running kmod threads to max_threads/2 or
-	 * MAX_KMOD_CONCURRENT, whichever is the smaller.  A cleaner method
-	 * would be to run the parents of this process, counting how many times
-	 * kmod was invoked.  That would mean accessing the internals of the
-	 * process tables to get the command line, proc_pid_cmdline is static
-	 * and it is not worth changing the proc code just to handle this case. 
-	 * KAO.
-	 *
-	 * "trace the ppid" is simple, but will fail if someone's
-	 * parent exits.  I think this is as good as it gets. --RR
-	 */
-	max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
-	atomic_inc(&kmod_concurrent);
-	if (atomic_read(&kmod_concurrent) > max_modprobes) {
+	if (atomic_dec_if_positive(&kmod_concurrent_max) < 0) {
 		/* We may be blaming an innocent here, but unlikely */
 		if (kmod_loop_msg < 5) {
 			printk(KERN_ERR
@@ -176,7 +171,6 @@ int __request_module(bool wait, const char *fmt, ...)
 			       module_name);
 			kmod_loop_msg++;
 		}
-		atomic_dec(&kmod_concurrent);
 		return -ENOMEM;
 	}
 
@@ -184,10 +178,12 @@ int __request_module(bool wait, const char *fmt, ...)
 
 	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
 
-	atomic_dec(&kmod_concurrent);
+	atomic_inc(&kmod_concurrent_max);
+
 	return ret;
 }
 EXPORT_SYMBOL(__request_module);
+
 #endif /* CONFIG_MODULES */
 
 static void call_usermodehelper_freeinfo(struct subprocess_info *info)
-- 
2.11.0

* [PATCH v4 4/4] kmod: throttle kmod thread limit
  2017-05-26 21:12     ` [PATCH v3 4/4] kmod: throttle kmod thread limit Luis R. Rodriguez
  2017-06-22 15:19       ` Petr Mladek
@ 2017-06-23 19:20       ` Luis R. Rodriguez
  2017-06-26 11:38         ` Petr Mladek
  1 sibling, 1 reply; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-06-23 19:20 UTC (permalink / raw)
  To: akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme, corbet, josh
  Cc: martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel, Luis R. Rodriguez

If we reach the limit of modprobe_limit threads running, the next
request_module() call will fail. The original reason for adding
a kill was to do away with possible issues in old circumstances
which would create a recursive series of request_module() calls.

We can do better than just be super aggressive and reject calls
once we've reached the limit by simply making pending callers wait
until the threshold has been reduced, and then throttling them in,
one by one.

This throttling enables requests over the kmod concurrent limit to
be processed once a pending request completes. Only the first item
queued up to wait is woken up. The assumption here is that once a task
is woken it will have no other option but to also kick the queue to check
if there are more pending tasks -- regardless of whether or not it
was successful.

By throttling and processing only max kmod concurrent tasks we ensure
we avoid unexpected fatal request_module() calls, and we keep memory
consumption on module loading to a minimum.
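
The throttling idea can be sketched in userspace with a mutex and a condition
variable. This is an assumed analogue, not the kernel code: the patch uses
swait_event_interruptible() and swake_up(), whereas the sketch uses
pthread_cond_wait() and pthread_cond_signal(), which do not promise any
particular wake order and do not model interruption by signals. All throttle_*
names and THROTTLE_MAX_THREADS are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

#define MAX_INFLIGHT		50	/* stands in for MAX_KMOD_CONCURRENT */
#define THROTTLE_MAX_THREADS	256	/* hypothetical cap for this sketch */

static pthread_mutex_t throttle_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t throttle_cond = PTHREAD_COND_INITIALIZER;
static int throttle_slots = MAX_INFLIGHT;	/* free slots */
static int throttle_peak;	/* most slots ever in use at once */
static int throttle_done;	/* completed requests */

/* Block until a slot is free, then take it instead of failing. */
static void throttle_get(void)
{
	pthread_mutex_lock(&throttle_lock);
	while (throttle_slots == 0)
		pthread_cond_wait(&throttle_cond, &throttle_lock);
	throttle_slots--;
	if (MAX_INFLIGHT - throttle_slots > throttle_peak)
		throttle_peak = MAX_INFLIGHT - throttle_slots;
	pthread_mutex_unlock(&throttle_lock);
}

/* Return the slot and wake one waiter, whether or not the work succeeded. */
static void throttle_put(void)
{
	pthread_mutex_lock(&throttle_lock);
	assert(throttle_slots < MAX_INFLIGHT);
	throttle_slots++;
	throttle_done++;
	pthread_mutex_unlock(&throttle_lock);
	pthread_cond_signal(&throttle_cond);
}

/* One simulated request: take a slot, pretend to work, release it. */
static void *throttle_worker(void *arg)
{
	(void)arg;
	throttle_get();
	sched_yield();		/* placeholder for running the helper */
	throttle_put();
	return NULL;
}

/* Start nthreads requests; return how many were actually started. */
static int throttle_run(int nthreads)
{
	pthread_t tids[THROTTLE_MAX_THREADS];
	int i, started = 0;

	assert(nthreads >= 0 && nthreads <= THROTTLE_MAX_THREADS);
	for (i = 0; i < nthreads; i++) {
		if (pthread_create(&tids[i], NULL, throttle_worker, NULL))
			break;
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	return started;
}
```

Calling throttle_run(150), the same request count as tests 0008 and 0009,
should complete every request while throttle_peak never exceeds MAX_INFLIGHT.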

With x86_64 qemu, with 4 cores, 4 GiB of RAM it takes the following run
time to run both tests:

time ./kmod.sh -t 0008
real    0m16.523s
user    0m0.879s
sys     0m8.977s

time ./kmod.sh -t 0009
real    0m56.080s
user    0m0.717s
sys     0m10.324s

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 kernel/kmod.c                        | 17 ++++++++---------
 tools/testing/selftests/kmod/kmod.sh | 24 ++----------------------
 2 files changed, 10 insertions(+), 31 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index ff68198fe83b..d3b4f8e3f2b0 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -42,6 +42,7 @@
 #include <linux/ptrace.h>
 #include <linux/async.h>
 #include <linux/uaccess.h>
+#include <linux/swait.h>
 
 #include <trace/events/module.h>
 
@@ -68,6 +69,7 @@ static DECLARE_RWSEM(umhelper_sem);
  */
 #define MAX_KMOD_CONCURRENT 50
 static atomic_t kmod_concurrent_max = ATOMIC_INIT(MAX_KMOD_CONCURRENT);
+static DECLARE_SWAIT_QUEUE_HEAD(kmod_wq);
 
 /*
 	modprobe_path is set via /proc/sys.
@@ -140,7 +142,6 @@ int __request_module(bool wait, const char *fmt, ...)
 	va_list args;
 	char module_name[MODULE_NAME_LEN];
 	int ret;
-	static int kmod_loop_msg;
 
 	/*
 	 * We don't allow synchronous module loading from async.  Module
@@ -164,14 +165,11 @@ int __request_module(bool wait, const char *fmt, ...)
 		return ret;
 
 	if (atomic_dec_if_positive(&kmod_concurrent_max) < 0) {
-		/* We may be blaming an innocent here, but unlikely */
-		if (kmod_loop_msg < 5) {
-			printk(KERN_ERR
-			       "request_module: runaway loop modprobe %s\n",
-			       module_name);
-			kmod_loop_msg++;
-		}
-		return -ENOMEM;
+		pr_warn_ratelimited("request_module: kmod_concurrent_max (%u) close to 0 (max_modprobes: %u), for module %s, throttling...",
+				    atomic_read(&kmod_concurrent_max),
+				    MAX_KMOD_CONCURRENT, module_name);
+		swait_event_interruptible(kmod_wq,
+					  atomic_dec_if_positive(&kmod_concurrent_max) >= 0);
 	}
 
 	trace_module_request(module_name, wait, _RET_IP_);
@@ -179,6 +177,7 @@ int __request_module(bool wait, const char *fmt, ...)
 	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
 
 	atomic_inc(&kmod_concurrent_max);
+	swake_up(&kmod_wq);
 
 	return ret;
 }
diff --git a/tools/testing/selftests/kmod/kmod.sh b/tools/testing/selftests/kmod/kmod.sh
index 10196a62ed09..8cecae9a8bca 100755
--- a/tools/testing/selftests/kmod/kmod.sh
+++ b/tools/testing/selftests/kmod/kmod.sh
@@ -59,28 +59,8 @@ ALL_TESTS="$ALL_TESTS 0004:1:1"
 ALL_TESTS="$ALL_TESTS 0005:10:1"
 ALL_TESTS="$ALL_TESTS 0006:10:1"
 ALL_TESTS="$ALL_TESTS 0007:5:1"
-
-# Disabled tests:
-#
-# 0008 x 150 -  multithreaded - push kmod_concurrent over max_modprobes for request_module()"
-# Current best-effort failure interpretation:
-# Enough module requests get loaded in place fast enough to reach over the
-# max_modprobes limit and trigger a failure -- before we're even able to
-# start processing pending requests.
-ALL_TESTS="$ALL_TESTS 0008:150:0"
-
-# 0009 x 150 - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()"
-# Current best-effort failure interpretation:
-#
-# get_fs_type() requests modules using aliases as such the optimization in
-# place today to look for already loaded modules will not take effect and
-# we end up requesting a new module to load, this bumps the kmod_concurrent,
-# and in certain circumstances can lead to pushing the kmod_concurrent over
-# the max_modprobe limit.
-#
-# This test fails much easier than test 0008 since the alias optimizations
-# are not in place.
-ALL_TESTS="$ALL_TESTS 0009:150:0"
+ALL_TESTS="$ALL_TESTS 0008:150:1"
+ALL_TESTS="$ALL_TESTS 0009:150:1"
 
 test_modprobe()
 {
-- 
2.11.0

* Re: [PATCH v3 4/4] kmod: throttle kmod thread limit
  2017-06-23 16:16         ` Luis R. Rodriguez
  2017-06-23 17:56           ` Luis R. Rodriguez
@ 2017-06-26  9:55           ` Petr Mladek
  1 sibling, 0 replies; 69+ messages in thread
From: Petr Mladek @ 2017-06-26  9:55 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme,
	corbet, josh, martin.wilck, mmarek, hare, rwright, jeffm,
	DSterba, fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel

On Fri 2017-06-23 18:16:19, Luis R. Rodriguez wrote:
> On Thu, Jun 22, 2017 at 05:19:36PM +0200, Petr Mladek wrote:
> > On Fri 2017-05-26 14:12:28, Luis R. Rodriguez wrote:
> > > --- a/kernel/kmod.c
> > > +++ b/kernel/kmod.c
> > > @@ -163,14 +163,11 @@ int __request_module(bool wait, const char *fmt, ...)
> > >  		return ret;
> > >  
> > >  	if (atomic_dec_if_positive(&kmod_concurrent_max) < 0) {
> > > -		/* We may be blaming an innocent here, but unlikely */
> > > -		if (kmod_loop_msg < 5) {
> > > -			printk(KERN_ERR
> > > -			       "request_module: runaway loop modprobe %s\n",
> > > -			       module_name);
> > > -			kmod_loop_msg++;
> > > -		}
> > > -		return -ENOMEM;
> > > +		pr_warn_ratelimited("request_module: kmod_concurrent_max (%u) close to 0 (max_modprobes: %u), for module %s\n, throttling...",
> > > +				    atomic_read(&kmod_concurrent_max),
> > > +				    50, module_name);
> > 
> > It is weird to pass the constant '50' via %s.
> 
> The 50 was passed with %u, so I take it you meant it is odd to use a parameter
> for it.

Yeah, I meant %u and not %s.

> > Also a #define should be
> > used to keep it in sync with the kmod_concurrent_max initialization.
> 
> OK.
> 
> > > +		wait_event_interruptible(kmod_wq,
> > > +					 atomic_dec_if_positive(&kmod_concurrent_max) >= 0);
> > >  	}
> > >  
> > >  	trace_module_request(module_name, wait, _RET_IP_);
> > > @@ -178,6 +175,7 @@ int __request_module(bool wait, const char *fmt, ...)
> > >  	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
> > >  
> > >  	atomic_inc(&kmod_concurrent_max);
> > > +	wake_up_all(&kmod_wq);
> > 
> > Does it make sense to wake up all waiters when we released the resource
> > only for one? IMHO, a simple wake_up() should be here.
> 
> Then we should wake_up() also on failure, otherwise we have the potential
> to not wake some in a proper time.

I think that we must wake_up() always when we increment
kmod_concurrent_max. If the value was negative, the increment will
allow exactly one process to pass the
atomic_dec_if_positive(&kmod_concurrent_max) >= 0 check. If the value
is positive, there must have been other wake_up() calls or there
is no waiter.

IMHO, this works because kmod_concurrent_max handling is atomic
and race-less now. Also (s)wait_event_interruptible() is safe
and does not allow going to sleep when the resource is available.

Anyway, it is great that you have double checked this.

Best Regards,
Petr

* Re: [PATCH v3 4/4] kmod: throttle kmod thread limit
  2017-06-23 19:16             ` Luis R. Rodriguez
@ 2017-06-26 10:03               ` Petr Mladek
  0 siblings, 0 replies; 69+ messages in thread
From: Petr Mladek @ 2017-06-26 10:03 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme,
	corbet, josh, martin.wilck, mmarek, hare, rwright, jeffm,
	DSterba, fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel

On Fri 2017-06-23 21:16:37, Luis R. Rodriguez wrote:
> On Fri, Jun 23, 2017 at 07:56:11PM +0200, Luis R. Rodriguez wrote:
> > On Fri, Jun 23, 2017 at 06:16:19PM +0200, Luis R. Rodriguez wrote:
> > > On Thu, Jun 22, 2017 at 05:19:36PM +0200, Petr Mladek wrote:
> > > > On Fri 2017-05-26 14:12:28, Luis R. Rodriguez wrote:
> > > > > --- a/kernel/kmod.c
> > > > > +++ b/kernel/kmod.c
> > > > > @@ -178,6 +175,7 @@ int __request_module(bool wait, const char *fmt, ...)
> > > > >  	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
> > > > >  
> > > > >  	atomic_inc(&kmod_concurrent_max);
> > > > > +	wake_up_all(&kmod_wq);
> > > > 
> > > > Does it make sense to wake up all waiters when we released the resource
> > > > only for one? IMHO, a simple wake_up() should be here.
> > > 
> > > Then we should wake_up() also on failure, otherwise we have the potential
> > > to not wake some in a proper time.
> > 
> > I checked and it turns out we have no error paths after we consume a kmod
> > ticket, if you will. Once we bump with atomic_dec_if_positive() we assume
> > we're moving forward with an attempt, and the only failure path is already
> > bundled with a wake at the end of the __request_module() call.
> > 
> > Then the next question would be *who* exactly gets woken up next if we just
> > use wake_up() ? The common core wake up code varies depending on use and
> > all this reminded me of the complexity we just don't need, so I have now
> > converted to use swait. swait uses list_add() if empty and then iterates
> > with list_first_entry() on wakeup, so that should get the first item added
> > to the wait list.
> > 
> > Works for me. Will run a test before v4 is sent, but since only 2 patches
> > are modified I will only send a respective update for these 2 patches.
> 
> Alright, this worked out well! It's just a tiny bit slower on test cases 0008
> and 0009 (a few seconds) but that's fine, it's natural due to the lack of the
> swake_up_all().

This is interesting. I guess that it was faster with swake_up_all()
because it worked as a speculative pre-wake. I mean that it takes some
time between adding a process to the run-queue and really running it.
IMHO, swake_up_all() meant that __request_module() callers were
more often actually running and trying to pass the
atomic_dec_if_positive(&kmod_concurrent_max) >= 0 check.

Best Regards,
Petr

* Re: [PATCH v3 2/4] kmod: reduce atomic operations on kmod_concurrent and simplify
  2017-05-26 21:12     ` [PATCH v3 2/4] kmod: reduce atomic operations on kmod_concurrent and simplify Luis R. Rodriguez
  2017-06-23 19:19       ` [PATCH v4 " Luis R. Rodriguez
@ 2017-06-26 11:36       ` Petr Mladek
  1 sibling, 0 replies; 69+ messages in thread
From: Petr Mladek @ 2017-06-26 11:36 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme,
	corbet, josh, martin.wilck, mmarek, hare, rwright, jeffm,
	DSterba, fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel

On Fri 2017-05-26 14:12:26, Luis R. Rodriguez wrote:
> atomic operations. We do this by inverting the logic of the enabler:
> instead of incrementing kmod_concurrent as we get new kmod users, define the
> variable kmod_concurrent_max as the max number of currently allowed kmod
> users and as we get new kmod users just decrement it if it's still positive.
> This combines the dec and read in one atomic operation.
> 
> In this case we no longer get the same false failure:
> 
> CPU1			CPU2
> atomic_dec_if_positive()
> 			atomic_dec_if_positive()
> atomic_inc()
> 			atomic_inc()
> 
> Suggested-by: Petr Mladek <pmladek@suse.com>
> Suggested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>

The change looks fine to me. The code is much easier and less hacky.

Reviewed-by: Petr Mladek <pmladek@suse.com>
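
The difference between the two accounting schemes in the quoted commit message can be replayed in userspace. This is only a sketch under assumptions: it uses C11 atomics instead of atomic_t, it assumes the old code incremented kmod_concurrent and then read it back in a separate step (as the commit message implies), and all function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Old scheme (assumed shape): count users upward, in three separate steps. */
static void old_inc(atomic_int *users)
{
	atomic_fetch_add(users, 1);
}

static bool old_over_limit(atomic_int *users, int limit)
{
	return atomic_load(users) > limit;
}

static void old_dec(atomic_int *users)
{
	atomic_fetch_sub(users, 1);
}

/* New scheme: count free slots downward, test and decrement in one step. */
static bool new_get(atomic_int *slots)
{
	int old = atomic_load(slots);

	while (old > 0) {
		/* on failure 'old' is refreshed and the loop re-checks it */
		if (atomic_compare_exchange_weak(slots, &old, old - 1))
			return true;
	}
	return false;
}
```

With a limit of one and two callers, the old steps can interleave as inc, inc, read, read: both callers then see a count of two and both bail out, although one slot was free. With new_get() and one free slot, the first caller always succeeds and only the second is turned away, whatever the interleaving.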

Best Regards,
Petr

* Re: [PATCH v4 4/4] kmod: throttle kmod thread limit
  2017-06-23 19:20       ` [PATCH v4 " Luis R. Rodriguez
@ 2017-06-26 11:38         ` Petr Mladek
  2017-06-28 22:11           ` Luis R. Rodriguez
  0 siblings, 1 reply; 69+ messages in thread
From: Petr Mladek @ 2017-06-26 11:38 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: akpm, jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme,
	corbet, josh, martin.wilck, mmarek, hare, rwright, jeffm,
	DSterba, fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, linux-kselftest,
	linux-doc, linux-kernel

On Fri 2017-06-23 12:20:11, Luis R. Rodriguez wrote:
> If we reach the limit of modprobe_limit threads running, the next
> request_module() call will fail. The original reason for adding
> a kill was to do away with possible issues in old circumstances
> which would create a recursive series of request_module() calls.
> 
> We can do better than just be super aggressive and reject calls
> once we've reached the limit by simply making pending callers wait
> until the threshold has been reduced, and then throttling them in,
> one by one.
> 
> This throttling enables requests over the kmod concurrent limit to
> be processed once a pending request completes. Only the first item
> queued up to wait is woken up. The assumption here is that once a task
> is woken it will have no other option but to also kick the queue to check
> if there are more pending tasks -- regardless of whether or not it
> was successful.
> 
> By throttling and processing only max kmod concurrent tasks we ensure
> we avoid unexpected fatal request_module() calls, and we keep memory
> consumption on module loading to a minimum.
> 
> With x86_64 qemu, with 4 cores, 4 GiB of RAM it takes the following run
> time to run both tests:
> 
> time ./kmod.sh -t 0008
> real    0m16.523s
> user    0m0.879s
> sys     0m8.977s
> 
> time ./kmod.sh -t 0009
> real    0m56.080s
> user    0m0.717s
> sys     0m10.324s
> 
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>

All the changes look fine to me. They make perfect sense.
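
For readers who want to experiment with the throttling idea outside the kernel, here is a userspace analogue. It is a sketch under assumptions, not the patch: a pthread mutex and condition variable stand in for the atomic counter and the wait queue, sched_yield() stands in for the modprobe call, and every name in it (struct throttle, run_throttle_demo() and so on) is hypothetical. Note that pthread_cond_signal() wakes at least one waiter but does not promise it is the first one queued.

```c
#include <pthread.h>
#include <sched.h>

#define MAX_WORKERS 64

struct throttle {
	pthread_mutex_t lock;
	pthread_cond_t wq;	/* plays the role of the wait queue */
	int slots;		/* free slots, like kmod_concurrent_max */
	int running;		/* callers currently holding a slot */
	int peak;		/* highest number of concurrent holders seen */
	int completed;
};

static void throttle_get(struct throttle *t)
{
	pthread_mutex_lock(&t->lock);
	while (t->slots == 0)	/* over the limit: sleep instead of failing */
		pthread_cond_wait(&t->wq, &t->lock);
	t->slots--;
	if (++t->running > t->peak)
		t->peak = t->running;
	pthread_mutex_unlock(&t->lock);
}

static void throttle_put(struct throttle *t)
{
	pthread_mutex_lock(&t->lock);
	t->slots++;
	t->running--;
	t->completed++;
	pthread_mutex_unlock(&t->lock);
	pthread_cond_signal(&t->wq);	/* one slot freed, wake one waiter */
}

static void *worker(void *arg)
{
	struct throttle *t = arg;

	throttle_get(t);
	sched_yield();		/* stand-in for the modprobe call */
	throttle_put(t);
	return NULL;
}

/* Runs nworkers threads against 'limit' slots; returns how many completed. */
static int run_throttle_demo(int nworkers, int limit, int *peak_out)
{
	struct throttle t = { .slots = limit < 1 ? 1 : limit };
	pthread_t tids[MAX_WORKERS];
	int started = 0;

	if (nworkers > MAX_WORKERS)
		nworkers = MAX_WORKERS;
	pthread_mutex_init(&t.lock, NULL);
	pthread_cond_init(&t.wq, NULL);

	for (int i = 0; i < nworkers; i++)
		if (pthread_create(&tids[started], NULL, worker, &t) == 0)
			started++;
	for (int i = 0; i < started; i++)
		pthread_join(tids[i], NULL);

	*peak_out = t.peak;
	pthread_cond_destroy(&t.wq);
	pthread_mutex_destroy(&t.lock);
	return t.completed;
}
```

Calling run_throttle_demo(8, 2, &peak) lets all eight workers finish while never more than two hold a slot at once: callers over the limit are delayed rather than rejected.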

Best Regards,
Petr

* Re: [PATCH v3 0/4] kmod: help make deterministic
  2017-06-21  0:23       ` Kees Cook
@ 2017-06-26 21:37         ` Jessica Yu
  2017-06-26 22:44           ` Luis R. Rodriguez
  0 siblings, 1 reply; 69+ messages in thread
From: Jessica Yu @ 2017-06-26 21:37 UTC (permalink / raw)
  To: Kees Cook
  Cc: Luis R. Rodriguez, Andrew Morton, Shuah Khan, Rusty Russell,
	Eric W. Biederman, Dmitry Torokhov, Arnaldo Carvalho de Melo,
	Jonathan Corbet, Josh Triplett, martin.wilck, Michal Marek,
	Petr Mladek, hare, rwright, Jeff Mahoney, David Sterba,
	Filipe Manana, NeilBrown, Guenter Roeck, rgoldwyn,
	Subash Abhinov Kasiviswanathan, Heinrich Schuchardt,
	Aaron Tomlin, Miroslav Benes, Paul E. McKenney, Dan Williams,
	Josh Poimboeuf, David S. Miller, Ingo Molnar, Alan Cox,
	Ted Ts'o, Greg KH, Linus Torvalds, linux-kselftest,
	linux-doc, LKML

+++ Kees Cook [20/06/17 17:23 -0700]:
>On Tue, Jun 20, 2017 at 1:56 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>> On Fri, May 26, 2017 at 02:12:24PM -0700, Luis R. Rodriguez wrote:
>>> This v3 nukes the proc sysctl interface in favor of just letting userspace
>>> check the kernel revision. Prior to whenever this is merged userspace should
>>> try to avoid hammering more than 50 kmod threads as they can fail and it'd
>>> get -ENOMEM.
>>>
>>> We do away with the old heuristics on assuming you could end up with
>>> less than max_threads/2 < 50 threads as Dmitry notes this would mean having
>>> a system with 16 MiB of RAM with modules enabled. It simplifies our patch
>>> "kmod: reduce atomic operations on kmod_concurrent" considerably.
>>>
>>> Since the sysctl interface is gone, this no longer depends on any
>>> other patches, the series is independent. As usual the series is
>>> available on my linux-next 20170526-kmod-only branch which is based
>>> on next-20170526.
>>>
>>> [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20170526-kmod-only
>>>
>>>   Luis
>>>
>>> Luis R. Rodriguez (4):
>>>   module: use list_for_each_entry_rcu() on find_module_all()
>>>   kmod: reduce atomic operations on kmod_concurrent and simplify
>>>   kmod: add test driver to stress test the module loader
>>>   kmod: throttle kmod thread limit
>>
>> About a month now with no further nitpicks. What tree should these changes
>> go through if there are no issues? Andrew's, Jessica's ?
>
>Seems like going through Jessica's would make the most sense?

Would be happy to take patches 01 (which I need to anyway), 02,
possibly 04 if decoupled from the test driver (03). I can't take patch
03 through my tree just yet, as I haven't had time to give it a look
yet :-/

[ Side comment, it seems that kmod.c isn't directly maintained by anyone
right now, perhaps Luis would be interested in picking it up? :-) ]

Thanks,

Jessica

* Re: [PATCH v3 0/4] kmod: help make deterministic
  2017-06-26 21:37         ` Jessica Yu
@ 2017-06-26 22:44           ` Luis R. Rodriguez
  2017-06-27  0:27             ` Luis R. Rodriguez
  2017-06-27 15:26             ` Jessica Yu
  0 siblings, 2 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-06-26 22:44 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Kees Cook, Luis R. Rodriguez, Andrew Morton, Shuah Khan,
	Rusty Russell, Eric W. Biederman, Dmitry Torokhov,
	Arnaldo Carvalho de Melo, Jonathan Corbet, Josh Triplett,
	martin.wilck, Michal Marek, Petr Mladek, hare, rwright,
	Jeff Mahoney, David Sterba, Filipe Manana, NeilBrown,
	Guenter Roeck, rgoldwyn, Subash Abhinov Kasiviswanathan,
	Heinrich Schuchardt, Aaron Tomlin, Miroslav Benes,
	Paul E. McKenney, Dan Williams, Josh Poimboeuf, David S. Miller,
	Ingo Molnar, Alan Cox, Ted Ts'o, Greg KH, Linus Torvalds,
	linux-kselftest, linux-doc, LKML

On Mon, Jun 26, 2017 at 11:37:36PM +0200, Jessica Yu wrote:
> +++ Kees Cook [20/06/17 17:23 -0700]:
> > On Tue, Jun 20, 2017 at 1:56 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > On Fri, May 26, 2017 at 02:12:24PM -0700, Luis R. Rodriguez wrote:
> > > > This v3 nukes the proc sysctl interface in favor of just letting userspace
> > > > check the kernel revision. Prior to whenever this is merged userspace should
> > > > try to avoid hammering more than 50 kmod threads as they can fail and it'd
> > > > get -ENOMEM.
> > > > 
> > > > We do away with the old heuristics on assuming you could end up with
> > > > less than max_threads/2 < 50 threads as Dmitry notes this would mean having
> > > > a system with 16 MiB of RAM with modules enabled. It simplifies our patch
> > > > "kmod: reduce atomic operations on kmod_concurrent" considerably.
> > > > 
> > > > Since the sysctl interface is gone, this no longer depends on any
> > > > other patches, the series is independent. As usual the series is
> > > > available on my linux-next 20170526-kmod-only branch which is based
> > > > on next-20170526.
> > > > 
> > > > [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20170526-kmod-only
> > > > 
> > > >   Luis
> > > > 
> > > > Luis R. Rodriguez (4):
> > > >   module: use list_for_each_entry_rcu() on find_module_all()
> > > >   kmod: reduce atomic operations on kmod_concurrent and simplify
> > > >   kmod: add test driver to stress test the module loader
> > > >   kmod: throttle kmod thread limit
> > > 
> > > About a month now with no further nitpicks. What tree should these changes
> > > go through if there are no issues? Andrew's, Jessica's ?
> > 
> > Seems like going through Jessica's would make the most sense?
> 
> Would be happy to take patches 01 (which I need to anyway), 02,
> possibly 04 if decoupled from the test driver (03).

Feel free to decouple it, but note that the commit log must then be
changed. My own take is this fix is not so critical as it is a corner case, so
I have instead preferred to couple in the test case and respective fix
together. I'll leave it up to you how to proceed.

> I can't take patch 03 through my tree just yet, as I haven't had time to give
> it a look yet :-/

Understood. I'd appreciate at least a review though.

> [ Side comment, it seems that kmod.c isn't directly maintained by anyone
> right now, perhaps Luis would be interested in picking it up? :-) ]

Sure thing, though I'm not sure if it makes sense to decouple kernel/kmod.c in
MAINTAINERS. If you do, let me know what you'd prefer to call it,
"KMOD MODULE USERMODE HELPER" ?

If you prefer to keep them together I can certainly volunteer to review all
kmod changes and can send a patch to add kmod and myself under "MODULE
SUPPORT".

Either way, let me know!

  Luis

* Re: [PATCH v3 0/4] kmod: help make deterministic
  2017-06-26 22:44           ` Luis R. Rodriguez
@ 2017-06-27  0:27             ` Luis R. Rodriguez
  2017-06-27  8:13               ` Petr Mladek
  2017-06-27 15:26             ` Jessica Yu
  1 sibling, 1 reply; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-06-27  0:27 UTC (permalink / raw)
  To: Jessica Yu, Linus Torvalds, Peter Zijlstra
  Cc: Jessica Yu, Kees Cook, Luis R. Rodriguez, Andrew Morton,
	Shuah Khan, Rusty Russell, Eric W. Biederman, Dmitry Torokhov,
	Arnaldo Carvalho de Melo, Jonathan Corbet, Josh Triplett,
	martin.wilck, Michal Marek, Petr Mladek, hare, rwright,
	Jeff Mahoney, David Sterba, Filipe Manana, NeilBrown,
	Guenter Roeck, rgoldwyn, Subash Abhinov Kasiviswanathan,
	Heinrich Schuchardt, Aaron Tomlin, Miroslav Benes,
	Paul E. McKenney, Dan Williams, Josh Poimboeuf, David S. Miller,
	Ingo Molnar, Alan Cox, Ted Ts'o, Greg KH, linux-kselftest,
	linux-doc, LKML

On Tue, Jun 27, 2017 at 12:44:42AM +0200, Luis R. Rodriguez wrote:
> On Mon, Jun 26, 2017 at 11:37:36PM +0200, Jessica Yu wrote:
> > +++ Kees Cook [20/06/17 17:23 -0700]:
> > > On Tue, Jun 20, 2017 at 1:56 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > > On Fri, May 26, 2017 at 02:12:24PM -0700, Luis R. Rodriguez wrote:
> > > > > Luis R. Rodriguez (4):
> > > > >   module: use list_for_each_entry_rcu() on find_module_all()
> > > > >   kmod: reduce atomic operations on kmod_concurrent and simplify
> > > > >   kmod: add test driver to stress test the module loader
> > > > >   kmod: throttle kmod thread limit
> > > > 
> > > > About a month now with no further nitpicks. What tree should these changes
> > > > go through if there are no issues? Andrew's, Jessica's ?
> > > 
> > > Seems like going through Jessica's would make the most sense?
> > 
> > Would be happy to take patches 01 (which I need to anyway), 02,
> > possibly 04 if decoupled from the test driver (03).
> 
> Feel free to decouple it, but note that the commit log must then be
> changed. My own take is this fix is not so critical as it is a corner case, so
> I have instead preferred to couple in the test case and respective fix
> together. I'll leave it up to you how to proceed.

Note: Linus noted swait is actually a very special use case [0], so I'd hate to
add a new use case it was not vetted for. This use case in kmod.c really does *not*
require anything but a simple wait though, so really am inclined to let that
through unless I hear back...

[0] https://lkml.kernel.org/r/20170627001534.GK21846@wotan.suse.de

  Luis

* Re: [PATCH v3 0/4] kmod: help make deterministic
  2017-06-27  0:27             ` Luis R. Rodriguez
@ 2017-06-27  8:13               ` Petr Mladek
  2017-06-27 10:04                 ` Jessica Yu
  0 siblings, 1 reply; 69+ messages in thread
From: Petr Mladek @ 2017-06-27  8:13 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Jessica Yu, Linus Torvalds, Peter Zijlstra, Kees Cook,
	Andrew Morton, Shuah Khan, Rusty Russell, Eric W. Biederman,
	Dmitry Torokhov, Arnaldo Carvalho de Melo, Jonathan Corbet,
	Josh Triplett, martin.wilck, Michal Marek, hare, rwright,
	Jeff Mahoney, David Sterba, Filipe Manana, NeilBrown,
	Guenter Roeck, rgoldwyn, Subash Abhinov Kasiviswanathan,
	Heinrich Schuchardt, Aaron Tomlin, Miroslav Benes,
	Paul E. McKenney, Dan Williams, Josh Poimboeuf, David S. Miller,
	Ingo Molnar, Alan Cox, Ted Ts'o, Greg KH, linux-kselftest,
	linux-doc, LKML

On Tue 2017-06-27 02:27:44, Luis R. Rodriguez wrote:
> On Tue, Jun 27, 2017 at 12:44:42AM +0200, Luis R. Rodriguez wrote:
> > On Mon, Jun 26, 2017 at 11:37:36PM +0200, Jessica Yu wrote:
> > > +++ Kees Cook [20/06/17 17:23 -0700]:
> > > > On Tue, Jun 20, 2017 at 1:56 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > > > On Fri, May 26, 2017 at 02:12:24PM -0700, Luis R. Rodriguez wrote:
> > > > > > Luis R. Rodriguez (4):
> > > > > >   module: use list_for_each_entry_rcu() on find_module_all()
> > > > > >   kmod: reduce atomic operations on kmod_concurrent and simplify
> > > > > >   kmod: add test driver to stress test the module loader
> > > > > >   kmod: throttle kmod thread limit
> > > > > 
> > > > > About a month now with no further nitpicks. What tree should these changes
> > > > > go through if there are no issues? Andrew's, Jessica's ?
> > > > 
> > > > Seems like going through Jessica's would make the most sense?
> > > 
> > > Would be happy to take patches 01 (which I need to anyway), 02,
> > > possibly 04 if decoupled from the test driver (03).
> > 
> > Feel free to decouple it, but note that the commit log must then be
> > changed. My own take is this fix is not so critical as it is a corner case, so
> > I have instead preferred to couple in the test case and respective fix
> > together. I'll leave it up to you how to proceed.
> 
> Note: Linus noted swait is actually a very special use case [0], so I'd hate to
> add a new use case it was not vetted for. This use case in kmod.c really does *not*
> require anything but a simple wait though, so really am inclined to let that
> through unless I hear back...
> 
> [0] https://lkml.kernel.org/r/20170627001534.GK21846@wotan.suse.de

Heh, I was not aware of this special case either. The welcoming
comment of the swait API confused me as well.

In this light, I suggest switching the patch to the normal wait API.

Best Regards,
Petr

* Re: [PATCH v3 0/4] kmod: help make deterministic
  2017-06-27  8:13               ` Petr Mladek
@ 2017-06-27 10:04                 ` Jessica Yu
  0 siblings, 0 replies; 69+ messages in thread
From: Jessica Yu @ 2017-06-27 10:04 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Luis R. Rodriguez, Linus Torvalds, Peter Zijlstra, Kees Cook,
	Andrew Morton, Shuah Khan, Rusty Russell, Eric W. Biederman,
	Dmitry Torokhov, Arnaldo Carvalho de Melo, Jonathan Corbet,
	Josh Triplett, martin.wilck, Michal Marek, hare, rwright,
	Jeff Mahoney, David Sterba, Filipe Manana, NeilBrown,
	Guenter Roeck, rgoldwyn, Subash Abhinov Kasiviswanathan,
	Heinrich Schuchardt, Aaron Tomlin, Miroslav Benes,
	Paul E. McKenney, Dan Williams, Josh Poimboeuf, David S. Miller,
	Ingo Molnar, Alan Cox, Ted Ts'o, Greg KH, linux-kselftest,
	linux-doc, LKML

+++ Petr Mladek [27/06/17 10:13 +0200]:
>On Tue 2017-06-27 02:27:44, Luis R. Rodriguez wrote:
>> On Tue, Jun 27, 2017 at 12:44:42AM +0200, Luis R. Rodriguez wrote:
>> > On Mon, Jun 26, 2017 at 11:37:36PM +0200, Jessica Yu wrote:
>> > > +++ Kees Cook [20/06/17 17:23 -0700]:
>> > > > On Tue, Jun 20, 2017 at 1:56 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>> > > > > On Fri, May 26, 2017 at 02:12:24PM -0700, Luis R. Rodriguez wrote:
>> > > > > > Luis R. Rodriguez (4):
>> > > > > >   module: use list_for_each_entry_rcu() on find_module_all()
>> > > > > >   kmod: reduce atomic operations on kmod_concurrent and simplify
>> > > > > >   kmod: add test driver to stress test the module loader
>> > > > > >   kmod: throttle kmod thread limit
>> > > > >
>> > > > > About a month now with no further nitpicks. What tree should these changes
>> > > > > go through if there are no issues? Andrew's, Jessica's ?
>> > > >
>> > > > Seems like going through Jessica's would make the most sense?
>> > >
>> > > Would be happy to take patches 01 (which I need to anyway), 02,
>> > > possibly 04 if decoupled from the test driver (03).
>> >
>> > Feel free to decouple it, but note that the commit log must then be
>> > changed. My own take is this fix is not so critical as it is a corner case, so
>> > I have instead preferred to couple in the test case and respective fix
>> > together. I'll leave it up to you how to proceed.
>>
>> Note: Linus noted swait is actually a very special use case [0], so I'd hate to
>> add a new use case it was not vetted for. This use case in kmod.c really does *not*
>> require anything but a simple wait though, so really am inclined to let that
>> through unless I hear back...
>>
>> [0] https://lkml.kernel.org/r/20170627001534.GK21846@wotan.suse.de
>
>Heh, I was not aware of this special case either. The welcoming
>comment of the swait API confused me as well.
>
>In this light, I suggest switching the patch to the normal wait API.

Huh, I wasn't aware either :-/ But I agree, judging from Linus'
response [0], it's probably best to use the well established wait_*
variants. I'm not sure I understood why the patch switched to swait,
but in any case I don't think we'd be hitting the "thundering herd"
problem very often here (and if we do, we could just use exclusive
wait. But in that scenario I'd be more interested in why a normal
system would be battered with more than 50 in-kernel modprobe requests
at a time).

[0] https://marc.info/?l=linux-kernel&m=149851347228696&w=2
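
The "thundering herd" cost mentioned above can be counted with a small single-threaded model. This is a sketch under assumptions and does not measure anything real: woken tasks are simply replayed one after another, scheduling latency is ignored, and the names try_get_slot() and release_and_wake() are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Take one free slot if there is one; never drives the counter below zero. */
static bool try_get_slot(atomic_int *slots)
{
	int old = atomic_load(slots);

	while (old > 0) {
		/* on failure 'old' is refreshed and the loop re-checks it */
		if (atomic_compare_exchange_weak(slots, &old, old - 1))
			return true;
	}
	return false;
}

/*
 * Free one slot, then let 'nwake' of the 'nwaiters' sleeping tasks retry.
 * Returns how many woken tasks found no slot and have to sleep again.
 */
static size_t release_and_wake(atomic_int *slots, size_t nwaiters, size_t nwake)
{
	size_t wasted = 0;

	atomic_fetch_add(slots, 1);
	if (nwake > nwaiters)
		nwake = nwaiters;
	for (size_t i = 0; i < nwake; i++)
		if (!try_get_slot(slots))
			wasted++;
	return wasted;
}
```

With ten waiters and one freed slot, waking one task wastes no wake-up, while waking all ten leaves nine tasks that ran only to go back to sleep. What the model cannot show is the latency effect discussed earlier in the thread, where waking everyone made the tests slightly faster.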

* Re: [PATCH v3 0/4] kmod: help make deterministic
  2017-06-26 22:44           ` Luis R. Rodriguez
  2017-06-27  0:27             ` Luis R. Rodriguez
@ 2017-06-27 15:26             ` Jessica Yu
  2017-06-28  0:49               ` Luis R. Rodriguez
  1 sibling, 1 reply; 69+ messages in thread
From: Jessica Yu @ 2017-06-27 15:26 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Kees Cook, Andrew Morton, Shuah Khan, Rusty Russell,
	Eric W. Biederman, Dmitry Torokhov, Arnaldo Carvalho de Melo,
	Jonathan Corbet, Josh Triplett, martin.wilck, Michal Marek,
	Petr Mladek, hare, rwright, Jeff Mahoney, David Sterba,
	Filipe Manana, NeilBrown, Guenter Roeck, rgoldwyn,
	Subash Abhinov Kasiviswanathan, Heinrich Schuchardt,
	Aaron Tomlin, Miroslav Benes, Paul E. McKenney, Dan Williams,
	Josh Poimboeuf, David S. Miller, Ingo Molnar, Alan Cox,
	Ted Ts'o, Greg KH, Linus Torvalds, linux-kselftest,
	linux-doc, LKML

+++ Luis R. Rodriguez [27/06/17 00:44 +0200]:
>On Mon, Jun 26, 2017 at 11:37:36PM +0200, Jessica Yu wrote:
>> +++ Kees Cook [20/06/17 17:23 -0700]:
>> > On Tue, Jun 20, 2017 at 1:56 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>> > > On Fri, May 26, 2017 at 02:12:24PM -0700, Luis R. Rodriguez wrote:
>> > > > This v3 nukes the proc sysctl interface in favor of just letting userspace
>> > > > check the kernel revision. Prior to whenever this is merged userspace should
>> > > > try to avoid hammering more than 50 kmod threads as they can fail and it'd
>> > > > get -ENOMEM.
>> > > >
>> > > > We do away with the old heuristics on assuming you could end up with
>> > > > less than max_threads/2 < 50 threads as Dmitry notes this would mean having
>> > > > a system with 16 MiB of RAM with modules enabled. It simplifies our patch
>> > > > "kmod: reduce atomic operations on kmod_concurrent" considerably.
>> > > >
>> > > > Since the sysctl interface is gone, this no longer depends on any
>> > > > other patches, the series is independent. As usual the series is
>> > > > available on my linux-next 20170526-kmod-only branch which is based
>> > > > on next-20170526.
>> > > >
>> > > > [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20170526-kmod-only
>> > > >
>> > > >   Luis
>> > > >
>> > > > Luis R. Rodriguez (4):
>> > > >   module: use list_for_each_entry_rcu() on find_module_all()
>> > > >   kmod: reduce atomic operations on kmod_concurrent and simplify
>> > > >   kmod: add test driver to stress test the module loader
>> > > >   kmod: throttle kmod thread limit
>> > >
>> > > About a month now with no further nitpicks. What tree should these changes
>> > > go through if there are no issues? Andrew's, Jessica's ?
>> >
>> > Seems like going through Jessica's would make the most sense?
>>
>> Would be happy to take patches 01 (which I need to anyway), 02,
>> possibly 04 if decoupled from the test driver (03).
>
>Feel free to decouple it, but note that the commit log must then be
>changed. My own take is this fix is not so critical as it is a corner case, so
>I have instead preferred to couple in the test case and respective fix
>together. I'll leave it up to you how to proceed.

I'll take 01 and 02 for the next merge window, as they are
straightforward. 03/04 can stay together, and as I understand it 04
may need to switch back to using the normal wait_* api.

>> I can't take patch 03 through my tree just yet, as I haven't had time to give
>> it a look yet :-/
>
>Understood. I'd appreciate at least a review though.

Of course! I should have rephrased and said *by this upcoming merge window. 

>> [ Side comment, it seems that kmod.c isn't directly maintained by anyone
>> right now, perhaps Luis would be interested in picking it up? :-) ]
>
>Sure thing, though I'm not sure if it makes sense to decouple kernel/kmod.c in
>MAINTAINERS. If you do, let me know what you'd prefer to call it,
>"KMOD MODULE USERMODE HELPER" ?
>
>If you prefer to keep them together I can certainly volunteer to review all
>kmod changes and can send a patch to add kmod and myself under "MODULE
>SUPPORT".

I'm not the maintainer for kmod.c, if that's what you mean by
decoupling. But I don't think it has one, which is why I'm suggesting
adding it to MAINTAINERS, since you've been actively working on it :)
(looking at git log, it looks like Andrew did most of the sign-off's
for kmod.c in the past). I think a separate entry in MAINTAINERS is
good, with the name you suggested.

Thanks!

Jessica

* Re: [PATCH v3 0/4] kmod: help make deterministic
  2017-06-27 15:26             ` Jessica Yu
@ 2017-06-28  0:49               ` Luis R. Rodriguez
  0 siblings, 0 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-06-28  0:49 UTC (permalink / raw)
  To: Jessica Yu
  Cc: Luis R. Rodriguez, Kees Cook, Andrew Morton, Shuah Khan,
	Rusty Russell, Eric W. Biederman, Dmitry Torokhov,
	Arnaldo Carvalho de Melo, Jonathan Corbet, Josh Triplett,
	martin.wilck, Michal Marek, Petr Mladek, hare, rwright,
	Jeff Mahoney, David Sterba, Filipe Manana, NeilBrown,
	Guenter Roeck, rgoldwyn, Subash Abhinov Kasiviswanathan,
	Heinrich Schuchardt, Aaron Tomlin, Miroslav Benes,
	Paul E. McKenney, Dan Williams, Josh Poimboeuf, David S. Miller,
	Ingo Molnar, Alan Cox, Ted Ts'o, Greg KH, Linus Torvalds,
	linux-kselftest, linux-doc, LKML

On Tue, Jun 27, 2017 at 05:26:05PM +0200, Jessica Yu wrote:
> +++ Luis R. Rodriguez [27/06/17 00:44 +0200]:
> > Feel free to decouple it, but note that the commit log must then be
> > changed. My own take is this fix is not so critical as it is a corner case, so
> > I have instead preferred to couple in the test case and respective fix
> > together. I'll leave it up to you how to proceed.
> 
> I'll take 01 and 02 for the next merge window, as they are
> straightforward. 03/04 can stay together, and as I understand it 04
> may need to switch back to using the normal wait_* api.

OK, I'll rework 03-04 with the regular wait and submit aiming towards Andrew's tree.

> I'm not the maintainer for kmod.c, if that's what you mean by
> decoupling.

I was suggesting adding kmod.c to the MODULE SUPPORT list as it's all module
related, adding myself to the list, and I'd just take on helping with kmod.c stuff.
My point was that it feels odd to decouple kmod.c from module stuff. But let's
give it a shot with them separated first and see how that goes.

>  But I don't think it has one, which is why I'm suggesting
> adding it to MAINTAINERS, since you've been actively working on it :)
> (looking at git log, it looks like Andrew did most of the sign-off's
> for kmod.c in the past). I think a separate entry in MAINTAINERS is
> good, with the name you suggested.

Sure.

  Luis

* Re: [PATCH v4 4/4] kmod: throttle kmod thread limit
  2017-06-26 11:38         ` Petr Mladek
@ 2017-06-28 22:11           ` Luis R. Rodriguez
  0 siblings, 0 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-06-28 22:11 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Luis R. Rodriguez, akpm, jeyu, shuah, rusty, ebiederm,
	dmitry.torokhov, acme, corbet, josh, martin.wilck, mmarek, hare,
	rwright, jeffm, DSterba, fdmanana, neilb, linux, rgoldwyn,
	subashab, xypron.glpk, keescook, atomlin, mbenes, paulmck,
	dan.j.williams, jpoimboe, davem, mingo, alan, tytso, gregkh,
	torvalds, linux-kselftest, linux-doc, linux-kernel

On Mon, Jun 26, 2017 at 01:38:31PM +0200, Petr Mladek wrote:
> On Fri 2017-06-23 12:20:11, Luis R. Rodriguez wrote:
> > If we reach the limit of modprobe_limit threads running, the next
> > request_module() call will fail. The original reason for adding
> > a kill was to do away with possible issues in old circumstances
> > which would create a recursive series of request_module() calls.
> > 
> > We can do better than just be super aggressive and reject calls
> > once we've reached the limit by simply making pending callers wait
> > until the threshold has been reduced, and then throttling them in,
> > one by one.
> > 
> > This throttling enables requests over the kmod concurrent limit to
> > be processed once a pending request completes. Only the first item
> > queued up to wait is woken up. The assumption here is that once a task
> > is woken it will have no other option but to also kick the queue to check
> > if there are more pending tasks -- regardless of whether or not it
> > was successful.
> > 
> > By throttling and processing only max kmod concurrent tasks we ensure
> > we avoid unexpected fatal request_module() calls, and we keep memory
> > consumption on module loading to a minimum.
> > 
> > With x86_64 qemu, with 4 cores, 4 GiB of RAM it takes the following run
> > time to run both tests:
> > 
> > time ./kmod.sh -t 0008
> > real    0m16.523s
> > user    0m0.879s
> > sys     0m8.977s
> > 
> > time ./kmod.sh -t 0009
> > real    0m56.080s
> > user    0m0.717s
> > sys     0m10.324s
> > 
> > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> 
> All the changes look fine to me. They make perfect sense.

Thanks, I'll peg a

Reviewed-by: Petr Mladek <pmladek@suse.com>

  Luis

* [PATCH v4 0/3] kmod: help make deterministic
  2017-05-26 21:12   ` [PATCH v3 0/4] kmod: help make deterministic Luis R. Rodriguez
                       ` (4 preceding siblings ...)
  2017-06-20 20:56     ` [PATCH v3 0/4] kmod: help make deterministic Luis R. Rodriguez
@ 2017-06-28 22:31     ` Luis R. Rodriguez
  2017-06-28 22:31       ` [PATCH v4 1/3] MAINTAINERS: give kmod some maintainer love Luis R. Rodriguez
                         ` (2 more replies)
  5 siblings, 3 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-06-28 22:31 UTC (permalink / raw)
  To: akpm
  Cc: jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme, josh,
	martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, dave,
	linux-kselftest, linux-kernel, Luis R. Rodriguez

Andrew,

I'm submitting the last few patches of my kmod series through your tree and as
suggested by Jessica I am picking up maintenance on kmod. These last changes
can either wait until the next release or can be merged now provided someone
reviews the kmod test driver and the merge conflict is addressed with Jessica's
tree with the kmod patch she picked up.

This v4 moves back again from swait to regular waitqueue as suggested generally
by Linus as swait is very specialized [0], and it also drops two patches picked
up by Jessica. Jessica picked the following two patches visible on linux-next
tag next-20170628 as follows:

165d1cc0074b kmod: reduce atomic operations on kmod_concurrent and simplify
93437353daef module: use list_for_each_entry_rcu() on find_module_all()

FWIW, the swait --> wait change sped up the tests only marginally, and the
results were generally consistent across a spread of runs. Figured some
performance folks might be interested in that tidbit, should they wish to go
play.

With wait:

time ./kmod.sh -t 0008
real    0m16.366s
user    0m0.883s
sys     0m8.916s

time ./kmod.sh -t 0009
real    0m50.803s
user    0m0.791s
sys     0m9.852s
                                                                               
With swait: 

time ./kmod.sh -t 0008
real    0m16.523s
user    0m0.879s
sys     0m8.977s

time ./kmod.sh -t 0009
real    0m51.258s
user    0m0.812s
sys     0m10.133s
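
For anyone who wants to put a number on "marginally", here is a small sketch
that turns the "real" times above into a percentage. The helper name
pct_faster is made up for illustration and is not part of kmod.sh.

```shell
#!/bin/sh
# pct_faster OLD NEW: percentage by which NEW is faster than OLD, where
# both are wall-clock seconds taken from the "real" lines above.
pct_faster() {
	awk -v old="$1" -v new="$2" \
		'BEGIN { printf "%.2f\n", 100 * (old - new) / old }'
}

pct_faster 16.523 16.366	# test 0008: swait vs wait
pct_faster 51.258 50.803	# test 0009: swait vs wait
```

Both tests come out a little under 1% faster with the regular waitqueue, which
is small enough that run-to-run noise should be kept in mind.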

If anyone wants these in a git tree you can check out the 20170628-kmod-only
branch from my linux-next tree [1], based on next-20170628. If there are any
questions or issues please let me know.

[0] https://marc.info/?l=linux-kernel&m=149851347228696&w=2
[1] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20170628-kmod-only

Luis R. Rodriguez (3):
  MAINTAINERS: give kmod some maintainer love
  kmod: add test driver to stress test the module loader
  kmod: throttle kmod thread limit

 MAINTAINERS                           |    9 +
 kernel/kmod.c                         |   16 +-
 lib/Kconfig.debug                     |   25 +
 lib/Makefile                          |    1 +
 lib/test_kmod.c                       | 1246 +++++++++++++++++++++++++++++++++
 tools/testing/selftests/kmod/Makefile |   11 +
 tools/testing/selftests/kmod/config   |    7 +
 tools/testing/selftests/kmod/kmod.sh  |  615 ++++++++++++++++
 8 files changed, 1921 insertions(+), 9 deletions(-)
 create mode 100644 lib/test_kmod.c
 create mode 100644 tools/testing/selftests/kmod/Makefile
 create mode 100644 tools/testing/selftests/kmod/config
 create mode 100755 tools/testing/selftests/kmod/kmod.sh

-- 
2.11.0

^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH v4 1/3] MAINTAINERS: give kmod some maintainer love
  2017-06-28 22:31     ` [PATCH v4 0/3] " Luis R. Rodriguez
@ 2017-06-28 22:31       ` Luis R. Rodriguez
  2017-06-28 22:31       ` [PATCH v4 2/3] kmod: add test driver to stress test the module loader Luis R. Rodriguez
  2017-06-28 22:31       ` [PATCH v4 3/3] kmod: throttle kmod thread limit Luis R. Rodriguez
  2 siblings, 0 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-06-28 22:31 UTC (permalink / raw)
  To: akpm
  Cc: jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme, josh,
	martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, dave,
	linux-kselftest, linux-kernel, Luis R. Rodriguez

As suggested by Jessica, I've been actively working on kmod, so
might as well reflect its maintained status.

Changes are expected to go through akpm's tree.

Cc: Jessica Yu <jeyu@redhat.com>
Suggested-by: Jessica Yu <jeyu@redhat.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 MAINTAINERS | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index aab3ae5aa12c..d9f5d8687cc1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7561,6 +7561,13 @@ F:	include/linux/kmemleak.h
 F:	mm/kmemleak.c
 F:	mm/kmemleak-test.c
 
+KMOD MODULE USERMODE HELPER
+M:	"Luis R. Rodriguez" <mcgrof@kernel.org>
+L:	linux-kernel@vger.kernel.org
+S:	Maintained
+F:	kernel/kmod.c
+F:	include/linux/kmod.h
+
 KPROBES
 M:	Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
 M:	Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
-- 
2.11.0

^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH v4 2/3] kmod: add test driver to stress test the module loader
  2017-06-28 22:31     ` [PATCH v4 0/3] " Luis R. Rodriguez
  2017-06-28 22:31       ` [PATCH v4 1/3] MAINTAINERS: give kmod some maintainer love Luis R. Rodriguez
@ 2017-06-28 22:31       ` Luis R. Rodriguez
  2017-06-28 22:31       ` [PATCH v4 3/3] kmod: throttle kmod thread limit Luis R. Rodriguez
  2 siblings, 0 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-06-28 22:31 UTC (permalink / raw)
  To: akpm
  Cc: jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme, josh,
	martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, dave,
	linux-kselftest, linux-kernel, Luis R. Rodriguez

This adds a new stress test driver for kmod: the kernel module loader.
The new stress test driver, test_kmod, is only enabled as a module right
now. It should be possible to load this as built-in and load tests early
(refer to the force_init_test module parameter). However, since many of the
tests can run a system out of memory fast, we leave this disabled for now.

Using a system with 1024 MiB of RAM can *easily* get your kernel
OOM fast with this test driver.

The test_kmod driver exposes API knobs for us to fine tune simple
request_module() and get_fs_type() calls. Since each of these API calls
takes only one parameter, a test driver for them is rather simple. Other
factors that can help the test driver, though, are the number of calls we
issue and knowing the current limitations of each. This exposes as much
configuration as possible to userspace, so tests can be built directly
from userspace.

Since it allows multiple misc devices, it will eventually (once we
add a knob to let us create new devices at will) also be possible to
perform more tests in parallel, provided you have enough memory.

We only enable tests we know work as of right now.

Demo screenshots:

 # tools/testing/selftests/kmod/kmod.sh
kmod_test_0001_driver: OK! - loading kmod test
kmod_test_0001_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
kmod_test_0001_fs: OK! - loading kmod test
kmod_test_0001_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
kmod_test_0002_driver: OK! - loading kmod test
kmod_test_0002_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
kmod_test_0002_fs: OK! - loading kmod test
kmod_test_0002_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
kmod_test_0003: OK! - loading kmod test
kmod_test_0003: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0004: OK! - loading kmod test
kmod_test_0004: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0005: OK! - loading kmod test
kmod_test_0005: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0006: OK! - loading kmod test
kmod_test_0006: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0005: OK! - loading kmod test
kmod_test_0005: OK! - Return value: 0 (SUCCESS), expected SUCCESS
kmod_test_0006: OK! - loading kmod test
kmod_test_0006: OK! - Return value: 0 (SUCCESS), expected SUCCESS
XXX: add test restult for 0007
Test completed

You can also request specific tests:

 # tools/testing/selftests/kmod/kmod.sh -t 0001
kmod_test_0001_driver: OK! - loading kmod test
kmod_test_0001_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
kmod_test_0001_fs: OK! - loading kmod test
kmod_test_0001_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
Test completed

Lastly, the currently available tests:

 # tools/testing/selftests/kmod/kmod.sh --help
Usage: tools/testing/selftests/kmod/kmod.sh [ -t <4-number-digit> ]
Valid tests: 0001-0009

0001 - Simple test - 1 thread  for empty string
0002 - Simple test - 1 thread  for modules/filesystems that do not exist
0003 - Simple test - 1 thread  for get_fs_type() only
0004 - Simple test - 2 threads for get_fs_type() only
0005 - multithreaded tests with default setup - request_module() only
0006 - multithreaded tests with default setup - get_fs_type() only
0007 - multithreaded tests with default setup test request_module() and get_fs_type()
0008 - multithreaded - push kmod_concurrent over max_modprobes for request_module()
0009 - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()

The following test cases currently fail, so they are not enabled by
default:

 # tools/testing/selftests/kmod/kmod.sh -t 0008
 # tools/testing/selftests/kmod/kmod.sh -t 0009

To be sure to run them as intended please unload both of the modules:

  o test_module
  o xfs

And ensure they are not loaded on your system prior to testing them.
If your rootfs is on an xfs partition you can change the default
test driver used for get_fs_type() by exporting it into your
environment. For examples of other test defaults you can override,
refer to kmod.sh allow_user_defaults().
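
A pre-flight check along these lines can confirm that both modules are
unloaded before running tests 0008 and 0009. This is only a sketch: the
function names and the MODULES_FILE override are hypothetical and not part
of kmod.sh.

```shell
#!/bin/sh
# module_loaded NAME: succeed if NAME is the first field of some line in
# /proc/modules. MODULES_FILE is a hypothetical override used for testing.
module_loaded() {
	awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' \
		"${MODULES_FILE:-/proc/modules}"
}

# check_unloaded NAME...: fail if any of the given modules is still loaded.
check_unloaded() {
	for m in "$@"; do
		if module_loaded "$m"; then
			echo "$m is still loaded, unload it first" >&2
			return 1
		fi
	done
	return 0
}
```

Calling "check_unloaded test_module xfs" before the two test runs would then
refuse to continue while either module is still present.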

Behind the scenes this is how we fine tune a test case prior to
hitting a trigger to run it:

cat /sys/devices/virtual/misc/test_kmod0/config
echo -n "2" > /sys/devices/virtual/misc/test_kmod0/config_test_case
echo -n "ext4" > /sys/devices/virtual/misc/test_kmod0/config_test_fs
echo -n "80" > /sys/devices/virtual/misc/test_kmod0/config_num_threads
cat /sys/devices/virtual/misc/test_kmod0/config
echo -n "1" > /sys/devices/virtual/misc/test_kmod0/config_num_threads

Finally to trigger:

echo -n "1" > /sys/devices/virtual/misc/test_kmod0/trigger_config

The kmod.sh script uses the above constructs to build different test cases.
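
As a rough illustration of how such a test case could be assembled from the
knobs above, consider the following sketch. It is not taken from kmod.sh;
the function names and the KMOD_SYSFS_DIR variable are assumptions made for
this example.

```shell
#!/bin/sh
# Assumption: default to the first test device's sysfs directory.
KMOD_SYSFS_DIR="${KMOD_SYSFS_DIR:-/sys/devices/virtual/misc/test_kmod0}"

# kmod_set KNOB VALUE: write VALUE without a trailing newline, like the
# echo -n calls above.
kmod_set() {
	printf '%s' "$2" > "$KMOD_SYSFS_DIR/$1"
}

# kmod_run_fs_test FS THREADS: configure a get_fs_type() test
# (test case 2, TEST_KMOD_FS_TYPE) and fire the trigger.
kmod_run_fs_test() {
	kmod_set config_test_case 2 &&
	kmod_set config_test_fs "$1" &&
	kmod_set config_num_threads "$2" &&
	kmod_set trigger_config 1
}
```

For example, "kmod_run_fs_test ext4 80" mirrors the manual sequence above.
Note that the trigger write only reports problems with the test setup; the
outcome of the test itself is kept by the driver in its test_result field.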

A bit of interpretation of the current failures follows. First, two
premises:

a) When request_module() is used userspace figures out an optimized version of
module order for us. Once it finds the modules it needs, as per depmod
symbol dep map, it will finit_module() the respective modules which
are needed for the original request_module() request.

b) We have an optimization in place whereby if a kernel uses
request_module() on a module already loaded we never bother
userspace as the module already is loaded. This is all handled by
kernel/kmod.c.

A few things to consider to help identify root causes of issues:

0) kmod 19 has a broken heuristic for modules being assumed to be
built-in to your kernel and will return 0 even though request_module()
failed. Upgrade to a newer version of kmod.

1) A get_fs_type() call for "xfs" will request_module() for
"fs-xfs", not for "xfs". The optimization in kernel described in b)
fails to catch if we have a lot of consecutive get_fs_type() calls.
The reason is the optimization in place does not look for aliases. This
means two consecutive get_fs_type() calls will bump kmod_concurrent, whereas
request_module() will not.

This is one explanation why test case 0009 fails at least once for
get_fs_type().
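
The name mismatch behind 1) can be modeled in a few lines. This is a toy
model of the behaviour described above, not the kernel's lookup code, and
both function names are made up for illustration.

```shell
#!/bin/sh
# fs_request_name FS: the module name get_fs_type() ends up requesting,
# per point 1) above ("xfs" becomes "fs-xfs").
fs_request_name() {
	printf 'fs-%s\n' "$1"
}

# loaded_by_name NAME LIST...: a plain name comparison, standing in for a
# lookup that does not resolve aliases.
loaded_by_name() {
	name=$1
	shift
	for m in "$@"; do
		if [ "$m" = "$name" ]; then
			return 0
		fi
	done
	return 1
}
```

With only "xfs" in the list of loaded modules, "loaded_by_name xfs xfs"
succeeds while loaded_by_name "$(fs_request_name xfs)" xfs fails, so the
alias request is not short-circuited and still goes out to userspace.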

2) If a module fails to load --- for whatever reason (kmod_concurrent
limit reached, file not yet present due to rootfs switch, out of memory)
we have a period of time during which module requests for the same name
either with request_module() or get_fs_type() will *also* fail to load
even if the file for the module is ready.

This explains why *multiple* NULLs are possible on test 0009.

3) finit_module() consumes quite a bit of memory.

4) Filesystems typically also have more dependent modules than other
modules. It's important to note, though, that even though a get_fs_type() call
does not incur additional kmod_concurrent bumps, since userspace
loads the dependencies it finds it needs via finit_module(), it *will*
take much more memory to load a module with a lot of dependencies.

Because of 3) and 4) we will easily run into out of memory failures
with certain tests. For instance test 0006 fails on qemu with 1024 MiB
of RAM. It panics a box after reaping all userspace processes and still
not having enough memory to reap.
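
Given that, a test runner may want to look at available memory before
kicking off the heavier tests. The sketch below is one way to do it; the
function names are hypothetical, and the 1024 MiB figure is just the data
point mentioned above, not a guaranteed safe threshold.

```shell
#!/bin/sh
# mem_total_mib [FILE]: MemTotal in MiB, parsed from a file in
# /proc/meminfo format (the MemTotal line is reported in kB).
mem_total_mib() {
	awk '$1 == "MemTotal:" { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# enough_memory MIN_MIB [FILE]: succeed only if MemTotal exceeds MIN_MIB.
enough_memory() {
	total=$(mem_total_mib "$2")
	[ -n "$total" ] && [ "$total" -gt "$1" ]
}
```

A runner could then skip test 0006 unless "enough_memory 1024" succeeds.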

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 MAINTAINERS                           |    2 +
 lib/Kconfig.debug                     |   25 +
 lib/Makefile                          |    1 +
 lib/test_kmod.c                       | 1246 +++++++++++++++++++++++++++++++++
 tools/testing/selftests/kmod/Makefile |   11 +
 tools/testing/selftests/kmod/config   |    7 +
 tools/testing/selftests/kmod/kmod.sh  |  635 +++++++++++++++++
 7 files changed, 1927 insertions(+)
 create mode 100644 lib/test_kmod.c
 create mode 100644 tools/testing/selftests/kmod/Makefile
 create mode 100644 tools/testing/selftests/kmod/config
 create mode 100755 tools/testing/selftests/kmod/kmod.sh

diff --git a/MAINTAINERS b/MAINTAINERS
index d9f5d8687cc1..2bf93ae07d82 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7567,6 +7567,8 @@ L:	linux-kernel@vger.kernel.org
 S:	Maintained
 F:	kernel/kmod.c
 F:	include/linux/kmod.h
+F:	lib/test_kmod.c
+F:	tools/testing/selftests/kmod/
 
 KPROBES
 M:	Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 0e0ecc8bd076..dfa50b54f017 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1875,6 +1875,31 @@ config BUG_ON_DATA_CORRUPTION
 
 	  If unsure, say N.
 
+config TEST_KMOD
+	tristate "kmod stress tester"
+	default n
+	depends on m
+	select TEST_LKM
+	select XFS_FS
+	select TUN
+	select BTRFS_FS
+	help
+	  Test the kernel's module loading mechanism: kmod. kmod implements
+	  support to load modules using the Linux kernel's usermode helper.
+	  This test provides a series of tests against kmod.
+
+	  Although technically you can build test_kmod either as a module or
+	  into the kernel, we disallow building it into the kernel since
+	  it stress tests request_module() and this will very likely cause
+	  some issues by taking over precious threads available to other
+	  module load requests. Ultimately this could be fatal.
+
+	  To run tests run:
+
+	  tools/testing/selftests/kmod/kmod.sh --help
+
+	  If unsure, say N.
+
 source "samples/Kconfig"
 
 source "lib/Kconfig.kgdb"
diff --git a/lib/Makefile b/lib/Makefile
index 85e91e51a9fe..40c18372b301 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -61,6 +61,7 @@ obj-$(CONFIG_TEST_PRINTF) += test_printf.o
 obj-$(CONFIG_TEST_BITMAP) += test_bitmap.o
 obj-$(CONFIG_TEST_UUID) += test_uuid.o
 obj-$(CONFIG_TEST_PARMAN) += test_parman.o
+obj-$(CONFIG_TEST_KMOD) += test_kmod.o
 
 ifeq ($(CONFIG_DEBUG_KOBJECT),y)
 CFLAGS_kobject.o += -DDEBUG
diff --git a/lib/test_kmod.c b/lib/test_kmod.c
new file mode 100644
index 000000000000..6c1d678bcf8b
--- /dev/null
+++ b/lib/test_kmod.c
@@ -0,0 +1,1246 @@
+/*
+ * kmod stress test driver
+ *
+ * Copyright (C) 2017 Luis R. Rodriguez <mcgrof@kernel.org>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or at your option any
+ * later version; or, when distributed separately from the Linux kernel or
+ * when incorporated into other software packages, subject to the following
+ * license:
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of copyleft-next (version 0.3.1 or later) as published
+ * at http://copyleft-next.org/.
+ */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+/*
+ * This driver provides an interface to trigger and test the kernel's
+ * module loader through a series of configurations and a few triggers.
+ * To test this driver use the following script as root:
+ *
+ * tools/testing/selftests/kmod/kmod.sh --help
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/kmod.h>
+#include <linux/printk.h>
+#include <linux/kthread.h>
+#include <linux/sched.h>
+#include <linux/fs.h>
+#include <linux/miscdevice.h>
+#include <linux/vmalloc.h>
+#include <linux/slab.h>
+#include <linux/device.h>
+
+#define TEST_START_NUM_THREADS	50
+#define TEST_START_DRIVER	"test_module"
+#define TEST_START_TEST_FS	"xfs"
+#define TEST_START_TEST_CASE	TEST_KMOD_DRIVER
+
+
+static bool force_init_test = false;
+module_param(force_init_test, bool_enable_only, 0644);
+MODULE_PARM_DESC(force_init_test,
+		 "Force kicking a test immediately after driver loads");
+
+/*
+ * For device allocation / registration
+ */
+static DEFINE_MUTEX(reg_dev_mutex);
+static LIST_HEAD(reg_test_devs);
+
+/*
+ * num_test_devs actually represents the ID of the *next*
+ * device we will allow to be created.
+ */
+static int num_test_devs;
+
+/**
+ * enum kmod_test_case - linker table test case
+ *
+ * If you add a test case, please be sure to review if you need to set
+ * @need_mod_put for your test case.
+ *
+ * @TEST_KMOD_DRIVER: stress tests request_module()
+ * @TEST_KMOD_FS_TYPE: stress tests get_fs_type()
+ */
+enum kmod_test_case {
+	__TEST_KMOD_INVALID = 0,
+
+	TEST_KMOD_DRIVER,
+	TEST_KMOD_FS_TYPE,
+
+	__TEST_KMOD_MAX,
+};
+
+struct test_config {
+	char *test_driver;
+	char *test_fs;
+	unsigned int num_threads;
+	enum kmod_test_case test_case;
+	int test_result;
+};
+
+struct kmod_test_device;
+
+/**
+ * struct kmod_test_device_info - thread info
+ *
+ * @ret_sync: return value if request_module() is used, sync request for
+ * 	@TEST_KMOD_DRIVER
+ * @fs_sync: return value of get_fs_type() for @TEST_KMOD_FS_TYPE
+ * @thread_idx: thread ID
+ * @test_dev: test device test is being performed under
+ * @need_mod_put: Some tests (get_fs_type() is one) require putting the module
+ *	(module_put(fs_sync->owner)) when done, otherwise you will not be able
+ *	to unload the respective modules and re-test. We use this to keep
+ *	accounting of when we need this and to help out in case we need to
+ *	error out and deal with module_put() on error.
+ */
+struct kmod_test_device_info {
+	int ret_sync;
+	struct file_system_type *fs_sync;
+	struct task_struct *task_sync;
+	unsigned int thread_idx;
+	struct kmod_test_device *test_dev;
+	bool need_mod_put;
+};
+
+/**
+ * struct kmod_test_device - test device to help test kmod
+ *
+ * @dev_idx: unique ID for test device
+ * @config: configuration for the test
+ * @misc_dev: we use a misc device under the hood
+ * @dev: pointer to misc_dev's own struct device
+ * @config_mutex: protects configuration of test
+ * @trigger_mutex: the test trigger can only be fired once at a time
+ * @thread_mutex: protects @done count, and the @info per each thread
+ * @done: number of threads which have completed or failed
+ * @test_is_oom: when we run out of memory, use this to halt moving forward
+ * @kthreads_done: completion used to signal when all work is done
+ * @list: needed to be part of the reg_test_devs
+ * @info: array of info for each thread
+ */
+struct kmod_test_device {
+	int dev_idx;
+	struct test_config config;
+	struct miscdevice misc_dev;
+	struct device *dev;
+	struct mutex config_mutex;
+	struct mutex trigger_mutex;
+	struct mutex thread_mutex;
+
+	unsigned int done;
+
+	bool test_is_oom;
+	struct completion kthreads_done;
+	struct list_head list;
+
+	struct kmod_test_device_info *info;
+};
+
+static const char *test_case_str(enum kmod_test_case test_case)
+{
+	switch (test_case) {
+	case TEST_KMOD_DRIVER:
+		return "TEST_KMOD_DRIVER";
+	case TEST_KMOD_FS_TYPE:
+		return "TEST_KMOD_FS_TYPE";
+	default:
+		return "invalid";
+	}
+}
+
+static struct miscdevice *dev_to_misc_dev(struct device *dev)
+{
+	return dev_get_drvdata(dev);
+}
+
+static struct kmod_test_device *misc_dev_to_test_dev(struct miscdevice *misc_dev)
+{
+	return container_of(misc_dev, struct kmod_test_device, misc_dev);
+}
+
+static struct kmod_test_device *dev_to_test_dev(struct device *dev)
+{
+	struct miscdevice *misc_dev;
+
+	misc_dev = dev_to_misc_dev(dev);
+
+	return misc_dev_to_test_dev(misc_dev);
+}
+
+/* Must run with thread_mutex held */
+static void kmod_test_done_check(struct kmod_test_device *test_dev,
+				 unsigned int idx)
+{
+	struct test_config *config = &test_dev->config;
+
+	test_dev->done++;
+	dev_dbg(test_dev->dev, "Done thread count: %u\n", test_dev->done);
+
+	if (test_dev->done == config->num_threads) {
+		dev_info(test_dev->dev, "Done: %u threads have all run now\n",
+			 test_dev->done);
+		dev_info(test_dev->dev, "Last thread to run: %u\n", idx);
+		complete(&test_dev->kthreads_done);
+	}
+}
+
+static void test_kmod_put_module(struct kmod_test_device_info *info)
+{
+	struct kmod_test_device *test_dev = info->test_dev;
+	struct test_config *config = &test_dev->config;
+
+	if (!info->need_mod_put)
+		return;
+
+	switch (config->test_case) {
+	case TEST_KMOD_DRIVER:
+		break;
+	case TEST_KMOD_FS_TYPE:
+		if (info && info->fs_sync && info->fs_sync->owner)
+			module_put(info->fs_sync->owner);
+		break;
+	default:
+		BUG();
+	}
+
+	info->need_mod_put = false;
+}
+
+static int run_request(void *data)
+{
+	struct kmod_test_device_info *info = data;
+	struct kmod_test_device *test_dev = info->test_dev;
+	struct test_config *config = &test_dev->config;
+
+	switch (config->test_case) {
+	case TEST_KMOD_DRIVER:
+		info->ret_sync = request_module("%s", config->test_driver);
+		break;
+	case TEST_KMOD_FS_TYPE:
+		info->fs_sync = get_fs_type(config->test_fs);
+		info->need_mod_put = true;
+		break;
+	default:
+		/* __trigger_config_run() already checked for test sanity */
+		BUG();
+		return -EINVAL;
+	}
+
+	dev_dbg(test_dev->dev, "Ran thread %u\n", info->thread_idx);
+
+	test_kmod_put_module(info);
+
+	mutex_lock(&test_dev->thread_mutex);
+	info->task_sync = NULL;
+	kmod_test_done_check(test_dev, info->thread_idx);
+	mutex_unlock(&test_dev->thread_mutex);
+
+	return 0;
+}
+
+static int tally_work_test(struct kmod_test_device_info *info)
+{
+	struct kmod_test_device *test_dev = info->test_dev;
+	struct test_config *config = &test_dev->config;
+	int err_ret = 0;
+
+	switch (config->test_case) {
+	case TEST_KMOD_DRIVER:
+		/*
+		 * Only capture errors, if one is found that's
+		 * enough, for now.
+		 */
+		if (info->ret_sync != 0)
+			err_ret = info->ret_sync;
+		dev_info(test_dev->dev,
+			 "Sync thread %u return status: %d\n",
+			 info->thread_idx, info->ret_sync);
+		break;
+	case TEST_KMOD_FS_TYPE:
+		/* For now we make this simple */
+		if (!info->fs_sync)
+			err_ret = -EINVAL;
+		dev_info(test_dev->dev, "Sync thread %u fs: %s\n",
+			 info->thread_idx, info->fs_sync ? config->test_fs :
+			 "NULL");
+		break;
+	default:
+		BUG();
+	}
+
+	return err_ret;
+}
+
+/*
+ * XXX: add result option to display if all errors did not match.
+ * For now we just keep any error code if one was found.
+ *
+ * If this ran it means *all* tasks were created fine and we
+ * are now just collecting results.
+ *
+ * Only propagate errors, do not override with a subsequent success case.
+ */
+static void tally_up_work(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+	struct kmod_test_device_info *info;
+	unsigned int idx;
+	int err_ret = 0;
+	int ret = 0;
+
+	mutex_lock(&test_dev->thread_mutex);
+
+	dev_info(test_dev->dev, "Results:\n");
+
+	for (idx=0; idx < config->num_threads; idx++) {
+		info = &test_dev->info[idx];
+		ret = tally_work_test(info);
+		if (ret)
+			err_ret = ret;
+	}
+
+	/*
+	 * Note: request_module() returns 256 for a module not found even
+	 * though modprobe itself returns 1.
+	 */
+	config->test_result = err_ret;
+
+	mutex_unlock(&test_dev->thread_mutex);
+}
+
+static int try_one_request(struct kmod_test_device *test_dev, unsigned int idx)
+{
+	struct kmod_test_device_info *info = &test_dev->info[idx];
+	int fail_ret = -ENOMEM;
+
+	mutex_lock(&test_dev->thread_mutex);
+
+	info->thread_idx = idx;
+	info->test_dev = test_dev;
+	info->task_sync = kthread_run(run_request, info, "%s-%u",
+				      KBUILD_MODNAME, idx);
+
+	if (!info->task_sync || IS_ERR(info->task_sync)) {
+		test_dev->test_is_oom = true;
+		dev_err(test_dev->dev, "Setting up thread %u failed\n", idx);
+		info->task_sync = NULL;
+		goto err_out;
+	} else
+		dev_dbg(test_dev->dev, "Kicked off thread %u\n", idx);
+
+	mutex_unlock(&test_dev->thread_mutex);
+
+	return 0;
+
+err_out:
+	info->ret_sync = fail_ret;
+	mutex_unlock(&test_dev->thread_mutex);
+
+	return fail_ret;
+}
+
+static void test_dev_kmod_stop_tests(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+	struct kmod_test_device_info *info;
+	unsigned int i;
+
+	dev_info(test_dev->dev, "Ending request_module() tests\n");
+
+	mutex_lock(&test_dev->thread_mutex);
+
+	for (i=0; i < config->num_threads; i++) {
+		info = &test_dev->info[i];
+		if (info->task_sync && !IS_ERR(info->task_sync)) {
+			dev_info(test_dev->dev,
+				 "Stopping still-running thread %u\n", i);
+			kthread_stop(info->task_sync);
+		}
+
+		/*
+		 * info->task_sync is well protected, it can only be
+		 * NULL or a pointer to a struct. If it's NULL we either
+		 * never ran, or we did and we completed the work. Completed
+		 * tasks *always* put the module for us. This is a sanity
+		 * check -- just in case.
+		 */
+		if (info->task_sync && info->need_mod_put)
+			test_kmod_put_module(info);
+	}
+
+	mutex_unlock(&test_dev->thread_mutex);
+}
+
+/*
+ * Only wait *iff* we did not run into any errors during all of our thread
+ * set up. If we run into any issues we stop threads and just bail out with
+ * an error to the trigger. This also means we don't need any tally work
+ * for any threads which fail.
+ */
+static int try_requests(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+	unsigned int idx;
+	int ret;
+	bool any_error = false;
+
+	for (idx=0; idx < config->num_threads; idx++) {
+		if (test_dev->test_is_oom) {
+			any_error = true;
+			break;
+		}
+
+		ret = try_one_request(test_dev, idx);
+		if (ret) {
+			any_error = true;
+			break;
+		}
+	}
+
+	if (!any_error) {
+		test_dev->test_is_oom = false;
+		dev_info(test_dev->dev,
+			 "No errors were found while initializing threads\n");
+		wait_for_completion(&test_dev->kthreads_done);
+		tally_up_work(test_dev);
+	} else {
+		test_dev->test_is_oom = true;
+		dev_info(test_dev->dev,
+			 "At least one thread failed to start, stop all work\n");
+		test_dev_kmod_stop_tests(test_dev);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static int run_test_driver(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+
+	dev_info(test_dev->dev, "Test case: %s (%u)\n",
+		 test_case_str(config->test_case),
+		 config->test_case);
+	dev_info(test_dev->dev, "Test driver to load: %s\n",
+		 config->test_driver);
+	dev_info(test_dev->dev, "Number of threads to run: %u\n",
+		 config->num_threads);
+	dev_info(test_dev->dev, "Thread IDs will range from 0 - %u\n",
+		 config->num_threads - 1);
+
+	return try_requests(test_dev);
+}
+
+static int run_test_fs_type(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+
+	dev_info(test_dev->dev, "Test case: %s (%u)\n",
+		 test_case_str(config->test_case),
+		 config->test_case);
+	dev_info(test_dev->dev, "Test filesystem to load: %s\n",
+		 config->test_fs);
+	dev_info(test_dev->dev, "Number of threads to run: %u\n",
+		 config->num_threads);
+	dev_info(test_dev->dev, "Thread IDs will range from 0 - %u\n",
+		 config->num_threads - 1);
+
+	return try_requests(test_dev);
+}
+
+static ssize_t config_show(struct device *dev,
+			   struct device_attribute *attr,
+			   char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+	int len = 0;
+
+	mutex_lock(&test_dev->config_mutex);
+
+	len += snprintf(buf, PAGE_SIZE,
+			"Custom trigger configuration for: %s\n",
+			dev_name(dev));
+
+	len += snprintf(buf+len, PAGE_SIZE - len,
+			"Number of threads:\t%u\n",
+			config->num_threads);
+
+	len += snprintf(buf+len, PAGE_SIZE - len,
+			"Test_case:\t%s (%u)\n",
+			test_case_str(config->test_case),
+			config->test_case);
+
+	if (config->test_driver)
+		len += snprintf(buf+len, PAGE_SIZE - len,
+				"driver:\t%s\n",
+				config->test_driver);
+	else
+		len += snprintf(buf+len, PAGE_SIZE - len,
+				"driver:\tEMPTY\n");
+
+	if (config->test_fs)
+		len += snprintf(buf+len, PAGE_SIZE - len,
+				"fs:\t%s\n",
+				config->test_fs);
+	else
+		len += snprintf(buf+len, PAGE_SIZE - len,
+				"fs:\tEMPTY\n");
+
+	mutex_unlock(&test_dev->config_mutex);
+
+	return len;
+}
+static DEVICE_ATTR_RO(config);
+
+/*
+ * This ensures we don't allow kicking threads through if our configuration
+ * is faulty.
+ */
+static int __trigger_config_run(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+
+	test_dev->done = 0;
+
+	switch (config->test_case) {
+	case TEST_KMOD_DRIVER:
+		return run_test_driver(test_dev);
+	case TEST_KMOD_FS_TYPE:
+		return run_test_fs_type(test_dev);
+	default:
+		dev_warn(test_dev->dev,
+			 "Invalid test case requested: %u\n",
+			 config->test_case);
+		return -EINVAL;
+	}
+}
+
+static int trigger_config_run(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+	int ret;
+
+	mutex_lock(&test_dev->trigger_mutex);
+	mutex_lock(&test_dev->config_mutex);
+
+	ret = __trigger_config_run(test_dev);
+	if (ret < 0)
+		goto out;
+	dev_info(test_dev->dev, "General test result: %d\n",
+		 config->test_result);
+
+	/*
+	 * We must return 0 after a trigger unless something went
+	 * wrong with the setup of the test. If the test setup went fine
+	 * then userspace must just check the result of config->test_result.
+	 * One issue with relying on the return from a call in the kernel
+	 * is that if the kernel returns a positive value, this trigger
+	 * will not return the value to userspace; it would be lost.
+	 *
+	 * By not relying on capturing the return value of the tests we run
+	 * through the trigger, it also allows us to run tests with set -e
+	 * and only fail when something went wrong with the driver upon
+	 * trigger requests.
+	 */
+	ret = 0;
+
+out:
+	mutex_unlock(&test_dev->config_mutex);
+	mutex_unlock(&test_dev->trigger_mutex);
+
+	return ret;
+}
+
+static ssize_t
+trigger_config_store(struct device *dev,
+		     struct device_attribute *attr,
+		     const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	int ret;
+
+	if (test_dev->test_is_oom)
+		return -ENOMEM;
+
+	/* For all intents and purposes we don't care what userspace
+	 * sent to this trigger, we care only that we were triggered.
+	 * We use the return value only for capturing issues with
+	 * the test setup. At this point all the test variables should
+	 * have been allocated so typically this should never fail.
+	 */
+	ret = trigger_config_run(test_dev);
+	if (unlikely(ret < 0))
+		goto out;
+
+	/*
+	 * Note: any return > 0 will be treated as success
+	 * and the error value will not be available to userspace.
+	 * Do not rely on trying to send a test return value to userspace,
+	 * as positive return values will be lost.
+	 */
+	if (WARN_ON(ret > 0))
+		return -EINVAL;
+
+	ret = count;
+out:
+	return ret;
+}
+static DEVICE_ATTR_WO(trigger_config);
+
+/*
+ * XXX: move to kstrncpy() once merged.
+ *
+ * Users should use kfree_const() when freeing these.
+ */
+static int __kstrncpy(char **dst, const char *name, size_t count, gfp_t gfp)
+{
+	*dst = kstrndup(name, count, gfp);
+	if (!*dst)
+		return -ENOSPC;
+	return count;
+}
+
+static int config_copy_test_driver_name(struct test_config *config,
+				    const char *name,
+				    size_t count)
+{
+	return __kstrncpy(&config->test_driver, name, count, GFP_KERNEL);
+}
+
+
+static int config_copy_test_fs(struct test_config *config, const char *name,
+			       size_t count)
+{
+	return __kstrncpy(&config->test_fs, name, count, GFP_KERNEL);
+}
+
+static void __kmod_config_free(struct test_config *config)
+{
+	if (!config)
+		return;
+
+	kfree_const(config->test_driver);
+	config->test_driver = NULL;
+
+	kfree_const(config->test_fs);
+	config->test_fs = NULL;
+}
+
+static void kmod_config_free(struct kmod_test_device *test_dev)
+{
+	struct test_config *config;
+
+	if (!test_dev)
+		return;
+
+	config = &test_dev->config;
+
+	mutex_lock(&test_dev->config_mutex);
+	__kmod_config_free(config);
+	mutex_unlock(&test_dev->config_mutex);
+}
+
+static ssize_t config_test_driver_store(struct device *dev,
+					struct device_attribute *attr,
+					const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+	int copied;
+
+	mutex_lock(&test_dev->config_mutex);
+
+	kfree_const(config->test_driver);
+	config->test_driver = NULL;
+
+	copied = config_copy_test_driver_name(config, buf, count);
+	mutex_unlock(&test_dev->config_mutex);
+
+	return copied;
+}
+
+/*
+ * As per sysfs_kf_seq_show() the buf is max PAGE_SIZE.
+ */
+static ssize_t config_test_show_str(struct mutex *config_mutex,
+				    char *dst,
+				    char *src)
+{
+	int len;
+
+	mutex_lock(config_mutex);
+	len = snprintf(dst, PAGE_SIZE, "%s\n", src);
+	mutex_unlock(config_mutex);
+
+	return len;
+}
+
+static ssize_t config_test_driver_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return config_test_show_str(&test_dev->config_mutex, buf,
+				    config->test_driver);
+}
+static DEVICE_ATTR(config_test_driver, 0644, config_test_driver_show,
+		   config_test_driver_store);
+
+static ssize_t config_test_fs_store(struct device *dev,
+				    struct device_attribute *attr,
+				    const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+	int copied;
+
+	mutex_lock(&test_dev->config_mutex);
+
+	kfree_const(config->test_fs);
+	config->test_fs = NULL;
+
+	copied = config_copy_test_fs(config, buf, count);
+	mutex_unlock(&test_dev->config_mutex);
+
+	return copied;
+}
+
+static ssize_t config_test_fs_show(struct device *dev,
+				   struct device_attribute *attr,
+				   char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return config_test_show_str(&test_dev->config_mutex, buf,
+				    config->test_fs);
+}
+static DEVICE_ATTR(config_test_fs, 0644, config_test_fs_show,
+		   config_test_fs_store);
+
+static int trigger_config_run_type(struct kmod_test_device *test_dev,
+				   enum kmod_test_case test_case,
+				   const char *test_str)
+{
+	int copied = 0;
+	struct test_config *config = &test_dev->config;
+
+	mutex_lock(&test_dev->config_mutex);
+
+	switch (test_case) {
+	case TEST_KMOD_DRIVER:
+		kfree_const(config->test_driver);
+		config->test_driver = NULL;
+		copied = config_copy_test_driver_name(config, test_str,
+						      strlen(test_str));
+		break;
+	case TEST_KMOD_FS_TYPE:
+		kfree_const(config->test_fs);
+		config->test_fs = NULL;
+		copied = config_copy_test_fs(config, test_str,
+					     strlen(test_str));
+		break;
+	default:
+		mutex_unlock(&test_dev->config_mutex);
+		return -EINVAL;
+	}
+
+	config->test_case = test_case;
+
+	mutex_unlock(&test_dev->config_mutex);
+
+	if (copied <= 0 || copied != strlen(test_str)) {
+		test_dev->test_is_oom = true;
+		return -ENOMEM;
+	}
+
+	test_dev->test_is_oom = false;
+
+	return trigger_config_run(test_dev);
+}
+
+static void free_test_dev_info(struct kmod_test_device *test_dev)
+{
+	vfree(test_dev->info);
+	test_dev->info = NULL;
+}
+
+static int kmod_config_sync_info(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+
+	free_test_dev_info(test_dev);
+	test_dev->info = vzalloc(config->num_threads *
+				 sizeof(struct kmod_test_device_info));
+	if (!test_dev->info) {
+		dev_err(test_dev->dev, "Cannot alloc test_dev info\n");
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+/*
+ * Old kernels may not have this, if you want to port this code to
+ * test it on older kernels.
+ */
+#ifdef get_kmod_umh_limit
+static unsigned int kmod_init_test_thread_limit(void)
+{
+	return get_kmod_umh_limit();
+}
+#else
+static unsigned int kmod_init_test_thread_limit(void)
+{
+	return TEST_START_NUM_THREADS;
+}
+#endif
+
+static int __kmod_config_init(struct kmod_test_device *test_dev)
+{
+	struct test_config *config = &test_dev->config;
+	int ret = -ENOMEM, copied;
+
+	__kmod_config_free(config);
+
+	copied = config_copy_test_driver_name(config, TEST_START_DRIVER,
+					      strlen(TEST_START_DRIVER));
+	if (copied != strlen(TEST_START_DRIVER))
+		goto err_out;
+
+	copied = config_copy_test_fs(config, TEST_START_TEST_FS,
+				     strlen(TEST_START_TEST_FS));
+	if (copied != strlen(TEST_START_TEST_FS))
+		goto err_out;
+
+	config->num_threads = kmod_init_test_thread_limit();
+	config->test_result = 0;
+	config->test_case = TEST_START_TEST_CASE;
+
+	ret = kmod_config_sync_info(test_dev);
+	if (ret)
+		goto err_out;
+
+	test_dev->test_is_oom = false;
+
+	return 0;
+
+err_out:
+	test_dev->test_is_oom = true;
+	WARN_ON(test_dev->test_is_oom);
+
+	__kmod_config_free(config);
+
+	return ret;
+}
+
+static ssize_t reset_store(struct device *dev,
+			   struct device_attribute *attr,
+			   const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	int ret;
+
+	mutex_lock(&test_dev->trigger_mutex);
+	mutex_lock(&test_dev->config_mutex);
+
+	ret = __kmod_config_init(test_dev);
+	if (ret < 0) {
+		ret = -ENOMEM;
+		dev_err(dev, "could not alloc settings for config trigger: %d\n",
+		       ret);
+		goto out;
+	}
+
+	dev_info(dev, "reset\n");
+	ret = count;
+
+out:
+	mutex_unlock(&test_dev->config_mutex);
+	mutex_unlock(&test_dev->trigger_mutex);
+
+	return ret;
+}
+static DEVICE_ATTR_WO(reset);
+
+static int test_dev_config_update_uint_sync(struct kmod_test_device *test_dev,
+					    const char *buf, size_t size,
+					    unsigned int *config,
+					    int (*test_sync)(struct kmod_test_device *test_dev))
+{
+	int ret;
+	long new;
+	unsigned int old_val;
+
+	ret = kstrtol(buf, 10, &new);
+	if (ret)
+		return ret;
+
+	if (new > UINT_MAX)
+		return -EINVAL;
+
+	mutex_lock(&test_dev->config_mutex);
+
+	old_val = *config;
+	*(unsigned int *)config = new;
+
+	ret = test_sync(test_dev);
+	if (ret) {
+		*(unsigned int *)config = old_val;
+
+		ret = test_sync(test_dev);
+		WARN_ON(ret);
+
+		mutex_unlock(&test_dev->config_mutex);
+		return -EINVAL;
+	}
+
+	mutex_unlock(&test_dev->config_mutex);
+	/* Always return full write size even if we didn't consume all */
+	return size;
+}
+
+static int test_dev_config_update_uint_range(struct kmod_test_device *test_dev,
+					     const char *buf, size_t size,
+					     unsigned int *config,
+					     unsigned int min,
+					     unsigned int max)
+{
+	int ret;
+	long new;
+
+	ret = kstrtol(buf, 10, &new);
+	if (ret)
+		return ret;
+
+	if (new < min || new >  max || new > UINT_MAX)
+		return -EINVAL;
+
+	mutex_lock(&test_dev->config_mutex);
+	*config = new;
+	mutex_unlock(&test_dev->config_mutex);
+
+	/* Always return full write size even if we didn't consume all */
+	return size;
+}
+
+static int test_dev_config_update_int(struct kmod_test_device *test_dev,
+				      const char *buf, size_t size,
+				      int *config)
+{
+	int ret;
+	long new;
+
+	ret = kstrtol(buf, 10, &new);
+	if (ret)
+		return ret;
+
+	if (new > INT_MAX || new < INT_MIN)
+		return -EINVAL;
+
+	mutex_lock(&test_dev->config_mutex);
+	*config = new;
+	mutex_unlock(&test_dev->config_mutex);
+	/* Always return full write size even if we didn't consume all */
+	return size;
+}
+
+static ssize_t test_dev_config_show_int(struct kmod_test_device *test_dev,
+					char *buf,
+					int config)
+{
+	int val;
+
+	mutex_lock(&test_dev->config_mutex);
+	val = config;
+	mutex_unlock(&test_dev->config_mutex);
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", val);
+}
+
+static ssize_t test_dev_config_show_uint(struct kmod_test_device *test_dev,
+					 char *buf,
+					 unsigned int config)
+{
+	unsigned int val;
+
+	mutex_lock(&test_dev->config_mutex);
+	val = config;
+	mutex_unlock(&test_dev->config_mutex);
+
+	return snprintf(buf, PAGE_SIZE, "%u\n", val);
+}
+
+static ssize_t test_result_store(struct device *dev,
+				 struct device_attribute *attr,
+				 const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_update_int(test_dev, buf, count,
+					  &config->test_result);
+}
+
+static ssize_t config_num_threads_store(struct device *dev,
+					struct device_attribute *attr,
+					const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_update_uint_sync(test_dev, buf, count,
+						&config->num_threads,
+						kmod_config_sync_info);
+}
+
+static ssize_t config_num_threads_show(struct device *dev,
+				       struct device_attribute *attr,
+				       char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_show_uint(test_dev, buf, config->num_threads);
+}
+static DEVICE_ATTR(config_num_threads, 0644, config_num_threads_show,
+		   config_num_threads_store);
+
+static ssize_t config_test_case_store(struct device *dev,
+				      struct device_attribute *attr,
+				      const char *buf, size_t count)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_update_uint_range(test_dev, buf, count,
+						 &config->test_case,
+						 __TEST_KMOD_INVALID + 1,
+						 __TEST_KMOD_MAX - 1);
+}
+
+static ssize_t config_test_case_show(struct device *dev,
+				     struct device_attribute *attr,
+				     char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_show_uint(test_dev, buf, config->test_case);
+}
+static DEVICE_ATTR(config_test_case, 0644, config_test_case_show,
+		   config_test_case_store);
+
+static ssize_t test_result_show(struct device *dev,
+				struct device_attribute *attr,
+				char *buf)
+{
+	struct kmod_test_device *test_dev = dev_to_test_dev(dev);
+	struct test_config *config = &test_dev->config;
+
+	return test_dev_config_show_int(test_dev, buf, config->test_result);
+}
+static DEVICE_ATTR(test_result, 0644, test_result_show, test_result_store);
+
+#define TEST_KMOD_DEV_ATTR(name)		&dev_attr_##name.attr
+
+static struct attribute *test_dev_attrs[] = {
+	TEST_KMOD_DEV_ATTR(trigger_config),
+	TEST_KMOD_DEV_ATTR(config),
+	TEST_KMOD_DEV_ATTR(reset),
+
+	TEST_KMOD_DEV_ATTR(config_test_driver),
+	TEST_KMOD_DEV_ATTR(config_test_fs),
+	TEST_KMOD_DEV_ATTR(config_num_threads),
+	TEST_KMOD_DEV_ATTR(config_test_case),
+	TEST_KMOD_DEV_ATTR(test_result),
+
+	NULL,
+};
+
+ATTRIBUTE_GROUPS(test_dev);
+
+static int kmod_config_init(struct kmod_test_device *test_dev)
+{
+	int ret;
+
+	mutex_lock(&test_dev->config_mutex);
+	ret = __kmod_config_init(test_dev);
+	mutex_unlock(&test_dev->config_mutex);
+
+	return ret;
+}
+
+static struct kmod_test_device *alloc_test_dev_kmod(int idx)
+{
+	int ret;
+	struct kmod_test_device *test_dev;
+	struct miscdevice *misc_dev;
+
+	test_dev = vzalloc(sizeof(struct kmod_test_device));
+	if (!test_dev) {
+		pr_err("Cannot alloc test_dev\n");
+		goto err_out;
+	}
+
+	mutex_init(&test_dev->config_mutex);
+	mutex_init(&test_dev->trigger_mutex);
+	mutex_init(&test_dev->thread_mutex);
+
+	init_completion(&test_dev->kthreads_done);
+
+	ret = kmod_config_init(test_dev);
+	if (ret < 0) {
+		pr_err("Cannot alloc kmod_config_init()\n");
+		goto err_out_free;
+	}
+
+	test_dev->dev_idx = idx;
+	misc_dev = &test_dev->misc_dev;
+
+	misc_dev->minor = MISC_DYNAMIC_MINOR;
+	misc_dev->name = kasprintf(GFP_KERNEL, "test_kmod%d", idx);
+	if (!misc_dev->name) {
+		pr_err("Cannot alloc misc_dev->name\n");
+		goto err_out_free_config;
+	}
+	misc_dev->groups = test_dev_groups;
+
+	return test_dev;
+
+err_out_free_config:
+	free_test_dev_info(test_dev);
+	kmod_config_free(test_dev);
+err_out_free:
+	vfree(test_dev);
+	test_dev = NULL;
+err_out:
+	return NULL;
+}
+
+static void free_test_dev_kmod(struct kmod_test_device *test_dev)
+{
+	if (test_dev) {
+		kfree_const(test_dev->misc_dev.name);
+		test_dev->misc_dev.name = NULL;
+		free_test_dev_info(test_dev);
+		kmod_config_free(test_dev);
+		vfree(test_dev);
+		test_dev = NULL;
+	}
+}
+
+static struct kmod_test_device *register_test_dev_kmod(void)
+{
+	struct kmod_test_device *test_dev = NULL;
+	int ret;
+
+	mutex_lock(&reg_dev_mutex);
+
+	/* int should suffice for number of devices, test for wrap */
+	if (unlikely(num_test_devs + 1 < 0)) {
+		pr_err("reached limit of number of test devices\n");
+		goto out;
+	}
+
+	test_dev = alloc_test_dev_kmod(num_test_devs);
+	if (!test_dev)
+		goto out;
+
+	ret = misc_register(&test_dev->misc_dev);
+	if (ret) {
+		pr_err("could not register misc device: %d\n", ret);
+		free_test_dev_kmod(test_dev);
+		goto out;
+	}
+
+	test_dev->dev = test_dev->misc_dev.this_device;
+	list_add_tail(&test_dev->list, &reg_test_devs);
+	dev_info(test_dev->dev, "interface ready\n");
+
+	num_test_devs++;
+
+out:
+	mutex_unlock(&reg_dev_mutex);
+
+	return test_dev;
+
+}
+
+static int __init test_kmod_init(void)
+{
+	struct kmod_test_device *test_dev;
+	int ret;
+
+	test_dev = register_test_dev_kmod();
+	if (!test_dev) {
+		pr_err("Cannot add first test kmod device\n");
+		return -ENODEV;
+	}
+
+	/*
+	 * With some work we might be able to gracefully enable
+	 * testing with this driver built-in, for now this seems
+	 * rather risky. For those willing to try have at it,
+	 * and enable the below. Good luck! If that works, try
+	 * lowering the init level for more fun.
+	 */
+	if (force_init_test) {
+		ret = trigger_config_run_type(test_dev,
+					      TEST_KMOD_DRIVER, "tun");
+		if (WARN_ON(ret))
+			return ret;
+		ret = trigger_config_run_type(test_dev,
+					      TEST_KMOD_FS_TYPE, "btrfs");
+		if (WARN_ON(ret))
+			return ret;
+	}
+
+	return 0;
+}
+late_initcall(test_kmod_init);
+
+static
+void unregister_test_dev_kmod(struct kmod_test_device *test_dev)
+{
+	mutex_lock(&test_dev->trigger_mutex);
+	mutex_lock(&test_dev->config_mutex);
+
+	test_dev_kmod_stop_tests(test_dev);
+
+	dev_info(test_dev->dev, "removing interface\n");
+	misc_deregister(&test_dev->misc_dev);
+	/* misc_dev.name is freed by free_test_dev_kmod() below */
+
+	mutex_unlock(&test_dev->config_mutex);
+	mutex_unlock(&test_dev->trigger_mutex);
+
+	free_test_dev_kmod(test_dev);
+}
+
+static void __exit test_kmod_exit(void)
+{
+	struct kmod_test_device *test_dev, *tmp;
+
+	mutex_lock(&reg_dev_mutex);
+	list_for_each_entry_safe(test_dev, tmp, &reg_test_devs, list) {
+		list_del(&test_dev->list);
+		unregister_test_dev_kmod(test_dev);
+	}
+	mutex_unlock(&reg_dev_mutex);
+}
+module_exit(test_kmod_exit);
+
+MODULE_AUTHOR("Luis R. Rodriguez <mcgrof@kernel.org>");
+MODULE_LICENSE("GPL");
diff --git a/tools/testing/selftests/kmod/Makefile b/tools/testing/selftests/kmod/Makefile
new file mode 100644
index 000000000000..fa2ccc5fb3de
--- /dev/null
+++ b/tools/testing/selftests/kmod/Makefile
@@ -0,0 +1,11 @@
+# Makefile for kmod loading selftests
+
+# No binaries, but make sure arg-less "make" doesn't trigger "run_tests"
+all:
+
+TEST_PROGS := kmod.sh
+
+include ../lib.mk
+
+# Nothing to clean up.
+clean:
diff --git a/tools/testing/selftests/kmod/config b/tools/testing/selftests/kmod/config
new file mode 100644
index 000000000000..259f4fd6b5e2
--- /dev/null
+++ b/tools/testing/selftests/kmod/config
@@ -0,0 +1,7 @@
+CONFIG_TEST_KMOD=m
+CONFIG_TEST_LKM=m
+CONFIG_XFS_FS=m
+
+# Needed when the module parameter force_init_test is used
+CONFIG_TUN=m
+CONFIG_BTRFS_FS=m
diff --git a/tools/testing/selftests/kmod/kmod.sh b/tools/testing/selftests/kmod/kmod.sh
new file mode 100755
index 000000000000..10196a62ed09
--- /dev/null
+++ b/tools/testing/selftests/kmod/kmod.sh
@@ -0,0 +1,635 @@
+#!/bin/bash
+#
+# Copyright (C) 2017 Luis R. Rodriguez <mcgrof@kernel.org>
+#
+# This program is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by the Free
+# Software Foundation; either version 2 of the License, or at your option any
+# later version; or, when distributed separately from the Linux kernel or
+# when incorporated into other software packages, subject to the following
+# license:
+#
+# This program is free software; you can redistribute it and/or modify it
+# under the terms of copyleft-next (version 0.3.1 or later) as published
+# at http://copyleft-next.org/.
+
+# This is a stress test script for kmod, the kernel module loader. It uses
+# test_kmod which exposes a series of knobs for the API for us so we can
+# tweak each test in userspace rather than in kernelspace.
+#
+# The way kmod works is it uses the kernel's usermode helper API to eventually
+# call /sbin/modprobe. It has a limit of the number of concurrent calls
+# possible. The kernel interface to load modules is request_module(), however
+# mount uses get_fs_type(). Both behave slightly differently, but the
+# differences are important enough to test each call separately. For this
+# reason test_kmod starts by providing tests for both calls.
+#
+# The test driver test_kmod assumes a series of defaults which you can
+# override by exporting to your environment prior to running this script.
+# For instance this script assumes you do not have xfs loaded upon boot.
+# If this is false, export DEFAULT_KMOD_FS="ext4" prior to running this
+# script if the filesystem module you don't have loaded upon bootup
+# is ext4 instead. Refer to allow_user_defaults() for a list of user
+# override variables possible.
+#
+# You'll want at least 4 GiB of RAM to expect to run these tests
+# without running out of memory on them. For other requirements refer
+# to test_reqs()
+
+set -e
+
+TEST_NAME="kmod"
+TEST_DRIVER="test_${TEST_NAME}"
+TEST_DIR=$(dirname $0)
+
+# This represents
+#
+# TEST_ID:TEST_COUNT:ENABLED
+#
+# TEST_ID: is the test id number
+# TEST_COUNT: number of times we should run the test
+# ENABLED: 1 if enabled, 0 otherwise
+#
+# Once these are enabled please leave them as-is. Write your own test,
+# we have tons of space.
+ALL_TESTS="0001:3:1"
+ALL_TESTS="$ALL_TESTS 0002:3:1"
+ALL_TESTS="$ALL_TESTS 0003:1:1"
+ALL_TESTS="$ALL_TESTS 0004:1:1"
+ALL_TESTS="$ALL_TESTS 0005:10:1"
+ALL_TESTS="$ALL_TESTS 0006:10:1"
+ALL_TESTS="$ALL_TESTS 0007:5:1"
+
+# Disabled tests:
+#
+# 0008 x 150 - multithreaded - push kmod_concurrent over max_modprobes for request_module()
+# Current best-effort failure interpretation:
+# Enough module requests get loaded in place fast enough to reach over the
+# max_modprobes limit and trigger a failure -- before we're even able to
+# start processing pending requests.
+ALL_TESTS="$ALL_TESTS 0008:150:0"
+
+# 0009 x 150 - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()
+# Current best-effort failure interpretation:
+#
+# get_fs_type() requests modules using aliases as such the optimization in
+# place today to look for already loaded modules will not take effect and
+# we end up requesting a new module to load, this bumps the kmod_concurrent,
+# and in certain circumstances can lead to pushing the kmod_concurrent over
+# the max_modprobe limit.
+#
+# This test fails much easier than test 0008 since the alias optimizations
+# are not in place.
+ALL_TESTS="$ALL_TESTS 0009:150:0"
+
+test_modprobe()
+{
+       if [ ! -d $DIR ]; then
+               echo "$0: $DIR not present" >&2
+               echo "You must have the following enabled in your kernel:" >&2
+               cat $TEST_DIR/config >&2
+               exit 1
+       fi
+}
+
+function allow_user_defaults()
+{
+	if [ -z $DEFAULT_KMOD_DRIVER ]; then
+		DEFAULT_KMOD_DRIVER="test_module"
+	fi
+
+	if [ -z $DEFAULT_KMOD_FS ]; then
+		DEFAULT_KMOD_FS="xfs"
+	fi
+
+	if [ -z $PROC_DIR ]; then
+		PROC_DIR="/proc/sys/kernel/"
+	fi
+
+	if [ -z $MODPROBE_LIMIT ]; then
+		MODPROBE_LIMIT=50
+	fi
+
+	if [ -z $DIR ]; then
+		DIR="/sys/devices/virtual/misc/${TEST_DRIVER}0/"
+	fi
+
+	if [ -z $DEFAULT_NUM_TESTS ]; then
+		DEFAULT_NUM_TESTS=150
+	fi
+
+	MODPROBE_LIMIT_FILE="${PROC_DIR}/kmod-limit"
+}
+
+test_reqs()
+{
+	if ! which modprobe 2> /dev/null > /dev/null; then
+		echo "$0: You need modprobe installed" >&2
+		exit 1
+	fi
+
+	if ! which kmod 2> /dev/null > /dev/null; then
+		echo "$0: You need kmod installed" >&2
+		exit 1
+	fi
+
+	# kmod 19 has a bad bug where it returns 0 when modprobe
+	# gets called *even* if the module was not loaded due to
+	# some bad heuristics. For details see the commit URL printed
+	# in the error message below.
+	# A work around is possible in-kernel but it's rather
+	# complex.
+	KMOD_VERSION=$(kmod --version | awk '{print $3}')
+	if [[ $KMOD_VERSION  -le 19 ]]; then
+		echo "$0: You need at least kmod 20" >&2
+		echo "kmod <= 19 is buggy, for details see:" >&2
+		echo "http://git.kernel.org/cgit/utils/kernel/kmod/kmod.git/commit/libkmod/libkmod-module.c?id=fd44a98ae2eb5eb32161088954ab21e58e19dfc4" >&2
+		exit 1
+	fi
+
+	uid=$(id -u)
+	if [ $uid -ne 0 ]; then
+		echo "$0: must be run as root" >&2
+		exit 0
+	fi
+}
+
+function load_req_mod()
+{
+	trap "test_modprobe" EXIT
+
+	if [ ! -d $DIR ]; then
+		# Alanis: "Oh isn't it ironic?"
+		modprobe $TEST_DRIVER
+	fi
+}
+
+test_finish()
+{
+	echo "Test completed"
+}
+
+errno_name_to_val()
+{
+	case "$1" in
+	# kmod calls modprobe and when a module is not found
+	# modprobe returns just 1... However in the kernel we
+	# *sometimes* see 256...
+	MODULE_NOT_FOUND)
+		echo 256;;
+	SUCCESS)
+		echo 0;;
+	-EPERM)
+		echo -1;;
+	-ENOENT)
+		echo -2;;
+	-EINVAL)
+		echo -22;;
+	-ERR_ANY)
+		echo -123456;;
+	*)
+		echo invalid;;
+	esac
+}
+
+errno_val_to_name()
+	case "$1" in
+	256)
+		echo MODULE_NOT_FOUND;;
+	0)
+		echo SUCCESS;;
+	-1)
+		echo -EPERM;;
+	-2)
+		echo -ENOENT;;
+	-22)
+		echo -EINVAL;;
+	-123456)
+		echo -ERR_ANY;;
+	*)
+		echo invalid;;
+	esac
+
+config_set_test_case_driver()
+{
+	if ! echo -n 1 >$DIR/config_test_case; then
+		echo "$0: Unable to set test case to driver" >&2
+		exit 1
+	fi
+}
+
+config_set_test_case_fs()
+{
+	if ! echo -n 2 >$DIR/config_test_case; then
+		echo "$0: Unable to set test case to fs" >&2
+		exit 1
+	fi
+}
+
+config_num_threads()
+{
+	if ! echo -n $1 >$DIR/config_num_threads; then
+		echo "$0: Unable to set number of threads" >&2
+		exit 1
+	fi
+}
+
+config_get_modprobe_limit()
+{
+	if [[ -f ${MODPROBE_LIMIT_FILE} ]] ; then
+		MODPROBE_LIMIT=$(cat $MODPROBE_LIMIT_FILE)
+	fi
+	echo $MODPROBE_LIMIT
+}
+
+config_num_thread_limit_extra()
+{
+	MODPROBE_LIMIT=$(config_get_modprobe_limit)
+	let EXTRA_LIMIT=$MODPROBE_LIMIT+$1
+	config_num_threads $EXTRA_LIMIT
+}
+
+# For special characters use printf directly,
+# refer to kmod_test_0001
+config_set_driver()
+{
+	if ! echo -n $1 >$DIR/config_test_driver; then
+		echo "$0: Unable to set driver" >&2
+		exit 1
+	fi
+}
+
+config_set_fs()
+{
+	if ! echo -n $1 >$DIR/config_test_fs; then
+		echo "$0: Unable to set fs" >&2
+		exit 1
+	fi
+}
+
+config_get_driver()
+{
+	cat $DIR/config_test_driver
+}
+
+config_get_test_result()
+{
+	cat $DIR/test_result
+}
+
+config_reset()
+{
+	if ! echo -n "1" >"$DIR"/reset; then
+		echo "$0: reset should have worked" >&2
+		exit 1
+	fi
+}
+
+config_show_config()
+{
+	echo "----------------------------------------------------"
+	cat "$DIR"/config
+	echo "----------------------------------------------------"
+}
+
+config_trigger()
+{
+	if ! echo -n "1" >"$DIR"/trigger_config 2>/dev/null; then
+		echo "$1: FAIL - loading should have worked"
+		config_show_config
+		exit 1
+	fi
+	echo "$1: OK! - loading kmod test"
+}
+
+config_trigger_want_fail()
+{
+	if echo "1" > $DIR/trigger_config 2>/dev/null; then
+		echo "$1: FAIL - test case was expected to fail"
+		config_show_config
+		exit 1
+	fi
+	echo "$1: OK! - kmod test case failed as expected"
+}
+
+config_expect_result()
+{
+	RC=$(config_get_test_result)
+	RC_NAME=$(errno_val_to_name $RC)
+
+	ERRNO_NAME=$2
+	ERRNO=$(errno_name_to_val $ERRNO_NAME)
+
+	if [[ $ERRNO_NAME = "-ERR_ANY" ]]; then
+		if [[ $RC -ge 0 ]]; then
+			echo "$1: FAIL, test expects $ERRNO_NAME - got $RC_NAME ($RC)" >&2
+			config_show_config
+			exit 1
+		fi
+	elif [[ $RC != $ERRNO ]]; then
+		echo "$1: FAIL, test expects $ERRNO_NAME ($ERRNO) - got $RC_NAME ($RC)" >&2
+		config_show_config
+		exit 1
+	fi
+	echo "$1: OK! - Return value: $RC ($RC_NAME), expected $ERRNO_NAME"
+}
+
+kmod_defaults_driver()
+{
+	config_reset
+	modprobe -r $DEFAULT_KMOD_DRIVER
+	config_set_driver $DEFAULT_KMOD_DRIVER
+}
+
+kmod_defaults_fs()
+{
+	config_reset
+	modprobe -r $DEFAULT_KMOD_FS
+	config_set_fs $DEFAULT_KMOD_FS
+	config_set_test_case_fs
+}
+
+kmod_test_0001_driver()
+{
+	NAME='\000'
+
+	kmod_defaults_driver
+	config_num_threads 1
+	printf '\000' >"$DIR"/config_test_driver
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} MODULE_NOT_FOUND
+}
+
+kmod_test_0001_fs()
+{
+	NAME='\000'
+
+	kmod_defaults_fs
+	config_num_threads 1
+	printf '\000' >"$DIR"/config_test_fs
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} -EINVAL
+}
+
+kmod_test_0001()
+{
+	kmod_test_0001_driver
+	kmod_test_0001_fs
+}
+
+kmod_test_0002_driver()
+{
+	NAME="nope-$DEFAULT_KMOD_DRIVER"
+
+	kmod_defaults_driver
+	config_set_driver $NAME
+	config_num_threads 1
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} MODULE_NOT_FOUND
+}
+
+kmod_test_0002_fs()
+{
+	NAME="nope-$DEFAULT_KMOD_FS"
+
+	kmod_defaults_fs
+	config_set_fs $NAME
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} -EINVAL
+}
+
+kmod_test_0002()
+{
+	kmod_test_0002_driver
+	kmod_test_0002_fs
+}
+
+kmod_test_0003()
+{
+	kmod_defaults_fs
+	config_num_threads 1
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+kmod_test_0004()
+{
+	kmod_defaults_fs
+	config_num_threads 2
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+kmod_test_0005()
+{
+	kmod_defaults_driver
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+kmod_test_0006()
+{
+	kmod_defaults_fs
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+kmod_test_0007()
+{
+	kmod_test_0005
+	kmod_test_0006
+}
+
+kmod_test_0008()
+{
+	kmod_defaults_driver
+	MODPROBE_LIMIT=$(config_get_modprobe_limit)
+	let EXTRA=$MODPROBE_LIMIT/6
+	config_num_thread_limit_extra $EXTRA
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+kmod_test_0009()
+{
+	kmod_defaults_fs
+	MODPROBE_LIMIT=$(config_get_modprobe_limit)
+	let EXTRA=$MODPROBE_LIMIT/4
+	config_num_thread_limit_extra $EXTRA
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} SUCCESS
+}
+
+list_tests()
+{
+	echo "Test ID list:"
+	echo
+	echo "TEST_ID x NUM_TESTS"
+	echo "TEST_ID:   Test ID"
+	echo "NUM_TESTS: Number of recommended times to run the test"
+	echo
+	echo "0001 x $(get_test_count 0001) - Simple test - 1 thread  for empty string"
+	echo "0002 x $(get_test_count 0002) - Simple test - 1 thread  for modules/filesystems that do not exist"
+	echo "0003 x $(get_test_count 0003) - Simple test - 1 thread  for get_fs_type() only"
+	echo "0004 x $(get_test_count 0004) - Simple test - 2 threads for get_fs_type() only"
+	echo "0005 x $(get_test_count 0005) - multithreaded tests with default setup - request_module() only"
+	echo "0006 x $(get_test_count 0006) - multithreaded tests with default setup - get_fs_type() only"
+	echo "0007 x $(get_test_count 0007) - multithreaded tests with default setup test request_module() and get_fs_type()"
+	echo "0008 x $(get_test_count 0008) - multithreaded - push kmod_concurrent over max_modprobes for request_module()"
+	echo "0009 x $(get_test_count 0009) - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()"
+}
+
+usage()
+{
+	NUM_TESTS=$(grep -o ' ' <<<"$ALL_TESTS" | grep -c .)
+	let NUM_TESTS=$NUM_TESTS+1
+	MAX_TEST=$(printf "%04d\n" $NUM_TESTS)
+	echo "Usage: $0 [ -t <4-number-digit> ] | [ -w <4-number-digit> ] |"
+	echo "		 [ -s <4-number-digit> ] | [ -c <4-number-digit> <test-count> ]"
+	echo "           [ all ] [ -h | --help ] [ -l ]"
+	echo ""
+	echo "Valid tests: 0001-$MAX_TEST"
+	echo ""
+	echo "    all     Runs all tests (default)"
+	echo "    -t      Run test ID the recommended number of times"
+	echo "    -w      Watch test ID run until it runs into an error"
+	echo "    -c      Run test ID x test-count number of times"
+	echo "    -s      Run test ID once"
+	echo "    -l      List all test IDs"
+	echo " -h|--help  Help"
+	echo
+	echo "If an error ever occurs execution will immediately terminate."
+	echo "If you are adding a new test try using -w <test-ID> first to"
+	echo "make sure the test passes a series of tests."
+	echo
+	echo Example uses:
+	echo
+	echo "${TEST_NAME}.sh		-- executes all tests"
+	echo "${TEST_NAME}.sh -t 0008	-- Executes test ID 0008 the recommended number of times"
+	echo "${TEST_NAME}.sh -w 0008	-- Watch test ID 0008 run until an error occurs"
+	echo "${TEST_NAME}.sh -s 0008	-- Run test ID 0008 once"
+	echo "${TEST_NAME}.sh -c 0008 3	-- Run test ID 0008 three times"
+	echo
+	list_tests
+	exit 1
+}
+
+function test_num()
+{
+	re='^[0-9]+$'
+	if ! [[ $1 =~ $re ]]; then
+		usage
+	fi
+}
+
+function get_test_count()
+{
+	test_num $1
+	TEST_DATA=$(echo $ALL_TESTS | awk '{print $'$1'}')
+	LAST_TWO=${TEST_DATA#*:*}
+	echo ${LAST_TWO%:*}
+}
+
+function get_test_enabled()
+{
+	test_num $1
+	TEST_DATA=$(echo $ALL_TESTS | awk '{print $'$1'}')
+	echo ${TEST_DATA#*:*:}
+}
+
+function run_all_tests()
+{
+	for i in $ALL_TESTS ; do
+		TEST_ID=${i%:*:*}
+		ENABLED=$(get_test_enabled $TEST_ID)
+		TEST_COUNT=$(get_test_count $TEST_ID)
+		if [[ $ENABLED -eq "1" ]]; then
+			test_case $TEST_ID $TEST_COUNT
+		fi
+	done
+}
+
+function watch_log()
+{
+	if [ $# -ne 3 ]; then
+		clear
+	fi
+	date
+	echo "Running test: $2 - run #$1"
+}
+
+function watch_case()
+{
+	i=0
+	while [ 1 ]; do
+
+		if [ $# -eq 1 ]; then
+			test_num $1
+			watch_log $i ${TEST_NAME}_test_$1
+			${TEST_NAME}_test_$1
+		else
+			watch_log $i all
+			run_all_tests
+		fi
+		let i=$i+1
+	done
+}
+
+function test_case()
+{
+	NUM_TESTS=$DEFAULT_NUM_TESTS
+	if [ $# -eq 2 ]; then
+		NUM_TESTS=$2
+	fi
+
+	i=0
+	while [ $i -lt $NUM_TESTS ]; do
+		test_num $1
+		watch_log $i ${TEST_NAME}_test_$1 noclear
+		RUN_TEST=${TEST_NAME}_test_$1
+		$RUN_TEST
+		let i=$i+1
+	done
+}
+
+function parse_args()
+{
+	if [ $# -eq 0 ]; then
+		run_all_tests
+	else
+		if [[ "$1" = "all" ]]; then
+			run_all_tests
+		elif [[ "$1" = "-w" ]]; then
+			shift
+			watch_case $@
+		elif [[ "$1" = "-t" ]]; then
+			shift
+			test_num $1
+			test_case $1 $(get_test_count $1)
+		elif [[ "$1" = "-c" ]]; then
+			shift
+			test_num $1
+			test_num $2
+			test_case $1 $2
+		elif [[ "$1" = "-s" ]]; then
+			shift
+			test_case $1 1
+		elif [[ "$1" = "-l" ]]; then
+			list_tests
+		elif [[ "$1" = "-h" || "$1" = "--help" ]]; then
+			usage
+		else
+			usage
+		fi
+	fi
+}
+
+test_reqs
+allow_user_defaults
+load_req_mod
+
+trap "test_finish" EXIT
+
+parse_args $@
+
+exit 0
-- 
2.11.0

^ permalink raw reply	[flat|nested] 69+ messages in thread

* [PATCH v4 3/3] kmod: throttle kmod thread limit
  2017-06-28 22:31     ` [PATCH v4 0/3] " Luis R. Rodriguez
  2017-06-28 22:31       ` [PATCH v4 1/3] MAINTAINERS: give kmod some maintainer love Luis R. Rodriguez
  2017-06-28 22:31       ` [PATCH v4 2/3] kmod: add test driver to stress test the module loader Luis R. Rodriguez
@ 2017-06-28 22:31       ` Luis R. Rodriguez
  2 siblings, 0 replies; 69+ messages in thread
From: Luis R. Rodriguez @ 2017-06-28 22:31 UTC (permalink / raw)
  To: akpm
  Cc: jeyu, shuah, rusty, ebiederm, dmitry.torokhov, acme, josh,
	martin.wilck, mmarek, pmladek, hare, rwright, jeffm, DSterba,
	fdmanana, neilb, linux, rgoldwyn, subashab, xypron.glpk,
	keescook, atomlin, mbenes, paulmck, dan.j.williams, jpoimboe,
	davem, mingo, alan, tytso, gregkh, torvalds, dave,
	linux-kselftest, linux-kernel, Luis R. Rodriguez

If we reach the limit of modprobe_limit threads running, the next
request_module() call will fail. The original reason for adding
a kill was to do away with possible issues in old circumstances
which would create a recursive series of request_module() calls.

We can do better than aggressively rejecting calls once we've reached
the limit: simply make pending callers wait until the number of running
threads drops below the threshold, and then throttle them in, one by
one.

This throttling enables requests over the kmod concurrent limit to
be processed once a pending request completes. Only the first item
queued up to wait is woken up. The assumption here is that once a task
is woken it will have no other option but to also kick the queue to
check if there are more pending tasks, regardless of whether or not it
was successful.

By throttling and processing at most the kmod concurrent limit of tasks
at a time, we avoid unexpected fatal request_module() failures and we
keep memory consumption on module loading to a minimum.
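
The slot accounting described above can be sketched in userspace C11.
This is only an illustration under stated assumptions, not the kernel
code: MAX_SLOTS, free_slots, dec_if_positive(), slot_tryget() and
slot_put() are hypothetical names, and C11 atomics stand in for the
kernel's atomic_t. The helper mirrors the semantics of
atomic_dec_if_positive(), which is the condition the patch re-checks
inside wait_event_interruptible().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for MAX_KMOD_CONCURRENT. */
#define MAX_SLOTS 50

/* Hypothetical stand-in for kmod_concurrent_max: counts free slots. */
static atomic_int free_slots = MAX_SLOTS;

/*
 * Mirrors the semantics of the kernel's atomic_dec_if_positive():
 * decrement only if the result stays >= 0, and return the old value
 * minus one whether or not the decrement happened.
 */
static int dec_if_positive(atomic_int *v)
{
	int old = atomic_load(v);

	/* A failed CAS reloads 'old' with the current value, then we retry. */
	while (old > 0 && !atomic_compare_exchange_weak(v, &old, old - 1))
		;
	return old - 1;
}

/* Take a slot if one is free; a negative result means the limit is hit. */
static bool slot_tryget(void)
{
	return dec_if_positive(&free_slots) >= 0;
}

/* Give a slot back; in the patch this is followed by wake_up(). */
static void slot_put(void)
{
	int old = atomic_fetch_add(&free_slots, 1);

	/* Returning more slots than were taken would be a caller bug. */
	assert(old < MAX_SLOTS);
}
```

In the patch itself a caller that fails the first attempt sleeps on
kmod_wq until the same decrement succeeds, so the counter never goes
negative and a waiter that wakes up already owns its slot.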

On x86_64 qemu with 4 cores and 4 GiB of RAM, the two tests take the
following time to run:

time ./kmod.sh -t 0008
real    0m16.366s
user    0m0.883s
sys     0m8.916s

time ./kmod.sh -t 0009
real    0m50.803s
user    0m0.791s
sys     0m9.852s

Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 kernel/kmod.c                        | 16 +++++++---------
 tools/testing/selftests/kmod/kmod.sh | 24 ++----------------------
 2 files changed, 9 insertions(+), 31 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index ff68198fe83b..6d016c5d97c8 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -68,6 +68,7 @@ static DECLARE_RWSEM(umhelper_sem);
  */
 #define MAX_KMOD_CONCURRENT 50
 static atomic_t kmod_concurrent_max = ATOMIC_INIT(MAX_KMOD_CONCURRENT);
+static DECLARE_WAIT_QUEUE_HEAD(kmod_wq);
 
 /*
 	modprobe_path is set via /proc/sys.
@@ -140,7 +141,6 @@ int __request_module(bool wait, const char *fmt, ...)
 	va_list args;
 	char module_name[MODULE_NAME_LEN];
 	int ret;
-	static int kmod_loop_msg;
 
 	/*
 	 * We don't allow synchronous module loading from async.  Module
@@ -164,14 +164,11 @@ int __request_module(bool wait, const char *fmt, ...)
 		return ret;
 
 	if (atomic_dec_if_positive(&kmod_concurrent_max) < 0) {
-		/* We may be blaming an innocent here, but unlikely */
-		if (kmod_loop_msg < 5) {
-			printk(KERN_ERR
-			       "request_module: runaway loop modprobe %s\n",
-			       module_name);
-			kmod_loop_msg++;
-		}
-		return -ENOMEM;
+		pr_warn_ratelimited("request_module: kmod_concurrent_max (%u) close to 0 (max_modprobes: %u), for module %s, throttling...",
+				    atomic_read(&kmod_concurrent_max),
+				    MAX_KMOD_CONCURRENT, module_name);
+		wait_event_interruptible(kmod_wq,
+					 atomic_dec_if_positive(&kmod_concurrent_max) >= 0);
 	}
 
 	trace_module_request(module_name, wait, _RET_IP_);
@@ -179,6 +176,7 @@ int __request_module(bool wait, const char *fmt, ...)
 	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
 
 	atomic_inc(&kmod_concurrent_max);
+	wake_up(&kmod_wq);
 
 	return ret;
 }
diff --git a/tools/testing/selftests/kmod/kmod.sh b/tools/testing/selftests/kmod/kmod.sh
index 10196a62ed09..8cecae9a8bca 100755
--- a/tools/testing/selftests/kmod/kmod.sh
+++ b/tools/testing/selftests/kmod/kmod.sh
@@ -59,28 +59,8 @@ ALL_TESTS="$ALL_TESTS 0004:1:1"
 ALL_TESTS="$ALL_TESTS 0005:10:1"
 ALL_TESTS="$ALL_TESTS 0006:10:1"
 ALL_TESTS="$ALL_TESTS 0007:5:1"
-
-# Disabled tests:
-#
-# 0008 x 150 -  multithreaded - push kmod_concurrent over max_modprobes for request_module()"
-# Current best-effort failure interpretation:
-# Enough module requests get loaded in place fast enough to reach over the
-# max_modprobes limit and trigger a failure -- before we're even able to
-# start processing pending requests.
-ALL_TESTS="$ALL_TESTS 0008:150:0"
-
-# 0009 x 150 - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()"
-# Current best-effort failure interpretation:
-#
-# get_fs_type() requests modules using aliases as such the optimization in
-# place today to look for already loaded modules will not take effect and
-# we end up requesting a new module to load, this bumps the kmod_concurrent,
-# and in certain circumstances can lead to pushing the kmod_concurrent over
-# the max_modprobe limit.
-#
-# This test fails much easier than test 0008 since the alias optimizations
-# are not in place.
-ALL_TESTS="$ALL_TESTS 0009:150:0"
+ALL_TESTS="$ALL_TESTS 0008:150:1"
+ALL_TESTS="$ALL_TESTS 0009:150:1"
 
 test_modprobe()
 {
-- 
2.11.0


Thread overview: 69+ messages
2017-05-19  3:24 [PATCH 0/6] kmod: few simple enhancements Luis R. Rodriguez
2017-05-19  3:24 ` [PATCH 1/6] kmod: add dynamic max concurrent thread count Luis R. Rodriguez
2017-05-19 20:44   ` Dmitry Torokhov
     [not found]     ` <CAB=NE6XGL24O+JfTNUG0HO4obhDc-v+HyL0SCrQELiZrj2-qNw@mail.gmail.com>
     [not found]       ` <CAB=NE6Wa4Nemh80yaCCwbjrNRLPD+GJMncg12APg9Vq63AWVng@mail.gmail.com>
     [not found]         ` <CAB=NE6Vc6RDAytn2Pkv2V58HFo8ncR0eOHZ3===kbZ2NF78ubg@mail.gmail.com>
     [not found]           ` <CAB=NE6Vqmx=y6muenpuQKynTP=pGWMF8tzoCA0BXD6d63q9wPg@mail.gmail.com>
2017-05-19 21:58             ` Dmitry Torokhov
2017-05-25 16:22               ` Luis R. Rodriguez
2017-05-25 16:38                 ` Dmitry Torokhov
2017-05-25 16:50                   ` Luis R. Rodriguez
2017-05-25 17:30                     ` Dmitry Torokhov
2017-05-25 17:38                       ` Luis R. Rodriguez
2017-05-25 18:06                         ` Luis R. Rodriguez
2017-05-25 18:26                           ` Dmitry Torokhov
2017-05-25 19:01                             ` Luis R. Rodriguez
2017-05-25 21:38                               ` Luis R. Rodriguez
2017-05-19  3:24 ` [PATCH 2/6] module: use list_for_each_entry_rcu() on find_module_all() Luis R. Rodriguez
2017-05-23 14:46   ` Miroslav Benes
2017-05-19  3:24 ` [PATCH 3/6] kmod: provide wrappers for kmod_concurrent inc/dec Luis R. Rodriguez
2017-05-19  3:24 ` [PATCH 4/6] kmod: return -EBUSY if modprobe limit is reached Luis R. Rodriguez
2017-05-19  3:24 ` [PATCH 5/6] kmod: preempt on kmod_umh_threads_get() Luis R. Rodriguez
2017-05-19 22:27   ` Dmitry Torokhov
2017-05-25  0:14     ` Luis R. Rodriguez
2017-05-25  0:45       ` Dmitry Torokhov
2017-05-25  1:00         ` Luis R. Rodriguez
2017-05-25  2:27           ` Dmitry Torokhov
2017-05-25 11:19             ` Petr Mladek
2017-05-25 15:38               ` Luis R. Rodriguez
2017-05-25 16:42               ` Dmitry Torokhov
2017-05-25 15:18             ` Jessica Yu
2017-05-19  3:24 ` [PATCH 6/6] kmod: use simplified rate limit printk Luis R. Rodriguez
2017-05-19 22:23   ` Dmitry Torokhov
2017-05-23  9:00     ` Petr Mladek
2017-05-26  0:16 ` [PATCH v2 0/5] kmod: help make deterministic Luis R. Rodriguez
2017-05-26  0:16   ` [PATCH v2 1/5] module: use list_for_each_entry_rcu() on find_module_all() Luis R. Rodriguez
2017-05-26  0:16   ` [PATCH v2 2/5] kmod: reduce atomic operations on kmod_concurrent Luis R. Rodriguez
2017-05-26  1:11     ` Dmitry Torokhov
2017-05-26 20:03       ` Luis R. Rodriguez
2017-05-26  0:16   ` [PATCH v2 3/5] kmod: add test driver to stress test the module loader Luis R. Rodriguez
2017-05-26  0:16   ` [PATCH v2 4/5] kmod: add helpers for getting kmod limit Luis R. Rodriguez
2017-05-26  0:56     ` Dmitry Torokhov
2017-05-26 20:27       ` Luis R. Rodriguez
2017-05-26  0:16   ` [PATCH v2 5/5] kmod: throttle kmod thread limit Luis R. Rodriguez
2017-05-26 21:12   ` [PATCH v3 0/4] kmod: help make deterministic Luis R. Rodriguez
2017-05-26 21:12     ` [PATCH v3 1/4] module: use list_for_each_entry_rcu() on find_module_all() Luis R. Rodriguez
2017-05-26 21:12     ` [PATCH v3 2/4] kmod: reduce atomic operations on kmod_concurrent and simplify Luis R. Rodriguez
2017-06-23 19:19       ` [PATCH v4 " Luis R. Rodriguez
2017-06-26 11:36       ` [PATCH v3 " Petr Mladek
2017-05-26 21:12     ` [PATCH v3 3/4] kmod: add test driver to stress test the module loader Luis R. Rodriguez
2017-05-26 21:12     ` [PATCH v3 4/4] kmod: throttle kmod thread limit Luis R. Rodriguez
2017-06-22 15:19       ` Petr Mladek
2017-06-23 16:16         ` Luis R. Rodriguez
2017-06-23 17:56           ` Luis R. Rodriguez
2017-06-23 19:16             ` Luis R. Rodriguez
2017-06-26 10:03               ` Petr Mladek
2017-06-26  9:55           ` Petr Mladek
2017-06-23 19:20       ` [PATCH v4 " Luis R. Rodriguez
2017-06-26 11:38         ` Petr Mladek
2017-06-28 22:11           ` Luis R. Rodriguez
2017-06-20 20:56     ` [PATCH v3 0/4] kmod: help make deterministic Luis R. Rodriguez
2017-06-21  0:23       ` Kees Cook
2017-06-26 21:37         ` Jessica Yu
2017-06-26 22:44           ` Luis R. Rodriguez
2017-06-27  0:27             ` Luis R. Rodriguez
2017-06-27  8:13               ` Petr Mladek
2017-06-27 10:04                 ` Jessica Yu
2017-06-27 15:26             ` Jessica Yu
2017-06-28  0:49               ` Luis R. Rodriguez
2017-06-28 22:31     ` [PATCH v4 0/3] " Luis R. Rodriguez
2017-06-28 22:31       ` [PATCH v4 1/3] MAINTAINERS: give kmod some maintainer love Luis R. Rodriguez
2017-06-28 22:31       ` [PATCH v4 2/3] kmod: add test driver to stress test the module loader Luis R. Rodriguez
2017-06-28 22:31       ` [PATCH v4 3/3] kmod: throttle kmod thread limit Luis R. Rodriguez
