Date: Thu, 25 May 2017 18:11:00 -0700
From: Dmitry Torokhov
To: "Luis R. Rodriguez"
Cc: akpm@linux-foundation.org, jeyu@redhat.com, shuah@kernel.org,
	rusty@rustcorp.com.au, ebiederm@xmission.com, acme@redhat.com,
	corbet@lwn.net, martin.wilck@suse.com, mmarek@suse.com,
	pmladek@suse.com, hare@suse.com, rwright@hpe.com, jeffm@suse.com,
	DSterba@suse.com, fdmanana@suse.com, neilb@suse.com,
	linux@roeck-us.net, rgoldwyn@suse.com, subashab@codeaurora.org,
	xypron.glpk@gmx.de, keescook@chromium.org, atomlin@redhat.com,
	mbenes@suse.cz, paulmck@linux.vnet.ibm.com, dan.j.williams@intel.com,
	jpoimboe@redhat.com, davem@davemloft.net, mingo@redhat.com,
	alan@linux.intel.com, tytso@mit.edu, gregkh@linuxfoundation.org,
	torvalds@linux-foundation.org, linux-kselftest@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 2/5] kmod: reduce atomic operations on kmod_concurrent
Message-ID: <20170526011100.GF29859@dtor-ws>
References: <20170519032444.18416-1-mcgrof@kernel.org>
 <20170526001630.19203-1-mcgrof@kernel.org>
 <20170526001630.19203-3-mcgrof@kernel.org>
In-Reply-To: <20170526001630.19203-3-mcgrof@kernel.org>

On Thu, May 25, 2017 at 05:16:27PM -0700, Luis R. Rodriguez wrote:
> When checking if we want to allow a kmod thread to kick off we increment,
> then read to see if we should enable a thread. If we were over the allowed
> limit we decrement. Splitting the increment far apart from the decrement
> means there can be a window where two increments happen, potentially
> giving a false failure on a thread which should have been allowed.
>
> CPU1			CPU2
> atomic_inc()
> 			atomic_inc()
> atomic_read()
> 			atomic_read()
> atomic_dec()
> 			atomic_dec()
>
> In this case the read on CPU1 sees both atomic_inc()'s and we could
> wrongly deny it a kmod thread. We could try to prevent this with a lock
> or preemption but that is overkill. We can fix this by reducing the
> number of atomic operations. We do this by inverting the logic of the
> enabler: instead of incrementing kmod_concurrent as we get new kmod
> users, define the variable kmod_concurrent_max as the max number of
> currently allowed kmod users, and as we get new kmod users just decrement
> it if it is still positive. This combines the dec and the read in one
> atomic operation.
>
> In this case we no longer get the same false failure:
>
> CPU1			CPU2
> atomic_dec_if_positive()
> 			atomic_dec_if_positive()
> atomic_inc()
> 			atomic_inc()
>
> Suggested-by: Petr Mladek
> Signed-off-by: Luis R. Rodriguez
> ---
>  include/linux/kmod.h |  2 ++
>  init/main.c          |  1 +
>  kernel/kmod.c        | 44 +++++++++++++++++++++++++-------------------
>  3 files changed, 28 insertions(+), 19 deletions(-)
>
> diff --git a/include/linux/kmod.h b/include/linux/kmod.h
> index c4e441e00db5..8e2f302b214a 100644
> --- a/include/linux/kmod.h
> +++ b/include/linux/kmod.h
> @@ -38,10 +38,12 @@ int __request_module(bool wait, const char *name, ...);
>  #define request_module_nowait(mod...) __request_module(false, mod)
>  #define try_then_request_module(x, mod...) \
>  	((x) ?: (__request_module(true, mod), (x)))
> +void init_kmod_umh(void);
>  #else
>  static inline int request_module(const char *name, ...) { return -ENOSYS; }
>  static inline int request_module_nowait(const char *name, ...) { return -ENOSYS; }
>  #define try_then_request_module(x, mod...) (x)
> +static inline void init_kmod_umh(void) { }
>  #endif
>
>
> diff --git a/init/main.c b/init/main.c
> index 9ec09ff8a930..9b20be716cf7 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -650,6 +650,7 @@ asmlinkage __visible void __init start_kernel(void)
>  	thread_stack_cache_init();
>  	cred_init();
>  	fork_init();
> +	init_kmod_umh();
>  	proc_caches_init();
>  	buffer_init();
>  	key_init();
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index 563f97e2be36..cafd27b92d19 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -46,6 +46,7 @@
>  #include <trace/events/module.h>
>
>  extern int max_threads;
> +unsigned int max_modprobes;
>
>  #define CAP_BSET	(void *)1
>  #define CAP_PI		(void *)2
> @@ -56,6 +57,8 @@ static DEFINE_SPINLOCK(umh_sysctl_lock);
>  static DECLARE_RWSEM(umhelper_sem);
>
>  #ifdef CONFIG_MODULES
> +static atomic_t kmod_concurrent_max = ATOMIC_INIT(0);
> +#define MAX_KMOD_CONCURRENT 50	/* Completely arbitrary value - KAO */
>
>  /*
>  	modprobe_path is set via /proc/sys.
> @@ -127,10 +130,7 @@ int __request_module(bool wait, const char *fmt, ...)
>  {
>  	va_list args;
>  	char module_name[MODULE_NAME_LEN];
> -	unsigned int max_modprobes;
>  	int ret;
> -	static atomic_t kmod_concurrent = ATOMIC_INIT(0);
> -#define MAX_KMOD_CONCURRENT 50	/* Completely arbitrary value - KAO */
>  	static int kmod_loop_msg;
>
>  	/*
> @@ -154,21 +154,7 @@ int __request_module(bool wait, const char *fmt, ...)
>  	if (ret)
>  		return ret;
>
> -	/* If modprobe needs a service that is in a module, we get a recursive
> -	 * loop. Limit the number of running kmod threads to max_threads/2 or
> -	 * MAX_KMOD_CONCURRENT, whichever is the smaller. A cleaner method
> -	 * would be to run the parents of this process, counting how many times
> -	 * kmod was invoked. That would mean accessing the internals of the
> -	 * process tables to get the command line, proc_pid_cmdline is static
> -	 * and it is not worth changing the proc code just to handle this case.
> -	 * KAO.
> -	 *
> -	 * "trace the ppid" is simple, but will fail if someone's
> -	 * parent exits. I think this is as good as it gets. --RR
> -	 */
> -	max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
> -	atomic_inc(&kmod_concurrent);
> -	if (atomic_read(&kmod_concurrent) > max_modprobes) {
> +	if (atomic_dec_if_positive(&kmod_concurrent_max) < 0) {
>  		/* We may be blaming an innocent here, but unlikely */
>  		if (kmod_loop_msg < 5) {
>  			printk(KERN_ERR
> @@ -184,10 +170,30 @@ int __request_module(bool wait, const char *fmt, ...)
>
>  	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
>
> -	atomic_dec(&kmod_concurrent);
> +	atomic_inc(&kmod_concurrent_max);
> +
>  	return ret;
>  }
>  EXPORT_SYMBOL(__request_module);
> +
> +/*
> + * If modprobe needs a service that is in a module, we get a recursive
> + * loop. Limit the number of running kmod threads to max_threads/2 or
> + * MAX_KMOD_CONCURRENT, whichever is the smaller. A cleaner method
> + * would be to run the parents of this process, counting how many times
> + * kmod was invoked. That would mean accessing the internals of the
> + * process tables to get the command line, proc_pid_cmdline is static
> + * and it is not worth changing the proc code just to handle this case.
> + *
> + * "trace the ppid" is simple, but will fail if someone's
> + * parent exits. I think this is as good as it gets.
> + */
> +void __init init_kmod_umh(void)
> +{
> +	max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
> +	atomic_set(&kmod_concurrent_max, max_modprobes);

I would love it if we could initialize the atomic statically. The trouble
we are trying to avoid here is creating more kmod threads than the kernel
supports, with the thread count being calculated as:

	threads = div64_u64((u64) totalram_pages * (u64) PAGE_SIZE,
			    (u64) THREAD_SIZE * 8UL);

So for a system not to be able to serve 50 threads it would have to be
smaller than 3200 pages, or ~13M of memory (assuming a thread size of
8 pages - 64-bit with KASAN; with that assumption the formula reduces to
totalram_pages / 64, so fewer than 50 threads means fewer than 3200
pages - and smaller page sizes lower the memory figure even further).
Can you run 4.12 with module support on a machine with that little
memory?

So maybe we should simply say:

	static atomic_t kmod_concurrent_max = ATOMIC_INIT(MAX_KMOD_CONCURRENT);

and call it a day? Then we do not need init_kmod_umh() and we don't need
to call it from init/main.c.

Thanks.

-- 
Dmitry
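
To make the suggestion above concrete, here is a minimal sketch of how
the statically initialized variant could look. This is an illustration
only, not code from the patch: kmod_get_slot() and kmod_put_slot() are
made-up helper names, while kmod_concurrent_max, MAX_KMOD_CONCURRENT and
atomic_dec_if_positive() are the names already used in the diff.

	/* Sketch: fixed limit set at compile time, so no init_kmod_umh()
	 * and no call from init/main.c would be needed. */
	#define MAX_KMOD_CONCURRENT 50	/* Completely arbitrary value - KAO */
	static atomic_t kmod_concurrent_max = ATOMIC_INIT(MAX_KMOD_CONCURRENT);

	/* Hypothetical helper: claim one modprobe slot.  The decrement and
	 * the limit check are a single atomic operation, so two callers can
	 * never fail each other the way the old paired inc+read could. */
	static bool kmod_get_slot(void)
	{
		return atomic_dec_if_positive(&kmod_concurrent_max) >= 0;
	}

	/* Hypothetical helper: return the slot once modprobe has finished. */
	static void kmod_put_slot(void)
	{
		atomic_inc(&kmod_concurrent_max);
	}

__request_module() would then claim a slot where the patch does
atomic_dec_if_positive() and release it after call_modprobe() returns;
the behaviour matches the patch except that the limit no longer scales
down with max_threads on very small systems.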