All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Luis R. Rodriguez" <mcgrof@kernel.org>
To: Jessica Yu <jeyu@redhat.com>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>,
	Petr Mladek <pmladek@suse.com>, Kees Cook <keescook@chromium.org>,
	shuah@kernel.org, Rusty Russell <rusty@rustcorp.com.au>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Dmitry Torokhov <dmitry.torokhov@gmail.com>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Jonathan Corbet <corbet@lwn.net>,
	martin.wilck@suse.com, Michal Marek <mmarek@suse.com>,
	hare@suse.com, rwright@hpe.com, Jeff Mahoney <jeffm@suse.com>,
	DSterba@suse.com, fdmanana@suse.com, neilb@suse.com,
	Guenter Roeck <linux@roeck-us.net>,
	rgoldwyn@suse.com, subashab@codeaurora.org,
	Heinrich Schuchardt <xypron.glpk@gmx.de>,
	Aaron Tomlin <atomlin@redhat.com>,
	mbenes@suse.cz, "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	"David S. Miller" <davem@davemloft.net>,
	Ingo Molnar <mingo@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-kselftest@vger.kernel.org,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: kmod: provide wrappers for kmod_concurrent inc/dec
Date: Fri, 6 Jan 2017 21:54:42 +0100	[thread overview]
Message-ID: <20170106205442.GW13946@wotan.suse.de> (raw)
In-Reply-To: <20161222044806.bajz6tdg5gtvud2t@jeyu>

On Wed, Dec 21, 2016 at 08:48:06PM -0800, Jessica Yu wrote:
> +++ Luis R. Rodriguez [16/12/16 09:05 +0100]:
> > On Thu, Dec 15, 2016 at 01:46:25PM +0100, Petr Mladek wrote:
> > > On Thu 2016-12-08 22:08:59, Luis R. Rodriguez wrote:
> > > > On Thu, Dec 08, 2016 at 12:29:42PM -0800, Kees Cook wrote:
> > > > > On Thu, Dec 8, 2016 at 11:48 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > > > > kmod_concurrent is used as an atomic counter for enabling
> > > > > > the allowed limit of modprobe calls, provide wrappers for it
> > > > > > to enable this to be expanded on more easily. This will be done
> > > > > > later.
> > > > > >
> > > > > > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > > > > > ---
> > > > > >  kernel/kmod.c | 27 +++++++++++++++++++++------
> > > > > >  1 file changed, 21 insertions(+), 6 deletions(-)
> > > > > >
> > > > > > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > > > > > index cb6f7ca7b8a5..049d7eabda38 100644
> > > > > > --- a/kernel/kmod.c
> > > > > > +++ b/kernel/kmod.c
> > > > > > @@ -108,6 +111,20 @@ static int call_modprobe(char *module_name, int wait)
> > > > > >         return -ENOMEM;
> > > > > >  }
> > > > > >
> > > > > > +static int kmod_umh_threads_get(void)
> > > > > > +{
> > > > > > +       atomic_inc(&kmod_concurrent);
> > > 
> > > This approach might actually cause false failures. If we
> > > are on the limit and more processes do this increment
> > > in parallel, it makes the number bigger that it should be.
> > 
> > This approach is *exactly* what the existing code does :P
> > I just provided wrappers. I agree with the old approach though,
> > reason is it acts as a lock in for the bump.
> 
> I think what Petr meant was that we could run into false failures when multiple
> atomic increments happen between the first increment and the subsequent
> atomic_read.
> 
> Say max_modprobes is 64 -
> 
>       atomic_inc(&kmod_concurrent); // thread 1: kmod_concurrent is 63
>            atomic_inc(&kmod_concurrent); // thread 2: kmod_concurrent is 64
>                 atomic_inc(&kmod_concurrent); // thread 3: kmod_concurrent is 65
>       if (atomic_read(&kmod_concurrent) < max_modprobes) // if all threads read 65 here, then all will error out
>               return 0;                                  // when the first two should have succeeded (false failures)
>       atomic_dec(&kmod_concurrent);
>       return -ENOMEM;
> 
> But yeah, I think this issue was already in the existing kmod code..

Ah right, but the code was very simple and there is only one operation
in between which we'd race against given the old code just incremented
first nd immediately checked for the limit. The more code we have the
more chances for what you describe to happen.

I've added another change into my series, a clutch, its at the end of this
email. With this we change we check for the limit right away and put on
hold any items reaching the limit, while other requests passing the limit
will be bumped. We have then:
 
        if (!kmod_concurrent_sane()) {                                          
                pr_warn_ratelimited("request_module: kmod_concurrent (%u) close to critical levels (max_modprobes: %u) for module %s\n, backing off for a bit",
                                    atomic_read(&kmod_concurrent), max_modprobes, module_name);
                wait_event_interruptible(kmod_wq, kmod_concurrent_sane());      
        }                                                                       
                                                                                
        ret = kmod_umh_threads_get();                                           
        if (ret) {                                                              
                pr_err_ratelimited("%s: module \"%s\" reached limit (%u) of concurrent modprobe calls\n",
                                   __func__, module_name, max_modprobes);       
                return ret;                                                     
        }  

The same race you describe is possible -- but we now would at least use
a clutch immediately as we approach the limit. Maybe it makes sense to
post a new series after I fold the alias code and sanity check into a
debug kconfig option ?

  Luis

commit 95c55552283cf99e2a48b84dc766d5fa547f046e
Author: Luis R. Rodriguez <mcgrof@kernel.org>
Date:   Thu Dec 15 23:24:22 2016 -0600

    kmod: add a clutch around 1/4 of modprobe thread limit
    
    If we reach the limit of modprobe_limit threads running the next
    request_module() call will fail. The original reason for adding
    a kill was to do away with possible issues with in old circumstances
    which would create a recursive series of request_module() calls.
    We can do better than just be super aggressive and reject calls
    once we've reached the limit by adding a clutch so that if we're
    1/4th of the way close to the limit we make these new calls wait
    until pending threads complete.
    
    There is still a chance you can fail new incomming requests which
    can bump kmod_concurrent beyond the limit, however the clutch helps
    with a bit of breathing room to allow the system to process pending
    requests before activating the upper last 1/4th of the limit requests.
    
    This fixes test cases 0008 and 0009 of the selftest for kmod:
    
    tools/testing/selftests/kmod/kmod.sh -t 0008
    tools/testing/selftests/kmod/kmod.sh -t 0009
    
    Both tests reveal the clutch in action:
    
    Dec 15 16:12:14 piggy kernel: request_module: kmod_concurrent (96) close critical levels (max_modprobes: 128) for module test_module
    ...
    Dec 15 16:12:23 piggy kernel: request_module: kmod_concurrent (96) close critical levels (max_modprobes: 128) for module test_module
    ...
    
    The only difference is the clutch helps with avoiding making
    request_module() requests fatal more often. With x86_64 qemu,
    with 4 cores, 4 GiB of RAM it takes the following run time to
    run both tests:
    
    time kmod.sh -t 0008
    real    0m22.247s
    user    0m0.084s
    sys     0m11.328s
    
    time kmod.sh -t 0009
    real    0m58.785s
    user    0m0.492s
    sys     0m10.852s
    
    Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>

diff --git a/kernel/kmod.c b/kernel/kmod.c
index d6595d2de209..f8c880bbf658 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -58,6 +58,7 @@ static DECLARE_RWSEM(umhelper_sem);
 
 #ifdef CONFIG_MODULES
 static atomic_t kmod_concurrent = ATOMIC_INIT(0);
+static DECLARE_WAIT_QUEUE_HEAD(kmod_wq);
 
 /*
 	modprobe_path is set via /proc/sys.
@@ -156,6 +157,16 @@ int get_kmod_umh_count(void)
 	return atomic_read(&kmod_concurrent);
 }
 
+static bool kmod_concurrent_sane(void)
+{
+	unsigned int clutch;
+
+	clutch = get_kmod_umh_limit() - (get_kmod_umh_limit()/4);
+	if (get_kmod_umh_count() < clutch)
+		return true;
+	return false;
+}
+
 /**
  * __request_module - try to load a kernel module
  * @wait: wait (or not) for the operation to complete
@@ -199,6 +210,12 @@ int __request_module(bool wait, const char *fmt, ...)
 	if (ret)
 		return ret;
 
+	if (!kmod_concurrent_sane()) {
+		pr_warn_ratelimited("request_module: kmod_concurrent (%u) close to critical levels (max_modprobes: %u) for module %s\n, backing off for a bit",
+				    atomic_read(&kmod_concurrent), max_modprobes, module_name);
+		wait_event_interruptible(kmod_wq, kmod_concurrent_sane());
+	}
+
 	ret = kmod_umh_threads_get();
 	if (ret) {
 		pr_err_ratelimited("%s: module \"%s\" reached limit (%u) of concurrent modprobe calls\n",
@@ -211,6 +228,7 @@ int __request_module(bool wait, const char *fmt, ...)
 	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
 
 	kmod_umh_threads_put();
+	wake_up_all(&kmod_wq);
 	return ret;
 }
 EXPORT_SYMBOL(__request_module);
diff --git a/tools/testing/selftests/kmod/kmod.sh b/tools/testing/selftests/kmod/kmod.sh
index f8ccc938e0fb..08d9bea4bade 100755
--- a/tools/testing/selftests/kmod/kmod.sh
+++ b/tools/testing/selftests/kmod/kmod.sh
@@ -52,28 +52,8 @@ ALL_TESTS="$ALL_TESTS 0004:1:1"
 ALL_TESTS="$ALL_TESTS 0005:10:1"
 ALL_TESTS="$ALL_TESTS 0006:10:1"
 ALL_TESTS="$ALL_TESTS 0007:5:1"
-
-# Disabled tests:
-#
-# 0008 x 150 -  multithreaded - push kmod_concurrent over max_modprobes for request_module()"
-# Current best-effort failure interpretation:
-# Enough module requests get loaded in place fast enough to reach over the
-# max_modprobes limit and trigger a failure -- before we're even able to
-# start processing pending requests.
-ALL_TESTS="$ALL_TESTS 0008:150:0"
-
-# 0009 x 150 - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()"
-# Current best-effort failure interpretation:
-#
-# get_fs_type() requests modules using aliases as such the optimization in
-# place today to look for already loaded modules will not take effect and
-# we end up requesting a new module to load, this bumps the kmod_concurrent,
-# and in certain circumstances can lead to pushing the kmod_concurrent over
-# the max_modprobe limit.
-#
-# This test fails much easier than test 0008 since the alias optimizations
-# are not in place.
-ALL_TESTS="$ALL_TESTS 0009:150:0"
+ALL_TESTS="$ALL_TESTS 0008:150:1"
+ALL_TESTS="$ALL_TESTS 0009:150:1"
 
 test_modprobe()
 {

  reply	other threads:[~2017-01-06 20:56 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-08 18:47 [RFC 00/10] kmod: stress test driver, few fixes and enhancements Luis R. Rodriguez
2016-12-08 18:47 ` [RFC 01/10] kmod: add test driver to stress test the module loader Luis R. Rodriguez
2016-12-08 20:24   ` Kees Cook
2016-12-13 21:10     ` Luis R. Rodriguez
2016-12-16  7:41       ` Luis R. Rodriguez
2016-12-08 19:48 ` [RFC 02/10] module: fix memory leak on early load_module() failures Luis R. Rodriguez
2016-12-08 20:30   ` Kees Cook
2016-12-08 21:10     ` Luis R. Rodriguez
2016-12-08 21:17       ` Kees Cook
2016-12-09 17:06   ` Miroslav Benes
2016-12-16  8:51     ` Luis R. Rodriguez
2016-12-15 18:46   ` Aaron Tomlin
2016-12-08 19:48 ` [RFC 03/10] kmod: add dynamic max concurrent thread count Luis R. Rodriguez
2016-12-08 20:28   ` Kees Cook
2016-12-08 21:00     ` Luis R. Rodriguez
2016-12-14 15:38   ` Petr Mladek
2016-12-16  8:39     ` Luis R. Rodriguez
2017-01-10 19:24       ` Luis R. Rodriguez
2016-12-08 19:48 ` [RFC 04/10] kmod: provide wrappers for kmod_concurrent inc/dec Luis R. Rodriguez
2016-12-08 20:29   ` Kees Cook
2016-12-08 21:08     ` Luis R. Rodriguez
2016-12-15 12:46       ` Petr Mladek
2016-12-16  8:05         ` Luis R. Rodriguez
2016-12-22  4:48           ` Jessica Yu
2017-01-06 20:54             ` Luis R. Rodriguez [this message]
2017-01-10 18:57           ` [RFC 04/10] " Luis R. Rodriguez
2017-01-11 20:08             ` Luis R. Rodriguez
2017-05-16 18:02               ` Luis R. Rodriguez
2017-05-18  2:37                 ` Luis R. Rodriguez
2016-12-22  5:07   ` Jessica Yu
2017-01-10 20:28     ` Luis R. Rodriguez
2016-12-08 19:48 ` [RFC 05/10] kmod: return -EBUSY if modprobe limit is reached Luis R. Rodriguez
2016-12-08 19:48 ` [RFC 06/10] kmod: provide sanity check on kmod_concurrent access Luis R. Rodriguez
2016-12-14 16:08   ` Petr Mladek
2016-12-14 17:12     ` Luis R. Rodriguez
2016-12-15 12:57   ` Petr Mladek
2017-01-10 20:00     ` Luis R. Rodriguez
2016-12-08 19:49 ` [RFC 07/10] kmod: use simplified rate limit printk Luis R. Rodriguez
2016-12-14 16:23   ` Petr Mladek
2016-12-14 16:41     ` Joe Perches
2016-12-16  8:44     ` Luis R. Rodriguez
2016-12-08 19:49 ` [RFC 08/10] sysctl: add support for unsigned int properly Luis R. Rodriguez
2016-12-08 19:49 ` [RFC 09/10] kmod: add helpers for getting kmod count and limit Luis R. Rodriguez
2016-12-15 16:56   ` Petr Mladek
2016-12-16  7:57     ` Luis R. Rodriguez
2017-01-11 18:27       ` Luis R. Rodriguez
2016-12-08 19:49 ` [RFC 10/10] kmod: add a sanity check on module loading Luis R. Rodriguez
2016-12-09 20:03   ` Martin Wilck
2016-12-09 20:56     ` Linus Torvalds
2016-12-15 18:08       ` Luis R. Rodriguez
2016-12-15  0:27   ` Rusty Russell
2016-12-16  8:31     ` Luis R. Rodriguez
2016-12-17  3:54       ` Rusty Russell
     [not found]         ` <CAB=NE6VvuA9a6hf6yoopGfUxVJQM5HyV5bNzUdsEtUV0UhbG-g@mail.gmail.com>
2016-12-20  0:53           ` Rusty Russell
2016-12-20 18:52             ` Luis R. Rodriguez
2016-12-21  2:21               ` Rusty Russell
2016-12-21 13:08                 ` Luis R. Rodriguez
2017-01-03  0:04                   ` Rusty Russell
2017-01-06 20:36                     ` Luis R. Rodriguez
2017-01-06 21:53                       ` Jessica Yu
2017-01-09 20:27                         ` Luis R. Rodriguez
     [not found]                       ` <87bmvgax51.fsf@rustcorp.com.au>
2017-01-09 19:56                         ` [RFC 10/10] " Luis R. Rodriguez
2017-01-06 21:03                     ` Jessica Yu
2017-01-04  2:47   ` Jessica Yu
2017-01-11 19:10 ` [RFC 00/10] kmod: stress test driver, few fixes and enhancements Luis R. Rodriguez

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170106205442.GW13946@wotan.suse.de \
    --to=mcgrof@kernel.org \
    --cc=DSterba@suse.com \
    --cc=acme@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=atomlin@redhat.com \
    --cc=corbet@lwn.net \
    --cc=dan.j.williams@intel.com \
    --cc=davem@davemloft.net \
    --cc=dmitry.torokhov@gmail.com \
    --cc=ebiederm@xmission.com \
    --cc=fdmanana@suse.com \
    --cc=hare@suse.com \
    --cc=jeffm@suse.com \
    --cc=jeyu@redhat.com \
    --cc=jpoimboe@redhat.com \
    --cc=keescook@chromium.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux@roeck-us.net \
    --cc=martin.wilck@suse.com \
    --cc=mbenes@suse.cz \
    --cc=mingo@redhat.com \
    --cc=mmarek@suse.com \
    --cc=neilb@suse.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=pmladek@suse.com \
    --cc=rgoldwyn@suse.com \
    --cc=rusty@rustcorp.com.au \
    --cc=rwright@hpe.com \
    --cc=shuah@kernel.org \
    --cc=subashab@codeaurora.org \
    --cc=torvalds@linux-foundation.org \
    --cc=xypron.glpk@gmx.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.