From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S969899AbdEYQjD (ORCPT ); Thu, 25 May 2017 12:39:03 -0400
Received: from mail-pf0-f196.google.com ([209.85.192.196]:33929 "EHLO
    mail-pf0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
    with ESMTP id S966777AbdEYQjB (ORCPT );
    Thu, 25 May 2017 12:39:01 -0400
Date: Thu, 25 May 2017 09:38:57 -0700
From: Dmitry Torokhov
To: "Luis R. Rodriguez"
Cc: Filipe Manana, "Paul E. McKenney", linux-doc@vger.kernel.org,
    rgoldwyn@suse.com, hare@suse.com, corbet@lwn.net,
    torvalds@linux-foundation.org, linux-kselftest@vger.kernel.org,
    akpm@linux-foundation.org, dan.j.williams@intel.com,
    atomlin@redhat.com, rwright@hpe.com, xypron.glpk@gmx.de,
    mmarek@suse.com, martin.wilck@suse.com, rusty@rustcorp.com.au,
    jeffm@suse.com, mingo@redhat.com, pmladek@suse.com,
    linux@roeck-us.net, ebiederm@xmission.com, shuah@kernel.org,
    DSterba@suse.com, keescook@chromium.org, gregkh@linuxfoundation.org,
    jpoimboe@redhat.com, acme@redhat.com, mbenes@suse.cz, neilb@suse.com,
    linux-kernel@vger.kernel.org, davem@davemloft.net, jeyu@redhat.com,
    subashab@codeaurora.org
Subject: Re: [PATCH 1/6] kmod: add dynamic max concurrent thread count
Message-ID: <20170525163857.GC26128@dtor-ws>
References: <20170519032444.18416-1-mcgrof@kernel.org>
    <20170519032444.18416-2-mcgrof@kernel.org>
    <20170519204457.GC19281@dtor-ws>
    <20170519215829.GE19281@dtor-ws>
    <20170525162201.GV8951@wotan.suse.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170525162201.GV8951@wotan.suse.de>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, May 25, 2017 at 06:22:01PM +0200, Luis R. Rodriguez wrote:
> On Fri, May 19, 2017 at 02:58:29PM -0700, Dmitry Torokhov wrote:
> > On Fri, May 19, 2017 at 02:45:29PM -0700, Luis R. Rodriguez wrote:
> > > On May 19, 2017 1:45 PM, "Dmitry Torokhov" wrote:
> > >
> > > On Thu, May 18, 2017 at 08:24:39PM -0700, Luis R. Rodriguez wrote:
> > > > We currently statically limit the number of modprobe threads which
> > > > we allow to run concurrently to 50. As per Keith Owens, this was a
> > > > completely arbitrary value, and it was set in the 2.3.38 days [0]
> > > > over 16 years ago in year 2000.
> > > >
> > > > Although we haven't yet hit our lower limits, experimentation [1]
> > > > shows that when and if we hit this limit in the worst case, will be
> > > > fatal -- consider get_fs_type() failures upon mount on a system which
> > > > has many partitions, some of which might even be with the same
> > > > filesystem. Its best to be prudent and increase and set this
> > > > value to something more sensible which ensures we're far from hitting
> > > > the limit and also allows default build/user run time override.
> > > >
> > > > The worst case is fatal given that once a module fails to load there
> > > > is a period of time during which subsequent request for the same module
> > > > will fail, so in the case of partitions its not just one request that
> > > > could fail, but whole series of partitions. This later issue of a
> > > > module request failure domino effect can be addressed later, but
> > > > increasing the limit to something more meaninful should at least give us
> > > > enough cushion to avoid this for a while.
> > > >
> > > > Set this value up with a bit more meaninful modern limits:
> > > >
> > > > Bump this up to 64 max for small systems (CONFIG_BASE_SMALL)
> > > > Bump this up to 128 max for larger systems (!CONFIG_BASE_SMALL)
> > > >
> > > > Also allow the default max limit to be further fine tuned at compile
> > > > time and at initialization at run time at boot up using the kernel
> > > > parameter: max_modprobes.
> > > >
> > > > [0] https://git.kernel.org/cgit/linux/kernel/git/history/history.git/commit/?id=ab1c4ec7410f6ec64e1511d1a7d850fc99c09b44
> > > > [1] https://github.com/mcgrof/test_request_module
> > >
> > > If we actually run into this issue, instead of slamming the system with
> > > bazillion concurrent requests, can we wait for the other modprobes to
> > > finish and then continue?
> > >
> > >
> > > Yes ! That I have a patch that does precisely that ! That is actually still
> > > *not enough* to not fail fatally but this would be subject of another
> > > series with more debatable approaches.
> > >
> >
> > Then please post it.
>
> Will do.
>
> > > This at least pushes us to closer safer limits for now while also making it
> > > configurable.
> > >
> > Making it configurable depending on how big/little box is makes no
> > sense,
>
> If we set a hard limit then we need to patch a system if we need to increment
> it. This is rather stupid given we have no current heuristics to make kmod
> loading deterministic from userspace, and in the worst case this can be fatal.
> General system size is a good first guess, but making it configurable is
> really key given current limitations. I'll post further patches which reveals
> some of these issues more clearly.
>
> > especially if the above is implemented, as depth of modprobe
> > invocations depends on configuration and not computing power of the
> > hardware the system is running on.
>
> You seem to agree making it configurable is sensible , but not depending on
> the system size ?

No, I am saying that making it configurable based on system size makes
no sense at all, and making it configurable given you already have
patches removing hard failures gives no benefit.

Thanks.

--
Dmitry