Date: Tue, 10 Jan 2017 21:00:08 +0100
From: "Luis R. Rodriguez"
To: Petr Mladek, Peter Zijlstra
Cc: "Luis R. Rodriguez", shuah@kernel.org, jeyu@redhat.com, rusty@rustcorp.com.au, ebiederm@xmission.com, dmitry.torokhov@gmail.com, acme@redhat.com, corbet@lwn.net, martin.wilck@suse.com, mmarek@suse.com, hare@suse.com, rwright@hpe.com, jeffm@suse.com, DSterba@suse.com, fdmanana@suse.com, neilb@suse.com, linux@roeck-us.net, rgoldwyn@suse.com, subashab@codeaurora.org, xypron.glpk@gmx.de, keescook@chromium.org, atomlin@redhat.com, mbenes@suse.cz, paulmck@linux.vnet.ibm.com, dan.j.williams@intel.com, jpoimboe@redhat.com, davem@davemloft.net, mingo@redhat.com, akpm@linux-foundation.org, torvalds@linux-foundation.org, linux-kselftest@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC 06/10] kmod: provide sanity check on kmod_concurrent access
Message-ID: <20170110200008.GI13946@wotan.suse.de>
References: <20161208184801.1689-1-mcgrof@kernel.org> <20161208194850.2627-1-mcgrof@kernel.org> <20161215125747.GB14324@pathway.suse.cz>
In-Reply-To: <20161215125747.GB14324@pathway.suse.cz>

On Thu, Dec 15, 2016 at 01:57:48PM +0100, Petr Mladek wrote:
> On Thu 2016-12-08 11:48:50, Luis R. Rodriguez wrote:
> > Only decrement *iff* we're positive.
> > Warn if we've hit a situation where the counter is already 0 after
> > we're done with a modprobe call; this would tell us we have an
> > unaccounted counter access. In theory this should not be possible,
> > as only one routine controls the counter; however, preemption is one
> > case that could trigger this situation. Avoid that situation by
> > disabling preemption while we access the counter.
> >
> > Signed-off-by: Luis R. Rodriguez
> > ---
> >  kernel/kmod.c | 20 ++++++++++++++++----
> >  1 file changed, 16 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > index ab38539f7e91..09cf35a2075a 100644
> > --- a/kernel/kmod.c
> > +++ b/kernel/kmod.c
> > @@ -113,16 +113,28 @@ static int call_modprobe(char *module_name, int wait)
> >
> >  static int kmod_umh_threads_get(void)
> >  {
> > +	int ret = 0;
> > +
> > +	preempt_disable();
> >  	atomic_inc(&kmod_concurrent);
> >  	if (atomic_read(&kmod_concurrent) < max_modprobes)
> > -		return 0;
> > -	atomic_dec(&kmod_concurrent);
> > -	return -EBUSY;
> > +		goto out;
>
> I thought more about it and the disabled preemption might make
> sense here. It makes sure that we are not rescheduled here
> and that kmod_concurrent is not increased by mistake for too long.

I think it's good to add a comment here about this.

> Well, it still would make sense to increment the value
> only when it is under the limit and set the incremented
> value using cmpxchg to avoid races.
>
> I mean to use a similar trick to the one used by refcount_inc(), see
> https://lkml.kernel.org/r/20161114174446.832175072@infradead.org

Right, I see now. Since we are converting this to kref, though, we would
immediately get the advantages of kref_get() using the new refcount_inc()
once that goes in, so I think it's best we just sit tight to get that
benefit. Given that, as Jessica acknowledged, the existing code has had
this issue for ages, waiting a bit longer should not hurt. The disabled
preemption should help in the meantime as well.
The note I've made then is:

/*
 * Disabling preemption makes sure that we are not rescheduled here.
 *
 * Also, preemption helps ensure kmod_concurrent is not increased by
 * mistake for too long, given that in theory two concurrent threads
 * could race on kref_get() before we kref_read().
 *
 * XXX: once Peter's refcount_t gets merged, kref's kref_get() will use
 * the new refcount_inc() and each increment will be atomic with respect
 * to each thread; as such, when Peter's refcount_t gets merged the
 * "Also, preemption ..." part above can be removed.
 */

Come to think of it, once Peter's changes go in, at first glance it may
seem preemption would be pointless then, but I think it just mitigates a
few of the refcount_inc() instances where (old != val), that is, when two
threads got the same bump, so I think it can be kept even after Peter's
refcount_t work.

> > +	atomic_dec_if_positive(&kmod_concurrent);
> > +	ret = -EBUSY;
> > +out:
> > +	preempt_enable();
> > +	return ret;
> >  }
> >
> >  static void kmod_umh_threads_put(void)
> >  {
> > -	atomic_dec(&kmod_concurrent);
> > +	int ret;
> > +
> > +	preempt_disable();
> > +	ret = atomic_dec_if_positive(&kmod_concurrent);
> > +	WARN_ON(ret < 0);
> > +	preempt_enable();
>
> The disabled preemption does not make much sense here.
> We do not need to tie the atomic operation and the WARN
> together so tightly.

Makes sense, will add a note. kref also lacks a mnemonic like
atomic_dec_if_positive(), and since I've now converted this to kref I've
dropped this.

  Luis