From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757359AbcEFAm3 (ORCPT ); Thu, 5 May 2016 20:42:29 -0400 Received: from mx1.redhat.com ([209.132.183.28]:52231 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754220AbcEFAm1 (ORCPT ); Thu, 5 May 2016 20:42:27 -0400 Date: Thu, 5 May 2016 20:42:22 -0400 From: Jessica Yu To: Josh Poimboeuf Cc: Miroslav Benes , Jiri Kosina , pmladek@suse.com, jslaby@suse.cz, live-patching@vger.kernel.org, linux-kernel@vger.kernel.org, huawei.libin@huawei.com, minfei.huang@yahoo.com Subject: Re: livepatch: allow removal of a disabled patch Message-ID: <20160506004222.GA31986@packer-debian-8-amd64.digitalocean.com> References: <20160504023948.ttb3ko2wnicwruwy@treble> <20160504033619.5osteklgal3ixcbo@treble> <20160504131423.5yqvie2zy67jspak@treble> <20160504161423.pvupipravfxuyktz@treble> <20160505132748.kwk3pmgetud3c66r@treble> <20160505150403.h76mtzyqikitov6f@treble> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline In-Reply-To: <20160505150403.h76mtzyqikitov6f@treble> X-OS: Linux eisen.io 3.16.0-4-amd64 x86_64 User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org +++ Josh Poimboeuf [05/05/16 10:04 -0500]: >On Thu, May 05, 2016 at 04:25:48PM +0200, Miroslav Benes wrote: >> On Thu, 5 May 2016, Josh Poimboeuf wrote: >> >> > On Thu, May 05, 2016 at 10:28:12AM +0200, Miroslav Benes wrote: >> > > I think it boils down to the following problem. >> > > >> > > 1. CONFIG_DEBUG_KOBJECT_RELEASE=y >> > > >> > > 2. we have dynamic kobjects, so there is a pointer in klp_patch to struct >> > > kobject >> > > >> > > 3. it is allocated during klp_init_patch() and all is fine >> > > >> > > 4. now we want to remove the patch module. It is disabled and module_put() >> > > is called. User calls rmmod on the module. >> > > >> > > 5. klp_unregister_patch() is called in __exit method. >> > > >> > > 6. klp_free_patch() is called. >> > > >> > > 7. kobject_put(patch->kobj) is called. >> > > >> > > ...now it gets interesting... >> > > >> > > 8. among others kobject_cleanup() is scheduled as a delayed work (this is >> > > important). >> > > >> > > 9. there is no completion, so kobject_put returns and the module goes >> > > away. >> > > >> > > 10. someone calls patch enabled_store attribute (for example). They can >> > > because kobject_cleanup() has not been called yet. It is delayed >> > > scheduled. >> > > >> > > ...crash... >> > >> > But what exactly causes the crash? In enabled_store() we can see that >> > the patch isn't in the list, so we can return -EINVAL. >> >> Ok, bad example. Take enabled_show() instead. It could be fixed in the >> same way, but I am not sure it is the right thing to do. It does not scale >> because the problem is elsewhere. >> >> Anyway, it is (even if theoretically) there in my opinion and we >> have two options. >> >> 1. We could forget about CONFIG_DEBUG_KOBJECT_RELEASE and all is ok >> without completion and regardless of dynamic/static kobject allocation. >> >> 2. We introduce completion and we are ok even with >> CONFIG_DEBUG_KOBJECT_RELEASE=y and again regardless of dynamic/static >> kobject allocation. > >I would disagree with the statement that the dynamic kobject doesn't >scale. We would just need a helper function to get from a kobject to >its klp_patch. > >In fact, to me it seems like the right way to do it. It doesn't make >sense for the code which creates the kobject to be different from the >code which initializes it. It's slightly out of context, but >kobject.txt does say: > > "Code which creates a kobject must, of course, initialize that object." > >I view the completion as a hack to compensate for the fact that we're >abusing the kobject interface. And so it makes sense to me that >CONFIG_DEBUG_KOBJECT_RELEASE would cause problems, because we're using >kobjects in the wrong way. > >So in my view, the two options are: > >1. Convert the kobject to dynamic as I described. > >2. Change the klp_register() interface so that klp_patch gets allocated > in livepatch code. > >I'd be curious to hear what others think. So, I think both of these solutions would enable us to get rid of the completion. Let me try to summarize.. For solution #1, if we dynamically allocate the kobject, i.e. we have a pointer now instead of having it embedded in the klp_patch struct, we no longer need to worry if the corresponding klp_patch gets deallocated under our nose. Since the kobject_cleanup function is delayed w/ CONFIG_DEBUG_KOBJECT_RELEASE, it is possible to have sysfs entries that refer to a klp_patch that no longer exists. Thus if any of the sysfs functions get called, we would have to take care to ensure that the klp_patch struct corresponding to the kobject in question actually still exists. In this case, all sysfs functions would require an extra check to make sure the matching klp_patch is still on the patches list and return an error if it isn't found. The "pro" is that this change would be simple, the "con" is that now kobjects are decoupled and managed completely separately from the object (klp_patch) with which they are associated, which doesn't feel 100% right. For solution #2, we could have livepatch manage the (de)allocation of klp_patch objects internally. Maybe in this scenario the caller would need to request a klp_patch object be allocated and the caller would fill out the returned klp_patch struct as appropriate. In this case, we would be able leave the kobject embedded in the klp_patch struct (and dynamic kobjects wouldn't be needed), as livepatch would now have control of both structures. Then during the patch module exit path, when kobject_put is called, the klp_patch struct would be freed in its kobject's release function. We wouldn't have to hold up rmmod, and delayed execution of kobject_cleanup wouldn't break anything, because a klp_patch would then have the same "lifespan" as its corresponding kobject, and therefore it would be safe to invoke enabled_store & co. up until kobject_cleanup is finally executed. We'd be able to use container_of in this case as well. In addition, we wouldn't have to force all sysfs functions to support an awkward edge-case (i.e. checking if the corresponding klp_patch still exists). I think this solution also matches better with the typical use-case of the kobject release method, as described in kobject.txt (replacing 'my_object' with klp_patch): --- (...) This notification is done through a kobject's release() method. Usually such a method has a form like: void my_object_release(struct kobject *kobj) { struct my_object *mine = container_of(kobj, struct my_object, kobj); /* Perform any additional cleanup on this object, then... */ kfree(mine); } --- Apologies for the giant wall of text. In any case I feel like solution #2 is actually closer in line with how kobjects are normally used, embedded in the structures they refer to, which get deallocated once their refcount hits 0. What do people think? Jessica