linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Petr Mladek <pmladek@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-kernel@vger.kernel.org,
	Luis Chamberlain <mcgrof@kernel.org>,
	Changhui Zhong <czhong@redhat.com>,
	linux-scsi@vger.kernel.org
Subject: Re: [PATCH V2 2/2] kobject: wait until kobject is cleaned up before freeing module
Date: Thu, 7 Jul 2022 12:54:06 +0800	[thread overview]
Message-ID: <YsZm7lSXYAHT14ui@T590> (raw)
In-Reply-To: <Ya9pkdq3n2pVRK+v@alley>

On Tue, Dec 07, 2021 at 03:02:57PM +0100, Petr Mladek wrote:
> On Tue 2021-12-07 20:51:24, Ming Lei wrote:
> > On Tue, Dec 07, 2021 at 11:32:27AM +0100, Petr Mladek wrote:
> > > On Mon 2021-12-06 09:04:40, Greg Kroah-Hartman wrote:
> > > > On Mon, Dec 06, 2021 at 10:13:53AM +0800, Ming Lei wrote:
> > > > > On Fri, Dec 03, 2021 at 04:07:39PM +0100, Greg Kroah-Hartman wrote:
> > > > > > On Mon, Nov 29, 2021 at 11:45:09AM +0800, Ming Lei wrote:
> > > > > > > kobject_put() may become asynchronously because of
> > > > > > > CONFIG_DEBUG_KOBJECT_RELEASE, so once kobject_put() returns, the caller may
> > > > > > > expect the kobject is released after the last refcnt is dropped, however
> > > > > > > CONFIG_DEBUG_KOBJECT_RELEASE just schedules one delayed work function
> > > > > > > for cleaning up the kobject.
> > > > > > 
> > > > > > The caller should NOT expect the kobject to be released.  That's the
> > > > > > whole point of dynamic reference counted objects, you never "know" when
> > > > > > the last object is released.  This option just makes it obvious so that
> > > > > > you know when to fix up code that has this assumption.
> > > > > 
> > > > > > > Inside the cleanup handler, kobj->ktype and kobj->ktype->release are
> > > > > > > required.
> > > > > > 
> > > > > > Yes. Is that a problem?
> > > > > 
> > > > > Of course for CONFIG_DEBUG_KOBJECT_RELEASE, which delays to call
> > > > > ->release after random time, when the module for storing ->ktype and
> > > > > ->ktype->release has been unloaded.
> > > > > 
> > > > > As I mentioned, the issue can be triggered 100% by 'modprobe -r
> > > > > kset-example' when CONFIG_DEBUG_KOBJECT_RELEASE is enabled if the
> > > > > 1st patch is applied.
> > > > 
> > > > Is there any "real" kernel code that this causes problems on?
> > > > 
> > > > Again, this is for debugging, yes, this tiny example will crash that
> > > > way, but that is fine, as we can obviously see that the kernel code here
> > > > is correct.
> > > > 
> > > > And if you really want to ensure that it works properly, let's wait on
> > > > release before allowing that module to be unloaded.
> > > 
> > > This is exactly what this patch is trying to achieve. IMHO,
> > > we should do it another way, see below.
> > > 
> > > 
> > > > But again, module unload is NOT a normal operation and is not what
> > > > this debugging option was created to help out with.
> > > 
> > > But people do unload module and especially when testing kernel.
> > > IMHO, we want both CONFIG_DEBUG_KOBJECT_RELEASE and module unload
> > > enabled when testing kernel.
> > > 
> > > 
> > > > Again, the confusion between kobjects (which protect data) and module
> > > > references (which protect code) is getting mixed up here.
> > > 
> > > This is perfect description of the problem. And the problem is real.
> > > 
> > > kobjects protect data but they need to call code (callbacks) when
> > > they are released. module unload is special because it removes the
> > > code. CONFIG_DEBUG_KOBJECT_RELEASE always delays the code calls.
> > > It results into a crash even when everything works as expected.
> > > 
> > > 
> > > Now, back to the proposed patch. I agree that it looks weird. It
> > > makes CONFIG_DEBUG_KOBJECT_RELEASE useless in this scenario.
> > 
> > Can you explain how this patch makes CONFIG_DEBUG_KOBJECT_RELEASE useless?
> > The kobject is still cleaned up with random delay.
> 
> To make it clear. The patch still allows to detect many problems,
> like use-after-free.
> 
> I wrote "in this scenario". By "this scenario" I mean that a module
> might be removed while some kobjects might still use its code.
> 
> IMHO, the confusion is because CONFIG_DEBUG_KOBJECT_RELEASE does
> really bad job in this scenario. It is supposed to help to catch
> these problems. But it actively creates the problem even when
> the code is correct.
> 
> Your patch is trying to remove the false positives. My concern is
> that it reduces the chance to see real problems.
> 
> 
> IMHO, there is a design problem:
> 
>    + CONFIG_DEBUG_KOBJECT_RELEASE delays the release because nobody
>      knows which kobject_put() is the last one.
> 
>    + kobject release callback need to call a code from the module. But
>      module_exit() does not wait because it is not aware of the still
>      referenced callbacks.

Yeah, I think it is one fundamental issue. There is neither kobject_put_final()
interface, nor reference counter returned from kobject_put(). Any kobject_put()
called before module_exit() might trigger use after free if this kobject or any
its ancestor's release handler is called after module_exit() returns.

Recently Changhui reports such problem on scsi_debug, turns out the scsi
host's parent device release handler can be called after the scsi driver's
module is unloaded.

sdebug_add_host_helper():

	sdbg_host->dev.release = &sdebug_release_adapter;

> 
> 
> By other words, all the crashes caused by CONFIG_DEBUG_KOBJECT_RELEASE and
> module removal are yelling at developers that there is this design problem.
> 
> IMHO, the right solution is to fix this design problem instead of
> calming down CONFIG_DEBUG_KOBJECT_RELEASE.

Now I agree.

> 
> 
> > > I have another idea. What about adding a pointer to
> > > struct module *mod into struct kobj_type. Some reference
> > > counter and wait_queue into struct module. They might be
> > > used to block the module_exit() until the reference counter
> > > reaches zero.
> > > 
> > > I mean something like:
> > > 
> > > Let's take samples/kobject/kset-sample.c as an example.
> > > We could define:
> > > 
> > > static struct kobj_type foo_ktype = {
> > > 	.sysfs_ops = &foo_sysfs_ops,
> > > 	.release = foo_release,
> > > 	.default_groups = foo_default_groups,
> > > 	.mod = THIS_MODULE,
> > > };
> > > 
> > > then we might do:
> > > 
> > > static int kobject_add_internal(struct kobject *kobj)
> > > {
> > > [...]
> > > 	if (kobject->ktype->mod)
> > > 		module_get_kobject_referece(kobject->ktype->mod);
> > > [...]
> > > }
> > > 
> > > and
> > > 
> > > static void kobject_cleanup(struct kobject *kobj)
> > > {
> > > [...]
> > > 	if (kobject->ktype->mod)
> > > 		module_put_kobject_referece(kobject->ktype->mod);
> > > [...]
> > > }
> > > 
> > > where
> > > 
> > > void module_get_kobject_referece(struct module *mod)
> > > {
> > > 	mutex_lock(&module_mutex);
> > > 	mod->kobject_ref++;
> > > 	mutex_lock(&module_mutex);
> > > }
> > > 
> > > void module_put_kobject_referece(struct module *mod)
> > > {
> > > 	struct wait_queue_head *module_kobject_wq;
> > > 
> > > 	mutex_lock(&module_mutex);
> > > 	mod->kobject_ref--;
> > > 	if (!mod->kobject_ref)
> > > 		wake_up(mod->kobj_release_wq);
> > > 	mutex_lock(&module_mutex);
> > 
> > The question is why kobject is so special for taking one extra
> > module ref here.
> > 
> > > }
> > > 
> > > 
> > > and
> > > 
> > > SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
> > > 		unsigned int, flags)
> > > {
> > > [...]
> > > 	wait_event_interruptible(mod->kobj_release_wq, !mod->kobj_ref);
> > > [...]
> > > }
> > > 
> > > There might be many details to be solved.
> > > 
> > > But it looks like a win-win solution. It should make module unload
> > > much more secure. Broken modules will just get blocked in
> > > module_cleanup forever. CONFIG_DEBUG_KOBJECT_RELEASE will still
> > > work as designed.
> > 
> > The above approach might work, but it needs every driver with kobjects
> > to be changed, so it is more complicated.
> 
> As explained above. There is a design problem. modules do not wait
> for kobject release. kobject release is not synchronous by design.

It could be one generic referencing counting vs. module unload.

Either the reference counter data or the associated released handler
can't be referred after the module is unloaded.

> 
> Motivation for the more complex solution:
> 
> Bad reference counting or dependencies of kobjects might cause:
> 
>     + memory leak (not good but not fatal)
> 
>     + data use after free (serious but there is still chance to
>       survive)
> 
>     + code use after module removal (always fatal)
> 
> From this POV, early module removal is more fatal than other possible
> problems. It would deserve some effort.
> 
> 
> Regarding the complexity. If the approach with kobject_type worked
> then we would need to add ".mod = THIS_MODULE" into all/most statically
> defined kobject_type structures:
> 
> git grep "static struct kobj_type" | wc -l
> 156

You didn't count device release handler, which should have more use
cases. And almost every 'struct device' and its container is allocated
dynamically actually, but code is linked statically.


Thanks, 
Ming


      reply	other threads:[~2022-07-07  4:54 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-29  3:45 [PATCH V2 0/2] kobject: avoid to cleanup kobject after module is unloaded Ming Lei
2021-11-29  3:45 ` [PATCH V2 1/2] kobject: don't delay to cleanup module kobject Ming Lei
2021-12-03 15:08   ` Greg Kroah-Hartman
2021-12-07 13:58   ` kernel test robot
2021-11-29  3:45 ` [PATCH V2 2/2] kobject: wait until kobject is cleaned up before freeing module Ming Lei
2021-12-03 15:07   ` Greg Kroah-Hartman
2021-12-06  2:13     ` Ming Lei
2021-12-06  8:04       ` Greg Kroah-Hartman
2021-12-07  1:50         ` Ming Lei
2021-12-07 10:32         ` Petr Mladek
2021-12-07 12:51           ` Ming Lei
2021-12-07 14:02             ` Petr Mladek
2022-07-07  4:54               ` Ming Lei [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YsZm7lSXYAHT14ui@T590 \
    --to=ming.lei@redhat.com \
    --cc=czhong@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=pmladek@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).