From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Zhangfei Gao <zhangfei.gao@linaro.org>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>,
	Yang Shen <shenyang39@huawei.com>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Arnd Bergmann <arnd@arndb.de>,
	linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org,
	linux-crypto@vger.kernel.org,
	linux-accelerators@lists.ozlabs.org
Subject: Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove
Date: Tue, 21 Jun 2022 09:44:11 +0200
Message-ID: <YrF2yypHZfiNVRBh@kroah.com>
In-Reply-To: <b5011dd2-e8ec-a307-1b43-5aff6cbb6891@linaro.org>

On Tue, Jun 21, 2022 at 03:37:31PM +0800, Zhangfei Gao wrote:
> 
> 
> On 2022/6/20 9:36 PM, Greg Kroah-Hartman wrote:
> > On Mon, Jun 20, 2022 at 02:24:31PM +0100, Jean-Philippe Brucker wrote:
> > > On Fri, Jun 17, 2022 at 02:05:21PM +0800, Zhangfei Gao wrote:
> > > > > The refcount only ensures that the uacce_device object is not freed as
> > > > > long as there are open fds. But uacce_remove() can run while there are
> > > > > > open fds, or fds in the process of being opened. And after uacce_remove()
> > > > > runs, the uacce_device object still exists but is mostly unusable. For
> > > > > example once the module is freed, uacce->ops is not valid anymore. But
> > > > > currently uacce_fops_open() may dereference the ops in this case:
> > > > > 
> > > > > 	uacce_fops_open()
> > > > > 	 if (!uacce->parent->driver)
> > > > > 	 /* Still valid, keep going */		
> > > > > 	 ...					rmmod
> > > > > 						 uacce_remove()
> > > > > 	 ...					 free_module()
> > > > > 	 uacce->ops->get_queue() /* BUG */
> > > > > uacce_remove should wait for uacce->queues_lock until fops_open releases
> > > > > the lock.
> > > > > If open happens just after uacce_remove unlocks, uacce_bind_queue in open
> > > > > should fail.
> > > Ah yes sorry, I lost sight of what this patch was adding. But we could
> > > have the same issue with the patch, just in a different order, no?
> > > 
> > > 	uacce_fops_open()
> > > 	 uacce = xa_load()
> > > 	 ...					rmmod
> > Um, how is rmmod called if the file descriptor is open?
> > 
> > That should not be possible if the owner of the file descriptor is
> > properly set.  Please fix that up.
> Thanks Greg
> 
> Setting the cdev owner, or using module_get/put, can block rmmod once fops_open has run.
> -       uacce->cdev->owner = THIS_MODULE;
> +       uacce->cdev->owner = uacce->parent->driver->owner;
> 
> However, I still have not found a good way to block removal of the parent PCI device.
> 
> $ echo 1 > /sys/bus/pci/devices/0000:00:02.0/remove &
> 
> [   32.563350]  uacce_remove+0x6c/0x148
> [   32.563353]  hisi_qm_uninit+0x12c/0x178
> [   32.563356]  hisi_zip_remove+0xa0/0xd0 [hisi_zip]
> [   32.563361]  pci_device_remove+0x44/0xd8
> [   32.563364]  device_remove+0x54/0x88
> [   32.563367]  device_release_driver_internal+0xec/0x1a0
> [   32.563370]  device_release_driver+0x20/0x30
> [   32.563372]  pci_stop_bus_device+0x8c/0xe0
> [   32.563375]  pci_stop_and_remove_bus_device_locked+0x28/0x60
> [   32.563378]  remove_store+0x9c/0xb0
> [   32.563379]  dev_attr_store+0x20/0x38

Removing the parent PCI device does not remove the module code; it
removes the device itself.  Don't confuse code vs. data here.
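
To make the code vs. data split concrete, here is a minimal userspace
sketch in C. It is not the uacce code: every name in it (model_dev,
model_open, model_remove, model_rmmod and so on) is hypothetical, and the
module reference is modeled as a plain counter taken inside open, which is
a simplification of how the kernel pins a module through the owner fields.
It only illustrates that device removal invalidates data and must be
handled by open, while an open fd is what keeps the code loaded.

```c
/* Userspace model only; all names are hypothetical, not the uacce API. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

struct model_ops {
	int (*get_queue)(void);
};

struct model_dev {
	pthread_mutex_t lock;		/* serializes open against remove */
	const struct model_ops *ops;	/* "data": cleared on device removal */
	int module_refs;		/* "code": open fds pin the module */
};

static int model_get_queue(void)
{
	return 0;
}

static const struct model_ops model_default_ops = {
	.get_queue = model_get_queue,
};

/* open: fail cleanly if the device is gone, otherwise pin the module. */
static int model_open(struct model_dev *d)
{
	int ret;

	pthread_mutex_lock(&d->lock);
	if (!d->ops) {
		ret = -ENODEV;	/* device removed; the code may still be loaded */
	} else {
		d->module_refs++;
		ret = d->ops->get_queue();
	}
	pthread_mutex_unlock(&d->lock);
	return ret;
}

/* release: drop the reference taken by a successful open. */
static void model_release(struct model_dev *d)
{
	pthread_mutex_lock(&d->lock);
	assert(d->module_refs > 0);
	d->module_refs--;
	pthread_mutex_unlock(&d->lock);
}

/* Device removal: always allowed, only invalidates the data. */
static void model_remove(struct model_dev *d)
{
	pthread_mutex_lock(&d->lock);
	d->ops = NULL;
	pthread_mutex_unlock(&d->lock);
}

/* Module unload: refused while any open fd still holds a reference. */
static int model_rmmod(struct model_dev *d)
{
	int ret;

	pthread_mutex_lock(&d->lock);
	ret = d->module_refs ? -EBUSY : 0;
	pthread_mutex_unlock(&d->lock);
	return ret;
}
```

In this model, model_rmmod() returns -EBUSY while an fd is open, whereas
model_remove() succeeds at any time and later calls to model_open() return
-ENODEV instead of dereferencing stale ops.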

thanks,

greg k-h