All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] pci: avoid dead lock between device reset and sriov disable
@ 2022-04-04  6:25 Jay Zhou
  2022-05-10  2:03 ` Yicong Yang
  2022-05-11 22:22 ` Bjorn Helgaas
  0 siblings, 2 replies; 3+ messages in thread
From: Jay Zhou @ 2022-04-04  6:25 UTC (permalink / raw)
  To: linux-pci, linux-kernel, bhelgaas, alex.williamson
  Cc: weidong.huang, jianjay.zhou

Call trace of PF SRIOV disable:
sriov_numvfs_store
  device_lock <----------------- (1) get the device lock
    ->sriov_configure # e.g. vfio_pci_sriov_configure
      sriov_disable
        pci_cfg_access_lock <--- (4) wait dev->block_cfg_access to be 0

Call trace of PF reset:
reset_store
  pci_reset_function
    pci_dev_lock
      pci_cfg_access_lock <----- (2) set dev->block_cfg_access = 1
      device_lock <------------- (3) want to get the device lock

These two oprations would wait for each other forever if the
code execution sequence is (1)(2)(3)(4).

Let's get the device lock and then the config access lock in
pci_dev_lock().

Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
---
 drivers/pci/pci.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 9ecce435fb3f..61a6db1d21f6 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5103,19 +5103,19 @@ static int pci_reset_bus_function(struct pci_dev *dev, bool probe)
 
 void pci_dev_lock(struct pci_dev *dev)
 {
-	pci_cfg_access_lock(dev);
 	/* block PM suspend, driver probe, etc. */
 	device_lock(&dev->dev);
+	pci_cfg_access_lock(dev);
 }
 EXPORT_SYMBOL_GPL(pci_dev_lock);
 
 /* Return 1 on successful lock, 0 on contention */
 int pci_dev_trylock(struct pci_dev *dev)
 {
-	if (pci_cfg_access_trylock(dev)) {
-		if (device_trylock(&dev->dev))
+	if (device_trylock(&dev->dev)) {
+		if (pci_cfg_access_trylock(dev))
 			return 1;
-		pci_cfg_access_unlock(dev);
+		device_unlock(&dev->dev);
 	}
 
 	return 0;
@@ -5124,8 +5124,8 @@ EXPORT_SYMBOL_GPL(pci_dev_trylock);
 
 void pci_dev_unlock(struct pci_dev *dev)
 {
-	device_unlock(&dev->dev);
 	pci_cfg_access_unlock(dev);
+	device_unlock(&dev->dev);
 }
 EXPORT_SYMBOL_GPL(pci_dev_unlock);
 
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] pci: avoid dead lock between device reset and sriov disable
  2022-04-04  6:25 [PATCH] pci: avoid dead lock between device reset and sriov disable Jay Zhou
@ 2022-05-10  2:03 ` Yicong Yang
  2022-05-11 22:22 ` Bjorn Helgaas
  1 sibling, 0 replies; 3+ messages in thread
From: Yicong Yang @ 2022-05-10  2:03 UTC (permalink / raw)
  To: Jay Zhou, linux-pci, linux-kernel, bhelgaas, alex.williamson
  Cc: yangyicong, weidong.huang

On 2022/4/4 14:25, Jay Zhou wrote:
> Call trace of PF SRIOV disable:
> sriov_numvfs_store
>   device_lock <----------------- (1) get the device lock
>     ->sriov_configure # e.g. vfio_pci_sriov_configure
>       sriov_disable
>         pci_cfg_access_lock <--- (4) wait dev->block_cfg_access to be 0
> 
> Call trace of PF reset:
> reset_store
>   pci_reset_function
>     pci_dev_lock
>       pci_cfg_access_lock <----- (2) set dev->block_cfg_access = 1
>       device_lock <------------- (3) want to get the device lock
> 
> These two oprations would wait for each other forever if the
> code execution sequence is (1)(2)(3)(4).
> 
> Let's get the device lock and then the config access lock in
> pci_dev_lock().
> 
> Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>

The patch looks good to me,

Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>

I met the same problem and tried to fix it in the same way. It's worth to be merged if somebody meets
the same problem again.
https://lore.kernel.org/linux-pci/1583489997-17156-1-git-send-email-yangyicong@hisilicon.com/

> ---
>  drivers/pci/pci.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 9ecce435fb3f..61a6db1d21f6 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5103,19 +5103,19 @@ static int pci_reset_bus_function(struct pci_dev *dev, bool probe)
>  
>  void pci_dev_lock(struct pci_dev *dev)
>  {
> -	pci_cfg_access_lock(dev);
>  	/* block PM suspend, driver probe, etc. */
>  	device_lock(&dev->dev);
> +	pci_cfg_access_lock(dev);
>  }
>  EXPORT_SYMBOL_GPL(pci_dev_lock);
>  
>  /* Return 1 on successful lock, 0 on contention */
>  int pci_dev_trylock(struct pci_dev *dev)
>  {
> -	if (pci_cfg_access_trylock(dev)) {
> -		if (device_trylock(&dev->dev))
> +	if (device_trylock(&dev->dev)) {
> +		if (pci_cfg_access_trylock(dev))
>  			return 1;
> -		pci_cfg_access_unlock(dev);
> +		device_unlock(&dev->dev);
>  	}
>  
>  	return 0;
> @@ -5124,8 +5124,8 @@ EXPORT_SYMBOL_GPL(pci_dev_trylock);
>  
>  void pci_dev_unlock(struct pci_dev *dev)
>  {
> -	device_unlock(&dev->dev);
>  	pci_cfg_access_unlock(dev);
> +	device_unlock(&dev->dev);
>  }
>  EXPORT_SYMBOL_GPL(pci_dev_unlock);
>  
> 

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] pci: avoid dead lock between device reset and sriov disable
  2022-04-04  6:25 [PATCH] pci: avoid dead lock between device reset and sriov disable Jay Zhou
  2022-05-10  2:03 ` Yicong Yang
@ 2022-05-11 22:22 ` Bjorn Helgaas
  1 sibling, 0 replies; 3+ messages in thread
From: Bjorn Helgaas @ 2022-05-11 22:22 UTC (permalink / raw)
  To: Jay Zhou
  Cc: linux-pci, linux-kernel, bhelgaas, alex.williamson,
	weidong.huang, Donald Dutile, Jason Gunthorpe, Leon Romanovsky

[+cc Don, Jason, Leon, just FYI]

On Mon, Apr 04, 2022 at 02:25:39PM +0800, Jay Zhou wrote:
> Call trace of PF SRIOV disable:
> sriov_numvfs_store
>   device_lock <----------------- (1) get the device lock
>     ->sriov_configure # e.g. vfio_pci_sriov_configure
>       sriov_disable
>         pci_cfg_access_lock <--- (4) wait dev->block_cfg_access to be 0
> 
> Call trace of PF reset:
> reset_store
>   pci_reset_function
>     pci_dev_lock
>       pci_cfg_access_lock <----- (2) set dev->block_cfg_access = 1
>       device_lock <------------- (3) want to get the device lock
> 
> These two oprations would wait for each other forever if the
> code execution sequence is (1)(2)(3)(4).
> 
> Let's get the device lock and then the config access lock in
> pci_dev_lock().
> 
> Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>

Thanks, I (finally) applied Yicong's identical patch from a year ago
to pci/virtualization for v5.19, with a commit log combined and
adapted from both:

    PCI: Avoid pci_dev_lock() AB/BA deadlock with sriov_numvfs_store()
    
    The sysfs sriov_numvfs_store() path acquires the device lock before the
    config space access lock:
    
      sriov_numvfs_store
        device_lock                 # A (1) acquire device lock
        sriov_configure
          vfio_pci_sriov_configure  # (for example)
            vfio_pci_core_sriov_configure
              pci_disable_sriov
                sriov_disable
                  pci_cfg_access_lock
                    pci_wait_cfg    # B (4) wait for dev->block_cfg_access == 0
    
    Previously, pci_dev_lock() acquired the config space access lock before the
    device lock:
    
      pci_dev_lock
        pci_cfg_access_lock
          dev->block_cfg_access = 1 # B (2) set dev->block_cfg_access = 1
        device_lock                 # A (3) wait for device lock
    
    Any path that uses pci_dev_lock(), e.g., pci_reset_function(), may
    deadlock with sriov_numvfs_store() if the operations occur in the sequence
    (1) (2) (3) (4).
    
    Avoid the deadlock by reversing the order in pci_dev_lock() so it acquires
    the device lock before the config space access lock, the same as the
    sriov_numvfs_store() path.
    
    [bhelgaas: combined and adapted commit log from Jay Zhou's independent
    subsequent posting:
    https://lore.kernel.org/r/20220404062539.1710-1-jianjay.zhou@huawei.com]
    Link: https://lore.kernel.org/linux-pci/1583489997-17156-1-git-send-email-yangyicong@hisilicon.com/
    Also-posted-by: Jay Zhou <jianjay.zhou@huawei.com>
    Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

> ---
>  drivers/pci/pci.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 9ecce435fb3f..61a6db1d21f6 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5103,19 +5103,19 @@ static int pci_reset_bus_function(struct pci_dev *dev, bool probe)
>  
>  void pci_dev_lock(struct pci_dev *dev)
>  {
> -	pci_cfg_access_lock(dev);
>  	/* block PM suspend, driver probe, etc. */
>  	device_lock(&dev->dev);
> +	pci_cfg_access_lock(dev);
>  }
>  EXPORT_SYMBOL_GPL(pci_dev_lock);
>  
>  /* Return 1 on successful lock, 0 on contention */
>  int pci_dev_trylock(struct pci_dev *dev)
>  {
> -	if (pci_cfg_access_trylock(dev)) {
> -		if (device_trylock(&dev->dev))
> +	if (device_trylock(&dev->dev)) {
> +		if (pci_cfg_access_trylock(dev))
>  			return 1;
> -		pci_cfg_access_unlock(dev);
> +		device_unlock(&dev->dev);
>  	}
>  
>  	return 0;
> @@ -5124,8 +5124,8 @@ EXPORT_SYMBOL_GPL(pci_dev_trylock);
>  
>  void pci_dev_unlock(struct pci_dev *dev)
>  {
> -	device_unlock(&dev->dev);
>  	pci_cfg_access_unlock(dev);
> +	device_unlock(&dev->dev);
>  }
>  EXPORT_SYMBOL_GPL(pci_dev_unlock);
>  
> -- 
> 2.27.0
> 

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-05-11 22:22 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-04  6:25 [PATCH] pci: avoid dead lock between device reset and sriov disable Jay Zhou
2022-05-10  2:03 ` Yicong Yang
2022-05-11 22:22 ` Bjorn Helgaas

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.