From: Bjorn Helgaas <helgaas@kernel.org>
To: Keith Busch <keith.busch@intel.com>
Cc: Linux PCI <linux-pci@vger.kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Alex_Gagniuc@Dellteam.com, Scott Bauer <scott.bauer@intel.com>
Subject: Re: [PATCH 4/4] PCI/AER: Lock pci topology when scanning errors
Date: Tue, 5 Jun 2018 17:09:11 -0500
Message-ID: <20180605220911.GB226399@bhelgaas-glaptop.roam.corp.google.com>
In-Reply-To: <20180409220444.6632-5-keith.busch@intel.com>

On Mon, Apr 09, 2018 at 04:04:44PM -0600, Keith Busch wrote:
> The side effects of surprise removal may trigger AER handling. The AER
> handling walks the pci topology and may access a pci_dev that is being
> freed by the hotplug handler.
> 
> This patch fixes that use-after-free by locking the PCI topology in
> the AER handler so it isn't racing with the pciehp removal.
> 
> Since the AER handler now runs under a global PCI lock, the rpc specific
> mutex is no longer necessary.
> 
> Reported-by: Alex Gagniuc <Alex_Gagniuc@Dellteam.com>
> Signed-off-by: Keith Busch <keith.busch@intel.com>
> ---
>  drivers/pci/pcie/aer/aerdrv.c      | 1 -
>  drivers/pci/pcie/aer/aerdrv.h      | 5 -----
>  drivers/pci/pcie/aer/aerdrv_core.c | 4 ++--
>  3 files changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/pci/pcie/aer/aerdrv.c b/drivers/pci/pcie/aer/aerdrv.c
> index 0b2eb88c422b..b88e5e2f3700 100644
> --- a/drivers/pci/pcie/aer/aerdrv.c
> +++ b/drivers/pci/pcie/aer/aerdrv.c
> @@ -237,7 +237,6 @@ static struct aer_rpc *aer_alloc_rpc(struct pcie_device *dev)
>  	rpc->rpd = pci_dev_get(dev->port);
>  	kref_init(&rpc->ref);
>  	INIT_WORK(&rpc->dpc_handler, aer_isr);
> -	mutex_init(&rpc->rpc_mutex);
>  
>  	/* Use PCIe bus function to store rpc into PCIe device */
>  	set_service_data(dev, rpc);
> diff --git a/drivers/pci/pcie/aer/aerdrv.h b/drivers/pci/pcie/aer/aerdrv.h
> index f886521e2c7b..b90fc5d4cda2 100644
> --- a/drivers/pci/pcie/aer/aerdrv.h
> +++ b/drivers/pci/pcie/aer/aerdrv.h
> @@ -70,11 +70,6 @@ struct aer_rpc {
>  					 * Lock access to Error Status/ID Regs
>  					 * and error producer/consumer index
>  					 */
> -	struct mutex rpc_mutex;		/*
> -					 * only one thread could do
> -					 * recovery on the same
> -					 * root port hierarchy
> -					 */
>  };
>  
>  struct aer_broadcast_data {
> diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
> index e4059d7fa7fa..de210b7439eb 100644
> --- a/drivers/pci/pcie/aer/aerdrv_core.c
> +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> @@ -796,10 +796,10 @@ void aer_isr(struct work_struct *work)
>  	struct aer_rpc *rpc = container_of(work, struct aer_rpc, dpc_handler);
>  	struct aer_err_source uninitialized_var(e_src);
>  
> -	mutex_lock(&rpc->rpc_mutex);
> +	pci_lock_rescan_remove();
>  	while (get_e_source(rpc, &e_src))
>  		aer_isr_one_error(rpc, &e_src);
> -	mutex_unlock(&rpc->rpc_mutex);
> +	pci_unlock_rescan_remove();

I think this needs to be updated after Oza's patches, doesn't it?

It looks like this would deadlock if I applied it to my current "next"
branch as-is:

  aer_isr
    pci_lock_rescan_remove
    aer_isr_one_error
      aer_process_err_devices
        handle_error_source
          pcie_do_fatal_recovery
            pci_lock_rescan_remove      <-- deadlock
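
To illustrate the call chain above: pci_lock_rescan_remove() takes a single
global mutex, and kernel mutexes are not recursive, so a second acquisition
on the same call path never returns. The following is only a user-space
sketch of that pattern, not kernel code. It assumes a POSIX error-checking
mutex as a stand-in for the rescan/remove lock, so the relock is reported as
EDEADLK instead of hanging. The names rescan_remove_lock,
rescan_remove_lock_init, model_aer_isr and model_fatal_recovery are
hypothetical and exist only for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical user-space stand-in for the global rescan/remove mutex. */
static pthread_mutex_t rescan_remove_lock;

/* Use an error-checking mutex so a relock by the owner returns EDEADLK. */
static void rescan_remove_lock_init(void)
{
	pthread_mutexattr_t attr;
	int rc;

	rc = pthread_mutexattr_init(&attr);
	assert(rc == 0);
	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(rc == 0);
	rc = pthread_mutex_init(&rescan_remove_lock, &attr);
	assert(rc == 0);
	pthread_mutexattr_destroy(&attr);
	(void)rc;
}

/* Models the inner function in the chain, which takes the lock itself. */
static int model_fatal_recovery(void)
{
	int rc = pthread_mutex_lock(&rescan_remove_lock);

	if (rc)
		return rc;	/* EDEADLK if this thread already owns the lock */

	/* ... recovery work would happen here ... */
	pthread_mutex_unlock(&rescan_remove_lock);
	return 0;
}

/* Models the outer function once it also takes the same lock. */
static int model_aer_isr(void)
{
	int rc = pthread_mutex_lock(&rescan_remove_lock);

	if (rc)
		return rc;

	rc = model_fatal_recovery();	/* second acquisition, same thread */
	pthread_mutex_unlock(&rescan_remove_lock);
	return rc;
}
```

After calling rescan_remove_lock_init(), model_fatal_recovery() on its own
returns 0, while model_aer_isr() returns EDEADLK because the inner lock
attempt is made by the thread that already owns the mutex. With a
non-error-checking lock, as in the kernel, the inner call would block
forever instead of returning an error.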

>       aer_release(rpc);
>  }
> -- 
> 2.14.3
> 
