All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lukas Wunner <lukas@wunner.de>
To: Ian May <ian.may@canonical.com>
Cc: linux-pci@vger.kernel.org, Keith Busch <keith.busch@wdc.com>
Subject: Re: [PATCH 1/2] PCIe hotplug interrupt and AER deadlock with reset_lock and device_lock
Date: Sat, 1 Aug 2020 18:14:05 +0200	[thread overview]
Message-ID: <20200801161405.bww7kvcfj5lrcso4@wunner.de> (raw)
In-Reply-To: <CAE1ug=a8PJpKh0Jx2ZZxo5kwQvvK883xs+mzyMWbW8T1oqyKDg@mail.gmail.com>

[cc += Keith]

On Fri, Jul 17, 2020 at 09:02:22AM -0500, Ian May wrote:
> I do now have a "better" patch that I was going to submit to the list
> where I converted the pci_slot_mutex to a rw_semaphore.  Do you see
> any potential problems with changing the lock type?  I attached the
> patch if you are interested in checking it over.

The question is, if pci_slot_mutex is an rw_semaphore, can it happen
that pciehp acquires it for writing, provoking a deadlock like this:

        Hotplug                                AER
	----------------------------       ---------------------------
      1 down_read(&ctrl->reset_lock)
	                                 2 down_read(&pci_slot_mutex)
      3 down_write(&pci_slot_mutex)
                                         4 down_write(&ctrl->reset_lock)
	** DEADLOCK **

I think this can happen if the device inserted into the hotplug slot
contains a PCIe switch which itself has hotplug ports.  That's the
case with Thunderbolt:  Every Thunderbolt device contains a PCIe
switch with hotplug ports to extend the Thunderbolt chain.  E.g.
the PCIe hierarchy looks like this for a Thunderbolt host controller
with a chain of two devices:

Root - Upstream - Downstream - Upstream - Downstream - Upstream - Downstream

(host ...)                     (1st device ...)        (2nd device ...)

When a Thunderbolt device is attached, pci_slot_mutex would be taken
for writing in pci_create_slot():

pciehp_configure_device()
  pci_scan_slot()
    pci_scan_single_device()
      pci_scan_device()
            pci_setup_device()
                pci_dev_assign_slot() # acquire pci_slot_mutex for reading
        pci_device_add() # match_driver = false; device_add()
    pci_bus_add_devices()
      pci_bus_add_device() # match_driver = true;  device_attach()
        device_attach()
          __device_attach()
            __device_attach_driver()
              driver_probe_device()
                pcie_portdrv_probe()
                  pcie_port_device_register()
                    pcie_device_init()
                      device_register()
                        device_add()
                          bus_probe_device()
                            device_initial_probe()
                              __device_attach()
                                __device_attach_driver()
                                  driver_probe_device()
                                    pciehp_probe()
                                      init_slot()
				        pci_hp_initialize()
					  pci_create_slot()
					    down_write(pci_slot_mutex)

(You may want to double-check that I got this right.)

In principle, Keith did the right thing to acquire pci_slot_mutex in
pci_bus_error_reset() for accessing the bus->slots list.

I need to think some more to come up with a solution for this particular
deadlock.  Maybe using a klist and traversing it with klist_iter_init()
(holds a ref on each slot, allowing concurrent list access) or something
along those lines...

Thanks,

Lukas

  parent reply	other threads:[~2020-08-01 16:14 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-15 14:32 [PATCH 0/2] Hotplug interrupt and AER recovery deadlocks Ian May
2020-06-15 14:32 ` [PATCH 1/2] PCIe hotplug interrupt and AER deadlock with reset_lock and device_lock Ian May
2020-06-15 18:56   ` Lukas Wunner
2020-06-16 17:13     ` Ian May
2020-07-17  5:20       ` Lukas Wunner
     [not found]         ` <CAE1ug=a8PJpKh0Jx2ZZxo5kwQvvK883xs+mzyMWbW8T1oqyKDg@mail.gmail.com>
2020-08-01 16:14           ` Lukas Wunner [this message]
2020-07-14 23:58     ` Bjorn Helgaas
2020-06-15 14:32 ` [PATCH 2/2] PCIe hotplug interrupt and AER pci_slot_mutex deadlock Ian May

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200801161405.bww7kvcfj5lrcso4@wunner.de \
    --to=lukas@wunner.de \
    --cc=ian.may@canonical.com \
    --cc=keith.busch@wdc.com \
    --cc=linux-pci@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.