Linux-PCI Archive on lore.kernel.org
 help / color / Atom feed
From: Lukas Wunner <lukas@wunner.de>
To: Ian May <ian.may@canonical.com>
Cc: linux-pci@vger.kernel.org, Keith Busch <keith.busch@wdc.com>
Subject: Re: [PATCH 1/2] PCIe hotplug interrupt and AER deadlock with reset_lock and device_lock
Date: Sat, 1 Aug 2020 18:14:05 +0200
Message-ID: <20200801161405.bww7kvcfj5lrcso4@wunner.de> (raw)
In-Reply-To: <CAE1ug=a8PJpKh0Jx2ZZxo5kwQvvK883xs+mzyMWbW8T1oqyKDg@mail.gmail.com>

[cc += Keith]

On Fri, Jul 17, 2020 at 09:02:22AM -0500, Ian May wrote:
> I do now have a "better" patch that I was going to submit to the list
> where I converted the pci_slot_mutex to a rw_semaphore.  Do you see
> any potential problems with changing the lock type?  I attached the
> patch if you are interested in checking it over.

The question is, if pci_slot_mutex is an rw_semaphore, can it happen
that pciehp acquires it for writing, provoking a deadlock like this:

        Hotplug                                AER
	----------------------------       ---------------------------
      1 down_read(&ctrl->reset_lock)
	                                 2 down_read(&pci_slot_mutex)
      3 down_write(&pci_slot_mutex)
                                         4 down_write(&ctrl->reset_lock)
	** DEADLOCK **

I think this can happen if the device inserted into the hotplug slot
contains a PCIe switch which itself has hotplug ports.  That's the
case with Thunderbolt:  Every Thunderbolt device contains a PCIe
switch with hotplug ports to extend the Thunderbolt chain.  E.g.
the PCIe hierarchy looks like this for a Thunderbolt host controller
with a chain of two devices:

Root - Upstream - Downstream - Upstream - Downstream - Upstream - Downstream

(host ...)                     (1st device ...)        (2nd device ...)

When a Thunderbolt device is attached, pci_slot_mutex would be taken
for writing in pci_create_slot():

pciehp_configure_device()
  pci_scan_slot()
    pci_scan_single_device()
      pci_scan_device()
            pci_setup_device()
                pci_dev_assign_slot() # acquire pci_slot_mutex for reading
        pci_device_add() # match_driver = false; device_add()
    pci_bus_add_devices()
      pci_bus_add_device() # match_driver = true;  device_attach()
        device_attach()
          __device_attach()
            __device_attach_driver()
              driver_probe_device()
                pcie_portdrv_probe()
                  pcie_port_device_register()
                    pcie_device_init()
                      device_register()
                        device_add()
                          bus_probe_device()
                            device_initial_probe()
                              __device_attach()
                                __device_attach_driver()
                                  driver_probe_device()
                                    pciehp_probe()
                                      init_slot()
				        pci_hp_initialize()
					  pci_create_slot()
					    down_write(pci_slot_mutex)

(You may want to double-check that I got this right.)

In principle, Keith did the right thing to acquire pci_slot_mutex in
pci_bus_error_reset() for accessing the bus->slots list.

I need to think some more to come up with a solution for this particular
deadlock.  Maybe using a klist and traversing it with klist_iter_init()
(holds a ref on each slot, allowing concurrent list access) or something
along those lines...

Thanks,

Lukas

  parent reply index

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-15 14:32 [PATCH 0/2] Hotplug interrupt and AER recovery deadlocks Ian May
2020-06-15 14:32 ` [PATCH 1/2] PCIe hotplug interrupt and AER deadlock with reset_lock and device_lock Ian May
2020-06-15 18:56   ` Lukas Wunner
2020-06-16 17:13     ` Ian May
2020-07-17  5:20       ` Lukas Wunner
     [not found]         ` <CAE1ug=a8PJpKh0Jx2ZZxo5kwQvvK883xs+mzyMWbW8T1oqyKDg@mail.gmail.com>
2020-08-01 16:14           ` Lukas Wunner [this message]
2020-07-14 23:58     ` Bjorn Helgaas
2020-06-15 14:32 ` [PATCH 2/2] PCIe hotplug interrupt and AER pci_slot_mutex deadlock Ian May

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200801161405.bww7kvcfj5lrcso4@wunner.de \
    --to=lukas@wunner.de \
    --cc=ian.may@canonical.com \
    --cc=keith.busch@wdc.com \
    --cc=linux-pci@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-PCI Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-pci/0 linux-pci/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-pci linux-pci/ https://lore.kernel.org/linux-pci \
		linux-pci@vger.kernel.org
	public-inbox-index linux-pci

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-pci


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git