From: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: <linux-pci@vger.kernel.org>, <linuxppc-dev@lists.ozlabs.org>,
<linux@yadro.com>, Srinath Mannam <srinath.mannam@broadcom.com>,
Marta Rybczynska <mrybczyn@kalray.eu>,
<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH RFC v4 02/21] PCI: Fix race condition in pci_enable/disable_device()
Date: Wed, 27 Mar 2019 20:11:51 +0300 [thread overview]
Message-ID: <f0a5be4c-4a90-d5d3-f4fc-7b364143a85b@yadro.com> (raw)
In-Reply-To: <20190326190038.GL24180@google.com>
[-- Attachment #1.1: Type: text/plain, Size: 7632 bytes --]
On 3/26/19 10:00 PM, Bjorn Helgaas wrote:
> [+cc Srinath, Marta, LKML]
>
> On Mon, Mar 11, 2019 at 04:31:03PM +0300, Sergey Miroshnichenko wrote:
>> CPU0 CPU1
>>
>> pci_enable_device_mem() pci_enable_device_mem()
>> pci_enable_bridge() pci_enable_bridge()
>> pci_is_enabled()
>> return false;
>> atomic_inc_return(enable_cnt)
>> Start actual enabling the bridge
>> ... pci_is_enabled()
>> ... return true;
>> ... Start memory requests <-- FAIL
>> ...
>> Set the PCI_COMMAND_MEMORY bit <-- Must wait for this
>>
>> This patch protects the pci_enable/disable_device() and pci_enable_bridge()
>> with mutexes.
>
> This is a subtle issue that we've tried to fix before, but we've never
> had a satisfactory solution, so I hope you've figured out the right
> fix.
>
> I'll include some links to previous discussion. This patch is very
> similar to [2], which we didn't actually apply. We did apply the
> patch from [3] as 40f11adc7cd9 ("PCI: Avoid race while enabling
> upstream bridges"), but it caused the regressions reported in [4,5],
> so we reverted it with 0f50a49e3008 ("Revert "PCI: Avoid race while
> enabling upstream bridges"").
>
Thanks for the links, I wasn't aware of these discussions and patches!
On PowerNV this issue is partially hidden by db2173198b95 ("powerpc/powernv/pci: Work
around races in PCI bridge enabling"), and on x86 BIOS pre-initializes all the bridges, so
it doesn't reproduce until hotplugging in a hotplugged bridge.
This patch is indeed similar to 40f11adc7cd9 ("PCI: Avoid race while enabling upstream
bridges"), but instead of a single static mutex it adds per-device mutexes and prevents
the dev->enable_cnt from incrementing too early. So it's not needed anymore to carefully
select a moment safe enough to enable the device.
Serge
> I think the underlying design problem is that we have a driver for
> device B calling pci_enable_device(), and it is changing the state of
> device A (an upstream bridge). The model generally is that a driver
> should only touch the device it is bound to.
>
> It's tricky to get the locking right when several children of device A
> all need to operate on A.
>
> That's all to say I'll have to think carefully about this particular
> patch, so I'll go on to the others and come back to this one.
>
> Bjorn
>
> [1] https://lore.kernel.org/linux-pci/1494256190-28993-1-git-send-email-srinath.mannam@broadcom.com/T/#u
> [RFC PATCH] pci: Concurrency issue in NVMe Init through PCIe switch
>
> [2] https://lore.kernel.org/linux-pci/1496135297-19680-1-git-send-email-srinath.mannam@broadcom.com/T/#u
> [RFC PATCH v2] pci: Concurrency issue in NVMe Init through PCIe switch
>
> [3] https://lore.kernel.org/linux-pci/1501858648-22228-1-git-send-email-srinath.mannam@broadcom.com/T/#u
> [RFC PATCH v3] pci: Concurrency issue during pci enable bridge
>
> [4] https://lore.kernel.org/linux-pci/150547971091.977464.16294045866179907260.stgit@buzz/T/#u
> [PATCH bisected regression in 4.14] PCI: fix race while enabling upstream bridges concurrently
>
> [5] https://lore.kernel.org/linux-wireless/04c9b578-693c-1dc6-9f0f-904580231b21@kernel.dk/T/#u
> iwlwifi firmware load broken in current -git
>
> [6] https://lore.kernel.org/linux-pci/744877924.5841545.1521630049567.JavaMail.zimbra@kalray.eu/T/#u
> [RFC PATCH] nvme: avoid race-conditions when enabling devices
>
>> Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
>> ---
>> drivers/pci/pci.c | 26 ++++++++++++++++++++++----
>> drivers/pci/probe.c | 1 +
>> include/linux/pci.h | 1 +
>> 3 files changed, 24 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index f006068be209..895201d4c9e6 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -1615,6 +1615,8 @@ static void pci_enable_bridge(struct pci_dev *dev)
>> struct pci_dev *bridge;
>> int retval;
>>
>> + mutex_lock(&dev->enable_mutex);
>> +
>> bridge = pci_upstream_bridge(dev);
>> if (bridge)
>> pci_enable_bridge(bridge);
>> @@ -1622,6 +1624,7 @@ static void pci_enable_bridge(struct pci_dev *dev)
>> if (pci_is_enabled(dev)) {
>> if (!dev->is_busmaster)
>> pci_set_master(dev);
>> + mutex_unlock(&dev->enable_mutex);
>> return;
>> }
>>
>> @@ -1630,11 +1633,14 @@ static void pci_enable_bridge(struct pci_dev *dev)
>> pci_err(dev, "Error enabling bridge (%d), continuing\n",
>> retval);
>> pci_set_master(dev);
>> + mutex_unlock(&dev->enable_mutex);
>> }
>>
>> static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags)
>> {
>> struct pci_dev *bridge;
>> + /* Enable-locking of bridges is performed within the pci_enable_bridge() */
>> + bool need_lock = !dev->subordinate;
>> int err;
>> int i, bars = 0;
>>
>> @@ -1650,8 +1656,13 @@ static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags)
>> dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK);
>> }
>>
>> - if (atomic_inc_return(&dev->enable_cnt) > 1)
>> + if (need_lock)
>> + mutex_lock(&dev->enable_mutex);
>> + if (pci_is_enabled(dev)) {
>> + if (need_lock)
>> + mutex_unlock(&dev->enable_mutex);
>> return 0; /* already enabled */
>> + }
>>
>> bridge = pci_upstream_bridge(dev);
>> if (bridge)
>> @@ -1666,8 +1677,10 @@ static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags)
>> bars |= (1 << i);
>>
>> err = do_pci_enable_device(dev, bars);
>> - if (err < 0)
>> - atomic_dec(&dev->enable_cnt);
>> + if (err >= 0)
>> + atomic_inc(&dev->enable_cnt);
>> + if (need_lock)
>> + mutex_unlock(&dev->enable_mutex);
>> return err;
>> }
>>
>> @@ -1910,15 +1923,20 @@ void pci_disable_device(struct pci_dev *dev)
>> if (dr)
>> dr->enabled = 0;
>>
>> + mutex_lock(&dev->enable_mutex);
>> dev_WARN_ONCE(&dev->dev, atomic_read(&dev->enable_cnt) <= 0,
>> "disabling already-disabled device");
>>
>> - if (atomic_dec_return(&dev->enable_cnt) != 0)
>> + if (atomic_dec_return(&dev->enable_cnt) != 0) {
>> + mutex_unlock(&dev->enable_mutex);
>> return;
>> + }
>>
>> do_pci_disable_device(dev);
>>
>> dev->is_busmaster = 0;
>> +
>> + mutex_unlock(&dev->enable_mutex);
>> }
>> EXPORT_SYMBOL(pci_disable_device);
>>
>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
>> index 2ec0df04e0dc..977a127ce791 100644
>> --- a/drivers/pci/probe.c
>> +++ b/drivers/pci/probe.c
>> @@ -2267,6 +2267,7 @@ struct pci_dev *pci_alloc_dev(struct pci_bus *bus)
>> INIT_LIST_HEAD(&dev->bus_list);
>> dev->dev.type = &pci_dev_type;
>> dev->bus = pci_bus_get(bus);
>> + mutex_init(&dev->enable_mutex);
>>
>> return dev;
>> }
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 77448215ef5b..cb2760a31fe2 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -419,6 +419,7 @@ struct pci_dev {
>> unsigned int no_vf_scan:1; /* Don't scan for VFs after IOV enablement */
>> pci_dev_flags_t dev_flags;
>> atomic_t enable_cnt; /* pci_enable_device has been called */
>> + struct mutex enable_mutex;
>>
>> u32 saved_config_space[16]; /* Config space saved at suspend time */
>> struct hlist_head saved_cap_space;
>> --
>> 2.20.1
>>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2019-03-27 17:12 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-03-11 13:31 [PATCH RFC v4 00/21] PCI: Allow BAR movement during hotplug Sergey Miroshnichenko
2019-03-11 13:31 ` [PATCH RFC v4 01/21] PCI: Fix writing invalid BARs during pci_restore_state() Sergey Miroshnichenko
2019-03-26 14:02 ` Bjorn Helgaas
2019-03-11 13:31 ` [PATCH RFC v4 02/21] PCI: Fix race condition in pci_enable/disable_device() Sergey Miroshnichenko
2019-03-26 19:00 ` Bjorn Helgaas
2019-03-27 17:11 ` Sergey Miroshnichenko [this message]
2019-03-11 13:31 ` [PATCH RFC v4 03/21] PCI: Enable bridge's I/O and MEM access for hotplugged devices Sergey Miroshnichenko
2019-03-26 19:13 ` Bjorn Helgaas
2019-03-27 17:13 ` Sergey Miroshnichenko
2019-03-11 13:31 ` [PATCH RFC v4 04/21] PCI: Define PCI-specific version of the release_child_resources() Sergey Miroshnichenko
2019-03-11 13:31 ` [PATCH RFC v4 05/21] PCI: hotplug: Add a flag for the movable BARs feature Sergey Miroshnichenko
2019-03-26 19:24 ` Bjorn Helgaas
2019-03-27 17:16 ` Sergey Miroshnichenko
2019-03-11 13:31 ` [PATCH RFC v4 06/21] PCI: Pause the devices with movable BARs during rescan Sergey Miroshnichenko
2019-03-11 13:31 ` [PATCH RFC v4 07/21] PCI: Wake up bridges during rescan when movable BARs enabled Sergey Miroshnichenko
2019-03-26 19:28 ` Bjorn Helgaas
2019-03-11 13:31 ` [PATCH RFC v4 08/21] nvme-pci: Handle movable BARs Sergey Miroshnichenko
2019-03-26 20:20 ` Bjorn Helgaas
2019-03-27 17:30 ` Sergey Miroshnichenko
2019-03-11 13:31 ` [PATCH RFC v4 09/21] PCI: Mark immovable BARs with PCI_FIXED Sergey Miroshnichenko
2019-03-26 20:28 ` Bjorn Helgaas
2019-03-27 17:03 ` David Laight
2019-03-27 17:39 ` Sergey Miroshnichenko
2019-03-11 13:31 ` [PATCH RFC v4 10/21] PCI: Fix assigning of fixed prefetchable resources Sergey Miroshnichenko
2019-03-26 20:37 ` Bjorn Helgaas
2019-03-11 13:31 ` [PATCH RFC v4 11/21] PCI: Release and reassign the root bridge resources during rescan Sergey Miroshnichenko
2019-03-26 20:41 ` Bjorn Helgaas
2019-03-27 17:40 ` Sergey Miroshnichenko
2019-03-11 13:31 ` [PATCH RFC v4 12/21] PCI: Don't allow hotplugged devices to steal resources Sergey Miroshnichenko
2019-03-26 20:55 ` Bjorn Helgaas
2019-03-27 18:02 ` Sergey Miroshnichenko
2019-03-11 13:31 ` [PATCH RFC v4 13/21] PCI: Include fixed BARs into the bus size calculating Sergey Miroshnichenko
2019-03-11 13:31 ` [PATCH RFC v4 14/21] PCI: Don't reserve memory for hotplug when enabled movable BARs Sergey Miroshnichenko
2019-03-26 20:57 ` Bjorn Helgaas
2019-03-11 13:31 ` [PATCH RFC v4 15/21] PCI: Allow the failed resources to be reassigned later Sergey Miroshnichenko
2019-03-26 20:58 ` Bjorn Helgaas
2019-03-11 13:31 ` [PATCH RFC v4 16/21] PCI: Calculate fixed areas of bridge windows based on fixed BARs Sergey Miroshnichenko
2019-03-11 13:31 ` [PATCH RFC v4 17/21] PCI: Calculate boundaries for bridge windows Sergey Miroshnichenko
2019-03-26 21:01 ` Bjorn Helgaas
2019-03-11 13:31 ` [PATCH RFC v4 18/21] PCI: Make sure bridge windows include their fixed BARs Sergey Miroshnichenko
2019-03-11 13:31 ` [PATCH RFC v4 19/21] PCI: Prioritize fixed BAR assigning over the movable ones Sergey Miroshnichenko
2019-03-11 13:31 ` [PATCH RFC v4 20/21] PCI: pciehp: Add support for the movable BARs feature Sergey Miroshnichenko
2019-03-26 21:11 ` Bjorn Helgaas
2019-03-11 13:31 ` [PATCH RFC v4 21/21] powerpc/pci: Fix crash with enabled movable BARs Sergey Miroshnichenko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f0a5be4c-4a90-d5d3-f4fc-7b364143a85b@yadro.com \
--to=s.miroshnichenko@yadro.com \
--cc=helgaas@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=linux@yadro.com \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=mrybczyn@kalray.eu \
--cc=srinath.mannam@broadcom.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).