linux-block.vger.kernel.org archive mirror
From: John Hubbard <jhubbard@nvidia.com>
To: Logan Gunthorpe <logang@deltatee.com>,
	<linux-kernel@vger.kernel.org>, <linux-nvme@lists.infradead.org>,
	<linux-block@vger.kernel.org>, <linux-pci@vger.kernel.org>,
	<linux-mm@kvack.org>, <iommu@lists.linux-foundation.org>
Cc: "Stephen Bates" <sbates@raithlin.com>,
	"Christoph Hellwig" <hch@lst.de>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Jason Gunthorpe" <jgg@ziepe.ca>,
	"Christian König" <christian.koenig@amd.com>,
	"Don Dutile" <ddutile@redhat.com>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Daniel Vetter" <daniel.vetter@ffwll.ch>,
	"Jakowski Andrzej" <andrzej.jakowski@intel.com>,
	"Minturn Dave B" <dave.b.minturn@intel.com>,
	"Jason Ekstrand" <jason@jlekstrand.net>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"Xiong Jianxin" <jianxin.xiong@intel.com>,
	"Bjorn Helgaas" <helgaas@kernel.org>,
	"Ira Weiny" <ira.weiny@intel.com>,
	"Robin Murphy" <robin.murphy@arm.com>
Subject: Re: [PATCH 02/16] PCI/P2PDMA: Avoid pci_get_slot() which sleeps
Date: Sat, 1 May 2021 22:35:43 -0700
Message-ID: <d6220bff-83fc-6c03-76f7-32e9e00e40fd@nvidia.com>
In-Reply-To: <20210408170123.8788-3-logang@deltatee.com>

On 4/8/21 10:01 AM, Logan Gunthorpe wrote:
> In order to use upstream_bridge_distance_warn() from a dma_map function,
> it must not sleep. However, pci_get_slot() takes the pci_bus_sem so it
> might sleep.
> 
> In order to avoid this, try to get the host bridge's device from
> bus->self, and if that is not set, just get the first element in the
> device list. It should be impossible for the host bridge's device to
> go away while references are held on child devices, so the first element
> should not be able to change and, thus, this should be safe.
> 
> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
> ---
>   drivers/pci/p2pdma.c | 14 ++++++++++++--
>   1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
> index bd89437faf06..473a08940fbc 100644
> --- a/drivers/pci/p2pdma.c
> +++ b/drivers/pci/p2pdma.c
> @@ -311,16 +311,26 @@ static const struct pci_p2pdma_whitelist_entry {
>   static bool __host_bridge_whitelist(struct pci_host_bridge *host,
>   				    bool same_host_bridge)
>   {
> -	struct pci_dev *root = pci_get_slot(host->bus, PCI_DEVFN(0, 0));
>   	const struct pci_p2pdma_whitelist_entry *entry;
> +	struct pci_dev *root = host->bus->self;
>   	unsigned short vendor, device;
>   
> +	/*
> +	 * This makes the assumption that the first device on the bus is the
> +	 * bridge itself and it has the devfn of 00.0. This assumption should
> +	 * hold for the devices in the white list above, and if there are cases
> +	 * where this isn't true they will have to be dealt with when such a
> +	 * case is added to the whitelist.

Actually, it makes the assumption that the first device *in the list*
(the host->bus->devices list) is 00.0.  The previous code made the
assumption that you wrote.
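
To make the difference between the two assumptions concrete, here is a
small userspace sketch. It is not kernel code: struct mock_dev and both
function names are made up for illustration, and only the PCI_DEVFN()
encoding is copied from the kernel. One lookup searches the list for a
given devfn, the way pci_get_slot() does; the other takes whatever is
first in the list and then checks that it is 00.0:

```c
#include <assert.h>
#include <stddef.h>

/* Same encoding as the kernel's PCI_DEVFN(): 5-bit slot, 3-bit function. */
#define PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))

/* Hypothetical userspace stand-in for a struct pci_dev on a bus list. */
struct mock_dev {
	unsigned int devfn;
	struct mock_dev *next;
};

/* Models the old lookup: search the whole list for a given devfn. */
static struct mock_dev *find_by_devfn(struct mock_dev *head, unsigned int devfn)
{
	for (; head; head = head->next)
		if (head->devfn == devfn)
			return head;
	return NULL;
}

/* Models the new lookup: take whatever is first, then check it is 00.0. */
static struct mock_dev *first_if_root(struct mock_dev *head)
{
	if (!head || head->devfn != PCI_DEVFN(0, 0))
		return NULL;
	return head;
}
```

If 00.0 exists but is not the first entry, find_by_devfn() still finds
it while first_if_root() returns NULL, so the two are only equivalent
when the list order matches the assumption.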

By the way, pre-existing code comment: pci_p2pdma_whitelist[] seems
really short. From a naive point of view, I'd expect that there must be
a lot more CPUs/chipsets that can do pci p2p, what do you think? I
wonder if we have to be so super strict, anyway. It just seems extremely
limited, and I suspect there will be some additions to the list as soon
as we start to use this.


> +	 */
>   	if (!root)
> +		root = list_first_entry_or_null(&host->bus->devices,
> +						struct pci_dev, bus_list);

OK, yes this avoids taking the pci_bus_sem, but it's kind of cheating.
Why is it OK to avoid taking any locks in order to retrieve the
first entry from the list, but in order to retrieve any other entry, you
have to acquire the pci_bus_sem, and get a reference as well? Something
is inconsistent there.

The new version here also no longer takes a reference on the device,
which is also cheating. But I'm guessing that the unstated assumption
here is that there is always at least one entry in the list. But if
that's true, then it's better to show clearly that assumption, instead
of hiding it in an implicit call that skips both locking and reference
counting.

You could add a new function, which is a cut-down version of pci_get_slot(),
like this, and call this from __host_bridge_whitelist():

/*
 * A special-purpose variant of pci_get_slot() that doesn't take the
 * pci_bus_sem lock, and only looks for the 00.0 bus-device-function. Once the
 * PCI bus is up, it is safe to call this, because there will always be a
 * top-level PCI root device.
 *
 * Other assumptions: the root device is the first device in the list, and the
 * root device is numbered 00.0.
 *
 * Returns the root device with its reference count incremented, or NULL.
 */
struct pci_dev *pci_get_root_slot(struct pci_bus *bus)
{
	struct pci_dev *root;

	root = list_first_entry_or_null(&bus->devices, struct pci_dev,
					bus_list);
	/* The list may be empty, so check for NULL before dereferencing. */
	if (!root || root->devfn != PCI_DEVFN(0, 0))
		return NULL;

	return pci_dev_get(root);
}
EXPORT_SYMBOL(pci_get_root_slot);

...I think that makes it a lot clearer to the reader what's going on here.
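
For what it's worth, the reference counting side of a helper along these
lines can be modeled in userspace. This is only a sketch under stated
assumptions: struct mock_dev and every mock_* function are hypothetical
stand-ins, the refcount is a plain int rather than a real kobject
reference, and the model includes a check for an empty list. The point
is that the caller gets a reference from the helper and drops it when
done, so the count is balanced on every path:

```c
#include <assert.h>
#include <stddef.h>

/* Same encoding as the kernel's PCI_DEVFN(): 5-bit slot, 3-bit function. */
#define PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))

/* Hypothetical stand-in for struct pci_dev: just a devfn and a refcount. */
struct mock_dev {
	unsigned int devfn;
	int refcount;
};

/* Like pci_dev_get(): tolerates NULL and returns its argument. */
static struct mock_dev *mock_dev_get(struct mock_dev *dev)
{
	if (dev)
		dev->refcount++;
	return dev;
}

/* Like pci_dev_put(): tolerates NULL. */
static void mock_dev_put(struct mock_dev *dev)
{
	if (dev)
		dev->refcount--;
}

/* Models the helper: 'first' is the first list entry, or NULL if empty. */
static struct mock_dev *mock_get_root_slot(struct mock_dev *first)
{
	if (!first || first->devfn != PCI_DEVFN(0, 0))
		return NULL;
	return mock_dev_get(first);
}

/* Models the caller: use the device, then drop the reference. */
static int mock_root_is_00_0(struct mock_dev *first)
{
	struct mock_dev *root = mock_get_root_slot(first);
	int found = root != NULL;

	mock_dev_put(root);
	return found;
}
```

With this shape, the caller does not need to know whether the helper
found a device before calling the put function, which matches how
pci_dev_get() and pci_dev_put() both accept NULL.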

Note that I'm not really sure if it *is* safe, I would need to ask other
PCIe subsystem developers with more experience. But I don't think anyone
is trying to make p2pdma calls so early that PCIe buses are uninitialized.


> +
> +	if (!root || root->devfn)
>   		return false;
>   
>   	vendor = root->vendor;
>   	device = root->device;
> -	pci_dev_put(root);
>   
>   	for (entry = pci_p2pdma_whitelist; entry->vendor; entry++) {
>   		if (vendor != entry->vendor || device != entry->device)
> 

thanks,
-- 
John Hubbard
NVIDIA
