From: Logan Gunthorpe <logang@deltatee.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@ziepe.ca>,
linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org,
linux-rdma@vger.kernel.org, Jens Axboe <axboe@kernel.dk>,
Bjorn Helgaas <bhelgaas@google.com>,
Dan Williams <dan.j.williams@intel.com>,
Sagi Grimberg <sagi@grimberg.me>, Keith Busch <kbusch@kernel.org>,
Stephen Bates <sbates@raithlin.com>
Subject: Re: [RFC PATCH 00/28] Removing struct page from P2PDMA
Date: Thu, 27 Jun 2019 12:00:35 -0600 [thread overview]
Message-ID: <e63d0259-e17f-effe-b76d-43dbfda8ae3a@deltatee.com> (raw)
In-Reply-To: <20190627170027.GE10652@lst.de>
On 2019-06-27 11:00 a.m., Christoph Hellwig wrote:
> It is not. (c) is fundamentally very different as it is not actually
> an operation that ever goes out to the wire at all, and which is why the
> actual physical address on the wire does not matter at all.
> Some interfaces like NVMe have designed it in a way that it the commands
> used to do this internal transfer look like (b2), but that is just their
> (IMHO very questionable) interface design choice, that produces a whole
> chain of problems.
From the mapping device's driver's perspective yes, but from the
perspective of a submitting driver they would be the same.
>>> I guess it might make sense to just have a block layer flag that (b) or
>>> (c) might be contained in a bio. Then we always look up the data
>>> structure, but can still fall back to (a) if nothing was found. That
>>> even allows free mixing and matching of memory types, at least as long
>>> as they are contained to separate bio_vec segments.
>>
>> IMO these three cases should be reflected in flags in the bio_vec. We'd
>> probably still need a queue flag to indicate support for mapping these,
>> but a flag on the bio that indicates special cases *might* exist in the
>> bio_vec and the driver has to do extra work to somehow distinguish the
>> three types doesn't seem useful. bio_vec flags also make it easy to
>> support mixing segments from different memory types.
>
> So I іnitially suggested these flags. But without a pgmap we absolutely
> need a lookup operation to find which phys address ranges map to which
> device. And once we do that the data structure the only thing we need
> is a flag saying that we need that information, and everything else
> can be in the data structure returned from that lookup.
Yes, you did suggest them. But what I'm trying to suggest is we don't
*necessarily* need the lookup. For demonstration purposes only, a
submitting driver could very roughly potentially do:
struct bio_vec vec;
dist = pci_p2pdma_dist(provider_pdev, mapping_pdev);
if (dist < 0) {
/* use regular memory */
vec.bv_addr = virt_to_phys(kmalloc(...));
vec.bv_flags = 0;
} else if (dist & PCI_P2PDMA_THRU_HOST_BRIDGE) {
vec.bv_addr = pci_p2pmem_alloc_phys(provider_pdev, ...);
vec.bv_flags = BVEC_MAP_RESOURCE;
} else {
vec.bv_addr = pci_p2pmem_alloc_bus_addr(provider_pdev, ...);
vec.bv_flags = BVEC_MAP_BUS_ADDR;
}
-- And a mapping driver would roughly just do:
dma_addr_t dma_addr;
if (vec.bv_flags & BVEC_MAP_BUS_ADDR) {
if (pci_bus_addr_in_bar(mapping_pdev, vec.bv_addr, &bar, &off)) {
/* case (c) */
/* program the DMA engine with bar and off */
} else {
/* case (b2) */
dma_addr = vec.bv_addr;
}
} else if (vec.bv_flags & BVEC_MAP_RESOURCE) {
/* case (b1) */
dma_addr = dma_map_resource(mapping_dev, vec.bv_addr, ...);
} else {
/* case (a) */
dma_addr = dma_map_page(..., phys_to_page(vec.bv_addr), ...);
}
The real difficulty here is that you'd really want all the above handled
by a dma_map_bvec() so it can combine every vector hitting the IOMMU
into a single continuous IOVA -- but it's hard to fit case (c) into that
equation. So it might be that a dma_map_bvec() handles cases (a), (b1)
and (b2) and the mapping driver has to then check each resulting DMA
vector for pci_bus_addr_in_bar() while it is programming the DMA engine
to deal with case (c).
Logan
next prev parent reply other threads:[~2019-06-27 18:00 UTC|newest]
Thread overview: 89+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-06-20 16:12 [RFC PATCH 00/28] Removing struct page from P2PDMA Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 01/28] block: Introduce DMA direct request type Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 02/28] block: Add dma_vec structure Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 03/28] block: Warn on mis-use of dma-direct bios Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 04/28] block: Never bounce " Logan Gunthorpe
2019-06-20 17:23 ` Jason Gunthorpe
2019-06-20 18:38 ` Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 05/28] block: Skip dma-direct bios in bio_integrity_prep() Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 06/28] block: Support dma-direct bios in bio_advance_iter() Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 07/28] block: Use dma_vec length in bio_cur_bytes() for dma-direct bios Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 08/28] block: Introduce dmavec_phys_mergeable() Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 09/28] block: Introduce vec_gap_to_prev() Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 10/28] block: Create generic vec_split_segs() from bvec_split_segs() Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 11/28] block: Create blk_segment_split_ctx Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 12/28] block: Create helper for bvec_should_split() Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 13/28] block: Generalize bvec_should_split() Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 14/28] block: Support splitting dma-direct bios Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 15/28] block: Support counting dma-direct bio segments Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 16/28] block: Implement mapping dma-direct requests to SGs in blk_rq_map_sg() Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 17/28] block: Introduce queue flag to indicate support for dma-direct bios Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 18/28] block: Introduce bio_add_dma_addr() Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 19/28] nvme-pci: Support dma-direct bios Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 20/28] IB/core: Introduce API for initializing a RW ctx from a DMA address Logan Gunthorpe
2019-06-20 16:49 ` Jason Gunthorpe
2019-06-20 16:59 ` Logan Gunthorpe
2019-06-20 17:11 ` Jason Gunthorpe
2019-06-20 18:24 ` Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 21/28] nvmet: Split nvmet_bdev_execute_rw() into a helper function Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 22/28] nvmet: Use DMA addresses instead of struct pages for P2P Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 23/28] nvme-pci: Remove support for PCI_P2PDMA requests Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 24/28] block: Remove PCI_P2PDMA queue flag Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 25/28] IB/core: Remove P2PDMA mapping support in rdma_rw_ctx Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 26/28] PCI/P2PDMA: Remove SGL helpers Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 27/28] PCI/P2PDMA: Remove struct pages that back P2PDMA memory Logan Gunthorpe
2019-06-20 16:12 ` [RFC PATCH 28/28] memremap: Remove PCI P2PDMA page memory type Logan Gunthorpe
2019-06-20 18:45 ` [RFC PATCH 00/28] Removing struct page from P2PDMA Dan Williams
2019-06-20 19:33 ` Jason Gunthorpe
2019-06-20 20:18 ` Dan Williams
2019-06-20 20:51 ` Logan Gunthorpe
2019-06-21 17:47 ` Jason Gunthorpe
2019-06-21 17:54 ` Dan Williams
2019-06-24 7:31 ` Christoph Hellwig
2019-06-24 13:46 ` Jason Gunthorpe
2019-06-24 13:50 ` Christoph Hellwig
2019-06-24 13:55 ` Jason Gunthorpe
2019-06-24 16:53 ` Logan Gunthorpe
2019-06-24 18:16 ` Jason Gunthorpe
2019-06-24 18:28 ` Logan Gunthorpe
2019-06-24 18:54 ` Jason Gunthorpe
2019-06-24 19:37 ` Logan Gunthorpe
2019-06-24 16:10 ` Logan Gunthorpe
2019-06-25 7:18 ` Christoph Hellwig
2019-06-20 19:34 ` Logan Gunthorpe
2019-06-20 23:40 ` Dan Williams
2019-06-20 23:42 ` Logan Gunthorpe
2019-06-24 7:27 ` Christoph Hellwig
2019-06-24 16:07 ` Logan Gunthorpe
2019-06-25 7:20 ` Christoph Hellwig
2019-06-25 15:57 ` Logan Gunthorpe
2019-06-25 17:01 ` Christoph Hellwig
2019-06-25 19:54 ` Logan Gunthorpe
2019-06-26 6:57 ` Christoph Hellwig
2019-06-26 18:31 ` Logan Gunthorpe
2019-06-26 20:21 ` Jason Gunthorpe
2019-06-26 20:39 ` Dan Williams
2019-06-26 20:54 ` Jason Gunthorpe
2019-06-26 20:55 ` Logan Gunthorpe
2019-06-26 20:45 ` Logan Gunthorpe
2019-06-26 21:00 ` Jason Gunthorpe
2019-06-26 21:18 ` Logan Gunthorpe
2019-06-27 6:32 ` Jason Gunthorpe
2019-06-27 16:09 ` Logan Gunthorpe
2019-06-27 16:35 ` Jason Gunthorpe
2019-06-27 16:49 ` Logan Gunthorpe
2019-06-28 4:57 ` Jason Gunthorpe
2019-06-28 16:22 ` Logan Gunthorpe
2019-06-28 17:29 ` Jason Gunthorpe
2019-06-28 18:29 ` Logan Gunthorpe
2019-06-28 19:09 ` Jason Gunthorpe
2019-06-28 19:35 ` Logan Gunthorpe
2019-07-02 22:45 ` Jason Gunthorpe
2019-07-02 22:52 ` Logan Gunthorpe
2019-06-27 9:08 ` Christoph Hellwig
2019-06-27 16:30 ` Logan Gunthorpe
2019-06-27 17:00 ` Christoph Hellwig
2019-06-27 18:00 ` Logan Gunthorpe [this message]
2019-06-28 13:38 ` Christoph Hellwig
2019-06-28 15:54 ` Logan Gunthorpe
2019-06-27 9:01 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e63d0259-e17f-effe-b76d-43dbfda8ae3a@deltatee.com \
--to=logang@deltatee.com \
--cc=axboe@kernel.dk \
--cc=bhelgaas@google.com \
--cc=dan.j.williams@intel.com \
--cc=hch@lst.de \
--cc=jgg@ziepe.ca \
--cc=kbusch@kernel.org \
--cc=linux-block@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=linux-pci@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=sagi@grimberg.me \
--cc=sbates@raithlin.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).