From: "Christian König" <christian.koenig@amd.com>
To: Logan Gunthorpe <logang@deltatee.com>,
linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
linux-nvme@lists.infradead.org, linux-rdma@vger.kernel.org,
linux-nvdimm@lists.01.org, linux-block@vger.kernel.org
Cc: "Stephen Bates" <sbates@raithlin.com>,
"Christoph Hellwig" <hch@lst.de>, "Jens Axboe" <axboe@kernel.dk>,
"Keith Busch" <keith.busch@intel.com>,
"Sagi Grimberg" <sagi@grimberg.me>,
"Bjorn Helgaas" <bhelgaas@google.com>,
"Jason Gunthorpe" <jgg@mellanox.com>,
"Max Gurtovoy" <maxg@mellanox.com>,
"Dan Williams" <dan.j.williams@intel.com>,
"Jérôme Glisse" <jglisse@redhat.com>,
"Benjamin Herrenschmidt" <benh@kernel.crashing.org>,
"Alex Williamson" <alex.williamson@redhat.com>
Subject: Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory
Date: Thu, 3 May 2018 19:29:11 +0200
Message-ID: <38d866cf-f7b4-7118-d737-5a5dcd9f3784@amd.com>
In-Reply-To: <ce2e5351-f32a-bd80-1c77-b7fc38842175@deltatee.com>
On 03.05.2018 at 17:59, Logan Gunthorpe wrote:
> On 03/05/18 03:05 AM, Christian König wrote:
>> Second question is how do you want to handle things when devices are not
>> behind the same root port (which is perfectly possible in the cases I
>> deal with)?
> I think we need to implement a whitelist. If both root ports are in the
> whitelist and are on the same bus, then we return a larger distance
> instead of -1.
Sounds good.
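(For illustration, here is a rough userspace sketch of that whitelist idea. The struct, function names, and vendor/device IDs below are all made up for this example, not taken from the patch set; the real code would walk struct pci_dev hierarchies.)

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical whitelist of root ports known to route P2P TLPs
 * between their downstream hierarchies; the vendor/device IDs here
 * are placeholders, not real entries. */
struct rp_id { unsigned short vendor, device; };

static const struct rp_id p2p_whitelist[] = {
    { 0x8086, 0x2030 },  /* placeholder */
    { 0x1022, 0x1453 },  /* placeholder */
};

static bool rp_whitelisted(unsigned short vendor, unsigned short device)
{
    for (size_t i = 0; i < sizeof(p2p_whitelist) / sizeof(p2p_whitelist[0]); i++)
        if (p2p_whitelist[i].vendor == vendor &&
            p2p_whitelist[i].device == device)
            return true;
    return false;
}

/* Distance model: plain hop count when both devices sit behind the
 * same root port; hop count plus a large penalty when the transfer
 * has to cross two whitelisted root ports on the same bus; -1 when
 * the transaction cannot be made at all. */
static int p2p_distance(bool same_root_port, int hops,
                        unsigned short rp_a_vendor, unsigned short rp_a_device,
                        unsigned short rp_b_vendor, unsigned short rp_b_device)
{
    if (same_root_port)
        return hops;
    if (rp_whitelisted(rp_a_vendor, rp_a_device) &&
        rp_whitelisted(rp_b_vendor, rp_b_device))
        return hops + 100;  /* the "larger distance" for crossing the RC */
    return -1;
}
```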
>> Third question: why multiple clients? That feels a bit like you are
>> pushing something specific to your use case into the common PCI
>> subsystem. Something which usually isn't a good idea.
> No, I think this will be pretty standard. In the simple general case you
> are going to have one provider and at least two clients (one which
> writes the memory and one which reads it). However, one client is
> likely, but not necessarily, the same as the provider.
Ok, that is the point where I'm stuck. Why do we need that in one
function call in the PCIe subsystem?
The problem at least with GPUs is that we seriously don't have that
information here, because the PCI subsystem might not be aware of all the
interconnections.
For example, it isn't uncommon to put multiple GPUs on one board. To the
PCI subsystem that looks like separate devices, but in reality all the GPUs
are interconnected and can access each other's memory directly without
going over the PCIe bus.
I seriously don't want to model that in the PCI subsystem, but rather in
the driver. That's why it feels like a mistake to me to push all of that
into the PCI function.
> In the NVMeof case, we might have N clients: 1 RDMA device and N-1 block
> devices. The code doesn't care which device provides the memory as it
> could be the RDMA device or one/all of the block devices (or, in theory,
> a completely separate device with P2P-able memory). However, it does
> require that all devices involved are accessible per
> pci_p2pdma_distance() or it won't use P2P transactions.
>
> I could also imagine other use cases: ie. an RDMA NIC sends data to a
> GPU for processing and then sends the data to an NVMe device for storage
> (or vice-versa). In this case we have 3 clients and one provider.
Why can't we model that as two separate transactions?
E.g. one from the RDMA NIC to the GPU memory. And another one from the
GPU memory to the NVMe device.
That would also match how I get this information from userspace.
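(As an aside, the "all clients must be reachable" rule Logan describes could be modeled roughly like this. This is a plain-C toy with a hardcoded distance table standing in for pci_p2pdma_distance(); every name in it is illustrative, only the selection logic is the point.)

```c
#include <stddef.h>

/* Illustrative pairwise distance table: demo_dist[provider][client]
 * gives the distance, with -1 meaning no P2P path. In the real patch
 * set this role is played by pci_p2pdma_distance() on pci_dev pairs. */
static const int demo_dist[4][4] = {
    [0] = { [1] = 2, [2] = 4, [3] = -1 },
};

/* A provider is only usable if *every* client can reach it; the
 * returned value is the summed distance, mirroring the rule that all
 * devices involved must be accessible or P2P is not used. */
static int provider_distance(int provider, const int *clients, size_t n,
                             const int dist[4][4])
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        int d = dist[provider][clients[i]];
        if (d < 0)
            return -1;  /* one unreachable client disqualifies P2P */
        total += d;
    }
    return total;
}

/* Small demos over the table above. */
static int demo_all_reachable(void)
{
    int clients[] = { 1, 2 };
    return provider_distance(0, clients, 2, demo_dist);
}

static int demo_one_unreachable(void)
{
    int clients[] = { 1, 2, 3 };
    return provider_distance(0, clients, 3, demo_dist);
}
```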
>> As far as I can see we need a function which returns the distance between
>> an initiator and a target device. This function then returns -1 if the
>> transaction can't be made and a positive value otherwise.
> If you need to make a simpler convenience function for your use case I'm
> not against it.
Yeah, same for me. If Bjorn is ok with those specialized NVMe functions
then I'm fine with that as well.
I think it would just be more convenient if we could come up with
functions which can handle all the use cases, because there still seem to
be a lot of similarities.
>
>> We also need to give the direction of the transaction and have a
>> whitelist of root complex PCI IDs which can handle P2P transactions from
>> different ports for a certain DMA direction.
> Yes. In the NVMeof case we need all devices to be able to DMA in both
> directions so we did not need the DMA direction. But I can see this
> being useful once we add the whitelist.
Ok, I agree that can be added later on. For simplicity let's assume for
now that we always do bidirectional transfers.
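(A direction-aware variant could then be layered on later, something like the sketch below. The enum and function names are made up for illustration, not the actual kernel API; the bidirectional case simply requires both directions to work.)

```c
/* Hypothetical direction-aware variant of the distance query; the
 * enum and names are illustrative, not the actual kernel API. */
enum p2p_dma_dir {
    P2P_DMA_TO_DEVICE,
    P2P_DMA_FROM_DEVICE,
    P2P_DMA_BIDIRECTIONAL,
};

/* dist_to/dist_from are the per-direction distances (-1 = impossible).
 * A bidirectional transfer is only possible when both directions are,
 * and its distance is the worse of the two. */
static int p2p_distance_dir(int dist_to, int dist_from, enum p2p_dma_dir dir)
{
    switch (dir) {
    case P2P_DMA_TO_DEVICE:
        return dist_to;
    case P2P_DMA_FROM_DEVICE:
        return dist_from;
    case P2P_DMA_BIDIRECTIONAL:
        if (dist_to < 0 || dist_from < 0)
            return -1;
        return dist_to > dist_from ? dist_to : dist_from;
    }
    return -1;
}
```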
Thanks for the explanation,
Christian.
>
> Logan