linux-kernel.vger.kernel.org archive mirror
From: "Christian König" <christian.koenig@amd.com>
To: Logan Gunthorpe <logang@deltatee.com>,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	linux-nvme@lists.infradead.org, linux-rdma@vger.kernel.org,
	linux-nvdimm@lists.01.org, linux-block@vger.kernel.org
Cc: "Stephen Bates" <sbates@raithlin.com>,
	"Christoph Hellwig" <hch@lst.de>, "Jens Axboe" <axboe@kernel.dk>,
	"Keith Busch" <keith.busch@intel.com>,
	"Sagi Grimberg" <sagi@grimberg.me>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Jason Gunthorpe" <jgg@mellanox.com>,
	"Max Gurtovoy" <maxg@mellanox.com>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Benjamin Herrenschmidt" <benh@kernel.crashing.org>,
	"Alex Williamson" <alex.williamson@redhat.com>
Subject: Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory
Date: Thu, 3 May 2018 19:29:11 +0200
Message-ID: <38d866cf-f7b4-7118-d737-5a5dcd9f3784@amd.com>
In-Reply-To: <ce2e5351-f32a-bd80-1c77-b7fc38842175@deltatee.com>

On 03.05.2018 at 17:59, Logan Gunthorpe wrote:
> On 03/05/18 03:05 AM, Christian König wrote:
>> Second question is how do you want to handle things when devices are
>> not behind the same root port (which is perfectly possible in the
>> cases I deal with)?
> I think we need to implement a whitelist. If both root ports are in the
> white list and are on the same bus then we return a larger distance
> instead of -1.

Sounds good.

>> Third question why multiple clients? That feels a bit like you are
>> pushing something special to your use case into the common PCI
>> subsystem. Something which usually isn't a good idea.
> No, I think this will be pretty standard. In the simple general case you
> are going to have one provider and at least two clients (one which
> writes the memory and one which reads it). However, one client is
> likely, but not necessarily, the same as the provider.

Ok, that is the point where I'm stuck. Why do we need that in one 
function call in the PCIe subsystem?

The problem, at least with GPUs, is that we seriously don't have that 
information here, because the PCI subsystem might not be aware of all 
the interconnections.

For example, it isn't uncommon to put multiple GPUs on one board. To the 
PCI subsystem those look like separate devices, but in reality all GPUs 
are interconnected and can access each other's memory directly without 
going over the PCIe bus.

I seriously don't want to model that in the PCI subsystem, but rather 
in the driver. That's why it feels like a mistake to me to push all 
that into the PCI function.
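
To illustrate what I mean by modelling this in the driver, here is a 
rough sketch. It is only a toy under assumptions: struct toy_gpu, the 
board field, toy_gpu_distance() and the callback type are all made-up 
names and not an existing kernel interface. The driver answers from its 
own knowledge of the board first and asks the PCI layer only when it 
knows nothing better.

```c
/* Toy model: everything here is hypothetical, not kernel API. */
struct toy_gpu {
	int pci_dev;	/* identifier the PCI layer knows about */
	int board;	/* driver-private: which physical board it sits on */
};

/* Stand-in for whatever pairwise distance the PCI layer would report. */
typedef int (*toy_pci_distance_fn)(int initiator, int target);

/*
 * Driver-level distance: GPUs on the same board are assumed to reach
 * each other over a private link, so the PCI topology is not consulted.
 * Returns -1 if no peer-to-peer path is known, >= 0 otherwise.
 */
static int toy_gpu_distance(const struct toy_gpu *a, const struct toy_gpu *b,
			    toy_pci_distance_fn pci_distance)
{
	if (a->board == b->board)
		return 0;
	return pci_distance(a->pci_dev, b->pci_dev);
}

/* Example PCI answer for a platform where the PCI layer sees no path. */
static int toy_pci_says_no(int initiator, int target)
{
	(void)initiator;
	(void)target;
	return -1;
}
```

With this split the PCI code never has to learn about the on-board 
links; two GPUs on the same board are reachable even when the PCI 
answer alone would be -1.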

> In the NVMeof case, we might have N clients: 1 RDMA device and N-1 block
> devices. The code doesn't care which device provides the memory as it
> could be the RDMA device or one/all of the block devices (or, in theory,
> a completely separate device with P2P-able memory). However, it does
> require that all devices involved are accessible per
> pci_p2pdma_distance() or it won't use P2P transactions.
>
> I could also imagine other use cases: ie. an RDMA NIC sends data to a
> GPU for processing and then sends the data to an NVMe device for storage
> (or vice-versa). In this case we have 3 clients and one provider.

Why can't we model that as two separate transactions?

E.g. one from the RDMA NIC to the GPU memory. And another one from the 
GPU memory to the NVMe device.

That would also match how I get this information from userspace.
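
As a rough sketch of the two-transaction view (all names below are 
invented for illustration, and the distance callback is only a stand-in 
for a pairwise check that returns -1 when P2P is not possible), each 
transfer is evaluated on its own:

```c
#include <stddef.h>

/* Toy model: hypothetical names, not kernel API. */
enum { TOY_NIC, TOY_GPU, TOY_NVME };

struct toy_xfer {
	int initiator;
	int target;
	int use_p2p;	/* filled in by toy_plan_xfers() */
};

typedef int (*toy_distance_fn)(int initiator, int target);

/*
 * Decide for every transfer independently whether it can use P2P.
 * Returns the number of transfers that can.
 */
static size_t toy_plan_xfers(struct toy_xfer *x, size_t n,
			     toy_distance_fn distance)
{
	size_t p2p = 0;

	for (size_t i = 0; i < n; i++) {
		x[i].use_p2p = distance(x[i].initiator, x[i].target) >= 0;
		if (x[i].use_p2p)
			p2p++;
	}
	return p2p;
}

/* Example platform: only the NIC and the GPU can reach each other. */
static int toy_example_distance(int initiator, int target)
{
	if ((initiator == TOY_NIC && target == TOY_GPU) ||
	    (initiator == TOY_GPU && target == TOY_NIC))
		return 2;
	return -1;
}
```

On that example platform the NIC to GPU transfer could still use P2P 
even though the GPU to NVMe one could not, which an all-or-nothing check 
over all three devices would not express.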

>> As far as I can see we need a function which returns the distance
>> between an initiator and a target device. This function then returns
>> -1 if the transaction can't be made and a positive value otherwise.
> If you need to make a simpler convenience function for your use case I'm
> not against it.

Yeah, same for me. If Bjorn is OK with those specialized NVM functions, 
then I'm fine with that as well.

I think it would just be more convenient if we could come up with 
functions which can handle all use cases, because there still seem to 
be a lot of similarities.
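
To make the similarity concrete, here is a toy sketch, not the code 
from the patch set and not the real pci_p2pdma_distance(). The tree 
layout, the hop counting and every toy_* name are assumptions made for 
the example. The pairwise function returns -1 when the only common 
upstream node is the root complex, and the multi-client variant is 
just a loop over it.

```c
#include <stddef.h>

/* Toy topology: parent is the index of the upstream node, -1 for the
 * root complex. A single root is assumed. */
struct toy_node {
	int parent;
};

static int toy_depth(const struct toy_node *t, int n)
{
	int d = 0;

	while (t[n].parent >= 0) {
		n = t[n].parent;
		d++;
	}
	return d;
}

/*
 * Hops between initiator and target through their closest common
 * upstream node, or -1 if that node is the root complex.
 */
static int toy_p2p_distance(const struct toy_node *t, int initiator,
			    int target)
{
	int a = initiator, b = target, hops = 0;
	int da = toy_depth(t, a), db = toy_depth(t, b);

	/* Bring both sides to the same depth, then climb together. */
	while (da > db) { a = t[a].parent; da--; hops++; }
	while (db > da) { b = t[b].parent; db--; hops++; }
	while (a != b) {
		a = t[a].parent;
		b = t[b].parent;
		hops += 2;
	}
	return t[a].parent < 0 ? -1 : hops;
}

/* Multi-client variant: sum of the pairwise distances from the
 * provider, or -1 as soon as one client is unreachable. */
static int toy_p2p_distance_many(const struct toy_node *t, int provider,
				 const int *clients, size_t n)
{
	int total = 0;

	for (size_t i = 0; i < n; i++) {
		int d = toy_p2p_distance(t, provider, clients[i]);

		if (d < 0)
			return -1;
		total += d;
	}
	return total;
}
```

In this toy a device compared with itself gives 0, so the result is 
non-negative rather than strictly positive. A whitelist would slot into 
the root complex case in place of the unconditional -1.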

>
>> We also need to give the direction of the transaction and have a
>> whitelist of root complex PCI IDs which can handle P2P transactions
>> from different ports for a certain DMA direction.
> Yes. In the NVMeof case we need all devices to be able to DMA in both
> directions so we did not need the DMA direction. But I can see this
> being useful once we add the whitelist.

Ok, I agree that can be added later on. For simplicity let's assume for 
now that we always do bidirectional transfers.

Thanks for the explanation,
Christian.

>
> Logan
