All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: "Christian König" <christian.koenig@amd.com>,
	"Alex Williamson" <alex.williamson@redhat.com>,
	"Cornelia Huck" <cohuck@redhat.com>,
	dri-devel@lists.freedesktop.org, kvm@vger.kernel.org,
	linaro-mm-sig@lists.linaro.org, linux-media@vger.kernel.org,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Daniel Vetter" <daniel.vetter@ffwll.ch>,
	"Leon Romanovsky" <leon@kernel.org>,
	linux-rdma@vger.kernel.org, "Maor Gottlieb" <maorg@nvidia.com>,
	"Oded Gabbay" <ogabbay@kernel.org>,
	"Dan Williams" <dan.j.williams@intel.com>
Subject: Re: [PATCH v2 4/4] vfio/pci: Allow MMIO regions to be exported through dma-buf
Date: Fri, 9 Sep 2022 11:09:07 -0300	[thread overview]
Message-ID: <YxtJA34l5pYhZaCQ@nvidia.com> (raw)
In-Reply-To: <Yxs+k6psNfBLDqdv@infradead.org>

On Fri, Sep 09, 2022 at 06:24:35AM -0700, Christoph Hellwig wrote:
> On Wed, Sep 07, 2022 at 01:12:52PM -0300, Jason Gunthorpe wrote:
> > The PCI offset is some embedded thing - I've never seen it in a server
> > platform.
> 
> That's not actually true, e.g. some power system definitively had it,
> althiugh I don't know if the current ones do.

I thought those were all power embedded systems.

> There is a reason why we have these proper APIs and no one has any
> business bypassing them.

Yes, we should try to support these things, but you said this patch
didn't work and wasn't tested - that is not true at all.

And it isn't like we have APIs just sitting here to solve this
specific problem. So lets make something.

> > So, would you be OK with this series if I try to make a dma_map_p2p()
> > that resolves the offset issue?
> 
> Well, if it also solves the other issue of invalid scatterlists leaking
> outside of drm we can think about it.

The scatterlist stuff has already leaked outside of DRM anyhow.

Again, I think it is very problematic to let DRM get away with things
and then insist all the poor non-DRM people be responsible to clean up
their mess.

I'm skeptical I can fix AMD GPU, but I can try to create a DMABUF op
that returns something that is not a scatterlist and teach RDMA to use
it. So at least the VFIO/RDMA part can avoid the scatter list abuse. I
expected to need non-scatterlist for iommufd anyhow.

Coupled with a series to add some dma_map_resource_pci() that handles
the PCI_P2PDMA_MAP_BUS_ADDR and the PCI offset, would it be an
agreeable direction?

> Take a look at iommu_dma_map_sg and pci_p2pdma_map_segment to see how
> this is handled.

So there is a bug in all these DMABUF implementations, they do ignore
the PCI_P2PDMA_MAP_BUS_ADDR "distance type".

This isn't a real-world problem for VFIO because VFIO is largely
incompatible with the non-ACS configuration that would trigger
PCI_P2PDMA_MAP_BUS_ADDR, and explains why we never saw any
problem. All our systems have ACS turned on so we can use VFIO.

I'm unclear how Habana or AMD have avoided a problem here..

This is much more serious than the pci offset in my mind.

Thanks,
Jason

WARNING: multiple messages have this Message-ID (diff)
From: Jason Gunthorpe <jgg@nvidia.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: "Leon Romanovsky" <leon@kernel.org>,
	kvm@vger.kernel.org, linux-rdma@vger.kernel.org,
	"Daniel Vetter" <daniel.vetter@ffwll.ch>,
	"Oded Gabbay" <ogabbay@kernel.org>,
	"Cornelia Huck" <cohuck@redhat.com>,
	dri-devel@lists.freedesktop.org,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	linaro-mm-sig@lists.linaro.org,
	"Alex Williamson" <alex.williamson@redhat.com>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Maor Gottlieb" <maorg@nvidia.com>,
	"Christian König" <christian.koenig@amd.com>,
	linux-media@vger.kernel.org
Subject: Re: [PATCH v2 4/4] vfio/pci: Allow MMIO regions to be exported through dma-buf
Date: Fri, 9 Sep 2022 11:09:07 -0300	[thread overview]
Message-ID: <YxtJA34l5pYhZaCQ@nvidia.com> (raw)
In-Reply-To: <Yxs+k6psNfBLDqdv@infradead.org>

On Fri, Sep 09, 2022 at 06:24:35AM -0700, Christoph Hellwig wrote:
> On Wed, Sep 07, 2022 at 01:12:52PM -0300, Jason Gunthorpe wrote:
> > The PCI offset is some embedded thing - I've never seen it in a server
> > platform.
> 
> That's not actually true, e.g. some power system definitively had it,
> althiugh I don't know if the current ones do.

I thought those were all power embedded systems.

> There is a reason why we have these proper APIs and no one has any
> business bypassing them.

Yes, we should try to support these things, but you said this patch
didn't work and wasn't tested - that is not true at all.

And it isn't like we have APIs just sitting here to solve this
specific problem. So lets make something.

> > So, would you be OK with this series if I try to make a dma_map_p2p()
> > that resolves the offset issue?
> 
> Well, if it also solves the other issue of invalid scatterlists leaking
> outside of drm we can think about it.

The scatterlist stuff has already leaked outside of DRM anyhow.

Again, I think it is very problematic to let DRM get away with things
and then insist all the poor non-DRM people be responsible to clean up
their mess.

I'm skeptical I can fix AMD GPU, but I can try to create a DMABUF op
that returns something that is not a scatterlist and teach RDMA to use
it. So at least the VFIO/RDMA part can avoid the scatter list abuse. I
expected to need non-scatterlist for iommufd anyhow.

Coupled with a series to add some dma_map_resource_pci() that handles
the PCI_P2PDMA_MAP_BUS_ADDR and the PCI offset, would it be an
agreeable direction?

> Take a look at iommu_dma_map_sg and pci_p2pdma_map_segment to see how
> this is handled.

So there is a bug in all these DMABUF implementations, they do ignore
the PCI_P2PDMA_MAP_BUS_ADDR "distance type".

This isn't a real-world problem for VFIO because VFIO is largely
incompatible with the non-ACS configuration that would trigger
PCI_P2PDMA_MAP_BUS_ADDR, and explains why we never saw any
problem. All our systems have ACS turned on so we can use VFIO.

I'm unclear how Habana or AMD have avoided a problem here..

This is much more serious than the pci offset in my mind.

Thanks,
Jason

  reply	other threads:[~2022-09-09 14:09 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-31 23:12 [PATCH v2 0/4] Allow MMIO regions to be exported through dma-buf Jason Gunthorpe
2022-08-31 23:12 ` Jason Gunthorpe
2022-08-31 23:12 ` [PATCH v2 1/4] dma-buf: Add dma_buf_try_get() Jason Gunthorpe
2022-08-31 23:12   ` Jason Gunthorpe
2022-09-01  7:55   ` Christian König
2022-09-01  7:55     ` Christian König
2022-09-06 16:44     ` Jason Gunthorpe
2022-09-06 16:44       ` Jason Gunthorpe
2022-09-06 17:52       ` Christian König
2022-09-06 17:52         ` Christian König
2022-08-31 23:12 ` [PATCH v2 2/4] vfio: Add vfio_device_get() Jason Gunthorpe
2022-08-31 23:12   ` Jason Gunthorpe
2022-08-31 23:12 ` [PATCH v2 3/4] vfio_pci: Do not open code pci_try_reset_function() Jason Gunthorpe
2022-08-31 23:12   ` Jason Gunthorpe
2022-08-31 23:12 ` [PATCH v2 4/4] vfio/pci: Allow MMIO regions to be exported through dma-buf Jason Gunthorpe
2022-08-31 23:12   ` Jason Gunthorpe
2022-09-06  9:51   ` Christoph Hellwig
2022-09-06 10:38     ` Christian König
2022-09-06 10:38       ` Christian König
2022-09-06 11:48       ` Jason Gunthorpe
2022-09-06 11:48         ` Jason Gunthorpe
2022-09-06 12:34         ` Oded Gabbay
2022-09-06 12:34           ` Oded Gabbay
2022-09-06 17:59           ` Jason Gunthorpe
2022-09-06 17:59             ` Jason Gunthorpe
2022-09-06 19:44             ` Oded Gabbay
2022-09-06 19:44               ` Oded Gabbay
2022-09-07 12:07               ` Christoph Hellwig
2022-09-07 12:05         ` Christoph Hellwig
2022-09-07 12:33           ` Jason Gunthorpe
2022-09-07 12:33             ` Jason Gunthorpe
2022-09-07 14:29             ` Christoph Hellwig
2022-09-07 14:46               ` Oded Gabbay
2022-09-07 14:46                 ` Oded Gabbay
2022-09-07 15:23               ` Jason Gunthorpe
2022-09-07 15:23                 ` Jason Gunthorpe
2022-09-07 15:32                 ` Christoph Hellwig
2022-09-07 16:12                   ` Jason Gunthorpe
2022-09-07 16:12                     ` Jason Gunthorpe
2022-09-09 13:24                     ` Christoph Hellwig
2022-09-09 14:09                       ` Jason Gunthorpe [this message]
2022-09-09 14:09                         ` Jason Gunthorpe
2022-09-07 16:31                 ` Robin Murphy
2022-09-07 16:31                   ` Robin Murphy
2022-09-07 16:47                   ` Jason Gunthorpe
2022-09-07 16:47                     ` Jason Gunthorpe
2022-09-07 17:03               ` Dan Williams
2022-09-07 17:03                 ` Dan Williams
2022-09-07 15:01             ` Robin Murphy
2022-09-07 15:01               ` Robin Murphy
2022-09-07 12:03       ` Christoph Hellwig
2022-09-07 15:08         ` Christian König
2022-09-07 15:08           ` Christian König
2022-09-07 15:23           ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YxtJA34l5pYhZaCQ@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=christian.koenig@amd.com \
    --cc=cohuck@redhat.com \
    --cc=dan.j.williams@intel.com \
    --cc=daniel.vetter@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=hch@infradead.org \
    --cc=kvm@vger.kernel.org \
    --cc=leon@kernel.org \
    --cc=linaro-mm-sig@lists.linaro.org \
    --cc=linux-media@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=maorg@nvidia.com \
    --cc=ogabbay@kernel.org \
    --cc=sumit.semwal@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.