From: Daniel Vetter <daniel@ffwll.ch>
To: "Koenig, Christian" <Christian.Koenig@amd.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"daniel.vetter@ffwll.ch" <daniel.vetter@ffwll.ch>,
	"jian.xu.zheng@intel.com" <jian.xu.zheng@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>, "jgg@ziepe.ca" <jgg@ziepe.ca>,
	"sakari.ailus@linux.intel.com" <sakari.ailus@linux.intel.com>,
	"bingbu.cao@intel.com" <bingbu.cao@intel.com>,
	"linux-media@vger.kernel.org" <linux-media@vger.kernel.org>,
	"shiraz.saleem@intel.com" <shiraz.saleem@intel.com>,
	"hch@lst.de" <hch@lst.de>,
	"tian.shu.qiu@intel.com" <tian.shu.qiu@intel.com>,
	"yong.zhi@intel.com" <yong.zhi@intel.com>
Subject: Re: [PATCH] lib/scatterlist: Provide a DMA page iterator
Date: Wed, 16 Jan 2019 11:14:40 +0100
Message-ID: <20190116101440.GR10517@phenom.ffwll.local>
In-Reply-To: <8aadac80-da9b-b52a-a4bf-066406127117@amd.com>

On Wed, Jan 16, 2019 at 07:28:13AM +0000, Koenig, Christian wrote:
> On 16.01.19 at 08:09, Thomas Hellstrom wrote:
> > On Tue, 2019-01-15 at 21:58 +0100, hch@lst.de wrote:
> >> On Tue, Jan 15, 2019 at 07:13:11PM +0000, Koenig, Christian wrote:
> >>> Thomas is correct that the interface you propose here doesn't work at
> >>> all for GPUs.
> >>>
> >>> The kernel driver is not informed of flush/sync, but rather just sets
> >>> up coherent mappings between system memory and devices.
> >>>
> >>> In other words you have an array of struct pages and need to map that
> >>> to a specific device and so create dma_addresses for the mappings.
> >> If you want a coherent mapping you need to use dma_alloc_coherent
> >> and dma_mmap_coherent and you are done, that is not the problem.
> >> That actually is one of the vmwgfx modes, so I don't understand what
> >> problem we are trying to solve if you don't actually want a
> >> non-coherent mapping.
> > For vmwgfx, not making dma_alloc_coherent default has a couple of
> > reasons:
> > 1) Memory is associated with a struct device. It has not been clear
> > that it is exportable to other devices.
> > 2) There seem to be restrictions on the system pages allowed. GPUs
> > generally prefer highmem pages but dma_alloc_coherent returns a virtual
> > address implying GFP_KERNEL? While not used by vmwgfx, TTM typically
> > prefers HIGHMEM pages to facilitate caching mode switching without
> > having to touch the kernel map.
> > 3) Historically we had APIs to allow coherent access to user-space
> > defined pages. That has gone away now, but the infrastructure was built
> > around it.
> >
> > dma_mmap_coherent isn't used because as the data moves between system
> > memory, swap and VRAM, PTEs of user-space mappings are adjusted
> > accordingly, meaning user-space doesn't have to unmap when an operation
> > is initiated that might mean the data is moved.
> 
> To summarize once more: We have an array of struct pages and want to 
> coherently map that to a device.
> 
> If that is not possible for whatever reason, we want to get an
> error code or even not load the driver from the beginning.

I guess to make this work we'd also need information about how we're
allowed to mmap this on the cpu side, both from the kernel (kmap or vmap)
and for userspace. At least for i915 we use all kinds of combinations,
e.g. cached cpu mmap ptes with coherent device transactions, or
cached+clflush on the cpu side with non-coherent device transactions (the
no-snoop thing), or wc mode in the cpu ptes with non-coherent device
transactions.

Plus some debug mode so we catch abuse, because the reality is that most
of the gpu driver work happens on x86, where all of this just works, even
if you commit some really serious layering violations (which is why this
isn't that high a priority for gpu folks).
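
To make the combinations above a bit more concrete, here is a rough
userspace sketch of the decision they imply. The enum and function names
are made up for illustration and are not i915 or DMA API code; it only
models when an explicit cpu cache flush is needed.

```c
#include <stdbool.h>

/* How the cpu ptes map the buffer (illustrative names, not kernel API). */
enum cpu_map_mode { CPU_MAP_CACHED, CPU_MAP_WC };

/* Whether device transactions snoop the cpu caches (illustrative names). */
enum dev_access { DEV_COHERENT, DEV_NOSNOOP };

/*
 * An explicit cpu cache flush (clflush on x86) is only needed when the
 * cpu mapping is cached and the device does not snoop. A wc mapping
 * bypasses the cache, and a snooping device sees cached data directly.
 */
static bool needs_cpu_cache_flush(enum cpu_map_mode cpu, enum dev_access dev)
{
	return cpu == CPU_MAP_CACHED && dev == DEV_NOSNOOP;
}
```

Of the three combinations listed, only cached+no-snoop returns true here.
A real interface would presumably also have to say which of these modes
a given platform permits at all, which this sketch does not attempt.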
-Daniel

> 
> >
> >
> >> Although last time I had that discussion with Daniel Vetter
> >> I was under the impression that GPUs really wanted non-coherent
> >> mappings.
> > Intel historically has done things a bit differently. And it's also
> > possible that embedded platforms and ARM prefer this mode of operation,
> > but I haven't caught up on that discussion.
> >
> >> But if you want a coherent mapping you can't go to a struct page,
> >> because on many systems you can't just map arbitrary memory as
> >> uncacheable.  It might either come from very special limited pools,
> >> or might need other magic applied to it so that it is not visible
> >> in the normal direct mapping, or at least not accessed through it.
> >
> > The TTM subsystem has been relied on to provide coherent memory with
> > the option to switch caching mode of pages. But only on selected and
> > well tested platforms. On other platforms we simply do not load, and
> > that's fine for now.
> >
> > But as mentioned multiple times, to make GPU drivers more compliant,
> > we'd really want that
> >
> > bool dma_streaming_is_coherent(const struct device *)
> >
> > API to help us decide when to load or not.
> 
> Yes, please.
> 
> Christian.
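
A minimal userspace mock of how a driver might use the helper proposed
above to decide whether to load. dma_streaming_is_coherent() is only a
proposal in this thread, so the body below, struct mock_device and
mock_gpu_probe() are all assumptions made up for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the kernel's struct device; the field is hypothetical. */
struct mock_device {
	bool streaming_coherent;
};

/*
 * Mock of the proposed helper. The real thing would have to query the
 * platform's DMA setup; here it just reads a flag.
 */
static bool dma_streaming_is_coherent(const struct mock_device *dev)
{
	return dev->streaming_coherent;
}

/*
 * Hypothetical probe path: refuse to bind when streaming mappings are
 * not coherent, instead of loading and misbehaving later.
 */
static int mock_gpu_probe(const struct mock_device *dev)
{
	if (!dma_streaming_is_coherent(dev))
		return -ENODEV;
	return 0;
}
```

The error code is an arbitrary choice for the sketch; the point is only
that the check happens once, up front, at load time.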
> 
> >
> > Thanks,
> > Thomas
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch