* [bug report] drm/ttm: add transparent huge page support for DMA allocations v2
@ 2019-06-27 14:06 Dan Carpenter
2019-06-27 16:19 ` Koenig, Christian
0 siblings, 1 reply; 7+ messages in thread
From: Dan Carpenter @ 2019-06-27 14:06 UTC (permalink / raw)
To: christian.koenig; +Cc: Christoph Hellwig, dri-devel
Hello Christian König,
The patch 648bc3574716: "drm/ttm: add transparent huge page support
for DMA allocations v2" from Jul 6, 2017, leads to the following
static checker warning:
drivers/gpu/drm/ttm/ttm_page_alloc_dma.c:314 __ttm_dma_alloc_page()
error: 'vaddr' came from dma_alloc_coherent() so we can't do virt_to_phys()
drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
295 static struct dma_page *__ttm_dma_alloc_page(struct dma_pool *pool)
296 {
297 struct dma_page *d_page;
298 unsigned long attrs = 0;
299 void *vaddr;
300
301 d_page = kmalloc(sizeof(struct dma_page), GFP_KERNEL);
302 if (!d_page)
303 return NULL;
304
305 if (pool->type & IS_HUGE)
306 attrs = DMA_ATTR_NO_WARN;
307
308 vaddr = dma_alloc_attrs(pool->dev, pool->size, &d_page->dma,
309 pool->gfp_flags, attrs);
310 if (vaddr) {
311 if (is_vmalloc_addr(vaddr))
312 d_page->p = vmalloc_to_page(vaddr);
313 else
314 d_page->p = virt_to_page(vaddr);
Christoph was explaining this earlier and I don't think it's sufficient
to just check is_vmalloc_addr(). See https://lkml.org/lkml/2019/6/17/155
Apparently the vaddr might not have a page backing...
(I am a newbie to this so I may be wrong or may have misunderstood.)
315 d_page->vaddr = (unsigned long)vaddr;
316 if (pool->type & IS_HUGE)
317 d_page->vaddr |= VADDR_FLAG_HUGE_POOL;
318 } else {
319 kfree(d_page);
320 d_page = NULL;
321 }
322 return d_page;
323 }
regards,
dan carpenter
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [bug report] drm/ttm: add transparent huge page support for DMA allocations v2
2019-06-27 14:06 [bug report] drm/ttm: add transparent huge page support for DMA allocations v2 Dan Carpenter
@ 2019-06-27 16:19 ` Koenig, Christian
[not found] ` <20190627165532.GD10652@lst.de>
0 siblings, 1 reply; 7+ messages in thread
From: Koenig, Christian @ 2019-06-27 16:19 UTC (permalink / raw)
To: Dan Carpenter; +Cc: Christoph Hellwig, dri-devel
On 27.06.19 at 16:06, Dan Carpenter wrote:
> Hello Christian König,
>
> The patch 648bc3574716: "drm/ttm: add transparent huge page support
> for DMA allocations v2" from Jul 6, 2017, leads to the following
> static checker warning:
>
> drivers/gpu/drm/ttm/ttm_page_alloc_dma.c:314 __ttm_dma_alloc_page()
> error: 'vaddr' came from dma_alloc_coherent() so we can't do virt_to_phys()
>
> drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> 295 static struct dma_page *__ttm_dma_alloc_page(struct dma_pool *pool)
> 296 {
> 297 struct dma_page *d_page;
> 298 unsigned long attrs = 0;
> 299 void *vaddr;
> 300
> 301 d_page = kmalloc(sizeof(struct dma_page), GFP_KERNEL);
> 302 if (!d_page)
> 303 return NULL;
> 304
> 305 if (pool->type & IS_HUGE)
> 306 attrs = DMA_ATTR_NO_WARN;
> 307
> 308 vaddr = dma_alloc_attrs(pool->dev, pool->size, &d_page->dma,
> 309 pool->gfp_flags, attrs);
> 310 if (vaddr) {
> 311 if (is_vmalloc_addr(vaddr))
> 312 d_page->p = vmalloc_to_page(vaddr);
> 313 else
> 314 d_page->p = virt_to_page(vaddr);
>
> Christoph was explaining this earlier and I don't think it's sufficient
> to just check is_vmalloc_addr(). See https://lkml.org/lkml/2019/6/17/155
> Apparently the vaddr might not have a page backing...
That stuff was copied from the older non-huge page code and I actually
think it is correct.
If not then the real question is how to correctly get the backing page
from dma_alloc_attrs().
Christian.
>
> (I am a newbie to this so I may be wrong or have misunderstood also).
>
> 315 d_page->vaddr = (unsigned long)vaddr;
> 316 if (pool->type & IS_HUGE)
> 317 d_page->vaddr |= VADDR_FLAG_HUGE_POOL;
> 318 } else {
> 319 kfree(d_page);
> 320 d_page = NULL;
> 321 }
> 322 return d_page;
> 323 }
>
> regards,
> dan carpenter
* Re: [bug report] drm/ttm: add transparent huge page support for DMA allocations v2
[not found] ` <20190627165532.GD10652@lst.de>
@ 2019-06-27 17:12 ` Koenig, Christian
[not found] ` <20190627171536.GA11117@lst.de>
0 siblings, 1 reply; 7+ messages in thread
From: Koenig, Christian @ 2019-06-27 17:12 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: dri-devel, Dan Carpenter
On 27.06.19 at 18:55, Christoph Hellwig wrote:
> On Thu, Jun 27, 2019 at 04:19:14PM +0000, Koenig, Christian wrote:
>> If not then the real question is how to correctly get the backing page
>> from dma_alloc_attrs().
> The answer to that is that you can't. dma_alloc_* is an opaque
> allocator that only gives you a kernel virtual address and a dma_addr_t,
> and you can't poke into the internals, as there are a variety of
> different implementations.
I thought you would say that. It basically confirms my suspicion that
the whole TTM page allocation code is not really working that well.
How do we then do things like mapping that memory to userspace?
And how do we control caching and write-combine/write-back for that memory?
Christian.
* Re: [bug report] drm/ttm: add transparent huge page support for DMA allocations v2
[not found] ` <20190627171536.GA11117@lst.de>
@ 2019-06-27 17:30 ` Koenig, Christian
[not found] ` <20190628073935.GA29114@lst.de>
0 siblings, 1 reply; 7+ messages in thread
From: Koenig, Christian @ 2019-06-27 17:30 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: dri-devel, Dan Carpenter
On 27.06.19 at 19:15, Christoph Hellwig wrote:
> On Thu, Jun 27, 2019 at 05:12:47PM +0000, Koenig, Christian wrote:
>> the whole TTM page allocation code is not really working that well.
>>
>> How do we then do things like mapping that memory to userspace?
> dma_mmap_attrs with the same flags as passed to dma_alloc_attrs
We need a way to map only a fraction of a VMA on a page fault. Offhand
I don't see how that's possible with dma_mmap_attrs().
>> And how to we control caching, write combine/back of that memory?
> By using the flags passed to the functions, with the slight caveat
> that a lot of the options are only implemented on some architectures.
>
> Documentation/DMA-attributes.txt documents the available flags.
The problem is that I see quite a bunch of functions which are needed by
GPU drivers and are not implemented in the DMA API.
For example we need to be able to set up uncached mappings, which is not
really supported by the DMA API at the moment.
In addition to that we need a way to force coherent mappings with
dma_map_page() which fails when this isn't guaranteed.
Regards,
Christian.
* Re: [bug report] drm/ttm: add transparent huge page support for DMA allocations v2
[not found] ` <20190628073935.GA29114@lst.de>
@ 2019-06-28 8:05 ` Koenig, Christian
[not found] ` <20190701083117.GA22587@lst.de>
0 siblings, 1 reply; 7+ messages in thread
From: Koenig, Christian @ 2019-06-28 8:05 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: dri-devel, Dan Carpenter
On 28.06.19 at 09:39, Christoph Hellwig wrote:
> On Thu, Jun 27, 2019 at 05:30:00PM +0000, Koenig, Christian wrote:
>> On 27.06.19 at 19:15, Christoph Hellwig wrote:
>>> On Thu, Jun 27, 2019 at 05:12:47PM +0000, Koenig, Christian wrote:
>>>> the whole TTM page allocation code is not really working that well.
>>>>
>>>> How do we then do things like mapping that memory to userspace?
>>> dma_mmap_attrs with the same flags as passed to dma_alloc_attrs
>> We need a way to map only a fraction of a VMA on a page fault. Offhand
>> I don't see how that's possible with dma_mmap_attrs().
> dma_mmap_attrs is intended to be called from ->mmap and sets up all the
> page tables. That being said there is nothing in it that prevents
> you from calling it for parts of a mapping - you just need to increment
> the cpu_addr and dma addr, and reduce the size by the same amount.
>
> I don't see anything obvious why it could not be called from a
> ->fault handler, but I also don't see anything obvious preventing
> us from doing that.
Well, the offset into the VMA at which to start filling is missing, but
apart from that I agree that this should probably work.
>> The problem is that I see quite a bunch of functions which are needed by
>> GPU drivers and are not implemented in the DMA API.
> Well, we can work on that.
>
>> For example we need to be able to set up uncached mappings, which is not
>> really supported by the DMA API at the moment.
> Let's put it another way. Outside of x86 uncached mappings are the
> norm, but there is no way to explicitly request them on architectures
> that are DMA coherent. Adding a DMA_ATTR_UNCACHED would be mostly
> trivial, we just need to define proper semantics for it.
Sounds good. Can you do this? Cause I only know x86 and a few bits of ARM.
>> In addition to that we need a way to force coherent mappings with
>> dma_map_page() which fails when this isn't guaranteed.
> We can't force things to be coherent that weren't allocated specifically
> to be DMA coherent. If you want coherent DMA memory it needs to come
> from dma_alloc_*.
Yeah, but you can return an error instead of using bounce buffers :)
See OpenGL, OpenCL and Vulkan have APIs which allow an application to
give a malloced pointer to the driver and say hey I want to access this
memory coherently from the GPU.
In this situation it is valid to return an error saying sorry that
device can't access that memory coherently, but it's not ok to just map
it non-coherently.
For OpenGL and OpenCL we can still say that the current platform doesn't
support this feature, but that renders a bunch of applications unusable.
For Vulkan it's even worse because it is actually part of the core API
as far as I know (but take this with a grain of salt, I'm not really a
userspace developer).
Regards,
Christian.
* Re: [bug report] drm/ttm: add transparent huge page support for DMA allocations v2
[not found] ` <20190701083117.GA22587@lst.de>
@ 2019-07-15 10:41 ` Koenig, Christian
[not found] ` <20190715125058.GA4384@lst.de>
0 siblings, 1 reply; 7+ messages in thread
From: Koenig, Christian @ 2019-07-15 10:41 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: dri-devel, Dan Carpenter
Hi Christoph,
sorry for the delayed response, I was on a two week vacation.
On 01.07.19 at 10:31, Christoph Hellwig wrote:
> On Fri, Jun 28, 2019 at 08:05:28AM +0000, Koenig, Christian wrote:
> [SNIP]
>>> Let's put it another way. Outside of x86 uncached mappings are the
>>> norm, but there is no way to explicitly request them on architectures
>>> that are DMA coherent. Adding a DMA_ATTR_UNCACHED would be mostly
>>> trivial, we just need to define proper semantics for it.
>> Sounds good. Can you do this? Cause I only know x86 and a few bits of ARM.
> So what semantics do you need? Given that we have some architectures
> that can't set pages as uncached at runtime it either has to be a hint,
> or we could fail it if not supported by implementation. Which one would
> you prefer?
Well first of all I think we need a function which can tell if it's
supported in general on the current architecture.
Then I've asked around a bit and we unfortunately found a few more cases
I didn't know about before where uncached access to system memory is
mandatory. The only good news I have is that the AMD devices needing
that are all integrated into the CPU. So at least for AMD hardware we
can safely assume x86 for those cases.
But because of that I would say we should hard fail if it is not
possible to get some uncached memory.
>>>> In addition to that we need a way to force coherent mappings with
>>>> dma_map_page() which fails when this isn't guaranteed.
>>> We can't force things to be coherent that weren't allocated specifically
>>> to be DMA coherent. If you want coherent DMA memory it needs to come
>>> from dma_alloc_*.
>> Yeah, but you can return an error instead of using bounce buffers :)
>>
>> See OpenGL, OpenCL and Vulkan have APIs which allow an application to
>> give a malloced pointer to the driver and say hey I want to access this
>> memory coherently from the GPU.
>>
>> In this situation it is valid to return an error saying sorry that
>> device can't access that memory coherently, but it's not ok to just map
>> it non-coherently.
>>
>> For OpenGL and OpenCL we can still say that the current platform doesn't
>> support this feature, but that renders a bunch of applications unusable.
>>
>> For Vulkan it's even worse because it is actually part of the core API
>> as far as I know (but take this with a grain of salt, I'm not really a
>> userspace developer).
> We'll have to fail this in many cases then, e.g. all the time when
> the device is not coherent, which is quite frequent, when the device
> doesn't support addressing all physical address space (which seems
> to be reasonably common even for GPUs), or if we are using an iommu
> and the device is external (which would hit eGPUs hard).
Yeah, but as I said failing is perfectly fine for those APIs.
See when you have a hardware platform where a device is not coherent and
use (binary) userspace software which requires it to be coherent, then
something is really broken and you either need to replace the hardware
or the software.
When we return a proper error code we at least give the user a good idea
of what's going wrong.
I mean the only other possible workaround in the kernel I can see is,
instead of trying to map the page backing a certain userspace address,
to change where that userspace address points. You know what I mean?
(It's kind of hard to explain because I'm not a native speaker of
English.) But that approach sounds like a deep rabbit hole to me.
Regards,
Christian.
* Re: [bug report] drm/ttm: add transparent huge page support for DMA allocations v2
[not found] ` <20190715125058.GA4384@lst.de>
@ 2019-07-15 13:38 ` Koenig, Christian
0 siblings, 0 replies; 7+ messages in thread
From: Koenig, Christian @ 2019-07-15 13:38 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: dri-devel, Dan Carpenter
On 15.07.19 at 14:50, Christoph Hellwig wrote:
> On Mon, Jul 15, 2019 at 10:41:14AM +0000, Koenig, Christian wrote:
> [SNIP]
>>>>> that are DMA coherent. Adding a DMA_ATTR_UNCACHED would be mostly
>>>>> trivial, we just need to define proper semantics for it.
>>>> Sounds good. Can you do this? Cause I only know x86 and a few bits of ARM.
>>> So what semantics do you need? Given that we have some architectures
>>> that can't set pages as uncached at runtime it either has to be a hint,
>>> or we could fail it if not supported by implementation. Which one would
>>> you prefer?
>> Well first of all I think we need a function which can tell if it's
>> supported in general on the current architecture.
>>
>> Then I've asked around a bit and we unfortunately found a few more cases
>> I didn't know about before where uncached access to system memory is
>> mandatory. The only good news I have is that the AMD devices needing
>> that are all integrated into the CPU. So at least for AMD hardware we
>> can safely assume x86 for those cases.
>>
>> But because of that I would say we should hard fail if it is not
>> possible to get some uncached memory.
> So I guess the semantics preferred by you is a DMA_ATTR_UNCACHED flag
> that would fail if not supported. That should be relatively easy
> to support. Initially you'd need that on x86 with the direct mapping
> and AMD IOMMU?
Currently I need that for both dma_alloc_attrs() and dma_map_page_attrs().
But I hope to get rid of the uncached use case for dma_map_page_attrs()
and only use dma_alloc_attrs().
So if that makes things easier you can just ignore dma_map_page_attrs()
and we will just continue to use the hack we already have until I manage
to migrate all drivers using TTM away from it.
>> When we return a proper error code we at least give the user a good idea
>> of what's going wrong.
>>
>> I mean the only other possible workaround in the kernel I can see is,
>> instead of trying to map the page backing a certain userspace address,
>> to change where that userspace address points. You know what I mean?
>> (It's kind of hard to explain because I'm not a native speaker of
>> English.) But that approach sounds like a deep rabbit hole to me.
> Isn't that kinda what we are doing for the device private memory
> work in hmm? But it could certainly go down a rathole fast.
Oh, good point. Yeah that is very similar.
Instead of replacing the system memory page with a device memory page,
we would replace one inaccessible system memory page with an accessible
one. But apart from that it is essentially the same functionality.
Regards,
Christian.