* Contiguous memory allocations
@ 2010-07-02 20:47 Eric Nelson
  2010-07-05 10:10 ` Chris Simmonds
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Nelson @ 2010-07-02 20:47 UTC (permalink / raw)
  To: video4linux-list

Does anyone know if there's a common infrastructure for allocation
of DMA'able memory by drivers and applications above the straight
kernel API (dma_alloc_coherent)?

I'm working with Freescale i.MX51 drivers to do 720p video
input and output, and the drivers' calls to dma_alloc_coherent
fail unless they are made right after boot, because of fragmentation.

I'm fighting the urge to write yet another special-purpose allocator
for video buffers thinking this must be a common problem with a
solution already, but I can't seem to locate one.

The closest thing I've found is the bigphysarea patch, which doesn't
appear to be supported or headed toward main-line.

Thanks in advance,


Eric Nelson


--
video4linux-list mailing list
Unsubscribe mailto:video4linux-list-request@redhat.com?subject=unsubscribe
https://www.redhat.com/mailman/listinfo/video4linux-list


* Re: Contiguous memory allocations
  2010-07-02 20:47 Contiguous memory allocations Eric Nelson
@ 2010-07-05 10:10 ` Chris Simmonds
  2010-07-05 14:27   ` Eric Nelson
  0 siblings, 1 reply; 5+ messages in thread
From: Chris Simmonds @ 2010-07-05 10:10 UTC (permalink / raw)
  To: video4linux-list

On 02/07/10 21:47, Eric Nelson wrote:
> Does anyone know if there's a common infrastructure for allocation
> of DMA'able memory by drivers and applications above the straight
> kernel API (dma_alloc_coherent)?
>
> I'm working with Freescale i.MX51 drivers to do 720P video
> input and output and the embedded calls to dma_alloc_coherent
> fail except when used right after boot because of fragmentation.
>
> I'm fighting the urge to write yet another special-purpose allocator
> for video buffers thinking this must be a common problem with a
> solution already, but I can't seem to locate one.
>
> The closest thing I've found is the bigphysarea patch, which doesn't
> appear to be supported or headed toward main-line.
>
> Thanks in advance,
>

dma_alloc_coherent is pretty much just a wrapper around get_free_pages,
which is the lowest-level allocator in the kernel. So no, there is no
other option (but see below). The simplest thing is to make sure your
driver is loaded at boot time, grab all the memory you need then,
and never let it go. That's what I do.
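
To see why fragmentation bites here: the page allocator hands out
physically contiguous memory in power-of-two blocks of pages, so one
video frame needs a fairly high-order block. The sketch below works
through the arithmetic for a single 720p frame. The 4 KiB page size and
2 bytes per pixel (a YUV 4:2:2 style format) are assumptions, not figures
from this thread, and frame_bytes() and alloc_order() are illustrative
helper names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for one frame (assumes a packed format, no line padding). */
static size_t frame_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
	return width * height * bytes_per_pixel;
}

/*
 * Smallest order such that (1 << order) pages hold `bytes`.
 * This mirrors the power-of-two rounding a buddy-style page
 * allocator applies to a contiguous request.
 */
static unsigned int alloc_order(size_t bytes, size_t page_size)
{
	size_t pages;
	size_t block = 1;
	unsigned int order = 0;

	assert(page_size > 0);
	pages = (bytes + page_size - 1) / page_size;	/* round up */

	while (block < pages) {
		block <<= 1;
		order++;
	}
	return order;
}
```

Under those assumptions, frame_bytes(1280, 720, 2) is 1,843,200 bytes
(450 pages), and alloc_order() rounds that up to order 9: a run of 512
contiguous free pages, or 2 MiB. Free runs of that size tend to get scarce
once the system has been up for a while.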

If you are desperate, you can use the bigphysarea patch (it's quite
common on streaming video devices), but you will have to port it to your
kernel. Or you can restrict the memory the kernel uses with something
like "mem=128M" on the command line and take the memory above 128M for
yourself. You will have to map it in with ioremap() or ioremap_nocache().

Bye for now,
Chris.

-- 
Chris Simmonds                   2net Limited
chris@2net.co.uk                 http://www.2net.co.uk/


* Re: Contiguous memory allocations
  2010-07-05 10:10 ` Chris Simmonds
@ 2010-07-05 14:27   ` Eric Nelson
  2010-07-05 15:31     ` Chris Simmonds
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Nelson @ 2010-07-05 14:27 UTC (permalink / raw)
  To: chris; +Cc: video4linux-list, Chris Simmonds

On 07/05/2010 03:10 AM, Chris Simmonds wrote:
> On 02/07/10 21:47, Eric Nelson wrote:
>> Does anyone know if there's a common infrastructure for allocation
>> of DMA'able memory by drivers and applications above the straight
>> kernel API (dma_alloc_coherent)?
>>
>> I'm working with Freescale i.MX51 drivers to do 720P video
>> input and output and the embedded calls to dma_alloc_coherent
>> fail except when used right after boot because of fragmentation.
>>
>> I'm fighting the urge to write yet another special-purpose allocator
>> for video buffers thinking this must be a common problem with a
>> solution already, but I can't seem to locate one.
>>
>> The closest thing I've found is the bigphysarea patch, which doesn't
>> appear to be supported or headed toward main-line.
>>
>> Thanks in advance,
>
> dma_alloc_coherent is pretty much just a wrapper round get_free_pages,
> which is the lowest level allocator in the kernel. So, no there is no
> other option (but see below). The simplest thing is to make sure your
> driver is loaded at boot time and to grab all the memory you need then
> and never let it go. That's what I do.
>
Thanks Chris.

The trouble is always "how much?" If we don't know at startup what kind of
video is needed or what size(s) of camera input may be needed, it's impossible
to tune. In the current Freescale kernels, there are at least 4 separate
drivers that allocate RAM, sometimes for internal use, but mostly in response
to userspace calls (ioctl).

	- frame-buffer driver
	- Video Processing Unit (VPU) - video encode/decode
	- V4L2 output device - allows access to YUV output layer, color blending
	- Image Processing Unit (IPU) - allows userspace bitblts through DMA

With this many drivers allocating memory, tuning with separate kernel
command-line args seems unworkable.

> If you are desperate, you can use the bigphysarea patch - it's quite
> common on streaming video devices - but you will have to port it to your
> kernel. Or, you can restrict the memory the kernel uses with something
> like "mem=128M" on the command line and take that above 128M for
> yourself. You will have to map it in with ioremap(_nocache).
>
The bigphysarea patch also seems unlikely ever to make it to main-line.

I wrote a similar driver a few years ago for Davinci processors that had a
more general-purpose allocator specifically for accelerating bitblts.

http://mail.directfb.org/pipermail/directfb-users/2007-November/000123.html

Our current needs for i.MX51 are simpler, since we see fewer and larger
buffer allocations, but the problem is the same: how to reserve and then
allocate physically contiguous RAM.
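
As a rough illustration of the "reserve, then allocate" idea, here is a
minimal first-fit allocator over one reserved physical range that several
drivers could share. It is a userspace sketch of the bookkeeping only, not
code from any of the drivers or patches mentioned in this thread. The names
(contig_pool, pool_init, pool_alloc, pool_free), the 16-allocation limit and
the 4 KiB granularity are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define POOL_MAX_ALLOCS 16	/* hypothetical: few, large buffers */
#define POOL_GRANULE    4096u	/* assumed page size */

struct pool_range {
	uint32_t start;		/* physical address of the allocation */
	uint32_t len;		/* length in bytes, multiple of POOL_GRANULE */
};

struct contig_pool {
	uint32_t base;		/* first reserved physical address */
	uint32_t size;		/* bytes reserved at boot */
	struct pool_range used[POOL_MAX_ALLOCS];	/* sorted by start */
	int count;
};

static void pool_init(struct contig_pool *p, uint32_t base, uint32_t size)
{
	assert(base != 0);	/* 0 is used as the failure value below */
	p->base = base;
	p->size = size;
	p->count = 0;
}

/* First fit: returns a physical address, or 0 if no gap is big enough. */
static uint32_t pool_alloc(struct contig_pool *p, uint32_t len)
{
	uint32_t cursor = p->base;
	int i;

	if (len == 0 || len > p->size || p->count == POOL_MAX_ALLOCS)
		return 0;
	len = (len + POOL_GRANULE - 1) & ~(POOL_GRANULE - 1);

	/* Walk the gaps before, between and after existing allocations. */
	for (i = 0; i <= p->count; i++) {
		uint32_t gap_end = (i < p->count) ? p->used[i].start
						  : p->base + p->size;

		if (gap_end - cursor >= len) {
			memmove(&p->used[i + 1], &p->used[i],
				(size_t)(p->count - i) * sizeof(p->used[0]));
			p->used[i].start = cursor;
			p->used[i].len = len;
			p->count++;
			return cursor;
		}
		if (i < p->count)
			cursor = p->used[i].start + p->used[i].len;
	}
	return 0;
}

/* Returns 0 on success, -1 if addr was not handed out by pool_alloc(). */
static int pool_free(struct contig_pool *p, uint32_t addr)
{
	int i;

	for (i = 0; i < p->count; i++) {
		if (p->used[i].start == addr) {
			memmove(&p->used[i], &p->used[i + 1],
				(size_t)(p->count - i - 1) * sizeof(p->used[0]));
			p->count--;
			return 0;
		}
	}
	return -1;
}
```

Because free space is just "whatever lies between the used ranges", freeing
needs no explicit coalescing. A real in-kernel version would also need
locking, a way to map the memory for CPU access, and a policy for sizing the
reserved range, which is exactly the "how much" question above.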

> Bye for now,
> Chris.
>


* Re: Contiguous memory allocations
  2010-07-05 14:27   ` Eric Nelson
@ 2010-07-05 15:31     ` Chris Simmonds
  2010-07-05 16:14       ` Eric Nelson
  0 siblings, 1 reply; 5+ messages in thread
From: Chris Simmonds @ 2010-07-05 15:31 UTC (permalink / raw)
  To: Eric Nelson; +Cc: video4linux-list

On 05/07/10 15:27, Eric Nelson wrote:
> On 07/05/2010 03:10 AM, Chris Simmonds wrote:
>> On 02/07/10 21:47, Eric Nelson wrote:
>>> Does anyone know if there's a common infrastructure for allocation
>>> of DMA'able memory by drivers and applications above the straight
>>> kernel API (dma_alloc_coherent)?
>>>
>>> I'm working with Freescale i.MX51 drivers to do 720P video
>>> input and output and the embedded calls to dma_alloc_coherent
>>> fail except when used right after boot because of fragmentation.
>>>
>>> I'm fighting the urge to write yet another special-purpose allocator
>>> for video buffers thinking this must be a common problem with a
>>> solution already, but I can't seem to locate one.
>>>
>>> The closest thing I've found is the bigphysarea patch, which doesn't
>>> appear to be supported or headed toward main-line.
>>>
>>> Thanks in advance,
>>
>> dma_alloc_coherent is pretty much just a wrapper round get_free_pages,
>> which is the lowest level allocator in the kernel. So, no there is no
>> other option (but see below). The simplest thing is to make sure your
>> driver is loaded at boot time and to grab all the memory you need then
>> and never let it go. That's what I do.
>>
> Thanks Chris.
>
> The trouble is always "how much"? If we don't know at startup what kind of
> video's needed or what size(s) of camera input may be needed, it's
> impossible
> to tune. In the current Freescale kernels, there are at least 4 separate
> drivers that allocate RAM, sometimes for internal use, but mostly in
> response
> to userspace calls (ioctl).
>
> - frame-buffer driver
> - Video Processing Unit (VPU) - video encode/decode
> - V4L2 output device - allows access to YUV output layer, color blending
> - Image Processing Unit (IPU) - allows userspace bitblts through DMA
>
> With this number of calls, tuning with separate kernel command-line args
> seems
> unworkable.

I think the kernel developers don't like these on-the-side allocators
because they tend to be dedicated to solving one kind of problem.

Here are a few thoughts about the i.MX51 specifically, based on my
experience. First, the size of the memory pool used for
dma_alloc_coherent is set in plat-mxc/include/mach/memory.h, where it is
hard-coded to 64 MiB. You could try bumping that up a bit.

Second, you could redo the buffer allocation, replacing
dma_alloc_coherent with kmalloc, and then use dma_map_single to lock the
buffer down while DMA is taking place. This way you avoid the 64 MiB DMA
pool limit, and you speed up buffer access via mmap because the memory is
cached. In my case I got a twofold speed improvement reading frames
into application memory. I have to admit that my case was a bit
specialised, though, and it may not be worth the effort for you.

Bye for now,
Chris.


-- 
Chris Simmonds                   2net Limited
chris@2net.co.uk                 http://www.2net.co.uk/


* Re: Contiguous memory allocations
  2010-07-05 15:31     ` Chris Simmonds
@ 2010-07-05 16:14       ` Eric Nelson
  0 siblings, 0 replies; 5+ messages in thread
From: Eric Nelson @ 2010-07-05 16:14 UTC (permalink / raw)
  To: chris; +Cc: video4linux-list, Chris Simmonds

On 07/05/2010 08:31 AM, Chris Simmonds wrote:
> On 05/07/10 15:27, Eric Nelson wrote:
>> On 07/05/2010 03:10 AM, Chris Simmonds wrote:
>>> On 02/07/10 21:47, Eric Nelson wrote:
>>>> Does anyone know if there's a common infrastructure for allocation
>>>> of DMA'able memory by drivers and applications above the straight
>>>> kernel API (dma_alloc_coherent)?
>>>>
>>>> I'm working with Freescale i.MX51 drivers to do 720P video
>>>> input and output and the embedded calls to dma_alloc_coherent
>>>> fail except when used right after boot because of fragmentation.
>>>>
>>>> I'm fighting the urge to write yet another special-purpose allocator
>>>> for video buffers thinking this must be a common problem with a
>>>> solution already, but I can't seem to locate one.
>>>>
>>>> The closest thing I've found is the bigphysarea patch, which doesn't
>>>> appear to be supported or headed toward main-line.
>>>>
>>>> Thanks in advance,
>>>
>>> dma_alloc_coherent is pretty much just a wrapper round get_free_pages,
>>> which is the lowest level allocator in the kernel. So, no there is no
>>> other option (but see below). The simplest thing is to make sure your
>>> driver is loaded at boot time and to grab all the memory you need then
>>> and never let it go. That's what I do.
>>>
>> Thanks Chris.
>>
>> The trouble is always "how much"? If we don't know at startup what
>> kind of video's needed or what size(s) of camera input may be needed, it's
>> impossible to tune. In the current Freescale kernels, there are at least
 >> 4 separate drivers that allocate RAM, sometimes for internal use, but
 >> mostly in response to userspace calls (ioctl).
>>
>> - frame-buffer driver
>> - Video Processing Unit (VPU) - video encode/decode
>> - V4L2 output device - allows access to YUV output layer, color blending
>> - Image Processing Unit (IPU) - allows userspace bitblts through DMA
>>
>> With this number of calls, tuning with separate kernel command-line args
>> seems unworkable.
>
> I think the kernel developers don't like this kind of on-the-side
> allocator because they tend to be dedicated to solving one kind of problem.
>

They've certainly rejected bigphysarea. I suspected that there were other
special-purpose allocators embedded in many drivers, but grepping drivers/video
shows only a few (two Freescale drivers and sis).

I suspect this is because much is being done in userspace, à la DirectFB.

> Here are a few thoughts about the imx51 specifically, based on my
> experience. First, the size of the memory pool used for dma_alloc_coherent
 > is set in plat-mxc/include/mach/memory.h where it is hard coded to
 > 64 MiB. You could try bumping that up a bit.
>
I did that, and it did help. The latest Freescale kernel patches do that as
well, bumping it to 96M.

The problem still exists, though, especially under Ubuntu 10.04 if the
compcache (ramzswap) module is loaded.

> Second, you could re-do the buffer allocation and replace
> dma_alloc_coherent with kmalloc and then use dma_map_single to lock it
> down while dma is taking place. This way you avoid the 64M dma pool
> limit and you speed up buffer access via mmap because the memory is
> cached. In my case I got a two fold speed improvement reading frames
> into application memory. I have to admit that my case was a bit
> specialised though and it may not be worth the effort for you.
>
> Bye for now,
> Chris.
>
>
Thanks again for the feedback, Chris.

Regards,


Eric

