* application needs fast access to physical memory
From: steven.lin @ 2010-11-17 22:03 UTC
To: linuxppc-dev; +Cc: Steven_Lin

My application needs a fast way to access a specific physical DDR memory region. The application runs on an MPC8548 PowerPC, which has an MMU. I've tried two approaches that are typical for Linux: mmap(), and a kernel module that implements read()/write() into this region. Performance is very slow for both: a couple of orders of magnitude slower than, for example, copying a large buffer from one place in the application's virtual memory to another.

Steve Lin

* Re: application needs fast access to physical memory
From: Michael Ellerman @ 2010-11-18 12:24 UTC
To: steven.lin; +Cc: Steven_Lin, linuxppc-dev

On Wed, 2010-11-17 at 16:03 -0600, steven.lin@teradyne.com wrote:
> My application needs a fast way to access a specific physical DDR
> memory region. The application runs on an MPC8548 PowerPC which has an
> MMU. I've tried two approaches that are typical for Linux, mmap() and
> using a kernel module that implements read()/write() into this region
> and I'm finding that performance is very slow for both. It's a couple
> orders of magnitude slower than, for example, copying a large buffer
> from one place in the application's virtual memory to another place in
> the application's virtual memory.

The mmap() version should basically run at "full speed", at least once you've faulted the address range in.

This specific DDR region isn't specifically slow is it? :)

cheers

* RE: application needs fast access to physical memory
From: David Laight @ 2010-11-18 12:52 UTC
Cc: linuxppc-dev

> On Wed, 2010-11-17 at 16:03 -0600, steven.lin@teradyne.com wrote:
> > My application needs a fast way to access a specific physical DDR
> > memory region. The application runs on an MPC8548 PowerPC which has an
> > MMU. I've tried two approaches that are typical for Linux, mmap() and
> > using a kernel module that implements read()/write() into this region
> > and I'm finding that performance is very slow for both. It's a couple
> > orders of magnitude slower than, for example, copying a large buffer
> > from one place in the application's virtual memory to another place in
> > the application's virtual memory.
>
> The mmap() version should basically run at "full speed", at least once
> you've faulted the address range in.
>
> This specific DDR region isn't specifically slow is it ? :)

Or being mapped uncached?

David

* Re: application needs fast access to physical memory
From: David Gibson @ 2010-11-18 12:54 UTC
To: Michael Ellerman; +Cc: Steven_Lin, steven.lin, linuxppc-dev

On Thu, Nov 18, 2010 at 11:24:22PM +1100, Michael Ellerman wrote:
> On Wed, 2010-11-17 at 16:03 -0600, steven.lin@teradyne.com wrote:
> > My application needs a fast way to access a specific physical DDR
> > memory region. The application runs on an MPC8548 PowerPC which has an
> > MMU. I've tried two approaches that are typical for Linux, mmap() and
> > using a kernel module that implements read()/write() into this region
> > and I'm finding that performance is very slow for both. It's a couple
> > orders of magnitude slower than, for example, copying a large buffer
> > from one place in the application's virtual memory to another place in
> > the application's virtual memory.
>
> The mmap() version should basically run at "full speed", at least once
> you've faulted the address range in.
>
> This specific DDR region isn't specifically slow is it ? :)

The other theory that springs to mind: is whatever method you're using to access the region enabling caching?

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you. NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: application needs fast access to physical memory
From: steven.lin @ 2010-11-18 16:55 UTC
To: David Gibson, Michael Ellerman; +Cc: linuxppc-dev

Thanks for the replies.

In the Linux Device Drivers book regarding mmap(), it states:

    Mapping a device means associating a range of user-space addresses to device memory. Whenever the program reads or writes in the assigned address range, it is actually accessing the device. In the X server example, using mmap allows quick and easy access to the video card’s memory. For a performance-critical application like this, direct access makes a large difference.

For whatever reason, mmap() is definitely not quick and does not appear to be a direct access to device memory. After the application completes a large write into physical memory (via the pointer returned from mmap()), the application performs an ioctl() to query whether the data actually arrived in the memory region. It seems to take some time before the associated kernel module actually "sees" the data in the physical memory region.

There are a few things I should say about this memory region. There's a total of 512 MB of physical memory. U-Boot passes "mem=256M" as a kernel parameter to tell Linux to only directly manage the lower 256 MB. The special region of physical memory that the application is trying to access is the upper 256 MB of memory not directly managed by Linux.

The mmap() call from the application is:

    *memptr = (void *) mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
                            _fdTerAlloc, (off_t) 0x10000000);

On the kernel module side, the function handling the mmap() file operation is:

    static int ter_alloc_mmap(struct file *pFile, struct vm_area_struct *vma)
    {
            if (remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
                                vma->vm_end - vma->vm_start,
                                vma->vm_page_prot))
                    return -EAGAIN;

            vma->vm_ops = &ter_alloc_remap_vm_ops;
            ter_alloc_vma_open(vma);
            return 0;
    }

-Steve Lin

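
To put a number on the slowdown described in this thread, a small user-space timing helper could be pointed first at an ordinary heap buffer and then at the pointer returned by mmap(). The following is only a sketch under stated assumptions: `copy_mb_per_sec` is a hypothetical name that does not appear in the thread, and the compiler barrier assumes GCC or Clang.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper (not from the thread): copy len bytes from src to
 * dst, reps times, and return the observed throughput in MB/s.
 * Returns 0.0 if the clock cannot be read or no time elapsed.
 */
double copy_mb_per_sec(void *dst, const void *src, size_t len, int reps)
{
    struct timespec t0, t1;

    if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
        return 0.0;

    for (int i = 0; i < reps; i++) {
        memcpy(dst, src, len);
        /* Compiler barrier (GCC/Clang): keeps each copy from being
         * merged with or dropped in favour of the next one. */
        __asm__ volatile("" ::: "memory");
    }

    if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0)
        return 0.0;

    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    if (secs <= 0.0)
        return 0.0;

    return ((double)len * (double)reps) / (1024.0 * 1024.0) / secs;
}
```

Calling it once with a malloc()ed destination and once with `*memptr` as the destination, using the same source buffer and length, would give two figures to compare. A large gap would be consistent with the uncached-mapping theory raised earlier in the thread, though it would not prove it.
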
* Re: application needs fast access to physical memory
From: Scott Wood @ 2010-11-18 19:35 UTC
To: steven.lin; +Cc: linuxppc-dev, David Gibson

On Thu, 18 Nov 2010 10:55:21 -0600 <steven.lin@teradyne.com> wrote:
> Thanks for the replies.
>
> In the Linux Device Drivers book regarding mmap(), it states:
>
> Mapping a device means associating a range of user-space addresses to
> device memory. Whenever the program reads or writes in the assigned
> address range, it is actually accessing the device. In the X server
> example, using mmap allows quick and easy access to the video card’s
> memory. For a performance-critical application like this, direct access
> makes a large difference.
>
> For whatever reason, mmap() is definitely not quick and does not appear to
> be a direct access to device memory. After the application completes a
> large write into physical memory (via the pointer returned from mmap()),
> the application performs an ioctl() to query whether the data actually
> arrived into the memory region. It seems to take some time before the
> associated kernel module actually "sees" the data in the physical memory
> region.
>
> There's a few things I should say about this memory region. There's a total
> of 512 MB of physical memory. U-Boot passes "mem=256M" as a kernel
> parameter to tell Linux to only directly manage the lower 256 MB. The
> special region of physical memory that the application is trying to access
> is the upper 256 MB of memory not directly managed by Linux. The mmap()
> call from the application is:
> *memptr = (void *) mmap( NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
> _fdTerAlloc, (off_t) 0x10000000);

Try this patch:
http://patchwork.ozlabs.org/patch/68246/

-Scott

* Re: application needs fast access to physical memory
From: steven.lin @ 2010-11-18 20:46 UTC
To: Scott Wood; +Cc: steven.lin, linuxppc-dev, David Gibson

Hello Scott,

Do you know whether this patch is necessary if I were to use alloc_bootmem() (to set aside a region of contiguous physical memory) instead of the kernel parameter "mem=256M"?

-Steve Lin

* Re: application needs fast access to physical memory
From: Scott Wood @ 2010-11-18 20:48 UTC
To: steven.lin; +Cc: linuxppc-dev, David Gibson

On Thu, 18 Nov 2010 14:46:16 -0600 <steven.lin@teradyne.com> wrote:
> Hello Scott,
>
> Do you know whether this patch is necessary if I were to use alloc_bootmem
> () (to set aside a region of contiguous physical memory) instead of the
> kernel parameter "mem=256"?

It should not be needed in that case.

-Scott