* extra large DMA buffer for PCI-E device under UIO
@ 2011-11-18 21:16 Jean-Francois Dagenais
  2011-11-18 22:08 ` Greg KH
  2011-11-18 22:27 ` Hans J. Koch
  0 siblings, 2 replies; 24+ messages in thread
From: Jean-Francois Dagenais @ 2011-11-18 21:16 UTC (permalink / raw)
  To: hjk, gregkh, tglx; +Cc: linux-pci, open list

Hello fellow hackers.

I am maintaining a UIO based driver for a PCI-E data acquisition device.

I map BAR0 of the device to userspace. I also map two memory areas, one is used to feed instructions to the acquisition device, the other is used autonomously by the PCI device to write the acquired data.

The strategy we have been using for those two shared memory areas was historically pci_alloc_coherent on v2.6.35 x86_64 (limited to 4MB based on my trials). Later, I made use of VT-d (intel_iommu) to allocate as much as 128MB (an arbitrary limit) which appears contiguous to the PCI device: I use vmalloc_user to allocate 128M, then write all the physically contiguous segments into a scatterlist, then use pci_map_sg, which works its way down to intel_iommu. The device DMA addresses I get back are contiguous over the whole 128M. Neat! Our VT-d capable devices still use this strategy.

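A minimal sketch of that recipe, as it would look on a 2.6.3x kernel (function and variable names are illustrative, not from the actual driver):

#include <linux/pci.h>
#include <linux/scatterlist.h>
#include <linux/slab.h>
#include <linux/vmalloc.h>

/*
 * Allocate a large user-mappable buffer and map it for DMA.  With
 * intel_iommu active, pci_map_sg() coalesces the per-page entries so
 * the device sees one contiguous bus-address range.  The caller must
 * keep sgl around for the eventual pci_unmap_sg().
 */
static void *alloc_large_dma_buf(struct pci_dev *pdev, unsigned long size,
                                 struct scatterlist **sglp, dma_addr_t *dma_base)
{
        unsigned long i, nr_pages = size >> PAGE_SHIFT;
        struct scatterlist *sgl;
        void *buf;

        buf = vmalloc_user(size);       /* page-aligned, zeroed, mmap-able */
        if (!buf)
                return NULL;

        sgl = kcalloc(nr_pages, sizeof(*sgl), GFP_KERNEL);
        if (!sgl) {
                vfree(buf);
                return NULL;
        }
        sg_init_table(sgl, nr_pages);

        /* one scatterlist entry per page of the vmalloc area */
        for (i = 0; i < nr_pages; i++)
                sg_set_page(&sgl[i], vmalloc_to_page(buf + i * PAGE_SIZE),
                            PAGE_SIZE, 0);

        if (!pci_map_sg(pdev, sgl, nr_pages, PCI_DMA_FROMDEVICE)) {
                kfree(sgl);
                vfree(buf);
                return NULL;
        }

        *sglp = sgl;
        *dma_base = sg_dma_address(&sgl[0]);    /* contiguous behind VT-d */
        return buf;
}
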
This large memory is mission-critical in making the acquisition device autonomous (real-time) while keeping the DMA implementation very simple. Today, we are re-using this device on a CPU architecture that has no IOMMU (Intel E6XX/EG20T) and want to avoid creating a scatter-gather scheme between my driver and the FPGA (PCI device).

So I went back to the old pci_alloc_coherent method, which, although limited to 4 MB, will do for the early development phases. Instead of 2.6.35, we are doing preliminary development on 2.6.37 and will probably use 3.1 or later.  The cpu/device shared memory maps (1MB and 4MB) are allocated using pci_alloc_coherent and handed to UIO as physical memory using the dma_addr_t returned by the pci_alloc func.

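The UIO side of this is small; a hedged sketch (names are illustrative; the 4MB ceiling is consistent with the buddy allocator's MAX_ORDER limit, 2^(11-1) pages x 4KB = 4MB, though the thread never confirms that):

#include <linux/pci.h>
#include <linux/uio_driver.h>

#define DATA_BUF_SIZE   (4 * 1024 * 1024)

static int setup_data_map(struct pci_dev *pdev, struct uio_info *info)
{
        dma_addr_t dma_handle;
        void *cpu_addr;

        cpu_addr = pci_alloc_coherent(pdev, DATA_BUF_SIZE, &dma_handle);
        if (!cpu_addr)
                return -ENOMEM;

        /* second map (mem[0] would be the 1MB instruction area) */
        info->mem[1].addr = dma_handle;         /* bus == phys, no IOMMU */
        info->mem[1].size = DATA_BUF_SIZE;
        info->mem[1].memtype = UIO_MEM_PHYS;    /* mmap()ed as physical RAM */
        info->mem[1].internal_addr = cpu_addr;
        return 0;
}
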
The 1st memory map is written by the CPU and read by the device.
The 2nd memory map is typically written by the device and read by the CPU, but future features may have the device also read this memory.

My initial testing on the Atom E6XX shows the PCI device failing when trying to read from the first memory map. I suspect PCI-E payload sizes, which may be somewhat hardcoded in the FPGA firmware... we will confirm this soon.

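One cheap way to test the payload-size suspicion from the host side is to dump the negotiated Max_Payload_Size out of the device's PCIe capability (lspci -vv shows the same thing under DevCtl); a sketch, with dump_mps() being illustrative:

#include <linux/pci.h>

static void dump_mps(struct pci_dev *pdev)
{
        int pos = pci_find_capability(pdev, PCI_CAP_ID_EXP);
        u16 devctl;

        if (!pos)
                return;
        pci_read_config_word(pdev, pos + PCI_EXP_DEVCTL, &devctl);
        /* MPS field is bits 7:5 of Device Control: 128 << value bytes */
        dev_info(&pdev->dev, "MPS = %d bytes\n",
                 128 << ((devctl & PCI_EXP_DEVCTL_PAYLOAD) >> 5));
}

The FPGA-side value would still have to be checked in the firmware itself.
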
Now, from the get-go I have felt lucky to have made this work at all, given my limited research into the intricacies of the kernel's memory management. So I ask two things:

- Is this kosher?
- Is there a better/easier/safer way to achieve this? (Remember that for the second map, the more memory I have, the better. We have a gig of RAM; if I take, say, 256MB, that would be OK too.)

I had thought about cutting out a chunk of RAM via the kernel's boot args, but had always feared cache/snooping errors. Not to mention I had no idea how to "claim" or set up this memory once in my driver's probe function. Maybe I would still be lucky and it would just work? mmmh...

Thanks for the help!!
/jfd



* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-18 21:16 extra large DMA buffer for PCI-E device under UIO Jean-Francois Dagenais
@ 2011-11-18 22:08 ` Greg KH
  2011-11-21 15:31   ` Jean-Francois Dagenais
  2011-11-22 19:57   ` Jean-Francois Dagenais
  2011-11-18 22:27 ` Hans J. Koch
  1 sibling, 2 replies; 24+ messages in thread
From: Greg KH @ 2011-11-18 22:08 UTC (permalink / raw)
  To: Jean-Francois Dagenais; +Cc: hjk, tglx, linux-pci, open list

On Fri, Nov 18, 2011 at 04:16:23PM -0500, Jean-Francois Dagenais wrote:
> Hello fellow hackers.
> 
> I am maintaining a UIO based driver for a PCI-E data acquisition device.
> 
> I map BAR0 of the device to userspace. I also map two memory areas,
> one is used to feed instructions to the acquisition device, the other
> is used autonomously by the PCI device to write the acquired data.

Nice, have a pointer to your driver anywhere so we can include it in the
main kernel tree to make your life easier?

> The strategy we have been using for those two shared memory areas was
> historically pci_alloc_coherent on v2.6.35 x86_64 (limited to 4MB
> based on my trials). Later, I made use of VT-d (intel_iommu) to
> allocate as much as 128MB (an arbitrary limit) which appears
> contiguous to the PCI device: I use vmalloc_user to allocate 128M,
> then write all the physically contiguous segments into a scatterlist,
> then use pci_map_sg, which works its way down to intel_iommu. The
> device DMA addresses I get back are contiguous over the whole 128M.
> Neat! Our VT-d capable devices still use this strategy.
> 
> This large memory is mission-critical in making the acquisition
> device autonomous (real-time) while keeping the DMA implementation
> very simple. Today, we are re-using this device on a CPU architecture
> that has no IOMMU (Intel E6XX/EG20T) and want to avoid creating a
> scatter-gather scheme between my driver and the FPGA (PCI device).
> 
> So I went back to the old pci_alloc_coherent method, which, although
> limited to 4 MB, will do for the early development phases. Instead of
> 2.6.35, we are doing preliminary development on 2.6.37 and will
> probably use 3.1 or later.  The cpu/device shared memory maps
> (1MB and 4MB) are allocated using pci_alloc_coherent and handed to
> UIO as physical memory using the dma_addr_t returned by the
> pci_alloc func.
> 
> The 1st memory map is written by the CPU and read by the device.
> The 2nd memory map is typically written by the device and read by
> the CPU, but future features may have the device also read this
> memory.
> 
> My initial testing on the Atom E6XX shows the PCI device failing
> when trying to read from the first memory map. I suspect PCI-E
> payload sizes, which may be somewhat hardcoded in the FPGA
> firmware... we will confirm this soon.

That would be good to find out.

> Now, from the get-go I have felt lucky to have made this work at all,
> given my limited research into the intricacies of the kernel's memory
> management. So I ask two things:
> 
> - Is this kosher?

I think so, yes, but others who know the DMA subsystem better than I
should chime in here, as I might be totally wrong.

> - Is there a better/easier/safer way to achieve this? (Remember that
> for the second map, the more memory I have, the better. We have a gig
> of RAM; if I take, say, 256MB, that would be OK too.)
> 
> I had thought about cutting out a chunk of RAM via the kernel's boot
> args, but had always feared cache/snooping errors. Not to mention I
> had no idea how to "claim" or set up this memory once in my driver's
> probe function. Maybe I would still be lucky and it would just work?
> mmmh...

Yeah, don't do that, it might not work out well.

greg k-h


* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-18 21:16 extra large DMA buffer for PCI-E device under UIO Jean-Francois Dagenais
  2011-11-18 22:08 ` Greg KH
@ 2011-11-18 22:27 ` Hans J. Koch
  2011-11-21 15:10   ` Jean-Francois Dagenais
  1 sibling, 1 reply; 24+ messages in thread
From: Hans J. Koch @ 2011-11-18 22:27 UTC (permalink / raw)
  To: Jean-Francois Dagenais; +Cc: hjk, gregkh, tglx, linux-pci, open list

On Fri, Nov 18, 2011 at 04:16:23PM -0500, Jean-Francois Dagenais wrote:
> Hello fellow hackers.

Hi. Could you please limit the line length of your mails to something less
than 80 chars?

> 
> I am maintaining a UIO based driver for a PCI-E data acquisition device.

Can you post it? No point in discussing non-existent code...

> 
> I map BAR0 of the device to userspace. I also map two memory areas, one is used to feed instructions to the acquisition device, the other is used autonomously by the PCI device to write the acquired data.
> 
> The strategy we have been using for those two shared memory areas was historically pci_alloc_coherent on v2.6.35 x86_64 (limited to 4MB based on my trials). Later, I made use of VT-d (intel_iommu) to allocate as much as 128MB (an arbitrary limit) which appears contiguous to the PCI device: I use vmalloc_user to allocate 128M, then write all the physically contiguous segments into a scatterlist, then use pci_map_sg, which works its way down to intel_iommu. The device DMA addresses I get back are contiguous over the whole 128M. Neat! Our VT-d capable devices still use this strategy.
> 
> This large memory is mission-critical in making the acquisition device autonomous (real-time) while keeping the DMA implementation very simple. Today, we are re-using this device on a CPU architecture that has no IOMMU (Intel E6XX/EG20T) and want to avoid creating a scatter-gather scheme between my driver and the FPGA (PCI device).
> 
> So I went back to the old pci_alloc_coherent method, which, although limited to 4 MB, will do for the early development phases. Instead of 2.6.35, we are doing preliminary development on 2.6.37 and will probably use 3.1 or later.  The cpu/device shared memory maps (1MB and 4MB) are allocated using pci_alloc_coherent and handed to UIO as physical memory using the dma_addr_t returned by the pci_alloc func.
> 
> The 1st memory map is written by the CPU and read by the device.
> The 2nd memory map is typically written by the device and read by the CPU, but future features may have the device also read this memory.
> 
> My initial testing on the Atom E6XX shows the PCI device failing when trying to read from the first memory map.

Any kernel messages in the logs that could help?

[...]

Thanks,
Hans


* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-18 22:27 ` Hans J. Koch
@ 2011-11-21 15:10   ` Jean-Francois Dagenais
  2011-11-21 15:47     ` Rolf Eike Beer
  0 siblings, 1 reply; 24+ messages in thread
From: Jean-Francois Dagenais @ 2011-11-21 15:10 UTC (permalink / raw)
  To: Hans J. Koch; +Cc: gregkh, tglx, linux-pci, open list


On Nov 18, 2011, at 17:27, Hans J. Koch wrote:

> On Fri, Nov 18, 2011 at 04:16:23PM -0500, Jean-Francois Dagenais wrote:
>> Hello fellow hackers.
> 
> Hi. Could you please limit the line length of your mails to something less
> than 80 chars?
hehe, I don't think I have ever managed the line length of regular talk in mails I have sent.
I read and write from mail clients that line-wrap for me (Mac Mail right now, please don't
judge me... I still make contributions to the kernel!! :)
> 
>> 
>> I am maintaining a UIO based driver for a PCI-E data acquisition device.
> 
> Can you post it? No point in discussing non-existent code...
Well, the code does exist, but the driver drives a PCI device which is only found in
a product we sell. The PCI ID we use is not registered, and except for this driver, which
is a non-driver really (UIO), the FPGA firmware and the userspace code are proprietary.

I have no problem sharing the code that runs in the kernel and will send a patch for you to
review, but contrary to other contributions I make to w1 or i2c device drivers, I never expect
this code to make it into mainline. For this reason, as well as the fact that it is my very first
kernel code project, it is quite non-conforming to the kernel standards in many respects
(line length, symbol names, etc.)
> 
>> 
>> I map BAR0 of the device to userspace. I also map two memory areas, one is used to feed instructions to the acquisition device, the other is used autonomously by the PCI device to write the acquired data.
>> 
>> The strategy we have been using for those two shared memory areas was historically pci_alloc_coherent on v2.6.35 x86_64 (limited to 4MB based on my trials). Later, I made use of VT-d (intel_iommu) to allocate as much as 128MB (an arbitrary limit) which appears contiguous to the PCI device: I use vmalloc_user to allocate 128M, then write all the physically contiguous segments into a scatterlist, then use pci_map_sg, which works its way down to intel_iommu. The device DMA addresses I get back are contiguous over the whole 128M. Neat! Our VT-d capable devices still use this strategy.
>> 
>> This large memory is mission-critical in making the acquisition device autonomous (real-time) while keeping the DMA implementation very simple. Today, we are re-using this device on a CPU architecture that has no IOMMU (Intel E6XX/EG20T) and want to avoid creating a scatter-gather scheme between my driver and the FPGA (PCI device).
>> 
>> So I went back to the old pci_alloc_coherent method, which, although limited to 4 MB, will do for the early development phases. Instead of 2.6.35, we are doing preliminary development on 2.6.37 and will probably use 3.1 or later.  The cpu/device shared memory maps (1MB and 4MB) are allocated using pci_alloc_coherent and handed to UIO as physical memory using the dma_addr_t returned by the pci_alloc func.
>> 
>> The 1st memory map is written by the CPU and read by the device.
>> The 2nd memory map is typically written by the device and read by the CPU, but future features may have the device also read this memory.
>> 
>> My initial testing on the Atom E6XX shows the PCI device failing when trying to read from the first memory map.
> 
> Any kernel messages in the logs that could help?
My FPGA engineer is currently instrumenting the firmware to see what is happening.

I guess even when we figure out why the FPGA cannot read the system RAM, I am still stuck with the small 4MB buffer...
any thoughts on how to get much more without the use of a device-side IOMMU?
> 
> [...]
> 
> Thanks,
> Hans
Thanks for your help!
(hope my manual line length management helped you! ;)
/jfd


* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-18 22:08 ` Greg KH
@ 2011-11-21 15:31   ` Jean-Francois Dagenais
  2011-11-21 17:36     ` Greg KH
  2011-11-22 19:57   ` Jean-Francois Dagenais
  1 sibling, 1 reply; 24+ messages in thread
From: Jean-Francois Dagenais @ 2011-11-21 15:31 UTC (permalink / raw)
  To: Greg KH; +Cc: hjk, tglx, linux-pci, open list

Hi Greg, thanks for your answer...

On Nov 18, 2011, at 17:08, Greg KH wrote:

> On Fri, Nov 18, 2011 at 04:16:23PM -0500, Jean-Francois Dagenais wrote:
>> Hello fellow hackers.
>> 
>> I am maintaining a UIO based driver for a PCI-E data acquisition device.
>> 
>> I map BAR0 of the device to userspace. I also map two memory areas,
>> one is used to feed instructions to the acquisition device, the other
>> is used autonomously by the PCI device to write the acquired data.
> 
> Nice, have a pointer to your driver anywhere so we can include it in the
> main kernel tree to make your life easier?
As I said in a parallel answer to "Hans J. Koch" <hjk@hansjkoch.de>,
the driver, although GPL'ed, is quite uninteresting except for us here at
Sonatest. (BTW, sorry, I use my gmail for my kernel dealings because it works better
than the company's exchange server. You can look up my name in the mainline
or in the blackfin-device-devel archives and find the two e-mails though;
I have fixed my kernel git config to be gmail-only now.)

About merging the driver to mainline, I guess it would only be interesting for
the recipe I demonstrate. Please advise.
> 
>> The strategy we have been using for those two shared memory areas was
>> historically pci_alloc_coherent on v2.6.35 x86_64 (limited to 4MB
>> based on my trials). Later, I made use of VT-d (intel_iommu) to
>> allocate as much as 128MB (an arbitrary limit) which appears
>> contiguous to the PCI device: I use vmalloc_user to allocate 128M,
>> then write all the physically contiguous segments into a scatterlist,
>> then use pci_map_sg, which works its way down to intel_iommu. The
>> device DMA addresses I get back are contiguous over the whole 128M.
>> Neat! Our VT-d capable devices still use this strategy.
>> 
>> This large memory is mission-critical in making the acquisition
>> device autonomous (real-time) while keeping the DMA implementation
>> very simple. Today, we are re-using this device on a CPU architecture
>> that has no IOMMU (Intel E6XX/EG20T) and want to avoid creating a
>> scatter-gather scheme between my driver and the FPGA (PCI device).
>> 
>> So I went back to the old pci_alloc_coherent method, which, although
>> limited to 4 MB, will do for the early development phases. Instead of
>> 2.6.35, we are doing preliminary development on 2.6.37 and will
>> probably use 3.1 or later.  The cpu/device shared memory maps
>> (1MB and 4MB) are allocated using pci_alloc_coherent and handed to
>> UIO as physical memory using the dma_addr_t returned by the
>> pci_alloc func.
>> 
>> The 1st memory map is written by the CPU and read by the device.
>> The 2nd memory map is typically written by the device and read by
>> the CPU, but future features may have the device also read this
>> memory.
>> 
>> My initial testing on the Atom E6XX shows the PCI device failing
>> when trying to read from the first memory map. I suspect PCI-E
>> payload sizes, which may be somewhat hardcoded in the FPGA
>> firmware... we will confirm this soon.
> 
> That would be good to find out.
> 
>> Now, from the get-go I have felt lucky to have made this work at all,
>> given my limited research into the intricacies of the kernel's memory
>> management. So I ask two things:
>> 
>> - Is this kosher?
> 
> I think so, yes, but others who know the DMA subsystem better than I
> should chime in here, as I might be totally wrong.
> 
>> - Is there a better/easier/safer way to achieve this? (Remember that
>> for the second map, the more memory I have, the better. We have a gig
>> of RAM; if I take, say, 256MB, that would be OK too.)
>> 
>> I had thought about cutting out a chunk of RAM via the kernel's boot
>> args, but had always feared cache/snooping errors. Not to mention I
>> had no idea how to "claim" or set up this memory once in my driver's
>> probe function. Maybe I would still be lucky and it would just work?
>> mmmh...
> 
> Yeah, don't do that, it might not work out well.
Other ways, anyone? 4MB is really not enough for me in the end. Maybe I could trace the
dma_alloc_coherent call, figure out where the limit is, and patch the kernel for my needs?
> 
> greg k-h
Thanks for your precious time. Cheers.
/jfd



* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-21 15:10   ` Jean-Francois Dagenais
@ 2011-11-21 15:47     ` Rolf Eike Beer
  2011-11-21 16:01       ` Jean-Francois Dagenais
  0 siblings, 1 reply; 24+ messages in thread
From: Rolf Eike Beer @ 2011-11-21 15:47 UTC (permalink / raw)
  To: Jean-Francois Dagenais; +Cc: Hans J. Koch, gregkh, tglx, linux-pci, open list

> The pci ID we use is not registered, and except for this driver, which
> is a non-driver really (UIO), the FPGA firmware and the userspace code is
> proprietary.

Why do people keep doing this? This is just asking for future trouble. And it
gives bad karma.

Go and ask your FPGA vendor to assign a device id to you. At least for
Xilinx this has worked.

Eike


* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-21 15:47     ` Rolf Eike Beer
@ 2011-11-21 16:01       ` Jean-Francois Dagenais
  0 siblings, 0 replies; 24+ messages in thread
From: Jean-Francois Dagenais @ 2011-11-21 16:01 UTC (permalink / raw)
  To: Rolf Eike Beer
  Cc: Hans J. Koch, Greg KH, tglx, linux-pci, open list, Simon Goyette


On Nov 21, 2011, at 10:47, Rolf Eike Beer wrote:

>> The pci ID we use is not registered, and except for this driver, which
>> is a non-driver really (UIO), the FPGA firmware and the userspace code is
>> proprietary.
> 
> Why keep people doing this? This is just asking for future trouble. And it
> gives bad karma.
This is a bit off topic...

Still, to address it:
... because when developing a complex product with limited staff, you prioritize.

Unless I am mistaken, my use of PCI IDs is to match our HW device with our
driver (which is not yet meant to be merged to mainline) in our completely controlled
system (hw, kernel, software) made into a product. So we didn't feel it necessary
to assign guaranteed-unique IDs. If there are any ID conflicts one day, I assume
it would be because of a new kernel version we put in, or a new hardware
device we put in, and in any case, a new release of the product, which is
thoroughly tested before release. Furthermore, in the unlikely event we do
encounter a collision, we can change both the FPGA and the driver on a dime.

Am I missing something? What kind of future trouble are you referring to?
> 
> Go and ask your FPGA vendor to assign a device id to you. At least for
> Xilinx this has worked.
Thanks for the tip, we could look into this if we need to.
> 
> Eike



* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-21 15:31   ` Jean-Francois Dagenais
@ 2011-11-21 17:36     ` Greg KH
  2011-11-21 18:17       ` Hans J. Koch
  0 siblings, 1 reply; 24+ messages in thread
From: Greg KH @ 2011-11-21 17:36 UTC (permalink / raw)
  To: Jean-Francois Dagenais; +Cc: hjk, tglx, linux-pci, open list

On Mon, Nov 21, 2011 at 10:31:07AM -0500, Jean-Francois Dagenais wrote:
> Hi Greg, thanks for your answer...
> 
> On Nov 18, 2011, at 17:08, Greg KH wrote:
> 
> > On Fri, Nov 18, 2011 at 04:16:23PM -0500, Jean-Francois Dagenais wrote:
> >> Hello fellow hackers.
> >> 
> >> I am maintaining a UIO based driver for a PCI-E data acquisition device.
> >> 
> >> I map BAR0 of the device to userspace. I also map two memory areas,
> >> one is used to feed instructions to the acquisition device, the other
> >> is used autonomously by the PCI device to write the acquired data.
> > 
> > Nice, have a pointer to your driver anywhere so we can include it in the
> > main kernel tree to make your life easier?
> As I said in a parallel answer to "Hans J. Koch" <hjk@hansjkoch.de>,
> the driver, although GPL'ed, is quite uninteresting except for us here at
> Sonatest.

I really doubt that, and you should submit it anyway to allow us to
change it when the in-kernel apis change in the future.  It will save
you time in the long run and make things easier for you (look, your
driver is automatically included in all distros!, people fix your bugs,
etc.)

> About merging the driver to mainline, I guess it would only be interesting for
> the recipe I demonstrate. Please advise.

That is a recipe that I'm sure others will use, and need help on in the
future.

So please submit a patch, that will make it easier to help you out.

thanks,

greg k-h


* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-21 17:36     ` Greg KH
@ 2011-11-21 18:17       ` Hans J. Koch
       [not found]         ` <4A52B447-8E21-43F6-A38E-711E36F89A34@gmail.com>
  2011-11-22 15:24         ` Jean-Francois Dagenais
  0 siblings, 2 replies; 24+ messages in thread
From: Hans J. Koch @ 2011-11-21 18:17 UTC (permalink / raw)
  To: Greg KH; +Cc: Jean-Francois Dagenais, hjk, tglx, linux-pci, open list

On Mon, Nov 21, 2011 at 09:36:20AM -0800, Greg KH wrote:
> On Mon, Nov 21, 2011 at 10:31:07AM -0500, Jean-Francois Dagenais wrote:
> > Hi Greg, thanks for your answer...
> > 
> > On Nov 18, 2011, at 17:08, Greg KH wrote:
> > 
> > > On Fri, Nov 18, 2011 at 04:16:23PM -0500, Jean-Francois Dagenais wrote:
> > >> Hello fellow hackers.
> > >> 
> > >> I am maintaining a UIO based driver for a PCI-E data acquisition device.
> > >> 
> > >> I map BAR0 of the device to userspace. I also map two memory areas,
> > >> one is used to feed instructions to the acquisition device, the other
> > >> is used autonomously by the PCI device to write the acquired data.
> > > 
> > > Nice, have a pointer to your driver anywhere so we can include it in the
> > > main kernel tree to make your life easier?
> > As I said in a parallel answer to "Hans J. Koch" <hjk@hansjkoch.de>,
> > the driver, although GPL'ed, is quite uninteresting except for us here at
> > Sonatest.
> 
> I really doubt that,

So do I. We never had a driver allocating so much memory.

> and you should submit it anyway to allow us to
> change it when the in-kernel apis change in the future.  It will save
> you time in the long run and make things easier for you (look, your
> driver is automatically included in all distros!, people fix your bugs,
> etc.)

Exactly.

> 
> > About merging the driver to mainline, I guess it would only be interesting for
> > the recipe I demonstrate. Please advise.
> 
> That is a recipe that I'm sure others will use, and need help on in the
> future.

They already needed it in the past, and they usually try to get it by
writing me private mail.

> 
> So please submit a patch, that will make it easier to help you out.

Yes, please do. The more different drivers we have under /drivers/uio, the
better. Didn't you use one of the existing drivers as a template for yours?

Thanks,
Hans


* Re: extra large DMA buffer for PCI-E device under UIO
       [not found]         ` <4A52B447-8E21-43F6-A38E-711E36F89A34@gmail.com>
@ 2011-11-21 19:29           ` Hans J. Koch
  0 siblings, 0 replies; 24+ messages in thread
From: Hans J. Koch @ 2011-11-21 19:29 UTC (permalink / raw)
  To: Jean-Francois Dagenais; +Cc: Hans J. Koch, Greg KH, tglx, linux-pci, open list

On Mon, Nov 21, 2011 at 01:32:04PM -0500, Jean-Francois Dagenais wrote:
[...]
> >> So please submit a patch, that will make it easier to help you out.
> > 
> > Yes, please do. The more different drivers we have under /drivers/uio, the
> > better. Didn't you use one of the existing drivers as a template for yours?
> Alright. But I'll have to dress it up a bit before I expose it... Kinda like tidying up
> when expecting a visit.
> 
> Hope you will still follow the thread when I put it in reply in a day or two...

Sure ;-)

Thanks,
Hans


* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-21 18:17       ` Hans J. Koch
       [not found]         ` <4A52B447-8E21-43F6-A38E-711E36F89A34@gmail.com>
@ 2011-11-22 15:24         ` Jean-Francois Dagenais
  2011-11-22 15:35           ` Michael S. Tsirkin
  2011-11-22 16:05           ` Hans J. Koch
  1 sibling, 2 replies; 24+ messages in thread
From: Jean-Francois Dagenais @ 2011-11-22 15:24 UTC (permalink / raw)
  To: Hans J. Koch, mst; +Cc: Greg KH, tglx, linux-pci, open list


On Nov 21, 2011, at 13:17, Hans J. Koch wrote:

> On Mon, Nov 21, 2011 at 09:36:20AM -0800, Greg KH wrote:
>> On Mon, Nov 21, 2011 at 10:31:07AM -0500, Jean-Francois Dagenais wrote:
>>> Hi Greg, thanks for your answer...
>>> 
>>> On Nov 18, 2011, at 17:08, Greg KH wrote:
>>> 
>>>> On Fri, Nov 18, 2011 at 04:16:23PM -0500, Jean-Francois Dagenais wrote:
>>>>> Hello fellow hackers.
>>>>> 
>>>>> I am maintaining a UIO based driver for a PCI-E data acquisition device.
>>>>> 
>>>>> I map BAR0 of the device to userspace. I also map two memory areas,
>>>>> one is used to feed instructions to the acquisition device, the other
>>>>> is used autonomously by the PCI device to write the acquired data.
>>>> 
>>>> Nice, have a pointer to your driver anywhere so we can include it in the
>>>> main kernel tree to make your life easier?
>>> As I said in a parallel answer to "Hans J. Koch" <hjk@hansjkoch.de>,
>>> the driver, although GPL'ed, is quite uninteresting except for us here at
>>> Sonatest.
>> 
>> I really doubt that,
> 
> So do I. We never had a driver allocating so much memory.
> 
>> and you should submit it anyway to allow us to
>> change it when the in-kernel apis change in the future.  It will save
>> you time in the long run and make things easier for you (look, your
>> driver is automatically included in all distros!, people fix your bugs,
>> etc.)
> 
> Exactly.
> 
>> 
>>> About merging the driver to mainline, I guess it would only be interesting for
>>> the recipe I demonstrate. Please advise.
>> 
>> That is a recipe that I'm sure others will use, and need help on in the
>> future.
> 
> They already needed it in the past, and they usually try to get it by
> writing me private mail.
> 
>> 
>> So please submit a patch, that will make it easier to help you out.
> 
> Yes, please do. The more different drivers we have under /drivers/uio, the
> better. Didn't you use one of the existing drivers as a template for yours?
Of course, and I am making contributions to the kernel as well (ds1wm, w1_ds2408,
ad714x, and more to-be-merged contributions on the blackfin list) because I strongly
believe in the community aspect of Linux.

So in the spirit of making the driver more generic, I would like to make this patch
something along the lines of a generic UIO/PCI based large-DMA acquisition device
driver. Or maybe even complement the existing uio_pci_generic.c?

The problem is that there are device-specific aspects that the "generic" driver would
need to take into account, e.g. whether to map BARx or not, or, in our case, the MFDs
embedded in the firmware (Xilinx's ds1wm core and, soon, Xilinx's SPI core). Furthermore,
I want the FPGA to be an irq expander, since the cores generate interrupts and a
couple of balls on the FPGA are irq signals from external i2c chips.

I don't yet see any way to specify something like a setup callback function that could reach a
platform module when uio_pci_generic is probing.

I am thinking this through as I write here...

My other persona (C++ programmer) suggests that conceptually, uio_pci_generic is
a "base class" and the other, more firmware-specific items would live in a derived module.
In that sense, maybe uio_pci_generic could export its symbols, so it can be used as
UIO core functionality?

So I would still have a module which would contain the specific MFD registration and IRQ
functionality, but the BARx and large DMA mapping would reside in uio_pci_generic...

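To make the layering concrete, a purely hypothetical sketch (mainline uio_pci_generic exports no symbols today; uio_pci_generic_setup() and my_fpga_probe() are invented for illustration):

#include <linux/module.h>
#include <linux/pci.h>

/* assumed export from a modified uio_pci_generic.c -- does not exist */
extern int uio_pci_generic_setup(struct pci_dev *pdev);

static int my_fpga_probe(struct pci_dev *pdev,
                         const struct pci_device_id *id)
{
        int ret;

        /* "base class": BAR and DMA maps handled by uio_pci_generic */
        ret = uio_pci_generic_setup(pdev);
        if (ret)
                return ret;

        /* "derived class": register MFD cells, wire up the irq expander */
        /* ... device-specific code ... */
        return 0;
}
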
Any thoughts?
> 
> Thanks,
> Hans
Cheers,
/jfd


* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-22 15:24         ` Jean-Francois Dagenais
@ 2011-11-22 15:35           ` Michael S. Tsirkin
  2011-11-22 16:54             ` Jean-Francois Dagenais
  2011-11-22 16:05           ` Hans J. Koch
  1 sibling, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2011-11-22 15:35 UTC (permalink / raw)
  To: Jean-Francois Dagenais; +Cc: Hans J. Koch, Greg KH, tglx, linux-pci, open list

On Tue, Nov 22, 2011 at 10:24:17AM -0500, Jean-Francois Dagenais wrote:
> 
> On Nov 21, 2011, at 13:17, Hans J. Koch wrote:
> 
> > On Mon, Nov 21, 2011 at 09:36:20AM -0800, Greg KH wrote:
> >> On Mon, Nov 21, 2011 at 10:31:07AM -0500, Jean-Francois Dagenais wrote:
> >>> Hi Greg, thanks for your answer...
> >>> 
> >>> On Nov 18, 2011, at 17:08, Greg KH wrote:
> >>> 
> >>>> On Fri, Nov 18, 2011 at 04:16:23PM -0500, Jean-Francois Dagenais wrote:
> >>>>> Hello fellow hackers.
> >>>>> 
> >>>>> I am maintaining a UIO based driver for a PCI-E data acquisition device.
> >>>>> 
> >>>>> I map BAR0 of the device to userspace. I also map two memory areas,
> >>>>> one is used to feed instructions to the acquisition device, the other
> >>>>> is used autonomously by the PCI device to write the acquired data.
> >>>> 
> >>>> Nice, have a pointer to your driver anywhere so we can include it in the
> >>>> main kernel tree to make your life easier?
> >>> As I said in a parallel answer to "Hans J. Koch" <hjk@hansjkoch.de>,
> >>> the driver, although GPL'ed, is quite uninteresting except for us here at
> >>> Sonatest.
> >> 
> >> I really doubt that,
> > 
> > So do I. We never had a driver allocating so much memory.
> > 
> >> and you should submit it anyway to allow us to
> >> change it when the in-kernel apis change in the future.  It will save
> >> you time in the long run and make things easier for you (look, your
> >> driver is automatically included in all distros!, people fix your bugs,
> >> etc.)
> > 
> > Exactly.
> > 
> >> 
> >>> About merging the driver to mainline, I guess it would only be interesting for
> >>> the recipe I demonstrate. Please advise.
> >> 
> >> That is a recipe that I'm sure others will use, and need help on in the
> >> future.
> > 
> > They already needed it in the past, and they usually try to get it by
> > writing me private mail.
> > 
> >> 
> >> So please submit a patch, that will make it easier to help you out.
> > 
> > Yes, please do. The more different drivers we have under /drivers/uio, the
> > better. Didn't you use one of the existing drivers as a template for yours?
> Of course, and I am making contributions to the kernel as well (ds1wm, w1_ds2408,
> ad714x, and more to-be-merged contributions on the blackfin list) because I strongly
> believe in the community aspect of Linux.
> 
> So in the spirit of making the driver more generic, I would like to make this patch
> something along the lines of a generic UIO/PCI based large-DMA acquisition device
> driver. Or maybe even complement the existing uio_pci_generic.c?
> 
> The problem is that there are device-specific aspects that the "generic" driver would
> need to take into account, e.g. whether to map BARx or not, or, in our case, the MFDs
> embedded in the firmware (Xilinx's ds1wm core and, soon, Xilinx's SPI core). Furthermore,
> I want the FPGA to be an irq expander, since the cores generate interrupts and a
> couple of balls on the FPGA are irq signals from external i2c chips.
> 
> I don't yet see any way to specify something like a setup callback function that could reach a
> platform module when uio_pci_generic is probing.
> 
> I am thinking this through as I write here...
> 
> My other persona (C++ programmer) suggests that conceptually, uio_pci_generic is
> a "base class" and the other, more firmware-specific items would live in a derived module.
> In that sense, maybe uio_pci_generic could export its symbols, so it can be used as
> UIO core functionality?
> 
> So I would still have a module which would contain the specific MFD registration and IRQ
> functionality, but the BARx and large DMA mapping would reside in uio_pci_generic...
> 
> Any thoughts?
> > 
> > Thanks,
> > Hans
> Cheers,
> /jfd

BARx can be mapped through sysfs, right?
DMA into userspace really needs registration with an iommu and
locking userspace memory. This was discussed in the past but
no patch surfaced. You can copy some bits from 'VFIO' prototypes,
maybe - search for them.

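For reference, the "locking userspace memory" step Michael mentions usually means pinning the user buffer with get_user_pages() before any DMA is programmed; a sketch against the API of that era (pin_user_buffer() is illustrative):

#include <linux/mm.h>
#include <linux/sched.h>

static int pin_user_buffer(unsigned long uaddr, int nr_pages,
                           struct page **pages)
{
        int pinned;

        down_read(&current->mm->mmap_sem);
        pinned = get_user_pages(current, current->mm, uaddr, nr_pages,
                                1 /* write */, 0 /* !force */, pages, NULL);
        up_read(&current->mm->mmap_sem);
        return pinned;  /* pages stay resident until put_page() */
}
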
-- 
MST


* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-22 15:24         ` Jean-Francois Dagenais
  2011-11-22 15:35           ` Michael S. Tsirkin
@ 2011-11-22 16:05           ` Hans J. Koch
  1 sibling, 0 replies; 24+ messages in thread
From: Hans J. Koch @ 2011-11-22 16:05 UTC (permalink / raw)
  To: Jean-Francois Dagenais
  Cc: Hans J. Koch, mst, Greg KH, tglx, linux-pci, open list

On Tue, Nov 22, 2011 at 10:24:17AM -0500, Jean-Francois Dagenais wrote:
[...]
> >> 
> >> So please submit a patch, that will make it easier to help you out.
> > 
> > Yes, please do. The more different drivers we have under /drivers/uio, the
> > better. Didn't you use one of the existing drivers as a template for yours?
> Of course, and I am making contributions to the kernel as well (ds1wm, w1_ds2408,
> ad714x, and more to-be-merged contributions on the blackfin list) because I strongly
> believe in the community aspect of Linux.

I don't doubt that, sorry if I gave that impression. My only concern is that in the
past few years (UIO is in mainline since 2.6.23) less than 1% of the drivers ever
appeared on LKML. That means that, with quite high probability, everybody writing
a new UIO driver will reinvent the wheel. Over the years, I met lots of people
(e.g. at conferences) who wrote a UIO driver for all kinds of devices, even with
different sorts of DMA handling. All of them said "oh, my driver is of no interest
to the public", and never posted it although I strongly encouraged them to do so.

Consequently, everybody has to do the same work again and again, which is just a
waste of rare engineering powers.

> 
> So in the spirit of making the driver more generic, I would like to make this patch
> something along the lines of a generic UIO/PCI based large-DMA acquisition device
> driver. Or maybe even complement the existing uio_pci_generic.c?

Adding DMA requires changes to the UIO core to be something more than a crude
workaround. Probably a new device like /dev/uio_dma0 for a /dev/uio0. Feel free
to make suggestions.

> 
> The problem is that there are device-specific aspects that the "generic" driver would
> need to take into account, e.g. whether to map BARx or not, or, in our case, the MFDs
> embedded in the firmware (Xilinx's ds1wm core and, soon, Xilinx's SPI core). Furthermore,
> I want the FPGA to be an irq expander, since the cores generate interrupts and a
> couple of balls on the FPGA are irq signals from external i2c chips.

That should not be much of a problem, unless the FPGA generates more than one
physical interrupt.

> 
> I don't yet see any way to specify like a setup callback function that could reach a
> platform module when uio_pci_generic is probing.
> 
> I am thinking this through as I write here...
> 
> My other persona (C++ programmer) suggests that conceptually, uio_pci_generic is
> a "base class" and the other, more firmware-specific items would live in a derived module.
> In that sense, maybe uio_pci_generic could export its symbols, so it can be used as
> UIO core functionality?
> 
> So I would still have a module which would contain the specific MFD registration and IRQ
> functionality, but the BARx and large DMA mapping would reside in uio_pci_generic...

If we want more UIO core functionality, we should integrate it into the UIO core,
and not just hack a driver.

Thanks,
Hans



* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-22 15:35           ` Michael S. Tsirkin
@ 2011-11-22 16:54             ` Jean-Francois Dagenais
  2011-11-22 17:27               ` Matthew Wilcox
  2011-11-22 17:37               ` Michael S. Tsirkin
  0 siblings, 2 replies; 24+ messages in thread
From: Jean-Francois Dagenais @ 2011-11-22 16:54 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Hans J. Koch, Greg KH, tglx, linux-pci, open list


On Nov 22, 2011, at 10:35, Michael S. Tsirkin wrote:

> On Tue, Nov 22, 2011 at 10:24:17AM -0500, Jean-Francois Dagenais wrote:
>> 
>> On Nov 21, 2011, at 13:17, Hans J. Koch wrote:
>> 
>>> On Mon, Nov 21, 2011 at 09:36:20AM -0800, Greg KH wrote:
>>>> On Mon, Nov 21, 2011 at 10:31:07AM -0500, Jean-Francois Dagenais wrote:
>>>>> Hi Greg, thanks for your answer...
>>>>> 
>>>>> On Nov 18, 2011, at 17:08, Greg KH wrote:
>>>>> 
>>>>>> On Fri, Nov 18, 2011 at 04:16:23PM -0500, Jean-Francois Dagenais wrote:
>>>>>>> Hello fellow hackers.
>>>>>>> 
>>>>>>> I am maintaining a UIO based driver for a PCI-E data acquisition device.
>>>>>>> 
>>>>>>> I map BAR0 of the device to userspace. I also map two memory areas,
>>>>>>> one is used to feed instructions to the acquisition device, the other
>>>>>>> is used autonomously by the PCI device to write the acquired data.
>>>>>> 
>>>>>> Nice, have a pointer to your driver anywhere so we can include it in the
>>>>>> main kernel tree to make your life easier?
>>>>> As I said in a parallel answer to "Hans J. Koch" <hjk@hansjkoch.de>,
>>>>> the driver, although GPL'ed, is quite uninteresting except for us here at
>>>>> Sonatest.
>>>> 
>>>> I really doubt that,
>>> 
>>> So do I. We never had a driver allocating so much memory.
>>> 
>>>> and you should submit it anyway to allow us to
>>>> change it when the in-kernel apis change in the future.  It will save
>>>> you time in the long run and make things easier for you (look, your
>>>> driver is automatically included in all distros!, people fix your bugs,
>>>> etc.)
>>> 
>>> Exactly.
>>> 
>>>> 
>>>>> About merging the driver to mainline, I guess it would only be interesting for
>>>>> the recipe I demonstrate. Please advise.
>>>> 
>>>> That is a recipe that I'm sure others will use, and need help on in the
>>>> future.
>>> 
>>> They already needed it in the past, and they usually try to get it by
>>> writing me private mail.
>>> 
>>>> 
>>>> So please submit a patch, that will make it easier to help you out.
>>> 
>>> Yes, please do. The more different drivers we have under /drivers/uio, the
>>> better. Didn't you use one of the existing drivers as a template for yours?
>> Of course, and I am making contributions to the kernel as well (ds1wm, w1_ds2408,
>> ad714x, and more to-be-merged contributions on the blackfin list) because I strongly
>> believe in the community aspect of Linux.
>> 
>> So in the spirit of making the driver more generic, I would like to make this patch
>> something along the lines of a generic UIO/PCI based large-DMA acquisition device
>> driver. Or maybe even complement the existing uio_pci_generic.c?
>> 
>> The problem is that there are device-specific aspects that the "generic" driver would
>> need to take into account, e.g. whether to map BARx or not, or, in our case, the MFDs
>> embedded in the firmware (Xilinx's ds1wm core and, soon, Xilinx's SPI core). Furthermore,
>> I want the FPGA to be an irq expander, since the cores generate interrupts and a
>> couple of balls on the FPGA are irq signals from external i2c chips.
>> 
>> I don't yet see any way to specify something like a setup callback function that could reach a
>> platform module when uio_pci_generic is probing.
>> 
>> I am thinking this through as I write here...
>> 
>> My other persona (C++ programmer) suggests that conceptually, uio_pci_generic is
>> a "base class" and the other, more firmware-specific items would live in a derived module.
>> In that sense, maybe uio_pci_generic could export its symbols, so it can be used as
>> UIO core functionality?
>> 
>> So I would still have a module which would contain the specific MFD registration and IRQ
>> functionality, but the BARx and large DMA mapping would reside in uio_pci_generic...
>> 
>> Any thoughts?
>>> 
>>> Thanks,
>>> Hans
>> Cheers,
>> /jfd
> 
> BARx can be mapped through sysfs, right?
> DMA into userspace really needs registration with an iommu and
> locking userspace memory. This was discussed in the past but
> no patch surfaced. You can copy some bits from 'VFIO' prototypes,
> maybe - search for them.
That is quite interesting. It really seems like my VT-d recipe to create 128MB for my PCI-e
FPGA to write into is covered by this patch.

My problem is that our FPGA is connected to one of the Atom E6XX's PCI-e links, so no
IOMMU :( Since our first product had VT-d, the FPGA, the UIO based module and the userspace
code are designed such that the device sees a huge contiguous memory chunk. This is key
to the performance of the FPGA, which is essentially decoupled from the CPU for its real-time
acquisition.

Can VFIO work without an IOMMU? Or am I better off with a UIO solution?

If it can, I know my current UIO based solution is unusable without an IOMMU as well. The
problem I have is that my fallback to the non-IOMMU mapping, pci_alloc_consistent (i.e. dma_alloc_coherent),
will still only succeed at 4MB tops, and that's when the module loads from the init scripts. The
success rate drops rapidly after this moment. Could I get more at arch_init time maybe?

And are there any other thoughts about carving out something like 256MB via the kernel boot args and
initializing it later to use as a userspace-mapped DMA buffer?

> 
> -- 
> MST



* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-22 16:54             ` Jean-Francois Dagenais
@ 2011-11-22 17:27               ` Matthew Wilcox
  2011-11-22 17:40                 ` Michael S. Tsirkin
  2011-11-22 17:37               ` Michael S. Tsirkin
  1 sibling, 1 reply; 24+ messages in thread
From: Matthew Wilcox @ 2011-11-22 17:27 UTC (permalink / raw)
  To: Jean-Francois Dagenais
  Cc: Michael S. Tsirkin, Hans J. Koch, Greg KH, tglx, linux-pci, open list

On Tue, Nov 22, 2011 at 11:54:22AM -0500, Jean-Francois Dagenais wrote:
> That is quite interesting. It really seems like my VT-d recipe to create 128MB for my PCI-e
> FPGA to write into is covered by this patch.
> 
> My problem is that our FPGA is connected to one of the Atom E6XX's PCI-e links, so no
> IOMMU :( Since our first product had VT-d, the FPGA, the UIO based module and the userspace
> code are designed such that the device sees a huge contiguous memory chunk. This is key
> to the performance of the FPGA, which is essentially decoupled from the CPU for its real-time
> acquisition.

Is it really key?  If you supported, ohidon'tknow, 2MB pages, you'd
need 64 entries in the FPGA to store the addresses of those 2MB pages,
which doesn't sound like a huge burden.

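A sketch of what that would look like: 64 coherent 2MB chunks whose bus addresses are programmed into a small table in the FPGA (the BAR0 register offset below is made up; error unwinding is elided):

#include <linux/io.h>
#include <linux/pci.h>

#define CHUNK_SIZE      (2 * 1024 * 1024)
#define NR_CHUNKS       64              /* 64 x 2MB = 128MB */
#define FPGA_PGTBL_OFF  0x100           /* hypothetical register block */

static int program_fpga_pgtbl(struct pci_dev *pdev, void __iomem *bar0,
                              void *cpu[NR_CHUNKS])
{
        dma_addr_t dma;
        int i;

        for (i = 0; i < NR_CHUNKS; i++) {
                cpu[i] = pci_alloc_coherent(pdev, CHUNK_SIZE, &dma);
                if (!cpu[i])
                        return -ENOMEM; /* real code must unwind */
                /* one 64-bit table entry per 2MB chunk */
                writel(lower_32_bits(dma), bar0 + FPGA_PGTBL_OFF + i * 8);
                writel(upper_32_bits(dma), bar0 + FPGA_PGTBL_OFF + i * 8 + 4);
        }
        return 0;
}
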
-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-22 16:54             ` Jean-Francois Dagenais
  2011-11-22 17:27               ` Matthew Wilcox
@ 2011-11-22 17:37               ` Michael S. Tsirkin
  2011-11-22 17:54                 ` Hans J. Koch
  1 sibling, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2011-11-22 17:37 UTC (permalink / raw)
  To: Jean-Francois Dagenais; +Cc: Hans J. Koch, Greg KH, tglx, linux-pci, open list

On Tue, Nov 22, 2011 at 11:54:22AM -0500, Jean-Francois Dagenais wrote:
> 
> On Nov 22, 2011, at 10:35, Michael S. Tsirkin wrote:
> 
> > On Tue, Nov 22, 2011 at 10:24:17AM -0500, Jean-Francois Dagenais wrote:
> >> 
> >> On Nov 21, 2011, at 13:17, Hans J. Koch wrote:
> >> 
> >>> On Mon, Nov 21, 2011 at 09:36:20AM -0800, Greg KH wrote:
> >>>> On Mon, Nov 21, 2011 at 10:31:07AM -0500, Jean-Francois Dagenais wrote:
> >>>>> Hi Greg, thanks for your answer...
> >>>>> 
> >>>>> On Nov 18, 2011, at 17:08, Greg KH wrote:
> >>>>> 
> >>>>>> On Fri, Nov 18, 2011 at 04:16:23PM -0500, Jean-Francois Dagenais wrote:
> >>>>>>> Hello fellow hackers.
> >>>>>>> 
> >>>>>>> I am maintaining a UIO based driver for a PCI-E data acquisition device.
> >>>>>>> 
> >>>>>>> I map BAR0 of the device to userspace. I also map two memory areas,
> >>>>>>> one is used to feed instructions to the acquisition device, the other
> >>>>>>> is used autonomously by the PCI device to write the acquired data.
> >>>>>> 
> >>>>>> Nice, have a pointer to your driver anywhere so we can include it in the
> >>>>>> main kernel tree to make your life easier?
> >>>>> As I said in a parallel answer to "Hans J. Koch" <hjk@hansjkoch.de>,
> >>>>> the driver, although GPL'ed, is quite uninteresting except for us here at
> >>>>> Sonatest.
> >>>> 
> >>>> I really doubt that,
> >>> 
> >>> So do I. We never had a driver allocating so much memory.
> >>> 
> >>>> and you should submit it anyway to allow us to
> >>>> change it when the in-kernel apis change in the future.  It will save
> >>>> you time in the long run and make things easier for you (look, your
> >>>> driver is automatically included in all distros!, people fix your bugs,
> >>>> etc.)
> >>> 
> >>> Exactly.
> >>> 
> >>>> 
> >>>>> About merging the driver to mainline, I guess it would only be interesting for
> >>>>> the recipe I demonstrate. Please advise.
> >>>> 
> >>>> That is a recipe that I'm sure others will use, and need help on in the
> >>>> future.
> >>> 
> >>> They already needed it in the past, and they usually try to get it by
> >>> writing me private mail.
> >>> 
> >>>> 
> >>>> So please submit a patch, that will make it easier to help you out.
> >>> 
> >>> Yes, please do. The more different drivers we have under /drivers/uio, the
> >>> better. Didn't you use one of the existing drivers as a template for yours?
> >> Of course, and I am making contributions to the kernel as well (ds1wm, w1_ds2408,
> >> ad714x, and more to-be-merged contributions on the blackfin list) because I strongly
> >> believe in the community aspect of Linux.
> >> 
> >> So in the spirit of making the driver more generic, I would like to make this patch
> >> something along the lines of a generic UIO/PCI based large-DMA acquisition device
> >> driver. Or maybe even complement the existing uio_pci_generic.c?
> >> 
> >> The problem is that there are device-specific aspects that the "generic" driver would
> >> need to take into account, e.g. whether to map BARx or not, or, in our case, the MFDs
> >> embedded in the firmware (Xilinx's ds1wm core and, soon, Xilinx's SPI core). Furthermore,
> >> I want the FPGA to be an irq expander, since the cores generate interrupts and a
> >> couple of balls on the FPGA are irq signals from external i2c chips.
> >> 
> >> I don't yet see any way to specify something like a setup callback function that could reach a
> >> platform module when uio_pci_generic is probing.
> >> 
> >> I am thinking this through as I write here...
> >> 
> >> My other persona (C++ programmer) suggests that conceptually, uio_pci_generic is
> >> a "base class" and the other, more firmware-specific items would live in a derived module.
> >> In that sense, maybe uio_pci_generic could export its symbols, so it can be used as
> >> UIO core functionality?
> >> 
> >> So I would still have a module which would contain the specific MFD registration and IRQ
> >> functionality, but the BARx and large DMA mapping would reside in uio_pci_generic...
> >> 
> >> Any thoughts?
> >>> 
> >>> Thanks,
> >>> Hans
> >> Cheers,
> >> /jfd
> > 
> > BARx can be mapped through sysfs, right?
> > DMA into userspace really needs registration with an iommu and
> > locking userspace memory. This was discussed in the past but
> > no patch surfaced. You can copy some bits from 'VFIO' prototypes,
> > maybe - search for them.
> That is quite interesting. It really seems like my VT-d recipe to create 128MB for my PCI-e
> FPGA to write into is covered by this patch.
> 
> My problem is that our FPGA is connected to one of the Atom E6XX's PCI-e links, so no
> IOMMU :( Since our first product had VT-d, the FPGA, the UIO based module and the userspace
> code are designed such that the device sees a huge contiguous memory chunk. This is key
> to the performance of the FPGA, which is essentially decoupled from the CPU for its real-time
> acquisition.
> 
> Can VFIO work without an IOMMU?

I don't think they have any such plans.

> Or am I better off with a UIO solution?

You should probably write a proper kernel driver, not a UIO one.
Your kernel driver would have to prevent the device from DMA'ing into memory
outside the allocated range, even if userspace is malicious.
That's why UIO is generally not recommended for PCI devices that do DMA.

I also doubt a generic module like uio_pci_generic or VFIO can provide this
protection for you.

> If it can, I know my current UIO based solution is unusable without an IOMMU as well. The
> problem I have is that my fallback to the non-IOMMU mapping, pci_alloc_consistent (i.e. dma_alloc_coherent),
> will still only succeed at 4MB tops, and that's when the module loads from the init scripts. The
> success rate drops rapidly after this moment. Could I get more at arch_init time maybe?
> 
> And are there any other thoughts about carving out something like 256MB via the kernel boot args and
> initializing it later to use as a userspace-mapped DMA buffer?

You can use alloc_bootmem I guess.

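alloc_bootmem only works before the buddy allocator takes over, so the reservation would have to happen from early setup code rather than at module load; a hedged sketch (names illustrative):

#include <linux/bootmem.h>

static void *acq_buf;   /* e.g. 256MB for the acquisition device */

/* must run at early boot (e.g. from setup_arch()), not module_init */
void __init acq_reserve_bootmem(void)
{
        acq_buf = alloc_bootmem_pages(256UL * 1024 * 1024);
}

The driver would later hand virt_to_phys(acq_buf) to UIO as a UIO_MEM_PHYS map, the same way as the pci_alloc_coherent buffer discussed earlier.
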
> > 
> > -- 
> > MST


* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-22 17:27               ` Matthew Wilcox
@ 2011-11-22 17:40                 ` Michael S. Tsirkin
  0 siblings, 0 replies; 24+ messages in thread
From: Michael S. Tsirkin @ 2011-11-22 17:40 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Jean-Francois Dagenais, Hans J. Koch, Greg KH, tglx, linux-pci,
	open list

On Tue, Nov 22, 2011 at 10:27:25AM -0700, Matthew Wilcox wrote:
> On Tue, Nov 22, 2011 at 11:54:22AM -0500, Jean-Francois Dagenais wrote:
> > That is quite interesting. It really seems like my VT-d recipe to create 128MB for my PCI-e
> > FPGA to write into is covered by this patch.
> > 
> > My problem is that our FPGA is connected to one of the Atom E6XX's PCI-e links, so no
> > IOMMU :( Since our first product had VT-d, the FPGA, the UIO based module and the userspace
> > code are designed such that the device sees a huge contiguous memory chunk. This is key
> > to the performance of the FPGA, which is essentially decoupled from the CPU for its real-time
> > acquisition.
> 
> Is it really key?  If you supported, ohidon'tknow, 2MB pages, you'd
> need 64 entries in the FPGA to store the addresses of those 2MB pages,
> which doesn't sound like a huge burden.

Ah yes, we have this support for on-device IOMMUs. Maybe a generic
access driver will work if you implement an IOMMU in FPGA.
You would need to separate the programming of the IOMMU
from the rest of the functionality of the device,
protecting it from a malicious driver, somehow.

> -- 
> Matthew Wilcox				Intel Open Source Technology Centre
> "Bill, look, we understand that you're interested in selling us this
> operating system, but compare it to ours.  We can't possibly take such
> a retrograde step."


* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-22 17:37               ` Michael S. Tsirkin
@ 2011-11-22 17:54                 ` Hans J. Koch
  2011-11-22 18:40                   ` Michael S. Tsirkin
  0 siblings, 1 reply; 24+ messages in thread
From: Hans J. Koch @ 2011-11-22 17:54 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jean-Francois Dagenais, Hans J. Koch, Greg KH, tglx, linux-pci,
	open list

On Tue, Nov 22, 2011 at 07:37:23PM +0200, Michael S. Tsirkin wrote:
[...]
> > Or am I better off with a UIO solution?
> 
> You should probably write a proper kernel driver, not a UIO one.
> Your kernel driver would have to prevent the device from DMA'ing into memory
> outside the allocated range, even if userspace is malicious.
> That's why UIO is generally not recommended for PCI devices that do DMA.

When UIO was designed, the main goal was the ability to handle interrupts
from userspace. There was no requirement for DMA. In fact, in five years I
didn't get one real world device on my desk that needed it. That doesn't
mean there are no such devices. Adding DMA support to the UIO core was
discussed several times but no one ever did it. Ideas are still welcome...

If parts of the driver should be in userspace, you should really try
to extend the UIO core instead of re-implementing UIO functionality in
a "proper kernel driver".

Thanks,
Hans



* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-22 17:54                 ` Hans J. Koch
@ 2011-11-22 18:40                   ` Michael S. Tsirkin
  2011-11-22 18:52                     ` Hans J. Koch
  0 siblings, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2011-11-22 18:40 UTC (permalink / raw)
  To: Hans J. Koch; +Cc: Jean-Francois Dagenais, Greg KH, tglx, linux-pci, open list

On Tue, Nov 22, 2011 at 06:54:02PM +0100, Hans J. Koch wrote:
> On Tue, Nov 22, 2011 at 07:37:23PM +0200, Michael S. Tsirkin wrote:
> [...]
> > > Or am I better off with a UIO solution?
> > 
> > > You should probably write a proper kernel driver, not a UIO one.
> > > Your kernel driver would have to prevent the device from DMA'ing into memory
> > outside the allocated range, even if userspace is malicious.
> > That's why UIO is generally not recommended for PCI devices that do DMA.
> 
> When UIO was designed, the main goal was the ability to handle interrupts
> from userspace. There was no requirement for DMA. In fact, in five years I
> didn't get one real world device on my desk that needed it. That doesn't
> mean there are no such devices. Adding DMA support to the UIO core was
> > discussed several times but no one ever did it. Ideas are still welcome...
> 
> If parts of the driver should be in userspace, you should really try
> to extend the UIO core instead of re-implementing UIO functionality in
> a "proper kernel driver".
> 
> Thanks,
> Hans

Right, I really meant put all of the driver in the kernel.
If parts are in userspace and the device can do DMA,
you are faced with the problem that userspace can suddenly
access arbitrary memory through the device.

-- 
MST

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-22 18:40                   ` Michael S. Tsirkin
@ 2011-11-22 18:52                     ` Hans J. Koch
  2011-11-22 19:50                       ` Jean-Francois Dagenais
  2011-11-23  8:20                       ` Michael S. Tsirkin
  0 siblings, 2 replies; 24+ messages in thread
From: Hans J. Koch @ 2011-11-22 18:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Hans J. Koch, Jean-Francois Dagenais, Greg KH, tglx, linux-pci,
	open list

On Tue, Nov 22, 2011 at 08:40:40PM +0200, Michael S. Tsirkin wrote:
> On Tue, Nov 22, 2011 at 06:54:02PM +0100, Hans J. Koch wrote:
> > On Tue, Nov 22, 2011 at 07:37:23PM +0200, Michael S. Tsirkin wrote:
> > [...]
> > > > Or am I better off with a UIO solution?
> > > 
> > > You should probably write a proper kernel driver, not a UIO one.
> > > Your kernel driver would have to prevent the device from DMAing into memory
> > > outside the allocated range, even if userspace is malicious.
> > > That's why UIO is generally not recommended for PCI devices that do DMA.
> > 
> > When UIO was designed, the main goal was the ability to handle interrupts
> > from userspace. There was no requirement for DMA. In fact, in five years I
> > didn't get one real world device on my desk that needed it. That doesn't
> > mean there are no such devices. Adding DMA support to the UIO core was
> > discussed several times but no one ever did it. Ideas are still welcome...
> > 
> > If parts of the driver should be in userspace, you should really try
> > to extend the UIO core instead of re-implementing UIO functionality in
> > a "proper kernel driver".
> > 
> > Thanks,
> > Hans
> 
> Right, I really meant put all of the driver in the kernel.
> If parts are in userspace and the device can do DMA,
> you are faced with the problem that userspace suddenly
> can access arbitrary memory through the device.

That's nothing UIO specific. You have the same problem with /dev/mem
or graphic cards. If you're root, you can do lots of things that can
compromise security or crash your system.

Thanks,
Hans

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-22 18:52                     ` Hans J. Koch
@ 2011-11-22 19:50                       ` Jean-Francois Dagenais
  2011-11-23  8:20                       ` Michael S. Tsirkin
  1 sibling, 0 replies; 24+ messages in thread
From: Jean-Francois Dagenais @ 2011-11-22 19:50 UTC (permalink / raw)
  To: Hans J. Koch, Matthew Wilcox
  Cc: Michael S. Tsirkin, Greg KH, tglx, linux-pci, open list


On Nov 22, 2011, at 13:52, Hans J. Koch wrote:

> On Tue, Nov 22, 2011 at 08:40:40PM +0200, Michael S. Tsirkin wrote:
>> On Tue, Nov 22, 2011 at 06:54:02PM +0100, Hans J. Koch wrote:
>>> On Tue, Nov 22, 2011 at 07:37:23PM +0200, Michael S. Tsirkin wrote:
>>> [...]
>>>>> Or am I better off with a UIO solution?
>>>> 
>>>> You should probably write a proper kernel driver, not a UIO one.
>>>> Your kernel driver would have to prevent the device from DMAing into memory
>>>> outside the allocated range, even if userspace is malicious.
>>>> That's why UIO is generally not recommended for PCI devices that do DMA.
>>> 
>>> When UIO was designed, the main goal was the ability to handle interrupts
>>> from userspace. There was no requirement for DMA. In fact, in five years I
>>> didn't get one real world device on my desk that needed it. That doesn't
>>> mean there are no such devices. Adding DMA support to the UIO core was
>>> discussed several times but no one ever did it. Ideas are still welcome...
>>> 
>>> If parts of the driver should be in userspace, you should really try
>>> to extend the UIO core instead of re-implementing UIO functionality in
>>> a "proper kernel driver".
>>> 
>>> Thanks,
>>> Hans
>> 
>> Right, I really meant put all of the driver in the kernel.
>> If parts are in userspace and the device can do DMA,
>> you are faced with the problem that userspace suddenly
>> can access arbitrary memory through the device.
> 
> That's nothing UIO specific. You have the same problem with /dev/mem
> or graphic cards. If you're root, you can do lots of things that can
> compromise security or crash your system.
Exactly, and remember, this is a closed, embedded and controlled system, with a
"kiosk" application running.

As the product's supporter, if the end-user decides to tamper with his unit and
read/write anywhere in the system, it would surely qualify as unsupported use
of the system, and so my design should not account for this.

Since this FPGA PCI-E device is the product's reason for being, it rules over
any other factor as priority 1. I mean, as long as the system doesn't swap,
we're happy.

Matthew Wilcox <matthew@wil.cx> wrote:
> Is it really key?  If you supported, ohidon'tknow, 2MB pages, you'd
> need 64 entries in the FPGA to store the addresses of those 2MB pages,
> which doesn't sound like a huge burden.
This is an excellent idea, and thank you for bringing me back to earth. It is definitely
doable and would solve my 4MB cap problem. It would require work on
the FPGA side as well as mods to my uio driver to manage the memory and
override uio's page_fault implementation (if I am correct), but the rest of our code
base would be unaffected.

What would be completely ideal for me, though, is to cut out a chunk of
physical RAM from the kernel, like those integrated graphics chips do.

"Michael S. Tsirkin" <mst@redhat.com> wrote:
> You can use alloc_bootmem I guess.
That sounds like something I could use; any idea how to do this elegantly?
Meaning, where is the earliest point I can successfully call
alloc_bootmem(128MB)? How do I communicate this cleanly to my PCI probe
function so I can hand it to UIO? Using plain old global exported symbols?

Would I encounter cache problems or the like? Or is snooping guaranteed on modern
Intel platforms (Atom E6XX)?

Super thanks again for all the help guys, making real progress here.
/jfd

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-18 22:08 ` Greg KH
  2011-11-21 15:31   ` Jean-Francois Dagenais
@ 2011-11-22 19:57   ` Jean-Francois Dagenais
  2013-01-23  2:00     ` Jean-François Dagenais
  1 sibling, 1 reply; 24+ messages in thread
From: Jean-Francois Dagenais @ 2011-11-22 19:57 UTC (permalink / raw)
  To: Greg KH; +Cc: hjk, tglx, linux-pci, open list


On Nov 18, 2011, at 17:08, Greg KH wrote:

> On Fri, Nov 18, 2011 at 04:16:23PM -0500, Jean-Francois Dagenais wrote:
>> Hello fellow hackers.
>> 
>> I am maintaining a UIO based driver for a PCI-E data acquisition device.
>> 
>> I map BAR0 of the device to userspace. I also map two memory areas,
>> one is used to feed instructions to the acquisition device, the other
>> is used autonomously by the PCI device to write the acquired data.
> 
> Nice, have a pointer to your driver anywhere so we can include it in the
> main kernel tree to make your life easier?
> 
>> The strategy we have been using for those two share memory areas has
>> historically been using pci_alloc_coherent on v2.6.35 x86_64 (limited
>> to 4MB based on my trials) and later, I made use of the VT-d
>> (intel_iommu) to allocate as much as 128MB (an arbitrary limit) which
>> appear contiguous to the PCI device. I use vmalloc_user to allocate
>> 128M, then write all the physically continuous segments in a
>> scatterlist, then use pci_map_sg which works it's way to intel_iommu.
>> The device DMA addresses I get back are contiguous over the whole
>> 128M. Neat! Our VT-d capable devices still use this strategy.
>> 
>> This large memory is mission-critical in making the acquisition device
>> autonomous (real-time), yet keep the DMA implementation very simple.
>> Today, we are re-using this device on a CPU architecture that has no
>> IOMMU (intel E6XX/EG20T) and want to avoid creating a scatter-gather
>> scheme between my driver and the FPGA (PCI device).
>> 
>> So I went back to the old pci_alloc_coherent method, which although
>> limited to 4 MB, will do for early development phases. Instead of
>> 2.6.35, we are doing preliminary development using 2.6.37 and will
>> probably use 3.1 or more later.  The cpu/device shared memory maps
>> (1MB and 4MB) are allocated using pci_alloc_coherent and handed to UIO
>> as physical memory using the dma_addr_t returned by the pci_alloc
>> func.
>> 
>> The 1st memory map is written to by CPU and read from device.
>> The 2nd memory map is typically written by the device and read by the
>> CPU, but future features may have the device also read this memory.
>> 
>> My initial testing on the atom E6XX show the PCI device failing when
>> trying to read from the first memory map. I suspect PCI-E payload
>> sizes which may be somewhat hardcoded in the FPGA firmware... we will
>> confirm this soon.
> 
> That would be good to find out.
Just FYI,
To close the loop on the issue right above... The problem we had was that the
FPGA was using 64-bit formatted TLPs for its read and write requests to the
system's <4GiB RAM, which the PCI-E spec disallows (requests to addresses
below 4GiB must use the 32-bit TLP format).

This has never been a problem on the other systems we used, i.e.
Core2/ICH9M, and Atom-Z5xx/SCH-US15W.
> 
>> Now from the get go I have felt lucky to have made this work because
>> of my limited research into the intricacies of the kernel's memory
>> management. So I ask two things:
>> 
>> - Is this kosher?
> 
> I think so, yes, but others who know the DMA subsystem better than I
> should chime in here, as I might be totally wrong.
> 
>> - Is there a better/easier/safer way to achieve this? (remember that
>> for the second map, the more memory I have, the better. We have a gig
>> of ram, if I take, say 256MB, that would be OK too.
>> 
>> I had thought about cutting out a chunk of ram from the kernel's boot
>> args, but had always feared cache/snooping errors. Not to mention I
>> had no idea how to "claim" or setup this memory once my driver's probe
>> function. Maybe I would still be lucky and it would just work? mmmh...
> 
> Yeah, don't do that, it might not work out well.
> 
> greg k-h


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-22 18:52                     ` Hans J. Koch
  2011-11-22 19:50                       ` Jean-Francois Dagenais
@ 2011-11-23  8:20                       ` Michael S. Tsirkin
  1 sibling, 0 replies; 24+ messages in thread
From: Michael S. Tsirkin @ 2011-11-23  8:20 UTC (permalink / raw)
  To: Hans J. Koch; +Cc: Jean-Francois Dagenais, Greg KH, tglx, linux-pci, open list

On Tue, Nov 22, 2011 at 07:52:45PM +0100, Hans J. Koch wrote:
> On Tue, Nov 22, 2011 at 08:40:40PM +0200, Michael S. Tsirkin wrote:
> > On Tue, Nov 22, 2011 at 06:54:02PM +0100, Hans J. Koch wrote:
> > > On Tue, Nov 22, 2011 at 07:37:23PM +0200, Michael S. Tsirkin wrote:
> > > [...]
> > > > > Or am I better off with a UIO solution?
> > > > 
> > > > You should probably write a proper kernel driver, not a UIO one.
> > > > Your kernel driver would have to prevent the device from DMAing into memory
> > > > outside the allocated range, even if userspace is malicious.
> > > > That's why UIO is generally not recommended for PCI devices that do DMA.
> > > 
> > > When UIO was designed, the main goal was the ability to handle interrupts
> > > from userspace. There was no requirement for DMA. In fact, in five years I
> > > didn't get one real world device on my desk that needed it. That doesn't
> > > mean there are no such devices. Adding DMA support to the UIO core was
> > > discussed several times but no one ever did it. Ideas are still welcome...
> > > 
> > > If parts of the driver should be in userspace, you should really try
> > > to extend the UIO core instead of re-implementing UIO functionality in
> > > a "proper kernel driver".
> > > 
> > > Thanks,
> > > Hans
> > 
> > Right, I really meant put all of the driver in the kernel.
> > If parts are in userspace and the device can do DMA,
> > you are faced with the problem that userspace suddenly
> > can access arbitrary memory through the device.
> 
> That's nothing UIO specific. You have the same problem with /dev/mem
> or graphic cards. If you're root, you can do lots of things that can
> compromise security or crash your system.
> 
> Thanks,
> Hans

With an appropriate security policy, you might not be able to,
or your attempt to do so might be logged. Even without one, people
can use permissions to give non-root access to devices.
One doesn't normally expect chown mst /dev/foobar
to give mst full root on a box.

-- 
MST

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: extra large DMA buffer for PCI-E device under UIO
  2011-11-22 19:57   ` Jean-Francois Dagenais
@ 2013-01-23  2:00     ` Jean-François Dagenais
  0 siblings, 0 replies; 24+ messages in thread
From: Jean-François Dagenais @ 2013-01-23  2:00 UTC (permalink / raw)
  To: Lokesh M; +Cc: hjk, tglx, open list, gregkh

Hi all,

Here's to free software! (and good karma?)

Lokesh M mailed me directly with a follow-up question about this old thread; I
thought it would be interesting to post my reply to the list.

On 2013-01-22, at 10:23, Lokesh M wrote:
> 
> After reading through your thread below, I was wondering if you could
> please give me some feedback:
>  
> https://lkml.org/lkml/2011/11/18/462
>  
> I am in a similar situation, where I need to write a driver to pass data
> from PCIe (FPGA) to my Linux machine (4MB would be enough - streaming). I
> haven't checked if my server supports VT-d, but was interested in the way
> your implementation was done (UIO mapping).

I've sort of abandoned the VT-d way of doing it because I also need to support
an Atom architecture. I was a bit glad to do it like this, since I know the
IOMMU translation tables and whatnot aren't free, and the code to map and
support this was kind of hard to follow. Giving it up also means losing the
protection against stray FPGA memory accesses, but it's not like I had the
choice (Atom).

>  
> I was looking to know more about the two buffer mappings you have for
> streaming data and how they are achieved. We have a mapping of BAR0 for
> register access and I would like to implement a similar buffer for data as
> well. So please let me know any details and point me to some documentation
> to implement the same. We have a bounce buffer mechanism
> (device -> kernel -> user) but the speed is around 100MB/s, which I need
> to improve.

On Nov 18, 2011, at 17:08, Greg KH wrote:

> On Fri, Nov 18, 2011 at 04:16:23PM -0500, Jean-Francois Dagenais wrote:
>> 
>> 
>> I had thought about cutting out a chunk of ram from the kernel's boot
>> args, but had always feared cache/snooping errors. Not to mention I
>> had no idea how to "claim" or setup this memory once my driver's probe
>> function. Maybe I would still be lucky and it would just work? mmmh...
> 
> Yeah, don't do that, it might not work out well.
> 
> greg k-h


Turns out, for me, this works very well!!

So, here's the gist of what I do... remember, I only need to support pure Core2 +
Intel CPU/Chipset architectures on very specific COM modules. This means the
architecture takes care of invalidating the CPU cachelines when the PCI-E device
(an FPGA) bus-masters reads and writes to RAM (bus snooping). The area I
describe here is 128M (on the other system, I used 256M successfully) and is
strictly used for FPGA write - CPU read. As a note, the other area I use (only
1M) for CPU write - FPGA read is still allocated using pci_alloc_consistent. The
DMA address is collected through the 3rd argument of pci_alloc_consistent and is
handed to UIO as UIO_MEM_PHYS type of memory. FYI, I had previously succeeded in
allocating 4M using pci_alloc_consistent, but only if done quite soon after
boot. This was on a Core2 duo arch.
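
For reference, the consistent path amounts to something like this (a minimal
sketch: the function and struct names are made up, the UIO map index is
arbitrary, and error handling is trimmed):

#include <linux/pci.h>
#include <linux/uio_driver.h>

static struct uio_info hud_uio_info;	/* made-up driver state */

static int hud_alloc_consistent_map(struct pci_dev *pdev)
{
	dma_addr_t dma_handle;
	void *cpu_addr;

	/* the 1M CPU-write / FPGA-read area */
	cpu_addr = pci_alloc_consistent(pdev, 1 << 20, &dma_handle);
	if (!cpu_addr)
		return -ENOMEM;

	/* no IOMMU here, so the bus address is also the physical address
	 * and can be handed to UIO for userspace to mmap() */
	hud_uio_info.mem[1].addr = (unsigned long)dma_handle;
	hud_uio_info.mem[1].size = 1 << 20;
	hud_uio_info.mem[1].memtype = UIO_MEM_PHYS;
	return 0;
}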

I hook into the kernel boot parameter "memmap" to reserve a chunk of
contiguous memory which I know falls inside a range that the BIOS declares
(through E820) as available. This makes the kernel's memory management ignore
this area. I compile-in a kernel module which looks like this:

#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>

void *uio_hud_memblock_addr;
EXPORT_SYMBOL(uio_hud_memblock_addr);
unsigned long long uio_hud_memblock_size;
EXPORT_SYMBOL(uio_hud_memblock_size);

/* taken from parse_memmap_opt in e820.c and modified */
static int __init uio_hud_memblock_setup(char *str)
{
	char *cur_p = str;
	u64 mem_size;

	if (!str)
		return -EINVAL;

	mem_size = memparse(str, &cur_p);
	if (cur_p == str)
		return -EINVAL;

	/* only the "memmap=nn$ss" (reserve) form is of interest here */
	if (*cur_p == '$') {
		uio_hud_memblock_addr = (void *)(ulong)memparse(cur_p + 1, &cur_p);
		uio_hud_memblock_size = mem_size;
	} else {
		return -EINVAL;
	}

	return *cur_p == '\0' ? 0 : -EINVAL;
}
__setup("memmap=", uio_hud_memblock_setup);

static int __init uio_hud_memblock_init(void)
{
	/* PDEBUG is a local debug-printk macro */
	if (uio_hud_memblock_addr) {
		PDEBUG("ram memblock at %p (size:%llu)\n",
		       uio_hud_memblock_addr, uio_hud_memblock_size);
	} else {
		PDEBUG("no memmap=nn$ss kernel parameter found\n");
	}

	return 0;
}
early_initcall(uio_hud_memblock_init);

MODULE_AUTHOR("Jean-Francois Dagenais");
MODULE_DESCRIPTION("Built-in module to parse the memmap memblock reservation");
MODULE_LICENSE("GPL");
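
To illustrate the boot side: the reservation is requested with something like
the line below (the address is only an example; it has to fall inside a range
the E820 map reports as usable RAM):

	memmap=128M$0x10000000

This makes the kernel leave 128M alone starting at 256M, and the module above
then publishes that range through uio_hud_memblock_addr/size. One gotcha: some
boot loaders treat '$' specially (GRUB2 expands it as a variable), so it may
need escaping in the boot loader's config file.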

The parsed address and size (uio_hud_memblock_addr/size) are exported for my
other module (which is not compiled-in) to discover. That module is the real
PCI "driver", which simply takes this address and size and hands it to UIO as
a memory map of type UIO_MEM_PHYS. A stripped-down sketch of that handoff is
below.
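
All names in the sketch are hypothetical, and the interrupt setup and error
unwinding are left out; this only shows the shape of the handoff:

#include <linux/pci.h>
#include <linux/uio_driver.h>

extern void *uio_hud_memblock_addr;
extern unsigned long long uio_hud_memblock_size;

static struct uio_info hud_uio_info = {
	.name    = "uio_hud",
	.version = "0.1",
};

static int hud_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
	if (pci_enable_device(pdev))
		return -ENODEV;

	/* map 0: the FPGA's BAR0 registers */
	hud_uio_info.mem[0].addr = pci_resource_start(pdev, 0);
	hud_uio_info.mem[0].size = pci_resource_len(pdev, 0);
	hud_uio_info.mem[0].memtype = UIO_MEM_PHYS;

	/* map 1: the memmap-reserved chunk (FPGA-write / CPU-read) */
	hud_uio_info.mem[1].addr = (unsigned long)uio_hud_memblock_addr;
	hud_uio_info.mem[1].size = uio_hud_memblock_size;
	hud_uio_info.mem[1].memtype = UIO_MEM_PHYS;

	if (uio_register_device(&pdev->dev, &hud_uio_info)) {
		pci_disable_device(pdev);
		return -ENODEV;
	}

	pci_set_drvdata(pdev, &hud_uio_info);
	return 0;
}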

That's pretty much it for the kernel stuff (aside from the trivial interrupt
handling). In userspace, I also have a UIO map for the FPGA's BAR0 registers
through which I tell the device where the other two physical memory ranges are
(begin and end addresses for each: one range for its read ops (1M), one for
its write ops (128M), so 4 physical addresses). The device autonomously
updates where it's going to write next (its "data write addr" register), rolls
around when reaching the end, and sends me an interrupt for each "data unit"
it finishes. The interrupt is forwarded to userspace as described in the UIO
docs thanks to a small ISR in my kernel driver. Userspace instructs the device
through a "software read addr" register which indicates to the FPGA the lowest
address the software still needs (hasn't consumed yet). This is so the
autonomous FPGA doesn't overwrite busy memory. As soon as I update the soft
read addr, the FPGA can fill that spot again. A rough sketch of that userspace
loop follows.
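
Again a sketch only: the register offsets and the data-unit handling are
invented, and error checking is omitted. The real UIO conventions used are
that map N is selected by mmap()ing the uio fd at offset N * page size, and
that a blocking read() on the fd returns the interrupt count:

#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#define REG_DEV_WRITE_ADDR	0x10	/* invented register offsets */
#define REG_SOFT_READ_ADDR	0x14
#define DATA_SIZE		(128u << 20)

int main(void)
{
	int fd = open("/dev/uio0", O_RDWR);
	long pg = sysconf(_SC_PAGESIZE);
	volatile uint32_t *bar0 = mmap(NULL, pg, PROT_READ | PROT_WRITE,
				       MAP_SHARED, fd, 0 * pg);	/* map 0 */
	const uint8_t *data = mmap(NULL, DATA_SIZE, PROT_READ,
				   MAP_SHARED, fd, 1 * pg);	/* map 1 */
	uint32_t irq_count;

	for (;;) {
		/* blocks until the FPGA signals a finished "data unit" */
		read(fd, &irq_count, sizeof(irq_count));

		uint32_t wr = bar0[REG_DEV_WRITE_ADDR / 4];
		/* ... consume data[] up to offset wr here ... */

		/* tell the FPGA the consumed region may be refilled */
		bar0[REG_SOFT_READ_ADDR / 4] = wr;
	}
}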

This way you squeeze as much as you can out of the architecture, as the CPU
is only burdened with consuming the data and updating a pointer.

Cheers!  /jfd

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2013-01-23  2:00 UTC | newest]

Thread overview: 24+ messages
2011-11-18 21:16 extra large DMA buffer for PCI-E device under UIO Jean-Francois Dagenais
2011-11-18 22:08 ` Greg KH
2011-11-21 15:31   ` Jean-Francois Dagenais
2011-11-21 17:36     ` Greg KH
2011-11-21 18:17       ` Hans J. Koch
     [not found]         ` <4A52B447-8E21-43F6-A38E-711E36F89A34@gmail.com>
2011-11-21 19:29           ` Hans J. Koch
2011-11-22 15:24         ` Jean-Francois Dagenais
2011-11-22 15:35           ` Michael S. Tsirkin
2011-11-22 16:54             ` Jean-Francois Dagenais
2011-11-22 17:27               ` Matthew Wilcox
2011-11-22 17:40                 ` Michael S. Tsirkin
2011-11-22 17:37               ` Michael S. Tsirkin
2011-11-22 17:54                 ` Hans J. Koch
2011-11-22 18:40                   ` Michael S. Tsirkin
2011-11-22 18:52                     ` Hans J. Koch
2011-11-22 19:50                       ` Jean-Francois Dagenais
2011-11-23  8:20                       ` Michael S. Tsirkin
2011-11-22 16:05           ` Hans J. Koch
2011-11-22 19:57   ` Jean-Francois Dagenais
2013-01-23  2:00     ` Jean-François Dagenais
2011-11-18 22:27 ` Hans J. Koch
2011-11-21 15:10   ` Jean-Francois Dagenais
2011-11-21 15:47     ` Rolf Eike Beer
2011-11-21 16:01       ` Jean-Francois Dagenais
