* [PATCH 0/7] DMAENGINE: fixes and PrimeCells
From: Linus Walleij @ 2010-05-03  0:54 UTC
  To: Dan Williams, Russell King - ARM Linux
  Cc: linux-arm-kernel, linux-mmc, linux-kernel, Linus Walleij

This patch set, intended for Dan Williams' tree, includes:

- A fixed-up COH 901 318 PrimeCell extension (unless already applied
  to your tree)
- Two DMA40 bug fixes.
- The latest iteration of the PrimeCell DMA extensions, altered and
  tested to work as Russell wants them. The PL011 driver was tested:
  - on Versatile QEMU after I got it working again - OK
  - on U300 with DMA - OK
  - on U300 without DMA - OK
  - on U300 with fault injection, DMA always fails - OK
  - on U300 with fault injection, DMA fails every second time - OK
- Russell, is this OK with you now?

Yours,
Linus Walleij

* Re: [PATCH 0/7] DMAENGINE: fixes and PrimeCells
From: Linus Walleij @ 2010-05-07  9:13 UTC
  To: Dan Williams
  Cc: Russell King - ARM Linux, linux-arm-kernel, linux-mmc, linux-kernel

Dan, is this patch set OK?

Sorry if you're busy, I just need to check...

Patches 1 through 4 should be uncontroversial; they only affect
our ST-Ericsson platforms and have been thoroughly reviewed.

As posted elsewhere, this is now tested on an ARM reference
design as well, which IMHO would make it OK to also apply
the PL011 patch, nr 5.

(OK, Russell?)

Patches 6 and 7 depend on changes in Russell's -next branch,
so unless you want to bring the ARM -next into your tree
as well, these can wait.

Yours,
Linus Walleij

* Re: [PATCH 0/7] DMAENGINE: fixes and PrimeCells
From: Russell King - ARM Linux @ 2010-05-07  9:32 UTC
  To: Linus Walleij; +Cc: Dan Williams, linux-arm-kernel, linux-mmc, linux-kernel

On Fri, May 07, 2010 at 11:13:57AM +0200, Linus Walleij wrote:
> Dan, is this patch set OK?
> 
> Sorry if you're busy, I just need to check...
> 
> Patches 1 through 4 should be uncontroversial; they only affect
> our ST-Ericsson platforms and have been thoroughly reviewed.
> 
> As posted elsewhere, this is now tested on an ARM reference
> design as well, which IMHO would make it OK to also apply
> the PL011 patch, nr 5.

I would have thought, given the concerns that I stated, that merely
running the drivers in PIO mode would not address those concerns.  So
no, I'm not satisfied.

As I've said, I don't want the ARM platforms to be boxed into a corner
such that they can never have DMA support because this stuff hasn't been
properly thought out.

Or let me put it another way - if people are happy for Linux to support
new ARM CPU architectures, but with very little attention given to DMA
support on those architectures, then feel free to box the ARM platforms
into a corner on DMA support - but on the understanding that _you_ will
have to deal with the DMA API breakage on those architectures yourself.
Because with ARM platforms not having DMA support, there's absolutely
no way to run any checks whatsoever on DMA when the CPU architecture
support is created.

This is why people like the OMAP folk end up doing a lot of the DMA
debugging; they tend to be the first group to pick up new architectures
with fully functional platforms.

* Re: [PATCH 0/7] DMAENGINE: fixes and PrimeCells
From: Linus Walleij @ 2010-05-07 11:43 UTC
  To: Russell King - ARM Linux, Ben Dooks
  Cc: Dan Williams, linux-arm-kernel, linux-mmc, linux-kernel

2010/5/7 Russell King - ARM Linux <linux@arm.linux.org.uk>:

> I would have thought, given the concerns that I stated, that merely
> running the drivers in PIO mode would not address those concerns.  So
> no, I'm not satisfied.

Sorry, I didn't get that; I understood it to mean that it should be
tested on the Versatile without regressions.

So now I understand that you want the drivers to be tested in DMA mode
on the ARM reference HW. Right?

So in order for this to be accepted, I also have to implement
support for the DMA controllers found in the reference platforms from
ARM, the PL080 and PL081, since there is only this S3C-tilted driver
in the kernel so far:
arch/arm/mach-s3c64xx/dma.c

This is hard for me to do since I have to try to borrow a device to
test it on.

Anyway, I will see if I can borrow the RealView machine again and
play around a bit with patching up Ben's driver to work with the
DMAengine, and show that it runs, in DMA mode, on the RealView,
as a proof of concept. Would that be OK then?

> As I've said, I don't want the ARM platforms to be boxed into a corner
> such that they can never have DMA support because this stuff hasn't been
> properly thought out.

I'm thinking all I can, I promise. Perhaps I'm not smart enough all the
time; it's a known problem, and I'm working on it.

> Or let me put it another way - if people are happy for Linux to support
> new ARM CPU architectures, but with very little attention given to DMA
> support on those architectures, then feel free to box the ARM platforms
> into a corner on DMA support - but on the understanding that _you_ will
> have to deal with the DMA API breakage on those architectures yourself.
> Because with ARM platforms not having DMA support, there's absolutely
> no way to run any checks whatsoever on DMA when the CPU architecture
> support is created.

I'm doing the best I can to meet exactly this goal. The changes to
the DMA engine were made with the goal of making the DMA
engine support *any* DMA controller for the PrimeCells, and I've proven
it to be possible for two different architectures: U300 and U8500. These
are totally different even though they're coming out of the same
company (we *weren't* the same company when they were created!).
One is an ARM9, the other a dual-core Cortex-A9. One uses a DMA silicon
called COH901318, the other uses a DMA silicon named DMA40.

So now I guess I have to make it tick on the block known as PL080/PL081
as well, and I'll have a try at it.

> This is why people like the OMAP folk end up doing a lot of the DMA
> debugging; they tend to be the first group to pick up new architectures
> with fully functional platforms.

Yep, it seems we're going to have the same issue with the Cortex-A9 for
U8500.

Right now we're doing DMA debugging and testing on the U300 and
U8500 to make sure neither breaks as a result of these patches,
and defining the API for the DMA engine.

Yours,
Linus Walleij

* Re: [PATCH 0/7] DMAENGINE: fixes and PrimeCells
From: jassi brar @ 2010-05-07 12:31 UTC
  To: Linus Walleij
  Cc: Russell King - ARM Linux, Ben Dooks, Dan Williams, linux-mmc,
	linux-kernel, linux-arm-kernel

On Fri, May 7, 2010 at 8:43 PM, Linus Walleij
<linus.ml.walleij@gmail.com> wrote:
> 2010/5/7 Russell King - ARM Linux <linux@arm.linux.org.uk>:
>> Or let me put it another way - if people are happy for Linux to support
>> new ARM CPU architectures, but with very little attention given to DMA
>> support on those architectures, then feel free to box the ARM platforms
>> into a corner on DMA support - but on the understanding that _you_ will
>> have to deal with the DMA API breakage on those architectures yourself.
>> Because with ARM platforms not having DMA support, there's absolutely
>> no way to run any checks whatsoever on DMA when the CPU architecture
>> support is created.
>
> I'm doing the best I can to meet exactly this goal. The changes to
> the DMA engine were made with the goal of making the DMA
> engine support *any* DMA controller for the PrimeCells,
With due respect, I think the DMA Engine API is very restrictive. It is
not just its 'async' character: some desirable features, like circular
linked buffers, are also missing. It may be good enough for Mem->Mem
but is found wanting for Mem<->Dev transfers.
I would like to see some new API defined that addresses the reasonable
requirements of the extant platform-specific implementations.

> So now I guess I have to make it tick on the block known as PL080/PL081
> as well, and I'll have a try at it.
I always hoped the PL080 core would be split out of the S3C
implementation, albeit for some new common DMA API.

* Re: [PATCH 0/7] DMAENGINE: fixes and PrimeCells
From: Linus Walleij @ 2010-05-07 16:10 UTC
  To: jassi brar
  Cc: Russell King - ARM Linux, Ben Dooks, Dan Williams, linux-mmc,
	linux-kernel, linux-arm-kernel

2010/5/7 jassi brar <jassisinghbrar@gmail.com>:
> On Fri, May 7, 2010 at 8:43 PM, Linus Walleij
> <linus.ml.walleij@gmail.com> wrote:
>> 2010/5/7 Russell King - ARM Linux <linux@arm.linux.org.uk>:
>>>
>>> Or let me put it another way - if people are happy for Linux to support
>>> new ARM CPU architectures, but with very little attention given to DMA
>>> support on those architectures, then feel free to box the ARM platforms
>>> into a corner on DMA support - but on the understanding that _you_ will
>>> have to deal with the DMA API breakage on those architectures yourself.
>>> Because with ARM platforms not having DMA support, there's absolutely
>>> no way to run any checks whatsoever on DMA when the CPU architecture
>>> support is created.
>>
>> I'm doing the best I can to meet exactly this goal. The changes to
>> the DMA engine were made with the goal of making the DMA
>> engine support *any* DMA controller for the PrimeCells,
>
> With due respect, I think the DMA Engine API is very restrictive. It is
> not just its 'async' character: some desirable features, like circular
> linked buffers, are also missing. It may be good enough for Mem->Mem
> but is found wanting for Mem<->Dev transfers.
> I would like to see some new API defined that addresses the reasonable
> requirements of the extant platform-specific implementations.

I understand these concerns; however, I believe the DMAdevices/DMAengine
API can surely be refactored towards this end. Dan Williams has proved
*very* cooperative in making changes and testing for regressions in the
DMAengine, and I see no fundamental problem with it.

Surely circular linked buffers and other goodies can be retrofitted
into the DMAengine without a complete redesign? I only see a new slave
call needed to support that, really, in addition to the existing
sglist interface.
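
Something like this, roughly - just my sketch of the idea, not an
existing dmaengine call, so the exact parameter list is guesswork:

/* Hypothetical new slave call: prepare a transfer that wraps from
 * the end of the buffer back to the start, firing the descriptor
 * callback once per period until the channel is stopped.
 */
struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
        struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
        size_t period_len, enum dma_data_direction direction);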

I remember we discussed circular-buffer device->device transfers with
Dan a while ago, and he was all for including that but wanted some
real-world example to go along with it.

>> So now I guess I have to make it tick on the block known as PL080/PL081
>> as well, and I'll have a try at it.
>
> I always hoped the PL080 core would be split out of the S3C
> implementation, albeit for some new common DMA API.

Well, I don't see any fundamental problems with the DMAengine; it just
needs some extensions, really.

I don't know if I'll be able to provide the nice breakout of the PL080
core from the S3C driver that you'd like to see, but I will try to hack
something up as a proof of concept that the DMAengine can support the
PrimeCells found on the RealView.

Yours,
Linus Walleij

* Re: [PATCH 0/7] DMAENGINE: fixes and PrimeCells
From: Dan Williams @ 2010-05-07 23:54 UTC
  To: Linus Walleij
  Cc: Russell King - ARM Linux, Ben Dooks, linux-arm-kernel, linux-mmc,
	linux-kernel

On Fri, May 7, 2010 at 4:43 AM, Linus Walleij
<linus.ml.walleij@gmail.com> wrote:
> 2010/5/7 Russell King - ARM Linux <linux@arm.linux.org.uk>:
>
>> I would have thought, given the concerns that I stated, that merely
>> running the drivers in PIO mode would not address those concerns.  So
>> no, I'm not satisfied.
>
> Sorry, I didn't get that; I understood it to mean that it should be
> tested on the Versatile without regressions.
>
> So now I understand that you want the drivers to be tested in DMA mode
> on the ARM reference HW. Right?

Maybe I am also misunderstanding, but I think the concern is less
about "does the dma driver work" and more about exposing an api to dma
clients that is not expressive enough to handle architecture-specific
quirks.  The dma provider can always be fixed up.  The "boxed in"
effect happens when there is a non-trivial pile of dma client device
drivers using an api that is not expressive enough for a new ARM
platform.

That being said, I think the interface tweaks for device_control and
channel_status were relatively painless, and the one
architecture-specific call we added seems like something that could
never be reconciled in a cross-architecture generic api.

> So in order for this to be accepted, I also have to implement
> support for the DMA controllers found in the reference platforms from
> ARM, the PL080 and PL081, since there is only this S3C-tilted driver
> in the kernel so far:
> arch/arm/mach-s3c64xx/dma.c
>
> This is hard for me to do since I have to try to borrow a device to
> test it on.

I do not think this is a fair burden for you to carry, but it does
sound like Russell will come looking for you when he finds a
counter-example architecture that breaks the api assumptions.  The
multiplexing case seems to be the most challenging to the existing
model.  The idea to oversubscribe struct dma_chan's versus physical
channels seems workable... Russell, is this the implementation you
would like to see pre-merge, or is a hand-wavy plan to address it
enough to let this initial implementation through?

> Anyway, I will see if I can borrow the RealView machine again and
> play around a bit with patching up Ben's driver to work with the
> DMAengine, and show that it runs, in DMA mode, on the RealView,
> as a proof of concept. Would that be OK then?
>
>> As I've said, I don't want the ARM platforms to be boxed into a corner
>> such that they can never have DMA support because this stuff hasn't been
>> properly thought out.
>
> I'm thinking all I can, I promise. Perhaps I'm not smart enough all the
> time; it's a known problem, and I'm working on it.
>
>> Or let me put it another way - if people are happy for Linux to support
>> new ARM CPU architectures, but with very little attention given to DMA
>> support on those architectures, then feel free to box the ARM platforms
>> into a corner on DMA support - but on the understanding that _you_ will
>> have to deal with the DMA API breakage on those architectures yourself.
>> Because with ARM platforms not having DMA support, there's absolutely
>> no way to run any checks whatsoever on DMA when the CPU architecture
>> support is created.
>
> I'm doing the best I can to meet exactly this goal. The changes to
> the DMA engine were made with the goal of making the DMA
> engine support *any* DMA controller for the PrimeCells, and I've proven
> it to be possible for two different architectures: U300 and U8500. These
> are totally different even though they're coming out of the same
> company (we *weren't* the same company when they were created!).
> One is an ARM9, the other a dual-core Cortex-A9. One uses a DMA silicon
> called COH901318, the other uses a DMA silicon named DMA40.
>
> So now I guess I have to make it tick on the block known as PL080/PL081
> as well, and I'll have a try at it.
>
>> This is why people like the OMAP folk end up doing a lot of the DMA
>> debugging; they tend to be the first group to pick up new architectures
>> with fully functional platforms.
>
> Yep, it seems we're going to have the same issue with the Cortex-A9 for
> U8500.
>
> Right now we're doing DMA debugging and testing on the U300 and
> U8500 to make sure neither breaks as a result of these patches,
> and defining the API for the DMA engine.
>

Very reasonable, and I am not seeing a merge-blocking issue from the
dmaengine perspective at this point.  The dma_request_channel()
interface seems sufficient for all the needs to date, as it basically
allows architectures to develop their own dma-provider-to-client
matching interface to override the generic memory-to-memory client
matching interface.  The 'slave' interface and the ->private parameter
of dma_chan (pending Jassi's recent fix) have proven sufficient for
handling association-specific data.  These new patches establish a
model whereby, once you know you are talking to your $ARCH dma driver,
you can make driver-specific calls to pass information that would be
too ugly to pass in a generic fashion.  I do not pretend that this is
an elegant arrangement, but it has proven to be functional.
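
To make the matching scheme concrete, here is a minimal sketch of how
a slave client grabs a channel; my_filter and my_dmac_dev are
illustrative names, not code from these patches:

/* in the client driver: match only channels from our platform's DMAC */
static bool my_filter(struct dma_chan *chan, void *param)
{
        return chan->device->dev == param;
}

/* at probe time: */
dma_cap_mask_t mask;
struct dma_chan *chan;

dma_cap_zero(mask);
dma_cap_set(DMA_SLAVE, mask);
chan = dma_request_channel(mask, my_filter, my_dmac_dev);
if (!chan)
        return -ENODEV; /* or fall back to PIO */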

I do have concerns about mapping device-to-device dma into a generic
api.  That's a whole new level of platform-specific crazy, but that is
not what these patches are about.

At the end of the day I still need Russell's ack/acceptance of the
platform_data infrastructure to move forward, but I'll get the
dmaengine bits queued into -next so it's at least ready to be pulled.

--
Dan

* Re: [PATCH 0/7] DMAENGINE: fixes and PrimeCells
From: jassi brar @ 2010-05-08  2:37 UTC
  To: Linus Walleij
  Cc: Russell King - ARM Linux, Ben Dooks, Dan Williams, linux-mmc,
	linux-kernel, linux-arm-kernel

On Sat, May 8, 2010 at 1:10 AM, Linus Walleij
<linus.ml.walleij@gmail.com> wrote:
> Surely circular linked buffers and other goodies can be retrofitted
> into the DMAengine without a complete redesign? I only see a new slave
> call needed to support that, really, in addition to the existing
> sglist interface.
Well, before taking up the PL330 dma api driver, its 'async' character
was the only concern I had in mind. That still is, but I came across a
few more peculiarities while implementing the driver.

a) Async: for lazy mem-to-mem transfers this may be OK. But there
   might be devices that employ DMA to do extensive M2M transfers
   (say, dedicated multimedia-oriented devices) for which the 'async'
   nature might be a bottleneck. So too for M<=>D with a fast device
   with a shallow FIFO. There may be clients that don't want to do
   much upon DMA done, but they do need notifications ASAP. By
   definition, this API forbids such expectations.
   IMHO, a DMA api should be as quick as possible - callbacks done in
   IRQ context. But since there may be clients that need to do
   sleepable stuff in callbacks, the API could do two callbacks -
   'quick' in IRQ context and 'lazy' from tasklets scheduled from the
   IRQ (see the sketch after this list). Most clients will provide
   either, while some may provide both callback functions.

b) There seems to be no clear way of reporting failed transfers. The
   device_tx_status call can return FAIL/SUCCESS, but the call is
   open-ended and can be performed without any time bound after
   tx_submit. It is not very optimal for DMAC drivers to save
   descriptors of all failed transactions until the channel is
   released. IMHO, providing status checking by two mechanisms -
   cookies and dma-done callbacks - is a complication more than a
   feature. Perhaps the dma engine could provide a default callback,
   should the client not provide one, and track done/pending xfers
   for such requests?

c) Conceptually, the channels are tightly coupled with the DMACs;
   there seems to be no way to schedule a channel among more than one
   DMAC at runtime, that is, if more than one DMAC supports the same
   channel/peripheral. For example, Samsung's S5Pxxxx have many
   channels available on more than one DMAC, but with this dma api we
   have to statically assign channels to DMACs, which may result in a
   channel acquire request being rejected just because the DMAC we
   chose for it is already fully busy while another DMAC, which also
   supports the channel, is idling. Unless we treat the same
   peripheral as, say, I2STX_viaDMAC1 and I2STX_viaDMAC2 and allocate
   double resources for these "mutually exclusive" channels.

d) Something like a circular linked request is highly desirable for
   one of the important DMA clients, i.e. audio.

e) There seems to be no scatter-gather support for mem-to-mem
   transfers.
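
To illustrate point (a), the dual-callback idea could look something
like this - purely hypothetical, not the current dmaengine interface:

/* Hypothetical extension of the descriptor: a 'quick' callback run
 * straight from the DMAC ISR, plus the existing 'lazy' one run from
 * the tasklet.  A client fills in either or both.
 */
struct dma_async_tx_descriptor {
        /* ...existing fields... */
        dma_async_tx_callback quick_callback;   /* hard-irq context */
        void *quick_callback_param;
        dma_async_tx_callback callback;         /* tasklet context, as today */
        void *callback_param;
};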

Or are these just due to my cursory understanding of the DMA Engine core?...

Of course, there are many good features in this API that any API
should provide.

regards.

* Re: [PATCH 0/7] DMAENGINE: fixes and PrimeCells
From: Dan Williams @ 2010-05-08 22:24 UTC
  To: jassi brar
  Cc: Linus Walleij, Russell King - ARM Linux, Ben Dooks, linux-mmc,
	linux-kernel, linux-arm-kernel

On Fri, May 7, 2010 at 7:37 PM, jassi brar <jassisinghbrar@gmail.com> wrote:
> On Sat, May 8, 2010 at 1:10 AM, Linus Walleij
> <linus.ml.walleij@gmail.com> wrote:
>> Surely circular linked buffers and other goodies can be retrofitted
>> into the DMAengine without a complete redesign? I only see a new slave
>> call needed to support that, really, in addition to the existing
>> sglist interface.
> Well, before taking up the PL330 dma api driver, its 'async' character
> was the only concern I had in mind. That still is, but I came across a
> few more peculiarities while implementing the driver.
>
> a) Async: for lazy mem-to-mem transfers this may be OK. But there
>    might be devices that employ DMA to do extensive M2M transfers
>    (say, dedicated multimedia-oriented devices) for which the 'async'
>    nature might be a bottleneck. So too for M<=>D with a fast device
>    with a shallow FIFO. There may be clients that don't want to do
>    much upon DMA done, but they do need notifications ASAP. By
>    definition, this API forbids such expectations.

It is not forbidden by definition.  What is needed is a way for
drivers to opt out of the async_tx expectations.  I have started down
this path with CONFIG_ASYNC_TX_DISABLE_CHANNEL_SWITCH for the ioatdma
driver, but the idea could be extended further to disable
CONFIG_ASYNC_TX_DMA and NET_DMA entirely, to allow the device to
operate in a more device-dma-friendly mode.

>  IMHO, a DMA api should be as quick as possible - callbacks done in IRQ context.
>  But since there may be clients that need to do sleepable stuff in

None of the current clients sleep in the callback; it's done in
soft-irq context.  The only expectation is that hard-irqs are enabled
during the callback, just like timer callbacks.  I also would like to
see numbers to quantify the claims of slowness.  When Steven Rostedt
was proposing his "move tasklets to process context" patches I ran a
throughput test on iop13xx and did not measure any degradation.
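
For reference, this is roughly how a mem-to-mem client hooks a
completion callback today (a minimal sketch: my_done and my_ctx are
placeholder names, and error handling is omitted):

struct dma_async_tx_descriptor *tx;
dma_cookie_t cookie;

tx = chan->device->device_prep_dma_memcpy(chan, dst, src, len,
                                          DMA_PREP_INTERRUPT);
tx->callback = my_done;         /* void my_done(void *param), tasklet context */
tx->callback_param = my_ctx;
cookie = tx->tx_submit(tx);
dma_async_issue_pending(chan);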

> callbacks, the API could do two callbacks - 'quick' in IRQ context
> and 'lazy' from tasklets scheduled from the IRQ. Most clients will
> provide either, while some may provide both callback functions.
>
> b) There seems to be no clear way of reporting failed transfers. The
>    device_tx_status call can return FAIL/SUCCESS, but the call is
>    open-ended and can be performed without any time bound after
>    tx_submit. It is not very optimal for DMAC drivers to save
>    descriptors of all failed transactions until the channel is
>    released. IMHO, providing status checking by two mechanisms -
>    cookies and dma-done callbacks - is a complication more than a
>    feature. Perhaps the dma engine could provide a default callback,
>    should the client not provide one, and track done/pending xfers
>    for such requests?

I agree the error handling was designed around mem-to-mem assumptions
where failures are due to double-bit ECC errors and other rare events.

>
> c) Conceptually, the channels are tightly coupled with the DMACs;
>    there seems to be no way to schedule a channel among more than one
>    DMAC at runtime, that is, if more than one DMAC supports the same
>    channel/peripheral. For example, Samsung's S5Pxxxx have many
>    channels available on more than one DMAC, but with this dma api we
>    have to statically assign channels to DMACs, which may result in a
>    channel acquire request being rejected just because the DMAC we
>    chose for it is already fully busy while another DMAC, which also
>    supports the channel, is idling. Unless we treat the same
>    peripheral as, say, I2STX_viaDMAC1 and I2STX_viaDMAC2 and allocate
>    double resources for these "mutually exclusive" channels.

I am not understanding this example.  If both DMACs are registered, the
dma_filter function passed to dma_request_channel() can select between
them, right?

>
> d) Something like a circular linked request is highly desirable for
>    one of the important DMA clients, i.e. audio.

Is this a standing dma chain that periodically a client will say "go"
to re-run those operations?  Please enlighten me; I've never played
with audio drivers.

>
> e) There seems to be no scatter-gather support for mem-to-mem
>    transfers.

There has never been a use case; what did you have in mind?  If
multiple prep_memcpy commands are too inefficient we could always add
another operation.

> Or are these just due to my cursory understanding of the DMA Engine core?...

No, it's a good review and points out some places where the API can evolve.

--
Dan

* Re: [PATCH 0/7] DMAENGINE: fixes and PrimeCells
From: jassi brar @ 2010-05-09  3:48 UTC
  To: Dan Williams
  Cc: Linus Walleij, Russell King - ARM Linux, Ben Dooks, linux-mmc,
	linux-kernel, linux-arm-kernel

On Sun, May 9, 2010 at 7:24 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Fri, May 7, 2010 at 7:37 PM, jassi brar <jassisinghbrar@gmail.com> wrote:
>>  IMHO, a DMA api should be as quick as possible - callbacks done in IRQ context.
>>  But since there may be clients that need to do sleepable stuff in
>
> None of the current clients sleep in the callback; it's done in
> soft-irq context.  The only expectation is that hard-irqs are enabled
> during the callback, just like timer callbacks.  I also would like to
> see numbers to quantify the claims of slowness.
The clients evolve around the API, so they don't do what the API
doesn't allow. Any API should try to impose as few constraints as
possible - you never know what kind of clients are going to arise.
Let's say a protocol requires a 'quick' ACK (within a few usecs) on
the control bus after transferring a large packet on the data bus.
All the client needs is to be able to toggle some bit of the device
controller after the DMA is done, which can very well be done in IRQ
context but may be too late if the callback is done from a tasklet
scheduled from the DMAC ISR.
The point being, a DMA API should be able to do callbacks from IRQ
context too. That is, assuming the clients know what they are doing.

Also, I think it is possible to have an API that allows request
submission from callbacks, which would be a very useful feature.
Of course, assuming the clients know what they can/can't do (just like
with the current DMA API or any other API).


>> callbacks, the API could do two callbacks - 'quick' in IRQ context
>> and 'lazy' from tasklets scheduled from the IRQ. Most clients will
>> provide either, while some may provide both callback functions.
>>
>> b) There seems to be no clear way of reporting failed transfers. The
>>    device_tx_status call can return FAIL/SUCCESS, but the call is
>>    open-ended and can be performed without any time bound after
>>    tx_submit. It is not very optimal for DMAC drivers to save
>>    descriptors of all failed transactions until the channel is
>>    released. IMHO, providing status checking by two mechanisms -
>>    cookies and dma-done callbacks - is a complication more than a
>>    feature. Perhaps the dma engine could provide a default callback,
>>    should the client not provide one, and track done/pending xfers
>>    for such requests?
>
> I agree the error handling was designed around mem-to-mem assumptions
> where failures are due to double-bit ECC errors and other rare events.
Well, neither have I ever seen a DMA failure, but a good API shouldn't
count on h/w perfection.


>> c) Conceptually, the channels are tightly coupled with the DMACs;
>>    there seems to be no way to schedule a channel among more than one
>>    DMAC at runtime, that is, if more than one DMAC supports the same
>>    channel/peripheral. For example, Samsung's S5Pxxxx have many
>>    channels available on more than one DMAC, but with this dma api we
>>    have to statically assign channels to DMACs, which may result in a
>>    channel acquire request being rejected just because the DMAC we
>>    chose for it is already fully busy while another DMAC, which also
>>    supports the channel, is idling. Unless we treat the same
>>    peripheral as, say, I2STX_viaDMAC1 and I2STX_viaDMAC2 and allocate
>>    double resources for these "mutually exclusive" channels.
>
> I am not understanding this example.  If both DMACs are registered, the
> dma_filter function passed to dma_request_channel() can select between
> them, right?
Let me be precise: the I2S_Tx FIFO (I2S peripheral/channel) can be
reached by two DMACs but, of course, the channel can only be active
with exactly one DMAC at a time.
So it is desirable to be able to reach the peripheral via the second
DMAC should the first one be too busy to handle the request. Clearly
this is a runtime decision.
FWIHS, I can associate the channel with either of the DMACs, and if
that DMAC can't handle the I2S_Tx request (say due to all its h/w
threads being allocated to other requests), I can't play audio even if
the other DMAC might be simply idling.


>>
>> d) Something like a circular linked request is highly desirable for
>>    one of the important DMA clients, i.e. audio.
>
> Is this a standing dma chain that periodically a client will say "go"
> to re-run those operations?  Please enlighten me; I've never played
> with audio drivers.
Yes, quite similar. Only ALSA drivers say "go" just once, at playback
start, and the submitted xfer requests (called periods) are repeatedly
transferred in a circular manner.
Just a quick snd_pcm_period_elapsed() is called in the dma-done
callback for each request (the requests are usually the same length).
That way, the client neither has to re-submit requests nor needs to do
sleepable stuff (allocating memory for new requests and managing a
local state machine).
The minimum period size depends on the audio latency, which depends on
the ability to do dma-done callbacks ASAP.
This is another example where clients would benefit from a callback
from IRQ context, which is also perfectly safe.
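
In code it is about this simple; a sketch of the per-period callback,
where only snd_pcm_period_elapsed() is a real ALSA call and the rest
is illustrative:

static void audio_dma_done(void *data)
{
        struct snd_pcm_substream *substream = data;

        /* advance the hw pointer by one period and wake the PCM core */
        snd_pcm_period_elapsed(substream);
}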

>> e) There seems to be no scatter-gather support for mem-to-mem transfers.
>
> There has never been a use case; what did you have in mind?  If
> multiple prep_memcpy commands are too inefficient we could always add
> another operation.
Just that I believe any API should be as exhaustive and generic as
possible. I see it as possible for multimedia devices/drivers to evolve
to the point of needing such capabilities.
Also, the way the DMA API treats memcpy/memset and assumes SG requests
to be equivalent to MEM<=>DEV requests is not very impressive.
IMHO, any submitted request should be a list of xfers, where an xfer is
a 'memset' with 'src_len' bytes from 'src_addr' to be copied 'n' times
at 'dst_addr'.
Memcpy is just a special case of memset, where n := 1.
This covers most possible use cases while being more compact and
future-proof.
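
A minimal sketch of that proposed request format; all of these names are
hypothetical, not the current dmaengine API:

#include <linux/list.h>
#include <linux/types.h>

/* Hypothetical: a request is a list of xfers, each a generalized
 * memset. */
struct xfer {
    dma_addr_t src_addr;     /* start of the source/pattern */
    dma_addr_t dst_addr;     /* start of the destination */
    size_t src_len;          /* bytes in one copy of the pattern */
    unsigned int n;          /* times the pattern is written */
    struct list_head node;   /* links xfers into one request */
};

struct dma_request {
    struct list_head xfers;  /* memcpy == a single xfer with n = 1 */
};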

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/7] DMAENGINE: fixes and PrimeCells
  2010-05-09  3:48                 ` jassi brar
@ 2010-05-09  7:47                   ` Dan Williams
  -1 siblings, 0 replies; 28+ messages in thread
From: Dan Williams @ 2010-05-09  7:47 UTC (permalink / raw)
  To: jassi brar
  Cc: Linus Walleij, Russell King - ARM Linux, Ben Dooks, linux-mmc,
	linux-kernel, linux-arm-kernel

On Sat, May 8, 2010 at 8:48 PM, jassi brar <jassisinghbrar@gmail.com> wrote:
> On Sun, May 9, 2010 at 7:24 AM, Dan Williams <dan.j.williams@intel.com> wrote:
>> On Fri, May 7, 2010 at 7:37 PM, jassi brar <jassisinghbrar@gmail.com> wrote:
>>>  IMHO, a DMA api should be as quick as possible - callbacks done in IRQ context.
>>>  But since there may be clients that need to do sleepable stuff in
>>
>> None of the current clients sleep in the callback, it's done in
>> soft-irq context.  The only expectation is that hard-irqs are enabled
>> during the callback just like timer callbacks.  I also would like to
>> see numbers to quantify the claims of slowness.
> The clients evolve around the API, so they don't do what the API
> doesn't allow. Any API should try to impose as few constraints as
> possible - you never know what kind of clients are going to arise.

Running a callback in hard-irq context definitely puts constraints on
the callback implementation to be as minimal as possible... and there
is nothing stopping you from doing that today with the existing
dmaengine interface: see idmac_interrupt.
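
Roughly this pattern, sketched with illustrative names (my_desc,
pop_completed_desc); only the embedded dma_async_tx_descriptor fields
come from dmaengine:

#include <linux/interrupt.h>
#include <linux/dmaengine.h>

struct my_desc {
    struct dma_async_tx_descriptor txd;
    /* ... driver-private hardware descriptor state ... */
};

/* Driver-specific and illustrative: fetch the finished descriptor */
static struct my_desc *pop_completed_desc(void *dmac);

/* Sketch: a DMAC ISR invoking the completion callback directly in
 * hard-irq context, as idmac does. */
static irqreturn_t my_dmac_irq(int irq, void *data)
{
    struct my_desc *desc = pop_completed_desc(data);

    if (desc && desc->txd.callback)
        desc->txd.callback(desc->txd.callback_param);

    return IRQ_HANDLED;
}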

> Let's say a protocol requires a 'quick' ACK (within a few usecs) on the
> control bus after xfer'ing a large packet on the data bus. All the
> client needs is to be able to toggle some bit of the device controller
> after the DMA is done, which can very well be done in IRQ context but
> may be too late if the callback is done from a tasklet scheduled from
> the DMAC ISR.
> The point being, a DMA API should be able to do callbacks from IRQ
> context too. That is, assuming the clients know what they are doing.

You are confusing async_tx constraints and dmaengine.  If your driver
is providing the backend of an async_tx operation (currently only
md-raid acceleration) then md-raid can assume that the callback is
being performed in an irq-enabled, non-sleepable context.  If you are
not providing an async_tx backend service then those constraints are
lifted.  I think I would like to add an explicit
CONFIG_DMA_SUPPORTS_ASYNC_TX option to clearly mark the intended use
model of the dma controller.
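
A hypothetical Kconfig entry along those lines - no such symbol exists
today:

# Hypothetical, per the suggestion above.
config DMA_SUPPORTS_ASYNC_TX
    bool
    depends on DMA_ENGINE
    help
      Selected by DMA drivers whose completion callbacks satisfy the
      async_tx constraints (run in an irq-enabled, non-sleepable
      context), marking the intended use model of the controller.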

> Also, I think it is possible to have an API that allows request submission from
> callbacks, which will be a very useful feature.
> Of course, assuming the clients know what they can/can't do (just like current
> DMA API or any other API).

It's a driver-specific implementation detail whether it supports submission
from the callback.  As a "general" rule clients should not assume that
all drivers support this, but in the architecture-specific case you
know which driver you are talking to, so this should not be an issue.

>
>
>>> callbacks, the API
>>>  may do two callbacks - 'quick' in irq context and 'lazy' from
>>>  tasklets scheduled from the IRQ. Most clients will provide either,
>>>  while some may provide both callback functions.
>>>
>>> b) There seems to be no clear way of reporting failed transfers. The
>>>    device_tx_status call can get FAIL/SUCCESS, but the call is
>>>    open-ended and can be performed without any time bound after
>>>    tx_submit. It is not very optimal for DMAC drivers to save
>>>    descriptors of all failed transactions until the channel is
>>>    released.
>>>    IMHO, provision of status checking by two mechanisms - cookie and
>>>    dma-done callbacks - is complication more than a feature. Perhaps
>>>    the dma engine could provide a default callback, should the client
>>>    not provide one, and track done/pending xfers for such requests?
>>
>> I agree the error handling was designed around mem-to-mem assumptions
>> where failures are due to double-bit ECC errors and other rare events.
> Well, I have never seen a DMA failure either, but a good API shouldn't
> count on h/w perfection.
>

It doesn't count on perfection; it treats failures the same way the
cpu would react to an unhandled data abort, i.e. panic.  I was thinking
of a case like sata where you might see dma errors on a daily basis.

>
>>> c) Conceptually, the channels are tightly coupled with the DMACs; there
>>>    seems to be no way to schedule a channel among more than one DMAC
>>>    at runtime, that is, if more than one DMAC supports the same
>>>    channel/peripheral.
>>>    For example, Samsung's S5Pxxxx have many channels available on more
>>>    than 1 DMAC, but with this dma api we have to statically assign
>>>    channels to DMACs, which may result in a channel acquire request
>>>    being rejected just because the DMAC we chose for it is already
>>>    fully busy while another DMAC, which also supports the channel, is
>>>    idling.
>>>    Unless we treat the same peripheral as, say, I2STX_viaDMAC1 and
>>>    I2STX_viaDMAC2 and allocate double resources for these "mutually
>>>    exclusive" channels.
>>
>> I am not understanding this example.  If both DMACs are registered the
>> dma_filter function to dma_request_channel() can select between them,
>> right?
> Let me be precise. The I2S_Tx FIFO (I2S peripheral/channel) can be reached
> by two DMACs but, of course, the channel can only be active on exactly
> one DMAC at a time.
> So, it is desirable to be able to reach the peripheral via the second
> DMAC should the first one be too busy to handle the request. Clearly
> this is a runtime decision.
> As it stands, I can associate the channel with either of the DMACs, and
> if that DMAC can't handle the I2S_Tx request (say because all of its
> h/w threads are allocated to other requests), I can't play audio even
> though the other DMAC may be simply idling.
>

Ah ok, you want load balancing between channels.  In that case the 1:1
nature of dma_request_channel() is not the right interface.  We would
need to develop something like an architecture-specific implementation
of dma_find_channel() to allow dynamic channel allocation at runtime.
But at that point we will have written something that is very
architecture specific; how could we implement that in a generic api?

Basically, if the driver does not want to present resources to generic
clients, does not want to use any of the existing generic channel
allocation mechanisms, and has narrow platform-specific needs, then why
code to/extend a generic api?

For example, the ppc440 dma driver had architecture-specific allocation
requirements (see arch/powerpc/include/asm/async_tx.h), but it still
wanted to service generic clients.

>>> d) Something like circular-linked-request is highly desirable for one
>>>    of the important DMA clients, i.e., audio.
>>
>> Is this a standing dma chain that periodically a client will say "go"
>> to re-run those operations?  Please enlighten me, I've never played
>> with audio drivers.
> Yes, quite similar. Only, alsa drivers will say "go" just once, at
> playback start, and the submitted xfer requests (called periods) are
> then repeatedly transferred in a circular manner.
> Just a quick snd_pcm_period_elapsed is called in the dma-done callback
> for each request (the requests are usually all the same length).
> That way, the client neither has to re-submit requests nor needs to do
> sleepable stuff (allocating memory for new requests and managing a
> local state machine).
> The minimum period size depends on audio latency, which depends on the
> ability to do dma-done callbacks asap.
> This is another example where clients would benefit from callbacks from
> IRQ context, which is also perfectly safe here.

Ok, thanks for the explanation.

>
>>> e) There seems to be no ScatterGather support for Mem to Mem transfers.
>>
>> There has never been a use case; what did you have in mind?  If
>> multiple prep_memcpy commands are too inefficient we could always add
>> another operation.
> Just that I believe any API should be as exhaustive and generic as
> possible. I see it as possible for multimedia devices/drivers to evolve
> to the point of needing such capabilities.
> Also, the way the DMA API treats memcpy/memset and assumes SG requests
> to be equivalent to MEM<=>DEV requests is not very impressive.
> IMHO, any submitted request should be a list of xfers, where an xfer is
> a 'memset' with 'src_len' bytes from 'src_addr' to be copied 'n' times
> at 'dst_addr'.
> Memcpy is just a special case of memset, where n := 1.
> This covers most possible use cases while being more compact and
> future-proof.

No, memset is an operation that does not have a source address and
instead writes a pattern.  As for the sg support for mem-to-mem
operations... like most things in Linux it was designed around its
users and none of the users at the time (md-raid, net-dma) required
scatter gather support.

Without seeing code it's hard to make a judgment on what can and cannot
fit in dmaengine, but it needs to be judged on what fits in a generic
api and the feasibility of forcing mem-to-mem, device-to-mem, and
device-to-device dma into one api.  I am skeptical we can address all
those concerns, but we at least have something passably functional for
the first two.  On the other hand, it's perfectly sane for subarchs
like pxa to have their own dma api.  If at the end of the day all that
matters is $arch-specific-dma then why mess around with a generic api?

--
Dan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/7] DMAENGINE: fixes and PrimeCells
  2010-05-09  7:47                   ` Dan Williams
@ 2010-05-09 10:06                     ` jassi brar
  -1 siblings, 0 replies; 28+ messages in thread
From: jassi brar @ 2010-05-09 10:06 UTC (permalink / raw)
  To: Dan Williams
  Cc: Linus Walleij, Russell King - ARM Linux, Ben Dooks, linux-mmc,
	linux-kernel, linux-arm-kernel

On Sun, May 9, 2010 at 4:47 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Sat, May 8, 2010 at 8:48 PM, jassi brar <jassisinghbrar@gmail.com> wrote:
>> On Sun, May 9, 2010 at 7:24 AM, Dan Williams <dan.j.williams@intel.com> wrote:
>>> On Fri, May 7, 2010 at 7:37 PM, jassi brar <jassisinghbrar@gmail.com> wrote:
>>>>  IMHO, a DMA api should be as quick as possible - callbacks done in IRQ context.
>>>>  But since there may be clients that need to do sleepable stuff in
>>>
>>> None of the current clients sleep in the callback, it's done in
>>> soft-irq context.  The only expectation is that hard-irqs are enabled
>>> during the callback just like timer callbacks.  I also would like to
>>> see numbers to quantify the claims of slowness.
>> The clients evolve around the API, so they don't do what the API
>> doesn't allow. Any API should try to impose as few constraints as
>> possible - you never know what kind of clients are going to arise.
>
> Running a callback in hard-irq context definitely puts constraints on
> the callback implementation to be as minimal as possible... and there
> is nothing stopping you from doing that today with the existing
> dmaengine interface: see idmac_interrupt.
We must plan for SoCs that have the same peripheral IPs but different
DMACs. In that case, we have one client driver and two DMAC drivers
behind the same DMA API. So, if the DMA API doesn't pin down such
assumptions, we can't have the driver work for both SoCs.

>> Let's say a protocol requires a 'quick' ACK (within a few usecs) on the
>> control bus after xfer'ing a large packet on the data bus. All the
>> client needs is to be able to toggle some bit of the device controller
>> after the DMA is done, which can very well be done in IRQ context but
>> may be too late if the callback is done from a tasklet scheduled from
>> the DMAC ISR.
>> The point being, a DMA API should be able to do callbacks from IRQ
>> context too. That is, assuming the clients know what they are doing.
>
> You are confusing async_tx constraints and dmaengine.  If your driver
> is providing the backend of an async_tx operation (currently only
> md-raid acceleration) then md-raid can assume that the callback is
> being performed in an irq-enabled, non-sleepable context.  If you are
> not providing an async_tx backend service then those constraints are
> lifted.  I think I would like to add an explicit
> CONFIG_DMA_SUPPORTS_ASYNC_TX option to clearly mark the intended use
> model of the dma controller.
Again, the client shouldn't need to know about the backend DMAC driver.
If some client can't stand 'async' ops, there should be some way for it
to ask the DMA API and find out.
IMHO, the DMA API should see async_tx as a special 'relaxed' case. Adding
new flags to differentiate only complicates the client drivers.

>> Also, I think it is possible to have an API that allows request submission from
>> callbacks, which will be a very useful feature.
>> Of course, assuming the clients know what they can/can't do (just like current
>> DMA API or any other API).
>
> It's a driver-specific implementation detail whether it supports submission
> from the callback.  As a "general" rule clients should not assume that
> all drivers support this, but in the architecture-specific case you
> know which driver you are talking to, so this should not be an issue.
Again, please look at the "Same client driver for different SoCs with
different DMACs" situation. DMA API needs to take a stand.

>>>> callbacks, the API
>>>>  may do two callbacks - 'quick' in irq context and 'lazy' from
>>>>  tasklets scheduled from the IRQ. Most clients will provide either,
>>>>  while some may provide both callback functions.
>>>>
>>>> b) There seems to be no clear way of reporting failed transfers. The
>>>>    device_tx_status call can get FAIL/SUCCESS, but the call is
>>>>    open-ended and can be performed without any time bound after
>>>>    tx_submit. It is not very optimal for DMAC drivers to save
>>>>    descriptors of all failed transactions until the channel is
>>>>    released.
>>>>    IMHO, provision of status checking by two mechanisms - cookie and
>>>>    dma-done callbacks - is complication more than a feature. Perhaps
>>>>    the dma engine could provide a default callback, should the client
>>>>    not provide one, and track done/pending xfers for such requests?
>>>
>>> I agree the error handling was designed around mem-to-mem assumptions
>>> where failures are due to double-bit ECC errors and other rare events.
>> Well, I have never seen a DMA failure either, but a good API shouldn't
>> count on h/w perfection.
>>
>
> It doesn't count on perfection; it treats failures the same way the
> cpu would react to an unhandled data abort, i.e. panic.  I was thinking
> of a case like sata where you might see dma errors on a daily basis.
Panicking is an extreme reaction, especially when the DMA API doesn't
provide any guarantee of time-bound operation - the client or the API
could simply retry after taking appropriate action.

>>>> c) Conceptually, the channels are tightly coupled with the DMACs; there
>>>>    seems to be no way to schedule a channel among more than one DMAC
>>>>    at runtime, that is, if more than one DMAC supports the same
>>>>    channel/peripheral.
>>>>    For example, Samsung's S5Pxxxx have many channels available on more
>>>>    than 1 DMAC, but with this dma api we have to statically assign
>>>>    channels to DMACs, which may result in a channel acquire request
>>>>    being rejected just because the DMAC we chose for it is already
>>>>    fully busy while another DMAC, which also supports the channel, is
>>>>    idling.
>>>>    Unless we treat the same peripheral as, say, I2STX_viaDMAC1 and
>>>>    I2STX_viaDMAC2 and allocate double resources for these "mutually
>>>>    exclusive" channels.
>>>
>>> I am not understanding this example.  If both DMACs are registered the
>>> dma_filter function to dma_request_channel() can select between them,
>>> right?
>> Let me be precise. The I2S_Tx FIFO (I2S peripheral/channel) can be reached
>> by two DMACs but, of course, the channel can only be active on exactly
>> one DMAC at a time.
>> So, it is desirable to be able to reach the peripheral via the second
>> DMAC should the first one be too busy to handle the request. Clearly
>> this is a runtime decision.
>> As it stands, I can associate the channel with either of the DMACs, and
>> if that DMAC can't handle the I2S_Tx request (say because all of its
>> h/w threads are allocated to other requests), I can't play audio even
>> though the other DMAC may be simply idling.
>>
>
> Ah ok, you want load balancing between channels.  In that case the 1:1
> nature of dma_request_channel() is not the right interface.  We would
> need to develop something like an architecture-specific implementation
> of dma_find_channel() to allow dynamic channel allocation at runtime.
> But at that point we will have written something that is very
> architecture specific; how could we implement that in a generic api?
In the S3C DMA API driver for the PL330 DMAC, I add all unique channels
and DMACs to system-wide channel and DMAC pools. DMACs have a capability
mask to help find which channels can be reached via that DMAC.
That way it becomes possible to do runtime mapping of channels onto
DMACs.
With the current DMA API? I don't know any easy way. But it is surely
possible to implement a generic API that does that.
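
A hedged sketch of that pool scheme, with all names invented for
illustration (none of these exist in the mainline S3C or PL330 code):

#include <linux/bitops.h>
#include <linux/list.h>

struct dmac {
    DECLARE_BITMAP(peri_cap, 32);  /* peripherals this DMAC reaches */
    unsigned int free_threads;     /* idle h/w threads left */
    struct list_head node;
};

/* Map a peripheral onto any capable DMAC that still has a free h/w
 * thread - a runtime decision instead of a static channel binding. */
static struct dmac *map_channel(struct list_head *pool, unsigned int peri_id)
{
    struct dmac *d;

    list_for_each_entry(d, pool, node)
        if (test_bit(peri_id, d->peri_cap) && d->free_threads)
            return d;

    return NULL;  /* all capable DMACs are fully busy */
}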

> Basically, if the driver does not want to present resources to generic
> clients, does not want to use any of the existing generic channel
> allocation mechanisms, and has narrow platform-specific needs, then why
> code to/extend a generic api?
Sometimes there is a limit on the number of concurrent channels that
the DMAC can handle. For example, the PL330 can have only 8 h/w threads
but 32 possible peripherals/channels, allowing a maximum of 8 channels
to be active at any time. In such situations, the DMAC may have to
refuse channel allocation until some other client releases a channel,
thereby freeing up a h/w thread.

>>>> e) There seems to be no ScatterGather support for Mem to Mem transfers.
>>>
>>> There has never been a use case; what did you have in mind?  If
>>> multiple prep_memcpy commands are too inefficient we could always add
>>> another operation.
>> Just that I believe any API should be as exhaustive and generic as
>> possible. I see it as possible for multimedia devices/drivers to evolve
>> to the point of needing such capabilities.
>> Also, the way the DMA API treats memcpy/memset and assumes SG requests
>> to be equivalent to MEM<=>DEV requests is not very impressive.
>> IMHO, any submitted request should be a list of xfers, where an xfer is
>> a 'memset' with 'src_len' bytes from 'src_addr' to be copied 'n' times
>> at 'dst_addr'.
>> Memcpy is just a special case of memset, where n := 1.
>> This covers most possible use cases while being more compact and
>> future-proof.
>
> No, memset is an operation that does not have a source address and
> instead writes a pattern.
That sounds like assuming memset means writing a single value multiple
times.

Most DMACs come with optional SRC/DST increment bits to control the
copy operation. That makes memset blend nicely with memcpy requests.

Memset might not have a source address, but for most DMACs the driver
would have to allocate 4 dma-coherent bytes (if the memset unit is an
int) and most likely treat it as a memcpy request with src_inc := 0 and
dst_inc := 1.
For DMACs that do support a direct memset operation, there would be a
limit on the size of the pattern, making them less flexible than those
without direct memset support.

So, IMHO, memset is better seen as writing a data pattern multiple times
rather than writing a single value multiple times, because across
clients and over time the unit size may vary. Let's not put a limit on
the unit size of the memset operation.
What if I want to set 1800 bytes of memory with a pattern of 9 bytes?
200 memcpy requests, OR allocate 9 dma-coherent bytes and do 1 memset?
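
In the hypothetical xfer terms sketched earlier in the thread, that case
collapses to a single descriptor; pattern_dma and buf_dma stand for the
pattern and destination DMA addresses:

/* One xfer covers it: the 9-byte pattern written 200 times
 * (9 * 200 = 1800), i.e. src_inc := 0 semantics restart the pattern
 * while dst_inc := 1 advances the destination. */
struct xfer fill = {
    .src_addr = pattern_dma,  /* the 9 dma-coherent pattern bytes */
    .src_len  = 9,
    .dst_addr = buf_dma,      /* the 1800-byte destination */
    .n        = 200,
};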

>  As for the sg support for mem-to-mem
> operations... like most things in Linux it was designed around its
> users and none of the users at the time (md-raid, net-dma) required
> scatter gather support.
Yes, but if a request is defined as an 'SG list of memset ops', we pay
no extra price for having Mem-to-Mem SG capability. And one doesn't
need to look at the potential users of the API.
All DMAC drivers would then have just one 'prepare' rather than three.

> Without seeing code it's hard to make a judgment on what can and cannot
> fit in dmaengine, but it needs to be judged on what fits in a generic
> api and the feasibility of forcing mem-to-mem, device-to-mem, and
> device-to-device dma into one api.  I am skeptical we can address all
> those concerns, but we at least have something passably functional for
> the first two.
I admit I have little idea about a Dev->Dev implementation, but the
other two should be possible to implement in a generic way.
This discussion is purely about what the current DMA API misses and what
a generic DMA API should do, so that the current DMA API can fill those
gaps, if possible. I would love to get started implementing the generic
DMA API for reference, but my priorities are decided by my employer.

>  On the other hand, it's perfectly sane for subarchs
> like pxa to have their own dma api.  If at the end of the day all that
> matters is $arch-specific-dma then why mess around with a generic api?
This is unlikely to hold for long. SoCs are more and more becoming a
cocktail of off-the-shelf third-party device IPs, where the device IPs
may be the same but the DMAC IPs different.

Regards.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/7] DMAENGINE: fixes and PrimeCells
  2010-05-09 10:06                     ` jassi brar
@ 2010-05-09 17:26                       ` Dan Williams
  -1 siblings, 0 replies; 28+ messages in thread
From: Dan Williams @ 2010-05-09 17:26 UTC (permalink / raw)
  To: jassi brar
  Cc: Linus Walleij, Russell King - ARM Linux, Ben Dooks, linux-mmc,
	linux-kernel, linux-arm-kernel

On Sun, May 9, 2010 at 3:06 AM, jassi brar <jassisinghbrar@gmail.com> wrote:
> This discussion is purely about what the current DMA API misses and what
> a generic DMA API should do, so that the current DMA API can fill those
> gaps, if possible. I would love to get started implementing the generic
> DMA API for reference, but my priorities are decided by my employer.

Well, the only significant miss that has been identified so far is
dynamic channel allocation for the device-to-mem case.  Everything
else can be done with small tweaks to the existing interface.  But
some of this discussion reminds me of Section 2.4 of
Documentation/SubmittingPatches:

4) Don't over-design.

Don't try to anticipate nebulous future cases which may or may not
be useful:  "Make it as simple as you can, and no simpler."

Let's just wait for the code before outlining what can and cannot be
done, especially given where we started [1].

--
Dan

[1]: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=include/linux/dmaengine.h;hb=v2.6.18#l160

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/7] DMAENGINE: fixes and PrimeCells
  2010-05-09 17:26                       ` Dan Williams
@ 2010-05-09 22:51                         ` jassi brar
  -1 siblings, 0 replies; 28+ messages in thread
From: jassi brar @ 2010-05-09 22:51 UTC (permalink / raw)
  To: Dan Williams
  Cc: Linus Walleij, Russell King - ARM Linux, Ben Dooks, linux-mmc,
	linux-kernel, linux-arm-kernel

On Mon, May 10, 2010 at 2:26 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Sun, May 9, 2010 at 3:06 AM, jassi brar <jassisinghbrar@gmail.com> wrote:
>> This discussion is purely about what the current DMA API misses and what
>> a generic DMA API should do, so that the current DMA API can fill those
>> gaps, if possible. I would love to get started implementing the generic
>> DMA API for reference, but my priorities are decided by my employer.
>
> Well, the only significant miss that has been identified so far is
> dynamic channel allocation for the device-to-mem case.  Everything
> else can be done with small tweaks to the existing interface.  But
> some of this discussion reminds me of Section 2.4 of
> Documentation/SubmittingPatches:
>
> 4) Don't over-design.
>
> Don't try to anticipate nebulous future cases which may or may not
> be useful:  "Make it as simple as you can, and no simpler."
There is a fine line between anticipating future requirements and
over-designing.

We already have Samsung's and STM's SoCs sharing a DMAC IP.

Especially with PrimeCells, we are soon likely to see SoCs sharing a
peripheral IP while having different DMACs.

Some elements may be ideal design at the cost of over-design, but not all.
Let us not simply brush aside all concerns as over-designing.

^ permalink raw reply	[flat|nested] 28+ messages in thread
