* [PATCH] DMAEngine: Define generic transfer request api
@ 2011-08-12 11:14 Jassi Brar
  2011-08-16 12:56 ` Koul, Vinod
  2011-09-15  7:46 ` [PATCHv2] " Jassi Brar
  0 siblings, 2 replies; 131+ messages in thread
From: Jassi Brar @ 2011-08-12 11:14 UTC (permalink / raw)
  To: dan.j.williams, vkoul, linux-kernel
  Cc: sundaram, linus.walleij, linux-omap, rmk+kernel, nsekhar, Jassi Brar

Define a new api that could be used for doing fancy data transfers,
like interleaved-to-contiguous copy and vice versa.
Traditional SG_list based transfers tend to be very inefficient in
such cases, where the interleave and chunk are only a few bytes,
which calls for a very condensed api to convey the pattern of the transfer.

This api supports all 4 variants of scatter-gather and contiguous transfer.
Besides, it could also represent common operations like
        device_prep_dma_{cyclic, memset, memcpy}
and maybe some more that I am not sure of.

Of course, neither can this api help transfers that don't lend themselves
to DMA by nature, i.e., scattered tiny read/writes with no periodic pattern.

Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
---
 include/linux/dmaengine.h |   73 +++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 73 insertions(+), 0 deletions(-)

diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 8fbf40e..74f3ae0 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -76,6 +76,76 @@ enum dma_transaction_type {
 /* last transaction type for creation of the capabilities mask */
 #define DMA_TX_TYPE_END (DMA_CYCLIC + 1)
 
+/**
+ * Generic Transfer Request
+ * ------------------------
+ * A chunk is a collection of contiguous bytes to be transferred.
+ * The gap (in bytes) between two chunks is called the inter-chunk-gap (ICG).
+ * ICGs may or may not change between chunks.
+ * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
+ *  that when repeated an integral number of times, specifies the transfer.
+ * A transfer template is a specification of a Frame, the number of times
+ *  it is to be repeated and other per-transfer attributes.
+ *
+ * Practically, a client driver would have ready a template for each
+ *  type of transfer it is going to need during its lifetime and
+ *  set only 'src_start' and 'dst_start' before submitting the requests.
+ *
+ *
+ *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
+ *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
+ *
+ *    ==  Chunk size
+ *    ... ICG
+ */
+
+/**
+ * struct data_chunk - Element of scatter-gather list that makes a frame.
+ * @size: Number of bytes to read from source.
+ *	  size_dst := fn(op, size_src), so doesn't mean much for destination.
+ * @icg: Number of bytes to jump after last src/dst address of this
+ *	 chunk and before first src/dst address for next chunk.
+ *	 Ignored for dst (assumed 0), if dst_inc is true and dst_sgl is false.
+ *	 Ignored for src (assumed 0), if src_inc is true and src_sgl is false.
+ */
+struct data_chunk {
+	size_t size;
+	size_t icg;
+};
+
+/**
+ * struct xfer_template - Template conveying the transfer pattern
+ *	 and attributes to the DMAC.
+ * @op: The operation to perform on the source data before writing it
+ *	 to the destination address.
+ * @src_start: Bus address of source for the first chunk.
+ * @dst_start: Bus address of destination for the first chunk.
+ * @src_inc: If the source address increments after reading from it.
+ * @dst_inc: If the destination address increments after writing to it.
+ * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
+ *		Otherwise, source is read contiguously (icg ignored).
+ *		Ignored if src_inc is false.
+ * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
+ *		Otherwise, destination is filled contiguously (icg ignored).
+ *		Ignored if dst_inc is false.
+ * @frm_irq: If the client expects the DMAC driver to do a callback after each frame.
+ * @numf: Number of frames in this template.
+ * @frame_size: Number of chunks in a frame, i.e., the size of sgl[].
+ * @sgl: Array of {chunk,icg} pairs that make up a frame.
+ */
+struct xfer_template {
+	enum dma_transaction_type op;
+	dma_addr_t src_start;
+	dma_addr_t dst_start;
+	bool src_inc;
+	bool dst_inc;
+	bool src_sgl;
+	bool dst_sgl;
+	bool frm_irq;
+	size_t numf;
+	size_t frame_size;
+	struct data_chunk sgl[0];
+};
 
 /**
  * enum dma_ctrl_flags - DMA flags to augment operation preparation,
@@ -432,6 +502,7 @@ struct dma_tx_state {
  * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
  *	The function takes a buffer of size buf_len. The callback function will
  *	be called after period_len bytes have been transferred.
+ * @device_prep_dma_genxfer: prepare a transfer described by a generic template.
  * @device_control: manipulate all pending operations on a channel, returns
  *	zero or error code
  * @device_tx_status: poll for transaction completion, the optional
@@ -496,6 +567,8 @@ struct dma_device {
 	struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
 		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
 		size_t period_len, enum dma_data_direction direction);
+	struct dma_async_tx_descriptor *(*device_prep_dma_genxfer)(
+		struct dma_chan *chan, struct xfer_template *xt);
 	int (*device_control)(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
 		unsigned long arg);
 
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 131+ messages in thread

* Re: [PATCH] DMAEngine: Define generic transfer request api
  2011-08-12 11:14 [PATCH] DMAEngine: Define generic transfer request api Jassi Brar
@ 2011-08-16 12:56 ` Koul, Vinod
  2011-08-16 13:06   ` Linus Walleij
  2011-08-16 14:32   ` Jassi Brar
  2011-09-15  7:46 ` [PATCHv2] " Jassi Brar
  1 sibling, 2 replies; 131+ messages in thread
From: Koul, Vinod @ 2011-08-16 12:56 UTC (permalink / raw)
  To: Jassi Brar, sundaram
  Cc: dan.j.williams, linux-kernel, linus.walleij, linux-omap,
	rmk+kernel, nsekhar

On Fri, 2011-08-12 at 16:44 +0530, Jassi Brar wrote:
> Define a new api that could be used for doing fancy data transfers,
> like interleaved-to-contiguous copy and vice versa.
> Traditional SG_list based transfers tend to be very inefficient in
> such cases, where the interleave and chunk are only a few bytes,
> which calls for a very condensed api to convey the pattern of the transfer.
> 
> This api supports all 4 variants of scatter-gather and contiguous transfer.
> Besides, it could also represent common operations like
>         device_prep_dma_{cyclic, memset, memcpy}
> and maybe some more that I am not sure of.
> 
> Of course, neither can this api help transfers that don't lend themselves
> to DMA by nature, i.e., scattered tiny read/writes with no periodic pattern.
> 
> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
> ---
>  include/linux/dmaengine.h |   73 +++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 73 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index 8fbf40e..74f3ae0 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -76,6 +76,76 @@ enum dma_transaction_type {
>  /* last transaction type for creation of the capabilities mask */
>  #define DMA_TX_TYPE_END (DMA_CYCLIC + 1)
>  
> +/**
> + * Generic Transfer Request
> + * ------------------------
> + * A chunk is a collection of contiguous bytes to be transferred.
> + * The gap (in bytes) between two chunks is called the inter-chunk-gap (ICG).
> + * ICGs may or may not change between chunks.
> + * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
> + *  that when repeated an integral number of times, specifies the transfer.
> + * A transfer template is a specification of a Frame, the number of times
> + *  it is to be repeated and other per-transfer attributes.
> + *
> + * Practically, a client driver would have ready a template for each
> + *  type of transfer it is going to need during its lifetime and
> + *  set only 'src_start' and 'dst_start' before submitting the requests.
> + *
> + *
> + *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
> + *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
> + *
> + *    ==  Chunk size
> + *    ... ICG
> + */
> +
> +/**
> + * struct data_chunk - Element of scatter-gather list that makes a frame.
> + * @size: Number of bytes to read from source.
> + *	  size_dst := fn(op, size_src), so doesn't mean much for destination.
> + * @icg: Number of bytes to jump after last src/dst address of this
> + *	 chunk and before first src/dst address for next chunk.
> + *	 Ignored for dst(assumed 0), if dst_inc is true and dst_sgl is false.
> + *	 Ignored for src(assumed 0), if src_inc is true and src_sgl is false.
> + */
> +struct data_chunk {
> +	size_t size;
> +	size_t icg;
> +};
> +
> +/**
> + * struct xfer_template - Template to convey DMAC the transfer pattern
> + *	 and attributes.
> + * @op: The operation to perform on source data before writing it on
> + *	 to destination address.
> + * @src_start: Bus address of source for the first chunk.
> + * @dst_start: Bus address of destination for the first chunk.
> + * @src_inc: If the source address increments after reading from it.
> + * @dst_inc: If the destination address increments after writing to it.
> + * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
> + *		Otherwise, source is read contiguously (icg ignored).
> + *		Ignored if src_inc is false.
> + * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
> + *		Otherwise, destination is filled contiguously (icg ignored).
> + *		Ignored if dst_inc is false.
> + * @frm_irq: If the client expects DMAC driver to do callback after each frame.
> + * @numf: Number of frames in this template.
> + * @frame_size: Number of chunks in a frame i.e, size of sgl[].
> + * @sgl: Array of {chunk,icg} pairs that make up a frame.
> + */
> +struct xfer_template {
> +	enum dma_transaction_type op;
> +	dma_addr_t src_start;
> +	dma_addr_t dst_start;
> +	bool src_inc;
> +	bool dst_inc;
> +	bool src_sgl;
> +	bool dst_sgl;
> +	bool frm_irq;
> +	size_t numf;
> +	size_t frame_size;
> +	struct data_chunk sgl[0];
> +};
>  
>  /**
>   * enum dma_ctrl_flags - DMA flags to augment operation preparation,
> @@ -432,6 +502,7 @@ struct dma_tx_state {
>   * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
>   *	The function takes a buffer of size buf_len. The callback function will
>   *	be called after period_len bytes have been transferred.
> + * @device_prep_dma_genxfer: Transfer expression in a generic way.
>   * @device_control: manipulate all pending operations on a channel, returns
>   *	zero or error code
>   * @device_tx_status: poll for transaction completion, the optional
> @@ -496,6 +567,8 @@ struct dma_device {
>  	struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
>  		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
>  		size_t period_len, enum dma_data_direction direction);
> +	struct dma_async_tx_descriptor *(*device_prep_dma_genxfer)(
> +		struct dma_chan *chan, struct xfer_template *xt);
>  	int (*device_control)(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
>  		unsigned long arg);
>  
Do you have a driver which is using the API? It would help to evaluate this
API when we see the usage :)

Currently we have two approaches to solve this problem, the first being the
DMA_STRIDE_CONFIG proposed by Linus W. I feel this one is the better
approach, as it gives the client the ability to configure each transfer
rather than setting it for the channel. Linus W, do you agree?

Sundaram, does this fit the usage you folks wanted?


-- 
~Vinod


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] DMAEngine: Define generic transfer request api
  2011-08-16 12:56 ` Koul, Vinod
@ 2011-08-16 13:06   ` Linus Walleij
  2011-08-19 13:43     ` Koul, Vinod
  2011-08-16 14:32   ` Jassi Brar
  1 sibling, 1 reply; 131+ messages in thread
From: Linus Walleij @ 2011-08-16 13:06 UTC (permalink / raw)
  To: Koul, Vinod
  Cc: Jassi Brar, sundaram, dan.j.williams, linux-kernel, linux-omap,
	rmk+kernel, nsekhar

On Tue, Aug 16, 2011 at 2:56 PM, Koul, Vinod <vinod.koul@intel.com> wrote:

> Currently we have two approaches to solve this problem, the first being the
> DMA_STRIDE_CONFIG proposed by Linus W. I feel this one is the better
> approach, as it gives the client the ability to configure each transfer
> rather than setting it for the channel. Linus W, do you agree?

I think Sundaram is in the position of doing some heavy work on
using one or the other of the APIs, and I think he is better
suited than any of the rest of us to select what scheme to use;
in the end he's going to write the first code using the API.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] DMAEngine: Define generic transfer request api
  2011-08-16 12:56 ` Koul, Vinod
  2011-08-16 13:06   ` Linus Walleij
@ 2011-08-16 14:32   ` Jassi Brar
  1 sibling, 0 replies; 131+ messages in thread
From: Jassi Brar @ 2011-08-16 14:32 UTC (permalink / raw)
  To: Koul, Vinod
  Cc: sundaram, dan.j.williams, linux-kernel, linus.walleij,
	linux-omap, rmk+kernel, nsekhar

On 16 August 2011 18:26, Koul, Vinod <vinod.koul@intel.com> wrote:
> On Fri, 2011-08-12 at 16:44 +0530, Jassi Brar wrote:
>> Define a new api that could be used for doing fancy data transfers
>> like interleaved to contiguous copy and vice-versa.
>> Traditional SG_list based transfers tend to be very inefficient in
>> such cases as where the interleave and chunk are only a few bytes,
>> which call for a very condensed api to convey pattern of the transfer.
>>
>> This api supports all 4 variants of scatter-gather and contiguous transfer.
>> Besides, it could also represent common operations like
>>         device_prep_dma_{cyclic, memset, memcpy}
>> and maybe some more that I am not sure of.
>>
>> Of course, neither can this api help transfers that don't lend to DMA by
>> nature, i.e, scattered tiny read/writes with no periodic pattern.
>>
>> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
>> ---
>>  include/linux/dmaengine.h |   73 +++++++++++++++++++++++++++++++++++++++++++++
>>  1 files changed, 73 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
>> index 8fbf40e..74f3ae0 100644
>> --- a/include/linux/dmaengine.h
>> +++ b/include/linux/dmaengine.h
>> @@ -76,6 +76,76 @@ enum dma_transaction_type {
>>  /* last transaction type for creation of the capabilities mask */
>>  #define DMA_TX_TYPE_END (DMA_CYCLIC + 1)
>>
>> +/**
>> + * Generic Transfer Request
>> + * ------------------------
>> + * A chunk is collection of contiguous bytes to be transfered.
>> + * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
>> + * ICGs may or maynot change between chunks.
>> + * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
>> + *  that when repeated an integral number of times, specifies the transfer.
>> + * A transfer template is specification of a Frame, the number of times
>> + *  it is to be repeated and other per-transfer attributes.
>> + *
>> + * Practically, a client driver would have ready a template for each
>> + *  type of transfer it is going to need during its lifetime and
>> + *  set only 'src_start' and 'dst_start' before submitting the requests.
>> + *
>> + *
>> + *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
>> + *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
>> + *
>> + *    ==  Chunk size
>> + *    ... ICG
>> + */
>> +
>> +/**
>> + * struct data_chunk - Element of scatter-gather list that makes a frame.
>> + * @size: Number of bytes to read from source.
>> + *     size_dst := fn(op, size_src), so doesn't mean much for destination.
>> + * @icg: Number of bytes to jump after last src/dst address of this
>> + *    chunk and before first src/dst address for next chunk.
>> + *    Ignored for dst(assumed 0), if dst_inc is true and dst_sgl is false.
>> + *    Ignored for src(assumed 0), if src_inc is true and src_sgl is false.
>> + */
>> +struct data_chunk {
>> +     size_t size;
>> +     size_t icg;
>> +};
>> +
>> +/**
>> + * struct xfer_template - Template to convey DMAC the transfer pattern
>> + *    and attributes.
>> + * @op: The operation to perform on source data before writing it on
>> + *    to destination address.
>> + * @src_start: Bus address of source for the first chunk.
>> + * @dst_start: Bus address of destination for the first chunk.
>> + * @src_inc: If the source address increments after reading from it.
>> + * @dst_inc: If the destination address increments after writing to it.
>> + * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
>> + *           Otherwise, source is read contiguously (icg ignored).
>> + *           Ignored if src_inc is false.
>> + * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
>> + *           Otherwise, destination is filled contiguously (icg ignored).
>> + *           Ignored if dst_inc is false.
>> + * @frm_irq: If the client expects DMAC driver to do callback after each frame.
>> + * @numf: Number of frames in this template.
>> + * @frame_size: Number of chunks in a frame i.e, size of sgl[].
>> + * @sgl: Array of {chunk,icg} pairs that make up a frame.
>> + */
>> +struct xfer_template {
>> +     enum dma_transaction_type op;
>> +     dma_addr_t src_start;
>> +     dma_addr_t dst_start;
>> +     bool src_inc;
>> +     bool dst_inc;
>> +     bool src_sgl;
>> +     bool dst_sgl;
>> +     bool frm_irq;
>> +     size_t numf;
>> +     size_t frame_size;
>> +     struct data_chunk sgl[0];
>> +};
>>
>>  /**
>>   * enum dma_ctrl_flags - DMA flags to augment operation preparation,
>> @@ -432,6 +502,7 @@ struct dma_tx_state {
>>   * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
>>   *   The function takes a buffer of size buf_len. The callback function will
>>   *   be called after period_len bytes have been transferred.
>> + * @device_prep_dma_genxfer: Transfer expression in a generic way.
>>   * @device_control: manipulate all pending operations on a channel, returns
>>   *   zero or error code
>>   * @device_tx_status: poll for transaction completion, the optional
>> @@ -496,6 +567,8 @@ struct dma_device {
>>       struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
>>               struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
>>               size_t period_len, enum dma_data_direction direction);
>> +     struct dma_async_tx_descriptor *(*device_prep_dma_genxfer)(
>> +             struct dma_chan *chan, struct xfer_template *xt);
>>       int (*device_control)(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
>>               unsigned long arg);
>>
> Do you have a driver which is using the API? It would help to evaluate this
> API when we see the usage :)
Sorry, not in public and not for DMAENGINE.
It will help if you see it as a generic[1] solution to the requirement of
supporting interleaved transfers with DMACs like TI's EDMA, SDMA,
ARM's PL330 and many more in the future.

> Currently we have two approaches to solve this problem, the first being the
> DMA_STRIDE_CONFIG proposed by Linus W. I feel this one is the better
> approach, as it gives the client the ability to configure each transfer
> rather than setting it for the channel.
To me, doing it via a control command vs. a new transfer type is secondary
to doing it in a generic way. And yes, I too believe this way is better
because it is more flexible.


[1] Btw, I haven't yet added an option to specify byte/half-word/word swap
during the data transfer - which I once had to do with PL330 in a weird setup.

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] DMAEngine: Define generic transfer request api
  2011-08-16 13:06   ` Linus Walleij
@ 2011-08-19 13:43     ` Koul, Vinod
  2011-08-19 14:19       ` Linus Walleij
  2011-08-23 14:43         ` Matt Porter
  0 siblings, 2 replies; 131+ messages in thread
From: Koul, Vinod @ 2011-08-19 13:43 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Jassi Brar, sundaram, dan.j.williams, linux-kernel, linux-omap,
	rmk+kernel, nsekhar

On Tue, 2011-08-16 at 15:06 +0200, Linus Walleij wrote:
> On Tue, Aug 16, 2011 at 2:56 PM, Koul, Vinod <vinod.koul@intel.com> wrote:
> 
> > Currently we have two approaches to solve this problem first being the
> > DMA_STRIDE_CONFIG proposed by Linus W, I feel this one is better
> > approach as this can give client ability to configure each transfer
> > rather than set for the channel. Linus W, do you agree?
> 
> I think Sundaram is in the position of doing some heavy work on
> using one or the other of the API:s, and I think he is better
> suited than anyone else of us to select what scheme to use,
> in the end he's going to write the first code using the API.
And unfortunately the TI folks don't seem to care about this discussion :(
Haven't seen anything on this from them, or on the previous RFC by Jassi.

-- 
~Vinod


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] DMAEngine: Define generic transfer request api
  2011-08-19 13:43     ` Koul, Vinod
@ 2011-08-19 14:19       ` Linus Walleij
  2011-08-19 15:46         ` Jassi Brar
  2011-08-23 14:43         ` Matt Porter
  1 sibling, 1 reply; 131+ messages in thread
From: Linus Walleij @ 2011-08-19 14:19 UTC (permalink / raw)
  To: Koul, Vinod
  Cc: Jassi Brar, sundaram, dan.j.williams, linux-kernel, linux-omap,
	rmk+kernel, nsekhar

2011/8/19 Koul, Vinod <vinod.koul@intel.com>:
> On Tue, 2011-08-16 at 15:06 +0200, Linus Walleij wrote:
>> On Tue, Aug 16, 2011 at 2:56 PM, Koul, Vinod <vinod.koul@intel.com> wrote:

>> I think Sundaram is in the position of doing some heavy work on
>> using one or the other of the API:s, and I think he is better
>> suited than anyone else of us to select what scheme to use,
>> in the end he's going to write the first code using the API.

> And unfortunately the TI folks don't seem to care about this discussion :(
> Haven't seen anything on this from them, or on the previous RFC by Jassi.

Well if there is no code using the API then there is no rush
in merging it either, I guess. Whenever someone (TI or
Samsung) cooks up some driver patches they can choose their
approach.

Thanks,
Linus Walleij

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] DMAEngine: Define generic transfer request api
  2011-08-19 14:19       ` Linus Walleij
@ 2011-08-19 15:46         ` Jassi Brar
  2011-08-19 17:28           ` Koul, Vinod
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-08-19 15:46 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Koul, Vinod, sundaram, dan.j.williams, linux-kernel, linux-omap,
	rmk+kernel, nsekhar

On 19 August 2011 19:49, Linus Walleij <linus.ml.walleij@gmail.com> wrote:
> 2011/8/19 Koul, Vinod <vinod.koul@intel.com>:
>> On Tue, 2011-08-16 at 15:06 +0200, Linus Walleij wrote:
>>> On Tue, Aug 16, 2011 at 2:56 PM, Koul, Vinod <vinod.koul@intel.com> wrote:
>
>>> I think Sundaram is in the position of doing some heavy work on
>>> using one or the other of the API:s, and I think he is better
>>> suited than anyone else of us to select what scheme to use,
>>> in the end he's going to write the first code using the API.
>
>> And Unfortunately TI folks don't seem to care about this discussion :(
>> Haven't seen anything on this from them, or on previous RFC by Jassi
>
> Well if there is no code using the API then there is no rush
> in merging it either, I guess. Whenever someone (TI or
> Samsung) cooks up some driver patches they can choose their
> approach.
>
No, it's not a matter of "choice".
If that were the case, Sundaram already proposed a TI-specific
flag. Why wait for him to tell his choice again?

You might, but I can't bring myself to believe that a vendor-specific
flag could be better than a generic solution.
Not here at least, where the overhead due to generality is not much.
(though I can trim some 'futuristic' members from the 'struct xfer_template')

Maintainers might wait as long as they want, but there should never
be an option to have vendor-specific hacks.

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] DMAEngine: Define generic transfer request api
  2011-08-19 15:46         ` Jassi Brar
@ 2011-08-19 17:28           ` Koul, Vinod
  2011-08-19 18:45             ` Jassi Brar
  0 siblings, 1 reply; 131+ messages in thread
From: Koul, Vinod @ 2011-08-19 17:28 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Linus Walleij, sundaram, Williams, Dan J, linux-kernel,
	linux-omap, rmk+kernel, nsekhar

On Fri, 2011-08-19 at 21:16 +0530, Jassi Brar wrote:
> On 19 August 2011 19:49, Linus Walleij <linus.ml.walleij@gmail.com> wrote:
> > 2011/8/19 Koul, Vinod <vinod.koul@intel.com>:
> >> On Tue, 2011-08-16 at 15:06 +0200, Linus Walleij wrote:
> >>> On Tue, Aug 16, 2011 at 2:56 PM, Koul, Vinod <vinod.koul@intel.com> wrote:
> >
> >>> I think Sundaram is in the position of doing some heavy work on
> >>> using one or the other of the API:s, and I think he is better
> >>> suited than anyone else of us to select what scheme to use,
> >>> in the end he's going to write the first code using the API.
> >
> >> And Unfortunately TI folks don't seem to care about this discussion :(
> >> Haven't seen anything on this from them, or on previous RFC by Jassi
> >
> > Well if there is no code usig the API then there is no rush
> > in merging it either I guess. Whenever someone (TI or
> > Samsung) cook some driver patches they can choose their
> > approach.
> >
> No, it's not a matter of "choice".
> If that were the case, Sundaram already proposed a TI specific
> flag. Why wait for him to tell his choice again?
> 
> You might, but I can't bring myself to believe that a vendor-specific
> flag could be better than a generic solution.
> Not here at least, where the overhead due to generality is not much.
> (though I can trim some 'futuristic' members from the 'struct xfer_template')
Who said anything about adding a vendor flag solution? Since TI are
potential users of the API, it would be good to know if this fits their
needs or not. If they don't care, we can't help it...

> 
> Maintainers might wait as long as they want, but there should never
> be an option to have vendor specific hacks.
To me the API looks decent after reading some specs of DMACs which support
this mode. Pls send an updated patch along with one driver which uses it.
Should be good to go...


-- 
~Vinod


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] DMAEngine: Define generic transfer request api
  2011-08-19 17:28           ` Koul, Vinod
@ 2011-08-19 18:45             ` Jassi Brar
  0 siblings, 0 replies; 131+ messages in thread
From: Jassi Brar @ 2011-08-19 18:45 UTC (permalink / raw)
  To: Koul, Vinod
  Cc: Linus Walleij, sundaram, Williams, Dan J, linux-kernel,
	linux-omap, rmk+kernel, nsekhar

On 19 August 2011 22:58, Koul, Vinod <vinod.koul@intel.com> wrote:
> On Fri, 2011-08-19 at 21:16 +0530, Jassi Brar wrote:
>> On 19 August 2011 19:49, Linus Walleij <linus.ml.walleij@gmail.com> wrote:
>> > 2011/8/19 Koul, Vinod <vinod.koul@intel.com>:
>> >> On Tue, 2011-08-16 at 15:06 +0200, Linus Walleij wrote:
>> >>> On Tue, Aug 16, 2011 at 2:56 PM, Koul, Vinod <vinod.koul@intel.com> wrote:
>> >
>> >>> I think Sundaram is in the position of doing some heavy work on
>> >>> using one or the other of the API:s, and I think he is better
>> >>> suited than anyone else of us to select what scheme to use,
>> >>> in the end he's going to write the first code using the API.
>> >
>> >> And Unfortunately TI folks don't seem to care about this discussion :(
>> >> Haven't seen anything on this from them, or on previous RFC by Jassi
>> >
>> > Well if there is no code usig the API then there is no rush
>> > in merging it either I guess. Whenever someone (TI or
>> > Samsung) cook some driver patches they can choose their
>> > approach.
>> >
>> No, it's not a matter of "choice".
>> If that were the case, Sundaram already proposed a TI specific
>> flag. Why wait for him to tell his choice again?
>>
>> You might, but I can't molest my sensibility to believe that a Vendor
>> specific flag could be better than a generic solution.
>> Not here at least, where the overhead due to generality is not much.
>> (though I can trim some 'futuristic' members from the 'struct xfer_template')
> Who said anything about adding a vendor flag solution,
Not you, but the person I replied to - LinusW. See https://lkml.org/lkml/2011/7/11/74

> since TI are potential users of the API it would be good to know if this fits their needs
> or not.
I am super-interested to hear from the TI guys.
The generic api here rather supports the case Sundaram projected as the
'most' general case. Look at the figure at the end of
https://lkml.org/lkml/2011/6/9/343

>> Maintainers might wait as long as they want, but there should never
>> be an option to have vendor specific hacks.
> to me API looks decent after reading some specs of DMACs which support
> this mode. Pls send updated patch along with one driver which uses it.
> Should be good to go...
That has been one problem with DMAEngine. People patch the API as they
need stuff, rather than starting from a solid, thought-out API that could
be extended consistently.
For example, we ended up having _ten_ device_prep_dma_* callbacks, whereas
we could have started with 1 (or maybe 2) 'generic' prepare callbacks
taking an 'enum dma_transaction_type op' argument.
IMO it's rather good that I designed this API for a theoretical
highly-capable DMAC, and not just the DMAC I've worked on - which would
have constrained the api in the future for other DMACs.

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] DMAEngine: Define generic transfer request api
  2011-08-19 13:43     ` Koul, Vinod
@ 2011-08-23 14:43         ` Matt Porter
  1 sibling, 0 replies; 131+ messages in thread
From: Matt Porter @ 2011-08-23 14:43 UTC (permalink / raw)
  To: Koul, Vinod
  Cc: Linus Walleij, Jassi Brar, sundaram, dan.j.williams,
	linux-kernel, linux-omap, rmk+kernel, nsekhar

On Aug 19, 2011, at 9:43 AM, Koul, Vinod wrote:

> On Tue, 2011-08-16 at 15:06 +0200, Linus Walleij wrote:
>> On Tue, Aug 16, 2011 at 2:56 PM, Koul, Vinod <vinod.koul@intel.com> wrote:
>> 
>>> Currently we have two approaches to solve this problem first being the
>>> DMA_STRIDE_CONFIG proposed by Linus W, I feel this one is better
>>> approach as this can give client ability to configure each transfer
>>> rather than set for the channel. Linus W, do you agree?
>> 
>> I think Sundaram is in the position of doing some heavy work on
>> using one or the other of the API:s, and I think he is better
>> suited than anyone else of us to select what scheme to use,
>> in the end he's going to write the first code using the API.
> And Unfortunately TI folks don't seem to care about this discussion :(
> Haven't seen anything on this from them, or on previous RFC by Jassi

IIRC, Sundaram went on holiday. I suspect he cares a lot about this
discussion as he has a session on the topic at LPC.

-Matt

^ permalink raw reply	[flat|nested] 131+ messages in thread


* [PATCHv2] DMAEngine: Define generic transfer request api
  2011-08-12 11:14 [PATCH] DMAEngine: Define generic transfer request api Jassi Brar
  2011-08-16 12:56 ` Koul, Vinod
@ 2011-09-15  7:46 ` Jassi Brar
  2011-09-15  8:22   ` Russell King
                     ` (3 more replies)
  1 sibling, 4 replies; 131+ messages in thread
From: Jassi Brar @ 2011-09-15  7:46 UTC (permalink / raw)
  To: dan.j.williams, vkoul, linux-kernel; +Cc: rmk+kernel, 21cnbao, Jassi Brar

Define a new api that can be used for doing fancy data transfers
like interleaved-to-contiguous copy and vice-versa.
Traditional SG_list based transfers tend to be very inefficient in
such cases, where the interleave and chunk are only a few bytes,
which calls for a very condensed api to convey the pattern of the
transfer.

This api supports all 4 variants of scatter-gather and contiguous transfer.
Besides, in future it could also represent common operations like
        device_prep_dma_{cyclic, memset, memcpy}
and maybe some more that I am not sure of.

Of course, this api cannot help transfers that don't lend themselves
to DMA by nature, i.e, scattered tiny read/writes with no periodic
pattern.

Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
---

Changes since v1:
   1) Dropped the 'dma_transaction_type' member until we really
      merge another type into this api. Instead added special
      type for this api - DMA_GENXFER in dma_transaction_type.
   2) Renamed 'xfer_template' to 'dmaxfer_template' in order to
      preserve the namespace, along the lines suggested by Barry Song.

 drivers/dma/dmaengine.c   |    2 +
 include/linux/dmaengine.h |   71 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 73 insertions(+), 0 deletions(-)

diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index b48967b..63284f6 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -699,6 +699,8 @@ int dma_async_device_register(struct dma_device *device)
 		!device->device_prep_dma_cyclic);
 	BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
 		!device->device_control);
+	BUG_ON(dma_has_cap(DMA_GENXFER, device->cap_mask) &&
+		!device->device_prep_dma_genxfer);
 
 	BUG_ON(!device->device_alloc_chan_resources);
 	BUG_ON(!device->device_free_chan_resources);
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 8fbf40e..68ebe6c 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -71,11 +71,79 @@ enum dma_transaction_type {
 	DMA_ASYNC_TX,
 	DMA_SLAVE,
 	DMA_CYCLIC,
+	DMA_GENXFER,
 };
 
 /* last transaction type for creation of the capabilities mask */
 #define DMA_TX_TYPE_END (DMA_CYCLIC + 1)
 
+/**
+ * Generic Transfer Request
+ * ------------------------
+ * A chunk is a collection of contiguous bytes to be transferred.
+ * The gap (in bytes) between two chunks is called the inter-chunk-gap (ICG).
+ * ICGs may or may not change between chunks.
+ * A FRAME is the smallest series of contiguous {chunk,icg} pairs
+ *  that, when repeated an integral number of times, specifies the transfer.
+ * A transfer template is a specification of a Frame, the number of times
+ *  it is to be repeated, and other per-transfer attributes.
+ *
+ * Practically, a client driver would have ready a template for each
+ *  type of transfer it is going to need during its lifetime and
+ *  set only 'src_start' and 'dst_start' before submitting the requests.
+ *
+ *
+ *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
+ *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
+ *
+ *    ==  Chunk size
+ *    ... ICG
+ */
+
+/**
+ * struct data_chunk - Element of scatter-gather list that makes a frame.
+ * @size: Number of bytes to read from source.
+ *	  size_dst := fn(op, size_src), so doesn't mean much for destination.
+ * @icg: Number of bytes to jump after last src/dst address of this
+ *	 chunk and before the first src/dst address of the next chunk.
+ *	 Ignored for dst (assumed 0) if dst_inc is true and dst_sgl is false.
+ *	 Ignored for src (assumed 0) if src_inc is true and src_sgl is false.
+ */
+struct data_chunk {
+	size_t size;
+	size_t icg;
+};
+
+/**
+ * struct dmaxfer_template - Template to convey DMAC the transfer pattern
+ *	 and attributes.
+ * @src_start: Bus address of source for the first chunk.
+ * @dst_start: Bus address of destination for the first chunk.
+ * @src_inc: If the source address increments after reading from it.
+ * @dst_inc: If the destination address increments after writing to it.
+ * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
+ *		Otherwise, source is read contiguously (icg ignored).
+ *		Ignored if src_inc is false.
+ * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
+ *		Otherwise, destination is filled contiguously (icg ignored).
+ *		Ignored if dst_inc is false.
+ * @frm_irq: If the client expects DMAC driver to do callback after each frame.
+ * @numf: Number of frames in this template.
+ * @frame_size: Number of chunks in a frame i.e, size of sgl[].
+ * @sgl: Array of {chunk,icg} pairs that make up a frame.
+ */
+struct dmaxfer_template {
+	dma_addr_t src_start;
+	dma_addr_t dst_start;
+	bool src_inc;
+	bool dst_inc;
+	bool src_sgl;
+	bool dst_sgl;
+	bool frm_irq;
+	size_t numf;
+	size_t frame_size;
+	struct data_chunk sgl[0];
+};
 
 /**
  * enum dma_ctrl_flags - DMA flags to augment operation preparation,
@@ -432,6 +500,7 @@ struct dma_tx_state {
  * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
  *	The function takes a buffer of size buf_len. The callback function will
  *	be called after period_len bytes have been transferred.
+ * @device_prep_dma_genxfer: Transfer expression in a generic way.
  * @device_control: manipulate all pending operations on a channel, returns
  *	zero or error code
  * @device_tx_status: poll for transaction completion, the optional
@@ -496,6 +565,8 @@ struct dma_device {
 	struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
 		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
 		size_t period_len, enum dma_data_direction direction);
+	struct dma_async_tx_descriptor *(*device_prep_dma_genxfer)(
+		struct dma_chan *chan, struct dmaxfer_template *xt);
 	int (*device_control)(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
 		unsigned long arg);
 
-- 
1.7.4.1
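
To make the template concrete, here is a rough user-space sketch of how a
client might fill a 'dmaxfer_template' for a de-interleaving copy. The
structures are copied from the patch above so the sketch compiles on its
own; the function name, addresses and transfer geometry are made up for
illustration and are not part of the proposed API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef unsigned long dma_addr_t;	/* stand-in for the kernel type */

/* User-space copies of the v2 structures, so the sketch compiles. */
struct data_chunk {
	size_t size;
	size_t icg;
};

struct dmaxfer_template {
	dma_addr_t src_start;
	dma_addr_t dst_start;
	bool src_inc;
	bool dst_inc;
	bool src_sgl;
	bool dst_sgl;
	bool frm_irq;
	size_t numf;
	size_t frame_size;
	struct data_chunk sgl[];
};

/*
 * De-interleave: read a 16-byte chunk, skip 16 bytes, repeat 'height'
 * times, packing the chunks contiguously at the destination.  One
 * {chunk,icg} pair per frame, so frame_size = 1 and numf = height.
 */
static struct dmaxfer_template *make_deinterleave_xt(dma_addr_t src,
						     dma_addr_t dst,
						     size_t height)
{
	struct dmaxfer_template *xt;

	xt = calloc(1, sizeof(*xt) + sizeof(struct data_chunk));
	if (!xt)
		return NULL;

	xt->src_start = src;
	xt->dst_start = dst;
	xt->src_inc = true;	/* walk forward through the source */
	xt->dst_inc = true;	/* and through the destination */
	xt->src_sgl = true;	/* honour icg on the read side */
	xt->dst_sgl = false;	/* write side contiguous: icg ignored */
	xt->numf = height;
	xt->frame_size = 1;
	xt->sgl[0].size = 16;
	xt->sgl[0].icg = 16;
	return xt;
}
```

A client would build such a template once and then only update
'src_start'/'dst_start' before each submission, as the doc comment
suggests.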


^ permalink raw reply related	[flat|nested] 131+ messages in thread

* Re: [PATCHv2] DMAEngine: Define generic transfer request api
  2011-09-15  7:46 ` [PATCHv2] " Jassi Brar
@ 2011-09-15  8:22   ` Russell King
  2011-09-15 10:02     ` Jassi Brar
  2011-09-16  7:17   ` Barry Song
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 131+ messages in thread
From: Russell King @ 2011-09-15  8:22 UTC (permalink / raw)
  To: Jassi Brar; +Cc: dan.j.williams, vkoul, linux-kernel, 21cnbao

On Thu, Sep 15, 2011 at 01:16:29PM +0530, Jassi Brar wrote:
> +/**
> + * struct dmaxfer_template - Template to convey DMAC the transfer pattern
> + *	 and attributes.
> + * @src_start: Bus address of source for the first chunk.
> + * @dst_start: Bus address of destination for the first chunk.
> + * @src_inc: If the source address increments after reading from it.
> + * @dst_inc: If the destination address increments after writing to it.
> + * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
> + *		Otherwise, source is read contiguously (icg ignored).
> + *		Ignored if src_inc is false.
> + * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
> + *		Otherwise, destination is filled contiguously (icg ignored).
> + *		Ignored if dst_inc is false.
> + * @frm_irq: If the client expects DMAC driver to do callback after each frame.
> + * @numf: Number of frames in this template.
> + * @frame_size: Number of chunks in a frame i.e, size of sgl[].
> + * @sgl: Array of {chunk,icg} pairs that make up a frame.
> + */
> +struct dmaxfer_template {
> +	dma_addr_t src_start;
> +	dma_addr_t dst_start;
> +	bool src_inc;
> +	bool dst_inc;
> +	bool src_sgl;
> +	bool dst_sgl;
> +	bool frm_irq;

I'll ask again.  How does this deal with the XOR operation, where you have
two sources and one destination?  Unless you sort that out, I don't see
how your idea of collapsing all the prepare functions can possibly work.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv2] DMAEngine: Define generic transfer request api
  2011-09-15  8:22   ` Russell King
@ 2011-09-15 10:02     ` Jassi Brar
  0 siblings, 0 replies; 131+ messages in thread
From: Jassi Brar @ 2011-09-15 10:02 UTC (permalink / raw)
  To: Russell King; +Cc: dan.j.williams, vkoul, linux-kernel, 21cnbao

On 15 September 2011 13:52, Russell King <rmk@arm.linux.org.uk> wrote:
> On Thu, Sep 15, 2011 at 01:16:29PM +0530, Jassi Brar wrote:
>> +/**
>> + * struct dmaxfer_template - Template to convey DMAC the transfer pattern
>> + *    and attributes.
>> + * @src_start: Bus address of source for the first chunk.
>> + * @dst_start: Bus address of destination for the first chunk.
>> + * @src_inc: If the source address increments after reading from it.
>> + * @dst_inc: If the destination address increments after writing to it.
>> + * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
>> + *           Otherwise, source is read contiguously (icg ignored).
>> + *           Ignored if src_inc is false.
>> + * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
>> + *           Otherwise, destination is filled contiguously (icg ignored).
>> + *           Ignored if dst_inc is false.
>> + * @frm_irq: If the client expects DMAC driver to do callback after each frame.
>> + * @numf: Number of frames in this template.
>> + * @frame_size: Number of chunks in a frame i.e, size of sgl[].
>> + * @sgl: Array of {chunk,icg} pairs that make up a frame.
>> + */
>> +struct dmaxfer_template {
>> +     dma_addr_t src_start;
>> +     dma_addr_t dst_start;
>> +     bool src_inc;
>> +     bool dst_inc;
>> +     bool src_sgl;
>> +     bool dst_sgl;
>> +     bool frm_irq;

> How does this deal with the XOR operation, where you have
> two sources and one destination?  Unless you sort that out, I don't see
> how your idea of collapsing all the prepare functions can possibly work.

In this v2 revision I have dropped the member "enum dma_transaction_type op"
from "struct dmaxfer_template", so as such it doesn't support XOR.

In future though, it should be possible to support cascading XOR after
restoring the 'op' member.

Every other member of 'struct dmaxfer_template' would be interpreted
according to the type of operation requested.
For XOR operation - src_inc, dst_inc, src_sgl, dst_sgl, frm_irq would
be ignored.

For cascading XOR of 'N' sources  { Src1^Src2^....^SrcN -> Dst }
  dmaxfer_template.op := DMA_XOR
  dmaxfer_template.numf := 1
  dmaxfer_template.frame_size := N
  dmaxfer_template.sgl[] would point to an array of 'N' source chunks.

For simple XOR, i.e, Src ^ Dst -> Dst:
  Same as above (N:=2), except that the client would point 'dst_start' to
the address of the second source chunk.
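
The simple-XOR mapping described above could be sketched like this in user
space. Note that the 'op' member is hypothetical here (it was dropped in
v2 and would only return in a later revision), and the DMA_XOR value,
structure copies and sizes are assumptions for illustration only:

```c
#include <stddef.h>
#include <stdlib.h>

typedef unsigned long dma_addr_t;	/* stand-in for the kernel type */

enum dma_op { DMA_XOR };		/* hypothetical restored 'op' */

struct data_chunk {
	size_t size;
	size_t icg;
};

/* v2 template plus the hypothetical 'op' member discussed above */
struct dmaxfer_template {
	enum dma_op op;
	dma_addr_t src_start;
	dma_addr_t dst_start;
	size_t numf;
	size_t frame_size;
	struct data_chunk sgl[];
};

/* Simple XOR, Src ^ Dst -> Dst: N = 2, 'dst_start' doubles as src2 */
static struct dmaxfer_template *make_xor_xt(dma_addr_t src, dma_addr_t dst,
					    size_t len)
{
	struct dmaxfer_template *xt;

	xt = calloc(1, sizeof(*xt) + 2 * sizeof(struct data_chunk));
	if (!xt)
		return NULL;

	xt->op = DMA_XOR;
	xt->src_start = src;
	xt->dst_start = dst;	/* second source == destination */
	xt->numf = 1;
	xt->frame_size = 2;	/* two source chunks per frame */
	xt->sgl[0].size = len;
	xt->sgl[1].size = len;	/* icg left 0: chunks are contiguous */
	return xt;
}
```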

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv2] DMAEngine: Define generic transfer request api
  2011-09-15  7:46 ` [PATCHv2] " Jassi Brar
  2011-09-15  8:22   ` Russell King
@ 2011-09-16  7:17   ` Barry Song
  2011-09-16 11:03     ` Jassi Brar
  2011-09-16  9:07   ` Vinod Koul
  2011-09-20 12:12   ` [PATCHv3] DMAEngine: Define interleaved " Jassi Brar
  3 siblings, 1 reply; 131+ messages in thread
From: Barry Song @ 2011-09-16  7:17 UTC (permalink / raw)
  To: Jassi Brar; +Cc: dan.j.williams, vkoul, linux-kernel, rmk+kernel

2011/9/15 Jassi Brar <jaswinder.singh@linaro.org>:
> Define a new api that could be used for doing fancy data transfers
> like interleaved to contiguous copy and vice-versa.
> Traditional SG_list based transfers tend to be very inefficient in
> such cases as where the interleave and chunk are only a few bytes,
> which call for a very condensed api to convey pattern of the transfer.
>
> This api supports all 4 variants of scatter-gather and contiguous transfer.
> Besides, in future it could also represent common operations like
>        device_prep_dma_{cyclic, memset, memcpy}
> and maybe some more that I am not sure of.
>
> Of course, neither can this api help transfers that don't lend to DMA by
> nature, i.e, scattered tiny read/writes with no periodic pattern.
>
> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
> ---
>
> Changes since v1:
>   1) Dropped the 'dma_transaction_type' member until we really
>      merge another type into this api. Instead added special
>      type for this api - DMA_GENXFER in dma_transaction_type.
>   2) Renamed 'xfer_template' to 'dmaxfer_template' inorder to
>      preserve namespace, closer to as suggested by Barry Song.
>
>  drivers/dma/dmaengine.c   |    2 +
>  include/linux/dmaengine.h |   71 +++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 73 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> index b48967b..63284f6 100644
> --- a/drivers/dma/dmaengine.c
> +++ b/drivers/dma/dmaengine.c
> @@ -699,6 +699,8 @@ int dma_async_device_register(struct dma_device *device)
>                !device->device_prep_dma_cyclic);
>        BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
>                !device->device_control);
> +       BUG_ON(dma_has_cap(DMA_GENXFER, device->cap_mask) &&
> +               !device->device_prep_dma_genxfer);

I don't think this is what I want here. device_prep_dma_genxfer should
be able to cover memcpy, slave or other modes, not sit in parallel with
them.
For example, if I use genxfer but my device is a slave, I might not
need a device_prep_slave_sg since I already have prep_dma_genxfer, but
I still need a DMA_SLAVE_CONFIG to set the burst size and other
parameters. So I'd like to have the DMA_SLAVE flag but without a
device_prep_slave_sg callback.

>
>        BUG_ON(!device->device_alloc_chan_resources);
>        BUG_ON(!device->device_free_chan_resources);
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index 8fbf40e..68ebe6c 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -71,11 +71,79 @@ enum dma_transaction_type {
>        DMA_ASYNC_TX,
>        DMA_SLAVE,
>        DMA_CYCLIC,
> +       DMA_GENXFER,
>  };

-barry

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv2] DMAEngine: Define generic transfer request api
  2011-09-15  7:46 ` [PATCHv2] " Jassi Brar
  2011-09-15  8:22   ` Russell King
  2011-09-16  7:17   ` Barry Song
@ 2011-09-16  9:07   ` Vinod Koul
  2011-09-16 12:30     ` Jassi Brar
  2011-09-20 12:12   ` [PATCHv3] DMAEngine: Define interleaved " Jassi Brar
  3 siblings, 1 reply; 131+ messages in thread
From: Vinod Koul @ 2011-09-16  9:07 UTC (permalink / raw)
  To: Jassi Brar; +Cc: dan.j.williams, linux-kernel, rmk+kernel, 21cnbao

On Thu, 2011-09-15 at 13:16 +0530, Jassi Brar wrote:
> Define a new api that could be used for doing fancy data transfers
> like interleaved to contiguous copy and vice-versa.
Then this is a very specific type of transfer this API supports, so the
name generic_xxx is not apt!

> Traditional SG_list based transfers tend to be very inefficient in
> such cases as where the interleave and chunk are only a few bytes,
> which call for a very condensed api to convey pattern of the transfer.
> 
> This api supports all 4 variants of scatter-gather and contiguous transfer.
> Besides, in future it could also represent common operations like
>         device_prep_dma_{cyclic, memset, memcpy}
> and maybe some more that I am not sure of.
> 
> Of course, neither can this api help transfers that don't lend to DMA by
> nature, i.e, scattered tiny read/writes with no periodic pattern.
> 
> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
> ---
> 
> Changes since v1:
>    1) Dropped the 'dma_transaction_type' member until we really
>       merge another type into this api. Instead added special
>       type for this api - DMA_GENXFER in dma_transaction_type.
>    2) Renamed 'xfer_template' to 'dmaxfer_template' inorder to
>       preserve namespace, closer to as suggested by Barry Song.
> 
>  drivers/dma/dmaengine.c   |    2 +
>  include/linux/dmaengine.h |   71 +++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 73 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> index b48967b..63284f6 100644
> --- a/drivers/dma/dmaengine.c
> +++ b/drivers/dma/dmaengine.c
> @@ -699,6 +699,8 @@ int dma_async_device_register(struct dma_device *device)
>  		!device->device_prep_dma_cyclic);
>  	BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
>  		!device->device_control);
> +	BUG_ON(dma_has_cap(DMA_GENXFER, device->cap_mask) &&
> +		!device->device_prep_dma_genxfer);
>  
>  	BUG_ON(!device->device_alloc_chan_resources);
>  	BUG_ON(!device->device_free_chan_resources);
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index 8fbf40e..68ebe6c 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -71,11 +71,79 @@ enum dma_transaction_type {
>  	DMA_ASYNC_TX,
>  	DMA_SLAVE,
>  	DMA_CYCLIC,
> +	DMA_GENXFER,
>  };
>  
>  /* last transaction type for creation of the capabilities mask */
>  #define DMA_TX_TYPE_END (DMA_CYCLIC + 1)
>  
> +/**
> + * Generic Transfer Request
> + * ------------------------
> + * A chunk is collection of contiguous bytes to be transfered.
> + * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
> + * ICGs may or maynot change between chunks.
> + * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
> + *  that when repeated an integral number of times, specifies the transfer.
> + * A transfer template is specification of a Frame, the number of times
> + *  it is to be repeated and other per-transfer attributes.
> + *
> + * Practically, a client driver would have ready a template for each
> + *  type of transfer it is going to need during its lifetime and
> + *  set only 'src_start' and 'dst_start' before submitting the requests.
> + *
> + *
> + *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
> + *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
> + *
> + *    ==  Chunk size
> + *    ... ICG
> + */
> +
> +/**
> + * struct data_chunk - Element of scatter-gather list that makes a frame.
> + * @size: Number of bytes to read from source.
> + *	  size_dst := fn(op, size_src), so doesn't mean much for destination.
> + * @icg: Number of bytes to jump after last src/dst address of this
> + *	 chunk and before first src/dst address for next chunk.
> + *	 Ignored for dst(assumed 0), if dst_inc is true and dst_sgl is false.
> + *	 Ignored for src(assumed 0), if src_inc is true and src_sgl is false.
> + */
> +struct data_chunk {
> +	size_t size;
> +	size_t icg;
> +};
> +
> +/**
> + * struct dmaxfer_template - Template to convey DMAC the transfer pattern
> + *	 and attributes.
> + * @src_start: Bus address of source for the first chunk.
> + * @dst_start: Bus address of destination for the first chunk.
> + * @src_inc: If the source address increments after reading from it.
> + * @dst_inc: If the destination address increments after writing to it.
> + * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
> + *		Otherwise, source is read contiguously (icg ignored).
> + *		Ignored if src_inc is false.
> + * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
> + *		Otherwise, destination is filled contiguously (icg ignored).
> + *		Ignored if dst_inc is false.
> + * @frm_irq: If the client expects DMAC driver to do callback after each frame.
> + * @numf: Number of frames in this template.
> + * @frame_size: Number of chunks in a frame i.e, size of sgl[].
> + * @sgl: Array of {chunk,icg} pairs that make up a frame.
> + */
> +struct dmaxfer_template {
> +	dma_addr_t src_start;
> +	dma_addr_t dst_start;
> +	bool src_inc;
> +	bool dst_inc;
> +	bool src_sgl;
> +	bool dst_sgl;
> +	bool frm_irq;
> +	size_t numf;
> +	size_t frame_size;
> +	struct data_chunk sgl[0];
> +};
This kind of transfer can be either slave based (audio interleaved) or
mem-to-mem (extracting a still image from a buffer). How you propose
the dmac knows about this is unclear to me.
>  
>  /**
>   * enum dma_ctrl_flags - DMA flags to augment operation preparation,
> @@ -432,6 +500,7 @@ struct dma_tx_state {
>   * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
>   *	The function takes a buffer of size buf_len. The callback function will
>   *	be called after period_len bytes have been transferred.
> + * @device_prep_dma_genxfer: Transfer expression in a generic way.
>   * @device_control: manipulate all pending operations on a channel, returns
>   *	zero or error code
>   * @device_tx_status: poll for transaction completion, the optional
> @@ -496,6 +565,8 @@ struct dma_device {
>  	struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
>  		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
>  		size_t period_len, enum dma_data_direction direction);
> +	struct dma_async_tx_descriptor *(*device_prep_dma_genxfer)(
> +		struct dma_chan *chan, struct dmaxfer_template *xt);
>  	int (*device_control)(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
>  		unsigned long arg);
>  


-- 
~Vinod


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv2] DMAEngine: Define generic transfer request api
  2011-09-16  7:17   ` Barry Song
@ 2011-09-16 11:03     ` Jassi Brar
  0 siblings, 0 replies; 131+ messages in thread
From: Jassi Brar @ 2011-09-16 11:03 UTC (permalink / raw)
  To: Barry Song; +Cc: dan.j.williams, vkoul, linux-kernel, rmk+kernel

On 16 September 2011 12:47, Barry Song <21cnbao@gmail.com> wrote:
> 2011/9/15 Jassi Brar <jaswinder.singh@linaro.org>:
>> --- a/drivers/dma/dmaengine.c
>> +++ b/drivers/dma/dmaengine.c
>> @@ -699,6 +699,8 @@ int dma_async_device_register(struct dma_device *device)
>>                !device->device_prep_dma_cyclic);
>>        BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
>>                !device->device_control);
>> +       BUG_ON(dma_has_cap(DMA_GENXFER, device->cap_mask) &&
>> +               !device->device_prep_dma_genxfer);
>
> i don't think it is what i want here. device_prep_dma_genxfer should
> be able to cover memcpy, slave or other modes, but not parallel with
> them.

ATM this api is indeed parallel to others like memcpy, slave_sg etc.
because I have dropped the 'op' member in v2. The idea is to merge
other prepares one at a time and ultimately have this api consume
those that it can. So the 'op' would be restored when we merge
the first 'prepare' into this.


> For example, if i use genxfer, but my devices are slave. i might not
> need a device_prep_slave_sg since i have already prep_dma_genxfer, but
> anyway, i need a DMA_SLAVE_CONFIG to set burst size or others. then
> i'd like to have DMA_SLAVE flag but without device_prep_slave_sg
> callback.

As I already said, this api could be used for interleaved Mem->Mem
as well as Mem<->Dev transfers.
If your channels are Slave, please set the DMA_SLAVE flag and provide the
'device_control' callback.

I think I should have simply removed the check
    BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
                   !device->device_prep_slave_sg);
because obviously we now have a _valid_ case of a SLAVE channel
supporting a different prepare - device_prep_dma_genxfer. Will do that in v3.
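
The v3 intent can be condensed into a small predicate. This is a
user-space sketch only: the capability flags, struct and function names
below are made up and not the real dma_cap_mask machinery. A SLAVE device
must offer at least one slave-capable prepare plus device_control, and a
GENXFER device must offer the genxfer prepare:

```c
#include <stdbool.h>

/* Made-up stand-ins for dma_cap_mask_t / dma_has_cap() */
enum { CAP_SLAVE = 1 << 0, CAP_GENXFER = 1 << 1 };

struct dev_ops {
	unsigned caps;
	bool has_prep_slave_sg;
	bool has_prep_genxfer;
	bool has_device_control;
};

/* Mirror of the proposed relaxed registration checks */
static bool registration_checks_pass(const struct dev_ops *d)
{
	if (d->caps & CAP_SLAVE) {
		/* slave_sg is no longer mandatory on its own ... */
		if (!d->has_prep_slave_sg && !d->has_prep_genxfer)
			return false;	/* ... but some prepare must exist */
		if (!d->has_device_control)
			return false;	/* DMA_SLAVE still needs device_control */
	}
	if ((d->caps & CAP_GENXFER) && !d->has_prep_genxfer)
		return false;
	return true;
}
```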

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv2] DMAEngine: Define generic transfer request api
  2011-09-16  9:07   ` Vinod Koul
@ 2011-09-16 12:30     ` Jassi Brar
  2011-09-16 17:06       ` Vinod Koul
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-09-16 12:30 UTC (permalink / raw)
  To: Vinod Koul; +Cc: dan.j.williams, linux-kernel, rmk+kernel, 21cnbao

On 16 September 2011 14:37, Vinod Koul <vinod.koul@intel.com> wrote:
> On Thu, 2011-09-15 at 13:16 +0530, Jassi Brar wrote:
>> Define a new api that could be used for doing fancy data transfers
>> like interleaved to contiguous copy and vice-versa.
> Then this is a very specific type of transfer this API supports, so the
> name generic_xxx is not apt!

Currently it doesn't support every operation, and perhaps there will be
operations that can never be merged into this.
But it does support interleaved(async and slave), memcpy, memset, dma_sg,
slave_sg, and cyclic.
Please feel free to suggest a better name.


>> +/**
>> + * Generic Transfer Request
>> + * ------------------------
>> + * A chunk is collection of contiguous bytes to be transfered.
>> + * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
>> + * ICGs may or maynot change between chunks.
>> + * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
>> + *  that when repeated an integral number of times, specifies the transfer.
>> + * A transfer template is specification of a Frame, the number of times
>> + *  it is to be repeated and other per-transfer attributes.
>> + *
>> + * Practically, a client driver would have ready a template for each
>> + *  type of transfer it is going to need during its lifetime and
>> + *  set only 'src_start' and 'dst_start' before submitting the requests.
>> + *
>> + *
>> + *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
>> + *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
>> + *
>> + *    ==  Chunk size
>> + *    ... ICG
>> + */
>> +
>> +/**
>> + * struct data_chunk - Element of scatter-gather list that makes a frame.
>> + * @size: Number of bytes to read from source.
>> + *     size_dst := fn(op, size_src), so doesn't mean much for destination.
>> + * @icg: Number of bytes to jump after last src/dst address of this
>> + *    chunk and before first src/dst address for next chunk.
>> + *    Ignored for dst(assumed 0), if dst_inc is true and dst_sgl is false.
>> + *    Ignored for src(assumed 0), if src_inc is true and src_sgl is false.
>> + */
>> +struct data_chunk {
>> +     size_t size;
>> +     size_t icg;
>> +};
>> +
>> +/**
>> + * struct dmaxfer_template - Template to convey DMAC the transfer pattern
>> + *    and attributes.
>> + * @src_start: Bus address of source for the first chunk.
>> + * @dst_start: Bus address of destination for the first chunk.
>> + * @src_inc: If the source address increments after reading from it.
>> + * @dst_inc: If the destination address increments after writing to it.
>> + * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
>> + *           Otherwise, source is read contiguously (icg ignored).
>> + *           Ignored if src_inc is false.
>> + * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
>> + *           Otherwise, destination is filled contiguously (icg ignored).
>> + *           Ignored if dst_inc is false.
>> + * @frm_irq: If the client expects DMAC driver to do callback after each frame.
>> + * @numf: Number of frames in this template.
>> + * @frame_size: Number of chunks in a frame i.e, size of sgl[].
>> + * @sgl: Array of {chunk,icg} pairs that make up a frame.
>> + */
>> +struct dmaxfer_template {
>> +     dma_addr_t src_start;
>> +     dma_addr_t dst_start;
>> +     bool src_inc;
>> +     bool dst_inc;
>> +     bool src_sgl;
>> +     bool dst_sgl;
>> +     bool frm_irq;
>> +     size_t numf;
>> +     size_t frame_size;
>> +     struct data_chunk sgl[0];
>> +};
> This kind of transfer can be either slave (audio interleaved) based or
> mem-to-mem (extracting still image from a buffer),
> How do you propose the dmac know about this is unclear to me

Every 'prepare' callback already passes the channel chosen for the transfer.
And the DMAC driver already knows if that channel is SLAVE capable or not.

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv2] DMAEngine: Define generic transfer request api
  2011-09-16 12:30     ` Jassi Brar
@ 2011-09-16 17:06       ` Vinod Koul
  2011-09-16 17:51         ` Jassi Brar
  0 siblings, 1 reply; 131+ messages in thread
From: Vinod Koul @ 2011-09-16 17:06 UTC (permalink / raw)
  To: Jassi Brar; +Cc: dan.j.williams, linux-kernel, rmk+kernel, 21cnbao

On Fri, 2011-09-16 at 18:00 +0530, Jassi Brar wrote:
> On 16 September 2011 14:37, Vinod Koul <vinod.koul@intel.com> wrote:
> > On Thu, 2011-09-15 at 13:16 +0530, Jassi Brar wrote:
> >> Define a new api that could be used for doing fancy data transfers
> >> like interleaved to contiguous copy and vice-versa.
> > Then this is a very specific type of transfer this API supports, so the
> > name generic_xxx is not apt!
> 
> Currently it doesn't support every operation, perhaps there will be an
> operation that couldn't ever be merged into this.
> But it does support interleaved(async and slave), memcpy, memset, dma_sg,
> slave_sg, and cyclic.
> Please feel free to suggest a better name.
ATM, this api is for interleaved-like dma operations, so
prep_interleaved_dma may be better suited.

Let the generalization be discussed separately...
> 
> 
> >> +/**
> >> + * Generic Transfer Request
> >> + * ------------------------
> >> + * A chunk is collection of contiguous bytes to be transfered.
> >> + * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
> >> + * ICGs may or maynot change between chunks.
> >> + * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
> >> + *  that when repeated an integral number of times, specifies the transfer.
> >> + * A transfer template is specification of a Frame, the number of times
> >> + *  it is to be repeated and other per-transfer attributes.
> >> + *
> >> + * Practically, a client driver would have ready a template for each
> >> + *  type of transfer it is going to need during its lifetime and
> >> + *  set only 'src_start' and 'dst_start' before submitting the requests.
> >> + *
> >> + *
> >> + *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
> >> + *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
> >> + *
> >> + *    ==  Chunk size
> >> + *    ... ICG
> >> + */
> >> +
> >> +/**
> >> + * struct data_chunk - Element of scatter-gather list that makes a frame.
> >> + * @size: Number of bytes to read from source.
> >> + *     size_dst := fn(op, size_src), so doesn't mean much for destination.
> >> + * @icg: Number of bytes to jump after last src/dst address of this
> >> + *    chunk and before first src/dst address for next chunk.
> >> + *    Ignored for dst(assumed 0), if dst_inc is true and dst_sgl is false.
> >> + *    Ignored for src(assumed 0), if src_inc is true and src_sgl is false.
> >> + */
> >> +struct data_chunk {
> >> +     size_t size;
> >> +     size_t icg;
> >> +};
> >> +
> >> +/**
> >> + * struct dmaxfer_template - Template to convey DMAC the transfer pattern
> >> + *    and attributes.
> >> + * @src_start: Bus address of source for the first chunk.
> >> + * @dst_start: Bus address of destination for the first chunk.
> >> + * @src_inc: If the source address increments after reading from it.
> >> + * @dst_inc: If the destination address increments after writing to it.
> >> + * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
> >> + *           Otherwise, source is read contiguously (icg ignored).
> >> + *           Ignored if src_inc is false.
> >> + * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
> >> + *           Otherwise, destination is filled contiguously (icg ignored).
> >> + *           Ignored if dst_inc is false.
> >> + * @frm_irq: If the client expects DMAC driver to do callback after each frame.
> >> + * @numf: Number of frames in this template.
> >> + * @frame_size: Number of chunks in a frame i.e, size of sgl[].
> >> + * @sgl: Array of {chunk,icg} pairs that make up a frame.
> >> + */
> >> +struct dmaxfer_template {
> >> +     dma_addr_t src_start;
> >> +     dma_addr_t dst_start;
> >> +     bool src_inc;
> >> +     bool dst_inc;
> >> +     bool src_sgl;
> >> +     bool dst_sgl;
> >> +     bool frm_irq;
> >> +     size_t numf;
> >> +     size_t frame_size;
> >> +     struct data_chunk sgl[0];
> >> +};
> > This kind of transfer can be either slave (audio interleaved) based or
> > mem-to-mem (extracting still image from a buffer),
> > How do you propose the dmac know about this is unclear to me
> 
> Every 'prepare' callback already passes the channel chosen for the transfer.
> And the DMAC driver already knows if that channel is SLAVE capable or not.
How about the above params? You should add that they may or may not be
valid depending upon the mode.
Pls also send updates to the documentation for this API.

-- 
~Vinod


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv2] DMAEngine: Define generic transfer request api
  2011-09-16 17:06       ` Vinod Koul
@ 2011-09-16 17:51         ` Jassi Brar
  2011-09-19  3:23           ` Vinod Koul
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-09-16 17:51 UTC (permalink / raw)
  To: Vinod Koul; +Cc: dan.j.williams, linux-kernel, rmk+kernel, 21cnbao

On 16 September 2011 22:36, Vinod Koul <vinod.koul@intel.com> wrote:
>> >> Define a new api that could be used for doing fancy data transfers
>> >> like interleaved to contiguous copy and vice-versa.
>> > Then this is a very specific type of transfer this API supports, so the
>> > name generic_xxx is not apt!
>>
>> Currently it doesn't support every operation, perhaps there will be an
>> operation that couldn't ever be merged into this.
>> But it does support interleaved(async and slave), memcpy, memset, dma_sg,
>> slave_sg, and cyclic.
>> Please feel free to suggest a better name.
> ATM, this api is for interleaved like dma operation. so
> prep_interleaved_dma may be better suited.
Well, ok I can live with that.

> Let the generalization be discussed separately...
As you wish.

>> >> +/**
>> >> + * Generic Transfer Request
>> >> + * ------------------------
>> >> + * A chunk is collection of contiguous bytes to be transfered.
>> >> + * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
>> >> + * ICGs may or maynot change between chunks.
>> >> + * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
>> >> + *  that when repeated an integral number of times, specifies the transfer.
>> >> + * A transfer template is specification of a Frame, the number of times
>> >> + *  it is to be repeated and other per-transfer attributes.
>> >> + *
>> >> + * Practically, a client driver would have ready a template for each
>> >> + *  type of transfer it is going to need during its lifetime and
>> >> + *  set only 'src_start' and 'dst_start' before submitting the requests.
>> >> + *
>> >> + *
>> >> + *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
>> >> + *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
>> >> + *
>> >> + *    ==  Chunk size
>> >> + *    ... ICG
>> >> + */
>> >> +
>> >> +/**
>> >> + * struct data_chunk - Element of scatter-gather list that makes a frame.
>> >> + * @size: Number of bytes to read from source.
>> >> + *     size_dst := fn(op, size_src), so doesn't mean much for destination.
>> >> + * @icg: Number of bytes to jump after last src/dst address of this
>> >> + *    chunk and before first src/dst address for next chunk.
>> >> + *    Ignored for dst(assumed 0), if dst_inc is true and dst_sgl is false.
>> >> + *    Ignored for src(assumed 0), if src_inc is true and src_sgl is false.
>> >> + */
>> >> +struct data_chunk {
>> >> +     size_t size;
>> >> +     size_t icg;
>> >> +};
>> >> +
>> >> +/**
>> >> + * struct dmaxfer_template - Template to convey DMAC the transfer pattern
>> >> + *    and attributes.
>> >> + * @src_start: Bus address of source for the first chunk.
>> >> + * @dst_start: Bus address of destination for the first chunk.
>> >> + * @src_inc: If the source address increments after reading from it.
>> >> + * @dst_inc: If the destination address increments after writing to it.
>> >> + * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
>> >> + *           Otherwise, source is read contiguously (icg ignored).
>> >> + *           Ignored if src_inc is false.
>> >> + * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
>> >> + *           Otherwise, destination is filled contiguously (icg ignored).
>> >> + *           Ignored if dst_inc is false.
>> >> + * @frm_irq: If the client expects DMAC driver to do callback after each frame.
>> >> + * @numf: Number of frames in this template.
>> >> + * @frame_size: Number of chunks in a frame i.e, size of sgl[].
>> >> + * @sgl: Array of {chunk,icg} pairs that make up a frame.
>> >> + */
>> >> +struct dmaxfer_template {
>> >> +     dma_addr_t src_start;
>> >> +     dma_addr_t dst_start;
>> >> +     bool src_inc;
>> >> +     bool dst_inc;
>> >> +     bool src_sgl;
>> >> +     bool dst_sgl;
>> >> +     bool frm_irq;
>> >> +     size_t numf;
>> >> +     size_t frame_size;
>> >> +     struct data_chunk sgl[0];
>> >> +};
>> > This kind of transfer can be either slave (audio interleaved) based or
>> > mem-to-mem (extracting still image from a buffer),
>> > How do you propose the dmac know about this is unclear to me
>>
>> Every 'prepare' callback already passes the channel chosen for the transfer.
>> And the DMAC driver already knows if that channel is SLAVE capable or not.
> How about above params, you should add they may/may not be valid
> depending upon the mode.
If you look at the documentation above the 'struct dmaxfer_template', I have
already mentioned what params are ignored/invalid when.
Or is there something else you want mentioned?

> Pls also send updates to documentation for this API
Ok, I'll write some notes on using this api in Documentation/dmaengine.txt


* Re: [PATCHv2] DMAEngine: Define generic transfer request api
  2011-09-16 17:51         ` Jassi Brar
@ 2011-09-19  3:23           ` Vinod Koul
  0 siblings, 0 replies; 131+ messages in thread
From: Vinod Koul @ 2011-09-19  3:23 UTC (permalink / raw)
  To: Jassi Brar; +Cc: dan.j.williams, linux-kernel, rmk+kernel, 21cnbao

On Fri, 2011-09-16 at 23:21 +0530, Jassi Brar wrote:
> On 16 September 2011 22:36, Vinod Koul <vinod.koul@intel.com> wrote:
> >> >> Define a new api that could be used for doing fancy data transfers
> >> >> like interleaved to contiguous copy and vice-versa.
> >> > Then this is a very specific type of transfer this API supports, so the
> >> > name generic_xxx is not apt!
> >>
> >> Currently it doesn't support every operation, perhaps there will be an
> >> operation that couldn't ever be merged into this.
> >> But it does support interleaved(async and slave), memcpy, memset, dma_sg,
> >> slave_sg, and cyclic.
> >> Please feel free to suggest a better name.
> > ATM, this api is for interleaved like dma operation. so
> > prep_interleaved_dma may be better suited.
> Well, ok I can live with that.
> 
> > Let the generalization be discussed separately...
> As you wish.
> 
> >> >> +/**
> >> >> + * Generic Transfer Request
> >> >> + * ------------------------
> >> >> + * A chunk is collection of contiguous bytes to be transfered.
> >> >> + * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
> >> >> + * ICGs may or maynot change between chunks.
> >> >> + * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
> >> >> + *  that when repeated an integral number of times, specifies the transfer.
> >> >> + * A transfer template is specification of a Frame, the number of times
> >> >> + *  it is to be repeated and other per-transfer attributes.
> >> >> + *
> >> >> + * Practically, a client driver would have ready a template for each
> >> >> + *  type of transfer it is going to need during its lifetime and
> >> >> + *  set only 'src_start' and 'dst_start' before submitting the requests.
> >> >> + *
> >> >> + *
> >> >> + *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
> >> >> + *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
> >> >> + *
> >> >> + *    ==  Chunk size
> >> >> + *    ... ICG
> >> >> + */
> >> >> +
> >> >> +/**
> >> >> + * struct data_chunk - Element of scatter-gather list that makes a frame.
> >> >> + * @size: Number of bytes to read from source.
> >> >> + *     size_dst := fn(op, size_src), so doesn't mean much for destination.
> >> >> + * @icg: Number of bytes to jump after last src/dst address of this
> >> >> + *    chunk and before first src/dst address for next chunk.
> >> >> + *    Ignored for dst(assumed 0), if dst_inc is true and dst_sgl is false.
> >> >> + *    Ignored for src(assumed 0), if src_inc is true and src_sgl is false.
> >> >> + */
> >> >> +struct data_chunk {
> >> >> +     size_t size;
> >> >> +     size_t icg;
> >> >> +};
> >> >> +
> >> >> +/**
> >> >> + * struct dmaxfer_template - Template to convey DMAC the transfer pattern
> >> >> + *    and attributes.
> >> >> + * @src_start: Bus address of source for the first chunk.
> >> >> + * @dst_start: Bus address of destination for the first chunk.
> >> >> + * @src_inc: If the source address increments after reading from it.
> >> >> + * @dst_inc: If the destination address increments after writing to it.
> >> >> + * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
> >> >> + *           Otherwise, source is read contiguously (icg ignored).
> >> >> + *           Ignored if src_inc is false.
> >> >> + * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
> >> >> + *           Otherwise, destination is filled contiguously (icg ignored).
> >> >> + *           Ignored if dst_inc is false.
> >> >> + * @frm_irq: If the client expects DMAC driver to do callback after each frame.
This should be specified via the callback field in the descriptor..

> >> >> + * @numf: Number of frames in this template.
> >> >> + * @frame_size: Number of chunks in a frame i.e, size of sgl[].
> >> >> + * @sgl: Array of {chunk,icg} pairs that make up a frame.
> >> >> + */
> >> >> +struct dmaxfer_template {
> >> >> +     dma_addr_t src_start;
> >> >> +     dma_addr_t dst_start;
> >> >> +     bool src_inc;
> >> >> +     bool dst_inc;
> >> >> +     bool src_sgl;
> >> >> +     bool dst_sgl;
> >> >> +     bool frm_irq;
> >> >> +     size_t numf;
> >> >> +     size_t frame_size;
> >> >> +     struct data_chunk sgl[0];
> >> >> +};
> >> > This kind of transfer can be either slave (audio interleaved) based or
> >> > mem-to-mem (extracting still image from a buffer),
> >> > How do you propose the dmac know about this is unclear to me
> >>
> >> Every 'prepare' callback already passes the channel chosen for the transfer.
> >> And the DMAC driver already knows if that channel is SLAVE capable or not.
> > How about above params, you should add they may/may not be valid
> > depending upon the mode.
> If you look at the documentation above the 'struct dmaxfer_template', I have
> already mentioned what params are ignored/invalid when.
> Or is there something else you want mentioned ?
How would DMACs decode these addresses for slave transfers? Some may be
valid and some not, based on the direction... a note would be good :)

-- 
~Vinod



* [PATCHv3] DMAEngine: Define interleaved transfer request api
  2011-09-15  7:46 ` [PATCHv2] " Jassi Brar
                     ` (2 preceding siblings ...)
  2011-09-16  9:07   ` Vinod Koul
@ 2011-09-20 12:12   ` Jassi Brar
  2011-09-20 16:52     ` Vinod Koul
  2011-09-28  6:39     ` [PATCHv4] " Jassi Brar
  3 siblings, 2 replies; 131+ messages in thread
From: Jassi Brar @ 2011-09-20 12:12 UTC (permalink / raw)
  To: dan.j.williams, vkoul, linux-kernel; +Cc: rmk, 21cnbao, Jassi Brar

Define a new api that could be used for doing fancy data transfers
like interleaved to contiguous copy and vice-versa.
Traditional SG_list based transfers tend to be very inefficient in
such cases, where the interleave and chunk are only a few bytes,
which calls for a very condensed api to convey the pattern of the transfer.
This api supports all 4 variants of scatter-gather and contiguous transfer.

Of course, this api cannot help transfers that don't lend themselves to
DMA by nature, i.e, scattered tiny read/writes with no periodic pattern.

Also since now we support SLAVE channels that might not provide
device_prep_slave_sg callback but device_prep_interleaved_dma,
remove the BUG_ON check.

Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
---

Changes since v2:
1) Added some notes to documentation.
2) Removed the BUG_ON check that expects every SLAVE channel to
   provide a prep_slave_sg, as we are now valid otherwise too.
3) Fixed the DMA_TX_TYPE_END offset - made it last element of enum.
4) Renamed prep_dma_genxfer to prep_interleaved_dma as Vinod wanted.

Changes since v1:
1) Dropped the 'dma_transaction_type' member until we really
   merge another type into this api. Instead added special
   type for this api - DMA_GENXFER in dma_transaction_type.
2) Renamed 'xfer_template' to 'dmaxfer_template' in order to
   preserve the namespace, closer to what Barry Song suggested.

 Documentation/dmaengine.txt |    8 ++++
 drivers/dma/dmaengine.c     |    4 +-
 include/linux/dmaengine.h   |   77 +++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 85 insertions(+), 4 deletions(-)

diff --git a/Documentation/dmaengine.txt b/Documentation/dmaengine.txt
index 94b7e0f..962a2d3 100644
--- a/Documentation/dmaengine.txt
+++ b/Documentation/dmaengine.txt
@@ -75,6 +75,10 @@ The slave DMA usage consists of following steps:
    slave_sg	- DMA a list of scatter gather buffers from/to a peripheral
    dma_cyclic	- Perform a cyclic DMA operation from/to a peripheral till the
 		  operation is explicitly stopped.
+   interleaved_dma - This is common to Slave as well as M2M clients. For slave
+		 channels, the address of the device's fifo may already be known
+		 to the driver. Various types of operations can be expressed by
+		 setting appropriate values in the 'dmaxfer_template' members.
 
    A non-NULL return of this transfer API represents a "descriptor" for
    the given transaction.
@@ -89,6 +93,10 @@ The slave DMA usage consists of following steps:
 		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
 		size_t period_len, enum dma_data_direction direction);
 
+	struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
+		struct dma_chan *chan, struct dmaxfer_template *xt,
+		unsigned long flags);
+
    The peripheral driver is expected to have mapped the scatterlist for
    the DMA operation prior to calling device_prep_slave_sg, and must
    keep the scatterlist mapped until the DMA operation has completed.
diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index b48967b..a6c6051 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -693,12 +693,12 @@ int dma_async_device_register(struct dma_device *device)
 		!device->device_prep_dma_interrupt);
 	BUG_ON(dma_has_cap(DMA_SG, device->cap_mask) &&
 		!device->device_prep_dma_sg);
-	BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
-		!device->device_prep_slave_sg);
 	BUG_ON(dma_has_cap(DMA_CYCLIC, device->cap_mask) &&
 		!device->device_prep_dma_cyclic);
 	BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
 		!device->device_control);
+	BUG_ON(dma_has_cap(DMA_INTERLEAVE, device->cap_mask) &&
+		!device->device_prep_interleaved_dma);
 
 	BUG_ON(!device->device_alloc_chan_resources);
 	BUG_ON(!device->device_free_chan_resources);
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 8fbf40e..fcc85d7 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -71,11 +71,78 @@ enum dma_transaction_type {
 	DMA_ASYNC_TX,
 	DMA_SLAVE,
 	DMA_CYCLIC,
+	DMA_INTERLEAVE,
+/* last transaction type for creation of the capabilities mask */
+	DMA_TX_TYPE_END,
 };
 
-/* last transaction type for creation of the capabilities mask */
-#define DMA_TX_TYPE_END (DMA_CYCLIC + 1)
+/**
+ * Interleaved Transfer Request
+ * ----------------------------
+ * A chunk is a collection of contiguous bytes to be transferred.
+ * The gap (in bytes) between two chunks is called the inter-chunk-gap (ICG).
+ * ICGs may or may not change between chunks.
+ * A FRAME is the smallest series of contiguous {chunk,icg} pairs
+ *  that, when repeated an integral number of times, specifies the transfer.
+ * A transfer template is the specification of a Frame, the number of times
+ *  it is to be repeated, and other per-transfer attributes.
+ *
+ * Practically, a client driver would have ready a template for each
+ *  type of transfer it is going to need during its lifetime and
+ *  set only 'src_start' and 'dst_start' before submitting the requests.
+ *
+ *
+ *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
+ *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
+ *
+ *    ==  Chunk size
+ *    ... ICG
+ */
 
+/**
+ * struct data_chunk - Element of scatter-gather list that makes a frame.
+ * @size: Number of bytes to read from source.
+ *	  size_dst := fn(op, size_src), so doesn't mean much for destination.
+ * @icg: Number of bytes to jump after last src/dst address of this
+ *	 chunk and before first src/dst address for next chunk.
+ *	 Ignored for dst(assumed 0), if dst_inc is true and dst_sgl is false.
+ *	 Ignored for src(assumed 0), if src_inc is true and src_sgl is false.
+ */
+struct data_chunk {
+	size_t size;
+	size_t icg;
+};
+
+/**
+ * struct dmaxfer_template - Template to convey DMAC the transfer pattern
+ *	 and attributes.
+ * @src_start: Bus address of source for the first chunk.
+ * @dst_start: Bus address of destination for the first chunk.
+ * @src_inc: If the source address increments after reading from it.
+ * @dst_inc: If the destination address increments after writing to it.
+ * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
+ *		Otherwise, source is read contiguously (icg ignored).
+ *		Ignored if src_inc is false.
+ * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
+ *		Otherwise, destination is filled contiguously (icg ignored).
+ *		Ignored if dst_inc is false.
+ * @frm_irq: If the client expects DMAC driver to do callback after each frame.
+ * @numf: Number of frames in this template.
+ * @frame_size: Number of chunks in a frame i.e, size of sgl[].
+ * @sgl: Array of {chunk,icg} pairs that make up a frame.
+ */
+struct dmaxfer_template {
+	dma_addr_t src_start;
+	dma_addr_t dst_start;
+	bool src_inc;
+	bool dst_inc;
+	bool src_sgl;
+	bool dst_sgl;
+	bool frm_irq;
+	size_t numf;
+	size_t frame_size;
+	struct data_chunk sgl[0];
+};
 
 /**
  * enum dma_ctrl_flags - DMA flags to augment operation preparation,
@@ -309,6 +376,8 @@ typedef void (*dma_async_tx_callback)(void *dma_async_param);
  * @chan: target channel for this operation
  * @tx_submit: set the prepared descriptor(s) to be executed by the engine
  * @callback: routine to call after this operation is complete
+ *	And after each frame if the 'frm_irq' flag is set during
+ *	device_prep_interleaved_dma.
  * @callback_param: general parameter to pass to the callback routine
  * ---async_tx api specific fields---
  * @next: at completion submit this descriptor
@@ -432,6 +501,7 @@ struct dma_tx_state {
  * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
  *	The function takes a buffer of size buf_len. The callback function will
  *	be called after period_len bytes have been transferred.
+ * @device_prep_interleaved_dma: Transfer expressed in a generic way.
  * @device_control: manipulate all pending operations on a channel, returns
  *	zero or error code
  * @device_tx_status: poll for transaction completion, the optional
@@ -496,6 +566,9 @@ struct dma_device {
 	struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
 		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
 		size_t period_len, enum dma_data_direction direction);
+	struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
+		struct dma_chan *chan, struct dmaxfer_template *xt,
+		unsigned long flags);
 	int (*device_control)(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
 		unsigned long arg);
 
-- 
1.7.4.1



* Re: [PATCHv3] DMAEngine: Define interleaved transfer request api
  2011-09-20 12:12   ` [PATCHv3] DMAEngine: Define interleaved " Jassi Brar
@ 2011-09-20 16:52     ` Vinod Koul
  2011-09-20 18:08       ` Jassi Brar
  2011-09-28  6:39     ` [PATCHv4] " Jassi Brar
  1 sibling, 1 reply; 131+ messages in thread
From: Vinod Koul @ 2011-09-20 16:52 UTC (permalink / raw)
  To: Jassi Brar; +Cc: dan.j.williams, linux-kernel, rmk, 21cnbao

On Tue, 2011-09-20 at 17:42 +0530, Jassi Brar wrote:
> Define a new api that could be used for doing fancy data transfers
> like interleaved to contiguous copy and vice-versa.
> Traditional SG_list based transfers tend to be very inefficient in
> such cases as where the interleave and chunk are only a few bytes,
> which call for a very condensed api to convey pattern of the transfer.
> This api supports all 4 variants of scatter-gather and contiguous transfer.
> 
> Of course, neither can this api help transfers that don't lend to DMA by
> nature, i.e, scattered tiny read/writes with no periodic pattern.
> 
> Also since now we support SLAVE channels that might not provide
> device_prep_slave_sg callback but device_prep_interleaved_dma,
> remove the BUG_ON check.
> 
> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
> ---
> 
> Changes since v2:
> 1) Added some notes to documentation.
> 2) Removed the BUG_ON check that expects every SLAVE channel to
>    provide a prep_slave_sg, as we are now valid otherwise too.
> 3) Fixed the DMA_TX_TYPE_END offset - made it last element of enum.
> 4) Renamed prep_dma_genxfer to prep_interleaved_dma as Vinod wanted.
> 
> Changes since v1:
> 1) Dropped the 'dma_transaction_type' member until we really
>    merge another type into this api. Instead added special
>    type for this api - DMA_GENXFER in dma_transaction_type.
> 2) Renamed 'xfer_template' to 'dmaxfer_template' inorder to
>    preserve namespace, closer to as suggested by Barry Song.
> 
>  Documentation/dmaengine.txt |    8 ++++
>  drivers/dma/dmaengine.c     |    4 +-
>  include/linux/dmaengine.h   |   77 +++++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 85 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/dmaengine.txt b/Documentation/dmaengine.txt
> index 94b7e0f..962a2d3 100644
> --- a/Documentation/dmaengine.txt
> +++ b/Documentation/dmaengine.txt
> @@ -75,6 +75,10 @@ The slave DMA usage consists of following steps:
>     slave_sg	- DMA a list of scatter gather buffers from/to a peripheral
>     dma_cyclic	- Perform a cyclic DMA operation from/to a peripheral till the
>  		  operation is explicitly stopped.
> +   interleaved_dma - This is common to Slave as well as M2M clients. For slave
> +		 address of devices' fifo could be already known to the driver.
> +		 Various types of operations could be expressed by setting
> +		 appropriate values to the 'dmaxfer_template' members.
>  
>     A non-NULL return of this transfer API represents a "descriptor" for
>     the given transaction.
> @@ -89,6 +93,10 @@ The slave DMA usage consists of following steps:
>  		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
>  		size_t period_len, enum dma_data_direction direction);
>  
> +	struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
> +		struct dma_chan *chan, struct dmaxfer_template *xt,
> +		unsigned long flags);
> +
>     The peripheral driver is expected to have mapped the scatterlist for
>     the DMA operation prior to calling device_prep_slave_sg, and must
>     keep the scatterlist mapped until the DMA operation has completed.
> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> index b48967b..a6c6051 100644
> --- a/drivers/dma/dmaengine.c
> +++ b/drivers/dma/dmaengine.c
> @@ -693,12 +693,12 @@ int dma_async_device_register(struct dma_device *device)
>  		!device->device_prep_dma_interrupt);
>  	BUG_ON(dma_has_cap(DMA_SG, device->cap_mask) &&
>  		!device->device_prep_dma_sg);
> -	BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
> -		!device->device_prep_slave_sg);
>  	BUG_ON(dma_has_cap(DMA_CYCLIC, device->cap_mask) &&
>  		!device->device_prep_dma_cyclic);
>  	BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
>  		!device->device_control);
> +	BUG_ON(dma_has_cap(DMA_INTERLEAVE, device->cap_mask) &&
> +		!device->device_prep_interleaved_dma);
>  
>  	BUG_ON(!device->device_alloc_chan_resources);
>  	BUG_ON(!device->device_free_chan_resources);
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index 8fbf40e..fcc85d7 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -71,11 +71,78 @@ enum dma_transaction_type {
>  	DMA_ASYNC_TX,
>  	DMA_SLAVE,
>  	DMA_CYCLIC,
> +	DMA_INTERLEAVE,
> +/* last transaction type for creation of the capabilities mask */
> +	DMA_TX_TYPE_END,
>  };
>  
> -/* last transaction type for creation of the capabilities mask */
> -#define DMA_TX_TYPE_END (DMA_CYCLIC + 1)
> +/**
> + * Interleaved Transfer Request
> + * ----------------------------
> + * A chunk is collection of contiguous bytes to be transfered.
> + * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
> + * ICGs may or maynot change between chunks.
> + * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
> + *  that when repeated an integral number of times, specifies the transfer.
> + * A transfer template is specification of a Frame, the number of times
> + *  it is to be repeated and other per-transfer attributes.
> + *
> + * Practically, a client driver would have ready a template for each
> + *  type of transfer it is going to need during its lifetime and
> + *  set only 'src_start' and 'dst_start' before submitting the requests.
> + *
> + *
> + *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
> + *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
> + *
> + *    ==  Chunk size
> + *    ... ICG
> + */
>  
> +/**
> + * struct data_chunk - Element of scatter-gather list that makes a frame.
> + * @size: Number of bytes to read from source.
> + *	  size_dst := fn(op, size_src), so doesn't mean much for destination.
> + * @icg: Number of bytes to jump after last src/dst address of this
> + *	 chunk and before first src/dst address for next chunk.
> + *	 Ignored for dst(assumed 0), if dst_inc is true and dst_sgl is false.
> + *	 Ignored for src(assumed 0), if src_inc is true and src_sgl is false.
> + */
> +struct data_chunk {
> +	size_t size;
> +	size_t icg;
> +};
> +
> +/**
> + * struct dmaxfer_template - Template to convey DMAC the transfer pattern
> + *	 and attributes.
> + * @src_start: Bus address of source for the first chunk.
> + * @dst_start: Bus address of destination for the first chunk.
> + * @src_inc: If the source address increments after reading from it.
> + * @dst_inc: If the destination address increments after writing to it.
> + * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
> + *		Otherwise, source is read contiguously (icg ignored).
> + *		Ignored if src_inc is false.
> + * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
> + *		Otherwise, destination is filled contiguously (icg ignored).
> + *		Ignored if dst_inc is false.
> + * @frm_irq: If the client expects DMAC driver to do callback after each frame.
> + * @numf: Number of frames in this template.
> + * @frame_size: Number of chunks in a frame i.e, size of sgl[].
> + * @sgl: Array of {chunk,icg} pairs that make up a frame.
> + */
> +struct dmaxfer_template {
> +	dma_addr_t src_start;
> +	dma_addr_t dst_start;
> +	bool src_inc;
> +	bool dst_inc;
> +	bool src_sgl;
> +	bool dst_sgl;
> +	bool frm_irq;
> +	size_t numf;
> +	size_t frame_size;
> +	struct data_chunk sgl[0];
> +};
>  
>  /**
>   * enum dma_ctrl_flags - DMA flags to augment operation preparation,
> @@ -309,6 +376,8 @@ typedef void (*dma_async_tx_callback)(void *dma_async_param);
>   * @chan: target channel for this operation
>   * @tx_submit: set the prepared descriptor(s) to be executed by the engine
>   * @callback: routine to call after this operation is complete
> + *	And after each frame if the 'frm_irq' flag is set during
> + *	device_prep_interleaved_dma.
Nope, if the callback is set it should be called; additionally you may use
the flag DMA_PREP_INTERRUPT (which a few dmacs use).
I think frm_irq is not required; if you need it, please use
DMA_PREP_INTERRUPT passed thru the flags arg

Also for slave transfers, how do we infer direction?
>   * @callback_param: general parameter to pass to the callback routine
>   * ---async_tx api specific fields---
>   * @next: at completion submit this descriptor
> @@ -432,6 +501,7 @@ struct dma_tx_state {
>   * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
>   *	The function takes a buffer of size buf_len. The callback function will
>   *	be called after period_len bytes have been transferred.
> + * @device_prep_interleaved_dma: Transfer expression in a generic way.
>   * @device_control: manipulate all pending operations on a channel, returns
>   *	zero or error code
>   * @device_tx_status: poll for transaction completion, the optional
> @@ -496,6 +566,9 @@ struct dma_device {
>  	struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
>  		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
>  		size_t period_len, enum dma_data_direction direction);
> +	struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
> +		struct dma_chan *chan, struct dmaxfer_template *xt,
> +		unsigned long flags);
>  	int (*device_control)(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
>  		unsigned long arg);
>  


-- 
~Vinod



* Re: [PATCHv3] DMAEngine: Define interleaved transfer request api
  2011-09-20 16:52     ` Vinod Koul
@ 2011-09-20 18:08       ` Jassi Brar
  2011-09-21  6:32         ` Vinod Koul
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-09-20 18:08 UTC (permalink / raw)
  To: Vinod Koul; +Cc: dan.j.williams, linux-kernel, rmk, 21cnbao

On 20 September 2011 22:22, Vinod Koul <vinod.koul@intel.com> wrote:
>>  /**
>>   * enum dma_ctrl_flags - DMA flags to augment operation preparation,
>> @@ -309,6 +376,8 @@ typedef void (*dma_async_tx_callback)(void *dma_async_param);
>>   * @chan: target channel for this operation
>>   * @tx_submit: set the prepared descriptor(s) to be executed by the engine
>>   * @callback: routine to call after this operation is complete
>> + *   And after each frame if the 'frm_irq' flag is set during
>> + *   device_prep_interleaved_dma.
> Nope, if callback is set it should be called
What makes you think it won't be called? Note that the sentence starts with _And_.
'frm_irq' only causes the callback to be invoked _additionally_ after each frame.

And no, DMA_PREP_INTERRUPT can't serve that purpose, because it acts only
at the end of the full transfer, not after a part of it.

> I think frm_irq is not required
Barry doesn't need it, so yes, I'd better remove it for now and make
my life easier.


> Also for slave transfers, how do we infer direction?
I already explained this to Barry.  Here it is again.

At any time, the dmac driver knows whether the channel on which the xfer
is prepared/submitted is a Slave channel or not.

SLAVE Transfer
   dmaxfer_template.src_inc  && !dmaxfer_template.dst_inc  => DMA_TO_DEVICE
   !dmaxfer_template.src_inc  && dmaxfer_template.dst_inc  => DMA_FROM_DEVICE

Mem->Mem Transfer
   dmaxfer_template.src_inc  && !dmaxfer_template.dst_inc  => Meaningless
   !dmaxfer_template.src_inc  && dmaxfer_template.dst_inc  => MemSet
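For illustration, the inference table above can be captured in a small helper. This is a userspace sketch only, not part of any posted patch; the enum and function names are made up:

```c
#include <stdbool.h>

/* Illustrative transfer types; names are hypothetical, not from the patch. */
enum inferred_xfer {
	XFER_INVALID,
	XFER_TO_DEV,	/* maps to DMA_TO_DEVICE for slave transfers */
	XFER_FROM_DEV,	/* maps to DMA_FROM_DEVICE for slave transfers */
	XFER_MEMCPY,
	XFER_MEMSET,
};

/*
 * Infer the transfer type from the increment flags. 'slave' is known to
 * the dmac driver from the channel the transfer was prepared on.
 */
static enum inferred_xfer infer_xfer(bool slave, bool src_inc, bool dst_inc)
{
	if (slave) {
		if (src_inc && !dst_inc)
			return XFER_TO_DEV;	/* Mem -> device FIFO */
		if (!src_inc && dst_inc)
			return XFER_FROM_DEV;	/* device FIFO -> Mem */
		return XFER_INVALID;
	}
	/* Mem->Mem channel */
	if (src_inc && dst_inc)
		return XFER_MEMCPY;
	if (!src_inc && dst_inc)
		return XFER_MEMSET;	/* constant source filling memory */
	return XFER_INVALID;		/* src_inc && !dst_inc: meaningless */
}
```

As Russell points out later in the thread, this inference breaks down for slave transfers where neither address increments, which is why later revisions carry an explicit direction instead.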


* Re: [PATCHv3] DMAEngine: Define interleaved transfer request api
  2011-09-20 18:08       ` Jassi Brar
@ 2011-09-21  6:32         ` Vinod Koul
  2011-09-21  6:45           ` Jassi Brar
  0 siblings, 1 reply; 131+ messages in thread
From: Vinod Koul @ 2011-09-21  6:32 UTC (permalink / raw)
  To: Jassi Brar; +Cc: dan.j.williams, linux-kernel, rmk, 21cnbao

On Tue, 2011-09-20 at 23:38 +0530, Jassi Brar wrote:
> On 20 September 2011 22:22, Vinod Koul <vinod.koul@intel.com> wrote:
> > Also for slave transfers, how do we infer direction?
> I already explained to Barry.  Here's it again.
> 
> At any time, the dmac driver knows if the channel, on which the xfer is
> prepared/submitted is Slave or not.
> 
> SLAVE Transfer
>    dmaxfer_template.src_inc  && !dmaxfer_template.dst_inc  => DMA_TO_DEVICE
>    !dmaxfer_template.src_inc  && dmaxfer_template.dst_inc  => DMA_FROM_DEVICE
> 
> Mem->Mem Transfer
>    dmaxfer_template.src_inc  && !dmaxfer_template.dst_inc  => Meaningless
>    !dmaxfer_template.src_inc  && dmaxfer_template.dst_inc  => MemSet
Rather than each driver adding this logic, with a good chance of screwing
it up, care to add this as a helper in dmaengine.h?

Ideally, I would have preferred the direction to be told explicitly; would
leave it to you..

-- 
~Vinod



* Re: [PATCHv3] DMAEngine: Define interleaved transfer request api
  2011-09-21  6:32         ` Vinod Koul
@ 2011-09-21  6:45           ` Jassi Brar
  2011-09-21  6:51             ` Vinod Koul
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-09-21  6:45 UTC (permalink / raw)
  To: Vinod Koul; +Cc: dan.j.williams, linux-kernel, rmk, 21cnbao

On 21 September 2011 12:02, Vinod Koul <vinod.koul@intel.com> wrote:
> On Tue, 2011-09-20 at 23:38 +0530, Jassi Brar wrote:
>> On 20 September 2011 22:22, Vinod Koul <vinod.koul@intel.com> wrote:
>> > Also for slave transfers, how do we infer direction?
>> I already explained to Barry.  Here's it again.
>>
>> At any time, the dmac driver knows if the channel, on which the xfer is
>> prepared/submitted is Slave or not.
>>
>> SLAVE Transfer
>>    dmaxfer_template.src_inc  && !dmaxfer_template.dst_inc  => DMA_TO_DEVICE
>>    !dmaxfer_template.src_inc  && dmaxfer_template.dst_inc  => DMA_FROM_DEVICE
>>
>> Mem->Mem Transfer
>>    dmaxfer_template.src_inc  && !dmaxfer_template.dst_inc  => Meaningless
>>    !dmaxfer_template.src_inc  && dmaxfer_template.dst_inc  => MemSet
> Rather than each driver adding this logic with good chance of screwing
> up, care to add this as helper in dmaengine.h
>
> Ideally, I would have preferred direction to be told explicitly, would
> leave it you..
>
I repeat yet again :- "This api is common to Slave as well as Mem<->Mem"

Even if we have slave clients specify DMA_TO/FROM_DEVICE, what
flag do you suggest Mem->Mem clients use?


* Re: [PATCHv3] DMAEngine: Define interleaved transfer request api
  2011-09-21  6:45           ` Jassi Brar
@ 2011-09-21  6:51             ` Vinod Koul
  2011-09-21  7:31               ` Jassi Brar
  0 siblings, 1 reply; 131+ messages in thread
From: Vinod Koul @ 2011-09-21  6:51 UTC (permalink / raw)
  To: Jassi Brar; +Cc: dan.j.williams, linux-kernel, rmk, 21cnbao

On Wed, 2011-09-21 at 12:15 +0530, Jassi Brar wrote:
> On 21 September 2011 12:02, Vinod Koul <vinod.koul@intel.com> wrote:
> > On Tue, 2011-09-20 at 23:38 +0530, Jassi Brar wrote:
> >> On 20 September 2011 22:22, Vinod Koul <vinod.koul@intel.com> wrote:
> >> > Also for slave transfers, how do we infer direction?
> >> I already explained to Barry.  Here's it again.
> >>
> >> At any time, the dmac driver knows if the channel, on which the xfer is
> >> prepared/submitted is Slave or not.
> >>
> >> SLAVE Transfer
> >>    dmaxfer_template.src_inc  && !dmaxfer_template.dst_inc  => DMA_TO_DEVICE
> >>    !dmaxfer_template.src_inc  && dmaxfer_template.dst_inc  => DMA_FROM_DEVICE
> >>
> >> Mem->Mem Transfer
> >>    dmaxfer_template.src_inc  && !dmaxfer_template.dst_inc  => Meaningless
> >>    !dmaxfer_template.src_inc  && dmaxfer_template.dst_inc  => MemSet
> > Rather than each driver adding this logic with good chance of screwing
> > up, care to add this as helper in dmaengine.h
> >
> > Ideally, I would have preferred direction to be told explicitly, would
> > leave it you..
> >
> I repeat yet again :- "This api is common to Slave as well as Mem<->Mem"
> 
> Even if we have slave clients specify DMA_TO/FROM_DEVICE, what
> flag do you suggest Mem->Mem clients use ?
It would simply be invalid in that case! The same way a few fields in
xt_template would be for the peripheral case..

-- 
~Vinod



* Re: [PATCHv3] DMAEngine: Define interleaved transfer request api
  2011-09-21  6:51             ` Vinod Koul
@ 2011-09-21  7:31               ` Jassi Brar
  2011-09-21 10:18                 ` Russell King
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-09-21  7:31 UTC (permalink / raw)
  To: Vinod Koul; +Cc: dan.j.williams, linux-kernel, rmk, 21cnbao

On 21 September 2011 12:21, Vinod Koul <vinod.koul@intel.com> wrote:
>> >> > Also for slave transfers, how do we infer direction?
>> >> I already explained to Barry.  Here's it again.
>> >>
>> >> At any time, the dmac driver knows if the channel, on which the xfer is
>> >> prepared/submitted is Slave or not.
>> >>
>> >> SLAVE Transfer
>> >>    dmaxfer_template.src_inc  && !dmaxfer_template.dst_inc  => DMA_TO_DEVICE
>> >>    !dmaxfer_template.src_inc  && dmaxfer_template.dst_inc  => DMA_FROM_DEVICE
>> >>
>> >> Mem->Mem Transfer
>> >>    dmaxfer_template.src_inc  && !dmaxfer_template.dst_inc  => Meaningless
>> >>    !dmaxfer_template.src_inc  && dmaxfer_template.dst_inc  => MemSet
>> > Rather than each driver adding this logic with good chance of screwing
>> > up, care to add this as helper in dmaengine.h
>> >
>> > Ideally, I would have preferred direction to be told explicitly, would
>> > leave it you..
>> >
>> I repeat yet again :- "This api is common to Slave as well as Mem<->Mem"
>>
>> Even if we have slave clients specify DMA_TO/FROM_DEVICE, what
>> flag do you suggest Mem->Mem clients use ?
> Would it be invalid in that case!!!
Of course!
It doesn't make any sense for a mem->mem client to ask
for DMA_FROM_DEVICE when it wants MemSet.

> Same as few ields in xt_template would be for peripheral case..
No properly written client would ever pass 'invalid' values for
xt_template members.
But _every_ properly written Mem->Mem client would have to
ask for DMA_TO/FROM_DEVICE, following what you suggest.

Btw, even if we wanted to, Memcpy couldn't be represented by
either DMA_TO_DEVICE or DMA_FROM_DEVICE, because
both Src and Dst increment here.


* Re: [PATCHv3] DMAEngine: Define interleaved transfer request api
  2011-09-21  7:31               ` Jassi Brar
@ 2011-09-21 10:18                 ` Russell King
  2011-09-21 15:21                   ` Jassi Brar
  0 siblings, 1 reply; 131+ messages in thread
From: Russell King @ 2011-09-21 10:18 UTC (permalink / raw)
  To: Jassi Brar; +Cc: Vinod Koul, dan.j.williams, linux-kernel, 21cnbao

On Wed, Sep 21, 2011 at 01:01:24PM +0530, Jassi Brar wrote:
> Of course!
> It doesn't make any sense for a mem->mem client asking
> for DMA_FROM_DEVICE when it wanted MemSet.

Yes it does.  For memset, you _are_ DMAing data from the DMA device to
memory.  The fact that the data is constant is merely incidental (and
a private property of the DMA controller.)

Also, this memory *has* to be mapped via the DMA API to ensure coherency,
and it *has* to be mapped using DMA_FROM_DEVICE.  Otherwise you won't see
the data until effects such as cache eviction have happened (even then
you may find your nicely memset'd data is overwritten by dirty cache
lines.)

So, to say that M2M transfers don't have DMA_FROM_DEVICE/TO_DEVICE
properties shows a lack of appreciation for what is actually going on
at the hardware level.

> > Same as few ields in xt_template would be for peripheral case..
>
> No properly written client would ever pass 'invalid' values for
> xt_template members.
> But _every_ properly written Mem->Mem client would have to
> ask for DMA_TO/FROM_DEVICE, following what you suggest.

Which is definitely a good thing.

> Btw, even if we wanted, Memcpy couldn't be represented by
> either DMA_TO_DEVICE or  DMA_FROM_DEVICE, because
> both Src and Dst increments here.

memcpy(dst,src,len) would require the source to be mapped DMA_TO_DEVICE
and the destination mapped DMA_FROM_DEVICE via the DMA API.

Look, the DMA_TO_DEVICE/FROM_DEVICE are completely separate properties
from "do we increment the source" "do we increment the destination"
properties.

There are existing client drivers in the kernel which already use M->P
DMA with non-incrementing source memory addresses (SPI transfers where
the transmit path has to be loaded with dummy data) and P->M DMA with
non-incrementing destination memory addresses (SPI discarding RX data.)

So, there's _absolutely_ no way that any sane API can ever infer the
DMA direction from the source/destination increment specifications.
Ignore this at your peril (and you'll find that people will botch
around your new API.)
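Russell's SPI counter-examples can be stated concretely. A userspace sketch (field and struct names are illustrative; xfer_direction is the enum Jassi proposes later in the thread):

```c
#include <stdbool.h>

enum xfer_direction { MEM_TO_MEM, MEM_TO_DEV, DEV_TO_MEM, DEV_TO_DEV };

struct slave_cfg {
	bool src_inc;
	bool dst_inc;
	enum xfer_direction dir;	/* what the client actually wants */
};

/* SPI TX fed from a single dummy word: the memory side does not
 * increment, yet data still flows memory -> device. */
static const struct slave_cfg spi_tx_dummy = {
	.src_inc = false, .dst_inc = false, .dir = MEM_TO_DEV,
};

/* SPI RX discarded into a single scratch word: the memory side does
 * not increment either, and data flows device -> memory. */
static const struct slave_cfg spi_rx_discard = {
	.src_inc = false, .dst_inc = false, .dir = DEV_TO_MEM,
};

/* With identical increment flags and opposite directions, inference
 * from src_inc/dst_inc alone cannot tell these two apart. */
static bool same_inc_flags(const struct slave_cfg *a, const struct slave_cfg *b)
{
	return a->src_inc == b->src_inc && a->dst_inc == b->dst_inc;
}
```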

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:


* Re: [PATCHv3] DMAEngine: Define interleaved transfer request api
  2011-09-21 10:18                 ` Russell King
@ 2011-09-21 15:21                   ` Jassi Brar
  0 siblings, 0 replies; 131+ messages in thread
From: Jassi Brar @ 2011-09-21 15:21 UTC (permalink / raw)
  To: Russell King; +Cc: Vinod Koul, dan.j.williams, linux-kernel, 21cnbao

On 21 September 2011 15:48, Russell King <rmk@arm.linux.org.uk> wrote:
> On Wed, Sep 21, 2011 at 01:01:24PM +0530, Jassi Brar wrote:
>> Of course!
>> It doesn't make any sense for a mem->mem client asking
>> for DMA_FROM_DEVICE when it wanted MemSet.
>
> Yes it does.  For memset, you _are_ DMAing data from the DMA device to
> memory.  The fact that the data is constant is merely incidental (and
> a private property of the DMA controller.)
>
> Also, this memory *has* to be mapped via the DMA API to ensure coherency,
> and it *has* to be mapped using DMA_FROM_DEVICE.  Otherwise you won't see
> the data until effects such as cache eviction have happened (even then
> you may find your nicely memset'd data is overwritten by dirty cache
> lines.)
>
> So, to say that M2M transfers don't have DMA_FROM_DEVICE/TO_DEVICE
> properties shows a lack of appreciation for what is actually going on
> at the hardware level.
I don't discount that. I just think memory map/unmap'ing should not be
any business of the dmac driver; it should be handled either by the client
or by some common part of the dmaengine api.
In fact, I never really asked 'why' the dmac driver needs to know the
direction, rather I only tried to explain 'how' it can find out.

Anyway, I am OK if this new api too must support map/unmap'ing of
buffers on behalf of the client.

> So, there's _absolutely_ no way that any sane API can ever infer the
> DMA direction from the source/destination increment specifications.
> Ignore this at your peril (and you'll find that people will botch
> around your new API.)
Yes. But having an 'enum dma_data_direction' flag wouldn't make it
future-proof either.

Seems I would have to define something like:

enum xfer_direction {
	MEM_TO_MEM,
	MEM_TO_DEV,
	DEV_TO_MEM,
	DEV_TO_DEV,
};

And replace every 'enum dma_data_direction' in the dmaengine api
with 'enum xfer_direction'
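With such an enum, a mem->dev client could fill the transfer template roughly as below. This is a userspace sketch with the kernel types stubbed out (malloc-family allocation stands in for the kernel allocator, unsigned long for dma_addr_t, and the struct is trimmed down from the v4 patch):

```c
#include <stdbool.h>
#include <stdlib.h>

enum xfer_direction { MEM_TO_MEM, MEM_TO_DEV, DEV_TO_MEM, DEV_TO_DEV };

struct data_chunk {
	size_t size;	/* bytes to read from source */
	size_t icg;	/* gap before the next chunk */
};

/* Trimmed-down stand-in for the patch's dmaxfer_template. */
struct dmaxfer_template {
	unsigned long src_start;	/* dma_addr_t in the kernel */
	unsigned long dst_start;
	enum xfer_direction dir;
	bool src_inc, dst_inc, src_sgl, dst_sgl;
	size_t numf;
	size_t frame_size;
	struct data_chunk sgl[];	/* one entry per chunk in a frame */
};

/*
 * Build a template for 'numf' frames of one chunk each: read 'size'
 * bytes, skip 'icg' bytes, and write into a fixed device FIFO.
 */
static struct dmaxfer_template *mk_mem_to_dev(size_t numf, size_t size,
					      size_t icg)
{
	struct dmaxfer_template *xt =
		calloc(1, sizeof(*xt) + sizeof(struct data_chunk));

	if (!xt)
		return NULL;
	xt->dir = MEM_TO_DEV;
	xt->src_inc = true;	/* walk through source memory */
	xt->dst_inc = false;	/* destination is a device FIFO */
	xt->src_sgl = true;	/* icg applies on the source side */
	xt->numf = numf;
	xt->frame_size = 1;
	xt->sgl[0].size = size;
	xt->sgl[0].icg = icg;
	return xt;
}
```

The client would then set only src_start (and dst_start, where not implied by the slave channel) before each submission, as the in-patch documentation suggests.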

Thanks.


* [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-09-20 12:12   ` [PATCHv3] DMAEngine: Define interleaved " Jassi Brar
  2011-09-20 16:52     ` Vinod Koul
@ 2011-09-28  6:39     ` Jassi Brar
  2011-09-28  9:03       ` Vinod Koul
  2011-10-13  7:03       ` [PATCHv5] " Jassi Brar
  1 sibling, 2 replies; 131+ messages in thread
From: Jassi Brar @ 2011-09-28  6:39 UTC (permalink / raw)
  To: linux-kernel, dan.j.williams, vkoul; +Cc: rmk, 21cnbao, Jassi Brar

Define a new api that could be used for doing fancy data transfers
like interleaved to contiguous copy and vice-versa.
Traditional SG_list based transfers tend to be very inefficient in
such cases, where the interleave and chunk are only a few bytes,
which calls for a very condensed api to convey the pattern of the transfer.
This api supports all 4 variants of scatter-gather and contiguous transfer.

Of course, neither can this api help transfers that don't lend themselves
to DMA by nature, i.e., scattered tiny read/writes with no periodic pattern.

Also since now we support SLAVE channels that might not provide
device_prep_slave_sg callback but device_prep_interleaved_dma,
remove the BUG_ON check.

Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
---

Changes since v3:
1) Added explicit type for source and destination.

Changes since v2:
1) Added some notes to documentation.
2) Removed the BUG_ON check that expects every SLAVE channel to
   provide a prep_slave_sg, as we are now valid otherwise too.
3) Fixed the DMA_TX_TYPE_END offset - made it last element of enum.
4) Renamed prep_dma_genxfer to prep_interleaved_dma as Vinod wanted.

Changes since v1:
1) Dropped the 'dma_transaction_type' member until we really
   merge another type into this api. Instead added special
   type for this api - DMA_GENXFER in dma_transaction_type.
2) Renamed 'xfer_template' to 'dmaxfer_template' in order to
   preserve the namespace, closer to what Barry Song suggested.

 Documentation/dmaengine.txt |    8 ++++
 drivers/dma/dmaengine.c     |    4 +-
 include/linux/dmaengine.h   |   86 ++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 94 insertions(+), 4 deletions(-)

diff --git a/Documentation/dmaengine.txt b/Documentation/dmaengine.txt
index 94b7e0f..962a2d3 100644
--- a/Documentation/dmaengine.txt
+++ b/Documentation/dmaengine.txt
@@ -75,6 +75,10 @@ The slave DMA usage consists of following steps:
    slave_sg	- DMA a list of scatter gather buffers from/to a peripheral
    dma_cyclic	- Perform a cyclic DMA operation from/to a peripheral till the
 		  operation is explicitly stopped.
+   interleaved_dma - This is common to Slave as well as M2M clients. For slave
+		 transfers, the device's fifo address may already be known to the
+		 driver. Various types of operations can be expressed by setting
+		 appropriate values in the 'dmaxfer_template' members.
 
    A non-NULL return of this transfer API represents a "descriptor" for
    the given transaction.
@@ -89,6 +93,10 @@ The slave DMA usage consists of following steps:
 		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
 		size_t period_len, enum dma_data_direction direction);
 
+	struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
+		struct dma_chan *chan, struct dmaxfer_template *xt,
+		unsigned long flags);
+
    The peripheral driver is expected to have mapped the scatterlist for
    the DMA operation prior to calling device_prep_slave_sg, and must
    keep the scatterlist mapped until the DMA operation has completed.
diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index b48967b..a6c6051 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -693,12 +693,12 @@ int dma_async_device_register(struct dma_device *device)
 		!device->device_prep_dma_interrupt);
 	BUG_ON(dma_has_cap(DMA_SG, device->cap_mask) &&
 		!device->device_prep_dma_sg);
-	BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
-		!device->device_prep_slave_sg);
 	BUG_ON(dma_has_cap(DMA_CYCLIC, device->cap_mask) &&
 		!device->device_prep_dma_cyclic);
 	BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
 		!device->device_control);
+	BUG_ON(dma_has_cap(DMA_INTERLEAVE, device->cap_mask) &&
+		!device->device_prep_interleaved_dma);
 
 	BUG_ON(!device->device_alloc_chan_resources);
 	BUG_ON(!device->device_free_chan_resources);
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 8fbf40e..7d6f9d7 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -71,11 +71,87 @@ enum dma_transaction_type {
 	DMA_ASYNC_TX,
 	DMA_SLAVE,
 	DMA_CYCLIC,
+	DMA_INTERLEAVE,
+/* last transaction type for creation of the capabilities mask */
+	DMA_TX_TYPE_END,
 };
 
-/* last transaction type for creation of the capabilities mask */
-#define DMA_TX_TYPE_END (DMA_CYCLIC + 1)
+enum xfer_direction {
+	MEM_TO_MEM,
+	MEM_TO_DEV,
+	DEV_TO_MEM,
+	DEV_TO_DEV,
+};
+
+/**
+ * Interleaved Transfer Request
+ * ----------------------------
+ * A chunk is a collection of contiguous bytes to be transferred.
+ * The gap (in bytes) between two chunks is called the inter-chunk-gap (ICG).
+ * ICGs may or may not change between chunks.
+ * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
+ *  that when repeated an integral number of times, specifies the transfer.
+ * A transfer template is specification of a Frame, the number of times
+ *  it is to be repeated and other per-transfer attributes.
+ *
+ * Practically, a client driver would have ready a template for each
+ *  type of transfer it is going to need during its lifetime and
+ *  set only 'src_start' and 'dst_start' before submitting the requests.
+ *
+ *
+ *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
+ *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
+ *
+ *    ==  Chunk size
+ *    ... ICG
+ */
 
+/**
+ * struct data_chunk - Element of scatter-gather list that makes a frame.
+ * @size: Number of bytes to read from source.
+ *	  size_dst := fn(op, size_src), so doesn't mean much for destination.
+ * @icg: Number of bytes to jump after last src/dst address of this
+ *	 chunk and before first src/dst address for next chunk.
+ *	 Ignored for dst(assumed 0), if dst_inc is true and dst_sgl is false.
+ *	 Ignored for src(assumed 0), if src_inc is true and src_sgl is false.
+ */
+struct data_chunk {
+	size_t size;
+	size_t icg;
+};
+
+/**
+ * struct dmaxfer_template - Template to convey DMAC the transfer pattern
+ *	 and attributes.
+ * @src_start: Bus address of source for the first chunk.
+ * @dst_start: Bus address of destination for the first chunk.
+ * @dir: Specifies the type of Source and Destination.
+ * @src_inc: If the source address increments after reading from it.
+ * @dst_inc: If the destination address increments after writing to it.
+ * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
+ *		Otherwise, source is read contiguously (icg ignored).
+ *		Ignored if src_inc is false.
+ * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
+ *		Otherwise, destination is filled contiguously (icg ignored).
+ *		Ignored if dst_inc is false.
+ * @frm_irq: If the client expects DMAC driver to do callback after each frame.
+ * @numf: Number of frames in this template.
+ * @frame_size: Number of chunks in a frame i.e, size of sgl[].
+ * @sgl: Array of {chunk,icg} pairs that make up a frame.
+ */
+struct dmaxfer_template {
+	dma_addr_t src_start;
+	dma_addr_t dst_start;
+	enum xfer_direction dir;
+	bool src_inc;
+	bool dst_inc;
+	bool src_sgl;
+	bool dst_sgl;
+	bool frm_irq;
+	size_t numf;
+	size_t frame_size;
+	struct data_chunk sgl[0];
+};
 
 /**
  * enum dma_ctrl_flags - DMA flags to augment operation preparation,
@@ -309,6 +385,8 @@ typedef void (*dma_async_tx_callback)(void *dma_async_param);
  * @chan: target channel for this operation
  * @tx_submit: set the prepared descriptor(s) to be executed by the engine
  * @callback: routine to call after this operation is complete
+ *	And after each frame if the 'frm_irq' flag is set during
+ *	device_prep_interleaved_dma.
  * @callback_param: general parameter to pass to the callback routine
  * ---async_tx api specific fields---
  * @next: at completion submit this descriptor
@@ -432,6 +510,7 @@ struct dma_tx_state {
  * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
  *	The function takes a buffer of size buf_len. The callback function will
  *	be called after period_len bytes have been transferred.
+ * @device_prep_interleaved_dma: Transfer expression in a generic way.
  * @device_control: manipulate all pending operations on a channel, returns
  *	zero or error code
  * @device_tx_status: poll for transaction completion, the optional
@@ -496,6 +575,9 @@ struct dma_device {
 	struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
 		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
 		size_t period_len, enum dma_data_direction direction);
+	struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
+		struct dma_chan *chan, struct dmaxfer_template *xt,
+		unsigned long flags);
 	int (*device_control)(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
 		unsigned long arg);
 
-- 
1.7.4.1



* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-09-28  6:39     ` [PATCHv4] " Jassi Brar
@ 2011-09-28  9:03       ` Vinod Koul
  2011-09-28 15:15         ` Jassi Brar
  2011-10-13  7:03       ` [PATCHv5] " Jassi Brar
  1 sibling, 1 reply; 131+ messages in thread
From: Vinod Koul @ 2011-09-28  9:03 UTC (permalink / raw)
  To: Jassi Brar; +Cc: linux-kernel, dan.j.williams, rmk, 21cnbao

On Wed, 2011-09-28 at 12:09 +0530, Jassi Brar wrote:
> Define a new api that could be used for doing fancy data transfers
> like interleaved to contiguous copy and vice-versa.
> Traditional SG_list based transfers tend to be very inefficient in
> such cases as where the interleave and chunk are only a few bytes,
> which call for a very condensed api to convey pattern of the transfer.
> This api supports all 4 variants of scatter-gather and contiguous transfer.
> 
> Of course, neither can this api help transfers that don't lend to DMA by
> nature, i.e, scattered tiny read/writes with no periodic pattern.
> 
> Also since now we support SLAVE channels that might not provide
> device_prep_slave_sg callback but device_prep_interleaved_dma,
> remove the BUG_ON check.
> 
> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
> ---
> 
> Changes since v3:
> 1) Added explicit type for source and destination.
> 
> Changes since v2:
> 1) Added some notes to documentation.
> 2) Removed the BUG_ON check that expects every SLAVE channel to
>    provide a prep_slave_sg, as we are now valid otherwise too.
> 3) Fixed the DMA_TX_TYPE_END offset - made it last element of enum.
> 4) Renamed prep_dma_genxfer to prep_interleaved_dma as Vinod wanted.
> 
> Changes since v1:
> 1) Dropped the 'dma_transaction_type' member until we really
>    merge another type into this api. Instead added special
>    type for this api - DMA_GENXFER in dma_transaction_type.
> 2) Renamed 'xfer_template' to 'dmaxfer_template' inorder to
>    preserve namespace, closer to as suggested by Barry Song.
> 
>  Documentation/dmaengine.txt |    8 ++++
>  drivers/dma/dmaengine.c     |    4 +-
>  include/linux/dmaengine.h   |   86 ++++++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 94 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/dmaengine.txt b/Documentation/dmaengine.txt
> index 94b7e0f..962a2d3 100644
> --- a/Documentation/dmaengine.txt
> +++ b/Documentation/dmaengine.txt
> @@ -75,6 +75,10 @@ The slave DMA usage consists of following steps:
>     slave_sg	- DMA a list of scatter gather buffers from/to a peripheral
>     dma_cyclic	- Perform a cyclic DMA operation from/to a peripheral till the
>  		  operation is explicitly stopped.
> +   interleaved_dma - This is common to Slave as well as M2M clients. For slave
> +		 address of devices' fifo could be already known to the driver.
> +		 Various types of operations could be expressed by setting
> +		 appropriate values to the 'dmaxfer_template' members.
>  
>     A non-NULL return of this transfer API represents a "descriptor" for
>     the given transaction.
> @@ -89,6 +93,10 @@ The slave DMA usage consists of following steps:
>  		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
>  		size_t period_len, enum dma_data_direction direction);
>  
> +	struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
> +		struct dma_chan *chan, struct dmaxfer_template *xt,
> +		unsigned long flags);
what's the flags used for?
> +
>     The peripheral driver is expected to have mapped the scatterlist for
>     the DMA operation prior to calling device_prep_slave_sg, and must
>     keep the scatterlist mapped until the DMA operation has completed.
> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> index b48967b..a6c6051 100644
> --- a/drivers/dma/dmaengine.c
> +++ b/drivers/dma/dmaengine.c
> @@ -693,12 +693,12 @@ int dma_async_device_register(struct dma_device *device)
>  		!device->device_prep_dma_interrupt);
>  	BUG_ON(dma_has_cap(DMA_SG, device->cap_mask) &&
>  		!device->device_prep_dma_sg);
> -	BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
> -		!device->device_prep_slave_sg);
>  	BUG_ON(dma_has_cap(DMA_CYCLIC, device->cap_mask) &&
>  		!device->device_prep_dma_cyclic);
>  	BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
>  		!device->device_control);
> +	BUG_ON(dma_has_cap(DMA_INTERLEAVE, device->cap_mask) &&
> +		!device->device_prep_interleaved_dma);
>  
>  	BUG_ON(!device->device_alloc_chan_resources);
>  	BUG_ON(!device->device_free_chan_resources);
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index 8fbf40e..7d6f9d7 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -71,11 +71,87 @@ enum dma_transaction_type {
>  	DMA_ASYNC_TX,
>  	DMA_SLAVE,
>  	DMA_CYCLIC,
> +	DMA_INTERLEAVE,
> +/* last transaction type for creation of the capabilities mask */
> +	DMA_TX_TYPE_END,
>  };
>  
> -/* last transaction type for creation of the capabilities mask */
> -#define DMA_TX_TYPE_END (DMA_CYCLIC + 1)
> +enum xfer_direction {
> +	MEM_TO_MEM,
> +	MEM_TO_DEV,
> +	DEV_TO_MEM,
> +	DEV_TO_DEV,
Use/update dma_data_direction.
> +};
> +
> +/**
> + * Interleaved Transfer Request
> + * ----------------------------
> + * A chunk is collection of contiguous bytes to be transfered.
> + * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
> + * ICGs may or maynot change between chunks.
> + * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
> + *  that when repeated an integral number of times, specifies the transfer.
> + * A transfer template is specification of a Frame, the number of times
> + *  it is to be repeated and other per-transfer attributes.
> + *
> + * Practically, a client driver would have ready a template for each
> + *  type of transfer it is going to need during its lifetime and
> + *  set only 'src_start' and 'dst_start' before submitting the requests.
> + *
> + *
> + *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
> + *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
> + *
> + *    ==  Chunk size
> + *    ... ICG
> + */
>  
> +/**
> + * struct data_chunk - Element of scatter-gather list that makes a frame.
> + * @size: Number of bytes to read from source.
> + *	  size_dst := fn(op, size_src), so doesn't mean much for destination.
> + * @icg: Number of bytes to jump after last src/dst address of this
> + *	 chunk and before first src/dst address for next chunk.
> + *	 Ignored for dst(assumed 0), if dst_inc is true and dst_sgl is false.
> + *	 Ignored for src(assumed 0), if src_inc is true and src_sgl is false.
> + */
> +struct data_chunk {
> +	size_t size;
> +	size_t icg;
> +};
> +
> +/**
> + * struct dmaxfer_template - Template to convey DMAC the transfer pattern
> + *	 and attributes.
> + * @src_start: Bus address of source for the first chunk.
> + * @dst_start: Bus address of destination for the first chunk.
> + * @dir: Specifies the type of Source and Destination.
> + * @src_inc: If the source address increments after reading from it.
> + * @dst_inc: If the destination address increments after writing to it.
> + * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
> + *		Otherwise, source is read contiguously (icg ignored).
> + *		Ignored if src_inc is false.
> + * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
> + *		Otherwise, destination is filled contiguously (icg ignored).
> + *		Ignored if dst_inc is false.
> + * @frm_irq: If the client expects DMAC driver to do callback after each frame.
I thought you were going to remove this?

> + * @numf: Number of frames in this template.
> + * @frame_size: Number of chunks in a frame i.e, size of sgl[].
> + * @sgl: Array of {chunk,icg} pairs that make up a frame.
> + */
> +struct dmaxfer_template {
> +	dma_addr_t src_start;
> +	dma_addr_t dst_start;
> +	enum xfer_direction dir;
> +	bool src_inc;
> +	bool dst_inc;
> +	bool src_sgl;
> +	bool dst_sgl;
> +	bool frm_irq;
> +	size_t numf;
> +	size_t frame_size;
> +	struct data_chunk sgl[0];
> +};
>  
>  /**
>   * enum dma_ctrl_flags - DMA flags to augment operation preparation,
> @@ -309,6 +385,8 @@ typedef void (*dma_async_tx_callback)(void *dma_async_param);
>   * @chan: target channel for this operation
>   * @tx_submit: set the prepared descriptor(s) to be executed by the engine
>   * @callback: routine to call after this operation is complete
> + *	And after each frame if the 'frm_irq' flag is set during
> + *	device_prep_interleaved_dma.
Other APIs don't have frm_irq flag, so the comment is not generic

>   * @callback_param: general parameter to pass to the callback routine
>   * ---async_tx api specific fields---
>   * @next: at completion submit this descriptor
> @@ -432,6 +510,7 @@ struct dma_tx_state {
>   * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
>   *	The function takes a buffer of size buf_len. The callback function will
>   *	be called after period_len bytes have been transferred.
> + * @device_prep_interleaved_dma: Transfer expression in a generic way.
>   * @device_control: manipulate all pending operations on a channel, returns
>   *	zero or error code
>   * @device_tx_status: poll for transaction completion, the optional
> @@ -496,6 +575,9 @@ struct dma_device {
>  	struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
>  		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
>  		size_t period_len, enum dma_data_direction direction);
> +	struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
> +		struct dma_chan *chan, struct dmaxfer_template *xt,
> +		unsigned long flags);
>  	int (*device_control)(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
>  		unsigned long arg);
>  


-- 
~Vinod



* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-09-28  9:03       ` Vinod Koul
@ 2011-09-28 15:15         ` Jassi Brar
  2011-09-29 11:17           ` Vinod Koul
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-09-28 15:15 UTC (permalink / raw)
  To: Vinod Koul; +Cc: linux-kernel, dan.j.williams, rmk, 21cnbao

On 28 September 2011 14:33, Vinod Koul <vinod.koul@intel.com> wrote:
> On Wed, 2011-09-28 at 12:09 +0530, Jassi Brar wrote:

>>
>> +     struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
>> +             struct dma_chan *chan, struct dmaxfer_template *xt,
>> +             unsigned long flags);
> what's the flags used for?
Same as for device_prep_slave_sg, device_prep_dma_sg, device_prep_dma_interrupt,
device_prep_dma_memset etc. Usually for flagging buffer map/unmap'ing
in this case.
Btw, this was present in v3 as well.


>> +enum xfer_direction {
>> +     MEM_TO_MEM,
>> +     MEM_TO_DEV,
>> +     DEV_TO_MEM,
>> +     DEV_TO_DEV,
> Use/update dma_data_direction.
dma_data_direction is the mapping attribute of a buffer.
While that info is what some dmac driver might ultimately need, our main
aim here is to tell exactly whether Src and Dst are Memory or a device's FIFO.

The mapping attribute of the src/dst buffers could very well be deduced from
xfer_direction, but dma_data_direction isn't meant to tell whether Src and Dst
are Mem or FIFO.
Also, for (SLAVE && !src_inc && !dst_inc) we need to disambiguate three
options  Mem->Fifo,  Fifo->Mem,  *Fifo->Fifo*(not impossible)
So while using dma_data_direction would work today, that sure is hacky
and not future-proof.


>> +/**
>> + * struct dmaxfer_template - Template to convey DMAC the transfer pattern
>> + *    and attributes.
>> + * @src_start: Bus address of source for the first chunk.
>> + * @dst_start: Bus address of destination for the first chunk.
>> + * @dir: Specifies the type of Source and Destination.
>> + * @src_inc: If the source address increments after reading from it.
>> + * @dst_inc: If the destination address increments after writing to it.
>> + * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
>> + *           Otherwise, source is read contiguously (icg ignored).
>> + *           Ignored if src_inc is false.
>> + * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
>> + *           Otherwise, destination is filled contiguously (icg ignored).
>> + *           Ignored if dst_inc is false.
>> + * @frm_irq: If the client expects DMAC driver to do callback after each frame.
> I thought you were going to remove this?
Yes, I just forgot.


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-09-28 15:15         ` Jassi Brar
@ 2011-09-29 11:17           ` Vinod Koul
  2011-09-30  6:43             ` Barry Song
  2011-09-30 15:47             ` Jassi Brar
  0 siblings, 2 replies; 131+ messages in thread
From: Vinod Koul @ 2011-09-29 11:17 UTC (permalink / raw)
  To: Jassi Brar; +Cc: linux-kernel, dan.j.williams, rmk, 21cnbao

On Wed, 2011-09-28 at 20:45 +0530, Jassi Brar wrote:
> >> +enum xfer_direction {
> >> +     MEM_TO_MEM,
> >> +     MEM_TO_DEV,
> >> +     DEV_TO_MEM,
> >> +     DEV_TO_DEV,
> > Use/update dma_data_direction.
> dma_data_direction is the mapping attribute of a buffer.
> While that info is what some dmac driver might need ultimately, our
> main aim here is to tell exactly if Src and Dst is Memory or a
> device's FIFO.
> 
> Mapping attribute of src/dst buffers could be very well deducted from
> xfer_direction, but dma_data_direction isn't meant to tell if Src and
> Dst is Mem or FIFO.
> Also, for (SLAVE && !src_inc && !dst_inc) we need to disambiguate
> three options  Mem->Fifo,  Fifo->Mem,  *Fifo->Fifo*(not impossible)
> So while using dma_data_direction would work today, that sure is hacky
> and not future-proof.
That is why I said use/update; you missed the update part.

One way would be to use the direction field with a new flag indicating
whether it's a memory or device transfer; otherwise you can expand this enum.

The point is that a few things are already there, so improve upon them rather
than have two structures in the kernel doing similar things...

-- 
~Vinod



* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-09-29 11:17           ` Vinod Koul
@ 2011-09-30  6:43             ` Barry Song
  2011-09-30 16:01               ` Jassi Brar
  2011-09-30 15:47             ` Jassi Brar
  1 sibling, 1 reply; 131+ messages in thread
From: Barry Song @ 2011-09-30  6:43 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Jassi Brar, linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

2011/9/29 Vinod Koul <vinod.koul@intel.com>:
> On Wed, 2011-09-28 at 20:45 +0530, Jassi Brar wrote:
>> >> +enum xfer_direction {
>> >> +     MEM_TO_MEM,
>> >> +     MEM_TO_DEV,
>> >> +     DEV_TO_MEM,
>> >> +     DEV_TO_DEV,
>> > Use/update dma_data_direction.
>> dma_data_direction is the mapping attribute of a buffer.
>> While that info is what some dmac driver might need ultimately, our
>> main aim here is to tell exactly if Src and Dst is Memory or a
>> device's FIFO.
>>
>> Mapping attribute of src/dst buffers could be very well deducted from
>> xfer_direction, but dma_data_direction isn't meant to tell if Src and
>> Dst is Mem or FIFO.
>> Also, for (SLAVE && !src_inc && !dst_inc) we need to disambiguate
>> three options  Mem->Fifo,  Fifo->Mem,  *Fifo->Fifo*(not impossible)
>> So while using dma_data_direction would work today, that sure is hacky
>> and not future-proof.
> That is why I said use/update, you missed the update part.
>
> One way would be to use direction field with new flag indicating if its
> memory or device transfer, otherwise you can expand this enum.
>
> Point is few things are already there so improve upon it rather than
> have two structures in kernel doing similar things...

I suggest we update dma_data_direction to: MEM_TO_MEM,
MEM_TO_DEV, DEV_TO_MEM, DEV_TO_DEV.

Russell gave a good explanation of how the mapping is decided by MEM_TO_DEV
and DEV_TO_MEM. Then even a memory-to-memory DMA will require the
MEM_TO_DEV/DEV_TO_MEM flag to do the mapping, because only
MEM_TO_DEV/DEV_TO_MEM can decide how the dma_map_single() class of functions
will operate on the cache. But that is really a bad idea for users. As a
user, I just want DMA_TO_DEV to be a DMA which transfers data from
memory to a device; otherwise, it should have a different name. I think
how to map a buffer is just a software detail, but DMA_TO_DEV should
be a correct description of the real world.

Anyway, the update can be a separate patch, not part of this one.

>
> --
> ~Vinod

-barry


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-09-29 11:17           ` Vinod Koul
  2011-09-30  6:43             ` Barry Song
@ 2011-09-30 15:47             ` Jassi Brar
  1 sibling, 0 replies; 131+ messages in thread
From: Jassi Brar @ 2011-09-30 15:47 UTC (permalink / raw)
  To: Vinod Koul; +Cc: linux-kernel, dan.j.williams, rmk, 21cnbao

On 29 September 2011 16:47, Vinod Koul <vinod.koul@intel.com> wrote:
> On Wed, 2011-09-28 at 20:45 +0530, Jassi Brar wrote:
>> >> +enum xfer_direction {
>> >> +     MEM_TO_MEM,
>> >> +     MEM_TO_DEV,
>> >> +     DEV_TO_MEM,
>> >> +     DEV_TO_DEV,
>> > Use/update dma_data_direction.
>> dma_data_direction is the mapping attribute of a buffer.
>> While that info is what some dmac driver might need ultimately, our
>> main aim here is to tell exactly if Src and Dst is Memory or a
>> device's FIFO.
>>
>> Mapping attribute of src/dst buffers could be very well deducted from
>> xfer_direction, but dma_data_direction isn't meant to tell if Src and
>> Dst is Mem or FIFO.
>> Also, for (SLAVE && !src_inc && !dst_inc) we need to disambiguate
>> three options  Mem->Fifo,  Fifo->Mem,  *Fifo->Fifo*(not impossible)
>> So while using dma_data_direction would work today, that sure is hacky
>> and not future-proof.
> That is why I said use/update, you missed the update part.
>
> One way would be to use direction field with new flag indicating if its
> memory or device transfer, otherwise you can expand this enum.
>
> Point is few things are already there so improve upon it rather than
> have two structures in kernel doing similar things...
>
I don't think dma_data_direction should be messed with here.
Or maybe I am just being ignorant; in that case, please do suggest how
you want dma_data_direction "updated"?


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-09-30  6:43             ` Barry Song
@ 2011-09-30 16:01               ` Jassi Brar
  2011-10-01  3:05                 ` Barry Song
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-09-30 16:01 UTC (permalink / raw)
  To: Barry Song
  Cc: Vinod Koul, linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

On 30 September 2011 12:13, Barry Song <21cnbao@gmail.com> wrote:
>
> i support we can update dma_data_direction to:   MEM_TO_MEM,
> MEM_TO_DEV,  DEV_TO_MEM, DEV_TO_DEV.
>
So basically you are suggesting replacing 'enum dma_data_direction'
with 'enum xfer_direction'.
I am not sure about that. I think they represent different things and
hence should be separate:
dma_data_direction tells the mapping of a buffer, while the other
tells whether the src and dst are memory or a device's FIFO.


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-09-30 16:01               ` Jassi Brar
@ 2011-10-01  3:05                 ` Barry Song
  2011-10-01 18:11                   ` Vinod Koul
  2011-10-01 18:41                   ` Jassi Brar
  0 siblings, 2 replies; 131+ messages in thread
From: Barry Song @ 2011-10-01  3:05 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Vinod Koul, linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

2011/10/1 Jassi Brar <jaswinder.singh@linaro.org>:
> On 30 September 2011 12:13, Barry Song <21cnbao@gmail.com> wrote:
>>
>> i support we can update dma_data_direction to:   MEM_TO_MEM,
>> MEM_TO_DEV,  DEV_TO_MEM, DEV_TO_DEV.
>>
> So basically you are suggesting to replace 'enum dma_data_direction'
> with 'enum xfer_direction'
> I am not sure about that. I think they represent different things and
> hence should be separate.
> dma_data_direction tells the mapping of a buffer while the other
> tells if the src and dst are memory or a device's FIFO.
>
You are kind of right. Right now, people use dma_data_direction to do
the mapping for a DMA buffer. Even with all 4 directions, people would still
use the old two directions to do the mapping. For example, a driver can't use
MEM_TO_MEM to map; it still needs to know whether the memory is the source
or the destination.

I just don't like the two old macro names. It is as if I get a ticket
flying from New York to Beijing, but actually we fly to Mexico...
At the moment, dma_data_direction seems to mean just how to do the mapping,
while xfer_direction is the real transfer direction.

Perhaps we could have two macro names, SRC_MEM and DEST_MEM, for mapping. Or just add:
dma_map_single_src
dma_map_single_dst
...
Thanks
barry


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-01  3:05                 ` Barry Song
@ 2011-10-01 18:11                   ` Vinod Koul
  2011-10-01 18:45                     ` Jassi Brar
  2011-10-01 18:41                   ` Jassi Brar
  1 sibling, 1 reply; 131+ messages in thread
From: Vinod Koul @ 2011-10-01 18:11 UTC (permalink / raw)
  To: Barry Song, Jassi Brar
  Cc: linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

On Sat, 2011-10-01 at 11:05 +0800, Barry Song wrote:
> 2011/10/1 Jassi Brar <jaswinder.singh@linaro.org>:
> > On 30 September 2011 12:13, Barry Song <21cnbao@gmail.com> wrote:
> >>
> >> i support we can update dma_data_direction to:   MEM_TO_MEM,
> >> MEM_TO_DEV,  DEV_TO_MEM, DEV_TO_DEV.
> >>
> > So basically you are suggesting to replace 'enum dma_data_direction'
> > with 'enum xfer_direction'
> > I am not sure about that. I think they represent different things and
> > hence should be separate.
> > dma_data_direction tells the mapping of a buffer while the other
> > tells if the src and dst are memory or a device's FIFO.
> >
> you are kind of right now. now people use dma_data_direction to do
> mapping for dma buffer. even with all 4 direction, people still use
> the old two direction to do mapping. For example, it can't use
> MEM_TO_MEM to map, it still need to know whether the memory is source
> or dest.
> 
> i just don't like to the two old macro names. it seems i get a ticket
> flying from New York to Beijing, but actually, we fly to Mexico...
> so by the moment, dma_data_direction seems just to mean how to do map,
> but xfer_direction is the real transfer direction.
> 
> How could we have two macro names: SRC_MEM, DEST_MEM for mapping. or just add:
> dma_map_single_src
> dma_map_single_dst
Direction is valid even if you are doing a memory-to-memory transfer. IMO
what we need is to specify the type of transfer (memory or peripheral),
perhaps with a flag, and use that along with dma_data_direction.

@Jassi, this is the kind of tweak I have in mind. And I do not agree the
current approach is bad: it cleanly tells us the type of transfer
(memcpy or prep_sg) and the direction with dma_data_direction.

-- 
~Vinod



* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-01  3:05                 ` Barry Song
  2011-10-01 18:11                   ` Vinod Koul
@ 2011-10-01 18:41                   ` Jassi Brar
  2011-10-01 18:48                     ` Jassi Brar
  2011-10-02  0:33                     ` Barry Song
  1 sibling, 2 replies; 131+ messages in thread
From: Jassi Brar @ 2011-10-01 18:41 UTC (permalink / raw)
  To: Barry Song
  Cc: Vinod Koul, linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

On 1 October 2011 08:35, Barry Song <21cnbao@gmail.com> wrote:
> 2011/10/1 Jassi Brar <jaswinder.singh@linaro.org>:
>> On 30 September 2011 12:13, Barry Song <21cnbao@gmail.com> wrote:
>>>
>>> i support we can update dma_data_direction to:   MEM_TO_MEM,
>>> MEM_TO_DEV,  DEV_TO_MEM, DEV_TO_DEV.
>>>
>> So basically you are suggesting to replace 'enum dma_data_direction'
>> with 'enum xfer_direction'
>> I am not sure about that. I think they represent different things and
>> hence should be separate.
>> dma_data_direction tells the mapping of a buffer while the other
>> tells if the src and dst are memory or a device's FIFO.
>>
> you are kind of right now. now people use dma_data_direction to do
> mapping for dma buffer. even with all 4 direction, people still use
> the old two direction to do mapping.
> For example, it can't use
> MEM_TO_MEM to map, it still need to know whether the memory is source
> or dest.
MEM_TO_MEM means "From Memory Source To Memory Destination"
  Map Src buffer with DMA_TO_DEVICE and Dst buffer with DMA_FROM_DEVICE

MEM_TO_DEV means "From Memory Source To FIFO Destination"
  Map Src buffer with DMA_TO_DEVICE.

DEV_TO_MEM means "From FIFO Source To Memory Destination"
  Map Dst buffer with DMA_FROM_DEVICE

DEV_TO_DEV means "From FIFO Source To FIFO Destination"

What else would you want to know ?

> i just don't like to the two old macro names. it seems i get a ticket
> flying from New York to Beijing, but actually, we fly to Mexico...
> so by the moment, dma_data_direction seems just to mean how to do map,
> but xfer_direction is the real transfer direction.
Not every dmac driver would need to know the src and dst type _only_ for buffer
mapping. Some might have to reprogram the channel accordingly, say.
The point is to give exact info about src and dst to dmac client driver and let
it do what it must.

> How could we have two macro names: SRC_MEM, DEST_MEM for mapping. or just add:
> dma_map_single_src
> dma_map_single_dst
Not sure what you mean by this.


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-01 18:11                   ` Vinod Koul
@ 2011-10-01 18:45                     ` Jassi Brar
  0 siblings, 0 replies; 131+ messages in thread
From: Jassi Brar @ 2011-10-01 18:45 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Barry Song, linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

On 1 October 2011 23:41, Vinod Koul <vinod.koul@intel.com> wrote:
> On Sat, 2011-10-01 at 11:05 +0800, Barry Song wrote:
>> 2011/10/1 Jassi Brar <jaswinder.singh@linaro.org>:
>> > On 30 September 2011 12:13, Barry Song <21cnbao@gmail.com> wrote:
>> >>
>> >> i support we can update dma_data_direction to:   MEM_TO_MEM,
>> >> MEM_TO_DEV,  DEV_TO_MEM, DEV_TO_DEV.
>> >>
>> > So basically you are suggesting to replace 'enum dma_data_direction'
>> > with 'enum xfer_direction'
>> > I am not sure about that. I think they represent different things and
>> > hence should be separate.
>> > dma_data_direction tells the mapping of a buffer while the other
>> > tells if the src and dst are memory or a device's FIFO.
>> >
>> you are kind of right now. now people use dma_data_direction to do
>> mapping for dma buffer. even with all 4 direction, people still use
>> the old two direction to do mapping. For example, it can't use
>> MEM_TO_MEM to map, it still need to know whether the memory is source
>> or dest.
>>
>> i just don't like to the two old macro names. it seems i get a ticket
>> flying from New York to Beijing, but actually, we fly to Mexico...
>> so by the moment, dma_data_direction seems just to mean how to do map,
>> but xfer_direction is the real transfer direction.
>>
>> How could we have two macro names: SRC_MEM, DEST_MEM for mapping. or just add:
>> dma_map_single_src
>> dma_map_single_dst
> Direction is valid even if you are doing memory to memory transfer. IMO
> what we need to specify what type of transfer (memory or peripheral),
> perhaps a flag and use that along with dma_data_direction.
I think dma_data_direction should be moved into the dmac drivers, which decide
the value from _more_ complete transfer info, i.e., xfer_direction.
IOW, let clients pass more precise info to the dmac drivers via xfer_direction.

> @Jassi, this is what kind of tweak I have in mind. And i do not agree
> current approach is bad and is it cleanly tell us type of transfer
> (memcpy or prep_sg) and direction with dma_data _direction.
Not sure how that is "updating" dma_data_direction?
Please clarify what exactly you propose.


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-01 18:41                   ` Jassi Brar
@ 2011-10-01 18:48                     ` Jassi Brar
  2011-10-02  0:33                     ` Barry Song
  1 sibling, 0 replies; 131+ messages in thread
From: Jassi Brar @ 2011-10-01 18:48 UTC (permalink / raw)
  To: Barry Song
  Cc: Vinod Koul, linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

On 2 October 2011 00:11, Jassi Brar <jaswinder.singh@linaro.org> wrote:

Errata...

> The point is to give exact info about src and dst to dmac client driver and let
> it do what it must.
** The point is to give exact info about src and dst to the dmac driver and let
it do what it must. **


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-01 18:41                   ` Jassi Brar
  2011-10-01 18:48                     ` Jassi Brar
@ 2011-10-02  0:33                     ` Barry Song
  2011-10-03  6:24                       ` Jassi Brar
  1 sibling, 1 reply; 131+ messages in thread
From: Barry Song @ 2011-10-02  0:33 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Vinod Koul, linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

2011/10/2 Jassi Brar <jaswinder.singh@linaro.org>
>
> On 1 October 2011 08:35, Barry Song <21cnbao@gmail.com> wrote:
> > 2011/10/1 Jassi Brar <jaswinder.singh@linaro.org>:
> >> On 30 September 2011 12:13, Barry Song <21cnbao@gmail.com> wrote:
> >>>
> >>> i support we can update dma_data_direction to:   MEM_TO_MEM,
> >>> MEM_TO_DEV,  DEV_TO_MEM, DEV_TO_DEV.
> >>>
> >> So basically you are suggesting to replace 'enum dma_data_direction'
> >> with 'enum xfer_direction'
> >> I am not sure about that. I think they represent different things and
> >> hence should be separate.
> >> dma_data_direction tells the mapping of a buffer while the other
> >> tells if the src and dst are memory or a device's FIFO.
> >>
> > you are kind of right now. now people use dma_data_direction to do
> > mapping for dma buffer. even with all 4 direction, people still use
> > the old two direction to do mapping.
> > For example, it can't use
> > MEM_TO_MEM to map, it still need to know whether the memory is source
> > or dest.
> MEM_TO_MEM means "From Memory Source To Memory Destination"
>  Map Src buffer with DMA_TO_DEVICE and Dst buffer with DMA_FROM_DEVICE
>
> MEM_TO_DEV means "From Memory Source To FIFO Destination"
>  Map Src buffer with DMA_TO_DEVICE.
>
> DEV_TO_MEM means "From FIFO Source To Memory Destination"
>  Map Dst buffer with DMA_FROM_DEVICE
>
> DEV_TO_DEV means "From FIFO Source To FIFO Destination"
>
> What else would you want to know ?

That is the problem. For example, drivers can't use MEM_TO_MEM as a
flag to do DMA mapping, so xfer_direction can't cover all that
dma_data_direction can do. That's why you need both
dma_data_direction and xfer_direction, with some similar flags in them.

>
> > i just don't like to the two old macro names. it seems i get a ticket
> > flying from New York to Beijing, but actually, we fly to Mexico...
> > so by the moment, dma_data_direction seems just to mean how to do map,
> > but xfer_direction is the real transfer direction.
> Not every dmac driver would need to know the src and dst type _only_ for buffer
> mapping. Some might have to reprogram the channel accordingly, say.
> The point is to give exact info about src and dst to dmac client driver and let
> it do what it must.
>
> > How could we have two macro names: SRC_MEM, DEST_MEM for mapping. or just add:
> > dma_map_single_src
> > dma_map_single_dst
> Not sure what you mean by this.

My point is that dma_data_direction is named wrongly once we have your
xfer_direction. People will think dma_data_direction just means where DMA
is transferred from and to. Doesn't dma_data_direction mean m2d,
m2m, d2d, d2m just from the name? But actually it is just the mapping
direction now.

So we might make the mapping things be mapping things and not data-direction
things. We might rename the macros DMA_FROM_DEV and DMA_TO_DEV in
it to MEM_DST and MEM_SRC, or split the dma_map_single class of functions
into dma_map_single_src and dma_map_single_dst. Then for a memory
buffer used as the DMA source we map it with dma_map_single_src, and for a
memory buffer used as the DMA destination we map it with dma_map_single_dst.

-barry


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-02  0:33                     ` Barry Song
@ 2011-10-03  6:24                       ` Jassi Brar
  2011-10-03 16:13                         ` Russell King
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-10-03  6:24 UTC (permalink / raw)
  To: Barry Song
  Cc: Vinod Koul, linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

On 2 October 2011 06:03, Barry Song <21cnbao@gmail.com> wrote:
> 2011/10/2 Jassi Brar <jaswinder.singh@linaro.org>

>> > For example, it can't use
>> > MEM_TO_MEM to map, it still need to know whether the memory is source
>> > or dest.
>> MEM_TO_MEM means "From Memory Source To Memory Destination"
>>  Map Src buffer with DMA_TO_DEVICE and Dst buffer with DMA_FROM_DEVICE
>>
>> MEM_TO_DEV means "From Memory Source To FIFO Destination"
>>  Map Src buffer with DMA_TO_DEVICE.
>>
>> DEV_TO_MEM means "From FIFO Source To Memory Destination"
>>  Map Dst buffer with DMA_FROM_DEVICE
>>
>> DEV_TO_DEV means "From FIFO Source To FIFO Destination"
>>
>> What else would you want to know ?
>
> that is the problem. for example, drivers can't use MEM_TO_MEM as a
> flag to do dma mapping. so xfer_direction can't cover all that
> dma_data_direction can do.  that's why you need both
> dma_data_direction and xfer_direction with some similar flags in them.
>
The client drivers map the src/dst buffers and the dmac driver unmaps
them by default(!). For which, the dmac driver doesn't look at anything
other than
     DMA_COMPL_SKIP_SRC/DST_UNMAP
     DMA_COMPL_SRC/DST_UNMAP_SINGLE
  bits of 'enum dma_ctrl_flags'.
For this unmap'ing purpose, the usage of dma_data_direction is already
internal to the dmac driver.
[BTW, the scheme is broken because the dmac driver can't know if
the client mapped the buffers with DMA_TO/FROM_DEVICE or with
DMA_BIDIRECTION (dmatest.c does so)]

The only use of 'enum dma_data_direction' in DMAEngine _api_ is
in 'struct dma_slave_config', device_prep_slave_sg and device_prep_dma_cyclic.
There too, the dmac drivers disregard any value other than DMA_TO_DEVICE
and DMA_FROM_DEVICE, and very rightly so IMO.

So replacing dma_data_direction usage with xfer_direction in
dmaengine is the best thing to do.

I don't know how better I could explain it. If you still think otherwise,
please tell me exactly when a client would need to use both
flags, dma_data_direction and xfer_direction?


>> > How could we have two macro names: SRC_MEM, DEST_MEM for mapping. or just add:
>> > dma_map_single_src
>> > dma_map_single_dst
>> Not sure what you mean by this.
>
> my point is dma_data_direction is named wrong after we have your
> xfer_direction. people will think dma_data_direction just means dma is
> transferred from where to where. doesn't dma_data_direction mean m2d,
> m2m, d2d, d2m just from the name? but actually it is just the mapping
> direction now.
>
So basically you mean people might get confused because
of 'dma_data_direction' and 'xfer_direction' names.
The concern might or might not be serious, but it certainly doesn't
warrant any change to dma_data_direction.

-j


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-03  6:24                       ` Jassi Brar
@ 2011-10-03 16:13                         ` Russell King
  2011-10-03 16:19                           ` Jassi Brar
  0 siblings, 1 reply; 131+ messages in thread
From: Russell King @ 2011-10-03 16:13 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Barry Song, Vinod Koul, linux-kernel, dan.j.williams,
	DL-SHA-WorkGroupLinux

On Mon, Oct 03, 2011 at 11:54:23AM +0530, Jassi Brar wrote:
> On 2 October 2011 06:03, Barry Song <21cnbao@gmail.com> wrote:
> > 2011/10/2 Jassi Brar <jaswinder.singh@linaro.org>
> 
> >> > For example, it can't use
> >> > MEM_TO_MEM to map, it still need to know whether the memory is source
> >> > or dest.
> >> MEM_TO_MEM means "From Memory Source To Memory Destination"
> >>  Map Src buffer with DMA_TO_DEVICE and Dst buffer with DMA_FROM_DEVICE
> >>
> >> MEM_TO_DEV means "From Memory Source To FIFO Destination"
> >>  Map Src buffer with DMA_TO_DEVICE.
> >>
> >> DEV_TO_MEM means "From FIFO Source To Memory Destination"
> >>  Map Dst buffer with DMA_FROM_DEVICE
> >>
> >> DEV_TO_DEV means "From FIFO Source To FIFO Destination"
> >>
> >> What else would you want to know ?
> >
> > that is the problem. for example, drivers can't use MEM_TO_MEM as a
> > flag to do dma mapping. so xfer_direction can't cover all that
> > dma_data_direction can do.  that's why you need both
> > dma_data_direction and xfer_direction with some similar flags in them.
> >
> The client drivers map the src/dst buffers and the dmac driver unmaps
> them by default(!). For which, the dmac driver doesn't look at anything
> other than
>      DMA_COMPL_SKIP_SRC/DST_UNMAP
>      DMA_COMPL_SRC/DST_UNMAP_SINGLE
>   bits of 'enum dma_ctrl_flags'.
> For this unmap'ing purpose, the usage of dma_data_direction is already
> internal to the dmac driver.

No.  Slave DMA engine drivers do *not* (and if they do, they should *not*)
honour the unmapping of submitted buffers.

The unmapping of these buffers by the DMA engine driver is intended to be
done for the async_tx API and not slave DMA.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-03 16:13                         ` Russell King
@ 2011-10-03 16:19                           ` Jassi Brar
  2011-10-03 17:15                             ` Williams, Dan J
  2011-10-05 18:14                             ` Williams, Dan J
  0 siblings, 2 replies; 131+ messages in thread
From: Jassi Brar @ 2011-10-03 16:19 UTC (permalink / raw)
  To: Russell King
  Cc: Barry Song, Vinod Koul, linux-kernel, dan.j.williams,
	DL-SHA-WorkGroupLinux

On 3 October 2011 21:43, Russell King <rmk@arm.linux.org.uk> wrote:
> On Mon, Oct 03, 2011 at 11:54:23AM +0530, Jassi Brar wrote:
>> On 2 October 2011 06:03, Barry Song <21cnbao@gmail.com> wrote:
>> > 2011/10/2 Jassi Brar <jaswinder.singh@linaro.org>
>>
>> >> > For example, it can't use
>> >> > MEM_TO_MEM to map, it still need to know whether the memory is source
>> >> > or dest.
>> >> MEM_TO_MEM means "From Memory Source To Memory Destination"
>> >>  Map Src buffer with DMA_TO_DEVICE and Dst buffer with DMA_FROM_DEVICE
>> >>
>> >> MEM_TO_DEV means "From Memory Source To FIFO Destination"
>> >>  Map Src buffer with DMA_TO_DEVICE.
>> >>
>> >> DEV_TO_MEM means "From FIFO Source To Memory Destination"
>> >>  Map Dst buffer with DMA_FROM_DEVICE
>> >>
>> >> DEV_TO_DEV means "From FIFO Source To FIFO Destination"
>> >>
>> >> What else would you want to know ?
>> >
>> > that is the problem. for example, drivers can't use MEM_TO_MEM as a
>> > flag to do dma mapping. so xfer_direction can't cover all that
>> > dma_data_direction can do.  that's why you need both
>> > dma_data_direction and xfer_direction with some similar flags in them.
>> >
>> The client drivers map the src/dst buffers and the dmac driver unmaps
>> them by default(!). For which, the dmac driver doesn't look at anything
>> other than
>>      DMA_COMPL_SKIP_SRC/DST_UNMAP
>>      DMA_COMPL_SRC/DST_UNMAP_SINGLE
>>   bits of 'enum dma_ctrl_flags'.
>> For this unmap'ing purpose, the usage of dma_data_direction is already
>> internal to the dmac driver.
>
> No.  Slave DMA engine drivers do *not* (and if they do, they should *not*)
> honour the unmapping of submitted buffers.
>
> The unmapping of these buffers by the DMA engine driver is intended to be
> done for the async_tx API and not slave DMA.
>
The proposed api is usable by both Slave as well as Async (memcpy etc.) transfers.
So it *does* matter here.


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-03 16:19                           ` Jassi Brar
@ 2011-10-03 17:15                             ` Williams, Dan J
  2011-10-03 18:23                               ` Jassi Brar
  2011-10-05 18:14                             ` Williams, Dan J
  1 sibling, 1 reply; 131+ messages in thread
From: Williams, Dan J @ 2011-10-03 17:15 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Russell King, Barry Song, Vinod Koul, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On Mon, Oct 3, 2011 at 9:19 AM, Jassi Brar <jaswinder.singh@linaro.org> wrote:
> On 3 October 2011 21:43, Russell King <rmk@arm.linux.org.uk> wrote:
>> On Mon, Oct 03, 2011 at 11:54:23AM +0530, Jassi Brar wrote:
>>> On 2 October 2011 06:03, Barry Song <21cnbao@gmail.com> wrote:
>>> > 2011/10/2 Jassi Brar <jaswinder.singh@linaro.org>
>>>
>>> >> > For example, it can't use
>>> >> > MEM_TO_MEM to map, it still need to know whether the memory is source
>>> >> > or dest.
>>> >> MEM_TO_MEM means "From Memory Source To Memory Destination"
>>> >>  Map Src buffer with DMA_TO_DEVICE and Dst buffer with DMA_FROM_DEVICE
>>> >>
>>> >> MEM_TO_DEV means "From Memory Source To FIFO Destination"
>>> >>  Map Src buffer with DMA_TO_DEVICE.
>>> >>
>>> >> DEV_TO_MEM means "From FIFO Source To Memory Destination"
>>> >>  Map Dst buffer with DMA_FROM_DEVICE
>>> >>
>>> >> DEV_TO_DEV means "From FIFO Source To FIFO Destination"
>>> >>
>>> >> What else would you want to know ?
>>> >
>>> > that is the problem. for example, drivers can't use MEM_TO_MEM as a
>>> > flag to do dma mapping. so xfer_direction can't cover all that
>>> > dma_data_direction can do.  that's why you need both
>>> > dma_data_direction and xfer_direction with some similar flags in them.
>>> >
>>> The client drivers map the src/dst buffers and the dmac driver unmaps
>>> them by default(!). For which, the dmac driver doesn't look at anything
>>> other than
>>>      DMA_COMPL_SKIP_SRC/DST_UNMAP
>>>      DMA_COMPL_SRC/DST_UNMAP_SINGLE
>>>   bits of 'enum dma_ctrl_flags'.
>>> For this unmap'ing purpose, the usage of dma_data_direction is already
>>> internal to the dmac driver.
>>
>> No.  Slave DMA engine drivers do *not* (and if they do, they should *not*)
>> honour the unmapping of submitted buffers.
>>
>> The unmapping of these buffers by the DMA engine driver is intended to be
>> done for the async_tx API and not slave DMA.
>>
> The proposed api is usable by both Slave as well as Async(Memcpy etc).
> So it *does* matter here.

I think the confusion is reduced if you don't try to use this api for
mem-to-mem transfers.  Then you can use DMA_NONE to indicate the
dev-to-dev case.  If a mem-to-mem user arrives we can revisit
xfer_direction, but as it stands it seems this is primarily useful for
slave-dma, i.e. I don't see async_tx_dma or net_dma switching to this
scheme anytime soon, if ever.  Are there other mem-to-mem use cases
that would use this?

Also in dmaxfer_template I do not understand the need for src_inc and
dst_inc.  Aren't those properties that the client would know about the
slave device?

--
Dan

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-03 17:15                             ` Williams, Dan J
@ 2011-10-03 18:23                               ` Jassi Brar
  2011-10-05 18:19                                 ` Williams, Dan J
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-10-03 18:23 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: Russell King, Barry Song, Vinod Koul, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On 3 October 2011 22:45, Williams, Dan J <dan.j.williams@intel.com> wrote:
> On Mon, Oct 3, 2011 at 9:19 AM, Jassi Brar <jaswinder.singh@linaro.org> wrote:
>> On 3 October 2011 21:43, Russell King <rmk@arm.linux.org.uk> wrote:
>>> On Mon, Oct 03, 2011 at 11:54:23AM +0530, Jassi Brar wrote:
>>>> On 2 October 2011 06:03, Barry Song <21cnbao@gmail.com> wrote:
>>>> > 2011/10/2 Jassi Brar <jaswinder.singh@linaro.org>
>>>>
>>>> >> > For example, it can't use
>>>> >> > MEM_TO_MEM to map, it still need to know whether the memory is source
>>>> >> > or dest.
>>>> >> MEM_TO_MEM means "From Memory Source To Memory Destination"
>>>> >>  Map Src buffer with DMA_TO_DEVICE and Dst buffer with DMA_FROM_DEVICE
>>>> >>
>>>> >> MEM_TO_DEV means "From Memory Source To FIFO Destination"
>>>> >>  Map Src buffer with DMA_TO_DEVICE.
>>>> >>
>>>> >> DEV_TO_MEM means "From FIFO Source To Memory Destination"
>>>> >>  Map Dst buffer with DMA_FROM_DEVICE
>>>> >>
>>>> >> DEV_TO_DEV means "From FIFO Source To FIFO Destination"
>>>> >>
>>>> >> What else would you want to know ?
>>>> >
>>>> > that is the problem. for example, drivers can't use MEM_TO_MEM as a
>>>> > flag to do dma mapping. so xfer_direction can't cover all that
>>>> > dma_data_direction can do.  that's why you need both
>>>> > dma_data_direction and xfer_direction with some similar flags in them.
>>>> >
>>>> The client drivers map the src/dst buffers and the dmac driver unmaps
>>>> them by default(!). For which, the dmac driver doesn't look at anything
>>>> other than
>>>>      DMA_COMPL_SKIP_SRC/DST_UNMAP
>>>>      DMA_COMPL_SRC/DST_UNMAP_SINGLE
>>>>   bits of 'enum dma_ctrl_flags'.
>>>> For this unmap'ing purpose, the usage of dma_data_direction is already
>>>> internal to the dmac driver.
>>>
>>> No.  Slave DMA engine drivers do *not* (and if they do, they should *not*)
>>> honour the unmapping of submitted buffers.
>>>
>>> The unmapping of these buffers by the DMA engine driver is intended to be
>>> done for the async_tx API and not slave DMA.
>>>
>> The proposed api is usable by both Slave as well as Async(Memcpy etc).
>> So it *does* matter here.
>
> I think the confusion is reduced if you don't try to use this api for
> mem-to-mem transfers.  Then you can use DMA_NONE to indicate the
> dev-to-dev case.  If a mem-to-mem user arrives we can revisit
> xfer_direction, but as it stands it seems this is primarily useful for
> slave-dma, i.e. I don't see async_tx_dma or net_dma switching to this
> scheme anytime soon, if ever.  Are there other mem-to-mem use cases
> that would use this?
>
As a matter of fact, I have personally only ever needed such an api for MemToMem
- converting a peculiar YUV arrangement to a more 'normal' one using PL330.
Barry's is the first real Slave requirement I came across (not that I wasn't
expecting one).
So IMO, limiting the api to Slave only is not much better than having a
vendor-specific api, because chances are we might not see another 'interleaved'
user of it any time soon.


> Also in dmaxfer_template I do not understand the need for src_inc and
> dst_inc.  Aren't those properties that the client would know about the
> slave device?
>
You are assuming only Slave usage.
src_inc/dst_inc are mainly for 'Memset' type operations.
In Slave transfers they would help avoid allocating a full-length RX buffer
when the client only wants to send data but the controller works
only in full-duplex, and vice-versa (thanks to RMK for pointing out the case,
and I remember the S3C24XX does have such an SPI controller).
More generally, they help when one needs to transmit the same data, or discard
the received data, for a certain period of time.

Good to have you in the discussion.

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-03 16:19                           ` Jassi Brar
  2011-10-03 17:15                             ` Williams, Dan J
@ 2011-10-05 18:14                             ` Williams, Dan J
  2011-10-06  7:12                               ` Jassi Brar
  2011-10-07  5:45                               ` Vinod Koul
  1 sibling, 2 replies; 131+ messages in thread
From: Williams, Dan J @ 2011-10-05 18:14 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Russell King, Barry Song, Vinod Koul, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On Mon, Oct 3, 2011 at 9:19 AM, Jassi Brar <jaswinder.singh@linaro.org> wrote:
> On 3 October 2011 21:43, Russell King <rmk@arm.linux.org.uk> wrote:
>> On Mon, Oct 03, 2011 at 11:54:23AM +0530, Jassi Brar wrote:
>>> On 2 October 2011 06:03, Barry Song <21cnbao@gmail.com> wrote:
>>> > 2011/10/2 Jassi Brar <jaswinder.singh@linaro.org>
>>>
>>> >> > For example, it can't use
>>> >> > MEM_TO_MEM to map, it still need to know whether the memory is source
>>> >> > or dest.
>>> >> MEM_TO_MEM means "From Memory Source To Memory Destination"
>>> >>  Map Src buffer with DMA_TO_DEVICE and Dst buffer with DMA_FROM_DEVICE
>>> >>
>>> >> MEM_TO_DEV means "From Memory Source To FIFO Destination"
>>> >>  Map Src buffer with DMA_TO_DEVICE.
>>> >>
>>> >> DEV_TO_MEM means "From FIFO Source To Memory Destination"
>>> >>  Map Dst buffer with DMA_FROM_DEVICE
>>> >>
>>> >> DEV_TO_DEV means "From FIFO Source To FIFO Destination"
>>> >>
>>> >> What else would you want to know ?
>>> >
>>> > that is the problem. for example, drivers can't use MEM_TO_MEM as a
>>> > flag to do dma mapping. so xfer_direction can't cover all that
>>> > dma_data_direction can do.  that's why you need both
>>> > dma_data_direction and xfer_direction with some similar flags in them.
>>> >
>>> The client drivers map the src/dst buffers and the dmac driver unmaps
>>> them by default(!). For which, the dmac driver doesn't look at anything
>>> other than
>>>      DMA_COMPL_SKIP_SRC/DST_UNMAP
>>>      DMA_COMPL_SRC/DST_UNMAP_SINGLE
>>>   bits of 'enum dma_ctrl_flags'.
>>> For this unmap'ing purpose, the usage of dma_data_direction is already
>>> internal to the dmac driver.
>>
>> No.  Slave DMA engine drivers do *not* (and if they do, they should *not*)
>> honour the unmapping of submitted buffers.
>>
>> The unmapping of these buffers by the DMA engine driver is intended to be
>> done for the async_tx API and not slave DMA.
>>
> The proposed api is usable by both Slave as well as Async(Memcpy etc).
> So it *does* matter here.

Support for automatic unmapping is really only useful for simple cases
like net_dma where all operations in the chain are to distinct
buffers.  Trying to support it for async_tx contributed to the current
brokenness with respect to overlapping mappings for operation chaining
in the async-tx raid case.  So I would like to rip out unmap support
from the dma drivers, but before we can do that we need to teach raid
and net_dma how to manage the mappings themselves.  The raid lift is a
bit bigger because it needs to handle the cases of cpu-memcpy +
dma-xor-pq versus dma-memcpy + dma-xor-pq (I would drop support for
dma-memcpy + cpu-xor-pq and just make this case all cpu).

This new operation type strikes me as being in a similar vein to
commit a08abd8c "async_tx: structify submission arguments, add
scribble", in that we convert multiple submission arguments into one
description template.  With some tweaks it could probably even cover
the DMA_CYCLIC, but probably could not cover the raid ops.  In general
I'm concerned about operation type proliferation, so if we added this
one I'd like to see others removed.

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-03 18:23                               ` Jassi Brar
@ 2011-10-05 18:19                                 ` Williams, Dan J
  2011-10-06  9:06                                   ` Jassi Brar
  0 siblings, 1 reply; 131+ messages in thread
From: Williams, Dan J @ 2011-10-05 18:19 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Russell King, Barry Song, Vinod Koul, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On Mon, Oct 3, 2011 at 11:23 AM, Jassi Brar <jaswinder.singh@linaro.org> wrote:
> You are assuming only Slave usage.
> src_inc/dst_inc are mainly for 'Memset' type operation.

There are currently no users of offloaded memset in the kernel; do
you have one in mind?  Otherwise I would not design for it this
early.

> In Slave transfer they would help avoid allocating full length RX buffer
> when the client only wants to send data but the controller works
> only in full-duplex and vice-versa (thanks to RMK for pointing the case,
> and I remember S3C24XX do have such SPI controller).
> More generally when one needs to transmit the same data, or discard
> the received data, for a certain period of time.

In the slave case I would think the driver would know the address
increment attributes of the slave and would not need to be told on
each operation by the client?

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-05 18:14                             ` Williams, Dan J
@ 2011-10-06  7:12                               ` Jassi Brar
  2011-10-07  5:45                               ` Vinod Koul
  1 sibling, 0 replies; 131+ messages in thread
From: Jassi Brar @ 2011-10-06  7:12 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: Russell King, Barry Song, Vinod Koul, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On 5 October 2011 23:44, Williams, Dan J <dan.j.williams@intel.com> wrote:
>
> This new operation type strikes me as being in a similar vein to
> commit a08abd8c "async_tx: structify submission arguments, add
> scribble", in that we convert multiple submission arguments into one
> description template.  With some tweaks it could probably even cover
> the DMA_CYCLIC, but probably could not cover the raid ops.  In general
> I'm concerned about operation type proliferation, so if we added this
> one I'd like to see others removed.
>
ATM this api is meant to provide a way for clients to express
interleaved (in bytes) transfers, which are not possible using any
other api.

I do share your concern about operation type proliferation, and in fact
tried to design this api so that at least some prepares could be merged
into it, but I am not sure how reasonable it would be to hold this api
to ransom until other prepares are merged.
The way I see things, if we make this api generic enough we can
disallow adding more apis and merge the extant ones into it one by one
at our own pace.

-j

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-05 18:19                                 ` Williams, Dan J
@ 2011-10-06  9:06                                   ` Jassi Brar
  0 siblings, 0 replies; 131+ messages in thread
From: Jassi Brar @ 2011-10-06  9:06 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: Russell King, Barry Song, Vinod Koul, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On 5 October 2011 23:49, Williams, Dan J <dan.j.williams@intel.com> wrote:
> On Mon, Oct 3, 2011 at 11:23 AM, Jassi Brar <jaswinder.singh@linaro.org> wrote:
>> You are assuming only Slave usage.
>> src_inc/dst_inc are mainly for 'Memset' type operation.
>
> There currently are not any users of offloaded memset in the kernel,
> do you have one in mind?  Otherwise I would not design for it this
> early.
>
Let me explain the circumstances that led me to the crime ...

# This api is common for Mem->Mem and Slave interleaved transfers.

For Mem->Mem
******************
# Memcpy is simply an 'interleaved' transfer of 1 frame containing
  1 chunk of very large size.
  So we might as well merge device_prep_dma_memcpy into this api.
# And Memset is simply a Memcpy with fixed source address.
  Hence the src_inc.

For Slave
***********
# Considering that in most cases ...
    src_inc  &&  dst_inc  => Invalid
   !src_inc  &&  dst_inc  => DEV_TO_MEM
    src_inc  && !dst_inc  => MEM_TO_DEV
   !src_inc  && !dst_inc  => DEV_TO_DEV
 ... the dmac driver could deduce the direction per transfer.
That would hold _IF_ the api specified that clients can not do memset
to/from a device's FIFO and must use full buffers even for dummy data.
But as Russell noted, some clients already do dma between a fixed
memory address and a FIFO, so we have to support 'slave-memset' and
instead carry an explicit 'enum xfer_direction' flag for the direction.
So we still need src/dst_inc, albeit in a less important role.

>
>> In Slave transfer they would help avoid allocating full length RX buffer
>> when the client only wants to send data but the controller works
>> only in full-duplex and vice-versa (thanks to RMK for pointing the case,
>> and I remember S3C24XX do have such SPI controller).
>> More generally when one needs to transmit the same data, or discard
>> the received data, for a certain period of time.
>
> In the slave case I would think the driver would know the address
> increment attributes of the slave and would not need to be told on
> each operation by the client?
>
Consider the same example of a full-duplex-only SPI controller.
For some transfers the client might move real data (the memory-side
address increments), while for other transfers the client
would have to provide dummy data, keeping the memory-side address
constant, just to keep the useful transfer in the other direction alive.
So this has to be a per-transfer thing rather than a property
of the channel.

-jassi

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-05 18:14                             ` Williams, Dan J
  2011-10-06  7:12                               ` Jassi Brar
@ 2011-10-07  5:45                               ` Vinod Koul
  2011-10-07 11:27                                 ` Jassi Brar
  1 sibling, 1 reply; 131+ messages in thread
From: Vinod Koul @ 2011-10-07  5:45 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: Jassi Brar, Russell King, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On Wed, 2011-10-05 at 11:14 -0700, Williams, Dan J wrote:
> This new operation type strikes me as being in a similar vein to
> commit a08abd8c "async_tx: structify submission arguments, add
> scribble", in that we convert multiple submission arguments into one
> description template.  With some tweaks it could probably even cover
> the DMA_CYCLIC, but probably could not cover the raid ops.  In general
> I'm concerned about operation type proliferation, so if we added this
> one I'd like to see others removed. 
For slave cases we have DMA_SLAVE and DMA_CYCLIC, and some dmacs
support memcpy as well.

I think we should have kept DMA_CYCLIC as a special case of DMA_SLAVE
(thru a flag perhaps), not a new API; if all agree I can fix that up for
3.3.

Thru this patch Jassi gave a very good try at merging DMA_SLAVE and
memcpy, but the more we debate this, the less convinced I am that
memcpy and DMA_SLAVE should be merged.

I would still argue that if we split this along the same lines as the
current mechanism, we have a clean way to convey all the details for both cases.

Maybe I am being a pessimist, but my vote goes to simpler things.

Thoughts...?

For the other mem-to-mem cases like xor etc., I don't think I have
looked at them in enough detail to comment, but if we can make a generic
memcpy API with ops specifying what "type" of memcpy it is, we could
reduce the count :)

-- 
~Vinod


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-07  5:45                               ` Vinod Koul
@ 2011-10-07 11:27                                 ` Jassi Brar
  2011-10-07 14:19                                   ` Vinod Koul
  2011-10-11 16:44                                   ` Williams, Dan J
  0 siblings, 2 replies; 131+ messages in thread
From: Jassi Brar @ 2011-10-07 11:27 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Williams, Dan J, Russell King, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On 7 October 2011 11:15, Vinod Koul <vinod.koul@intel.com> wrote:

> Thru this patch Jassi gave a very good try at merging DMA_SLAVE and
> memcpy, but more we debate this, I am still not convinced about merging
> memcpy and DMA_SLAVE yet.
>
Nobody is merging memcpy and DMA_SLAVE right away.
The api's primary purpose is to support interleaved transfers.
The possibility of merging other prepares into it is a side-effect.

> I would still argue that if we split this on same lines as current
> mechanism, we have clean way to convey all details for both cases.
>
Do you mean to have separate interleaved transfer apis for Slave
and Mem->Mem ? Please clarify.

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-07 11:27                                 ` Jassi Brar
@ 2011-10-07 14:19                                   ` Vinod Koul
  2011-10-07 14:38                                     ` Jassi Brar
  2011-10-11 16:44                                   ` Williams, Dan J
  1 sibling, 1 reply; 131+ messages in thread
From: Vinod Koul @ 2011-10-07 14:19 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Williams, Dan J, Russell King, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On Fri, 2011-10-07 at 16:57 +0530, Jassi Brar wrote:
> On 7 October 2011 11:15, Vinod Koul <vinod.koul@intel.com> wrote:
> 
> > Thru this patch Jassi gave a very good try at merging DMA_SLAVE and
> > memcpy, but more we debate this, I am still not convinced about merging
> > memcpy and DMA_SLAVE yet.
> >
> Nobody is merging memcpy and DMA_SLAVE right away.
> The api's primary purpose is to support interleave transfers.
> Possibility to merge other prepares into this is a side-effect.
For interleaved, isn't that what you are trying to do?
> 
> > I would still argue that if we split this on same lines as current
> > mechanism, we have clean way to convey all details for both cases.
> >
> Do you mean to have separate interleaved transfer apis for Slave
> and Mem->Mem ? Please clarify.
If we can make API cleaner and well defined that way then Yes :)

-- 
~Vinod


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-07 14:19                                   ` Vinod Koul
@ 2011-10-07 14:38                                     ` Jassi Brar
  2011-10-10  6:53                                       ` Vinod Koul
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-10-07 14:38 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Williams, Dan J, Russell King, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On 7 October 2011 19:49, Vinod Koul <vinod.koul@intel.com> wrote:
> On Fri, 2011-10-07 at 16:57 +0530, Jassi Brar wrote:

>> > I would still argue that if we split this on same lines as current
>> > mechanism, we have clean way to convey all details for both cases.
>> >
>> Do you mean to have separate interleaved transfer apis for Slave
>> and Mem->Mem ? Please clarify.
> If we can make API cleaner and well defined that way then Yes :)
>
I assume that if you suggest it, you already have an idea ...
Please do tell us roughly how the api should look for Slave and for Mem->Mem.

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-07 14:38                                     ` Jassi Brar
@ 2011-10-10  6:53                                       ` Vinod Koul
  2011-10-10  9:16                                         ` Jassi Brar
  0 siblings, 1 reply; 131+ messages in thread
From: Vinod Koul @ 2011-10-10  6:53 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Williams, Dan J, Russell King, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On Fri, 2011-10-07 at 20:08 +0530, Jassi Brar wrote:
> On 7 October 2011 19:49, Vinod Koul <vinod.koul@intel.com> wrote:
> > On Fri, 2011-10-07 at 16:57 +0530, Jassi Brar wrote:
> 
> >> > I would still argue that if we split this on same lines as current
> >> > mechanism, we have clean way to convey all details for both cases.
> >> >
> >> Do you mean to have separate interleaved transfer apis for Slave
> >> and Mem->Mem ? Please clarify.
> > If we can make API cleaner and well defined that way then Yes :)
> >
> I assume if you suggest you already have an idea....
> Please do tell roughly how the api should look for Slave and for Mem->Mem ?
Okay, I think at this point we have discussed the parameters and agree
on them. The only open issue is direction, which makes sense for slave
but in memcpy cases doesn't need to be passed to, or deduced by, the
dmac driver. What we can do is something like below, or tweaks to it:

+struct dmaxfer_memcpy_template {
+	dma_addr_t src_start;
+	dma_addr_t dst_start;
+	bool src_inc;
+	bool dst_inc;
+	bool src_sgl;
+	bool dst_sgl;
+	size_t numf;
+	size_t frame_size;
+	struct data_chunk sgl[0];
+};
+
+struct dmaxfer_slave_template {	
+	dma_addr_t mem;
+	bool mem_inc;
+	size_t numf;
+	size_t frame_size;
+	struct data_chunk sgl[0];
+};
 /* last transaction type for creation of the capabilities mask */
 #define DMA_TX_TYPE_END (DMA_CYCLIC + 1)
 
@@ -495,6 +521,13 @@ struct dma_device {
 	struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
 		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
 		size_t period_len, enum dma_data_direction direction);
+	struct dma_async_tx_descriptor *(*device_prep_interleaved_memcpy)(
+		struct dma_chan *chan, struct dmaxfer_memcpy_template,
+		unsigned long flags);
+	struct dma_async_tx_descriptor *(*device_prep_interleaved_slave)(
+		struct dma_chan *chan, struct dmaxfer_slave_template,
+		enum dma_data_direction direction, unsigned long flags);
+
 	int (*device_control)(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
 		unsigned long arg);
 
I was tempted to create a single dmaxfer structure which can be common
to both slave and memcpy. In the former you pass this along with the
direction, and in the latter you pass a pair of these to describe src and dst.

I will leave the choice to you :)

The point is that it makes it simpler to understand what we are doing,
rather than resorting to parsing to find out whether it is memcpy or
slave and what the direction is.

-- 
~Vinod


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-10  6:53                                       ` Vinod Koul
@ 2011-10-10  9:16                                         ` Jassi Brar
  2011-10-10  9:18                                           ` Vinod Koul
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-10-10  9:16 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Williams, Dan J, Russell King, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On 10 October 2011 12:23, Vinod Koul <vinod.koul@intel.com> wrote:
>
> +struct dmaxfer_memcpy_template {
> +       dma_addr_t src_start;
> +       dma_addr_t dst_start;
> +       bool src_inc;
> +       bool dst_inc;
> +       bool src_sgl;
> +       bool dst_sgl;
> +       size_t numf;
> +       size_t frame_size;
> +       struct data_chunk sgl[0];
> +};
> +
> +struct dmaxfer_slave_template {
> +       dma_addr_t mem;
> +       bool mem_inc;
> +       size_t numf;
> +       size_t frame_size;
> +       struct data_chunk sgl[0];
> +};
>
(1) Please tell how dmaxfer_slave_template is supposed to work on
 bi-directional channels,
 keeping in mind that dma_slave_config.direction is marked to go away
 in the future.

(2)
  * slave_template.mem  <=>  memcpy_template.src_start
  * slave_template.mem_inc  <=>  memcpy_template.src_inc

 So essentially
  memcpy_template  :=  slave_template + src/dst_sgl + dst_start + dst_inc

 Even after this separation, there is nothing slave-specific in
dmaxfer_slave_template. The slave client still needs DMA_SLAVE_CONFIG
to specify the slave parameters of the transfer.
 You only save a few bytes in a _copy_ of memcpy_template.

Sorry, but I only see code duplication and a very fragile segregation.

>
> The point is that it makes it simpler to understand what we are doing
> rather than resort to parsing to find out if its memcpy or slave and
> what is the direction.
>
Not sure how reading dmaxfer_template.xfer_direction counts as "parsing".

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-10  9:16                                         ` Jassi Brar
@ 2011-10-10  9:18                                           ` Vinod Koul
  2011-10-10  9:53                                             ` Jassi Brar
  0 siblings, 1 reply; 131+ messages in thread
From: Vinod Koul @ 2011-10-10  9:18 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Williams, Dan J, Russell King, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On Mon, 2011-10-10 at 14:46 +0530, Jassi Brar wrote:
> On 10 October 2011 12:23, Vinod Koul <vinod.koul@intel.com> wrote:
> >
> > +struct dmaxfer_memcpy_template {
> > +       dma_addr_t src_start;
> > +       dma_addr_t dst_start;
> > +       bool src_inc;
> > +       bool dst_inc;
> > +       bool src_sgl;
> > +       bool dst_sgl;
> > +       size_t numf;
> > +       size_t frame_size;
> > +       struct data_chunk sgl[0];
> > +};
> > +
> > +struct dmaxfer_slave_template {
> > +       dma_addr_t mem;
> > +       bool mem_inc;
> > +       size_t numf;
> > +       size_t frame_size;
> > +       struct data_chunk sgl[0];
> > +};
> >
> (1) Please tell how is dmaxfer_slave_template supposed to work on
>  bi-directional channels?
>  Keeping in mind, dma_slave_config.direction is marked to go away
>  in future.
I didn't use dma_slave_config.direction. There is a direction field in
the corresponding prepare function.

> 
> (2)
>   * slave_template.mem  <=>  memcpy_template.src_start
>   * slave_template.mem_inc  <=>  memcpy_template.src_inc
> 
>  So essentially
>   memcpy_template  :=  slave_template + src/dst_sgl + dst_start + dst_inc
> 
>  Even after this separation, there is nothing slave specific in
> dmaxfer_slave_template. The slave client still needs DMA_SLAVE_CONFIG
> to specify slave parameters of the transfer.
>  You only save a few bytes in a _copy_ of memcpy_template.
Yes, DMA_SLAVE_CONFIG is always required; this attempt was not aimed at
removing that, though I would be interested in that too :)

-- 
~Vinod


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-10  9:18                                           ` Vinod Koul
@ 2011-10-10  9:53                                             ` Jassi Brar
  2011-10-10 10:45                                               ` Vinod Koul
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-10-10  9:53 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Williams, Dan J, Russell King, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On 10 October 2011 14:48, Vinod Koul <vinod.koul@intel.com> wrote:
> On Mon, 2011-10-10 at 14:46 +0530, Jassi Brar wrote:
>> On 10 October 2011 12:23, Vinod Koul <vinod.koul@intel.com> wrote:
>> >
>> > +struct dmaxfer_memcpy_template {
>> > +       dma_addr_t src_start;
>> > +       dma_addr_t dst_start;
>> > +       bool src_inc;
>> > +       bool dst_inc;
>> > +       bool src_sgl;
>> > +       bool dst_sgl;
>> > +       size_t numf;
>> > +       size_t frame_size;
>> > +       struct data_chunk sgl[0];
>> > +};
>> > +
>> > +struct dmaxfer_slave_template {
>> > +       dma_addr_t mem;
>> > +       bool mem_inc;
>> > +       size_t numf;
>> > +       size_t frame_size;
>> > +       struct data_chunk sgl[0];
>> > +};
>> >
>> (1) Please tell how is dmaxfer_slave_template supposed to work on
>>  bi-directional channels?
>>  Keeping in mind, dma_slave_config.direction is marked to go away
>>  in future.
> I didn't use dma_slave_config.direction. There is direction field in
> corresponding prepare function.
>
OK, but why not remove one argument from the api and embed the direction
as a property of the transfer in dmaxfer_slave_template, as I did?

>>
>> (2)
>>   * slave_template.mem  <=>  memcpy_template.src_start
>>   * slave_template.mem_inc  <=>  memcpy_template.src_inc
>>
>>  So essentially
>>   memcpy_template  :=  slave_template + src/dst_sgl + dst_start + dst_inc
>>
>>  Even after this separation, there is nothing slave specific in
>> dmaxfer_slave_template. The slave client still needs DMA_SLAVE_CONFIG
>> to specify slave parameters of the transfer.
>>  You only save a few bytes in a _copy_ of memcpy_template.
> Yes DMA_SLAVE_CONFIG is always required, this attempt was not aimed to
> remove that, but I would be interested in it :)
>
Sorry, then I don't see removing this "ambiguity" (if there really is
any) as worth adding an extra prepare when we already have 10 of them.

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-10  9:53                                             ` Jassi Brar
@ 2011-10-10 10:45                                               ` Vinod Koul
  2011-10-10 11:16                                                 ` Jassi Brar
  0 siblings, 1 reply; 131+ messages in thread
From: Vinod Koul @ 2011-10-10 10:45 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Williams, Dan J, Russell King, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On Mon, 2011-10-10 at 15:23 +0530, Jassi Brar wrote:
> On 10 October 2011 14:48, Vinod Koul <vinod.koul@intel.com> wrote:
> > On Mon, 2011-10-10 at 14:46 +0530, Jassi Brar wrote:
> >> On 10 October 2011 12:23, Vinod Koul <vinod.koul@intel.com> wrote:
> >> >
> >> > +struct dmaxfer_memcpy_template {
> >> > +       dma_addr_t src_start;
> >> > +       dma_addr_t dst_start;
> >> > +       bool src_inc;
> >> > +       bool dst_inc;
> >> > +       bool src_sgl;
> >> > +       bool dst_sgl;
> >> > +       size_t numf;
> >> > +       size_t frame_size;
> >> > +       struct data_chunk sgl[0];
> >> > +};
> >> > +
> >> > +struct dmaxfer_slave_template {
> >> > +       dma_addr_t mem;
> >> > +       bool mem_inc;
> >> > +       size_t numf;
> >> > +       size_t frame_size;
> >> > +       struct data_chunk sgl[0];
> >> > +};
> >> >
> >> (1) Please tell how is dmaxfer_slave_template supposed to work on
> >>  bi-directional channels?
> >>  Keeping in mind, dma_slave_config.direction is marked to go away
> >>  in future.
> > I didn't use dma_slave_config.direction. There is direction field in
> > corresponding prepare function.
> >
> ok but why not reduce 1 argument from api and embed that as
> the transfer's property in dmaxfer_slave_template, as I did ?
I am not religious about it, doesn't matter either way :)
> 
> >>
> >> (2)
> >>   * slave_template.mem  <=>  memcpy_template.src_start
> >>   * slave_template.mem_inc  <=>  memcpy_template.src_inc
> >>
> >>  So essentially
> >>   memcpy_template  :=  slave_template + src/dst_sgl + dst_start + dst_inc
> >>
> >>  Even after this separation, there is nothing slave specific in
> >> dmaxfer_slave_template. The slave client still needs DMA_SLAVE_CONFIG
> >> to specify slave parameters of the transfer.
> >>  You only save a few bytes in a _copy_ of memcpy_template.
> > Yes DMA_SLAVE_CONFIG is always required, this attempt was not aimed to
> > remove that, but I would be interested in it :)
> >
> Sorry then I don't see this "ambiguity"(if there really is any) removal worth
> adding an extra prepare when we already have 10 of them.
For slave we have only two, and we can easily merge cyclic by adding a
flag or something; I am planning to do that for the next merge cycle.

IMO having one more for interleaved-slave should be okay.

But I am fine if we find a common ground and merge the two where dmac
can cleanly identify direction and mode it is operating.

-- 
~Vinod



* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-10 10:45                                               ` Vinod Koul
@ 2011-10-10 11:16                                                 ` Jassi Brar
  2011-10-10 16:02                                                   ` Vinod Koul
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-10-10 11:16 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Williams, Dan J, Russell King, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On 10 October 2011 16:15, Vinod Koul <vinod.koul@intel.com> wrote:
>
> But I am fine if we find a common ground and merge the two where dmac
> can cleanly identify direction and mode it is operating.
>
The client would set the xfer_direction and dmac would interpret as

enum xfer_direction {
     MEM_TO_MEM,   ->  Async/Memcpy mode
     MEM_TO_DEV,    ->  Slave mode & From Memory to Device
     DEV_TO_MEM,    ->  Slave mode & From Device to Memory
     DEV_TO_DEV,     ->  Slave mode & From Device to Device
}

How could it get any cleaner?
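As a rough illustration of Jassi's point (stand-in enum definition since this is not kernel code; the helper names are hypothetical), a dmac driver could derive both the operating mode and the endpoint types from this single field:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the proposed enum; the kernel version would live in
 * include/linux/dmaengine.h. */
enum xfer_direction { MEM_TO_MEM, MEM_TO_DEV, DEV_TO_MEM, DEV_TO_DEV };

/* Slave mode iff at least one end of the transfer is a device. */
static bool is_slave_xfer(enum xfer_direction d)
{
	return d != MEM_TO_MEM;
}

/* True when the dmac must read from a device fifo rather than memory. */
static bool src_is_dev(enum xfer_direction d)
{
	return d == DEV_TO_MEM || d == DEV_TO_DEV;
}
```

With such helpers, no separate "mode" flag is needed; the direction value alone carries it.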


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-10 11:16                                                 ` Jassi Brar
@ 2011-10-10 16:02                                                   ` Vinod Koul
  2011-10-10 16:28                                                     ` Jassi Brar
  0 siblings, 1 reply; 131+ messages in thread
From: Vinod Koul @ 2011-10-10 16:02 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Williams, Dan J, Russell King, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On Mon, 2011-10-10 at 16:46 +0530, Jassi Brar wrote:
> On 10 October 2011 16:15, Vinod Koul <vinod.koul@intel.com> wrote:
> >
> > But I am fine if we find a common ground and merge the two where dmac
> > can cleanly identify direction and mode it is operating.
> >
> The client would set the xfer_direction and dmac would interpret as
> 
> enum xfer_direction {
>      MEM_TO_MEM,   ->  Async/Memcpy mode
>      MEM_TO_DEV,    ->  Slave mode & From Memory to Device
>      DEV_TO_MEM,    ->  Slave mode & From Device to Memory
>      DEV_TO_DEV,     ->  Slave mode & From Device to Device
> }
> 
> How could it get any cleaner?
Consider the case of a dmac driver which supports interleaved dma as
well as memcpy and slave.
It needs to interpret dma_data_direction for the latter cases and
xfer_direction for the former. That is not a clean way to convey the
same info. And the only reason why we cannot use dma_data_direction here
is because we don't know the "mode"; by splitting we overcome that.

-- 
~Vinod



* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-10 16:02                                                   ` Vinod Koul
@ 2011-10-10 16:28                                                     ` Jassi Brar
  2011-10-11 11:56                                                       ` Vinod Koul
  2011-10-12  5:41                                                       ` Barry Song
  0 siblings, 2 replies; 131+ messages in thread
From: Jassi Brar @ 2011-10-10 16:28 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Williams, Dan J, Russell King, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On 10 October 2011 21:32, Vinod Koul <vinod.koul@intel.com> wrote:
> On Mon, 2011-10-10 at 16:46 +0530, Jassi Brar wrote:
>> On 10 October 2011 16:15, Vinod Koul <vinod.koul@intel.com> wrote:
>> >
>> > But I am fine if we find a common ground and merge the two where dmac
>> > can cleanly identify direction and mode it is operating.
>> >
>> The client would set the xfer_direction and dmac would interpret as
>>
>> enum xfer_direction {
>>      MEM_TO_MEM,   ->  Async/Memcpy mode
>>      MEM_TO_DEV,    ->  Slave mode & From Memory to Device
>>      DEV_TO_MEM,    ->  Slave mode & From Device to Memory
>>      DEV_TO_DEV,     ->  Slave mode & From Device to Device
>> }
>>
>> How could it get any cleaner?
> Consider the case of a dmac driver which supports interleaved dma as
> well as memcpy and slave
> It needs to interpret dma_data_direction for later cases and
> xfer_direction for former ones.
dma_data_direction is the mapping attribute of a buffer and is not meant to
tell the type of the source and destination of a transfer.
xfer_direction is meant for that purpose.
So I'd rather convert device_prep_dma_cyclic and device_prep_slave_sg
to use xfer_direction.


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-10 16:28                                                     ` Jassi Brar
@ 2011-10-11 11:56                                                       ` Vinod Koul
  2011-10-11 15:57                                                         ` Jassi Brar
  2011-10-12  5:41                                                       ` Barry Song
  1 sibling, 1 reply; 131+ messages in thread
From: Vinod Koul @ 2011-10-11 11:56 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Williams, Dan J, Russell King, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On Mon, 2011-10-10 at 21:58 +0530, Jassi Brar wrote:
> On 10 October 2011 21:32, Vinod Koul <vinod.koul@intel.com> wrote:
> > On Mon, 2011-10-10 at 16:46 +0530, Jassi Brar wrote:
> >> On 10 October 2011 16:15, Vinod Koul <vinod.koul@intel.com> wrote:
> >> >
> >> > But I am fine if we find a common ground and merge the two where dmac
> >> > can cleanly identify direction and mode it is operating.
> >> >
> >> The client would set the xfer_direction and dmac would interpret as
> >>
> >> enum xfer_direction {
> >>      MEM_TO_MEM,   ->  Async/Memcpy mode
> >>      MEM_TO_DEV,    ->  Slave mode & From Memory to Device
> >>      DEV_TO_MEM,    ->  Slave mode & From Device to Memory
> >>      DEV_TO_DEV,     ->  Slave mode & From Device to Device
> >> }
> >>
> >> How could it get any cleaner?
> > Consider the case of a dmac driver which supports interleaved dma as
> > well as memcpy and slave
> > It needs to interpret dma_data_direction for later cases and
> > xfer_direction for former ones.
> dma_data_direction is the mapping attribute of a buffer and is not meant to
> tell type of source and destination of a transfer.
> xfer_direction is meant for that purpose.
> So I'd rather convert device_prep_dma_cyclic and device_prep_slave_sg
> to use xfer_direction.
If the conversion is done for all drivers, then it should be fine...

-- 
~Vinod



* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-11 11:56                                                       ` Vinod Koul
@ 2011-10-11 15:57                                                         ` Jassi Brar
  2011-10-11 16:45                                                           ` Vinod Koul
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-10-11 15:57 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Williams, Dan J, Russell King, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On 11 October 2011 17:26, Vinod Koul <vinod.koul@intel.com> wrote:
>> >> >
>> >> > But I am fine if we find a common ground and merge the two where dmac
>> >> > can cleanly identify direction and mode it is operating.
>> >> >
>> >> The client would set the xfer_direction and dmac would interpret as
>> >>
>> >> enum xfer_direction {
>> >>      MEM_TO_MEM,   ->  Async/Memcpy mode
>> >>      MEM_TO_DEV,    ->  Slave mode & From Memory to Device
>> >>      DEV_TO_MEM,    ->  Slave mode & From Device to Memory
>> >>      DEV_TO_DEV,     ->  Slave mode & From Device to Device
>> >> }
>> >>
>> >> How could it get any cleaner?
>> > Consider the case of a dmac driver which supports interleaved dma as
>> > well as memcpy and slave
>> > It needs to interpret dma_data_direction for later cases and
>> > xfer_direction for former ones.
>> dma_data_direction is the mapping attribute of a buffer and is not meant to
>> tell type of source and destination of a transfer.
>> xfer_direction is meant for that purpose.
>> So I'd rather convert device_prep_dma_cyclic and device_prep_slave_sg
>> to use xfer_direction.
>
> If the conversion id done for all drivers, then it should be fine...
>
I already said that many days ago. Though I am not sure what blocks
this patch if that conversion is done separately (which would touch many
subsystems).


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-07 11:27                                 ` Jassi Brar
  2011-10-07 14:19                                   ` Vinod Koul
@ 2011-10-11 16:44                                   ` Williams, Dan J
  2011-10-11 18:42                                     ` Jassi Brar
  2011-10-14 17:50                                     ` Bounine, Alexandre
  1 sibling, 2 replies; 131+ messages in thread
From: Williams, Dan J @ 2011-10-11 16:44 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Vinod Koul, Russell King, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang, Alexandre.Bounine

[ Adding Alexandre ]

On Fri, Oct 7, 2011 at 4:27 AM, Jassi Brar <jaswinder.singh@linaro.org> wrote:
> On 7 October 2011 11:15, Vinod Koul <vinod.koul@intel.com> wrote:
>
>> Thru this patch Jassi gave a very good try at merging DMA_SLAVE and
>> memcpy, but more we debate this, I am still not convinced about merging
>> memcpy and DMA_SLAVE yet.
>>
> Nobody is merging memcpy and DMA_SLAVE right away.
> The api's primary purpose is to support interleave transfers.
> Possibility to merge other prepares into this is a side-effect.
>
>> I would still argue that if we split this on same lines as current
>> mechanism, we have clean way to convey all details for both cases.
>>
> Do you mean to have separate interleaved transfer apis for Slave
> and Mem->Mem ? Please clarify.
>

This is a tangent, but it would be nice if this API extension also
covered the needs of the incoming RapidIO case which wants to specify
new device context information per operation (and not once at
configuration time, like slave case).  Would it be enough if the
transfer template included a (struct device *context) member at the
end?  Most dma users could ignore it, but RapidIO could use it to do
something like:

   struct rio_dev *rdev = container_of(context, typeof(*rdev), device);

That might not be enough, but I'm concerned that making the context a
(void *) is too flexible.  I'd rather have something like this than
acquiring a lock in rio_dma_prep_slave_sg() and holding it over
->prep().  The alternative is to extend device_prep_slave_sg to take
an extra parameter, but that impacts all other slave implementations
with a dead parameter.
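The container_of pattern Dan sketches can be shown in a self-contained form (minimal stand-in structs and a userspace container_of; the real rio_dev layout is in the RapidIO headers):

```c
#include <assert.h>
#include <stddef.h>

/* Minimal container_of, equivalent in effect to the kernel macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Illustrative stand-ins for the structures under discussion. */
struct device { int id; };
struct rio_dev { int destid; struct device device; };

/* Given only the generic (struct device *) context carried in the
 * transfer template, the RapidIO driver recovers its own device. */
static struct rio_dev *to_rio_dev(struct device *context)
{
	return container_of(context, struct rio_dev, device);
}
```

This is why a typed (struct device *) member is safer than a (void *): every user gets the same well-defined way back to the containing object.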

--
Dan


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-11 15:57                                                         ` Jassi Brar
@ 2011-10-11 16:45                                                           ` Vinod Koul
  0 siblings, 0 replies; 131+ messages in thread
From: Vinod Koul @ 2011-10-11 16:45 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Williams, Dan J, Russell King, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On Tue, 2011-10-11 at 21:27 +0530, Jassi Brar wrote:
> On 11 October 2011 17:26, Vinod Koul <vinod.koul@intel.com> wrote:
> >> >> >
> >> >> > But I am fine if we find a common ground and merge the two where dmac
> >> >> > can cleanly identify direction and mode it is operating.
> >> >> >
> >> >> The client would set the xfer_direction and dmac would interpret as
> >> >>
> >> >> enum xfer_direction {
> >> >>      MEM_TO_MEM,   ->  Async/Memcpy mode
> >> >>      MEM_TO_DEV,    ->  Slave mode & From Memory to Device
> >> >>      DEV_TO_MEM,    ->  Slave mode & From Device to Memory
> >> >>      DEV_TO_DEV,     ->  Slave mode & From Device to Device
> >> >> }
> >> >>
> >> >> How could it get any cleaner?
> >> > Consider the case of a dmac driver which supports interleaved dma as
> >> > well as memcpy and slave
> >> > It needs to interpret dma_data_direction for later cases and
> >> > xfer_direction for former ones.
> >> dma_data_direction is the mapping attribute of a buffer and is not meant to
> >> tell type of source and destination of a transfer.
> >> xfer_direction is meant for that purpose.
> >> So I'd rather convert device_prep_dma_cyclic and device_prep_slave_sg
> >> to use xfer_direction.
> >
> > If the conversion id done for all drivers, then it should be fine...
> >
> I already said that many days ago. Though I am not sure what blocks
> this patch if that conversion is done separately (which would touch many
> subsystems).
For this patchset, you need to remove frm_irq in template

-- 
~Vinod



* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-11 16:44                                   ` Williams, Dan J
@ 2011-10-11 18:42                                     ` Jassi Brar
  2011-10-14 18:11                                       ` Bounine, Alexandre
  2011-10-14 17:50                                     ` Bounine, Alexandre
  1 sibling, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-10-11 18:42 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: Vinod Koul, Russell King, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang, Alexandre.Bounine

On 11 October 2011 22:14, Williams, Dan J <dan.j.williams@intel.com> wrote:
> [ Adding Alexandre ]
>
> This is a tangent, but it would be nice if this API extension also
> covered the needs of the incoming RapidIO case which wants to specify
> new device context information per operation (and not once at
> configuration time, like slave case).  Would it be enough if the
> transfer template included a (struct device *context) member at the
> end?  Most dma users could ignore it, but RapidIO could use it to do
> something like:
>
>   struct rio_dev *rdev = container_of(context, typeof(*rdev), device);
>
> That might not be enough, but I'm concerned that making the context a
> (void *) is too flexible.  I'd rather have something like this than
> acquiring a lock in rio_dma_prep_slave_sg() and holding it over
> ->prep().  The alternative is to extend device_prep_slave_sg to take
> an extra parameter, but that impacts all other slave implementations
> with a dead parameter.
>
From what I read so far, the requirement is closer to prep_slave_sg
than to this api.

IMO, there should be a virtual channel for each device that the real
physical channel, at the backend, can transfer data to/from.

The client driver should request each virtual channel corresponding
to each target device it wants to transfer data with.

In the dmac driver - transfers queued for all virtual channels that are
backed by the same physical channel, could be added to the same
list and executed in FIFO manner.

That way, there won't be any need to hook target device info per transfer
and more importantly "struct dma_chan" would continue to mean
link between fixed 'endpoints'.
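The queueing scheme described above can be sketched minimally (all structure and function names here are illustrative, not proposed kernel API):

```c
#include <assert.h>
#include <stddef.h>

/* One hardware channel; descriptors from all of its virtual channels
 * are chained onto a single FIFO queue. */
struct desc { struct desc *next; int target_id; };
struct phys_chan { struct desc *head, **tail; };

/* One virtual channel per target device, all backed by one phys_chan. */
struct virt_chan { struct phys_chan *phys; int target_id; };

static void phys_chan_init(struct phys_chan *pc)
{
	pc->head = NULL;
	pc->tail = &pc->head;
}

/* Submit tags the descriptor with the per-device info and appends it to
 * the backing physical channel's queue, preserving FIFO order. */
static void vchan_submit(struct virt_chan *vc, struct desc *d)
{
	d->target_id = vc->target_id;
	d->next = NULL;
	*vc->phys->tail = d;
	vc->phys->tail = &d->next;
}
```

The per-transfer target information travels with the descriptor, so "struct dma_chan" itself stays a link between fixed endpoints.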


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-10 16:28                                                     ` Jassi Brar
  2011-10-11 11:56                                                       ` Vinod Koul
@ 2011-10-12  5:41                                                       ` Barry Song
  2011-10-12  6:19                                                         ` Vinod Koul
  1 sibling, 1 reply; 131+ messages in thread
From: Barry Song @ 2011-10-12  5:41 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Vinod Koul, Williams, Dan J, Russell King, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

2011/10/11 Jassi Brar <jaswinder.singh@linaro.org>:
> On 10 October 2011 21:32, Vinod Koul <vinod.koul@intel.com> wrote:
>> On Mon, 2011-10-10 at 16:46 +0530, Jassi Brar wrote:
>>> On 10 October 2011 16:15, Vinod Koul <vinod.koul@intel.com> wrote:
>>> >
>>> > But I am fine if we find a common ground and merge the two where dmac
>>> > can cleanly identify direction and mode it is operating.
>>> >
>>> The client would set the xfer_direction and dmac would interpret as
>>>
>>> enum xfer_direction {
>>>      MEM_TO_MEM,   ->  Async/Memcpy mode
>>>      MEM_TO_DEV,    ->  Slave mode & From Memory to Device
>>>      DEV_TO_MEM,    ->  Slave mode & From Device to Memory
>>>      DEV_TO_DEV,     ->  Slave mode & From Device to Device
>>> }
>>>
>>> How could it get any cleaner?
>> Consider the case of a dmac driver which supports interleaved dma as
>> well as memcpy and slave
>> It needs to interpret dma_data_direction for later cases and
>> xfer_direction for former ones.
> dma_data_direction is the mapping attribute of a buffer and is not meant to
> tell type of source and destination of a transfer.
> xfer_direction is meant for that purpose.
> So I'd rather convert device_prep_dma_cyclic and device_prep_slave_sg
> to use xfer_direction.

I tend to agree with Jassi. dma_data_direction currently describes only
the mapping, not the real transfer direction.
xfer_direction is something that really tells the data transfer direction,
and I think that's what device_prep_dma_cyclic and device_prep_slave_sg want.

Actually, there is only one case where we need dma_data_direction in a
dmac driver: unmapping an async dma buffer.
But the param to dma_unmap_single can be implied by xfer_direction and
the dma descriptor.

That's why I think we should rename dma_data_direction to
dma_map_direction or something like that, to avoid confusion.

-barry
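Barry's point that the unmap parameter can be implied admits a very small sketch (stand-in enums mirroring the kernel's dma_data_direction and the proposed xfer_direction; the helper name is hypothetical):

```c
#include <assert.h>

/* Stand-ins: dma_data_direction mirrors include/linux/dma-mapping.h,
 * xfer_direction mirrors the enum proposed in this thread. */
enum dma_data_direction { DMA_BIDIRECTIONAL, DMA_TO_DEVICE,
			  DMA_FROM_DEVICE, DMA_NONE };
enum xfer_direction { MEM_TO_MEM, MEM_TO_DEV, DEV_TO_MEM, DEV_TO_DEV };

/* Hypothetical helper: derive the dma_unmap_single() direction from the
 * transfer direction of the completed descriptor. */
static enum dma_data_direction unmap_dir(enum xfer_direction d)
{
	switch (d) {
	case MEM_TO_DEV: return DMA_TO_DEVICE;   /* CPU wrote, device read */
	case DEV_TO_MEM: return DMA_FROM_DEVICE; /* device wrote, CPU reads */
	case MEM_TO_MEM: return DMA_BIDIRECTIONAL;
	default:         return DMA_NONE;        /* DEV_TO_DEV: no CPU buffer */
	}
}
```

So a dmac driver completing a descriptor never needs the client to pass dma_data_direction separately.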


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-12  5:41                                                       ` Barry Song
@ 2011-10-12  6:19                                                         ` Vinod Koul
  2011-10-12  6:30                                                           ` Jassi Brar
  2011-10-12  6:53                                                           ` Barry Song
  0 siblings, 2 replies; 131+ messages in thread
From: Vinod Koul @ 2011-10-12  6:19 UTC (permalink / raw)
  To: Barry Song
  Cc: Jassi Brar, Williams, Dan J, Russell King, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On Wed, 2011-10-12 at 13:41 +0800, Barry Song wrote:
> 2011/10/11 Jassi Brar <jaswinder.singh@linaro.org>:
> > On 10 October 2011 21:32, Vinod Koul <vinod.koul@intel.com> wrote:
> >> On Mon, 2011-10-10 at 16:46 +0530, Jassi Brar wrote:
> >>> On 10 October 2011 16:15, Vinod Koul <vinod.koul@intel.com> wrote:
> >>> >
> >>> > But I am fine if we find a common ground and merge the two where dmac
> >>> > can cleanly identify direction and mode it is operating.
> >>> >
> >>> The client would set the xfer_direction and dmac would interpret as
> >>>
> >>> enum xfer_direction {
> >>>      MEM_TO_MEM,   ->  Async/Memcpy mode
> >>>      MEM_TO_DEV,    ->  Slave mode & From Memory to Device
> >>>      DEV_TO_MEM,    ->  Slave mode & From Device to Memory
> >>>      DEV_TO_DEV,     ->  Slave mode & From Device to Device
> >>> }
> >>>
> >>> How could it get any cleaner?
> >> Consider the case of a dmac driver which supports interleaved dma as
> >> well as memcpy and slave
> >> It needs to interpret dma_data_direction for later cases and
> >> xfer_direction for former ones.
> > dma_data_direction is the mapping attribute of a buffer and is not meant to
> > tell type of source and destination of a transfer.
> > xfer_direction is meant for that purpose.
> > So I'd rather convert device_prep_dma_cyclic and device_prep_slave_sg
> > to use xfer_direction.
> 
> i tend to agree with Jassi. now dma_data_direction actually is only
> mapping things not real transfer direction.
> xfer_direction is now something really telling the data transfer direction.
> I think that's what device_prep_dma_cyclic and device_prep_slave_sg want.
> 
> actually, there is only one case we need to use dma_data_direction in
> dmac driver, that's unmapping async dma buffer.
> But the param to dma_unmap_single can be implied by xfer_direction and
> dma description.
> 
> that's why i think we should rename dma_data_direction to
> dma_map_direction or something like that to avoid confusion.
Nope, I would leave dma_data_direction as is.
We should use above enum in dmaengine and all dmac drivers.

@Jassi: I have started doing this change for dmaengine and dmacs, and I
took the liberty to name this enum as dma_transfer_direction, hope you
are okay with that


-- 
~Vinod



* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-12  6:19                                                         ` Vinod Koul
@ 2011-10-12  6:30                                                           ` Jassi Brar
  2011-10-12  6:53                                                           ` Barry Song
  1 sibling, 0 replies; 131+ messages in thread
From: Jassi Brar @ 2011-10-12  6:30 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Barry Song, Williams, Dan J, Russell King, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On 12 October 2011 11:49, Vinod Koul <vinod.koul@intel.com> wrote:
>
> @Jassi: I have started doing this change for dmaengine and dmacs, and I
> took the liberty to name this enum as dma_transfer_direction, hope you
> are okay with that
>
OK. Thanks.


* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-12  6:19                                                         ` Vinod Koul
  2011-10-12  6:30                                                           ` Jassi Brar
@ 2011-10-12  6:53                                                           ` Barry Song
  1 sibling, 0 replies; 131+ messages in thread
From: Barry Song @ 2011-10-12  6:53 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Jassi Brar, Williams, Dan J, Russell King, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

2011/10/12 Vinod Koul <vinod.koul@intel.com>:
> On Wed, 2011-10-12 at 13:41 +0800, Barry Song wrote:
>> 2011/10/11 Jassi Brar <jaswinder.singh@linaro.org>:
>> > On 10 October 2011 21:32, Vinod Koul <vinod.koul@intel.com> wrote:
>> >> On Mon, 2011-10-10 at 16:46 +0530, Jassi Brar wrote:
>> >>> On 10 October 2011 16:15, Vinod Koul <vinod.koul@intel.com> wrote:
>> >>> >
>> >>> > But I am fine if we find a common ground and merge the two where dmac
>> >>> > can cleanly identify direction and mode it is operating.
>> >>> >
>> >>> The client would set the xfer_direction and dmac would interpret as
>> >>>
>> >>> enum xfer_direction {
>> >>>      MEM_TO_MEM,   ->  Async/Memcpy mode
>> >>>      MEM_TO_DEV,    ->  Slave mode & From Memory to Device
>> >>>      DEV_TO_MEM,    ->  Slave mode & From Device to Memory
>> >>>      DEV_TO_DEV,     ->  Slave mode & From Device to Device
>> >>> }
>> >>>
>> >>> How could it get any cleaner?
>> >> Consider the case of a dmac driver which supports interleaved dma as
>> >> well as memcpy and slave
>> >> It needs to interpret dma_data_direction for later cases and
>> >> xfer_direction for former ones.
>> > dma_data_direction is the mapping attribute of a buffer and is not meant to
>> > tell type of source and destination of a transfer.
>> > xfer_direction is meant for that purpose.
>> > So I'd rather convert device_prep_dma_cyclic and device_prep_slave_sg
>> > to use xfer_direction.
>>
>> i tend to agree with Jassi. now dma_data_direction actually is only
>> mapping things not real transfer direction.
>> xfer_direction is now something really telling the data transfer direction.
>> I think that's what device_prep_dma_cyclic and device_prep_slave_sg want.
>>
>> actually, there is only one case we need to use dma_data_direction in
>> dmac driver, that's unmapping async dma buffer.
>> But the param to dma_unmap_single can be implied by xfer_direction and
>> dma description.
>>
>> that's why i think we should rename dma_data_direction to
>> dma_map_direction or something like that to avoid confusion.
> Nope, I would leave dma_data_direction as is.
> We should use above enum in dmaengine and all dmac drivers.

Anyway, the rename issue is trivial; I can accept the current name as
is even if it doesn't make complete sense.
The important thing is the dma_transfer_direction (or
dma_xfer_direction), as Jassi and you said.

>
> @Jassi: I have started doing this change for dmaengine and dmacs, and I
> took the liberty to name this enum as dma_transfer_direction, hope you
> are okay with that
>
>
> --
> ~Vinod
>
-barry


* [PATCHv5] DMAEngine: Define interleaved transfer request api
  2011-09-28  6:39     ` [PATCHv4] " Jassi Brar
  2011-09-28  9:03       ` Vinod Koul
@ 2011-10-13  7:03       ` Jassi Brar
  2011-10-14  7:32         ` Barry Song
  2011-10-14 15:16         ` Vinod Koul
  1 sibling, 2 replies; 131+ messages in thread
From: Jassi Brar @ 2011-10-13  7:03 UTC (permalink / raw)
  To: linux-kernel, dan.j.williams, vkoul; +Cc: rmk, 21cnbao, Jassi Brar

Define a new api that could be used for doing fancy data transfers
like interleaved to contiguous copy and vice-versa.
Traditional SG_list based transfers tend to be very inefficient in
such cases, where the interleave and chunk are only a few bytes,
which calls for a very condensed api to convey the pattern of the transfer.
This api supports all 4 variants of scatter-gather and contiguous transfer.

Of course, this api cannot help transfers that don't lend themselves to
DMA by nature, i.e., scattered tiny read/writes with no periodic pattern.

Also since now we support SLAVE channels that might not provide
device_prep_slave_sg callback but device_prep_interleaved_dma,
remove the BUG_ON check.

Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
---

Dan,
 After waiting for a reply from you/Alexandre a couple of days I am revising
 my patch without adding a hook for RapidIO case for 2 reasons:
  a) I think the RapidIO requirement is served better by the concept of
     virtual channels, rather than hacking "struct dma_chan" to reach more
     than one device.
  b) If Alexandre comes up with something irresistible, we can always add
     the hook later.

Changes since v4:
1) Dropped the 'frm_irq' member.
2) Renamed 'xfer_direction' to 'dma_transfer_direction'

Changes since v3:
1) Added explicit type for source and destination.

Changes since v2:
1) Added some notes to documentation.
2) Removed the BUG_ON check that expects every SLAVE channel to
   provide a prep_slave_sg, as we are now valid otherwise too.
3) Fixed the DMA_TX_TYPE_END offset - made it last element of enum.
4) Renamed prep_dma_genxfer to prep_interleaved_dma as Vinod wanted.

Changes since v1:
1) Dropped the 'dma_transaction_type' member until we really
   merge another type into this api. Instead added special
   type for this api - DMA_GENXFER in dma_transaction_type.
2) Renamed 'xfer_template' to 'dmaxfer_template' in order to

 Documentation/dmaengine.txt |    8 ++++
 drivers/dma/dmaengine.c     |    4 +-
 include/linux/dmaengine.h   |   82 +++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 90 insertions(+), 4 deletions(-)

diff --git a/Documentation/dmaengine.txt b/Documentation/dmaengine.txt
index 94b7e0f..962a2d3 100644
--- a/Documentation/dmaengine.txt
+++ b/Documentation/dmaengine.txt
@@ -75,6 +75,10 @@ The slave DMA usage consists of following steps:
    slave_sg	- DMA a list of scatter gather buffers from/to a peripheral
    dma_cyclic	- Perform a cyclic DMA operation from/to a peripheral till the
 		  operation is explicitly stopped.
+   interleaved_dma - This is common to Slave as well as M2M clients. For slave
+		 channels, the address of the device's fifo may already be known
+		 to the driver. Various types of operations can be expressed by
+		 setting appropriate values in the 'dmaxfer_template' members.
 
    A non-NULL return of this transfer API represents a "descriptor" for
    the given transaction.
@@ -89,6 +93,10 @@ The slave DMA usage consists of following steps:
 		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
 		size_t period_len, enum dma_data_direction direction);
 
+	struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
+		struct dma_chan *chan, struct dmaxfer_template *xt,
+		unsigned long flags);
+
    The peripheral driver is expected to have mapped the scatterlist for
    the DMA operation prior to calling device_prep_slave_sg, and must
    keep the scatterlist mapped until the DMA operation has completed.
diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index b48967b..a6c6051 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -693,12 +693,12 @@ int dma_async_device_register(struct dma_device *device)
 		!device->device_prep_dma_interrupt);
 	BUG_ON(dma_has_cap(DMA_SG, device->cap_mask) &&
 		!device->device_prep_dma_sg);
-	BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
-		!device->device_prep_slave_sg);
 	BUG_ON(dma_has_cap(DMA_CYCLIC, device->cap_mask) &&
 		!device->device_prep_dma_cyclic);
 	BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
 		!device->device_control);
+	BUG_ON(dma_has_cap(DMA_INTERLEAVE, device->cap_mask) &&
+		!device->device_prep_interleaved_dma);
 
 	BUG_ON(!device->device_alloc_chan_resources);
 	BUG_ON(!device->device_free_chan_resources);
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 8fbf40e..ce8c40a 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -71,11 +71,85 @@ enum dma_transaction_type {
 	DMA_ASYNC_TX,
 	DMA_SLAVE,
 	DMA_CYCLIC,
+	DMA_INTERLEAVE,
+/* last transaction type for creation of the capabilities mask */
+	DMA_TX_TYPE_END,
 };
 
-/* last transaction type for creation of the capabilities mask */
-#define DMA_TX_TYPE_END (DMA_CYCLIC + 1)
+enum dma_transfer_direction {
+	MEM_TO_MEM,
+	MEM_TO_DEV,
+	DEV_TO_MEM,
+	DEV_TO_DEV,
+};
+
+/**
+ * Interleaved Transfer Request
+ * ----------------------------
+ * A chunk is a collection of contiguous bytes to be transferred.
+ * The gap (in bytes) between two chunks is called the inter-chunk-gap (ICG).
+ * ICGs may or may not change between chunks.
+ * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
+ *  that when repeated an integral number of times, specifies the transfer.
+ * A transfer template is a specification of a Frame, the number of times
+ *  it is to be repeated, and other per-transfer attributes.
+ *
+ * Practically, a client driver would have ready a template for each
+ *  type of transfer it is going to need during its lifetime and
+ *  set only 'src_start' and 'dst_start' before submitting the requests.
+ *
+ *
+ *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
+ *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
+ *
+ *    ==  Chunk size
+ *    ... ICG
+ */
 
+/**
+ * struct data_chunk - Element of scatter-gather list that makes a frame.
+ * @size: Number of bytes to read from source.
+ *	  size_dst := fn(op, size_src), so doesn't mean much for destination.
+ * @icg: Number of bytes to jump after last src/dst address of this
+ *	 chunk and before first src/dst address for next chunk.
+ *	 Ignored for dst (assumed 0) if dst_inc is true and dst_sgl is false.
+ *	 Ignored for src (assumed 0) if src_inc is true and src_sgl is false.
+ */
+struct data_chunk {
+	size_t size;
+	size_t icg;
+};
+
+/**
+ * struct dmaxfer_template - Template to convey DMAC the transfer pattern
+ *	 and attributes.
+ * @src_start: Bus address of source for the first chunk.
+ * @dst_start: Bus address of destination for the first chunk.
+ * @dir: Specifies the type of Source and Destination.
+ * @src_inc: If the source address increments after reading from it.
+ * @dst_inc: If the destination address increments after writing to it.
+ * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
+ *		Otherwise, source is read contiguously (icg ignored).
+ *		Ignored if src_inc is false.
+ * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
+ *		Otherwise, destination is filled contiguously (icg ignored).
+ *		Ignored if dst_inc is false.
+ * @numf: Number of frames in this template.
+ * @frame_size: Number of chunks in a frame, i.e., size of sgl[].
+ * @sgl: Array of {chunk,icg} pairs that make up a frame.
+ */
+struct dmaxfer_template {
+	dma_addr_t src_start;
+	dma_addr_t dst_start;
+	enum dma_transfer_direction dir;
+	bool src_inc;
+	bool dst_inc;
+	bool src_sgl;
+	bool dst_sgl;
+	size_t numf;
+	size_t frame_size;
+	struct data_chunk sgl[0];
+};
 
 /**
  * enum dma_ctrl_flags - DMA flags to augment operation preparation,
@@ -432,6 +506,7 @@ struct dma_tx_state {
  * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
  *	The function takes a buffer of size buf_len. The callback function will
  *	be called after period_len bytes have been transferred.
+ * @device_prep_interleaved_dma: Express a transfer pattern in a generic way.
  * @device_control: manipulate all pending operations on a channel, returns
  *	zero or error code
  * @device_tx_status: poll for transaction completion, the optional
@@ -496,6 +571,9 @@ struct dma_device {
 	struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
 		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
 		size_t period_len, enum dma_data_direction direction);
+	struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
+		struct dma_chan *chan, struct dmaxfer_template *xt,
+		unsigned long flags);
 	int (*device_control)(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
 		unsigned long arg);
 
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 131+ messages in thread

* Re: [PATCHv5] DMAEngine: Define interleaved transfer request api
  2011-10-13  7:03       ` [PATCHv5] " Jassi Brar
@ 2011-10-14  7:32         ` Barry Song
  2011-10-14 11:51           ` Jassi Brar
  2011-10-14 15:16         ` Vinod Koul
  1 sibling, 1 reply; 131+ messages in thread
From: Barry Song @ 2011-10-14  7:32 UTC (permalink / raw)
  To: Jassi Brar
  Cc: linux-kernel, dan.j.williams, vkoul, rmk, DL-SHA-WorkGroupLinux

Hi Jassi,

2011/10/13 Jassi Brar <jaswinder.singh@linaro.org>:
> Define a new api that could be used for doing fancy data transfers
> like interleaved to contiguous copy and vice-versa.
> Traditional SG_list based transfers tend to be very inefficient in
> such cases as where the interleave and chunk are only a few bytes,
> which call for a very condensed api to convey pattern of the transfer.
> This api supports all 4 variants of scatter-gather and contiguous transfer.
>
> Of course, neither can this api help transfers that don't lend to DMA by
> nature, i.e, scattered tiny read/writes with no periodic pattern.
>
> Also since now we support SLAVE channels that might not provide
> device_prep_slave_sg callback but device_prep_interleaved_dma,
> remove the BUG_ON check.
>
> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
> ---
>
> Dan,
>  After waiting for a reply from you/Alexandre a couple of days I am revising
>  my patch without adding a hook for RapidIO case for 2 reasons:
>  a) I think the RapidIO requirement is served better by the concept of
>     virtual channels, rather than hacking "struct dma_chan" to reach more
>     than one device.
>  b) If Alexandre comes up with something irresistible, we can always add
>     the hook later.
>
> Changes since v4:
> 1) Dropped the 'frm_irq' member.
> 2) Renamed 'xfer_direction' to 'dma_transfer_direction'
>
> Changes since v3:
> 1) Added explicit type for source and destination.
>
> Changes since v2:
> 1) Added some notes to documentation.
> 2) Removed the BUG_ON check that expects every SLAVE channel to
>   provide a prep_slave_sg, as we are now valid otherwise too.
> 3) Fixed the DMA_TX_TYPE_END offset - made it last element of enum.
> 4) Renamed prep_dma_genxfer to prep_interleaved_dma as Vinod wanted.
>
> Changes since v1:
> 1) Dropped the 'dma_transaction_type' member until we really
>   merge another type into this api. Instead added special
>   type for this api - DMA_GENXFER in dma_transaction_type.
> 2) Renamed 'xfer_template' to 'dmaxfer_template' inorder to
>
>  Documentation/dmaengine.txt |    8 ++++
>  drivers/dma/dmaengine.c     |    4 +-
>  include/linux/dmaengine.h   |   82 +++++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 90 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/dmaengine.txt b/Documentation/dmaengine.txt
> index 94b7e0f..962a2d3 100644
> --- a/Documentation/dmaengine.txt
> +++ b/Documentation/dmaengine.txt
> @@ -75,6 +75,10 @@ The slave DMA usage consists of following steps:
>    slave_sg    - DMA a list of scatter gather buffers from/to a peripheral
>    dma_cyclic  - Perform a cyclic DMA operation from/to a peripheral till the
>                  operation is explicitly stopped.
> +   interleaved_dma - This is common to Slave as well as M2M clients. For slave
> +                address of devices' fifo could be already known to the driver.
> +                Various types of operations could be expressed by setting
> +                appropriate values to the 'dmaxfer_template' members.
>
>    A non-NULL return of this transfer API represents a "descriptor" for
>    the given transaction.
> @@ -89,6 +93,10 @@ The slave DMA usage consists of following steps:
>                struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
>                size_t period_len, enum dma_data_direction direction);
>
> +       struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
> +               struct dma_chan *chan, struct dmaxfer_template *xt,
> +               unsigned long flags);
> +

What if I want a cyclic interleaved transfer? I think a cyclic
interleaved transfer is what I need for audio DMA.

>    The peripheral driver is expected to have mapped the scatterlist for
>    the DMA operation prior to calling device_prep_slave_sg, and must
>    keep the scatterlist mapped until the DMA operation has completed.
> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> index b48967b..a6c6051 100644
> --- a/drivers/dma/dmaengine.c
> +++ b/drivers/dma/dmaengine.c
> @@ -693,12 +693,12 @@ int dma_async_device_register(struct dma_device *device)
>                !device->device_prep_dma_interrupt);
>        BUG_ON(dma_has_cap(DMA_SG, device->cap_mask) &&
>                !device->device_prep_dma_sg);
> -       BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
> -               !device->device_prep_slave_sg);
>        BUG_ON(dma_has_cap(DMA_CYCLIC, device->cap_mask) &&
>                !device->device_prep_dma_cyclic);
>        BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
>                !device->device_control);
> +       BUG_ON(dma_has_cap(DMA_INTERLEAVE, device->cap_mask) &&
> +               !device->device_prep_interleaved_dma);
>
>        BUG_ON(!device->device_alloc_chan_resources);
>        BUG_ON(!device->device_free_chan_resources);
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index 8fbf40e..ce8c40a 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -71,11 +71,85 @@ enum dma_transaction_type {
>        DMA_ASYNC_TX,
>        DMA_SLAVE,
>        DMA_CYCLIC,
> +       DMA_INTERLEAVE,
> +/* last transaction type for creation of the capabilities mask */
> +       DMA_TX_TYPE_END,
>  };
>
> -/* last transaction type for creation of the capabilities mask */
> -#define DMA_TX_TYPE_END (DMA_CYCLIC + 1)
> +enum dma_transfer_direction {
> +       MEM_TO_MEM,
> +       MEM_TO_DEV,
> +       DEV_TO_MEM,
> +       DEV_TO_DEV,
> +};

Vinod has sent this as a separate patch.

> +
> +/**
> + * Interleaved Transfer Request
> + * ----------------------------
> + * A chunk is collection of contiguous bytes to be transfered.
> + * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
> + * ICGs may or maynot change between chunks.
> + * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
> + *  that when repeated an integral number of times, specifies the transfer.
> + * A transfer template is specification of a Frame, the number of times
> + *  it is to be repeated and other per-transfer attributes.
> + *
> + * Practically, a client driver would have ready a template for each
> + *  type of transfer it is going to need during its lifetime and
> + *  set only 'src_start' and 'dst_start' before submitting the requests.
> + *
> + *
> + *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
> + *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
> + *
> + *    ==  Chunk size
> + *    ... ICG
> + */
>
> +/**
> + * struct data_chunk - Element of scatter-gather list that makes a frame.
> + * @size: Number of bytes to read from source.
> + *       size_dst := fn(op, size_src), so doesn't mean much for destination.
> + * @icg: Number of bytes to jump after last src/dst address of this
> + *      chunk and before first src/dst address for next chunk.
> + *      Ignored for dst(assumed 0), if dst_inc is true and dst_sgl is false.
> + *      Ignored for src(assumed 0), if src_inc is true and src_sgl is false.
> + */
> +struct data_chunk {
> +       size_t size;
> +       size_t icg;
> +};
> +
> +/**
> + * struct dmaxfer_template - Template to convey DMAC the transfer pattern
> + *      and attributes.
> + * @src_start: Bus address of source for the first chunk.
> + * @dst_start: Bus address of destination for the first chunk.
> + * @dir: Specifies the type of Source and Destination.
> + * @src_inc: If the source address increments after reading from it.
> + * @dst_inc: If the destination address increments after writing to it.
> + * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
> + *             Otherwise, source is read contiguously (icg ignored).
> + *             Ignored if src_inc is false.
> + * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
> + *             Otherwise, destination is filled contiguously (icg ignored).
> + *             Ignored if dst_inc is false.
> + * @numf: Number of frames in this template.
> + * @frame_size: Number of chunks in a frame i.e, size of sgl[].
> + * @sgl: Array of {chunk,icg} pairs that make up a frame.
> + */
> +struct dmaxfer_template {
> +       dma_addr_t src_start;
> +       dma_addr_t dst_start;
> +       enum dma_transfer_direction dir;
> +       bool src_inc;
> +       bool dst_inc;
> +       bool src_sgl;
> +       bool dst_sgl;
> +       size_t numf;
> +       size_t frame_size;
> +       struct data_chunk sgl[0];
> +};
>
>  /**
>  * enum dma_ctrl_flags - DMA flags to augment operation preparation,
> @@ -432,6 +506,7 @@ struct dma_tx_state {
>  * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
>  *     The function takes a buffer of size buf_len. The callback function will
>  *     be called after period_len bytes have been transferred.
> + * @device_prep_interleaved_dma: Transfer expression in a generic way.
>  * @device_control: manipulate all pending operations on a channel, returns
>  *     zero or error code
>  * @device_tx_status: poll for transaction completion, the optional
> @@ -496,6 +571,9 @@ struct dma_device {
>        struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
>                struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
>                size_t period_len, enum dma_data_direction direction);
> +       struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
> +               struct dma_chan *chan, struct dmaxfer_template *xt,
> +               unsigned long flags);
>        int (*device_control)(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
>                unsigned long arg);
>
> --
> 1.7.4.1

-barry

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv5] DMAEngine: Define interleaved transfer request api
  2011-10-14  7:32         ` Barry Song
@ 2011-10-14 11:51           ` Jassi Brar
  2011-10-14 13:31             ` Vinod Koul
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-10-14 11:51 UTC (permalink / raw)
  To: Barry Song
  Cc: linux-kernel, dan.j.williams, vkoul, rmk, DL-SHA-WorkGroupLinux

On 14 October 2011 13:02, Barry Song <21cnbao@gmail.com> wrote:
>> diff --git a/Documentation/dmaengine.txt b/Documentation/dmaengine.txt
>> index 94b7e0f..962a2d3 100644
>> --- a/Documentation/dmaengine.txt
>> +++ b/Documentation/dmaengine.txt
>> @@ -75,6 +75,10 @@ The slave DMA usage consists of following steps:
>>    slave_sg    - DMA a list of scatter gather buffers from/to a peripheral
>>    dma_cyclic  - Perform a cyclic DMA operation from/to a peripheral till the
>>                  operation is explicitly stopped.
>> +   interleaved_dma - This is common to Slave as well as M2M clients. For slave
>> +                address of devices' fifo could be already known to the driver.
>> +                Various types of operations could be expressed by setting
>> +                appropriate values to the 'dmaxfer_template' members.
>>
>>    A non-NULL return of this transfer API represents a "descriptor" for
>>    the given transaction.
>> @@ -89,6 +93,10 @@ The slave DMA usage consists of following steps:
>>                struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
>>                size_t period_len, enum dma_data_direction direction);
>>
>> +       struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
>> +               struct dma_chan *chan, struct dmaxfer_template *xt,
>> +               unsigned long flags);
>> +
>
> what if i want a cyclic interleaved transfer? i think the cyclic
> interleaved transfer is what i want for audio dma.
>
... we need to restore 'bool frm_irq' and add a new 'bool cyclic' that
would replay the transfer (i.e., reset the DMA pointers to src_start &
dst_start) after 'numf' frames have been transferred.

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv5] DMAEngine: Define interleaved transfer request api
  2011-10-14 11:51           ` Jassi Brar
@ 2011-10-14 13:31             ` Vinod Koul
  2011-10-14 13:51               ` Jassi Brar
  2011-10-14 14:55               ` Barry Song
  0 siblings, 2 replies; 131+ messages in thread
From: Vinod Koul @ 2011-10-14 13:31 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Barry Song, linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

On Fri, 2011-10-14 at 17:21 +0530, Jassi Brar wrote:
> On 14 October 2011 13:02, Barry Song <21cnbao@gmail.com> wrote:
> >
> > what if i want a cyclic interleaved transfer? i think the cyclic
> > interleaved transfer is what i want for audio dma.
> >
> ... we need to restore 'bool frm_irq' and add new 'bool cyclic' that
> would replay the transfer(i.e, reset dma-pointers to src_start & dst_start)
> after 'numf' frames have been transferred.
I was thinking more along the lines of having this conveyed through a flag.

Anyway, I plan to work on merging device_prep_slave_sg and
device_prep_cyclic into a single API. Think of device_prep_cyclic as a
special case with an sg length of one and a flag telling the DMAC it is cyclic.

Similarly, here we could use/define this flag to say the transfer is
also cyclic in nature, and the DMAC then reloads the list again.
That way, any prep can be made cyclic in nature just by using this flag.

@Barry: Why would you need to use interleaved API for audio?

-- 
~Vinod


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv5] DMAEngine: Define interleaved transfer request api
  2011-10-14 13:31             ` Vinod Koul
@ 2011-10-14 13:51               ` Jassi Brar
  2011-10-14 14:05                 ` Vinod Koul
  2011-10-14 14:55               ` Barry Song
  1 sibling, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-10-14 13:51 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Barry Song, linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

On 14 October 2011 19:01, Vinod Koul <vinod.koul@intel.com> wrote:
> On Fri, 2011-10-14 at 17:21 +0530, Jassi Brar wrote:
>> On 14 October 2011 13:02, Barry Song <21cnbao@gmail.com> wrote:
>> >
>> > what if i want a cyclic interleaved transfer? i think the cyclic
>> > interleaved transfer is what i want for audio dma.
>> >
>> ... we need to restore 'bool frm_irq' and add new 'bool cyclic' that
>> would replay the transfer(i.e, reset dma-pointers to src_start & dst_start)
>> after 'numf' frames have been transferred.
> I was thinking more on lines to have this conveyed thru a flag.
>
Sorry, I don't see exactly what you mean.
frm_irq and numf are independent of each other and could
cover more cases than a single flag.

> Anyway I plan to work on merging device_prep_slave_sg and
> device_prep_cyclic to single API. Think more of device_prep_cyclic as
> special case with sg length one and flag to tell dmac its cyclic.
>
AFAIK, the ring buffer is usually divided into N (>1) different
periods that need to be transferred in an endless loop.

> Similarly here we could use/define this flag to say this transfer is
> also cyclic in nature and dmac then reloads the list again.
> That way any prep can be made cyclic in nature by just using this flag.
>
Some cyclic transfers might want a callback after each period/frame,
while some might not. We can't cover that using just one flag.

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv5] DMAEngine: Define interleaved transfer request api
  2011-10-14 13:51               ` Jassi Brar
@ 2011-10-14 14:05                 ` Vinod Koul
  2011-10-14 14:18                   ` Vinod Koul
  0 siblings, 1 reply; 131+ messages in thread
From: Vinod Koul @ 2011-10-14 14:05 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Barry Song, linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

On Fri, 2011-10-14 at 19:21 +0530, Jassi Brar wrote:
> >> ... we need to restore 'bool frm_irq' and add new 'bool cyclic' that
> >> would replay the transfer(i.e, reset dma-pointers to src_start & dst_start)
> >> after 'numf' frames have been transferred.
> > I was thinking more on lines to have this conveyed thru a flag.
> >
> Sorry don't see exactly what you mean.
> frm_irq and numf are independent of each other and could
> cover more cases than a single flag.
> 
> > Anyway I plan to work on merging device_prep_slave_sg and
> > device_prep_cyclic to single API. Think more of device_prep_cyclic as
> > special case with sg length one and flag to tell dmac its cyclic.
> >
> AFAIK, usually the ring buffer is divided into N (>1) different
> periods that need
> to be transferred in endless loop.
But that is not interleaved DMA; we have a cyclic API, and Barry should
be using that.
> 
> > Similarly here we could use/define this flag to say this transfer is
> > also cyclic in nature and dmac then reloads the list again.
> > That way any prep can be made cyclic in nature by just using this flag.
> >
> Some cyclic transfers might want callback done after each period/frame
> while some not. We can't cover using just one flag.
Again, we would typically have two cases:
1) notification at the end of the whole list
2) notification at the end of each frame and at the end of the list
If we use the ACK flag along with the callback field, we can achieve the
above by defining the rules as:
- if callback is set, notify after list completion (it is set in the
descriptor for this purpose)
- if ACK is set, then also call back for each frame
If we agree, then we should add this info to the documentation and fix
the existing behavior.

-- 
~Vinod


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv5] DMAEngine: Define interleaved transfer request api
  2011-10-14 14:05                 ` Vinod Koul
@ 2011-10-14 14:18                   ` Vinod Koul
  0 siblings, 0 replies; 131+ messages in thread
From: Vinod Koul @ 2011-10-14 14:18 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Barry Song, linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

On Fri, 2011-10-14 at 19:35 +0530, Vinod Koul wrote:
> On Fri, 2011-10-14 at 19:21 +0530, Jassi Brar wrote:
> > >> ... we need to restore 'bool frm_irq' and add new 'bool cyclic' that
> > >> would replay the transfer(i.e, reset dma-pointers to src_start & dst_start)
> > >> after 'numf' frames have been transferred.
> > > I was thinking more on lines to have this conveyed thru a flag.
> > >
> > Sorry don't see exactly what you mean.
> > frm_irq and numf are independent of each other and could
> > cover more cases than a single flag.
> > 
> > > Anyway I plan to work on merging device_prep_slave_sg and
> > > device_prep_cyclic to single API. Think more of device_prep_cyclic as
> > > special case with sg length one and flag to tell dmac its cyclic.
> > >
> > AFAIK, usually the ring buffer is divided into N (>1) different
> > periods that need
> > to be transferred in endless loop.
> But that is not interleaved dma, we have a cyclic API, barry should be
> using that
> > 
> > > Similarly here we could use/define this flag to say this transfer is
> > > also cyclic in nature and dmac then reloads the list again.
> > > That way any prep can be made cyclic in nature by just using this flag.
> > >
> > Some cyclic transfers might want callback done after each period/frame
> > while some not. We can't cover using just one flag.
> Again, we would typically have two cases:
> 1) notification at end of each list
> 2) notification at end of each frame and complete list
> If we use ACK flag along with callback field we can achieve above by
> defining rules as 
> - if callback is set, notify after list completion (this is set in
> descriptor for this purpose)
> - if ACK is set, then call for each frame as well
> If we agree then, we should add this info to documentation and fix
> existing behavior
Sorry, I meant the DMA_PREP_INTERRUPT flag above :)


-- 
~Vinod


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv5] DMAEngine: Define interleaved transfer request api
  2011-10-14 13:31             ` Vinod Koul
  2011-10-14 13:51               ` Jassi Brar
@ 2011-10-14 14:55               ` Barry Song
  2011-10-14 15:06                 ` Vinod Koul
  1 sibling, 1 reply; 131+ messages in thread
From: Barry Song @ 2011-10-14 14:55 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Jassi Brar, linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

Hi Vinod,

2011/10/14 Vinod Koul <vinod.koul@intel.com>:
> On Fri, 2011-10-14 at 17:21 +0530, Jassi Brar wrote:
>> On 14 October 2011 13:02, Barry Song <21cnbao@gmail.com> wrote:
>> >
>> > what if i want a cyclic interleaved transfer? i think the cyclic
>> > interleaved transfer is what i want for audio dma.
>> >
>> ... we need to restore 'bool frm_irq' and add new 'bool cyclic' that
>> would replay the transfer(i.e, reset dma-pointers to src_start & dst_start)
>> after 'numf' frames have been transferred.
> I was thinking more on lines to have this conveyed thru a flag.
>
> Anyway I plan to work on merging device_prep_slave_sg and
> device_prep_cyclic to single API. Think more of device_prep_cyclic as
> special case with sg length one and flag to tell dmac its cyclic.
>
> Similarly here we could use/define this flag to say this transfer is
> also cyclic in nature and dmac then reloads the list again.
> That way any prep can be made cyclic in nature by just using this flag.
>
> @Barry: Why would you need to use interleaved API for audio?

First of all, audio DMA is typically cyclic. If no underflow or
overflow happens, it will always be running; an underflow or overflow
will trigger termination of the cyclic DMA.

in case the audio PCM data is interleaved and saved in the DMA buffer as below:

left (2B)  right (2B)  left (2B)  right (2B)  left (2B)  right (2B)  left (2B)  right (2B)
left (2B)  right (2B)  left (2B)  right (2B)  left (2B)  right (2B)  left (2B)  right (2B)
left (2B)  right (2B)  left (2B)  right (2B)  left (2B)  right (2B)  left (2B)  right (2B)
left (2B)  right (2B)  left (2B)  right (2B)  left (2B)  right (2B)  left (2B)  right (2B)
...

and some hardware needs two separate DMA channels to transfer the left
and right audio channels.

For both the 1st and 2nd DMA channel, the DMA address must advance
4 bytes while transferring 2 bytes each time.
So it looks to me like a cyclic interleaved DMA.

or am i missing anything?

>
> --
> ~Vinod
-Barry

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv5] DMAEngine: Define interleaved transfer request api
  2011-10-14 14:55               ` Barry Song
@ 2011-10-14 15:06                 ` Vinod Koul
  2011-10-14 15:38                   ` Barry Song
  2011-10-14 16:35                   ` Jassi Brar
  0 siblings, 2 replies; 131+ messages in thread
From: Vinod Koul @ 2011-10-14 15:06 UTC (permalink / raw)
  To: Barry Song
  Cc: Jassi Brar, linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

On Fri, 2011-10-14 at 22:55 +0800, Barry Song wrote:
> Hi Vinod,
> 
> 2011/10/14 Vinod Koul <vinod.koul@intel.com>:
> > On Fri, 2011-10-14 at 17:21 +0530, Jassi Brar wrote:
> >> On 14 October 2011 13:02, Barry Song <21cnbao@gmail.com> wrote:
> >> >
> >> > what if i want a cyclic interleaved transfer? i think the cyclic
> >> > interleaved transfer is what i want for audio dma.
> >> >
> >> ... we need to restore 'bool frm_irq' and add new 'bool cyclic' that
> >> would replay the transfer(i.e, reset dma-pointers to src_start & dst_start)
> >> after 'numf' frames have been transferred.
> > I was thinking more on lines to have this conveyed thru a flag.
> >
> > Anyway I plan to work on merging device_prep_slave_sg and
> > device_prep_cyclic to single API. Think more of device_prep_cyclic as
> > special case with sg length one and flag to tell dmac its cyclic.
> >
> > Similarly here we could use/define this flag to say this transfer is
> > also cyclic in nature and dmac then reloads the list again.
> > That way any prep can be made cyclic in nature by just using this flag.
> >
> > @Barry: Why would you need to use interleaved API for audio?
> 
> At first, audio dma is typically cyclic. if no underflow and overflow
> happens, it will always be running. underflow and overflow will
> trigger the cyclic dma termination.
Yes, and you terminate the DMA. Then prepare will be called again, you
set up the DMA again, and on the start trigger you start the DMA again.
No need for interleaving in this case.
> 
> in case audio PCM data is interleaved and saved in dma buffer  as below:
> 
> left (2B)  right (2B)  left (2B)  right (2B)   left (2B)  right (2B)
> left (2B)  right (2B)
> left (2B)  right (2B)  left (2B)  right (2B)   left (2B)  right (2B)
> left (2B)  right (2B)
> left (2B)  right (2B)  left (2B)  right (2B)   left (2B)  right (2B)
> left (2B)  right (2B)
> left (2B)  right (2B)  left (2B)  right (2B)   left (2B)  right (2B)
> left (2B)  right (2B)
> ,,,,
> 
> and some hardwares need two seperate dma channels to tranfer left and
> right audio channel.
> 
> For 1st and 2nd dma channel, they want dma address increases 4bytes
> and transfer 2bytes every line.
> so it looks to me like a cyclic interleaved dma.
Hmmm, do we have sound cards which use this?
Nevertheless, for this kind of transfer we would need interleaved cyclic
DMA as well. Do you have such a usage? Can you tell me which codec
requires this?


-- 
~Vinod


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv5] DMAEngine: Define interleaved transfer request api
  2011-10-13  7:03       ` [PATCHv5] " Jassi Brar
  2011-10-14  7:32         ` Barry Song
@ 2011-10-14 15:16         ` Vinod Koul
  2011-10-14 15:50           ` Barry Song
  2011-10-16 11:16           ` Jassi Brar
  1 sibling, 2 replies; 131+ messages in thread
From: Vinod Koul @ 2011-10-14 15:16 UTC (permalink / raw)
  To: Jassi Brar; +Cc: linux-kernel, dan.j.williams, rmk, 21cnbao

On Thu, 2011-10-13 at 12:33 +0530, Jassi Brar wrote:
> Define a new api that could be used for doing fancy data transfers
> like interleaved to contiguous copy and vice-versa.
> Traditional SG_list based transfers tend to be very inefficient in
> such cases as where the interleave and chunk are only a few bytes,
> which call for a very condensed api to convey pattern of the transfer.
> This api supports all 4 variants of scatter-gather and contiguous transfer.
> 
> Of course, neither can this api help transfers that don't lend to DMA by
> nature, i.e, scattered tiny read/writes with no periodic pattern.
> 
> Also since now we support SLAVE channels that might not provide
> device_prep_slave_sg callback but device_prep_interleaved_dma,
> remove the BUG_ON check.
> 
> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
> ---
> 
> Dan,
>  After waiting for a reply from you/Alexandre a couple of days I am revising
>  my patch without adding a hook for RapidIO case for 2 reasons:
>   a) I think the RapidIO requirement is served better by the concept of
>      virtual channels, rather than hacking "struct dma_chan" to reach more
>      than one device.
>   b) If Alexandre comes up with something irresistible, we can always add
>      the hook later.
> 
> Changes since v4:
> 1) Dropped the 'frm_irq' member.
> 2) Renamed 'xfer_direction' to 'dma_transfer_direction'
> 
> Changes since v3:
> 1) Added explicit type for source and destination.
> 
> Changes since v2:
> 1) Added some notes to documentation.
> 2) Removed the BUG_ON check that expects every SLAVE channel to
>    provide a prep_slave_sg, as we are now valid otherwise too.
> 3) Fixed the DMA_TX_TYPE_END offset - made it last element of enum.
> 4) Renamed prep_dma_genxfer to prep_interleaved_dma as Vinod wanted.
> 
> Changes since v1:
> 1) Dropped the 'dma_transaction_type' member until we really
>    merge another type into this api. Instead added special
>    type for this api - DMA_GENXFER in dma_transaction_type.
> 2) Renamed 'xfer_template' to 'dmaxfer_template' in order to
> 
>  Documentation/dmaengine.txt |    8 ++++
>  drivers/dma/dmaengine.c     |    4 +-
>  include/linux/dmaengine.h   |   82 +++++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 90 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/dmaengine.txt b/Documentation/dmaengine.txt
> index 94b7e0f..962a2d3 100644
> --- a/Documentation/dmaengine.txt
> +++ b/Documentation/dmaengine.txt
> @@ -75,6 +75,10 @@ The slave DMA usage consists of following steps:
>     slave_sg	- DMA a list of scatter gather buffers from/to a peripheral
>     dma_cyclic	- Perform a cyclic DMA operation from/to a peripheral till the
>  		  operation is explicitly stopped.
> +   interleaved_dma - This is common to Slave as well as M2M clients. For slave
> +		 channels, the address of the device's fifo may already be known
> +		 to the driver. Various types of operations can be expressed by
> +		 setting the appropriate values in the 'dmaxfer_template' members.
>  
>     A non-NULL return of this transfer API represents a "descriptor" for
>     the given transaction.
> @@ -89,6 +93,10 @@ The slave DMA usage consists of following steps:
>  		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
>  		size_t period_len, enum dma_data_direction direction);
>  
> +	struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
> +		struct dma_chan *chan, struct dmaxfer_template *xt,
> +		unsigned long flags);
> +
>     The peripheral driver is expected to have mapped the scatterlist for
>     the DMA operation prior to calling device_prep_slave_sg, and must
>     keep the scatterlist mapped until the DMA operation has completed.
> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> index b48967b..a6c6051 100644
> --- a/drivers/dma/dmaengine.c
> +++ b/drivers/dma/dmaengine.c
> @@ -693,12 +693,12 @@ int dma_async_device_register(struct dma_device *device)
>  		!device->device_prep_dma_interrupt);
>  	BUG_ON(dma_has_cap(DMA_SG, device->cap_mask) &&
>  		!device->device_prep_dma_sg);
> -	BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
> -		!device->device_prep_slave_sg);
>  	BUG_ON(dma_has_cap(DMA_CYCLIC, device->cap_mask) &&
>  		!device->device_prep_dma_cyclic);
>  	BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
>  		!device->device_control);
> +	BUG_ON(dma_has_cap(DMA_INTERLEAVE, device->cap_mask) &&
> +		!device->device_prep_interleaved_dma);
>  
>  	BUG_ON(!device->device_alloc_chan_resources);
>  	BUG_ON(!device->device_free_chan_resources);
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index 8fbf40e..ce8c40a 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -71,11 +71,85 @@ enum dma_transaction_type {
>  	DMA_ASYNC_TX,
>  	DMA_SLAVE,
>  	DMA_CYCLIC,
> +	DMA_INTERLEAVE,
> +/* last transaction type for creation of the capabilities mask */
> +	DMA_TX_TYPE_END,
>  };
>  
> -/* last transaction type for creation of the capabilities mask */
> -#define DMA_TX_TYPE_END (DMA_CYCLIC + 1)
> +enum dma_transfer_direction {
> +	MEM_TO_MEM,
> +	MEM_TO_DEV,
> +	DEV_TO_MEM,
> +	DEV_TO_DEV,
> +};
> +
> +/**
> + * Interleaved Transfer Request
> + * ----------------------------
> + * A chunk is a collection of contiguous bytes to be transferred.
> + * The gap (in bytes) between two chunks is called the inter-chunk-gap (ICG).
> + * ICGs may or may not change between chunks.
> + * A FRAME is the smallest series of contiguous {chunk,icg} pairs that,
> + *  when repeated an integral number of times, specifies the transfer.
> + * A transfer template is a specification of a Frame, the number of times
> + *  it is to be repeated, and other per-transfer attributes.
> + *
> + * Practically, a client driver would have ready a template for each
> + *  type of transfer it is going to need during its lifetime and
> + *  set only 'src_start' and 'dst_start' before submitting the requests.
> + *
> + *
> + *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
> + *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
> + *
> + *    ==  Chunk size
> + *    ... ICG
> + */
>  
> +/**
> + * struct data_chunk - Element of scatter-gather list that makes a frame.
> + * @size: Number of bytes to read from the source.
> + *	  size_dst := fn(op, size_src), so it doesn't mean much for the destination.
> + * @icg: Number of bytes to jump after the last src/dst address of this
> + *	 chunk and before the first src/dst address of the next chunk.
> + *	 Ignored for dst (assumed 0), if dst_inc is true and dst_sgl is false.
> + *	 Ignored for src (assumed 0), if src_inc is true and src_sgl is false.
> + */
> +struct data_chunk {
> +	size_t size;
> +	size_t icg;
> +};
> +
> +/**
> + * struct dmaxfer_template - Template to convey DMAC the transfer pattern
> + *	 and attributes.
> + * @src_start: Bus address of source for the first chunk.
> + * @dst_start: Bus address of destination for the first chunk.
> + * @dir: Specifies the type of Source and Destination.
> + * @src_inc: If the source address increments after reading from it.
> + * @dst_inc: If the destination address increments after writing to it.
> + * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
> + *		Otherwise, source is read contiguously (icg ignored).
> + *		Ignored if src_inc is false.
> + * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
> + *		Otherwise, destination is filled contiguously (icg ignored).
> + *		Ignored if dst_inc is false.
> + * @numf: Number of frames in this template.
> + * @frame_size: Number of chunks in a frame i.e, size of sgl[].
> + * @sgl: Array of {chunk,icg} pairs that make up a frame.
> + */
> +struct dmaxfer_template {
> +	dma_addr_t src_start;
> +	dma_addr_t dst_start;
> +	enum dma_transfer_direction dir;
> +	bool src_inc;
> +	bool dst_inc;
> +	bool src_sgl;
> +	bool dst_sgl;
> +	size_t numf;
> +	size_t frame_size;
> +	struct data_chunk sgl[0];
> +};
>  
>  /**
>   * enum dma_ctrl_flags - DMA flags to augment operation preparation,
> @@ -432,6 +506,7 @@ struct dma_tx_state {
>   * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
>   *	The function takes a buffer of size buf_len. The callback function will
>   *	be called after period_len bytes have been transferred.
> + * @device_prep_interleaved_dma: Prepare a transfer described by a dmaxfer_template.
>   * @device_control: manipulate all pending operations on a channel, returns
>   *	zero or error code
>   * @device_tx_status: poll for transaction completion, the optional
> @@ -496,6 +571,9 @@ struct dma_device {
>  	struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
>  		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
>  		size_t period_len, enum dma_data_direction direction);
> +	struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
> +		struct dma_chan *chan, struct dmaxfer_template *xt,
> +		unsigned long flags);
>  	int (*device_control)(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
>  		unsigned long arg);
>  
IMO this looks decent now; we can take this for merge if we don't have
any other issues.
Ideally it would be great to also see the usage of this API... Barry?
I am okay to host this on a branch meanwhile.

Just a minor nitpick: I would have really liked dmaxfer_template to be
named dma_interleaved_template. I think we are still quite far from a
generic transfer template. Jassi, if you agree I can fix that up while
applying; no need to revise for a nitpick :)
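To make the template semantics concrete, here is a minimal sketch in
self-contained C. The structs below are simplified local mirrors of the
proposed types (fixed-size sgl[], without the bus addresses and the
inc/sgl flags), not the kernel definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified local mirror of the proposed types; the real
 * dmaxfer_template also carries src/dst bus addresses, direction
 * and the inc/sgl flags, and ends in a flexible sgl[] array. */
struct data_chunk {
	size_t size;	/* contiguous bytes per chunk */
	size_t icg;	/* gap before the next chunk */
};

struct ileave_tmpl {
	size_t numf;		/* frames in the transfer */
	size_t frame_size;	/* chunks per frame */
	struct data_chunk sgl[2];
};

/* Payload actually transferred: numf frames of summed chunk sizes. */
static size_t xfer_bytes(const struct ileave_tmpl *xt)
{
	size_t c, frame = 0;

	for (c = 0; c < xt->frame_size; c++)
		frame += xt->sgl[c].size;
	return xt->numf * frame;
}

/* Address distance advanced on a side where icg applies (e.g. the
 * source when src_inc and src_sgl are both true), trailing gaps
 * included. */
static size_t addr_span(const struct ileave_tmpl *xt)
{
	size_t c, frame = 0;

	for (c = 0; c < xt->frame_size; c++)
		frame += xt->sgl[c].size + xt->sgl[c].icg;
	return xt->numf * frame;
}
```

With numf = 4, frame_size = 2 and sgl = {{8,4},{2,6}}, this gives 40
payload bytes moved while the scattered side advances over an 80-byte
span.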

-- 
~Vinod



* Re: [PATCHv5] DMAEngine: Define interleaved transfer request api
  2011-10-14 15:06                 ` Vinod Koul
@ 2011-10-14 15:38                   ` Barry Song
  2011-10-14 16:09                     ` Vinod Koul
  2011-10-14 16:35                   ` Jassi Brar
  1 sibling, 1 reply; 131+ messages in thread
From: Barry Song @ 2011-10-14 15:38 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Jassi Brar, linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

2011/10/14 Vinod Koul <vinod.koul@intel.com>:
> On Fri, 2011-10-14 at 22:55 +0800, Barry Song wrote:
>> Hi Vinod,
>>
>> 2011/10/14 Vinod Koul <vinod.koul@intel.com>:
>> > On Fri, 2011-10-14 at 17:21 +0530, Jassi Brar wrote:
>> >> On 14 October 2011 13:02, Barry Song <21cnbao@gmail.com> wrote:
>> >> >
>> >> > what if i want a cyclic interleaved transfer? i think the cyclic
>> >> > interleaved transfer is what i want for audio dma.
>> >> >
>> >> ... we need to restore 'bool frm_irq' and add new 'bool cyclic' that
>> >> would replay the transfer(i.e, reset dma-pointers to src_start & dst_start)
>> >> after 'numf' frames have been transferred.
>> > I was thinking more on lines to have this conveyed thru a flag.
>> >
>> > Anyway I plan to work on merging device_prep_slave_sg and
>> > device_prep_cyclic to single API. Think more of device_prep_cyclic as
>> > special case with sg length one and flag to tell dmac its cyclic.
>> >
>> > Similarly here we could use/define this flag to say this transfer is
>> > also cyclic in nature and dmac then reloads the list again.
>> > That way any prep can be made cyclic in nature by just using this flag.
>> >
>> > @Barry: Why would you need to use interleaved API for audio?
>>
>> At first, audio dma is typically cyclic. if no underflow and overflow
>> happens, it will always be running. underflow and overflow will
>> trigger the cyclic dma termination.
> Yes and you terminate dma. Then prepare will be called again and you
> setup DMA again and on trigger start you start DMA again.
> No need for interleaving in this case.
>>
>> in case audio PCM data is interleaved and saved in dma buffer  as below:
>>
>> left (2B)  right (2B)  left (2B)  right (2B)   left (2B)  right (2B)
>> left (2B)  right (2B)
>> left (2B)  right (2B)  left (2B)  right (2B)   left (2B)  right (2B)
>> left (2B)  right (2B)
>> left (2B)  right (2B)  left (2B)  right (2B)   left (2B)  right (2B)
>> left (2B)  right (2B)
>> left (2B)  right (2B)  left (2B)  right (2B)   left (2B)  right (2B)
>> left (2B)  right (2B)
>> ,,,,
>>
>> and some hardwares need two seperate dma channels to tranfer left and
>> right audio channel.
>>
>> For 1st and 2nd dma channel, they want dma address increases 4bytes
>> and transfer 2bytes every line.
>> so it looks to me like a cyclic interleaved dma.
> Hmmm, do we have sound cards which use this?
> Nevertheless for this kind of transfers we would need interleaved cyclic
> DMA as well, Do you have such usage? Can you tell me which codec
> requires this?

Vinod, actually it is not decided by the codec; it is only related to
the hardware of the PCM, I2S or AC'97 controllers in the SoC. I do
remember some people once licensed a TDM transfer engine from Synopsys,
and the IC guys bound one TDM slot, which has a separate DMA channel,
to one audio channel, then organized several TDM slots into an AC'97
controller.

I am on holiday now and can't give you more information until next week.

Still, I have seen most other chips bind the multiple channels of an
I2S/PCM/AC97 audio controller to one DMA channel; then they don't need
an interleaved DMA API, as the DMA address increases continuously. DMA
transfers left+right together, not one left and one right.
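To pin down the two-channel case above: for interleaved 16-bit stereo,
the left-channel DMA would use one chunk per frame of size 2 with icg 2.
A self-contained sketch of how such a template is walked (the structs
are simplified mirrors of the proposed types, not kernel code, and only
the scattered-read/contiguous-write case is modeled):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified mirror of the proposed types; the real template also
 * carries bus addresses, direction and the inc/sgl flags. */
struct data_chunk {
	size_t size;
	size_t icg;
};

struct ileave_tmpl {
	size_t numf;
	size_t frame_size;
	struct data_chunk sgl[1];
};

/* Model the scattered-read / contiguous-write case (src_sgl = true,
 * dst_inc = true, dst_sgl = false); returns bytes written to dst. */
static size_t ileave_walk(const struct ileave_tmpl *xt,
			  const unsigned char *src, unsigned char *dst)
{
	size_t f, c, done = 0;

	for (f = 0; f < xt->numf; f++)
		for (c = 0; c < xt->frame_size; c++) {
			memcpy(dst + done, src, xt->sgl[c].size);
			done += xt->sgl[c].size;
			src += xt->sgl[c].size + xt->sgl[c].icg;
		}
	return done;
}
```

The right channel would use the same template with src_start advanced
by 2 bytes.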

>
>
> --
> ~Vinod

-barry


* Re: [PATCHv5] DMAEngine: Define interleaved transfer request api
  2011-10-14 15:16         ` Vinod Koul
@ 2011-10-14 15:50           ` Barry Song
  2011-10-16 11:16           ` Jassi Brar
  1 sibling, 0 replies; 131+ messages in thread
From: Barry Song @ 2011-10-14 15:50 UTC (permalink / raw)
  To: Vinod Koul; +Cc: Jassi Brar, linux-kernel, dan.j.williams, rmk

2011/10/14 Vinod Koul <vinod.koul@intel.com>:
> On Thu, 2011-10-13 at 12:33 +0530, Jassi Brar wrote:
>> [ v5 patch and changelog quoted in full -- identical to the posting
>> above; trimmed ]
> IMO this looks decent now, we can take this for merge if we don't have
> any other issues.
> Ideally would be great if we also need to see the usage for this
> API .... Barry?. I am okay to host this up on a branch meanwhile.

Yes, I am glad to see this merged.

We'll rebase the CSR Prima2 DMA driver onto your dma_transfer_direction
patch, use this new API, fix the other commented issues, and send v3.

>
> Just a minor nitpick, I would have really like dmaxfer_template to be
> named dma_interleaved_template. I think we are still quite far from
> generic transfer template. Jassi if you agree I can fix that up while
> applying, no need to revise for nitpick :)

A generic template would be something covering all the current
"non-generic" modes; it seems this one doesn't, for the moment.
So dma_interleaved_template is better, at least for now.

And when you apply this patch, you might add:

Acked-by: Barry Song <Baohua.Song@csr.com>

>
> --
> ~Vinod

-barry


* Re: [PATCHv5] DMAEngine: Define interleaved transfer request api
  2011-10-14 15:38                   ` Barry Song
@ 2011-10-14 16:09                     ` Vinod Koul
  0 siblings, 0 replies; 131+ messages in thread
From: Vinod Koul @ 2011-10-14 16:09 UTC (permalink / raw)
  To: Barry Song
  Cc: Jassi Brar, linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

On Fri, 2011-10-14 at 23:38 +0800, Barry Song wrote:
> Vinod, actually it is not decided by codec. it is only related with
> the hardware of pcm, i2s or AC'97 controllers in SoC. i did remember
> some people  once licensed a TDM tranfer enginee from synopsys and IC
> guys bound one TDM slot, which has a seperate  dma channel, to one
> audio channel then organize several TDM slots into a AC97 controller.
Not sure if I follow you.
For I2S and PCM you send the left/right channel and then the right/left
channel. Data is sent interleaved, and DMA also treats it the same way.
I don't know about AC97, so I won't comment on it.
> 
> i am on holiday now and i can't give you more information until next
> week.
> 
> Still i saw most other chips binding multi-channels of an I2S/PCM/AC97
> audio controller in a dma channel, then it doesn't need interleaved
> dma api as its dma address will increase continuously. dma will
> transfer left+right together but not one left and one right.
Right :)

-- 
~Vinod



* Re: [PATCHv5] DMAEngine: Define interleaved transfer request api
  2011-10-14 15:06                 ` Vinod Koul
  2011-10-14 15:38                   ` Barry Song
@ 2011-10-14 16:35                   ` Jassi Brar
  2011-10-14 17:04                     ` Vinod Koul
  1 sibling, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-10-14 16:35 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Barry Song, linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

On 14 October 2011 20:36, Vinod Koul <vinod.koul@intel.com> wrote:
>> 2011/10/14 Vinod Koul <vinod.koul@intel.com>:
>> > On Fri, 2011-10-14 at 17:21 +0530, Jassi Brar wrote:
>> >> On 14 October 2011 13:02, Barry Song <21cnbao@gmail.com> wrote:
>> >> >
>> >> > what if i want a cyclic interleaved transfer? i think the cyclic
>> >> > interleaved transfer is what i want for audio dma.
>> >> >
>> >> ... we need to restore 'bool frm_irq' and add new 'bool cyclic' that
>> >> would replay the transfer(i.e, reset dma-pointers to src_start & dst_start)
>> >> after 'numf' frames have been transferred.
>> > I was thinking more on lines to have this conveyed thru a flag.
>> >
>> > Anyway I plan to work on merging device_prep_slave_sg and
>> > device_prep_cyclic to single API. Think more of device_prep_cyclic as
>> > special case with sg length one and flag to tell dmac its cyclic.
>> >
>> > Similarly here we could use/define this flag to say this transfer is
>> > also cyclic in nature and dmac then reloads the list again.
>> > That way any prep can be made cyclic in nature by just using this flag.
>> >
>> > @Barry: Why would you need to use interleaved API for audio?
>>
>> At first, audio dma is typically cyclic. if no underflow and overflow
>> happens, it will always be running. underflow and overflow will
>> trigger the cyclic dma termination.
> Yes and you terminate dma. Then prepare will be called again and you
> setup DMA again and on trigger start you start DMA again.
> No need for interleaving in this case.
>>
>> in case audio PCM data is interleaved and saved in dma buffer  as below:
>>
>> left (2B)  right (2B)  left (2B)  right (2B)   left (2B)  right (2B)
>> left (2B)  right (2B)
>> left (2B)  right (2B)  left (2B)  right (2B)   left (2B)  right (2B)
>> left (2B)  right (2B)
>> left (2B)  right (2B)  left (2B)  right (2B)   left (2B)  right (2B)
>> left (2B)  right (2B)
>> left (2B)  right (2B)  left (2B)  right (2B)   left (2B)  right (2B)
>> left (2B)  right (2B)
>> ,,,,
>>
>> and some hardwares need two seperate dma channels to tranfer left and
>> right audio channel.
>>
>> For 1st and 2nd dma channel, they want dma address increases 4bytes
>> and transfer 2bytes every line.
>> so it looks to me like a cyclic interleaved dma.
> Hmmm, do we have sound cards which use this?
> Nevertheless for this kind of transfers we would need interleaved cyclic
> DMA as well, Do you have such usage? Can you tell me which codec
> requires this?
>
My proposed 'frm_irq' and 'cyclic' flags are for such requirements.

Consider a 5.1ch I2S controller that employs 3 DMA channels, each
transferring 2 audio channels to one of three FIFOs, where the
pcm-dma driver supports SNDRV_PCM_INFO_INTERLEAVED.
While I have only worked with simple single-FIFO 5.1ch I2S controllers,
I don't see why such 3-FIFO controllers can't exist.
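To make the 3-FIFO case concrete: with interleaved 16-bit 5.1 audio,
each 12-byte sample group holds three stereo pairs, so DMA channel i
would start at byte offset 4*i and use one chunk of size 4 with icg 8.
A self-contained sketch (a hypothetical helper, not kernel code)
checking that the three patterns tile the buffer exactly:

```c
#include <assert.h>
#include <stddef.h>

/* Count, in 'map', how many times a {size, icg} pattern starting at
 * byte 'start' touches each byte over 'numf' frames. */
static void mark_pattern(unsigned char *map, size_t start,
			 size_t size, size_t icg, size_t numf)
{
	size_t f, b, pos = start;

	for (f = 0; f < numf; f++) {
		for (b = 0; b < size; b++)
			map[pos + b]++;
		pos += size + icg;
	}
}
```

With three channels starting at offsets 0, 4 and 8, each using
{size = 4, icg = 8}, every byte of the interleaved buffer is covered
exactly once.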


* Re: [PATCHv5] DMAEngine: Define interleaved transfer request api
  2011-10-14 16:35                   ` Jassi Brar
@ 2011-10-14 17:04                     ` Vinod Koul
  2011-10-14 17:59                       ` Jassi Brar
  0 siblings, 1 reply; 131+ messages in thread
From: Vinod Koul @ 2011-10-14 17:04 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Barry Song, linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

On Fri, 2011-10-14 at 22:05 +0530, Jassi Brar wrote:
> >> and some hardwares need two seperate dma channels to tranfer left
> and
> >> right audio channel.
> >>
> >> For 1st and 2nd dma channel, they want dma address increases 4bytes
> >> and transfer 2bytes every line.
> >> so it looks to me like a cyclic interleaved dma.
> > Hmmm, do we have sound cards which use this?
> > Nevertheless for this kind of transfers we would need interleaved
> cyclic
> > DMA as well, Do you have such usage? Can you tell me which codec
> > requires this?
> >
> My proposed 'frm_irq' and 'cyclic' flags are for such requirements.
> 
> Consider a 5.1chan I2S controller that employs 3 dma-channels
> each transferring 2 audio-channels to 3 three FIFOs. And the
> pcm-dma driver supports SNDRV_PCM_INFO_INTERLEAVED.
> While I worked with simple single fifo 5.1chan I2S controllers, I
> don't
> think such 3-fifo controllers can't exist. 
I am not against cyclic, and yes, 3 FIFOs can exist, but one would
question why we need 3 controllers, 3 sets of ports and pins, and the
associated analog stuff when one would do :)

Nevertheless, cyclic should be supported by all dmaengine APIs (without
adding new APIs for cyclic), and in a consistent manner, by having a
generic cyclic flag and capability.

-- 
~Vinod



* RE: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-11 16:44                                   ` Williams, Dan J
  2011-10-11 18:42                                     ` Jassi Brar
@ 2011-10-14 17:50                                     ` Bounine, Alexandre
  2011-10-14 18:36                                       ` Jassi Brar
  1 sibling, 1 reply; 131+ messages in thread
From: Bounine, Alexandre @ 2011-10-14 17:50 UTC (permalink / raw)
  To: Williams, Dan J, Jassi Brar
  Cc: Vinod Koul, Russell King, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On Tue, Oct 11, 2011 at 12:45 PM, Williams, Dan J
<dan.j.williams@intel.com> wrote:

> Subject: Re: [PATCHv4] DMAEngine: Define interleaved transfer request
> api
> 
> [ Adding Alexandre ]
> 
> On Fri, Oct 7, 2011 at 4:27 AM, Jassi Brar
<jaswinder.singh@linaro.org>
> wrote:
> > On 7 October 2011 11:15, Vinod Koul <vinod.koul@intel.com> wrote:
> >
> >> Thru this patch Jassi gave a very good try at merging DMA_SLAVE and
> >> memcpy, but more we debate this, I am still not convinced about
> merging
> >> memcpy and DMA_SLAVE yet.
> >>
> > Nobody is merging memcpy and DMA_SLAVE right away.
> > The api's primary purpose is to support interleave transfers.
> > Possibility to merge other prepares into this is a side-effect.
> >
> >> I would still argue that if we split this on same lines as current
> >> mechanism, we have clean way to convey all details for both cases.
> >>
> > Do you mean to have separate interleaved transfer apis for Slave
> > and Mem->Mem ? Please clarify.
> >
> 
> This is a tangent, but it would be nice if this API extension also
> covered the needs of the incoming RapidIO case which wants to specify
> new device context information per operation (and not once at
> configuration time, like slave case).  Would it be enough if the
> transfer template included a (struct device *context) member at the
> end?  Most dma users could ignore it, but RapidIO could use it to do
> something like:
> 
>    struct rio_dev *rdev = container_of(context, typeof(*rdev),
device);
> 
> That might not be enough, but I'm concerned that making the context a
> (void *) is too flexible.  I'd rather have something like this than
> acquiring a lock in rio_dma_prep_slave_sg() and holding it over
> ->prep().  The alternative is to extend device_prep_slave_sg to take
> an extra parameter, but that impacts all other slave implementations
> with a dead parameter.
> 

Having the context limited to the device structure will not be enough for
RapidIO because of its 66-bit target address (dma_addr_t will not work
here).
That range is probably out of practical use at this moment, but it is
defined by the RIO specification and I would prefer to deal with it now
instead of postponing it for the future. Passing the context using a
(void *) will solve this.

Alex.


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv5] DMAEngine: Define interleaved transfer request api
  2011-10-14 17:04                     ` Vinod Koul
@ 2011-10-14 17:59                       ` Jassi Brar
  2011-10-15 17:11                         ` Vinod Koul
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-10-14 17:59 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Barry Song, linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

On 14 October 2011 22:34, Vinod Koul <vinod.koul@intel.com> wrote:
> On Fri, 2011-10-14 at 22:05 +0530, Jassi Brar wrote:
>> >> and some hardware needs two separate dma channels to transfer the
>> >> left and right audio channels.
>> >>
>> >> For the 1st and 2nd dma channels, they want the dma address to
>> >> increase by 4 bytes and transfer 2 bytes every line.
>> >> So it looks to me like a cyclic interleaved dma.
>> > Hmmm, do we have sound cards which use this?
>> > Nevertheless for this kind of transfers we would need interleaved
>> cyclic
>> > DMA as well, Do you have such usage? Can you tell me which codec
>> > requires this?
>> >
>> My proposed 'frm_irq' and 'cyclic' flags are for such requirements.
>>
>> Consider a 5.1chan I2S controller that employs 3 dma-channels
>> each transferring 2 audio-channels to 3 three FIFOs. And the
>> pcm-dma driver supports SNDRV_PCM_INFO_INTERLEAVED.
>> While I have only worked with simple single-fifo 5.1chan I2S
>> controllers, I don't see why such 3-fifo controllers can't exist.
> I am not against cyclic, and yes 3 fifos can exist but one would
> question why we need 3 controller, 3 sets of ports and pins and
> associated analog stuff when one can do with one :)
>
No, 1 controller with 3 fifos. And that doesn't necessarily mean a
pin count higher than that of a controller with a single fifo.
Anyway, if you accept interleaved Slave operation you should
not need such specific examples.
Cyclic transfers are not necessary for audio, but are certainly the
preferred mechanism when supported.

> Nevertheless, cyclic should be supported for all dmaengine APIs (but not
> adding new APIs for cyclic) and in consistent manner by having a generic
> cyclic flag and caps.
>
You are already merging Cyclic into Slave_sg. And Slave_sg could be
merged into this api. So overall we would get rid of one parameter passed
outside of the parameter structure - dmaxfer_template. (Yes, I want to
reduce dependence on the 'flags' argument as much as possible.)

^ permalink raw reply	[flat|nested] 131+ messages in thread

* RE: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-11 18:42                                     ` Jassi Brar
@ 2011-10-14 18:11                                       ` Bounine, Alexandre
  0 siblings, 0 replies; 131+ messages in thread
From: Bounine, Alexandre @ 2011-10-14 18:11 UTC (permalink / raw)
  To: Jassi Brar, Williams, Dan J
  Cc: Vinod Koul, Russell King, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On Tue, Oct 11, 2011 at 2:42 PM, Jassi Brar <jaswinder.singh@linaro.org> wrote:

> 
> On 11 October 2011 22:14, Williams, Dan J <dan.j.williams@intel.com>
> wrote:
> > [ Adding Alexandre ]
> >
> > This is a tangent, but it would be nice if this API extension also
> > covered the needs of the incoming RapidIO case which wants to specify
> > new device context information per operation (and not once at
> > configuration time, like slave case).  Would it be enough if the
> > transfer template included a (struct device *context) member at the
> > end?  Most dma users could ignore it, but RapidIO could use it to do
> > something like:
> >
> >   struct rio_dev *rdev = container_of(context, typeof(*rdev),
> device);
> >
> > That might not be enough, but I'm concerned that making the context a
> > (void *) is too flexible.  I'd rather have something like this than
> > acquiring a lock in rio_dma_prep_slave_sg() and holding it over
> > ->prep().  The alternative is to extend device_prep_slave_sg to take
> > an extra parameter, but that impacts all other slave implementations
> > with a dead parameter.
> >
> From what I read so far, the requirement is closer to prep_slave_sg
> than to this api.
Yes, it is the closest fit so far but with one deficiency - it does not
give me a (natural) way to pass target device parameters for every
transaction that should be initiated.

> 
> IMO, there should be a virtual channel for each device that the real
> physical channel, at the backend, can transfer data to/from.
> 
> The client driver should request each virtual channel corresponding
> to each target device it wants to transfer data with.
> 
> In the dmac driver - transfers queued for all virtual channels that are
> backed by the same physical channel, could be added to the same
> list and executed in FIFO manner.
> 
> That way, there won't be any need to hook target device info per
> transfer
> and more importantly "struct dma_chan" would continue to mean
> link between fixed 'endpoints'.
Passing a 66-bit RIO address will require an extra parameter anyway.
This brings us back to the problem that I have with the physical slave channel.




^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-14 17:50                                     ` Bounine, Alexandre
@ 2011-10-14 18:36                                       ` Jassi Brar
  2011-10-14 19:15                                         ` Bounine, Alexandre
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-10-14 18:36 UTC (permalink / raw)
  To: Bounine, Alexandre
  Cc: Williams, Dan J, Vinod Koul, Russell King, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On 14 October 2011 23:20, Bounine, Alexandre <Alexandre.Bounine@idt.com> wrote:
>> On Fri, Oct 7, 2011 at 4:27 AM, Jassi Brar
> <jaswinder.singh@linaro.org>
>> wrote:
>> > On 7 October 2011 11:15, Vinod Koul <vinod.koul@intel.com> wrote:
>> >
>> >> Thru this patch Jassi gave a very good try at merging DMA_SLAVE and
>> >> memcpy, but more we debate this, I am still not convinced about
>> merging
>> >> memcpy and DMA_SLAVE yet.
>> >>
>> > Nobody is merging memcpy and DMA_SLAVE right away.
>> > The api's primary purpose is to support interleave transfers.
>> > Possibility to merge other prepares into this is a side-effect.
>> >
>> >> I would still argue that if we split this on same lines as current
>> >> mechanism, we have clean way to convey all details for both cases.
>> >>
>> > Do you mean to have separate interleaved transfer apis for Slave
>> > and Mem->Mem ? Please clarify.
>> >
>>
>> This is a tangent, but it would be nice if this API extension also
>> covered the needs of the incoming RapidIO case which wants to specify
>> new device context information per operation (and not once at
>> configuration time, like slave case).  Would it be enough if the
>> transfer template included a (struct device *context) member at the
>> end?  Most dma users could ignore it, but RapidIO could use it to do
>> something like:
>>
>>    struct rio_dev *rdev = container_of(context, typeof(*rdev),
> device);
>>
>> That might not be enough, but I'm concerned that making the context a
>> (void *) is too flexible.  I'd rather have something like this than
>> acquiring a lock in rio_dma_prep_slave_sg() and holding it over
>> ->prep().  The alternative is to extend device_prep_slave_sg to take
>> an extra parameter, but that impacts all other slave implementations
>> with a dead parameter.
>>
>
> Having context limited to the device structure will not be enough for
> RapidIO because of 66-bit target address (dma_addr_t will not work
> here).
> Probably that range is out of practical use at this moment but it is
> defined by RIO specification and I would prefer to deal with it now
> instead of postponing it for future. Passing context using (void *) will
> solve this.
>
OK, so you need a (void *) to contain all the info. Agreed.
But doesn't the info pointed to by this (void *) remain the same for
every transfer to a particular target/remote device?
If so, couldn't you stick this (void *) into the virtual channel's
'private' ? :D

^ permalink raw reply	[flat|nested] 131+ messages in thread

* RE: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-14 18:36                                       ` Jassi Brar
@ 2011-10-14 19:15                                         ` Bounine, Alexandre
  2011-10-15 11:25                                           ` Jassi Brar
  0 siblings, 1 reply; 131+ messages in thread
From: Bounine, Alexandre @ 2011-10-14 19:15 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Williams, Dan J, Vinod Koul, Russell King, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On Fri, Oct 14, 2011 at 2:37 PM, Jassi Brar <jaswinder.singh@linaro.org> wrote:
> 
> On 14 October 2011 23:20, Bounine, Alexandre
> <Alexandre.Bounine@idt.com> wrote:
> >> On Fri, Oct 7, 2011 at 4:27 AM, Jassi Brar
> > <jaswinder.singh@linaro.org>
> >> wrote:
> >> > On 7 October 2011 11:15, Vinod Koul <vinod.koul@intel.com> wrote:
> >> >
> >> >> Thru this patch Jassi gave a very good try at merging DMA_SLAVE
> and
> >> >> memcpy, but more we debate this, I am still not convinced about
> >> merging
> >> >> memcpy and DMA_SLAVE yet.
> >> >>
> >> > Nobody is merging memcpy and DMA_SLAVE right away.
> >> > The api's primary purpose is to support interleave transfers.
> >> > Possibility to merge other prepares into this is a side-effect.
> >> >
> >> >> I would still argue that if we split this on same lines as
> current
> >> >> mechanism, we have clean way to convey all details for both
> cases.
> >> >>
> >> > Do you mean to have separate interleaved transfer apis for Slave
> >> > and Mem->Mem ? Please clarify.
> >> >
> >>
> >> This is a tangent, but it would be nice if this API extension also
> >> covered the needs of the incoming RapidIO case which wants to
> specify
> >> new device context information per operation (and not once at
> >> configuration time, like slave case).  Would it be enough if the
> >> transfer template included a (struct device *context) member at the
> >> end?  Most dma users could ignore it, but RapidIO could use it to do
> >> something like:
> >>
> >>    struct rio_dev *rdev = container_of(context, typeof(*rdev),
> > device);
> >>
> >> That might not be enough, but I'm concerned that making the context
> a
> >> (void *) is too flexible.  I'd rather have something like this than
> >> acquiring a lock in rio_dma_prep_slave_sg() and holding it over
> >> ->prep().  The alternative is to extend device_prep_slave_sg to take
> >> an extra parameter, but that impacts all other slave implementations
> >> with a dead parameter.
> >>
> >
> > Having context limited to the device structure will not be enough for
> > RapidIO because of 66-bit target address (dma_addr_t will not work
> > here).
> > Probably that range is out of practical use at this moment but it is
> > defined by RIO specification and I would prefer to deal with it now
> > instead of postponing it for future. Passing context using (void *)
> will
> > solve this.
> >
> OK so you need a void* to contain all info. Agreed.
> But doesn't the info, pointed to by this (void *), remain same for
> every
> transfer to a particular target/remote device ?
No. An address within the target may (and most likely will) change for
every transfer. The target destination ID will be the same for a given
virtual channel.

> If so, couldn't you stick this (void *) to the virtual channel's
> 'private' ?  'private' :D

This is what I am trying to do for the physical channel ;).
A virtual channel may bring the same challenge, and I may need channel
locking if more than one requester tries to read/write data to the same
target RIO device.

Currently, I am leaning towards adopting Dan's idea of a subsystem-specific
prep_sg() routine which will be associated with the rio_mport device that
provides DMA support but keeps it registered as DMA_SLAVE. In this context
I am happy to see that your patch removes the BUG_ON check for DMA_SLAVE.

This also gives RapidIO a greater level of independence in dealing with
RIO transfer details.

I am sorry for my delayed replies - I was on vacation.

Alex.
   

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-14 19:15                                         ` Bounine, Alexandre
@ 2011-10-15 11:25                                           ` Jassi Brar
  2011-10-17 14:07                                             ` Bounine, Alexandre
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-10-15 11:25 UTC (permalink / raw)
  To: Bounine, Alexandre
  Cc: Williams, Dan J, Vinod Koul, Russell King, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On 15 October 2011 00:45, Bounine, Alexandre <Alexandre.Bounine@idt.com> wrote:

>> But doesn't the info, pointed to by this (void *), remain same for
>> every
>> transfer to a particular target/remote device ?
> No. An address within the target may (and most likely will) be changed for
> every transfer. Target destination ID will be the same for given virtual channel.
>
Thanks for the info.

> Virtual channel may bring the same challenge and I may need a channel locking
> if more than one requester try to read/write data to the same target RIO device.
>
One can't avoid taking care of locking, but using virtual channels keeps
the dma_chan usage consistent.

RapidIO supports 34(32+2), 50(48+2) and 66(64+2) bit addressing
which makes me wonder if the (upper or lower) 2 bits could be attached to
the identity of the target device ?
(tsi721 driver actually discards the upper 2 bits while claiming to support
66bit addressing so I couldn't make anything out of it and specs don't
seem to say much about it)

If there is no user of 66-bit addressing and none is coming in the very
near future, we might as well drop that case for now (tsi721 already does)
because that 'completeness' of support modifies the semantics of the
dmaengine apis today for no real use.

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv5] DMAEngine: Define interleaved transfer request api
  2011-10-14 17:59                       ` Jassi Brar
@ 2011-10-15 17:11                         ` Vinod Koul
  0 siblings, 0 replies; 131+ messages in thread
From: Vinod Koul @ 2011-10-15 17:11 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Barry Song, linux-kernel, dan.j.williams, rmk, DL-SHA-WorkGroupLinux

On Fri, 2011-10-14 at 23:29 +0530, Jassi Brar wrote:
> 
> > Nevertheless, cyclic should be supported for all dmaengine APIs (but not
> > adding new APIs for cyclic) and in consistent manner by having a generic
> > cyclic flag and caps.
> >
> You are already merging Cyclic into Slave_sg. And Slave_sg could be
> merged into this api. So overall we would get rid one parameter passed
> outside of the parameter structure - dmaxfer_template. (Yes I want to
> reduce dependence on 'flags' argument as much as possible). 
The flags parameter already exists and we are not using it properly in
slave cases.

-- 
~Vinod


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv5] DMAEngine: Define interleaved transfer request api
  2011-10-14 15:16         ` Vinod Koul
  2011-10-14 15:50           ` Barry Song
@ 2011-10-16 11:16           ` Jassi Brar
  2011-10-16 12:16             ` Vinod Koul
  1 sibling, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-10-16 11:16 UTC (permalink / raw)
  To: Vinod Koul; +Cc: linux-kernel, dan.j.williams, rmk, 21cnbao

On 14 October 2011 20:46, Vinod Koul <vinod.koul@intel.com> wrote:
> On Thu, 2011-10-13 at 12:33 +0530, Jassi Brar wrote:
>> Define a new api that could be used for doing fancy data transfers
>> like interleaved to contiguous copy and vice-versa.
>> Traditional SG_list based transfers tend to be very inefficient in
>> such cases as where the interleave and chunk are only a few bytes,
>> which call for a very condensed api to convey pattern of the transfer.
>> This api supports all 4 variants of scatter-gather and contiguous transfer.
>>
>> Of course, neither can this api help transfers that don't lend to DMA by
>> nature, i.e, scattered tiny read/writes with no periodic pattern.
>>
>> Also since now we support SLAVE channels that might not provide
>> device_prep_slave_sg callback but device_prep_interleaved_dma,
>> remove the BUG_ON check.
>>
>> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
>> ---
>>
>> Dan,
>>  After waiting for a reply from you/Alexandre a couple of days I am revising
>>  my patch without adding a hook for RapidIO case for 2 reasons:
>>   a) I think the RapidIO requirement is served better by the concept of
>>      virtual channels, rather than hacking "struct dma_chan" to reach more
>>      than one device.
>>   b) If Alexandre comes up with something irresistible, we can always add
>>      the hook later.
>>
>> Changes since v4:
>> 1) Dropped the 'frm_irq' member.
>> 2) Renamed 'xfer_direction' to 'dma_transfer_direction'
>>
>> Changes since v3:
>> 1) Added explicit type for source and destination.
>>
>> Changes since v2:
>> 1) Added some notes to documentation.
>> 2) Removed the BUG_ON check that expects every SLAVE channel to
>>    provide a prep_slave_sg, as we are now valid otherwise too.
>> 3) Fixed the DMA_TX_TYPE_END offset - made it last element of enum.
>> 4) Renamed prep_dma_genxfer to prep_interleaved_dma as Vinod wanted.
>>
>> Changes since v1:
>> 1) Dropped the 'dma_transaction_type' member until we really
>>    merge another type into this api. Instead added special
>>    type for this api - DMA_GENXFER in dma_transaction_type.
>> 2) Renamed 'xfer_template' to 'dmaxfer_template' inorder to
>>
>>  Documentation/dmaengine.txt |    8 ++++
>>  drivers/dma/dmaengine.c     |    4 +-
>>  include/linux/dmaengine.h   |   82 +++++++++++++++++++++++++++++++++++++++++-
>>  3 files changed, 90 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/dmaengine.txt b/Documentation/dmaengine.txt
>> index 94b7e0f..962a2d3 100644
>> --- a/Documentation/dmaengine.txt
>> +++ b/Documentation/dmaengine.txt
>> @@ -75,6 +75,10 @@ The slave DMA usage consists of following steps:
>>     slave_sg  - DMA a list of scatter gather buffers from/to a peripheral
>>     dma_cyclic        - Perform a cyclic DMA operation from/to a peripheral till the
>>                 operation is explicitly stopped.
>> +   interleaved_dma - This is common to Slave as well as M2M clients. For slave
>> +              address of devices' fifo could be already known to the driver.
>> +              Various types of operations could be expressed by setting
>> +              appropriate values to the 'dmaxfer_template' members.
>>
>>     A non-NULL return of this transfer API represents a "descriptor" for
>>     the given transaction.
>> @@ -89,6 +93,10 @@ The slave DMA usage consists of following steps:
>>               struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
>>               size_t period_len, enum dma_data_direction direction);
>>
>> +     struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
>> +             struct dma_chan *chan, struct dmaxfer_template *xt,
>> +             unsigned long flags);
>> +
>>     The peripheral driver is expected to have mapped the scatterlist for
>>     the DMA operation prior to calling device_prep_slave_sg, and must
>>     keep the scatterlist mapped until the DMA operation has completed.
>> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
>> index b48967b..a6c6051 100644
>> --- a/drivers/dma/dmaengine.c
>> +++ b/drivers/dma/dmaengine.c
>> @@ -693,12 +693,12 @@ int dma_async_device_register(struct dma_device *device)
>>               !device->device_prep_dma_interrupt);
>>       BUG_ON(dma_has_cap(DMA_SG, device->cap_mask) &&
>>               !device->device_prep_dma_sg);
>> -     BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
>> -             !device->device_prep_slave_sg);
>>       BUG_ON(dma_has_cap(DMA_CYCLIC, device->cap_mask) &&
>>               !device->device_prep_dma_cyclic);
>>       BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
>>               !device->device_control);
>> +     BUG_ON(dma_has_cap(DMA_INTERLEAVE, device->cap_mask) &&
>> +             !device->device_prep_interleaved_dma);
>>
>>       BUG_ON(!device->device_alloc_chan_resources);
>>       BUG_ON(!device->device_free_chan_resources);
>> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
>> index 8fbf40e..ce8c40a 100644
>> --- a/include/linux/dmaengine.h
>> +++ b/include/linux/dmaengine.h
>> @@ -71,11 +71,85 @@ enum dma_transaction_type {
>>       DMA_ASYNC_TX,
>>       DMA_SLAVE,
>>       DMA_CYCLIC,
>> +     DMA_INTERLEAVE,
>> +/* last transaction type for creation of the capabilities mask */
>> +     DMA_TX_TYPE_END,
>>  };
>>
>> -/* last transaction type for creation of the capabilities mask */
>> -#define DMA_TX_TYPE_END (DMA_CYCLIC + 1)
>> +enum dma_transfer_direction {
>> +     MEM_TO_MEM,
>> +     MEM_TO_DEV,
>> +     DEV_TO_MEM,
>> +     DEV_TO_DEV,
>> +};
>> +
>> +/**
>> + * Interleaved Transfer Request
>> + * ----------------------------
>> + * A chunk is collection of contiguous bytes to be transfered.
>> + * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
>> + * ICGs may or maynot change between chunks.
>> + * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
>> + *  that when repeated an integral number of times, specifies the transfer.
>> + * A transfer template is specification of a Frame, the number of times
>> + *  it is to be repeated and other per-transfer attributes.
>> + *
>> + * Practically, a client driver would have ready a template for each
>> + *  type of transfer it is going to need during its lifetime and
>> + *  set only 'src_start' and 'dst_start' before submitting the requests.
>> + *
>> + *
>> + *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
>> + *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
>> + *
>> + *    ==  Chunk size
>> + *    ... ICG
>> + */
>>
>> +/**
>> + * struct data_chunk - Element of scatter-gather list that makes a frame.
>> + * @size: Number of bytes to read from source.
>> + *     size_dst := fn(op, size_src), so doesn't mean much for destination.
>> + * @icg: Number of bytes to jump after last src/dst address of this
>> + *    chunk and before first src/dst address for next chunk.
>> + *    Ignored for dst(assumed 0), if dst_inc is true and dst_sgl is false.
>> + *    Ignored for src(assumed 0), if src_inc is true and src_sgl is false.
>> + */
>> +struct data_chunk {
>> +     size_t size;
>> +     size_t icg;
>> +};
>> +
>> +/**
>> + * struct dmaxfer_template - Template to convey DMAC the transfer pattern
>> + *    and attributes.
>> + * @src_start: Bus address of source for the first chunk.
>> + * @dst_start: Bus address of destination for the first chunk.
>> + * @dir: Specifies the type of Source and Destination.
>> + * @src_inc: If the source address increments after reading from it.
>> + * @dst_inc: If the destination address increments after writing to it.
>> + * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
>> + *           Otherwise, source is read contiguously (icg ignored).
>> + *           Ignored if src_inc is false.
>> + * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
>> + *           Otherwise, destination is filled contiguously (icg ignored).
>> + *           Ignored if dst_inc is false.
>> + * @numf: Number of frames in this template.
>> + * @frame_size: Number of chunks in a frame i.e, size of sgl[].
>> + * @sgl: Array of {chunk,icg} pairs that make up a frame.
>> + */
>> +struct dmaxfer_template {
>> +     dma_addr_t src_start;
>> +     dma_addr_t dst_start;
>> +     enum dma_transfer_direction dir;
>> +     bool src_inc;
>> +     bool dst_inc;
>> +     bool src_sgl;
>> +     bool dst_sgl;
>> +     size_t numf;
>> +     size_t frame_size;
>> +     struct data_chunk sgl[0];
>> +};
>>
>>  /**
>>   * enum dma_ctrl_flags - DMA flags to augment operation preparation,
>> @@ -432,6 +506,7 @@ struct dma_tx_state {
>>   * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
>>   *   The function takes a buffer of size buf_len. The callback function will
>>   *   be called after period_len bytes have been transferred.
>> + * @device_prep_interleaved_dma: Transfer expression in a generic way.
>>   * @device_control: manipulate all pending operations on a channel, returns
>>   *   zero or error code
>>   * @device_tx_status: poll for transaction completion, the optional
>> @@ -496,6 +571,9 @@ struct dma_device {
>>       struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
>>               struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
>>               size_t period_len, enum dma_data_direction direction);
>> +     struct dma_async_tx_descriptor *(*device_prep_interleaved_dma)(
>> +             struct dma_chan *chan, struct dmaxfer_template *xt,
>> +             unsigned long flags);
>>       int (*device_control)(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
>>               unsigned long arg);
>>
> IMO this looks decent now, we can take this for merge if we don't have
> any other issues.
> Ideally would be great if we also need to see the usage for this
> API .... Barry?. I am okay to host this up on a branch meanwhile.
>
> Just a minor nitpick, I would have really like dmaxfer_template to be
> named dma_interleaved_template. I think we are still quite far from
> generic transfer template. Jassi if you agree I can fix that up while
> applying, no need to revise for nitpick :)
>
There is a need to provide cyclic functionality, but we seem to disagree
on the implementation. So you might as well take this patch and provide
the cyclic feature, the way you want, as a patch on top of this.
I am ok with the dmaxfer_template renaming.
Thanks.

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv5] DMAEngine: Define interleaved transfer request api
  2011-10-16 11:16           ` Jassi Brar
@ 2011-10-16 12:16             ` Vinod Koul
  0 siblings, 0 replies; 131+ messages in thread
From: Vinod Koul @ 2011-10-16 12:16 UTC (permalink / raw)
  To: Jassi Brar; +Cc: linux-kernel, dan.j.williams, rmk, 21cnbao

On Sun, 2011-10-16 at 16:46 +0530, Jassi Brar wrote:
> > IMO this looks decent now, we can take this for merge if we don't
> have
> > any other issues.
> > Ideally would be great if we also need to see the usage for this
> > API .... Barry?. I am okay to host this up on a branch meanwhile.
> >
> > Just a minor nitpick, I would have really like dmaxfer_template to be
> > named dma_interleaved_template. I think we are still quite far from
> > generic transfer template. Jassi if you agree I can fix that up while
> > applying, no need to revise for nitpick :)
> >
> There is need to provide cyclic functionality but we seem to disagree on
> implementation. So you might as well take this patch and provide
> cyclic feature, the way you want, as a patch on top of this.
> I am ok with dmaxfer_template renaming. 
Thanks, I have changed it and pushed it to the interleaved_dma branch.

Once Barry sends his drivers using this API, I will merge.

-- 
~Vinod


^ permalink raw reply	[flat|nested] 131+ messages in thread

* RE: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-15 11:25                                           ` Jassi Brar
@ 2011-10-17 14:07                                             ` Bounine, Alexandre
  2011-10-17 15:16                                               ` Jassi Brar
  0 siblings, 1 reply; 131+ messages in thread
From: Bounine, Alexandre @ 2011-10-17 14:07 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Williams, Dan J, Vinod Koul, Russell King, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On Sat, Oct 15, 2011 at 7:26 AM, Jassi Brar <jaswinder.singh@linaro.org>
wrote:
> 
> On 15 October 2011 00:45, Bounine, Alexandre
> <Alexandre.Bounine@idt.com> wrote:
> 
> >> But doesn't the info, pointed to by this (void *), remain same for
> >> every
> >> transfer to a particular target/remote device ?
> > No. An address within the target may (and most likely will) be
changed for
> > every transfer. Target destination ID will be the same for given
virtual channel.
> >
> Thanks for the info.
> 
> > Virtual channel may bring the same challenge and I may need a
channel locking
> > if more than one requester try to read/write data to the same target
RIO device.
> >
> One can't avoid taking care of locking, but using virtual channels
> keeps the dma_chan usage consistent.
> 
Using virtual channels adds layers of complexity which may be avoided
with
simple API changes:
- virtual channel allocation: statically vs. dynamically
- linking virtual channel to the physical one


> RapidIO supports 34(32+2), 50(48+2) and 66(64+2) bit addressing
> which makes me wonder if the (upper or lower) 2 bits could be attached
> to
> the identity of the target device ?
> (tsi721 driver actually discards the upper 2 bits while claiming to
> support
> 66bit addressing so I couldn't make anything out of it and specs don't
> seem to say much about it)
> 
> If there is no user of 66bit addressing and isn't coming in very near
> future,
> we might as well drop that case for now(tsi721 already does) because
> that 'completeness' of support modifies the semantics of dmaengine
apis
> today for no real use.
This is marked to be fixed in the tsi721 driver. Also, this is a local
deficiency, and changing it does not affect other components of the RIO
subsystem.
Contrary to that, defining an upper layer affects all future development
and may result in greater pain if it needs to be adjusted later.

Alex.



^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-17 14:07                                             ` Bounine, Alexandre
@ 2011-10-17 15:16                                               ` Jassi Brar
  2011-10-17 18:00                                                 ` Bounine, Alexandre
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-10-17 15:16 UTC (permalink / raw)
  To: Bounine, Alexandre
  Cc: Williams, Dan J, Vinod Koul, Russell King, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On 17 October 2011 19:37, Bounine, Alexandre <Alexandre.Bounine@idt.com> wrote:
> On Sat, Oct 15, 2011 at 7:26 AM, Jassi Brar <jaswinder.singh@linaro.org>
> wrote:
>>
>> On 15 October 2011 00:45, Bounine, Alexandre
>> <Alexandre.Bounine@idt.com> wrote:
>>
>> >> But doesn't the info, pointed to by this (void *), remain same for
>> >> every
>> >> transfer to a particular target/remote device ?
>> > No. An address within the target may (and most likely will) be
> changed for
>> > every transfer. Target destination ID will be the same for given
> virtual channel.
>> >
>> Thanks for the info.
>>
>> > Virtual channel may bring the same challenge and I may need a
> channel locking
>> > if more than one requester try to read/write data to the same target
> RIO device.
>> >
>> One can't avoid taking care of locking, but using virtual channels
>> keeps the dma_chan usage consistent.
>>
> Using virtual channels adds layers of complexity
Perhaps you didn't get my point ... I suggest the dma controller driver
(not client drivers) create a virtual channel corresponding to each
device it can talk to. A bunch of virtual channels could be served
by a single appropriate physical channel.
It is actually quite common; see amba-pl08x.c or pl330.c for example.
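
For illustration, the mapping can be modeled in a few lines of plain C.
This is only a sketch of the idea, not the actual amba-pl08x.c/pl330.c
code; the struct and function names below are invented, and the real
drivers do this through the dmaengine framework with locking and
per-channel work queues:

```c
#include <assert.h>
#include <stddef.h>

/* One physical DMA channel, shared by many virtual channels. */
struct phys_chan {
	int id;
	struct virt_chan *serving;	/* vchan currently owning the hw */
};

/* One virtual channel per addressable target device. */
struct virt_chan {
	int target_id;			/* fixed identity, e.g. a RIO destID */
	struct phys_chan *pchan;	/* assigned only while transferring */
};

/* Grab a free physical channel for a vchan's transfer, if any. */
static struct phys_chan *vchan_acquire(struct virt_chan *vc,
				       struct phys_chan *pool, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!pool[i].serving) {
			pool[i].serving = vc;
			vc->pchan = &pool[i];
			return vc->pchan;
		}
	}
	return NULL;		/* all busy: the transfer must be queued */
}

static void vchan_release(struct virt_chan *vc)
{
	vc->pchan->serving = NULL;
	vc->pchan = NULL;
}
```

The point of the model: many vchans exist cheaply (one per target), but
only as many transfers run concurrently as there are physical channels.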

> which may be avoided with simple API changes:
> - virtual channel allocation: statically vs. dynamically
Yes, it would be cool but it's not possible right now.

> - linking virtual channel to the physical one
>
Perhaps you mean what I suggested ?

>
>> RapidIO supports 34(32+2), 50(48+2) and 66(64+2) bit addressing
>> which makes me wonder if the (upper or lower) 2 bits could be attached
>> to
>> the identity of the target device ?
>> (tsi721 driver actually discards the upper 2 bits while claiming to
>> support
>> 66bit addressing so I couldn't make anything out of it and specs don't
>> seem to say much about it)
>>
>> If there is no user of 66bit addressing and isn't coming in very near
>> future,
>> we might as well drop that case for now(tsi721 already does) because
>> that 'completeness' of support modifies the semantics of dmaengine
>> apis today for no real use.
> This is marked to be fixed in tsi721 driver. Also, this is a local
> deficiency and changing it does not affect other components of the RIO
> subsystem. Contrary to that, defining an upper layer affects all future
> development and may result in greater pain if it needs to be adjusted
> later.
>
I just wanted to know:

1) The role of the 'extra' 2bits ?

2) Are there real use-cases that are blocked on this support right now ?
   If there are indeed, do you think the transfer would be _randomly_
   distributed over the 66-bit address space ? Because otherwise, maybe
   the upper 2 bits could be used to "activate" one of the 4 "segments"
   using the slave config call.

We should try our best to avoid opening this can of worms by adding a
(void *) hook to each transfer, because any client driver could want to
pass its own private data to the dmac, and there would be no way for a
dmac driver to know what to cast the void pointer to.
Thanks.

^ permalink raw reply	[flat|nested] 131+ messages in thread

* RE: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-17 15:16                                               ` Jassi Brar
@ 2011-10-17 18:00                                                 ` Bounine, Alexandre
  2011-10-17 19:29                                                   ` Jassi Brar
  0 siblings, 1 reply; 131+ messages in thread
From: Bounine, Alexandre @ 2011-10-17 18:00 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Williams, Dan J, Vinod Koul, Russell King, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On Mon, Oct 17, 2011 at 11:17 AM, Jassi Brar
<jaswinder.singh@linaro.org> wrote:
> 
> On 17 October 2011 19:37, Bounine, Alexandre
> <Alexandre.Bounine@idt.com> wrote:
> > On Sat, Oct 15, 2011 at 7:26 AM, Jassi Brar
> <jaswinder.singh@linaro.org>
> > wrote:
> >>
> >> On 15 October 2011 00:45, Bounine, Alexandre
> >> <Alexandre.Bounine@idt.com> wrote:
> >>
> >> >> But doesn't the info, pointed to by this (void *), remain same
> >> >> for every transfer to a particular target/remote device ?
> >> > No. An address within the target may (and most likely will) be
> >> > changed for every transfer. Target destination ID will be the
> >> > same for given virtual channel.
> >> >
> >> Thanks for the info.
> >>
> >> > Virtual channel may bring the same challenge and I may need a
> >> > channel locking if more than one requester try to read/write
> >> > data to the same target RIO device.
> >> >
> >> One can't avoid taking care of locking, but using virtual channels
> >> keeps the dma_chan usage consistent.
> >>
> > Using virtual channels adds layers of complexity
> Perhaps you didn't get me ... I suggest the dma controller driver
> (not client drivers) create virtual channels corresponding to each
> device it can talk to. A bunch of virtual channels could be served
> by a single appropriate physical channel.
> It is actually quite common, see amba-pl08x.c or pl330.c for example.
> 
This is a source of the problem for RIO - the DMA controller driver
creates virtual channels statically. RapidIO may use 8- or 16-bit
destIDs. In this case we need to create 256 or 64K virtual channels if
we want to cover all possible targets on a single RIO port. Adding an
extra controller/net multiplies that number. Considering that not every
device will need a data transfer from a given node, static allocation
will create even more wasted resources.

> > which may be avoided with simple API changes:
> > - virtual channel allocation: statically vs. dynamically
> Yes, it would be cool but it's not possible right now.
> 
This is the reason for calling it "added complexity": we will need
to find some mechanism to do it dynamically at the DMA or RIO layer.

> > - linking virtual channel to the physical one
> >
> Perhaps you mean what I suggested ?
> 
I mean linking a dynamically allocated virtual channel to the physical
one. Sorry for the confusing statement. Ideally, in the virtual channel
scenario rio_request_dma() should dynamically allocate a target-mapped
virtual DMA channel and link it to the appropriate physical DMA channel.

> >
> >> RapidIO supports 34(32+2), 50(48+2) and 66(64+2) bit addressing
> >> which makes me wonder if the (upper or lower) 2 bits could be
> >> attached to the identity of the target device ?
> >> (tsi721 driver actually discards the upper 2 bits while claiming to
> >> support 66bit addressing so I couldn't make anything out of it and
> >> specs don't seem to say much about it)
> >>
> >> If there is no user of 66bit addressing and isn't coming in very
> >> near future, we might as well drop that case for now (tsi721 already
> >> does) because that 'completeness' of support modifies the semantics
> >> of dmaengine apis today for no real use.
> > This is marked to be fixed in tsi721 driver. Also, this is a local
> > deficiency and changing it does not affect other components of the
> > RIO subsystem. Contrary to that, defining an upper layer affects all
> > future development and may result in greater pain if it needs to be
> > adjusted later.
> >
> I just wanted to know
> 
> 1) The role of the 'extra' 2bits ?
> 
Just upper bits of the RIO address.

> 2) Are there real use-cases that are blocked on this support right
>    now ? If there are indeed, do you think the transfer would be
>    _randomly_ distributed over the 66-bit address space ? Because
>    otherwise, maybe the upper 2 bits could be used to "activate" one
>    of the 4 "segments" using slave config call.
> 
There is nothing that the absence of full 66-bit addressing blocks now.
So far we are not aware of implementations that use a 66-bit address.
This does not prevent someone from designing a RIO compliant endpoint
device which gives interpretation to these two bits in addition to the
full 64-bit addressing of their platform.

At this moment we may say that it is reasonable to lower the importance
of 66-bit addressing, but we should keep it in mind when considering all
pros and cons of possible API changes. I discussed this with some
members of RTA and did not hear any strong argument in favor of full
66-bit addressing support in SW at this moment.

> We should try our best to avoid opening the can of worms by adding
> (void *) hook to each transfer, because any client driver could want
> to pass its own private data to dmac and there would be no way for a
> dmac driver to know what to cast the void pointer to.

Do we really expect that clients will jump to use an extra parameter
without a valid reason and without knowing their hardware specifics?

Alex.


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-17 18:00                                                 ` Bounine, Alexandre
@ 2011-10-17 19:29                                                   ` Jassi Brar
  2011-10-17 21:07                                                     ` Bounine, Alexandre
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-10-17 19:29 UTC (permalink / raw)
  To: Bounine, Alexandre
  Cc: Williams, Dan J, Vinod Koul, Russell King, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On 17 October 2011 23:30, Bounine, Alexandre <Alexandre.Bounine@idt.com> wrote:
>> >> >> But doesn't the info, pointed to by this (void *), remain same
>> >> >> for every transfer to a particular target/remote device ?
>> >> > No. An address within the target may (and most likely will) be
>> >> > changed for every transfer. Target destination ID will be the
>> >> > same for given virtual channel.
>> >> >
>> >> Thanks for the info.
>> >>
>> >> > Virtual channel may bring the same challenge and I may need a
>> >> > channel locking if more than one requester try to read/write
>> >> > data to the same target RIO device.
>> >> >
>> >> One can't avoid taking care of locking, but using virtual channels
>> >> keeps the dma_chan usage consistent.
>> >>
>> > Using virtual channels adds layers of complexity
>> Perhaps you didn't get me ... I suggest the dma controller driver
>> (not client drivers) create virtual channels corresponding to each
>> device it can talk to. A bunch of virtual channels could be served
>> by a single appropriate physical channel.
>> It is actually quite common, see amba-pl08x.c or pl330.c for example.
>>
> This is a source of the problem for RIO - DMA controller driver creates
> virtual channels statically. RapidIO may use 8- or 16-bit destID.
> In this case we need to create 256 or 64K virtual channels if we
> want to cover all possible targets on single RIO port. Adding an extra
> controller/net multiplies that number. Considering that not every
> device will need a data transfer from a given node static allocation
> will create even more wasted resources.
>
Please excuse my rudimentary knowledge of RapidIO, but I am tempted to
ask: why not register channels only for those targets that are actually
detected and enumerated?

>>
>> 1) The role of the 'extra' 2bits ?
>>
> Just upper bits of the RIO address.
>
I assume you mean any transfer could be targeted at any of 2^66 bits
on the remote device.

> There is nothing that absence of full 66-bit addressing blocks now.
> So far we are not aware about implementations that use 66-bit address.
>
Thanks for the important info.
If I were you, I would postpone enabling support for 66-bit addressing,
especially when it so affects the dmaengine API.
Otherwise, you don't just code an unused feature, but also put
constraints on future development/upgrades of the API, possibly sooner
than the feature is actually needed.

If we postpone 66-bit addressing to when it arrives, we can
1) Attach destID to the virtual channel's identity
2) Use device_prep_dma_memcpy so as to be able to change
    target address for every transfer. Or use prep_slave, depending
    upon nature of address at target endpoint.
3) Use slave_config to set wr_type if it remains same for enough
    consecutive transfers to the same target (only you could strike
    the balance between idealism and pragmatism).
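
For what it's worth, points 1 and 2 can be sketched in plain C. This is
purely illustrative: `rio_dma_chan`, `rio_dma_desc` and `prep_copy` are
made-up names standing in for a dmaengine channel, descriptor and
device_prep_dma_memcpy(); real code would request the channel through
the dmaengine filter mechanism and configure it via dma_slave_config:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* (1) destID is part of the channel's identity, fixed at request time. */
struct rio_dma_chan {
	uint16_t dest_id;	/* identifies the target RIO device */
	int	 wr_type;	/* (3) sticky setting, via "slave_config" */
};

/* A prepared transfer, in the spirit of device_prep_dma_memcpy(). */
struct rio_dma_desc {
	struct rio_dma_chan *chan;
	uint64_t rio_addr;	/* (2) target address, per transfer */
	size_t	 len;
};

static struct rio_dma_desc prep_copy(struct rio_dma_chan *c,
				     uint64_t rio_addr, size_t len)
{
	/* Channel identity stays fixed; only the address/len vary. */
	return (struct rio_dma_desc){ .chan = c, .rio_addr = rio_addr,
				      .len = len };
}
```

The shape shows the split being argued for: target identity lives in the
channel, the per-transfer address travels with each prepared descriptor.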

> This does not prevent someone from designing RIO compliant endpoint
> device which gives interpretation to these two bits in addition to full 64-bit
> addressing of their platform.
>
It sounds as if the 2bits are 'Vendor-Specific' ?

>> We should try our best to avoid opening the can of worms by adding
>> (void *) hook to each transfer, because any client driver could want
> to
>> pass its own private data to dmac and there would be no way for a dmac
>> driver to know what to cast the void pointer to.
>
> Do we really expect that clients will jump to use an extra parameter
> without a valid reason and without knowing their hardware specifics?
>
The long term plan for dmaengine is to be able to have client drivers
shared across dmac drivers. And that requires clients making no
assumptions about the underlying dmac, and the dmac driver expecting
nothing from clients outside of the common APIs.

^ permalink raw reply	[flat|nested] 131+ messages in thread

* RE: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-17 19:29                                                   ` Jassi Brar
@ 2011-10-17 21:07                                                     ` Bounine, Alexandre
  2011-10-18  5:45                                                       ` Jassi Brar
  0 siblings, 1 reply; 131+ messages in thread
From: Bounine, Alexandre @ 2011-10-17 21:07 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Williams, Dan J, Vinod Koul, Russell King, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On Mon, Oct 17, 2011 at 3:30 PM, Jassi Brar <jaswinder.singh@linaro.org>
wrote:
... skip ...
> > This is a source of the problem for RIO - DMA controller driver
> > creates virtual channels statically. RapidIO may use 8- or 16-bit
> > destID. In this case we need to create 256 or 64K virtual channels
> > if we want to cover all possible targets on single RIO port. Adding
> > an extra controller/net multiplies that number. Considering that not
> > every device will need a data transfer from a given node static
> > allocation will create even more wasted resources.
> >
> Please excuse my rudimentary knowledge of RapidIO but I am tempted
> to ask why not register channels only for those targets that are
> actually detected and enumerated?
> 
Two reasons:
- possibility of hot device insertion/removal
- there is no advance knowledge of which target device may require DMA
  service. A device driver for the particular target device is expected
  to request DMA service if required.

> >>
> >> 1) The role of the 'extra' 2bits ?
> >>
> > Just upper bits of the RIO address.
> >
> I assume you mean any transfer could be targeted at any of 2^66 bits
> on the remote device.
Yes, this is correct. 

> 
> > There is nothing that absence of full 66-bit addressing blocks now.
> > So far we are not aware about implementations that use 66-bit
> address.
> >
> Thanks for the important info.
> If I were you, I would postpone enabling support for 66-bit addressing
> esp when it soo affects the dmaengine API.
> Otherwise, you don't just code unused feature, but also put
> constraints on development/up-gradation of the API in future,
> possibly, nearer than real need of the feature.
> 
> If we postpone 66-bit addressing to when it arrives, we can
> 1) Attach destID to the virtual channel's identity
> 2) Use device_prep_dma_memcpy so as to be able to change
>     target address for every transfer. Or use prep_slave, depending
>     upon nature of address at target endpoint.
> 3) Use slave_config to set wr_type if it remains same for enough
>     consecutive transfers to the same target (only you could strike
>     the balance between idealism and pragmatism).
> 
With item #1 above being a separate topic, I may have a problem with #2
as well: dma_addr_t is sized for the local platform and not guaranteed
to be a 64-bit value (which may be required by a target).
Agree with #3 (if #1 and #2 work).  

> > This does not prevent someone from designing RIO compliant endpoint
> > device which gives interpretation to these two bits in addition to
> full 64-bit
> > addressing of their platform.
> >
> It sounds as if the 2bits are 'Vendor-Specific' ?
> 
They are not 'Vendor-Specific' in the RIO spec (just address),
but HW vendors may give them such meaning. I just wanted to say
that HW designers may follow the same logic as we do here and
decide to give those bits a special meaning because "no one uses
them". They expect the RapidIO fabric to pass a 66-bit address and
have the right to do so. Tsi721 is capable of generating a 66-bit
address for RIO requests.

So far this is a theoretical possibility and we are not aware of any
designs of this type. We may put a big warning note about the 64-bit
limitation in the RapidIO documentation section.

I have 64-bit-only support in the Tsi721 DMA driver because my priority
is defining the right upper-layer interface to DMA Engine.
I published the tsi721 DMA driver mostly as a supporting example to
demonstrate what we are doing at the RapidIO layer.
After the upper layer is defined it should not be a problem to
switch to 66-bit or just keep 64.

> >> We should try our best to avoid opening the can of worms by adding
> >> (void *) hook to each transfer, because any client driver could
> >> want to pass its own private data to dmac and there would be no way
> >> for a dmac driver to know what to cast the void pointer to.
> >
> > Do we really expect that clients will jump to use an extra parameter
> > without a valid reason and without knowing their hardware specifics?
> >
> The long term plan for dmaengine is to be able to have client drivers
> shared across dmac drivers. And that requires clients making no
> assumptions about the underlying dmac and the dmac driver expecting
> nothing from client via outside of common APIs.
In this case adding new values to dma_transaction_type (like
DMA_RAPIDIO) may make sense when using a unified prep_slave_sg() with
an extra parameter. This will hide RapidIO dmac drivers from DMA_SLAVE
clients (and vice versa) while keeping common APIs from growing in
their numbers.

Alex.
 

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-17 21:07                                                     ` Bounine, Alexandre
@ 2011-10-18  5:45                                                       ` Jassi Brar
  2011-10-18  7:42                                                         ` Russell King
  2011-10-18 13:51                                                         ` Bounine, Alexandre
  0 siblings, 2 replies; 131+ messages in thread
From: Jassi Brar @ 2011-10-18  5:45 UTC (permalink / raw)
  To: Bounine, Alexandre
  Cc: Williams, Dan J, Vinod Koul, Russell King, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On 18 October 2011 02:37, Bounine, Alexandre <Alexandre.Bounine@idt.com> wrote:
> On Mon, Oct 17, 2011 at 3:30 PM, Jassi Brar <jaswinder.singh@linaro.org>
> wrote:
> ... skip ...
>> > This is a source of the problem for RIO - DMA controller driver
>> > creates virtual channels statically. RapidIO may use 8- or 16-bit
>> > destID. In this case we need to create 256 or 64K virtual channels
>> > if we want to cover all possible targets on single RIO port.
>> > Adding an extra controller/net multiplies that number. Considering
>> > that not every device will need a data transfer from a given node
>> > static allocation will create even more wasted resources.
>> >
>> Please excuse my rudimentary knowledge of RapidIO but I am tempted
>> to ask why not register channels only for those targets that are
>> actually detected and enumerated?
>>
> Two reasons:
> - possibility of hot device insertion/removal
... but the linux RIO stack doesn't seem to support hotplug.
Enumeration/discovery is done only once during boot.
Am I overlooking something ?

> - there is no advance knowledge of which target device may require DMA
>  service. A device driver for the particular target device is expected
>  to request DMA service if required.
>
IMHO 1 channel per real device is an acceptable 'overhead'.
Already many SoCs register dozens of channels but only a couple
of them are actually used.

>> > There is nothing that absence of full 66-bit addressing blocks now.
>> > So far we are not aware about implementations that use 66-bit
>> address.
>> >
>> Thanks for the important info.
>> If I were you, I would postpone enabling support for 66-bit addressing
>> esp when it soo affects the dmaengine API.
>> Otherwise, you don't just code unused feature, but also put
>> constraints on development/up-gradation of the API in future,
>> possibly, nearer than real need of the feature.
>>
>> If we postpone 66-bit addressing to when it arrives, we can
>> 1) Attach destID to the virtual channel's identity
>> 2) Use device_prep_dma_memcpy so as to be able to change
>>     target address for every transfer. Or use prep_slave, depending
>>     upon nature of address at target endpoint.
>> 3) Use slave_config to set wr_type if it remains same for enough
>>     consecutive transfers to the same target (only you could strike
>>     the balance between idealism and pragmatism).
>>
> With item #1 above being a separate topic, I may have a problem with #2
> as well: dma_addr_t is sized for the local platform and not guaranteed
> to be a 64-bit value (which may be required by a target).
> Agree with #3 (if #1 and #2 work).
>
Perhaps simply change dma_addr_t to u64 in dmaengine.h alone ?

>> > This does not prevent someone from designing RIO compliant endpoint
>> > device which gives interpretation to these two bits in addition to
>> full 64-bit
>> > addressing of their platform.
>> >
>> It sounds as if the 2bits are 'Vendor-Specific' ?
>>
> They are not 'Vendor-Specific' in the RIO spec (just address),
> but HW vendors may give them such meaning. I just wanted to say
> that HW designers may follow the same logic as we do here and
> decide to give those bits a special meaning because "no one uses
> them".
>
We should discount that. Irrespective of the linux RIO stack, a vendor
would be at fault if it assigned a different meaning to the upper 2 bits
while the specs deem them just the MSBs of 66-bit addresses.

> So far this is a theoretical possibility and we are not
> aware about any designs of this type. We may put a big warning
> note about 64-bit limitation in RapidIO documentation section.
>
Yes, please.

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-18  5:45                                                       ` Jassi Brar
@ 2011-10-18  7:42                                                         ` Russell King
  2011-10-18  8:30                                                           ` Jassi Brar
  2011-10-18 13:51                                                         ` Bounine, Alexandre
  1 sibling, 1 reply; 131+ messages in thread
From: Russell King @ 2011-10-18  7:42 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Bounine, Alexandre, Williams, Dan J, Vinod Koul, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On Tue, Oct 18, 2011 at 11:15:29AM +0530, Jassi Brar wrote:
> On 18 October 2011 02:37, Bounine, Alexandre <Alexandre.Bounine@idt.com> wrote:
> > With item #1 above being a separate topic, I may have a problem with #2
> > as well: dma_addr_t is sized for the local platform and not guaranteed
> > to be a 64-bit value (which may be required by a target).
> > Agree with #3 (if #1 and #2 work).
> >
> Perhaps simply change dma_addr_t to u64 in dmaengine.h alone ?

That's just an idiotic suggestion - there's no other way to put that.
Let's have some sanity here.

dma_addr_t is the size of a DMA address for the CPU architecture being
built.  This has no relationship to what any particular DMA engine uses.

DMA addresses are likely to be stored in scatter tables in memory, and
such structures should be defined using u32, u64 (even more bonus points
if you use le32 and le64), etc and not types which are dependent on the
CPU architecture.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-18  8:30                                                           ` Jassi Brar
@ 2011-10-18  8:26                                                             ` Vinod Koul
  2011-10-18  8:37                                                               ` Jassi Brar
  2011-10-18  9:49                                                             ` Russell King
  1 sibling, 1 reply; 131+ messages in thread
From: Vinod Koul @ 2011-10-18  8:26 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Russell King, Bounine, Alexandre, Williams, Dan J, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On Tue, 2011-10-18 at 14:00 +0530, Jassi Brar wrote:
> On 18 October 2011 13:12, Russell King <rmk@arm.linux.org.uk> wrote:
> > On Tue, Oct 18, 2011 at 11:15:29AM +0530, Jassi Brar wrote:
> >> On 18 October 2011 02:37, Bounine, Alexandre <Alexandre.Bounine@idt.com> wrote:
> >> > With item #1 above being a separate topic, I may have a problem with #2
> >> > as well: dma_addr_t is sized for the local platform and not guaranteed
> >> > to be a 64-bit value (which may be required by a target).
> >> > Agree with #3 (if #1 and #2 work).
> >> >
> >> Perhaps simply change dma_addr_t to u64 in dmaengine.h alone ?
> >
> > That's just an idiotic suggestion - there's no other way to put that.
> > Let's have some sanity here.
> >
> Yeah, I am not proud of the workaround, so I only probed the option.
> I think I need to explain myself.
> 
> The case here is that even a 32-bit RapidIO host could ask transfer against
> 64-bit address space on a remote device. And vice versa 64->32.
I thought RIO addresses were always 64 + 2 bits, irrespective of what
the host system is...

> 
> > dma_addr_t is the size of a DMA address for the CPU architecture being
> > built.  This has no relationship to what any particular DMA engine uses.
> >
> Yes, so far the dmaengine ever only needed to transfer within platform's
> address-space. So the assumption that src and dst addresses could
> be contained within dma_addr_t, worked.
> If the damengine is to get rid of that assumption/constraint, the memcpy,
> slave_sg etc need to accept addresses specified in bigger of the host and
> remote address space, and u64 is the safe option.
> Ultimately dma_addr_t is either u32 or u64.
> 
> If you still think that's unacceptable, please do show us the optimal
> path forward.


-- 
~Vinod


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-18  7:42                                                         ` Russell King
@ 2011-10-18  8:30                                                           ` Jassi Brar
  2011-10-18  8:26                                                             ` Vinod Koul
  2011-10-18  9:49                                                             ` Russell King
  0 siblings, 2 replies; 131+ messages in thread
From: Jassi Brar @ 2011-10-18  8:30 UTC (permalink / raw)
  To: Russell King
  Cc: Bounine, Alexandre, Williams, Dan J, Vinod Koul, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On 18 October 2011 13:12, Russell King <rmk@arm.linux.org.uk> wrote:
> On Tue, Oct 18, 2011 at 11:15:29AM +0530, Jassi Brar wrote:
>> On 18 October 2011 02:37, Bounine, Alexandre <Alexandre.Bounine@idt.com> wrote:
>> > With item #1 above being a separate topic, I may have a problem with #2
>> > as well: dma_addr_t is sized for the local platform and not guaranteed
>> > to be a 64-bit value (which may be required by a target).
>> > Agree with #3 (if #1 and #2 work).
>> >
>> Perhaps simply change dma_addr_t to u64 in dmaengine.h alone ?
>
> That's just an idiotic suggestion - there's no other way to put that.
> Let's have some sanity here.
>
Yeah, I am not proud of the workaround, so I only probed the option.
I think I need to explain myself.

The case here is that even a 32-bit RapidIO host could request a
transfer against a 64-bit address space on a remote device. And vice
versa, 64->32.

> dma_addr_t is the size of a DMA address for the CPU architecture being
> built.  This has no relationship to what any particular DMA engine uses.
>
Yes, so far the dmaengine has only ever needed to transfer within the
platform's address-space. So the assumption that src and dst addresses
could be contained within dma_addr_t worked.
If the dmaengine is to get rid of that assumption/constraint, the
memcpy, slave_sg etc need to accept addresses specified in the bigger
of the host and remote address spaces, and u64 is the safe option.
Ultimately dma_addr_t is either u32 or u64.

If you still think that's unacceptable, please do show us the optimal
path forward.

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-18  8:26                                                             ` Vinod Koul
@ 2011-10-18  8:37                                                               ` Jassi Brar
  2011-10-18 14:44                                                                 ` Bounine, Alexandre
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-10-18  8:37 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Russell King, Bounine, Alexandre, Williams, Dan J, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On 18 October 2011 13:56, Vinod Koul <vinod.koul@intel.com> wrote:
> On Tue, 2011-10-18 at 14:00 +0530, Jassi Brar wrote:
>> On 18 October 2011 13:12, Russell King <rmk@arm.linux.org.uk> wrote:
>> > On Tue, Oct 18, 2011 at 11:15:29AM +0530, Jassi Brar wrote:
>> >> On 18 October 2011 02:37, Bounine, Alexandre <Alexandre.Bounine@idt.com> wrote:
>> >> > With item #1 above being a separate topic, I may have a problem with #2
>> >> > as well: dma_addr_t is sized for the local platform and not guaranteed
>> >> > to be a 64-bit value (which may be required by a target).
>> >> > Agree with #3 (if #1 and #2 work).
>> >> >
>> >> Perhaps simply change dma_addr_t to u64 in dmaengine.h alone ?
>> >
>> > That's just an idiotic suggestion - there's no other way to put that.
>> > Let's have some sanity here.
>> >
>> Yeah, I am not proud of the workaround, so I only probed the option.
>> I think I need to explain myself.
>>
>> The case here is that even a 32-bit RapidIO host could ask transfer against
>> 64-bit address space on a remote device. And vice versa 64->32.
> I thought RIO address were always 64 + 2 bits, irrespective of what the
> host system is...
>
No, not always. A RIO address could be 32, 48 or 64 bits... with the
role of the extra 2 bits not very clear.

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-18  8:30                                                           ` Jassi Brar
  2011-10-18  8:26                                                             ` Vinod Koul
@ 2011-10-18  9:49                                                             ` Russell King
  2011-10-18 11:50                                                               ` Jassi Brar
  2011-10-18 17:26                                                               ` Bounine, Alexandre
  1 sibling, 2 replies; 131+ messages in thread
From: Russell King @ 2011-10-18  9:49 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Bounine, Alexandre, Williams, Dan J, Vinod Koul, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On Tue, Oct 18, 2011 at 02:00:45PM +0530, Jassi Brar wrote:
> On 18 October 2011 13:12, Russell King <rmk@arm.linux.org.uk> wrote:
> > On Tue, Oct 18, 2011 at 11:15:29AM +0530, Jassi Brar wrote:
> >> On 18 October 2011 02:37, Bounine, Alexandre <Alexandre.Bounine@idt.com> wrote:
> >> > With item #1 above being a separate topic, I may have a problem with #2
> >> > as well: dma_addr_t is sized for the local platform and not guaranteed
> >> > to be a 64-bit value (which may be required by a target).
> >> > Agree with #3 (if #1 and #2 work).
> >> >
> >> Perhaps simply change dma_addr_t to u64 in dmaengine.h alone ?
> >
> > That's just an idiotic suggestion - there's no other way to put that.
> > Let's have some sanity here.
> >
> Yeah, I am not proud of the workaround, so I only probed the option.
> I think I need to explain myself.
> 
> The case here is that even a 32-bit RapidIO host could ask transfer against
> 64-bit address space on a remote device. And vice versa 64->32.
> 
> > dma_addr_t is the size of a DMA address for the CPU architecture being
> > built.  This has no relationship to what any particular DMA engine uses.
> >
> Yes, so far the dmaengine ever only needed to transfer within platform's
> address-space. So the assumption that src and dst addresses could
> be contained within dma_addr_t, worked.
> If the damengine is to get rid of that assumption/constraint, the memcpy,
> slave_sg etc need to accept addresses specified in bigger of the host and
> remote address space, and u64 is the safe option.
> Ultimately dma_addr_t is either u32 or u64.

Let me spell it out:

1. Data structures read by the DMA engine hardware should not be defined
   using the 'dma_addr_t' type, but one of the [bl]e{8,16,32,64} types,
   or at a push the u{8,16,32,64} types if they're always host-endian.

   This helps to ensure that the layout of the structures read by the
   hardware is less dependent on the host architecture and each element
   is appropriately sized (and, with sparse and the endian-sized types,
   can be endian-checked at compile time.)

2. dma_addr_t is the size of the DMA address for the host architecture.
   This may be 32-bit or 64-bit depending on the host architecture.

The following points are my opinion:

3. For architectures where there are only 32-bit DMA addresses, dma_addr_t
   will be a 32-bit type.  For architectures where there are 64-bit DMA
   addresses, it will be a 64-bit type.

4. If RIO can accept 64-bit DMA addresses but is only connected to 32-bit
   busses, then the top 32 address bits are not usable (it's truncated in
   hardware.)  So there's no point passing around a 64-bit DMA address.

5. In the case of a 64-bit dma_addr_t and a 32-bit DMA engine host being
   asked to transfer >= 4GB, this needs error handling in the DMA engine
   driver (I don't think it's checked for - I know amba-pl08x doesn't.)

6. 32-bit dma_addr_t with 64-bit DMA address space is a problem and is
   probably a bug in itself - the platform should be using a 64-bit
   dma_addr_t in this case.  (see 3.)

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-18  9:49                                                             ` Russell King
@ 2011-10-18 11:50                                                               ` Jassi Brar
  2011-10-18 11:59                                                                 ` Russell King
  2011-10-18 17:57                                                                 ` Bounine, Alexandre
  2011-10-18 17:26                                                               ` Bounine, Alexandre
  1 sibling, 2 replies; 131+ messages in thread
From: Jassi Brar @ 2011-10-18 11:50 UTC (permalink / raw)
  To: Russell King
  Cc: Bounine, Alexandre, Williams, Dan J, Vinod Koul, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On 18 October 2011 15:19, Russell King <rmk@arm.linux.org.uk> wrote:

>> >> > With item #1 above being a separate topic, I may have a problem with #2
>> >> > as well: dma_addr_t is sized for the local platform and not guaranteed
>> >> > to be a 64-bit value (which may be required by a target).
>> >> > Agree with #3 (if #1 and #2 work).
>> >> >
>> >> Perhaps simply change dma_addr_t to u64 in dmaengine.h alone ?
>> >
>> > That's just an idiotic suggestion - there's no other way to put that.
>> > Let's have some sanity here.
>> >
>> Yeah, I am not proud of the workaround, so I only probed the option.
>> I think I need to explain myself.
>>
>> The case here is that even a 32-bit RapidIO host could ask transfer against
>> 64-bit address space on a remote device. And vice versa 64->32.
>>
>> > dma_addr_t is the size of a DMA address for the CPU architecture being
>> > built.  This has no relationship to what any particular DMA engine uses.
>> >
>> Yes, so far the dmaengine ever only needed to transfer within platform's
>> address-space. So the assumption that src and dst addresses could
>> be contained within dma_addr_t, worked.
>> If the damengine is to get rid of that assumption/constraint, the memcpy,
>> slave_sg etc need to accept addresses specified in bigger of the host and
>> remote address space, and u64 is the safe option.
>> Ultimately dma_addr_t is either u32 or u64.
>
> Let me spell it out:
>
> 1. Data structures read by the DMA engine hardware should not be defined
>   using the 'dma_addr_t' type, but one of the [bl]e{8,16,32,64} types,
>   or at a push the u{8,16,32,64} types if they're always host-endian.
>
>   This helps to ensure that the layout of the structures read by the
>   hardware are less dependent of the host architecture and each element
>   is appropriately sized (and, with sparse and the endian-sized types,
>   can be endian-checked at compile time.)
>
> 2. dma_addr_t is the size of the DMA address for the host architecture.
>   This may be 32-bit or 64-bit depending on the host architecture.
>
> The following points are my opinion:
>
> 3. For architectures where there are only 32-bit DMA addresses, dma_addr_t
>   will be a 32-bit type.  For architectures where there are 64-bit DMA
>   addresses, it will be a 64-bit type.
>
> 4. If RIO can accept 64-bit DMA addresses but is only connected to 32-bit
>   busses, then the top 32 address bits are not usable (it's truncated in
>   hardware.)  So there's no point passing around a 64-bit DMA address.
>
> 5. In the case of a 64-bit dma_addr_t and a 32-bit DMA engine host being
>   asked to transfer >= 4GB, this needs error handing in the DMA engine
>   driver (I don't think its checked for - I know amba-pl08x doesn't.)
>
> 6. 32-bit dma_addr_t with 64-bit DMA address space is a problem and is
>   probably a bug in itself - the platform should be using a 64-bit
>   dma_addr_t in this case.  (see 3.)
>
Thanks for the detailed explanation.

RapidIO is a packet-switched interconnect with a parallel or serial
interface. Among other things, a packet contains a 32-, 48- or 64-bit
offset into the remote endpoint's address space. So I don't get how
any of the above 6 points apply here.

I agree it is peculiar for a networking technology to expose a
DMAEngine interface, but I assume Alex, who knows RIO better than us,
has good reasons for it.

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-18 11:50                                                               ` Jassi Brar
@ 2011-10-18 11:59                                                                 ` Russell King
  2011-10-18 17:57                                                                 ` Bounine, Alexandre
  1 sibling, 0 replies; 131+ messages in thread
From: Russell King @ 2011-10-18 11:59 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Bounine, Alexandre, Williams, Dan J, Vinod Koul, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On Tue, Oct 18, 2011 at 05:20:30PM +0530, Jassi Brar wrote:
> On 18 October 2011 15:19, Russell King <rmk@arm.linux.org.uk> wrote:
> 
> >> >> > With item #1 above being a separate topic, I may have a problem with #2
> >> >> > as well: dma_addr_t is sized for the local platform and not guaranteed
> >> >> > to be a 64-bit value (which may be required by a target).
> >> >> > Agree with #3 (if #1 and #2 work).
> >> >> >
> >> >> Perhaps simply change dma_addr_t to u64 in dmaengine.h alone ?
> >> >
> >> > That's just an idiotic suggestion - there's no other way to put that.
> >> > Let's have some sanity here.
> >> >
> >> Yeah, I am not proud of the workaround, so I only probed the option.
> >> I think I need to explain myself.
> >>
> >> The case here is that even a 32-bit RapidIO host could ask transfer against
> >> 64-bit address space on a remote device. And vice versa 64->32.
> >>
> >> > dma_addr_t is the size of a DMA address for the CPU architecture being
> >> > built.  This has no relationship to what any particular DMA engine uses.
> >> >
> >> Yes, so far the dmaengine ever only needed to transfer within platform's
> >> address-space. So the assumption that src and dst addresses could
> >> be contained within dma_addr_t, worked.
> >> If the damengine is to get rid of that assumption/constraint, the memcpy,
> >> slave_sg etc need to accept addresses specified in bigger of the host and
> >> remote address space, and u64 is the safe option.
> >> Ultimately dma_addr_t is either u32 or u64.
> >
> > Let me spell it out:
> >
> > 1. Data structures read by the DMA engine hardware should not be defined
> >   using the 'dma_addr_t' type, but one of the [bl]e{8,16,32,64} types,
> >   or at a push the u{8,16,32,64} types if they're always host-endian.
> >
> >   This helps to ensure that the layout of the structures read by the
> >   hardware are less dependent of the host architecture and each element
> >   is appropriately sized (and, with sparse and the endian-sized types,
> >   can be endian-checked at compile time.)
> >
> > 2. dma_addr_t is the size of the DMA address for the host architecture.
> >   This may be 32-bit or 64-bit depending on the host architecture.
> >
> > The following points are my opinion:
> >
> > 3. For architectures where there are only 32-bit DMA addresses, dma_addr_t
> >   will be a 32-bit type.  For architectures where there are 64-bit DMA
> >   addresses, it will be a 64-bit type.
> >
> > 4. If RIO can accept 64-bit DMA addresses but is only connected to 32-bit
> >   busses, then the top 32 address bits are not usable (it's truncated in
> >   hardware.)  So there's no point passing around a 64-bit DMA address.
> >
> > 5. In the case of a 64-bit dma_addr_t and a 32-bit DMA engine host being
> >   asked to transfer >= 4GB, this needs error handing in the DMA engine
> >   driver (I don't think its checked for - I know amba-pl08x doesn't.)
> >
> > 6. 32-bit dma_addr_t with 64-bit DMA address space is a problem and is
> >   probably a bug in itself - the platform should be using a 64-bit
> >   dma_addr_t in this case.  (see 3.)
> >
> Thanks for the detailed explanation.
> 
> RapidIO is a packet switched interconnect with parallel or serial interface.
> Among other things, a packet contains 32, 48 or 64 bit offset into the
> remote-endpoint's address space. So I don't get how any of the above
> 6 points apply here.

So you can't see that redefining dma_addr_t to be 64-bit for the entire
DMA engine subsystem would be a bad idea...

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:

^ permalink raw reply	[flat|nested] 131+ messages in thread

* RE: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-18  5:45                                                       ` Jassi Brar
  2011-10-18  7:42                                                         ` Russell King
@ 2011-10-18 13:51                                                         ` Bounine, Alexandre
  2011-10-18 14:54                                                           ` Jassi Brar
  1 sibling, 1 reply; 131+ messages in thread
From: Bounine, Alexandre @ 2011-10-18 13:51 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Williams, Dan J, Vinod Koul, Russell King, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On Tue, Oct 18, 2011 at 1:45 AM, Jassi Brar <jaswinder.singh@linaro.org> wrote:
> 
> On 18 October 2011 02:37, Bounine, Alexandre
> <Alexandre.Bounine@idt.com> wrote:
> > On Mon, Oct 17, 2011 at 3:30 PM, Jassi Brar
> <jaswinder.singh@linaro.org>
> > wrote:
> > ... skip ...
> >> > This is a source of the problem for RIO - DMA controller driver
> >> creates
> >> > virtual channels statically. RapidIO may use 8- or 16-bit destID.
> >> > In this case we need to create 256 or 64K virtual channels if we
> >> > want to cover all possible targets on single RIO port. Adding an
> >> extra
> >> > controller/net multiplies that number. Considering that not every
> >> > device will need a data transfer from a given node static
> allocation
> >> > will
> >> > create even more wasted resources.
> >> >
> >> Please excuse my rudimentary knowledge of RapidIO but I am tempted
> >> to ask why not register channels only for those targets that are
> >> actually
> >> detected and enumerated?
> >>
> > Two reasons:
> > - possibility of hot device insertion/removal
> ... but the linux RIO stack doesn't seem to support hotplug.
> Enumeration/discovery is done only once during boot.
> Am I overlooking something ?
> 
You are right about the current status of enumeration/discovery.
Hotplug support is in the updates pipeline and will be added in the
planned order for RIO patches. There are some other patches that
should go ahead of it (e.g. DMA ;) ).
Device insertion/removal notification is already in place but is
not handled yet.

We do not want to have a RapidIO DMA implementation that may become
obsolete in six months. In addition, there are some users'
implementations of RapidIO hotplug which we should try not to break
if they add DMA.

> > - there is no advance knowledge of which target device may require
> DMA
> >  service. A device driver for the particular target device is
> expected
> >  to request DMA service if required.
> >
> IMHO 1 channel per real device is an acceptable 'overhead'.
> Already many SoCs register dozens of channels but only a couple
> of them are actually used.
>
Is 64K virtual channels per RIO port acceptable?
Freescale's QorIQ P4080 SoC has two RIO controllers.
Designs with Tsi721 may have even more: up to 4 is a reasonable estimate.
 
> >> > There is nothing that absence of full 66-bit addressing blocks
> now.
> >> > So far we are not aware about implementations that use 66-bit
> >> address.
> >> >
> >> Thanks for the important info.
> >> If I were you, I would postpone enabling support for 66-bit
> addressing
> >> esp when it soo affects the dmaengine API.
> >> Otherwise, you don't just code unused feature, but also put
> > constraints
> >> on development/up-gradation of the API in future, possibly, nearer
> > than
> >> real need of the feature.
> >>
> >> If we postpone 66-bit addressing to when it arrives, we can
> >> 1) Attach destID to the virtual channel's identity
> >> 2) Use device_prep_dma_memcpy so as to be able to change
> >>     target address for every transfer. Or use prep_slave, depending
> >>     upon nature of address at target endpoint.
> >> 3) Use slave_config to set wr_type if it remains same for enough
> >>     consecutive transfers to the same target (only you could strike
> >>     the balance between idealism and pragmatism).
> >>
> > With item #1 above being a separate topic, I may have a problem with
> #2
> > as well: dma_addr_t is sized for the local platform and not
> guaranteed
> > to be a 64-bit value (which may be required by a target).
> > Agree with #3 (if #1 and #2 work).
> >
> Perhaps simply change dma_addr_t to u64 in dmaengine.h alone ?
> 
Adding an extra parameter to prep_slave_sg() looks much better to me.
That tweak for prep_dma_memcpy() may be more unsafe than the
extra param in prep_slave_sg().
Plus dma_memcpy does not fit logically for RIO as well as dma_slave.
For RIO we have only one buffer in local memory (as slave).
We just need to pass more info to the transfer prep routine.
 
> >> > This does not prevent someone from designing RIO compliant
> endpoint
> >> > device which gives interpretation to these two bits in addition to
> >> full 64-bit
> >> > addressing of their platform.
> >> >
> >> It sounds as if the 2bits are 'Vendor-Specific' ?
> >>
> > They are not 'Vendor-Specific' in the RIO spec (just address),
> > but HW vendors may give them such meaning. I just wanted to say
> > that HW designers may follow the same logic as we do here and
> > decide to give those bits a special meaning because "no one uses
> > them".
> >
> We should discount that. Irrespective of linux RIO stack, a vendor
> would be at fault if it assigns different meaning to the upper 2bits
> while the specs deem them just MSB of 66-bit addresses.
> 
Not a big deal. We may limit RIO address to 64-bits if this is needed
for the DMA API adopted for RIO. Let's define the right API first ;). 

> > So far this is a theoretical possibility and we are not
> > aware about any designs of this type. We may put a big warning
> > note about 64-bit limitation in RapidIO documentation section.
> >
> Yes, please.


^ permalink raw reply	[flat|nested] 131+ messages in thread

* RE: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-18  8:37                                                               ` Jassi Brar
@ 2011-10-18 14:44                                                                 ` Bounine, Alexandre
  0 siblings, 0 replies; 131+ messages in thread
From: Bounine, Alexandre @ 2011-10-18 14:44 UTC (permalink / raw)
  To: Jassi Brar, Vinod Koul
  Cc: Russell King, Williams, Dan J, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On Tue, Oct 18, 2011 at 4:38 AM, Jassi Brar <jaswinder.singh@linaro.org>
wrote:
> 
> On 18 October 2011 13:56, Vinod Koul <vinod.koul@intel.com> wrote:
> > On Tue, 2011-10-18 at 14:00 +0530, Jassi Brar wrote:
> >> On 18 October 2011 13:12, Russell King <rmk@arm.linux.org.uk>
wrote:
> >> > On Tue, Oct 18, 2011 at 11:15:29AM +0530, Jassi Brar wrote:
> >> >> On 18 October 2011 02:37, Bounine, Alexandre
> <Alexandre.Bounine@idt.com> wrote:
> >> >> > With item #1 above being a separate topic, I may have a
problem
> with #2
> >> >> > as well: dma_addr_t is sized for the local platform and not
> guaranteed
> >> >> > to be a 64-bit value (which may be required by a target).
> >> >> > Agree with #3 (if #1 and #2 work).
> >> >> >
> >> >> Perhaps simply change dma_addr_t to u64 in dmaengine.h alone ?
> >> >
> >> > That's just an idiotic suggestion - there's no other way to put
> that.
> >> > Let's have some sanity here.
> >> >
> >> Yeah, I am not proud of the workaround, so I only probed the
option.
> >> I think I need to explain myself.
> >>
> >> The case here is that even a 32-bit RapidIO host could ask transfer
> against
> >> 64-bit address space on a remote device. And vice versa 64->32.
> > I thought RIO address were always 64 + 2 bits, irrespective of what
> the
> > host system is...
> >
> No, not always. RIO address could be 32, 48 or 64... with the role
> extra 2 bits
> not very clear.
RapidIO specification defines three possible address sizes for
Processing Elements: 34, 50 and 66 bits.

There is no special meaning for the upper 2 bits that do not fit into
the traditional 32- and 64-bit sizes, and the best way to deal with
them is to just forget this "+2" notion.

Processing Elements report their ability to support (generate and
decode) the specified address field sizes (with 34-bit address support
being mandatory for all PEs).

Even a platform with a 32-bit dma_addr_t may be capable of generating
RIO requests with all RIO address ranges defined by the RIO
specification, e.g. any platform that has PCI Express with one or more
Tsi721. The required RIO request address size will be defined by the
remote target's mapping/implementation.

Using the dma_memcpy API does not fit well into the RIO system model
because one of the memory locations is outside of the local memory
map. This may be a source or a destination buffer.

Within the existing DMA API only dma_slave fits well into the RIO
model, but it is unable to pass additional information that does not
fit into the MEM<->PORT model.

Alex.
      

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-18 13:51                                                         ` Bounine, Alexandre
@ 2011-10-18 14:54                                                           ` Jassi Brar
  2011-10-18 15:15                                                             ` Bounine, Alexandre
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-10-18 14:54 UTC (permalink / raw)
  To: Bounine, Alexandre
  Cc: Williams, Dan J, Vinod Koul, Russell King, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On 18 October 2011 19:21, Bounine, Alexandre <Alexandre.Bounine@idt.com> wrote:

>> > - there is no advance knowledge of which target device may require
>> DMA
>> >  service. A device driver for the particular target device is
>> expected
>> >  to request DMA service if required.
>> >
>> IMHO 1 channel per real device is an acceptable 'overhead'.
>> Already many SoCs register dozens of channels but only a couple
>> of them are actually used.
>>
> Is 64K of virtual channels per RIO port is acceptable?
>
I said 1 channel per _real_ device after enumeration, assuming the
status quo of no hotplug support. But you plan to implement hotplug
soon, so that kills this option.
Btw, is RIO hotplug usage gonna look like USB or PCI?

>> >> > There is nothing that absence of full 66-bit addressing blocks
>> now.
>> >> > So far we are not aware about implementations that use 66-bit
>> >> address.
>> >> >
>> >> Thanks for the important info.
>> >> If I were you, I would postpone enabling support for 66-bit
>> addressing
>> >> esp when it soo affects the dmaengine API.
>> >> Otherwise, you don't just code unused feature, but also put
>> > constraints
>> >> on development/up-gradation of the API in future, possibly, nearer
>> > than
>> >> real need of the feature.
>> >>
>> >> If we postpone 66-bit addressing to when it arrives, we can
>> >> 1) Attach destID to the virtual channel's identity
>> >> 2) Use device_prep_dma_memcpy so as to be able to change
>> >>     target address for every transfer. Or use prep_slave, depending
>> >>     upon nature of address at target endpoint.
>> >> 3) Use slave_config to set wr_type if it remains same for enough
>> >>     consecutive transfers to the same target (only you could strike
>> >>     the balance between idealism and pragmatism).
>> >>
>> > With item #1 above being a separate topic, I may have a problem with
>> #2
>> > as well: dma_addr_t is sized for the local platform and not
>> guaranteed
>> > to be a 64-bit value (which may be required by a target).
>> > Agree with #3 (if #1 and #2 work).
>> >
>> Perhaps simply change dma_addr_t to u64 in dmaengine.h alone ?
>>
> Adding an extra parameter to prep_slave_sg() looks much better to me.
> That tweak for prep_dma_memcopy() may be more unsafe than the
> extra param in prep_slave_sg().
>
To me the idea of making an exception for RapidIO to add a new API
looks even better. Anyways... whatever the maintainers decide.

> Plus dma_memcopy does not fit logically for RIO as good as dma_slave.
> For RIO we have only one buffer in local memory (as slave).
> We just need to pass more info to the transfer prep routine.
>
From a client's POV a slave transfer is just a 'variation' of memcpy
after the underlying channel has been configured appropriately. More
so when even some Mem->Mem dmacs support physical channel
configuration tweaking (PL330 in Samsung's SoCs does, at least).
So in an ideal world, there could be one generic 'prepare' and an
optional 'slave_config' callback.
OTOH I suspect I am overlooking something serious, since nobody ever
tried doing it. But this is a topic for a different discussion.

^ permalink raw reply	[flat|nested] 131+ messages in thread

* RE: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-18 14:54                                                           ` Jassi Brar
@ 2011-10-18 15:15                                                             ` Bounine, Alexandre
  0 siblings, 0 replies; 131+ messages in thread
From: Bounine, Alexandre @ 2011-10-18 15:15 UTC (permalink / raw)
  To: Jassi Brar
  Cc: Williams, Dan J, Vinod Koul, Russell King, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On Tue, Oct 18, 2011 at 10:55 AM, Jassi Brar <jaswinder.singh@linaro.org> wrote:
> 
> On 18 October 2011 19:21, Bounine, Alexandre
> <Alexandre.Bounine@idt.com> wrote:
> 
> >> > - there is no advance knowledge of which target device may require
> >> DMA
> >> >  service. A device driver for the particular target device is
> >> expected
> >> >  to request DMA service if required.
> >> >
> >> IMHO 1 channel per real device is an acceptable 'overhead'.
> >> Already many SoCs register dozens of channels but only a couple
> >> of them are actually used.
> >>
> > Is 64K of virtual channels per RIO port is acceptable?
> >
> I said 1 channel per _real_ device after enumeration, assuming status
> quo
> of no hotplug support. But you plan to implement hotplug soon, so that
> kills
> this option.
> Btw, is the RIO hotplug usage gonna look like USB or PCI?
I would say closer to USB, with surprise insertion/removal.
But nothing is final here yet. There are some changes to
enumeration/discovery that we would like to see first.

> 
> >> >> > There is nothing that absence of full 66-bit addressing blocks
> >> now.
> >> >> > So far we are not aware about implementations that use 66-bit
> >> >> address.
> >> >> >
> >> >> Thanks for the important info.
> >> >> If I were you, I would postpone enabling support for 66-bit
> >> addressing
> >> >> esp when it soo affects the dmaengine API.
> >> >> Otherwise, you don't just code unused feature, but also put
> >> > constraints
> >> >> on development/up-gradation of the API in future, possibly,
> nearer
> >> > than
> >> >> real need of the feature.
> >> >>
> >> >> If we postpone 66-bit addressing to when it arrives, we can
> >> >> 1) Attach destID to the virtual channel's identity
> >> >> 2) Use device_prep_dma_memcpy so as to be able to change
> >> >>     target address for every transfer. Or use prep_slave,
> depending
> >> >>     upon nature of address at target endpoint.
> >> >> 3) Use slave_config to set wr_type if it remains same for enough
> >> >>     consecutive transfers to the same target (only you could
> strike
> >> >>     the balance between idealism and pragmatism).
> >> >>
> >> > With item #1 above being a separate topic, I may have a problem
> with
> >> #2
> >> > as well: dma_addr_t is sized for the local platform and not
> >> guaranteed
> >> > to be a 64-bit value (which may be required by a target).
> >> > Agree with #3 (if #1 and #2 work).
> >> >
> >> Perhaps simply change dma_addr_t to u64 in dmaengine.h alone ?
> >>
> > Adding an extra parameter to prep_slave_sg() looks much better to me.
> > That tweak for prep_dma_memcopy() may be more unsafe than the
> > extra param in prep_slave_sg().
> >
> To me the idea of making exception for RapidIO to add a new API looks
> even better. Anyways... whatever maintainers decide.

Both hands up for this! 

> 
> > Plus dma_memcopy does not fit logically for RIO as good as dma_slave.
> > For RIO we have only one buffer in local memory (as slave).
> > We just need to pass more info to the transfer prep routine.
> >
> From a client's POV a slave transfer is just a 'variation' of memcpy
> after the underlying channel has been configured appropriately. More so
> when even some Mem->Mem dmacs support physical channel configuration
> tweaking (PL330 in Samsung's SoCs do at least)
> So in an ideal world, there could be one generic 'prepare' and an
> optional
> 'slave_config' callback.
> OTOH I suspect I am overlooking something serious since nobody ever
> tried doing it. But this is a topic of different discussion.

^ permalink raw reply	[flat|nested] 131+ messages in thread

* RE: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-18  9:49                                                             ` Russell King
  2011-10-18 11:50                                                               ` Jassi Brar
@ 2011-10-18 17:26                                                               ` Bounine, Alexandre
  2011-10-18 17:35                                                                 ` Russell King
  1 sibling, 1 reply; 131+ messages in thread
From: Bounine, Alexandre @ 2011-10-18 17:26 UTC (permalink / raw)
  To: Russell King, Jassi Brar
  Cc: Williams, Dan J, Vinod Koul, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On Tue, Oct 18, 2011 at 5:50 AM, Russell King <rmk@arm.linux.org.uk> wrote:
> 
> On Tue, Oct 18, 2011 at 02:00:45PM +0530, Jassi Brar wrote:
> > On 18 October 2011 13:12, Russell King <rmk@arm.linux.org.uk> wrote:
> > > On Tue, Oct 18, 2011 at 11:15:29AM +0530, Jassi Brar wrote:
> > >> On 18 October 2011 02:37, Bounine, Alexandre
> <Alexandre.Bounine@idt.com> wrote:
> > >> > With item #1 above being a separate topic, I may have a problem
> with #2
> > >> > as well: dma_addr_t is sized for the local platform and not
> guaranteed
> > >> > to be a 64-bit value (which may be required by a target).
> > >> > Agree with #3 (if #1 and #2 work).
> > >> >
> > >> Perhaps simply change dma_addr_t to u64 in dmaengine.h alone ?
> > >
> > > That's just an idiotic suggestion - there's no other way to put
> that.
> > > Let's have some sanity here.
> > >
> > Yeah, I am not proud of the workaround, so I only probed the option.
> > I think I need to explain myself.
> >
> > The case here is that even a 32-bit RapidIO host could ask transfer
> against
> > 64-bit address space on a remote device. And vice versa 64->32.
> >
> > > dma_addr_t is the size of a DMA address for the CPU architecture
> being
> > > built.  This has no relationship to what any particular DMA engine
> uses.
> > >
> > Yes, so far the dmaengine ever only needed to transfer within
> platform's
> > address-space. So the assumption that src and dst addresses could
> > be contained within dma_addr_t, worked.
> > If the damengine is to get rid of that assumption/constraint, the
> memcpy,
> > slave_sg etc need to accept addresses specified in bigger of the host
> and
> > remote address space, and u64 is the safe option.
> > Ultimately dma_addr_t is either u32 or u64.
> 
> Let me spell it out:
> 
> 1. Data structures read by the DMA engine hardware should not be
> defined
>    using the 'dma_addr_t' type, but one of the [bl]e{8,16,32,64} types,
>    or at a push the u{8,16,32,64} types if they're always host-endian.
> 
>    This helps to ensure that the layout of the structures read by the
>    hardware are less dependent of the host architecture and each
> element
>    is appropriately sized (and, with sparse and the endian-sized types,
>    can be endian-checked at compile time.)
> 
> 2. dma_addr_t is the size of the DMA address for the host architecture.
>    This may be 32-bit or 64-bit depending on the host architecture.
> 
> The following points are my opinion:
> 
> 3. For architectures where there are only 32-bit DMA addresses,
> dma_addr_t
>    will be a 32-bit type.  For architectures where there are 64-bit DMA
>    addresses, it will be a 64-bit type.
> 
> 4. If RIO can accept 64-bit DMA addresses but is only connected to 32-
> bit
>    busses, then the top 32 address bits are not usable (it's truncated
> in
>    hardware.)  So there's no point passing around a 64-bit DMA address.

Existing RapidIO controllers offer an address translation mechanism
for inbound RIO requests which translates a RIO address into the
address of a mapped endpoint's local buffer. Some less-generic
endpoints may also use upper address bits for their own
implementation-specific mapping, and this is absolutely legal from the
RIO POV. An example of similar use is the PCI bus, where upper AD
lines are used for device selection during PCI configuration cycles.

So RIO addresses of any size are not blindly truncated and cannot be
ignored.
  
> 
> 5. In the case of a 64-bit dma_addr_t and a 32-bit DMA engine host
> being
>    asked to transfer >= 4GB, this needs error handing in the DMA engine
>    driver (I don't think its checked for - I know amba-pl08x doesn't.)
> 
> 6. 32-bit dma_addr_t with 64-bit DMA address space is a problem and is
>    probably a bug in itself - the platform should be using a 64-bit
>    dma_addr_t in this case.  (see 3.)

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-18 17:26                                                               ` Bounine, Alexandre
@ 2011-10-18 17:35                                                                 ` Russell King
  2011-10-18 17:53                                                                   ` Jassi Brar
  0 siblings, 1 reply; 131+ messages in thread
From: Russell King @ 2011-10-18 17:35 UTC (permalink / raw)
  To: Bounine, Alexandre
  Cc: Jassi Brar, Williams, Dan J, Vinod Koul, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On Tue, Oct 18, 2011 at 10:26:56AM -0700, Bounine, Alexandre wrote:
> On Tue, Oct 18, 2011 at 5:50 AM, Russell King <rmk@arm.linux.org.uk> wrote:
> > 3. For architectures where there are only 32-bit DMA addresses, dma_addr_t
> >    will be a 32-bit type.  For architectures where there are 64-bit DMA
> >    addresses, it will be a 64-bit type.
> > 
> > 4. If RIO can accept 64-bit DMA addresses but is only connected to 32-bit
> >    busses, then the top 32 address bits are not usable (it's truncated in
> >    hardware.)  So there's no point passing around a 64-bit DMA address.
> 
> Existing RapidIO controllers offer an address translation mechanism for
> inbound RIO requests which translates a RIO address into an address of the
> mapped endpoint's local buffer. Some less-generic endpoints may also use
> upper address bits for their own implementation-specific mapping, which is
> absolutely legal from the RIO POV. A similar example is the PCI bus, where
> upper AD lines are used for device selection during PCI configuration cycles.
> 
> So RIO addresses of any size are not blindly truncated and cannot be ignored.

See point 3 and point 6.

However, Jassi's suggestion that we redefine dma_addr_t to be 64-bit
for just the DMA engine code is utterly insane and unworkable.  We
can't have typedefs which change depending on which bits of code are
being built, potentially changing the layout of structures being passed
around inside the kernel.

> > 6. 32-bit dma_addr_t with 64-bit DMA address space is a problem and is
> >    probably a bug in itself - the platform should be using a 64-bit
> >    dma_addr_t in this case.  (see 3.)

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-18 17:35                                                                 ` Russell King
@ 2011-10-18 17:53                                                                   ` Jassi Brar
  0 siblings, 0 replies; 131+ messages in thread
From: Jassi Brar @ 2011-10-18 17:53 UTC (permalink / raw)
  To: Russell King
  Cc: Bounine, Alexandre, Williams, Dan J, Vinod Koul, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On 18 October 2011 23:05, Russell King <rmk@arm.linux.org.uk> wrote:
>
> However, Jassi's suggestion that we redefine dma_addr_t to be 64-bit
> for just the DMA engine code is utterly insane and unworkable.  We
> can't have typedefs which change depending on which bits of code are
> being built, potentially changing the layout of structures being passed
> around inside the kernel.
>
I am not sure you understood what I meant.
When I said
     "Perhaps simply change dma_addr_t to u64 in dmaengine.h alone"
I meant
      s/dma_addr_t/u64

^ permalink raw reply	[flat|nested] 131+ messages in thread

* RE: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-18 11:50                                                               ` Jassi Brar
  2011-10-18 11:59                                                                 ` Russell King
@ 2011-10-18 17:57                                                                 ` Bounine, Alexandre
  2011-10-24  3:49                                                                   ` Vinod Koul
  1 sibling, 1 reply; 131+ messages in thread
From: Bounine, Alexandre @ 2011-10-18 17:57 UTC (permalink / raw)
  To: Jassi Brar, Russell King
  Cc: Williams, Dan J, Vinod Koul, Barry Song, linux-kernel,
	DL-SHA-WorkGroupLinux, Dave Jiang

On Tue, Oct 18, 2011 at 7:51 AM, Jassi Brar <jaswinder.singh@linaro.org> wrote:
> 
> On 18 October 2011 15:19, Russell King <rmk@arm.linux.org.uk> wrote:
> 
> >> >> > With item #1 above being a separate topic, I may have a problem with #2
> >> >> > as well: dma_addr_t is sized for the local platform and not guaranteed
> >> >> > to be a 64-bit value (which may be required by a target).
> >> >> > Agree with #3 (if #1 and #2 work).
> >> >> >
> >> >> Perhaps simply change dma_addr_t to u64 in dmaengine.h alone ?
> >> >
> >> > That's just an idiotic suggestion - there's no other way to put that.
> >> > Let's have some sanity here.
> >> >
> >> Yeah, I am not proud of the workaround, so I only probed the option.
> >> I think I need to explain myself.
> >>
> >> The case here is that even a 32-bit RapidIO host could ask transfer against
> >> 64-bit address space on a remote device. And vice versa 64->32.
> >>
> >> > dma_addr_t is the size of a DMA address for the CPU architecture being
> >> > built.  This has no relationship to what any particular DMA engine uses.
> >> >
> >> Yes, so far the dmaengine ever only needed to transfer within platform's
> >> address-space. So the assumption that src and dst addresses could
> >> be contained within dma_addr_t, worked.
> >> If the dmaengine is to get rid of that assumption/constraint, the memcpy,
> >> slave_sg etc need to accept addresses specified in the bigger of the host
> >> and remote address space, and u64 is the safe option.
> >> Ultimately dma_addr_t is either u32 or u64.
> >
> > Let me spell it out:
> >
> > 1. Data structures read by the DMA engine hardware should not be defined
> >   using the 'dma_addr_t' type, but one of the [bl]e{8,16,32,64} types,
> >   or at a push the u{8,16,32,64} types if they're always host-endian.
> >
> >   This helps to ensure that the layout of the structures read by the
> >   hardware is less dependent on the host architecture and each element
> >   is appropriately sized (and, with sparse and the endian-sized types,
> >   can be endian-checked at compile time.)
> >
> > 2. dma_addr_t is the size of the DMA address for the host architecture.
> >   This may be 32-bit or 64-bit depending on the host architecture.
> >
> > The following points are my opinion:
> >
> > 3. For architectures where there are only 32-bit DMA addresses, dma_addr_t
> >   will be a 32-bit type.  For architectures where there are 64-bit DMA
> >   addresses, it will be a 64-bit type.
> >
> > 4. If RIO can accept 64-bit DMA addresses but is only connected to 32-bit
> >   busses, then the top 32 address bits are not usable (it's truncated in
> >   hardware.)  So there's no point passing around a 64-bit DMA address.
> >
> > 5. In the case of a 64-bit dma_addr_t and a 32-bit DMA engine host being
> >   asked to transfer >= 4GB, this needs error handling in the DMA engine
> >   driver (I don't think it's checked for - I know amba-pl08x doesn't.)
> >
> > 6. 32-bit dma_addr_t with 64-bit DMA address space is a problem and is
> >   probably a bug in itself - the platform should be using a 64-bit
> >   dma_addr_t in this case.  (see 3.)
> >
> Thanks for the detailed explanation.
> 
> RapidIO is a packet switched interconnect with parallel or serial
> interface.
> Among other things, a packet contains 32, 48 or 64 bit offset into the
> remote-endpoint's address space. So I don't get how any of the above
> 6 points apply here.
> 
> Though I agree it is peculiar for a networking technology to expose a
> DMAEngine interface, I assume Alex, who knows RIO better than us, has
> good reasons for it.
To keep it simple, look at this as peer-to-peer networking with the HW
ability to directly address memory of the link partner.

RapidIO supports messaging which is closer to traditional networking and is
supported by RIONET driver (pretending to be Ethernet). But in some situations
messaging cannot be used. In these cases addressed memory read/write
operations take place.

I would like to offer a simple example of a RIO-based system that may help
to understand our DMA requirements.

Consider a platform with one host CPU and several DSP cards connected
to it through a switched backplane (transparent for the purpose of this example).

The host CPU has one or more RIO-capable DMA channels and runs device
drivers for the connected DSP cards. Each device driver is required to load
an individual program code into the corresponding DSP(s). Directly addressed
writes make a lot of sense.

After the DSP code is loaded, device drivers start the DSP program and may
participate in data transfers between DSP cards and the host CPU. Again,
messaging-type transfers may add unnecessary overhead here compared
to direct data reads/writes.

Configuration of each DSP card may be different, but from the host's
POV each is RIO spec compliant.



^ permalink raw reply	[flat|nested] 131+ messages in thread

* RE: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-18 17:57                                                                 ` Bounine, Alexandre
@ 2011-10-24  3:49                                                                   ` Vinod Koul
  2011-10-24 12:36                                                                     ` Bounine, Alexandre
  0 siblings, 1 reply; 131+ messages in thread
From: Vinod Koul @ 2011-10-24  3:49 UTC (permalink / raw)
  To: Bounine, Alexandre
  Cc: Jassi Brar, Russell King, Williams, Dan J, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On Tue, 2011-10-18 at 10:57 -0700, Bounine, Alexandre wrote:

Sorry for delayed response, been quite busy with upcoming festival, and
other stuff  :(
> > >
> > Thanks for the detailed explanation.
> > 
> > RapidIO is a packet switched interconnect with parallel or serial
> > interface.
> > Among other things, a packet contains 32, 48 or 64 bit offset into the
> > remote-endpoint's address space. So I don't get how any of the above
> > 6 points apply here.
> > 
> > Though I agree it is peculiar for a networking technology to expose a
> > DMAEngine interface. But I assume Alex has good reasons for it, who
> > knows RIO better than us.
> To keep it simple, look at this as a peer-to-peer networking with HW ability
> to directly address memory of the link partner.
> 
> RapidIO supports messaging which is closer to traditional networking and is
> supported by RIONET driver (pretending to be Ethernet). But in some situations
> messaging cannot be used. In these cases addressed memory read/write
> operations take place.
> 
> I would like to offer a simple example of a RIO-based system that may help
> to understand our DMA requirements.
> 
> Consider a platform with one host CPU and several DSP cards connected
> to it through a switched backplane (transparent for the purpose of this example).
> 
> The host CPU has one or more RIO-capable DMA channels and runs device
> drivers for the connected DSP cards. Each device driver is required to load
> an individual program code into the corresponding DSP(s). Directly addressed
> writes make a lot of sense.
> 
> After the DSP code is loaded, device drivers start the DSP program and may
> participate in data transfers between DSP cards and the host CPU. Again,
> messaging-type transfers may add unnecessary overhead here compared
> to direct data reads/writes.
> 
> Configuration of each DSP card may be different, but from the host's
> POV each is RIO spec compliant.

I think we all agree that this fits the dma_slave case :)

As for changing dmaengine to u64: if we are thinking of this as slave
usage, then ideally we should not make assumptions about the peripheral's
address type, so we should only move the dma_slave_config address fields
to u64, if that helps the RIO case. Moving other usages would be insane.

At this point we have two proposals:
a) make RIO an exceptional case and add RIO-specific stuff.
b) keep dmaengine transparent and add an additional, subsystem-dependent
argument to the .device_prep_slave_sg() callback. Current dmacs, and
those that don't need it, will ignore it.

ATM, I am leaning towards the latter, for the main reason of keeping
dmaengine away from subsystem details.

-- 
~Vinod


^ permalink raw reply	[flat|nested] 131+ messages in thread

* RE: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-24  3:49                                                                   ` Vinod Koul
@ 2011-10-24 12:36                                                                     ` Bounine, Alexandre
  2011-10-24 15:27                                                                       ` Vinod Koul
  0 siblings, 1 reply; 131+ messages in thread
From: Bounine, Alexandre @ 2011-10-24 12:36 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Jassi Brar, Russell King, Williams, Dan J, Barry Song,
	linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang


On Sun, Oct 23, 2011 at 11:49 PM, Vinod Koul <vinod.koul@intel.com> wrote:
> 
> On Tue, 2011-10-18 at 10:57 -0700, Bounine, Alexandre wrote:
> 
> > I would like to offer a simple example of a RIO-based system that may
> > help to understand our DMA requirements.
> >
> > Consider a platform with one host CPU and several DSP cards connected
> > to it through a switched backplane (transparent for the purpose of this
> > example).
> >
> > The host CPU has one or more RIO-capable DMA channels and runs device
> > drivers for the connected DSP cards. Each device driver is required to
> > load an individual program code into the corresponding DSP(s). Directly
> > addressed writes make a lot of sense.
> >
> > After the DSP code is loaded, device drivers start the DSP program and
> > may participate in data transfers between DSP cards and the host CPU.
> > Again, messaging-type transfers may add unnecessary overhead here
> > compared to direct data reads/writes.
> >
> > Configuration of each DSP card may be different, but from the host's
> > POV each is RIO spec compliant.
> 
> I think we all agree that this fits the dma_slave case :)
> 
> As for changing dmaengine to u64: if we are thinking of this as slave
> usage, then ideally we should not make assumptions about the peripheral's
> address type, so we should only move the dma_slave_config address fields
> to u64, if that helps the RIO case. Moving other usages would be insane.
> 
> At this point we have two proposals:
> a) make RIO an exceptional case and add RIO-specific stuff.
> b) keep dmaengine transparent and add an additional, subsystem-dependent
> argument to the .device_prep_slave_sg() callback. Current dmacs, and
> those that don't need it, will ignore it.
> 
> ATM, I am leaning towards the latter, for the main reason of keeping
> dmaengine away from subsystem details.
> 
Both proposals will work for RapidIO, but the second option looks more
universal and may be used by other subsystems in the future.
My vote goes to option b).



^ permalink raw reply	[flat|nested] 131+ messages in thread

* RE: [PATCHv4] DMAEngine: Define interleaved transfer request api
  2011-10-24 12:36                                                                     ` Bounine, Alexandre
@ 2011-10-24 15:27                                                                       ` Vinod Koul
  0 siblings, 0 replies; 131+ messages in thread
From: Vinod Koul @ 2011-10-24 15:27 UTC (permalink / raw)
  To: Bounine, Alexandre, Russell King, Williams, Dan J
  Cc: Jassi Brar, Barry Song, linux-kernel, DL-SHA-WorkGroupLinux, Dave Jiang

On Mon, 2011-10-24 at 05:36 -0700, Bounine, Alexandre wrote:
> > I think we all agree that this fits the dma_slave case :)
> > 
> > As for changing dmaengine to u64: if we are thinking of this as slave
> > usage, then ideally we should not make assumptions about the peripheral's
> > address type, so we should only move the dma_slave_config address fields
> > to u64, if that helps the RIO case. Moving other usages would be insane.
> > 
> > At this point we have two proposals:
> > a) make RIO an exceptional case and add RIO-specific stuff.
> > b) keep dmaengine transparent and add an additional, subsystem-dependent
> > argument to the .device_prep_slave_sg() callback. Current dmacs, and
> > those that don't need it, will ignore it.
> > 
> > ATM, I am leaning towards the latter, for the main reason of keeping
> > dmaengine away from subsystem details.
> Both proposals will work for RapidIO, but the second option looks more
> universal and may be used by other subsystems in the future.
> My vote goes to option b).
Thanks for the vote :D

I would really like to hear from Dan, Jassi and Russell as well.


-- 
~Vinod


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] DMAEngine: Define generic transfer request api
  2011-09-15  6:43               ` Jassi Brar
  2011-09-15  6:49                 ` Barry Song
@ 2011-09-15  8:17                 ` Barry Song
  1 sibling, 0 replies; 131+ messages in thread
From: Barry Song @ 2011-09-15  8:17 UTC (permalink / raw)
  To: Jassi Brar
  Cc: dan.j.williams, sundaram, linus.walleij, vinod.koul, rmk+kernel,
	linux-omap, DL-SHA-WorkGroupLinux

2011/9/15 Jassi Brar <jaswinder.singh@linaro.org>:
> On 15 September 2011 12:01, Barry Song <21cnbao@gmail.com> wrote:
>> 2011/9/13 Barry Song <21cnbao@gmail.com>:
>>> 2011/9/13 Jassi Brar <jaswinder.singh@linaro.org>:
>>>> On 13 September 2011 13:16, Barry Song <21cnbao@gmail.com> wrote:
>>>>>> if test pass, to the patch, and even for the moment, to the API's idea
>>>>>> Acked-by: Barry Song <baohua.song@csr.com>
>>>>>
>>>>> one issue i noticed is with a device_prep_dma_genxfer, i don't need
>>>>> device_prep_slave_sg any more,
>>>> Yeah, the dmaengine would need to adapt to the fact that these
>>>> interleaved transfers could be Mem->Mem as well as Mem<->Dev
>>>> (even though yours could be only one type, but some dmacs could
>>>> do both).
>>>>
>>>>> How about:
>>>>>
>>>>>       BUG_ON(dma_has_cap(DMA_MEMCPY, device->cap_mask) &&
>>>>> -               !device->device_prep_dma_memcpy);
>>>>> +               !device->device_prep_dma_memcpy &&
>>>>> +               !device->device_prep_dma_genxfer);
>>>>>
>>>>>        BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
>>>>>  -               !device->device_prep_slave_sg);
>>>>> +               !device->device_prep_slave_sg &&
>>>>> +               !device->device_prep_dma_genxfer);
>>>>>
>>>> Seems ok, but please modify in a way you think is best and submit a patch
>>>> on top of this new api. Then it'll be easier to evaluate everything.
>>>
>>> i think it should be handled by this patch but not a new one.
>>
>> and i also think xfer_template is a bad name for a structure which is
>> an API. i'd like to add namespace for it and rename it to dma_genxfer.
>> or have any good suggestion?
> I think xfer_template is better - which stresses the usage as having prepared
> templates of transfers and only change src/dst address before submitting.
> 'device_prep_dma_genxfer' is the API which is already named so.
>
>> i'd like to send this together with "BUG_ON(dma_has_cap(DMA_SLAVE,
>> device->cap_mask) &&!device->device_prep_dma_genxfer)" as v2.
> Is there no change other than skipping check for SLAVE when using this api ?

another change i want to do is a simple xfer alloc helper, so that
every driver doesn't need a long line to alloc this struct with a
zero-length array:

struct xfer_template *alloc_xfer_template(size_t frame_size)
{
	return kzalloc(sizeof(struct xfer_template) +
		       sizeof(struct data_chunk) * frame_size, GFP_KERNEL);
}

Then client can fill
xt.sgl[0].size
xt.sgl[0].icg
xt.sgl[1].size
xt.sgl[1].icg
...
xt.sgl[x].size
xt.sgl[x].icg

but xfer_template and data_chunk will have namespace.

>
-barry
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] DMAEngine: Define generic transfer request api
  2011-09-15  6:43               ` Jassi Brar
@ 2011-09-15  6:49                 ` Barry Song
  2011-09-15  8:17                 ` Barry Song
  1 sibling, 0 replies; 131+ messages in thread
From: Barry Song @ 2011-09-15  6:49 UTC (permalink / raw)
  To: Jassi Brar
  Cc: dan.j.williams, sundaram, linus.walleij, vinod.koul, rmk+kernel,
	linux-omap, DL-SHA-WorkGroupLinux

2011/9/15 Jassi Brar <jaswinder.singh@linaro.org>:
> On 15 September 2011 12:01, Barry Song <21cnbao@gmail.com> wrote:
>> 2011/9/13 Barry Song <21cnbao@gmail.com>:
>>> 2011/9/13 Jassi Brar <jaswinder.singh@linaro.org>:
>>>> On 13 September 2011 13:16, Barry Song <21cnbao@gmail.com> wrote:
>>>>>> if test pass, to the patch, and even for the moment, to the API's idea
>>>>>> Acked-by: Barry Song <baohua.song@csr.com>
>>>>>
>>>>> one issue i noticed is with a device_prep_dma_genxfer, i don't need
>>>>> device_prep_slave_sg any more,
>>>> Yeah, the dmaengine would need to adapt to the fact that these
>>>> interleaved transfers could be Mem->Mem as well as Mem<->Dev
>>>> (even though yours could be only one type, but some dmacs could
>>>> do both).
>>>>
>>>>> How about:
>>>>>
>>>>>       BUG_ON(dma_has_cap(DMA_MEMCPY, device->cap_mask) &&
>>>>> -               !device->device_prep_dma_memcpy);
>>>>> +               !device->device_prep_dma_memcpy &&
>>>>> +               !device->device_prep_dma_genxfer);
>>>>>
>>>>>        BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
>>>>>  -               !device->device_prep_slave_sg);
>>>>> +               !device->device_prep_slave_sg &&
>>>>> +               !device->device_prep_dma_genxfer);
>>>>>
>>>> Seems ok, but please modify in a way you think is best and submit a patch
>>>> on top of this new api. Then it'll be easier to evaluate everything.
>>>
>>> i think it should be handled by this patch but not a new one.
>>
>> and i also think xfer_template is a bad name for a structure which is
>> an API. i'd like to add namespace for it and rename it to dma_genxfer.
>> or have any good suggestion?
> I think xfer_template is better - which stresses the usage as having prepared
> templates of transfers and only change src/dst address before submitting.
> 'device_prep_dma_genxfer' is the API which is already named so.

sorry, i can't agree with that.
device_prep_dma_genxfer is an API; xfer_template is a data structure
which will be seen by users, i.e. client drivers. It at least needs a
namespace. Otherwise, someone may someday add another xfer.

>
>> i'd like to send this together with "BUG_ON(dma_has_cap(DMA_SLAVE,
>> device->cap_mask) &&!device->device_prep_dma_genxfer)" as v2.
> Is there no change other than skipping check for SLAVE when using this api ?
>

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] DMAEngine: Define generic transfer request api
  2011-09-15  6:31             ` Barry Song
@ 2011-09-15  6:43               ` Jassi Brar
  2011-09-15  6:49                 ` Barry Song
  2011-09-15  8:17                 ` Barry Song
  0 siblings, 2 replies; 131+ messages in thread
From: Jassi Brar @ 2011-09-15  6:43 UTC (permalink / raw)
  To: Barry Song
  Cc: dan.j.williams, sundaram, linus.walleij, vinod.koul, rmk+kernel,
	linux-omap, DL-SHA-WorkGroupLinux

On 15 September 2011 12:01, Barry Song <21cnbao@gmail.com> wrote:
> 2011/9/13 Barry Song <21cnbao@gmail.com>:
>> 2011/9/13 Jassi Brar <jaswinder.singh@linaro.org>:
>>> On 13 September 2011 13:16, Barry Song <21cnbao@gmail.com> wrote:
>>>>> if test pass, to the patch, and even for the moment, to the API's idea
>>>>> Acked-by: Barry Song <baohua.song@csr.com>
>>>>
>>>> one issue i noticed is with a device_prep_dma_genxfer, i don't need
>>>> device_prep_slave_sg any more,
>>> Yeah, the dmaengine would need to adapt to the fact that these
>>> interleaved transfers could be Mem->Mem as well as Mem<->Dev
>>> (even though yours could be only one type, but some dmacs could
>>> do both).
>>>
>>>> How about:
>>>>
>>>>       BUG_ON(dma_has_cap(DMA_MEMCPY, device->cap_mask) &&
>>>> -               !device->device_prep_dma_memcpy);
>>>> +               !device->device_prep_dma_memcpy &&
>>>> +               !device->device_prep_dma_genxfer);
>>>>
>>>>        BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
>>>>  -               !device->device_prep_slave_sg);
>>>> +               !device->device_prep_slave_sg &&
>>>> +               !device->device_prep_dma_genxfer);
>>>>
>>> Seems ok, but please modify in a way you think is best and submit a patch
>>> on top of this new api. Then it'll be easier to evaluate everything.
>>
>> i think it should be handled by this patch but not a new one.
>
> and i also think xfer_template is a bad name for a structure which is
> an API. i'd like to add namespace for it and rename it to dma_genxfer.
> or have any good suggestion?
I think xfer_template is better - which stresses the usage as having prepared
templates of transfers and only change src/dst address before submitting.
'device_prep_dma_genxfer' is the API which is already named so.

> i'd like to send this together with "BUG_ON(dma_has_cap(DMA_SLAVE,
> device->cap_mask) &&!device->device_prep_dma_genxfer)" as v2.
Is there no change other than skipping check for SLAVE when using this api ?

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] DMAEngine: Define generic transfer request api
  2011-09-13  8:58           ` Barry Song
@ 2011-09-15  6:31             ` Barry Song
  2011-09-15  6:43               ` Jassi Brar
  0 siblings, 1 reply; 131+ messages in thread
From: Barry Song @ 2011-09-15  6:31 UTC (permalink / raw)
  To: Jassi Brar
  Cc: dan.j.williams, sundaram, linus.walleij, vinod.koul, rmk+kernel,
	linux-omap, DL-SHA-WorkGroupLinux

2011/9/13 Barry Song <21cnbao@gmail.com>:
> 2011/9/13 Jassi Brar <jaswinder.singh@linaro.org>:
>> On 13 September 2011 13:16, Barry Song <21cnbao@gmail.com> wrote:
>>>> if test pass, to the patch, and even for the moment, to the API's idea
>>>> Acked-by: Barry Song <baohua.song@csr.com>
>>>
>>> one issue i noticed is with a device_prep_dma_genxfer, i don't need
>>> device_prep_slave_sg any more,
>> Yeah, the dmaengine would need to adapt to the fact that these
>> interleaved transfers could be Mem->Mem as well as Mem<->Dev
>> (even though yours could be only one type, but some dmacs could
>> do both).
>>
>>> How about:
>>>
>>>       BUG_ON(dma_has_cap(DMA_MEMCPY, device->cap_mask) &&
>>> -               !device->device_prep_dma_memcpy);
>>> +               !device->device_prep_dma_memcpy &&
>>> +               !device->device_prep_dma_genxfer);
>>>
>>>        BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
>>>  -               !device->device_prep_slave_sg);
>>> +               !device->device_prep_slave_sg &&
>>> +               !device->device_prep_dma_genxfer);
>>>
>> Seems ok, but please modify in a way you think is best and submit a patch
>> on top of this new api. Then it'll be easier to evaluate everything.
>
> i think it should be handled by this patch but not a new one.

and i also think xfer_template is a bad name for a structure which is
part of the API. i'd like to add a namespace for it and rename it to
dma_genxfer. or does anyone have a better suggestion?

i'd like to send this together with "BUG_ON(dma_has_cap(DMA_SLAVE,
device->cap_mask) &&!device->device_prep_dma_genxfer)" as v2.

-barry

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] DMAEngine: Define generic transfer request api
  2011-09-13  8:43         ` Jassi Brar
@ 2011-09-13  8:58           ` Barry Song
  2011-09-15  6:31             ` Barry Song
  0 siblings, 1 reply; 131+ messages in thread
From: Barry Song @ 2011-09-13  8:58 UTC (permalink / raw)
  To: Jassi Brar
  Cc: dan.j.williams, sundaram, linus.walleij, vinod.koul, rmk+kernel,
	linux-omap, DL-SHA-WorkGroupLinux

2011/9/13 Jassi Brar <jaswinder.singh@linaro.org>:
> On 13 September 2011 13:16, Barry Song <21cnbao@gmail.com> wrote:
>>> if test pass, to the patch, and even for the moment, to the API's idea
>>> Acked-by: Barry Song <baohua.song@csr.com>
>>
>> one issue i noticed is with a device_prep_dma_genxfer, i don't need
>> device_prep_slave_sg any more,
> Yeah, the dmaengine would need to adapt to the fact that these
> interleaved transfers could be Mem->Mem as well as Mem<->Dev
> (even though yours could be only one type, but some dmacs could
> do both).
>
>> How about:
>>
>>       BUG_ON(dma_has_cap(DMA_MEMCPY, device->cap_mask) &&
>> -               !device->device_prep_dma_memcpy);
>> +               !device->device_prep_dma_memcpy &&
>> +               !device->device_prep_dma_genxfer);
>>
>>        BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
>>  -               !device->device_prep_slave_sg);
>> +               !device->device_prep_slave_sg &&
>> +               !device->device_prep_dma_genxfer);
>>
> Seems ok, but please modify in a way you think is best and submit a patch
> on top of this new api. Then it'll be easier to evaluate everything.

i think it should be handled by this patch but not a new one.

>
> thanks.
>

-barry

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] DMAEngine: Define generic transfer request api
  2011-09-13  7:46       ` Barry Song
@ 2011-09-13  8:43         ` Jassi Brar
  2011-09-13  8:58           ` Barry Song
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-09-13  8:43 UTC (permalink / raw)
  To: Barry Song
  Cc: dan.j.williams, sundaram, linus.walleij, vinod.koul, rmk+kernel,
	linux-omap, DL-SHA-WorkGroupLinux

On 13 September 2011 13:16, Barry Song <21cnbao@gmail.com> wrote:
>> if test pass, to the patch, and even for the moment, to the API's idea
>> Acked-by: Barry Song <baohua.song@csr.com>
>
> one issue i noticed is with a device_prep_dma_genxfer, i don't need
> device_prep_slave_sg any more,
Yeah, the dmaengine would need to adapt to the fact that these
interleaved transfers could be Mem->Mem as well as Mem<->Dev
(even though yours could be only one type, some dmacs could
do both).

> How about:
>
>       BUG_ON(dma_has_cap(DMA_MEMCPY, device->cap_mask) &&
> -               !device->device_prep_dma_memcpy);
> +               !device->device_prep_dma_memcpy &&
> +               !device->device_prep_dma_genxfer);
>
>        BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
>  -               !device->device_prep_slave_sg);
> +               !device->device_prep_slave_sg &&
> +               !device->device_prep_dma_genxfer);
>
Seems ok, but please modify in a way you think is best and submit a patch
on top of this new api. Then it'll be easier to evaluate everything.

thanks.


* Re: [PATCH] DMAEngine: Define generic transfer request api
  2011-09-13  1:21     ` Barry Song
@ 2011-09-13  7:46       ` Barry Song
  2011-09-13  8:43         ` Jassi Brar
  0 siblings, 1 reply; 131+ messages in thread
From: Barry Song @ 2011-09-13  7:46 UTC (permalink / raw)
  To: Jassi Brar
  Cc: dan.j.williams, sundaram, linus.walleij, vinod.koul, rmk+kernel,
	linux-omap, DL-SHA-WorkGroupLinux

2011/9/13 Barry Song <21cnbao@gmail.com>:
> 2011/9/13 Jassi Brar <jaswinder.singh@linaro.org>:
>> On 12 September 2011 21:56, Barry Song <21cnbao@gmail.com> wrote:
>>>> Define a new api that could be used for doing fancy data transfers
>>>> like interleaved to contiguous copy and vice-versa.
>>>> Traditional SG_list based transfers tend to be very inefficient in
>>>> such cases, where the interleave and chunk are only a few bytes,
>>>> which calls for a very condensed api to convey the pattern of the transfer.
>>>>
>>>> This api supports all 4 variants of scatter-gather and contiguous transfer.
>>>> Besides, it could also represent common operations like
>>>>        device_prep_dma_{cyclic, memset, memcpy}
>>>> and maybe some more that I am not sure of.
>>>>
>>>> Of course, this api cannot help transfers that don't lend themselves to
>>>> DMA by nature, i.e., scattered tiny read/writes with no periodic pattern.
>>>>
>>>> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
>>>
>>> Anyway, this API needs a real user to prove why it needs to exist.
>>>
>>> prima2 can be the 1st (or 2nd, if TI uses it) user of this API. Let's try
>>> to see what the driver will look like with this API. Then we might figure
>>> out more about what it should be.
>>>
>> Did you discover any issue with the api?
>
> Not until now, but I need to test it, as I said, since nobody else
> has used it before. So I'll just hold the formal ACK for the moment.
>
>> Because only three days ago you said
>> {
>> Jassi, you might think my reply as an ACK to "[PATCH] DMAEngine:
>> Define generic transfer request api".
>> }
>
> If the tests pass, then to the patch, and even for the moment, to the API's idea:
> Acked-by: Barry Song <baohua.song@csr.com>

One issue I noticed: with device_prep_dma_genxfer, I don't need
device_prep_slave_sg any more, so the validation check in
dma_async_device_register():

        BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
                !device->device_prep_slave_sg);

looks wrong to me.

How about:

       BUG_ON(dma_has_cap(DMA_MEMCPY, device->cap_mask) &&
-               !device->device_prep_dma_memcpy);
+               !device->device_prep_dma_memcpy &&
+               !device->device_prep_dma_genxfer);

        BUG_ON(dma_has_cap(DMA_SLAVE, device->cap_mask) &&
-               !device->device_prep_slave_sg);
+               !device->device_prep_slave_sg &&
+               !device->device_prep_dma_genxfer);

>
>>
>> The api met your requirements easily not because I knew them already,
>> but because I designed the api to be as generic as practically possible.

-barry


* Re: [PATCH] DMAEngine: Define generic transfer request api
  2011-09-12 16:54   ` Jassi Brar
@ 2011-09-13  1:21     ` Barry Song
  2011-09-13  7:46       ` Barry Song
  0 siblings, 1 reply; 131+ messages in thread
From: Barry Song @ 2011-09-13  1:21 UTC (permalink / raw)
  To: Jassi Brar
  Cc: dan.j.williams, sundaram, linus.walleij, vinod.koul, rmk+kernel,
	linux-omap, DL-SHA-WorkGroupLinux

2011/9/13 Jassi Brar <jaswinder.singh@linaro.org>:
> On 12 September 2011 21:56, Barry Song <21cnbao@gmail.com> wrote:
>>> Define a new api that could be used for doing fancy data transfers
>>> like interleaved to contiguous copy and vice-versa.
>>> Traditional SG_list based transfers tend to be very inefficient in
>>> such cases, where the interleave and chunk are only a few bytes,
>>> which calls for a very condensed api to convey the pattern of the transfer.
>>>
>>> This api supports all 4 variants of scatter-gather and contiguous transfer.
>>> Besides, it could also represent common operations like
>>>        device_prep_dma_{cyclic, memset, memcpy}
>>> and maybe some more that I am not sure of.
>>>
>>> Of course, this api cannot help transfers that don't lend themselves to
>>> DMA by nature, i.e., scattered tiny read/writes with no periodic pattern.
>>>
>>> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
>>
>> Anyway, this API needs a real user to prove why it needs to exist.
>>
>> prima2 can be the 1st (or 2nd, if TI uses it) user of this API. Let's try
>> to see what the driver will look like with this API. Then we might figure
>> out more about what it should be.
>>
> Did you discover any issue with the api?

Not until now, but I need to test it, as I said, since nobody else
has used it before. So I'll just hold the formal ACK for the moment.

> Because only three days ago you said
> {
> Jassi, you might think my reply as an ACK to "[PATCH] DMAEngine:
> Define generic transfer request api".
> }

If the tests pass, then to the patch, and even for the moment, to the API's idea:
Acked-by: Barry Song <baohua.song@csr.com>

>
> The api met your requirements easily not because I knew them already,
> but because I designed the api to be as generic as practically possible.
>

-barry


* Re: [PATCH] DMAEngine: Define generic transfer request api
  2011-09-12 16:26 ` [PATCH] DMAEngine: Define generic " Barry Song
@ 2011-09-12 16:54   ` Jassi Brar
  2011-09-13  1:21     ` Barry Song
  0 siblings, 1 reply; 131+ messages in thread
From: Jassi Brar @ 2011-09-12 16:54 UTC (permalink / raw)
  To: Barry Song
  Cc: dan.j.williams, sundaram, linus.walleij, vinod.koul, rmk+kernel,
	linux-omap, DL-SHA-WorkGroupLinux

On 12 September 2011 21:56, Barry Song <21cnbao@gmail.com> wrote:
>> Define a new api that could be used for doing fancy data transfers
>> like interleaved to contiguous copy and vice-versa.
>> Traditional SG_list based transfers tend to be very inefficient in
>> such cases, where the interleave and chunk are only a few bytes,
>> which calls for a very condensed api to convey the pattern of the transfer.
>>
>> This api supports all 4 variants of scatter-gather and contiguous transfer.
>> Besides, it could also represent common operations like
>>        device_prep_dma_{cyclic, memset, memcpy}
>> and maybe some more that I am not sure of.
>>
>> Of course, this api cannot help transfers that don't lend themselves to
>> DMA by nature, i.e., scattered tiny read/writes with no periodic pattern.
>>
>> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
>
> Anyway, this API needs a real user to prove why it needs to exist.
>
> prima2 can be the 1st (or 2nd, if TI uses it) user of this API. Let's try
> to see what the driver will look like with this API. Then we might figure
> out more about what it should be.
>
Did you discover any issue with the api?
Because only three days ago you said
{
Jassi, you might think my reply as an ACK to "[PATCH] DMAEngine:
Define generic transfer request api".
}

The api met your requirements easily not because I knew them already,
but because I designed the api to be as generic as practically possible.


* Re: [PATCH] DMAEngine: Define generic transfer request api
       [not found] <CAGsJ_4wXURUwbf-fcNOq1m5-NJ9+VMuDq+9OJpBjFZK4C_X3cw@mail.gmail.com>
@ 2011-09-12 16:26 ` Barry Song
  2011-09-12 16:54   ` Jassi Brar
  0 siblings, 1 reply; 131+ messages in thread
From: Barry Song @ 2011-09-12 16:26 UTC (permalink / raw)
  To: Jassi Brar
  Cc: dan.j.williams, sundaram, linus.walleij, vinod.koul, rmk+kernel,
	linux-omap, DL-SHA-WorkGroupLinux

> Define a new api that could be used for doing fancy data transfers
> like interleaved to contiguous copy and vice-versa.
> Traditional SG_list based transfers tend to be very inefficient in
> such cases, where the interleave and chunk are only a few bytes,
> which calls for a very condensed api to convey the pattern of the transfer.
>
> This api supports all 4 variants of scatter-gather and contiguous transfer.
> Besides, it could also represent common operations like
>        device_prep_dma_{cyclic, memset, memcpy}
> and maybe some more that I am not sure of.
>
> Of course, this api cannot help transfers that don't lend themselves to
> DMA by nature, i.e., scattered tiny read/writes with no periodic pattern.
>
> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>

Anyway, this API needs a real user to prove why it needs to exist.

prima2 can be the 1st (or 2nd, if TI uses it) user of this API. Let's try
to see what the driver will look like with this API. Then we might figure
out more about what it should be.

> ---
>  include/linux/dmaengine.h |   73 +++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 73 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index 8fbf40e..74f3ae0 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -76,6 +76,76 @@ enum dma_transaction_type {
>  /* last transaction type for creation of the capabilities mask */
>  #define DMA_TX_TYPE_END (DMA_CYCLIC + 1)
>
> +/**
> + * Generic Transfer Request
> + * ------------------------
> + * A chunk is a collection of contiguous bytes to be transferred.
> + * The gap (in bytes) between two chunks is called the inter-chunk-gap (ICG).
> + * ICGs may or may not change between chunks.
> + * A FRAME is the smallest series of contiguous {chunk,icg} pairs that,
> + *  when repeated an integral number of times, specifies the transfer.
> + * A transfer template is the specification of a Frame, the number of times
> + *  it is to be repeated, and other per-transfer attributes.
> + *
> + * Practically, a client driver would have ready a template for each
> + *  type of transfer it is going to need during its lifetime and
> + *  set only 'src_start' and 'dst_start' before submitting the requests.
> + *
> + *
> + *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
> + *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
> + *
> + *    ==  Chunk size
> + *    ... ICG
> + */
> +
> +/**
> + * struct data_chunk - Element of scatter-gather list that makes a frame.
> + * @size: Number of bytes to read from source.
> + *       size_dst := fn(op, size_src), so it doesn't mean much for the destination.
> + * @icg: Number of bytes to jump after last src/dst address of this
> + *      chunk and before first src/dst address for next chunk.
> + *      Ignored for dst (assumed 0), if dst_inc is true and dst_sgl is false.
> + *      Ignored for src (assumed 0), if src_inc is true and src_sgl is false.
> + */
> +struct data_chunk {
> +       size_t size;
> +       size_t icg;
> +};
> +
> +/**
> + * struct xfer_template - Template to convey DMAC the transfer pattern
> + *      and attributes.
> + * @op: The operation to perform on source data before writing it on
> + *      to destination address.
> + * @src_start: Bus address of source for the first chunk.
> + * @dst_start: Bus address of destination for the first chunk.
> + * @src_inc: If the source address increments after reading from it.
> + * @dst_inc: If the destination address increments after writing to it.
> + * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
> + *             Otherwise, source is read contiguously (icg ignored).
> + *             Ignored if src_inc is false.
> + * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
> + *             Otherwise, destination is filled contiguously (icg ignored).
> + *             Ignored if dst_inc is false.
> + * @frm_irq: If the client expects DMAC driver to do callback after each frame.
> + * @numf: Number of frames in this template.
> + * @frame_size: Number of chunks in a frame i.e, size of sgl[].
> + * @sgl: Array of {chunk,icg} pairs that make up a frame.
> + */
> +struct xfer_template {
> +       enum dma_transaction_type op;
> +       dma_addr_t src_start;
> +       dma_addr_t dst_start;
> +       bool src_inc;
> +       bool dst_inc;
> +       bool src_sgl;
> +       bool dst_sgl;
> +       bool frm_irq;
> +       size_t numf;
> +       size_t frame_size;
> +       struct data_chunk sgl[0];
> +};
>
>  /**
>  * enum dma_ctrl_flags - DMA flags to augment operation preparation,
> @@ -432,6 +502,7 @@ struct dma_tx_state {
>  * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
>  *     The function takes a buffer of size buf_len. The callback function will
>  *     be called after period_len bytes have been transferred.
> + * @device_prep_dma_genxfer: Prepare a transfer expressed in the generic way.
>  * @device_control: manipulate all pending operations on a channel, returns
>  *     zero or error code
>  * @device_tx_status: poll for transaction completion, the optional
> @@ -496,6 +567,8 @@ struct dma_device {
>        struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
>                struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
>                size_t period_len, enum dma_data_direction direction);
> +       struct dma_async_tx_descriptor *(*device_prep_dma_genxfer)(
> +               struct dma_chan *chan, struct xfer_template *xt);
>        int (*device_control)(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
>                unsigned long arg);
>
> --
> 1.7.4.1
>
-barry


end of thread, other threads:[~2011-10-24 15:36 UTC | newest]

Thread overview: 131+ messages
2011-08-12 11:14 [PATCH] DMAEngine: Define generic transfer request api Jassi Brar
2011-08-16 12:56 ` Koul, Vinod
2011-08-16 13:06   ` Linus Walleij
2011-08-19 13:43     ` Koul, Vinod
2011-08-19 14:19       ` Linus Walleij
2011-08-19 15:46         ` Jassi Brar
2011-08-19 17:28           ` Koul, Vinod
2011-08-19 18:45             ` Jassi Brar
2011-08-23 14:43       ` Matt Porter
2011-08-23 14:43         ` Matt Porter
2011-08-16 14:32   ` Jassi Brar
2011-09-15  7:46 ` [PATCHv2] " Jassi Brar
2011-09-15  8:22   ` Russell King
2011-09-15 10:02     ` Jassi Brar
2011-09-16  7:17   ` Barry Song
2011-09-16 11:03     ` Jassi Brar
2011-09-16  9:07   ` Vinod Koul
2011-09-16 12:30     ` Jassi Brar
2011-09-16 17:06       ` Vinod Koul
2011-09-16 17:51         ` Jassi Brar
2011-09-19  3:23           ` Vinod Koul
2011-09-20 12:12   ` [PATCHv3] DMAEngine: Define interleaved " Jassi Brar
2011-09-20 16:52     ` Vinod Koul
2011-09-20 18:08       ` Jassi Brar
2011-09-21  6:32         ` Vinod Koul
2011-09-21  6:45           ` Jassi Brar
2011-09-21  6:51             ` Vinod Koul
2011-09-21  7:31               ` Jassi Brar
2011-09-21 10:18                 ` Russell King
2011-09-21 15:21                   ` Jassi Brar
2011-09-28  6:39     ` [PATCHv4] " Jassi Brar
2011-09-28  9:03       ` Vinod Koul
2011-09-28 15:15         ` Jassi Brar
2011-09-29 11:17           ` Vinod Koul
2011-09-30  6:43             ` Barry Song
2011-09-30 16:01               ` Jassi Brar
2011-10-01  3:05                 ` Barry Song
2011-10-01 18:11                   ` Vinod Koul
2011-10-01 18:45                     ` Jassi Brar
2011-10-01 18:41                   ` Jassi Brar
2011-10-01 18:48                     ` Jassi Brar
2011-10-02  0:33                     ` Barry Song
2011-10-03  6:24                       ` Jassi Brar
2011-10-03 16:13                         ` Russell King
2011-10-03 16:19                           ` Jassi Brar
2011-10-03 17:15                             ` Williams, Dan J
2011-10-03 18:23                               ` Jassi Brar
2011-10-05 18:19                                 ` Williams, Dan J
2011-10-06  9:06                                   ` Jassi Brar
2011-10-05 18:14                             ` Williams, Dan J
2011-10-06  7:12                               ` Jassi Brar
2011-10-07  5:45                               ` Vinod Koul
2011-10-07 11:27                                 ` Jassi Brar
2011-10-07 14:19                                   ` Vinod Koul
2011-10-07 14:38                                     ` Jassi Brar
2011-10-10  6:53                                       ` Vinod Koul
2011-10-10  9:16                                         ` Jassi Brar
2011-10-10  9:18                                           ` Vinod Koul
2011-10-10  9:53                                             ` Jassi Brar
2011-10-10 10:45                                               ` Vinod Koul
2011-10-10 11:16                                                 ` Jassi Brar
2011-10-10 16:02                                                   ` Vinod Koul
2011-10-10 16:28                                                     ` Jassi Brar
2011-10-11 11:56                                                       ` Vinod Koul
2011-10-11 15:57                                                         ` Jassi Brar
2011-10-11 16:45                                                           ` Vinod Koul
2011-10-12  5:41                                                       ` Barry Song
2011-10-12  6:19                                                         ` Vinod Koul
2011-10-12  6:30                                                           ` Jassi Brar
2011-10-12  6:53                                                           ` Barry Song
2011-10-11 16:44                                   ` Williams, Dan J
2011-10-11 18:42                                     ` Jassi Brar
2011-10-14 18:11                                       ` Bounine, Alexandre
2011-10-14 17:50                                     ` Bounine, Alexandre
2011-10-14 18:36                                       ` Jassi Brar
2011-10-14 19:15                                         ` Bounine, Alexandre
2011-10-15 11:25                                           ` Jassi Brar
2011-10-17 14:07                                             ` Bounine, Alexandre
2011-10-17 15:16                                               ` Jassi Brar
2011-10-17 18:00                                                 ` Bounine, Alexandre
2011-10-17 19:29                                                   ` Jassi Brar
2011-10-17 21:07                                                     ` Bounine, Alexandre
2011-10-18  5:45                                                       ` Jassi Brar
2011-10-18  7:42                                                         ` Russell King
2011-10-18  8:30                                                           ` Jassi Brar
2011-10-18  8:26                                                             ` Vinod Koul
2011-10-18  8:37                                                               ` Jassi Brar
2011-10-18 14:44                                                                 ` Bounine, Alexandre
2011-10-18  9:49                                                             ` Russell King
2011-10-18 11:50                                                               ` Jassi Brar
2011-10-18 11:59                                                                 ` Russell King
2011-10-18 17:57                                                                 ` Bounine, Alexandre
2011-10-24  3:49                                                                   ` Vinod Koul
2011-10-24 12:36                                                                     ` Bounine, Alexandre
2011-10-24 15:27                                                                       ` Vinod Koul
2011-10-18 17:26                                                               ` Bounine, Alexandre
2011-10-18 17:35                                                                 ` Russell King
2011-10-18 17:53                                                                   ` Jassi Brar
2011-10-18 13:51                                                         ` Bounine, Alexandre
2011-10-18 14:54                                                           ` Jassi Brar
2011-10-18 15:15                                                             ` Bounine, Alexandre
2011-09-30 15:47             ` Jassi Brar
2011-10-13  7:03       ` [PATCHv5] " Jassi Brar
2011-10-14  7:32         ` Barry Song
2011-10-14 11:51           ` Jassi Brar
2011-10-14 13:31             ` Vinod Koul
2011-10-14 13:51               ` Jassi Brar
2011-10-14 14:05                 ` Vinod Koul
2011-10-14 14:18                   ` Vinod Koul
2011-10-14 14:55               ` Barry Song
2011-10-14 15:06                 ` Vinod Koul
2011-10-14 15:38                   ` Barry Song
2011-10-14 16:09                     ` Vinod Koul
2011-10-14 16:35                   ` Jassi Brar
2011-10-14 17:04                     ` Vinod Koul
2011-10-14 17:59                       ` Jassi Brar
2011-10-15 17:11                         ` Vinod Koul
2011-10-14 15:16         ` Vinod Koul
2011-10-14 15:50           ` Barry Song
2011-10-16 11:16           ` Jassi Brar
2011-10-16 12:16             ` Vinod Koul
     [not found] <CAGsJ_4wXURUwbf-fcNOq1m5-NJ9+VMuDq+9OJpBjFZK4C_X3cw@mail.gmail.com>
2011-09-12 16:26 ` [PATCH] DMAEngine: Define generic " Barry Song
2011-09-12 16:54   ` Jassi Brar
2011-09-13  1:21     ` Barry Song
2011-09-13  7:46       ` Barry Song
2011-09-13  8:43         ` Jassi Brar
2011-09-13  8:58           ` Barry Song
2011-09-15  6:31             ` Barry Song
2011-09-15  6:43               ` Jassi Brar
2011-09-15  6:49                 ` Barry Song
2011-09-15  8:17                 ` Barry Song
