alsa-devel.alsa-project.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/4] Enable DMA mode on Intel Keem Bay platform
@ 2020-11-17  8:03 Michael Sit Wei Hong
  2020-11-17  8:03 ` [RFC PATCH 1/4] dt-bindings: sound: intel, keembay-i2s: Add info for device to use DMA Michael Sit Wei Hong
                   ` (3 more replies)
  0 siblings, 4 replies; 23+ messages in thread
From: Michael Sit Wei Hong @ 2020-11-17  8:03 UTC (permalink / raw)
  To: alsa-devel
  Cc: cezary.rojewski, lars, andriy.shevchenko, tiwai, jee.heng.sia,
	pierre-louis.bossart, liam.r.girdwood, vkoul, broonie

v1: Initial patch version, to enable DMA mode on Intel Keembay platform
    by exposing some dmaengine api to work around DMA limitations at the
    client driver level.
    This patchset suggests an ALSA-only quirk, having other more generic 
    means to deal with this limitation would be fine - we just wanted to 
    have a discussion on preferred directions. The IPs used are not 
    Intel-specific so sooner or later someone else will have similar 
    limitations to work-around.

Michael Sit Wei Hong (4):
  dt-bindings: sound: intel, keembay-i2s: Add info for device to use DMA
  ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit
    function
  ASoC: dmaengine_pcm: expose functions to header file for custom
    functions
  ASoC: Intel: KMB: Enable DMA transfer mode

 .../bindings/sound/intel,keembay-i2s.yaml     |  14 ++
 include/sound/dmaengine_pcm.h                 |  21 ++
 sound/core/pcm_dmaengine.c                    |  46 ++--
 sound/soc/intel/Kconfig                       |   2 +
 sound/soc/intel/keembay/kmb_platform.c        | 202 ++++++++++++++++--
 sound/soc/intel/keembay/kmb_platform.h        |   9 +
 sound/soc/soc-generic-dmaengine-pcm.c         |   8 +-
 7 files changed, 267 insertions(+), 35 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC PATCH 1/4] dt-bindings: sound: intel, keembay-i2s: Add info for device to use DMA
  2020-11-17  8:03 [RFC PATCH 0/4] Enable DMA mode on Intel Keem Bay platform Michael Sit Wei Hong
@ 2020-11-17  8:03 ` Michael Sit Wei Hong
  2020-11-17  8:03 ` [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function Michael Sit Wei Hong
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 23+ messages in thread
From: Michael Sit Wei Hong @ 2020-11-17  8:03 UTC (permalink / raw)
  To: alsa-devel
  Cc: cezary.rojewski, lars, andriy.shevchenko, tiwai, jee.heng.sia,
	pierre-louis.bossart, liam.r.girdwood, vkoul, broonie

Add descriptions for entries needed for audio device to use DMA
channels for audio playback and capture.

Signed-off-by: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 .../bindings/sound/intel,keembay-i2s.yaml          | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/Documentation/devicetree/bindings/sound/intel,keembay-i2s.yaml b/Documentation/devicetree/bindings/sound/intel,keembay-i2s.yaml
index d346e61ab708..e0658f122cbb 100644
--- a/Documentation/devicetree/bindings/sound/intel,keembay-i2s.yaml
+++ b/Documentation/devicetree/bindings/sound/intel,keembay-i2s.yaml
@@ -45,6 +45,18 @@ properties:
       - const: osc
       - const: apb_clk
 
+  dmas:
+    items:
+      - description: DMA controller phandle and DMA channel
+                     for TX and RX
+
+  dma-names:
+    items:
+      - description: "tx" for the transmit channel
+                     "rx" for the receive channel
+      - const: tx
+      - const: rx
+
 required:
   - compatible
   - "#sound-dai-cells"
@@ -70,4 +82,6 @@ examples:
          interrupts = <GIC_SPI 120 IRQ_TYPE_LEVEL_HIGH>;
          clock-names = "osc", "apb_clk";
          clocks = <&scmi_clk KEEM_BAY_PSS_AUX_I2S3>, <&scmi_clk KEEM_BAY_PSS_I2S3>;
+         dmas = <&axi_dma0 29 &axi_dma0 33>;
+         dma-names = "tx", "rx";
      };
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function
  2020-11-17  8:03 [RFC PATCH 0/4] Enable DMA mode on Intel Keem Bay platform Michael Sit Wei Hong
  2020-11-17  8:03 ` [RFC PATCH 1/4] dt-bindings: sound: intel, keembay-i2s: Add info for device to use DMA Michael Sit Wei Hong
@ 2020-11-17  8:03 ` Michael Sit Wei Hong
  2020-11-17 13:56   ` Vinod Koul
  2020-11-17 15:50   ` Andy Shevchenko
  2020-11-17  8:03 ` [RFC PATCH 3/4] ASoC: dmaengine_pcm: expose functions to header file for custom functions Michael Sit Wei Hong
  2020-11-17  8:03 ` [RFC PATCH 4/4] ASoC: Intel: KMB: Enable DMA transfer mode Michael Sit Wei Hong
  3 siblings, 2 replies; 23+ messages in thread
From: Michael Sit Wei Hong @ 2020-11-17  8:03 UTC (permalink / raw)
  To: alsa-devel
  Cc: cezary.rojewski, lars, andriy.shevchenko, tiwai, jee.heng.sia,
	pierre-louis.bossart, liam.r.girdwood, vkoul, broonie

Enabling custom prepare and submit function to overcome DMA limitation.

In the Intel KeemBay solution, the DW AXI-based DMA has a limitation on
the number of DMA blocks per transfer. In the case of 16 bit audio ASoC
would allocate blocks exceeding the DMA block limitation.

The ASoC layers are not aware of such DMA limitation, and the DMA engine
does not provide an API to set the maximum number of blocks per linked link.

This patch suggests an additional callback to let the caller check and modify
the number of blocks per transfer to work-around the limitations.

Signed-off-by: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 include/sound/dmaengine_pcm.h         |  6 ++++++
 sound/core/pcm_dmaengine.c            | 30 ++++++++++++++++++++++-----
 sound/soc/soc-generic-dmaengine-pcm.c |  8 ++++++-
 3 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/include/sound/dmaengine_pcm.h b/include/sound/dmaengine_pcm.h
index 8c5e38180fb0..9fae56d39ae2 100644
--- a/include/sound/dmaengine_pcm.h
+++ b/include/sound/dmaengine_pcm.h
@@ -28,6 +28,9 @@ snd_pcm_substream_to_dma_direction(const struct snd_pcm_substream *substream)
 int snd_hwparams_to_dma_slave_config(const struct snd_pcm_substream *substream,
 	const struct snd_pcm_hw_params *params, struct dma_slave_config *slave_config);
 int snd_dmaengine_pcm_trigger(struct snd_pcm_substream *substream, int cmd);
+int snd_dmaengine_pcm_custom_trigger(struct snd_pcm_substream *substream, int cmd,
+	int (*custom_pcm_prepare_and_submit)(struct snd_pcm_substream *substream));
+
 snd_pcm_uframes_t snd_dmaengine_pcm_pointer(struct snd_pcm_substream *substream);
 snd_pcm_uframes_t snd_dmaengine_pcm_pointer_no_residue(struct snd_pcm_substream *substream);
 
@@ -113,6 +116,8 @@ int snd_dmaengine_pcm_refine_runtime_hwparams(
  *   which do not use devicetree.
  * @process: Callback used to apply processing on samples transferred from/to
  *   user space.
+ * @custom_pcm_prepare_and_submit: Callback used to work-around DMA limitations
+ *   related to link lists.
  * @compat_filter_fn: Will be used as the filter function when requesting a
  *  channel for platforms which do not use devicetree. The filter parameter
  *  will be the DAI's DMA data.
@@ -138,6 +143,7 @@ struct snd_dmaengine_pcm_config {
 	int (*process)(struct snd_pcm_substream *substream,
 		       int channel, unsigned long hwoff,
 		       void *buf, unsigned long bytes);
+	int (*custom_pcm_prepare_and_submit)(struct snd_pcm_substream *substream);
 	dma_filter_fn compat_filter_fn;
 	struct device *dma_dev;
 	const char *chan_names[SNDRV_PCM_STREAM_LAST + 1];
diff --git a/sound/core/pcm_dmaengine.c b/sound/core/pcm_dmaengine.c
index 4d059ff2b2e4..cbd1429de509 100644
--- a/sound/core/pcm_dmaengine.c
+++ b/sound/core/pcm_dmaengine.c
@@ -170,16 +170,20 @@ static int dmaengine_pcm_prepare_and_submit(struct snd_pcm_substream *substream)
 }
 
 /**
- * snd_dmaengine_pcm_trigger - dmaengine based PCM trigger implementation
+ * snd_dmaengine_pcm_custom_trigger - customized PCM trigger implementation to
+ *  work-around DMA limitations related to link lists.
  * @substream: PCM substream
  * @cmd: Trigger command
+ * @custom_pcm_prepare_and_submit: custom function to deal with DMA limitations
  *
  * Returns 0 on success, a negative error code otherwise.
  *
- * This function can be used as the PCM trigger callback for dmaengine based PCM
- * driver implementations.
+ * This function can be used as the PCM trigger callback for customized dmaengine
+ * based PCM driver implementations.
  */
-int snd_dmaengine_pcm_trigger(struct snd_pcm_substream *substream, int cmd)
+
+int snd_dmaengine_pcm_custom_trigger(struct snd_pcm_substream *substream, int cmd,
+	int (*custom_pcm_prepare_and_submit)(struct snd_pcm_substream *substream))
 {
 	struct dmaengine_pcm_runtime_data *prtd = substream_to_prtd(substream);
 	struct snd_pcm_runtime *runtime = substream->runtime;
@@ -187,7 +191,7 @@ int snd_dmaengine_pcm_trigger(struct snd_pcm_substream *substream, int cmd)
 
 	switch (cmd) {
 	case SNDRV_PCM_TRIGGER_START:
-		ret = dmaengine_pcm_prepare_and_submit(substream);
+		ret = custom_pcm_prepare_and_submit(substream);
 		if (ret)
 			return ret;
 		dma_async_issue_pending(prtd->dma_chan);
@@ -214,6 +218,22 @@ int snd_dmaengine_pcm_trigger(struct snd_pcm_substream *substream, int cmd)
 
 	return 0;
 }
+EXPORT_SYMBOL_GPL(snd_dmaengine_pcm_custom_trigger);
+
+/**
+ * snd_dmaengine_pcm_trigger - dmaengine based PCM trigger implementation
+ * @substream: PCM substream
+ * @cmd: Trigger command
+ *
+ * Returns 0 on success, a negative error code otherwise.
+ *
+ * This function can be used as the PCM trigger callback for dmaengine based PCM
+ * driver implementations.
+ */
+int snd_dmaengine_pcm_trigger(struct snd_pcm_substream *substream, int cmd)
+{
+	return snd_dmaengine_pcm_custom_trigger(substream, cmd, dmaengine_pcm_prepare_and_submit);
+}
 EXPORT_SYMBOL_GPL(snd_dmaengine_pcm_trigger);
 
 /**
diff --git a/sound/soc/soc-generic-dmaengine-pcm.c b/sound/soc/soc-generic-dmaengine-pcm.c
index 9ef80a48707e..88fca6402a36 100644
--- a/sound/soc/soc-generic-dmaengine-pcm.c
+++ b/sound/soc/soc-generic-dmaengine-pcm.c
@@ -173,7 +173,13 @@ static int dmaengine_pcm_close(struct snd_soc_component *component,
 static int dmaengine_pcm_trigger(struct snd_soc_component *component,
 				 struct snd_pcm_substream *substream, int cmd)
 {
-	return snd_dmaengine_pcm_trigger(substream, cmd);
+	struct dmaengine_pcm *pcm = soc_component_to_pcm(component);
+
+	if (pcm->config && pcm->config->custom_pcm_prepare_and_submit)
+		return snd_dmaengine_pcm_custom_trigger(substream, cmd,
+							pcm->config->custom_pcm_prepare_and_submit);
+	else
+		return snd_dmaengine_pcm_trigger(substream, cmd);
 }
 
 static struct dma_chan *dmaengine_pcm_compat_request_channel(
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [RFC PATCH 3/4] ASoC: dmaengine_pcm: expose functions to header file for custom functions
  2020-11-17  8:03 [RFC PATCH 0/4] Enable DMA mode on Intel Keem Bay platform Michael Sit Wei Hong
  2020-11-17  8:03 ` [RFC PATCH 1/4] dt-bindings: sound: intel, keembay-i2s: Add info for device to use DMA Michael Sit Wei Hong
  2020-11-17  8:03 ` [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function Michael Sit Wei Hong
@ 2020-11-17  8:03 ` Michael Sit Wei Hong
  2020-11-17  8:03 ` [RFC PATCH 4/4] ASoC: Intel: KMB: Enable DMA transfer mode Michael Sit Wei Hong
  3 siblings, 0 replies; 23+ messages in thread
From: Michael Sit Wei Hong @ 2020-11-17  8:03 UTC (permalink / raw)
  To: alsa-devel
  Cc: cezary.rojewski, lars, andriy.shevchenko, tiwai, jee.heng.sia,
	pierre-louis.bossart, liam.r.girdwood, vkoul, broonie

Moving some functions to the header file to be used by custom prepare and
submit function.

In the Intel KeemBay solution, there is a DMA limitation which requires
a custom prepare and submit function to modify the number of blocks per
linked link.

This patch exposes some of the functions used in the pcm_dmaengine.c to
be used by the custom function.

Signed-off-by: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 include/sound/dmaengine_pcm.h | 15 +++++++++++++++
 sound/core/pcm_dmaengine.c    | 16 ++--------------
 2 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/include/sound/dmaengine_pcm.h b/include/sound/dmaengine_pcm.h
index 9fae56d39ae2..d45652a27f73 100644
--- a/include/sound/dmaengine_pcm.h
+++ b/include/sound/dmaengine_pcm.h
@@ -174,8 +174,23 @@ struct dmaengine_pcm {
 	unsigned int flags;
 };
 
+struct dmaengine_pcm_runtime_data {
+	struct dma_chan *dma_chan;
+	dma_cookie_t cookie;
+
+	unsigned int pos;
+};
+
+static inline struct dmaengine_pcm_runtime_data *substream_to_prtd(
+	const struct snd_pcm_substream *substream)
+{
+	return substream->runtime->private_data;
+}
+
 static inline struct dmaengine_pcm *soc_component_to_pcm(struct snd_soc_component *p)
 {
 	return container_of(p, struct dmaengine_pcm, component);
 }
+
+void dmaengine_pcm_dma_complete(void *arg);
 #endif
diff --git a/sound/core/pcm_dmaengine.c b/sound/core/pcm_dmaengine.c
index cbd1429de509..0f99c63964ec 100644
--- a/sound/core/pcm_dmaengine.c
+++ b/sound/core/pcm_dmaengine.c
@@ -19,19 +19,6 @@
 
 #include <sound/dmaengine_pcm.h>
 
-struct dmaengine_pcm_runtime_data {
-	struct dma_chan *dma_chan;
-	dma_cookie_t cookie;
-
-	unsigned int pos;
-};
-
-static inline struct dmaengine_pcm_runtime_data *substream_to_prtd(
-	const struct snd_pcm_substream *substream)
-{
-	return substream->runtime->private_data;
-}
-
 struct dma_chan *snd_dmaengine_pcm_get_chan(struct snd_pcm_substream *substream)
 {
 	struct dmaengine_pcm_runtime_data *prtd = substream_to_prtd(substream);
@@ -128,7 +115,7 @@ void snd_dmaengine_pcm_set_config_from_dai_data(
 }
 EXPORT_SYMBOL_GPL(snd_dmaengine_pcm_set_config_from_dai_data);
 
-static void dmaengine_pcm_dma_complete(void *arg)
+void dmaengine_pcm_dma_complete(void *arg)
 {
 	struct snd_pcm_substream *substream = arg;
 	struct dmaengine_pcm_runtime_data *prtd = substream_to_prtd(substream);
@@ -139,6 +126,7 @@ static void dmaengine_pcm_dma_complete(void *arg)
 
 	snd_pcm_period_elapsed(substream);
 }
+EXPORT_SYMBOL_GPL(dmaengine_pcm_dma_complete);
 
 static int dmaengine_pcm_prepare_and_submit(struct snd_pcm_substream *substream)
 {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [RFC PATCH 4/4] ASoC: Intel: KMB: Enable DMA transfer mode
  2020-11-17  8:03 [RFC PATCH 0/4] Enable DMA mode on Intel Keem Bay platform Michael Sit Wei Hong
                   ` (2 preceding siblings ...)
  2020-11-17  8:03 ` [RFC PATCH 3/4] ASoC: dmaengine_pcm: expose functions to header file for custom functions Michael Sit Wei Hong
@ 2020-11-17  8:03 ` Michael Sit Wei Hong
  3 siblings, 0 replies; 23+ messages in thread
From: Michael Sit Wei Hong @ 2020-11-17  8:03 UTC (permalink / raw)
  To: alsa-devel
  Cc: cezary.rojewski, lars, andriy.shevchenko, tiwai, jee.heng.sia,
	pierre-louis.bossart, liam.r.girdwood, vkoul, broonie

Enable DMA transfer mode for Intel Keem Bay ASoC platform driver.

The driver will search the device tree for DMA resources at boot
time to enable DMA transfer mode, and will proceed to use DMA
transfer if the resource is available, otherwise the default PIO
mode will be used.

Due to DMA Block transfer limitation, a customized function is
introduced to check and handle the limitation. Instead of limiting
the maximum period bytes to the minimum supported, which forces the
period size to less than 10.6ms in the worst case, adjusting the DMA
to use a longer linked list will allow more flexible period sizes which
does not force applications to use ridiculously small period sizes.

Signed-off-by: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 sound/soc/intel/Kconfig                |   2 +
 sound/soc/intel/keembay/kmb_platform.c | 202 +++++++++++++++++++++++--
 sound/soc/intel/keembay/kmb_platform.h |   9 ++
 3 files changed, 198 insertions(+), 15 deletions(-)

diff --git a/sound/soc/intel/Kconfig b/sound/soc/intel/Kconfig
index a5b446d5af19..6114dadfc52f 100644
--- a/sound/soc/intel/Kconfig
+++ b/sound/soc/intel/Kconfig
@@ -200,6 +200,8 @@ config SND_SOC_INTEL_KEEMBAY
 	tristate "Keembay Platforms"
 	depends on ARM64 || COMPILE_TEST
 	depends on COMMON_CLK
+	select SND_DMAENGINE_PCM
+	select SND_SOC_GENERIC_DMAENGINE_PCM
 	help
 	  If you have a Intel Keembay platform then enable this option
 	  by saying Y or m.
diff --git a/sound/soc/intel/keembay/kmb_platform.c b/sound/soc/intel/keembay/kmb_platform.c
index 291a686568c2..3041823e447b 100644
--- a/sound/soc/intel/keembay/kmb_platform.c
+++ b/sound/soc/intel/keembay/kmb_platform.c
@@ -6,10 +6,12 @@
 //
 
 #include <linux/clk.h>
+#include <linux/dma-mapping.h>
 #include <linux/io.h>
 #include <linux/module.h>
 #include <linux/of.h>
 #include <linux/of_device.h>
+#include <sound/dmaengine_pcm.h>
 #include <sound/pcm.h>
 #include <sound/pcm_params.h>
 #include <sound/soc.h>
@@ -23,6 +25,7 @@
 #define I2S_OPERATION		0
 #define DATA_WIDTH_CONFIG_BIT	6
 #define TDM_CHANNEL_CONFIG_BIT	3
+#define MAX_DMA_BLOCKS		1024
 
 static const struct snd_pcm_hardware kmb_pcm_hardware = {
 	.info = SNDRV_PCM_INFO_INTERLEAVED |
@@ -335,6 +338,45 @@ static snd_pcm_uframes_t kmb_pcm_pointer(struct snd_soc_component *component,
 	return pos < runtime->buffer_size ? pos : 0;
 }
 
+static int kmb_pcm_prepare_and_submit(struct snd_pcm_substream *substream)
+{
+	struct dmaengine_pcm_runtime_data *prtd = substream_to_prtd(substream);
+	struct dma_chan *chan = prtd->dma_chan;
+	struct dma_async_tx_descriptor *desc;
+	enum dma_transfer_direction direction;
+	unsigned long flags = DMA_CTRL_ACK;
+	struct snd_pcm_runtime *runtime = substream->runtime;
+	int blocks;
+
+	blocks = bytes_to_samples(runtime, snd_pcm_lib_period_bytes(substream));
+	direction = snd_pcm_substream_to_dma_direction(substream);
+
+	if (!substream->runtime->no_period_wakeup)
+		flags |= DMA_PREP_INTERRUPT;
+
+	prtd->pos = 0;
+	/* Check if the number of blocks used exceed the DMA BLOCK limitation */
+	if (blocks > MAX_DMA_BLOCKS && direction == DMA_DEV_TO_MEM)
+		desc = dmaengine_prep_dma_cyclic(chan,
+			substream->runtime->dma_addr,
+			snd_pcm_lib_buffer_bytes(substream),
+			samples_to_bytes(runtime, MAX_DMA_BLOCKS), direction, flags);
+	else
+		desc = dmaengine_prep_dma_cyclic(chan,
+			substream->runtime->dma_addr,
+			snd_pcm_lib_buffer_bytes(substream),
+			snd_pcm_lib_period_bytes(substream), direction, flags);
+
+	if (!desc)
+		return -ENOMEM;
+
+	desc->callback = dmaengine_pcm_dma_complete;
+	desc->callback_param = substream;
+	prtd->cookie = dmaengine_submit(desc);
+
+	return 0;
+}
+
 static const struct snd_soc_component_driver kmb_component = {
 	.name		= "kmb",
 	.pcm_construct	= kmb_platform_pcm_new,
@@ -343,6 +385,53 @@ static const struct snd_soc_component_driver kmb_component = {
 	.pointer	= kmb_pcm_pointer,
 };
 
+static const struct snd_soc_component_driver kmb_component_dma = {
+	.name		= "kmb",
+};
+
+static int kmb_probe(struct snd_soc_dai *cpu_dai)
+{
+	struct kmb_i2s_info *kmb_i2s = snd_soc_dai_get_drvdata(cpu_dai);
+
+	if (kmb_i2s->use_pio)
+		return 0;
+
+	snd_soc_dai_init_dma_data(cpu_dai, &kmb_i2s->play_dma_data,
+				  &kmb_i2s->capture_dma_data);
+
+	return 0;
+}
+
+static inline void kmb_i2s_enable_dma(struct kmb_i2s_info *kmb_i2s, u32 stream)
+{
+	u32 dma_reg;
+
+	dma_reg = readl(kmb_i2s->i2s_base + I2S_DMACR);
+	/* Enable DMA handshake for stream */
+	if (stream == SNDRV_PCM_STREAM_PLAYBACK)
+		dma_reg |= I2S_DMAEN_TXBLOCK;
+	else
+		dma_reg |= I2S_DMAEN_RXBLOCK;
+
+	writel(dma_reg, kmb_i2s->i2s_base + I2S_DMACR);
+}
+
+static inline void kmb_i2s_disable_dma(struct kmb_i2s_info *kmb_i2s, u32 stream)
+{
+	u32 dma_reg;
+
+	dma_reg = readl(kmb_i2s->i2s_base + I2S_DMACR);
+	/* Disable DMA handshake for stream */
+	if (stream == SNDRV_PCM_STREAM_PLAYBACK) {
+		dma_reg &= ~I2S_DMAEN_TXBLOCK;
+		writel(1, kmb_i2s->i2s_base + I2S_RTXDMA);
+	} else {
+		dma_reg &= ~I2S_DMAEN_RXBLOCK;
+		writel(1, kmb_i2s->i2s_base + I2S_RRXDMA);
+	}
+	writel(dma_reg, kmb_i2s->i2s_base + I2S_DMACR);
+}
+
 static void kmb_i2s_start(struct kmb_i2s_info *kmb_i2s,
 			  struct snd_pcm_substream *substream)
 {
@@ -356,7 +445,11 @@ static void kmb_i2s_start(struct kmb_i2s_info *kmb_i2s,
 	else
 		writel(1, kmb_i2s->i2s_base + IRER);
 
-	kmb_i2s_irq_trigger(kmb_i2s, substream->stream, config->chan_nr, true);
+	if (kmb_i2s->use_pio)
+		kmb_i2s_irq_trigger(kmb_i2s, substream->stream,
+				    config->chan_nr, true);
+	else
+		kmb_i2s_enable_dma(kmb_i2s, substream->stream);
 
 	if (kmb_i2s->master)
 		writel(1, kmb_i2s->i2s_base + CER);
@@ -434,7 +527,8 @@ static int kmb_dai_trigger(struct snd_pcm_substream *substream,
 		break;
 	case SNDRV_PCM_TRIGGER_STOP:
 		kmb_i2s->active--;
-		kmb_i2s_stop(kmb_i2s, substream);
+		if (kmb_i2s->use_pio)
+			kmb_i2s_stop(kmb_i2s, substream);
 		break;
 	default:
 		return  -EINVAL;
@@ -485,16 +579,22 @@ static int kmb_dai_hw_params(struct snd_pcm_substream *substream,
 		config->data_width = 16;
 		kmb_i2s->ccr = 0x00;
 		kmb_i2s->xfer_resolution = 0x02;
+		kmb_i2s->play_dma_data.addr_width = DMA_SLAVE_BUSWIDTH_2_BYTES;
+		kmb_i2s->capture_dma_data.addr_width = DMA_SLAVE_BUSWIDTH_2_BYTES;
 		break;
 	case SNDRV_PCM_FORMAT_S24_LE:
 		config->data_width = 32;
 		kmb_i2s->ccr = 0x14;
 		kmb_i2s->xfer_resolution = 0x05;
+		kmb_i2s->play_dma_data.addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
+		kmb_i2s->capture_dma_data.addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
 		break;
 	case SNDRV_PCM_FORMAT_S32_LE:
 		config->data_width = 32;
 		kmb_i2s->ccr = 0x10;
 		kmb_i2s->xfer_resolution = 0x05;
+		kmb_i2s->play_dma_data.addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
+		kmb_i2s->capture_dma_data.addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
 		break;
 	default:
 		dev_err(kmb_i2s->dev, "kmb: unsupported PCM fmt");
@@ -572,9 +672,56 @@ static int kmb_dai_prepare(struct snd_pcm_substream *substream,
 	return 0;
 }
 
+static int kmb_dai_startup(struct snd_pcm_substream *substream,
+			   struct snd_soc_dai *cpu_dai)
+{
+	struct kmb_i2s_info *kmb_i2s = snd_soc_dai_get_drvdata(cpu_dai);
+	struct snd_dmaengine_dai_dma_data *dma_data;
+
+	if (kmb_i2s->use_pio)
+		return 0;
+
+	if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
+		dma_data = &kmb_i2s->play_dma_data;
+	else
+		dma_data = &kmb_i2s->capture_dma_data;
+
+	snd_soc_dai_set_dma_data(cpu_dai, substream, dma_data);
+
+	return 0;
+}
+
+static int kmb_dai_hw_free(struct snd_pcm_substream *substream,
+			   struct snd_soc_dai *cpu_dai)
+{
+	struct kmb_i2s_info *kmb_i2s = snd_soc_dai_get_drvdata(cpu_dai);
+	/* I2S Programming sequence in Keem_Bay_VPU_DB_v1.1 */
+	if (kmb_i2s->use_pio)
+		kmb_i2s_clear_irqs(kmb_i2s, substream->stream);
+
+	if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
+		writel(0, kmb_i2s->i2s_base + ITER);
+	else
+		writel(0, kmb_i2s->i2s_base + IRER);
+
+	if (kmb_i2s->use_pio)
+		kmb_i2s_irq_trigger(kmb_i2s, substream->stream, 8, false);
+	else
+		kmb_i2s_disable_dma(kmb_i2s, substream->stream);
+
+	if (!kmb_i2s->active) {
+		writel(0, kmb_i2s->i2s_base + CER);
+		writel(0, kmb_i2s->i2s_base + IER);
+	}
+
+	return 0;
+}
+
 static struct snd_soc_dai_ops kmb_dai_ops = {
+	.startup	= kmb_dai_startup,
 	.trigger	= kmb_dai_trigger,
 	.hw_params	= kmb_dai_hw_params,
+	.hw_free	= kmb_dai_hw_free,
 	.prepare	= kmb_dai_prepare,
 	.set_fmt	= kmb_set_dai_fmt,
 };
@@ -607,6 +754,7 @@ static struct snd_soc_dai_driver intel_kmb_i2s_dai[] = {
 				    SNDRV_PCM_FMTBIT_S16_LE),
 		},
 		.ops = &kmb_dai_ops,
+		.probe = kmb_probe,
 	},
 };
 
@@ -626,6 +774,7 @@ static struct snd_soc_dai_driver intel_kmb_tdm_dai[] = {
 				    SNDRV_PCM_FMTBIT_S16_LE),
 		},
 		.ops = &kmb_dai_ops,
+		.probe = kmb_probe,
 	},
 };
 
@@ -635,12 +784,19 @@ static const struct of_device_id kmb_plat_of_match[] = {
 	{}
 };
 
+static const struct snd_dmaengine_pcm_config kmb_dmaengine_pcm_config = {
+	.prepare_slave_config = snd_dmaengine_pcm_prepare_slave_config,
+	.custom_pcm_prepare_and_submit = kmb_pcm_prepare_and_submit,
+};
+
 static int kmb_plat_dai_probe(struct platform_device *pdev)
 {
+	struct device_node *np = pdev->dev.of_node;
 	struct snd_soc_dai_driver *kmb_i2s_dai;
 	const struct of_device_id *match;
 	struct device *dev = &pdev->dev;
 	struct kmb_i2s_info *kmb_i2s;
+	struct resource *res;
 	int ret, irq;
 	u32 comp1_reg;
 
@@ -682,7 +838,7 @@ static int kmb_plat_dai_probe(struct platform_device *pdev)
 		return PTR_ERR(kmb_i2s->clk_i2s);
 	}
 
-	kmb_i2s->i2s_base = devm_platform_ioremap_resource(pdev, 0);
+	kmb_i2s->i2s_base = devm_platform_get_and_ioremap_resource(pdev, 0, &res);
 	if (IS_ERR(kmb_i2s->i2s_base))
 		return PTR_ERR(kmb_i2s->i2s_base);
 
@@ -692,22 +848,38 @@ static int kmb_plat_dai_probe(struct platform_device *pdev)
 
 	kmb_i2s->dev = &pdev->dev;
 
-	irq = platform_get_irq_optional(pdev, 0);
-	if (irq > 0) {
-		ret = devm_request_irq(dev, irq, kmb_i2s_irq_handler, 0,
-				       pdev->name, kmb_i2s);
-		if (ret < 0) {
-			dev_err(dev, "failed to request irq\n");
-			return ret;
-		}
-	}
-
 	comp1_reg = readl(kmb_i2s->i2s_base + I2S_COMP_PARAM_1);
 
 	kmb_i2s->fifo_th = (1 << COMP1_FIFO_DEPTH(comp1_reg)) / 2;
 
-	ret = devm_snd_soc_register_component(dev, &kmb_component,
-					      kmb_i2s_dai, 1);
+	kmb_i2s->use_pio = !(of_property_read_bool(np, "dmas"));
+
+	if (kmb_i2s->use_pio) {
+		irq = platform_get_irq_optional(pdev, 0);
+		if (irq > 0) {
+			ret = devm_request_irq(dev, irq, kmb_i2s_irq_handler, 0,
+					       pdev->name, kmb_i2s);
+			if (ret < 0) {
+				dev_err(dev, "failed to request irq\n");
+				return ret;
+			}
+		}
+		ret = devm_snd_soc_register_component(dev, &kmb_component,
+						      kmb_i2s_dai, 1);
+	} else {
+		kmb_i2s->play_dma_data.addr = res->start + I2S_TXDMA;
+		kmb_i2s->capture_dma_data.addr = res->start + I2S_RXDMA;
+		ret = snd_dmaengine_pcm_register(&pdev->dev,
+						 &kmb_dmaengine_pcm_config, 0);
+		if (ret) {
+			dev_err(&pdev->dev, "could not register dmaengine: %d\n",
+				ret);
+			return ret;
+		}
+		ret = devm_snd_soc_register_component(dev, &kmb_component_dma,
+						      kmb_i2s_dai, 1);
+	}
+
 	if (ret) {
 		dev_err(dev, "not able to register dai\n");
 		return ret;
diff --git a/sound/soc/intel/keembay/kmb_platform.h b/sound/soc/intel/keembay/kmb_platform.h
index 9756b132c12f..fd5341b66279 100644
--- a/sound/soc/intel/keembay/kmb_platform.h
+++ b/sound/soc/intel/keembay/kmb_platform.h
@@ -12,6 +12,7 @@
 #include <linux/bits.h>
 #include <linux/bitfield.h>
 #include <linux/types.h>
+#include <sound/dmaengine_pcm.h>
 
 /* Register values with reference to KMB databook v1.1 */
 /* common register for all channel */
@@ -103,7 +104,12 @@
 #define DW_I2S_MASTER	BIT(3)
 
 #define I2S_RXDMA	0x01C0
+#define I2S_RRXDMA	0x01C4
 #define I2S_TXDMA	0x01C8
+#define I2S_RTXDMA	0x01CC
+#define I2S_DMACR	0x0200
+#define I2S_DMAEN_RXBLOCK	(1 << 16)
+#define I2S_DMAEN_TXBLOCK	(1 << 17)
 
 /*
  * struct i2s_clk_config_data - represent i2s clk configuration data
@@ -131,6 +137,9 @@ struct kmb_i2s_info {
 	u32 xfer_resolution;
 	u32 fifo_th;
 	bool master;
+	/* data related to DMA transfers b/w i2s and DMAC */
+	struct snd_dmaengine_dai_dma_data play_dma_data;
+	struct snd_dmaengine_dai_dma_data capture_dma_data;
 
 	struct i2s_clk_config_data config;
 	int (*i2s_clk_cfg)(struct i2s_clk_config_data *config);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function
  2020-11-17  8:03 ` [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function Michael Sit Wei Hong
@ 2020-11-17 13:56   ` Vinod Koul
  2020-11-17 15:50   ` Andy Shevchenko
  1 sibling, 0 replies; 23+ messages in thread
From: Vinod Koul @ 2020-11-17 13:56 UTC (permalink / raw)
  To: Michael Sit Wei Hong
  Cc: pierre-louis.bossart, cezary.rojewski, lars, andriy.shevchenko,
	alsa-devel, jee.heng.sia, tiwai, liam.r.girdwood, broonie

On 17-11-20, 16:03, Michael Sit Wei Hong wrote:
> Enabling custom prepare and submit function to overcome DMA limitation.
> 
> In the Intel KeemBay solution, the DW AXI-based DMA has a limitation on
> the number of DMA blocks per transfer. In the case of 16 bit audio ASoC
> would allocate blocks exceeding the DMA block limitation.

Looking at this and cover does not tell me the limitation of the
hardware. Can you please elaborate what is the special feature this
hardware brings which results in custom solution here..?
> 
> The ASoC layers are not aware of such DMA limitation, and the DMA engine
> does not provide an API to set the maximum number of blocks per linked link.

What does a block refer to here..?

Btw have you looked into dma_slave_caps specifically max_sg_burst field
added recently

Thanks

> This patch suggests an additional callback to let the caller check and modify
> the number of blocks per transfer to work-around the limitations.
> 
> Signed-off-by: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> ---
>  include/sound/dmaengine_pcm.h         |  6 ++++++
>  sound/core/pcm_dmaengine.c            | 30 ++++++++++++++++++++++-----
>  sound/soc/soc-generic-dmaengine-pcm.c |  8 ++++++-
>  3 files changed, 38 insertions(+), 6 deletions(-)
> 
> diff --git a/include/sound/dmaengine_pcm.h b/include/sound/dmaengine_pcm.h
> index 8c5e38180fb0..9fae56d39ae2 100644
> --- a/include/sound/dmaengine_pcm.h
> +++ b/include/sound/dmaengine_pcm.h
> @@ -28,6 +28,9 @@ snd_pcm_substream_to_dma_direction(const struct snd_pcm_substream *substream)
>  int snd_hwparams_to_dma_slave_config(const struct snd_pcm_substream *substream,
>  	const struct snd_pcm_hw_params *params, struct dma_slave_config *slave_config);
>  int snd_dmaengine_pcm_trigger(struct snd_pcm_substream *substream, int cmd);
> +int snd_dmaengine_pcm_custom_trigger(struct snd_pcm_substream *substream, int cmd,
> +	int (*custom_pcm_prepare_and_submit)(struct snd_pcm_substream *substream));
> +
>  snd_pcm_uframes_t snd_dmaengine_pcm_pointer(struct snd_pcm_substream *substream);
>  snd_pcm_uframes_t snd_dmaengine_pcm_pointer_no_residue(struct snd_pcm_substream *substream);
>  
> @@ -113,6 +116,8 @@ int snd_dmaengine_pcm_refine_runtime_hwparams(
>   *   which do not use devicetree.
>   * @process: Callback used to apply processing on samples transferred from/to
>   *   user space.
> + * @custom_pcm_prepare_and_submit: Callback used to work-around DMA limitations
> + *   related to link lists.
>   * @compat_filter_fn: Will be used as the filter function when requesting a
>   *  channel for platforms which do not use devicetree. The filter parameter
>   *  will be the DAI's DMA data.
> @@ -138,6 +143,7 @@ struct snd_dmaengine_pcm_config {
>  	int (*process)(struct snd_pcm_substream *substream,
>  		       int channel, unsigned long hwoff,
>  		       void *buf, unsigned long bytes);
> +	int (*custom_pcm_prepare_and_submit)(struct snd_pcm_substream *substream);
>  	dma_filter_fn compat_filter_fn;
>  	struct device *dma_dev;
>  	const char *chan_names[SNDRV_PCM_STREAM_LAST + 1];
> diff --git a/sound/core/pcm_dmaengine.c b/sound/core/pcm_dmaengine.c
> index 4d059ff2b2e4..cbd1429de509 100644
> --- a/sound/core/pcm_dmaengine.c
> +++ b/sound/core/pcm_dmaengine.c
> @@ -170,16 +170,20 @@ static int dmaengine_pcm_prepare_and_submit(struct snd_pcm_substream *substream)
>  }
>  
>  /**
> - * snd_dmaengine_pcm_trigger - dmaengine based PCM trigger implementation
> + * snd_dmaengine_pcm_custom_trigger - customized PCM trigger implementation to
> + *  work-around DMA limitations related to link lists.
>   * @substream: PCM substream
>   * @cmd: Trigger command
> + * @custom_pcm_prepare_and_submit: custom function to deal with DMA limitations
>   *
>   * Returns 0 on success, a negative error code otherwise.
>   *
> - * This function can be used as the PCM trigger callback for dmaengine based PCM
> - * driver implementations.
> + * This function can be used as the PCM trigger callback for customized dmaengine
> + * based PCM driver implementations.
>   */
> -int snd_dmaengine_pcm_trigger(struct snd_pcm_substream *substream, int cmd)
> +
> +int snd_dmaengine_pcm_custom_trigger(struct snd_pcm_substream *substream, int cmd,
> +	int (*custom_pcm_prepare_and_submit)(struct snd_pcm_substream *substream))
>  {
>  	struct dmaengine_pcm_runtime_data *prtd = substream_to_prtd(substream);
>  	struct snd_pcm_runtime *runtime = substream->runtime;
> @@ -187,7 +191,7 @@ int snd_dmaengine_pcm_trigger(struct snd_pcm_substream *substream, int cmd)
>  
>  	switch (cmd) {
>  	case SNDRV_PCM_TRIGGER_START:
> -		ret = dmaengine_pcm_prepare_and_submit(substream);
> +		ret = custom_pcm_prepare_and_submit(substream);
>  		if (ret)
>  			return ret;
>  		dma_async_issue_pending(prtd->dma_chan);
> @@ -214,6 +218,22 @@ int snd_dmaengine_pcm_trigger(struct snd_pcm_substream *substream, int cmd)
>  
>  	return 0;
>  }
> +EXPORT_SYMBOL_GPL(snd_dmaengine_pcm_custom_trigger);
> +
> +/**
> + * snd_dmaengine_pcm_trigger - dmaengine based PCM trigger implementation
> + * @substream: PCM substream
> + * @cmd: Trigger command
> + *
> + * Returns 0 on success, a negative error code otherwise.
> + *
> + * This function can be used as the PCM trigger callback for dmaengine based PCM
> + * driver implementations.
> + */
> +int snd_dmaengine_pcm_trigger(struct snd_pcm_substream *substream, int cmd)
> +{
> +	return snd_dmaengine_pcm_custom_trigger(substream, cmd, dmaengine_pcm_prepare_and_submit);
> +}
>  EXPORT_SYMBOL_GPL(snd_dmaengine_pcm_trigger);
>  
>  /**
> diff --git a/sound/soc/soc-generic-dmaengine-pcm.c b/sound/soc/soc-generic-dmaengine-pcm.c
> index 9ef80a48707e..88fca6402a36 100644
> --- a/sound/soc/soc-generic-dmaengine-pcm.c
> +++ b/sound/soc/soc-generic-dmaengine-pcm.c
> @@ -173,7 +173,13 @@ static int dmaengine_pcm_close(struct snd_soc_component *component,
>  static int dmaengine_pcm_trigger(struct snd_soc_component *component,
>  				 struct snd_pcm_substream *substream, int cmd)
>  {
> -	return snd_dmaengine_pcm_trigger(substream, cmd);
> +	struct dmaengine_pcm *pcm = soc_component_to_pcm(component);
> +
> +	if (pcm->config && pcm->config->custom_pcm_prepare_and_submit)
> +		return snd_dmaengine_pcm_custom_trigger(substream, cmd,
> +							pcm->config->custom_pcm_prepare_and_submit);
> +	else
> +		return snd_dmaengine_pcm_trigger(substream, cmd);
>  }
>  
>  static struct dma_chan *dmaengine_pcm_compat_request_channel(
> -- 
> 2.17.1

-- 
~Vinod

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function
  2020-11-17  8:03 ` [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function Michael Sit Wei Hong
  2020-11-17 13:56   ` Vinod Koul
@ 2020-11-17 15:50   ` Andy Shevchenko
  2020-11-18  0:34     ` Sia, Jee Heng
  1 sibling, 1 reply; 23+ messages in thread
From: Andy Shevchenko @ 2020-11-17 15:50 UTC (permalink / raw)
  To: Michael Sit Wei Hong
  Cc: pierre-louis.bossart, cezary.rojewski, lars, alsa-devel,
	jee.heng.sia, tiwai, liam.r.girdwood, vkoul, broonie

On Tue, Nov 17, 2020 at 04:03:48PM +0800, Michael Sit Wei Hong wrote:
> Enabling custom prepare and submit function to overcome DMA limitation.
> 
> In the Intel KeemBay solution, the DW AXI-based DMA has a limitation on
> the number of DMA blocks per transfer. In the case of 16 bit audio ASoC
> would allocate blocks exceeding the DMA block limitation.

I'm still not sure the hardware has such a limitation.

The Synopsys IP supports linked list (LLI) transfers and I hardly can
imagine the list has any limitation. Even though, one can always emulate
LLI in software how it's done in the DesignWare AHB DMA driver.

I have briefly read chapter 4.6 "AXI_DMA" of Keem Bay TRM and didn't
find any errata or limits like this.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function
  2020-11-17 15:50   ` Andy Shevchenko
@ 2020-11-18  0:34     ` Sia, Jee Heng
  2020-11-18  4:40       ` Vinod Koul
  2020-11-18 14:50       ` Shevchenko, Andriy
  0 siblings, 2 replies; 23+ messages in thread
From: Sia, Jee Heng @ 2020-11-18  0:34 UTC (permalink / raw)
  To: Shevchenko, Andriy, Sit, Michael Wei Hong
  Cc: pierre-louis.bossart, Rojewski, Cezary, lars, alsa-devel, tiwai,
	liam.r.girdwood, vkoul, broonie



> -----Original Message-----
> From: Andy Shevchenko <andriy.shevchenko@intel.com>
> Sent: 17 November 2020 11:51 PM
> To: Sit, Michael Wei Hong <michael.wei.hong.sit@intel.com>
> Cc: alsa-devel@alsa-project.org; tiwai@suse.com; broonie@kernel.org; pierre-
> louis.bossart@linux.intel.com; Rojewski, Cezary <cezary.rojewski@intel.com>;
> liam.r.girdwood@linux.intel.com; Sia, Jee Heng <jee.heng.sia@intel.com>;
> vkoul@kernel.org; lars@metafoo.de
> Subject: Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom
> prepare and submit function
> 
> On Tue, Nov 17, 2020 at 04:03:48PM +0800, Michael Sit Wei Hong wrote:
> > Enabling custom prepare and submit function to overcome DMA limitation.
> >
> > In the Intel KeemBay solution, the DW AXI-based DMA has a limitation
> > on the number of DMA blocks per transfer. In the case of 16 bit audio
> > ASoC would allocate blocks exceeding the DMA block limitation.
> 
> I'm still not sure the hardware has such a limitation.
> 
> The Synopsys IP supports linked list (LLI) transfers and I hardly can imagine the
> list has any limitation. Even though, one can always emulate LLI in software how
> it's done in the DesignWare AHB DMA driver.
> 
> I have briefly read chapter 4.6 "AXI_DMA" of Keem Bay TRM and didn't find any
> errata or limits like this.
[>>] Intel KeemBay datasheet can be found at below link:
https://www.intel.com/content/www/us/en/secure/design/confidential/products-and-solutions/processors-and-chipsets/keem-bay/technical-library.html?grouping=EMT_Content%20Type&sort=title:asc
Pg783, "Programmable transfer length (block length), max 1024". Sub-list can't exceed 1024 blocks.
BTW, I think the 16bits audio could be a confusion because it is not about the number of bits, but rather the constraint of the block length. Base on my understanding, Audio needs a period larger than 10ms, regardless of the bit depth.
The questions here are:
1. Should DMAEngine expose a new API to constraint the block_length (instead of size in bytes)?
2. Should DMA client re-split the linked-list before passing the linked-list to DMAEngine.
3. Should DMA controller driver re-split the linked-list before execution.
> 
> --
> With Best Regards,
> Andy Shevchenko
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function
  2020-11-18  0:34     ` Sia, Jee Heng
@ 2020-11-18  4:40       ` Vinod Koul
  2020-11-18  5:27         ` Sia, Jee Heng
  2020-11-18 14:50       ` Shevchenko, Andriy
  1 sibling, 1 reply; 23+ messages in thread
From: Vinod Koul @ 2020-11-18  4:40 UTC (permalink / raw)
  To: Sia, Jee Heng
  Cc: alsa-devel, lars, Shevchenko, Andriy, Rojewski, Cezary,
	pierre-louis.bossart, tiwai, liam.r.girdwood, broonie, Sit,
	Michael Wei Hong

On 18-11-20, 00:34, Sia, Jee Heng wrote:
> 
> 
> > -----Original Message-----
> > From: Andy Shevchenko <andriy.shevchenko@intel.com>
> > Sent: 17 November 2020 11:51 PM
> > To: Sit, Michael Wei Hong <michael.wei.hong.sit@intel.com>
> > Cc: alsa-devel@alsa-project.org; tiwai@suse.com; broonie@kernel.org; pierre-
> > louis.bossart@linux.intel.com; Rojewski, Cezary <cezary.rojewski@intel.com>;
> > liam.r.girdwood@linux.intel.com; Sia, Jee Heng <jee.heng.sia@intel.com>;
> > vkoul@kernel.org; lars@metafoo.de
> > Subject: Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom
> > prepare and submit function
> > 
> > On Tue, Nov 17, 2020 at 04:03:48PM +0800, Michael Sit Wei Hong wrote:
> > > Enabling custom prepare and submit function to overcome DMA limitation.
> > >
> > > In the Intel KeemBay solution, the DW AXI-based DMA has a limitation
> > > on the number of DMA blocks per transfer. In the case of 16 bit audio
> > > ASoC would allocate blocks exceeding the DMA block limitation.
> > 
> > I'm still not sure the hardware has such a limitation.
> > 
> > The Synopsys IP supports linked list (LLI) transfers and I hardly can imagine the
> > list has any limitation. Even though, one can always emulate LLI in software how
> > it's done in the DesignWare AHB DMA driver.
> > 
> > I have briefly read chapter 4.6 "AXI_DMA" of Keem Bay TRM and didn't find any
> > errata or limits like this.
> [>>] Intel KeemBay datasheet can be found at below link:

** Please wrap your replies to 80 chars ** I have reflown below


> https://www.intel.com/content/www/us/en/secure/design/confidential/products-and-solutions/processors-and-chipsets/keem-bay/technical-library.html?grouping=EMT_Content%20Type&sort=title:asc

And this link is not accessible freely!

> Pg783, "Programmable transfer length (block length), max 1024".

Okay so block length cant be more than 1024, and IIUC that is 1024 items
and not bytes right

> Sub-list can't exceed 1024 blocks.  BTW, I think the 16bits audio
> could be a confusion because it is not about the number of bits, but
> rather the constraint of the block length.
> Base on my understanding,
> Audio needs a period larger than 10ms, regardless of the bit depth.
> The questions here are:
> 1. Should DMAEngine expose a new API to constraint the block_length
> (instead of size in bytes)?
> 2. Should DMA client re-split the linked-list before passing the
> linked-list to DMAEngine.
> 3. Should DMA controller driver re-split the linked-list before
> execution.

I would go with 3, this is not a fwk issue and can be easily handled in
the dma driver. It knows the limitation on block and can split a period
into multiple blocks and set the interrupt on last block. I do not see
why that will not work

-- 
~Vinod

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function
  2020-11-18  4:40       ` Vinod Koul
@ 2020-11-18  5:27         ` Sia, Jee Heng
  0 siblings, 0 replies; 23+ messages in thread
From: Sia, Jee Heng @ 2020-11-18  5:27 UTC (permalink / raw)
  To: Vinod Koul
  Cc: alsa-devel, lars, Shevchenko, Andriy, Rojewski, Cezary,
	pierre-louis.bossart, tiwai, liam.r.girdwood, broonie, Sit,
	Michael Wei Hong



> -----Original Message-----
> From: Vinod Koul <vkoul@kernel.org>
> Sent: 18 November 2020 12:41 PM
> To: Sia, Jee Heng <jee.heng.sia@intel.com>
> Cc: Shevchenko, Andriy <andriy.shevchenko@intel.com>; Sit, Michael Wei Hong
> <michael.wei.hong.sit@intel.com>; alsa-devel@alsa-project.org; tiwai@suse.com;
> broonie@kernel.org; pierre-louis.bossart@linux.intel.com; Rojewski, Cezary
> <cezary.rojewski@intel.com>; liam.r.girdwood@linux.intel.com; lars@metafoo.de
> Subject: Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom
> prepare and submit function
> 
> On 18-11-20, 00:34, Sia, Jee Heng wrote:
> >
> >
> > > -----Original Message-----
> > > From: Andy Shevchenko <andriy.shevchenko@intel.com>
> > > Sent: 17 November 2020 11:51 PM
> > > To: Sit, Michael Wei Hong <michael.wei.hong.sit@intel.com>
> > > Cc: alsa-devel@alsa-project.org; tiwai@suse.com; broonie@kernel.org;
> > > pierre- louis.bossart@linux.intel.com; Rojewski, Cezary
> > > <cezary.rojewski@intel.com>; liam.r.girdwood@linux.intel.com; Sia,
> > > Jee Heng <jee.heng.sia@intel.com>; vkoul@kernel.org; lars@metafoo.de
> > > Subject: Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add
> > > custom prepare and submit function
> > >
> > > On Tue, Nov 17, 2020 at 04:03:48PM +0800, Michael Sit Wei Hong wrote:
> > > > Enabling custom prepare and submit function to overcome DMA limitation.
> > > >
> > > > In the Intel KeemBay solution, the DW AXI-based DMA has a
> > > > limitation on the number of DMA blocks per transfer. In the case
> > > > of 16 bit audio ASoC would allocate blocks exceeding the DMA block limitation.
> > >
> > > I'm still not sure the hardware has such a limitation.
> > >
> > > The Synopsys IP supports linked list (LLI) transfers and I hardly
> > > can imagine the list has any limitation. Even though, one can always
> > > emulate LLI in software how it's done in the DesignWare AHB DMA driver.
> > >
> > > I have briefly read chapter 4.6 "AXI_DMA" of Keem Bay TRM and didn't
> > > find any errata or limits like this.
> > [>>] Intel KeemBay datasheet can be found at below link:
> 
> ** Please wrap your replies to 80 chars ** I have reflown below
[>>] Noted.
> 
> 
> > https://www.intel.com/content/www/us/en/secure/design/confidential/pro
> > ducts-and-solutions/processors-and-chipsets/keem-bay/technical-library
> > .html?grouping=EMT_Content%20Type&sort=title:asc
> 
> And this link is not accessible freely!
[>>] Sorry, perhaps need to apply the access.
> 
> > Pg783, "Programmable transfer length (block length), max 1024".
> 
> Okay so block length cant be more than 1024, and IIUC that is 1024 items and not
> bytes right
[>>] Yes, it is 1024 items/blocks
> 
> > Sub-list can't exceed 1024 blocks.  BTW, I think the 16bits audio
> > could be a confusion because it is not about the number of bits, but
> > rather the constraint of the block length.
> > Base on my understanding,
> > Audio needs a period larger than 10ms, regardless of the bit depth.
> > The questions here are:
> > 1. Should DMAEngine expose a new API to constraint the block_length
> > (instead of size in bytes)?
> > 2. Should DMA client re-split the linked-list before passing the
> > linked-list to DMAEngine.
> > 3. Should DMA controller driver re-split the linked-list before
> > execution.
> 
> I would go with 3, this is not a fwk issue and can be easily handled in the dma driver.
> It knows the limitation on block and can split a period into multiple blocks and set
> the interrupt on last block. I do not see why that will not work
[>>] Got it. A separate patch shall be submitted to AxiDMA to split the linked-list. 
Hope the rest of folks are fine with this approach.
> 
> --
> ~Vinod

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function
  2020-11-18  0:34     ` Sia, Jee Heng
  2020-11-18  4:40       ` Vinod Koul
@ 2020-11-18 14:50       ` Shevchenko, Andriy
  2020-11-24  3:51         ` Sia, Jee Heng
  1 sibling, 1 reply; 23+ messages in thread
From: Shevchenko, Andriy @ 2020-11-18 14:50 UTC (permalink / raw)
  To: Sia, Jee Heng
  Cc: Rojewski, Cezary, lars, alsa-devel, pierre-louis.bossart, tiwai,
	liam.r.girdwood, vkoul, broonie, Sit, Michael Wei Hong

On Wed, Nov 18, 2020 at 02:34:22AM +0200, Sia, Jee Heng wrote:
> > From: Andy Shevchenko <andriy.shevchenko@intel.com>
> > Sent: 17 November 2020 11:51 PM
> > On Tue, Nov 17, 2020 at 04:03:48PM +0800, Michael Sit Wei Hong wrote:
> > > Enabling custom prepare and submit function to overcome DMA limitation.
> > >
> > > In the Intel KeemBay solution, the DW AXI-based DMA has a limitation
> > > on the number of DMA blocks per transfer. In the case of 16 bit audio
> > > ASoC would allocate blocks exceeding the DMA block limitation.
> >
> > I'm still not sure the hardware has such a limitation.
> >
> > The Synopsys IP supports linked list (LLI) transfers and I hardly can imagine the
> > list has any limitation. Even though, one can always emulate LLI in software how
> > it's done in the DesignWare AHB DMA driver.
> >
> > I have briefly read chapter 4.6 "AXI_DMA" of Keem Bay TRM and didn't find any
> > errata or limits like this.
> [>>] Intel KeemBay datasheet can be found at below link:
> https://www.intel.com/content/www/us/en/secure/design/confidential/products-and-solutions/processors-and-chipsets/keem-bay/technical-library.html?grouping=EMT_Content%20Type&sort=title:asc
> Pg783, "Programmable transfer length (block length), max 1024". Sub-list can't exceed 1024 blocks.
> BTW, I think the 16bits audio could be a confusion because it is not about the number of bits, but rather the constraint of the block length. Base on my understanding, Audio needs a period larger than 10ms, regardless of the bit depth.
> The questions here are:
> 1. Should DMAEngine expose a new API to constraint the block_length (instead of size in bytes)?
> 2. Should DMA client re-split the linked-list before passing the linked-list to DMAEngine.
> 3. Should DMA controller driver re-split the linked-list before execution.

Since we have DMA slave capabilities, the consumer may ask for them and prepare
the SG list accordingly.

Above limitation is a block size (value of 1023 is a maximum, meaning 1024
maximum items in the block of given transfer width). So, like DesignWare DMA
(AHB) has up to 4095 (but I saw hardware with 2047) or iDMA 32- and 64-bit have
131071. There is no limitation for amount of blocks of given maximum length in
the LLI!

No hacks are needed, really.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function
  2020-11-18 14:50       ` Shevchenko, Andriy
@ 2020-11-24  3:51         ` Sia, Jee Heng
  2020-11-30  9:57           ` Sit, Michael Wei Hong
  0 siblings, 1 reply; 23+ messages in thread
From: Sia, Jee Heng @ 2020-11-24  3:51 UTC (permalink / raw)
  To: Shevchenko, Andriy
  Cc: Rojewski, Cezary, lars, alsa-devel, pierre-louis.bossart, tiwai,
	liam.r.girdwood, vkoul, broonie, Sit, Michael Wei Hong



> -----Original Message-----
> From: Shevchenko, Andriy <andriy.shevchenko@intel.com>
> Sent: 18 November 2020 10:51 PM
> To: Sia, Jee Heng <jee.heng.sia@intel.com>
> Cc: Sit, Michael Wei Hong <michael.wei.hong.sit@intel.com>; alsa-devel@alsa-
> project.org; tiwai@suse.com; broonie@kernel.org; pierre-
> louis.bossart@linux.intel.com; Rojewski, Cezary <cezary.rojewski@intel.com>;
> liam.r.girdwood@linux.intel.com; vkoul@kernel.org; lars@metafoo.de
> Subject: Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom
> prepare and submit function
> 
> On Wed, Nov 18, 2020 at 02:34:22AM +0200, Sia, Jee Heng wrote:
> > > From: Andy Shevchenko <andriy.shevchenko@intel.com>
> > > Sent: 17 November 2020 11:51 PM
> > > On Tue, Nov 17, 2020 at 04:03:48PM +0800, Michael Sit Wei Hong wrote:
> > > > Enabling custom prepare and submit function to overcome DMA limitation.
> > > >
> > > > In the Intel KeemBay solution, the DW AXI-based DMA has a
> > > > limitation on the number of DMA blocks per transfer. In the case
> > > > of 16 bit audio ASoC would allocate blocks exceeding the DMA block limitation.
> > >
> > > I'm still not sure the hardware has such a limitation.
> > >
> > > The Synopsys IP supports linked list (LLI) transfers and I hardly
> > > can imagine the list has any limitation. Even though, one can always
> > > emulate LLI in software how it's done in the DesignWare AHB DMA driver.
> > >
> > > I have briefly read chapter 4.6 "AXI_DMA" of Keem Bay TRM and didn't
> > > find any errata or limits like this.
> > [>>] Intel KeemBay datasheet can be found at below link:
> > https://www.intel.com/content/www/us/en/secure/design/confidential/pro
> > ducts-and-solutions/processors-and-chipsets/keem-bay/technical-library
> > .html?grouping=EMT_Content%20Type&sort=title:asc
> > Pg783, "Programmable transfer length (block length), max 1024". Sub-list can't
> exceed 1024 blocks.
> > BTW, I think the 16bits audio could be a confusion because it is not about the
> number of bits, but rather the constraint of the block length. Base on my
> understanding, Audio needs a period larger than 10ms, regardless of the bit depth.
> > The questions here are:
> > 1. Should DMAEngine expose a new API to constraint the block_length (instead of
> size in bytes)?
> > 2. Should DMA client re-split the linked-list before passing the linked-list to
> DMAEngine.
> > 3. Should DMA controller driver re-split the linked-list before execution.
> 
> Since we have DMA slave capabilities, the consumer may ask for them and prepare
> the SG list accordingly.
> 
> Above limitation is a block size (value of 1023 is a maximum, meaning 1024
> maximum items in the block of given transfer width). So, like DesignWare DMA
> (AHB) has up to 4095 (but I saw hardware with 2047) or iDMA 32- and 64-bit have
> 131071. There is no limitation for amount of blocks of given maximum length in the
> LLI!
> 
> No hacks are needed, really.
[>>] Hi All, can we have the agreement that DMA clients should optimize the linked-list
[>>] by retrieve the DMA capabilities from the DMA controller?
[>>] I noticed that Vinod voted #3 but Andy voted #2. 
[>>] We need your decision to move on.
> 
> --
> With Best Regards,
> Andy Shevchenko
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function
  2020-11-24  3:51         ` Sia, Jee Heng
@ 2020-11-30  9:57           ` Sit, Michael Wei Hong
  2020-11-30 10:45             ` Lars-Peter Clausen
  0 siblings, 1 reply; 23+ messages in thread
From: Sit, Michael Wei Hong @ 2020-11-30  9:57 UTC (permalink / raw)
  To: Sia, Jee Heng, Shevchenko, Andriy, pierre-louis.bossart,
	Rojewski, Cezary, liam.r.girdwood, vkoul, tiwai, broonie
  Cc: alsa-devel, lars



> -----Original Message-----
> From: Sia, Jee Heng <jee.heng.sia@intel.com>
> Sent: Tuesday, 24 November, 2020 11:51 AM
> To: Shevchenko, Andriy <andriy.shevchenko@intel.com>
> Cc: Sit, Michael Wei Hong <michael.wei.hong.sit@intel.com>;
> alsa-devel@alsa-project.org; tiwai@suse.com;
> broonie@kernel.org; pierre-louis.bossart@linux.intel.com;
> Rojewski, Cezary <cezary.rojewski@intel.com>;
> liam.r.girdwood@linux.intel.com; vkoul@kernel.org;
> lars@metafoo.de
> Subject: RE: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm:
> Add custom prepare and submit function
> 
> 
> 
> > -----Original Message-----
> > From: Shevchenko, Andriy <andriy.shevchenko@intel.com>
> > Sent: 18 November 2020 10:51 PM
> > To: Sia, Jee Heng <jee.heng.sia@intel.com>
> > Cc: Sit, Michael Wei Hong <michael.wei.hong.sit@intel.com>;
> > alsa-devel@alsa- project.org; tiwai@suse.com;
> broonie@kernel.org;
> > pierre- louis.bossart@linux.intel.com; Rojewski, Cezary
> > <cezary.rojewski@intel.com>; liam.r.girdwood@linux.intel.com;
> > vkoul@kernel.org; lars@metafoo.de
> > Subject: Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-
> pcm: Add
> > custom prepare and submit function
> >
> > On Wed, Nov 18, 2020 at 02:34:22AM +0200, Sia, Jee Heng
> wrote:
> > > > From: Andy Shevchenko <andriy.shevchenko@intel.com>
> > > > Sent: 17 November 2020 11:51 PM
> > > > On Tue, Nov 17, 2020 at 04:03:48PM +0800, Michael Sit Wei
> Hong wrote:
> > > > > Enabling custom prepare and submit function to
> overcome DMA limitation.
> > > > >
> > > > > In the Intel KeemBay solution, the DW AXI-based DMA
> has a
> > > > > limitation on the number of DMA blocks per transfer. In
> the case
> > > > > of 16 bit audio ASoC would allocate blocks exceeding the
> DMA block limitation.
> > > >
> > > > I'm still not sure the hardware has such a limitation.
> > > >
> > > > The Synopsys IP supports linked list (LLI) transfers and I
> hardly
> > > > can imagine the list has any limitation. Even though, one can
> > > > always emulate LLI in software how it's done in the
> DesignWare AHB DMA driver.
> > > >
> > > > I have briefly read chapter 4.6 "AXI_DMA" of Keem Bay TRM
> and
> > > > didn't find any errata or limits like this.
> > > [>>] Intel KeemBay datasheet can be found at below link:
> > >
> https://www.intel.com/content/www/us/en/secure/design/con
> fidential/p
> > > ro
> > > ducts-and-solutions/processors-and-chipsets/keem-
> bay/technical-libra
> > > ry .html?grouping=EMT_Content%20Type&sort=title:asc
> > > Pg783, "Programmable transfer length (block length), max
> 1024".
> > > Sub-list can't
> > exceed 1024 blocks.
> > > BTW, I think the 16bits audio could be a confusion because it is
> not
> > > about the
> > number of bits, but rather the constraint of the block length.
> Base on
> > my understanding, Audio needs a period larger than 10ms,
> regardless of the bit depth.
> > > The questions here are:
> > > 1. Should DMAEngine expose a new API to constraint the
> block_length
> > > (instead of
> > size in bytes)?
> > > 2. Should DMA client re-split the linked-list before passing the
> > > linked-list to
> > DMAEngine.
> > > 3. Should DMA controller driver re-split the linked-list before
> execution.
> >
> > Since we have DMA slave capabilities, the consumer may ask
> for them
> > and prepare the SG list accordingly.
> >
> > Above limitation is a block size (value of 1023 is a maximum,
> meaning
> > 1024 maximum items in the block of given transfer width). So,
> like
> > DesignWare DMA
> > (AHB) has up to 4095 (but I saw hardware with 2047) or iDMA
> 32- and
> > 64-bit have 131071. There is no limitation for amount of blocks
> of
> > given maximum length in the LLI!
> >
> > No hacks are needed, really.
> [>>] Hi All, can we have the agreement that DMA clients should
> optimize the linked-list [>>] by retrieve the DMA capabilities from
> the DMA controller?
> [>>] I noticed that Vinod voted #3 but Andy voted #2.
> [>>] We need your decision to move on.
> >
> > --
> > With Best Regards,
> > Andy Shevchenko
> >
Hi everyone,

Is there anymore comment on this RFC?
We will be using the ASoC framework to split the linked-list, since resplitting the linked-list in DMA is a no go.
If there isn't any more comments, we will push these patches for review and merging.

Thanks,
Regards,
Michael

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function
  2020-11-30  9:57           ` Sit, Michael Wei Hong
@ 2020-11-30 10:45             ` Lars-Peter Clausen
  2020-11-30 11:09               ` Shevchenko, Andriy
  0 siblings, 1 reply; 23+ messages in thread
From: Lars-Peter Clausen @ 2020-11-30 10:45 UTC (permalink / raw)
  To: Sit, Michael Wei Hong, Sia, Jee Heng, Shevchenko, Andriy,
	pierre-louis.bossart, Rojewski, Cezary, liam.r.girdwood, vkoul,
	tiwai, broonie
  Cc: alsa-devel

On 11/30/20 10:57 AM, Sit, Michael Wei Hong wrote:
>
>> -----Original Message-----
>> From: Sia, Jee Heng <jee.heng.sia@intel.com>
>> Sent: Tuesday, 24 November, 2020 11:51 AM
>> To: Shevchenko, Andriy <andriy.shevchenko@intel.com>
>> Cc: Sit, Michael Wei Hong <michael.wei.hong.sit@intel.com>;
>> alsa-devel@alsa-project.org; tiwai@suse.com;
>> broonie@kernel.org; pierre-louis.bossart@linux.intel.com;
>> Rojewski, Cezary <cezary.rojewski@intel.com>;
>> liam.r.girdwood@linux.intel.com; vkoul@kernel.org;
>> lars@metafoo.de
>> Subject: RE: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm:
>> Add custom prepare and submit function
>>
>>
>>
>>> -----Original Message-----
>>> From: Shevchenko, Andriy <andriy.shevchenko@intel.com>
>>> Sent: 18 November 2020 10:51 PM
>>> To: Sia, Jee Heng <jee.heng.sia@intel.com>
>>> Cc: Sit, Michael Wei Hong <michael.wei.hong.sit@intel.com>;
>>> alsa-devel@alsa- project.org; tiwai@suse.com;
>> broonie@kernel.org;
>>> pierre- louis.bossart@linux.intel.com; Rojewski, Cezary
>>> <cezary.rojewski@intel.com>; liam.r.girdwood@linux.intel.com;
>>> vkoul@kernel.org; lars@metafoo.de
>>> Subject: Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-
>> pcm: Add
>>> custom prepare and submit function
>>>
>>> On Wed, Nov 18, 2020 at 02:34:22AM +0200, Sia, Jee Heng
>> wrote:
>>>>> From: Andy Shevchenko <andriy.shevchenko@intel.com>
>>>>> Sent: 17 November 2020 11:51 PM
>>>>> On Tue, Nov 17, 2020 at 04:03:48PM +0800, Michael Sit Wei
>> Hong wrote:
>>>>>> Enabling custom prepare and submit function to
>> overcome DMA limitation.
>>>>>> In the Intel KeemBay solution, the DW AXI-based DMA
>> has a
>>>>>> limitation on the number of DMA blocks per transfer. In
>> the case
>>>>>> of 16 bit audio ASoC would allocate blocks exceeding the
>> DMA block limitation.
>>>>> I'm still not sure the hardware has such a limitation.
>>>>>
>>>>> The Synopsys IP supports linked list (LLI) transfers and I
>> hardly
>>>>> can imagine the list has any limitation. Even though, one can
>>>>> always emulate LLI in software how it's done in the
>> DesignWare AHB DMA driver.
>>>>> I have briefly read chapter 4.6 "AXI_DMA" of Keem Bay TRM
>> and
>>>>> didn't find any errata or limits like this.
>>>> [>>] Intel KeemBay datasheet can be found at below link:
>>>>
>> https://www.intel.com/content/www/us/en/secure/design/con
>> fidential/p
>>>> ro
>>>> ducts-and-solutions/processors-and-chipsets/keem-
>> bay/technical-libra
>>>> ry .html?grouping=EMT_Content%20Type&sort=title:asc
>>>> Pg783, "Programmable transfer length (block length), max
>> 1024".
>>>> Sub-list can't
>>> exceed 1024 blocks.
>>>> BTW, I think the 16bits audio could be a confusion because it is
>> not
>>>> about the
>>> number of bits, but rather the constraint of the block length.
>> Base on
>>> my understanding, Audio needs a period larger than 10ms,
>> regardless of the bit depth.
>>>> The questions here are:
>>>> 1. Should DMAEngine expose a new API to constraint the
>> block_length
>>>> (instead of
>>> size in bytes)?
>>>> 2. Should DMA client re-split the linked-list before passing the
>>>> linked-list to
>>> DMAEngine.
>>>> 3. Should DMA controller driver re-split the linked-list before
>> execution.
>>> Since we have DMA slave capabilities, the consumer may ask
>> for them
>>> and prepare the SG list accordingly.
>>>
>>> Above limitation is a block size (value of 1023 is a maximum,
>> meaning
>>> 1024 maximum items in the block of given transfer width). So,
>> like
>>> DesignWare DMA
>>> (AHB) has up to 4095 (but I saw hardware with 2047) or iDMA
>> 32- and
>>> 64-bit have 131071. There is no limitation for amount of blocks
>> of
>>> given maximum length in the LLI!
>>>
>>> No hacks are needed, really.
>> [>>] Hi All, can we have the agreement that DMA clients should
>> optimize the linked-list [>>] by retrieve the DMA capabilities from
>> the DMA controller?
>> [>>] I noticed that Vinod voted #3 but Andy voted #2.
>> [>>] We need your decision to move on.
>>> --
>>> With Best Regards,
>>> Andy Shevchenko
>>>
> Hi everyone,
>
> Is there anymore comment on this RFC?
> We will be using the ASoC framework to split the linked-list, since resplitting the linked-list in DMA is a no go.
> If there isn't any more comments, we will push these patches for review and merging.

Hi,

Why is splitting the list in the DMAengine framework a no go?

The whole idea of the DMAengine framework is to hide hardware specifics. 
It offers an API with certain semantics and it is up to the driver to 
provide an implementation that implements these semantics. There does 
not necessarily have to be a 1-to-1 mapping to hardware primitives in 
such an implementation.

- Lars


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function
  2020-11-30 10:45             ` Lars-Peter Clausen
@ 2020-11-30 11:09               ` Shevchenko, Andriy
  2020-12-01  8:22                 ` Lars-Peter Clausen
  0 siblings, 1 reply; 23+ messages in thread
From: Shevchenko, Andriy @ 2020-11-30 11:09 UTC (permalink / raw)
  To: Lars-Peter Clausen
  Cc: Rojewski, Cezary, alsa-devel, pierre-louis.bossart, Sia,
	Jee Heng, tiwai, liam.r.girdwood, vkoul, broonie, Sit,
	Michael Wei Hong

On Mon, Nov 30, 2020 at 11:45:17AM +0100, Lars-Peter Clausen wrote:
> On 11/30/20 10:57 AM, Sit, Michael Wei Hong wrote:

> > Is there anymore comment on this RFC?
> > We will be using the ASoC framework to split the linked-list, since resplitting the linked-list in DMA is a no go.
> > If there isn't any more comments, we will push these patches for review and merging.

> Why is splitting the list in the DMAengine framework a no go?
> 
> The whole idea of the DMAengine framework is to hide hardware specifics. It
> offers an API with certain semantics and it is up to the driver to provide
> an implementation that implements these semantics. There does not
> necessarily have to be a 1-to-1 mapping to hardware primitives in such an
> implementation.

I would say it's not desirable.

Why should we split than resplit if we may do it in one go?
Why then we have DMA capabilities returned to the consumers.

So, I have that we need to optimize DMA SG list preparation in a way that
consumer gets SG list cooked in accordance with DMA limitations / requirements.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function
  2020-11-30 11:09               ` Shevchenko, Andriy
@ 2020-12-01  8:22                 ` Lars-Peter Clausen
  2020-12-01  9:10                   ` Sia, Jee Heng
  0 siblings, 1 reply; 23+ messages in thread
From: Lars-Peter Clausen @ 2020-12-01  8:22 UTC (permalink / raw)
  To: Shevchenko, Andriy
  Cc: Rojewski, Cezary, alsa-devel, pierre-louis.bossart, Sia,
	Jee Heng, tiwai, liam.r.girdwood, vkoul, broonie, Sit,
	Michael Wei Hong

On 11/30/20 12:09 PM, Shevchenko, Andriy wrote:
> On Mon, Nov 30, 2020 at 11:45:17AM +0100, Lars-Peter Clausen wrote:
>> On 11/30/20 10:57 AM, Sit, Michael Wei Hong wrote:
>>> Is there anymore comment on this RFC?
>>> We will be using the ASoC framework to split the linked-list, since resplitting the linked-list in DMA is a no go.
>>> If there isn't any more comments, we will push these patches for review and merging.
>> Why is splitting the list in the DMAengine framework a no go?
>>
>> The whole idea of the DMAengine framework is to hide hardware specifics. It
>> offers an API with certain semantics and it is up to the driver to provide
>> an implementation that implements these semantics. There does not
>> necessarily have to be a 1-to-1 mapping to hardware primitives in such an
>> implementation.
> I would say it's not desirable.
>
> Why should we split than resplit if we may do it in one go?
> Why then we have DMA capabilities returned to the consumers.
>
> So, I have that we need to optimize DMA SG list preparation in a way that
> consumer gets SG list cooked in accordance with DMA limitations / requirements.

The API that the audio drivers use request a period DMA transfer for 
length N split into M segments with the callback being called after each 
segment.

How that is implemented internally in the DMA does not matter as long as 
it matches those semantics. E.g. if the segment is longer than what the 
DMA can handle it can split it into two segments internally and call the 
callback every second segment.

The way this API works there isn't even really a possibility for the 
client side to split periodic transfers.

Btw. 1024 beats per segment seems excessively small, I don't understand 
how somebody would design such an DMA. Was the assumption that the DMA 
will never transfer more than PAGE_SIZE of contiguous memory? Or do we 
not understand the documentation correctly?

- Lars


^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function
  2020-12-01  8:22                 ` Lars-Peter Clausen
@ 2020-12-01  9:10                   ` Sia, Jee Heng
  2020-12-05  0:55                     ` Sit, Michael Wei Hong
  0 siblings, 1 reply; 23+ messages in thread
From: Sia, Jee Heng @ 2020-12-01  9:10 UTC (permalink / raw)
  To: Lars-Peter Clausen, Shevchenko, Andriy
  Cc: Rojewski, Cezary, alsa-devel, tiwai, pierre-louis.bossart,
	liam.r.girdwood, vkoul, broonie, Sit, Michael Wei Hong



> -----Original Message-----
> From: Lars-Peter Clausen <lars@metafoo.de>
> Sent: 01 December 2020 4:22 PM
> To: Shevchenko, Andriy <andriy.shevchenko@intel.com>
> Cc: Sit, Michael Wei Hong <michael.wei.hong.sit@intel.com>; Sia, Jee
> Heng <jee.heng.sia@intel.com>; pierre-louis.bossart@linux.intel.com;
> Rojewski, Cezary <cezary.rojewski@intel.com>;
> liam.r.girdwood@linux.intel.com; vkoul@kernel.org; tiwai@suse.com;
> broonie@kernel.org; alsa-devel@alsa-project.org
> Subject: Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add
> custom prepare and submit function
> 
> On 11/30/20 12:09 PM, Shevchenko, Andriy wrote:
> > On Mon, Nov 30, 2020 at 11:45:17AM +0100, Lars-Peter Clausen
> wrote:
> >> On 11/30/20 10:57 AM, Sit, Michael Wei Hong wrote:
> >>> Is there anymore comment on this RFC?
> >>> We will be using the ASoC framework to split the linked-list, since
> resplitting the linked-list in DMA is a no go.
> >>> If there isn't any more comments, we will push these patches for
> review and merging.
> >> Why is splitting the list in the DMAengine framework a no go?
> >>
> >> The whole idea of the DMAengine framework is to hide hardware
> >> specifics. It offers an API with certain semantics and it is up to
> >> the driver to provide an implementation that implements these
> >> semantics. There does not necessarily have to be a 1-to-1 mapping
> to
> >> hardware primitives in such an implementation.
> > I would say it's not desirable.
> >
> > Why should we split than resplit if we may do it in one go?
> > Why then we have DMA capabilities returned to the consumers.
> >
> > So, I have that we need to optimize DMA SG list preparation in a way
> > that consumer gets SG list cooked in accordance with DMA
> limitations / requirements.
> 
> The API that the audio drivers use request a period DMA transfer for
> length N split into M segments with the callback being called after
> each segment.
> 
> How that is implemented internally in the DMA does not matter as
> long as it matches those semantics. E.g. if the segment is longer than
> what the DMA can handle it can split it into two segments internally
> and call the callback every second segment.
> 
> The way this API works there isn't even really a possibility for the client
> side to split periodic transfers.
> 
> Btw. 1024 beats per segment seems excessively small, I don't
> understand how somebody would design such an DMA. Was the
> assumption that the DMA will never transfer more than PAGE_SIZE of
> contiguous memory? Or do we not understand the documentation
> correctly?
[>>] The segment size is 1024 items. The item size could be 8bits,
16bits or 32bits. So, set_max_seg_size() is set to 1024*4bytes.
 

> 
> - Lars


^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function
  2020-12-01  9:10                   ` Sia, Jee Heng
@ 2020-12-05  0:55                     ` Sit, Michael Wei Hong
  2020-12-07 10:05                       ` Lars-Peter Clausen
  0 siblings, 1 reply; 23+ messages in thread
From: Sit, Michael Wei Hong @ 2020-12-05  0:55 UTC (permalink / raw)
  To: Sia, Jee Heng, Lars-Peter Clausen, Shevchenko, Andriy
  Cc: Rojewski, Cezary, alsa-devel, pierre-louis.bossart, tiwai,
	liam.r.girdwood, vkoul, broonie



> > -----Original Message-----
> > From: Lars-Peter Clausen <lars@metafoo.de>
> > Sent: 01 December 2020 4:22 PM
> > To: Shevchenko, Andriy <andriy.shevchenko@intel.com>
> > Cc: Sit, Michael Wei Hong <michael.wei.hong.sit@intel.com>;
> Sia, Jee
> > Heng <jee.heng.sia@intel.com>; pierre-
> louis.bossart@linux.intel.com;
> > Rojewski, Cezary <cezary.rojewski@intel.com>;
> > liam.r.girdwood@linux.intel.com; vkoul@kernel.org;
> tiwai@suse.com;
> > broonie@kernel.org; alsa-devel@alsa-project.org
> > Subject: Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-
> pcm: Add
> > custom prepare and submit function
> >
...
> > > Why should we split than resplit if we may do it in one go?
> > > Why then we have DMA capabilities returned to the
> consumers.
> > >
> > > So, I have that we need to optimize DMA SG list preparation
> in a way
> > > that consumer gets SG list cooked in accordance with DMA
> > limitations / requirements.
> >
> > The API that the audio drivers use request a period DMA
> transfer for
> > length N split into M segments with the callback being called
> after
> > each segment.
> >
> > How that is implemented internally in the DMA does not matter
> as long
> > as it matches those semantics. E.g. if the segment is longer than
> what
> > the DMA can handle it can split it into two segments internally
> and
> > call the callback every second segment.
> >
> > The way this API works there isn't even really a possibility for
> the
> > client side to split periodic transfers.
> >
> > Btw. 1024 beats per segment seems excessively small, I don't
> > understand how somebody would design such an DMA. Was
> the assumption
> > that the DMA will never transfer more than PAGE_SIZE of
> contiguous
> > memory? Or do we not understand the documentation
> correctly?
> [>>] The segment size is 1024 items. The item size could be 8bits,
> 16bits or 32bits. So, set_max_seg_size() is set to 1024*4bytes.
> 
> 
> >
> > - Lars

Hi All,
 
In order to fulfil Andy request on optimizing the linked-list at the DMA client side, we proposed a few changes to the soc-generic- dmaengine and DMAENGINE API due to the AxiDMA's nature operation in number of items.
 
Add New DMAENGINE API:
// For DMA driver to set the max items in a segment 1. dma_set_max_seg_items(struct device *dev, unsigned int size)

// For DMA client to retrieve the max items supported in a segment 2. dma_get_max_seg_items(struct device *dev)

#1 and #2 above are the critical API needed to be exposed to the DMA Clients so that DMA Clients can use it to calculate the appropriate segment length before pass it to the DMA driver.
If #1 and #2 are No Go, then splitting linked-list in DMA client can't be achieve.

ASoC changes:
1. Adding variable to snd_pcm_hardware to store DMA limitation information.
2. soc-generic-dmaengine-pcm to register DMA limitation exposed by DMA driver using API provided above.
3. dmaengine_pcm_prepare_and_submit in pcm_dmaengine.c to check the number of items needed calculated from userspace and configure the dma accordingly if the number of items exceeds the DMA items limitation.
4. dmaengine_pcm_dma_complete in pcm_dmaengine.c to check the number of items needed calculated from userspace and update position according to DMA limitation, to trigger period_elapse appropriately.

All of the above are needed in order to facilitate storing of the DMA limitation info and using the info to configure the DMA appropriately within the DMA limits.
#3 and #4 implements a check against the number of items needed based on userspace provided information and the DMA item limit, if the limit is exceeded, the maximum limit of the DMA is used to configure the DMA transfers.
e.g.
bytes_to_samples returns a value higher than the maximum item limit of the DMA, the driver will pass the maximum usable limit of the DMA using samples_to_bytes to the DMA driver. This will make the DMA driver to use longer linked-list and would not need to do the resplitting at the DMA driver.

Below is the snapshot of the code showing how soc-generic- dmaengine make use of the new API to calculate the segment length.
---------------------------------------------------------------------------------------------------------------------------------------------
Code change in pcm.h

struct snd_pcm_hardware {
	......

	size_t buffer_bytes_max;	/* max buffer size */
	size_t period_bytes_min;	/* min period size */
	size_t period_bytes_max;	/* max period size */
	unsigned int periods_min;	/* min # of periods */
	unsigned int periods_max;	/* max # of periods */
	size_t fifo_size;		/* fifo size in bytes */

--> Add variable for dma max item numbers
	e.g: unsigned int dma_item_max;	/* max # of dma items */

	......
};


---------------------------------------------------------------------------------------------------------------------------------------------
Code change in soc-generic-dmaengine-pcm.c

static int
dmaengine_pcm_set_runtime_hwparams(struct snd_soc_component *component,
				   struct snd_pcm_substream *substream) {
	......

	memset(&hw, 0, sizeof(hw));
	hw.info = SNDRV_PCM_INFO_MMAP | SNDRV_PCM_INFO_MMAP_VALID |
			SNDRV_PCM_INFO_INTERLEAVED;
	hw.periods_min = 2;
	hw.periods_max = UINT_MAX;
	hw.period_bytes_min = 256;
	hw.period_bytes_max = dma_get_max_seg_size(dma_dev);
	hw.buffer_bytes_max = SIZE_MAX;
	hw.fifo_size = dma_data->fifo_size;

--> Add code to register MAX_DMA_Items limitation using API exposed by 
--> dma
	e.g: hw.dma_item_max = dma_get_max_item_number(dma_dev);

	......
}

---------------------------------------------------------------------------------------------------------------------------------------------
Code change in pcm_dmaengine.c

static void dmaengine_pcm_dma_complete(void *arg) {
	......

	struct snd_pcm_runtime *runtime = substream->runtime;
	int blocks;

-->Add code to convert period bytes to number of DMA items passed down 
-->from user space
	e.g : blocks = bytes_to_samples(runtime, snd_pcm_lib_period_bytes(substream));
--> Add code to check number of DMA items from user space vs DMA 
--> limitation and update position accordingly
	e.g:

	if (blocks > hw.dma_item_max)
		prtd->pos += samples_to_bytes(runtime, MAX_DMA_BLOCKS);
	else
		prtd->pos += snd_pcm_lib_period_bytes(substream);

	......
}

static int dmaengine_pcm_prepare_and_submit(struct snd_pcm_substream *substream) {
	......

	struct snd_pcm_runtime *runtime = substream->runtime;
	int blocks;

--> Add code to convert period bytes to number of DMA items passed down 
--> from user space
	e.g: blocks = bytes_to_samples(runtime, snd_pcm_lib_period_bytes(substream));
	......

--> Add code to check if the number of blocks used exceed the DMA 
--> limitation and prepare DMA according to limitation instead of 
--> information taken from userspace
	e.g:
	if (blocks > hw.dma_item_max)
		desc = dmaengine_prep_dma_cyclic(chan,
			substream->runtime->dma_addr,
			snd_pcm_lib_buffer_bytes(substream),
			samples_to_bytes(runtime, MAX_DMA_BLOCKS), direction, flags);
	else
		desc = dmaengine_prep_dma_cyclic(chan,
			substream->runtime->dma_addr,
			snd_pcm_lib_buffer_bytes(substream),
			snd_pcm_lib_period_bytes(substream), direction, flags);

	......
}


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function
  2020-12-05  0:55                     ` Sit, Michael Wei Hong
@ 2020-12-07 10:05                       ` Lars-Peter Clausen
  2020-12-07 15:36                         ` Pierre-Louis Bossart
  0 siblings, 1 reply; 23+ messages in thread
From: Lars-Peter Clausen @ 2020-12-07 10:05 UTC (permalink / raw)
  To: Sit, Michael Wei Hong, Sia, Jee Heng, Shevchenko, Andriy
  Cc: Rojewski, Cezary, alsa-devel, pierre-louis.bossart, tiwai,
	liam.r.girdwood, vkoul, broonie

On 12/5/20 1:55 AM, Sit, Michael Wei Hong wrote:
>
>>> -----Original Message-----
>>> From: Lars-Peter Clausen <lars@metafoo.de>
>>> Sent: 01 December 2020 4:22 PM
>>> To: Shevchenko, Andriy <andriy.shevchenko@intel.com>
>>> Cc: Sit, Michael Wei Hong <michael.wei.hong.sit@intel.com>;
>> Sia, Jee
>>> Heng <jee.heng.sia@intel.com>; pierre-
>> louis.bossart@linux.intel.com;
>>> Rojewski, Cezary <cezary.rojewski@intel.com>;
>>> liam.r.girdwood@linux.intel.com; vkoul@kernel.org;
>> tiwai@suse.com;
>>> broonie@kernel.org; alsa-devel@alsa-project.org
>>> Subject: Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-
>> pcm: Add
>>> custom prepare and submit function
>>>
> ...
>>>> Why should we split than resplit if we may do it in one go?
>>>> Why then we have DMA capabilities returned to the
>> consumers.
>>>> So, I have that we need to optimize DMA SG list preparation
>> in a way
>>>> that consumer gets SG list cooked in accordance with DMA
>>> limitations / requirements.
>>>
>>> The API that the audio drivers use request a period DMA
>> transfer for
>>> length N split into M segments with the callback being called
>> after
>>> each segment.
>>>
>>> How that is implemented internally in the DMA does not matter
>> as long
>>> as it matches those semantics. E.g. if the segment is longer than
>> what
>>> the DMA can handle it can split it into two segments internally
>> and
>>> call the callback every second segment.
>>>
>>> The way this API works there isn't even really a possibility for
>> the
>>> client side to split periodic transfers.
>>>
>>> Btw. 1024 beats per segment seems excessively small, I don't
>>> understand how somebody would design such an DMA. Was
>> the assumption
>>> that the DMA will never transfer more than PAGE_SIZE of
>> contiguous
>>> memory? Or do we not understand the documentation
>> correctly?
>> [>>] The segment size is 1024 items. The item size could be 8bits,
>> 16bits or 32bits. So, set_max_seg_size() is set to 1024*4bytes.
>>
>>
>>> - Lars
> Hi All,
>   
> In order to fulfil Andy request on optimizing the linked-list at the DMA client side, we proposed a few changes to the soc-generic- dmaengine and DMAENGINE API due to the AxiDMA's nature operation in number of items.
>   
> Add New DMAENGINE API:
> // For DMA driver to set the max items in a segment 1. dma_set_max_seg_items(struct device *dev, unsigned int size)
>
> // For DMA client to retrieve the max items supported in a segment 2. dma_get_max_seg_items(struct device *dev)
>
> #1 and #2 above are the critical API needed to be exposed to the DMA Clients so that DMA Clients can use it to calculate the appropriate segment length before pass it to the DMA driver.
> If #1 and #2 are No Go, then splitting linked-list in DMA client can't be achieve.
>
> ASoC changes:
> 1. Adding variable to snd_pcm_hardware to store DMA limitation information.
> 2. soc-generic-dmaengine-pcm to register DMA limitation exposed by DMA driver using API provided above.
> 3. dmaengine_pcm_prepare_and_submit in pcm_dmaengine.c to check the number of items needed calculated from userspace and configure the dma accordingly if the number of items exceeds the DMA items limitation.
> 4. dmaengine_pcm_dma_complete in pcm_dmaengine.c to check the number of items needed calculated from userspace and update position according to DMA limitation, to trigger period_elapse appropriately.
>
> All of the above are needed in order to facilitate storing of the DMA limitation info and using the info to configure the DMA appropriately within the DMA limits.
> #3 and #4 implements a check against the number of items needed based on userspace provided information and the DMA item limit, if the limit is exceeded, the maximum limit of the DMA is used to configure the DMA transfers.
> e.g.
> bytes_to_samples returns a value higher than the maximum item limit of the DMA, the driver will pass the maximum usable limit of the DMA using samples_to_bytes to the DMA driver. This will make the DMA driver to use longer linked-list and would not need to do the resplitting at the DMA driver.
>
> Below is the snapshot of the code showing how soc-generic- dmaengine make use of the new API to calculate the segment length.
> ---------------------------------------------------------------------------------------------------------------------------------------------
> Code change in pcm.h
>
> struct snd_pcm_hardware {
> 	......
>
> 	size_t buffer_bytes_max;	/* max buffer size */
> 	size_t period_bytes_min;	/* min period size */
> 	size_t period_bytes_max;	/* max period size */
> 	unsigned int periods_min;	/* min # of periods */
> 	unsigned int periods_max;	/* max # of periods */
> 	size_t fifo_size;		/* fifo size in bytes */
>
> --> Add variable for dma max item numbers
> 	e.g: unsigned int dma_item_max;	/* max # of dma items */
>
> 	......
> };
>
>
> ---------------------------------------------------------------------------------------------------------------------------------------------
> Code change in soc-generic-dmaengine-pcm.c
>
> static int
> dmaengine_pcm_set_runtime_hwparams(struct snd_soc_component *component,
> 				   struct snd_pcm_substream *substream) {
> 	......
>
> 	memset(&hw, 0, sizeof(hw));
> 	hw.info = SNDRV_PCM_INFO_MMAP | SNDRV_PCM_INFO_MMAP_VALID |
> 			SNDRV_PCM_INFO_INTERLEAVED;
> 	hw.periods_min = 2;
> 	hw.periods_max = UINT_MAX;
> 	hw.period_bytes_min = 256;
> 	hw.period_bytes_max = dma_get_max_seg_size(dma_dev);
> 	hw.buffer_bytes_max = SIZE_MAX;
> 	hw.fifo_size = dma_data->fifo_size;
>
> --> Add code to register MAX_DMA_Items limitation using API exposed by
> --> dma
> 	e.g: hw.dma_item_max = dma_get_max_item_number(dma_dev);
>
> 	......
> }
>
> ---------------------------------------------------------------------------------------------------------------------------------------------
> Code change in pcm_dmaengine.c
>
> static void dmaengine_pcm_dma_complete(void *arg) {
> 	......
>
> 	struct snd_pcm_runtime *runtime = substream->runtime;
> 	int blocks;
>
> -->Add code to convert period bytes to number of DMA items passed down
> -->from user space
> 	e.g : blocks = bytes_to_samples(runtime, snd_pcm_lib_period_bytes(substream));
> --> Add code to check number of DMA items from user space vs DMA
> --> limitation and update position accordingly
> 	e.g:
>
> 	if (blocks > hw.dma_item_max)
> 		prtd->pos += samples_to_bytes(runtime, MAX_DMA_BLOCKS);
> 	else
> 		prtd->pos += snd_pcm_lib_period_bytes(substream);
>
> 	......
> }
>
> static int dmaengine_pcm_prepare_and_submit(struct snd_pcm_substream *substream) {
> 	......
>
> 	struct snd_pcm_runtime *runtime = substream->runtime;
> 	int blocks;
>
> --> Add code to convert period bytes to number of DMA items passed down
> --> from user space
> 	e.g: blocks = bytes_to_samples(runtime, snd_pcm_lib_period_bytes(substream));
> 	......
>
> --> Add code to check if the number of blocks used exceed the DMA
> --> limitation and prepare DMA according to limitation instead of
> --> information taken from userspace
> 	e.g:
> 	if (blocks > hw.dma_item_max)
> 		desc = dmaengine_prep_dma_cyclic(chan,
> 			substream->runtime->dma_addr,
> 			snd_pcm_lib_buffer_bytes(substream),
> 			samples_to_bytes(runtime, MAX_DMA_BLOCKS), direction, flags);
> 	else
> 		desc = dmaengine_prep_dma_cyclic(chan,
> 			substream->runtime->dma_addr,
> 			snd_pcm_lib_buffer_bytes(substream),
> 			snd_pcm_lib_period_bytes(substream), direction, flags);
>
> 	......
> }
>
Hi,

I don't think this approach will work. If you setup a DMA transfer with 
a different configuration that what was requested your audio will glitch.

If you really want to limit your period size you need to install a range 
constraint on the SNDRV_PCM_HW_PARAM_PERIOD_SIZE parameter.

But I'd highly recommend against it and just split the transfer into 
multiple segments in the DMA driver. Needlessly limiting the period size 
will increase the number of interrupts during audio playback/recording 
and hurt the power efficiency of your system.

- Lars


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function
  2020-12-07 10:05                       ` Lars-Peter Clausen
@ 2020-12-07 15:36                         ` Pierre-Louis Bossart
  2020-12-08  3:21                           ` Sia, Jee Heng
  0 siblings, 1 reply; 23+ messages in thread
From: Pierre-Louis Bossart @ 2020-12-07 15:36 UTC (permalink / raw)
  To: Lars-Peter Clausen, Sit, Michael Wei Hong, Sia, Jee Heng,
	Shevchenko, Andriy
  Cc: Rojewski, Cezary, alsa-devel, tiwai, liam.r.girdwood, vkoul, broonie




> If you really want to limit your period size you need to install a range 
> constraint on the SNDRV_PCM_HW_PARAM_PERIOD_SIZE parameter.
> 
> But I'd highly recommend against it and just split the transfer into 
> multiple segments in the DMA driver. Needlessly limiting the period size 
> will increase the number of interrupts during audio playback/recording 
> and hurt the power efficiency of your system.

Yes that was also an objection from me, the fix should be in the DMA 
level. The 1024 block limitation would mean restricting the period size 
to be at most 5.3 or 10.6ms (16 and 32-bit cases). That's way to small.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function
  2020-12-07 15:36                         ` Pierre-Louis Bossart
@ 2020-12-08  3:21                           ` Sia, Jee Heng
  2020-12-10  8:24                             ` Sit, Michael Wei Hong
  0 siblings, 1 reply; 23+ messages in thread
From: Sia, Jee Heng @ 2020-12-08  3:21 UTC (permalink / raw)
  To: Pierre-Louis Bossart, Lars-Peter Clausen, Sit, Michael Wei Hong,
	Shevchenko, Andriy
  Cc: Rojewski, Cezary, alsa-devel, tiwai, liam.r.girdwood, vkoul, broonie



> -----Original Message-----
> From: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> Sent: 07 December 2020 11:36 PM
> To: Lars-Peter Clausen <lars@metafoo.de>; Sit, Michael Wei Hong
> <michael.wei.hong.sit@intel.com>; Sia, Jee Heng
> <jee.heng.sia@intel.com>; Shevchenko, Andriy
> <andriy.shevchenko@intel.com>
> Cc: Rojewski, Cezary <cezary.rojewski@intel.com>; alsa-devel@alsa-
> project.org; tiwai@suse.com; liam.r.girdwood@linux.intel.com;
> vkoul@kernel.org; broonie@kernel.org
> Subject: Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add
> custom prepare and submit function
> 
> 
> 
> 
> > If you really want to limit your period size you need to install a
> > range constraint on the SNDRV_PCM_HW_PARAM_PERIOD_SIZE
> parameter.
> >
> > But I'd highly recommend against it and just split the transfer into
> > multiple segments in the DMA driver. Needlessly limiting the period
> > size will increase the number of interrupts during audio
> > playback/recording and hurt the power efficiency of your system.
> 
> Yes that was also an objection from me, the fix should be in the DMA
> level. The 1024 block limitation would mean restricting the period size
> to be at most 5.3 or 10.6ms (16 and 32-bit cases). That's way to small.
[>>] Seems like complexity increases if splitting the segments in ASoC. This is not a framework issue nor architecture issue. 
If introducing new API to DMAENGINE to constraint the number of items is not recommended, then lets split the segments in DMA driver.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function
  2020-12-08  3:21                           ` Sia, Jee Heng
@ 2020-12-10  8:24                             ` Sit, Michael Wei Hong
  2020-12-14  3:23                               ` Sit, Michael Wei Hong
  0 siblings, 1 reply; 23+ messages in thread
From: Sit, Michael Wei Hong @ 2020-12-10  8:24 UTC (permalink / raw)
  To: Sia, Jee Heng, Pierre-Louis Bossart, Lars-Peter Clausen,
	Shevchenko, Andriy
  Cc: Rojewski, Cezary, alsa-devel, tiwai, liam.r.girdwood, vkoul, broonie



> -----Original Message-----
> From: Sia, Jee Heng <jee.heng.sia@intel.com>
> Sent: Tuesday, 8 December, 2020 11:21 AM
> To: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>;
> Lars-Peter Clausen <lars@metafoo.de>; Sit, Michael Wei Hong
> <michael.wei.hong.sit@intel.com>; Shevchenko, Andriy
> <andriy.shevchenko@intel.com>
> Cc: Rojewski, Cezary <cezary.rojewski@intel.com>; alsa-
> devel@alsa-project.org; tiwai@suse.com;
> liam.r.girdwood@linux.intel.com; vkoul@kernel.org;
> broonie@kernel.org
> Subject: RE: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm:
> Add custom prepare and submit function
> 
> 
> 
> > -----Original Message-----
> > From: Pierre-Louis Bossart <pierre-
> louis.bossart@linux.intel.com>
> > Sent: 07 December 2020 11:36 PM
> > To: Lars-Peter Clausen <lars@metafoo.de>; Sit, Michael Wei
> Hong
> > <michael.wei.hong.sit@intel.com>; Sia, Jee Heng
> > <jee.heng.sia@intel.com>; Shevchenko, Andriy
> > <andriy.shevchenko@intel.com>
> > Cc: Rojewski, Cezary <cezary.rojewski@intel.com>; alsa-
> devel@alsa-
> > project.org; tiwai@suse.com; liam.r.girdwood@linux.intel.com;
> > vkoul@kernel.org; broonie@kernel.org
> > Subject: Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-
> pcm: Add
> > custom prepare and submit function
> >
> >
> >
> >
> > > If you really want to limit your period size you need to install a
> > > range constraint on the
> SNDRV_PCM_HW_PARAM_PERIOD_SIZE
> > parameter.
> > >
> > > But I'd highly recommend against it and just split the transfer
> into
> > > multiple segments in the DMA driver. Needlessly limiting the
> period
> > > size will increase the number of interrupts during audio
> > > playback/recording and hurt the power efficiency of your
> system.
> >
> > Yes that was also an objection from me, the fix should be in the
> DMA
> > level. The 1024 block limitation would mean restricting the
> period
> > size to be at most 5.3 or 10.6ms (16 and 32-bit cases). That's way
> to small.
> [>>] Seems like complexity increases if splitting the segments in
> ASoC. This is not a framework issue nor architecture issue.
> If introducing new API to DMAENGINE to constraint the number
> of items is not recommended, then lets split the segments in
> DMA driver.

With the increased complexity of introducing new APIs can we move the segment splitting to the DMA driver?
Anymore concerns with doing so?


^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function
  2020-12-10  8:24                             ` Sit, Michael Wei Hong
@ 2020-12-14  3:23                               ` Sit, Michael Wei Hong
  0 siblings, 0 replies; 23+ messages in thread
From: Sit, Michael Wei Hong @ 2020-12-14  3:23 UTC (permalink / raw)
  To: Sia, Jee Heng, Pierre-Louis Bossart, Lars-Peter Clausen,
	Shevchenko, Andriy
  Cc: Rojewski, Cezary, alsa-devel, tiwai, liam.r.girdwood, vkoul, broonie



> -----Original Message-----
> From: Sit, Michael Wei Hong
> Sent: Thursday, 10 December, 2020 4:24 PM
> To: Sia, Jee Heng <jee.heng.sia@intel.com>; Pierre-Louis Bossart
> <pierre-louis.bossart@linux.intel.com>; Lars-Peter Clausen
> <lars@metafoo.de>; Shevchenko, Andriy
> <andriy.shevchenko@intel.com>
> Cc: Rojewski, Cezary <cezary.rojewski@intel.com>; alsa-
> devel@alsa-project.org; tiwai@suse.com;
> liam.r.girdwood@linux.intel.com; vkoul@kernel.org;
> broonie@kernel.org
> Subject: RE: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm:
> Add custom prepare and submit function
> 
> 
> 
> > -----Original Message-----
> > From: Sia, Jee Heng <jee.heng.sia@intel.com>
> > Sent: Tuesday, 8 December, 2020 11:21 AM
> > To: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>;
> > Lars-Peter Clausen <lars@metafoo.de>; Sit, Michael Wei Hong
> > <michael.wei.hong.sit@intel.com>; Shevchenko, Andriy
> > <andriy.shevchenko@intel.com>
> > Cc: Rojewski, Cezary <cezary.rojewski@intel.com>; alsa-
> > devel@alsa-project.org; tiwai@suse.com;
> > liam.r.girdwood@linux.intel.com; vkoul@kernel.org;
> broonie@kernel.org
> > Subject: RE: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm:
> > Add custom prepare and submit function
> >
> >
> >
> > > -----Original Message-----
> > > From: Pierre-Louis Bossart <pierre-
> > louis.bossart@linux.intel.com>
> > > Sent: 07 December 2020 11:36 PM
> > > To: Lars-Peter Clausen <lars@metafoo.de>; Sit, Michael Wei
> > Hong
> > > <michael.wei.hong.sit@intel.com>; Sia, Jee Heng
> > > <jee.heng.sia@intel.com>; Shevchenko, Andriy
> > > <andriy.shevchenko@intel.com>
> > > Cc: Rojewski, Cezary <cezary.rojewski@intel.com>; alsa-
> > devel@alsa-
> > > project.org; tiwai@suse.com;
> liam.r.girdwood@linux.intel.com;
> > > vkoul@kernel.org; broonie@kernel.org
> > > Subject: Re: [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-
> > pcm: Add
> > > custom prepare and submit function
> > >
> > >
> > >
> > >
> > > > If you really want to limit your period size you need to install
> a
> > > > range constraint on the
> > SNDRV_PCM_HW_PARAM_PERIOD_SIZE
> > > parameter.
> > > >
> > > > But I'd highly recommend against it and just split the
> transfer
> > into
> > > > multiple segments in the DMA driver. Needlessly limiting the
> > period
> > > > size will increase the number of interrupts during audio
> > > > playback/recording and hurt the power efficiency of your
> > system.
> > >
> > > Yes that was also an objection from me, the fix should be in
> the
> > DMA
> > > level. The 1024 block limitation would mean restricting the
> > period
> > > size to be at most 5.3 or 10.6ms (16 and 32-bit cases). That's
> way
> > to small.
> > [>>] Seems like complexity increases if splitting the segments in
> > ASoC. This is not a framework issue nor architecture issue.
> > If introducing new API to DMAENGINE to constraint the number
> of items
> > is not recommended, then lets split the segments in DMA driver.
> 
> With the increased complexity of introducing new APIs can we
> move the segment splitting to the DMA driver?
> Anymore concerns with doing so?

If there are no more comments on this, I will start cleaning up the code to remove the DMA splitting in the ASoC layer, and push the code for review soon.
The DMA segment splitting will be done in the DMA driver instead.


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2020-12-14  3:25 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-17  8:03 [RFC PATCH 0/4] Enable DMA mode on Intel Keem Bay platform Michael Sit Wei Hong
2020-11-17  8:03 ` [RFC PATCH 1/4] dt-bindings: sound: intel, keembay-i2s: Add info for device to use DMA Michael Sit Wei Hong
2020-11-17  8:03 ` [RFC PATCH 2/4] ASoC: soc-generic-dmaengine-pcm: Add custom prepare and submit function Michael Sit Wei Hong
2020-11-17 13:56   ` Vinod Koul
2020-11-17 15:50   ` Andy Shevchenko
2020-11-18  0:34     ` Sia, Jee Heng
2020-11-18  4:40       ` Vinod Koul
2020-11-18  5:27         ` Sia, Jee Heng
2020-11-18 14:50       ` Shevchenko, Andriy
2020-11-24  3:51         ` Sia, Jee Heng
2020-11-30  9:57           ` Sit, Michael Wei Hong
2020-11-30 10:45             ` Lars-Peter Clausen
2020-11-30 11:09               ` Shevchenko, Andriy
2020-12-01  8:22                 ` Lars-Peter Clausen
2020-12-01  9:10                   ` Sia, Jee Heng
2020-12-05  0:55                     ` Sit, Michael Wei Hong
2020-12-07 10:05                       ` Lars-Peter Clausen
2020-12-07 15:36                         ` Pierre-Louis Bossart
2020-12-08  3:21                           ` Sia, Jee Heng
2020-12-10  8:24                             ` Sit, Michael Wei Hong
2020-12-14  3:23                               ` Sit, Michael Wei Hong
2020-11-17  8:03 ` [RFC PATCH 3/4] ASoC: dmaengine_pcm: expose functions to header file for custom functions Michael Sit Wei Hong
2020-11-17  8:03 ` [RFC PATCH 4/4] ASoC: Intel: KMB: Enable DMA transfer mode Michael Sit Wei Hong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).