linux-hardening.vger.kernel.org archive mirror
* [PATCH 00/12] Introduce STM32 DMA3 support
@ 2024-04-23 12:32 Amelie Delaunay
  2024-04-23 12:32 ` [PATCH 01/12] dt-bindings: dma: New directory for STM32 DMA controllers bindings Amelie Delaunay
                   ` (11 more replies)
  0 siblings, 12 replies; 29+ messages in thread
From: Amelie Delaunay @ 2024-04-23 12:32 UTC (permalink / raw)
  To: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue
  Cc: dmaengine, devicetree, linux-stm32, linux-arm-kernel,
	linux-kernel, linux-hardening, Amelie Delaunay

STM32 DMA3 is a direct memory access controller with different features
depending on its hardware configuration. It is either called LPDMA (Low
Power), GPDMA (General Purpose) or HPDMA (High Performance), and it can
be found in new STM32 MCUs and MPUs.

The STM32MP25 SoC [1] embeds 3 HPDMAs and 1 LPDMA. Only the HPDMAs are
used by Linux.

Before adding this new driver, this series gathers the existing STM32 DMA
drivers and bindings under an stm32/ subdirectory and adds an entry in the
MAINTAINERS file.

To ease review, the initial "dmaengine: Add STM32 DMA3 support" patch has
been split by functionality.
Patches 6 to 9 can be squashed into patch 5.

Patch 10 has already been proposed [2]; its API is now used in the stm32-dma3
driver. Indeed, STM32 DMA3 channels can be individually reserved, either
because they are secure or because they are dedicated to another CPU. These
channels are not registered in dmaengine, so the channel id is not
incremented; but, using the new API to specify the channel name, the channel
name matches the name in the Reference Manual, which eases requesting a
channel by its name.
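
As an illustration only (not part of this series), a client could pick one
of these named channels with a classic dmaengine filter; the channel name
"hpdma1-ch7" and the helper names below are hypothetical:

  #include <linux/dmaengine.h>
  #include <linux/string.h>

  /* Hedged sketch: match a channel by its dmaengine device name */
  static bool match_chan_by_name(struct dma_chan *chan, void *param)
  {
          return !strcmp(dma_chan_name(chan), param);
  }

  static struct dma_chan *request_named_channel(void)
  {
          dma_cap_mask_t mask;

          dma_cap_zero(mask);
          dma_cap_set(DMA_SLAVE, mask);

          /* "hpdma1-ch7" is a made-up name, for illustration */
          return dma_request_channel(mask, match_chan_by_name, (void *)"hpdma1-ch7");
  }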

[1] https://www.st.com/resource/en/reference_manual/rm0457-stm32mp25xx-advanced-armbased-3264bit-mpus-stmicroelectronics.pdf
[2] https://lore.kernel.org/lkml/20231213174021.3074759-1-amelie.delaunay@foss.st.com/

Amelie Delaunay (12):
  dt-bindings: dma: New directory for STM32 DMA controllers bindings
  dmaengine: stm32: New directory for STM32 DMA controllers drivers
  MAINTAINERS: Add entry for STM32 DMA controllers drivers and
    documentation
  dt-bindings: dma: Document STM32 DMA3 controller bindings
  dmaengine: Add STM32 DMA3 support
  dmaengine: stm32-dma3: add DMA_CYCLIC capability
  dmaengine: stm32-dma3: add DMA_MEMCPY capability
  dmaengine: stm32-dma3: add device_pause and device_resume ops
  dmaengine: stm32-dma3: improve residue granularity
  dmaengine: add channel device name to channel registration
  dmaengine: stm32-dma3: defer channel registration to specify channel
    name
  arm64: dts: st: add HPDMA nodes on stm32mp251

 .../dma/{ => stm32}/st,stm32-dma.yaml         |    4 +-
 .../bindings/dma/stm32/st,stm32-dma3.yaml     |  125 ++
 .../dma/{ => stm32}/st,stm32-dmamux.yaml      |    4 +-
 .../dma/{ => stm32}/st,stm32-mdma.yaml        |    4 +-
 MAINTAINERS                                   |    9 +
 arch/arm64/boot/dts/st/stm32mp251.dtsi        |   69 +
 drivers/dma/Kconfig                           |   34 +-
 drivers/dma/Makefile                          |    4 +-
 drivers/dma/dmaengine.c                       |   16 +-
 drivers/dma/idxd/dma.c                        |    2 +-
 drivers/dma/stm32/Kconfig                     |   47 +
 drivers/dma/stm32/Makefile                    |    5 +
 drivers/dma/{ => stm32}/stm32-dma.c           |    2 +-
 drivers/dma/stm32/stm32-dma3.c                | 1838 +++++++++++++++++
 drivers/dma/{ => stm32}/stm32-dmamux.c        |    0
 drivers/dma/{ => stm32}/stm32-mdma.c          |    2 +-
 include/linux/dmaengine.h                     |    3 +-
 17 files changed, 2117 insertions(+), 51 deletions(-)
 rename Documentation/devicetree/bindings/dma/{ => stm32}/st,stm32-dma.yaml (97%)
 create mode 100644 Documentation/devicetree/bindings/dma/stm32/st,stm32-dma3.yaml
 rename Documentation/devicetree/bindings/dma/{ => stm32}/st,stm32-dmamux.yaml (89%)
 rename Documentation/devicetree/bindings/dma/{ => stm32}/st,stm32-mdma.yaml (96%)
 create mode 100644 drivers/dma/stm32/Kconfig
 create mode 100644 drivers/dma/stm32/Makefile
 rename drivers/dma/{ => stm32}/stm32-dma.c (99%)
 create mode 100644 drivers/dma/stm32/stm32-dma3.c
 rename drivers/dma/{ => stm32}/stm32-dmamux.c (100%)
 rename drivers/dma/{ => stm32}/stm32-mdma.c (99%)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 01/12] dt-bindings: dma: New directory for STM32 DMA controllers bindings
  2024-04-23 12:32 [PATCH 00/12] Introduce STM32 DMA3 support Amelie Delaunay
@ 2024-04-23 12:32 ` Amelie Delaunay
  2024-04-23 13:50   ` Rob Herring
  2024-04-23 12:32 ` [PATCH 02/12] dmaengine: stm32: New directory for STM32 DMA controllers drivers Amelie Delaunay
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 29+ messages in thread
From: Amelie Delaunay @ 2024-04-23 12:32 UTC (permalink / raw)
  To: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue
  Cc: dmaengine, devicetree, linux-stm32, linux-arm-kernel,
	linux-kernel, linux-hardening, Amelie Delaunay

Gather the STM32 DMA controllers bindings under ./dma/stm32/

Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
---
 .../devicetree/bindings/dma/{ => stm32}/st,stm32-dma.yaml     | 4 ++--
 .../devicetree/bindings/dma/{ => stm32}/st,stm32-dmamux.yaml  | 4 ++--
 .../devicetree/bindings/dma/{ => stm32}/st,stm32-mdma.yaml    | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)
 rename Documentation/devicetree/bindings/dma/{ => stm32}/st,stm32-dma.yaml (97%)
 rename Documentation/devicetree/bindings/dma/{ => stm32}/st,stm32-dmamux.yaml (89%)
 rename Documentation/devicetree/bindings/dma/{ => stm32}/st,stm32-mdma.yaml (96%)

diff --git a/Documentation/devicetree/bindings/dma/st,stm32-dma.yaml b/Documentation/devicetree/bindings/dma/stm32/st,stm32-dma.yaml
similarity index 97%
rename from Documentation/devicetree/bindings/dma/st,stm32-dma.yaml
rename to Documentation/devicetree/bindings/dma/stm32/st,stm32-dma.yaml
index 329847ef096a..071363d18443 100644
--- a/Documentation/devicetree/bindings/dma/st,stm32-dma.yaml
+++ b/Documentation/devicetree/bindings/dma/stm32/st,stm32-dma.yaml
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
 %YAML 1.2
 ---
-$id: http://devicetree.org/schemas/dma/st,stm32-dma.yaml#
+$id: http://devicetree.org/schemas/dma/stm32/st,stm32-dma.yaml#
 $schema: http://devicetree.org/meta-schemas/core.yaml#
 
 title: STMicroelectronics STM32 DMA Controller
@@ -53,7 +53,7 @@ maintainers:
   - Amelie Delaunay <amelie.delaunay@foss.st.com>
 
 allOf:
-  - $ref: dma-controller.yaml#
+  - $ref: /schemas/dma/dma-controller.yaml#
 
 properties:
   "#dma-cells":
diff --git a/Documentation/devicetree/bindings/dma/st,stm32-dmamux.yaml b/Documentation/devicetree/bindings/dma/stm32/st,stm32-dmamux.yaml
similarity index 89%
rename from Documentation/devicetree/bindings/dma/st,stm32-dmamux.yaml
rename to Documentation/devicetree/bindings/dma/stm32/st,stm32-dmamux.yaml
index e722fbcd8a5f..88c9e88cf3d5 100644
--- a/Documentation/devicetree/bindings/dma/st,stm32-dmamux.yaml
+++ b/Documentation/devicetree/bindings/dma/stm32/st,stm32-dmamux.yaml
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
 %YAML 1.2
 ---
-$id: http://devicetree.org/schemas/dma/st,stm32-dmamux.yaml#
+$id: http://devicetree.org/schemas/dma/stm32/st,stm32-dmamux.yaml#
 $schema: http://devicetree.org/meta-schemas/core.yaml#
 
 title: STMicroelectronics STM32 DMA MUX (DMA request router)
@@ -10,7 +10,7 @@ maintainers:
   - Amelie Delaunay <amelie.delaunay@foss.st.com>
 
 allOf:
-  - $ref: dma-router.yaml#
+  - $ref: /schemas/dma/dma-router.yaml#
 
 properties:
   "#dma-cells":
diff --git a/Documentation/devicetree/bindings/dma/st,stm32-mdma.yaml b/Documentation/devicetree/bindings/dma/stm32/st,stm32-mdma.yaml
similarity index 96%
rename from Documentation/devicetree/bindings/dma/st,stm32-mdma.yaml
rename to Documentation/devicetree/bindings/dma/stm32/st,stm32-mdma.yaml
index 3874544dfa74..45fe91db11db 100644
--- a/Documentation/devicetree/bindings/dma/st,stm32-mdma.yaml
+++ b/Documentation/devicetree/bindings/dma/stm32/st,stm32-mdma.yaml
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
 %YAML 1.2
 ---
-$id: http://devicetree.org/schemas/dma/st,stm32-mdma.yaml#
+$id: http://devicetree.org/schemas/dma/stm32/st,stm32-mdma.yaml#
 $schema: http://devicetree.org/meta-schemas/core.yaml#
 
 title: STMicroelectronics STM32 MDMA Controller
@@ -53,7 +53,7 @@ maintainers:
   - Amelie Delaunay <amelie.delaunay@foss.st.com>
 
 allOf:
-  - $ref: dma-controller.yaml#
+  - $ref: /schemas/dma/dma-controller.yaml#
 
 properties:
   "#dma-cells":
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 02/12] dmaengine: stm32: New directory for STM32 DMA controllers drivers
  2024-04-23 12:32 [PATCH 00/12] Introduce STM32 DMA3 support Amelie Delaunay
  2024-04-23 12:32 ` [PATCH 01/12] dt-bindings: dma: New directory for STM32 DMA controllers bindings Amelie Delaunay
@ 2024-04-23 12:32 ` Amelie Delaunay
  2024-04-23 12:32 ` [PATCH 03/12] MAINTAINERS: Add entry for STM32 DMA controllers drivers and documentation Amelie Delaunay
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 29+ messages in thread
From: Amelie Delaunay @ 2024-04-23 12:32 UTC (permalink / raw)
  To: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue
  Cc: dmaengine, devicetree, linux-stm32, linux-arm-kernel,
	linux-kernel, linux-hardening, Amelie Delaunay

Gather the STM32 DMA controllers drivers under drivers/dma/stm32/

Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
---
 drivers/dma/Kconfig                    | 34 ++---------------------
 drivers/dma/Makefile                   |  4 +--
 drivers/dma/stm32/Kconfig              | 37 ++++++++++++++++++++++++++
 drivers/dma/stm32/Makefile             |  4 +++
 drivers/dma/{ => stm32}/stm32-dma.c    |  2 +-
 drivers/dma/{ => stm32}/stm32-dmamux.c |  0
 drivers/dma/{ => stm32}/stm32-mdma.c   |  2 +-
 7 files changed, 46 insertions(+), 37 deletions(-)
 create mode 100644 drivers/dma/stm32/Kconfig
 create mode 100644 drivers/dma/stm32/Makefile
 rename drivers/dma/{ => stm32}/stm32-dma.c (99%)
 rename drivers/dma/{ => stm32}/stm32-dmamux.c (100%)
 rename drivers/dma/{ => stm32}/stm32-mdma.c (99%)

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 002a5ec80620..32b4256ef874 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -568,38 +568,6 @@ config ST_FDMA
 	  Say Y here if you have such a chipset.
 	  If unsure, say N.
 
-config STM32_DMA
-	bool "STMicroelectronics STM32 DMA support"
-	depends on ARCH_STM32 || COMPILE_TEST
-	select DMA_ENGINE
-	select DMA_VIRTUAL_CHANNELS
-	help
-	  Enable support for the on-chip DMA controller on STMicroelectronics
-	  STM32 MCUs.
-	  If you have a board based on such a MCU and wish to use DMA say Y
-	  here.
-
-config STM32_DMAMUX
-	bool "STMicroelectronics STM32 dma multiplexer support"
-	depends on STM32_DMA || COMPILE_TEST
-	help
-	  Enable support for the on-chip DMA multiplexer on STMicroelectronics
-	  STM32 MCUs.
-	  If you have a board based on such a MCU and wish to use DMAMUX say Y
-	  here.
-
-config STM32_MDMA
-	bool "STMicroelectronics STM32 master dma support"
-	depends on ARCH_STM32 || COMPILE_TEST
-	depends on OF
-	select DMA_ENGINE
-	select DMA_VIRTUAL_CHANNELS
-	help
-	  Enable support for the on-chip MDMA controller on STMicroelectronics
-	  STM32 platforms.
-	  If you have a board based on STM32 SoC and wish to use the master DMA
-	  say Y here.
-
 config SPRD_DMA
 	tristate "Spreadtrum DMA support"
 	depends on ARCH_SPRD || COMPILE_TEST
@@ -772,6 +740,8 @@ source "drivers/dma/fsl-dpaa2-qdma/Kconfig"
 
 source "drivers/dma/lgm/Kconfig"
 
+source "drivers/dma/stm32/Kconfig"
+
 # clients
 comment "DMA Clients"
 	depends on DMA_ENGINE
diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
index dfd40d14e408..512b32408efd 100644
--- a/drivers/dma/Makefile
+++ b/drivers/dma/Makefile
@@ -68,9 +68,6 @@ obj-$(CONFIG_PXA_DMA) += pxa_dma.o
 obj-$(CONFIG_RENESAS_DMA) += sh/
 obj-$(CONFIG_SF_PDMA) += sf-pdma/
 obj-$(CONFIG_STE_DMA40) += ste_dma40.o ste_dma40_ll.o
-obj-$(CONFIG_STM32_DMA) += stm32-dma.o
-obj-$(CONFIG_STM32_DMAMUX) += stm32-dmamux.o
-obj-$(CONFIG_STM32_MDMA) += stm32-mdma.o
 obj-$(CONFIG_SPRD_DMA) += sprd-dma.o
 obj-$(CONFIG_TXX9_DMAC) += txx9dmac.o
 obj-$(CONFIG_TEGRA186_GPC_DMA) += tegra186-gpc-dma.o
@@ -86,5 +83,6 @@ obj-$(CONFIG_INTEL_LDMA) += lgm/
 
 obj-y += mediatek/
 obj-y += qcom/
+obj-y += stm32/
 obj-y += ti/
 obj-y += xilinx/
diff --git a/drivers/dma/stm32/Kconfig b/drivers/dma/stm32/Kconfig
new file mode 100644
index 000000000000..b72ae1a4502f
--- /dev/null
+++ b/drivers/dma/stm32/Kconfig
@@ -0,0 +1,37 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# STM32 DMA controllers drivers
+#
+if ARCH_STM32 || COMPILE_TEST
+
+config STM32_DMA
+	bool "STMicroelectronics STM32 DMA support"
+	select DMA_ENGINE
+	select DMA_VIRTUAL_CHANNELS
+	help
+	  Enable support for the on-chip DMA controller on STMicroelectronics
+	  STM32 platforms.
+	  If you have a board based on STM32 SoC with such DMA controller
+	  and want to use DMA say Y here.
+
+config STM32_DMAMUX
+	bool "STMicroelectronics STM32 DMA multiplexer support"
+	depends on STM32_DMA
+	help
+	  Enable support for the on-chip DMA multiplexer on STMicroelectronics
+	  STM32 platforms.
+	  If you have a board based on STM32 SoC with such DMA multiplexer
+	  and want to use DMAMUX say Y here.
+
+config STM32_MDMA
+	bool "STMicroelectronics STM32 master DMA support"
+	depends on OF
+	select DMA_ENGINE
+	select DMA_VIRTUAL_CHANNELS
+	help
+	  Enable support for the on-chip MDMA controller on STMicroelectronics
+	  STM32 platforms.
+	  If you have a board based on STM32 SoC with such DMA controller
+	  and want to use MDMA say Y here.
+
+endif
diff --git a/drivers/dma/stm32/Makefile b/drivers/dma/stm32/Makefile
new file mode 100644
index 000000000000..663a3896a881
--- /dev/null
+++ b/drivers/dma/stm32/Makefile
@@ -0,0 +1,4 @@
+# SPDX-License-Identifier: GPL-2.0-only
+obj-$(CONFIG_STM32_DMA) += stm32-dma.o
+obj-$(CONFIG_STM32_DMAMUX) += stm32-dmamux.o
+obj-$(CONFIG_STM32_MDMA) += stm32-mdma.o
diff --git a/drivers/dma/stm32-dma.c b/drivers/dma/stm32/stm32-dma.c
similarity index 99%
rename from drivers/dma/stm32-dma.c
rename to drivers/dma/stm32/stm32-dma.c
index 90857d08a1a7..917f8e922373 100644
--- a/drivers/dma/stm32-dma.c
+++ b/drivers/dma/stm32/stm32-dma.c
@@ -28,7 +28,7 @@
 #include <linux/sched.h>
 #include <linux/slab.h>
 
-#include "virt-dma.h"
+#include "../virt-dma.h"
 
 #define STM32_DMA_LISR			0x0000 /* DMA Low Int Status Reg */
 #define STM32_DMA_HISR			0x0004 /* DMA High Int Status Reg */
diff --git a/drivers/dma/stm32-dmamux.c b/drivers/dma/stm32/stm32-dmamux.c
similarity index 100%
rename from drivers/dma/stm32-dmamux.c
rename to drivers/dma/stm32/stm32-dmamux.c
diff --git a/drivers/dma/stm32-mdma.c b/drivers/dma/stm32/stm32-mdma.c
similarity index 99%
rename from drivers/dma/stm32-mdma.c
rename to drivers/dma/stm32/stm32-mdma.c
index 6505081ced44..e6d525901de7 100644
--- a/drivers/dma/stm32-mdma.c
+++ b/drivers/dma/stm32/stm32-mdma.c
@@ -30,7 +30,7 @@
 #include <linux/reset.h>
 #include <linux/slab.h>
 
-#include "virt-dma.h"
+#include "../virt-dma.h"
 
 #define STM32_MDMA_GISR0		0x0000 /* MDMA Int Status Reg 1 */
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 03/12] MAINTAINERS: Add entry for STM32 DMA controllers drivers and documentation
  2024-04-23 12:32 [PATCH 00/12] Introduce STM32 DMA3 support Amelie Delaunay
  2024-04-23 12:32 ` [PATCH 01/12] dt-bindings: dma: New directory for STM32 DMA controllers bindings Amelie Delaunay
  2024-04-23 12:32 ` [PATCH 02/12] dmaengine: stm32: New directory for STM32 DMA controllers drivers Amelie Delaunay
@ 2024-04-23 12:32 ` Amelie Delaunay
  2024-04-23 12:32 ` [PATCH 04/12] dt-bindings: dma: Document STM32 DMA3 controller bindings Amelie Delaunay
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 29+ messages in thread
From: Amelie Delaunay @ 2024-04-23 12:32 UTC (permalink / raw)
  To: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue
  Cc: dmaengine, devicetree, linux-stm32, linux-arm-kernel,
	linux-kernel, linux-hardening, Amelie Delaunay

Add an entry to make myself a maintainer of STM32 DMA controllers drivers
and documentation.

Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
---
 MAINTAINERS | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 9038abd8411e..c117184b7d26 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -21131,6 +21131,15 @@ F:	Documentation/devicetree/bindings/iio/adc/st,stm32-dfsdm-adc.yaml
 F:	Documentation/devicetree/bindings/sound/st,stm32-*.yaml
 F:	sound/soc/stm/
 
+STM32 DMA DRIVERS
+M:	Amélie Delaunay <amelie.delaunay@foss.st.com>
+L:	dmaengine@vger.kernel.org
+L:	linux-stm32@st-md-mailman.stormreply.com (moderated for non-subscribers)
+S:	Maintained
+F:	Documentation/arch/arm/stm32/stm32-dma-mdma-chaining.rst
+F:	Documentation/devicetree/bindings/dma/stm32/
+F:	drivers/dma/stm32/
+
 STM32 TIMER/LPTIMER DRIVERS
 M:	Fabrice Gasnier <fabrice.gasnier@foss.st.com>
 S:	Maintained
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 04/12] dt-bindings: dma: Document STM32 DMA3 controller bindings
  2024-04-23 12:32 [PATCH 00/12] Introduce STM32 DMA3 support Amelie Delaunay
                   ` (2 preceding siblings ...)
  2024-04-23 12:32 ` [PATCH 03/12] MAINTAINERS: Add entry for STM32 DMA controllers drivers and documentation Amelie Delaunay
@ 2024-04-23 12:32 ` Amelie Delaunay
  2024-04-23 15:22   ` Rob Herring
  2024-04-23 12:32 ` [PATCH 05/12] dmaengine: Add STM32 DMA3 support Amelie Delaunay
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 29+ messages in thread
From: Amelie Delaunay @ 2024-04-23 12:32 UTC (permalink / raw)
  To: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue
  Cc: dmaengine, devicetree, linux-stm32, linux-arm-kernel,
	linux-kernel, linux-hardening, Amelie Delaunay

The STM32 DMA3 is a Direct Memory Access controller with different features
depending on its hardware configuration.
The channels do not all have the same capabilities: some have a larger FIFO,
so they achieve higher performance.
This patch describes the STM32 DMA3 bindings, used to select a channel that
fits the client requirements, and to pre-configure the channel depending on
the client needs.

Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
---
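
For illustration, a hypothetical client node could use the four-cell
specifier described in this binding as follows (labels, request lines and
masks are made-up values, not taken from a real board):

  serial@40100000 {
          /* ... other properties ... */
          dmas = <&hpdma1 21 0x21 0x3010>,  /* RX: low prio/mid weight, 8-byte FIFO, DINC, TC at channel level */
                 <&hpdma1 22 0x21 0x3001>;  /* TX: low prio/mid weight, 8-byte FIFO, SINC, TC at channel level */
          dma-names = "rx", "tx";
  };
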
 .../bindings/dma/stm32/st,stm32-dma3.yaml     | 125 ++++++++++++++++++
 1 file changed, 125 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/dma/stm32/st,stm32-dma3.yaml

diff --git a/Documentation/devicetree/bindings/dma/stm32/st,stm32-dma3.yaml b/Documentation/devicetree/bindings/dma/stm32/st,stm32-dma3.yaml
new file mode 100644
index 000000000000..ea4f8f6add3c
--- /dev/null
+++ b/Documentation/devicetree/bindings/dma/stm32/st,stm32-dma3.yaml
@@ -0,0 +1,125 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/dma/stm32/st,stm32-dma3.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: STMicroelectronics STM32 DMA3 Controller
+
+description: |
+  The STM32 DMA3 is a direct memory access controller with different features
+  depending on its hardware configuration.
+  It is either called LPDMA (Low Power), GPDMA (General Purpose) or
+  HPDMA (High Performance).
+  Its hardware configuration registers allow its features to be exposed dynamically.
+
+  GPDMA and HPDMA support 16 independent DMA channels, while LPDMA supports only 4.
+  GPDMA and HPDMA support 256 DMA requests from peripherals, while LPDMA supports 8.
+
+  Bindings are generic for these 3 STM32 DMA3 configurations.
+
+  DMA clients connected to the STM32 DMA3 controller must use the format described
+  in the ../dma.txt file, using a four-cell specifier for each channel.
+  A phandle to the DMA controller plus the following three integer cells:
+    1. The request line number
+    2. A 32-bit mask specifying the DMA channel requirements
+      -bit 0-1: The priority level
+        0x0: low priority, low weight
+        0x1: low priority, mid weight
+        0x2: low priority, high weight
+        0x3: high priority
+      -bit 4-7: The FIFO requirement for queuing source and destination transfers
+        0x0: no FIFO requirement/any channel can fit
+        0x2: FIFO of 8 bytes (2^(2+1))
+        0x4: FIFO of 32 bytes (2^(4+1))
+        0x6: FIFO of 128 bytes (2^(6+1))
+        0x7: FIFO of 256 bytes (2^(7+1))
+    3. A 32-bit mask specifying the DMA transfer requirements
+      -bit 0: The source incrementing burst
+        0x0: fixed burst
+        0x1: contiguously incremented burst
+      -bit 1: The source allocated port
+        0x0: port 0 is allocated to the source transfer
+        0x1: port 1 is allocated to the source transfer
+      -bit 4: The destination incrementing burst
+        0x0: fixed burst
+        0x1: contiguously incremented burst
+      -bit 5: The destination allocated port
+        0x0: port 0 is allocated to the destination transfer
+        0x1: port 1 is allocated to the destination transfer
+      -bit 8: The type of hardware request
+        0x0: burst
+        0x1: block
+      -bit 9: The control mode
+        0x0: DMA controller control mode
+        0x1: peripheral control mode
+      -bit 12-13: The transfer complete event mode
+        0x0: at block level, transfer complete event is generated at the end of a block
+        0x2: at LLI level, the transfer complete event is generated at the end of the LLI transfer,
+             including the update of the LLI if any
+        0x3: at channel level, the transfer complete event is generated at the end of the last LLI
+
+maintainers:
+  - Amelie Delaunay <amelie.delaunay@foss.st.com>
+
+allOf:
+  - $ref: /schemas/dma/dma-controller.yaml#
+
+properties:
+  "#dma-cells":
+    const: 3
+
+  compatible:
+    const: st,stm32-dma3
+
+  reg:
+    maxItems: 1
+
+  clocks:
+    maxItems: 1
+
+  resets:
+    maxItems: 1
+
+  interrupts:
+    minItems: 4
+    maxItems: 16
+
+  power-domains:
+    maxItems: 1
+
+required:
+  - compatible
+  - reg
+  - clocks
+  - interrupts
+
+unevaluatedProperties: false
+
+examples:
+  - |
+    #include <dt-bindings/interrupt-controller/arm-gic.h>
+    #include <dt-bindings/clock/st,stm32mp25-rcc.h>
+    dma-controller@40400000 {
+      compatible = "st,stm32-dma3";
+      reg = <0x40400000 0x1000>;
+      interrupts = <GIC_SPI 33 IRQ_TYPE_LEVEL_HIGH>,
+                   <GIC_SPI 34 IRQ_TYPE_LEVEL_HIGH>,
+                   <GIC_SPI 35 IRQ_TYPE_LEVEL_HIGH>,
+                   <GIC_SPI 36 IRQ_TYPE_LEVEL_HIGH>,
+                   <GIC_SPI 37 IRQ_TYPE_LEVEL_HIGH>,
+                   <GIC_SPI 38 IRQ_TYPE_LEVEL_HIGH>,
+                   <GIC_SPI 39 IRQ_TYPE_LEVEL_HIGH>,
+                   <GIC_SPI 40 IRQ_TYPE_LEVEL_HIGH>,
+                   <GIC_SPI 41 IRQ_TYPE_LEVEL_HIGH>,
+                   <GIC_SPI 42 IRQ_TYPE_LEVEL_HIGH>,
+                   <GIC_SPI 43 IRQ_TYPE_LEVEL_HIGH>,
+                   <GIC_SPI 44 IRQ_TYPE_LEVEL_HIGH>,
+                   <GIC_SPI 45 IRQ_TYPE_LEVEL_HIGH>,
+                   <GIC_SPI 46 IRQ_TYPE_LEVEL_HIGH>,
+                   <GIC_SPI 47 IRQ_TYPE_LEVEL_HIGH>,
+                   <GIC_SPI 48 IRQ_TYPE_LEVEL_HIGH>;
+      clocks = <&rcc CK_BUS_HPDMA1>;
+      #dma-cells = <3>;
+    };
+...
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 05/12] dmaengine: Add STM32 DMA3 support
  2024-04-23 12:32 [PATCH 00/12] Introduce STM32 DMA3 support Amelie Delaunay
                   ` (3 preceding siblings ...)
  2024-04-23 12:32 ` [PATCH 04/12] dt-bindings: dma: Document STM32 DMA3 controller bindings Amelie Delaunay
@ 2024-04-23 12:32 ` Amelie Delaunay
  2024-05-04 12:40   ` Vinod Koul
                     ` (2 more replies)
  2024-04-23 12:32 ` [PATCH 06/12] dmaengine: stm32-dma3: add DMA_CYCLIC capability Amelie Delaunay
                   ` (6 subsequent siblings)
  11 siblings, 3 replies; 29+ messages in thread
From: Amelie Delaunay @ 2024-04-23 12:32 UTC (permalink / raw)
  To: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue
  Cc: dmaengine, devicetree, linux-stm32, linux-arm-kernel,
	linux-kernel, linux-hardening, Amelie Delaunay

The STM32 DMA3 driver supports the 3 hardware configurations of the STM32 DMA3
controller:
- LPDMA (Low Power): 4 channels, no FIFO
- GPDMA (General Purpose): 16 channels, FIFO from 8 to 32 bytes
- HPDMA (High Performance): 16 channels, FIFO from 8 to 256 bytes
Hardware configuration of the channels is retrieved from the hardware
configuration registers.
The client can specify its channel requirements through the device tree.
STM32 DMA3 channels can be individually reserved, either because they are
secure or because they are dedicated to another CPU.
Indeed, channel availability depends on the Resource Isolation Framework
(RIF) configuration. RIF grants access to buses with Compartment ID (CID)
filtering, secure and privilege level. It also assigns DMA channels to one
or several processors.
DMA channels used by Linux should be CID-filtered and either statically
assigned to CID1 or shared with other CPUs using a semaphore. If CID
filtering is not configured, the dma-channel-mask property can be used to
specify the DMA channels available to the kernel; otherwise, such channels
are marked as reserved and can't be used by Linux.

Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
---
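
As a hedged illustration of the dma-channel-mask fallback mentioned above
(the generic property from dma-common.yaml; the mask value here is
arbitrary, for illustration only):

  hpdma1: dma-controller@40400000 {
          compatible = "st,stm32-dma3";
          /* reg, clocks, interrupts as in the binding example */
          /* without CID filtering, expose only channels 0 to 7 to Linux */
          dma-channel-mask = <0x00ff>;
  };
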
 drivers/dma/stm32/Kconfig      |   10 +
 drivers/dma/stm32/Makefile     |    1 +
 drivers/dma/stm32/stm32-dma3.c | 1431 ++++++++++++++++++++++++++++++++
 3 files changed, 1442 insertions(+)
 create mode 100644 drivers/dma/stm32/stm32-dma3.c

diff --git a/drivers/dma/stm32/Kconfig b/drivers/dma/stm32/Kconfig
index b72ae1a4502f..4d8d8063133b 100644
--- a/drivers/dma/stm32/Kconfig
+++ b/drivers/dma/stm32/Kconfig
@@ -34,4 +34,14 @@ config STM32_MDMA
 	  If you have a board based on STM32 SoC with such DMA controller
 	  and want to use MDMA say Y here.
 
+config STM32_DMA3
+	tristate "STMicroelectronics STM32 DMA3 support"
+	select DMA_ENGINE
+	select DMA_VIRTUAL_CHANNELS
+	help
+	  Enable support for the on-chip DMA3 controller on STMicroelectronics
+	  STM32 platforms.
+	  If you have a board based on STM32 SoC with such DMA3 controller
+	  and want to use DMA3, say Y here.
+
 endif
diff --git a/drivers/dma/stm32/Makefile b/drivers/dma/stm32/Makefile
index 663a3896a881..5082db4b4c1c 100644
--- a/drivers/dma/stm32/Makefile
+++ b/drivers/dma/stm32/Makefile
@@ -2,3 +2,4 @@
 obj-$(CONFIG_STM32_DMA) += stm32-dma.o
 obj-$(CONFIG_STM32_DMAMUX) += stm32-dmamux.o
 obj-$(CONFIG_STM32_MDMA) += stm32-mdma.o
+obj-$(CONFIG_STM32_DMA3) += stm32-dma3.o
diff --git a/drivers/dma/stm32/stm32-dma3.c b/drivers/dma/stm32/stm32-dma3.c
new file mode 100644
index 000000000000..b5493f497d06
--- /dev/null
+++ b/drivers/dma/stm32/stm32-dma3.c
@@ -0,0 +1,1431 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * STM32 DMA3 controller driver
+ *
+ * Copyright (C) STMicroelectronics 2024
+ * Author(s): Amelie Delaunay <amelie.delaunay@foss.st.com>
+ */
+
+#include <linux/bitfield.h>
+#include <linux/clk.h>
+#include <linux/dma-mapping.h>
+#include <linux/dmaengine.h>
+#include <linux/dmapool.h>
+#include <linux/init.h>
+#include <linux/iopoll.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/of_dma.h>
+#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
+#include <linux/reset.h>
+#include <linux/slab.h>
+
+#include "../virt-dma.h"
+
+#define STM32_DMA3_SECCFGR		0x00
+#define STM32_DMA3_PRIVCFGR		0x04
+#define STM32_DMA3_RCFGLOCKR		0x08
+#define STM32_DMA3_MISR			0x0C
+#define STM32_DMA3_SMISR		0x10
+
+#define STM32_DMA3_CLBAR(x)		(0x50 + 0x80 * (x))
+#define STM32_DMA3_CCIDCFGR(x)		(0x54 + 0x80 * (x))
+#define STM32_DMA3_CSEMCR(x)		(0x58 + 0x80 * (x))
+#define STM32_DMA3_CFCR(x)		(0x5C + 0x80 * (x))
+#define STM32_DMA3_CSR(x)		(0x60 + 0x80 * (x))
+#define STM32_DMA3_CCR(x)		(0x64 + 0x80 * (x))
+#define STM32_DMA3_CTR1(x)		(0x90 + 0x80 * (x))
+#define STM32_DMA3_CTR2(x)		(0x94 + 0x80 * (x))
+#define STM32_DMA3_CBR1(x)		(0x98 + 0x80 * (x))
+#define STM32_DMA3_CSAR(x)		(0x9C + 0x80 * (x))
+#define STM32_DMA3_CDAR(x)		(0xA0 + 0x80 * (x))
+#define STM32_DMA3_CLLR(x)		(0xCC + 0x80 * (x))
+
+#define STM32_DMA3_HWCFGR13		0xFC0 /* G_PER_CTRL(X) x=8..15 */
+#define STM32_DMA3_HWCFGR12		0xFC4 /* G_PER_CTRL(X) x=0..7 */
+#define STM32_DMA3_HWCFGR4		0xFE4 /* G_FIFO_SIZE(X) x=8..15 */
+#define STM32_DMA3_HWCFGR3		0xFE8 /* G_FIFO_SIZE(X) x=0..7 */
+#define STM32_DMA3_HWCFGR2		0xFEC /* G_MAX_REQ_ID */
+#define STM32_DMA3_HWCFGR1		0xFF0 /* G_MASTER_PORTS, G_NUM_CHANNELS, G_Mx_DATA_WIDTH */
+#define STM32_DMA3_VERR			0xFF4
+
+/* SECCFGR DMA secure configuration register */
+#define SECCFGR_SEC(x)			BIT(x)
+
+/* MISR DMA non-secure/secure masked interrupt status register */
+#define MISR_MIS(x)			BIT(x)
+
+/* CxLBAR DMA channel x linked_list base address register */
+#define CLBAR_LBA			GENMASK(31, 16)
+
+/* CxCIDCFGR DMA channel x CID register */
+#define CCIDCFGR_CFEN			BIT(0)
+#define CCIDCFGR_SEM_EN			BIT(1)
+#define CCIDCFGR_SCID			GENMASK(5, 4)
+#define CCIDCFGR_SEM_WLIST_CID0		BIT(16)
+#define CCIDCFGR_SEM_WLIST_CID1		BIT(17)
+#define CCIDCFGR_SEM_WLIST_CID2		BIT(18)
+
+enum ccidcfgr_cid {
+	CCIDCFGR_CID0,
+	CCIDCFGR_CID1,
+	CCIDCFGR_CID2,
+};
+
+/* CxSEMCR DMA channel x semaphore control register */
+#define CSEMCR_SEM_MUTEX		BIT(0)
+#define CSEMCR_SEM_CCID			GENMASK(5, 4)
+
+/* CxFCR DMA channel x flag clear register */
+#define CFCR_TCF			BIT(8)
+#define CFCR_HTF			BIT(9)
+#define CFCR_DTEF			BIT(10)
+#define CFCR_ULEF			BIT(11)
+#define CFCR_USEF			BIT(12)
+#define CFCR_SUSPF			BIT(13)
+
+/* CxSR DMA channel x status register */
+#define CSR_IDLEF			BIT(0)
+#define CSR_TCF				BIT(8)
+#define CSR_HTF				BIT(9)
+#define CSR_DTEF			BIT(10)
+#define CSR_ULEF			BIT(11)
+#define CSR_USEF			BIT(12)
+#define CSR_SUSPF			BIT(13)
+#define CSR_ALL_F			GENMASK(13, 8)
+#define CSR_FIFOL			GENMASK(24, 16)
+
+/* CxCR DMA channel x control register */
+#define CCR_EN				BIT(0)
+#define CCR_RESET			BIT(1)
+#define CCR_SUSP			BIT(2)
+#define CCR_TCIE			BIT(8)
+#define CCR_HTIE			BIT(9)
+#define CCR_DTEIE			BIT(10)
+#define CCR_ULEIE			BIT(11)
+#define CCR_USEIE			BIT(12)
+#define CCR_SUSPIE			BIT(13)
+#define CCR_ALLIE			GENMASK(13, 8)
+#define CCR_LSM				BIT(16)
+#define CCR_LAP				BIT(17)
+#define CCR_PRIO			GENMASK(23, 22)
+
+enum ccr_prio {
+	CCR_PRIO_LOW,
+	CCR_PRIO_MID,
+	CCR_PRIO_HIGH,
+	CCR_PRIO_VERY_HIGH,
+};
+
+/* CxTR1 DMA channel x transfer register 1 */
+#define CTR1_SINC			BIT(3)
+#define CTR1_SBL_1			GENMASK(9, 4)
+#define CTR1_DINC			BIT(19)
+#define CTR1_DBL_1			GENMASK(25, 20)
+#define CTR1_SDW_LOG2			GENMASK(1, 0)
+#define CTR1_PAM			GENMASK(12, 11)
+#define CTR1_SAP			BIT(14)
+#define CTR1_DDW_LOG2			GENMASK(17, 16)
+#define CTR1_DAP			BIT(30)
+
+enum ctr1_dw {
+	CTR1_DW_BYTE,
+	CTR1_DW_HWORD,
+	CTR1_DW_WORD,
+	CTR1_DW_DWORD, /* Depends on HWCFGR1.G_M0_DATA_WIDTH_ENC and .G_M1_DATA_WIDTH_ENC */
+};
+
+enum ctr1_pam {
+	CTR1_PAM_0S_LT, /* if DDW > SDW, padded with 0s else left-truncated */
+	CTR1_PAM_SE_RT, /* if DDW > SDW, sign extended else right-truncated */
+	CTR1_PAM_PACK_UNPACK, /* FIFO queued */
+};
+
+/* CxTR2 DMA channel x transfer register 2 */
+#define CTR2_REQSEL			GENMASK(7, 0)
+#define CTR2_SWREQ			BIT(9)
+#define CTR2_DREQ			BIT(10)
+#define CTR2_BREQ			BIT(11)
+#define CTR2_PFREQ			BIT(12)
+#define CTR2_TCEM			GENMASK(31, 30)
+
+enum ctr2_tcem {
+	CTR2_TCEM_BLOCK,
+	CTR2_TCEM_REPEAT_BLOCK,
+	CTR2_TCEM_LLI,
+	CTR2_TCEM_CHANNEL,
+};
+
+/* CxBR1 DMA channel x block register 1 */
+#define CBR1_BNDT			GENMASK(15, 0)
+
+/* CxLLR DMA channel x linked-list address register */
+#define CLLR_LA				GENMASK(15, 2)
+#define CLLR_ULL			BIT(16)
+#define CLLR_UDA			BIT(27)
+#define CLLR_USA			BIT(28)
+#define CLLR_UB1			BIT(29)
+#define CLLR_UT2			BIT(30)
+#define CLLR_UT1			BIT(31)
+
+/* HWCFGR13 DMA hardware configuration register 13 x=8..15 */
+/* HWCFGR12 DMA hardware configuration register 12 x=0..7 */
+#define G_PER_CTRL(x)			(ULL(0x1) << (4 * (x)))
+
+/* HWCFGR4 DMA hardware configuration register 4 x=8..15 */
+/* HWCFGR3 DMA hardware configuration register 3 x=0..7 */
+#define G_FIFO_SIZE(x)			(ULL(0x7) << (4 * (x)))
+
+#define get_chan_hwcfg(x, mask, reg)	(((reg) & (mask)) >> (4 * (x)))
+
+/* HWCFGR2 DMA hardware configuration register 2 */
+#define G_MAX_REQ_ID			GENMASK(7, 0)
+
+/* HWCFGR1 DMA hardware configuration register 1 */
+#define G_MASTER_PORTS			GENMASK(2, 0)
+#define G_NUM_CHANNELS			GENMASK(12, 8)
+#define G_M0_DATA_WIDTH_ENC		GENMASK(25, 24)
+#define G_M1_DATA_WIDTH_ENC		GENMASK(29, 28)
+
+enum stm32_dma3_master_ports {
+	AXI64, /* 1x AXI: 64-bit port 0 */
+	AHB32, /* 1x AHB: 32-bit port 0 */
+	AHB32_AHB32, /* 2x AHB: 32-bit port 0 and 32-bit port 1 */
+	AXI64_AHB32, /* 1x AXI 64-bit port 0 and 1x AHB 32-bit port 1 */
+	AXI64_AXI64, /* 2x AXI: 64-bit port 0 and 64-bit port 1 */
+	AXI128_AHB32, /* 1x AXI 128-bit port 0 and 1x AHB 32-bit port 1 */
+};
+
+enum stm32_dma3_port_data_width {
+	DW_32, /* 32-bit, for AHB */
+	DW_64, /* 64-bit, for AXI */
+	DW_128, /* 128-bit, for AXI */
+	DW_INVALID,
+};
+
+/* VERR DMA version register */
+#define VERR_MINREV			GENMASK(3, 0)
+#define VERR_MAJREV			GENMASK(7, 4)
+
+/* Device tree */
+/* struct stm32_dma3_dt_conf */
+/* .ch_conf */
+#define STM32_DMA3_DT_PRIO		GENMASK(1, 0) /* CCR_PRIO */
+#define STM32_DMA3_DT_FIFO		GENMASK(7, 4)
+/* .tr_conf */
+#define STM32_DMA3_DT_SINC		BIT(0) /* CTR1_SINC */
+#define STM32_DMA3_DT_SAP		BIT(1) /* CTR1_SAP */
+#define STM32_DMA3_DT_DINC		BIT(4) /* CTR1_DINC */
+#define STM32_DMA3_DT_DAP		BIT(5) /* CTR1_DAP */
+#define STM32_DMA3_DT_BREQ		BIT(8) /* CTR2_BREQ */
+#define STM32_DMA3_DT_PFREQ		BIT(9) /* CTR2_PFREQ */
+#define STM32_DMA3_DT_TCEM		GENMASK(13, 12) /* CTR2_TCEM */
+
+#define STM32_DMA3_MAX_BLOCK_SIZE	ALIGN_DOWN(CBR1_BNDT, 64)
+#define port_is_ahb(maxdw)		({ typeof(maxdw) (_maxdw) = (maxdw); \
+					   ((_maxdw) != DW_INVALID) && ((_maxdw) == DW_32); })
+#define port_is_axi(maxdw)		({ typeof(maxdw) (_maxdw) = (maxdw); \
+					   ((_maxdw) != DW_INVALID) && ((_maxdw) != DW_32); })
+#define get_chan_max_dw(maxdw, maxburst)((port_is_ahb(maxdw) ||			     \
+					  (maxburst) < DMA_SLAVE_BUSWIDTH_8_BYTES) ? \
+					 DMA_SLAVE_BUSWIDTH_4_BYTES : DMA_SLAVE_BUSWIDTH_8_BYTES)
+
+/* Static linked-list data structure (depends on update bits UT1/UT2/UB1/USA/UDA/ULL) */
+struct stm32_dma3_hwdesc {
+	u32 ctr1;
+	u32 ctr2;
+	u32 cbr1;
+	u32 csar;
+	u32 cdar;
+	u32 cllr;
+} __aligned(32);
+
+/*
+ * CLLR_LA / sizeof(struct stm32_dma3_hwdesc) represents the number of hwdesc that can be addressed
+ * by the pointer to the next linked-list data structure. The __aligned forces the 32-byte
+ * alignment. So use hardcoded 32. Multiplied by the max block size of each item, it represents
+ * the sg size limitation.
+ */
+#define STM32_DMA3_MAX_SEG_SIZE		((CLLR_LA / 32) * STM32_DMA3_MAX_BLOCK_SIZE)
+
+/*
+ * Linked-list items
+ */
+struct stm32_dma3_lli {
+	struct stm32_dma3_hwdesc *hwdesc;
+	dma_addr_t hwdesc_addr;
+};
+
+struct stm32_dma3_swdesc {
+	struct virt_dma_desc vdesc;
+	u32 ccr;
+	bool cyclic;
+	u32 lli_size;
+	struct stm32_dma3_lli lli[] __counted_by(lli_size);
+};
+
+struct stm32_dma3_dt_conf {
+	u32 ch_id;
+	u32 req_line;
+	u32 ch_conf;
+	u32 tr_conf;
+};
+
+struct stm32_dma3_chan {
+	struct virt_dma_chan vchan;
+	u32 id;
+	int irq;
+	u32 fifo_size;
+	u32 max_burst;
+	bool semaphore_mode;
+	struct stm32_dma3_dt_conf dt_config;
+	struct dma_slave_config dma_config;
+	struct dma_pool *lli_pool;
+	struct stm32_dma3_swdesc *swdesc;
+	enum ctr2_tcem tcem;
+	u32 dma_status;
+};
+
+struct stm32_dma3_ddata {
+	struct dma_device dma_dev;
+	void __iomem *base;
+	struct clk *clk;
+	struct stm32_dma3_chan *chans;
+	u32 dma_channels;
+	u32 dma_requests;
+	enum stm32_dma3_port_data_width ports_max_dw[2];
+};
+
+static inline struct stm32_dma3_ddata *to_stm32_dma3_ddata(struct stm32_dma3_chan *chan)
+{
+	return container_of(chan->vchan.chan.device, struct stm32_dma3_ddata, dma_dev);
+}
+
+static inline struct stm32_dma3_chan *to_stm32_dma3_chan(struct dma_chan *c)
+{
+	return container_of(c, struct stm32_dma3_chan, vchan.chan);
+}
+
+static inline struct stm32_dma3_swdesc *to_stm32_dma3_swdesc(struct virt_dma_desc *vdesc)
+{
+	return container_of(vdesc, struct stm32_dma3_swdesc, vdesc);
+}
+
+static struct device *chan2dev(struct stm32_dma3_chan *chan)
+{
+	return &chan->vchan.chan.dev->device;
+}
+
+static void stm32_dma3_chan_dump_reg(struct stm32_dma3_chan *chan)
+{
+	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
+	struct device *dev = chan2dev(chan);
+	u32 id = chan->id, offset;
+
+	offset = STM32_DMA3_SECCFGR;
+	dev_dbg(dev, "SECCFGR(0x%03x): %08x\n", offset, readl_relaxed(ddata->base + offset));
+	offset = STM32_DMA3_PRIVCFGR;
+	dev_dbg(dev, "PRIVCFGR(0x%03x): %08x\n", offset, readl_relaxed(ddata->base + offset));
+	offset = STM32_DMA3_CCIDCFGR(id);
+	dev_dbg(dev, "C%dCIDCFGR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
+	offset = STM32_DMA3_CSEMCR(id);
+	dev_dbg(dev, "C%dSEMCR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
+	offset = STM32_DMA3_CSR(id);
+	dev_dbg(dev, "C%dSR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
+	offset = STM32_DMA3_CCR(id);
+	dev_dbg(dev, "C%dCR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
+	offset = STM32_DMA3_CTR1(id);
+	dev_dbg(dev, "C%dTR1(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
+	offset = STM32_DMA3_CTR2(id);
+	dev_dbg(dev, "C%dTR2(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
+	offset = STM32_DMA3_CBR1(id);
+	dev_dbg(dev, "C%dBR1(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
+	offset = STM32_DMA3_CSAR(id);
+	dev_dbg(dev, "C%dSAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
+	offset = STM32_DMA3_CDAR(id);
+	dev_dbg(dev, "C%dDAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
+	offset = STM32_DMA3_CLLR(id);
+	dev_dbg(dev, "C%dLLR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
+	offset = STM32_DMA3_CLBAR(id);
+	dev_dbg(dev, "C%dLBAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
+}
+
+static void stm32_dma3_chan_dump_hwdesc(struct stm32_dma3_chan *chan,
+					struct stm32_dma3_swdesc *swdesc)
+{
+	struct stm32_dma3_hwdesc *hwdesc;
+	int i;
+
+	for (i = 0; i < swdesc->lli_size; i++) {
+		hwdesc = swdesc->lli[i].hwdesc;
+		if (i)
+			dev_dbg(chan2dev(chan), "V\n");
+		dev_dbg(chan2dev(chan), "[%d]@%pad\n", i, &swdesc->lli[i].hwdesc_addr);
+		dev_dbg(chan2dev(chan), "| C%dTR1: %08x\n", chan->id, hwdesc->ctr1);
+		dev_dbg(chan2dev(chan), "| C%dTR2: %08x\n", chan->id, hwdesc->ctr2);
+		dev_dbg(chan2dev(chan), "| C%dBR1: %08x\n", chan->id, hwdesc->cbr1);
+		dev_dbg(chan2dev(chan), "| C%dSAR: %08x\n", chan->id, hwdesc->csar);
+		dev_dbg(chan2dev(chan), "| C%dDAR: %08x\n", chan->id, hwdesc->cdar);
+		dev_dbg(chan2dev(chan), "| C%dLLR: %08x\n", chan->id, hwdesc->cllr);
+	}
+
+	if (swdesc->cyclic) {
+		dev_dbg(chan2dev(chan), "|\n");
+		dev_dbg(chan2dev(chan), "-->[0]@%pad\n", &swdesc->lli[0].hwdesc_addr);
+	} else {
+		dev_dbg(chan2dev(chan), "X\n");
+	}
+}
+
+static struct stm32_dma3_swdesc *stm32_dma3_chan_desc_alloc(struct stm32_dma3_chan *chan, u32 count)
+{
+	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
+	struct stm32_dma3_swdesc *swdesc;
+	int i;
+
+	/*
+	 * If the memory to be allocated for the number of hwdesc (6 u32 members but 32-bytes
+	 * aligned) is greater than the maximum address of CLLR_LA, then the last items can't be
+	 * addressed, so abort the allocation.
+	 */
+	if ((count * 32) > CLLR_LA) {
+		dev_err(chan2dev(chan), "Transfer is too big (> %luB)\n", STM32_DMA3_MAX_SEG_SIZE);
+		return NULL;
+	}
+
+	swdesc = kzalloc(struct_size(swdesc, lli, count), GFP_NOWAIT);
+	if (!swdesc)
+		return NULL;
+
+	for (i = 0; i < count; i++) {
+		swdesc->lli[i].hwdesc = dma_pool_zalloc(chan->lli_pool, GFP_NOWAIT,
+							&swdesc->lli[i].hwdesc_addr);
+		if (!swdesc->lli[i].hwdesc)
+			goto err_pool_free;
+	}
+	swdesc->lli_size = count;
+	swdesc->ccr = 0;
+
+	/* Set LL base address */
+	writel_relaxed(swdesc->lli[0].hwdesc_addr & CLBAR_LBA,
+		       ddata->base + STM32_DMA3_CLBAR(chan->id));
+
+	/* Set LL allocated port */
+	swdesc->ccr &= ~CCR_LAP;
+
+	return swdesc;
+
+err_pool_free:
+	dev_err(chan2dev(chan), "Failed to alloc descriptors\n");
+	while (--i >= 0)
+		dma_pool_free(chan->lli_pool, swdesc->lli[i].hwdesc, swdesc->lli[i].hwdesc_addr);
+	kfree(swdesc);
+
+	return NULL;
+}
+
+static void stm32_dma3_chan_desc_free(struct stm32_dma3_chan *chan,
+				      struct stm32_dma3_swdesc *swdesc)
+{
+	int i;
+
+	for (i = 0; i < swdesc->lli_size; i++)
+		dma_pool_free(chan->lli_pool, swdesc->lli[i].hwdesc, swdesc->lli[i].hwdesc_addr);
+
+	kfree(swdesc);
+}
+
+static void stm32_dma3_chan_vdesc_free(struct virt_dma_desc *vdesc)
+{
+	struct stm32_dma3_swdesc *swdesc = to_stm32_dma3_swdesc(vdesc);
+	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(vdesc->tx.chan);
+
+	stm32_dma3_chan_desc_free(chan, swdesc);
+}
+
+static void stm32_dma3_check_user_setting(struct stm32_dma3_chan *chan)
+{
+	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
+	struct device *dev = chan2dev(chan);
+	u32 ctr1 = readl_relaxed(ddata->base + STM32_DMA3_CTR1(chan->id));
+	u32 cbr1 = readl_relaxed(ddata->base + STM32_DMA3_CBR1(chan->id));
+	u32 csar = readl_relaxed(ddata->base + STM32_DMA3_CSAR(chan->id));
+	u32 cdar = readl_relaxed(ddata->base + STM32_DMA3_CDAR(chan->id));
+	u32 cllr = readl_relaxed(ddata->base + STM32_DMA3_CLLR(chan->id));
+	u32 bndt = FIELD_GET(CBR1_BNDT, cbr1);
+	u32 sdw = 1 << FIELD_GET(CTR1_SDW_LOG2, ctr1);
+	u32 ddw = 1 << FIELD_GET(CTR1_DDW_LOG2, ctr1);
+	u32 sap = FIELD_GET(CTR1_SAP, ctr1);
+	u32 dap = FIELD_GET(CTR1_DAP, ctr1);
+
+	if (!bndt && !FIELD_GET(CLLR_UB1, cllr))
+		dev_err(dev, "null source block size and no update of this value\n");
+	if (bndt % sdw)
+		dev_err(dev, "source block size not multiple of src data width\n");
+	if (FIELD_GET(CTR1_PAM, ctr1) == CTR1_PAM_PACK_UNPACK && bndt % ddw)
+		dev_err(dev, "(un)packing mode w/ src block size not multiple of dst data width\n");
+	if (csar % sdw)
+		dev_err(dev, "unaligned source address not multiple of src data width\n");
+	if (cdar % ddw)
+		dev_err(dev, "unaligned destination address not multiple of dst data width\n");
+	if (sdw == DMA_SLAVE_BUSWIDTH_8_BYTES && port_is_ahb(ddata->ports_max_dw[sap]))
+		dev_err(dev, "double-word source data width not supported on port %u\n", sap);
+	if (ddw == DMA_SLAVE_BUSWIDTH_8_BYTES && port_is_ahb(ddata->ports_max_dw[dap]))
+		dev_err(dev, "double-word destination data width not supported on port %u\n", dap);
+}
+
+static void stm32_dma3_chan_prep_hwdesc(struct stm32_dma3_chan *chan,
+					struct stm32_dma3_swdesc *swdesc,
+					u32 curr, dma_addr_t src, dma_addr_t dst, u32 len,
+					u32 ctr1, u32 ctr2, bool is_last, bool is_cyclic)
+{
+	struct stm32_dma3_hwdesc *hwdesc;
+	dma_addr_t next_lli;
+	u32 next = curr + 1;
+
+	hwdesc = swdesc->lli[curr].hwdesc;
+	hwdesc->ctr1 = ctr1;
+	hwdesc->ctr2 = ctr2;
+	hwdesc->cbr1 = FIELD_PREP(CBR1_BNDT, len);
+	hwdesc->csar = src;
+	hwdesc->cdar = dst;
+
+	if (is_last) {
+		if (is_cyclic)
+			next_lli = swdesc->lli[0].hwdesc_addr;
+		else
+			next_lli = 0;
+	} else {
+		next_lli = swdesc->lli[next].hwdesc_addr;
+	}
+
+	hwdesc->cllr = 0;
+	if (next_lli) {
+		hwdesc->cllr |= CLLR_UT1 | CLLR_UT2 | CLLR_UB1;
+		hwdesc->cllr |= CLLR_USA | CLLR_UDA | CLLR_ULL;
+		hwdesc->cllr |= (next_lli & CLLR_LA);
+	}
+}
+
+static enum dma_slave_buswidth stm32_dma3_get_max_dw(u32 chan_max_burst,
+						     enum stm32_dma3_port_data_width port_max_dw,
+						     u32 len, dma_addr_t addr)
+{
+	enum dma_slave_buswidth max_dw = get_chan_max_dw(port_max_dw, chan_max_burst);
+
+	/* len and addr must be a multiple of dw */
+	return 1 << __ffs(len | addr | max_dw);
+}
+
+static u32 stm32_dma3_get_max_burst(u32 len, enum dma_slave_buswidth dw, u32 chan_max_burst)
+{
+	u32 max_burst = chan_max_burst ? chan_max_burst / dw : 1;
+
+	/* len is a multiple of dw, so if len is < chan_max_burst, shorten burst */
+	if (len < chan_max_burst)
+		max_burst = len / dw;
+
+	/*
+	 * HW doesn't modify the burst if burst size <= half of the fifo size.
+	 * If len is not a multiple of burst size, last burst is shortened by HW.
+	 */
+	return max_burst;
+}
+
+static int stm32_dma3_chan_prep_hw(struct stm32_dma3_chan *chan, enum dma_transfer_direction dir,
+				   u32 *ccr, u32 *ctr1, u32 *ctr2,
+				   dma_addr_t src_addr, dma_addr_t dst_addr, u32 len)
+{
+	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
+	struct dma_device dma_device = ddata->dma_dev;
+	u32 sdw, ddw, sbl_max, dbl_max, tcem;
+	u32 _ctr1 = 0, _ctr2 = 0;
+	u32 ch_conf = chan->dt_config.ch_conf;
+	u32 tr_conf = chan->dt_config.tr_conf;
+	u32 sap = FIELD_GET(STM32_DMA3_DT_SAP, tr_conf), sap_max_dw;
+	u32 dap = FIELD_GET(STM32_DMA3_DT_DAP, tr_conf), dap_max_dw;
+
+	dev_dbg(chan2dev(chan), "%s from %pad to %pad\n",
+		dmaengine_get_direction_text(dir), &src_addr, &dst_addr);
+
+	sdw = chan->dma_config.src_addr_width ? : get_chan_max_dw(sap, chan->max_burst);
+	ddw = chan->dma_config.dst_addr_width ? : get_chan_max_dw(dap, chan->max_burst);
+	sbl_max = chan->dma_config.src_maxburst ? : 1;
+	dbl_max = chan->dma_config.dst_maxburst ? : 1;
+
+	/* Following conditions would raise User Setting Error interrupt */
+	if (!(dma_device.src_addr_widths & BIT(sdw)) || !(dma_device.dst_addr_widths & BIT(ddw))) {
+		dev_err(chan2dev(chan), "Bus width (src=%u, dst=%u) not supported\n", sdw, ddw);
+		return -EINVAL;
+	}
+
+	if (ddata->ports_max_dw[1] == DW_INVALID && (sap || dap)) {
+		dev_err(chan2dev(chan), "Only one master port, port 1 is not supported\n");
+		return -EINVAL;
+	}
+
+	sap_max_dw = ddata->ports_max_dw[sap];
+	dap_max_dw = ddata->ports_max_dw[dap];
+	if ((port_is_ahb(sap_max_dw) && sdw == DMA_SLAVE_BUSWIDTH_8_BYTES) ||
+	    (port_is_ahb(dap_max_dw) && ddw == DMA_SLAVE_BUSWIDTH_8_BYTES)) {
+		dev_err(chan2dev(chan),
+			"8 bytes buswidth (src=%u, dst=%u) not supported on port (sap=%u, dap=%u)\n",
+			sdw, ddw, sap, dap);
+		return -EINVAL;
+	}
+
+	if (FIELD_GET(STM32_DMA3_DT_SINC, tr_conf))
+		_ctr1 |= CTR1_SINC;
+	if (sap)
+		_ctr1 |= CTR1_SAP;
+	if (FIELD_GET(STM32_DMA3_DT_DINC, tr_conf))
+		_ctr1 |= CTR1_DINC;
+	if (dap)
+		_ctr1 |= CTR1_DAP;
+
+	_ctr2 |= FIELD_PREP(CTR2_REQSEL, chan->dt_config.req_line) & ~CTR2_SWREQ;
+	if (FIELD_GET(STM32_DMA3_DT_BREQ, tr_conf))
+		_ctr2 |= CTR2_BREQ;
+	if (dir == DMA_DEV_TO_MEM && FIELD_GET(STM32_DMA3_DT_PFREQ, tr_conf))
+		_ctr2 |= CTR2_PFREQ;
+	tcem = FIELD_GET(STM32_DMA3_DT_TCEM, tr_conf);
+	_ctr2 |= FIELD_PREP(CTR2_TCEM, tcem);
+
+	/* Store TCEM to know on which event TC flag occurred */
+	chan->tcem = tcem;
+	/* Store direction for residue computation */
+	chan->dma_config.direction = dir;
+
+	switch (dir) {
+	case DMA_MEM_TO_DEV:
+		/* Set destination (device) data width and burst */
+		ddw = min_t(u32, ddw, stm32_dma3_get_max_dw(chan->max_burst, dap_max_dw,
+							    len, dst_addr));
+		dbl_max = min_t(u32, dbl_max, stm32_dma3_get_max_burst(len, ddw, chan->max_burst));
+
+		/* Set source (memory) data width and burst */
+		sdw = stm32_dma3_get_max_dw(chan->max_burst, sap_max_dw, len, src_addr);
+		sbl_max = stm32_dma3_get_max_burst(len, sdw, chan->max_burst);
+
+		_ctr1 |= FIELD_PREP(CTR1_SDW_LOG2, ilog2(sdw));
+		_ctr1 |= FIELD_PREP(CTR1_SBL_1, sbl_max - 1);
+		_ctr1 |= FIELD_PREP(CTR1_DDW_LOG2, ilog2(ddw));
+		_ctr1 |= FIELD_PREP(CTR1_DBL_1, dbl_max - 1);
+
+		if (ddw != sdw) {
+			_ctr1 |= FIELD_PREP(CTR1_PAM, CTR1_PAM_PACK_UNPACK);
+			/* Should never reach this case as ddw is clamped down */
+			if (len & (ddw - 1)) {
+				dev_err(chan2dev(chan),
+					"Packing mode is enabled and len is not multiple of ddw");
+				return -EINVAL;
+			}
+		}
+
+		/* dst = dev */
+		_ctr2 |= CTR2_DREQ;
+
+		break;
+
+	case DMA_DEV_TO_MEM:
+		/* Set source (device) data width and burst */
+		sdw = min_t(u32, sdw, stm32_dma3_get_max_dw(chan->max_burst, sap_max_dw,
+							    len, src_addr));
+		sbl_max = min_t(u32, sbl_max, stm32_dma3_get_max_burst(len, sdw, chan->max_burst));
+
+		/* Set destination (memory) data width and burst */
+		ddw = stm32_dma3_get_max_dw(chan->max_burst, dap_max_dw, len, dst_addr);
+		dbl_max = stm32_dma3_get_max_burst(len, ddw, chan->max_burst);
+
+		_ctr1 |= FIELD_PREP(CTR1_SDW_LOG2, ilog2(sdw));
+		_ctr1 |= FIELD_PREP(CTR1_SBL_1, sbl_max - 1);
+		_ctr1 |= FIELD_PREP(CTR1_DDW_LOG2, ilog2(ddw));
+		_ctr1 |= FIELD_PREP(CTR1_DBL_1, dbl_max - 1);
+
+		if (ddw != sdw) {
+			_ctr1 |= FIELD_PREP(CTR1_PAM, CTR1_PAM_PACK_UNPACK);
+			/* Should never reach this case as ddw is clamped down */
+			if (len & (ddw - 1)) {
+				dev_err(chan2dev(chan),
+					"Packing mode is enabled and len is not multiple of ddw\n");
+				return -EINVAL;
+			}
+		}
+
+		/* dst = mem */
+		_ctr2 &= ~CTR2_DREQ;
+
+		break;
+
+	default:
+		dev_err(chan2dev(chan), "Direction %s not supported\n",
+			dmaengine_get_direction_text(dir));
+		return -EINVAL;
+	}
+
+	*ccr |= FIELD_PREP(CCR_PRIO, FIELD_GET(STM32_DMA3_DT_PRIO, ch_conf));
+	*ctr1 = _ctr1;
+	*ctr2 = _ctr2;
+
+	dev_dbg(chan2dev(chan), "%s: sdw=%u bytes sbl=%u beats ddw=%u bytes dbl=%u beats\n",
+		__func__, sdw, sbl_max, ddw, dbl_max);
+
+	return 0;
+}
+
+static void stm32_dma3_chan_start(struct stm32_dma3_chan *chan)
+{
+	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
+	struct virt_dma_desc *vdesc;
+	struct stm32_dma3_hwdesc *hwdesc;
+	u32 id = chan->id;
+	u32 csr, ccr;
+
+	vdesc = vchan_next_desc(&chan->vchan);
+	if (!vdesc) {
+		chan->swdesc = NULL;
+		return;
+	}
+	list_del(&vdesc->node);
+
+	chan->swdesc = to_stm32_dma3_swdesc(vdesc);
+	hwdesc = chan->swdesc->lli[0].hwdesc;
+
+	stm32_dma3_chan_dump_hwdesc(chan, chan->swdesc);
+
+	writel_relaxed(chan->swdesc->ccr, ddata->base + STM32_DMA3_CCR(id));
+	writel_relaxed(hwdesc->ctr1, ddata->base + STM32_DMA3_CTR1(id));
+	writel_relaxed(hwdesc->ctr2, ddata->base + STM32_DMA3_CTR2(id));
+	writel_relaxed(hwdesc->cbr1, ddata->base + STM32_DMA3_CBR1(id));
+	writel_relaxed(hwdesc->csar, ddata->base + STM32_DMA3_CSAR(id));
+	writel_relaxed(hwdesc->cdar, ddata->base + STM32_DMA3_CDAR(id));
+	writel_relaxed(hwdesc->cllr, ddata->base + STM32_DMA3_CLLR(id));
+
+	/* Clear any pending interrupts */
+	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(id));
+	if (csr & CSR_ALL_F)
+		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(id));
+
+	stm32_dma3_chan_dump_reg(chan);
+
+	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(id));
+	writel_relaxed(ccr | CCR_EN, ddata->base + STM32_DMA3_CCR(id));
+
+	chan->dma_status = DMA_IN_PROGRESS;
+
+	dev_dbg(chan2dev(chan), "vchan %pK: started\n", &chan->vchan);
+}
+
+static int stm32_dma3_chan_suspend(struct stm32_dma3_chan *chan, bool susp)
+{
+	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
+	u32 csr, ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
+	int ret = 0;
+
+	if (susp)
+		ccr |= CCR_SUSP;
+	else
+		ccr &= ~CCR_SUSP;
+
+	writel_relaxed(ccr, ddata->base + STM32_DMA3_CCR(chan->id));
+
+	if (susp) {
+		ret = readl_relaxed_poll_timeout_atomic(ddata->base + STM32_DMA3_CSR(chan->id), csr,
+							csr & CSR_SUSPF, 1, 10);
+		if (!ret)
+			writel_relaxed(CFCR_SUSPF, ddata->base + STM32_DMA3_CFCR(chan->id));
+
+		stm32_dma3_chan_dump_reg(chan);
+	}
+
+	return ret;
+}
+
+static void stm32_dma3_chan_reset(struct stm32_dma3_chan *chan)
+{
+	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
+	u32 ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
+
+	writel_relaxed(ccr |= CCR_RESET, ddata->base + STM32_DMA3_CCR(chan->id));
+}
+
+static int stm32_dma3_chan_stop(struct stm32_dma3_chan *chan)
+{
+	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
+	u32 ccr;
+	int ret = 0;
+
+	chan->dma_status = DMA_COMPLETE;
+
+	/* Disable interrupts */
+	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id));
+	writel_relaxed(ccr & ~(CCR_ALLIE | CCR_EN), ddata->base + STM32_DMA3_CCR(chan->id));
+
+	if (!(ccr & CCR_SUSP) && (ccr & CCR_EN)) {
+		/* Suspend the channel */
+		ret = stm32_dma3_chan_suspend(chan, true);
+		if (ret)
+			dev_warn(chan2dev(chan), "%s: timeout, data might be lost\n", __func__);
+	}
+
+	/*
+	 * Reset the channel: this causes the reset of the FIFO and the reset of the channel
+	 * internal state, the reset of CCR_EN and CCR_SUSP bits.
+	 */
+	stm32_dma3_chan_reset(chan);
+
+	return ret;
+}
+
+static void stm32_dma3_chan_complete(struct stm32_dma3_chan *chan)
+{
+	if (!chan->swdesc)
+		return;
+
+	vchan_cookie_complete(&chan->swdesc->vdesc);
+	chan->swdesc = NULL;
+	stm32_dma3_chan_start(chan);
+}
+
+static irqreturn_t stm32_dma3_chan_irq(int irq, void *devid)
+{
+	struct stm32_dma3_chan *chan = devid;
+	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
+	u32 misr, csr, ccr;
+
+	spin_lock(&chan->vchan.lock);
+
+	misr = readl_relaxed(ddata->base + STM32_DMA3_MISR);
+	if (!(misr & MISR_MIS(chan->id))) {
+		spin_unlock(&chan->vchan.lock);
+		return IRQ_NONE;
+	}
+
+	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(chan->id));
+	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & CCR_ALLIE;
+
+	if (csr & CSR_TCF && ccr & CCR_TCIE) {
+		if (chan->swdesc->cyclic)
+			vchan_cyclic_callback(&chan->swdesc->vdesc);
+		else
+			stm32_dma3_chan_complete(chan);
+	}
+
+	if (csr & CSR_USEF && ccr & CCR_USEIE) {
+		dev_err(chan2dev(chan), "User setting error\n");
+		chan->dma_status = DMA_ERROR;
+		/* CCR.EN automatically cleared by HW */
+		stm32_dma3_check_user_setting(chan);
+		stm32_dma3_chan_reset(chan);
+	}
+
+	if (csr & CSR_ULEF && ccr & CCR_ULEIE) {
+		dev_err(chan2dev(chan), "Update link transfer error\n");
+		chan->dma_status = DMA_ERROR;
+		/* CCR.EN automatically cleared by HW */
+		stm32_dma3_chan_reset(chan);
+	}
+
+	if (csr & CSR_DTEF && ccr & CCR_DTEIE) {
+		dev_err(chan2dev(chan), "Data transfer error\n");
+		chan->dma_status = DMA_ERROR;
+		/* CCR.EN automatically cleared by HW */
+		stm32_dma3_chan_reset(chan);
+	}
+
+	/*
+	 * The Half Transfer Interrupt may be disabled but the Half Transfer Flag can still be set,
+	 * so ensure the HTF flag is cleared along with the other flags.
+	 */
+	csr &= (ccr | CCR_HTIE);
+
+	if (csr)
+		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(chan->id));
+
+	spin_unlock(&chan->vchan.lock);
+
+	return IRQ_HANDLED;
+}
+
+static int stm32_dma3_alloc_chan_resources(struct dma_chan *c)
+{
+	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
+	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
+	u32 id = chan->id, csemcr, ccid;
+	int ret;
+
+	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
+	if (ret < 0)
+		return ret;
+
+	/* Ensure the channel is free */
+	if (chan->semaphore_mode &&
+	    readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id)) & CSEMCR_SEM_MUTEX) {
+		ret = -EBUSY;
+		goto err_put_sync;
+	}
+
+	chan->lli_pool = dmam_pool_create(dev_name(&c->dev->device), c->device->dev,
+					  sizeof(struct stm32_dma3_hwdesc),
+					  __alignof__(struct stm32_dma3_hwdesc), 0);
+	if (!chan->lli_pool) {
+		dev_err(chan2dev(chan), "Failed to create LLI pool\n");
+		ret = -ENOMEM;
+		goto err_put_sync;
+	}
+
+	/* Take the channel semaphore */
+	if (chan->semaphore_mode) {
+		writel_relaxed(CSEMCR_SEM_MUTEX, ddata->base + STM32_DMA3_CSEMCR(id));
+		csemcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(id));
+		ccid = FIELD_GET(CSEMCR_SEM_CCID, csemcr);
+		/* Check that the channel was effectively taken */
+		if (ccid != CCIDCFGR_CID1) {
+			dev_err(chan2dev(chan), "Not under CID1 control (in-use by CID%d)\n", ccid);
+			ret = -EPERM;
+			goto err_pool_destroy;
+		}
+		dev_dbg(chan2dev(chan), "Under CID1 control (semcr=0x%08x)\n", csemcr);
+	}
+
+	return 0;
+
+err_pool_destroy:
+	dmam_pool_destroy(chan->lli_pool);
+	chan->lli_pool = NULL;
+
+err_put_sync:
+	pm_runtime_put_sync(ddata->dma_dev.dev);
+
+	return ret;
+}
+
+static void stm32_dma3_free_chan_resources(struct dma_chan *c)
+{
+	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
+	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
+	unsigned long flags;
+
+	/* Ensure channel is in idle state */
+	spin_lock_irqsave(&chan->vchan.lock, flags);
+	stm32_dma3_chan_stop(chan);
+	chan->swdesc = NULL;
+	spin_unlock_irqrestore(&chan->vchan.lock, flags);
+
+	vchan_free_chan_resources(to_virt_chan(c));
+
+	dmam_pool_destroy(chan->lli_pool);
+	chan->lli_pool = NULL;
+
+	/* Release the channel semaphore */
+	if (chan->semaphore_mode)
+		writel_relaxed(0, ddata->base + STM32_DMA3_CSEMCR(chan->id));
+
+	pm_runtime_put_sync(ddata->dma_dev.dev);
+
+	/* Reset configuration */
+	memset(&chan->dt_config, 0, sizeof(chan->dt_config));
+	memset(&chan->dma_config, 0, sizeof(chan->dma_config));
+}
+
+static struct dma_async_tx_descriptor *stm32_dma3_prep_slave_sg(struct dma_chan *c,
+								struct scatterlist *sgl,
+								unsigned int sg_len,
+								enum dma_transfer_direction dir,
+								unsigned long flags, void *context)
+{
+	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
+	struct stm32_dma3_swdesc *swdesc;
+	struct scatterlist *sg;
+	size_t len;
+	dma_addr_t sg_addr, dev_addr, src, dst;
+	u32 i, j, count, ctr1, ctr2;
+	int ret;
+
+	count = sg_len;
+	for_each_sg(sgl, sg, sg_len, i) {
+		len = sg_dma_len(sg);
+		if (len > STM32_DMA3_MAX_BLOCK_SIZE)
+			count += DIV_ROUND_UP(len, STM32_DMA3_MAX_BLOCK_SIZE) - 1;
+	}
+
+	swdesc = stm32_dma3_chan_desc_alloc(chan, count);
+	if (!swdesc)
+		return NULL;
+
+	/* sg_len and i correspond to the initial sgl; count and j correspond to the hwdesc LL */
+	j = 0;
+	for_each_sg(sgl, sg, sg_len, i) {
+		sg_addr = sg_dma_address(sg);
+		dev_addr = (dir == DMA_MEM_TO_DEV) ? chan->dma_config.dst_addr :
+						     chan->dma_config.src_addr;
+		len = sg_dma_len(sg);
+
+		do {
+			size_t chunk = min_t(size_t, len, STM32_DMA3_MAX_BLOCK_SIZE);
+
+			if (dir == DMA_MEM_TO_DEV) {
+				src = sg_addr;
+				dst = dev_addr;
+
+				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
+							      src, dst, chunk);
+
+				if (FIELD_GET(CTR1_DINC, ctr1))
+					dev_addr += chunk;
+			} else { /* (dir == DMA_DEV_TO_MEM || dir == DMA_MEM_TO_MEM) */
+				src = dev_addr;
+				dst = sg_addr;
+
+				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
+							      src, dst, chunk);
+
+				if (FIELD_GET(CTR1_SINC, ctr1))
+					dev_addr += chunk;
+			}
+
+			if (ret)
+				goto err_desc_free;
+
+			stm32_dma3_chan_prep_hwdesc(chan, swdesc, j, src, dst, chunk,
+						    ctr1, ctr2, j == (count - 1), false);
+
+			sg_addr += chunk;
+			len -= chunk;
+			j++;
+		} while (len);
+	}
+
+	/* Enable Error interrupts */
+	swdesc->ccr |= CCR_USEIE | CCR_ULEIE | CCR_DTEIE;
+	/* Enable Transfer state interrupts */
+	swdesc->ccr |= CCR_TCIE;
+
+	swdesc->cyclic = false;
+
+	return vchan_tx_prep(&chan->vchan, &swdesc->vdesc, flags);
+
+err_desc_free:
+	stm32_dma3_chan_desc_free(chan, swdesc);
+
+	return NULL;
+}
+
+static void stm32_dma3_caps(struct dma_chan *c, struct dma_slave_caps *caps)
+{
+	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
+
+	if (!chan->fifo_size) {
+		caps->max_burst = 0;
+		caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
+		caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
+	} else {
+		/* Burst transfer should not exceed half of the fifo size */
+		caps->max_burst = chan->max_burst;
+		if (caps->max_burst < DMA_SLAVE_BUSWIDTH_8_BYTES) {
+			caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
+			caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
+		}
+	}
+}
+
+static int stm32_dma3_config(struct dma_chan *c, struct dma_slave_config *config)
+{
+	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
+
+	memcpy(&chan->dma_config, config, sizeof(*config));
+
+	return 0;
+}
+
+static int stm32_dma3_terminate_all(struct dma_chan *c)
+{
+	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
+	unsigned long flags;
+	LIST_HEAD(head);
+
+	spin_lock_irqsave(&chan->vchan.lock, flags);
+
+	if (chan->swdesc) {
+		vchan_terminate_vdesc(&chan->swdesc->vdesc);
+		chan->swdesc = NULL;
+	}
+
+	stm32_dma3_chan_stop(chan);
+
+	vchan_get_all_descriptors(&chan->vchan, &head);
+
+	spin_unlock_irqrestore(&chan->vchan.lock, flags);
+	vchan_dma_desc_free_list(&chan->vchan, &head);
+
+	dev_dbg(chan2dev(chan), "vchan %pK: terminated\n", &chan->vchan);
+
+	return 0;
+}
+
+static void stm32_dma3_synchronize(struct dma_chan *c)
+{
+	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
+
+	vchan_synchronize(&chan->vchan);
+}
+
+static void stm32_dma3_issue_pending(struct dma_chan *c)
+{
+	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
+	unsigned long flags;
+
+	spin_lock_irqsave(&chan->vchan.lock, flags);
+
+	if (vchan_issue_pending(&chan->vchan) && !chan->swdesc) {
+		dev_dbg(chan2dev(chan), "vchan %pK: issued\n", &chan->vchan);
+		stm32_dma3_chan_start(chan);
+	}
+
+	spin_unlock_irqrestore(&chan->vchan.lock, flags);
+}
+
+static bool stm32_dma3_filter_fn(struct dma_chan *c, void *fn_param)
+{
+	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
+	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
+	struct stm32_dma3_dt_conf *conf = fn_param;
+	u32 mask, semcr;
+	int ret;
+
+	dev_dbg(c->device->dev, "%s(%s): req_line=%d ch_conf=%08x tr_conf=%08x\n",
+		__func__, dma_chan_name(c), conf->req_line, conf->ch_conf, conf->tr_conf);
+
+	if (!of_property_read_u32(c->device->dev->of_node, "dma-channel-mask", &mask))
+		if (!(mask & BIT(chan->id)))
+			return false;
+
+	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
+	if (ret < 0)
+		return false;
+	semcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id));
+	pm_runtime_put_sync(ddata->dma_dev.dev);
+
+	/* Check if chan is free */
+	if (semcr & CSEMCR_SEM_MUTEX)
+		return false;
+
+	/* Check if chan fifo fits well */
+	if (FIELD_GET(STM32_DMA3_DT_FIFO, conf->ch_conf) != chan->fifo_size)
+		return false;
+
+	return true;
+}
+
+static struct dma_chan *stm32_dma3_of_xlate(struct of_phandle_args *dma_spec, struct of_dma *ofdma)
+{
+	struct stm32_dma3_ddata *ddata = ofdma->of_dma_data;
+	dma_cap_mask_t mask = ddata->dma_dev.cap_mask;
+	struct stm32_dma3_dt_conf conf;
+	struct stm32_dma3_chan *chan;
+	struct dma_chan *c;
+
+	if (dma_spec->args_count < 3) {
+		dev_err(ddata->dma_dev.dev, "Invalid args count\n");
+		return NULL;
+	}
+
+	conf.req_line = dma_spec->args[0];
+	conf.ch_conf = dma_spec->args[1];
+	conf.tr_conf = dma_spec->args[2];
+
+	if (conf.req_line >= ddata->dma_requests) {
+		dev_err(ddata->dma_dev.dev, "Invalid request line\n");
+		return NULL;
+	}
+
+	/* Request dma channel among the generic dma controller list */
+	c = dma_request_channel(mask, stm32_dma3_filter_fn, &conf);
+	if (!c) {
+		dev_err(ddata->dma_dev.dev, "No suitable channel found\n");
+		return NULL;
+	}
+
+	chan = to_stm32_dma3_chan(c);
+	chan->dt_config = conf;
+
+	return c;
+}
+
+static u32 stm32_dma3_check_rif(struct stm32_dma3_ddata *ddata)
+{
+	u32 chan_reserved, mask = 0, i, ccidcfgr, invalid_cid = 0;
+
+	/* Reserve Secure channels */
+	chan_reserved = readl_relaxed(ddata->base + STM32_DMA3_SECCFGR);
+
+	/*
+	 * CID filtering must be configured to ensure that the DMA3 channel will inherit the CID of
+	 * the processor which is configuring and using the given channel.
+	 * In case CID filtering is not configured, dma-channel-mask property can be used to
+	 * specify available DMA channels to the kernel.
+	 */
+	of_property_read_u32(ddata->dma_dev.dev->of_node, "dma-channel-mask", &mask);
+
+	/* Reserve !CID-filtered not in dma-channel-mask, static CID != CID1, CID1 not allowed */
+	for (i = 0; i < ddata->dma_channels; i++) {
+		ccidcfgr = readl_relaxed(ddata->base + STM32_DMA3_CCIDCFGR(i));
+
+		if (!(ccidcfgr & CCIDCFGR_CFEN)) { /* !CID-filtered */
+			invalid_cid |= BIT(i);
+			if (!(mask & BIT(i))) /* Not in dma-channel-mask */
+				chan_reserved |= BIT(i);
+		} else { /* CID-filtered */
+			if (!(ccidcfgr & CCIDCFGR_SEM_EN)) { /* Static CID mode */
+				if (FIELD_GET(CCIDCFGR_SCID, ccidcfgr) != CCIDCFGR_CID1)
+					chan_reserved |= BIT(i);
+			} else { /* Semaphore mode */
+				if (!FIELD_GET(CCIDCFGR_SEM_WLIST_CID1, ccidcfgr))
+					chan_reserved |= BIT(i);
+				ddata->chans[i].semaphore_mode = true;
+			}
+		}
+		dev_dbg(ddata->dma_dev.dev, "chan%d: %s mode, %s\n", i,
+			!(ccidcfgr & CCIDCFGR_CFEN) ? "!CID-filtered" :
+			ddata->chans[i].semaphore_mode ? "Semaphore" : "Static CID",
+			(chan_reserved & BIT(i)) ? "denied" :
+			mask & BIT(i) ? "force allowed" : "allowed");
+	}
+
+	if (invalid_cid)
+		dev_warn(ddata->dma_dev.dev, "chan%*pbl have invalid CID configuration\n",
+			 ddata->dma_channels, &invalid_cid);
+
+	return chan_reserved;
+}
+
+static const struct of_device_id stm32_dma3_of_match[] = {
+	{ .compatible = "st,stm32-dma3", },
+	{ /* sentinel */},
+};
+MODULE_DEVICE_TABLE(of, stm32_dma3_of_match);
+
+static int stm32_dma3_probe(struct platform_device *pdev)
+{
+	struct device_node *np = pdev->dev.of_node;
+	struct stm32_dma3_ddata *ddata;
+	struct reset_control *reset;
+	struct stm32_dma3_chan *chan;
+	struct dma_device *dma_dev;
+	u32 master_ports, chan_reserved, i, verr;
+	u64 hwcfgr;
+	int ret;
+
+	ddata = devm_kzalloc(&pdev->dev, sizeof(*ddata), GFP_KERNEL);
+	if (!ddata)
+		return -ENOMEM;
+	platform_set_drvdata(pdev, ddata);
+
+	dma_dev = &ddata->dma_dev;
+
+	ddata->base = devm_platform_ioremap_resource(pdev, 0);
+	if (IS_ERR(ddata->base))
+		return PTR_ERR(ddata->base);
+
+	ddata->clk = devm_clk_get(&pdev->dev, NULL);
+	if (IS_ERR(ddata->clk))
+		return dev_err_probe(&pdev->dev, PTR_ERR(ddata->clk), "Failed to get clk\n");
+
+	reset = devm_reset_control_get_optional(&pdev->dev, NULL);
+	if (IS_ERR(reset))
+		return dev_err_probe(&pdev->dev, PTR_ERR(reset), "Failed to get reset\n");
+
+	ret = clk_prepare_enable(ddata->clk);
+	if (ret)
+		return dev_err_probe(&pdev->dev, ret, "Failed to enable clk\n");
+
+	reset_control_reset(reset);
+
+	INIT_LIST_HEAD(&dma_dev->channels);
+
+	dma_cap_set(DMA_SLAVE, dma_dev->cap_mask);
+	dma_cap_set(DMA_PRIVATE, dma_dev->cap_mask);
+	dma_dev->dev = &pdev->dev;
+	/*
+	 * This controller supports up to 8-byte buswidth depending on the port used and the
+	 * channel, and can only access addresses aligned on the buswidth.
+	 */
+	dma_dev->copy_align = DMAENGINE_ALIGN_8_BYTES;
+	dma_dev->src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
+				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
+				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
+				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
+	dma_dev->dst_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
+				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
+				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
+				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
+	dma_dev->directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV) | BIT(DMA_MEM_TO_MEM);
+
+	dma_dev->descriptor_reuse = true;
+	dma_dev->max_sg_burst = STM32_DMA3_MAX_SEG_SIZE;
+	dma_dev->residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
+	dma_dev->device_alloc_chan_resources = stm32_dma3_alloc_chan_resources;
+	dma_dev->device_free_chan_resources = stm32_dma3_free_chan_resources;
+	dma_dev->device_prep_slave_sg = stm32_dma3_prep_slave_sg;
+	dma_dev->device_caps = stm32_dma3_caps;
+	dma_dev->device_config = stm32_dma3_config;
+	dma_dev->device_terminate_all = stm32_dma3_terminate_all;
+	dma_dev->device_synchronize = stm32_dma3_synchronize;
+	dma_dev->device_tx_status = dma_cookie_status;
+	dma_dev->device_issue_pending = stm32_dma3_issue_pending;
+
+	/* if dma_channels is not modified, get it from hwcfgr1 */
+	if (of_property_read_u32(np, "dma-channels", &ddata->dma_channels)) {
+		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
+		ddata->dma_channels = FIELD_GET(G_NUM_CHANNELS, hwcfgr);
+	}
+
+	/* if dma_requests is not modified, get it from hwcfgr2 */
+	if (of_property_read_u32(np, "dma-requests", &ddata->dma_requests)) {
+		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR2);
+		ddata->dma_requests = FIELD_GET(G_MAX_REQ_ID, hwcfgr) + 1;
+	}
+
+	/* G_MASTER_PORTS, G_M0_DATA_WIDTH_ENC, G_M1_DATA_WIDTH_ENC in HWCFGR1 */
+	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
+	master_ports = FIELD_GET(G_MASTER_PORTS, hwcfgr);
+
+	ddata->ports_max_dw[0] = FIELD_GET(G_M0_DATA_WIDTH_ENC, hwcfgr);
+	if (master_ports == AXI64 || master_ports == AHB32) /* Single master port */
+		ddata->ports_max_dw[1] = DW_INVALID;
+	else /* Dual master ports */
+		ddata->ports_max_dw[1] = FIELD_GET(G_M1_DATA_WIDTH_ENC, hwcfgr);
+
+	ddata->chans = devm_kcalloc(&pdev->dev, ddata->dma_channels, sizeof(*ddata->chans),
+				    GFP_KERNEL);
+	if (!ddata->chans) {
+		ret = -ENOMEM;
+		goto err_clk_disable;
+	}
+
+	chan_reserved = stm32_dma3_check_rif(ddata);
+
+	if (chan_reserved == GENMASK(ddata->dma_channels - 1, 0)) {
+		ret = -ENODEV;
+		dev_err_probe(&pdev->dev, ret, "No channel available, abort registration\n");
+		goto err_clk_disable;
+	}
+
+	/* G_FIFO_SIZE x=0..7 in HWCFGR3 and G_FIFO_SIZE x=8..15 in HWCFGR4 */
+	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR3);
+	hwcfgr |= ((u64)readl_relaxed(ddata->base + STM32_DMA3_HWCFGR4)) << 32;
+
+	for (i = 0; i < ddata->dma_channels; i++) {
+		if (chan_reserved & BIT(i))
+			continue;
+
+		chan = &ddata->chans[i];
+		chan->id = i;
+		chan->fifo_size = get_chan_hwcfg(i, G_FIFO_SIZE(i), hwcfgr);
+		/* If chan->fifo_size > 0 then half of the fifo size, else no burst when no FIFO */
+		chan->max_burst = (chan->fifo_size) ? (1 << (chan->fifo_size + 1)) / 2 : 0;
+		chan->vchan.desc_free = stm32_dma3_chan_vdesc_free;
+
+		vchan_init(&chan->vchan, dma_dev);
+	}
+
+	ret = dmaenginem_async_device_register(dma_dev);
+	if (ret)
+		goto err_clk_disable;
+
+	for (i = 0; i < ddata->dma_channels; i++) {
+		if (chan_reserved & BIT(i))
+			continue;
+
+		ret = platform_get_irq(pdev, i);
+		if (ret < 0)
+			goto err_clk_disable;
+
+		chan = &ddata->chans[i];
+		chan->irq = ret;
+
+		ret = devm_request_irq(&pdev->dev, chan->irq, stm32_dma3_chan_irq, 0,
+				       dev_name(chan2dev(chan)), chan);
+		if (ret) {
+			dev_err_probe(&pdev->dev, ret, "Failed to request channel %s IRQ\n",
+				      dev_name(chan2dev(chan)));
+			goto err_clk_disable;
+		}
+	}
+
+	ret = of_dma_controller_register(np, stm32_dma3_of_xlate, ddata);
+	if (ret) {
+		dev_err_probe(&pdev->dev, ret, "Failed to register controller\n");
+		goto err_clk_disable;
+	}
+
+	verr = readl_relaxed(ddata->base + STM32_DMA3_VERR);
+
+	pm_runtime_set_active(&pdev->dev);
+	pm_runtime_enable(&pdev->dev);
+	pm_runtime_get_noresume(&pdev->dev);
+	pm_runtime_put(&pdev->dev);
+
+	dev_info(&pdev->dev, "STM32 DMA3 registered rev:%lu.%lu\n",
+		 FIELD_GET(VERR_MAJREV, verr), FIELD_GET(VERR_MINREV, verr));
+
+	return 0;
+
+err_clk_disable:
+	clk_disable_unprepare(ddata->clk);
+
+	return ret;
+}
+
+static void stm32_dma3_remove(struct platform_device *pdev)
+{
+	pm_runtime_disable(&pdev->dev);
+}
+
+static int stm32_dma3_runtime_suspend(struct device *dev)
+{
+	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
+
+	clk_disable_unprepare(ddata->clk);
+
+	return 0;
+}
+
+static int stm32_dma3_runtime_resume(struct device *dev)
+{
+	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
+	int ret;
+
+	ret = clk_prepare_enable(ddata->clk);
+	if (ret)
+		dev_err(dev, "Failed to enable clk: %d\n", ret);
+
+	return ret;
+}
+
+static const struct dev_pm_ops stm32_dma3_pm_ops = {
+	SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
+	RUNTIME_PM_OPS(stm32_dma3_runtime_suspend, stm32_dma3_runtime_resume, NULL)
+};
+
+static struct platform_driver stm32_dma3_driver = {
+	.probe = stm32_dma3_probe,
+	.remove_new = stm32_dma3_remove,
+	.driver = {
+		.name = "stm32-dma3",
+		.of_match_table = stm32_dma3_of_match,
+		.pm = pm_ptr(&stm32_dma3_pm_ops),
+	},
+};
+
+static int __init stm32_dma3_init(void)
+{
+	return platform_driver_register(&stm32_dma3_driver);
+}
+
+subsys_initcall(stm32_dma3_init);
+
+MODULE_DESCRIPTION("STM32 DMA3 controller driver");
+MODULE_AUTHOR("Amelie Delaunay <amelie.delaunay@foss.st.com>");
+MODULE_LICENSE("GPL");
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 06/12] dmaengine: stm32-dma3: add DMA_CYCLIC capability
  2024-04-23 12:32 [PATCH 00/12] Introduce STM32 DMA3 support Amelie Delaunay
                   ` (4 preceding siblings ...)
  2024-04-23 12:32 ` [PATCH 05/12] dmaengine: Add STM32 DMA3 support Amelie Delaunay
@ 2024-04-23 12:32 ` Amelie Delaunay
  2024-04-23 12:32 ` [PATCH 07/12] dmaengine: stm32-dma3: add DMA_MEMCPY capability Amelie Delaunay
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 29+ messages in thread
From: Amelie Delaunay @ 2024-04-23 12:32 UTC (permalink / raw)
  To: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue
  Cc: dmaengine, devicetree, linux-stm32, linux-arm-kernel,
	linux-kernel, linux-hardening, Amelie Delaunay

Add the DMA_CYCLIC capability and the related device_prep_dma_cyclic ops,
implemented by stm32_dma3_prep_dma_cyclic(). It reuses the
stm32_dma3_chan_prep_hw() and stm32_dma3_chan_prep_hwdesc() helpers.
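
For context, here is a minimal sketch (not part of this series) of how a
client driver could use this capability through the generic dmaengine API.
The channel, peripheral address, buffer and period sizes are hypothetical:

#include <linux/dmaengine.h>

static int start_cyclic_rx(struct dma_chan *chan, dma_addr_t periph_addr,
			   dma_addr_t buf, size_t buf_len, size_t period_len)
{
	struct dma_slave_config cfg = {
		.direction	= DMA_DEV_TO_MEM,
		.src_addr	= periph_addr,
		.src_addr_width	= DMA_SLAVE_BUSWIDTH_4_BYTES,
		.src_maxburst	= 1,
	};
	struct dma_async_tx_descriptor *desc;
	int ret;

	ret = dmaengine_slave_config(chan, &cfg);
	if (ret)
		return ret;

	/* buf_len must be a multiple of period_len, as checked by the driver */
	desc = dmaengine_prep_dma_cyclic(chan, buf, buf_len, period_len,
					 DMA_DEV_TO_MEM, DMA_PREP_INTERRUPT);
	if (!desc)
		return -EINVAL;

	/* a per-period completion callback could be set on desc here */
	dmaengine_submit(desc);
	dma_async_issue_pending(chan);

	return 0;
}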

Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
---
 drivers/dma/stm32/stm32-dma3.c | 77 ++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)

diff --git a/drivers/dma/stm32/stm32-dma3.c b/drivers/dma/stm32/stm32-dma3.c
index b5493f497d06..3afd9f8da2b6 100644
--- a/drivers/dma/stm32/stm32-dma3.c
+++ b/drivers/dma/stm32/stm32-dma3.c
@@ -1012,6 +1012,81 @@ static struct dma_async_tx_descriptor *stm32_dma3_prep_slave_sg(struct dma_chan
 	return NULL;
 }
 
+static struct dma_async_tx_descriptor *stm32_dma3_prep_dma_cyclic(struct dma_chan *c,
+								  dma_addr_t buf_addr,
+								  size_t buf_len, size_t period_len,
+								  enum dma_transfer_direction dir,
+								  unsigned long flags)
+{
+	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
+	struct stm32_dma3_swdesc *swdesc;
+	dma_addr_t src, dst;
+	u32 count, i, ctr1, ctr2;
+	int ret;
+
+	if (!buf_len || !period_len || period_len > STM32_DMA3_MAX_BLOCK_SIZE) {
+		dev_err(chan2dev(chan), "Invalid buffer/period length\n");
+		return NULL;
+	}
+
+	if (buf_len % period_len) {
+		dev_err(chan2dev(chan), "Buffer length not multiple of period length\n");
+		return NULL;
+	}
+
+	count = buf_len / period_len;
+	swdesc = stm32_dma3_chan_desc_alloc(chan, count);
+	if (!swdesc)
+		return NULL;
+
+	if (dir == DMA_MEM_TO_DEV) {
+		src = buf_addr;
+		dst = chan->dma_config.dst_addr;
+
+		ret = stm32_dma3_chan_prep_hw(chan, DMA_MEM_TO_DEV, &swdesc->ccr, &ctr1, &ctr2,
+					      src, dst, period_len);
+	} else if (dir == DMA_DEV_TO_MEM) {
+		src = chan->dma_config.src_addr;
+		dst = buf_addr;
+
+		ret = stm32_dma3_chan_prep_hw(chan, DMA_DEV_TO_MEM, &swdesc->ccr, &ctr1, &ctr2,
+					      src, dst, period_len);
+	} else {
+		dev_err(chan2dev(chan), "Invalid direction\n");
+		ret = -EINVAL;
+	}
+
+	if (ret)
+		goto err_desc_free;
+
+	for (i = 0; i < count; i++) {
+		if (dir == DMA_MEM_TO_DEV) {
+			src = buf_addr + i * period_len;
+			dst = chan->dma_config.dst_addr;
+		} else { /* (dir == DMA_DEV_TO_MEM || dir == DMA_MEM_TO_MEM) */
+			src = chan->dma_config.src_addr;
+			dst = buf_addr + i * period_len;
+		}
+
+		stm32_dma3_chan_prep_hwdesc(chan, swdesc, i, src, dst, period_len,
+					    ctr1, ctr2, i == (count - 1), true);
+	}
+
+	/* Enable Error interrupts */
+	swdesc->ccr |= CCR_USEIE | CCR_ULEIE | CCR_DTEIE;
+	/* Enable Transfer state interrupts */
+	swdesc->ccr |= CCR_TCIE;
+
+	swdesc->cyclic = true;
+
+	return vchan_tx_prep(&chan->vchan, &swdesc->vdesc, flags);
+
+err_desc_free:
+	stm32_dma3_chan_desc_free(chan, swdesc);
+
+	return NULL;
+}
+
 static void stm32_dma3_caps(struct dma_chan *c, struct dma_slave_caps *caps)
 {
 	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
@@ -1246,6 +1321,7 @@ static int stm32_dma3_probe(struct platform_device *pdev)
 
 	dma_cap_set(DMA_SLAVE, dma_dev->cap_mask);
 	dma_cap_set(DMA_PRIVATE, dma_dev->cap_mask);
+	dma_cap_set(DMA_CYCLIC, dma_dev->cap_mask);
 	dma_dev->dev = &pdev->dev;
 	/*
 	 * This controller supports up to 8-byte buswidth depending on the port used and the
@@ -1268,6 +1344,7 @@ static int stm32_dma3_probe(struct platform_device *pdev)
 	dma_dev->device_alloc_chan_resources = stm32_dma3_alloc_chan_resources;
 	dma_dev->device_free_chan_resources = stm32_dma3_free_chan_resources;
 	dma_dev->device_prep_slave_sg = stm32_dma3_prep_slave_sg;
+	dma_dev->device_prep_dma_cyclic = stm32_dma3_prep_dma_cyclic;
 	dma_dev->device_caps = stm32_dma3_caps;
 	dma_dev->device_config = stm32_dma3_config;
 	dma_dev->device_terminate_all = stm32_dma3_terminate_all;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 07/12] dmaengine: stm32-dma3: add DMA_MEMCPY capability
  2024-04-23 12:32 [PATCH 00/12] Introduce STM32 DMA3 support Amelie Delaunay
                   ` (5 preceding siblings ...)
  2024-04-23 12:32 ` [PATCH 06/12] dmaengine: stm32-dma3: add DMA_CYCLIC capability Amelie Delaunay
@ 2024-04-23 12:32 ` Amelie Delaunay
  2024-04-23 12:32 ` [PATCH 08/12] dmaengine: stm32-dma3: add device_pause and device_resume ops Amelie Delaunay
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 29+ messages in thread
From: Amelie Delaunay @ 2024-04-23 12:32 UTC (permalink / raw)
  To: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue
  Cc: dmaengine, devicetree, linux-stm32, linux-arm-kernel,
	linux-kernel, linux-hardening, Amelie Delaunay

Add the DMA_MEMCPY capability and the related device_prep_dma_memcpy ops,
implemented by stm32_dma3_prep_dma_memcpy(). It reuses the
stm32_dma3_chan_prep_hw() and stm32_dma3_chan_prep_hwdesc() helpers.
As this driver relies on both the device_config and of_xlate ops to
pre-configure a channel for transfer, add a new helper,
stm32_dma3_init_chan_config_for_memcpy(), for the case where the channel is
used without having been pre-configured (through DT and/or
dmaengine_slave_config()).
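
For context, a minimal sketch (not part of this series) of a client issuing
a memory-to-memory copy through the generic dmaengine API. No
dmaengine_slave_config() is done, so the driver falls back to the
stm32_dma3_init_chan_config_for_memcpy() defaults; 'dst', 'src' and 'len'
are hypothetical DMA addresses and size:

#include <linux/dmaengine.h>

static int do_dma_memcpy(dma_addr_t dst, dma_addr_t src, size_t len)
{
	struct dma_async_tx_descriptor *desc;
	struct dma_chan *chan;
	dma_cap_mask_t mask;

	dma_cap_zero(mask);
	dma_cap_set(DMA_MEMCPY, mask);

	chan = dma_request_chan_by_mask(&mask);
	if (IS_ERR(chan))
		return PTR_ERR(chan);

	desc = dmaengine_prep_dma_memcpy(chan, dst, src, len, DMA_PREP_INTERRUPT);
	if (!desc) {
		dma_release_channel(chan);
		return -EINVAL;
	}

	dmaengine_submit(desc);
	dma_async_issue_pending(chan);

	return 0;
}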

Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
---
 drivers/dma/stm32/stm32-dma3.c | 131 ++++++++++++++++++++++++++++++++-
 1 file changed, 130 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/stm32/stm32-dma3.c b/drivers/dma/stm32/stm32-dma3.c
index 3afd9f8da2b6..73e856a5aeab 100644
--- a/drivers/dma/stm32/stm32-dma3.c
+++ b/drivers/dma/stm32/stm32-dma3.c
@@ -222,6 +222,11 @@ enum stm32_dma3_port_data_width {
 #define STM32_DMA3_DT_PFREQ		BIT(9) /* CTR2_PFREQ */
 #define STM32_DMA3_DT_TCEM		GENMASK(13, 12) /* CTR2_TCEM */
 
+/* struct stm32_dma3_chan .config_set bitfield */
+#define STM32_DMA3_CFG_SET_DT		BIT(0)
+#define STM32_DMA3_CFG_SET_DMA		BIT(1)
+#define STM32_DMA3_CFG_SET_BOTH		(STM32_DMA3_CFG_SET_DT | STM32_DMA3_CFG_SET_DMA)
+
 #define STM32_DMA3_MAX_BLOCK_SIZE	ALIGN_DOWN(CBR1_BNDT, 64)
 #define port_is_ahb(maxdw)		({ typeof(maxdw) (_maxdw) = (maxdw); \
 					   ((_maxdw) != DW_INVALID) && ((_maxdw) == DW_32); })
@@ -281,6 +286,7 @@ struct stm32_dma3_chan {
 	bool semaphore_mode;
 	struct stm32_dma3_dt_conf dt_config;
 	struct dma_slave_config dma_config;
+	u8 config_set;
 	struct dma_pool *lli_pool;
 	struct stm32_dma3_swdesc *swdesc;
 	enum ctr2_tcem tcem;
@@ -539,7 +545,7 @@ static int stm32_dma3_chan_prep_hw(struct stm32_dma3_chan *chan, enum dma_transf
 {
 	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
 	struct dma_device dma_device = ddata->dma_dev;
-	u32 sdw, ddw, sbl_max, dbl_max, tcem;
+	u32 sdw, ddw, sbl_max, dbl_max, tcem, init_dw, init_bl_max;
 	u32 _ctr1 = 0, _ctr2 = 0;
 	u32 ch_conf = chan->dt_config.ch_conf;
 	u32 tr_conf = chan->dt_config.tr_conf;
@@ -658,6 +664,49 @@ static int stm32_dma3_chan_prep_hw(struct stm32_dma3_chan *chan, enum dma_transf
 
 		break;
 
+	case DMA_MEM_TO_MEM:
+		/* Set source (memory) data width and burst */
+		init_dw = sdw;
+		init_bl_max = sbl_max;
+		sdw = stm32_dma3_get_max_dw(chan->max_burst, sap_max_dw, len, src_addr);
+		sbl_max = stm32_dma3_get_max_burst(len, sdw, chan->max_burst);
+		if (chan->config_set & STM32_DMA3_CFG_SET_DMA) {
+			sdw = min_t(u32, init_dw, sdw);
+			sbl_max = min_t(u32, init_bl_max,
+					stm32_dma3_get_max_burst(len, sdw, chan->max_burst));
+		}
+
+		/* Set destination (memory) data width and burst */
+		init_dw = ddw;
+		init_bl_max = dbl_max;
+		ddw = stm32_dma3_get_max_dw(chan->max_burst, dap_max_dw, len, dst_addr);
+		dbl_max = stm32_dma3_get_max_burst(len, ddw, chan->max_burst);
+		if (chan->config_set & STM32_DMA3_CFG_SET_DMA) {
+			ddw = min_t(u32, init_dw, ddw);
+			dbl_max = min_t(u32, init_bl_max,
+					stm32_dma3_get_max_burst(len, ddw, chan->max_burst));
+		}
+
+		_ctr1 |= FIELD_PREP(CTR1_SDW_LOG2, ilog2(sdw));
+		_ctr1 |= FIELD_PREP(CTR1_SBL_1, sbl_max - 1);
+		_ctr1 |= FIELD_PREP(CTR1_DDW_LOG2, ilog2(ddw));
+		_ctr1 |= FIELD_PREP(CTR1_DBL_1, dbl_max - 1);
+
+		if (ddw != sdw) {
+			_ctr1 |= FIELD_PREP(CTR1_PAM, CTR1_PAM_PACK_UNPACK);
+			/* Should never reach this case as ddw is clamped down */
+			if (len & (ddw - 1)) {
+				dev_err(chan2dev(chan),
+					"Packing mode is enabled and len is not multiple of ddw");
+				return -EINVAL;
+			}
+		}
+
+		/* CTR2_REQSEL/DREQ/BREQ/PFREQ are ignored with CTR2_SWREQ=1 */
+		_ctr2 |= CTR2_SWREQ;
+
+		break;
+
 	default:
 		dev_err(chan2dev(chan), "Direction %s not supported\n",
 			dmaengine_get_direction_text(dir));
@@ -927,6 +976,82 @@ static void stm32_dma3_free_chan_resources(struct dma_chan *c)
 	/* Reset configuration */
 	memset(&chan->dt_config, 0, sizeof(chan->dt_config));
 	memset(&chan->dma_config, 0, sizeof(chan->dma_config));
+	chan->config_set = 0;
+}
+
+static void stm32_dma3_init_chan_config_for_memcpy(struct stm32_dma3_chan *chan,
+						   dma_addr_t dst, dma_addr_t src)
+{
+	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
+	u32 dw = get_chan_max_dw(ddata->ports_max_dw[0], chan->max_burst); /* port 0 by default */
+	u32 burst = chan->max_burst / dw;
+
+	/* Initialize dt_config if channel not pre-configured through DT */
+	if (!(chan->config_set & STM32_DMA3_CFG_SET_DT)) {
+		chan->dt_config.ch_conf = FIELD_PREP(STM32_DMA3_DT_PRIO, CCR_PRIO_VERY_HIGH);
+		chan->dt_config.ch_conf |= FIELD_PREP(STM32_DMA3_DT_FIFO, chan->fifo_size);
+		chan->dt_config.tr_conf = STM32_DMA3_DT_SINC | STM32_DMA3_DT_DINC;
+		chan->dt_config.tr_conf |= FIELD_PREP(STM32_DMA3_DT_TCEM, CTR2_TCEM_CHANNEL);
+	}
+
+	/* Initialize dma_config if dmaengine_slave_config() not used */
+	if (!(chan->config_set & STM32_DMA3_CFG_SET_DMA)) {
+		chan->dma_config.src_addr_width = dw;
+		chan->dma_config.dst_addr_width = dw;
+		chan->dma_config.src_maxburst = burst;
+		chan->dma_config.dst_maxburst = burst;
+		chan->dma_config.src_addr = src;
+		chan->dma_config.dst_addr = dst;
+	}
+}
+
+static struct dma_async_tx_descriptor *stm32_dma3_prep_dma_memcpy(struct dma_chan *c,
+								  dma_addr_t dst, dma_addr_t src,
+								  size_t len, unsigned long flags)
+{
+	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
+	struct stm32_dma3_swdesc *swdesc;
+	size_t next_size, offset;
+	u32 count, i, ctr1, ctr2;
+
+	count = DIV_ROUND_UP(len, STM32_DMA3_MAX_BLOCK_SIZE);
+
+	swdesc = stm32_dma3_chan_desc_alloc(chan, count);
+	if (!swdesc)
+		return NULL;
+
+	if (chan->config_set != STM32_DMA3_CFG_SET_BOTH)
+		stm32_dma3_init_chan_config_for_memcpy(chan, dst, src);
+
+	for (i = 0, offset = 0; offset < len; i++, offset += next_size) {
+		size_t remaining;
+		int ret;
+
+		remaining = len - offset;
+		next_size = min_t(size_t, remaining, STM32_DMA3_MAX_BLOCK_SIZE);
+
+		ret = stm32_dma3_chan_prep_hw(chan, DMA_MEM_TO_MEM, &swdesc->ccr, &ctr1, &ctr2,
+					      src + offset, dst + offset, next_size);
+		if (ret)
+			goto err_desc_free;
+
+		stm32_dma3_chan_prep_hwdesc(chan, swdesc, i, src + offset, dst + offset, next_size,
+					    ctr1, ctr2, next_size == remaining, false);
+	}
+
+	/* Enable Error interrupts */
+	swdesc->ccr |= CCR_USEIE | CCR_ULEIE | CCR_DTEIE;
+	/* Enable Transfer state interrupts */
+	swdesc->ccr |= CCR_TCIE;
+
+	swdesc->cyclic = false;
+
+	return vchan_tx_prep(&chan->vchan, &swdesc->vdesc, flags);
+
+err_desc_free:
+	stm32_dma3_chan_desc_free(chan, swdesc);
+
+	return NULL;
 }
 
 static struct dma_async_tx_descriptor *stm32_dma3_prep_slave_sg(struct dma_chan *c,
@@ -1110,6 +1235,7 @@ static int stm32_dma3_config(struct dma_chan *c, struct dma_slave_config *config
 	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
 
 	memcpy(&chan->dma_config, config, sizeof(*config));
+	chan->config_set |= STM32_DMA3_CFG_SET_DMA;
 
 	return 0;
 }
@@ -1224,6 +1350,7 @@ static struct dma_chan *stm32_dma3_of_xlate(struct of_phandle_args *dma_spec, st
 
 	chan = to_stm32_dma3_chan(c);
 	chan->dt_config = conf;
+	chan->config_set |= STM32_DMA3_CFG_SET_DT;
 
 	return c;
 }
@@ -1322,6 +1449,7 @@ static int stm32_dma3_probe(struct platform_device *pdev)
 	dma_cap_set(DMA_SLAVE, dma_dev->cap_mask);
 	dma_cap_set(DMA_PRIVATE, dma_dev->cap_mask);
 	dma_cap_set(DMA_CYCLIC, dma_dev->cap_mask);
+	dma_cap_set(DMA_MEMCPY, dma_dev->cap_mask);
 	dma_dev->dev = &pdev->dev;
 	/*
 	 * This controller supports up to 8-byte buswidth depending on the port used and the
@@ -1343,6 +1471,7 @@ static int stm32_dma3_probe(struct platform_device *pdev)
 	dma_dev->residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
 	dma_dev->device_alloc_chan_resources = stm32_dma3_alloc_chan_resources;
 	dma_dev->device_free_chan_resources = stm32_dma3_free_chan_resources;
+	dma_dev->device_prep_dma_memcpy = stm32_dma3_prep_dma_memcpy;
 	dma_dev->device_prep_slave_sg = stm32_dma3_prep_slave_sg;
 	dma_dev->device_prep_dma_cyclic = stm32_dma3_prep_dma_cyclic;
 	dma_dev->device_caps = stm32_dma3_caps;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 08/12] dmaengine: stm32-dma3: add device_pause and device_resume ops
  2024-04-23 12:32 [PATCH 00/12] Introduce STM32 DMA3 support Amelie Delaunay
                   ` (6 preceding siblings ...)
  2024-04-23 12:32 ` [PATCH 07/12] dmaengine: stm32-dma3: add DMA_MEMCPY capability Amelie Delaunay
@ 2024-04-23 12:32 ` Amelie Delaunay
  2024-04-23 12:32 ` [PATCH 09/12] dmaengine: stm32-dma3: improve residue granularity Amelie Delaunay
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 29+ messages in thread
From: Amelie Delaunay @ 2024-04-23 12:32 UTC (permalink / raw)
  To: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue
  Cc: dmaengine, devicetree, linux-stm32, linux-arm-kernel,
	linux-kernel, linux-hardening, Amelie Delaunay

The STM32 DMA3 controller is able to suspend an ongoing transfer (the
transfer is suspended once the ongoing burst has been flushed to the
destination) and resume it from the point where it was suspended, without
reconfiguring any register.
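
For context, a minimal sketch (not part of this series) of how a client
could use these ops through the generic dmaengine API; 'chan' has an
ongoing transfer and do_work_while_paused() is a hypothetical placeholder:

#include <linux/dmaengine.h>

static int pause_then_resume(struct dma_chan *chan)
{
	int ret = dmaengine_pause(chan);

	if (ret)
		return ret;

	/* the transfer is suspended once the ongoing burst has been flushed */
	do_work_while_paused();

	/* resumes from the point of suspension, no reconfiguration needed */
	return dmaengine_resume(chan);
}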

Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
---
 drivers/dma/stm32/stm32-dma3.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/drivers/dma/stm32/stm32-dma3.c b/drivers/dma/stm32/stm32-dma3.c
index 73e856a5aeab..3d827c33150e 100644
--- a/drivers/dma/stm32/stm32-dma3.c
+++ b/drivers/dma/stm32/stm32-dma3.c
@@ -1240,6 +1240,35 @@ static int stm32_dma3_config(struct dma_chan *c, struct dma_slave_config *config
 	return 0;
 }
 
+static int stm32_dma3_pause(struct dma_chan *c)
+{
+	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
+	int ret;
+
+	ret = stm32_dma3_chan_suspend(chan, true);
+	if (ret)
+		return ret;
+
+	chan->dma_status = DMA_PAUSED;
+
+	dev_dbg(chan2dev(chan), "vchan %pK: paused\n", &chan->vchan);
+
+	return 0;
+}
+
+static int stm32_dma3_resume(struct dma_chan *c)
+{
+	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
+
+	stm32_dma3_chan_suspend(chan, false);
+
+	chan->dma_status = DMA_IN_PROGRESS;
+
+	dev_dbg(chan2dev(chan), "vchan %pK: resumed\n", &chan->vchan);
+
+	return 0;
+}
+
 static int stm32_dma3_terminate_all(struct dma_chan *c)
 {
 	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
@@ -1476,6 +1505,8 @@ static int stm32_dma3_probe(struct platform_device *pdev)
 	dma_dev->device_prep_dma_cyclic = stm32_dma3_prep_dma_cyclic;
 	dma_dev->device_caps = stm32_dma3_caps;
 	dma_dev->device_config = stm32_dma3_config;
+	dma_dev->device_pause = stm32_dma3_pause;
+	dma_dev->device_resume = stm32_dma3_resume;
 	dma_dev->device_terminate_all = stm32_dma3_terminate_all;
 	dma_dev->device_synchronize = stm32_dma3_synchronize;
 	dma_dev->device_tx_status = dma_cookie_status;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 09/12] dmaengine: stm32-dma3: improve residue granularity
  2024-04-23 12:32 [PATCH 00/12] Introduce STM32 DMA3 support Amelie Delaunay
                   ` (7 preceding siblings ...)
  2024-04-23 12:32 ` [PATCH 08/12] dmaengine: stm32-dma3: add device_pause and device_resume ops Amelie Delaunay
@ 2024-04-23 12:32 ` Amelie Delaunay
  2024-04-23 12:33 ` [PATCH 10/12] dmaengine: add channel device name to channel registration Amelie Delaunay
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 29+ messages in thread
From: Amelie Delaunay @ 2024-04-23 12:32 UTC (permalink / raw)
  To: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue
  Cc: dmaengine, devicetree, linux-stm32, linux-arm-kernel,
	linux-kernel, linux-hardening, Amelie Delaunay

Implement a dedicated device_tx_status ops to compute the residue with a
finer granularity, down to the byte.
STM32 DMA3 has a bitfield, BNDT, in the CxBR1 register, which reflects the
number of bytes read from the source.
It also has a bitfield, FIFOL, in the CxSR register, which reflects the
FIFO level in units of the programmed destination data width.
The channel is briefly suspended to get a coherent snapshot of the
registers.
The FIFO level can be corrected when packing/unpacking is enabled together
with destination increment, to account for bytes not yet written to the
destination.
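
For context, a minimal sketch (not part of this series) of how a client
could read the residue and in-flight bytes through the generic dmaengine
API; 'chan' and 'cookie' (as returned by dmaengine_submit()) are
hypothetical:

#include <linux/dmaengine.h>

static void show_progress(struct dma_chan *chan, dma_cookie_t cookie)
{
	struct dma_tx_state state;
	enum dma_status status;

	status = dmaengine_tx_status(chan, cookie, &state);
	if (status == DMA_IN_PROGRESS || status == DMA_PAUSED)
		pr_debug("residue=%u bytes, in-flight(FIFO)=%u bytes\n",
			 state.residue, state.in_flight_bytes);
}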

Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
---
 drivers/dma/stm32/stm32-dma3.c | 165 ++++++++++++++++++++++++++++++++-
 1 file changed, 163 insertions(+), 2 deletions(-)

diff --git a/drivers/dma/stm32/stm32-dma3.c b/drivers/dma/stm32/stm32-dma3.c
index 3d827c33150e..f15cc0129fc7 100644
--- a/drivers/dma/stm32/stm32-dma3.c
+++ b/drivers/dma/stm32/stm32-dma3.c
@@ -799,6 +799,134 @@ static void stm32_dma3_chan_reset(struct stm32_dma3_chan *chan)
 	writel_relaxed(ccr |= CCR_RESET, ddata->base + STM32_DMA3_CCR(chan->id));
 }
 
+static int stm32_dma3_chan_get_curr_hwdesc(struct stm32_dma3_swdesc *swdesc, u32 cllr, u32 *residue)
+{
+	u32 i, lli_offset, next_lli_offset = cllr & CLLR_LA;
+
+	/* If cllr is null, it means it is either the last or single item */
+	if (!cllr)
+		return swdesc->lli_size - 1;
+
+	/* In cyclic mode, go fast and first check we are not on the last item */
+	if (swdesc->cyclic && next_lli_offset == (swdesc->lli[0].hwdesc_addr & CLLR_LA))
+		return swdesc->lli_size - 1;
+
+	/* As transfer is in progress, look backward from the last item */
+	for (i = swdesc->lli_size - 1; i > 0; i--) {
+		*residue += FIELD_GET(CBR1_BNDT, swdesc->lli[i].hwdesc->cbr1);
+		lli_offset = swdesc->lli[i].hwdesc_addr & CLLR_LA;
+		if (lli_offset == next_lli_offset)
+			return i - 1;
+	}
+
+	return -EINVAL;
+}
+
+static void stm32_dma3_chan_set_residue(struct stm32_dma3_chan *chan,
+					struct stm32_dma3_swdesc *swdesc,
+					struct dma_tx_state *txstate)
+{
+	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
+	struct device *dev = chan2dev(chan);
+	struct stm32_dma3_hwdesc *hwdesc;
+	u32 residue, curr_lli, csr, cdar, cbr1, cllr, bndt, fifol;
+	bool pack_unpack;
+	int ret;
+
+	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(chan->id));
+	if (!((csr & CSR_TCF) && (csr & CSR_IDLEF)) && chan->dma_status != DMA_PAUSED) {
+		/* Suspend current transfer to read registers for a snapshot */
+		writel_relaxed(swdesc->ccr | CCR_SUSP, ddata->base + STM32_DMA3_CCR(chan->id));
+		ret = readl_relaxed_poll_timeout_atomic(ddata->base + STM32_DMA3_CSR(chan->id), csr,
+							csr & (CSR_SUSPF | CSR_TCF), 1, 10);
+
+		if (ret || ((csr & CSR_TCF) && (csr & CSR_IDLEF))) {
+			writel_relaxed(CFCR_SUSPF, ddata->base + STM32_DMA3_CFCR(chan->id));
+			writel_relaxed(swdesc->ccr, ddata->base + STM32_DMA3_CCR(chan->id));
+			if (ret)
+				dev_err(dev, "Channel suspension timeout, csr=%08x\n", csr);
+		}
+	}
+
+	/* If channel is still active (CSR_IDLEF is not set), can't get a reliable residue */
+	if (!(csr & CSR_IDLEF)) {
+		dev_err(dev, "Can't get residue: channel still active, csr=%08x\n", csr);
+		return;
+	}
+
+	/*
+	 * If channel is not suspended, but Idle and Transfer Complete are set,
+	 * linked-list is over, no residue
+	 */
+	if (!(csr & CSR_SUSPF) && (csr & CSR_TCF) && (csr & CSR_IDLEF))
+		return;
+
+	/* Read registers to have a snapshot */
+	cllr = readl_relaxed(ddata->base + STM32_DMA3_CLLR(chan->id));
+	cbr1 = readl_relaxed(ddata->base + STM32_DMA3_CBR1(chan->id));
+	cdar = readl_relaxed(ddata->base + STM32_DMA3_CDAR(chan->id));
+
+	/* Resume current transfer */
+	writel_relaxed(CFCR_SUSPF, ddata->base + STM32_DMA3_CFCR(chan->id));
+	writel_relaxed(swdesc->ccr, ddata->base + STM32_DMA3_CCR(chan->id));
+
+	/* Add current BNDT */
+	bndt = FIELD_GET(CBR1_BNDT, cbr1);
+	residue = bndt;
+
+	/* Get current hwdesc and cumulate residue of pending hwdesc BNDT */
+	ret = stm32_dma3_chan_get_curr_hwdesc(swdesc, cllr, &residue);
+	if (ret < 0) {
+		dev_err(chan2dev(chan), "Can't get residue: current hwdesc not found\n");
+		return;
+	}
+	curr_lli = ret;
+
+	/* Read current FIFO level - in units of programmed destination data width */
+	hwdesc = swdesc->lli[curr_lli].hwdesc;
+	fifol = FIELD_GET(CSR_FIFOL, csr) * (1 << FIELD_GET(CTR1_DDW_LOG2, hwdesc->ctr1));
+	/* If the FIFO contains as many bytes as its size, it can't contain more */
+	if (fifol == (1 << (chan->fifo_size + 1)))
+		goto skip_fifol_update;
+
+	/*
+	 * In case of PACKING (Destination burst length > Source burst length) or UNPACKING
+	 * (Source burst length > Destination burst length), bytes could be pending in the FIFO
+	 * (to be packed up to Destination burst length or unpacked into Destination burst length
+	 * chunks).
+	 * BNDT is not reliable, as it reflects the number of bytes read from the source but not the
+	 * number of bytes written to the destination.
+	 * FIFOL is also not sufficient, because it reflects the number of available write beats in
+	 * units of Destination data width but not the bytes not yet packed or unpacked.
+	 * In case of Destination increment DINC, it is possible to compute the number of bytes in
+	 * the FIFO:
+	 * fifol_in_bytes = bytes_read - bytes_written.
+	 */
+	pack_unpack = !!(FIELD_GET(CTR1_PAM, hwdesc->ctr1) == CTR1_PAM_PACK_UNPACK);
+	if (pack_unpack && (hwdesc->ctr1 & CTR1_DINC)) {
+		int bytes_read = FIELD_GET(CBR1_BNDT, hwdesc->cbr1) - bndt;
+		int bytes_written = cdar - hwdesc->cdar;
+
+		if (bytes_read > 0)
+			fifol = bytes_read - bytes_written;
+	}
+
+skip_fifol_update:
+	if (fifol) {
+		dev_dbg(chan2dev(chan), "%u byte(s) in the FIFO\n", fifol);
+		dma_set_in_flight_bytes(txstate, fifol);
+		/*
+		 * Residue is already accurate for DMA_MEM_TO_DEV as BNDT reflects data read from
+		 * the source memory buffer, so just need to add fifol to residue in case of
+		 * DMA_DEV_TO_MEM transfer because these bytes are not yet written in destination
+		 * memory buffer.
+		 */
+		if (chan->dma_config.direction == DMA_DEV_TO_MEM)
+			residue += fifol;
+	}
+	dma_set_residue(txstate, residue);
+}
+
 static int stm32_dma3_chan_stop(struct stm32_dma3_chan *chan)
 {
 	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
@@ -1301,6 +1429,39 @@ static void stm32_dma3_synchronize(struct dma_chan *c)
 	vchan_synchronize(&chan->vchan);
 }
 
+static enum dma_status stm32_dma3_tx_status(struct dma_chan *c, dma_cookie_t cookie,
+					    struct dma_tx_state *txstate)
+{
+	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
+	struct stm32_dma3_swdesc *swdesc = NULL;
+	enum dma_status status;
+	unsigned long flags;
+	struct virt_dma_desc *vd;
+
+	status = dma_cookie_status(c, cookie, txstate);
+	if (status == DMA_COMPLETE)
+		return status;
+
+	if (!txstate)
+		return chan->dma_status;
+
+	spin_lock_irqsave(&chan->vchan.lock, flags);
+
+	vd = vchan_find_desc(&chan->vchan, cookie);
+	if (vd)
+		swdesc = to_stm32_dma3_swdesc(vd);
+	else if (chan->swdesc && chan->swdesc->vdesc.tx.cookie == cookie)
+		swdesc = chan->swdesc;
+
+	/* Get residue/in_flight_bytes only if a transfer is currently running (swdesc != NULL) */
+	if (swdesc)
+		stm32_dma3_chan_set_residue(chan, swdesc, txstate);
+
+	spin_unlock_irqrestore(&chan->vchan.lock, flags);
+
+	return chan->dma_status;
+}
+
 static void stm32_dma3_issue_pending(struct dma_chan *c)
 {
 	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
@@ -1497,7 +1658,7 @@ static int stm32_dma3_probe(struct platform_device *pdev)
 
 	dma_dev->descriptor_reuse = true;
 	dma_dev->max_sg_burst = STM32_DMA3_MAX_SEG_SIZE;
-	dma_dev->residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
+	dma_dev->residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
 	dma_dev->device_alloc_chan_resources = stm32_dma3_alloc_chan_resources;
 	dma_dev->device_free_chan_resources = stm32_dma3_free_chan_resources;
 	dma_dev->device_prep_dma_memcpy = stm32_dma3_prep_dma_memcpy;
@@ -1509,7 +1670,7 @@ static int stm32_dma3_probe(struct platform_device *pdev)
 	dma_dev->device_resume = stm32_dma3_resume;
 	dma_dev->device_terminate_all = stm32_dma3_terminate_all;
 	dma_dev->device_synchronize = stm32_dma3_synchronize;
-	dma_dev->device_tx_status = dma_cookie_status;
+	dma_dev->device_tx_status = stm32_dma3_tx_status;
 	dma_dev->device_issue_pending = stm32_dma3_issue_pending;
 
 	/* if dma_channels is not modified, get it from hwcfgr1 */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 10/12] dmaengine: add channel device name to channel registration
  2024-04-23 12:32 [PATCH 00/12] Introduce STM32 DMA3 support Amelie Delaunay
                   ` (8 preceding siblings ...)
  2024-04-23 12:32 ` [PATCH 09/12] dmaengine: stm32-dma3: improve residue granularity Amelie Delaunay
@ 2024-04-23 12:33 ` Amelie Delaunay
  2024-04-23 12:33 ` [PATCH 11/12] dmaengine: stm32-dma3: defer channel registration to specify channel name Amelie Delaunay
  2024-04-23 12:33 ` [PATCH 12/12] arm64: dts: st: add HPDMA nodes on stm32mp251 Amelie Delaunay
  11 siblings, 0 replies; 29+ messages in thread
From: Amelie Delaunay @ 2024-04-23 12:33 UTC (permalink / raw)
  To: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue
  Cc: dmaengine, devicetree, linux-stm32, linux-arm-kernel,
	linux-kernel, linux-hardening, Amelie Delaunay

The channel device name is used for sysfs, but also by the dmatest filter
function.

With dynamic channel registration, channels can be registered after the DMA
controller registration, and users may want to give them specific names.

If name is NULL, the channel name falls back to the previous scheme,
dma<controller_device_id>chan<channel_device_id>.
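
For context, a minimal sketch (not part of this series) showing the new
signature from a hypothetical controller driver; the channel pointers and
the "dma0chan12" name are illustrative:

#include <linux/dmaengine.h>

static int register_late_channels(struct dma_device *dma_dev,
				  struct dma_chan *named_chan,
				  struct dma_chan *default_chan)
{
	int ret;

	/* explicit name, e.g. to match a hardware channel number */
	ret = dma_async_device_channel_register(dma_dev, named_chan, "dma0chan12");
	if (ret)
		return ret;

	/* NULL keeps the previous dma<dev_id>chan<chan_id> naming */
	return dma_async_device_channel_register(dma_dev, default_chan, NULL);
}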

Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
---
 drivers/dma/dmaengine.c   | 16 ++++++++++------
 drivers/dma/idxd/dma.c    |  2 +-
 include/linux/dmaengine.h |  3 ++-
 3 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index 491b22240221..c380a4dda77a 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -1037,7 +1037,8 @@ static int get_dma_id(struct dma_device *device)
 }
 
 static int __dma_async_device_channel_register(struct dma_device *device,
-					       struct dma_chan *chan)
+					       struct dma_chan *chan,
+					       const char *name)
 {
 	int rc;
 
@@ -1066,8 +1067,10 @@ static int __dma_async_device_channel_register(struct dma_device *device,
 	chan->dev->device.parent = device->dev;
 	chan->dev->chan = chan;
 	chan->dev->dev_id = device->dev_id;
-	dev_set_name(&chan->dev->device, "dma%dchan%d",
-		     device->dev_id, chan->chan_id);
+	if (!name)
+		dev_set_name(&chan->dev->device, "dma%dchan%d", device->dev_id, chan->chan_id);
+	else
+		dev_set_name(&chan->dev->device, name);
 	rc = device_register(&chan->dev->device);
 	if (rc)
 		goto err_out_ida;
@@ -1087,11 +1090,12 @@ static int __dma_async_device_channel_register(struct dma_device *device,
 }
 
 int dma_async_device_channel_register(struct dma_device *device,
-				      struct dma_chan *chan)
+				      struct dma_chan *chan,
+				      const char *name)
 {
 	int rc;
 
-	rc = __dma_async_device_channel_register(device, chan);
+	rc = __dma_async_device_channel_register(device, chan, name);
 	if (rc < 0)
 		return rc;
 
@@ -1203,7 +1207,7 @@ int dma_async_device_register(struct dma_device *device)
 
 	/* represent channels in sysfs. Probably want devs too */
 	list_for_each_entry(chan, &device->channels, device_node) {
-		rc = __dma_async_device_channel_register(device, chan);
+		rc = __dma_async_device_channel_register(device, chan, NULL);
 		if (rc < 0)
 			goto err_out;
 	}
diff --git a/drivers/dma/idxd/dma.c b/drivers/dma/idxd/dma.c
index cd835eabd31b..dbecd699237e 100644
--- a/drivers/dma/idxd/dma.c
+++ b/drivers/dma/idxd/dma.c
@@ -269,7 +269,7 @@ static int idxd_register_dma_channel(struct idxd_wq *wq)
 		desc->txd.tx_submit = idxd_dma_tx_submit;
 	}
 
-	rc = dma_async_device_channel_register(dma, chan);
+	rc = dma_async_device_channel_register(dma, chan, NULL);
 	if (rc < 0) {
 		kfree(idxd_chan);
 		return rc;
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 752dbde4cec1..73537fddbb52 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -1575,7 +1575,8 @@ int dma_async_device_register(struct dma_device *device);
 int dmaenginem_async_device_register(struct dma_device *device);
 void dma_async_device_unregister(struct dma_device *device);
 int dma_async_device_channel_register(struct dma_device *device,
-				      struct dma_chan *chan);
+				      struct dma_chan *chan,
+				      const char *name);
 void dma_async_device_channel_unregister(struct dma_device *device,
 					 struct dma_chan *chan);
 void dma_run_dependencies(struct dma_async_tx_descriptor *tx);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 11/12] dmaengine: stm32-dma3: defer channel registration to specify channel name
  2024-04-23 12:32 [PATCH 00/12] Introduce STM32 DMA3 support Amelie Delaunay
                   ` (9 preceding siblings ...)
  2024-04-23 12:33 ` [PATCH 10/12] dmaengine: add channel device name to channel registration Amelie Delaunay
@ 2024-04-23 12:33 ` Amelie Delaunay
  2024-04-23 12:33 ` [PATCH 12/12] arm64: dts: st: add HPDMA nodes on stm32mp251 Amelie Delaunay
  11 siblings, 0 replies; 29+ messages in thread
From: Amelie Delaunay @ 2024-04-23 12:33 UTC (permalink / raw)
  To: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue
  Cc: dmaengine, devicetree, linux-stm32, linux-arm-kernel,
	linux-kernel, linux-hardening, Amelie Delaunay

On STM32 DMA3, channels can be reserved, making them unavailable for
Linux. This unavailability creates a mismatch between the dma_chan id and
the DMA3 channel id.

Use dma_async_device_channel_register() to register the channels after the
controller registration and override the default channel name, so that it
matches the name in the Reference Manual and makes it easier to request a
channel by name.
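
For context, a minimal sketch (not part of this series) of naming a
late-registered channel after its hardware channel id, as done below: if
hardware channels 0 and 1 are reserved, the first Linux-visible channel is
hardware channel 2; its chan_id would be 0 ("dma0chan0" by default), while
naming it from the hardware id gives "dma0chan2", matching the Reference
Manual. The helper and its parameters are hypothetical:

#include <linux/dmaengine.h>

static int register_hw_named_chan(struct dma_device *dma_dev,
				  struct dma_chan *chan, unsigned int hw_id)
{
	char name[12];

	snprintf(name, sizeof(name), "dma%dchan%u", dma_dev->dev_id, hw_id);

	return dma_async_device_channel_register(dma_dev, chan, name);
}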

Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
---
 drivers/dma/stm32/stm32-dma3.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/dma/stm32/stm32-dma3.c b/drivers/dma/stm32/stm32-dma3.c
index f15cc0129fc7..b6d8afd5ed34 100644
--- a/drivers/dma/stm32/stm32-dma3.c
+++ b/drivers/dma/stm32/stm32-dma3.c
@@ -1723,9 +1723,6 @@ static int stm32_dma3_probe(struct platform_device *pdev)
 		chan->fifo_size = get_chan_hwcfg(i, G_FIFO_SIZE(i), hwcfgr);
 		/* If chan->fifo_size > 0 then half of the fifo size, else no burst when no FIFO */
 		chan->max_burst = (chan->fifo_size) ? (1 << (chan->fifo_size + 1)) / 2 : 0;
-		chan->vchan.desc_free = stm32_dma3_chan_vdesc_free;
-
-		vchan_init(&chan->vchan, dma_dev);
 	}
 
 	ret = dmaenginem_async_device_register(dma_dev);
@@ -1733,14 +1730,26 @@ static int stm32_dma3_probe(struct platform_device *pdev)
 		goto err_clk_disable;
 
 	for (i = 0; i < ddata->dma_channels; i++) {
+		char name[12];
+
 		if (chan_reserved & BIT(i))
 			continue;
 
+		chan = &ddata->chans[i];
+		snprintf(name, sizeof(name), "dma%dchan%d", ddata->dma_dev.dev_id, chan->id);
+
+		chan->vchan.desc_free = stm32_dma3_chan_vdesc_free;
+		vchan_init(&chan->vchan, dma_dev);
+
+		ret = dma_async_device_channel_register(&ddata->dma_dev, &chan->vchan.chan, name);
+		if (ret) {
+			dev_err_probe(&pdev->dev, ret, "Failed to register channel %s\n", name);
+			goto err_clk_disable;
+		}
+
 		ret = platform_get_irq(pdev, i);
 		if (ret < 0)
 			goto err_clk_disable;
-
-		chan = &ddata->chans[i];
 		chan->irq = ret;
 
 		ret = devm_request_irq(&pdev->dev, chan->irq, stm32_dma3_chan_irq, 0,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 12/12] arm64: dts: st: add HPDMA nodes on stm32mp251
  2024-04-23 12:32 [PATCH 00/12] Introduce STM32 DMA3 support Amelie Delaunay
                   ` (10 preceding siblings ...)
  2024-04-23 12:33 ` [PATCH 11/12] dmaengine: stm32-dma3: defer channel registration to specify channel name Amelie Delaunay
@ 2024-04-23 12:33 ` Amelie Delaunay
  11 siblings, 0 replies; 29+ messages in thread
From: Amelie Delaunay @ 2024-04-23 12:33 UTC (permalink / raw)
  To: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue
  Cc: dmaengine, devicetree, linux-stm32, linux-arm-kernel,
	linux-kernel, linux-hardening, Amelie Delaunay

The High Performance Direct Memory Access (HPDMA) controller is used to
perform programmable data transfers between memory-mapped peripherals and
memories (or between memories) via linked lists.

There are 3 instances of HPDMA on stm32mp251, using the stm32-dma3 driver,
with 16 channels per instance and one interrupt per channel.
Channels 0 to 7 are implemented with a FIFO of 8 bytes.
Channels 8 to 11 are implemented with a FIFO of 32 bytes.
Channels 12 to 15 are implemented with a FIFO of 128 bytes.
Thanks to the stm32-dma3 bindings, the user can ask for a channel with a
specific FIFO size.
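
For context, a minimal sketch (not part of this series) of how the FIFO
size translates into the maximum burst in the stm32-dma3 driver (patch 05),
with the encoding used there (FIFO capacity of 1 << (fifo_size + 1) bytes,
maximum burst of half the FIFO): the 8-, 32- and 128-byte FIFO channels
allow bursts of up to 4, 16 and 64 bytes respectively.

#include <linux/types.h>

/* fifo_size is the encoded G_FIFO_SIZE value read from HWCFGR3/HWCFGR4 */
static u32 hpdma_max_burst_bytes(u32 fifo_size)
{
	/* half of the FIFO capacity, 0 (no burst) when there is no FIFO */
	return fifo_size ? (1 << (fifo_size + 1)) / 2 : 0;
}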

Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
---
 arch/arm64/boot/dts/st/stm32mp251.dtsi | 69 ++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/arch/arm64/boot/dts/st/stm32mp251.dtsi b/arch/arm64/boot/dts/st/stm32mp251.dtsi
index 5dd4f3580a60..0b80d23fbb54 100644
--- a/arch/arm64/boot/dts/st/stm32mp251.dtsi
+++ b/arch/arm64/boot/dts/st/stm32mp251.dtsi
@@ -123,6 +123,75 @@ soc@0 {
 		interrupt-parent = <&intc>;
 		ranges = <0x0 0x0 0x0 0x80000000>;
 
+		hpdma: dma-controller@40400000 {
+			compatible = "st,stm32-dma3";
+			reg = <0x40400000 0x1000>;
+			interrupts = <GIC_SPI 33 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 34 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 35 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 36 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 37 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 38 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 39 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 40 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 41 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 42 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 43 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 44 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 45 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 46 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 47 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 48 IRQ_TYPE_LEVEL_HIGH>;
+			clocks = <&ck_icn_ls_mcu>;
+			#dma-cells = <3>;
+		};
+
+		hpdma2: dma-controller@40410000 {
+			compatible = "st,stm32-dma3";
+			reg = <0x40410000 0x1000>;
+			interrupts = <GIC_SPI 49 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 50 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 51 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 52 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 53 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 54 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 55 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 56 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 57 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 58 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 59 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 60 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 61 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 62 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 63 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 64 IRQ_TYPE_LEVEL_HIGH>;
+			clocks = <&ck_icn_ls_mcu>;
+			#dma-cells = <3>;
+		};
+
+		hpdma3: dma-controller@40420000 {
+			compatible = "st,stm32-dma3";
+			reg = <0x40420000 0x1000>;
+			interrupts = <GIC_SPI 65 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 66 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 67 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 68 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 69 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 70 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 71 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 72 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 73 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 74 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 75 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 76 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 77 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 78 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 79 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 80 IRQ_TYPE_LEVEL_HIGH>;
+			clocks = <&ck_icn_ls_mcu>;
+			#dma-cells = <3>;
+		};
+
 		rifsc: rifsc-bus@42080000 {
 			compatible = "simple-bus";
 			reg = <0x42080000 0x1000>;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 01/12] dt-bindings: dma: New directory for STM32 DMA controllers bindings
  2024-04-23 12:32 ` [PATCH 01/12] dt-bindings: dma: New directory for STM32 DMA controllers bindings Amelie Delaunay
@ 2024-04-23 13:50   ` Rob Herring
  2024-04-23 14:46     ` Amelie Delaunay
  0 siblings, 1 reply; 29+ messages in thread
From: Rob Herring @ 2024-04-23 13:50 UTC (permalink / raw)
  To: Amelie Delaunay
  Cc: Maxime Coquelin, dmaengine, linux-kernel, linux-stm32,
	Vinod Koul, linux-hardening, Alexandre Torgue,
	Krzysztof Kozlowski, Conor Dooley, devicetree, Rob Herring,
	linux-arm-kernel


On Tue, 23 Apr 2024 14:32:51 +0200, Amelie Delaunay wrote:
> Gather the STM32 DMA controllers bindings under ./dma/stm32/
> 
> Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
> ---
>  .../devicetree/bindings/dma/{ => stm32}/st,stm32-dma.yaml     | 4 ++--
>  .../devicetree/bindings/dma/{ => stm32}/st,stm32-dmamux.yaml  | 4 ++--
>  .../devicetree/bindings/dma/{ => stm32}/st,stm32-mdma.yaml    | 4 ++--
>  3 files changed, 6 insertions(+), 6 deletions(-)
>  rename Documentation/devicetree/bindings/dma/{ => stm32}/st,stm32-dma.yaml (97%)
>  rename Documentation/devicetree/bindings/dma/{ => stm32}/st,stm32-dmamux.yaml (89%)
>  rename Documentation/devicetree/bindings/dma/{ => stm32}/st,stm32-mdma.yaml (96%)
> 

My bot found errors running 'make dt_binding_check' on your patch:

yamllint warnings/errors:

dtschema/dtc warnings/errors:


doc reference errors (make refcheckdocs):
Warning: Documentation/devicetree/bindings/spi/st,stm32-spi.yaml references a file that doesn't exist: Documentation/devicetree/bindings/dma/st,stm32-dma.yaml
Documentation/devicetree/bindings/spi/st,stm32-spi.yaml: Documentation/devicetree/bindings/dma/st,stm32-dma.yaml

See https://patchwork.ozlabs.org/project/devicetree-bindings/patch/20240423123302.1550592-2-amelie.delaunay@foss.st.com

The base for the series is generally the latest rc1. A different dependency
should be noted in *this* patch.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit after running the above command yourself. Note
that DT_SCHEMA_FILES can be set to your schema file to speed up checking
your schema. However, it must be unset to test all examples with your schema.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 01/12] dt-bindings: dma: New directory for STM32 DMA controllers bindings
  2024-04-23 13:50   ` Rob Herring
@ 2024-04-23 14:46     ` Amelie Delaunay
  0 siblings, 0 replies; 29+ messages in thread
From: Amelie Delaunay @ 2024-04-23 14:46 UTC (permalink / raw)
  To: Rob Herring
  Cc: Maxime Coquelin, dmaengine, linux-kernel, linux-stm32,
	Vinod Koul, linux-hardening, Alexandre Torgue,
	Krzysztof Kozlowski, Conor Dooley, devicetree, Rob Herring,
	linux-arm-kernel

Hi Rob,

On 4/23/24 15:50, Rob Herring wrote:
> 
> On Tue, 23 Apr 2024 14:32:51 +0200, Amelie Delaunay wrote:
>> Gather the STM32 DMA controllers bindings under ./dma/stm32/
>>
>> Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
>> ---
>>   .../devicetree/bindings/dma/{ => stm32}/st,stm32-dma.yaml     | 4 ++--
>>   .../devicetree/bindings/dma/{ => stm32}/st,stm32-dmamux.yaml  | 4 ++--
>>   .../devicetree/bindings/dma/{ => stm32}/st,stm32-mdma.yaml    | 4 ++--
>>   3 files changed, 6 insertions(+), 6 deletions(-)
>>   rename Documentation/devicetree/bindings/dma/{ => stm32}/st,stm32-dma.yaml (97%)
>>   rename Documentation/devicetree/bindings/dma/{ => stm32}/st,stm32-dmamux.yaml (89%)
>>   rename Documentation/devicetree/bindings/dma/{ => stm32}/st,stm32-mdma.yaml (96%)
>>
> 
> My bot found errors running 'make dt_binding_check' on your patch:
> 
> yamllint warnings/errors:
> 
> dtschema/dtc warnings/errors:
> 
> 
> doc reference errors (make refcheckdocs):
> Warning: Documentation/devicetree/bindings/spi/st,stm32-spi.yaml references a file that doesn't exist: Documentation/devicetree/bindings/dma/st,stm32-dma.yaml
> Documentation/devicetree/bindings/spi/st,stm32-spi.yaml: Documentation/devicetree/bindings/dma/st,stm32-dma.yaml
> 
> See https://patchwork.ozlabs.org/project/devicetree-bindings/patch/20240423123302.1550592-2-amelie.delaunay@foss.st.com
> 
> The base for the series is generally the latest rc1. A different dependency
> should be noted in *this* patch.
> 
> If you already ran 'make dt_binding_check' and didn't see the above
> error(s), then make sure 'yamllint' is installed and dt-schema is up to
> date:
> 
> pip3 install dtschema --upgrade
> 
> Please check and re-submit after running the above command yourself. Note
> that DT_SCHEMA_FILES can be set to your schema file to speed up checking
> your schema. However, it must be unset to test all examples with your schema.
> 

Indeed. I'll wait for reviews of the whole series before sending a v2 
fixing this warning.

Regards,
Amelie

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 04/12] dt-bindings: dma: Document STM32 DMA3 controller bindings
  2024-04-23 12:32 ` [PATCH 04/12] dt-bindings: dma: Document STM32 DMA3 controller bindings Amelie Delaunay
@ 2024-04-23 15:22   ` Rob Herring
  0 siblings, 0 replies; 29+ messages in thread
From: Rob Herring @ 2024-04-23 15:22 UTC (permalink / raw)
  To: Amelie Delaunay
  Cc: Vinod Koul, Krzysztof Kozlowski, Conor Dooley, Maxime Coquelin,
	Alexandre Torgue, dmaengine, devicetree, linux-stm32,
	linux-arm-kernel, linux-kernel, linux-hardening

On Tue, Apr 23, 2024 at 02:32:54PM +0200, Amelie Delaunay wrote:
> The STM32 DMA3 is a Direct Memory Access controller with different features
> depending on its hardware configuration.
> The channels do not all have the same capabilities: some have a larger
> FIFO, so their performance is higher.
> This patch describes STM32 DMA3 bindings, used to select a channel that
> fits client requirements, and to pre-configure the channel depending on
> the client needs.
> 
> Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
> ---
>  .../bindings/dma/stm32/st,stm32-dma3.yaml     | 125 ++++++++++++++++++
>  1 file changed, 125 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/dma/stm32/st,stm32-dma3.yaml
> 
> diff --git a/Documentation/devicetree/bindings/dma/stm32/st,stm32-dma3.yaml b/Documentation/devicetree/bindings/dma/stm32/st,stm32-dma3.yaml
> new file mode 100644
> index 000000000000..ea4f8f6add3c
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/dma/stm32/st,stm32-dma3.yaml
> @@ -0,0 +1,125 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/dma/stm32/st,stm32-dma3.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: STMicroelectronics STM32 DMA3 Controller
> +
> +description: |
> +  The STM32 DMA3 is a direct memory access controller with different features
> +  depending on its hardware configuration.
> +  It is either called LPDMA (Low Power), GPDMA (General Purpose) or
> +  HPDMA (High Performance).
> +  Its hardware configuration registers allow its features to be discovered dynamically.
> +
> +  GPDMA and HPDMA support 16 independent DMA channels, while LPDMA supports only 4.
> +  GPDMA and HPDMA support 256 DMA requests from peripherals, versus 8 for LPDMA.
> +
> +  Bindings are generic for these 3 STM32 DMA3 configurations.
> +
> +  DMA clients connected to the STM32 DMA3 controller must use the format described
> +  in the ../dma.txt file, using a four-cell specifier for each channel.
> +  A phandle to the DMA controller plus the following three integer cells:

This description should be part of #dma-cells.
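
A rough sketch of what that could look like (wording purely illustrative):

  "#dma-cells":
    const: 3
    description: |
      The three cells are the DMA request line number, a 32-bit mask
      describing the channel requirements and a 32-bit mask describing
      the transfer requirements, encoded as detailed below.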

> +    1. The request line number
> +    2. A 32-bit mask specifying the DMA channel requirements
> +      -bit 0-1: The priority level
> +        0x0: low priority, low weight
> +        0x1: low priority, mid weight
> +        0x2: low priority, high weight
> +        0x3: high priority
> +      -bit 4-7: The FIFO requirement for queuing source and destination transfers
> +        0x0: no FIFO requirement/any channel can fit
> +        0x2: FIFO of 8 bytes (2^(2+1))
> +        0x4: FIFO of 32 bytes (2^(4+1))
> +        0x6: FIFO of 128 bytes (2^(6+1))
> +        0x7: FIFO of 256 bytes (2^(7+1))
> +    3. A 32-bit mask specifying the DMA transfer requirements
> +      -bit 0: The source incrementing burst
> +        0x0: fixed burst
> +        0x1: contiguously incremented burst
> +      -bit 1: The source allocated port
> +        0x0: port 0 is allocated to the source transfer
> +        0x1: port 1 is allocated to the source transfer
> +      -bit 4: The destination incrementing burst
> +        0x0: fixed burst
> +        0x1: contiguously incremented burst
> +      -bit 5: The destination allocated port
> +        0x0: port 0 is allocated to the destination transfer
> +        0x1: port 1 is allocated to the destination transfer
> +      -bit 8: The type of hardware request
> +        0x0: burst
> +        0x1: block
> +      -bit 9: The control mode
> +        0x0: DMA controller control mode
> +        0x1: peripheral control mode
> +      -bit 12-13: The transfer complete event mode
> +        0x0: at block level, transfer complete event is generated at the end of a block
> +        0x2: at LLI level, the transfer complete event is generated at the end of the LLI transfer,
> +             including the update of the LLI if any
> +        0x3: at channel level, the transfer complete event is generated at the end of the last LLI
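
To make the encoding above concrete, a hypothetical client node could look
like this (labels, request line numbers and mask values are made up for
illustration only):

    serial@deadbeef {
      /* ... */
      dmas = <&hpdma 4 0x42 0x3010>, /* rx: 32-byte FIFO, low prio/high weight, DINC, TC at channel level */
             <&hpdma 5 0x42 0x3001>; /* tx: same channel requirements, SINC instead of DINC */
      dma-names = "rx", "tx";
    };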
> +
> +maintainers:
> +  - Amelie Delaunay <amelie.delaunay@foss.st.com>
> +
> +allOf:
> +  - $ref: /schemas/dma/dma-controller.yaml#
> +
> +properties:
> +  "#dma-cells":
> +    const: 3
> +
> +  compatible:
> +    const: st,stm32-dma3

SoC specific compatible needed.
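
I.e. something along these lines (the compatible string below is only
illustrative, the actual name is up to the platform):

  compatible:
    const: st,stm32mp25-dma3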

> +
> +  reg:
> +    maxItems: 1
> +
> +  clocks:
> +    maxItems: 1
> +
> +  resets:
> +    maxItems: 1
> +
> +  interrupts:
> +    minItems: 4
> +    maxItems: 16

I assume this is 1 interrupt per channel? But I shouldn't have to
assume. Either you need to list every interrupt out, or you can keep this
and add some description if they are all the same type of interrupt (e.g. per
channel).
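
If these are indeed one interrupt per channel, a short description along
these lines (wording illustrative) would probably be enough:

  interrupts:
    minItems: 4
    maxItems: 16
    description: One interrupt per DMA channel, in channel order.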

> +
> +  power-domains:
> +    maxItems: 1
> +
> +required:
> +  - compatible
> +  - reg
> +  - clocks
> +  - interrupts
> +
> +unevaluatedProperties: false
> +
> +examples:
> +  - |
> +    #include <dt-bindings/interrupt-controller/arm-gic.h>
> +    #include <dt-bindings/clock/st,stm32mp25-rcc.h>
> +    dma-controller@40400000 {
> +      compatible = "st,stm32-dma3";
> +      reg = <0x40400000 0x1000>;
> +      interrupts = <GIC_SPI 33 IRQ_TYPE_LEVEL_HIGH>,
> +                   <GIC_SPI 34 IRQ_TYPE_LEVEL_HIGH>,
> +                   <GIC_SPI 35 IRQ_TYPE_LEVEL_HIGH>,
> +                   <GIC_SPI 36 IRQ_TYPE_LEVEL_HIGH>,
> +                   <GIC_SPI 37 IRQ_TYPE_LEVEL_HIGH>,
> +                   <GIC_SPI 38 IRQ_TYPE_LEVEL_HIGH>,
> +                   <GIC_SPI 39 IRQ_TYPE_LEVEL_HIGH>,
> +                   <GIC_SPI 40 IRQ_TYPE_LEVEL_HIGH>,
> +                   <GIC_SPI 41 IRQ_TYPE_LEVEL_HIGH>,
> +                   <GIC_SPI 42 IRQ_TYPE_LEVEL_HIGH>,
> +                   <GIC_SPI 43 IRQ_TYPE_LEVEL_HIGH>,
> +                   <GIC_SPI 44 IRQ_TYPE_LEVEL_HIGH>,
> +                   <GIC_SPI 45 IRQ_TYPE_LEVEL_HIGH>,
> +                   <GIC_SPI 46 IRQ_TYPE_LEVEL_HIGH>,
> +                   <GIC_SPI 47 IRQ_TYPE_LEVEL_HIGH>,
> +                   <GIC_SPI 48 IRQ_TYPE_LEVEL_HIGH>;
> +      clocks = <&rcc CK_BUS_HPDMA1>;
> +      #dma-cells = <3>;
> +    };
> +...
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 05/12] dmaengine: Add STM32 DMA3 support
  2024-04-23 12:32 ` [PATCH 05/12] dmaengine: Add STM32 DMA3 support Amelie Delaunay
@ 2024-05-04 12:40   ` Vinod Koul
  2024-05-07 11:33     ` Amelie Delaunay
  2024-05-04 14:27   ` Christophe JAILLET
  2024-05-15 18:56   ` Frank Li
  2 siblings, 1 reply; 29+ messages in thread
From: Vinod Koul @ 2024-05-04 12:40 UTC (permalink / raw)
  To: Amelie Delaunay
  Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Maxime Coquelin,
	Alexandre Torgue, dmaengine, devicetree, linux-stm32,
	linux-arm-kernel, linux-kernel, linux-hardening

On 23-04-24, 14:32, Amelie Delaunay wrote:
> STM32 DMA3 driver supports the 3 hardware configurations of the STM32 DMA3
> controller:
> - LPDMA (Low Power): 4 channels, no FIFO
> - GPDMA (General Purpose): 16 channels, FIFO from 8 to 32 bytes
> - HPDMA (High Performance): 16 channels, FIFO from 8 to 256 bytes
> Hardware configuration of the channels is retrieved from the hardware
> configuration registers.
> The client can specify its channel requirements through device tree.
> STM32 DMA3 channels can be individually reserved either because they are
> secure, or dedicated to another CPU.
> Indeed, channel availability depends on Resource Isolation Framework
> (RIF) configuration. RIF grants access to buses with Compartiment ID

Compartiment? typo...?

> (CIF) filtering, secure and privilege level. It also assigns DMA channels
> to one or several processors.
> DMA channels used by Linux should be CID-filtered and statically assigned
> to CID1 or shared with other CPUs but using a semaphore. In case CID
> filtering is not configured, dma-channel-mask property can be used to
> specify available DMA channels to the kernel, otherwise such channels
> will be marked as reserved and can't be used by Linux.
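
For reference, dma-channel-mask is the generic property defined in
dma-common.yaml. A hypothetical controller node restricting Linux to the
first eight channels would carry something like (label and value
illustrative only):

    &hpdma1 {
      dma-channel-mask = <0xff>;
    };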
> 
> Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
> ---
>  drivers/dma/stm32/Kconfig      |   10 +
>  drivers/dma/stm32/Makefile     |    1 +
>  drivers/dma/stm32/stm32-dma3.c | 1431 ++++++++++++++++++++++++++++++++
>  3 files changed, 1442 insertions(+)
>  create mode 100644 drivers/dma/stm32/stm32-dma3.c
> 
> diff --git a/drivers/dma/stm32/Kconfig b/drivers/dma/stm32/Kconfig
> index b72ae1a4502f..4d8d8063133b 100644
> --- a/drivers/dma/stm32/Kconfig
> +++ b/drivers/dma/stm32/Kconfig
> @@ -34,4 +34,14 @@ config STM32_MDMA
>  	  If you have a board based on STM32 SoC with such DMA controller
>  	  and want to use MDMA say Y here.
>  
> +config STM32_DMA3
> +	tristate "STMicroelectronics STM32 DMA3 support"
> +	select DMA_ENGINE
> +	select DMA_VIRTUAL_CHANNELS
> +	help
> +	  Enable support for the on-chip DMA3 controller on STMicroelectronics
> +	  STM32 platforms.
> +	  If you have a board based on STM32 SoC with such DMA3 controller
> +	  and want to use DMA3, say Y here.
> +
>  endif
> diff --git a/drivers/dma/stm32/Makefile b/drivers/dma/stm32/Makefile
> index 663a3896a881..5082db4b4c1c 100644
> --- a/drivers/dma/stm32/Makefile
> +++ b/drivers/dma/stm32/Makefile
> @@ -2,3 +2,4 @@
>  obj-$(CONFIG_STM32_DMA) += stm32-dma.o
>  obj-$(CONFIG_STM32_DMAMUX) += stm32-dmamux.o
>  obj-$(CONFIG_STM32_MDMA) += stm32-mdma.o
> +obj-$(CONFIG_STM32_DMA3) += stm32-dma3.o

are there any similarities in mdma/dma and dma3..?
can anything be reused...?

> diff --git a/drivers/dma/stm32/stm32-dma3.c b/drivers/dma/stm32/stm32-dma3.c
> new file mode 100644
> index 000000000000..b5493f497d06
> --- /dev/null
> +++ b/drivers/dma/stm32/stm32-dma3.c
> @@ -0,0 +1,1431 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * STM32 DMA3 controller driver
> + *
> + * Copyright (C) STMicroelectronics 2024
> + * Author(s): Amelie Delaunay <amelie.delaunay@foss.st.com>
> + */
> +
> +#include <linux/bitfield.h>
> +#include <linux/clk.h>
> +#include <linux/dma-mapping.h>
> +#include <linux/dmaengine.h>
> +#include <linux/dmapool.h>
> +#include <linux/init.h>
> +#include <linux/iopoll.h>
> +#include <linux/list.h>
> +#include <linux/module.h>
> +#include <linux/of_dma.h>
> +#include <linux/platform_device.h>
> +#include <linux/pm_runtime.h>
> +#include <linux/reset.h>
> +#include <linux/slab.h>
> +
> +#include "../virt-dma.h"
> +
> +#define STM32_DMA3_SECCFGR		0x00
> +#define STM32_DMA3_PRIVCFGR		0x04
> +#define STM32_DMA3_RCFGLOCKR		0x08
> +#define STM32_DMA3_MISR			0x0C

lower hex please

> +#define STM32_DMA3_SMISR		0x10
> +
> +#define STM32_DMA3_CLBAR(x)		(0x50 + 0x80 * (x))
> +#define STM32_DMA3_CCIDCFGR(x)		(0x54 + 0x80 * (x))
> +#define STM32_DMA3_CSEMCR(x)		(0x58 + 0x80 * (x))
> +#define STM32_DMA3_CFCR(x)		(0x5C + 0x80 * (x))
> +#define STM32_DMA3_CSR(x)		(0x60 + 0x80 * (x))
> +#define STM32_DMA3_CCR(x)		(0x64 + 0x80 * (x))
> +#define STM32_DMA3_CTR1(x)		(0x90 + 0x80 * (x))
> +#define STM32_DMA3_CTR2(x)		(0x94 + 0x80 * (x))
> +#define STM32_DMA3_CBR1(x)		(0x98 + 0x80 * (x))
> +#define STM32_DMA3_CSAR(x)		(0x9C + 0x80 * (x))
> +#define STM32_DMA3_CDAR(x)		(0xA0 + 0x80 * (x))
> +#define STM32_DMA3_CLLR(x)		(0xCC + 0x80 * (x))
> +
> +#define STM32_DMA3_HWCFGR13		0xFC0 /* G_PER_CTRL(X) x=8..15 */
> +#define STM32_DMA3_HWCFGR12		0xFC4 /* G_PER_CTRL(X) x=0..7 */
> +#define STM32_DMA3_HWCFGR4		0xFE4 /* G_FIFO_SIZE(X) x=8..15 */
> +#define STM32_DMA3_HWCFGR3		0xFE8 /* G_FIFO_SIZE(X) x=0..7 */
> +#define STM32_DMA3_HWCFGR2		0xFEC /* G_MAX_REQ_ID */
> +#define STM32_DMA3_HWCFGR1		0xFF0 /* G_MASTER_PORTS, G_NUM_CHANNELS, G_Mx_DATA_WIDTH */
> +#define STM32_DMA3_VERR			0xFF4

here as well

> +
> +/* SECCFGR DMA secure configuration register */
> +#define SECCFGR_SEC(x)			BIT(x)
> +
> +/* MISR DMA non-secure/secure masked interrupt status register */
> +#define MISR_MIS(x)			BIT(x)
> +
> +/* CxLBAR DMA channel x linked_list base address register */
> +#define CLBAR_LBA			GENMASK(31, 16)
> +
> +/* CxCIDCFGR DMA channel x CID register */
> +#define CCIDCFGR_CFEN			BIT(0)
> +#define CCIDCFGR_SEM_EN			BIT(1)
> +#define CCIDCFGR_SCID			GENMASK(5, 4)
> +#define CCIDCFGR_SEM_WLIST_CID0		BIT(16)
> +#define CCIDCFGR_SEM_WLIST_CID1		BIT(17)
> +#define CCIDCFGR_SEM_WLIST_CID2		BIT(18)
> +
> +enum ccidcfgr_cid {
> +	CCIDCFGR_CID0,
> +	CCIDCFGR_CID1,
> +	CCIDCFGR_CID2,
> +};
> +
> +/* CxSEMCR DMA channel x semaphore control register */
> +#define CSEMCR_SEM_MUTEX		BIT(0)
> +#define CSEMCR_SEM_CCID			GENMASK(5, 4)
> +
> +/* CxFCR DMA channel x flag clear register */
> +#define CFCR_TCF			BIT(8)
> +#define CFCR_HTF			BIT(9)
> +#define CFCR_DTEF			BIT(10)
> +#define CFCR_ULEF			BIT(11)
> +#define CFCR_USEF			BIT(12)
> +#define CFCR_SUSPF			BIT(13)
> +
> +/* CxSR DMA channel x status register */
> +#define CSR_IDLEF			BIT(0)
> +#define CSR_TCF				BIT(8)
> +#define CSR_HTF				BIT(9)
> +#define CSR_DTEF			BIT(10)
> +#define CSR_ULEF			BIT(11)
> +#define CSR_USEF			BIT(12)
> +#define CSR_SUSPF			BIT(13)
> +#define CSR_ALL_F			GENMASK(13, 8)
> +#define CSR_FIFOL			GENMASK(24, 16)
> +
> +/* CxCR DMA channel x control register */
> +#define CCR_EN				BIT(0)
> +#define CCR_RESET			BIT(1)
> +#define CCR_SUSP			BIT(2)
> +#define CCR_TCIE			BIT(8)
> +#define CCR_HTIE			BIT(9)
> +#define CCR_DTEIE			BIT(10)
> +#define CCR_ULEIE			BIT(11)
> +#define CCR_USEIE			BIT(12)
> +#define CCR_SUSPIE			BIT(13)
> +#define CCR_ALLIE			GENMASK(13, 8)
> +#define CCR_LSM				BIT(16)
> +#define CCR_LAP				BIT(17)
> +#define CCR_PRIO			GENMASK(23, 22)
> +
> +enum ccr_prio {
> +	CCR_PRIO_LOW,
> +	CCR_PRIO_MID,
> +	CCR_PRIO_HIGH,
> +	CCR_PRIO_VERY_HIGH,
> +};
> +
> +/* CxTR1 DMA channel x transfer register 1 */
> +#define CTR1_SINC			BIT(3)
> +#define CTR1_SBL_1			GENMASK(9, 4)
> +#define CTR1_DINC			BIT(19)
> +#define CTR1_DBL_1			GENMASK(25, 20)
> +#define CTR1_SDW_LOG2			GENMASK(1, 0)
> +#define CTR1_PAM			GENMASK(12, 11)
> +#define CTR1_SAP			BIT(14)
> +#define CTR1_DDW_LOG2			GENMASK(17, 16)
> +#define CTR1_DAP			BIT(30)
> +
> +enum ctr1_dw {
> +	CTR1_DW_BYTE,
> +	CTR1_DW_HWORD,
> +	CTR1_DW_WORD,
> +	CTR1_DW_DWORD, /* Depends on HWCFGR1.G_M0_DATA_WIDTH_ENC and .G_M1_DATA_WIDTH_ENC */
> +};
> +
> +enum ctr1_pam {
> +	CTR1_PAM_0S_LT, /* if DDW > SDW, padded with 0s else left-truncated */
> +	CTR1_PAM_SE_RT, /* if DDW > SDW, sign extended else right-truncated */
> +	CTR1_PAM_PACK_UNPACK, /* FIFO queued */
> +};
> +
> +/* CxTR2 DMA channel x transfer register 2 */
> +#define CTR2_REQSEL			GENMASK(7, 0)
> +#define CTR2_SWREQ			BIT(9)
> +#define CTR2_DREQ			BIT(10)
> +#define CTR2_BREQ			BIT(11)
> +#define CTR2_PFREQ			BIT(12)
> +#define CTR2_TCEM			GENMASK(31, 30)
> +
> +enum ctr2_tcem {
> +	CTR2_TCEM_BLOCK,
> +	CTR2_TCEM_REPEAT_BLOCK,
> +	CTR2_TCEM_LLI,
> +	CTR2_TCEM_CHANNEL,
> +};
> +
> +/* CxBR1 DMA channel x block register 1 */
> +#define CBR1_BNDT			GENMASK(15, 0)
> +
> +/* CxLLR DMA channel x linked-list address register */
> +#define CLLR_LA				GENMASK(15, 2)
> +#define CLLR_ULL			BIT(16)
> +#define CLLR_UDA			BIT(27)
> +#define CLLR_USA			BIT(28)
> +#define CLLR_UB1			BIT(29)
> +#define CLLR_UT2			BIT(30)
> +#define CLLR_UT1			BIT(31)
> +
> +/* HWCFGR13 DMA hardware configuration register 13 x=8..15 */
> +/* HWCFGR12 DMA hardware configuration register 12 x=0..7 */
> +#define G_PER_CTRL(x)			(ULL(0x1) << (4 * (x)))
> +
> +/* HWCFGR4 DMA hardware configuration register 4 x=8..15 */
> +/* HWCFGR3 DMA hardware configuration register 3 x=0..7 */
> +#define G_FIFO_SIZE(x)			(ULL(0x7) << (4 * (x)))
> +
> +#define get_chan_hwcfg(x, mask, reg)	(((reg) & (mask)) >> (4 * (x)))
> +
> +/* HWCFGR2 DMA hardware configuration register 2 */
> +#define G_MAX_REQ_ID			GENMASK(7, 0)
> +
> +/* HWCFGR1 DMA hardware configuration register 1 */
> +#define G_MASTER_PORTS			GENMASK(2, 0)
> +#define G_NUM_CHANNELS			GENMASK(12, 8)
> +#define G_M0_DATA_WIDTH_ENC		GENMASK(25, 24)
> +#define G_M1_DATA_WIDTH_ENC		GENMASK(29, 28)
> +
> +enum stm32_dma3_master_ports {
> +	AXI64, /* 1x AXI: 64-bit port 0 */
> +	AHB32, /* 1x AHB: 32-bit port 0 */
> +	AHB32_AHB32, /* 2x AHB: 32-bit port 0 and 32-bit port 1 */
> +	AXI64_AHB32, /* 1x AXI 64-bit port 0 and 1x AHB 32-bit port 1 */
> +	AXI64_AXI64, /* 2x AXI: 64-bit port 0 and 64-bit port 1 */
> +	AXI128_AHB32, /* 1x AXI 128-bit port 0 and 1x AHB 32-bit port 1 */
> +};
> +
> +enum stm32_dma3_port_data_width {
> +	DW_32, /* 32-bit, for AHB */
> +	DW_64, /* 64-bit, for AXI */
> +	DW_128, /* 128-bit, for AXI */
> +	DW_INVALID,
> +};
> +
> +/* VERR DMA version register */
> +#define VERR_MINREV			GENMASK(3, 0)
> +#define VERR_MAJREV			GENMASK(7, 4)
> +
> +/* Device tree */
> +/* struct stm32_dma3_dt_conf */
> +/* .ch_conf */
> +#define STM32_DMA3_DT_PRIO		GENMASK(1, 0) /* CCR_PRIO */
> +#define STM32_DMA3_DT_FIFO		GENMASK(7, 4)
> +/* .tr_conf */
> +#define STM32_DMA3_DT_SINC		BIT(0) /* CTR1_SINC */
> +#define STM32_DMA3_DT_SAP		BIT(1) /* CTR1_SAP */
> +#define STM32_DMA3_DT_DINC		BIT(4) /* CTR1_DINC */
> +#define STM32_DMA3_DT_DAP		BIT(5) /* CTR1_DAP */
> +#define STM32_DMA3_DT_BREQ		BIT(8) /* CTR2_BREQ */
> +#define STM32_DMA3_DT_PFREQ		BIT(9) /* CTR2_PFREQ */
> +#define STM32_DMA3_DT_TCEM		GENMASK(13, 12) /* CTR2_TCEM */
> +
> +#define STM32_DMA3_MAX_BLOCK_SIZE	ALIGN_DOWN(CBR1_BNDT, 64)
> +#define port_is_ahb(maxdw)		({ typeof(maxdw) (_maxdw) = (maxdw); \
> +					   ((_maxdw) != DW_INVALID) && ((_maxdw) == DW_32); })
> +#define port_is_axi(maxdw)		({ typeof(maxdw) (_maxdw) = (maxdw); \
> +					   ((_maxdw) != DW_INVALID) && ((_maxdw) != DW_32); })
> +#define get_chan_max_dw(maxdw, maxburst)((port_is_ahb(maxdw) ||			     \
> +					  (maxburst) < DMA_SLAVE_BUSWIDTH_8_BYTES) ? \
> +					 DMA_SLAVE_BUSWIDTH_4_BYTES : DMA_SLAVE_BUSWIDTH_8_BYTES)
> +
> +/* Static linked-list data structure (depends on update bits UT1/UT2/UB1/USA/UDA/ULL) */
> +struct stm32_dma3_hwdesc {
> +	u32 ctr1;
> +	u32 ctr2;
> +	u32 cbr1;
> +	u32 csar;
> +	u32 cdar;
> +	u32 cllr;
> +} __aligned(32);
> +
> +/*
> + * CLLR_LA / sizeof(struct stm32_dma3_hwdesc) represents the number of hdwdesc that can be addressed
> + * by the pointer to the next linked-list data structure. The __aligned forces the 32-byte
> + * alignment. So use hardcoded 32. Multiplied by the max block size of each item, it represents
> + * the sg size limitation.
> + */
> +#define STM32_DMA3_MAX_SEG_SIZE		((CLLR_LA / 32) * STM32_DMA3_MAX_BLOCK_SIZE)
> +
> +/*
> + * Linked-list items
> + */
> +struct stm32_dma3_lli {
> +	struct stm32_dma3_hwdesc *hwdesc;
> +	dma_addr_t hwdesc_addr;
> +};
> +
> +struct stm32_dma3_swdesc {
> +	struct virt_dma_desc vdesc;
> +	u32 ccr;
> +	bool cyclic;
> +	u32 lli_size;
> +	struct stm32_dma3_lli lli[] __counted_by(lli_size);
> +};
> +
> +struct stm32_dma3_dt_conf {
> +	u32 ch_id;
> +	u32 req_line;
> +	u32 ch_conf;
> +	u32 tr_conf;
> +};
> +
> +struct stm32_dma3_chan {
> +	struct virt_dma_chan vchan;
> +	u32 id;
> +	int irq;
> +	u32 fifo_size;
> +	u32 max_burst;
> +	bool semaphore_mode;
> +	struct stm32_dma3_dt_conf dt_config;
> +	struct dma_slave_config dma_config;
> +	struct dma_pool *lli_pool;
> +	struct stm32_dma3_swdesc *swdesc;
> +	enum ctr2_tcem tcem;
> +	u32 dma_status;
> +};
> +
> +struct stm32_dma3_ddata {
> +	struct dma_device dma_dev;
> +	void __iomem *base;
> +	struct clk *clk;
> +	struct stm32_dma3_chan *chans;
> +	u32 dma_channels;
> +	u32 dma_requests;
> +	enum stm32_dma3_port_data_width ports_max_dw[2];
> +};
> +
> +static inline struct stm32_dma3_ddata *to_stm32_dma3_ddata(struct stm32_dma3_chan *chan)
> +{
> +	return container_of(chan->vchan.chan.device, struct stm32_dma3_ddata, dma_dev);
> +}
> +
> +static inline struct stm32_dma3_chan *to_stm32_dma3_chan(struct dma_chan *c)
> +{
> +	return container_of(c, struct stm32_dma3_chan, vchan.chan);
> +}
> +
> +static inline struct stm32_dma3_swdesc *to_stm32_dma3_swdesc(struct virt_dma_desc *vdesc)
> +{
> +	return container_of(vdesc, struct stm32_dma3_swdesc, vdesc);
> +}
> +
> +static struct device *chan2dev(struct stm32_dma3_chan *chan)
> +{
> +	return &chan->vchan.chan.dev->device;
> +}
> +
> +static void stm32_dma3_chan_dump_reg(struct stm32_dma3_chan *chan)
> +{
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	struct device *dev = chan2dev(chan);
> +	u32 id = chan->id, offset;
> +
> +	offset = STM32_DMA3_SECCFGR;
> +	dev_dbg(dev, "SECCFGR(0x%03x): %08x\n", offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_PRIVCFGR;
> +	dev_dbg(dev, "PRIVCFGR(0x%03x): %08x\n", offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CCIDCFGR(id);
> +	dev_dbg(dev, "C%dCIDCFGR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CSEMCR(id);
> +	dev_dbg(dev, "C%dSEMCR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CSR(id);
> +	dev_dbg(dev, "C%dSR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CCR(id);
> +	dev_dbg(dev, "C%dCR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CTR1(id);
> +	dev_dbg(dev, "C%dTR1(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CTR2(id);
> +	dev_dbg(dev, "C%dTR2(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CBR1(id);
> +	dev_dbg(dev, "C%dBR1(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CSAR(id);
> +	dev_dbg(dev, "C%dSAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CDAR(id);
> +	dev_dbg(dev, "C%dDAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CLLR(id);
> +	dev_dbg(dev, "C%dLLR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CLBAR(id);
> +	dev_dbg(dev, "C%dLBAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +}
> +
> +static void stm32_dma3_chan_dump_hwdesc(struct stm32_dma3_chan *chan,
> +					struct stm32_dma3_swdesc *swdesc)
> +{
> +	struct stm32_dma3_hwdesc *hwdesc;
> +	int i;
> +
> +	for (i = 0; i < swdesc->lli_size; i++) {
> +		hwdesc = swdesc->lli[i].hwdesc;
> +		if (i)
> +			dev_dbg(chan2dev(chan), "V\n");
> +		dev_dbg(chan2dev(chan), "[%d]@%pad\n", i, &swdesc->lli[i].hwdesc_addr);
> +		dev_dbg(chan2dev(chan), "| C%dTR1: %08x\n", chan->id, hwdesc->ctr1);
> +		dev_dbg(chan2dev(chan), "| C%dTR2: %08x\n", chan->id, hwdesc->ctr2);
> +		dev_dbg(chan2dev(chan), "| C%dBR1: %08x\n", chan->id, hwdesc->cbr1);
> +		dev_dbg(chan2dev(chan), "| C%dSAR: %08x\n", chan->id, hwdesc->csar);
> +		dev_dbg(chan2dev(chan), "| C%dDAR: %08x\n", chan->id, hwdesc->cdar);
> +		dev_dbg(chan2dev(chan), "| C%dLLR: %08x\n", chan->id, hwdesc->cllr);
> +	}
> +
> +	if (swdesc->cyclic) {
> +		dev_dbg(chan2dev(chan), "|\n");
> +		dev_dbg(chan2dev(chan), "-->[0]@%pad\n", &swdesc->lli[0].hwdesc_addr);
> +	} else {
> +		dev_dbg(chan2dev(chan), "X\n");
> +	}
> +}
> +
> +static struct stm32_dma3_swdesc *stm32_dma3_chan_desc_alloc(struct stm32_dma3_chan *chan, u32 count)
> +{
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	struct stm32_dma3_swdesc *swdesc;
> +	int i;
> +
> +	/*
> +	 * If the memory to be allocated for the number of hwdesc (6 u32 members but 32-bytes
> +	 * aligned) is greater than the maximum address of CLLR_LA, then the last items can't be
> +	 * addressed, so abort the allocation.
> +	 */
> +	if ((count * 32) > CLLR_LA) {
> +		dev_err(chan2dev(chan), "Transfer is too big (> %luB)\n", STM32_DMA3_MAX_SEG_SIZE);
> +		return NULL;
> +	}
> +
> +	swdesc = kzalloc(struct_size(swdesc, lli, count), GFP_NOWAIT);
> +	if (!swdesc)
> +		return NULL;
> +
> +	for (i = 0; i < count; i++) {
> +		swdesc->lli[i].hwdesc = dma_pool_zalloc(chan->lli_pool, GFP_NOWAIT,
> +							&swdesc->lli[i].hwdesc_addr);
> +		if (!swdesc->lli[i].hwdesc)
> +			goto err_pool_free;
> +	}
> +	swdesc->lli_size = count;
> +	swdesc->ccr = 0;
> +
> +	/* Set LL base address */
> +	writel_relaxed(swdesc->lli[0].hwdesc_addr & CLBAR_LBA,
> +		       ddata->base + STM32_DMA3_CLBAR(chan->id));
> +
> +	/* Set LL allocated port */
> +	swdesc->ccr &= ~CCR_LAP;
> +
> +	return swdesc;
> +
> +err_pool_free:
> +	dev_err(chan2dev(chan), "Failed to alloc descriptors\n");
> +	while (--i >= 0)
> +		dma_pool_free(chan->lli_pool, swdesc->lli[i].hwdesc, swdesc->lli[i].hwdesc_addr);
> +	kfree(swdesc);
> +
> +	return NULL;
> +}
> +
> +static void stm32_dma3_chan_desc_free(struct stm32_dma3_chan *chan,
> +				      struct stm32_dma3_swdesc *swdesc)
> +{
> +	int i;
> +
> +	for (i = 0; i < swdesc->lli_size; i++)
> +		dma_pool_free(chan->lli_pool, swdesc->lli[i].hwdesc, swdesc->lli[i].hwdesc_addr);
> +
> +	kfree(swdesc);
> +}
> +
> +static void stm32_dma3_chan_vdesc_free(struct virt_dma_desc *vdesc)
> +{
> +	struct stm32_dma3_swdesc *swdesc = to_stm32_dma3_swdesc(vdesc);
> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(vdesc->tx.chan);
> +
> +	stm32_dma3_chan_desc_free(chan, swdesc);
> +}
> +
> +static void stm32_dma3_check_user_setting(struct stm32_dma3_chan *chan)
> +{
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	struct device *dev = chan2dev(chan);
> +	u32 ctr1 = readl_relaxed(ddata->base + STM32_DMA3_CTR1(chan->id));
> +	u32 cbr1 = readl_relaxed(ddata->base + STM32_DMA3_CBR1(chan->id));
> +	u32 csar = readl_relaxed(ddata->base + STM32_DMA3_CSAR(chan->id));
> +	u32 cdar = readl_relaxed(ddata->base + STM32_DMA3_CDAR(chan->id));
> +	u32 cllr = readl_relaxed(ddata->base + STM32_DMA3_CLLR(chan->id));
> +	u32 bndt = FIELD_GET(CBR1_BNDT, cbr1);
> +	u32 sdw = 1 << FIELD_GET(CTR1_SDW_LOG2, ctr1);
> +	u32 ddw = 1 << FIELD_GET(CTR1_DDW_LOG2, ctr1);
> +	u32 sap = FIELD_GET(CTR1_SAP, ctr1);
> +	u32 dap = FIELD_GET(CTR1_DAP, ctr1);
> +
> +	if (!bndt && !FIELD_GET(CLLR_UB1, cllr))
> +		dev_err(dev, "null source block size and no update of this value\n");
> +	if (bndt % sdw)
> +		dev_err(dev, "source block size not multiple of src data width\n");
> +	if (FIELD_GET(CTR1_PAM, ctr1) == CTR1_PAM_PACK_UNPACK && bndt % ddw)
> +		dev_err(dev, "(un)packing mode w/ src block size not multiple of dst data width\n");
> +	if (csar % sdw)
> +		dev_err(dev, "unaligned source address not multiple of src data width\n");
> +	if (cdar % ddw)
> +		dev_err(dev, "unaligned destination address not multiple of dst data width\n");
> +	if (sdw == DMA_SLAVE_BUSWIDTH_8_BYTES && port_is_ahb(ddata->ports_max_dw[sap]))
> +		dev_err(dev, "double-word source data width not supported on port %u\n", sap);
> +	if (ddw == DMA_SLAVE_BUSWIDTH_8_BYTES && port_is_ahb(ddata->ports_max_dw[dap]))
> +		dev_err(dev, "double-word destination data width not supported on port %u\n", dap);

NO error/abort if this is wrong...?

> +}
> +
> +static void stm32_dma3_chan_prep_hwdesc(struct stm32_dma3_chan *chan,
> +					struct stm32_dma3_swdesc *swdesc,
> +					u32 curr, dma_addr_t src, dma_addr_t dst, u32 len,
> +					u32 ctr1, u32 ctr2, bool is_last, bool is_cyclic)
> +{
> +	struct stm32_dma3_hwdesc *hwdesc;
> +	dma_addr_t next_lli;
> +	u32 next = curr + 1;
> +
> +	hwdesc = swdesc->lli[curr].hwdesc;
> +	hwdesc->ctr1 = ctr1;
> +	hwdesc->ctr2 = ctr2;
> +	hwdesc->cbr1 = FIELD_PREP(CBR1_BNDT, len);
> +	hwdesc->csar = src;
> +	hwdesc->cdar = dst;
> +
> +	if (is_last) {
> +		if (is_cyclic)
> +			next_lli = swdesc->lli[0].hwdesc_addr;
> +		else
> +			next_lli = 0;
> +	} else {
> +		next_lli = swdesc->lli[next].hwdesc_addr;
> +	}
> +
> +	hwdesc->cllr = 0;
> +	if (next_lli) {
> +		hwdesc->cllr |= CLLR_UT1 | CLLR_UT2 | CLLR_UB1;
> +		hwdesc->cllr |= CLLR_USA | CLLR_UDA | CLLR_ULL;
> +		hwdesc->cllr |= (next_lli & CLLR_LA);
> +	}
> +}
> +
> +static enum dma_slave_buswidth stm32_dma3_get_max_dw(u32 chan_max_burst,
> +						     enum stm32_dma3_port_data_width port_max_dw,
> +						     u32 len, dma_addr_t addr)
> +{
> +	enum dma_slave_buswidth max_dw = get_chan_max_dw(port_max_dw, chan_max_burst);
> +
> +	/* len and addr must be a multiple of dw */
> +	return 1 << __ffs(len | addr | max_dw);
> +}
> +
> +static u32 stm32_dma3_get_max_burst(u32 len, enum dma_slave_buswidth dw, u32 chan_max_burst)
> +{
> +	u32 max_burst = chan_max_burst ? chan_max_burst / dw : 1;
> +
> +	/* len is a multiple of dw, so if len is < chan_max_burst, shorten burst */
> +	if (len < chan_max_burst)
> +		max_burst = len / dw;
> +
> +	/*
> +	 * HW doesn't modify the burst if burst size <= half of the fifo size.
> +	 * If len is not a multiple of burst size, last burst is shortened by HW.
> +	 */
> +	return max_burst;
> +}
> +
> +static int stm32_dma3_chan_prep_hw(struct stm32_dma3_chan *chan, enum dma_transfer_direction dir,
> +				   u32 *ccr, u32 *ctr1, u32 *ctr2,
> +				   dma_addr_t src_addr, dma_addr_t dst_addr, u32 len)
> +{
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	struct dma_device dma_device = ddata->dma_dev;
> +	u32 sdw, ddw, sbl_max, dbl_max, tcem;
> +	u32 _ctr1 = 0, _ctr2 = 0;
> +	u32 ch_conf = chan->dt_config.ch_conf;
> +	u32 tr_conf = chan->dt_config.tr_conf;
> +	u32 sap = FIELD_GET(STM32_DMA3_DT_SAP, tr_conf), sap_max_dw;
> +	u32 dap = FIELD_GET(STM32_DMA3_DT_DAP, tr_conf), dap_max_dw;
> +
> +	dev_dbg(chan2dev(chan), "%s from %pad to %pad\n",
> +		dmaengine_get_direction_text(dir), &src_addr, &dst_addr);
> +
> +	sdw = chan->dma_config.src_addr_width ? : get_chan_max_dw(sap, chan->max_burst);
> +	ddw = chan->dma_config.dst_addr_width ? : get_chan_max_dw(dap, chan->max_burst);
> +	sbl_max = chan->dma_config.src_maxburst ? : 1;
> +	dbl_max = chan->dma_config.dst_maxburst ? : 1;
> +
> +	/* Following conditions would raise User Setting Error interrupt */
> +	if (!(dma_device.src_addr_widths & BIT(sdw)) || !(dma_device.dst_addr_widths & BIT(ddw))) {
> +		dev_err(chan2dev(chan), "Bus width (src=%u, dst=%u) not supported\n", sdw, ddw);
> +		return -EINVAL;
> +	}
> +
> +	if (ddata->ports_max_dw[1] == DW_INVALID && (sap || dap)) {
> +		dev_err(chan2dev(chan), "Only one master port, port 1 is not supported\n");
> +		return -EINVAL;
> +	}
> +
> +	sap_max_dw = ddata->ports_max_dw[sap];
> +	dap_max_dw = ddata->ports_max_dw[dap];
> +	if ((port_is_ahb(sap_max_dw) && sdw == DMA_SLAVE_BUSWIDTH_8_BYTES) ||
> +	    (port_is_ahb(dap_max_dw) && ddw == DMA_SLAVE_BUSWIDTH_8_BYTES)) {
> +		dev_err(chan2dev(chan),
> +			"8 bytes buswidth (src=%u, dst=%u) not supported on port (sap=%u, dap=%u\n",
> +			sdw, ddw, sap, dap);
> +		return -EINVAL;
> +	}
> +
> +	if (FIELD_GET(STM32_DMA3_DT_SINC, tr_conf))
> +		_ctr1 |= CTR1_SINC;
> +	if (sap)
> +		_ctr1 |= CTR1_SAP;
> +	if (FIELD_GET(STM32_DMA3_DT_DINC, tr_conf))
> +		_ctr1 |= CTR1_DINC;
> +	if (dap)
> +		_ctr1 |= CTR1_DAP;
> +
> +	_ctr2 |= FIELD_PREP(CTR2_REQSEL, chan->dt_config.req_line) & ~CTR2_SWREQ;
> +	if (FIELD_GET(STM32_DMA3_DT_BREQ, tr_conf))
> +		_ctr2 |= CTR2_BREQ;
> +	if (dir == DMA_DEV_TO_MEM && FIELD_GET(STM32_DMA3_DT_PFREQ, tr_conf))
> +		_ctr2 |= CTR2_PFREQ;
> +	tcem = FIELD_GET(STM32_DMA3_DT_TCEM, tr_conf);
> +	_ctr2 |= FIELD_PREP(CTR2_TCEM, tcem);
> +
> +	/* Store TCEM to know on which event TC flag occurred */
> +	chan->tcem = tcem;
> +	/* Store direction for residue computation */
> +	chan->dma_config.direction = dir;
> +
> +	switch (dir) {
> +	case DMA_MEM_TO_DEV:
> +		/* Set destination (device) data width and burst */
> +		ddw = min_t(u32, ddw, stm32_dma3_get_max_dw(chan->max_burst, dap_max_dw,
> +							    len, dst_addr));
> +		dbl_max = min_t(u32, dbl_max, stm32_dma3_get_max_burst(len, ddw, chan->max_burst));
> +
> +		/* Set source (memory) data width and burst */
> +		sdw = stm32_dma3_get_max_dw(chan->max_burst, sap_max_dw, len, src_addr);
> +		sbl_max = stm32_dma3_get_max_burst(len, sdw, chan->max_burst);
> +
> +		_ctr1 |= FIELD_PREP(CTR1_SDW_LOG2, ilog2(sdw));
> +		_ctr1 |= FIELD_PREP(CTR1_SBL_1, sbl_max - 1);
> +		_ctr1 |= FIELD_PREP(CTR1_DDW_LOG2, ilog2(ddw));
> +		_ctr1 |= FIELD_PREP(CTR1_DBL_1, dbl_max - 1);
> +
> +		if (ddw != sdw) {
> +			_ctr1 |= FIELD_PREP(CTR1_PAM, CTR1_PAM_PACK_UNPACK);
> +			/* Should never reach this case as ddw is clamped down */
> +			if (len & (ddw - 1)) {
> +				dev_err(chan2dev(chan),
> +					"Packing mode is enabled and len is not multiple of ddw");
> +				return -EINVAL;
> +			}
> +		}
> +
> +		/* dst = dev */
> +		_ctr2 |= CTR2_DREQ;
> +
> +		break;
> +
> +	case DMA_DEV_TO_MEM:
> +		/* Set source (device) data width and burst */
> +		sdw = min_t(u32, sdw, stm32_dma3_get_max_dw(chan->max_burst, sap_max_dw,
> +							    len, src_addr));
> +		sbl_max = min_t(u32, sbl_max, stm32_dma3_get_max_burst(len, sdw, chan->max_burst));
> +
> +		/* Set destination (memory) data width and burst */
> +		ddw = stm32_dma3_get_max_dw(chan->max_burst, dap_max_dw, len, dst_addr);
> +		dbl_max = stm32_dma3_get_max_burst(len, ddw, chan->max_burst);
> +
> +		_ctr1 |= FIELD_PREP(CTR1_SDW_LOG2, ilog2(sdw));
> +		_ctr1 |= FIELD_PREP(CTR1_SBL_1, sbl_max - 1);
> +		_ctr1 |= FIELD_PREP(CTR1_DDW_LOG2, ilog2(ddw));
> +		_ctr1 |= FIELD_PREP(CTR1_DBL_1, dbl_max - 1);
> +
> +		if (ddw != sdw) {
> +			_ctr1 |= FIELD_PREP(CTR1_PAM, CTR1_PAM_PACK_UNPACK);
> +			/* Should never reach this case as ddw is clamped down */
> +			if (len & (ddw - 1)) {
> +				dev_err(chan2dev(chan),
> +					"Packing mode is enabled and len is not multiple of ddw\n");
> +				return -EINVAL;
> +			}
> +		}
> +
> +		/* dst = mem */
> +		_ctr2 &= ~CTR2_DREQ;
> +
> +		break;
> +
> +	default:
> +		dev_err(chan2dev(chan), "Direction %s not supported\n",
> +			dmaengine_get_direction_text(dir));
> +		return -EINVAL;
> +	}
> +
> +	*ccr |= FIELD_PREP(CCR_PRIO, FIELD_GET(STM32_DMA3_DT_PRIO, ch_conf));
> +	*ctr1 = _ctr1;
> +	*ctr2 = _ctr2;
> +
> +	dev_dbg(chan2dev(chan), "%s: sdw=%u bytes sbl=%u beats ddw=%u bytes dbl=%u beats\n",
> +		__func__, sdw, sbl_max, ddw, dbl_max);
> +
> +	return 0;
> +}
> +
> +static void stm32_dma3_chan_start(struct stm32_dma3_chan *chan)
> +{
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	struct virt_dma_desc *vdesc;
> +	struct stm32_dma3_hwdesc *hwdesc;
> +	u32 id = chan->id;
> +	u32 csr, ccr;
> +
> +	vdesc = vchan_next_desc(&chan->vchan);
> +	if (!vdesc) {
> +		chan->swdesc = NULL;
> +		return;
> +	}
> +	list_del(&vdesc->node);
> +
> +	chan->swdesc = to_stm32_dma3_swdesc(vdesc);
> +	hwdesc = chan->swdesc->lli[0].hwdesc;
> +
> +	stm32_dma3_chan_dump_hwdesc(chan, chan->swdesc);
> +
> +	writel_relaxed(chan->swdesc->ccr, ddata->base + STM32_DMA3_CCR(id));
> +	writel_relaxed(hwdesc->ctr1, ddata->base + STM32_DMA3_CTR1(id));
> +	writel_relaxed(hwdesc->ctr2, ddata->base + STM32_DMA3_CTR2(id));
> +	writel_relaxed(hwdesc->cbr1, ddata->base + STM32_DMA3_CBR1(id));
> +	writel_relaxed(hwdesc->csar, ddata->base + STM32_DMA3_CSAR(id));
> +	writel_relaxed(hwdesc->cdar, ddata->base + STM32_DMA3_CDAR(id));
> +	writel_relaxed(hwdesc->cllr, ddata->base + STM32_DMA3_CLLR(id));
> +
> +	/* Clear any pending interrupts */
> +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(id));
> +	if (csr & CSR_ALL_F)
> +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(id));
> +
> +	stm32_dma3_chan_dump_reg(chan);
> +
> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(id));
> +	writel_relaxed(ccr | CCR_EN, ddata->base + STM32_DMA3_CCR(id));
> +
> +	chan->dma_status = DMA_IN_PROGRESS;
> +
> +	dev_dbg(chan2dev(chan), "vchan %pK: started\n", &chan->vchan);
> +}
> +
> +static int stm32_dma3_chan_suspend(struct stm32_dma3_chan *chan, bool susp)
> +{
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	u32 csr, ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
> +	int ret = 0;
> +
> +	if (susp)
> +		ccr |= CCR_SUSP;
> +	else
> +		ccr &= ~CCR_SUSP;
> +
> +	writel_relaxed(ccr, ddata->base + STM32_DMA3_CCR(chan->id));
> +
> +	if (susp) {
> +		ret = readl_relaxed_poll_timeout_atomic(ddata->base + STM32_DMA3_CSR(chan->id), csr,
> +							csr & CSR_SUSPF, 1, 10);
> +		if (!ret)
> +			writel_relaxed(CFCR_SUSPF, ddata->base + STM32_DMA3_CFCR(chan->id));
> +
> +		stm32_dma3_chan_dump_reg(chan);
> +	}
> +
> +	return ret;
> +}
> +
> +static void stm32_dma3_chan_reset(struct stm32_dma3_chan *chan)
> +{
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	u32 ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
> +
> +	writel_relaxed(ccr |= CCR_RESET, ddata->base + STM32_DMA3_CCR(chan->id));
> +}
> +
> +static int stm32_dma3_chan_stop(struct stm32_dma3_chan *chan)
> +{
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	u32 ccr;
> +	int ret = 0;
> +
> +	chan->dma_status = DMA_COMPLETE;
> +
> +	/* Disable interrupts */
> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id));
> +	writel_relaxed(ccr & ~(CCR_ALLIE | CCR_EN), ddata->base + STM32_DMA3_CCR(chan->id));
> +
> +	if (!(ccr & CCR_SUSP) && (ccr & CCR_EN)) {
> +		/* Suspend the channel */
> +		ret = stm32_dma3_chan_suspend(chan, true);
> +		if (ret)
> +			dev_warn(chan2dev(chan), "%s: timeout, data might be lost\n", __func__);
> +	}
> +
> +	/*
> +	 * Reset the channel: this causes the reset of the FIFO and the reset of the channel
> +	 * internal state, the reset of CCR_EN and CCR_SUSP bits.
> +	 */
> +	stm32_dma3_chan_reset(chan);
> +
> +	return ret;
> +}
> +
> +static void stm32_dma3_chan_complete(struct stm32_dma3_chan *chan)
> +{
> +	if (!chan->swdesc)
> +		return;
> +
> +	vchan_cookie_complete(&chan->swdesc->vdesc);
> +	chan->swdesc = NULL;
> +	stm32_dma3_chan_start(chan);
> +}
> +
> +static irqreturn_t stm32_dma3_chan_irq(int irq, void *devid)
> +{
> +	struct stm32_dma3_chan *chan = devid;
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	u32 misr, csr, ccr;
> +
> +	spin_lock(&chan->vchan.lock);
> +
> +	misr = readl_relaxed(ddata->base + STM32_DMA3_MISR);
> +	if (!(misr & MISR_MIS(chan->id))) {
> +		spin_unlock(&chan->vchan.lock);
> +		return IRQ_NONE;
> +	}
> +
> +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(chan->id));
> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & CCR_ALLIE;
> +
> +	if (csr & CSR_TCF && ccr & CCR_TCIE) {
> +		if (chan->swdesc->cyclic)
> +			vchan_cyclic_callback(&chan->swdesc->vdesc);
> +		else
> +			stm32_dma3_chan_complete(chan);
> +	}
> +
> +	if (csr & CSR_USEF && ccr & CCR_USEIE) {
> +		dev_err(chan2dev(chan), "User setting error\n");
> +		chan->dma_status = DMA_ERROR;
> +		/* CCR.EN automatically cleared by HW */
> +		stm32_dma3_check_user_setting(chan);
> +		stm32_dma3_chan_reset(chan);
> +	}
> +
> +	if (csr & CSR_ULEF && ccr & CCR_ULEIE) {
> +		dev_err(chan2dev(chan), "Update link transfer error\n");
> +		chan->dma_status = DMA_ERROR;
> +		/* CCR.EN automatically cleared by HW */
> +		stm32_dma3_chan_reset(chan);
> +	}
> +
> +	if (csr & CSR_DTEF && ccr & CCR_DTEIE) {
> +		dev_err(chan2dev(chan), "Data transfer error\n");
> +		chan->dma_status = DMA_ERROR;
> +		/* CCR.EN automatically cleared by HW */
> +		stm32_dma3_chan_reset(chan);
> +	}
> +
> +	/*
> +	 * Half Transfer Interrupt may be disabled but Half Transfer Flag can be set,
> +	 * ensure HTF flag to be cleared, with other flags.
> +	 */
> +	csr &= (ccr | CCR_HTIE);
> +
> +	if (csr)
> +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(chan->id));
> +
> +	spin_unlock(&chan->vchan.lock);
> +
> +	return IRQ_HANDLED;
> +}
> +
> +static int stm32_dma3_alloc_chan_resources(struct dma_chan *c)
> +{
> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	u32 id = chan->id, csemcr, ccid;
> +	int ret;
> +
> +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
> +	if (ret < 0)
> +		return ret;
> +
> +	/* Ensure the channel is free */
> +	if (chan->semaphore_mode &&
> +	    readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id)) & CSEMCR_SEM_MUTEX) {
> +		ret = -EBUSY;
> +		goto err_put_sync;
> +	}
> +
> +	chan->lli_pool = dmam_pool_create(dev_name(&c->dev->device), c->device->dev,
> +					  sizeof(struct stm32_dma3_hwdesc),
> +					  __alignof__(struct stm32_dma3_hwdesc), 0);
> +	if (!chan->lli_pool) {
> +		dev_err(chan2dev(chan), "Failed to create LLI pool\n");
> +		ret = -ENOMEM;
> +		goto err_put_sync;
> +	}
> +
> +	/* Take the channel semaphore */
> +	if (chan->semaphore_mode) {
> +		writel_relaxed(CSEMCR_SEM_MUTEX, ddata->base + STM32_DMA3_CSEMCR(id));
> +		csemcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(id));
> +		ccid = FIELD_GET(CSEMCR_SEM_CCID, csemcr);
> +		/* Check that the channel is well taken */
> +		if (ccid != CCIDCFGR_CID1) {
> +			dev_err(chan2dev(chan), "Not under CID1 control (in-use by CID%d)\n", ccid);
> +			ret = -EPERM;
> +			goto err_pool_destroy;
> +		}
> +		dev_dbg(chan2dev(chan), "Under CID1 control (semcr=0x%08x)\n", csemcr);
> +	}
> +
> +	return 0;
> +
> +err_pool_destroy:
> +	dmam_pool_destroy(chan->lli_pool);
> +	chan->lli_pool = NULL;
> +
> +err_put_sync:
> +	pm_runtime_put_sync(ddata->dma_dev.dev);
> +
> +	return ret;
> +}
> +
> +static void stm32_dma3_free_chan_resources(struct dma_chan *c)
> +{
> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	unsigned long flags;
> +
> +	/* Ensure channel is in idle state */
> +	spin_lock_irqsave(&chan->vchan.lock, flags);
> +	stm32_dma3_chan_stop(chan);
> +	chan->swdesc = NULL;
> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
> +
> +	vchan_free_chan_resources(to_virt_chan(c));
> +
> +	dmam_pool_destroy(chan->lli_pool);
> +	chan->lli_pool = NULL;
> +
> +	/* Release the channel semaphore */
> +	if (chan->semaphore_mode)
> +		writel_relaxed(0, ddata->base + STM32_DMA3_CSEMCR(chan->id));
> +
> +	pm_runtime_put_sync(ddata->dma_dev.dev);
> +
> +	/* Reset configuration */
> +	memset(&chan->dt_config, 0, sizeof(chan->dt_config));
> +	memset(&chan->dma_config, 0, sizeof(chan->dma_config));
> +}
> +
> +static struct dma_async_tx_descriptor *stm32_dma3_prep_slave_sg(struct dma_chan *c,
> +								struct scatterlist *sgl,
> +								unsigned int sg_len,
> +								enum dma_transfer_direction dir,
> +								unsigned long flags, void *context)
> +{
> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> +	struct stm32_dma3_swdesc *swdesc;
> +	struct scatterlist *sg;
> +	size_t len;
> +	dma_addr_t sg_addr, dev_addr, src, dst;
> +	u32 i, j, count, ctr1, ctr2;
> +	int ret;
> +
> +	count = sg_len;
> +	for_each_sg(sgl, sg, sg_len, i) {
> +		len = sg_dma_len(sg);
> +		if (len > STM32_DMA3_MAX_BLOCK_SIZE)
> +			count += DIV_ROUND_UP(len, STM32_DMA3_MAX_BLOCK_SIZE) - 1;
> +	}
> +
> +	swdesc = stm32_dma3_chan_desc_alloc(chan, count);
> +	if (!swdesc)
> +		return NULL;
> +
> +	/* sg_len and i correspond to the initial sgl; count and j correspond to the hwdesc LL */
> +	j = 0;
> +	for_each_sg(sgl, sg, sg_len, i) {
> +		sg_addr = sg_dma_address(sg);
> +		dev_addr = (dir == DMA_MEM_TO_DEV) ? chan->dma_config.dst_addr :
> +						     chan->dma_config.src_addr;
> +		len = sg_dma_len(sg);
> +
> +		do {
> +			size_t chunk = min_t(size_t, len, STM32_DMA3_MAX_BLOCK_SIZE);
> +
> +			if (dir == DMA_MEM_TO_DEV) {
> +				src = sg_addr;
> +				dst = dev_addr;
> +
> +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
> +							      src, dst, chunk);
> +
> +				if (FIELD_GET(CTR1_DINC, ctr1))
> +					dev_addr += chunk;
> +			} else { /* (dir == DMA_DEV_TO_MEM || dir == DMA_MEM_TO_MEM) */
> +				src = dev_addr;
> +				dst = sg_addr;
> +
> +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
> +							      src, dst, chunk);
> +
> +				if (FIELD_GET(CTR1_SINC, ctr1))
> +					dev_addr += chunk;
> +			}
> +
> +			if (ret)
> +				goto err_desc_free;
> +
> +			stm32_dma3_chan_prep_hwdesc(chan, swdesc, j, src, dst, chunk,
> +						    ctr1, ctr2, j == (count - 1), false);
> +
> +			sg_addr += chunk;
> +			len -= chunk;
> +			j++;
> +		} while (len);
> +	}
> +
> +	/* Enable Error interrupts */
> +	swdesc->ccr |= CCR_USEIE | CCR_ULEIE | CCR_DTEIE;
> +	/* Enable Transfer state interrupts */
> +	swdesc->ccr |= CCR_TCIE;
> +
> +	swdesc->cyclic = false;
> +
> +	return vchan_tx_prep(&chan->vchan, &swdesc->vdesc, flags);
> +
> +err_desc_free:
> +	stm32_dma3_chan_desc_free(chan, swdesc);
> +
> +	return NULL;
> +}
> +
> +static void stm32_dma3_caps(struct dma_chan *c, struct dma_slave_caps *caps)
> +{
> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> +
> +	if (!chan->fifo_size) {
> +		caps->max_burst = 0;
> +		caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> +		caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> +	} else {
> +		/* Burst transfer should not exceed half of the fifo size */
> +		caps->max_burst = chan->max_burst;
> +		if (caps->max_burst < DMA_SLAVE_BUSWIDTH_8_BYTES) {
> +			caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> +			caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> +		}
> +	}
> +}
> +
> +static int stm32_dma3_config(struct dma_chan *c, struct dma_slave_config *config)
> +{
> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> +
> +	memcpy(&chan->dma_config, config, sizeof(*config));
> +
> +	return 0;
> +}
> +
> +static int stm32_dma3_terminate_all(struct dma_chan *c)
> +{
> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> +	unsigned long flags;
> +	LIST_HEAD(head);
> +
> +	spin_lock_irqsave(&chan->vchan.lock, flags);
> +
> +	if (chan->swdesc) {
> +		vchan_terminate_vdesc(&chan->swdesc->vdesc);
> +		chan->swdesc = NULL;
> +	}
> +
> +	stm32_dma3_chan_stop(chan);
> +
> +	vchan_get_all_descriptors(&chan->vchan, &head);
> +
> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
> +	vchan_dma_desc_free_list(&chan->vchan, &head);
> +
> +	dev_dbg(chan2dev(chan), "vchan %pK: terminated\n", &chan->vchan);
> +
> +	return 0;
> +}
> +
> +static void stm32_dma3_synchronize(struct dma_chan *c)
> +{
> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> +
> +	vchan_synchronize(&chan->vchan);
> +}
> +
> +static void stm32_dma3_issue_pending(struct dma_chan *c)
> +{
> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&chan->vchan.lock, flags);
> +
> +	if (vchan_issue_pending(&chan->vchan) && !chan->swdesc) {
> +		dev_dbg(chan2dev(chan), "vchan %pK: issued\n", &chan->vchan);
> +		stm32_dma3_chan_start(chan);
> +	}
> +
> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
> +}
> +
> +static bool stm32_dma3_filter_fn(struct dma_chan *c, void *fn_param)
> +{
> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	struct stm32_dma3_dt_conf *conf = fn_param;
> +	u32 mask, semcr;
> +	int ret;
> +
> +	dev_dbg(c->device->dev, "%s(%s): req_line=%d ch_conf=%08x tr_conf=%08x\n",
> +		__func__, dma_chan_name(c), conf->req_line, conf->ch_conf, conf->tr_conf);
> +
> +	if (!of_property_read_u32(c->device->dev->of_node, "dma-channel-mask", &mask))
> +		if (!(mask & BIT(chan->id)))
> +			return false;
> +
> +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
> +	if (ret < 0)
> +		return false;
> +	semcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id));
> +	pm_runtime_put_sync(ddata->dma_dev.dev);
> +
> +	/* Check if chan is free */
> +	if (semcr & CSEMCR_SEM_MUTEX)
> +		return false;
> +
> +	/* Check if chan fifo fits well */
> +	if (FIELD_GET(STM32_DMA3_DT_FIFO, conf->ch_conf) != chan->fifo_size)
> +		return false;
> +
> +	return true;
> +}
> +
> +static struct dma_chan *stm32_dma3_of_xlate(struct of_phandle_args *dma_spec, struct of_dma *ofdma)
> +{
> +	struct stm32_dma3_ddata *ddata = ofdma->of_dma_data;
> +	dma_cap_mask_t mask = ddata->dma_dev.cap_mask;
> +	struct stm32_dma3_dt_conf conf;
> +	struct stm32_dma3_chan *chan;
> +	struct dma_chan *c;
> +
> +	if (dma_spec->args_count < 3) {
> +		dev_err(ddata->dma_dev.dev, "Invalid args count\n");
> +		return NULL;
> +	}
> +
> +	conf.req_line = dma_spec->args[0];
> +	conf.ch_conf = dma_spec->args[1];
> +	conf.tr_conf = dma_spec->args[2];
> +
> +	if (conf.req_line >= ddata->dma_requests) {
> +		dev_err(ddata->dma_dev.dev, "Invalid request line\n");
> +		return NULL;
> +	}
> +
> +	/* Request dma channel among the generic dma controller list */
> +	c = dma_request_channel(mask, stm32_dma3_filter_fn, &conf);
> +	if (!c) {
> +		dev_err(ddata->dma_dev.dev, "No suitable channel found\n");
> +		return NULL;
> +	}
> +
> +	chan = to_stm32_dma3_chan(c);
> +	chan->dt_config = conf;
> +
> +	return c;
> +}
> +
> +static u32 stm32_dma3_check_rif(struct stm32_dma3_ddata *ddata)
> +{
> +	u32 chan_reserved, mask = 0, i, ccidcfgr, invalid_cid = 0;
> +
> +	/* Reserve Secure channels */
> +	chan_reserved = readl_relaxed(ddata->base + STM32_DMA3_SECCFGR);
> +
> +	/*
> +	 * CID filtering must be configured to ensure that the DMA3 channel will inherit the CID of
> +	 * the processor which is configuring and using the given channel.
> +	 * In case CID filtering is not configured, dma-channel-mask property can be used to
> +	 * specify available DMA channels to the kernel.
> +	 */
> +	of_property_read_u32(ddata->dma_dev.dev->of_node, "dma-channel-mask", &mask);
> +
> +	/* Reserve !CID-filtered not in dma-channel-mask, static CID != CID1, CID1 not allowed */
> +	for (i = 0; i < ddata->dma_channels; i++) {
> +		ccidcfgr = readl_relaxed(ddata->base + STM32_DMA3_CCIDCFGR(i));
> +
> +		if (!(ccidcfgr & CCIDCFGR_CFEN)) { /* !CID-filtered */
> +			invalid_cid |= BIT(i);
> +			if (!(mask & BIT(i))) /* Not in dma-channel-mask */
> +				chan_reserved |= BIT(i);
> +		} else { /* CID-filtered */
> +			if (!(ccidcfgr & CCIDCFGR_SEM_EN)) { /* Static CID mode */
> +				if (FIELD_GET(CCIDCFGR_SCID, ccidcfgr) != CCIDCFGR_CID1)
> +					chan_reserved |= BIT(i);
> +			} else { /* Semaphore mode */
> +				if (!FIELD_GET(CCIDCFGR_SEM_WLIST_CID1, ccidcfgr))
> +					chan_reserved |= BIT(i);
> +				ddata->chans[i].semaphore_mode = true;
> +			}
> +		}
> +		dev_dbg(ddata->dma_dev.dev, "chan%d: %s mode, %s\n", i,
> +			!(ccidcfgr & CCIDCFGR_CFEN) ? "!CID-filtered" :
> +			ddata->chans[i].semaphore_mode ? "Semaphore" : "Static CID",
> +			(chan_reserved & BIT(i)) ? "denied" :
> +			mask & BIT(i) ? "force allowed" : "allowed");
> +	}
> +
> +	if (invalid_cid)
> +		dev_warn(ddata->dma_dev.dev, "chan%*pbl have invalid CID configuration\n",
> +			 ddata->dma_channels, &invalid_cid);
> +
> +	return chan_reserved;
> +}
> +
> +static const struct of_device_id stm32_dma3_of_match[] = {
> +	{ .compatible = "st,stm32-dma3", },
> +	{ /* sentinel */},
> +};
> +MODULE_DEVICE_TABLE(of, stm32_dma3_of_match);
> +
> +static int stm32_dma3_probe(struct platform_device *pdev)
> +{
> +	struct device_node *np = pdev->dev.of_node;
> +	struct stm32_dma3_ddata *ddata;
> +	struct reset_control *reset;
> +	struct stm32_dma3_chan *chan;
> +	struct dma_device *dma_dev;
> +	u32 master_ports, chan_reserved, i, verr;
> +	u64 hwcfgr;
> +	int ret;
> +
> +	ddata = devm_kzalloc(&pdev->dev, sizeof(*ddata), GFP_KERNEL);
> +	if (!ddata)
> +		return -ENOMEM;
> +	platform_set_drvdata(pdev, ddata);
> +
> +	dma_dev = &ddata->dma_dev;
> +
> +	ddata->base = devm_platform_ioremap_resource(pdev, 0);
> +	if (IS_ERR(ddata->base))
> +		return PTR_ERR(ddata->base);
> +
> +	ddata->clk = devm_clk_get(&pdev->dev, NULL);
> +	if (IS_ERR(ddata->clk))
> +		return dev_err_probe(&pdev->dev, PTR_ERR(ddata->clk), "Failed to get clk\n");
> +
> +	reset = devm_reset_control_get_optional(&pdev->dev, NULL);
> +	if (IS_ERR(reset))
> +		return dev_err_probe(&pdev->dev, PTR_ERR(reset), "Failed to get reset\n");
> +
> +	ret = clk_prepare_enable(ddata->clk);
> +	if (ret)
> +		return dev_err_probe(&pdev->dev, ret, "Failed to enable clk\n");
> +
> +	reset_control_reset(reset);
> +
> +	INIT_LIST_HEAD(&dma_dev->channels);
> +
> +	dma_cap_set(DMA_SLAVE, dma_dev->cap_mask);
> +	dma_cap_set(DMA_PRIVATE, dma_dev->cap_mask);
> +	dma_dev->dev = &pdev->dev;
> +	/*
> +	 * This controller supports up to 8-byte buswidth depending on the port used and the
> +	 * channel, and can only access address at even boundaries, multiple of the buswidth.
> +	 */
> +	dma_dev->copy_align = DMAENGINE_ALIGN_8_BYTES;
> +	dma_dev->src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
> +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
> +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
> +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> +	dma_dev->dst_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
> +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
> +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
> +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> +	dma_dev->directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV) | BIT(DMA_MEM_TO_MEM);
> +
> +	dma_dev->descriptor_reuse = true;
> +	dma_dev->max_sg_burst = STM32_DMA3_MAX_SEG_SIZE;
> +	dma_dev->residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
> +	dma_dev->device_alloc_chan_resources = stm32_dma3_alloc_chan_resources;
> +	dma_dev->device_free_chan_resources = stm32_dma3_free_chan_resources;
> +	dma_dev->device_prep_slave_sg = stm32_dma3_prep_slave_sg;
> +	dma_dev->device_caps = stm32_dma3_caps;
> +	dma_dev->device_config = stm32_dma3_config;
> +	dma_dev->device_terminate_all = stm32_dma3_terminate_all;
> +	dma_dev->device_synchronize = stm32_dma3_synchronize;
> +	dma_dev->device_tx_status = dma_cookie_status;
> +	dma_dev->device_issue_pending = stm32_dma3_issue_pending;
> +
> +	/* if dma_channels is not modified, get it from hwcfgr1 */
> +	if (of_property_read_u32(np, "dma-channels", &ddata->dma_channels)) {
> +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
> +		ddata->dma_channels = FIELD_GET(G_NUM_CHANNELS, hwcfgr);
> +	}
> +
> +	/* if dma_requests is not modified, get it from hwcfgr2 */
> +	if (of_property_read_u32(np, "dma-requests", &ddata->dma_requests)) {
> +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR2);
> +		ddata->dma_requests = FIELD_GET(G_MAX_REQ_ID, hwcfgr) + 1;
> +	}
> +
> +	/* G_MASTER_PORTS, G_M0_DATA_WIDTH_ENC, G_M1_DATA_WIDTH_ENC in HWCFGR1 */
> +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
> +	master_ports = FIELD_GET(G_MASTER_PORTS, hwcfgr);
> +
> +	ddata->ports_max_dw[0] = FIELD_GET(G_M0_DATA_WIDTH_ENC, hwcfgr);
> +	if (master_ports == AXI64 || master_ports == AHB32) /* Single master port */
> +		ddata->ports_max_dw[1] = DW_INVALID;
> +	else /* Dual master ports */
> +		ddata->ports_max_dw[1] = FIELD_GET(G_M1_DATA_WIDTH_ENC, hwcfgr);
> +
> +	ddata->chans = devm_kcalloc(&pdev->dev, ddata->dma_channels, sizeof(*ddata->chans),
> +				    GFP_KERNEL);
> +	if (!ddata->chans) {
> +		ret = -ENOMEM;
> +		goto err_clk_disable;
> +	}
> +
> +	chan_reserved = stm32_dma3_check_rif(ddata);
> +
> +	if (chan_reserved == GENMASK(ddata->dma_channels - 1, 0)) {
> +		ret = -ENODEV;
> +		dev_err_probe(&pdev->dev, ret, "No channel available, abort registration\n");
> +		goto err_clk_disable;
> +	}
> +
> +	/* G_FIFO_SIZE x=0..7 in HWCFGR3 and G_FIFO_SIZE x=8..15 in HWCFGR4 */
> +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR3);
> +	hwcfgr |= ((u64)readl_relaxed(ddata->base + STM32_DMA3_HWCFGR4)) << 32;
> +
> +	for (i = 0; i < ddata->dma_channels; i++) {
> +		if (chan_reserved & BIT(i))
> +			continue;
> +
> +		chan = &ddata->chans[i];
> +		chan->id = i;
> +		chan->fifo_size = get_chan_hwcfg(i, G_FIFO_SIZE(i), hwcfgr);
> +		/* If chan->fifo_size > 0 then half of the fifo size, else no burst when no FIFO */
> +		chan->max_burst = (chan->fifo_size) ? (1 << (chan->fifo_size + 1)) / 2 : 0;
> +		chan->vchan.desc_free = stm32_dma3_chan_vdesc_free;
> +
> +		vchan_init(&chan->vchan, dma_dev);
> +	}
> +
> +	ret = dmaenginem_async_device_register(dma_dev);
> +	if (ret)
> +		goto err_clk_disable;
> +
> +	for (i = 0; i < ddata->dma_channels; i++) {
> +		if (chan_reserved & BIT(i))
> +			continue;
> +
> +		ret = platform_get_irq(pdev, i);
> +		if (ret < 0)
> +			goto err_clk_disable;
> +
> +		chan = &ddata->chans[i];
> +		chan->irq = ret;
> +
> +		ret = devm_request_irq(&pdev->dev, chan->irq, stm32_dma3_chan_irq, 0,
> +				       dev_name(chan2dev(chan)), chan);
> +		if (ret) {
> +			dev_err_probe(&pdev->dev, ret, "Failed to request channel %s IRQ\n",
> +				      dev_name(chan2dev(chan)));
> +			goto err_clk_disable;
> +		}
> +	}
> +
> +	ret = of_dma_controller_register(np, stm32_dma3_of_xlate, ddata);
> +	if (ret) {
> +		dev_err_probe(&pdev->dev, ret, "Failed to register controller\n");
> +		goto err_clk_disable;
> +	}
> +
> +	verr = readl_relaxed(ddata->base + STM32_DMA3_VERR);
> +
> +	pm_runtime_set_active(&pdev->dev);
> +	pm_runtime_enable(&pdev->dev);
> +	pm_runtime_get_noresume(&pdev->dev);
> +	pm_runtime_put(&pdev->dev);
> +
> +	dev_info(&pdev->dev, "STM32 DMA3 registered rev:%lu.%lu\n",
> +		 FIELD_GET(VERR_MAJREV, verr), FIELD_GET(VERR_MINREV, verr));
> +
> +	return 0;
> +
> +err_clk_disable:
> +	clk_disable_unprepare(ddata->clk);
> +
> +	return ret;
> +}
> +
> +static void stm32_dma3_remove(struct platform_device *pdev)
> +{
> +	pm_runtime_disable(&pdev->dev);
> +}
> +
> +static int stm32_dma3_runtime_suspend(struct device *dev)
> +{
> +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
> +
> +	clk_disable_unprepare(ddata->clk);
> +
> +	return 0;
> +}
> +
> +static int stm32_dma3_runtime_resume(struct device *dev)
> +{
> +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
> +	int ret;
> +
> +	ret = clk_prepare_enable(ddata->clk);
> +	if (ret)
> +		dev_err(dev, "Failed to enable clk: %d\n", ret);
> +
> +	return ret;
> +}
> +
> +static const struct dev_pm_ops stm32_dma3_pm_ops = {
> +	SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
> +	RUNTIME_PM_OPS(stm32_dma3_runtime_suspend, stm32_dma3_runtime_resume, NULL)
> +};
> +
> +static struct platform_driver stm32_dma3_driver = {
> +	.probe = stm32_dma3_probe,
> +	.remove_new = stm32_dma3_remove,
> +	.driver = {
> +		.name = "stm32-dma3",
> +		.of_match_table = stm32_dma3_of_match,
> +		.pm = pm_ptr(&stm32_dma3_pm_ops),
> +	},
> +};
> +
> +static int __init stm32_dma3_init(void)
> +{
> +	return platform_driver_register(&stm32_dma3_driver);
> +}
> +
> +subsys_initcall(stm32_dma3_init);
> +
> +MODULE_DESCRIPTION("STM32 DMA3 controller driver");
> +MODULE_AUTHOR("Amelie Delaunay <amelie.delaunay@foss.st.com>");
> +MODULE_LICENSE("GPL");
> -- 
> 2.25.1

-- 
~Vinod

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 05/12] dmaengine: Add STM32 DMA3 support
  2024-04-23 12:32 ` [PATCH 05/12] dmaengine: Add STM32 DMA3 support Amelie Delaunay
  2024-05-04 12:40   ` Vinod Koul
@ 2024-05-04 14:27   ` Christophe JAILLET
  2024-05-07 12:37     ` Amelie Delaunay
  2024-05-15 18:56   ` Frank Li
  2 siblings, 1 reply; 29+ messages in thread
From: Christophe JAILLET @ 2024-05-04 14:27 UTC (permalink / raw)
  To: amelie.delaunay
  Cc: alexandre.torgue, conor+dt, devicetree, dmaengine,
	krzysztof.kozlowski+dt, linux-arm-kernel, linux-hardening,
	linux-kernel, linux-stm32, mcoquelin.stm32, robh+dt, vkoul

On 23/04/2024 14:32, Amelie Delaunay wrote:
> STM32 DMA3 driver supports the 3 hardware configurations of the STM32 DMA3
> controller:
> - LPDMA (Low Power): 4 channels, no FIFO
> - GPDMA (General Purpose): 16 channels, FIFO from 8 to 32 bytes
> - HPDMA (High Performance): 16 channels, FIFO from 8 to 256 bytes
> Hardware configuration of the channels is retrieved from the hardware
> configuration registers.
> The client can specify its channel requirements through device tree.
> STM32 DMA3 channels can be individually reserved either because they are
> secure, or dedicated to another CPU.
> Indeed, channels availability depends on Resource Isolation Framework
> (RIF) configuration. RIF grants access to buses with Compartiment ID
> (CIF) filtering, secure and privilege level. It also assigns DMA channels
> to one or several processors.
> DMA channels used by Linux should be CID-filtered and statically assigned
> to CID1 or shared with other CPUs but using semaphore. In case CID
> filtering is not configured, dma-channel-mask property can be used to
> specify available DMA channels to the kernel, otherwise such channels
> will be marked as reserved and can't be used by Linux.
> 
> Signed-off-by: Amelie Delaunay <amelie.delaunay-rj0Iel/JR4NBDgjK7y7TUQ@public.gmane.org>
> ---

...

> +	pm_runtime_set_active(&pdev->dev);
> +	pm_runtime_enable(&pdev->dev);
> +	pm_runtime_get_noresume(&pdev->dev);
> +	pm_runtime_put(&pdev->dev);
> +
> +	dev_info(&pdev->dev, "STM32 DMA3 registered rev:%lu.%lu\n",
> +		 FIELD_GET(VERR_MAJREV, verr), FIELD_GET(VERR_MINREV, verr));
> +
> +	return 0;
> +
> +err_clk_disable:
> +	clk_disable_unprepare(ddata->clk);
> +
> +	return ret;
> +}
> +
> +static void stm32_dma3_remove(struct platform_device *pdev)
> +{
> +	pm_runtime_disable(&pdev->dev);

Hi,

Missing clk_disable_unprepare(ddata->clk)?

as in the error handling path of the probe just above?
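
Something like the following (untested; the interaction with runtime PM,
which also touches the clock in the suspend/resume callbacks, would need
a look) would mirror the probe error path, assuming ddata is still
reachable via drvdata:

static void stm32_dma3_remove(struct platform_device *pdev)
{
	struct stm32_dma3_ddata *ddata = platform_get_drvdata(pdev);

	pm_runtime_disable(&pdev->dev);
	clk_disable_unprepare(ddata->clk);
}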

CJ

> +}

...


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 05/12] dmaengine: Add STM32 DMA3 support
  2024-05-04 12:40   ` Vinod Koul
@ 2024-05-07 11:33     ` Amelie Delaunay
  2024-05-07 20:26       ` Frank Li
  0 siblings, 1 reply; 29+ messages in thread
From: Amelie Delaunay @ 2024-05-07 11:33 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Maxime Coquelin,
	Alexandre Torgue, dmaengine, devicetree, linux-stm32,
	linux-arm-kernel, linux-kernel, linux-hardening

Hi Vinod,

Thanks for the review.

On 5/4/24 14:40, Vinod Koul wrote:
> On 23-04-24, 14:32, Amelie Delaunay wrote:
>> STM32 DMA3 driver supports the 3 hardware configurations of the STM32 DMA3
>> controller:
>> - LPDMA (Low Power): 4 channels, no FIFO
>> - GPDMA (General Purpose): 16 channels, FIFO from 8 to 32 bytes
>> - HPDMA (High Performance): 16 channels, FIFO from 8 to 256 bytes
>> Hardware configuration of the channels is retrieved from the hardware
>> configuration registers.
>> The client can specify its channel requirements through device tree.
>> STM32 DMA3 channels can be individually reserved either because they are
>> secure, or dedicated to another CPU.
>> Indeed, channels availability depends on Resource Isolation Framework
>> (RIF) configuration. RIF grants access to buses with Compartiment ID
> 
> Compartiment? typo...?
> 

Sorry, indeed, Compartment instead.

>> (CIF) filtering, secure and privilege level. It also assigns DMA channels
>> to one or several processors.
>> DMA channels used by Linux should be CID-filtered and statically assigned
>> to CID1 or shared with other CPUs but using semaphore. In case CID
>> filtering is not configured, dma-channel-mask property can be used to
>> specify available DMA channels to the kernel, otherwise such channels
>> will be marked as reserved and can't be used by Linux.
>>
>> Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
>> ---
>>   drivers/dma/stm32/Kconfig      |   10 +
>>   drivers/dma/stm32/Makefile     |    1 +
>>   drivers/dma/stm32/stm32-dma3.c | 1431 ++++++++++++++++++++++++++++++++
>>   3 files changed, 1442 insertions(+)
>>   create mode 100644 drivers/dma/stm32/stm32-dma3.c
>>
>> diff --git a/drivers/dma/stm32/Kconfig b/drivers/dma/stm32/Kconfig
>> index b72ae1a4502f..4d8d8063133b 100644
>> --- a/drivers/dma/stm32/Kconfig
>> +++ b/drivers/dma/stm32/Kconfig
>> @@ -34,4 +34,14 @@ config STM32_MDMA
>>   	  If you have a board based on STM32 SoC with such DMA controller
>>   	  and want to use MDMA say Y here.
>>   
>> +config STM32_DMA3
>> +	tristate "STMicroelectronics STM32 DMA3 support"
>> +	select DMA_ENGINE
>> +	select DMA_VIRTUAL_CHANNELS
>> +	help
>> +	  Enable support for the on-chip DMA3 controller on STMicroelectronics
>> +	  STM32 platforms.
>> +	  If you have a board based on STM32 SoC with such DMA3 controller
>> +	  and want to use DMA3, say Y here.
>> +
>>   endif
>> diff --git a/drivers/dma/stm32/Makefile b/drivers/dma/stm32/Makefile
>> index 663a3896a881..5082db4b4c1c 100644
>> --- a/drivers/dma/stm32/Makefile
>> +++ b/drivers/dma/stm32/Makefile
>> @@ -2,3 +2,4 @@
>>   obj-$(CONFIG_STM32_DMA) += stm32-dma.o
>>   obj-$(CONFIG_STM32_DMAMUX) += stm32-dmamux.o
>>   obj-$(CONFIG_STM32_MDMA) += stm32-mdma.o
>> +obj-$(CONFIG_STM32_DMA3) += stm32-dma3.o
> 
> are there any similarities in mdma/dma and dma3..?
> can anything be reused...?
> 

DMA/MDMA were originally intended for STM32 MCUs and have been used in 
STM32MP1 MPUs.
New MPUs (STM32MP2, ...) and new STM32 MCUs (STM32H5, STM32N6, ...) use 
DMA3.
Unlike DMA/MDMA, DMA3 comes in multiple configurations (LPDMA, GPDMA, 
HPDMA), and within these global configurations there are possible 
sub-configurations (e.g. channel FIFO size). stm32-dma3 uses the 
hardware configuration registers to discover the controller and channel 
capabilities.
Reusing stm32-dma or stm32-mdma would complicate those drivers and make 
future stm32-dma3 evolutions for the next STM32 MPUs intricate and very 
difficult.
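
For reference, the capability discovery boils down to the following
(condensed from stm32_dma3_probe() in the patch below; error handling
and loop context omitted):

	/* Channel count: DT override, otherwise HWCFGR1.G_NUM_CHANNELS */
	if (of_property_read_u32(np, "dma-channels", &ddata->dma_channels)) {
		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
		ddata->dma_channels = FIELD_GET(G_NUM_CHANNELS, hwcfgr);
	}

	/* Per-channel FIFO size from HWCFGR3/HWCFGR4; max burst is half of it */
	chan->fifo_size = get_chan_hwcfg(i, G_FIFO_SIZE(i), hwcfgr);
	chan->max_burst = chan->fifo_size ? (1 << (chan->fifo_size + 1)) / 2 : 0;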

>> diff --git a/drivers/dma/stm32/stm32-dma3.c b/drivers/dma/stm32/stm32-dma3.c
>> new file mode 100644
>> index 000000000000..b5493f497d06
>> --- /dev/null
>> +++ b/drivers/dma/stm32/stm32-dma3.c
>> @@ -0,0 +1,1431 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * STM32 DMA3 controller driver
>> + *
>> + * Copyright (C) STMicroelectronics 2024
>> + * Author(s): Amelie Delaunay <amelie.delaunay@foss.st.com>
>> + */
>> +
>> +#include <linux/bitfield.h>
>> +#include <linux/clk.h>
>> +#include <linux/dma-mapping.h>
>> +#include <linux/dmaengine.h>
>> +#include <linux/dmapool.h>
>> +#include <linux/init.h>
>> +#include <linux/iopoll.h>
>> +#include <linux/list.h>
>> +#include <linux/module.h>
>> +#include <linux/of_dma.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/pm_runtime.h>
>> +#include <linux/reset.h>
>> +#include <linux/slab.h>
>> +
>> +#include "../virt-dma.h"
>> +
>> +#define STM32_DMA3_SECCFGR		0x00
>> +#define STM32_DMA3_PRIVCFGR		0x04
>> +#define STM32_DMA3_RCFGLOCKR		0x08
>> +#define STM32_DMA3_MISR			0x0C
> 
> lower hex please
> 

Ok.

>> +#define STM32_DMA3_SMISR		0x10
>> +
>> +#define STM32_DMA3_CLBAR(x)		(0x50 + 0x80 * (x))
>> +#define STM32_DMA3_CCIDCFGR(x)		(0x54 + 0x80 * (x))
>> +#define STM32_DMA3_CSEMCR(x)		(0x58 + 0x80 * (x))
>> +#define STM32_DMA3_CFCR(x)		(0x5C + 0x80 * (x))
>> +#define STM32_DMA3_CSR(x)		(0x60 + 0x80 * (x))
>> +#define STM32_DMA3_CCR(x)		(0x64 + 0x80 * (x))
>> +#define STM32_DMA3_CTR1(x)		(0x90 + 0x80 * (x))
>> +#define STM32_DMA3_CTR2(x)		(0x94 + 0x80 * (x))
>> +#define STM32_DMA3_CBR1(x)		(0x98 + 0x80 * (x))
>> +#define STM32_DMA3_CSAR(x)		(0x9C + 0x80 * (x))
>> +#define STM32_DMA3_CDAR(x)		(0xA0 + 0x80 * (x))
>> +#define STM32_DMA3_CLLR(x)		(0xCC + 0x80 * (x))
>> +
>> +#define STM32_DMA3_HWCFGR13		0xFC0 /* G_PER_CTRL(X) x=8..15 */
>> +#define STM32_DMA3_HWCFGR12		0xFC4 /* G_PER_CTRL(X) x=0..7 */
>> +#define STM32_DMA3_HWCFGR4		0xFE4 /* G_FIFO_SIZE(X) x=8..15 */
>> +#define STM32_DMA3_HWCFGR3		0xFE8 /* G_FIFO_SIZE(X) x=0..7 */
>> +#define STM32_DMA3_HWCFGR2		0xFEC /* G_MAX_REQ_ID */
>> +#define STM32_DMA3_HWCFGR1		0xFF0 /* G_MASTER_PORTS, G_NUM_CHANNELS, G_Mx_DATA_WIDTH */
>> +#define STM32_DMA3_VERR			0xFF4
> 
> here as well
> 

Ok.

>> +
>> +/* SECCFGR DMA secure configuration register */
>> +#define SECCFGR_SEC(x)			BIT(x)
>> +
>> +/* MISR DMA non-secure/secure masked interrupt status register */
>> +#define MISR_MIS(x)			BIT(x)
>> +
>> +/* CxLBAR DMA channel x linked_list base address register */
>> +#define CLBAR_LBA			GENMASK(31, 16)
>> +
>> +/* CxCIDCFGR DMA channel x CID register */
>> +#define CCIDCFGR_CFEN			BIT(0)
>> +#define CCIDCFGR_SEM_EN			BIT(1)
>> +#define CCIDCFGR_SCID			GENMASK(5, 4)
>> +#define CCIDCFGR_SEM_WLIST_CID0		BIT(16)
>> +#define CCIDCFGR_SEM_WLIST_CID1		BIT(17)
>> +#define CCIDCFGR_SEM_WLIST_CID2		BIT(18)
>> +
>> +enum ccidcfgr_cid {
>> +	CCIDCFGR_CID0,
>> +	CCIDCFGR_CID1,
>> +	CCIDCFGR_CID2,
>> +};
>> +
>> +/* CxSEMCR DMA channel x semaphore control register */
>> +#define CSEMCR_SEM_MUTEX		BIT(0)
>> +#define CSEMCR_SEM_CCID			GENMASK(5, 4)
>> +
>> +/* CxFCR DMA channel x flag clear register */
>> +#define CFCR_TCF			BIT(8)
>> +#define CFCR_HTF			BIT(9)
>> +#define CFCR_DTEF			BIT(10)
>> +#define CFCR_ULEF			BIT(11)
>> +#define CFCR_USEF			BIT(12)
>> +#define CFCR_SUSPF			BIT(13)
>> +
>> +/* CxSR DMA channel x status register */
>> +#define CSR_IDLEF			BIT(0)
>> +#define CSR_TCF				BIT(8)
>> +#define CSR_HTF				BIT(9)
>> +#define CSR_DTEF			BIT(10)
>> +#define CSR_ULEF			BIT(11)
>> +#define CSR_USEF			BIT(12)
>> +#define CSR_SUSPF			BIT(13)
>> +#define CSR_ALL_F			GENMASK(13, 8)
>> +#define CSR_FIFOL			GENMASK(24, 16)
>> +
>> +/* CxCR DMA channel x control register */
>> +#define CCR_EN				BIT(0)
>> +#define CCR_RESET			BIT(1)
>> +#define CCR_SUSP			BIT(2)
>> +#define CCR_TCIE			BIT(8)
>> +#define CCR_HTIE			BIT(9)
>> +#define CCR_DTEIE			BIT(10)
>> +#define CCR_ULEIE			BIT(11)
>> +#define CCR_USEIE			BIT(12)
>> +#define CCR_SUSPIE			BIT(13)
>> +#define CCR_ALLIE			GENMASK(13, 8)
>> +#define CCR_LSM				BIT(16)
>> +#define CCR_LAP				BIT(17)
>> +#define CCR_PRIO			GENMASK(23, 22)
>> +
>> +enum ccr_prio {
>> +	CCR_PRIO_LOW,
>> +	CCR_PRIO_MID,
>> +	CCR_PRIO_HIGH,
>> +	CCR_PRIO_VERY_HIGH,
>> +};
>> +
>> +/* CxTR1 DMA channel x transfer register 1 */
>> +#define CTR1_SINC			BIT(3)
>> +#define CTR1_SBL_1			GENMASK(9, 4)
>> +#define CTR1_DINC			BIT(19)
>> +#define CTR1_DBL_1			GENMASK(25, 20)
>> +#define CTR1_SDW_LOG2			GENMASK(1, 0)
>> +#define CTR1_PAM			GENMASK(12, 11)
>> +#define CTR1_SAP			BIT(14)
>> +#define CTR1_DDW_LOG2			GENMASK(17, 16)
>> +#define CTR1_DAP			BIT(30)
>> +
>> +enum ctr1_dw {
>> +	CTR1_DW_BYTE,
>> +	CTR1_DW_HWORD,
>> +	CTR1_DW_WORD,
>> +	CTR1_DW_DWORD, /* Depends on HWCFGR1.G_M0_DATA_WIDTH_ENC and .G_M1_DATA_WIDTH_ENC */
>> +};
>> +
>> +enum ctr1_pam {
>> +	CTR1_PAM_0S_LT, /* if DDW > SDW, padded with 0s else left-truncated */
>> +	CTR1_PAM_SE_RT, /* if DDW > SDW, sign extended else right-truncated */
>> +	CTR1_PAM_PACK_UNPACK, /* FIFO queued */
>> +};
>> +
>> +/* CxTR2 DMA channel x transfer register 2 */
>> +#define CTR2_REQSEL			GENMASK(7, 0)
>> +#define CTR2_SWREQ			BIT(9)
>> +#define CTR2_DREQ			BIT(10)
>> +#define CTR2_BREQ			BIT(11)
>> +#define CTR2_PFREQ			BIT(12)
>> +#define CTR2_TCEM			GENMASK(31, 30)
>> +
>> +enum ctr2_tcem {
>> +	CTR2_TCEM_BLOCK,
>> +	CTR2_TCEM_REPEAT_BLOCK,
>> +	CTR2_TCEM_LLI,
>> +	CTR2_TCEM_CHANNEL,
>> +};
>> +
>> +/* CxBR1 DMA channel x block register 1 */
>> +#define CBR1_BNDT			GENMASK(15, 0)
>> +
>> +/* CxLLR DMA channel x linked-list address register */
>> +#define CLLR_LA				GENMASK(15, 2)
>> +#define CLLR_ULL			BIT(16)
>> +#define CLLR_UDA			BIT(27)
>> +#define CLLR_USA			BIT(28)
>> +#define CLLR_UB1			BIT(29)
>> +#define CLLR_UT2			BIT(30)
>> +#define CLLR_UT1			BIT(31)
>> +
>> +/* HWCFGR13 DMA hardware configuration register 13 x=8..15 */
>> +/* HWCFGR12 DMA hardware configuration register 12 x=0..7 */
>> +#define G_PER_CTRL(x)			(ULL(0x1) << (4 * (x)))
>> +
>> +/* HWCFGR4 DMA hardware configuration register 4 x=8..15 */
>> +/* HWCFGR3 DMA hardware configuration register 3 x=0..7 */
>> +#define G_FIFO_SIZE(x)			(ULL(0x7) << (4 * (x)))
>> +
>> +#define get_chan_hwcfg(x, mask, reg)	(((reg) & (mask)) >> (4 * (x)))
>> +
>> +/* HWCFGR2 DMA hardware configuration register 2 */
>> +#define G_MAX_REQ_ID			GENMASK(7, 0)
>> +
>> +/* HWCFGR1 DMA hardware configuration register 1 */
>> +#define G_MASTER_PORTS			GENMASK(2, 0)
>> +#define G_NUM_CHANNELS			GENMASK(12, 8)
>> +#define G_M0_DATA_WIDTH_ENC		GENMASK(25, 24)
>> +#define G_M1_DATA_WIDTH_ENC		GENMASK(29, 28)
>> +
>> +enum stm32_dma3_master_ports {
>> +	AXI64, /* 1x AXI: 64-bit port 0 */
>> +	AHB32, /* 1x AHB: 32-bit port 0 */
>> +	AHB32_AHB32, /* 2x AHB: 32-bit port 0 and 32-bit port 1 */
>> +	AXI64_AHB32, /* 1x AXI 64-bit port 0 and 1x AHB 32-bit port 1 */
>> +	AXI64_AXI64, /* 2x AXI: 64-bit port 0 and 64-bit port 1 */
>> +	AXI128_AHB32, /* 1x AXI 128-bit port 0 and 1x AHB 32-bit port 1 */
>> +};
>> +
>> +enum stm32_dma3_port_data_width {
>> +	DW_32, /* 32-bit, for AHB */
>> +	DW_64, /* 64-bit, for AXI */
>> +	DW_128, /* 128-bit, for AXI */
>> +	DW_INVALID,
>> +};
>> +
>> +/* VERR DMA version register */
>> +#define VERR_MINREV			GENMASK(3, 0)
>> +#define VERR_MAJREV			GENMASK(7, 4)
>> +
>> +/* Device tree */
>> +/* struct stm32_dma3_dt_conf */
>> +/* .ch_conf */
>> +#define STM32_DMA3_DT_PRIO		GENMASK(1, 0) /* CCR_PRIO */
>> +#define STM32_DMA3_DT_FIFO		GENMASK(7, 4)
>> +/* .tr_conf */
>> +#define STM32_DMA3_DT_SINC		BIT(0) /* CTR1_SINC */
>> +#define STM32_DMA3_DT_SAP		BIT(1) /* CTR1_SAP */
>> +#define STM32_DMA3_DT_DINC		BIT(4) /* CTR1_DINC */
>> +#define STM32_DMA3_DT_DAP		BIT(5) /* CTR1_DAP */
>> +#define STM32_DMA3_DT_BREQ		BIT(8) /* CTR2_BREQ */
>> +#define STM32_DMA3_DT_PFREQ		BIT(9) /* CTR2_PFREQ */
>> +#define STM32_DMA3_DT_TCEM		GENMASK(13, 12) /* CTR2_TCEM */
>> +
>> +#define STM32_DMA3_MAX_BLOCK_SIZE	ALIGN_DOWN(CBR1_BNDT, 64)
>> +#define port_is_ahb(maxdw)		({ typeof(maxdw) (_maxdw) = (maxdw); \
>> +					   ((_maxdw) != DW_INVALID) && ((_maxdw) == DW_32); })
>> +#define port_is_axi(maxdw)		({ typeof(maxdw) (_maxdw) = (maxdw); \
>> +					   ((_maxdw) != DW_INVALID) && ((_maxdw) != DW_32); })
>> +#define get_chan_max_dw(maxdw, maxburst)((port_is_ahb(maxdw) ||			     \
>> +					  (maxburst) < DMA_SLAVE_BUSWIDTH_8_BYTES) ? \
>> +					 DMA_SLAVE_BUSWIDTH_4_BYTES : DMA_SLAVE_BUSWIDTH_8_BYTES)
>> +
>> +/* Static linked-list data structure (depends on update bits UT1/UT2/UB1/USA/UDA/ULL) */
>> +struct stm32_dma3_hwdesc {
>> +	u32 ctr1;
>> +	u32 ctr2;
>> +	u32 cbr1;
>> +	u32 csar;
>> +	u32 cdar;
>> +	u32 cllr;
>> +} __aligned(32);
>> +
>> +/*
>> + * CLLR_LA / sizeof(struct stm32_dma3_hwdesc) represents the number of hdwdesc that can be addressed
>> + * by the pointer to the next linked-list data structure. The __aligned forces the 32-byte
>> + * alignment. So use hardcoded 32. Multiplied by the max block size of each item, it represents
>> + * the sg size limitation.
>> + */
>> +#define STM32_DMA3_MAX_SEG_SIZE		((CLLR_LA / 32) * STM32_DMA3_MAX_BLOCK_SIZE)
>> +
>> +/*
>> + * Linked-list items
>> + */
>> +struct stm32_dma3_lli {
>> +	struct stm32_dma3_hwdesc *hwdesc;
>> +	dma_addr_t hwdesc_addr;
>> +};
>> +
>> +struct stm32_dma3_swdesc {
>> +	struct virt_dma_desc vdesc;
>> +	u32 ccr;
>> +	bool cyclic;
>> +	u32 lli_size;
>> +	struct stm32_dma3_lli lli[] __counted_by(lli_size);
>> +};
>> +
>> +struct stm32_dma3_dt_conf {
>> +	u32 ch_id;
>> +	u32 req_line;
>> +	u32 ch_conf;
>> +	u32 tr_conf;
>> +};
>> +
>> +struct stm32_dma3_chan {
>> +	struct virt_dma_chan vchan;
>> +	u32 id;
>> +	int irq;
>> +	u32 fifo_size;
>> +	u32 max_burst;
>> +	bool semaphore_mode;
>> +	struct stm32_dma3_dt_conf dt_config;
>> +	struct dma_slave_config dma_config;
>> +	struct dma_pool *lli_pool;
>> +	struct stm32_dma3_swdesc *swdesc;
>> +	enum ctr2_tcem tcem;
>> +	u32 dma_status;
>> +};
>> +
>> +struct stm32_dma3_ddata {
>> +	struct dma_device dma_dev;
>> +	void __iomem *base;
>> +	struct clk *clk;
>> +	struct stm32_dma3_chan *chans;
>> +	u32 dma_channels;
>> +	u32 dma_requests;
>> +	enum stm32_dma3_port_data_width ports_max_dw[2];
>> +};
>> +
>> +static inline struct stm32_dma3_ddata *to_stm32_dma3_ddata(struct stm32_dma3_chan *chan)
>> +{
>> +	return container_of(chan->vchan.chan.device, struct stm32_dma3_ddata, dma_dev);
>> +}
>> +
>> +static inline struct stm32_dma3_chan *to_stm32_dma3_chan(struct dma_chan *c)
>> +{
>> +	return container_of(c, struct stm32_dma3_chan, vchan.chan);
>> +}
>> +
>> +static inline struct stm32_dma3_swdesc *to_stm32_dma3_swdesc(struct virt_dma_desc *vdesc)
>> +{
>> +	return container_of(vdesc, struct stm32_dma3_swdesc, vdesc);
>> +}
>> +
>> +static struct device *chan2dev(struct stm32_dma3_chan *chan)
>> +{
>> +	return &chan->vchan.chan.dev->device;
>> +}
>> +
>> +static void stm32_dma3_chan_dump_reg(struct stm32_dma3_chan *chan)
>> +{
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	struct device *dev = chan2dev(chan);
>> +	u32 id = chan->id, offset;
>> +
>> +	offset = STM32_DMA3_SECCFGR;
>> +	dev_dbg(dev, "SECCFGR(0x%03x): %08x\n", offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_PRIVCFGR;
>> +	dev_dbg(dev, "PRIVCFGR(0x%03x): %08x\n", offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CCIDCFGR(id);
>> +	dev_dbg(dev, "C%dCIDCFGR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CSEMCR(id);
>> +	dev_dbg(dev, "C%dSEMCR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CSR(id);
>> +	dev_dbg(dev, "C%dSR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CCR(id);
>> +	dev_dbg(dev, "C%dCR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CTR1(id);
>> +	dev_dbg(dev, "C%dTR1(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CTR2(id);
>> +	dev_dbg(dev, "C%dTR2(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CBR1(id);
>> +	dev_dbg(dev, "C%dBR1(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CSAR(id);
>> +	dev_dbg(dev, "C%dSAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CDAR(id);
>> +	dev_dbg(dev, "C%dDAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CLLR(id);
>> +	dev_dbg(dev, "C%dLLR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CLBAR(id);
>> +	dev_dbg(dev, "C%dLBAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +}
>> +
>> +static void stm32_dma3_chan_dump_hwdesc(struct stm32_dma3_chan *chan,
>> +					struct stm32_dma3_swdesc *swdesc)
>> +{
>> +	struct stm32_dma3_hwdesc *hwdesc;
>> +	int i;
>> +
>> +	for (i = 0; i < swdesc->lli_size; i++) {
>> +		hwdesc = swdesc->lli[i].hwdesc;
>> +		if (i)
>> +			dev_dbg(chan2dev(chan), "V\n");
>> +		dev_dbg(chan2dev(chan), "[%d]@%pad\n", i, &swdesc->lli[i].hwdesc_addr);
>> +		dev_dbg(chan2dev(chan), "| C%dTR1: %08x\n", chan->id, hwdesc->ctr1);
>> +		dev_dbg(chan2dev(chan), "| C%dTR2: %08x\n", chan->id, hwdesc->ctr2);
>> +		dev_dbg(chan2dev(chan), "| C%dBR1: %08x\n", chan->id, hwdesc->cbr1);
>> +		dev_dbg(chan2dev(chan), "| C%dSAR: %08x\n", chan->id, hwdesc->csar);
>> +		dev_dbg(chan2dev(chan), "| C%dDAR: %08x\n", chan->id, hwdesc->cdar);
>> +		dev_dbg(chan2dev(chan), "| C%dLLR: %08x\n", chan->id, hwdesc->cllr);
>> +	}
>> +
>> +	if (swdesc->cyclic) {
>> +		dev_dbg(chan2dev(chan), "|\n");
>> +		dev_dbg(chan2dev(chan), "-->[0]@%pad\n", &swdesc->lli[0].hwdesc_addr);
>> +	} else {
>> +		dev_dbg(chan2dev(chan), "X\n");
>> +	}
>> +}
>> +
>> +static struct stm32_dma3_swdesc *stm32_dma3_chan_desc_alloc(struct stm32_dma3_chan *chan, u32 count)
>> +{
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	struct stm32_dma3_swdesc *swdesc;
>> +	int i;
>> +
>> +	/*
>> +	 * If the memory to be allocated for the number of hwdesc (6 u32 members but 32-bytes
>> +	 * aligned) is greater than the maximum address of CLLR_LA, then the last items can't be
>> +	 * addressed, so abort the allocation.
>> +	 */
>> +	if ((count * 32) > CLLR_LA) {
>> +		dev_err(chan2dev(chan), "Transfer is too big (> %luB)\n", STM32_DMA3_MAX_SEG_SIZE);
>> +		return NULL;
>> +	}
>> +
>> +	swdesc = kzalloc(struct_size(swdesc, lli, count), GFP_NOWAIT);
>> +	if (!swdesc)
>> +		return NULL;
>> +
>> +	for (i = 0; i < count; i++) {
>> +		swdesc->lli[i].hwdesc = dma_pool_zalloc(chan->lli_pool, GFP_NOWAIT,
>> +							&swdesc->lli[i].hwdesc_addr);
>> +		if (!swdesc->lli[i].hwdesc)
>> +			goto err_pool_free;
>> +	}
>> +	swdesc->lli_size = count;
>> +	swdesc->ccr = 0;
>> +
>> +	/* Set LL base address */
>> +	writel_relaxed(swdesc->lli[0].hwdesc_addr & CLBAR_LBA,
>> +		       ddata->base + STM32_DMA3_CLBAR(chan->id));
>> +
>> +	/* Set LL allocated port */
>> +	swdesc->ccr &= ~CCR_LAP;
>> +
>> +	return swdesc;
>> +
>> +err_pool_free:
>> +	dev_err(chan2dev(chan), "Failed to alloc descriptors\n");
>> +	while (--i >= 0)
>> +		dma_pool_free(chan->lli_pool, swdesc->lli[i].hwdesc, swdesc->lli[i].hwdesc_addr);
>> +	kfree(swdesc);
>> +
>> +	return NULL;
>> +}
>> +
>> +static void stm32_dma3_chan_desc_free(struct stm32_dma3_chan *chan,
>> +				      struct stm32_dma3_swdesc *swdesc)
>> +{
>> +	int i;
>> +
>> +	for (i = 0; i < swdesc->lli_size; i++)
>> +		dma_pool_free(chan->lli_pool, swdesc->lli[i].hwdesc, swdesc->lli[i].hwdesc_addr);
>> +
>> +	kfree(swdesc);
>> +}
>> +
>> +static void stm32_dma3_chan_vdesc_free(struct virt_dma_desc *vdesc)
>> +{
>> +	struct stm32_dma3_swdesc *swdesc = to_stm32_dma3_swdesc(vdesc);
>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(vdesc->tx.chan);
>> +
>> +	stm32_dma3_chan_desc_free(chan, swdesc);
>> +}
>> +
>> +static void stm32_dma3_check_user_setting(struct stm32_dma3_chan *chan)
>> +{
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	struct device *dev = chan2dev(chan);
>> +	u32 ctr1 = readl_relaxed(ddata->base + STM32_DMA3_CTR1(chan->id));
>> +	u32 cbr1 = readl_relaxed(ddata->base + STM32_DMA3_CBR1(chan->id));
>> +	u32 csar = readl_relaxed(ddata->base + STM32_DMA3_CSAR(chan->id));
>> +	u32 cdar = readl_relaxed(ddata->base + STM32_DMA3_CDAR(chan->id));
>> +	u32 cllr = readl_relaxed(ddata->base + STM32_DMA3_CLLR(chan->id));
>> +	u32 bndt = FIELD_GET(CBR1_BNDT, cbr1);
>> +	u32 sdw = 1 << FIELD_GET(CTR1_SDW_LOG2, ctr1);
>> +	u32 ddw = 1 << FIELD_GET(CTR1_DDW_LOG2, ctr1);
>> +	u32 sap = FIELD_GET(CTR1_SAP, ctr1);
>> +	u32 dap = FIELD_GET(CTR1_DAP, ctr1);
>> +
>> +	if (!bndt && !FIELD_GET(CLLR_UB1, cllr))
>> +		dev_err(dev, "null source block size and no update of this value\n");
>> +	if (bndt % sdw)
>> +		dev_err(dev, "source block size not multiple of src data width\n");
>> +	if (FIELD_GET(CTR1_PAM, ctr1) == CTR1_PAM_PACK_UNPACK && bndt % ddw)
>> +		dev_err(dev, "(un)packing mode w/ src block size not multiple of dst data width\n");
>> +	if (csar % sdw)
>> +		dev_err(dev, "unaligned source address not multiple of src data width\n");
>> +	if (cdar % ddw)
>> +		dev_err(dev, "unaligned destination address not multiple of dst data width\n");
>> +	if (sdw == DMA_SLAVE_BUSWIDTH_8_BYTES && port_is_ahb(ddata->ports_max_dw[sap]))
>> +		dev_err(dev, "double-word source data width not supported on port %u\n", sap);
>> +	if (ddw == DMA_SLAVE_BUSWIDTH_8_BYTES && port_is_ahb(ddata->ports_max_dw[dap]))
>> +		dev_err(dev, "double-word destination data width not supported on port %u\n", dap);
> 
> NO error/abort if this is wrong...?
> 

A user setting error triggers an interrupt caught by the 
stm32_dma3_chan_irq() interrupt handler.
Indeed, a user setting error can occur when the channel is enabled or 
when the DMA3 registers are updated with each linked-list item.
In the interrupt handler, when USEF (User Setting Error Flag) is set, 
this function (stm32_dma3_check_user_setting) helps the user understand 
what went wrong. The hardware automatically disables the channel to 
prevent execution of the wrongly programmed transfer, and the driver 
resets the channel and sets chan->dma_status = DMA_ERROR. 
dmaengine_tx_status() will then return DMA_ERROR.
So, from the user's point of view, the transfer will never complete, and 
the channel is ready to be reprogrammed.
Note that the _prep_ functions check everything needed to avoid a user 
setting error. If a user setting error still occurs, it is rather due to 
a corrupted linked-list item (which should fortunately never happen).
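
For completeness, on the client side this would typically be observed
through dmaengine_tx_status(); a minimal sketch (not part of this patch):

	struct dma_tx_state state;
	enum dma_status status;

	status = dmaengine_tx_status(chan, cookie, &state);
	if (status == DMA_ERROR) {
		/*
		 * The transfer was aborted: the channel has already been
		 * reset by the driver, so the client can re-prepare and
		 * re-submit a new descriptor (or release the channel).
		 */
	}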

>> +}
>> +
>> +static void stm32_dma3_chan_prep_hwdesc(struct stm32_dma3_chan *chan,
>> +					struct stm32_dma3_swdesc *swdesc,
>> +					u32 curr, dma_addr_t src, dma_addr_t dst, u32 len,
>> +					u32 ctr1, u32 ctr2, bool is_last, bool is_cyclic)
>> +{
>> +	struct stm32_dma3_hwdesc *hwdesc;
>> +	dma_addr_t next_lli;
>> +	u32 next = curr + 1;
>> +
>> +	hwdesc = swdesc->lli[curr].hwdesc;
>> +	hwdesc->ctr1 = ctr1;
>> +	hwdesc->ctr2 = ctr2;
>> +	hwdesc->cbr1 = FIELD_PREP(CBR1_BNDT, len);
>> +	hwdesc->csar = src;
>> +	hwdesc->cdar = dst;
>> +
>> +	if (is_last) {
>> +		if (is_cyclic)
>> +			next_lli = swdesc->lli[0].hwdesc_addr;
>> +		else
>> +			next_lli = 0;
>> +	} else {
>> +		next_lli = swdesc->lli[next].hwdesc_addr;
>> +	}
>> +
>> +	hwdesc->cllr = 0;
>> +	if (next_lli) {
>> +		hwdesc->cllr |= CLLR_UT1 | CLLR_UT2 | CLLR_UB1;
>> +		hwdesc->cllr |= CLLR_USA | CLLR_UDA | CLLR_ULL;
>> +		hwdesc->cllr |= (next_lli & CLLR_LA);
>> +	}
>> +}
>> +
>> +static enum dma_slave_buswidth stm32_dma3_get_max_dw(u32 chan_max_burst,
>> +						     enum stm32_dma3_port_data_width port_max_dw,
>> +						     u32 len, dma_addr_t addr)
>> +{
>> +	enum dma_slave_buswidth max_dw = get_chan_max_dw(port_max_dw, chan_max_burst);
>> +
>> +	/* len and addr must be a multiple of dw */
>> +	return 1 << __ffs(len | addr | max_dw);
>> +}
>> +
>> +static u32 stm32_dma3_get_max_burst(u32 len, enum dma_slave_buswidth dw, u32 chan_max_burst)
>> +{
>> +	u32 max_burst = chan_max_burst ? chan_max_burst / dw : 1;
>> +
>> +	/* len is a multiple of dw, so if len is < chan_max_burst, shorten burst */
>> +	if (len < chan_max_burst)
>> +		max_burst = len / dw;
>> +
>> +	/*
>> +	 * HW doesn't modify the burst if burst size <= half of the fifo size.
>> +	 * If len is not a multiple of burst size, last burst is shortened by HW.
>> +	 */
>> +	return max_burst;
>> +}
>> +
>> +static int stm32_dma3_chan_prep_hw(struct stm32_dma3_chan *chan, enum dma_transfer_direction dir,
>> +				   u32 *ccr, u32 *ctr1, u32 *ctr2,
>> +				   dma_addr_t src_addr, dma_addr_t dst_addr, u32 len)
>> +{
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	struct dma_device dma_device = ddata->dma_dev;
>> +	u32 sdw, ddw, sbl_max, dbl_max, tcem;
>> +	u32 _ctr1 = 0, _ctr2 = 0;
>> +	u32 ch_conf = chan->dt_config.ch_conf;
>> +	u32 tr_conf = chan->dt_config.tr_conf;
>> +	u32 sap = FIELD_GET(STM32_DMA3_DT_SAP, tr_conf), sap_max_dw;
>> +	u32 dap = FIELD_GET(STM32_DMA3_DT_DAP, tr_conf), dap_max_dw;
>> +
>> +	dev_dbg(chan2dev(chan), "%s from %pad to %pad\n",
>> +		dmaengine_get_direction_text(dir), &src_addr, &dst_addr);
>> +
>> +	sdw = chan->dma_config.src_addr_width ? : get_chan_max_dw(sap, chan->max_burst);
>> +	ddw = chan->dma_config.dst_addr_width ? : get_chan_max_dw(dap, chan->max_burst);
>> +	sbl_max = chan->dma_config.src_maxburst ? : 1;
>> +	dbl_max = chan->dma_config.dst_maxburst ? : 1;
>> +
>> +	/* Following conditions would raise User Setting Error interrupt */
>> +	if (!(dma_device.src_addr_widths & BIT(sdw)) || !(dma_device.dst_addr_widths & BIT(ddw))) {
>> +		dev_err(chan2dev(chan), "Bus width (src=%u, dst=%u) not supported\n", sdw, ddw);
>> +		return -EINVAL;
>> +	}
>> +
>> +	if (ddata->ports_max_dw[1] == DW_INVALID && (sap || dap)) {
>> +		dev_err(chan2dev(chan), "Only one master port, port 1 is not supported\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	sap_max_dw = ddata->ports_max_dw[sap];
>> +	dap_max_dw = ddata->ports_max_dw[dap];
>> +	if ((port_is_ahb(sap_max_dw) && sdw == DMA_SLAVE_BUSWIDTH_8_BYTES) ||
>> +	    (port_is_ahb(dap_max_dw) && ddw == DMA_SLAVE_BUSWIDTH_8_BYTES)) {
>> +		dev_err(chan2dev(chan),
>> +			"8 bytes buswidth (src=%u, dst=%u) not supported on port (sap=%u, dap=%u\n",
>> +			sdw, ddw, sap, dap);
>> +		return -EINVAL;
>> +	}
>> +
>> +	if (FIELD_GET(STM32_DMA3_DT_SINC, tr_conf))
>> +		_ctr1 |= CTR1_SINC;
>> +	if (sap)
>> +		_ctr1 |= CTR1_SAP;
>> +	if (FIELD_GET(STM32_DMA3_DT_DINC, tr_conf))
>> +		_ctr1 |= CTR1_DINC;
>> +	if (dap)
>> +		_ctr1 |= CTR1_DAP;
>> +
>> +	_ctr2 |= FIELD_PREP(CTR2_REQSEL, chan->dt_config.req_line) & ~CTR2_SWREQ;
>> +	if (FIELD_GET(STM32_DMA3_DT_BREQ, tr_conf))
>> +		_ctr2 |= CTR2_BREQ;
>> +	if (dir == DMA_DEV_TO_MEM && FIELD_GET(STM32_DMA3_DT_PFREQ, tr_conf))
>> +		_ctr2 |= CTR2_PFREQ;
>> +	tcem = FIELD_GET(STM32_DMA3_DT_TCEM, tr_conf);
>> +	_ctr2 |= FIELD_PREP(CTR2_TCEM, tcem);
>> +
>> +	/* Store TCEM to know on which event TC flag occurred */
>> +	chan->tcem = tcem;
>> +	/* Store direction for residue computation */
>> +	chan->dma_config.direction = dir;
>> +
>> +	switch (dir) {
>> +	case DMA_MEM_TO_DEV:
>> +		/* Set destination (device) data width and burst */
>> +		ddw = min_t(u32, ddw, stm32_dma3_get_max_dw(chan->max_burst, dap_max_dw,
>> +							    len, dst_addr));
>> +		dbl_max = min_t(u32, dbl_max, stm32_dma3_get_max_burst(len, ddw, chan->max_burst));
>> +
>> +		/* Set source (memory) data width and burst */
>> +		sdw = stm32_dma3_get_max_dw(chan->max_burst, sap_max_dw, len, src_addr);
>> +		sbl_max = stm32_dma3_get_max_burst(len, sdw, chan->max_burst);
>> +
>> +		_ctr1 |= FIELD_PREP(CTR1_SDW_LOG2, ilog2(sdw));
>> +		_ctr1 |= FIELD_PREP(CTR1_SBL_1, sbl_max - 1);
>> +		_ctr1 |= FIELD_PREP(CTR1_DDW_LOG2, ilog2(ddw));
>> +		_ctr1 |= FIELD_PREP(CTR1_DBL_1, dbl_max - 1);
>> +
>> +		if (ddw != sdw) {
>> +			_ctr1 |= FIELD_PREP(CTR1_PAM, CTR1_PAM_PACK_UNPACK);
>> +			/* Should never reach this case as ddw is clamped down */
>> +			if (len & (ddw - 1)) {
>> +				dev_err(chan2dev(chan),
>> +					"Packing mode is enabled and len is not multiple of ddw");
>> +				return -EINVAL;
>> +			}
>> +		}
>> +
>> +		/* dst = dev */
>> +		_ctr2 |= CTR2_DREQ;
>> +
>> +		break;
>> +
>> +	case DMA_DEV_TO_MEM:
>> +		/* Set source (device) data width and burst */
>> +		sdw = min_t(u32, sdw, stm32_dma3_get_max_dw(chan->max_burst, sap_max_dw,
>> +							    len, src_addr));
>> +		sbl_max = min_t(u32, sbl_max, stm32_dma3_get_max_burst(len, sdw, chan->max_burst));
>> +
>> +		/* Set destination (memory) data width and burst */
>> +		ddw = stm32_dma3_get_max_dw(chan->max_burst, dap_max_dw, len, dst_addr);
>> +		dbl_max = stm32_dma3_get_max_burst(len, ddw, chan->max_burst);
>> +
>> +		_ctr1 |= FIELD_PREP(CTR1_SDW_LOG2, ilog2(sdw));
>> +		_ctr1 |= FIELD_PREP(CTR1_SBL_1, sbl_max - 1);
>> +		_ctr1 |= FIELD_PREP(CTR1_DDW_LOG2, ilog2(ddw));
>> +		_ctr1 |= FIELD_PREP(CTR1_DBL_1, dbl_max - 1);
>> +
>> +		if (ddw != sdw) {
>> +			_ctr1 |= FIELD_PREP(CTR1_PAM, CTR1_PAM_PACK_UNPACK);
>> +			/* Should never reach this case as ddw is clamped down */
>> +			if (len & (ddw - 1)) {
>> +				dev_err(chan2dev(chan),
>> +					"Packing mode is enabled and len is not multiple of ddw\n");
>> +				return -EINVAL;
>> +			}
>> +		}
>> +
>> +		/* dst = mem */
>> +		_ctr2 &= ~CTR2_DREQ;
>> +
>> +		break;
>> +
>> +	default:
>> +		dev_err(chan2dev(chan), "Direction %s not supported\n",
>> +			dmaengine_get_direction_text(dir));
>> +		return -EINVAL;
>> +	}
>> +
>> +	*ccr |= FIELD_PREP(CCR_PRIO, FIELD_GET(STM32_DMA3_DT_PRIO, ch_conf));
>> +	*ctr1 = _ctr1;
>> +	*ctr2 = _ctr2;
>> +
>> +	dev_dbg(chan2dev(chan), "%s: sdw=%u bytes sbl=%u beats ddw=%u bytes dbl=%u beats\n",
>> +		__func__, sdw, sbl_max, ddw, dbl_max);
>> +
>> +	return 0;
>> +}
>> +
>> +static void stm32_dma3_chan_start(struct stm32_dma3_chan *chan)
>> +{
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	struct virt_dma_desc *vdesc;
>> +	struct stm32_dma3_hwdesc *hwdesc;
>> +	u32 id = chan->id;
>> +	u32 csr, ccr;
>> +
>> +	vdesc = vchan_next_desc(&chan->vchan);
>> +	if (!vdesc) {
>> +		chan->swdesc = NULL;
>> +		return;
>> +	}
>> +	list_del(&vdesc->node);
>> +
>> +	chan->swdesc = to_stm32_dma3_swdesc(vdesc);
>> +	hwdesc = chan->swdesc->lli[0].hwdesc;
>> +
>> +	stm32_dma3_chan_dump_hwdesc(chan, chan->swdesc);
>> +
>> +	writel_relaxed(chan->swdesc->ccr, ddata->base + STM32_DMA3_CCR(id));
>> +	writel_relaxed(hwdesc->ctr1, ddata->base + STM32_DMA3_CTR1(id));
>> +	writel_relaxed(hwdesc->ctr2, ddata->base + STM32_DMA3_CTR2(id));
>> +	writel_relaxed(hwdesc->cbr1, ddata->base + STM32_DMA3_CBR1(id));
>> +	writel_relaxed(hwdesc->csar, ddata->base + STM32_DMA3_CSAR(id));
>> +	writel_relaxed(hwdesc->cdar, ddata->base + STM32_DMA3_CDAR(id));
>> +	writel_relaxed(hwdesc->cllr, ddata->base + STM32_DMA3_CLLR(id));
>> +
>> +	/* Clear any pending interrupts */
>> +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(id));
>> +	if (csr & CSR_ALL_F)
>> +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(id));
>> +
>> +	stm32_dma3_chan_dump_reg(chan);
>> +
>> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(id));
>> +	writel_relaxed(ccr | CCR_EN, ddata->base + STM32_DMA3_CCR(id));
>> +
>> +	chan->dma_status = DMA_IN_PROGRESS;
>> +
>> +	dev_dbg(chan2dev(chan), "vchan %pK: started\n", &chan->vchan);
>> +}
>> +
>> +static int stm32_dma3_chan_suspend(struct stm32_dma3_chan *chan, bool susp)
>> +{
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	u32 csr, ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
>> +	int ret = 0;
>> +
>> +	if (susp)
>> +		ccr |= CCR_SUSP;
>> +	else
>> +		ccr &= ~CCR_SUSP;
>> +
>> +	writel_relaxed(ccr, ddata->base + STM32_DMA3_CCR(chan->id));
>> +
>> +	if (susp) {
>> +		ret = readl_relaxed_poll_timeout_atomic(ddata->base + STM32_DMA3_CSR(chan->id), csr,
>> +							csr & CSR_SUSPF, 1, 10);
>> +		if (!ret)
>> +			writel_relaxed(CFCR_SUSPF, ddata->base + STM32_DMA3_CFCR(chan->id));
>> +
>> +		stm32_dma3_chan_dump_reg(chan);
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +static void stm32_dma3_chan_reset(struct stm32_dma3_chan *chan)
>> +{
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	u32 ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
>> +
>> +	writel_relaxed(ccr |= CCR_RESET, ddata->base + STM32_DMA3_CCR(chan->id));
>> +}
>> +
>> +static int stm32_dma3_chan_stop(struct stm32_dma3_chan *chan)
>> +{
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	u32 ccr;
>> +	int ret = 0;
>> +
>> +	chan->dma_status = DMA_COMPLETE;
>> +
>> +	/* Disable interrupts */
>> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id));
>> +	writel_relaxed(ccr & ~(CCR_ALLIE | CCR_EN), ddata->base + STM32_DMA3_CCR(chan->id));
>> +
>> +	if (!(ccr & CCR_SUSP) && (ccr & CCR_EN)) {
>> +		/* Suspend the channel */
>> +		ret = stm32_dma3_chan_suspend(chan, true);
>> +		if (ret)
>> +			dev_warn(chan2dev(chan), "%s: timeout, data might be lost\n", __func__);
>> +	}
>> +
>> +	/*
>> +	 * Reset the channel: this causes the reset of the FIFO and the reset of the channel
>> +	 * internal state, the reset of CCR_EN and CCR_SUSP bits.
>> +	 */
>> +	stm32_dma3_chan_reset(chan);
>> +
>> +	return ret;
>> +}
>> +
>> +static void stm32_dma3_chan_complete(struct stm32_dma3_chan *chan)
>> +{
>> +	if (!chan->swdesc)
>> +		return;
>> +
>> +	vchan_cookie_complete(&chan->swdesc->vdesc);
>> +	chan->swdesc = NULL;
>> +	stm32_dma3_chan_start(chan);
>> +}
>> +
>> +static irqreturn_t stm32_dma3_chan_irq(int irq, void *devid)
>> +{
>> +	struct stm32_dma3_chan *chan = devid;
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	u32 misr, csr, ccr;
>> +
>> +	spin_lock(&chan->vchan.lock);
>> +
>> +	misr = readl_relaxed(ddata->base + STM32_DMA3_MISR);
>> +	if (!(misr & MISR_MIS(chan->id))) {
>> +		spin_unlock(&chan->vchan.lock);
>> +		return IRQ_NONE;
>> +	}
>> +
>> +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(chan->id));
>> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & CCR_ALLIE;
>> +
>> +	if (csr & CSR_TCF && ccr & CCR_TCIE) {
>> +		if (chan->swdesc->cyclic)
>> +			vchan_cyclic_callback(&chan->swdesc->vdesc);
>> +		else
>> +			stm32_dma3_chan_complete(chan);
>> +	}
>> +
>> +	if (csr & CSR_USEF && ccr & CCR_USEIE) {
>> +		dev_err(chan2dev(chan), "User setting error\n");
>> +		chan->dma_status = DMA_ERROR;
>> +		/* CCR.EN automatically cleared by HW */
>> +		stm32_dma3_check_user_setting(chan);
>> +		stm32_dma3_chan_reset(chan);
>> +	}
>> +
>> +	if (csr & CSR_ULEF && ccr & CCR_ULEIE) {
>> +		dev_err(chan2dev(chan), "Update link transfer error\n");
>> +		chan->dma_status = DMA_ERROR;
>> +		/* CCR.EN automatically cleared by HW */
>> +		stm32_dma3_chan_reset(chan);
>> +	}
>> +
>> +	if (csr & CSR_DTEF && ccr & CCR_DTEIE) {
>> +		dev_err(chan2dev(chan), "Data transfer error\n");
>> +		chan->dma_status = DMA_ERROR;
>> +		/* CCR.EN automatically cleared by HW */
>> +		stm32_dma3_chan_reset(chan);
>> +	}
>> +
>> +	/*
>> +	 * Half Transfer Interrupt may be disabled but Half Transfer Flag can be set,
>> +	 * ensure HTF flag to be cleared, with other flags.
>> +	 */
>> +	csr &= (ccr | CCR_HTIE);
>> +
>> +	if (csr)
>> +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(chan->id));
>> +
>> +	spin_unlock(&chan->vchan.lock);
>> +
>> +	return IRQ_HANDLED;
>> +}
>> +
>> +static int stm32_dma3_alloc_chan_resources(struct dma_chan *c)
>> +{
>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	u32 id = chan->id, csemcr, ccid;
>> +	int ret;
>> +
>> +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
>> +	if (ret < 0)
>> +		return ret;
>> +
>> +	/* Ensure the channel is free */
>> +	if (chan->semaphore_mode &&
>> +	    readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id)) & CSEMCR_SEM_MUTEX) {
>> +		ret = -EBUSY;
>> +		goto err_put_sync;
>> +	}
>> +
>> +	chan->lli_pool = dmam_pool_create(dev_name(&c->dev->device), c->device->dev,
>> +					  sizeof(struct stm32_dma3_hwdesc),
>> +					  __alignof__(struct stm32_dma3_hwdesc), 0);
>> +	if (!chan->lli_pool) {
>> +		dev_err(chan2dev(chan), "Failed to create LLI pool\n");
>> +		ret = -ENOMEM;
>> +		goto err_put_sync;
>> +	}
>> +
>> +	/* Take the channel semaphore */
>> +	if (chan->semaphore_mode) {
>> +		writel_relaxed(CSEMCR_SEM_MUTEX, ddata->base + STM32_DMA3_CSEMCR(id));
>> +		csemcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(id));
>> +		ccid = FIELD_GET(CSEMCR_SEM_CCID, csemcr);
>> +		/* Check that the channel is well taken */
>> +		if (ccid != CCIDCFGR_CID1) {
>> +			dev_err(chan2dev(chan), "Not under CID1 control (in-use by CID%d)\n", ccid);
>> +			ret = -EPERM;
>> +			goto err_pool_destroy;
>> +		}
>> +		dev_dbg(chan2dev(chan), "Under CID1 control (semcr=0x%08x)\n", csemcr);
>> +	}
>> +
>> +	return 0;
>> +
>> +err_pool_destroy:
>> +	dmam_pool_destroy(chan->lli_pool);
>> +	chan->lli_pool = NULL;
>> +
>> +err_put_sync:
>> +	pm_runtime_put_sync(ddata->dma_dev.dev);
>> +
>> +	return ret;
>> +}
>> +
>> +static void stm32_dma3_free_chan_resources(struct dma_chan *c)
>> +{
>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	unsigned long flags;
>> +
>> +	/* Ensure channel is in idle state */
>> +	spin_lock_irqsave(&chan->vchan.lock, flags);
>> +	stm32_dma3_chan_stop(chan);
>> +	chan->swdesc = NULL;
>> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
>> +
>> +	vchan_free_chan_resources(to_virt_chan(c));
>> +
>> +	dmam_pool_destroy(chan->lli_pool);
>> +	chan->lli_pool = NULL;
>> +
>> +	/* Release the channel semaphore */
>> +	if (chan->semaphore_mode)
>> +		writel_relaxed(0, ddata->base + STM32_DMA3_CSEMCR(chan->id));
>> +
>> +	pm_runtime_put_sync(ddata->dma_dev.dev);
>> +
>> +	/* Reset configuration */
>> +	memset(&chan->dt_config, 0, sizeof(chan->dt_config));
>> +	memset(&chan->dma_config, 0, sizeof(chan->dma_config));
>> +}
>> +
>> +static struct dma_async_tx_descriptor *stm32_dma3_prep_slave_sg(struct dma_chan *c,
>> +								struct scatterlist *sgl,
>> +								unsigned int sg_len,
>> +								enum dma_transfer_direction dir,
>> +								unsigned long flags, void *context)
>> +{
>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>> +	struct stm32_dma3_swdesc *swdesc;
>> +	struct scatterlist *sg;
>> +	size_t len;
>> +	dma_addr_t sg_addr, dev_addr, src, dst;
>> +	u32 i, j, count, ctr1, ctr2;
>> +	int ret;
>> +
>> +	count = sg_len;
>> +	for_each_sg(sgl, sg, sg_len, i) {
>> +		len = sg_dma_len(sg);
>> +		if (len > STM32_DMA3_MAX_BLOCK_SIZE)
>> +			count += DIV_ROUND_UP(len, STM32_DMA3_MAX_BLOCK_SIZE) - 1;
>> +	}
>> +
>> +	swdesc = stm32_dma3_chan_desc_alloc(chan, count);
>> +	if (!swdesc)
>> +		return NULL;
>> +
>> +	/* sg_len and i correspond to the initial sgl; count and j correspond to the hwdesc LL */
>> +	j = 0;
>> +	for_each_sg(sgl, sg, sg_len, i) {
>> +		sg_addr = sg_dma_address(sg);
>> +		dev_addr = (dir == DMA_MEM_TO_DEV) ? chan->dma_config.dst_addr :
>> +						     chan->dma_config.src_addr;
>> +		len = sg_dma_len(sg);
>> +
>> +		do {
>> +			size_t chunk = min_t(size_t, len, STM32_DMA3_MAX_BLOCK_SIZE);
>> +
>> +			if (dir == DMA_MEM_TO_DEV) {
>> +				src = sg_addr;
>> +				dst = dev_addr;
>> +
>> +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
>> +							      src, dst, chunk);
>> +
>> +				if (FIELD_GET(CTR1_DINC, ctr1))
>> +					dev_addr += chunk;
>> +			} else { /* (dir == DMA_DEV_TO_MEM || dir == DMA_MEM_TO_MEM) */
>> +				src = dev_addr;
>> +				dst = sg_addr;
>> +
>> +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
>> +							      src, dst, chunk);
>> +
>> +				if (FIELD_GET(CTR1_SINC, ctr1))
>> +					dev_addr += chunk;
>> +			}
>> +
>> +			if (ret)
>> +				goto err_desc_free;
>> +
>> +			stm32_dma3_chan_prep_hwdesc(chan, swdesc, j, src, dst, chunk,
>> +						    ctr1, ctr2, j == (count - 1), false);
>> +
>> +			sg_addr += chunk;
>> +			len -= chunk;
>> +			j++;
>> +		} while (len);
>> +	}
>> +
>> +	/* Enable Error interrupts */
>> +	swdesc->ccr |= CCR_USEIE | CCR_ULEIE | CCR_DTEIE;
>> +	/* Enable Transfer state interrupts */
>> +	swdesc->ccr |= CCR_TCIE;
>> +
>> +	swdesc->cyclic = false;
>> +
>> +	return vchan_tx_prep(&chan->vchan, &swdesc->vdesc, flags);
>> +
>> +err_desc_free:
>> +	stm32_dma3_chan_desc_free(chan, swdesc);
>> +
>> +	return NULL;
>> +}
>> +
>> +static void stm32_dma3_caps(struct dma_chan *c, struct dma_slave_caps *caps)
>> +{
>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>> +
>> +	if (!chan->fifo_size) {
>> +		caps->max_burst = 0;
>> +		caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>> +		caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>> +	} else {
>> +		/* Burst transfer should not exceed half of the fifo size */
>> +		caps->max_burst = chan->max_burst;
>> +		if (caps->max_burst < DMA_SLAVE_BUSWIDTH_8_BYTES) {
>> +			caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>> +			caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>> +		}
>> +	}
>> +}
>> +
>> +static int stm32_dma3_config(struct dma_chan *c, struct dma_slave_config *config)
>> +{
>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>> +
>> +	memcpy(&chan->dma_config, config, sizeof(*config));
>> +
>> +	return 0;
>> +}
>> +
>> +static int stm32_dma3_terminate_all(struct dma_chan *c)
>> +{
>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>> +	unsigned long flags;
>> +	LIST_HEAD(head);
>> +
>> +	spin_lock_irqsave(&chan->vchan.lock, flags);
>> +
>> +	if (chan->swdesc) {
>> +		vchan_terminate_vdesc(&chan->swdesc->vdesc);
>> +		chan->swdesc = NULL;
>> +	}
>> +
>> +	stm32_dma3_chan_stop(chan);
>> +
>> +	vchan_get_all_descriptors(&chan->vchan, &head);
>> +
>> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
>> +	vchan_dma_desc_free_list(&chan->vchan, &head);
>> +
>> +	dev_dbg(chan2dev(chan), "vchan %pK: terminated\n", &chan->vchan);
>> +
>> +	return 0;
>> +}
>> +
>> +static void stm32_dma3_synchronize(struct dma_chan *c)
>> +{
>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>> +
>> +	vchan_synchronize(&chan->vchan);
>> +}
>> +
>> +static void stm32_dma3_issue_pending(struct dma_chan *c)
>> +{
>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&chan->vchan.lock, flags);
>> +
>> +	if (vchan_issue_pending(&chan->vchan) && !chan->swdesc) {
>> +		dev_dbg(chan2dev(chan), "vchan %pK: issued\n", &chan->vchan);
>> +		stm32_dma3_chan_start(chan);
>> +	}
>> +
>> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
>> +}
>> +
>> +static bool stm32_dma3_filter_fn(struct dma_chan *c, void *fn_param)
>> +{
>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	struct stm32_dma3_dt_conf *conf = fn_param;
>> +	u32 mask, semcr;
>> +	int ret;
>> +
>> +	dev_dbg(c->device->dev, "%s(%s): req_line=%d ch_conf=%08x tr_conf=%08x\n",
>> +		__func__, dma_chan_name(c), conf->req_line, conf->ch_conf, conf->tr_conf);
>> +
>> +	if (!of_property_read_u32(c->device->dev->of_node, "dma-channel-mask", &mask))
>> +		if (!(mask & BIT(chan->id)))
>> +			return false;
>> +
>> +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
>> +	if (ret < 0)
>> +		return false;
>> +	semcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id));
>> +	pm_runtime_put_sync(ddata->dma_dev.dev);
>> +
>> +	/* Check if chan is free */
>> +	if (semcr & CSEMCR_SEM_MUTEX)
>> +		return false;
>> +
>> +	/* Check if chan fifo fits well */
>> +	if (FIELD_GET(STM32_DMA3_DT_FIFO, conf->ch_conf) != chan->fifo_size)
>> +		return false;
>> +
>> +	return true;
>> +}
>> +
>> +static struct dma_chan *stm32_dma3_of_xlate(struct of_phandle_args *dma_spec, struct of_dma *ofdma)
>> +{
>> +	struct stm32_dma3_ddata *ddata = ofdma->of_dma_data;
>> +	dma_cap_mask_t mask = ddata->dma_dev.cap_mask;
>> +	struct stm32_dma3_dt_conf conf;
>> +	struct stm32_dma3_chan *chan;
>> +	struct dma_chan *c;
>> +
>> +	if (dma_spec->args_count < 3) {
>> +		dev_err(ddata->dma_dev.dev, "Invalid args count\n");
>> +		return NULL;
>> +	}
>> +
>> +	conf.req_line = dma_spec->args[0];
>> +	conf.ch_conf = dma_spec->args[1];
>> +	conf.tr_conf = dma_spec->args[2];
>> +
>> +	if (conf.req_line >= ddata->dma_requests) {
>> +		dev_err(ddata->dma_dev.dev, "Invalid request line\n");
>> +		return NULL;
>> +	}
>> +
>> +	/* Request dma channel among the generic dma controller list */
>> +	c = dma_request_channel(mask, stm32_dma3_filter_fn, &conf);
>> +	if (!c) {
>> +		dev_err(ddata->dma_dev.dev, "No suitable channel found\n");
>> +		return NULL;
>> +	}
>> +
>> +	chan = to_stm32_dma3_chan(c);
>> +	chan->dt_config = conf;
>> +
>> +	return c;
>> +}
>> +
>> +static u32 stm32_dma3_check_rif(struct stm32_dma3_ddata *ddata)
>> +{
>> +	u32 chan_reserved, mask = 0, i, ccidcfgr, invalid_cid = 0;
>> +
>> +	/* Reserve Secure channels */
>> +	chan_reserved = readl_relaxed(ddata->base + STM32_DMA3_SECCFGR);
>> +
>> +	/*
>> +	 * CID filtering must be configured to ensure that the DMA3 channel will inherit the CID of
>> +	 * the processor which is configuring and using the given channel.
>> +	 * In case CID filtering is not configured, dma-channel-mask property can be used to
>> +	 * specify available DMA channels to the kernel.
>> +	 */
>> +	of_property_read_u32(ddata->dma_dev.dev->of_node, "dma-channel-mask", &mask);
>> +
>> +	/* Reserve !CID-filtered not in dma-channel-mask, static CID != CID1, CID1 not allowed */
>> +	for (i = 0; i < ddata->dma_channels; i++) {
>> +		ccidcfgr = readl_relaxed(ddata->base + STM32_DMA3_CCIDCFGR(i));
>> +
>> +		if (!(ccidcfgr & CCIDCFGR_CFEN)) { /* !CID-filtered */
>> +			invalid_cid |= BIT(i);
>> +			if (!(mask & BIT(i))) /* Not in dma-channel-mask */
>> +				chan_reserved |= BIT(i);
>> +		} else { /* CID-filtered */
>> +			if (!(ccidcfgr & CCIDCFGR_SEM_EN)) { /* Static CID mode */
>> +				if (FIELD_GET(CCIDCFGR_SCID, ccidcfgr) != CCIDCFGR_CID1)
>> +					chan_reserved |= BIT(i);
>> +			} else { /* Semaphore mode */
>> +				if (!FIELD_GET(CCIDCFGR_SEM_WLIST_CID1, ccidcfgr))
>> +					chan_reserved |= BIT(i);
>> +				ddata->chans[i].semaphore_mode = true;
>> +			}
>> +		}
>> +		dev_dbg(ddata->dma_dev.dev, "chan%d: %s mode, %s\n", i,
>> +			!(ccidcfgr & CCIDCFGR_CFEN) ? "!CID-filtered" :
>> +			ddata->chans[i].semaphore_mode ? "Semaphore" : "Static CID",
>> +			(chan_reserved & BIT(i)) ? "denied" :
>> +			mask & BIT(i) ? "force allowed" : "allowed");
>> +	}
>> +
>> +	if (invalid_cid)
>> +		dev_warn(ddata->dma_dev.dev, "chan%*pbl have invalid CID configuration\n",
>> +			 ddata->dma_channels, &invalid_cid);
>> +
>> +	return chan_reserved;
>> +}
>> +
>> +static const struct of_device_id stm32_dma3_of_match[] = {
>> +	{ .compatible = "st,stm32-dma3", },
>> +	{ /* sentinel */},
>> +};
>> +MODULE_DEVICE_TABLE(of, stm32_dma3_of_match);
>> +
>> +static int stm32_dma3_probe(struct platform_device *pdev)
>> +{
>> +	struct device_node *np = pdev->dev.of_node;
>> +	struct stm32_dma3_ddata *ddata;
>> +	struct reset_control *reset;
>> +	struct stm32_dma3_chan *chan;
>> +	struct dma_device *dma_dev;
>> +	u32 master_ports, chan_reserved, i, verr;
>> +	u64 hwcfgr;
>> +	int ret;
>> +
>> +	ddata = devm_kzalloc(&pdev->dev, sizeof(*ddata), GFP_KERNEL);
>> +	if (!ddata)
>> +		return -ENOMEM;
>> +	platform_set_drvdata(pdev, ddata);
>> +
>> +	dma_dev = &ddata->dma_dev;
>> +
>> +	ddata->base = devm_platform_ioremap_resource(pdev, 0);
>> +	if (IS_ERR(ddata->base))
>> +		return PTR_ERR(ddata->base);
>> +
>> +	ddata->clk = devm_clk_get(&pdev->dev, NULL);
>> +	if (IS_ERR(ddata->clk))
>> +		return dev_err_probe(&pdev->dev, PTR_ERR(ddata->clk), "Failed to get clk\n");
>> +
>> +	reset = devm_reset_control_get_optional(&pdev->dev, NULL);
>> +	if (IS_ERR(reset))
>> +		return dev_err_probe(&pdev->dev, PTR_ERR(reset), "Failed to get reset\n");
>> +
>> +	ret = clk_prepare_enable(ddata->clk);
>> +	if (ret)
>> +		return dev_err_probe(&pdev->dev, ret, "Failed to enable clk\n");
>> +
>> +	reset_control_reset(reset);
>> +
>> +	INIT_LIST_HEAD(&dma_dev->channels);
>> +
>> +	dma_cap_set(DMA_SLAVE, dma_dev->cap_mask);
>> +	dma_cap_set(DMA_PRIVATE, dma_dev->cap_mask);
>> +	dma_dev->dev = &pdev->dev;
>> +	/*
>> +	 * This controller supports up to 8-byte buswidth depending on the port used and the
>> +	 * channel, and can only access address at even boundaries, multiple of the buswidth.
>> +	 */
>> +	dma_dev->copy_align = DMAENGINE_ALIGN_8_BYTES;
>> +	dma_dev->src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
>> +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
>> +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
>> +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>> +	dma_dev->dst_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
>> +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
>> +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
>> +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>> +	dma_dev->directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV) | BIT(DMA_MEM_TO_MEM);
>> +
>> +	dma_dev->descriptor_reuse = true;
>> +	dma_dev->max_sg_burst = STM32_DMA3_MAX_SEG_SIZE;
>> +	dma_dev->residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
>> +	dma_dev->device_alloc_chan_resources = stm32_dma3_alloc_chan_resources;
>> +	dma_dev->device_free_chan_resources = stm32_dma3_free_chan_resources;
>> +	dma_dev->device_prep_slave_sg = stm32_dma3_prep_slave_sg;
>> +	dma_dev->device_caps = stm32_dma3_caps;
>> +	dma_dev->device_config = stm32_dma3_config;
>> +	dma_dev->device_terminate_all = stm32_dma3_terminate_all;
>> +	dma_dev->device_synchronize = stm32_dma3_synchronize;
>> +	dma_dev->device_tx_status = dma_cookie_status;
>> +	dma_dev->device_issue_pending = stm32_dma3_issue_pending;
>> +
>> +	/* if dma_channels is not modified, get it from hwcfgr1 */
>> +	if (of_property_read_u32(np, "dma-channels", &ddata->dma_channels)) {
>> +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
>> +		ddata->dma_channels = FIELD_GET(G_NUM_CHANNELS, hwcfgr);
>> +	}
>> +
>> +	/* if dma_requests is not modified, get it from hwcfgr2 */
>> +	if (of_property_read_u32(np, "dma-requests", &ddata->dma_requests)) {
>> +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR2);
>> +		ddata->dma_requests = FIELD_GET(G_MAX_REQ_ID, hwcfgr) + 1;
>> +	}
>> +
>> +	/* G_MASTER_PORTS, G_M0_DATA_WIDTH_ENC, G_M1_DATA_WIDTH_ENC in HWCFGR1 */
>> +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
>> +	master_ports = FIELD_GET(G_MASTER_PORTS, hwcfgr);
>> +
>> +	ddata->ports_max_dw[0] = FIELD_GET(G_M0_DATA_WIDTH_ENC, hwcfgr);
>> +	if (master_ports == AXI64 || master_ports == AHB32) /* Single master port */
>> +		ddata->ports_max_dw[1] = DW_INVALID;
>> +	else /* Dual master ports */
>> +		ddata->ports_max_dw[1] = FIELD_GET(G_M1_DATA_WIDTH_ENC, hwcfgr);
>> +
>> +	ddata->chans = devm_kcalloc(&pdev->dev, ddata->dma_channels, sizeof(*ddata->chans),
>> +				    GFP_KERNEL);
>> +	if (!ddata->chans) {
>> +		ret = -ENOMEM;
>> +		goto err_clk_disable;
>> +	}
>> +
>> +	chan_reserved = stm32_dma3_check_rif(ddata);
>> +
>> +	if (chan_reserved == GENMASK(ddata->dma_channels - 1, 0)) {
>> +		ret = -ENODEV;
>> +		dev_err_probe(&pdev->dev, ret, "No channel available, abort registration\n");
>> +		goto err_clk_disable;
>> +	}
>> +
>> +	/* G_FIFO_SIZE x=0..7 in HWCFGR3 and G_FIFO_SIZE x=8..15 in HWCFGR4 */
>> +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR3);
>> +	hwcfgr |= ((u64)readl_relaxed(ddata->base + STM32_DMA3_HWCFGR4)) << 32;
>> +
>> +	for (i = 0; i < ddata->dma_channels; i++) {
>> +		if (chan_reserved & BIT(i))
>> +			continue;
>> +
>> +		chan = &ddata->chans[i];
>> +		chan->id = i;
>> +		chan->fifo_size = get_chan_hwcfg(i, G_FIFO_SIZE(i), hwcfgr);
>> +		/* If chan->fifo_size > 0 then half of the fifo size, else no burst when no FIFO */
>> +		chan->max_burst = (chan->fifo_size) ? (1 << (chan->fifo_size + 1)) / 2 : 0;
>> +		chan->vchan.desc_free = stm32_dma3_chan_vdesc_free;
>> +
>> +		vchan_init(&chan->vchan, dma_dev);
>> +	}
>> +
>> +	ret = dmaenginem_async_device_register(dma_dev);
>> +	if (ret)
>> +		goto err_clk_disable;
>> +
>> +	for (i = 0; i < ddata->dma_channels; i++) {
>> +		if (chan_reserved & BIT(i))
>> +			continue;
>> +
>> +		ret = platform_get_irq(pdev, i);
>> +		if (ret < 0)
>> +			goto err_clk_disable;
>> +
>> +		chan = &ddata->chans[i];
>> +		chan->irq = ret;
>> +
>> +		ret = devm_request_irq(&pdev->dev, chan->irq, stm32_dma3_chan_irq, 0,
>> +				       dev_name(chan2dev(chan)), chan);
>> +		if (ret) {
>> +			dev_err_probe(&pdev->dev, ret, "Failed to request channel %s IRQ\n",
>> +				      dev_name(chan2dev(chan)));
>> +			goto err_clk_disable;
>> +		}
>> +	}
>> +
>> +	ret = of_dma_controller_register(np, stm32_dma3_of_xlate, ddata);
>> +	if (ret) {
>> +		dev_err_probe(&pdev->dev, ret, "Failed to register controller\n");
>> +		goto err_clk_disable;
>> +	}
>> +
>> +	verr = readl_relaxed(ddata->base + STM32_DMA3_VERR);
>> +
>> +	pm_runtime_set_active(&pdev->dev);
>> +	pm_runtime_enable(&pdev->dev);
>> +	pm_runtime_get_noresume(&pdev->dev);
>> +	pm_runtime_put(&pdev->dev);
>> +
>> +	dev_info(&pdev->dev, "STM32 DMA3 registered rev:%lu.%lu\n",
>> +		 FIELD_GET(VERR_MAJREV, verr), FIELD_GET(VERR_MINREV, verr));
>> +
>> +	return 0;
>> +
>> +err_clk_disable:
>> +	clk_disable_unprepare(ddata->clk);
>> +
>> +	return ret;
>> +}
>> +
>> +static void stm32_dma3_remove(struct platform_device *pdev)
>> +{
>> +	pm_runtime_disable(&pdev->dev);
>> +}
>> +
>> +static int stm32_dma3_runtime_suspend(struct device *dev)
>> +{
>> +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
>> +
>> +	clk_disable_unprepare(ddata->clk);
>> +
>> +	return 0;
>> +}
>> +
>> +static int stm32_dma3_runtime_resume(struct device *dev)
>> +{
>> +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
>> +	int ret;
>> +
>> +	ret = clk_prepare_enable(ddata->clk);
>> +	if (ret)
>> +		dev_err(dev, "Failed to enable clk: %d\n", ret);
>> +
>> +	return ret;
>> +}
>> +
>> +static const struct dev_pm_ops stm32_dma3_pm_ops = {
>> +	SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
>> +	RUNTIME_PM_OPS(stm32_dma3_runtime_suspend, stm32_dma3_runtime_resume, NULL)
>> +};
>> +
>> +static struct platform_driver stm32_dma3_driver = {
>> +	.probe = stm32_dma3_probe,
>> +	.remove_new = stm32_dma3_remove,
>> +	.driver = {
>> +		.name = "stm32-dma3",
>> +		.of_match_table = stm32_dma3_of_match,
>> +		.pm = pm_ptr(&stm32_dma3_pm_ops),
>> +	},
>> +};
>> +
>> +static int __init stm32_dma3_init(void)
>> +{
>> +	return platform_driver_register(&stm32_dma3_driver);
>> +}
>> +
>> +subsys_initcall(stm32_dma3_init);
>> +
>> +MODULE_DESCRIPTION("STM32 DMA3 controller driver");
>> +MODULE_AUTHOR("Amelie Delaunay <amelie.delaunay@foss.st.com>");
>> +MODULE_LICENSE("GPL");
>> -- 
>> 2.25.1
> 

Regards,
Amelie

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 05/12] dmaengine: Add STM32 DMA3 support
  2024-05-04 14:27   ` Christophe JAILLET
@ 2024-05-07 12:37     ` Amelie Delaunay
  0 siblings, 0 replies; 29+ messages in thread
From: Amelie Delaunay @ 2024-05-07 12:37 UTC (permalink / raw)
  To: Christophe JAILLET
  Cc: alexandre.torgue, conor+dt, devicetree, dmaengine,
	krzysztof.kozlowski+dt, linux-arm-kernel, linux-hardening,
	linux-kernel, linux-stm32, mcoquelin.stm32, robh+dt, vkoul

Hi Christophe,

Thanks for the review.

On 5/4/24 16:27, Christophe JAILLET wrote:
> Le 23/04/2024 à 14:32, Amelie Delaunay a écrit :
>> STM32 DMA3 driver supports the 3 hardware configurations of the STM32 
>> DMA3
>> controller:
>> - LPDMA (Low Power): 4 channels, no FIFO
>> - GPDMA (General Purpose): 16 channels, FIFO from 8 to 32 bytes
>> - HPDMA (High Performance): 16 channels, FIFO from 8 to 256 bytes
>> Hardware configuration of the channels is retrieved from the hardware
>> configuration registers.
>> The client can specify its channel requirements through device tree.
>> STM32 DMA3 channels can be individually reserved either because they are
>> secure, or dedicated to another CPU.
>> Indeed, channels availability depends on Resource Isolation Framework
>> (RIF) configuration. RIF grants access to buses with Compartiment ID
>> (CIF) filtering, secure and privilege level. It also assigns DMA channels
>> to one or several processors.
>> DMA channels used by Linux should be CID-filtered and statically assigned
>> to CID1 or shared with other CPUs but using semaphore. In case CID
>> filtering is not configured, dma-channel-mask property can be used to
>> specify available DMA channels to the kernel, otherwise such channels
>> will be marked as reserved and can't be used by Linux.
>>
>> Signed-off-by: Amelie Delaunay 
>> <amelie.delaunay-rj0Iel/JR4NBDgjK7y7TUQ@public.gmane.org>
>> ---
> 
> ...
> 
>> +    pm_runtime_set_active(&pdev->dev);
>> +    pm_runtime_enable(&pdev->dev);
>> +    pm_runtime_get_noresume(&pdev->dev);
>> +    pm_runtime_put(&pdev->dev);
>> +
>> +    dev_info(&pdev->dev, "STM32 DMA3 registered rev:%lu.%lu\n",
>> +         FIELD_GET(VERR_MAJREV, verr), FIELD_GET(VERR_MINREV, verr));
>> +
>> +    return 0;
>> +
>> +err_clk_disable:
>> +    clk_disable_unprepare(ddata->clk);
>> +
>> +    return ret;
>> +}
>> +
>> +static void stm32_dma3_remove(struct platform_device *pdev)
>> +{
>> +    pm_runtime_disable(&pdev->dev);
> 
> Hi,
> 
> missing clk_disable_unprepare(ddata->clk);?
> 
> as in the error handling path on the probe just above?
> 
> CJ
> 

The clock is entirely managed by pm_runtime, except in the error path of 
probe, since pm_runtime is only enabled at the very end.
The clock is enabled with pm_runtime_resume_and_get() when a channel is 
requested or when an asynchronous register access occurs (filter_fn, 
debugfs, runtime_resume), and it is disabled with pm_runtime_put_sync() 
when a channel is released or at the end of an asynchronous register 
access (filter_fn, debugfs, runtime_suspend).
Adding clk_disable_unprepare(ddata->clk); here leads to "clock already 
disabled/unprepared" warnings from clk_core_disable()/clk_core_unprepare() 
in drivers/clk/clk.c.
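
To illustrate (a minimal sketch only; the helper below is hypothetical and
not part of the patch), an asynchronous register access is bracketed by
runtime PM calls, so the clock is only enabled while it is actually needed:

	/*
	 * Hypothetical helper, for illustration: take a runtime PM reference
	 * (which enables the clock through stm32_dma3_runtime_resume()),
	 * perform the register access, then drop the reference so the clock
	 * can be gated again by stm32_dma3_runtime_suspend().
	 */
	static int stm32_dma3_read_csemcr(struct stm32_dma3_ddata *ddata,
					  u32 id, u32 *semcr)
	{
		int ret;

		ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
		if (ret < 0)
			return ret;

		*semcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(id));

		pm_runtime_put_sync(ddata->dma_dev.dev);

		return 0;
	}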

>> +}
> 
> ...
> 

Regards,
Amelie

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 05/12] dmaengine: Add STM32 DMA3 support
  2024-05-07 11:33     ` Amelie Delaunay
@ 2024-05-07 20:26       ` Frank Li
  2024-05-13  9:21         ` Amelie Delaunay
  0 siblings, 1 reply; 29+ messages in thread
From: Frank Li @ 2024-05-07 20:26 UTC (permalink / raw)
  To: Amelie Delaunay
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue, dmaengine, devicetree,
	linux-stm32, linux-arm-kernel, linux-kernel, linux-hardening

On Tue, May 07, 2024 at 01:33:31PM +0200, Amelie Delaunay wrote:
> Hi Vinod,
> 
> Thanks for the review.
> 
> On 5/4/24 14:40, Vinod Koul wrote:
> > On 23-04-24, 14:32, Amelie Delaunay wrote:
> > > STM32 DMA3 driver supports the 3 hardware configurations of the STM32 DMA3
> > > controller:
> > > - LPDMA (Low Power): 4 channels, no FIFO
> > > - GPDMA (General Purpose): 16 channels, FIFO from 8 to 32 bytes
> > > - HPDMA (High Performance): 16 channels, FIFO from 8 to 256 bytes
> > > Hardware configuration of the channels is retrieved from the hardware
> > > configuration registers.
> > > The client can specify its channel requirements through device tree.
> > > STM32 DMA3 channels can be individually reserved either because they are
> > > secure, or dedicated to another CPU.
> > > Indeed, channels availability depends on Resource Isolation Framework
> > > (RIF) configuration. RIF grants access to buses with Compartiment ID
> > 
> > Compartiment? typo...?
> > 
> 
> Sorry, indeed, Compartment instead.
> 
> > > (CIF) filtering, secure and privilege level. It also assigns DMA channels
> > > to one or several processors.
> > > DMA channels used by Linux should be CID-filtered and statically assigned
> > > to CID1 or shared with other CPUs but using semaphore. In case CID
> > > filtering is not configured, dma-channel-mask property can be used to
> > > specify available DMA channels to the kernel, otherwise such channels
> > > will be marked as reserved and can't be used by Linux.
> > > 
> > > Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
> > > ---
> > >   drivers/dma/stm32/Kconfig      |   10 +
> > >   drivers/dma/stm32/Makefile     |    1 +
> > >   drivers/dma/stm32/stm32-dma3.c | 1431 ++++++++++++++++++++++++++++++++
> > >   3 files changed, 1442 insertions(+)
> > >   create mode 100644 drivers/dma/stm32/stm32-dma3.c
> > > 
> > > diff --git a/drivers/dma/stm32/Kconfig b/drivers/dma/stm32/Kconfig
> > > index b72ae1a4502f..4d8d8063133b 100644
> > > --- a/drivers/dma/stm32/Kconfig
> > > +++ b/drivers/dma/stm32/Kconfig
> > > @@ -34,4 +34,14 @@ config STM32_MDMA
> > >   	  If you have a board based on STM32 SoC with such DMA controller
> > >   	  and want to use MDMA say Y here.
> > > +config STM32_DMA3
> > > +	tristate "STMicroelectronics STM32 DMA3 support"
> > > +	select DMA_ENGINE
> > > +	select DMA_VIRTUAL_CHANNELS
> > > +	help
> > > +	  Enable support for the on-chip DMA3 controller on STMicroelectronics
> > > +	  STM32 platforms.
> > > +	  If you have a board based on STM32 SoC with such DMA3 controller
> > > +	  and want to use DMA3, say Y here.
> > > +
> > >   endif
> > > diff --git a/drivers/dma/stm32/Makefile b/drivers/dma/stm32/Makefile
> > > index 663a3896a881..5082db4b4c1c 100644
> > > --- a/drivers/dma/stm32/Makefile
> > > +++ b/drivers/dma/stm32/Makefile
> > > @@ -2,3 +2,4 @@
> > >   obj-$(CONFIG_STM32_DMA) += stm32-dma.o
> > >   obj-$(CONFIG_STM32_DMAMUX) += stm32-dmamux.o
> > >   obj-$(CONFIG_STM32_MDMA) += stm32-mdma.o
> > > +obj-$(CONFIG_STM32_DMA3) += stm32-dma3.o
> > 
> > are there any similarities in mdma/dma and dma3..?
> > can anything be reused...?
> > 
> 
> DMA/MDMA were originally intended for STM32 MCUs and have been used in
> STM32MP1 MPUs.
> New MPUs (STM32MP2, ...) and STM32 MCUs (STM32H5, STM32N6, ...) use DMA3.
> Unlike DMA/MDMA, DMA3 comes in multiple configurations, LPDMA, GPDMA and
> HPDMA, and among these global configurations, there are possible
> sub-configurations (e.g. channel FIFO size). stm32-dma3 uses the hardware
> configuration registers to discover the controller/channel capabilities.
> Reusing stm32-dma or stm32-mdma would complicate those drivers and make
> future stm32-dma3 evolutions for the next STM32 MPUs intricate and very
> difficult.

I think this reasoning is still not enough to justify creating a new driver
instead of trying to reuse the old one.

Is the register layout or the DMA descriptor format totally different?

If the DMA descriptor format is the same, you could at least reuse the DMA
descriptor preparation part.

Channel selection is an independent part of a DMA engine driver. You can
create a separate one for each different DMA engine.

Frank

> 
> > > diff --git a/drivers/dma/stm32/stm32-dma3.c b/drivers/dma/stm32/stm32-dma3.c
> > > new file mode 100644
> > > index 000000000000..b5493f497d06
> > > --- /dev/null
> > > +++ b/drivers/dma/stm32/stm32-dma3.c
> > > @@ -0,0 +1,1431 @@
> > > +// SPDX-License-Identifier: GPL-2.0-only
> > > +/*
> > > + * STM32 DMA3 controller driver
> > > + *
> > > + * Copyright (C) STMicroelectronics 2024
> > > + * Author(s): Amelie Delaunay <amelie.delaunay@foss.st.com>
> > > + */
> > > +
> > > +#include <linux/bitfield.h>
> > > +#include <linux/clk.h>
> > > +#include <linux/dma-mapping.h>
> > > +#include <linux/dmaengine.h>
> > > +#include <linux/dmapool.h>
> > > +#include <linux/init.h>
> > > +#include <linux/iopoll.h>
> > > +#include <linux/list.h>
> > > +#include <linux/module.h>
> > > +#include <linux/of_dma.h>
> > > +#include <linux/platform_device.h>
> > > +#include <linux/pm_runtime.h>
> > > +#include <linux/reset.h>
> > > +#include <linux/slab.h>
> > > +
> > > +#include "../virt-dma.h"
> > > +
> > > +#define STM32_DMA3_SECCFGR		0x00
> > > +#define STM32_DMA3_PRIVCFGR		0x04
> > > +#define STM32_DMA3_RCFGLOCKR		0x08
> > > +#define STM32_DMA3_MISR			0x0C
> > 
> > lower hex please
> > 
> 
> Ok.
> 
> > > +#define STM32_DMA3_SMISR		0x10
> > > +
> > > +#define STM32_DMA3_CLBAR(x)		(0x50 + 0x80 * (x))
> > > +#define STM32_DMA3_CCIDCFGR(x)		(0x54 + 0x80 * (x))
> > > +#define STM32_DMA3_CSEMCR(x)		(0x58 + 0x80 * (x))
> > > +#define STM32_DMA3_CFCR(x)		(0x5C + 0x80 * (x))
> > > +#define STM32_DMA3_CSR(x)		(0x60 + 0x80 * (x))
> > > +#define STM32_DMA3_CCR(x)		(0x64 + 0x80 * (x))
> > > +#define STM32_DMA3_CTR1(x)		(0x90 + 0x80 * (x))
> > > +#define STM32_DMA3_CTR2(x)		(0x94 + 0x80 * (x))
> > > +#define STM32_DMA3_CBR1(x)		(0x98 + 0x80 * (x))
> > > +#define STM32_DMA3_CSAR(x)		(0x9C + 0x80 * (x))
> > > +#define STM32_DMA3_CDAR(x)		(0xA0 + 0x80 * (x))
> > > +#define STM32_DMA3_CLLR(x)		(0xCC + 0x80 * (x))
> > > +
> > > +#define STM32_DMA3_HWCFGR13		0xFC0 /* G_PER_CTRL(X) x=8..15 */
> > > +#define STM32_DMA3_HWCFGR12		0xFC4 /* G_PER_CTRL(X) x=0..7 */
> > > +#define STM32_DMA3_HWCFGR4		0xFE4 /* G_FIFO_SIZE(X) x=8..15 */
> > > +#define STM32_DMA3_HWCFGR3		0xFE8 /* G_FIFO_SIZE(X) x=0..7 */
> > > +#define STM32_DMA3_HWCFGR2		0xFEC /* G_MAX_REQ_ID */
> > > +#define STM32_DMA3_HWCFGR1		0xFF0 /* G_MASTER_PORTS, G_NUM_CHANNELS, G_Mx_DATA_WIDTH */
> > > +#define STM32_DMA3_VERR			0xFF4
> > 
> > here as well
> > 
> 
> Ok.
> 
> > > +
> > > +/* SECCFGR DMA secure configuration register */
> > > +#define SECCFGR_SEC(x)			BIT(x)
> > > +
> > > +/* MISR DMA non-secure/secure masked interrupt status register */
> > > +#define MISR_MIS(x)			BIT(x)
> > > +
> > > +/* CxLBAR DMA channel x linked_list base address register */
> > > +#define CLBAR_LBA			GENMASK(31, 16)
> > > +
> > > +/* CxCIDCFGR DMA channel x CID register */
> > > +#define CCIDCFGR_CFEN			BIT(0)
> > > +#define CCIDCFGR_SEM_EN			BIT(1)
> > > +#define CCIDCFGR_SCID			GENMASK(5, 4)
> > > +#define CCIDCFGR_SEM_WLIST_CID0		BIT(16)
> > > +#define CCIDCFGR_SEM_WLIST_CID1		BIT(17)
> > > +#define CCIDCFGR_SEM_WLIST_CID2		BIT(18)
> > > +
> > > +enum ccidcfgr_cid {
> > > +	CCIDCFGR_CID0,
> > > +	CCIDCFGR_CID1,
> > > +	CCIDCFGR_CID2,
> > > +};
> > > +
> > > +/* CxSEMCR DMA channel x semaphore control register */
> > > +#define CSEMCR_SEM_MUTEX		BIT(0)
> > > +#define CSEMCR_SEM_CCID			GENMASK(5, 4)
> > > +
> > > +/* CxFCR DMA channel x flag clear register */
> > > +#define CFCR_TCF			BIT(8)
> > > +#define CFCR_HTF			BIT(9)
> > > +#define CFCR_DTEF			BIT(10)
> > > +#define CFCR_ULEF			BIT(11)
> > > +#define CFCR_USEF			BIT(12)
> > > +#define CFCR_SUSPF			BIT(13)
> > > +
> > > +/* CxSR DMA channel x status register */
> > > +#define CSR_IDLEF			BIT(0)
> > > +#define CSR_TCF				BIT(8)
> > > +#define CSR_HTF				BIT(9)
> > > +#define CSR_DTEF			BIT(10)
> > > +#define CSR_ULEF			BIT(11)
> > > +#define CSR_USEF			BIT(12)
> > > +#define CSR_SUSPF			BIT(13)
> > > +#define CSR_ALL_F			GENMASK(13, 8)
> > > +#define CSR_FIFOL			GENMASK(24, 16)
> > > +
> > > +/* CxCR DMA channel x control register */
> > > +#define CCR_EN				BIT(0)
> > > +#define CCR_RESET			BIT(1)
> > > +#define CCR_SUSP			BIT(2)
> > > +#define CCR_TCIE			BIT(8)
> > > +#define CCR_HTIE			BIT(9)
> > > +#define CCR_DTEIE			BIT(10)
> > > +#define CCR_ULEIE			BIT(11)
> > > +#define CCR_USEIE			BIT(12)
> > > +#define CCR_SUSPIE			BIT(13)
> > > +#define CCR_ALLIE			GENMASK(13, 8)
> > > +#define CCR_LSM				BIT(16)
> > > +#define CCR_LAP				BIT(17)
> > > +#define CCR_PRIO			GENMASK(23, 22)
> > > +
> > > +enum ccr_prio {
> > > +	CCR_PRIO_LOW,
> > > +	CCR_PRIO_MID,
> > > +	CCR_PRIO_HIGH,
> > > +	CCR_PRIO_VERY_HIGH,
> > > +};
> > > +
> > > +/* CxTR1 DMA channel x transfer register 1 */
> > > +#define CTR1_SINC			BIT(3)
> > > +#define CTR1_SBL_1			GENMASK(9, 4)
> > > +#define CTR1_DINC			BIT(19)
> > > +#define CTR1_DBL_1			GENMASK(25, 20)
> > > +#define CTR1_SDW_LOG2			GENMASK(1, 0)
> > > +#define CTR1_PAM			GENMASK(12, 11)
> > > +#define CTR1_SAP			BIT(14)
> > > +#define CTR1_DDW_LOG2			GENMASK(17, 16)
> > > +#define CTR1_DAP			BIT(30)
> > > +
> > > +enum ctr1_dw {
> > > +	CTR1_DW_BYTE,
> > > +	CTR1_DW_HWORD,
> > > +	CTR1_DW_WORD,
> > > +	CTR1_DW_DWORD, /* Depends on HWCFGR1.G_M0_DATA_WIDTH_ENC and .G_M1_DATA_WIDTH_ENC */
> > > +};
> > > +
> > > +enum ctr1_pam {
> > > +	CTR1_PAM_0S_LT, /* if DDW > SDW, padded with 0s else left-truncated */
> > > +	CTR1_PAM_SE_RT, /* if DDW > SDW, sign extended else right-truncated */
> > > +	CTR1_PAM_PACK_UNPACK, /* FIFO queued */
> > > +};
> > > +
> > > +/* CxTR2 DMA channel x transfer register 2 */
> > > +#define CTR2_REQSEL			GENMASK(7, 0)
> > > +#define CTR2_SWREQ			BIT(9)
> > > +#define CTR2_DREQ			BIT(10)
> > > +#define CTR2_BREQ			BIT(11)
> > > +#define CTR2_PFREQ			BIT(12)
> > > +#define CTR2_TCEM			GENMASK(31, 30)
> > > +
> > > +enum ctr2_tcem {
> > > +	CTR2_TCEM_BLOCK,
> > > +	CTR2_TCEM_REPEAT_BLOCK,
> > > +	CTR2_TCEM_LLI,
> > > +	CTR2_TCEM_CHANNEL,
> > > +};
> > > +
> > > +/* CxBR1 DMA channel x block register 1 */
> > > +#define CBR1_BNDT			GENMASK(15, 0)
> > > +
> > > +/* CxLLR DMA channel x linked-list address register */
> > > +#define CLLR_LA				GENMASK(15, 2)
> > > +#define CLLR_ULL			BIT(16)
> > > +#define CLLR_UDA			BIT(27)
> > > +#define CLLR_USA			BIT(28)
> > > +#define CLLR_UB1			BIT(29)
> > > +#define CLLR_UT2			BIT(30)
> > > +#define CLLR_UT1			BIT(31)
> > > +
> > > +/* HWCFGR13 DMA hardware configuration register 13 x=8..15 */
> > > +/* HWCFGR12 DMA hardware configuration register 12 x=0..7 */
> > > +#define G_PER_CTRL(x)			(ULL(0x1) << (4 * (x)))
> > > +
> > > +/* HWCFGR4 DMA hardware configuration register 4 x=8..15 */
> > > +/* HWCFGR3 DMA hardware configuration register 3 x=0..7 */
> > > +#define G_FIFO_SIZE(x)			(ULL(0x7) << (4 * (x)))
> > > +
> > > +#define get_chan_hwcfg(x, mask, reg)	(((reg) & (mask)) >> (4 * (x)))
> > > +
> > > +/* HWCFGR2 DMA hardware configuration register 2 */
> > > +#define G_MAX_REQ_ID			GENMASK(7, 0)
> > > +
> > > +/* HWCFGR1 DMA hardware configuration register 1 */
> > > +#define G_MASTER_PORTS			GENMASK(2, 0)
> > > +#define G_NUM_CHANNELS			GENMASK(12, 8)
> > > +#define G_M0_DATA_WIDTH_ENC		GENMASK(25, 24)
> > > +#define G_M1_DATA_WIDTH_ENC		GENMASK(29, 28)
> > > +
> > > +enum stm32_dma3_master_ports {
> > > +	AXI64, /* 1x AXI: 64-bit port 0 */
> > > +	AHB32, /* 1x AHB: 32-bit port 0 */
> > > +	AHB32_AHB32, /* 2x AHB: 32-bit port 0 and 32-bit port 1 */
> > > +	AXI64_AHB32, /* 1x AXI 64-bit port 0 and 1x AHB 32-bit port 1 */
> > > +	AXI64_AXI64, /* 2x AXI: 64-bit port 0 and 64-bit port 1 */
> > > +	AXI128_AHB32, /* 1x AXI 128-bit port 0 and 1x AHB 32-bit port 1 */
> > > +};
> > > +
> > > +enum stm32_dma3_port_data_width {
> > > +	DW_32, /* 32-bit, for AHB */
> > > +	DW_64, /* 64-bit, for AXI */
> > > +	DW_128, /* 128-bit, for AXI */
> > > +	DW_INVALID,
> > > +};
> > > +
> > > +/* VERR DMA version register */
> > > +#define VERR_MINREV			GENMASK(3, 0)
> > > +#define VERR_MAJREV			GENMASK(7, 4)
> > > +
> > > +/* Device tree */
> > > +/* struct stm32_dma3_dt_conf */
> > > +/* .ch_conf */
> > > +#define STM32_DMA3_DT_PRIO		GENMASK(1, 0) /* CCR_PRIO */
> > > +#define STM32_DMA3_DT_FIFO		GENMASK(7, 4)
> > > +/* .tr_conf */
> > > +#define STM32_DMA3_DT_SINC		BIT(0) /* CTR1_SINC */
> > > +#define STM32_DMA3_DT_SAP		BIT(1) /* CTR1_SAP */
> > > +#define STM32_DMA3_DT_DINC		BIT(4) /* CTR1_DINC */
> > > +#define STM32_DMA3_DT_DAP		BIT(5) /* CTR1_DAP */
> > > +#define STM32_DMA3_DT_BREQ		BIT(8) /* CTR2_BREQ */
> > > +#define STM32_DMA3_DT_PFREQ		BIT(9) /* CTR2_PFREQ */
> > > +#define STM32_DMA3_DT_TCEM		GENMASK(13, 12) /* CTR2_TCEM */
> > > +
> > > +#define STM32_DMA3_MAX_BLOCK_SIZE	ALIGN_DOWN(CBR1_BNDT, 64)
> > > +#define port_is_ahb(maxdw)		({ typeof(maxdw) (_maxdw) = (maxdw); \
> > > +					   ((_maxdw) != DW_INVALID) && ((_maxdw) == DW_32); })
> > > +#define port_is_axi(maxdw)		({ typeof(maxdw) (_maxdw) = (maxdw); \
> > > +					   ((_maxdw) != DW_INVALID) && ((_maxdw) != DW_32); })
> > > +#define get_chan_max_dw(maxdw, maxburst)((port_is_ahb(maxdw) ||			     \
> > > +					  (maxburst) < DMA_SLAVE_BUSWIDTH_8_BYTES) ? \
> > > +					 DMA_SLAVE_BUSWIDTH_4_BYTES : DMA_SLAVE_BUSWIDTH_8_BYTES)
> > > +
> > > +/* Static linked-list data structure (depends on update bits UT1/UT2/UB1/USA/UDA/ULL) */
> > > +struct stm32_dma3_hwdesc {
> > > +	u32 ctr1;
> > > +	u32 ctr2;
> > > +	u32 cbr1;
> > > +	u32 csar;
> > > +	u32 cdar;
> > > +	u32 cllr;
> > > +} __aligned(32);
> > > +
> > > +/*
> > > + * CLLR_LA / sizeof(struct stm32_dma3_hwdesc) represents the number of hdwdesc that can be addressed
> > > + * by the pointer to the next linked-list data structure. The __aligned forces the 32-byte
> > > + * alignment. So use hardcoded 32. Multiplied by the max block size of each item, it represents
> > > + * the sg size limitation.
> > > + */
> > > +#define STM32_DMA3_MAX_SEG_SIZE		((CLLR_LA / 32) * STM32_DMA3_MAX_BLOCK_SIZE)
> > > +
> > > +/*
> > > + * Linked-list items
> > > + */
> > > +struct stm32_dma3_lli {
> > > +	struct stm32_dma3_hwdesc *hwdesc;
> > > +	dma_addr_t hwdesc_addr;
> > > +};
> > > +
> > > +struct stm32_dma3_swdesc {
> > > +	struct virt_dma_desc vdesc;
> > > +	u32 ccr;
> > > +	bool cyclic;
> > > +	u32 lli_size;
> > > +	struct stm32_dma3_lli lli[] __counted_by(lli_size);
> > > +};
> > > +
> > > +struct stm32_dma3_dt_conf {
> > > +	u32 ch_id;
> > > +	u32 req_line;
> > > +	u32 ch_conf;
> > > +	u32 tr_conf;
> > > +};
> > > +
> > > +struct stm32_dma3_chan {
> > > +	struct virt_dma_chan vchan;
> > > +	u32 id;
> > > +	int irq;
> > > +	u32 fifo_size;
> > > +	u32 max_burst;
> > > +	bool semaphore_mode;
> > > +	struct stm32_dma3_dt_conf dt_config;
> > > +	struct dma_slave_config dma_config;
> > > +	struct dma_pool *lli_pool;
> > > +	struct stm32_dma3_swdesc *swdesc;
> > > +	enum ctr2_tcem tcem;
> > > +	u32 dma_status;
> > > +};
> > > +
> > > +struct stm32_dma3_ddata {
> > > +	struct dma_device dma_dev;
> > > +	void __iomem *base;
> > > +	struct clk *clk;
> > > +	struct stm32_dma3_chan *chans;
> > > +	u32 dma_channels;
> > > +	u32 dma_requests;
> > > +	enum stm32_dma3_port_data_width ports_max_dw[2];
> > > +};
> > > +
> > > +static inline struct stm32_dma3_ddata *to_stm32_dma3_ddata(struct stm32_dma3_chan *chan)
> > > +{
> > > +	return container_of(chan->vchan.chan.device, struct stm32_dma3_ddata, dma_dev);
> > > +}
> > > +
> > > +static inline struct stm32_dma3_chan *to_stm32_dma3_chan(struct dma_chan *c)
> > > +{
> > > +	return container_of(c, struct stm32_dma3_chan, vchan.chan);
> > > +}
> > > +
> > > +static inline struct stm32_dma3_swdesc *to_stm32_dma3_swdesc(struct virt_dma_desc *vdesc)
> > > +{
> > > +	return container_of(vdesc, struct stm32_dma3_swdesc, vdesc);
> > > +}
> > > +
> > > +static struct device *chan2dev(struct stm32_dma3_chan *chan)
> > > +{
> > > +	return &chan->vchan.chan.dev->device;
> > > +}
> > > +
> > > +static void stm32_dma3_chan_dump_reg(struct stm32_dma3_chan *chan)
> > > +{
> > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > +	struct device *dev = chan2dev(chan);
> > > +	u32 id = chan->id, offset;
> > > +
> > > +	offset = STM32_DMA3_SECCFGR;
> > > +	dev_dbg(dev, "SECCFGR(0x%03x): %08x\n", offset, readl_relaxed(ddata->base + offset));
> > > +	offset = STM32_DMA3_PRIVCFGR;
> > > +	dev_dbg(dev, "PRIVCFGR(0x%03x): %08x\n", offset, readl_relaxed(ddata->base + offset));
> > > +	offset = STM32_DMA3_CCIDCFGR(id);
> > > +	dev_dbg(dev, "C%dCIDCFGR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > +	offset = STM32_DMA3_CSEMCR(id);
> > > +	dev_dbg(dev, "C%dSEMCR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > +	offset = STM32_DMA3_CSR(id);
> > > +	dev_dbg(dev, "C%dSR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > +	offset = STM32_DMA3_CCR(id);
> > > +	dev_dbg(dev, "C%dCR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > +	offset = STM32_DMA3_CTR1(id);
> > > +	dev_dbg(dev, "C%dTR1(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > +	offset = STM32_DMA3_CTR2(id);
> > > +	dev_dbg(dev, "C%dTR2(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > +	offset = STM32_DMA3_CBR1(id);
> > > +	dev_dbg(dev, "C%dBR1(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > +	offset = STM32_DMA3_CSAR(id);
> > > +	dev_dbg(dev, "C%dSAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > +	offset = STM32_DMA3_CDAR(id);
> > > +	dev_dbg(dev, "C%dDAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > +	offset = STM32_DMA3_CLLR(id);
> > > +	dev_dbg(dev, "C%dLLR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > +	offset = STM32_DMA3_CLBAR(id);
> > > +	dev_dbg(dev, "C%dLBAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > +}
> > > +
> > > +static void stm32_dma3_chan_dump_hwdesc(struct stm32_dma3_chan *chan,
> > > +					struct stm32_dma3_swdesc *swdesc)
> > > +{
> > > +	struct stm32_dma3_hwdesc *hwdesc;
> > > +	int i;
> > > +
> > > +	for (i = 0; i < swdesc->lli_size; i++) {
> > > +		hwdesc = swdesc->lli[i].hwdesc;
> > > +		if (i)
> > > +			dev_dbg(chan2dev(chan), "V\n");
> > > +		dev_dbg(chan2dev(chan), "[%d]@%pad\n", i, &swdesc->lli[i].hwdesc_addr);
> > > +		dev_dbg(chan2dev(chan), "| C%dTR1: %08x\n", chan->id, hwdesc->ctr1);
> > > +		dev_dbg(chan2dev(chan), "| C%dTR2: %08x\n", chan->id, hwdesc->ctr2);
> > > +		dev_dbg(chan2dev(chan), "| C%dBR1: %08x\n", chan->id, hwdesc->cbr1);
> > > +		dev_dbg(chan2dev(chan), "| C%dSAR: %08x\n", chan->id, hwdesc->csar);
> > > +		dev_dbg(chan2dev(chan), "| C%dDAR: %08x\n", chan->id, hwdesc->cdar);
> > > +		dev_dbg(chan2dev(chan), "| C%dLLR: %08x\n", chan->id, hwdesc->cllr);
> > > +	}
> > > +
> > > +	if (swdesc->cyclic) {
> > > +		dev_dbg(chan2dev(chan), "|\n");
> > > +		dev_dbg(chan2dev(chan), "-->[0]@%pad\n", &swdesc->lli[0].hwdesc_addr);
> > > +	} else {
> > > +		dev_dbg(chan2dev(chan), "X\n");
> > > +	}
> > > +}
> > > +
> > > +static struct stm32_dma3_swdesc *stm32_dma3_chan_desc_alloc(struct stm32_dma3_chan *chan, u32 count)
> > > +{
> > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > +	struct stm32_dma3_swdesc *swdesc;
> > > +	int i;
> > > +
> > > +	/*
> > > +	 * If the memory to be allocated for the number of hwdesc (6 u32 members but 32-bytes
> > > +	 * aligned) is greater than the maximum address of CLLR_LA, then the last items can't be
> > > +	 * addressed, so abort the allocation.
> > > +	 */
> > > +	if ((count * 32) > CLLR_LA) {
> > > +		dev_err(chan2dev(chan), "Transfer is too big (> %luB)\n", STM32_DMA3_MAX_SEG_SIZE);
> > > +		return NULL;
> > > +	}
> > > +
> > > +	swdesc = kzalloc(struct_size(swdesc, lli, count), GFP_NOWAIT);
> > > +	if (!swdesc)
> > > +		return NULL;
> > > +
> > > +	for (i = 0; i < count; i++) {
> > > +		swdesc->lli[i].hwdesc = dma_pool_zalloc(chan->lli_pool, GFP_NOWAIT,
> > > +							&swdesc->lli[i].hwdesc_addr);
> > > +		if (!swdesc->lli[i].hwdesc)
> > > +			goto err_pool_free;
> > > +	}
> > > +	swdesc->lli_size = count;
> > > +	swdesc->ccr = 0;
> > > +
> > > +	/* Set LL base address */
> > > +	writel_relaxed(swdesc->lli[0].hwdesc_addr & CLBAR_LBA,
> > > +		       ddata->base + STM32_DMA3_CLBAR(chan->id));
> > > +
> > > +	/* Set LL allocated port */
> > > +	swdesc->ccr &= ~CCR_LAP;
> > > +
> > > +	return swdesc;
> > > +
> > > +err_pool_free:
> > > +	dev_err(chan2dev(chan), "Failed to alloc descriptors\n");
> > > +	while (--i >= 0)
> > > +		dma_pool_free(chan->lli_pool, swdesc->lli[i].hwdesc, swdesc->lli[i].hwdesc_addr);
> > > +	kfree(swdesc);
> > > +
> > > +	return NULL;
> > > +}
> > > +
> > > +static void stm32_dma3_chan_desc_free(struct stm32_dma3_chan *chan,
> > > +				      struct stm32_dma3_swdesc *swdesc)
> > > +{
> > > +	int i;
> > > +
> > > +	for (i = 0; i < swdesc->lli_size; i++)
> > > +		dma_pool_free(chan->lli_pool, swdesc->lli[i].hwdesc, swdesc->lli[i].hwdesc_addr);
> > > +
> > > +	kfree(swdesc);
> > > +}
> > > +
> > > +static void stm32_dma3_chan_vdesc_free(struct virt_dma_desc *vdesc)
> > > +{
> > > +	struct stm32_dma3_swdesc *swdesc = to_stm32_dma3_swdesc(vdesc);
> > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(vdesc->tx.chan);
> > > +
> > > +	stm32_dma3_chan_desc_free(chan, swdesc);
> > > +}
> > > +
> > > +static void stm32_dma3_check_user_setting(struct stm32_dma3_chan *chan)
> > > +{
> > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > +	struct device *dev = chan2dev(chan);
> > > +	u32 ctr1 = readl_relaxed(ddata->base + STM32_DMA3_CTR1(chan->id));
> > > +	u32 cbr1 = readl_relaxed(ddata->base + STM32_DMA3_CBR1(chan->id));
> > > +	u32 csar = readl_relaxed(ddata->base + STM32_DMA3_CSAR(chan->id));
> > > +	u32 cdar = readl_relaxed(ddata->base + STM32_DMA3_CDAR(chan->id));
> > > +	u32 cllr = readl_relaxed(ddata->base + STM32_DMA3_CLLR(chan->id));
> > > +	u32 bndt = FIELD_GET(CBR1_BNDT, cbr1);
> > > +	u32 sdw = 1 << FIELD_GET(CTR1_SDW_LOG2, ctr1);
> > > +	u32 ddw = 1 << FIELD_GET(CTR1_DDW_LOG2, ctr1);
> > > +	u32 sap = FIELD_GET(CTR1_SAP, ctr1);
> > > +	u32 dap = FIELD_GET(CTR1_DAP, ctr1);
> > > +
> > > +	if (!bndt && !FIELD_GET(CLLR_UB1, cllr))
> > > +		dev_err(dev, "null source block size and no update of this value\n");
> > > +	if (bndt % sdw)
> > > +		dev_err(dev, "source block size not multiple of src data width\n");
> > > +	if (FIELD_GET(CTR1_PAM, ctr1) == CTR1_PAM_PACK_UNPACK && bndt % ddw)
> > > +		dev_err(dev, "(un)packing mode w/ src block size not multiple of dst data width\n");
> > > +	if (csar % sdw)
> > > +		dev_err(dev, "unaligned source address not multiple of src data width\n");
> > > +	if (cdar % ddw)
> > > +		dev_err(dev, "unaligned destination address not multiple of dst data width\n");
> > > +	if (sdw == DMA_SLAVE_BUSWIDTH_8_BYTES && port_is_ahb(ddata->ports_max_dw[sap]))
> > > +		dev_err(dev, "double-word source data width not supported on port %u\n", sap);
> > > +	if (ddw == DMA_SLAVE_BUSWIDTH_8_BYTES && port_is_ahb(ddata->ports_max_dw[dap]))
> > > +		dev_err(dev, "double-word destination data width not supported on port %u\n", dap);
> > 
> > NO error/abort if this is wrong...?
> > 
> 
> A user setting error triggers an interrupt caught in the
> stm32_dma3_chan_irq() interrupt handler.
> Indeed, a user setting error can occur when enabling the channel or when
> DMA3 registers are updated with each linked-list item.
> In the interrupt handler, when USEF (User Setting Error Flag) is set, this
> function (stm32_dma3_check_user_setting) helps the user understand what
> went wrong. The hardware automatically disables the channel to prevent the
> execution of the wrongly programmed transfer, and the driver resets the
> channel and sets chan->dma_status = DMA_ERROR. dmaengine_tx_status() will
> then return DMA_ERROR.
> So from the user's point of view, the transfer will never complete, and
> the channel is ready to be reprogrammed.
> Note that in the _prep_ functions, everything is checked to avoid user
> setting errors. If a user setting error still occurs, it is rather due to
> a corrupted linked-list item (which should fortunately never happen).
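> 
> As a minimal illustrative sketch (not from the patch), a client would
> typically observe this as follows, assuming chan and cookie come from the
> usual dmaengine channel request and descriptor submission calls:
> 
> 	struct dma_tx_state state;
> 
> 	if (dmaengine_tx_status(chan, cookie, &state) == DMA_ERROR) {
> 		/* Transfer will never complete: clean up, then reprogram */
> 		dmaengine_terminate_sync(chan);
> 	}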
> 
> > > +}
> > > +
> > > +static void stm32_dma3_chan_prep_hwdesc(struct stm32_dma3_chan *chan,
> > > +					struct stm32_dma3_swdesc *swdesc,
> > > +					u32 curr, dma_addr_t src, dma_addr_t dst, u32 len,
> > > +					u32 ctr1, u32 ctr2, bool is_last, bool is_cyclic)
> > > +{
> > > +	struct stm32_dma3_hwdesc *hwdesc;
> > > +	dma_addr_t next_lli;
> > > +	u32 next = curr + 1;
> > > +
> > > +	hwdesc = swdesc->lli[curr].hwdesc;
> > > +	hwdesc->ctr1 = ctr1;
> > > +	hwdesc->ctr2 = ctr2;
> > > +	hwdesc->cbr1 = FIELD_PREP(CBR1_BNDT, len);
> > > +	hwdesc->csar = src;
> > > +	hwdesc->cdar = dst;
> > > +
> > > +	if (is_last) {
> > > +		if (is_cyclic)
> > > +			next_lli = swdesc->lli[0].hwdesc_addr;
> > > +		else
> > > +			next_lli = 0;
> > > +	} else {
> > > +		next_lli = swdesc->lli[next].hwdesc_addr;
> > > +	}
> > > +
> > > +	hwdesc->cllr = 0;
> > > +	if (next_lli) {
> > > +		hwdesc->cllr |= CLLR_UT1 | CLLR_UT2 | CLLR_UB1;
> > > +		hwdesc->cllr |= CLLR_USA | CLLR_UDA | CLLR_ULL;
> > > +		hwdesc->cllr |= (next_lli & CLLR_LA);
> > > +	}
> > > +}
> > > +
> > > +static enum dma_slave_buswidth stm32_dma3_get_max_dw(u32 chan_max_burst,
> > > +						     enum stm32_dma3_port_data_width port_max_dw,
> > > +						     u32 len, dma_addr_t addr)
> > > +{
> > > +	enum dma_slave_buswidth max_dw = get_chan_max_dw(port_max_dw, chan_max_burst);
> > > +
> > > +	/* len and addr must be a multiple of dw */
> > > +	return 1 << __ffs(len | addr | max_dw);
> > > +}
> > > +
> > > +static u32 stm32_dma3_get_max_burst(u32 len, enum dma_slave_buswidth dw, u32 chan_max_burst)
> > > +{
> > > +	u32 max_burst = chan_max_burst ? chan_max_burst / dw : 1;
> > > +
> > > +	/* len is a multiple of dw, so if len is < chan_max_burst, shorten burst */
> > > +	if (len < chan_max_burst)
> > > +		max_burst = len / dw;
> > > +
> > > +	/*
> > > +	 * HW doesn't modify the burst if burst size <= half of the fifo size.
> > > +	 * If len is not a multiple of burst size, last burst is shortened by HW.
> > > +	 */
> > > +	return max_burst;
> > > +}
> > > +
> > > +static int stm32_dma3_chan_prep_hw(struct stm32_dma3_chan *chan, enum dma_transfer_direction dir,
> > > +				   u32 *ccr, u32 *ctr1, u32 *ctr2,
> > > +				   dma_addr_t src_addr, dma_addr_t dst_addr, u32 len)
> > > +{
> > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > +	struct dma_device dma_device = ddata->dma_dev;
> > > +	u32 sdw, ddw, sbl_max, dbl_max, tcem;
> > > +	u32 _ctr1 = 0, _ctr2 = 0;
> > > +	u32 ch_conf = chan->dt_config.ch_conf;
> > > +	u32 tr_conf = chan->dt_config.tr_conf;
> > > +	u32 sap = FIELD_GET(STM32_DMA3_DT_SAP, tr_conf), sap_max_dw;
> > > +	u32 dap = FIELD_GET(STM32_DMA3_DT_DAP, tr_conf), dap_max_dw;
> > > +
> > > +	dev_dbg(chan2dev(chan), "%s from %pad to %pad\n",
> > > +		dmaengine_get_direction_text(dir), &src_addr, &dst_addr);
> > > +
> > > +	sdw = chan->dma_config.src_addr_width ? : get_chan_max_dw(sap, chan->max_burst);
> > > +	ddw = chan->dma_config.dst_addr_width ? : get_chan_max_dw(dap, chan->max_burst);
> > > +	sbl_max = chan->dma_config.src_maxburst ? : 1;
> > > +	dbl_max = chan->dma_config.dst_maxburst ? : 1;
> > > +
> > > +	/* Following conditions would raise User Setting Error interrupt */
> > > +	if (!(dma_device.src_addr_widths & BIT(sdw)) || !(dma_device.dst_addr_widths & BIT(ddw))) {
> > > +		dev_err(chan2dev(chan), "Bus width (src=%u, dst=%u) not supported\n", sdw, ddw);
> > > +		return -EINVAL;
> > > +	}
> > > +
> > > +	if (ddata->ports_max_dw[1] == DW_INVALID && (sap || dap)) {
> > > +		dev_err(chan2dev(chan), "Only one master port, port 1 is not supported\n");
> > > +		return -EINVAL;
> > > +	}
> > > +
> > > +	sap_max_dw = ddata->ports_max_dw[sap];
> > > +	dap_max_dw = ddata->ports_max_dw[dap];
> > > +	if ((port_is_ahb(sap_max_dw) && sdw == DMA_SLAVE_BUSWIDTH_8_BYTES) ||
> > > +	    (port_is_ahb(dap_max_dw) && ddw == DMA_SLAVE_BUSWIDTH_8_BYTES)) {
> > > +		dev_err(chan2dev(chan),
> > > +			"8 bytes buswidth (src=%u, dst=%u) not supported on port (sap=%u, dap=%u\n",
> > > +			sdw, ddw, sap, dap);
> > > +		return -EINVAL;
> > > +	}
> > > +
> > > +	if (FIELD_GET(STM32_DMA3_DT_SINC, tr_conf))
> > > +		_ctr1 |= CTR1_SINC;
> > > +	if (sap)
> > > +		_ctr1 |= CTR1_SAP;
> > > +	if (FIELD_GET(STM32_DMA3_DT_DINC, tr_conf))
> > > +		_ctr1 |= CTR1_DINC;
> > > +	if (dap)
> > > +		_ctr1 |= CTR1_DAP;
> > > +
> > > +	_ctr2 |= FIELD_PREP(CTR2_REQSEL, chan->dt_config.req_line) & ~CTR2_SWREQ;
> > > +	if (FIELD_GET(STM32_DMA3_DT_BREQ, tr_conf))
> > > +		_ctr2 |= CTR2_BREQ;
> > > +	if (dir == DMA_DEV_TO_MEM && FIELD_GET(STM32_DMA3_DT_PFREQ, tr_conf))
> > > +		_ctr2 |= CTR2_PFREQ;
> > > +	tcem = FIELD_GET(STM32_DMA3_DT_TCEM, tr_conf);
> > > +	_ctr2 |= FIELD_PREP(CTR2_TCEM, tcem);
> > > +
> > > +	/* Store TCEM to know on which event TC flag occurred */
> > > +	chan->tcem = tcem;
> > > +	/* Store direction for residue computation */
> > > +	chan->dma_config.direction = dir;
> > > +
> > > +	switch (dir) {
> > > +	case DMA_MEM_TO_DEV:
> > > +		/* Set destination (device) data width and burst */
> > > +		ddw = min_t(u32, ddw, stm32_dma3_get_max_dw(chan->max_burst, dap_max_dw,
> > > +							    len, dst_addr));
> > > +		dbl_max = min_t(u32, dbl_max, stm32_dma3_get_max_burst(len, ddw, chan->max_burst));
> > > +
> > > +		/* Set source (memory) data width and burst */
> > > +		sdw = stm32_dma3_get_max_dw(chan->max_burst, sap_max_dw, len, src_addr);
> > > +		sbl_max = stm32_dma3_get_max_burst(len, sdw, chan->max_burst);
> > > +
> > > +		_ctr1 |= FIELD_PREP(CTR1_SDW_LOG2, ilog2(sdw));
> > > +		_ctr1 |= FIELD_PREP(CTR1_SBL_1, sbl_max - 1);
> > > +		_ctr1 |= FIELD_PREP(CTR1_DDW_LOG2, ilog2(ddw));
> > > +		_ctr1 |= FIELD_PREP(CTR1_DBL_1, dbl_max - 1);
> > > +
> > > +		if (ddw != sdw) {
> > > +			_ctr1 |= FIELD_PREP(CTR1_PAM, CTR1_PAM_PACK_UNPACK);
> > > +			/* Should never reach this case as ddw is clamped down */
> > > +			if (len & (ddw - 1)) {
> > > +				dev_err(chan2dev(chan),
> > > +					"Packing mode is enabled and len is not multiple of ddw");
> > > +				return -EINVAL;
> > > +			}
> > > +		}
> > > +
> > > +		/* dst = dev */
> > > +		_ctr2 |= CTR2_DREQ;
> > > +
> > > +		break;
> > > +
> > > +	case DMA_DEV_TO_MEM:
> > > +		/* Set source (device) data width and burst */
> > > +		sdw = min_t(u32, sdw, stm32_dma3_get_max_dw(chan->max_burst, sap_max_dw,
> > > +							    len, src_addr));
> > > +		sbl_max = min_t(u32, sbl_max, stm32_dma3_get_max_burst(len, sdw, chan->max_burst));
> > > +
> > > +		/* Set destination (memory) data width and burst */
> > > +		ddw = stm32_dma3_get_max_dw(chan->max_burst, dap_max_dw, len, dst_addr);
> > > +		dbl_max = stm32_dma3_get_max_burst(len, ddw, chan->max_burst);
> > > +
> > > +		_ctr1 |= FIELD_PREP(CTR1_SDW_LOG2, ilog2(sdw));
> > > +		_ctr1 |= FIELD_PREP(CTR1_SBL_1, sbl_max - 1);
> > > +		_ctr1 |= FIELD_PREP(CTR1_DDW_LOG2, ilog2(ddw));
> > > +		_ctr1 |= FIELD_PREP(CTR1_DBL_1, dbl_max - 1);
> > > +
> > > +		if (ddw != sdw) {
> > > +			_ctr1 |= FIELD_PREP(CTR1_PAM, CTR1_PAM_PACK_UNPACK);
> > > +			/* Should never reach this case as ddw is clamped down */
> > > +			if (len & (ddw - 1)) {
> > > +				dev_err(chan2dev(chan),
> > > +					"Packing mode is enabled and len is not multiple of ddw\n");
> > > +				return -EINVAL;
> > > +			}
> > > +		}
> > > +
> > > +		/* dst = mem */
> > > +		_ctr2 &= ~CTR2_DREQ;
> > > +
> > > +		break;
> > > +
> > > +	default:
> > > +		dev_err(chan2dev(chan), "Direction %s not supported\n",
> > > +			dmaengine_get_direction_text(dir));
> > > +		return -EINVAL;
> > > +	}
> > > +
> > > +	*ccr |= FIELD_PREP(CCR_PRIO, FIELD_GET(STM32_DMA3_DT_PRIO, ch_conf));
> > > +	*ctr1 = _ctr1;
> > > +	*ctr2 = _ctr2;
> > > +
> > > +	dev_dbg(chan2dev(chan), "%s: sdw=%u bytes sbl=%u beats ddw=%u bytes dbl=%u beats\n",
> > > +		__func__, sdw, sbl_max, ddw, dbl_max);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static void stm32_dma3_chan_start(struct stm32_dma3_chan *chan)
> > > +{
> > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > +	struct virt_dma_desc *vdesc;
> > > +	struct stm32_dma3_hwdesc *hwdesc;
> > > +	u32 id = chan->id;
> > > +	u32 csr, ccr;
> > > +
> > > +	vdesc = vchan_next_desc(&chan->vchan);
> > > +	if (!vdesc) {
> > > +		chan->swdesc = NULL;
> > > +		return;
> > > +	}
> > > +	list_del(&vdesc->node);
> > > +
> > > +	chan->swdesc = to_stm32_dma3_swdesc(vdesc);
> > > +	hwdesc = chan->swdesc->lli[0].hwdesc;
> > > +
> > > +	stm32_dma3_chan_dump_hwdesc(chan, chan->swdesc);
> > > +
> > > +	writel_relaxed(chan->swdesc->ccr, ddata->base + STM32_DMA3_CCR(id));
> > > +	writel_relaxed(hwdesc->ctr1, ddata->base + STM32_DMA3_CTR1(id));
> > > +	writel_relaxed(hwdesc->ctr2, ddata->base + STM32_DMA3_CTR2(id));
> > > +	writel_relaxed(hwdesc->cbr1, ddata->base + STM32_DMA3_CBR1(id));
> > > +	writel_relaxed(hwdesc->csar, ddata->base + STM32_DMA3_CSAR(id));
> > > +	writel_relaxed(hwdesc->cdar, ddata->base + STM32_DMA3_CDAR(id));
> > > +	writel_relaxed(hwdesc->cllr, ddata->base + STM32_DMA3_CLLR(id));
> > > +
> > > +	/* Clear any pending interrupts */
> > > +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(id));
> > > +	if (csr & CSR_ALL_F)
> > > +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(id));
> > > +
> > > +	stm32_dma3_chan_dump_reg(chan);
> > > +
> > > +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(id));
> > > +	writel_relaxed(ccr | CCR_EN, ddata->base + STM32_DMA3_CCR(id));
> > > +
> > > +	chan->dma_status = DMA_IN_PROGRESS;
> > > +
> > > +	dev_dbg(chan2dev(chan), "vchan %pK: started\n", &chan->vchan);
> > > +}
> > > +
> > > +static int stm32_dma3_chan_suspend(struct stm32_dma3_chan *chan, bool susp)
> > > +{
> > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > +	u32 csr, ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
> > > +	int ret = 0;
> > > +
> > > +	if (susp)
> > > +		ccr |= CCR_SUSP;
> > > +	else
> > > +		ccr &= ~CCR_SUSP;
> > > +
> > > +	writel_relaxed(ccr, ddata->base + STM32_DMA3_CCR(chan->id));
> > > +
> > > +	if (susp) {
> > > +		ret = readl_relaxed_poll_timeout_atomic(ddata->base + STM32_DMA3_CSR(chan->id), csr,
> > > +							csr & CSR_SUSPF, 1, 10);
> > > +		if (!ret)
> > > +			writel_relaxed(CFCR_SUSPF, ddata->base + STM32_DMA3_CFCR(chan->id));
> > > +
> > > +		stm32_dma3_chan_dump_reg(chan);
> > > +	}
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +static void stm32_dma3_chan_reset(struct stm32_dma3_chan *chan)
> > > +{
> > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > +	u32 ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
> > > +
> > > +	writel_relaxed(ccr |= CCR_RESET, ddata->base + STM32_DMA3_CCR(chan->id));
> > > +}
> > > +
> > > +static int stm32_dma3_chan_stop(struct stm32_dma3_chan *chan)
> > > +{
> > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > +	u32 ccr;
> > > +	int ret = 0;
> > > +
> > > +	chan->dma_status = DMA_COMPLETE;
> > > +
> > > +	/* Disable interrupts */
> > > +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id));
> > > +	writel_relaxed(ccr & ~(CCR_ALLIE | CCR_EN), ddata->base + STM32_DMA3_CCR(chan->id));
> > > +
> > > +	if (!(ccr & CCR_SUSP) && (ccr & CCR_EN)) {
> > > +		/* Suspend the channel */
> > > +		ret = stm32_dma3_chan_suspend(chan, true);
> > > +		if (ret)
> > > +			dev_warn(chan2dev(chan), "%s: timeout, data might be lost\n", __func__);
> > > +	}
> > > +
> > > +	/*
> > > +	 * Reset the channel: this causes the reset of the FIFO and the reset of the channel
> > > +	 * internal state, the reset of CCR_EN and CCR_SUSP bits.
> > > +	 */
> > > +	stm32_dma3_chan_reset(chan);
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +static void stm32_dma3_chan_complete(struct stm32_dma3_chan *chan)
> > > +{
> > > +	if (!chan->swdesc)
> > > +		return;
> > > +
> > > +	vchan_cookie_complete(&chan->swdesc->vdesc);
> > > +	chan->swdesc = NULL;
> > > +	stm32_dma3_chan_start(chan);
> > > +}
> > > +
> > > +static irqreturn_t stm32_dma3_chan_irq(int irq, void *devid)
> > > +{
> > > +	struct stm32_dma3_chan *chan = devid;
> > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > +	u32 misr, csr, ccr;
> > > +
> > > +	spin_lock(&chan->vchan.lock);
> > > +
> > > +	misr = readl_relaxed(ddata->base + STM32_DMA3_MISR);
> > > +	if (!(misr & MISR_MIS(chan->id))) {
> > > +		spin_unlock(&chan->vchan.lock);
> > > +		return IRQ_NONE;
> > > +	}
> > > +
> > > +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(chan->id));
> > > +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & CCR_ALLIE;
> > > +
> > > +	if (csr & CSR_TCF && ccr & CCR_TCIE) {
> > > +		if (chan->swdesc->cyclic)
> > > +			vchan_cyclic_callback(&chan->swdesc->vdesc);
> > > +		else
> > > +			stm32_dma3_chan_complete(chan);
> > > +	}
> > > +
> > > +	if (csr & CSR_USEF && ccr & CCR_USEIE) {
> > > +		dev_err(chan2dev(chan), "User setting error\n");
> > > +		chan->dma_status = DMA_ERROR;
> > > +		/* CCR.EN automatically cleared by HW */
> > > +		stm32_dma3_check_user_setting(chan);
> > > +		stm32_dma3_chan_reset(chan);
> > > +	}
> > > +
> > > +	if (csr & CSR_ULEF && ccr & CCR_ULEIE) {
> > > +		dev_err(chan2dev(chan), "Update link transfer error\n");
> > > +		chan->dma_status = DMA_ERROR;
> > > +		/* CCR.EN automatically cleared by HW */
> > > +		stm32_dma3_chan_reset(chan);
> > > +	}
> > > +
> > > +	if (csr & CSR_DTEF && ccr & CCR_DTEIE) {
> > > +		dev_err(chan2dev(chan), "Data transfer error\n");
> > > +		chan->dma_status = DMA_ERROR;
> > > +		/* CCR.EN automatically cleared by HW */
> > > +		stm32_dma3_chan_reset(chan);
> > > +	}
> > > +
> > > +	/*
> > > +	 * Half Transfer Interrupt may be disabled but Half Transfer Flag can be set,
> > > +	 * ensure HTF flag to be cleared, with other flags.
> > > +	 */
> > > +	csr &= (ccr | CCR_HTIE);
> > > +
> > > +	if (csr)
> > > +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(chan->id));
> > > +
> > > +	spin_unlock(&chan->vchan.lock);
> > > +
> > > +	return IRQ_HANDLED;
> > > +}
> > > +
> > > +static int stm32_dma3_alloc_chan_resources(struct dma_chan *c)
> > > +{
> > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > +	u32 id = chan->id, csemcr, ccid;
> > > +	int ret;
> > > +
> > > +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
> > > +	if (ret < 0)
> > > +		return ret;
> > > +
> > > +	/* Ensure the channel is free */
> > > +	if (chan->semaphore_mode &&
> > > +	    readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id)) & CSEMCR_SEM_MUTEX) {
> > > +		ret = -EBUSY;
> > > +		goto err_put_sync;
> > > +	}
> > > +
> > > +	chan->lli_pool = dmam_pool_create(dev_name(&c->dev->device), c->device->dev,
> > > +					  sizeof(struct stm32_dma3_hwdesc),
> > > +					  __alignof__(struct stm32_dma3_hwdesc), 0);
> > > +	if (!chan->lli_pool) {
> > > +		dev_err(chan2dev(chan), "Failed to create LLI pool\n");
> > > +		ret = -ENOMEM;
> > > +		goto err_put_sync;
> > > +	}
> > > +
> > > +	/* Take the channel semaphore */
> > > +	if (chan->semaphore_mode) {
> > > +		writel_relaxed(CSEMCR_SEM_MUTEX, ddata->base + STM32_DMA3_CSEMCR(id));
> > > +		csemcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(id));
> > > +		ccid = FIELD_GET(CSEMCR_SEM_CCID, csemcr);
> > > +		/* Check that the channel is well taken */
> > > +		if (ccid != CCIDCFGR_CID1) {
> > > +			dev_err(chan2dev(chan), "Not under CID1 control (in-use by CID%d)\n", ccid);
> > > +			ret = -EPERM;
> > > +			goto err_pool_destroy;
> > > +		}
> > > +		dev_dbg(chan2dev(chan), "Under CID1 control (semcr=0x%08x)\n", csemcr);
> > > +	}
> > > +
> > > +	return 0;
> > > +
> > > +err_pool_destroy:
> > > +	dmam_pool_destroy(chan->lli_pool);
> > > +	chan->lli_pool = NULL;
> > > +
> > > +err_put_sync:
> > > +	pm_runtime_put_sync(ddata->dma_dev.dev);
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +static void stm32_dma3_free_chan_resources(struct dma_chan *c)
> > > +{
> > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > +	unsigned long flags;
> > > +
> > > +	/* Ensure channel is in idle state */
> > > +	spin_lock_irqsave(&chan->vchan.lock, flags);
> > > +	stm32_dma3_chan_stop(chan);
> > > +	chan->swdesc = NULL;
> > > +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
> > > +
> > > +	vchan_free_chan_resources(to_virt_chan(c));
> > > +
> > > +	dmam_pool_destroy(chan->lli_pool);
> > > +	chan->lli_pool = NULL;
> > > +
> > > +	/* Release the channel semaphore */
> > > +	if (chan->semaphore_mode)
> > > +		writel_relaxed(0, ddata->base + STM32_DMA3_CSEMCR(chan->id));
> > > +
> > > +	pm_runtime_put_sync(ddata->dma_dev.dev);
> > > +
> > > +	/* Reset configuration */
> > > +	memset(&chan->dt_config, 0, sizeof(chan->dt_config));
> > > +	memset(&chan->dma_config, 0, sizeof(chan->dma_config));
> > > +}
> > > +
> > > +static struct dma_async_tx_descriptor *stm32_dma3_prep_slave_sg(struct dma_chan *c,
> > > +								struct scatterlist *sgl,
> > > +								unsigned int sg_len,
> > > +								enum dma_transfer_direction dir,
> > > +								unsigned long flags, void *context)
> > > +{
> > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > +	struct stm32_dma3_swdesc *swdesc;
> > > +	struct scatterlist *sg;
> > > +	size_t len;
> > > +	dma_addr_t sg_addr, dev_addr, src, dst;
> > > +	u32 i, j, count, ctr1, ctr2;
> > > +	int ret;
> > > +
> > > +	count = sg_len;
> > > +	for_each_sg(sgl, sg, sg_len, i) {
> > > +		len = sg_dma_len(sg);
> > > +		if (len > STM32_DMA3_MAX_BLOCK_SIZE)
> > > +			count += DIV_ROUND_UP(len, STM32_DMA3_MAX_BLOCK_SIZE) - 1;
> > > +	}
> > > +
> > > +	swdesc = stm32_dma3_chan_desc_alloc(chan, count);
> > > +	if (!swdesc)
> > > +		return NULL;
> > > +
> > > +	/* sg_len and i correspond to the initial sgl; count and j correspond to the hwdesc LL */
> > > +	j = 0;
> > > +	for_each_sg(sgl, sg, sg_len, i) {
> > > +		sg_addr = sg_dma_address(sg);
> > > +		dev_addr = (dir == DMA_MEM_TO_DEV) ? chan->dma_config.dst_addr :
> > > +						     chan->dma_config.src_addr;
> > > +		len = sg_dma_len(sg);
> > > +
> > > +		do {
> > > +			size_t chunk = min_t(size_t, len, STM32_DMA3_MAX_BLOCK_SIZE);
> > > +
> > > +			if (dir == DMA_MEM_TO_DEV) {
> > > +				src = sg_addr;
> > > +				dst = dev_addr;
> > > +
> > > +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
> > > +							      src, dst, chunk);
> > > +
> > > +				if (FIELD_GET(CTR1_DINC, ctr1))
> > > +					dev_addr += chunk;
> > > +			} else { /* (dir == DMA_DEV_TO_MEM || dir == DMA_MEM_TO_MEM) */
> > > +				src = dev_addr;
> > > +				dst = sg_addr;
> > > +
> > > +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
> > > +							      src, dst, chunk);
> > > +
> > > +				if (FIELD_GET(CTR1_SINC, ctr1))
> > > +					dev_addr += chunk;
> > > +			}
> > > +
> > > +			if (ret)
> > > +				goto err_desc_free;
> > > +
> > > +			stm32_dma3_chan_prep_hwdesc(chan, swdesc, j, src, dst, chunk,
> > > +						    ctr1, ctr2, j == (count - 1), false);
> > > +
> > > +			sg_addr += chunk;
> > > +			len -= chunk;
> > > +			j++;
> > > +		} while (len);
> > > +	}
> > > +
> > > +	/* Enable Error interrupts */
> > > +	swdesc->ccr |= CCR_USEIE | CCR_ULEIE | CCR_DTEIE;
> > > +	/* Enable Transfer state interrupts */
> > > +	swdesc->ccr |= CCR_TCIE;
> > > +
> > > +	swdesc->cyclic = false;
> > > +
> > > +	return vchan_tx_prep(&chan->vchan, &swdesc->vdesc, flags);
> > > +
> > > +err_desc_free:
> > > +	stm32_dma3_chan_desc_free(chan, swdesc);
> > > +
> > > +	return NULL;
> > > +}
> > > +
> > > +static void stm32_dma3_caps(struct dma_chan *c, struct dma_slave_caps *caps)
> > > +{
> > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > +
> > > +	if (!chan->fifo_size) {
> > > +		caps->max_burst = 0;
> > > +		caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > +		caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > +	} else {
> > > +		/* Burst transfer should not exceed half of the fifo size */
> > > +		caps->max_burst = chan->max_burst;
> > > +		if (caps->max_burst < DMA_SLAVE_BUSWIDTH_8_BYTES) {
> > > +			caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > +			caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > +		}
> > > +	}
> > > +}
> > > +
> > > +static int stm32_dma3_config(struct dma_chan *c, struct dma_slave_config *config)
> > > +{
> > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > +
> > > +	memcpy(&chan->dma_config, config, sizeof(*config));
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static int stm32_dma3_terminate_all(struct dma_chan *c)
> > > +{
> > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > +	unsigned long flags;
> > > +	LIST_HEAD(head);
> > > +
> > > +	spin_lock_irqsave(&chan->vchan.lock, flags);
> > > +
> > > +	if (chan->swdesc) {
> > > +		vchan_terminate_vdesc(&chan->swdesc->vdesc);
> > > +		chan->swdesc = NULL;
> > > +	}
> > > +
> > > +	stm32_dma3_chan_stop(chan);
> > > +
> > > +	vchan_get_all_descriptors(&chan->vchan, &head);
> > > +
> > > +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
> > > +	vchan_dma_desc_free_list(&chan->vchan, &head);
> > > +
> > > +	dev_dbg(chan2dev(chan), "vchan %pK: terminated\n", &chan->vchan);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static void stm32_dma3_synchronize(struct dma_chan *c)
> > > +{
> > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > +
> > > +	vchan_synchronize(&chan->vchan);
> > > +}
> > > +
> > > +static void stm32_dma3_issue_pending(struct dma_chan *c)
> > > +{
> > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > +	unsigned long flags;
> > > +
> > > +	spin_lock_irqsave(&chan->vchan.lock, flags);
> > > +
> > > +	if (vchan_issue_pending(&chan->vchan) && !chan->swdesc) {
> > > +		dev_dbg(chan2dev(chan), "vchan %pK: issued\n", &chan->vchan);
> > > +		stm32_dma3_chan_start(chan);
> > > +	}
> > > +
> > > +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
> > > +}
> > > +
> > > +static bool stm32_dma3_filter_fn(struct dma_chan *c, void *fn_param)
> > > +{
> > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > +	struct stm32_dma3_dt_conf *conf = fn_param;
> > > +	u32 mask, semcr;
> > > +	int ret;
> > > +
> > > +	dev_dbg(c->device->dev, "%s(%s): req_line=%d ch_conf=%08x tr_conf=%08x\n",
> > > +		__func__, dma_chan_name(c), conf->req_line, conf->ch_conf, conf->tr_conf);
> > > +
> > > +	if (!of_property_read_u32(c->device->dev->of_node, "dma-channel-mask", &mask))
> > > +		if (!(mask & BIT(chan->id)))
> > > +			return false;
> > > +
> > > +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
> > > +	if (ret < 0)
> > > +		return false;
> > > +	semcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id));
> > > +	pm_runtime_put_sync(ddata->dma_dev.dev);
> > > +
> > > +	/* Check if chan is free */
> > > +	if (semcr & CSEMCR_SEM_MUTEX)
> > > +		return false;
> > > +
> > > +	/* Check if chan fifo fits well */
> > > +	if (FIELD_GET(STM32_DMA3_DT_FIFO, conf->ch_conf) != chan->fifo_size)
> > > +		return false;
> > > +
> > > +	return true;
> > > +}
> > > +
> > > +static struct dma_chan *stm32_dma3_of_xlate(struct of_phandle_args *dma_spec, struct of_dma *ofdma)
> > > +{
> > > +	struct stm32_dma3_ddata *ddata = ofdma->of_dma_data;
> > > +	dma_cap_mask_t mask = ddata->dma_dev.cap_mask;
> > > +	struct stm32_dma3_dt_conf conf;
> > > +	struct stm32_dma3_chan *chan;
> > > +	struct dma_chan *c;
> > > +
> > > +	if (dma_spec->args_count < 3) {
> > > +		dev_err(ddata->dma_dev.dev, "Invalid args count\n");
> > > +		return NULL;
> > > +	}
> > > +
> > > +	conf.req_line = dma_spec->args[0];
> > > +	conf.ch_conf = dma_spec->args[1];
> > > +	conf.tr_conf = dma_spec->args[2];
> > > +
> > > +	if (conf.req_line >= ddata->dma_requests) {
> > > +		dev_err(ddata->dma_dev.dev, "Invalid request line\n");
> > > +		return NULL;
> > > +	}
> > > +
> > > +	/* Request dma channel among the generic dma controller list */
> > > +	c = dma_request_channel(mask, stm32_dma3_filter_fn, &conf);
> > > +	if (!c) {
> > > +		dev_err(ddata->dma_dev.dev, "No suitable channel found\n");
> > > +		return NULL;
> > > +	}
> > > +
> > > +	chan = to_stm32_dma3_chan(c);
> > > +	chan->dt_config = conf;
> > > +
> > > +	return c;
> > > +}
> > > +
> > > +static u32 stm32_dma3_check_rif(struct stm32_dma3_ddata *ddata)
> > > +{
> > > +	u32 chan_reserved, mask = 0, i, ccidcfgr, invalid_cid = 0;
> > > +
> > > +	/* Reserve Secure channels */
> > > +	chan_reserved = readl_relaxed(ddata->base + STM32_DMA3_SECCFGR);
> > > +
> > > +	/*
> > > +	 * CID filtering must be configured to ensure that the DMA3 channel will inherit the CID of
> > > +	 * the processor which is configuring and using the given channel.
> > > +	 * In case CID filtering is not configured, dma-channel-mask property can be used to
> > > +	 * specify available DMA channels to the kernel.
> > > +	 */
> > > +	of_property_read_u32(ddata->dma_dev.dev->of_node, "dma-channel-mask", &mask);
> > > +
> > > +	/* Reserve !CID-filtered not in dma-channel-mask, static CID != CID1, CID1 not allowed */
> > > +	for (i = 0; i < ddata->dma_channels; i++) {
> > > +		ccidcfgr = readl_relaxed(ddata->base + STM32_DMA3_CCIDCFGR(i));
> > > +
> > > +		if (!(ccidcfgr & CCIDCFGR_CFEN)) { /* !CID-filtered */
> > > +			invalid_cid |= BIT(i);
> > > +			if (!(mask & BIT(i))) /* Not in dma-channel-mask */
> > > +				chan_reserved |= BIT(i);
> > > +		} else { /* CID-filtered */
> > > +			if (!(ccidcfgr & CCIDCFGR_SEM_EN)) { /* Static CID mode */
> > > +				if (FIELD_GET(CCIDCFGR_SCID, ccidcfgr) != CCIDCFGR_CID1)
> > > +					chan_reserved |= BIT(i);
> > > +			} else { /* Semaphore mode */
> > > +				if (!FIELD_GET(CCIDCFGR_SEM_WLIST_CID1, ccidcfgr))
> > > +					chan_reserved |= BIT(i);
> > > +				ddata->chans[i].semaphore_mode = true;
> > > +			}
> > > +		}
> > > +		dev_dbg(ddata->dma_dev.dev, "chan%d: %s mode, %s\n", i,
> > > +			!(ccidcfgr & CCIDCFGR_CFEN) ? "!CID-filtered" :
> > > +			ddata->chans[i].semaphore_mode ? "Semaphore" : "Static CID",
> > > +			(chan_reserved & BIT(i)) ? "denied" :
> > > +			mask & BIT(i) ? "force allowed" : "allowed");
> > > +	}
> > > +
> > > +	if (invalid_cid)
> > > +		dev_warn(ddata->dma_dev.dev, "chan%*pbl have invalid CID configuration\n",
> > > +			 ddata->dma_channels, &invalid_cid);
> > > +
> > > +	return chan_reserved;
> > > +}
> > > +
> > > +static const struct of_device_id stm32_dma3_of_match[] = {
> > > +	{ .compatible = "st,stm32-dma3", },
> > > +	{ /* sentinel */},
> > > +};
> > > +MODULE_DEVICE_TABLE(of, stm32_dma3_of_match);
> > > +
> > > +static int stm32_dma3_probe(struct platform_device *pdev)
> > > +{
> > > +	struct device_node *np = pdev->dev.of_node;
> > > +	struct stm32_dma3_ddata *ddata;
> > > +	struct reset_control *reset;
> > > +	struct stm32_dma3_chan *chan;
> > > +	struct dma_device *dma_dev;
> > > +	u32 master_ports, chan_reserved, i, verr;
> > > +	u64 hwcfgr;
> > > +	int ret;
> > > +
> > > +	ddata = devm_kzalloc(&pdev->dev, sizeof(*ddata), GFP_KERNEL);
> > > +	if (!ddata)
> > > +		return -ENOMEM;
> > > +	platform_set_drvdata(pdev, ddata);
> > > +
> > > +	dma_dev = &ddata->dma_dev;
> > > +
> > > +	ddata->base = devm_platform_ioremap_resource(pdev, 0);
> > > +	if (IS_ERR(ddata->base))
> > > +		return PTR_ERR(ddata->base);
> > > +
> > > +	ddata->clk = devm_clk_get(&pdev->dev, NULL);
> > > +	if (IS_ERR(ddata->clk))
> > > +		return dev_err_probe(&pdev->dev, PTR_ERR(ddata->clk), "Failed to get clk\n");
> > > +
> > > +	reset = devm_reset_control_get_optional(&pdev->dev, NULL);
> > > +	if (IS_ERR(reset))
> > > +		return dev_err_probe(&pdev->dev, PTR_ERR(reset), "Failed to get reset\n");
> > > +
> > > +	ret = clk_prepare_enable(ddata->clk);
> > > +	if (ret)
> > > +		return dev_err_probe(&pdev->dev, ret, "Failed to enable clk\n");
> > > +
> > > +	reset_control_reset(reset);
> > > +
> > > +	INIT_LIST_HEAD(&dma_dev->channels);
> > > +
> > > +	dma_cap_set(DMA_SLAVE, dma_dev->cap_mask);
> > > +	dma_cap_set(DMA_PRIVATE, dma_dev->cap_mask);
> > > +	dma_dev->dev = &pdev->dev;
> > > +	/*
> > > +	 * This controller supports up to 8-byte buswidth depending on the port used and the
> > > +	 * channel, and can only access address at even boundaries, multiple of the buswidth.
> > > +	 */
> > > +	dma_dev->copy_align = DMAENGINE_ALIGN_8_BYTES;
> > > +	dma_dev->src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
> > > +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
> > > +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
> > > +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > +	dma_dev->dst_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
> > > +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
> > > +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
> > > +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > +	dma_dev->directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV) | BIT(DMA_MEM_TO_MEM);
> > > +
> > > +	dma_dev->descriptor_reuse = true;
> > > +	dma_dev->max_sg_burst = STM32_DMA3_MAX_SEG_SIZE;
> > > +	dma_dev->residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
> > > +	dma_dev->device_alloc_chan_resources = stm32_dma3_alloc_chan_resources;
> > > +	dma_dev->device_free_chan_resources = stm32_dma3_free_chan_resources;
> > > +	dma_dev->device_prep_slave_sg = stm32_dma3_prep_slave_sg;
> > > +	dma_dev->device_caps = stm32_dma3_caps;
> > > +	dma_dev->device_config = stm32_dma3_config;
> > > +	dma_dev->device_terminate_all = stm32_dma3_terminate_all;
> > > +	dma_dev->device_synchronize = stm32_dma3_synchronize;
> > > +	dma_dev->device_tx_status = dma_cookie_status;
> > > +	dma_dev->device_issue_pending = stm32_dma3_issue_pending;
> > > +
> > > +	/* if dma_channels is not modified, get it from hwcfgr1 */
> > > +	if (of_property_read_u32(np, "dma-channels", &ddata->dma_channels)) {
> > > +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
> > > +		ddata->dma_channels = FIELD_GET(G_NUM_CHANNELS, hwcfgr);
> > > +	}
> > > +
> > > +	/* if dma_requests is not modified, get it from hwcfgr2 */
> > > +	if (of_property_read_u32(np, "dma-requests", &ddata->dma_requests)) {
> > > +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR2);
> > > +		ddata->dma_requests = FIELD_GET(G_MAX_REQ_ID, hwcfgr) + 1;
> > > +	}
> > > +
> > > +	/* G_MASTER_PORTS, G_M0_DATA_WIDTH_ENC, G_M1_DATA_WIDTH_ENC in HWCFGR1 */
> > > +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
> > > +	master_ports = FIELD_GET(G_MASTER_PORTS, hwcfgr);
> > > +
> > > +	ddata->ports_max_dw[0] = FIELD_GET(G_M0_DATA_WIDTH_ENC, hwcfgr);
> > > +	if (master_ports == AXI64 || master_ports == AHB32) /* Single master port */
> > > +		ddata->ports_max_dw[1] = DW_INVALID;
> > > +	else /* Dual master ports */
> > > +		ddata->ports_max_dw[1] = FIELD_GET(G_M1_DATA_WIDTH_ENC, hwcfgr);
> > > +
> > > +	ddata->chans = devm_kcalloc(&pdev->dev, ddata->dma_channels, sizeof(*ddata->chans),
> > > +				    GFP_KERNEL);
> > > +	if (!ddata->chans) {
> > > +		ret = -ENOMEM;
> > > +		goto err_clk_disable;
> > > +	}
> > > +
> > > +	chan_reserved = stm32_dma3_check_rif(ddata);
> > > +
> > > +	if (chan_reserved == GENMASK(ddata->dma_channels - 1, 0)) {
> > > +		ret = -ENODEV;
> > > +		dev_err_probe(&pdev->dev, ret, "No channel available, abort registration\n");
> > > +		goto err_clk_disable;
> > > +	}
> > > +
> > > +	/* G_FIFO_SIZE x=0..7 in HWCFGR3 and G_FIFO_SIZE x=8..15 in HWCFGR4 */
> > > +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR3);
> > > +	hwcfgr |= ((u64)readl_relaxed(ddata->base + STM32_DMA3_HWCFGR4)) << 32;
> > > +
> > > +	for (i = 0; i < ddata->dma_channels; i++) {
> > > +		if (chan_reserved & BIT(i))
> > > +			continue;
> > > +
> > > +		chan = &ddata->chans[i];
> > > +		chan->id = i;
> > > +		chan->fifo_size = get_chan_hwcfg(i, G_FIFO_SIZE(i), hwcfgr);
> > > +		/* If chan->fifo_size > 0 then half of the fifo size, else no burst when no FIFO */
> > > +		chan->max_burst = (chan->fifo_size) ? (1 << (chan->fifo_size + 1)) / 2 : 0;
> > > +		chan->vchan.desc_free = stm32_dma3_chan_vdesc_free;
> > > +
> > > +		vchan_init(&chan->vchan, dma_dev);
> > > +	}
> > > +
> > > +	ret = dmaenginem_async_device_register(dma_dev);
> > > +	if (ret)
> > > +		goto err_clk_disable;
> > > +
> > > +	for (i = 0; i < ddata->dma_channels; i++) {
> > > +		if (chan_reserved & BIT(i))
> > > +			continue;
> > > +
> > > +		ret = platform_get_irq(pdev, i);
> > > +		if (ret < 0)
> > > +			goto err_clk_disable;
> > > +
> > > +		chan = &ddata->chans[i];
> > > +		chan->irq = ret;
> > > +
> > > +		ret = devm_request_irq(&pdev->dev, chan->irq, stm32_dma3_chan_irq, 0,
> > > +				       dev_name(chan2dev(chan)), chan);
> > > +		if (ret) {
> > > +			dev_err_probe(&pdev->dev, ret, "Failed to request channel %s IRQ\n",
> > > +				      dev_name(chan2dev(chan)));
> > > +			goto err_clk_disable;
> > > +		}
> > > +	}
> > > +
> > > +	ret = of_dma_controller_register(np, stm32_dma3_of_xlate, ddata);
> > > +	if (ret) {
> > > +		dev_err_probe(&pdev->dev, ret, "Failed to register controller\n");
> > > +		goto err_clk_disable;
> > > +	}
> > > +
> > > +	verr = readl_relaxed(ddata->base + STM32_DMA3_VERR);
> > > +
> > > +	pm_runtime_set_active(&pdev->dev);
> > > +	pm_runtime_enable(&pdev->dev);
> > > +	pm_runtime_get_noresume(&pdev->dev);
> > > +	pm_runtime_put(&pdev->dev);
> > > +
> > > +	dev_info(&pdev->dev, "STM32 DMA3 registered rev:%lu.%lu\n",
> > > +		 FIELD_GET(VERR_MAJREV, verr), FIELD_GET(VERR_MINREV, verr));
> > > +
> > > +	return 0;
> > > +
> > > +err_clk_disable:
> > > +	clk_disable_unprepare(ddata->clk);
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +static void stm32_dma3_remove(struct platform_device *pdev)
> > > +{
> > > +	pm_runtime_disable(&pdev->dev);
> > > +}
> > > +
> > > +static int stm32_dma3_runtime_suspend(struct device *dev)
> > > +{
> > > +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
> > > +
> > > +	clk_disable_unprepare(ddata->clk);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static int stm32_dma3_runtime_resume(struct device *dev)
> > > +{
> > > +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
> > > +	int ret;
> > > +
> > > +	ret = clk_prepare_enable(ddata->clk);
> > > +	if (ret)
> > > +		dev_err(dev, "Failed to enable clk: %d\n", ret);
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +static const struct dev_pm_ops stm32_dma3_pm_ops = {
> > > +	SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
> > > +	RUNTIME_PM_OPS(stm32_dma3_runtime_suspend, stm32_dma3_runtime_resume, NULL)
> > > +};
> > > +
> > > +static struct platform_driver stm32_dma3_driver = {
> > > +	.probe = stm32_dma3_probe,
> > > +	.remove_new = stm32_dma3_remove,
> > > +	.driver = {
> > > +		.name = "stm32-dma3",
> > > +		.of_match_table = stm32_dma3_of_match,
> > > +		.pm = pm_ptr(&stm32_dma3_pm_ops),
> > > +	},
> > > +};
> > > +
> > > +static int __init stm32_dma3_init(void)
> > > +{
> > > +	return platform_driver_register(&stm32_dma3_driver);
> > > +}
> > > +
> > > +subsys_initcall(stm32_dma3_init);
> > > +
> > > +MODULE_DESCRIPTION("STM32 DMA3 controller driver");
> > > +MODULE_AUTHOR("Amelie Delaunay <amelie.delaunay@foss.st.com>");
> > > +MODULE_LICENSE("GPL");
> > > -- 
> > > 2.25.1
> > 
> 
> Regards,
> Amelie

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 05/12] dmaengine: Add STM32 DMA3 support
  2024-05-07 20:26       ` Frank Li
@ 2024-05-13  9:21         ` Amelie Delaunay
  2024-05-15 18:45           ` Frank Li
  0 siblings, 1 reply; 29+ messages in thread
From: Amelie Delaunay @ 2024-05-13  9:21 UTC (permalink / raw)
  To: Frank Li
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue, dmaengine, devicetree,
	linux-stm32, linux-arm-kernel, linux-kernel, linux-hardening

Hi Frank,

On 5/7/24 22:26, Frank Li wrote:
> On Tue, May 07, 2024 at 01:33:31PM +0200, Amelie Delaunay wrote:
>> Hi Vinod,
>>
>> Thanks for the review.
>>
>> On 5/4/24 14:40, Vinod Koul wrote:
>>> On 23-04-24, 14:32, Amelie Delaunay wrote:
>>>> STM32 DMA3 driver supports the 3 hardware configurations of the STM32 DMA3
>>>> controller:
>>>> - LPDMA (Low Power): 4 channels, no FIFO
>>>> - GPDMA (General Purpose): 16 channels, FIFO from 8 to 32 bytes
>>>> - HPDMA (High Performance): 16 channels, FIFO from 8 to 256 bytes
>>>> Hardware configuration of the channels is retrieved from the hardware
>>>> configuration registers.
>>>> The client can specify its channel requirements through device tree.
>>>> STM32 DMA3 channels can be individually reserved either because they are
>>>> secure, or dedicated to another CPU.
>>>> Indeed, channels availability depends on Resource Isolation Framework
>>>> (RIF) configuration. RIF grants access to buses with Compartiment ID
>>>
>>> Compartiment? typo...?
>>>
>>
>> Sorry, indeed, Compartment instead.
>>
>>>> (CIF) filtering, secure and privilege level. It also assigns DMA channels
>>>> to one or several processors.
>>>> DMA channels used by Linux should be CID-filtered and statically assigned
>>>> to CID1 or shared with other CPUs but using semaphore. In case CID
>>>> filtering is not configured, dma-channel-mask property can be used to
>>>> specify available DMA channels to the kernel, otherwise such channels
>>>> will be marked as reserved and can't be used by Linux.
>>>>
>>>> Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
>>>> ---
>>>>    drivers/dma/stm32/Kconfig      |   10 +
>>>>    drivers/dma/stm32/Makefile     |    1 +
>>>>    drivers/dma/stm32/stm32-dma3.c | 1431 ++++++++++++++++++++++++++++++++
>>>>    3 files changed, 1442 insertions(+)
>>>>    create mode 100644 drivers/dma/stm32/stm32-dma3.c
>>>>
>>>> diff --git a/drivers/dma/stm32/Kconfig b/drivers/dma/stm32/Kconfig
>>>> index b72ae1a4502f..4d8d8063133b 100644
>>>> --- a/drivers/dma/stm32/Kconfig
>>>> +++ b/drivers/dma/stm32/Kconfig
>>>> @@ -34,4 +34,14 @@ config STM32_MDMA
>>>>    	  If you have a board based on STM32 SoC with such DMA controller
>>>>    	  and want to use MDMA say Y here.
>>>> +config STM32_DMA3
>>>> +	tristate "STMicroelectronics STM32 DMA3 support"
>>>> +	select DMA_ENGINE
>>>> +	select DMA_VIRTUAL_CHANNELS
>>>> +	help
>>>> +	  Enable support for the on-chip DMA3 controller on STMicroelectronics
>>>> +	  STM32 platforms.
>>>> +	  If you have a board based on STM32 SoC with such DMA3 controller
>>>> +	  and want to use DMA3, say Y here.
>>>> +
>>>>    endif
>>>> diff --git a/drivers/dma/stm32/Makefile b/drivers/dma/stm32/Makefile
>>>> index 663a3896a881..5082db4b4c1c 100644
>>>> --- a/drivers/dma/stm32/Makefile
>>>> +++ b/drivers/dma/stm32/Makefile
>>>> @@ -2,3 +2,4 @@
>>>>    obj-$(CONFIG_STM32_DMA) += stm32-dma.o
>>>>    obj-$(CONFIG_STM32_DMAMUX) += stm32-dmamux.o
>>>>    obj-$(CONFIG_STM32_MDMA) += stm32-mdma.o
>>>> +obj-$(CONFIG_STM32_DMA3) += stm32-dma3.o
>>>
>>> are there any similarities in mdma/dma and dma3..?
>>> can anything be reused...?
>>>
>>
>> DMA/MDMA were originally intended for STM32 MCUs and have been used in
>> STM32MP1 MPUs.
>> New MPUs (STM32MP2, ...) and STM32 MCUs (STM32H5, STM32N6, ...) use DMA3.
>> Unlike DMA/MDMA, DMA3 comes in multiple configurations (LPDMA, GPDMA,
>> HPDMA), and among these global configurations, there are possible
>> sub-configurations (e.g. channel fifo size). stm32-dma3 uses the hardware
>> configuration registers to discover the controller/channel capabilities.
>> Reusing stm32-dma or stm32-mdma would complicate the driver and make
>> future stm32-dma3 evolutions for next STM32 MPUs intricate and very
>> difficult.
> 
> I think your reason is still not enough to create a new driver instead of
> trying to reuse the old one.
> 
> Is the register layout or the DMA descriptor totally different?
> 
> If the DMA descriptor format is the same, you can at least reuse the DMA
> descriptor preparation part.
> 
> Channel selection is an independent part of a DMA channel. You can create
> a separate one for a different DMA engine.
> 
> Frank
> 

stm32-dma is not considered for reuse: its register layout is completely 
different and this DMA controller doesn't rely on a descriptor mechanism.

stm32-mdma is based on a descriptor mechanism but, even there, there are 
significant differences in register layout and descriptor structure.
As you can see:
/* Descriptor from stm32-mdma */
struct stm32_mdma_hwdesc {
	u32 ctcr;
	u32 cbndtr;
	u32 csar;
	u32 cdar;
	u32 cbrur;
	u32 clar;
	u32 ctbr;
	u32 dummy;
	u32 cmar;
	u32 cmdr;
} __aligned(64);

/* Descriptor from stm32-dma3 */
struct stm32_dma3_hwdesc {
	u32 ctr1;
	u32 ctr2;
	u32 cbr1;
	u32 csar;
	u32 cdar;
	u32 cllr;
} __aligned(32);

Moreover, stm32-dma3 can have static or dynamic linked-list items.
Dynamic data structure support is not yet in this patchset; the current 
implementation is undergoing validation and maturation.
"cllr" configures the data structure of the next linked-list item in 
addition to its address pointer. The descriptor can be "compacted" 
depending on the CLLR update bit values.

/* CxLLR DMA channel x linked-list address register */
#define CLLR_LA				GENMASK(15, 2) /* Address */
#define CLLR_ULL			BIT(16) /* CxLLR update ? */
#define CLLR_UDA			BIT(27) /* CxDAR update ? */
#define CLLR_USA			BIT(28) /* CxSAR update ? */
#define CLLR_UB1			BIT(29) /* CxBR1 update ? */
#define CLLR_UT2			BIT(30) /* CxTR2 update ? */
#define CLLR_UT1			BIT(31) /* CxTR1 update ? */

If one or more CLLR_Uxx bits are not set, the corresponding u32 values 
are not present in the descriptor. For example, if CLLR_ULL is the only 
bit set, then the "cllr" value is at offset 0 in the linked-list data 
structure.
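
To make the compaction rule concrete, here is a minimal user-space sketch 
(not part of the patch; the word order is an assumption based on the 
static structure above) that prints where each word lands in a compacted 
item for a given CLLR value:

/*
 * Sketch only: assumes a compacted item keeps the CTR1, CTR2, CBR1,
 * CSAR, CDAR, CLLR order of the static structure and stores only the
 * words whose update bit is set in the previous item's CLLR.
 */
#include <stdint.h>
#include <stdio.h>

#define CLLR_ULL	(1u << 16)
#define CLLR_UDA	(1u << 27)
#define CLLR_USA	(1u << 28)
#define CLLR_UB1	(1u << 29)
#define CLLR_UT2	(1u << 30)
#define CLLR_UT1	(1u << 31)

/* Update bits in the order the corresponding words are assumed to appear */
static const uint32_t update_bits[] = {
	CLLR_UT1, CLLR_UT2, CLLR_UB1, CLLR_USA, CLLR_UDA, CLLR_ULL,
};
static const char * const word_names[] = {
	"ctr1", "ctr2", "cbr1", "csar", "cdar", "cllr",
};

int main(void)
{
	uint32_t cllr = CLLR_ULL; /* e.g. only the next CLLR is updated */
	unsigned int i, offset = 0;

	for (i = 0; i < 6; i++) {
		if (!(cllr & update_bits[i]))
			continue; /* word not present in the compacted item */
		printf("%s at offset %u\n", word_names[i], offset);
		offset += 4;
	}
	/* With only CLLR_ULL set, this prints "cllr at offset 0" */
	return 0;
}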

I hope this gives some insight into why I've decided not to reuse the 
existing drivers, either in whole or in part.

Amelie

>>
>>>> diff --git a/drivers/dma/stm32/stm32-dma3.c b/drivers/dma/stm32/stm32-dma3.c
>>>> new file mode 100644
>>>> index 000000000000..b5493f497d06
>>>> --- /dev/null
>>>> +++ b/drivers/dma/stm32/stm32-dma3.c
>>>> @@ -0,0 +1,1431 @@
>>>> +// SPDX-License-Identifier: GPL-2.0-only
>>>> +/*
>>>> + * STM32 DMA3 controller driver
>>>> + *
>>>> + * Copyright (C) STMicroelectronics 2024
>>>> + * Author(s): Amelie Delaunay <amelie.delaunay@foss.st.com>
>>>> + */
>>>> +
>>>> +#include <linux/bitfield.h>
>>>> +#include <linux/clk.h>
>>>> +#include <linux/dma-mapping.h>
>>>> +#include <linux/dmaengine.h>
>>>> +#include <linux/dmapool.h>
>>>> +#include <linux/init.h>
>>>> +#include <linux/iopoll.h>
>>>> +#include <linux/list.h>
>>>> +#include <linux/module.h>
>>>> +#include <linux/of_dma.h>
>>>> +#include <linux/platform_device.h>
>>>> +#include <linux/pm_runtime.h>
>>>> +#include <linux/reset.h>
>>>> +#include <linux/slab.h>
>>>> +
>>>> +#include "../virt-dma.h"
>>>> +
>>>> +#define STM32_DMA3_SECCFGR		0x00
>>>> +#define STM32_DMA3_PRIVCFGR		0x04
>>>> +#define STM32_DMA3_RCFGLOCKR		0x08
>>>> +#define STM32_DMA3_MISR			0x0C
>>>
>>> lower hex please
>>>
>>
>> Ok.
>>
>>>> +#define STM32_DMA3_SMISR		0x10
>>>> +
>>>> +#define STM32_DMA3_CLBAR(x)		(0x50 + 0x80 * (x))
>>>> +#define STM32_DMA3_CCIDCFGR(x)		(0x54 + 0x80 * (x))
>>>> +#define STM32_DMA3_CSEMCR(x)		(0x58 + 0x80 * (x))
>>>> +#define STM32_DMA3_CFCR(x)		(0x5C + 0x80 * (x))
>>>> +#define STM32_DMA3_CSR(x)		(0x60 + 0x80 * (x))
>>>> +#define STM32_DMA3_CCR(x)		(0x64 + 0x80 * (x))
>>>> +#define STM32_DMA3_CTR1(x)		(0x90 + 0x80 * (x))
>>>> +#define STM32_DMA3_CTR2(x)		(0x94 + 0x80 * (x))
>>>> +#define STM32_DMA3_CBR1(x)		(0x98 + 0x80 * (x))
>>>> +#define STM32_DMA3_CSAR(x)		(0x9C + 0x80 * (x))
>>>> +#define STM32_DMA3_CDAR(x)		(0xA0 + 0x80 * (x))
>>>> +#define STM32_DMA3_CLLR(x)		(0xCC + 0x80 * (x))
>>>> +
>>>> +#define STM32_DMA3_HWCFGR13		0xFC0 /* G_PER_CTRL(X) x=8..15 */
>>>> +#define STM32_DMA3_HWCFGR12		0xFC4 /* G_PER_CTRL(X) x=0..7 */
>>>> +#define STM32_DMA3_HWCFGR4		0xFE4 /* G_FIFO_SIZE(X) x=8..15 */
>>>> +#define STM32_DMA3_HWCFGR3		0xFE8 /* G_FIFO_SIZE(X) x=0..7 */
>>>> +#define STM32_DMA3_HWCFGR2		0xFEC /* G_MAX_REQ_ID */
>>>> +#define STM32_DMA3_HWCFGR1		0xFF0 /* G_MASTER_PORTS, G_NUM_CHANNELS, G_Mx_DATA_WIDTH */
>>>> +#define STM32_DMA3_VERR			0xFF4
>>>
>>> here as well
>>>
>>
>> Ok.
>>
>>>> +
>>>> +/* SECCFGR DMA secure configuration register */
>>>> +#define SECCFGR_SEC(x)			BIT(x)
>>>> +
>>>> +/* MISR DMA non-secure/secure masked interrupt status register */
>>>> +#define MISR_MIS(x)			BIT(x)
>>>> +
>>>> +/* CxLBAR DMA channel x linked_list base address register */
>>>> +#define CLBAR_LBA			GENMASK(31, 16)
>>>> +
>>>> +/* CxCIDCFGR DMA channel x CID register */
>>>> +#define CCIDCFGR_CFEN			BIT(0)
>>>> +#define CCIDCFGR_SEM_EN			BIT(1)
>>>> +#define CCIDCFGR_SCID			GENMASK(5, 4)
>>>> +#define CCIDCFGR_SEM_WLIST_CID0		BIT(16)
>>>> +#define CCIDCFGR_SEM_WLIST_CID1		BIT(17)
>>>> +#define CCIDCFGR_SEM_WLIST_CID2		BIT(18)
>>>> +
>>>> +enum ccidcfgr_cid {
>>>> +	CCIDCFGR_CID0,
>>>> +	CCIDCFGR_CID1,
>>>> +	CCIDCFGR_CID2,
>>>> +};
>>>> +
>>>> +/* CxSEMCR DMA channel x semaphore control register */
>>>> +#define CSEMCR_SEM_MUTEX		BIT(0)
>>>> +#define CSEMCR_SEM_CCID			GENMASK(5, 4)
>>>> +
>>>> +/* CxFCR DMA channel x flag clear register */
>>>> +#define CFCR_TCF			BIT(8)
>>>> +#define CFCR_HTF			BIT(9)
>>>> +#define CFCR_DTEF			BIT(10)
>>>> +#define CFCR_ULEF			BIT(11)
>>>> +#define CFCR_USEF			BIT(12)
>>>> +#define CFCR_SUSPF			BIT(13)
>>>> +
>>>> +/* CxSR DMA channel x status register */
>>>> +#define CSR_IDLEF			BIT(0)
>>>> +#define CSR_TCF				BIT(8)
>>>> +#define CSR_HTF				BIT(9)
>>>> +#define CSR_DTEF			BIT(10)
>>>> +#define CSR_ULEF			BIT(11)
>>>> +#define CSR_USEF			BIT(12)
>>>> +#define CSR_SUSPF			BIT(13)
>>>> +#define CSR_ALL_F			GENMASK(13, 8)
>>>> +#define CSR_FIFOL			GENMASK(24, 16)
>>>> +
>>>> +/* CxCR DMA channel x control register */
>>>> +#define CCR_EN				BIT(0)
>>>> +#define CCR_RESET			BIT(1)
>>>> +#define CCR_SUSP			BIT(2)
>>>> +#define CCR_TCIE			BIT(8)
>>>> +#define CCR_HTIE			BIT(9)
>>>> +#define CCR_DTEIE			BIT(10)
>>>> +#define CCR_ULEIE			BIT(11)
>>>> +#define CCR_USEIE			BIT(12)
>>>> +#define CCR_SUSPIE			BIT(13)
>>>> +#define CCR_ALLIE			GENMASK(13, 8)
>>>> +#define CCR_LSM				BIT(16)
>>>> +#define CCR_LAP				BIT(17)
>>>> +#define CCR_PRIO			GENMASK(23, 22)
>>>> +
>>>> +enum ccr_prio {
>>>> +	CCR_PRIO_LOW,
>>>> +	CCR_PRIO_MID,
>>>> +	CCR_PRIO_HIGH,
>>>> +	CCR_PRIO_VERY_HIGH,
>>>> +};
>>>> +
>>>> +/* CxTR1 DMA channel x transfer register 1 */
>>>> +#define CTR1_SINC			BIT(3)
>>>> +#define CTR1_SBL_1			GENMASK(9, 4)
>>>> +#define CTR1_DINC			BIT(19)
>>>> +#define CTR1_DBL_1			GENMASK(25, 20)
>>>> +#define CTR1_SDW_LOG2			GENMASK(1, 0)
>>>> +#define CTR1_PAM			GENMASK(12, 11)
>>>> +#define CTR1_SAP			BIT(14)
>>>> +#define CTR1_DDW_LOG2			GENMASK(17, 16)
>>>> +#define CTR1_DAP			BIT(30)
>>>> +
>>>> +enum ctr1_dw {
>>>> +	CTR1_DW_BYTE,
>>>> +	CTR1_DW_HWORD,
>>>> +	CTR1_DW_WORD,
>>>> +	CTR1_DW_DWORD, /* Depends on HWCFGR1.G_M0_DATA_WIDTH_ENC and .G_M1_DATA_WIDTH_ENC */
>>>> +};
>>>> +
>>>> +enum ctr1_pam {
>>>> +	CTR1_PAM_0S_LT, /* if DDW > SDW, padded with 0s else left-truncated */
>>>> +	CTR1_PAM_SE_RT, /* if DDW > SDW, sign extended else right-truncated */
>>>> +	CTR1_PAM_PACK_UNPACK, /* FIFO queued */
>>>> +};
>>>> +
>>>> +/* CxTR2 DMA channel x transfer register 2 */
>>>> +#define CTR2_REQSEL			GENMASK(7, 0)
>>>> +#define CTR2_SWREQ			BIT(9)
>>>> +#define CTR2_DREQ			BIT(10)
>>>> +#define CTR2_BREQ			BIT(11)
>>>> +#define CTR2_PFREQ			BIT(12)
>>>> +#define CTR2_TCEM			GENMASK(31, 30)
>>>> +
>>>> +enum ctr2_tcem {
>>>> +	CTR2_TCEM_BLOCK,
>>>> +	CTR2_TCEM_REPEAT_BLOCK,
>>>> +	CTR2_TCEM_LLI,
>>>> +	CTR2_TCEM_CHANNEL,
>>>> +};
>>>> +
>>>> +/* CxBR1 DMA channel x block register 1 */
>>>> +#define CBR1_BNDT			GENMASK(15, 0)
>>>> +
>>>> +/* CxLLR DMA channel x linked-list address register */
>>>> +#define CLLR_LA				GENMASK(15, 2)
>>>> +#define CLLR_ULL			BIT(16)
>>>> +#define CLLR_UDA			BIT(27)
>>>> +#define CLLR_USA			BIT(28)
>>>> +#define CLLR_UB1			BIT(29)
>>>> +#define CLLR_UT2			BIT(30)
>>>> +#define CLLR_UT1			BIT(31)
>>>> +
>>>> +/* HWCFGR13 DMA hardware configuration register 13 x=8..15 */
>>>> +/* HWCFGR12 DMA hardware configuration register 12 x=0..7 */
>>>> +#define G_PER_CTRL(x)			(ULL(0x1) << (4 * (x)))
>>>> +
>>>> +/* HWCFGR4 DMA hardware configuration register 4 x=8..15 */
>>>> +/* HWCFGR3 DMA hardware configuration register 3 x=0..7 */
>>>> +#define G_FIFO_SIZE(x)			(ULL(0x7) << (4 * (x)))
>>>> +
>>>> +#define get_chan_hwcfg(x, mask, reg)	(((reg) & (mask)) >> (4 * (x)))
>>>> +
>>>> +/* HWCFGR2 DMA hardware configuration register 2 */
>>>> +#define G_MAX_REQ_ID			GENMASK(7, 0)
>>>> +
>>>> +/* HWCFGR1 DMA hardware configuration register 1 */
>>>> +#define G_MASTER_PORTS			GENMASK(2, 0)
>>>> +#define G_NUM_CHANNELS			GENMASK(12, 8)
>>>> +#define G_M0_DATA_WIDTH_ENC		GENMASK(25, 24)
>>>> +#define G_M1_DATA_WIDTH_ENC		GENMASK(29, 28)
>>>> +
>>>> +enum stm32_dma3_master_ports {
>>>> +	AXI64, /* 1x AXI: 64-bit port 0 */
>>>> +	AHB32, /* 1x AHB: 32-bit port 0 */
>>>> +	AHB32_AHB32, /* 2x AHB: 32-bit port 0 and 32-bit port 1 */
>>>> +	AXI64_AHB32, /* 1x AXI 64-bit port 0 and 1x AHB 32-bit port 1 */
>>>> +	AXI64_AXI64, /* 2x AXI: 64-bit port 0 and 64-bit port 1 */
>>>> +	AXI128_AHB32, /* 1x AXI 128-bit port 0 and 1x AHB 32-bit port 1 */
>>>> +};
>>>> +
>>>> +enum stm32_dma3_port_data_width {
>>>> +	DW_32, /* 32-bit, for AHB */
>>>> +	DW_64, /* 64-bit, for AXI */
>>>> +	DW_128, /* 128-bit, for AXI */
>>>> +	DW_INVALID,
>>>> +};
>>>> +
>>>> +/* VERR DMA version register */
>>>> +#define VERR_MINREV			GENMASK(3, 0)
>>>> +#define VERR_MAJREV			GENMASK(7, 4)
>>>> +
>>>> +/* Device tree */
>>>> +/* struct stm32_dma3_dt_conf */
>>>> +/* .ch_conf */
>>>> +#define STM32_DMA3_DT_PRIO		GENMASK(1, 0) /* CCR_PRIO */
>>>> +#define STM32_DMA3_DT_FIFO		GENMASK(7, 4)
>>>> +/* .tr_conf */
>>>> +#define STM32_DMA3_DT_SINC		BIT(0) /* CTR1_SINC */
>>>> +#define STM32_DMA3_DT_SAP		BIT(1) /* CTR1_SAP */
>>>> +#define STM32_DMA3_DT_DINC		BIT(4) /* CTR1_DINC */
>>>> +#define STM32_DMA3_DT_DAP		BIT(5) /* CTR1_DAP */
>>>> +#define STM32_DMA3_DT_BREQ		BIT(8) /* CTR2_BREQ */
>>>> +#define STM32_DMA3_DT_PFREQ		BIT(9) /* CTR2_PFREQ */
>>>> +#define STM32_DMA3_DT_TCEM		GENMASK(13, 12) /* CTR2_TCEM */
>>>> +
>>>> +#define STM32_DMA3_MAX_BLOCK_SIZE	ALIGN_DOWN(CBR1_BNDT, 64)
>>>> +#define port_is_ahb(maxdw)		({ typeof(maxdw) (_maxdw) = (maxdw); \
>>>> +					   ((_maxdw) != DW_INVALID) && ((_maxdw) == DW_32); })
>>>> +#define port_is_axi(maxdw)		({ typeof(maxdw) (_maxdw) = (maxdw); \
>>>> +					   ((_maxdw) != DW_INVALID) && ((_maxdw) != DW_32); })
>>>> +#define get_chan_max_dw(maxdw, maxburst)((port_is_ahb(maxdw) ||			     \
>>>> +					  (maxburst) < DMA_SLAVE_BUSWIDTH_8_BYTES) ? \
>>>> +					 DMA_SLAVE_BUSWIDTH_4_BYTES : DMA_SLAVE_BUSWIDTH_8_BYTES)
>>>> +
>>>> +/* Static linked-list data structure (depends on update bits UT1/UT2/UB1/USA/UDA/ULL) */
>>>> +struct stm32_dma3_hwdesc {
>>>> +	u32 ctr1;
>>>> +	u32 ctr2;
>>>> +	u32 cbr1;
>>>> +	u32 csar;
>>>> +	u32 cdar;
>>>> +	u32 cllr;
>>>> +} __aligned(32);
>>>> +
>>>> +/*
>>>> + * CLLR_LA / sizeof(struct stm32_dma3_hwdesc) represents the number of hdwdesc that can be addressed
>>>> + * by the pointer to the next linked-list data structure. The __aligned forces the 32-byte
>>>> + * alignment. So use hardcoded 32. Multiplied by the max block size of each item, it represents
>>>> + * the sg size limitation.
>>>> + */
>>>> +#define STM32_DMA3_MAX_SEG_SIZE		((CLLR_LA / 32) * STM32_DMA3_MAX_BLOCK_SIZE)
>>>> +
>>>> +/*
>>>> + * Linked-list items
>>>> + */
>>>> +struct stm32_dma3_lli {
>>>> +	struct stm32_dma3_hwdesc *hwdesc;
>>>> +	dma_addr_t hwdesc_addr;
>>>> +};
>>>> +
>>>> +struct stm32_dma3_swdesc {
>>>> +	struct virt_dma_desc vdesc;
>>>> +	u32 ccr;
>>>> +	bool cyclic;
>>>> +	u32 lli_size;
>>>> +	struct stm32_dma3_lli lli[] __counted_by(lli_size);
>>>> +};
>>>> +
>>>> +struct stm32_dma3_dt_conf {
>>>> +	u32 ch_id;
>>>> +	u32 req_line;
>>>> +	u32 ch_conf;
>>>> +	u32 tr_conf;
>>>> +};
>>>> +
>>>> +struct stm32_dma3_chan {
>>>> +	struct virt_dma_chan vchan;
>>>> +	u32 id;
>>>> +	int irq;
>>>> +	u32 fifo_size;
>>>> +	u32 max_burst;
>>>> +	bool semaphore_mode;
>>>> +	struct stm32_dma3_dt_conf dt_config;
>>>> +	struct dma_slave_config dma_config;
>>>> +	struct dma_pool *lli_pool;
>>>> +	struct stm32_dma3_swdesc *swdesc;
>>>> +	enum ctr2_tcem tcem;
>>>> +	u32 dma_status;
>>>> +};
>>>> +
>>>> +struct stm32_dma3_ddata {
>>>> +	struct dma_device dma_dev;
>>>> +	void __iomem *base;
>>>> +	struct clk *clk;
>>>> +	struct stm32_dma3_chan *chans;
>>>> +	u32 dma_channels;
>>>> +	u32 dma_requests;
>>>> +	enum stm32_dma3_port_data_width ports_max_dw[2];
>>>> +};
>>>> +
>>>> +static inline struct stm32_dma3_ddata *to_stm32_dma3_ddata(struct stm32_dma3_chan *chan)
>>>> +{
>>>> +	return container_of(chan->vchan.chan.device, struct stm32_dma3_ddata, dma_dev);
>>>> +}
>>>> +
>>>> +static inline struct stm32_dma3_chan *to_stm32_dma3_chan(struct dma_chan *c)
>>>> +{
>>>> +	return container_of(c, struct stm32_dma3_chan, vchan.chan);
>>>> +}
>>>> +
>>>> +static inline struct stm32_dma3_swdesc *to_stm32_dma3_swdesc(struct virt_dma_desc *vdesc)
>>>> +{
>>>> +	return container_of(vdesc, struct stm32_dma3_swdesc, vdesc);
>>>> +}
>>>> +
>>>> +static struct device *chan2dev(struct stm32_dma3_chan *chan)
>>>> +{
>>>> +	return &chan->vchan.chan.dev->device;
>>>> +}
>>>> +
>>>> +static void stm32_dma3_chan_dump_reg(struct stm32_dma3_chan *chan)
>>>> +{
>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>> +	struct device *dev = chan2dev(chan);
>>>> +	u32 id = chan->id, offset;
>>>> +
>>>> +	offset = STM32_DMA3_SECCFGR;
>>>> +	dev_dbg(dev, "SECCFGR(0x%03x): %08x\n", offset, readl_relaxed(ddata->base + offset));
>>>> +	offset = STM32_DMA3_PRIVCFGR;
>>>> +	dev_dbg(dev, "PRIVCFGR(0x%03x): %08x\n", offset, readl_relaxed(ddata->base + offset));
>>>> +	offset = STM32_DMA3_CCIDCFGR(id);
>>>> +	dev_dbg(dev, "C%dCIDCFGR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>> +	offset = STM32_DMA3_CSEMCR(id);
>>>> +	dev_dbg(dev, "C%dSEMCR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>> +	offset = STM32_DMA3_CSR(id);
>>>> +	dev_dbg(dev, "C%dSR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>> +	offset = STM32_DMA3_CCR(id);
>>>> +	dev_dbg(dev, "C%dCR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>> +	offset = STM32_DMA3_CTR1(id);
>>>> +	dev_dbg(dev, "C%dTR1(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>> +	offset = STM32_DMA3_CTR2(id);
>>>> +	dev_dbg(dev, "C%dTR2(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>> +	offset = STM32_DMA3_CBR1(id);
>>>> +	dev_dbg(dev, "C%dBR1(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>> +	offset = STM32_DMA3_CSAR(id);
>>>> +	dev_dbg(dev, "C%dSAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>> +	offset = STM32_DMA3_CDAR(id);
>>>> +	dev_dbg(dev, "C%dDAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>> +	offset = STM32_DMA3_CLLR(id);
>>>> +	dev_dbg(dev, "C%dLLR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>> +	offset = STM32_DMA3_CLBAR(id);
>>>> +	dev_dbg(dev, "C%dLBAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>> +}
>>>> +
>>>> +static void stm32_dma3_chan_dump_hwdesc(struct stm32_dma3_chan *chan,
>>>> +					struct stm32_dma3_swdesc *swdesc)
>>>> +{
>>>> +	struct stm32_dma3_hwdesc *hwdesc;
>>>> +	int i;
>>>> +
>>>> +	for (i = 0; i < swdesc->lli_size; i++) {
>>>> +		hwdesc = swdesc->lli[i].hwdesc;
>>>> +		if (i)
>>>> +			dev_dbg(chan2dev(chan), "V\n");
>>>> +		dev_dbg(chan2dev(chan), "[%d]@%pad\n", i, &swdesc->lli[i].hwdesc_addr);
>>>> +		dev_dbg(chan2dev(chan), "| C%dTR1: %08x\n", chan->id, hwdesc->ctr1);
>>>> +		dev_dbg(chan2dev(chan), "| C%dTR2: %08x\n", chan->id, hwdesc->ctr2);
>>>> +		dev_dbg(chan2dev(chan), "| C%dBR1: %08x\n", chan->id, hwdesc->cbr1);
>>>> +		dev_dbg(chan2dev(chan), "| C%dSAR: %08x\n", chan->id, hwdesc->csar);
>>>> +		dev_dbg(chan2dev(chan), "| C%dDAR: %08x\n", chan->id, hwdesc->cdar);
>>>> +		dev_dbg(chan2dev(chan), "| C%dLLR: %08x\n", chan->id, hwdesc->cllr);
>>>> +	}
>>>> +
>>>> +	if (swdesc->cyclic) {
>>>> +		dev_dbg(chan2dev(chan), "|\n");
>>>> +		dev_dbg(chan2dev(chan), "-->[0]@%pad\n", &swdesc->lli[0].hwdesc_addr);
>>>> +	} else {
>>>> +		dev_dbg(chan2dev(chan), "X\n");
>>>> +	}
>>>> +}
>>>> +
>>>> +static struct stm32_dma3_swdesc *stm32_dma3_chan_desc_alloc(struct stm32_dma3_chan *chan, u32 count)
>>>> +{
>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>> +	struct stm32_dma3_swdesc *swdesc;
>>>> +	int i;
>>>> +
>>>> +	/*
>>>> +	 * If the memory to be allocated for the number of hwdesc (6 u32 members but 32-bytes
>>>> +	 * aligned) is greater than the maximum address of CLLR_LA, then the last items can't be
>>>> +	 * addressed, so abort the allocation.
>>>> +	 */
>>>> +	if ((count * 32) > CLLR_LA) {
>>>> +		dev_err(chan2dev(chan), "Transfer is too big (> %luB)\n", STM32_DMA3_MAX_SEG_SIZE);
>>>> +		return NULL;
>>>> +	}
>>>> +
>>>> +	swdesc = kzalloc(struct_size(swdesc, lli, count), GFP_NOWAIT);
>>>> +	if (!swdesc)
>>>> +		return NULL;
>>>> +
>>>> +	for (i = 0; i < count; i++) {
>>>> +		swdesc->lli[i].hwdesc = dma_pool_zalloc(chan->lli_pool, GFP_NOWAIT,
>>>> +							&swdesc->lli[i].hwdesc_addr);
>>>> +		if (!swdesc->lli[i].hwdesc)
>>>> +			goto err_pool_free;
>>>> +	}
>>>> +	swdesc->lli_size = count;
>>>> +	swdesc->ccr = 0;
>>>> +
>>>> +	/* Set LL base address */
>>>> +	writel_relaxed(swdesc->lli[0].hwdesc_addr & CLBAR_LBA,
>>>> +		       ddata->base + STM32_DMA3_CLBAR(chan->id));
>>>> +
>>>> +	/* Set LL allocated port */
>>>> +	swdesc->ccr &= ~CCR_LAP;
>>>> +
>>>> +	return swdesc;
>>>> +
>>>> +err_pool_free:
>>>> +	dev_err(chan2dev(chan), "Failed to alloc descriptors\n");
>>>> +	while (--i >= 0)
>>>> +		dma_pool_free(chan->lli_pool, swdesc->lli[i].hwdesc, swdesc->lli[i].hwdesc_addr);
>>>> +	kfree(swdesc);
>>>> +
>>>> +	return NULL;
>>>> +}
>>>> +
>>>> +static void stm32_dma3_chan_desc_free(struct stm32_dma3_chan *chan,
>>>> +				      struct stm32_dma3_swdesc *swdesc)
>>>> +{
>>>> +	int i;
>>>> +
>>>> +	for (i = 0; i < swdesc->lli_size; i++)
>>>> +		dma_pool_free(chan->lli_pool, swdesc->lli[i].hwdesc, swdesc->lli[i].hwdesc_addr);
>>>> +
>>>> +	kfree(swdesc);
>>>> +}
>>>> +
>>>> +static void stm32_dma3_chan_vdesc_free(struct virt_dma_desc *vdesc)
>>>> +{
>>>> +	struct stm32_dma3_swdesc *swdesc = to_stm32_dma3_swdesc(vdesc);
>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(vdesc->tx.chan);
>>>> +
>>>> +	stm32_dma3_chan_desc_free(chan, swdesc);
>>>> +}
>>>> +
>>>> +static void stm32_dma3_check_user_setting(struct stm32_dma3_chan *chan)
>>>> +{
>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>> +	struct device *dev = chan2dev(chan);
>>>> +	u32 ctr1 = readl_relaxed(ddata->base + STM32_DMA3_CTR1(chan->id));
>>>> +	u32 cbr1 = readl_relaxed(ddata->base + STM32_DMA3_CBR1(chan->id));
>>>> +	u32 csar = readl_relaxed(ddata->base + STM32_DMA3_CSAR(chan->id));
>>>> +	u32 cdar = readl_relaxed(ddata->base + STM32_DMA3_CDAR(chan->id));
>>>> +	u32 cllr = readl_relaxed(ddata->base + STM32_DMA3_CLLR(chan->id));
>>>> +	u32 bndt = FIELD_GET(CBR1_BNDT, cbr1);
>>>> +	u32 sdw = 1 << FIELD_GET(CTR1_SDW_LOG2, ctr1);
>>>> +	u32 ddw = 1 << FIELD_GET(CTR1_DDW_LOG2, ctr1);
>>>> +	u32 sap = FIELD_GET(CTR1_SAP, ctr1);
>>>> +	u32 dap = FIELD_GET(CTR1_DAP, ctr1);
>>>> +
>>>> +	if (!bndt && !FIELD_GET(CLLR_UB1, cllr))
>>>> +		dev_err(dev, "null source block size and no update of this value\n");
>>>> +	if (bndt % sdw)
>>>> +		dev_err(dev, "source block size not multiple of src data width\n");
>>>> +	if (FIELD_GET(CTR1_PAM, ctr1) == CTR1_PAM_PACK_UNPACK && bndt % ddw)
>>>> +		dev_err(dev, "(un)packing mode w/ src block size not multiple of dst data width\n");
>>>> +	if (csar % sdw)
>>>> +		dev_err(dev, "unaligned source address not multiple of src data width\n");
>>>> +	if (cdar % ddw)
>>>> +		dev_err(dev, "unaligned destination address not multiple of dst data width\n");
>>>> +	if (sdw == DMA_SLAVE_BUSWIDTH_8_BYTES && port_is_ahb(ddata->ports_max_dw[sap]))
>>>> +		dev_err(dev, "double-word source data width not supported on port %u\n", sap);
>>>> +	if (ddw == DMA_SLAVE_BUSWIDTH_8_BYTES && port_is_ahb(ddata->ports_max_dw[dap]))
>>>> +		dev_err(dev, "double-word destination data width not supported on port %u\n", dap);
>>>
>>> NO error/abort if this is wrong...?
>>>
>>
>> A user setting error triggers an interrupt caught by the
>> stm32_dma3_chan_irq() interrupt handler.
>> Indeed, a user setting error can occur when enabling the channel or when
>> DMA3 registers are updated with each linked-list item.
>> In the interrupt handler, when USEF (User Setting Error Flag) is set, this
>> function (stm32_dma3_check_user_setting) helps the user to understand what
>> went wrong. The hardware automatically disables the channel to prevent the
>> execution of the wrongly programmed transfer, and the driver resets the
>> channel and sets chan->dma_status = DMA_ERROR; dmaengine_tx_status() will
>> then return DMA_ERROR.
>> So from the user's point of view, the transfer will never complete, and the
>> channel is ready to be reprogrammed.
>> Note that in the _prep_ functions, everything is checked to avoid user
>> setting errors. If a user setting error does occur, it is rather due to a
>> corrupted linked-list item (which should fortunately never happen).
>>
>>>> +}
>>>> +
>>>> +static void stm32_dma3_chan_prep_hwdesc(struct stm32_dma3_chan *chan,
>>>> +					struct stm32_dma3_swdesc *swdesc,
>>>> +					u32 curr, dma_addr_t src, dma_addr_t dst, u32 len,
>>>> +					u32 ctr1, u32 ctr2, bool is_last, bool is_cyclic)
>>>> +{
>>>> +	struct stm32_dma3_hwdesc *hwdesc;
>>>> +	dma_addr_t next_lli;
>>>> +	u32 next = curr + 1;
>>>> +
>>>> +	hwdesc = swdesc->lli[curr].hwdesc;
>>>> +	hwdesc->ctr1 = ctr1;
>>>> +	hwdesc->ctr2 = ctr2;
>>>> +	hwdesc->cbr1 = FIELD_PREP(CBR1_BNDT, len);
>>>> +	hwdesc->csar = src;
>>>> +	hwdesc->cdar = dst;
>>>> +
>>>> +	if (is_last) {
>>>> +		if (is_cyclic)
>>>> +			next_lli = swdesc->lli[0].hwdesc_addr;
>>>> +		else
>>>> +			next_lli = 0;
>>>> +	} else {
>>>> +		next_lli = swdesc->lli[next].hwdesc_addr;
>>>> +	}
>>>> +
>>>> +	hwdesc->cllr = 0;
>>>> +	if (next_lli) {
>>>> +		hwdesc->cllr |= CLLR_UT1 | CLLR_UT2 | CLLR_UB1;
>>>> +		hwdesc->cllr |= CLLR_USA | CLLR_UDA | CLLR_ULL;
>>>> +		hwdesc->cllr |= (next_lli & CLLR_LA);
>>>> +	}
>>>> +}
>>>> +
>>>> +static enum dma_slave_buswidth stm32_dma3_get_max_dw(u32 chan_max_burst,
>>>> +						     enum stm32_dma3_port_data_width port_max_dw,
>>>> +						     u32 len, dma_addr_t addr)
>>>> +{
>>>> +	enum dma_slave_buswidth max_dw = get_chan_max_dw(port_max_dw, chan_max_burst);
>>>> +
>>>> +	/* len and addr must be a multiple of dw */
>>>> +	return 1 << __ffs(len | addr | max_dw);
>>>> +}
>>>> +
>>>> +static u32 stm32_dma3_get_max_burst(u32 len, enum dma_slave_buswidth dw, u32 chan_max_burst)
>>>> +{
>>>> +	u32 max_burst = chan_max_burst ? chan_max_burst / dw : 1;
>>>> +
>>>> +	/* len is a multiple of dw, so if len is < chan_max_burst, shorten burst */
>>>> +	if (len < chan_max_burst)
>>>> +		max_burst = len / dw;
>>>> +
>>>> +	/*
>>>> +	 * HW doesn't modify the burst if burst size <= half of the fifo size.
>>>> +	 * If len is not a multiple of burst size, last burst is shortened by HW.
>>>> +	 */
>>>> +	return max_burst;
>>>> +}
>>>> +
>>>> +static int stm32_dma3_chan_prep_hw(struct stm32_dma3_chan *chan, enum dma_transfer_direction dir,
>>>> +				   u32 *ccr, u32 *ctr1, u32 *ctr2,
>>>> +				   dma_addr_t src_addr, dma_addr_t dst_addr, u32 len)
>>>> +{
>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>> +	struct dma_device dma_device = ddata->dma_dev;
>>>> +	u32 sdw, ddw, sbl_max, dbl_max, tcem;
>>>> +	u32 _ctr1 = 0, _ctr2 = 0;
>>>> +	u32 ch_conf = chan->dt_config.ch_conf;
>>>> +	u32 tr_conf = chan->dt_config.tr_conf;
>>>> +	u32 sap = FIELD_GET(STM32_DMA3_DT_SAP, tr_conf), sap_max_dw;
>>>> +	u32 dap = FIELD_GET(STM32_DMA3_DT_DAP, tr_conf), dap_max_dw;
>>>> +
>>>> +	dev_dbg(chan2dev(chan), "%s from %pad to %pad\n",
>>>> +		dmaengine_get_direction_text(dir), &src_addr, &dst_addr);
>>>> +
>>>> +	sdw = chan->dma_config.src_addr_width ? : get_chan_max_dw(sap, chan->max_burst);
>>>> +	ddw = chan->dma_config.dst_addr_width ? : get_chan_max_dw(dap, chan->max_burst);
>>>> +	sbl_max = chan->dma_config.src_maxburst ? : 1;
>>>> +	dbl_max = chan->dma_config.dst_maxburst ? : 1;
>>>> +
>>>> +	/* Following conditions would raise User Setting Error interrupt */
>>>> +	if (!(dma_device.src_addr_widths & BIT(sdw)) || !(dma_device.dst_addr_widths & BIT(ddw))) {
>>>> +		dev_err(chan2dev(chan), "Bus width (src=%u, dst=%u) not supported\n", sdw, ddw);
>>>> +		return -EINVAL;
>>>> +	}
>>>> +
>>>> +	if (ddata->ports_max_dw[1] == DW_INVALID && (sap || dap)) {
>>>> +		dev_err(chan2dev(chan), "Only one master port, port 1 is not supported\n");
>>>> +		return -EINVAL;
>>>> +	}
>>>> +
>>>> +	sap_max_dw = ddata->ports_max_dw[sap];
>>>> +	dap_max_dw = ddata->ports_max_dw[dap];
>>>> +	if ((port_is_ahb(sap_max_dw) && sdw == DMA_SLAVE_BUSWIDTH_8_BYTES) ||
>>>> +	    (port_is_ahb(dap_max_dw) && ddw == DMA_SLAVE_BUSWIDTH_8_BYTES)) {
>>>> +		dev_err(chan2dev(chan),
>>>> +			"8 bytes buswidth (src=%u, dst=%u) not supported on port (sap=%u, dap=%u\n",
>>>> +			sdw, ddw, sap, dap);
>>>> +		return -EINVAL;
>>>> +	}
>>>> +
>>>> +	if (FIELD_GET(STM32_DMA3_DT_SINC, tr_conf))
>>>> +		_ctr1 |= CTR1_SINC;
>>>> +	if (sap)
>>>> +		_ctr1 |= CTR1_SAP;
>>>> +	if (FIELD_GET(STM32_DMA3_DT_DINC, tr_conf))
>>>> +		_ctr1 |= CTR1_DINC;
>>>> +	if (dap)
>>>> +		_ctr1 |= CTR1_DAP;
>>>> +
>>>> +	_ctr2 |= FIELD_PREP(CTR2_REQSEL, chan->dt_config.req_line) & ~CTR2_SWREQ;
>>>> +	if (FIELD_GET(STM32_DMA3_DT_BREQ, tr_conf))
>>>> +		_ctr2 |= CTR2_BREQ;
>>>> +	if (dir == DMA_DEV_TO_MEM && FIELD_GET(STM32_DMA3_DT_PFREQ, tr_conf))
>>>> +		_ctr2 |= CTR2_PFREQ;
>>>> +	tcem = FIELD_GET(STM32_DMA3_DT_TCEM, tr_conf);
>>>> +	_ctr2 |= FIELD_PREP(CTR2_TCEM, tcem);
>>>> +
>>>> +	/* Store TCEM to know on which event TC flag occurred */
>>>> +	chan->tcem = tcem;
>>>> +	/* Store direction for residue computation */
>>>> +	chan->dma_config.direction = dir;
>>>> +
>>>> +	switch (dir) {
>>>> +	case DMA_MEM_TO_DEV:
>>>> +		/* Set destination (device) data width and burst */
>>>> +		ddw = min_t(u32, ddw, stm32_dma3_get_max_dw(chan->max_burst, dap_max_dw,
>>>> +							    len, dst_addr));
>>>> +		dbl_max = min_t(u32, dbl_max, stm32_dma3_get_max_burst(len, ddw, chan->max_burst));
>>>> +
>>>> +		/* Set source (memory) data width and burst */
>>>> +		sdw = stm32_dma3_get_max_dw(chan->max_burst, sap_max_dw, len, src_addr);
>>>> +		sbl_max = stm32_dma3_get_max_burst(len, sdw, chan->max_burst);
>>>> +
>>>> +		_ctr1 |= FIELD_PREP(CTR1_SDW_LOG2, ilog2(sdw));
>>>> +		_ctr1 |= FIELD_PREP(CTR1_SBL_1, sbl_max - 1);
>>>> +		_ctr1 |= FIELD_PREP(CTR1_DDW_LOG2, ilog2(ddw));
>>>> +		_ctr1 |= FIELD_PREP(CTR1_DBL_1, dbl_max - 1);
>>>> +
>>>> +		if (ddw != sdw) {
>>>> +			_ctr1 |= FIELD_PREP(CTR1_PAM, CTR1_PAM_PACK_UNPACK);
>>>> +			/* Should never reach this case as ddw is clamped down */
>>>> +			if (len & (ddw - 1)) {
>>>> +				dev_err(chan2dev(chan),
>>>> +					"Packing mode is enabled and len is not multiple of ddw");
>>>> +				return -EINVAL;
>>>> +			}
>>>> +		}
>>>> +
>>>> +		/* dst = dev */
>>>> +		_ctr2 |= CTR2_DREQ;
>>>> +
>>>> +		break;
>>>> +
>>>> +	case DMA_DEV_TO_MEM:
>>>> +		/* Set source (device) data width and burst */
>>>> +		sdw = min_t(u32, sdw, stm32_dma3_get_max_dw(chan->max_burst, sap_max_dw,
>>>> +							    len, src_addr));
>>>> +		sbl_max = min_t(u32, sbl_max, stm32_dma3_get_max_burst(len, sdw, chan->max_burst));
>>>> +
>>>> +		/* Set destination (memory) data width and burst */
>>>> +		ddw = stm32_dma3_get_max_dw(chan->max_burst, dap_max_dw, len, dst_addr);
>>>> +		dbl_max = stm32_dma3_get_max_burst(len, ddw, chan->max_burst);
>>>> +
>>>> +		_ctr1 |= FIELD_PREP(CTR1_SDW_LOG2, ilog2(sdw));
>>>> +		_ctr1 |= FIELD_PREP(CTR1_SBL_1, sbl_max - 1);
>>>> +		_ctr1 |= FIELD_PREP(CTR1_DDW_LOG2, ilog2(ddw));
>>>> +		_ctr1 |= FIELD_PREP(CTR1_DBL_1, dbl_max - 1);
>>>> +
>>>> +		if (ddw != sdw) {
>>>> +			_ctr1 |= FIELD_PREP(CTR1_PAM, CTR1_PAM_PACK_UNPACK);
>>>> +			/* Should never reach this case as ddw is clamped down */
>>>> +			if (len & (ddw - 1)) {
>>>> +				dev_err(chan2dev(chan),
>>>> +					"Packing mode is enabled and len is not multiple of ddw\n");
>>>> +				return -EINVAL;
>>>> +			}
>>>> +		}
>>>> +
>>>> +		/* dst = mem */
>>>> +		_ctr2 &= ~CTR2_DREQ;
>>>> +
>>>> +		break;
>>>> +
>>>> +	default:
>>>> +		dev_err(chan2dev(chan), "Direction %s not supported\n",
>>>> +			dmaengine_get_direction_text(dir));
>>>> +		return -EINVAL;
>>>> +	}
>>>> +
>>>> +	*ccr |= FIELD_PREP(CCR_PRIO, FIELD_GET(STM32_DMA3_DT_PRIO, ch_conf));
>>>> +	*ctr1 = _ctr1;
>>>> +	*ctr2 = _ctr2;
>>>> +
>>>> +	dev_dbg(chan2dev(chan), "%s: sdw=%u bytes sbl=%u beats ddw=%u bytes dbl=%u beats\n",
>>>> +		__func__, sdw, sbl_max, ddw, dbl_max);
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static void stm32_dma3_chan_start(struct stm32_dma3_chan *chan)
>>>> +{
>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>> +	struct virt_dma_desc *vdesc;
>>>> +	struct stm32_dma3_hwdesc *hwdesc;
>>>> +	u32 id = chan->id;
>>>> +	u32 csr, ccr;
>>>> +
>>>> +	vdesc = vchan_next_desc(&chan->vchan);
>>>> +	if (!vdesc) {
>>>> +		chan->swdesc = NULL;
>>>> +		return;
>>>> +	}
>>>> +	list_del(&vdesc->node);
>>>> +
>>>> +	chan->swdesc = to_stm32_dma3_swdesc(vdesc);
>>>> +	hwdesc = chan->swdesc->lli[0].hwdesc;
>>>> +
>>>> +	stm32_dma3_chan_dump_hwdesc(chan, chan->swdesc);
>>>> +
>>>> +	writel_relaxed(chan->swdesc->ccr, ddata->base + STM32_DMA3_CCR(id));
>>>> +	writel_relaxed(hwdesc->ctr1, ddata->base + STM32_DMA3_CTR1(id));
>>>> +	writel_relaxed(hwdesc->ctr2, ddata->base + STM32_DMA3_CTR2(id));
>>>> +	writel_relaxed(hwdesc->cbr1, ddata->base + STM32_DMA3_CBR1(id));
>>>> +	writel_relaxed(hwdesc->csar, ddata->base + STM32_DMA3_CSAR(id));
>>>> +	writel_relaxed(hwdesc->cdar, ddata->base + STM32_DMA3_CDAR(id));
>>>> +	writel_relaxed(hwdesc->cllr, ddata->base + STM32_DMA3_CLLR(id));
>>>> +
>>>> +	/* Clear any pending interrupts */
>>>> +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(id));
>>>> +	if (csr & CSR_ALL_F)
>>>> +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(id));
>>>> +
>>>> +	stm32_dma3_chan_dump_reg(chan);
>>>> +
>>>> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(id));
>>>> +	writel_relaxed(ccr | CCR_EN, ddata->base + STM32_DMA3_CCR(id));
>>>> +
>>>> +	chan->dma_status = DMA_IN_PROGRESS;
>>>> +
>>>> +	dev_dbg(chan2dev(chan), "vchan %pK: started\n", &chan->vchan);
>>>> +}
>>>> +
>>>> +static int stm32_dma3_chan_suspend(struct stm32_dma3_chan *chan, bool susp)
>>>> +{
>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>> +	u32 csr, ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
>>>> +	int ret = 0;
>>>> +
>>>> +	if (susp)
>>>> +		ccr |= CCR_SUSP;
>>>> +	else
>>>> +		ccr &= ~CCR_SUSP;
>>>> +
>>>> +	writel_relaxed(ccr, ddata->base + STM32_DMA3_CCR(chan->id));
>>>> +
>>>> +	if (susp) {
>>>> +		ret = readl_relaxed_poll_timeout_atomic(ddata->base + STM32_DMA3_CSR(chan->id), csr,
>>>> +							csr & CSR_SUSPF, 1, 10);
>>>> +		if (!ret)
>>>> +			writel_relaxed(CFCR_SUSPF, ddata->base + STM32_DMA3_CFCR(chan->id));
>>>> +
>>>> +		stm32_dma3_chan_dump_reg(chan);
>>>> +	}
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +static void stm32_dma3_chan_reset(struct stm32_dma3_chan *chan)
>>>> +{
>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>> +	u32 ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
>>>> +
>>>> +	writel_relaxed(ccr |= CCR_RESET, ddata->base + STM32_DMA3_CCR(chan->id));
>>>> +}
>>>> +
>>>> +static int stm32_dma3_chan_stop(struct stm32_dma3_chan *chan)
>>>> +{
>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>> +	u32 ccr;
>>>> +	int ret = 0;
>>>> +
>>>> +	chan->dma_status = DMA_COMPLETE;
>>>> +
>>>> +	/* Disable interrupts */
>>>> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id));
>>>> +	writel_relaxed(ccr & ~(CCR_ALLIE | CCR_EN), ddata->base + STM32_DMA3_CCR(chan->id));
>>>> +
>>>> +	if (!(ccr & CCR_SUSP) && (ccr & CCR_EN)) {
>>>> +		/* Suspend the channel */
>>>> +		ret = stm32_dma3_chan_suspend(chan, true);
>>>> +		if (ret)
>>>> +			dev_warn(chan2dev(chan), "%s: timeout, data might be lost\n", __func__);
>>>> +	}
>>>> +
>>>> +	/*
>>>> +	 * Reset the channel: this causes the reset of the FIFO and the reset of the channel
>>>> +	 * internal state, the reset of CCR_EN and CCR_SUSP bits.
>>>> +	 */
>>>> +	stm32_dma3_chan_reset(chan);
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +static void stm32_dma3_chan_complete(struct stm32_dma3_chan *chan)
>>>> +{
>>>> +	if (!chan->swdesc)
>>>> +		return;
>>>> +
>>>> +	vchan_cookie_complete(&chan->swdesc->vdesc);
>>>> +	chan->swdesc = NULL;
>>>> +	stm32_dma3_chan_start(chan);
>>>> +}
>>>> +
>>>> +static irqreturn_t stm32_dma3_chan_irq(int irq, void *devid)
>>>> +{
>>>> +	struct stm32_dma3_chan *chan = devid;
>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>> +	u32 misr, csr, ccr;
>>>> +
>>>> +	spin_lock(&chan->vchan.lock);
>>>> +
>>>> +	misr = readl_relaxed(ddata->base + STM32_DMA3_MISR);
>>>> +	if (!(misr & MISR_MIS(chan->id))) {
>>>> +		spin_unlock(&chan->vchan.lock);
>>>> +		return IRQ_NONE;
>>>> +	}
>>>> +
>>>> +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(chan->id));
>>>> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & CCR_ALLIE;
>>>> +
>>>> +	if (csr & CSR_TCF && ccr & CCR_TCIE) {
>>>> +		if (chan->swdesc->cyclic)
>>>> +			vchan_cyclic_callback(&chan->swdesc->vdesc);
>>>> +		else
>>>> +			stm32_dma3_chan_complete(chan);
>>>> +	}
>>>> +
>>>> +	if (csr & CSR_USEF && ccr & CCR_USEIE) {
>>>> +		dev_err(chan2dev(chan), "User setting error\n");
>>>> +		chan->dma_status = DMA_ERROR;
>>>> +		/* CCR.EN automatically cleared by HW */
>>>> +		stm32_dma3_check_user_setting(chan);
>>>> +		stm32_dma3_chan_reset(chan);
>>>> +	}
>>>> +
>>>> +	if (csr & CSR_ULEF && ccr & CCR_ULEIE) {
>>>> +		dev_err(chan2dev(chan), "Update link transfer error\n");
>>>> +		chan->dma_status = DMA_ERROR;
>>>> +		/* CCR.EN automatically cleared by HW */
>>>> +		stm32_dma3_chan_reset(chan);
>>>> +	}
>>>> +
>>>> +	if (csr & CSR_DTEF && ccr & CCR_DTEIE) {
>>>> +		dev_err(chan2dev(chan), "Data transfer error\n");
>>>> +		chan->dma_status = DMA_ERROR;
>>>> +		/* CCR.EN automatically cleared by HW */
>>>> +		stm32_dma3_chan_reset(chan);
>>>> +	}
>>>> +
>>>> +	/*
>>>> +	 * Half Transfer Interrupt may be disabled but Half Transfer Flag can be set,
>>>> +	 * ensure HTF flag to be cleared, with other flags.
>>>> +	 */
>>>> +	csr &= (ccr | CCR_HTIE);
>>>> +
>>>> +	if (csr)
>>>> +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(chan->id));
>>>> +
>>>> +	spin_unlock(&chan->vchan.lock);
>>>> +
>>>> +	return IRQ_HANDLED;
>>>> +}
>>>> +
>>>> +static int stm32_dma3_alloc_chan_resources(struct dma_chan *c)
>>>> +{
>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>> +	u32 id = chan->id, csemcr, ccid;
>>>> +	int ret;
>>>> +
>>>> +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
>>>> +	if (ret < 0)
>>>> +		return ret;
>>>> +
>>>> +	/* Ensure the channel is free */
>>>> +	if (chan->semaphore_mode &&
>>>> +	    readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id)) & CSEMCR_SEM_MUTEX) {
>>>> +		ret = -EBUSY;
>>>> +		goto err_put_sync;
>>>> +	}
>>>> +
>>>> +	chan->lli_pool = dmam_pool_create(dev_name(&c->dev->device), c->device->dev,
>>>> +					  sizeof(struct stm32_dma3_hwdesc),
>>>> +					  __alignof__(struct stm32_dma3_hwdesc), 0);
>>>> +	if (!chan->lli_pool) {
>>>> +		dev_err(chan2dev(chan), "Failed to create LLI pool\n");
>>>> +		ret = -ENOMEM;
>>>> +		goto err_put_sync;
>>>> +	}
>>>> +
>>>> +	/* Take the channel semaphore */
>>>> +	if (chan->semaphore_mode) {
>>>> +		writel_relaxed(CSEMCR_SEM_MUTEX, ddata->base + STM32_DMA3_CSEMCR(id));
>>>> +		csemcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(id));
>>>> +		ccid = FIELD_GET(CSEMCR_SEM_CCID, csemcr);
>>>> +		/* Check that the channel is well taken */
>>>> +		if (ccid != CCIDCFGR_CID1) {
>>>> +			dev_err(chan2dev(chan), "Not under CID1 control (in-use by CID%d)\n", ccid);
>>>> +			ret = -EPERM;
>>>> +			goto err_pool_destroy;
>>>> +		}
>>>> +		dev_dbg(chan2dev(chan), "Under CID1 control (semcr=0x%08x)\n", csemcr);
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +
>>>> +err_pool_destroy:
>>>> +	dmam_pool_destroy(chan->lli_pool);
>>>> +	chan->lli_pool = NULL;
>>>> +
>>>> +err_put_sync:
>>>> +	pm_runtime_put_sync(ddata->dma_dev.dev);
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +static void stm32_dma3_free_chan_resources(struct dma_chan *c)
>>>> +{
>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>> +	unsigned long flags;
>>>> +
>>>> +	/* Ensure channel is in idle state */
>>>> +	spin_lock_irqsave(&chan->vchan.lock, flags);
>>>> +	stm32_dma3_chan_stop(chan);
>>>> +	chan->swdesc = NULL;
>>>> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
>>>> +
>>>> +	vchan_free_chan_resources(to_virt_chan(c));
>>>> +
>>>> +	dmam_pool_destroy(chan->lli_pool);
>>>> +	chan->lli_pool = NULL;
>>>> +
>>>> +	/* Release the channel semaphore */
>>>> +	if (chan->semaphore_mode)
>>>> +		writel_relaxed(0, ddata->base + STM32_DMA3_CSEMCR(chan->id));
>>>> +
>>>> +	pm_runtime_put_sync(ddata->dma_dev.dev);
>>>> +
>>>> +	/* Reset configuration */
>>>> +	memset(&chan->dt_config, 0, sizeof(chan->dt_config));
>>>> +	memset(&chan->dma_config, 0, sizeof(chan->dma_config));
>>>> +}
>>>> +
>>>> +static struct dma_async_tx_descriptor *stm32_dma3_prep_slave_sg(struct dma_chan *c,
>>>> +								struct scatterlist *sgl,
>>>> +								unsigned int sg_len,
>>>> +								enum dma_transfer_direction dir,
>>>> +								unsigned long flags, void *context)
>>>> +{
>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>> +	struct stm32_dma3_swdesc *swdesc;
>>>> +	struct scatterlist *sg;
>>>> +	size_t len;
>>>> +	dma_addr_t sg_addr, dev_addr, src, dst;
>>>> +	u32 i, j, count, ctr1, ctr2;
>>>> +	int ret;
>>>> +
>>>> +	count = sg_len;
>>>> +	for_each_sg(sgl, sg, sg_len, i) {
>>>> +		len = sg_dma_len(sg);
>>>> +		if (len > STM32_DMA3_MAX_BLOCK_SIZE)
>>>> +			count += DIV_ROUND_UP(len, STM32_DMA3_MAX_BLOCK_SIZE) - 1;
>>>> +	}
>>>> +
>>>> +	swdesc = stm32_dma3_chan_desc_alloc(chan, count);
>>>> +	if (!swdesc)
>>>> +		return NULL;
>>>> +
>>>> +	/* sg_len and i correspond to the initial sgl; count and j correspond to the hwdesc LL */
>>>> +	j = 0;
>>>> +	for_each_sg(sgl, sg, sg_len, i) {
>>>> +		sg_addr = sg_dma_address(sg);
>>>> +		dev_addr = (dir == DMA_MEM_TO_DEV) ? chan->dma_config.dst_addr :
>>>> +						     chan->dma_config.src_addr;
>>>> +		len = sg_dma_len(sg);
>>>> +
>>>> +		do {
>>>> +			size_t chunk = min_t(size_t, len, STM32_DMA3_MAX_BLOCK_SIZE);
>>>> +
>>>> +			if (dir == DMA_MEM_TO_DEV) {
>>>> +				src = sg_addr;
>>>> +				dst = dev_addr;
>>>> +
>>>> +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
>>>> +							      src, dst, chunk);
>>>> +
>>>> +				if (FIELD_GET(CTR1_DINC, ctr1))
>>>> +					dev_addr += chunk;
>>>> +			} else { /* (dir == DMA_DEV_TO_MEM || dir == DMA_MEM_TO_MEM) */
>>>> +				src = dev_addr;
>>>> +				dst = sg_addr;
>>>> +
>>>> +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
>>>> +							      src, dst, chunk);
>>>> +
>>>> +				if (FIELD_GET(CTR1_SINC, ctr1))
>>>> +					dev_addr += chunk;
>>>> +			}
>>>> +
>>>> +			if (ret)
>>>> +				goto err_desc_free;
>>>> +
>>>> +			stm32_dma3_chan_prep_hwdesc(chan, swdesc, j, src, dst, chunk,
>>>> +						    ctr1, ctr2, j == (count - 1), false);
>>>> +
>>>> +			sg_addr += chunk;
>>>> +			len -= chunk;
>>>> +			j++;
>>>> +		} while (len);
>>>> +	}
>>>> +
>>>> +	/* Enable Error interrupts */
>>>> +	swdesc->ccr |= CCR_USEIE | CCR_ULEIE | CCR_DTEIE;
>>>> +	/* Enable Transfer state interrupts */
>>>> +	swdesc->ccr |= CCR_TCIE;
>>>> +
>>>> +	swdesc->cyclic = false;
>>>> +
>>>> +	return vchan_tx_prep(&chan->vchan, &swdesc->vdesc, flags);
>>>> +
>>>> +err_desc_free:
>>>> +	stm32_dma3_chan_desc_free(chan, swdesc);
>>>> +
>>>> +	return NULL;
>>>> +}
>>>> +
>>>> +static void stm32_dma3_caps(struct dma_chan *c, struct dma_slave_caps *caps)
>>>> +{
>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>> +
>>>> +	if (!chan->fifo_size) {
>>>> +		caps->max_burst = 0;
>>>> +		caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>>>> +		caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>>>> +	} else {
>>>> +		/* Burst transfer should not exceed half of the fifo size */
>>>> +		caps->max_burst = chan->max_burst;
>>>> +		if (caps->max_burst < DMA_SLAVE_BUSWIDTH_8_BYTES) {
>>>> +			caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>>>> +			caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>>>> +		}
>>>> +	}
>>>> +}
>>>> +
>>>> +static int stm32_dma3_config(struct dma_chan *c, struct dma_slave_config *config)
>>>> +{
>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>> +
>>>> +	memcpy(&chan->dma_config, config, sizeof(*config));
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int stm32_dma3_terminate_all(struct dma_chan *c)
>>>> +{
>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>> +	unsigned long flags;
>>>> +	LIST_HEAD(head);
>>>> +
>>>> +	spin_lock_irqsave(&chan->vchan.lock, flags);
>>>> +
>>>> +	if (chan->swdesc) {
>>>> +		vchan_terminate_vdesc(&chan->swdesc->vdesc);
>>>> +		chan->swdesc = NULL;
>>>> +	}
>>>> +
>>>> +	stm32_dma3_chan_stop(chan);
>>>> +
>>>> +	vchan_get_all_descriptors(&chan->vchan, &head);
>>>> +
>>>> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
>>>> +	vchan_dma_desc_free_list(&chan->vchan, &head);
>>>> +
>>>> +	dev_dbg(chan2dev(chan), "vchan %pK: terminated\n", &chan->vchan);
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static void stm32_dma3_synchronize(struct dma_chan *c)
>>>> +{
>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>> +
>>>> +	vchan_synchronize(&chan->vchan);
>>>> +}
>>>> +
>>>> +static void stm32_dma3_issue_pending(struct dma_chan *c)
>>>> +{
>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>> +	unsigned long flags;
>>>> +
>>>> +	spin_lock_irqsave(&chan->vchan.lock, flags);
>>>> +
>>>> +	if (vchan_issue_pending(&chan->vchan) && !chan->swdesc) {
>>>> +		dev_dbg(chan2dev(chan), "vchan %pK: issued\n", &chan->vchan);
>>>> +		stm32_dma3_chan_start(chan);
>>>> +	}
>>>> +
>>>> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
>>>> +}
>>>> +
>>>> +static bool stm32_dma3_filter_fn(struct dma_chan *c, void *fn_param)
>>>> +{
>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>> +	struct stm32_dma3_dt_conf *conf = fn_param;
>>>> +	u32 mask, semcr;
>>>> +	int ret;
>>>> +
>>>> +	dev_dbg(c->device->dev, "%s(%s): req_line=%d ch_conf=%08x tr_conf=%08x\n",
>>>> +		__func__, dma_chan_name(c), conf->req_line, conf->ch_conf, conf->tr_conf);
>>>> +
>>>> +	if (!of_property_read_u32(c->device->dev->of_node, "dma-channel-mask", &mask))
>>>> +		if (!(mask & BIT(chan->id)))
>>>> +			return false;
>>>> +
>>>> +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
>>>> +	if (ret < 0)
>>>> +		return false;
>>>> +	semcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id));
>>>> +	pm_runtime_put_sync(ddata->dma_dev.dev);
>>>> +
>>>> +	/* Check if chan is free */
>>>> +	if (semcr & CSEMCR_SEM_MUTEX)
>>>> +		return false;
>>>> +
>>>> +	/* Check if chan fifo fits well */
>>>> +	if (FIELD_GET(STM32_DMA3_DT_FIFO, conf->ch_conf) != chan->fifo_size)
>>>> +		return false;
>>>> +
>>>> +	return true;
>>>> +}
>>>> +
>>>> +static struct dma_chan *stm32_dma3_of_xlate(struct of_phandle_args *dma_spec, struct of_dma *ofdma)
>>>> +{
>>>> +	struct stm32_dma3_ddata *ddata = ofdma->of_dma_data;
>>>> +	dma_cap_mask_t mask = ddata->dma_dev.cap_mask;
>>>> +	struct stm32_dma3_dt_conf conf;
>>>> +	struct stm32_dma3_chan *chan;
>>>> +	struct dma_chan *c;
>>>> +
>>>> +	if (dma_spec->args_count < 3) {
>>>> +		dev_err(ddata->dma_dev.dev, "Invalid args count\n");
>>>> +		return NULL;
>>>> +	}
>>>> +
>>>> +	conf.req_line = dma_spec->args[0];
>>>> +	conf.ch_conf = dma_spec->args[1];
>>>> +	conf.tr_conf = dma_spec->args[2];
>>>> +
>>>> +	if (conf.req_line >= ddata->dma_requests) {
>>>> +		dev_err(ddata->dma_dev.dev, "Invalid request line\n");
>>>> +		return NULL;
>>>> +	}
>>>> +
>>>> +	/* Request dma channel among the generic dma controller list */
>>>> +	c = dma_request_channel(mask, stm32_dma3_filter_fn, &conf);
>>>> +	if (!c) {
>>>> +		dev_err(ddata->dma_dev.dev, "No suitable channel found\n");
>>>> +		return NULL;
>>>> +	}
>>>> +
>>>> +	chan = to_stm32_dma3_chan(c);
>>>> +	chan->dt_config = conf;
>>>> +
>>>> +	return c;
>>>> +}
>>>> +
>>>> +static u32 stm32_dma3_check_rif(struct stm32_dma3_ddata *ddata)
>>>> +{
>>>> +	u32 chan_reserved, mask = 0, i, ccidcfgr, invalid_cid = 0;
>>>> +
>>>> +	/* Reserve Secure channels */
>>>> +	chan_reserved = readl_relaxed(ddata->base + STM32_DMA3_SECCFGR);
>>>> +
>>>> +	/*
>>>> +	 * CID filtering must be configured to ensure that the DMA3 channel will inherit the CID of
>>>> +	 * the processor which is configuring and using the given channel.
>>>> +	 * In case CID filtering is not configured, dma-channel-mask property can be used to
>>>> +	 * specify available DMA channels to the kernel.
>>>> +	 */
>>>> +	of_property_read_u32(ddata->dma_dev.dev->of_node, "dma-channel-mask", &mask);
>>>> +
>>>> +	/* Reserve !CID-filtered not in dma-channel-mask, static CID != CID1, CID1 not allowed */
>>>> +	for (i = 0; i < ddata->dma_channels; i++) {
>>>> +		ccidcfgr = readl_relaxed(ddata->base + STM32_DMA3_CCIDCFGR(i));
>>>> +
>>>> +		if (!(ccidcfgr & CCIDCFGR_CFEN)) { /* !CID-filtered */
>>>> +			invalid_cid |= BIT(i);
>>>> +			if (!(mask & BIT(i))) /* Not in dma-channel-mask */
>>>> +				chan_reserved |= BIT(i);
>>>> +		} else { /* CID-filtered */
>>>> +			if (!(ccidcfgr & CCIDCFGR_SEM_EN)) { /* Static CID mode */
>>>> +				if (FIELD_GET(CCIDCFGR_SCID, ccidcfgr) != CCIDCFGR_CID1)
>>>> +					chan_reserved |= BIT(i);
>>>> +			} else { /* Semaphore mode */
>>>> +				if (!FIELD_GET(CCIDCFGR_SEM_WLIST_CID1, ccidcfgr))
>>>> +					chan_reserved |= BIT(i);
>>>> +				ddata->chans[i].semaphore_mode = true;
>>>> +			}
>>>> +		}
>>>> +		dev_dbg(ddata->dma_dev.dev, "chan%d: %s mode, %s\n", i,
>>>> +			!(ccidcfgr & CCIDCFGR_CFEN) ? "!CID-filtered" :
>>>> +			ddata->chans[i].semaphore_mode ? "Semaphore" : "Static CID",
>>>> +			(chan_reserved & BIT(i)) ? "denied" :
>>>> +			mask & BIT(i) ? "force allowed" : "allowed");
>>>> +	}
>>>> +
>>>> +	if (invalid_cid)
>>>> +		dev_warn(ddata->dma_dev.dev, "chan%*pbl have invalid CID configuration\n",
>>>> +			 ddata->dma_channels, &invalid_cid);
>>>> +
>>>> +	return chan_reserved;
>>>> +}
>>>> +
>>>> +static const struct of_device_id stm32_dma3_of_match[] = {
>>>> +	{ .compatible = "st,stm32-dma3", },
>>>> +	{ /* sentinel */},
>>>> +};
>>>> +MODULE_DEVICE_TABLE(of, stm32_dma3_of_match);
>>>> +
>>>> +static int stm32_dma3_probe(struct platform_device *pdev)
>>>> +{
>>>> +	struct device_node *np = pdev->dev.of_node;
>>>> +	struct stm32_dma3_ddata *ddata;
>>>> +	struct reset_control *reset;
>>>> +	struct stm32_dma3_chan *chan;
>>>> +	struct dma_device *dma_dev;
>>>> +	u32 master_ports, chan_reserved, i, verr;
>>>> +	u64 hwcfgr;
>>>> +	int ret;
>>>> +
>>>> +	ddata = devm_kzalloc(&pdev->dev, sizeof(*ddata), GFP_KERNEL);
>>>> +	if (!ddata)
>>>> +		return -ENOMEM;
>>>> +	platform_set_drvdata(pdev, ddata);
>>>> +
>>>> +	dma_dev = &ddata->dma_dev;
>>>> +
>>>> +	ddata->base = devm_platform_ioremap_resource(pdev, 0);
>>>> +	if (IS_ERR(ddata->base))
>>>> +		return PTR_ERR(ddata->base);
>>>> +
>>>> +	ddata->clk = devm_clk_get(&pdev->dev, NULL);
>>>> +	if (IS_ERR(ddata->clk))
>>>> +		return dev_err_probe(&pdev->dev, PTR_ERR(ddata->clk), "Failed to get clk\n");
>>>> +
>>>> +	reset = devm_reset_control_get_optional(&pdev->dev, NULL);
>>>> +	if (IS_ERR(reset))
>>>> +		return dev_err_probe(&pdev->dev, PTR_ERR(reset), "Failed to get reset\n");
>>>> +
>>>> +	ret = clk_prepare_enable(ddata->clk);
>>>> +	if (ret)
>>>> +		return dev_err_probe(&pdev->dev, ret, "Failed to enable clk\n");
>>>> +
>>>> +	reset_control_reset(reset);
>>>> +
>>>> +	INIT_LIST_HEAD(&dma_dev->channels);
>>>> +
>>>> +	dma_cap_set(DMA_SLAVE, dma_dev->cap_mask);
>>>> +	dma_cap_set(DMA_PRIVATE, dma_dev->cap_mask);
>>>> +	dma_dev->dev = &pdev->dev;
>>>> +	/*
>>>> +	 * This controller supports up to 8-byte buswidth depending on the port used and the
>>>> +	 * channel, and can only access address at even boundaries, multiple of the buswidth.
>>>> +	 */
>>>> +	dma_dev->copy_align = DMAENGINE_ALIGN_8_BYTES;
>>>> +	dma_dev->src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
>>>> +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
>>>> +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
>>>> +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>>>> +	dma_dev->dst_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
>>>> +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
>>>> +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
>>>> +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>>>> +	dma_dev->directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV) | BIT(DMA_MEM_TO_MEM);
>>>> +
>>>> +	dma_dev->descriptor_reuse = true;
>>>> +	dma_dev->max_sg_burst = STM32_DMA3_MAX_SEG_SIZE;
>>>> +	dma_dev->residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
>>>> +	dma_dev->device_alloc_chan_resources = stm32_dma3_alloc_chan_resources;
>>>> +	dma_dev->device_free_chan_resources = stm32_dma3_free_chan_resources;
>>>> +	dma_dev->device_prep_slave_sg = stm32_dma3_prep_slave_sg;
>>>> +	dma_dev->device_caps = stm32_dma3_caps;
>>>> +	dma_dev->device_config = stm32_dma3_config;
>>>> +	dma_dev->device_terminate_all = stm32_dma3_terminate_all;
>>>> +	dma_dev->device_synchronize = stm32_dma3_synchronize;
>>>> +	dma_dev->device_tx_status = dma_cookie_status;
>>>> +	dma_dev->device_issue_pending = stm32_dma3_issue_pending;
>>>> +
>>>> +	/* if dma_channels is not modified, get it from hwcfgr1 */
>>>> +	if (of_property_read_u32(np, "dma-channels", &ddata->dma_channels)) {
>>>> +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
>>>> +		ddata->dma_channels = FIELD_GET(G_NUM_CHANNELS, hwcfgr);
>>>> +	}
>>>> +
>>>> +	/* if dma_requests is not modified, get it from hwcfgr2 */
>>>> +	if (of_property_read_u32(np, "dma-requests", &ddata->dma_requests)) {
>>>> +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR2);
>>>> +		ddata->dma_requests = FIELD_GET(G_MAX_REQ_ID, hwcfgr) + 1;
>>>> +	}
>>>> +
>>>> +	/* G_MASTER_PORTS, G_M0_DATA_WIDTH_ENC, G_M1_DATA_WIDTH_ENC in HWCFGR1 */
>>>> +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
>>>> +	master_ports = FIELD_GET(G_MASTER_PORTS, hwcfgr);
>>>> +
>>>> +	ddata->ports_max_dw[0] = FIELD_GET(G_M0_DATA_WIDTH_ENC, hwcfgr);
>>>> +	if (master_ports == AXI64 || master_ports == AHB32) /* Single master port */
>>>> +		ddata->ports_max_dw[1] = DW_INVALID;
>>>> +	else /* Dual master ports */
>>>> +		ddata->ports_max_dw[1] = FIELD_GET(G_M1_DATA_WIDTH_ENC, hwcfgr);
>>>> +
>>>> +	ddata->chans = devm_kcalloc(&pdev->dev, ddata->dma_channels, sizeof(*ddata->chans),
>>>> +				    GFP_KERNEL);
>>>> +	if (!ddata->chans) {
>>>> +		ret = -ENOMEM;
>>>> +		goto err_clk_disable;
>>>> +	}
>>>> +
>>>> +	chan_reserved = stm32_dma3_check_rif(ddata);
>>>> +
>>>> +	if (chan_reserved == GENMASK(ddata->dma_channels - 1, 0)) {
>>>> +		ret = -ENODEV;
>>>> +		dev_err_probe(&pdev->dev, ret, "No channel available, abort registration\n");
>>>> +		goto err_clk_disable;
>>>> +	}
>>>> +
>>>> +	/* G_FIFO_SIZE x=0..7 in HWCFGR3 and G_FIFO_SIZE x=8..15 in HWCFGR4 */
>>>> +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR3);
>>>> +	hwcfgr |= ((u64)readl_relaxed(ddata->base + STM32_DMA3_HWCFGR4)) << 32;
>>>> +
>>>> +	for (i = 0; i < ddata->dma_channels; i++) {
>>>> +		if (chan_reserved & BIT(i))
>>>> +			continue;
>>>> +
>>>> +		chan = &ddata->chans[i];
>>>> +		chan->id = i;
>>>> +		chan->fifo_size = get_chan_hwcfg(i, G_FIFO_SIZE(i), hwcfgr);
>>>> +		/* If chan->fifo_size > 0 then half of the fifo size, else no burst when no FIFO */
>>>> +		chan->max_burst = (chan->fifo_size) ? (1 << (chan->fifo_size + 1)) / 2 : 0;
>>>> +		chan->vchan.desc_free = stm32_dma3_chan_vdesc_free;
>>>> +
>>>> +		vchan_init(&chan->vchan, dma_dev);
>>>> +	}
>>>> +
>>>> +	ret = dmaenginem_async_device_register(dma_dev);
>>>> +	if (ret)
>>>> +		goto err_clk_disable;
>>>> +
>>>> +	for (i = 0; i < ddata->dma_channels; i++) {
>>>> +		if (chan_reserved & BIT(i))
>>>> +			continue;
>>>> +
>>>> +		ret = platform_get_irq(pdev, i);
>>>> +		if (ret < 0)
>>>> +			goto err_clk_disable;
>>>> +
>>>> +		chan = &ddata->chans[i];
>>>> +		chan->irq = ret;
>>>> +
>>>> +		ret = devm_request_irq(&pdev->dev, chan->irq, stm32_dma3_chan_irq, 0,
>>>> +				       dev_name(chan2dev(chan)), chan);
>>>> +		if (ret) {
>>>> +			dev_err_probe(&pdev->dev, ret, "Failed to request channel %s IRQ\n",
>>>> +				      dev_name(chan2dev(chan)));
>>>> +			goto err_clk_disable;
>>>> +		}
>>>> +	}
>>>> +
>>>> +	ret = of_dma_controller_register(np, stm32_dma3_of_xlate, ddata);
>>>> +	if (ret) {
>>>> +		dev_err_probe(&pdev->dev, ret, "Failed to register controller\n");
>>>> +		goto err_clk_disable;
>>>> +	}
>>>> +
>>>> +	verr = readl_relaxed(ddata->base + STM32_DMA3_VERR);
>>>> +
>>>> +	pm_runtime_set_active(&pdev->dev);
>>>> +	pm_runtime_enable(&pdev->dev);
>>>> +	pm_runtime_get_noresume(&pdev->dev);
>>>> +	pm_runtime_put(&pdev->dev);
>>>> +
>>>> +	dev_info(&pdev->dev, "STM32 DMA3 registered rev:%lu.%lu\n",
>>>> +		 FIELD_GET(VERR_MAJREV, verr), FIELD_GET(VERR_MINREV, verr));
>>>> +
>>>> +	return 0;
>>>> +
>>>> +err_clk_disable:
>>>> +	clk_disable_unprepare(ddata->clk);
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +static void stm32_dma3_remove(struct platform_device *pdev)
>>>> +{
>>>> +	pm_runtime_disable(&pdev->dev);
>>>> +}
>>>> +
>>>> +static int stm32_dma3_runtime_suspend(struct device *dev)
>>>> +{
>>>> +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
>>>> +
>>>> +	clk_disable_unprepare(ddata->clk);
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int stm32_dma3_runtime_resume(struct device *dev)
>>>> +{
>>>> +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
>>>> +	int ret;
>>>> +
>>>> +	ret = clk_prepare_enable(ddata->clk);
>>>> +	if (ret)
>>>> +		dev_err(dev, "Failed to enable clk: %d\n", ret);
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +static const struct dev_pm_ops stm32_dma3_pm_ops = {
>>>> +	SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
>>>> +	RUNTIME_PM_OPS(stm32_dma3_runtime_suspend, stm32_dma3_runtime_resume, NULL)
>>>> +};
>>>> +
>>>> +static struct platform_driver stm32_dma3_driver = {
>>>> +	.probe = stm32_dma3_probe,
>>>> +	.remove_new = stm32_dma3_remove,
>>>> +	.driver = {
>>>> +		.name = "stm32-dma3",
>>>> +		.of_match_table = stm32_dma3_of_match,
>>>> +		.pm = pm_ptr(&stm32_dma3_pm_ops),
>>>> +	},
>>>> +};
>>>> +
>>>> +static int __init stm32_dma3_init(void)
>>>> +{
>>>> +	return platform_driver_register(&stm32_dma3_driver);
>>>> +}
>>>> +
>>>> +subsys_initcall(stm32_dma3_init);
>>>> +
>>>> +MODULE_DESCRIPTION("STM32 DMA3 controller driver");
>>>> +MODULE_AUTHOR("Amelie Delaunay <amelie.delaunay@foss.st.com>");
>>>> +MODULE_LICENSE("GPL");
>>>> -- 
>>>> 2.25.1
>>>
>>
>> Regards,
>> Amelie

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 05/12] dmaengine: Add STM32 DMA3 support
  2024-05-13  9:21         ` Amelie Delaunay
@ 2024-05-15 18:45           ` Frank Li
  2024-05-16  9:42             ` Amelie Delaunay
  0 siblings, 1 reply; 29+ messages in thread
From: Frank Li @ 2024-05-15 18:45 UTC (permalink / raw)
  To: Amelie Delaunay
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue, dmaengine, devicetree,
	linux-stm32, linux-arm-kernel, linux-kernel, linux-hardening

On Mon, May 13, 2024 at 11:21:18AM +0200, Amelie Delaunay wrote:
> Hi Frank,
> 
> On 5/7/24 22:26, Frank Li wrote:
> > On Tue, May 07, 2024 at 01:33:31PM +0200, Amelie Delaunay wrote:
> > > Hi Vinod,
> > > 
> > > Thanks for the review.
> > > 
> > > On 5/4/24 14:40, Vinod Koul wrote:
> > > > On 23-04-24, 14:32, Amelie Delaunay wrote:
> > > > > STM32 DMA3 driver supports the 3 hardware configurations of the STM32 DMA3
> > > > > controller:
> > > > > - LPDMA (Low Power): 4 channels, no FIFO
> > > > > - GPDMA (General Purpose): 16 channels, FIFO from 8 to 32 bytes
> > > > > - HPDMA (High Performance): 16 channels, FIFO from 8 to 256 bytes
> > > > > Hardware configuration of the channels is retrieved from the hardware
> > > > > configuration registers.
> > > > > The client can specify its channel requirements through device tree.
> > > > > STM32 DMA3 channels can be individually reserved either because they are
> > > > > secure, or dedicated to another CPU.
> > > > > Indeed, channels availability depends on Resource Isolation Framework
> > > > > (RIF) configuration. RIF grants access to buses with Compartiment ID
> > > > 
> > > > Compartiment? typo...?
> > > > 
> > > 
> > > Sorry, indeed, Compartment instead.
> > > 
> > > > > (CIF) filtering, secure and privilege level. It also assigns DMA channels
> > > > > to one or several processors.
> > > > > DMA channels used by Linux should be CID-filtered and statically assigned
> > > > > to CID1 or shared with other CPUs but using semaphore. In case CID
> > > > > filtering is not configured, dma-channel-mask property can be used to
> > > > > specify available DMA channels to the kernel, otherwise such channels
> > > > > will be marked as reserved and can't be used by Linux.
> > > > > 
> > > > > Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
> > > > > ---
> > > > >    drivers/dma/stm32/Kconfig      |   10 +
> > > > >    drivers/dma/stm32/Makefile     |    1 +
> > > > >    drivers/dma/stm32/stm32-dma3.c | 1431 ++++++++++++++++++++++++++++++++
> > > > >    3 files changed, 1442 insertions(+)
> > > > >    create mode 100644 drivers/dma/stm32/stm32-dma3.c
> > > > > 
> > > > > diff --git a/drivers/dma/stm32/Kconfig b/drivers/dma/stm32/Kconfig
> > > > > index b72ae1a4502f..4d8d8063133b 100644
> > > > > --- a/drivers/dma/stm32/Kconfig
> > > > > +++ b/drivers/dma/stm32/Kconfig
> > > > > @@ -34,4 +34,14 @@ config STM32_MDMA
> > > > >    	  If you have a board based on STM32 SoC with such DMA controller
> > > > >    	  and want to use MDMA say Y here.
> > > > > +config STM32_DMA3
> > > > > +	tristate "STMicroelectronics STM32 DMA3 support"
> > > > > +	select DMA_ENGINE
> > > > > +	select DMA_VIRTUAL_CHANNELS
> > > > > +	help
> > > > > +	  Enable support for the on-chip DMA3 controller on STMicroelectronics
> > > > > +	  STM32 platforms.
> > > > > +	  If you have a board based on STM32 SoC with such DMA3 controller
> > > > > +	  and want to use DMA3, say Y here.
> > > > > +
> > > > >    endif
> > > > > diff --git a/drivers/dma/stm32/Makefile b/drivers/dma/stm32/Makefile
> > > > > index 663a3896a881..5082db4b4c1c 100644
> > > > > --- a/drivers/dma/stm32/Makefile
> > > > > +++ b/drivers/dma/stm32/Makefile
> > > > > @@ -2,3 +2,4 @@
> > > > >    obj-$(CONFIG_STM32_DMA) += stm32-dma.o
> > > > >    obj-$(CONFIG_STM32_DMAMUX) += stm32-dmamux.o
> > > > >    obj-$(CONFIG_STM32_MDMA) += stm32-mdma.o
> > > > > +obj-$(CONFIG_STM32_DMA3) += stm32-dma3.o
> > > > 
> > > > are there any similarities in mdma/dma and dma3..?
> > > > can anything be reused...?
> > > > 
> > > 
> > > DMA/MDMA were originally intended for STM32 MCUs and have been used in
> > > STM32MP1 MPUs.
> > > New MPUs (STM32MP2, ...) and STM32 MCUs (STM32H5, STM32N6, ...) use DMA3.
> > > Unlike DMA/MDMA, DMA3 comes in multiple configurations (LPDMA, GPDMA,
> > > HPDMA), and among these global configurations there are possible
> > > sub-configurations (e.g. channel FIFO size). stm32-dma3 uses the hardware
> > > configuration registers to discover the controller/channel capabilities.
> > > Reusing stm32-dma or stm32-mdma would complicate the driver and make
> > > future stm32-dma3 evolutions for the next STM32 MPUs intricate and very
> > > difficult.
> > 
> > I think your reasons are still not enough to justify creating a new driver
> > instead of trying to reuse an existing one.
> > 
> > Is the register layout or DMA descriptor format totally different?
> > 
> > If the DMA descriptor format is the same, you can at least reuse the DMA
> > descriptor preparation part.
> > 
> > Channel selection is an independent part of the DMA driver. You can create
> > a separate one for each different DMA engine.
> > 
> > Frank
> > 
> 
> stm32-dma is not considered for reuse: its register layout is completely
> different and this DMA controller doesn't rely on a descriptor mechanism.
> 
> stm32-mdma is based on a descriptor mechanism but, even there, there are
> significant differences in register layout and descriptor structure.
> As you can see:

Can you add such a description to the commit message?

Frank 

> /* Descriptor from stm32-mdma */
> struct stm32_mdma_hwdesc {
> 	u32 ctcr;
> 	u32 cbndtr;
> 	u32 csar;
> 	u32 cdar;
> 	u32 cbrur;
> 	u32 clar;
> 	u32 ctbr;
> 	u32 dummy;
> 	u32 cmar;
> 	u32 cmdr;
> } __aligned(64);
> 
> /* Descriptor from stm32-dma3 */
> struct stm32_dma3_hwdesc {
> 	u32 ctr1;
> 	u32 ctr2;
> 	u32 cbr1;
> 	u32 csar;
> 	u32 cdar;
> 	u32 cllr;
> } __aligned(32);
> 
> Moreover, stm32-dma3 can have static or dynamic linked-list items. Dynamic
> data structure support is not yet in this patchset; the current implementation
> is undergoing validation and maturation.
> "cllr" configures the data structure of the next linked-list item in
> addition to its address pointer. The descriptor can be "compacted" depending
> on the cllr update bits values.
> 
> /* CxLLR DMA channel x linked-list address register */
> #define CLLR_LA				GENMASK(15, 2) /* Address */
> #define CLLR_ULL			BIT(16) /* CxLLR update ? */
> #define CLLR_UDA			BIT(27) /* CxDAR update ? */
> #define CLLR_USA			BIT(28) /* CxSAR update ? */
> #define CLLR_UB1			BIT(29) /* CxBR1 update ? */
> #define CLLR_UT2			BIT(30) /* CxTR2 update ? */
> #define CLLR_UT1			BIT(31) /* CxTR1 update ? */
> 
> If one or more CLLR_Uxx bits are not set, the corresponding u32 values are
> not present in the descriptor. For example, if the CLLR_ULL bit is the only
> one that is set, then the "cllr" value is at offset 0 in the linked-list
> data structure.
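> 
> To make the compaction concrete, here is a rough sketch (illustrative only,
> not part of this patchset since dynamic data structure support is not
> included yet), assuming the present words are packed in register order
> (CTR1, CTR2, CBR1, CSAR, CDAR, CLLR) starting at offset 0:
> 
> /* Hypothetical helper, for explanation only: byte offset of a register
>  * word inside a "compacted" linked-list item, given the CLLR_Uxx bits.
>  */
> static unsigned int lli_word_offset(u32 cllr, u32 update_bit)
> {
> 	static const u32 order[] = { CLLR_UT1, CLLR_UT2, CLLR_UB1,
> 				     CLLR_USA, CLLR_UDA, CLLR_ULL };
> 	unsigned int i, offset = 0;
> 
> 	for (i = 0; i < ARRAY_SIZE(order); i++) {
> 		if (order[i] == update_bit)
> 			return offset;		/* offset of the requested word */
> 		if (cllr & order[i])
> 			offset += sizeof(u32);	/* word is present, skip it */
> 	}
> 	return offset;
> }
> 
> With only CLLR_ULL set, lli_word_offset(cllr, CLLR_ULL) returns 0, which
> matches the example above.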
> 
> I hope this gives some insight into why I've decided not to reuse the
> existing drivers, either in whole or in part.
> 
> Amelie
> 
> > > 
> > > > > diff --git a/drivers/dma/stm32/stm32-dma3.c b/drivers/dma/stm32/stm32-dma3.c
> > > > > new file mode 100644
> > > > > index 000000000000..b5493f497d06
> > > > > --- /dev/null
> > > > > +++ b/drivers/dma/stm32/stm32-dma3.c
> > > > > @@ -0,0 +1,1431 @@
> > > > > +// SPDX-License-Identifier: GPL-2.0-only
> > > > > +/*
> > > > > + * STM32 DMA3 controller driver
> > > > > + *
> > > > > + * Copyright (C) STMicroelectronics 2024
> > > > > + * Author(s): Amelie Delaunay <amelie.delaunay@foss.st.com>
> > > > > + */
> > > > > +
> > > > > +#include <linux/bitfield.h>
> > > > > +#include <linux/clk.h>
> > > > > +#include <linux/dma-mapping.h>
> > > > > +#include <linux/dmaengine.h>
> > > > > +#include <linux/dmapool.h>
> > > > > +#include <linux/init.h>
> > > > > +#include <linux/iopoll.h>
> > > > > +#include <linux/list.h>
> > > > > +#include <linux/module.h>
> > > > > +#include <linux/of_dma.h>
> > > > > +#include <linux/platform_device.h>
> > > > > +#include <linux/pm_runtime.h>
> > > > > +#include <linux/reset.h>
> > > > > +#include <linux/slab.h>
> > > > > +
> > > > > +#include "../virt-dma.h"
> > > > > +
> > > > > +#define STM32_DMA3_SECCFGR		0x00
> > > > > +#define STM32_DMA3_PRIVCFGR		0x04
> > > > > +#define STM32_DMA3_RCFGLOCKR		0x08
> > > > > +#define STM32_DMA3_MISR			0x0C
> > > > 
> > > > lower hex please
> > > > 
> > > 
> > > Ok.
> > > 
> > > > > +#define STM32_DMA3_SMISR		0x10
> > > > > +
> > > > > +#define STM32_DMA3_CLBAR(x)		(0x50 + 0x80 * (x))
> > > > > +#define STM32_DMA3_CCIDCFGR(x)		(0x54 + 0x80 * (x))
> > > > > +#define STM32_DMA3_CSEMCR(x)		(0x58 + 0x80 * (x))
> > > > > +#define STM32_DMA3_CFCR(x)		(0x5C + 0x80 * (x))
> > > > > +#define STM32_DMA3_CSR(x)		(0x60 + 0x80 * (x))
> > > > > +#define STM32_DMA3_CCR(x)		(0x64 + 0x80 * (x))
> > > > > +#define STM32_DMA3_CTR1(x)		(0x90 + 0x80 * (x))
> > > > > +#define STM32_DMA3_CTR2(x)		(0x94 + 0x80 * (x))
> > > > > +#define STM32_DMA3_CBR1(x)		(0x98 + 0x80 * (x))
> > > > > +#define STM32_DMA3_CSAR(x)		(0x9C + 0x80 * (x))
> > > > > +#define STM32_DMA3_CDAR(x)		(0xA0 + 0x80 * (x))
> > > > > +#define STM32_DMA3_CLLR(x)		(0xCC + 0x80 * (x))
> > > > > +
> > > > > +#define STM32_DMA3_HWCFGR13		0xFC0 /* G_PER_CTRL(X) x=8..15 */
> > > > > +#define STM32_DMA3_HWCFGR12		0xFC4 /* G_PER_CTRL(X) x=0..7 */
> > > > > +#define STM32_DMA3_HWCFGR4		0xFE4 /* G_FIFO_SIZE(X) x=8..15 */
> > > > > +#define STM32_DMA3_HWCFGR3		0xFE8 /* G_FIFO_SIZE(X) x=0..7 */
> > > > > +#define STM32_DMA3_HWCFGR2		0xFEC /* G_MAX_REQ_ID */
> > > > > +#define STM32_DMA3_HWCFGR1		0xFF0 /* G_MASTER_PORTS, G_NUM_CHANNELS, G_Mx_DATA_WIDTH */
> > > > > +#define STM32_DMA3_VERR			0xFF4
> > > > 
> > > > here as well
> > > > 
> > > 
> > > Ok.
> > > 
> > > > > +
> > > > > +/* SECCFGR DMA secure configuration register */
> > > > > +#define SECCFGR_SEC(x)			BIT(x)
> > > > > +
> > > > > +/* MISR DMA non-secure/secure masked interrupt status register */
> > > > > +#define MISR_MIS(x)			BIT(x)
> > > > > +
> > > > > +/* CxLBAR DMA channel x linked_list base address register */
> > > > > +#define CLBAR_LBA			GENMASK(31, 16)
> > > > > +
> > > > > +/* CxCIDCFGR DMA channel x CID register */
> > > > > +#define CCIDCFGR_CFEN			BIT(0)
> > > > > +#define CCIDCFGR_SEM_EN			BIT(1)
> > > > > +#define CCIDCFGR_SCID			GENMASK(5, 4)
> > > > > +#define CCIDCFGR_SEM_WLIST_CID0		BIT(16)
> > > > > +#define CCIDCFGR_SEM_WLIST_CID1		BIT(17)
> > > > > +#define CCIDCFGR_SEM_WLIST_CID2		BIT(18)
> > > > > +
> > > > > +enum ccidcfgr_cid {
> > > > > +	CCIDCFGR_CID0,
> > > > > +	CCIDCFGR_CID1,
> > > > > +	CCIDCFGR_CID2,
> > > > > +};
> > > > > +
> > > > > +/* CxSEMCR DMA channel x semaphore control register */
> > > > > +#define CSEMCR_SEM_MUTEX		BIT(0)
> > > > > +#define CSEMCR_SEM_CCID			GENMASK(5, 4)
> > > > > +
> > > > > +/* CxFCR DMA channel x flag clear register */
> > > > > +#define CFCR_TCF			BIT(8)
> > > > > +#define CFCR_HTF			BIT(9)
> > > > > +#define CFCR_DTEF			BIT(10)
> > > > > +#define CFCR_ULEF			BIT(11)
> > > > > +#define CFCR_USEF			BIT(12)
> > > > > +#define CFCR_SUSPF			BIT(13)
> > > > > +
> > > > > +/* CxSR DMA channel x status register */
> > > > > +#define CSR_IDLEF			BIT(0)
> > > > > +#define CSR_TCF				BIT(8)
> > > > > +#define CSR_HTF				BIT(9)
> > > > > +#define CSR_DTEF			BIT(10)
> > > > > +#define CSR_ULEF			BIT(11)
> > > > > +#define CSR_USEF			BIT(12)
> > > > > +#define CSR_SUSPF			BIT(13)
> > > > > +#define CSR_ALL_F			GENMASK(13, 8)
> > > > > +#define CSR_FIFOL			GENMASK(24, 16)
> > > > > +
> > > > > +/* CxCR DMA channel x control register */
> > > > > +#define CCR_EN				BIT(0)
> > > > > +#define CCR_RESET			BIT(1)
> > > > > +#define CCR_SUSP			BIT(2)
> > > > > +#define CCR_TCIE			BIT(8)
> > > > > +#define CCR_HTIE			BIT(9)
> > > > > +#define CCR_DTEIE			BIT(10)
> > > > > +#define CCR_ULEIE			BIT(11)
> > > > > +#define CCR_USEIE			BIT(12)
> > > > > +#define CCR_SUSPIE			BIT(13)
> > > > > +#define CCR_ALLIE			GENMASK(13, 8)
> > > > > +#define CCR_LSM				BIT(16)
> > > > > +#define CCR_LAP				BIT(17)
> > > > > +#define CCR_PRIO			GENMASK(23, 22)
> > > > > +
> > > > > +enum ccr_prio {
> > > > > +	CCR_PRIO_LOW,
> > > > > +	CCR_PRIO_MID,
> > > > > +	CCR_PRIO_HIGH,
> > > > > +	CCR_PRIO_VERY_HIGH,
> > > > > +};
> > > > > +
> > > > > +/* CxTR1 DMA channel x transfer register 1 */
> > > > > +#define CTR1_SINC			BIT(3)
> > > > > +#define CTR1_SBL_1			GENMASK(9, 4)
> > > > > +#define CTR1_DINC			BIT(19)
> > > > > +#define CTR1_DBL_1			GENMASK(25, 20)
> > > > > +#define CTR1_SDW_LOG2			GENMASK(1, 0)
> > > > > +#define CTR1_PAM			GENMASK(12, 11)
> > > > > +#define CTR1_SAP			BIT(14)
> > > > > +#define CTR1_DDW_LOG2			GENMASK(17, 16)
> > > > > +#define CTR1_DAP			BIT(30)
> > > > > +
> > > > > +enum ctr1_dw {
> > > > > +	CTR1_DW_BYTE,
> > > > > +	CTR1_DW_HWORD,
> > > > > +	CTR1_DW_WORD,
> > > > > +	CTR1_DW_DWORD, /* Depends on HWCFGR1.G_M0_DATA_WIDTH_ENC and .G_M1_DATA_WIDTH_ENC */
> > > > > +};
> > > > > +
> > > > > +enum ctr1_pam {
> > > > > +	CTR1_PAM_0S_LT, /* if DDW > SDW, padded with 0s else left-truncated */
> > > > > +	CTR1_PAM_SE_RT, /* if DDW > SDW, sign extended else right-truncated */
> > > > > +	CTR1_PAM_PACK_UNPACK, /* FIFO queued */
> > > > > +};
> > > > > +
> > > > > +/* CxTR2 DMA channel x transfer register 2 */
> > > > > +#define CTR2_REQSEL			GENMASK(7, 0)
> > > > > +#define CTR2_SWREQ			BIT(9)
> > > > > +#define CTR2_DREQ			BIT(10)
> > > > > +#define CTR2_BREQ			BIT(11)
> > > > > +#define CTR2_PFREQ			BIT(12)
> > > > > +#define CTR2_TCEM			GENMASK(31, 30)
> > > > > +
> > > > > +enum ctr2_tcem {
> > > > > +	CTR2_TCEM_BLOCK,
> > > > > +	CTR2_TCEM_REPEAT_BLOCK,
> > > > > +	CTR2_TCEM_LLI,
> > > > > +	CTR2_TCEM_CHANNEL,
> > > > > +};
> > > > > +
> > > > > +/* CxBR1 DMA channel x block register 1 */
> > > > > +#define CBR1_BNDT			GENMASK(15, 0)
> > > > > +
> > > > > +/* CxLLR DMA channel x linked-list address register */
> > > > > +#define CLLR_LA				GENMASK(15, 2)
> > > > > +#define CLLR_ULL			BIT(16)
> > > > > +#define CLLR_UDA			BIT(27)
> > > > > +#define CLLR_USA			BIT(28)
> > > > > +#define CLLR_UB1			BIT(29)
> > > > > +#define CLLR_UT2			BIT(30)
> > > > > +#define CLLR_UT1			BIT(31)
> > > > > +
> > > > > +/* HWCFGR13 DMA hardware configuration register 13 x=8..15 */
> > > > > +/* HWCFGR12 DMA hardware configuration register 12 x=0..7 */
> > > > > +#define G_PER_CTRL(x)			(ULL(0x1) << (4 * (x)))
> > > > > +
> > > > > +/* HWCFGR4 DMA hardware configuration register 4 x=8..15 */
> > > > > +/* HWCFGR3 DMA hardware configuration register 3 x=0..7 */
> > > > > +#define G_FIFO_SIZE(x)			(ULL(0x7) << (4 * (x)))
> > > > > +
> > > > > +#define get_chan_hwcfg(x, mask, reg)	(((reg) & (mask)) >> (4 * (x)))
> > > > > +
> > > > > +/* HWCFGR2 DMA hardware configuration register 2 */
> > > > > +#define G_MAX_REQ_ID			GENMASK(7, 0)
> > > > > +
> > > > > +/* HWCFGR1 DMA hardware configuration register 1 */
> > > > > +#define G_MASTER_PORTS			GENMASK(2, 0)
> > > > > +#define G_NUM_CHANNELS			GENMASK(12, 8)
> > > > > +#define G_M0_DATA_WIDTH_ENC		GENMASK(25, 24)
> > > > > +#define G_M1_DATA_WIDTH_ENC		GENMASK(29, 28)
> > > > > +
> > > > > +enum stm32_dma3_master_ports {
> > > > > +	AXI64, /* 1x AXI: 64-bit port 0 */
> > > > > +	AHB32, /* 1x AHB: 32-bit port 0 */
> > > > > +	AHB32_AHB32, /* 2x AHB: 32-bit port 0 and 32-bit port 1 */
> > > > > +	AXI64_AHB32, /* 1x AXI 64-bit port 0 and 1x AHB 32-bit port 1 */
> > > > > +	AXI64_AXI64, /* 2x AXI: 64-bit port 0 and 64-bit port 1 */
> > > > > +	AXI128_AHB32, /* 1x AXI 128-bit port 0 and 1x AHB 32-bit port 1 */
> > > > > +};
> > > > > +
> > > > > +enum stm32_dma3_port_data_width {
> > > > > +	DW_32, /* 32-bit, for AHB */
> > > > > +	DW_64, /* 64-bit, for AXI */
> > > > > +	DW_128, /* 128-bit, for AXI */
> > > > > +	DW_INVALID,
> > > > > +};
> > > > > +
> > > > > +/* VERR DMA version register */
> > > > > +#define VERR_MINREV			GENMASK(3, 0)
> > > > > +#define VERR_MAJREV			GENMASK(7, 4)
> > > > > +
> > > > > +/* Device tree */
> > > > > +/* struct stm32_dma3_dt_conf */
> > > > > +/* .ch_conf */
> > > > > +#define STM32_DMA3_DT_PRIO		GENMASK(1, 0) /* CCR_PRIO */
> > > > > +#define STM32_DMA3_DT_FIFO		GENMASK(7, 4)
> > > > > +/* .tr_conf */
> > > > > +#define STM32_DMA3_DT_SINC		BIT(0) /* CTR1_SINC */
> > > > > +#define STM32_DMA3_DT_SAP		BIT(1) /* CTR1_SAP */
> > > > > +#define STM32_DMA3_DT_DINC		BIT(4) /* CTR1_DINC */
> > > > > +#define STM32_DMA3_DT_DAP		BIT(5) /* CTR1_DAP */
> > > > > +#define STM32_DMA3_DT_BREQ		BIT(8) /* CTR2_BREQ */
> > > > > +#define STM32_DMA3_DT_PFREQ		BIT(9) /* CTR2_PFREQ */
> > > > > +#define STM32_DMA3_DT_TCEM		GENMASK(13, 12) /* CTR2_TCEM */
> > > > > +
> > > > > +#define STM32_DMA3_MAX_BLOCK_SIZE	ALIGN_DOWN(CBR1_BNDT, 64)
> > > > > +#define port_is_ahb(maxdw)		({ typeof(maxdw) (_maxdw) = (maxdw); \
> > > > > +					   ((_maxdw) != DW_INVALID) && ((_maxdw) == DW_32); })
> > > > > +#define port_is_axi(maxdw)		({ typeof(maxdw) (_maxdw) = (maxdw); \
> > > > > +					   ((_maxdw) != DW_INVALID) && ((_maxdw) != DW_32); })
> > > > > +#define get_chan_max_dw(maxdw, maxburst)((port_is_ahb(maxdw) ||			     \
> > > > > +					  (maxburst) < DMA_SLAVE_BUSWIDTH_8_BYTES) ? \
> > > > > +					 DMA_SLAVE_BUSWIDTH_4_BYTES : DMA_SLAVE_BUSWIDTH_8_BYTES)
> > > > > +
> > > > > +/* Static linked-list data structure (depends on update bits UT1/UT2/UB1/USA/UDA/ULL) */
> > > > > +struct stm32_dma3_hwdesc {
> > > > > +	u32 ctr1;
> > > > > +	u32 ctr2;
> > > > > +	u32 cbr1;
> > > > > +	u32 csar;
> > > > > +	u32 cdar;
> > > > > +	u32 cllr;
> > > > > +} __aligned(32);
> > > > > +
> > > > > +/*
> > > > > + * CLLR_LA / sizeof(struct stm32_dma3_hwdesc) represents the number of hwdesc that can be addressed
> > > > > + * by the pointer to the next linked-list data structure. The __aligned forces the 32-byte
> > > > > + * alignment. So use hardcoded 32. Multiplied by the max block size of each item, it represents
> > > > > + * the sg size limitation.
> > > > > + */
> > > > > +#define STM32_DMA3_MAX_SEG_SIZE		((CLLR_LA / 32) * STM32_DMA3_MAX_BLOCK_SIZE)
> > > > > +
> > > > > +/*
> > > > > + * Linked-list items
> > > > > + */
> > > > > +struct stm32_dma3_lli {
> > > > > +	struct stm32_dma3_hwdesc *hwdesc;
> > > > > +	dma_addr_t hwdesc_addr;
> > > > > +};
> > > > > +
> > > > > +struct stm32_dma3_swdesc {
> > > > > +	struct virt_dma_desc vdesc;
> > > > > +	u32 ccr;
> > > > > +	bool cyclic;
> > > > > +	u32 lli_size;
> > > > > +	struct stm32_dma3_lli lli[] __counted_by(lli_size);
> > > > > +};
> > > > > +
> > > > > +struct stm32_dma3_dt_conf {
> > > > > +	u32 ch_id;
> > > > > +	u32 req_line;
> > > > > +	u32 ch_conf;
> > > > > +	u32 tr_conf;
> > > > > +};
> > > > > +
> > > > > +struct stm32_dma3_chan {
> > > > > +	struct virt_dma_chan vchan;
> > > > > +	u32 id;
> > > > > +	int irq;
> > > > > +	u32 fifo_size;
> > > > > +	u32 max_burst;
> > > > > +	bool semaphore_mode;
> > > > > +	struct stm32_dma3_dt_conf dt_config;
> > > > > +	struct dma_slave_config dma_config;
> > > > > +	struct dma_pool *lli_pool;
> > > > > +	struct stm32_dma3_swdesc *swdesc;
> > > > > +	enum ctr2_tcem tcem;
> > > > > +	u32 dma_status;
> > > > > +};
> > > > > +
> > > > > +struct stm32_dma3_ddata {
> > > > > +	struct dma_device dma_dev;
> > > > > +	void __iomem *base;
> > > > > +	struct clk *clk;
> > > > > +	struct stm32_dma3_chan *chans;
> > > > > +	u32 dma_channels;
> > > > > +	u32 dma_requests;
> > > > > +	enum stm32_dma3_port_data_width ports_max_dw[2];
> > > > > +};
> > > > > +
> > > > > +static inline struct stm32_dma3_ddata *to_stm32_dma3_ddata(struct stm32_dma3_chan *chan)
> > > > > +{
> > > > > +	return container_of(chan->vchan.chan.device, struct stm32_dma3_ddata, dma_dev);
> > > > > +}
> > > > > +
> > > > > +static inline struct stm32_dma3_chan *to_stm32_dma3_chan(struct dma_chan *c)
> > > > > +{
> > > > > +	return container_of(c, struct stm32_dma3_chan, vchan.chan);
> > > > > +}
> > > > > +
> > > > > +static inline struct stm32_dma3_swdesc *to_stm32_dma3_swdesc(struct virt_dma_desc *vdesc)
> > > > > +{
> > > > > +	return container_of(vdesc, struct stm32_dma3_swdesc, vdesc);
> > > > > +}
> > > > > +
> > > > > +static struct device *chan2dev(struct stm32_dma3_chan *chan)
> > > > > +{
> > > > > +	return &chan->vchan.chan.dev->device;
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_chan_dump_reg(struct stm32_dma3_chan *chan)
> > > > > +{
> > > > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > > > +	struct device *dev = chan2dev(chan);
> > > > > +	u32 id = chan->id, offset;
> > > > > +
> > > > > +	offset = STM32_DMA3_SECCFGR;
> > > > > +	dev_dbg(dev, "SECCFGR(0x%03x): %08x\n", offset, readl_relaxed(ddata->base + offset));
> > > > > +	offset = STM32_DMA3_PRIVCFGR;
> > > > > +	dev_dbg(dev, "PRIVCFGR(0x%03x): %08x\n", offset, readl_relaxed(ddata->base + offset));
> > > > > +	offset = STM32_DMA3_CCIDCFGR(id);
> > > > > +	dev_dbg(dev, "C%dCIDCFGR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > > > +	offset = STM32_DMA3_CSEMCR(id);
> > > > > +	dev_dbg(dev, "C%dSEMCR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > > > +	offset = STM32_DMA3_CSR(id);
> > > > > +	dev_dbg(dev, "C%dSR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > > > +	offset = STM32_DMA3_CCR(id);
> > > > > +	dev_dbg(dev, "C%dCR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > > > +	offset = STM32_DMA3_CTR1(id);
> > > > > +	dev_dbg(dev, "C%dTR1(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > > > +	offset = STM32_DMA3_CTR2(id);
> > > > > +	dev_dbg(dev, "C%dTR2(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > > > +	offset = STM32_DMA3_CBR1(id);
> > > > > +	dev_dbg(dev, "C%dBR1(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > > > +	offset = STM32_DMA3_CSAR(id);
> > > > > +	dev_dbg(dev, "C%dSAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > > > +	offset = STM32_DMA3_CDAR(id);
> > > > > +	dev_dbg(dev, "C%dDAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > > > +	offset = STM32_DMA3_CLLR(id);
> > > > > +	dev_dbg(dev, "C%dLLR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > > > +	offset = STM32_DMA3_CLBAR(id);
> > > > > +	dev_dbg(dev, "C%dLBAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_chan_dump_hwdesc(struct stm32_dma3_chan *chan,
> > > > > +					struct stm32_dma3_swdesc *swdesc)
> > > > > +{
> > > > > +	struct stm32_dma3_hwdesc *hwdesc;
> > > > > +	int i;
> > > > > +
> > > > > +	for (i = 0; i < swdesc->lli_size; i++) {
> > > > > +		hwdesc = swdesc->lli[i].hwdesc;
> > > > > +		if (i)
> > > > > +			dev_dbg(chan2dev(chan), "V\n");
> > > > > +		dev_dbg(chan2dev(chan), "[%d]@%pad\n", i, &swdesc->lli[i].hwdesc_addr);
> > > > > +		dev_dbg(chan2dev(chan), "| C%dTR1: %08x\n", chan->id, hwdesc->ctr1);
> > > > > +		dev_dbg(chan2dev(chan), "| C%dTR2: %08x\n", chan->id, hwdesc->ctr2);
> > > > > +		dev_dbg(chan2dev(chan), "| C%dBR1: %08x\n", chan->id, hwdesc->cbr1);
> > > > > +		dev_dbg(chan2dev(chan), "| C%dSAR: %08x\n", chan->id, hwdesc->csar);
> > > > > +		dev_dbg(chan2dev(chan), "| C%dDAR: %08x\n", chan->id, hwdesc->cdar);
> > > > > +		dev_dbg(chan2dev(chan), "| C%dLLR: %08x\n", chan->id, hwdesc->cllr);
> > > > > +	}
> > > > > +
> > > > > +	if (swdesc->cyclic) {
> > > > > +		dev_dbg(chan2dev(chan), "|\n");
> > > > > +		dev_dbg(chan2dev(chan), "-->[0]@%pad\n", &swdesc->lli[0].hwdesc_addr);
> > > > > +	} else {
> > > > > +		dev_dbg(chan2dev(chan), "X\n");
> > > > > +	}
> > > > > +}
> > > > > +
> > > > > +static struct stm32_dma3_swdesc *stm32_dma3_chan_desc_alloc(struct stm32_dma3_chan *chan, u32 count)
> > > > > +{
> > > > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > > > +	struct stm32_dma3_swdesc *swdesc;
> > > > > +	int i;
> > > > > +
> > > > > +	/*
> > > > > +	 * If the memory to be allocated for the number of hwdesc (6 u32 members but 32-bytes
> > > > > +	 * aligned) is greater than the maximum address of CLLR_LA, then the last items can't be
> > > > > +	 * addressed, so abort the allocation.
> > > > > +	 */
> > > > > +	if ((count * 32) > CLLR_LA) {
> > > > > +		dev_err(chan2dev(chan), "Transfer is too big (> %luB)\n", STM32_DMA3_MAX_SEG_SIZE);
> > > > > +		return NULL;
> > > > > +	}
> > > > > +
> > > > > +	swdesc = kzalloc(struct_size(swdesc, lli, count), GFP_NOWAIT);
> > > > > +	if (!swdesc)
> > > > > +		return NULL;
> > > > > +
> > > > > +	for (i = 0; i < count; i++) {
> > > > > +		swdesc->lli[i].hwdesc = dma_pool_zalloc(chan->lli_pool, GFP_NOWAIT,
> > > > > +							&swdesc->lli[i].hwdesc_addr);
> > > > > +		if (!swdesc->lli[i].hwdesc)
> > > > > +			goto err_pool_free;
> > > > > +	}
> > > > > +	swdesc->lli_size = count;
> > > > > +	swdesc->ccr = 0;
> > > > > +
> > > > > +	/* Set LL base address */
> > > > > +	writel_relaxed(swdesc->lli[0].hwdesc_addr & CLBAR_LBA,
> > > > > +		       ddata->base + STM32_DMA3_CLBAR(chan->id));
> > > > > +
> > > > > +	/* Set LL allocated port */
> > > > > +	swdesc->ccr &= ~CCR_LAP;
> > > > > +
> > > > > +	return swdesc;
> > > > > +
> > > > > +err_pool_free:
> > > > > +	dev_err(chan2dev(chan), "Failed to alloc descriptors\n");
> > > > > +	while (--i >= 0)
> > > > > +		dma_pool_free(chan->lli_pool, swdesc->lli[i].hwdesc, swdesc->lli[i].hwdesc_addr);
> > > > > +	kfree(swdesc);
> > > > > +
> > > > > +	return NULL;
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_chan_desc_free(struct stm32_dma3_chan *chan,
> > > > > +				      struct stm32_dma3_swdesc *swdesc)
> > > > > +{
> > > > > +	int i;
> > > > > +
> > > > > +	for (i = 0; i < swdesc->lli_size; i++)
> > > > > +		dma_pool_free(chan->lli_pool, swdesc->lli[i].hwdesc, swdesc->lli[i].hwdesc_addr);
> > > > > +
> > > > > +	kfree(swdesc);
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_chan_vdesc_free(struct virt_dma_desc *vdesc)
> > > > > +{
> > > > > +	struct stm32_dma3_swdesc *swdesc = to_stm32_dma3_swdesc(vdesc);
> > > > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(vdesc->tx.chan);
> > > > > +
> > > > > +	stm32_dma3_chan_desc_free(chan, swdesc);
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_check_user_setting(struct stm32_dma3_chan *chan)
> > > > > +{
> > > > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > > > +	struct device *dev = chan2dev(chan);
> > > > > +	u32 ctr1 = readl_relaxed(ddata->base + STM32_DMA3_CTR1(chan->id));
> > > > > +	u32 cbr1 = readl_relaxed(ddata->base + STM32_DMA3_CBR1(chan->id));
> > > > > +	u32 csar = readl_relaxed(ddata->base + STM32_DMA3_CSAR(chan->id));
> > > > > +	u32 cdar = readl_relaxed(ddata->base + STM32_DMA3_CDAR(chan->id));
> > > > > +	u32 cllr = readl_relaxed(ddata->base + STM32_DMA3_CLLR(chan->id));
> > > > > +	u32 bndt = FIELD_GET(CBR1_BNDT, cbr1);
> > > > > +	u32 sdw = 1 << FIELD_GET(CTR1_SDW_LOG2, ctr1);
> > > > > +	u32 ddw = 1 << FIELD_GET(CTR1_DDW_LOG2, ctr1);
> > > > > +	u32 sap = FIELD_GET(CTR1_SAP, ctr1);
> > > > > +	u32 dap = FIELD_GET(CTR1_DAP, ctr1);
> > > > > +
> > > > > +	if (!bndt && !FIELD_GET(CLLR_UB1, cllr))
> > > > > +		dev_err(dev, "null source block size and no update of this value\n");
> > > > > +	if (bndt % sdw)
> > > > > +		dev_err(dev, "source block size not multiple of src data width\n");
> > > > > +	if (FIELD_GET(CTR1_PAM, ctr1) == CTR1_PAM_PACK_UNPACK && bndt % ddw)
> > > > > +		dev_err(dev, "(un)packing mode w/ src block size not multiple of dst data width\n");
> > > > > +	if (csar % sdw)
> > > > > +		dev_err(dev, "unaligned source address not multiple of src data width\n");
> > > > > +	if (cdar % ddw)
> > > > > +		dev_err(dev, "unaligned destination address not multiple of dst data width\n");
> > > > > +	if (sdw == DMA_SLAVE_BUSWIDTH_8_BYTES && port_is_ahb(ddata->ports_max_dw[sap]))
> > > > > +		dev_err(dev, "double-word source data width not supported on port %u\n", sap);
> > > > > +	if (ddw == DMA_SLAVE_BUSWIDTH_8_BYTES && port_is_ahb(ddata->ports_max_dw[dap]))
> > > > > +		dev_err(dev, "double-word destination data width not supported on port %u\n", dap);
> > > > 
> > > > NO error/abort if this is wrong...?
> > > > 
> > > 
> > > A User Setting Error triggers an interrupt caught in the
> > > stm32_dma3_chan_irq() interrupt handler.
> > > Indeed, a User Setting Error can occur when enabling the channel or when
> > > DMA3 registers are updated with each linked-list item.
> > > In the interrupt handler, when USEF (User Setting Error Flag) is set, this
> > > function (stm32_dma3_check_user_setting) helps the user understand what
> > > went wrong. The hardware automatically disables the channel to prevent the
> > > execution of the wrongly programmed transfer, and the driver resets the
> > > channel and sets chan->dma_status = DMA_ERROR. dmaengine_tx_status() will
> > > then return DMA_ERROR.
> > > So from the user's point of view, the transfer will never complete, and the
> > > channel is ready to be reprogrammed.
> > > Note that in the _prep_ functions, everything is checked to avoid a user
> > > setting error. If one occurs anyway, it is rather due to a corrupted
> > > linked-list item (which should fortunately never happen).
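> > > 
> > > To make the user-visible behaviour concrete, here is a minimal client-side
> > > sketch (hypothetical helper name, assuming <linux/dmaengine.h>, not part of
> > > this patchset) showing how the error surfaces through the dmaengine API:
> > > 
> > > /* Returns true if the transfer identified by 'cookie' hit the error path
> > >  * described above: the driver has reset the channel, the transfer will
> > >  * never complete, and the channel can be reprogrammed.
> > >  */
> > > static bool my_client_xfer_failed(struct dma_chan *chan, dma_cookie_t cookie)
> > > {
> > > 	struct dma_tx_state state;
> > > 	enum dma_status status;
> > > 
> > > 	status = dmaengine_tx_status(chan, cookie, &state);
> > > 
> > > 	return status == DMA_ERROR;
> > > }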
> > > 
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_chan_prep_hwdesc(struct stm32_dma3_chan *chan,
> > > > > +					struct stm32_dma3_swdesc *swdesc,
> > > > > +					u32 curr, dma_addr_t src, dma_addr_t dst, u32 len,
> > > > > +					u32 ctr1, u32 ctr2, bool is_last, bool is_cyclic)
> > > > > +{
> > > > > +	struct stm32_dma3_hwdesc *hwdesc;
> > > > > +	dma_addr_t next_lli;
> > > > > +	u32 next = curr + 1;
> > > > > +
> > > > > +	hwdesc = swdesc->lli[curr].hwdesc;
> > > > > +	hwdesc->ctr1 = ctr1;
> > > > > +	hwdesc->ctr2 = ctr2;
> > > > > +	hwdesc->cbr1 = FIELD_PREP(CBR1_BNDT, len);
> > > > > +	hwdesc->csar = src;
> > > > > +	hwdesc->cdar = dst;
> > > > > +
> > > > > +	if (is_last) {
> > > > > +		if (is_cyclic)
> > > > > +			next_lli = swdesc->lli[0].hwdesc_addr;
> > > > > +		else
> > > > > +			next_lli = 0;
> > > > > +	} else {
> > > > > +		next_lli = swdesc->lli[next].hwdesc_addr;
> > > > > +	}
> > > > > +
> > > > > +	hwdesc->cllr = 0;
> > > > > +	if (next_lli) {
> > > > > +		hwdesc->cllr |= CLLR_UT1 | CLLR_UT2 | CLLR_UB1;
> > > > > +		hwdesc->cllr |= CLLR_USA | CLLR_UDA | CLLR_ULL;
> > > > > +		hwdesc->cllr |= (next_lli & CLLR_LA);
> > > > > +	}
> > > > > +}
> > > > > +
> > > > > +static enum dma_slave_buswidth stm32_dma3_get_max_dw(u32 chan_max_burst,
> > > > > +						     enum stm32_dma3_port_data_width port_max_dw,
> > > > > +						     u32 len, dma_addr_t addr)
> > > > > +{
> > > > > +	enum dma_slave_buswidth max_dw = get_chan_max_dw(port_max_dw, chan_max_burst);
> > > > > +
> > > > > +	/* len and addr must be a multiple of dw */
> > > > > +	return 1 << __ffs(len | addr | max_dw);
> > > > > +}
> > > > > +
> > > > > +static u32 stm32_dma3_get_max_burst(u32 len, enum dma_slave_buswidth dw, u32 chan_max_burst)
> > > > > +{
> > > > > +	u32 max_burst = chan_max_burst ? chan_max_burst / dw : 1;
> > > > > +
> > > > > +	/* len is a multiple of dw, so if len is < chan_max_burst, shorten burst */
> > > > > +	if (len < chan_max_burst)
> > > > > +		max_burst = len / dw;
> > > > > +
> > > > > +	/*
> > > > > +	 * HW doesn't modify the burst if burst size <= half of the fifo size.
> > > > > +	 * If len is not a multiple of burst size, last burst is shortened by HW.
> > > > > +	 */
> > > > > +	return max_burst;
> > > > > +}
> > > > > +
> > > > > +static int stm32_dma3_chan_prep_hw(struct stm32_dma3_chan *chan, enum dma_transfer_direction dir,
> > > > > +				   u32 *ccr, u32 *ctr1, u32 *ctr2,
> > > > > +				   dma_addr_t src_addr, dma_addr_t dst_addr, u32 len)
> > > > > +{
> > > > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > > > +	struct dma_device dma_device = ddata->dma_dev;
> > > > > +	u32 sdw, ddw, sbl_max, dbl_max, tcem;
> > > > > +	u32 _ctr1 = 0, _ctr2 = 0;
> > > > > +	u32 ch_conf = chan->dt_config.ch_conf;
> > > > > +	u32 tr_conf = chan->dt_config.tr_conf;
> > > > > +	u32 sap = FIELD_GET(STM32_DMA3_DT_SAP, tr_conf), sap_max_dw;
> > > > > +	u32 dap = FIELD_GET(STM32_DMA3_DT_DAP, tr_conf), dap_max_dw;
> > > > > +
> > > > > +	dev_dbg(chan2dev(chan), "%s from %pad to %pad\n",
> > > > > +		dmaengine_get_direction_text(dir), &src_addr, &dst_addr);
> > > > > +
> > > > > +	sdw = chan->dma_config.src_addr_width ? : get_chan_max_dw(sap, chan->max_burst);
> > > > > +	ddw = chan->dma_config.dst_addr_width ? : get_chan_max_dw(dap, chan->max_burst);
> > > > > +	sbl_max = chan->dma_config.src_maxburst ? : 1;
> > > > > +	dbl_max = chan->dma_config.dst_maxburst ? : 1;
> > > > > +
> > > > > +	/* Following conditions would raise User Setting Error interrupt */
> > > > > +	if (!(dma_device.src_addr_widths & BIT(sdw)) || !(dma_device.dst_addr_widths & BIT(ddw))) {
> > > > > +		dev_err(chan2dev(chan), "Bus width (src=%u, dst=%u) not supported\n", sdw, ddw);
> > > > > +		return -EINVAL;
> > > > > +	}
> > > > > +
> > > > > +	if (ddata->ports_max_dw[1] == DW_INVALID && (sap || dap)) {
> > > > > +		dev_err(chan2dev(chan), "Only one master port, port 1 is not supported\n");
> > > > > +		return -EINVAL;
> > > > > +	}
> > > > > +
> > > > > +	sap_max_dw = ddata->ports_max_dw[sap];
> > > > > +	dap_max_dw = ddata->ports_max_dw[dap];
> > > > > +	if ((port_is_ahb(sap_max_dw) && sdw == DMA_SLAVE_BUSWIDTH_8_BYTES) ||
> > > > > +	    (port_is_ahb(dap_max_dw) && ddw == DMA_SLAVE_BUSWIDTH_8_BYTES)) {
> > > > > +		dev_err(chan2dev(chan),
> > > > > +			"8 bytes buswidth (src=%u, dst=%u) not supported on port (sap=%u, dap=%u)\n",
> > > > > +			sdw, ddw, sap, dap);
> > > > > +		return -EINVAL;
> > > > > +	}
> > > > > +
> > > > > +	if (FIELD_GET(STM32_DMA3_DT_SINC, tr_conf))
> > > > > +		_ctr1 |= CTR1_SINC;
> > > > > +	if (sap)
> > > > > +		_ctr1 |= CTR1_SAP;
> > > > > +	if (FIELD_GET(STM32_DMA3_DT_DINC, tr_conf))
> > > > > +		_ctr1 |= CTR1_DINC;
> > > > > +	if (dap)
> > > > > +		_ctr1 |= CTR1_DAP;
> > > > > +
> > > > > +	_ctr2 |= FIELD_PREP(CTR2_REQSEL, chan->dt_config.req_line) & ~CTR2_SWREQ;
> > > > > +	if (FIELD_GET(STM32_DMA3_DT_BREQ, tr_conf))
> > > > > +		_ctr2 |= CTR2_BREQ;
> > > > > +	if (dir == DMA_DEV_TO_MEM && FIELD_GET(STM32_DMA3_DT_PFREQ, tr_conf))
> > > > > +		_ctr2 |= CTR2_PFREQ;
> > > > > +	tcem = FIELD_GET(STM32_DMA3_DT_TCEM, tr_conf);
> > > > > +	_ctr2 |= FIELD_PREP(CTR2_TCEM, tcem);
> > > > > +
> > > > > +	/* Store TCEM to know on which event TC flag occurred */
> > > > > +	chan->tcem = tcem;
> > > > > +	/* Store direction for residue computation */
> > > > > +	chan->dma_config.direction = dir;
> > > > > +
> > > > > +	switch (dir) {
> > > > > +	case DMA_MEM_TO_DEV:
> > > > > +		/* Set destination (device) data width and burst */
> > > > > +		ddw = min_t(u32, ddw, stm32_dma3_get_max_dw(chan->max_burst, dap_max_dw,
> > > > > +							    len, dst_addr));
> > > > > +		dbl_max = min_t(u32, dbl_max, stm32_dma3_get_max_burst(len, ddw, chan->max_burst));
> > > > > +
> > > > > +		/* Set source (memory) data width and burst */
> > > > > +		sdw = stm32_dma3_get_max_dw(chan->max_burst, sap_max_dw, len, src_addr);
> > > > > +		sbl_max = stm32_dma3_get_max_burst(len, sdw, chan->max_burst);
> > > > > +
> > > > > +		_ctr1 |= FIELD_PREP(CTR1_SDW_LOG2, ilog2(sdw));
> > > > > +		_ctr1 |= FIELD_PREP(CTR1_SBL_1, sbl_max - 1);
> > > > > +		_ctr1 |= FIELD_PREP(CTR1_DDW_LOG2, ilog2(ddw));
> > > > > +		_ctr1 |= FIELD_PREP(CTR1_DBL_1, dbl_max - 1);
> > > > > +
> > > > > +		if (ddw != sdw) {
> > > > > +			_ctr1 |= FIELD_PREP(CTR1_PAM, CTR1_PAM_PACK_UNPACK);
> > > > > +			/* Should never reach this case as ddw is clamped down */
> > > > > +			if (len & (ddw - 1)) {
> > > > > +				dev_err(chan2dev(chan),
> > > > > +					"Packing mode is enabled and len is not multiple of ddw");
> > > > > +				return -EINVAL;
> > > > > +			}
> > > > > +		}
> > > > > +
> > > > > +		/* dst = dev */
> > > > > +		_ctr2 |= CTR2_DREQ;
> > > > > +
> > > > > +		break;
> > > > > +
> > > > > +	case DMA_DEV_TO_MEM:
> > > > > +		/* Set source (device) data width and burst */
> > > > > +		sdw = min_t(u32, sdw, stm32_dma3_get_max_dw(chan->max_burst, sap_max_dw,
> > > > > +							    len, src_addr));
> > > > > +		sbl_max = min_t(u32, sbl_max, stm32_dma3_get_max_burst(len, sdw, chan->max_burst));
> > > > > +
> > > > > +		/* Set destination (memory) data width and burst */
> > > > > +		ddw = stm32_dma3_get_max_dw(chan->max_burst, dap_max_dw, len, dst_addr);
> > > > > +		dbl_max = stm32_dma3_get_max_burst(len, ddw, chan->max_burst);
> > > > > +
> > > > > +		_ctr1 |= FIELD_PREP(CTR1_SDW_LOG2, ilog2(sdw));
> > > > > +		_ctr1 |= FIELD_PREP(CTR1_SBL_1, sbl_max - 1);
> > > > > +		_ctr1 |= FIELD_PREP(CTR1_DDW_LOG2, ilog2(ddw));
> > > > > +		_ctr1 |= FIELD_PREP(CTR1_DBL_1, dbl_max - 1);
> > > > > +
> > > > > +		if (ddw != sdw) {
> > > > > +			_ctr1 |= FIELD_PREP(CTR1_PAM, CTR1_PAM_PACK_UNPACK);
> > > > > +			/* Should never reach this case as ddw is clamped down */
> > > > > +			if (len & (ddw - 1)) {
> > > > > +				dev_err(chan2dev(chan),
> > > > > +					"Packing mode is enabled and len is not multiple of ddw\n");
> > > > > +				return -EINVAL;
> > > > > +			}
> > > > > +		}
> > > > > +
> > > > > +		/* dst = mem */
> > > > > +		_ctr2 &= ~CTR2_DREQ;
> > > > > +
> > > > > +		break;
> > > > > +
> > > > > +	default:
> > > > > +		dev_err(chan2dev(chan), "Direction %s not supported\n",
> > > > > +			dmaengine_get_direction_text(dir));
> > > > > +		return -EINVAL;
> > > > > +	}
> > > > > +
> > > > > +	*ccr |= FIELD_PREP(CCR_PRIO, FIELD_GET(STM32_DMA3_DT_PRIO, ch_conf));
> > > > > +	*ctr1 = _ctr1;
> > > > > +	*ctr2 = _ctr2;
> > > > > +
> > > > > +	dev_dbg(chan2dev(chan), "%s: sdw=%u bytes sbl=%u beats ddw=%u bytes dbl=%u beats\n",
> > > > > +		__func__, sdw, sbl_max, ddw, dbl_max);
> > > > > +
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_chan_start(struct stm32_dma3_chan *chan)
> > > > > +{
> > > > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > > > +	struct virt_dma_desc *vdesc;
> > > > > +	struct stm32_dma3_hwdesc *hwdesc;
> > > > > +	u32 id = chan->id;
> > > > > +	u32 csr, ccr;
> > > > > +
> > > > > +	vdesc = vchan_next_desc(&chan->vchan);
> > > > > +	if (!vdesc) {
> > > > > +		chan->swdesc = NULL;
> > > > > +		return;
> > > > > +	}
> > > > > +	list_del(&vdesc->node);
> > > > > +
> > > > > +	chan->swdesc = to_stm32_dma3_swdesc(vdesc);
> > > > > +	hwdesc = chan->swdesc->lli[0].hwdesc;
> > > > > +
> > > > > +	stm32_dma3_chan_dump_hwdesc(chan, chan->swdesc);
> > > > > +
> > > > > +	writel_relaxed(chan->swdesc->ccr, ddata->base + STM32_DMA3_CCR(id));
> > > > > +	writel_relaxed(hwdesc->ctr1, ddata->base + STM32_DMA3_CTR1(id));
> > > > > +	writel_relaxed(hwdesc->ctr2, ddata->base + STM32_DMA3_CTR2(id));
> > > > > +	writel_relaxed(hwdesc->cbr1, ddata->base + STM32_DMA3_CBR1(id));
> > > > > +	writel_relaxed(hwdesc->csar, ddata->base + STM32_DMA3_CSAR(id));
> > > > > +	writel_relaxed(hwdesc->cdar, ddata->base + STM32_DMA3_CDAR(id));
> > > > > +	writel_relaxed(hwdesc->cllr, ddata->base + STM32_DMA3_CLLR(id));
> > > > > +
> > > > > +	/* Clear any pending interrupts */
> > > > > +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(id));
> > > > > +	if (csr & CSR_ALL_F)
> > > > > +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(id));
> > > > > +
> > > > > +	stm32_dma3_chan_dump_reg(chan);
> > > > > +
> > > > > +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(id));
> > > > > +	writel_relaxed(ccr | CCR_EN, ddata->base + STM32_DMA3_CCR(id));
> > > > > +
> > > > > +	chan->dma_status = DMA_IN_PROGRESS;
> > > > > +
> > > > > +	dev_dbg(chan2dev(chan), "vchan %pK: started\n", &chan->vchan);
> > > > > +}
> > > > > +
> > > > > +static int stm32_dma3_chan_suspend(struct stm32_dma3_chan *chan, bool susp)
> > > > > +{
> > > > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > > > +	u32 csr, ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
> > > > > +	int ret = 0;
> > > > > +
> > > > > +	if (susp)
> > > > > +		ccr |= CCR_SUSP;
> > > > > +	else
> > > > > +		ccr &= ~CCR_SUSP;
> > > > > +
> > > > > +	writel_relaxed(ccr, ddata->base + STM32_DMA3_CCR(chan->id));
> > > > > +
> > > > > +	if (susp) {
> > > > > +		ret = readl_relaxed_poll_timeout_atomic(ddata->base + STM32_DMA3_CSR(chan->id), csr,
> > > > > +							csr & CSR_SUSPF, 1, 10);
> > > > > +		if (!ret)
> > > > > +			writel_relaxed(CFCR_SUSPF, ddata->base + STM32_DMA3_CFCR(chan->id));
> > > > > +
> > > > > +		stm32_dma3_chan_dump_reg(chan);
> > > > > +	}
> > > > > +
> > > > > +	return ret;
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_chan_reset(struct stm32_dma3_chan *chan)
> > > > > +{
> > > > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > > > +	u32 ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
> > > > > +
> > > > > +	writel_relaxed(ccr |= CCR_RESET, ddata->base + STM32_DMA3_CCR(chan->id));
> > > > > +}
> > > > > +
> > > > > +static int stm32_dma3_chan_stop(struct stm32_dma3_chan *chan)
> > > > > +{
> > > > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > > > +	u32 ccr;
> > > > > +	int ret = 0;
> > > > > +
> > > > > +	chan->dma_status = DMA_COMPLETE;
> > > > > +
> > > > > +	/* Disable interrupts */
> > > > > +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id));
> > > > > +	writel_relaxed(ccr & ~(CCR_ALLIE | CCR_EN), ddata->base + STM32_DMA3_CCR(chan->id));
> > > > > +
> > > > > +	if (!(ccr & CCR_SUSP) && (ccr & CCR_EN)) {
> > > > > +		/* Suspend the channel */
> > > > > +		ret = stm32_dma3_chan_suspend(chan, true);
> > > > > +		if (ret)
> > > > > +			dev_warn(chan2dev(chan), "%s: timeout, data might be lost\n", __func__);
> > > > > +	}
> > > > > +
> > > > > +	/*
> > > > > +	 * Reset the channel: this causes the reset of the FIFO and the reset of the channel
> > > > > +	 * internal state, the reset of CCR_EN and CCR_SUSP bits.
> > > > > +	 */
> > > > > +	stm32_dma3_chan_reset(chan);
> > > > > +
> > > > > +	return ret;
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_chan_complete(struct stm32_dma3_chan *chan)
> > > > > +{
> > > > > +	if (!chan->swdesc)
> > > > > +		return;
> > > > > +
> > > > > +	vchan_cookie_complete(&chan->swdesc->vdesc);
> > > > > +	chan->swdesc = NULL;
> > > > > +	stm32_dma3_chan_start(chan);
> > > > > +}
> > > > > +
> > > > > +static irqreturn_t stm32_dma3_chan_irq(int irq, void *devid)
> > > > > +{
> > > > > +	struct stm32_dma3_chan *chan = devid;
> > > > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > > > +	u32 misr, csr, ccr;
> > > > > +
> > > > > +	spin_lock(&chan->vchan.lock);
> > > > > +
> > > > > +	misr = readl_relaxed(ddata->base + STM32_DMA3_MISR);
> > > > > +	if (!(misr & MISR_MIS(chan->id))) {
> > > > > +		spin_unlock(&chan->vchan.lock);
> > > > > +		return IRQ_NONE;
> > > > > +	}
> > > > > +
> > > > > +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(chan->id));
> > > > > +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & CCR_ALLIE;
> > > > > +
> > > > > +	if (csr & CSR_TCF && ccr & CCR_TCIE) {
> > > > > +		if (chan->swdesc->cyclic)
> > > > > +			vchan_cyclic_callback(&chan->swdesc->vdesc);
> > > > > +		else
> > > > > +			stm32_dma3_chan_complete(chan);
> > > > > +	}
> > > > > +
> > > > > +	if (csr & CSR_USEF && ccr & CCR_USEIE) {
> > > > > +		dev_err(chan2dev(chan), "User setting error\n");
> > > > > +		chan->dma_status = DMA_ERROR;
> > > > > +		/* CCR.EN automatically cleared by HW */
> > > > > +		stm32_dma3_check_user_setting(chan);
> > > > > +		stm32_dma3_chan_reset(chan);
> > > > > +	}
> > > > > +
> > > > > +	if (csr & CSR_ULEF && ccr & CCR_ULEIE) {
> > > > > +		dev_err(chan2dev(chan), "Update link transfer error\n");
> > > > > +		chan->dma_status = DMA_ERROR;
> > > > > +		/* CCR.EN automatically cleared by HW */
> > > > > +		stm32_dma3_chan_reset(chan);
> > > > > +	}
> > > > > +
> > > > > +	if (csr & CSR_DTEF && ccr & CCR_DTEIE) {
> > > > > +		dev_err(chan2dev(chan), "Data transfer error\n");
> > > > > +		chan->dma_status = DMA_ERROR;
> > > > > +		/* CCR.EN automatically cleared by HW */
> > > > > +		stm32_dma3_chan_reset(chan);
> > > > > +	}
> > > > > +
> > > > > +	/*
> > > > > +	 * Half Transfer Interrupt may be disabled but Half Transfer Flag can be set,
> > > > > +	 * ensure HTF flag to be cleared, with other flags.
> > > > > +	 */
> > > > > +	csr &= (ccr | CCR_HTIE);
> > > > > +
> > > > > +	if (csr)
> > > > > +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(chan->id));
> > > > > +
> > > > > +	spin_unlock(&chan->vchan.lock);
> > > > > +
> > > > > +	return IRQ_HANDLED;
> > > > > +}
> > > > > +
> > > > > +static int stm32_dma3_alloc_chan_resources(struct dma_chan *c)
> > > > > +{
> > > > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > > > +	u32 id = chan->id, csemcr, ccid;
> > > > > +	int ret;
> > > > > +
> > > > > +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
> > > > > +	if (ret < 0)
> > > > > +		return ret;
> > > > > +
> > > > > +	/* Ensure the channel is free */
> > > > > +	if (chan->semaphore_mode &&
> > > > > +	    readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id)) & CSEMCR_SEM_MUTEX) {
> > > > > +		ret = -EBUSY;
> > > > > +		goto err_put_sync;
> > > > > +	}
> > > > > +
> > > > > +	chan->lli_pool = dmam_pool_create(dev_name(&c->dev->device), c->device->dev,
> > > > > +					  sizeof(struct stm32_dma3_hwdesc),
> > > > > +					  __alignof__(struct stm32_dma3_hwdesc), 0);
> > > > > +	if (!chan->lli_pool) {
> > > > > +		dev_err(chan2dev(chan), "Failed to create LLI pool\n");
> > > > > +		ret = -ENOMEM;
> > > > > +		goto err_put_sync;
> > > > > +	}
> > > > > +
> > > > > +	/* Take the channel semaphore */
> > > > > +	if (chan->semaphore_mode) {
> > > > > +		writel_relaxed(CSEMCR_SEM_MUTEX, ddata->base + STM32_DMA3_CSEMCR(id));
> > > > > +		csemcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(id));
> > > > > +		ccid = FIELD_GET(CSEMCR_SEM_CCID, csemcr);
> > > > > +		/* Check that the channel is well taken */
> > > > > +		if (ccid != CCIDCFGR_CID1) {
> > > > > +			dev_err(chan2dev(chan), "Not under CID1 control (in-use by CID%d)\n", ccid);
> > > > > +			ret = -EPERM;
> > > > > +			goto err_pool_destroy;
> > > > > +		}
> > > > > +		dev_dbg(chan2dev(chan), "Under CID1 control (semcr=0x%08x)\n", csemcr);
> > > > > +	}
> > > > > +
> > > > > +	return 0;
> > > > > +
> > > > > +err_pool_destroy:
> > > > > +	dmam_pool_destroy(chan->lli_pool);
> > > > > +	chan->lli_pool = NULL;
> > > > > +
> > > > > +err_put_sync:
> > > > > +	pm_runtime_put_sync(ddata->dma_dev.dev);
> > > > > +
> > > > > +	return ret;
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_free_chan_resources(struct dma_chan *c)
> > > > > +{
> > > > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > > > +	unsigned long flags;
> > > > > +
> > > > > +	/* Ensure channel is in idle state */
> > > > > +	spin_lock_irqsave(&chan->vchan.lock, flags);
> > > > > +	stm32_dma3_chan_stop(chan);
> > > > > +	chan->swdesc = NULL;
> > > > > +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
> > > > > +
> > > > > +	vchan_free_chan_resources(to_virt_chan(c));
> > > > > +
> > > > > +	dmam_pool_destroy(chan->lli_pool);
> > > > > +	chan->lli_pool = NULL;
> > > > > +
> > > > > +	/* Release the channel semaphore */
> > > > > +	if (chan->semaphore_mode)
> > > > > +		writel_relaxed(0, ddata->base + STM32_DMA3_CSEMCR(chan->id));
> > > > > +
> > > > > +	pm_runtime_put_sync(ddata->dma_dev.dev);
> > > > > +
> > > > > +	/* Reset configuration */
> > > > > +	memset(&chan->dt_config, 0, sizeof(chan->dt_config));
> > > > > +	memset(&chan->dma_config, 0, sizeof(chan->dma_config));
> > > > > +}
> > > > > +
> > > > > +static struct dma_async_tx_descriptor *stm32_dma3_prep_slave_sg(struct dma_chan *c,
> > > > > +								struct scatterlist *sgl,
> > > > > +								unsigned int sg_len,
> > > > > +								enum dma_transfer_direction dir,
> > > > > +								unsigned long flags, void *context)
> > > > > +{
> > > > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > > > +	struct stm32_dma3_swdesc *swdesc;
> > > > > +	struct scatterlist *sg;
> > > > > +	size_t len;
> > > > > +	dma_addr_t sg_addr, dev_addr, src, dst;
> > > > > +	u32 i, j, count, ctr1, ctr2;
> > > > > +	int ret;
> > > > > +
> > > > > +	count = sg_len;
> > > > > +	for_each_sg(sgl, sg, sg_len, i) {
> > > > > +		len = sg_dma_len(sg);
> > > > > +		if (len > STM32_DMA3_MAX_BLOCK_SIZE)
> > > > > +			count += DIV_ROUND_UP(len, STM32_DMA3_MAX_BLOCK_SIZE) - 1;
> > > > > +	}
> > > > > +
> > > > > +	swdesc = stm32_dma3_chan_desc_alloc(chan, count);
> > > > > +	if (!swdesc)
> > > > > +		return NULL;
> > > > > +
> > > > > +	/* sg_len and i correspond to the initial sgl; count and j correspond to the hwdesc LL */
> > > > > +	j = 0;
> > > > > +	for_each_sg(sgl, sg, sg_len, i) {
> > > > > +		sg_addr = sg_dma_address(sg);
> > > > > +		dev_addr = (dir == DMA_MEM_TO_DEV) ? chan->dma_config.dst_addr :
> > > > > +						     chan->dma_config.src_addr;
> > > > > +		len = sg_dma_len(sg);
> > > > > +
> > > > > +		do {
> > > > > +			size_t chunk = min_t(size_t, len, STM32_DMA3_MAX_BLOCK_SIZE);
> > > > > +
> > > > > +			if (dir == DMA_MEM_TO_DEV) {
> > > > > +				src = sg_addr;
> > > > > +				dst = dev_addr;
> > > > > +
> > > > > +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
> > > > > +							      src, dst, chunk);
> > > > > +
> > > > > +				if (FIELD_GET(CTR1_DINC, ctr1))
> > > > > +					dev_addr += chunk;
> > > > > +			} else { /* (dir == DMA_DEV_TO_MEM || dir == DMA_MEM_TO_MEM) */
> > > > > +				src = dev_addr;
> > > > > +				dst = sg_addr;
> > > > > +
> > > > > +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
> > > > > +							      src, dst, chunk);
> > > > > +
> > > > > +				if (FIELD_GET(CTR1_SINC, ctr1))
> > > > > +					dev_addr += chunk;
> > > > > +			}
> > > > > +
> > > > > +			if (ret)
> > > > > +				goto err_desc_free;
> > > > > +
> > > > > +			stm32_dma3_chan_prep_hwdesc(chan, swdesc, j, src, dst, chunk,
> > > > > +						    ctr1, ctr2, j == (count - 1), false);
> > > > > +
> > > > > +			sg_addr += chunk;
> > > > > +			len -= chunk;
> > > > > +			j++;
> > > > > +		} while (len);
> > > > > +	}
> > > > > +
> > > > > +	/* Enable Error interrupts */
> > > > > +	swdesc->ccr |= CCR_USEIE | CCR_ULEIE | CCR_DTEIE;
> > > > > +	/* Enable Transfer state interrupts */
> > > > > +	swdesc->ccr |= CCR_TCIE;
> > > > > +
> > > > > +	swdesc->cyclic = false;
> > > > > +
> > > > > +	return vchan_tx_prep(&chan->vchan, &swdesc->vdesc, flags);
> > > > > +
> > > > > +err_desc_free:
> > > > > +	stm32_dma3_chan_desc_free(chan, swdesc);
> > > > > +
> > > > > +	return NULL;
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_caps(struct dma_chan *c, struct dma_slave_caps *caps)
> > > > > +{
> > > > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > > > +
> > > > > +	if (!chan->fifo_size) {
> > > > > +		caps->max_burst = 0;
> > > > > +		caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > > > +		caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > > > +	} else {
> > > > > +		/* Burst transfer should not exceed half of the fifo size */
> > > > > +		caps->max_burst = chan->max_burst;
> > > > > +		if (caps->max_burst < DMA_SLAVE_BUSWIDTH_8_BYTES) {
> > > > > +			caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > > > +			caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > > > +		}
> > > > > +	}
> > > > > +}
> > > > > +
> > > > > +static int stm32_dma3_config(struct dma_chan *c, struct dma_slave_config *config)
> > > > > +{
> > > > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > > > +
> > > > > +	memcpy(&chan->dma_config, config, sizeof(*config));
> > > > > +
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +static int stm32_dma3_terminate_all(struct dma_chan *c)
> > > > > +{
> > > > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > > > +	unsigned long flags;
> > > > > +	LIST_HEAD(head);
> > > > > +
> > > > > +	spin_lock_irqsave(&chan->vchan.lock, flags);
> > > > > +
> > > > > +	if (chan->swdesc) {
> > > > > +		vchan_terminate_vdesc(&chan->swdesc->vdesc);
> > > > > +		chan->swdesc = NULL;
> > > > > +	}
> > > > > +
> > > > > +	stm32_dma3_chan_stop(chan);
> > > > > +
> > > > > +	vchan_get_all_descriptors(&chan->vchan, &head);
> > > > > +
> > > > > +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
> > > > > +	vchan_dma_desc_free_list(&chan->vchan, &head);
> > > > > +
> > > > > +	dev_dbg(chan2dev(chan), "vchan %pK: terminated\n", &chan->vchan);
> > > > > +
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_synchronize(struct dma_chan *c)
> > > > > +{
> > > > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > > > +
> > > > > +	vchan_synchronize(&chan->vchan);
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_issue_pending(struct dma_chan *c)
> > > > > +{
> > > > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > > > +	unsigned long flags;
> > > > > +
> > > > > +	spin_lock_irqsave(&chan->vchan.lock, flags);
> > > > > +
> > > > > +	if (vchan_issue_pending(&chan->vchan) && !chan->swdesc) {
> > > > > +		dev_dbg(chan2dev(chan), "vchan %pK: issued\n", &chan->vchan);
> > > > > +		stm32_dma3_chan_start(chan);
> > > > > +	}
> > > > > +
> > > > > +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
> > > > > +}
> > > > > +
> > > > > +static bool stm32_dma3_filter_fn(struct dma_chan *c, void *fn_param)
> > > > > +{
> > > > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > > > +	struct stm32_dma3_dt_conf *conf = fn_param;
> > > > > +	u32 mask, semcr;
> > > > > +	int ret;
> > > > > +
> > > > > +	dev_dbg(c->device->dev, "%s(%s): req_line=%d ch_conf=%08x tr_conf=%08x\n",
> > > > > +		__func__, dma_chan_name(c), conf->req_line, conf->ch_conf, conf->tr_conf);
> > > > > +
> > > > > +	if (!of_property_read_u32(c->device->dev->of_node, "dma-channel-mask", &mask))
> > > > > +		if (!(mask & BIT(chan->id)))
> > > > > +			return false;
> > > > > +
> > > > > +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
> > > > > +	if (ret < 0)
> > > > > +		return false;
> > > > > +	semcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id));
> > > > > +	pm_runtime_put_sync(ddata->dma_dev.dev);
> > > > > +
> > > > > +	/* Check if chan is free */
> > > > > +	if (semcr & CSEMCR_SEM_MUTEX)
> > > > > +		return false;
> > > > > +
> > > > > +	/* Check if chan fifo fits well */
> > > > > +	if (FIELD_GET(STM32_DMA3_DT_FIFO, conf->ch_conf) != chan->fifo_size)
> > > > > +		return false;
> > > > > +
> > > > > +	return true;
> > > > > +}
> > > > > +
> > > > > +static struct dma_chan *stm32_dma3_of_xlate(struct of_phandle_args *dma_spec, struct of_dma *ofdma)
> > > > > +{
> > > > > +	struct stm32_dma3_ddata *ddata = ofdma->of_dma_data;
> > > > > +	dma_cap_mask_t mask = ddata->dma_dev.cap_mask;
> > > > > +	struct stm32_dma3_dt_conf conf;
> > > > > +	struct stm32_dma3_chan *chan;
> > > > > +	struct dma_chan *c;
> > > > > +
> > > > > +	if (dma_spec->args_count < 3) {
> > > > > +		dev_err(ddata->dma_dev.dev, "Invalid args count\n");
> > > > > +		return NULL;
> > > > > +	}
> > > > > +
> > > > > +	conf.req_line = dma_spec->args[0];
> > > > > +	conf.ch_conf = dma_spec->args[1];
> > > > > +	conf.tr_conf = dma_spec->args[2];
> > > > > +
> > > > > +	if (conf.req_line >= ddata->dma_requests) {
> > > > > +		dev_err(ddata->dma_dev.dev, "Invalid request line\n");
> > > > > +		return NULL;
> > > > > +	}
> > > > > +
> > > > > +	/* Request dma channel among the generic dma controller list */
> > > > > +	c = dma_request_channel(mask, stm32_dma3_filter_fn, &conf);
> > > > > +	if (!c) {
> > > > > +		dev_err(ddata->dma_dev.dev, "No suitable channel found\n");
> > > > > +		return NULL;
> > > > > +	}
> > > > > +
> > > > > +	chan = to_stm32_dma3_chan(c);
> > > > > +	chan->dt_config = conf;
> > > > > +
> > > > > +	return c;
> > > > > +}
> > > > > +
> > > > > +static u32 stm32_dma3_check_rif(struct stm32_dma3_ddata *ddata)
> > > > > +{
> > > > > +	u32 chan_reserved, mask = 0, i, ccidcfgr, invalid_cid = 0;
> > > > > +
> > > > > +	/* Reserve Secure channels */
> > > > > +	chan_reserved = readl_relaxed(ddata->base + STM32_DMA3_SECCFGR);
> > > > > +
> > > > > +	/*
> > > > > +	 * CID filtering must be configured to ensure that the DMA3 channel will inherit the CID of
> > > > > +	 * the processor which is configuring and using the given channel.
> > > > > +	 * In case CID filtering is not configured, dma-channel-mask property can be used to
> > > > > +	 * specify available DMA channels to the kernel.
> > > > > +	 */
> > > > > +	of_property_read_u32(ddata->dma_dev.dev->of_node, "dma-channel-mask", &mask);
> > > > > +
> > > > > +	/* Reserve !CID-filtered not in dma-channel-mask, static CID != CID1, CID1 not allowed */
> > > > > +	for (i = 0; i < ddata->dma_channels; i++) {
> > > > > +		ccidcfgr = readl_relaxed(ddata->base + STM32_DMA3_CCIDCFGR(i));
> > > > > +
> > > > > +		if (!(ccidcfgr & CCIDCFGR_CFEN)) { /* !CID-filtered */
> > > > > +			invalid_cid |= BIT(i);
> > > > > +			if (!(mask & BIT(i))) /* Not in dma-channel-mask */
> > > > > +				chan_reserved |= BIT(i);
> > > > > +		} else { /* CID-filtered */
> > > > > +			if (!(ccidcfgr & CCIDCFGR_SEM_EN)) { /* Static CID mode */
> > > > > +				if (FIELD_GET(CCIDCFGR_SCID, ccidcfgr) != CCIDCFGR_CID1)
> > > > > +					chan_reserved |= BIT(i);
> > > > > +			} else { /* Semaphore mode */
> > > > > +				if (!FIELD_GET(CCIDCFGR_SEM_WLIST_CID1, ccidcfgr))
> > > > > +					chan_reserved |= BIT(i);
> > > > > +				ddata->chans[i].semaphore_mode = true;
> > > > > +			}
> > > > > +		}
> > > > > +		dev_dbg(ddata->dma_dev.dev, "chan%d: %s mode, %s\n", i,
> > > > > +			!(ccidcfgr & CCIDCFGR_CFEN) ? "!CID-filtered" :
> > > > > +			ddata->chans[i].semaphore_mode ? "Semaphore" : "Static CID",
> > > > > +			(chan_reserved & BIT(i)) ? "denied" :
> > > > > +			mask & BIT(i) ? "force allowed" : "allowed");
> > > > > +	}
> > > > > +
> > > > > +	if (invalid_cid)
> > > > > +		dev_warn(ddata->dma_dev.dev, "chan%*pbl have invalid CID configuration\n",
> > > > > +			 ddata->dma_channels, &invalid_cid);
> > > > > +
> > > > > +	return chan_reserved;
> > > > > +}
> > > > > +
> > > > > +static const struct of_device_id stm32_dma3_of_match[] = {
> > > > > +	{ .compatible = "st,stm32-dma3", },
> > > > > +	{ /* sentinel */},
> > > > > +};
> > > > > +MODULE_DEVICE_TABLE(of, stm32_dma3_of_match);
> > > > > +
> > > > > +static int stm32_dma3_probe(struct platform_device *pdev)
> > > > > +{
> > > > > +	struct device_node *np = pdev->dev.of_node;
> > > > > +	struct stm32_dma3_ddata *ddata;
> > > > > +	struct reset_control *reset;
> > > > > +	struct stm32_dma3_chan *chan;
> > > > > +	struct dma_device *dma_dev;
> > > > > +	u32 master_ports, chan_reserved, i, verr;
> > > > > +	u64 hwcfgr;
> > > > > +	int ret;
> > > > > +
> > > > > +	ddata = devm_kzalloc(&pdev->dev, sizeof(*ddata), GFP_KERNEL);
> > > > > +	if (!ddata)
> > > > > +		return -ENOMEM;
> > > > > +	platform_set_drvdata(pdev, ddata);
> > > > > +
> > > > > +	dma_dev = &ddata->dma_dev;
> > > > > +
> > > > > +	ddata->base = devm_platform_ioremap_resource(pdev, 0);
> > > > > +	if (IS_ERR(ddata->base))
> > > > > +		return PTR_ERR(ddata->base);
> > > > > +
> > > > > +	ddata->clk = devm_clk_get(&pdev->dev, NULL);
> > > > > +	if (IS_ERR(ddata->clk))
> > > > > +		return dev_err_probe(&pdev->dev, PTR_ERR(ddata->clk), "Failed to get clk\n");
> > > > > +
> > > > > +	reset = devm_reset_control_get_optional(&pdev->dev, NULL);
> > > > > +	if (IS_ERR(reset))
> > > > > +		return dev_err_probe(&pdev->dev, PTR_ERR(reset), "Failed to get reset\n");
> > > > > +
> > > > > +	ret = clk_prepare_enable(ddata->clk);
> > > > > +	if (ret)
> > > > > +		return dev_err_probe(&pdev->dev, ret, "Failed to enable clk\n");
> > > > > +
> > > > > +	reset_control_reset(reset);
> > > > > +
> > > > > +	INIT_LIST_HEAD(&dma_dev->channels);
> > > > > +
> > > > > +	dma_cap_set(DMA_SLAVE, dma_dev->cap_mask);
> > > > > +	dma_cap_set(DMA_PRIVATE, dma_dev->cap_mask);
> > > > > +	dma_dev->dev = &pdev->dev;
> > > > > +	/*
> > > > > +	 * This controller supports up to 8-byte buswidth depending on the port used and the
> > > > > +	 * channel, and can only access address at even boundaries, multiple of the buswidth.
> > > > > +	 */
> > > > > +	dma_dev->copy_align = DMAENGINE_ALIGN_8_BYTES;
> > > > > +	dma_dev->src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
> > > > > +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
> > > > > +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
> > > > > +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > > > +	dma_dev->dst_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
> > > > > +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
> > > > > +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
> > > > > +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > > > +	dma_dev->directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV) | BIT(DMA_MEM_TO_MEM);
> > > > > +
> > > > > +	dma_dev->descriptor_reuse = true;
> > > > > +	dma_dev->max_sg_burst = STM32_DMA3_MAX_SEG_SIZE;
> > > > > +	dma_dev->residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
> > > > > +	dma_dev->device_alloc_chan_resources = stm32_dma3_alloc_chan_resources;
> > > > > +	dma_dev->device_free_chan_resources = stm32_dma3_free_chan_resources;
> > > > > +	dma_dev->device_prep_slave_sg = stm32_dma3_prep_slave_sg;
> > > > > +	dma_dev->device_caps = stm32_dma3_caps;
> > > > > +	dma_dev->device_config = stm32_dma3_config;
> > > > > +	dma_dev->device_terminate_all = stm32_dma3_terminate_all;
> > > > > +	dma_dev->device_synchronize = stm32_dma3_synchronize;
> > > > > +	dma_dev->device_tx_status = dma_cookie_status;
> > > > > +	dma_dev->device_issue_pending = stm32_dma3_issue_pending;
> > > > > +
> > > > > +	/* if dma_channels is not modified, get it from hwcfgr1 */
> > > > > +	if (of_property_read_u32(np, "dma-channels", &ddata->dma_channels)) {
> > > > > +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
> > > > > +		ddata->dma_channels = FIELD_GET(G_NUM_CHANNELS, hwcfgr);
> > > > > +	}
> > > > > +
> > > > > +	/* if dma_requests is not modified, get it from hwcfgr2 */
> > > > > +	if (of_property_read_u32(np, "dma-requests", &ddata->dma_requests)) {
> > > > > +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR2);
> > > > > +		ddata->dma_requests = FIELD_GET(G_MAX_REQ_ID, hwcfgr) + 1;
> > > > > +	}
> > > > > +
> > > > > +	/* G_MASTER_PORTS, G_M0_DATA_WIDTH_ENC, G_M1_DATA_WIDTH_ENC in HWCFGR1 */
> > > > > +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
> > > > > +	master_ports = FIELD_GET(G_MASTER_PORTS, hwcfgr);
> > > > > +
> > > > > +	ddata->ports_max_dw[0] = FIELD_GET(G_M0_DATA_WIDTH_ENC, hwcfgr);
> > > > > +	if (master_ports == AXI64 || master_ports == AHB32) /* Single master port */
> > > > > +		ddata->ports_max_dw[1] = DW_INVALID;
> > > > > +	else /* Dual master ports */
> > > > > +		ddata->ports_max_dw[1] = FIELD_GET(G_M1_DATA_WIDTH_ENC, hwcfgr);
> > > > > +
> > > > > +	ddata->chans = devm_kcalloc(&pdev->dev, ddata->dma_channels, sizeof(*ddata->chans),
> > > > > +				    GFP_KERNEL);
> > > > > +	if (!ddata->chans) {
> > > > > +		ret = -ENOMEM;
> > > > > +		goto err_clk_disable;
> > > > > +	}
> > > > > +
> > > > > +	chan_reserved = stm32_dma3_check_rif(ddata);
> > > > > +
> > > > > +	if (chan_reserved == GENMASK(ddata->dma_channels - 1, 0)) {
> > > > > +		ret = -ENODEV;
> > > > > +		dev_err_probe(&pdev->dev, ret, "No channel available, abort registration\n");
> > > > > +		goto err_clk_disable;
> > > > > +	}
> > > > > +
> > > > > +	/* G_FIFO_SIZE x=0..7 in HWCFGR3 and G_FIFO_SIZE x=8..15 in HWCFGR4 */
> > > > > +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR3);
> > > > > +	hwcfgr |= ((u64)readl_relaxed(ddata->base + STM32_DMA3_HWCFGR4)) << 32;
> > > > > +
> > > > > +	for (i = 0; i < ddata->dma_channels; i++) {
> > > > > +		if (chan_reserved & BIT(i))
> > > > > +			continue;
> > > > > +
> > > > > +		chan = &ddata->chans[i];
> > > > > +		chan->id = i;
> > > > > +		chan->fifo_size = get_chan_hwcfg(i, G_FIFO_SIZE(i), hwcfgr);
> > > > > +		/* If chan->fifo_size > 0 then half of the fifo size, else no burst when no FIFO */
> > > > > +		chan->max_burst = (chan->fifo_size) ? (1 << (chan->fifo_size + 1)) / 2 : 0;
> > > > > +		chan->vchan.desc_free = stm32_dma3_chan_vdesc_free;
> > > > > +
> > > > > +		vchan_init(&chan->vchan, dma_dev);
> > > > > +	}
> > > > > +
> > > > > +	ret = dmaenginem_async_device_register(dma_dev);
> > > > > +	if (ret)
> > > > > +		goto err_clk_disable;
> > > > > +
> > > > > +	for (i = 0; i < ddata->dma_channels; i++) {
> > > > > +		if (chan_reserved & BIT(i))
> > > > > +			continue;
> > > > > +
> > > > > +		ret = platform_get_irq(pdev, i);
> > > > > +		if (ret < 0)
> > > > > +			goto err_clk_disable;
> > > > > +
> > > > > +		chan = &ddata->chans[i];
> > > > > +		chan->irq = ret;
> > > > > +
> > > > > +		ret = devm_request_irq(&pdev->dev, chan->irq, stm32_dma3_chan_irq, 0,
> > > > > +				       dev_name(chan2dev(chan)), chan);
> > > > > +		if (ret) {
> > > > > +			dev_err_probe(&pdev->dev, ret, "Failed to request channel %s IRQ\n",
> > > > > +				      dev_name(chan2dev(chan)));
> > > > > +			goto err_clk_disable;
> > > > > +		}
> > > > > +	}
> > > > > +
> > > > > +	ret = of_dma_controller_register(np, stm32_dma3_of_xlate, ddata);
> > > > > +	if (ret) {
> > > > > +		dev_err_probe(&pdev->dev, ret, "Failed to register controller\n");
> > > > > +		goto err_clk_disable;
> > > > > +	}
> > > > > +
> > > > > +	verr = readl_relaxed(ddata->base + STM32_DMA3_VERR);
> > > > > +
> > > > > +	pm_runtime_set_active(&pdev->dev);
> > > > > +	pm_runtime_enable(&pdev->dev);
> > > > > +	pm_runtime_get_noresume(&pdev->dev);
> > > > > +	pm_runtime_put(&pdev->dev);
> > > > > +
> > > > > +	dev_info(&pdev->dev, "STM32 DMA3 registered rev:%lu.%lu\n",
> > > > > +		 FIELD_GET(VERR_MAJREV, verr), FIELD_GET(VERR_MINREV, verr));
> > > > > +
> > > > > +	return 0;
> > > > > +
> > > > > +err_clk_disable:
> > > > > +	clk_disable_unprepare(ddata->clk);
> > > > > +
> > > > > +	return ret;
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_remove(struct platform_device *pdev)
> > > > > +{
> > > > > +	pm_runtime_disable(&pdev->dev);
> > > > > +}
> > > > > +
> > > > > +static int stm32_dma3_runtime_suspend(struct device *dev)
> > > > > +{
> > > > > +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
> > > > > +
> > > > > +	clk_disable_unprepare(ddata->clk);
> > > > > +
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +static int stm32_dma3_runtime_resume(struct device *dev)
> > > > > +{
> > > > > +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
> > > > > +	int ret;
> > > > > +
> > > > > +	ret = clk_prepare_enable(ddata->clk);
> > > > > +	if (ret)
> > > > > +		dev_err(dev, "Failed to enable clk: %d\n", ret);
> > > > > +
> > > > > +	return ret;
> > > > > +}
> > > > > +
> > > > > +static const struct dev_pm_ops stm32_dma3_pm_ops = {
> > > > > +	SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
> > > > > +	RUNTIME_PM_OPS(stm32_dma3_runtime_suspend, stm32_dma3_runtime_resume, NULL)
> > > > > +};
> > > > > +
> > > > > +static struct platform_driver stm32_dma3_driver = {
> > > > > +	.probe = stm32_dma3_probe,
> > > > > +	.remove_new = stm32_dma3_remove,
> > > > > +	.driver = {
> > > > > +		.name = "stm32-dma3",
> > > > > +		.of_match_table = stm32_dma3_of_match,
> > > > > +		.pm = pm_ptr(&stm32_dma3_pm_ops),
> > > > > +	},
> > > > > +};
> > > > > +
> > > > > +static int __init stm32_dma3_init(void)
> > > > > +{
> > > > > +	return platform_driver_register(&stm32_dma3_driver);
> > > > > +}
> > > > > +
> > > > > +subsys_initcall(stm32_dma3_init);
> > > > > +
> > > > > +MODULE_DESCRIPTION("STM32 DMA3 controller driver");
> > > > > +MODULE_AUTHOR("Amelie Delaunay <amelie.delaunay@foss.st.com>");
> > > > > +MODULE_LICENSE("GPL");
> > > > > -- 
> > > > > 2.25.1
> > > > 
> > > 
> > > Regards,
> > > Amelie

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 05/12] dmaengine: Add STM32 DMA3 support
  2024-04-23 12:32 ` [PATCH 05/12] dmaengine: Add STM32 DMA3 support Amelie Delaunay
  2024-05-04 12:40   ` Vinod Koul
  2024-05-04 14:27   ` Christophe JAILLET
@ 2024-05-15 18:56   ` Frank Li
  2024-05-16 15:25     ` Amelie Delaunay
  2 siblings, 1 reply; 29+ messages in thread
From: Frank Li @ 2024-05-15 18:56 UTC (permalink / raw)
  To: Amelie Delaunay
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue, dmaengine, devicetree,
	linux-stm32, linux-arm-kernel, linux-kernel, linux-hardening

On Tue, Apr 23, 2024 at 02:32:55PM +0200, Amelie Delaunay wrote:
> STM32 DMA3 driver supports the 3 hardware configurations of the STM32 DMA3
> controller:
> - LPDMA (Low Power): 4 channels, no FIFO
> - GPDMA (General Purpose): 16 channels, FIFO from 8 to 32 bytes
> - HPDMA (High Performance): 16 channels, FIFO from 8 to 256 bytes
> Hardware configuration of the channels is retrieved from the hardware
> configuration registers.
> The client can specify its channel requirements through device tree.
> STM32 DMA3 channels can be individually reserved either because they are
> secure, or dedicated to another CPU.
> Indeed, channel availability depends on Resource Isolation Framework
> (RIF) configuration. RIF grants access to buses with Compartment ID
> (CID) filtering, secure and privilege level. It also assigns DMA channels
> to one or several processors.
> DMA channels used by Linux should be CID-filtered and statically assigned
> to CID1 or shared with other CPUs but using semaphore. In case CID
> filtering is not configured, dma-channel-mask property can be used to
> specify available DMA channels to the kernel, otherwise such channels
> will be marked as reserved and can't be used by Linux.
> 
> Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
> ---
>  drivers/dma/stm32/Kconfig      |   10 +
>  drivers/dma/stm32/Makefile     |    1 +
>  drivers/dma/stm32/stm32-dma3.c | 1431 ++++++++++++++++++++++++++++++++
>  3 files changed, 1442 insertions(+)
>  create mode 100644 drivers/dma/stm32/stm32-dma3.c
> 
> diff --git a/drivers/dma/stm32/Kconfig b/drivers/dma/stm32/Kconfig
> index b72ae1a4502f..4d8d8063133b 100644
> --- a/drivers/dma/stm32/Kconfig
> +++ b/drivers/dma/stm32/Kconfig
> @@ -34,4 +34,14 @@ config STM32_MDMA
>  	  If you have a board based on STM32 SoC with such DMA controller
>  	  and want to use MDMA say Y here.
>  
> +config STM32_DMA3
> +	tristate "STMicroelectronics STM32 DMA3 support"
> +	select DMA_ENGINE
> +	select DMA_VIRTUAL_CHANNELS
> +	help
> +	  Enable support for the on-chip DMA3 controller on STMicroelectronics
> +	  STM32 platforms.
> +	  If you have a board based on STM32 SoC with such DMA3 controller
> +	  and want to use DMA3, say Y here.
> +
>  endif
> diff --git a/drivers/dma/stm32/Makefile b/drivers/dma/stm32/Makefile
> index 663a3896a881..5082db4b4c1c 100644
> --- a/drivers/dma/stm32/Makefile
> +++ b/drivers/dma/stm32/Makefile
> @@ -2,3 +2,4 @@
>  obj-$(CONFIG_STM32_DMA) += stm32-dma.o
>  obj-$(CONFIG_STM32_DMAMUX) += stm32-dmamux.o
>  obj-$(CONFIG_STM32_MDMA) += stm32-mdma.o
> +obj-$(CONFIG_STM32_DMA3) += stm32-dma3.o
> diff --git a/drivers/dma/stm32/stm32-dma3.c b/drivers/dma/stm32/stm32-dma3.c
> new file mode 100644
> index 000000000000..b5493f497d06
> --- /dev/null
> +++ b/drivers/dma/stm32/stm32-dma3.c
> @@ -0,0 +1,1431 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * STM32 DMA3 controller driver
> + *
> + * Copyright (C) STMicroelectronics 2024
> + * Author(s): Amelie Delaunay <amelie.delaunay@foss.st.com>
> + */
> +
> +#include <linux/bitfield.h>
> +#include <linux/clk.h>
> +#include <linux/dma-mapping.h>
> +#include <linux/dmaengine.h>
> +#include <linux/dmapool.h>
> +#include <linux/init.h>
> +#include <linux/iopoll.h>
> +#include <linux/list.h>
> +#include <linux/module.h>
> +#include <linux/of_dma.h>
> +#include <linux/platform_device.h>
> +#include <linux/pm_runtime.h>
> +#include <linux/reset.h>
> +#include <linux/slab.h>
> +
> +#include "../virt-dma.h"
> +
> +#define STM32_DMA3_SECCFGR		0x00
> +#define STM32_DMA3_PRIVCFGR		0x04
> +#define STM32_DMA3_RCFGLOCKR		0x08
> +#define STM32_DMA3_MISR			0x0C
> +#define STM32_DMA3_SMISR		0x10
> +
> +#define STM32_DMA3_CLBAR(x)		(0x50 + 0x80 * (x))
> +#define STM32_DMA3_CCIDCFGR(x)		(0x54 + 0x80 * (x))
> +#define STM32_DMA3_CSEMCR(x)		(0x58 + 0x80 * (x))
> +#define STM32_DMA3_CFCR(x)		(0x5C + 0x80 * (x))
> +#define STM32_DMA3_CSR(x)		(0x60 + 0x80 * (x))
> +#define STM32_DMA3_CCR(x)		(0x64 + 0x80 * (x))
> +#define STM32_DMA3_CTR1(x)		(0x90 + 0x80 * (x))
> +#define STM32_DMA3_CTR2(x)		(0x94 + 0x80 * (x))
> +#define STM32_DMA3_CBR1(x)		(0x98 + 0x80 * (x))
> +#define STM32_DMA3_CSAR(x)		(0x9C + 0x80 * (x))
> +#define STM32_DMA3_CDAR(x)		(0xA0 + 0x80 * (x))
> +#define STM32_DMA3_CLLR(x)		(0xCC + 0x80 * (x))
> +
> +#define STM32_DMA3_HWCFGR13		0xFC0 /* G_PER_CTRL(X) x=8..15 */
> +#define STM32_DMA3_HWCFGR12		0xFC4 /* G_PER_CTRL(X) x=0..7 */
> +#define STM32_DMA3_HWCFGR4		0xFE4 /* G_FIFO_SIZE(X) x=8..15 */
> +#define STM32_DMA3_HWCFGR3		0xFE8 /* G_FIFO_SIZE(X) x=0..7 */
> +#define STM32_DMA3_HWCFGR2		0xFEC /* G_MAX_REQ_ID */
> +#define STM32_DMA3_HWCFGR1		0xFF0 /* G_MASTER_PORTS, G_NUM_CHANNELS, G_Mx_DATA_WIDTH */
> +#define STM32_DMA3_VERR			0xFF4
> +
> +/* SECCFGR DMA secure configuration register */
> +#define SECCFGR_SEC(x)			BIT(x)
> +
> +/* MISR DMA non-secure/secure masked interrupt status register */
> +#define MISR_MIS(x)			BIT(x)
> +
> +/* CxLBAR DMA channel x linked_list base address register */
> +#define CLBAR_LBA			GENMASK(31, 16)
> +
> +/* CxCIDCFGR DMA channel x CID register */
> +#define CCIDCFGR_CFEN			BIT(0)
> +#define CCIDCFGR_SEM_EN			BIT(1)
> +#define CCIDCFGR_SCID			GENMASK(5, 4)
> +#define CCIDCFGR_SEM_WLIST_CID0		BIT(16)
> +#define CCIDCFGR_SEM_WLIST_CID1		BIT(17)
> +#define CCIDCFGR_SEM_WLIST_CID2		BIT(18)
> +
> +enum ccidcfgr_cid {
> +	CCIDCFGR_CID0,
> +	CCIDCFGR_CID1,
> +	CCIDCFGR_CID2,
> +};
> +
> +/* CxSEMCR DMA channel x semaphore control register */
> +#define CSEMCR_SEM_MUTEX		BIT(0)
> +#define CSEMCR_SEM_CCID			GENMASK(5, 4)
> +
> +/* CxFCR DMA channel x flag clear register */
> +#define CFCR_TCF			BIT(8)
> +#define CFCR_HTF			BIT(9)
> +#define CFCR_DTEF			BIT(10)
> +#define CFCR_ULEF			BIT(11)
> +#define CFCR_USEF			BIT(12)
> +#define CFCR_SUSPF			BIT(13)
> +
> +/* CxSR DMA channel x status register */
> +#define CSR_IDLEF			BIT(0)
> +#define CSR_TCF				BIT(8)
> +#define CSR_HTF				BIT(9)
> +#define CSR_DTEF			BIT(10)
> +#define CSR_ULEF			BIT(11)
> +#define CSR_USEF			BIT(12)
> +#define CSR_SUSPF			BIT(13)
> +#define CSR_ALL_F			GENMASK(13, 8)
> +#define CSR_FIFOL			GENMASK(24, 16)
> +
> +/* CxCR DMA channel x control register */
> +#define CCR_EN				BIT(0)
> +#define CCR_RESET			BIT(1)
> +#define CCR_SUSP			BIT(2)
> +#define CCR_TCIE			BIT(8)
> +#define CCR_HTIE			BIT(9)
> +#define CCR_DTEIE			BIT(10)
> +#define CCR_ULEIE			BIT(11)
> +#define CCR_USEIE			BIT(12)
> +#define CCR_SUSPIE			BIT(13)
> +#define CCR_ALLIE			GENMASK(13, 8)
> +#define CCR_LSM				BIT(16)
> +#define CCR_LAP				BIT(17)
> +#define CCR_PRIO			GENMASK(23, 22)
> +
> +enum ccr_prio {
> +	CCR_PRIO_LOW,
> +	CCR_PRIO_MID,
> +	CCR_PRIO_HIGH,
> +	CCR_PRIO_VERY_HIGH,
> +};
> +
> +/* CxTR1 DMA channel x transfer register 1 */
> +#define CTR1_SINC			BIT(3)
> +#define CTR1_SBL_1			GENMASK(9, 4)
> +#define CTR1_DINC			BIT(19)
> +#define CTR1_DBL_1			GENMASK(25, 20)
> +#define CTR1_SDW_LOG2			GENMASK(1, 0)
> +#define CTR1_PAM			GENMASK(12, 11)
> +#define CTR1_SAP			BIT(14)
> +#define CTR1_DDW_LOG2			GENMASK(17, 16)
> +#define CTR1_DAP			BIT(30)
> +
> +enum ctr1_dw {
> +	CTR1_DW_BYTE,
> +	CTR1_DW_HWORD,
> +	CTR1_DW_WORD,
> +	CTR1_DW_DWORD, /* Depends on HWCFGR1.G_M0_DATA_WIDTH_ENC and .G_M1_DATA_WIDTH_ENC */
> +};
> +
> +enum ctr1_pam {
> +	CTR1_PAM_0S_LT, /* if DDW > SDW, padded with 0s else left-truncated */
> +	CTR1_PAM_SE_RT, /* if DDW > SDW, sign extended else right-truncated */
> +	CTR1_PAM_PACK_UNPACK, /* FIFO queued */
> +};
> +
> +/* CxTR2 DMA channel x transfer register 2 */
> +#define CTR2_REQSEL			GENMASK(7, 0)
> +#define CTR2_SWREQ			BIT(9)
> +#define CTR2_DREQ			BIT(10)
> +#define CTR2_BREQ			BIT(11)
> +#define CTR2_PFREQ			BIT(12)
> +#define CTR2_TCEM			GENMASK(31, 30)
> +
> +enum ctr2_tcem {
> +	CTR2_TCEM_BLOCK,
> +	CTR2_TCEM_REPEAT_BLOCK,
> +	CTR2_TCEM_LLI,
> +	CTR2_TCEM_CHANNEL,
> +};
> +
> +/* CxBR1 DMA channel x block register 1 */
> +#define CBR1_BNDT			GENMASK(15, 0)
> +
> +/* CxLLR DMA channel x linked-list address register */
> +#define CLLR_LA				GENMASK(15, 2)
> +#define CLLR_ULL			BIT(16)
> +#define CLLR_UDA			BIT(27)
> +#define CLLR_USA			BIT(28)
> +#define CLLR_UB1			BIT(29)
> +#define CLLR_UT2			BIT(30)
> +#define CLLR_UT1			BIT(31)
> +
> +/* HWCFGR13 DMA hardware configuration register 13 x=8..15 */
> +/* HWCFGR12 DMA hardware configuration register 12 x=0..7 */
> +#define G_PER_CTRL(x)			(ULL(0x1) << (4 * (x)))
> +
> +/* HWCFGR4 DMA hardware configuration register 4 x=8..15 */
> +/* HWCFGR3 DMA hardware configuration register 3 x=0..7 */
> +#define G_FIFO_SIZE(x)			(ULL(0x7) << (4 * (x)))
> +
> +#define get_chan_hwcfg(x, mask, reg)	(((reg) & (mask)) >> (4 * (x)))
> +
> +/* HWCFGR2 DMA hardware configuration register 2 */
> +#define G_MAX_REQ_ID			GENMASK(7, 0)
> +
> +/* HWCFGR1 DMA hardware configuration register 1 */
> +#define G_MASTER_PORTS			GENMASK(2, 0)
> +#define G_NUM_CHANNELS			GENMASK(12, 8)
> +#define G_M0_DATA_WIDTH_ENC		GENMASK(25, 24)
> +#define G_M1_DATA_WIDTH_ENC		GENMASK(29, 28)
> +
> +enum stm32_dma3_master_ports {
> +	AXI64, /* 1x AXI: 64-bit port 0 */
> +	AHB32, /* 1x AHB: 32-bit port 0 */
> +	AHB32_AHB32, /* 2x AHB: 32-bit port 0 and 32-bit port 1 */
> +	AXI64_AHB32, /* 1x AXI 64-bit port 0 and 1x AHB 32-bit port 1 */
> +	AXI64_AXI64, /* 2x AXI: 64-bit port 0 and 64-bit port 1 */
> +	AXI128_AHB32, /* 1x AXI 128-bit port 0 and 1x AHB 32-bit port 1 */
> +};
> +
> +enum stm32_dma3_port_data_width {
> +	DW_32, /* 32-bit, for AHB */
> +	DW_64, /* 64-bit, for AXI */
> +	DW_128, /* 128-bit, for AXI */
> +	DW_INVALID,
> +};
> +
> +/* VERR DMA version register */
> +#define VERR_MINREV			GENMASK(3, 0)
> +#define VERR_MAJREV			GENMASK(7, 4)
> +
> +/* Device tree */
> +/* struct stm32_dma3_dt_conf */
> +/* .ch_conf */
> +#define STM32_DMA3_DT_PRIO		GENMASK(1, 0) /* CCR_PRIO */
> +#define STM32_DMA3_DT_FIFO		GENMASK(7, 4)
> +/* .tr_conf */
> +#define STM32_DMA3_DT_SINC		BIT(0) /* CTR1_SINC */
> +#define STM32_DMA3_DT_SAP		BIT(1) /* CTR1_SAP */
> +#define STM32_DMA3_DT_DINC		BIT(4) /* CTR1_DINC */
> +#define STM32_DMA3_DT_DAP		BIT(5) /* CTR1_DAP */
> +#define STM32_DMA3_DT_BREQ		BIT(8) /* CTR2_BREQ */
> +#define STM32_DMA3_DT_PFREQ		BIT(9) /* CTR2_PFREQ */
> +#define STM32_DMA3_DT_TCEM		GENMASK(13, 12) /* CTR2_TCEM */
> +
> +#define STM32_DMA3_MAX_BLOCK_SIZE	ALIGN_DOWN(CBR1_BNDT, 64)
> +#define port_is_ahb(maxdw)		({ typeof(maxdw) (_maxdw) = (maxdw); \
> +					   ((_maxdw) != DW_INVALID) && ((_maxdw) == DW_32); })
> +#define port_is_axi(maxdw)		({ typeof(maxdw) (_maxdw) = (maxdw); \
> +					   ((_maxdw) != DW_INVALID) && ((_maxdw) != DW_32); })
> +#define get_chan_max_dw(maxdw, maxburst)((port_is_ahb(maxdw) ||			     \
> +					  (maxburst) < DMA_SLAVE_BUSWIDTH_8_BYTES) ? \
> +					 DMA_SLAVE_BUSWIDTH_4_BYTES : DMA_SLAVE_BUSWIDTH_8_BYTES)
> +
> +/* Static linked-list data structure (depends on update bits UT1/UT2/UB1/USA/UDA/ULL) */
> +struct stm32_dma3_hwdesc {
> +	u32 ctr1;
> +	u32 ctr2;
> +	u32 cbr1;
> +	u32 csar;
> +	u32 cdar;
> +	u32 cllr;
> +} __aligned(32);
> +
> +/*
> + * CLLR_LA / sizeof(struct stm32_dma3_hwdesc) represents the number of hdwdesc that can be addressed
> + * by the pointer to the next linked-list data structure. The __aligned forces the 32-byte
> + * alignment. So use hardcoded 32. Multiplied by the max block size of each item, it represents
> + * the sg size limitation.
> + */
> +#define STM32_DMA3_MAX_SEG_SIZE		((CLLR_LA / 32) * STM32_DMA3_MAX_BLOCK_SIZE)
> +
> +/*
> + * Linked-list items
> + */
> +struct stm32_dma3_lli {
> +	struct stm32_dma3_hwdesc *hwdesc;
> +	dma_addr_t hwdesc_addr;
> +};
> +
> +struct stm32_dma3_swdesc {
> +	struct virt_dma_desc vdesc;
> +	u32 ccr;
> +	bool cyclic;
> +	u32 lli_size;
> +	struct stm32_dma3_lli lli[] __counted_by(lli_size);
> +};
> +
> +struct stm32_dma3_dt_conf {
> +	u32 ch_id;
> +	u32 req_line;
> +	u32 ch_conf;
> +	u32 tr_conf;
> +};
> +
> +struct stm32_dma3_chan {
> +	struct virt_dma_chan vchan;
> +	u32 id;
> +	int irq;
> +	u32 fifo_size;
> +	u32 max_burst;
> +	bool semaphore_mode;
> +	struct stm32_dma3_dt_conf dt_config;
> +	struct dma_slave_config dma_config;
> +	struct dma_pool *lli_pool;
> +	struct stm32_dma3_swdesc *swdesc;
> +	enum ctr2_tcem tcem;
> +	u32 dma_status;
> +};
> +
> +struct stm32_dma3_ddata {
> +	struct dma_device dma_dev;
> +	void __iomem *base;
> +	struct clk *clk;
> +	struct stm32_dma3_chan *chans;
> +	u32 dma_channels;
> +	u32 dma_requests;
> +	enum stm32_dma3_port_data_width ports_max_dw[2];
> +};
> +
> +static inline struct stm32_dma3_ddata *to_stm32_dma3_ddata(struct stm32_dma3_chan *chan)
> +{
> +	return container_of(chan->vchan.chan.device, struct stm32_dma3_ddata, dma_dev);
> +}
> +
> +static inline struct stm32_dma3_chan *to_stm32_dma3_chan(struct dma_chan *c)
> +{
> +	return container_of(c, struct stm32_dma3_chan, vchan.chan);
> +}
> +
> +static inline struct stm32_dma3_swdesc *to_stm32_dma3_swdesc(struct virt_dma_desc *vdesc)
> +{
> +	return container_of(vdesc, struct stm32_dma3_swdesc, vdesc);
> +}
> +
> +static struct device *chan2dev(struct stm32_dma3_chan *chan)
> +{
> +	return &chan->vchan.chan.dev->device;
> +}
> +
> +static void stm32_dma3_chan_dump_reg(struct stm32_dma3_chan *chan)
> +{
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	struct device *dev = chan2dev(chan);
> +	u32 id = chan->id, offset;
> +
> +	offset = STM32_DMA3_SECCFGR;
> +	dev_dbg(dev, "SECCFGR(0x%03x): %08x\n", offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_PRIVCFGR;
> +	dev_dbg(dev, "PRIVCFGR(0x%03x): %08x\n", offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CCIDCFGR(id);
> +	dev_dbg(dev, "C%dCIDCFGR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CSEMCR(id);
> +	dev_dbg(dev, "C%dSEMCR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CSR(id);
> +	dev_dbg(dev, "C%dSR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CCR(id);
> +	dev_dbg(dev, "C%dCR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CTR1(id);
> +	dev_dbg(dev, "C%dTR1(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CTR2(id);
> +	dev_dbg(dev, "C%dTR2(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CBR1(id);
> +	dev_dbg(dev, "C%dBR1(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CSAR(id);
> +	dev_dbg(dev, "C%dSAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CDAR(id);
> +	dev_dbg(dev, "C%dDAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CLLR(id);
> +	dev_dbg(dev, "C%dLLR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +	offset = STM32_DMA3_CLBAR(id);
> +	dev_dbg(dev, "C%dLBAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
> +}
> +
> +static void stm32_dma3_chan_dump_hwdesc(struct stm32_dma3_chan *chan,
> +					struct stm32_dma3_swdesc *swdesc)
> +{
> +	struct stm32_dma3_hwdesc *hwdesc;
> +	int i;
> +
> +	for (i = 0; i < swdesc->lli_size; i++) {
> +		hwdesc = swdesc->lli[i].hwdesc;
> +		if (i)
> +			dev_dbg(chan2dev(chan), "V\n");
> +		dev_dbg(chan2dev(chan), "[%d]@%pad\n", i, &swdesc->lli[i].hwdesc_addr);
> +		dev_dbg(chan2dev(chan), "| C%dTR1: %08x\n", chan->id, hwdesc->ctr1);
> +		dev_dbg(chan2dev(chan), "| C%dTR2: %08x\n", chan->id, hwdesc->ctr2);
> +		dev_dbg(chan2dev(chan), "| C%dBR1: %08x\n", chan->id, hwdesc->cbr1);
> +		dev_dbg(chan2dev(chan), "| C%dSAR: %08x\n", chan->id, hwdesc->csar);
> +		dev_dbg(chan2dev(chan), "| C%dDAR: %08x\n", chan->id, hwdesc->cdar);
> +		dev_dbg(chan2dev(chan), "| C%dLLR: %08x\n", chan->id, hwdesc->cllr);
> +	}
> +
> +	if (swdesc->cyclic) {
> +		dev_dbg(chan2dev(chan), "|\n");
> +		dev_dbg(chan2dev(chan), "-->[0]@%pad\n", &swdesc->lli[0].hwdesc_addr);
> +	} else {
> +		dev_dbg(chan2dev(chan), "X\n");
> +	}
> +}
> +
> +static struct stm32_dma3_swdesc *stm32_dma3_chan_desc_alloc(struct stm32_dma3_chan *chan, u32 count)
> +{
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	struct stm32_dma3_swdesc *swdesc;
> +	int i;
> +
> +	/*
> +	 * If the memory to be allocated for the number of hwdesc (6 u32 members but 32-bytes
> +	 * aligned) is greater than the maximum address of CLLR_LA, then the last items can't be
> +	 * addressed, so abort the allocation.
> +	 */
> +	if ((count * 32) > CLLR_LA) {
> +		dev_err(chan2dev(chan), "Transfer is too big (> %luB)\n", STM32_DMA3_MAX_SEG_SIZE);
> +		return NULL;
> +	}
> +
> +	swdesc = kzalloc(struct_size(swdesc, lli, count), GFP_NOWAIT);
> +	if (!swdesc)
> +		return NULL;
> +
> +	for (i = 0; i < count; i++) {
> +		swdesc->lli[i].hwdesc = dma_pool_zalloc(chan->lli_pool, GFP_NOWAIT,
> +							&swdesc->lli[i].hwdesc_addr);
> +		if (!swdesc->lli[i].hwdesc)
> +			goto err_pool_free;
> +	}
> +	swdesc->lli_size = count;
> +	swdesc->ccr = 0;
> +
> +	/* Set LL base address */
> +	writel_relaxed(swdesc->lli[0].hwdesc_addr & CLBAR_LBA,
> +		       ddata->base + STM32_DMA3_CLBAR(chan->id));
> +
> +	/* Set LL allocated port */
> +	swdesc->ccr &= ~CCR_LAP;
> +
> +	return swdesc;
> +
> +err_pool_free:
> +	dev_err(chan2dev(chan), "Failed to alloc descriptors\n");
> +	while (--i >= 0)
> +		dma_pool_free(chan->lli_pool, swdesc->lli[i].hwdesc, swdesc->lli[i].hwdesc_addr);
> +	kfree(swdesc);
> +
> +	return NULL;
> +}
> +
> +static void stm32_dma3_chan_desc_free(struct stm32_dma3_chan *chan,
> +				      struct stm32_dma3_swdesc *swdesc)
> +{
> +	int i;
> +
> +	for (i = 0; i < swdesc->lli_size; i++)
> +		dma_pool_free(chan->lli_pool, swdesc->lli[i].hwdesc, swdesc->lli[i].hwdesc_addr);
> +
> +	kfree(swdesc);
> +}
> +
> +static void stm32_dma3_chan_vdesc_free(struct virt_dma_desc *vdesc)
> +{
> +	struct stm32_dma3_swdesc *swdesc = to_stm32_dma3_swdesc(vdesc);
> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(vdesc->tx.chan);
> +
> +	stm32_dma3_chan_desc_free(chan, swdesc);
> +}
> +
> +static void stm32_dma3_check_user_setting(struct stm32_dma3_chan *chan)
> +{
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	struct device *dev = chan2dev(chan);
> +	u32 ctr1 = readl_relaxed(ddata->base + STM32_DMA3_CTR1(chan->id));
> +	u32 cbr1 = readl_relaxed(ddata->base + STM32_DMA3_CBR1(chan->id));
> +	u32 csar = readl_relaxed(ddata->base + STM32_DMA3_CSAR(chan->id));
> +	u32 cdar = readl_relaxed(ddata->base + STM32_DMA3_CDAR(chan->id));
> +	u32 cllr = readl_relaxed(ddata->base + STM32_DMA3_CLLR(chan->id));
> +	u32 bndt = FIELD_GET(CBR1_BNDT, cbr1);
> +	u32 sdw = 1 << FIELD_GET(CTR1_SDW_LOG2, ctr1);
> +	u32 ddw = 1 << FIELD_GET(CTR1_DDW_LOG2, ctr1);
> +	u32 sap = FIELD_GET(CTR1_SAP, ctr1);
> +	u32 dap = FIELD_GET(CTR1_DAP, ctr1);
> +
> +	if (!bndt && !FIELD_GET(CLLR_UB1, cllr))
> +		dev_err(dev, "null source block size and no update of this value\n");
> +	if (bndt % sdw)
> +		dev_err(dev, "source block size not multiple of src data width\n");
> +	if (FIELD_GET(CTR1_PAM, ctr1) == CTR1_PAM_PACK_UNPACK && bndt % ddw)
> +		dev_err(dev, "(un)packing mode w/ src block size not multiple of dst data width\n");
> +	if (csar % sdw)
> +		dev_err(dev, "unaligned source address not multiple of src data width\n");
> +	if (cdar % ddw)
> +		dev_err(dev, "unaligned destination address not multiple of dst data width\n");
> +	if (sdw == DMA_SLAVE_BUSWIDTH_8_BYTES && port_is_ahb(ddata->ports_max_dw[sap]))
> +		dev_err(dev, "double-word source data width not supported on port %u\n", sap);
> +	if (ddw == DMA_SLAVE_BUSWIDTH_8_BYTES && port_is_ahb(ddata->ports_max_dw[dap]))
> +		dev_err(dev, "double-word destination data width not supported on port %u\n", dap);
> +}
> +
> +static void stm32_dma3_chan_prep_hwdesc(struct stm32_dma3_chan *chan,
> +					struct stm32_dma3_swdesc *swdesc,
> +					u32 curr, dma_addr_t src, dma_addr_t dst, u32 len,
> +					u32 ctr1, u32 ctr2, bool is_last, bool is_cyclic)
> +{
> +	struct stm32_dma3_hwdesc *hwdesc;
> +	dma_addr_t next_lli;
> +	u32 next = curr + 1;
> +
> +	hwdesc = swdesc->lli[curr].hwdesc;
> +	hwdesc->ctr1 = ctr1;
> +	hwdesc->ctr2 = ctr2;
> +	hwdesc->cbr1 = FIELD_PREP(CBR1_BNDT, len);
> +	hwdesc->csar = src;
> +	hwdesc->cdar = dst;
> +
> +	if (is_last) {
> +		if (is_cyclic)
> +			next_lli = swdesc->lli[0].hwdesc_addr;
> +		else
> +			next_lli = 0;
> +	} else {
> +		next_lli = swdesc->lli[next].hwdesc_addr;
> +	}
> +
> +	hwdesc->cllr = 0;
> +	if (next_lli) {
> +		hwdesc->cllr |= CLLR_UT1 | CLLR_UT2 | CLLR_UB1;
> +		hwdesc->cllr |= CLLR_USA | CLLR_UDA | CLLR_ULL;
> +		hwdesc->cllr |= (next_lli & CLLR_LA);
> +	}
> +}
> +
> +static enum dma_slave_buswidth stm32_dma3_get_max_dw(u32 chan_max_burst,
> +						     enum stm32_dma3_port_data_width port_max_dw,
> +						     u32 len, dma_addr_t addr)
> +{
> +	enum dma_slave_buswidth max_dw = get_chan_max_dw(port_max_dw, chan_max_burst);
> +
> +	/* len and addr must be a multiple of dw */
> +	return 1 << __ffs(len | addr | max_dw);
> +}
> +
> +static u32 stm32_dma3_get_max_burst(u32 len, enum dma_slave_buswidth dw, u32 chan_max_burst)
> +{
> +	u32 max_burst = chan_max_burst ? chan_max_burst / dw : 1;
> +
> +	/* len is a multiple of dw, so if len is < chan_max_burst, shorten burst */
> +	if (len < chan_max_burst)
> +		max_burst = len / dw;
> +
> +	/*
> +	 * HW doesn't modify the burst if burst size <= half of the fifo size.
> +	 * If len is not a multiple of burst size, last burst is shortened by HW.
> +	 */
> +	return max_burst;
> +}
> +
> +static int stm32_dma3_chan_prep_hw(struct stm32_dma3_chan *chan, enum dma_transfer_direction dir,
> +				   u32 *ccr, u32 *ctr1, u32 *ctr2,
> +				   dma_addr_t src_addr, dma_addr_t dst_addr, u32 len)
> +{
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	struct dma_device dma_device = ddata->dma_dev;
> +	u32 sdw, ddw, sbl_max, dbl_max, tcem;
> +	u32 _ctr1 = 0, _ctr2 = 0;
> +	u32 ch_conf = chan->dt_config.ch_conf;
> +	u32 tr_conf = chan->dt_config.tr_conf;
> +	u32 sap = FIELD_GET(STM32_DMA3_DT_SAP, tr_conf), sap_max_dw;
> +	u32 dap = FIELD_GET(STM32_DMA3_DT_DAP, tr_conf), dap_max_dw;
> +
> +	dev_dbg(chan2dev(chan), "%s from %pad to %pad\n",
> +		dmaengine_get_direction_text(dir), &src_addr, &dst_addr);
> +
> +	sdw = chan->dma_config.src_addr_width ? : get_chan_max_dw(sap, chan->max_burst);
> +	ddw = chan->dma_config.dst_addr_width ? : get_chan_max_dw(dap, chan->max_burst);
> +	sbl_max = chan->dma_config.src_maxburst ? : 1;
> +	dbl_max = chan->dma_config.dst_maxburst ? : 1;
> +
> +	/* Following conditions would raise User Setting Error interrupt */
> +	if (!(dma_device.src_addr_widths & BIT(sdw)) || !(dma_device.dst_addr_widths & BIT(ddw))) {
> +		dev_err(chan2dev(chan), "Bus width (src=%u, dst=%u) not supported\n", sdw, ddw);
> +		return -EINVAL;
> +	}
> +
> +	if (ddata->ports_max_dw[1] == DW_INVALID && (sap || dap)) {
> +		dev_err(chan2dev(chan), "Only one master port, port 1 is not supported\n");
> +		return -EINVAL;
> +	}
> +
> +	sap_max_dw = ddata->ports_max_dw[sap];
> +	dap_max_dw = ddata->ports_max_dw[dap];
> +	if ((port_is_ahb(sap_max_dw) && sdw == DMA_SLAVE_BUSWIDTH_8_BYTES) ||
> +	    (port_is_ahb(dap_max_dw) && ddw == DMA_SLAVE_BUSWIDTH_8_BYTES)) {
> +		dev_err(chan2dev(chan),
> +			"8 bytes buswidth (src=%u, dst=%u) not supported on port (sap=%u, dap=%u\n",
> +			sdw, ddw, sap, dap);
> +		return -EINVAL;
> +	}
> +
> +	if (FIELD_GET(STM32_DMA3_DT_SINC, tr_conf))
> +		_ctr1 |= CTR1_SINC;
> +	if (sap)
> +		_ctr1 |= CTR1_SAP;
> +	if (FIELD_GET(STM32_DMA3_DT_DINC, tr_conf))
> +		_ctr1 |= CTR1_DINC;
> +	if (dap)
> +		_ctr1 |= CTR1_DAP;
> +
> +	_ctr2 |= FIELD_PREP(CTR2_REQSEL, chan->dt_config.req_line) & ~CTR2_SWREQ;
> +	if (FIELD_GET(STM32_DMA3_DT_BREQ, tr_conf))
> +		_ctr2 |= CTR2_BREQ;
> +	if (dir == DMA_DEV_TO_MEM && FIELD_GET(STM32_DMA3_DT_PFREQ, tr_conf))
> +		_ctr2 |= CTR2_PFREQ;
> +	tcem = FIELD_GET(STM32_DMA3_DT_TCEM, tr_conf);
> +	_ctr2 |= FIELD_PREP(CTR2_TCEM, tcem);
> +
> +	/* Store TCEM to know on which event TC flag occurred */
> +	chan->tcem = tcem;
> +	/* Store direction for residue computation */
> +	chan->dma_config.direction = dir;
> +
> +	switch (dir) {
> +	case DMA_MEM_TO_DEV:
> +		/* Set destination (device) data width and burst */
> +		ddw = min_t(u32, ddw, stm32_dma3_get_max_dw(chan->max_burst, dap_max_dw,
> +							    len, dst_addr));
> +		dbl_max = min_t(u32, dbl_max, stm32_dma3_get_max_burst(len, ddw, chan->max_burst));
> +
> +		/* Set source (memory) data width and burst */
> +		sdw = stm32_dma3_get_max_dw(chan->max_burst, sap_max_dw, len, src_addr);
> +		sbl_max = stm32_dma3_get_max_burst(len, sdw, chan->max_burst);
> +
> +		_ctr1 |= FIELD_PREP(CTR1_SDW_LOG2, ilog2(sdw));
> +		_ctr1 |= FIELD_PREP(CTR1_SBL_1, sbl_max - 1);
> +		_ctr1 |= FIELD_PREP(CTR1_DDW_LOG2, ilog2(ddw));
> +		_ctr1 |= FIELD_PREP(CTR1_DBL_1, dbl_max - 1);
> +
> +		if (ddw != sdw) {
> +			_ctr1 |= FIELD_PREP(CTR1_PAM, CTR1_PAM_PACK_UNPACK);
> +			/* Should never reach this case as ddw is clamped down */
> +			if (len & (ddw - 1)) {
> +				dev_err(chan2dev(chan),
> +					"Packing mode is enabled and len is not multiple of ddw");
> +				return -EINVAL;
> +			}
> +		}
> +
> +		/* dst = dev */
> +		_ctr2 |= CTR2_DREQ;
> +
> +		break;
> +
> +	case DMA_DEV_TO_MEM:
> +		/* Set source (device) data width and burst */
> +		sdw = min_t(u32, sdw, stm32_dma3_get_max_dw(chan->max_burst, sap_max_dw,
> +							    len, src_addr));
> +		sbl_max = min_t(u32, sbl_max, stm32_dma3_get_max_burst(len, sdw, chan->max_burst));
> +
> +		/* Set destination (memory) data width and burst */
> +		ddw = stm32_dma3_get_max_dw(chan->max_burst, dap_max_dw, len, dst_addr);
> +		dbl_max = stm32_dma3_get_max_burst(len, ddw, chan->max_burst);
> +
> +		_ctr1 |= FIELD_PREP(CTR1_SDW_LOG2, ilog2(sdw));
> +		_ctr1 |= FIELD_PREP(CTR1_SBL_1, sbl_max - 1);
> +		_ctr1 |= FIELD_PREP(CTR1_DDW_LOG2, ilog2(ddw));
> +		_ctr1 |= FIELD_PREP(CTR1_DBL_1, dbl_max - 1);
> +
> +		if (ddw != sdw) {
> +			_ctr1 |= FIELD_PREP(CTR1_PAM, CTR1_PAM_PACK_UNPACK);
> +			/* Should never reach this case as ddw is clamped down */
> +			if (len & (ddw - 1)) {
> +				dev_err(chan2dev(chan),
> +					"Packing mode is enabled and len is not multiple of ddw\n");
> +				return -EINVAL;
> +			}
> +		}
> +
> +		/* dst = mem */
> +		_ctr2 &= ~CTR2_DREQ;
> +
> +		break;
> +
> +	default:
> +		dev_err(chan2dev(chan), "Direction %s not supported\n",
> +			dmaengine_get_direction_text(dir));
> +		return -EINVAL;
> +	}
> +
> +	*ccr |= FIELD_PREP(CCR_PRIO, FIELD_GET(STM32_DMA3_DT_PRIO, ch_conf));
> +	*ctr1 = _ctr1;
> +	*ctr2 = _ctr2;
> +
> +	dev_dbg(chan2dev(chan), "%s: sdw=%u bytes sbl=%u beats ddw=%u bytes dbl=%u beats\n",
> +		__func__, sdw, sbl_max, ddw, dbl_max);
> +
> +	return 0;
> +}
> +
> +static void stm32_dma3_chan_start(struct stm32_dma3_chan *chan)
> +{
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	struct virt_dma_desc *vdesc;
> +	struct stm32_dma3_hwdesc *hwdesc;
> +	u32 id = chan->id;
> +	u32 csr, ccr;
> +
> +	vdesc = vchan_next_desc(&chan->vchan);
> +	if (!vdesc) {
> +		chan->swdesc = NULL;
> +		return;
> +	}
> +	list_del(&vdesc->node);
> +
> +	chan->swdesc = to_stm32_dma3_swdesc(vdesc);
> +	hwdesc = chan->swdesc->lli[0].hwdesc;
> +
> +	stm32_dma3_chan_dump_hwdesc(chan, chan->swdesc);
> +
> +	writel_relaxed(chan->swdesc->ccr, ddata->base + STM32_DMA3_CCR(id));
> +	writel_relaxed(hwdesc->ctr1, ddata->base + STM32_DMA3_CTR1(id));
> +	writel_relaxed(hwdesc->ctr2, ddata->base + STM32_DMA3_CTR2(id));
> +	writel_relaxed(hwdesc->cbr1, ddata->base + STM32_DMA3_CBR1(id));
> +	writel_relaxed(hwdesc->csar, ddata->base + STM32_DMA3_CSAR(id));
> +	writel_relaxed(hwdesc->cdar, ddata->base + STM32_DMA3_CDAR(id));
> +	writel_relaxed(hwdesc->cllr, ddata->base + STM32_DMA3_CLLR(id));
> +
> +	/* Clear any pending interrupts */
> +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(id));
> +	if (csr & CSR_ALL_F)
> +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(id));
> +
> +	stm32_dma3_chan_dump_reg(chan);
> +
> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(id));
> +	writel_relaxed(ccr | CCR_EN, ddata->base + STM32_DMA3_CCR(id));

This one should use writel() instead of writel_relaxed(): a barrier (e.g.
dma_wmb()) is needed here so that the previous register writes are complete
before the channel is enabled.
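
A minimal sketch of what I have in mind (untested; only the enabling write
needs to be ordered, the descriptor register writes above can stay relaxed):

	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(id));
	/* writel() implies a write barrier, so the CTR1/CTR2/CBR1/CSAR/CDAR/CLLR
	 * writes above are observable before the channel is enabled.
	 */
	writel(ccr | CCR_EN, ddata->base + STM32_DMA3_CCR(id));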

Frank

> +
> +	chan->dma_status = DMA_IN_PROGRESS;
> +
> +	dev_dbg(chan2dev(chan), "vchan %pK: started\n", &chan->vchan);
> +}
> +
> +static int stm32_dma3_chan_suspend(struct stm32_dma3_chan *chan, bool susp)
> +{
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	u32 csr, ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
> +	int ret = 0;
> +
> +	if (susp)
> +		ccr |= CCR_SUSP;
> +	else
> +		ccr &= ~CCR_SUSP;
> +
> +	writel_relaxed(ccr, ddata->base + STM32_DMA3_CCR(chan->id));
> +
> +	if (susp) {
> +		ret = readl_relaxed_poll_timeout_atomic(ddata->base + STM32_DMA3_CSR(chan->id), csr,
> +							csr & CSR_SUSPF, 1, 10);
> +		if (!ret)
> +			writel_relaxed(CFCR_SUSPF, ddata->base + STM32_DMA3_CFCR(chan->id));
> +
> +		stm32_dma3_chan_dump_reg(chan);
> +	}
> +
> +	return ret;
> +}
> +
> +static void stm32_dma3_chan_reset(struct stm32_dma3_chan *chan)
> +{
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	u32 ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
> +
> +	writel_relaxed(ccr |= CCR_RESET, ddata->base + STM32_DMA3_CCR(chan->id));
> +}
> +
> +static int stm32_dma3_chan_stop(struct stm32_dma3_chan *chan)
> +{
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	u32 ccr;
> +	int ret = 0;
> +
> +	chan->dma_status = DMA_COMPLETE;
> +
> +	/* Disable interrupts */
> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id));
> +	writel_relaxed(ccr & ~(CCR_ALLIE | CCR_EN), ddata->base + STM32_DMA3_CCR(chan->id));
> +
> +	if (!(ccr & CCR_SUSP) && (ccr & CCR_EN)) {
> +		/* Suspend the channel */
> +		ret = stm32_dma3_chan_suspend(chan, true);
> +		if (ret)
> +			dev_warn(chan2dev(chan), "%s: timeout, data might be lost\n", __func__);
> +	}
> +
> +	/*
> +	 * Reset the channel: this causes the reset of the FIFO and the reset of the channel
> +	 * internal state, the reset of CCR_EN and CCR_SUSP bits.
> +	 */
> +	stm32_dma3_chan_reset(chan);
> +
> +	return ret;
> +}
> +
> +static void stm32_dma3_chan_complete(struct stm32_dma3_chan *chan)
> +{
> +	if (!chan->swdesc)
> +		return;
> +
> +	vchan_cookie_complete(&chan->swdesc->vdesc);
> +	chan->swdesc = NULL;
> +	stm32_dma3_chan_start(chan);
> +}
> +
> +static irqreturn_t stm32_dma3_chan_irq(int irq, void *devid)
> +{
> +	struct stm32_dma3_chan *chan = devid;
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	u32 misr, csr, ccr;
> +
> +	spin_lock(&chan->vchan.lock);
> +
> +	misr = readl_relaxed(ddata->base + STM32_DMA3_MISR);
> +	if (!(misr & MISR_MIS(chan->id))) {
> +		spin_unlock(&chan->vchan.lock);
> +		return IRQ_NONE;
> +	}
> +
> +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(chan->id));
> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & CCR_ALLIE;
> +
> +	if (csr & CSR_TCF && ccr & CCR_TCIE) {
> +		if (chan->swdesc->cyclic)
> +			vchan_cyclic_callback(&chan->swdesc->vdesc);
> +		else
> +			stm32_dma3_chan_complete(chan);
> +	}
> +
> +	if (csr & CSR_USEF && ccr & CCR_USEIE) {
> +		dev_err(chan2dev(chan), "User setting error\n");
> +		chan->dma_status = DMA_ERROR;
> +		/* CCR.EN automatically cleared by HW */
> +		stm32_dma3_check_user_setting(chan);
> +		stm32_dma3_chan_reset(chan);
> +	}
> +
> +	if (csr & CSR_ULEF && ccr & CCR_ULEIE) {
> +		dev_err(chan2dev(chan), "Update link transfer error\n");
> +		chan->dma_status = DMA_ERROR;
> +		/* CCR.EN automatically cleared by HW */
> +		stm32_dma3_chan_reset(chan);
> +	}
> +
> +	if (csr & CSR_DTEF && ccr & CCR_DTEIE) {
> +		dev_err(chan2dev(chan), "Data transfer error\n");
> +		chan->dma_status = DMA_ERROR;
> +		/* CCR.EN automatically cleared by HW */
> +		stm32_dma3_chan_reset(chan);
> +	}
> +
> +	/*
> +	 * Half Transfer Interrupt may be disabled but Half Transfer Flag can be set,
> +	 * ensure HTF flag to be cleared, with other flags.
> +	 */
> +	csr &= (ccr | CCR_HTIE);
> +
> +	if (csr)
> +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(chan->id));
> +
> +	spin_unlock(&chan->vchan.lock);
> +
> +	return IRQ_HANDLED;
> +}
> +
> +static int stm32_dma3_alloc_chan_resources(struct dma_chan *c)
> +{
> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	u32 id = chan->id, csemcr, ccid;
> +	int ret;
> +
> +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
> +	if (ret < 0)
> +		return ret;

Doing the runtime PM get at DMA channel allocation is not preferred: many
client drivers allocate a DMA channel without actually using DMA right away.

Ideally, do the resume/get in issue_pending (see the sketch below); please
refer to pl330.c.

You may also add runtime PM in a later patch, after the basic enablement
patch.
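
Roughly, the pl330.c pattern transposed to this driver would look like the
sketch below (just an illustration, not tested on this hardware; it assumes
pm_runtime_irq_safe() is set in probe so the get can be done under the vchan
lock, and that the matching put is done once the channel goes idle, e.g. on
completion or terminate_all):

static void stm32_dma3_issue_pending(struct dma_chan *c)
{
	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
	unsigned long flags;

	spin_lock_irqsave(&chan->vchan.lock, flags);

	if (vchan_issue_pending(&chan->vchan) && !chan->swdesc) {
		/* First transfer issued on an idle channel: wake the controller up */
		pm_runtime_get_sync(ddata->dma_dev.dev);
		stm32_dma3_chan_start(chan);
	}

	spin_unlock_irqrestore(&chan->vchan.lock, flags);
}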

Frank

> +
> +	/* Ensure the channel is free */
> +	if (chan->semaphore_mode &&
> +	    readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id)) & CSEMCR_SEM_MUTEX) {
> +		ret = -EBUSY;
> +		goto err_put_sync;
> +	}
> +
> +	chan->lli_pool = dmam_pool_create(dev_name(&c->dev->device), c->device->dev,
> +					  sizeof(struct stm32_dma3_hwdesc),
> +					  __alignof__(struct stm32_dma3_hwdesc), 0);
> +	if (!chan->lli_pool) {
> +		dev_err(chan2dev(chan), "Failed to create LLI pool\n");
> +		ret = -ENOMEM;
> +		goto err_put_sync;
> +	}
> +
> +	/* Take the channel semaphore */
> +	if (chan->semaphore_mode) {
> +		writel_relaxed(CSEMCR_SEM_MUTEX, ddata->base + STM32_DMA3_CSEMCR(id));
> +		csemcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(id));
> +		ccid = FIELD_GET(CSEMCR_SEM_CCID, csemcr);
> +		/* Check that the channel is well taken */
> +		if (ccid != CCIDCFGR_CID1) {
> +			dev_err(chan2dev(chan), "Not under CID1 control (in-use by CID%d)\n", ccid);
> +			ret = -EPERM;
> +			goto err_pool_destroy;
> +		}
> +		dev_dbg(chan2dev(chan), "Under CID1 control (semcr=0x%08x)\n", csemcr);
> +	}
> +
> +	return 0;
> +
> +err_pool_destroy:
> +	dmam_pool_destroy(chan->lli_pool);
> +	chan->lli_pool = NULL;
> +
> +err_put_sync:
> +	pm_runtime_put_sync(ddata->dma_dev.dev);
> +
> +	return ret;
> +}
> +
> +static void stm32_dma3_free_chan_resources(struct dma_chan *c)
> +{
> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	unsigned long flags;
> +
> +	/* Ensure channel is in idle state */
> +	spin_lock_irqsave(&chan->vchan.lock, flags);
> +	stm32_dma3_chan_stop(chan);
> +	chan->swdesc = NULL;
> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
> +
> +	vchan_free_chan_resources(to_virt_chan(c));
> +
> +	dmam_pool_destroy(chan->lli_pool);
> +	chan->lli_pool = NULL;
> +
> +	/* Release the channel semaphore */
> +	if (chan->semaphore_mode)
> +		writel_relaxed(0, ddata->base + STM32_DMA3_CSEMCR(chan->id));
> +
> +	pm_runtime_put_sync(ddata->dma_dev.dev);
> +
> +	/* Reset configuration */
> +	memset(&chan->dt_config, 0, sizeof(chan->dt_config));
> +	memset(&chan->dma_config, 0, sizeof(chan->dma_config));
> +}
> +
> +static struct dma_async_tx_descriptor *stm32_dma3_prep_slave_sg(struct dma_chan *c,
> +								struct scatterlist *sgl,
> +								unsigned int sg_len,
> +								enum dma_transfer_direction dir,
> +								unsigned long flags, void *context)
> +{
> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> +	struct stm32_dma3_swdesc *swdesc;
> +	struct scatterlist *sg;
> +	size_t len;
> +	dma_addr_t sg_addr, dev_addr, src, dst;
> +	u32 i, j, count, ctr1, ctr2;
> +	int ret;
> +
> +	count = sg_len;
> +	for_each_sg(sgl, sg, sg_len, i) {
> +		len = sg_dma_len(sg);
> +		if (len > STM32_DMA3_MAX_BLOCK_SIZE)
> +			count += DIV_ROUND_UP(len, STM32_DMA3_MAX_BLOCK_SIZE) - 1;
> +	}
> +
> +	swdesc = stm32_dma3_chan_desc_alloc(chan, count);
> +	if (!swdesc)
> +		return NULL;
> +
> +	/* sg_len and i correspond to the initial sgl; count and j correspond to the hwdesc LL */
> +	j = 0;
> +	for_each_sg(sgl, sg, sg_len, i) {
> +		sg_addr = sg_dma_address(sg);
> +		dev_addr = (dir == DMA_MEM_TO_DEV) ? chan->dma_config.dst_addr :
> +						     chan->dma_config.src_addr;
> +		len = sg_dma_len(sg);
> +
> +		do {
> +			size_t chunk = min_t(size_t, len, STM32_DMA3_MAX_BLOCK_SIZE);
> +
> +			if (dir == DMA_MEM_TO_DEV) {
> +				src = sg_addr;
> +				dst = dev_addr;
> +
> +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
> +							      src, dst, chunk);
> +
> +				if (FIELD_GET(CTR1_DINC, ctr1))
> +					dev_addr += chunk;
> +			} else { /* (dir == DMA_DEV_TO_MEM || dir == DMA_MEM_TO_MEM) */
> +				src = dev_addr;
> +				dst = sg_addr;
> +
> +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
> +							      src, dst, chunk);
> +
> +				if (FIELD_GET(CTR1_SINC, ctr1))
> +					dev_addr += chunk;
> +			}
> +
> +			if (ret)
> +				goto err_desc_free;
> +
> +			stm32_dma3_chan_prep_hwdesc(chan, swdesc, j, src, dst, chunk,
> +						    ctr1, ctr2, j == (count - 1), false);
> +
> +			sg_addr += chunk;
> +			len -= chunk;
> +			j++;
> +		} while (len);
> +	}
> +
> +	/* Enable Error interrupts */
> +	swdesc->ccr |= CCR_USEIE | CCR_ULEIE | CCR_DTEIE;
> +	/* Enable Transfer state interrupts */
> +	swdesc->ccr |= CCR_TCIE;
> +
> +	swdesc->cyclic = false;
> +
> +	return vchan_tx_prep(&chan->vchan, &swdesc->vdesc, flags);
> +
> +err_desc_free:
> +	stm32_dma3_chan_desc_free(chan, swdesc);
> +
> +	return NULL;
> +}
> +
> +static void stm32_dma3_caps(struct dma_chan *c, struct dma_slave_caps *caps)
> +{
> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> +
> +	if (!chan->fifo_size) {
> +		caps->max_burst = 0;
> +		caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> +		caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> +	} else {
> +		/* Burst transfer should not exceed half of the fifo size */
> +		caps->max_burst = chan->max_burst;
> +		if (caps->max_burst < DMA_SLAVE_BUSWIDTH_8_BYTES) {
> +			caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> +			caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> +		}
> +	}
> +}
> +
> +static int stm32_dma3_config(struct dma_chan *c, struct dma_slave_config *config)
> +{
> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> +
> +	memcpy(&chan->dma_config, config, sizeof(*config));
> +
> +	return 0;
> +}
> +
> +static int stm32_dma3_terminate_all(struct dma_chan *c)
> +{
> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> +	unsigned long flags;
> +	LIST_HEAD(head);
> +
> +	spin_lock_irqsave(&chan->vchan.lock, flags);
> +
> +	if (chan->swdesc) {
> +		vchan_terminate_vdesc(&chan->swdesc->vdesc);
> +		chan->swdesc = NULL;
> +	}
> +
> +	stm32_dma3_chan_stop(chan);
> +
> +	vchan_get_all_descriptors(&chan->vchan, &head);
> +
> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
> +	vchan_dma_desc_free_list(&chan->vchan, &head);
> +
> +	dev_dbg(chan2dev(chan), "vchan %pK: terminated\n", &chan->vchan);
> +
> +	return 0;
> +}
> +
> +static void stm32_dma3_synchronize(struct dma_chan *c)
> +{
> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> +
> +	vchan_synchronize(&chan->vchan);
> +}
> +
> +static void stm32_dma3_issue_pending(struct dma_chan *c)
> +{
> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&chan->vchan.lock, flags);
> +
> +	if (vchan_issue_pending(&chan->vchan) && !chan->swdesc) {
> +		dev_dbg(chan2dev(chan), "vchan %pK: issued\n", &chan->vchan);
> +		stm32_dma3_chan_start(chan);
> +	}
> +
> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
> +}
> +
> +static bool stm32_dma3_filter_fn(struct dma_chan *c, void *fn_param)
> +{
> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> +	struct stm32_dma3_dt_conf *conf = fn_param;
> +	u32 mask, semcr;
> +	int ret;
> +
> +	dev_dbg(c->device->dev, "%s(%s): req_line=%d ch_conf=%08x tr_conf=%08x\n",
> +		__func__, dma_chan_name(c), conf->req_line, conf->ch_conf, conf->tr_conf);
> +
> +	if (!of_property_read_u32(c->device->dev->of_node, "dma-channel-mask", &mask))
> +		if (!(mask & BIT(chan->id)))
> +			return false;
> +
> +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
> +	if (ret < 0)
> +		return false;
> +	semcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id));
> +	pm_runtime_put_sync(ddata->dma_dev.dev);
> +
> +	/* Check if chan is free */
> +	if (semcr & CSEMCR_SEM_MUTEX)
> +		return false;
> +
> +	/* Check if chan fifo fits well */
> +	if (FIELD_GET(STM32_DMA3_DT_FIFO, conf->ch_conf) != chan->fifo_size)
> +		return false;
> +
> +	return true;
> +}
> +
> +static struct dma_chan *stm32_dma3_of_xlate(struct of_phandle_args *dma_spec, struct of_dma *ofdma)
> +{
> +	struct stm32_dma3_ddata *ddata = ofdma->of_dma_data;
> +	dma_cap_mask_t mask = ddata->dma_dev.cap_mask;
> +	struct stm32_dma3_dt_conf conf;
> +	struct stm32_dma3_chan *chan;
> +	struct dma_chan *c;
> +
> +	if (dma_spec->args_count < 3) {
> +		dev_err(ddata->dma_dev.dev, "Invalid args count\n");
> +		return NULL;
> +	}
> +
> +	conf.req_line = dma_spec->args[0];
> +	conf.ch_conf = dma_spec->args[1];
> +	conf.tr_conf = dma_spec->args[2];
> +
> +	if (conf.req_line >= ddata->dma_requests) {
> +		dev_err(ddata->dma_dev.dev, "Invalid request line\n");
> +		return NULL;
> +	}
> +
> +	/* Request dma channel among the generic dma controller list */
> +	c = dma_request_channel(mask, stm32_dma3_filter_fn, &conf);
> +	if (!c) {
> +		dev_err(ddata->dma_dev.dev, "No suitable channel found\n");
> +		return NULL;
> +	}
> +
> +	chan = to_stm32_dma3_chan(c);
> +	chan->dt_config = conf;
> +
> +	return c;
> +}
> +
> +static u32 stm32_dma3_check_rif(struct stm32_dma3_ddata *ddata)
> +{
> +	u32 chan_reserved, mask = 0, i, ccidcfgr, invalid_cid = 0;
> +
> +	/* Reserve Secure channels */
> +	chan_reserved = readl_relaxed(ddata->base + STM32_DMA3_SECCFGR);
> +
> +	/*
> +	 * CID filtering must be configured to ensure that the DMA3 channel will inherit the CID of
> +	 * the processor which is configuring and using the given channel.
> +	 * In case CID filtering is not configured, dma-channel-mask property can be used to
> +	 * specify available DMA channels to the kernel.
> +	 */
> +	of_property_read_u32(ddata->dma_dev.dev->of_node, "dma-channel-mask", &mask);
> +
> +	/* Reserve !CID-filtered not in dma-channel-mask, static CID != CID1, CID1 not allowed */
> +	for (i = 0; i < ddata->dma_channels; i++) {
> +		ccidcfgr = readl_relaxed(ddata->base + STM32_DMA3_CCIDCFGR(i));
> +
> +		if (!(ccidcfgr & CCIDCFGR_CFEN)) { /* !CID-filtered */
> +			invalid_cid |= BIT(i);
> +			if (!(mask & BIT(i))) /* Not in dma-channel-mask */
> +				chan_reserved |= BIT(i);
> +		} else { /* CID-filtered */
> +			if (!(ccidcfgr & CCIDCFGR_SEM_EN)) { /* Static CID mode */
> +				if (FIELD_GET(CCIDCFGR_SCID, ccidcfgr) != CCIDCFGR_CID1)
> +					chan_reserved |= BIT(i);
> +			} else { /* Semaphore mode */
> +				if (!FIELD_GET(CCIDCFGR_SEM_WLIST_CID1, ccidcfgr))
> +					chan_reserved |= BIT(i);
> +				ddata->chans[i].semaphore_mode = true;
> +			}
> +		}
> +		dev_dbg(ddata->dma_dev.dev, "chan%d: %s mode, %s\n", i,
> +			!(ccidcfgr & CCIDCFGR_CFEN) ? "!CID-filtered" :
> +			ddata->chans[i].semaphore_mode ? "Semaphore" : "Static CID",
> +			(chan_reserved & BIT(i)) ? "denied" :
> +			mask & BIT(i) ? "force allowed" : "allowed");
> +	}
> +
> +	if (invalid_cid)
> +		dev_warn(ddata->dma_dev.dev, "chan%*pbl have invalid CID configuration\n",
> +			 ddata->dma_channels, &invalid_cid);
> +
> +	return chan_reserved;
> +}
> +
> +static const struct of_device_id stm32_dma3_of_match[] = {
> +	{ .compatible = "st,stm32-dma3", },
> +	{ /* sentinel */},
> +};
> +MODULE_DEVICE_TABLE(of, stm32_dma3_of_match);
> +
> +static int stm32_dma3_probe(struct platform_device *pdev)
> +{
> +	struct device_node *np = pdev->dev.of_node;
> +	struct stm32_dma3_ddata *ddata;
> +	struct reset_control *reset;
> +	struct stm32_dma3_chan *chan;
> +	struct dma_device *dma_dev;
> +	u32 master_ports, chan_reserved, i, verr;
> +	u64 hwcfgr;
> +	int ret;
> +
> +	ddata = devm_kzalloc(&pdev->dev, sizeof(*ddata), GFP_KERNEL);
> +	if (!ddata)
> +		return -ENOMEM;
> +	platform_set_drvdata(pdev, ddata);
> +
> +	dma_dev = &ddata->dma_dev;
> +
> +	ddata->base = devm_platform_ioremap_resource(pdev, 0);
> +	if (IS_ERR(ddata->base))
> +		return PTR_ERR(ddata->base);
> +
> +	ddata->clk = devm_clk_get(&pdev->dev, NULL);
> +	if (IS_ERR(ddata->clk))
> +		return dev_err_probe(&pdev->dev, PTR_ERR(ddata->clk), "Failed to get clk\n");
> +
> +	reset = devm_reset_control_get_optional(&pdev->dev, NULL);
> +	if (IS_ERR(reset))
> +		return dev_err_probe(&pdev->dev, PTR_ERR(reset), "Failed to get reset\n");
> +
> +	ret = clk_prepare_enable(ddata->clk);
> +	if (ret)
> +		return dev_err_probe(&pdev->dev, ret, "Failed to enable clk\n");
> +
> +	reset_control_reset(reset);
> +
> +	INIT_LIST_HEAD(&dma_dev->channels);
> +
> +	dma_cap_set(DMA_SLAVE, dma_dev->cap_mask);
> +	dma_cap_set(DMA_PRIVATE, dma_dev->cap_mask);
> +	dma_dev->dev = &pdev->dev;
> +	/*
> +	 * This controller supports up to 8-byte buswidth depending on the port used and the
> +	 * channel, and can only access address at even boundaries, multiple of the buswidth.
> +	 */
> +	dma_dev->copy_align = DMAENGINE_ALIGN_8_BYTES;
> +	dma_dev->src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
> +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
> +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
> +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> +	dma_dev->dst_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
> +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
> +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
> +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> +	dma_dev->directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV) | BIT(DMA_MEM_TO_MEM);
> +
> +	dma_dev->descriptor_reuse = true;
> +	dma_dev->max_sg_burst = STM32_DMA3_MAX_SEG_SIZE;
> +	dma_dev->residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
> +	dma_dev->device_alloc_chan_resources = stm32_dma3_alloc_chan_resources;
> +	dma_dev->device_free_chan_resources = stm32_dma3_free_chan_resources;
> +	dma_dev->device_prep_slave_sg = stm32_dma3_prep_slave_sg;
> +	dma_dev->device_caps = stm32_dma3_caps;
> +	dma_dev->device_config = stm32_dma3_config;
> +	dma_dev->device_terminate_all = stm32_dma3_terminate_all;
> +	dma_dev->device_synchronize = stm32_dma3_synchronize;
> +	dma_dev->device_tx_status = dma_cookie_status;
> +	dma_dev->device_issue_pending = stm32_dma3_issue_pending;
> +
> +	/* if dma_channels is not modified, get it from hwcfgr1 */
> +	if (of_property_read_u32(np, "dma-channels", &ddata->dma_channels)) {
> +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
> +		ddata->dma_channels = FIELD_GET(G_NUM_CHANNELS, hwcfgr);
> +	}
> +
> +	/* if dma_requests is not modified, get it from hwcfgr2 */
> +	if (of_property_read_u32(np, "dma-requests", &ddata->dma_requests)) {
> +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR2);
> +		ddata->dma_requests = FIELD_GET(G_MAX_REQ_ID, hwcfgr) + 1;
> +	}
> +
> +	/* G_MASTER_PORTS, G_M0_DATA_WIDTH_ENC, G_M1_DATA_WIDTH_ENC in HWCFGR1 */
> +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
> +	master_ports = FIELD_GET(G_MASTER_PORTS, hwcfgr);
> +
> +	ddata->ports_max_dw[0] = FIELD_GET(G_M0_DATA_WIDTH_ENC, hwcfgr);
> +	if (master_ports == AXI64 || master_ports == AHB32) /* Single master port */
> +		ddata->ports_max_dw[1] = DW_INVALID;
> +	else /* Dual master ports */
> +		ddata->ports_max_dw[1] = FIELD_GET(G_M1_DATA_WIDTH_ENC, hwcfgr);
> +
> +	ddata->chans = devm_kcalloc(&pdev->dev, ddata->dma_channels, sizeof(*ddata->chans),
> +				    GFP_KERNEL);
> +	if (!ddata->chans) {
> +		ret = -ENOMEM;
> +		goto err_clk_disable;
> +	}
> +
> +	chan_reserved = stm32_dma3_check_rif(ddata);
> +
> +	if (chan_reserved == GENMASK(ddata->dma_channels - 1, 0)) {
> +		ret = -ENODEV;
> +		dev_err_probe(&pdev->dev, ret, "No channel available, abort registration\n");
> +		goto err_clk_disable;
> +	}
> +
> +	/* G_FIFO_SIZE x=0..7 in HWCFGR3 and G_FIFO_SIZE x=8..15 in HWCFGR4 */
> +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR3);
> +	hwcfgr |= ((u64)readl_relaxed(ddata->base + STM32_DMA3_HWCFGR4)) << 32;
> +
> +	for (i = 0; i < ddata->dma_channels; i++) {
> +		if (chan_reserved & BIT(i))
> +			continue;
> +
> +		chan = &ddata->chans[i];
> +		chan->id = i;
> +		chan->fifo_size = get_chan_hwcfg(i, G_FIFO_SIZE(i), hwcfgr);
> +		/* If chan->fifo_size > 0 then half of the fifo size, else no burst when no FIFO */
> +		chan->max_burst = (chan->fifo_size) ? (1 << (chan->fifo_size + 1)) / 2 : 0;
> +		chan->vchan.desc_free = stm32_dma3_chan_vdesc_free;
> +
> +		vchan_init(&chan->vchan, dma_dev);
> +	}
> +
> +	ret = dmaenginem_async_device_register(dma_dev);
> +	if (ret)
> +		goto err_clk_disable;
> +
> +	for (i = 0; i < ddata->dma_channels; i++) {
> +		if (chan_reserved & BIT(i))
> +			continue;
> +
> +		ret = platform_get_irq(pdev, i);
> +		if (ret < 0)
> +			goto err_clk_disable;
> +
> +		chan = &ddata->chans[i];
> +		chan->irq = ret;
> +
> +		ret = devm_request_irq(&pdev->dev, chan->irq, stm32_dma3_chan_irq, 0,
> +				       dev_name(chan2dev(chan)), chan);
> +		if (ret) {
> +			dev_err_probe(&pdev->dev, ret, "Failed to request channel %s IRQ\n",
> +				      dev_name(chan2dev(chan)));
> +			goto err_clk_disable;
> +		}
> +	}
> +
> +	ret = of_dma_controller_register(np, stm32_dma3_of_xlate, ddata);
> +	if (ret) {
> +		dev_err_probe(&pdev->dev, ret, "Failed to register controller\n");
> +		goto err_clk_disable;
> +	}
> +
> +	verr = readl_relaxed(ddata->base + STM32_DMA3_VERR);
> +
> +	pm_runtime_set_active(&pdev->dev);
> +	pm_runtime_enable(&pdev->dev);
> +	pm_runtime_get_noresume(&pdev->dev);
> +	pm_runtime_put(&pdev->dev);
> +
> +	dev_info(&pdev->dev, "STM32 DMA3 registered rev:%lu.%lu\n",
> +		 FIELD_GET(VERR_MAJREV, verr), FIELD_GET(VERR_MINREV, verr));
> +
> +	return 0;
> +
> +err_clk_disable:
> +	clk_disable_unprepare(ddata->clk);
> +
> +	return ret;
> +}
> +
> +static void stm32_dma3_remove(struct platform_device *pdev)
> +{
> +	pm_runtime_disable(&pdev->dev);
> +}
> +
> +static int stm32_dma3_runtime_suspend(struct device *dev)
> +{
> +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
> +
> +	clk_disable_unprepare(ddata->clk);
> +
> +	return 0;
> +}
> +
> +static int stm32_dma3_runtime_resume(struct device *dev)
> +{
> +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
> +	int ret;
> +
> +	ret = clk_prepare_enable(ddata->clk);
> +	if (ret)
> +		dev_err(dev, "Failed to enable clk: %d\n", ret);
> +
> +	return ret;
> +}
> +
> +static const struct dev_pm_ops stm32_dma3_pm_ops = {
> +	SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
> +	RUNTIME_PM_OPS(stm32_dma3_runtime_suspend, stm32_dma3_runtime_resume, NULL)
> +};
> +
> +static struct platform_driver stm32_dma3_driver = {
> +	.probe = stm32_dma3_probe,
> +	.remove_new = stm32_dma3_remove,
> +	.driver = {
> +		.name = "stm32-dma3",
> +		.of_match_table = stm32_dma3_of_match,
> +		.pm = pm_ptr(&stm32_dma3_pm_ops),
> +	},
> +};
> +
> +static int __init stm32_dma3_init(void)
> +{
> +	return platform_driver_register(&stm32_dma3_driver);
> +}
> +
> +subsys_initcall(stm32_dma3_init);
> +
> +MODULE_DESCRIPTION("STM32 DMA3 controller driver");
> +MODULE_AUTHOR("Amelie Delaunay <amelie.delaunay@foss.st.com>");
> +MODULE_LICENSE("GPL");
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 05/12] dmaengine: Add STM32 DMA3 support
  2024-05-15 18:45           ` Frank Li
@ 2024-05-16  9:42             ` Amelie Delaunay
  0 siblings, 0 replies; 29+ messages in thread
From: Amelie Delaunay @ 2024-05-16  9:42 UTC (permalink / raw)
  To: Frank Li
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue, dmaengine, devicetree,
	linux-stm32, linux-arm-kernel, linux-kernel, linux-hardening



On 5/15/24 20:45, Frank Li wrote:
> On Mon, May 13, 2024 at 11:21:18AM +0200, Amelie Delaunay wrote:
>> Hi Frank,
>>
>> On 5/7/24 22:26, Frank Li wrote:
>>> On Tue, May 07, 2024 at 01:33:31PM +0200, Amelie Delaunay wrote:
>>>> Hi Vinod,
>>>>
>>>> Thanks for the review.
>>>>
>>>> On 5/4/24 14:40, Vinod Koul wrote:
>>>>> On 23-04-24, 14:32, Amelie Delaunay wrote:
>>>>>> STM32 DMA3 driver supports the 3 hardware configurations of the STM32 DMA3
>>>>>> controller:
>>>>>> - LPDMA (Low Power): 4 channels, no FIFO
>>>>>> - GPDMA (General Purpose): 16 channels, FIFO from 8 to 32 bytes
>>>>>> - HPDMA (High Performance): 16 channels, FIFO from 8 to 256 bytes
>>>>>> Hardware configuration of the channels is retrieved from the hardware
>>>>>> configuration registers.
>>>>>> The client can specify its channel requirements through device tree.
>>>>>> STM32 DMA3 channels can be individually reserved either because they are
>>>>>> secure, or dedicated to another CPU.
>>>>>> Indeed, channels availability depends on Resource Isolation Framework
>>>>>> (RIF) configuration. RIF grants access to buses with Compartiment ID
>>>>>
>>>>> Compartiment? typo...?
>>>>>
>>>>
>>>> Sorry, indeed, Compartment instead.
>>>>
>>>>>> (CIF) filtering, secure and privilege level. It also assigns DMA channels
>>>>>> to one or several processors.
>>>>>> DMA channels used by Linux should be CID-filtered and statically assigned
>>>>>> to CID1 or shared with other CPUs but using semaphore. In case CID
>>>>>> filtering is not configured, dma-channel-mask property can be used to
>>>>>> specify available DMA channels to the kernel, otherwise such channels
>>>>>> will be marked as reserved and can't be used by Linux.
>>>>>>
>>>>>> Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
>>>>>> ---
>>>>>>     drivers/dma/stm32/Kconfig      |   10 +
>>>>>>     drivers/dma/stm32/Makefile     |    1 +
>>>>>>     drivers/dma/stm32/stm32-dma3.c | 1431 ++++++++++++++++++++++++++++++++
>>>>>>     3 files changed, 1442 insertions(+)
>>>>>>     create mode 100644 drivers/dma/stm32/stm32-dma3.c
>>>>>>
>>>>>> diff --git a/drivers/dma/stm32/Kconfig b/drivers/dma/stm32/Kconfig
>>>>>> index b72ae1a4502f..4d8d8063133b 100644
>>>>>> --- a/drivers/dma/stm32/Kconfig
>>>>>> +++ b/drivers/dma/stm32/Kconfig
>>>>>> @@ -34,4 +34,14 @@ config STM32_MDMA
>>>>>>     	  If you have a board based on STM32 SoC with such DMA controller
>>>>>>     	  and want to use MDMA say Y here.
>>>>>> +config STM32_DMA3
>>>>>> +	tristate "STMicroelectronics STM32 DMA3 support"
>>>>>> +	select DMA_ENGINE
>>>>>> +	select DMA_VIRTUAL_CHANNELS
>>>>>> +	help
>>>>>> +	  Enable support for the on-chip DMA3 controller on STMicroelectronics
>>>>>> +	  STM32 platforms.
>>>>>> +	  If you have a board based on STM32 SoC with such DMA3 controller
>>>>>> +	  and want to use DMA3, say Y here.
>>>>>> +
>>>>>>     endif
>>>>>> diff --git a/drivers/dma/stm32/Makefile b/drivers/dma/stm32/Makefile
>>>>>> index 663a3896a881..5082db4b4c1c 100644
>>>>>> --- a/drivers/dma/stm32/Makefile
>>>>>> +++ b/drivers/dma/stm32/Makefile
>>>>>> @@ -2,3 +2,4 @@
>>>>>>     obj-$(CONFIG_STM32_DMA) += stm32-dma.o
>>>>>>     obj-$(CONFIG_STM32_DMAMUX) += stm32-dmamux.o
>>>>>>     obj-$(CONFIG_STM32_MDMA) += stm32-mdma.o
>>>>>> +obj-$(CONFIG_STM32_DMA3) += stm32-dma3.o
>>>>>
>>>>> are there any similarities in mdma/dma and dma3..?
>>>>> can anything be reused...?
>>>>>
>>>>
>>>> DMA/MDMA were originally intended for STM32 MCUs and have been used in
>>>> STM32MP1 MPUs.
>>>> New MPUs (STM32MP2, ...) and STM32 MCUs (STM32H5, STM32N6, ...) use DMA3.
>>>> Unlike DMA/MDMA, DMA3 comes in multiple configurations, LPDMA,
>>>> GPDMA, HPDMA, and within these global configurations there are possible
>>>> sub-configurations (e.g. channel FIFO size). stm32-dma3 uses the hardware
>>>> configuration registers to discover the controller/channel capabilities.
>>>> Reusing stm32-dma or stm32-mdma would complicate the driver and make
>>>> future stm32-dma3 evolutions for the next STM32 MPUs intricate and very
>>>> difficult.
>>>
>>> I think your reasons are still not enough to justify creating a new driver
>>> instead of trying to reuse an existing one.
>>>
>>> Are the register layout and the DMA descriptor totally different?
>>>
>>> If the DMA descriptor format is the same, you can at least reuse the DMA
>>> descriptor preparation part.
>>>
>>> Channel selection is an independent part of a DMA channel. You can create a
>>> separate one per DMA engine.
>>>
>>> Frank
>>>
>>
>> stm32-dma is not considered for reuse: its register layout is completely
>> different and this DMA controller doesn't rely on a descriptor mechanism.
>>
>> stm32-mdma is based on a descriptor mechanism but, even there, there are
>> significant differences in register layout and descriptor structure.
>> As you can see:
> 
> Can you add such a description to the commit message?
> 
> Frank
> 

OK, I will enhance the commit message in the new patchset version (v3).

Amelie

>> /* Descriptor from stm32-mdma */
>> struct stm32_mdma_hwdesc {
>> 	u32 ctcr;
>> 	u32 cbndtr;
>> 	u32 csar;
>> 	u32 cdar;
>> 	u32 cbrur;
>> 	u32 clar;
>> 	u32 ctbr;
>> 	u32 dummy;
>> 	u32 cmar;
>> 	u32 cmdr;
>> } __aligned(64);
>>
>> /* Descriptor from stm32-dma3 */
>> struct stm32_dma3_hwdesc {
>> 	u32 ctr1;
>> 	u32 ctr2;
>> 	u32 cbr1;
>> 	u32 csar;
>> 	u32 cdar;
>> 	u32 cllr;
>> } __aligned(32);
>>
>> Moreover, stm32-dma3 can have static or dynamic linked-list items. Dynamic
>> data structure support is not yet in this patchset; the current implementation
>> is undergoing validation and maturation.
>> "cllr" configures the data structure of the next linked-list item in
>> addition to its address pointer. The descriptor can be "compacted" depending
>> on the values of the cllr update bits.
>>
>> /* CxLLR DMA channel x linked-list address register */
>> #define CLLR_LA				GENMASK(15, 2) /* Address */
>> #define CLLR_ULL			BIT(16) /* CxLLR update ? */
>> #define CLLR_UDA			BIT(27) /* CxDAR update ? */
>> #define CLLR_USA			BIT(28) /* CxSAR update ? */
>> #define CLLR_UB1			BIT(29) /* CxBR1 update ? */
>> #define CLLR_UT2			BIT(30) /* CxTR2 update ? */
>> #define CLLR_UT1			BIT(31) /* CxTR1 update ? */
>>
>> If one or more CLLR_Uxx bits are not set, the corresponding u32 values are
>> not present in the descriptor. For example, if the CLLR_ULL bit is the only
>> one set, then the "cllr" value is at offset 0 in the linked-list data
>> structure.
>>
>> I hope this gives some insight into why I've decided not to reuse the
>> existing drivers, either in whole or in part.
>>
>> Amelie
>>
>>>>
>>>>>> diff --git a/drivers/dma/stm32/stm32-dma3.c b/drivers/dma/stm32/stm32-dma3.c
>>>>>> new file mode 100644
>>>>>> index 000000000000..b5493f497d06
>>>>>> --- /dev/null
>>>>>> +++ b/drivers/dma/stm32/stm32-dma3.c
>>>>>> @@ -0,0 +1,1431 @@
>>>>>> +// SPDX-License-Identifier: GPL-2.0-only
>>>>>> +/*
>>>>>> + * STM32 DMA3 controller driver
>>>>>> + *
>>>>>> + * Copyright (C) STMicroelectronics 2024
>>>>>> + * Author(s): Amelie Delaunay <amelie.delaunay@foss.st.com>
>>>>>> + */
>>>>>> +
>>>>>> +#include <linux/bitfield.h>
>>>>>> +#include <linux/clk.h>
>>>>>> +#include <linux/dma-mapping.h>
>>>>>> +#include <linux/dmaengine.h>
>>>>>> +#include <linux/dmapool.h>
>>>>>> +#include <linux/init.h>
>>>>>> +#include <linux/iopoll.h>
>>>>>> +#include <linux/list.h>
>>>>>> +#include <linux/module.h>
>>>>>> +#include <linux/of_dma.h>
>>>>>> +#include <linux/platform_device.h>
>>>>>> +#include <linux/pm_runtime.h>
>>>>>> +#include <linux/reset.h>
>>>>>> +#include <linux/slab.h>
>>>>>> +
>>>>>> +#include "../virt-dma.h"
>>>>>> +
>>>>>> +#define STM32_DMA3_SECCFGR		0x00
>>>>>> +#define STM32_DMA3_PRIVCFGR		0x04
>>>>>> +#define STM32_DMA3_RCFGLOCKR		0x08
>>>>>> +#define STM32_DMA3_MISR			0x0C
>>>>>
>>>>> lower hex please
>>>>>
>>>>
>>>> Ok.
>>>>
>>>>>> +#define STM32_DMA3_SMISR		0x10
>>>>>> +
>>>>>> +#define STM32_DMA3_CLBAR(x)		(0x50 + 0x80 * (x))
>>>>>> +#define STM32_DMA3_CCIDCFGR(x)		(0x54 + 0x80 * (x))
>>>>>> +#define STM32_DMA3_CSEMCR(x)		(0x58 + 0x80 * (x))
>>>>>> +#define STM32_DMA3_CFCR(x)		(0x5C + 0x80 * (x))
>>>>>> +#define STM32_DMA3_CSR(x)		(0x60 + 0x80 * (x))
>>>>>> +#define STM32_DMA3_CCR(x)		(0x64 + 0x80 * (x))
>>>>>> +#define STM32_DMA3_CTR1(x)		(0x90 + 0x80 * (x))
>>>>>> +#define STM32_DMA3_CTR2(x)		(0x94 + 0x80 * (x))
>>>>>> +#define STM32_DMA3_CBR1(x)		(0x98 + 0x80 * (x))
>>>>>> +#define STM32_DMA3_CSAR(x)		(0x9C + 0x80 * (x))
>>>>>> +#define STM32_DMA3_CDAR(x)		(0xA0 + 0x80 * (x))
>>>>>> +#define STM32_DMA3_CLLR(x)		(0xCC + 0x80 * (x))
>>>>>> +
>>>>>> +#define STM32_DMA3_HWCFGR13		0xFC0 /* G_PER_CTRL(X) x=8..15 */
>>>>>> +#define STM32_DMA3_HWCFGR12		0xFC4 /* G_PER_CTRL(X) x=0..7 */
>>>>>> +#define STM32_DMA3_HWCFGR4		0xFE4 /* G_FIFO_SIZE(X) x=8..15 */
>>>>>> +#define STM32_DMA3_HWCFGR3		0xFE8 /* G_FIFO_SIZE(X) x=0..7 */
>>>>>> +#define STM32_DMA3_HWCFGR2		0xFEC /* G_MAX_REQ_ID */
>>>>>> +#define STM32_DMA3_HWCFGR1		0xFF0 /* G_MASTER_PORTS, G_NUM_CHANNELS, G_Mx_DATA_WIDTH */
>>>>>> +#define STM32_DMA3_VERR			0xFF4
>>>>>
>>>>> here as well
>>>>>
>>>>
>>>> Ok.
>>>>
>>>>>> +
>>>>>> +/* SECCFGR DMA secure configuration register */
>>>>>> +#define SECCFGR_SEC(x)			BIT(x)
>>>>>> +
>>>>>> +/* MISR DMA non-secure/secure masked interrupt status register */
>>>>>> +#define MISR_MIS(x)			BIT(x)
>>>>>> +
>>>>>> +/* CxLBAR DMA channel x linked_list base address register */
>>>>>> +#define CLBAR_LBA			GENMASK(31, 16)
>>>>>> +
>>>>>> +/* CxCIDCFGR DMA channel x CID register */
>>>>>> +#define CCIDCFGR_CFEN			BIT(0)
>>>>>> +#define CCIDCFGR_SEM_EN			BIT(1)
>>>>>> +#define CCIDCFGR_SCID			GENMASK(5, 4)
>>>>>> +#define CCIDCFGR_SEM_WLIST_CID0		BIT(16)
>>>>>> +#define CCIDCFGR_SEM_WLIST_CID1		BIT(17)
>>>>>> +#define CCIDCFGR_SEM_WLIST_CID2		BIT(18)
>>>>>> +
>>>>>> +enum ccidcfgr_cid {
>>>>>> +	CCIDCFGR_CID0,
>>>>>> +	CCIDCFGR_CID1,
>>>>>> +	CCIDCFGR_CID2,
>>>>>> +};
>>>>>> +
>>>>>> +/* CxSEMCR DMA channel x semaphore control register */
>>>>>> +#define CSEMCR_SEM_MUTEX		BIT(0)
>>>>>> +#define CSEMCR_SEM_CCID			GENMASK(5, 4)
>>>>>> +
>>>>>> +/* CxFCR DMA channel x flag clear register */
>>>>>> +#define CFCR_TCF			BIT(8)
>>>>>> +#define CFCR_HTF			BIT(9)
>>>>>> +#define CFCR_DTEF			BIT(10)
>>>>>> +#define CFCR_ULEF			BIT(11)
>>>>>> +#define CFCR_USEF			BIT(12)
>>>>>> +#define CFCR_SUSPF			BIT(13)
>>>>>> +
>>>>>> +/* CxSR DMA channel x status register */
>>>>>> +#define CSR_IDLEF			BIT(0)
>>>>>> +#define CSR_TCF				BIT(8)
>>>>>> +#define CSR_HTF				BIT(9)
>>>>>> +#define CSR_DTEF			BIT(10)
>>>>>> +#define CSR_ULEF			BIT(11)
>>>>>> +#define CSR_USEF			BIT(12)
>>>>>> +#define CSR_SUSPF			BIT(13)
>>>>>> +#define CSR_ALL_F			GENMASK(13, 8)
>>>>>> +#define CSR_FIFOL			GENMASK(24, 16)
>>>>>> +
>>>>>> +/* CxCR DMA channel x control register */
>>>>>> +#define CCR_EN				BIT(0)
>>>>>> +#define CCR_RESET			BIT(1)
>>>>>> +#define CCR_SUSP			BIT(2)
>>>>>> +#define CCR_TCIE			BIT(8)
>>>>>> +#define CCR_HTIE			BIT(9)
>>>>>> +#define CCR_DTEIE			BIT(10)
>>>>>> +#define CCR_ULEIE			BIT(11)
>>>>>> +#define CCR_USEIE			BIT(12)
>>>>>> +#define CCR_SUSPIE			BIT(13)
>>>>>> +#define CCR_ALLIE			GENMASK(13, 8)
>>>>>> +#define CCR_LSM				BIT(16)
>>>>>> +#define CCR_LAP				BIT(17)
>>>>>> +#define CCR_PRIO			GENMASK(23, 22)
>>>>>> +
>>>>>> +enum ccr_prio {
>>>>>> +	CCR_PRIO_LOW,
>>>>>> +	CCR_PRIO_MID,
>>>>>> +	CCR_PRIO_HIGH,
>>>>>> +	CCR_PRIO_VERY_HIGH,
>>>>>> +};
>>>>>> +
>>>>>> +/* CxTR1 DMA channel x transfer register 1 */
>>>>>> +#define CTR1_SINC			BIT(3)
>>>>>> +#define CTR1_SBL_1			GENMASK(9, 4)
>>>>>> +#define CTR1_DINC			BIT(19)
>>>>>> +#define CTR1_DBL_1			GENMASK(25, 20)
>>>>>> +#define CTR1_SDW_LOG2			GENMASK(1, 0)
>>>>>> +#define CTR1_PAM			GENMASK(12, 11)
>>>>>> +#define CTR1_SAP			BIT(14)
>>>>>> +#define CTR1_DDW_LOG2			GENMASK(17, 16)
>>>>>> +#define CTR1_DAP			BIT(30)
>>>>>> +
>>>>>> +enum ctr1_dw {
>>>>>> +	CTR1_DW_BYTE,
>>>>>> +	CTR1_DW_HWORD,
>>>>>> +	CTR1_DW_WORD,
>>>>>> +	CTR1_DW_DWORD, /* Depends on HWCFGR1.G_M0_DATA_WIDTH_ENC and .G_M1_DATA_WIDTH_ENC */
>>>>>> +};
>>>>>> +
>>>>>> +enum ctr1_pam {
>>>>>> +	CTR1_PAM_0S_LT, /* if DDW > SDW, padded with 0s else left-truncated */
>>>>>> +	CTR1_PAM_SE_RT, /* if DDW > SDW, sign extended else right-truncated */
>>>>>> +	CTR1_PAM_PACK_UNPACK, /* FIFO queued */
>>>>>> +};
>>>>>> +
>>>>>> +/* CxTR2 DMA channel x transfer register 2 */
>>>>>> +#define CTR2_REQSEL			GENMASK(7, 0)
>>>>>> +#define CTR2_SWREQ			BIT(9)
>>>>>> +#define CTR2_DREQ			BIT(10)
>>>>>> +#define CTR2_BREQ			BIT(11)
>>>>>> +#define CTR2_PFREQ			BIT(12)
>>>>>> +#define CTR2_TCEM			GENMASK(31, 30)
>>>>>> +
>>>>>> +enum ctr2_tcem {
>>>>>> +	CTR2_TCEM_BLOCK,
>>>>>> +	CTR2_TCEM_REPEAT_BLOCK,
>>>>>> +	CTR2_TCEM_LLI,
>>>>>> +	CTR2_TCEM_CHANNEL,
>>>>>> +};
>>>>>> +
>>>>>> +/* CxBR1 DMA channel x block register 1 */
>>>>>> +#define CBR1_BNDT			GENMASK(15, 0)
>>>>>> +
>>>>>> +/* CxLLR DMA channel x linked-list address register */
>>>>>> +#define CLLR_LA				GENMASK(15, 2)
>>>>>> +#define CLLR_ULL			BIT(16)
>>>>>> +#define CLLR_UDA			BIT(27)
>>>>>> +#define CLLR_USA			BIT(28)
>>>>>> +#define CLLR_UB1			BIT(29)
>>>>>> +#define CLLR_UT2			BIT(30)
>>>>>> +#define CLLR_UT1			BIT(31)
>>>>>> +
>>>>>> +/* HWCFGR13 DMA hardware configuration register 13 x=8..15 */
>>>>>> +/* HWCFGR12 DMA hardware configuration register 12 x=0..7 */
>>>>>> +#define G_PER_CTRL(x)			(ULL(0x1) << (4 * (x)))
>>>>>> +
>>>>>> +/* HWCFGR4 DMA hardware configuration register 4 x=8..15 */
>>>>>> +/* HWCFGR3 DMA hardware configuration register 3 x=0..7 */
>>>>>> +#define G_FIFO_SIZE(x)			(ULL(0x7) << (4 * (x)))
>>>>>> +
>>>>>> +#define get_chan_hwcfg(x, mask, reg)	(((reg) & (mask)) >> (4 * (x)))
>>>>>> +
>>>>>> +/* HWCFGR2 DMA hardware configuration register 2 */
>>>>>> +#define G_MAX_REQ_ID			GENMASK(7, 0)
>>>>>> +
>>>>>> +/* HWCFGR1 DMA hardware configuration register 1 */
>>>>>> +#define G_MASTER_PORTS			GENMASK(2, 0)
>>>>>> +#define G_NUM_CHANNELS			GENMASK(12, 8)
>>>>>> +#define G_M0_DATA_WIDTH_ENC		GENMASK(25, 24)
>>>>>> +#define G_M1_DATA_WIDTH_ENC		GENMASK(29, 28)
>>>>>> +
>>>>>> +enum stm32_dma3_master_ports {
>>>>>> +	AXI64, /* 1x AXI: 64-bit port 0 */
>>>>>> +	AHB32, /* 1x AHB: 32-bit port 0 */
>>>>>> +	AHB32_AHB32, /* 2x AHB: 32-bit port 0 and 32-bit port 1 */
>>>>>> +	AXI64_AHB32, /* 1x AXI 64-bit port 0 and 1x AHB 32-bit port 1 */
>>>>>> +	AXI64_AXI64, /* 2x AXI: 64-bit port 0 and 64-bit port 1 */
>>>>>> +	AXI128_AHB32, /* 1x AXI 128-bit port 0 and 1x AHB 32-bit port 1 */
>>>>>> +};
>>>>>> +
>>>>>> +enum stm32_dma3_port_data_width {
>>>>>> +	DW_32, /* 32-bit, for AHB */
>>>>>> +	DW_64, /* 64-bit, for AXI */
>>>>>> +	DW_128, /* 128-bit, for AXI */
>>>>>> +	DW_INVALID,
>>>>>> +};
>>>>>> +
>>>>>> +/* VERR DMA version register */
>>>>>> +#define VERR_MINREV			GENMASK(3, 0)
>>>>>> +#define VERR_MAJREV			GENMASK(7, 4)
>>>>>> +
>>>>>> +/* Device tree */
>>>>>> +/* struct stm32_dma3_dt_conf */
>>>>>> +/* .ch_conf */
>>>>>> +#define STM32_DMA3_DT_PRIO		GENMASK(1, 0) /* CCR_PRIO */
>>>>>> +#define STM32_DMA3_DT_FIFO		GENMASK(7, 4)
>>>>>> +/* .tr_conf */
>>>>>> +#define STM32_DMA3_DT_SINC		BIT(0) /* CTR1_SINC */
>>>>>> +#define STM32_DMA3_DT_SAP		BIT(1) /* CTR1_SAP */
>>>>>> +#define STM32_DMA3_DT_DINC		BIT(4) /* CTR1_DINC */
>>>>>> +#define STM32_DMA3_DT_DAP		BIT(5) /* CTR1_DAP */
>>>>>> +#define STM32_DMA3_DT_BREQ		BIT(8) /* CTR2_BREQ */
>>>>>> +#define STM32_DMA3_DT_PFREQ		BIT(9) /* CTR2_PFREQ */
>>>>>> +#define STM32_DMA3_DT_TCEM		GENMASK(13, 12) /* CTR2_TCEM */
>>>>>> +
>>>>>> +#define STM32_DMA3_MAX_BLOCK_SIZE	ALIGN_DOWN(CBR1_BNDT, 64)
>>>>>> +#define port_is_ahb(maxdw)		({ typeof(maxdw) (_maxdw) = (maxdw); \
>>>>>> +					   ((_maxdw) != DW_INVALID) && ((_maxdw) == DW_32); })
>>>>>> +#define port_is_axi(maxdw)		({ typeof(maxdw) (_maxdw) = (maxdw); \
>>>>>> +					   ((_maxdw) != DW_INVALID) && ((_maxdw) != DW_32); })
>>>>>> +#define get_chan_max_dw(maxdw, maxburst)((port_is_ahb(maxdw) ||			     \
>>>>>> +					  (maxburst) < DMA_SLAVE_BUSWIDTH_8_BYTES) ? \
>>>>>> +					 DMA_SLAVE_BUSWIDTH_4_BYTES : DMA_SLAVE_BUSWIDTH_8_BYTES)
>>>>>> +
>>>>>> +/* Static linked-list data structure (depends on update bits UT1/UT2/UB1/USA/UDA/ULL) */
>>>>>> +struct stm32_dma3_hwdesc {
>>>>>> +	u32 ctr1;
>>>>>> +	u32 ctr2;
>>>>>> +	u32 cbr1;
>>>>>> +	u32 csar;
>>>>>> +	u32 cdar;
>>>>>> +	u32 cllr;
>>>>>> +} __aligned(32);
>>>>>> +
>>>>>> +/*
>>>>>> + * CLLR_LA / sizeof(struct stm32_dma3_hwdesc) represents the number of hdwdesc that can be addressed
>>>>>> + * by the pointer to the next linked-list data structure. The __aligned forces the 32-byte
>>>>>> + * alignment. So use hardcoded 32. Multiplied by the max block size of each item, it represents
>>>>>> + * the sg size limitation.
>>>>>> + */
>>>>>> +#define STM32_DMA3_MAX_SEG_SIZE		((CLLR_LA / 32) * STM32_DMA3_MAX_BLOCK_SIZE)
>>>>>> +
>>>>>> +/*
>>>>>> + * Linked-list items
>>>>>> + */
>>>>>> +struct stm32_dma3_lli {
>>>>>> +	struct stm32_dma3_hwdesc *hwdesc;
>>>>>> +	dma_addr_t hwdesc_addr;
>>>>>> +};
>>>>>> +
>>>>>> +struct stm32_dma3_swdesc {
>>>>>> +	struct virt_dma_desc vdesc;
>>>>>> +	u32 ccr;
>>>>>> +	bool cyclic;
>>>>>> +	u32 lli_size;
>>>>>> +	struct stm32_dma3_lli lli[] __counted_by(lli_size);
>>>>>> +};
>>>>>> +
>>>>>> +struct stm32_dma3_dt_conf {
>>>>>> +	u32 ch_id;
>>>>>> +	u32 req_line;
>>>>>> +	u32 ch_conf;
>>>>>> +	u32 tr_conf;
>>>>>> +};
>>>>>> +
>>>>>> +struct stm32_dma3_chan {
>>>>>> +	struct virt_dma_chan vchan;
>>>>>> +	u32 id;
>>>>>> +	int irq;
>>>>>> +	u32 fifo_size;
>>>>>> +	u32 max_burst;
>>>>>> +	bool semaphore_mode;
>>>>>> +	struct stm32_dma3_dt_conf dt_config;
>>>>>> +	struct dma_slave_config dma_config;
>>>>>> +	struct dma_pool *lli_pool;
>>>>>> +	struct stm32_dma3_swdesc *swdesc;
>>>>>> +	enum ctr2_tcem tcem;
>>>>>> +	u32 dma_status;
>>>>>> +};
>>>>>> +
>>>>>> +struct stm32_dma3_ddata {
>>>>>> +	struct dma_device dma_dev;
>>>>>> +	void __iomem *base;
>>>>>> +	struct clk *clk;
>>>>>> +	struct stm32_dma3_chan *chans;
>>>>>> +	u32 dma_channels;
>>>>>> +	u32 dma_requests;
>>>>>> +	enum stm32_dma3_port_data_width ports_max_dw[2];
>>>>>> +};
>>>>>> +
>>>>>> +static inline struct stm32_dma3_ddata *to_stm32_dma3_ddata(struct stm32_dma3_chan *chan)
>>>>>> +{
>>>>>> +	return container_of(chan->vchan.chan.device, struct stm32_dma3_ddata, dma_dev);
>>>>>> +}
>>>>>> +
>>>>>> +static inline struct stm32_dma3_chan *to_stm32_dma3_chan(struct dma_chan *c)
>>>>>> +{
>>>>>> +	return container_of(c, struct stm32_dma3_chan, vchan.chan);
>>>>>> +}
>>>>>> +
>>>>>> +static inline struct stm32_dma3_swdesc *to_stm32_dma3_swdesc(struct virt_dma_desc *vdesc)
>>>>>> +{
>>>>>> +	return container_of(vdesc, struct stm32_dma3_swdesc, vdesc);
>>>>>> +}
>>>>>> +
>>>>>> +static struct device *chan2dev(struct stm32_dma3_chan *chan)
>>>>>> +{
>>>>>> +	return &chan->vchan.chan.dev->device;
>>>>>> +}
>>>>>> +
>>>>>> +static void stm32_dma3_chan_dump_reg(struct stm32_dma3_chan *chan)
>>>>>> +{
>>>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>>>> +	struct device *dev = chan2dev(chan);
>>>>>> +	u32 id = chan->id, offset;
>>>>>> +
>>>>>> +	offset = STM32_DMA3_SECCFGR;
>>>>>> +	dev_dbg(dev, "SECCFGR(0x%03x): %08x\n", offset, readl_relaxed(ddata->base + offset));
>>>>>> +	offset = STM32_DMA3_PRIVCFGR;
>>>>>> +	dev_dbg(dev, "PRIVCFGR(0x%03x): %08x\n", offset, readl_relaxed(ddata->base + offset));
>>>>>> +	offset = STM32_DMA3_CCIDCFGR(id);
>>>>>> +	dev_dbg(dev, "C%dCIDCFGR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>>>> +	offset = STM32_DMA3_CSEMCR(id);
>>>>>> +	dev_dbg(dev, "C%dSEMCR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>>>> +	offset = STM32_DMA3_CSR(id);
>>>>>> +	dev_dbg(dev, "C%dSR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>>>> +	offset = STM32_DMA3_CCR(id);
>>>>>> +	dev_dbg(dev, "C%dCR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>>>> +	offset = STM32_DMA3_CTR1(id);
>>>>>> +	dev_dbg(dev, "C%dTR1(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>>>> +	offset = STM32_DMA3_CTR2(id);
>>>>>> +	dev_dbg(dev, "C%dTR2(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>>>> +	offset = STM32_DMA3_CBR1(id);
>>>>>> +	dev_dbg(dev, "C%dBR1(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>>>> +	offset = STM32_DMA3_CSAR(id);
>>>>>> +	dev_dbg(dev, "C%dSAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>>>> +	offset = STM32_DMA3_CDAR(id);
>>>>>> +	dev_dbg(dev, "C%dDAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>>>> +	offset = STM32_DMA3_CLLR(id);
>>>>>> +	dev_dbg(dev, "C%dLLR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>>>> +	offset = STM32_DMA3_CLBAR(id);
>>>>>> +	dev_dbg(dev, "C%dLBAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>>>>>> +}
>>>>>> +
>>>>>> +static void stm32_dma3_chan_dump_hwdesc(struct stm32_dma3_chan *chan,
>>>>>> +					struct stm32_dma3_swdesc *swdesc)
>>>>>> +{
>>>>>> +	struct stm32_dma3_hwdesc *hwdesc;
>>>>>> +	int i;
>>>>>> +
>>>>>> +	for (i = 0; i < swdesc->lli_size; i++) {
>>>>>> +		hwdesc = swdesc->lli[i].hwdesc;
>>>>>> +		if (i)
>>>>>> +			dev_dbg(chan2dev(chan), "V\n");
>>>>>> +		dev_dbg(chan2dev(chan), "[%d]@%pad\n", i, &swdesc->lli[i].hwdesc_addr);
>>>>>> +		dev_dbg(chan2dev(chan), "| C%dTR1: %08x\n", chan->id, hwdesc->ctr1);
>>>>>> +		dev_dbg(chan2dev(chan), "| C%dTR2: %08x\n", chan->id, hwdesc->ctr2);
>>>>>> +		dev_dbg(chan2dev(chan), "| C%dBR1: %08x\n", chan->id, hwdesc->cbr1);
>>>>>> +		dev_dbg(chan2dev(chan), "| C%dSAR: %08x\n", chan->id, hwdesc->csar);
>>>>>> +		dev_dbg(chan2dev(chan), "| C%dDAR: %08x\n", chan->id, hwdesc->cdar);
>>>>>> +		dev_dbg(chan2dev(chan), "| C%dLLR: %08x\n", chan->id, hwdesc->cllr);
>>>>>> +	}
>>>>>> +
>>>>>> +	if (swdesc->cyclic) {
>>>>>> +		dev_dbg(chan2dev(chan), "|\n");
>>>>>> +		dev_dbg(chan2dev(chan), "-->[0]@%pad\n", &swdesc->lli[0].hwdesc_addr);
>>>>>> +	} else {
>>>>>> +		dev_dbg(chan2dev(chan), "X\n");
>>>>>> +	}
>>>>>> +}
>>>>>> +
>>>>>> +static struct stm32_dma3_swdesc *stm32_dma3_chan_desc_alloc(struct stm32_dma3_chan *chan, u32 count)
>>>>>> +{
>>>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>>>> +	struct stm32_dma3_swdesc *swdesc;
>>>>>> +	int i;
>>>>>> +
>>>>>> +	/*
>>>>>> +	 * If the memory to be allocated for the number of hwdesc (6 u32 members but 32-bytes
>>>>>> +	 * aligned) is greater than the maximum address of CLLR_LA, then the last items can't be
>>>>>> +	 * addressed, so abort the allocation.
>>>>>> +	 */
>>>>>> +	if ((count * 32) > CLLR_LA) {
>>>>>> +		dev_err(chan2dev(chan), "Transfer is too big (> %luB)\n", STM32_DMA3_MAX_SEG_SIZE);
>>>>>> +		return NULL;
>>>>>> +	}
>>>>>> +
>>>>>> +	swdesc = kzalloc(struct_size(swdesc, lli, count), GFP_NOWAIT);
>>>>>> +	if (!swdesc)
>>>>>> +		return NULL;
>>>>>> +
>>>>>> +	for (i = 0; i < count; i++) {
>>>>>> +		swdesc->lli[i].hwdesc = dma_pool_zalloc(chan->lli_pool, GFP_NOWAIT,
>>>>>> +							&swdesc->lli[i].hwdesc_addr);
>>>>>> +		if (!swdesc->lli[i].hwdesc)
>>>>>> +			goto err_pool_free;
>>>>>> +	}
>>>>>> +	swdesc->lli_size = count;
>>>>>> +	swdesc->ccr = 0;
>>>>>> +
>>>>>> +	/* Set LL base address */
>>>>>> +	writel_relaxed(swdesc->lli[0].hwdesc_addr & CLBAR_LBA,
>>>>>> +		       ddata->base + STM32_DMA3_CLBAR(chan->id));
>>>>>> +
>>>>>> +	/* Set LL allocated port */
>>>>>> +	swdesc->ccr &= ~CCR_LAP;
>>>>>> +
>>>>>> +	return swdesc;
>>>>>> +
>>>>>> +err_pool_free:
>>>>>> +	dev_err(chan2dev(chan), "Failed to alloc descriptors\n");
>>>>>> +	while (--i >= 0)
>>>>>> +		dma_pool_free(chan->lli_pool, swdesc->lli[i].hwdesc, swdesc->lli[i].hwdesc_addr);
>>>>>> +	kfree(swdesc);
>>>>>> +
>>>>>> +	return NULL;
>>>>>> +}
>>>>>> +
>>>>>> +static void stm32_dma3_chan_desc_free(struct stm32_dma3_chan *chan,
>>>>>> +				      struct stm32_dma3_swdesc *swdesc)
>>>>>> +{
>>>>>> +	int i;
>>>>>> +
>>>>>> +	for (i = 0; i < swdesc->lli_size; i++)
>>>>>> +		dma_pool_free(chan->lli_pool, swdesc->lli[i].hwdesc, swdesc->lli[i].hwdesc_addr);
>>>>>> +
>>>>>> +	kfree(swdesc);
>>>>>> +}
>>>>>> +
>>>>>> +static void stm32_dma3_chan_vdesc_free(struct virt_dma_desc *vdesc)
>>>>>> +{
>>>>>> +	struct stm32_dma3_swdesc *swdesc = to_stm32_dma3_swdesc(vdesc);
>>>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(vdesc->tx.chan);
>>>>>> +
>>>>>> +	stm32_dma3_chan_desc_free(chan, swdesc);
>>>>>> +}
>>>>>> +
>>>>>> +static void stm32_dma3_check_user_setting(struct stm32_dma3_chan *chan)
>>>>>> +{
>>>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>>>> +	struct device *dev = chan2dev(chan);
>>>>>> +	u32 ctr1 = readl_relaxed(ddata->base + STM32_DMA3_CTR1(chan->id));
>>>>>> +	u32 cbr1 = readl_relaxed(ddata->base + STM32_DMA3_CBR1(chan->id));
>>>>>> +	u32 csar = readl_relaxed(ddata->base + STM32_DMA3_CSAR(chan->id));
>>>>>> +	u32 cdar = readl_relaxed(ddata->base + STM32_DMA3_CDAR(chan->id));
>>>>>> +	u32 cllr = readl_relaxed(ddata->base + STM32_DMA3_CLLR(chan->id));
>>>>>> +	u32 bndt = FIELD_GET(CBR1_BNDT, cbr1);
>>>>>> +	u32 sdw = 1 << FIELD_GET(CTR1_SDW_LOG2, ctr1);
>>>>>> +	u32 ddw = 1 << FIELD_GET(CTR1_DDW_LOG2, ctr1);
>>>>>> +	u32 sap = FIELD_GET(CTR1_SAP, ctr1);
>>>>>> +	u32 dap = FIELD_GET(CTR1_DAP, ctr1);
>>>>>> +
>>>>>> +	if (!bndt && !FIELD_GET(CLLR_UB1, cllr))
>>>>>> +		dev_err(dev, "null source block size and no update of this value\n");
>>>>>> +	if (bndt % sdw)
>>>>>> +		dev_err(dev, "source block size not multiple of src data width\n");
>>>>>> +	if (FIELD_GET(CTR1_PAM, ctr1) == CTR1_PAM_PACK_UNPACK && bndt % ddw)
>>>>>> +		dev_err(dev, "(un)packing mode w/ src block size not multiple of dst data width\n");
>>>>>> +	if (csar % sdw)
>>>>>> +		dev_err(dev, "unaligned source address not multiple of src data width\n");
>>>>>> +	if (cdar % ddw)
>>>>>> +		dev_err(dev, "unaligned destination address not multiple of dst data width\n");
>>>>>> +	if (sdw == DMA_SLAVE_BUSWIDTH_8_BYTES && port_is_ahb(ddata->ports_max_dw[sap]))
>>>>>> +		dev_err(dev, "double-word source data width not supported on port %u\n", sap);
>>>>>> +	if (ddw == DMA_SLAVE_BUSWIDTH_8_BYTES && port_is_ahb(ddata->ports_max_dw[dap]))
>>>>>> +		dev_err(dev, "double-word destination data width not supported on port %u\n", dap);
>>>>>
>>>>> NO error/abort if this is wrong...?
>>>>>
>>>>
>>>> A user setting error triggers an interrupt caught in the
>>>> stm32_dma3_chan_irq() interrupt handler.
>>>> Indeed, a user setting error can occur when enabling the channel or when the
>>>> DMA3 registers are updated with each linked-list item.
>>>> In the interrupt handler, when USEF (User Setting Error Flag) is set, this
>>>> function (stm32_dma3_check_user_setting) helps the user understand what
>>>> went wrong. The hardware automatically disables the channel to prevent the
>>>> execution of the wrongly programmed transfer, and the driver resets the
>>>> channel and sets chan->dma_status = DMA_ERROR. dmaengine_tx_status() will
>>>> return DMA_ERROR.
>>>> So, from the user's point of view, the transfer will never complete, and the
>>>> channel is ready to be reprogrammed.
>>>> Note that in the _prep_ functions, everything is checked to avoid a user
>>>> setting error. If a user setting error occurs, it is rather due to a corrupted
>>>> linked-list item (which should fortunately never happen).
>>>>
>>>>>> +}
>>>>>> +
>>>>>> +static void stm32_dma3_chan_prep_hwdesc(struct stm32_dma3_chan *chan,
>>>>>> +					struct stm32_dma3_swdesc *swdesc,
>>>>>> +					u32 curr, dma_addr_t src, dma_addr_t dst, u32 len,
>>>>>> +					u32 ctr1, u32 ctr2, bool is_last, bool is_cyclic)
>>>>>> +{
>>>>>> +	struct stm32_dma3_hwdesc *hwdesc;
>>>>>> +	dma_addr_t next_lli;
>>>>>> +	u32 next = curr + 1;
>>>>>> +
>>>>>> +	hwdesc = swdesc->lli[curr].hwdesc;
>>>>>> +	hwdesc->ctr1 = ctr1;
>>>>>> +	hwdesc->ctr2 = ctr2;
>>>>>> +	hwdesc->cbr1 = FIELD_PREP(CBR1_BNDT, len);
>>>>>> +	hwdesc->csar = src;
>>>>>> +	hwdesc->cdar = dst;
>>>>>> +
>>>>>> +	if (is_last) {
>>>>>> +		if (is_cyclic)
>>>>>> +			next_lli = swdesc->lli[0].hwdesc_addr;
>>>>>> +		else
>>>>>> +			next_lli = 0;
>>>>>> +	} else {
>>>>>> +		next_lli = swdesc->lli[next].hwdesc_addr;
>>>>>> +	}
>>>>>> +
>>>>>> +	hwdesc->cllr = 0;
>>>>>> +	if (next_lli) {
>>>>>> +		hwdesc->cllr |= CLLR_UT1 | CLLR_UT2 | CLLR_UB1;
>>>>>> +		hwdesc->cllr |= CLLR_USA | CLLR_UDA | CLLR_ULL;
>>>>>> +		hwdesc->cllr |= (next_lli & CLLR_LA);
>>>>>> +	}
>>>>>> +}
>>>>>> +
>>>>>> +static enum dma_slave_buswidth stm32_dma3_get_max_dw(u32 chan_max_burst,
>>>>>> +						     enum stm32_dma3_port_data_width port_max_dw,
>>>>>> +						     u32 len, dma_addr_t addr)
>>>>>> +{
>>>>>> +	enum dma_slave_buswidth max_dw = get_chan_max_dw(port_max_dw, chan_max_burst);
>>>>>> +
>>>>>> +	/* len and addr must be a multiple of dw */
>>>>>> +	return 1 << __ffs(len | addr | max_dw);
>>>>>> +}
>>>>>> +
>>>>>> +static u32 stm32_dma3_get_max_burst(u32 len, enum dma_slave_buswidth dw, u32 chan_max_burst)
>>>>>> +{
>>>>>> +	u32 max_burst = chan_max_burst ? chan_max_burst / dw : 1;
>>>>>> +
>>>>>> +	/* len is a multiple of dw, so if len is < chan_max_burst, shorten burst */
>>>>>> +	if (len < chan_max_burst)
>>>>>> +		max_burst = len / dw;
>>>>>> +
>>>>>> +	/*
>>>>>> +	 * HW doesn't modify the burst if burst size <= half of the fifo size.
>>>>>> +	 * If len is not a multiple of burst size, last burst is shortened by HW.
>>>>>> +	 */
>>>>>> +	return max_burst;
>>>>>> +}
>>>>>> +
>>>>>> +static int stm32_dma3_chan_prep_hw(struct stm32_dma3_chan *chan, enum dma_transfer_direction dir,
>>>>>> +				   u32 *ccr, u32 *ctr1, u32 *ctr2,
>>>>>> +				   dma_addr_t src_addr, dma_addr_t dst_addr, u32 len)
>>>>>> +{
>>>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>>>> +	struct dma_device dma_device = ddata->dma_dev;
>>>>>> +	u32 sdw, ddw, sbl_max, dbl_max, tcem;
>>>>>> +	u32 _ctr1 = 0, _ctr2 = 0;
>>>>>> +	u32 ch_conf = chan->dt_config.ch_conf;
>>>>>> +	u32 tr_conf = chan->dt_config.tr_conf;
>>>>>> +	u32 sap = FIELD_GET(STM32_DMA3_DT_SAP, tr_conf), sap_max_dw;
>>>>>> +	u32 dap = FIELD_GET(STM32_DMA3_DT_DAP, tr_conf), dap_max_dw;
>>>>>> +
>>>>>> +	dev_dbg(chan2dev(chan), "%s from %pad to %pad\n",
>>>>>> +		dmaengine_get_direction_text(dir), &src_addr, &dst_addr);
>>>>>> +
>>>>>> +	sdw = chan->dma_config.src_addr_width ? : get_chan_max_dw(sap, chan->max_burst);
>>>>>> +	ddw = chan->dma_config.dst_addr_width ? : get_chan_max_dw(dap, chan->max_burst);
>>>>>> +	sbl_max = chan->dma_config.src_maxburst ? : 1;
>>>>>> +	dbl_max = chan->dma_config.dst_maxburst ? : 1;
>>>>>> +
>>>>>> +	/* Following conditions would raise User Setting Error interrupt */
>>>>>> +	if (!(dma_device.src_addr_widths & BIT(sdw)) || !(dma_device.dst_addr_widths & BIT(ddw))) {
>>>>>> +		dev_err(chan2dev(chan), "Bus width (src=%u, dst=%u) not supported\n", sdw, ddw);
>>>>>> +		return -EINVAL;
>>>>>> +	}
>>>>>> +
>>>>>> +	if (ddata->ports_max_dw[1] == DW_INVALID && (sap || dap)) {
>>>>>> +		dev_err(chan2dev(chan), "Only one master port, port 1 is not supported\n");
>>>>>> +		return -EINVAL;
>>>>>> +	}
>>>>>> +
>>>>>> +	sap_max_dw = ddata->ports_max_dw[sap];
>>>>>> +	dap_max_dw = ddata->ports_max_dw[dap];
>>>>>> +	if ((port_is_ahb(sap_max_dw) && sdw == DMA_SLAVE_BUSWIDTH_8_BYTES) ||
>>>>>> +	    (port_is_ahb(dap_max_dw) && ddw == DMA_SLAVE_BUSWIDTH_8_BYTES)) {
>>>>>> +		dev_err(chan2dev(chan),
>>>>>> +			"8 bytes buswidth (src=%u, dst=%u) not supported on port (sap=%u, dap=%u\n",
>>>>>> +			sdw, ddw, sap, dap);
>>>>>> +		return -EINVAL;
>>>>>> +	}
>>>>>> +
>>>>>> +	if (FIELD_GET(STM32_DMA3_DT_SINC, tr_conf))
>>>>>> +		_ctr1 |= CTR1_SINC;
>>>>>> +	if (sap)
>>>>>> +		_ctr1 |= CTR1_SAP;
>>>>>> +	if (FIELD_GET(STM32_DMA3_DT_DINC, tr_conf))
>>>>>> +		_ctr1 |= CTR1_DINC;
>>>>>> +	if (dap)
>>>>>> +		_ctr1 |= CTR1_DAP;
>>>>>> +
>>>>>> +	_ctr2 |= FIELD_PREP(CTR2_REQSEL, chan->dt_config.req_line) & ~CTR2_SWREQ;
>>>>>> +	if (FIELD_GET(STM32_DMA3_DT_BREQ, tr_conf))
>>>>>> +		_ctr2 |= CTR2_BREQ;
>>>>>> +	if (dir == DMA_DEV_TO_MEM && FIELD_GET(STM32_DMA3_DT_PFREQ, tr_conf))
>>>>>> +		_ctr2 |= CTR2_PFREQ;
>>>>>> +	tcem = FIELD_GET(STM32_DMA3_DT_TCEM, tr_conf);
>>>>>> +	_ctr2 |= FIELD_PREP(CTR2_TCEM, tcem);
>>>>>> +
>>>>>> +	/* Store TCEM to know on which event TC flag occurred */
>>>>>> +	chan->tcem = tcem;
>>>>>> +	/* Store direction for residue computation */
>>>>>> +	chan->dma_config.direction = dir;
>>>>>> +
>>>>>> +	switch (dir) {
>>>>>> +	case DMA_MEM_TO_DEV:
>>>>>> +		/* Set destination (device) data width and burst */
>>>>>> +		ddw = min_t(u32, ddw, stm32_dma3_get_max_dw(chan->max_burst, dap_max_dw,
>>>>>> +							    len, dst_addr));
>>>>>> +		dbl_max = min_t(u32, dbl_max, stm32_dma3_get_max_burst(len, ddw, chan->max_burst));
>>>>>> +
>>>>>> +		/* Set source (memory) data width and burst */
>>>>>> +		sdw = stm32_dma3_get_max_dw(chan->max_burst, sap_max_dw, len, src_addr);
>>>>>> +		sbl_max = stm32_dma3_get_max_burst(len, sdw, chan->max_burst);
>>>>>> +
>>>>>> +		_ctr1 |= FIELD_PREP(CTR1_SDW_LOG2, ilog2(sdw));
>>>>>> +		_ctr1 |= FIELD_PREP(CTR1_SBL_1, sbl_max - 1);
>>>>>> +		_ctr1 |= FIELD_PREP(CTR1_DDW_LOG2, ilog2(ddw));
>>>>>> +		_ctr1 |= FIELD_PREP(CTR1_DBL_1, dbl_max - 1);
>>>>>> +
>>>>>> +		if (ddw != sdw) {
>>>>>> +			_ctr1 |= FIELD_PREP(CTR1_PAM, CTR1_PAM_PACK_UNPACK);
>>>>>> +			/* Should never reach this case as ddw is clamped down */
>>>>>> +			if (len & (ddw - 1)) {
>>>>>> +				dev_err(chan2dev(chan),
>>>>>> +					"Packing mode is enabled and len is not multiple of ddw");
>>>>>> +				return -EINVAL;
>>>>>> +			}
>>>>>> +		}
>>>>>> +
>>>>>> +		/* dst = dev */
>>>>>> +		_ctr2 |= CTR2_DREQ;
>>>>>> +
>>>>>> +		break;
>>>>>> +
>>>>>> +	case DMA_DEV_TO_MEM:
>>>>>> +		/* Set source (device) data width and burst */
>>>>>> +		sdw = min_t(u32, sdw, stm32_dma3_get_max_dw(chan->max_burst, sap_max_dw,
>>>>>> +							    len, src_addr));
>>>>>> +		sbl_max = min_t(u32, sbl_max, stm32_dma3_get_max_burst(len, sdw, chan->max_burst));
>>>>>> +
>>>>>> +		/* Set destination (memory) data width and burst */
>>>>>> +		ddw = stm32_dma3_get_max_dw(chan->max_burst, dap_max_dw, len, dst_addr);
>>>>>> +		dbl_max = stm32_dma3_get_max_burst(len, ddw, chan->max_burst);
>>>>>> +
>>>>>> +		_ctr1 |= FIELD_PREP(CTR1_SDW_LOG2, ilog2(sdw));
>>>>>> +		_ctr1 |= FIELD_PREP(CTR1_SBL_1, sbl_max - 1);
>>>>>> +		_ctr1 |= FIELD_PREP(CTR1_DDW_LOG2, ilog2(ddw));
>>>>>> +		_ctr1 |= FIELD_PREP(CTR1_DBL_1, dbl_max - 1);
>>>>>> +
>>>>>> +		if (ddw != sdw) {
>>>>>> +			_ctr1 |= FIELD_PREP(CTR1_PAM, CTR1_PAM_PACK_UNPACK);
>>>>>> +			/* Should never reach this case as ddw is clamped down */
>>>>>> +			if (len & (ddw - 1)) {
>>>>>> +				dev_err(chan2dev(chan),
>>>>>> +					"Packing mode is enabled and len is not multiple of ddw\n");
>>>>>> +				return -EINVAL;
>>>>>> +			}
>>>>>> +		}
>>>>>> +
>>>>>> +		/* dst = mem */
>>>>>> +		_ctr2 &= ~CTR2_DREQ;
>>>>>> +
>>>>>> +		break;
>>>>>> +
>>>>>> +	default:
>>>>>> +		dev_err(chan2dev(chan), "Direction %s not supported\n",
>>>>>> +			dmaengine_get_direction_text(dir));
>>>>>> +		return -EINVAL;
>>>>>> +	}
>>>>>> +
>>>>>> +	*ccr |= FIELD_PREP(CCR_PRIO, FIELD_GET(STM32_DMA3_DT_PRIO, ch_conf));
>>>>>> +	*ctr1 = _ctr1;
>>>>>> +	*ctr2 = _ctr2;
>>>>>> +
>>>>>> +	dev_dbg(chan2dev(chan), "%s: sdw=%u bytes sbl=%u beats ddw=%u bytes dbl=%u beats\n",
>>>>>> +		__func__, sdw, sbl_max, ddw, dbl_max);
>>>>>> +
>>>>>> +	return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static void stm32_dma3_chan_start(struct stm32_dma3_chan *chan)
>>>>>> +{
>>>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>>>> +	struct virt_dma_desc *vdesc;
>>>>>> +	struct stm32_dma3_hwdesc *hwdesc;
>>>>>> +	u32 id = chan->id;
>>>>>> +	u32 csr, ccr;
>>>>>> +
>>>>>> +	vdesc = vchan_next_desc(&chan->vchan);
>>>>>> +	if (!vdesc) {
>>>>>> +		chan->swdesc = NULL;
>>>>>> +		return;
>>>>>> +	}
>>>>>> +	list_del(&vdesc->node);
>>>>>> +
>>>>>> +	chan->swdesc = to_stm32_dma3_swdesc(vdesc);
>>>>>> +	hwdesc = chan->swdesc->lli[0].hwdesc;
>>>>>> +
>>>>>> +	stm32_dma3_chan_dump_hwdesc(chan, chan->swdesc);
>>>>>> +
>>>>>> +	writel_relaxed(chan->swdesc->ccr, ddata->base + STM32_DMA3_CCR(id));
>>>>>> +	writel_relaxed(hwdesc->ctr1, ddata->base + STM32_DMA3_CTR1(id));
>>>>>> +	writel_relaxed(hwdesc->ctr2, ddata->base + STM32_DMA3_CTR2(id));
>>>>>> +	writel_relaxed(hwdesc->cbr1, ddata->base + STM32_DMA3_CBR1(id));
>>>>>> +	writel_relaxed(hwdesc->csar, ddata->base + STM32_DMA3_CSAR(id));
>>>>>> +	writel_relaxed(hwdesc->cdar, ddata->base + STM32_DMA3_CDAR(id));
>>>>>> +	writel_relaxed(hwdesc->cllr, ddata->base + STM32_DMA3_CLLR(id));
>>>>>> +
>>>>>> +	/* Clear any pending interrupts */
>>>>>> +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(id));
>>>>>> +	if (csr & CSR_ALL_F)
>>>>>> +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(id));
>>>>>> +
>>>>>> +	stm32_dma3_chan_dump_reg(chan);
>>>>>> +
>>>>>> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(id));
>>>>>> +	writel_relaxed(ccr | CCR_EN, ddata->base + STM32_DMA3_CCR(id));
>>>>>> +
>>>>>> +	chan->dma_status = DMA_IN_PROGRESS;
>>>>>> +
>>>>>> +	dev_dbg(chan2dev(chan), "vchan %pK: started\n", &chan->vchan);
>>>>>> +}
>>>>>> +
>>>>>> +static int stm32_dma3_chan_suspend(struct stm32_dma3_chan *chan, bool susp)
>>>>>> +{
>>>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>>>> +	u32 csr, ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
>>>>>> +	int ret = 0;
>>>>>> +
>>>>>> +	if (susp)
>>>>>> +		ccr |= CCR_SUSP;
>>>>>> +	else
>>>>>> +		ccr &= ~CCR_SUSP;
>>>>>> +
>>>>>> +	writel_relaxed(ccr, ddata->base + STM32_DMA3_CCR(chan->id));
>>>>>> +
>>>>>> +	if (susp) {
>>>>>> +		ret = readl_relaxed_poll_timeout_atomic(ddata->base + STM32_DMA3_CSR(chan->id), csr,
>>>>>> +							csr & CSR_SUSPF, 1, 10);
>>>>>> +		if (!ret)
>>>>>> +			writel_relaxed(CFCR_SUSPF, ddata->base + STM32_DMA3_CFCR(chan->id));
>>>>>> +
>>>>>> +		stm32_dma3_chan_dump_reg(chan);
>>>>>> +	}
>>>>>> +
>>>>>> +	return ret;
>>>>>> +}
>>>>>> +
>>>>>> +static void stm32_dma3_chan_reset(struct stm32_dma3_chan *chan)
>>>>>> +{
>>>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>>>> +	u32 ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
>>>>>> +
>>>>>> +	writel_relaxed(ccr | CCR_RESET, ddata->base + STM32_DMA3_CCR(chan->id));
>>>>>> +}
>>>>>> +
>>>>>> +static int stm32_dma3_chan_stop(struct stm32_dma3_chan *chan)
>>>>>> +{
>>>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>>>> +	u32 ccr;
>>>>>> +	int ret = 0;
>>>>>> +
>>>>>> +	chan->dma_status = DMA_COMPLETE;
>>>>>> +
>>>>>> +	/* Disable interrupts */
>>>>>> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id));
>>>>>> +	writel_relaxed(ccr & ~(CCR_ALLIE | CCR_EN), ddata->base + STM32_DMA3_CCR(chan->id));
>>>>>> +
>>>>>> +	if (!(ccr & CCR_SUSP) && (ccr & CCR_EN)) {
>>>>>> +		/* Suspend the channel */
>>>>>> +		ret = stm32_dma3_chan_suspend(chan, true);
>>>>>> +		if (ret)
>>>>>> +			dev_warn(chan2dev(chan), "%s: timeout, data might be lost\n", __func__);
>>>>>> +	}
>>>>>> +
>>>>>> +	/*
>>>>>> +	 * Reset the channel: this causes the reset of the FIFO and the reset of the channel
>>>>>> +	 * internal state, the reset of CCR_EN and CCR_SUSP bits.
>>>>>> +	 */
>>>>>> +	stm32_dma3_chan_reset(chan);
>>>>>> +
>>>>>> +	return ret;
>>>>>> +}
>>>>>> +
>>>>>> +static void stm32_dma3_chan_complete(struct stm32_dma3_chan *chan)
>>>>>> +{
>>>>>> +	if (!chan->swdesc)
>>>>>> +		return;
>>>>>> +
>>>>>> +	vchan_cookie_complete(&chan->swdesc->vdesc);
>>>>>> +	chan->swdesc = NULL;
>>>>>> +	stm32_dma3_chan_start(chan);
>>>>>> +}
>>>>>> +
>>>>>> +static irqreturn_t stm32_dma3_chan_irq(int irq, void *devid)
>>>>>> +{
>>>>>> +	struct stm32_dma3_chan *chan = devid;
>>>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>>>> +	u32 misr, csr, ccr;
>>>>>> +
>>>>>> +	spin_lock(&chan->vchan.lock);
>>>>>> +
>>>>>> +	misr = readl_relaxed(ddata->base + STM32_DMA3_MISR);
>>>>>> +	if (!(misr & MISR_MIS(chan->id))) {
>>>>>> +		spin_unlock(&chan->vchan.lock);
>>>>>> +		return IRQ_NONE;
>>>>>> +	}
>>>>>> +
>>>>>> +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(chan->id));
>>>>>> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & CCR_ALLIE;
>>>>>> +
>>>>>> +	if (csr & CSR_TCF && ccr & CCR_TCIE) {
>>>>>> +		if (chan->swdesc->cyclic)
>>>>>> +			vchan_cyclic_callback(&chan->swdesc->vdesc);
>>>>>> +		else
>>>>>> +			stm32_dma3_chan_complete(chan);
>>>>>> +	}
>>>>>> +
>>>>>> +	if (csr & CSR_USEF && ccr & CCR_USEIE) {
>>>>>> +		dev_err(chan2dev(chan), "User setting error\n");
>>>>>> +		chan->dma_status = DMA_ERROR;
>>>>>> +		/* CCR.EN automatically cleared by HW */
>>>>>> +		stm32_dma3_check_user_setting(chan);
>>>>>> +		stm32_dma3_chan_reset(chan);
>>>>>> +	}
>>>>>> +
>>>>>> +	if (csr & CSR_ULEF && ccr & CCR_ULEIE) {
>>>>>> +		dev_err(chan2dev(chan), "Update link transfer error\n");
>>>>>> +		chan->dma_status = DMA_ERROR;
>>>>>> +		/* CCR.EN automatically cleared by HW */
>>>>>> +		stm32_dma3_chan_reset(chan);
>>>>>> +	}
>>>>>> +
>>>>>> +	if (csr & CSR_DTEF && ccr & CCR_DTEIE) {
>>>>>> +		dev_err(chan2dev(chan), "Data transfer error\n");
>>>>>> +		chan->dma_status = DMA_ERROR;
>>>>>> +		/* CCR.EN automatically cleared by HW */
>>>>>> +		stm32_dma3_chan_reset(chan);
>>>>>> +	}
>>>>>> +
>>>>>> +	/*
>>>>>> +	 * The Half Transfer Interrupt may be disabled but the Half Transfer Flag can
>>>>>> +	 * still be set, so make sure the HTF flag is cleared along with the other flags.
>>>>>> +	 */
>>>>>> +	csr &= (ccr | CCR_HTIE);
>>>>>> +
>>>>>> +	if (csr)
>>>>>> +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(chan->id));
>>>>>> +
>>>>>> +	spin_unlock(&chan->vchan.lock);
>>>>>> +
>>>>>> +	return IRQ_HANDLED;
>>>>>> +}
>>>>>> +
>>>>>> +static int stm32_dma3_alloc_chan_resources(struct dma_chan *c)
>>>>>> +{
>>>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>>>> +	u32 id = chan->id, csemcr, ccid;
>>>>>> +	int ret;
>>>>>> +
>>>>>> +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
>>>>>> +	if (ret < 0)
>>>>>> +		return ret;
>>>>>> +
>>>>>> +	/* Ensure the channel is free */
>>>>>> +	if (chan->semaphore_mode &&
>>>>>> +	    readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id)) & CSEMCR_SEM_MUTEX) {
>>>>>> +		ret = -EBUSY;
>>>>>> +		goto err_put_sync;
>>>>>> +	}
>>>>>> +
>>>>>> +	chan->lli_pool = dmam_pool_create(dev_name(&c->dev->device), c->device->dev,
>>>>>> +					  sizeof(struct stm32_dma3_hwdesc),
>>>>>> +					  __alignof__(struct stm32_dma3_hwdesc), 0);
>>>>>> +	if (!chan->lli_pool) {
>>>>>> +		dev_err(chan2dev(chan), "Failed to create LLI pool\n");
>>>>>> +		ret = -ENOMEM;
>>>>>> +		goto err_put_sync;
>>>>>> +	}
>>>>>> +
>>>>>> +	/* Take the channel semaphore */
>>>>>> +	if (chan->semaphore_mode) {
>>>>>> +		writel_relaxed(CSEMCR_SEM_MUTEX, ddata->base + STM32_DMA3_CSEMCR(id));
>>>>>> +		csemcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(id));
>>>>>> +		ccid = FIELD_GET(CSEMCR_SEM_CCID, csemcr);
>>>>>> +		/* Check that the channel was actually taken */
>>>>>> +		if (ccid != CCIDCFGR_CID1) {
>>>>>> +			dev_err(chan2dev(chan), "Not under CID1 control (in-use by CID%d)\n", ccid);
>>>>>> +			ret = -EPERM;
>>>>>> +			goto err_pool_destroy;
>>>>>> +		}
>>>>>> +		dev_dbg(chan2dev(chan), "Under CID1 control (semcr=0x%08x)\n", csemcr);
>>>>>> +	}
>>>>>> +
>>>>>> +	return 0;
>>>>>> +
>>>>>> +err_pool_destroy:
>>>>>> +	dmam_pool_destroy(chan->lli_pool);
>>>>>> +	chan->lli_pool = NULL;
>>>>>> +
>>>>>> +err_put_sync:
>>>>>> +	pm_runtime_put_sync(ddata->dma_dev.dev);
>>>>>> +
>>>>>> +	return ret;
>>>>>> +}
>>>>>> +
>>>>>> +static void stm32_dma3_free_chan_resources(struct dma_chan *c)
>>>>>> +{
>>>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>>>> +	unsigned long flags;
>>>>>> +
>>>>>> +	/* Ensure channel is in idle state */
>>>>>> +	spin_lock_irqsave(&chan->vchan.lock, flags);
>>>>>> +	stm32_dma3_chan_stop(chan);
>>>>>> +	chan->swdesc = NULL;
>>>>>> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
>>>>>> +
>>>>>> +	vchan_free_chan_resources(to_virt_chan(c));
>>>>>> +
>>>>>> +	dmam_pool_destroy(chan->lli_pool);
>>>>>> +	chan->lli_pool = NULL;
>>>>>> +
>>>>>> +	/* Release the channel semaphore */
>>>>>> +	if (chan->semaphore_mode)
>>>>>> +		writel_relaxed(0, ddata->base + STM32_DMA3_CSEMCR(chan->id));
>>>>>> +
>>>>>> +	pm_runtime_put_sync(ddata->dma_dev.dev);
>>>>>> +
>>>>>> +	/* Reset configuration */
>>>>>> +	memset(&chan->dt_config, 0, sizeof(chan->dt_config));
>>>>>> +	memset(&chan->dma_config, 0, sizeof(chan->dma_config));
>>>>>> +}
>>>>>> +
>>>>>> +static struct dma_async_tx_descriptor *stm32_dma3_prep_slave_sg(struct dma_chan *c,
>>>>>> +								struct scatterlist *sgl,
>>>>>> +								unsigned int sg_len,
>>>>>> +								enum dma_transfer_direction dir,
>>>>>> +								unsigned long flags, void *context)
>>>>>> +{
>>>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>>>> +	struct stm32_dma3_swdesc *swdesc;
>>>>>> +	struct scatterlist *sg;
>>>>>> +	size_t len;
>>>>>> +	dma_addr_t sg_addr, dev_addr, src, dst;
>>>>>> +	u32 i, j, count, ctr1, ctr2;
>>>>>> +	int ret;
>>>>>> +
>>>>>> +	count = sg_len;
>>>>>> +	for_each_sg(sgl, sg, sg_len, i) {
>>>>>> +		len = sg_dma_len(sg);
>>>>>> +		if (len > STM32_DMA3_MAX_BLOCK_SIZE)
>>>>>> +			count += DIV_ROUND_UP(len, STM32_DMA3_MAX_BLOCK_SIZE) - 1;
>>>>>> +	}
>>>>>> +
>>>>>> +	swdesc = stm32_dma3_chan_desc_alloc(chan, count);
>>>>>> +	if (!swdesc)
>>>>>> +		return NULL;
>>>>>> +
>>>>>> +	/* sg_len and i correspond to the initial sgl; count and j correspond to the hwdesc LL */
>>>>>> +	j = 0;
>>>>>> +	for_each_sg(sgl, sg, sg_len, i) {
>>>>>> +		sg_addr = sg_dma_address(sg);
>>>>>> +		dev_addr = (dir == DMA_MEM_TO_DEV) ? chan->dma_config.dst_addr :
>>>>>> +						     chan->dma_config.src_addr;
>>>>>> +		len = sg_dma_len(sg);
>>>>>> +
>>>>>> +		do {
>>>>>> +			size_t chunk = min_t(size_t, len, STM32_DMA3_MAX_BLOCK_SIZE);
>>>>>> +
>>>>>> +			if (dir == DMA_MEM_TO_DEV) {
>>>>>> +				src = sg_addr;
>>>>>> +				dst = dev_addr;
>>>>>> +
>>>>>> +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
>>>>>> +							      src, dst, chunk);
>>>>>> +
>>>>>> +				if (FIELD_GET(CTR1_DINC, ctr1))
>>>>>> +					dev_addr += chunk;
>>>>>> +			} else { /* (dir == DMA_DEV_TO_MEM || dir == DMA_MEM_TO_MEM) */
>>>>>> +				src = dev_addr;
>>>>>> +				dst = sg_addr;
>>>>>> +
>>>>>> +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
>>>>>> +							      src, dst, chunk);
>>>>>> +
>>>>>> +				if (FIELD_GET(CTR1_SINC, ctr1))
>>>>>> +					dev_addr += chunk;
>>>>>> +			}
>>>>>> +
>>>>>> +			if (ret)
>>>>>> +				goto err_desc_free;
>>>>>> +
>>>>>> +			stm32_dma3_chan_prep_hwdesc(chan, swdesc, j, src, dst, chunk,
>>>>>> +						    ctr1, ctr2, j == (count - 1), false);
>>>>>> +
>>>>>> +			sg_addr += chunk;
>>>>>> +			len -= chunk;
>>>>>> +			j++;
>>>>>> +		} while (len);
>>>>>> +	}
>>>>>> +
>>>>>> +	/* Enable Error interrupts */
>>>>>> +	swdesc->ccr |= CCR_USEIE | CCR_ULEIE | CCR_DTEIE;
>>>>>> +	/* Enable Transfer state interrupts */
>>>>>> +	swdesc->ccr |= CCR_TCIE;
>>>>>> +
>>>>>> +	swdesc->cyclic = false;
>>>>>> +
>>>>>> +	return vchan_tx_prep(&chan->vchan, &swdesc->vdesc, flags);
>>>>>> +
>>>>>> +err_desc_free:
>>>>>> +	stm32_dma3_chan_desc_free(chan, swdesc);
>>>>>> +
>>>>>> +	return NULL;
>>>>>> +}
>>>>>> +
>>>>>> +static void stm32_dma3_caps(struct dma_chan *c, struct dma_slave_caps *caps)
>>>>>> +{
>>>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>>>> +
>>>>>> +	if (!chan->fifo_size) {
>>>>>> +		caps->max_burst = 0;
>>>>>> +		caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>>>>>> +		caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>>>>>> +	} else {
>>>>>> +		/* Burst transfer should not exceed half of the fifo size */
>>>>>> +		caps->max_burst = chan->max_burst;
>>>>>> +		if (caps->max_burst < DMA_SLAVE_BUSWIDTH_8_BYTES) {
>>>>>> +			caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>>>>>> +			caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>>>>>> +		}
>>>>>> +	}
>>>>>> +}
>>>>>> +
>>>>>> +static int stm32_dma3_config(struct dma_chan *c, struct dma_slave_config *config)
>>>>>> +{
>>>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>>>> +
>>>>>> +	memcpy(&chan->dma_config, config, sizeof(*config));
>>>>>> +
>>>>>> +	return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static int stm32_dma3_terminate_all(struct dma_chan *c)
>>>>>> +{
>>>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>>>> +	unsigned long flags;
>>>>>> +	LIST_HEAD(head);
>>>>>> +
>>>>>> +	spin_lock_irqsave(&chan->vchan.lock, flags);
>>>>>> +
>>>>>> +	if (chan->swdesc) {
>>>>>> +		vchan_terminate_vdesc(&chan->swdesc->vdesc);
>>>>>> +		chan->swdesc = NULL;
>>>>>> +	}
>>>>>> +
>>>>>> +	stm32_dma3_chan_stop(chan);
>>>>>> +
>>>>>> +	vchan_get_all_descriptors(&chan->vchan, &head);
>>>>>> +
>>>>>> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
>>>>>> +	vchan_dma_desc_free_list(&chan->vchan, &head);
>>>>>> +
>>>>>> +	dev_dbg(chan2dev(chan), "vchan %pK: terminated\n", &chan->vchan);
>>>>>> +
>>>>>> +	return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static void stm32_dma3_synchronize(struct dma_chan *c)
>>>>>> +{
>>>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>>>> +
>>>>>> +	vchan_synchronize(&chan->vchan);
>>>>>> +}
>>>>>> +
>>>>>> +static void stm32_dma3_issue_pending(struct dma_chan *c)
>>>>>> +{
>>>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>>>> +	unsigned long flags;
>>>>>> +
>>>>>> +	spin_lock_irqsave(&chan->vchan.lock, flags);
>>>>>> +
>>>>>> +	if (vchan_issue_pending(&chan->vchan) && !chan->swdesc) {
>>>>>> +		dev_dbg(chan2dev(chan), "vchan %pK: issued\n", &chan->vchan);
>>>>>> +		stm32_dma3_chan_start(chan);
>>>>>> +	}
>>>>>> +
>>>>>> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
>>>>>> +}
>>>>>> +
>>>>>> +static bool stm32_dma3_filter_fn(struct dma_chan *c, void *fn_param)
>>>>>> +{
>>>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>>>> +	struct stm32_dma3_dt_conf *conf = fn_param;
>>>>>> +	u32 mask, semcr;
>>>>>> +	int ret;
>>>>>> +
>>>>>> +	dev_dbg(c->device->dev, "%s(%s): req_line=%d ch_conf=%08x tr_conf=%08x\n",
>>>>>> +		__func__, dma_chan_name(c), conf->req_line, conf->ch_conf, conf->tr_conf);
>>>>>> +
>>>>>> +	if (!of_property_read_u32(c->device->dev->of_node, "dma-channel-mask", &mask))
>>>>>> +		if (!(mask & BIT(chan->id)))
>>>>>> +			return false;
>>>>>> +
>>>>>> +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
>>>>>> +	if (ret < 0)
>>>>>> +		return false;
>>>>>> +	semcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id));
>>>>>> +	pm_runtime_put_sync(ddata->dma_dev.dev);
>>>>>> +
>>>>>> +	/* Check if chan is free */
>>>>>> +	if (semcr & CSEMCR_SEM_MUTEX)
>>>>>> +		return false;
>>>>>> +
>>>>>> +	/* Check if chan fifo fits well */
>>>>>> +	if (FIELD_GET(STM32_DMA3_DT_FIFO, conf->ch_conf) != chan->fifo_size)
>>>>>> +		return false;
>>>>>> +
>>>>>> +	return true;
>>>>>> +}
>>>>>> +
>>>>>> +static struct dma_chan *stm32_dma3_of_xlate(struct of_phandle_args *dma_spec, struct of_dma *ofdma)
>>>>>> +{
>>>>>> +	struct stm32_dma3_ddata *ddata = ofdma->of_dma_data;
>>>>>> +	dma_cap_mask_t mask = ddata->dma_dev.cap_mask;
>>>>>> +	struct stm32_dma3_dt_conf conf;
>>>>>> +	struct stm32_dma3_chan *chan;
>>>>>> +	struct dma_chan *c;
>>>>>> +
>>>>>> +	if (dma_spec->args_count < 3) {
>>>>>> +		dev_err(ddata->dma_dev.dev, "Invalid args count\n");
>>>>>> +		return NULL;
>>>>>> +	}
>>>>>> +
>>>>>> +	conf.req_line = dma_spec->args[0];
>>>>>> +	conf.ch_conf = dma_spec->args[1];
>>>>>> +	conf.tr_conf = dma_spec->args[2];
>>>>>> +
>>>>>> +	if (conf.req_line >= ddata->dma_requests) {
>>>>>> +		dev_err(ddata->dma_dev.dev, "Invalid request line\n");
>>>>>> +		return NULL;
>>>>>> +	}
>>>>>> +
>>>>>> +	/* Request dma channel among the generic dma controller list */
>>>>>> +	c = dma_request_channel(mask, stm32_dma3_filter_fn, &conf);
>>>>>> +	if (!c) {
>>>>>> +		dev_err(ddata->dma_dev.dev, "No suitable channel found\n");
>>>>>> +		return NULL;
>>>>>> +	}
>>>>>> +
>>>>>> +	chan = to_stm32_dma3_chan(c);
>>>>>> +	chan->dt_config = conf;
>>>>>> +
>>>>>> +	return c;
>>>>>> +}
>>>>>> +
>>>>>> +static u32 stm32_dma3_check_rif(struct stm32_dma3_ddata *ddata)
>>>>>> +{
>>>>>> +	u32 chan_reserved, mask = 0, i, ccidcfgr, invalid_cid = 0;
>>>>>> +
>>>>>> +	/* Reserve Secure channels */
>>>>>> +	chan_reserved = readl_relaxed(ddata->base + STM32_DMA3_SECCFGR);
>>>>>> +
>>>>>> +	/*
>>>>>> +	 * CID filtering must be configured to ensure that the DMA3 channel will inherit the CID of
>>>>>> +	 * the processor which is configuring and using the given channel.
>>>>>> +	 * In case CID filtering is not configured, dma-channel-mask property can be used to
>>>>>> +	 * specify available DMA channels to the kernel.
>>>>>> +	 */
>>>>>> +	of_property_read_u32(ddata->dma_dev.dev->of_node, "dma-channel-mask", &mask);
>>>>>> +
>>>>>> +	/* Reserve !CID-filtered not in dma-channel-mask, static CID != CID1, CID1 not allowed */
>>>>>> +	for (i = 0; i < ddata->dma_channels; i++) {
>>>>>> +		ccidcfgr = readl_relaxed(ddata->base + STM32_DMA3_CCIDCFGR(i));
>>>>>> +
>>>>>> +		if (!(ccidcfgr & CCIDCFGR_CFEN)) { /* !CID-filtered */
>>>>>> +			invalid_cid |= BIT(i);
>>>>>> +			if (!(mask & BIT(i))) /* Not in dma-channel-mask */
>>>>>> +				chan_reserved |= BIT(i);
>>>>>> +		} else { /* CID-filtered */
>>>>>> +			if (!(ccidcfgr & CCIDCFGR_SEM_EN)) { /* Static CID mode */
>>>>>> +				if (FIELD_GET(CCIDCFGR_SCID, ccidcfgr) != CCIDCFGR_CID1)
>>>>>> +					chan_reserved |= BIT(i);
>>>>>> +			} else { /* Semaphore mode */
>>>>>> +				if (!FIELD_GET(CCIDCFGR_SEM_WLIST_CID1, ccidcfgr))
>>>>>> +					chan_reserved |= BIT(i);
>>>>>> +				ddata->chans[i].semaphore_mode = true;
>>>>>> +			}
>>>>>> +		}
>>>>>> +		dev_dbg(ddata->dma_dev.dev, "chan%d: %s mode, %s\n", i,
>>>>>> +			!(ccidcfgr & CCIDCFGR_CFEN) ? "!CID-filtered" :
>>>>>> +			ddata->chans[i].semaphore_mode ? "Semaphore" : "Static CID",
>>>>>> +			(chan_reserved & BIT(i)) ? "denied" :
>>>>>> +			mask & BIT(i) ? "force allowed" : "allowed");
>>>>>> +	}
>>>>>> +
>>>>>> +	if (invalid_cid)
>>>>>> +		dev_warn(ddata->dma_dev.dev, "chan%*pbl have invalid CID configuration\n",
>>>>>> +			 ddata->dma_channels, &invalid_cid);
>>>>>> +
>>>>>> +	return chan_reserved;
>>>>>> +}
>>>>>> +
>>>>>> +static const struct of_device_id stm32_dma3_of_match[] = {
>>>>>> +	{ .compatible = "st,stm32-dma3", },
>>>>>> +	{ /* sentinel */},
>>>>>> +};
>>>>>> +MODULE_DEVICE_TABLE(of, stm32_dma3_of_match);
>>>>>> +
>>>>>> +static int stm32_dma3_probe(struct platform_device *pdev)
>>>>>> +{
>>>>>> +	struct device_node *np = pdev->dev.of_node;
>>>>>> +	struct stm32_dma3_ddata *ddata;
>>>>>> +	struct reset_control *reset;
>>>>>> +	struct stm32_dma3_chan *chan;
>>>>>> +	struct dma_device *dma_dev;
>>>>>> +	u32 master_ports, chan_reserved, i, verr;
>>>>>> +	u64 hwcfgr;
>>>>>> +	int ret;
>>>>>> +
>>>>>> +	ddata = devm_kzalloc(&pdev->dev, sizeof(*ddata), GFP_KERNEL);
>>>>>> +	if (!ddata)
>>>>>> +		return -ENOMEM;
>>>>>> +	platform_set_drvdata(pdev, ddata);
>>>>>> +
>>>>>> +	dma_dev = &ddata->dma_dev;
>>>>>> +
>>>>>> +	ddata->base = devm_platform_ioremap_resource(pdev, 0);
>>>>>> +	if (IS_ERR(ddata->base))
>>>>>> +		return PTR_ERR(ddata->base);
>>>>>> +
>>>>>> +	ddata->clk = devm_clk_get(&pdev->dev, NULL);
>>>>>> +	if (IS_ERR(ddata->clk))
>>>>>> +		return dev_err_probe(&pdev->dev, PTR_ERR(ddata->clk), "Failed to get clk\n");
>>>>>> +
>>>>>> +	reset = devm_reset_control_get_optional(&pdev->dev, NULL);
>>>>>> +	if (IS_ERR(reset))
>>>>>> +		return dev_err_probe(&pdev->dev, PTR_ERR(reset), "Failed to get reset\n");
>>>>>> +
>>>>>> +	ret = clk_prepare_enable(ddata->clk);
>>>>>> +	if (ret)
>>>>>> +		return dev_err_probe(&pdev->dev, ret, "Failed to enable clk\n");
>>>>>> +
>>>>>> +	reset_control_reset(reset);
>>>>>> +
>>>>>> +	INIT_LIST_HEAD(&dma_dev->channels);
>>>>>> +
>>>>>> +	dma_cap_set(DMA_SLAVE, dma_dev->cap_mask);
>>>>>> +	dma_cap_set(DMA_PRIVATE, dma_dev->cap_mask);
>>>>>> +	dma_dev->dev = &pdev->dev;
>>>>>> +	/*
>>>>>> +	 * This controller supports up to 8-byte buswidth depending on the port used and the
>>>>>> +	 * channel, and can only access addresses aligned on a multiple of the buswidth.
>>>>>> +	 */
>>>>>> +	dma_dev->copy_align = DMAENGINE_ALIGN_8_BYTES;
>>>>>> +	dma_dev->src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
>>>>>> +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
>>>>>> +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
>>>>>> +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>>>>>> +	dma_dev->dst_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
>>>>>> +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
>>>>>> +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
>>>>>> +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>>>>>> +	dma_dev->directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV) | BIT(DMA_MEM_TO_MEM);
>>>>>> +
>>>>>> +	dma_dev->descriptor_reuse = true;
>>>>>> +	dma_dev->max_sg_burst = STM32_DMA3_MAX_SEG_SIZE;
>>>>>> +	dma_dev->residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
>>>>>> +	dma_dev->device_alloc_chan_resources = stm32_dma3_alloc_chan_resources;
>>>>>> +	dma_dev->device_free_chan_resources = stm32_dma3_free_chan_resources;
>>>>>> +	dma_dev->device_prep_slave_sg = stm32_dma3_prep_slave_sg;
>>>>>> +	dma_dev->device_caps = stm32_dma3_caps;
>>>>>> +	dma_dev->device_config = stm32_dma3_config;
>>>>>> +	dma_dev->device_terminate_all = stm32_dma3_terminate_all;
>>>>>> +	dma_dev->device_synchronize = stm32_dma3_synchronize;
>>>>>> +	dma_dev->device_tx_status = dma_cookie_status;
>>>>>> +	dma_dev->device_issue_pending = stm32_dma3_issue_pending;
>>>>>> +
>>>>>> +	/* If the dma-channels property is not set, get the value from HWCFGR1 */
>>>>>> +	if (of_property_read_u32(np, "dma-channels", &ddata->dma_channels)) {
>>>>>> +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
>>>>>> +		ddata->dma_channels = FIELD_GET(G_NUM_CHANNELS, hwcfgr);
>>>>>> +	}
>>>>>> +
>>>>>> +	/* If the dma-requests property is not set, get the value from HWCFGR2 */
>>>>>> +	if (of_property_read_u32(np, "dma-requests", &ddata->dma_requests)) {
>>>>>> +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR2);
>>>>>> +		ddata->dma_requests = FIELD_GET(G_MAX_REQ_ID, hwcfgr) + 1;
>>>>>> +	}
>>>>>> +
>>>>>> +	/* G_MASTER_PORTS, G_M0_DATA_WIDTH_ENC, G_M1_DATA_WIDTH_ENC in HWCFGR1 */
>>>>>> +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
>>>>>> +	master_ports = FIELD_GET(G_MASTER_PORTS, hwcfgr);
>>>>>> +
>>>>>> +	ddata->ports_max_dw[0] = FIELD_GET(G_M0_DATA_WIDTH_ENC, hwcfgr);
>>>>>> +	if (master_ports == AXI64 || master_ports == AHB32) /* Single master port */
>>>>>> +		ddata->ports_max_dw[1] = DW_INVALID;
>>>>>> +	else /* Dual master ports */
>>>>>> +		ddata->ports_max_dw[1] = FIELD_GET(G_M1_DATA_WIDTH_ENC, hwcfgr);
>>>>>> +
>>>>>> +	ddata->chans = devm_kcalloc(&pdev->dev, ddata->dma_channels, sizeof(*ddata->chans),
>>>>>> +				    GFP_KERNEL);
>>>>>> +	if (!ddata->chans) {
>>>>>> +		ret = -ENOMEM;
>>>>>> +		goto err_clk_disable;
>>>>>> +	}
>>>>>> +
>>>>>> +	chan_reserved = stm32_dma3_check_rif(ddata);
>>>>>> +
>>>>>> +	if (chan_reserved == GENMASK(ddata->dma_channels - 1, 0)) {
>>>>>> +		ret = -ENODEV;
>>>>>> +		dev_err_probe(&pdev->dev, ret, "No channel available, abort registration\n");
>>>>>> +		goto err_clk_disable;
>>>>>> +	}
>>>>>> +
>>>>>> +	/* G_FIFO_SIZE x=0..7 in HWCFGR3 and G_FIFO_SIZE x=8..15 in HWCFGR4 */
>>>>>> +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR3);
>>>>>> +	hwcfgr |= ((u64)readl_relaxed(ddata->base + STM32_DMA3_HWCFGR4)) << 32;
>>>>>> +
>>>>>> +	for (i = 0; i < ddata->dma_channels; i++) {
>>>>>> +		if (chan_reserved & BIT(i))
>>>>>> +			continue;
>>>>>> +
>>>>>> +		chan = &ddata->chans[i];
>>>>>> +		chan->id = i;
>>>>>> +		chan->fifo_size = get_chan_hwcfg(i, G_FIFO_SIZE(i), hwcfgr);
>>>>>> +		/* Max burst is half the FIFO size if the channel has a FIFO, no burst otherwise */
>>>>>> +		chan->max_burst = (chan->fifo_size) ? (1 << (chan->fifo_size + 1)) / 2 : 0;
>>>>>> +		chan->vchan.desc_free = stm32_dma3_chan_vdesc_free;
>>>>>> +
>>>>>> +		vchan_init(&chan->vchan, dma_dev);
>>>>>> +	}
>>>>>> +
>>>>>> +	ret = dmaenginem_async_device_register(dma_dev);
>>>>>> +	if (ret)
>>>>>> +		goto err_clk_disable;
>>>>>> +
>>>>>> +	for (i = 0; i < ddata->dma_channels; i++) {
>>>>>> +		if (chan_reserved & BIT(i))
>>>>>> +			continue;
>>>>>> +
>>>>>> +		ret = platform_get_irq(pdev, i);
>>>>>> +		if (ret < 0)
>>>>>> +			goto err_clk_disable;
>>>>>> +
>>>>>> +		chan = &ddata->chans[i];
>>>>>> +		chan->irq = ret;
>>>>>> +
>>>>>> +		ret = devm_request_irq(&pdev->dev, chan->irq, stm32_dma3_chan_irq, 0,
>>>>>> +				       dev_name(chan2dev(chan)), chan);
>>>>>> +		if (ret) {
>>>>>> +			dev_err_probe(&pdev->dev, ret, "Failed to request channel %s IRQ\n",
>>>>>> +				      dev_name(chan2dev(chan)));
>>>>>> +			goto err_clk_disable;
>>>>>> +		}
>>>>>> +	}
>>>>>> +
>>>>>> +	ret = of_dma_controller_register(np, stm32_dma3_of_xlate, ddata);
>>>>>> +	if (ret) {
>>>>>> +		dev_err_probe(&pdev->dev, ret, "Failed to register controller\n");
>>>>>> +		goto err_clk_disable;
>>>>>> +	}
>>>>>> +
>>>>>> +	verr = readl_relaxed(ddata->base + STM32_DMA3_VERR);
>>>>>> +
>>>>>> +	pm_runtime_set_active(&pdev->dev);
>>>>>> +	pm_runtime_enable(&pdev->dev);
>>>>>> +	pm_runtime_get_noresume(&pdev->dev);
>>>>>> +	pm_runtime_put(&pdev->dev);
>>>>>> +
>>>>>> +	dev_info(&pdev->dev, "STM32 DMA3 registered rev:%lu.%lu\n",
>>>>>> +		 FIELD_GET(VERR_MAJREV, verr), FIELD_GET(VERR_MINREV, verr));
>>>>>> +
>>>>>> +	return 0;
>>>>>> +
>>>>>> +err_clk_disable:
>>>>>> +	clk_disable_unprepare(ddata->clk);
>>>>>> +
>>>>>> +	return ret;
>>>>>> +}
>>>>>> +
>>>>>> +static void stm32_dma3_remove(struct platform_device *pdev)
>>>>>> +{
>>>>>> +	pm_runtime_disable(&pdev->dev);
>>>>>> +}
>>>>>> +
>>>>>> +static int stm32_dma3_runtime_suspend(struct device *dev)
>>>>>> +{
>>>>>> +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
>>>>>> +
>>>>>> +	clk_disable_unprepare(ddata->clk);
>>>>>> +
>>>>>> +	return 0;
>>>>>> +}
>>>>>> +
>>>>>> +static int stm32_dma3_runtime_resume(struct device *dev)
>>>>>> +{
>>>>>> +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
>>>>>> +	int ret;
>>>>>> +
>>>>>> +	ret = clk_prepare_enable(ddata->clk);
>>>>>> +	if (ret)
>>>>>> +		dev_err(dev, "Failed to enable clk: %d\n", ret);
>>>>>> +
>>>>>> +	return ret;
>>>>>> +}
>>>>>> +
>>>>>> +static const struct dev_pm_ops stm32_dma3_pm_ops = {
>>>>>> +	SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
>>>>>> +	RUNTIME_PM_OPS(stm32_dma3_runtime_suspend, stm32_dma3_runtime_resume, NULL)
>>>>>> +};
>>>>>> +
>>>>>> +static struct platform_driver stm32_dma3_driver = {
>>>>>> +	.probe = stm32_dma3_probe,
>>>>>> +	.remove_new = stm32_dma3_remove,
>>>>>> +	.driver = {
>>>>>> +		.name = "stm32-dma3",
>>>>>> +		.of_match_table = stm32_dma3_of_match,
>>>>>> +		.pm = pm_ptr(&stm32_dma3_pm_ops),
>>>>>> +	},
>>>>>> +};
>>>>>> +
>>>>>> +static int __init stm32_dma3_init(void)
>>>>>> +{
>>>>>> +	return platform_driver_register(&stm32_dma3_driver);
>>>>>> +}
>>>>>> +
>>>>>> +subsys_initcall(stm32_dma3_init);
>>>>>> +
>>>>>> +MODULE_DESCRIPTION("STM32 DMA3 controller driver");
>>>>>> +MODULE_AUTHOR("Amelie Delaunay <amelie.delaunay@foss.st.com>");
>>>>>> +MODULE_LICENSE("GPL");
>>>>>> -- 
>>>>>> 2.25.1
>>>>>
>>>>
>>>> Regards,
>>>> Amelie

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 05/12] dmaengine: Add STM32 DMA3 support
  2024-05-15 18:56   ` Frank Li
@ 2024-05-16 15:25     ` Amelie Delaunay
  2024-05-16 17:09       ` Frank Li
  0 siblings, 1 reply; 29+ messages in thread
From: Amelie Delaunay @ 2024-05-16 15:25 UTC (permalink / raw)
  To: Frank Li
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue, dmaengine, devicetree,
	linux-stm32, linux-arm-kernel, linux-kernel, linux-hardening

On 5/15/24 20:56, Frank Li wrote:
> On Tue, Apr 23, 2024 at 02:32:55PM +0200, Amelie Delaunay wrote:
>> STM32 DMA3 driver supports the 3 hardware configurations of the STM32 DMA3
>> controller:
>> - LPDMA (Low Power): 4 channels, no FIFO
>> - GPDMA (General Purpose): 16 channels, FIFO from 8 to 32 bytes
>> - HPDMA (High Performance): 16 channels, FIFO from 8 to 256 bytes
>> Hardware configuration of the channels is retrieved from the hardware
>> configuration registers.
>> The client can specify its channel requirements through device tree.
>> STM32 DMA3 channels can be individually reserved either because they are
>> secure, or dedicated to another CPU.
>> Indeed, channel availability depends on the Resource Isolation Framework
>> (RIF) configuration. RIF grants access to buses with Compartment ID
>> (CID) filtering, secure and privilege level. It also assigns DMA channels
>> to one or several processors.
>> DMA channels used by Linux should be CID-filtered and either statically
>> assigned to CID1 or shared with other CPUs using the semaphore mode. In
>> case CID filtering is not configured, the dma-channel-mask property can be
>> used to specify the DMA channels available to the kernel, otherwise such
>> channels will be marked as reserved and can't be used by Linux.
>>
>> Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
>> ---
>>   drivers/dma/stm32/Kconfig      |   10 +
>>   drivers/dma/stm32/Makefile     |    1 +
>>   drivers/dma/stm32/stm32-dma3.c | 1431 ++++++++++++++++++++++++++++++++
>>   3 files changed, 1442 insertions(+)
>>   create mode 100644 drivers/dma/stm32/stm32-dma3.c
>>
>> diff --git a/drivers/dma/stm32/Kconfig b/drivers/dma/stm32/Kconfig
>> index b72ae1a4502f..4d8d8063133b 100644
>> --- a/drivers/dma/stm32/Kconfig
>> +++ b/drivers/dma/stm32/Kconfig
>> @@ -34,4 +34,14 @@ config STM32_MDMA
>>   	  If you have a board based on STM32 SoC with such DMA controller
>>   	  and want to use MDMA say Y here.
>>   
>> +config STM32_DMA3
>> +	tristate "STMicroelectronics STM32 DMA3 support"
>> +	select DMA_ENGINE
>> +	select DMA_VIRTUAL_CHANNELS
>> +	help
>> +	  Enable support for the on-chip DMA3 controller on STMicroelectronics
>> +	  STM32 platforms.
>> +	  If you have a board based on STM32 SoC with such DMA3 controller
>> +	  and want to use DMA3, say Y here.
>> +
>>   endif
>> diff --git a/drivers/dma/stm32/Makefile b/drivers/dma/stm32/Makefile
>> index 663a3896a881..5082db4b4c1c 100644
>> --- a/drivers/dma/stm32/Makefile
>> +++ b/drivers/dma/stm32/Makefile
>> @@ -2,3 +2,4 @@
>>   obj-$(CONFIG_STM32_DMA) += stm32-dma.o
>>   obj-$(CONFIG_STM32_DMAMUX) += stm32-dmamux.o
>>   obj-$(CONFIG_STM32_MDMA) += stm32-mdma.o
>> +obj-$(CONFIG_STM32_DMA3) += stm32-dma3.o
>> diff --git a/drivers/dma/stm32/stm32-dma3.c b/drivers/dma/stm32/stm32-dma3.c
>> new file mode 100644
>> index 000000000000..b5493f497d06
>> --- /dev/null
>> +++ b/drivers/dma/stm32/stm32-dma3.c
>> @@ -0,0 +1,1431 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * STM32 DMA3 controller driver
>> + *
>> + * Copyright (C) STMicroelectronics 2024
>> + * Author(s): Amelie Delaunay <amelie.delaunay@foss.st.com>
>> + */
>> +
>> +#include <linux/bitfield.h>
>> +#include <linux/clk.h>
>> +#include <linux/dma-mapping.h>
>> +#include <linux/dmaengine.h>
>> +#include <linux/dmapool.h>
>> +#include <linux/init.h>
>> +#include <linux/iopoll.h>
>> +#include <linux/list.h>
>> +#include <linux/module.h>
>> +#include <linux/of_dma.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/pm_runtime.h>
>> +#include <linux/reset.h>
>> +#include <linux/slab.h>
>> +
>> +#include "../virt-dma.h"
>> +
>> +#define STM32_DMA3_SECCFGR		0x00
>> +#define STM32_DMA3_PRIVCFGR		0x04
>> +#define STM32_DMA3_RCFGLOCKR		0x08
>> +#define STM32_DMA3_MISR			0x0C
>> +#define STM32_DMA3_SMISR		0x10
>> +
>> +#define STM32_DMA3_CLBAR(x)		(0x50 + 0x80 * (x))
>> +#define STM32_DMA3_CCIDCFGR(x)		(0x54 + 0x80 * (x))
>> +#define STM32_DMA3_CSEMCR(x)		(0x58 + 0x80 * (x))
>> +#define STM32_DMA3_CFCR(x)		(0x5C + 0x80 * (x))
>> +#define STM32_DMA3_CSR(x)		(0x60 + 0x80 * (x))
>> +#define STM32_DMA3_CCR(x)		(0x64 + 0x80 * (x))
>> +#define STM32_DMA3_CTR1(x)		(0x90 + 0x80 * (x))
>> +#define STM32_DMA3_CTR2(x)		(0x94 + 0x80 * (x))
>> +#define STM32_DMA3_CBR1(x)		(0x98 + 0x80 * (x))
>> +#define STM32_DMA3_CSAR(x)		(0x9C + 0x80 * (x))
>> +#define STM32_DMA3_CDAR(x)		(0xA0 + 0x80 * (x))
>> +#define STM32_DMA3_CLLR(x)		(0xCC + 0x80 * (x))
>> +
>> +#define STM32_DMA3_HWCFGR13		0xFC0 /* G_PER_CTRL(X) x=8..15 */
>> +#define STM32_DMA3_HWCFGR12		0xFC4 /* G_PER_CTRL(X) x=0..7 */
>> +#define STM32_DMA3_HWCFGR4		0xFE4 /* G_FIFO_SIZE(X) x=8..15 */
>> +#define STM32_DMA3_HWCFGR3		0xFE8 /* G_FIFO_SIZE(X) x=0..7 */
>> +#define STM32_DMA3_HWCFGR2		0xFEC /* G_MAX_REQ_ID */
>> +#define STM32_DMA3_HWCFGR1		0xFF0 /* G_MASTER_PORTS, G_NUM_CHANNELS, G_Mx_DATA_WIDTH */
>> +#define STM32_DMA3_VERR			0xFF4
>> +
>> +/* SECCFGR DMA secure configuration register */
>> +#define SECCFGR_SEC(x)			BIT(x)
>> +
>> +/* MISR DMA non-secure/secure masked interrupt status register */
>> +#define MISR_MIS(x)			BIT(x)
>> +
>> +/* CxLBAR DMA channel x linked_list base address register */
>> +#define CLBAR_LBA			GENMASK(31, 16)
>> +
>> +/* CxCIDCFGR DMA channel x CID register */
>> +#define CCIDCFGR_CFEN			BIT(0)
>> +#define CCIDCFGR_SEM_EN			BIT(1)
>> +#define CCIDCFGR_SCID			GENMASK(5, 4)
>> +#define CCIDCFGR_SEM_WLIST_CID0		BIT(16)
>> +#define CCIDCFGR_SEM_WLIST_CID1		BIT(17)
>> +#define CCIDCFGR_SEM_WLIST_CID2		BIT(18)
>> +
>> +enum ccidcfgr_cid {
>> +	CCIDCFGR_CID0,
>> +	CCIDCFGR_CID1,
>> +	CCIDCFGR_CID2,
>> +};
>> +
>> +/* CxSEMCR DMA channel x semaphore control register */
>> +#define CSEMCR_SEM_MUTEX		BIT(0)
>> +#define CSEMCR_SEM_CCID			GENMASK(5, 4)
>> +
>> +/* CxFCR DMA channel x flag clear register */
>> +#define CFCR_TCF			BIT(8)
>> +#define CFCR_HTF			BIT(9)
>> +#define CFCR_DTEF			BIT(10)
>> +#define CFCR_ULEF			BIT(11)
>> +#define CFCR_USEF			BIT(12)
>> +#define CFCR_SUSPF			BIT(13)
>> +
>> +/* CxSR DMA channel x status register */
>> +#define CSR_IDLEF			BIT(0)
>> +#define CSR_TCF				BIT(8)
>> +#define CSR_HTF				BIT(9)
>> +#define CSR_DTEF			BIT(10)
>> +#define CSR_ULEF			BIT(11)
>> +#define CSR_USEF			BIT(12)
>> +#define CSR_SUSPF			BIT(13)
>> +#define CSR_ALL_F			GENMASK(13, 8)
>> +#define CSR_FIFOL			GENMASK(24, 16)
>> +
>> +/* CxCR DMA channel x control register */
>> +#define CCR_EN				BIT(0)
>> +#define CCR_RESET			BIT(1)
>> +#define CCR_SUSP			BIT(2)
>> +#define CCR_TCIE			BIT(8)
>> +#define CCR_HTIE			BIT(9)
>> +#define CCR_DTEIE			BIT(10)
>> +#define CCR_ULEIE			BIT(11)
>> +#define CCR_USEIE			BIT(12)
>> +#define CCR_SUSPIE			BIT(13)
>> +#define CCR_ALLIE			GENMASK(13, 8)
>> +#define CCR_LSM				BIT(16)
>> +#define CCR_LAP				BIT(17)
>> +#define CCR_PRIO			GENMASK(23, 22)
>> +
>> +enum ccr_prio {
>> +	CCR_PRIO_LOW,
>> +	CCR_PRIO_MID,
>> +	CCR_PRIO_HIGH,
>> +	CCR_PRIO_VERY_HIGH,
>> +};
>> +
>> +/* CxTR1 DMA channel x transfer register 1 */
>> +#define CTR1_SINC			BIT(3)
>> +#define CTR1_SBL_1			GENMASK(9, 4)
>> +#define CTR1_DINC			BIT(19)
>> +#define CTR1_DBL_1			GENMASK(25, 20)
>> +#define CTR1_SDW_LOG2			GENMASK(1, 0)
>> +#define CTR1_PAM			GENMASK(12, 11)
>> +#define CTR1_SAP			BIT(14)
>> +#define CTR1_DDW_LOG2			GENMASK(17, 16)
>> +#define CTR1_DAP			BIT(30)
>> +
>> +enum ctr1_dw {
>> +	CTR1_DW_BYTE,
>> +	CTR1_DW_HWORD,
>> +	CTR1_DW_WORD,
>> +	CTR1_DW_DWORD, /* Depends on HWCFGR1.G_M0_DATA_WIDTH_ENC and .G_M1_DATA_WIDTH_ENC */
>> +};
>> +
>> +enum ctr1_pam {
>> +	CTR1_PAM_0S_LT, /* if DDW > SDW, padded with 0s else left-truncated */
>> +	CTR1_PAM_SE_RT, /* if DDW > SDW, sign extended else right-truncated */
>> +	CTR1_PAM_PACK_UNPACK, /* FIFO queued */
>> +};
>> +
>> +/* CxTR2 DMA channel x transfer register 2 */
>> +#define CTR2_REQSEL			GENMASK(7, 0)
>> +#define CTR2_SWREQ			BIT(9)
>> +#define CTR2_DREQ			BIT(10)
>> +#define CTR2_BREQ			BIT(11)
>> +#define CTR2_PFREQ			BIT(12)
>> +#define CTR2_TCEM			GENMASK(31, 30)
>> +
>> +enum ctr2_tcem {
>> +	CTR2_TCEM_BLOCK,
>> +	CTR2_TCEM_REPEAT_BLOCK,
>> +	CTR2_TCEM_LLI,
>> +	CTR2_TCEM_CHANNEL,
>> +};
>> +
>> +/* CxBR1 DMA channel x block register 1 */
>> +#define CBR1_BNDT			GENMASK(15, 0)
>> +
>> +/* CxLLR DMA channel x linked-list address register */
>> +#define CLLR_LA				GENMASK(15, 2)
>> +#define CLLR_ULL			BIT(16)
>> +#define CLLR_UDA			BIT(27)
>> +#define CLLR_USA			BIT(28)
>> +#define CLLR_UB1			BIT(29)
>> +#define CLLR_UT2			BIT(30)
>> +#define CLLR_UT1			BIT(31)
>> +
>> +/* HWCFGR13 DMA hardware configuration register 13 x=8..15 */
>> +/* HWCFGR12 DMA hardware configuration register 12 x=0..7 */
>> +#define G_PER_CTRL(x)			(ULL(0x1) << (4 * (x)))
>> +
>> +/* HWCFGR4 DMA hardware configuration register 4 x=8..15 */
>> +/* HWCFGR3 DMA hardware configuration register 3 x=0..7 */
>> +#define G_FIFO_SIZE(x)			(ULL(0x7) << (4 * (x)))
>> +
>> +#define get_chan_hwcfg(x, mask, reg)	(((reg) & (mask)) >> (4 * (x)))
>> +
>> +/* HWCFGR2 DMA hardware configuration register 2 */
>> +#define G_MAX_REQ_ID			GENMASK(7, 0)
>> +
>> +/* HWCFGR1 DMA hardware configuration register 1 */
>> +#define G_MASTER_PORTS			GENMASK(2, 0)
>> +#define G_NUM_CHANNELS			GENMASK(12, 8)
>> +#define G_M0_DATA_WIDTH_ENC		GENMASK(25, 24)
>> +#define G_M1_DATA_WIDTH_ENC		GENMASK(29, 28)
>> +
>> +enum stm32_dma3_master_ports {
>> +	AXI64, /* 1x AXI: 64-bit port 0 */
>> +	AHB32, /* 1x AHB: 32-bit port 0 */
>> +	AHB32_AHB32, /* 2x AHB: 32-bit port 0 and 32-bit port 1 */
>> +	AXI64_AHB32, /* 1x AXI 64-bit port 0 and 1x AHB 32-bit port 1 */
>> +	AXI64_AXI64, /* 2x AXI: 64-bit port 0 and 64-bit port 1 */
>> +	AXI128_AHB32, /* 1x AXI 128-bit port 0 and 1x AHB 32-bit port 1 */
>> +};
>> +
>> +enum stm32_dma3_port_data_width {
>> +	DW_32, /* 32-bit, for AHB */
>> +	DW_64, /* 64-bit, for AXI */
>> +	DW_128, /* 128-bit, for AXI */
>> +	DW_INVALID,
>> +};
>> +
>> +/* VERR DMA version register */
>> +#define VERR_MINREV			GENMASK(3, 0)
>> +#define VERR_MAJREV			GENMASK(7, 4)
>> +
>> +/* Device tree */
>> +/* struct stm32_dma3_dt_conf */
>> +/* .ch_conf */
>> +#define STM32_DMA3_DT_PRIO		GENMASK(1, 0) /* CCR_PRIO */
>> +#define STM32_DMA3_DT_FIFO		GENMASK(7, 4)
>> +/* .tr_conf */
>> +#define STM32_DMA3_DT_SINC		BIT(0) /* CTR1_SINC */
>> +#define STM32_DMA3_DT_SAP		BIT(1) /* CTR1_SAP */
>> +#define STM32_DMA3_DT_DINC		BIT(4) /* CTR1_DINC */
>> +#define STM32_DMA3_DT_DAP		BIT(5) /* CTR1_DAP */
>> +#define STM32_DMA3_DT_BREQ		BIT(8) /* CTR2_BREQ */
>> +#define STM32_DMA3_DT_PFREQ		BIT(9) /* CTR2_PFREQ */
>> +#define STM32_DMA3_DT_TCEM		GENMASK(13, 12) /* CTR2_TCEM */
>> +
>> +#define STM32_DMA3_MAX_BLOCK_SIZE	ALIGN_DOWN(CBR1_BNDT, 64)
>> +#define port_is_ahb(maxdw)		({ typeof(maxdw) (_maxdw) = (maxdw); \
>> +					   ((_maxdw) != DW_INVALID) && ((_maxdw) == DW_32); })
>> +#define port_is_axi(maxdw)		({ typeof(maxdw) (_maxdw) = (maxdw); \
>> +					   ((_maxdw) != DW_INVALID) && ((_maxdw) != DW_32); })
>> +#define get_chan_max_dw(maxdw, maxburst)((port_is_ahb(maxdw) ||			     \
>> +					  (maxburst) < DMA_SLAVE_BUSWIDTH_8_BYTES) ? \
>> +					 DMA_SLAVE_BUSWIDTH_4_BYTES : DMA_SLAVE_BUSWIDTH_8_BYTES)
>> +
>> +/* Static linked-list data structure (depends on update bits UT1/UT2/UB1/USA/UDA/ULL) */
>> +struct stm32_dma3_hwdesc {
>> +	u32 ctr1;
>> +	u32 ctr2;
>> +	u32 cbr1;
>> +	u32 csar;
>> +	u32 cdar;
>> +	u32 cllr;
>> +} __aligned(32);
>> +
>> +/*
>> + * CLLR_LA / sizeof(struct stm32_dma3_hwdesc) represents the number of hwdesc that can be addressed
>> + * by the pointer to the next linked-list data structure. The __aligned forces the 32-byte
>> + * alignment. So use hardcoded 32. Multiplied by the max block size of each item, it represents
>> + * the sg size limitation.
>> + */
>> +#define STM32_DMA3_MAX_SEG_SIZE		((CLLR_LA / 32) * STM32_DMA3_MAX_BLOCK_SIZE)
>> +
>> +/*
>> + * Linked-list items
>> + */
>> +struct stm32_dma3_lli {
>> +	struct stm32_dma3_hwdesc *hwdesc;
>> +	dma_addr_t hwdesc_addr;
>> +};
>> +
>> +struct stm32_dma3_swdesc {
>> +	struct virt_dma_desc vdesc;
>> +	u32 ccr;
>> +	bool cyclic;
>> +	u32 lli_size;
>> +	struct stm32_dma3_lli lli[] __counted_by(lli_size);
>> +};
>> +
>> +struct stm32_dma3_dt_conf {
>> +	u32 ch_id;
>> +	u32 req_line;
>> +	u32 ch_conf;
>> +	u32 tr_conf;
>> +};
>> +
>> +struct stm32_dma3_chan {
>> +	struct virt_dma_chan vchan;
>> +	u32 id;
>> +	int irq;
>> +	u32 fifo_size;
>> +	u32 max_burst;
>> +	bool semaphore_mode;
>> +	struct stm32_dma3_dt_conf dt_config;
>> +	struct dma_slave_config dma_config;
>> +	struct dma_pool *lli_pool;
>> +	struct stm32_dma3_swdesc *swdesc;
>> +	enum ctr2_tcem tcem;
>> +	u32 dma_status;
>> +};
>> +
>> +struct stm32_dma3_ddata {
>> +	struct dma_device dma_dev;
>> +	void __iomem *base;
>> +	struct clk *clk;
>> +	struct stm32_dma3_chan *chans;
>> +	u32 dma_channels;
>> +	u32 dma_requests;
>> +	enum stm32_dma3_port_data_width ports_max_dw[2];
>> +};
>> +
>> +static inline struct stm32_dma3_ddata *to_stm32_dma3_ddata(struct stm32_dma3_chan *chan)
>> +{
>> +	return container_of(chan->vchan.chan.device, struct stm32_dma3_ddata, dma_dev);
>> +}
>> +
>> +static inline struct stm32_dma3_chan *to_stm32_dma3_chan(struct dma_chan *c)
>> +{
>> +	return container_of(c, struct stm32_dma3_chan, vchan.chan);
>> +}
>> +
>> +static inline struct stm32_dma3_swdesc *to_stm32_dma3_swdesc(struct virt_dma_desc *vdesc)
>> +{
>> +	return container_of(vdesc, struct stm32_dma3_swdesc, vdesc);
>> +}
>> +
>> +static struct device *chan2dev(struct stm32_dma3_chan *chan)
>> +{
>> +	return &chan->vchan.chan.dev->device;
>> +}
>> +
>> +static void stm32_dma3_chan_dump_reg(struct stm32_dma3_chan *chan)
>> +{
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	struct device *dev = chan2dev(chan);
>> +	u32 id = chan->id, offset;
>> +
>> +	offset = STM32_DMA3_SECCFGR;
>> +	dev_dbg(dev, "SECCFGR(0x%03x): %08x\n", offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_PRIVCFGR;
>> +	dev_dbg(dev, "PRIVCFGR(0x%03x): %08x\n", offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CCIDCFGR(id);
>> +	dev_dbg(dev, "C%dCIDCFGR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CSEMCR(id);
>> +	dev_dbg(dev, "C%dSEMCR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CSR(id);
>> +	dev_dbg(dev, "C%dSR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CCR(id);
>> +	dev_dbg(dev, "C%dCR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CTR1(id);
>> +	dev_dbg(dev, "C%dTR1(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CTR2(id);
>> +	dev_dbg(dev, "C%dTR2(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CBR1(id);
>> +	dev_dbg(dev, "C%dBR1(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CSAR(id);
>> +	dev_dbg(dev, "C%dSAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CDAR(id);
>> +	dev_dbg(dev, "C%dDAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CLLR(id);
>> +	dev_dbg(dev, "C%dLLR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +	offset = STM32_DMA3_CLBAR(id);
>> +	dev_dbg(dev, "C%dLBAR(0x%03x): %08x\n", id, offset, readl_relaxed(ddata->base + offset));
>> +}
>> +
>> +static void stm32_dma3_chan_dump_hwdesc(struct stm32_dma3_chan *chan,
>> +					struct stm32_dma3_swdesc *swdesc)
>> +{
>> +	struct stm32_dma3_hwdesc *hwdesc;
>> +	int i;
>> +
>> +	for (i = 0; i < swdesc->lli_size; i++) {
>> +		hwdesc = swdesc->lli[i].hwdesc;
>> +		if (i)
>> +			dev_dbg(chan2dev(chan), "V\n");
>> +		dev_dbg(chan2dev(chan), "[%d]@%pad\n", i, &swdesc->lli[i].hwdesc_addr);
>> +		dev_dbg(chan2dev(chan), "| C%dTR1: %08x\n", chan->id, hwdesc->ctr1);
>> +		dev_dbg(chan2dev(chan), "| C%dTR2: %08x\n", chan->id, hwdesc->ctr2);
>> +		dev_dbg(chan2dev(chan), "| C%dBR1: %08x\n", chan->id, hwdesc->cbr1);
>> +		dev_dbg(chan2dev(chan), "| C%dSAR: %08x\n", chan->id, hwdesc->csar);
>> +		dev_dbg(chan2dev(chan), "| C%dDAR: %08x\n", chan->id, hwdesc->cdar);
>> +		dev_dbg(chan2dev(chan), "| C%dLLR: %08x\n", chan->id, hwdesc->cllr);
>> +	}
>> +
>> +	if (swdesc->cyclic) {
>> +		dev_dbg(chan2dev(chan), "|\n");
>> +		dev_dbg(chan2dev(chan), "-->[0]@%pad\n", &swdesc->lli[0].hwdesc_addr);
>> +	} else {
>> +		dev_dbg(chan2dev(chan), "X\n");
>> +	}
>> +}
>> +
>> +static struct stm32_dma3_swdesc *stm32_dma3_chan_desc_alloc(struct stm32_dma3_chan *chan, u32 count)
>> +{
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	struct stm32_dma3_swdesc *swdesc;
>> +	int i;
>> +
>> +	/*
>> +	 * If the memory to be allocated for the number of hwdesc (6 u32 members but 32-bytes
>> +	 * aligned) is greater than the maximum address of CLLR_LA, then the last items can't be
>> +	 * addressed, so abort the allocation.
>> +	 */
>> +	if ((count * 32) > CLLR_LA) {
>> +		dev_err(chan2dev(chan), "Transfer is too big (> %luB)\n", STM32_DMA3_MAX_SEG_SIZE);
>> +		return NULL;
>> +	}
>> +
>> +	swdesc = kzalloc(struct_size(swdesc, lli, count), GFP_NOWAIT);
>> +	if (!swdesc)
>> +		return NULL;
>> +
>> +	for (i = 0; i < count; i++) {
>> +		swdesc->lli[i].hwdesc = dma_pool_zalloc(chan->lli_pool, GFP_NOWAIT,
>> +							&swdesc->lli[i].hwdesc_addr);
>> +		if (!swdesc->lli[i].hwdesc)
>> +			goto err_pool_free;
>> +	}
>> +	swdesc->lli_size = count;
>> +	swdesc->ccr = 0;
>> +
>> +	/* Set LL base address */
>> +	writel_relaxed(swdesc->lli[0].hwdesc_addr & CLBAR_LBA,
>> +		       ddata->base + STM32_DMA3_CLBAR(chan->id));
>> +
>> +	/* Set LL allocated port */
>> +	swdesc->ccr &= ~CCR_LAP;
>> +
>> +	return swdesc;
>> +
>> +err_pool_free:
>> +	dev_err(chan2dev(chan), "Failed to alloc descriptors\n");
>> +	while (--i >= 0)
>> +		dma_pool_free(chan->lli_pool, swdesc->lli[i].hwdesc, swdesc->lli[i].hwdesc_addr);
>> +	kfree(swdesc);
>> +
>> +	return NULL;
>> +}
>> +
>> +static void stm32_dma3_chan_desc_free(struct stm32_dma3_chan *chan,
>> +				      struct stm32_dma3_swdesc *swdesc)
>> +{
>> +	int i;
>> +
>> +	for (i = 0; i < swdesc->lli_size; i++)
>> +		dma_pool_free(chan->lli_pool, swdesc->lli[i].hwdesc, swdesc->lli[i].hwdesc_addr);
>> +
>> +	kfree(swdesc);
>> +}
>> +
>> +static void stm32_dma3_chan_vdesc_free(struct virt_dma_desc *vdesc)
>> +{
>> +	struct stm32_dma3_swdesc *swdesc = to_stm32_dma3_swdesc(vdesc);
>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(vdesc->tx.chan);
>> +
>> +	stm32_dma3_chan_desc_free(chan, swdesc);
>> +}
>> +
>> +static void stm32_dma3_check_user_setting(struct stm32_dma3_chan *chan)
>> +{
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	struct device *dev = chan2dev(chan);
>> +	u32 ctr1 = readl_relaxed(ddata->base + STM32_DMA3_CTR1(chan->id));
>> +	u32 cbr1 = readl_relaxed(ddata->base + STM32_DMA3_CBR1(chan->id));
>> +	u32 csar = readl_relaxed(ddata->base + STM32_DMA3_CSAR(chan->id));
>> +	u32 cdar = readl_relaxed(ddata->base + STM32_DMA3_CDAR(chan->id));
>> +	u32 cllr = readl_relaxed(ddata->base + STM32_DMA3_CLLR(chan->id));
>> +	u32 bndt = FIELD_GET(CBR1_BNDT, cbr1);
>> +	u32 sdw = 1 << FIELD_GET(CTR1_SDW_LOG2, ctr1);
>> +	u32 ddw = 1 << FIELD_GET(CTR1_DDW_LOG2, ctr1);
>> +	u32 sap = FIELD_GET(CTR1_SAP, ctr1);
>> +	u32 dap = FIELD_GET(CTR1_DAP, ctr1);
>> +
>> +	if (!bndt && !FIELD_GET(CLLR_UB1, cllr))
>> +		dev_err(dev, "null source block size and no update of this value\n");
>> +	if (bndt % sdw)
>> +		dev_err(dev, "source block size not multiple of src data width\n");
>> +	if (FIELD_GET(CTR1_PAM, ctr1) == CTR1_PAM_PACK_UNPACK && bndt % ddw)
>> +		dev_err(dev, "(un)packing mode w/ src block size not multiple of dst data width\n");
>> +	if (csar % sdw)
>> +		dev_err(dev, "unaligned source address not multiple of src data width\n");
>> +	if (cdar % ddw)
>> +		dev_err(dev, "unaligned destination address not multiple of dst data width\n");
>> +	if (sdw == DMA_SLAVE_BUSWIDTH_8_BYTES && port_is_ahb(ddata->ports_max_dw[sap]))
>> +		dev_err(dev, "double-word source data width not supported on port %u\n", sap);
>> +	if (ddw == DMA_SLAVE_BUSWIDTH_8_BYTES && port_is_ahb(ddata->ports_max_dw[dap]))
>> +		dev_err(dev, "double-word destination data width not supported on port %u\n", dap);
>> +}
>> +
>> +static void stm32_dma3_chan_prep_hwdesc(struct stm32_dma3_chan *chan,
>> +					struct stm32_dma3_swdesc *swdesc,
>> +					u32 curr, dma_addr_t src, dma_addr_t dst, u32 len,
>> +					u32 ctr1, u32 ctr2, bool is_last, bool is_cyclic)
>> +{
>> +	struct stm32_dma3_hwdesc *hwdesc;
>> +	dma_addr_t next_lli;
>> +	u32 next = curr + 1;
>> +
>> +	hwdesc = swdesc->lli[curr].hwdesc;
>> +	hwdesc->ctr1 = ctr1;
>> +	hwdesc->ctr2 = ctr2;
>> +	hwdesc->cbr1 = FIELD_PREP(CBR1_BNDT, len);
>> +	hwdesc->csar = src;
>> +	hwdesc->cdar = dst;
>> +
>> +	if (is_last) {
>> +		if (is_cyclic)
>> +			next_lli = swdesc->lli[0].hwdesc_addr;
>> +		else
>> +			next_lli = 0;
>> +	} else {
>> +		next_lli = swdesc->lli[next].hwdesc_addr;
>> +	}
>> +
>> +	hwdesc->cllr = 0;
>> +	if (next_lli) {
>> +		hwdesc->cllr |= CLLR_UT1 | CLLR_UT2 | CLLR_UB1;
>> +		hwdesc->cllr |= CLLR_USA | CLLR_UDA | CLLR_ULL;
>> +		hwdesc->cllr |= (next_lli & CLLR_LA);
>> +	}
>> +}
>> +
>> +static enum dma_slave_buswidth stm32_dma3_get_max_dw(u32 chan_max_burst,
>> +						     enum stm32_dma3_port_data_width port_max_dw,
>> +						     u32 len, dma_addr_t addr)
>> +{
>> +	enum dma_slave_buswidth max_dw = get_chan_max_dw(port_max_dw, chan_max_burst);
>> +
>> +	/* len and addr must be a multiple of dw */
>> +	return 1 << __ffs(len | addr | max_dw);
>> +}
>> +
>> +static u32 stm32_dma3_get_max_burst(u32 len, enum dma_slave_buswidth dw, u32 chan_max_burst)
>> +{
>> +	u32 max_burst = chan_max_burst ? chan_max_burst / dw : 1;
>> +
>> +	/* len is a multiple of dw, so if len is < chan_max_burst, shorten burst */
>> +	if (len < chan_max_burst)
>> +		max_burst = len / dw;
>> +
>> +	/*
>> +	 * HW doesn't modify the burst if burst size <= half of the fifo size.
>> +	 * If len is not a multiple of burst size, last burst is shortened by HW.
>> +	 */
>> +	return max_burst;
>> +}
>> +
>> +static int stm32_dma3_chan_prep_hw(struct stm32_dma3_chan *chan, enum dma_transfer_direction dir,
>> +				   u32 *ccr, u32 *ctr1, u32 *ctr2,
>> +				   dma_addr_t src_addr, dma_addr_t dst_addr, u32 len)
>> +{
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	struct dma_device dma_device = ddata->dma_dev;
>> +	u32 sdw, ddw, sbl_max, dbl_max, tcem;
>> +	u32 _ctr1 = 0, _ctr2 = 0;
>> +	u32 ch_conf = chan->dt_config.ch_conf;
>> +	u32 tr_conf = chan->dt_config.tr_conf;
>> +	u32 sap = FIELD_GET(STM32_DMA3_DT_SAP, tr_conf), sap_max_dw;
>> +	u32 dap = FIELD_GET(STM32_DMA3_DT_DAP, tr_conf), dap_max_dw;
>> +
>> +	dev_dbg(chan2dev(chan), "%s from %pad to %pad\n",
>> +		dmaengine_get_direction_text(dir), &src_addr, &dst_addr);
>> +
>> +	sdw = chan->dma_config.src_addr_width ? : get_chan_max_dw(sap, chan->max_burst);
>> +	ddw = chan->dma_config.dst_addr_width ? : get_chan_max_dw(dap, chan->max_burst);
>> +	sbl_max = chan->dma_config.src_maxburst ? : 1;
>> +	dbl_max = chan->dma_config.dst_maxburst ? : 1;
>> +
>> +	/* Following conditions would raise User Setting Error interrupt */
>> +	if (!(dma_device.src_addr_widths & BIT(sdw)) || !(dma_device.dst_addr_widths & BIT(ddw))) {
>> +		dev_err(chan2dev(chan), "Bus width (src=%u, dst=%u) not supported\n", sdw, ddw);
>> +		return -EINVAL;
>> +	}
>> +
>> +	if (ddata->ports_max_dw[1] == DW_INVALID && (sap || dap)) {
>> +		dev_err(chan2dev(chan), "Only one master port, port 1 is not supported\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	sap_max_dw = ddata->ports_max_dw[sap];
>> +	dap_max_dw = ddata->ports_max_dw[dap];
>> +	if ((port_is_ahb(sap_max_dw) && sdw == DMA_SLAVE_BUSWIDTH_8_BYTES) ||
>> +	    (port_is_ahb(dap_max_dw) && ddw == DMA_SLAVE_BUSWIDTH_8_BYTES)) {
>> +		dev_err(chan2dev(chan),
>> +			"8 bytes buswidth (src=%u, dst=%u) not supported on port (sap=%u, dap=%u)\n",
>> +			sdw, ddw, sap, dap);
>> +		return -EINVAL;
>> +	}
>> +
>> +	if (FIELD_GET(STM32_DMA3_DT_SINC, tr_conf))
>> +		_ctr1 |= CTR1_SINC;
>> +	if (sap)
>> +		_ctr1 |= CTR1_SAP;
>> +	if (FIELD_GET(STM32_DMA3_DT_DINC, tr_conf))
>> +		_ctr1 |= CTR1_DINC;
>> +	if (dap)
>> +		_ctr1 |= CTR1_DAP;
>> +
>> +	_ctr2 |= FIELD_PREP(CTR2_REQSEL, chan->dt_config.req_line) & ~CTR2_SWREQ;
>> +	if (FIELD_GET(STM32_DMA3_DT_BREQ, tr_conf))
>> +		_ctr2 |= CTR2_BREQ;
>> +	if (dir == DMA_DEV_TO_MEM && FIELD_GET(STM32_DMA3_DT_PFREQ, tr_conf))
>> +		_ctr2 |= CTR2_PFREQ;
>> +	tcem = FIELD_GET(STM32_DMA3_DT_TCEM, tr_conf);
>> +	_ctr2 |= FIELD_PREP(CTR2_TCEM, tcem);
>> +
>> +	/* Store TCEM to know on which event TC flag occurred */
>> +	chan->tcem = tcem;
>> +	/* Store direction for residue computation */
>> +	chan->dma_config.direction = dir;
>> +
>> +	switch (dir) {
>> +	case DMA_MEM_TO_DEV:
>> +		/* Set destination (device) data width and burst */
>> +		ddw = min_t(u32, ddw, stm32_dma3_get_max_dw(chan->max_burst, dap_max_dw,
>> +							    len, dst_addr));
>> +		dbl_max = min_t(u32, dbl_max, stm32_dma3_get_max_burst(len, ddw, chan->max_burst));
>> +
>> +		/* Set source (memory) data width and burst */
>> +		sdw = stm32_dma3_get_max_dw(chan->max_burst, sap_max_dw, len, src_addr);
>> +		sbl_max = stm32_dma3_get_max_burst(len, sdw, chan->max_burst);
>> +
>> +		_ctr1 |= FIELD_PREP(CTR1_SDW_LOG2, ilog2(sdw));
>> +		_ctr1 |= FIELD_PREP(CTR1_SBL_1, sbl_max - 1);
>> +		_ctr1 |= FIELD_PREP(CTR1_DDW_LOG2, ilog2(ddw));
>> +		_ctr1 |= FIELD_PREP(CTR1_DBL_1, dbl_max - 1);
>> +
>> +		if (ddw != sdw) {
>> +			_ctr1 |= FIELD_PREP(CTR1_PAM, CTR1_PAM_PACK_UNPACK);
>> +			/* Should never reach this case as ddw is clamped down */
>> +			if (len & (ddw - 1)) {
>> +				dev_err(chan2dev(chan),
>> +					"Packing mode is enabled and len is not multiple of ddw\n");
>> +				return -EINVAL;
>> +			}
>> +		}
>> +
>> +		/* dst = dev */
>> +		_ctr2 |= CTR2_DREQ;
>> +
>> +		break;
>> +
>> +	case DMA_DEV_TO_MEM:
>> +		/* Set source (device) data width and burst */
>> +		sdw = min_t(u32, sdw, stm32_dma3_get_max_dw(chan->max_burst, sap_max_dw,
>> +							    len, src_addr));
>> +		sbl_max = min_t(u32, sbl_max, stm32_dma3_get_max_burst(len, sdw, chan->max_burst));
>> +
>> +		/* Set destination (memory) data width and burst */
>> +		ddw = stm32_dma3_get_max_dw(chan->max_burst, dap_max_dw, len, dst_addr);
>> +		dbl_max = stm32_dma3_get_max_burst(len, ddw, chan->max_burst);
>> +
>> +		_ctr1 |= FIELD_PREP(CTR1_SDW_LOG2, ilog2(sdw));
>> +		_ctr1 |= FIELD_PREP(CTR1_SBL_1, sbl_max - 1);
>> +		_ctr1 |= FIELD_PREP(CTR1_DDW_LOG2, ilog2(ddw));
>> +		_ctr1 |= FIELD_PREP(CTR1_DBL_1, dbl_max - 1);
>> +
>> +		if (ddw != sdw) {
>> +			_ctr1 |= FIELD_PREP(CTR1_PAM, CTR1_PAM_PACK_UNPACK);
>> +			/* Should never reach this case as ddw is clamped down */
>> +			if (len & (ddw - 1)) {
>> +				dev_err(chan2dev(chan),
>> +					"Packing mode is enabled and len is not multiple of ddw\n");
>> +				return -EINVAL;
>> +			}
>> +		}
>> +
>> +		/* dst = mem */
>> +		_ctr2 &= ~CTR2_DREQ;
>> +
>> +		break;
>> +
>> +	default:
>> +		dev_err(chan2dev(chan), "Direction %s not supported\n",
>> +			dmaengine_get_direction_text(dir));
>> +		return -EINVAL;
>> +	}
>> +
>> +	*ccr |= FIELD_PREP(CCR_PRIO, FIELD_GET(STM32_DMA3_DT_PRIO, ch_conf));
>> +	*ctr1 = _ctr1;
>> +	*ctr2 = _ctr2;
>> +
>> +	dev_dbg(chan2dev(chan), "%s: sdw=%u bytes sbl=%u beats ddw=%u bytes dbl=%u beats\n",
>> +		__func__, sdw, sbl_max, ddw, dbl_max);
>> +
>> +	return 0;
>> +}
>> +
>> +static void stm32_dma3_chan_start(struct stm32_dma3_chan *chan)
>> +{
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	struct virt_dma_desc *vdesc;
>> +	struct stm32_dma3_hwdesc *hwdesc;
>> +	u32 id = chan->id;
>> +	u32 csr, ccr;
>> +
>> +	vdesc = vchan_next_desc(&chan->vchan);
>> +	if (!vdesc) {
>> +		chan->swdesc = NULL;
>> +		return;
>> +	}
>> +	list_del(&vdesc->node);
>> +
>> +	chan->swdesc = to_stm32_dma3_swdesc(vdesc);
>> +	hwdesc = chan->swdesc->lli[0].hwdesc;
>> +
>> +	stm32_dma3_chan_dump_hwdesc(chan, chan->swdesc);
>> +
>> +	writel_relaxed(chan->swdesc->ccr, ddata->base + STM32_DMA3_CCR(id));
>> +	writel_relaxed(hwdesc->ctr1, ddata->base + STM32_DMA3_CTR1(id));
>> +	writel_relaxed(hwdesc->ctr2, ddata->base + STM32_DMA3_CTR2(id));
>> +	writel_relaxed(hwdesc->cbr1, ddata->base + STM32_DMA3_CBR1(id));
>> +	writel_relaxed(hwdesc->csar, ddata->base + STM32_DMA3_CSAR(id));
>> +	writel_relaxed(hwdesc->cdar, ddata->base + STM32_DMA3_CDAR(id));
>> +	writel_relaxed(hwdesc->cllr, ddata->base + STM32_DMA3_CLLR(id));
>> +
>> +	/* Clear any pending interrupts */
>> +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(id));
>> +	if (csr & CSR_ALL_F)
>> +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(id));
>> +
>> +	stm32_dma3_chan_dump_reg(chan);
>> +
>> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(id));
>> +	writel_relaxed(ccr | CCR_EN, ddata->base + STM32_DMA3_CCR(id));
> 
> This one should use writel instead of writel_relaxed because it needs
> dma_wmb() as a barrier for the previous write to complete.
> 
> Frank
> 

ddata->base is mapped as Device memory type thanks to ioremap(), so 
accesses to it are strongly ordered and non-cacheable.
DMA3 is outside the CPU cluster; its registers are accessible through 
the AHB bus.
dma_wmb() (in case writel were used instead of writel_relaxed) is 
useless in that case: it won't ensure the propagation on the bus is 
complete, and it will have an impact on the system.
That's why the CCR register is written once, then read back before 
CCR_EN is set and CCR is written again, with _relaxed(): the registers 
are behind a bus and ioremapped with Device memory type, which ensures 
accesses are strongly ordered and non-cacheable.
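
To make the pattern explicit, a minimal sketch with a made-up register
offset (the ex_* names are not the driver code; only readl_relaxed(),
writel_relaxed() and __iomem are real kernel pieces):

#define EX_CCR		0x00
#define EX_CCR_EN	0x1

/* base comes from ioremap(), so it is mapped as Device memory type */
static void ex_configure_then_enable(void __iomem *base, u32 ccr)
{
	writel_relaxed(ccr, base + EX_CCR);
	/* Accesses to the same peripheral stay ordered: when this read
	 * returns, the previous register writes have reached the device. */
	ccr = readl_relaxed(base + EX_CCR);
	writel_relaxed(ccr | EX_CCR_EN, base + EX_CCR);
}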

>> +
>> +	chan->dma_status = DMA_IN_PROGRESS;
>> +
>> +	dev_dbg(chan2dev(chan), "vchan %pK: started\n", &chan->vchan);
>> +}
>> +
>> +static int stm32_dma3_chan_suspend(struct stm32_dma3_chan *chan, bool susp)
>> +{
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	u32 csr, ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
>> +	int ret = 0;
>> +
>> +	if (susp)
>> +		ccr |= CCR_SUSP;
>> +	else
>> +		ccr &= ~CCR_SUSP;
>> +
>> +	writel_relaxed(ccr, ddata->base + STM32_DMA3_CCR(chan->id));
>> +
>> +	if (susp) {
>> +		ret = readl_relaxed_poll_timeout_atomic(ddata->base + STM32_DMA3_CSR(chan->id), csr,
>> +							csr & CSR_SUSPF, 1, 10);
>> +		if (!ret)
>> +			writel_relaxed(CFCR_SUSPF, ddata->base + STM32_DMA3_CFCR(chan->id));
>> +
>> +		stm32_dma3_chan_dump_reg(chan);
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +static void stm32_dma3_chan_reset(struct stm32_dma3_chan *chan)
>> +{
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	u32 ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
>> +
>> +	writel_relaxed(ccr |= CCR_RESET, ddata->base + STM32_DMA3_CCR(chan->id));
>> +}
>> +
>> +static int stm32_dma3_chan_stop(struct stm32_dma3_chan *chan)
>> +{
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	u32 ccr;
>> +	int ret = 0;
>> +
>> +	chan->dma_status = DMA_COMPLETE;
>> +
>> +	/* Disable interrupts */
>> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id));
>> +	writel_relaxed(ccr & ~(CCR_ALLIE | CCR_EN), ddata->base + STM32_DMA3_CCR(chan->id));
>> +
>> +	if (!(ccr & CCR_SUSP) && (ccr & CCR_EN)) {
>> +		/* Suspend the channel */
>> +		ret = stm32_dma3_chan_suspend(chan, true);
>> +		if (ret)
>> +			dev_warn(chan2dev(chan), "%s: timeout, data might be lost\n", __func__);
>> +	}
>> +
>> +	/*
>> +	 * Reset the channel: this causes the reset of the FIFO and the reset of the channel
>> +	 * internal state, the reset of CCR_EN and CCR_SUSP bits.
>> +	 */
>> +	stm32_dma3_chan_reset(chan);
>> +
>> +	return ret;
>> +}
>> +
>> +static void stm32_dma3_chan_complete(struct stm32_dma3_chan *chan)
>> +{
>> +	if (!chan->swdesc)
>> +		return;
>> +
>> +	vchan_cookie_complete(&chan->swdesc->vdesc);
>> +	chan->swdesc = NULL;
>> +	stm32_dma3_chan_start(chan);
>> +}
>> +
>> +static irqreturn_t stm32_dma3_chan_irq(int irq, void *devid)
>> +{
>> +	struct stm32_dma3_chan *chan = devid;
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	u32 misr, csr, ccr;
>> +
>> +	spin_lock(&chan->vchan.lock);
>> +
>> +	misr = readl_relaxed(ddata->base + STM32_DMA3_MISR);
>> +	if (!(misr & MISR_MIS(chan->id))) {
>> +		spin_unlock(&chan->vchan.lock);
>> +		return IRQ_NONE;
>> +	}
>> +
>> +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(chan->id));
>> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & CCR_ALLIE;
>> +
>> +	if (csr & CSR_TCF && ccr & CCR_TCIE) {
>> +		if (chan->swdesc->cyclic)
>> +			vchan_cyclic_callback(&chan->swdesc->vdesc);
>> +		else
>> +			stm32_dma3_chan_complete(chan);
>> +	}
>> +
>> +	if (csr & CSR_USEF && ccr & CCR_USEIE) {
>> +		dev_err(chan2dev(chan), "User setting error\n");
>> +		chan->dma_status = DMA_ERROR;
>> +		/* CCR.EN automatically cleared by HW */
>> +		stm32_dma3_check_user_setting(chan);
>> +		stm32_dma3_chan_reset(chan);
>> +	}
>> +
>> +	if (csr & CSR_ULEF && ccr & CCR_ULEIE) {
>> +		dev_err(chan2dev(chan), "Update link transfer error\n");
>> +		chan->dma_status = DMA_ERROR;
>> +		/* CCR.EN automatically cleared by HW */
>> +		stm32_dma3_chan_reset(chan);
>> +	}
>> +
>> +	if (csr & CSR_DTEF && ccr & CCR_DTEIE) {
>> +		dev_err(chan2dev(chan), "Data transfer error\n");
>> +		chan->dma_status = DMA_ERROR;
>> +		/* CCR.EN automatically cleared by HW */
>> +		stm32_dma3_chan_reset(chan);
>> +	}
>> +
>> +	/*
>> +	 * Half Transfer Interrupt may be disabled but Half Transfer Flag can be set,
>> +	 * ensure HTF flag to be cleared, with other flags.
>> +	 */
>> +	csr &= (ccr | CCR_HTIE);
>> +
>> +	if (csr)
>> +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(chan->id));
>> +
>> +	spin_unlock(&chan->vchan.lock);
>> +
>> +	return IRQ_HANDLED;
>> +}
>> +
>> +static int stm32_dma3_alloc_chan_resources(struct dma_chan *c)
>> +{
>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	u32 id = chan->id, csemcr, ccid;
>> +	int ret;
>> +
>> +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
>> +	if (ret < 0)
>> +		return ret;
> 
> It is not preferable to do the runtime PM get when allocating a DMA
> channel: many client drivers don't actually use the DMA when they
> allocate the channel.
> 
> Ideally, do the resume/get in issue_pending. Please refer to pl330.c.
> 
> You may add runtime PM later, after this enablement patch.
> 
> Frank
> 

To keep clock enable/disable well balanced, if pm_runtime_resume_and_get() 
(rather than pm_runtime_get_sync(), which doesn't decrement the usage 
counter in case of error) is used in issue_pending, it means 
pm_runtime_put_sync() should be done when the transfer ends.

terminate_all is not always called, so put_sync can't be used only 
there: it should be conditionally used in terminate_all, but also in 
the interrupt handler, on error events and on the transfer completion 
event, provided it is the last transfer complete event (last item of 
the linked-list).

For clients with a high transfer rate, it means a lot of clock 
enables/disables. Moreover, the DMA3 clock is managed by the Secure OS, 
so it also means a lot of non-secure/secure world transitions.

I prefer to keep the implementation as it is for now, and possibly 
propose a runtime PM improvement later, with autosuspend.

Amelie

>> +
>> +	/* Ensure the channel is free */
>> +	if (chan->semaphore_mode &&
>> +	    readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id)) & CSEMCR_SEM_MUTEX) {
>> +		ret = -EBUSY;
>> +		goto err_put_sync;
>> +	}
>> +
>> +	chan->lli_pool = dmam_pool_create(dev_name(&c->dev->device), c->device->dev,
>> +					  sizeof(struct stm32_dma3_hwdesc),
>> +					  __alignof__(struct stm32_dma3_hwdesc), 0);
>> +	if (!chan->lli_pool) {
>> +		dev_err(chan2dev(chan), "Failed to create LLI pool\n");
>> +		ret = -ENOMEM;
>> +		goto err_put_sync;
>> +	}
>> +
>> +	/* Take the channel semaphore */
>> +	if (chan->semaphore_mode) {
>> +		writel_relaxed(CSEMCR_SEM_MUTEX, ddata->base + STM32_DMA3_CSEMCR(id));
>> +		csemcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(id));
>> +		ccid = FIELD_GET(CSEMCR_SEM_CCID, csemcr);
>> +		/* Check that the channel is well taken */
>> +		if (ccid != CCIDCFGR_CID1) {
>> +			dev_err(chan2dev(chan), "Not under CID1 control (in-use by CID%d)\n", ccid);
>> +			ret = -EPERM;
>> +			goto err_pool_destroy;
>> +		}
>> +		dev_dbg(chan2dev(chan), "Under CID1 control (semcr=0x%08x)\n", csemcr);
>> +	}
>> +
>> +	return 0;
>> +
>> +err_pool_destroy:
>> +	dmam_pool_destroy(chan->lli_pool);
>> +	chan->lli_pool = NULL;
>> +
>> +err_put_sync:
>> +	pm_runtime_put_sync(ddata->dma_dev.dev);
>> +
>> +	return ret;
>> +}
>> +
>> +static void stm32_dma3_free_chan_resources(struct dma_chan *c)
>> +{
>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	unsigned long flags;
>> +
>> +	/* Ensure channel is in idle state */
>> +	spin_lock_irqsave(&chan->vchan.lock, flags);
>> +	stm32_dma3_chan_stop(chan);
>> +	chan->swdesc = NULL;
>> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
>> +
>> +	vchan_free_chan_resources(to_virt_chan(c));
>> +
>> +	dmam_pool_destroy(chan->lli_pool);
>> +	chan->lli_pool = NULL;
>> +
>> +	/* Release the channel semaphore */
>> +	if (chan->semaphore_mode)
>> +		writel_relaxed(0, ddata->base + STM32_DMA3_CSEMCR(chan->id));
>> +
>> +	pm_runtime_put_sync(ddata->dma_dev.dev);
>> +
>> +	/* Reset configuration */
>> +	memset(&chan->dt_config, 0, sizeof(chan->dt_config));
>> +	memset(&chan->dma_config, 0, sizeof(chan->dma_config));
>> +}
>> +
>> +static struct dma_async_tx_descriptor *stm32_dma3_prep_slave_sg(struct dma_chan *c,
>> +								struct scatterlist *sgl,
>> +								unsigned int sg_len,
>> +								enum dma_transfer_direction dir,
>> +								unsigned long flags, void *context)
>> +{
>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>> +	struct stm32_dma3_swdesc *swdesc;
>> +	struct scatterlist *sg;
>> +	size_t len;
>> +	dma_addr_t sg_addr, dev_addr, src, dst;
>> +	u32 i, j, count, ctr1, ctr2;
>> +	int ret;
>> +
>> +	count = sg_len;
>> +	for_each_sg(sgl, sg, sg_len, i) {
>> +		len = sg_dma_len(sg);
>> +		if (len > STM32_DMA3_MAX_BLOCK_SIZE)
>> +			count += DIV_ROUND_UP(len, STM32_DMA3_MAX_BLOCK_SIZE) - 1;
>> +	}
>> +
>> +	swdesc = stm32_dma3_chan_desc_alloc(chan, count);
>> +	if (!swdesc)
>> +		return NULL;
>> +
>> +	/* sg_len and i correspond to the initial sgl; count and j correspond to the hwdesc LL */
>> +	j = 0;
>> +	for_each_sg(sgl, sg, sg_len, i) {
>> +		sg_addr = sg_dma_address(sg);
>> +		dev_addr = (dir == DMA_MEM_TO_DEV) ? chan->dma_config.dst_addr :
>> +						     chan->dma_config.src_addr;
>> +		len = sg_dma_len(sg);
>> +
>> +		do {
>> +			size_t chunk = min_t(size_t, len, STM32_DMA3_MAX_BLOCK_SIZE);
>> +
>> +			if (dir == DMA_MEM_TO_DEV) {
>> +				src = sg_addr;
>> +				dst = dev_addr;
>> +
>> +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
>> +							      src, dst, chunk);
>> +
>> +				if (FIELD_GET(CTR1_DINC, ctr1))
>> +					dev_addr += chunk;
>> +			} else { /* (dir == DMA_DEV_TO_MEM || dir == DMA_MEM_TO_MEM) */
>> +				src = dev_addr;
>> +				dst = sg_addr;
>> +
>> +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
>> +							      src, dst, chunk);
>> +
>> +				if (FIELD_GET(CTR1_SINC, ctr1))
>> +					dev_addr += chunk;
>> +			}
>> +
>> +			if (ret)
>> +				goto err_desc_free;
>> +
>> +			stm32_dma3_chan_prep_hwdesc(chan, swdesc, j, src, dst, chunk,
>> +						    ctr1, ctr2, j == (count - 1), false);
>> +
>> +			sg_addr += chunk;
>> +			len -= chunk;
>> +			j++;
>> +		} while (len);
>> +	}
>> +
>> +	/* Enable Error interrupts */
>> +	swdesc->ccr |= CCR_USEIE | CCR_ULEIE | CCR_DTEIE;
>> +	/* Enable Transfer state interrupts */
>> +	swdesc->ccr |= CCR_TCIE;
>> +
>> +	swdesc->cyclic = false;
>> +
>> +	return vchan_tx_prep(&chan->vchan, &swdesc->vdesc, flags);
>> +
>> +err_desc_free:
>> +	stm32_dma3_chan_desc_free(chan, swdesc);
>> +
>> +	return NULL;
>> +}
>> +
>> +static void stm32_dma3_caps(struct dma_chan *c, struct dma_slave_caps *caps)
>> +{
>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>> +
>> +	if (!chan->fifo_size) {
>> +		caps->max_burst = 0;
>> +		caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>> +		caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>> +	} else {
>> +		/* Burst transfer should not exceed half of the fifo size */
>> +		caps->max_burst = chan->max_burst;
>> +		if (caps->max_burst < DMA_SLAVE_BUSWIDTH_8_BYTES) {
>> +			caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>> +			caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>> +		}
>> +	}
>> +}
>> +
>> +static int stm32_dma3_config(struct dma_chan *c, struct dma_slave_config *config)
>> +{
>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>> +
>> +	memcpy(&chan->dma_config, config, sizeof(*config));
>> +
>> +	return 0;
>> +}
>> +
>> +static int stm32_dma3_terminate_all(struct dma_chan *c)
>> +{
>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>> +	unsigned long flags;
>> +	LIST_HEAD(head);
>> +
>> +	spin_lock_irqsave(&chan->vchan.lock, flags);
>> +
>> +	if (chan->swdesc) {
>> +		vchan_terminate_vdesc(&chan->swdesc->vdesc);
>> +		chan->swdesc = NULL;
>> +	}
>> +
>> +	stm32_dma3_chan_stop(chan);
>> +
>> +	vchan_get_all_descriptors(&chan->vchan, &head);
>> +
>> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
>> +	vchan_dma_desc_free_list(&chan->vchan, &head);
>> +
>> +	dev_dbg(chan2dev(chan), "vchan %pK: terminated\n", &chan->vchan);
>> +
>> +	return 0;
>> +}
>> +
>> +static void stm32_dma3_synchronize(struct dma_chan *c)
>> +{
>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>> +
>> +	vchan_synchronize(&chan->vchan);
>> +}
>> +
>> +static void stm32_dma3_issue_pending(struct dma_chan *c)
>> +{
>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&chan->vchan.lock, flags);
>> +
>> +	if (vchan_issue_pending(&chan->vchan) && !chan->swdesc) {
>> +		dev_dbg(chan2dev(chan), "vchan %pK: issued\n", &chan->vchan);
>> +		stm32_dma3_chan_start(chan);
>> +	}
>> +
>> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
>> +}
>> +
>> +static bool stm32_dma3_filter_fn(struct dma_chan *c, void *fn_param)
>> +{
>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>> +	struct stm32_dma3_dt_conf *conf = fn_param;
>> +	u32 mask, semcr;
>> +	int ret;
>> +
>> +	dev_dbg(c->device->dev, "%s(%s): req_line=%d ch_conf=%08x tr_conf=%08x\n",
>> +		__func__, dma_chan_name(c), conf->req_line, conf->ch_conf, conf->tr_conf);
>> +
>> +	if (!of_property_read_u32(c->device->dev->of_node, "dma-channel-mask", &mask))
>> +		if (!(mask & BIT(chan->id)))
>> +			return false;
>> +
>> +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
>> +	if (ret < 0)
>> +		return false;
>> +	semcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id));
>> +	pm_runtime_put_sync(ddata->dma_dev.dev);
>> +
>> +	/* Check if chan is free */
>> +	if (semcr & CSEMCR_SEM_MUTEX)
>> +		return false;
>> +
>> +	/* Check if chan fifo fits well */
>> +	if (FIELD_GET(STM32_DMA3_DT_FIFO, conf->ch_conf) != chan->fifo_size)
>> +		return false;
>> +
>> +	return true;
>> +}
>> +
>> +static struct dma_chan *stm32_dma3_of_xlate(struct of_phandle_args *dma_spec, struct of_dma *ofdma)
>> +{
>> +	struct stm32_dma3_ddata *ddata = ofdma->of_dma_data;
>> +	dma_cap_mask_t mask = ddata->dma_dev.cap_mask;
>> +	struct stm32_dma3_dt_conf conf;
>> +	struct stm32_dma3_chan *chan;
>> +	struct dma_chan *c;
>> +
>> +	if (dma_spec->args_count < 3) {
>> +		dev_err(ddata->dma_dev.dev, "Invalid args count\n");
>> +		return NULL;
>> +	}
>> +
>> +	conf.req_line = dma_spec->args[0];
>> +	conf.ch_conf = dma_spec->args[1];
>> +	conf.tr_conf = dma_spec->args[2];
>> +
>> +	if (conf.req_line >= ddata->dma_requests) {
>> +		dev_err(ddata->dma_dev.dev, "Invalid request line\n");
>> +		return NULL;
>> +	}
>> +
>> +	/* Request dma channel among the generic dma controller list */
>> +	c = dma_request_channel(mask, stm32_dma3_filter_fn, &conf);
>> +	if (!c) {
>> +		dev_err(ddata->dma_dev.dev, "No suitable channel found\n");
>> +		return NULL;
>> +	}
>> +
>> +	chan = to_stm32_dma3_chan(c);
>> +	chan->dt_config = conf;
>> +
>> +	return c;
>> +}
>> +
>> +static u32 stm32_dma3_check_rif(struct stm32_dma3_ddata *ddata)
>> +{
>> +	u32 chan_reserved, mask = 0, i, ccidcfgr, invalid_cid = 0;
>> +
>> +	/* Reserve Secure channels */
>> +	chan_reserved = readl_relaxed(ddata->base + STM32_DMA3_SECCFGR);
>> +
>> +	/*
>> +	 * CID filtering must be configured to ensure that the DMA3 channel will inherit the CID of
>> +	 * the processor which is configuring and using the given channel.
>> +	 * In case CID filtering is not configured, dma-channel-mask property can be used to
>> +	 * specify available DMA channels to the kernel.
>> +	 */
>> +	of_property_read_u32(ddata->dma_dev.dev->of_node, "dma-channel-mask", &mask);
>> +
>> +	/* Reserve !CID-filtered not in dma-channel-mask, static CID != CID1, CID1 not allowed */
>> +	for (i = 0; i < ddata->dma_channels; i++) {
>> +		ccidcfgr = readl_relaxed(ddata->base + STM32_DMA3_CCIDCFGR(i));
>> +
>> +		if (!(ccidcfgr & CCIDCFGR_CFEN)) { /* !CID-filtered */
>> +			invalid_cid |= BIT(i);
>> +			if (!(mask & BIT(i))) /* Not in dma-channel-mask */
>> +				chan_reserved |= BIT(i);
>> +		} else { /* CID-filtered */
>> +			if (!(ccidcfgr & CCIDCFGR_SEM_EN)) { /* Static CID mode */
>> +				if (FIELD_GET(CCIDCFGR_SCID, ccidcfgr) != CCIDCFGR_CID1)
>> +					chan_reserved |= BIT(i);
>> +			} else { /* Semaphore mode */
>> +				if (!FIELD_GET(CCIDCFGR_SEM_WLIST_CID1, ccidcfgr))
>> +					chan_reserved |= BIT(i);
>> +				ddata->chans[i].semaphore_mode = true;
>> +			}
>> +		}
>> +		dev_dbg(ddata->dma_dev.dev, "chan%d: %s mode, %s\n", i,
>> +			!(ccidcfgr & CCIDCFGR_CFEN) ? "!CID-filtered" :
>> +			ddata->chans[i].semaphore_mode ? "Semaphore" : "Static CID",
>> +			(chan_reserved & BIT(i)) ? "denied" :
>> +			mask & BIT(i) ? "force allowed" : "allowed");
>> +	}
>> +
>> +	if (invalid_cid)
>> +		dev_warn(ddata->dma_dev.dev, "chan%*pbl have invalid CID configuration\n",
>> +			 ddata->dma_channels, &invalid_cid);
>> +
>> +	return chan_reserved;
>> +}
>> +
>> +static const struct of_device_id stm32_dma3_of_match[] = {
>> +	{ .compatible = "st,stm32-dma3", },
>> +	{ /* sentinel */},
>> +};
>> +MODULE_DEVICE_TABLE(of, stm32_dma3_of_match);
>> +
>> +static int stm32_dma3_probe(struct platform_device *pdev)
>> +{
>> +	struct device_node *np = pdev->dev.of_node;
>> +	struct stm32_dma3_ddata *ddata;
>> +	struct reset_control *reset;
>> +	struct stm32_dma3_chan *chan;
>> +	struct dma_device *dma_dev;
>> +	u32 master_ports, chan_reserved, i, verr;
>> +	u64 hwcfgr;
>> +	int ret;
>> +
>> +	ddata = devm_kzalloc(&pdev->dev, sizeof(*ddata), GFP_KERNEL);
>> +	if (!ddata)
>> +		return -ENOMEM;
>> +	platform_set_drvdata(pdev, ddata);
>> +
>> +	dma_dev = &ddata->dma_dev;
>> +
>> +	ddata->base = devm_platform_ioremap_resource(pdev, 0);
>> +	if (IS_ERR(ddata->base))
>> +		return PTR_ERR(ddata->base);
>> +
>> +	ddata->clk = devm_clk_get(&pdev->dev, NULL);
>> +	if (IS_ERR(ddata->clk))
>> +		return dev_err_probe(&pdev->dev, PTR_ERR(ddata->clk), "Failed to get clk\n");
>> +
>> +	reset = devm_reset_control_get_optional(&pdev->dev, NULL);
>> +	if (IS_ERR(reset))
>> +		return dev_err_probe(&pdev->dev, PTR_ERR(reset), "Failed to get reset\n");
>> +
>> +	ret = clk_prepare_enable(ddata->clk);
>> +	if (ret)
>> +		return dev_err_probe(&pdev->dev, ret, "Failed to enable clk\n");
>> +
>> +	reset_control_reset(reset);
>> +
>> +	INIT_LIST_HEAD(&dma_dev->channels);
>> +
>> +	dma_cap_set(DMA_SLAVE, dma_dev->cap_mask);
>> +	dma_cap_set(DMA_PRIVATE, dma_dev->cap_mask);
>> +	dma_dev->dev = &pdev->dev;
>> +	/*
>> +	 * This controller supports up to 8-byte buswidth depending on the port used and the
>> +	 * channel, and can only access address at even boundaries, multiple of the buswidth.
>> +	 */
>> +	dma_dev->copy_align = DMAENGINE_ALIGN_8_BYTES;
>> +	dma_dev->src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
>> +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
>> +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
>> +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>> +	dma_dev->dst_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
>> +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
>> +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
>> +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>> +	dma_dev->directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV) | BIT(DMA_MEM_TO_MEM);
>> +
>> +	dma_dev->descriptor_reuse = true;
>> +	dma_dev->max_sg_burst = STM32_DMA3_MAX_SEG_SIZE;
>> +	dma_dev->residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
>> +	dma_dev->device_alloc_chan_resources = stm32_dma3_alloc_chan_resources;
>> +	dma_dev->device_free_chan_resources = stm32_dma3_free_chan_resources;
>> +	dma_dev->device_prep_slave_sg = stm32_dma3_prep_slave_sg;
>> +	dma_dev->device_caps = stm32_dma3_caps;
>> +	dma_dev->device_config = stm32_dma3_config;
>> +	dma_dev->device_terminate_all = stm32_dma3_terminate_all;
>> +	dma_dev->device_synchronize = stm32_dma3_synchronize;
>> +	dma_dev->device_tx_status = dma_cookie_status;
>> +	dma_dev->device_issue_pending = stm32_dma3_issue_pending;
>> +
>> +	/* if dma_channels is not modified, get it from hwcfgr1 */
>> +	if (of_property_read_u32(np, "dma-channels", &ddata->dma_channels)) {
>> +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
>> +		ddata->dma_channels = FIELD_GET(G_NUM_CHANNELS, hwcfgr);
>> +	}
>> +
>> +	/* if dma_requests is not modified, get it from hwcfgr2 */
>> +	if (of_property_read_u32(np, "dma-requests", &ddata->dma_requests)) {
>> +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR2);
>> +		ddata->dma_requests = FIELD_GET(G_MAX_REQ_ID, hwcfgr) + 1;
>> +	}
>> +
>> +	/* G_MASTER_PORTS, G_M0_DATA_WIDTH_ENC, G_M1_DATA_WIDTH_ENC in HWCFGR1 */
>> +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
>> +	master_ports = FIELD_GET(G_MASTER_PORTS, hwcfgr);
>> +
>> +	ddata->ports_max_dw[0] = FIELD_GET(G_M0_DATA_WIDTH_ENC, hwcfgr);
>> +	if (master_ports == AXI64 || master_ports == AHB32) /* Single master port */
>> +		ddata->ports_max_dw[1] = DW_INVALID;
>> +	else /* Dual master ports */
>> +		ddata->ports_max_dw[1] = FIELD_GET(G_M1_DATA_WIDTH_ENC, hwcfgr);
>> +
>> +	ddata->chans = devm_kcalloc(&pdev->dev, ddata->dma_channels, sizeof(*ddata->chans),
>> +				    GFP_KERNEL);
>> +	if (!ddata->chans) {
>> +		ret = -ENOMEM;
>> +		goto err_clk_disable;
>> +	}
>> +
>> +	chan_reserved = stm32_dma3_check_rif(ddata);
>> +
>> +	if (chan_reserved == GENMASK(ddata->dma_channels - 1, 0)) {
>> +		ret = -ENODEV;
>> +		dev_err_probe(&pdev->dev, ret, "No channel available, abort registration\n");
>> +		goto err_clk_disable;
>> +	}
>> +
>> +	/* G_FIFO_SIZE x=0..7 in HWCFGR3 and G_FIFO_SIZE x=8..15 in HWCFGR4 */
>> +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR3);
>> +	hwcfgr |= ((u64)readl_relaxed(ddata->base + STM32_DMA3_HWCFGR4)) << 32;
>> +
>> +	for (i = 0; i < ddata->dma_channels; i++) {
>> +		if (chan_reserved & BIT(i))
>> +			continue;
>> +
>> +		chan = &ddata->chans[i];
>> +		chan->id = i;
>> +		chan->fifo_size = get_chan_hwcfg(i, G_FIFO_SIZE(i), hwcfgr);
>> +		/* If chan->fifo_size > 0 then half of the fifo size, else no burst when no FIFO */
>> +		chan->max_burst = (chan->fifo_size) ? (1 << (chan->fifo_size + 1)) / 2 : 0;
>> +		chan->vchan.desc_free = stm32_dma3_chan_vdesc_free;
>> +
>> +		vchan_init(&chan->vchan, dma_dev);
>> +	}
>> +
>> +	ret = dmaenginem_async_device_register(dma_dev);
>> +	if (ret)
>> +		goto err_clk_disable;
>> +
>> +	for (i = 0; i < ddata->dma_channels; i++) {
>> +		if (chan_reserved & BIT(i))
>> +			continue;
>> +
>> +		ret = platform_get_irq(pdev, i);
>> +		if (ret < 0)
>> +			goto err_clk_disable;
>> +
>> +		chan = &ddata->chans[i];
>> +		chan->irq = ret;
>> +
>> +		ret = devm_request_irq(&pdev->dev, chan->irq, stm32_dma3_chan_irq, 0,
>> +				       dev_name(chan2dev(chan)), chan);
>> +		if (ret) {
>> +			dev_err_probe(&pdev->dev, ret, "Failed to request channel %s IRQ\n",
>> +				      dev_name(chan2dev(chan)));
>> +			goto err_clk_disable;
>> +		}
>> +	}
>> +
>> +	ret = of_dma_controller_register(np, stm32_dma3_of_xlate, ddata);
>> +	if (ret) {
>> +		dev_err_probe(&pdev->dev, ret, "Failed to register controller\n");
>> +		goto err_clk_disable;
>> +	}
>> +
>> +	verr = readl_relaxed(ddata->base + STM32_DMA3_VERR);
>> +
>> +	pm_runtime_set_active(&pdev->dev);
>> +	pm_runtime_enable(&pdev->dev);
>> +	pm_runtime_get_noresume(&pdev->dev);
>> +	pm_runtime_put(&pdev->dev);
>> +
>> +	dev_info(&pdev->dev, "STM32 DMA3 registered rev:%lu.%lu\n",
>> +		 FIELD_GET(VERR_MAJREV, verr), FIELD_GET(VERR_MINREV, verr));
>> +
>> +	return 0;
>> +
>> +err_clk_disable:
>> +	clk_disable_unprepare(ddata->clk);
>> +
>> +	return ret;
>> +}
>> +
>> +static void stm32_dma3_remove(struct platform_device *pdev)
>> +{
>> +	pm_runtime_disable(&pdev->dev);
>> +}
>> +
>> +static int stm32_dma3_runtime_suspend(struct device *dev)
>> +{
>> +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
>> +
>> +	clk_disable_unprepare(ddata->clk);
>> +
>> +	return 0;
>> +}
>> +
>> +static int stm32_dma3_runtime_resume(struct device *dev)
>> +{
>> +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
>> +	int ret;
>> +
>> +	ret = clk_prepare_enable(ddata->clk);
>> +	if (ret)
>> +		dev_err(dev, "Failed to enable clk: %d\n", ret);
>> +
>> +	return ret;
>> +}
>> +
>> +static const struct dev_pm_ops stm32_dma3_pm_ops = {
>> +	SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
>> +	RUNTIME_PM_OPS(stm32_dma3_runtime_suspend, stm32_dma3_runtime_resume, NULL)
>> +};
>> +
>> +static struct platform_driver stm32_dma3_driver = {
>> +	.probe = stm32_dma3_probe,
>> +	.remove_new = stm32_dma3_remove,
>> +	.driver = {
>> +		.name = "stm32-dma3",
>> +		.of_match_table = stm32_dma3_of_match,
>> +		.pm = pm_ptr(&stm32_dma3_pm_ops),
>> +	},
>> +};
>> +
>> +static int __init stm32_dma3_init(void)
>> +{
>> +	return platform_driver_register(&stm32_dma3_driver);
>> +}
>> +
>> +subsys_initcall(stm32_dma3_init);
>> +
>> +MODULE_DESCRIPTION("STM32 DMA3 controller driver");
>> +MODULE_AUTHOR("Amelie Delaunay <amelie.delaunay@foss.st.com>");
>> +MODULE_LICENSE("GPL");
>> -- 
>> 2.25.1
>>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 05/12] dmaengine: Add STM32 DMA3 support
  2024-05-16 15:25     ` Amelie Delaunay
@ 2024-05-16 17:09       ` Frank Li
  2024-05-17  9:42         ` Amelie Delaunay
  0 siblings, 1 reply; 29+ messages in thread
From: Frank Li @ 2024-05-16 17:09 UTC (permalink / raw)
  To: Amelie Delaunay
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue, dmaengine, devicetree,
	linux-stm32, linux-arm-kernel, linux-kernel, linux-hardening

On Thu, May 16, 2024 at 05:25:58PM +0200, Amelie Delaunay wrote:
> On 5/15/24 20:56, Frank Li wrote:
> > On Tue, Apr 23, 2024 at 02:32:55PM +0200, Amelie Delaunay wrote:
> > > STM32 DMA3 driver supports the 3 hardware configurations of the STM32 DMA3
> > > controller:
...
> > > +	writel_relaxed(hwdesc->cdar, ddata->base + STM32_DMA3_CDAR(id));
> > > +	writel_relaxed(hwdesc->cllr, ddata->base + STM32_DMA3_CLLR(id));
> > > +
> > > +	/* Clear any pending interrupts */
> > > +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(id));
> > > +	if (csr & CSR_ALL_F)
> > > +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(id));
> > > +
> > > +	stm32_dma3_chan_dump_reg(chan);
> > > +
> > > +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(id));
> > > +	writel_relaxed(ccr | CCR_EN, ddata->base + STM32_DMA3_CCR(id));
> > 
> > This one should use writel instead of writel_relaxed because it needs
> > dma_wmb() as a barrier to ensure the previous writes have completed.
> > 
> > Frank
> > 
> 
> ddata->base is mapped as Device memory type thanks to ioremap(), so accesses
> to it are strongly ordered and non-cacheable.
> DMA3 is outside the CPU cluster; its registers are accessible through the AHB bus.
> dma_wmb() (in case writel were used instead of writel_relaxed) is useless in
> that case: it won't ensure the propagation on the bus is complete, and it will
> have an impact on the system.
> That's why the CCR register is written once, then read back before CCR_EN is
> set and CCR is written again, with _relaxed(): the registers are behind a
> bus and ioremapped with Device memory type, which ensures accesses are
> strongly ordered and non-cacheable.

Regardless of the memory mapping, writel_relaxed() only makes sure I/O writes
and reads are ordered with respect to each other, not necessarily ordered with
other memory accesses. Only readl and writel guarantee ordering with other
memory reads/writes.

1. Write src_addr to the descriptor
2. dma_wmb()
3. Write "ready" to the descriptor
4. Enable the channel or ring the doorbell by writing a register

If 4 uses writel_relaxed(): because 3 is a write to DDR, which is a different
place than the MMIO, 4 may become visible before 3. You can refer to the AXI
ordering model.

4 has to use ONE writel(), to make sure 3 has already been written to DDR.

You need at least one writel() to make sure all the normal memory accesses
have finished.
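
A minimal sketch of that sequence (the ex_* names are made up; dma_wmb(),
writel() and __iomem are the real kernel pieces; the descriptor is assumed
to sit in coherent DMA memory in DDR):

#define EX_DESC_READY	0x1
#define EX_CH_CTRL	0x00
#define EX_CH_EN	0x1

struct ex_desc {
	u32 src_addr;
	u32 status;
};

static void ex_kick(struct ex_desc *desc, void __iomem *regs, u32 src)
{
	desc->src_addr = src;		/* 1. write the descriptor (DDR) */
	dma_wmb();			/* 2. order step 1 before step 3 */
	desc->status = EX_DESC_READY;	/* 3. mark it ready (DDR) */
	/* 4. doorbell: plain writel(), so steps 1-3 are ordered before
	 *    the MMIO write becomes visible to the device */
	writel(EX_CH_EN, regs + EX_CH_CTRL);
}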

> 
> > > +
> > > +	chan->dma_status = DMA_IN_PROGRESS;
> > > +
> > > +	dev_dbg(chan2dev(chan), "vchan %pK: started\n", &chan->vchan);
> > > +}
> > > +
> > > +static int stm32_dma3_chan_suspend(struct stm32_dma3_chan *chan, bool susp)
> > > +{
> > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > +	u32 csr, ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
> > > +	int ret = 0;
> > > +
> > > +	if (susp)
> > > +		ccr |= CCR_SUSP;
> > > +	else
> > > +		ccr &= ~CCR_SUSP;
> > > +
> > > +	writel_relaxed(ccr, ddata->base + STM32_DMA3_CCR(chan->id));
> > > +
> > > +	if (susp) {
> > > +		ret = readl_relaxed_poll_timeout_atomic(ddata->base + STM32_DMA3_CSR(chan->id), csr,
> > > +							csr & CSR_SUSPF, 1, 10);
> > > +		if (!ret)
> > > +			writel_relaxed(CFCR_SUSPF, ddata->base + STM32_DMA3_CFCR(chan->id));
> > > +
> > > +		stm32_dma3_chan_dump_reg(chan);
> > > +	}
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +static void stm32_dma3_chan_reset(struct stm32_dma3_chan *chan)
> > > +{
> > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > +	u32 ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
> > > +
> > > +	writel_relaxed(ccr |= CCR_RESET, ddata->base + STM32_DMA3_CCR(chan->id));
> > > +}
> > > +
> > > +static int stm32_dma3_chan_stop(struct stm32_dma3_chan *chan)
> > > +{
> > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > +	u32 ccr;
> > > +	int ret = 0;
> > > +
> > > +	chan->dma_status = DMA_COMPLETE;
> > > +
> > > +	/* Disable interrupts */
> > > +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id));
> > > +	writel_relaxed(ccr & ~(CCR_ALLIE | CCR_EN), ddata->base + STM32_DMA3_CCR(chan->id));
> > > +
> > > +	if (!(ccr & CCR_SUSP) && (ccr & CCR_EN)) {
> > > +		/* Suspend the channel */
> > > +		ret = stm32_dma3_chan_suspend(chan, true);
> > > +		if (ret)
> > > +			dev_warn(chan2dev(chan), "%s: timeout, data might be lost\n", __func__);
> > > +	}
> > > +
> > > +	/*
> > > +	 * Reset the channel: this causes the reset of the FIFO and the reset of the channel
> > > +	 * internal state, the reset of CCR_EN and CCR_SUSP bits.
> > > +	 */
> > > +	stm32_dma3_chan_reset(chan);
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +static void stm32_dma3_chan_complete(struct stm32_dma3_chan *chan)
> > > +{
> > > +	if (!chan->swdesc)
> > > +		return;
> > > +
> > > +	vchan_cookie_complete(&chan->swdesc->vdesc);
> > > +	chan->swdesc = NULL;
> > > +	stm32_dma3_chan_start(chan);
> > > +}
> > > +
> > > +static irqreturn_t stm32_dma3_chan_irq(int irq, void *devid)
> > > +{
> > > +	struct stm32_dma3_chan *chan = devid;
> > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > +	u32 misr, csr, ccr;
> > > +
> > > +	spin_lock(&chan->vchan.lock);
> > > +
> > > +	misr = readl_relaxed(ddata->base + STM32_DMA3_MISR);
> > > +	if (!(misr & MISR_MIS(chan->id))) {
> > > +		spin_unlock(&chan->vchan.lock);
> > > +		return IRQ_NONE;
> > > +	}
> > > +
> > > +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(chan->id));
> > > +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & CCR_ALLIE;
> > > +
> > > +	if (csr & CSR_TCF && ccr & CCR_TCIE) {
> > > +		if (chan->swdesc->cyclic)
> > > +			vchan_cyclic_callback(&chan->swdesc->vdesc);
> > > +		else
> > > +			stm32_dma3_chan_complete(chan);
> > > +	}
> > > +
> > > +	if (csr & CSR_USEF && ccr & CCR_USEIE) {
> > > +		dev_err(chan2dev(chan), "User setting error\n");
> > > +		chan->dma_status = DMA_ERROR;
> > > +		/* CCR.EN automatically cleared by HW */
> > > +		stm32_dma3_check_user_setting(chan);
> > > +		stm32_dma3_chan_reset(chan);
> > > +	}
> > > +
> > > +	if (csr & CSR_ULEF && ccr & CCR_ULEIE) {
> > > +		dev_err(chan2dev(chan), "Update link transfer error\n");
> > > +		chan->dma_status = DMA_ERROR;
> > > +		/* CCR.EN automatically cleared by HW */
> > > +		stm32_dma3_chan_reset(chan);
> > > +	}
> > > +
> > > +	if (csr & CSR_DTEF && ccr & CCR_DTEIE) {
> > > +		dev_err(chan2dev(chan), "Data transfer error\n");
> > > +		chan->dma_status = DMA_ERROR;
> > > +		/* CCR.EN automatically cleared by HW */
> > > +		stm32_dma3_chan_reset(chan);
> > > +	}
> > > +
> > > +	/*
> > > +	 * Half Transfer Interrupt may be disabled but Half Transfer Flag can be set,
> > > +	 * ensure HTF flag to be cleared, with other flags.
> > > +	 */
> > > +	csr &= (ccr | CCR_HTIE);
> > > +
> > > +	if (csr)
> > > +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(chan->id));
> > > +
> > > +	spin_unlock(&chan->vchan.lock);
> > > +
> > > +	return IRQ_HANDLED;
> > > +}
> > > +
> > > +static int stm32_dma3_alloc_chan_resources(struct dma_chan *c)
> > > +{
> > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > +	u32 id = chan->id, csemcr, ccid;
> > > +	int ret;
> > > +
> > > +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
> > > +	if (ret < 0)
> > > +		return ret;
> > 
> > It is not preferable to do the runtime PM get when allocating a DMA
> > channel: many client drivers don't actually use the DMA when they
> > allocate the channel.
> > 
> > Ideally, do the resume/get in issue_pending. Please refer to pl330.c.
> > 
> > You may add runtime PM later, after this enablement patch.
> > 
> > Frank
> > 
> 
> To keep clock enable/disable well balanced, if pm_runtime_resume_and_get()
> (rather than pm_runtime_get_sync(), which doesn't decrement the usage counter
> in case of error) is used in issue_pending, it means pm_runtime_put_sync()
> should be done when the transfer ends.
> 
> terminate_all is not always called, so put_sync can't be used only there: it
> should be conditionally used in terminate_all, but also in the interrupt
> handler, on error events and on the transfer completion event, provided it is
> the last transfer complete event (last item of the linked-list).
> 
> For clients with a high transfer rate, it means a lot of clock
> enables/disables. Moreover, the DMA3 clock is managed by the Secure OS, so it
> also means a lot of non-secure/secure world transitions.
> 
> I prefer to keep the implementation as it is for now, and possibly propose a
> runtime PM improvement later, with autosuspend.


Autosuspend is preferred. We tried to use pm_runtime_get/put at channel
alloc/free before, but that solution was rejected by the community.

You can leave the clock on for this enablement patch and add runtime PM at a
later time.
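
For example, with autosuspend it could look like this (illustrative 100 ms
delay; assumes <linux/pm_runtime.h>):

static void ex_rpm_setup(struct device *dev)	/* at probe time */
{
	pm_runtime_set_autosuspend_delay(dev, 100);
	pm_runtime_use_autosuspend(dev);
	pm_runtime_enable(dev);
}

static void ex_transfer_done(struct device *dev)	/* wherever a transfer ends */
{
	pm_runtime_mark_last_busy(dev);
	/* the device is suspended (clock disabled) only after the delay
	 * elapses with no further activity */
	pm_runtime_put_autosuspend(dev);
}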

Frank

> 
> Amelie
> 
> > > +
> > > +	/* Ensure the channel is free */
> > > +	if (chan->semaphore_mode &&
> > > +	    readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id)) & CSEMCR_SEM_MUTEX) {
> > > +		ret = -EBUSY;
> > > +		goto err_put_sync;
> > > +	}
> > > +
> > > +	chan->lli_pool = dmam_pool_create(dev_name(&c->dev->device), c->device->dev,
> > > +					  sizeof(struct stm32_dma3_hwdesc),
> > > +					  __alignof__(struct stm32_dma3_hwdesc), 0);
> > > +	if (!chan->lli_pool) {
> > > +		dev_err(chan2dev(chan), "Failed to create LLI pool\n");
> > > +		ret = -ENOMEM;
> > > +		goto err_put_sync;
> > > +	}
> > > +
> > > +	/* Take the channel semaphore */
> > > +	if (chan->semaphore_mode) {
> > > +		writel_relaxed(CSEMCR_SEM_MUTEX, ddata->base + STM32_DMA3_CSEMCR(id));
> > > +		csemcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(id));
> > > +		ccid = FIELD_GET(CSEMCR_SEM_CCID, csemcr);
> > > +		/* Check that the channel is well taken */
> > > +		if (ccid != CCIDCFGR_CID1) {
> > > +			dev_err(chan2dev(chan), "Not under CID1 control (in-use by CID%d)\n", ccid);
> > > +			ret = -EPERM;
> > > +			goto err_pool_destroy;
> > > +		}
> > > +		dev_dbg(chan2dev(chan), "Under CID1 control (semcr=0x%08x)\n", csemcr);
> > > +	}
> > > +
> > > +	return 0;
> > > +
> > > +err_pool_destroy:
> > > +	dmam_pool_destroy(chan->lli_pool);
> > > +	chan->lli_pool = NULL;
> > > +
> > > +err_put_sync:
> > > +	pm_runtime_put_sync(ddata->dma_dev.dev);
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +static void stm32_dma3_free_chan_resources(struct dma_chan *c)
> > > +{
> > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > +	unsigned long flags;
> > > +
> > > +	/* Ensure channel is in idle state */
> > > +	spin_lock_irqsave(&chan->vchan.lock, flags);
> > > +	stm32_dma3_chan_stop(chan);
> > > +	chan->swdesc = NULL;
> > > +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
> > > +
> > > +	vchan_free_chan_resources(to_virt_chan(c));
> > > +
> > > +	dmam_pool_destroy(chan->lli_pool);
> > > +	chan->lli_pool = NULL;
> > > +
> > > +	/* Release the channel semaphore */
> > > +	if (chan->semaphore_mode)
> > > +		writel_relaxed(0, ddata->base + STM32_DMA3_CSEMCR(chan->id));
> > > +
> > > +	pm_runtime_put_sync(ddata->dma_dev.dev);
> > > +
> > > +	/* Reset configuration */
> > > +	memset(&chan->dt_config, 0, sizeof(chan->dt_config));
> > > +	memset(&chan->dma_config, 0, sizeof(chan->dma_config));
> > > +}
> > > +
> > > +static struct dma_async_tx_descriptor *stm32_dma3_prep_slave_sg(struct dma_chan *c,
> > > +								struct scatterlist *sgl,
> > > +								unsigned int sg_len,
> > > +								enum dma_transfer_direction dir,
> > > +								unsigned long flags, void *context)
> > > +{
> > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > +	struct stm32_dma3_swdesc *swdesc;
> > > +	struct scatterlist *sg;
> > > +	size_t len;
> > > +	dma_addr_t sg_addr, dev_addr, src, dst;
> > > +	u32 i, j, count, ctr1, ctr2;
> > > +	int ret;
> > > +
> > > +	count = sg_len;
> > > +	for_each_sg(sgl, sg, sg_len, i) {
> > > +		len = sg_dma_len(sg);
> > > +		if (len > STM32_DMA3_MAX_BLOCK_SIZE)
> > > +			count += DIV_ROUND_UP(len, STM32_DMA3_MAX_BLOCK_SIZE) - 1;
> > > +	}
> > > +
> > > +	swdesc = stm32_dma3_chan_desc_alloc(chan, count);
> > > +	if (!swdesc)
> > > +		return NULL;
> > > +
> > > +	/* sg_len and i correspond to the initial sgl; count and j correspond to the hwdesc LL */
> > > +	j = 0;
> > > +	for_each_sg(sgl, sg, sg_len, i) {
> > > +		sg_addr = sg_dma_address(sg);
> > > +		dev_addr = (dir == DMA_MEM_TO_DEV) ? chan->dma_config.dst_addr :
> > > +						     chan->dma_config.src_addr;
> > > +		len = sg_dma_len(sg);
> > > +
> > > +		do {
> > > +			size_t chunk = min_t(size_t, len, STM32_DMA3_MAX_BLOCK_SIZE);
> > > +
> > > +			if (dir == DMA_MEM_TO_DEV) {
> > > +				src = sg_addr;
> > > +				dst = dev_addr;
> > > +
> > > +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
> > > +							      src, dst, chunk);
> > > +
> > > +				if (FIELD_GET(CTR1_DINC, ctr1))
> > > +					dev_addr += chunk;
> > > +			} else { /* (dir == DMA_DEV_TO_MEM || dir == DMA_MEM_TO_MEM) */
> > > +				src = dev_addr;
> > > +				dst = sg_addr;
> > > +
> > > +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
> > > +							      src, dst, chunk);
> > > +
> > > +				if (FIELD_GET(CTR1_SINC, ctr1))
> > > +					dev_addr += chunk;
> > > +			}
> > > +
> > > +			if (ret)
> > > +				goto err_desc_free;
> > > +
> > > +			stm32_dma3_chan_prep_hwdesc(chan, swdesc, j, src, dst, chunk,
> > > +						    ctr1, ctr2, j == (count - 1), false);
> > > +
> > > +			sg_addr += chunk;
> > > +			len -= chunk;
> > > +			j++;
> > > +		} while (len);
> > > +	}
> > > +
> > > +	/* Enable Error interrupts */
> > > +	swdesc->ccr |= CCR_USEIE | CCR_ULEIE | CCR_DTEIE;
> > > +	/* Enable Transfer state interrupts */
> > > +	swdesc->ccr |= CCR_TCIE;
> > > +
> > > +	swdesc->cyclic = false;
> > > +
> > > +	return vchan_tx_prep(&chan->vchan, &swdesc->vdesc, flags);
> > > +
> > > +err_desc_free:
> > > +	stm32_dma3_chan_desc_free(chan, swdesc);
> > > +
> > > +	return NULL;
> > > +}
> > > +
> > > +static void stm32_dma3_caps(struct dma_chan *c, struct dma_slave_caps *caps)
> > > +{
> > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > +
> > > +	if (!chan->fifo_size) {
> > > +		caps->max_burst = 0;
> > > +		caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > +		caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > +	} else {
> > > +		/* Burst transfer should not exceed half of the fifo size */
> > > +		caps->max_burst = chan->max_burst;
> > > +		if (caps->max_burst < DMA_SLAVE_BUSWIDTH_8_BYTES) {
> > > +			caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > +			caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > +		}
> > > +	}
> > > +}
> > > +
> > > +static int stm32_dma3_config(struct dma_chan *c, struct dma_slave_config *config)
> > > +{
> > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > +
> > > +	memcpy(&chan->dma_config, config, sizeof(*config));
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static int stm32_dma3_terminate_all(struct dma_chan *c)
> > > +{
> > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > +	unsigned long flags;
> > > +	LIST_HEAD(head);
> > > +
> > > +	spin_lock_irqsave(&chan->vchan.lock, flags);
> > > +
> > > +	if (chan->swdesc) {
> > > +		vchan_terminate_vdesc(&chan->swdesc->vdesc);
> > > +		chan->swdesc = NULL;
> > > +	}
> > > +
> > > +	stm32_dma3_chan_stop(chan);
> > > +
> > > +	vchan_get_all_descriptors(&chan->vchan, &head);
> > > +
> > > +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
> > > +	vchan_dma_desc_free_list(&chan->vchan, &head);
> > > +
> > > +	dev_dbg(chan2dev(chan), "vchan %pK: terminated\n", &chan->vchan);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static void stm32_dma3_synchronize(struct dma_chan *c)
> > > +{
> > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > +
> > > +	vchan_synchronize(&chan->vchan);
> > > +}
> > > +
> > > +static void stm32_dma3_issue_pending(struct dma_chan *c)
> > > +{
> > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > +	unsigned long flags;
> > > +
> > > +	spin_lock_irqsave(&chan->vchan.lock, flags);
> > > +
> > > +	if (vchan_issue_pending(&chan->vchan) && !chan->swdesc) {
> > > +		dev_dbg(chan2dev(chan), "vchan %pK: issued\n", &chan->vchan);
> > > +		stm32_dma3_chan_start(chan);
> > > +	}
> > > +
> > > +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
> > > +}
> > > +
> > > +static bool stm32_dma3_filter_fn(struct dma_chan *c, void *fn_param)
> > > +{
> > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > +	struct stm32_dma3_dt_conf *conf = fn_param;
> > > +	u32 mask, semcr;
> > > +	int ret;
> > > +
> > > +	dev_dbg(c->device->dev, "%s(%s): req_line=%d ch_conf=%08x tr_conf=%08x\n",
> > > +		__func__, dma_chan_name(c), conf->req_line, conf->ch_conf, conf->tr_conf);
> > > +
> > > +	if (!of_property_read_u32(c->device->dev->of_node, "dma-channel-mask", &mask))
> > > +		if (!(mask & BIT(chan->id)))
> > > +			return false;
> > > +
> > > +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
> > > +	if (ret < 0)
> > > +		return false;
> > > +	semcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id));
> > > +	pm_runtime_put_sync(ddata->dma_dev.dev);
> > > +
> > > +	/* Check if chan is free */
> > > +	if (semcr & CSEMCR_SEM_MUTEX)
> > > +		return false;
> > > +
> > > +	/* Check if chan fifo fits well */
> > > +	if (FIELD_GET(STM32_DMA3_DT_FIFO, conf->ch_conf) != chan->fifo_size)
> > > +		return false;
> > > +
> > > +	return true;
> > > +}
> > > +
> > > +static struct dma_chan *stm32_dma3_of_xlate(struct of_phandle_args *dma_spec, struct of_dma *ofdma)
> > > +{
> > > +	struct stm32_dma3_ddata *ddata = ofdma->of_dma_data;
> > > +	dma_cap_mask_t mask = ddata->dma_dev.cap_mask;
> > > +	struct stm32_dma3_dt_conf conf;
> > > +	struct stm32_dma3_chan *chan;
> > > +	struct dma_chan *c;
> > > +
> > > +	if (dma_spec->args_count < 3) {
> > > +		dev_err(ddata->dma_dev.dev, "Invalid args count\n");
> > > +		return NULL;
> > > +	}
> > > +
> > > +	conf.req_line = dma_spec->args[0];
> > > +	conf.ch_conf = dma_spec->args[1];
> > > +	conf.tr_conf = dma_spec->args[2];
> > > +
> > > +	if (conf.req_line >= ddata->dma_requests) {
> > > +		dev_err(ddata->dma_dev.dev, "Invalid request line\n");
> > > +		return NULL;
> > > +	}
> > > +
> > > +	/* Request dma channel among the generic dma controller list */
> > > +	c = dma_request_channel(mask, stm32_dma3_filter_fn, &conf);
> > > +	if (!c) {
> > > +		dev_err(ddata->dma_dev.dev, "No suitable channel found\n");
> > > +		return NULL;
> > > +	}
> > > +
> > > +	chan = to_stm32_dma3_chan(c);
> > > +	chan->dt_config = conf;
> > > +
> > > +	return c;
> > > +}
> > > +
> > > +static u32 stm32_dma3_check_rif(struct stm32_dma3_ddata *ddata)
> > > +{
> > > +	u32 chan_reserved, mask = 0, i, ccidcfgr, invalid_cid = 0;
> > > +
> > > +	/* Reserve Secure channels */
> > > +	chan_reserved = readl_relaxed(ddata->base + STM32_DMA3_SECCFGR);
> > > +
> > > +	/*
> > > +	 * CID filtering must be configured to ensure that the DMA3 channel will inherit the CID of
> > > +	 * the processor which is configuring and using the given channel.
> > > +	 * In case CID filtering is not configured, dma-channel-mask property can be used to
> > > +	 * specify available DMA channels to the kernel.
> > > +	 */
> > > +	of_property_read_u32(ddata->dma_dev.dev->of_node, "dma-channel-mask", &mask);
> > > +
> > > +	/* Reserve !CID-filtered not in dma-channel-mask, static CID != CID1, CID1 not allowed */
> > > +	for (i = 0; i < ddata->dma_channels; i++) {
> > > +		ccidcfgr = readl_relaxed(ddata->base + STM32_DMA3_CCIDCFGR(i));
> > > +
> > > +		if (!(ccidcfgr & CCIDCFGR_CFEN)) { /* !CID-filtered */
> > > +			invalid_cid |= BIT(i);
> > > +			if (!(mask & BIT(i))) /* Not in dma-channel-mask */
> > > +				chan_reserved |= BIT(i);
> > > +		} else { /* CID-filtered */
> > > +			if (!(ccidcfgr & CCIDCFGR_SEM_EN)) { /* Static CID mode */
> > > +				if (FIELD_GET(CCIDCFGR_SCID, ccidcfgr) != CCIDCFGR_CID1)
> > > +					chan_reserved |= BIT(i);
> > > +			} else { /* Semaphore mode */
> > > +				if (!FIELD_GET(CCIDCFGR_SEM_WLIST_CID1, ccidcfgr))
> > > +					chan_reserved |= BIT(i);
> > > +				ddata->chans[i].semaphore_mode = true;
> > > +			}
> > > +		}
> > > +		dev_dbg(ddata->dma_dev.dev, "chan%d: %s mode, %s\n", i,
> > > +			!(ccidcfgr & CCIDCFGR_CFEN) ? "!CID-filtered" :
> > > +			ddata->chans[i].semaphore_mode ? "Semaphore" : "Static CID",
> > > +			(chan_reserved & BIT(i)) ? "denied" :
> > > +			mask & BIT(i) ? "force allowed" : "allowed");
> > > +	}
> > > +
> > > +	if (invalid_cid)
> > > +		dev_warn(ddata->dma_dev.dev, "chan%*pbl have invalid CID configuration\n",
> > > +			 ddata->dma_channels, &invalid_cid);
> > > +
> > > +	return chan_reserved;
> > > +}
> > > +
> > > +static const struct of_device_id stm32_dma3_of_match[] = {
> > > +	{ .compatible = "st,stm32-dma3", },
> > > +	{ /* sentinel */},
> > > +};
> > > +MODULE_DEVICE_TABLE(of, stm32_dma3_of_match);
> > > +
> > > +static int stm32_dma3_probe(struct platform_device *pdev)
> > > +{
> > > +	struct device_node *np = pdev->dev.of_node;
> > > +	struct stm32_dma3_ddata *ddata;
> > > +	struct reset_control *reset;
> > > +	struct stm32_dma3_chan *chan;
> > > +	struct dma_device *dma_dev;
> > > +	u32 master_ports, chan_reserved, i, verr;
> > > +	u64 hwcfgr;
> > > +	int ret;
> > > +
> > > +	ddata = devm_kzalloc(&pdev->dev, sizeof(*ddata), GFP_KERNEL);
> > > +	if (!ddata)
> > > +		return -ENOMEM;
> > > +	platform_set_drvdata(pdev, ddata);
> > > +
> > > +	dma_dev = &ddata->dma_dev;
> > > +
> > > +	ddata->base = devm_platform_ioremap_resource(pdev, 0);
> > > +	if (IS_ERR(ddata->base))
> > > +		return PTR_ERR(ddata->base);
> > > +
> > > +	ddata->clk = devm_clk_get(&pdev->dev, NULL);
> > > +	if (IS_ERR(ddata->clk))
> > > +		return dev_err_probe(&pdev->dev, PTR_ERR(ddata->clk), "Failed to get clk\n");
> > > +
> > > +	reset = devm_reset_control_get_optional(&pdev->dev, NULL);
> > > +	if (IS_ERR(reset))
> > > +		return dev_err_probe(&pdev->dev, PTR_ERR(reset), "Failed to get reset\n");
> > > +
> > > +	ret = clk_prepare_enable(ddata->clk);
> > > +	if (ret)
> > > +		return dev_err_probe(&pdev->dev, ret, "Failed to enable clk\n");
> > > +
> > > +	reset_control_reset(reset);
> > > +
> > > +	INIT_LIST_HEAD(&dma_dev->channels);
> > > +
> > > +	dma_cap_set(DMA_SLAVE, dma_dev->cap_mask);
> > > +	dma_cap_set(DMA_PRIVATE, dma_dev->cap_mask);
> > > +	dma_dev->dev = &pdev->dev;
> > > +	/*
> > > +	 * This controller supports up to 8-byte buswidth depending on the port used and the
> > > +	 * channel, and can only access address at even boundaries, multiple of the buswidth.
> > > +	 */
> > > +	dma_dev->copy_align = DMAENGINE_ALIGN_8_BYTES;
> > > +	dma_dev->src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
> > > +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
> > > +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
> > > +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > +	dma_dev->dst_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
> > > +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
> > > +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
> > > +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > +	dma_dev->directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV) | BIT(DMA_MEM_TO_MEM);
> > > +
> > > +	dma_dev->descriptor_reuse = true;
> > > +	dma_dev->max_sg_burst = STM32_DMA3_MAX_SEG_SIZE;
> > > +	dma_dev->residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
> > > +	dma_dev->device_alloc_chan_resources = stm32_dma3_alloc_chan_resources;
> > > +	dma_dev->device_free_chan_resources = stm32_dma3_free_chan_resources;
> > > +	dma_dev->device_prep_slave_sg = stm32_dma3_prep_slave_sg;
> > > +	dma_dev->device_caps = stm32_dma3_caps;
> > > +	dma_dev->device_config = stm32_dma3_config;
> > > +	dma_dev->device_terminate_all = stm32_dma3_terminate_all;
> > > +	dma_dev->device_synchronize = stm32_dma3_synchronize;
> > > +	dma_dev->device_tx_status = dma_cookie_status;
> > > +	dma_dev->device_issue_pending = stm32_dma3_issue_pending;
> > > +
> > > +	/* if dma_channels is not modified, get it from hwcfgr1 */
> > > +	if (of_property_read_u32(np, "dma-channels", &ddata->dma_channels)) {
> > > +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
> > > +		ddata->dma_channels = FIELD_GET(G_NUM_CHANNELS, hwcfgr);
> > > +	}
> > > +
> > > +	/* if dma_requests is not modified, get it from hwcfgr2 */
> > > +	if (of_property_read_u32(np, "dma-requests", &ddata->dma_requests)) {
> > > +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR2);
> > > +		ddata->dma_requests = FIELD_GET(G_MAX_REQ_ID, hwcfgr) + 1;
> > > +	}
> > > +
> > > +	/* G_MASTER_PORTS, G_M0_DATA_WIDTH_ENC, G_M1_DATA_WIDTH_ENC in HWCFGR1 */
> > > +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
> > > +	master_ports = FIELD_GET(G_MASTER_PORTS, hwcfgr);
> > > +
> > > +	ddata->ports_max_dw[0] = FIELD_GET(G_M0_DATA_WIDTH_ENC, hwcfgr);
> > > +	if (master_ports == AXI64 || master_ports == AHB32) /* Single master port */
> > > +		ddata->ports_max_dw[1] = DW_INVALID;
> > > +	else /* Dual master ports */
> > > +		ddata->ports_max_dw[1] = FIELD_GET(G_M1_DATA_WIDTH_ENC, hwcfgr);
> > > +
> > > +	ddata->chans = devm_kcalloc(&pdev->dev, ddata->dma_channels, sizeof(*ddata->chans),
> > > +				    GFP_KERNEL);
> > > +	if (!ddata->chans) {
> > > +		ret = -ENOMEM;
> > > +		goto err_clk_disable;
> > > +	}
> > > +
> > > +	chan_reserved = stm32_dma3_check_rif(ddata);
> > > +
> > > +	if (chan_reserved == GENMASK(ddata->dma_channels - 1, 0)) {
> > > +		ret = -ENODEV;
> > > +		dev_err_probe(&pdev->dev, ret, "No channel available, abort registration\n");
> > > +		goto err_clk_disable;
> > > +	}
> > > +
> > > +	/* G_FIFO_SIZE x=0..7 in HWCFGR3 and G_FIFO_SIZE x=8..15 in HWCFGR4 */
> > > +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR3);
> > > +	hwcfgr |= ((u64)readl_relaxed(ddata->base + STM32_DMA3_HWCFGR4)) << 32;
> > > +
> > > +	for (i = 0; i < ddata->dma_channels; i++) {
> > > +		if (chan_reserved & BIT(i))
> > > +			continue;
> > > +
> > > +		chan = &ddata->chans[i];
> > > +		chan->id = i;
> > > +		chan->fifo_size = get_chan_hwcfg(i, G_FIFO_SIZE(i), hwcfgr);
> > > +		/* If chan->fifo_size > 0 then half of the fifo size, else no burst when no FIFO */
> > > +		chan->max_burst = (chan->fifo_size) ? (1 << (chan->fifo_size + 1)) / 2 : 0;
> > > +		chan->vchan.desc_free = stm32_dma3_chan_vdesc_free;
> > > +
> > > +		vchan_init(&chan->vchan, dma_dev);
> > > +	}
> > > +
> > > +	ret = dmaenginem_async_device_register(dma_dev);
> > > +	if (ret)
> > > +		goto err_clk_disable;
> > > +
> > > +	for (i = 0; i < ddata->dma_channels; i++) {
> > > +		if (chan_reserved & BIT(i))
> > > +			continue;
> > > +
> > > +		ret = platform_get_irq(pdev, i);
> > > +		if (ret < 0)
> > > +			goto err_clk_disable;
> > > +
> > > +		chan = &ddata->chans[i];
> > > +		chan->irq = ret;
> > > +
> > > +		ret = devm_request_irq(&pdev->dev, chan->irq, stm32_dma3_chan_irq, 0,
> > > +				       dev_name(chan2dev(chan)), chan);
> > > +		if (ret) {
> > > +			dev_err_probe(&pdev->dev, ret, "Failed to request channel %s IRQ\n",
> > > +				      dev_name(chan2dev(chan)));
> > > +			goto err_clk_disable;
> > > +		}
> > > +	}
> > > +
> > > +	ret = of_dma_controller_register(np, stm32_dma3_of_xlate, ddata);
> > > +	if (ret) {
> > > +		dev_err_probe(&pdev->dev, ret, "Failed to register controller\n");
> > > +		goto err_clk_disable;
> > > +	}
> > > +
> > > +	verr = readl_relaxed(ddata->base + STM32_DMA3_VERR);
> > > +
> > > +	pm_runtime_set_active(&pdev->dev);
> > > +	pm_runtime_enable(&pdev->dev);
> > > +	pm_runtime_get_noresume(&pdev->dev);
> > > +	pm_runtime_put(&pdev->dev);
> > > +
> > > +	dev_info(&pdev->dev, "STM32 DMA3 registered rev:%lu.%lu\n",
> > > +		 FIELD_GET(VERR_MAJREV, verr), FIELD_GET(VERR_MINREV, verr));
> > > +
> > > +	return 0;
> > > +
> > > +err_clk_disable:
> > > +	clk_disable_unprepare(ddata->clk);
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +static void stm32_dma3_remove(struct platform_device *pdev)
> > > +{
> > > +	pm_runtime_disable(&pdev->dev);
> > > +}
> > > +
> > > +static int stm32_dma3_runtime_suspend(struct device *dev)
> > > +{
> > > +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
> > > +
> > > +	clk_disable_unprepare(ddata->clk);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static int stm32_dma3_runtime_resume(struct device *dev)
> > > +{
> > > +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
> > > +	int ret;
> > > +
> > > +	ret = clk_prepare_enable(ddata->clk);
> > > +	if (ret)
> > > +		dev_err(dev, "Failed to enable clk: %d\n", ret);
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +static const struct dev_pm_ops stm32_dma3_pm_ops = {
> > > +	SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
> > > +	RUNTIME_PM_OPS(stm32_dma3_runtime_suspend, stm32_dma3_runtime_resume, NULL)
> > > +};
> > > +
> > > +static struct platform_driver stm32_dma3_driver = {
> > > +	.probe = stm32_dma3_probe,
> > > +	.remove_new = stm32_dma3_remove,
> > > +	.driver = {
> > > +		.name = "stm32-dma3",
> > > +		.of_match_table = stm32_dma3_of_match,
> > > +		.pm = pm_ptr(&stm32_dma3_pm_ops),
> > > +	},
> > > +};
> > > +
> > > +static int __init stm32_dma3_init(void)
> > > +{
> > > +	return platform_driver_register(&stm32_dma3_driver);
> > > +}
> > > +
> > > +subsys_initcall(stm32_dma3_init);
> > > +
> > > +MODULE_DESCRIPTION("STM32 DMA3 controller driver");
> > > +MODULE_AUTHOR("Amelie Delaunay <amelie.delaunay@foss.st.com>");
> > > +MODULE_LICENSE("GPL");
> > > -- 
> > > 2.25.1
> > > 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 05/12] dmaengine: Add STM32 DMA3 support
  2024-05-16 17:09       ` Frank Li
@ 2024-05-17  9:42         ` Amelie Delaunay
  2024-05-17 14:57           ` Frank Li
  0 siblings, 1 reply; 29+ messages in thread
From: Amelie Delaunay @ 2024-05-17  9:42 UTC (permalink / raw)
  To: Frank Li
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue, dmaengine, devicetree,
	linux-stm32, linux-arm-kernel, linux-kernel, linux-hardening

On 5/16/24 19:09, Frank Li wrote:
> On Thu, May 16, 2024 at 05:25:58PM +0200, Amelie Delaunay wrote:
>> On 5/15/24 20:56, Frank Li wrote:
>>> On Tue, Apr 23, 2024 at 02:32:55PM +0200, Amelie Delaunay wrote:
>>>> STM32 DMA3 driver supports the 3 hardware configurations of the STM32 DMA3
>>>> controller:
> ...
>>>> +	writel_relaxed(hwdesc->cdar, ddata->base + STM32_DMA3_CDAR(id));
>>>> +	writel_relaxed(hwdesc->cllr, ddata->base + STM32_DMA3_CLLR(id));
>>>> +
>>>> +	/* Clear any pending interrupts */
>>>> +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(id));
>>>> +	if (csr & CSR_ALL_F)
>>>> +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(id));
>>>> +
>>>> +	stm32_dma3_chan_dump_reg(chan);
>>>> +
>>>> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(id));
>>>> +	writel_relaxed(ccr | CCR_EN, ddata->base + STM32_DMA3_CCR(id));
>>>
>>> This one should use writel instead of writel_relaxed because it needs
>>> dma_wmb() as a barrier to ensure the previous writes have completed.
>>>
>>> Frank
>>>
>>
>> ddata->base is mapped as Device memory type thanks to ioremap(), so accesses
>> to it are strongly ordered and non-cacheable.
>> DMA3 is outside the CPU cluster; its registers are accessible through the AHB bus.
>> dma_wmb() (in case writel were used instead of writel_relaxed) is useless in
>> that case: it won't ensure the propagation on the bus is complete, and it will
>> have an impact on the system.
>> That's why the CCR register is written once, then read back before CCR_EN is
>> set and CCR is written again, with _relaxed(): the registers are behind a
>> bus and ioremapped with Device memory type, which ensures accesses are
>> strongly ordered and non-cacheable.
> 
> regardless memory map, writel_relaxed() just make sure io write and read is
> orderred, not necessary order with other memory access. only readl and
> writel make sure order with other memory read/write.
> 
> 1. Write src_addr to descriptor
> 2. dma_wmb()
> 3. Write "ready" to descriptor
> 4. enable channel or doorbell by write a register.
> 
> if 4 use writel_relaxe(). because 3 write to DDR, which difference place of
> mmio, 4 may happen before 3.  Your can refer axi order model.
> 
> 4 have to use ONE writel(), to make sure 3 already write to DDR.
> 
> You need use at least one writel() to make sure all nornmal memory finish.
> 

+    writel_relaxed(chan->swdesc->ccr, ddata->base + STM32_DMA3_CCR(id));
+    writel_relaxed(hwdesc->ctr1, ddata->base + STM32_DMA3_CTR1(id));
+    writel_relaxed(hwdesc->ctr2, ddata->base + STM32_DMA3_CTR2(id));
+    writel_relaxed(hwdesc->cbr1, ddata->base + STM32_DMA3_CBR1(id));
+    writel_relaxed(hwdesc->csar, ddata->base + STM32_DMA3_CSAR(id));
+    writel_relaxed(hwdesc->cdar, ddata->base + STM32_DMA3_CDAR(id));
+    writel_relaxed(hwdesc->cllr, ddata->base + STM32_DMA3_CLLR(id));

These writel_relaxed() copy the descriptors into the DMA3 registers 
(the descriptors having been prepared "a long time ago", during _prep_).
As I said previously, DMA3 registers are outside the CPU cluster, accessible 
through the AHB bus, and ddata->base used to address the registers is 
ioremapped as Device memory type, non-cacheable and strongly ordered.

arch/arm/include/asm/io.h:
/*
* ioremap() and friends.
*
* ioremap() takes a resource address, and size.  Due to the ARM memory
* types, it is important to use the correct ioremap() function as each
* mapping has specific properties.
*
* Function		Memory type	Cacheability	Cache hint
* *ioremap()*		*Device*		*n/a*		*n/a*
* ioremap_cache()	Normal		Writeback	Read allocate
* ioremap_wc()		Normal		Non-cacheable	n/a
* ioremap_wt()		Normal		Non-cacheable	n/a
*
* All device mappings have the following properties:
* - no access speculation
* - no repetition (eg, on return from an exception)
* - number, order and size of accesses are maintained
* - unaligned accesses are "unpredictable"
* - writes may be delayed before they hit the endpoint device

On our platforms, we know that to ensure the writes have hit the 
endpoint device (i.e. the DMA3 registers), a read has to be done first.
And that's what is done before enabling the channel:

+    ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(id));
+    writel_relaxed(ccr | CCR_EN, ddata->base + STM32_DMA3_CCR(id));

If there were an issue in this part of the code, it would mean the channel 
gets started while it is wrongly programmed. In that case, DMA3 would 
raise a User Setting Error interrupt and disable the channel. User 
Setting Error is managed in this driver 
(USEF/stm32_dma3_check_user_setting()), and we have never reached such a 
situation.

>>
>>>> +
>>>> +	chan->dma_status = DMA_IN_PROGRESS;
>>>> +
>>>> +	dev_dbg(chan2dev(chan), "vchan %pK: started\n", &chan->vchan);
>>>> +}
>>>> +
>>>> +static int stm32_dma3_chan_suspend(struct stm32_dma3_chan *chan, bool susp)
>>>> +{
>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>> +	u32 csr, ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
>>>> +	int ret = 0;
>>>> +
>>>> +	if (susp)
>>>> +		ccr |= CCR_SUSP;
>>>> +	else
>>>> +		ccr &= ~CCR_SUSP;
>>>> +
>>>> +	writel_relaxed(ccr, ddata->base + STM32_DMA3_CCR(chan->id));
>>>> +
>>>> +	if (susp) {
>>>> +		ret = readl_relaxed_poll_timeout_atomic(ddata->base + STM32_DMA3_CSR(chan->id), csr,
>>>> +							csr & CSR_SUSPF, 1, 10);
>>>> +		if (!ret)
>>>> +			writel_relaxed(CFCR_SUSPF, ddata->base + STM32_DMA3_CFCR(chan->id));
>>>> +
>>>> +		stm32_dma3_chan_dump_reg(chan);
>>>> +	}
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +static void stm32_dma3_chan_reset(struct stm32_dma3_chan *chan)
>>>> +{
>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>> +	u32 ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
>>>> +
>>>> +	writel_relaxed(ccr |= CCR_RESET, ddata->base + STM32_DMA3_CCR(chan->id));
>>>> +}
>>>> +
>>>> +static int stm32_dma3_chan_stop(struct stm32_dma3_chan *chan)
>>>> +{
>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>> +	u32 ccr;
>>>> +	int ret = 0;
>>>> +
>>>> +	chan->dma_status = DMA_COMPLETE;
>>>> +
>>>> +	/* Disable interrupts */
>>>> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id));
>>>> +	writel_relaxed(ccr & ~(CCR_ALLIE | CCR_EN), ddata->base + STM32_DMA3_CCR(chan->id));
>>>> +
>>>> +	if (!(ccr & CCR_SUSP) && (ccr & CCR_EN)) {
>>>> +		/* Suspend the channel */
>>>> +		ret = stm32_dma3_chan_suspend(chan, true);
>>>> +		if (ret)
>>>> +			dev_warn(chan2dev(chan), "%s: timeout, data might be lost\n", __func__);
>>>> +	}
>>>> +
>>>> +	/*
>>>> +	 * Reset the channel: this causes the reset of the FIFO and the reset of the channel
>>>> +	 * internal state, the reset of CCR_EN and CCR_SUSP bits.
>>>> +	 */
>>>> +	stm32_dma3_chan_reset(chan);
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +static void stm32_dma3_chan_complete(struct stm32_dma3_chan *chan)
>>>> +{
>>>> +	if (!chan->swdesc)
>>>> +		return;
>>>> +
>>>> +	vchan_cookie_complete(&chan->swdesc->vdesc);
>>>> +	chan->swdesc = NULL;
>>>> +	stm32_dma3_chan_start(chan);
>>>> +}
>>>> +
>>>> +static irqreturn_t stm32_dma3_chan_irq(int irq, void *devid)
>>>> +{
>>>> +	struct stm32_dma3_chan *chan = devid;
>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>> +	u32 misr, csr, ccr;
>>>> +
>>>> +	spin_lock(&chan->vchan.lock);
>>>> +
>>>> +	misr = readl_relaxed(ddata->base + STM32_DMA3_MISR);
>>>> +	if (!(misr & MISR_MIS(chan->id))) {
>>>> +		spin_unlock(&chan->vchan.lock);
>>>> +		return IRQ_NONE;
>>>> +	}
>>>> +
>>>> +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(chan->id));
>>>> +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & CCR_ALLIE;
>>>> +
>>>> +	if (csr & CSR_TCF && ccr & CCR_TCIE) {
>>>> +		if (chan->swdesc->cyclic)
>>>> +			vchan_cyclic_callback(&chan->swdesc->vdesc);
>>>> +		else
>>>> +			stm32_dma3_chan_complete(chan);
>>>> +	}
>>>> +
>>>> +	if (csr & CSR_USEF && ccr & CCR_USEIE) {
>>>> +		dev_err(chan2dev(chan), "User setting error\n");
>>>> +		chan->dma_status = DMA_ERROR;
>>>> +		/* CCR.EN automatically cleared by HW */
>>>> +		stm32_dma3_check_user_setting(chan);
>>>> +		stm32_dma3_chan_reset(chan);
>>>> +	}
>>>> +
>>>> +	if (csr & CSR_ULEF && ccr & CCR_ULEIE) {
>>>> +		dev_err(chan2dev(chan), "Update link transfer error\n");
>>>> +		chan->dma_status = DMA_ERROR;
>>>> +		/* CCR.EN automatically cleared by HW */
>>>> +		stm32_dma3_chan_reset(chan);
>>>> +	}
>>>> +
>>>> +	if (csr & CSR_DTEF && ccr & CCR_DTEIE) {
>>>> +		dev_err(chan2dev(chan), "Data transfer error\n");
>>>> +		chan->dma_status = DMA_ERROR;
>>>> +		/* CCR.EN automatically cleared by HW */
>>>> +		stm32_dma3_chan_reset(chan);
>>>> +	}
>>>> +
>>>> +	/*
>>>> +	 * Half Transfer Interrupt may be disabled but Half Transfer Flag can be set,
>>>> +	 * ensure HTF flag to be cleared, with other flags.
>>>> +	 */
>>>> +	csr &= (ccr | CCR_HTIE);
>>>> +
>>>> +	if (csr)
>>>> +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(chan->id));
>>>> +
>>>> +	spin_unlock(&chan->vchan.lock);
>>>> +
>>>> +	return IRQ_HANDLED;
>>>> +}
>>>> +
>>>> +static int stm32_dma3_alloc_chan_resources(struct dma_chan *c)
>>>> +{
>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>> +	u32 id = chan->id, csemcr, ccid;
>>>> +	int ret;
>>>> +
>>>> +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
>>>> +	if (ret < 0)
>>>> +		return ret;
>>>
>>> It doesn't prefer runtime pm get at alloc dma chan, many client driver
>>> doesn't actual user dma when allocate dma chan.
>>>
>>> Ideally, resume get when issue_pending. Please refer pl330.c.
>>>
>>> You may add runtime pm later after enablement patch.
>>>
>>> Frank
>>>
>>
>> To well balance clock enable/disable, if pm_runtime_resume_and_get() (rather
>> than pm_runtime_get_sync() which doesn't decrement the counter in case of
>> error) is used when issue_pending, it means pm_runtime_put_sync() should be
>> done when transfer ends.
>>
>> terminate_all is not always called, so put_sync can't be used only there, it
>> should be conditionnally used in terminate_all, but also in interrupt
>> handler, on error events and on transfer completion event, provided that it
>> is the last transfer complete event (last item of the linked-list).
>>
>> For clients with high transfer rate, it means a lot of clock enable/disable.
>> Moreover, DMA3 clock is managed by Secure OS. So it means a lot of
>> non-secure/secure world transitions.
>>
>> I prefer to keep the implementation as it is for now, and possibly propose
>> runtime pm improvement later, with autosuspend.
> 
> 
> Autosuspend is perfered. we try to use pm_runtime_get/put at channel alloc
> /free before, but this solution are rejected by community.
> 
> you can leave clock on for this enablement patch and add runtime pm later
> time.
> 
> Frank
> 

The current implementation leaves the clock off if no channel is requested. 
It also disables the clock if the platform is suspended.
I just followed the example of what is done in the other stm32 drivers.

I have further patches that are not proposed in this series, which only 
adds basic DMA3 support. There will be improvements, including runtime PM, 
in a next series.
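
For illustration only, a rough sketch of what the runtime PM improvement 
with autosuspend could look like (the delay value and exact hook points 
below are hypothetical, not the actual future patches):

	/* probe(): enable runtime PM with autosuspend instead of balancing
	 * the clock only on channel alloc/free
	 */
	pm_runtime_set_autosuspend_delay(&pdev->dev, 100);
	pm_runtime_use_autosuspend(&pdev->dev);
	pm_runtime_set_active(&pdev->dev);
	pm_runtime_enable(&pdev->dev);

	/* issue_pending(): take a PM reference for the whole transfer */
	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
	if (ret < 0)
		return;

	/* last transfer complete event, error event or terminate_all():
	 * drop the reference, the clock is gated after the delay expires
	 */
	pm_runtime_mark_last_busy(ddata->dma_dev.dev);
	pm_runtime_put_autosuspend(ddata->dma_dev.dev);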

Amelie

>>
>> Amelie
>>
>>>> +
>>>> +	/* Ensure the channel is free */
>>>> +	if (chan->semaphore_mode &&
>>>> +	    readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id)) & CSEMCR_SEM_MUTEX) {
>>>> +		ret = -EBUSY;
>>>> +		goto err_put_sync;
>>>> +	}
>>>> +
>>>> +	chan->lli_pool = dmam_pool_create(dev_name(&c->dev->device), c->device->dev,
>>>> +					  sizeof(struct stm32_dma3_hwdesc),
>>>> +					  __alignof__(struct stm32_dma3_hwdesc), 0);
>>>> +	if (!chan->lli_pool) {
>>>> +		dev_err(chan2dev(chan), "Failed to create LLI pool\n");
>>>> +		ret = -ENOMEM;
>>>> +		goto err_put_sync;
>>>> +	}
>>>> +
>>>> +	/* Take the channel semaphore */
>>>> +	if (chan->semaphore_mode) {
>>>> +		writel_relaxed(CSEMCR_SEM_MUTEX, ddata->base + STM32_DMA3_CSEMCR(id));
>>>> +		csemcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(id));
>>>> +		ccid = FIELD_GET(CSEMCR_SEM_CCID, csemcr);
>>>> +		/* Check that the channel is well taken */
>>>> +		if (ccid != CCIDCFGR_CID1) {
>>>> +			dev_err(chan2dev(chan), "Not under CID1 control (in-use by CID%d)\n", ccid);
>>>> +			ret = -EPERM;
>>>> +			goto err_pool_destroy;
>>>> +		}
>>>> +		dev_dbg(chan2dev(chan), "Under CID1 control (semcr=0x%08x)\n", csemcr);
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +
>>>> +err_pool_destroy:
>>>> +	dmam_pool_destroy(chan->lli_pool);
>>>> +	chan->lli_pool = NULL;
>>>> +
>>>> +err_put_sync:
>>>> +	pm_runtime_put_sync(ddata->dma_dev.dev);
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +static void stm32_dma3_free_chan_resources(struct dma_chan *c)
>>>> +{
>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>> +	unsigned long flags;
>>>> +
>>>> +	/* Ensure channel is in idle state */
>>>> +	spin_lock_irqsave(&chan->vchan.lock, flags);
>>>> +	stm32_dma3_chan_stop(chan);
>>>> +	chan->swdesc = NULL;
>>>> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
>>>> +
>>>> +	vchan_free_chan_resources(to_virt_chan(c));
>>>> +
>>>> +	dmam_pool_destroy(chan->lli_pool);
>>>> +	chan->lli_pool = NULL;
>>>> +
>>>> +	/* Release the channel semaphore */
>>>> +	if (chan->semaphore_mode)
>>>> +		writel_relaxed(0, ddata->base + STM32_DMA3_CSEMCR(chan->id));
>>>> +
>>>> +	pm_runtime_put_sync(ddata->dma_dev.dev);
>>>> +
>>>> +	/* Reset configuration */
>>>> +	memset(&chan->dt_config, 0, sizeof(chan->dt_config));
>>>> +	memset(&chan->dma_config, 0, sizeof(chan->dma_config));
>>>> +}
>>>> +
>>>> +static struct dma_async_tx_descriptor *stm32_dma3_prep_slave_sg(struct dma_chan *c,
>>>> +								struct scatterlist *sgl,
>>>> +								unsigned int sg_len,
>>>> +								enum dma_transfer_direction dir,
>>>> +								unsigned long flags, void *context)
>>>> +{
>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>> +	struct stm32_dma3_swdesc *swdesc;
>>>> +	struct scatterlist *sg;
>>>> +	size_t len;
>>>> +	dma_addr_t sg_addr, dev_addr, src, dst;
>>>> +	u32 i, j, count, ctr1, ctr2;
>>>> +	int ret;
>>>> +
>>>> +	count = sg_len;
>>>> +	for_each_sg(sgl, sg, sg_len, i) {
>>>> +		len = sg_dma_len(sg);
>>>> +		if (len > STM32_DMA3_MAX_BLOCK_SIZE)
>>>> +			count += DIV_ROUND_UP(len, STM32_DMA3_MAX_BLOCK_SIZE) - 1;
>>>> +	}
>>>> +
>>>> +	swdesc = stm32_dma3_chan_desc_alloc(chan, count);
>>>> +	if (!swdesc)
>>>> +		return NULL;
>>>> +
>>>> +	/* sg_len and i correspond to the initial sgl; count and j correspond to the hwdesc LL */
>>>> +	j = 0;
>>>> +	for_each_sg(sgl, sg, sg_len, i) {
>>>> +		sg_addr = sg_dma_address(sg);
>>>> +		dev_addr = (dir == DMA_MEM_TO_DEV) ? chan->dma_config.dst_addr :
>>>> +						     chan->dma_config.src_addr;
>>>> +		len = sg_dma_len(sg);
>>>> +
>>>> +		do {
>>>> +			size_t chunk = min_t(size_t, len, STM32_DMA3_MAX_BLOCK_SIZE);
>>>> +
>>>> +			if (dir == DMA_MEM_TO_DEV) {
>>>> +				src = sg_addr;
>>>> +				dst = dev_addr;
>>>> +
>>>> +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
>>>> +							      src, dst, chunk);
>>>> +
>>>> +				if (FIELD_GET(CTR1_DINC, ctr1))
>>>> +					dev_addr += chunk;
>>>> +			} else { /* (dir == DMA_DEV_TO_MEM || dir == DMA_MEM_TO_MEM) */
>>>> +				src = dev_addr;
>>>> +				dst = sg_addr;
>>>> +
>>>> +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
>>>> +							      src, dst, chunk);
>>>> +
>>>> +				if (FIELD_GET(CTR1_SINC, ctr1))
>>>> +					dev_addr += chunk;
>>>> +			}
>>>> +
>>>> +			if (ret)
>>>> +				goto err_desc_free;
>>>> +
>>>> +			stm32_dma3_chan_prep_hwdesc(chan, swdesc, j, src, dst, chunk,
>>>> +						    ctr1, ctr2, j == (count - 1), false);
>>>> +
>>>> +			sg_addr += chunk;
>>>> +			len -= chunk;
>>>> +			j++;
>>>> +		} while (len);
>>>> +	}
>>>> +
>>>> +	/* Enable Error interrupts */
>>>> +	swdesc->ccr |= CCR_USEIE | CCR_ULEIE | CCR_DTEIE;
>>>> +	/* Enable Transfer state interrupts */
>>>> +	swdesc->ccr |= CCR_TCIE;
>>>> +
>>>> +	swdesc->cyclic = false;
>>>> +
>>>> +	return vchan_tx_prep(&chan->vchan, &swdesc->vdesc, flags);
>>>> +
>>>> +err_desc_free:
>>>> +	stm32_dma3_chan_desc_free(chan, swdesc);
>>>> +
>>>> +	return NULL;
>>>> +}
>>>> +
>>>> +static void stm32_dma3_caps(struct dma_chan *c, struct dma_slave_caps *caps)
>>>> +{
>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>> +
>>>> +	if (!chan->fifo_size) {
>>>> +		caps->max_burst = 0;
>>>> +		caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>>>> +		caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>>>> +	} else {
>>>> +		/* Burst transfer should not exceed half of the fifo size */
>>>> +		caps->max_burst = chan->max_burst;
>>>> +		if (caps->max_burst < DMA_SLAVE_BUSWIDTH_8_BYTES) {
>>>> +			caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>>>> +			caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>>>> +		}
>>>> +	}
>>>> +}
>>>> +
>>>> +static int stm32_dma3_config(struct dma_chan *c, struct dma_slave_config *config)
>>>> +{
>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>> +
>>>> +	memcpy(&chan->dma_config, config, sizeof(*config));
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int stm32_dma3_terminate_all(struct dma_chan *c)
>>>> +{
>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>> +	unsigned long flags;
>>>> +	LIST_HEAD(head);
>>>> +
>>>> +	spin_lock_irqsave(&chan->vchan.lock, flags);
>>>> +
>>>> +	if (chan->swdesc) {
>>>> +		vchan_terminate_vdesc(&chan->swdesc->vdesc);
>>>> +		chan->swdesc = NULL;
>>>> +	}
>>>> +
>>>> +	stm32_dma3_chan_stop(chan);
>>>> +
>>>> +	vchan_get_all_descriptors(&chan->vchan, &head);
>>>> +
>>>> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
>>>> +	vchan_dma_desc_free_list(&chan->vchan, &head);
>>>> +
>>>> +	dev_dbg(chan2dev(chan), "vchan %pK: terminated\n", &chan->vchan);
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static void stm32_dma3_synchronize(struct dma_chan *c)
>>>> +{
>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>> +
>>>> +	vchan_synchronize(&chan->vchan);
>>>> +}
>>>> +
>>>> +static void stm32_dma3_issue_pending(struct dma_chan *c)
>>>> +{
>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>> +	unsigned long flags;
>>>> +
>>>> +	spin_lock_irqsave(&chan->vchan.lock, flags);
>>>> +
>>>> +	if (vchan_issue_pending(&chan->vchan) && !chan->swdesc) {
>>>> +		dev_dbg(chan2dev(chan), "vchan %pK: issued\n", &chan->vchan);
>>>> +		stm32_dma3_chan_start(chan);
>>>> +	}
>>>> +
>>>> +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
>>>> +}
>>>> +
>>>> +static bool stm32_dma3_filter_fn(struct dma_chan *c, void *fn_param)
>>>> +{
>>>> +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
>>>> +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
>>>> +	struct stm32_dma3_dt_conf *conf = fn_param;
>>>> +	u32 mask, semcr;
>>>> +	int ret;
>>>> +
>>>> +	dev_dbg(c->device->dev, "%s(%s): req_line=%d ch_conf=%08x tr_conf=%08x\n",
>>>> +		__func__, dma_chan_name(c), conf->req_line, conf->ch_conf, conf->tr_conf);
>>>> +
>>>> +	if (!of_property_read_u32(c->device->dev->of_node, "dma-channel-mask", &mask))
>>>> +		if (!(mask & BIT(chan->id)))
>>>> +			return false;
>>>> +
>>>> +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
>>>> +	if (ret < 0)
>>>> +		return false;
>>>> +	semcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id));
>>>> +	pm_runtime_put_sync(ddata->dma_dev.dev);
>>>> +
>>>> +	/* Check if chan is free */
>>>> +	if (semcr & CSEMCR_SEM_MUTEX)
>>>> +		return false;
>>>> +
>>>> +	/* Check if chan fifo fits well */
>>>> +	if (FIELD_GET(STM32_DMA3_DT_FIFO, conf->ch_conf) != chan->fifo_size)
>>>> +		return false;
>>>> +
>>>> +	return true;
>>>> +}
>>>> +
>>>> +static struct dma_chan *stm32_dma3_of_xlate(struct of_phandle_args *dma_spec, struct of_dma *ofdma)
>>>> +{
>>>> +	struct stm32_dma3_ddata *ddata = ofdma->of_dma_data;
>>>> +	dma_cap_mask_t mask = ddata->dma_dev.cap_mask;
>>>> +	struct stm32_dma3_dt_conf conf;
>>>> +	struct stm32_dma3_chan *chan;
>>>> +	struct dma_chan *c;
>>>> +
>>>> +	if (dma_spec->args_count < 3) {
>>>> +		dev_err(ddata->dma_dev.dev, "Invalid args count\n");
>>>> +		return NULL;
>>>> +	}
>>>> +
>>>> +	conf.req_line = dma_spec->args[0];
>>>> +	conf.ch_conf = dma_spec->args[1];
>>>> +	conf.tr_conf = dma_spec->args[2];
>>>> +
>>>> +	if (conf.req_line >= ddata->dma_requests) {
>>>> +		dev_err(ddata->dma_dev.dev, "Invalid request line\n");
>>>> +		return NULL;
>>>> +	}
>>>> +
>>>> +	/* Request dma channel among the generic dma controller list */
>>>> +	c = dma_request_channel(mask, stm32_dma3_filter_fn, &conf);
>>>> +	if (!c) {
>>>> +		dev_err(ddata->dma_dev.dev, "No suitable channel found\n");
>>>> +		return NULL;
>>>> +	}
>>>> +
>>>> +	chan = to_stm32_dma3_chan(c);
>>>> +	chan->dt_config = conf;
>>>> +
>>>> +	return c;
>>>> +}
>>>> +
>>>> +static u32 stm32_dma3_check_rif(struct stm32_dma3_ddata *ddata)
>>>> +{
>>>> +	u32 chan_reserved, mask = 0, i, ccidcfgr, invalid_cid = 0;
>>>> +
>>>> +	/* Reserve Secure channels */
>>>> +	chan_reserved = readl_relaxed(ddata->base + STM32_DMA3_SECCFGR);
>>>> +
>>>> +	/*
>>>> +	 * CID filtering must be configured to ensure that the DMA3 channel will inherit the CID of
>>>> +	 * the processor which is configuring and using the given channel.
>>>> +	 * In case CID filtering is not configured, dma-channel-mask property can be used to
>>>> +	 * specify available DMA channels to the kernel.
>>>> +	 */
>>>> +	of_property_read_u32(ddata->dma_dev.dev->of_node, "dma-channel-mask", &mask);
>>>> +
>>>> +	/* Reserve !CID-filtered not in dma-channel-mask, static CID != CID1, CID1 not allowed */
>>>> +	for (i = 0; i < ddata->dma_channels; i++) {
>>>> +		ccidcfgr = readl_relaxed(ddata->base + STM32_DMA3_CCIDCFGR(i));
>>>> +
>>>> +		if (!(ccidcfgr & CCIDCFGR_CFEN)) { /* !CID-filtered */
>>>> +			invalid_cid |= BIT(i);
>>>> +			if (!(mask & BIT(i))) /* Not in dma-channel-mask */
>>>> +				chan_reserved |= BIT(i);
>>>> +		} else { /* CID-filtered */
>>>> +			if (!(ccidcfgr & CCIDCFGR_SEM_EN)) { /* Static CID mode */
>>>> +				if (FIELD_GET(CCIDCFGR_SCID, ccidcfgr) != CCIDCFGR_CID1)
>>>> +					chan_reserved |= BIT(i);
>>>> +			} else { /* Semaphore mode */
>>>> +				if (!FIELD_GET(CCIDCFGR_SEM_WLIST_CID1, ccidcfgr))
>>>> +					chan_reserved |= BIT(i);
>>>> +				ddata->chans[i].semaphore_mode = true;
>>>> +			}
>>>> +		}
>>>> +		dev_dbg(ddata->dma_dev.dev, "chan%d: %s mode, %s\n", i,
>>>> +			!(ccidcfgr & CCIDCFGR_CFEN) ? "!CID-filtered" :
>>>> +			ddata->chans[i].semaphore_mode ? "Semaphore" : "Static CID",
>>>> +			(chan_reserved & BIT(i)) ? "denied" :
>>>> +			mask & BIT(i) ? "force allowed" : "allowed");
>>>> +	}
>>>> +
>>>> +	if (invalid_cid)
>>>> +		dev_warn(ddata->dma_dev.dev, "chan%*pbl have invalid CID configuration\n",
>>>> +			 ddata->dma_channels, &invalid_cid);
>>>> +
>>>> +	return chan_reserved;
>>>> +}
>>>> +
>>>> +static const struct of_device_id stm32_dma3_of_match[] = {
>>>> +	{ .compatible = "st,stm32-dma3", },
>>>> +	{ /* sentinel */},
>>>> +};
>>>> +MODULE_DEVICE_TABLE(of, stm32_dma3_of_match);
>>>> +
>>>> +static int stm32_dma3_probe(struct platform_device *pdev)
>>>> +{
>>>> +	struct device_node *np = pdev->dev.of_node;
>>>> +	struct stm32_dma3_ddata *ddata;
>>>> +	struct reset_control *reset;
>>>> +	struct stm32_dma3_chan *chan;
>>>> +	struct dma_device *dma_dev;
>>>> +	u32 master_ports, chan_reserved, i, verr;
>>>> +	u64 hwcfgr;
>>>> +	int ret;
>>>> +
>>>> +	ddata = devm_kzalloc(&pdev->dev, sizeof(*ddata), GFP_KERNEL);
>>>> +	if (!ddata)
>>>> +		return -ENOMEM;
>>>> +	platform_set_drvdata(pdev, ddata);
>>>> +
>>>> +	dma_dev = &ddata->dma_dev;
>>>> +
>>>> +	ddata->base = devm_platform_ioremap_resource(pdev, 0);
>>>> +	if (IS_ERR(ddata->base))
>>>> +		return PTR_ERR(ddata->base);
>>>> +
>>>> +	ddata->clk = devm_clk_get(&pdev->dev, NULL);
>>>> +	if (IS_ERR(ddata->clk))
>>>> +		return dev_err_probe(&pdev->dev, PTR_ERR(ddata->clk), "Failed to get clk\n");
>>>> +
>>>> +	reset = devm_reset_control_get_optional(&pdev->dev, NULL);
>>>> +	if (IS_ERR(reset))
>>>> +		return dev_err_probe(&pdev->dev, PTR_ERR(reset), "Failed to get reset\n");
>>>> +
>>>> +	ret = clk_prepare_enable(ddata->clk);
>>>> +	if (ret)
>>>> +		return dev_err_probe(&pdev->dev, ret, "Failed to enable clk\n");
>>>> +
>>>> +	reset_control_reset(reset);
>>>> +
>>>> +	INIT_LIST_HEAD(&dma_dev->channels);
>>>> +
>>>> +	dma_cap_set(DMA_SLAVE, dma_dev->cap_mask);
>>>> +	dma_cap_set(DMA_PRIVATE, dma_dev->cap_mask);
>>>> +	dma_dev->dev = &pdev->dev;
>>>> +	/*
>>>> +	 * This controller supports up to 8-byte buswidth depending on the port used and the
>>>> +	 * channel, and can only access address at even boundaries, multiple of the buswidth.
>>>> +	 */
>>>> +	dma_dev->copy_align = DMAENGINE_ALIGN_8_BYTES;
>>>> +	dma_dev->src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
>>>> +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
>>>> +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
>>>> +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>>>> +	dma_dev->dst_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
>>>> +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
>>>> +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
>>>> +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
>>>> +	dma_dev->directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV) | BIT(DMA_MEM_TO_MEM);
>>>> +
>>>> +	dma_dev->descriptor_reuse = true;
>>>> +	dma_dev->max_sg_burst = STM32_DMA3_MAX_SEG_SIZE;
>>>> +	dma_dev->residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
>>>> +	dma_dev->device_alloc_chan_resources = stm32_dma3_alloc_chan_resources;
>>>> +	dma_dev->device_free_chan_resources = stm32_dma3_free_chan_resources;
>>>> +	dma_dev->device_prep_slave_sg = stm32_dma3_prep_slave_sg;
>>>> +	dma_dev->device_caps = stm32_dma3_caps;
>>>> +	dma_dev->device_config = stm32_dma3_config;
>>>> +	dma_dev->device_terminate_all = stm32_dma3_terminate_all;
>>>> +	dma_dev->device_synchronize = stm32_dma3_synchronize;
>>>> +	dma_dev->device_tx_status = dma_cookie_status;
>>>> +	dma_dev->device_issue_pending = stm32_dma3_issue_pending;
>>>> +
>>>> +	/* if dma_channels is not modified, get it from hwcfgr1 */
>>>> +	if (of_property_read_u32(np, "dma-channels", &ddata->dma_channels)) {
>>>> +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
>>>> +		ddata->dma_channels = FIELD_GET(G_NUM_CHANNELS, hwcfgr);
>>>> +	}
>>>> +
>>>> +	/* if dma_requests is not modified, get it from hwcfgr2 */
>>>> +	if (of_property_read_u32(np, "dma-requests", &ddata->dma_requests)) {
>>>> +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR2);
>>>> +		ddata->dma_requests = FIELD_GET(G_MAX_REQ_ID, hwcfgr) + 1;
>>>> +	}
>>>> +
>>>> +	/* G_MASTER_PORTS, G_M0_DATA_WIDTH_ENC, G_M1_DATA_WIDTH_ENC in HWCFGR1 */
>>>> +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
>>>> +	master_ports = FIELD_GET(G_MASTER_PORTS, hwcfgr);
>>>> +
>>>> +	ddata->ports_max_dw[0] = FIELD_GET(G_M0_DATA_WIDTH_ENC, hwcfgr);
>>>> +	if (master_ports == AXI64 || master_ports == AHB32) /* Single master port */
>>>> +		ddata->ports_max_dw[1] = DW_INVALID;
>>>> +	else /* Dual master ports */
>>>> +		ddata->ports_max_dw[1] = FIELD_GET(G_M1_DATA_WIDTH_ENC, hwcfgr);
>>>> +
>>>> +	ddata->chans = devm_kcalloc(&pdev->dev, ddata->dma_channels, sizeof(*ddata->chans),
>>>> +				    GFP_KERNEL);
>>>> +	if (!ddata->chans) {
>>>> +		ret = -ENOMEM;
>>>> +		goto err_clk_disable;
>>>> +	}
>>>> +
>>>> +	chan_reserved = stm32_dma3_check_rif(ddata);
>>>> +
>>>> +	if (chan_reserved == GENMASK(ddata->dma_channels - 1, 0)) {
>>>> +		ret = -ENODEV;
>>>> +		dev_err_probe(&pdev->dev, ret, "No channel available, abort registration\n");
>>>> +		goto err_clk_disable;
>>>> +	}
>>>> +
>>>> +	/* G_FIFO_SIZE x=0..7 in HWCFGR3 and G_FIFO_SIZE x=8..15 in HWCFGR4 */
>>>> +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR3);
>>>> +	hwcfgr |= ((u64)readl_relaxed(ddata->base + STM32_DMA3_HWCFGR4)) << 32;
>>>> +
>>>> +	for (i = 0; i < ddata->dma_channels; i++) {
>>>> +		if (chan_reserved & BIT(i))
>>>> +			continue;
>>>> +
>>>> +		chan = &ddata->chans[i];
>>>> +		chan->id = i;
>>>> +		chan->fifo_size = get_chan_hwcfg(i, G_FIFO_SIZE(i), hwcfgr);
>>>> +		/* If chan->fifo_size > 0 then half of the fifo size, else no burst when no FIFO */
>>>> +		chan->max_burst = (chan->fifo_size) ? (1 << (chan->fifo_size + 1)) / 2 : 0;
>>>> +		chan->vchan.desc_free = stm32_dma3_chan_vdesc_free;
>>>> +
>>>> +		vchan_init(&chan->vchan, dma_dev);
>>>> +	}
>>>> +
>>>> +	ret = dmaenginem_async_device_register(dma_dev);
>>>> +	if (ret)
>>>> +		goto err_clk_disable;
>>>> +
>>>> +	for (i = 0; i < ddata->dma_channels; i++) {
>>>> +		if (chan_reserved & BIT(i))
>>>> +			continue;
>>>> +
>>>> +		ret = platform_get_irq(pdev, i);
>>>> +		if (ret < 0)
>>>> +			goto err_clk_disable;
>>>> +
>>>> +		chan = &ddata->chans[i];
>>>> +		chan->irq = ret;
>>>> +
>>>> +		ret = devm_request_irq(&pdev->dev, chan->irq, stm32_dma3_chan_irq, 0,
>>>> +				       dev_name(chan2dev(chan)), chan);
>>>> +		if (ret) {
>>>> +			dev_err_probe(&pdev->dev, ret, "Failed to request channel %s IRQ\n",
>>>> +				      dev_name(chan2dev(chan)));
>>>> +			goto err_clk_disable;
>>>> +		}
>>>> +	}
>>>> +
>>>> +	ret = of_dma_controller_register(np, stm32_dma3_of_xlate, ddata);
>>>> +	if (ret) {
>>>> +		dev_err_probe(&pdev->dev, ret, "Failed to register controller\n");
>>>> +		goto err_clk_disable;
>>>> +	}
>>>> +
>>>> +	verr = readl_relaxed(ddata->base + STM32_DMA3_VERR);
>>>> +
>>>> +	pm_runtime_set_active(&pdev->dev);
>>>> +	pm_runtime_enable(&pdev->dev);
>>>> +	pm_runtime_get_noresume(&pdev->dev);
>>>> +	pm_runtime_put(&pdev->dev);
>>>> +
>>>> +	dev_info(&pdev->dev, "STM32 DMA3 registered rev:%lu.%lu\n",
>>>> +		 FIELD_GET(VERR_MAJREV, verr), FIELD_GET(VERR_MINREV, verr));
>>>> +
>>>> +	return 0;
>>>> +
>>>> +err_clk_disable:
>>>> +	clk_disable_unprepare(ddata->clk);
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +static void stm32_dma3_remove(struct platform_device *pdev)
>>>> +{
>>>> +	pm_runtime_disable(&pdev->dev);
>>>> +}
>>>> +
>>>> +static int stm32_dma3_runtime_suspend(struct device *dev)
>>>> +{
>>>> +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
>>>> +
>>>> +	clk_disable_unprepare(ddata->clk);
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int stm32_dma3_runtime_resume(struct device *dev)
>>>> +{
>>>> +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
>>>> +	int ret;
>>>> +
>>>> +	ret = clk_prepare_enable(ddata->clk);
>>>> +	if (ret)
>>>> +		dev_err(dev, "Failed to enable clk: %d\n", ret);
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +static const struct dev_pm_ops stm32_dma3_pm_ops = {
>>>> +	SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
>>>> +	RUNTIME_PM_OPS(stm32_dma3_runtime_suspend, stm32_dma3_runtime_resume, NULL)
>>>> +};
>>>> +
>>>> +static struct platform_driver stm32_dma3_driver = {
>>>> +	.probe = stm32_dma3_probe,
>>>> +	.remove_new = stm32_dma3_remove,
>>>> +	.driver = {
>>>> +		.name = "stm32-dma3",
>>>> +		.of_match_table = stm32_dma3_of_match,
>>>> +		.pm = pm_ptr(&stm32_dma3_pm_ops),
>>>> +	},
>>>> +};
>>>> +
>>>> +static int __init stm32_dma3_init(void)
>>>> +{
>>>> +	return platform_driver_register(&stm32_dma3_driver);
>>>> +}
>>>> +
>>>> +subsys_initcall(stm32_dma3_init);
>>>> +
>>>> +MODULE_DESCRIPTION("STM32 DMA3 controller driver");
>>>> +MODULE_AUTHOR("Amelie Delaunay <amelie.delaunay@foss.st.com>");
>>>> +MODULE_LICENSE("GPL");
>>>> -- 
>>>> 2.25.1
>>>>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 05/12] dmaengine: Add STM32 DMA3 support
  2024-05-17  9:42         ` Amelie Delaunay
@ 2024-05-17 14:57           ` Frank Li
  0 siblings, 0 replies; 29+ messages in thread
From: Frank Li @ 2024-05-17 14:57 UTC (permalink / raw)
  To: Amelie Delaunay
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Maxime Coquelin, Alexandre Torgue, dmaengine, devicetree,
	linux-stm32, linux-arm-kernel, linux-kernel, linux-hardening

On Fri, May 17, 2024 at 11:42:17AM +0200, Amelie Delaunay wrote:
> On 5/16/24 19:09, Frank Li wrote:
> > On Thu, May 16, 2024 at 05:25:58PM +0200, Amelie Delaunay wrote:
> > > On 5/15/24 20:56, Frank Li wrote:
> > > > On Tue, Apr 23, 2024 at 02:32:55PM +0200, Amelie Delaunay wrote:
> > > > > STM32 DMA3 driver supports the 3 hardware configurations of the STM32 DMA3
> > > > > controller:
> > ...
> > > > > +	writel_relaxed(hwdesc->cdar, ddata->base + STM32_DMA3_CDAR(id));
> > > > > +	writel_relaxed(hwdesc->cllr, ddata->base + STM32_DMA3_CLLR(id));
> > > > > +
> > > > > +	/* Clear any pending interrupts */
> > > > > +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(id));
> > > > > +	if (csr & CSR_ALL_F)
> > > > > +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(id));
> > > > > +
> > > > > +	stm32_dma3_chan_dump_reg(chan);
> > > > > +
> > > > > +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(id));
> > > > > +	writel_relaxed(ccr | CCR_EN, ddata->base + STM32_DMA3_CCR(id));
> > > > 
> > > > This one should use writel instead of writel_relaxed because it need
> > > > dma_wmb() as barrier for preious write complete.
> > > > 
> > > > Frank
> > > > 
> > > 
> > > ddata->base is Device memory type thanks to ioremap() use, so it is strongly
> > > ordered and non-cacheable.
> > > DMA3 is outside CPU cluster, its registers are accessible through AHB bus.
> > > dma_wmb() (in case of writel instead of writel_relaxed) is useless in that
> > > case: it won't ensure the propagation on the bus is complete, and it will
> > > have impacts on the system.
> > > That's why CCR register is written once,  then it is read before CCR_EN is
> > > set and being written again, with _relaxed(), because registers are behind a
> > > bus, and ioremapped with Device memory type which ensures it is strongly
> > > ordered and non-cacheable.
> > 
> > regardless memory map, writel_relaxed() just make sure io write and read is
> > orderred, not necessary order with other memory access. only readl and
> > writel make sure order with other memory read/write.
> > 
> > 1. Write src_addr to descriptor
> > 2. dma_wmb()
> > 3. Write "ready" to descriptor
> > 4. enable channel or doorbell by write a register.
> > 
> > if 4 use writel_relaxe(). because 3 write to DDR, which difference place of
> > mmio, 4 may happen before 3.  Your can refer axi order model.
> > 
> > 4 have to use ONE writel(), to make sure 3 already write to DDR.
> > 
> > You need use at least one writel() to make sure all nornmal memory finish.
> > 
> 
> +    writel_relaxed(chan->swdesc->ccr, ddata->base + STM32_DMA3_CCR(id));
> +    writel_relaxed(hwdesc->ctr1, ddata->base + STM32_DMA3_CTR1(id));
> +    writel_relaxed(hwdesc->ctr2, ddata->base + STM32_DMA3_CTR2(id));
> +    writel_relaxed(hwdesc->cbr1, ddata->base + STM32_DMA3_CBR1(id));
> +    writel_relaxed(hwdesc->csar, ddata->base + STM32_DMA3_CSAR(id));
> +    writel_relaxed(hwdesc->cdar, ddata->base + STM32_DMA3_CDAR(id));
> +    writel_relaxed(hwdesc->cllr, ddata->base + STM32_DMA3_CLLR(id));
> 
> These writel_relaxed() are from descriptors to DMA3 registers (descriptors
> being prepared "a long time ago" during _prep_).

You can't depend on "a long time ago" during _prep_. If your driver later
runs on a faster CPU, the execution time will be short.

All dma_map_sg() and dma_alloc_coherent() ... need at least one writel() to
make sure the previous writes actually reach DDR.

Some data may not actually have reached DDR when the DMA has already
started the transfer.

Please refer to the Linux kernel documentation:
	Documentation/memory-barriers.txt, line 1948.

Your issue_pending() calls this function to enable the channel, so it needs
at least one writel().
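
To make it concrete, a minimal sketch of the 4 steps above (simplified, not
your exact code; the helper below is hypothetical): the linked-list item
lives in normal (coherent) memory, the channel enable is MMIO, and that
final store is where the one writel() is needed:

static void dma3_kick(struct stm32_dma3_ddata *ddata, u32 id,
		      struct stm32_dma3_hwdesc *hwdesc,
		      dma_addr_t src, dma_addr_t dst, u32 len)
{
	/* 1..3: fill the linked-list item in normal (coherent) memory */
	hwdesc->csar = lower_32_bits(src);
	hwdesc->cdar = lower_32_bits(dst);
	hwdesc->cbr1 = len;

	/*
	 * 4: enable the channel. writel() implies a write barrier before the
	 * MMIO store, so the writes above are observable in DDR before the
	 * DMA engine can fetch the item. writel_relaxed() only orders against
	 * other accesses to the same Device mapping, not against normal
	 * memory.
	 */
	writel(readl_relaxed(ddata->base + STM32_DMA3_CCR(id)) | CCR_EN,
	       ddata->base + STM32_DMA3_CCR(id));
}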

> As I said previously, DMA3 registers are outside CPU cluster, accessible
> through AHB bus, and ddata->base to address registers is ioremapped as
> Device memory type, non-cacheable and strongly ordered.
> 
> arch/arm/include/asm/io.h:
> /*
> * ioremap() and friends.
> *
> * ioremap() takes a resource address, and size.  Due to the ARM memory
> * types, it is important to use the correct ioremap() function as each
> * mapping has specific properties.
> *
> * Function		Memory type	Cacheability	Cache hint
> * *ioremap()*		*Device*		*n/a*		*n/a*
> * ioremap_cache()	Normal		Writeback	Read allocate
> * ioremap_wc()		Normal		Non-cacheable	n/a
> * ioremap_wt()		Normal		Non-cacheable	n/a
> *
> * All device mappings have the following properties:
> * - no access speculation
> * - no repetition (eg, on return from an exception)
> * - number, order and size of accesses are maintained
> * - unaligned accesses are "unpredictable"
> * - writes may be delayed before they hit the endpoint device
> 
> On our platforms, we know that to ensure the writes have hit the endpoint
> device (aka DMA3 registers), a read have to be done before.
> And that's what is done before enabling the channel:
> 
> +    ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(id));
> +    writel_relaxed(ccr | CCR_EN, ddata->base + STM32_DMA3_CCR(id));
> 
> If there was an issue in this part of the code, it means channel would be
> started while it is wrongly programmed. In that case, DMA3 would raise a
> User Setting Error interrupt and disable the channel. User Setting Error is
> managed in this driver (USEF/stm32_dma3_check_user_setting()). And we never
> had reached a situation.


I am not talking about the register I/O read/write order. readl_relaxed() and
writel_relaxed() guarantee that I/O reads and writes are ordered with respect
to each other, but they do not guarantee ordering with normal memory reads
and writes.

Not having hit the problem doesn't mean your code is correct. Please check
the document:
	Documentation/memory-barriers.txt

> 
> > > 
> > > > > +
> > > > > +	chan->dma_status = DMA_IN_PROGRESS;
> > > > > +
> > > > > +	dev_dbg(chan2dev(chan), "vchan %pK: started\n", &chan->vchan);
> > > > > +}
> > > > > +
> > > > > +static int stm32_dma3_chan_suspend(struct stm32_dma3_chan *chan, bool susp)
> > > > > +{
> > > > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > > > +	u32 csr, ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
> > > > > +	int ret = 0;
> > > > > +
> > > > > +	if (susp)
> > > > > +		ccr |= CCR_SUSP;
> > > > > +	else
> > > > > +		ccr &= ~CCR_SUSP;
> > > > > +
> > > > > +	writel_relaxed(ccr, ddata->base + STM32_DMA3_CCR(chan->id));
> > > > > +
> > > > > +	if (susp) {
> > > > > +		ret = readl_relaxed_poll_timeout_atomic(ddata->base + STM32_DMA3_CSR(chan->id), csr,
> > > > > +							csr & CSR_SUSPF, 1, 10);
> > > > > +		if (!ret)
> > > > > +			writel_relaxed(CFCR_SUSPF, ddata->base + STM32_DMA3_CFCR(chan->id));
> > > > > +
> > > > > +		stm32_dma3_chan_dump_reg(chan);
> > > > > +	}
> > > > > +
> > > > > +	return ret;
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_chan_reset(struct stm32_dma3_chan *chan)
> > > > > +{
> > > > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > > > +	u32 ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & ~CCR_EN;
> > > > > +
> > > > > +	writel_relaxed(ccr |= CCR_RESET, ddata->base + STM32_DMA3_CCR(chan->id));
> > > > > +}
> > > > > +
> > > > > +static int stm32_dma3_chan_stop(struct stm32_dma3_chan *chan)
> > > > > +{
> > > > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > > > +	u32 ccr;
> > > > > +	int ret = 0;
> > > > > +
> > > > > +	chan->dma_status = DMA_COMPLETE;
> > > > > +
> > > > > +	/* Disable interrupts */
> > > > > +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id));
> > > > > +	writel_relaxed(ccr & ~(CCR_ALLIE | CCR_EN), ddata->base + STM32_DMA3_CCR(chan->id));
> > > > > +
> > > > > +	if (!(ccr & CCR_SUSP) && (ccr & CCR_EN)) {
> > > > > +		/* Suspend the channel */
> > > > > +		ret = stm32_dma3_chan_suspend(chan, true);
> > > > > +		if (ret)
> > > > > +			dev_warn(chan2dev(chan), "%s: timeout, data might be lost\n", __func__);
> > > > > +	}
> > > > > +
> > > > > +	/*
> > > > > +	 * Reset the channel: this causes the reset of the FIFO and the reset of the channel
> > > > > +	 * internal state, the reset of CCR_EN and CCR_SUSP bits.
> > > > > +	 */
> > > > > +	stm32_dma3_chan_reset(chan);
> > > > > +
> > > > > +	return ret;
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_chan_complete(struct stm32_dma3_chan *chan)
> > > > > +{
> > > > > +	if (!chan->swdesc)
> > > > > +		return;
> > > > > +
> > > > > +	vchan_cookie_complete(&chan->swdesc->vdesc);
> > > > > +	chan->swdesc = NULL;
> > > > > +	stm32_dma3_chan_start(chan);
> > > > > +}
> > > > > +
> > > > > +static irqreturn_t stm32_dma3_chan_irq(int irq, void *devid)
> > > > > +{
> > > > > +	struct stm32_dma3_chan *chan = devid;
> > > > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > > > +	u32 misr, csr, ccr;
> > > > > +
> > > > > +	spin_lock(&chan->vchan.lock);
> > > > > +
> > > > > +	misr = readl_relaxed(ddata->base + STM32_DMA3_MISR);
> > > > > +	if (!(misr & MISR_MIS(chan->id))) {
> > > > > +		spin_unlock(&chan->vchan.lock);
> > > > > +		return IRQ_NONE;
> > > > > +	}
> > > > > +
> > > > > +	csr = readl_relaxed(ddata->base + STM32_DMA3_CSR(chan->id));
> > > > > +	ccr = readl_relaxed(ddata->base + STM32_DMA3_CCR(chan->id)) & CCR_ALLIE;
> > > > > +
> > > > > +	if (csr & CSR_TCF && ccr & CCR_TCIE) {
> > > > > +		if (chan->swdesc->cyclic)
> > > > > +			vchan_cyclic_callback(&chan->swdesc->vdesc);
> > > > > +		else
> > > > > +			stm32_dma3_chan_complete(chan);
> > > > > +	}
> > > > > +
> > > > > +	if (csr & CSR_USEF && ccr & CCR_USEIE) {
> > > > > +		dev_err(chan2dev(chan), "User setting error\n");
> > > > > +		chan->dma_status = DMA_ERROR;
> > > > > +		/* CCR.EN automatically cleared by HW */
> > > > > +		stm32_dma3_check_user_setting(chan);
> > > > > +		stm32_dma3_chan_reset(chan);
> > > > > +	}
> > > > > +
> > > > > +	if (csr & CSR_ULEF && ccr & CCR_ULEIE) {
> > > > > +		dev_err(chan2dev(chan), "Update link transfer error\n");
> > > > > +		chan->dma_status = DMA_ERROR;
> > > > > +		/* CCR.EN automatically cleared by HW */
> > > > > +		stm32_dma3_chan_reset(chan);
> > > > > +	}
> > > > > +
> > > > > +	if (csr & CSR_DTEF && ccr & CCR_DTEIE) {
> > > > > +		dev_err(chan2dev(chan), "Data transfer error\n");
> > > > > +		chan->dma_status = DMA_ERROR;
> > > > > +		/* CCR.EN automatically cleared by HW */
> > > > > +		stm32_dma3_chan_reset(chan);
> > > > > +	}
> > > > > +
> > > > > +	/*
> > > > > +	 * Half Transfer Interrupt may be disabled but Half Transfer Flag can be set,
> > > > > +	 * ensure HTF flag to be cleared, with other flags.
> > > > > +	 */
> > > > > +	csr &= (ccr | CCR_HTIE);
> > > > > +
> > > > > +	if (csr)
> > > > > +		writel_relaxed(csr, ddata->base + STM32_DMA3_CFCR(chan->id));
> > > > > +
> > > > > +	spin_unlock(&chan->vchan.lock);
> > > > > +
> > > > > +	return IRQ_HANDLED;
> > > > > +}
> > > > > +
> > > > > +static int stm32_dma3_alloc_chan_resources(struct dma_chan *c)
> > > > > +{
> > > > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > > > +	u32 id = chan->id, csemcr, ccid;
> > > > > +	int ret;
> > > > > +
> > > > > +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
> > > > > +	if (ret < 0)
> > > > > +		return ret;
> > > > 
> > > > It doesn't prefer runtime pm get at alloc dma chan, many client driver
> > > > doesn't actual user dma when allocate dma chan.
> > > > 
> > > > Ideally, resume get when issue_pending. Please refer pl330.c.
> > > > 
> > > > You may add runtime pm later after enablement patch.
> > > > 
> > > > Frank
> > > > 
> > > 
> > > To well balance clock enable/disable, if pm_runtime_resume_and_get() (rather
> > > than pm_runtime_get_sync() which doesn't decrement the counter in case of
> > > error) is used when issue_pending, it means pm_runtime_put_sync() should be
> > > done when transfer ends.
> > > 
> > > terminate_all is not always called, so put_sync can't be used only there, it
> > > should be conditionnally used in terminate_all, but also in interrupt
> > > handler, on error events and on transfer completion event, provided that it
> > > is the last transfer complete event (last item of the linked-list).
> > > 
> > > For clients with high transfer rate, it means a lot of clock enable/disable.
> > > Moreover, DMA3 clock is managed by Secure OS. So it means a lot of
> > > non-secure/secure world transitions.
> > > 
> > > I prefer to keep the implementation as it is for now, and possibly propose
> > > runtime pm improvement later, with autosuspend.
> > 
> > 
> > Autosuspend is perfered. we try to use pm_runtime_get/put at channel alloc
> > /free before, but this solution are rejected by community.
> > 
> > you can leave clock on for this enablement patch and add runtime pm later
> > time.
> > 
> > Frank
> > 
> 
> Current implementation leaves the clock off if no channel is requested. It
> also disables the clock if platform is suspended.
> I just took example from what is done in stm32 drivers.
> 
> I have further patches, not proposed in this series which adds a basic
> support of DMA3. There will be improvements, including runtime pm, in next
> series.
> 
> Amelie
> 
> > > 
> > > Amelie
> > > 
> > > > > +
> > > > > +	/* Ensure the channel is free */
> > > > > +	if (chan->semaphore_mode &&
> > > > > +	    readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id)) & CSEMCR_SEM_MUTEX) {
> > > > > +		ret = -EBUSY;
> > > > > +		goto err_put_sync;
> > > > > +	}
> > > > > +
> > > > > +	chan->lli_pool = dmam_pool_create(dev_name(&c->dev->device), c->device->dev,
> > > > > +					  sizeof(struct stm32_dma3_hwdesc),
> > > > > +					  __alignof__(struct stm32_dma3_hwdesc), 0);
> > > > > +	if (!chan->lli_pool) {
> > > > > +		dev_err(chan2dev(chan), "Failed to create LLI pool\n");
> > > > > +		ret = -ENOMEM;
> > > > > +		goto err_put_sync;
> > > > > +	}
> > > > > +
> > > > > +	/* Take the channel semaphore */
> > > > > +	if (chan->semaphore_mode) {
> > > > > +		writel_relaxed(CSEMCR_SEM_MUTEX, ddata->base + STM32_DMA3_CSEMCR(id));
> > > > > +		csemcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(id));
> > > > > +		ccid = FIELD_GET(CSEMCR_SEM_CCID, csemcr);
> > > > > +		/* Check that the channel is well taken */
> > > > > +		if (ccid != CCIDCFGR_CID1) {
> > > > > +			dev_err(chan2dev(chan), "Not under CID1 control (in-use by CID%d)\n", ccid);
> > > > > +			ret = -EPERM;
> > > > > +			goto err_pool_destroy;
> > > > > +		}
> > > > > +		dev_dbg(chan2dev(chan), "Under CID1 control (semcr=0x%08x)\n", csemcr);
> > > > > +	}
> > > > > +
> > > > > +	return 0;
> > > > > +
> > > > > +err_pool_destroy:
> > > > > +	dmam_pool_destroy(chan->lli_pool);
> > > > > +	chan->lli_pool = NULL;
> > > > > +
> > > > > +err_put_sync:
> > > > > +	pm_runtime_put_sync(ddata->dma_dev.dev);
> > > > > +
> > > > > +	return ret;
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_free_chan_resources(struct dma_chan *c)
> > > > > +{
> > > > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > > > +	unsigned long flags;
> > > > > +
> > > > > +	/* Ensure channel is in idle state */
> > > > > +	spin_lock_irqsave(&chan->vchan.lock, flags);
> > > > > +	stm32_dma3_chan_stop(chan);
> > > > > +	chan->swdesc = NULL;
> > > > > +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
> > > > > +
> > > > > +	vchan_free_chan_resources(to_virt_chan(c));
> > > > > +
> > > > > +	dmam_pool_destroy(chan->lli_pool);
> > > > > +	chan->lli_pool = NULL;
> > > > > +
> > > > > +	/* Release the channel semaphore */
> > > > > +	if (chan->semaphore_mode)
> > > > > +		writel_relaxed(0, ddata->base + STM32_DMA3_CSEMCR(chan->id));
> > > > > +
> > > > > +	pm_runtime_put_sync(ddata->dma_dev.dev);
> > > > > +
> > > > > +	/* Reset configuration */
> > > > > +	memset(&chan->dt_config, 0, sizeof(chan->dt_config));
> > > > > +	memset(&chan->dma_config, 0, sizeof(chan->dma_config));
> > > > > +}
> > > > > +
> > > > > +static struct dma_async_tx_descriptor *stm32_dma3_prep_slave_sg(struct dma_chan *c,
> > > > > +								struct scatterlist *sgl,
> > > > > +								unsigned int sg_len,
> > > > > +								enum dma_transfer_direction dir,
> > > > > +								unsigned long flags, void *context)
> > > > > +{
> > > > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > > > +	struct stm32_dma3_swdesc *swdesc;
> > > > > +	struct scatterlist *sg;
> > > > > +	size_t len;
> > > > > +	dma_addr_t sg_addr, dev_addr, src, dst;
> > > > > +	u32 i, j, count, ctr1, ctr2;
> > > > > +	int ret;
> > > > > +
> > > > > +	count = sg_len;
> > > > > +	for_each_sg(sgl, sg, sg_len, i) {
> > > > > +		len = sg_dma_len(sg);
> > > > > +		if (len > STM32_DMA3_MAX_BLOCK_SIZE)
> > > > > +			count += DIV_ROUND_UP(len, STM32_DMA3_MAX_BLOCK_SIZE) - 1;
> > > > > +	}
> > > > > +
> > > > > +	swdesc = stm32_dma3_chan_desc_alloc(chan, count);
> > > > > +	if (!swdesc)
> > > > > +		return NULL;
> > > > > +
> > > > > +	/* sg_len and i correspond to the initial sgl; count and j correspond to the hwdesc LL */
> > > > > +	j = 0;
> > > > > +	for_each_sg(sgl, sg, sg_len, i) {
> > > > > +		sg_addr = sg_dma_address(sg);
> > > > > +		dev_addr = (dir == DMA_MEM_TO_DEV) ? chan->dma_config.dst_addr :
> > > > > +						     chan->dma_config.src_addr;
> > > > > +		len = sg_dma_len(sg);
> > > > > +
> > > > > +		do {
> > > > > +			size_t chunk = min_t(size_t, len, STM32_DMA3_MAX_BLOCK_SIZE);
> > > > > +
> > > > > +			if (dir == DMA_MEM_TO_DEV) {
> > > > > +				src = sg_addr;
> > > > > +				dst = dev_addr;
> > > > > +
> > > > > +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
> > > > > +							      src, dst, chunk);
> > > > > +
> > > > > +				if (FIELD_GET(CTR1_DINC, ctr1))
> > > > > +					dev_addr += chunk;
> > > > > +			} else { /* (dir == DMA_DEV_TO_MEM || dir == DMA_MEM_TO_MEM) */
> > > > > +				src = dev_addr;
> > > > > +				dst = sg_addr;
> > > > > +
> > > > > +				ret = stm32_dma3_chan_prep_hw(chan, dir, &swdesc->ccr, &ctr1, &ctr2,
> > > > > +							      src, dst, chunk);
> > > > > +
> > > > > +				if (FIELD_GET(CTR1_SINC, ctr1))
> > > > > +					dev_addr += chunk;
> > > > > +			}
> > > > > +
> > > > > +			if (ret)
> > > > > +				goto err_desc_free;
> > > > > +
> > > > > +			stm32_dma3_chan_prep_hwdesc(chan, swdesc, j, src, dst, chunk,
> > > > > +						    ctr1, ctr2, j == (count - 1), false);
> > > > > +
> > > > > +			sg_addr += chunk;
> > > > > +			len -= chunk;
> > > > > +			j++;
> > > > > +		} while (len);
> > > > > +	}
> > > > > +
> > > > > +	/* Enable Error interrupts */
> > > > > +	swdesc->ccr |= CCR_USEIE | CCR_ULEIE | CCR_DTEIE;
> > > > > +	/* Enable Transfer state interrupts */
> > > > > +	swdesc->ccr |= CCR_TCIE;
> > > > > +
> > > > > +	swdesc->cyclic = false;
> > > > > +
> > > > > +	return vchan_tx_prep(&chan->vchan, &swdesc->vdesc, flags);
> > > > > +
> > > > > +err_desc_free:
> > > > > +	stm32_dma3_chan_desc_free(chan, swdesc);
> > > > > +
> > > > > +	return NULL;
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_caps(struct dma_chan *c, struct dma_slave_caps *caps)
> > > > > +{
> > > > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > > > +
> > > > > +	if (!chan->fifo_size) {
> > > > > +		caps->max_burst = 0;
> > > > > +		caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > > > +		caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > > > +	} else {
> > > > > +		/* Burst transfer should not exceed half of the fifo size */
> > > > > +		caps->max_burst = chan->max_burst;
> > > > > +		if (caps->max_burst < DMA_SLAVE_BUSWIDTH_8_BYTES) {
> > > > > +			caps->src_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > > > +			caps->dst_addr_widths &= ~BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > > > +		}
> > > > > +	}
> > > > > +}
> > > > > +
> > > > > +static int stm32_dma3_config(struct dma_chan *c, struct dma_slave_config *config)
> > > > > +{
> > > > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > > > +
> > > > > +	memcpy(&chan->dma_config, config, sizeof(*config));
> > > > > +
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +static int stm32_dma3_terminate_all(struct dma_chan *c)
> > > > > +{
> > > > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > > > +	unsigned long flags;
> > > > > +	LIST_HEAD(head);
> > > > > +
> > > > > +	spin_lock_irqsave(&chan->vchan.lock, flags);
> > > > > +
> > > > > +	if (chan->swdesc) {
> > > > > +		vchan_terminate_vdesc(&chan->swdesc->vdesc);
> > > > > +		chan->swdesc = NULL;
> > > > > +	}
> > > > > +
> > > > > +	stm32_dma3_chan_stop(chan);
> > > > > +
> > > > > +	vchan_get_all_descriptors(&chan->vchan, &head);
> > > > > +
> > > > > +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
> > > > > +	vchan_dma_desc_free_list(&chan->vchan, &head);
> > > > > +
> > > > > +	dev_dbg(chan2dev(chan), "vchan %pK: terminated\n", &chan->vchan);
> > > > > +
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_synchronize(struct dma_chan *c)
> > > > > +{
> > > > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > > > +
> > > > > +	vchan_synchronize(&chan->vchan);
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_issue_pending(struct dma_chan *c)
> > > > > +{
> > > > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > > > +	unsigned long flags;
> > > > > +
> > > > > +	spin_lock_irqsave(&chan->vchan.lock, flags);
> > > > > +
> > > > > +	if (vchan_issue_pending(&chan->vchan) && !chan->swdesc) {
> > > > > +		dev_dbg(chan2dev(chan), "vchan %pK: issued\n", &chan->vchan);
> > > > > +		stm32_dma3_chan_start(chan);
> > > > > +	}
> > > > > +
> > > > > +	spin_unlock_irqrestore(&chan->vchan.lock, flags);
> > > > > +}
> > > > > +
> > > > > +static bool stm32_dma3_filter_fn(struct dma_chan *c, void *fn_param)
> > > > > +{
> > > > > +	struct stm32_dma3_chan *chan = to_stm32_dma3_chan(c);
> > > > > +	struct stm32_dma3_ddata *ddata = to_stm32_dma3_ddata(chan);
> > > > > +	struct stm32_dma3_dt_conf *conf = fn_param;
> > > > > +	u32 mask, semcr;
> > > > > +	int ret;
> > > > > +
> > > > > +	dev_dbg(c->device->dev, "%s(%s): req_line=%d ch_conf=%08x tr_conf=%08x\n",
> > > > > +		__func__, dma_chan_name(c), conf->req_line, conf->ch_conf, conf->tr_conf);
> > > > > +
> > > > > +	if (!of_property_read_u32(c->device->dev->of_node, "dma-channel-mask", &mask))
> > > > > +		if (!(mask & BIT(chan->id)))
> > > > > +			return false;
> > > > > +
> > > > > +	ret = pm_runtime_resume_and_get(ddata->dma_dev.dev);
> > > > > +	if (ret < 0)
> > > > > +		return false;
> > > > > +	semcr = readl_relaxed(ddata->base + STM32_DMA3_CSEMCR(chan->id));
> > > > > +	pm_runtime_put_sync(ddata->dma_dev.dev);
> > > > > +
> > > > > +	/* Check if chan is free */
> > > > > +	if (semcr & CSEMCR_SEM_MUTEX)
> > > > > +		return false;
> > > > > +
> > > > > +	/* Check if chan fifo fits well */
> > > > > +	if (FIELD_GET(STM32_DMA3_DT_FIFO, conf->ch_conf) != chan->fifo_size)
> > > > > +		return false;
> > > > > +
> > > > > +	return true;
> > > > > +}
> > > > > +
> > > > > +static struct dma_chan *stm32_dma3_of_xlate(struct of_phandle_args *dma_spec, struct of_dma *ofdma)
> > > > > +{
> > > > > +	struct stm32_dma3_ddata *ddata = ofdma->of_dma_data;
> > > > > +	dma_cap_mask_t mask = ddata->dma_dev.cap_mask;
> > > > > +	struct stm32_dma3_dt_conf conf;
> > > > > +	struct stm32_dma3_chan *chan;
> > > > > +	struct dma_chan *c;
> > > > > +
> > > > > +	if (dma_spec->args_count < 3) {
> > > > > +		dev_err(ddata->dma_dev.dev, "Invalid args count\n");
> > > > > +		return NULL;
> > > > > +	}
> > > > > +
> > > > > +	conf.req_line = dma_spec->args[0];
> > > > > +	conf.ch_conf = dma_spec->args[1];
> > > > > +	conf.tr_conf = dma_spec->args[2];
> > > > > +
> > > > > +	if (conf.req_line >= ddata->dma_requests) {
> > > > > +		dev_err(ddata->dma_dev.dev, "Invalid request line\n");
> > > > > +		return NULL;
> > > > > +	}
> > > > > +
> > > > > +	/* Request dma channel among the generic dma controller list */
> > > > > +	c = dma_request_channel(mask, stm32_dma3_filter_fn, &conf);
> > > > > +	if (!c) {
> > > > > +		dev_err(ddata->dma_dev.dev, "No suitable channel found\n");
> > > > > +		return NULL;
> > > > > +	}
> > > > > +
> > > > > +	chan = to_stm32_dma3_chan(c);
> > > > > +	chan->dt_config = conf;
> > > > > +
> > > > > +	return c;
> > > > > +}
> > > > > +
> > > > > +static u32 stm32_dma3_check_rif(struct stm32_dma3_ddata *ddata)
> > > > > +{
> > > > > +	u32 chan_reserved, mask = 0, i, ccidcfgr, invalid_cid = 0;
> > > > > +
> > > > > +	/* Reserve Secure channels */
> > > > > +	chan_reserved = readl_relaxed(ddata->base + STM32_DMA3_SECCFGR);
> > > > > +
> > > > > +	/*
> > > > > +	 * CID filtering must be configured to ensure that the DMA3 channel will inherit the CID of
> > > > > +	 * the processor which is configuring and using the given channel.
> > > > > +	 * In case CID filtering is not configured, dma-channel-mask property can be used to
> > > > > +	 * specify available DMA channels to the kernel.
> > > > > +	 */
> > > > > +	of_property_read_u32(ddata->dma_dev.dev->of_node, "dma-channel-mask", &mask);
> > > > > +
> > > > > +	/* Reserve !CID-filtered not in dma-channel-mask, static CID != CID1, CID1 not allowed */
> > > > > +	for (i = 0; i < ddata->dma_channels; i++) {
> > > > > +		ccidcfgr = readl_relaxed(ddata->base + STM32_DMA3_CCIDCFGR(i));
> > > > > +
> > > > > +		if (!(ccidcfgr & CCIDCFGR_CFEN)) { /* !CID-filtered */
> > > > > +			invalid_cid |= BIT(i);
> > > > > +			if (!(mask & BIT(i))) /* Not in dma-channel-mask */
> > > > > +				chan_reserved |= BIT(i);
> > > > > +		} else { /* CID-filtered */
> > > > > +			if (!(ccidcfgr & CCIDCFGR_SEM_EN)) { /* Static CID mode */
> > > > > +				if (FIELD_GET(CCIDCFGR_SCID, ccidcfgr) != CCIDCFGR_CID1)
> > > > > +					chan_reserved |= BIT(i);
> > > > > +			} else { /* Semaphore mode */
> > > > > +				if (!FIELD_GET(CCIDCFGR_SEM_WLIST_CID1, ccidcfgr))
> > > > > +					chan_reserved |= BIT(i);
> > > > > +				ddata->chans[i].semaphore_mode = true;
> > > > > +			}
> > > > > +		}
> > > > > +		dev_dbg(ddata->dma_dev.dev, "chan%d: %s mode, %s\n", i,
> > > > > +			!(ccidcfgr & CCIDCFGR_CFEN) ? "!CID-filtered" :
> > > > > +			ddata->chans[i].semaphore_mode ? "Semaphore" : "Static CID",
> > > > > +			(chan_reserved & BIT(i)) ? "denied" :
> > > > > +			mask & BIT(i) ? "force allowed" : "allowed");
> > > > > +	}
> > > > > +
> > > > > +	if (invalid_cid)
> > > > > +		dev_warn(ddata->dma_dev.dev, "chan%*pbl have invalid CID configuration\n",
> > > > > +			 ddata->dma_channels, &invalid_cid);
> > > > > +
> > > > > +	return chan_reserved;
> > > > > +}
> > > > > +
> > > > > +static const struct of_device_id stm32_dma3_of_match[] = {
> > > > > +	{ .compatible = "st,stm32-dma3", },
> > > > > +	{ /* sentinel */},
> > > > > +};
> > > > > +MODULE_DEVICE_TABLE(of, stm32_dma3_of_match);
> > > > > +
> > > > > +static int stm32_dma3_probe(struct platform_device *pdev)
> > > > > +{
> > > > > +	struct device_node *np = pdev->dev.of_node;
> > > > > +	struct stm32_dma3_ddata *ddata;
> > > > > +	struct reset_control *reset;
> > > > > +	struct stm32_dma3_chan *chan;
> > > > > +	struct dma_device *dma_dev;
> > > > > +	u32 master_ports, chan_reserved, i, verr;
> > > > > +	u64 hwcfgr;
> > > > > +	int ret;
> > > > > +
> > > > > +	ddata = devm_kzalloc(&pdev->dev, sizeof(*ddata), GFP_KERNEL);
> > > > > +	if (!ddata)
> > > > > +		return -ENOMEM;
> > > > > +	platform_set_drvdata(pdev, ddata);
> > > > > +
> > > > > +	dma_dev = &ddata->dma_dev;
> > > > > +
> > > > > +	ddata->base = devm_platform_ioremap_resource(pdev, 0);
> > > > > +	if (IS_ERR(ddata->base))
> > > > > +		return PTR_ERR(ddata->base);
> > > > > +
> > > > > +	ddata->clk = devm_clk_get(&pdev->dev, NULL);
> > > > > +	if (IS_ERR(ddata->clk))
> > > > > +		return dev_err_probe(&pdev->dev, PTR_ERR(ddata->clk), "Failed to get clk\n");
> > > > > +
> > > > > +	reset = devm_reset_control_get_optional(&pdev->dev, NULL);
> > > > > +	if (IS_ERR(reset))
> > > > > +		return dev_err_probe(&pdev->dev, PTR_ERR(reset), "Failed to get reset\n");
> > > > > +
> > > > > +	ret = clk_prepare_enable(ddata->clk);
> > > > > +	if (ret)
> > > > > +		return dev_err_probe(&pdev->dev, ret, "Failed to enable clk\n");
> > > > > +
> > > > > +	reset_control_reset(reset);
> > > > > +
> > > > > +	INIT_LIST_HEAD(&dma_dev->channels);
> > > > > +
> > > > > +	dma_cap_set(DMA_SLAVE, dma_dev->cap_mask);
> > > > > +	dma_cap_set(DMA_PRIVATE, dma_dev->cap_mask);
> > > > > +	dma_dev->dev = &pdev->dev;
> > > > > +	/*
> > > > > +	 * This controller supports up to 8-byte buswidth depending on the port used and the
> > > > > +	 * channel, and can only access address at even boundaries, multiple of the buswidth.
> > > > > +	 */
> > > > > +	dma_dev->copy_align = DMAENGINE_ALIGN_8_BYTES;
> > > > > +	dma_dev->src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
> > > > > +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
> > > > > +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
> > > > > +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > > > +	dma_dev->dst_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) |
> > > > > +				   BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) |
> > > > > +				   BIT(DMA_SLAVE_BUSWIDTH_4_BYTES) |
> > > > > +				   BIT(DMA_SLAVE_BUSWIDTH_8_BYTES);
> > > > > +	dma_dev->directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV) | BIT(DMA_MEM_TO_MEM);
> > > > > +
> > > > > +	dma_dev->descriptor_reuse = true;
> > > > > +	dma_dev->max_sg_burst = STM32_DMA3_MAX_SEG_SIZE;
> > > > > +	dma_dev->residue_granularity = DMA_RESIDUE_GRANULARITY_DESCRIPTOR;
> > > > > +	dma_dev->device_alloc_chan_resources = stm32_dma3_alloc_chan_resources;
> > > > > +	dma_dev->device_free_chan_resources = stm32_dma3_free_chan_resources;
> > > > > +	dma_dev->device_prep_slave_sg = stm32_dma3_prep_slave_sg;
> > > > > +	dma_dev->device_caps = stm32_dma3_caps;
> > > > > +	dma_dev->device_config = stm32_dma3_config;
> > > > > +	dma_dev->device_terminate_all = stm32_dma3_terminate_all;
> > > > > +	dma_dev->device_synchronize = stm32_dma3_synchronize;
> > > > > +	dma_dev->device_tx_status = dma_cookie_status;
> > > > > +	dma_dev->device_issue_pending = stm32_dma3_issue_pending;
> > > > > +
> > > > > +	/* if dma_channels is not modified, get it from hwcfgr1 */
> > > > > +	if (of_property_read_u32(np, "dma-channels", &ddata->dma_channels)) {
> > > > > +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
> > > > > +		ddata->dma_channels = FIELD_GET(G_NUM_CHANNELS, hwcfgr);
> > > > > +	}
> > > > > +
> > > > > +	/* if dma_requests is not modified, get it from hwcfgr2 */
> > > > > +	if (of_property_read_u32(np, "dma-requests", &ddata->dma_requests)) {
> > > > > +		hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR2);
> > > > > +		ddata->dma_requests = FIELD_GET(G_MAX_REQ_ID, hwcfgr) + 1;
> > > > > +	}
> > > > > +
> > > > > +	/* G_MASTER_PORTS, G_M0_DATA_WIDTH_ENC, G_M1_DATA_WIDTH_ENC in HWCFGR1 */
> > > > > +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR1);
> > > > > +	master_ports = FIELD_GET(G_MASTER_PORTS, hwcfgr);
> > > > > +
> > > > > +	ddata->ports_max_dw[0] = FIELD_GET(G_M0_DATA_WIDTH_ENC, hwcfgr);
> > > > > +	if (master_ports == AXI64 || master_ports == AHB32) /* Single master port */
> > > > > +		ddata->ports_max_dw[1] = DW_INVALID;
> > > > > +	else /* Dual master ports */
> > > > > +		ddata->ports_max_dw[1] = FIELD_GET(G_M1_DATA_WIDTH_ENC, hwcfgr);
> > > > > +
> > > > > +	ddata->chans = devm_kcalloc(&pdev->dev, ddata->dma_channels, sizeof(*ddata->chans),
> > > > > +				    GFP_KERNEL);
> > > > > +	if (!ddata->chans) {
> > > > > +		ret = -ENOMEM;
> > > > > +		goto err_clk_disable;
> > > > > +	}
> > > > > +
> > > > > +	chan_reserved = stm32_dma3_check_rif(ddata);
> > > > > +
> > > > > +	if (chan_reserved == GENMASK(ddata->dma_channels - 1, 0)) {
> > > > > +		ret = -ENODEV;
> > > > > +		dev_err_probe(&pdev->dev, ret, "No channel available, abort registration\n");
> > > > > +		goto err_clk_disable;
> > > > > +	}
> > > > > +
> > > > > +	/* G_FIFO_SIZE x=0..7 in HWCFGR3 and G_FIFO_SIZE x=8..15 in HWCFGR4 */
> > > > > +	hwcfgr = readl_relaxed(ddata->base + STM32_DMA3_HWCFGR3);
> > > > > +	hwcfgr |= ((u64)readl_relaxed(ddata->base + STM32_DMA3_HWCFGR4)) << 32;
> > > > > +
> > > > > +	for (i = 0; i < ddata->dma_channels; i++) {
> > > > > +		if (chan_reserved & BIT(i))
> > > > > +			continue;
> > > > > +
> > > > > +		chan = &ddata->chans[i];
> > > > > +		chan->id = i;
> > > > > +		chan->fifo_size = get_chan_hwcfg(i, G_FIFO_SIZE(i), hwcfgr);
> > > > > +		/* If chan->fifo_size > 0 then half of the fifo size, else no burst when no FIFO */
> > > > > +		chan->max_burst = (chan->fifo_size) ? (1 << (chan->fifo_size + 1)) / 2 : 0;
> > > > > +		chan->vchan.desc_free = stm32_dma3_chan_vdesc_free;
> > > > > +
> > > > > +		vchan_init(&chan->vchan, dma_dev);
> > > > > +	}
> > > > > +
> > > > > +	ret = dmaenginem_async_device_register(dma_dev);
> > > > > +	if (ret)
> > > > > +		goto err_clk_disable;
> > > > > +
> > > > > +	for (i = 0; i < ddata->dma_channels; i++) {
> > > > > +		if (chan_reserved & BIT(i))
> > > > > +			continue;
> > > > > +
> > > > > +		ret = platform_get_irq(pdev, i);
> > > > > +		if (ret < 0)
> > > > > +			goto err_clk_disable;
> > > > > +
> > > > > +		chan = &ddata->chans[i];
> > > > > +		chan->irq = ret;
> > > > > +
> > > > > +		ret = devm_request_irq(&pdev->dev, chan->irq, stm32_dma3_chan_irq, 0,
> > > > > +				       dev_name(chan2dev(chan)), chan);
> > > > > +		if (ret) {
> > > > > +			dev_err_probe(&pdev->dev, ret, "Failed to request channel %s IRQ\n",
> > > > > +				      dev_name(chan2dev(chan)));
> > > > > +			goto err_clk_disable;
> > > > > +		}
> > > > > +	}
> > > > > +
> > > > > +	ret = of_dma_controller_register(np, stm32_dma3_of_xlate, ddata);
> > > > > +	if (ret) {
> > > > > +		dev_err_probe(&pdev->dev, ret, "Failed to register controller\n");
> > > > > +		goto err_clk_disable;
> > > > > +	}
> > > > > +
> > > > > +	verr = readl_relaxed(ddata->base + STM32_DMA3_VERR);
> > > > > +
> > > > > +	pm_runtime_set_active(&pdev->dev);
> > > > > +	pm_runtime_enable(&pdev->dev);
> > > > > +	pm_runtime_get_noresume(&pdev->dev);
> > > > > +	pm_runtime_put(&pdev->dev);
> > > > > +
> > > > > +	dev_info(&pdev->dev, "STM32 DMA3 registered rev:%lu.%lu\n",
> > > > > +		 FIELD_GET(VERR_MAJREV, verr), FIELD_GET(VERR_MINREV, verr));
> > > > > +
> > > > > +	return 0;
> > > > > +
> > > > > +err_clk_disable:
> > > > > +	clk_disable_unprepare(ddata->clk);
> > > > > +
> > > > > +	return ret;
> > > > > +}
> > > > > +
> > > > > +static void stm32_dma3_remove(struct platform_device *pdev)
> > > > > +{
> > > > > +	pm_runtime_disable(&pdev->dev);
> > > > > +}
> > > > > +
> > > > > +static int stm32_dma3_runtime_suspend(struct device *dev)
> > > > > +{
> > > > > +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
> > > > > +
> > > > > +	clk_disable_unprepare(ddata->clk);
> > > > > +
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +static int stm32_dma3_runtime_resume(struct device *dev)
> > > > > +{
> > > > > +	struct stm32_dma3_ddata *ddata = dev_get_drvdata(dev);
> > > > > +	int ret;
> > > > > +
> > > > > +	ret = clk_prepare_enable(ddata->clk);
> > > > > +	if (ret)
> > > > > +		dev_err(dev, "Failed to enable clk: %d\n", ret);
> > > > > +
> > > > > +	return ret;
> > > > > +}
> > > > > +
> > > > > +static const struct dev_pm_ops stm32_dma3_pm_ops = {
> > > > > +	SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
> > > > > +	RUNTIME_PM_OPS(stm32_dma3_runtime_suspend, stm32_dma3_runtime_resume, NULL)
> > > > > +};
> > > > > +
> > > > > +static struct platform_driver stm32_dma3_driver = {
> > > > > +	.probe = stm32_dma3_probe,
> > > > > +	.remove_new = stm32_dma3_remove,
> > > > > +	.driver = {
> > > > > +		.name = "stm32-dma3",
> > > > > +		.of_match_table = stm32_dma3_of_match,
> > > > > +		.pm = pm_ptr(&stm32_dma3_pm_ops),
> > > > > +	},
> > > > > +};
> > > > > +
> > > > > +static int __init stm32_dma3_init(void)
> > > > > +{
> > > > > +	return platform_driver_register(&stm32_dma3_driver);
> > > > > +}
> > > > > +
> > > > > +subsys_initcall(stm32_dma3_init);
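
The choice of subsys_initcall() over the usual module_platform_driver()
presumably makes the DMA controller available early, before most client
drivers probe, which limits -EPROBE_DEFER round trips for peripherals that
request DMA3 channels. A hedged sketch of the matching exit hook, should
module unloading ever need to be supported:

	static void __exit stm32_dma3_exit(void)
	{
		platform_driver_unregister(&stm32_dma3_driver);
	}
	module_exit(stm32_dma3_exit);
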
> > > > > +
> > > > > +MODULE_DESCRIPTION("STM32 DMA3 controller driver");
> > > > > +MODULE_AUTHOR("Amelie Delaunay <amelie.delaunay@foss.st.com>");
> > > > > +MODULE_LICENSE("GPL");
> > > > > -- 
> > > > > 2.25.1
> > > > > 

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2024-05-17 14:57 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-23 12:32 [PATCH 00/12] Introduce STM32 DMA3 support Amelie Delaunay
2024-04-23 12:32 ` [PATCH 01/12] dt-bindings: dma: New directory for STM32 DMA controllers bindings Amelie Delaunay
2024-04-23 13:50   ` Rob Herring
2024-04-23 14:46     ` Amelie Delaunay
2024-04-23 12:32 ` [PATCH 02/12] dmaengine: stm32: New directory for STM32 DMA controllers drivers Amelie Delaunay
2024-04-23 12:32 ` [PATCH 03/12] MAINTAINERS: Add entry for STM32 DMA controllers drivers and documentation Amelie Delaunay
2024-04-23 12:32 ` [PATCH 04/12] dt-bindings: dma: Document STM32 DMA3 controller bindings Amelie Delaunay
2024-04-23 15:22   ` Rob Herring
2024-04-23 12:32 ` [PATCH 05/12] dmaengine: Add STM32 DMA3 support Amelie Delaunay
2024-05-04 12:40   ` Vinod Koul
2024-05-07 11:33     ` Amelie Delaunay
2024-05-07 20:26       ` Frank Li
2024-05-13  9:21         ` Amelie Delaunay
2024-05-15 18:45           ` Frank Li
2024-05-16  9:42             ` Amelie Delaunay
2024-05-04 14:27   ` Christophe JAILLET
2024-05-07 12:37     ` Amelie Delaunay
2024-05-15 18:56   ` Frank Li
2024-05-16 15:25     ` Amelie Delaunay
2024-05-16 17:09       ` Frank Li
2024-05-17  9:42         ` Amelie Delaunay
2024-05-17 14:57           ` Frank Li
2024-04-23 12:32 ` [PATCH 06/12] dmaengine: stm32-dma3: add DMA_CYCLIC capability Amelie Delaunay
2024-04-23 12:32 ` [PATCH 07/12] dmaengine: stm32-dma3: add DMA_MEMCPY capability Amelie Delaunay
2024-04-23 12:32 ` [PATCH 08/12] dmaengine: stm32-dma3: add device_pause and device_resume ops Amelie Delaunay
2024-04-23 12:32 ` [PATCH 09/12] dmaengine: stm32-dma3: improve residue granularity Amelie Delaunay
2024-04-23 12:33 ` [PATCH 10/12] dmaengine: add channel device name to channel registration Amelie Delaunay
2024-04-23 12:33 ` [PATCH 11/12] dmaengine: stm32-dma3: defer channel registration to specify channel name Amelie Delaunay
2024-04-23 12:33 ` [PATCH 12/12] arm64: dts: st: add HPDMA nodes on stm32mp251 Amelie Delaunay
