Baikal-T1 SoC has a DW DMAC on-board to provide Mem-to-Mem and low-speed peripheral Dev-to-Mem and Mem-to-Dev functionality. It is mostly compatible with the DW DMAC driver currently implemented in the kernel, but there are some peculiarities which must be taken into account in order to have the device fully supported.

First of all, as usual, we replaced the legacy plain text-based dt-binding file with a YAML-based one.

Secondly, the Baikal-T1 DW DMA Controller provides eight channels, which alas have different max burst length configurations. In particular, the first two channels may burst up to 128 bits (16 bytes) at a time, while the rest of them just up to 32 bits. We must make sure that the DMA subsystem doesn't set values exceeding these limitations, otherwise the controller will hang up.

Thirdly, we discovered a problem in using the DW APB SPI driver together with the DW DMAC. The problem happens if there is no natively implemented multi-block LLP transfers support and the SPI-transfer length exceeds the max block size. In this case, due to the asynchronous handling of the Tx- and Rx- SPI transfer interrupts, we might end up with a DW APB SSI Rx FIFO overflow. So if DW APB SSI (or any other DMAC service consumer) intends to use the DMAC to asynchronously execute the transfers, we'd have to at least warn the user of the possible errors.

Fourthly, it's worth setting the DMA device max segment size to the max block size config specific to the DW DMA controller. It shall help the DMA clients to create size-optimized SG-list items for the controller. This in turn will cause fewer dw_desc allocations, fewer LLP reinitializations and better DMA device performance.

Finally, there is a bug in the algorithm of the nollp flag detection. In particular, even if the DW DMAC parameters state the multi-block transfers support, there is still the HC_LLP (hardcode LLP) flag, which if set makes the true multi-block LLP functionality expected by the driver unusable.
This happens because, if the HC_LLP flag is set, the LLP registers are hardcoded to zero, so only contiguous multi-block transfers are supported. We must take the flag into account when detecting the LLP support, otherwise the driver just won't work correctly.

This patchset is rebased and tested on the mainline Linux kernel 5.7-rc4:
0e698dfa2822 ("Linux 5.7-rc4")
tag: v5.7-rc4

Changelog v2:
- Rearrange SoBs.
- Move $ref to the root level of the properties. Do the same with the
  constraints in the DT binding.
- Replace "additionalProperties: false" with "unevaluatedProperties: false"
  property in the DT binding file.
- Discard default settings defined out of property enum constraint.
- Set default max-burst-len to 256 TR-WIDTH words in the DT binding.
- Discard noLLP and block_size accessors.
- Set max segment size of the DMA device structure with the DW DMA block size
  config.
- Print warning if noLLP flag is set.
- Discard max burst length accessor.
- Add comment about why hardware accelerated LLP list support depends on both
  MBLK_EN and HC_LLP configs setting.
- Use explicit bits state comparison operator in noLLP flag setting.

Link: https://lore.kernel.org/dmaengine/20200508105304.14065-1-Sergey.Semin@baikalelectronics.ru/
Changelog v3:
- Use the block_size found for the very first channel instead of looking for
  the maximum of maximum block sizes.
- Don't define device-specific device_dma_parameters object, since it has
  already been defined by the platform device core.
- Add more details into the property description about what limitations
  snps,max-burst-len defines.
- Move commit fb7e3bbfc830 ("dmaengine: dw: Take HC_LLP flag into account for
  noLLP auto-config") to the head of the series.
- Add a new patch "dmaengine: Introduce min burst length capability" as a
  result of the discussion with Vinod and Andy regarding the burst length
  capability.
- Add a new patch "dmaengine: Introduce max SG list entries capability"
  suggested by Andy.
- Add a new patch "dmaengine: Introduce DMA-device device_caps callback" as a
  result of the discussion with Vinod and Andy in the framework of DW DMA
  burst and LLP capabilities.
- Add a new patch "dmaengine: dw: Add dummy device_caps callback" as a
  preparation commit before setting the max_burst and max_sg_nents DW DMA
  capabilities.
- Override the slave channel max_burst capability instead of calculating the
  minimum value of max burst lengths and setting the DMA-device generic
  capability.
- Add a new patch "dmaengine: dw: Initialize max_sg_nents with nollp flag".
  This is required to fix the DW APB SSI issue of the Tx and Rx DMA channels
  de-synchronization.

Link: https://lore.kernel.org/dmaengine/20200526225022.20405-1-Sergey.Semin@baikalelectronics.ru/
Changelog v4:
- Use explicit if-else statement when assigning the max_sg_nents field.
- Clamp the dst and src burst lengths in the generic dwc_config() method
  instead of doing that in the encode_maxburst() callback.
- Define max_burst with u32 type in struct dw_dma_platform_data.
- Perform of_property_read_u32_array() with the platform data max_burst member
  passed directly.
- Add a new patch "dmaengine: dw: Initialize min_burst capability", which
  initializes the min_burst capability with 1.
- Fix of->if typo. It should be definitely "of" in the max_sg_list capability
  description.

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Maxim Kaurkin <Maxim.Kaurkin@baikalelectronics.ru>
Cc: Pavel Parkhomenko <Pavel.Parkhomenko@baikalelectronics.ru>
Cc: Ramil Zaripov <Ramil.Zaripov@baikalelectronics.ru>
Cc: Ekaterina Skachko <Ekaterina.Skachko@baikalelectronics.ru>
Cc: Vadim Vlasov <V.Vlasov@baikalelectronics.ru>
Cc: Alexey Kolotnikov <Alexey.Kolotnikov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: dmaengine@vger.kernel.org
Cc: devicetree@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Serge Semin (11):
  dt-bindings: dma: dw: Convert DW DMAC to DT binding
  dt-bindings: dma: dw: Add max burst transaction length property
  dmaengine: Introduce min burst length capability
  dmaengine: Introduce max SG list entries capability
  dmaengine: Introduce DMA-device device_caps callback
  dmaengine: dw: Take HC_LLP flag into account for noLLP auto-config
  dmaengine: dw: Set DMA device max segment size parameter
  dmaengine: dw: Add dummy device_caps callback
  dmaengine: dw: Initialize min_burst capability
  dmaengine: dw: Introduce max burst length hw config
  dmaengine: dw: Initialize max_sg_nents capability

 .../bindings/dma/snps,dma-spear1340.yaml      | 176 ++++++++++++++++++
 .../devicetree/bindings/dma/snps-dma.txt      |  69 -------
 drivers/dma/dmaengine.c                       |   5 +
 drivers/dma/dw/core.c                         |  47 ++++-
 drivers/dma/dw/of.c                           |   5 +
 drivers/dma/dw/regs.h                         |   3 +
 include/linux/dmaengine.h                     |  14 ++
 include/linux/platform_data/dma-dw.h          |   4 +
 8 files changed, 253 insertions(+), 70 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml
 delete mode 100644 Documentation/devicetree/bindings/dma/snps-dma.txt

-- 
2.26.2
Modern device tree bindings are supposed to be created as YAML files in accordance with dt-schema. This commit replaces the legacy bare text bindings of the Synopsys Designware DMA controller with a YAML file. The only required properties are "compatible", "reg", "#dma-cells" and "interrupts", which will be used by the driver to correctly find the controller memory region and handle its events. The rest of the properties are optional, since if either "dma-channels" or "dma-masters" isn't specified, the driver will attempt to auto-detect the IP core configuration.

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Reviewed-by: Rob Herring <robh@kernel.org>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: linux-mips@vger.kernel.org

---

Changelog v2:
- Rearrange SoBs.
- Move $ref to the root level of the properties. Do the same with the
  constraints.
- Discard default settings defined out of the property enum constraint.
- Replace "additionalProperties: false" with "unevaluatedProperties: false"
  property.
- Remove a label definition from the binding example.
--- .../bindings/dma/snps,dma-spear1340.yaml | 161 ++++++++++++++++++ .../devicetree/bindings/dma/snps-dma.txt | 69 -------- 2 files changed, 161 insertions(+), 69 deletions(-) create mode 100644 Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml delete mode 100644 Documentation/devicetree/bindings/dma/snps-dma.txt diff --git a/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml b/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml new file mode 100644 index 000000000000..e7611840a7cf --- /dev/null +++ b/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml @@ -0,0 +1,161 @@ +# SPDX-License-Identifier: GPL-2.0-only +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/dma/snps,dma-spear1340.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Synopsys Designware DMA Controller + +maintainers: + - Viresh Kumar <vireshk@kernel.org> + - Andy Shevchenko <andriy.shevchenko@linux.intel.com> + +allOf: + - $ref: "dma-controller.yaml#" + +properties: + compatible: + const: snps,dma-spear1340 + + "#dma-cells": + const: 3 + description: | + First cell is a phandle pointing to the DMA controller. Second one is + the DMA request line number. Third cell is the memory master identifier + for transfers on dynamically allocated channel. Fourth cell is the + peripheral master identifier for transfers on an allocated channel. + + reg: + maxItems: 1 + + interrupts: + maxItems: 1 + + clocks: + maxItems: 1 + + clock-names: + description: AHB interface reference clock. + const: hclk + + dma-channels: + description: | + Number of DMA channels supported by the controller. In case if + not specified the driver will try to auto-detect this and + the rest of the optional parameters. + minimum: 1 + maximum: 8 + + dma-requests: + minimum: 1 + maximum: 16 + + dma-masters: + $ref: /schemas/types.yaml#definitions/uint32 + description: | + Number of DMA masters supported by the controller. 
In case if + not specified the driver will try to auto-detect this and + the rest of the optional parameters. + minimum: 1 + maximum: 4 + + chan_allocation_order: + $ref: /schemas/types.yaml#definitions/uint32 + description: | + DMA channels allocation order specifier. Zero means ascending order + (first free allocated), while one - descending (last free allocated). + default: 0 + enum: [0, 1] + + chan_priority: + $ref: /schemas/types.yaml#definitions/uint32 + description: | + DMA channels priority order. Zero means ascending channels priority + so the very first channel has the highest priority. While 1 means + descending priority (the last channel has the highest priority). + default: 0 + enum: [0, 1] + + block_size: + $ref: /schemas/types.yaml#definitions/uint32 + description: Maximum block size supported by the DMA controller. + enum: [3, 7, 15, 31, 63, 127, 255, 511, 1023, 2047, 4095] + + data-width: + $ref: /schemas/types.yaml#/definitions/uint32-array + description: Data bus width per each DMA master in bytes. + items: + maxItems: 4 + items: + enum: [4, 8, 16, 32] + + data_width: + $ref: /schemas/types.yaml#/definitions/uint32-array + deprecated: true + description: | + Data bus width per each DMA master in (2^n * 8) bits. This property is + deprecated. It' usage is discouraged in favor of data-width one. Moreover + the property incorrectly permits to define data-bus width of 8 and 16 + bits, which is impossible in accordance with DW DMAC IP-core data book. + items: + maxItems: 4 + items: + enum: + - 0 # 8 bits + - 1 # 16 bits + - 2 # 32 bits + - 3 # 64 bits + - 4 # 128 bits + - 5 # 256 bits + default: 0 + + multi-block: + $ref: /schemas/types.yaml#/definitions/uint32-array + description: | + LLP-based multi-block transfer supported by hardware per + each DMA channel. 
+ items: + maxItems: 8 + items: + enum: [0, 1] + default: 1 + + snps,dma-protection-control: + $ref: /schemas/types.yaml#definitions/uint32 + description: | + Bits one-to-one passed to the AHB HPROT[3:1] bus. Each bit setting + indicates the following features: bit 0 - privileged mode, + bit 1 - DMA is bufferable, bit 2 - DMA is cacheable. + default: 0 + minimum: 0 + maximum: 7 + +unevaluatedProperties: false + +required: + - compatible + - "#dma-cells" + - reg + - interrupts + +examples: + - | + dma-controller@fc000000 { + compatible = "snps,dma-spear1340"; + reg = <0xfc000000 0x1000>; + interrupt-parent = <&vic1>; + interrupts = <12>; + + dma-channels = <8>; + dma-requests = <16>; + dma-masters = <4>; + #dma-cells = <3>; + + chan_allocation_order = <1>; + chan_priority = <1>; + block_size = <0xfff>; + data-width = <8 8>; + multi-block = <0 0 0 0 0 0 0 0>; + snps,max-burst-len = <16 16 4 4 4 4 4 4>; + }; +... diff --git a/Documentation/devicetree/bindings/dma/snps-dma.txt b/Documentation/devicetree/bindings/dma/snps-dma.txt deleted file mode 100644 index 0bedceed1963..000000000000 --- a/Documentation/devicetree/bindings/dma/snps-dma.txt +++ /dev/null @@ -1,69 +0,0 @@ -* Synopsys Designware DMA Controller - -Required properties: -- compatible: "snps,dma-spear1340" -- reg: Address range of the DMAC registers -- interrupt: Should contain the DMAC interrupt number -- dma-channels: Number of channels supported by hardware -- dma-requests: Number of DMA request lines supported, up to 16 -- dma-masters: Number of AHB masters supported by the controller -- #dma-cells: must be <3> -- chan_allocation_order: order of allocation of channel, 0 (default): ascending, - 1: descending -- chan_priority: priority of channels. 
0 (default): increase from chan 0->n, 1: - increase from chan n->0 -- block_size: Maximum block size supported by the controller -- data-width: Maximum data width supported by hardware per AHB master - (in bytes, power of 2) - - -Deprecated properties: -- data_width: Maximum data width supported by hardware per AHB master - (0 - 8bits, 1 - 16bits, ..., 5 - 256bits) - - -Optional properties: -- multi-block: Multi block transfers supported by hardware. Array property with - one cell per channel. 0: not supported, 1 (default): supported. -- snps,dma-protection-control: AHB HPROT[3:1] protection setting. - The default value is 0 (for non-cacheable, non-buffered, - unprivileged data access). - Refer to include/dt-bindings/dma/dw-dmac.h for possible values. - -Example: - - dmahost: dma@fc000000 { - compatible = "snps,dma-spear1340"; - reg = <0xfc000000 0x1000>; - interrupt-parent = <&vic1>; - interrupts = <12>; - - dma-channels = <8>; - dma-requests = <16>; - dma-masters = <2>; - #dma-cells = <3>; - chan_allocation_order = <1>; - chan_priority = <1>; - block_size = <0xfff>; - data-width = <8 8>; - }; - -DMA clients connected to the Designware DMA controller must use the format -described in the dma.txt file, using a four-cell specifier for each channel. -The four cells in order are: - -1. A phandle pointing to the DMA controller -2. The DMA request line number -3. Memory master for transfers on allocated channel -4. Peripheral master for transfers on allocated channel - -Example: - - serial@e0000000 { - compatible = "arm,pl011", "arm,primecell"; - reg = <0xe0000000 0x1000>; - interrupts = <0 35 0x4>; - dmas = <&dmahost 12 0 1>, - <&dmahost 13 1 0>; - dma-names = "rx", "rx"; - }; -- 2.26.2
This array property is used to indicate the maximum burst transaction length supported by each DMA channel.

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: linux-mips@vger.kernel.org

---

Changelog v2:
- Rearrange SoBs.
- Move $ref to the root level of the properties. Do the same with the
  constraints.
- Set default max-burst-len to 256 TR-WIDTH words.

Changelog v3:
- Add more details into the property description about what limitations
  snps,max-burst-len defines.
---
 .../bindings/dma/snps,dma-spear1340.yaml          | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml b/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml
index e7611840a7cf..20870f5c14dd 100644
--- a/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml
+++ b/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml
@@ -120,6 +120,21 @@ properties:
         enum: [0, 1]
         default: 1
 
+  snps,max-burst-len:
+    $ref: /schemas/types.yaml#/definitions/uint32-array
+    description: |
+      Maximum length of the burst transactions supported by the controller.
+      This property defines the upper limit of the run-time burst setting
+      (CTLx.SRC_MSIZE/CTLx.DST_MSIZE fields) so the allowed burst length
+      will be from 1 to max-burst-len words. It's an array property with one
+      cell per channel in the units determined by the value set in the
+      CTLx.SRC_TR_WIDTH/CTLx.DST_TR_WIDTH fields (data width).
+    items:
+      maxItems: 8
+      items:
+        enum: [4, 8, 16, 32, 64, 128, 256]
+        default: 256
+
   snps,dma-protection-control:
     $ref: /schemas/types.yaml#definitions/uint32
     description: |
-- 
2.26.2
Some hardware may have a minimum burst transaction length constraint greater than the default 0/1. Here we introduce the DMA device and slave capability, which can be initialized by the DMA engine driver with the device-specific value if required.

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: devicetree@vger.kernel.org

---

Changelog v3:
- This is a new patch created as a result of the discussion with Vinod and
  Andy in the framework of DW DMA burst and LLP capabilities.
---
 drivers/dma/dmaengine.c   | 1 +
 include/linux/dmaengine.h | 4 ++++
 2 files changed, 5 insertions(+)

diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index d31076d9ef25..b332ffe52780 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -590,6 +590,7 @@ int dma_get_slave_caps(struct dma_chan *chan, struct dma_slave_caps *caps)
 	caps->src_addr_widths = device->src_addr_widths;
 	caps->dst_addr_widths = device->dst_addr_widths;
 	caps->directions = device->directions;
+	caps->min_burst = device->min_burst;
 	caps->max_burst = device->max_burst;
 	caps->residue_granularity = device->residue_granularity;
 	caps->descriptor_reuse = device->descriptor_reuse;
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index e1c03339918f..0c7403b27133 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -465,6 +465,7 @@ enum dma_residue_granularity {
  *	Since the enum dma_transfer_direction is not defined as bit flag for
  *	each type, the dma controller should set BIT(<TYPE>) and same
  *	should be checked by controller as well
+ * @min_burst: min burst capability per-transfer
  * @max_burst: max burst capability per-transfer
  * @cmd_pause: true, if pause is supported (i.e. for reading residue or
  *	       for resume later)
@@ -478,6 +479,7 @@ struct dma_slave_caps {
 	u32 src_addr_widths;
 	u32 dst_addr_widths;
 	u32 directions;
+	u32 min_burst;
 	u32 max_burst;
 	bool cmd_pause;
 	bool cmd_resume;
@@ -769,6 +771,7 @@ struct dma_filter {
  *	Since the enum dma_transfer_direction is not defined as bit flag for
  *	each type, the dma controller should set BIT(<TYPE>) and same
  *	should be checked by controller as well
+ * @min_burst: min burst capability per-transfer
  * @max_burst: max burst capability per-transfer
  * @residue_granularity: granularity of the transfer residue reported
  *	by tx_status
@@ -839,6 +842,7 @@ struct dma_device {
 	u32 src_addr_widths;
 	u32 dst_addr_widths;
 	u32 directions;
+	u32 min_burst;
 	u32 max_burst;
 	bool descriptor_reuse;
 	enum dma_residue_granularity residue_granularity;
-- 
2.26.2
Some devices may lack support for hardware-accelerated automatic walking through and execution of the SG list entries. In this case the burden of the SG list traversal and DMA engine re-initialization lies on the DMA engine driver (normally implemented by using a DMA transfer completion IRQ to recharge the DMA device with the next SG list entry). But such a solution may not be suitable for some DMA consumers. In particular, SPI devices need both Tx and Rx DMA channels to work synchronously in order to avoid an Rx FIFO overflow. If the Rx DMA channel is paused for some time while the Tx DMA channel keeps working, implicitly pulling data into the Rx FIFO, the latter will eventually overflow, which will cause data loss. So if SG list entries aren't automatically fetched by the DMA engine, but are manually selected one by one for execution in the ISRs/deferred work/etc., such a problem will eventually happen due to the non-deterministic latencies of the service execution.

In order to let the DMA consumer know about the DMA device capabilities regarding the hardware accelerated SG list traversal we introduce the max_sg_nents capability. It is supposed to be initialized by the DMA engine driver with 0 if there is no limitation on the number of SG entries atomically executed, and with a non-zero value if there is such a constraint, in which case the upper limit is the number set in the capability.

Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: devicetree@vger.kernel.org

---

Changelog v3:
- This is a new patch created as a result of the discussion with Vinod and
  Andy in the framework of DW DMA burst and LLP capabilities.

Changelog v4:
- Fix of->if typo. It should be definitely of.
---
 drivers/dma/dmaengine.c   | 1 +
 include/linux/dmaengine.h | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index b332ffe52780..ad56ad58932c 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -592,6 +592,7 @@ int dma_get_slave_caps(struct dma_chan *chan, struct dma_slave_caps *caps)
 	caps->directions = device->directions;
 	caps->min_burst = device->min_burst;
 	caps->max_burst = device->max_burst;
+	caps->max_sg_nents = device->max_sg_nents;
 	caps->residue_granularity = device->residue_granularity;
 	caps->descriptor_reuse = device->descriptor_reuse;
 	caps->cmd_pause = !!device->device_pause;
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 0c7403b27133..a7e4d8dfdd19 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -467,6 +467,9 @@ enum dma_residue_granularity {
  *	should be checked by controller as well
  * @min_burst: min burst capability per-transfer
  * @max_burst: max burst capability per-transfer
+ * @max_sg_nents: max number of SG list entries executed in a single atomic
+ *	DMA transaction with no intermediate IRQ for reinitialization. Zero
+ *	value means unlimited number of entries.
  * @cmd_pause: true, if pause is supported (i.e. for reading residue or
  *	       for resume later)
  * @cmd_resume: true, if resume is supported
@@ -481,6 +484,7 @@ struct dma_slave_caps {
 	u32 directions;
 	u32 min_burst;
 	u32 max_burst;
+	u32 max_sg_nents;
 	bool cmd_pause;
 	bool cmd_resume;
 	bool cmd_terminate;
@@ -773,6 +777,9 @@ struct dma_filter {
  *	should be checked by controller as well
  * @min_burst: min burst capability per-transfer
  * @max_burst: max burst capability per-transfer
+ * @max_sg_nents: max number of SG list entries executed in a single atomic
+ *	DMA transaction with no intermediate IRQ for reinitialization. Zero
+ *	value means unlimited number of entries.
  * @residue_granularity: granularity of the transfer residue reported
  *	by tx_status
  * @device_alloc_chan_resources: allocate resources and return the
@@ -844,6 +851,7 @@ struct dma_device {
 	u32 directions;
 	u32 min_burst;
 	u32 max_burst;
+	u32 max_sg_nents;
 	bool descriptor_reuse;
 	enum dma_residue_granularity residue_granularity;
 
-- 
2.26.2
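
For illustration only, here is a minimal userspace sketch of how a DMA consumer might interpret the max_sg_nents capability described above. The helper names sg_nents_per_submit() and sg_submit_count() are hypothetical and not part of this patch; the only thing taken from the patch is the documented semantics (zero means an unlimited number of entries).

```c
#include <stdint.h>

/*
 * Hypothetical client-side helpers, not part of the patch. max_sg_nents
 * follows the semantics documented in the patch: 0 means "no limit".
 */

/* Number of SG entries the client may hand over in one atomic submission. */
static uint32_t sg_nents_per_submit(uint32_t nents, uint32_t max_sg_nents)
{
	if (max_sg_nents == 0 || nents <= max_sg_nents)
		return nents;

	return max_sg_nents;
}

/* Number of separate submissions needed to cover the whole SG list. */
static uint32_t sg_submit_count(uint32_t nents, uint32_t max_sg_nents)
{
	uint32_t chunk = sg_nents_per_submit(nents, max_sg_nents);

	if (chunk == 0)
		return 0;

	/* Round up so that a partial trailing chunk is counted too. */
	return (nents + chunk - 1) / chunk;
}
```

With max_sg_nents == 1 an eight-entry list would need eight separate submissions, which is the case where a consumer like SPI has to keep the Tx and Rx channels in step itself.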
There are DMA devices (like our version of the Synopsys DW DMAC) which have DMA capabilities non-uniformly distributed among the device channels. In order to provide a way of exposing the channel-specific parameters to the DMA engine consumers, we introduce a new DMA-device callback. If provided, it gets called from the dma_get_slave_caps() method and is able to override the generic DMA-device capabilities.

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: devicetree@vger.kernel.org

---

Changelog v3:
- This is a new patch created as a result of the discussion with Vinod and
  Andy in the framework of DW DMA burst and LLP capabilities.
---
 drivers/dma/dmaengine.c   | 3 +++
 include/linux/dmaengine.h | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index ad56ad58932c..edbb11d56cde 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -599,6 +599,9 @@ int dma_get_slave_caps(struct dma_chan *chan, struct dma_slave_caps *caps)
 	caps->cmd_resume = !!device->device_resume;
 	caps->cmd_terminate = !!device->device_terminate_all;
 
+	if (device->device_caps)
+		device->device_caps(chan, caps);
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(dma_get_slave_caps);
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index a7e4d8dfdd19..b303e59929e5 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -899,6 +899,8 @@ struct dma_device {
 		struct dma_chan *chan, dma_addr_t dst, u64 data,
 		unsigned long flags);
 
+	void (*device_caps)(struct dma_chan *chan,
+			    struct dma_slave_caps *caps);
 	int (*device_config)(struct dma_chan *chan,
 			     struct dma_slave_config *config);
 	int (*device_pause)(struct dma_chan *chan);
-- 
2.26.2
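
As a rough userspace model of the override order introduced here (generic device capabilities first, then the optional per-channel callback), something like the following could be used. The structures are cut-down stand-ins for the kernel ones, and the chan_max_burst field, get_slave_caps() and example_caps() names are assumptions made purely for this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel structures; only the fields used here. */
struct dma_slave_caps {
	uint32_t min_burst;
	uint32_t max_burst;
};

struct dma_chan;

struct dma_device {
	uint32_t min_burst;
	uint32_t max_burst;
	void (*device_caps)(struct dma_chan *chan, struct dma_slave_caps *caps);
};

struct dma_chan {
	struct dma_device *device;
	uint32_t chan_max_burst;	/* hypothetical per-channel limit */
};

/* Model of dma_get_slave_caps(): generic values first, then the override. */
static int get_slave_caps(struct dma_chan *chan, struct dma_slave_caps *caps)
{
	struct dma_device *device;

	if (!chan || !caps)
		return -EINVAL;

	device = chan->device;
	caps->min_burst = device->min_burst;
	caps->max_burst = device->max_burst;

	/* The callback is optional; without it the generic values stay. */
	if (device->device_caps)
		device->device_caps(chan, caps);

	return 0;
}

/* Example callback: report the channel-specific max burst length. */
static void example_caps(struct dma_chan *chan, struct dma_slave_caps *caps)
{
	caps->max_burst = chan->chan_max_burst;
}
```

The callback only touches the fields it wants to override, so a driver that doesn't provide one keeps the existing behaviour unchanged.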
Full multi-block transfers functionality is enabled in the DW DMA controller only if CHx_MULTI_BLK_EN is set. But LLP-based transfers can be executed only if the hardcode channel x LLP register feature isn't enabled. That feature can be switched on at the IP core synthesis for optimization. If it's enabled then the LLP register is hardcoded to zero, so the blocks chaining based on the LLPs is unsupported.

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: devicetree@vger.kernel.org

---

Changelog v2:
- Rearrange SoBs.
- Add comment about why hardware accelerated LLP list support depends on both
  MBLK_EN and HC_LLP configs setting.
- Use explicit bits state comparison operator.

Changelog v3:
- Move the patch to the head of the series.
---
 drivers/dma/dw/core.c | 11 ++++++++++-
 drivers/dma/dw/regs.h |  1 +
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index 21cb2a58dbd2..33e99d95b3d3 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -1178,8 +1178,17 @@ int do_dma_probe(struct dw_dma_chip *chip)
 			 */
 			dwc->block_size =
 				(4 << ((pdata->block_size >> 4 * i) & 0xf)) - 1;
+
+			/*
+			 * According to the DW DMA databook the true scatter-
+			 * gather LLPs aren't available if either multi-block
+			 * config is disabled (CHx_MULTI_BLK_EN == 0) or the
+			 * LLP register is hard-coded to zeros
+			 * (CHx_HC_LLP == 1).
+			 */
 			dwc->nollp =
-				(dwc_params >> DWC_PARAMS_MBLK_EN & 0x1) == 0;
+				(dwc_params >> DWC_PARAMS_MBLK_EN & 0x1) == 0 ||
+				(dwc_params >> DWC_PARAMS_HC_LLP & 0x1) == 1;
 		} else {
 			dwc->block_size = pdata->block_size;
 			dwc->nollp = !pdata->multi_block[i];
diff --git a/drivers/dma/dw/regs.h b/drivers/dma/dw/regs.h
index 3fce66ecee7a..1ab840b06e79 100644
--- a/drivers/dma/dw/regs.h
+++ b/drivers/dma/dw/regs.h
@@ -125,6 +125,7 @@ struct dw_dma_regs {
 
 /* Bitfields in DWC_PARAMS */
 #define DWC_PARAMS_MBLK_EN	11		/* multi block transfer */
+#define DWC_PARAMS_HC_LLP	13		/* set LLP register to zero */
 
 /* bursts size */
 enum dw_dma_msize {
-- 
2.26.2
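
The nollp condition from this patch can be checked in isolation with a small userspace sketch. The bit positions are the ones defined in regs.h by the diff above; the dwc_params_nollp() wrapper is a hypothetical helper introduced only for this example and doesn't exist in the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in DWC_PARAMS, as defined in drivers/dma/dw/regs.h. */
#define DWC_PARAMS_MBLK_EN	11	/* multi block transfer */
#define DWC_PARAMS_HC_LLP	13	/* set LLP register to zero */

/*
 * Hypothetical helper (not in the driver): true if the channel can't do
 * LLP-based multi-block transfers, i.e. either multi-block support is
 * disabled or the LLP register is hardcoded to zero.
 */
static bool dwc_params_nollp(uint32_t dwc_params)
{
	return ((dwc_params >> DWC_PARAMS_MBLK_EN) & 0x1) == 0 ||
	       ((dwc_params >> DWC_PARAMS_HC_LLP) & 0x1) == 1;
}
```

Only the combination MBLK_EN == 1 and HC_LLP == 0 leaves the hardware LLP chaining usable; the other three combinations set nollp.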
The maximum block size DW DMAC configuration corresponds to the max segment size DMA parameter in the DMA core subsystem notation. Let's set it to a value specific to the probed DW DMA controller. It shall help the DMA clients to create size-optimized SG-list items for the controller. This in turn will cause fewer dw_desc allocations, fewer LLP reinitializations and better DMA device performance.

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: devicetree@vger.kernel.org

---

Changelog v2:
- This is a new patch created in place of the dropped one:
  "dmaengine: dw: Add LLP and block size config accessors".

Changelog v3:
- Use the block_size found for the very first channel instead of looking for
  the maximum of maximum block sizes.
- Don't define device-specific device_dma_parameters object, since it has
  already been defined by the platform device core.
---
 drivers/dma/dw/core.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index 33e99d95b3d3..fb95920c429e 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -1229,6 +1229,13 @@ int do_dma_probe(struct dw_dma_chip *chip)
 			     BIT(DMA_MEM_TO_MEM);
 	dw->dma.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
 
+	/*
+	 * For now there is no hardware with non uniform maximum block size
+	 * across all of the device channels, so we set the maximum segment
+	 * size as the block size found for the very first channel.
+	 */
+	dma_set_max_seg_size(dw->dma.dev, dw->chan[0].block_size);
+
 	err = dma_async_device_register(&dw->dma);
 	if (err)
 		goto err_dma_register;
-- 
2.26.2
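
To make the effect of the segment size limit more tangible, here is a small userspace sketch. decode_block_size() repeats the autoconfig expression visible in the do_dma_probe() context of the previous patch, while nr_segments() is a hypothetical helper that treats the limit as a plain upper bound on the segment length, which is how the patch passes it to dma_set_max_seg_size(). Neither function exists under these names in the driver.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Autoconfig decoding as used in do_dma_probe(): four bits per channel,
 * block size = (4 << n) - 1, which gives 3, 7, 15, ..., 4095.
 */
static uint32_t decode_block_size(uint32_t encoded, unsigned int chan)
{
	return (4u << ((encoded >> (4 * chan)) & 0xf)) - 1;
}

/*
 * Hypothetical helper: number of SG segments a client needs for a buffer
 * of @len when no segment may exceed @max_seg_size.
 */
static size_t nr_segments(size_t len, size_t max_seg_size)
{
	if (len == 0 || max_seg_size == 0)
		return 0;

	/* Round up so that the trailing partial segment is counted. */
	return (len + max_seg_size - 1) / max_seg_size;
}
```

A client that knows the limit can build segments that each fit into a single DMA block, so the driver doesn't have to split them into several descriptors on its own.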
Since some DW DMA controllers (like the one installed on the Baikal-T1 SoC) may have non-uniform DMA capabilities per device channel, let's add the DW DMA specific device_caps callback to expose those specifics to the DMA consumer. It's a dummy function for now. We'll fill it in with capability overrides in the next commits.

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: devicetree@vger.kernel.org

---

Changelog v3:
- This is a new patch created as a result of the discussion with Vinod and
  Andy in the framework of DW DMA burst and LLP capabilities.
---
 drivers/dma/dw/core.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index fb95920c429e..ceded21537e2 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -1049,6 +1049,11 @@ static void dwc_free_chan_resources(struct dma_chan *chan)
 	dev_vdbg(chan2dev(chan), "%s: done\n", __func__);
 }
 
+static void dwc_caps(struct dma_chan *chan, struct dma_slave_caps *caps)
+{
+
+}
+
 int do_dma_probe(struct dw_dma_chip *chip)
 {
 	struct dw_dma *dw = chip->dw;
@@ -1214,6 +1219,7 @@ int do_dma_probe(struct dw_dma_chip *chip)
 	dw->dma.device_prep_dma_memcpy = dwc_prep_dma_memcpy;
 	dw->dma.device_prep_slave_sg = dwc_prep_slave_sg;
 
+	dw->dma.device_caps = dwc_caps;
 	dw->dma.device_config = dwc_config;
 	dw->dma.device_pause = dwc_pause;
 	dw->dma.device_resume = dwc_resume;
-- 
2.26.2
According to the DW APB DMAC data book the minimum burst transaction
length is 1 and it's true for any version of the controller since it
isn't parametrised in the coreAssembler and so can't be changed at the
IP-core synthesis stage. Let's initialise the min_burst member of the
DMA controller descriptor so the DMA clients could use it to properly
optimize the DMA requests.

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: devicetree@vger.kernel.org

---

Changelog v4:
- This is a new patch suggested by Andy.
---
 drivers/dma/dw/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index ceded21537e2..a8cebb1dbb68 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -1229,6 +1229,7 @@ int do_dma_probe(struct dw_dma_chip *chip)
 	dw->dma.device_issue_pending = dwc_issue_pending;
 
 	/* DMA capabilities */
+	dw->dma.min_burst = 1;
 	dw->dma.src_addr_widths = DW_DMA_BUSWIDTHS;
 	dw->dma.dst_addr_widths = DW_DMA_BUSWIDTHS;
 	dw->dma.directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV) |
--
2.26.2
IP core of the DW DMA controller may be synthesized with different
max burst length of the transfers per each channel. According to Synopsys
having the fixed maximum burst transactions length may provide some
performance gain. At the same time setting up the source and destination
multi size exceeding the max burst length limitation may cause serious
problems. In our case the DMA transaction just hangs up. In order to fix
this let's introduce the max burst length platform config of the DW DMA
controller device and not let the DMA channels configuration code
exceed the burst length hardware limitation.

Note the maximum burst length parameter can be detected either at runtime
from the DWC parameter registers or from the dedicated DT property.
Depending on the IP core configuration the maximum value can vary from
channel to channel so by overriding the channel slave max_burst capability
we make sure a DMA consumer will get the channel-specific max burst
length.

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: devicetree@vger.kernel.org

---

Changelog v2:
- Rearrange SoBs.
- Discard dwc_get_maxburst() accessor. It's enough to have a clamping
  guard against exceeding the hardware max burst limitation.

Changelog v3:
- Override the slave channel max_burst capability instead of calculating
  the minimum value of max burst lengths and setting the DMA-device
  generic capability.

Changelog v4:
- Clamp the dst and src burst lengths in the generic dwc_config() method
  instead of doing that in the encode_maxburst() callback.
- Define max_burst with u32 type in struct dw_dma_platform_data.
- Perform of_property_read_u32_array() directly into the platform data
  max_burst member.
---
 drivers/dma/dw/core.c                | 10 ++++++++++
 drivers/dma/dw/of.c                  |  5 +++++
 drivers/dma/dw/regs.h                |  2 ++
 include/linux/platform_data/dma-dw.h |  4 ++++
 4 files changed, 21 insertions(+)

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index a8cebb1dbb68..60ef779fc5e0 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -791,6 +791,11 @@ static int dwc_config(struct dma_chan *chan, struct dma_slave_config *sconfig)
 
 	memcpy(&dwc->dma_sconfig, sconfig, sizeof(*sconfig));
 
+	dwc->dma_sconfig.src_maxburst =
+		clamp(dwc->dma_sconfig.src_maxburst, 0U, dwc->max_burst);
+	dwc->dma_sconfig.dst_maxburst =
+		clamp(dwc->dma_sconfig.dst_maxburst, 0U, dwc->max_burst);
+
 	dw->encode_maxburst(dwc, &dwc->dma_sconfig.src_maxburst);
 	dw->encode_maxburst(dwc, &dwc->dma_sconfig.dst_maxburst);
 
@@ -1051,7 +1056,9 @@ static void dwc_free_chan_resources(struct dma_chan *chan)
 
 static void dwc_caps(struct dma_chan *chan, struct dma_slave_caps *caps)
 {
+	struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
 
+	caps->max_burst = dwc->max_burst;
 }
 
 int do_dma_probe(struct dw_dma_chip *chip)
@@ -1194,9 +1201,12 @@ int do_dma_probe(struct dw_dma_chip *chip)
 			dwc->nollp =
 				(dwc_params >> DWC_PARAMS_MBLK_EN & 0x1) == 0 ||
 				(dwc_params >> DWC_PARAMS_HC_LLP & 0x1) == 1;
+			dwc->max_burst =
+				(0x4 << (dwc_params >> DWC_PARAMS_MSIZE & 0x7));
 		} else {
 			dwc->block_size = pdata->block_size;
 			dwc->nollp = !pdata->multi_block[i];
+			dwc->max_burst = pdata->max_burst[i] ?: DW_DMA_MAX_BURST;
 		}
 	}
 
diff --git a/drivers/dma/dw/of.c b/drivers/dma/dw/of.c
index 9e27831dee32..1474b3817ef4 100644
--- a/drivers/dma/dw/of.c
+++ b/drivers/dma/dw/of.c
@@ -98,6 +98,11 @@ struct dw_dma_platform_data *dw_dma_parse_dt(struct platform_device *pdev)
 			pdata->multi_block[tmp] = 1;
 	}
 
+	if (of_property_read_u32_array(np, "snps,max-burst-len", pdata->max_burst,
+				       nr_channels)) {
+		memset32(pdata->max_burst, DW_DMA_MAX_BURST, nr_channels);
+	}
+
 	if (!of_property_read_u32(np, "snps,dma-protection-control", &tmp)) {
 		if (tmp > CHAN_PROTCTL_MASK)
 			return NULL;
diff --git a/drivers/dma/dw/regs.h b/drivers/dma/dw/regs.h
index 1ab840b06e79..76654bd13c1a 100644
--- a/drivers/dma/dw/regs.h
+++ b/drivers/dma/dw/regs.h
@@ -126,6 +126,7 @@ struct dw_dma_regs {
 /* Bitfields in DWC_PARAMS */
 #define DWC_PARAMS_MBLK_EN	11		/* multi block transfer */
 #define DWC_PARAMS_HC_LLP	13		/* set LLP register to zero */
+#define DWC_PARAMS_MSIZE	16		/* max group transaction size */
 
 /* bursts size */
 enum dw_dma_msize {
@@ -284,6 +285,7 @@ struct dw_dma_chan {
 	/* hardware configuration */
 	unsigned int		block_size;
 	bool			nollp;
+	u32			max_burst;
 
 	/* custom slave configuration */
 	struct dw_dma_slave	dws;
diff --git a/include/linux/platform_data/dma-dw.h b/include/linux/platform_data/dma-dw.h
index f3eaf9ec00a1..29c484da2979 100644
--- a/include/linux/platform_data/dma-dw.h
+++ b/include/linux/platform_data/dma-dw.h
@@ -12,6 +12,7 @@
 
 #define DW_DMA_MAX_NR_MASTERS	4
 #define DW_DMA_MAX_NR_CHANNELS	8
+#define DW_DMA_MAX_BURST	256
 
 /**
  * struct dw_dma_slave - Controller-specific information about a slave
@@ -42,6 +43,8 @@ struct dw_dma_slave {
  * @data_width: Maximum data width supported by hardware per AHB master
  *		(in bytes, power of 2)
  * @multi_block: Multi block transfers supported by hardware per channel.
+ * @max_burst: Maximum value of burst transaction size supported by hardware
+ *	       per channel (in units of CTL.SRC_TR_WIDTH/CTL.DST_TR_WIDTH).
  * @protctl: Protection control signals setting per channel.
  */
 struct dw_dma_platform_data {
@@ -56,6 +59,7 @@ struct dw_dma_platform_data {
 	unsigned char	nr_masters;
 	unsigned char	data_width[DW_DMA_MAX_NR_MASTERS];
 	unsigned char	multi_block[DW_DMA_MAX_NR_CHANNELS];
+	u32		max_burst[DW_DMA_MAX_NR_CHANNELS];
 #define CHAN_PROTCTL_PRIVILEGED		BIT(0)
 #define CHAN_PROTCTL_BUFFERABLE		BIT(1)
 #define CHAN_PROTCTL_CACHEABLE		BIT(2)
--
2.26.2
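The two expressions at the heart of the patch above can be tried out in user
space. The sketch below is not the driver code: the helper names
dwc_decode_max_burst() and dwc_clamp_burst() are hypothetical, and only the
bit offset (16), the 3-bit mask and the "0x4 << field" formula are taken from
the diff.

```c
#include <stdint.h>

/* Bit offset of the max burst field in DWC_PARAMS, as defined in the patch. */
#define DWC_PARAMS_MSIZE	16

/*
 * Hypothetical helper mirroring the probe-time expression: decode the
 * per-channel max burst length (in TR_WIDTH units) from a DWC_PARAMS value.
 */
uint32_t dwc_decode_max_burst(uint32_t dwc_params)
{
	return UINT32_C(0x4) << ((dwc_params >> DWC_PARAMS_MSIZE) & 0x7);
}

/*
 * Hypothetical helper mirroring the clamp() call in dwc_config(): a burst
 * length requested by a client never exceeds the channel limit.
 */
uint32_t dwc_clamp_burst(uint32_t requested, uint32_t max_burst)
{
	return requested > max_burst ? max_burst : requested;
}
```

A field value of 2, for example, decodes to 16, so a client asking for a burst
of 32 on such a channel would be clamped to 16 instead of being passed to the
hardware unchanged.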
Multi-block support provides a way to map the kernel-specific SG-table so
the DW DMA device would handle it as a whole instead of handling the
SG-list items or so called LLP block items one by one. So if true LLP
list isn't supported by the DW DMA engine, then soft-LLP mode will be
utilized to load and execute each LLP-block one by one. The soft-LLP mode
of the DMA transactions execution might not work well for some DMA
consumers like SPI due to its Tx and Rx buffers inter-dependency. Let's
initialize the max_sg_nents DMA channels capability based on the nollp
flag state. If it's true, no hardware accelerated LLP is available and
max_sg_nents should be set to 1, which means that the DMA engine
can handle only a single SG list entry at a time. If noLLP is set to
false, then hardware accelerated LLP is supported and the DMA engine
can handle an unlimited number of SG entries in a single DMA transaction.

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: devicetree@vger.kernel.org

---

Changelog v3:
- This is a new patch created as a result of the discussion with Vinod and
  Andy in the framework of DW DMA burst and LLP capabilities.

Changelog v4:
- Use explicit if-else statement when assigning the max_sg_nents field.
---
 drivers/dma/dw/core.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index 60ef779fc5e0..b76eee75fde8 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -1059,6 +1059,18 @@ static void dwc_caps(struct dma_chan *chan, struct dma_slave_caps *caps)
 	struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
 
 	caps->max_burst = dwc->max_burst;
+
+	/*
+	 * It might be crucial for some devices to have the hardware
+	 * accelerated multi-block transfers supported, aka LLPs in DW DMAC
+	 * notation. So if LLPs are supported then max_sg_nents is set to
+	 * zero which means unlimited number of SG entries can be handled in a
+	 * single DMA transaction, otherwise it's just one SG entry.
+	 */
+	if (dwc->nollp)
+		caps->max_sg_nents = 1;
+	else
+		caps->max_sg_nents = 0;
 }
 
 int do_dma_probe(struct dw_dma_chip *chip)
--
2.26.2
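For readers on the consumer side, the convention described above (zero means
no limit, any other value is the limit itself) could be applied roughly as in
the following user-space sketch. The helper name sg_entries_per_transaction()
is hypothetical and is not part of the DMA engine API.

```c
#include <stddef.h>

/*
 * Hypothetical client-side helper: how many SG entries may be handed to the
 * DMA engine in one transaction, given the max_sg_nents capability.
 * Per the patch, 0 means "no limit"; any other value is the limit itself.
 */
size_t sg_entries_per_transaction(size_t nents, size_t max_sg_nents)
{
	if (max_sg_nents == 0 || nents <= max_sg_nents)
		return nents;

	return max_sg_nents;
}
```

A consumer such as an SPI controller driver might use a result of 1 as a hint
to submit its SG entries one at a time and keep the Tx and Rx sides in step,
instead of relying on the soft-LLP emulation.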
On Fri, May 29, 2020 at 01:23:59AM +0300, Serge Semin wrote:
> According to the DW APB DMAC data book the minimum burst transaction
> length is 1 and it's true for any version of the controller since it
> isn't parametrised in the coreAssembler and so can't be changed at the
> IP-core synthesis stage. Let's initialise the min_burst member of the
> DMA controller descriptor so the DMA clients could use it to properly
> optimize the DMA requests.

> @@ -1229,6 +1229,7 @@ int do_dma_probe(struct dw_dma_chip *chip)
>  	dw->dma.device_issue_pending = dwc_issue_pending;
>
>  	/* DMA capabilities */
> +	dw->dma.min_burst = 1;

Perhaps then relaxed maximum, like

	dw->dma.max_burst = 256;

(channels will update this)

?

>  	dw->dma.src_addr_widths = DW_DMA_BUSWIDTHS;
>  	dw->dma.dst_addr_widths = DW_DMA_BUSWIDTHS;
>  	dw->dma.directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV) |
> --
> 2.26.2
>

--
With Best Regards,
Andy Shevchenko
On Fri, May 29, 2020 at 01:24:00AM +0300, Serge Semin wrote:
> IP core of the DW DMA controller may be synthesized with different
> max burst length of the transfers per each channel. According to Synopsys
> having the fixed maximum burst transactions length may provide some
> performance gain. At the same time setting up the source and destination
> multi size exceeding the max burst length limitation may cause serious
> problems. In our case the DMA transaction just hangs up. In order to fix
> this let's introduce the max burst length platform config of the DW DMA
> controller device and not let the DMA channels configuration code
> exceed the burst length hardware limitation.
>
> Note the maximum burst length parameter can be detected either at runtime
> from the DWC parameter registers or from the dedicated DT property.
> Depending on the IP core configuration the maximum value can vary from
> channel to channel so by overriding the channel slave max_burst capability
> we make sure a DMA consumer will get the channel-specific max burst
> length.

LGTM, but consider the comment on the previous patch (in that case perhaps
the definitions of min and max should be moved there).

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

>
> Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
> Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Rob Herring <robh+dt@kernel.org>
> Cc: linux-mips@vger.kernel.org
> Cc: devicetree@vger.kernel.org
>
> ---
>
> Changelog v2:
> - Rearrange SoBs.
> - Discard dwc_get_maxburst() accessor. It's enough to have a clamping
>   guard against exceeding the hardware max burst limitation.
>
> Changelog v3:
> - Override the slave channel max_burst capability instead of calculating
>   the minimum value of max burst lengths and setting the DMA-device
>   generic capability.
>
> Changelog v4:
> - Clamp the dst and src burst lengths in the generic dwc_config() method
>   instead of doing that in the encode_maxburst() callback.
> - Define max_burst with u32 type in struct dw_dma_platform_data.
> - Perform of_property_read_u32_array() directly into the platform data
>   max_burst member.
> ---
>  drivers/dma/dw/core.c                | 10 ++++++++++
>  drivers/dma/dw/of.c                  |  5 +++++
>  drivers/dma/dw/regs.h                |  2 ++
>  include/linux/platform_data/dma-dw.h |  4 ++++
>  4 files changed, 21 insertions(+)
>
> diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
> index a8cebb1dbb68..60ef779fc5e0 100644
> --- a/drivers/dma/dw/core.c
> +++ b/drivers/dma/dw/core.c
> @@ -791,6 +791,11 @@ static int dwc_config(struct dma_chan *chan, struct dma_slave_config *sconfig)
>
>  	memcpy(&dwc->dma_sconfig, sconfig, sizeof(*sconfig));
>
> +	dwc->dma_sconfig.src_maxburst =
> +		clamp(dwc->dma_sconfig.src_maxburst, 0U, dwc->max_burst);
> +	dwc->dma_sconfig.dst_maxburst =
> +		clamp(dwc->dma_sconfig.dst_maxburst, 0U, dwc->max_burst);
> +
>  	dw->encode_maxburst(dwc, &dwc->dma_sconfig.src_maxburst);
>  	dw->encode_maxburst(dwc, &dwc->dma_sconfig.dst_maxburst);
>
> @@ -1051,7 +1056,9 @@ static void dwc_free_chan_resources(struct dma_chan *chan)
>
>  static void dwc_caps(struct dma_chan *chan, struct dma_slave_caps *caps)
>  {
> +	struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
>
> +	caps->max_burst = dwc->max_burst;
>  }
>
>  int do_dma_probe(struct dw_dma_chip *chip)
> @@ -1194,9 +1201,12 @@ int do_dma_probe(struct dw_dma_chip *chip)
>  			dwc->nollp =
>  				(dwc_params >> DWC_PARAMS_MBLK_EN & 0x1) == 0 ||
>  				(dwc_params >> DWC_PARAMS_HC_LLP & 0x1) == 1;
> +			dwc->max_burst =
> +				(0x4 << (dwc_params >> DWC_PARAMS_MSIZE & 0x7));
>  		} else {
>  			dwc->block_size = pdata->block_size;
>  			dwc->nollp = !pdata->multi_block[i];
> +			dwc->max_burst = pdata->max_burst[i] ?: DW_DMA_MAX_BURST;
>  		}
>  	}
>
> diff --git a/drivers/dma/dw/of.c b/drivers/dma/dw/of.c
> index 9e27831dee32..1474b3817ef4 100644
> --- a/drivers/dma/dw/of.c
> +++ b/drivers/dma/dw/of.c
> @@ -98,6 +98,11 @@ struct dw_dma_platform_data *dw_dma_parse_dt(struct platform_device *pdev)
>  			pdata->multi_block[tmp] = 1;
>  	}
>
> +	if (of_property_read_u32_array(np, "snps,max-burst-len", pdata->max_burst,
> +				       nr_channels)) {
> +		memset32(pdata->max_burst, DW_DMA_MAX_BURST, nr_channels);
> +	}
> +
>  	if (!of_property_read_u32(np, "snps,dma-protection-control", &tmp)) {
>  		if (tmp > CHAN_PROTCTL_MASK)
>  			return NULL;
> diff --git a/drivers/dma/dw/regs.h b/drivers/dma/dw/regs.h
> index 1ab840b06e79..76654bd13c1a 100644
> --- a/drivers/dma/dw/regs.h
> +++ b/drivers/dma/dw/regs.h
> @@ -126,6 +126,7 @@ struct dw_dma_regs {
>  /* Bitfields in DWC_PARAMS */
>  #define DWC_PARAMS_MBLK_EN	11		/* multi block transfer */
>  #define DWC_PARAMS_HC_LLP	13		/* set LLP register to zero */
> +#define DWC_PARAMS_MSIZE	16		/* max group transaction size */
>
>  /* bursts size */
>  enum dw_dma_msize {
> @@ -284,6 +285,7 @@ struct dw_dma_chan {
>  	/* hardware configuration */
>  	unsigned int		block_size;
>  	bool			nollp;
> +	u32			max_burst;
>
>  	/* custom slave configuration */
>  	struct dw_dma_slave	dws;
> diff --git a/include/linux/platform_data/dma-dw.h b/include/linux/platform_data/dma-dw.h
> index f3eaf9ec00a1..29c484da2979 100644
> --- a/include/linux/platform_data/dma-dw.h
> +++ b/include/linux/platform_data/dma-dw.h
> @@ -12,6 +12,7 @@
>
>  #define DW_DMA_MAX_NR_MASTERS	4
>  #define DW_DMA_MAX_NR_CHANNELS	8
> +#define DW_DMA_MAX_BURST	256
>
>  /**
>   * struct dw_dma_slave - Controller-specific information about a slave
> @@ -42,6 +43,8 @@ struct dw_dma_slave {
>   * @data_width: Maximum data width supported by hardware per AHB master
>   *		(in bytes, power of 2)
>   * @multi_block: Multi block transfers supported by hardware per channel.
> + * @max_burst: Maximum value of burst transaction size supported by hardware
> + *	       per channel (in units of CTL.SRC_TR_WIDTH/CTL.DST_TR_WIDTH).
>   * @protctl: Protection control signals setting per channel.
>   */
>  struct dw_dma_platform_data {
> @@ -56,6 +59,7 @@ struct dw_dma_platform_data {
>  	unsigned char	nr_masters;
>  	unsigned char	data_width[DW_DMA_MAX_NR_MASTERS];
>  	unsigned char	multi_block[DW_DMA_MAX_NR_CHANNELS];
> +	u32		max_burst[DW_DMA_MAX_NR_CHANNELS];
>  #define CHAN_PROTCTL_PRIVILEGED		BIT(0)
>  #define CHAN_PROTCTL_BUFFERABLE		BIT(1)
>  #define CHAN_PROTCTL_CACHEABLE		BIT(2)
> --
> 2.26.2
>

--
With Best Regards,
Andy Shevchenko
On Fri, May 29, 2020 at 01:24:01AM +0300, Serge Semin wrote:
> Multi-block support provides a way to map the kernel-specific SG-table so
> the DW DMA device would handle it as a whole instead of handling the
> SG-list items or so called LLP block items one by one. So if true LLP
> list isn't supported by the DW DMA engine, then soft-LLP mode will be
> utilized to load and execute each LLP-block one by one. The soft-LLP mode
> of the DMA transactions execution might not work well for some DMA
> consumers like SPI due to its Tx and Rx buffers inter-dependency. Let's
> initialize the max_sg_nents DMA channels capability based on the nollp
> flag state. If it's true, no hardware accelerated LLP is available and
> max_sg_nents should be set to 1, which means that the DMA engine
> can handle only a single SG list entry at a time. If noLLP is set to
> false, then hardware accelerated LLP is supported and the DMA engine
> can handle an unlimited number of SG entries in a single DMA transaction.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

>
> Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
> Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Rob Herring <robh+dt@kernel.org>
> Cc: linux-mips@vger.kernel.org
> Cc: devicetree@vger.kernel.org
>
> ---
>
> Changelog v3:
> - This is a new patch created as a result of the discussion with Vinod and
>   Andy in the framework of DW DMA burst and LLP capabilities.
>
> Changelog v4:
> - Use explicit if-else statement when assigning the max_sg_nents field.
> ---
>  drivers/dma/dw/core.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
> index 60ef779fc5e0..b76eee75fde8 100644
> --- a/drivers/dma/dw/core.c
> +++ b/drivers/dma/dw/core.c
> @@ -1059,6 +1059,18 @@ static void dwc_caps(struct dma_chan *chan, struct dma_slave_caps *caps)
>  	struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
>
>  	caps->max_burst = dwc->max_burst;
> +
> +	/*
> +	 * It might be crucial for some devices to have the hardware
> +	 * accelerated multi-block transfers supported, aka LLPs in DW DMAC
> +	 * notation. So if LLPs are supported then max_sg_nents is set to
> +	 * zero which means unlimited number of SG entries can be handled in a
> +	 * single DMA transaction, otherwise it's just one SG entry.
> +	 */
> +	if (dwc->nollp)
> +		caps->max_sg_nents = 1;
> +	else
> +		caps->max_sg_nents = 0;
>  }
>
>  int do_dma_probe(struct dw_dma_chip *chip)
> --
> 2.26.2
>

--
With Best Regards,
Andy Shevchenko
On Fri, May 29, 2020 at 01:25:15PM +0300, Andy Shevchenko wrote:
> On Fri, May 29, 2020 at 01:23:59AM +0300, Serge Semin wrote:
> > According to the DW APB DMAC data book the minimum burst transaction
> > length is 1 and it's true for any version of the controller since it
> > isn't parametrised in the coreAssembler and so can't be changed at the
> > IP-core synthesis stage. Let's initialise the min_burst member of the
> > DMA controller descriptor so the DMA clients could use it to properly
> > optimize the DMA requests.

...

> >  	/* DMA capabilities */
>
> > +	dw->dma.min_burst = 1;
>
> Perhaps then relaxed maximum, like
>
> 	dw->dma.max_burst = 256;
>
> (channels will update this)
>
> ?

And forgot to mention that perhaps we need definitions for both.

> >  	dw->dma.src_addr_widths = DW_DMA_BUSWIDTHS;
> >  	dw->dma.dst_addr_widths = DW_DMA_BUSWIDTHS;

--
With Best Regards,
Andy Shevchenko
On Fri, May 29, 2020 at 01:29:02PM +0300, Andy Shevchenko wrote:
> On Fri, May 29, 2020 at 01:25:15PM +0300, Andy Shevchenko wrote:
> > On Fri, May 29, 2020 at 01:23:59AM +0300, Serge Semin wrote:
> > > According to the DW APB DMAC data book the minimum burst transaction
> > > length is 1 and it's true for any version of the controller since it
> > > isn't parametrised in the coreAssembler and so can't be changed at the
> > > IP-core synthesis stage. Let's initialise the min_burst member of the
> > > DMA controller descriptor so the DMA clients could use it to properly
> > > optimize the DMA requests.
>
> ...
>
> > >  	/* DMA capabilities */
> >
> > > +	dw->dma.min_burst = 1;
> >
> > Perhaps then relaxed maximum, like
> >
> > 	dw->dma.max_burst = 256;
> >
> > (channels will update this)
> >
> > ?
>
> And forgot to mention that perhaps we need definitions for both.

By "definitions for both" do you mean a macro with corresponding parameter
definition like it's done for the max burst length in the next patch?
Something like this:
--- include/linux/platform_data/dma-dw.h
+++ include/linux/platform_data/dma-dw.h
+#define DW_DMA_MIN_BURST	1
+#define DW_DMA_MAX_BURST	256

?

-Sergey

>
> > >  	dw->dma.src_addr_widths = DW_DMA_BUSWIDTHS;
> > >  	dw->dma.dst_addr_widths = DW_DMA_BUSWIDTHS;
>
> --
> With Best Regards,
> Andy Shevchenko
>
On Fri, May 29, 2020 at 01:41:19PM +0300, Serge Semin wrote:
> On Fri, May 29, 2020 at 01:29:02PM +0300, Andy Shevchenko wrote:
> > On Fri, May 29, 2020 at 01:25:15PM +0300, Andy Shevchenko wrote:
> > > On Fri, May 29, 2020 at 01:23:59AM +0300, Serge Semin wrote:

...

> > > >  	/* DMA capabilities */
> > > > +	dw->dma.min_burst = 1;
> > >
> > > Perhaps then relaxed maximum, like
> > >
> > > 	dw->dma.max_burst = 256;
> > >
> > > (channels will update this)
> > >
> > > ?
>
> > And forgot to mention that perhaps we need definitions for both.
>
> By "definitions for both" do you mean a macro with corresponding parameter
> definition like it's done for the max burst length in the next patch?
> Something like this:
> --- include/linux/platform_data/dma-dw.h
> +++ include/linux/platform_data/dma-dw.h
> +#define DW_DMA_MIN_BURST	1
> +#define DW_DMA_MAX_BURST	256
>
> ?

Yes!

--
With Best Regards,
Andy Shevchenko
On Fri, May 29, 2020 at 01:50:09PM +0300, Andy Shevchenko wrote:
> On Fri, May 29, 2020 at 01:41:19PM +0300, Serge Semin wrote:
> > On Fri, May 29, 2020 at 01:29:02PM +0300, Andy Shevchenko wrote:
> > > On Fri, May 29, 2020 at 01:25:15PM +0300, Andy Shevchenko wrote:
> > > > On Fri, May 29, 2020 at 01:23:59AM +0300, Serge Semin wrote:
>
> ...
>
> > > > >  	/* DMA capabilities */
> > > > > +	dw->dma.min_burst = 1;
> > > >
> > > > Perhaps then relaxed maximum, like
> > > >
> > > > 	dw->dma.max_burst = 256;
> > > >
> > > > (channels will update this)
> > > >
> > > > ?
> >
> > > And forgot to mention that perhaps we need definitions for both.
> >
> > By "definitions for both" do you mean a macro with corresponding parameter
> > definition like it's done for the max burst length in the next patch?
> > Something like this:
> > --- include/linux/platform_data/dma-dw.h
> > +++ include/linux/platform_data/dma-dw.h
> > +#define DW_DMA_MIN_BURST	1
> > +#define DW_DMA_MAX_BURST	256
> >
> > ?
>
> Yes!

Ok. Good idea. I'll do that. Thanks.

-Sergey

>
> --
> With Best Regards,
> Andy Shevchenko
>
On Fri, May 29, 2020 at 01:23:53AM +0300, Serge Semin wrote:
> Some hardware aside from default 0/1 may have greater minimum burst
> transactions length constraints. Here we introduce the DMA device
> and slave capability, which if required can be initialized by the DMA
> engine driver with the device-specific value.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

> Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
> Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Cc: Rob Herring <robh+dt@kernel.org>
> Cc: linux-mips@vger.kernel.org
> Cc: devicetree@vger.kernel.org
>
> ---
>
> Changelog v3:
> - This is a new patch created as a result of the discussion with Vinod and
>   Andy in the framework of DW DMA burst and LLP capabilities.
> ---
>  drivers/dma/dmaengine.c   | 1 +
>  include/linux/dmaengine.h | 4 ++++
>  2 files changed, 5 insertions(+)
>
> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> index d31076d9ef25..b332ffe52780 100644
> --- a/drivers/dma/dmaengine.c
> +++ b/drivers/dma/dmaengine.c
> @@ -590,6 +590,7 @@ int dma_get_slave_caps(struct dma_chan *chan, struct dma_slave_caps *caps)
>  	caps->src_addr_widths = device->src_addr_widths;
>  	caps->dst_addr_widths = device->dst_addr_widths;
>  	caps->directions = device->directions;
> +	caps->min_burst = device->min_burst;
>  	caps->max_burst = device->max_burst;
>  	caps->residue_granularity = device->residue_granularity;
>  	caps->descriptor_reuse = device->descriptor_reuse;
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index e1c03339918f..0c7403b27133 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -465,6 +465,7 @@ enum dma_residue_granularity {
>   *	Since the enum dma_transfer_direction is not defined as bit flag for
>   *	each type, the dma controller should set BIT(<TYPE>) and same
>   *	should be checked by controller as well
> + * @min_burst: min burst capability per-transfer
>   * @max_burst: max burst capability per-transfer
>   * @cmd_pause: true, if pause is supported (i.e. for reading residue or
>   *	       for resume later)
> @@ -478,6 +479,7 @@ struct dma_slave_caps {
>  	u32 src_addr_widths;
>  	u32 dst_addr_widths;
>  	u32 directions;
> +	u32 min_burst;
>  	u32 max_burst;
>  	bool cmd_pause;
>  	bool cmd_resume;
> @@ -769,6 +771,7 @@ struct dma_filter {
>   *	Since the enum dma_transfer_direction is not defined as bit flag for
>   *	each type, the dma controller should set BIT(<TYPE>) and same
>   *	should be checked by controller as well
> + * @min_burst: min burst capability per-transfer
>   * @max_burst: max burst capability per-transfer
>   * @residue_granularity: granularity of the transfer residue reported
>   *	by tx_status
> @@ -839,6 +842,7 @@ struct dma_device {
>  	u32 src_addr_widths;
>  	u32 dst_addr_widths;
>  	u32 directions;
> +	u32 min_burst;
>  	u32 max_burst;
>  	bool descriptor_reuse;
>  	enum dma_residue_granularity residue_granularity;
> --
> 2.26.2
>

--
With Best Regards,
Andy Shevchenko
On Fri, May 29, 2020 at 01:23:55AM +0300, Serge Semin wrote:
> There are DMA devices (like our version of Synopsys DW DMAC) which have
> DMA capabilities non-uniformly distributed amongst the device channels.
> In order to provide a way of exposing the channel-specific parameters to
> the DMA engine consumers, we introduce a new DMA-device callback. If
> provided, it gets called from the dma_get_slave_caps() method and is
> able to override the generic DMA-device capabilities.

I thought there was a pattern to return something, but it seems there is none.
So, I have nothing against it returning void.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

But consider one comment below.

> Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
> Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Cc: Rob Herring <robh+dt@kernel.org>
> Cc: linux-mips@vger.kernel.org
> Cc: devicetree@vger.kernel.org
>
> ---
>
> Changelog v3:
> - This is a new patch created as a result of the discussion with Vinod and
>   Andy in the framework of DW DMA burst and LLP capabilities.
> ---
>  drivers/dma/dmaengine.c   | 3 +++
>  include/linux/dmaengine.h | 2 ++
>  2 files changed, 5 insertions(+)
>
> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> index ad56ad58932c..edbb11d56cde 100644
> --- a/drivers/dma/dmaengine.c
> +++ b/drivers/dma/dmaengine.c
> @@ -599,6 +599,9 @@ int dma_get_slave_caps(struct dma_chan *chan, struct dma_slave_caps *caps)
>  	caps->cmd_resume = !!device->device_resume;
>  	caps->cmd_terminate = !!device->device_terminate_all;
>

Perhaps a comment to explain that this is channel specific correction /
override / you name it on top of device level capabilities?

> +	if (device->device_caps)
> +		device->device_caps(chan, caps);
> +
>  	return 0;
>  }
>  EXPORT_SYMBOL_GPL(dma_get_slave_caps);
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index a7e4d8dfdd19..b303e59929e5 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -899,6 +899,8 @@ struct dma_device {
>  		struct dma_chan *chan, dma_addr_t dst, u64 data,
>  		unsigned long flags);
>
> +	void (*device_caps)(struct dma_chan *chan,
> +			    struct dma_slave_caps *caps);
>  	int (*device_config)(struct dma_chan *chan,
>  			     struct dma_slave_config *config);
>  	int (*device_pause)(struct dma_chan *chan);
> --
> 2.26.2
>

--
With Best Regards,
Andy Shevchenko
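The "device-level defaults first, optional per-channel override second"
pattern discussed in this review can be modelled in plain C. This is only a
user-space sketch under stated assumptions: every toy_* type and function
below is hypothetical and merely imitates the shape of
dma_get_slave_caps() and the device_caps callback from the quoted diff.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_chan;

/* Hypothetical stand-in for struct dma_slave_caps. */
struct toy_caps {
	uint32_t min_burst;
	uint32_t max_burst;
};

/* Hypothetical stand-in for struct dma_device. */
struct toy_device {
	uint32_t min_burst;
	uint32_t max_burst;
	/* Optional per-channel override; may be NULL. */
	void (*device_caps)(struct toy_chan *chan, struct toy_caps *caps);
};

/* Hypothetical stand-in for a driver-specific channel. */
struct toy_chan {
	struct toy_device *device;
	uint32_t max_burst;
};

/* Fill in the generic device capabilities, then let the driver adjust them. */
void toy_get_caps(struct toy_chan *chan, struct toy_caps *caps)
{
	struct toy_device *device = chan->device;

	caps->min_burst = device->min_burst;
	caps->max_burst = device->max_burst;

	/* Channel-specific correction on top of the device-level values. */
	if (device->device_caps)
		device->device_caps(chan, caps);
}

/* Example override: report the limit of this particular channel. */
void toy_chan_caps(struct toy_chan *chan, struct toy_caps *caps)
{
	caps->max_burst = chan->max_burst;
}
```

Since the callback only modifies an already initialized structure, it has
nothing to report back, which fits the void return type chosen in the patch.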
On Fri, May 29, 2020 at 01:23:57AM +0300, Serge Semin wrote:
> Maximum block size DW DMAC configuration corresponds to the max segment
> size DMA parameter in the DMA core subsystem notation. Let's set it with a
> value specific to the probed DW DMA controller. It shall help the DMA
> clients to create size-optimized SG-list items for the controller. This in
> turn will cause fewer dw_desc allocations, fewer LLP reinitializations and
> better DMA device performance.

Yes, something like that for the time being, thanks!

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

> Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
> Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Rob Herring <robh+dt@kernel.org>
> Cc: linux-mips@vger.kernel.org
> Cc: devicetree@vger.kernel.org
>
> ---
>
> Changelog v2:
> - This is a new patch created in place of the dropped one:
>   "dmaengine: dw: Add LLP and block size config accessors".
>
> Changelog v3:
> - Use the block_size found for the very first channel instead of looking for
>   the maximum of maximum block sizes.
> - Don't define device-specific device_dma_parameters object, since it has
>   already been defined by the platform device core.
> ---
>  drivers/dma/dw/core.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
> index 33e99d95b3d3..fb95920c429e 100644
> --- a/drivers/dma/dw/core.c
> +++ b/drivers/dma/dw/core.c
> @@ -1229,6 +1229,13 @@ int do_dma_probe(struct dw_dma_chip *chip)
>  			     BIT(DMA_MEM_TO_MEM);
>  	dw->dma.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
>
> +	/*
> +	 * For now there is no hardware with non uniform maximum block size
> +	 * across all of the device channels, so we set the maximum segment
> +	 * size as the block size found for the very first channel.
> +	 */
> +	dma_set_max_seg_size(dw->dma.dev, dw->chan[0].block_size);
> +
>  	err = dma_async_device_register(&dw->dma);
>  	if (err)
>  		goto err_dma_register;
> --
> 2.26.2
>

--
With Best Regards,
Andy Shevchenko
On Fri, May 29, 2020 at 01:23:58AM +0300, Serge Semin wrote:
> Since some DW DMA controllers (like the one installed on Baikal-T1 SoC) may
> have non-uniform DMA capabilities per device channel, let's add
> the DW DMA specific device_caps callback to expose those specifics up to
> the DMA consumer. It's a dummy function for now. We'll fill it in with
> capabilities overrides in the next commits.

This one I leave to Vinod to decide what to do.
It is not harmful per se, but I consider it better if it has a user already.
Thus, no tag, sorry.
--
With Best Regards,
Andy Shevchenko
On Fri, May 29, 2020 at 03:12:03PM +0300, Andy Shevchenko wrote:
> On Fri, May 29, 2020 at 01:23:55AM +0300, Serge Semin wrote:
> > There are DMA devices (like our version of Synopsys DW DMAC) which have
> > DMA capabilities non-uniformly distributed amongst the device channels.
> > In order to provide a way of exposing the channel-specific parameters to
> > the DMA engine consumers, we introduce a new DMA-device callback. If
> > provided, it gets called from the dma_get_slave_caps() method and is
> > able to override the generic DMA-device capabilities.
>
> I thought there was a pattern to return something, but it seems there is none.
> So, I have nothing against it returning void.
>
> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
>
> But consider one comment below.
>
> > Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
> > Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
> > Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
> > Cc: Arnd Bergmann <arnd@arndb.de>
> > Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > Cc: Rob Herring <robh+dt@kernel.org>
> > Cc: linux-mips@vger.kernel.org
> > Cc: devicetree@vger.kernel.org
> >
> > ---
> >
> > Changelog v3:
> > - This is a new patch created as a result of the discussion with Vinod and
> >   Andy in the framework of DW DMA burst and LLP capabilities.
> > ---
> >  drivers/dma/dmaengine.c   | 3 +++
> >  include/linux/dmaengine.h | 2 ++
> >  2 files changed, 5 insertions(+)
> >
> > diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> > index ad56ad58932c..edbb11d56cde 100644
> > --- a/drivers/dma/dmaengine.c
> > +++ b/drivers/dma/dmaengine.c
> > @@ -599,6 +599,9 @@ int dma_get_slave_caps(struct dma_chan *chan, struct dma_slave_caps *caps)
> >  	caps->cmd_resume = !!device->device_resume;
> >  	caps->cmd_terminate = !!device->device_terminate_all;
> >
>
> Perhaps a comment to explain that this is channel specific correction /
> override / you name it on top of device level capabilities?
>
> > +	if (device->device_caps)
> > +		device->device_caps(chan, caps);
> > +

Agreed. I also forgot to add a doc-comment above the struct dma_device
definition.

-Sergey

> >  	return 0;
> >  }
> >  EXPORT_SYMBOL_GPL(dma_get_slave_caps);
> > diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> > index a7e4d8dfdd19..b303e59929e5 100644
> > --- a/include/linux/dmaengine.h
> > +++ b/include/linux/dmaengine.h
> > @@ -899,6 +899,8 @@ struct dma_device {
> >  		struct dma_chan *chan, dma_addr_t dst, u64 data,
> >  		unsigned long flags);
> >
> > +	void (*device_caps)(struct dma_chan *chan,
> > +			    struct dma_slave_caps *caps);
> >  	int (*device_config)(struct dma_chan *chan,
> >  			     struct dma_slave_config *config);
> >  	int (*device_pause)(struct dma_chan *chan);
> > --
> > 2.26.2
> >
>
> --
> With Best Regards,
> Andy Shevchenko
>