dmaengine Archive on lore.kernel.org
* [PATCH 0/5] dmaengine: dw: Take Baikal-T1 SoC DW DMAC peculiarities into account
@ 2020-03-06 13:10 Sergey.Semin
  2020-03-06 13:29 ` Andy Shevchenko
  2020-05-08 10:52 ` [PATCH v2 0/6] " Serge Semin
  0 siblings, 2 replies; 72+ messages in thread
From: Sergey.Semin @ 2020-03-06 13:10 UTC
  Cc: Serge Semin, Serge Semin, Alexey Malahov, Maxim Kaurkin,
	Pavel Parkhomenko, Ramil Zaripov, Ekaterina Skachko,
	Vadim Vlasov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Viresh Kumar, Andy Shevchenko, Dan Williams, Vinod Koul,
	Rob Herring, Mark Rutland, dmaengine, devicetree, linux-kernel

From: Serge Semin <fancer.lancer@gmail.com>

Baikal-T1 SoC has a DW DMAC on-board to provide Mem-to-Mem and low-speed
peripheral Dev-to-Mem and Mem-to-Dev functionality. It's mostly compatible
with the DW DMAC driver currently implemented in the kernel, but there are some
peculiarities which must be taken into account in order to have the device
fully supported.

First of all, as is traditional, we replaced the legacy plain text-based
dt-binding file with a YAML-based one. Secondly, the Baikal-T1 DW DMA Controller
provides eight channels, which alas have different max burst length
configurations. In particular the first two channels may burst up to 128 bits
(16 bytes) at a time while the rest of them just up to 32 bits. We must make
sure that the DMA subsystem doesn't set values exceeding these limitations,
otherwise the controller will hang up. Thirdly, we've discovered a problem in
using the DW APB SPI driver together with the DW DMAC. The problem happens if
there is no natively implemented multi-block LLP transfer support and the
SPI-transfer length exceeds the max block size. In this case, due to the
asynchronous handling of the Tx and Rx SPI transfer interrupts, we might end up
with a DW APB SSI Rx FIFO overflow. So if DW APB SSI (or any other DMAC service
consumer) intends to use the DMAC to asynchronously execute the transfers, we'd
have to at least warn the user of the possible errors.
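
To illustrate the per-channel burst limitation described above, here is a
minimal userspace sketch of a clamp. It is not the driver code: the
chan_max_burst table, the clamp_burst() helper and the limits expressed in
bytes (16 for channels 0-1, 4 for the rest) are hypothetical names and values
derived only from the figures quoted in this letter.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_CHANNELS 8

/*
 * Hypothetical per-channel limits in bytes: channels 0-1 may burst up to
 * 16 bytes (128 bits), the remaining six up to 4 bytes (32 bits).
 */
static const uint32_t chan_max_burst[NR_CHANNELS] = {
	16, 16, 4, 4, 4, 4, 4, 4
};

/*
 * Return the requested burst length clamped to the channel limit,
 * or 0 if the channel index is out of range.
 */
uint32_t clamp_burst(size_t chan, uint32_t requested)
{
	if (chan >= NR_CHANNELS)
		return 0;

	return requested > chan_max_burst[chan] ? chan_max_burst[chan]
						: requested;
}
```

With such a table a request for a 32-byte burst on channel 0 would be reduced
to 16 bytes, and the same request on channel 2 would be reduced to 4 bytes, so
a value the controller can't handle is never programmed.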

Finally, there is a bug in the nollp flag detection algorithm.
In particular, even if the DW DMAC parameters state that multi-block transfers
are supported, there is still the HC_LLP (hardcode LLP) flag, which, if set,
makes the true multi-block LLP functionality expected by the driver unusable.
This happens because if the HC_LLP flag is set the LLP registers are hardcoded
to zero, so only contiguous multi-block transfers are supported. We must take
the flag into account when detecting the LLP support, otherwise the driver just
won't work correctly.
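
The detection rule described above can be sketched roughly as follows. This is
a simplified userspace illustration rather than the actual patch: struct
chan_params, its two fields and chan_nollp() are hypothetical names, and the
real driver reads the equivalent bits from the hardware parameter registers.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the per-channel hardware parameters. */
struct chan_params {
	bool multi_blk_en;	/* parameters advertise multi-block transfers */
	bool hc_llp;		/* LLP register is hardcoded to zero */
};

/*
 * LLP-based multi-block transfers are unusable either when multi-block
 * support is absent or when HC_LLP pins the LLP register to zero.
 */
bool chan_nollp(const struct chan_params *p)
{
	return !p->multi_blk_en || p->hc_llp;
}
```

The point of the second condition is that a channel advertising multi-block
support with HC_LLP set must still be treated as having no LLP.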

This patchset is rebased and tested on the mainline Linux kernel 5.6-rc4:
commit 98d54f81e36b ("Linux 5.6-rc4").

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Maxim Kaurkin <Maxim.Kaurkin@baikalelectronics.ru>
Cc: Pavel Parkhomenko <Pavel.Parkhomenko@baikalelectronics.ru>
Cc: Ramil Zaripov <Ramil.Zaripov@baikalelectronics.ru>
Cc: Ekaterina Skachko <Ekaterina.Skachko@baikalelectronics.ru>
Cc: Vadim Vlasov <V.Vlasov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Viresh Kumar <vireshk@kernel.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: dmaengine@vger.kernel.org
Cc: devicetree@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Serge Semin (5):
  dt-bindings: dma: dw: Replace DW DMAC legacy bindings with YAML-based
    one
  dt-bindings: dma: dw: Add max burst transaction length property
    bindings
  dmaengine: dw: Add LLP and block size config accessors
  dmaengine: dw: Introduce max burst length hw config
  dmaengine: dw: Take HC_LLP flag into account for noLLP auto-config

 .../bindings/dma/snps,dma-spear1340.yaml      | 180 ++++++++++++++++++
 .../devicetree/bindings/dma/snps-dma.txt      |  69 -------
 drivers/dma/dw/core.c                         |  24 ++-
 drivers/dma/dw/dw.c                           |   1 +
 drivers/dma/dw/of.c                           |   9 +
 drivers/dma/dw/regs.h                         |   3 +
 include/linux/platform_data/dma-dw.h          |  22 +++
 7 files changed, 238 insertions(+), 70 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml
 delete mode 100644 Documentation/devicetree/bindings/dma/snps-dma.txt

-- 
2.25.1



* Re: [PATCH 0/5] dmaengine: dw: Take Baikal-T1 SoC DW DMAC peculiarities into account
  2020-03-06 13:10 [PATCH 0/5] dmaengine: dw: Take Baikal-T1 SoC DW DMAC peculiarities into account Sergey.Semin
@ 2020-03-06 13:29 ` Andy Shevchenko
  2020-03-06 13:30   ` Andy Shevchenko
       [not found]   ` <20200306133756.0F74C8030793@mail.baikalelectronics.ru>
  2020-05-08 10:52 ` [PATCH v2 0/6] " Serge Semin
  1 sibling, 2 replies; 72+ messages in thread
From: Andy Shevchenko @ 2020-03-06 13:29 UTC
  To: Sergey.Semin
  Cc: Serge Semin, Alexey Malahov, Maxim Kaurkin, Pavel Parkhomenko,
	Ramil Zaripov, Ekaterina Skachko, Vadim Vlasov,
	Thomas Bogendoerfer, Paul Burton, Ralf Baechle, Viresh Kumar,
	Dan Williams, Vinod Koul, Rob Herring, Mark Rutland, dmaengine,
	devicetree, linux-kernel

On Fri, Mar 06, 2020 at 04:10:29PM +0300, Sergey.Semin@baikalelectronics.ru wrote:
> From: Serge Semin <fancer.lancer@gmail.com>
> 
> Baikal-T1 SoC has an DW DMAC on-board to provide a Mem-to-Mem, low-speed
> peripherals Dev-to-Mem and Mem-to-Dev functionality. Mostly it's compatible
> with currently implemented in the kernel DW DMAC driver, but there are some
> peculiarities which must be taken into account in order to have the device
> fully supported.
> 
> First of all traditionally we replaced the legacy plain text-based dt-binding
> file with yaml-based one. Secondly Baikal-T1 DW DMA Controller provides eight
> channels, which alas have different max burst length configuration.
> In particular first two channels may burst up to 128 bits (16 bytes) at a time
> while the rest of them just up to 32 bits. We must make sure that the DMA
> subsystem doesn't set values exceeding these limitations otherwise the
> controller will hang up. In third currently we discovered the problem in using
> the DW APB SPI driver together with DW DMAC. The problem happens if there is no
> natively implemented multi-block LLP transfers support and the SPI-transfer
> length exceeds the max lock size. In this case due to asynchronous handling of
> Tx- and Rx- SPI transfers interrupt we might end up with Dw APB SSI Rx FIFO
> overflow. So if DW APB SSI (or any other DMAC service consumer) intends to use
> the DMAC to asynchronously execute the transfers we'd have to at least warn
> the user of the possible errors.
> 
> Finally there is a bug in the algorithm of the nollp flag detection.
> In particular even if DW DMAC parameters state the multi-block transfers
> support there is still HC_LLP (hardcode LLP) flag, which if set makes expected
> by the driver true multi-block LLP functionality unusable. This happens cause'
> if HC_LLP flag is set the LLP registers will be hardcoded to zero so the
> contiguous multi-block transfers will be only supported. We must take the
> flag into account when detecting the LLP support otherwise the driver just
> won't work correctly.
> 
> This patchset is rebased and tested on the mainline Linux kernel 5.6-rc4:
> commit 98d54f81e36b ("Linux 5.6-rc4").

Thank you for your series!

I'll definitely review it, but it will take time. So, I think, due to the late
submission this is material for v5.8 at the earliest.

> 
> Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
> Signed-off-by: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
> Cc: Maxim Kaurkin <Maxim.Kaurkin@baikalelectronics.ru>
> Cc: Pavel Parkhomenko <Pavel.Parkhomenko@baikalelectronics.ru>
> Cc: Ramil Zaripov <Ramil.Zaripov@baikalelectronics.ru>
> Cc: Ekaterina Skachko <Ekaterina.Skachko@baikalelectronics.ru>
> Cc: Vadim Vlasov <V.Vlasov@baikalelectronics.ru>
> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
> Cc: Paul Burton <paulburton@kernel.org>
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Cc: Viresh Kumar <vireshk@kernel.org>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Vinod Koul <vkoul@kernel.org>
> Cc: Rob Herring <robh+dt@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: dmaengine@vger.kernel.org
> Cc: devicetree@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> 
> Serge Semin (5):
>   dt-bindings: dma: dw: Replace DW DMAC legacy bindings with YAML-based
>     one
>   dt-bindings: dma: dw: Add max burst transaction length property
>     bindings
>   dmaengine: dw: Add LLP and block size config accessors
>   dmaengine: dw: Introduce max burst length hw config
>   dmaengine: dw: Take HC_LLP flag into account for noLLP auto-config
> 
>  .../bindings/dma/snps,dma-spear1340.yaml      | 180 ++++++++++++++++++
>  .../devicetree/bindings/dma/snps-dma.txt      |  69 -------
>  drivers/dma/dw/core.c                         |  24 ++-
>  drivers/dma/dw/dw.c                           |   1 +
>  drivers/dma/dw/of.c                           |   9 +
>  drivers/dma/dw/regs.h                         |   3 +
>  include/linux/platform_data/dma-dw.h          |  22 +++
>  7 files changed, 238 insertions(+), 70 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml
>  delete mode 100644 Documentation/devicetree/bindings/dma/snps-dma.txt
> 
> -- 
> 2.25.1
> 

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH 0/5] dmaengine: dw: Take Baikal-T1 SoC DW DMAC peculiarities into account
  2020-03-06 13:29 ` Andy Shevchenko
@ 2020-03-06 13:30   ` Andy Shevchenko
  2020-03-06 13:43     ` Vinod Koul
       [not found]     ` <20200306135050.40094803087C@mail.baikalelectronics.ru>
       [not found]   ` <20200306133756.0F74C8030793@mail.baikalelectronics.ru>
  1 sibling, 2 replies; 72+ messages in thread
From: Andy Shevchenko @ 2020-03-06 13:30 UTC
  To: Sergey.Semin
  Cc: Serge Semin, Alexey Malahov, Maxim Kaurkin, Pavel Parkhomenko,
	Ramil Zaripov, Ekaterina Skachko, Vadim Vlasov,
	Thomas Bogendoerfer, Paul Burton, Ralf Baechle, Viresh Kumar,
	Dan Williams, Vinod Koul, Rob Herring, Mark Rutland, dmaengine,
	devicetree, linux-kernel

On Fri, Mar 06, 2020 at 03:29:12PM +0200, Andy Shevchenko wrote:
> On Fri, Mar 06, 2020 at 04:10:29PM +0300, Sergey.Semin@baikalelectronics.ru wrote:
> > From: Serge Semin <fancer.lancer@gmail.com>
> > 
> > Baikal-T1 SoC has an DW DMAC on-board to provide a Mem-to-Mem, low-speed
> > peripherals Dev-to-Mem and Mem-to-Dev functionality. Mostly it's compatible
> > with currently implemented in the kernel DW DMAC driver, but there are some
> > peculiarities which must be taken into account in order to have the device
> > fully supported.
> > 
> > First of all traditionally we replaced the legacy plain text-based dt-binding
> > file with yaml-based one. Secondly Baikal-T1 DW DMA Controller provides eight
> > channels, which alas have different max burst length configuration.
> > In particular first two channels may burst up to 128 bits (16 bytes) at a time
> > while the rest of them just up to 32 bits. We must make sure that the DMA
> > subsystem doesn't set values exceeding these limitations otherwise the
> > controller will hang up. In third currently we discovered the problem in using
> > the DW APB SPI driver together with DW DMAC. The problem happens if there is no
> > natively implemented multi-block LLP transfers support and the SPI-transfer
> > length exceeds the max lock size. In this case due to asynchronous handling of
> > Tx- and Rx- SPI transfers interrupt we might end up with Dw APB SSI Rx FIFO
> > overflow. So if DW APB SSI (or any other DMAC service consumer) intends to use
> > the DMAC to asynchronously execute the transfers we'd have to at least warn
> > the user of the possible errors.
> > 
> > Finally there is a bug in the algorithm of the nollp flag detection.
> > In particular even if DW DMAC parameters state the multi-block transfers
> > support there is still HC_LLP (hardcode LLP) flag, which if set makes expected
> > by the driver true multi-block LLP functionality unusable. This happens cause'
> > if HC_LLP flag is set the LLP registers will be hardcoded to zero so the
> > contiguous multi-block transfers will be only supported. We must take the
> > flag into account when detecting the LLP support otherwise the driver just
> > won't work correctly.
> > 
> > This patchset is rebased and tested on the mainline Linux kernel 5.6-rc4:
> > commit 98d54f81e36b ("Linux 5.6-rc4").
> 
> Thank you for your series!
> 
> I'll definitely review it, but it will take time. So, I think due to late
> submission this is material at least for v5.8.

One thing that I can tell immediately is that the email thread is broken in
this series. Whenever you do a series, use
`git format-patch --cover-letter --thread ...`, so that it links the mails
properly.

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH 0/5] dmaengine: dw: Take Baikal-T1 SoC DW DMAC peculiarities into account
  2020-03-06 13:30   ` Andy Shevchenko
@ 2020-03-06 13:43     ` Vinod Koul
       [not found]     ` <20200306135050.40094803087C@mail.baikalelectronics.ru>
  1 sibling, 0 replies; 72+ messages in thread
From: Vinod Koul @ 2020-03-06 13:43 UTC
  To: Andy Shevchenko
  Cc: Sergey.Semin, Serge Semin, Alexey Malahov, Maxim Kaurkin,
	Pavel Parkhomenko, Ramil Zaripov, Ekaterina Skachko,
	Vadim Vlasov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Viresh Kumar, Dan Williams, Rob Herring, Mark Rutland, dmaengine,
	devicetree, linux-kernel

On 06-03-20, 15:30, Andy Shevchenko wrote:
> On Fri, Mar 06, 2020 at 03:29:12PM +0200, Andy Shevchenko wrote:
> > On Fri, Mar 06, 2020 at 04:10:29PM +0300, Sergey.Semin@baikalelectronics.ru wrote:
> > > From: Serge Semin <fancer.lancer@gmail.com>
> > > 
> > > Baikal-T1 SoC has an DW DMAC on-board to provide a Mem-to-Mem, low-speed
> > > peripherals Dev-to-Mem and Mem-to-Dev functionality. Mostly it's compatible
> > > with currently implemented in the kernel DW DMAC driver, but there are some
> > > peculiarities which must be taken into account in order to have the device
> > > fully supported.
> > > 
> > > First of all traditionally we replaced the legacy plain text-based dt-binding
> > > file with yaml-based one. Secondly Baikal-T1 DW DMA Controller provides eight
> > > channels, which alas have different max burst length configuration.
> > > In particular first two channels may burst up to 128 bits (16 bytes) at a time
> > > while the rest of them just up to 32 bits. We must make sure that the DMA
> > > subsystem doesn't set values exceeding these limitations otherwise the
> > > controller will hang up. In third currently we discovered the problem in using
> > > the DW APB SPI driver together with DW DMAC. The problem happens if there is no
> > > natively implemented multi-block LLP transfers support and the SPI-transfer
> > > length exceeds the max lock size. In this case due to asynchronous handling of
> > > Tx- and Rx- SPI transfers interrupt we might end up with Dw APB SSI Rx FIFO
> > > overflow. So if DW APB SSI (or any other DMAC service consumer) intends to use
> > > the DMAC to asynchronously execute the transfers we'd have to at least warn
> > > the user of the possible errors.
> > > 
> > > Finally there is a bug in the algorithm of the nollp flag detection.
> > > In particular even if DW DMAC parameters state the multi-block transfers
> > > support there is still HC_LLP (hardcode LLP) flag, which if set makes expected
> > > by the driver true multi-block LLP functionality unusable. This happens cause'
> > > if HC_LLP flag is set the LLP registers will be hardcoded to zero so the
> > > contiguous multi-block transfers will be only supported. We must take the
> > > flag into account when detecting the LLP support otherwise the driver just
> > > won't work correctly.
> > > 
> > > This patchset is rebased and tested on the mainline Linux kernel 5.6-rc4:
> > > commit 98d54f81e36b ("Linux 5.6-rc4").
> > 
> > Thank you for your series!
> > 
> > I'll definitely review it, but it will take time. So, I think due to late
> > submission this is material at least for v5.8.
> 
> One thing that I can tell immediately is the broken email thread in this series.
> Whenever you do a series, use `git format-patch --cover-letter --thread ...`,
> so, it will link the mail properly.

And all the dmaengine-specific patches should be sent to the dmaengine list.
I see only a few of them on the list, which confuses tools like
patchwork.

Please fix these and resubmit.

Thanks
-- 
~Vinod


* Re: [PATCH 0/5] dmaengine: dw: Take Baikal-T1 SoC DW DMAC peculiarities into account
       [not found]   ` <20200306133756.0F74C8030793@mail.baikalelectronics.ru>
@ 2020-03-06 13:47     ` Sergey Semin
  2020-03-06 14:11       ` Andy Shevchenko
       [not found]       ` <20200306141135.9C4F380307C2@mail.baikalelectronics.ru>
  0 siblings, 2 replies; 72+ messages in thread
From: Sergey Semin @ 2020-03-06 13:47 UTC
  To: Andy Shevchenko
  Cc: Alexey Malahov, Maxim Kaurkin, Pavel Parkhomenko, Ramil Zaripov,
	Ekaterina Skachko, Vadim Vlasov, Thomas Bogendoerfer,
	Paul Burton, Ralf Baechle, Viresh Kumar, Dan Williams,
	Vinod Koul, Rob Herring, Mark Rutland, dmaengine, devicetree,
	linux-kernel

On Fri, Mar 06, 2020 at 03:30:35PM +0200, Andy Shevchenko wrote:
> On Fri, Mar 06, 2020 at 03:29:12PM +0200, Andy Shevchenko wrote:
> > On Fri, Mar 06, 2020 at 04:10:29PM +0300, Sergey.Semin@baikalelectronics.ru wrote:
> > > From: Serge Semin <fancer.lancer@gmail.com>
> > > 
> > > Baikal-T1 SoC has an DW DMAC on-board to provide a Mem-to-Mem, low-speed
> > > peripherals Dev-to-Mem and Mem-to-Dev functionality. Mostly it's compatible
> > > with currently implemented in the kernel DW DMAC driver, but there are some
> > > peculiarities which must be taken into account in order to have the device
> > > fully supported.
> > > 
> > > First of all traditionally we replaced the legacy plain text-based dt-binding
> > > file with yaml-based one. Secondly Baikal-T1 DW DMA Controller provides eight
> > > channels, which alas have different max burst length configuration.
> > > In particular first two channels may burst up to 128 bits (16 bytes) at a time
> > > while the rest of them just up to 32 bits. We must make sure that the DMA
> > > subsystem doesn't set values exceeding these limitations otherwise the
> > > controller will hang up. In third currently we discovered the problem in using
> > > the DW APB SPI driver together with DW DMAC. The problem happens if there is no
> > > natively implemented multi-block LLP transfers support and the SPI-transfer
> > > length exceeds the max lock size. In this case due to asynchronous handling of
> > > Tx- and Rx- SPI transfers interrupt we might end up with Dw APB SSI Rx FIFO
> > > overflow. So if DW APB SSI (or any other DMAC service consumer) intends to use
> > > the DMAC to asynchronously execute the transfers we'd have to at least warn
> > > the user of the possible errors.
> > > 
> > > Finally there is a bug in the algorithm of the nollp flag detection.
> > > In particular even if DW DMAC parameters state the multi-block transfers
> > > support there is still HC_LLP (hardcode LLP) flag, which if set makes expected
> > > by the driver true multi-block LLP functionality unusable. This happens cause'
> > > if HC_LLP flag is set the LLP registers will be hardcoded to zero so the
> > > contiguous multi-block transfers will be only supported. We must take the
> > > flag into account when detecting the LLP support otherwise the driver just
> > > won't work correctly.
> > > 
> > > This patchset is rebased and tested on the mainline Linux kernel 5.6-rc4:
> > > commit 98d54f81e36b ("Linux 5.6-rc4").
> > 
> > Thank you for your series!
> > 
> > I'll definitely review it, but it will take time. So, I think due to late
> > submission this is material at least for v5.8.
> 

Hello Andy,
Thanks for the quick response. Looking forward to getting the patches
reviewed and moving on to the next patchset, which I'll send after this one. It
concerns the DW APB SSI driver, which uses the changes introduced here. So the
sooner we finish with this patchset the better. Although I understand
that it may take some time. I've just sent over 12 patchsets, which have a lot
of fixups and new drivers.)

> One thing that I can tell immediately is the broken email thread in this series.
> Whenever you do a series, use `git format-patch --cover-letter --thread ...`,
> so, it will link the mail properly.
> 

I've got thread=true in my gitconfig file, so each email should have
the proper References and In-Reply-To headers pointing to the cover letter (I
see it in the log). The problem popped up in a different place. For some reason
the automatic Cc/To list extraction command didn't do the job right, so we ended
up with the mailing lists missing from Cc in this patchset. The command looks
like this:

git send-email --cc-cmd "scripts/get_maintainer.pl --separator , --nokeywords --nogit --nogit-fallback --norolestats --nom" \
                   --to-cmd "scripts/get_maintainer.pl --separator , --nokeywords --nogit --nogit-fallback --norolestats --nol" \
                   --from "Serge Semin <Sergey.Semin at baikalelectronics.ru>" \
                   --smtp-server-option="-abaikal" --cover-letter -5

Regards,
-Sergey

> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


* Re: [PATCH 0/5] dmaengine: dw: Take Baikal-T1 SoC DW DMAC peculiarities into account
  2020-03-06 13:47     ` Sergey Semin
@ 2020-03-06 14:11       ` Andy Shevchenko
       [not found]       ` <20200306141135.9C4F380307C2@mail.baikalelectronics.ru>
  1 sibling, 0 replies; 72+ messages in thread
From: Andy Shevchenko @ 2020-03-06 14:11 UTC
  To: Sergey Semin
  Cc: Alexey Malahov, Maxim Kaurkin, Pavel Parkhomenko, Ramil Zaripov,
	Ekaterina Skachko, Vadim Vlasov, Thomas Bogendoerfer,
	Paul Burton, Ralf Baechle, Viresh Kumar, Dan Williams,
	Vinod Koul, Rob Herring, Mark Rutland, dmaengine, devicetree,
	linux-kernel

On Fri, Mar 06, 2020 at 04:47:20PM +0300, Sergey Semin wrote:
> On Fri, Mar 06, 2020 at 03:30:35PM +0200, Andy Shevchenko wrote:
> > On Fri, Mar 06, 2020 at 03:29:12PM +0200, Andy Shevchenko wrote:
> > > On Fri, Mar 06, 2020 at 04:10:29PM +0300, Sergey.Semin@baikalelectronics.ru wrote:
> > > > From: Serge Semin <fancer.lancer@gmail.com>
> > > > 
> > > > Baikal-T1 SoC has an DW DMAC on-board to provide a Mem-to-Mem, low-speed
> > > > peripherals Dev-to-Mem and Mem-to-Dev functionality. Mostly it's compatible
> > > > with currently implemented in the kernel DW DMAC driver, but there are some
> > > > peculiarities which must be taken into account in order to have the device
> > > > fully supported.
> > > > 
> > > > First of all traditionally we replaced the legacy plain text-based dt-binding
> > > > file with yaml-based one. Secondly Baikal-T1 DW DMA Controller provides eight
> > > > channels, which alas have different max burst length configuration.
> > > > In particular first two channels may burst up to 128 bits (16 bytes) at a time
> > > > while the rest of them just up to 32 bits. We must make sure that the DMA
> > > > subsystem doesn't set values exceeding these limitations otherwise the
> > > > controller will hang up. In third currently we discovered the problem in using
> > > > the DW APB SPI driver together with DW DMAC. The problem happens if there is no
> > > > natively implemented multi-block LLP transfers support and the SPI-transfer
> > > > length exceeds the max lock size. In this case due to asynchronous handling of
> > > > Tx- and Rx- SPI transfers interrupt we might end up with Dw APB SSI Rx FIFO
> > > > overflow. So if DW APB SSI (or any other DMAC service consumer) intends to use
> > > > the DMAC to asynchronously execute the transfers we'd have to at least warn
> > > > the user of the possible errors.
> > > > 
> > > > Finally there is a bug in the algorithm of the nollp flag detection.
> > > > In particular even if DW DMAC parameters state the multi-block transfers
> > > > support there is still HC_LLP (hardcode LLP) flag, which if set makes expected
> > > > by the driver true multi-block LLP functionality unusable. This happens cause'
> > > > if HC_LLP flag is set the LLP registers will be hardcoded to zero so the
> > > > contiguous multi-block transfers will be only supported. We must take the
> > > > flag into account when detecting the LLP support otherwise the driver just
> > > > won't work correctly.
> > > > 
> > > > This patchset is rebased and tested on the mainline Linux kernel 5.6-rc4:
> > > > commit 98d54f81e36b ("Linux 5.6-rc4").
> > > 
> > > Thank you for your series!
> > > 
> > > I'll definitely review it, but it will take time. So, I think due to late
> > > submission this is material at least for v5.8.
> > 
> 
> Hello Andy,
> Thanks for the quick response. Looking forward to get the patches
> reviewed and move on with the next patchset I'll send after this. It concerns
> DW APB SSI driver, which uses the changes introduced by this one.

> So the
> sooner we finished with this patchset the better.

Everybody will win, but the review will take as long as it takes. And for sure
it will miss the v5.7 release cycle, because too many patch sets were sent at
once and, according to the schedule, we are almost at v5.6-rc5.

> Although I understand
> that it may take some time. I've just sent over 12 patchset, which have a lot
> of fixups and new drivers.)
> 
> > One thing that I can tell immediately is the broken email thread in this series.
> > Whenever you do a series, use `git format-patch --cover-letter --thread ...`,
> > so, it will link the mail properly.
> > 
> 
> I've got thread=true in my gitconfig file, so each email should have
> the proper reference and in-reply-to to the cover-letter (I see it from
> the log). The problem popped up from a different place. For some reason the
> automatic CC/To list extraction command didn't do the job right, so we ended
> up with lacking of mailing lists in Cc's in this patchset. The command look like
> this:
> 
> git send-email --cc-cmd "scripts/get_maintainer.pl --separator , --nokeywords --nogit --nogit-fallback --norolestats --nom" \
>                    --to-cmd "scripts/get_maintainer.pl --separator , --nokeywords --nogit --nogit-fallback --norolestats --nol" \
>                    --from "Serge Semin <Sergey.Semin at baikalelectronics.ru>" \
>                    --smtp-server-option="-abaikal" --cover-letter -5

I'm talking about whatever breaks your Message-Id/References headers
between the cover letter and the rest of the series. It might be because of
missing patches in the chain.

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH 0/5] dmaengine: dw: Take Baikal-T1 SoC DW DMAC peculiarities into account
       [not found]     ` <20200306135050.40094803087C@mail.baikalelectronics.ru>
@ 2020-03-09 21:45       ` Sergey Semin
  0 siblings, 0 replies; 72+ messages in thread
From: Sergey Semin @ 2020-03-09 21:45 UTC
  To: Vinod Koul
  Cc: Andy Shevchenko, Alexey Malahov, Maxim Kaurkin,
	Pavel Parkhomenko, Ramil Zaripov, Ekaterina Skachko,
	Vadim Vlasov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Viresh Kumar, Dan Williams, Rob Herring, Mark Rutland, dmaengine,
	devicetree, linux-kernel

On Fri, Mar 06, 2020 at 07:13:12PM +0530, Vinod Koul wrote:
> On 06-03-20, 15:30, Andy Shevchenko wrote:
> > On Fri, Mar 06, 2020 at 03:29:12PM +0200, Andy Shevchenko wrote:
> > > On Fri, Mar 06, 2020 at 04:10:29PM +0300, Sergey.Semin@baikalelectronics.ru wrote:
> > > > From: Serge Semin <fancer.lancer@gmail.com>
> > > > 
> > > > Baikal-T1 SoC has an DW DMAC on-board to provide a Mem-to-Mem, low-speed
> > > > peripherals Dev-to-Mem and Mem-to-Dev functionality. Mostly it's compatible
> > > > with currently implemented in the kernel DW DMAC driver, but there are some
> > > > peculiarities which must be taken into account in order to have the device
> > > > fully supported.
> > > > 
> > > > First of all traditionally we replaced the legacy plain text-based dt-binding
> > > > file with yaml-based one. Secondly Baikal-T1 DW DMA Controller provides eight
> > > > channels, which alas have different max burst length configuration.
> > > > In particular first two channels may burst up to 128 bits (16 bytes) at a time
> > > > while the rest of them just up to 32 bits. We must make sure that the DMA
> > > > subsystem doesn't set values exceeding these limitations otherwise the
> > > > controller will hang up. In third currently we discovered the problem in using
> > > > the DW APB SPI driver together with DW DMAC. The problem happens if there is no
> > > > natively implemented multi-block LLP transfers support and the SPI-transfer
> > > > length exceeds the max lock size. In this case due to asynchronous handling of
> > > > Tx- and Rx- SPI transfers interrupt we might end up with Dw APB SSI Rx FIFO
> > > > overflow. So if DW APB SSI (or any other DMAC service consumer) intends to use
> > > > the DMAC to asynchronously execute the transfers we'd have to at least warn
> > > > the user of the possible errors.
> > > > 
> > > > Finally there is a bug in the algorithm of the nollp flag detection.
> > > > In particular even if DW DMAC parameters state the multi-block transfers
> > > > support there is still HC_LLP (hardcode LLP) flag, which if set makes expected
> > > > by the driver true multi-block LLP functionality unusable. This happens cause'
> > > > if HC_LLP flag is set the LLP registers will be hardcoded to zero so the
> > > > contiguous multi-block transfers will be only supported. We must take the
> > > > flag into account when detecting the LLP support otherwise the driver just
> > > > won't work correctly.
> > > > 
> > > > This patchset is rebased and tested on the mainline Linux kernel 5.6-rc4:
> > > > commit 98d54f81e36b ("Linux 5.6-rc4").
> > > 
> > > Thank you for your series!
> > > 
> > > I'll definitely review it, but it will take time. So, I think due to late
> > > submission this is material at least for v5.8.
> > 
> > One thing that I can tell immediately is the broken email thread in this series.
> > Whenever you do a series, use `git format-patch --cover-letter --thread ...`,
> > so, it will link the mail properly.
> 
> And all the dmaengine specific patches should be sent to dmaengine list,
> I see only few of them on the list.. that confuses tools like
> patchwork..
> 
> Pls fix these and resubmit
> 

Folks, I've found out what was wrong with the email threading. As I
said, my gitconfig had the following settings: chainreplyto = false,
thread = true. So the emails should have been threaded as required.
And they were on my email client side, so I didn't see the problem
you've got.

It wasn't the first time I submitted patches to the kernel, but it was
the first time I used the corporate Exchange server for it. It turned out
the damn server changed the Message-Id field of the email headers while
transmitting the messages. If you take a look at the non-cover-letter
emails you've got from me, you'll see that their In-Reply-To and
References fields still refer to the original cover-letter Message-Id,
which no longer matches the rewritten one. After our system administrator
fixes that problem and we come up with solutions for the issues you've
found in the patches, I'll definitely resend the patchset. This time I'll
also make sure the mailing lists are included in Cc. Sorry for the
inconvenience.
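
To illustrate the breakage: a reply is attached to its parent only when
its In-Reply-To value equals the parent's Message-Id. A minimal C sketch
of that check follows. The struct, the function name and the example Ids
are hypothetical and only model the headers discussed here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Only the two header fields relevant to threading are modelled here. */
struct mail_hdr {
	const char *message_id;
	const char *in_reply_to;	/* NULL for the root of a thread */
};

/* A child is threaded under a parent only if the Ids match exactly. */
static bool is_reply_to(const struct mail_hdr *child,
			const struct mail_hdr *parent)
{
	return child->in_reply_to != NULL &&
	       strcmp(child->in_reply_to, parent->message_id) == 0;
}
```

If a server rewrites the cover-letter Message-Id in transit, the patches
still carry the old Id in In-Reply-To, so the comparison fails and the
series shows up as separate threads on the receiving side.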

Regards,
-Sergey

> Thanks
> -- 
> ~Vinod

* Re: [PATCH 0/5] dmaengine: dw: Take Baikal-T1 SoC DW DMAC peculiarities into account
       [not found]       ` <20200306141135.9C4F380307C2@mail.baikalelectronics.ru>
@ 2020-03-09 22:08         ` Sergey Semin
  0 siblings, 0 replies; 72+ messages in thread
From: Sergey Semin @ 2020-03-09 22:08 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Alexey Malahov, Maxim Kaurkin, Pavel Parkhomenko, Ramil Zaripov,
	Ekaterina Skachko, Vadim Vlasov, Thomas Bogendoerfer,
	Paul Burton, Ralf Baechle, Viresh Kumar, Dan Williams,
	Vinod Koul, Rob Herring, Mark Rutland, dmaengine, devicetree,
	linux-kernel

On Fri, Mar 06, 2020 at 04:11:28PM +0200, Andy Shevchenko wrote:
> On Fri, Mar 06, 2020 at 04:47:20PM +0300, Sergey Semin wrote:
> > On Fri, Mar 06, 2020 at 03:30:35PM +0200, Andy Shevchenko wrote:
> > > On Fri, Mar 06, 2020 at 03:29:12PM +0200, Andy Shevchenko wrote:
> > > > On Fri, Mar 06, 2020 at 04:10:29PM +0300, Sergey.Semin@baikalelectronics.ru wrote:
> > > > > From: Serge Semin <fancer.lancer@gmail.com>
> > > > > 
> > > > > Baikal-T1 SoC has an DW DMAC on-board to provide a Mem-to-Mem, low-speed
> > > > > peripherals Dev-to-Mem and Mem-to-Dev functionality. Mostly it's compatible
> > > > > with currently implemented in the kernel DW DMAC driver, but there are some
> > > > > peculiarities which must be taken into account in order to have the device
> > > > > fully supported.
> > > > > 
> > > > > First of all traditionally we replaced the legacy plain text-based dt-binding
> > > > > file with yaml-based one. Secondly Baikal-T1 DW DMA Controller provides eight
> > > > > channels, which alas have different max burst length configuration.
> > > > > In particular first two channels may burst up to 128 bits (16 bytes) at a time
> > > > > while the rest of them just up to 32 bits. We must make sure that the DMA
> > > > > subsystem doesn't set values exceeding these limitations otherwise the
> > > > > controller will hang up. In third currently we discovered the problem in using
> > > > > the DW APB SPI driver together with DW DMAC. The problem happens if there is no
> > > > > natively implemented multi-block LLP transfers support and the SPI-transfer
> > > > > length exceeds the max lock size. In this case due to asynchronous handling of
> > > > > Tx- and Rx- SPI transfers interrupt we might end up with Dw APB SSI Rx FIFO
> > > > > overflow. So if DW APB SSI (or any other DMAC service consumer) intends to use
> > > > > the DMAC to asynchronously execute the transfers we'd have to at least warn
> > > > > the user of the possible errors.
> > > > > 
> > > > > Finally there is a bug in the algorithm of the nollp flag detection.
> > > > > In particular even if DW DMAC parameters state the multi-block transfers
> > > > > support there is still HC_LLP (hardcode LLP) flag, which if set makes expected
> > > > > by the driver true multi-block LLP functionality unusable. This happens cause'
> > > > > if HC_LLP flag is set the LLP registers will be hardcoded to zero so the
> > > > > contiguous multi-block transfers will be only supported. We must take the
> > > > > flag into account when detecting the LLP support otherwise the driver just
> > > > > won't work correctly.
> > > > > 
> > > > > This patchset is rebased and tested on the mainline Linux kernel 5.6-rc4:
> > > > > commit 98d54f81e36b ("Linux 5.6-rc4").
> > > > 
> > > > Thank you for your series!
> > > > 
> > > > I'll definitely review it, but it will take time. So, I think due to late
> > > > submission this is material at least for v5.8.
> > > 
> > 
> > Hello Andy,
> > Thanks for the quick response. Looking forward to get the patches
> > reviewed and move on with the next patchset I'll send after this. It concerns
> > DW APB SSI driver, which uses the changes introduced by this one.
> 
> > So the
> > sooner we finished with this patchset the better.
> 
> Everybody will win, but review will take as long as it take. And for sure it
> will miss v5.7 release cycle. Because too many patch sets sent at once
> followed by schedule, we almost at v5.6-rc5.
> 

Yeah. 13 patchsets is a lot of work to review. I was just saying that
even though many patches have been sent, there are even more scheduled
for submission after them, which rely on the alterations provided by
these patches. Though the patch dependencies may change, seeing that you
have issues with some of them.)

> > Although I understand
> > that it may take some time. I've just sent over 12 patchset, which have a lot
> > of fixups and new drivers.)
> > 
> > > One thing that I can tell immediately is the broken email thread in this series.
> > > Whenever you do a series, use `git format-patch --cover-letter --thread ...`,
> > > so, it will link the mail properly.
> > > 
> > 
> > I've got thread=true in my gitconfig file, so each email should have
> > the proper reference and in-reply-to to the cover-letter (I see it from
> > the log). The problem popped up from a different place. For some reason the
> > automatic CC/To list extraction command didn't do the job right, so we ended
> > up with lacking of mailing lists in Cc's in this patchset. The command look like
> > this:
> > 
> > git send-email --cc-cmd "scripts/get_maintainer.pl --separator , --nokeywords --nogit --nogit-fallback --norolestats --nom" \
> >                    --to-cmd "scripts/get_maintainer.pl --separator , --nokeywords --nogit --nogit-fallback --norolestats --nol" \
> >                    --from "Serge Semin <Sergey.Semin at baikalelectronics.ru>" \
> >                    --smtp-server-option="-abaikal" --cover-letter -5
> 
> I'm talking about one which makes your Message-Id/Reference headers broken
> between cover letter and the rest of the series. It might be because of missed
> patches in the chain.
> 

Ok. Now I see what you meant. At first I thought there was some
misunderstanding on your side or mine, because my neomutt client didn't
show any confusion in the Ids. But after another maintainer complained
about the same problem I realized that the issue must be somewhere I
couldn't have noticed. Then I thought that the outgoing email server
could have changed the order of the sent emails. But it turned out the
problem was the Message-Id replacement performed by our corporate
Exchange server. Please see the email I've sent in reply to Vinod's
comment regarding Cc'ing the mailing lists. It describes what was really
wrong with the threading.

Regards,
-Sergey

> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 

* [PATCH v2 0/6] dmaengine: dw: Take Baikal-T1 SoC DW DMAC peculiarities into account
  2020-03-06 13:10 [PATCH 0/5] dmaengine: dw: Take Baikal-T1 SoC DW DMAC peculiarities into account Sergey.Semin
  2020-03-06 13:29 ` Andy Shevchenko
@ 2020-05-08 10:52 ` Serge Semin
  2020-05-08 10:52   ` [PATCH v2 1/6] dt-bindings: dma: dw: Convert DW DMAC to DT binding Serge Semin
                     ` (5 more replies)
  1 sibling, 6 replies; 72+ messages in thread
From: Serge Semin @ 2020-05-08 10:52 UTC (permalink / raw)
  To: Vinod Koul, Viresh Kumar
  Cc: Serge Semin, Serge Semin, Alexey Malahov, Maxim Kaurkin,
	Pavel Parkhomenko, Ramil Zaripov, Ekaterina Skachko,
	Vadim Vlasov, Alexey Kolotnikov, Thomas Bogendoerfer,
	Paul Burton, Ralf Baechle, Arnd Bergmann, Andy Shevchenko,
	Dan Williams, Rob Herring, linux-mips, dmaengine, devicetree,
	linux-kernel

Baikal-T1 SoC has a DW DMAC on-board to provide Mem-to-Mem and low-speed
peripheral Dev-to-Mem and Mem-to-Dev functionality. It's mostly compatible
with the DW DMAC driver currently implemented in the kernel, but there are
some peculiarities which must be taken into account in order to have the
device fully supported.

First of all, as usual, we replaced the legacy plain text-based dt-binding
file with a yaml-based one. Secondly, the Baikal-T1 DW DMA Controller provides
eight channels, which alas have different max burst length configurations.
In particular the first two channels may burst up to 128 bits (16 bytes) at a
time while the rest of them just up to 32 bits. We must make sure that the DMA
subsystem doesn't set values exceeding these limitations, otherwise the
controller will hang up. Thirdly, we discovered a problem in using the DW APB
SPI driver together with DW DMAC. The problem happens if there is no natively
implemented multi-block LLP transfer support and the SPI-transfer length
exceeds the max block size. In this case, due to asynchronous handling of the
Tx- and Rx- SPI transfer interrupts, we might end up with a DW APB SSI Rx FIFO
overflow. So if DW APB SSI (or any other DMAC service consumer) intends to use
the DMAC to asynchronously execute the transfers we'd have to at least warn
the user of the possible errors. Fourthly, it's worth setting the DMA device
max segment size to the max block size config specific to the DW DMA
controller. It shall help the DMA clients to create size-optimized SG-list
items for the controller. This in turn will cause fewer dw_desc allocations,
fewer LLP reinitializations, and better DMA device performance.

Finally there is a bug in the algorithm of the nollp flag detection.
In particular, even if the DW DMAC parameters state multi-block transfer
support, there is still the HC_LLP (hardcode LLP) flag, which if set makes
the true multi-block LLP functionality expected by the driver unusable. This
happens because if the HC_LLP flag is set the LLP registers will be hardcoded
to zero, so only contiguous multi-block transfers will be supported. We must
take the flag into account when detecting the LLP support, otherwise the
driver just won't work correctly.
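
The detection rule described above can be sketched as follows. This is a
minimal user-space illustration, not the driver code: the bit positions
PARAMS_MBLK_EN and PARAMS_HC_LLP and the helper name chan_nollp() are
assumptions made for the example and should be checked against the
IP-core data book.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions in the per-channel parameters register. */
#define PARAMS_MBLK_EN	11	/* multi-block transfers are enabled */
#define PARAMS_HC_LLP	13	/* LLP register is hardcoded to zero */

/*
 * Returns true if the channel can't use hardware LLP-based multi-block
 * transfers: either multi-block is disabled, or it is enabled but the
 * LLP register is hardcoded, so no linked list can be loaded.
 */
static bool chan_nollp(uint32_t params)
{
	bool mblk_en = ((params >> PARAMS_MBLK_EN) & 0x1) == 1;
	bool hc_llp = ((params >> PARAMS_HC_LLP) & 0x1) == 1;

	return !mblk_en || hc_llp;
}
```

Checking MBLK_EN alone would report LLP support for a channel that has
both bits set, which is the misdetection described in the paragraph above.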

This patchset is rebased and tested on the mainline Linux kernel 5.7-rc4:
0e698dfa2822 ("Linux 5.7-rc4")
tag: v5.7-rc4

Changelog v2:
- Rearrange SoBs.
- Move $ref to the root level of the properties. Do the same with the
  constraints in the DT binding.
- Replace "additionalProperties: false" with "unevaluatedProperties: false"
  property in the DT binding file.
- Discard default settings defined out of property enum constraint.
- Set default max-burst-len to 256 TR-WIDTH words in the DT binding.
- Discard noLLP and block_size accessors.
- Set max segment size of the DMA device structure with the DW DMA block size
  config.
- Print warning if noLLP flag is set.
- Discard max burst length accessor.
- Add comment about why hardware accelerated LLP list support depends
  on both MBLK_EN and HC_LLP configs setting.
- Use explicit bits state comparison operator in noLLP flag setting.

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Maxim Kaurkin <Maxim.Kaurkin@baikalelectronics.ru>
Cc: Pavel Parkhomenko <Pavel.Parkhomenko@baikalelectronics.ru>
Cc: Ramil Zaripov <Ramil.Zaripov@baikalelectronics.ru>
Cc: Ekaterina Skachko <Ekaterina.Skachko@baikalelectronics.ru>
Cc: Vadim Vlasov <V.Vlasov@baikalelectronics.ru>
Cc: Alexey Kolotnikov <Alexey.Kolotnikov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: dmaengine@vger.kernel.org
Cc: devicetree@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Serge Semin (6):
  dt-bindings: dma: dw: Convert DW DMAC to DT binding
  dt-bindings: dma: dw: Add max burst transaction length property
  dmaengine: dw: Set DMA device max segment size parameter
  dmaengine: dw: Print warning if multi-block is unsupported
  dmaengine: dw: Introduce max burst length hw config
  dmaengine: dw: Take HC_LLP flag into account for noLLP auto-config

 .../bindings/dma/snps,dma-spear1340.yaml      | 173 ++++++++++++++++++
 .../devicetree/bindings/dma/snps-dma.txt      |  69 -------
 drivers/dma/dw/core.c                         |  57 +++++-
 drivers/dma/dw/dw.c                           |   1 +
 drivers/dma/dw/of.c                           |   9 +
 drivers/dma/dw/regs.h                         |  21 ++-
 include/linux/platform_data/dma-dw.h          |   4 +
 7 files changed, 256 insertions(+), 78 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml
 delete mode 100644 Documentation/devicetree/bindings/dma/snps-dma.txt

-- 
2.25.1


* [PATCH v2 1/6] dt-bindings: dma: dw: Convert DW DMAC to DT binding
  2020-05-08 10:52 ` [PATCH v2 0/6] " Serge Semin
@ 2020-05-08 10:52   ` Serge Semin
  2020-05-18 17:50     ` Rob Herring
  2020-05-08 10:53   ` [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property Serge Semin
                     ` (4 subsequent siblings)
  5 siblings, 1 reply; 72+ messages in thread
From: Serge Semin @ 2020-05-08 10:52 UTC (permalink / raw)
  To: Vinod Koul, Viresh Kumar, Rob Herring, Andy Shevchenko
  Cc: Serge Semin, Serge Semin, Alexey Malahov, Thomas Bogendoerfer,
	Paul Burton, Ralf Baechle, Arnd Bergmann, Dan Williams,
	linux-mips, dmaengine, devicetree, linux-kernel

Modern device tree bindings are supposed to be created as YAML-files
in accordance with dt-schema. This commit replaces the Synopsys
Designware DMA controller legacy bare text bindings with a YAML file.
The only required properties are "compatible", "reg", "#dma-cells" and
"interrupts", which will be used by the driver to correctly find the
controller memory region and handle its events. The rest of the properties
are optional, since if either "dma-channels" or "dma-masters" isn't
specified, the driver will attempt to auto-detect the IP core
configuration.

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: linux-mips@vger.kernel.org

---

Changelog v2:
- Rearrange SoBs.
- Move $ref to the root level of the properties. Do the same with the
  constraints.
- Discard default settings defined out of the property enum constraint.
- Replace "additionalProperties: false" with "unevaluatedProperties: false"
  property.
- Remove a label definition from the binding example.
---
 .../bindings/dma/snps,dma-spear1340.yaml      | 161 ++++++++++++++++++
 .../devicetree/bindings/dma/snps-dma.txt      |  69 --------
 2 files changed, 161 insertions(+), 69 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml
 delete mode 100644 Documentation/devicetree/bindings/dma/snps-dma.txt

diff --git a/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml b/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml
new file mode 100644
index 000000000000..e7611840a7cf
--- /dev/null
+++ b/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml
@@ -0,0 +1,161 @@
+# SPDX-License-Identifier: GPL-2.0-only
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/dma/snps,dma-spear1340.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Synopsys Designware DMA Controller
+
+maintainers:
+  - Viresh Kumar <vireshk@kernel.org>
+  - Andy Shevchenko <andriy.shevchenko@linux.intel.com>
+
+allOf:
+  - $ref: "dma-controller.yaml#"
+
+properties:
+  compatible:
+    const: snps,dma-spear1340
+
+  "#dma-cells":
+    const: 3
+    description: |
+      First cell is a phandle pointing to the DMA controller. Second one is
+      the DMA request line number. Third cell is the memory master identifier
+      for transfers on dynamically allocated channel. Fourth cell is the
+      peripheral master identifier for transfers on an allocated channel.
+
+  reg:
+    maxItems: 1
+
+  interrupts:
+    maxItems: 1
+
+  clocks:
+    maxItems: 1
+
+  clock-names:
+    description: AHB interface reference clock.
+    const: hclk
+
+  dma-channels:
+    description: |
+      Number of DMA channels supported by the controller. In case if
+      not specified the driver will try to auto-detect this and
+      the rest of the optional parameters.
+    minimum: 1
+    maximum: 8
+
+  dma-requests:
+    minimum: 1
+    maximum: 16
+
+  dma-masters:
+    $ref: /schemas/types.yaml#definitions/uint32
+    description: |
+      Number of DMA masters supported by the controller. In case if
+      not specified the driver will try to auto-detect this and
+      the rest of the optional parameters.
+    minimum: 1
+    maximum: 4
+
+  chan_allocation_order:
+    $ref: /schemas/types.yaml#definitions/uint32
+    description: |
+      DMA channels allocation order specifier. Zero means ascending order
+      (first free allocated), while one - descending (last free allocated).
+    default: 0
+    enum: [0, 1]
+
+  chan_priority:
+    $ref: /schemas/types.yaml#definitions/uint32
+    description: |
+      DMA channels priority order. Zero means ascending channels priority
+      so the very first channel has the highest priority. While 1 means
+      descending priority (the last channel has the highest priority).
+    default: 0
+    enum: [0, 1]
+
+  block_size:
+    $ref: /schemas/types.yaml#definitions/uint32
+    description: Maximum block size supported by the DMA controller.
+    enum: [3, 7, 15, 31, 63, 127, 255, 511, 1023, 2047, 4095]
+
+  data-width:
+    $ref: /schemas/types.yaml#/definitions/uint32-array
+    description: Data bus width per each DMA master in bytes.
+    items:
+      maxItems: 4
+      items:
+        enum: [4, 8, 16, 32]
+
+  data_width:
+    $ref: /schemas/types.yaml#/definitions/uint32-array
+    deprecated: true
+    description: |
+      Data bus width per each DMA master in (2^n * 8) bits. This property is
+      deprecated. Its usage is discouraged in favor of data-width one. Moreover
+      the property incorrectly permits to define data-bus width of 8 and 16
+      bits, which is impossible in accordance with DW DMAC IP-core data book.
+    items:
+      maxItems: 4
+      items:
+        enum:
+          - 0 # 8 bits
+          - 1 # 16 bits
+          - 2 # 32 bits
+          - 3 # 64 bits
+          - 4 # 128 bits
+          - 5 # 256 bits
+        default: 0
+
+  multi-block:
+    $ref: /schemas/types.yaml#/definitions/uint32-array
+    description: |
+      LLP-based multi-block transfer supported by hardware per
+      each DMA channel.
+    items:
+      maxItems: 8
+      items:
+        enum: [0, 1]
+        default: 1
+
+  snps,dma-protection-control:
+    $ref: /schemas/types.yaml#definitions/uint32
+    description: |
+      Bits one-to-one passed to the AHB HPROT[3:1] bus. Each bit setting
+      indicates the following features: bit 0 - privileged mode,
+      bit 1 - DMA is bufferable, bit 2 - DMA is cacheable.
+    default: 0
+    minimum: 0
+    maximum: 7
+
+unevaluatedProperties: false
+
+required:
+  - compatible
+  - "#dma-cells"
+  - reg
+  - interrupts
+
+examples:
+  - |
+    dma-controller@fc000000 {
+      compatible = "snps,dma-spear1340";
+      reg = <0xfc000000 0x1000>;
+      interrupt-parent = <&vic1>;
+      interrupts = <12>;
+
+      dma-channels = <8>;
+      dma-requests = <16>;
+      dma-masters = <4>;
+      #dma-cells = <3>;
+
+      chan_allocation_order = <1>;
+      chan_priority = <1>;
+      block_size = <0xfff>;
+      data-width = <8 8>;
+      multi-block = <0 0 0 0 0 0 0 0>;
+      snps,max-burst-len = <16 16 4 4 4 4 4 4>;
+    };
+...
diff --git a/Documentation/devicetree/bindings/dma/snps-dma.txt b/Documentation/devicetree/bindings/dma/snps-dma.txt
deleted file mode 100644
index 0bedceed1963..000000000000
--- a/Documentation/devicetree/bindings/dma/snps-dma.txt
+++ /dev/null
@@ -1,69 +0,0 @@
-* Synopsys Designware DMA Controller
-
-Required properties:
-- compatible: "snps,dma-spear1340"
-- reg: Address range of the DMAC registers
-- interrupt: Should contain the DMAC interrupt number
-- dma-channels: Number of channels supported by hardware
-- dma-requests: Number of DMA request lines supported, up to 16
-- dma-masters: Number of AHB masters supported by the controller
-- #dma-cells: must be <3>
-- chan_allocation_order: order of allocation of channel, 0 (default): ascending,
-  1: descending
-- chan_priority: priority of channels. 0 (default): increase from chan 0->n, 1:
-  increase from chan n->0
-- block_size: Maximum block size supported by the controller
-- data-width: Maximum data width supported by hardware per AHB master
-  (in bytes, power of 2)
-
-
-Deprecated properties:
-- data_width: Maximum data width supported by hardware per AHB master
-  (0 - 8bits, 1 - 16bits, ..., 5 - 256bits)
-
-
-Optional properties:
-- multi-block: Multi block transfers supported by hardware. Array property with
-  one cell per channel. 0: not supported, 1 (default): supported.
-- snps,dma-protection-control: AHB HPROT[3:1] protection setting.
-  The default value is 0 (for non-cacheable, non-buffered,
-  unprivileged data access).
-  Refer to include/dt-bindings/dma/dw-dmac.h for possible values.
-
-Example:
-
-	dmahost: dma@fc000000 {
-		compatible = "snps,dma-spear1340";
-		reg = <0xfc000000 0x1000>;
-		interrupt-parent = <&vic1>;
-		interrupts = <12>;
-
-		dma-channels = <8>;
-		dma-requests = <16>;
-		dma-masters = <2>;
-		#dma-cells = <3>;
-		chan_allocation_order = <1>;
-		chan_priority = <1>;
-		block_size = <0xfff>;
-		data-width = <8 8>;
-	};
-
-DMA clients connected to the Designware DMA controller must use the format
-described in the dma.txt file, using a four-cell specifier for each channel.
-The four cells in order are:
-
-1. A phandle pointing to the DMA controller
-2. The DMA request line number
-3. Memory master for transfers on allocated channel
-4. Peripheral master for transfers on allocated channel
-
-Example:
-	
-	serial@e0000000 {
-		compatible = "arm,pl011", "arm,primecell";
-		reg = <0xe0000000 0x1000>;
-		interrupts = <0 35 0x4>;
-		dmas = <&dmahost 12 0 1>,
-			<&dmahost 13 1 0>;
-		dma-names = "rx", "rx";
-	};
-- 
2.25.1
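
For reference, the relation between the deprecated data_width encoding
and the data-width property described in the binding above can be
sketched in C. The helper names are hypothetical; the value ranges are
taken from the enum constraints in the schema.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Convert a deprecated data_width cell, which encodes n for a bus of
 * (2^n * 8) bits, into the data-width value expressed in bytes.
 */
static uint32_t data_width_to_bytes(uint32_t n)
{
	assert(n <= 5);		/* the schema enum permits 0..5 only */

	return 1u << n;		/* (2^n * 8) bits / 8 = 2^n bytes */
}

/* The data-width property only permits 4, 8, 16 or 32 bytes. */
static bool data_width_bytes_valid(uint32_t bytes)
{
	return bytes == 4 || bytes == 8 || bytes == 16 || bytes == 32;
}
```

Encodings 0 and 1 map to 1 and 2 bytes, which data-width rejects. This
matches the remark in the binding that 8 and 16 bit buses are impossible
according to the data book.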


* [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property
  2020-05-08 10:52 ` [PATCH v2 0/6] " Serge Semin
  2020-05-08 10:52   ` [PATCH v2 1/6] dt-bindings: dma: dw: Convert DW DMAC to DT binding Serge Semin
@ 2020-05-08 10:53   ` Serge Semin
  2020-05-08 11:12     ` Andy Shevchenko
  2020-05-08 10:53   ` [PATCH v2 3/6] dmaengine: dw: Set DMA device max segment size parameter Serge Semin
                     ` (3 subsequent siblings)
  5 siblings, 1 reply; 72+ messages in thread
From: Serge Semin @ 2020-05-08 10:53 UTC (permalink / raw)
  To: Vinod Koul, Viresh Kumar, Rob Herring
  Cc: Serge Semin, Serge Semin, Alexey Malahov, Thomas Bogendoerfer,
	Paul Burton, Ralf Baechle, Arnd Bergmann, Andy Shevchenko,
	Dan Williams, linux-mips, dmaengine, devicetree, linux-kernel

This array property is used to indicate the maximum burst transaction
length supported by each DMA channel.
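
As an illustration of how a consumer of this property might apply it,
the sketch below clamps a requested burst length to the per-channel
limit. The table reuses the values from the binding example
(<16 16 4 4 4 4 4 4>). The helper names are hypothetical and this is not
the driver implementation.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CHANNELS	8

/* Per-channel limits in TR_WIDTH words, as in the binding example. */
static const uint32_t max_burst_len[NR_CHANNELS] = {
	16, 16, 4, 4, 4, 4, 4, 4
};

/* Clamp a client-requested burst length to the limit of the channel. */
static uint32_t clamp_burst(unsigned int chan, uint32_t requested)
{
	uint32_t limit;

	assert(chan < NR_CHANNELS);
	limit = max_burst_len[chan];

	return requested > limit ? limit : requested;
}

/* Burst size in bytes for a given transfer width in bytes. */
static uint32_t burst_bytes(uint32_t burst_len, uint32_t tr_width_bytes)
{
	return burst_len * tr_width_bytes;
}
```

With a transfer width of one byte, a request of 32 words is cut to 16
bytes on channel 0 and to 4 bytes on channel 2.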

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Viresh Kumar <vireshk@kernel.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: linux-mips@vger.kernel.org

---

Changelog v2:
- Rearrange SoBs.
- Move $ref to the root level of the properties. Do the same with the
  constraints.
- Set default max-burst-len to 256 TR-WIDTH words.
---
 .../devicetree/bindings/dma/snps,dma-spear1340.yaml  | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml b/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml
index e7611840a7cf..7df4f9ad418a 100644
--- a/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml
+++ b/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml
@@ -120,6 +120,18 @@ properties:
         enum: [0, 1]
         default: 1
 
+  snps,max-burst-len:
+    $ref: /schemas/types.yaml#/definitions/uint32-array
+    description: |
+      Maximum length of burst transactions supported by hardware.
+      It's an array property with one cell per channel in units of
+      CTLx register SRC_TR_WIDTH/DST_TR_WIDTH (data-width) field.
+    items:
+      maxItems: 8
+      items:
+        enum: [4, 8, 16, 32, 64, 128, 256]
+        default: 256
+
   snps,dma-protection-control:
     $ref: /schemas/types.yaml#definitions/uint32
     description: |
-- 
2.25.1


* [PATCH v2 3/6] dmaengine: dw: Set DMA device max segment size parameter
  2020-05-08 10:52 ` [PATCH v2 0/6] " Serge Semin
  2020-05-08 10:52   ` [PATCH v2 1/6] dt-bindings: dma: dw: Convert DW DMAC to DT binding Serge Semin
  2020-05-08 10:53   ` [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property Serge Semin
@ 2020-05-08 10:53   ` Serge Semin
  2020-05-08 11:21     ` Andy Shevchenko
  2020-05-08 10:53   ` [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported Serge Semin
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 72+ messages in thread
From: Serge Semin @ 2020-05-08 10:53 UTC (permalink / raw)
  To: Vinod Koul, Viresh Kumar, Andy Shevchenko, Dan Williams
  Cc: Serge Semin, Serge Semin, Alexey Malahov, Thomas Bogendoerfer,
	Paul Burton, Ralf Baechle, Arnd Bergmann, Rob Herring,
	linux-mips, devicetree, dmaengine, linux-kernel

The maximum block size DW DMAC configuration corresponds to the max segment
size DMA parameter in the DMA core subsystem notation. Let's set it to a
value specific to the probed DW DMA controller. It shall help the DMA
clients to create size-optimized SG-list items for the controller. This in
turn will cause fewer dw_desc allocations, fewer LLP reinitializations,
and better DMA device performance.
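
The selection logic can be sketched in plain C as below. It is a
user-space illustration with hypothetical helper names. For simplicity it
assumes that the segment length and the block size are expressed in the
same units.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pick the largest per-channel block size. In the patch this is the
 * value handed over to dma_set_max_seg_size().
 */
static uint32_t max_block_size(const uint32_t *block_size, size_t nr_chan)
{
	uint32_t max = 0;
	size_t i;

	for (i = 0; i < nr_chan; i++) {
		if (block_size[i] > max)
			max = block_size[i];
	}

	return max;
}

/* Number of block descriptors a channel needs for one SG-list item. */
static uint32_t blocks_needed(uint32_t seg_len, uint32_t chan_block_size)
{
	assert(chan_block_size > 0);

	return (seg_len + chan_block_size - 1) / chan_block_size;
}
```

A segment sized for the biggest channel fits into a single block there,
while a channel with a smaller block size splits it into several
descriptors, which it would have had to do anyway.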

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: devicetree@vger.kernel.org

---

Changelog v2:
- This is a new patch created in place of the dropped one:
  "dmaengine: dw: Add LLP and block size config accessors".
---
 drivers/dma/dw/core.c | 17 +++++++++++++++++
 drivers/dma/dw/regs.h | 18 ++++++++++--------
 2 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index 21cb2a58dbd2..8bcd82c64478 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -1054,6 +1054,7 @@ int do_dma_probe(struct dw_dma_chip *chip)
 	struct dw_dma *dw = chip->dw;
 	struct dw_dma_platform_data *pdata;
 	bool			autocfg = false;
+	unsigned int		block_size = 0;
 	unsigned int		dw_params;
 	unsigned int		i;
 	int			err;
@@ -1184,6 +1185,18 @@ int do_dma_probe(struct dw_dma_chip *chip)
 			dwc->block_size = pdata->block_size;
 			dwc->nollp = !pdata->multi_block[i];
 		}
+
+		/*
+		 * Find maximum block size to be set as the DMA device maximum
+		 * segment size. By doing so we'll have size optimized SG-list
+		 * items for the channels with biggest block size. This won't
+		 * be a problem for the rest of the channels, since they will
+		 * still be able to split the requests up by allocating
+		 * multiple DW DMA LLP descriptors, which they would have done
+		 * anyway.
+		 */
+		if (dwc->block_size > block_size)
+			block_size = dwc->block_size;
 	}
 
 	/* Clear all interrupts on all channels. */
@@ -1220,6 +1233,10 @@ int do_dma_probe(struct dw_dma_chip *chip)
 			     BIT(DMA_MEM_TO_MEM);
 	dw->dma.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
 
+	/* Block size corresponds to the maximum sg size */
+	dw->dma.dev->dma_parms = &dw->dma_parms;
+	dma_set_max_seg_size(dw->dma.dev, block_size);
+
 	err = dma_async_device_register(&dw->dma);
 	if (err)
 		goto err_dma_register;
diff --git a/drivers/dma/dw/regs.h b/drivers/dma/dw/regs.h
index 3fce66ecee7a..20037d64f961 100644
--- a/drivers/dma/dw/regs.h
+++ b/drivers/dma/dw/regs.h
@@ -8,6 +8,7 @@
  */
 
 #include <linux/bitops.h>
+#include <linux/device.h>
 #include <linux/interrupt.h>
 #include <linux/dmaengine.h>
 
@@ -308,16 +309,17 @@ static inline struct dw_dma_chan *to_dw_dma_chan(struct dma_chan *chan)
 }
 
 struct dw_dma {
-	struct dma_device	dma;
-	char			name[20];
-	void __iomem		*regs;
-	struct dma_pool		*desc_pool;
-	struct tasklet_struct	tasklet;
+	struct dma_device		dma;
+	struct device_dma_parameters	dma_parms;
+	char				name[20];
+	void __iomem			*regs;
+	struct dma_pool			*desc_pool;
+	struct tasklet_struct		tasklet;
 
 	/* channels */
-	struct dw_dma_chan	*chan;
-	u8			all_chan_mask;
-	u8			in_use;
+	struct dw_dma_chan		*chan;
+	u8				all_chan_mask;
+	u8				in_use;
 
 	/* Channel operations */
 	void	(*initialize_chan)(struct dw_dma_chan *dwc);
-- 
2.25.1


* [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-08 10:52 ` [PATCH v2 0/6] " Serge Semin
                     ` (2 preceding siblings ...)
  2020-05-08 10:53   ` [PATCH v2 3/6] dmaengine: dw: Set DMA device max segment size parameter Serge Semin
@ 2020-05-08 10:53   ` Serge Semin
  2020-05-08 11:26     ` Andy Shevchenko
  2020-05-08 10:53   ` [PATCH v2 5/6] dmaengine: dw: Introduce max burst length hw config Serge Semin
  2020-05-08 10:53   ` [PATCH v2 6/6] dmaengine: dw: Take HC_LLP flag into account for noLLP auto-config Serge Semin
  5 siblings, 1 reply; 72+ messages in thread
From: Serge Semin @ 2020-05-08 10:53 UTC (permalink / raw)
  To: Vinod Koul, Viresh Kumar, Andy Shevchenko, Dan Williams
  Cc: Serge Semin, Serge Semin, Alexey Malahov, Thomas Bogendoerfer,
	Paul Burton, Ralf Baechle, Arnd Bergmann, Rob Herring,
	linux-mips, devicetree, dmaengine, linux-kernel

Multi-block support provides a way to map the kernel-specific SG-table so
the DW DMA device handles it as a whole instead of handling the
SG-list items, the so-called LLP block items, one by one. If a true LLP
list isn't supported by the DW DMA engine, then soft-LLP mode is
used to load and execute each LLP-block one by one. A problem may
happen for multi-block DMA slave transfers when the slave device buffers
(for example Tx and Rx FIFOs) depend on each other and are smaller
than the block size. In this case writing data to the DMA slave Tx buffer
may cause an Rx buffer overflow if the Rx DMA channel is paused to
reinitialize the DW DMA controller with the next Rx LLP item. In particular
we've discovered this problem with the DW APB SPI device
working in conjunction with DW DMA. Since there is no comprehensive way to
fix it right now, let's at least print a warning for the first
multi-blockless DW DMAC channel found. This shall point a developer to the
possible cause of the problem should they experience a sudden data loss.
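
The failure condition can be modelled with a few lines of C. This is a
simplified sketch: it assumes a full-duplex device where every word
pushed to the Tx FIFO produces one word in the Rx FIFO. The function name
and the numbers in the usage note are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if the Rx FIFO overflows while the Rx DMA channel is
 * paused for a software block reload.
 *
 * rx_fifo_depth:      FIFO capacity in words
 * rx_fill:            words already queued when the pause starts
 * words_during_pause: words clocked in, driven by the Tx channel,
 *                     before the Rx channel resumes draining
 */
static bool rx_overflows(uint32_t rx_fifo_depth, uint32_t rx_fill,
			 uint32_t words_during_pause)
{
	uint32_t free_space;

	assert(rx_fill <= rx_fifo_depth);
	free_space = rx_fifo_depth - rx_fill;

	return words_during_pause > free_space;
}
```

For example, with a 64-word FIFO that already holds 60 words, a pause
during which 8 more words arrive loses data, whereas hardware LLP
reloading would avoid the pause in the first place.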

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: devicetree@vger.kernel.org

---

Changelog v2:
- This is a new patch created instead of the dropped one:
  "dmaengine: dw: Add LLP and block size config accessors"
---
 drivers/dma/dw/core.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index 8bcd82c64478..e4749c296fca 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -1197,6 +1197,21 @@ int do_dma_probe(struct dw_dma_chip *chip)
 		 */
 		if (dwc->block_size > block_size)
 			block_size = dwc->block_size;
+
+		/*
+		 * It might be crucial for some devices to have the hardware
+		 * accelerated multi-block transfers supported. This is
+		 * especially so if Tx and Rx DMA slave device buffers somehow
+		 * depend on each other. For instance an SPI controller may
+		 * experience Rx buffer overflow error if Tx DMA channel keeps
+		 * pushing data to the Tx FIFO, while Rx DMA channel is paused
+		 * to initialize the DW DMA controller with a next Rx LLP item.
+		 * Since there is no comprehensive way to fix it right now let's
+		 * at least print a warning that hardware LLPs reloading is
+		 * unsupported.
+		 */
+		if (dwc->nollp)
+			dev_warn_once(chip->dev, "No hardware LLP support\n");
 	}
 
 	/* Clear all interrupts on all channels. */
-- 
2.25.1


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v2 5/6] dmaengine: dw: Introduce max burst length hw config
  2020-05-08 10:52 ` [PATCH v2 0/6] " Serge Semin
                     ` (3 preceding siblings ...)
  2020-05-08 10:53   ` [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported Serge Semin
@ 2020-05-08 10:53   ` Serge Semin
  2020-05-08 11:41     ` Andy Shevchenko
  2020-05-08 10:53   ` [PATCH v2 6/6] dmaengine: dw: Take HC_LLP flag into account for noLLP auto-config Serge Semin
  5 siblings, 1 reply; 72+ messages in thread
From: Serge Semin @ 2020-05-08 10:53 UTC (permalink / raw)
  To: Vinod Koul, Viresh Kumar, Andy Shevchenko, Dan Williams
  Cc: Serge Semin, Serge Semin, Alexey Malahov, Thomas Bogendoerfer,
	Paul Burton, Ralf Baechle, Arnd Bergmann, Rob Herring,
	linux-mips, devicetree, dmaengine, linux-kernel

The IP core of the DW DMA controller may be synthesized with a different
max burst length of the transfers for each channel. According to Synopsys,
having a fixed maximum burst transaction length may provide some
performance gain. At the same time, setting up a source or destination
burst size exceeding the max burst length limitation may cause serious
problems. In our case the system just hangs up. In order to fix this,
let's introduce the max burst length platform config of the DW DMA
controller device and not let the DMA channel configuration code
exceed the burst length hardware limitation. Depending on the IP core
configuration the maximum value can vary from channel to channel.
It can be detected either at runtime from the DWC parameter registers
or from the dedicated dts property.
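
As a rough standalone sketch of the arithmetic in this patch (an
illustration, not the driver code itself): the per-channel limit is decoded
from the 3-bit field at DWC_PARAMS_MSIZE as 4 << field, and a requested
burst is clamped to that limit before being converted to the register
encoding 1 -> 0, 4 -> 1, 8 -> 2, 16 -> 3. The helper names below and the
portable fls32() stand-in for the kernel's fls() are assumptions made for
the example.

```c
#include <stdint.h>

#define DWC_PARAMS_MSIZE	16	/* bit offset taken from the patch */

/* Stand-in for the kernel's fls(): 1-based index of the top set bit. */
static unsigned int fls32(uint32_t x)
{
	unsigned int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/* Decode the channel's maximum burst length from DWC_PARAMS. */
static uint32_t decode_max_burst(uint32_t dwc_params)
{
	return 0x4u << ((dwc_params >> DWC_PARAMS_MSIZE) & 0x7);
}

/* Clamp the requested burst to the channel limit, then encode it. */
static uint32_t encode_maxburst(uint32_t maxburst, uint32_t chan_max)
{
	if (maxburst > chan_max)
		maxburst = chan_max;

	return maxburst > 1 ? fls32(maxburst) - 2 : 0;
}
```

For example, a client asking for a burst of 256 on a channel limited to 16
ends up with the same encoding as a request for 16, so the register never
holds a value beyond the hardware limit.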

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: devicetree@vger.kernel.org

---

Changelog v2:
- Rearrange SoBs.
- Discard dwc_get_maxburst() accessor. It's enough to have a clamping
  guard against exceeding the hardware max burst limitation.
---
 drivers/dma/dw/core.c                | 14 ++++++++++++++
 drivers/dma/dw/dw.c                  |  1 +
 drivers/dma/dw/of.c                  |  9 +++++++++
 drivers/dma/dw/regs.h                |  2 ++
 include/linux/platform_data/dma-dw.h |  4 ++++
 5 files changed, 30 insertions(+)

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index e4749c296fca..5b76ccc857fd 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -1053,6 +1053,7 @@ int do_dma_probe(struct dw_dma_chip *chip)
 {
 	struct dw_dma *dw = chip->dw;
 	struct dw_dma_platform_data *pdata;
+	u32			max_burst = DW_DMA_MAX_BURST;
 	bool			autocfg = false;
 	unsigned int		block_size = 0;
 	unsigned int		dw_params;
@@ -1181,9 +1182,12 @@ int do_dma_probe(struct dw_dma_chip *chip)
 				(4 << ((pdata->block_size >> 4 * i) & 0xf)) - 1;
 			dwc->nollp =
 				(dwc_params >> DWC_PARAMS_MBLK_EN & 0x1) == 0;
+			dwc->max_burst =
+				(0x4 << (dwc_params >> DWC_PARAMS_MSIZE & 0x7));
 		} else {
 			dwc->block_size = pdata->block_size;
 			dwc->nollp = !pdata->multi_block[i];
+			dwc->max_burst = pdata->max_burst[i] ?: DW_DMA_MAX_BURST;
 		}
 
 		/*
@@ -1198,6 +1202,15 @@ int do_dma_probe(struct dw_dma_chip *chip)
 		if (dwc->block_size > block_size)
 			block_size = dwc->block_size;
 
+		/*
+		 * Find the minimum of the channels' maximum burst lengths to
+		 * be set in the DMA device descriptor. This at least keeps us
+		 * on the safe side, so the DMA clients can use the value to
+		 * set their buffer thresholds up properly.
+		 */
+		if (dwc->max_burst < max_burst)
+			max_burst = dwc->max_burst;
+
 		/*
 		 * It might be crucial for some devices to have the hardware
 		 * accelerated multi-block transfers supported. Especially it
@@ -1244,6 +1257,7 @@ int do_dma_probe(struct dw_dma_chip *chip)
 	/* DMA capabilities */
 	dw->dma.src_addr_widths = DW_DMA_BUSWIDTHS;
 	dw->dma.dst_addr_widths = DW_DMA_BUSWIDTHS;
+	dw->dma.max_burst = max_burst;
 	dw->dma.directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV) |
 			     BIT(DMA_MEM_TO_MEM);
 	dw->dma.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
diff --git a/drivers/dma/dw/dw.c b/drivers/dma/dw/dw.c
index 7a085b3c1854..4d6b1ecabda4 100644
--- a/drivers/dma/dw/dw.c
+++ b/drivers/dma/dw/dw.c
@@ -86,6 +86,7 @@ static void dw_dma_encode_maxburst(struct dw_dma_chan *dwc, u32 *maxburst)
 	 * Fix burst size according to dw_dmac. We need to convert them as:
 	 * 1 -> 0, 4 -> 1, 8 -> 2, 16 -> 3.
 	 */
+	*maxburst = clamp(*maxburst, 0U, dwc->max_burst);
 	*maxburst = *maxburst > 1 ? fls(*maxburst) - 2 : 0;
 }
 
diff --git a/drivers/dma/dw/of.c b/drivers/dma/dw/of.c
index 9e27831dee32..d7323aad7cb5 100644
--- a/drivers/dma/dw/of.c
+++ b/drivers/dma/dw/of.c
@@ -98,6 +98,15 @@ struct dw_dma_platform_data *dw_dma_parse_dt(struct platform_device *pdev)
 			pdata->multi_block[tmp] = 1;
 	}
 
+	if (!of_property_read_u32_array(np, "snps,max-burst-len", mb,
+					nr_channels)) {
+		for (tmp = 0; tmp < nr_channels; tmp++)
+			pdata->max_burst[tmp] = mb[tmp];
+	} else {
+		for (tmp = 0; tmp < nr_channels; tmp++)
+			pdata->max_burst[tmp] = DW_DMA_MAX_BURST;
+	}
+
 	if (!of_property_read_u32(np, "snps,dma-protection-control", &tmp)) {
 		if (tmp > CHAN_PROTCTL_MASK)
 			return NULL;
diff --git a/drivers/dma/dw/regs.h b/drivers/dma/dw/regs.h
index 20037d64f961..f581d4809b71 100644
--- a/drivers/dma/dw/regs.h
+++ b/drivers/dma/dw/regs.h
@@ -125,6 +125,7 @@ struct dw_dma_regs {
 #define DW_PARAMS_EN		28		/* encoded parameters */
 
 /* Bitfields in DWC_PARAMS */
+#define DWC_PARAMS_MSIZE	16		/* max group transaction size */
 #define DWC_PARAMS_MBLK_EN	11		/* multi block transfer */
 
 /* bursts size */
@@ -284,6 +285,7 @@ struct dw_dma_chan {
 	/* hardware configuration */
 	unsigned int		block_size;
 	bool			nollp;
+	u32			max_burst;
 
 	/* custom slave configuration */
 	struct dw_dma_slave	dws;
diff --git a/include/linux/platform_data/dma-dw.h b/include/linux/platform_data/dma-dw.h
index f3eaf9ec00a1..13e679afc0e0 100644
--- a/include/linux/platform_data/dma-dw.h
+++ b/include/linux/platform_data/dma-dw.h
@@ -12,6 +12,7 @@
 
 #define DW_DMA_MAX_NR_MASTERS	4
 #define DW_DMA_MAX_NR_CHANNELS	8
+#define DW_DMA_MAX_BURST	256
 
 /**
  * struct dw_dma_slave - Controller-specific information about a slave
@@ -42,6 +43,8 @@ struct dw_dma_slave {
  * @data_width: Maximum data width supported by hardware per AHB master
  *		(in bytes, power of 2)
  * @multi_block: Multi block transfers supported by hardware per channel.
+ * @max_burst: Maximum value of burst transaction size supported by hardware
+ *	       per channel (in units of CTL.SRC_TR_WIDTH/CTL.DST_TR_WIDTH).
  * @protctl: Protection control signals setting per channel.
  */
 struct dw_dma_platform_data {
@@ -56,6 +59,7 @@ struct dw_dma_platform_data {
 	unsigned char	nr_masters;
 	unsigned char	data_width[DW_DMA_MAX_NR_MASTERS];
 	unsigned char	multi_block[DW_DMA_MAX_NR_CHANNELS];
+	unsigned int	max_burst[DW_DMA_MAX_NR_CHANNELS];
 #define CHAN_PROTCTL_PRIVILEGED		BIT(0)
 #define CHAN_PROTCTL_BUFFERABLE		BIT(1)
 #define CHAN_PROTCTL_CACHEABLE		BIT(2)
-- 
2.25.1


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v2 6/6] dmaengine: dw: Take HC_LLP flag into account for noLLP auto-config
  2020-05-08 10:52 ` [PATCH v2 0/6] " Serge Semin
                     ` (4 preceding siblings ...)
  2020-05-08 10:53   ` [PATCH v2 5/6] dmaengine: dw: Introduce max burst length hw config Serge Semin
@ 2020-05-08 10:53   ` Serge Semin
  2020-05-08 11:43     ` Andy Shevchenko
  5 siblings, 1 reply; 72+ messages in thread
From: Serge Semin @ 2020-05-08 10:53 UTC (permalink / raw)
  To: Vinod Koul, Viresh Kumar, Andy Shevchenko, Dan Williams
  Cc: Serge Semin, Serge Semin, Alexey Malahov, Thomas Bogendoerfer,
	Paul Burton, Ralf Baechle, Arnd Bergmann, Rob Herring,
	linux-mips, devicetree, dmaengine, linux-kernel

Full multi-block transfer functionality is enabled in the DW DMA
controller only if CHx_MULTI_BLK_EN is set. But LLP-based transfers
can be executed only if the "hardcode channel x LLP register" feature
isn't enabled; it can be switched on at IP core synthesis for
optimization. If it's enabled then the LLP register is hardcoded to
zero, so block chaining based on the LLPs is unsupported.
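
The resulting condition can be sketched in isolation as follows. This is a
minimal illustration that reuses the bit offsets from this patch; the
helper name chan_has_no_hw_llp() is made up for the example and does not
exist in the driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define DWC_PARAMS_HC_LLP	13	/* LLP register hardcoded to zero */
#define DWC_PARAMS_MBLK_EN	11	/* multi-block transfers enabled */

/*
 * Hardware LLP chaining is unavailable if multi-block support is
 * disabled or if the LLP register is hardcoded to zero.
 */
static bool chan_has_no_hw_llp(uint32_t dwc_params)
{
	return ((dwc_params >> DWC_PARAMS_MBLK_EN) & 0x1) == 0 ||
	       ((dwc_params >> DWC_PARAMS_HC_LLP) & 0x1) == 1;
}
```

Only a channel with MBLK_EN set and HC_LLP clear is treated as capable of
hardware LLP reloading; every other combination falls back to soft-LLP.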

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: devicetree@vger.kernel.org

---

Changelog v2:
- Rearrange SoBs.
- Add comment about why hardware accelerated LLP list support depends
  on both MBLK_EN and HC_LLP configs setting.
- Use explicit bits state comparison operator.
---
 drivers/dma/dw/core.c | 11 ++++++++++-
 drivers/dma/dw/regs.h |  1 +
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index 5b76ccc857fd..3179d45df662 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -1180,8 +1180,17 @@ int do_dma_probe(struct dw_dma_chip *chip)
 			 */
 			dwc->block_size =
 				(4 << ((pdata->block_size >> 4 * i) & 0xf)) - 1;
+
+			/*
+			 * According to the DW DMA databook the true scatter-
+			 * gather LLPs aren't available if either multi-block
+			 * config is disabled (CHx_MULTI_BLK_EN == 0) or the
+			 * LLP register is hard-coded to zeros
+			 * (CHx_HC_LLP == 1).
+			 */
 			dwc->nollp =
-				(dwc_params >> DWC_PARAMS_MBLK_EN & 0x1) == 0;
+				(dwc_params >> DWC_PARAMS_MBLK_EN & 0x1) == 0 ||
+				(dwc_params >> DWC_PARAMS_HC_LLP & 0x1) == 1;
 			dwc->max_burst =
 				(0x4 << (dwc_params >> DWC_PARAMS_MSIZE & 0x7));
 		} else {
diff --git a/drivers/dma/dw/regs.h b/drivers/dma/dw/regs.h
index f581d4809b71..a8af19d0eabd 100644
--- a/drivers/dma/dw/regs.h
+++ b/drivers/dma/dw/regs.h
@@ -126,6 +126,7 @@ struct dw_dma_regs {
 
 /* Bitfields in DWC_PARAMS */
 #define DWC_PARAMS_MSIZE	16		/* max group transaction size */
+#define DWC_PARAMS_HC_LLP	13		/* set LLP register to zero */
 #define DWC_PARAMS_MBLK_EN	11		/* multi block transfer */
 
 /* bursts size */
-- 
2.25.1


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property
  2020-05-08 10:53   ` [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property Serge Semin
@ 2020-05-08 11:12     ` Andy Shevchenko
  2020-05-11 20:05       ` Serge Semin
  0 siblings, 1 reply; 72+ messages in thread
From: Andy Shevchenko @ 2020-05-08 11:12 UTC (permalink / raw)
  To: Serge Semin
  Cc: Vinod Koul, Viresh Kumar, Rob Herring, Serge Semin,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Dan Williams, linux-mips, dmaengine, devicetree,
	linux-kernel

On Fri, May 08, 2020 at 01:53:00PM +0300, Serge Semin wrote:
> This array property is used to indicate the maximum burst transaction
> length supported by each DMA channel.

> +  snps,max-burst-len:
> +    $ref: /schemas/types.yaml#/definitions/uint32-array
> +    description: |
> +      Maximum length of burst transactions supported by hardware.
> +      It's an array property with one cell per channel in units of
> +      CTLx register SRC_TR_WIDTH/DST_TR_WIDTH (data-width) field.
> +    items:
> +      maxItems: 8
> +      items:

> +        enum: [4, 8, 16, 32, 64, 128, 256]

Isn't 1 allowed?

> +        default: 256

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 3/6] dmaengine: dw: Set DMA device max segment size parameter
  2020-05-08 10:53   ` [PATCH v2 3/6] dmaengine: dw: Set DMA device max segment size parameter Serge Semin
@ 2020-05-08 11:21     ` Andy Shevchenko
  2020-05-08 18:49       ` Vineet Gupta
  2020-05-11 21:16       ` Serge Semin
  0 siblings, 2 replies; 72+ messages in thread
From: Andy Shevchenko @ 2020-05-08 11:21 UTC (permalink / raw)
  To: Serge Semin, Vineet Gupta
  Cc: Vinod Koul, Viresh Kumar, Dan Williams, Serge Semin,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Rob Herring, linux-mips, devicetree, dmaengine,
	linux-kernel

+Cc (Vineet, for information you probably know)

On Fri, May 08, 2020 at 01:53:01PM +0300, Serge Semin wrote:
> Maximum block size DW DMAC configuration corresponds to the max segment
> size DMA parameter in the DMA core subsystem notation. Lets set it with a
> value specific to the probed DW DMA controller. It shall help the DMA
> clients to create size-optimized SG-list items for the controller. This in
> turn will cause less dw_desc allocations, less LLP reinitializations,
> better DMA device performance.

Thank you for the patch.
My comments below.

...

> +		/*
> +		 * Find maximum block size to be set as the DMA device maximum
> +		 * segment size. By doing so we'll have size optimized SG-list
> +		 * items for the channels with biggest block size. This won't
> +		 * be a problem for the rest of the channels, since they will
> +		 * still be able to split the requests up by allocating
> +		 * multiple DW DMA LLP descriptors, which they would have done
> +		 * anyway.
> +		 */
> +		if (dwc->block_size > block_size)
> +			block_size = dwc->block_size;
>  	}
>  
>  	/* Clear all interrupts on all channels. */
> @@ -1220,6 +1233,10 @@ int do_dma_probe(struct dw_dma_chip *chip)
>  			     BIT(DMA_MEM_TO_MEM);
>  	dw->dma.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
>  
> +	/* Block size corresponds to the maximum sg size */
> +	dw->dma.dev->dma_parms = &dw->dma_parms;
> +	dma_set_max_seg_size(dw->dma.dev, block_size);
> +
>  	err = dma_async_device_register(&dw->dma);
>  	if (err)
>  		goto err_dma_register;

Yeah, I have something like this locally, but I didn't dare to upstream it
because there is an issue. We have this information per DMA controller, while
we actually need it on a per DMA channel basis.

The above will work only for a DMA synthesized with all channels having the
same block size. That's why the above conditional is not needed anyway.

OTOH, I never saw the DesignWare DMA to be synthesized differently (I remember
that Intel Medfield has interesting settings, but I don't remember if DMA
channels are different inside the same controller).

Vineet, do you have any information that Synopsys customers synthesized DMA
controllers with different channel characteristics inside one DMA IP?

...

>  #include <linux/bitops.h>

> +#include <linux/device.h>

Isn't it enough to supply

struct device;

?

>  #include <linux/interrupt.h>
>  #include <linux/dmaengine.h>

Also this change needs a separate patch I suppose.

...

> -	struct dma_device	dma;
> -	char			name[20];
> -	void __iomem		*regs;
> -	struct dma_pool		*desc_pool;
> -	struct tasklet_struct	tasklet;
> +	struct dma_device		dma;
> +	struct device_dma_parameters	dma_parms;
> +	char				name[20];
> +	void __iomem			*regs;
> +	struct dma_pool			*desc_pool;
> +	struct tasklet_struct		tasklet;
>  
>  	/* channels */
> -	struct dw_dma_chan	*chan;
> -	u8			all_chan_mask;
> -	u8			in_use;
> +	struct dw_dma_chan		*chan;
> +	u8				all_chan_mask;
> +	u8				in_use;

Please split formatting fixes into a separate patch.


-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-08 10:53   ` [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported Serge Semin
@ 2020-05-08 11:26     ` Andy Shevchenko
  2020-05-08 11:53       ` Mark Brown
  0 siblings, 1 reply; 72+ messages in thread
From: Andy Shevchenko @ 2020-05-08 11:26 UTC (permalink / raw)
  To: Serge Semin, Mark Brown
  Cc: Vinod Koul, Viresh Kumar, Dan Williams, Serge Semin,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Rob Herring, linux-mips, devicetree, dmaengine,
	linux-kernel

+Cc: Mark (question about SPI + DMA workflow)

On Fri, May 08, 2020 at 01:53:02PM +0300, Serge Semin wrote:
> Multi-block support provides a way to map the kernel-specific SG-table so
> that the DW DMA device handles it as a whole instead of handling the
> SG-list items, the so-called LLP block items, one by one. If a true LLP
> list isn't supported by the DW DMA engine, then soft-LLP mode is
> utilized to load and execute each LLP block one by one. A problem may
> happen for multi-block DMA slave transfers when the slave device buffers
> (for example Tx and Rx FIFOs) depend on each other and are smaller
> than the block size. In this case writing data to the DMA slave Tx buffer
> may cause an Rx buffer overflow if the Rx DMA channel is paused to
> reinitialize the DW DMA controller with the next Rx LLP item. In particular,
> we've discovered this problem with the DW APB SPI device

Mark, do we have any adjustment knobs in SPI core to cope with this?

> working in conjunction with DW DMA. Since there is no comprehensive way to
> fix it right now, let's at least print a warning for the first DW DMAC
> channel found without multi-block support. This should point a developer
> to the possible cause of the problem if they experience sudden data loss.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 5/6] dmaengine: dw: Introduce max burst length hw config
  2020-05-08 10:53   ` [PATCH v2 5/6] dmaengine: dw: Introduce max burst length hw config Serge Semin
@ 2020-05-08 11:41     ` Andy Shevchenko
  2020-05-12 14:08       ` Serge Semin
  0 siblings, 1 reply; 72+ messages in thread
From: Andy Shevchenko @ 2020-05-08 11:41 UTC (permalink / raw)
  To: Serge Semin
  Cc: Vinod Koul, Viresh Kumar, Dan Williams, Serge Semin,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Rob Herring, linux-mips, devicetree, dmaengine,
	linux-kernel

On Fri, May 08, 2020 at 01:53:03PM +0300, Serge Semin wrote:
> The IP core of the DW DMA controller may be synthesized with a different
> max burst length of the transfers for each channel. According to Synopsys,
> having a fixed maximum burst transaction length may provide some
> performance gain. At the same time, setting up a source or destination
> burst size exceeding the max burst length limitation may cause serious
> problems. In our case the system just hangs up. In order to fix this,
> let's introduce the max burst length platform config of the DW DMA
> controller device and not let the DMA channel configuration code
> exceed the burst length hardware limitation. Depending on the IP core
> configuration the maximum value can vary from channel to channel.
> It can be detected either at runtime from the DWC parameter registers
> or from the dedicated dts property.

I'm wondering in what scenario your peripheral would ask for something
which is not supported by the DMA controller?

The peripheral needs to supply a lot of configuration parameters specific to
the DMA controller in use (that's why we have struct dw_dma_slave).
So, it seems to me the feasible approach is to supply correct data in the
first place.

If you have specific channels to acquire then you probably need to provide
custom xlate / filter functions, because the above seems a bit of a hackish
workaround for the dynamic channel allocation mechanism.

But let's see what we can do better. Since the maximum is defined on the slave
device side, it probably needs to define a minimum as well, otherwise it's
possible that some hardware can't cope with underrun bursts.

Vinod, what do you think?

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 6/6] dmaengine: dw: Take HC_LLP flag into account for noLLP auto-config
  2020-05-08 10:53   ` [PATCH v2 6/6] dmaengine: dw: Take HC_LLP flag into account for noLLP auto-config Serge Semin
@ 2020-05-08 11:43     ` Andy Shevchenko
  0 siblings, 0 replies; 72+ messages in thread
From: Andy Shevchenko @ 2020-05-08 11:43 UTC (permalink / raw)
  To: Serge Semin
  Cc: Vinod Koul, Viresh Kumar, Dan Williams, Serge Semin,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Rob Herring, linux-mips, devicetree, dmaengine,
	linux-kernel

On Fri, May 08, 2020 at 01:53:04PM +0300, Serge Semin wrote:
> Full multi-block transfer functionality is enabled in the DW DMA
> controller only if CHx_MULTI_BLK_EN is set. But LLP-based transfers
> can be executed only if the "hardcode channel x LLP register" feature
> isn't enabled; it can be switched on at IP core synthesis for
> optimization. If it's enabled then the LLP register is hardcoded to
> zero, so block chaining based on the LLPs is unsupported.
> 

This one is good.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

Feel free to reassemble the series, so, Vinod can apply it independently.

> Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
> Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
> Cc: Paul Burton <paulburton@kernel.org>
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Rob Herring <robh+dt@kernel.org>
> Cc: linux-mips@vger.kernel.org
> Cc: devicetree@vger.kernel.org
> 
> ---
> 
> Changelog v2:
> - Rearrange SoBs.
> - Add comment about why hardware accelerated LLP list support depends
>   on both MBLK_EN and HC_LLP configs setting.
> - Use explicit bits state comparison operator.
> ---
>  drivers/dma/dw/core.c | 11 ++++++++++-
>  drivers/dma/dw/regs.h |  1 +
>  2 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
> index 5b76ccc857fd..3179d45df662 100644
> --- a/drivers/dma/dw/core.c
> +++ b/drivers/dma/dw/core.c
> @@ -1180,8 +1180,17 @@ int do_dma_probe(struct dw_dma_chip *chip)
>  			 */
>  			dwc->block_size =
>  				(4 << ((pdata->block_size >> 4 * i) & 0xf)) - 1;
> +
> +			/*
> +			 * According to the DW DMA databook the true scatter-
> +			 * gather LLPs aren't available if either multi-block
> +			 * config is disabled (CHx_MULTI_BLK_EN == 0) or the
> +			 * LLP register is hard-coded to zeros
> +			 * (CHx_HC_LLP == 1).
> +			 */
>  			dwc->nollp =
> -				(dwc_params >> DWC_PARAMS_MBLK_EN & 0x1) == 0;
> +				(dwc_params >> DWC_PARAMS_MBLK_EN & 0x1) == 0 ||
> +				(dwc_params >> DWC_PARAMS_HC_LLP & 0x1) == 1;
>  			dwc->max_burst =
>  				(0x4 << (dwc_params >> DWC_PARAMS_MSIZE & 0x7));
>  		} else {
> diff --git a/drivers/dma/dw/regs.h b/drivers/dma/dw/regs.h
> index f581d4809b71..a8af19d0eabd 100644
> --- a/drivers/dma/dw/regs.h
> +++ b/drivers/dma/dw/regs.h
> @@ -126,6 +126,7 @@ struct dw_dma_regs {
>  
>  /* Bitfields in DWC_PARAMS */
>  #define DWC_PARAMS_MSIZE	16		/* max group transaction size */
> +#define DWC_PARAMS_HC_LLP	13		/* set LLP register to zero */
>  #define DWC_PARAMS_MBLK_EN	11		/* multi block transfer */
>  
>  /* bursts size */
> -- 
> 2.25.1
> 

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-08 11:26     ` Andy Shevchenko
@ 2020-05-08 11:53       ` Mark Brown
  2020-05-08 19:06         ` Andy Shevchenko
  2020-05-11  2:10         ` Serge Semin
  0 siblings, 2 replies; 72+ messages in thread
From: Mark Brown @ 2020-05-08 11:53 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Serge Semin, Vinod Koul, Viresh Kumar, Dan Williams, Serge Semin,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Rob Herring, linux-mips, devicetree, dmaengine,
	linux-kernel


On Fri, May 08, 2020 at 02:26:04PM +0300, Andy Shevchenko wrote:
> On Fri, May 08, 2020 at 01:53:02PM +0300, Serge Semin wrote:

> > Multi-block support provides a way to map the kernel-specific SG-table so
> > that the DW DMA device handles it as a whole instead of handling the
> > SG-list items, the so-called LLP block items, one by one. If a true LLP
> > list isn't supported by the DW DMA engine, then soft-LLP mode is
> > utilized to load and execute each LLP block one by one. A problem may
> > happen for multi-block DMA slave transfers when the slave device buffers
> > (for example Tx and Rx FIFOs) depend on each other and are smaller
> > than the block size. In this case writing data to the DMA slave Tx buffer
> > may cause an Rx buffer overflow if the Rx DMA channel is paused to
> > reinitialize the DW DMA controller with the next Rx LLP item. In particular,
> > we've discovered this problem with the DW APB SPI device

> Mark, do we have any adjustment knobs in SPI core to cope with this?

Frankly I'm not sure I follow what the issue is - is an LLP block item
different from a SG list entry?  As far as I can tell the problem is
that the DMA controller does not support chaining transactions together
and possibly also has a limit on the transfer size?  Or possibly some
issue with the DMA controller locking the CPU out of the I/O bus for
noticeable periods?  I can't really think what we could do about that if
the issue is transfer sizes, that just seems like hardware which is
never going to work reliably.  If the issue is not being able to chain
transfers then possibly an option to linearize messages into a single
transfer as suggested to cope with PIO devices with ill considered
automated chip select handling, though at some point you have to worry
about the cost of the memcpy() vs the cost of just doing PIO.

> > working in conjunction with DW DMA. Since there is no comprehensive way to
> > fix it right now, let's at least print a warning for the first DW DMAC
> > channel found without multi-block support. This should point a developer
> > to the possible cause of the problem if they experience sudden data loss.

I thought from the description of the SPI driver I just reviewed that
this hardware didn't have DMA?  Or are there separate blocks in the
hardware that have a more standard instantiation of the DesignWare SPI
controller with DMA attached?

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 3/6] dmaengine: dw: Set DMA device max segment size parameter
  2020-05-08 11:21     ` Andy Shevchenko
@ 2020-05-08 18:49       ` Vineet Gupta
  2020-05-11 21:16       ` Serge Semin
  1 sibling, 0 replies; 72+ messages in thread
From: Vineet Gupta @ 2020-05-08 18:49 UTC (permalink / raw)
  To: Andy Shevchenko, Serge Semin
  Cc: Vinod Koul, Viresh Kumar, Dan Williams, Serge Semin,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Rob Herring, linux-mips, devicetree, dmaengine,
	linux-kernel, arcml

On 5/8/20 4:21 AM, Andy Shevchenko wrote:
> Yeah, I have something like this locally, but I didn't dare to upstream it
> because there is an issue. We have this information per DMA controller, while
> we actually need it on a per DMA channel basis.
>
> The above will work only for a DMA synthesized with all channels having the
> same block size. That's why the above conditional is not needed anyway.
>
> OTOH, I never saw the DesignWare DMA to be synthesized differently (I remember
> that Intel Medfield has interesting settings, but I don't remember if DMA
> channels are different inside the same controller).
>
> Vineet, do you have any information that Synopsys customers synthesized DMA
> controllers with different channel characteristics inside one DMA IP?

The IP drivers are done by different teams, but I can try and ask around.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-08 11:53       ` Mark Brown
@ 2020-05-08 19:06         ` Andy Shevchenko
  2020-05-11  3:13           ` Serge Semin
  2020-05-11  2:10         ` Serge Semin
  1 sibling, 1 reply; 72+ messages in thread
From: Andy Shevchenko @ 2020-05-08 19:06 UTC (permalink / raw)
  To: Mark Brown
  Cc: Serge Semin, Vinod Koul, Viresh Kumar, Dan Williams, Serge Semin,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Rob Herring, linux-mips, devicetree, dmaengine,
	linux-kernel

On Fri, May 08, 2020 at 12:53:34PM +0100, Mark Brown wrote:
> On Fri, May 08, 2020 at 02:26:04PM +0300, Andy Shevchenko wrote:
> > On Fri, May 08, 2020 at 01:53:02PM +0300, Serge Semin wrote:
> 
> > > Multi-block support provides a way to map the kernel-specific SG-table so
> > > that the DW DMA device handles it as a whole instead of handling the
> > > SG-list items, the so-called LLP block items, one by one. If a true LLP
> > > list isn't supported by the DW DMA engine, then soft-LLP mode is
> > > utilized to load and execute each LLP block one by one. A problem may
> > > happen for multi-block DMA slave transfers when the slave device buffers
> > > (for example Tx and Rx FIFOs) depend on each other and are smaller
> > > than the block size. In this case writing data to the DMA slave Tx buffer
> > > may cause an Rx buffer overflow if the Rx DMA channel is paused to
> > > reinitialize the DW DMA controller with the next Rx LLP item. In particular,
> > > we've discovered this problem with the DW APB SPI device
> 
> > Mark, do we have any adjustment knobs in SPI core to cope with this?
> 
> Frankly I'm not sure I follow what the issue is - is an LLP block item
> different from a SG list entry?  As far as I can tell the problem is
> that the DMA controller does not support chaining transactions together
> and possibly also has a limit on the transfer size?  Or possibly some
> issue with the DMA controller locking the CPU out of the I/O bus for
> noticeable periods?  I can't really think what we could do about that if
> the issue is transfer sizes, that just seems like hardware which is
> never going to work reliably.  If the issue is not being able to chain
> transfers then possibly an option to linearize messages into a single
> transfer as suggested to cope with PIO devices with ill considered
> automated chip select handling, though at some point you have to worry
> about the cost of the memcpy() vs the cost of just doing PIO.

My understanding is that the programmed transfers (as separate items in the SG
list) can become desynchronized due to LLP emulation in the DMA driver. The
suggestion is probably that using only single-entry (block) SG lists will do the
trick (I guess we can configure the SPI core to change or not change CS between
them).

> > > working in conjunction with DW DMA. Since there is no comprehensive way to
> > > fix it right now lets at least print a warning for the first found
> > > multi-blockless DW DMAC channel. This shall point a developer to the
> > > possible cause of the problem if one would experience a sudden data loss.
> 
> I thought from the description of the SPI driver I just reviewed that
> this hardware didn't have DMA?  Or are there separate blocks in the
> hardware that have a more standard instantiation of the DesignWare SPI
> controller with DMA attached?

I speculate that the right words there should be 'we don't enable DMA right now
due to some issues' (see above).

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-08 11:53       ` Mark Brown
  2020-05-08 19:06         ` Andy Shevchenko
@ 2020-05-11  2:10         ` Serge Semin
  2020-05-11 11:58           ` Mark Brown
  1 sibling, 1 reply; 72+ messages in thread
From: Serge Semin @ 2020-05-11  2:10 UTC (permalink / raw)
  To: Mark Brown
  Cc: Serge Semin, Andy Shevchenko, Vinod Koul, Viresh Kumar,
	Dan Williams, Alexey Malahov, Thomas Bogendoerfer, Paul Burton,
	Ralf Baechle, Arnd Bergmann, Rob Herring, linux-mips, devicetree,
	dmaengine, linux-kernel

Hello Mark

On Fri, May 08, 2020 at 12:53:34PM +0100, Mark Brown wrote:
> On Fri, May 08, 2020 at 02:26:04PM +0300, Andy Shevchenko wrote:
> > On Fri, May 08, 2020 at 01:53:02PM +0300, Serge Semin wrote:
> 
> > > Multi-block support provides a way to map the kernel-specific SG-table so
> > > the DW DMA device would handle it as a whole instead of handling the
> > > SG-list items or so called LLP block items one by one. So if true LLP
> > > list isn't supported by the DW DMA engine, then soft-LLP mode will be
> > > utilized to load and execute each LLP-block one by one. A problem may
> > > happen for multi-block DMA slave transfers, when the slave device buffers
> > > (for example Tx and Rx FIFOs) depend on each other and have size smaller
> > > than the block size. In this case writing data to the DMA slave Tx buffer
> > > may cause the Rx buffer overflow if Rx DMA channel is paused to
> > > reinitialize the DW DMA controller with a next Rx LLP item. In particular
> > > We've discovered this problem in the framework of the DW APB SPI device
> 
> > Mark, do we have any adjustment knobs in SPI core to cope with this?
> 
> Frankly I'm not sure I follow what the issue is - is an LLP block item
> different from a SG list entry?  As far as I can tell the problem is
> that the DMA controller does not support chaining transactions together
> and possibly also has a limit on the transfer size?  Or possibly some
> issue with the DMA controller locking the CPU out of the I/O bus for
> noticable periods?  I can't really think what we could do about that if
> the issue is transfer sizes, that just seems like hardware which is
> never going to work reliably.  If the issue is not being able to chain
> transfers then possibly an option to linearize messages into a single
> transfer as suggested to cope with PIO devices with ill considered
> automated chip select handling, though at some point you have to worry
> about the cost of the memcpy() vs the cost of just doing PIO.

The problem is that our version of the DW DMA controller can't automatically walk
over the chained SG list (in the DW DMA driver the SG list is mapped into a
chain of LLP items, each of which is limited in length to the max transfer length
supported by the controller). In order to cope with such devices the DW DMA
driver manually (in the IRQ handler) reloads the next SG/LLP item in the chain
when the previous one is finished. This causes a problem in the generic DW SSI
driver because normally the Tx DMA channel finishes working before the Rx DMA
channel. So the DW DMA driver will reload the next Tx SG/LLP item and will start
the Tx transaction while the Rx DMA finish IRQ is still pending. Most of the time
this causes an Rx FIFO overrun and, obviously, data loss.

Alas, linearizing the SPI messages won't help in this case because the DW DMA
driver will split them into max-transaction-sized chunks anyway.
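
As a rough illustration of why linearization alone does not help, here is a
minimal user-space sketch (not driver code) that counts how many blocks a
transfer is cut into. The function name count_dma_blocks() and the byte-based
max_block_bytes limit are assumptions made for this example; the real DW DMA
driver derives its block limit from controller parameters.

```c
#include <stddef.h>

/*
 * Hypothetical helper: number of blocks (LLP items) needed to move len bytes
 * when a single block can carry at most max_block_bytes bytes.
 * Returns 0 for an empty transfer or an invalid limit.
 */
size_t count_dma_blocks(size_t len, size_t max_block_bytes)
{
	if (len == 0 || max_block_bytes == 0)
		return 0;

	/* Round up without risking overflow of len + max_block_bytes - 1. */
	return len / max_block_bytes + (len % max_block_bytes != 0);
}
```

With an assumed 4095-byte block limit, a single linearized 10000-byte buffer
still becomes three blocks, so the soft-LLP reload would still happen twice in
the middle of the transfer.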

> 
> > > working in conjunction with DW DMA. Since there is no comprehensive way to
> > > fix it right now lets at least print a warning for the first found
> > > multi-blockless DW DMAC channel. This shall point a developer to the
> > > possible cause of the problem if one would experience a sudden data loss.
> 
> I thought from the description of the SPI driver I just reviewed that
> this hardware didn't have DMA?  Or are there separate blocks in the
> hardware that have a more standard instantiation of the DesignWare SPI
> controller with DMA attached?

You are right. Baikal-T1 has three SPI interfaces. Two of them are normal
DW APB SSI interfaces with a 64-byte FIFO, DMA and IRQ; their registers are
mapped in a dedicated memory space with no stuff like SPI flash direct mapping.
The third one is the DW APB SSI embedded into the System Boot Controller,
with all the peculiarities and complications I've described in the
corresponding patchset. Here in this patch I am talking about the former
ones.

-Sergey

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-08 19:06         ` Andy Shevchenko
@ 2020-05-11  3:13           ` Serge Semin
  2020-05-11 14:03             ` Andy Shevchenko
  0 siblings, 1 reply; 72+ messages in thread
From: Serge Semin @ 2020-05-11  3:13 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Serge Semin, Mark Brown, Vinod Koul, Viresh Kumar, Dan Williams,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Rob Herring, linux-mips, devicetree, dmaengine,
	linux-kernel

On Fri, May 08, 2020 at 10:06:22PM +0300, Andy Shevchenko wrote:
> On Fri, May 08, 2020 at 12:53:34PM +0100, Mark Brown wrote:
> > On Fri, May 08, 2020 at 02:26:04PM +0300, Andy Shevchenko wrote:
> > > On Fri, May 08, 2020 at 01:53:02PM +0300, Serge Semin wrote:
> > 
> > > > Multi-block support provides a way to map the kernel-specific SG-table so
> > > > the DW DMA device would handle it as a whole instead of handling the
> > > > SG-list items or so called LLP block items one by one. So if true LLP
> > > > list isn't supported by the DW DMA engine, then soft-LLP mode will be
> > > > utilized to load and execute each LLP-block one by one. A problem may
> > > > happen for multi-block DMA slave transfers, when the slave device buffers
> > > > (for example Tx and Rx FIFOs) depend on each other and have size smaller
> > > > than the block size. In this case writing data to the DMA slave Tx buffer
> > > > may cause the Rx buffer overflow if Rx DMA channel is paused to
> > > > reinitialize the DW DMA controller with a next Rx LLP item. In particular
> > > > We've discovered this problem in the framework of the DW APB SPI device
> > 
> > > Mark, do we have any adjustment knobs in SPI core to cope with this?
> > 
> > Frankly I'm not sure I follow what the issue is - is an LLP block item
> > different from a SG list entry?  As far as I can tell the problem is
> > that the DMA controller does not support chaining transactions together
> > and possibly also has a limit on the transfer size?  Or possibly some
> > issue with the DMA controller locking the CPU out of the I/O bus for
> > noticable periods?  I can't really think what we could do about that if
> > the issue is transfer sizes, that just seems like hardware which is
> > never going to work reliably.  If the issue is not being able to chain
> > transfers then possibly an option to linearize messages into a single
> > transfer as suggested to cope with PIO devices with ill considered
> > automated chip select handling, though at some point you have to worry
> > about the cost of the memcpy() vs the cost of just doing PIO.
> 
> My understanding that the programmed transfers (as separate items in SG list)
> can be desynchronized due to LLP emulation in DMA driver. And suggestion
> probably is to use only single entry (block) SG lists will do the trick (I
> guess that we can configure SPI core do or do not change CS between them).

CS has nothing to do with this. The problem is purely in the LLP emulation and the
Tx channel being enabled before the Rx channel is reinitialized during the next
LLP reload. Yes, if the Tx and Rx SG/LLP lists each consist of a single item, then
there is no problem. Though it would be good to fix the issue in general instead
of setting such fatal restrictions. If we had some fence blocking one channel
until the other is reinitialized, the problem could theoretically be solved.

It could be an interdependent DMA channels functionality. If two channels are
interdependent, then the Rx channel could pause the Tx channel while it's in the
IRQ handling procedure (or at some other point... call a callback?). This !might!
fix the problem, but with no 100% guarantee of success. It will work only if the
IRQ handler is executed with small latency, so that the Tx channel is paused
before the Rx FIFO has been filled and overrun.
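
The latency dependence can be pictured with a toy user-space model (not driver
code). It assumes that, while the Rx DMA channel is not draining the FIFO, every
byte clocked out by Tx shifts exactly one byte into the Rx FIFO per tick. The
function name rx_bytes_lost() and the tick-based timing are assumptions made
for this example only.

```c
#include <stddef.h>

/*
 * Toy model: the Rx DMA channel does not drain the Rx FIFO for stall_ticks
 * byte-times while the Tx side keeps clocking data out. Every transmitted
 * byte shifts one received byte into the Rx FIFO.
 * Returns the number of received bytes dropped because the FIFO was full.
 */
size_t rx_bytes_lost(size_t fifo_depth, size_t fifo_fill, size_t stall_ticks)
{
	size_t lost = 0;
	size_t tick;

	/* A FIFO cannot hold more than its depth. */
	if (fifo_fill > fifo_depth)
		fifo_fill = fifo_depth;

	for (tick = 0; tick < stall_ticks; tick++) {
		if (fifo_fill < fifo_depth)
			fifo_fill++;	/* the byte still fits into the FIFO */
		else
			lost++;		/* FIFO full: overrun, the byte is dropped */
	}

	return lost;
}
```

In this model a 64-byte FIFO that starts empty survives a stall of up to 64
byte-times; anything longer loses data, which is why pausing Tx from the IRQ
handler only helps when the handler runs soon enough.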

Another solution could be to reinitialize the interdependent channels
synchronously. The Tx channel stops and waits until the Rx channel has finished
its business of retrieving data from the SPI Rx FIFO. Though this solution
implies that the Tx and Rx buffers of the SG/LLP items are of the same size.

That said, I don't like any of these solutions enough to spend time developing
them.

> 
> > > > working in conjunction with DW DMA. Since there is no comprehensive way to
> > > > fix it right now lets at least print a warning for the first found
> > > > multi-blockless DW DMAC channel. This shall point a developer to the
> > > > possible cause of the problem if one would experience a sudden data loss.
> > 
> > I thought from the description of the SPI driver I just reviewed that
> > this hardware didn't have DMA?  Or are there separate blocks in the
> > hardware that have a more standard instantiation of the DesignWare SPI
> > controller with DMA attached?
> 
> I speculate that the right words there should be 'we don't enable DMA right now
> due to some issues' (see above).

It's your speculation and it's kind of offensive implicitly implying I was
lying. If our System SPI controller had DMA I would have said that and would
have made it supported in the driver and probably wouldn't bother with a
dedicated driver development. Again the Baikal-T1 System Boot SPI controller
doesn't have DMA, doesn't have IRQ, is equipped with only 8 bytes FIFO, is
embedded into the Boot Controller, provides a dirmap interface to an SPI flash
and so on. Baikal-T1 has also got two more normal DW APB SSI interfaces with 64
bytes FIFO, IRQ and DMA.

-Sergey

> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-11  2:10         ` Serge Semin
@ 2020-05-11 11:58           ` Mark Brown
  2020-05-11 13:45             ` Serge Semin
  0 siblings, 1 reply; 72+ messages in thread
From: Mark Brown @ 2020-05-11 11:58 UTC (permalink / raw)
  To: Serge Semin
  Cc: Serge Semin, Andy Shevchenko, Vinod Koul, Viresh Kumar,
	Dan Williams, Alexey Malahov, Thomas Bogendoerfer, Paul Burton,
	Ralf Baechle, Arnd Bergmann, Rob Herring, linux-mips, devicetree,
	dmaengine, linux-kernel


On Mon, May 11, 2020 at 05:10:16AM +0300, Serge Semin wrote:

> Alas linearizing the SPI messages won't help in this case because the DW DMA
> driver will split it into the max transaction chunks anyway.

That sounds like you need to also impose a limit on the maximum message
size as well then, with that you should be able to handle messages up
to whatever that limit is.  There's code for that bit already, so long
as the limit is not too low it should be fine for most devices and
client drivers can see the limit so they can be updated to work with it
if needed.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-11 11:58           ` Mark Brown
@ 2020-05-11 13:45             ` Serge Semin
  2020-05-11 13:58               ` Andy Shevchenko
  2020-05-11 17:44               ` Mark Brown
  0 siblings, 2 replies; 72+ messages in thread
From: Serge Semin @ 2020-05-11 13:45 UTC (permalink / raw)
  To: Mark Brown
  Cc: Serge Semin, Andy Shevchenko, Vinod Koul, Viresh Kumar,
	Dan Williams, Alexey Malahov, Thomas Bogendoerfer, Paul Burton,
	Ralf Baechle, Arnd Bergmann, Rob Herring, linux-mips, devicetree,
	dmaengine, linux-kernel

On Mon, May 11, 2020 at 12:58:13PM +0100, Mark Brown wrote:
> On Mon, May 11, 2020 at 05:10:16AM +0300, Serge Semin wrote:
> 
> > Alas linearizing the SPI messages won't help in this case because the DW DMA
> > driver will split it into the max transaction chunks anyway.
> 
> That sounds like you need to also impose a limit on the maximum message
> size as well then, with that you should be able to handle messages up
> to whatever that limit is.  There's code for that bit already, so long
> as the limit is not too low it should be fine for most devices and
> client drivers can see the limit so they can be updated to work with it
> if needed.

Hmm, this might work. The problem will be with imposing such a limitation through
the DW APB SSI driver. In order to do this I need to know:
1) Whether multi-block LLP is supported by the DW DMA controller.
2) Maximum DW DMA transfer block size.
Then I'll be able to use this information in the can_dma() callback to enable
DMA only for the safe transfers. Did you mean something like this when you said
"There's code for that bit already"? If you meant the max_dma_len parameter,
then setting it won't work, because it just limits the size of the SG items,
not the total length of a single transfer.
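
The decision described above could look roughly like the following user-space
sketch. It is only a model of a can_dma()-style check: the function name
xfer_is_dma_safe() and its parameters are hypothetical, and how the two inputs
would actually be obtained from the DW DMA driver is exactly the open question.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Model of a can_dma()-style decision: allow DMA only when the transfer
 * cannot hit a soft-LLP reload in the middle of it.
 *
 * hw_llp          - true if the controller walks the LLP chain in hardware
 * max_block_bytes - largest single block the controller can move, in bytes
 * len             - total length of the SPI transfer, in bytes
 */
bool xfer_is_dma_safe(bool hw_llp, size_t max_block_bytes, size_t len)
{
	if (len == 0)
		return false;

	/* Hardware reloads the next item itself, so there is no reload gap. */
	if (hw_llp)
		return true;

	/* Without hardware LLP only a single-block transfer is safe. */
	return len <= max_block_bytes;
}
```

Transfers rejected by such a check would have to fall back to a non-DMA path.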

So the question is how to export the multi-block LLP flag from the DW DMAC
driver. Andy?

-Sergey


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-11 13:45             ` Serge Semin
@ 2020-05-11 13:58               ` Andy Shevchenko
  2020-05-11 17:48                 ` Mark Brown
  2020-05-11 19:32                 ` Serge Semin
  2020-05-11 17:44               ` Mark Brown
  1 sibling, 2 replies; 72+ messages in thread
From: Andy Shevchenko @ 2020-05-11 13:58 UTC (permalink / raw)
  To: Serge Semin
  Cc: Mark Brown, Serge Semin, Andy Shevchenko, Vinod Koul,
	Viresh Kumar, Dan Williams, Alexey Malahov, Thomas Bogendoerfer,
	Paul Burton, Ralf Baechle, Arnd Bergmann, Rob Herring,
	linux-mips, devicetree, dmaengine, Linux Kernel Mailing List

On Mon, May 11, 2020 at 4:48 PM Serge Semin
<Sergey.Semin@baikalelectronics.ru> wrote:
>
> On Mon, May 11, 2020 at 12:58:13PM +0100, Mark Brown wrote:
> > On Mon, May 11, 2020 at 05:10:16AM +0300, Serge Semin wrote:
> >
> > > Alas linearizing the SPI messages won't help in this case because the DW DMA
> > > driver will split it into the max transaction chunks anyway.
> >
> > That sounds like you need to also impose a limit on the maximum message
> > size as well then, with that you should be able to handle messages up
> > to whatever that limit is.  There's code for that bit already, so long
> > as the limit is not too low it should be fine for most devices and
> > client drivers can see the limit so they can be updated to work with it
> > if needed.
>
> Hmm, this might work. The problem will be with imposing such limitation through
> the DW APB SSI driver. In order to do this I need to know:
> 1) Whether multi-block LLP is supported by the DW DMA controller.
> 2) Maximum DW DMA transfer block size.
> Then I'll be able to use this information in the can_dma() callback to enable
> the DMA xfers only for the safe transfers. Did you mean something like this when
> you said "There's code for that bit already" ? If you meant the max_dma_len
> parameter, then setting it won't work, because it just limits the SG items size
> not the total length of a single transfer.
>
> So the question is of how to export the multi-block LLP flag from DW DMAc
> driver. Andy?

I'm not sure I understand why you need this to be exported. Just
always supply an SG list consisting of a single entry and define the length
according to the maximum segment size (IIRC this is done in the SPI core).

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-11  3:13           ` Serge Semin
@ 2020-05-11 14:03             ` Andy Shevchenko
  0 siblings, 0 replies; 72+ messages in thread
From: Andy Shevchenko @ 2020-05-11 14:03 UTC (permalink / raw)
  To: Serge Semin
  Cc: Serge Semin, Mark Brown, Vinod Koul, Viresh Kumar, Dan Williams,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Rob Herring, linux-mips, devicetree, dmaengine,
	linux-kernel

On Mon, May 11, 2020 at 06:13:44AM +0300, Serge Semin wrote:
> On Fri, May 08, 2020 at 10:06:22PM +0300, Andy Shevchenko wrote:
> > On Fri, May 08, 2020 at 12:53:34PM +0100, Mark Brown wrote:
> > > On Fri, May 08, 2020 at 02:26:04PM +0300, Andy Shevchenko wrote:
> > > > On Fri, May 08, 2020 at 01:53:02PM +0300, Serge Semin wrote:
> > > 
> > > > > Multi-block support provides a way to map the kernel-specific SG-table so
> > > > > the DW DMA device would handle it as a whole instead of handling the
> > > > > SG-list items or so called LLP block items one by one. So if true LLP
> > > > > list isn't supported by the DW DMA engine, then soft-LLP mode will be
> > > > > utilized to load and execute each LLP-block one by one. A problem may
> > > > > happen for multi-block DMA slave transfers, when the slave device buffers
> > > > > (for example Tx and Rx FIFOs) depend on each other and have size smaller
> > > > > than the block size. In this case writing data to the DMA slave Tx buffer
> > > > > may cause the Rx buffer overflow if Rx DMA channel is paused to
> > > > > reinitialize the DW DMA controller with a next Rx LLP item. In particular
> > > > > We've discovered this problem in the framework of the DW APB SPI device
> > > 
> > > > Mark, do we have any adjustment knobs in SPI core to cope with this?
> > > 
> > > Frankly I'm not sure I follow what the issue is - is an LLP block item
> > > different from a SG list entry?  As far as I can tell the problem is
> > > that the DMA controller does not support chaining transactions together
> > > and possibly also has a limit on the transfer size?  Or possibly some
> > > issue with the DMA controller locking the CPU out of the I/O bus for
> > > noticable periods?  I can't really think what we could do about that if
> > > the issue is transfer sizes, that just seems like hardware which is
> > > never going to work reliably.  If the issue is not being able to chain
> > > transfers then possibly an option to linearize messages into a single
> > > transfer as suggested to cope with PIO devices with ill considered
> > > automated chip select handling, though at some point you have to worry
> > > about the cost of the memcpy() vs the cost of just doing PIO.
> > 
> > My understanding that the programmed transfers (as separate items in SG list)
> > can be desynchronized due to LLP emulation in DMA driver. And suggestion
> > probably is to use only single entry (block) SG lists will do the trick (I
> > guess that we can configure SPI core do or do not change CS between them).
> 
> CS has nothing to do with this.

I meant that when you do a single-entry SG transfer, you may need to stop the SPI
core from toggling CS if needed (or otherwise).

> The problem is pure in the LLP emulation and Tx
> channel being enabled before the Rx channel initialization during the next LLP
> reload. Yes, if we have Tx and Rx SG/LLP list consisting of a single item, then
> there is no problem. Though it would be good to fix the issue in general instead
> of setting such fatal restrictions. If we had some fence of blocking one channel
> before another is reinitialized, the problem could theoretically be solved.
> 
> It could be an interdependent DMA channels functionality. If two channels are
> interdependent than the Rx channel could pause the Tx channel while it's in the
> IRQ handling procedure (or at some other point... call a callback?). This !might!
> fix the problem, but with no 100% guarantee of success. It will work only if IRQ
> handler is executed with small latency, so the Tx channel is paused before the Rx
> FIFO has been filled and overrun.
> 
> Another solution could be to reinitialize the interdependent channels
> synchronously. Tx channel stops and waits until the Rx channel is finished its
> business of data retrieval from SPI Rx FIFO. Though this solution implies
> the Tx and Rx buffers of SG/LLP items being of the same size.
> 
> Although non of these solutions I really like to spend some time for its
> development.

I think you don't need to go too far with it; we can get an easier solution (as
is being discussed in the continuation of this thread).

> > > > > working in conjunction with DW DMA. Since there is no comprehensive way to
> > > > > fix it right now lets at least print a warning for the first found
> > > > > multi-blockless DW DMAC channel. This shall point a developer to the
> > > > > possible cause of the problem if one would experience a sudden data loss.
> > > 
> > > I thought from the description of the SPI driver I just reviewed that
> > > this hardware didn't have DMA?  Or are there separate blocks in the
> > > hardware that have a more standard instantiation of the DesignWare SPI
> > > controller with DMA attached?
> > 
> > I speculate that the right words there should be 'we don't enable DMA right now
> > due to some issues' (see above).
> 
> It's your speculation and it's kind of offensive implicitly implying I was
> lying.

Sorry if you think so. I didn't imply you were lying; I simply didn't get the big
picture, but here you elaborated better, thank you.

> If our System SPI controller had DMA I would have said that and would
> have made it supported in the driver and probably wouldn't bother with a
> dedicated driver development. Again the Baikal-T1 System Boot SPI controller
> doesn't have DMA, doesn't have IRQ, is equipped with only 8 bytes FIFO, is
> embedded into the Boot Controller, provides a dirmap interface to an SPI flash
> and so on. Baikal-T1 has also got two more normal DW APB SSI interfaces with 64
> bytes FIFO, IRQ and DMA.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-11 13:45             ` Serge Semin
  2020-05-11 13:58               ` Andy Shevchenko
@ 2020-05-11 17:44               ` Mark Brown
  2020-05-11 18:32                 ` Serge Semin
  1 sibling, 1 reply; 72+ messages in thread
From: Mark Brown @ 2020-05-11 17:44 UTC (permalink / raw)
  To: Serge Semin
  Cc: Serge Semin, Andy Shevchenko, Vinod Koul, Viresh Kumar,
	Dan Williams, Alexey Malahov, Thomas Bogendoerfer, Paul Burton,
	Ralf Baechle, Arnd Bergmann, Rob Herring, linux-mips, devicetree,
	dmaengine, linux-kernel


On Mon, May 11, 2020 at 04:45:02PM +0300, Serge Semin wrote:
> On Mon, May 11, 2020 at 12:58:13PM +0100, Mark Brown wrote:

> > That sounds like you need to also impose a limit on the maximum message
> > size as well then, with that you should be able to handle messages up
> > to whatever that limit is.  There's code for that bit already, so long
> > as the limit is not too low it should be fine for most devices and
> > client drivers can see the limit so they can be updated to work with it
> > if needed.

> Hmm, this might work. The problem will be with imposing such limitation through
> the DW APB SSI driver. In order to do this I need to know:

> 1) Whether multi-block LLP is supported by the DW DMA controller.
> 2) Maximum DW DMA transfer block size.

There is a constraint enumeration interface in the DMA API which you
should be able to extend for this if it doesn't already support what you
need.

> Then I'll be able to use this information in the can_dma() callback to enable
> the DMA xfers only for the safe transfers. Did you mean something like this when
> you said "There's code for that bit already" ? If you meant the max_dma_len
> parameter, then setting it won't work, because it just limits the SG items size
> not the total length of a single transfer.

You can set max_transfer_size and/or max_message_size in the SPI driver
- you should be able to do this on probe.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-11 13:58               ` Andy Shevchenko
@ 2020-05-11 17:48                 ` Mark Brown
  2020-05-11 18:25                   ` Serge Semin
  2020-05-11 19:32                 ` Serge Semin
  1 sibling, 1 reply; 72+ messages in thread
From: Mark Brown @ 2020-05-11 17:48 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Serge Semin, Serge Semin, Andy Shevchenko, Vinod Koul,
	Viresh Kumar, Dan Williams, Alexey Malahov, Thomas Bogendoerfer,
	Paul Burton, Ralf Baechle, Arnd Bergmann, Rob Herring,
	linux-mips, devicetree, dmaengine, Linux Kernel Mailing List


On Mon, May 11, 2020 at 04:58:53PM +0300, Andy Shevchenko wrote:
> On Mon, May 11, 2020 at 4:48 PM Serge Semin

> > So the question is of how to export the multi-block LLP flag from DW DMAc
> > driver. Andy?

> I'm not sure I understand why do you need this being exported. Just
> always supply SG list out of single entry and define the length
> according to the maximum segment size (it's done IIRC in SPI core).

If there's a limit from the dmaengine it'd be a bit cleaner to export
the limit from the DMA engine (and it'd help with code reuse for clients
that might work with other DMA controllers without needing to add custom
compatibles for those instantiations).

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-11 17:48                 ` Mark Brown
@ 2020-05-11 18:25                   ` Serge Semin
  0 siblings, 0 replies; 72+ messages in thread
From: Serge Semin @ 2020-05-11 18:25 UTC (permalink / raw)
  To: Mark Brown
  Cc: Serge Semin, Andy Shevchenko, Andy Shevchenko, Vinod Koul,
	Viresh Kumar, Dan Williams, Alexey Malahov, Thomas Bogendoerfer,
	Paul Burton, Ralf Baechle, Arnd Bergmann, Rob Herring,
	linux-mips, devicetree, dmaengine, Linux Kernel Mailing List

On Mon, May 11, 2020 at 06:48:00PM +0100, Mark Brown wrote:
> On Mon, May 11, 2020 at 04:58:53PM +0300, Andy Shevchenko wrote:
> > On Mon, May 11, 2020 at 4:48 PM Serge Semin
> 
> > > So the question is of how to export the multi-block LLP flag from DW DMAc
> > > driver. Andy?
> 
> > I'm not sure I understand why do you need this being exported. Just
> > always supply SG list out of single entry and define the length
> > according to the maximum segment size (it's done IIRC in SPI core).
> 
> If there's a limit from the dmaengine it'd be a bit cleaner to export
> the limit from the DMA engine (and it'd help with code reuse for clients
> that might work with other DMA controllers without needing to add custom
> compatibles for those instantiations).

Right. I've already posted a patch which exports the max segment size from the
DW DMA controller driver. The SPI core will get the limit in the spi_map_buf()
method by calling the dma_get_max_seg_size() function. The problem I
described concerns how to determine whether to apply the solution Andy
suggested, since normally, if the DW DMA controller supports true multi-block
LLP, the workaround isn't required. So in order to solve the problem in a
generic way, the easiest approach would be to somehow get the noLLP flag from the
DW DMAC private data and select a one-by-one SG entry submission algorithm
instead of the normal one... On the other hand we could just implement
flag-based quirks in the DW APB SSI driver and determine whether the LLP
problem exists for the platform-specific DW APB SSI controller.
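
For illustration, a one-by-one submission would hand the DMA engine entries
whose sizes follow from the max segment size. The user-space sketch below models
only that arithmetic; sg_entry_len() is a hypothetical name, and the model
ignores the page-boundary handling a real buffer mapping has to do.

```c
#include <stddef.h>

/*
 * Length of the idx-th SG entry when len bytes are cut into pieces of at
 * most max_seg bytes. Returns 0 if idx is past the last entry or if
 * max_seg is 0.
 */
size_t sg_entry_len(size_t len, size_t max_seg, size_t idx)
{
	size_t offset;

	/* idx <= len / max_seg guarantees idx * max_seg cannot overflow. */
	if (max_seg == 0 || idx > len / max_seg)
		return 0;

	offset = idx * max_seg;
	if (offset >= len)
		return 0;

	/* All entries are full-sized except possibly the last one. */
	return (len - offset < max_seg) ? len - offset : max_seg;
}
```

With an assumed 4095-byte max segment size, a 10000-byte buffer maps to entries
of 4095, 4095 and 1810 bytes, each of which would be submitted as its own
single-block transfer when hardware LLP is absent.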

-Sergey


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-11 17:44               ` Mark Brown
@ 2020-05-11 18:32                 ` Serge Semin
  2020-05-11 21:32                   ` Mark Brown
  0 siblings, 1 reply; 72+ messages in thread
From: Serge Semin @ 2020-05-11 18:32 UTC (permalink / raw)
  To: Mark Brown
  Cc: Serge Semin, Andy Shevchenko, Vinod Koul, Viresh Kumar,
	Dan Williams, Alexey Malahov, Thomas Bogendoerfer, Paul Burton,
	Ralf Baechle, Arnd Bergmann, Rob Herring, linux-mips, devicetree,
	dmaengine, linux-kernel

On Mon, May 11, 2020 at 06:44:14PM +0100, Mark Brown wrote:
> On Mon, May 11, 2020 at 04:45:02PM +0300, Serge Semin wrote:
> > On Mon, May 11, 2020 at 12:58:13PM +0100, Mark Brown wrote:
> 
> > > That sounds like you need to also impose a limit on the maximum message
> > > size as well then, with that you should be able to handle messages up
> > > to whatever that limit is.  There's code for that bit already, so long
> > > as the limit is not too low it should be fine for most devices and
> > > client drivers can see the limit so they can be updated to work with it
> > > if needed.
> 
> > Hmm, this might work. The problem will be with imposing such limitation through
> > the DW APB SSI driver. In order to do this I need to know:
> 
> > 1) Whether multi-block LLP is supported by the DW DMA controller.
> > 2) Maximum DW DMA transfer block size.
> 
> There is a constraint enumeration interface in the DMA API which you
> should be able to extend for this if it doesn't already support what you
> need.

Yes, that's max segment size.

> 
> > Then I'll be able to use this information in the can_dma() callback to enable
> > the DMA xfers only for the safe transfers. Did you mean something like this when
> > you said "There's code for that bit already" ? If you meant the max_dma_len
> > parameter, then setting it won't work, because it just limits the SG items size
> > not the total length of a single transfer.
> 
> You can set max_transfer_size and/or max_message_size in the SPI driver
> - you should be able to do this on probe.

Thanks for the explanation. Setting the max segment size on the DMA controller
generic device should work well. There is no need to set the transfer and message
size limitations. Besides, I don't really see the
max_transfer_size/max_message_size callbacks utilized in the SPI core. These
functions are called in the spi-mem.c driver only. Am I missing something?

-Sergey



* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-11 13:58               ` Andy Shevchenko
  2020-05-11 17:48                 ` Mark Brown
@ 2020-05-11 19:32                 ` Serge Semin
  2020-05-11 21:07                   ` Andy Shevchenko
  1 sibling, 1 reply; 72+ messages in thread
From: Serge Semin @ 2020-05-11 19:32 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Serge Semin, Mark Brown, Andy Shevchenko, Vinod Koul,
	Viresh Kumar, Dan Williams, Alexey Malahov, Thomas Bogendoerfer,
	Paul Burton, Ralf Baechle, Arnd Bergmann, Rob Herring,
	linux-mips, devicetree, dmaengine, Linux Kernel Mailing List

On Mon, May 11, 2020 at 04:58:53PM +0300, Andy Shevchenko wrote:
> On Mon, May 11, 2020 at 4:48 PM Serge Semin
> <Sergey.Semin@baikalelectronics.ru> wrote:
> >
> > On Mon, May 11, 2020 at 12:58:13PM +0100, Mark Brown wrote:
> > > On Mon, May 11, 2020 at 05:10:16AM +0300, Serge Semin wrote:
> > >
> > > > Alas linearizing the SPI messages won't help in this case because the DW DMA
> > > > driver will split it into the max transaction chunks anyway.
> > >
> > > That sounds like you need to also impose a limit on the maximum message
> > > size as well then, with that you should be able to handle messages up
> > > to whatever that limit is.  There's code for that bit already, so long
> > > as the limit is not too low it should be fine for most devices and
> > > client drivers can see the limit so they can be updated to work with it
> > > if needed.
> >
> > Hmm, this might work. The problem will be with imposing such limitation through
> > the DW APB SSI driver. In order to do this I need to know:
> > 1) Whether multi-block LLP is supported by the DW DMA controller.
> > 2) Maximum DW DMA transfer block size.
> > Then I'll be able to use this information in the can_dma() callback to enable
> > the DMA xfers only for the safe transfers. Did you mean something like this when
> > you said "There's code for that bit already" ? If you meant the max_dma_len
> > parameter, then setting it won't work, because it just limits the SG items size
> > not the total length of a single transfer.
> >
> > So the question is of how to export the multi-block LLP flag from DW DMAc
> > driver. Andy?
> 
> I'm not sure I understand why do you need this being exported. Just
> always supply SG list out of single entry and define the length
> according to the maximum segment size (it's done IIRC in SPI core).

Finally I see your point. So you suggest feeding the DMA engine with the SG list
entries one-by-one instead of sending all of them at once in a single
dmaengine_prep_slave_sg() -> dmaengine_submit() -> dma_async_issue_pending()
session. Hm, this solution will work, but there is an issue. There is no
guarantee that the Tx and Rx SG lists are symmetric, i.e. consist of the same
number of items with the same sizes. That depends on the physical address
alignment of the Tx/Rx buffers and their offsets within the memory pages. Though
this problem can be solved by making the Tx and Rx SG lists symmetric. I'll have
to implement a clever DMA IO loop, which would extract the DMA
addresses/lengths from the SG entries and perform single-buffer DMA
transactions with DMA buffers of the same length.
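
A minimal user-space sketch of such a loop, assuming plain address/length pairs
instead of the kernel's struct scatterlist. The names seg, chunk and
split_symmetric() are hypothetical and aren't taken from any driver; real code
would submit each chunk to the Tx and Rx channels instead of storing it:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one mapped SG entry (not struct scatterlist). */
struct seg {
	unsigned long addr;	/* DMA address */
	size_t len;		/* length in bytes */
};

/* One single-buffer transaction with equal Tx and Rx lengths. */
struct chunk {
	unsigned long tx_addr;
	unsigned long rx_addr;
	size_t len;
};

/*
 * Walk both lists in lockstep and emit chunks no longer than max_len.
 * Returns the number of chunks stored in out, or 0 if out is too small.
 */
static size_t split_symmetric(const struct seg *tx, size_t ntx,
			      const struct seg *rx, size_t nrx,
			      size_t max_len, struct chunk *out, size_t nout)
{
	size_t ti = 0, ri = 0, toff = 0, roff = 0, n = 0;

	assert(max_len > 0);

	while (ti < ntx && ri < nrx) {
		size_t tleft = tx[ti].len - toff;
		size_t rleft = rx[ri].len - roff;
		size_t len = tleft < rleft ? tleft : rleft;

		if (len > max_len)
			len = max_len;

		/* Zero-length entries produce no chunk, they are just skipped. */
		if (len > 0) {
			if (n == nout)
				return 0;
			out[n].tx_addr = tx[ti].addr + toff;
			out[n].rx_addr = rx[ri].addr + roff;
			out[n].len = len;
			n++;
		}

		/* Advance within the entries, moving on when one is exhausted. */
		toff += len;
		roff += len;
		if (toff == tx[ti].len) {
			ti++;
			toff = 0;
		}
		if (roff == rx[ri].len) {
			ri++;
			roff = 0;
		}
	}

	return n;
}
```

With a 6+4 byte Tx list, a 3+7 byte Rx list and a 4-byte limit this would yield
three chunks of 3, 3 and 4 bytes.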

Regarding noLLP being exported. Obviously I intended to solve the problem in a
generic way, since it is common to any noLLP DW APB SSI/DW DMAC combination.
In order to do this we need to know whether the multi-block LLP feature is
unsupported by the DW DMA controller. We either export such info somehow
from the DW DMA driver, so the DMA clients (like the DW APB SSI controller driver)
can be ready to work around the problem; or just implement a flag-based quirk
in the DMA client driver, which would be enabled on a platform-specific basis
depending on the platform device actually detected (for instance, a specific
version of the DW APB SSI IP). AFAICS you'd prefer the latter option.

Regarding the SPI core toggling CS. It is irrelevant to this problem, since the DMA
transactions are implemented within a single SPI transfer, so the CS won't be
touched by the SPI core while we are working with the xfer descriptor. Though
the problem with the DW APB SSI native automatic CS toggling will persist anyway,
no matter whether the multi-block LLPs are supported or not.

-Sergey

> 
> -- 
> With Best Regards,
> Andy Shevchenko


* Re: [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property
  2020-05-08 11:12     ` Andy Shevchenko
@ 2020-05-11 20:05       ` Serge Semin
  2020-05-11 21:01         ` Andy Shevchenko
  0 siblings, 1 reply; 72+ messages in thread
From: Serge Semin @ 2020-05-11 20:05 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Serge Semin, Vinod Koul, Viresh Kumar, Rob Herring,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Dan Williams, linux-mips, dmaengine, devicetree,
	linux-kernel

On Fri, May 08, 2020 at 02:12:42PM +0300, Andy Shevchenko wrote:
> On Fri, May 08, 2020 at 01:53:00PM +0300, Serge Semin wrote:
> > This array property is used to indicate the maximum burst transaction
> > length supported by each DMA channel.
> 
> > +  snps,max-burst-len:
> > +    $ref: /schemas/types.yaml#/definitions/uint32-array
> > +    description: |
> > +      Maximum length of burst transactions supported by hardware.
> > +      It's an array property with one cell per channel in units of
> > +      CTLx register SRC_TR_WIDTH/DST_TR_WIDTH (data-width) field.
> > +    items:
> > +      maxItems: 8
> > +      items:
> 
> > +        enum: [4, 8, 16, 32, 64, 128, 256]
> 
> Isn't 1 allowed?

A burst length of 1 unit is supported, but according to the Data Book the MAX
burst length is limited to a value from the set I submitted. So the
max value can be either 4, or 8, or 16 and so on.

-Sergey

> 
> > +        default: 256
> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


* Re: [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property
  2020-05-11 20:05       ` Serge Semin
@ 2020-05-11 21:01         ` Andy Shevchenko
  2020-05-11 21:35           ` Serge Semin
  0 siblings, 1 reply; 72+ messages in thread
From: Andy Shevchenko @ 2020-05-11 21:01 UTC (permalink / raw)
  To: Serge Semin
  Cc: Serge Semin, Vinod Koul, Viresh Kumar, Rob Herring,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Dan Williams, linux-mips, dmaengine, devicetree,
	linux-kernel

On Mon, May 11, 2020 at 11:05:28PM +0300, Serge Semin wrote:
> On Fri, May 08, 2020 at 02:12:42PM +0300, Andy Shevchenko wrote:
> > On Fri, May 08, 2020 at 01:53:00PM +0300, Serge Semin wrote:
> > > This array property is used to indicate the maximum burst transaction
> > > length supported by each DMA channel.
> > 
> > > +  snps,max-burst-len:
> > > +    $ref: /schemas/types.yaml#/definitions/uint32-array
> > > +    description: |
> > > +      Maximum length of burst transactions supported by hardware.
> > > +      It's an array property with one cell per channel in units of
> > > +      CTLx register SRC_TR_WIDTH/DST_TR_WIDTH (data-width) field.
> > > +    items:
> > > +      maxItems: 8
> > > +      items:
> > 
> > > +        enum: [4, 8, 16, 32, 64, 128, 256]
> > 
> > Isn't 1 allowed?
> 
> Burst length of 1 unit is supported, but in accordance with Data Book the MAX
> burst length is limited to be equal to a value from the set I submitted. So the
> max value can be either 4, or 8, or 16 and so on.

Hmm... It seems you mistakenly took the DMAH_CHx_MAX_MULT_SIZE pre-silicon
configuration parameter here instead of the runtime one described in Table 26:
CTLx.SRC_MSIZE and DEST_MSIZE Decoding.

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-11 19:32                 ` Serge Semin
@ 2020-05-11 21:07                   ` Andy Shevchenko
  2020-05-11 21:08                     ` Andy Shevchenko
  0 siblings, 1 reply; 72+ messages in thread
From: Andy Shevchenko @ 2020-05-11 21:07 UTC (permalink / raw)
  To: Serge Semin
  Cc: Serge Semin, Mark Brown, Vinod Koul, Viresh Kumar, Dan Williams,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Rob Herring, linux-mips, devicetree, dmaengine,
	Linux Kernel Mailing List

On Mon, May 11, 2020 at 10:32:55PM +0300, Serge Semin wrote:
> On Mon, May 11, 2020 at 04:58:53PM +0300, Andy Shevchenko wrote:
> > On Mon, May 11, 2020 at 4:48 PM Serge Semin
> > <Sergey.Semin@baikalelectronics.ru> wrote:
> > >
> > > On Mon, May 11, 2020 at 12:58:13PM +0100, Mark Brown wrote:
> > > > On Mon, May 11, 2020 at 05:10:16AM +0300, Serge Semin wrote:
> > > >
> > > > > Alas linearizing the SPI messages won't help in this case because the DW DMA
> > > > > driver will split it into the max transaction chunks anyway.
> > > >
> > > > That sounds like you need to also impose a limit on the maximum message
> > > > size as well then, with that you should be able to handle messages up
> > > > to whatever that limit is.  There's code for that bit already, so long
> > > > as the limit is not too low it should be fine for most devices and
> > > > client drivers can see the limit so they can be updated to work with it
> > > > if needed.
> > >
> > > Hmm, this might work. The problem will be with imposing such limitation through
> > > the DW APB SSI driver. In order to do this I need to know:
> > > 1) Whether multi-block LLP is supported by the DW DMA controller.
> > > 2) Maximum DW DMA transfer block size.
> > > Then I'll be able to use this information in the can_dma() callback to enable
> > > the DMA xfers only for the safe transfers. Did you mean something like this when
> > > you said "There's code for that bit already" ? If you meant the max_dma_len
> > > parameter, then setting it won't work, because it just limits the SG items size
> > > not the total length of a single transfer.
> > >
> > > So the question is of how to export the multi-block LLP flag from DW DMAc
> > > driver. Andy?
> > 
> > I'm not sure I understand why do you need this being exported. Just
> > always supply SG list out of single entry and define the length
> > according to the maximum segment size (it's done IIRC in SPI core).
> 
> Finally I see your point. So you suggest to feed the DMA engine with SG list
> entries one-by-one instead of sending all of them at once in a single
> dmaengine_prep_slave_sg() -> dmaengine_submit() -> dma_async_issue_pending()
> session. Hm, this solution will work, but there is an issue. There is no
> guarantee, that Tx and Rx SG lists are symmetric, consisting of the same
> number of items with the same sizes. It depends on the Tx/Rx buffers physical
> address alignment and their offsets within the memory pages. Though this
> problem can be solved by making the Tx and Rx SG lists symmetric. I'll have
> to implement a clever DMA IO loop, which would extract the DMA
> addresses/lengths from the SG entries and perform the single-buffer DMA 
> transactions with the DMA buffers of the same length.
> 
> Regarding noLLP being exported. Obviously I intended to solve the problem in a
> generic way since the problem is common for noLLP DW APB SSI/DW DMAC combination.
> In order to do this we need to know whether the multi-block LLP feature is
> unsupported by the DW DMA controller. We either make such info somehow exported
> from the DW DMA driver, so the DMA clients (like Dw APB SSI controller driver)
> could be ready to work around the problem; or just implement a flag-based quirk
> in the DMA client driver, which would be enabled in the platform-specific basis
> depending on the platform device actually detected (for instance, a specific
> version of the DW APB SSI IP). AFAICS You'd prefer the later option. 

So, we may extend the struct of DMA parameters to tell the consumer the number
of entries (each of which is no longer than the maximum segment size) it can
afford:
- 0: Auto (DMA driver handles any cases itself)
- 1: Only single entry
- 2: Up to two...
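
A rough sketch of how a consumer could use such a parameter. The dma_limits
struct, its max_sg_nents field and both helpers are hypothetical; they only
illustrate the proposal and don't mirror any existing kernel structure:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limits a DMA provider would report to its consumers. */
struct dma_limits {
	size_t max_seg_size;		/* maximum length of one SG entry */
	unsigned int max_sg_nents;	/* 0: auto, N: up to N entries per submit */
};

/* How many of the nents SG entries may go into a single submission. */
static size_t nents_per_submit(const struct dma_limits *lim, size_t nents)
{
	assert(lim != NULL);

	if (lim->max_sg_nents == 0 || nents <= lim->max_sg_nents)
		return nents;

	return lim->max_sg_nents;
}

/* How many submissions are needed to push all nents entries. */
static size_t submits_needed(const struct dma_limits *lim, size_t nents)
{
	size_t per = nents_per_submit(lim, nents);

	return per ? (nents + per - 1) / per : 0;
}
```

With "auto" a five-entry list would go out in one submission, with a limit of
one it would take five submissions, and with a limit of two it would take three.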


-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-11 21:07                   ` Andy Shevchenko
@ 2020-05-11 21:08                     ` Andy Shevchenko
  2020-05-12 12:42                       ` Serge Semin
  0 siblings, 1 reply; 72+ messages in thread
From: Andy Shevchenko @ 2020-05-11 21:08 UTC (permalink / raw)
  To: Serge Semin
  Cc: Serge Semin, Mark Brown, Vinod Koul, Viresh Kumar, Dan Williams,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Rob Herring, linux-mips, devicetree, dmaengine,
	Linux Kernel Mailing List

On Tue, May 12, 2020 at 12:07:14AM +0300, Andy Shevchenko wrote:
> On Mon, May 11, 2020 at 10:32:55PM +0300, Serge Semin wrote:
> > On Mon, May 11, 2020 at 04:58:53PM +0300, Andy Shevchenko wrote:
> > > On Mon, May 11, 2020 at 4:48 PM Serge Semin
> > > <Sergey.Semin@baikalelectronics.ru> wrote:
> > > >
> > > > On Mon, May 11, 2020 at 12:58:13PM +0100, Mark Brown wrote:
> > > > > On Mon, May 11, 2020 at 05:10:16AM +0300, Serge Semin wrote:
> > > > >
> > > > > > Alas linearizing the SPI messages won't help in this case because the DW DMA
> > > > > > driver will split it into the max transaction chunks anyway.
> > > > >
> > > > > That sounds like you need to also impose a limit on the maximum message
> > > > > size as well then, with that you should be able to handle messages up
> > > > > to whatever that limit is.  There's code for that bit already, so long
> > > > > as the limit is not too low it should be fine for most devices and
> > > > > client drivers can see the limit so they can be updated to work with it
> > > > > if needed.
> > > >
> > > > Hmm, this might work. The problem will be with imposing such limitation through
> > > > the DW APB SSI driver. In order to do this I need to know:
> > > > 1) Whether multi-block LLP is supported by the DW DMA controller.
> > > > 2) Maximum DW DMA transfer block size.
> > > > Then I'll be able to use this information in the can_dma() callback to enable
> > > > the DMA xfers only for the safe transfers. Did you mean something like this when
> > > > you said "There's code for that bit already" ? If you meant the max_dma_len
> > > > parameter, then setting it won't work, because it just limits the SG items size
> > > > not the total length of a single transfer.
> > > >
> > > > So the question is of how to export the multi-block LLP flag from DW DMAc
> > > > driver. Andy?
> > > 
> > > I'm not sure I understand why do you need this being exported. Just
> > > always supply SG list out of single entry and define the length
> > > according to the maximum segment size (it's done IIRC in SPI core).
> > 
> > Finally I see your point. So you suggest to feed the DMA engine with SG list
> > entries one-by-one instead of sending all of them at once in a single
> > dmaengine_prep_slave_sg() -> dmaengine_submit() -> dma_async_issue_pending()
> > session. Hm, this solution will work, but there is an issue. There is no
> > guarantee, that Tx and Rx SG lists are symmetric, consisting of the same
> > number of items with the same sizes. It depends on the Tx/Rx buffers physical
> > address alignment and their offsets within the memory pages. Though this
> > problem can be solved by making the Tx and Rx SG lists symmetric. I'll have
> > to implement a clever DMA IO loop, which would extract the DMA
> > addresses/lengths from the SG entries and perform the single-buffer DMA 
> > transactions with the DMA buffers of the same length.
> > 
> > Regarding noLLP being exported. Obviously I intended to solve the problem in a
> > generic way since the problem is common for noLLP DW APB SSI/DW DMAC combination.
> > In order to do this we need to know whether the multi-block LLP feature is
> > unsupported by the DW DMA controller. We either make such info somehow exported
> > from the DW DMA driver, so the DMA clients (like Dw APB SSI controller driver)
> > could be ready to work around the problem; or just implement a flag-based quirk
> > in the DMA client driver, which would be enabled in the platform-specific basis
> > depending on the platform device actually detected (for instance, a specific
> > version of the DW APB SSI IP). AFAICS You'd prefer the later option. 
> 
> So, we may extend the struct of DMA parameters to tell the consumer amount of entries (each of which is no longer than maximum segment size) it can afford:
> - 0: Auto (DMA driver handles any cases itself)
> - 1: Only single entry
> - 2: Up to two...

It will leave aside the implementation details (i.o.w. the obstacles or
limitations) of why the DMA can't do otherwise.

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v2 3/6] dmaengine: dw: Set DMA device max segment size parameter
  2020-05-08 11:21     ` Andy Shevchenko
  2020-05-08 18:49       ` Vineet Gupta
@ 2020-05-11 21:16       ` Serge Semin
  2020-05-12 12:35         ` Andy Shevchenko
  1 sibling, 1 reply; 72+ messages in thread
From: Serge Semin @ 2020-05-11 21:16 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Serge Semin, Vineet Gupta, Vinod Koul, Viresh Kumar,
	Dan Williams, Alexey Malahov, Thomas Bogendoerfer, Paul Burton,
	Ralf Baechle, Arnd Bergmann, Rob Herring, linux-mips, devicetree,
	dmaengine, linux-kernel

On Fri, May 08, 2020 at 02:21:52PM +0300, Andy Shevchenko wrote:
> +Cc (Vineet, for information you probably know)
> 
> On Fri, May 08, 2020 at 01:53:01PM +0300, Serge Semin wrote:
> > Maximum block size DW DMAC configuration corresponds to the max segment
> > size DMA parameter in the DMA core subsystem notation. Lets set it with a
> > value specific to the probed DW DMA controller. It shall help the DMA
> > clients to create size-optimized SG-list items for the controller. This in
> > turn will cause less dw_desc allocations, less LLP reinitializations,
> > better DMA device performance.
> 
> Thank you for the patch.
> My comments below.
> 
> ...
> 
> > +		/*
> > +		 * Find maximum block size to be set as the DMA device maximum
> > +		 * segment size. By doing so we'll have size optimized SG-list
> > +		 * items for the channels with biggest block size. This won't
> > +		 * be a problem for the rest of the channels, since they will
> > +		 * still be able to split the requests up by allocating
> > +		 * multiple DW DMA LLP descriptors, which they would have done
> > +		 * anyway.
> > +		 */
> > +		if (dwc->block_size > block_size)
> > +			block_size = dwc->block_size;
> >  	}
> >  
> >  	/* Clear all interrupts on all channels. */
> > @@ -1220,6 +1233,10 @@ int do_dma_probe(struct dw_dma_chip *chip)
> >  			     BIT(DMA_MEM_TO_MEM);
> >  	dw->dma.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
> >  
> > +	/* Block size corresponds to the maximum sg size */
> > +	dw->dma.dev->dma_parms = &dw->dma_parms;
> > +	dma_set_max_seg_size(dw->dma.dev, block_size);
> > +
> >  	err = dma_async_device_register(&dw->dma);
> >  	if (err)
> >  		goto err_dma_register;
> 
> Yeah, I have locally something like this and I didn't dare to upstream because
> there is an issue. We have this information per DMA controller, while we
> actually need this on per DMA channel basis.
> 
> Above will work only for synthesized DMA with all channels having same block
> size. That's why above conditional is not needed anyway.

Hm, I don't really see why the conditional isn't needed and why this won't work.
As you can see in the loop above, I first find the maximum of all the channels'
maximum block sizes and then use it as the max segment size parameter for the
whole device. If the DW DMA controller has the same max block size for all
channels, then that value will be found. If the channels have been synthesized
with different block sizes, then the optimization will work for the one with the
greatest block size. The SG list entries of the channels with a lesser max block
size will be split up by the DW DMAC driver, which would have been done anyway
without max_segment_size being set. Here we at least provide the optimization
for the channels with the greatest max block size.

I do understand that it would be good to have this parameter set up on a per
generic DMA channel descriptor basis. But the DMA core and the device descriptor
don't provide such a facility, so setting at least some justified value is a
good idea.
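
To illustrate the reasoning, here is a small stand-alone sketch (not the driver
code). max_block_size() and descs_needed() are hypothetical helpers, and the
block sizes are treated as plain byte counts for simplicity:

```c
#include <assert.h>
#include <stddef.h>

/* Largest per-channel block size, used as the device-wide max segment size. */
static size_t max_block_size(const size_t *chan_block_size, size_t nchan)
{
	size_t max = 0;

	for (size_t i = 0; i < nchan; i++) {
		if (chan_block_size[i] > max)
			max = chan_block_size[i];
	}

	return max;
}

/* Descriptors a channel needs to cover one SG entry of seg_len bytes. */
static size_t descs_needed(size_t seg_len, size_t block_size)
{
	assert(block_size > 0);

	return (seg_len + block_size - 1) / block_size;
}
```

For channels with block sizes of 4095, 4095 and 1023 the device-wide value would
be 4095. A 4095-byte SG entry then takes a single descriptor on the first two
channels and five on the third one, which has to split it anyway.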

> 
> OTOH, I never saw the DesignWare DMA to be synthesized differently (I remember
> that Intel Medfield has interesting settings, but I don't remember if DMA
> channels are different inside the same controller).
> 
> Vineet, do you have any information that Synopsys customers synthesized DMA
> controllers with different channel characteristics inside one DMA IP?

AFAICS the DW DMAC channels can be synthesized with different max block sizes.
The IP core supports such a configuration. So we can't assume that such a DMAC
release can't be found in real hardware just because we've never seen one,
no matter what Vineet has to say in response to your question.

> 
> ...
> 
> >  #include <linux/bitops.h>
> 
> > +#include <linux/device.h>
> 
> Isn't enough to supply
> 
> struct device;
> 
> ?

It's "struct device_dma_parameters" and I'd prefer to include the header file.

> 
> >  #include <linux/interrupt.h>
> >  #include <linux/dmaengine.h>
> 
> Also this change needs a separate patch I suppose.

Ah, just discovered there is no need to add the dma_parms here, because since
commit 7c8978c0837d ("driver core: platform: Initialize dma_parms for platform
devices") the dma_parms pointer is already initialized. The same thing is done
for the PCI device too.

-Sergey

> 
> ...
> 
> > -	struct dma_device	dma;
> > -	char			name[20];
> > -	void __iomem		*regs;
> > -	struct dma_pool		*desc_pool;
> > -	struct tasklet_struct	tasklet;
> > +	struct dma_device		dma;
> > +	struct device_dma_parameters	dma_parms;
> > +	char				name[20];
> > +	void __iomem			*regs;
> > +	struct dma_pool			*desc_pool;
> > +	struct tasklet_struct		tasklet;
> >  
> >  	/* channels */
> > -	struct dw_dma_chan	*chan;
> > -	u8			all_chan_mask;
> > -	u8			in_use;
> > +	struct dw_dma_chan		*chan;
> > +	u8				all_chan_mask;
> > +	u8				in_use;
> 
> Please split formatting fixes into a separate patch.
> 
> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-11 18:32                 ` Serge Semin
@ 2020-05-11 21:32                   ` Mark Brown
  0 siblings, 0 replies; 72+ messages in thread
From: Mark Brown @ 2020-05-11 21:32 UTC (permalink / raw)
  To: Serge Semin
  Cc: Serge Semin, Andy Shevchenko, Vinod Koul, Viresh Kumar,
	Dan Williams, Alexey Malahov, Thomas Bogendoerfer, Paul Burton,
	Ralf Baechle, Arnd Bergmann, Rob Herring, linux-mips, devicetree,
	dmaengine, linux-kernel


On Mon, May 11, 2020 at 09:32:47PM +0300, Serge Semin wrote:

> Thanks for the explanation. Max segment size being set to the DMA controller generic
> device should work well. There is no need in setting the transfer and messages
> size limitations. Besides I don't really see the
> max_transfer_size/max_message_size callbacks utilized in the SPI core. These
> functions are called in the spi-mem.c driver only. Do I miss something?

We really should validate them in the core but really they're intended
for client drivers (like spi-mem kind of is) to allow them to adapt the
sizes of requests they're generating so the core never sees anything
that's too big.  For the transfers we have a spi_split_transfers_maxsize()
helper if anything wants to use it.  Fortunately there's not that many
controllers with low enough limits to worry about so actual usage hasn't
been that high.



* Re: [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property
  2020-05-11 21:01         ` Andy Shevchenko
@ 2020-05-11 21:35           ` Serge Semin
  2020-05-12  9:08             ` Andy Shevchenko
  0 siblings, 1 reply; 72+ messages in thread
From: Serge Semin @ 2020-05-11 21:35 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Serge Semin, Vinod Koul, Viresh Kumar, Rob Herring,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Dan Williams, linux-mips, dmaengine, devicetree,
	linux-kernel

On Tue, May 12, 2020 at 12:01:38AM +0300, Andy Shevchenko wrote:
> On Mon, May 11, 2020 at 11:05:28PM +0300, Serge Semin wrote:
> > On Fri, May 08, 2020 at 02:12:42PM +0300, Andy Shevchenko wrote:
> > > On Fri, May 08, 2020 at 01:53:00PM +0300, Serge Semin wrote:
> > > > This array property is used to indicate the maximum burst transaction
> > > > length supported by each DMA channel.
> > > 
> > > > +  snps,max-burst-len:
> > > > +    $ref: /schemas/types.yaml#/definitions/uint32-array
> > > > +    description: |
> > > > +      Maximum length of burst transactions supported by hardware.
> > > > +      It's an array property with one cell per channel in units of
> > > > +      CTLx register SRC_TR_WIDTH/DST_TR_WIDTH (data-width) field.
> > > > +    items:
> > > > +      maxItems: 8
> > > > +      items:
> > > 
> > > > +        enum: [4, 8, 16, 32, 64, 128, 256]
> > > 
> > > Isn't 1 allowed?
> > 
> > Burst length of 1 unit is supported, but in accordance with Data Book the MAX
> > burst length is limited to be equal to a value from the set I submitted. So the
> > max value can be either 4, or 8, or 16 and so on.
> 
> Hmm... It seems you mistakenly took here DMAH_CHx_MAX_MULT_SIZE pre-silicon
> configuration parameter instead of runtime as described in Table 26:
> CTLx.SRC_MSIZE and DEST_MSIZE Decoding.

No. You misunderstood what I meant. We shouldn't use runtime parameter values
here. Why would we? The property "snps,max-burst-len" matches the
DMAH_CHx_MAX_MULT_SIZE config parameter. See the comment to the "SRC_MSIZE" and
"DEST_MSIZE" fields of the registers. You'll find out that their maximum value
is determined by the DMAH_CHx_MAX_MULT_SIZE parameter, which must belong to the
set [4, 8, 16, 32, 64, 128, 256]. So no matter how you synthesize the DW DMAC
block you'll have at least a 4x max burst length supported.

-Sergey

> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


* Re: [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property
  2020-05-11 21:35           ` Serge Semin
@ 2020-05-12  9:08             ` Andy Shevchenko
  2020-05-12 11:49               ` Serge Semin
  0 siblings, 1 reply; 72+ messages in thread
From: Andy Shevchenko @ 2020-05-12  9:08 UTC (permalink / raw)
  To: Serge Semin
  Cc: Serge Semin, Vinod Koul, Viresh Kumar, Rob Herring,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Dan Williams, linux-mips, dmaengine, devicetree,
	linux-kernel

On Tue, May 12, 2020 at 12:35:31AM +0300, Serge Semin wrote:
> On Tue, May 12, 2020 at 12:01:38AM +0300, Andy Shevchenko wrote:
> > On Mon, May 11, 2020 at 11:05:28PM +0300, Serge Semin wrote:
> > > On Fri, May 08, 2020 at 02:12:42PM +0300, Andy Shevchenko wrote:
> > > > On Fri, May 08, 2020 at 01:53:00PM +0300, Serge Semin wrote:
> > > > > This array property is used to indicate the maximum burst transaction
> > > > > length supported by each DMA channel.
> > > > 
> > > > > +  snps,max-burst-len:
> > > > > +    $ref: /schemas/types.yaml#/definitions/uint32-array
> > > > > +    description: |
> > > > > +      Maximum length of burst transactions supported by hardware.
> > > > > +      It's an array property with one cell per channel in units of
> > > > > +      CTLx register SRC_TR_WIDTH/DST_TR_WIDTH (data-width) field.
> > > > > +    items:
> > > > > +      maxItems: 8
> > > > > +      items:
> > > > 
> > > > > +        enum: [4, 8, 16, 32, 64, 128, 256]
> > > > 
> > > > Isn't 1 allowed?
> > > 
> > > Burst length of 1 unit is supported, but in accordance with Data Book the MAX
> > > burst length is limited to be equal to a value from the set I submitted. So the
> > > max value can be either 4, or 8, or 16 and so on.
> > 
> > Hmm... It seems you mistakenly took here DMAH_CHx_MAX_MULT_SIZE pre-silicon
> > configuration parameter instead of runtime as described in Table 26:
> > CTLx.SRC_MSIZE and DEST_MSIZE Decoding.
> 
> No. You misunderstood what I meant. We shouldn't use a runtime parameters values
> here. Why would we?

Because what we describe in the DTS is what the user may do to the hardware. In
some cases the user might want to limit this to 1; how to achieve that?

Rob, is there any clarification on whether the schema describes only synthesized
values? Or i.o.w. shall we allow the user to set up whatever the hardware
supports at run time?

> Property "snps,max-burst-len" matches DMAH_CHx_MAX_MULT_SIZE
> config parameter.

Why? The user should have the possibility to ask for whatever the hardware
supports at run time.

> See a comment to the "SRC_MSIZE" and "DEST_MSIZE" fields of the
> registers. You'll find out that their maximum value is determined by the
> DMAH_CHx_MAX_MULT_SIZE parameter, which must belong to the set [4, 8, 16, 32, 64,
> 128, 256]. So no matter how you synthesize the DW DMAC block you'll have at least
> 4x max burst length supported.

That's true.


-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property
  2020-05-12  9:08             ` Andy Shevchenko
@ 2020-05-12 11:49               ` Serge Semin
  2020-05-12 12:38                 ` Andy Shevchenko
  0 siblings, 1 reply; 72+ messages in thread
From: Serge Semin @ 2020-05-12 11:49 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Serge Semin, Vinod Koul, Viresh Kumar, Rob Herring,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Dan Williams, linux-mips, dmaengine, devicetree,
	linux-kernel

On Tue, May 12, 2020 at 12:08:04PM +0300, Andy Shevchenko wrote:
> On Tue, May 12, 2020 at 12:35:31AM +0300, Serge Semin wrote:
> > On Tue, May 12, 2020 at 12:01:38AM +0300, Andy Shevchenko wrote:
> > > On Mon, May 11, 2020 at 11:05:28PM +0300, Serge Semin wrote:
> > > > On Fri, May 08, 2020 at 02:12:42PM +0300, Andy Shevchenko wrote:
> > > > > On Fri, May 08, 2020 at 01:53:00PM +0300, Serge Semin wrote:
> > > > > > This array property is used to indicate the maximum burst transaction
> > > > > > length supported by each DMA channel.
> > > > > 
> > > > > > +  snps,max-burst-len:
> > > > > > +    $ref: /schemas/types.yaml#/definitions/uint32-array
> > > > > > +    description: |
> > > > > > +      Maximum length of burst transactions supported by hardware.
> > > > > > +      It's an array property with one cell per channel in units of
> > > > > > +      CTLx register SRC_TR_WIDTH/DST_TR_WIDTH (data-width) field.
> > > > > > +    items:
> > > > > > +      maxItems: 8
> > > > > > +      items:
> > > > > 
> > > > > > +        enum: [4, 8, 16, 32, 64, 128, 256]
> > > > > 
> > > > > Isn't 1 allowed?
> > > > 
> > > > Burst length of 1 unit is supported, but in accordance with Data Book the MAX
> > > > burst length is limited to be equal to a value from the set I submitted. So the
> > > > max value can be either 4, or 8, or 16 and so on.
> > > 
> > > Hmm... It seems you mistakenly took here DMAH_CHx_MAX_MULT_SIZE pre-silicon
> > > configuration parameter instead of runtime as described in Table 26:
> > > CTLx.SRC_MSIZE and DEST_MSIZE Decoding.
> > 
> > No. You misunderstood what I meant. We shouldn't use a runtime parameters values
> > here. Why would we?
> 
> Because what we describe in the DTS is what user may do to the hardware. In
> some cases user might want to limit this to 1, how to achieve that?

No, a dts isn't about hardware configuration, it's about hardware description. It's
not about what the user wants, it's about what the hardware can and can't do. If a
developer wants to limit it to 1, they need to do this in software. The IP-core just
can't be synthesized with such a limitation. No matter what, it must be no less than
4, as I described in the enum setting.

> 
> Rob, is there any clarification that schema describes only synthesized values?
> Or i.o.w. shall we allow user to setup whatever hardware supports at run time?

One more time. max-burst-len set to 1 wouldn't describe the real hardware capability,
because the DW DMAC IP-core simply can't be synthesized with such a max-burst-len.
In this patch I submitted the "max-burst-len" property, not just a "burst-len"
setting.

> 
> > Property "snps,max-burst-len" matches DMAH_CHx_MAX_MULT_SIZE
> > config parameter.
> 
> Why? User should have a possibility to ask whatever hardware supports at run time.

Because the runtime parameter is limited by the DMAH_CHx_MAX_MULT_SIZE value (you
agree with that further below), and "snps,max-burst-len" is about a hardware
limitation. For the same reason the dma-channels property is limited to the range
1 - 8, the dma-masters number must be within 1 - 4, block_size must be one of the
set [3, 7, 15, 31, 63, 127, 255, 511, 1023, 2047, 4095] and so on. For instance,
the block size can be set to any value, but not greater than the value of the
"block-size" property found in the dt node or retrieved from the corresponding IP
param register. It's not about what the user wants, but what the hardware can
support.
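
To illustrate the difference between the synthesized maximum and a runtime
burst setting, here is a minimal user-space sketch (not the driver code). It
assumes the CTLx.SRC_MSIZE/DEST_MSIZE encoding from the decoding table
mentioned above (0 means 1 item, 1 means 4, 2 means 8, and so on up to 7 for
256). The helper names max_burst_is_valid() and encode_msize() are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: a legal DMAH_CHx_MAX_MULT_SIZE value is a power of
 * two in the range [4, 256], i.e. one of the enum values in the binding.
 */
static bool max_burst_is_valid(uint32_t val)
{
	return val >= 4 && val <= 256 && (val & (val - 1)) == 0;
}

/*
 * Hypothetical helper: clamp the requested burst length (in data items) to
 * the channel's synthesized maximum and return the MSIZE field encoding.
 * Software may still ask for a single-item burst (encoding 0), even though
 * the maximum itself can never be synthesized below 4.
 */
static uint32_t encode_msize(uint32_t requested, uint32_t chan_max)
{
	uint32_t burst, msize;

	assert(max_burst_is_valid(chan_max));

	burst = requested < chan_max ? requested : chan_max;

	/* Anything below 4 items falls back to single-item bursts. */
	if (burst < 4)
		return 0;

	/* Round down to a power of two: 4 -> 1, 8 -> 2, ..., 256 -> 7. */
	msize = 1;
	while (msize < 7 && (4u << msize) <= burst)
		msize++;

	return msize;
}
```

With a channel maximum of 4, a request for 16 is reduced to encoding 1 (4
items), while a request for 1 is still honoured as encoding 0.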

-Sergey

> 
> > See a comment to the "SRC_MSIZE" and "DEST_MSIZE" fields of the
> > registers. You'll find out that their maximum value is determined by the
> > DMAH_CHx_MAX_MULT_SIZE parameter, which must belong to the set [4, 8, 16, 32, 64,
> > 128, 256]. So no matter how you synthesize the DW DMAC block you'll have at least
> > 4x max burst length supported.
> 
> That's true.
> 
> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


* Re: [PATCH v2 3/6] dmaengine: dw: Set DMA device max segment size parameter
  2020-05-11 21:16       ` Serge Semin
@ 2020-05-12 12:35         ` Andy Shevchenko
  2020-05-12 17:01           ` Serge Semin
  2020-05-15  6:16           ` Vinod Koul
  0 siblings, 2 replies; 72+ messages in thread
From: Andy Shevchenko @ 2020-05-12 12:35 UTC (permalink / raw)
  To: Serge Semin
  Cc: Serge Semin, Vineet Gupta, Vinod Koul, Viresh Kumar,
	Dan Williams, Alexey Malahov, Thomas Bogendoerfer, Paul Burton,
	Ralf Baechle, Arnd Bergmann, Rob Herring, linux-mips, devicetree,
	dmaengine, linux-kernel

On Tue, May 12, 2020 at 12:16:22AM +0300, Serge Semin wrote:
> On Fri, May 08, 2020 at 02:21:52PM +0300, Andy Shevchenko wrote:
> > On Fri, May 08, 2020 at 01:53:01PM +0300, Serge Semin wrote:
> > > Maximum block size DW DMAC configuration corresponds to the max segment
> > > size DMA parameter in the DMA core subsystem notation. Lets set it with a
> > > value specific to the probed DW DMA controller. It shall help the DMA
> > > clients to create size-optimized SG-list items for the controller. This in
> > > turn will cause less dw_desc allocations, less LLP reinitializations,
> > > better DMA device performance.

> > Yeah, I have locally something like this and I didn't dare to upstream because
> > there is an issue. We have this information per DMA controller, while we
> > actually need this on per DMA channel basis.
> > 
> > Above will work only for synthesized DMA with all channels having same block
> > size. That's why above conditional is not needed anyway.
> 
> Hm, I don't really see why the conditional isn't needed and this won't work. As
> you can see in the loop above Initially I find a maximum of all channels maximum
> block sizes and use it then as a max segment size parameter for the whole device.
> If the DW DMA controller has the same max block size of all channels, then it
> will be found. If the channels've been synthesized with different block sizes,
> then the optimization will work for the one with greatest block size. The SG
> list entries of the channels with lesser max block size will be split up
> by the DW DMAC driver, which would have been done anyway without
> max_segment_size being set. Here we at least provide the optimization for the
> channels with greatest max block size.
> 
> I do understand that it would be good to have this parameter setup on per generic
> DMA channel descriptor basis. But DMA core and device descriptor doesn't provide
> such facility, so setting at least some justified value is a good idea.
> 
> > 
> > OTOH, I never saw the DesignWare DMA to be synthesized differently (I remember
> > that Intel Medfield has interesting settings, but I don't remember if DMA
> > channels are different inside the same controller).
> > 
> > Vineet, do you have any information that Synopsys customers synthesized DMA
> > controllers with different channel characteristics inside one DMA IP?
> 
> AFAICS the DW DMAC channels can be synthesized with different max block size.
> The IP core supports such configuration. So we can't assume that such DMAC
> release can't be found in a real hardware just because we've never seen one.
> No matter what Vineet will have to say in response to your question.

My point here is that we can probably avoid the complications until we have real
hardware where it's different. As I said, I don't remember any such hardware,
except *maybe* Intel Medfield, which is quite outdated and not supported for a
wider audience anyway.

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property
  2020-05-12 11:49               ` Serge Semin
@ 2020-05-12 12:38                 ` Andy Shevchenko
  2020-05-15  6:09                   ` Vinod Koul
  0 siblings, 1 reply; 72+ messages in thread
From: Andy Shevchenko @ 2020-05-12 12:38 UTC (permalink / raw)
  To: Serge Semin
  Cc: Serge Semin, Vinod Koul, Viresh Kumar, Rob Herring,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Dan Williams, linux-mips, dmaengine, devicetree,
	linux-kernel

On Tue, May 12, 2020 at 02:49:46PM +0300, Serge Semin wrote:
> On Tue, May 12, 2020 at 12:08:04PM +0300, Andy Shevchenko wrote:
> > On Tue, May 12, 2020 at 12:35:31AM +0300, Serge Semin wrote:
> > > On Tue, May 12, 2020 at 12:01:38AM +0300, Andy Shevchenko wrote:
> > > > On Mon, May 11, 2020 at 11:05:28PM +0300, Serge Semin wrote:
> > > > > On Fri, May 08, 2020 at 02:12:42PM +0300, Andy Shevchenko wrote:
> > > > > > On Fri, May 08, 2020 at 01:53:00PM +0300, Serge Semin wrote:

...

I'll leave it to Rob and Vinod.
It won't break our case, so feel free to go with your approach.

P.S. Perhaps at some point we need to
1) convert properties to be u32 (it will simplify things);
2) convert legacy ones to proper format ('-' instead of '_', vendor prefix added);
3) parse them in core with device property API.

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-11 21:08                     ` Andy Shevchenko
@ 2020-05-12 12:42                       ` Serge Semin
  2020-05-15  6:30                         ` Vinod Koul
  0 siblings, 1 reply; 72+ messages in thread
From: Serge Semin @ 2020-05-12 12:42 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Serge Semin, Mark Brown, Vinod Koul, Viresh Kumar, Dan Williams,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Rob Herring, linux-mips, devicetree, dmaengine,
	Linux Kernel Mailing List

Vinod,

Could you join the discussion for a little bit?

In order to properly fix the problem discussed in this thread, we need to
introduce an additional capability exported by DMA channel handlers on a
per-channel basis. It must be a number that indicates the upper limit on the
number of SG list entries.
Something like this would do it:
struct dma_slave_caps {
...
	unsigned int max_sg_nents;
...
};
As Andy suggested, its value should be interpreted as:
0          - unlimited number of entries,
1:MAX_UINT - actual limit to the number of entries.

In addition to that, seeing that the dma_get_slave_caps() method provides the caps
only by getting them from the DMA device descriptor, while we need to have the
info on a per-channel basis, it would be good to introduce a new DMA-device
callback like:
struct dma_device {
...
	int (*device_caps)(struct dma_chan *chan,
			   struct dma_slave_caps *caps);
...
};
So the DMA driver could override the generic DMA device capabilities with the
values specific to the DMA channels. Such functionality will also be helpful for
the max-burst-len parameter introduced by this patchset, since depending on the
IP-core synthesis parameters it may be channel-specific.

Alternatively we could just introduce new fields to the dma_chan structure and
retrieve the new caps values from them in the dma_get_slave_caps() method.
Though I like the solution with the callback better.

What is your opinion about this? Which solution would you prefer?
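
To make the proposal a bit more tangible, below is a rough user-space sketch
of how the callback could be wired in. It is only an illustration under stated
assumptions: the structures are trimmed-down stand-ins for the kernel ones,
get_slave_caps_sketch() is a hypothetical model of dma_get_slave_caps(), and
the numbers in example_device_caps() are made up.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct dma_chan;

/* Trimmed-down stand-in for the kernel's struct dma_slave_caps. */
struct dma_slave_caps {
	unsigned int max_burst;
	unsigned int max_sg_nents;	/* proposed: 0 means "unlimited" */
};

/* Trimmed-down stand-in for the kernel's struct dma_device. */
struct dma_device {
	unsigned int max_burst;		/* device-wide default */
	/* Proposed optional per-channel override. */
	int (*device_caps)(struct dma_chan *chan,
			   struct dma_slave_caps *caps);
};

struct dma_chan {
	struct dma_device *device;
	unsigned int chan_id;
};

/*
 * Hypothetical model of dma_get_slave_caps(): fill in the device-wide
 * values first, then let the driver adjust them for this channel.
 */
static int get_slave_caps_sketch(struct dma_chan *chan,
				 struct dma_slave_caps *caps)
{
	struct dma_device *dev;

	if (!chan || !caps)
		return -EINVAL;

	dev = chan->device;
	assert(dev != NULL);

	caps->max_burst = dev->max_burst;
	caps->max_sg_nents = 0;		/* unlimited unless overridden */

	if (dev->device_caps)
		return dev->device_caps(chan, caps);

	return 0;
}

/* Example driver callback; the per-channel values are made up. */
static int example_device_caps(struct dma_chan *chan,
			       struct dma_slave_caps *caps)
{
	caps->max_burst = chan->chan_id < 2 ? 16 : 4;
	caps->max_sg_nents = 1;	/* no multi-block LLP: one entry at a time */
	return 0;
}
```

A client would then see max_sg_nents == 1 for such a channel and submit the SG
entries one by one, while drivers without the callback keep reporting the
device-wide defaults.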

On Tue, May 12, 2020 at 12:08:00AM +0300, Andy Shevchenko wrote:
> On Tue, May 12, 2020 at 12:07:14AM +0300, Andy Shevchenko wrote:
> > On Mon, May 11, 2020 at 10:32:55PM +0300, Serge Semin wrote:
> > > On Mon, May 11, 2020 at 04:58:53PM +0300, Andy Shevchenko wrote:
> > > > On Mon, May 11, 2020 at 4:48 PM Serge Semin
> > > > <Sergey.Semin@baikalelectronics.ru> wrote:
> > > > >
> > > > > On Mon, May 11, 2020 at 12:58:13PM +0100, Mark Brown wrote:
> > > > > > On Mon, May 11, 2020 at 05:10:16AM +0300, Serge Semin wrote:
> > > > > >
> > > > > > > Alas linearizing the SPI messages won't help in this case because the DW DMA
> > > > > > > driver will split it into the max transaction chunks anyway.
> > > > > >
> > > > > > That sounds like you need to also impose a limit on the maximum message
> > > > > > size as well then, with that you should be able to handle messages up
> > > > > > to whatever that limit is.  There's code for that bit already, so long
> > > > > > as the limit is not too low it should be fine for most devices and
> > > > > > client drivers can see the limit so they can be updated to work with it
> > > > > > if needed.
> > > > >
> > > > > Hmm, this might work. The problem will be with imposing such limitation through
> > > > > the DW APB SSI driver. In order to do this I need to know:
> > > > > 1) Whether multi-block LLP is supported by the DW DMA controller.
> > > > > 2) Maximum DW DMA transfer block size.
> > > > > Then I'll be able to use this information in the can_dma() callback to enable
> > > > > the DMA xfers only for the safe transfers. Did you mean something like this when
> > > > > you said "There's code for that bit already" ? If you meant the max_dma_len
> > > > > parameter, then setting it won't work, because it just limits the SG items size
> > > > > not the total length of a single transfer.
> > > > >
> > > > > So the question is of how to export the multi-block LLP flag from DW DMAc
> > > > > driver. Andy?
> > > > 
> > > > I'm not sure I understand why do you need this being exported. Just
> > > > always supply SG list out of single entry and define the length
> > > > according to the maximum segment size (it's done IIRC in SPI core).
> > > 
> > > Finally I see your point. So you suggest to feed the DMA engine with SG list
> > > entries one-by-one instead of sending all of them at once in a single
> > > dmaengine_prep_slave_sg() -> dmaengine_submit() -> dma_async_issue_pending()
> > > session. Hm, this solution will work, but there is an issue. There is no
> > > guarantee, that Tx and Rx SG lists are symmetric, consisting of the same
> > > number of items with the same sizes. It depends on the Tx/Rx buffers physical
> > > address alignment and their offsets within the memory pages. Though this
> > > problem can be solved by making the Tx and Rx SG lists symmetric. I'll have
> > > to implement a clever DMA IO loop, which would extract the DMA
> > > addresses/lengths from the SG entries and perform the single-buffer DMA 
> > > transactions with the DMA buffers of the same length.
> > > 
> > > Regarding noLLP being exported. Obviously I intended to solve the problem in a
> > > generic way since the problem is common for noLLP DW APB SSI/DW DMAC combination.
> > > In order to do this we need to know whether the multi-block LLP feature is
> > > unsupported by the DW DMA controller. We either make such info somehow exported
> > > from the DW DMA driver, so the DMA clients (like Dw APB SSI controller driver)
> > > could be ready to work around the problem; or just implement a flag-based quirk
> > > in the DMA client driver, which would be enabled in the platform-specific basis
> > > depending on the platform device actually detected (for instance, a specific
> > > version of the DW APB SSI IP). AFAICS You'd prefer the later option. 
> > 
> > So, we may extend the struct of DMA parameters to tell the consumer amount of
> > entries (each of which is no longer than maximum segment size) it can afford:
> > - 0: Auto (DMA driver handles any cases itself)
> > - 1: Only single entry
> > - 2: Up to two...
> 
> It will leave out the implementation details (or i.o.w. the obstacles or
> limitations) of why the DMA can't do otherwise.

Sounds good. Thanks for the assistance.

-Sergey

> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


* Re: [PATCH v2 5/6] dmaengine: dw: Introduce max burst length hw config
  2020-05-08 11:41     ` Andy Shevchenko
@ 2020-05-12 14:08       ` Serge Semin
  2020-05-12 19:12         ` Andy Shevchenko
  0 siblings, 1 reply; 72+ messages in thread
From: Serge Semin @ 2020-05-12 14:08 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Serge Semin, Vinod Koul, Viresh Kumar, Dan Williams,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Rob Herring, linux-mips, devicetree, dmaengine,
	linux-kernel

On Fri, May 08, 2020 at 02:41:53PM +0300, Andy Shevchenko wrote:
> On Fri, May 08, 2020 at 01:53:03PM +0300, Serge Semin wrote:
> > IP core of the DW DMA controller may be synthesized with different
> > max burst length of the transfers per each channel. According to Synopsys
> > having the fixed maximum burst transactions length may provide some
> > performance gain. At the same time setting up the source and destination
> > multi size exceeding the max burst length limitation may cause a serious
> > problems. In our case the system just hangs up. In order to fix this
> > lets introduce the max burst length platform config of the DW DMA
> > controller device and don't let the DMA channels configuration code
> > exceed the burst length hardware limitation. Depending on the IP core
> > configuration the maximum value can vary from channel to channel.
> > It can be detected either in runtime from the DWC parameter registers
> > or from the dedicated dts property.
> 
> I'm wondering what can be the scenario when your peripheral will ask something
> which is not supported by DMA controller?

I may have misunderstood your statement, because given your activity around my
patchsets, including the SPI patchset, and your sometimes very helpful comments,
the answer to this question seems too obvious for you to be asking it.

No need to go far for an example. See the DW APB SSI driver. Its DMA module
specifies the burst length to be 16, while not all of our channels support it.
Yes, originally it was developed for the Intel Medfield SPI, but since I
converted the driver into generic code we can't use a fixed value. For instance,
in our hardware only two DMA channels out of the total eight are capable of
bursting up to 16 bytes (data items) at a time; the rest of them are limited to a
burst length of up to 4 bytes. Meanwhile there are two SPI interfaces, each of
which needs two DMA channels for communication. So I need to allocate four
channels in total to provide the DMA capability for all interfaces. In order to
set the SPI controller up with valid optimized parameters, the max-burst-length
is required. Otherwise we can end up with buffer overruns/underruns.
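
As a rough illustration of the client side, here is a minimal sketch of how a
DMA client could pick its burst length once the channel limit is known. The
helper pick_burst() is hypothetical, and treating a reported limit of 0 as
"limit not reported" is an assumption of this sketch, not something taken from
the driver.

```c
#include <assert.h>

/*
 * Hypothetical helper: the client prefers @preferred items per burst
 * (16 in the SPI example) but must not exceed what the allocated channel
 * can do. A reported limit of 0 is treated as "limit not reported".
 */
static unsigned int pick_burst(unsigned int preferred,
			       unsigned int chan_max_burst)
{
	assert(preferred > 0);

	if (chan_max_burst == 0 || preferred <= chan_max_burst)
		return preferred;

	return chan_max_burst;
}
```

With the numbers above, a client preferring 16 gets 16 on one of the two
capable channels and 4 on any of the others, so the same generic code works
whichever channel the DMA subsystem returns.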

> 
> Peripheral needs to supply a lot of configuration parameters specific to the
> DMA controller in use (that's why we have struct dw_dma_slave).
> So, seems to me the feasible approach is supply correct data in the first place.

How can clients supply valid data if they don't know the DMA controller
limitations in general?

> 
> If you have specific channels to acquire then you probably need to provide a
> custom xlate / filter functions. Because above seems a bit hackish workaround
> of dynamic channel allocation mechanism.

No, I don't have a specific channel to acquire, and in general you may use any
channel returned from the DMA subsystem (though some platforms may need dedicated
channels, in which case xlate / filter is required). In our SoC any DW DMAC
channel can be used for any DMA-capable peripheral like SPI, I2C, UART. But
their DMA settings must be properly and optimally configured. That can only be
done if you know the DMA controller parameters like max burst length, max
block-size, etc.

So no. The change proposed by this patch isn't a workaround, but a useful feature,
moreover one expected to be supported by the generic DMA subsystem.

> 
> But let's see what we can do better. Since maximum is defined on the slave side
> device, it probably needs to define minimum as well, otherwise it's possible
> that some hardware can't cope underrun bursts.

There is no need to define a minimum if no such limit exists other than the
natural 1. Moreover, it apparently doesn't exist for any DMA controller, seeing
that no one has added such a capability to the generic DMA subsystem so far.

-Sergey

> 
> Vinod, what do you think?
> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


* Re: [PATCH v2 3/6] dmaengine: dw: Set DMA device max segment size parameter
  2020-05-12 12:35         ` Andy Shevchenko
@ 2020-05-12 17:01           ` Serge Semin
  2020-05-15  6:16           ` Vinod Koul
  1 sibling, 0 replies; 72+ messages in thread
From: Serge Semin @ 2020-05-12 17:01 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Serge Semin, Vineet Gupta, Vinod Koul, Viresh Kumar,
	Dan Williams, Alexey Malahov, Thomas Bogendoerfer, Paul Burton,
	Ralf Baechle, Arnd Bergmann, Rob Herring, linux-mips, devicetree,
	dmaengine, linux-kernel

On Tue, May 12, 2020 at 03:35:51PM +0300, Andy Shevchenko wrote:
> On Tue, May 12, 2020 at 12:16:22AM +0300, Serge Semin wrote:
> > On Fri, May 08, 2020 at 02:21:52PM +0300, Andy Shevchenko wrote:
> > > On Fri, May 08, 2020 at 01:53:01PM +0300, Serge Semin wrote:
> > > > Maximum block size DW DMAC configuration corresponds to the max segment
> > > > size DMA parameter in the DMA core subsystem notation. Lets set it with a
> > > > value specific to the probed DW DMA controller. It shall help the DMA
> > > > clients to create size-optimized SG-list items for the controller. This in
> > > > turn will cause less dw_desc allocations, less LLP reinitializations,
> > > > better DMA device performance.
> 
> > > Yeah, I have locally something like this and I didn't dare to upstream because
> > > there is an issue. We have this information per DMA controller, while we
> > > actually need this on per DMA channel basis.
> > > 
> > > Above will work only for synthesized DMA with all channels having same block
> > > size. That's why above conditional is not needed anyway.
> > 
> > Hm, I don't really see why the conditional isn't needed and this won't work. As
> > you can see in the loop above Initially I find a maximum of all channels maximum
> > block sizes and use it then as a max segment size parameter for the whole device.
> > If the DW DMA controller has the same max block size of all channels, then it
> > will be found. If the channels've been synthesized with different block sizes,
> > then the optimization will work for the one with greatest block size. The SG
> > list entries of the channels with lesser max block size will be split up
> > by the DW DMAC driver, which would have been done anyway without
> > max_segment_size being set. Here we at least provide the optimization for the
> > channels with greatest max block size.
> > 
> > I do understand that it would be good to have this parameter setup on per generic
> > DMA channel descriptor basis. But DMA core and device descriptor doesn't provide
> > such facility, so setting at least some justified value is a good idea.
> > 
> > > 
> > > OTOH, I never saw the DesignWare DMA to be synthesized differently (I remember
> > > that Intel Medfield has interesting settings, but I don't remember if DMA
> > > channels are different inside the same controller).
> > > 
> > > Vineet, do you have any information that Synopsys customers synthesized DMA
> > > controllers with different channel characteristics inside one DMA IP?
> > 
> > AFAICS the DW DMAC channels can be synthesized with different max block size.
> > The IP core supports such configuration. So we can't assume that such DMAC
> > release can't be found in a real hardware just because we've never seen one.
> > No matter what Vineet will have to say in response to your question.
> 
> My point here that we probably can avoid complications till we have real
> hardware where it's different. As I said I don't remember a such, except
> *maybe* Intel Medfield, which is quite outdated and not supported for wider
> audience anyway.

I see your point. My position on this matter is different and is explained in the
previous emails. Let's see what Viresh and Vinod think of it.

-Sergey

> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


* Re: [PATCH v2 5/6] dmaengine: dw: Introduce max burst length hw config
  2020-05-12 14:08       ` Serge Semin
@ 2020-05-12 19:12         ` Andy Shevchenko
  2020-05-12 19:47           ` Serge Semin
  2020-05-15  6:39           ` Vinod Koul
  0 siblings, 2 replies; 72+ messages in thread
From: Andy Shevchenko @ 2020-05-12 19:12 UTC (permalink / raw)
  To: Serge Semin
  Cc: Serge Semin, Vinod Koul, Viresh Kumar, Dan Williams,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Rob Herring, linux-mips, devicetree, dmaengine,
	linux-kernel

On Tue, May 12, 2020 at 05:08:20PM +0300, Serge Semin wrote:
> On Fri, May 08, 2020 at 02:41:53PM +0300, Andy Shevchenko wrote:
> > On Fri, May 08, 2020 at 01:53:03PM +0300, Serge Semin wrote:
> > > IP core of the DW DMA controller may be synthesized with different
> > > max burst length of the transfers per each channel. According to Synopsys
> > > having the fixed maximum burst transactions length may provide some
> > > performance gain. At the same time setting up the source and destination
> > > multi size exceeding the max burst length limitation may cause a serious
> > > problems. In our case the system just hangs up. In order to fix this
> > > lets introduce the max burst length platform config of the DW DMA
> > > controller device and don't let the DMA channels configuration code
> > > exceed the burst length hardware limitation. Depending on the IP core
> > > configuration the maximum value can vary from channel to channel.
> > > It can be detected either in runtime from the DWC parameter registers
> > > or from the dedicated dts property.
> > 
> > I'm wondering what can be the scenario when your peripheral will ask something
> > which is not supported by DMA controller?
> 
> I may misunderstood your statement, because seeing your activity around my
> patchsets including the SPI patchset and sometimes very helpful comments,
> this question answer seems too obvious to see you asking it.
> 
> No need to go far for an example. See the DW APB SSI driver. Its DMA module
> specifies the burst length to be 16, while not all of ours channels supports it.
> Yes, originally it has been developed for the Intel Midfield SPI, but since I
> converted the driver into a generic code we can't use a fixed value. For instance
> in our hardware only two DMA channels of total 16 are capable of bursting up to
> 16 bytes (data items) at a time, the rest of them are limited with up to 4 bytes
> burst length. While there are two SPI interfaces, each of which need to have two
> DMA channels for communications. So I need four channels in total to allocate to
> provide the DMA capability for all interfaces. In order to set the SPI controller
> up with valid optimized parameters the max-burst-length is required. Otherwise we
> can end up with buffers overrun/underrun.

Right, and so we come to the question of which channels are better used by SPI
and which by the rest of the devices. Without a specific filter function you can
easily end up with inverted optimizations, where SPI gets channels with
burst = 4 while it needs 16, and other hardware gets the opposite.
Performance-wise that's the worst scenario, which we could avoid in the first
place, right?

> > Peripheral needs to supply a lot of configuration parameters specific to the
> > DMA controller in use (that's why we have struct dw_dma_slave).
> > So, seems to me the feasible approach is supply correct data in the first place.
> 
> How to supply a valid data if clients don't know the DMA controller limitations
> in general?

This is a good question. DMA controllers are quite different, and having a
unified capabilities structure for all of them is an almost impossible task to
fulfil. That's why custom filter function(s) can help here. Based on the
compatible string you can implement whatever customized quirks you need, for
example two functions: try a burst size of 16 first and fall back to 4 if no
channel was found.
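
The two-pass idea could look roughly like the user-space sketch below. It
only models the selection logic: struct chan_info, burst_filter() and the
request helpers are hypothetical stand-ins, not the dmaengine filter API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-channel description used only by this sketch. */
struct chan_info {
	unsigned int max_burst;
	bool busy;
};

/* Filter predicate: accept a free channel that can burst at least @min_burst. */
static bool burst_filter(const struct chan_info *c, unsigned int min_burst)
{
	return !c->busy && c->max_burst >= min_burst;
}

/* Claim the first channel that passes the filter, or return NULL. */
static struct chan_info *request_chan(struct chan_info *chans, size_t n,
				      unsigned int min_burst)
{
	for (size_t i = 0; i < n; i++) {
		if (burst_filter(&chans[i], min_burst)) {
			chans[i].busy = true;
			return &chans[i];
		}
	}

	return NULL;
}

/* Try the 16-item channels first, then fall back to the 4-item ones. */
static struct chan_info *request_best_chan(struct chan_info *chans, size_t n)
{
	struct chan_info *c;

	assert(chans != NULL);

	c = request_chan(chans, n, 16);
	if (!c)
		c = request_chan(chans, n, 4);

	return c;
}
```

Note that with this scheme whichever client probes first takes the fast
channels, so the outcome depends on probe order unless the platform pins the
channels explicitly.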

> > If you have specific channels to acquire then you probably need to provide a
> > custom xlate / filter functions. Because above seems a bit hackish workaround
> > of dynamic channel allocation mechanism.
> 
> No, I don't have a specific channel to acquire and in general you may use any
> returned from the DMA subsystem (though some platforms may need a dedicated
> channels to use, in this case xlate / filter is required). In our SoC any DW DMAC
> channel can be used for any DMA-capable peripherals like SPI, I2C, UART. But the
> their DMA settings must properly and optimally configured. It can be only done
> if you know the DMA controller parameters like max burst length, max block-size,
> etc.
> 
> So no. The change proposed by this patch isn't workaround, but a useful feature,
> moreover expected to be supported by the generic DMA subsystem.

See above.

> > But let's see what we can do better. Since maximum is defined on the slave side
> > device, it probably needs to define minimum as well, otherwise it's possible
> > that some hardware can't cope underrun bursts.
> 
> There is no need to define minimum if such limit doesn't exists except a
> natural 1. Moreover it doesn't exist for all DMA controllers seeing noone has
> added such capability into the generic DMA subsystem so far.

There is a contract between provider and consumer about the DMA resource. That's
why both sides should participate in fulfilling it. Theoretically there may be
hardware that, for some reason, doesn't support the minimum burst available in
the DMA. For such hardware we would need the minimum to be provided as well.

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v2 5/6] dmaengine: dw: Introduce max burst length hw config
  2020-05-12 19:12         ` Andy Shevchenko
@ 2020-05-12 19:47           ` Serge Semin
  2020-05-15 11:02             ` Andy Shevchenko
  2020-05-15  6:39           ` Vinod Koul
  1 sibling, 1 reply; 72+ messages in thread
From: Serge Semin @ 2020-05-12 19:47 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Serge Semin, Vinod Koul, Viresh Kumar, Dan Williams,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Rob Herring, linux-mips, devicetree, dmaengine,
	linux-kernel

On Tue, May 12, 2020 at 10:12:08PM +0300, Andy Shevchenko wrote:
> On Tue, May 12, 2020 at 05:08:20PM +0300, Serge Semin wrote:
> > On Fri, May 08, 2020 at 02:41:53PM +0300, Andy Shevchenko wrote:
> > > On Fri, May 08, 2020 at 01:53:03PM +0300, Serge Semin wrote:
> > > > IP core of the DW DMA controller may be synthesized with different
> > > > max burst length of the transfers per each channel. According to Synopsys
> > > > having the fixed maximum burst transactions length may provide some
> > > > performance gain. At the same time setting up the source and destination
> > > > multi size exceeding the max burst length limitation may cause a serious
> > > > problems. In our case the system just hangs up. In order to fix this
> > > > lets introduce the max burst length platform config of the DW DMA
> > > > controller device and don't let the DMA channels configuration code
> > > > exceed the burst length hardware limitation. Depending on the IP core
> > > > configuration the maximum value can vary from channel to channel.
> > > > It can be detected either in runtime from the DWC parameter registers
> > > > or from the dedicated dts property.
> > > 
> > > I'm wondering what can be the scenario when your peripheral will ask something
> > > which is not supported by DMA controller?
> > 
> > I may misunderstood your statement, because seeing your activity around my
> > patchsets including the SPI patchset and sometimes very helpful comments,
> > this question answer seems too obvious to see you asking it.
> > 
> > No need to go far for an example. See the DW APB SSI driver. Its DMA module
> > specifies the burst length to be 16, while not all of ours channels supports it.
> > Yes, originally it has been developed for the Intel Midfield SPI, but since I
> > converted the driver into a generic code we can't use a fixed value. For instance
> > in our hardware only two DMA channels of total 16 are capable of bursting up to
> > 16 bytes (data items) at a time, the rest of them are limited with up to 4 bytes
> > burst length. While there are two SPI interfaces, each of which need to have two
> > DMA channels for communications. So I need four channels in total to allocate to
> > provide the DMA capability for all interfaces. In order to set the SPI controller
> > up with valid optimized parameters the max-burst-length is required. Otherwise we
> > can end up with buffers overrun/underrun.
> 
> Right, and we come to the question which channel better to be used by SPI and
> the rest devices. Without specific filter function you can easily get into a
> case of inverted optimizations, when SPI got channels with burst = 4, while
> it's needed 16, and other hardware otherwise. Performance wise it's worse
> scenario which we may avoid in the first place, right?

If we start thinking the way you describe, we'll get stuck on the problem of which
interfaces should get the faster DMA channels and which should be left with the
slowest. In general this task can't be solved, because without any
application-specific requirements they are all equally valuable and deserve to
have the best resources allocated. So we shouldn't assume that some interface is
better or more valuable than another; therefore in generic DMA client code any
filtering is redundant.

> 
> > > Peripheral needs to supply a lot of configuration parameters specific to the
> > > DMA controller in use (that's why we have struct dw_dma_slave).
> > > So, seems to me the feasible approach is supply correct data in the first place.
> > 
> > How to supply a valid data if clients don't know the DMA controller limitations
> > in general?
> 
> This is a good question. DMA controllers are quite different and having unified
> capabilities structure for all is almost impossible task to fulfil. That's why
> custom filter function(s) can help here. Based on compatible string you can
> implement whatever customized quirks like two functions, for example, to try 16
> burst size first and fallback to 4 if none was previously found.

Right. As I said in the previous email it's up to the corresponding platforms to
decide the criteria of the filtering including the max-burst length value.
Even though the DW DMA channels resources aren't uniform on Baikal-T1 SoC I also
won't do the filter-based channel allocation, because I can't predict the SoC
application. Some of them may be used on a platform with active SPI interface
utilization, some with specific requirements to UARTs and so on.

> 
> > > If you have specific channels to acquire then you probably need to provide a
> > > custom xlate / filter functions. Because above seems a bit hackish workaround
> > > of dynamic channel allocation mechanism.
> > 
> > No, I don't have a specific channel to acquire and in general you may use any
> > returned from the DMA subsystem (though some platforms may need a dedicated
> > channels to use, in this case xlate / filter is required). In our SoC any DW DMAC
> > channel can be used for any DMA-capable peripherals like SPI, I2C, UART. But the
> > their DMA settings must properly and optimally configured. It can be only done
> > if you know the DMA controller parameters like max burst length, max block-size,
> > etc.
> > 
> > So no. The change proposed by this patch isn't workaround, but a useful feature,
> > moreover expected to be supported by the generic DMA subsystem.
> 
> See above.
> 
> > > But let's see what we can do better. Since maximum is defined on the slave side
> > > device, it probably needs to define minimum as well, otherwise it's possible
> > > that some hardware can't cope underrun bursts.
> > 
> > There is no need to define minimum if such limit doesn't exists except a
> > natural 1. Moreover it doesn't exist for all DMA controllers seeing noone has
> > added such capability into the generic DMA subsystem so far.
> 
> There is a contract between provider and consumer about DMA resource. That's
> why both sides should participate in fulfilling it. Theoretically it may be a
> hardware that doesn't support minimum burst available in DMA by a reason. For
> such we would need minimum to be provided as well.

I don't think 'theoretical' consideration counts when implementing something in
kernel. That 'theoretical' may never happen, but you'll end up supporting a
dummy functionality. Practicality is what kernel developers normally place
before anything else.

-Sergey

> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


* Re: [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property
  2020-05-12 12:38                 ` Andy Shevchenko
@ 2020-05-15  6:09                   ` Vinod Koul
  2020-05-15 10:51                     ` Andy Shevchenko
  0 siblings, 1 reply; 72+ messages in thread
From: Vinod Koul @ 2020-05-15  6:09 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Serge Semin, Serge Semin, Viresh Kumar, Rob Herring,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Dan Williams, linux-mips, dmaengine, devicetree,
	linux-kernel

On 12-05-20, 15:38, Andy Shevchenko wrote:
> On Tue, May 12, 2020 at 02:49:46PM +0300, Serge Semin wrote:
> > On Tue, May 12, 2020 at 12:08:04PM +0300, Andy Shevchenko wrote:
> > > On Tue, May 12, 2020 at 12:35:31AM +0300, Serge Semin wrote:
> > > > On Tue, May 12, 2020 at 12:01:38AM +0300, Andy Shevchenko wrote:
> > > > > On Mon, May 11, 2020 at 11:05:28PM +0300, Serge Semin wrote:
> > > > > > On Fri, May 08, 2020 at 02:12:42PM +0300, Andy Shevchenko wrote:
> > > > > > > On Fri, May 08, 2020 at 01:53:00PM +0300, Serge Semin wrote:
> 
> ...
> 
> I leave it to Rob and Vinod.
> It won't break our case, so, feel free with your approach.

I agree that DT is about describing the hardware, and it looks like a value of
1 is not allowed. If it is allowed, it should be added.

> P.S. Perhaps at some point we need to
> 1) convert properties to be u32 (it will simplify things);
> 2) convert legacy ones to proper format ('-' instead of '_', vendor prefix added);
> 3) parse them in core with device property API.

These suggestions are good and should be done.

-- 
~Vinod


* Re: [PATCH v2 3/6] dmaengine: dw: Set DMA device max segment size parameter
  2020-05-12 12:35         ` Andy Shevchenko
  2020-05-12 17:01           ` Serge Semin
@ 2020-05-15  6:16           ` Vinod Koul
  2020-05-15 10:53             ` Andy Shevchenko
  1 sibling, 1 reply; 72+ messages in thread
From: Vinod Koul @ 2020-05-15  6:16 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Serge Semin, Serge Semin, Vineet Gupta, Viresh Kumar,
	Dan Williams, Alexey Malahov, Thomas Bogendoerfer, Paul Burton,
	Ralf Baechle, Arnd Bergmann, Rob Herring, linux-mips, devicetree,
	dmaengine, linux-kernel

On 12-05-20, 15:35, Andy Shevchenko wrote:
> On Tue, May 12, 2020 at 12:16:22AM +0300, Serge Semin wrote:
> > On Fri, May 08, 2020 at 02:21:52PM +0300, Andy Shevchenko wrote:
> > > On Fri, May 08, 2020 at 01:53:01PM +0300, Serge Semin wrote:
> > > > Maximum block size DW DMAC configuration corresponds to the max segment
> > > > size DMA parameter in the DMA core subsystem notation. Lets set it with a
> > > > value specific to the probed DW DMA controller. It shall help the DMA
> > > > clients to create size-optimized SG-list items for the controller. This in
> > > > turn will cause less dw_desc allocations, less LLP reinitializations,
> > > > better DMA device performance.
> 
> > > Yeah, I have locally something like this and I didn't dare to upstream because
> > > there is an issue. We have this information per DMA controller, while we
> > > actually need this on per DMA channel basis.
> > > 
> > > Above will work only for synthesized DMA with all channels having same block
> > > size. That's why above conditional is not needed anyway.
> > 
> > Hm, I don't really see why the conditional isn't needed and this won't work. As
> > you can see in the loop above Initially I find a maximum of all channels maximum
> > block sizes and use it then as a max segment size parameter for the whole device.
> > If the DW DMA controller has the same max block size of all channels, then it
> > will be found. If the channels've been synthesized with different block sizes,
> > then the optimization will work for the one with greatest block size. The SG
> > list entries of the channels with lesser max block size will be split up
> > by the DW DMAC driver, which would have been done anyway without
> > max_segment_size being set. Here we at least provide the optimization for the
> > channels with greatest max block size.
> > 
> > I do understand that it would be good to have this parameter setup on per generic
> > DMA channel descriptor basis. But DMA core and device descriptor doesn't provide
> > such facility, so setting at least some justified value is a good idea.
> > 
> > > 
> > > OTOH, I never saw the DesignWare DMA to be synthesized differently (I remember
> > > that Intel Medfield has interesting settings, but I don't remember if DMA
> > > channels are different inside the same controller).
> > > 
> > > Vineet, do you have any information that Synopsys customers synthesized DMA
> > > controllers with different channel characteristics inside one DMA IP?
> > 
> > AFAICS the DW DMAC channels can be synthesized with different max block size.
> > The IP core supports such configuration. So we can't assume that such DMAC
> > release can't be found in a real hardware just because we've never seen one.
> > No matter what Vineet will have to say in response to your question.
> 
> My point here that we probably can avoid complications till we have real
> hardware where it's different. As I said I don't remember a such, except
> *maybe* Intel Medfield, which is quite outdated and not supported for wider
> audience anyway.

IIRC Intel Medfield has a couple of DMA controller instances, each one with
different parameters, *but* within each instance all channels have the same
configuration.

I do not recall seeing synthesis parameters on a per-channel
basis... But I may be wrong, it's been a while.
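
For reference, the approach quoted above (use the largest per-channel max block
size as the device-wide max segment size, and let channels with a smaller limit
split the segment) can be sketched in plain C as follows. This is a userspace
illustration only; the function names are hypothetical and sizes are in
whatever unit the block size is expressed in.

```c
#include <assert.h>
#include <stddef.h>

/* Device-wide max segment size: the largest of the per-channel max block sizes. */
static size_t max_seg_size(const size_t *chan_block_size, size_t nr_chans)
{
	size_t max = 0;

	for (size_t i = 0; i < nr_chans; i++)
		if (chan_block_size[i] > max)
			max = chan_block_size[i];

	return max;
}

/* Number of hardware blocks a segment of 'len' units needs on one channel. */
static size_t blocks_needed(size_t len, size_t chan_block_size)
{
	assert(chan_block_size > 0);

	/* Round up: a partial trailing block still costs one descriptor. */
	return (len + chan_block_size - 1) / chan_block_size;
}
```

A segment sized for the largest channel goes out as a single block there, and
is simply split into several blocks on a channel with a smaller limit, which is
the behaviour described in the quoted text.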

-- 
~Vinod


* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-12 12:42                       ` Serge Semin
@ 2020-05-15  6:30                         ` Vinod Koul
  2020-05-17 19:23                           ` Serge Semin
  0 siblings, 1 reply; 72+ messages in thread
From: Vinod Koul @ 2020-05-15  6:30 UTC (permalink / raw)
  To: Serge Semin
  Cc: Andy Shevchenko, Serge Semin, Mark Brown, Viresh Kumar,
	Dan Williams, Alexey Malahov, Thomas Bogendoerfer, Paul Burton,
	Ralf Baechle, Arnd Bergmann, Rob Herring, linux-mips, devicetree,
	dmaengine, Linux Kernel Mailing List

Hi Serge,

On 12-05-20, 15:42, Serge Semin wrote:
> Vinod,
> 
> Could you join the discussion for a little bit?
> 
> In order to properly fix the problem discussed in this topic, we need to
> introduce an additional capability exported by DMA channel handlers on per-channel
> basis. It must be a number, which would indicate an upper limitation of the SG list
> entries amount.
> Something like this would do it:
> struct dma_slave_caps {
> ...
> 	unsigned int max_sg_nents;
> ...

Looking at the discussion, I agree we should expose this in the
interface. max_dma_len conveys the allowed length of a descriptor;
it does not convey the number of sg_nents supported, which in the noLLP
case is one.

Btw, is this a real hardware issue? I have found that the value of such
hardware is very low, and people did fix it up in subsequent revs by adding
LLP support.

Also, another question is why this cannot be handled in the driver. I agree
your hardware does not support LLP, but that does not stop you from
breaking a multi-entry SG list into N hardware descriptors and submitting
them one after another (for this to work, submission should be done in the
ISR and not in the BH; unfortunately very few drivers take that route). TBH
max_sg_nents and max_dma_len are HW restrictions, and SW *can* deal with them :-)

In an ideal world, you would break the submitted sw descriptor into N hw
descriptors, submit them to the hardware, and let the user know when the sw
descriptor is completed. Of course we do not do that :(
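
The splitting step described here can be sketched as below. It is a minimal
userspace illustration, assuming a single linear transfer and a fixed maximum
block size; `struct hw_chunk` and `split_transfer()` are hypothetical names, and
the actual (re)submission of each chunk from the ISR is not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* One hardware-sized piece of a larger software transfer. */
struct hw_chunk {
	size_t offset;	/* start of the piece within the transfer */
	size_t len;	/* length of the piece, never above max_block */
};

/*
 * Split one software transfer of 'total' units into chunks no longer
 * than 'max_block'. Returns the number of chunks written to 'out', or
 * 0 if 'out' is too small (or if 'total' is 0).
 */
static size_t split_transfer(size_t total, size_t max_block,
			     struct hw_chunk *out, size_t out_cap)
{
	size_t n = 0, off = 0;

	assert(max_block > 0);

	while (off < total) {
		size_t len = total - off;

		if (len > max_block)
			len = max_block;
		if (n == out_cap)
			return 0;	/* caller's array is too small */

		out[n].offset = off;
		out[n].len = len;
		n++;
		off += len;
	}

	return n;
}
```

A driver following this idea would submit the chunks one by one and report
completion to the user only after the last chunk has finished.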

> };
> As Andy suggested it's value should be interpreted as:
> 0          - unlimited number of entries,
> 1:MAX_UINT - actual limit to the number of entries.

Hmm why 0, why not MAX_UINT for unlimited?
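
The two conventions being compared differ only in how the check on the client
side is written. A small sketch, with hypothetical helper names, assuming the
limit is an `unsigned int` as in the quoted proposal (standard C spells the
maximum `UINT_MAX`):

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Convention A (as quoted): 0 means "unlimited", anything else is the limit. */
static bool sg_fits_zero_unlimited(unsigned int max_sg_nents, unsigned int nents)
{
	return max_sg_nents == 0 || nents <= max_sg_nents;
}

/*
 * Convention B (as asked here): UINT_MAX means "unlimited", so a plain
 * comparison is enough. A limit of 0 would forbid every transfer, which
 * this sketch treats as a driver bug (an assumption, not a kernel rule).
 */
static bool sg_fits_uintmax_unlimited(unsigned int max_sg_nents, unsigned int nents)
{
	assert(max_sg_nents > 0);

	return nents <= max_sg_nents;
}
```

Convention A lets drivers that never set the field keep working with a
zero-initialized structure; convention B avoids the special case in the
comparison but requires every driver to fill the field in.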

> In addition to that seeing the dma_get_slave_caps() method provide the caps only
> by getting them from the DMA device descriptor, while we need to have an info on
> per-channel basis, it would be good to introduce a new DMA-device callback like:
> struct dma_device {
> ...
> 	int (*device_caps)(struct dma_chan *chan,
> 			   struct dma_slave_caps *caps);

Do you have a controller where the caps differ on a per-channel basis?

> ...
> };
> So the DMA driver could override the generic DMA device capabilities with the
> values specific to the DMA channels. Such functionality will be also helpful for
> the max-burst-len parameter introduced by this patchset, since depending on the
> IP-core synthesis parameters it may be channel-specific.
> 
> Alternatively we could just introduce a new fields to the dma_chan structure and
> retrieve the new caps values from them in the dma_get_slave_caps() method.
> Though the solution with callback I like better.
> 
> What is your opinion about this? What solution you'd prefer?
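
The callback variant proposed above can be modelled in userspace as follows.
All `mock_*` names are hypothetical stand-ins chosen for this sketch; they only
mirror the shape of the quoted proposal (device-wide defaults, optionally
overridden per channel) and are not the kernel structures themselves.

```c
#include <assert.h>
#include <stddef.h>

struct mock_slave_caps {
	unsigned int max_burst;
	unsigned int max_sg_nents;	/* 0: unlimited, per the convention quoted above */
};

struct mock_dma_chan;

struct mock_dma_device {
	struct mock_slave_caps caps;	/* device-wide defaults */
	/* Optional hook: lets the driver override the defaults per channel. */
	int (*device_caps)(struct mock_dma_chan *chan,
			   struct mock_slave_caps *caps);
};

struct mock_dma_chan {
	struct mock_dma_device *device;
	unsigned int max_burst;		/* channel-specific limit */
	unsigned int nollp;		/* non-zero: no multi-block LLP support */
};

/* Generic part: start from the device defaults, then let the driver adjust them. */
static int mock_get_slave_caps(struct mock_dma_chan *chan,
			       struct mock_slave_caps *caps)
{
	assert(chan && chan->device && caps);

	*caps = chan->device->caps;
	if (chan->device->device_caps)
		return chan->device->device_caps(chan, caps);

	return 0;
}

/* Driver part: report what this particular channel can do. */
static int mock_dw_caps(struct mock_dma_chan *chan, struct mock_slave_caps *caps)
{
	caps->max_burst = chan->max_burst;
	caps->max_sg_nents = chan->nollp ? 1 : 0;

	return 0;
}
```

Drivers that leave the hook unset keep reporting the device-wide values, so
the change would be opt-in for existing DMA controllers.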
> 
> On Tue, May 12, 2020 at 12:08:00AM +0300, Andy Shevchenko wrote:
> > On Tue, May 12, 2020 at 12:07:14AM +0300, Andy Shevchenko wrote:
> > > On Mon, May 11, 2020 at 10:32:55PM +0300, Serge Semin wrote:
> > > > On Mon, May 11, 2020 at 04:58:53PM +0300, Andy Shevchenko wrote:
> > > > > On Mon, May 11, 2020 at 4:48 PM Serge Semin
> > > > > <Sergey.Semin@baikalelectronics.ru> wrote:
> > > > > >
> > > > > > On Mon, May 11, 2020 at 12:58:13PM +0100, Mark Brown wrote:
> > > > > > > On Mon, May 11, 2020 at 05:10:16AM +0300, Serge Semin wrote:
> > > > > > >
> > > > > > > > Alas linearizing the SPI messages won't help in this case because the DW DMA
> > > > > > > > driver will split it into the max transaction chunks anyway.
> > > > > > >
> > > > > > > That sounds like you need to also impose a limit on the maximum message
> > > > > > > size as well then, with that you should be able to handle messages up
> > > > > > > to whatever that limit is.  There's code for that bit already, so long
> > > > > > > as the limit is not too low it should be fine for most devices and
> > > > > > > client drivers can see the limit so they can be updated to work with it
> > > > > > > if needed.
> > > > > >
> > > > > > Hmm, this might work. The problem will be with imposing such limitation through
> > > > > > the DW APB SSI driver. In order to do this I need to know:
> > > > > > 1) Whether multi-block LLP is supported by the DW DMA controller.
> > > > > > 2) Maximum DW DMA transfer block size.
> > > > > > Then I'll be able to use this information in the can_dma() callback to enable
> > > > > > the DMA xfers only for the safe transfers. Did you mean something like this when
> > > > > > you said "There's code for that bit already" ? If you meant the max_dma_len
> > > > > > parameter, then setting it won't work, because it just limits the SG items size
> > > > > > not the total length of a single transfer.
> > > > > >
> > > > > > So the question is of how to export the multi-block LLP flag from DW DMAc
> > > > > > driver. Andy?
> > > > > 
> > > > > I'm not sure I understand why do you need this being exported. Just
> > > > > always supply SG list out of single entry and define the length
> > > > > according to the maximum segment size (it's done IIRC in SPI core).
> > > > 
> > > > Finally I see your point. So you suggest to feed the DMA engine with SG list
> > > > entries one-by-one instead of sending all of them at once in a single
> > > > dmaengine_prep_slave_sg() -> dmaengine_submit() -> dma_async_issue_pending()
> > > > session. Hm, this solution will work, but there is an issue. There is no
> > > > guarantee, that Tx and Rx SG lists are symmetric, consisting of the same
> > > > number of items with the same sizes. It depends on the Tx/Rx buffers physical
> > > > address alignment and their offsets within the memory pages. Though this
> > > > problem can be solved by making the Tx and Rx SG lists symmetric. I'll have
> > > > to implement a clever DMA IO loop, which would extract the DMA
> > > > addresses/lengths from the SG entries and perform the single-buffer DMA 
> > > > transactions with the DMA buffers of the same length.
> > > > 
> > > > Regarding noLLP being exported. Obviously I intended to solve the problem in a
> > > > generic way since the problem is common for noLLP DW APB SSI/DW DMAC combination.
> > > > In order to do this we need to know whether the multi-block LLP feature is
> > > > unsupported by the DW DMA controller. We either make such info somehow exported
> > > > from the DW DMA driver, so the DMA clients (like Dw APB SSI controller driver)
> > > > could be ready to work around the problem; or just implement a flag-based quirk
> > > > in the DMA client driver, which would be enabled in the platform-specific basis
> > > > depending on the platform device actually detected (for instance, a specific
> > > > version of the DW APB SSI IP). AFAICS You'd prefer the later option. 
> > > 
> > > So, we may extend the struct of DMA parameters to tell the consumer amount of entries (each of which is no longer than maximum segment size) it can afford:
> > > - 0: Auto (DMA driver handles any cases itself)
> > > - 1: Only single entry
> > > - 2: Up to two...
> > 
> > It will left implementation details (or i.o.w. obstacles or limitation) why DMA
> > can't do otherwise.
> 
> Sounds good. Thanks for assistance.
> 
> -Sergey
> 
> > 
> > -- 
> > With Best Regards,
> > Andy Shevchenko
> > 
> > 

-- 
~Vinod


* Re: [PATCH v2 5/6] dmaengine: dw: Introduce max burst length hw config
  2020-05-12 19:12         ` Andy Shevchenko
  2020-05-12 19:47           ` Serge Semin
@ 2020-05-15  6:39           ` Vinod Koul
  2020-05-17 19:38             ` Serge Semin
  1 sibling, 1 reply; 72+ messages in thread
From: Vinod Koul @ 2020-05-15  6:39 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Serge Semin, Serge Semin, Viresh Kumar, Dan Williams,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Rob Herring, linux-mips, devicetree, dmaengine,
	linux-kernel

On 12-05-20, 22:12, Andy Shevchenko wrote:
> On Tue, May 12, 2020 at 05:08:20PM +0300, Serge Semin wrote:
> > On Fri, May 08, 2020 at 02:41:53PM +0300, Andy Shevchenko wrote:
> > > On Fri, May 08, 2020 at 01:53:03PM +0300, Serge Semin wrote:
> > > > IP core of the DW DMA controller may be synthesized with different
> > > > max burst length of the transfers per each channel. According to Synopsis
> > > > having the fixed maximum burst transactions length may provide some
> > > > performance gain. At the same time setting up the source and destination
> > > > multi size exceeding the max burst length limitation may cause a serious
> > > > problems. In our case the system just hangs up. In order to fix this
> > > > lets introduce the max burst length platform config of the DW DMA
> > > > controller device and don't let the DMA channels configuration code
> > > > exceed the burst length hardware limitation. Depending on the IP core
> > > > configuration the maximum value can vary from channel to channel.
> > > > It can be detected either in runtime from the DWC parameter registers
> > > > or from the dedicated dts property.
> > > 
> > > I'm wondering what can be the scenario when your peripheral will ask something
> > > which is not supported by DMA controller?
> > 
> > I may misunderstood your statement, because seeing your activity around my
> > patchsets including the SPI patchset and sometimes very helpful comments,
> > this question answer seems too obvious to see you asking it.
> > 
> > No need to go far for an example. See the DW APB SSI driver. Its DMA module
> > specifies the burst length to be 16, while not all of ours channels supports it.
> > Yes, originally it has been developed for the Intel Midfield SPI, but since I
> > converted the driver into a generic code we can't use a fixed value. For instance
> > in our hardware only two DMA channels of total 16 are capable of bursting up to
> > 16 bytes (data items) at a time, the rest of them are limited with up to 4 bytes
> > burst length. While there are two SPI interfaces, each of which need to have two
> > DMA channels for communications. So I need four channels in total to allocate to
> > provide the DMA capability for all interfaces. In order to set the SPI controller
> > up with valid optimized parameters the max-burst-length is required. Otherwise we
> > can end up with buffers overrun/underrun.
> 
> Right, and we come to the question which channel better to be used by SPI and
> the rest devices. Without specific filter function you can easily get into a
> case of inverted optimizations, when SPI got channels with burst = 4, while
> it's needed 16, and other hardware otherwise. Performance wise it's worse
> scenario which we may avoid in the first place, right?

If one has channels which are different and described as such in DT,
then I think it does make sense to specify in your board DT the
specific channels you require...
> 
> > > Peripheral needs to supply a lot of configuration parameters specific to the
> > > DMA controller in use (that's why we have struct dw_dma_slave).
> > > So, seems to me the feasible approach is supply correct data in the first place.
> > 
> > How to supply a valid data if clients don't know the DMA controller limitations
> > in general?
> 
> This is a good question. DMA controllers are quite different and having unified
> capabilities structure for all is almost impossible task to fulfil. That's why
> custom filter function(s) can help here. Based on compatible string you can
> implement whatever customized quirks like two functions, for example, to try 16
> burst size first and fallback to 4 if none was previously found.
> 
> > > If you have specific channels to acquire then you probably need to provide a
> > > custom xlate / filter functions. Because above seems a bit hackish workaround
> > > of dynamic channel allocation mechanism.
> > 
> > No, I don't have a specific channel to acquire and in general you may use any
> > returned from the DMA subsystem (though some platforms may need a dedicated
> > channels to use, in this case xlate / filter is required). In our SoC any DW DMAC
> > channel can be used for any DMA-capable peripherals like SPI, I2C, UART. But the
> > their DMA settings must properly and optimally configured. It can be only done
> > if you know the DMA controller parameters like max burst length, max block-size,
> > etc.
> > 
> > So no. The change proposed by this patch isn't workaround, but a useful feature,
> > moreover expected to be supported by the generic DMA subsystem.
> 
> See above.
> 
> > > But let's see what we can do better. Since maximum is defined on the slave side
> > > device, it probably needs to define minimum as well, otherwise it's possible
> > > that some hardware can't cope underrun bursts.
> > 
> > There is no need to define minimum if such limit doesn't exists except a
> > natural 1. Moreover it doesn't exist for all DMA controllers seeing noone has
> > added such capability into the generic DMA subsystem so far.
> 
> There is a contract between provider and consumer about DMA resource. That's
> why both sides should participate in fulfilling it. Theoretically it may be a
> hardware that doesn't support minimum burst available in DMA by a reason. For
> such we would need minimum to be provided as well.

Agreed, and if required the caps should be extended to tell the consumer
the minimum values supported.

-- 
~Vinod


* Re: [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property
  2020-05-15  6:09                   ` Vinod Koul
@ 2020-05-15 10:51                     ` Andy Shevchenko
  2020-05-15 10:56                       ` Vinod Koul
  0 siblings, 1 reply; 72+ messages in thread
From: Andy Shevchenko @ 2020-05-15 10:51 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Serge Semin, Serge Semin, Viresh Kumar, Rob Herring,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Dan Williams, linux-mips, dmaengine, devicetree,
	linux-kernel

On Fri, May 15, 2020 at 11:39:11AM +0530, Vinod Koul wrote:
> On 12-05-20, 15:38, Andy Shevchenko wrote:
> > On Tue, May 12, 2020 at 02:49:46PM +0300, Serge Semin wrote:
> > > On Tue, May 12, 2020 at 12:08:04PM +0300, Andy Shevchenko wrote:
> > > > On Tue, May 12, 2020 at 12:35:31AM +0300, Serge Semin wrote:
> > > > > On Tue, May 12, 2020 at 12:01:38AM +0300, Andy Shevchenko wrote:
> > > > > > On Mon, May 11, 2020 at 11:05:28PM +0300, Serge Semin wrote:
> > > > > > > On Fri, May 08, 2020 at 02:12:42PM +0300, Andy Shevchenko wrote:
> > > > > > > > On Fri, May 08, 2020 at 01:53:00PM +0300, Serge Semin wrote:

...

> > I leave it to Rob and Vinod.
> > It won't break our case, so, feel free with your approach.
> 
> I agree the DT is about describing the hardware and looks like value of
> 1 is not allowed. If allowed it should be added..

It's allowed at *run time*, it's illegal in *pre-silicon stage* when
synthesizing the IP.

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v2 3/6] dmaengine: dw: Set DMA device max segment size parameter
  2020-05-15  6:16           ` Vinod Koul
@ 2020-05-15 10:53             ` Andy Shevchenko
  2020-05-17 18:22               ` Serge Semin
  0 siblings, 1 reply; 72+ messages in thread
From: Andy Shevchenko @ 2020-05-15 10:53 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Serge Semin, Serge Semin, Vineet Gupta, Viresh Kumar,
	Dan Williams, Alexey Malahov, Thomas Bogendoerfer, Paul Burton,
	Ralf Baechle, Arnd Bergmann, Rob Herring, linux-mips, devicetree,
	dmaengine, linux-kernel

On Fri, May 15, 2020 at 11:46:01AM +0530, Vinod Koul wrote:
> On 12-05-20, 15:35, Andy Shevchenko wrote:
> > On Tue, May 12, 2020 at 12:16:22AM +0300, Serge Semin wrote:
> > > On Fri, May 08, 2020 at 02:21:52PM +0300, Andy Shevchenko wrote:
> > > > On Fri, May 08, 2020 at 01:53:01PM +0300, Serge Semin wrote:

...

> > My point here that we probably can avoid complications till we have real
> > hardware where it's different. As I said I don't remember a such, except
> > *maybe* Intel Medfield, which is quite outdated and not supported for wider
> > audience anyway.
> 
> IIRC Intel Medfield has couple of dma controller instances each one with
> different parameters *but* each instance has same channel configuration.

That's my memory too.

> I do not recall seeing that we have synthesis parameters per channel
> basis... But I maybe wrong, it's been a while.

Exactly, that's why I think we'd better keep things simple until we have a real
issue with it. I.o.w. no need to solve a problem which doesn't exist.

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property
  2020-05-15 10:51                     ` Andy Shevchenko
@ 2020-05-15 10:56                       ` Vinod Koul
  2020-05-15 11:11                         ` Serge Semin
  0 siblings, 1 reply; 72+ messages in thread
From: Vinod Koul @ 2020-05-15 10:56 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Serge Semin, Serge Semin, Viresh Kumar, Rob Herring,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Dan Williams, linux-mips, dmaengine, devicetree,
	linux-kernel

On 15-05-20, 13:51, Andy Shevchenko wrote:
> On Fri, May 15, 2020 at 11:39:11AM +0530, Vinod Koul wrote:
> > On 12-05-20, 15:38, Andy Shevchenko wrote:
> > > On Tue, May 12, 2020 at 02:49:46PM +0300, Serge Semin wrote:
> > > > On Tue, May 12, 2020 at 12:08:04PM +0300, Andy Shevchenko wrote:
> > > > > On Tue, May 12, 2020 at 12:35:31AM +0300, Serge Semin wrote:
> > > > > > On Tue, May 12, 2020 at 12:01:38AM +0300, Andy Shevchenko wrote:
> > > > > > > On Mon, May 11, 2020 at 11:05:28PM +0300, Serge Semin wrote:
> > > > > > > > On Fri, May 08, 2020 at 02:12:42PM +0300, Andy Shevchenko wrote:
> > > > > > > > > On Fri, May 08, 2020 at 01:53:00PM +0300, Serge Semin wrote:
> 
> ...
> 
> > > I leave it to Rob and Vinod.
> > > It won't break our case, so, feel free with your approach.
> > 
> > I agree the DT is about describing the hardware and looks like value of
> > 1 is not allowed. If allowed it should be added..
> 
> It's allowed at *run time*, it's illegal in *pre-silicon stage* when
> synthesizing the IP.

Then it should be added ..

-- 
~Vinod


* Re: [PATCH v2 5/6] dmaengine: dw: Introduce max burst length hw config
  2020-05-12 19:47           ` Serge Semin
@ 2020-05-15 11:02             ` Andy Shevchenko
  0 siblings, 0 replies; 72+ messages in thread
From: Andy Shevchenko @ 2020-05-15 11:02 UTC (permalink / raw)
  To: Serge Semin
  Cc: Serge Semin, Vinod Koul, Viresh Kumar, Dan Williams,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Rob Herring, linux-mips, devicetree, dmaengine,
	linux-kernel

On Tue, May 12, 2020 at 10:47:34PM +0300, Serge Semin wrote:
> On Tue, May 12, 2020 at 10:12:08PM +0300, Andy Shevchenko wrote:
> > On Tue, May 12, 2020 at 05:08:20PM +0300, Serge Semin wrote:
> > > On Fri, May 08, 2020 at 02:41:53PM +0300, Andy Shevchenko wrote:
> > > > On Fri, May 08, 2020 at 01:53:03PM +0300, Serge Semin wrote:
> > > > > IP core of the DW DMA controller may be synthesized with different
> > > > > max burst length of the transfers per each channel. According to Synopsis
> > > > > having the fixed maximum burst transactions length may provide some
> > > > > performance gain. At the same time setting up the source and destination
> > > > > multi size exceeding the max burst length limitation may cause a serious
> > > > > problems. In our case the system just hangs up. In order to fix this
> > > > > lets introduce the max burst length platform config of the DW DMA
> > > > > controller device and don't let the DMA channels configuration code
> > > > > exceed the burst length hardware limitation. Depending on the IP core
> > > > > configuration the maximum value can vary from channel to channel.
> > > > > It can be detected either in runtime from the DWC parameter registers
> > > > > or from the dedicated dts property.
> > > > 
> > > > I'm wondering what can be the scenario when your peripheral will ask something
> > > > which is not supported by DMA controller?
> > > 
> > > I may misunderstood your statement, because seeing your activity around my
> > > patchsets including the SPI patchset and sometimes very helpful comments,
> > > this question answer seems too obvious to see you asking it.
> > > 
> > > No need to go far for an example. See the DW APB SSI driver. Its DMA module
> > > specifies the burst length to be 16, while not all of ours channels supports it.
> > > Yes, originally it has been developed for the Intel Midfield SPI, but since I
> > > converted the driver into a generic code we can't use a fixed value. For instance
> > > in our hardware only two DMA channels of total 16 are capable of bursting up to
> > > 16 bytes (data items) at a time, the rest of them are limited with up to 4 bytes
> > > burst length. While there are two SPI interfaces, each of which need to have two
> > > DMA channels for communications. So I need four channels in total to allocate to
> > > provide the DMA capability for all interfaces. In order to set the SPI controller
> > > up with valid optimized parameters the max-burst-length is required. Otherwise we
> > > can end up with buffers overrun/underrun.
> > 
> > Right, and we come to the question which channel better to be used by SPI and
> > the rest devices. Without specific filter function you can easily get into a
> > case of inverted optimizations, when SPI got channels with burst = 4, while
> > it's needed 16, and other hardware otherwise. Performance wise it's worse
> > scenario which we may avoid in the first place, right?
> 
> If we start thinking like you said, we'll get stuck at a problem of which interfaces
> should get faster DMA channels and which one should be left with slowest. In general
> this task can't be solved, because without any application-specific requirement
> they all are equally valuable and deserve to have the best resources allocated.
> So we shouldn't assume that some interface is better or more valuable than
> another, therefore in generic DMA client code any filtering is redundant.

True, that's why I called it platform dependent quirks. You may do whatever you
want / need to perform as well as you can on your hardware. If this inverse
optimization is okay for your hardware, then fine; the generic DMA client
really should not care about it.

> > > > Peripheral needs to supply a lot of configuration parameters specific to the
> > > > DMA controller in use (that's why we have struct dw_dma_slave).
> > > > So, seems to me the feasible approach is supply correct data in the first place.
> > > 
> > > How to supply a valid data if clients don't know the DMA controller limitations
> > > in general?
> > 
> > This is a good question. DMA controllers are quite different and having unified
> > capabilities structure for all is almost impossible task to fulfil. That's why
> > custom filter function(s) can help here. Based on compatible string you can
> > implement whatever customized quirks like two functions, for example, to try 16
> > burst size first and fallback to 4 if none was previously found.
> 
> Right. As I said in the previous email it's up to the corresponding platforms to
> decide the criteria of the filtering including the max-burst length value.

Correct!

> Even though the DW DMA channels resources aren't uniform on Baikal-T1 SoC I also
> won't do the filter-based channel allocation, because I can't predict the SoC
> application. Some of them may be used on a platform with active SPI interface
> utilization, some with specific requirements to UARTs and so on.

It's your choice as platform maintainer.

> > > > If you have specific channels to acquire then you probably need to provide a
> > > > custom xlate / filter functions. Because above seems a bit hackish workaround
> > > > of dynamic channel allocation mechanism.
> > > 
> > > No, I don't have a specific channel to acquire and in general you may use any
> > > returned from the DMA subsystem (though some platforms may need a dedicated
> > > channels to use, in this case xlate / filter is required). In our SoC any DW DMAC
> > > channel can be used for any DMA-capable peripherals like SPI, I2C, UART. But the
> > > their DMA settings must properly and optimally configured. It can be only done
> > > if you know the DMA controller parameters like max burst length, max block-size,
> > > etc.
> > > 
> > > So no. The change proposed by this patch isn't workaround, but a useful feature,
> > > moreover expected to be supported by the generic DMA subsystem.
> > 
> > See above.
> > 
> > > > But let's see what we can do better. Since maximum is defined on the slave side
> > > > device, it probably needs to define minimum as well, otherwise it's possible
> > > > that some hardware can't cope underrun bursts.
> > > 
> > > There is no need to define minimum if such limit doesn't exists except a
> > > natural 1. Moreover it doesn't exist for all DMA controllers seeing noone has
> > > added such capability into the generic DMA subsystem so far.
> > 
> > There is a contract between provider and consumer about DMA resource. That's
> > why both sides should participate in fulfilling it. Theoretically it may be a
> > hardware that doesn't support minimum burst available in DMA by a reason. For
> > such we would need minimum to be provided as well.
> 
> I don't think 'theoretical' consideration counts when implementing something in
> kernel. That 'theoretical' may never happen, but you'll end up supporting a
> dummy functionality. Practicality is what kernel developers normally place
> before anything else.

The point here is to avoid half-baked solutions.

I'm not against max-burst logic on top of the existing interface, but it would
be better if we allowed a range; in that case it will work for any DMA
controller (as part of the DMA engine family).

I guess we need to summarize this very long discussion and settle the next steps.

(if you can provide it in a short form anybody can read in 1 minute it would be
 nice, I already forgot tons of paragraphs you sent here, esp. taking into
 account tons of paragraphs in the other Baikal related series)

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property
  2020-05-15 10:56                       ` Vinod Koul
@ 2020-05-15 11:11                         ` Serge Semin
  2020-05-17 17:47                           ` Serge Semin
  0 siblings, 1 reply; 72+ messages in thread
From: Serge Semin @ 2020-05-15 11:11 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Serge Semin, Andy Shevchenko, Viresh Kumar, Rob Herring,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Dan Williams, linux-mips, dmaengine, devicetree,
	linux-kernel

On Fri, May 15, 2020 at 04:26:58PM +0530, Vinod Koul wrote:
> On 15-05-20, 13:51, Andy Shevchenko wrote:
> > On Fri, May 15, 2020 at 11:39:11AM +0530, Vinod Koul wrote:
> > > On 12-05-20, 15:38, Andy Shevchenko wrote:
> > > > On Tue, May 12, 2020 at 02:49:46PM +0300, Serge Semin wrote:
> > > > > On Tue, May 12, 2020 at 12:08:04PM +0300, Andy Shevchenko wrote:
> > > > > > On Tue, May 12, 2020 at 12:35:31AM +0300, Serge Semin wrote:
> > > > > > > On Tue, May 12, 2020 at 12:01:38AM +0300, Andy Shevchenko wrote:
> > > > > > > > On Mon, May 11, 2020 at 11:05:28PM +0300, Serge Semin wrote:
> > > > > > > > > On Fri, May 08, 2020 at 02:12:42PM +0300, Andy Shevchenko wrote:
> > > > > > > > > > On Fri, May 08, 2020 at 01:53:00PM +0300, Serge Semin wrote:
> > 
> > ...
> > 
> > > > I leave it to Rob and Vinod.
> > > > It won't break our case, so, feel free with your approach.
> > > 
> > > I agree the DT is about describing the hardware and looks like value of
> > > 1 is not allowed. If allowed it should be added..
> > 
> > It's allowed at *run time*, it's illegal in *pre-silicon stage* when
> > synthesizing the IP.
> 
> Then it should be added ..

Vinod, max-burst-len is "MAXimum" burst length not "run-time or current or any
other" burst length. It's a constant defined at the IP-core synthesis stage and
according to the Data Book, MAX burst length can't be 1. The allowed values are
exactly as I described in the binding [4, 8, 16, 32, ...]. MAX burst length
defines the upper limit of the run-time burst length. So setting it to 1 isn't
about describing the hardware, but about using DT for software convenience.

-Sergey

> 
> -- 
> ~Vinod


* Re: [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property
  2020-05-15 11:11                         ` Serge Semin
@ 2020-05-17 17:47                           ` Serge Semin
  2020-05-18 17:30                             ` Rob Herring
  2020-05-19 17:13                             ` Vinod Koul
  0 siblings, 2 replies; 72+ messages in thread
From: Serge Semin @ 2020-05-17 17:47 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Serge Semin, Andy Shevchenko, Viresh Kumar, Rob Herring,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Dan Williams, linux-mips, dmaengine, devicetree,
	linux-kernel

On Fri, May 15, 2020 at 02:11:13PM +0300, Serge Semin wrote:
> On Fri, May 15, 2020 at 04:26:58PM +0530, Vinod Koul wrote:
> > On 15-05-20, 13:51, Andy Shevchenko wrote:
> > > On Fri, May 15, 2020 at 11:39:11AM +0530, Vinod Koul wrote:
> > > > On 12-05-20, 15:38, Andy Shevchenko wrote:
> > > > > On Tue, May 12, 2020 at 02:49:46PM +0300, Serge Semin wrote:
> > > > > > On Tue, May 12, 2020 at 12:08:04PM +0300, Andy Shevchenko wrote:
> > > > > > > On Tue, May 12, 2020 at 12:35:31AM +0300, Serge Semin wrote:
> > > > > > > > On Tue, May 12, 2020 at 12:01:38AM +0300, Andy Shevchenko wrote:
> > > > > > > > > On Mon, May 11, 2020 at 11:05:28PM +0300, Serge Semin wrote:
> > > > > > > > > > On Fri, May 08, 2020 at 02:12:42PM +0300, Andy Shevchenko wrote:
> > > > > > > > > > > On Fri, May 08, 2020 at 01:53:00PM +0300, Serge Semin wrote:
> > > 
> > > ...
> > > 
> > > > > I leave it to Rob and Vinod.
> > > > > It won't break our case, so, feel free with your approach.
> > > > 
> > > > I agree the DT is about describing the hardware and looks like value of
> > > > 1 is not allowed. If allowed it should be added..
> > > 
> > > It's allowed at *run time*, it's illegal in *pre-silicon stage* when
> > > synthesizing the IP.
> > 
> > Then it should be added ..
> 
> Vinod, max-burst-len is "MAXimum" burst length not "run-time or current or any
> other" burst length. It's a constant defined at the IP-core synthesis stage and
> according to the Data Book, MAX burst length can't be 1. The allowed values are
> exactly as I described in the binding [4, 8, 16, 32, ...]. MAX burst length
> defines the upper limit of the run-time burst length. So setting it to 1 isn't
> about describing a hardware, but using DT for the software convenience.
> 
> -Sergey

Vinod, to make this completely clear. According to the DW DMAC data book:
- In general, run-time parameter of the DMA transaction burst length (set in
  the SRC_MSIZE/DST_MSIZE fields of the channel control register) may belong
  to the set [1, 4, 8, 16, 32, 64, 128, 256].
- The actual upper limit of the run-time burst length is set by a constant
  defined at the IP-synthesis stage (it's called DMAH_CHx_MAX_MULT_SIZE),
  and this constant belongs to the set [4, 8, 16, 32, 64, 128, 256]. (See, no 1
  in this set).

So the run-time burst length in a case of particular DW DMA controller belongs
to the range:
1 <= SRC_MSIZE <= DMAH_CHx_MAX_MULT_SIZE
and
1 <= DST_MSIZE <= DMAH_CHx_MAX_MULT_SIZE

See. No matter which DW DMA controller we get, each of them will at least support
burst lengths of 1 and 4 transfer words. This is determined by the design of the
DW DMA controller IP, since the DMAH_CHx_MAX_MULT_SIZE constant set starts with 4.

In this patch I suggest adding the max-burst-len property, which specifies
the upper limit for the run-time burst length. The maximum burst length
that can be set in the SRC_MSIZE/DST_MSIZE fields of the DMA channel control
register is determined by the DMAH_CHx_MAX_MULT_SIZE constant (which can't be 1
by the DW DMA IP design). Since the max-burst-len property is also responsible
for the maximum burst length setting, it should be associated with
DMAH_CHx_MAX_MULT_SIZE and thus belong to the same set
[4, 8, 16, 32, 64, 128, 256].

So 1 shouldn't be in the enum of the max-burst-len property constraint, because
the hardware doesn't support such a limitation by design, and setting 1 as
max-burst-len would mean an incorrect description of the DMA controller.
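
To illustrate the two sets described above, here is a minimal sketch in C. The
helper names are hypothetical and are not taken from the DW DMA driver; only
the value sets come from the description above. It assumes max_burst itself is
a valid DMAH_CHx_MAX_MULT_SIZE value.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if val is a legal synthesis-time maximum
 * (DMAH_CHx_MAX_MULT_SIZE), i.e. a power of two in [4, 256].
 */
bool dw_max_burst_is_valid(uint32_t val)
{
	return val >= 4 && val <= 256 && (val & (val - 1)) == 0;
}

/*
 * Hypothetical helper: true if burst is a legal run-time SRC_MSIZE/DST_MSIZE
 * burst length for a channel synthesized with the given maximum.
 * The run-time set is [1, 4, 8, ..., 256], capped by max_burst.
 */
bool dw_runtime_burst_is_valid(uint32_t burst, uint32_t max_burst)
{
	if (burst == 1)
		return true;
	return dw_max_burst_is_valid(burst) && burst <= max_burst;
}

/*
 * Hypothetical helper: pick the largest legal run-time burst length that
 * exceeds neither the requested value nor the channel maximum.
 * Falls back to 1, which every channel supports.
 */
uint32_t dw_clamp_burst(uint32_t requested, uint32_t max_burst)
{
	uint32_t burst;

	for (burst = max_burst; burst >= 4; burst >>= 1) {
		if (burst <= requested)
			return burst;
	}
	return 1;
}
```

With these definitions a request of 16 on a channel limited to 4 is clamped to
4, while 1 is rejected as a maximum but accepted as a run-time value.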

Vinod, could you take a look at the info I provided above and say your final word
whether 1 should be really allowed to be in the max-burst-len enum constraints?
I'll do as you say in the next version of the patchset.

Regards,
-Sergey

> 
> > 
> > -- 
> > ~Vinod


* Re: [PATCH v2 3/6] dmaengine: dw: Set DMA device max segment size parameter
  2020-05-15 10:53             ` Andy Shevchenko
@ 2020-05-17 18:22               ` Serge Semin
  0 siblings, 0 replies; 72+ messages in thread
From: Serge Semin @ 2020-05-17 18:22 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Serge Semin, Vinod Koul, Vineet Gupta, Viresh Kumar,
	Dan Williams, Alexey Malahov, Thomas Bogendoerfer, Paul Burton,
	Ralf Baechle, Arnd Bergmann, Rob Herring, linux-mips, devicetree,
	dmaengine, linux-kernel

On Fri, May 15, 2020 at 01:53:13PM +0300, Andy Shevchenko wrote:
> On Fri, May 15, 2020 at 11:46:01AM +0530, Vinod Koul wrote:
> > On 12-05-20, 15:35, Andy Shevchenko wrote:
> > > On Tue, May 12, 2020 at 12:16:22AM +0300, Serge Semin wrote:
> > > > On Fri, May 08, 2020 at 02:21:52PM +0300, Andy Shevchenko wrote:
> > > > > On Fri, May 08, 2020 at 01:53:01PM +0300, Serge Semin wrote:
> 
> ...
> 
> > > My point here that we probably can avoid complications till we have real
> > > hardware where it's different. As I said I don't remember a such, except
> > > *maybe* Intel Medfield, which is quite outdated and not supported for wider
> > > audience anyway.
> > 
> > IIRC Intel Medfield has couple of dma controller instances each one with
> > different parameters *but* each instance has same channel configuration.
> 
> That's my memory too.
> 
> > I do not recall seeing that we have synthesis parameters per channel
> > basis... But I maybe wrong, it's been a while.
> 
> Exactly, that's why I think we better simplify things till we will have real
> issue with it. I.o.w. no need to solve the problem which doesn't exist.

Ok then. My hardware is also synthesized with a uniform max block size
parameter. I'll remove that maximum-of-maximums search and use the block
size found for the very first channel to set the maximum segment size parameter.

-Sergey

> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-15  6:30                         ` Vinod Koul
@ 2020-05-17 19:23                           ` Serge Semin
  2020-05-19 17:02                             ` Vinod Koul
  0 siblings, 1 reply; 72+ messages in thread
From: Serge Semin @ 2020-05-17 19:23 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Serge Semin, Andy Shevchenko, Mark Brown, Viresh Kumar,
	Dan Williams, Alexey Malahov, Thomas Bogendoerfer, Paul Burton,
	Ralf Baechle, Arnd Bergmann, Rob Herring, linux-mips, devicetree,
	dmaengine, Linux Kernel Mailing List

On Fri, May 15, 2020 at 12:00:39PM +0530, Vinod Koul wrote:
> Hi Serge,
> 
> On 12-05-20, 15:42, Serge Semin wrote:
> > Vinod,
> > 
> > Could you join the discussion for a little bit?
> > 
> > In order to properly fix the problem discussed in this topic, we need to
> > introduce an additional capability exported by DMA channel handlers on per-channel
> > basis. It must be a number, which would indicate an upper limitation of the SG list
> > entries amount.
> > Something like this would do it:
> > struct dma_slave_caps {
> > ...
> > 	unsigned int max_sg_nents;
> > ...
> 
> Looking at the discussion, I agree we should can this up in the
> interface. The max_dma_len suggests the length of a descriptor allowed,
> it does not convey the sg_nents supported which in the case of nollp is
> one.
> 
> Btw is this is a real hardware issue, I have found that value of such
> hardware is very less and people did fix it up in subsequent revs to add
> llp support.

Yes, it is. My DW DMAC doesn't support LLP and there isn't going to be a new SoC
version produced.(

> 
> Also, another question is why this cannot be handled in driver, I agree
> your hardware does not support llp but that does not stop you from
> breaking a multi_sg list into N hardware descriptors and keep submitting
> them (for this to work submission should be done in isr and not in bh,
> unfortunately very few driver take that route).

The current DW DMA driver does that, but this isn't enough. The problem is that
in order to fix the issue in the DMA hardware driver we need to introduce
an inter-dependent channels abstraction and synchronously feed both Tx and
Rx DMA channels with hardware descriptors (LLP entries) one-by-one. This is
hardly needed by any slave device driver other than SPI, whose Tx and Rx buffers
are inter-dependent. So Andy's idea was to move the fix to the SPI driver (feed
the DMA engine channels with Tx and Rx data buffers synchronously), while the
DMA engine would provide info on whether such a fix is required. This can be
determined by the maximum SG entries capability.

(Note max_sg_nents isn't a limitation on the number of SG entries supported by
the DMA driver, but the number of SG entries handled by the DMA engine in a
single DMA transaction.)
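
The synchronous feeding mentioned above could look roughly like the following
sketch. It is plain C with a hypothetical `struct seg` standing in for an SG
entry, not the SPI or dmaengine API: it walks the Tx and Rx segment lists in
lock-step and produces chunks of equal length that never cross a segment
boundary on either side, so each chunk could be submitted as one single-buffer
Tx/Rx transaction pair. Segment lengths are assumed to be non-zero.

```c
#include <stddef.h>

/* Hypothetical stand-in for one SG entry: only the length matters here. */
struct seg {
	size_t len;
};

/*
 * Walk both lists in lock-step and record the length of every chunk.
 * Returns the total number of chunks; at most max_chunks lengths are
 * stored in chunks[].
 */
size_t split_symmetric(const struct seg *tx, size_t ntx,
		       const struct seg *rx, size_t nrx,
		       size_t *chunks, size_t max_chunks)
{
	size_t ti = 0, ri = 0;		/* current Tx/Rx segment index */
	size_t toff = 0, roff = 0;	/* offset inside current segment */
	size_t n = 0;

	while (ti < ntx && ri < nrx) {
		size_t tleft = tx[ti].len - toff;
		size_t rleft = rx[ri].len - roff;
		size_t len = tleft < rleft ? tleft : rleft;

		if (n < max_chunks)
			chunks[n] = len;
		n++;

		/* Advance both sides by the same amount. */
		toff += len;
		roff += len;
		if (toff == tx[ti].len) {
			ti++;
			toff = 0;
		}
		if (roff == rx[ri].len) {
			ri++;
			roff = 0;
		}
	}
	return n;
}
```

For Tx segments of 100 and 50 bytes against Rx segments of 60 and 90 bytes
this yields three chunks of 60, 40 and 50 bytes.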

> TBH the max_sg_nents or
> max_dma_len are HW restrictions and SW *can* deal with then :-)

Yes, it can, but it only works for the cases when individual DMA channels are
utilized. The DMA hardware driver doesn't know that the target and source slave
device buffers (SPI Tx and Rx FIFOs) are inter-dependent, i.e. that writing to
one implicitly pushes data to the other. So due to the interrupt handling
latency the Tx DMA channel is restarted faster than the Rx DMA channel is
reinitialized. This causes the SPI Rx FIFO overflow and data loss.

> 
> In an idea world, you should break the sw descriptor submitted into N hw
> descriptors and submit to hardware and let user know when the sw
> descriptor is completed. Of course we do not do that :(

Well, the current DW DMA driver does that. But due to the inter-dependency of
the two slave device buffers this isn't enough to perform safe DMA transactions.
Due to the interrupt handling latency the Tx DMA channel pushes data to the slave
device buffer faster than the Rx DMA channel starts to handle incoming data. This
causes the SPI Rx FIFO overflow.

> 
> > };
> > As Andy suggested it's value should be interpreted as:
> > 0          - unlimited number of entries,
> > 1:MAX_UINT - actual limit to the number of entries.
> 

> Hmm why 0, why not MAX_UINT for unlimited?

0 is much better for many reasons. First of all MAX_UINT is a lot, but it's
still a number. On an x64 platform this might be an actual limit if for instance
the block-size register is 32 bits wide. Secondly, interpreting 0 as an unlimited
number of entries would be more suitable since most of the drivers support
LLP functionality and we wouldn't need to update their code to set MAX_UINT.
Thirdly, DMA engines which don't support LLPs would need to set this parameter
to 1. So if we do as you say and interpret an unlimited number of LLPs as
MAX_UINT, then 0 would be left unused.

To sum up, I also think that using 0 for an unlimited number of supported SG
entries is much better.
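
A minimal sketch of how a DMA client might interpret the value under the
"0 means unlimited" convention. The helper names are hypothetical, and
max_sg_nents is the capability proposed in this thread, not an existing field.

```c
#include <stddef.h>

/*
 * Hypothetical helper: how many SG entries the client may hand over in one
 * DMA transaction. max_sg_nents == 0 means the engine has no limit.
 */
size_t sg_entries_per_submit(unsigned int max_sg_nents, size_t nents)
{
	if (max_sg_nents == 0 || nents <= max_sg_nents)
		return nents;
	return max_sg_nents;
}

/*
 * Hypothetical helper: how many separate transactions are needed to push
 * nents entries through an engine with the given limit.
 */
size_t sg_submissions_needed(unsigned int max_sg_nents, size_t nents)
{
	if (nents == 0)
		return 0;
	if (max_sg_nents == 0)
		return 1;
	/* Round up without risking overflow in the addition. */
	return nents / max_sg_nents + (nents % max_sg_nents != 0);
}
```

A driver with LLP support would leave the field at its zero default and be
treated as unlimited, while a noLLP engine would report 1 and force the client
to submit entries one at a time.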

> 
> > In addition to that seeing the dma_get_slave_caps() method provide the caps only
> > by getting them from the DMA device descriptor, while we need to have an info on
> > per-channel basis, it would be good to introduce a new DMA-device callback like:
> > struct dma_device {
> > ...
> > 	int (*device_caps)(struct dma_chan *chan,
> > 			   struct dma_slave_caps *caps);
> 

> Do you have a controller where channel caps are on per-channel basis?

Yes, I do. Our DW DMA controller has the maximum burst length non-uniformly
distributed across its DMA channels. Our controller supports eight channels,
of which the first two can burst up to 32 transfer words, while the rest
of the channels support bursting up to 4 transfer words.

So having such a device_caps() callback to customize the device capabilities on
a per-DMA-channel basis would be very useful! What do you think?
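
The override idea can be sketched in standalone C as follows. All types here
are simplified, hypothetical stand-ins (suffixed with _sketch) rather than the
real dmaengine structures; the 32/4 split mirrors the channel layout described
above.

```c
#include <stddef.h>

/* Simplified stand-in for the capabilities structure. */
struct dma_slave_caps_sketch {
	unsigned int max_burst;
	unsigned int max_sg_nents;
};

struct dma_chan_sketch;

/* Simplified stand-in for the DMA device with the proposed callback. */
struct dma_device_sketch {
	struct dma_slave_caps_sketch caps;	/* device-wide defaults */
	int (*device_caps)(const struct dma_chan_sketch *chan,
			   struct dma_slave_caps_sketch *caps);
};

struct dma_chan_sketch {
	const struct dma_device_sketch *device;
	unsigned int id;
};

/*
 * Fill caps with the device-wide defaults first, then let the driver
 * override them for the particular channel if it provides the callback.
 */
int get_slave_caps_sketch(const struct dma_chan_sketch *chan,
			  struct dma_slave_caps_sketch *caps)
{
	if (!chan || !chan->device || !caps)
		return -1;

	*caps = chan->device->caps;
	if (chan->device->device_caps)
		return chan->device->device_caps(chan, caps);
	return 0;
}

/* Example callback: first two channels burst up to 32, the rest up to 4. */
int example_chan_caps(const struct dma_chan_sketch *chan,
		      struct dma_slave_caps_sketch *caps)
{
	caps->max_burst = chan->id < 2 ? 32 : 4;
	return 0;
}
```

Keeping the device-wide defaults as the starting point means drivers without
per-channel differences would not need to implement the callback at all.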

-Sergey

> 
> > ...
> > };
> > So the DMA driver could override the generic DMA device capabilities with the
> > values specific to the DMA channels. Such functionality will be also helpful for
> > the max-burst-len parameter introduced by this patchset, since depending on the
> > IP-core synthesis parameters it may be channel-specific.
> > 
> > Alternatively we could just introduce a new fields to the dma_chan structure and
> > retrieve the new caps values from them in the dma_get_slave_caps() method.
> > Though the solution with callback I like better.
> > 
> > What is your opinion about this? What solution you'd prefer?
> > 
> > On Tue, May 12, 2020 at 12:08:00AM +0300, Andy Shevchenko wrote:
> > > On Tue, May 12, 2020 at 12:07:14AM +0300, Andy Shevchenko wrote:
> > > > On Mon, May 11, 2020 at 10:32:55PM +0300, Serge Semin wrote:
> > > > > On Mon, May 11, 2020 at 04:58:53PM +0300, Andy Shevchenko wrote:
> > > > > > On Mon, May 11, 2020 at 4:48 PM Serge Semin
> > > > > > <Sergey.Semin@baikalelectronics.ru> wrote:
> > > > > > >
> > > > > > > On Mon, May 11, 2020 at 12:58:13PM +0100, Mark Brown wrote:
> > > > > > > > On Mon, May 11, 2020 at 05:10:16AM +0300, Serge Semin wrote:
> > > > > > > >
> > > > > > > > > Alas linearizing the SPI messages won't help in this case because the DW DMA
> > > > > > > > > driver will split it into the max transaction chunks anyway.
> > > > > > > >
> > > > > > > > That sounds like you need to also impose a limit on the maximum message
> > > > > > > > size as well then, with that you should be able to handle messages up
> > > > > > > > to whatever that limit is.  There's code for that bit already, so long
> > > > > > > > as the limit is not too low it should be fine for most devices and
> > > > > > > > client drivers can see the limit so they can be updated to work with it
> > > > > > > > if needed.
> > > > > > >
> > > > > > > Hmm, this might work. The problem will be with imposing such limitation through
> > > > > > > the DW APB SSI driver. In order to do this I need to know:
> > > > > > > 1) Whether multi-block LLP is supported by the DW DMA controller.
> > > > > > > 2) Maximum DW DMA transfer block size.
> > > > > > > Then I'll be able to use this information in the can_dma() callback to enable
> > > > > > > the DMA xfers only for the safe transfers. Did you mean something like this when
> > > > > > > you said "There's code for that bit already" ? If you meant the max_dma_len
> > > > > > > parameter, then setting it won't work, because it just limits the SG items size
> > > > > > > not the total length of a single transfer.
> > > > > > >
> > > > > > > So the question is of how to export the multi-block LLP flag from DW DMAc
> > > > > > > driver. Andy?
> > > > > > 
> > > > > > I'm not sure I understand why do you need this being exported. Just
> > > > > > always supply SG list out of single entry and define the length
> > > > > > according to the maximum segment size (it's done IIRC in SPI core).
> > > > > 
> > > > > Finally I see your point. So you suggest to feed the DMA engine with SG list
> > > > > entries one-by-one instead of sending all of them at once in a single
> > > > > dmaengine_prep_slave_sg() -> dmaengine_submit() -> dma_async_issue_pending()
> > > > > session. Hm, this solution will work, but there is an issue. There is no
> > > > > guarantee, that Tx and Rx SG lists are symmetric, consisting of the same
> > > > > number of items with the same sizes. It depends on the Tx/Rx buffers physical
> > > > > address alignment and their offsets within the memory pages. Though this
> > > > > problem can be solved by making the Tx and Rx SG lists symmetric. I'll have
> > > > > to implement a clever DMA IO loop, which would extract the DMA
> > > > > addresses/lengths from the SG entries and perform the single-buffer DMA 
> > > > > transactions with the DMA buffers of the same length.
> > > > > 
> > > > > Regarding noLLP being exported. Obviously I intended to solve the problem in a
> > > > > generic way since the problem is common for noLLP DW APB SSI/DW DMAC combination.
> > > > > In order to do this we need to know whether the multi-block LLP feature is
> > > > > unsupported by the DW DMA controller. We either make such info somehow exported
> > > > > from the DW DMA driver, so the DMA clients (like Dw APB SSI controller driver)
> > > > > could be ready to work around the problem; or just implement a flag-based quirk
> > > > > in the DMA client driver, which would be enabled in the platform-specific basis
> > > > > depending on the platform device actually detected (for instance, a specific
> > > > > version of the DW APB SSI IP). AFAICS You'd prefer the later option. 
> > > > 
> > > > So, we may extend the struct of DMA parameters to tell the consumer amount of entries (each of which is no longer than maximum segment size) it can afford:
> > > > - 0: Auto (DMA driver handles any cases itself)
> > > > - 1: Only single entry
> > > > - 2: Up to two...
> > > 
> > > It will left implementation details (or i.o.w. obstacles or limitation) why DMA
> > > can't do otherwise.
> > 
> > Sounds good. Thanks for assistance.
> > 
> > -Sergey
> > 
> > > 
> > > -- 
> > > With Best Regards,
> > > Andy Shevchenko
> > > 
> > > 
> 
> -- 
> ~Vinod


* Re: [PATCH v2 5/6] dmaengine: dw: Introduce max burst length hw config
  2020-05-15  6:39           ` Vinod Koul
@ 2020-05-17 19:38             ` Serge Semin
  2020-05-19 17:07               ` Vinod Koul
  0 siblings, 1 reply; 72+ messages in thread
From: Serge Semin @ 2020-05-17 19:38 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Serge Semin, Andy Shevchenko, Viresh Kumar, Dan Williams,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Rob Herring, linux-mips, devicetree, dmaengine,
	linux-kernel

On Fri, May 15, 2020 at 12:09:50PM +0530, Vinod Koul wrote:
> On 12-05-20, 22:12, Andy Shevchenko wrote:
> > On Tue, May 12, 2020 at 05:08:20PM +0300, Serge Semin wrote:
> > > On Fri, May 08, 2020 at 02:41:53PM +0300, Andy Shevchenko wrote:
> > > > On Fri, May 08, 2020 at 01:53:03PM +0300, Serge Semin wrote:
> > > > > IP core of the DW DMA controller may be synthesized with different
> > > > > max burst length of the transfers per each channel. According to Synopsis
> > > > > having the fixed maximum burst transactions length may provide some
> > > > > performance gain. At the same time setting up the source and destination
> > > > > multi size exceeding the max burst length limitation may cause a serious
> > > > > problems. In our case the system just hangs up. In order to fix this
> > > > > lets introduce the max burst length platform config of the DW DMA
> > > > > controller device and don't let the DMA channels configuration code
> > > > > exceed the burst length hardware limitation. Depending on the IP core
> > > > > configuration the maximum value can vary from channel to channel.
> > > > > It can be detected either in runtime from the DWC parameter registers
> > > > > or from the dedicated dts property.
> > > > 
> > > > I'm wondering what can be the scenario when your peripheral will ask something
> > > > which is not supported by DMA controller?
> > > 
> > > I may misunderstood your statement, because seeing your activity around my
> > > patchsets including the SPI patchset and sometimes very helpful comments,
> > > this question answer seems too obvious to see you asking it.
> > > 
> > > No need to go far for an example. See the DW APB SSI driver. Its DMA module
> > > specifies the burst length to be 16, while not all of ours channels supports it.
> > > Yes, originally it has been developed for the Intel Midfield SPI, but since I
> > > converted the driver into a generic code we can't use a fixed value. For instance
> > > in our hardware only two DMA channels of total 16 are capable of bursting up to
> > > 16 bytes (data items) at a time, the rest of them are limited with up to 4 bytes
> > > burst length. While there are two SPI interfaces, each of which need to have two
> > > DMA channels for communications. So I need four channels in total to allocate to
> > > provide the DMA capability for all interfaces. In order to set the SPI controller
> > > up with valid optimized parameters the max-burst-length is required. Otherwise we
> > > can end up with buffers overrun/underrun.
> > 
> > Right, and we come to the question which channel better to be used by SPI and
> > the rest devices. Without specific filter function you can easily get into a
> > case of inverted optimizations, when SPI got channels with burst = 4, while
> > it's needed 16, and other hardware otherwise. Performance wise it's worse
> > scenario which we may avoid in the first place, right?
> 
> If one has channels which are different and described as such in DT,
> then I think it does make sense to specify in your board-dt about the
> specific channels you would require...

Well, we do have such hardware. Our DW DMA controller has different max
burst lengths assigned to the first two and the rest of the channels. But
creating functionality for individual channel assignment is a matter for a
different patchset. Sorry. It's not one of my tasks at the moment.

My primary task is to integrate the Baikal-T1 SoC support into the kernel. I've
refactored a lot of code found in the Baikal-T1 SDK and am currently under the
pressure of a lot of review. Alas there is no time to create new functionality
as you suggest. In the future I may provide such, but not in the framework of
this patchset.

> > 
> > > > Peripheral needs to supply a lot of configuration parameters specific to the
> > > > DMA controller in use (that's why we have struct dw_dma_slave).
> > > > So, seems to me the feasible approach is supply correct data in the first place.
> > > 
> > > How to supply a valid data if clients don't know the DMA controller limitations
> > > in general?
> > 
> > This is a good question. DMA controllers are quite different and having unified
> > capabilities structure for all is almost impossible task to fulfil. That's why
> > custom filter function(s) can help here. Based on compatible string you can
> > implement whatever customized quirks like two functions, for example, to try 16
> > burst size first and fallback to 4 if none was previously found.
> > 
> > > > If you have specific channels to acquire then you probably need to provide a
> > > > custom xlate / filter functions. Because above seems a bit hackish workaround
> > > > of dynamic channel allocation mechanism.
> > > 
> > > No, I don't have a specific channel to acquire and in general you may use any
> > > returned from the DMA subsystem (though some platforms may need a dedicated
> > > channels to use, in this case xlate / filter is required). In our SoC any DW DMAC
> > > channel can be used for any DMA-capable peripherals like SPI, I2C, UART. But the
> > > their DMA settings must properly and optimally configured. It can be only done
> > > if you know the DMA controller parameters like max burst length, max block-size,
> > > etc.
> > > 
> > > So no. The change proposed by this patch isn't workaround, but a useful feature,
> > > moreover expected to be supported by the generic DMA subsystem.
> > 
> > See above.
> > 
> > > > But let's see what we can do better. Since maximum is defined on the slave side
> > > > device, it probably needs to define minimum as well, otherwise it's possible
> > > > that some hardware can't cope underrun bursts.
> > > 
> > > There is no need to define minimum if such limit doesn't exists except a
> > > natural 1. Moreover it doesn't exist for all DMA controllers seeing noone has
> > > added such capability into the generic DMA subsystem so far.
> > 
> > There is a contract between provider and consumer about DMA resource. That's
> > why both sides should participate in fulfilling it. Theoretically it may be a
> > hardware that doesn't support minimum burst available in DMA by a reason. For
> > such we would need minimum to be provided as well.
> 
> Agreed and if required caps should be extended to tell consumer the
> minimum values supported.

Sorry, it's not required by our hardware. Is there any hardware which actually
has such a limitation (a minimum burst length)?

-Sergey

> 
> -- 
> ~Vinod


* Re: [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property
  2020-05-17 17:47                           ` Serge Semin
@ 2020-05-18 17:30                             ` Rob Herring
  2020-05-18 19:30                               ` Serge Semin
  2020-05-19 17:13                             ` Vinod Koul
  1 sibling, 1 reply; 72+ messages in thread
From: Rob Herring @ 2020-05-18 17:30 UTC (permalink / raw)
  To: Serge Semin
  Cc: Vinod Koul, Serge Semin, Andy Shevchenko, Viresh Kumar,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Dan Williams, linux-mips, dmaengine, devicetree,
	linux-kernel

On Sun, May 17, 2020 at 08:47:39PM +0300, Serge Semin wrote:
> On Fri, May 15, 2020 at 02:11:13PM +0300, Serge Semin wrote:
> > On Fri, May 15, 2020 at 04:26:58PM +0530, Vinod Koul wrote:
> > > On 15-05-20, 13:51, Andy Shevchenko wrote:
> > > > On Fri, May 15, 2020 at 11:39:11AM +0530, Vinod Koul wrote:
> > > > > On 12-05-20, 15:38, Andy Shevchenko wrote:
> > > > > > On Tue, May 12, 2020 at 02:49:46PM +0300, Serge Semin wrote:
> > > > > > > On Tue, May 12, 2020 at 12:08:04PM +0300, Andy Shevchenko wrote:
> > > > > > > > On Tue, May 12, 2020 at 12:35:31AM +0300, Serge Semin wrote:
> > > > > > > > > On Tue, May 12, 2020 at 12:01:38AM +0300, Andy Shevchenko wrote:
> > > > > > > > > > On Mon, May 11, 2020 at 11:05:28PM +0300, Serge Semin wrote:
> > > > > > > > > > > On Fri, May 08, 2020 at 02:12:42PM +0300, Andy Shevchenko wrote:
> > > > > > > > > > > > On Fri, May 08, 2020 at 01:53:00PM +0300, Serge Semin wrote:
> > > > 
> > > > ...
> > > > 
> > > > > > I leave it to Rob and Vinod.
> > > > > > It won't break our case, so, feel free with your approach.
> > > > > 
> > > > > I agree the DT is about describing the hardware and looks like value of
> > > > > 1 is not allowed. If allowed it should be added..
> > > > 
> > > > It's allowed at *run time*, it's illegal in *pre-silicon stage* when
> > > > synthesizing the IP.
> > > 
> > > Then it should be added ..
> > 
> > Vinod, max-burst-len is "MAXimum" burst length not "run-time or current or any
> > other" burst length. It's a constant defined at the IP-core synthesis stage and
> > according to the Data Book, MAX burst length can't be 1. The allowed values are
> > exactly as I described in the binding [4, 8, 16, 32, ...]. MAX burst length
> > defines the upper limit of the run-time burst length. So setting it to 1 isn't
> > about describing a hardware, but using DT for the software convenience.
> > 
> > -Sergey
> 
> Vinod, to make this completely clear. According to the DW DMAC data book:
> - In general, run-time parameter of the DMA transaction burst length (set in
>   the SRC_MSIZE/DST_MSIZE fields of the channel control register) may belong
>   to the set [1, 4, 8, 16, 32, 64, 128, 256].
> - Actual upper limit of the burst length run-time parameter is limited by a
>   constant defined at the IP-synthesize stage (it's called DMAH_CHx_MAX_MULT_SIZE)
>   and this constant belongs to the set [4, 8, 16, 32, 64, 128, 256]. (See, no 1
>   in this set).
> 
> So the run-time burst length in a case of particular DW DMA controller belongs
> to the range:
> 1 <= SRC_MSIZE <= DMAH_CHx_MAX_MULT_SIZE
> and
> 1 <= DST_MSIZE <= DMAH_CHx_MAX_MULT_SIZE
> 
> See. No matter which DW DMA controller we get, each of them will at least support
> the burst length of 1 and 4 transfer words. This is determined by design of the
> DW DMA controller IP since DMAH_CHx_MAX_MULT_SIZE constant set starts with 4.
> 
> In this patch I suggest to add the max-burst-len property, which specifies
> the upper limit for the run-time burst length. Since the maximum burst length
> capable to be set to the SRC_MSIZE/DST_MSIZE fields of the DMA channel control
> register is determined by the DMAH_CHx_MAX_MULT_SIZE constant (which can't be 1
> by the DW DMA IP design), max-burst-len property as being also responsible for
> the maximum burst length setting should be associated with DMAH_CHx_MAX_MULT_SIZE
> thus should belong to the same set [4, 8, 16, 32, 64, 128, 256].
> 
> So 1 shouldn't be in the enum of the max-burst-len property constraint, because
> hardware doesn't support such limitation by design, while setting 1 as
> max-burst-len would mean incorrect description of the DMA controller.
> 
> Vinod, could you take a look at the info I provided above and say your final word
> whether 1 should be really allowed to be in the max-burst-len enum constraints?
> I'll do as you say in the next version of the patchset.

I generally think the synthesis-time IP configuration should be implied
by the compatible string, which is why we have SoC-specific compatible
strings (of course, I dream of IP vendors making all of that discoverable,
which is only occasionally the case). There are exceptions to this. If
one SoC has the same IP configured in different ways, then we'd probably
have properties for the differences.

As to whether h/w configuration is okay in DT, the answer is yes. The 
question is whether it is determined by SoC, board, OS and also who 
would set it and how often. Something tuned per board and independent of 
the OS/user is the ideal example. 

Rob


* Re: [PATCH v2 1/6] dt-bindings: dma: dw: Convert DW DMAC to DT binding
  2020-05-08 10:52   ` [PATCH v2 1/6] dt-bindings: dma: dw: Convert DW DMAC to DT binding Serge Semin
@ 2020-05-18 17:50     ` Rob Herring
  0 siblings, 0 replies; 72+ messages in thread
From: Rob Herring @ 2020-05-18 17:50 UTC (permalink / raw)
  To: Serge Semin
  Cc: Vinod Koul, Viresh Kumar, Andy Shevchenko, Serge Semin,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Dan Williams, linux-mips, dmaengine, devicetree,
	linux-kernel

On Fri, May 08, 2020 at 01:52:59PM +0300, Serge Semin wrote:
> Modern device tree bindings are supposed to be created as YAML-files
> in accordance with dt-schema. This commit replaces the Synopsys
> DesignWare DMA controller legacy bare text bindings with a YAML file.
> The only required properties are "compatible", "reg", "#dma-cells" and
> "interrupts", which will be used by the driver to correctly find the
> controller memory region and handle its events. The rest of the properties
> are optional, since if either "dma-channels" or "dma-masters" isn't
> specified, the driver will attempt to auto-detect the IP core
> configuration.
> 
> Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
> Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>
> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
> Cc: Paul Burton <paulburton@kernel.org>
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: linux-mips@vger.kernel.org
> 
> ---
> 
> Changelog v2:
> - Rearrange SoBs.
> - Move $ref to the root level of the properties. Do the same with the
>   constraints.
> - Discard default settings defined out of the property enum constraint.
> - Replace "additionalProperties: false" with "unevaluatedProperties: false"
>   property.
> - Remove a label definition from the binding example.
> ---
>  .../bindings/dma/snps,dma-spear1340.yaml      | 161 ++++++++++++++++++
>  .../devicetree/bindings/dma/snps-dma.txt      |  69 --------
>  2 files changed, 161 insertions(+), 69 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml
>  delete mode 100644 Documentation/devicetree/bindings/dma/snps-dma.txt

Reviewed-by: Rob Herring <robh@kernel.org>


* Re: [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property
  2020-05-18 17:30                             ` Rob Herring
@ 2020-05-18 19:30                               ` Serge Semin
  0 siblings, 0 replies; 72+ messages in thread
From: Serge Semin @ 2020-05-18 19:30 UTC (permalink / raw)
  To: Rob Herring
  Cc: Serge Semin, Vinod Koul, Andy Shevchenko, Viresh Kumar,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Dan Williams, linux-mips, dmaengine, devicetree,
	linux-kernel

On Mon, May 18, 2020 at 11:30:03AM -0600, Rob Herring wrote:
> On Sun, May 17, 2020 at 08:47:39PM +0300, Serge Semin wrote:
> > On Fri, May 15, 2020 at 02:11:13PM +0300, Serge Semin wrote:
> > > On Fri, May 15, 2020 at 04:26:58PM +0530, Vinod Koul wrote:
> > > > On 15-05-20, 13:51, Andy Shevchenko wrote:
> > > > > On Fri, May 15, 2020 at 11:39:11AM +0530, Vinod Koul wrote:
> > > > > > On 12-05-20, 15:38, Andy Shevchenko wrote:
> > > > > > > On Tue, May 12, 2020 at 02:49:46PM +0300, Serge Semin wrote:
> > > > > > > > On Tue, May 12, 2020 at 12:08:04PM +0300, Andy Shevchenko wrote:
> > > > > > > > > On Tue, May 12, 2020 at 12:35:31AM +0300, Serge Semin wrote:
> > > > > > > > > > On Tue, May 12, 2020 at 12:01:38AM +0300, Andy Shevchenko wrote:
> > > > > > > > > > > On Mon, May 11, 2020 at 11:05:28PM +0300, Serge Semin wrote:
> > > > > > > > > > > > On Fri, May 08, 2020 at 02:12:42PM +0300, Andy Shevchenko wrote:
> > > > > > > > > > > > > On Fri, May 08, 2020 at 01:53:00PM +0300, Serge Semin wrote:
> > > > > 
> > > > > ...
> > > > > 
> > > > > > > I leave it to Rob and Vinod.
> > > > > > > It won't break our case, so, feel free with your approach.
> > > > > > 
> > > > > > I agree the DT is about describing the hardware and looks like value of
> > > > > > 1 is not allowed. If allowed it should be added..
> > > > > 
> > > > > It's allowed at *run time*, it's illegal in *pre-silicon stage* when
> > > > > synthesizing the IP.
> > > > 
> > > > Then it should be added ..
> > > 
> > > Vinod, max-burst-len is "MAXimum" burst length not "run-time or current or any
> > > other" burst length. It's a constant defined at the IP-core synthesis stage and
> > > according to the Data Book, MAX burst length can't be 1. The allowed values are
> > > exactly as I described in the binding [4, 8, 16, 32, ...]. MAX burst length
> > > defines the upper limit of the run-time burst length. So setting it to 1 isn't
> > > about describing a hardware, but using DT for the software convenience.
> > > 
> > > -Sergey
> > 
> > Vinod, to make this completely clear. According to the DW DMAC data book:
> > - In general, run-time parameter of the DMA transaction burst length (set in
> >   the SRC_MSIZE/DST_MSIZE fields of the channel control register) may belong
> >   to the set [1, 4, 8, 16, 32, 64, 128, 256].
> > - Actual upper limit of the burst length run-time parameter is limited by a
> >   constant defined at the IP-synthesize stage (it's called DMAH_CHx_MAX_MULT_SIZE)
> >   and this constant belongs to the set [4, 8, 16, 32, 64, 128, 256]. (See, no 1
> >   in this set).
> > 
> > So the run-time burst length in a case of particular DW DMA controller belongs
> > to the range:
> > 1 <= SRC_MSIZE <= DMAH_CHx_MAX_MULT_SIZE
> > and
> > 1 <= DST_MSIZE <= DMAH_CHx_MAX_MULT_SIZE
> > 
> > See. No matter which DW DMA controller we get, each of them will at least support
> > the burst length of 1 and 4 transfer words. This is determined by design of the
> > DW DMA controller IP since DMAH_CHx_MAX_MULT_SIZE constant set starts with 4.
> > 
> > In this patch I suggest to add the max-burst-len property, which specifies
> > the upper limit for the run-time burst length. Since the maximum burst length
> > capable to be set to the SRC_MSIZE/DST_MSIZE fields of the DMA channel control
> > register is determined by the DMAH_CHx_MAX_MULT_SIZE constant (which can't be 1
> > by the DW DMA IP design), max-burst-len property as being also responsible for
> > the maximum burst length setting should be associated with DMAH_CHx_MAX_MULT_SIZE
> > thus should belong to the same set [4, 8, 16, 32, 64, 128, 256].
> > 
> > So 1 shouldn't be in the enum of the max-burst-len property constraint, because
> > hardware doesn't support such limitation by design, while setting 1 as
> > max-burst-len would mean incorrect description of the DMA controller.
> > 
> > Vinod, could you take a look at the info I provided above and say your final word
> > whether 1 should be really allowed to be in the max-burst-len enum constraints?
> > I'll do as you say in the next version of the patchset.
> 
> I generally think the synthesis time IP configuration should be implied 
> by the compatible string which is why we have SoC specific compatible 
> strings (Of course I dream for IP vendors to make all that discoverable 
> which is only occasionally the case). There are exceptions to this. If 
> one SoC has the same IP configured in different ways, then we'd probably 
> have properties for the differences.

Hm, AFAIU from what you said, the IP configuration specific to a particular
SoC must be determined by the compatible string, and the configuration parameters
should be hidden somewhere in the driver internals, for instance in the platform
data structure. If several versions of the same IP are embedded into the SoC,
then the differences can be described by DT properties. Right?
If I am right, then that's weird. A lot of the currently available platforms (and
drivers) don't follow that rule: they just specify the generic IP compatible string
and describe their IP synthesis parameters by DT properties.

> 
> As to whether h/w configuration is okay in DT, the answer is yes. The 
> question is whether it is determined by SoC, board, OS and also who 
> would set it and how often. Something tuned per board and independent of 
> the OS/user is the ideal example.

So does this mean that I have to allow the max-burst-len property to be 1 even
though, in accordance with the DW DMA Data Book, the upper limit of the
burst length will never be 1, but will always start with 4? By allowing the
upper limit to be 1 we wouldn't describe the h/w configuration (the hardware has
already been configured with the maximum burst length parameter
DMAH_CHx_MAX_MULT_SIZE at the IP synthesis stage), but would set up an
artificial constraint on the maximum allowed burst length. Are you ok with
this, and should 1 be permitted anyway?
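
To illustrate the distinction I mean, here is a userspace sketch only (not
driver code). The helper names are made up for this email; the two value sets
are the ones from the Data Book quoted above: the synthesis-time maximum is one
of [4, 8, ..., 256], while a run-time burst is either 1 or one of those values
up to the channel maximum.

```c
#include <stdbool.h>

/* Hypothetical helper: true if v is a legal synthesis-time maximum
 * (DMAH_CHx_MAX_MULT_SIZE), i.e. a power of two in [4, 256]. */
static bool dw_max_burst_is_valid(unsigned int v)
{
	return v >= 4 && v <= 256 && (v & (v - 1)) == 0;
}

/* Hypothetical helper: true if a run-time burst length (SRC_MSIZE or
 * DST_MSIZE, counted in transfer words) is legal for a channel whose
 * synthesis-time maximum is max: either 1, or a power of two in [4, max]. */
static bool dw_runtime_burst_is_valid(unsigned int burst, unsigned int max)
{
	if (!dw_max_burst_is_valid(max))
		return false;
	if (burst == 1)
		return true;
	return burst >= 4 && burst <= max && (burst & (burst - 1)) == 0;
}
```

With these helpers 1 is accepted as a run-time burst on every channel, but is
rejected as a maximum, which is the constraint the binding enum expresses.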

-Sergey

> 
> Rob


* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-17 19:23                           ` Serge Semin
@ 2020-05-19 17:02                             ` Vinod Koul
  2020-05-21  1:40                               ` Serge Semin
  0 siblings, 1 reply; 72+ messages in thread
From: Vinod Koul @ 2020-05-19 17:02 UTC (permalink / raw)
  To: Serge Semin
  Cc: Serge Semin, Andy Shevchenko, Mark Brown, Viresh Kumar,
	Dan Williams, Alexey Malahov, Thomas Bogendoerfer, Paul Burton,
	Ralf Baechle, Arnd Bergmann, Rob Herring, linux-mips, devicetree,
	dmaengine, Linux Kernel Mailing List

On 17-05-20, 22:23, Serge Semin wrote:
> On Fri, May 15, 2020 at 12:00:39PM +0530, Vinod Koul wrote:
> > Hi Serge,
> > 
> > On 12-05-20, 15:42, Serge Semin wrote:
> > > Vinod,
> > > 
> > > Could you join the discussion for a little bit?
> > > 
> > > In order to properly fix the problem discussed in this topic, we need to
> > > introduce an additional capability exported by DMA channel handlers on per-channel
> > > basis. It must be a number, which would indicate an upper limitation of the SG list
> > > entries amount.
> > > Something like this would do it:
> > > struct dma_slave_caps {
> > > ...
> > > 	unsigned int max_sg_nents;
> > > ...
> > 
> > Looking at the discussion, I agree we should can this up in the
> > interface. The max_dma_len suggests the length of a descriptor allowed,
> > it does not convey the sg_nents supported which in the case of nollp is
> > one.
> > 
> > Btw is this is a real hardware issue, I have found that value of such
> > hardware is very less and people did fix it up in subsequent revs to add
> > llp support.
> 
> Yes, it is. My DW DMAC doesn't support LLP and there isn't going to be new SoC
> version produced.(

Ouch

> > Also, another question is why this cannot be handled in driver, I agree
> > your hardware does not support llp but that does not stop you from
> > breaking a multi_sg list into N hardware descriptors and keep submitting
> > them (for this to work submission should be done in isr and not in bh,
> > unfortunately very few driver take that route).
> 
> Current DW DMA driver does that, but this isn't enough. The problem is that
> in order to fix the issue in the DMA hardware driver we need to introduce
> an inter-dependent channels abstraction and synchronously feed both Tx and
> Rx DMA channels with hardware descriptors (LLP entries) one-by-one. This hardly
> needed by any slave device driver rather than SPI, which Tx and Rx buffers are
> inter-dependent. So Andy's idea was to move the fix to the SPI driver (feed
> the DMA engine channels with Tx and Rx data buffers synchronously), but DMA
> engine would provide an info whether such fix is required. This can be
> determined by the maximum SG entries capability.

Okay, but having the sw limitation removed would also be a good idea, so that
you can handle any user. I will leave it up to you; either way is okay.

> 
> (Note max_sg_nents isn't a limitation on the number of SG entries supported by
> the DMA driver, but the number of SG entries handled by the DMA engine in a
> single DMA transaction.)
> 
> > TBH the max_sg_nents or
> > max_dma_len are HW restrictions and SW *can* deal with then :-)
> 
> Yes, it can, but it only works for the cases when individual DMA channels are
> utilized. DMA hardware driver doesn't know that the target and source slave
> device buffers (SPI Tx and Rx FIFOs) are inter-dependent, that writing to one
> you will implicitly push data to another. So due to the interrupts handling
> latency Tx DMA channel is restarted faster than Rx DMA channel is reinitialized.
> This causes the SPI Rx FIFO overflow and data loss.
> 
> > 
> > In an ideal world, you should break the sw descriptor submitted into N hw
> > descriptors and submit to hardware and let user know when the sw
> > descriptor is completed. Of course we do not do that :(
> 
> Well, the current Dw DMA driver does that. But due to the two slave device
> buffers inter-dependency this isn't enough to perform safe DMA transactions.
> Due to the interrupts handling latency Tx DMA channel pushes data to the slave
> device buffer faster than Rx DMA channel starts to handle incoming data. This
> causes the SPI Rx FIFO overflow.
> 
> > 
> > > };
> > > As Andy suggested it's value should be interpreted as:
> > > 0          - unlimited number of entries,
> > > 1:MAX_UINT - actual limit to the number of entries.
> > 
> 
> > Hmm why 0, why not MAX_UINT for unlimited?
> 
> 0 is much better for many reasons. First of all MAX_UINT is a lot, but it's
> still a number. On x64 platform this might be actual limit if for instance
> the block-size register is 32-bits wide. Secondly interpreting 0 as unlimited
> number of entries would be more suitable since most of the drivers support
> LLP functionality and we wouldn't need to update their code to set MAX_UINT.
> Thirdly DMA engines, which don't support LLPs would need to set this parameter
> as 1. So if we do as you say and interpret unlimited number of LLPs as MAX_UINT,
> then 0 would left unused.
> 
> To sum up I also think that using 0 as unlimited number SG entries supported is
> much better.

ok
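
For reference, the agreed interpretation (0 means unlimited, anything else is
the actual limit) could be consumed roughly like this. It is a standalone
sketch with made-up names: struct sketch_slave_caps only mirrors the proposed
max_sg_nents field and is not the kernel structure.

```c
/* Hypothetical mirror of the proposed capability; not struct dma_slave_caps. */
struct sketch_slave_caps {
	unsigned int max_sg_nents; /* 0: unlimited, otherwise the actual limit */
};

/* Number of SG entries to hand to the DMA engine in one DMA transaction. */
static unsigned int sketch_sg_chunk(const struct sketch_slave_caps *caps,
				    unsigned int nents_left)
{
	if (caps->max_sg_nents == 0 || nents_left <= caps->max_sg_nents)
		return nents_left;
	return caps->max_sg_nents;
}

/* Number of separate DMA transactions needed for a list of nents entries. */
static unsigned int sketch_sg_rounds(const struct sketch_slave_caps *caps,
				     unsigned int nents)
{
	unsigned int rounds = 0;

	while (nents) {
		nents -= sketch_sg_chunk(caps, nents);
		rounds++;
	}
	return rounds;
}
```

A consumer such as the SPI driver would see max_sg_nents == 1 for a
controller without LLP support and could then feed the Tx and Rx channels one
entry at a time.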

> > > In addition to that seeing the dma_get_slave_caps() method provide the caps only
> > > by getting them from the DMA device descriptor, while we need to have an info on
> > > per-channel basis, it would be good to introduce a new DMA-device callback like:
> > > struct dma_device {
> > > ...
> > > 	int (*device_caps)(struct dma_chan *chan,
> > > 			   struct dma_slave_caps *caps);
> > 
> 
> > Do you have a controller where channel caps are on per-channel basis?
> 
> Yes, I do. Our DW DMA controller has got the maximum burst length non-uniformly
> distributed per DMA channels. There are eight channels our controller supports,
> among which first two channels can burst up to 32 transfer words, but the rest
> of the channels support bursting up to 4 transfer words.
> 
> So having such device_caps() callback to customize the device capabilities on
> per-DMA-channel basis would be very useful! What do you think?

Okay looks like per-ch basis is the way forward!
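
A rough standalone sketch of the per-channel idea follows, just to make the
flow concrete. All types and names here are simplified stand-ins invented for
illustration, not the dmaengine structures; the channel numbers and burst
values follow Serge's description above (first two channels up to 32 transfer
words, the rest up to 4, no LLP support).

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types; only the fields used here. */
struct sketch_caps {
	unsigned int max_burst;
	unsigned int max_sg_nents; /* 0: unlimited */
};

struct sketch_dev;

struct sketch_chan {
	struct sketch_dev *dev;
	unsigned int id;
};

struct sketch_dev {
	struct sketch_caps defaults; /* device-wide capabilities */
	/* Optional per-channel override, modelled on the proposed callback. */
	int (*device_caps)(struct sketch_chan *chan, struct sketch_caps *caps);
};

/* Fill caps from the device defaults, then let the driver adjust them
 * for this particular channel if it provides the callback. */
static int sketch_get_caps(struct sketch_chan *chan, struct sketch_caps *caps)
{
	if (!chan || !caps)
		return -1;

	*caps = chan->dev->defaults;
	if (chan->dev->device_caps)
		return chan->dev->device_caps(chan, caps);
	return 0;
}

/* Example override for a controller like the one described above. */
static int sketch_dw_caps(struct sketch_chan *chan, struct sketch_caps *caps)
{
	caps->max_burst = chan->id < 2 ? 32 : 4;
	caps->max_sg_nents = 1; /* no LLP: one SG entry per transaction */
	return 0;
}
```

Drivers that do not need per-channel tuning would simply leave the callback
unset and keep reporting the device-wide values.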

-- 
~Vinod


* Re: [PATCH v2 5/6] dmaengine: dw: Introduce max burst length hw config
  2020-05-17 19:38             ` Serge Semin
@ 2020-05-19 17:07               ` Vinod Koul
  2020-05-21  1:47                 ` Serge Semin
  0 siblings, 1 reply; 72+ messages in thread
From: Vinod Koul @ 2020-05-19 17:07 UTC (permalink / raw)
  To: Serge Semin
  Cc: Serge Semin, Andy Shevchenko, Viresh Kumar, Dan Williams,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Rob Herring, linux-mips, devicetree, dmaengine,
	linux-kernel

On 17-05-20, 22:38, Serge Semin wrote:
> On Fri, May 15, 2020 at 12:09:50PM +0530, Vinod Koul wrote:
> > On 12-05-20, 22:12, Andy Shevchenko wrote:
> > > On Tue, May 12, 2020 at 05:08:20PM +0300, Serge Semin wrote:
> > > > On Fri, May 08, 2020 at 02:41:53PM +0300, Andy Shevchenko wrote:
> > > > > On Fri, May 08, 2020 at 01:53:03PM +0300, Serge Semin wrote:
> > > > > > IP core of the DW DMA controller may be synthesized with different
> > > > > > max burst length of the transfers per each channel. According to Synopsis
> > > > > > having the fixed maximum burst transactions length may provide some
> > > > > > performance gain. At the same time setting up the source and destination
> > > > > > multi size exceeding the max burst length limitation may cause a serious
> > > > > > problems. In our case the system just hangs up. In order to fix this
> > > > > > lets introduce the max burst length platform config of the DW DMA
> > > > > > controller device and don't let the DMA channels configuration code
> > > > > > exceed the burst length hardware limitation. Depending on the IP core
> > > > > > configuration the maximum value can vary from channel to channel.
> > > > > > It can be detected either in runtime from the DWC parameter registers
> > > > > > or from the dedicated dts property.
> > > > > 
> > > > > I'm wondering what can be the scenario when your peripheral will ask something
> > > > > which is not supported by DMA controller?
> > > > 
> > > > I may misunderstood your statement, because seeing your activity around my
> > > > patchsets including the SPI patchset and sometimes very helpful comments,
> > > > this question answer seems too obvious to see you asking it.
> > > > 
> > > > No need to go far for an example. See the DW APB SSI driver. Its DMA module
> > > > specifies the burst length to be 16, while not all of ours channels supports it.
> > > > Yes, originally it has been developed for the Intel Midfield SPI, but since I
> > > > converted the driver into a generic code we can't use a fixed value. For instance
> > > > in our hardware only two DMA channels of total 16 are capable of bursting up to
> > > > 16 bytes (data items) at a time, the rest of them are limited with up to 4 bytes
> > > > burst length. While there are two SPI interfaces, each of which need to have two
> > > > DMA channels for communications. So I need four channels in total to allocate to
> > > > provide the DMA capability for all interfaces. In order to set the SPI controller
> > > > up with valid optimized parameters the max-burst-length is required. Otherwise we
> > > > can end up with buffers overrun/underrun.
> > > 
> > > Right, and we come to the question which channel better to be used by SPI and
> > > the rest devices. Without specific filter function you can easily get into a
> > > case of inverted optimizations, when SPI got channels with burst = 4, while
> > > it's needed 16, and other hardware otherwise. Performance wise it's worse
> > > scenario which we may avoid in the first place, right?
> > 
> > If one has channels which are different and described as such in DT,
> > then I think it does make sense to specify in your board-dt about the
> > specific channels you would require...
> 
> Well, we do have such hardware. Our DW DMA controller has got different max
> burst lengths assigned to first two and the rest of the channels. But creating
> a functionality of the individual channels assignment is a matter of different
> patchset. Sorry. It's not one of my task at the moment.
> 
> My primary task is to integrate the Baikal-T1 SoC support into the kernel. I've
> refactored a lot of code found in the Baikal-T1 SDK and currently under a pressure
> of a lot of review. Alas there is no time to create new functionality as you
> suggest. In future I may provide such, but not in the framework of this patchset.

Well, you need to tell your folks that upstreaming does not work under
pressure and we can't put a timeline on upstreaming. It needs to be done in
what is deemed the right way. Reviews can take time; that needs to be
understood as well!

> > > > > Peripheral needs to supply a lot of configuration parameters specific to the
> > > > > DMA controller in use (that's why we have struct dw_dma_slave).
> > > > > So, seems to me the feasible approach is supply correct data in the first place.
> > > > 
> > > > How to supply a valid data if clients don't know the DMA controller limitations
> > > > in general?
> > > 
> > > This is a good question. DMA controllers are quite different and having unified
> > > capabilities structure for all is almost impossible task to fulfil. That's why
> > > custom filter function(s) can help here. Based on compatible string you can
> > > implement whatever customized quirks like two functions, for example, to try 16
> > > burst size first and fallback to 4 if none was previously found.
> > > 
> > > > > If you have specific channels to acquire then you probably need to provide a
> > > > > custom xlate / filter functions. Because above seems a bit hackish workaround
> > > > > of dynamic channel allocation mechanism.
> > > > 
> > > > No, I don't have a specific channel to acquire and in general you may use any
> > > > returned from the DMA subsystem (though some platforms may need a dedicated
> > > > channels to use, in this case xlate / filter is required). In our SoC any DW DMAC
> > > > channel can be used for any DMA-capable peripherals like SPI, I2C, UART. But
> > > > their DMA settings must be properly and optimally configured. It can be only done
> > > > if you know the DMA controller parameters like max burst length, max block-size,
> > > > etc.
> > > > 
> > > > So no. The change proposed by this patch isn't workaround, but a useful feature,
> > > > moreover expected to be supported by the generic DMA subsystem.
> > > 
> > > See above.
> > > 
> > > > > But let's see what we can do better. Since maximum is defined on the slave side
> > > > > device, it probably needs to define minimum as well, otherwise it's possible
> > > > > that some hardware can't cope underrun bursts.
> > > > 
> > > > There is no need to define a minimum if such a limit doesn't exist except a
> > > > natural 1. Moreover it doesn't exist for all DMA controllers, seeing no one has
> > > > added such a capability into the generic DMA subsystem so far.
> > > 
> > > There is a contract between provider and consumer about DMA resource. That's
> > > why both sides should participate in fulfilling it. Theoretically it may be a
> > > hardware that doesn't support minimum burst available in DMA by a reason. For
> > > such we would need minimum to be provided as well.
> > 
> > Agreed and if required caps should be extended to tell consumer the
> > minimum values supported.
> 
> Sorry, it's not required by our hardware. Is there any, which actually has such
> limitation? (minimum burst length)

IIUC the idea is that you will tell the maximum and minimum values supported
and the client can pick the best value. Especially in the case of slave
transfers, things like burst and msize are governed by client capability and
usage. So exposing the set to pick from would make sense.
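
Something along these lines, as a generic sketch only. The function name and
parameters are hypothetical, and it assumes every power of two between the
reported minimum and maximum is supported, which is a simplification (a
particular controller may skip some values).

```c
/*
 * Pick the burst length a client should use, given the range the DMA
 * channel reports and the burst the client would prefer (for example its
 * FIFO threshold). Returns 0 if no usable value exists.
 */
static unsigned int sketch_pick_burst(unsigned int dma_min,
				      unsigned int dma_max,
				      unsigned int client_pref)
{
	unsigned int burst = client_pref < dma_max ? client_pref : dma_max;

	if (dma_min == 0 || dma_min > dma_max)
		return 0;

	/* Round down to a power of two by clearing the lowest set bits. */
	while (burst & (burst - 1))
		burst &= burst - 1;

	return burst >= dma_min ? burst : 0;
}
```

So a client preferring 16 on a channel limited to 4 would get 4, while a
client that can only do 2 on a channel with a minimum of 4 would get 0 and
have to fall back to PIO or another channel.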

-- 
~Vinod


* Re: [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property
  2020-05-17 17:47                           ` Serge Semin
  2020-05-18 17:30                             ` Rob Herring
@ 2020-05-19 17:13                             ` Vinod Koul
  2020-05-21  1:33                               ` Serge Semin
  1 sibling, 1 reply; 72+ messages in thread
From: Vinod Koul @ 2020-05-19 17:13 UTC (permalink / raw)
  To: Serge Semin
  Cc: Serge Semin, Andy Shevchenko, Viresh Kumar, Rob Herring,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Dan Williams, linux-mips, dmaengine, devicetree,
	linux-kernel

On 17-05-20, 20:47, Serge Semin wrote:
> On Fri, May 15, 2020 at 02:11:13PM +0300, Serge Semin wrote:
> > On Fri, May 15, 2020 at 04:26:58PM +0530, Vinod Koul wrote:
> > > On 15-05-20, 13:51, Andy Shevchenko wrote:
> > > > On Fri, May 15, 2020 at 11:39:11AM +0530, Vinod Koul wrote:
> > > > > On 12-05-20, 15:38, Andy Shevchenko wrote:
> > > > > > On Tue, May 12, 2020 at 02:49:46PM +0300, Serge Semin wrote:
> > > > > > > On Tue, May 12, 2020 at 12:08:04PM +0300, Andy Shevchenko wrote:
> > > > > > > > On Tue, May 12, 2020 at 12:35:31AM +0300, Serge Semin wrote:
> > > > > > > > > On Tue, May 12, 2020 at 12:01:38AM +0300, Andy Shevchenko wrote:
> > > > > > > > > > On Mon, May 11, 2020 at 11:05:28PM +0300, Serge Semin wrote:
> > > > > > > > > > > On Fri, May 08, 2020 at 02:12:42PM +0300, Andy Shevchenko wrote:
> > > > > > > > > > > > On Fri, May 08, 2020 at 01:53:00PM +0300, Serge Semin wrote:
> > > > 
> > > > ...
> > > > 
> > > > > > I leave it to Rob and Vinod.
> > > > > > It won't break our case, so, feel free with your approach.
> > > > > 
> > > > > I agree the DT is about describing the hardware and looks like value of
> > > > > 1 is not allowed. If allowed it should be added..
> > > > 
> > > > It's allowed at *run time*, it's illegal in *pre-silicon stage* when
> > > > synthesizing the IP.
> > > 
> > > Then it should be added ..
> > 
> > Vinod, max-burst-len is "MAXimum" burst length not "run-time or current or any
> > other" burst length. It's a constant defined at the IP-core synthesis stage and
> > according to the Data Book, MAX burst length can't be 1. The allowed values are
> > exactly as I described in the binding [4, 8, 16, 32, ...]. MAX burst length
> > defines the upper limit of the run-time burst length. So setting it to 1 isn't
> > about describing a hardware, but using DT for the software convenience.
> > 
> > -Sergey
> 
> Vinod, to make this completely clear. According to the DW DMAC data book:
> - In general, run-time parameter of the DMA transaction burst length (set in
>   the SRC_MSIZE/DST_MSIZE fields of the channel control register) may belong
>   to the set [1, 4, 8, 16, 32, 64, 128, 256].

So 1 is a valid value for msize.

> - Actual upper limit of the burst length run-time parameter is limited by a
>   constant defined at the IP-synthesize stage (it's called DMAH_CHx_MAX_MULT_SIZE)
>   and this constant belongs to the set [4, 8, 16, 32, 64, 128, 256]. (See, no 1
>   in this set).

The maximum can be 4 onwards, but in my configuration I can choose 1 as the
value for msize.

> So the run-time burst length in a case of particular DW DMA controller belongs
> to the range:
> 1 <= SRC_MSIZE <= DMAH_CHx_MAX_MULT_SIZE
> and
> 1 <= DST_MSIZE <= DMAH_CHx_MAX_MULT_SIZE
> 
> See. No matter which DW DMA controller we get, each of them will at least support
> the burst length of 1 and 4 transfer words. This is determined by design of the
> DW DMA controller IP since DMAH_CHx_MAX_MULT_SIZE constant set starts with 4.
> 
> In this patch I suggest to add the max-burst-len property, which specifies
> the upper limit for the run-time burst length. Since the maximum burst length
> capable to be set to the SRC_MSIZE/DST_MSIZE fields of the DMA channel control
> register is determined by the DMAH_CHx_MAX_MULT_SIZE constant (which can't be 1
> by the DW DMA IP design), max-burst-len property as being also responsible for
> the maximum burst length setting should be associated with DMAH_CHx_MAX_MULT_SIZE
> thus should belong to the same set [4, 8, 16, 32, 64, 128, 256].
> 
> So 1 shouldn't be in the enum of the max-burst-len property constraint, because
> hardware doesn't support such limitation by design, while setting 1 as
> max-burst-len would mean incorrect description of the DMA controller.
> 
> Vinod, could you take a look at the info I provided above and say your final word
> whether 1 should be really allowed to be in the max-burst-len enum constraints?
> I'll do as you say in the next version of the patchset.

You are specifying the parameter which will be used to pick; I think
starting with 4 makes sense, as we are specifying the maximum allowed values
for msize. Values less than or equal to this would be allowed; I guess
that should be added to the documentation.
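
In other words, roughly the following, as a userspace sketch. The names are
made up, and the field encoding (0 for a burst of 1, 1 for 4, 2 for 8, up to
7 for 256) is my reading of the MSIZE values listed above, so please treat it
as an assumption rather than a quote from the Data Book.

```c
/* Userspace stand-in for the kernel's fls(): 1-based position of the
 * highest set bit, 0 if no bit is set. */
static unsigned int sketch_fls(unsigned int v)
{
	unsigned int pos = 0;

	while (v) {
		pos++;
		v >>= 1;
	}
	return pos;
}

/*
 * Clamp the requested burst (in transfer words) to the channel maximum and
 * convert it to the assumed MSIZE field encoding. Values that are not in
 * the set [1, 4, 8, ...] are rounded down to the nearest member.
 */
static unsigned int sketch_encode_msize(unsigned int burst,
					unsigned int max_burst)
{
	if (burst > max_burst)
		burst = max_burst;

	return burst > 1 ? sketch_fls(burst) - 2 : 0;
}
```

A request for 16 on a channel with max-burst-len = 4 is then programmed as 4,
and a request for 1 stays 1 on any channel.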

thanks
-- 
~Vinod


* Re: [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property
  2020-05-19 17:13                             ` Vinod Koul
@ 2020-05-21  1:33                               ` Serge Semin
  0 siblings, 0 replies; 72+ messages in thread
From: Serge Semin @ 2020-05-21  1:33 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Serge Semin, Andy Shevchenko, Viresh Kumar, Rob Herring,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Dan Williams, linux-mips, dmaengine, devicetree,
	linux-kernel

On Tue, May 19, 2020 at 10:43:04PM +0530, Vinod Koul wrote:
> On 17-05-20, 20:47, Serge Semin wrote:
> > On Fri, May 15, 2020 at 02:11:13PM +0300, Serge Semin wrote:
> > > On Fri, May 15, 2020 at 04:26:58PM +0530, Vinod Koul wrote:
> > > > On 15-05-20, 13:51, Andy Shevchenko wrote:
> > > > > On Fri, May 15, 2020 at 11:39:11AM +0530, Vinod Koul wrote:
> > > > > > On 12-05-20, 15:38, Andy Shevchenko wrote:
> > > > > > > On Tue, May 12, 2020 at 02:49:46PM +0300, Serge Semin wrote:
> > > > > > > > On Tue, May 12, 2020 at 12:08:04PM +0300, Andy Shevchenko wrote:
> > > > > > > > > On Tue, May 12, 2020 at 12:35:31AM +0300, Serge Semin wrote:
> > > > > > > > > > On Tue, May 12, 2020 at 12:01:38AM +0300, Andy Shevchenko wrote:
> > > > > > > > > > > On Mon, May 11, 2020 at 11:05:28PM +0300, Serge Semin wrote:
> > > > > > > > > > > > On Fri, May 08, 2020 at 02:12:42PM +0300, Andy Shevchenko wrote:
> > > > > > > > > > > > > On Fri, May 08, 2020 at 01:53:00PM +0300, Serge Semin wrote:
> > > > > 
> > > > > ...
> > > > > 
> > > > > > > I leave it to Rob and Vinod.
> > > > > > > It won't break our case, so, feel free with your approach.
> > > > > > 
> > > > > > I agree the DT is about describing the hardware and looks like value of
> > > > > > 1 is not allowed. If allowed it should be added..
> > > > > 
> > > > > It's allowed at *run time*, it's illegal in *pre-silicon stage* when
> > > > > synthesizing the IP.
> > > > 
> > > > Then it should be added ..
> > > 
> > > Vinod, max-burst-len is "MAXimum" burst length not "run-time or current or any
> > > other" burst length. It's a constant defined at the IP-core synthesis stage and
> > > according to the Data Book, MAX burst length can't be 1. The allowed values are
> > > exactly as I described in the binding [4, 8, 16, 32, ...]. MAX burst length
> > > defines the upper limit of the run-time burst length. So setting it to 1 isn't
> > > about describing a hardware, but using DT for the software convenience.
> > > 
> > > -Sergey
> > 
> > Vinod, to make this completely clear. According to the DW DMAC data book:
> > - In general, run-time parameter of the DMA transaction burst length (set in
> >   the SRC_MSIZE/DST_MSIZE fields of the channel control register) may belong
> >   to the set [1, 4, 8, 16, 32, 64, 128, 256].
> 
> so 1 is valid value for msize

Right.

> 
> > - Actual upper limit of the burst length run-time parameter is limited by a
> >   constant defined at the IP-synthesize stage (it's called DMAH_CHx_MAX_MULT_SIZE)
> >   and this constant belongs to the set [4, 8, 16, 32, 64, 128, 256]. (See, no 1
> >   in this set).
> 
> maximum can be 4 onwards, but in my configuration I can choose 1 as
> value for msize

It's true for all configurations. msize can always be set to at least 0 or 1,
which correspond to burst lengths of 1 and 4 respectively.
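
To illustrate the encoding, here is a minimal sketch in plain C. It assumes
the MSIZE field keeps going in powers of two after the two values named above
(2 for 8, up to 7 for 256), which is consistent with the run-time set
[1, 4, 8, ..., 256] quoted earlier. The names dw_msize_encode(),
dw_msize_decode() and DW_MSIZE_MAX are made up for this example and are not
the driver's API.

```c
#include <assert.h>

/* Assumption: MSIZE is a 3-bit field, so 7 encodes a burst of 256 words. */
#define DW_MSIZE_MAX 7U

/*
 * Encode a burst length (in transfer words) into an MSIZE field value.
 * 1..3 -> 0 (burst of 1), 4..7 -> 1 (burst of 4), 8..15 -> 2, and so on.
 * Lengths that are not in the supported set are rounded down, and
 * anything above 256 saturates at DW_MSIZE_MAX.
 */
static unsigned int dw_msize_encode(unsigned int burst_len)
{
	unsigned int msize = 0;

	for (burst_len >>= 2; burst_len && msize < DW_MSIZE_MAX; burst_len >>= 1)
		msize++;

	return msize;
}

/* Decode an MSIZE field value back into a burst length in transfer words. */
static unsigned int dw_msize_decode(unsigned int msize)
{
	assert(msize <= DW_MSIZE_MAX);

	return msize ? 1U << (msize + 1) : 1U;
}
```

With this mapping there is no field value that means a burst of 2 words,
which is why the run-time set jumps from 1 straight to 4.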

> 
> > So the run-time burst length in a case of particular DW DMA controller belongs
> > to the range:
> > 1 <= SRC_MSIZE <= DMAH_CHx_MAX_MULT_SIZE
> > and
> > 1 <= DST_MSIZE <= DMAH_CHx_MAX_MULT_SIZE
> > 
> > See. No matter which DW DMA controller we get, each of them will at least support
> > the burst length of 1 and 4 transfer words. This is determined by design of the
> > DW DMA controller IP since DMAH_CHx_MAX_MULT_SIZE constant set starts with 4.
> > 
> > In this patch I suggest to add the max-burst-len property, which specifies
> > the upper limit for the run-time burst length. Since the maximum burst length
> > capable to be set to the SRC_MSIZE/DST_MSIZE fields of the DMA channel control
> > register is determined by the DMAH_CHx_MAX_MULT_SIZE constant (which can't be 1
> > by the DW DMA IP design), max-burst-len property as being also responsible for
> > the maximum burst length setting should be associated with DMAH_CHx_MAX_MULT_SIZE
> > thus should belong to the same set [4, 8, 16, 32, 64, 128, 256].
> > 
> > So 1 shouldn't be in the enum of the max-burst-len property constraint, because
> > hardware doesn't support such limitation by design, while setting 1 as
> > max-burst-len would mean incorrect description of the DMA controller.
> > 
> > Vinod, could you take a look at the info I provided above and say your final word
> > whether 1 should be really allowed to be in the max-burst-len enum constraints?
> > I'll do as you say in the next version of the patchset.
> 
> You are specifying the parameter which will be used to pick. I think
> starting with 4 makes sense, as we are specifying the maximum allowed values
> for msize. Values less than or equal to this would be allowed; I guess
> that should be added to the documentation.

Right. Thanks. I'll add a proper description to the property in the binding file.
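
As a rough sketch of what that description amounts to (plain C, not the
binding schema and not driver code): max-burst-len must be one of
[4, 8, 16, 32, 64, 128, 256], while a run-time burst may be 1 or any value
of that set not exceeding max-burst-len. The helper names below are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * max-burst-len mirrors DMAH_CHx_MAX_MULT_SIZE, so it must be a power of
 * two in the range 4..256. A value of 1 is deliberately rejected.
 */
static bool dw_max_burst_len_valid(unsigned int val)
{
	return val >= 4 && val <= 256 && (val & (val - 1)) == 0;
}

/*
 * A run-time burst length is acceptable if it is 1, or a power of two
 * from 4 up to (and including) the channel's max-burst-len.
 */
static bool dw_runtime_burst_allowed(unsigned int burst,
				     unsigned int max_burst_len)
{
	/* The upper limit itself must describe the hardware correctly. */
	assert(dw_max_burst_len_valid(max_burst_len));

	if (burst == 1)
		return true;

	return dw_max_burst_len_valid(burst) && burst <= max_burst_len;
}
```

So a channel described with max-burst-len = 4 still accepts bursts of 1 and
4, but not 8.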

-Sergey

> 
> thanks
> -- 
> ~Vinod


* Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported
  2020-05-19 17:02                             ` Vinod Koul
@ 2020-05-21  1:40                               ` Serge Semin
  0 siblings, 0 replies; 72+ messages in thread
From: Serge Semin @ 2020-05-21  1:40 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Serge Semin, Andy Shevchenko, Mark Brown, Viresh Kumar,
	Dan Williams, Alexey Malahov, Thomas Bogendoerfer, Paul Burton,
	Ralf Baechle, Arnd Bergmann, Rob Herring, linux-mips, devicetree,
	dmaengine, Linux Kernel Mailing List

On Tue, May 19, 2020 at 10:32:46PM +0530, Vinod Koul wrote:
> On 17-05-20, 22:23, Serge Semin wrote:
> > On Fri, May 15, 2020 at 12:00:39PM +0530, Vinod Koul wrote:
> > > Hi Serge,
> > > 
> > > On 12-05-20, 15:42, Serge Semin wrote:
> > > > Vinod,
> > > > 
> > > > Could you join the discussion for a little bit?
> > > > 
> > > > In order to properly fix the problem discussed in this topic, we need to
> > > > introduce an additional capability exported by DMA channel handlers on per-channel
> > > > basis. It must be a number, which would indicate an upper limitation of the SG list
> > > > entries amount.
> > > > Something like this would do it:
> > > > struct dma_slave_caps {
> > > > ...
> > > > 	unsigned int max_sg_nents;
> > > > ...
> > > 
> > > Looking at the discussion, I agree we should can this up in the
> > > interface. The max_dma_len suggests the length of a descriptor allowed,
> > > it does not convey the sg_nents supported which in the case of nollp is
> > > one.
> > > 
> > > Btw is this is a real hardware issue, I have found that value of such
> > > hardware is very less and people did fix it up in subsequent revs to add
> > > llp support.
> > 
> > Yes, it is. My DW DMAC doesn't support LLP and there isn't going to be new SoC
> > version produced.(
> 
> Ouch
> 
> > > Also, another question is why this cannot be handled in driver, I agree
> > > your hardware does not support llp but that does not stop you from
> > > breaking a multi_sg list into N hardware descriptors and keep submitting
> > > them (for this to work submission should be done in isr and not in bh,
> > > unfortunately very few driver take that route).
> > 
> > Current DW DMA driver does that, but this isn't enough. The problem is that
> > in order to fix the issue in the DMA hardware driver we need to introduce
> > an inter-dependent channels abstraction and synchronously feed both Tx and
> > Rx DMA channels with hardware descriptors (LLP entries) one-by-one. This hardly
> > needed by any slave device driver rather than SPI, which Tx and Rx buffers are
> > inter-dependent. So Andy's idea was to move the fix to the SPI driver (feed
> > the DMA engine channels with Tx and Rx data buffers synchronously), but DMA
> > engine would provide an info whether such fix is required. This can be
> > determined by the maximum SG entries capability.
> 
> Okay but having the sw limitation removed would also be a good idea, you
> can handle any user, I will leave it upto you, either way is okay
> 
> > 
> > (Note max_sg_nents isn't a limitation on the number of SG entries supported by
> > the DMA driver, but the number of SG entries handled by the DMA engine in a
> > single DMA transaction.)
> > 
> > > TBH the max_sg_nents or
> > > max_dma_len are HW restrictions and SW *can* deal with then :-)
> > 
> > Yes, it can, but it only works for the cases when individual DMA channels are
> > utilized. DMA hardware driver doesn't know that the target and source slave
> > device buffers (SPI Tx and Rx FIFOs) are inter-dependent, that writing to one
> > you will implicitly push data to another. So due to the interrupts handling
> > latency Tx DMA channel is restarted faster than Rx DMA channel is reinitialized.
> > This causes the SPI Rx FIFO overflow and data loss.
> > 
> > > 
> > > In an ideal world, you should break the sw descriptor submitted into N hw
> > > descriptors and submit to hardware and let user know when the sw
> > > descriptor is completed. Of course we do not do that :(
> > 
> > Well, the current Dw DMA driver does that. But due to the two slave device
> > buffers inter-dependency this isn't enough to perform safe DMA transactions.
> > Due to the interrupts handling latency Tx DMA channel pushes data to the slave
> > device buffer faster than Rx DMA channel starts to handle incoming data. This
> > causes the SPI Rx FIFO overflow.
> > 
> > > 
> > > > };
> > > > As Andy suggested it's value should be interpreted as:
> > > > 0          - unlimited number of entries,
> > > > 1:MAX_UINT - actual limit to the number of entries.
> > > 
> > 
> > > Hmm why 0, why not MAX_UINT for unlimited?
> > 
> > 0 is much better for many reasons. First of all MAX_UINT is a lot, but it's
> > still a number. On x64 platform this might be actual limit if for instance
> > the block-size register is 32-bits wide. Secondly interpreting 0 as unlimited
> > number of entries would be more suitable since most of the drivers support
> > LLP functionality and we wouldn't need to update their code to set MAX_UINT.
> > Thirdly DMA engines, which don't support LLPs would need to set this parameter
> > as 1. So if we do as you say and interpret unlimited number of LLPs as MAX_UINT,
> > then 0 would left unused.
> > 
> > To sum up I also think that using 0 as unlimited number SG entries supported is
> > much better.
> 
> ok
> 
> > > > In addition to that seeing the dma_get_slave_caps() method provide the caps only
> > > > by getting them from the DMA device descriptor, while we need to have an info on
> > > > per-channel basis, it would be good to introduce a new DMA-device callback like:
> > > > struct dma_device {
> > > > ...
> > > > 	int (*device_caps)(struct dma_chan *chan,
> > > > 			   struct dma_slave_caps *caps);
> > > 
> > 
> > > Do you have a controller where channel caps are on per-channel basis?
> > 
> > Yes, I do. Our DW DMA controller has got the maximum burst length non-uniformly
> > distributed per DMA channels. There are eight channels our controller supports,
> > among which first two channels can burst up to 32 transfer words, but the rest
> > of the channels support bursting up to 4 transfer words.
> > 
> > So having such device_caps() callback to customize the device capabilities on
> > per-DMA-channel basis would be very useful! What do you think?
> 
> Okay looks like per-ch basis is the way forward!

Great! Thanks. I'll send v3 with the updates we've agreed on in this discussion.
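
For reference, a minimal stand-alone sketch of the idea, assuming the
semantics discussed above: max_sg_nents == 0 means "unlimited", a noLLP
channel reports 1, and the first two of the eight channels burst up to 32
transfer words while the rest burst up to 4. The types and functions here
(struct ex_slave_caps, ex_device_caps(), ex_sg_chunk()) are mock-ups for
illustration only. They are not the dmaengine API, and the final field and
callback names may differ in v3.

```c
#include <assert.h>
#include <stdbool.h>

#define EX_NR_CHANNELS 8U

/* Mock of the per-channel capabilities a client would query. */
struct ex_slave_caps {
	unsigned int max_burst;		/* upper limit of the burst length */
	unsigned int max_sg_nents;	/* 0: unlimited, otherwise the limit */
};

/*
 * Hypothetical per-channel callback: fills the caps for one channel.
 * Returns 0 on success and -1 on a bad argument.
 */
static int ex_device_caps(unsigned int chan_id, bool has_llp,
			  struct ex_slave_caps *caps)
{
	if (chan_id >= EX_NR_CHANNELS || !caps)
		return -1;

	/* Channels 0 and 1 burst up to 32 words, the others up to 4. */
	caps->max_burst = chan_id < 2 ? 32 : 4;

	/* Without LLP support only one SG entry fits in a transaction. */
	caps->max_sg_nents = has_llp ? 0 : 1;

	return 0;
}

/*
 * Client side: how many of the remaining SG entries can be handed over
 * in a single DMA transaction.
 */
static unsigned int ex_sg_chunk(const struct ex_slave_caps *caps,
				unsigned int nents)
{
	assert(caps);

	if (caps->max_sg_nents == 0 || nents <= caps->max_sg_nents)
		return nents;

	return caps->max_sg_nents;
}
```

An SPI driver seeing ex_sg_chunk() return less than the number of entries it
has would know it must feed the Tx and Rx channels entry by entry in
lock-step.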

-Sergey

> 
> -- 
> ~Vinod


* Re: [PATCH v2 5/6] dmaengine: dw: Introduce max burst length hw config
  2020-05-19 17:07               ` Vinod Koul
@ 2020-05-21  1:47                 ` Serge Semin
  0 siblings, 0 replies; 72+ messages in thread
From: Serge Semin @ 2020-05-21  1:47 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Serge Semin, Andy Shevchenko, Viresh Kumar, Dan Williams,
	Alexey Malahov, Thomas Bogendoerfer, Paul Burton, Ralf Baechle,
	Arnd Bergmann, Rob Herring, linux-mips, devicetree, dmaengine,
	linux-kernel

On Tue, May 19, 2020 at 10:37:14PM +0530, Vinod Koul wrote:
> On 17-05-20, 22:38, Serge Semin wrote:
> > On Fri, May 15, 2020 at 12:09:50PM +0530, Vinod Koul wrote:
> > > On 12-05-20, 22:12, Andy Shevchenko wrote:
> > > > On Tue, May 12, 2020 at 05:08:20PM +0300, Serge Semin wrote:
> > > > > On Fri, May 08, 2020 at 02:41:53PM +0300, Andy Shevchenko wrote:
> > > > > > On Fri, May 08, 2020 at 01:53:03PM +0300, Serge Semin wrote:

[snip]

> > > > > > But let's see what we can do better. Since maximum is defined on the slave side
> > > > > > device, it probably needs to define minimum as well, otherwise it's possible
> > > > > > that some hardware can't cope underrun bursts.
> > > > > 
> > > > > There is no need to define minimum if such limit doesn't exists except a
> > > > > natural 1. Moreover it doesn't exist for all DMA controllers seeing noone has
> > > > > added such capability into the generic DMA subsystem so far.
> > > > 
> > > > There is a contract between provider and consumer about DMA resource. That's
> > > > why both sides should participate in fulfilling it. Theoretically it may be a
> > > > hardware that doesn't support minimum burst available in DMA by a reason. For
> > > > such we would need minimum to be provided as well.
> > > 
> > > Agreed and if required caps should be extended to tell consumer the
> > > minimum values supported.
> > 
> > Sorry, it's not required by our hardware. Is there any, which actually has such
> > limitation? (minimum burst length)
> 
> IIUC the idea is that you will tell maximum and minimum values supported
> and client can pick the best value. Esp in case of slave transfers
> things like burst, msize are governed by client capability and usage. So
> exposing the set to pick from would make sense

Agreed. I'll add a min_burst capability.
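
Roughly, a client could then use the pair like this. It is only a sketch
under the assumption that the provider exports both limits and that the
client has a preferred burst of its own (for instance derived from its FIFO
threshold); ex_pick_burst() is a hypothetical helper, not an existing
dmaengine function.

```c
#include <assert.h>

/*
 * Pick the burst length closest to what the client wants while staying
 * inside the [min_burst, max_burst] window reported by the DMA channel.
 */
static unsigned int ex_pick_burst(unsigned int wanted,
				  unsigned int min_burst,
				  unsigned int max_burst)
{
	/* A sane provider reports a non-empty window starting at 1 or more. */
	assert(min_burst >= 1 && min_burst <= max_burst);

	if (wanted < min_burst)
		return min_burst;
	if (wanted > max_burst)
		return max_burst;

	return wanted;
}
```

For the DW DMAC discussed here min_burst would simply be 1, so only the upper
limit ever changes the client's choice.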

-Sergey

> 
> -- 
> ~Vinod



Thread overview: 72+ messages
2020-03-06 13:10 [PATCH 0/5] dmaengine: dw: Take Baikal-T1 SoC DW DMAC peculiarities into account Sergey.Semin
2020-03-06 13:29 ` Andy Shevchenko
2020-03-06 13:30   ` Andy Shevchenko
2020-03-06 13:43     ` Vinod Koul
     [not found]     ` <20200306135050.40094803087C@mail.baikalelectronics.ru>
2020-03-09 21:45       ` Sergey Semin
     [not found]   ` <20200306133756.0F74C8030793@mail.baikalelectronics.ru>
2020-03-06 13:47     ` Sergey Semin
2020-03-06 14:11       ` Andy Shevchenko
     [not found]       ` <20200306141135.9C4F380307C2@mail.baikalelectronics.ru>
2020-03-09 22:08         ` Sergey Semin
2020-05-08 10:52 ` [PATCH v2 0/6] " Serge Semin
2020-05-08 10:52   ` [PATCH v2 1/6] dt-bindings: dma: dw: Convert DW DMAC to DT binding Serge Semin
2020-05-18 17:50     ` Rob Herring
2020-05-08 10:53   ` [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property Serge Semin
2020-05-08 11:12     ` Andy Shevchenko
2020-05-11 20:05       ` Serge Semin
2020-05-11 21:01         ` Andy Shevchenko
2020-05-11 21:35           ` Serge Semin
2020-05-12  9:08             ` Andy Shevchenko
2020-05-12 11:49               ` Serge Semin
2020-05-12 12:38                 ` Andy Shevchenko
2020-05-15  6:09                   ` Vinod Koul
2020-05-15 10:51                     ` Andy Shevchenko
2020-05-15 10:56                       ` Vinod Koul
2020-05-15 11:11                         ` Serge Semin
2020-05-17 17:47                           ` Serge Semin
2020-05-18 17:30                             ` Rob Herring
2020-05-18 19:30                               ` Serge Semin
2020-05-19 17:13                             ` Vinod Koul
2020-05-21  1:33                               ` Serge Semin
2020-05-08 10:53   ` [PATCH v2 3/6] dmaengine: dw: Set DMA device max segment size parameter Serge Semin
2020-05-08 11:21     ` Andy Shevchenko
2020-05-08 18:49       ` Vineet Gupta
2020-05-11 21:16       ` Serge Semin
2020-05-12 12:35         ` Andy Shevchenko
2020-05-12 17:01           ` Serge Semin
2020-05-15  6:16           ` Vinod Koul
2020-05-15 10:53             ` Andy Shevchenko
2020-05-17 18:22               ` Serge Semin
2020-05-08 10:53   ` [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported Serge Semin
2020-05-08 11:26     ` Andy Shevchenko
2020-05-08 11:53       ` Mark Brown
2020-05-08 19:06         ` Andy Shevchenko
2020-05-11  3:13           ` Serge Semin
2020-05-11 14:03             ` Andy Shevchenko
2020-05-11  2:10         ` Serge Semin
2020-05-11 11:58           ` Mark Brown
2020-05-11 13:45             ` Serge Semin
2020-05-11 13:58               ` Andy Shevchenko
2020-05-11 17:48                 ` Mark Brown
2020-05-11 18:25                   ` Serge Semin
2020-05-11 19:32                 ` Serge Semin
2020-05-11 21:07                   ` Andy Shevchenko
2020-05-11 21:08                     ` Andy Shevchenko
2020-05-12 12:42                       ` Serge Semin
2020-05-15  6:30                         ` Vinod Koul
2020-05-17 19:23                           ` Serge Semin
2020-05-19 17:02                             ` Vinod Koul
2020-05-21  1:40                               ` Serge Semin
2020-05-11 17:44               ` Mark Brown
2020-05-11 18:32                 ` Serge Semin
2020-05-11 21:32                   ` Mark Brown
2020-05-08 10:53   ` [PATCH v2 5/6] dmaengine: dw: Introduce max burst length hw config Serge Semin
2020-05-08 11:41     ` Andy Shevchenko
2020-05-12 14:08       ` Serge Semin
2020-05-12 19:12         ` Andy Shevchenko
2020-05-12 19:47           ` Serge Semin
2020-05-15 11:02             ` Andy Shevchenko
2020-05-15  6:39           ` Vinod Koul
2020-05-17 19:38             ` Serge Semin
2020-05-19 17:07               ` Vinod Koul
2020-05-21  1:47                 ` Serge Semin
2020-05-08 10:53   ` [PATCH v2 6/6] dmaengine: dw: Take HC_LLP flag into account for noLLP auto-config Serge Semin
2020-05-08 11:43     ` Andy Shevchenko
