Linux-SPI Archive on lore.kernel.org
From: Serge Semin <Sergey.Semin@baikalelectronics.ru>
To: Mark Brown <broonie@kernel.org>
Cc: Serge Semin <Sergey.Semin@baikalelectronics.ru>,
	Serge Semin <fancer.lancer@gmail.com>,
	Alexey Malahov <Alexey.Malahov@baikalelectronics.ru>,
	Georgy Vlasov <Georgy.Vlasov@baikalelectronics.ru>,
	Ramil Zaripov <Ramil.Zaripov@baikalelectronics.ru>,
	Pavel Parkhomenko <Pavel.Parkhomenko@baikalelectronics.ru>,
	Peter Ujfalusi <peter.ujfalusi@ti.com>,
	Andy Shevchenko <andy.shevchenko@gmail.com>,
	Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
	Feng Tang <feng.tang@intel.com>, Vinod Koul <vkoul@kernel.org>,
	<linux-spi@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: [PATCH 8/8] spi: dw-dma: Add one-by-one SG list entries transfer
Date: Fri, 31 Jul 2020 10:59:53 +0300
Message-ID: <20200731075953.14416-9-Sergey.Semin@baikalelectronics.ru>
In-Reply-To: <20200731075953.14416-1-Sergey.Semin@baikalelectronics.ru>

If at least one of the requested DMA engine channels doesn't support
hardware-accelerated SG list traversal, the DMA driver will most likely
work around that by resubmitting the SG list entries from its IRQ
handler. That causes a problem if the DMA Tx channel is recharged and
re-executed before the DMA Rx channel. Due to the non-deterministic
IRQ-handler execution latency, the DMA Tx channel will start pushing
data to the SPI bus before the DMA Rx channel is even reinitialized with
the next inbound SG list entry. By doing so the DMA Tx channel
implicitly starts filling up the DW APB SSI Rx FIFO, which will
eventually overflow while the DMA Rx channel is being recharged and
re-executed.

To solve the problem we have to feed the DMA engine with SG list
entries one by one. That keeps the DW APB SSI Tx and Rx FIFOs
synchronized and prevents the Rx FIFO overflow. Since in general the SPI
tx_sg and rx_sg lists may have a different number of entries of
different lengths (though the total lengths should match), we virtually
split the SG lists into a set of DMA transfers, each as long as the
shorter remainder of the current Tx and Rx SG entries.
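
As a rough userspace illustration of that split (a sketch only, not the
driver code), the loop below walks two lists of segment lengths and
records the length of each resulting chunk. The helper name
split_sg_lengths() and its parameters are hypothetical and exist only
for this example; plain arrays of lengths stand in for the scatterlists.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model of the splitting loop. split_sg_lengths() is a
 * hypothetical helper, not a function of the driver. It walks two lists
 * of segment lengths and stores the length of each resulting chunk in
 * out[]. Returns the number of chunks produced.
 */
static size_t split_sg_lengths(const unsigned int *tx, size_t tx_n,
			       const unsigned int *rx, size_t rx_n,
			       unsigned int total,
			       unsigned int *out, size_t out_cap)
{
	unsigned int tx_len = 0, rx_len = 0, base, len = 0;
	size_t ti = 0, ri = 0, n = 0;

	assert(tx && rx && out);

	for (base = 0; base < total; base += len) {
		/* Fetch the next Tx segment once the current one is used up */
		if (!tx_len) {
			if (ti == tx_n)
				break;
			tx_len = tx[ti++];
		}

		/* Fetch the next Rx segment once the current one is used up */
		if (!rx_len) {
			if (ri == rx_n)
				break;
			rx_len = rx[ri++];
		}

		/* Each chunk is as long as the shorter remainder */
		len = tx_len < rx_len ? tx_len : rx_len;

		/* Stop on a zero-length segment or a full output array */
		if (!len || n == out_cap)
			break;

		out[n++] = len;
		tx_len -= len;
		rx_len -= len;
	}

	return n;
}
```

With made-up Tx segment lengths {3, 5, 3} and Rx segment lengths
{1, 5, 5} (11 bytes in total), the helper produces five chunks of
lengths 1, 2, 3, 2 and 3, so every chunk boundary coincides with a
boundary of either the Tx or the Rx list.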

The solution described above is only used if a full-duplex SPI
transfer is requested and the DMA engine hasn't provided channels with
hardware-accelerated SG list traversal capability to handle both SG
lists at once.
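
The selection rule can be modelled in plain C roughly as follows. This
is a sketch under assumptions: struct caps_model is a stand-in that
copies only the max_sg_burst field of struct dma_slave_caps, and the
helper names sg_burst_limit() and needs_one_by_one() are hypothetical.
As in the patch, a zero limit is treated as "no restriction reported".

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the single struct dma_slave_caps field used here */
struct caps_model {
	unsigned int max_sg_burst;
};

/*
 * Pick the common SG burst limit of the Tx and Rx channels: the smaller
 * of the two non-zero limits, the only non-zero one, or zero if neither
 * channel reports a limit.
 */
static unsigned int sg_burst_limit(struct caps_model tx, struct caps_model rx)
{
	if (tx.max_sg_burst > 0 && rx.max_sg_burst > 0)
		return tx.max_sg_burst < rx.max_sg_burst ?
		       tx.max_sg_burst : rx.max_sg_burst;
	if (tx.max_sg_burst > 0)
		return tx.max_sg_burst;

	return rx.max_sg_burst;
}

/*
 * Decide whether the one-by-one submission is needed: only for a
 * full-duplex transfer whose longer SG list exceeds a non-zero limit.
 */
static bool needs_one_by_one(unsigned int sg_burst, bool has_rx_buf,
			     unsigned int tx_nents, unsigned int rx_nents)
{
	unsigned int nents = tx_nents > rx_nents ? tx_nents : rx_nents;

	if (!sg_burst || !has_rx_buf || nents <= sg_burst)
		return false;

	return true;
}
```

For instance, with a limit of 1 and a full-duplex transfer whose longer
SG list has three entries, needs_one_by_one() returns true, while a zero
limit or a Tx-only transfer always selects the all-at-once path.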

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 drivers/spi/spi-dw-dma.c | 137 ++++++++++++++++++++++++++++++++++++++-
 drivers/spi/spi-dw.h     |   1 +
 2 files changed, 137 insertions(+), 1 deletion(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index 2b42b42b6cf2..b3eeef3d5cf7 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -73,6 +73,23 @@ static void dw_spi_dma_maxburst_init(struct dw_spi *dws)
 	dw_writel(dws, DW_SPI_DMATDLR, dws->txburst);
 }
 
+static void dw_spi_dma_sg_burst_init(struct dw_spi *dws)
+{
+	struct dma_slave_caps tx = {0}, rx = {0};
+
+	dma_get_slave_caps(dws->txchan, &tx);
+	dma_get_slave_caps(dws->rxchan, &rx);
+
+	if (tx.max_sg_burst > 0 && rx.max_sg_burst > 0)
+		dws->dma_sg_burst = min(tx.max_sg_burst, rx.max_sg_burst);
+	else if (tx.max_sg_burst > 0)
+		dws->dma_sg_burst = tx.max_sg_burst;
+	else if (rx.max_sg_burst > 0)
+		dws->dma_sg_burst = rx.max_sg_burst;
+	else
+		dws->dma_sg_burst = 0;
+}
+
 static int dw_spi_dma_init_mfld(struct device *dev, struct dw_spi *dws)
 {
 	struct dw_dma_slave dma_tx = { .dst_id = 1 }, *tx = &dma_tx;
@@ -110,6 +127,8 @@ static int dw_spi_dma_init_mfld(struct device *dev, struct dw_spi *dws)
 
 	dw_spi_dma_maxburst_init(dws);
 
+	dw_spi_dma_sg_burst_init(dws);
+
 	return 0;
 
 free_rxchan:
@@ -139,6 +158,8 @@ static int dw_spi_dma_init_generic(struct device *dev, struct dw_spi *dws)
 
 	dw_spi_dma_maxburst_init(dws);
 
+	dw_spi_dma_sg_burst_init(dws);
+
 	return 0;
 }
 
@@ -459,11 +480,125 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws,
 	return ret;
 }
 
+static int dw_spi_dma_transfer_one(struct dw_spi *dws,
+				   struct spi_transfer *xfer)
+{
+	struct scatterlist *tx_sg = NULL, *rx_sg = NULL, tx_tmp, rx_tmp;
+	unsigned int tx_len = 0, rx_len = 0;
+	unsigned int base, len;
+	int ret;
+
+	sg_init_table(&tx_tmp, 1);
+	sg_init_table(&rx_tmp, 1);
+
+	/*
+	 * If at least one of the requested DMA channels doesn't support
+	 * hardware-accelerated SG list traversal, the DMA driver will most
+	 * likely work around that by resubmitting the SG list entries from
+	 * its IRQ handler. That causes a problem if the DMA Tx channel is
+	 * recharged and re-executed before the DMA Rx channel. Due to the
+	 * non-deterministic IRQ-handler execution latency, the DMA Tx channel
+	 * will start pushing data to the SPI bus before the DMA Rx channel is
+	 * even reinitialized with the next inbound SG list entry. By doing so
+	 * the DMA Tx channel implicitly starts filling up the DW APB SSI Rx
+	 * FIFO, which will eventually overflow while the DMA Rx channel is
+	 * being recharged and re-executed.
+	 *
+	 * To solve the problem we have to feed the DMA engine with SG list
+	 * entries one by one. That keeps the DW APB SSI Tx and Rx FIFOs
+	 * synchronized and prevents the Rx FIFO overflow. Since in general
+	 * the tx_sg and rx_sg lists may have a different number of entries
+	 * of different lengths (though the total lengths should match),
+	 * let's virtually split the SG lists into a set of DMA transfers,
+	 * each as long as the shorter remainder of the current Tx/Rx entries.
+	 * An ASCII sketch of the implemented algorithm follows:
+	 *                  xfer->len
+	 *                |___________|
+	 * tx_sg list:    |___|____|__|
+	 * rx_sg list:    |_|____|____|
+	 * DMA transfers: |_|_|__|_|__|
+	 *
+	 * Note that for this workaround to solve the described problem the
+	 * DMA engine driver should properly initialize the max_sg_burst
+	 * capability and set the DMA device max segment size parameter to
+	 * the maximum data block size the DMA engine supports.
+	 */
+	for (base = 0, len = 0; base < xfer->len; base += len) {
+		/* Fetch next Tx DMA data chunk */
+		if (!tx_len) {
+			tx_sg = !tx_sg ? &xfer->tx_sg.sgl[0] : sg_next(tx_sg);
+			sg_dma_address(&tx_tmp) = sg_dma_address(tx_sg);
+			tx_len = sg_dma_len(tx_sg);
+		}
+
+		/* Fetch next Rx DMA data chunk */
+		if (!rx_len) {
+			rx_sg = !rx_sg ? &xfer->rx_sg.sgl[0] : sg_next(rx_sg);
+			sg_dma_address(&rx_tmp) = sg_dma_address(rx_sg);
+			rx_len = sg_dma_len(rx_sg);
+		}
+
+		len = min(tx_len, rx_len);
+
+		sg_dma_len(&tx_tmp) = len;
+		sg_dma_len(&rx_tmp) = len;
+
+		/* Submit DMA Tx transfer */
+		ret = dw_spi_dma_submit_tx(dws, &tx_tmp, 1);
+		if (ret)
+			break;
+
+		/* Submit DMA Rx transfer */
+		ret = dw_spi_dma_submit_rx(dws, &rx_tmp, 1);
+		if (ret)
+			break;
+
+		/* Rx must be started before Tx: SPI receives as it sends */
+		dma_async_issue_pending(dws->rxchan);
+
+		dma_async_issue_pending(dws->txchan);
+
+		/*
+		 * Here we only need to wait for the DMA transfer to finish,
+		 * since the SPI controller is kept enabled during the
+		 * procedure this loop implements and there is no risk of
+		 * losing data left in the Tx/Rx FIFOs.
+		 */
+		ret = dw_spi_dma_wait(dws, len, xfer->effective_speed_hz);
+		if (ret)
+			break;
+
+		reinit_completion(&dws->dma_completion);
+
+		sg_dma_address(&tx_tmp) += len;
+		sg_dma_address(&rx_tmp) += len;
+		tx_len -= len;
+		rx_len -= len;
+	}
+
+	dw_writel(dws, DW_SPI_DMACR, 0);
+
+	return ret;
+}
+
 static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
 {
+	unsigned int nents;
 	int ret;
 
-	ret = dw_spi_dma_transfer_all(dws, xfer);
+	nents = max(xfer->tx_sg.nents, xfer->rx_sg.nents);
+
+	/*
+	 * Execute the normal DMA-based transfer (which submits the Rx and Tx
+	 * SG lists directly to the DMA engine at once) if either full
+	 * hardware-accelerated SG list traversal is supported by both
+	 * channels, or a Tx-only SPI transfer is requested, or the DMA engine
+	 * is capable of handling both SG lists in hardware.
+	 */
+	if (!dws->dma_sg_burst || !xfer->rx_buf || nents <= dws->dma_sg_burst)
+		ret = dw_spi_dma_transfer_all(dws, xfer);
+	else
+		ret = dw_spi_dma_transfer_one(dws, xfer);
 	if (ret)
 		return ret;
 
diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
index 151ba316619e..1d201c62d292 100644
--- a/drivers/spi/spi-dw.h
+++ b/drivers/spi/spi-dw.h
@@ -146,6 +146,7 @@ struct dw_spi {
 	u32			txburst;
 	struct dma_chan		*rxchan;
 	u32			rxburst;
+	u32			dma_sg_burst;
 	unsigned long		dma_chan_busy;
 	dma_addr_t		dma_addr; /* phy address of the Data register */
 	const struct dw_spi_dma_ops *dma_ops;
-- 
2.27.0


Thread overview: 17+ messages
2020-07-31  7:59 [PATCH 0/8] spi: dw-dma: Add max SG entries burst capability support Serge Semin
2020-07-31  7:59 ` [PATCH 1/8] spi: dw-dma: Set DMA Level registers on init Serge Semin
2020-07-31  7:59 ` [PATCH 2/8] spi: dw-dma: Fail DMA-based transfer if no Tx-buffer specified Serge Semin
2020-07-31  7:59 ` [PATCH 3/8] spi: dw-dma: Configure the DMA channels in dma_setup Serge Semin
2020-07-31  9:16   ` Andy Shevchenko
2020-07-31 12:32     ` Serge Semin
2020-07-31  7:59 ` [PATCH 4/8] spi: dw-dma: Move DMA transfers submission to the channels prep methods Serge Semin
2020-07-31  9:15   ` Andy Shevchenko
2020-07-31 12:46     ` Serge Semin
2020-07-31  7:59 ` [PATCH 5/8] spi: dw-dma: Detach DMA transfer into a dedicated method Serge Semin
2020-07-31  7:59 ` [PATCH 6/8] spi: dw-dma: Move DMAC register cleanup to DMA transfer method Serge Semin
2020-07-31  7:59 ` [PATCH 7/8] spi: dw-dma: Pass exact data to the DMA submit and wait methods Serge Semin
2020-07-31  7:59 ` Serge Semin [this message]
2020-07-31  9:26 ` [PATCH 0/8] spi: dw-dma: Add max SG entries burst capability support Andy Shevchenko
2020-07-31 12:59   ` Serge Semin
2020-08-04 21:14     ` Mark Brown
2020-08-05 11:31       ` Vinod Koul
