dmaengine.vger.kernel.org archive mirror
* [PATCH v2 0/3] dmaengine: ti: k3-udma: memcpy throughput improvement
@ 2021-01-13 11:49 Peter Ujfalusi
  2021-01-13 11:49 ` [PATCH v2 1/3] dmaengine: Extend the dmaengine_alignment for 128 and 256 bytes Peter Ujfalusi
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Peter Ujfalusi @ 2021-01-13 11:49 UTC (permalink / raw)
  To: vkoul
  Cc: dan.j.williams, linux-kernel, dmaengine, vigneshr,
	grygorii.strashko, kishon

Hi,

Changes since v1:
- Added Kishon's tested-by to the first two patches
- Moved the variable definitions to the start of their respective functions
- Remove braces where they are not needed
- correct indentation of cases
- additional patch to clean up the ret = 0 initializations in tisci channel configuration
  functions, no functional changes.

Newer members of the K3 family (after AM654) have support for burst_size
configuration for each DMA channel.

The HW default value is 64 bytes but on higher throughput channels it can be
increased to 256 bytes (UCHANs) or 128 bytes (HCHANs).

Aligning the buffers and length of the transfer to the burst size also increases
the throughput.

Numbers gathered on j721e (UCHAN pair):
echo 8000000 > /sys/module/dmatest/parameters/test_buf_size
echo 2000 > /sys/module/dmatest/parameters/timeout
echo 50 > /sys/module/dmatest/parameters/iterations
echo 1 > /sys/module/dmatest/parameters/max_channels
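
The run can then be started and waited on in the usual dmatest way (assuming
dmatest is already loaded as a module); the measured throughput is printed to
the kernel log:
echo 1 > /sys/module/dmatest/parameters/run
cat /sys/module/dmatest/parameters/wait
dmesg | grep dmatest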

Prior to this patch:    ~1.3 GB/s
After this patch:       ~1.8 GB/s
 with 1 byte alignment: ~1.7 GB/s

Regards,
Peter
---
Peter Ujfalusi (3):
  dmaengine: Extend the dmaengine_alignment for 128 and 256 bytes
  dmaengine: ti: k3-udma: Add support for burst_size configuration for
    mem2mem
  dmaengine: ti: k3-udma: Do not initialize ret in tisci channel config
    functions

 drivers/dma/ti/k3-udma.c  | 130 ++++++++++++++++++++++++++++++++++----
 include/linux/dmaengine.h |   2 +
 2 files changed, 120 insertions(+), 12 deletions(-)

-- 
2.30.0


* [PATCH v2 1/3] dmaengine: Extend the dmaengine_alignment for 128 and 256 bytes
  2021-01-13 11:49 [PATCH v2 0/3] dmaengine: ti: k3-udma: memcpy throughput improvement Peter Ujfalusi
@ 2021-01-13 11:49 ` Peter Ujfalusi
  2021-01-13 11:49 ` [PATCH v2 2/3] dmaengine: ti: k3-udma: Add support for burst_size configuration for mem2mem Peter Ujfalusi
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Peter Ujfalusi @ 2021-01-13 11:49 UTC (permalink / raw)
  To: vkoul
  Cc: dan.j.williams, linux-kernel, dmaengine, vigneshr,
	grygorii.strashko, kishon

From: Peter Ujfalusi <peter.ujfalusi@ti.com>

Some DMA devices can benefit from a higher order of alignment than the
maximum of 64 bytes currently defined.

Define 128 and 256 bytes alignment for these devices.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Tested-by: Kishon Vijay Abraham I <kishon@ti.com>
---
 include/linux/dmaengine.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 68130f5f599e..004736b6a9c8 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -745,6 +745,8 @@ enum dmaengine_alignment {
 	DMAENGINE_ALIGN_16_BYTES = 4,
 	DMAENGINE_ALIGN_32_BYTES = 5,
 	DMAENGINE_ALIGN_64_BYTES = 6,
+	DMAENGINE_ALIGN_128_BYTES = 7,
+	DMAENGINE_ALIGN_256_BYTES = 8,
 };
 
 /**
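
Not part of the patch, just a sketch of how the new values are meant to be
used: a provider advertises its requirement via the existing copy_align field,
and a memcpy client can test its buffers with the is_dma_copy_aligned() helper
(the function names below are made up for illustration):

#include <linux/dmaengine.h>

/* provider side: memcpy buffers and lengths should be 256 byte aligned */
static void example_advertise_copy_align(struct dma_device *dma_dev)
{
	dma_dev->copy_align = DMAENGINE_ALIGN_256_BYTES;
}

/* client side: only offload when the offsets and length match copy_align */
static bool example_can_offload(struct dma_device *dma_dev,
				size_t src_off, size_t dst_off, size_t len)
{
	return is_dma_copy_aligned(dma_dev, src_off, dst_off, len);
}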
-- 
2.30.0


* [PATCH v2 2/3] dmaengine: ti: k3-udma: Add support for burst_size configuration for mem2mem
  2021-01-13 11:49 [PATCH v2 0/3] dmaengine: ti: k3-udma: memcpy throughput improvement Peter Ujfalusi
  2021-01-13 11:49 ` [PATCH v2 1/3] dmaengine: Extend the dmaengine_alignment for 128 and 256 bytes Peter Ujfalusi
@ 2021-01-13 11:49 ` Peter Ujfalusi
  2021-01-13 11:49 ` [PATCH v2 3/3] dmaengine: ti: k3-udma: Do not initialize ret in tisci channel config functions Peter Ujfalusi
  2021-01-13 16:30 ` [PATCH v2 0/3] dmaengine: ti: k3-udma: memcpy throughput improvement Vinod Koul
  3 siblings, 0 replies; 5+ messages in thread
From: Peter Ujfalusi @ 2021-01-13 11:49 UTC (permalink / raw)
  To: vkoul
  Cc: dan.j.williams, linux-kernel, dmaengine, vigneshr,
	grygorii.strashko, kishon

From: Peter Ujfalusi <peter.ujfalusi@ti.com>

The UDMA and BCDMA can provide higher throughput if the burst_size of the
channel is changed from its default (which is 64 bytes) for Ultra-high
and high capacity channels.

This performance benefit is even more visible when the buffers are aligned
with the burst_size configuration.

The am654 does not have a way to change the burst size, but it is using a
64 byte burst, so increasing the copy_align from 8 bytes to 64 (and
clients taking that into account) can increase the throughput as well.

Numbers gathered on j721e:
echo 8000000 > /sys/module/dmatest/parameters/test_buf_size
echo 2000 > /sys/module/dmatest/parameters/timeout
echo 50 > /sys/module/dmatest/parameters/iterations
echo 1 > /sys/module/dmatest/parameters/max_channels

Prior to this patch:    ~1.3 GB/s
After this patch:       ~1.8 GB/s
 with 1 byte alignment: ~1.7 GB/s

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Tested-by: Kishon Vijay Abraham I <kishon@ti.com>
---
 drivers/dma/ti/k3-udma.c | 116 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 111 insertions(+), 5 deletions(-)

diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c
index 298460438bb4..1030c2ff487a 100644
--- a/drivers/dma/ti/k3-udma.c
+++ b/drivers/dma/ti/k3-udma.c
@@ -121,6 +121,11 @@ struct udma_oes_offsets {
 #define UDMA_FLAG_PDMA_ACC32		BIT(0)
 #define UDMA_FLAG_PDMA_BURST		BIT(1)
 #define UDMA_FLAG_TDTYPE		BIT(2)
+#define UDMA_FLAG_BURST_SIZE		BIT(3)
+#define UDMA_FLAGS_J7_CLASS		(UDMA_FLAG_PDMA_ACC32 | \
+					 UDMA_FLAG_PDMA_BURST | \
+					 UDMA_FLAG_TDTYPE | \
+					 UDMA_FLAG_BURST_SIZE)
 
 struct udma_match_data {
 	enum k3_dma_type type;
@@ -128,6 +133,7 @@ struct udma_match_data {
 	bool enable_memcpy_support;
 	u32 flags;
 	u32 statictr_z_mask;
+	u8 burst_size[3];
 };
 
 struct udma_soc_data {
@@ -436,6 +442,18 @@ static void k3_configure_chan_coherency(struct dma_chan *chan, u32 asel)
 	}
 }
 
+static u8 udma_get_chan_tpl_index(struct udma_tpl *tpl_map, int chan_id)
+{
+	int i;
+
+	for (i = 0; i < tpl_map->levels; i++) {
+		if (chan_id >= tpl_map->start_idx[i])
+			return i;
+	}
+
+	return 0;
+}
+
 static void udma_reset_uchan(struct udma_chan *uc)
 {
 	memset(&uc->config, 0, sizeof(uc->config));
@@ -1811,13 +1829,21 @@ static int udma_tisci_m2m_channel_config(struct udma_chan *uc)
 	const struct ti_sci_rm_udmap_ops *tisci_ops = tisci_rm->tisci_udmap_ops;
 	struct udma_tchan *tchan = uc->tchan;
 	struct udma_rchan *rchan = uc->rchan;
+	u8 burst_size = 0;
 	int ret = 0;
+	u8 tpl;
 
 	/* Non synchronized - mem to mem type of transfer */
 	int tc_ring = k3_ringacc_get_ring_id(tchan->tc_ring);
 	struct ti_sci_msg_rm_udmap_tx_ch_cfg req_tx = { 0 };
 	struct ti_sci_msg_rm_udmap_rx_ch_cfg req_rx = { 0 };
 
+	if (ud->match_data->flags & UDMA_FLAG_BURST_SIZE) {
+		tpl = udma_get_chan_tpl_index(&ud->tchan_tpl, tchan->id);
+
+		burst_size = ud->match_data->burst_size[tpl];
+	}
+
 	req_tx.valid_params = TISCI_UDMA_TCHAN_VALID_PARAMS;
 	req_tx.nav_id = tisci_rm->tisci_dev_id;
 	req_tx.index = tchan->id;
@@ -1825,6 +1851,10 @@ static int udma_tisci_m2m_channel_config(struct udma_chan *uc)
 	req_tx.tx_fetch_size = sizeof(struct cppi5_desc_hdr_t) >> 2;
 	req_tx.txcq_qnum = tc_ring;
 	req_tx.tx_atype = ud->atype;
+	if (burst_size) {
+		req_tx.valid_params |= TI_SCI_MSG_VALUE_RM_UDMAP_CH_BURST_SIZE_VALID;
+		req_tx.tx_burst_size = burst_size;
+	}
 
 	ret = tisci_ops->tx_ch_cfg(tisci_rm->tisci, &req_tx);
 	if (ret) {
@@ -1839,6 +1869,10 @@ static int udma_tisci_m2m_channel_config(struct udma_chan *uc)
 	req_rx.rxcq_qnum = tc_ring;
 	req_rx.rx_chan_type = TI_SCI_RM_UDMAP_CHAN_TYPE_3RDP_BCOPY_PBRR;
 	req_rx.rx_atype = ud->atype;
+	if (burst_size) {
+		req_rx.valid_params |= TI_SCI_MSG_VALUE_RM_UDMAP_CH_BURST_SIZE_VALID;
+		req_rx.rx_burst_size = burst_size;
+	}
 
 	ret = tisci_ops->rx_ch_cfg(tisci_rm->tisci, &req_rx);
 	if (ret)
@@ -1854,12 +1888,24 @@ static int bcdma_tisci_m2m_channel_config(struct udma_chan *uc)
 	const struct ti_sci_rm_udmap_ops *tisci_ops = tisci_rm->tisci_udmap_ops;
 	struct ti_sci_msg_rm_udmap_tx_ch_cfg req_tx = { 0 };
 	struct udma_bchan *bchan = uc->bchan;
+	u8 burst_size = 0;
 	int ret = 0;
+	u8 tpl;
+
+	if (ud->match_data->flags & UDMA_FLAG_BURST_SIZE) {
+		tpl = udma_get_chan_tpl_index(&ud->bchan_tpl, bchan->id);
+
+		burst_size = ud->match_data->burst_size[tpl];
+	}
 
 	req_tx.valid_params = TISCI_BCDMA_BCHAN_VALID_PARAMS;
 	req_tx.nav_id = tisci_rm->tisci_dev_id;
 	req_tx.extended_ch_type = TI_SCI_RM_BCDMA_EXTENDED_CH_TYPE_BCHAN;
 	req_tx.index = bchan->id;
+	if (burst_size) {
+		req_tx.valid_params |= TI_SCI_MSG_VALUE_RM_UDMAP_CH_BURST_SIZE_VALID;
+		req_tx.tx_burst_size = burst_size;
+	}
 
 	ret = tisci_ops->tx_ch_cfg(tisci_rm->tisci, &req_tx);
 	if (ret)
@@ -4167,6 +4213,11 @@ static struct udma_match_data am654_main_data = {
 	.psil_base = 0x1000,
 	.enable_memcpy_support = true,
 	.statictr_z_mask = GENMASK(11, 0),
+	.burst_size = {
+		TI_SCI_RM_UDMAP_CHAN_BURST_SIZE_64_BYTES, /* Normal Channels */
+		TI_SCI_RM_UDMAP_CHAN_BURST_SIZE_64_BYTES, /* H Channels */
+		0, /* No UH Channels */
+	},
 };
 
 static struct udma_match_data am654_mcu_data = {
@@ -4174,38 +4225,63 @@ static struct udma_match_data am654_mcu_data = {
 	.psil_base = 0x6000,
 	.enable_memcpy_support = false,
 	.statictr_z_mask = GENMASK(11, 0),
+	.burst_size = {
+		TI_SCI_RM_UDMAP_CHAN_BURST_SIZE_64_BYTES, /* Normal Channels */
+		TI_SCI_RM_UDMAP_CHAN_BURST_SIZE_64_BYTES, /* H Channels */
+		0, /* No UH Channels */
+	},
 };
 
 static struct udma_match_data j721e_main_data = {
 	.type = DMA_TYPE_UDMA,
 	.psil_base = 0x1000,
 	.enable_memcpy_support = true,
-	.flags = UDMA_FLAG_PDMA_ACC32 | UDMA_FLAG_PDMA_BURST | UDMA_FLAG_TDTYPE,
+	.flags = UDMA_FLAGS_J7_CLASS,
 	.statictr_z_mask = GENMASK(23, 0),
+	.burst_size = {
+		TI_SCI_RM_UDMAP_CHAN_BURST_SIZE_64_BYTES, /* Normal Channels */
+		TI_SCI_RM_UDMAP_CHAN_BURST_SIZE_256_BYTES, /* H Channels */
+		TI_SCI_RM_UDMAP_CHAN_BURST_SIZE_256_BYTES, /* UH Channels */
+	},
 };
 
 static struct udma_match_data j721e_mcu_data = {
 	.type = DMA_TYPE_UDMA,
 	.psil_base = 0x6000,
 	.enable_memcpy_support = false, /* MEM_TO_MEM is slow via MCU UDMA */
-	.flags = UDMA_FLAG_PDMA_ACC32 | UDMA_FLAG_PDMA_BURST | UDMA_FLAG_TDTYPE,
+	.flags = UDMA_FLAGS_J7_CLASS,
 	.statictr_z_mask = GENMASK(23, 0),
+	.burst_size = {
+		TI_SCI_RM_UDMAP_CHAN_BURST_SIZE_64_BYTES, /* Normal Channels */
+		TI_SCI_RM_UDMAP_CHAN_BURST_SIZE_128_BYTES, /* H Channels */
+		0, /* No UH Channels */
+	},
 };
 
 static struct udma_match_data am64_bcdma_data = {
 	.type = DMA_TYPE_BCDMA,
 	.psil_base = 0x2000, /* for tchan and rchan, not applicable to bchan */
 	.enable_memcpy_support = true, /* Supported via bchan */
-	.flags = UDMA_FLAG_PDMA_ACC32 | UDMA_FLAG_PDMA_BURST | UDMA_FLAG_TDTYPE,
+	.flags = UDMA_FLAGS_J7_CLASS,
 	.statictr_z_mask = GENMASK(23, 0),
+	.burst_size = {
+		TI_SCI_RM_UDMAP_CHAN_BURST_SIZE_64_BYTES, /* Normal Channels */
+		0, /* No H Channels */
+		0, /* No UH Channels */
+	},
 };
 
 static struct udma_match_data am64_pktdma_data = {
 	.type = DMA_TYPE_PKTDMA,
 	.psil_base = 0x1000,
 	.enable_memcpy_support = false, /* PKTDMA does not support MEM_TO_MEM */
-	.flags = UDMA_FLAG_PDMA_ACC32 | UDMA_FLAG_PDMA_BURST | UDMA_FLAG_TDTYPE,
+	.flags = UDMA_FLAGS_J7_CLASS,
 	.statictr_z_mask = GENMASK(23, 0),
+	.burst_size = {
+		TI_SCI_RM_UDMAP_CHAN_BURST_SIZE_64_BYTES, /* Normal Channels */
+		0, /* No H Channels */
+		0, /* No UH Channels */
+	},
 };
 
 static const struct of_device_id udma_of_match[] = {
@@ -5045,6 +5121,34 @@ static void udma_dbg_summary_show(struct seq_file *s,
 }
 #endif /* CONFIG_DEBUG_FS */
 
+static enum dmaengine_alignment udma_get_copy_align(struct udma_dev *ud)
+{
+	const struct udma_match_data *match_data = ud->match_data;
+	u8 tpl;
+
+	if (!match_data->enable_memcpy_support)
+		return DMAENGINE_ALIGN_8_BYTES;
+
+	/* Get the highest TPL level the device supports for memcpy */
+	if (ud->bchan_cnt)
+		tpl = udma_get_chan_tpl_index(&ud->bchan_tpl, 0);
+	else if (ud->tchan_cnt)
+		tpl = udma_get_chan_tpl_index(&ud->tchan_tpl, 0);
+	else
+		return DMAENGINE_ALIGN_8_BYTES;
+
+	switch (match_data->burst_size[tpl]) {
+	case TI_SCI_RM_UDMAP_CHAN_BURST_SIZE_256_BYTES:
+		return DMAENGINE_ALIGN_256_BYTES;
+	case TI_SCI_RM_UDMAP_CHAN_BURST_SIZE_128_BYTES:
+		return DMAENGINE_ALIGN_128_BYTES;
+	case TI_SCI_RM_UDMAP_CHAN_BURST_SIZE_64_BYTES:
+	fallthrough;
+	default:
+		return DMAENGINE_ALIGN_64_BYTES;
+	}
+}
+
 #define TI_UDMAC_BUSWIDTHS	(BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) | \
 				 BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) | \
 				 BIT(DMA_SLAVE_BUSWIDTH_3_BYTES) | \
@@ -5201,7 +5305,6 @@ static int udma_probe(struct platform_device *pdev)
 	ud->ddev.dst_addr_widths = TI_UDMAC_BUSWIDTHS;
 	ud->ddev.directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV);
 	ud->ddev.residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;
-	ud->ddev.copy_align = DMAENGINE_ALIGN_8_BYTES;
 	ud->ddev.desc_metadata_modes = DESC_METADATA_CLIENT |
 				       DESC_METADATA_ENGINE;
 	if (ud->match_data->enable_memcpy_support &&
@@ -5283,6 +5386,9 @@ static int udma_probe(struct platform_device *pdev)
 		INIT_DELAYED_WORK(&uc->tx_drain.work, udma_check_tx_completion);
 	}
 
+	/* Configure the copy_align to the maximum burst size the device supports */
+	ud->ddev.copy_align = udma_get_copy_align(ud);
+
 	ret = dma_async_device_register(&ud->ddev);
 	if (ret) {
 		dev_err(dev, "failed to register slave DMA engine: %d\n", ret);
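
As an aside (illustration only, not part of the patch): udma_get_chan_tpl_index()
maps a channel id to its template level by comparing it against the start index
of each level, and that level then picks the entry from the per-SoC burst_size[]
table (ordered normal, H, UH as in the match data above). A user space style
sketch of the same lookup, with invented channel layout numbers:

#include <stdio.h>

struct tpl_map_mock {
	int levels;
	int start_idx[3];
};

/* same walk as udma_get_chan_tpl_index(), just outside the kernel */
static int mock_get_chan_tpl_index(const struct tpl_map_mock *tpl_map, int chan_id)
{
	int i;

	for (i = 0; i < tpl_map->levels; i++) {
		if (chan_id >= tpl_map->start_idx[i])
			return i;
	}

	return 0;
}

int main(void)
{
	/*
	 * Invented layout: UH channels 0-3, H channels 4-15, normal from 16.
	 * Index 0 = normal, 1 = H, 2 = UH, matching the burst_size[] order.
	 */
	struct tpl_map_mock tpl = { .levels = 3, .start_idx = { 16, 4, 0 } };

	printf("chan 20 -> tpl %d (normal)\n", mock_get_chan_tpl_index(&tpl, 20));
	printf("chan  5 -> tpl %d (high)\n", mock_get_chan_tpl_index(&tpl, 5));
	printf("chan  1 -> tpl %d (ultra-high)\n", mock_get_chan_tpl_index(&tpl, 1));

	return 0;
}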
-- 
2.30.0


* [PATCH v2 3/3] dmaengine: ti: k3-udma: Do not initialize ret in tisci channel config functions
  2021-01-13 11:49 [PATCH v2 0/3] dmaengine: ti: k3-udma: memcpy throughput improvement Peter Ujfalusi
  2021-01-13 11:49 ` [PATCH v2 1/3] dmaengine: Extend the dmaengine_alignment for 128 and 256 bytes Peter Ujfalusi
  2021-01-13 11:49 ` [PATCH v2 2/3] dmaengine: ti: k3-udma: Add support for burst_size configuration for mem2mem Peter Ujfalusi
@ 2021-01-13 11:49 ` Peter Ujfalusi
  2021-01-13 16:30 ` [PATCH v2 0/3] dmaengine: ti: k3-udma: memcpy throughput improvement Vinod Koul
  3 siblings, 0 replies; 5+ messages in thread
From: Peter Ujfalusi @ 2021-01-13 11:49 UTC (permalink / raw)
  To: vkoul
  Cc: dan.j.williams, linux-kernel, dmaengine, vigneshr,
	grygorii.strashko, kishon

The ret variable does not need to be initialized to 0 in the tisci channel
config functions.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@gmail.com>
---
 drivers/dma/ti/k3-udma.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c
index 1030c2ff487a..4c0935c6930c 100644
--- a/drivers/dma/ti/k3-udma.c
+++ b/drivers/dma/ti/k3-udma.c
@@ -1830,7 +1830,7 @@ static int udma_tisci_m2m_channel_config(struct udma_chan *uc)
 	struct udma_tchan *tchan = uc->tchan;
 	struct udma_rchan *rchan = uc->rchan;
 	u8 burst_size = 0;
-	int ret = 0;
+	int ret;
 	u8 tpl;
 
 	/* Non synchronized - mem to mem type of transfer */
@@ -1889,7 +1889,7 @@ static int bcdma_tisci_m2m_channel_config(struct udma_chan *uc)
 	struct ti_sci_msg_rm_udmap_tx_ch_cfg req_tx = { 0 };
 	struct udma_bchan *bchan = uc->bchan;
 	u8 burst_size = 0;
-	int ret = 0;
+	int ret;
 	u8 tpl;
 
 	if (ud->match_data->flags & UDMA_FLAG_BURST_SIZE) {
@@ -1923,7 +1923,7 @@ static int udma_tisci_tx_channel_config(struct udma_chan *uc)
 	int tc_ring = k3_ringacc_get_ring_id(tchan->tc_ring);
 	struct ti_sci_msg_rm_udmap_tx_ch_cfg req_tx = { 0 };
 	u32 mode, fetch_size;
-	int ret = 0;
+	int ret;
 
 	if (uc->config.pkt_mode) {
 		mode = TI_SCI_RM_UDMAP_CHAN_TYPE_PKT_PBRR;
@@ -1964,7 +1964,7 @@ static int bcdma_tisci_tx_channel_config(struct udma_chan *uc)
 	const struct ti_sci_rm_udmap_ops *tisci_ops = tisci_rm->tisci_udmap_ops;
 	struct udma_tchan *tchan = uc->tchan;
 	struct ti_sci_msg_rm_udmap_tx_ch_cfg req_tx = { 0 };
-	int ret = 0;
+	int ret;
 
 	req_tx.valid_params = TISCI_BCDMA_TCHAN_VALID_PARAMS;
 	req_tx.nav_id = tisci_rm->tisci_dev_id;
@@ -1997,7 +1997,7 @@ static int udma_tisci_rx_channel_config(struct udma_chan *uc)
 	struct ti_sci_msg_rm_udmap_rx_ch_cfg req_rx = { 0 };
 	struct ti_sci_msg_rm_udmap_flow_cfg flow_req = { 0 };
 	u32 mode, fetch_size;
-	int ret = 0;
+	int ret;
 
 	if (uc->config.pkt_mode) {
 		mode = TI_SCI_RM_UDMAP_CHAN_TYPE_PKT_PBRR;
@@ -2074,7 +2074,7 @@ static int bcdma_tisci_rx_channel_config(struct udma_chan *uc)
 	const struct ti_sci_rm_udmap_ops *tisci_ops = tisci_rm->tisci_udmap_ops;
 	struct udma_rchan *rchan = uc->rchan;
 	struct ti_sci_msg_rm_udmap_rx_ch_cfg req_rx = { 0 };
-	int ret = 0;
+	int ret;
 
 	req_rx.valid_params = TISCI_BCDMA_RCHAN_VALID_PARAMS;
 	req_rx.nav_id = tisci_rm->tisci_dev_id;
@@ -2094,7 +2094,7 @@ static int pktdma_tisci_rx_channel_config(struct udma_chan *uc)
 	const struct ti_sci_rm_udmap_ops *tisci_ops = tisci_rm->tisci_udmap_ops;
 	struct ti_sci_msg_rm_udmap_rx_ch_cfg req_rx = { 0 };
 	struct ti_sci_msg_rm_udmap_flow_cfg flow_req = { 0 };
-	int ret = 0;
+	int ret;
 
 	req_rx.valid_params = TISCI_BCDMA_RCHAN_VALID_PARAMS;
 	req_rx.nav_id = tisci_rm->tisci_dev_id;
-- 
2.30.0


* Re: [PATCH v2 0/3] dmaengine: ti: k3-udma: memcpy throughput improvement
  2021-01-13 11:49 [PATCH v2 0/3] dmaengine: ti: k3-udma: memcpy throughput improvement Peter Ujfalusi
                   ` (2 preceding siblings ...)
  2021-01-13 11:49 ` [PATCH v2 3/3] dmaengine: ti: k3-udma: Do not initialize ret in tisci channel config functions Peter Ujfalusi
@ 2021-01-13 16:30 ` Vinod Koul
  3 siblings, 0 replies; 5+ messages in thread
From: Vinod Koul @ 2021-01-13 16:30 UTC (permalink / raw)
  To: Peter Ujfalusi
  Cc: dan.j.williams, linux-kernel, dmaengine, vigneshr,
	grygorii.strashko, kishon

On 13-01-21, 13:49, Peter Ujfalusi wrote:
> Hi,
> 
> Changes since v1:
> - Added Kishon's tested-by to the first two patches
> - Moved the variable definitions to the start of their respective functions
> - Remove braces where they are not needed
> - correct indentation of cases
> - additional patch to clean up the ret = 0 initializations in tisci channel configuration
>   functions, no functional changes.
> 
> Newer members of the K3 family (after AM654) have support for burst_size
> configuration for each DMA channel.
> 
> The HW default value is 64 bytes but on higher throughput channels it can be
> increased to 256 bytes (UCHANs) or 128 bytes (HCHANs).
> 
> Aligning the buffers and length of the transfer to the burst size also increases
> the throughput.
> 
> Numbers gathered on j721e (UCHAN pair):
> echo 8000000 > /sys/module/dmatest/parameters/test_buf_size
> echo 2000 > /sys/module/dmatest/parameters/timeout
> echo 50 > /sys/module/dmatest/parameters/iterations
> echo 1 > /sys/module/dmatest/parameters/max_channels
> 
> Prior to this patch:    ~1.3 GB/s
> After this patch:       ~1.8 GB/s
>  with 1 byte alignment: ~1.7 GB/s

Applied, thanks

-- 
~Vinod
