All of lore.kernel.org
 help / color / mirror / Atom feed
* optimizing the nand read performance by reducing interrupts from 4 to 1
@ 2018-04-26 15:41 Sam Lefebvre
  2018-04-26 15:41 ` [PATCH 01/13] mtd: nand: gpmi: drop dma_ops_type Sam Lefebvre
                   ` (14 more replies)
  0 siblings, 15 replies; 22+ messages in thread
From: Sam Lefebvre @ 2018-04-26 15:41 UTC (permalink / raw)
  To: linux-mtd; +Cc: Dries Staelens, Sam Lefebvre, Arnout Vandecapelle

The current implementation for gmpi nand flashes use existing libraries for composing                    
dma command chains in different pieces with mutliple interrupts. This invokes a lot of                   
unnecessary context switches. In order to make more optimal use of dma chaining capabilities
of the gpmi hardware, some rework must be performed to allow gpmi_ecc_read_page() 
and gpmi_ecc_read_subpage() submitting a range of dma's to reduce the interrupts from 
4 to 1 for every page read.

In the current situation, a page read for large page incurs 4 interrupts:
- DMA for READ0
- DMA for READSTART
- DMA for reading of page data
- BCH
                                                                                                         
This series is based on a previous post and contains some optimizations in the commits.
The changes in nand_command() and nand_command_lp() are omitted and commits are re-ordered
in such way that the changes for reducing the interrupts are moved to the end.

In order to do the preparation of this work, ready/busy signalling is aggregated and
error handling in gpmi_ecc_read_oob() has been reviewed.            

Some changes by Sasha Hauer are also adopted which contain some bugfixes and preliminary
cleanup of several parts in de code.

Sam lefebvre 

----------------------------------------------------------------
Arnout Vandecappelle (Essensium/Mind) (2):
      mtd: rawnand: gpmi: instantiate cmdfunc
      mtd: rawnand: gpmi: gpmi_nand_command(): simplification and formatting

Sam Lefebvre (5):
      mtd: rawnand: gpmi: set aggregate ready/busy signalling
      mtd: rawnand: gpmi: return generated errors in gpmi_ecc_read_oob()
      mtd: rawnand: gpmi: poll the BCH interrupt bit in start_dma_with_bch_irq()
      mtd: rawnand: gpmi: issue two commands in a single DMA chain
      mtd: rawnand: gmpi: chain gpmi_nand_command() with gpmi_ecc_read_page_data()

Sascha Hauer (6):
      mtd: nand: gpmi: drop dma_ops_type
      mtd: nand: gpmi: pass buffer and len around
      mtd: nand: gpmi: put only once used functions inline
      mtd: nand: gpmi: remove direct_dma_map_ok from driver data struct
      mtd: nand: gpmi: return valid value from bch_set_geometry()
      mtd: nand: gpmi: remove unnecessary variables

 drivers/dma/mxs-dma.c                      |   3 +
 drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c  |  77 ++++---
 drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c | 359 ++++++++++++++++-------------
 drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h |  37 ++-
 drivers/mtd/nand/raw/gpmi-nand/gpmi-regs.h |   1 +
 5 files changed, 260 insertions(+), 217 deletions(-)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 01/13] mtd: nand: gpmi: drop dma_ops_type
  2018-04-26 15:41 optimizing the nand read performance by reducing interrupts from 4 to 1 Sam Lefebvre
@ 2018-04-26 15:41 ` Sam Lefebvre
  2018-04-26 15:41 ` [PATCH 02/13] mtd: nand: gpmi: pass buffer and len around Sam Lefebvre
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 22+ messages in thread
From: Sam Lefebvre @ 2018-04-26 15:41 UTC (permalink / raw)
  To: linux-mtd; +Cc: Dries Staelens, Sam Lefebvre, Arnout Vandecapelle, Sascha Hauer

From: Sascha Hauer <s.hauer@pengutronix.de>

The GPMI nand driver puts dma_ops_type in its private data struct. Based
on the ops type the DMA callback handler unmaps previously mapped
buffers. Instead of unmapping the buffers in the DMA callback handler,
do this in the caller directly which waits for the DMA transfer to
finish. This makes the whole dma_ops_type mechanism unnecessary.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be>
---
v2: also unmap the cmd_sgl, don't unmap in gpmi_read_page() (it's done
    in the bch interrupt handler). Sascha, in the future better test
    with CONFIG_DMA_API_DEBUG :-)
---
 drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c  | 36 +++++++++++++++++-------------
 drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c | 31 ++-----------------------
 drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h | 11 ---------
 3 files changed, 23 insertions(+), 55 deletions(-)

diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
index e94556705dc7..d479358758a0 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
@@ -544,19 +544,13 @@ int gpmi_is_ready(struct gpmi_nand_data *this, unsigned chip)
 	return reg & mask;
 }
 
-static inline void set_dma_type(struct gpmi_nand_data *this,
-					enum dma_ops_type type)
-{
-	this->last_dma_type = this->dma_type;
-	this->dma_type = type;
-}
-
 int gpmi_send_command(struct gpmi_nand_data *this)
 {
 	struct dma_chan *channel = get_dma_chan(this);
 	struct dma_async_tx_descriptor *desc;
 	struct scatterlist *sgl;
 	int chip = this->current_chip;
+	int ret;
 	u32 pio[3];
 
 	/* [1] send out the PIO words */
@@ -586,8 +580,11 @@ int gpmi_send_command(struct gpmi_nand_data *this)
 		return -EINVAL;
 
 	/* [3] submit the DMA */
-	set_dma_type(this, DMA_FOR_COMMAND);
-	return start_dma_without_bch_irq(this, desc);
+	ret = start_dma_without_bch_irq(this, desc);
+
+	dma_unmap_sg(this->dev, sgl, 1, DMA_TO_DEVICE);
+
+	return ret;
 }
 
 int gpmi_send_data(struct gpmi_nand_data *this)
@@ -595,6 +592,7 @@ int gpmi_send_data(struct gpmi_nand_data *this)
 	struct dma_async_tx_descriptor *desc;
 	struct dma_chan *channel = get_dma_chan(this);
 	int chip = this->current_chip;
+	int ret;
 	uint32_t command_mode;
 	uint32_t address;
 	u32 pio[2];
@@ -624,8 +622,11 @@ int gpmi_send_data(struct gpmi_nand_data *this)
 		return -EINVAL;
 
 	/* [3] submit the DMA */
-	set_dma_type(this, DMA_FOR_WRITE_DATA);
-	return start_dma_without_bch_irq(this, desc);
+	ret = start_dma_without_bch_irq(this, desc);
+
+	dma_unmap_sg(this->dev, &this->data_sgl, 1, DMA_TO_DEVICE);
+
+	return ret;
 }
 
 int gpmi_read_data(struct gpmi_nand_data *this)
@@ -633,6 +634,7 @@ int gpmi_read_data(struct gpmi_nand_data *this)
 	struct dma_async_tx_descriptor *desc;
 	struct dma_chan *channel = get_dma_chan(this);
 	int chip = this->current_chip;
+	int ret;
 	u32 pio[2];
 
 	/* [1] : send PIO */
@@ -658,8 +660,14 @@ int gpmi_read_data(struct gpmi_nand_data *this)
 		return -EINVAL;
 
 	/* [3] : submit the DMA */
-	set_dma_type(this, DMA_FOR_READ_DATA);
-	return start_dma_without_bch_irq(this, desc);
+
+	ret = start_dma_without_bch_irq(this, desc);
+
+	dma_unmap_sg(this->dev, &this->data_sgl, 1, DMA_FROM_DEVICE);
+	if (this->direct_dma_map_ok == false)
+		memcpy(this->upper_buf, this->data_buffer_dma, this->upper_len);
+
+	return ret;
 }
 
 int gpmi_send_page(struct gpmi_nand_data *this,
@@ -703,7 +711,6 @@ int gpmi_send_page(struct gpmi_nand_data *this,
 	if (!desc)
 		return -EINVAL;
 
-	set_dma_type(this, DMA_FOR_WRITE_ECC_PAGE);
 	return start_dma_with_bch_irq(this, desc);
 }
 
@@ -785,7 +792,6 @@ int gpmi_read_page(struct gpmi_nand_data *this,
 		return -EINVAL;
 
 	/* [4] submit the DMA */
-	set_dma_type(this, DMA_FOR_READ_ECC_PAGE);
 	return start_dma_with_bch_irq(this, desc);
 }
 
diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
index c2597c8107a0..864fda1901a6 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
@@ -472,31 +472,6 @@ static void dma_irq_callback(void *param)
 	struct gpmi_nand_data *this = param;
 	struct completion *dma_c = &this->dma_done;
 
-	switch (this->dma_type) {
-	case DMA_FOR_COMMAND:
-		dma_unmap_sg(this->dev, &this->cmd_sgl, 1, DMA_TO_DEVICE);
-		break;
-
-	case DMA_FOR_READ_DATA:
-		dma_unmap_sg(this->dev, &this->data_sgl, 1, DMA_FROM_DEVICE);
-		if (this->direct_dma_map_ok == false)
-			memcpy(this->upper_buf, this->data_buffer_dma,
-				this->upper_len);
-		break;
-
-	case DMA_FOR_WRITE_DATA:
-		dma_unmap_sg(this->dev, &this->data_sgl, 1, DMA_TO_DEVICE);
-		break;
-
-	case DMA_FOR_READ_ECC_PAGE:
-	case DMA_FOR_WRITE_ECC_PAGE:
-		/* We have to wait the BCH interrupt to finish. */
-		break;
-
-	default:
-		dev_err(this->dev, "in wrong DMA operation.\n");
-	}
-
 	complete(dma_c);
 }
 
@@ -516,8 +491,7 @@ int start_dma_without_bch_irq(struct gpmi_nand_data *this,
 	/* Wait for the interrupt from the DMA block. */
 	timeout = wait_for_completion_timeout(dma_c, msecs_to_jiffies(1000));
 	if (!timeout) {
-		dev_err(this->dev, "DMA timeout, last DMA :%d\n",
-			this->last_dma_type);
+		dev_err(this->dev, "DMA timeout, last DMA\n");
 		gpmi_dump_info(this);
 		return -ETIMEDOUT;
 	}
@@ -546,8 +520,7 @@ int start_dma_with_bch_irq(struct gpmi_nand_data *this,
 	/* Wait for the interrupt from the BCH block. */
 	timeout = wait_for_completion_timeout(bch_c, msecs_to_jiffies(1000));
 	if (!timeout) {
-		dev_err(this->dev, "BCH timeout, last DMA :%d\n",
-			this->last_dma_type);
+		dev_err(this->dev, "BCH timeout\n");
 		gpmi_dump_info(this);
 		return -ETIMEDOUT;
 	}
diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h
index 62fde59b995f..2397010a8963 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h
@@ -77,15 +77,6 @@ struct boot_rom_geometry {
 	unsigned int  search_area_stride_exponent;
 };
 
-/* DMA operations types */
-enum dma_ops_type {
-	DMA_FOR_COMMAND = 1,
-	DMA_FOR_READ_DATA,
-	DMA_FOR_WRITE_DATA,
-	DMA_FOR_READ_ECC_PAGE,
-	DMA_FOR_WRITE_ECC_PAGE
-};
-
 enum gpmi_type {
 	IS_MX23,
 	IS_MX28,
@@ -178,8 +169,6 @@ struct gpmi_nand_data {
 	/* DMA channels */
 #define DMA_CHANS		8
 	struct dma_chan		*dma_chans[DMA_CHANS];
-	enum dma_ops_type	last_dma_type;
-	enum dma_ops_type	dma_type;
 	struct completion	dma_done;
 
 	/* private */
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 02/13] mtd: nand: gpmi: pass buffer and len around
  2018-04-26 15:41 optimizing the nand read performance by reducing interrupts from 4 to 1 Sam Lefebvre
  2018-04-26 15:41 ` [PATCH 01/13] mtd: nand: gpmi: drop dma_ops_type Sam Lefebvre
@ 2018-04-26 15:41 ` Sam Lefebvre
  2018-04-26 15:41 ` [PATCH 03/13] mtd: nand: gpmi: put only once used functions inline Sam Lefebvre
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 22+ messages in thread
From: Sam Lefebvre @ 2018-04-26 15:41 UTC (permalink / raw)
  To: linux-mtd; +Cc: Dries Staelens, Sam Lefebvre, Arnout Vandecapelle, Sascha Hauer

From: Sascha Hauer <s.hauer@pengutronix.de>

Instead of putting the buffer and len passed in from the mtd core
into the private data struct, just pass it around in the GPMI
drivers functions. This makes the lifetime of the variables more
clear and the code easier to follow.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be>
---
v2: rebased
---
 drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c  | 14 +++++++-------
 drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c | 20 ++++++++------------
 drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h | 11 ++++-------
 3 files changed, 19 insertions(+), 26 deletions(-)

diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
index d479358758a0..447961b798b4 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
@@ -587,7 +587,7 @@ int gpmi_send_command(struct gpmi_nand_data *this)
 	return ret;
 }
 
-int gpmi_send_data(struct gpmi_nand_data *this)
+int gpmi_send_data(struct gpmi_nand_data *this, const void *buf, int len)
 {
 	struct dma_async_tx_descriptor *desc;
 	struct dma_chan *channel = get_dma_chan(this);
@@ -606,7 +606,7 @@ int gpmi_send_data(struct gpmi_nand_data *this)
 		| BF_GPMI_CTRL0_CS(chip, this)
 		| BF_GPMI_CTRL0_LOCK_CS(LOCK_CS_ENABLE, this)
 		| BF_GPMI_CTRL0_ADDRESS(address)
-		| BF_GPMI_CTRL0_XFER_COUNT(this->upper_len);
+		| BF_GPMI_CTRL0_XFER_COUNT(len);
 	pio[1] = 0;
 	desc = dmaengine_prep_slave_sg(channel, (struct scatterlist *)pio,
 					ARRAY_SIZE(pio), DMA_TRANS_NONE, 0);
@@ -614,7 +614,7 @@ int gpmi_send_data(struct gpmi_nand_data *this)
 		return -EINVAL;
 
 	/* [2] send DMA request */
-	prepare_data_dma(this, DMA_TO_DEVICE);
+	prepare_data_dma(this, buf, len, DMA_TO_DEVICE);
 	desc = dmaengine_prep_slave_sg(channel, &this->data_sgl,
 					1, DMA_MEM_TO_DEV,
 					DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
@@ -629,7 +629,7 @@ int gpmi_send_data(struct gpmi_nand_data *this)
 	return ret;
 }
 
-int gpmi_read_data(struct gpmi_nand_data *this)
+int gpmi_read_data(struct gpmi_nand_data *this, void *buf, int len)
 {
 	struct dma_async_tx_descriptor *desc;
 	struct dma_chan *channel = get_dma_chan(this);
@@ -643,7 +643,7 @@ int gpmi_read_data(struct gpmi_nand_data *this)
 		| BF_GPMI_CTRL0_CS(chip, this)
 		| BF_GPMI_CTRL0_LOCK_CS(LOCK_CS_ENABLE, this)
 		| BF_GPMI_CTRL0_ADDRESS(BV_GPMI_CTRL0_ADDRESS__NAND_DATA)
-		| BF_GPMI_CTRL0_XFER_COUNT(this->upper_len);
+		| BF_GPMI_CTRL0_XFER_COUNT(len);
 	pio[1] = 0;
 	desc = dmaengine_prep_slave_sg(channel,
 					(struct scatterlist *)pio,
@@ -652,7 +652,7 @@ int gpmi_read_data(struct gpmi_nand_data *this)
 		return -EINVAL;
 
 	/* [2] : send DMA request */
-	prepare_data_dma(this, DMA_FROM_DEVICE);
+	prepare_data_dma(this, buf, len, DMA_FROM_DEVICE);
 	desc = dmaengine_prep_slave_sg(channel, &this->data_sgl,
 					1, DMA_DEV_TO_MEM,
 					DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
@@ -665,7 +665,7 @@ int gpmi_read_data(struct gpmi_nand_data *this)
 
 	dma_unmap_sg(this->dev, &this->data_sgl, 1, DMA_FROM_DEVICE);
 	if (this->direct_dma_map_ok == false)
-		memcpy(this->upper_buf, this->data_buffer_dma, this->upper_len);
+		memcpy(buf, this->data_buffer_dma, len);
 
 	return ret;
 }
diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
index 864fda1901a6..5fa9cd4fac4a 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
@@ -437,15 +437,15 @@ struct dma_chan *get_dma_chan(struct gpmi_nand_data *this)
 }
 
 /* Can we use the upper's buffer directly for DMA? */
-void prepare_data_dma(struct gpmi_nand_data *this, enum dma_data_direction dr)
+void prepare_data_dma(struct gpmi_nand_data *this, const void *buf, int len,
+		      enum dma_data_direction dr)
 {
 	struct scatterlist *sgl = &this->data_sgl;
 	int ret;
 
 	/* first try to map the upper buffer directly */
-	if (virt_addr_valid(this->upper_buf) &&
-		!object_is_on_stack(this->upper_buf)) {
-		sg_init_one(sgl, this->upper_buf, this->upper_len);
+	if (virt_addr_valid(buf) && !object_is_on_stack(buf)) {
+		sg_init_one(sgl, buf, len);
 		ret = dma_map_sg(this->dev, sgl, 1, dr);
 		if (ret == 0)
 			goto map_fail;
@@ -456,10 +456,10 @@ void prepare_data_dma(struct gpmi_nand_data *this, enum dma_data_direction dr)
 
 map_fail:
 	/* We have to use our own DMA buffer. */
-	sg_init_one(sgl, this->data_buffer_dma, this->upper_len);
+	sg_init_one(sgl, this->data_buffer_dma, len);
 
 	if (dr == DMA_TO_DEVICE)
-		memcpy(this->data_buffer_dma, this->upper_buf, this->upper_len);
+		memcpy(this->data_buffer_dma, buf, len);
 
 	dma_map_sg(this->dev, sgl, 1, dr);
 
@@ -919,10 +919,8 @@ static void gpmi_read_buf(struct mtd_info *mtd, uint8_t *buf, int len)
 	struct gpmi_nand_data *this = nand_get_controller_data(chip);
 
 	dev_dbg(this->dev, "len is %d\n", len);
-	this->upper_buf	= buf;
-	this->upper_len	= len;
 
-	gpmi_read_data(this);
+	gpmi_read_data(this, buf, len);
 }
 
 static void gpmi_write_buf(struct mtd_info *mtd, const uint8_t *buf, int len)
@@ -931,10 +929,8 @@ static void gpmi_write_buf(struct mtd_info *mtd, const uint8_t *buf, int len)
 	struct gpmi_nand_data *this = nand_get_controller_data(chip);
 
 	dev_dbg(this->dev, "len is %d\n", len);
-	this->upper_buf	= (uint8_t *)buf;
-	this->upper_len	= len;
 
-	gpmi_send_data(this);
+	gpmi_send_data(this, buf, len);
 }
 
 static uint8_t gpmi_read_byte(struct mtd_info *mtd)
diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h
index 2397010a8963..fba72ad28263 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h
@@ -141,10 +141,6 @@ struct gpmi_nand_data {
 	int			current_chip;
 	unsigned int		command_length;
 
-	/* passed from upper layer */
-	uint8_t			*upper_buf;
-	int			upper_len;
-
 	/* for DMA operations */
 	bool			direct_dma_map_ok;
 
@@ -178,7 +174,7 @@ struct gpmi_nand_data {
 /* Common Services */
 int common_nfc_set_geometry(struct gpmi_nand_data *);
 struct dma_chan *get_dma_chan(struct gpmi_nand_data *);
-void prepare_data_dma(struct gpmi_nand_data *,
+void prepare_data_dma(struct gpmi_nand_data *, const void *buf, int len,
 		      enum dma_data_direction dr);
 int start_dma_without_bch_irq(struct gpmi_nand_data *,
 			      struct dma_async_tx_descriptor *);
@@ -197,8 +193,9 @@ int gpmi_disable_clk(struct gpmi_nand_data *this);
 int gpmi_setup_data_interface(struct mtd_info *mtd, int chipnr,
 			      const struct nand_data_interface *conf);
 void gpmi_nfc_apply_timings(struct gpmi_nand_data *this);
-int gpmi_read_data(struct gpmi_nand_data *);
-int gpmi_send_data(struct gpmi_nand_data *);
+int gpmi_read_data(struct gpmi_nand_data *, void *buf, int len);
+int gpmi_send_data(struct gpmi_nand_data *, const void *buf, int len);
+
 int gpmi_send_page(struct gpmi_nand_data *,
 		   dma_addr_t payload, dma_addr_t auxiliary);
 int gpmi_read_page(struct gpmi_nand_data *,
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 03/13] mtd: nand: gpmi: put only once used functions inline
  2018-04-26 15:41 optimizing the nand read performance by reducing interrupts from 4 to 1 Sam Lefebvre
  2018-04-26 15:41 ` [PATCH 01/13] mtd: nand: gpmi: drop dma_ops_type Sam Lefebvre
  2018-04-26 15:41 ` [PATCH 02/13] mtd: nand: gpmi: pass buffer and len around Sam Lefebvre
@ 2018-04-26 15:41 ` Sam Lefebvre
  2018-04-26 15:41 ` [PATCH 04/13] mtd: nand: gpmi: remove direct_dma_map_ok from driver data struct Sam Lefebvre
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 22+ messages in thread
From: Sam Lefebvre @ 2018-04-26 15:41 UTC (permalink / raw)
  To: linux-mtd; +Cc: Dries Staelens, Sam Lefebvre, Arnout Vandecapelle, Sascha Hauer

From: Sascha Hauer <s.hauer@pengutronix.de>

read_page_prepare(), read_page_end() and read_page_swap_end() are
trivial functions that are used only once and take 8 arguments each.
De-obfuscate the code by open coding these functions in
gpmi_ecc_read_page()

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be>
---
v2: rebased
---
 drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c | 89 ++++++++----------------------
 1 file changed, 23 insertions(+), 66 deletions(-)

diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
index 5fa9cd4fac4a..774ac98fbdb2 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
@@ -668,56 +668,6 @@ static void release_resources(struct gpmi_nand_data *this)
 	release_dma_channels(this);
 }
 
-static int read_page_prepare(struct gpmi_nand_data *this,
-			void *destination, unsigned length,
-			void *alt_virt, dma_addr_t alt_phys, unsigned alt_size,
-			void **use_virt, dma_addr_t *use_phys)
-{
-	struct device *dev = this->dev;
-
-	if (virt_addr_valid(destination)) {
-		dma_addr_t dest_phys;
-
-		dest_phys = dma_map_single(dev, destination,
-						length, DMA_FROM_DEVICE);
-		if (dma_mapping_error(dev, dest_phys)) {
-			if (alt_size < length) {
-				dev_err(dev, "Alternate buffer is too small\n");
-				return -ENOMEM;
-			}
-			goto map_failed;
-		}
-		*use_virt = destination;
-		*use_phys = dest_phys;
-		this->direct_dma_map_ok = true;
-		return 0;
-	}
-
-map_failed:
-	*use_virt = alt_virt;
-	*use_phys = alt_phys;
-	this->direct_dma_map_ok = false;
-	return 0;
-}
-
-static inline void read_page_end(struct gpmi_nand_data *this,
-			void *destination, unsigned length,
-			void *alt_virt, dma_addr_t alt_phys, unsigned alt_size,
-			void *used_virt, dma_addr_t used_phys)
-{
-	if (this->direct_dma_map_ok)
-		dma_unmap_single(this->dev, used_phys, length, DMA_FROM_DEVICE);
-}
-
-static inline void read_page_swap_end(struct gpmi_nand_data *this,
-			void *destination, unsigned length,
-			void *alt_virt, dma_addr_t alt_phys, unsigned alt_size,
-			void *used_virt, dma_addr_t used_phys)
-{
-	if (!this->direct_dma_map_ok)
-		memcpy(destination, alt_virt, length);
-}
-
 static int send_page_prepare(struct gpmi_nand_data *this,
 			const void *source, unsigned length,
 			void *alt_virt, dma_addr_t alt_phys, unsigned alt_size,
@@ -1008,24 +958,33 @@ static int gpmi_ecc_read_page_data(struct nand_chip *chip,
 	int           ret;
 
 	dev_dbg(this->dev, "page number is : %d\n", page);
-	ret = read_page_prepare(this, buf, nfc_geo->payload_size,
-					this->payload_virt, this->payload_phys,
-					nfc_geo->payload_size,
-					&payload_virt, &payload_phys);
-	if (ret) {
-		dev_err(this->dev, "Inadequate DMA buffer\n");
-		ret = -ENOMEM;
-		return ret;
+
+	payload_virt = this->payload_virt;
+	payload_phys = this->payload_phys;
+	this->direct_dma_map_ok = false;
+
+	if (virt_addr_valid(buf)) {
+		dma_addr_t dest_phys;
+
+		dest_phys = dma_map_single(this->dev, buf, nfc_geo->payload_size,
+					   DMA_FROM_DEVICE);
+		if (!dma_mapping_error(this->dev, dest_phys)) {
+			payload_virt = buf;
+			payload_phys = dest_phys;
+			this->direct_dma_map_ok = true;
+		}
 	}
+
 	auxiliary_virt = this->auxiliary_virt;
 	auxiliary_phys = this->auxiliary_phys;
 
 	/* go! */
 	ret = gpmi_read_page(this, payload_phys, auxiliary_phys);
-	read_page_end(this, buf, nfc_geo->payload_size,
-			this->payload_virt, this->payload_phys,
-			nfc_geo->payload_size,
-			payload_virt, payload_phys);
+
+	if (this->direct_dma_map_ok)
+		dma_unmap_single(this->dev, payload_phys, nfc_geo->payload_size,
+				 DMA_FROM_DEVICE);
+
 	if (ret) {
 		dev_err(this->dev, "Error in ECC-based read: %d\n", ret);
 		return ret;
@@ -1034,10 +993,8 @@ static int gpmi_ecc_read_page_data(struct nand_chip *chip,
 	/* Loop over status bytes, accumulating ECC status. */
 	status = auxiliary_virt + nfc_geo->auxiliary_status_offset;
 
-	read_page_swap_end(this, buf, nfc_geo->payload_size,
-			   this->payload_virt, this->payload_phys,
-			   nfc_geo->payload_size,
-			   payload_virt, payload_phys);
+	if (!this->direct_dma_map_ok)
+		memcpy(buf, this->payload_virt, nfc_geo->payload_size);
 
 	for (i = 0; i < nfc_geo->ecc_chunk_count; i++, status++) {
 		if ((*status == STATUS_GOOD) || (*status == STATUS_ERASED))
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 04/13] mtd: nand: gpmi: remove direct_dma_map_ok from driver data struct
  2018-04-26 15:41 optimizing the nand read performance by reducing interrupts from 4 to 1 Sam Lefebvre
                   ` (2 preceding siblings ...)
  2018-04-26 15:41 ` [PATCH 03/13] mtd: nand: gpmi: put only once used functions inline Sam Lefebvre
@ 2018-04-26 15:41 ` Sam Lefebvre
  2018-04-26 15:41 ` [PATCH 05/13] mtd: nand: gpmi: return valid value from bch_set_geometry() Sam Lefebvre
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 22+ messages in thread
From: Sam Lefebvre @ 2018-04-26 15:41 UTC (permalink / raw)
  To: linux-mtd; +Cc: Dries Staelens, Sam Lefebvre, Arnout Vandecapelle, Sascha Hauer

From: Sascha Hauer <s.hauer@pengutronix.de>

Instead of putting direct_dma_map_ok into driver struct pass it around
between functions to make the code more readable.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be>
---
v2: rebased
---
 drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c  |  5 +++--
 drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c | 15 +++++++--------
 drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h |  5 +----
 3 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
index 447961b798b4..39834bedf460 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
@@ -636,6 +636,7 @@ int gpmi_read_data(struct gpmi_nand_data *this, void *buf, int len)
 	int chip = this->current_chip;
 	int ret;
 	u32 pio[2];
+	bool direct;
 
 	/* [1] : send PIO */
 	pio[0] = BF_GPMI_CTRL0_COMMAND_MODE(BV_GPMI_CTRL0_COMMAND_MODE__READ)
@@ -652,7 +653,7 @@ int gpmi_read_data(struct gpmi_nand_data *this, void *buf, int len)
 		return -EINVAL;
 
 	/* [2] : send DMA request */
-	prepare_data_dma(this, buf, len, DMA_FROM_DEVICE);
+	direct = prepare_data_dma(this, buf, len, DMA_FROM_DEVICE);
 	desc = dmaengine_prep_slave_sg(channel, &this->data_sgl,
 					1, DMA_DEV_TO_MEM,
 					DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
@@ -664,7 +665,7 @@ int gpmi_read_data(struct gpmi_nand_data *this, void *buf, int len)
 	ret = start_dma_without_bch_irq(this, desc);
 
 	dma_unmap_sg(this->dev, &this->data_sgl, 1, DMA_FROM_DEVICE);
-	if (this->direct_dma_map_ok == false)
+	if (!direct)
 		memcpy(buf, this->data_buffer_dma, len);
 
 	return ret;
diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
index 774ac98fbdb2..9ecba7ea0cd5 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
@@ -437,7 +437,7 @@ struct dma_chan *get_dma_chan(struct gpmi_nand_data *this)
 }
 
 /* Can we use the upper's buffer directly for DMA? */
-void prepare_data_dma(struct gpmi_nand_data *this, const void *buf, int len,
+bool prepare_data_dma(struct gpmi_nand_data *this, const void *buf, int len,
 		      enum dma_data_direction dr)
 {
 	struct scatterlist *sgl = &this->data_sgl;
@@ -450,8 +450,7 @@ void prepare_data_dma(struct gpmi_nand_data *this, const void *buf, int len,
 		if (ret == 0)
 			goto map_fail;
 
-		this->direct_dma_map_ok = true;
-		return;
+		return true;
 	}
 
 map_fail:
@@ -463,7 +462,7 @@ void prepare_data_dma(struct gpmi_nand_data *this, const void *buf, int len,
 
 	dma_map_sg(this->dev, sgl, 1, dr);
 
-	this->direct_dma_map_ok = false;
+	return false;
 }
 
 /* This will be called after the DMA operation is finished. */
@@ -956,12 +955,12 @@ static int gpmi_ecc_read_page_data(struct nand_chip *chip,
 	unsigned char *status;
 	unsigned int  max_bitflips = 0;
 	int           ret;
+	bool          direct = false;
 
 	dev_dbg(this->dev, "page number is : %d\n", page);
 
 	payload_virt = this->payload_virt;
 	payload_phys = this->payload_phys;
-	this->direct_dma_map_ok = false;
 
 	if (virt_addr_valid(buf)) {
 		dma_addr_t dest_phys;
@@ -971,7 +970,7 @@ static int gpmi_ecc_read_page_data(struct nand_chip *chip,
 		if (!dma_mapping_error(this->dev, dest_phys)) {
 			payload_virt = buf;
 			payload_phys = dest_phys;
-			this->direct_dma_map_ok = true;
+			direct = true;
 		}
 	}
 
@@ -981,7 +980,7 @@ static int gpmi_ecc_read_page_data(struct nand_chip *chip,
 	/* go! */
 	ret = gpmi_read_page(this, payload_phys, auxiliary_phys);
 
-	if (this->direct_dma_map_ok)
+	if (direct)
 		dma_unmap_single(this->dev, payload_phys, nfc_geo->payload_size,
 				 DMA_FROM_DEVICE);
 
@@ -993,7 +992,7 @@ static int gpmi_ecc_read_page_data(struct nand_chip *chip,
 	/* Loop over status bytes, accumulating ECC status. */
 	status = auxiliary_virt + nfc_geo->auxiliary_status_offset;
 
-	if (!this->direct_dma_map_ok)
+	if (!direct)
 		memcpy(buf, this->payload_virt, nfc_geo->payload_size);
 
 	for (i = 0; i < nfc_geo->ecc_chunk_count; i++, status++) {
diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h
index fba72ad28263..6aa10d6962d6 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h
@@ -141,9 +141,6 @@ struct gpmi_nand_data {
 	int			current_chip;
 	unsigned int		command_length;
 
-	/* for DMA operations */
-	bool			direct_dma_map_ok;
-
 	struct scatterlist	cmd_sgl;
 	char			*cmd_buffer;
 
@@ -174,7 +171,7 @@ struct gpmi_nand_data {
 /* Common Services */
 int common_nfc_set_geometry(struct gpmi_nand_data *);
 struct dma_chan *get_dma_chan(struct gpmi_nand_data *);
-void prepare_data_dma(struct gpmi_nand_data *, const void *buf, int len,
+bool prepare_data_dma(struct gpmi_nand_data *, const void *buf, int len,
 		      enum dma_data_direction dr);
 int start_dma_without_bch_irq(struct gpmi_nand_data *,
 			      struct dma_async_tx_descriptor *);
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 05/13] mtd: nand: gpmi: return valid value from bch_set_geometry()
  2018-04-26 15:41 optimizing the nand read performance by reducing interrupts from 4 to 1 Sam Lefebvre
                   ` (3 preceding siblings ...)
  2018-04-26 15:41 ` [PATCH 04/13] mtd: nand: gpmi: remove direct_dma_map_ok from driver data struct Sam Lefebvre
@ 2018-04-26 15:41 ` Sam Lefebvre
  2018-04-26 15:41 ` [PATCH 06/13] mtd: nand: gpmi: remove unnecessary variables Sam Lefebvre
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 22+ messages in thread
From: Sam Lefebvre @ 2018-04-26 15:41 UTC (permalink / raw)
  To: linux-mtd; +Cc: Dries Staelens, Sam Lefebvre, Arnout Vandecapelle, Sascha Hauer

From: Sascha Hauer <s.hauer@pengutronix.de>

The caller of bch_set_geometry() expects the return value to
be an error code, so !0 is not valid. return the error from the
just called function instead.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be>
---
v2: rebased
---
 drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
index 39834bedf460..83697b8df871 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
@@ -258,8 +258,9 @@ int bch_set_geometry(struct gpmi_nand_data *this)
 	unsigned int gf_len;
 	int ret;
 
-	if (common_nfc_set_geometry(this))
-		return !0;
+	ret = common_nfc_set_geometry(this);
+	if (ret)
+		return ret;
 
 	block_count   = bch_geo->ecc_chunk_count - 1;
 	block_size    = bch_geo->ecc_chunk_size;
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 06/13] mtd: nand: gpmi: remove unnecessary variables
  2018-04-26 15:41 optimizing the nand read performance by reducing interrupts from 4 to 1 Sam Lefebvre
                   ` (4 preceding siblings ...)
  2018-04-26 15:41 ` [PATCH 05/13] mtd: nand: gpmi: return valid value from bch_set_geometry() Sam Lefebvre
@ 2018-04-26 15:41 ` Sam Lefebvre
  2018-04-26 15:41 ` [PATCH 07/13] mtd: rawnand: gpmi: set aggregate ready/busy signalling Sam Lefebvre
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 22+ messages in thread
From: Sam Lefebvre @ 2018-04-26 15:41 UTC (permalink / raw)
  To: linux-mtd; +Cc: Dries Staelens, Sam Lefebvre, Arnout Vandecapelle, Sascha Hauer

From: Sascha Hauer <s.hauer@pengutronix.de>

Use this->auxiliary_virt and this->auxiliary_phys directly rather
than creating extra local variables for them.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be>
---
v2: rebased
---
 drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
index 9ecba7ea0cd5..7ba00b8ab288 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
@@ -949,8 +949,6 @@ static int gpmi_ecc_read_page_data(struct nand_chip *chip,
 	struct mtd_info *mtd = nand_to_mtd(chip);
 	void          *payload_virt;
 	dma_addr_t    payload_phys;
-	void          *auxiliary_virt;
-	dma_addr_t    auxiliary_phys;
 	unsigned int  i;
 	unsigned char *status;
 	unsigned int  max_bitflips = 0;
@@ -974,11 +972,8 @@ static int gpmi_ecc_read_page_data(struct nand_chip *chip,
 		}
 	}
 
-	auxiliary_virt = this->auxiliary_virt;
-	auxiliary_phys = this->auxiliary_phys;
-
 	/* go! */
-	ret = gpmi_read_page(this, payload_phys, auxiliary_phys);
+	ret = gpmi_read_page(this, payload_phys, this->auxiliary_phys);
 
 	if (direct)
 		dma_unmap_single(this->dev, payload_phys, nfc_geo->payload_size,
@@ -990,7 +985,7 @@ static int gpmi_ecc_read_page_data(struct nand_chip *chip,
 	}
 
 	/* Loop over status bytes, accumulating ECC status. */
-	status = auxiliary_virt + nfc_geo->auxiliary_status_offset;
+	status = this->auxiliary_virt + nfc_geo->auxiliary_status_offset;
 
 	if (!direct)
 		memcpy(buf, this->payload_virt, nfc_geo->payload_size);
@@ -1048,7 +1043,7 @@ static int gpmi_ecc_read_page_data(struct nand_chip *chip,
 						buf + i * nfc_geo->ecc_chunk_size,
 						nfc_geo->ecc_chunk_size,
 						eccbuf, eccbytes,
-						auxiliary_virt,
+						this->auxiliary_virt,
 						nfc_geo->metadata_size,
 						nfc_geo->ecc_strength);
 			} else {
@@ -1076,7 +1071,7 @@ static int gpmi_ecc_read_page_data(struct nand_chip *chip,
 	}
 
 	/* handle the block mark swapping */
-	block_mark_swapping(this, buf, auxiliary_virt);
+	block_mark_swapping(this, buf, this->auxiliary_virt);
 
 	if (oob_required) {
 		/*
@@ -1090,7 +1085,7 @@ static int gpmi_ecc_read_page_data(struct nand_chip *chip,
 		 * the block mark.
 		 */
 		memset(chip->oob_poi, ~0, mtd->oobsize);
-		chip->oob_poi[0] = ((uint8_t *) auxiliary_virt)[0];
+		chip->oob_poi[0] = ((uint8_t *)this->auxiliary_virt)[0];
 	}
 
 	return max_bitflips;
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 07/13] mtd: rawnand: gpmi: set aggregate ready/busy signalling
  2018-04-26 15:41 optimizing the nand read performance by reducing interrupts from 4 to 1 Sam Lefebvre
                   ` (5 preceding siblings ...)
  2018-04-26 15:41 ` [PATCH 06/13] mtd: nand: gpmi: remove unnecessary variables Sam Lefebvre
@ 2018-04-26 15:41 ` Sam Lefebvre
  2018-06-18 15:00   ` Miquel Raynal
  2018-04-26 15:41 ` [PATCH 08/13] mtd: rawnand: gpmi: return generated errors in gpmi_ecc_read_oob() Sam Lefebvre
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 22+ messages in thread
From: Sam Lefebvre @ 2018-04-26 15:41 UTC (permalink / raw)
  To: linux-mtd; +Cc: Dries Staelens, Sam Lefebvre, Arnout Vandecapelle, Han Xu

The GPMI block by default decouples the ready/busy signal (which is
shared between all chip-selects) to separate internal signals per
chip-select. However, since the gpmi-nand driver uses DMA0 for all
chip-selects, DMA transfers will not be able to see the ready/busy
status for any chip-select other than 0.

Currently, this happens to work because nand_command() ends with an
explicit nand_wait_ready() and the driver only sets up a single command
in a DMA chain. nand_wait_ready() polls for the chip-select specific
ready/busy status bit. Future patches, however, will set up a DMA chain
with several commands, so these will have to wait correctly in the DMA
chain itself.

To fix this, set the GANGED_RDYBUSY bit in the control1 register. This
ties all internal ready/busy signals together, so DMA0 will also see
the ready/busy status of chip selects 1-2-3. It's a bit silly that this
isn't implied in hardware by the DECOUPLE_CS bit.

Signed-off-by: Sam Lefebvre <sam.lefebvre@essensium.com>
Signed-off-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be>
Cc: Han Xu <han.xu@nxp.com>
---
Tested on an i.MX6Q with two identical chips on CS0 and CS1.
---
 drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c  | 3 +++
 drivers/mtd/nand/raw/gpmi-nand/gpmi-regs.h | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
index 83697b8df871..1858afdb400d 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
@@ -193,6 +193,9 @@ int gpmi_init(struct gpmi_nand_data *this)
 	 */
 	writel(BM_GPMI_CTRL1_DECOUPLE_CS, r->gpmi_regs + HW_GPMI_CTRL1_SET);
 
+	/* Aggregate ready busy signalling. */
+	writel(BM_GPMI_CTRL1_GANGED_RDYBUSY, r->gpmi_regs + HW_GPMI_CTRL1_SET);
+
 	gpmi_disable_clk(this);
 	return 0;
 err_out:
diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-regs.h b/drivers/mtd/nand/raw/gpmi-nand/gpmi-regs.h
index d92bf32221ca..e341802c90ac 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-regs.h
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-regs.h
@@ -120,6 +120,7 @@
 #define BV_GPMI_CTRL1_WRN_DLY_SEL_7_TO_12NS		0x2
 #define BV_GPMI_CTRL1_WRN_DLY_SEL_NO_DELAY		0x3
 
+#define BM_GPMI_CTRL1_GANGED_RDYBUSY		(1 << 19)
 #define BM_GPMI_CTRL1_BCH_MODE				(1 << 18)
 
 #define BP_GPMI_CTRL1_DLL_ENABLE			17
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 08/13] mtd: rawnand: gpmi: return generated errors in gpmi_ecc_read_oob()
  2018-04-26 15:41 optimizing the nand read performance by reducing interrupts from 4 to 1 Sam Lefebvre
                   ` (6 preceding siblings ...)
  2018-04-26 15:41 ` [PATCH 07/13] mtd: rawnand: gpmi: set aggregate ready/busy signalling Sam Lefebvre
@ 2018-04-26 15:41 ` Sam Lefebvre
  2018-04-26 16:43   ` Boris Brezillon
  2018-04-26 15:41 ` [PATCH 09/13] mtd: rawnand: gpmi: poll the BCH interrupt bit in start_dma_with_bch_irq() Sam Lefebvre
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 22+ messages in thread
From: Sam Lefebvre @ 2018-04-26 15:41 UTC (permalink / raw)
  To: linux-mtd; +Cc: Dries Staelens, Sam Lefebvre, Arnout Vandecapelle

Errors can be generated by nand_read_page_op(). It's important
to check the error flags and pass them upwards.

Signed-off-by: Sam Lefebvre <sam.lefebvre@essensium.com>

---

v2: also bail out before calling ->read_byte() (Boris)

When handling errors we need to bail out before calling
->read_byte().
---
 drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
index 7ba00b8ab288..60bc1bac7741 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
@@ -1334,13 +1334,18 @@ static int gpmi_ecc_read_oob(struct mtd_info *mtd, struct nand_chip *chip,
 				int page)
 {
 	struct gpmi_nand_data *this = nand_get_controller_data(chip);
+	int ret = 0;
 
 	dev_dbg(this->dev, "page number is %d\n", page);
 	/* clear the OOB buffer */
 	memset(chip->oob_poi, ~0, mtd->oobsize);
 
 	/* Read out the conventional OOB. */
-	nand_read_page_op(chip, page, mtd->writesize, NULL, 0);
+	ret = nand_read_page_op(chip, page, mtd->writesize, NULL, 0);
+
+	if (ret)
+		return ret;
+
 	chip->read_buf(mtd, chip->oob_poi, mtd->oobsize);
 
 	/*
@@ -1350,11 +1355,15 @@ static int gpmi_ecc_read_oob(struct mtd_info *mtd, struct nand_chip *chip,
 	 */
 	if (GPMI_IS_MX23(this)) {
 		/* Read the block mark into the first byte of the OOB buffer. */
-		nand_read_page_op(chip, page, 0, NULL, 0);
+		ret = nand_read_page_op(chip, page, 0, NULL, 0);
+
+		if (ret)
+			return ret;
+
 		chip->oob_poi[0] = chip->read_byte(mtd);
 	}
 
-	return 0;
+	return ret;
 }
 
 static int
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 09/13] mtd: rawnand: gpmi: poll the BCH interrupt bit in start_dma_with_bch_irq()
  2018-04-26 15:41 optimizing the nand read performance by reducing interrupts from 4 to 1 Sam Lefebvre
                   ` (7 preceding siblings ...)
  2018-04-26 15:41 ` [PATCH 08/13] mtd: rawnand: gpmi: return generated errors in gpmi_ecc_read_oob() Sam Lefebvre
@ 2018-04-26 15:41 ` Sam Lefebvre
  2018-04-26 16:49   ` Boris Brezillon
  2018-04-26 15:41 ` [PATCH 10/13] mtd: rawnand: gpmi: instantiate cmdfunc Sam Lefebvre
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 22+ messages in thread
From: Sam Lefebvre @ 2018-04-26 15:41 UTC (permalink / raw)
  To: linux-mtd; +Cc: Dries Staelens, Sam Lefebvre, Arnout Vandecapelle

To improve nand performance, the BCH interrupt bit must be polled
before waiting for occurence. This avoids an unnecessary context switch.

Signed-off-by: Sam Lefebvre <sam.lefebvre@essensium.com>
---
 drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
index 60bc1bac7741..28a2cf106ddc 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
@@ -503,18 +503,31 @@ int start_dma_without_bch_irq(struct gpmi_nand_data *this,
  * Actually, we must wait for two interrupts :
  *	[1] firstly the DMA interrupt and
  *	[2] secondly the BCH interrupt.
+ *	To improve performance, we first poll for the BCH
+ *	interrupt.
  */
 int start_dma_with_bch_irq(struct gpmi_nand_data *this,
 			struct dma_async_tx_descriptor *desc)
 {
 	struct completion *bch_c = &this->bch_done;
+	struct resources *r = &this->resources;
 	unsigned long timeout;
 
-	/* Prepare to receive an interrupt from the BCH block. */
-	init_completion(bch_c);
+	/* Disable interrupt */
+	writel(BM_BCH_CTRL_COMPLETE_IRQ_EN, r->bch_regs + HW_BCH_CTRL_CLR);
 
 	/* start the DMA */
 	start_dma_without_bch_irq(this, desc);
+	if (readl(r->bch_regs + HW_BCH_CTRL) & BM_BCH_CTRL_COMPLETE_IRQ) {
+		writel(BM_BCH_CTRL_COMPLETE_IRQ, r->bch_regs + HW_BCH_CTRL_CLR);
+		return 0;
+	}
+
+	/* Prepare to receive an interrupt from the BCH block. */
+	init_completion(bch_c);
+
+	/* Re-enable interrupt */
+	writel(BM_BCH_CTRL_COMPLETE_IRQ_EN, r->bch_regs + HW_BCH_CTRL_SET);
 
 	/* Wait for the interrupt from the BCH block. */
 	timeout = wait_for_completion_timeout(bch_c, msecs_to_jiffies(1000));
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 10/13] mtd: rawnand: gpmi: instantiate cmdfunc
  2018-04-26 15:41 optimizing the nand read performance by reducing interrupts from 4 to 1 Sam Lefebvre
                   ` (8 preceding siblings ...)
  2018-04-26 15:41 ` [PATCH 09/13] mtd: rawnand: gpmi: poll the BCH interrupt bit in start_dma_with_bch_irq() Sam Lefebvre
@ 2018-04-26 15:41 ` Sam Lefebvre
  2018-04-26 15:41 ` [PATCH 11/13] mtd: rawnand: gpmi: gpmi_nand_command(): simplification and formatting Sam Lefebvre
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 22+ messages in thread
From: Sam Lefebvre @ 2018-04-26 15:41 UTC (permalink / raw)
  To: linux-mtd; +Cc: Dries Staelens, Sam Lefebvre, Arnout Vandecapelle

From: "Arnout Vandecappelle (Essensium/Mind)" <arnout@mind.be>

Later patches will optimize the command handling for gpmi. This
requires a gpmmi-specific implementation of cmdfunc. As a first step,
nand_command() is copied literally from nand_base.c (after merging
nand_command() and nand_command_lp()).

The auxiliary functions nand_ccs_delay() and nand_wait_status_ready()
also need to be copied, together with the includes they need.

Signed-off-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be>
---
 drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c | 205 +++++++++++++++++++++++++++++
 1 file changed, 205 insertions(+)

diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
index 28a2cf106ddc..91205573f2bc 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
@@ -19,11 +19,13 @@
  * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
  */
 #include <linux/clk.h>
+#include <linux/delay.h>
 #include <linux/slab.h>
 #include <linux/sched/task_stack.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
 #include <linux/mtd/partitions.h>
+#include <linux/nmi.h>
 #include <linux/of.h>
 #include <linux/of_device.h>
 #include "gpmi-nand.h"
@@ -1104,6 +1106,208 @@ static int gpmi_ecc_read_page_data(struct nand_chip *chip,
 	return max_bitflips;
 }
 
+static void gpmi_wait_status_ready(struct mtd_info *mtd, unsigned long timeo)
+{
+	register struct nand_chip *chip = mtd_to_nand(mtd);
+	int ret;
+
+	timeo = jiffies + msecs_to_jiffies(timeo);
+	do {
+		u8 status;
+
+		ret = nand_read_data_op(chip, &status, sizeof(status), true);
+		if (ret)
+			return;
+
+		if (status & NAND_STATUS_READY)
+			break;
+		touch_softlockup_watchdog();
+	} while (time_before(jiffies, timeo));
+};
+
+static void gpmi_ccs_delay(struct nand_chip *chip)
+{
+	/*
+	 * The controller already takes care of waiting for tCCS when the RNDIN
+	 * or RNDOUT command is sent, return directly.
+	 */
+	if (!(chip->options & NAND_WAIT_TCCS))
+		return;
+
+	/*
+	 * Wait tCCS_min if it is correctly defined, otherwise wait 500ns
+	 * (which should be safe for all NANDs).
+	 */
+	if (chip->setup_data_interface)
+		ndelay(chip->data_interface.timings.sdr.tCCS_min / 1000);
+	else
+		ndelay(500);
+}
+
+/**
+ * gpmi_nand_command - Send command to NAND device
+ * @mtd: MTD device structure
+ * @command: the command to be sent
+ * @column: the column address for this command, -1 if none
+ * @page_addr: the page address for this command, -1 if none
+ *
+ * Send command to NAND device.
+ */
+static void gpmi_nand_command(struct mtd_info *mtd, unsigned int command,
+			      int column, int page_addr)
+{
+	register struct nand_chip *chip = mtd_to_nand(mtd);
+	int ctrl = NAND_NCE | NAND_CLE | NAND_CTRL_CHANGE;
+	/* Large page devices (> 512 bytes) behave slightly differently. */
+	bool is_lp = mtd->writesize > 512;
+
+	if (is_lp) {
+		/* Large page devices don't have the separate regions as we
+		 * have in the small page devices. We must emulate
+		 * NAND_CMD_READOOB to keep the code compatible.
+		 */
+		if (command == NAND_CMD_READOOB) {
+			column += mtd->writesize;
+			command = NAND_CMD_READ0;
+		}
+	} else if (command == NAND_CMD_SEQIN) {
+		/* Write out the command to the device */
+		int readcmd;
+
+		if (column >= mtd->writesize) {
+			/* OOB area */
+			column -= mtd->writesize;
+			readcmd = NAND_CMD_READOOB;
+		} else if (column < 256) {
+			/* First 256 bytes --> READ0 */
+			readcmd = NAND_CMD_READ0;
+		} else {
+			column -= 256;
+			readcmd = NAND_CMD_READ1;
+		}
+		chip->cmd_ctrl(mtd, readcmd, ctrl);
+		ctrl &= ~NAND_CTRL_CHANGE;
+	}
+
+	/* Command latch cycle */
+	if (command != NAND_CMD_NONE)
+		chip->cmd_ctrl(mtd, command, ctrl);
+
+	/* Address cycle, when necessary */
+	ctrl = NAND_NCE | NAND_ALE | NAND_CTRL_CHANGE;
+	/* Serially input address */
+	if (column != -1) {
+		/* Adjust columns for 16 bit buswidth */
+		if (chip->options & NAND_BUSWIDTH_16 &&
+				!nand_opcode_8bits(command))
+			column >>= 1;
+		chip->cmd_ctrl(mtd, column, ctrl);
+		ctrl &= ~NAND_CTRL_CHANGE;
+
+		/* Only output a single addr cycle for 8bits opcodes. */
+		if (is_lp && !nand_opcode_8bits(command))
+			chip->cmd_ctrl(mtd, column >> 8, ctrl);
+	}
+	if (page_addr != -1) {
+		chip->cmd_ctrl(mtd, page_addr, ctrl);
+		ctrl &= ~NAND_CTRL_CHANGE;
+		chip->cmd_ctrl(mtd, page_addr >> 8, ctrl);
+		if (chip->options & NAND_ROW_ADDR_3)
+			chip->cmd_ctrl(mtd, page_addr >> 16, ctrl);
+	}
+	chip->cmd_ctrl(mtd, NAND_CMD_NONE, NAND_NCE | NAND_CTRL_CHANGE);
+
+	/*
+	 * Program and erase have their own busy handlers status, sequential
+	 * in and status need no delay.
+	 */
+	switch (command) {
+
+	case NAND_CMD_NONE:
+	case NAND_CMD_PAGEPROG:
+	case NAND_CMD_ERASE1:
+	case NAND_CMD_ERASE2:
+	case NAND_CMD_SEQIN:
+	case NAND_CMD_STATUS:
+	case NAND_CMD_READID:
+	case NAND_CMD_SET_FEATURES:
+		return;
+
+	case NAND_CMD_CACHEDPROG:
+		if (is_lp)
+			return;
+		break;
+
+	case NAND_CMD_RNDIN:
+		if (is_lp) {
+			gpmi_ccs_delay(chip);
+			return;
+		}
+		break;
+
+	case NAND_CMD_RESET:
+		if (chip->dev_ready)
+			break;
+		udelay(chip->chip_delay);
+		chip->cmd_ctrl(mtd, NAND_CMD_STATUS,
+			       NAND_NCE | NAND_CLE | NAND_CTRL_CHANGE);
+		chip->cmd_ctrl(mtd, NAND_CMD_NONE,
+			       NAND_NCE | NAND_CTRL_CHANGE);
+		/* EZ-NAND can take upto 250ms as per ONFi v4.0 */
+		gpmi_wait_status_ready(mtd, 250);
+		return;
+
+	case NAND_CMD_RNDOUT:
+		if (is_lp) {
+			/* No ready / busy check necessary */
+			chip->cmd_ctrl(mtd, NAND_CMD_RNDOUTSTART,
+				NAND_NCE | NAND_CLE | NAND_CTRL_CHANGE);
+			chip->cmd_ctrl(mtd, NAND_CMD_NONE,
+				NAND_NCE | NAND_CTRL_CHANGE);
+
+			gpmi_ccs_delay(chip);
+			return;
+		}
+		break;
+
+	case NAND_CMD_READ0:
+		/*
+		 * READ0 is sometimes used to exit GET STATUS mode. When this
+		 * is the case no address cycles are requested, and we can use
+		 * this information to detect that that we should not wait for
+		 * the device to be ready and READSTART should not be issued.
+		 */
+		if (column == -1 && page_addr == -1)
+			return;
+
+		if (is_lp) {
+			chip->cmd_ctrl(mtd, NAND_CMD_READSTART,
+				       NAND_NCE | NAND_CLE | NAND_CTRL_CHANGE);
+			chip->cmd_ctrl(mtd, NAND_CMD_NONE,
+				       NAND_NCE | NAND_CTRL_CHANGE);
+		}
+		/* Read commands must wait */
+		break;
+	}
+	/*
+	 * If we don't have access to the busy pin, we apply the given command
+	 * delay.
+	 */
+	if (!chip->dev_ready) {
+		udelay(chip->chip_delay);
+		return;
+	}
+
+	/*
+	 * Apply this short delay always to ensure that we do wait tWB in
+	 * any case on any machine.
+	 */
+	ndelay(100);
+
+	nand_wait_ready(mtd);
+}
+
+
 static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 			      uint8_t *buf, int oob_required, int page)
 {
@@ -1912,6 +2116,7 @@ static int gpmi_nand_init(struct gpmi_nand_data *this)
 	chip->select_chip	= gpmi_select_chip;
 	chip->setup_data_interface = gpmi_setup_data_interface;
 	chip->cmd_ctrl		= gpmi_cmd_ctrl;
+	chip->cmdfunc		= gpmi_nand_command;
 	chip->dev_ready		= gpmi_dev_ready;
 	chip->read_byte		= gpmi_read_byte;
 	chip->read_buf		= gpmi_read_buf;
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 11/13] mtd: rawnand: gpmi: gpmi_nand_command(): simplification and formatting
  2018-04-26 15:41 optimizing the nand read performance by reducing interrupts from 4 to 1 Sam Lefebvre
                   ` (9 preceding siblings ...)
  2018-04-26 15:41 ` [PATCH 10/13] mtd: rawnand: gpmi: instantiate cmdfunc Sam Lefebvre
@ 2018-04-26 15:41 ` Sam Lefebvre
  2018-04-26 15:41 ` [PATCH 12/13] mtd: rawnand: gpmi: issue two commands in a single DMA chain Sam Lefebvre
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 22+ messages in thread
From: Sam Lefebvre @ 2018-04-26 15:41 UTC (permalink / raw)
  To: linux-mtd; +Cc: Dries Staelens, Sam Lefebvre, Arnout Vandecapelle

From: "Arnout Vandecappelle (Essensium/Mind)" <arnout@mind.be>

The controller already takes care of waiting for tCCS when the RNDIN
or RNDOUT command is sent. That is, the NAND_WAIT_TCCS option is not
set for gpmi-nand. Therefore, the gpmi_ccs_delay() calls are not
needed. Gpmi-nand implements the dev_ready function, so explicit delays are not
needed for the RESET and READ0 commands. Therefore, the
gpmi_wait_status_ready() function and the includes it brings in are not
needed. The RESET command just falls through to the nand_wait_ready() call.

Commands are send using DMA, and the DMA already waits for the
ready/busy signal from the NAND. So no explicit delay and call to
nand_wait_ready() is needed.

This makes it possible to simplify the switch in gpmi_nand_command() a
lot.

Since gpmi-nand now implements cmdfunc, it is no longer needed to
implement cmd_ctrl. Call gpmi_cmd_ctrl directly.

gpmi_cmd_ctrl() has two "states":

* ALE or CLE is set, in this case the command/control data is buffered.
  These calls are replaced with
	this->cmd_buffer[this->command_length++] = data;

* ALE and CLE are not set, in this case the command is sent (DMA is
  started). These calls are replaced with
	ret = gpmi_send_command(this);
	if (ret)
		dev_err(this->dev, "Chip: %u, Error %d\n",
			this->current_chip, ret);
	this->command_length = 0;

The 'ctrl' variable/parameter is not used.

To enable chaining the two DMAs corresponding to the two commands that
can be issued by gpmi_nand_command(), duplicate the cmd_sgl and
command_length. For cmd_buffer, instead of duplicating it, we can use
a fixed offset within the buffer.

The appropriate sgl and command_length is passed to gpmi_send_command(),
and the sg_init_one(), dma_map_sg() and dma_unmap_sg() calls are hoisted
out of it. This is needed because the unmapping should only be done
after both commands have finished.

Signed-off-by: Sam Lefebvre <sam.lefebvre@essensium.com>
Signed-off-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be>
---
 drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c  |  12 +-
 drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c | 224 +++++++----------------------
 drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h |   9 +-
 3 files changed, 64 insertions(+), 181 deletions(-)

diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
index 1858afdb400d..46b2208df30e 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
@@ -548,11 +548,11 @@ int gpmi_is_ready(struct gpmi_nand_data *this, unsigned chip)
 	return reg & mask;
 }
 
-int gpmi_send_command(struct gpmi_nand_data *this)
+int gpmi_send_command(struct gpmi_nand_data *this, struct scatterlist *sgl,
+		      unsigned int command_length)
 {
 	struct dma_chan *channel = get_dma_chan(this);
 	struct dma_async_tx_descriptor *desc;
-	struct scatterlist *sgl;
 	int chip = this->current_chip;
 	int ret;
 	u32 pio[3];
@@ -564,7 +564,7 @@ int gpmi_send_command(struct gpmi_nand_data *this)
 		| BF_GPMI_CTRL0_LOCK_CS(LOCK_CS_ENABLE, this)
 		| BF_GPMI_CTRL0_ADDRESS(BV_GPMI_CTRL0_ADDRESS__NAND_CLE)
 		| BM_GPMI_CTRL0_ADDRESS_INCREMENT
-		| BF_GPMI_CTRL0_XFER_COUNT(this->command_length);
+		| BF_GPMI_CTRL0_XFER_COUNT(command_length);
 	pio[1] = pio[2] = 0;
 	desc = dmaengine_prep_slave_sg(channel,
 					(struct scatterlist *)pio,
@@ -573,10 +573,6 @@ int gpmi_send_command(struct gpmi_nand_data *this)
 		return -EINVAL;
 
 	/* [2] send out the COMMAND + ADDRESS string stored in @buffer */
-	sgl = &this->cmd_sgl;
-
-	sg_init_one(sgl, this->cmd_buffer, this->command_length);
-	dma_map_sg(this->dev, sgl, 1, DMA_TO_DEVICE);
 	desc = dmaengine_prep_slave_sg(channel,
 				sgl, 1, DMA_MEM_TO_DEV,
 				DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
@@ -586,8 +582,6 @@ int gpmi_send_command(struct gpmi_nand_data *this)
 	/* [3] submit the DMA */
 	ret = start_dma_without_bch_irq(this, desc);
 
-	dma_unmap_sg(this->dev, sgl, 1, DMA_TO_DEVICE);
-
 	return ret;
 }
 
diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
index 91205573f2bc..81accbf175bf 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
@@ -19,13 +19,11 @@
  * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
  */
 #include <linux/clk.h>
-#include <linux/delay.h>
 #include <linux/slab.h>
 #include <linux/sched/task_stack.h>
 #include <linux/interrupt.h>
 #include <linux/module.h>
 #include <linux/mtd/partitions.h>
-#include <linux/nmi.h>
 #include <linux/of.h>
 #include <linux/of_device.h>
 #include "gpmi-nand.h"
@@ -801,40 +799,6 @@ static int gpmi_alloc_dma_buffer(struct gpmi_nand_data *this)
 	return -ENOMEM;
 }
 
-static void gpmi_cmd_ctrl(struct mtd_info *mtd, int data, unsigned int ctrl)
-{
-	struct nand_chip *chip = mtd_to_nand(mtd);
-	struct gpmi_nand_data *this = nand_get_controller_data(chip);
-	int ret;
-
-	/*
-	 * Every operation begins with a command byte and a series of zero or
-	 * more address bytes. These are distinguished by either the Address
-	 * Latch Enable (ALE) or Command Latch Enable (CLE) signals being
-	 * asserted. When MTD is ready to execute the command, it will deassert
-	 * both latch enables.
-	 *
-	 * Rather than run a separate DMA operation for every single byte, we
-	 * queue them up and run a single DMA operation for the entire series
-	 * of command and data bytes. NAND_CMD_NONE means the END of the queue.
-	 */
-	if ((ctrl & (NAND_ALE | NAND_CLE))) {
-		if (data != NAND_CMD_NONE)
-			this->cmd_buffer[this->command_length++] = data;
-		return;
-	}
-
-	if (!this->command_length)
-		return;
-
-	ret = gpmi_send_command(this);
-	if (ret)
-		dev_err(this->dev, "Chip: %u, Error %d\n",
-			this->current_chip, ret);
-
-	this->command_length = 0;
-}
-
 static int gpmi_dev_ready(struct mtd_info *mtd)
 {
 	struct nand_chip *chip = mtd_to_nand(mtd);
@@ -1106,44 +1070,6 @@ static int gpmi_ecc_read_page_data(struct nand_chip *chip,
 	return max_bitflips;
 }
 
-static void gpmi_wait_status_ready(struct mtd_info *mtd, unsigned long timeo)
-{
-	register struct nand_chip *chip = mtd_to_nand(mtd);
-	int ret;
-
-	timeo = jiffies + msecs_to_jiffies(timeo);
-	do {
-		u8 status;
-
-		ret = nand_read_data_op(chip, &status, sizeof(status), true);
-		if (ret)
-			return;
-
-		if (status & NAND_STATUS_READY)
-			break;
-		touch_softlockup_watchdog();
-	} while (time_before(jiffies, timeo));
-};
-
-static void gpmi_ccs_delay(struct nand_chip *chip)
-{
-	/*
-	 * The controller already takes care of waiting for tCCS when the RNDIN
-	 * or RNDOUT command is sent, return directly.
-	 */
-	if (!(chip->options & NAND_WAIT_TCCS))
-		return;
-
-	/*
-	 * Wait tCCS_min if it is correctly defined, otherwise wait 500ns
-	 * (which should be safe for all NANDs).
-	 */
-	if (chip->setup_data_interface)
-		ndelay(chip->data_interface.timings.sdr.tCCS_min / 1000);
-	else
-		ndelay(500);
-}
-
 /**
  * gpmi_nand_command - Send command to NAND device
  * @mtd: MTD device structure
@@ -1157,10 +1083,14 @@ static void gpmi_nand_command(struct mtd_info *mtd, unsigned int command,
 			      int column, int page_addr)
 {
 	register struct nand_chip *chip = mtd_to_nand(mtd);
-	int ctrl = NAND_NCE | NAND_CLE | NAND_CTRL_CHANGE;
+	struct gpmi_nand_data *this = nand_get_controller_data(chip);
+	int ret;
 	/* Large page devices (> 512 bytes) behave slightly differently. */
 	bool is_lp = mtd->writesize > 512;
 
+	BUG_ON(this->command_length != 0);
+	BUG_ON(this->command_length2 != 0);
+
 	if (is_lp) {
 		/* Large page devices don't have the separate regions as we
 		 * have in the small page devices. We must emulate
@@ -1185,126 +1115,81 @@ static void gpmi_nand_command(struct mtd_info *mtd, unsigned int command,
 			column -= 256;
 			readcmd = NAND_CMD_READ1;
 		}
-		chip->cmd_ctrl(mtd, readcmd, ctrl);
-		ctrl &= ~NAND_CTRL_CHANGE;
+		this->cmd_buffer[this->command_length++] = readcmd;
 	}
 
 	/* Command latch cycle */
 	if (command != NAND_CMD_NONE)
-		chip->cmd_ctrl(mtd, command, ctrl);
+		this->cmd_buffer[this->command_length++] = command;
 
-	/* Address cycle, when necessary */
-	ctrl = NAND_NCE | NAND_ALE | NAND_CTRL_CHANGE;
 	/* Serially input address */
 	if (column != -1) {
 		/* Adjust columns for 16 bit buswidth */
 		if (chip->options & NAND_BUSWIDTH_16 &&
 				!nand_opcode_8bits(command))
 			column >>= 1;
-		chip->cmd_ctrl(mtd, column, ctrl);
-		ctrl &= ~NAND_CTRL_CHANGE;
+		this->cmd_buffer[this->command_length++] = column;
 
 		/* Only output a single addr cycle for 8bits opcodes. */
 		if (is_lp && !nand_opcode_8bits(command))
-			chip->cmd_ctrl(mtd, column >> 8, ctrl);
+			this->cmd_buffer[this->command_length++] = column >> 8;
 	}
 	if (page_addr != -1) {
-		chip->cmd_ctrl(mtd, page_addr, ctrl);
-		ctrl &= ~NAND_CTRL_CHANGE;
-		chip->cmd_ctrl(mtd, page_addr >> 8, ctrl);
+		this->cmd_buffer[this->command_length++] = page_addr;
+		this->cmd_buffer[this->command_length++] = page_addr >> 8;
 		if (chip->options & NAND_ROW_ADDR_3)
-			chip->cmd_ctrl(mtd, page_addr >> 16, ctrl);
+			this->cmd_buffer[this->command_length++] = page_addr >> 16;
 	}
-	chip->cmd_ctrl(mtd, NAND_CMD_NONE, NAND_NCE | NAND_CTRL_CHANGE);
-
-	/*
-	 * Program and erase have their own busy handlers status, sequential
-	 * in and status need no delay.
-	 */
-	switch (command) {
-
-	case NAND_CMD_NONE:
-	case NAND_CMD_PAGEPROG:
-	case NAND_CMD_ERASE1:
-	case NAND_CMD_ERASE2:
-	case NAND_CMD_SEQIN:
-	case NAND_CMD_STATUS:
-	case NAND_CMD_READID:
-	case NAND_CMD_SET_FEATURES:
-		return;
-
-	case NAND_CMD_CACHEDPROG:
-		if (is_lp)
-			return;
-		break;
 
-	case NAND_CMD_RNDIN:
-		if (is_lp) {
-			gpmi_ccs_delay(chip);
-			return;
-		}
-		break;
-
-	case NAND_CMD_RESET:
-		if (chip->dev_ready)
-			break;
-		udelay(chip->chip_delay);
-		chip->cmd_ctrl(mtd, NAND_CMD_STATUS,
-			       NAND_NCE | NAND_CLE | NAND_CTRL_CHANGE);
-		chip->cmd_ctrl(mtd, NAND_CMD_NONE,
-			       NAND_NCE | NAND_CTRL_CHANGE);
-		/* EZ-NAND can take upto 250ms as per ONFi v4.0 */
-		gpmi_wait_status_ready(mtd, 250);
-		return;
-
-	case NAND_CMD_RNDOUT:
-		if (is_lp) {
+	/* Reuse the same cmd_buffer for the possible second command. The address
+	 * must be word-aligned. For convenience, use a fixed offset of 64, much
+	 * larger than the maximum command_length. */
+	if (is_lp)
+		switch (command) {
+		case NAND_CMD_RNDOUT:
 			/* No ready / busy check necessary */
-			chip->cmd_ctrl(mtd, NAND_CMD_RNDOUTSTART,
-				NAND_NCE | NAND_CLE | NAND_CTRL_CHANGE);
-			chip->cmd_ctrl(mtd, NAND_CMD_NONE,
-				NAND_NCE | NAND_CTRL_CHANGE);
+			this->cmd_buffer[64 + this->command_length2++] = NAND_CMD_RNDOUTSTART;
+			break;
 
-			gpmi_ccs_delay(chip);
-			return;
+		case NAND_CMD_READ0:
+			/*
+			 * READ0 is sometimes used to exit GET STATUS mode. When this
+			 * is the case no address cycles are requested, and we can use
+			 * this information to detect that that we should not wait for
+			 * the device to be ready and READSTART should not be issued.
+			 */
+			if (column != -1 || page_addr != -1)
+				this->cmd_buffer[64 + this->command_length2++] = NAND_CMD_READSTART;
+			break;
 		}
-		break;
 
-	case NAND_CMD_READ0:
-		/*
-		 * READ0 is sometimes used to exit GET STATUS mode. When this
-		 * is the case no address cycles are requested, and we can use
-		 * this information to detect that that we should not wait for
-		 * the device to be ready and READSTART should not be issued.
-		 */
-		if (column == -1 && page_addr == -1)
-			return;
-
-		if (is_lp) {
-			chip->cmd_ctrl(mtd, NAND_CMD_READSTART,
-				       NAND_NCE | NAND_CLE | NAND_CTRL_CHANGE);
-			chip->cmd_ctrl(mtd, NAND_CMD_NONE,
-				       NAND_NCE | NAND_CTRL_CHANGE);
-		}
-		/* Read commands must wait */
-		break;
-	}
-	/*
-	 * If we don't have access to the busy pin, we apply the given command
-	 * delay.
-	 */
-	if (!chip->dev_ready) {
-		udelay(chip->chip_delay);
-		return;
+	/* This starts the DMA for the command and waits for it to finish. */
+	if (this->command_length > 0) {
+		sg_init_one(&this->cmd_sgl, this->cmd_buffer, this->command_length);
+		dma_map_sg(this->dev, &this->cmd_sgl, 1, DMA_TO_DEVICE);
+		ret = gpmi_send_command(this, &this->cmd_sgl, this->command_length);
+		if (ret)
+			dev_err(this->dev, "Chip: %u, Error %d\n",
+				this->current_chip, ret);
 	}
 
-	/*
-	 * Apply this short delay always to ensure that we do wait tWB in
-	 * any case on any machine.
-	 */
-	ndelay(100);
+	if (this->command_length2 > 0) {
+		sg_init_one(&this->cmd_sgl2, this->cmd_buffer + 64, this->command_length2);
+		dma_map_sg(this->dev, &this->cmd_sgl2, 1, DMA_TO_DEVICE);
+		ret = gpmi_send_command(this, &this->cmd_sgl2, this->command_length2);
+		if (ret)
+			dev_err(this->dev, "Chip: %u, Error %d\n",
+				this->current_chip, ret);
+	}
 
-	nand_wait_ready(mtd);
+	if (this->command_length > 0) {
+		dma_unmap_sg(this->dev, &this->cmd_sgl, 1, DMA_TO_DEVICE);
+		this->command_length = 0;
+	}
+	if (this->command_length2 > 0) {
+		dma_unmap_sg(this->dev, &this->cmd_sgl2, 1, DMA_TO_DEVICE);
+		this->command_length2 = 0;
+	}
 }
 
 
@@ -2115,7 +2000,6 @@ static int gpmi_nand_init(struct gpmi_nand_data *this)
 	nand_set_flash_node(chip, this->pdev->dev.of_node);
 	chip->select_chip	= gpmi_select_chip;
 	chip->setup_data_interface = gpmi_setup_data_interface;
-	chip->cmd_ctrl		= gpmi_cmd_ctrl;
 	chip->cmdfunc		= gpmi_nand_command;
 	chip->dev_ready		= gpmi_dev_ready;
 	chip->read_byte		= gpmi_read_byte;
diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h
index 6aa10d6962d6..9dc3dd16fa0b 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h
@@ -139,11 +139,15 @@ struct gpmi_nand_data {
 
 	/* General-use Variables */
 	int			current_chip;
-	unsigned int		command_length;
 
+	unsigned int		command_length;
 	struct scatterlist	cmd_sgl;
 	char			*cmd_buffer;
 
+	unsigned int		command_length2;
+	struct scatterlist	cmd_sgl2;
+	char			*cmd_buffer2;
+
 	struct scatterlist	data_sgl;
 	char			*data_buffer_dma;
 
@@ -184,7 +188,8 @@ void gpmi_clear_bch(struct gpmi_nand_data *);
 void gpmi_dump_info(struct gpmi_nand_data *);
 int bch_set_geometry(struct gpmi_nand_data *);
 int gpmi_is_ready(struct gpmi_nand_data *, unsigned chip);
-int gpmi_send_command(struct gpmi_nand_data *);
+int gpmi_send_command(struct gpmi_nand_data *, struct scatterlist *sgl,
+		      unsigned int command_length);
 int gpmi_enable_clk(struct gpmi_nand_data *this);
 int gpmi_disable_clk(struct gpmi_nand_data *this);
 int gpmi_setup_data_interface(struct mtd_info *mtd, int chipnr,
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 12/13] mtd: rawnand: gpmi: issue two commands in a single DMA chain
  2018-04-26 15:41 optimizing the nand read performance by reducing interrupts from 4 to 1 Sam Lefebvre
                   ` (10 preceding siblings ...)
  2018-04-26 15:41 ` [PATCH 11/13] mtd: rawnand: gpmi: gpmi_nand_command(): simplification and formatting Sam Lefebvre
@ 2018-04-26 15:41 ` Sam Lefebvre
  2018-04-26 15:41 ` [PATCH 13/13] mtd: rawnand: gmpi: chain gpmi_nand_command() with gpmi_ecc_read_page_data() Sam Lefebvre
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 22+ messages in thread
From: Sam Lefebvre @ 2018-04-26 15:41 UTC (permalink / raw)
  To: linux-mtd; +Cc: Dries Staelens, Sam Lefebvre, Arnout Vandecapelle

gpmi_nand_command() may issue two commands. Instead of issuing them as
separate DMAs, chain them together and issue only a single DMA.

This removes one DMA interrupt and associated overhead. For full-page
reads with ECC, it reduces the number of interrupts from 4 per page to
3 per page (2 DMA interrupts + 1 BCH interrupt).

Signed-off-by: Sam Lefebvre <sam.lefebvre@essensium.com>
Signed-off-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be>
---
 drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c  | 12 +++++++-----
 drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c |  8 +++++---
 drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h |  2 +-
 3 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
index 46b2208df30e..8f5a2a242228 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
@@ -549,12 +549,12 @@ int gpmi_is_ready(struct gpmi_nand_data *this, unsigned chip)
 }
 
 int gpmi_send_command(struct gpmi_nand_data *this, struct scatterlist *sgl,
-		      unsigned int command_length)
+		      unsigned int command_length, bool chain, bool start_dma)
 {
 	struct dma_chan *channel = get_dma_chan(this);
 	struct dma_async_tx_descriptor *desc;
 	int chip = this->current_chip;
-	int ret;
+	int ret = 0;
 	u32 pio[3];
 
 	/* [1] send out the PIO words */
@@ -568,7 +568,8 @@ int gpmi_send_command(struct gpmi_nand_data *this, struct scatterlist *sgl,
 	pio[1] = pio[2] = 0;
 	desc = dmaengine_prep_slave_sg(channel,
 					(struct scatterlist *)pio,
-					ARRAY_SIZE(pio), DMA_TRANS_NONE, 0);
+					ARRAY_SIZE(pio), DMA_TRANS_NONE,
+					chain ? DMA_PREP_INTERRUPT : 0);
 	if (!desc)
 		return -EINVAL;
 
@@ -579,8 +580,9 @@ int gpmi_send_command(struct gpmi_nand_data *this, struct scatterlist *sgl,
 	if (!desc)
 		return -EINVAL;
 
-	/* [3] submit the DMA */
-	ret = start_dma_without_bch_irq(this, desc);
+	if (start_dma)
+		/* [3] submit the DMA */
+		ret = start_dma_without_bch_irq(this, desc);
 
 	return ret;
 }
diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
index 81accbf175bf..49599d6dd841 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
@@ -1163,11 +1163,12 @@ static void gpmi_nand_command(struct mtd_info *mtd, unsigned int command,
 			break;
 		}
 
-	/* This starts the DMA for the command and waits for it to finish. */
+	/* Chain the two commands, start the DMA and wait for it to finish. */
 	if (this->command_length > 0) {
 		sg_init_one(&this->cmd_sgl, this->cmd_buffer, this->command_length);
 		dma_map_sg(this->dev, &this->cmd_sgl, 1, DMA_TO_DEVICE);
-		ret = gpmi_send_command(this, &this->cmd_sgl, this->command_length);
+		ret = gpmi_send_command(this, &this->cmd_sgl, this->command_length,
+					false, this->command_length2 == 0);
 		if (ret)
 			dev_err(this->dev, "Chip: %u, Error %d\n",
 				this->current_chip, ret);
@@ -1176,7 +1177,8 @@ static void gpmi_nand_command(struct mtd_info *mtd, unsigned int command,
 	if (this->command_length2 > 0) {
 		sg_init_one(&this->cmd_sgl2, this->cmd_buffer + 64, this->command_length2);
 		dma_map_sg(this->dev, &this->cmd_sgl2, 1, DMA_TO_DEVICE);
-		ret = gpmi_send_command(this, &this->cmd_sgl2, this->command_length2);
+		ret = gpmi_send_command(this, &this->cmd_sgl2, this->command_length2,
+					this->command_length > 0, true);
 		if (ret)
 			dev_err(this->dev, "Chip: %u, Error %d\n",
 				this->current_chip, ret);
diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h
index 9dc3dd16fa0b..a09ec300754f 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h
@@ -189,7 +189,7 @@ void gpmi_dump_info(struct gpmi_nand_data *);
 int bch_set_geometry(struct gpmi_nand_data *);
 int gpmi_is_ready(struct gpmi_nand_data *, unsigned chip);
 int gpmi_send_command(struct gpmi_nand_data *, struct scatterlist *sgl,
-		      unsigned int command_length);
+		      unsigned int command_length, bool chain, bool start_dma);
 int gpmi_enable_clk(struct gpmi_nand_data *this);
 int gpmi_disable_clk(struct gpmi_nand_data *this);
 int gpmi_setup_data_interface(struct mtd_info *mtd, int chipnr,
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 13/13] mtd: rawnand: gmpi: chain gpmi_nand_command() with gpmi_ecc_read_page_data()
  2018-04-26 15:41 optimizing the nand read performance by reducing interrupts from 4 to 1 Sam Lefebvre
                   ` (11 preceding siblings ...)
  2018-04-26 15:41 ` [PATCH 12/13] mtd: rawnand: gpmi: issue two commands in a single DMA chain Sam Lefebvre
@ 2018-04-26 15:41 ` Sam Lefebvre
  2018-04-26 21:46   ` Boris Brezillon
  2018-04-26 16:28 ` optimizing the nand read performance by reducing interrupts from 4 to 1 Boris Brezillon
  2018-04-26 20:07 ` Boris Brezillon
  14 siblings, 1 reply; 22+ messages in thread
From: Sam Lefebvre @ 2018-04-26 15:41 UTC (permalink / raw)
  To: linux-mtd; +Cc: Dries Staelens, Sam Lefebvre, Arnout Vandecapelle

An additional interrupt can be avoided in the page and subpage
read operations by adding a flag start_dma to indicate weather
gpmi_nand_command() needs to be chained with other dma commands
or not.

In case of chaining, gpmi_read_page() gets a flag that adds:
DMA_PREP_INTERRUPT: chain the dma with previous one
DMA_PREP_PQ_DISABLE_P: wait for prefetching being ready
before reading, turns on the CCW_WAIT4RDY flag.

Signed-off-by: Sam Lefebvre <sam.lefebvre@essensium.com>
---
 drivers/dma/mxs-dma.c                      |  3 +++
 drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c  |  6 ++++--
 drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c | 12 ++++++++++--
 drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h |  3 ++-
 4 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/dma/mxs-dma.c b/drivers/dma/mxs-dma.c
index 41d167921fab..cb13fc759f97 100644
--- a/drivers/dma/mxs-dma.c
+++ b/drivers/dma/mxs-dma.c
@@ -82,6 +82,7 @@
 #define CCW_CHAIN		(1 << 2)
 #define CCW_IRQ			(1 << 3)
 #define CCW_DEC_SEM		(1 << 6)
+#define CCW_WAIT4RDY		(1 << 5)
 #define CCW_WAIT4END		(1 << 7)
 #define CCW_HALT_ON_TERM	(1 << 8)
 #define CCW_TERM_FLUSH		(1 << 9)
@@ -551,6 +552,8 @@ static struct dma_async_tx_descriptor *mxs_dma_prep_slave_sg(
 		ccw->bits |= CCW_TERM_FLUSH;
 		ccw->bits |= BF_CCW(sg_len, PIO_NUM);
 		ccw->bits |= BF_CCW(MXS_DMA_CMD_NO_XFER, COMMAND);
+		if (flags & DMA_PREP_PQ_DISABLE_P)
+			ccw->bits |= CCW_WAIT4RDY;
 	} else {
 		for_each_sg(sgl, sg, sg_len, i) {
 			if (sg_dma_len(sg) > MAX_XFER_BYTES) {
diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
index 8f5a2a242228..a8d09214b44c 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
@@ -716,7 +716,8 @@ int gpmi_send_page(struct gpmi_nand_data *this,
 }
 
 int gpmi_read_page(struct gpmi_nand_data *this,
-				dma_addr_t payload, dma_addr_t auxiliary)
+				dma_addr_t payload, dma_addr_t auxiliary,
+				bool chain)
 {
 	struct bch_geometry *geo = &this->bch_geometry;
 	uint32_t command_mode;
@@ -741,7 +742,8 @@ int gpmi_read_page(struct gpmi_nand_data *this,
 	pio[1] = 0;
 	desc = dmaengine_prep_slave_sg(channel,
 				(struct scatterlist *)pio, 2,
-				DMA_TRANS_NONE, 0);
+				DMA_TRANS_NONE,
+				chain ? DMA_PREP_INTERRUPT | DMA_PREP_PQ_DISABLE_P : 0);
 	if (!desc)
 		return -EINVAL;
 
diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
index 49599d6dd841..80cc85755582 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
@@ -952,7 +952,7 @@ static int gpmi_ecc_read_page_data(struct nand_chip *chip,
 	}
 
 	/* go! */
-	ret = gpmi_read_page(this, payload_phys, this->auxiliary_phys);
+	ret = gpmi_read_page(this, payload_phys, this->auxiliary_phys, true);
 
 	if (direct)
 		dma_unmap_single(this->dev, payload_phys, nfc_geo->payload_size,
@@ -1178,12 +1178,14 @@ static void gpmi_nand_command(struct mtd_info *mtd, unsigned int command,
 		sg_init_one(&this->cmd_sgl2, this->cmd_buffer + 64, this->command_length2);
 		dma_map_sg(this->dev, &this->cmd_sgl2, 1, DMA_TO_DEVICE);
 		ret = gpmi_send_command(this, &this->cmd_sgl2, this->command_length2,
-					this->command_length > 0, true);
+					this->command_length > 0, this->start_dma);
 		if (ret)
 			dev_err(this->dev, "Chip: %u, Error %d\n",
 				this->current_chip, ret);
 	}
 
+	this->start_dma = 1;
+
 	if (this->command_length > 0) {
 		dma_unmap_sg(this->dev, &this->cmd_sgl, 1, DMA_TO_DEVICE);
 		this->command_length = 0;
@@ -1198,6 +1200,8 @@ static void gpmi_nand_command(struct mtd_info *mtd, unsigned int command,
 static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 			      uint8_t *buf, int oob_required, int page)
 {
+	struct gpmi_nand_data *this = nand_get_controller_data(chip);
+	this->start_dma = 0;
 	nand_read_page_op(chip, page, 0, NULL, 0);
 
 	return gpmi_ecc_read_page_data(chip, buf, oob_required, page);
@@ -1251,6 +1255,7 @@ static int gpmi_ecc_read_subpage(struct mtd_info *mtd, struct nand_chip *chip,
 		buf = buf + first * size;
 	}
 
+	this->start_dma = 0;
 	nand_read_page_op(chip, page, col, NULL, 0);
 
 	/* Save the old environment */
@@ -1993,6 +1998,9 @@ static int gpmi_nand_init(struct gpmi_nand_data *this)
 	/* init current chip */
 	this->current_chip	= -1;
 
+	/* init dma */
+	this->start_dma = true;
+
 	/* init the MTD data structures */
 	mtd->name		= "gpmi-nand";
 	mtd->dev.parent		= this->dev;
diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h
index a09ec300754f..c4a6e100c07f 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h
@@ -167,6 +167,7 @@ struct gpmi_nand_data {
 #define DMA_CHANS		8
 	struct dma_chan		*dma_chans[DMA_CHANS];
 	struct completion	dma_done;
+	bool start_dma;
 
 	/* private */
 	void			*private;
@@ -201,7 +202,7 @@ int gpmi_send_data(struct gpmi_nand_data *, const void *buf, int len);
 int gpmi_send_page(struct gpmi_nand_data *,
 		   dma_addr_t payload, dma_addr_t auxiliary);
 int gpmi_read_page(struct gpmi_nand_data *,
-		   dma_addr_t payload, dma_addr_t auxiliary);
+		   dma_addr_t payload, dma_addr_t auxiliary, bool chain);
 
 void gpmi_copy_bits(u8 *dst, size_t dst_bit_off,
 		    const u8 *src, size_t src_bit_off,
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: optimizing the nand read performance by reducing interrupts from 4 to 1
  2018-04-26 15:41 optimizing the nand read performance by reducing interrupts from 4 to 1 Sam Lefebvre
                   ` (12 preceding siblings ...)
  2018-04-26 15:41 ` [PATCH 13/13] mtd: rawnand: gmpi: chain gpmi_nand_command() with gpmi_ecc_read_page_data() Sam Lefebvre
@ 2018-04-26 16:28 ` Boris Brezillon
  2018-04-27 10:27   ` Sam Lefebvre
  2018-04-26 20:07 ` Boris Brezillon
  14 siblings, 1 reply; 22+ messages in thread
From: Boris Brezillon @ 2018-04-26 16:28 UTC (permalink / raw)
  To: Sam Lefebvre
  Cc: linux-mtd, Arnout Vandecapelle, Dries Staelens,
	Richard Weinberger, Han Xu

Hi Sam,

On Thu, 26 Apr 2018 17:41:21 +0200
Sam Lefebvre <sam.lefebvre@essensium.com> wrote:

> The current implementation for gmpi nand flashes use existing libraries for composing                    
> dma command chains in different pieces with mutliple interrupts. This invokes a lot of                   
> unnecessary context switches. In order to make more optimal use of dma chaining capabilities
> of the gpmi hardware, some rework must be performed to allow gpmi_ecc_read_page() 
> and gpmi_ecc_read_subpage() submitting a range of dma's to reduce the interrupts from 
> 4 to 1 for every page read.
> 
> In the current situation, a page read for large page incurs 4 interrupts:
> - DMA for READ0
> - DMA for READSTART
> - DMA for reading of page data
> - BCH
>                                                                                                          
> This series is based on a previous post and contains some optimizations in the commits.
> The changes in nand_command() and nand_command_lp() are omitted and commits are re-ordered
> in such way that the changes for reducing the interrupts are moved to the end.
> 
> In order to do the preparation of this work, ready/busy signalling is aggregated and
> error handling in gpmi_ecc_read_oob() has been reviewed.            
> 
> Some changes by Sasha Hauer are also adopted which contain some bugfixes and preliminary
> cleanup of several parts in de code.

Are there any changes in this version? I don't see a changelog and the
version number has not been incremented.

> 
> Sam lefebvre 
> 
> ----------------------------------------------------------------
> Arnout Vandecappelle (Essensium/Mind) (2):
>       mtd: rawnand: gpmi: instantiate cmdfunc

Also, I nacked this patch in my previous review, so I'm not going to
accept it now, unless you have strong arguments to convince me.

One last thing, can you Cc the relevant maintainers next time (you can
use scripts/get_maintainers.pl to get the list).

Regards,

Boris

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 08/13] mtd: rawnand: gpmi: return generated errors in gpmi_ecc_read_oob()
  2018-04-26 15:41 ` [PATCH 08/13] mtd: rawnand: gpmi: return generated errors in gpmi_ecc_read_oob() Sam Lefebvre
@ 2018-04-26 16:43   ` Boris Brezillon
  0 siblings, 0 replies; 22+ messages in thread
From: Boris Brezillon @ 2018-04-26 16:43 UTC (permalink / raw)
  To: Sam Lefebvre; +Cc: linux-mtd, Arnout Vandecapelle, Dries Staelens

Hi Sam,

On Thu, 26 Apr 2018 17:41:29 +0200
Sam Lefebvre <sam.lefebvre@essensium.com> wrote:

> Errors can be generated by nand_read_page_op(). It's important
> to check the error flags and pass them upwards.
> 
> Signed-off-by: Sam Lefebvre <sam.lefebvre@essensium.com>
> 
> ---
> 
> v2: also bail out before calling ->read_byte() (Boris)
> 
> When handling errors we need to bail out before calling
> ->read_byte().  
> ---
>  drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
> index 7ba00b8ab288..60bc1bac7741 100644
> --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
> +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
> @@ -1334,13 +1334,18 @@ static int gpmi_ecc_read_oob(struct mtd_info *mtd, struct nand_chip *chip,
>  				int page)
>  {
>  	struct gpmi_nand_data *this = nand_get_controller_data(chip);
> +	int ret = 0;

No need to initialize ret to 0 here.

>  
>  	dev_dbg(this->dev, "page number is %d\n", page);
>  	/* clear the OOB buffer */
>  	memset(chip->oob_poi, ~0, mtd->oobsize);
>  
>  	/* Read out the conventional OOB. */
> -	nand_read_page_op(chip, page, mtd->writesize, NULL, 0);
> +	ret = nand_read_page_op(chip, page, mtd->writesize, NULL, 0);
> +

Please drop this empty line.

> +	if (ret)
> +		return ret;
> +
>  	chip->read_buf(mtd, chip->oob_poi, mtd->oobsize);
>  
>  	/*
> @@ -1350,11 +1355,15 @@ static int gpmi_ecc_read_oob(struct mtd_info *mtd, struct nand_chip *chip,
>  	 */
>  	if (GPMI_IS_MX23(this)) {
>  		/* Read the block mark into the first byte of the OOB buffer. */
> -		nand_read_page_op(chip, page, 0, NULL, 0);
> +		ret = nand_read_page_op(chip, page, 0, NULL, 0);
> +

Here as well.

> +		if (ret)
> +			return ret;
> +
>  		chip->oob_poi[0] = chip->read_byte(mtd);
>  	}
>  
> -	return 0;
> +	return ret;

You can keep return 0 here.

>  }
>  
>  static int

Regards,

Boris

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 09/13] mtd: rawnand: gpmi: poll the BCH interrupt bit in start_dma_with_bch_irq()
  2018-04-26 15:41 ` [PATCH 09/13] mtd: rawnand: gpmi: poll the BCH interrupt bit in start_dma_with_bch_irq() Sam Lefebvre
@ 2018-04-26 16:49   ` Boris Brezillon
  0 siblings, 0 replies; 22+ messages in thread
From: Boris Brezillon @ 2018-04-26 16:49 UTC (permalink / raw)
  To: Sam Lefebvre; +Cc: linux-mtd, Arnout Vandecapelle, Dries Staelens

On Thu, 26 Apr 2018 17:41:30 +0200
Sam Lefebvre <sam.lefebvre@essensium.com> wrote:

> To improve nand performance, the BCH interrupt bit must be polled
> before waiting for occurence. This avoids an unnecessary context switch.
> 
> Signed-off-by: Sam Lefebvre <sam.lefebvre@essensium.com>
> ---
>  drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
> index 60bc1bac7741..28a2cf106ddc 100644
> --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
> +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
> @@ -503,18 +503,31 @@ int start_dma_without_bch_irq(struct gpmi_nand_data *this,
>   * Actually, we must wait for two interrupts :
>   *	[1] firstly the DMA interrupt and
>   *	[2] secondly the BCH interrupt.
> + *	To improve performance, we first poll for the BCH
> + *	interrupt.
>   */
>  int start_dma_with_bch_irq(struct gpmi_nand_data *this,
>  			struct dma_async_tx_descriptor *desc)
>  {
>  	struct completion *bch_c = &this->bch_done;
> +	struct resources *r = &this->resources;
>  	unsigned long timeout;
>  
> -	/* Prepare to receive an interrupt from the BCH block. */
> -	init_completion(bch_c);
> +	/* Disable interrupt */
> +	writel(BM_BCH_CTRL_COMPLETE_IRQ_EN, r->bch_regs + HW_BCH_CTRL_CLR);

I'd prefer to have it disabled in the probe function.

>  
>  	/* start the DMA */
>  	start_dma_without_bch_irq(this, desc);
> +	if (readl(r->bch_regs + HW_BCH_CTRL) & BM_BCH_CTRL_COMPLETE_IRQ) {
> +		writel(BM_BCH_CTRL_COMPLETE_IRQ, r->bch_regs + HW_BCH_CTRL_CLR);
> +		return 0;
> +	}
> +
> +	/* Prepare to receive an interrupt from the BCH block. */
> +	init_completion(bch_c);
> +
> +	/* Re-enable interrupt */
> +	writel(BM_BCH_CTRL_COMPLETE_IRQ_EN, r->bch_regs + HW_BCH_CTRL_SET);
>  
>  	/* Wait for the interrupt from the BCH block. */
>  	timeout = wait_for_completion_timeout(bch_c, msecs_to_jiffies(1000));

And then you can disable it here, so that the interrupt is only active
when we need it.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: optimizing the nand read performance by reducing interrupts from 4 to 1
  2018-04-26 15:41 optimizing the nand read performance by reducing interrupts from 4 to 1 Sam Lefebvre
                   ` (13 preceding siblings ...)
  2018-04-26 16:28 ` optimizing the nand read performance by reducing interrupts from 4 to 1 Boris Brezillon
@ 2018-04-26 20:07 ` Boris Brezillon
  14 siblings, 0 replies; 22+ messages in thread
From: Boris Brezillon @ 2018-04-26 20:07 UTC (permalink / raw)
  To: Sam Lefebvre; +Cc: linux-mtd, Arnout Vandecapelle, Dries Staelens, Han Xu

On Thu, 26 Apr 2018 17:41:21 +0200
Sam Lefebvre <sam.lefebvre@essensium.com> wrote:

> The current implementation for gmpi nand flashes use existing libraries for composing                    
> dma command chains in different pieces with mutliple interrupts. This invokes a lot of                   
> unnecessary context switches. In order to make more optimal use of dma chaining capabilities
> of the gpmi hardware, some rework must be performed to allow gpmi_ecc_read_page() 
> and gpmi_ecc_read_subpage() submitting a range of dma's to reduce the interrupts from 
> 4 to 1 for every page read.
> 
> In the current situation, a page read for large page incurs 4 interrupts:
> - DMA for READ0
> - DMA for READSTART
> - DMA for reading of page data
> - BCH
>                                                                                                          
> This series is based on a previous post and contains some optimizations in the commits.
> The changes in nand_command() and nand_command_lp() are omitted and commits are re-ordered
> in such way that the changes for reducing the interrupts are moved to the end.
> 
> In order to do the preparation of this work, ready/busy signalling is aggregated and
> error handling in gpmi_ecc_read_oob() has been reviewed.            
> 
> Some changes by Sasha Hauer are also adopted which contain some bugfixes and preliminary
> cleanup of several parts in de code.
> 
> Sam lefebvre 
> 
> ----------------------------------------------------------------
> Arnout Vandecappelle (Essensium/Mind) (2):
>       mtd: rawnand: gpmi: instantiate cmdfunc
>       mtd: rawnand: gpmi: gpmi_nand_command(): simplification and formatting
> 
> Sam Lefebvre (5):
>       mtd: rawnand: gpmi: set aggregate ready/busy signalling

Still waiting for Han's review on this one, but it looks good to me.

>       mtd: rawnand: gpmi: return generated errors in gpmi_ecc_read_oob()
>       mtd: rawnand: gpmi: poll the BCH interrupt bit in start_dma_with_bch_irq()
>       mtd: rawnand: gpmi: issue two commands in a single DMA chain
>       mtd: rawnand: gmpi: chain gpmi_nand_command() with gpmi_ecc_read_page_data()
> 
> Sascha Hauer (6):
>       mtd: nand: gpmi: drop dma_ops_type
>       mtd: nand: gpmi: pass buffer and len around
>       mtd: nand: gpmi: put only once used functions inline
>       mtd: nand: gpmi: remove direct_dma_map_ok from driver data struct
>       mtd: nand: gpmi: return valid value from bch_set_geometry()
>       mtd: nand: gpmi: remove unnecessary variables

Applied patches 1 to 6 after fixing the subject prefix.

For the remaining patches, I'm still opposed to the idea of replacing
->cmd_ctrl() by ->cmdfunc(), but I'm almost sure you can do the same
kind of optimization by changing the GPMI ->read/write_page()
implementation.

Actually, it's likely to be simpler/cleaner than hacking a custom
->cmdfunc() implementation, and you'll have what you were looking for:
optimized read/write paths. 

> 
>  drivers/dma/mxs-dma.c                      |   3 +
>  drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c  |  77 ++++---
>  drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c | 359 ++++++++++++++++-------------
>  drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h |  37 ++-
>  drivers/mtd/nand/raw/gpmi-nand/gpmi-regs.h |   1 +
>  5 files changed, 260 insertions(+), 217 deletions(-)
> 
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 13/13] mtd: rawnand: gmpi: chain gpmi_nand_command() with gpmi_ecc_read_page_data()
  2018-04-26 15:41 ` [PATCH 13/13] mtd: rawnand: gmpi: chain gpmi_nand_command() with gpmi_ecc_read_page_data() Sam Lefebvre
@ 2018-04-26 21:46   ` Boris Brezillon
  0 siblings, 0 replies; 22+ messages in thread
From: Boris Brezillon @ 2018-04-26 21:46 UTC (permalink / raw)
  To: Sam Lefebvre; +Cc: linux-mtd, Arnout Vandecapelle, Dries Staelens

On Thu, 26 Apr 2018 17:41:34 +0200
Sam Lefebvre <sam.lefebvre@essensium.com> wrote:

> An additional interrupt can be avoided in the page and subpage
> read operations by adding a flag start_dma to indicate weather
> gpmi_nand_command() needs to be chained with other dma commands
> or not.
> 
> In case of chaining, gpmi_read_page() gets a flag that adds:
> DMA_PREP_INTERRUPT: chain the dma with previous one
> DMA_PREP_PQ_DISABLE_P: wait for prefetching being ready
> before reading, turns on the CCW_WAIT4RDY flag.
> 
> Signed-off-by: Sam Lefebvre <sam.lefebvre@essensium.com>
> ---
>  drivers/dma/mxs-dma.c                      |  3 +++

Changes to mxs-dma.c should be done in a separate patch.

>  drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c  |  6 ++++--
>  drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c | 12 ++++++++++--
>  drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.h |  3 ++-
>  4 files changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/dma/mxs-dma.c b/drivers/dma/mxs-dma.c
> index 41d167921fab..cb13fc759f97 100644
> --- a/drivers/dma/mxs-dma.c
> +++ b/drivers/dma/mxs-dma.c
> @@ -82,6 +82,7 @@
>  #define CCW_CHAIN		(1 << 2)
>  #define CCW_IRQ			(1 << 3)
>  #define CCW_DEC_SEM		(1 << 6)
> +#define CCW_WAIT4RDY		(1 << 5)
>  #define CCW_WAIT4END		(1 << 7)
>  #define CCW_HALT_ON_TERM	(1 << 8)
>  #define CCW_TERM_FLUSH		(1 << 9)
> @@ -551,6 +552,8 @@ static struct dma_async_tx_descriptor *mxs_dma_prep_slave_sg(
>  		ccw->bits |= CCW_TERM_FLUSH;
>  		ccw->bits |= BF_CCW(sg_len, PIO_NUM);
>  		ccw->bits |= BF_CCW(MXS_DMA_CMD_NO_XFER, COMMAND);
> +		if (flags & DMA_PREP_PQ_DISABLE_P)
> +			ccw->bits |= CCW_WAIT4RDY;

Hm, I'm really not sure this is a good idea to abuse the
DMA_PREP_PQ_DISABLE_P flag like that. Better add an mxs APBH specific
API on top of the DMA API to set some of the NAND related flags
(NAND_LOCK, NAND_WAIT4RDY).

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: optimizing the nand read performance by reducing interrupts from 4 to 1
  2018-04-26 16:28 ` optimizing the nand read performance by reducing interrupts from 4 to 1 Boris Brezillon
@ 2018-04-27 10:27   ` Sam Lefebvre
  2018-04-27 12:09     ` Boris Brezillon
  0 siblings, 1 reply; 22+ messages in thread
From: Sam Lefebvre @ 2018-04-27 10:27 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: linux-mtd, Arnout Vandecapelle, Dries Staelens,
	Richard Weinberger, Han Xu, sam.lefebvre

Hi Boris

On 26/04/18 18:28, Boris Brezillon wrote:

> Hi Sam,
>
> On Thu, 26 Apr 2018 17:41:21 +0200
> Sam Lefebvre <sam.lefebvre@essensium.com> wrote:
>
> Are there any changes in this version? I don't see a changelog and the
> version number has not been incremented.

Until now we did not used version numbers. Because of time limitation
patches have been reordered and limited in such way that:

Patches 1-9 will be probably accepted.
Patches 10-13 is a solution for nand read performance improvement,
may be possibly not accepted upstream but can be useful for somebody
who want to use it in the future.

> Also, I nacked this patch in my previous review, so I'm not going to
> accept it now, unless you have strong arguments to convince me.

I will remember that these patches are based on your mail of march 5,
2018 where you suggested to reduce the number of interrupts in the
current implementation. Due to budgetary constraints, today is the
last day we are able to work on it in terms of this project.

Some measurements showed:
Reduction of 10% boot time on quad core.
Reduction of 3% boot time on dual lite.

>
> One last thing, can you Cc the relevant maintainers next time (you can
> use scripts/get_maintainers.pl to get the list).

Ok, I will check that.

Best regards,
Sam

> Regards,
>
> Boris
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: optimizing the nand read performance by reducing interrupts from 4 to 1
  2018-04-27 10:27   ` Sam Lefebvre
@ 2018-04-27 12:09     ` Boris Brezillon
  0 siblings, 0 replies; 22+ messages in thread
From: Boris Brezillon @ 2018-04-27 12:09 UTC (permalink / raw)
  To: Sam Lefebvre
  Cc: linux-mtd, Arnout Vandecapelle, Dries Staelens,
	Richard Weinberger, Han Xu, sam.lefebvre

Hi Sam,

On Fri, 27 Apr 2018 12:27:49 +0200
Sam Lefebvre <sam.lefebvre@saleconix.be> wrote:

> > Are there any changes in this version? I don't see a changelog and the
> > version number has not been incremented.  
> 
> Until now we did not used version numbers. Because of time limitation
> patches have been reordered and limited in such way that:
> 
> Patches 1-9 will be probably accepted.
> Patches 10-13 is a solution for nand read performance improvement,
> may be possibly not accepted upstream but can be useful for somebody
> who want to use it in the future.

How does that prevent you from adding a version number to the patch
series? It's as easy as passing "-vX" to your git format-patch command
(where X is the version number of the patch series), and it helps
maintainers keeping track of various versions of the patchset.

> 
> > Also, I nacked this patch in my previous review, so I'm not going to
> > accept it now, unless you have strong arguments to convince me.  
> 
> I will remember that these patches are based on your mail of march 5,
> 2018 where you suggested to reduce the number of interrupts in the
> current implementation.

I certainly never told you to implement ->cmdfunc(). Just quoting my
answer to your private email for the record:

"
Before you even consider these options I'd recommend reworking the
gpmi_ecc_read_page() function to avoid splitting things in 2 distinct
calls:

1/ nand_read_page_op()
2/ gpmi_ecc_read_page_data()

#1 is responsible for 2 DMA interrupts (READ0+ADDR and READ_START PIO
xfers), and #2 is responsible for the last DMA interrupt (READ_DATA PIO
xfer) + the BCH interrupt. I think you could turn this into a single
interrupt by chaining the READ0+ADDR+READ_START+WAITREADY+DATA_IN
instructions, wait for DMA completion and then poll the BCH status
instead of using interrupts (assuming BCH calculation does not take too
long).
"

Don't know where you see any mention of cmdfunc() in there, and I
actually suggested what I'm suggesting today, that is, implementing the
optimization in gpmi_ecc_read_page().

> Due to budgetary constraints, today is the
> last day we are able to work on it in terms of this project.

That won't convince me, quite the opposite actually. If the goal is to
send the current status so that other people can take over this work,
then that's fine, but don't expect me to accept the patches as-is.

> 
> Some measurements showed:
> Reduction of 10% boot time on quad core.
> Reduction of 3% boot time on dual lite.

That's a nice improvement, indeed, but still not a good reason to take
the patches, especially since I proposed 2 alternatives, "->exec_op()"
or "gpmi_ecc_read_page() optim", and I'm almost sure the 2nd one is
even less invasive than re-implementing your own cmdfunc().

Regards,

Boris

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 07/13] mtd: rawnand: gpmi: set aggregate ready/busy signalling
  2018-04-26 15:41 ` [PATCH 07/13] mtd: rawnand: gpmi: set aggregate ready/busy signalling Sam Lefebvre
@ 2018-06-18 15:00   ` Miquel Raynal
  0 siblings, 0 replies; 22+ messages in thread
From: Miquel Raynal @ 2018-06-18 15:00 UTC (permalink / raw)
  To: Sam Lefebvre; +Cc: linux-mtd, Arnout Vandecapelle, Dries Staelens, Han Xu

Hello Han,

Could we have your input on this patch, please?

On Thu, 26 Apr 2018 17:41:28 +0200, Sam Lefebvre
<sam.lefebvre@essensium.com> wrote:

> The GPMI block by default decouples the ready/busy signal (which is
> shared between all chip-selects) to separate internal signals per
> chip-select. However, since the gpmi-nand driver uses DMA0 for all
> chip-selects, DMA transfers will not be able to see the ready/busy
> status for any chip-select other than 0.
> 
> Currently, this happens to work because nand_command() ends with an
> explicit nand_wait_ready() and the driver only sets up a single command
> in a DMA chain. nand_wait_ready() polls for the chip-select specific
> ready/busy status bit. Future patches, however, will set up a DMA chain
> with several commands, so these will have to wait correctly in the DMA
> chain itself.
> 
> To fix this, set the GANGED_RDYBUSY bit in the control1 register. This
> ties all internal ready/busy signals together, so DMA0 will also see
> the ready/busy status of chip selects 1-2-3. It's a bit silly that this
> isn't implied in hardware by the DECOUPLE_CS bit.
> 
> Signed-off-by: Sam Lefebvre <sam.lefebvre@essensium.com>
> Signed-off-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be>
> Cc: Han Xu <han.xu@nxp.com>
> ---
> Tested on an i.MX6Q with two identical chips on CS0 and CS1.
> ---
>  drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c  | 3 +++
>  drivers/mtd/nand/raw/gpmi-nand/gpmi-regs.h | 1 +
>  2 files changed, 4 insertions(+)
> 
> diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
> index 83697b8df871..1858afdb400d 100644
> --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
> +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
> @@ -193,6 +193,9 @@ int gpmi_init(struct gpmi_nand_data *this)
>  	 */
>  	writel(BM_GPMI_CTRL1_DECOUPLE_CS, r->gpmi_regs + HW_GPMI_CTRL1_SET);
>  
> +	/* Aggregate ready busy signalling. */
> +	writel(BM_GPMI_CTRL1_GANGED_RDYBUSY, r->gpmi_regs + HW_GPMI_CTRL1_SET);
> +
>  	gpmi_disable_clk(this);
>  	return 0;
>  err_out:
> diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-regs.h b/drivers/mtd/nand/raw/gpmi-nand/gpmi-regs.h
> index d92bf32221ca..e341802c90ac 100644
> --- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-regs.h
> +++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-regs.h
> @@ -120,6 +120,7 @@
>  #define BV_GPMI_CTRL1_WRN_DLY_SEL_7_TO_12NS		0x2
>  #define BV_GPMI_CTRL1_WRN_DLY_SEL_NO_DELAY		0x3
>  
> +#define BM_GPMI_CTRL1_GANGED_RDYBUSY		(1 << 19)
>  #define BM_GPMI_CTRL1_BCH_MODE				(1 << 18)
>  
>  #define BP_GPMI_CTRL1_DLL_ENABLE			17

Thanks,
Miquèl

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2018-06-18 15:00 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-26 15:41 optimizing the nand read performance by reducing interrupts from 4 to 1 Sam Lefebvre
2018-04-26 15:41 ` [PATCH 01/13] mtd: nand: gpmi: drop dma_ops_type Sam Lefebvre
2018-04-26 15:41 ` [PATCH 02/13] mtd: nand: gpmi: pass buffer and len around Sam Lefebvre
2018-04-26 15:41 ` [PATCH 03/13] mtd: nand: gpmi: put only once used functions inline Sam Lefebvre
2018-04-26 15:41 ` [PATCH 04/13] mtd: nand: gpmi: remove direct_dma_map_ok from driver data struct Sam Lefebvre
2018-04-26 15:41 ` [PATCH 05/13] mtd: nand: gpmi: return valid value from bch_set_geometry() Sam Lefebvre
2018-04-26 15:41 ` [PATCH 06/13] mtd: nand: gpmi: remove unnecessary variables Sam Lefebvre
2018-04-26 15:41 ` [PATCH 07/13] mtd: rawnand: gpmi: set aggregate ready/busy signalling Sam Lefebvre
2018-06-18 15:00   ` Miquel Raynal
2018-04-26 15:41 ` [PATCH 08/13] mtd: rawnand: gpmi: return generated errors in gpmi_ecc_read_oob() Sam Lefebvre
2018-04-26 16:43   ` Boris Brezillon
2018-04-26 15:41 ` [PATCH 09/13] mtd: rawnand: gpmi: poll the BCH interrupt bit in start_dma_with_bch_irq() Sam Lefebvre
2018-04-26 16:49   ` Boris Brezillon
2018-04-26 15:41 ` [PATCH 10/13] mtd: rawnand: gpmi: instantiate cmdfunc Sam Lefebvre
2018-04-26 15:41 ` [PATCH 11/13] mtd: rawnand: gpmi: gpmi_nand_command(): simplification and formatting Sam Lefebvre
2018-04-26 15:41 ` [PATCH 12/13] mtd: rawnand: gpmi: issue two commands in a single DMA chain Sam Lefebvre
2018-04-26 15:41 ` [PATCH 13/13] mtd: rawnand: gmpi: chain gpmi_nand_command() with gpmi_ecc_read_page_data() Sam Lefebvre
2018-04-26 21:46   ` Boris Brezillon
2018-04-26 16:28 ` optimizing the nand read performance by reducing interrupts from 4 to 1 Boris Brezillon
2018-04-27 10:27   ` Sam Lefebvre
2018-04-27 12:09     ` Boris Brezillon
2018-04-26 20:07 ` Boris Brezillon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.