linux-spi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] spi: spi-cadence-quadspi: Add Rx tuning support for DTR mode
@ 2023-02-07  6:09 Sai Krishna Potthuri
  2023-02-07  6:09 ` [PATCH 1/3] spi: cadence-quadspi: Add support for PHY module and DQS Sai Krishna Potthuri
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Sai Krishna Potthuri @ 2023-02-07  6:09 UTC (permalink / raw)
  To: Mark Brown, Tudor Ambarus, Pratyush Yadav, Michael Walle,
	Miquel Raynal, Richard Weinberger, Vignesh Raghavendra
  Cc: linux-mtd, linux-spi, linux-kernel, saikrishna12468, git,
	Sai Krishna Potthuri

Enable PHY and DQS required for Xilinx Versal Octal SPI to operate in DTR
protocol.
Add and update device_id field in spi_mem structure with flash id
information. Xilinx Versal Octal SPI driver requires the device id
information to perform the Rx tuning operation. Since there is no common
Tuning Data Pattern defined across all vendors, controllers like Xilinx
Versal Octal SPI which requires Rx tuning to find out the optimal sampling
point for data lines, this device id information will be used as a golden
data.
The reason behind choosing this approach instead of reading the ID again
in the controller driver is to make it generic solution.
- Other controller drivers which want to use similar tuning process, they
will make use of this ID instead of reading the ID again in the driver.
- Also, we can avoid hardcoding the command information and initiating the
transfer in the controller driver as this should happen from spi-nor.

Sai Krishna Potthuri (3):
  spi: cadence-quadspi: Add support for PHY module and DQS
  mtd: spi-nor: Add and update device_id field in spi_mem structure
  spi: cadence-quadspi: Add Rx tuning support for Xilinx Versal OSPI

 drivers/mtd/spi-nor/core.c        |   1 +
 drivers/spi/spi-cadence-quadspi.c | 226 +++++++++++++++++++++++++++++-
 include/linux/spi/spi-mem.h       |   4 +
 3 files changed, 230 insertions(+), 1 deletion(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/3] spi: cadence-quadspi: Add support for PHY module and DQS
  2023-02-07  6:09 [PATCH 0/3] spi: spi-cadence-quadspi: Add Rx tuning support for DTR mode Sai Krishna Potthuri
@ 2023-02-07  6:09 ` Sai Krishna Potthuri
  2023-02-07  6:09 ` [PATCH 2/3] mtd: spi-nor: Add and update device_id field in spi_mem structure Sai Krishna Potthuri
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Sai Krishna Potthuri @ 2023-02-07  6:09 UTC (permalink / raw)
  To: Mark Brown, Tudor Ambarus, Pratyush Yadav, Michael Walle,
	Miquel Raynal, Richard Weinberger, Vignesh Raghavendra
  Cc: linux-mtd, linux-spi, linux-kernel, saikrishna12468, git,
	Sai Krishna Potthuri

Cadence Octal SPI Flash Controller has PHY module, which is used for
handling SPI transactions when there is a requirement for high
performance. Also, this controller supports DQS(data strobe) used for
data capturing in PHY mode rather than internally generated ref_clk.
Xilinx Versal Octal SPI use PHY module and DQS signal when operating in
DTR protocol.

Signed-off-by: Sai Krishna Potthuri <sai.krishna.potthuri@amd.com>
---
 drivers/spi/spi-cadence-quadspi.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index 676313e1bdad..b55b763003f0 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -40,6 +40,8 @@
 #define CQSPI_SUPPORT_EXTERNAL_DMA	BIT(2)
 #define CQSPI_NO_SUPPORT_WR_COMPLETION	BIT(3)
 #define CQSPI_SLOW_SRAM		BIT(4)
+#define CQSPI_HAS_PHY			BIT(5)
+#define CQSPI_HAS_DQS			BIT(6)
 
 /* Capabilities */
 #define CQSPI_SUPPORTS_OCTAL		BIT(0)
@@ -89,6 +91,8 @@ struct cqspi_st {
 	u32			pd_dev_id;
 	bool			wr_completion;
 	bool			slow_sram;
+	bool			use_phy;
+	bool			use_dqs;
 };
 
 struct cqspi_driver_platdata {
@@ -112,6 +116,7 @@ struct cqspi_driver_platdata {
 /* Register map */
 #define CQSPI_REG_CONFIG			0x00
 #define CQSPI_REG_CONFIG_ENABLE_MASK		BIT(0)
+#define CQSPI_REG_CONFIG_PHY_MODE		BIT(3)
 #define CQSPI_REG_CONFIG_ENB_DIR_ACC_CTRL	BIT(7)
 #define CQSPI_REG_CONFIG_DECODE_MASK		BIT(9)
 #define CQSPI_REG_CONFIG_CHIPSELECT_LSB		10
@@ -151,6 +156,7 @@ struct cqspi_driver_platdata {
 #define CQSPI_REG_DELAY_TSHSL_MASK		0xFF
 
 #define CQSPI_REG_READCAPTURE			0x10
+#define CQSPI_REG_READCAPTURE_DQS		BIT(8)
 #define CQSPI_REG_READCAPTURE_BYPASS_LSB	0
 #define CQSPI_REG_READCAPTURE_DELAY_LSB		1
 #define CQSPI_REG_READCAPTURE_DELAY_MASK	0xF
@@ -468,6 +474,8 @@ static int cqspi_enable_dtr(struct cqspi_flash_pdata *f_pdata,
 	if (op->cmd.dtr) {
 		reg |= CQSPI_REG_CONFIG_DTR_PROTO;
 		reg |= CQSPI_REG_CONFIG_DUAL_OPCODE;
+		if (cqspi->use_phy)
+			reg |= CQSPI_REG_CONFIG_PHY_MODE;
 
 		/* Set up command opcode extension. */
 		ret = cqspi_setup_opcode_ext(f_pdata, op, shift);
@@ -476,10 +484,20 @@ static int cqspi_enable_dtr(struct cqspi_flash_pdata *f_pdata,
 	} else {
 		reg &= ~CQSPI_REG_CONFIG_DTR_PROTO;
 		reg &= ~CQSPI_REG_CONFIG_DUAL_OPCODE;
+		reg &= ~CQSPI_REG_CONFIG_PHY_MODE;
 	}
 
 	writel(reg, reg_base + CQSPI_REG_CONFIG);
 
+	reg = readl(reg_base + CQSPI_REG_READCAPTURE);
+
+	if (op->cmd.dtr)
+		reg |= CQSPI_REG_READCAPTURE_DQS;
+	else
+		reg &= ~CQSPI_REG_READCAPTURE_DQS;
+
+	writel(reg, reg_base + CQSPI_REG_READCAPTURE);
+
 	return cqspi_wait_idle(cqspi);
 }
 
@@ -1700,6 +1718,10 @@ static int cqspi_probe(struct platform_device *pdev)
 			cqspi->wr_completion = false;
 		if (ddata->quirks & CQSPI_SLOW_SRAM)
 			cqspi->slow_sram = true;
+		if (ddata->quirks & CQSPI_HAS_PHY)
+			cqspi->use_phy = true;
+		if (ddata->quirks & CQSPI_HAS_DQS)
+			cqspi->use_dqs = true;
 
 		if (of_device_is_compatible(pdev->dev.of_node,
 					    "xlnx,versal-ospi-1.0"))
@@ -1820,7 +1842,8 @@ static const struct cqspi_driver_platdata socfpga_qspi = {
 
 static const struct cqspi_driver_platdata versal_ospi = {
 	.hwcaps_mask = CQSPI_SUPPORTS_OCTAL,
-	.quirks = CQSPI_DISABLE_DAC_MODE | CQSPI_SUPPORT_EXTERNAL_DMA,
+	.quirks = CQSPI_DISABLE_DAC_MODE | CQSPI_SUPPORT_EXTERNAL_DMA
+			| CQSPI_HAS_PHY | CQSPI_HAS_DQS,
 	.indirect_read_dma = cqspi_versal_indirect_read_dma,
 	.get_dma_status = cqspi_get_versal_dma_status,
 };
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/3] mtd: spi-nor: Add and update device_id field in spi_mem structure
  2023-02-07  6:09 [PATCH 0/3] spi: spi-cadence-quadspi: Add Rx tuning support for DTR mode Sai Krishna Potthuri
  2023-02-07  6:09 ` [PATCH 1/3] spi: cadence-quadspi: Add support for PHY module and DQS Sai Krishna Potthuri
@ 2023-02-07  6:09 ` Sai Krishna Potthuri
  2023-02-07  6:09 ` [PATCH 3/3] spi: cadence-quadspi: Add Rx tuning support for Xilinx Versal OSPI Sai Krishna Potthuri
  2023-02-07  6:48 ` [PATCH 0/3] spi: spi-cadence-quadspi: Add Rx tuning support for DTR mode Tudor Ambarus
  3 siblings, 0 replies; 8+ messages in thread
From: Sai Krishna Potthuri @ 2023-02-07  6:09 UTC (permalink / raw)
  To: Mark Brown, Tudor Ambarus, Pratyush Yadav, Michael Walle,
	Miquel Raynal, Richard Weinberger, Vignesh Raghavendra
  Cc: linux-mtd, linux-spi, linux-kernel, saikrishna12468, git,
	Sai Krishna Potthuri

Xilinx Versal Octal SPI controller drivers might require the device id
information to perform the Rx tuning operation. Since there is no common
Tuning Data Pattern defined across all vendors, controllers which requires
Rx tuning to find out the optimal sampling point for data lines, this
device id information will be used as a golden data.
spi-nor core updates the device id information in device_id field
while reading the id of the flash device.

Signed-off-by: Sai Krishna Potthuri <sai.krishna.potthuri@amd.com>
---
 drivers/mtd/spi-nor/core.c  | 1 +
 include/linux/spi/spi-mem.h | 4 ++++
 2 files changed, 5 insertions(+)

diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index bee8fc4c9f07..9811e74ef3b6 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -1644,6 +1644,7 @@ static const struct flash_info *spi_nor_match_id(struct spi_nor *nor,
 			if (part->id_len &&
 			    !memcmp(part->id, id, part->id_len)) {
 				nor->manufacturer = manufacturers[i];
+				memcpy(nor->spimem->device_id, id, SPI_NOR_MAX_ID_LEN);
 				return part;
 			}
 		}
diff --git a/include/linux/spi/spi-mem.h b/include/linux/spi/spi-mem.h
index 8e984d75f5b6..e9ae5c1b7391 100644
--- a/include/linux/spi/spi-mem.h
+++ b/include/linux/spi/spi-mem.h
@@ -181,11 +181,14 @@ struct spi_mem_dirmap_desc {
 	void *priv;
 };
 
+#define SPI_MEM_DEV_MAX_ID_LEN		6
+
 /**
  * struct spi_mem - describes a SPI memory device
  * @spi: the underlying SPI device
  * @drvpriv: spi_mem_driver private data
  * @name: name of the SPI memory device
+ * @device_id: device id of the SPI memory device
  *
  * Extra information that describe the SPI memory device and may be needed by
  * the controller to properly handle this device should be placed here.
@@ -197,6 +200,7 @@ struct spi_mem {
 	struct spi_device *spi;
 	void *drvpriv;
 	const char *name;
+	u8 device_id[SPI_MEM_DEV_MAX_ID_LEN];
 };
 
 /**
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/3] spi: cadence-quadspi: Add Rx tuning support for Xilinx Versal OSPI
  2023-02-07  6:09 [PATCH 0/3] spi: spi-cadence-quadspi: Add Rx tuning support for DTR mode Sai Krishna Potthuri
  2023-02-07  6:09 ` [PATCH 1/3] spi: cadence-quadspi: Add support for PHY module and DQS Sai Krishna Potthuri
  2023-02-07  6:09 ` [PATCH 2/3] mtd: spi-nor: Add and update device_id field in spi_mem structure Sai Krishna Potthuri
@ 2023-02-07  6:09 ` Sai Krishna Potthuri
  2023-02-07  7:23   ` Tudor Ambarus
  2023-02-07  6:48 ` [PATCH 0/3] spi: spi-cadence-quadspi: Add Rx tuning support for DTR mode Tudor Ambarus
  3 siblings, 1 reply; 8+ messages in thread
From: Sai Krishna Potthuri @ 2023-02-07  6:09 UTC (permalink / raw)
  To: Mark Brown, Tudor Ambarus, Pratyush Yadav, Michael Walle,
	Miquel Raynal, Richard Weinberger, Vignesh Raghavendra
  Cc: linux-mtd, linux-spi, linux-kernel, saikrishna12468, git,
	Sai Krishna Potthuri

Add Rx tuning support for Xilinx Versal Octal SPI Controller(OSPI).
Xilinx Versal Octal SPI controller requires Rx tuning to find out the
optimal tap value when running at higher clock in DTR protocol. As there
is no common Tuning Data Pattern defined across all vendors, Xilinx Versal
Octal SPI uses READ_ID based tuning algorithm for which this device_id
field in spi_mem will be used as a Tuning data pattern.
After enabling the DTR protocol, spi-nor sends READ ID command to verify
the DTR protocol switching is successfully done or not. If execute_tuning
is defined(not NULL) and clk_tuned is not set then Rx tuning algorithm
will be executed before it actually read the ID in DTR protocol.
Xilinx Versal tuning algorithm (as suggested by IP design team):
- Sweep all possible 127 taps to read the flash id for each tap.
- To consider a particular tap as valid make sure to get the correct id
for 10 times.
- Once valid window is found(window size >= 3), stop parsing the
remaining taps.
- Perform the above steps with extra dummy clock 1.
- Identify the maximum window size between extra dummy clock 0 and 1 and
configure the median of min and max valid tap of the corresponding window.
- If the maximum window size is with dummy clock 1 then all the subsequent
read operations should be performed with one extra dummy clock.

Signed-off-by: Sai Krishna Potthuri <sai.krishna.potthuri@amd.com>
---
 drivers/spi/spi-cadence-quadspi.c | 201 ++++++++++++++++++++++++++++++
 1 file changed, 201 insertions(+)

diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index b55b763003f0..bc4395736def 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -93,6 +93,8 @@ struct cqspi_st {
 	bool			slow_sram;
 	bool			use_phy;
 	bool			use_dqs;
+	bool			clk_tuned;
+	u8			extra_dummy_clks;
 };
 
 struct cqspi_driver_platdata {
@@ -101,6 +103,7 @@ struct cqspi_driver_platdata {
 	int (*indirect_read_dma)(struct cqspi_flash_pdata *f_pdata,
 				 u_char *rxbuf, loff_t from_addr, size_t n_rx);
 	u32 (*get_dma_status)(struct cqspi_st *cqspi);
+	int (*execute_tuning)(struct spi_mem *mem, const struct spi_mem_op *op);
 };
 
 /* Operation timeout value */
@@ -192,6 +195,7 @@ struct cqspi_driver_platdata {
 
 #define CQSPI_REG_IRQSTATUS			0x40
 #define CQSPI_REG_IRQMASK			0x44
+#define CQSPI_REG_VERSAL_ECO			0x48
 
 #define CQSPI_REG_INDIRECTRD			0x60
 #define CQSPI_REG_INDIRECTRD_START_MASK		BIT(0)
@@ -238,6 +242,19 @@ struct cqspi_driver_platdata {
 #define CQSPI_REG_POLLING_STATUS		0xB0
 #define CQSPI_REG_POLLING_STATUS_DUMMY_LSB	16
 
+#define CQSPI_REG_VERSAL_PHY_CONFIG		0xB4
+#define CQSPI_REG_PHY_CONFIG_RESYNC_FLD_MASK	BIT(31)
+#define CQSPI_REG_PHY_CONFIG_RESET_FLD_MASK	BIT(30)
+#define CQSPI_REG_PHY_CONFIG_TX_DLL_DLY_LSB	16
+
+#define CQSPI_REG_VERSAL_PHY_MASTER_CTRL	0xB8
+#define CQSPI_REG_DLL_LOWER			0xBC
+#define CQSPI_REG_DLL_LOWER_LPBK_LOCK_MASK	BIT(15)
+#define CQSPI_REG_DLL_LOWER_DLL_LOCK_MASK	BIT(0)
+
+#define CQSPI_REG_VERSAL_DLL_OBSVBLE_UPPER	0xC0
+#define CQSPI_REG_DLL_UPPER_RX_FLD_MASK		0x7F
+
 #define CQSPI_REG_OP_EXT_LOWER			0xE0
 #define CQSPI_REG_OP_EXT_READ_LSB		24
 #define CQSPI_REG_OP_EXT_WRITE_LSB		16
@@ -282,6 +299,12 @@ struct cqspi_driver_platdata {
 #define CQSPI_DMA_UNALIGN		0x3
 
 #define CQSPI_REG_VERSAL_DMA_VAL		0x602
+#define CQSPI_VERSAL_MAX_DLL_TAPS		127
+#define CQSPI_VERSAL_TX_TAP_MASTER		0x1E
+#define CQSPI_VERSAL_ECO_MIN_FREQ		120000000
+#define CQSPI_VERSAL_INIT_DLY			4
+#define CQSPI_VERSAL_RDID_MAX_CNT		10
+#define CQSPI_VERSAL_MIN_WINDOW_SZ		3
 
 static int cqspi_wait_for_bit(void __iomem *reg, const u32 mask, bool clr)
 {
@@ -540,6 +563,14 @@ static int cqspi_command_read(struct cqspi_flash_pdata *f_pdata,
 	if (dummy_clk > CQSPI_DUMMY_CLKS_MAX)
 		return -EOPNOTSUPP;
 
+	/*
+	 * Based on the tuning, some controllers requires extra dummy clocks due to additional
+	 * delays in read path. For such controllers add required extra dummy clocks for all
+	 * subsequent read operations.
+	 */
+	if (cqspi->extra_dummy_clks)
+		dummy_clk += cqspi->extra_dummy_clks;
+
 	if (dummy_clk)
 		reg |= (dummy_clk & CQSPI_REG_CMDCTRL_DUMMY_MASK)
 		     << CQSPI_REG_CMDCTRL_DUMMY_LSB;
@@ -662,6 +693,14 @@ static int cqspi_read_setup(struct cqspi_flash_pdata *f_pdata,
 	if (dummy_clk > CQSPI_DUMMY_CLKS_MAX)
 		return -EOPNOTSUPP;
 
+	/*
+	 * Based on the tuning, some controllers requires extra dummy clocks due to additional
+	 * delays in read path. For such controllers add required extra dummy clocks for all
+	 * subsequent read operations.
+	 */
+	if (cqspi->extra_dummy_clks)
+		dummy_clk += cqspi->extra_dummy_clks;
+
 	if (dummy_clk)
 		reg |= (dummy_clk & CQSPI_REG_RD_INSTR_DUMMY_MASK)
 		       << CQSPI_REG_RD_INSTR_DUMMY_LSB;
@@ -1056,6 +1095,157 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
 	return ret;
 }
 
+static int cqspi_versal_execute_tuning(struct spi_mem *mem, const struct spi_mem_op *op)
+{
+	struct cqspi_st *cqspi = spi_master_get_devdata(mem->spi->master);
+	u8 dummy_incr, dummy_flag = 0, count, max_index, min_index;
+	u8 min_rxtap, max_rxtap, avg_index, max_tap, windowsize;
+	struct platform_device *pdev = cqspi->pdev;
+	struct cqspi_flash_pdata *f_pdata;
+	bool id_matched, rxtapfound;
+	u32 val, i, reg, txtap = 0;
+	s8 max_windowsize = -1;
+	int ret;
+
+	f_pdata = &cqspi->f_pdata[mem->spi->chip_select];
+
+	reg = readl(cqspi->iobase + CQSPI_REG_CONFIG);
+	reg &= CQSPI_REG_CONFIG_ENABLE_MASK;
+	if (!reg)
+		return 0;
+
+	if (cqspi->master_ref_clk_hz >= CQSPI_VERSAL_ECO_MIN_FREQ)
+		writel(1, cqspi->iobase + CQSPI_REG_VERSAL_ECO);
+
+	/* Drive DLL reset bit to low */
+	writel(0, cqspi->iobase + CQSPI_REG_VERSAL_PHY_CONFIG);
+
+	/* Set initial delay value */
+	writel(CQSPI_VERSAL_INIT_DLY, cqspi->iobase + CQSPI_REG_VERSAL_PHY_MASTER_CTRL);
+
+	/* Set DLL reset bit */
+	writel(CQSPI_REG_PHY_CONFIG_RESET_FLD_MASK,
+	       cqspi->iobase + CQSPI_REG_VERSAL_PHY_CONFIG);
+
+	/* Check for loopback lock */
+	ret = cqspi_wait_for_bit(cqspi->iobase + CQSPI_REG_DLL_LOWER,
+				 CQSPI_REG_DLL_LOWER_LPBK_LOCK_MASK, 0);
+	if (ret) {
+		dev_err(&pdev->dev, "Loopback lock bit error (%i)\n", ret);
+		return ret;
+	}
+
+	/* Re-synchronize slave DLLs */
+	writel(CQSPI_REG_PHY_CONFIG_RESET_FLD_MASK, cqspi->iobase + CQSPI_REG_VERSAL_PHY_CONFIG);
+	writel(CQSPI_REG_PHY_CONFIG_RESET_FLD_MASK | CQSPI_REG_PHY_CONFIG_RESYNC_FLD_MASK,
+	       cqspi->iobase + CQSPI_REG_VERSAL_PHY_CONFIG);
+
+	txtap = CQSPI_VERSAL_TX_TAP_MASTER << CQSPI_REG_PHY_CONFIG_TX_DLL_DLY_LSB;
+	max_tap = CQSPI_VERSAL_MAX_DLL_TAPS;
+
+	for (dummy_incr = 0; dummy_incr <= 1; dummy_incr++) {
+		/* Update the extra dummy clocks required to read the ID */
+		cqspi->extra_dummy_clks = dummy_incr;
+		rxtapfound = false;
+		min_rxtap = 0;
+		max_rxtap = 0;
+		max_index = 0;
+		min_index = 0;
+
+		for (i = 0; i <= max_tap; i++) {
+			val = txtap | i | CQSPI_REG_PHY_CONFIG_RESET_FLD_MASK;
+			writel(val, cqspi->iobase + CQSPI_REG_VERSAL_PHY_CONFIG);
+			writel(val | CQSPI_REG_PHY_CONFIG_RESYNC_FLD_MASK,
+			       cqspi->iobase + CQSPI_REG_VERSAL_PHY_CONFIG);
+			ret = cqspi_wait_for_bit(cqspi->iobase + CQSPI_REG_DLL_LOWER,
+						 CQSPI_REG_DLL_LOWER_DLL_LOCK_MASK, 0);
+			if (ret)
+				return ret;
+
+			/*
+			 * To consider the tap as valid, make sure to read the Flash ID
+			 * 10 times(recommended by IP design team). This makes the algorithm more
+			 * robust and gives more confidence about the valid tap, this will also
+			 * avoid getting the correct id by a fluke with a particular tap.
+			 * In any one of the iteration if the Flash ID is not matches with the
+			 * device_id, then that particular tap should not be considered as valid.
+			 */
+			count = 0;
+			do {
+				/* Execute the command and wait for the complete */
+				ret = cqspi_command_read(f_pdata, op);
+				if (ret < 0) {
+					dev_err(&pdev->dev,
+						"error %d reading JEDEC ID\n", ret);
+					return ret;
+				}
+
+				id_matched = true;
+				if (memcmp(mem->device_id, op->data.buf.in, op->data.nbytes)) {
+					id_matched = false;
+					break;
+				}
+			} while (id_matched && (count++ <= CQSPI_VERSAL_RDID_MAX_CNT));
+
+			if (id_matched) {
+				u8 current_rxtap = readl(cqspi->iobase +
+							 CQSPI_REG_VERSAL_DLL_OBSVBLE_UPPER) &
+							 CQSPI_REG_DLL_UPPER_RX_FLD_MASK;
+
+				if (!rxtapfound) {
+					min_rxtap = current_rxtap;
+					min_index = i;
+					rxtapfound = true;
+				}
+
+				max_rxtap = current_rxtap;
+				max_index = i;
+			}
+
+			if (!rxtapfound)
+				continue;
+
+			/* If ID doesn't match or reach the max tap, calculate the window size */
+			if (!id_matched || i == max_tap) {
+				windowsize = max_rxtap - min_rxtap + 1;
+				if (windowsize > max_windowsize) {
+					dummy_flag = dummy_incr;
+					max_windowsize = windowsize;
+					avg_index = (max_index + min_index) / 2;
+				}
+
+				rxtapfound = false;
+				if (windowsize >= CQSPI_VERSAL_MIN_WINDOW_SZ)
+					break;
+			}
+		}
+	}
+
+	/*
+	 * Look for minimum window size to ensure reliable data sampling even during drifts due
+	 * to PVT, so that a drift either towards left or right can be well addressed.
+	 */
+	if (max_windowsize < CQSPI_VERSAL_MIN_WINDOW_SZ)
+		return -EINVAL;
+
+	/* Update the extra dummy clocks based on the maximum window size */
+	cqspi->extra_dummy_clks = dummy_flag;
+
+	val = txtap | avg_index | CQSPI_REG_PHY_CONFIG_RESET_FLD_MASK;
+	writel(val, cqspi->iobase + CQSPI_REG_VERSAL_PHY_CONFIG);
+	writel((CQSPI_REG_PHY_CONFIG_RESYNC_FLD_MASK | val),
+	       cqspi->iobase + CQSPI_REG_VERSAL_PHY_CONFIG);
+
+	ret = cqspi_wait_for_bit(cqspi->iobase + CQSPI_REG_DLL_LOWER,
+				 CQSPI_REG_DLL_LOWER_DLL_LOCK_MASK, 0);
+	if (ret)
+		return ret;
+
+	cqspi->clk_tuned = true;
+
+	return 0;
+}
+
 static void cqspi_chipselect(struct cqspi_flash_pdata *f_pdata)
 {
 	struct cqspi_st *cqspi = f_pdata->cqspi;
@@ -1345,11 +1535,21 @@ static ssize_t cqspi_read(struct cqspi_flash_pdata *f_pdata,
 static int cqspi_mem_process(struct spi_mem *mem, const struct spi_mem_op *op)
 {
 	struct cqspi_st *cqspi = spi_master_get_devdata(mem->spi->master);
+	const struct cqspi_driver_platdata *ddata;
+	struct device *dev = &cqspi->pdev->dev;
 	struct cqspi_flash_pdata *f_pdata;
 
+	ddata = of_device_get_match_data(dev);
 	f_pdata = &cqspi->f_pdata[mem->spi->chip_select];
 	cqspi_configure(f_pdata, mem->spi->max_speed_hz);
 
+	if (op->cmd.dtr && ddata && ddata->execute_tuning && !cqspi->clk_tuned) {
+		int ret = ddata->execute_tuning(mem, op);
+
+		if (ret)
+			return ret;
+	}
+
 	if (op->data.dir == SPI_MEM_DATA_IN && op->data.buf.in) {
 		if (!op->addr.nbytes)
 			return cqspi_command_read(f_pdata, op);
@@ -1846,6 +2046,7 @@ static const struct cqspi_driver_platdata versal_ospi = {
 			| CQSPI_HAS_PHY | CQSPI_HAS_DQS,
 	.indirect_read_dma = cqspi_versal_indirect_read_dma,
 	.get_dma_status = cqspi_get_versal_dma_status,
+	.execute_tuning = cqspi_versal_execute_tuning,
 };
 
 static const struct of_device_id cqspi_dt_ids[] = {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/3] spi: spi-cadence-quadspi: Add Rx tuning support for DTR mode
  2023-02-07  6:09 [PATCH 0/3] spi: spi-cadence-quadspi: Add Rx tuning support for DTR mode Sai Krishna Potthuri
                   ` (2 preceding siblings ...)
  2023-02-07  6:09 ` [PATCH 3/3] spi: cadence-quadspi: Add Rx tuning support for Xilinx Versal OSPI Sai Krishna Potthuri
@ 2023-02-07  6:48 ` Tudor Ambarus
  2023-07-18 10:01   ` Michael Walle
  3 siblings, 1 reply; 8+ messages in thread
From: Tudor Ambarus @ 2023-02-07  6:48 UTC (permalink / raw)
  To: Sai Krishna Potthuri, Mark Brown, Tudor Ambarus, Pratyush Yadav,
	Michael Walle, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra
  Cc: linux-mtd, linux-spi, linux-kernel, saikrishna12468, git



On 2/7/23 06:09, Sai Krishna Potthuri wrote:
> Enable PHY and DQS required for Xilinx Versal Octal SPI to operate in DTR
> protocol.
> Add and update device_id field in spi_mem structure with flash id
> information. Xilinx Versal Octal SPI driver requires the device id
> information to perform the Rx tuning operation. Since there is no common
> Tuning Data Pattern defined across all vendors, controllers like Xilinx
> Versal Octal SPI which requires Rx tuning to find out the optimal sampling
> point for data lines, this device id information will be used as a golden
> data.

Using only 6 bytes as golden pattern seems fragile, but you are aware of
that, as I see that you chose to read the ID 10 times to make the
decision whether the tap is valid or not. Other option (which is not
perfect) is to use SFDP data as golden pattern. If I remember
correctly, JESD216 suggests to use the Read SFDP cmd at 50 MHz, so it
won't help you much. In practice SPI NOR uses the Read SFDP command at
the flash's maximum speed and we haven't seen problems. But better would
be to use some flash OTP data maybe? I remember Pratyush has submitted a
phy calibration series in the past, I haven't had the chance to read his
proposal. Did you? How's your proposal different than his?

Cheers,
ta

> The reason behind choosing this approach instead of reading the ID again
> in the controller driver is to make it generic solution.
> - Other controller drivers which want to use similar tuning process, they
> will make use of this ID instead of reading the ID again in the driver.
> - Also, we can avoid hardcoding the command information and initiating the
> transfer in the controller driver as this should happen from spi-nor.
> 
> Sai Krishna Potthuri (3):
>    spi: cadence-quadspi: Add support for PHY module and DQS
>    mtd: spi-nor: Add and update device_id field in spi_mem structure
>    spi: cadence-quadspi: Add Rx tuning support for Xilinx Versal OSPI
> 
>   drivers/mtd/spi-nor/core.c        |   1 +
>   drivers/spi/spi-cadence-quadspi.c | 226 +++++++++++++++++++++++++++++-
>   include/linux/spi/spi-mem.h       |   4 +
>   3 files changed, 230 insertions(+), 1 deletion(-)
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 3/3] spi: cadence-quadspi: Add Rx tuning support for Xilinx Versal OSPI
  2023-02-07  6:09 ` [PATCH 3/3] spi: cadence-quadspi: Add Rx tuning support for Xilinx Versal OSPI Sai Krishna Potthuri
@ 2023-02-07  7:23   ` Tudor Ambarus
  0 siblings, 0 replies; 8+ messages in thread
From: Tudor Ambarus @ 2023-02-07  7:23 UTC (permalink / raw)
  To: Sai Krishna Potthuri, Mark Brown, Tudor Ambarus, Pratyush Yadav,
	Michael Walle, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra
  Cc: linux-mtd, linux-spi, linux-kernel, saikrishna12468, git



On 2/7/23 06:09, Sai Krishna Potthuri wrote:
> Add Rx tuning support for Xilinx Versal Octal SPI Controller(OSPI).
> Xilinx Versal Octal SPI controller requires Rx tuning to find out the
> optimal tap value when running at higher clock in DTR protocol. As there

Is the tunning required just for DTR? Is STR spared of the tunning
requirements? Because as of now, we use the maximum supported flash 
speed from the begining and we don't change it.

An option is to start the NOR flash with min(max-flash-freq, 50 MHz), do
the read ID, SFDP read, golden pattern read, increase the freq to the
max-flash-freq, then do the calibration. The golden pattern must not
necessarily be a OTP-like data (ID, SFDP, OTP), we can reserve a portion
of the flash (non - OTP) for the golden pattern, can't we?

> is no common Tuning Data Pattern defined across all vendors, Xilinx Versal
> Octal SPI uses READ_ID based tuning algorithm for which this device_id
> field in spi_mem will be used as a Tuning data pattern.
> After enabling the DTR protocol, spi-nor sends READ ID command to verify
> the DTR protocol switching is successfully done or not. If execute_tuning
> is defined(not NULL) and clk_tuned is not set then Rx tuning algorithm

looks hackish, an explicit call to a phy-calibrate method would be
better IMO.

Cheers,
ta
> will be executed before it actually read the ID in DTR protocol.
> Xilinx Versal tuning algorithm (as suggested by IP design team):
> - Sweep all possible 127 taps to read the flash id for each tap.
> - To consider a particular tap as valid make sure to get the correct id
> for 10 times.
> - Once valid window is found(window size >= 3), stop parsing the
> remaining taps.
> - Perform the above steps with extra dummy clock 1.
> - Identify the maximum window size between extra dummy clock 0 and 1 and
> configure the median of min and max valid tap of the corresponding window.
> - If the maximum window size is with dummy clock 1 then all the subsequent
> read operations should be performed with one extra dummy clock.
> 
> Signed-off-by: Sai Krishna Potthuri <sai.krishna.potthuri@amd.com>
> ---
>   drivers/spi/spi-cadence-quadspi.c | 201 ++++++++++++++++++++++++++++++
>   1 file changed, 201 insertions(+)
> 
> diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
> index b55b763003f0..bc4395736def 100644
> --- a/drivers/spi/spi-cadence-quadspi.c
> +++ b/drivers/spi/spi-cadence-quadspi.c
> @@ -93,6 +93,8 @@ struct cqspi_st {
>   	bool			slow_sram;
>   	bool			use_phy;
>   	bool			use_dqs;
> +	bool			clk_tuned;
> +	u8			extra_dummy_clks;
>   };
>   
>   struct cqspi_driver_platdata {
> @@ -101,6 +103,7 @@ struct cqspi_driver_platdata {
>   	int (*indirect_read_dma)(struct cqspi_flash_pdata *f_pdata,
>   				 u_char *rxbuf, loff_t from_addr, size_t n_rx);
>   	u32 (*get_dma_status)(struct cqspi_st *cqspi);
> +	int (*execute_tuning)(struct spi_mem *mem, const struct spi_mem_op *op);
>   };
>   
>   /* Operation timeout value */
> @@ -192,6 +195,7 @@ struct cqspi_driver_platdata {
>   
>   #define CQSPI_REG_IRQSTATUS			0x40
>   #define CQSPI_REG_IRQMASK			0x44
> +#define CQSPI_REG_VERSAL_ECO			0x48
>   
>   #define CQSPI_REG_INDIRECTRD			0x60
>   #define CQSPI_REG_INDIRECTRD_START_MASK		BIT(0)
> @@ -238,6 +242,19 @@ struct cqspi_driver_platdata {
>   #define CQSPI_REG_POLLING_STATUS		0xB0
>   #define CQSPI_REG_POLLING_STATUS_DUMMY_LSB	16
>   
> +#define CQSPI_REG_VERSAL_PHY_CONFIG		0xB4
> +#define CQSPI_REG_PHY_CONFIG_RESYNC_FLD_MASK	BIT(31)
> +#define CQSPI_REG_PHY_CONFIG_RESET_FLD_MASK	BIT(30)
> +#define CQSPI_REG_PHY_CONFIG_TX_DLL_DLY_LSB	16
> +
> +#define CQSPI_REG_VERSAL_PHY_MASTER_CTRL	0xB8
> +#define CQSPI_REG_DLL_LOWER			0xBC
> +#define CQSPI_REG_DLL_LOWER_LPBK_LOCK_MASK	BIT(15)
> +#define CQSPI_REG_DLL_LOWER_DLL_LOCK_MASK	BIT(0)
> +
> +#define CQSPI_REG_VERSAL_DLL_OBSVBLE_UPPER	0xC0
> +#define CQSPI_REG_DLL_UPPER_RX_FLD_MASK		0x7F
> +
>   #define CQSPI_REG_OP_EXT_LOWER			0xE0
>   #define CQSPI_REG_OP_EXT_READ_LSB		24
>   #define CQSPI_REG_OP_EXT_WRITE_LSB		16
> @@ -282,6 +299,12 @@ struct cqspi_driver_platdata {
>   #define CQSPI_DMA_UNALIGN		0x3
>   
>   #define CQSPI_REG_VERSAL_DMA_VAL		0x602
> +#define CQSPI_VERSAL_MAX_DLL_TAPS		127
> +#define CQSPI_VERSAL_TX_TAP_MASTER		0x1E
> +#define CQSPI_VERSAL_ECO_MIN_FREQ		120000000
> +#define CQSPI_VERSAL_INIT_DLY			4
> +#define CQSPI_VERSAL_RDID_MAX_CNT		10
> +#define CQSPI_VERSAL_MIN_WINDOW_SZ		3
>   
>   static int cqspi_wait_for_bit(void __iomem *reg, const u32 mask, bool clr)
>   {
> @@ -540,6 +563,14 @@ static int cqspi_command_read(struct cqspi_flash_pdata *f_pdata,
>   	if (dummy_clk > CQSPI_DUMMY_CLKS_MAX)
>   		return -EOPNOTSUPP;
>   
> +	/*
> +	 * Based on the tuning, some controllers requires extra dummy clocks due to additional
> +	 * delays in read path. For such controllers add required extra dummy clocks for all
> +	 * subsequent read operations.
> +	 */
> +	if (cqspi->extra_dummy_clks)
> +		dummy_clk += cqspi->extra_dummy_clks;
> +
>   	if (dummy_clk)
>   		reg |= (dummy_clk & CQSPI_REG_CMDCTRL_DUMMY_MASK)
>   		     << CQSPI_REG_CMDCTRL_DUMMY_LSB;
> @@ -662,6 +693,14 @@ static int cqspi_read_setup(struct cqspi_flash_pdata *f_pdata,
>   	if (dummy_clk > CQSPI_DUMMY_CLKS_MAX)
>   		return -EOPNOTSUPP;
>   
> +	/*
> +	 * Based on the tuning, some controllers requires extra dummy clocks due to additional
> +	 * delays in read path. For such controllers add required extra dummy clocks for all
> +	 * subsequent read operations.
> +	 */
> +	if (cqspi->extra_dummy_clks)
> +		dummy_clk += cqspi->extra_dummy_clks;
> +
>   	if (dummy_clk)
>   		reg |= (dummy_clk & CQSPI_REG_RD_INSTR_DUMMY_MASK)
>   		       << CQSPI_REG_RD_INSTR_DUMMY_LSB;
> @@ -1056,6 +1095,157 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
>   	return ret;
>   }
>   
> +static int cqspi_versal_execute_tuning(struct spi_mem *mem, const struct spi_mem_op *op)
> +{
> +	struct cqspi_st *cqspi = spi_master_get_devdata(mem->spi->master);
> +	u8 dummy_incr, dummy_flag = 0, count, max_index, min_index;
> +	u8 min_rxtap, max_rxtap, avg_index, max_tap, windowsize;
> +	struct platform_device *pdev = cqspi->pdev;
> +	struct cqspi_flash_pdata *f_pdata;
> +	bool id_matched, rxtapfound;
> +	u32 val, i, reg, txtap = 0;
> +	s8 max_windowsize = -1;
> +	int ret;
> +
> +	f_pdata = &cqspi->f_pdata[mem->spi->chip_select];
> +
> +	reg = readl(cqspi->iobase + CQSPI_REG_CONFIG);
> +	reg &= CQSPI_REG_CONFIG_ENABLE_MASK;
> +	if (!reg)
> +		return 0;
> +
> +	if (cqspi->master_ref_clk_hz >= CQSPI_VERSAL_ECO_MIN_FREQ)
> +		writel(1, cqspi->iobase + CQSPI_REG_VERSAL_ECO);
> +
> +	/* Drive DLL reset bit to low */
> +	writel(0, cqspi->iobase + CQSPI_REG_VERSAL_PHY_CONFIG);
> +
> +	/* Set initial delay value */
> +	writel(CQSPI_VERSAL_INIT_DLY, cqspi->iobase + CQSPI_REG_VERSAL_PHY_MASTER_CTRL);
> +
> +	/* Set DLL reset bit */
> +	writel(CQSPI_REG_PHY_CONFIG_RESET_FLD_MASK,
> +	       cqspi->iobase + CQSPI_REG_VERSAL_PHY_CONFIG);
> +
> +	/* Check for loopback lock */
> +	ret = cqspi_wait_for_bit(cqspi->iobase + CQSPI_REG_DLL_LOWER,
> +				 CQSPI_REG_DLL_LOWER_LPBK_LOCK_MASK, 0);
> +	if (ret) {
> +		dev_err(&pdev->dev, "Loopback lock bit error (%i)\n", ret);
> +		return ret;
> +	}
> +
> +	/* Re-synchronize slave DLLs */
> +	writel(CQSPI_REG_PHY_CONFIG_RESET_FLD_MASK, cqspi->iobase + CQSPI_REG_VERSAL_PHY_CONFIG);
> +	writel(CQSPI_REG_PHY_CONFIG_RESET_FLD_MASK | CQSPI_REG_PHY_CONFIG_RESYNC_FLD_MASK,
> +	       cqspi->iobase + CQSPI_REG_VERSAL_PHY_CONFIG);
> +
> +	txtap = CQSPI_VERSAL_TX_TAP_MASTER << CQSPI_REG_PHY_CONFIG_TX_DLL_DLY_LSB;
> +	max_tap = CQSPI_VERSAL_MAX_DLL_TAPS;
> +
> +	for (dummy_incr = 0; dummy_incr <= 1; dummy_incr++) {
> +		/* Update the extra dummy clocks required to read the ID */
> +		cqspi->extra_dummy_clks = dummy_incr;
> +		rxtapfound = false;
> +		min_rxtap = 0;
> +		max_rxtap = 0;
> +		max_index = 0;
> +		min_index = 0;
> +
> +		for (i = 0; i <= max_tap; i++) {
> +			val = txtap | i | CQSPI_REG_PHY_CONFIG_RESET_FLD_MASK;
> +			writel(val, cqspi->iobase + CQSPI_REG_VERSAL_PHY_CONFIG);
> +			writel(val | CQSPI_REG_PHY_CONFIG_RESYNC_FLD_MASK,
> +			       cqspi->iobase + CQSPI_REG_VERSAL_PHY_CONFIG);
> +			ret = cqspi_wait_for_bit(cqspi->iobase + CQSPI_REG_DLL_LOWER,
> +						 CQSPI_REG_DLL_LOWER_DLL_LOCK_MASK, 0);
> +			if (ret)
> +				return ret;
> +
> +			/*
> +			 * To consider the tap as valid, make sure to read the Flash ID
> +			 * 10 times(recommended by IP design team). This makes the algorithm more
> +			 * robust and gives more confidence about the valid tap, this will also
> +			 * avoid getting the correct id by a fluke with a particular tap.
> +			 * In any one of the iteration if the Flash ID is not matches with the
> +			 * device_id, then that particular tap should not be considered as valid.
> +			 */
> +			count = 0;
> +			do {
> +				/* Execute the command and wait for the complete */
> +				ret = cqspi_command_read(f_pdata, op);
> +				if (ret < 0) {
> +					dev_err(&pdev->dev,
> +						"error %d reading JEDEC ID\n", ret);
> +					return ret;
> +				}
> +
> +				id_matched = true;
> +				if (memcmp(mem->device_id, op->data.buf.in, op->data.nbytes)) {
> +					id_matched = false;
> +					break;
> +				}
> +			} while (id_matched && (count++ <= CQSPI_VERSAL_RDID_MAX_CNT));
> +
> +			if (id_matched) {
> +				u8 current_rxtap = readl(cqspi->iobase +
> +							 CQSPI_REG_VERSAL_DLL_OBSVBLE_UPPER) &
> +							 CQSPI_REG_DLL_UPPER_RX_FLD_MASK;
> +
> +				if (!rxtapfound) {
> +					min_rxtap = current_rxtap;
> +					min_index = i;
> +					rxtapfound = true;
> +				}
> +
> +				max_rxtap = current_rxtap;
> +				max_index = i;
> +			}
> +
> +			if (!rxtapfound)
> +				continue;
> +
> +			/* If ID doesn't match or reach the max tap, calculate the window size */
> +			if (!id_matched || i == max_tap) {
> +				windowsize = max_rxtap - min_rxtap + 1;
> +				if (windowsize > max_windowsize) {
> +					dummy_flag = dummy_incr;
> +					max_windowsize = windowsize;
> +					avg_index = (max_index + min_index) / 2;
> +				}
> +
> +				rxtapfound = false;
> +				if (windowsize >= CQSPI_VERSAL_MIN_WINDOW_SZ)
> +					break;
> +			}
> +		}
> +	}
> +
> +	/*
> +	 * Look for minimum window size to ensure reliable data sampling even during drifts due
> +	 * to PVT, so that a drift either towards left or right can be well addressed.
> +	 */
> +	if (max_windowsize < CQSPI_VERSAL_MIN_WINDOW_SZ)
> +		return -EINVAL;
> +
> +	/* Update the extra dummy clocks based on the maximum window size */
> +	cqspi->extra_dummy_clks = dummy_flag;
> +
> +	val = txtap | avg_index | CQSPI_REG_PHY_CONFIG_RESET_FLD_MASK;
> +	writel(val, cqspi->iobase + CQSPI_REG_VERSAL_PHY_CONFIG);
> +	writel((CQSPI_REG_PHY_CONFIG_RESYNC_FLD_MASK | val),
> +	       cqspi->iobase + CQSPI_REG_VERSAL_PHY_CONFIG);
> +
> +	ret = cqspi_wait_for_bit(cqspi->iobase + CQSPI_REG_DLL_LOWER,
> +				 CQSPI_REG_DLL_LOWER_DLL_LOCK_MASK, 0);
> +	if (ret)
> +		return ret;
> +
> +	cqspi->clk_tuned = true;
> +
> +	return 0;
> +}
> +
>   static void cqspi_chipselect(struct cqspi_flash_pdata *f_pdata)
>   {
>   	struct cqspi_st *cqspi = f_pdata->cqspi;
> @@ -1345,11 +1535,21 @@ static ssize_t cqspi_read(struct cqspi_flash_pdata *f_pdata,
>   static int cqspi_mem_process(struct spi_mem *mem, const struct spi_mem_op *op)
>   {
>   	struct cqspi_st *cqspi = spi_master_get_devdata(mem->spi->master);
> +	const struct cqspi_driver_platdata *ddata;
> +	struct device *dev = &cqspi->pdev->dev;
>   	struct cqspi_flash_pdata *f_pdata;
>   
> +	ddata = of_device_get_match_data(dev);
>   	f_pdata = &cqspi->f_pdata[mem->spi->chip_select];
>   	cqspi_configure(f_pdata, mem->spi->max_speed_hz);
>   
> +	if (op->cmd.dtr && ddata && ddata->execute_tuning && !cqspi->clk_tuned) {
> +		int ret = ddata->execute_tuning(mem, op);
> +
> +		if (ret)
> +			return ret;
> +	}
> +
>   	if (op->data.dir == SPI_MEM_DATA_IN && op->data.buf.in) {
>   		if (!op->addr.nbytes)
>   			return cqspi_command_read(f_pdata, op);
> @@ -1846,6 +2046,7 @@ static const struct cqspi_driver_platdata versal_ospi = {
>   			| CQSPI_HAS_PHY | CQSPI_HAS_DQS,
>   	.indirect_read_dma = cqspi_versal_indirect_read_dma,
>   	.get_dma_status = cqspi_get_versal_dma_status,
> +	.execute_tuning = cqspi_versal_execute_tuning,
>   };
>   
>   static const struct of_device_id cqspi_dt_ids[] = {

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/3] spi: spi-cadence-quadspi: Add Rx tuning support for DTR mode
  2023-02-07  6:48 ` [PATCH 0/3] spi: spi-cadence-quadspi: Add Rx tuning support for DTR mode Tudor Ambarus
@ 2023-07-18 10:01   ` Michael Walle
  2023-07-18 11:30     ` Potthuri, Sai Krishna
  0 siblings, 1 reply; 8+ messages in thread
From: Michael Walle @ 2023-07-18 10:01 UTC (permalink / raw)
  To: Tudor Ambarus
  Cc: Sai Krishna Potthuri, Mark Brown, Tudor Ambarus, Pratyush Yadav,
	Miquel Raynal, Richard Weinberger, Vignesh Raghavendra,
	linux-mtd, linux-spi, linux-kernel, saikrishna12468, git

Am 2023-02-07 07:48, schrieb Tudor Ambarus:
> On 2/7/23 06:09, Sai Krishna Potthuri wrote:
>> Enable PHY and DQS required for Xilinx Versal Octal SPI to operate in 
>> DTR
>> protocol.
>> Add and update device_id field in spi_mem structure with flash id
>> information. Xilinx Versal Octal SPI driver requires the device id
>> information to perform the Rx tuning operation. Since there is no 
>> common
>> Tuning Data Pattern defined across all vendors, controllers like 
>> Xilinx
>> Versal Octal SPI which requires Rx tuning to find out the optimal 
>> sampling
>> point for data lines, this device id information will be used as a 
>> golden
>> data.
> 
> Using only 6 bytes as golden pattern seems fragile, but you are aware 
> of
> that, as I see that you chose to read the ID 10 times to make the
> decision whether the tap is valid or not. Other option (which is not
> perfect) is to use SFDP data as golden pattern. If I remember
> correctly, JESD216 suggests to use the Read SFDP cmd at 50 MHz, so it
> won't help you much. In practice SPI NOR uses the Read SFDP command at
> the flash's maximum speed and we haven't seen problems. But better 
> would
> be to use some flash OTP data maybe? I remember Pratyush has submitted 
> a
> phy calibration series in the past, I haven't had the chance to read 
> his
> proposal. Did you? How's your proposal different than his?

And its not 6 bytes.. it's usually only three. The last three bytes will
probably be undefined. So the might return ff or just wrap around and
return the first three bytes again.

Is there a datasheet where you can read how the calibration is done? Is 
this
the same for all i/o pads or individual per i/o pad?

I cannot see where the op to read the id is coming from. Are you relying
on the fact that a RDID is the first command which gets executed. If so,
please don't.

Do you calibrate only one pad? RDID (9f) is  single bit i/o, right? And
I guess you are calibrating with the highest frequency, are
we sure that RDID will work with any frequency (on any flash).


>> The reason behind choosing this approach instead of reading the ID 
>> again
>> in the controller driver is to make it generic solution.
>> - Other controller drivers which want to use similar tuning process, 
>> they
>> will make use of this ID instead of reading the ID again in the 
>> driver.

Honestly, I'm not sure this is the way to go. Pratyush proposed solution
to have a dedicated memory area within the flash array with a know 
pattern
seems to make more sense, because you are calibrating on the thing you 
are going
to use later, that is quad/ocal read with the fastest frequency.

>> - Also, we can avoid hardcoding the command information and initiating 
>> the
>> transfer in the controller driver as this should happen from spi-nor.

So how you know that this is a RDID instruction?

-michael

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH 0/3] spi: spi-cadence-quadspi: Add Rx tuning support for DTR mode
  2023-07-18 10:01   ` Michael Walle
@ 2023-07-18 11:30     ` Potthuri, Sai Krishna
  0 siblings, 0 replies; 8+ messages in thread
From: Potthuri, Sai Krishna @ 2023-07-18 11:30 UTC (permalink / raw)
  To: Michael Walle, Tudor Ambarus
  Cc: Mark Brown, Tudor Ambarus, Pratyush Yadav, Miquel Raynal,
	Richard Weinberger, Vignesh Raghavendra, linux-mtd, linux-spi,
	linux-kernel, saikrishna12468, git (AMD-Xilinx)

Hi Michael,

> -----Original Message-----
> From: Michael Walle <michael@walle.cc>
> Sent: Tuesday, July 18, 2023 3:31 PM
> To: Tudor Ambarus <tudor.ambarus@linaro.org>
> Cc: Potthuri, Sai Krishna <sai.krishna.potthuri@amd.com>; Mark Brown
> <broonie@kernel.org>; Tudor Ambarus <tudor.ambarus@microchip.com>;
> Pratyush Yadav <pratyush@kernel.org>; Miquel Raynal
> <miquel.raynal@bootlin.com>; Richard Weinberger <richard@nod.at>; Vignesh
> Raghavendra <vigneshr@ti.com>; linux-mtd@lists.infradead.org; linux-
> spi@vger.kernel.org; linux-kernel@vger.kernel.org; saikrishna12468@gmail.com;
> git (AMD-Xilinx) <git@amd.com>
> Subject: Re: [PATCH 0/3] spi: spi-cadence-quadspi: Add Rx tuning support for DTR
> mode
> 
> Am 2023-02-07 07:48, schrieb Tudor Ambarus:
> > On 2/7/23 06:09, Sai Krishna Potthuri wrote:
> >> Enable PHY and DQS required for Xilinx Versal Octal SPI to operate in
> >> DTR protocol.
> >> Add and update device_id field in spi_mem structure with flash id
> >> information. Xilinx Versal Octal SPI driver requires the device id
> >> information to perform the Rx tuning operation. Since there is no
> >> common Tuning Data Pattern defined across all vendors, controllers
> >> like Xilinx Versal Octal SPI which requires Rx tuning to find out the
> >> optimal sampling point for data lines, this device id information
> >> will be used as a golden data.
> >
> > Using only 6 bytes as golden pattern seems fragile, but you are aware
> > of that, as I see that you chose to read the ID 10 times to make the
> > decision whether the tap is valid or not. Other option (which is not
> > perfect) is to use SFDP data as golden pattern. If I remember
> > correctly, JESD216 suggests to use the Read SFDP cmd at 50 MHz, so it
> > won't help you much. In practice SPI NOR uses the Read SFDP command at
> > the flash's maximum speed and we haven't seen problems. But better
> > would be to use some flash OTP data maybe? I remember Pratyush has
> > submitted a phy calibration series in the past, I haven't had the
> > chance to read his proposal. Did you? How's your proposal different
> > than his?
> 
> And its not 6 bytes.. it's usually only three. The last three bytes will probably be
> undefined. So the might return ff or just wrap around and return the first three
> bytes again.
> 
> Is there a datasheet where you can read how the calibration is done? Is this the
> same for all i/o pads or individual per i/o pad?
> 
> I cannot see where the op to read the id is coming from. Are you relying on the
> fact that a RDID is the first command which gets executed. If so, please don't.
> 
> Do you calibrate only one pad? RDID (9f) is  single bit i/o, right? And I guess you
> are calibrating with the highest frequency, are we sure that RDID will work with
> any frequency (on any flash).
> 
> 
> >> The reason behind choosing this approach instead of reading the ID
> >> again in the controller driver is to make it generic solution.
> >> - Other controller drivers which want to use similar tuning process,
> >> they will make use of this ID instead of reading the ID again in the
> >> driver.
> 
> Honestly, I'm not sure this is the way to go. Pratyush proposed solution to have a
> dedicated memory area within the flash array with a know pattern seems to
> make more sense, because you are calibrating on the thing you are going to use
> later, that is quad/ocal read with the fastest frequency.
Yes, I am going through his proposal and working on it to adapt that solution.
Agree, this approach makes more sense than RDID also, with this we can avoid the
cons with RDID approach like small pattern, calibrating one line, frequency limitations
with RDID command(if any) etc.,
 
> 
> >> - Also, we can avoid hardcoding the command information and
> >> initiating the transfer in the controller driver as this should
> >> happen from spi-nor.
> 
> So how you know that this is a RDID instruction?
Yes, I am expecting the first command after switching to DDR mode is RDID.
Agree, we cannot rely on this.

Regards
Sai Krishna

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-07-18 11:30 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-07  6:09 [PATCH 0/3] spi: spi-cadence-quadspi: Add Rx tuning support for DTR mode Sai Krishna Potthuri
2023-02-07  6:09 ` [PATCH 1/3] spi: cadence-quadspi: Add support for PHY module and DQS Sai Krishna Potthuri
2023-02-07  6:09 ` [PATCH 2/3] mtd: spi-nor: Add and update device_id field in spi_mem structure Sai Krishna Potthuri
2023-02-07  6:09 ` [PATCH 3/3] spi: cadence-quadspi: Add Rx tuning support for Xilinx Versal OSPI Sai Krishna Potthuri
2023-02-07  7:23   ` Tudor Ambarus
2023-02-07  6:48 ` [PATCH 0/3] spi: spi-cadence-quadspi: Add Rx tuning support for DTR mode Tudor Ambarus
2023-07-18 10:01   ` Michael Walle
2023-07-18 11:30     ` Potthuri, Sai Krishna

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).