All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/10] mtd: spi-nor: aspeed: support extensions
@ 2017-04-06 16:56 Cédric Le Goater
  2017-04-06 16:56 ` [PATCH 01/10] mtd: spi-nor: aspeed: fix the AST2400 SMC window size Cédric Le Goater
                   ` (9 more replies)
  0 siblings, 10 replies; 40+ messages in thread
From: Cédric Le Goater @ 2017-04-06 16:56 UTC (permalink / raw)
  To: linux-mtd
  Cc: Cyrille Pitchen, Marek Vasut, Boris Brezillon, David Woodhouse,
	Brian Norris, Richard Weinberger, Joel Stanley,
	Cédric Le Goater

Hello,

Here is a series extending the support for the Aspeed AST2500 and
AST2400 SMC driver. First is a couple of simple fixes, then comes
support for :

 - DMA which applies to the FMC (BMC Firmware) controller only.
 - Dual Data (Quad is not supported by the controller). The spi-nor
   table of chips is modified to activate DUAL mode on the chips used
   on OpenPOWER systems. 
 - Configuration of the segment registers defining the mapping windows
   of the chips on the AHB bus.
 - Command mode for reads. Reads are performed directly on the mapping
   windows of the chip on the AHB Bus.
 - Read mode optimization, calculating the best settings for a given
   chip (for the SPI controllers only)

Tested on:

 * OpenPOWER Palmetto (AST2400)
 * Evaluation board (AST2500) 
 * OpenPOWER Witherspoon (AST2500)
 * OpenPOWER Romulus (AST2500)
 * OpenPOWER Zaius (AST2500)

Thanks,

C.

Cédric Le Goater (9):
  mtd: spi-nor: aspeed: fix the AST2400 SMC window size
  mtd: spi-nor: aspeed: remove dummies from keep mask
  mtd: spi-nor: aspeed: add DMA support
  mtd: spi-nor: Add support for Macronix mx66l1g45g spi flash
  mtd: spi-nor: add SPI_NOR_DUAL_READ to some flash devices
  mtd: spi-nor: aspeed: configure chip window on AHB bus
  mtd: spi-nor: aspeed: use command mode for reads
  mtd: spi-nor: aspeed: link controller with the ahb clock
  mtd: spi-nor: aspeed: optimize read mode

Robert Lippert (1):
  mtd: spi-nor: aspeed: add support for SPI dual IO read mode

 drivers/mtd/spi-nor/aspeed-smc.c | 604 ++++++++++++++++++++++++++++++++++++++-
 drivers/mtd/spi-nor/spi-nor.c    |   9 +-
 2 files changed, 595 insertions(+), 18 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 01/10] mtd: spi-nor: aspeed: fix the AST2400 SMC window size
  2017-04-06 16:56 [PATCH 00/10] mtd: spi-nor: aspeed: support extensions Cédric Le Goater
@ 2017-04-06 16:56 ` Cédric Le Goater
  2017-04-06 19:17   ` Marek Vasut
  2017-04-06 16:56 ` [PATCH 02/10] mtd: spi-nor: aspeed: remove dummies from keep mask Cédric Le Goater
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 40+ messages in thread
From: Cédric Le Goater @ 2017-04-06 16:56 UTC (permalink / raw)
  To: linux-mtd
  Cc: Cyrille Pitchen, Marek Vasut, Boris Brezillon, David Woodhouse,
	Brian Norris, Richard Weinberger, Joel Stanley,
	Cédric Le Goater

The window of the Aspeed AST2400 SMC Controller to map chips on the
AHB Bus has a 256MB size. 64MB is the default size for the first chip
select and not the full window range: [ 0x20000000 - 0x2FFFFFFF ].

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 drivers/mtd/spi-nor/aspeed-smc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mtd/spi-nor/aspeed-smc.c b/drivers/mtd/spi-nor/aspeed-smc.c
index 56051d30f000..aa2c647c8507 100644
--- a/drivers/mtd/spi-nor/aspeed-smc.c
+++ b/drivers/mtd/spi-nor/aspeed-smc.c
@@ -48,7 +48,7 @@ static void aspeed_smc_chip_set_4b_spi_2400(struct aspeed_smc_chip *chip);
 static void aspeed_smc_chip_set_4b(struct aspeed_smc_chip *chip);
 
 static const struct aspeed_smc_info fmc_2400_info = {
-	.maxsize = 64 * 1024 * 1024,
+	.maxsize = 256 * 1024 * 1024,
 	.nce = 5,
 	.hastype = true,
 	.we0 = 16,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 02/10] mtd: spi-nor: aspeed: remove dummies from keep mask
  2017-04-06 16:56 [PATCH 00/10] mtd: spi-nor: aspeed: support extensions Cédric Le Goater
  2017-04-06 16:56 ` [PATCH 01/10] mtd: spi-nor: aspeed: fix the AST2400 SMC window size Cédric Le Goater
@ 2017-04-06 16:56 ` Cédric Le Goater
  2017-04-06 16:56 ` [PATCH 03/10] mtd: spi-nor: aspeed: add DMA support Cédric Le Goater
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 40+ messages in thread
From: Cédric Le Goater @ 2017-04-06 16:56 UTC (permalink / raw)
  To: linux-mtd
  Cc: Cyrille Pitchen, Marek Vasut, Boris Brezillon, David Woodhouse,
	Brian Norris, Richard Weinberger, Joel Stanley,
	Cédric Le Goater

There is no need to keep the dummy bytes in the control register if
the command mode is not kept also. This could lead to an inconsistent
setting : normal read mode (command 0x3) and dummy bytes. It is to be
noted that the HW allows such a configuration.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 drivers/mtd/spi-nor/aspeed-smc.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/mtd/spi-nor/aspeed-smc.c b/drivers/mtd/spi-nor/aspeed-smc.c
index aa2c647c8507..a98d454d07ed 100644
--- a/drivers/mtd/spi-nor/aspeed-smc.c
+++ b/drivers/mtd/spi-nor/aspeed-smc.c
@@ -180,8 +180,7 @@ struct aspeed_smc_controller {
 
 #define CONTROL_KEEP_MASK						\
 	(CONTROL_AAF_MODE | CONTROL_CE_INACTIVE_MASK | CONTROL_CLK_DIV4 | \
-	 CONTROL_IO_DUMMY_MASK | CONTROL_CLOCK_FREQ_SEL_MASK |		\
-	 CONTROL_LSB_FIRST | CONTROL_CLOCK_MODE_3)
+	 CONTROL_CLOCK_FREQ_SEL_MASK | CONTROL_LSB_FIRST | CONTROL_CLOCK_MODE_3)
 
 /*
  * The Segment Register uses a 8MB unit to encode the start address
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 03/10] mtd: spi-nor: aspeed: add DMA support
  2017-04-06 16:56 [PATCH 00/10] mtd: spi-nor: aspeed: support extensions Cédric Le Goater
  2017-04-06 16:56 ` [PATCH 01/10] mtd: spi-nor: aspeed: fix the AST2400 SMC window size Cédric Le Goater
  2017-04-06 16:56 ` [PATCH 02/10] mtd: spi-nor: aspeed: remove dummies from keep mask Cédric Le Goater
@ 2017-04-06 16:56 ` Cédric Le Goater
  2017-04-06 19:07   ` Cyrille Pitchen
  2017-04-06 16:56 ` [PATCH 04/10] mtd: spi-nor: aspeed: add support for SPI dual IO read mode Cédric Le Goater
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 40+ messages in thread
From: Cédric Le Goater @ 2017-04-06 16:56 UTC (permalink / raw)
  To: linux-mtd
  Cc: Cyrille Pitchen, Marek Vasut, Boris Brezillon, David Woodhouse,
	Brian Norris, Richard Weinberger, Joel Stanley,
	Cédric Le Goater

The Aspeed FMC controller can handle transfers to the flash modules
using DMAs. A couple of registers first need to be programmed with the
DRAM and flash addresses and the length of the transfer. The transfer
is then initiated using a DMA control register and an interrupt
notifies the completion.

Such transfers can replace the current IO mode in the read/write ops
when some conditions are met on the size and the alignment. In case of
failure, a timeout for instance, the operation is restarted using the
IO mode.

DMA support does not seem to be that efficient. So we provide some
sysfs files for tuning and to switch it on and off (default is off)

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 drivers/mtd/spi-nor/aspeed-smc.c | 231 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 231 insertions(+)

diff --git a/drivers/mtd/spi-nor/aspeed-smc.c b/drivers/mtd/spi-nor/aspeed-smc.c
index a98d454d07ed..7dfa1ea0a787 100644
--- a/drivers/mtd/spi-nor/aspeed-smc.c
+++ b/drivers/mtd/spi-nor/aspeed-smc.c
@@ -11,6 +11,8 @@
 
 #include <linux/bug.h>
 #include <linux/device.h>
+#include <linux/dma-direction.h>
+#include <linux/dma-mapping.h>
 #include <linux/io.h>
 #include <linux/module.h>
 #include <linux/mutex.h>
@@ -40,6 +42,7 @@ struct aspeed_smc_info {
 	bool hastype;		/* flash type field exists in config reg */
 	u8 we0;			/* shift for write enable bit for CE0 */
 	u8 ctl0;		/* offset in regs of ctl for CE0 */
+	bool has_dma;
 
 	void (*set_4b)(struct aspeed_smc_chip *chip);
 };
@@ -53,6 +56,7 @@ static const struct aspeed_smc_info fmc_2400_info = {
 	.hastype = true,
 	.we0 = 16,
 	.ctl0 = 0x10,
+	.has_dma = true,
 	.set_4b = aspeed_smc_chip_set_4b,
 };
 
@@ -62,6 +66,7 @@ static const struct aspeed_smc_info spi_2400_info = {
 	.hastype = false,
 	.we0 = 0,
 	.ctl0 = 0x04,
+	.has_dma = false,
 	.set_4b = aspeed_smc_chip_set_4b_spi_2400,
 };
 
@@ -71,6 +76,7 @@ static const struct aspeed_smc_info fmc_2500_info = {
 	.hastype = true,
 	.we0 = 16,
 	.ctl0 = 0x10,
+	.has_dma = true,
 	.set_4b = aspeed_smc_chip_set_4b,
 };
 
@@ -80,6 +86,7 @@ static const struct aspeed_smc_info spi_2500_info = {
 	.hastype = false,
 	.we0 = 16,
 	.ctl0 = 0x10,
+	.has_dma = false,
 	.set_4b = aspeed_smc_chip_set_4b,
 };
 
@@ -97,6 +104,7 @@ struct aspeed_smc_chip {
 	struct aspeed_smc_controller *controller;
 	void __iomem *ctl;			/* control register */
 	void __iomem *ahb_base;			/* base of chip window */
+	unsigned long phys_base;		/* physical address of window */
 	u32 ctl_val[smc_max];			/* control settings */
 	enum aspeed_smc_flash_type type;	/* what type of flash */
 	struct spi_nor nor;
@@ -110,6 +118,18 @@ struct aspeed_smc_controller {
 	void __iomem *regs;			/* controller registers */
 	void __iomem *ahb_base;			/* per-chip windows resource */
 
+	/* interrupt handling */
+	int irq;
+
+	/* dma */
+	bool dma_enabled;
+	struct completion dma_done;
+
+	/* dma logging */
+	size_t dma_length;
+	dma_addr_t dma_addr;			/* bus address of buffer */
+	dma_addr_t flash_addr;			/* flash address */
+
 	struct aspeed_smc_chip *chips[0];	/* pointers to attached chips */
 };
 
@@ -182,6 +202,11 @@ struct aspeed_smc_controller {
 	(CONTROL_AAF_MODE | CONTROL_CE_INACTIVE_MASK | CONTROL_CLK_DIV4 | \
 	 CONTROL_CLOCK_FREQ_SEL_MASK | CONTROL_LSB_FIRST | CONTROL_CLOCK_MODE_3)
 
+/* Interrupt Control and Status Register */
+#define INTERRUPT_STATUS_REG		0x08
+#define     INTERRUPT_DMA_ENABLE	BIT(3)
+#define     INTERRUPT_DMA_STATUS	BIT(11)
+
 /*
  * The Segment Register uses a 8MB unit to encode the start address
  * and the end address of the mapping window of a flash SPI slave :
@@ -194,6 +219,108 @@ struct aspeed_smc_controller {
 #define SEGMENT_ADDR_START(_r)		((((_r) >> 16) & 0xFF) << 23)
 #define SEGMENT_ADDR_END(_r)		((((_r) >> 24) & 0xFF) << 23)
 
+/* DMA Registers */
+#define DMA_CONTROL_REG			0x80
+#define     DMA_ENABLE			BIT(0)
+#define     DMA_WRITE			BIT(1)
+
+#define DMA_FLASH_BASE_REG		0x84
+#define DMA_DRAM_BASE_REG		0x88
+#define DMA_LENGTH_REG			0x8c
+
+/*
+ * DMAs do not seem to be that fast, so disable by default
+ */
+static bool use_dma;
+module_param(use_dma, bool, 0644);
+
+static unsigned int min_dma_size = 256;
+module_param(min_dma_size, uint, 0644);
+
+/* with 100ms we had a couple of timeouts */
+static unsigned int dma_timeout = 200;
+module_param(dma_timeout, uint, 0644);
+
+static void aspeed_smc_dma_done(struct aspeed_smc_controller *controller)
+{
+	writel(0, controller->regs + INTERRUPT_STATUS_REG);
+	writel(0, controller->regs + DMA_CONTROL_REG);
+}
+
+#define DMA_LENGTH(x) (((x) - 4) & ~0xFE000003)
+#define DMA_ADDR(x) ((x) & ~0x00000003)
+
+static inline void aspeed_smc_chip_configure(struct aspeed_smc_chip *chip,
+					     u32 ctl)
+{
+	ctl |= CONTROL_CE_STOP_ACTIVE_CONTROL;
+	writel(ctl, chip->ctl);
+
+	ctl &= ~CONTROL_CE_STOP_ACTIVE_CONTROL;
+	writel(ctl, chip->ctl);
+}
+
+static int aspeed_smc_dma_start(struct aspeed_smc_chip *chip,
+				u32 offset, void *buf, size_t length,
+				int is_write)
+{
+	struct aspeed_smc_controller *controller = chip->controller;
+	dma_addr_t dma_addr, flash_addr;
+	int ret = 0;
+
+	aspeed_smc_chip_configure(chip, is_write ? chip->ctl_val[smc_write] :
+		chip->ctl_val[smc_read]);
+
+	dev_dbg(chip->nor.dev, "DMA %s to=0x%08x len=0x%08x\n",
+		is_write ? "write" : "read", offset, length);
+
+	dma_addr = dma_map_single(chip->nor.dev, buf, length,
+				  (is_write ? DMA_TO_DEVICE : DMA_FROM_DEVICE));
+
+	if (unlikely(dma_mapping_error(chip->nor.dev, dma_addr))) {
+		dev_err(chip->nor.dev, "Failed to dma_map_single()\n");
+		ret = -ENOMEM;
+		goto out;
+	}
+	flash_addr = chip->phys_base + offset;
+
+	controller->dma_length = length;
+	controller->dma_addr = dma_addr;
+	controller->flash_addr = flash_addr;
+
+	reinit_completion(&controller->dma_done);
+
+	writel(0, controller->regs + DMA_CONTROL_REG);
+	writel(DMA_ADDR(flash_addr), controller->regs +
+	       DMA_FLASH_BASE_REG);
+	writel(DMA_ADDR(dma_addr), controller->regs + DMA_DRAM_BASE_REG);
+	writel(DMA_LENGTH(length), controller->regs + DMA_LENGTH_REG);
+
+	writel(INTERRUPT_DMA_ENABLE,
+	       controller->regs + INTERRUPT_STATUS_REG);
+
+	writel(DMA_ENABLE | (is_write << 1),
+	       controller->regs + DMA_CONTROL_REG);
+
+	if (!wait_for_completion_timeout(&controller->dma_done,
+					 msecs_to_jiffies(dma_timeout))) {
+		dev_err(chip->nor.dev,
+			"DMA timeout addr@%.8x faddr@%.8x size=%x\n",
+			controller->dma_addr,
+			controller->flash_addr,
+			controller->dma_length);
+		ret = -ETIMEDOUT;
+		aspeed_smc_dma_done(controller);
+	}
+
+	dma_unmap_single(chip->nor.dev,
+			 controller->dma_addr, controller->dma_length,
+			 (is_write ? DMA_TO_DEVICE : DMA_FROM_DEVICE));
+out:
+	aspeed_smc_chip_configure(chip, chip->ctl_val[smc_read]);
+	return ret;
+}
+
 /*
  * In user mode all data bytes read or written to the chip decode address
  * range are transferred to or from the SPI bus. The range is treated as a
@@ -366,12 +493,32 @@ static void aspeed_smc_send_cmd_addr(struct spi_nor *nor, u8 cmd, u32 addr)
 	}
 }
 
+/*
+ * Try DMA transfer when size and alignment are correct. In case
+ * of failure, just restart using the IO mode.
+ */
+static int aspeed_smc_dma_check(struct aspeed_smc_chip *chip, loff_t off,
+				size_t len)
+{
+	return (IS_ALIGNED(off, 4) && IS_ALIGNED(len, 4) &&
+		len >= min_dma_size && chip->controller->dma_enabled &&
+		use_dma);
+}
+
 static ssize_t aspeed_smc_read_user(struct spi_nor *nor, loff_t from,
 				    size_t len, u_char *read_buf)
 {
 	struct aspeed_smc_chip *chip = nor->priv;
 	int i;
 	u8 dummy = 0xFF;
+	int ret;
+
+	if (aspeed_smc_dma_check(chip, from, len)) {
+		ret = aspeed_smc_dma_start(chip, from, read_buf, len, 0);
+		if (!ret)
+			goto out;
+		dev_err(chip->nor.dev, "DMA read failed: %d", ret);
+	}
 
 	aspeed_smc_start_user(nor);
 	aspeed_smc_send_cmd_addr(nor, nor->read_opcode, from);
@@ -380,6 +527,7 @@ static ssize_t aspeed_smc_read_user(struct spi_nor *nor, loff_t from,
 
 	aspeed_smc_read_from_ahb(read_buf, chip->ahb_base, len);
 	aspeed_smc_stop_user(nor);
+out:
 	return len;
 }
 
@@ -387,11 +535,21 @@ static ssize_t aspeed_smc_write_user(struct spi_nor *nor, loff_t to,
 				     size_t len, const u_char *write_buf)
 {
 	struct aspeed_smc_chip *chip = nor->priv;
+	int ret;
+
+	if (aspeed_smc_dma_check(chip, to, len)) {
+		ret = aspeed_smc_dma_start(chip, to, (void *)write_buf,
+					   len, 1);
+		if (!ret)
+			goto out;
+		dev_err(chip->nor.dev, "DMA write failed: %d", ret);
+	}
 
 	aspeed_smc_start_user(nor);
 	aspeed_smc_send_cmd_addr(nor, nor->program_opcode, to);
 	aspeed_smc_write_to_ahb(chip->ahb_base, write_buf, len);
 	aspeed_smc_stop_user(nor);
+out:
 	return len;
 }
 
@@ -527,6 +685,8 @@ static int aspeed_smc_chip_setup_init(struct aspeed_smc_chip *chip,
 		return -EINVAL;
 	}
 
+	chip->phys_base = res->start + (chip->ahb_base - controller->ahb_base);
+
 	/*
 	 * Get value of the inherited control register. U-Boot usually
 	 * does some timing calibration on the FMC chip, so it's good
@@ -597,6 +757,7 @@ static int aspeed_smc_chip_setup_finish(struct aspeed_smc_chip *chip)
 	}
 
 	chip->ctl_val[smc_read] |= cmd |
+		chip->nor.read_opcode << CONTROL_COMMAND_SHIFT |
 		CONTROL_IO_DUMMY_SET(chip->nor.read_dummy / 8);
 
 	dev_dbg(controller->dev, "base control register: %08x\n",
@@ -695,6 +856,74 @@ static int aspeed_smc_setup_flash(struct aspeed_smc_controller *controller,
 	return ret;
 }
 
+static irqreturn_t aspeed_smc_irq(int irq, void *arg)
+{
+	struct aspeed_smc_controller *controller = arg;
+	struct device *dev = controller->dev;
+	irqreturn_t ret = IRQ_NONE;
+	u32 dma_ctl = readl(controller->regs + DMA_CONTROL_REG);
+	u32 status = readl(controller->regs + INTERRUPT_STATUS_REG);
+
+	dev_dbg(dev, "received IRQ. status: %x", status);
+
+	if (!(status & INTERRUPT_DMA_ENABLE) || !(dma_ctl & DMA_ENABLE)) {
+		dev_err(dev, "No DMA. bad IRQ status: %x", status);
+		goto out;
+	}
+
+	if (!(status & INTERRUPT_DMA_STATUS)) {
+		dev_err(dev, "DMA still in progress. length %d\n",
+			readl(controller->regs + DMA_LENGTH_REG));
+		goto out;
+	}
+
+	ret = IRQ_HANDLED;
+	aspeed_smc_dma_done(controller);
+	complete(&controller->dma_done);
+
+out:
+	return ret;
+}
+
+static int aspeed_smc_config_irq(struct aspeed_smc_controller *controller,
+				 struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	int rc;
+
+	controller->irq = platform_get_irq(pdev, 0);
+	if (!controller->irq)
+		return -ENODEV;
+
+	rc = devm_request_irq(dev, controller->irq, aspeed_smc_irq, IRQF_SHARED,
+			      DEVICE_NAME, controller);
+	if (rc < 0) {
+		dev_warn(dev, "Unable to request IRQ %d\n", controller->irq);
+		controller->irq = 0;
+		return rc;
+	}
+
+	dev_info(dev, "Using IRQ %d\n", controller->irq);
+	return 0;
+}
+
+static void aspeed_smc_dma_setup(struct aspeed_smc_controller *controller,
+				 struct platform_device *pdev)
+{
+	const struct aspeed_smc_info *info = controller->info;
+
+	init_completion(&controller->dma_done);
+
+	controller->dma_enabled = false;
+	if (info->has_dma)
+		controller->dma_enabled = !aspeed_smc_config_irq(controller,
+								 pdev);
+
+	if (controller->dma_enabled)
+		dev_info(controller->dev, "DMA support %s.\n",
+			 use_dma ? "enabled" : "disabled");
+}
+
 static int aspeed_smc_probe(struct platform_device *pdev)
 {
 	struct device_node *np = pdev->dev.of_node;
@@ -730,6 +959,8 @@ static int aspeed_smc_probe(struct platform_device *pdev)
 	if (IS_ERR(controller->ahb_base))
 		return PTR_ERR(controller->ahb_base);
 
+	aspeed_smc_dma_setup(controller, pdev);
+
 	ret = aspeed_smc_setup_flash(controller, np, res);
 	if (ret)
 		dev_err(dev, "Aspeed SMC probe failed %d\n", ret);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 04/10] mtd: spi-nor: aspeed: add support for SPI dual IO read mode
  2017-04-06 16:56 [PATCH 00/10] mtd: spi-nor: aspeed: support extensions Cédric Le Goater
                   ` (2 preceding siblings ...)
  2017-04-06 16:56 ` [PATCH 03/10] mtd: spi-nor: aspeed: add DMA support Cédric Le Goater
@ 2017-04-06 16:56 ` Cédric Le Goater
  2017-04-06 19:21   ` Marek Vasut
  2017-04-06 16:56 ` [PATCH 05/10] mtd: spi-nor: Add support for Macronix mx66l1g45g spi flash Cédric Le Goater
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 40+ messages in thread
From: Cédric Le Goater @ 2017-04-06 16:56 UTC (permalink / raw)
  To: linux-mtd
  Cc: Cyrille Pitchen, Marek Vasut, Boris Brezillon, David Woodhouse,
	Brian Norris, Richard Weinberger, Joel Stanley, Robert Lippert,
	Robert Lippert, Cédric Le Goater

From: Robert Lippert <roblip@gmail.com>

Implements support for the dual IO read mode on aspeed SMC/FMC
controllers which uses both MISO and MOSI lines for data during a read
to double the read bandwidth.

Signed-off-by: Robert Lippert <rlippert@google.com>
[clg: adapted to mainline driver ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 drivers/mtd/spi-nor/aspeed-smc.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/mtd/spi-nor/aspeed-smc.c b/drivers/mtd/spi-nor/aspeed-smc.c
index 7dfa1ea0a787..b3c8cfe29765 100644
--- a/drivers/mtd/spi-nor/aspeed-smc.c
+++ b/drivers/mtd/spi-nor/aspeed-smc.c
@@ -512,6 +512,7 @@ static ssize_t aspeed_smc_read_user(struct spi_nor *nor, loff_t from,
 	int i;
 	u8 dummy = 0xFF;
 	int ret;
+	u32 ctl;
 
 	if (aspeed_smc_dma_check(chip, from, len)) {
 		ret = aspeed_smc_dma_start(chip, from, read_buf, len, 0);
@@ -525,6 +526,13 @@ static ssize_t aspeed_smc_read_user(struct spi_nor *nor, loff_t from,
 	for (i = 0; i < chip->nor.read_dummy / 8; i++)
 		aspeed_smc_write_to_ahb(chip->ahb_base, &dummy, sizeof(dummy));
 
+	if (chip->nor.flash_read == SPI_NOR_DUAL) {
+		/* Switch to dual I/O mode for data cycle */
+		ctl = readl(chip->ctl) & ~CONTROL_IO_MODE_MASK;
+		ctl |= CONTROL_IO_DUAL_DATA;
+		writel(ctl, chip->ctl);
+	}
+
 	aspeed_smc_read_from_ahb(read_buf, chip->ahb_base, len);
 	aspeed_smc_stop_user(nor);
 out:
@@ -751,6 +759,9 @@ static int aspeed_smc_chip_setup_finish(struct aspeed_smc_chip *chip)
 	case SPI_NOR_FAST:
 		cmd = CONTROL_COMMAND_MODE_FREAD;
 		break;
+	case SPI_NOR_DUAL:
+		cmd = CONTROL_COMMAND_MODE_FREAD | CONTROL_IO_DUAL_DATA;
+		break;
 	default:
 		dev_err(chip->nor.dev, "unsupported SPI read mode\n");
 		return -EINVAL;
@@ -760,7 +771,7 @@ static int aspeed_smc_chip_setup_finish(struct aspeed_smc_chip *chip)
 		chip->nor.read_opcode << CONTROL_COMMAND_SHIFT |
 		CONTROL_IO_DUMMY_SET(chip->nor.read_dummy / 8);
 
-	dev_dbg(controller->dev, "base control register: %08x\n",
+	dev_dbg(controller->dev, "read control register: %08x\n",
 		chip->ctl_val[smc_read]);
 	return 0;
 }
@@ -830,12 +841,7 @@ static int aspeed_smc_setup_flash(struct aspeed_smc_controller *controller,
 		if (ret)
 			break;
 
-		/*
-		 * TODO: Add support for SPI_NOR_QUAD and SPI_NOR_DUAL
-		 * attach when board support is present as determined
-		 * by of property.
-		 */
-		ret = spi_nor_scan(nor, NULL, SPI_NOR_NORMAL);
+		ret = spi_nor_scan(nor, NULL, SPI_NOR_DUAL);
 		if (ret)
 			break;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 05/10] mtd: spi-nor: Add support for Macronix mx66l1g45g spi flash
  2017-04-06 16:56 [PATCH 00/10] mtd: spi-nor: aspeed: support extensions Cédric Le Goater
                   ` (3 preceding siblings ...)
  2017-04-06 16:56 ` [PATCH 04/10] mtd: spi-nor: aspeed: add support for SPI dual IO read mode Cédric Le Goater
@ 2017-04-06 16:56 ` Cédric Le Goater
  2017-04-06 19:21   ` Marek Vasut
  2017-04-06 16:56 ` [PATCH 06/10] mtd: spi-nor: add SPI_NOR_DUAL_READ to some flash devices Cédric Le Goater
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 40+ messages in thread
From: Cédric Le Goater @ 2017-04-06 16:56 UTC (permalink / raw)
  To: linux-mtd
  Cc: Cyrille Pitchen, Marek Vasut, Boris Brezillon, David Woodhouse,
	Brian Norris, Richard Weinberger, Joel Stanley,
	Cédric Le Goater

These modules are used on the OpenPOWER Witherspoon systems to hold
the POWER9 host firmware image. The SPI_NOR_DUAL_READ flags is added
for the Aspeed SoCs which do not support QUAD reads.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---

 The line is over 80 characters but it looks better.

 drivers/mtd/spi-nor/spi-nor.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/mtd/spi-nor/spi-nor.c b/drivers/mtd/spi-nor/spi-nor.c
index 747645c74134..ee777df4466c 100644
--- a/drivers/mtd/spi-nor/spi-nor.c
+++ b/drivers/mtd/spi-nor/spi-nor.c
@@ -1020,6 +1020,7 @@ static const struct flash_info spi_nor_ids[] = {
 	{ "mx25u25635f", INFO(0xc22539, 0, 64 * 1024, 512, SECT_4K) },
 	{ "mx25l25655e", INFO(0xc22619, 0, 64 * 1024, 512, 0) },
 	{ "mx66l51235l", INFO(0xc2201a, 0, 64 * 1024, 1024, SPI_NOR_QUAD_READ) },
+	{ "mx66l1g45g",  INFO(0xc2201b, 0, 64 * 1024, 2048, SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) },
 	{ "mx66l1g55g",  INFO(0xc2261b, 0, 64 * 1024, 2048, SPI_NOR_QUAD_READ) },
 
 	/* Micron */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 06/10] mtd: spi-nor: add SPI_NOR_DUAL_READ to some flash devices
  2017-04-06 16:56 [PATCH 00/10] mtd: spi-nor: aspeed: support extensions Cédric Le Goater
                   ` (4 preceding siblings ...)
  2017-04-06 16:56 ` [PATCH 05/10] mtd: spi-nor: Add support for Macronix mx66l1g45g spi flash Cédric Le Goater
@ 2017-04-06 16:56 ` Cédric Le Goater
  2017-04-06 19:22   ` Marek Vasut
  2017-04-06 16:56 ` [PATCH 07/10] mtd: spi-nor: aspeed: configure chip window on AHB bus Cédric Le Goater
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 40+ messages in thread
From: Cédric Le Goater @ 2017-04-06 16:56 UTC (permalink / raw)
  To: linux-mtd
  Cc: Cyrille Pitchen, Marek Vasut, Boris Brezillon, David Woodhouse,
	Brian Norris, Richard Weinberger, Joel Stanley,
	Cédric Le Goater

These devices are used on OpenPOWER systems.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 The lines are over 80 characters but it looks better.

 drivers/mtd/spi-nor/spi-nor.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/mtd/spi-nor/spi-nor.c b/drivers/mtd/spi-nor/spi-nor.c
index ee777df4466c..0821e3211867 100644
--- a/drivers/mtd/spi-nor/spi-nor.c
+++ b/drivers/mtd/spi-nor/spi-nor.c
@@ -1016,10 +1016,10 @@ static const struct flash_info spi_nor_ids[] = {
 	{ "mx25u6435f",  INFO(0xc22537, 0, 64 * 1024, 128, SECT_4K) },
 	{ "mx25l12805d", INFO(0xc22018, 0, 64 * 1024, 256, 0) },
 	{ "mx25l12855e", INFO(0xc22618, 0, 64 * 1024, 256, 0) },
-	{ "mx25l25635e", INFO(0xc22019, 0, 64 * 1024, 512, 0) },
+	{ "mx25l25635e", INFO(0xc22019, 0, 64 * 1024, 512, SPI_NOR_DUAL_READ) },
 	{ "mx25u25635f", INFO(0xc22539, 0, 64 * 1024, 512, SECT_4K) },
 	{ "mx25l25655e", INFO(0xc22619, 0, 64 * 1024, 512, 0) },
-	{ "mx66l51235l", INFO(0xc2201a, 0, 64 * 1024, 1024, SPI_NOR_QUAD_READ) },
+	{ "mx66l51235l", INFO(0xc2201a, 0, 64 * 1024, 1024, SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) },
 	{ "mx66l1g45g",  INFO(0xc2201b, 0, 64 * 1024, 2048, SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) },
 	{ "mx66l1g55g",  INFO(0xc2261b, 0, 64 * 1024, 2048, SPI_NOR_QUAD_READ) },
 
@@ -1031,7 +1031,7 @@ static const struct flash_info spi_nor_ids[] = {
 	{ "n25q064a",    INFO(0x20bb17, 0, 64 * 1024,  128, SECT_4K | SPI_NOR_QUAD_READ) },
 	{ "n25q128a11",  INFO(0x20bb18, 0, 64 * 1024,  256, SECT_4K | SPI_NOR_QUAD_READ) },
 	{ "n25q128a13",  INFO(0x20ba18, 0, 64 * 1024,  256, SECT_4K | SPI_NOR_QUAD_READ) },
-	{ "n25q256a",    INFO(0x20ba19, 0, 64 * 1024,  512, SECT_4K | SPI_NOR_QUAD_READ) },
+	{ "n25q256a",    INFO(0x20ba19, 0, 64 * 1024,  512, SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) },
 	{ "n25q512a",    INFO(0x20bb20, 0, 64 * 1024, 1024, SECT_4K | USE_FSR | SPI_NOR_QUAD_READ) },
 	{ "n25q512ax3",  INFO(0x20ba20, 0, 64 * 1024, 1024, SECT_4K | USE_FSR | SPI_NOR_QUAD_READ) },
 	{ "n25q00",      INFO(0x20ba21, 0, 64 * 1024, 2048, SECT_4K | USE_FSR | SPI_NOR_QUAD_READ) },
@@ -1150,7 +1150,7 @@ static const struct flash_info spi_nor_ids[] = {
 	{ "w25q80", INFO(0xef5014, 0, 64 * 1024,  16, SECT_4K) },
 	{ "w25q80bl", INFO(0xef4014, 0, 64 * 1024,  16, SECT_4K) },
 	{ "w25q128", INFO(0xef4018, 0, 64 * 1024, 256, SECT_4K) },
-	{ "w25q256", INFO(0xef4019, 0, 64 * 1024, 512, SECT_4K) },
+	{ "w25q256", INFO(0xef4019, 0, 64 * 1024, 512, SECT_4K | SPI_NOR_DUAL_READ) },
 
 	/* Catalyst / On Semiconductor -- non-JEDEC */
 	{ "cat25c11", CAT25_INFO(  16, 8, 16, 1, SPI_NOR_NO_ERASE | SPI_NOR_NO_FR) },
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 07/10] mtd: spi-nor: aspeed: configure chip window on AHB bus
  2017-04-06 16:56 [PATCH 00/10] mtd: spi-nor: aspeed: support extensions Cédric Le Goater
                   ` (5 preceding siblings ...)
  2017-04-06 16:56 ` [PATCH 06/10] mtd: spi-nor: add SPI_NOR_DUAL_READ to some flash devices Cédric Le Goater
@ 2017-04-06 16:56 ` Cédric Le Goater
  2017-04-06 19:25   ` Marek Vasut
  2017-04-06 16:56 ` [PATCH 08/10] mtd: spi-nor: aspeed: use command mode for reads Cédric Le Goater
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 40+ messages in thread
From: Cédric Le Goater @ 2017-04-06 16:56 UTC (permalink / raw)
  To: linux-mtd
  Cc: Cyrille Pitchen, Marek Vasut, Boris Brezillon, David Woodhouse,
	Brian Norris, Richard Weinberger, Joel Stanley,
	Cédric Le Goater

The segment registers of the SMC controller provide a way to configure
the mapping windows of the chips on the AHB bus. The settings are
required to be correct when the controller operates in Command mode,
which is the case for DMAs and the LPC mapping.

This tries to set the segment registers of each chip depending on the
size of the flash device and depending on the previous segment
settings, in order to have a contiguous window across multiple chip.

Unfortunately, the AST2500 SPI controller has a bug and it is not
possible to configure a full 128MB window for a chip of the same
size. The window size needs to be restricted to 120MB. This issue only
applies to CE0.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 drivers/mtd/spi-nor/aspeed-smc.c | 124 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 121 insertions(+), 3 deletions(-)

diff --git a/drivers/mtd/spi-nor/aspeed-smc.c b/drivers/mtd/spi-nor/aspeed-smc.c
index b3c8cfe29765..336a1ddd100b 100644
--- a/drivers/mtd/spi-nor/aspeed-smc.c
+++ b/drivers/mtd/spi-nor/aspeed-smc.c
@@ -104,6 +104,7 @@ struct aspeed_smc_chip {
 	struct aspeed_smc_controller *controller;
 	void __iomem *ctl;			/* control register */
 	void __iomem *ahb_base;			/* base of chip window */
+	u32 ahb_window_size;			/* chip window size */
 	unsigned long phys_base;		/* physical address of window */
 	u32 ctl_val[smc_max];			/* control settings */
 	enum aspeed_smc_flash_type type;	/* what type of flash */
@@ -218,6 +219,10 @@ struct aspeed_smc_controller {
 #define SEGMENT_ADDR_REG0		0x30
 #define SEGMENT_ADDR_START(_r)		((((_r) >> 16) & 0xFF) << 23)
 #define SEGMENT_ADDR_END(_r)		((((_r) >> 24) & 0xFF) << 23)
+#define SEGMENT_ADDR_VALUE(start, end)					\
+	(((((start) >> 23) & 0xFF) << 16) | ((((end) >> 23) & 0xFF) << 24))
+#define SEGMENT_ADDR_REG(controller, cs)	\
+	((controller)->regs + SEGMENT_ADDR_REG0 + (cs) * 4)
 
 /* DMA Registers */
 #define DMA_CONTROL_REG			0x80
@@ -604,8 +609,7 @@ static void __iomem *aspeed_smc_chip_base(struct aspeed_smc_chip *chip,
 	u32 reg;
 
 	if (controller->info->nce > 1) {
-		reg = readl(controller->regs + SEGMENT_ADDR_REG0 +
-			    chip->cs * 4);
+		reg = readl(SEGMENT_ADDR_REG(controller, chip->cs));
 
 		if (SEGMENT_ADDR_START(reg) >= SEGMENT_ADDR_END(reg))
 			return NULL;
@@ -616,6 +620,111 @@ static void __iomem *aspeed_smc_chip_base(struct aspeed_smc_chip *chip,
 	return controller->ahb_base + offset;
 }
 
+static u32 chip_set_segment(struct aspeed_smc_chip *chip, u32 cs, u32 start,
+			    u32 size)
+{
+	struct aspeed_smc_controller *controller = chip->controller;
+	void __iomem *seg_reg;
+	u32 oldval, newval, val0, start0, end;
+
+	val0 = readl(SEGMENT_ADDR_REG(controller, 0));
+	start0 = SEGMENT_ADDR_START(val0);
+
+	seg_reg = SEGMENT_ADDR_REG(controller, cs);
+	oldval = readl(seg_reg);
+
+	/* If the chip size is not specified, use the default segment
+	 * size, but take into account the possible overlap with the
+	 * previous segment
+	 */
+	if (!size)
+		size = SEGMENT_ADDR_END(oldval) - start;
+
+	/* The segment cannot exceed the maximum window size of the
+	 * controller.
+	 */
+	if (start + size > start0 + controller->info->maxsize) {
+		size = start0 + controller->info->maxsize - start;
+		dev_warn(chip->nor.dev, "CE%d window resized to %dMB",
+			 cs, size >> 20);
+	}
+
+	end = start + size;
+	newval = SEGMENT_ADDR_VALUE(start, end);
+	writel(newval, seg_reg);
+
+	/* Restore default value if something goes wrong. The chip
+	 * might have set some bogus value and we would loose access
+	 * to the chip.
+	 */
+	if (newval != readl(seg_reg)) {
+		dev_err(chip->nor.dev, "CE%d window invalid", cs);
+		writel(oldval, seg_reg);
+		start = SEGMENT_ADDR_START(oldval);
+		end = SEGMENT_ADDR_END(oldval);
+		size = end - start;
+	}
+
+	dev_info(chip->nor.dev, "CE%d window [ 0x%.8x- 0x%.8x ] %dMB",
+		 cs, start, end, size >> 20);
+
+	return size;
+}
+
+/*
+ * This is expected to be called in increasing CE order
+ */
+static u32 aspeed_smc_chip_set_segment(struct aspeed_smc_chip *chip)
+{
+	struct aspeed_smc_controller *controller = chip->controller;
+	u32  val0, start0, start;
+	u32 size = chip->nor.mtd.size;
+
+	/* The AST2500 SPI controller has a bug when the CE0 chip size
+	 * exceeds 120MB.
+	 */
+	if (chip->cs == 0 && controller->info == &spi_2500_info &&
+	    size == (128 << 20)) {
+		size = 120 << 20;
+		dev_info(chip->nor.dev,
+			 "CE%d window resized to %dMB (AST2500 HW quirk)",
+			 chip->cs, size >> 20);
+	}
+
+	val0 = readl(SEGMENT_ADDR_REG(controller, 0));
+	start0 = SEGMENT_ADDR_START(val0);
+
+	/* As a start address for the current segment, use the default
+	 * start address if we are handling CE0 or use the previous
+	 * segment ending address
+	 */
+	if (chip->cs) {
+		u32 prev = readl(SEGMENT_ADDR_REG(controller, chip->cs - 1));
+
+		start = SEGMENT_ADDR_END(prev);
+	} else
+		start = start0;
+
+	size = chip_set_segment(chip, chip->cs, start, size);
+
+	/* Update chip base address on the AHB bus */
+	chip->ahb_base = controller->ahb_base + (start - start0);
+
+	if (size < chip->nor.mtd.size)
+		dev_warn(chip->nor.dev,
+			 "CE%d window too small for chip %dMB",
+			 chip->cs, (u32) chip->nor.mtd.size >> 20);
+
+	/* Make sure the next segment does not overlap with the
+	 * current one we just configured even if there is no
+	 * available chip. That could break access in Command Mode.
+	 */
+	if (chip->cs < controller->info->nce - 1)
+		chip_set_segment(chip, chip->cs + 1, start + size, 0);
+
+	return size;
+}
+
 static void aspeed_smc_chip_enable_write(struct aspeed_smc_chip *chip)
 {
 	struct aspeed_smc_controller *controller = chip->controller;
@@ -689,7 +798,7 @@ static int aspeed_smc_chip_setup_init(struct aspeed_smc_chip *chip,
 	 */
 	chip->ahb_base = aspeed_smc_chip_base(chip, res);
 	if (!chip->ahb_base) {
-		dev_warn(chip->nor.dev, "CE segment window closed.\n");
+		dev_warn(chip->nor.dev, "CE%d window closed", chip->cs);
 		return -EINVAL;
 	}
 
@@ -738,6 +847,15 @@ static int aspeed_smc_chip_setup_finish(struct aspeed_smc_chip *chip)
 	if (chip->nor.addr_width == 4 && info->set_4b)
 		info->set_4b(chip);
 
+	/* This is for direct AHB access when using Command Mode. For
+	 * the AST2400 SPI controller which only handles one chip,
+	 * let's use the full window.
+	 */
+	if (controller->info->nce == 1)
+		chip->ahb_window_size = info->maxsize;
+	else
+		chip->ahb_window_size = aspeed_smc_chip_set_segment(chip);
+
 	/*
 	 * base mode has not been optimized yet. use it for writes.
 	 */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 08/10] mtd: spi-nor: aspeed: use command mode for reads
  2017-04-06 16:56 [PATCH 00/10] mtd: spi-nor: aspeed: support extensions Cédric Le Goater
                   ` (6 preceding siblings ...)
  2017-04-06 16:56 ` [PATCH 07/10] mtd: spi-nor: aspeed: configure chip window on AHB bus Cédric Le Goater
@ 2017-04-06 16:56 ` Cédric Le Goater
  2017-04-06 16:56 ` [PATCH 09/10] mtd: spi-nor: aspeed: link controller with the ahb clock Cédric Le Goater
  2017-04-06 16:56 ` [PATCH 10/10] mtd: spi-nor: aspeed: optimize read mode Cédric Le Goater
  9 siblings, 0 replies; 40+ messages in thread
From: Cédric Le Goater @ 2017-04-06 16:56 UTC (permalink / raw)
  To: linux-mtd
  Cc: Cyrille Pitchen, Marek Vasut, Boris Brezillon, David Woodhouse,
	Brian Norris, Richard Weinberger, Joel Stanley,
	Cédric Le Goater

When reading flash contents, try to use the "command mode" if the AHB
window configured for the flash module is big enough. Else, just fall
back to the "user mode" to perform the read.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 drivers/mtd/spi-nor/aspeed-smc.c | 42 +++++++++++++++++++++++++++++++---------
 1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/drivers/mtd/spi-nor/aspeed-smc.c b/drivers/mtd/spi-nor/aspeed-smc.c
index 336a1ddd100b..3c004dfaf873 100644
--- a/drivers/mtd/spi-nor/aspeed-smc.c
+++ b/drivers/mtd/spi-nor/aspeed-smc.c
@@ -516,16 +516,8 @@ static ssize_t aspeed_smc_read_user(struct spi_nor *nor, loff_t from,
 	struct aspeed_smc_chip *chip = nor->priv;
 	int i;
 	u8 dummy = 0xFF;
-	int ret;
 	u32 ctl;
 
-	if (aspeed_smc_dma_check(chip, from, len)) {
-		ret = aspeed_smc_dma_start(chip, from, read_buf, len, 0);
-		if (!ret)
-			goto out;
-		dev_err(chip->nor.dev, "DMA read failed: %d", ret);
-	}
-
 	aspeed_smc_start_user(nor);
 	aspeed_smc_send_cmd_addr(nor, nor->read_opcode, from);
 	for (i = 0; i < chip->nor.read_dummy / 8; i++)
@@ -540,6 +532,38 @@ static ssize_t aspeed_smc_read_user(struct spi_nor *nor, loff_t from,
 
 	aspeed_smc_read_from_ahb(read_buf, chip->ahb_base, len);
 	aspeed_smc_stop_user(nor);
+	return 0;
+}
+
+static ssize_t aspeed_smc_read(struct spi_nor *nor, loff_t from, size_t len,
+			       u_char *read_buf)
+{
+	struct aspeed_smc_chip *chip = nor->priv;
+	int ret;
+
+	/* The segment window configured for the chip is too small for
+	 * the read offset. Use the "User mode" of the controller to
+	 * perform the read.
+	 */
+	if (from >= chip->ahb_window_size) {
+		aspeed_smc_read_user(nor, from, len, read_buf);
+		goto out;
+	}
+
+	/* Then, try DMA if the driver allows them. */
+	if (aspeed_smc_dma_check(chip, from, len)) {
+		ret = aspeed_smc_dma_start(chip, from, read_buf, len, 0);
+		if (!ret)
+			goto out;
+		dev_err(chip->nor.dev, "DMA read failed: %d", ret);
+	}
+
+	/* Last, and this should be the default, use the "Command
+	 * mode" of the controller which does the read from the
+	 * segment window configured for the chip on the AHB bus.
+	 */
+	memcpy_fromio(read_buf, chip->ahb_base + from, len);
+
 out:
 	return len;
 }
@@ -948,7 +972,7 @@ static int aspeed_smc_setup_flash(struct aspeed_smc_controller *controller,
 		nor->dev = dev;
 		nor->priv = chip;
 		spi_nor_set_flash_node(nor, child);
-		nor->read = aspeed_smc_read_user;
+		nor->read = aspeed_smc_read;
 		nor->write = aspeed_smc_write_user;
 		nor->read_reg = aspeed_smc_read_reg;
 		nor->write_reg = aspeed_smc_write_reg;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 09/10] mtd: spi-nor: aspeed: link controller with the ahb clock
  2017-04-06 16:56 [PATCH 00/10] mtd: spi-nor: aspeed: support extensions Cédric Le Goater
                   ` (7 preceding siblings ...)
  2017-04-06 16:56 ` [PATCH 08/10] mtd: spi-nor: aspeed: use command mode for reads Cédric Le Goater
@ 2017-04-06 16:56 ` Cédric Le Goater
  2017-04-06 16:56 ` [PATCH 10/10] mtd: spi-nor: aspeed: optimize read mode Cédric Le Goater
  9 siblings, 0 replies; 40+ messages in thread
From: Cédric Le Goater @ 2017-04-06 16:56 UTC (permalink / raw)
  To: linux-mtd
  Cc: Cyrille Pitchen, Marek Vasut, Boris Brezillon, David Woodhouse,
	Brian Norris, Richard Weinberger, Joel Stanley,
	Cédric Le Goater

We will need the AHB frequency to set the HCLK settings in the SMC
controller to optimize the reads.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 drivers/mtd/spi-nor/aspeed-smc.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/mtd/spi-nor/aspeed-smc.c b/drivers/mtd/spi-nor/aspeed-smc.c
index 3c004dfaf873..1b398303f039 100644
--- a/drivers/mtd/spi-nor/aspeed-smc.c
+++ b/drivers/mtd/spi-nor/aspeed-smc.c
@@ -10,6 +10,7 @@
  */
 
 #include <linux/bug.h>
+#include <linux/clk.h>
 #include <linux/device.h>
 #include <linux/dma-direction.h>
 #include <linux/dma-mapping.h>
@@ -119,6 +120,8 @@ struct aspeed_smc_controller {
 	void __iomem *regs;			/* controller registers */
 	void __iomem *ahb_base;			/* per-chip windows resource */
 
+	struct clk *ahb_clk;
+
 	/* interrupt handling */
 	int irq;
 
@@ -1107,6 +1110,10 @@ static int aspeed_smc_probe(struct platform_device *pdev)
 	if (IS_ERR(controller->ahb_base))
 		return PTR_ERR(controller->ahb_base);
 
+	controller->ahb_clk = devm_clk_get(&pdev->dev, "ahb");
+	if (IS_ERR(controller->ahb_clk))
+		return PTR_ERR(controller->ahb_clk);
+
 	aspeed_smc_dma_setup(controller, pdev);
 
 	ret = aspeed_smc_setup_flash(controller, np, res);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 10/10] mtd: spi-nor: aspeed: optimize read mode
  2017-04-06 16:56 [PATCH 00/10] mtd: spi-nor: aspeed: support extensions Cédric Le Goater
                   ` (8 preceding siblings ...)
  2017-04-06 16:56 ` [PATCH 09/10] mtd: spi-nor: aspeed: link controller with the ahb clock Cédric Le Goater
@ 2017-04-06 16:56 ` Cédric Le Goater
  2017-04-06 19:28   ` Marek Vasut
  9 siblings, 1 reply; 40+ messages in thread
From: Cédric Le Goater @ 2017-04-06 16:56 UTC (permalink / raw)
  To: linux-mtd
  Cc: Cyrille Pitchen, Marek Vasut, Boris Brezillon, David Woodhouse,
	Brian Norris, Richard Weinberger, Joel Stanley,
	Cédric Le Goater

This is only for SPI controllers as U-Boot should have done it already
for the FMC controller using DMAs.

The algo is based on the one found in the OpenPOWER pflash tool. It
first reads a golden buffer at low speed and then performs reads with
different clocks and delay cycles settings to find the fastest
configuration for the chip.

It can be deactivated at boot time with the kernel parameter :

	aspeed_smc.optimize_read=0

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 drivers/mtd/spi-nor/aspeed-smc.c | 191 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 191 insertions(+)

diff --git a/drivers/mtd/spi-nor/aspeed-smc.c b/drivers/mtd/spi-nor/aspeed-smc.c
index 1b398303f039..875b029198fc 100644
--- a/drivers/mtd/spi-nor/aspeed-smc.c
+++ b/drivers/mtd/spi-nor/aspeed-smc.c
@@ -22,6 +22,7 @@
 #include <linux/mtd/spi-nor.h>
 #include <linux/of.h>
 #include <linux/of_platform.h>
+#include <linux/slab.h>
 #include <linux/sysfs.h>
 
 #define DEVICE_NAME	"aspeed-smc"
@@ -43,13 +44,17 @@ struct aspeed_smc_info {
 	bool hastype;		/* flash type field exists in config reg */
 	u8 we0;			/* shift for write enable bit for CE0 */
 	u8 ctl0;		/* offset in regs of ctl for CE0 */
+	u8 timing;		/* offset in regs of timing */
 	bool has_dma;
 
 	void (*set_4b)(struct aspeed_smc_chip *chip);
+	int (*optimize_read)(struct aspeed_smc_chip *chip, u32 max_freq);
 };
 
 static void aspeed_smc_chip_set_4b_spi_2400(struct aspeed_smc_chip *chip);
 static void aspeed_smc_chip_set_4b(struct aspeed_smc_chip *chip);
+static int aspeed_smc_optimize_read(struct aspeed_smc_chip *chip,
+				     u32 max_freq);
 
 static const struct aspeed_smc_info fmc_2400_info = {
 	.maxsize = 256 * 1024 * 1024,
@@ -57,6 +62,7 @@ static const struct aspeed_smc_info fmc_2400_info = {
 	.hastype = true,
 	.we0 = 16,
 	.ctl0 = 0x10,
+	.timing = 0x94,
 	.has_dma = true,
 	.set_4b = aspeed_smc_chip_set_4b,
 };
@@ -67,8 +73,10 @@ static const struct aspeed_smc_info spi_2400_info = {
 	.hastype = false,
 	.we0 = 0,
 	.ctl0 = 0x04,
+	.timing = 0x94,
 	.has_dma = false,
 	.set_4b = aspeed_smc_chip_set_4b_spi_2400,
+	.optimize_read = aspeed_smc_optimize_read,
 };
 
 static const struct aspeed_smc_info fmc_2500_info = {
@@ -77,6 +85,7 @@ static const struct aspeed_smc_info fmc_2500_info = {
 	.hastype = true,
 	.we0 = 16,
 	.ctl0 = 0x10,
+	.timing = 0x94,
 	.has_dma = true,
 	.set_4b = aspeed_smc_chip_set_4b,
 };
@@ -87,8 +96,10 @@ static const struct aspeed_smc_info spi_2500_info = {
 	.hastype = false,
 	.we0 = 16,
 	.ctl0 = 0x10,
+	.timing = 0x94,
 	.has_dma = false,
 	.set_4b = aspeed_smc_chip_set_4b,
+	.optimize_read = aspeed_smc_optimize_read,
 };
 
 enum aspeed_smc_ctl_reg_value {
@@ -249,6 +260,12 @@ module_param(min_dma_size, uint, 0644);
 static unsigned int dma_timeout = 200;
 module_param(dma_timeout, uint, 0644);
 
+/*
+ * Switch to turn off read optimisation if needed
+ */
+static bool optimize_read = true;
+module_param(optimize_read, bool, 0644);
+
 static void aspeed_smc_dma_done(struct aspeed_smc_controller *controller)
 {
 	writel(0, controller->regs + INTERRUPT_STATUS_REG);
@@ -865,6 +882,174 @@ static int aspeed_smc_chip_setup_init(struct aspeed_smc_chip *chip,
 	return 0;
 }
 
+
+#define CALIBRATE_BUF_SIZE 16384
+
+static bool aspeed_smc_check_reads(struct aspeed_smc_chip *chip,
+				  const u8 *golden_buf, u8 *test_buf)
+{
+	int i;
+
+	for (i = 0; i < 10; i++) {
+		aspeed_smc_read_from_ahb(test_buf, chip->ahb_base,
+					 CALIBRATE_BUF_SIZE);
+		if (memcmp(test_buf, golden_buf, CALIBRATE_BUF_SIZE) != 0)
+			return false;
+	}
+	return true;
+}
+
+static int aspeed_smc_calibrate_reads(struct aspeed_smc_chip *chip, u32 hdiv,
+				      const u8 *golden_buf, u8 *test_buf)
+{
+	struct aspeed_smc_controller *controller = chip->controller;
+	const struct aspeed_smc_info *info = controller->info;
+	int i;
+	int good_pass = -1, pass_count = 0;
+	u32 shift = (hdiv - 1) << 2;
+	u32 mask = ~(0xfu << shift);
+	u32 fread_timing_val = 0;
+
+#define FREAD_TPASS(i)	(((i) / 2) | (((i) & 1) ? 0 : 8))
+
+	/* Try HCLK delay 0..5, each one with/without delay and look for a
+	 * good pair.
+	 */
+	for (i = 0; i < 12; i++) {
+		bool pass;
+
+		fread_timing_val &= mask;
+		fread_timing_val |= FREAD_TPASS(i) << shift;
+
+		writel(fread_timing_val, controller->regs + info->timing);
+		pass = aspeed_smc_check_reads(chip, golden_buf, test_buf);
+		dev_dbg(chip->nor.dev,
+			"  * [%08x] %d HCLK delay, %dns DI delay : %s",
+			fread_timing_val, i/2, (i & 1) ? 0 : 4,
+			pass ? "PASS" : "FAIL");
+		if (pass) {
+			pass_count++;
+			if (pass_count == 3) {
+				good_pass = i - 1;
+				break;
+			}
+		} else
+			pass_count = 0;
+	}
+
+	/* No good setting for this frequency */
+	if (good_pass < 0)
+		return -1;
+
+	/* We have at least one pass of margin, let's use first pass */
+	fread_timing_val &= mask;
+	fread_timing_val |= FREAD_TPASS(good_pass) << shift;
+	writel(fread_timing_val, controller->regs + info->timing);
+	dev_dbg(chip->nor.dev, " * -> good is pass %d [0x%08x]",
+		good_pass, fread_timing_val);
+	return 0;
+}
+
+static bool aspeed_smc_check_calib_data(const u8 *test_buf, u32 size)
+{
+	const u32 *tb32 = (const u32 *) test_buf;
+	u32 i, cnt = 0;
+
+	/* We check if we have enough words that are neither all 0
+	 * nor all 1's so the calibration can be considered valid.
+	 *
+	 * I use an arbitrary threshold for now of 64
+	 */
+	size >>= 2;
+	for (i = 0; i < size; i++) {
+		if (tb32[i] != 0 && tb32[i] != 0xffffffff)
+			cnt++;
+	}
+	return cnt >= 64;
+}
+
+static const uint32_t aspeed_smc_hclk_divs[] = {
+	0xf, /* HCLK */
+	0x7, /* HCLK/2 */
+	0xe, /* HCLK/3 */
+	0x6, /* HCLK/4 */
+	0xd, /* HCLK/5 */
+};
+#define ASPEED_SMC_HCLK_DIV(i) (aspeed_smc_hclk_divs[(i) - 1] << 8)
+
+static int aspeed_smc_optimize_read(struct aspeed_smc_chip *chip,
+				     u32 max_freq)
+{
+	u8 *golden_buf, *test_buf;
+	int i, rc, best_div = -1;
+	u32 save_read_val = chip->ctl_val[smc_read];
+	u32 ahb_freq = clk_get_rate(chip->controller->ahb_clk);
+
+	dev_dbg(chip->nor.dev, "AHB frequency: %d MHz", ahb_freq / 1000000);
+
+	test_buf = kmalloc(CALIBRATE_BUF_SIZE * 2, GFP_KERNEL);
+	golden_buf = test_buf + CALIBRATE_BUF_SIZE;
+
+	/* We start with the dumbest setting (keep 4Byte bit) and read
+	 * some data
+	 */
+	chip->ctl_val[smc_read] = (chip->ctl_val[smc_read] & 0x2000) |
+		(0x00 << 28) | /* Single bit */
+		(0x00 << 24) | /* CE# max */
+		(0x03 << 16) | /* use normal reads */
+		(0x00 <<  8) | /* HCLK/16 */
+		(0x00 <<  6) | /* no dummy cycle */
+		(0x00);        /* normal read */
+
+	writel(chip->ctl_val[smc_read], chip->ctl);
+
+	aspeed_smc_read_from_ahb(golden_buf, chip->ahb_base,
+				 CALIBRATE_BUF_SIZE);
+
+	/* Establish our read mode with freq field set to 0 (HCLK/16) */
+	chip->ctl_val[smc_read] = save_read_val & 0xfffff0ff;
+
+	/* Check if calibration data is suitable */
+	if (!aspeed_smc_check_calib_data(golden_buf, CALIBRATE_BUF_SIZE)) {
+		dev_info(chip->nor.dev,
+			 "Calibration area too uniform, using low speed");
+		writel(chip->ctl_val[smc_read], chip->ctl);
+		kfree(test_buf);
+		return 0;
+	}
+
+	/* Now we iterate the HCLK dividers until we find our breaking point */
+	for (i = ARRAY_SIZE(aspeed_smc_hclk_divs); i > 0; i--) {
+		u32 tv, freq;
+
+		/* Compare timing to max */
+		freq = ahb_freq / i;
+		if (freq >= max_freq)
+			continue;
+
+		/* Set the timing */
+		tv = chip->ctl_val[smc_read] | ASPEED_SMC_HCLK_DIV(i);
+		writel(tv, chip->ctl);
+		dev_dbg(chip->nor.dev, "Trying HCLK/%d...", i);
+		rc = aspeed_smc_calibrate_reads(chip, i, golden_buf, test_buf);
+		if (rc == 0)
+			best_div = i;
+	}
+	kfree(test_buf);
+
+	/* Nothing found ? */
+	if (best_div < 0)
+		dev_warn(chip->nor.dev, "No good frequency, using dumb slow");
+	else {
+		dev_dbg(chip->nor.dev, "Found good read timings at HCLK/%d",
+			best_div);
+		chip->ctl_val[smc_read] |= ASPEED_SMC_HCLK_DIV(best_div);
+	}
+
+	writel(chip->ctl_val[smc_read], chip->ctl);
+	return 0;
+}
+
 static int aspeed_smc_chip_setup_finish(struct aspeed_smc_chip *chip)
 {
 	struct aspeed_smc_controller *controller = chip->controller;
@@ -918,6 +1103,12 @@ static int aspeed_smc_chip_setup_finish(struct aspeed_smc_chip *chip)
 
 	dev_dbg(controller->dev, "read control register: %08x\n",
 		chip->ctl_val[smc_read]);
+
+	/*
+	 * TODO: get max freq from chip
+	 */
+	if (optimize_read && info->optimize_read)
+		info->optimize_read(chip, 104000000);
 	return 0;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH 03/10] mtd: spi-nor: aspeed: add DMA support
  2017-04-06 16:56 ` [PATCH 03/10] mtd: spi-nor: aspeed: add DMA support Cédric Le Goater
@ 2017-04-06 19:07   ` Cyrille Pitchen
  2017-04-06 19:13     ` Cyrille Pitchen
  2017-04-19 14:36     ` Cédric Le Goater
  0 siblings, 2 replies; 40+ messages in thread
From: Cyrille Pitchen @ 2017-04-06 19:07 UTC (permalink / raw)
  To: Cédric Le Goater, linux-mtd
  Cc: Boris Brezillon, Richard Weinberger, Marek Vasut, Joel Stanley,
	Cyrille Pitchen, Brian Norris, David Woodhouse

Hi Cédric,

Le 06/04/2017 à 18:56, Cédric Le Goater a écrit :
> The Aspeed FMC controller can handle transfers to the flash modules
> using DMAs. A couple of registers first need to be programmed with the
> DRAM and flash addresses and the length of the transfer. The transfer
> is then initiated using a DMA control register and an interrupt
> notifies the completion.
> 
> Such transfers can replace the current IO mode in the read/write ops
> when some conditions are met on the size and the alignment. In case of
> failure, a timeout for instance, the operation is restarted using the
> IO mode.
> 
> DMA support does not seem to be that efficient. So we provide some
> sysfs files for tuning and to switch it on and off (default is off)
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>  drivers/mtd/spi-nor/aspeed-smc.c | 231 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 231 insertions(+)
> 
> diff --git a/drivers/mtd/spi-nor/aspeed-smc.c b/drivers/mtd/spi-nor/aspeed-smc.c
> index a98d454d07ed..7dfa1ea0a787 100644
> --- a/drivers/mtd/spi-nor/aspeed-smc.c
> +++ b/drivers/mtd/spi-nor/aspeed-smc.c
> @@ -11,6 +11,8 @@
>  
>  #include <linux/bug.h>
>  #include <linux/device.h>
> +#include <linux/dma-direction.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/io.h>
>  #include <linux/module.h>
>  #include <linux/mutex.h>
> @@ -40,6 +42,7 @@ struct aspeed_smc_info {
>  	bool hastype;		/* flash type field exists in config reg */
>  	u8 we0;			/* shift for write enable bit for CE0 */
>  	u8 ctl0;		/* offset in regs of ctl for CE0 */
> +	bool has_dma;
>  
>  	void (*set_4b)(struct aspeed_smc_chip *chip);
>  };
> @@ -53,6 +56,7 @@ static const struct aspeed_smc_info fmc_2400_info = {
>  	.hastype = true,
>  	.we0 = 16,
>  	.ctl0 = 0x10,
> +	.has_dma = true,
>  	.set_4b = aspeed_smc_chip_set_4b,
>  };
>  
> @@ -62,6 +66,7 @@ static const struct aspeed_smc_info spi_2400_info = {
>  	.hastype = false,
>  	.we0 = 0,
>  	.ctl0 = 0x04,
> +	.has_dma = false,
>  	.set_4b = aspeed_smc_chip_set_4b_spi_2400,
>  };
>  
> @@ -71,6 +76,7 @@ static const struct aspeed_smc_info fmc_2500_info = {
>  	.hastype = true,
>  	.we0 = 16,
>  	.ctl0 = 0x10,
> +	.has_dma = true,
>  	.set_4b = aspeed_smc_chip_set_4b,
>  };
>  
> @@ -80,6 +86,7 @@ static const struct aspeed_smc_info spi_2500_info = {
>  	.hastype = false,
>  	.we0 = 16,
>  	.ctl0 = 0x10,
> +	.has_dma = false,
>  	.set_4b = aspeed_smc_chip_set_4b,
>  };
>  
> @@ -97,6 +104,7 @@ struct aspeed_smc_chip {
>  	struct aspeed_smc_controller *controller;
>  	void __iomem *ctl;			/* control register */
>  	void __iomem *ahb_base;			/* base of chip window */
> +	unsigned long phys_base;		/* physical address of window */
>  	u32 ctl_val[smc_max];			/* control settings */
>  	enum aspeed_smc_flash_type type;	/* what type of flash */
>  	struct spi_nor nor;
> @@ -110,6 +118,18 @@ struct aspeed_smc_controller {
>  	void __iomem *regs;			/* controller registers */
>  	void __iomem *ahb_base;			/* per-chip windows resource */
>  
> +	/* interrupt handling */
> +	int irq;
> +
> +	/* dma */
> +	bool dma_enabled;
> +	struct completion dma_done;
> +
> +	/* dma logging */
> +	size_t dma_length;
> +	dma_addr_t dma_addr;			/* bus address of buffer */
> +	dma_addr_t flash_addr;			/* flash address */
> +
>  	struct aspeed_smc_chip *chips[0];	/* pointers to attached chips */
>  };
>  
> @@ -182,6 +202,11 @@ struct aspeed_smc_controller {
>  	(CONTROL_AAF_MODE | CONTROL_CE_INACTIVE_MASK | CONTROL_CLK_DIV4 | \
>  	 CONTROL_CLOCK_FREQ_SEL_MASK | CONTROL_LSB_FIRST | CONTROL_CLOCK_MODE_3)
>  
> +/* Interrupt Control and Status Register */
> +#define INTERRUPT_STATUS_REG		0x08
> +#define     INTERRUPT_DMA_ENABLE	BIT(3)
> +#define     INTERRUPT_DMA_STATUS	BIT(11)
> +
>  /*
>   * The Segment Register uses a 8MB unit to encode the start address
>   * and the end address of the mapping window of a flash SPI slave :
> @@ -194,6 +219,108 @@ struct aspeed_smc_controller {
>  #define SEGMENT_ADDR_START(_r)		((((_r) >> 16) & 0xFF) << 23)
>  #define SEGMENT_ADDR_END(_r)		((((_r) >> 24) & 0xFF) << 23)
>  
> +/* DMA Registers */
> +#define DMA_CONTROL_REG			0x80
> +#define     DMA_ENABLE			BIT(0)
> +#define     DMA_WRITE			BIT(1)
> +
> +#define DMA_FLASH_BASE_REG		0x84
> +#define DMA_DRAM_BASE_REG		0x88
> +#define DMA_LENGTH_REG			0x8c
> +
> +/*
> + * DMAs do not seem to be that fast, so disable by default
> + */
> +static bool use_dma;
> +module_param(use_dma, bool, 0644);
> +
> +static unsigned int min_dma_size = 256;
> +module_param(min_dma_size, uint, 0644);
> +
> +/* with 100ms we had a couple of timeouts */
> +static unsigned int dma_timeout = 200;
> +module_param(dma_timeout, uint, 0644);
> +
> +static void aspeed_smc_dma_done(struct aspeed_smc_controller *controller)
> +{
> +	writel(0, controller->regs + INTERRUPT_STATUS_REG);
> +	writel(0, controller->regs + DMA_CONTROL_REG);
> +}
> +
> +#define DMA_LENGTH(x) (((x) - 4) & ~0xFE000003)
> +#define DMA_ADDR(x) ((x) & ~0x00000003)
> +
> +static inline void aspeed_smc_chip_configure(struct aspeed_smc_chip *chip,
> +					     u32 ctl)
> +{
> +	ctl |= CONTROL_CE_STOP_ACTIVE_CONTROL;
> +	writel(ctl, chip->ctl);
> +
> +	ctl &= ~CONTROL_CE_STOP_ACTIVE_CONTROL;
> +	writel(ctl, chip->ctl);
> +}
> +
> +static int aspeed_smc_dma_start(struct aspeed_smc_chip *chip,
> +				u32 offset, void *buf, size_t length,
> +				int is_write)
> +{
> +	struct aspeed_smc_controller *controller = chip->controller;
> +	dma_addr_t dma_addr, flash_addr;
> +	int ret = 0;
> +
> +	aspeed_smc_chip_configure(chip, is_write ? chip->ctl_val[smc_write] :
> +		chip->ctl_val[smc_read]);
> +
> +	dev_dbg(chip->nor.dev, "DMA %s to=0x%08x len=0x%08x\n",
> +		is_write ? "write" : "read", offset, length);
> +
> +	dma_addr = dma_map_single(chip->nor.dev, buf, length,
> +				  (is_write ? DMA_TO_DEVICE : DMA_FROM_DEVICE));

big mistake! Don't worry, your not the first one to fall into this trap
and I did this mistake too in earlier versions of the Atmel QSPI
controller driver (not mainlined) ;)

So please, let me explain:

You can use dma_map_single() if 'buf' is allocated with kmalloc() for
instance because kmalloc'ed memory pages are guaranteed to be contiguous
in physical space too (not only in virtual space).
Also kmalloc'ed memory can be (un)mapped for DMA usages: when cleaning
and/or invalidating the data cache lines, the cache aliasing issue
should be avoided, hence it's safe.

On the other hand, vmalloc'ed memory pages are not guaranteed to be
contiguous in physical space and often they aren't. So just for this
only reason, you cannot use dma_map_single().


More technical explanations but you can skip the following section if
you want:

####################################################################
Then you could think of using dma_map_sg() instead. This is what the
__spi_map_sg() / spi_map_buf() functions do in the SPI sub-system: in
the case of vmalloc'ed memory, spi_map_buf() first build a
scatter-gather list from the allocated pages before calling dma_map_sg().

However the vmalloc'ed memory area is not safe about the cache alias
issue. Depending on model of the data cache it may be safe or not:

- PIPT (Physically Indexed / Physically Tagged): should be safe in any
case (data cache of ARM Cortex A5 for instance).

- VIPT (Virtually Indexed / Physically Tagged): if the size of the data
cache is "small" enough so the number of (index + offset) bits in the
virtual address is less are equal to the number of offset bits for the
smallest page size in the MMU (likely 12 bits for a 4KB MMU page size),
the cache alias issue should be avoided. Otherwise this issue may occur
so KO.

- VIVT (Virtually Indexed / Virtually Tagged): this is the worst case.
Even the tag part comes from the virtual address so OK (ARM 926 I guess).

e.g. for a 32bit address:

Depending on the cache model, the virtual and/or physical is used to
extract the TAG, INDEX and OFFSET.

31                         Y                   0
+--------------------------+--------+----------+
|           TAG            |  INDEX |   OFFSET |
+--------------------------+--------+----------+

The INDEX part is used the select the right set in the N-way set
associated cache. Then the TAG part is used to find the right cache line
within the selected set.

For a minimum MMU page size of 4096 (== 2^12) bytes, if (Y < 12),
whatever you use the physical of virtual address, the (INDEX + OFFSET)
bits are the same. Then if the TAG comes from the physical address, no
problem: even if a given physical page is mapped twice (or more) in the
virtual address space through the MMU, all those virtual addresses are
cached in the very same cache lines.
So for a VIPT cache model, it depends on the geometry (cache size, cache
line size, number of ways) of your cache.

SPI controller drivers should not rely on this "trick": it may not work
later if the same controller IP is integrated in another SoC using a
different core with a different cache model or different cache geometry.
####################################################################


All that to say, other parts of the kernel, like the ubifs layer,
allocates buffer with vmalloc() instead of kmalloc().
So you may test your code programming a UBI file-system on your SPI
flash memory and your kernel is likely to crash.


> +
> +	if (unlikely(dma_mapping_error(chip->nor.dev, dma_addr))) {
> +		dev_err(chip->nor.dev, "Failed to dma_map_single()\n");
> +		ret = -ENOMEM;
> +		goto out;
> +	}
> +	flash_addr = chip->phys_base + offset;
> +
> +	controller->dma_length = length;
> +	controller->dma_addr = dma_addr;
> +	controller->flash_addr = flash_addr;
> +
> +	reinit_completion(&controller->dma_done);
> +
> +	writel(0, controller->regs + DMA_CONTROL_REG);
> +	writel(DMA_ADDR(flash_addr), controller->regs +
> +	       DMA_FLASH_BASE_REG);
> +	writel(DMA_ADDR(dma_addr), controller->regs + DMA_DRAM_BASE_REG);
> +	writel(DMA_LENGTH(length), controller->regs + DMA_LENGTH_REG);
> +
> +	writel(INTERRUPT_DMA_ENABLE,
> +	       controller->regs + INTERRUPT_STATUS_REG);
> +
> +	writel(DMA_ENABLE | (is_write << 1),
> +	       controller->regs + DMA_CONTROL_REG);
> +
> +	if (!wait_for_completion_timeout(&controller->dma_done,
> +					 msecs_to_jiffies(dma_timeout))) {
> +		dev_err(chip->nor.dev,
> +			"DMA timeout addr@%.8x faddr@%.8x size=%x\n",
> +			controller->dma_addr,
> +			controller->flash_addr,
> +			controller->dma_length);
> +		ret = -ETIMEDOUT;
> +		aspeed_smc_dma_done(controller);
> +	}
> +
> +	dma_unmap_single(chip->nor.dev,
> +			 controller->dma_addr, controller->dma_length,
> +			 (is_write ? DMA_TO_DEVICE : DMA_FROM_DEVICE));
> +out:
> +	aspeed_smc_chip_configure(chip, chip->ctl_val[smc_read]);
> +	return ret;
> +}
> +
>  /*
>   * In user mode all data bytes read or written to the chip decode address
>   * range are transferred to or from the SPI bus. The range is treated as a
> @@ -366,12 +493,32 @@ static void aspeed_smc_send_cmd_addr(struct spi_nor *nor, u8 cmd, u32 addr)
>  	}
>  }
>  
> +/*
> + * Try DMA transfer when size and alignment are correct. In case
> + * of failure, just restart using the IO mode.
> + */
> +static int aspeed_smc_dma_check(struct aspeed_smc_chip *chip, loff_t off,
> +				size_t len)
> +{
> +	return (IS_ALIGNED(off, 4) && IS_ALIGNED(len, 4) &&
> +		len >= min_dma_size && chip->controller->dma_enabled &&
> +		use_dma);
> +}
> +

vmalloc'ed memory or other high memory regions (not DMA capable) can
still pass this check.

Best regards,

Cyrille

>  static ssize_t aspeed_smc_read_user(struct spi_nor *nor, loff_t from,
>  				    size_t len, u_char *read_buf)
>  {
>  	struct aspeed_smc_chip *chip = nor->priv;
>  	int i;
>  	u8 dummy = 0xFF;
> +	int ret;
> +
> +	if (aspeed_smc_dma_check(chip, from, len)) {
> +		ret = aspeed_smc_dma_start(chip, from, read_buf, len, 0);
> +		if (!ret)
> +			goto out;
> +		dev_err(chip->nor.dev, "DMA read failed: %d", ret);
> +	}
>  
>  	aspeed_smc_start_user(nor);
>  	aspeed_smc_send_cmd_addr(nor, nor->read_opcode, from);
> @@ -380,6 +527,7 @@ static ssize_t aspeed_smc_read_user(struct spi_nor *nor, loff_t from,
>  
>  	aspeed_smc_read_from_ahb(read_buf, chip->ahb_base, len);
>  	aspeed_smc_stop_user(nor);
> +out:
>  	return len;
>  }
>  
> @@ -387,11 +535,21 @@ static ssize_t aspeed_smc_write_user(struct spi_nor *nor, loff_t to,
>  				     size_t len, const u_char *write_buf)
>  {
>  	struct aspeed_smc_chip *chip = nor->priv;
> +	int ret;
> +
> +	if (aspeed_smc_dma_check(chip, to, len)) {
> +		ret = aspeed_smc_dma_start(chip, to, (void *)write_buf,
> +					   len, 1);
> +		if (!ret)
> +			goto out;
> +		dev_err(chip->nor.dev, "DMA write failed: %d", ret);
> +	}
>  
>  	aspeed_smc_start_user(nor);
>  	aspeed_smc_send_cmd_addr(nor, nor->program_opcode, to);
>  	aspeed_smc_write_to_ahb(chip->ahb_base, write_buf, len);
>  	aspeed_smc_stop_user(nor);
> +out:
>  	return len;
>  }
>  
> @@ -527,6 +685,8 @@ static int aspeed_smc_chip_setup_init(struct aspeed_smc_chip *chip,
>  		return -EINVAL;
>  	}
>  
> +	chip->phys_base = res->start + (chip->ahb_base - controller->ahb_base);
> +
>  	/*
>  	 * Get value of the inherited control register. U-Boot usually
>  	 * does some timing calibration on the FMC chip, so it's good
> @@ -597,6 +757,7 @@ static int aspeed_smc_chip_setup_finish(struct aspeed_smc_chip *chip)
>  	}
>  
>  	chip->ctl_val[smc_read] |= cmd |
> +		chip->nor.read_opcode << CONTROL_COMMAND_SHIFT |
>  		CONTROL_IO_DUMMY_SET(chip->nor.read_dummy / 8);
>  
>  	dev_dbg(controller->dev, "base control register: %08x\n",
> @@ -695,6 +856,74 @@ static int aspeed_smc_setup_flash(struct aspeed_smc_controller *controller,
>  	return ret;
>  }
>  
> +static irqreturn_t aspeed_smc_irq(int irq, void *arg)
> +{
> +	struct aspeed_smc_controller *controller = arg;
> +	struct device *dev = controller->dev;
> +	irqreturn_t ret = IRQ_NONE;
> +	u32 dma_ctl = readl(controller->regs + DMA_CONTROL_REG);
> +	u32 status = readl(controller->regs + INTERRUPT_STATUS_REG);
> +
> +	dev_dbg(dev, "received IRQ. status: %x", status);
> +
> +	if (!(status & INTERRUPT_DMA_ENABLE) || !(dma_ctl & DMA_ENABLE)) {
> +		dev_err(dev, "No DMA. bad IRQ status: %x", status);
> +		goto out;
> +	}
> +
> +	if (!(status & INTERRUPT_DMA_STATUS)) {
> +		dev_err(dev, "DMA still in progress. length %d\n",
> +			readl(controller->regs + DMA_LENGTH_REG));
> +		goto out;
> +	}
> +
> +	ret = IRQ_HANDLED;
> +	aspeed_smc_dma_done(controller);
> +	complete(&controller->dma_done);
> +
> +out:
> +	return ret;
> +}
> +
> +static int aspeed_smc_config_irq(struct aspeed_smc_controller *controller,
> +				 struct platform_device *pdev)
> +{
> +	struct device *dev = &pdev->dev;
> +	int rc;
> +
> +	controller->irq = platform_get_irq(pdev, 0);
> +	if (!controller->irq)
> +		return -ENODEV;
> +
> +	rc = devm_request_irq(dev, controller->irq, aspeed_smc_irq, IRQF_SHARED,
> +			      DEVICE_NAME, controller);
> +	if (rc < 0) {
> +		dev_warn(dev, "Unable to request IRQ %d\n", controller->irq);
> +		controller->irq = 0;
> +		return rc;
> +	}
> +
> +	dev_info(dev, "Using IRQ %d\n", controller->irq);
> +	return 0;
> +}
> +
> +static void aspeed_smc_dma_setup(struct aspeed_smc_controller *controller,
> +				 struct platform_device *pdev)
> +{
> +	const struct aspeed_smc_info *info = controller->info;
> +
> +	init_completion(&controller->dma_done);
> +
> +	controller->dma_enabled = false;
> +	if (info->has_dma)
> +		controller->dma_enabled = !aspeed_smc_config_irq(controller,
> +								 pdev);
> +
> +	if (controller->dma_enabled)
> +		dev_info(controller->dev, "DMA support %s.\n",
> +			 use_dma ? "enabled" : "disabled");
> +}
> +
>  static int aspeed_smc_probe(struct platform_device *pdev)
>  {
>  	struct device_node *np = pdev->dev.of_node;
> @@ -730,6 +959,8 @@ static int aspeed_smc_probe(struct platform_device *pdev)
>  	if (IS_ERR(controller->ahb_base))
>  		return PTR_ERR(controller->ahb_base);
>  
> +	aspeed_smc_dma_setup(controller, pdev);
> +
>  	ret = aspeed_smc_setup_flash(controller, np, res);
>  	if (ret)
>  		dev_err(dev, "Aspeed SMC probe failed %d\n", ret);
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 03/10] mtd: spi-nor: aspeed: add DMA support
  2017-04-06 19:07   ` Cyrille Pitchen
@ 2017-04-06 19:13     ` Cyrille Pitchen
  2017-04-19 14:36     ` Cédric Le Goater
  1 sibling, 0 replies; 40+ messages in thread
From: Cyrille Pitchen @ 2017-04-06 19:13 UTC (permalink / raw)
  To: Cédric Le Goater, linux-mtd
  Cc: Boris Brezillon, Richard Weinberger, Marek Vasut, Joel Stanley,
	Cyrille Pitchen, Brian Norris, David Woodhouse

Le 06/04/2017 à 21:07, Cyrille Pitchen a écrit :
> Hi Cédric,
> 
> Le 06/04/2017 à 18:56, Cédric Le Goater a écrit :
>> The Aspeed FMC controller can handle transfers to the flash modules
>> using DMAs. A couple of registers first need to be programmed with the
>> DRAM and flash addresses and the length of the transfer. The transfer
>> is then initiated using a DMA control register and an interrupt
>> notifies the completion.
>>
>> Such transfers can replace the current IO mode in the read/write ops
>> when some conditions are met on the size and the alignment. In case of
>> failure, a timeout for instance, the operation is restarted using the
>> IO mode.
>>
>> DMA support does not seem to be that efficient. So we provide some
>> sysfs files for tuning and to switch it on and off (default is off)
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>>  drivers/mtd/spi-nor/aspeed-smc.c | 231 +++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 231 insertions(+)
>>
>> diff --git a/drivers/mtd/spi-nor/aspeed-smc.c b/drivers/mtd/spi-nor/aspeed-smc.c
>> index a98d454d07ed..7dfa1ea0a787 100644
>> --- a/drivers/mtd/spi-nor/aspeed-smc.c
>> +++ b/drivers/mtd/spi-nor/aspeed-smc.c
>> @@ -11,6 +11,8 @@
>>  
>>  #include <linux/bug.h>
>>  #include <linux/device.h>
>> +#include <linux/dma-direction.h>
>> +#include <linux/dma-mapping.h>
>>  #include <linux/io.h>
>>  #include <linux/module.h>
>>  #include <linux/mutex.h>
>> @@ -40,6 +42,7 @@ struct aspeed_smc_info {
>>  	bool hastype;		/* flash type field exists in config reg */
>>  	u8 we0;			/* shift for write enable bit for CE0 */
>>  	u8 ctl0;		/* offset in regs of ctl for CE0 */
>> +	bool has_dma;
>>  
>>  	void (*set_4b)(struct aspeed_smc_chip *chip);
>>  };
>> @@ -53,6 +56,7 @@ static const struct aspeed_smc_info fmc_2400_info = {
>>  	.hastype = true,
>>  	.we0 = 16,
>>  	.ctl0 = 0x10,
>> +	.has_dma = true,
>>  	.set_4b = aspeed_smc_chip_set_4b,
>>  };
>>  
>> @@ -62,6 +66,7 @@ static const struct aspeed_smc_info spi_2400_info = {
>>  	.hastype = false,
>>  	.we0 = 0,
>>  	.ctl0 = 0x04,
>> +	.has_dma = false,
>>  	.set_4b = aspeed_smc_chip_set_4b_spi_2400,
>>  };
>>  
>> @@ -71,6 +76,7 @@ static const struct aspeed_smc_info fmc_2500_info = {
>>  	.hastype = true,
>>  	.we0 = 16,
>>  	.ctl0 = 0x10,
>> +	.has_dma = true,
>>  	.set_4b = aspeed_smc_chip_set_4b,
>>  };
>>  
>> @@ -80,6 +86,7 @@ static const struct aspeed_smc_info spi_2500_info = {
>>  	.hastype = false,
>>  	.we0 = 16,
>>  	.ctl0 = 0x10,
>> +	.has_dma = false,
>>  	.set_4b = aspeed_smc_chip_set_4b,
>>  };
>>  
>> @@ -97,6 +104,7 @@ struct aspeed_smc_chip {
>>  	struct aspeed_smc_controller *controller;
>>  	void __iomem *ctl;			/* control register */
>>  	void __iomem *ahb_base;			/* base of chip window */
>> +	unsigned long phys_base;		/* physical address of window */
>>  	u32 ctl_val[smc_max];			/* control settings */
>>  	enum aspeed_smc_flash_type type;	/* what type of flash */
>>  	struct spi_nor nor;
>> @@ -110,6 +118,18 @@ struct aspeed_smc_controller {
>>  	void __iomem *regs;			/* controller registers */
>>  	void __iomem *ahb_base;			/* per-chip windows resource */
>>  
>> +	/* interrupt handling */
>> +	int irq;
>> +
>> +	/* dma */
>> +	bool dma_enabled;
>> +	struct completion dma_done;
>> +
>> +	/* dma logging */
>> +	size_t dma_length;
>> +	dma_addr_t dma_addr;			/* bus address of buffer */
>> +	dma_addr_t flash_addr;			/* flash address */
>> +
>>  	struct aspeed_smc_chip *chips[0];	/* pointers to attached chips */
>>  };
>>  
>> @@ -182,6 +202,11 @@ struct aspeed_smc_controller {
>>  	(CONTROL_AAF_MODE | CONTROL_CE_INACTIVE_MASK | CONTROL_CLK_DIV4 | \
>>  	 CONTROL_CLOCK_FREQ_SEL_MASK | CONTROL_LSB_FIRST | CONTROL_CLOCK_MODE_3)
>>  
>> +/* Interrupt Control and Status Register */
>> +#define INTERRUPT_STATUS_REG		0x08
>> +#define     INTERRUPT_DMA_ENABLE	BIT(3)
>> +#define     INTERRUPT_DMA_STATUS	BIT(11)
>> +
>>  /*
>>   * The Segment Register uses a 8MB unit to encode the start address
>>   * and the end address of the mapping window of a flash SPI slave :
>> @@ -194,6 +219,108 @@ struct aspeed_smc_controller {
>>  #define SEGMENT_ADDR_START(_r)		((((_r) >> 16) & 0xFF) << 23)
>>  #define SEGMENT_ADDR_END(_r)		((((_r) >> 24) & 0xFF) << 23)
>>  
>> +/* DMA Registers */
>> +#define DMA_CONTROL_REG			0x80
>> +#define     DMA_ENABLE			BIT(0)
>> +#define     DMA_WRITE			BIT(1)
>> +
>> +#define DMA_FLASH_BASE_REG		0x84
>> +#define DMA_DRAM_BASE_REG		0x88
>> +#define DMA_LENGTH_REG			0x8c
>> +
>> +/*
>> + * DMAs do not seem to be that fast, so disable by default
>> + */
>> +static bool use_dma;
>> +module_param(use_dma, bool, 0644);
>> +
>> +static unsigned int min_dma_size = 256;
>> +module_param(min_dma_size, uint, 0644);
>> +
>> +/* with 100ms we had a couple of timeouts */
>> +static unsigned int dma_timeout = 200;
>> +module_param(dma_timeout, uint, 0644);
>> +
>> +static void aspeed_smc_dma_done(struct aspeed_smc_controller *controller)
>> +{
>> +	writel(0, controller->regs + INTERRUPT_STATUS_REG);
>> +	writel(0, controller->regs + DMA_CONTROL_REG);
>> +}
>> +
>> +#define DMA_LENGTH(x) (((x) - 4) & ~0xFE000003)
>> +#define DMA_ADDR(x) ((x) & ~0x00000003)
>> +
>> +static inline void aspeed_smc_chip_configure(struct aspeed_smc_chip *chip,
>> +					     u32 ctl)
>> +{
>> +	ctl |= CONTROL_CE_STOP_ACTIVE_CONTROL;
>> +	writel(ctl, chip->ctl);
>> +
>> +	ctl &= ~CONTROL_CE_STOP_ACTIVE_CONTROL;
>> +	writel(ctl, chip->ctl);
>> +}
>> +
>> +static int aspeed_smc_dma_start(struct aspeed_smc_chip *chip,
>> +				u32 offset, void *buf, size_t length,
>> +				int is_write)
>> +{
>> +	struct aspeed_smc_controller *controller = chip->controller;
>> +	dma_addr_t dma_addr, flash_addr;
>> +	int ret = 0;
>> +
>> +	aspeed_smc_chip_configure(chip, is_write ? chip->ctl_val[smc_write] :
>> +		chip->ctl_val[smc_read]);
>> +
>> +	dev_dbg(chip->nor.dev, "DMA %s to=0x%08x len=0x%08x\n",
>> +		is_write ? "write" : "read", offset, length);
>> +
>> +	dma_addr = dma_map_single(chip->nor.dev, buf, length,
>> +				  (is_write ? DMA_TO_DEVICE : DMA_FROM_DEVICE));
> 
> big mistake! Don't worry, your not the first one to fall into this trap
> and I did this mistake too in earlier versions of the Atmel QSPI
> controller driver (not mainlined) ;)
> 
> So please, let me explain:
> 
> You can use dma_map_single() if 'buf' is allocated with kmalloc() for
> instance because kmalloc'ed memory pages are guaranteed to be contiguous
> in physical space too (not only in virtual space).
> Also kmalloc'ed memory can be (un)mapped for DMA usages: when cleaning
> and/or invalidating the data cache lines, the cache aliasing issue
> should be avoided, hence it's safe.
> 
> On the other hand, vmalloc'ed memory pages are not guaranteed to be
> contiguous in physical space and often they aren't. So just for this
> only reason, you cannot use dma_map_single().
> 
> 
> More technical explanations but you can skip the following section if
> you want:
> 
> ####################################################################
> Then you could think of using dma_map_sg() instead. This is what the
> __spi_map_sg() / spi_map_buf() functions do in the SPI sub-system: in
> the case of vmalloc'ed memory, spi_map_buf() first build a
> scatter-gather list from the allocated pages before calling dma_map_sg().
> 
> However the vmalloc'ed memory area is not safe about the cache alias
> issue. Depending on model of the data cache it may be safe or not:
> 
> - PIPT (Physically Indexed / Physically Tagged): should be safe in any
> case (data cache of ARM Cortex A5 for instance).
> 
> - VIPT (Virtually Indexed / Physically Tagged): if the size of the data
> cache is "small" enough so the number of (index + offset) bits in the
> virtual address is less are equal to the number of offset bits for the
> smallest page size in the MMU (likely 12 bits for a 4KB MMU page size),
> the cache alias issue should be avoided. Otherwise this issue may occur
> so KO.
> 
> - VIVT (Virtually Indexed / Virtually Tagged): this is the worst case.
> Even the tag part comes from the virtual address so OK (ARM 926 I guess).
                                                 KO --^
> 
> e.g. for a 32bit address:
> 
> Depending on the cache model, the virtual and/or physical is used to
> extract the TAG, INDEX and OFFSET.
> 
> 31                         Y                   0
> +--------------------------+--------+----------+
> |           TAG            |  INDEX |   OFFSET |
> +--------------------------+--------+----------+
> 
> The INDEX part is used the select the right set in the N-way set
> associated cache. Then the TAG part is used to find the right cache line
> within the selected set.
> 
> For a minimum MMU page size of 4096 (== 2^12) bytes, if (Y < 12),
> whatever you use the physical of virtual address, the (INDEX + OFFSET)
> bits are the same. Then if the TAG comes from the physical address, no
> problem: even if a given physical page is mapped twice (or more) in the
> virtual address space through the MMU, all those virtual addresses are
> cached in the very same cache lines.
> So for a VIPT cache model, it depends on the geometry (cache size, cache
> line size, number of ways) of your cache.
> 
> SPI controller drivers should not rely on this "trick": it may not work
> later if the same controller IP is integrated in another SoC using a
> different core with a different cache model or different cache geometry.
> ####################################################################
> 
> 
> All that to say, other parts of the kernel, like the ubifs layer,
> allocates buffer with vmalloc() instead of kmalloc().
> So you may test your code programming a UBI file-system on your SPI
> flash memory and your kernel is likely to crash.
> 
> 
>> +
>> +	if (unlikely(dma_mapping_error(chip->nor.dev, dma_addr))) {
>> +		dev_err(chip->nor.dev, "Failed to dma_map_single()\n");
>> +		ret = -ENOMEM;
>> +		goto out;
>> +	}
>> +	flash_addr = chip->phys_base + offset;
>> +
>> +	controller->dma_length = length;
>> +	controller->dma_addr = dma_addr;
>> +	controller->flash_addr = flash_addr;
>> +
>> +	reinit_completion(&controller->dma_done);
>> +
>> +	writel(0, controller->regs + DMA_CONTROL_REG);
>> +	writel(DMA_ADDR(flash_addr), controller->regs +
>> +	       DMA_FLASH_BASE_REG);
>> +	writel(DMA_ADDR(dma_addr), controller->regs + DMA_DRAM_BASE_REG);
>> +	writel(DMA_LENGTH(length), controller->regs + DMA_LENGTH_REG);
>> +
>> +	writel(INTERRUPT_DMA_ENABLE,
>> +	       controller->regs + INTERRUPT_STATUS_REG);
>> +
>> +	writel(DMA_ENABLE | (is_write << 1),
>> +	       controller->regs + DMA_CONTROL_REG);
>> +
>> +	if (!wait_for_completion_timeout(&controller->dma_done,
>> +					 msecs_to_jiffies(dma_timeout))) {
>> +		dev_err(chip->nor.dev,
>> +			"DMA timeout addr@%.8x faddr@%.8x size=%x\n",
>> +			controller->dma_addr,
>> +			controller->flash_addr,
>> +			controller->dma_length);
>> +		ret = -ETIMEDOUT;
>> +		aspeed_smc_dma_done(controller);
>> +	}
>> +
>> +	dma_unmap_single(chip->nor.dev,
>> +			 controller->dma_addr, controller->dma_length,
>> +			 (is_write ? DMA_TO_DEVICE : DMA_FROM_DEVICE));
>> +out:
>> +	aspeed_smc_chip_configure(chip, chip->ctl_val[smc_read]);
>> +	return ret;
>> +}
>> +
>>  /*
>>   * In user mode all data bytes read or written to the chip decode address
>>   * range are transferred to or from the SPI bus. The range is treated as a
>> @@ -366,12 +493,32 @@ static void aspeed_smc_send_cmd_addr(struct spi_nor *nor, u8 cmd, u32 addr)
>>  	}
>>  }
>>  
>> +/*
>> + * Try DMA transfer when size and alignment are correct. In case
>> + * of failure, just restart using the IO mode.
>> + */
>> +static int aspeed_smc_dma_check(struct aspeed_smc_chip *chip, loff_t off,
>> +				size_t len)
>> +{
>> +	return (IS_ALIGNED(off, 4) && IS_ALIGNED(len, 4) &&
>> +		len >= min_dma_size && chip->controller->dma_enabled &&
>> +		use_dma);
>> +}
>> +
> 
> vmalloc'ed memory or other high memory regions (not DMA capable) can
> still pass this check.
> 
> Best regards,
> 
> Cyrille
> 
>>  static ssize_t aspeed_smc_read_user(struct spi_nor *nor, loff_t from,
>>  				    size_t len, u_char *read_buf)
>>  {
>>  	struct aspeed_smc_chip *chip = nor->priv;
>>  	int i;
>>  	u8 dummy = 0xFF;
>> +	int ret;
>> +
>> +	if (aspeed_smc_dma_check(chip, from, len)) {
>> +		ret = aspeed_smc_dma_start(chip, from, read_buf, len, 0);
>> +		if (!ret)
>> +			goto out;
>> +		dev_err(chip->nor.dev, "DMA read failed: %d", ret);
>> +	}
>>  
>>  	aspeed_smc_start_user(nor);
>>  	aspeed_smc_send_cmd_addr(nor, nor->read_opcode, from);
>> @@ -380,6 +527,7 @@ static ssize_t aspeed_smc_read_user(struct spi_nor *nor, loff_t from,
>>  
>>  	aspeed_smc_read_from_ahb(read_buf, chip->ahb_base, len);
>>  	aspeed_smc_stop_user(nor);
>> +out:
>>  	return len;
>>  }
>>  
>> @@ -387,11 +535,21 @@ static ssize_t aspeed_smc_write_user(struct spi_nor *nor, loff_t to,
>>  				     size_t len, const u_char *write_buf)
>>  {
>>  	struct aspeed_smc_chip *chip = nor->priv;
>> +	int ret;
>> +
>> +	if (aspeed_smc_dma_check(chip, to, len)) {
>> +		ret = aspeed_smc_dma_start(chip, to, (void *)write_buf,
>> +					   len, 1);
>> +		if (!ret)
>> +			goto out;
>> +		dev_err(chip->nor.dev, "DMA write failed: %d", ret);
>> +	}
>>  
>>  	aspeed_smc_start_user(nor);
>>  	aspeed_smc_send_cmd_addr(nor, nor->program_opcode, to);
>>  	aspeed_smc_write_to_ahb(chip->ahb_base, write_buf, len);
>>  	aspeed_smc_stop_user(nor);
>> +out:
>>  	return len;
>>  }
>>  
>> @@ -527,6 +685,8 @@ static int aspeed_smc_chip_setup_init(struct aspeed_smc_chip *chip,
>>  		return -EINVAL;
>>  	}
>>  
>> +	chip->phys_base = res->start + (chip->ahb_base - controller->ahb_base);
>> +
>>  	/*
>>  	 * Get value of the inherited control register. U-Boot usually
>>  	 * does some timing calibration on the FMC chip, so it's good
>> @@ -597,6 +757,7 @@ static int aspeed_smc_chip_setup_finish(struct aspeed_smc_chip *chip)
>>  	}
>>  
>>  	chip->ctl_val[smc_read] |= cmd |
>> +		chip->nor.read_opcode << CONTROL_COMMAND_SHIFT |
>>  		CONTROL_IO_DUMMY_SET(chip->nor.read_dummy / 8);
>>  
>>  	dev_dbg(controller->dev, "base control register: %08x\n",
>> @@ -695,6 +856,74 @@ static int aspeed_smc_setup_flash(struct aspeed_smc_controller *controller,
>>  	return ret;
>>  }
>>  
>> +static irqreturn_t aspeed_smc_irq(int irq, void *arg)
>> +{
>> +	struct aspeed_smc_controller *controller = arg;
>> +	struct device *dev = controller->dev;
>> +	irqreturn_t ret = IRQ_NONE;
>> +	u32 dma_ctl = readl(controller->regs + DMA_CONTROL_REG);
>> +	u32 status = readl(controller->regs + INTERRUPT_STATUS_REG);
>> +
>> +	dev_dbg(dev, "received IRQ. status: %x", status);
>> +
>> +	if (!(status & INTERRUPT_DMA_ENABLE) || !(dma_ctl & DMA_ENABLE)) {
>> +		dev_err(dev, "No DMA. bad IRQ status: %x", status);
>> +		goto out;
>> +	}
>> +
>> +	if (!(status & INTERRUPT_DMA_STATUS)) {
>> +		dev_err(dev, "DMA still in progress. length %d\n",
>> +			readl(controller->regs + DMA_LENGTH_REG));
>> +		goto out;
>> +	}
>> +
>> +	ret = IRQ_HANDLED;
>> +	aspeed_smc_dma_done(controller);
>> +	complete(&controller->dma_done);
>> +
>> +out:
>> +	return ret;
>> +}
>> +
>> +static int aspeed_smc_config_irq(struct aspeed_smc_controller *controller,
>> +				 struct platform_device *pdev)
>> +{
>> +	struct device *dev = &pdev->dev;
>> +	int rc;
>> +
>> +	controller->irq = platform_get_irq(pdev, 0);
>> +	if (!controller->irq)
>> +		return -ENODEV;
>> +
>> +	rc = devm_request_irq(dev, controller->irq, aspeed_smc_irq, IRQF_SHARED,
>> +			      DEVICE_NAME, controller);
>> +	if (rc < 0) {
>> +		dev_warn(dev, "Unable to request IRQ %d\n", controller->irq);
>> +		controller->irq = 0;
>> +		return rc;
>> +	}
>> +
>> +	dev_info(dev, "Using IRQ %d\n", controller->irq);
>> +	return 0;
>> +}
>> +
>> +static void aspeed_smc_dma_setup(struct aspeed_smc_controller *controller,
>> +				 struct platform_device *pdev)
>> +{
>> +	const struct aspeed_smc_info *info = controller->info;
>> +
>> +	init_completion(&controller->dma_done);
>> +
>> +	controller->dma_enabled = false;
>> +	if (info->has_dma)
>> +		controller->dma_enabled = !aspeed_smc_config_irq(controller,
>> +								 pdev);
>> +
>> +	if (controller->dma_enabled)
>> +		dev_info(controller->dev, "DMA support %s.\n",
>> +			 use_dma ? "enabled" : "disabled");
>> +}
>> +
>>  static int aspeed_smc_probe(struct platform_device *pdev)
>>  {
>>  	struct device_node *np = pdev->dev.of_node;
>> @@ -730,6 +959,8 @@ static int aspeed_smc_probe(struct platform_device *pdev)
>>  	if (IS_ERR(controller->ahb_base))
>>  		return PTR_ERR(controller->ahb_base);
>>  
>> +	aspeed_smc_dma_setup(controller, pdev);
>> +
>>  	ret = aspeed_smc_setup_flash(controller, np, res);
>>  	if (ret)
>>  		dev_err(dev, "Aspeed SMC probe failed %d\n", ret);
>>
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 01/10] mtd: spi-nor: aspeed: fix the AST2400 SMC window size
  2017-04-06 16:56 ` [PATCH 01/10] mtd: spi-nor: aspeed: fix the AST2400 SMC window size Cédric Le Goater
@ 2017-04-06 19:17   ` Marek Vasut
  2017-04-11 10:18     ` Cédric Le Goater
  0 siblings, 1 reply; 40+ messages in thread
From: Marek Vasut @ 2017-04-06 19:17 UTC (permalink / raw)
  To: Cédric Le Goater, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley

On 04/06/2017 06:56 PM, Cédric Le Goater wrote:
> The window of the Aspeed AST2400 SMC Controller to map chips on the
> AHB Bus has a 256MB size. 64MB is the default size for the first chip
> select and not the full window range: [ 0x20000000 - 0x2FFFFFFF ].
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>  drivers/mtd/spi-nor/aspeed-smc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/mtd/spi-nor/aspeed-smc.c b/drivers/mtd/spi-nor/aspeed-smc.c
> index 56051d30f000..aa2c647c8507 100644
> --- a/drivers/mtd/spi-nor/aspeed-smc.c
> +++ b/drivers/mtd/spi-nor/aspeed-smc.c
> @@ -48,7 +48,7 @@ static void aspeed_smc_chip_set_4b_spi_2400(struct aspeed_smc_chip *chip);
>  static void aspeed_smc_chip_set_4b(struct aspeed_smc_chip *chip);
>  
>  static const struct aspeed_smc_info fmc_2400_info = {
> -	.maxsize = 64 * 1024 * 1024,
> +	.maxsize = 256 * 1024 * 1024,

How does the second part of reg property in DT relate to this ?
DT binding example has it set to 0x02000000 = 32 MiB ...

>  	.nce = 5,
>  	.hastype = true,
>  	.we0 = 16,
> 


-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 04/10] mtd: spi-nor: aspeed: add support for SPI dual IO read mode
  2017-04-06 16:56 ` [PATCH 04/10] mtd: spi-nor: aspeed: add support for SPI dual IO read mode Cédric Le Goater
@ 2017-04-06 19:21   ` Marek Vasut
  2017-04-11  8:53     ` Cédric Le Goater
  0 siblings, 1 reply; 40+ messages in thread
From: Marek Vasut @ 2017-04-06 19:21 UTC (permalink / raw)
  To: Cédric Le Goater, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley, Robert Lippert, Robert Lippert

On 04/06/2017 06:56 PM, Cédric Le Goater wrote:
> From: Robert Lippert <roblip@gmail.com>
> 
> Implements support for the dual IO read mode on aspeed SMC/FMC
> controllers which uses both MISO and MOSI lines for data during a read
> to double the read bandwidth.
> 
> Signed-off-by: Robert Lippert <rlippert@google.com>
> [clg: adapted to mainline driver ]
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>  drivers/mtd/spi-nor/aspeed-smc.c | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/mtd/spi-nor/aspeed-smc.c b/drivers/mtd/spi-nor/aspeed-smc.c
> index 7dfa1ea0a787..b3c8cfe29765 100644
> --- a/drivers/mtd/spi-nor/aspeed-smc.c
> +++ b/drivers/mtd/spi-nor/aspeed-smc.c
> @@ -512,6 +512,7 @@ static ssize_t aspeed_smc_read_user(struct spi_nor *nor, loff_t from,
>  	int i;
>  	u8 dummy = 0xFF;
>  	int ret;
> +	u32 ctl;
>  
>  	if (aspeed_smc_dma_check(chip, from, len)) {
>  		ret = aspeed_smc_dma_start(chip, from, read_buf, len, 0);
> @@ -525,6 +526,13 @@ static ssize_t aspeed_smc_read_user(struct spi_nor *nor, loff_t from,
>  	for (i = 0; i < chip->nor.read_dummy / 8; i++)
>  		aspeed_smc_write_to_ahb(chip->ahb_base, &dummy, sizeof(dummy));
>  
> +	if (chip->nor.flash_read == SPI_NOR_DUAL) {
> +		/* Switch to dual I/O mode for data cycle */
> +		ctl = readl(chip->ctl) & ~CONTROL_IO_MODE_MASK;
> +		ctl |= CONTROL_IO_DUAL_DATA;
> +		writel(ctl, chip->ctl);
> +	}
> +
Can't you switch the mode at runtime ? If you do, who'll clear the
CONTROL_IO_DUAL_DATA in this ctl register if you switch to SINGLE
mode for some command ?

-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 05/10] mtd: spi-nor: Add support for Macronix mx66l1g45g spi flash
  2017-04-06 16:56 ` [PATCH 05/10] mtd: spi-nor: Add support for Macronix mx66l1g45g spi flash Cédric Le Goater
@ 2017-04-06 19:21   ` Marek Vasut
  0 siblings, 0 replies; 40+ messages in thread
From: Marek Vasut @ 2017-04-06 19:21 UTC (permalink / raw)
  To: Cédric Le Goater, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley

On 04/06/2017 06:56 PM, Cédric Le Goater wrote:
> These modules are used on the OpenPOWER Witherspoon systems to hold
> the POWER9 host firmware image. The SPI_NOR_DUAL_READ flags is added
> for the Aspeed SoCs which do not support QUAD reads.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>

Completely unrelated, should be sent as a separate patch.

-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 06/10] mtd: spi-nor: add SPI_NOR_DUAL_READ to some flash devices
  2017-04-06 16:56 ` [PATCH 06/10] mtd: spi-nor: add SPI_NOR_DUAL_READ to some flash devices Cédric Le Goater
@ 2017-04-06 19:22   ` Marek Vasut
  2017-04-11  8:15     ` Cédric Le Goater
  0 siblings, 1 reply; 40+ messages in thread
From: Marek Vasut @ 2017-04-06 19:22 UTC (permalink / raw)
  To: Cédric Le Goater, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley

On 04/06/2017 06:56 PM, Cédric Le Goater wrote:
> These devices are used on OpenPOWER systems.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>  The lines are over 80 characters but it looks better.
> 
>  drivers/mtd/spi-nor/spi-nor.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/mtd/spi-nor/spi-nor.c b/drivers/mtd/spi-nor/spi-nor.c
> index ee777df4466c..0821e3211867 100644
> --- a/drivers/mtd/spi-nor/spi-nor.c
> +++ b/drivers/mtd/spi-nor/spi-nor.c
> @@ -1016,10 +1016,10 @@ static const struct flash_info spi_nor_ids[] = {
>  	{ "mx25u6435f",  INFO(0xc22537, 0, 64 * 1024, 128, SECT_4K) },
>  	{ "mx25l12805d", INFO(0xc22018, 0, 64 * 1024, 256, 0) },
>  	{ "mx25l12855e", INFO(0xc22618, 0, 64 * 1024, 256, 0) },
> -	{ "mx25l25635e", INFO(0xc22019, 0, 64 * 1024, 512, 0) },
> +	{ "mx25l25635e", INFO(0xc22019, 0, 64 * 1024, 512, SPI_NOR_DUAL_READ) },
>  	{ "mx25u25635f", INFO(0xc22539, 0, 64 * 1024, 512, SECT_4K) },
>  	{ "mx25l25655e", INFO(0xc22619, 0, 64 * 1024, 512, 0) },
> -	{ "mx66l51235l", INFO(0xc2201a, 0, 64 * 1024, 1024, SPI_NOR_QUAD_READ) },
> +	{ "mx66l51235l", INFO(0xc2201a, 0, 64 * 1024, 1024, SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) },
>  	{ "mx66l1g45g",  INFO(0xc2201b, 0, 64 * 1024, 2048, SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) },
>  	{ "mx66l1g55g",  INFO(0xc2261b, 0, 64 * 1024, 2048, SPI_NOR_QUAD_READ) },
>  
> @@ -1031,7 +1031,7 @@ static const struct flash_info spi_nor_ids[] = {
>  	{ "n25q064a",    INFO(0x20bb17, 0, 64 * 1024,  128, SECT_4K | SPI_NOR_QUAD_READ) },
>  	{ "n25q128a11",  INFO(0x20bb18, 0, 64 * 1024,  256, SECT_4K | SPI_NOR_QUAD_READ) },
>  	{ "n25q128a13",  INFO(0x20ba18, 0, 64 * 1024,  256, SECT_4K | SPI_NOR_QUAD_READ) },
> -	{ "n25q256a",    INFO(0x20ba19, 0, 64 * 1024,  512, SECT_4K | SPI_NOR_QUAD_READ) },
> +	{ "n25q256a",    INFO(0x20ba19, 0, 64 * 1024,  512, SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) },
>  	{ "n25q512a",    INFO(0x20bb20, 0, 64 * 1024, 1024, SECT_4K | USE_FSR | SPI_NOR_QUAD_READ) },
>  	{ "n25q512ax3",  INFO(0x20ba20, 0, 64 * 1024, 1024, SECT_4K | USE_FSR | SPI_NOR_QUAD_READ) },
>  	{ "n25q00",      INFO(0x20ba21, 0, 64 * 1024, 2048, SECT_4K | USE_FSR | SPI_NOR_QUAD_READ) },
> @@ -1150,7 +1150,7 @@ static const struct flash_info spi_nor_ids[] = {
>  	{ "w25q80", INFO(0xef5014, 0, 64 * 1024,  16, SECT_4K) },
>  	{ "w25q80bl", INFO(0xef4014, 0, 64 * 1024,  16, SECT_4K) },
>  	{ "w25q128", INFO(0xef4018, 0, 64 * 1024, 256, SECT_4K) },
> -	{ "w25q256", INFO(0xef4019, 0, 64 * 1024, 512, SECT_4K) },
> +	{ "w25q256", INFO(0xef4019, 0, 64 * 1024, 512, SECT_4K | SPI_NOR_DUAL_READ) },

All of them probably support quad too ... also unrelated to this series,
separate patch please.

>  	/* Catalyst / On Semiconductor -- non-JEDEC */
>  	{ "cat25c11", CAT25_INFO(  16, 8, 16, 1, SPI_NOR_NO_ERASE | SPI_NOR_NO_FR) },
> 


-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 07/10] mtd: spi-nor: aspeed: configure chip window on AHB bus
  2017-04-06 16:56 ` [PATCH 07/10] mtd: spi-nor: aspeed: configure chip window on AHB bus Cédric Le Goater
@ 2017-04-06 19:25   ` Marek Vasut
  2017-04-11 10:20     ` Cédric Le Goater
  0 siblings, 1 reply; 40+ messages in thread
From: Marek Vasut @ 2017-04-06 19:25 UTC (permalink / raw)
  To: Cédric Le Goater, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley

On 04/06/2017 06:56 PM, Cédric Le Goater wrote:
> The segment registers of the SMC controller provide a way to configure
> the mapping windows of the chips on the AHB bus. The settings are
> required to be correct when the controller operates in Command mode,
> which is the case for DMAs and the LPC mapping.
> 
> This tries to set the segment registers of each chip depending on the
> size of the flash device and depending on the previous segment
> settings, in order to have a contiguous window across multiple chip.
> 
> Unfortunately, the AST2500 SPI controller has a bug and it is not
> possible to configure a full 128MB window for a chip of the same
> size. The window size needs to be restricted to 120MB. This issue only
> applies to CE0.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>  drivers/mtd/spi-nor/aspeed-smc.c | 124 ++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 121 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/mtd/spi-nor/aspeed-smc.c b/drivers/mtd/spi-nor/aspeed-smc.c
> index b3c8cfe29765..336a1ddd100b 100644
> --- a/drivers/mtd/spi-nor/aspeed-smc.c
> +++ b/drivers/mtd/spi-nor/aspeed-smc.c
> @@ -104,6 +104,7 @@ struct aspeed_smc_chip {
>  	struct aspeed_smc_controller *controller;
>  	void __iomem *ctl;			/* control register */
>  	void __iomem *ahb_base;			/* base of chip window */
> +	u32 ahb_window_size;			/* chip window size */
>  	unsigned long phys_base;		/* physical address of window */
>  	u32 ctl_val[smc_max];			/* control settings */
>  	enum aspeed_smc_flash_type type;	/* what type of flash */
> @@ -218,6 +219,10 @@ struct aspeed_smc_controller {
>  #define SEGMENT_ADDR_REG0		0x30
>  #define SEGMENT_ADDR_START(_r)		((((_r) >> 16) & 0xFF) << 23)
>  #define SEGMENT_ADDR_END(_r)		((((_r) >> 24) & 0xFF) << 23)
> +#define SEGMENT_ADDR_VALUE(start, end)					\
> +	(((((start) >> 23) & 0xFF) << 16) | ((((end) >> 23) & 0xFF) << 24))
> +#define SEGMENT_ADDR_REG(controller, cs)	\
> +	((controller)->regs + SEGMENT_ADDR_REG0 + (cs) * 4)
>  
>  /* DMA Registers */
>  #define DMA_CONTROL_REG			0x80
> @@ -604,8 +609,7 @@ static void __iomem *aspeed_smc_chip_base(struct aspeed_smc_chip *chip,
>  	u32 reg;
>  
>  	if (controller->info->nce > 1) {
> -		reg = readl(controller->regs + SEGMENT_ADDR_REG0 +
> -			    chip->cs * 4);
> +		reg = readl(SEGMENT_ADDR_REG(controller, chip->cs));
>  
>  		if (SEGMENT_ADDR_START(reg) >= SEGMENT_ADDR_END(reg))
>  			return NULL;
> @@ -616,6 +620,111 @@ static void __iomem *aspeed_smc_chip_base(struct aspeed_smc_chip *chip,
>  	return controller->ahb_base + offset;
>  }
>  
> +static u32 chip_set_segment(struct aspeed_smc_chip *chip, u32 cs, u32 start,
> +			    u32 size)
> +{
> +	struct aspeed_smc_controller *controller = chip->controller;
> +	void __iomem *seg_reg;
> +	u32 oldval, newval, val0, start0, end;

The naming of variables could use some improvement, it's really inobvious.

> +	val0 = readl(SEGMENT_ADDR_REG(controller, 0));
> +	start0 = SEGMENT_ADDR_START(val0);
> +
> +	seg_reg = SEGMENT_ADDR_REG(controller, cs);
> +	oldval = readl(seg_reg);
> +
> +	/* If the chip size is not specified, use the default segment

Doesn't checkpatch warn about this broken multi-line comment style ?
Anyway, please fix.

> +	 * size, but take into account the possible overlap with the
> +	 * previous segment
> +	 */
> +	if (!size)
> +		size = SEGMENT_ADDR_END(oldval) - start;
> +
> +	/* The segment cannot exceed the maximum window size of the
> +	 * controller.
> +	 */
> +	if (start + size > start0 + controller->info->maxsize) {
> +		size = start0 + controller->info->maxsize - start;
> +		dev_warn(chip->nor.dev, "CE%d window resized to %dMB",
> +			 cs, size >> 20);
> +	}
> +
> +	end = start + size;
> +	newval = SEGMENT_ADDR_VALUE(start, end);
> +	writel(newval, seg_reg);
> +
> +	/* Restore default value if something goes wrong. The chip
> +	 * might have set some bogus value and we would loose access
> +	 * to the chip.
> +	 */
> +	if (newval != readl(seg_reg)) {
> +		dev_err(chip->nor.dev, "CE%d window invalid", cs);
> +		writel(oldval, seg_reg);
> +		start = SEGMENT_ADDR_START(oldval);
> +		end = SEGMENT_ADDR_END(oldval);
> +		size = end - start;
> +	}
> +
> +	dev_info(chip->nor.dev, "CE%d window [ 0x%.8x- 0x%.8x ] %dMB",
> +		 cs, start, end, size >> 20);
> +
> +	return size;
> +}
> +
> +/*
> + * This is expected to be called in increasing CE order
> + */
> +static u32 aspeed_smc_chip_set_segment(struct aspeed_smc_chip *chip)
> +{
> +	struct aspeed_smc_controller *controller = chip->controller;
> +	u32  val0, start0, start;
> +	u32 size = chip->nor.mtd.size;
> +
> +	/* The AST2500 SPI controller has a bug when the CE0 chip size
> +	 * exceeds 120MB.

120 or 128 ?

> +	 */
> +	if (chip->cs == 0 && controller->info == &spi_2500_info &&
> +	    size == (128 << 20)) {

I guess you can use SZ_128M for more clarity.

> +		size = 120 << 20;
> +		dev_info(chip->nor.dev,
> +			 "CE%d window resized to %dMB (AST2500 HW quirk)",
> +			 chip->cs, size >> 20);
> +	}
> +
> +	val0 = readl(SEGMENT_ADDR_REG(controller, 0));
> +	start0 = SEGMENT_ADDR_START(val0);
> +
> +	/* As a start address for the current segment, use the default
> +	 * start address if we are handling CE0 or use the previous
> +	 * segment ending address
> +	 */
> +	if (chip->cs) {
> +		u32 prev = readl(SEGMENT_ADDR_REG(controller, chip->cs - 1));
> +
> +		start = SEGMENT_ADDR_END(prev);
> +	} else
> +		start = start0;
> +
> +	size = chip_set_segment(chip, chip->cs, start, size);
> +
> +	/* Update chip base address on the AHB bus */
> +	chip->ahb_base = controller->ahb_base + (start - start0);
> +
> +	if (size < chip->nor.mtd.size)
> +		dev_warn(chip->nor.dev,
> +			 "CE%d window too small for chip %dMB",
> +			 chip->cs, (u32) chip->nor.mtd.size >> 20);
> +
> +	/* Make sure the next segment does not overlap with the
> +	 * current one we just configured even if there is no
> +	 * available chip. That could break access in Command Mode.
> +	 */
> +	if (chip->cs < controller->info->nce - 1)
> +		chip_set_segment(chip, chip->cs + 1, start + size, 0);
> +
> +	return size;
> +}
> +
>  static void aspeed_smc_chip_enable_write(struct aspeed_smc_chip *chip)
>  {
>  	struct aspeed_smc_controller *controller = chip->controller;
> @@ -689,7 +798,7 @@ static int aspeed_smc_chip_setup_init(struct aspeed_smc_chip *chip,
>  	 */
>  	chip->ahb_base = aspeed_smc_chip_base(chip, res);
>  	if (!chip->ahb_base) {
> -		dev_warn(chip->nor.dev, "CE segment window closed.\n");
> +		dev_warn(chip->nor.dev, "CE%d window closed", chip->cs);
>  		return -EINVAL;
>  	}
>  
> @@ -738,6 +847,15 @@ static int aspeed_smc_chip_setup_finish(struct aspeed_smc_chip *chip)
>  	if (chip->nor.addr_width == 4 && info->set_4b)
>  		info->set_4b(chip);
>  
> +	/* This is for direct AHB access when using Command Mode. For

Fix the comments please

/*
 * foo
 * bar
 * baz
 */

> +	 * the AST2400 SPI controller which only handles one chip,
> +	 * let's use the full window.
> +	 */
> +	if (controller->info->nce == 1)
> +		chip->ahb_window_size = info->maxsize;
> +	else
> +		chip->ahb_window_size = aspeed_smc_chip_set_segment(chip);
> +
>  	/*
>  	 * base mode has not been optimized yet. use it for writes.
>  	 */
> 


-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 10/10] mtd: spi-nor: aspeed: optimize read mode
  2017-04-06 16:56 ` [PATCH 10/10] mtd: spi-nor: aspeed: optimize read mode Cédric Le Goater
@ 2017-04-06 19:28   ` Marek Vasut
  2017-04-11  8:13     ` Cédric Le Goater
  0 siblings, 1 reply; 40+ messages in thread
From: Marek Vasut @ 2017-04-06 19:28 UTC (permalink / raw)
  To: Cédric Le Goater, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley

On 04/06/2017 06:56 PM, Cédric Le Goater wrote:
> This is only for SPI controllers as U-Boot should have done it already
> for the FMC controller using DMAs.
> 
> The algo is based on the one found in the OpenPOWER pflash tool. It
> first reads a golden buffer at low speed and then performs reads with
> different clocks and delay cycles settings to find the fastest
> configuration for the chip.
> 
> It can be deactivated at boot time with the kernel parameter :

Something tells me this is likely gonna be pretty flaky and might lead
to unreliable configurations. The hardware manufacturer should be able
to determine the limits of the hardware and those should be put into DT
at manufacturing time IMO.

-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 10/10] mtd: spi-nor: aspeed: optimize read mode
  2017-04-06 19:28   ` Marek Vasut
@ 2017-04-11  8:13     ` Cédric Le Goater
  2017-04-11  8:23       ` Benjamin Herrenschmidt
  2017-04-14 16:18       ` Marek Vasut
  0 siblings, 2 replies; 40+ messages in thread
From: Cédric Le Goater @ 2017-04-11  8:13 UTC (permalink / raw)
  To: Marek Vasut, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley, Benjamin Herrenschmidt

Hello Marek,

On 04/06/2017 09:28 PM, Marek Vasut wrote:
> On 04/06/2017 06:56 PM, Cédric Le Goater wrote:
>> This is only for SPI controllers as U-Boot should have done it already
>> for the FMC controller using DMAs.
>>
>> The algo is based on the one found in the OpenPOWER pflash tool. It
>> first reads a golden buffer at low speed and then performs reads with
>> different clocks and delay cycles settings to find the fastest
>> configuration for the chip.
>>
>> It can be deactivated at boot time with the kernel parameter :
> 
> Something tells me this is likely gonna be pretty flaky and might lead
> to unreliable configurations. The hardware manufacturer should be able
> to determine the limits of the hardware and those should be put into DT
> at manufacturing time IMO.

It is not a common method but we have been using it on many OpenPOWER
platforms (P8) using the AST2400 (palmetto, habanero, firestone, 
garrison, etc) and also on the newer P9 platforms using the AST2500
(romulus, zaius, witherspoon). So I would say that experience 'proves' 
that it works. We didn't invent the SPI Timing Calibration algo, it is
described in the specs of the manufacturer.

Also, all these platforms use sockets to hold the different flash 
modules of the system. We don't know in advance what we will find 
and so it makes difficult to put any timing in the DT.    

And the values of the "data input delay cycle" (REG94) depend on the 
chip model and on the length of the wires of the board. If we were to 
hard-code such values in the DT, we would need to put only safe and slow 
settings known to work in all configurations. which is a bit what we 
do in the current driver using really slow ones.

Here is a proposal. We could activate this algo using a property such
as :
 
	"speed-mode" = "freq" or "auto-adjust"

How's that ? 

Thanks,

C.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 06/10] mtd: spi-nor: add SPI_NOR_DUAL_READ to some flash devices
  2017-04-06 19:22   ` Marek Vasut
@ 2017-04-11  8:15     ` Cédric Le Goater
  0 siblings, 0 replies; 40+ messages in thread
From: Cédric Le Goater @ 2017-04-11  8:15 UTC (permalink / raw)
  To: Marek Vasut, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley

On 04/06/2017 09:22 PM, Marek Vasut wrote:
> On 04/06/2017 06:56 PM, Cédric Le Goater wrote:
>> These devices are used on OpenPOWER systems.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>>  The lines are over 80 characters but it looks better.
>>
>>  drivers/mtd/spi-nor/spi-nor.c | 8 ++++----
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/mtd/spi-nor/spi-nor.c b/drivers/mtd/spi-nor/spi-nor.c
>> index ee777df4466c..0821e3211867 100644
>> --- a/drivers/mtd/spi-nor/spi-nor.c
>> +++ b/drivers/mtd/spi-nor/spi-nor.c
>> @@ -1016,10 +1016,10 @@ static const struct flash_info spi_nor_ids[] = {
>>  	{ "mx25u6435f",  INFO(0xc22537, 0, 64 * 1024, 128, SECT_4K) },
>>  	{ "mx25l12805d", INFO(0xc22018, 0, 64 * 1024, 256, 0) },
>>  	{ "mx25l12855e", INFO(0xc22618, 0, 64 * 1024, 256, 0) },
>> -	{ "mx25l25635e", INFO(0xc22019, 0, 64 * 1024, 512, 0) },
>> +	{ "mx25l25635e", INFO(0xc22019, 0, 64 * 1024, 512, SPI_NOR_DUAL_READ) },
>>  	{ "mx25u25635f", INFO(0xc22539, 0, 64 * 1024, 512, SECT_4K) },
>>  	{ "mx25l25655e", INFO(0xc22619, 0, 64 * 1024, 512, 0) },
>> -	{ "mx66l51235l", INFO(0xc2201a, 0, 64 * 1024, 1024, SPI_NOR_QUAD_READ) },
>> +	{ "mx66l51235l", INFO(0xc2201a, 0, 64 * 1024, 1024, SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) },
>>  	{ "mx66l1g45g",  INFO(0xc2201b, 0, 64 * 1024, 2048, SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) },
>>  	{ "mx66l1g55g",  INFO(0xc2261b, 0, 64 * 1024, 2048, SPI_NOR_QUAD_READ) },
>>  
>> @@ -1031,7 +1031,7 @@ static const struct flash_info spi_nor_ids[] = {
>>  	{ "n25q064a",    INFO(0x20bb17, 0, 64 * 1024,  128, SECT_4K | SPI_NOR_QUAD_READ) },
>>  	{ "n25q128a11",  INFO(0x20bb18, 0, 64 * 1024,  256, SECT_4K | SPI_NOR_QUAD_READ) },
>>  	{ "n25q128a13",  INFO(0x20ba18, 0, 64 * 1024,  256, SECT_4K | SPI_NOR_QUAD_READ) },
>> -	{ "n25q256a",    INFO(0x20ba19, 0, 64 * 1024,  512, SECT_4K | SPI_NOR_QUAD_READ) },
>> +	{ "n25q256a",    INFO(0x20ba19, 0, 64 * 1024,  512, SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) },
>>  	{ "n25q512a",    INFO(0x20bb20, 0, 64 * 1024, 1024, SECT_4K | USE_FSR | SPI_NOR_QUAD_READ) },
>>  	{ "n25q512ax3",  INFO(0x20ba20, 0, 64 * 1024, 1024, SECT_4K | USE_FSR | SPI_NOR_QUAD_READ) },
>>  	{ "n25q00",      INFO(0x20ba21, 0, 64 * 1024, 2048, SECT_4K | USE_FSR | SPI_NOR_QUAD_READ) },
>> @@ -1150,7 +1150,7 @@ static const struct flash_info spi_nor_ids[] = {
>>  	{ "w25q80", INFO(0xef5014, 0, 64 * 1024,  16, SECT_4K) },
>>  	{ "w25q80bl", INFO(0xef4014, 0, 64 * 1024,  16, SECT_4K) },
>>  	{ "w25q128", INFO(0xef4018, 0, 64 * 1024, 256, SECT_4K) },
>> -	{ "w25q256", INFO(0xef4019, 0, 64 * 1024, 512, SECT_4K) },
>> +	{ "w25q256", INFO(0xef4019, 0, 64 * 1024, 512, SECT_4K | SPI_NOR_DUAL_READ) },
> 
> All of them probably support quad too ... also unrelated to this series,
> separate patch please.

ok. I will check for quad support and resend patch 5 and 6 in a separate patchset.

Thanks

C. 

> 
>>  	/* Catalyst / On Semiconductor -- non-JEDEC */
>>  	{ "cat25c11", CAT25_INFO(  16, 8, 16, 1, SPI_NOR_NO_ERASE | SPI_NOR_NO_FR) },
>>
> 
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 10/10] mtd: spi-nor: aspeed: optimize read mode
  2017-04-11  8:13     ` Cédric Le Goater
@ 2017-04-11  8:23       ` Benjamin Herrenschmidt
  2017-04-14 16:18       ` Marek Vasut
  1 sibling, 0 replies; 40+ messages in thread
From: Benjamin Herrenschmidt @ 2017-04-11  8:23 UTC (permalink / raw)
  To: Cédric Le Goater, Marek Vasut, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley

On Tue, 2017-04-11 at 10:13 +0200, Cédric Le Goater wrote:
> 
> > Something tells me this is likely gonna be pretty flaky and might lead
> > to unreliable configurations. The hardware manufacturer should be able
> > to determine the limits of the hardware and those should be put into DT
> > at manufacturing time IMO.
> 
> It is not a common method but we have been using it on many OpenPOWER
> platforms (P8) using the AST2400 (palmetto, habanero, firestone, 
> garrison, etc) and also on the newer P9 platforms using the AST2500
> (romulus, zaius, witherspoon). So I would say that experience 'proves' 
> that it works. We didn't invent the SPI Timing Calibration algo, it is
> described in the specs of the manufacturer.

 ... Partially ;-) I wrote the original pflash so I can take the blame
for that one but yes, it's proven actually more reliable than any
attempt at hard wiring the settings for a given board/chip combination.

> Also, all these platforms use sockets to hold the different flash 
> modules of the system. We don't know in advance what we will find 
> and so it makes difficult to put any timing in the DT.    

Correct. Plus random revisions changing the trace length etc... the
problem isn't so much the SPI frequency which can theorically be
derived from the detected chip, as long as we stay within reasonable
limits. There seem to be sub-cycle read delays that need to be inserted
for reads to be reliable. The controller allow to manipulate those
delays rather precisely but the "right" value is hard to guess and
depend on a given combination of system trace length, flash chip, flash
chip revision even etc...

The algorithm we use goes through several passes over ranges of timings
and delays and picks a "safe" middle ground, ie, picks a combination
that is good *and* have both adjacent combinations good.

It also makes sure that there's a good enough distribution of 0's and
1's in the flash.

If any of the above fails, it falls back to slow and safe timings.

I suggested to Cedric the stuff he proposed below, which is to make it
a manufacturer choice to use the auto calibration or not via the DT.

However ...

> And the values of the "data input delay cycle" (REG94) depend on the 
> chip model and on the length of the wires of the board. If we were to 
> hard-code such values in the DT, we would need to put only safe and slow 
> settings known to work in all configurations. which is a bit what we 
> do in the current driver using really slow ones.
> 
> Here is a proposal. We could activate this algo using a property such
> as :
>  
> 	"speed-mode" = "freq" or "auto-adjust"

You need more than just "freq".

At the moment we use a command freq which is somewhat safe and a
different read freq. There's also a write freq which could be different
based on the flash chip. It also depends on the command mode used (fast
read, dual IO, ...).

The problem is that you need to also provide a way to specify that gate
delay correction (REG94) in addition too. My original implementation in
pflash was forced to slow commands right down (which sucks for write
and erase) because it didn't apply that correction factor to the reads
of the status register and thus was getting occasional garbage there
which can be fatal.

So I would provide a way to specify the "normal+write+erase command"
and the "read command" timing separately, and not as a frequency but as
a precise set of divider and REG94 since the exact frequency obtained
from the divider rather than the "target" frequency is what matters.

> How's that ? 
> 
> Thanks,
> 
> C.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 04/10] mtd: spi-nor: aspeed: add support for SPI dual IO read mode
  2017-04-06 19:21   ` Marek Vasut
@ 2017-04-11  8:53     ` Cédric Le Goater
  2017-04-11 10:43       ` Marek Vasut
  0 siblings, 1 reply; 40+ messages in thread
From: Cédric Le Goater @ 2017-04-11  8:53 UTC (permalink / raw)
  To: Marek Vasut, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley, Robert Lippert, Robert Lippert

On 04/06/2017 09:21 PM, Marek Vasut wrote:
> On 04/06/2017 06:56 PM, Cédric Le Goater wrote:
>> From: Robert Lippert <roblip@gmail.com>
>>
>> Implements support for the dual IO read mode on aspeed SMC/FMC
>> controllers which uses both MISO and MOSI lines for data during a read
>> to double the read bandwidth.
>>
>> Signed-off-by: Robert Lippert <rlippert@google.com>
>> [clg: adapted to mainline driver ]
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>>  drivers/mtd/spi-nor/aspeed-smc.c | 20 +++++++++++++-------
>>  1 file changed, 13 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/mtd/spi-nor/aspeed-smc.c b/drivers/mtd/spi-nor/aspeed-smc.c
>> index 7dfa1ea0a787..b3c8cfe29765 100644
>> --- a/drivers/mtd/spi-nor/aspeed-smc.c
>> +++ b/drivers/mtd/spi-nor/aspeed-smc.c
>> @@ -512,6 +512,7 @@ static ssize_t aspeed_smc_read_user(struct spi_nor *nor, loff_t from,
>>  	int i;
>>  	u8 dummy = 0xFF;
>>  	int ret;
>> +	u32 ctl;
>>  
>>  	if (aspeed_smc_dma_check(chip, from, len)) {
>>  		ret = aspeed_smc_dma_start(chip, from, read_buf, len, 0);
>> @@ -525,6 +526,13 @@ static ssize_t aspeed_smc_read_user(struct spi_nor *nor, loff_t from,
>>  	for (i = 0; i < chip->nor.read_dummy / 8; i++)
>>  		aspeed_smc_write_to_ahb(chip->ahb_base, &dummy, sizeof(dummy));
>>  
>> +	if (chip->nor.flash_read == SPI_NOR_DUAL) {
>> +		/* Switch to dual I/O mode for data cycle */
>> +		ctl = readl(chip->ctl) & ~CONTROL_IO_MODE_MASK;
>> +		ctl |= CONTROL_IO_DUAL_DATA;
>> +		writel(ctl, chip->ctl);
>> +	}
>> +
> Can't you switch the mode at runtime ? If you do, who'll clear the
> CONTROL_IO_DUAL_DATA in this ctl register if you switch to SINGLE
> mode for some command ?

This is the read routine in User mode. When the read is completed,
the previous (and default) setting of the control register is restored 
with a call to aspeed_smc_stop_user() but you don't see it in the diff 
hunk above.

The default setting of the control register is used when the controller 
is in Command mode. That is when reads are performed 'automatically' 
from the AHB bus. 

C.   

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 01/10] mtd: spi-nor: aspeed: fix the AST2400 SMC window size
  2017-04-06 19:17   ` Marek Vasut
@ 2017-04-11 10:18     ` Cédric Le Goater
  2017-04-11 10:42       ` Marek Vasut
  0 siblings, 1 reply; 40+ messages in thread
From: Cédric Le Goater @ 2017-04-11 10:18 UTC (permalink / raw)
  To: Marek Vasut, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley

On 04/06/2017 09:17 PM, Marek Vasut wrote:
> On 04/06/2017 06:56 PM, Cédric Le Goater wrote:
>> The window of the Aspeed AST2400 SMC Controller to map chips on the
>> AHB Bus has a 256MB size. 64MB is the default size for the first chip
>> select and not the full window range: [ 0x20000000 - 0x2FFFFFFF ].
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>>  drivers/mtd/spi-nor/aspeed-smc.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/mtd/spi-nor/aspeed-smc.c b/drivers/mtd/spi-nor/aspeed-smc.c
>> index 56051d30f000..aa2c647c8507 100644
>> --- a/drivers/mtd/spi-nor/aspeed-smc.c
>> +++ b/drivers/mtd/spi-nor/aspeed-smc.c
>> @@ -48,7 +48,7 @@ static void aspeed_smc_chip_set_4b_spi_2400(struct aspeed_smc_chip *chip);
>>  static void aspeed_smc_chip_set_4b(struct aspeed_smc_chip *chip);
>>  
>>  static const struct aspeed_smc_info fmc_2400_info = {
>> -	.maxsize = 64 * 1024 * 1024,
>> +	.maxsize = 256 * 1024 * 1024,
> 
> How does the second part of reg property in DT relate to this ?
> DT binding example has it set to 0x02000000 = 32 MiB ...

Thanks for pointing this out. There is an inconsistency and
a redundancy here ... The patch is wrong as I am mixing up 
different limits. Anyhow, there are some concerns to discuss. 

The DT binding is the size of the overall AHB window in which 
a controller can directly memory access the chips bound to 
itself. This is for the Command mode.

The .maxsize attribute, not being used until this patchset, 
holds the size limit of a chip that the controller supports 
to do direct memory access. This limit is 64MB on the AST2400 
and 256MB (full window) on the AST2500. 

I think that this value should be moved in the DT also ? 

The second thing to do is to retrieve the size of the overall 
AHB window from the DT and stop using the .maxsize attribute 
in the subsequent patches configuring the segment registers 
and introducing the command mode.

But the SPI controller window on the AST2400 is :

	 [ 0x30000000 - 0x3FFFFFFF ] 

which is a problem because, if I remember well, it requires us 
to configure CONFIG_VMSPLIT_2G to reserve space for the iomap.
I am not sure how to correctly handle that case, today we are 
just defining a small 32MB window.


Thanks,

C. 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 07/10] mtd: spi-nor: aspeed: configure chip window on AHB bus
  2017-04-06 19:25   ` Marek Vasut
@ 2017-04-11 10:20     ` Cédric Le Goater
  2017-04-11 10:44       ` Marek Vasut
  0 siblings, 1 reply; 40+ messages in thread
From: Cédric Le Goater @ 2017-04-11 10:20 UTC (permalink / raw)
  To: Marek Vasut, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley

On 04/06/2017 09:25 PM, Marek Vasut wrote:
> On 04/06/2017 06:56 PM, Cédric Le Goater wrote:
>> The segment registers of the SMC controller provide a way to configure
>> the mapping windows of the chips on the AHB bus. The settings are
>> required to be correct when the controller operates in Command mode,
>> which is the case for DMAs and the LPC mapping.
>>
>> This tries to set the segment registers of each chip depending on the
>> size of the flash device and depending on the previous segment
>> settings, in order to have a contiguous window across multiple chip.
>>
>> Unfortunately, the AST2500 SPI controller has a bug and it is not
>> possible to configure a full 128MB window for a chip of the same
>> size. The window size needs to be restricted to 120MB. This issue only
>> applies to CE0.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>>  drivers/mtd/spi-nor/aspeed-smc.c | 124 ++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 121 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/mtd/spi-nor/aspeed-smc.c b/drivers/mtd/spi-nor/aspeed-smc.c
>> index b3c8cfe29765..336a1ddd100b 100644
>> --- a/drivers/mtd/spi-nor/aspeed-smc.c
>> +++ b/drivers/mtd/spi-nor/aspeed-smc.c
>> @@ -104,6 +104,7 @@ struct aspeed_smc_chip {
>>  	struct aspeed_smc_controller *controller;
>>  	void __iomem *ctl;			/* control register */
>>  	void __iomem *ahb_base;			/* base of chip window */
>> +	u32 ahb_window_size;			/* chip window size */
>>  	unsigned long phys_base;		/* physical address of window */
>>  	u32 ctl_val[smc_max];			/* control settings */
>>  	enum aspeed_smc_flash_type type;	/* what type of flash */
>> @@ -218,6 +219,10 @@ struct aspeed_smc_controller {
>>  #define SEGMENT_ADDR_REG0		0x30
>>  #define SEGMENT_ADDR_START(_r)		((((_r) >> 16) & 0xFF) << 23)
>>  #define SEGMENT_ADDR_END(_r)		((((_r) >> 24) & 0xFF) << 23)
>> +#define SEGMENT_ADDR_VALUE(start, end)					\
>> +	(((((start) >> 23) & 0xFF) << 16) | ((((end) >> 23) & 0xFF) << 24))
>> +#define SEGMENT_ADDR_REG(controller, cs)	\
>> +	((controller)->regs + SEGMENT_ADDR_REG0 + (cs) * 4)
>>  
>>  /* DMA Registers */
>>  #define DMA_CONTROL_REG			0x80
>> @@ -604,8 +609,7 @@ static void __iomem *aspeed_smc_chip_base(struct aspeed_smc_chip *chip,
>>  	u32 reg;
>>  
>>  	if (controller->info->nce > 1) {
>> -		reg = readl(controller->regs + SEGMENT_ADDR_REG0 +
>> -			    chip->cs * 4);
>> +		reg = readl(SEGMENT_ADDR_REG(controller, chip->cs));
>>  
>>  		if (SEGMENT_ADDR_START(reg) >= SEGMENT_ADDR_END(reg))
>>  			return NULL;
>> @@ -616,6 +620,111 @@ static void __iomem *aspeed_smc_chip_base(struct aspeed_smc_chip *chip,
>>  	return controller->ahb_base + offset;
>>  }
>>  
>> +static u32 chip_set_segment(struct aspeed_smc_chip *chip, u32 cs, u32 start,
>> +			    u32 size)
>> +{
>> +	struct aspeed_smc_controller *controller = chip->controller;
>> +	void __iomem *seg_reg;
>> +	u32 oldval, newval, val0, start0, end;
> 
> The naming of variables could use some improvement, it's really inobvious.

OK. I agree. I first tried to have less but the algo is a little
complex.

> 
>> +	val0 = readl(SEGMENT_ADDR_REG(controller, 0));
>> +	start0 = SEGMENT_ADDR_START(val0);
>> +
>> +	seg_reg = SEGMENT_ADDR_REG(controller, cs);
>> +	oldval = readl(seg_reg);
>> +
>> +	/* If the chip size is not specified, use the default segment
> 
> Doesn't checkpatch warn about this broken multi-line comment style ?

no. It didn't.

> Anyway, please fix.

Sure.

>> +	 * size, but take into account the possible overlap with the
>> +	 * previous segment
>> +	 */
>> +	if (!size)
>> +		size = SEGMENT_ADDR_END(oldval) - start;
>> +
>> +	/* The segment cannot exceed the maximum window size of the
>> +	 * controller.
>> +	 */
>> +	if (start + size > start0 + controller->info->maxsize) {
>> +		size = start0 + controller->info->maxsize - start;
>> +		dev_warn(chip->nor.dev, "CE%d window resized to %dMB",
>> +			 cs, size >> 20);
>> +	}
>> +
>> +	end = start + size;
>> +	newval = SEGMENT_ADDR_VALUE(start, end);
>> +	writel(newval, seg_reg);
>> +
>> +	/* Restore default value if something goes wrong. The chip
>> +	 * might have set some bogus value and we would loose access
>> +	 * to the chip.
>> +	 */
>> +	if (newval != readl(seg_reg)) {
>> +		dev_err(chip->nor.dev, "CE%d window invalid", cs);
>> +		writel(oldval, seg_reg);
>> +		start = SEGMENT_ADDR_START(oldval);
>> +		end = SEGMENT_ADDR_END(oldval);
>> +		size = end - start;
>> +	}
>> +
>> +	dev_info(chip->nor.dev, "CE%d window [ 0x%.8x- 0x%.8x ] %dMB",
>> +		 cs, start, end, size >> 20);
>> +
>> +	return size;
>> +}
>> +
>> +/*
>> + * This is expected to be called in increasing CE order
>> + */
>> +static u32 aspeed_smc_chip_set_segment(struct aspeed_smc_chip *chip)
>> +{
>> +	struct aspeed_smc_controller *controller = chip->controller;
>> +	u32  val0, start0, start;
>> +	u32 size = chip->nor.mtd.size;
>> +
>> +	/* The AST2500 SPI controller has a bug when the CE0 chip size
>> +	 * exceeds 120MB.
> 
> 120 or 128 ?

120. This is a HW bug.

> 
>> +	 */
>> +	if (chip->cs == 0 && controller->info == &spi_2500_info &&
>> +	    size == (128 << 20)) {
> 
> I guess you can use SZ_128M for more clarity.

OK.


> 
>> +		size = 120 << 20;
>> +		dev_info(chip->nor.dev,
>> +			 "CE%d window resized to %dMB (AST2500 HW quirk)",
>> +			 chip->cs, size >> 20);
>> +	}
>> +
>> +	val0 = readl(SEGMENT_ADDR_REG(controller, 0));
>> +	start0 = SEGMENT_ADDR_START(val0);
>> +
>> +	/* As a start address for the current segment, use the default
>> +	 * start address if we are handling CE0 or use the previous
>> +	 * segment ending address
>> +	 */
>> +	if (chip->cs) {
>> +		u32 prev = readl(SEGMENT_ADDR_REG(controller, chip->cs - 1));
>> +
>> +		start = SEGMENT_ADDR_END(prev);
>> +	} else
>> +		start = start0;
>> +
>> +	size = chip_set_segment(chip, chip->cs, start, size);
>> +
>> +	/* Update chip base address on the AHB bus */
>> +	chip->ahb_base = controller->ahb_base + (start - start0);
>> +
>> +	if (size < chip->nor.mtd.size)
>> +		dev_warn(chip->nor.dev,
>> +			 "CE%d window too small for chip %dMB",
>> +			 chip->cs, (u32) chip->nor.mtd.size >> 20);
>> +
>> +	/* Make sure the next segment does not overlap with the
>> +	 * current one we just configured even if there is no
>> +	 * available chip. That could break access in Command Mode.
>> +	 */
>> +	if (chip->cs < controller->info->nce - 1)
>> +		chip_set_segment(chip, chip->cs + 1, start + size, 0);
>> +
>> +	return size;
>> +}
>> +
>>  static void aspeed_smc_chip_enable_write(struct aspeed_smc_chip *chip)
>>  {
>>  	struct aspeed_smc_controller *controller = chip->controller;
>> @@ -689,7 +798,7 @@ static int aspeed_smc_chip_setup_init(struct aspeed_smc_chip *chip,
>>  	 */
>>  	chip->ahb_base = aspeed_smc_chip_base(chip, res);
>>  	if (!chip->ahb_base) {
>> -		dev_warn(chip->nor.dev, "CE segment window closed.\n");
>> +		dev_warn(chip->nor.dev, "CE%d window closed", chip->cs);
>>  		return -EINVAL;
>>  	}
>>  
>> @@ -738,6 +847,15 @@ static int aspeed_smc_chip_setup_finish(struct aspeed_smc_chip *chip)
>>  	if (chip->nor.addr_width == 4 && info->set_4b)
>>  		info->set_4b(chip);
>>  
>> +	/* This is for direct AHB access when using Command Mode. For
> 
> Fix the comments please
> 
> /*
>  * foo
>  * bar
>  * baz
>  */

yes.


Thanks,

C. 

>> +	 * the AST2400 SPI controller which only handles one chip,
>> +	 * let's use the full window.
>> +	 */
>> +	if (controller->info->nce == 1)
>> +		chip->ahb_window_size = info->maxsize;
>> +	else
>> +		chip->ahb_window_size = aspeed_smc_chip_set_segment(chip);
>> +
>>  	/*
>>  	 * base mode has not been optimized yet. use it for writes.
>>  	 */
>>
> 
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 01/10] mtd: spi-nor: aspeed: fix the AST2400 SMC window size
  2017-04-11 10:18     ` Cédric Le Goater
@ 2017-04-11 10:42       ` Marek Vasut
  2017-04-11 16:10         ` Cédric Le Goater
  0 siblings, 1 reply; 40+ messages in thread
From: Marek Vasut @ 2017-04-11 10:42 UTC (permalink / raw)
  To: Cédric Le Goater, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley

On 04/11/2017 12:18 PM, Cédric Le Goater wrote:
> On 04/06/2017 09:17 PM, Marek Vasut wrote:
>> On 04/06/2017 06:56 PM, Cédric Le Goater wrote:
>>> The window of the Aspeed AST2400 SMC Controller to map chips on the
>>> AHB Bus has a 256MB size. 64MB is the default size for the first chip
>>> select and not the full window range: [ 0x20000000 - 0x2FFFFFFF ].
>>>
>>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>>> ---
>>>  drivers/mtd/spi-nor/aspeed-smc.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/mtd/spi-nor/aspeed-smc.c b/drivers/mtd/spi-nor/aspeed-smc.c
>>> index 56051d30f000..aa2c647c8507 100644
>>> --- a/drivers/mtd/spi-nor/aspeed-smc.c
>>> +++ b/drivers/mtd/spi-nor/aspeed-smc.c
>>> @@ -48,7 +48,7 @@ static void aspeed_smc_chip_set_4b_spi_2400(struct aspeed_smc_chip *chip);
>>>  static void aspeed_smc_chip_set_4b(struct aspeed_smc_chip *chip);
>>>  
>>>  static const struct aspeed_smc_info fmc_2400_info = {
>>> -	.maxsize = 64 * 1024 * 1024,
>>> +	.maxsize = 256 * 1024 * 1024,
>>
>> How does the second part of reg property in DT relate to this ?
>> DT binding example has it set to 0x02000000 = 32 MiB ...
> 
> Thanks for pointing this out. There is an inconsistency and
> a redundancy here ... The patch is wrong as I am mixing up 
> different limits. Anyhow, there are some concerns to discuss. 
> 
> The DT binding is the size of the overall AHB window in which 
> a controller can directly memory access the chips bound to 
> itself. This is for the Command mode.
> 
> The .maxsize attribute, not being used until this patchset, 
> holds the size limit of a chip that the controller supports 
> to do direct memory access. This limit is 64MB on the AST2400 
> and 256MB (full window) on the AST2500. 
> 
> I think that this value should be moved in the DT also ? 

No, I don't think so, this is a chip property which can be derived from
the compatible string, right ?

> The second thing to do is to retrieve the size of the overall 
> AHB window from the DT and stop using the .maxsize attribute 
> in the subsequent patches configuring the segment registers 
> and introducing the command mode.
> 
> But the SPI controller window on the AST2400 is :
> 
> 	 [ 0x30000000 - 0x3FFFFFFF ] 
> 
> which is a problem because, if I remember well, it requires us 
> to configure CONFIG_VMSPLIT_2G to reserve space for the iomap.
> I am not sure how to correctly handle that case, today we are 
> just defining a small 32MB window.
> 
> 
> Thanks,
> 
> C. 
> 


-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 04/10] mtd: spi-nor: aspeed: add support for SPI dual IO read mode
  2017-04-11  8:53     ` Cédric Le Goater
@ 2017-04-11 10:43       ` Marek Vasut
  0 siblings, 0 replies; 40+ messages in thread
From: Marek Vasut @ 2017-04-11 10:43 UTC (permalink / raw)
  To: Cédric Le Goater, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley, Robert Lippert, Robert Lippert

On 04/11/2017 10:53 AM, Cédric Le Goater wrote:
> On 04/06/2017 09:21 PM, Marek Vasut wrote:
>> On 04/06/2017 06:56 PM, Cédric Le Goater wrote:
>>> From: Robert Lippert <roblip@gmail.com>
>>>
>>> Implements support for the dual IO read mode on aspeed SMC/FMC
>>> controllers which uses both MISO and MOSI lines for data during a read
>>> to double the read bandwidth.
>>>
>>> Signed-off-by: Robert Lippert <rlippert@google.com>
>>> [clg: adapted to mainline driver ]
>>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>>> ---
>>>  drivers/mtd/spi-nor/aspeed-smc.c | 20 +++++++++++++-------
>>>  1 file changed, 13 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/drivers/mtd/spi-nor/aspeed-smc.c b/drivers/mtd/spi-nor/aspeed-smc.c
>>> index 7dfa1ea0a787..b3c8cfe29765 100644
>>> --- a/drivers/mtd/spi-nor/aspeed-smc.c
>>> +++ b/drivers/mtd/spi-nor/aspeed-smc.c
>>> @@ -512,6 +512,7 @@ static ssize_t aspeed_smc_read_user(struct spi_nor *nor, loff_t from,
>>>  	int i;
>>>  	u8 dummy = 0xFF;
>>>  	int ret;
>>> +	u32 ctl;
>>>  
>>>  	if (aspeed_smc_dma_check(chip, from, len)) {
>>>  		ret = aspeed_smc_dma_start(chip, from, read_buf, len, 0);
>>> @@ -525,6 +526,13 @@ static ssize_t aspeed_smc_read_user(struct spi_nor *nor, loff_t from,
>>>  	for (i = 0; i < chip->nor.read_dummy / 8; i++)
>>>  		aspeed_smc_write_to_ahb(chip->ahb_base, &dummy, sizeof(dummy));
>>>  
>>> +	if (chip->nor.flash_read == SPI_NOR_DUAL) {
>>> +		/* Switch to dual I/O mode for data cycle */
>>> +		ctl = readl(chip->ctl) & ~CONTROL_IO_MODE_MASK;
>>> +		ctl |= CONTROL_IO_DUAL_DATA;
>>> +		writel(ctl, chip->ctl);
>>> +	}
>>> +
>> Can't you switch the mode at runtime ? If you do, who'll clear the
>> CONTROL_IO_DUAL_DATA in this ctl register if you switch to SINGLE
>> mode for some command ?
> 
> This is the read routine in User mode. When the read is completed,
> the previous (and default) setting of the control register is restored 
> with a call to aspeed_smc_stop_user() but you don't see it in the diff 
> hunk above.
> 
> The default setting of the control register is used when the controller 
> is in Command mode. That is when reads are performed 'automatically' 
> from the AHB bus. 

Ah, OK, thanks for clarifying.

-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 07/10] mtd: spi-nor: aspeed: configure chip window on AHB bus
  2017-04-11 10:20     ` Cédric Le Goater
@ 2017-04-11 10:44       ` Marek Vasut
  0 siblings, 0 replies; 40+ messages in thread
From: Marek Vasut @ 2017-04-11 10:44 UTC (permalink / raw)
  To: Cédric Le Goater, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley

On 04/11/2017 12:20 PM, Cédric Le Goater wrote:
[...]
>>> +static u32 aspeed_smc_chip_set_segment(struct aspeed_smc_chip *chip)
>>> +{
>>> +	struct aspeed_smc_controller *controller = chip->controller;
>>> +	u32  val0, start0, start;
>>> +	u32 size = chip->nor.mtd.size;
>>> +
>>> +	/* The AST2500 SPI controller has a bug when the CE0 chip size
>>> +	 * exceeds 120MB.
>>
>> 120 or 128 ?
> 
> 120. This is a HW bug.
> 
Please note that in the comment then .


-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 01/10] mtd: spi-nor: aspeed: fix the AST2400 SMC window size
  2017-04-11 10:42       ` Marek Vasut
@ 2017-04-11 16:10         ` Cédric Le Goater
  0 siblings, 0 replies; 40+ messages in thread
From: Cédric Le Goater @ 2017-04-11 16:10 UTC (permalink / raw)
  To: Marek Vasut, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley

On 04/11/2017 12:42 PM, Marek Vasut wrote:
> On 04/11/2017 12:18 PM, Cédric Le Goater wrote:
>> On 04/06/2017 09:17 PM, Marek Vasut wrote:
>>> On 04/06/2017 06:56 PM, Cédric Le Goater wrote:
>>>> The window of the Aspeed AST2400 SMC Controller to map chips on the
>>>> AHB Bus has a 256MB size. 64MB is the default size for the first chip
>>>> select and not the full window range: [ 0x20000000 - 0x2FFFFFFF ].
>>>>
>>>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>>>> ---
>>>>  drivers/mtd/spi-nor/aspeed-smc.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/mtd/spi-nor/aspeed-smc.c b/drivers/mtd/spi-nor/aspeed-smc.c
>>>> index 56051d30f000..aa2c647c8507 100644
>>>> --- a/drivers/mtd/spi-nor/aspeed-smc.c
>>>> +++ b/drivers/mtd/spi-nor/aspeed-smc.c
>>>> @@ -48,7 +48,7 @@ static void aspeed_smc_chip_set_4b_spi_2400(struct aspeed_smc_chip *chip);
>>>>  static void aspeed_smc_chip_set_4b(struct aspeed_smc_chip *chip);
>>>>  
>>>>  static const struct aspeed_smc_info fmc_2400_info = {
>>>> -	.maxsize = 64 * 1024 * 1024,
>>>> +	.maxsize = 256 * 1024 * 1024,
>>>
>>> How does the second part of reg property in DT relate to this ?
>>> DT binding example has it set to 0x02000000 = 32 MiB ...
>>
>> Thanks for pointing this out. There is an inconsistency and
>> a redundancy here ... The patch is wrong as I am mixing up 
>> different limits. Anyhow, there are some concerns to discuss. 
>>
>> The DT binding is the size of the overall AHB window in which 
>> a controller can directly memory access the chips bound to 
>> itself. This is for the Command mode.
>>
>> The .maxsize attribute, not being used until this patchset, 
>> holds the size limit of a chip that the controller supports 
>> to do direct memory access. This limit is 64MB on the AST2400 
>> and 256MB (full window) on the AST2500. 
>>
>> I think that this value should be moved in the DT also ? 
> 
> No, I don't think so, this is a chip property which can be derived from
> the compatible string, right ?

yes.

C. 


>> The second thing to do is to retrieve the size of the overall 
>> AHB window from the DT and stop using the .maxsize attribute 
>> in the subsequent patches configuring the segment registers 
>> and introducing the command mode.
>>
>> But the SPI controller window on the AST2400 is :
>>
>> 	 [ 0x30000000 - 0x3FFFFFFF ] 
>>
>> which is a problem because, if I remember well, it requires us 
>> to configure CONFIG_VMSPLIT_2G to reserve space for the iomap.
>> I am not sure how to correctly handle that case, today we are 
>> just defining a small 32MB window.
>>
>>
>> Thanks,
>>
>> C. 
>>
> 
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 10/10] mtd: spi-nor: aspeed: optimize read mode
  2017-04-11  8:13     ` Cédric Le Goater
  2017-04-11  8:23       ` Benjamin Herrenschmidt
@ 2017-04-14 16:18       ` Marek Vasut
  2017-04-14 21:40         ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 40+ messages in thread
From: Marek Vasut @ 2017-04-14 16:18 UTC (permalink / raw)
  To: Cédric Le Goater, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley, Benjamin Herrenschmidt

On 04/11/2017 10:13 AM, Cédric Le Goater wrote:
> Hello Marek,
> 
> On 04/06/2017 09:28 PM, Marek Vasut wrote:
>> On 04/06/2017 06:56 PM, Cédric Le Goater wrote:
>>> This is only for SPI controllers as U-Boot should have done it already
>>> for the FMC controller using DMAs.
>>>
>>> The algo is based on the one found in the OpenPOWER pflash tool. It
>>> first reads a golden buffer at low speed and then performs reads with
>>> different clocks and delay cycles settings to find the fastest
>>> configuration for the chip.
>>>
>>> It can be deactivated at boot time with the kernel parameter :
>>
>> Something tells me this is likely gonna be pretty flaky and might lead
>> to unreliable configurations. The hardware manufacturer should be able
>> to determine the limits of the hardware and those should be put into DT
>> at manufacturing time IMO.
> 
> It is not a common method but we have been using it on many OpenPOWER
> platforms (P8) using the AST2400 (palmetto, habanero, firestone, 
> garrison, etc) and also on the newer P9 platforms using the AST2500
> (romulus, zaius, witherspoon). So I would say that experience 'proves' 
> that it works. We didn't invent the SPI Timing Calibration algo, it is
> described in the specs of the manufacturer.
> 
> Also, all these platforms use sockets to hold the different flash 
> modules of the system. We don't know in advance what we will find 
> and so it makes difficult to put any timing in the DT.    
> 
> And the values of the "data input delay cycle" (REG94) depend on the 
> chip model and on the length of the wires of the board. If we were to 
> hard-code such values in the DT, we would need to put only safe and slow 
> settings known to work in all configurations. which is a bit what we 
> do in the current driver using really slow ones.

Aha, thanks for clarifying.

> Here is a proposal. We could activate this algo using a property such
> as :
>  
> 	"speed-mode" = "freq" or "auto-adjust"
> 
> How's that ? 

Thinking about it a bit more, I think having a module parameter is the
right approach here ... or maybe compile-time switch. This shouldn't be
in DT as it's not HW property.

> Thanks,
> 
> C.
> 


-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 10/10] mtd: spi-nor: aspeed: optimize read mode
  2017-04-14 16:18       ` Marek Vasut
@ 2017-04-14 21:40         ` Benjamin Herrenschmidt
  2017-04-14 21:51           ` Marek Vasut
  0 siblings, 1 reply; 40+ messages in thread
From: Benjamin Herrenschmidt @ 2017-04-14 21:40 UTC (permalink / raw)
  To: Marek Vasut, Cédric Le Goater, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley

On Fri, 2017-04-14 at 18:18 +0200, Marek Vasut wrote:
> > Here is a proposal. We could activate this algo using a property such
> > as :
> >   
> >        "speed-mode" = "freq" or "auto-adjust"
> > 
> > How's that ? 
> 
> Thinking about it a bit more, I think having a module parameter is the
> right approach here ... or maybe compile-time switch. This shouldn't be
> in DT as it's not HW property.

Strong disagreement here :-)

DT is not *strictly* HW properties. Never was despite what some
fanatics around might say :-) Its also platform properties and can
include policies.

We put things like UART speeds in there, MAC addresses, etc... it makes
sense to put calibration info and in this case, request to perform SW
calibration.

Module parameters are crap. They are a major pain to use, they are in
practice only good for tweaking/experimenting.

Ben.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 10/10] mtd: spi-nor: aspeed: optimize read mode
  2017-04-14 21:40         ` Benjamin Herrenschmidt
@ 2017-04-14 21:51           ` Marek Vasut
  2017-04-14 22:11             ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 40+ messages in thread
From: Marek Vasut @ 2017-04-14 21:51 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Cédric Le Goater, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley, Rob Herring

On 04/14/2017 11:40 PM, Benjamin Herrenschmidt wrote:
> On Fri, 2017-04-14 at 18:18 +0200, Marek Vasut wrote:
>>> Here is a proposal. We could activate this algo using a property such
>>> as :
>>>
>>>        "speed-mode" = "freq" or "auto-adjust"
>>>
>>> How's that ?
>>
>> Thinking about it a bit more, I think having a module parameter is the
>> right approach here ... or maybe compile-time switch. This shouldn't be
>> in DT as it's not HW property.
>
> Strong disagreement here :-)
>
> DT is not *strictly* HW properties. Never was despite what some
> fanatics around might say :-) Its also platform properties and can
> include policies.

/me grabs popcorn ... :-)

> We put things like UART speeds in there, MAC addresses, etc...

UART speeds or UART max/allowed speeds ? That's basically HW property
since flaky HW might not allow all sorts of UART speed options. MAC
address is a HW property.

> it makes
> sense to put calibration info and in this case, request to perform SW
> calibration.

That's a hard question and I don't have the right answer to this.

> Module parameters are crap. They are a major pain to use, they are in
> practice only good for tweaking/experimenting.

That's correct, but then turning the calibration off would probably be
only used in such experimental setups or during HW bringup (if at all).
Based on the discussion thus far, my impression is that the preferred
and mostly used default is calibration enabled.

> Ben.
>


-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 10/10] mtd: spi-nor: aspeed: optimize read mode
  2017-04-14 21:51           ` Marek Vasut
@ 2017-04-14 22:11             ` Benjamin Herrenschmidt
  2017-04-14 22:25               ` Marek Vasut
  0 siblings, 1 reply; 40+ messages in thread
From: Benjamin Herrenschmidt @ 2017-04-14 22:11 UTC (permalink / raw)
  To: Marek Vasut, Cédric Le Goater, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley, Rob Herring

On Fri, 2017-04-14 at 23:51 +0200, Marek Vasut wrote:
> On 04/14/2017 11:40 PM, Benjamin Herrenschmidt wrote:
> > 
> > Strong disagreement here :-)
> > 
> > DT is not *strictly* HW properties. Never was despite what some
> > fanatics around might say :-) Its also platform properties and can
> > include policies.
> 
> /me grabs popcorn ... :-)

Be my guest ! I only invented the bloody thing in the first place after
all ;-) (Well, the FDT format rather, and its use in Linux, the DT
itself dates back from Open Firmware).

> > We put things like UART speeds in there, MAC addresses, etc...
> 
> UART speeds or UART max/allowed speeds ? That's basically HW property
> since flaky HW might not allow all sorts of UART speed options. MAC
> address is a HW property.

Both. The point is that there is no hard lines. Every time people come
up with hard lines, we end up with inflexible horror shows that fail to
solve practical issues on the field.

There is no good reason to forbid passing such a simple policy argument
that way. None. Other than ideological that is.

> > it makes
> > sense to put calibration info and in this case, request to perform
> > SW
> > calibration.
> 
> That's a hard question and I don't have the right answer to this.

I do, and it's fine :-)

> > Module parameters are crap. They are a major pain to use, they are
> > in
> > practice only good for tweaking/experimenting.
> 
> That's correct, but then turning the calibration off would probably
> be only used in such experimental setups or during HW bringup (if at
> all).
> Based on the discussion thus far, my impression is that thepreferred
> and mostly used default is calibration enabled.

Probably yes. So we could reverse the problem and say that we have
the calibration enabled by default, and an optional device-tree
property to force a fixed speed.

That becomes akin to what we do with Ethernet PHYs for example :)

Cheers,
Ben.

> 
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 10/10] mtd: spi-nor: aspeed: optimize read mode
  2017-04-14 22:11             ` Benjamin Herrenschmidt
@ 2017-04-14 22:25               ` Marek Vasut
  2017-04-14 22:42                 ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 40+ messages in thread
From: Marek Vasut @ 2017-04-14 22:25 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Cédric Le Goater, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley, Rob Herring

On 04/15/2017 12:11 AM, Benjamin Herrenschmidt wrote:
> On Fri, 2017-04-14 at 23:51 +0200, Marek Vasut wrote:
>> On 04/14/2017 11:40 PM, Benjamin Herrenschmidt wrote:
>>>
>>> Strong disagreement here :-)
>>>
>>> DT is not *strictly* HW properties. Never was despite what some
>>> fanatics around might say :-) Its also platform properties and can
>>> include policies.
>>
>> /me grabs popcorn ... :-)
> 
> Be my guest ! I only invented the bloody thing in the first place after
> all ;-) (Well, the FDT format rather, and its use in Linux, the DT
> itself dates back from Open Firmware).

:-)

>>> We put things like UART speeds in there, MAC addresses, etc...
>>
>> UART speeds or UART max/allowed speeds ? That's basically HW property
>> since flaky HW might not allow all sorts of UART speed options. MAC
>> address is a HW property.
> 
> Both. The point is that there is no hard lines. Every time people come
> up with hard lines, we end up with inflexible horror shows that fail to
> solve practical issues on the field.

True

> There is no good reason to forbid passing such a simple policy argument
> that way. None. Other than ideological that is.
>
>>> it makes
>>> sense to put calibration info and in this case, request to perform
>>> SW
>>> calibration.
>>
>> That's a hard question and I don't have the right answer to this.
> 
> I do, and it's fine :-)
> 
>>> Module parameters are crap. They are a major pain to use, they are
>>> in
>>> practice only good for tweaking/experimenting.
>>
>> That's correct, but then turning the calibration off would probably
>> be only used in such experimental setups or during HW bringup (if at
>> all).
>> Based on the discussion thus far, my impression is that thepreferred
>> and mostly used default is calibration enabled.
> 
> Probably yes. So we could reverse the problem and say that we have
> the calibration enabled by default, and an optional device-tree
> property to force a fixed speed.

That sounds like a good option, yes. And if it's just forcing fixed
speed, that's awesome, but I think there are more values that might need
to be passed ... I think that's what Cedric can answer.

> That becomes akin to what we do with Ethernet PHYs for example :)

Even better, I recall seeing some speed DT props in the SPI binding
docs, so we already have those in place.

> Cheers,
> Ben.
> 
>>
>>


-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 10/10] mtd: spi-nor: aspeed: optimize read mode
  2017-04-14 22:25               ` Marek Vasut
@ 2017-04-14 22:42                 ` Benjamin Herrenschmidt
  2017-04-16 18:46                   ` Marek Vasut
  2017-04-18 15:46                   ` Cédric Le Goater
  0 siblings, 2 replies; 40+ messages in thread
From: Benjamin Herrenschmidt @ 2017-04-14 22:42 UTC (permalink / raw)
  To: Marek Vasut, Cédric Le Goater, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley, Rob Herring

On Sat, 2017-04-15 at 00:25 +0200, Marek Vasut wrote:
> That sounds like a good option, yes. And if it's just forcing fixed
> speed, that's awesome, but I think there are more values that might need
> to be passed ... I think that's what Cedric can answer.

Yes, fixed and delay. Not a huge deal.

Also a module parameter makes it hard to specify different settings for
different instances of the device/flash. IE, there can be multiple
flash controllers and each of them can have multiple chip selects.

The DT is the only sane way for that.

> > That becomes akin to what we do with Ethernet PHYs for example :)
> 
> Even better, I recall seeing some speed DT props in the SPI binding
> docs, so we already have those in place.

Right though the gate delay is rather IP block specific but that's
not a huge issue.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 10/10] mtd: spi-nor: aspeed: optimize read mode
  2017-04-14 22:42                 ` Benjamin Herrenschmidt
@ 2017-04-16 18:46                   ` Marek Vasut
  2017-04-18 15:46                   ` Cédric Le Goater
  1 sibling, 0 replies; 40+ messages in thread
From: Marek Vasut @ 2017-04-16 18:46 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Cédric Le Goater, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley, Rob Herring

On 04/15/2017 12:42 AM, Benjamin Herrenschmidt wrote:
> On Sat, 2017-04-15 at 00:25 +0200, Marek Vasut wrote:
>> That sounds like a good option, yes. And if it's just forcing fixed
>> speed, that's awesome, but I think there are more values that might need
>> to be passed ... I think that's what Cedric can answer.
> 
> Yes, fixed and delay. Not a huge deal.

OK, that works.

> Also a module parameter makes it hard to specify different settings for
> different instances of the device/flash. IE, there can be multiple
> flash controllers and each of them can have multiple chip selects.
> 
> The DT is the only sane way for that.

:-)

>>> That becomes akin to what we do with Ethernet PHYs for example :)
>>
>> Even better, I recall seeing some speed DT props in the SPI binding
>> docs, so we already have those in place.
> 
> Right though the gate delay is rather IP block specific but that's
> not a huge issue.

Yep.

-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 10/10] mtd: spi-nor: aspeed: optimize read mode
  2017-04-14 22:42                 ` Benjamin Herrenschmidt
  2017-04-16 18:46                   ` Marek Vasut
@ 2017-04-18 15:46                   ` Cédric Le Goater
  2017-04-18 16:51                     ` Marek Vasut
  2017-04-18 22:41                     ` Benjamin Herrenschmidt
  1 sibling, 2 replies; 40+ messages in thread
From: Cédric Le Goater @ 2017-04-18 15:46 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Marek Vasut, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley, Rob Herring

On 04/15/2017 12:42 AM, Benjamin Herrenschmidt wrote:
> On Sat, 2017-04-15 at 00:25 +0200, Marek Vasut wrote:
>> That sounds like a good option, yes. And if it's just forcing fixed
>> speed, that's awesome, but I think there are more values that might need
>> to be passed ... I think that's what Cedric can answer.
> 
> Yes, fixed and delay. Not a huge deal.

Do we really need to specify the delays in the DT ? As we loop on 
the different delay settings, if a setting is bogus, we can just 
pick the following HCLK divider. No ? I don't think it is worth 
adding black magic properties like this in the DT. We never had to 
tune it manually AFAICT and anyway, we can always add that later 
if needed. 

So, that would leave two properties :

  - safe divider for "normal/write/erase" commands
  - speedy divider for "read" commands

If the property is not present, we would keep the low HW default, 
which is /16. 

If there is such a property, the divider value would be considered 
as a max. The range should be 1..5. Let's introduce "literal" values 
like "min" and "max" for 1 and 5 ?  

Do we care for the other HCLK settings < 5 for which we can not set
any delay ?  

> Also a module parameter makes it hard to specify different settings for
> different instances of the device/flash. IE, there can be multiple
> flash controllers and each of them can have multiple chip selects.
> 
> The DT is the only sane way for that.

Yes. Can we keep the global module parameter as a global chicken 
switch to deactivate the algo ? It could be useful, some chips tend
to freak out when hammered on a bit too much. 

Thanks,

C.


>>> That becomes akin to what we do with Ethernet PHYs for example :)
>>
>> Even better, I recall seeing some speed DT props in the SPI binding
>> docs, so we already have those in place.
> 
> Right though the gate delay is rather IP block specific but that's
> not a huge issue.
> 
> Cheers,
> Ben.
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 10/10] mtd: spi-nor: aspeed: optimize read mode
  2017-04-18 15:46                   ` Cédric Le Goater
@ 2017-04-18 16:51                     ` Marek Vasut
  2017-04-18 22:41                     ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 40+ messages in thread
From: Marek Vasut @ 2017-04-18 16:51 UTC (permalink / raw)
  To: Cédric Le Goater, Benjamin Herrenschmidt, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley, Rob Herring

On 04/18/2017 05:46 PM, Cédric Le Goater wrote:
> On 04/15/2017 12:42 AM, Benjamin Herrenschmidt wrote:
>> On Sat, 2017-04-15 at 00:25 +0200, Marek Vasut wrote:
>>> That sounds like a good option, yes. And if it's just forcing fixed
>>> speed, that's awesome, but I think there are more values that might need
>>> to be passed ... I think that's what Cedric can answer.
>>
>> Yes, fixed and delay. Not a huge deal.
> 
> Do we really need to specify the delays in the DT ? As we loop on 
> the different delay settings, if a setting is bogus, we can just 
> pick the following HCLK divider. No ? I don't think it is worth 
> adding black magic properties like this in the DT. We never had to 
> tune it manually AFAICT and anyway, we can always add that later 
> if needed. 
> 
> So, that would leave two properties :
> 
>   - safe divider for "normal/write/erase" commands
>   - speedy divider for "read" commands
> 
> If the property is not present, we would keep the low HW default, 
> which is /16. 
> 
> If there is such a property, the divider value would be considered 
> as a max. The range should be 1..5. Let's introduce "literal" values 
> like "min" and "max" for 1 and 5 ?  

Can we use the spi max frequency which is a standard property and be
done with it ?

> Do we care for the other HCLK settings < 5 for which we can not set
> any delay ?  

I cannot tell, you're the expert :)

>> Also a module parameter makes it hard to specify different settings for
>> different instances of the device/flash. IE, there can be multiple
>> flash controllers and each of them can have multiple chip selects.
>>
>> The DT is the only sane way for that.
> 
> Yes. Can we keep the global module parameter as a global chicken 
> switch to deactivate the algo ? It could be useful, some chips tend
> to freak out when hammered on a bit too much. 

Can we also use the spi-max-frequency here somehow ?

> Thanks,
> 
> C.
> 
> 
>>>> That becomes akin to what we do with Ethernet PHYs for example :)
>>>
>>> Even better, I recall seeing some speed DT props in the SPI binding
>>> docs, so we already have those in place.
>>
>> Right though the gate delay is rather IP block specific but that's
>> not a huge issue.
>>
>> Cheers,
>> Ben.
>>
> 


-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 10/10] mtd: spi-nor: aspeed: optimize read mode
  2017-04-18 15:46                   ` Cédric Le Goater
  2017-04-18 16:51                     ` Marek Vasut
@ 2017-04-18 22:41                     ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 40+ messages in thread
From: Benjamin Herrenschmidt @ 2017-04-18 22:41 UTC (permalink / raw)
  To: Cédric Le Goater, Marek Vasut, linux-mtd
  Cc: Cyrille Pitchen, Boris Brezillon, David Woodhouse, Brian Norris,
	Richard Weinberger, Joel Stanley, Rob Herring

On Tue, 2017-04-18 at 17:46 +0200, Cédric Le Goater wrote:
> On 04/15/2017 12:42 AM, Benjamin Herrenschmidt wrote:
> > On Sat, 2017-04-15 at 00:25 +0200, Marek Vasut wrote:
> > > That sounds like a good option, yes. And if it's just forcing
> > > fixed
> > > speed, that's awesome, but I think there are more values that
> > > might need
> > > to be passed ... I think that's what Cedric can answer.
> > 
> > Yes, fixed and delay. Not a huge deal.
> 
> Do we really need to specify the delays in the DT ? As we loop on 
> the different delay settings, if a setting is bogus, we can just 
> pick the following HCLK divider. No ? I don't think it is worth 
> adding black magic properties like this in the DT. We never had to 
> tune it manually AFAICT and anyway, we can always add that later 
> if needed. 
> 
> So, that would leave two properties :
> 
>   - safe divider for "normal/write/erase" commands
>   - speedy divider for "read" commands

When the properties are present, we don't calibrate. So we need the
delay being specified. There's no "black magic" about it, it's about
specifying the appropriate configuration of the controller for the
given motherboard layout/flashchip/speed.

> If the property is not present, we would keep the low HW default, 
> which is /16. 

No misunderstood. The default is to calibrate. The properties allow you
to specify a fixed speed/delay combination in case the system has been
fully calibrated in the factory.

> If there is such a property, the divider value would be considered 
> as a max. The range should be 1..5. Let's introduce "literal" values 
> like "min" and "max" for 1 and 5 ?  
> 
> Do we care for the other HCLK settings < 5 for which we can not set
> any delay ?  

> > Also a module parameter makes it hard to specify different settings
> > for
> > different instances of the device/flash. IE, there can be multiple
> > flash controllers and each of them can have multiple chip selects.
> > 
> > The DT is the only sane way for that.
> 
> Yes. Can we keep the global module parameter as a global chicken 
> switch to deactivate the algo ? It could be useful, some chips tend
> to freak out when hammered on a bit too much. 

Yes, that's useful for diagnosis when things go wrong.

Cheers,
Ben.

> Thanks,
> 
> C.
> 
> 
> > > > That becomes akin to what we do with Ethernet PHYs for example
> > > > :)
> > > 
> > > Even better, I recall seeing some speed DT props in the SPI
> > > binding
> > > docs, so we already have those in place.
> > 
> > Right though the gate delay is rather IP block specific but that's
> > not a huge issue.
> > 
> > Cheers,
> > Ben.
> > 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 03/10] mtd: spi-nor: aspeed: add DMA support
  2017-04-06 19:07   ` Cyrille Pitchen
  2017-04-06 19:13     ` Cyrille Pitchen
@ 2017-04-19 14:36     ` Cédric Le Goater
  1 sibling, 0 replies; 40+ messages in thread
From: Cédric Le Goater @ 2017-04-19 14:36 UTC (permalink / raw)
  To: Cyrille Pitchen, linux-mtd
  Cc: Boris Brezillon, Richard Weinberger, Marek Vasut, Joel Stanley,
	Cyrille Pitchen, Brian Norris, David Woodhouse

On 04/06/2017 09:07 PM, Cyrille Pitchen wrote:
> Hi Cédric,
> 
> Le 06/04/2017 à 18:56, Cédric Le Goater a écrit :
>> The Aspeed FMC controller can handle transfers to the flash modules
>> using DMAs. A couple of registers first need to be programmed with the
>> DRAM and flash addresses and the length of the transfer. The transfer
>> is then initiated using a DMA control register and an interrupt
>> notifies the completion.
>>
>> Such transfers can replace the current IO mode in the read/write ops
>> when some conditions are met on the size and the alignment. In case of
>> failure, a timeout for instance, the operation is restarted using the
>> IO mode.
>>
>> DMA support does not seem to be that efficient. So we provide some
>> sysfs files for tuning and to switch it on and off (default is off)
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>>  drivers/mtd/spi-nor/aspeed-smc.c | 231 +++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 231 insertions(+)
>>
>> diff --git a/drivers/mtd/spi-nor/aspeed-smc.c b/drivers/mtd/spi-nor/aspeed-smc.c
>> index a98d454d07ed..7dfa1ea0a787 100644
>> --- a/drivers/mtd/spi-nor/aspeed-smc.c
>> +++ b/drivers/mtd/spi-nor/aspeed-smc.c
>> @@ -11,6 +11,8 @@
>>  
>>  #include <linux/bug.h>
>>  #include <linux/device.h>
>> +#include <linux/dma-direction.h>
>> +#include <linux/dma-mapping.h>
>>  #include <linux/io.h>
>>  #include <linux/module.h>
>>  #include <linux/mutex.h>
>> @@ -40,6 +42,7 @@ struct aspeed_smc_info {
>>  	bool hastype;		/* flash type field exists in config reg */
>>  	u8 we0;			/* shift for write enable bit for CE0 */
>>  	u8 ctl0;		/* offset in regs of ctl for CE0 */
>> +	bool has_dma;
>>  
>>  	void (*set_4b)(struct aspeed_smc_chip *chip);
>>  };
>> @@ -53,6 +56,7 @@ static const struct aspeed_smc_info fmc_2400_info = {
>>  	.hastype = true,
>>  	.we0 = 16,
>>  	.ctl0 = 0x10,
>> +	.has_dma = true,
>>  	.set_4b = aspeed_smc_chip_set_4b,
>>  };
>>  
>> @@ -62,6 +66,7 @@ static const struct aspeed_smc_info spi_2400_info = {
>>  	.hastype = false,
>>  	.we0 = 0,
>>  	.ctl0 = 0x04,
>> +	.has_dma = false,
>>  	.set_4b = aspeed_smc_chip_set_4b_spi_2400,
>>  };
>>  
>> @@ -71,6 +76,7 @@ static const struct aspeed_smc_info fmc_2500_info = {
>>  	.hastype = true,
>>  	.we0 = 16,
>>  	.ctl0 = 0x10,
>> +	.has_dma = true,
>>  	.set_4b = aspeed_smc_chip_set_4b,
>>  };
>>  
>> @@ -80,6 +86,7 @@ static const struct aspeed_smc_info spi_2500_info = {
>>  	.hastype = false,
>>  	.we0 = 16,
>>  	.ctl0 = 0x10,
>> +	.has_dma = false,
>>  	.set_4b = aspeed_smc_chip_set_4b,
>>  };
>>  
>> @@ -97,6 +104,7 @@ struct aspeed_smc_chip {
>>  	struct aspeed_smc_controller *controller;
>>  	void __iomem *ctl;			/* control register */
>>  	void __iomem *ahb_base;			/* base of chip window */
>> +	unsigned long phys_base;		/* physical address of window */
>>  	u32 ctl_val[smc_max];			/* control settings */
>>  	enum aspeed_smc_flash_type type;	/* what type of flash */
>>  	struct spi_nor nor;
>> @@ -110,6 +118,18 @@ struct aspeed_smc_controller {
>>  	void __iomem *regs;			/* controller registers */
>>  	void __iomem *ahb_base;			/* per-chip windows resource */
>>  
>> +	/* interrupt handling */
>> +	int irq;
>> +
>> +	/* dma */
>> +	bool dma_enabled;
>> +	struct completion dma_done;
>> +
>> +	/* dma logging */
>> +	size_t dma_length;
>> +	dma_addr_t dma_addr;			/* bus address of buffer */
>> +	dma_addr_t flash_addr;			/* flash address */
>> +
>>  	struct aspeed_smc_chip *chips[0];	/* pointers to attached chips */
>>  };
>>  
>> @@ -182,6 +202,11 @@ struct aspeed_smc_controller {
>>  	(CONTROL_AAF_MODE | CONTROL_CE_INACTIVE_MASK | CONTROL_CLK_DIV4 | \
>>  	 CONTROL_CLOCK_FREQ_SEL_MASK | CONTROL_LSB_FIRST | CONTROL_CLOCK_MODE_3)
>>  
>> +/* Interrupt Control and Status Register */
>> +#define INTERRUPT_STATUS_REG		0x08
>> +#define     INTERRUPT_DMA_ENABLE	BIT(3)
>> +#define     INTERRUPT_DMA_STATUS	BIT(11)
>> +
>>  /*
>>   * The Segment Register uses a 8MB unit to encode the start address
>>   * and the end address of the mapping window of a flash SPI slave :
>> @@ -194,6 +219,108 @@ struct aspeed_smc_controller {
>>  #define SEGMENT_ADDR_START(_r)		((((_r) >> 16) & 0xFF) << 23)
>>  #define SEGMENT_ADDR_END(_r)		((((_r) >> 24) & 0xFF) << 23)
>>  
>> +/* DMA Registers */
>> +#define DMA_CONTROL_REG			0x80
>> +#define     DMA_ENABLE			BIT(0)
>> +#define     DMA_WRITE			BIT(1)
>> +
>> +#define DMA_FLASH_BASE_REG		0x84
>> +#define DMA_DRAM_BASE_REG		0x88
>> +#define DMA_LENGTH_REG			0x8c
>> +
>> +/*
>> + * DMAs do not seem to be that fast, so disable by default
>> + */
>> +static bool use_dma;
>> +module_param(use_dma, bool, 0644);
>> +
>> +static unsigned int min_dma_size = 256;
>> +module_param(min_dma_size, uint, 0644);
>> +
>> +/* with 100ms we had a couple of timeouts */
>> +static unsigned int dma_timeout = 200;
>> +module_param(dma_timeout, uint, 0644);
>> +
>> +static void aspeed_smc_dma_done(struct aspeed_smc_controller *controller)
>> +{
>> +	writel(0, controller->regs + INTERRUPT_STATUS_REG);
>> +	writel(0, controller->regs + DMA_CONTROL_REG);
>> +}
>> +
>> +#define DMA_LENGTH(x) (((x) - 4) & ~0xFE000003)
>> +#define DMA_ADDR(x) ((x) & ~0x00000003)
>> +
>> +static inline void aspeed_smc_chip_configure(struct aspeed_smc_chip *chip,
>> +					     u32 ctl)
>> +{
>> +	ctl |= CONTROL_CE_STOP_ACTIVE_CONTROL;
>> +	writel(ctl, chip->ctl);
>> +
>> +	ctl &= ~CONTROL_CE_STOP_ACTIVE_CONTROL;
>> +	writel(ctl, chip->ctl);
>> +}
>> +
>> +static int aspeed_smc_dma_start(struct aspeed_smc_chip *chip,
>> +				u32 offset, void *buf, size_t length,
>> +				int is_write)
>> +{
>> +	struct aspeed_smc_controller *controller = chip->controller;
>> +	dma_addr_t dma_addr, flash_addr;
>> +	int ret = 0;
>> +
>> +	aspeed_smc_chip_configure(chip, is_write ? chip->ctl_val[smc_write] :
>> +		chip->ctl_val[smc_read]);
>> +
>> +	dev_dbg(chip->nor.dev, "DMA %s to=0x%08x len=0x%08x\n",
>> +		is_write ? "write" : "read", offset, length);
>> +
>> +	dma_addr = dma_map_single(chip->nor.dev, buf, length,
>> +				  (is_write ? DMA_TO_DEVICE : DMA_FROM_DEVICE));
> 
> big mistake! Don't worry, your not the first one to fall into this trap
> and I did this mistake too in earlier versions of the Atmel QSPI
> controller driver (not mainlined) ;)
> 
> So please, let me explain:
> 
> You can use dma_map_single() if 'buf' is allocated with kmalloc() for
> instance because kmalloc'ed memory pages are guaranteed to be contiguous
> in physical space too (not only in virtual space).
> Also kmalloc'ed memory can be (un)mapped for DMA usages: when cleaning
> and/or invalidating the data cache lines, the cache aliasing issue
> should be avoided, hence it's safe.
> 
> On the other hand, vmalloc'ed memory pages are not guaranteed to be
> contiguous in physical space and often they aren't. So just for this
> only reason, you cannot use dma_map_single().
> 
> 
> More technical explanations but you can skip the following section if
> you want:
> 
> ####################################################################
> Then you could think of using dma_map_sg() instead. This is what the
> __spi_map_sg() / spi_map_buf() functions do in the SPI sub-system: in
> the case of vmalloc'ed memory, spi_map_buf() first build a
> scatter-gather list from the allocated pages before calling dma_map_sg().
> 
> However the vmalloc'ed memory area is not safe about the cache alias
> issue. Depending on model of the data cache it may be safe or not:
> 
> - PIPT (Physically Indexed / Physically Tagged): should be safe in any
> case (data cache of ARM Cortex A5 for instance).
> 
> - VIPT (Virtually Indexed / Physically Tagged): if the size of the data
> cache is "small" enough so the number of (index + offset) bits in the
> virtual address is less are equal to the number of offset bits for the
> smallest page size in the MMU (likely 12 bits for a 4KB MMU page size),
> the cache alias issue should be avoided. Otherwise this issue may occur
> so KO.
> 
> - VIVT (Virtually Indexed / Virtually Tagged): this is the worst case.
> Even the tag part comes from the virtual address so OK (ARM 926 I guess).
> 
> e.g. for a 32bit address:
> 
> Depending on the cache model, the virtual and/or physical is used to
> extract the TAG, INDEX and OFFSET.
> 
> 31                         Y                   0
> +--------------------------+--------+----------+
> |           TAG            |  INDEX |   OFFSET |
> +--------------------------+--------+----------+
> 
> The INDEX part is used the select the right set in the N-way set
> associated cache. Then the TAG part is used to find the right cache line
> within the selected set.
> 
> For a minimum MMU page size of 4096 (== 2^12) bytes, if (Y < 12),
> whatever you use the physical of virtual address, the (INDEX + OFFSET)
> bits are the same. Then if the TAG comes from the physical address, no
> problem: even if a given physical page is mapped twice (or more) in the
> virtual address space through the MMU, all those virtual addresses are
> cached in the very same cache lines.
> So for a VIPT cache model, it depends on the geometry (cache size, cache
> line size, number of ways) of your cache.
> 
> SPI controller drivers should not rely on this "trick": it may not work
> later if the same controller IP is integrated in another SoC using a
> different core with a different cache model or different cache geometry.
> ####################################################################
> 
> 
> All that to say, other parts of the kernel, like the ubifs layer,
> allocates buffer with vmalloc() instead of kmalloc().
> So you may test your code programming a UBI file-system on your SPI
> flash memory and your kernel is likely to crash.

Yes I completely overlooked the vmalloc buffers :/ What would you 
recommend ? a intermediate kmalloc buffer ? DMA are not that fast 
already so I guess that would be even worse. We might just as well 
drop the support.

Thanks for the detailed explanation,

C.


>> +
>> +	if (unlikely(dma_mapping_error(chip->nor.dev, dma_addr))) {
>> +		dev_err(chip->nor.dev, "Failed to dma_map_single()\n");
>> +		ret = -ENOMEM;
>> +		goto out;
>> +	}
>> +	flash_addr = chip->phys_base + offset;
>> +
>> +	controller->dma_length = length;
>> +	controller->dma_addr = dma_addr;
>> +	controller->flash_addr = flash_addr;
>> +
>> +	reinit_completion(&controller->dma_done);
>> +
>> +	writel(0, controller->regs + DMA_CONTROL_REG);
>> +	writel(DMA_ADDR(flash_addr), controller->regs +
>> +	       DMA_FLASH_BASE_REG);
>> +	writel(DMA_ADDR(dma_addr), controller->regs + DMA_DRAM_BASE_REG);
>> +	writel(DMA_LENGTH(length), controller->regs + DMA_LENGTH_REG);
>> +
>> +	writel(INTERRUPT_DMA_ENABLE,
>> +	       controller->regs + INTERRUPT_STATUS_REG);
>> +
>> +	writel(DMA_ENABLE | (is_write << 1),
>> +	       controller->regs + DMA_CONTROL_REG);
>> +
>> +	if (!wait_for_completion_timeout(&controller->dma_done,
>> +					 msecs_to_jiffies(dma_timeout))) {
>> +		dev_err(chip->nor.dev,
>> +			"DMA timeout addr@%.8x faddr@%.8x size=%x\n",
>> +			controller->dma_addr,
>> +			controller->flash_addr,
>> +			controller->dma_length);
>> +		ret = -ETIMEDOUT;
>> +		aspeed_smc_dma_done(controller);
>> +	}
>> +
>> +	dma_unmap_single(chip->nor.dev,
>> +			 controller->dma_addr, controller->dma_length,
>> +			 (is_write ? DMA_TO_DEVICE : DMA_FROM_DEVICE));
>> +out:
>> +	aspeed_smc_chip_configure(chip, chip->ctl_val[smc_read]);
>> +	return ret;
>> +}
>> +
>>  /*
>>   * In user mode all data bytes read or written to the chip decode address
>>   * range are transferred to or from the SPI bus. The range is treated as a
>> @@ -366,12 +493,32 @@ static void aspeed_smc_send_cmd_addr(struct spi_nor *nor, u8 cmd, u32 addr)
>>  	}
>>  }
>>  
>> +/*
>> + * Try DMA transfer when size and alignment are correct. In case
>> + * of failure, just restart using the IO mode.
>> + */
>> +static int aspeed_smc_dma_check(struct aspeed_smc_chip *chip, loff_t off,
>> +				size_t len)
>> +{
>> +	return (IS_ALIGNED(off, 4) && IS_ALIGNED(len, 4) &&
>> +		len >= min_dma_size && chip->controller->dma_enabled &&
>> +		use_dma);
>> +}
>> +
> 
> vmalloc'ed memory or other high memory regions (not DMA capable) can
> still pass this check.
> 
> Best regards,
> 
> Cyrille
> 
>>  static ssize_t aspeed_smc_read_user(struct spi_nor *nor, loff_t from,
>>  				    size_t len, u_char *read_buf)
>>  {
>>  	struct aspeed_smc_chip *chip = nor->priv;
>>  	int i;
>>  	u8 dummy = 0xFF;
>> +	int ret;
>> +
>> +	if (aspeed_smc_dma_check(chip, from, len)) {
>> +		ret = aspeed_smc_dma_start(chip, from, read_buf, len, 0);
>> +		if (!ret)
>> +			goto out;
>> +		dev_err(chip->nor.dev, "DMA read failed: %d", ret);
>> +	}
>>  
>>  	aspeed_smc_start_user(nor);
>>  	aspeed_smc_send_cmd_addr(nor, nor->read_opcode, from);
>> @@ -380,6 +527,7 @@ static ssize_t aspeed_smc_read_user(struct spi_nor *nor, loff_t from,
>>  
>>  	aspeed_smc_read_from_ahb(read_buf, chip->ahb_base, len);
>>  	aspeed_smc_stop_user(nor);
>> +out:
>>  	return len;
>>  }
>>  
>> @@ -387,11 +535,21 @@ static ssize_t aspeed_smc_write_user(struct spi_nor *nor, loff_t to,
>>  				     size_t len, const u_char *write_buf)
>>  {
>>  	struct aspeed_smc_chip *chip = nor->priv;
>> +	int ret;
>> +
>> +	if (aspeed_smc_dma_check(chip, to, len)) {
>> +		ret = aspeed_smc_dma_start(chip, to, (void *)write_buf,
>> +					   len, 1);
>> +		if (!ret)
>> +			goto out;
>> +		dev_err(chip->nor.dev, "DMA write failed: %d", ret);
>> +	}
>>  
>>  	aspeed_smc_start_user(nor);
>>  	aspeed_smc_send_cmd_addr(nor, nor->program_opcode, to);
>>  	aspeed_smc_write_to_ahb(chip->ahb_base, write_buf, len);
>>  	aspeed_smc_stop_user(nor);
>> +out:
>>  	return len;
>>  }
>>  
>> @@ -527,6 +685,8 @@ static int aspeed_smc_chip_setup_init(struct aspeed_smc_chip *chip,
>>  		return -EINVAL;
>>  	}
>>  
>> +	chip->phys_base = res->start + (chip->ahb_base - controller->ahb_base);
>> +
>>  	/*
>>  	 * Get value of the inherited control register. U-Boot usually
>>  	 * does some timing calibration on the FMC chip, so it's good
>> @@ -597,6 +757,7 @@ static int aspeed_smc_chip_setup_finish(struct aspeed_smc_chip *chip)
>>  	}
>>  
>>  	chip->ctl_val[smc_read] |= cmd |
>> +		chip->nor.read_opcode << CONTROL_COMMAND_SHIFT |
>>  		CONTROL_IO_DUMMY_SET(chip->nor.read_dummy / 8);
>>  
>>  	dev_dbg(controller->dev, "base control register: %08x\n",
>> @@ -695,6 +856,74 @@ static int aspeed_smc_setup_flash(struct aspeed_smc_controller *controller,
>>  	return ret;
>>  }
>>  
>> +static irqreturn_t aspeed_smc_irq(int irq, void *arg)
>> +{
>> +	struct aspeed_smc_controller *controller = arg;
>> +	struct device *dev = controller->dev;
>> +	irqreturn_t ret = IRQ_NONE;
>> +	u32 dma_ctl = readl(controller->regs + DMA_CONTROL_REG);
>> +	u32 status = readl(controller->regs + INTERRUPT_STATUS_REG);
>> +
>> +	dev_dbg(dev, "received IRQ. status: %x", status);
>> +
>> +	if (!(status & INTERRUPT_DMA_ENABLE) || !(dma_ctl & DMA_ENABLE)) {
>> +		dev_err(dev, "No DMA. bad IRQ status: %x", status);
>> +		goto out;
>> +	}
>> +
>> +	if (!(status & INTERRUPT_DMA_STATUS)) {
>> +		dev_err(dev, "DMA still in progress. length %d\n",
>> +			readl(controller->regs + DMA_LENGTH_REG));
>> +		goto out;
>> +	}
>> +
>> +	ret = IRQ_HANDLED;
>> +	aspeed_smc_dma_done(controller);
>> +	complete(&controller->dma_done);
>> +
>> +out:
>> +	return ret;
>> +}
>> +
>> +static int aspeed_smc_config_irq(struct aspeed_smc_controller *controller,
>> +				 struct platform_device *pdev)
>> +{
>> +	struct device *dev = &pdev->dev;
>> +	int rc;
>> +
>> +	controller->irq = platform_get_irq(pdev, 0);
>> +	if (!controller->irq)
>> +		return -ENODEV;
>> +
>> +	rc = devm_request_irq(dev, controller->irq, aspeed_smc_irq, IRQF_SHARED,
>> +			      DEVICE_NAME, controller);
>> +	if (rc < 0) {
>> +		dev_warn(dev, "Unable to request IRQ %d\n", controller->irq);
>> +		controller->irq = 0;
>> +		return rc;
>> +	}
>> +
>> +	dev_info(dev, "Using IRQ %d\n", controller->irq);
>> +	return 0;
>> +}
>> +
>> +static void aspeed_smc_dma_setup(struct aspeed_smc_controller *controller,
>> +				 struct platform_device *pdev)
>> +{
>> +	const struct aspeed_smc_info *info = controller->info;
>> +
>> +	init_completion(&controller->dma_done);
>> +
>> +	controller->dma_enabled = false;
>> +	if (info->has_dma)
>> +		controller->dma_enabled = !aspeed_smc_config_irq(controller,
>> +								 pdev);
>> +
>> +	if (controller->dma_enabled)
>> +		dev_info(controller->dev, "DMA support %s.\n",
>> +			 use_dma ? "enabled" : "disabled");
>> +}
>> +
>>  static int aspeed_smc_probe(struct platform_device *pdev)
>>  {
>>  	struct device_node *np = pdev->dev.of_node;
>> @@ -730,6 +959,8 @@ static int aspeed_smc_probe(struct platform_device *pdev)
>>  	if (IS_ERR(controller->ahb_base))
>>  		return PTR_ERR(controller->ahb_base);
>>  
>> +	aspeed_smc_dma_setup(controller, pdev);
>> +
>>  	ret = aspeed_smc_setup_flash(controller, np, res);
>>  	if (ret)
>>  		dev_err(dev, "Aspeed SMC probe failed %d\n", ret);
>>
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2017-04-19 14:37 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-06 16:56 [PATCH 00/10] mtd: spi-nor: aspeed: support extensions Cédric Le Goater
2017-04-06 16:56 ` [PATCH 01/10] mtd: spi-nor: aspeed: fix the AST2400 SMC window size Cédric Le Goater
2017-04-06 19:17   ` Marek Vasut
2017-04-11 10:18     ` Cédric Le Goater
2017-04-11 10:42       ` Marek Vasut
2017-04-11 16:10         ` Cédric Le Goater
2017-04-06 16:56 ` [PATCH 02/10] mtd: spi-nor: aspeed: remove dummies from keep mask Cédric Le Goater
2017-04-06 16:56 ` [PATCH 03/10] mtd: spi-nor: aspeed: add DMA support Cédric Le Goater
2017-04-06 19:07   ` Cyrille Pitchen
2017-04-06 19:13     ` Cyrille Pitchen
2017-04-19 14:36     ` Cédric Le Goater
2017-04-06 16:56 ` [PATCH 04/10] mtd: spi-nor: aspeed: add support for SPI dual IO read mode Cédric Le Goater
2017-04-06 19:21   ` Marek Vasut
2017-04-11  8:53     ` Cédric Le Goater
2017-04-11 10:43       ` Marek Vasut
2017-04-06 16:56 ` [PATCH 05/10] mtd: spi-nor: Add support for Macronix mx66l1g45g spi flash Cédric Le Goater
2017-04-06 19:21   ` Marek Vasut
2017-04-06 16:56 ` [PATCH 06/10] mtd: spi-nor: add SPI_NOR_DUAL_READ to some flash devices Cédric Le Goater
2017-04-06 19:22   ` Marek Vasut
2017-04-11  8:15     ` Cédric Le Goater
2017-04-06 16:56 ` [PATCH 07/10] mtd: spi-nor: aspeed: configure chip window on AHB bus Cédric Le Goater
2017-04-06 19:25   ` Marek Vasut
2017-04-11 10:20     ` Cédric Le Goater
2017-04-11 10:44       ` Marek Vasut
2017-04-06 16:56 ` [PATCH 08/10] mtd: spi-nor: aspeed: use command mode for reads Cédric Le Goater
2017-04-06 16:56 ` [PATCH 09/10] mtd: spi-nor: aspeed: link controller with the ahb clock Cédric Le Goater
2017-04-06 16:56 ` [PATCH 10/10] mtd: spi-nor: aspeed: optimize read mode Cédric Le Goater
2017-04-06 19:28   ` Marek Vasut
2017-04-11  8:13     ` Cédric Le Goater
2017-04-11  8:23       ` Benjamin Herrenschmidt
2017-04-14 16:18       ` Marek Vasut
2017-04-14 21:40         ` Benjamin Herrenschmidt
2017-04-14 21:51           ` Marek Vasut
2017-04-14 22:11             ` Benjamin Herrenschmidt
2017-04-14 22:25               ` Marek Vasut
2017-04-14 22:42                 ` Benjamin Herrenschmidt
2017-04-16 18:46                   ` Marek Vasut
2017-04-18 15:46                   ` Cédric Le Goater
2017-04-18 16:51                     ` Marek Vasut
2017-04-18 22:41                     ` Benjamin Herrenschmidt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.