All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] mtd: rawnand: micron: Get the actual number of bitflips
@ 2018-07-03 12:20 Boris Brezillon
  2018-07-03 12:20 ` [PATCH 1/3] mtd: rawnand: micron: Stop passing mtd to ecc_status() funcs Boris Brezillon
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Boris Brezillon @ 2018-07-03 12:20 UTC (permalink / raw)
  To: Boris Brezillon, Richard Weinberger, Miquel Raynal, linux-mtd
  Cc: David Woodhouse, Brian Norris, Marek Vasut, Bean Huo, Chris Packham

Hello,

Back when support for Micron 4bit/512byte on-die ECC was added, we
decided that it wasn't worth retrieving the real number of bitflips
when the chip was returning WRITE_RECOMMENDED and instead decided to
always return the maximum value (ECC strength).

This decision leads to UBI moving eraseblocks around as soon as 1
bitflip is present, which is far from optimal, and might wear the NAND
out faster than if we get the actual number of bitflips by re-reading
the page in raw mode and comparing its content to the corrected
version.

IIRC, Bean warned us about that, but it seems we didn't listen, so now
is time to revisit the implementation and implement what Bean initially
suggested.

This implementation has been tested on an MT29F2G08ABAEAH4, and seems
to work as expected (nandbiterrs works fine, and the real number of
bitflips is now returned).

Here are some details about these patches:
Patch 1 is just a cleanup to avoid passing parameters we don't need to
the ecc_status() functions. Patch 2 is preparing things for the actual
changes by reworking the ordering in the read function, and patch 3 is
implementing the read-in-raw-mode-and-compare logic.

Regards,

Boris

Boris Brezillon (3):
  mtd: rawnand: micron: Stop passing mtd to ecc_status() funcs
  mtd: rawnand: micron: Disable ECC earlier in the read path
  mtd: rawnand: micron: Get the actual number of bitflips

 drivers/mtd/nand/raw/nand_micron.c | 145 ++++++++++++++++++++++++++++++-------
 1 file changed, 120 insertions(+), 25 deletions(-)

-- 
2.14.1

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/3] mtd: rawnand: micron: Stop passing mtd to ecc_status() funcs
  2018-07-03 12:20 [PATCH 0/3] mtd: rawnand: micron: Get the actual number of bitflips Boris Brezillon
@ 2018-07-03 12:20 ` Boris Brezillon
  2018-07-03 12:20 ` [PATCH 2/3] mtd: rawnand: micron: Disable ECC earlier in the read path Boris Brezillon
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Boris Brezillon @ 2018-07-03 12:20 UTC (permalink / raw)
  To: Boris Brezillon, Richard Weinberger, Miquel Raynal, linux-mtd
  Cc: David Woodhouse, Brian Norris, Marek Vasut, Bean Huo, Chris Packham

The mtd_info object can be retrieved from a nand_chip object. There's
no need to pass both to the ecc_status() functions.

Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
---
 drivers/mtd/nand/raw/nand_micron.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/mtd/nand/raw/nand_micron.c b/drivers/mtd/nand/raw/nand_micron.c
index 35fa6880a799..63ac98a36ed7 100644
--- a/drivers/mtd/nand/raw/nand_micron.c
+++ b/drivers/mtd/nand/raw/nand_micron.c
@@ -134,9 +134,10 @@ static int micron_nand_on_die_ecc_setup(struct nand_chip *chip, bool enable)
 }
 
 
-static int micron_nand_on_die_ecc_status_4(struct mtd_info *mtd,
-					   struct nand_chip *chip, u8 status)
+static int micron_nand_on_die_ecc_status_4(struct nand_chip *chip, u8 status)
 {
+	struct mtd_info *mtd = nand_to_mtd(chip);
+
 	/*
 	 * The internal ECC doesn't tell us the number of bitflips
 	 * that have been corrected, but tells us if it recommends to
@@ -154,9 +155,10 @@ static int micron_nand_on_die_ecc_status_4(struct mtd_info *mtd,
 	return 0;
 }
 
-static int micron_nand_on_die_ecc_status_8(struct mtd_info *mtd,
-					   struct nand_chip *chip, u8 status)
+static int micron_nand_on_die_ecc_status_8(struct nand_chip *chip, u8 status)
 {
+	struct mtd_info *mtd = nand_to_mtd(chip);
+
 	/*
 	 * With 8/512 we have more information but still don't know precisely
 	 * how many bit-flips were seen.
@@ -206,9 +208,9 @@ micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
 		goto out;
 
 	if (chip->ecc.strength == 4)
-		max_bitflips = micron_nand_on_die_ecc_status_4(mtd, chip, status);
+		max_bitflips = micron_nand_on_die_ecc_status_4(chip, status);
 	else
-		max_bitflips = micron_nand_on_die_ecc_status_8(mtd, chip, status);
+		max_bitflips = micron_nand_on_die_ecc_status_8(chip, status);
 
 	ret = nand_read_data_op(chip, buf, mtd->writesize, false);
 	if (!ret && oob_required)
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/3] mtd: rawnand: micron: Disable ECC earlier in the read path
  2018-07-03 12:20 [PATCH 0/3] mtd: rawnand: micron: Get the actual number of bitflips Boris Brezillon
  2018-07-03 12:20 ` [PATCH 1/3] mtd: rawnand: micron: Stop passing mtd to ecc_status() funcs Boris Brezillon
@ 2018-07-03 12:20 ` Boris Brezillon
  2018-07-16  9:00   ` [2/3] " Chris Packham
  2018-07-03 12:20 ` [PATCH 3/3] mtd: rawnand: micron: Get the actual number of bitflips Boris Brezillon
  2018-07-08 21:48 ` [PATCH 0/3] " Miquel Raynal
  3 siblings, 1 reply; 13+ messages in thread
From: Boris Brezillon @ 2018-07-03 12:20 UTC (permalink / raw)
  To: Boris Brezillon, Richard Weinberger, Miquel Raynal, linux-mtd
  Cc: David Woodhouse, Brian Norris, Marek Vasut, Bean Huo, Chris Packham

We are about to support extracting the real number of bitflips for
4-bits ECC when WRITE_RECOMMEND is returned. This requires re-reading
the page in raw mode to compare it to the corrected version, and this
logic will be placed in micron_nand_on_die_ecc_status_4().

Moving the micron_nand_on_die_ecc_setup() will allow us to disable
ECC only once.

As a result, we have to rework the exit path and add an error path
where the ECC is disabled.

Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
---
 drivers/mtd/nand/raw/nand_micron.c | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/drivers/mtd/nand/raw/nand_micron.c b/drivers/mtd/nand/raw/nand_micron.c
index 63ac98a36ed7..b9cbaf125a98 100644
--- a/drivers/mtd/nand/raw/nand_micron.c
+++ b/drivers/mtd/nand/raw/nand_micron.c
@@ -197,30 +197,37 @@ micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
 
 	ret = nand_read_page_op(chip, page, 0, NULL, 0);
 	if (ret)
-		goto out;
+		goto err_disable_ecc;
 
 	ret = nand_status_op(chip, &status);
 	if (ret)
-		goto out;
+		goto err_disable_ecc;
 
 	ret = nand_exit_status_op(chip);
 	if (ret)
-		goto out;
+		goto err_disable_ecc;
 
-	if (chip->ecc.strength == 4)
-		max_bitflips = micron_nand_on_die_ecc_status_4(chip, status);
-	else
-		max_bitflips = micron_nand_on_die_ecc_status_8(chip, status);
+	micron_nand_on_die_ecc_setup(chip, false);
 
 	ret = nand_read_data_op(chip, buf, mtd->writesize, false);
 	if (!ret && oob_required)
 		ret = nand_read_data_op(chip, chip->oob_poi, mtd->oobsize,
 					false);
 
-out:
+	if (ret)
+		return ret;
+
+	if (chip->ecc.strength == 4)
+		max_bitflips = micron_nand_on_die_ecc_status_4(chip, status);
+	else
+		max_bitflips = micron_nand_on_die_ecc_status_8(chip, status);
+
+	return max_bitflips;
+
+err_disable_ecc:
 	micron_nand_on_die_ecc_setup(chip, false);
 
-	return ret ? ret : max_bitflips;
+	return ret;
 }
 
 static int
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/3] mtd: rawnand: micron: Get the actual number of bitflips
  2018-07-03 12:20 [PATCH 0/3] mtd: rawnand: micron: Get the actual number of bitflips Boris Brezillon
  2018-07-03 12:20 ` [PATCH 1/3] mtd: rawnand: micron: Stop passing mtd to ecc_status() funcs Boris Brezillon
  2018-07-03 12:20 ` [PATCH 2/3] mtd: rawnand: micron: Disable ECC earlier in the read path Boris Brezillon
@ 2018-07-03 12:20 ` Boris Brezillon
  2018-07-08 21:48 ` [PATCH 0/3] " Miquel Raynal
  3 siblings, 0 replies; 13+ messages in thread
From: Boris Brezillon @ 2018-07-03 12:20 UTC (permalink / raw)
  To: Boris Brezillon, Richard Weinberger, Miquel Raynal, linux-mtd
  Cc: David Woodhouse, Brian Norris, Marek Vasut, Bean Huo, Chris Packham

The MT29F2Gxxx chips with 4bits/512byte on-die ECC let us know when
some bitflips were corrected by the on-die ECC, but they do not report
the actual number of bitflips that were present in the data+ECC chunk.

We initially decided to always return ecc->strength to avoid re-reading
the page in raw mode + comparing it to the corrected buffer to extract
the real number of bitflips, but this forces UBI to move data around as
soon as one bitflip is present in a page.

This not only wears the NAND out faster, but also degrades
performances, since reading a full PEB + writing it back to a different
PEB + erasing the old one is much more expensive than re-reading the
faulty page in raw mode and comparing it to the corrected buffer.
In most cases, the actual number of bitflips does not exceed the
bitflips threshold, and UBI won't have to move data around. Otherwise,
we can assume the time spent re-reading the page and doing the
comparison is negligible compared to the time UBI spends moving a full
PEB to another PEB.

Note that this logic is not applied to chips with 8bits/512byte on-die
ECC, because those chips provide fine-grained information (the maximum
error is 1 bit, and it will not force UBI to move blocks around at the
first bitflip).

Suggested-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
---
 drivers/mtd/nand/raw/nand_micron.c | 114 ++++++++++++++++++++++++++++++++-----
 1 file changed, 100 insertions(+), 14 deletions(-)

diff --git a/drivers/mtd/nand/raw/nand_micron.c b/drivers/mtd/nand/raw/nand_micron.c
index b9cbaf125a98..754e9cdd44ac 100644
--- a/drivers/mtd/nand/raw/nand_micron.c
+++ b/drivers/mtd/nand/raw/nand_micron.c
@@ -16,6 +16,7 @@
  */
 
 #include <linux/mtd/rawnand.h>
+#include <linux/slab.h>
 
 /*
  * Special Micron status bit 3 indicates that the block has been
@@ -63,6 +64,14 @@ struct nand_onfi_vendor_micron {
 	u8 param_revision;
 } __packed;
 
+struct micron_on_die_ecc {
+	void *rawbuf;
+};
+
+struct micron_nand {
+	struct micron_on_die_ecc ecc;
+};
+
 static int micron_nand_setup_read_retry(struct mtd_info *mtd, int retry_mode)
 {
 	struct nand_chip *chip = mtd_to_nand(mtd);
@@ -134,25 +143,59 @@ static int micron_nand_on_die_ecc_setup(struct nand_chip *chip, bool enable)
 }
 
 
-static int micron_nand_on_die_ecc_status_4(struct nand_chip *chip, u8 status)
+static int micron_nand_on_die_ecc_status_4(struct nand_chip *chip, u8 status,
+					   void *buf, int page)
 {
+	struct micron_nand *micron = nand_get_manufacturer_data(chip);
 	struct mtd_info *mtd = nand_to_mtd(chip);
+	unsigned int step, max_bitflips = 0;
+	int ret;
+
+	if (!(status & NAND_ECC_STATUS_WRITE_RECOMMENDED)) {
+		if (status & NAND_STATUS_FAIL)
+			mtd->ecc_stats.failed++;
+
+		return 0;
+	}
 
 	/*
 	 * The internal ECC doesn't tell us the number of bitflips
 	 * that have been corrected, but tells us if it recommends to
-	 * rewrite the block. If it's the case, then we pretend we had
-	 * a number of bitflips equal to the ECC strength, which will
-	 * hint the NAND core to rewrite the block.
+	 * rewrite the block. If it's the case, we need to read the
+	 * page in raw mode and compare its content to the corrected
+	 * version to extract the actual number of bitflips.
 	 */
-	if (status & NAND_STATUS_FAIL) {
-		mtd->ecc_stats.failed++;
-	} else if (status & NAND_ECC_STATUS_WRITE_RECOMMENDED) {
-		mtd->ecc_stats.corrected += chip->ecc.strength;
-		return chip->ecc.strength;
+	ret = nand_read_page_op(chip, page, 0, micron->ecc.rawbuf,
+				mtd->writesize + mtd->oobsize);
+	if (ret)
+		return ret;
+
+	for (step = 0; step < chip->ecc.steps; step++) {
+		unsigned int offs, i, nbitflips = 0;
+		u8 *rawbuf, *corrbuf;
+
+		offs = step * chip->ecc.size;
+		rawbuf = micron->ecc.rawbuf + offs;
+		corrbuf = buf + offs;
+
+		for (i = 0; i < chip->ecc.size; i++)
+			nbitflips += hweight8(corrbuf[i] ^ rawbuf[i]);
+
+		offs = (step * 16) + 4;
+		rawbuf = micron->ecc.rawbuf + mtd->writesize + offs;
+		corrbuf = chip->oob_poi + offs;
+
+		for (i = 0; i < chip->ecc.bytes + 4; i++)
+			nbitflips += hweight8(corrbuf[i] ^ rawbuf[i]);
+
+		if (WARN_ON(nbitflips > chip->ecc.strength))
+			return -EINVAL;
+
+		max_bitflips = max(nbitflips, max_bitflips);
+		mtd->ecc_stats.corrected += nbitflips;
 	}
 
-	return 0;
+	return max_bitflips;
 }
 
 static int micron_nand_on_die_ecc_status_8(struct nand_chip *chip, u8 status)
@@ -218,7 +261,8 @@ micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
 		return ret;
 
 	if (chip->ecc.strength == 4)
-		max_bitflips = micron_nand_on_die_ecc_status_4(chip, status);
+		max_bitflips = micron_nand_on_die_ecc_status_4(chip, status,
+							       buf, page);
 	else
 		max_bitflips = micron_nand_on_die_ecc_status_8(chip, status);
 
@@ -332,12 +376,19 @@ static int micron_supports_on_die_ecc(struct nand_chip *chip)
 static int micron_nand_init(struct nand_chip *chip)
 {
 	struct mtd_info *mtd = nand_to_mtd(chip);
+	struct micron_nand *micron;
 	int ondie;
 	int ret;
 
+	micron = kzalloc(sizeof(*micron), GFP_KERNEL);
+	if (!micron)
+		return -ENOMEM;
+
+	nand_set_manufacturer_data(chip, micron);
+
 	ret = micron_nand_onfi_init(chip);
 	if (ret)
-		return ret;
+		goto err_free_manuf_data;
 
 	if (mtd->writesize == 2048)
 		chip->bbt_options |= NAND_BBT_SCAN2NDPAGE;
@@ -347,13 +398,33 @@ static int micron_nand_init(struct nand_chip *chip)
 	if (ondie == MICRON_ON_DIE_MANDATORY &&
 	    chip->ecc.mode != NAND_ECC_ON_DIE) {
 		pr_err("On-die ECC forcefully enabled, not supported\n");
-		return -EINVAL;
+		ret = -EINVAL;
+		goto err_free_manuf_data;
 	}
 
 	if (chip->ecc.mode == NAND_ECC_ON_DIE) {
 		if (ondie == MICRON_ON_DIE_UNSUPPORTED) {
 			pr_err("On-die ECC selected but not supported\n");
-			return -EINVAL;
+			ret = -EINVAL;
+			goto err_free_manuf_data;
+		}
+
+		/*
+		 * In case of 4bit on-die ECC, we need a buffer to store a
+		 * page dumped in raw mode so that we can compare its content
+		 * to the same page after ECC correction happened and extract
+		 * the real number of bitflips from this comparison.
+		 * That's not needed for 8-bit ECC, because the status expose
+		 * a better approximation of the number of bitflips in a page.
+		 */
+		if (chip->ecc_strength_ds == 4) {
+			micron->ecc.rawbuf = kmalloc(mtd->writesize +
+						     mtd->oobsize,
+						     GFP_KERNEL);
+			if (!micron->ecc.rawbuf) {
+				ret = -ENOMEM;
+				goto err_free_manuf_data;
+			}
 		}
 
 		chip->ecc.bytes = chip->ecc_strength_ds * 2;
@@ -369,6 +440,20 @@ static int micron_nand_init(struct nand_chip *chip)
 	}
 
 	return 0;
+
+err_free_manuf_data:
+	kfree(micron->ecc.rawbuf);
+	kfree(micron);
+
+	return ret;
+}
+
+static void micron_nand_cleanup(struct nand_chip *chip)
+{
+	struct micron_nand *micron = nand_get_manufacturer_data(chip);
+
+	kfree(micron->ecc.rawbuf);
+	kfree(micron);
 }
 
 static void micron_fixup_onfi_param_page(struct nand_chip *chip,
@@ -385,5 +470,6 @@ static void micron_fixup_onfi_param_page(struct nand_chip *chip,
 
 const struct nand_manufacturer_ops micron_nand_manuf_ops = {
 	.init = micron_nand_init,
+	.cleanup = micron_nand_cleanup,
 	.fixup_onfi_param_page = micron_fixup_onfi_param_page,
 };
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/3] mtd: rawnand: micron: Get the actual number of bitflips
  2018-07-03 12:20 [PATCH 0/3] mtd: rawnand: micron: Get the actual number of bitflips Boris Brezillon
                   ` (2 preceding siblings ...)
  2018-07-03 12:20 ` [PATCH 3/3] mtd: rawnand: micron: Get the actual number of bitflips Boris Brezillon
@ 2018-07-08 21:48 ` Miquel Raynal
  3 siblings, 0 replies; 13+ messages in thread
From: Miquel Raynal @ 2018-07-08 21:48 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Richard Weinberger, linux-mtd, David Woodhouse, Brian Norris,
	Marek Vasut, Bean Huo, Chris Packham

Hi Boris,

Boris Brezillon <boris.brezillon@bootlin.com> wrote on Tue,  3 Jul 2018
14:20:06 +0200:

> Hello,
> 
> Back when support for Micron 4bit/512byte on-die ECC was added, we
> decided that it wasn't worth retrieving the real number of bitflips
> when the chip was returning WRITE_RECOMMENDED and instead decided to
> always return the maximum value (ECC strength).
> 
> This decision leads to UBI moving eraseblocks around as soon as 1
> bitflip is present, which is far from optimal, and might wear the NAND
> out faster than if we get the actual number of bitflips by re-reading
> the page in raw mode and comparing its content to the corrected
> version.
> 
> IIRC, Bean warned us about that, but it seems we didn't listen, so now
> is time to revisit the implementation and implement what Bean initially
> suggested.
> 
> This implementation has been tested on an MT29F2G08ABAEAH4, and seems
> to work as expected (nandbiterrs works fine, and the real number of
> bitflips is now returned).
> 
> Here are some details about these patches:
> Patch 1 is just a cleanup to avoid passing parameters we don't need to
> the ecc_status() functions. Patch 2 is preparing things for the actual
> changes by reworking the ordering in the read function, and patch 3 is
> implementing the read-in-raw-mode-and-compare logic.
> 
> Regards,
> 
> Boris
> 
> Boris Brezillon (3):
>   mtd: rawnand: micron: Stop passing mtd to ecc_status() funcs
>   mtd: rawnand: micron: Disable ECC earlier in the read path
>   mtd: rawnand: micron: Get the actual number of bitflips
> 
>  drivers/mtd/nand/raw/nand_micron.c | 145 ++++++++++++++++++++++++++++++-------
>  1 file changed, 120 insertions(+), 25 deletions(-)
> 


Series applied to nand/next.

Thanks,
Miquèl

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [2/3] mtd: rawnand: micron: Disable ECC earlier in the read path
  2018-07-03 12:20 ` [PATCH 2/3] mtd: rawnand: micron: Disable ECC earlier in the read path Boris Brezillon
@ 2018-07-16  9:00   ` Chris Packham
  2018-07-16 16:10     ` Boris Brezillon
  0 siblings, 1 reply; 13+ messages in thread
From: Chris Packham @ 2018-07-16  9:00 UTC (permalink / raw)
  To: Boris Brezillon, Richard Weinberger, Miquel Raynal, linux-mtd
  Cc: Marek Vasut, Brian Norris, David Woodhouse, Bean Huo

Hi Boris,

On 04/07/18 00:20, Boris Brezillon wrote:
> We are about to support extracting the real number of bitflips for
> 4-bits ECC when WRITE_RECOMMEND is returned. This requires re-reading
> the page in raw mode to compare it to the corrected version, and this
> logic will be placed in micron_nand_on_die_ecc_status_4().
> 
> Moving the micron_nand_on_die_ecc_setup() will allow us to disable
> ECC only once.
> 
> As a result, we have to rework the exit path and add an error path
> where the ECC is disabled.
> 
> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>

As I said on the other thread this appears to cause a problem for me on 
the MT29F1G08ABAFAWP-ITE setup I have. I notice we're not able to find 
the BBT, not sure if that is symptom or cause. If I revert this I can 
successfully mount and access data on the chip.

Here's some output from dmesg:

nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xd1
nand: Micron MT29F1G08ABAGAWP
nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 128
nand: timing mode 5 not acknowledged by the NAND chip
Bad block table not found for chip 0
Bad block table not found for chip 0
Scanning device for bad blocks
random: fast init done
Bad block table written to 0x000007fe0000, version 0x01
Bad block table written to 0x000007fc0000, version 0x01
3 fixed-partitions partitions found on MTD device pxa3xx_nand-0
Creating 3 MTD partitions on "pxa3xx_nand-0":
0x000000000000-0x000007000000 : "user"
0x000007000000-0x000007800000 : "errlog"
mtdoops: Attached to MTD device 1
mtdoops: ready 0, 1
0x000007800000-0x000008000000 : "nand-bbt"

And from my attempt to mount:

[root@linuxbox ~]# umount /flash && ubiattach -p /dev/mtd0 && mount -t 
ubifs ubi
0:user -o sync /flash
ubi0: attaching mtd0
ubi0: scanning is finished
ubi0 error: ubi_read_volume_table: the layout volume was not found
ubi0 error: ubi_attach_mtd_dev: failed to attach mtd0, error -22
ubiattach: error!: cannot attach "/dev/mtd0"
            error 22 (Invalid argument)
[root@linuxbox ~]#


> ---
>   drivers/mtd/nand/raw/nand_micron.c | 25 ++++++++++++++++---------
>   1 file changed, 16 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/mtd/nand/raw/nand_micron.c b/drivers/mtd/nand/raw/nand_micron.c
> index 63ac98a36ed7..b9cbaf125a98 100644
> --- a/drivers/mtd/nand/raw/nand_micron.c
> +++ b/drivers/mtd/nand/raw/nand_micron.c
> @@ -197,30 +197,37 @@ micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
>   
>   	ret = nand_read_page_op(chip, page, 0, NULL, 0);
>   	if (ret)
> -		goto out;
> +		goto err_disable_ecc;
>   
>   	ret = nand_status_op(chip, &status);
>   	if (ret)
> -		goto out;
> +		goto err_disable_ecc;
>   
>   	ret = nand_exit_status_op(chip);
>   	if (ret)
> -		goto out;
> +		goto err_disable_ecc;
>   
> -	if (chip->ecc.strength == 4)
> -		max_bitflips = micron_nand_on_die_ecc_status_4(chip, status);
> -	else
> -		max_bitflips = micron_nand_on_die_ecc_status_8(chip, status);
> +	micron_nand_on_die_ecc_setup(chip, false);
>   
>   	ret = nand_read_data_op(chip, buf, mtd->writesize, false);
>   	if (!ret && oob_required)
>   		ret = nand_read_data_op(chip, chip->oob_poi, mtd->oobsize,
>   					false);
>   
> -out:
> +	if (ret)
> +		return ret;
> +
> +	if (chip->ecc.strength == 4)
> +		max_bitflips = micron_nand_on_die_ecc_status_4(chip, status);
> +	else
> +		max_bitflips = micron_nand_on_die_ecc_status_8(chip, status);
> +
> +	return max_bitflips;
> +
> +err_disable_ecc:
>   	micron_nand_on_die_ecc_setup(chip, false);
>   
> -	return ret ? ret : max_bitflips;
> +	return ret;
>   }
>   
>   static int
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [2/3] mtd: rawnand: micron: Disable ECC earlier in the read path
  2018-07-16  9:00   ` [2/3] " Chris Packham
@ 2018-07-16 16:10     ` Boris Brezillon
  2018-07-16 20:55       ` Boris Brezillon
  0 siblings, 1 reply; 13+ messages in thread
From: Boris Brezillon @ 2018-07-16 16:10 UTC (permalink / raw)
  To: Chris Packham
  Cc: Richard Weinberger, Miquel Raynal, linux-mtd, Marek Vasut,
	Brian Norris, David Woodhouse, Bean Huo

On Mon, 16 Jul 2018 09:00:59 +0000
Chris Packham <Chris.Packham@alliedtelesis.co.nz> wrote:

> Hi Boris,
> 
> On 04/07/18 00:20, Boris Brezillon wrote:
> > We are about to support extracting the real number of bitflips for
> > 4-bits ECC when WRITE_RECOMMEND is returned. This requires re-reading
> > the page in raw mode to compare it to the corrected version, and this
> > logic will be placed in micron_nand_on_die_ecc_status_4().
> > 
> > Moving the micron_nand_on_die_ecc_setup() will allow us to disable
> > ECC only once.
> > 
> > As a result, we have to rework the exit path and add an error path
> > where the ECC is disabled.
> > 
> > Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>  
> 
> As I said on the other thread this appears to cause a problem for me on 
> the MT29F1G08ABAFAWP-ITE setup I have. I notice we're not able to find 
> the BBT, not sure if that is symptom or cause. If I revert this I can 
> successfully mount and access data on the chip.
> 
> Here's some output from dmesg:
> 
> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xd1
> nand: Micron MT29F1G08ABAGAWP
> nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 128
> nand: timing mode 5 not acknowledged by the NAND chip
> Bad block table not found for chip 0
> Bad block table not found for chip 0
> Scanning device for bad blocks
> random: fast init done
> Bad block table written to 0x000007fe0000, version 0x01
> Bad block table written to 0x000007fc0000, version 0x01
> 3 fixed-partitions partitions found on MTD device pxa3xx_nand-0
> Creating 3 MTD partitions on "pxa3xx_nand-0":
> 0x000000000000-0x000007000000 : "user"
> 0x000007000000-0x000007800000 : "errlog"
> mtdoops: Attached to MTD device 1
> mtdoops: ready 0, 1
> 0x000007800000-0x000008000000 : "nand-bbt"
> 
> And from my attempt to mount:
> 
> [root@linuxbox ~]# umount /flash && ubiattach -p /dev/mtd0 && mount -t 
> ubifs ubi
> 0:user -o sync /flash
> ubi0: attaching mtd0
> ubi0: scanning is finished
> ubi0 error: ubi_read_volume_table: the layout volume was not found
> ubi0 error: ubi_attach_mtd_dev: failed to attach mtd0, error -22
> ubiattach: error!: cannot attach "/dev/mtd0"
>             error 22 (Invalid argument)
> [root@linuxbox ~]#
> 
> 
> > ---
> >   drivers/mtd/nand/raw/nand_micron.c | 25 ++++++++++++++++---------
> >   1 file changed, 16 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/mtd/nand/raw/nand_micron.c b/drivers/mtd/nand/raw/nand_micron.c
> > index 63ac98a36ed7..b9cbaf125a98 100644
> > --- a/drivers/mtd/nand/raw/nand_micron.c
> > +++ b/drivers/mtd/nand/raw/nand_micron.c
> > @@ -197,30 +197,37 @@ micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
> >   
> >   	ret = nand_read_page_op(chip, page, 0, NULL, 0);
> >   	if (ret)
> > -		goto out;
> > +		goto err_disable_ecc;
> >   
> >   	ret = nand_status_op(chip, &status);
> >   	if (ret)
> > -		goto out;
> > +		goto err_disable_ecc;
> >   
> >   	ret = nand_exit_status_op(chip);
> >   	if (ret)
> > -		goto out;
> > +		goto err_disable_ecc;
> >   
> > -	if (chip->ecc.strength == 4)
> > -		max_bitflips = micron_nand_on_die_ecc_status_4(chip, status);
> > -	else
> > -		max_bitflips = micron_nand_on_die_ecc_status_8(chip, status);
> > +	micron_nand_on_die_ecc_setup(chip, false);

Hm, can you try to move the micron_nand_on_die_ecc_setup(chip, false)
call just before nand_exit_status_op()?

> >   
> >   	ret = nand_read_data_op(chip, buf, mtd->writesize, false);
> >   	if (!ret && oob_required)
> >   		ret = nand_read_data_op(chip, chip->oob_poi, mtd->oobsize,
> >   					false);
> >   
> > -out:
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (chip->ecc.strength == 4)
> > +		max_bitflips = micron_nand_on_die_ecc_status_4(chip, status);
> > +	else
> > +		max_bitflips = micron_nand_on_die_ecc_status_8(chip, status);
> > +
> > +	return max_bitflips;
> > +
> > +err_disable_ecc:
> >   	micron_nand_on_die_ecc_setup(chip, false);
> >   
> > -	return ret ? ret : max_bitflips;
> > +	return ret;
> >   }
> >   
> >   static int
> >   
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [2/3] mtd: rawnand: micron: Disable ECC earlier in the read path
  2018-07-16 16:10     ` Boris Brezillon
@ 2018-07-16 20:55       ` Boris Brezillon
  2018-07-16 22:42         ` Chris Packham
  0 siblings, 1 reply; 13+ messages in thread
From: Boris Brezillon @ 2018-07-16 20:55 UTC (permalink / raw)
  To: Chris Packham
  Cc: Richard Weinberger, Miquel Raynal, linux-mtd, Marek Vasut,
	Brian Norris, David Woodhouse, Bean Huo

On Mon, 16 Jul 2018 18:10:57 +0200
Boris Brezillon <boris.brezillon@bootlin.com> wrote:

> On Mon, 16 Jul 2018 09:00:59 +0000
> Chris Packham <Chris.Packham@alliedtelesis.co.nz> wrote:
> 
> > Hi Boris,
> > 
> > On 04/07/18 00:20, Boris Brezillon wrote:  
> > > We are about to support extracting the real number of bitflips for
> > > 4-bits ECC when WRITE_RECOMMEND is returned. This requires re-reading
> > > the page in raw mode to compare it to the corrected version, and this
> > > logic will be placed in micron_nand_on_die_ecc_status_4().
> > > 
> > > Moving the micron_nand_on_die_ecc_setup() will allow us to disable
> > > ECC only once.
> > > 
> > > As a result, we have to rework the exit path and add an error path
> > > where the ECC is disabled.
> > > 
> > > Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>    
> > 
> > As I said on the other thread this appears to cause a problem for me on 
> > the MT29F1G08ABAFAWP-ITE setup I have. I notice we're not able to find 
> > the BBT, not sure if that is symptom or cause.

It's most likely the symptom, not the cause.

> > > 
> > > diff --git a/drivers/mtd/nand/raw/nand_micron.c b/drivers/mtd/nand/raw/nand_micron.c
> > > index 63ac98a36ed7..b9cbaf125a98 100644
> > > --- a/drivers/mtd/nand/raw/nand_micron.c
> > > +++ b/drivers/mtd/nand/raw/nand_micron.c
> > > @@ -197,30 +197,37 @@ micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
> > >   
> > >   	ret = nand_read_page_op(chip, page, 0, NULL, 0);
> > >   	if (ret)
> > > -		goto out;
> > > +		goto err_disable_ecc;
> > >   
> > >   	ret = nand_status_op(chip, &status);
> > >   	if (ret)
> > > -		goto out;
> > > +		goto err_disable_ecc;
> > >   
> > >   	ret = nand_exit_status_op(chip);
> > >   	if (ret)
> > > -		goto out;
> > > +		goto err_disable_ecc;
> > >   
> > > -	if (chip->ecc.strength == 4)
> > > -		max_bitflips = micron_nand_on_die_ecc_status_4(chip, status);
> > > -	else
> > > -		max_bitflips = micron_nand_on_die_ecc_status_8(chip, status);
> > > +	micron_nand_on_die_ecc_setup(chip, false);  
> 
> Hm, can you try to move the micron_nand_on_die_ecc_setup(chip, false)
> call just before nand_exit_status_op()?
> 

Just pushed a branch fixing that [1]. Can you test it? If it works,
I'll ask Miquel to drop the initial set of patches and instead pick the
fixed ones so that we don't break bisectibility.

[1]https://github.com/bbrezillon/linux-0day/commits/nand/next

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [2/3] mtd: rawnand: micron: Disable ECC earlier in the read path
  2018-07-16 20:55       ` Boris Brezillon
@ 2018-07-16 22:42         ` Chris Packham
  2018-07-17 11:41           ` Boris Brezillon
  0 siblings, 1 reply; 13+ messages in thread
From: Chris Packham @ 2018-07-16 22:42 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Richard Weinberger, Miquel Raynal, linux-mtd, Marek Vasut,
	Brian Norris, David Woodhouse, Bean Huo

On 17/07/18 08:55, Boris Brezillon wrote:
> On Mon, 16 Jul 2018 18:10:57 +0200
> Boris Brezillon <boris.brezillon@bootlin.com> wrote:
> 
>> On Mon, 16 Jul 2018 09:00:59 +0000
>> Chris Packham <Chris.Packham@alliedtelesis.co.nz> wrote:
>>
>>> Hi Boris,
>>>
>>> On 04/07/18 00:20, Boris Brezillon wrote:
>>>> We are about to support extracting the real number of bitflips for
>>>> 4-bits ECC when WRITE_RECOMMEND is returned. This requires re-reading
>>>> the page in raw mode to compare it to the corrected version, and this
>>>> logic will be placed in micron_nand_on_die_ecc_status_4().
>>>>
>>>> Moving the micron_nand_on_die_ecc_setup() will allow us to disable
>>>> ECC only once.
>>>>
>>>> As a result, we have to rework the exit path and add an error path
>>>> where the ECC is disabled.
>>>>
>>>> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
>>>
>>> As I said on the other thread this appears to cause a problem for me on
>>> the MT29F1G08ABAFAWP-ITE setup I have. I notice we're not able to find
>>> the BBT, not sure if that is symptom or cause.
> 
> It's most likely the symptom, not the cause.
> 
>>>>
>>>> diff --git a/drivers/mtd/nand/raw/nand_micron.c b/drivers/mtd/nand/raw/nand_micron.c
>>>> index 63ac98a36ed7..b9cbaf125a98 100644
>>>> --- a/drivers/mtd/nand/raw/nand_micron.c
>>>> +++ b/drivers/mtd/nand/raw/nand_micron.c
>>>> @@ -197,30 +197,37 @@ micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
>>>>    
>>>>    	ret = nand_read_page_op(chip, page, 0, NULL, 0);
>>>>    	if (ret)
>>>> -		goto out;
>>>> +		goto err_disable_ecc;
>>>>    
>>>>    	ret = nand_status_op(chip, &status);
>>>>    	if (ret)
>>>> -		goto out;
>>>> +		goto err_disable_ecc;
>>>>    
>>>>    	ret = nand_exit_status_op(chip);
>>>>    	if (ret)
>>>> -		goto out;
>>>> +		goto err_disable_ecc;
>>>>    
>>>> -	if (chip->ecc.strength == 4)
>>>> -		max_bitflips = micron_nand_on_die_ecc_status_4(chip, status);
>>>> -	else
>>>> -		max_bitflips = micron_nand_on_die_ecc_status_8(chip, status);
>>>> +	micron_nand_on_die_ecc_setup(chip, false);
>>
>> Hm, can you try to move the micron_nand_on_die_ecc_setup(chip, false)
>> call just before nand_exit_status_op()?
>>
> 
> Just pushed a branch fixing that [1]. Can you test it? If it works,
> I'll ask Miquel to drop the initial set of patches and instead pick the
> fixed ones so that we don't break bisectibility.
> 
> [1]https://github.com/bbrezillon/linux-0day/commits/nand/next
> 

Still appears to have the same problem.

I'm guessing that since you can't actually disable ecc on this chip 
calling micron_nand_on_die_ecc_setup(chip, false); before reading the 
oob data interferes with it somehow (if I call it after there is no 
problem).

We could add code to qualify the attempt to disable ecc early based on 
it being optional/mandatory or just stick with it being disabled late.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [2/3] mtd: rawnand: micron: Disable ECC earlier in the read path
  2018-07-16 22:42         ` Chris Packham
@ 2018-07-17 11:41           ` Boris Brezillon
  2018-07-17 21:27             ` Chris Packham
  0 siblings, 1 reply; 13+ messages in thread
From: Boris Brezillon @ 2018-07-17 11:41 UTC (permalink / raw)
  To: Chris Packham
  Cc: Richard Weinberger, Miquel Raynal, linux-mtd, Marek Vasut,
	Brian Norris, David Woodhouse, Bean Huo

On Mon, 16 Jul 2018 22:42:54 +0000
Chris Packham <Chris.Packham@alliedtelesis.co.nz> wrote:

> On 17/07/18 08:55, Boris Brezillon wrote:
> > On Mon, 16 Jul 2018 18:10:57 +0200
> > Boris Brezillon <boris.brezillon@bootlin.com> wrote:
> >   
> >> On Mon, 16 Jul 2018 09:00:59 +0000
> >> Chris Packham <Chris.Packham@alliedtelesis.co.nz> wrote:
> >>  
> >>> Hi Boris,
> >>>
> >>> On 04/07/18 00:20, Boris Brezillon wrote:  
> >>>> We are about to support extracting the real number of bitflips for
> >>>> 4-bits ECC when WRITE_RECOMMEND is returned. This requires re-reading
> >>>> the page in raw mode to compare it to the corrected version, and this
> >>>> logic will be placed in micron_nand_on_die_ecc_status_4().
> >>>>
> >>>> Moving the micron_nand_on_die_ecc_setup() will allow us to disable
> >>>> ECC only once.
> >>>>
> >>>> As a result, we have to rework the exit path and add an error path
> >>>> where the ECC is disabled.
> >>>>
> >>>> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>  
> >>>
> >>> As I said on the other thread this appears to cause a problem for me on
> >>> the MT29F1G08ABAFAWP-ITE setup I have. I notice we're not able to find
> >>> the BBT, not sure if that is symptom or cause.  
> > 
> > It's most likely the symptom, not the cause.
> >   
> >>>>
> >>>> diff --git a/drivers/mtd/nand/raw/nand_micron.c b/drivers/mtd/nand/raw/nand_micron.c
> >>>> index 63ac98a36ed7..b9cbaf125a98 100644
> >>>> --- a/drivers/mtd/nand/raw/nand_micron.c
> >>>> +++ b/drivers/mtd/nand/raw/nand_micron.c
> >>>> @@ -197,30 +197,37 @@ micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
> >>>>    
> >>>>    	ret = nand_read_page_op(chip, page, 0, NULL, 0);
> >>>>    	if (ret)
> >>>> -		goto out;
> >>>> +		goto err_disable_ecc;
> >>>>    
> >>>>    	ret = nand_status_op(chip, &status);
> >>>>    	if (ret)
> >>>> -		goto out;
> >>>> +		goto err_disable_ecc;
> >>>>    
> >>>>    	ret = nand_exit_status_op(chip);
> >>>>    	if (ret)
> >>>> -		goto out;
> >>>> +		goto err_disable_ecc;
> >>>>    
> >>>> -	if (chip->ecc.strength == 4)
> >>>> -		max_bitflips = micron_nand_on_die_ecc_status_4(chip, status);
> >>>> -	else
> >>>> -		max_bitflips = micron_nand_on_die_ecc_status_8(chip, status);
> >>>> +	micron_nand_on_die_ecc_setup(chip, false);  
> >>
> >> Hm, can you try to move the micron_nand_on_die_ecc_setup(chip, false)
> >> call just before nand_exit_status_op()?
> >>  
> > 
> > Just pushed a branch fixing that [1]. Can you test it? If it works,
> > I'll ask Miquel to drop the initial set of patches and instead pick the
> > fixed ones so that we don't break bisectibility.
> > 
> > [1]https://github.com/bbrezillon/linux-0day/commits/nand/next
> >   
> 
> Still appears to have the same problem.
> 
> I'm guessing that since you can't actually disable ecc on this chip 
> calling micron_nand_on_die_ecc_setup(chip, false); before reading the 
> oob data interferes with it somehow (if I call it after there is no 
> problem).
> 
> We could add code to qualify the attempt to disable ecc early based on 
> it being optional/mandatory or just stick with it being disabled late.

Okay. I gave up on disabling ECC earlier and instead implemented what
you suggested. Can you test
https://github.com/bbrezillon/linux-0day/commits/nand/next again and
let me know if it works for you?

Thanks,

Boris

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [2/3] mtd: rawnand: micron: Disable ECC earlier in the read path
  2018-07-17 11:41           ` Boris Brezillon
@ 2018-07-17 21:27             ` Chris Packham
  2018-07-17 21:31               ` Boris Brezillon
  0 siblings, 1 reply; 13+ messages in thread
From: Chris Packham @ 2018-07-17 21:27 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Richard Weinberger, Miquel Raynal, linux-mtd, Marek Vasut,
	Brian Norris, David Woodhouse, Bean Huo

Hi Boris,

On 17/07/18 23:42, Boris Brezillon wrote:
> On Mon, 16 Jul 2018 22:42:54 +0000
> Chris Packham <Chris.Packham@alliedtelesis.co.nz> wrote:
> 
>> On 17/07/18 08:55, Boris Brezillon wrote:
>>> On Mon, 16 Jul 2018 18:10:57 +0200
>>> Boris Brezillon <boris.brezillon@bootlin.com> wrote:
>>>    
>>>> On Mon, 16 Jul 2018 09:00:59 +0000
>>>> Chris Packham <Chris.Packham@alliedtelesis.co.nz> wrote:
>>>>   
>>>>> Hi Boris,
>>>>>
>>>>> On 04/07/18 00:20, Boris Brezillon wrote:
>>>>>> We are about to support extracting the real number of bitflips for
>>>>>> 4-bits ECC when WRITE_RECOMMEND is returned. This requires re-reading
>>>>>> the page in raw mode to compare it to the corrected version, and this
>>>>>> logic will be placed in micron_nand_on_die_ecc_status_4().
>>>>>>
>>>>>> Moving the micron_nand_on_die_ecc_setup() will allow us to disable
>>>>>> ECC only once.
>>>>>>
>>>>>> As a result, we have to rework the exit path and add an error path
>>>>>> where the ECC is disabled.
>>>>>>
>>>>>> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
>>>>>
>>>>> As I said on the other thread this appears to cause a problem for me on
>>>>> the MT29F1G08ABAFAWP-ITE setup I have. I notice we're not able to find
>>>>> the BBT, not sure if that is symptom or cause.
>>>
>>> It's most likely the symptom, not the cause.
>>>    
>>>>>>
>>>>>> diff --git a/drivers/mtd/nand/raw/nand_micron.c b/drivers/mtd/nand/raw/nand_micron.c
>>>>>> index 63ac98a36ed7..b9cbaf125a98 100644
>>>>>> --- a/drivers/mtd/nand/raw/nand_micron.c
>>>>>> +++ b/drivers/mtd/nand/raw/nand_micron.c
>>>>>> @@ -197,30 +197,37 @@ micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
>>>>>>     
>>>>>>     	ret = nand_read_page_op(chip, page, 0, NULL, 0);
>>>>>>     	if (ret)
>>>>>> -		goto out;
>>>>>> +		goto err_disable_ecc;
>>>>>>     
>>>>>>     	ret = nand_status_op(chip, &status);
>>>>>>     	if (ret)
>>>>>> -		goto out;
>>>>>> +		goto err_disable_ecc;
>>>>>>     
>>>>>>     	ret = nand_exit_status_op(chip);
>>>>>>     	if (ret)
>>>>>> -		goto out;
>>>>>> +		goto err_disable_ecc;
>>>>>>     
>>>>>> -	if (chip->ecc.strength == 4)
>>>>>> -		max_bitflips = micron_nand_on_die_ecc_status_4(chip, status);
>>>>>> -	else
>>>>>> -		max_bitflips = micron_nand_on_die_ecc_status_8(chip, status);
>>>>>> +	micron_nand_on_die_ecc_setup(chip, false);
>>>>
>>>> Hm, can you try to move the micron_nand_on_die_ecc_setup(chip, false)
>>>> call just before nand_exit_status_op()?
>>>>   
>>>
>>> Just pushed a branch fixing that [1]. Can you test it? If it works,
>>> I'll ask Miquel to drop the initial set of patches and instead pick the
>>> fixed ones so that we don't break bisectibility.
>>>
>>> [1]https://github.com/bbrezillon/linux-0day/commits/nand/next
>>>    
>>
>> Still appears to have the same problem.
>>
>> I'm guessing that since you can't actually disable ecc on this chip
>> calling micron_nand_on_die_ecc_setup(chip, false); before reading the
>> oob data interferes with it somehow (if I call it after there is no
>> problem).
>>
>> We could add code to qualify the attempt to disable ecc early based on
>> it being optional/mandatory or just stick with it being disabled late.
> 
> Okay. I gave up on disabling ECC earlier and instead implemented what
> you suggested. Can you test
> https://github.com/bbrezillon/linux-0day/commits/nand/next again and
> let me know if it works for you?

Yes that works for me. Feel free to add

Tested-by: Chris Packham <chris.packham@alliedtelesis.co.nz>

to

88240e244026 mtd: rawnand: micron: Make ECC activation stateful
6ae7f205cec0 mtd: rawnand: micron: Avoid enabling/disabling ECC when it 
can't be disabled



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [2/3] mtd: rawnand: micron: Disable ECC earlier in the read path
  2018-07-17 21:27             ` Chris Packham
@ 2018-07-17 21:31               ` Boris Brezillon
  2018-07-18  7:25                 ` Miquel Raynal
  0 siblings, 1 reply; 13+ messages in thread
From: Boris Brezillon @ 2018-07-17 21:31 UTC (permalink / raw)
  To: Chris Packham
  Cc: Richard Weinberger, Miquel Raynal, linux-mtd, Marek Vasut,
	Brian Norris, David Woodhouse, Bean Huo

On Tue, 17 Jul 2018 21:27:26 +0000
Chris Packham <Chris.Packham@alliedtelesis.co.nz> wrote:

> Hi Boris,
> 
> On 17/07/18 23:42, Boris Brezillon wrote:
> > On Mon, 16 Jul 2018 22:42:54 +0000
> > Chris Packham <Chris.Packham@alliedtelesis.co.nz> wrote:
> >   
> >> On 17/07/18 08:55, Boris Brezillon wrote:  
> >>> On Mon, 16 Jul 2018 18:10:57 +0200
> >>> Boris Brezillon <boris.brezillon@bootlin.com> wrote:
> >>>      
> >>>> On Mon, 16 Jul 2018 09:00:59 +0000
> >>>> Chris Packham <Chris.Packham@alliedtelesis.co.nz> wrote:
> >>>>     
> >>>>> Hi Boris,
> >>>>>
> >>>>> On 04/07/18 00:20, Boris Brezillon wrote:  
> >>>>>> We are about to support extracting the real number of bitflips for
> >>>>>> 4-bits ECC when WRITE_RECOMMEND is returned. This requires re-reading
> >>>>>> the page in raw mode to compare it to the corrected version, and this
> >>>>>> logic will be placed in micron_nand_on_die_ecc_status_4().
> >>>>>>
> >>>>>> Moving the micron_nand_on_die_ecc_setup() will allow us to disable
> >>>>>> ECC only once.
> >>>>>>
> >>>>>> As a result, we have to rework the exit path and add an error path
> >>>>>> where the ECC is disabled.
> >>>>>>
> >>>>>> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>  
> >>>>>
> >>>>> As I said on the other thread this appears to cause a problem for me on
> >>>>> the MT29F1G08ABAFAWP-ITE setup I have. I notice we're not able to find
> >>>>> the BBT, not sure if that is symptom or cause.  
> >>>
> >>> It's most likely the symptom, not the cause.
> >>>      
> >>>>>>
> >>>>>> diff --git a/drivers/mtd/nand/raw/nand_micron.c b/drivers/mtd/nand/raw/nand_micron.c
> >>>>>> index 63ac98a36ed7..b9cbaf125a98 100644
> >>>>>> --- a/drivers/mtd/nand/raw/nand_micron.c
> >>>>>> +++ b/drivers/mtd/nand/raw/nand_micron.c
> >>>>>> @@ -197,30 +197,37 @@ micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
> >>>>>>     
> >>>>>>     	ret = nand_read_page_op(chip, page, 0, NULL, 0);
> >>>>>>     	if (ret)
> >>>>>> -		goto out;
> >>>>>> +		goto err_disable_ecc;
> >>>>>>     
> >>>>>>     	ret = nand_status_op(chip, &status);
> >>>>>>     	if (ret)
> >>>>>> -		goto out;
> >>>>>> +		goto err_disable_ecc;
> >>>>>>     
> >>>>>>     	ret = nand_exit_status_op(chip);
> >>>>>>     	if (ret)
> >>>>>> -		goto out;
> >>>>>> +		goto err_disable_ecc;
> >>>>>>     
> >>>>>> -	if (chip->ecc.strength == 4)
> >>>>>> -		max_bitflips = micron_nand_on_die_ecc_status_4(chip, status);
> >>>>>> -	else
> >>>>>> -		max_bitflips = micron_nand_on_die_ecc_status_8(chip, status);
> >>>>>> +	micron_nand_on_die_ecc_setup(chip, false);  
> >>>>
> >>>> Hm, can you try to move the micron_nand_on_die_ecc_setup(chip, false)
> >>>> call just before nand_exit_status_op()?
> >>>>     
> >>>
> >>> Just pushed a branch fixing that [1]. Can you test it? If it works,
> >>> I'll ask Miquel to drop the initial set of patches and instead pick the
> >>> fixed ones so that we don't break bisectibility.
> >>>
> >>> [1]https://github.com/bbrezillon/linux-0day/commits/nand/next
> >>>      
> >>
> >> Still appears to have the same problem.
> >>
> >> I'm guessing that since you can't actually disable ecc on this chip
> >> calling micron_nand_on_die_ecc_setup(chip, false); before reading the
> >> oob data interferes with it somehow (if I call it after there is no
> >> problem).
> >>
> >> We could add code to qualify the attempt to disable ecc early based on
> >> it being optional/mandatory or just stick with it being disabled late.  
> > 
> > Okay. I gave up on disabling ECC earlier and instead implemented what
> > you suggested. Can you test
> > https://github.com/bbrezillon/linux-0day/commits/nand/next again and
> > let me know if it works for you?  
> 
> Yes that works for me. Feel free to add
> 
> Tested-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
> 
> to
> 
> 88240e244026 mtd: rawnand: micron: Make ECC activation stateful
> 6ae7f205cec0 mtd: rawnand: micron: Avoid enabling/disabling ECC when it 
> can't be disabled

Okay, cool.

Miquel, can you drop:

683ca8a70738 mtd: rawnand: micron: Get the actual number of bitflips
d73e5d29aa29 mtd: rawnand: micron: Disable ECC earlier in the read path
db2d37331813 mtd: rawnand: micron: Stop passing mtd to ecc_status() funcs
51f3b3970a8c mtd: rawnand: micron: detect forced on-die ECC
386fc2609d15 mtd: rawnand: micron: support 8/512 on-die ECC

I'll send a new patch series soon.

Thanks,

Boris

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [2/3] mtd: rawnand: micron: Disable ECC earlier in the read path
  2018-07-17 21:31               ` Boris Brezillon
@ 2018-07-18  7:25                 ` Miquel Raynal
  0 siblings, 0 replies; 13+ messages in thread
From: Miquel Raynal @ 2018-07-18  7:25 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Chris Packham, Richard Weinberger, linux-mtd, Marek Vasut,
	Brian Norris, David Woodhouse, Bean Huo

Hi Boris,

Boris Brezillon <boris.brezillon@bootlin.com> wrote on Tue, 17 Jul 2018
23:31:43 +0200:

> On Tue, 17 Jul 2018 21:27:26 +0000
> Chris Packham <Chris.Packham@alliedtelesis.co.nz> wrote:
> 
> > Hi Boris,
> > 
> > On 17/07/18 23:42, Boris Brezillon wrote:  
> > > On Mon, 16 Jul 2018 22:42:54 +0000
> > > Chris Packham <Chris.Packham@alliedtelesis.co.nz> wrote:
> > >     
> > >> On 17/07/18 08:55, Boris Brezillon wrote:    
> > >>> On Mon, 16 Jul 2018 18:10:57 +0200
> > >>> Boris Brezillon <boris.brezillon@bootlin.com> wrote:
> > >>>        
> > >>>> On Mon, 16 Jul 2018 09:00:59 +0000
> > >>>> Chris Packham <Chris.Packham@alliedtelesis.co.nz> wrote:
> > >>>>       
> > >>>>> Hi Boris,
> > >>>>>
> > >>>>> On 04/07/18 00:20, Boris Brezillon wrote:    
> > >>>>>> We are about to support extracting the real number of bitflips for
> > >>>>>> 4-bits ECC when WRITE_RECOMMEND is returned. This requires re-reading
> > >>>>>> the page in raw mode to compare it to the corrected version, and this
> > >>>>>> logic will be placed in micron_nand_on_die_ecc_status_4().
> > >>>>>>
> > >>>>>> Moving the micron_nand_on_die_ecc_setup() will allow us to disable
> > >>>>>> ECC only once.
> > >>>>>>
> > >>>>>> As a result, we have to rework the exit path and add an error path
> > >>>>>> where the ECC is disabled.
> > >>>>>>
> > >>>>>> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>    
> > >>>>>
> > >>>>> As I said on the other thread this appears to cause a problem for me on
> > >>>>> the MT29F1G08ABAFAWP-ITE setup I have. I notice we're not able to find
> > >>>>> the BBT, not sure if that is symptom or cause.    
> > >>>
> > >>> It's most likely the symptom, not the cause.
> > >>>        
> > >>>>>>
> > >>>>>> diff --git a/drivers/mtd/nand/raw/nand_micron.c b/drivers/mtd/nand/raw/nand_micron.c
> > >>>>>> index 63ac98a36ed7..b9cbaf125a98 100644
> > >>>>>> --- a/drivers/mtd/nand/raw/nand_micron.c
> > >>>>>> +++ b/drivers/mtd/nand/raw/nand_micron.c
> > >>>>>> @@ -197,30 +197,37 @@ micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
> > >>>>>>     
> > >>>>>>     	ret = nand_read_page_op(chip, page, 0, NULL, 0);
> > >>>>>>     	if (ret)
> > >>>>>> -		goto out;
> > >>>>>> +		goto err_disable_ecc;
> > >>>>>>     
> > >>>>>>     	ret = nand_status_op(chip, &status);
> > >>>>>>     	if (ret)
> > >>>>>> -		goto out;
> > >>>>>> +		goto err_disable_ecc;
> > >>>>>>     
> > >>>>>>     	ret = nand_exit_status_op(chip);
> > >>>>>>     	if (ret)
> > >>>>>> -		goto out;
> > >>>>>> +		goto err_disable_ecc;
> > >>>>>>     
> > >>>>>> -	if (chip->ecc.strength == 4)
> > >>>>>> -		max_bitflips = micron_nand_on_die_ecc_status_4(chip, status);
> > >>>>>> -	else
> > >>>>>> -		max_bitflips = micron_nand_on_die_ecc_status_8(chip, status);
> > >>>>>> +	micron_nand_on_die_ecc_setup(chip, false);    
> > >>>>
> > >>>> Hm, can you try to move the micron_nand_on_die_ecc_setup(chip, false)
> > >>>> call just before nand_exit_status_op()?
> > >>>>       
> > >>>
> > >>> Just pushed a branch fixing that [1]. Can you test it? If it works,
> > >>> I'll ask Miquel to drop the initial set of patches and instead pick the
> > >>> fixed ones so that we don't break bisectibility.
> > >>>
> > >>> [1]https://github.com/bbrezillon/linux-0day/commits/nand/next
> > >>>        
> > >>
> > >> Still appears to have the same problem.
> > >>
> > >> I'm guessing that since you can't actually disable ecc on this chip
> > >> calling micron_nand_on_die_ecc_setup(chip, false); before reading the
> > >> oob data interferes with it somehow (if I call it after there is no
> > >> problem).
> > >>
> > >> We could add code to qualify the attempt to disable ecc early based on
> > >> it being optional/mandatory or just stick with it being disabled late.    
> > > 
> > > Okay. I gave up on disabling ECC earlier and instead implemented what
> > > you suggested. Can you test
> > > https://github.com/bbrezillon/linux-0day/commits/nand/next again and
> > > let me know if it works for you?    
> > 
> > Yes that works for me. Feel free to add
> > 
> > Tested-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
> > 
> > to
> > 
> > 88240e244026 mtd: rawnand: micron: Make ECC activation stateful
> > 6ae7f205cec0 mtd: rawnand: micron: Avoid enabling/disabling ECC when it 
> > can't be disabled  
> 
> Okay, cool.
> 
> Miquel, can you drop:
> 
> 683ca8a70738 mtd: rawnand: micron: Get the actual number of bitflips
> d73e5d29aa29 mtd: rawnand: micron: Disable ECC earlier in the read path
> db2d37331813 mtd: rawnand: micron: Stop passing mtd to ecc_status() funcs
> 51f3b3970a8c mtd: rawnand: micron: detect forced on-die ECC
> 386fc2609d15 mtd: rawnand: micron: support 8/512 on-die ECC
> 
> I'll send a new patch series soon.

Above patches dropped, thank you both for working on (and fixing) this.

I'll wait for the new series.

Thanks,
Miquèl

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2018-07-18  7:25 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-03 12:20 [PATCH 0/3] mtd: rawnand: micron: Get the actual number of bitflips Boris Brezillon
2018-07-03 12:20 ` [PATCH 1/3] mtd: rawnand: micron: Stop passing mtd to ecc_status() funcs Boris Brezillon
2018-07-03 12:20 ` [PATCH 2/3] mtd: rawnand: micron: Disable ECC earlier in the read path Boris Brezillon
2018-07-16  9:00   ` [2/3] " Chris Packham
2018-07-16 16:10     ` Boris Brezillon
2018-07-16 20:55       ` Boris Brezillon
2018-07-16 22:42         ` Chris Packham
2018-07-17 11:41           ` Boris Brezillon
2018-07-17 21:27             ` Chris Packham
2018-07-17 21:31               ` Boris Brezillon
2018-07-18  7:25                 ` Miquel Raynal
2018-07-03 12:20 ` [PATCH 3/3] mtd: rawnand: micron: Get the actual number of bitflips Boris Brezillon
2018-07-08 21:48 ` [PATCH 0/3] " Miquel Raynal

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.