All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
@ 2016-02-21 12:52 ` Markus Pargmann
  0 siblings, 0 replies; 32+ messages in thread
From: Markus Pargmann @ 2016-02-21 12:52 UTC (permalink / raw)
  To: Han Xu, David Woodhouse, Boris BREZILLON
  Cc: Brian Norris, Huang Shijie, Fabio Estevam, linux-mtd,
	linux-arm-kernel, kernel, Markus Pargmann

ECC is only calculated for written pages. As erased pages are not
actively written the ECC is always invalid. For this purpose the
Hardware BCH unit is able to check for erased pages and does not raise
an ECC error in this case. This behaviour can be influenced using the
BCH_MODE register which sets the number of allowed bitflips in an erased
page. Unfortunately the unit is not capable of fixing the bitflips in
memory.

To avoid complete software checks for erased pages, we can simply check
buffers with uncorrectable ECC errors because we know that any erased
page with errors is uncorrectable by the BCH unit.

This patch adds the generic nand_check_erased_ecc_chunk() to gpmi-nand
to correct erased pages. To have the valid data in the buffer before
using them, this patch moves the read_page_swap_end() call before the
ECC status checking for-loop.

Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
---
 drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 49 ++++++++++++++++++++++++++++++----
 1 file changed, 44 insertions(+), 5 deletions(-)

diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
index 235ddcb58f39..ce5a21252102 100644
--- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
@@ -1035,14 +1035,58 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 	/* Loop over status bytes, accumulating ECC status. */
 	status = auxiliary_virt + nfc_geo->auxiliary_status_offset;
 
+	read_page_swap_end(this, buf, nfc_geo->payload_size,
+			   this->payload_virt, this->payload_phys,
+			   nfc_geo->payload_size,
+			   payload_virt, payload_phys);
+
 	for (i = 0; i < nfc_geo->ecc_chunk_count; i++, status++) {
 		if ((*status == STATUS_GOOD) || (*status == STATUS_ERASED))
 			continue;
 
 		if (*status == STATUS_UNCORRECTABLE) {
+			int flips;
+
+			/*
+			 * The ECC hardware has an uncorrectable ECC status
+			 * code in case we have bitflips in an erased page. As
+			 * nothing was written into this subpage the ECC is
+			 * obviously wrong and we can not trust it. We assume
+			 * at this point that we are reading an erased page and
+			 * try to correct the bitflips in buffer up to
+			 * ecc_strength bitflips. If this is a page with random
+			 * data, we exceed this number of bitflips and have a
+			 * ECC failure. Otherwise we use the corrected buffer.
+			 */
+			if (i == 0) {
+				/* The first block includes metadata */
+				flips = nand_check_erased_ecc_chunk(
+						buf + i * nfc_geo->ecc_chunk_size,
+						nfc_geo->ecc_chunk_size,
+						NULL, 0,
+						auxiliary_virt,
+						nfc_geo->metadata_size,
+						nfc_geo->ecc_strength);
+			} else {
+				flips = nand_check_erased_ecc_chunk(
+						buf + i * nfc_geo->ecc_chunk_size,
+						nfc_geo->ecc_chunk_size,
+						NULL, 0,
+						NULL, 0,
+						nfc_geo->ecc_strength);
+			}
+
+			if (flips > 0) {
+				max_bitflips = max_t(unsigned int, max_bitflips,
+						     flips);
+				mtd->ecc_stats.corrected += flips;
+				continue;
+			}
+
 			mtd->ecc_stats.failed++;
 			continue;
 		}
+
 		mtd->ecc_stats.corrected += *status;
 		max_bitflips = max_t(unsigned int, max_bitflips, *status);
 	}
@@ -1062,11 +1106,6 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 		chip->oob_poi[0] = ((uint8_t *) auxiliary_virt)[0];
 	}
 
-	read_page_swap_end(this, buf, nfc_geo->payload_size,
-			this->payload_virt, this->payload_phys,
-			nfc_geo->payload_size,
-			payload_virt, payload_phys);
-
 	return max_bitflips;
 }
 
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
@ 2016-02-21 12:52 ` Markus Pargmann
  0 siblings, 0 replies; 32+ messages in thread
From: Markus Pargmann @ 2016-02-21 12:52 UTC (permalink / raw)
  To: linux-arm-kernel

ECC is only calculated for written pages. As erased pages are not
actively written the ECC is always invalid. For this purpose the
Hardware BCH unit is able to check for erased pages and does not raise
an ECC error in this case. This behaviour can be influenced using the
BCH_MODE register which sets the number of allowed bitflips in an erased
page. Unfortunately the unit is not capable of fixing the bitflips in
memory.

To avoid complete software checks for erased pages, we can simply check
buffers with uncorrectable ECC errors because we know that any erased
page with errors is uncorrectable by the BCH unit.

This patch adds the generic nand_check_erased_ecc_chunk() to gpmi-nand
to correct erased pages. To have the valid data in the buffer before
using them, this patch moves the read_page_swap_end() call before the
ECC status checking for-loop.

Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
---
 drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 49 ++++++++++++++++++++++++++++++----
 1 file changed, 44 insertions(+), 5 deletions(-)

diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
index 235ddcb58f39..ce5a21252102 100644
--- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
@@ -1035,14 +1035,58 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 	/* Loop over status bytes, accumulating ECC status. */
 	status = auxiliary_virt + nfc_geo->auxiliary_status_offset;
 
+	read_page_swap_end(this, buf, nfc_geo->payload_size,
+			   this->payload_virt, this->payload_phys,
+			   nfc_geo->payload_size,
+			   payload_virt, payload_phys);
+
 	for (i = 0; i < nfc_geo->ecc_chunk_count; i++, status++) {
 		if ((*status == STATUS_GOOD) || (*status == STATUS_ERASED))
 			continue;
 
 		if (*status == STATUS_UNCORRECTABLE) {
+			int flips;
+
+			/*
+			 * The ECC hardware has an uncorrectable ECC status
+			 * code in case we have bitflips in an erased page. As
+			 * nothing was written into this subpage the ECC is
+			 * obviously wrong and we can not trust it. We assume
+			 * at this point that we are reading an erased page and
+			 * try to correct the bitflips in buffer up to
+			 * ecc_strength bitflips. If this is a page with random
+			 * data, we exceed this number of bitflips and have a
+			 * ECC failure. Otherwise we use the corrected buffer.
+			 */
+			if (i == 0) {
+				/* The first block includes metadata */
+				flips = nand_check_erased_ecc_chunk(
+						buf + i * nfc_geo->ecc_chunk_size,
+						nfc_geo->ecc_chunk_size,
+						NULL, 0,
+						auxiliary_virt,
+						nfc_geo->metadata_size,
+						nfc_geo->ecc_strength);
+			} else {
+				flips = nand_check_erased_ecc_chunk(
+						buf + i * nfc_geo->ecc_chunk_size,
+						nfc_geo->ecc_chunk_size,
+						NULL, 0,
+						NULL, 0,
+						nfc_geo->ecc_strength);
+			}
+
+			if (flips > 0) {
+				max_bitflips = max_t(unsigned int, max_bitflips,
+						     flips);
+				mtd->ecc_stats.corrected += flips;
+				continue;
+			}
+
 			mtd->ecc_stats.failed++;
 			continue;
 		}
+
 		mtd->ecc_stats.corrected += *status;
 		max_bitflips = max_t(unsigned int, max_bitflips, *status);
 	}
@@ -1062,11 +1106,6 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 		chip->oob_poi[0] = ((uint8_t *) auxiliary_virt)[0];
 	}
 
-	read_page_swap_end(this, buf, nfc_geo->payload_size,
-			this->payload_virt, this->payload_phys,
-			nfc_geo->payload_size,
-			payload_virt, payload_phys);
-
 	return max_bitflips;
 }
 
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
  2016-02-21 12:52 ` Markus Pargmann
@ 2016-02-24 13:57   ` Boris Brezillon
  -1 siblings, 0 replies; 32+ messages in thread
From: Boris Brezillon @ 2016-02-24 13:57 UTC (permalink / raw)
  To: Markus Pargmann
  Cc: Han Xu, David Woodhouse, Fabio Estevam, linux-mtd, kernel,
	Huang Shijie, Brian Norris, linux-arm-kernel

Hi Markus,

On Sun, 21 Feb 2016 13:52:06 +0100
Markus Pargmann <mpa@pengutronix.de> wrote:

> ECC is only calculated for written pages. As erased pages are not
> actively written the ECC is always invalid. For this purpose the
> Hardware BCH unit is able to check for erased pages and does not raise
> an ECC error in this case. This behaviour can be influenced using the
> BCH_MODE register which sets the number of allowed bitflips in an erased
> page. Unfortunately the unit is not capable of fixing the bitflips in
> memory.
> 
> To avoid complete software checks for erased pages, we can simply check
> buffers with uncorrectable ECC errors because we know that any erased
> page with errors is uncorrectable by the BCH unit.
> 
> This patch adds the generic nand_check_erased_ecc_chunk() to gpmi-nand
> to correct erased pages. To have the valid data in the buffer before
> using them, this patch moves the read_page_swap_end() call before the
> ECC status checking for-loop.
> 
> Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
> ---
>  drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 49 ++++++++++++++++++++++++++++++----
>  1 file changed, 44 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> index 235ddcb58f39..ce5a21252102 100644
> --- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> +++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> @@ -1035,14 +1035,58 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
>  	/* Loop over status bytes, accumulating ECC status. */
>  	status = auxiliary_virt + nfc_geo->auxiliary_status_offset;
>  
> +	read_page_swap_end(this, buf, nfc_geo->payload_size,
> +			   this->payload_virt, this->payload_phys,
> +			   nfc_geo->payload_size,
> +			   payload_virt, payload_phys);
> +
>  	for (i = 0; i < nfc_geo->ecc_chunk_count; i++, status++) {
>  		if ((*status == STATUS_GOOD) || (*status == STATUS_ERASED))
>  			continue;
>  
>  		if (*status == STATUS_UNCORRECTABLE) {
> +			int flips;
> +
> +			/*
> +			 * The ECC hardware has an uncorrectable ECC status
> +			 * code in case we have bitflips in an erased page. As
> +			 * nothing was written into this subpage the ECC is
> +			 * obviously wrong and we can not trust it. We assume
> +			 * at this point that we are reading an erased page and
> +			 * try to correct the bitflips in buffer up to
> +			 * ecc_strength bitflips. If this is a page with random
> +			 * data, we exceed this number of bitflips and have a
> +			 * ECC failure. Otherwise we use the corrected buffer.
> +			 */
> +			if (i == 0) {
> +				/* The first block includes metadata */
> +				flips = nand_check_erased_ecc_chunk(
> +						buf + i * nfc_geo->ecc_chunk_size,
> +						nfc_geo->ecc_chunk_size,
> +						NULL, 0,
> +						auxiliary_virt,
> +						nfc_geo->metadata_size,
> +						nfc_geo->ecc_strength);
> +			} else {
> +				flips = nand_check_erased_ecc_chunk(
> +						buf + i * nfc_geo->ecc_chunk_size,
> +						nfc_geo->ecc_chunk_size,
> +						NULL, 0,
> +						NULL, 0,
> +						nfc_geo->ecc_strength);
> +			}

You're not checking ECC bytes. I know it's a bit more complicated to
implement, but as Brian stated several times, you should always check
ECC bytes along with data bytes when testing for an erased chunk.

Indeed, it might appear that the user really programmed ff on the page,
and in this case you don't want to consider the page as erased.
In this case, the ECC bytes will be different from ff, and you'll
be able to identify it by checking those ECC bytes.

Best Regards,

Boris

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
@ 2016-02-24 13:57   ` Boris Brezillon
  0 siblings, 0 replies; 32+ messages in thread
From: Boris Brezillon @ 2016-02-24 13:57 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Markus,

On Sun, 21 Feb 2016 13:52:06 +0100
Markus Pargmann <mpa@pengutronix.de> wrote:

> ECC is only calculated for written pages. As erased pages are not
> actively written the ECC is always invalid. For this purpose the
> Hardware BCH unit is able to check for erased pages and does not raise
> an ECC error in this case. This behaviour can be influenced using the
> BCH_MODE register which sets the number of allowed bitflips in an erased
> page. Unfortunately the unit is not capable of fixing the bitflips in
> memory.
> 
> To avoid complete software checks for erased pages, we can simply check
> buffers with uncorrectable ECC errors because we know that any erased
> page with errors is uncorrectable by the BCH unit.
> 
> This patch adds the generic nand_check_erased_ecc_chunk() to gpmi-nand
> to correct erased pages. To have the valid data in the buffer before
> using them, this patch moves the read_page_swap_end() call before the
> ECC status checking for-loop.
> 
> Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
> ---
>  drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 49 ++++++++++++++++++++++++++++++----
>  1 file changed, 44 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> index 235ddcb58f39..ce5a21252102 100644
> --- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> +++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> @@ -1035,14 +1035,58 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
>  	/* Loop over status bytes, accumulating ECC status. */
>  	status = auxiliary_virt + nfc_geo->auxiliary_status_offset;
>  
> +	read_page_swap_end(this, buf, nfc_geo->payload_size,
> +			   this->payload_virt, this->payload_phys,
> +			   nfc_geo->payload_size,
> +			   payload_virt, payload_phys);
> +
>  	for (i = 0; i < nfc_geo->ecc_chunk_count; i++, status++) {
>  		if ((*status == STATUS_GOOD) || (*status == STATUS_ERASED))
>  			continue;
>  
>  		if (*status == STATUS_UNCORRECTABLE) {
> +			int flips;
> +
> +			/*
> +			 * The ECC hardware has an uncorrectable ECC status
> +			 * code in case we have bitflips in an erased page. As
> +			 * nothing was written into this subpage the ECC is
> +			 * obviously wrong and we can not trust it. We assume
> +			 * at this point that we are reading an erased page and
> +			 * try to correct the bitflips in buffer up to
> +			 * ecc_strength bitflips. If this is a page with random
> +			 * data, we exceed this number of bitflips and have a
> +			 * ECC failure. Otherwise we use the corrected buffer.
> +			 */
> +			if (i == 0) {
> +				/* The first block includes metadata */
> +				flips = nand_check_erased_ecc_chunk(
> +						buf + i * nfc_geo->ecc_chunk_size,
> +						nfc_geo->ecc_chunk_size,
> +						NULL, 0,
> +						auxiliary_virt,
> +						nfc_geo->metadata_size,
> +						nfc_geo->ecc_strength);
> +			} else {
> +				flips = nand_check_erased_ecc_chunk(
> +						buf + i * nfc_geo->ecc_chunk_size,
> +						nfc_geo->ecc_chunk_size,
> +						NULL, 0,
> +						NULL, 0,
> +						nfc_geo->ecc_strength);
> +			}

You're not checking ECC bytes. I know it's a bit more complicated to
implement, but as Brian stated several times, you should always check
ECC bytes along with data bytes when testing for an erased chunk.

Indeed, it might appear that the user really programmed ff on the page,
and in this case you don't want to consider the page as erased.
In this case, the ECC bytes will be different from ff, and you'll
be able to identify it by checking those ECC bytes.

Best Regards,

Boris

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
  2016-02-24 13:57   ` Boris Brezillon
@ 2016-04-11  6:34     ` Markus Pargmann
  -1 siblings, 0 replies; 32+ messages in thread
From: Markus Pargmann @ 2016-04-11  6:34 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Han Xu, David Woodhouse, Fabio Estevam, linux-mtd, kernel,
	Huang Shijie, Brian Norris, linux-arm-kernel

[-- Attachment #1: Type: text/plain, Size: 4871 bytes --]

Hi Boris,

On Wednesday 24 February 2016 14:57:50 Boris Brezillon wrote:
> Hi Markus,
> 
> On Sun, 21 Feb 2016 13:52:06 +0100
> Markus Pargmann <mpa@pengutronix.de> wrote:
> 
> > ECC is only calculated for written pages. As erased pages are not
> > actively written the ECC is always invalid. For this purpose the
> > Hardware BCH unit is able to check for erased pages and does not raise
> > an ECC error in this case. This behaviour can be influenced using the
> > BCH_MODE register which sets the number of allowed bitflips in an erased
> > page. Unfortunately the unit is not capable of fixing the bitflips in
> > memory.
> > 
> > To avoid complete software checks for erased pages, we can simply check
> > buffers with uncorrectable ECC errors because we know that any erased
> > page with errors is uncorrectable by the BCH unit.
> > 
> > This patch adds the generic nand_check_erased_ecc_chunk() to gpmi-nand
> > to correct erased pages. To have the valid data in the buffer before
> > using them, this patch moves the read_page_swap_end() call before the
> > ECC status checking for-loop.
> > 
> > Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
> > ---
> >  drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 49 ++++++++++++++++++++++++++++++----
> >  1 file changed, 44 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > index 235ddcb58f39..ce5a21252102 100644
> > --- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > +++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > @@ -1035,14 +1035,58 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
> >  	/* Loop over status bytes, accumulating ECC status. */
> >  	status = auxiliary_virt + nfc_geo->auxiliary_status_offset;
> >  
> > +	read_page_swap_end(this, buf, nfc_geo->payload_size,
> > +			   this->payload_virt, this->payload_phys,
> > +			   nfc_geo->payload_size,
> > +			   payload_virt, payload_phys);
> > +
> >  	for (i = 0; i < nfc_geo->ecc_chunk_count; i++, status++) {
> >  		if ((*status == STATUS_GOOD) || (*status == STATUS_ERASED))
> >  			continue;
> >  
> >  		if (*status == STATUS_UNCORRECTABLE) {
> > +			int flips;
> > +
> > +			/*
> > +			 * The ECC hardware has an uncorrectable ECC status
> > +			 * code in case we have bitflips in an erased page. As
> > +			 * nothing was written into this subpage the ECC is
> > +			 * obviously wrong and we can not trust it. We assume
> > +			 * at this point that we are reading an erased page and
> > +			 * try to correct the bitflips in buffer up to
> > +			 * ecc_strength bitflips. If this is a page with random
> > +			 * data, we exceed this number of bitflips and have a
> > +			 * ECC failure. Otherwise we use the corrected buffer.
> > +			 */
> > +			if (i == 0) {
> > +				/* The first block includes metadata */
> > +				flips = nand_check_erased_ecc_chunk(
> > +						buf + i * nfc_geo->ecc_chunk_size,
> > +						nfc_geo->ecc_chunk_size,
> > +						NULL, 0,
> > +						auxiliary_virt,
> > +						nfc_geo->metadata_size,
> > +						nfc_geo->ecc_strength);
> > +			} else {
> > +				flips = nand_check_erased_ecc_chunk(
> > +						buf + i * nfc_geo->ecc_chunk_size,
> > +						nfc_geo->ecc_chunk_size,
> > +						NULL, 0,
> > +						NULL, 0,
> > +						nfc_geo->ecc_strength);
> > +			}
> 
> You're not checking ECC bytes. I know it's a bit more complicated to
> implement, but as Brian stated several times, you should always check
> ECC bytes along with data bytes when testing for an erased chunk.
> 
> Indeed, it might appear that the user really programmed ff on the page,
> and in this case you don't want to consider the page as erased.
> In this case, the ECC bytes will be different from ff, and you'll
> be able to identify it by checking those ECC bytes.

Thanks for the feedback. Talking with a coworker about this we may have found a
better approach to this that is less complicated to implement. The hardware
unit allows us to set a bitflip threshold for erased pages. The ECC unit
creates an ECC error only if the number of bitflips exceeds this threshold, but
it does not correct these. So the idea is to change the patch so that we set
pages, that are signaled by the ECC as erased, to 0xff completely without
checking. So the ECC will do all the work and we completely trust in its
abilities to do it correctly.

I will send a modified patch as soon as I have some time for this.

Best Regards,

Markus

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
@ 2016-04-11  6:34     ` Markus Pargmann
  0 siblings, 0 replies; 32+ messages in thread
From: Markus Pargmann @ 2016-04-11  6:34 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Boris,

On Wednesday 24 February 2016 14:57:50 Boris Brezillon wrote:
> Hi Markus,
> 
> On Sun, 21 Feb 2016 13:52:06 +0100
> Markus Pargmann <mpa@pengutronix.de> wrote:
> 
> > ECC is only calculated for written pages. As erased pages are not
> > actively written the ECC is always invalid. For this purpose the
> > Hardware BCH unit is able to check for erased pages and does not raise
> > an ECC error in this case. This behaviour can be influenced using the
> > BCH_MODE register which sets the number of allowed bitflips in an erased
> > page. Unfortunately the unit is not capable of fixing the bitflips in
> > memory.
> > 
> > To avoid complete software checks for erased pages, we can simply check
> > buffers with uncorrectable ECC errors because we know that any erased
> > page with errors is uncorrectable by the BCH unit.
> > 
> > This patch adds the generic nand_check_erased_ecc_chunk() to gpmi-nand
> > to correct erased pages. To have the valid data in the buffer before
> > using them, this patch moves the read_page_swap_end() call before the
> > ECC status checking for-loop.
> > 
> > Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
> > ---
> >  drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 49 ++++++++++++++++++++++++++++++----
> >  1 file changed, 44 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > index 235ddcb58f39..ce5a21252102 100644
> > --- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > +++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > @@ -1035,14 +1035,58 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
> >  	/* Loop over status bytes, accumulating ECC status. */
> >  	status = auxiliary_virt + nfc_geo->auxiliary_status_offset;
> >  
> > +	read_page_swap_end(this, buf, nfc_geo->payload_size,
> > +			   this->payload_virt, this->payload_phys,
> > +			   nfc_geo->payload_size,
> > +			   payload_virt, payload_phys);
> > +
> >  	for (i = 0; i < nfc_geo->ecc_chunk_count; i++, status++) {
> >  		if ((*status == STATUS_GOOD) || (*status == STATUS_ERASED))
> >  			continue;
> >  
> >  		if (*status == STATUS_UNCORRECTABLE) {
> > +			int flips;
> > +
> > +			/*
> > +			 * The ECC hardware has an uncorrectable ECC status
> > +			 * code in case we have bitflips in an erased page. As
> > +			 * nothing was written into this subpage the ECC is
> > +			 * obviously wrong and we can not trust it. We assume
> > +			 * at this point that we are reading an erased page and
> > +			 * try to correct the bitflips in buffer up to
> > +			 * ecc_strength bitflips. If this is a page with random
> > +			 * data, we exceed this number of bitflips and have a
> > +			 * ECC failure. Otherwise we use the corrected buffer.
> > +			 */
> > +			if (i == 0) {
> > +				/* The first block includes metadata */
> > +				flips = nand_check_erased_ecc_chunk(
> > +						buf + i * nfc_geo->ecc_chunk_size,
> > +						nfc_geo->ecc_chunk_size,
> > +						NULL, 0,
> > +						auxiliary_virt,
> > +						nfc_geo->metadata_size,
> > +						nfc_geo->ecc_strength);
> > +			} else {
> > +				flips = nand_check_erased_ecc_chunk(
> > +						buf + i * nfc_geo->ecc_chunk_size,
> > +						nfc_geo->ecc_chunk_size,
> > +						NULL, 0,
> > +						NULL, 0,
> > +						nfc_geo->ecc_strength);
> > +			}
> 
> You're not checking ECC bytes. I know it's a bit more complicated to
> implement, but as Brian stated several times, you should always check
> ECC bytes along with data bytes when testing for an erased chunk.
> 
> Indeed, it might appear that the user really programmed ff on the page,
> and in this case you don't want to consider the page as erased.
> In this case, the ECC bytes will be different from ff, and you'll
> be able to identify it by checking those ECC bytes.

Thanks for the feedback. Talking with a coworker about this we may have found a
better approach to this that is less complicated to implement. The hardware
unit allows us to set a bitflip threshold for erased pages. The ECC unit
creates an ECC error only if the number of bitflips exceeds this threshold, but
it does not correct these. So the idea is to change the patch so that we set
pages, that are signaled by the ECC as erased, to 0xff completely without
checking. So the ECC will do all the work and we completely trust in its
abilities to do it correctly.

I will send a modified patch as soon as I have some time for this.

Best Regards,

Markus

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20160411/f276b318/attachment-0001.sig>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
  2016-04-11  6:34     ` Markus Pargmann
@ 2016-04-11  7:29       ` Boris Brezillon
  -1 siblings, 0 replies; 32+ messages in thread
From: Boris Brezillon @ 2016-04-11  7:29 UTC (permalink / raw)
  To: Markus Pargmann
  Cc: Han Xu, David Woodhouse, Fabio Estevam, linux-mtd, kernel,
	Huang Shijie, Brian Norris, linux-arm-kernel

On Mon, 11 Apr 2016 08:34:35 +0200
Markus Pargmann <mpa@pengutronix.de> wrote:

> Hi Boris,
> 
> On Wednesday 24 February 2016 14:57:50 Boris Brezillon wrote:
> > Hi Markus,
> > 
> > On Sun, 21 Feb 2016 13:52:06 +0100
> > Markus Pargmann <mpa@pengutronix.de> wrote:
> > 
> > > ECC is only calculated for written pages. As erased pages are not
> > > actively written the ECC is always invalid. For this purpose the
> > > Hardware BCH unit is able to check for erased pages and does not raise
> > > an ECC error in this case. This behaviour can be influenced using the
> > > BCH_MODE register which sets the number of allowed bitflips in an erased
> > > page. Unfortunately the unit is not capable of fixing the bitflips in
> > > memory.
> > > 
> > > To avoid complete software checks for erased pages, we can simply check
> > > buffers with uncorrectable ECC errors because we know that any erased
> > > page with errors is uncorrectable by the BCH unit.
> > > 
> > > This patch adds the generic nand_check_erased_ecc_chunk() to gpmi-nand
> > > to correct erased pages. To have the valid data in the buffer before
> > > using them, this patch moves the read_page_swap_end() call before the
> > > ECC status checking for-loop.
> > > 
> > > Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
> > > ---
> > >  drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 49 ++++++++++++++++++++++++++++++----
> > >  1 file changed, 44 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > > index 235ddcb58f39..ce5a21252102 100644
> > > --- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > > +++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > > @@ -1035,14 +1035,58 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
> > >  	/* Loop over status bytes, accumulating ECC status. */
> > >  	status = auxiliary_virt + nfc_geo->auxiliary_status_offset;
> > >  
> > > +	read_page_swap_end(this, buf, nfc_geo->payload_size,
> > > +			   this->payload_virt, this->payload_phys,
> > > +			   nfc_geo->payload_size,
> > > +			   payload_virt, payload_phys);
> > > +
> > >  	for (i = 0; i < nfc_geo->ecc_chunk_count; i++, status++) {
> > >  		if ((*status == STATUS_GOOD) || (*status == STATUS_ERASED))
> > >  			continue;
> > >  
> > >  		if (*status == STATUS_UNCORRECTABLE) {
> > > +			int flips;
> > > +
> > > +			/*
> > > +			 * The ECC hardware has an uncorrectable ECC status
> > > +			 * code in case we have bitflips in an erased page. As
> > > +			 * nothing was written into this subpage the ECC is
> > > +			 * obviously wrong and we can not trust it. We assume
> > > +			 * at this point that we are reading an erased page and
> > > +			 * try to correct the bitflips in buffer up to
> > > +			 * ecc_strength bitflips. If this is a page with random
> > > +			 * data, we exceed this number of bitflips and have a
> > > +			 * ECC failure. Otherwise we use the corrected buffer.
> > > +			 */
> > > +			if (i == 0) {
> > > +				/* The first block includes metadata */
> > > +				flips = nand_check_erased_ecc_chunk(
> > > +						buf + i * nfc_geo->ecc_chunk_size,
> > > +						nfc_geo->ecc_chunk_size,
> > > +						NULL, 0,
> > > +						auxiliary_virt,
> > > +						nfc_geo->metadata_size,
> > > +						nfc_geo->ecc_strength);
> > > +			} else {
> > > +				flips = nand_check_erased_ecc_chunk(
> > > +						buf + i * nfc_geo->ecc_chunk_size,
> > > +						nfc_geo->ecc_chunk_size,
> > > +						NULL, 0,
> > > +						NULL, 0,
> > > +						nfc_geo->ecc_strength);
> > > +			}
> > 
> > You're not checking ECC bytes. I know it's a bit more complicated to
> > implement, but as Brian stated several times, you should always check
> > ECC bytes along with data bytes when testing for an erased chunk.
> > 
> > Indeed, it might appear that the user really programmed ff on the page,
> > and in this case you don't want to consider the page as erased.
> > In this case, the ECC bytes will be different from ff, and you'll
> > be able to identify it by checking those ECC bytes.
> 
> Thanks for the feedback. Talking with a coworker about this we may have found a
> better approach to this that is less complicated to implement. The hardware
> unit allows us to set a bitflip threshold for erased pages. The ECC unit
> creates an ECC error only if the number of bitflips exceeds this threshold, but
> it does not correct these. So the idea is to change the patch so that we set
> pages, that are signaled by the ECC as erased, to 0xff completely without
> checking. So the ECC will do all the work and we completely trust in its
> abilities to do it correctly.

Sounds good.

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
@ 2016-04-11  7:29       ` Boris Brezillon
  0 siblings, 0 replies; 32+ messages in thread
From: Boris Brezillon @ 2016-04-11  7:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 11 Apr 2016 08:34:35 +0200
Markus Pargmann <mpa@pengutronix.de> wrote:

> Hi Boris,
> 
> On Wednesday 24 February 2016 14:57:50 Boris Brezillon wrote:
> > Hi Markus,
> > 
> > On Sun, 21 Feb 2016 13:52:06 +0100
> > Markus Pargmann <mpa@pengutronix.de> wrote:
> > 
> > > ECC is only calculated for written pages. As erased pages are not
> > > actively written the ECC is always invalid. For this purpose the
> > > Hardware BCH unit is able to check for erased pages and does not raise
> > > an ECC error in this case. This behaviour can be influenced using the
> > > BCH_MODE register which sets the number of allowed bitflips in an erased
> > > page. Unfortunately the unit is not capable of fixing the bitflips in
> > > memory.
> > > 
> > > To avoid complete software checks for erased pages, we can simply check
> > > buffers with uncorrectable ECC errors because we know that any erased
> > > page with errors is uncorrectable by the BCH unit.
> > > 
> > > This patch adds the generic nand_check_erased_ecc_chunk() to gpmi-nand
> > > to correct erased pages. To have the valid data in the buffer before
> > > using them, this patch moves the read_page_swap_end() call before the
> > > ECC status checking for-loop.
> > > 
> > > Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
> > > ---
> > >  drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 49 ++++++++++++++++++++++++++++++----
> > >  1 file changed, 44 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > > index 235ddcb58f39..ce5a21252102 100644
> > > --- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > > +++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > > @@ -1035,14 +1035,58 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
> > >  	/* Loop over status bytes, accumulating ECC status. */
> > >  	status = auxiliary_virt + nfc_geo->auxiliary_status_offset;
> > >  
> > > +	read_page_swap_end(this, buf, nfc_geo->payload_size,
> > > +			   this->payload_virt, this->payload_phys,
> > > +			   nfc_geo->payload_size,
> > > +			   payload_virt, payload_phys);
> > > +
> > >  	for (i = 0; i < nfc_geo->ecc_chunk_count; i++, status++) {
> > >  		if ((*status == STATUS_GOOD) || (*status == STATUS_ERASED))
> > >  			continue;
> > >  
> > >  		if (*status == STATUS_UNCORRECTABLE) {
> > > +			int flips;
> > > +
> > > +			/*
> > > +			 * The ECC hardware has an uncorrectable ECC status
> > > +			 * code in case we have bitflips in an erased page. As
> > > +			 * nothing was written into this subpage the ECC is
> > > +			 * obviously wrong and we can not trust it. We assume
> > > +			 * at this point that we are reading an erased page and
> > > +			 * try to correct the bitflips in buffer up to
> > > +			 * ecc_strength bitflips. If this is a page with random
> > > +			 * data, we exceed this number of bitflips and have a
> > > +			 * ECC failure. Otherwise we use the corrected buffer.
> > > +			 */
> > > +			if (i == 0) {
> > > +				/* The first block includes metadata */
> > > +				flips = nand_check_erased_ecc_chunk(
> > > +						buf + i * nfc_geo->ecc_chunk_size,
> > > +						nfc_geo->ecc_chunk_size,
> > > +						NULL, 0,
> > > +						auxiliary_virt,
> > > +						nfc_geo->metadata_size,
> > > +						nfc_geo->ecc_strength);
> > > +			} else {
> > > +				flips = nand_check_erased_ecc_chunk(
> > > +						buf + i * nfc_geo->ecc_chunk_size,
> > > +						nfc_geo->ecc_chunk_size,
> > > +						NULL, 0,
> > > +						NULL, 0,
> > > +						nfc_geo->ecc_strength);
> > > +			}
> > 
> > You're not checking ECC bytes. I know it's a bit more complicated to
> > implement, but as Brian stated several times, you should always check
> > ECC bytes along with data bytes when testing for an erased chunk.
> > 
> > Indeed, it might appear that the user really programmed ff on the page,
> > and in this case you don't want to consider the page as erased.
> > In this case, the ECC bytes will be different from ff, and you'll
> > be able to identify it by checking those ECC bytes.
> 
> Thanks for the feedback. Talking with a coworker about this we may have found a
> better approach to this that is less complicated to implement. The hardware
> unit allows us to set a bitflip threshold for erased pages. The ECC unit
> creates an ECC error only if the number of bitflips exceeds this threshold, but
> it does not correct these. So the idea is to change the patch so that we set
> pages, that are signaled by the ECC as erased, to 0xff completely without
> checking. So the ECC will do all the work and we completely trust in its
> abilities to do it correctly.

Sounds good.

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
  2016-04-11  7:29       ` Boris Brezillon
@ 2016-04-12 22:39         ` Han Xu
  -1 siblings, 0 replies; 32+ messages in thread
From: Han Xu @ 2016-04-12 22:39 UTC (permalink / raw)
  To: Boris Brezillon, Markus Pargmann
  Cc: David Woodhouse, Fabio Estevam, linux-mtd, kernel, Huang Shijie,
	Brian Norris, linux-arm-kernel



________________________________________
From: Boris Brezillon <boris.brezillon@free-electrons.com>
Sent: Monday, April 11, 2016 1:29 AM
To: Markus Pargmann
Cc: Han Xu; David Woodhouse; Fabio Estevam; linux-mtd@lists.infradead.org; kernel@pengutronix.de; Huang Shijie; Brian Norris; linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages

On Mon, 11 Apr 2016 08:34:35 +0200
Markus Pargmann <mpa@pengutronix.de> wrote:

> Hi Boris,
>
> On Wednesday 24 February 2016 14:57:50 Boris Brezillon wrote:
> > Hi Markus,
> >
> > On Sun, 21 Feb 2016 13:52:06 +0100
> > Markus Pargmann <mpa@pengutronix.de> wrote:
> >
> > > ECC is only calculated for written pages. As erased pages are not
> > > actively written the ECC is always invalid. For this purpose the
> > > Hardware BCH unit is able to check for erased pages and does not raise
> > > an ECC error in this case. This behaviour can be influenced using the
> > > BCH_MODE register which sets the number of allowed bitflips in an erased
> > > page. Unfortunately the unit is not capable of fixing the bitflips in
> > > memory.
> > >
> > > To avoid complete software checks for erased pages, we can simply check
> > > buffers with uncorrectable ECC errors because we know that any erased
> > > page with errors is uncorrectable by the BCH unit.
> > >
> > > This patch adds the generic nand_check_erased_ecc_chunk() to gpmi-nand
> > > to correct erased pages. To have the valid data in the buffer before
> > > using them, this patch moves the read_page_swap_end() call before the
> > > ECC status checking for-loop.
> > >
> > > Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
> > > ---
> > >  drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 49 ++++++++++++++++++++++++++++++----
> > >  1 file changed, 44 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > > index 235ddcb58f39..ce5a21252102 100644
> > > --- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > > +++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > > @@ -1035,14 +1035,58 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
> > >   /* Loop over status bytes, accumulating ECC status. */
> > >   status = auxiliary_virt + nfc_geo->auxiliary_status_offset;
> > >
> > > + read_page_swap_end(this, buf, nfc_geo->payload_size,
> > > +                    this->payload_virt, this->payload_phys,
> > > +                    nfc_geo->payload_size,
> > > +                    payload_virt, payload_phys);
> > > +
> > >   for (i = 0; i < nfc_geo->ecc_chunk_count; i++, status++) {
> > >           if ((*status == STATUS_GOOD) || (*status == STATUS_ERASED))
> > >                   continue;
> > >
> > >           if (*status == STATUS_UNCORRECTABLE) {
> > > +                 int flips;
> > > +
> > > +                 /*
> > > +                  * The ECC hardware has an uncorrectable ECC status
> > > +                  * code in case we have bitflips in an erased page. As
> > > +                  * nothing was written into this subpage the ECC is
> > > +                  * obviously wrong and we can not trust it. We assume
> > > +                  * at this point that we are reading an erased page and
> > > +                  * try to correct the bitflips in buffer up to
> > > +                  * ecc_strength bitflips. If this is a page with random
> > > +                  * data, we exceed this number of bitflips and have a
> > > +                  * ECC failure. Otherwise we use the corrected buffer.
> > > +                  */
> > > +                 if (i == 0) {
> > > +                         /* The first block includes metadata */
> > > +                         flips = nand_check_erased_ecc_chunk(
> > > +                                         buf + i * nfc_geo->ecc_chunk_size,
> > > +                                         nfc_geo->ecc_chunk_size,
> > > +                                         NULL, 0,
> > > +                                         auxiliary_virt,
> > > +                                         nfc_geo->metadata_size,
> > > +                                         nfc_geo->ecc_strength);
> > > +                 } else {
> > > +                         flips = nand_check_erased_ecc_chunk(
> > > +                                         buf + i * nfc_geo->ecc_chunk_size,
> > > +                                         nfc_geo->ecc_chunk_size,
> > > +                                         NULL, 0,
> > > +                                         NULL, 0,
> > > +                                         nfc_geo->ecc_strength);
> > > +                 }
> >
> > You're not checking ECC bytes. I know it's a bit more complicated to
> > implement, but as Brian stated several times, you should always check
> > ECC bytes along with data bytes when testing for an erased chunk.
> >
> > Indeed, it might appear that the user really programmed ff on the page,
> > and in this case you don't want to consider the page as erased.
> > In this case, the ECC bytes will be different from ff, and you'll
> > be able to identify it by checking those ECC bytes.
>
> Thanks for the feedback. Talking with a coworker about this we may have found a
> better approach to this that is less complicated to implement. The hardware
> unit allows us to set a bitflip threshold for erased pages. The ECC unit
> creates an ECC error only if the number of bitflips exceeds this threshold, but
> it does not correct these. So the idea is to change the patch so that we set
> pages, that are signaled by the ECC as erased, to 0xff completely without
> checking. So the ECC will do all the work and we completely trust in its
> abilities to do it correctly.

Sounds good.

some new platforms with new gpmi controller could check the count of 0 bits in page,
refer to my patch https://patchwork.ozlabs.org/patch/587124/

But for all legacy platforms, IMO, considering bitflip is rare case, set threshold to 0 and
only check the uncorrectable branch and then correct data sounds better. Setting threshold
and correcting all erased page may highly impact the performance.

--
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
@ 2016-04-12 22:39         ` Han Xu
  0 siblings, 0 replies; 32+ messages in thread
From: Han Xu @ 2016-04-12 22:39 UTC (permalink / raw)
  To: linux-arm-kernel



________________________________________
From: Boris Brezillon <boris.brezillon@free-electrons.com>
Sent: Monday, April 11, 2016 1:29 AM
To: Markus Pargmann
Cc: Han Xu; David Woodhouse; Fabio Estevam; linux-mtd at lists.infradead.org; kernel at pengutronix.de; Huang Shijie; Brian Norris; linux-arm-kernel at lists.infradead.org
Subject: Re: [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages

On Mon, 11 Apr 2016 08:34:35 +0200
Markus Pargmann <mpa@pengutronix.de> wrote:

> Hi Boris,
>
> On Wednesday 24 February 2016 14:57:50 Boris Brezillon wrote:
> > Hi Markus,
> >
> > On Sun, 21 Feb 2016 13:52:06 +0100
> > Markus Pargmann <mpa@pengutronix.de> wrote:
> >
> > > ECC is only calculated for written pages. As erased pages are not
> > > actively written the ECC is always invalid. For this purpose the
> > > Hardware BCH unit is able to check for erased pages and does not raise
> > > an ECC error in this case. This behaviour can be influenced using the
> > > BCH_MODE register which sets the number of allowed bitflips in an erased
> > > page. Unfortunately the unit is not capable of fixing the bitflips in
> > > memory.
> > >
> > > To avoid complete software checks for erased pages, we can simply check
> > > buffers with uncorrectable ECC errors because we know that any erased
> > > page with errors is uncorrectable by the BCH unit.
> > >
> > > This patch adds the generic nand_check_erased_ecc_chunk() to gpmi-nand
> > > to correct erased pages. To have the valid data in the buffer before
> > > using them, this patch moves the read_page_swap_end() call before the
> > > ECC status checking for-loop.
> > >
> > > Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
> > > ---
> > >  drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 49 ++++++++++++++++++++++++++++++----
> > >  1 file changed, 44 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > > index 235ddcb58f39..ce5a21252102 100644
> > > --- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > > +++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > > @@ -1035,14 +1035,58 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
> > >   /* Loop over status bytes, accumulating ECC status. */
> > >   status = auxiliary_virt + nfc_geo->auxiliary_status_offset;
> > >
> > > + read_page_swap_end(this, buf, nfc_geo->payload_size,
> > > +                    this->payload_virt, this->payload_phys,
> > > +                    nfc_geo->payload_size,
> > > +                    payload_virt, payload_phys);
> > > +
> > >   for (i = 0; i < nfc_geo->ecc_chunk_count; i++, status++) {
> > >           if ((*status == STATUS_GOOD) || (*status == STATUS_ERASED))
> > >                   continue;
> > >
> > >           if (*status == STATUS_UNCORRECTABLE) {
> > > +                 int flips;
> > > +
> > > +                 /*
> > > +                  * The ECC hardware has an uncorrectable ECC status
> > > +                  * code in case we have bitflips in an erased page. As
> > > +                  * nothing was written into this subpage the ECC is
> > > +                  * obviously wrong and we can not trust it. We assume
> > > +                  * at this point that we are reading an erased page and
> > > +                  * try to correct the bitflips in buffer up to
> > > +                  * ecc_strength bitflips. If this is a page with random
> > > +                  * data, we exceed this number of bitflips and have a
> > > +                  * ECC failure. Otherwise we use the corrected buffer.
> > > +                  */
> > > +                 if (i == 0) {
> > > +                         /* The first block includes metadata */
> > > +                         flips = nand_check_erased_ecc_chunk(
> > > +                                         buf + i * nfc_geo->ecc_chunk_size,
> > > +                                         nfc_geo->ecc_chunk_size,
> > > +                                         NULL, 0,
> > > +                                         auxiliary_virt,
> > > +                                         nfc_geo->metadata_size,
> > > +                                         nfc_geo->ecc_strength);
> > > +                 } else {
> > > +                         flips = nand_check_erased_ecc_chunk(
> > > +                                         buf + i * nfc_geo->ecc_chunk_size,
> > > +                                         nfc_geo->ecc_chunk_size,
> > > +                                         NULL, 0,
> > > +                                         NULL, 0,
> > > +                                         nfc_geo->ecc_strength);
> > > +                 }
> >
> > You're not checking ECC bytes. I know it's a bit more complicated to
> > implement, but as Brian stated several times, you should always check
> > ECC bytes along with data bytes when testing for an erased chunk.
> >
> > Indeed, it might appear that the user really programmed ff on the page,
> > and in this case you don't want to consider the page as erased.
> > In this case, the ECC bytes will be different from ff, and you'll
> > be able to identify it by checking those ECC bytes.
>
> Thanks for the feedback. Talking with a coworker about this we may have found a
> better approach to this that is less complicated to implement. The hardware
> unit allows us to set a bitflip threshold for erased pages. The ECC unit
> creates an ECC error only if the number of bitflips exceeds this threshold, but
> it does not correct these. So the idea is to change the patch so that we set
> pages, that are signaled by the ECC as erased, to 0xff completely without
> checking. So the ECC will do all the work and we completely trust in its
> abilities to do it correctly.

Sounds good.

some new platforms with new gpmi controller could check the count of 0 bits in page,
refer to my patch https://patchwork.ozlabs.org/patch/587124/

But for all legacy platforms, IMO, considering bitflip is rare case, set threshold to 0 and
only check the uncorrectable branch and then correct data sounds better. Setting threshold
and correcting all erased page may highly impact the performance.

--
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
  2016-04-12 22:39         ` Han Xu
@ 2016-04-12 22:51           ` Boris Brezillon
  -1 siblings, 0 replies; 32+ messages in thread
From: Boris Brezillon @ 2016-04-12 22:51 UTC (permalink / raw)
  To: Han Xu
  Cc: Markus Pargmann, David Woodhouse, Fabio Estevam, linux-mtd,
	kernel, Huang Shijie, Brian Norris, linux-arm-kernel

On Tue, 12 Apr 2016 22:39:08 +0000
Han Xu <han.xu@nxp.com> wrote:

> > Thanks for the feedback. Talking with a coworker about this we may have found a
> > better approach to this that is less complicated to implement. The hardware
> > unit allows us to set a bitflip threshold for erased pages. The ECC unit
> > creates an ECC error only if the number of bitflips exceeds this threshold, but
> > it does not correct these. So the idea is to change the patch so that we set
> > pages, that are signaled by the ECC as erased, to 0xff completely without
> > checking. So the ECC will do all the work and we completely trust in its
> > abilities to do it correctly.
> 
> Sounds good.
> 
> some new platforms with new gpmi controller could check the count of 0 bits in page,
> refer to my patch https://patchwork.ozlabs.org/patch/587124/
> 
> But for all legacy platforms, IMO, considering bitflip is rare case, set threshold to 0 and
> only check the uncorrectable branch and then correct data sounds better. Setting threshold
> and correcting all erased page may highly impact the performance.

Indeed, bitflips in erased pages is not so common, and penalizing the
likely case (erased pages without any bitflips) doesn't look like a good
idea in the end.

You can still implement this check in software. You can have a look at
nand_check_erased_ecc_chunk() [1] if you need an example, but you'll
have to adapt it because your controller does not guarantees that ECC
bits for a given chunk are byte aligned :-/

[1]http://lxr.free-electrons.com/source/drivers/mtd/nand/nand_base.c#L1200


-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
@ 2016-04-12 22:51           ` Boris Brezillon
  0 siblings, 0 replies; 32+ messages in thread
From: Boris Brezillon @ 2016-04-12 22:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 12 Apr 2016 22:39:08 +0000
Han Xu <han.xu@nxp.com> wrote:

> > Thanks for the feedback. Talking with a coworker about this we may have found a
> > better approach to this that is less complicated to implement. The hardware
> > unit allows us to set a bitflip threshold for erased pages. The ECC unit
> > creates an ECC error only if the number of bitflips exceeds this threshold, but
> > it does not correct these. So the idea is to change the patch so that we set
> > pages, that are signaled by the ECC as erased, to 0xff completely without
> > checking. So the ECC will do all the work and we completely trust in its
> > abilities to do it correctly.
> 
> Sounds good.
> 
> some new platforms with new gpmi controller could check the count of 0 bits in page,
> refer to my patch https://patchwork.ozlabs.org/patch/587124/
> 
> But for all legacy platforms, IMO, considering bitflip is rare case, set threshold to 0 and
> only check the uncorrectable branch and then correct data sounds better. Setting threshold
> and correcting all erased page may highly impact the performance.

Indeed, bitflips in erased pages is not so common, and penalizing the
likely case (erased pages without any bitflips) doesn't look like a good
idea in the end.

You can still implement this check in software. You can have a look at
nand_check_erased_ecc_chunk() [1] if you need an example, but you'll
have to adapt it because your controller does not guarantees that ECC
bits for a given chunk are byte aligned :-/

[1]http://lxr.free-electrons.com/source/drivers/mtd/nand/nand_base.c#L1200


-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
  2016-04-12 22:51           ` Boris Brezillon
@ 2016-04-15  7:55             ` Markus Pargmann
  -1 siblings, 0 replies; 32+ messages in thread
From: Markus Pargmann @ 2016-04-15  7:55 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Han Xu, David Woodhouse, Fabio Estevam, linux-mtd, kernel,
	Huang Shijie, Brian Norris, linux-arm-kernel, Stefan Christ,
	Elie De Brauwer

[-- Attachment #1: Type: text/plain, Size: 2418 bytes --]

On Wednesday 13 April 2016 00:51:55 Boris Brezillon wrote:
> On Tue, 12 Apr 2016 22:39:08 +0000
> Han Xu <han.xu@nxp.com> wrote:
> 
> > > Thanks for the feedback. Talking with a coworker about this we may have found a
> > > better approach to this that is less complicated to implement. The hardware
> > > unit allows us to set a bitflip threshold for erased pages. The ECC unit
> > > creates an ECC error only if the number of bitflips exceeds this threshold, but
> > > it does not correct these. So the idea is to change the patch so that we set
> > > pages, that are signaled by the ECC as erased, to 0xff completely without
> > > checking. So the ECC will do all the work and we completely trust in its
> > > abilities to do it correctly.
> > 
> > Sounds good.
> > 
> > some new platforms with new gpmi controller could check the count of 0 bits in page,
> > refer to my patch https://patchwork.ozlabs.org/patch/587124/
> > 
> > But for all legacy platforms, IMO, considering bitflip is rare case, set threshold to 0 and
> > only check the uncorrectable branch and then correct data sounds better. Setting threshold
> > and correcting all erased page may highly impact the performance.
> 
> Indeed, bitflips in erased pages is not so common, and penalizing the
> likely case (erased pages without any bitflips) doesn't look like a good
> idea in the end.

Are erased pages really read that often? I am not sure how UBI handles
this, does it read every page before writing?

> 
> You can still implement this check in software. You can have a look at
> nand_check_erased_ecc_chunk() [1] if you need an example, but you'll
> have to adapt it because your controller does not guarantees that ECC
> bits for a given chunk are byte aligned :-/

Yes I used this function in the patch. The issue is that I am not quite
sure yet where to find the raw ECC data (without rereading the page).
The reference manual is not extremely clear about that, ecc data may be
in the 'auxilliary data' but I am not sure that it really is available
somewhere.

Best Regards,

Markus

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
@ 2016-04-15  7:55             ` Markus Pargmann
  0 siblings, 0 replies; 32+ messages in thread
From: Markus Pargmann @ 2016-04-15  7:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 13 April 2016 00:51:55 Boris Brezillon wrote:
> On Tue, 12 Apr 2016 22:39:08 +0000
> Han Xu <han.xu@nxp.com> wrote:
> 
> > > Thanks for the feedback. Talking with a coworker about this we may have found a
> > > better approach to this that is less complicated to implement. The hardware
> > > unit allows us to set a bitflip threshold for erased pages. The ECC unit
> > > creates an ECC error only if the number of bitflips exceeds this threshold, but
> > > it does not correct these. So the idea is to change the patch so that we set
> > > pages, that are signaled by the ECC as erased, to 0xff completely without
> > > checking. So the ECC will do all the work and we completely trust in its
> > > abilities to do it correctly.
> > 
> > Sounds good.
> > 
> > some new platforms with new gpmi controller could check the count of 0 bits in page,
> > refer to my patch https://patchwork.ozlabs.org/patch/587124/
> > 
> > But for all legacy platforms, IMO, considering bitflip is rare case, set threshold to 0 and
> > only check the uncorrectable branch and then correct data sounds better. Setting threshold
> > and correcting all erased page may highly impact the performance.
> 
> Indeed, bitflips in erased pages is not so common, and penalizing the
> likely case (erased pages without any bitflips) doesn't look like a good
> idea in the end.

Are erased pages really read that often? I am not sure how UBI handles
this, does it read every page before writing?

> 
> You can still implement this check in software. You can have a look at
> nand_check_erased_ecc_chunk() [1] if you need an example, but you'll
> have to adapt it because your controller does not guarantees that ECC
> bits for a given chunk are byte aligned :-/

Yes I used this function in the patch. The issue is that I am not quite
sure yet where to find the raw ECC data (without rereading the page).
The reference manual is not extremely clear about that, ecc data may be
in the 'auxilliary data' but I am not sure that it really is available
somewhere.

Best Regards,

Markus

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20160415/4a32d055/attachment.sig>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
  2016-04-15  7:55             ` Markus Pargmann
@ 2016-04-15  8:35               ` Boris Brezillon
  -1 siblings, 0 replies; 32+ messages in thread
From: Boris Brezillon @ 2016-04-15  8:35 UTC (permalink / raw)
  To: Markus Pargmann
  Cc: Han Xu, David Woodhouse, Fabio Estevam, linux-mtd, kernel,
	Huang Shijie, Brian Norris, linux-arm-kernel, Stefan Christ,
	Elie De Brauwer, Richard Weinberger, Artem Bityutskiy

Hi Markus,

On Fri, 15 Apr 2016 09:55:45 +0200
Markus Pargmann <mpa@pengutronix.de> wrote:

> On Wednesday 13 April 2016 00:51:55 Boris Brezillon wrote:
> > On Tue, 12 Apr 2016 22:39:08 +0000
> > Han Xu <han.xu@nxp.com> wrote:
> > 
> > > > Thanks for the feedback. Talking with a coworker about this we may have found a
> > > > better approach to this that is less complicated to implement. The hardware
> > > > unit allows us to set a bitflip threshold for erased pages. The ECC unit
> > > > creates an ECC error only if the number of bitflips exceeds this threshold, but
> > > > it does not correct these. So the idea is to change the patch so that we set
> > > > pages, that are signaled by the ECC as erased, to 0xff completely without
> > > > checking. So the ECC will do all the work and we completely trust in its
> > > > abilities to do it correctly.
> > > 
> > > Sounds good.
> > > 
> > > some new platforms with new gpmi controller could check the count of 0 bits in page,
> > > refer to my patch https://patchwork.ozlabs.org/patch/587124/
> > > 
> > > But for all legacy platforms, IMO, considering bitflip is rare case, set threshold to 0 and
> > > only check the uncorrectable branch and then correct data sounds better. Setting threshold
> > > and correcting all erased page may highly impact the performance.
> > 
> > Indeed, bitflips in erased pages is not so common, and penalizing the
> > likely case (erased pages without any bitflips) doesn't look like a good
> > idea in the end.
> 
> Are erased pages really read that often?

Yes, it's not unusual to have those "empty pages?" checks (added Artem
and Richard to get a confirmation). AFAIR, UBIFS check for empty pages
in its journal heads after an unclean unmount (which happens quite
often) to make sure there's no corruption.

> I am not sure how UBI handles
> this, does it read every page before writing?

Nope, or maybe it does when you activate some extra checks.

> 
> > 
> > You can still implement this check in software. You can have a look at
> > nand_check_erased_ecc_chunk() [1] if you need an example, but you'll
> > have to adapt it because your controller does not guarantees that ECC
> > bits for a given chunk are byte aligned :-/
> 
> Yes I used this function in the patch. The issue is that I am not quite
> sure yet where to find the raw ECC data (without rereading the page).
> The reference manual is not extremely clear about that, ecc data may be
> in the 'auxilliary data' but I am not sure that it really is available
> somewhere.

AFAIR (and I'm not sure since it was a long time ago), you don't have
direct access to ECC bytes with the GPMI engine. If that's the case,
you'll have to read the ECC bytes manually (moving the page pointer
using ->cmdfunc(NAND_CMD_RNDOUT, column, -1)), which is a pain with
this engine, because ECC bytes are not guaranteed to be byte aligned
(see gpmi ->read_page_raw() implementation).
Once you've retrieved ECC bytes (or bits in this case), for each ECC
chunk, you can use the nand_check_erased_ecc_chunk() function (just make
sure you're padding the last ECC byte of each chunk with ones so that
bitflips cannot be reported on this section).

Best Regards,

Boris

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
@ 2016-04-15  8:35               ` Boris Brezillon
  0 siblings, 0 replies; 32+ messages in thread
From: Boris Brezillon @ 2016-04-15  8:35 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Markus,

On Fri, 15 Apr 2016 09:55:45 +0200
Markus Pargmann <mpa@pengutronix.de> wrote:

> On Wednesday 13 April 2016 00:51:55 Boris Brezillon wrote:
> > On Tue, 12 Apr 2016 22:39:08 +0000
> > Han Xu <han.xu@nxp.com> wrote:
> > 
> > > > Thanks for the feedback. Talking with a coworker about this we may have found a
> > > > better approach to this that is less complicated to implement. The hardware
> > > > unit allows us to set a bitflip threshold for erased pages. The ECC unit
> > > > creates an ECC error only if the number of bitflips exceeds this threshold, but
> > > > it does not correct these. So the idea is to change the patch so that we set
> > > > pages, that are signaled by the ECC as erased, to 0xff completely without
> > > > checking. So the ECC will do all the work and we completely trust in its
> > > > abilities to do it correctly.
> > > 
> > > Sounds good.
> > > 
> > > some new platforms with new gpmi controller could check the count of 0 bits in page,
> > > refer to my patch https://patchwork.ozlabs.org/patch/587124/
> > > 
> > > But for all legacy platforms, IMO, considering bitflip is rare case, set threshold to 0 and
> > > only check the uncorrectable branch and then correct data sounds better. Setting threshold
> > > and correcting all erased page may highly impact the performance.
> > 
> > Indeed, bitflips in erased pages is not so common, and penalizing the
> > likely case (erased pages without any bitflips) doesn't look like a good
> > idea in the end.
> 
> Are erased pages really read that often?

Yes, it's not unusual to have those "empty pages?" checks (added Artem
and Richard to get a confirmation). AFAIR, UBIFS check for empty pages
in its journal heads after an unclean unmount (which happens quite
often) to make sure there's no corruption.

> I am not sure how UBI handles
> this, does it read every page before writing?

Nope, or maybe it does when you activate some extra checks.

> 
> > 
> > You can still implement this check in software. You can have a look at
> > nand_check_erased_ecc_chunk() [1] if you need an example, but you'll
> > have to adapt it because your controller does not guarantees that ECC
> > bits for a given chunk are byte aligned :-/
> 
> Yes I used this function in the patch. The issue is that I am not quite
> sure yet where to find the raw ECC data (without rereading the page).
> The reference manual is not extremely clear about that, ecc data may be
> in the 'auxilliary data' but I am not sure that it really is available
> somewhere.

AFAIR (and I'm not sure since it was a long time ago), you don't have
direct access to ECC bytes with the GPMI engine. If that's the case,
you'll have to read the ECC bytes manually (moving the page pointer
using ->cmdfunc(NAND_CMD_RNDOUT, column, -1)), which is a pain with
this engine, because ECC bytes are not guaranteed to be byte aligned
(see gpmi ->read_page_raw() implementation).
Once you've retrieved ECC bytes (or bits in this case), for each ECC
chunk, you can use the nand_check_erased_ecc_chunk() function (just make
sure you're padding the last ECC byte of each chunk with ones so that
bitflips cannot be reported on this section).

Best Regards,

Boris

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
  2016-04-15  8:35               ` Boris Brezillon
@ 2016-04-15  9:35                 ` Markus Pargmann
  -1 siblings, 0 replies; 32+ messages in thread
From: Markus Pargmann @ 2016-04-15  9:35 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Han Xu, David Woodhouse, Fabio Estevam, linux-mtd, kernel,
	Huang Shijie, Brian Norris, linux-arm-kernel, Stefan Christ,
	Elie De Brauwer, Richard Weinberger, Artem Bityutskiy

[-- Attachment #1: Type: text/plain, Size: 4062 bytes --]

Hi Boris,

On Friday 15 April 2016 10:35:08 Boris Brezillon wrote:
> Hi Markus,
> 
> On Fri, 15 Apr 2016 09:55:45 +0200
> Markus Pargmann <mpa@pengutronix.de> wrote:
> 
> > On Wednesday 13 April 2016 00:51:55 Boris Brezillon wrote:
> > > On Tue, 12 Apr 2016 22:39:08 +0000
> > > Han Xu <han.xu@nxp.com> wrote:
> > > 
> > > > > Thanks for the feedback. Talking with a coworker about this we may have found a
> > > > > better approach to this that is less complicated to implement. The hardware
> > > > > unit allows us to set a bitflip threshold for erased pages. The ECC unit
> > > > > creates an ECC error only if the number of bitflips exceeds this threshold, but
> > > > > it does not correct these. So the idea is to change the patch so that we set
> > > > > pages, that are signaled by the ECC as erased, to 0xff completely without
> > > > > checking. So the ECC will do all the work and we completely trust in its
> > > > > abilities to do it correctly.
> > > > 
> > > > Sounds good.
> > > > 
> > > > some new platforms with new gpmi controller could check the count of 0 bits in page,
> > > > refer to my patch https://patchwork.ozlabs.org/patch/587124/
> > > > 
> > > > But for all legacy platforms, IMO, considering bitflip is rare case, set threshold to 0 and
> > > > only check the uncorrectable branch and then correct data sounds better. Setting threshold
> > > > and correcting all erased page may highly impact the performance.
> > > 
> > > Indeed, bitflips in erased pages is not so common, and penalizing the
> > > likely case (erased pages without any bitflips) doesn't look like a good
> > > idea in the end.
> > 
> > Are erased pages really read that often?
> 
> Yes, it's not unusual to have those "empty pages?" checks (added Artem
> and Richard to get a confirmation). AFAIR, UBIFS check for empty pages
> in its journal heads after an unclean unmount (which happens quite
> often) to make sure there's no corruption.
> 
> > I am not sure how UBI handles
> > this, does it read every page before writing?
> 
> Nope, or maybe it does when you activate some extra checks.
> 
> > 
> > > 
> > > You can still implement this check in software. You can have a look at
> > > nand_check_erased_ecc_chunk() [1] if you need an example, but you'll
> > > have to adapt it because your controller does not guarantees that ECC
> > > bits for a given chunk are byte aligned :-/
> > 
> > Yes I used this function in the patch. The issue is that I am not quite
> > sure yet where to find the raw ECC data (without rereading the page).
> > The reference manual is not extremely clear about that, ecc data may be
> > in the 'auxilliary data' but I am not sure that it really is available
> > somewhere.
> 
> AFAIR (and I'm not sure since it was a long time ago), you don't have
> direct access to ECC bytes with the GPMI engine. If that's the case,
> you'll have to read the ECC bytes manually (moving the page pointer
> using ->cmdfunc(NAND_CMD_RNDOUT, column, -1)), which is a pain with
> this engine, because ECC bytes are not guaranteed to be byte aligned
> (see gpmi ->read_page_raw() implementation).
> Once you've retrieved ECC bytes (or bits in this case), for each ECC
> chunk, you can use the nand_check_erased_ecc_chunk() function (just make
> sure you're padding the last ECC byte of each chunk with ones so that
> bitflips cannot be reported on this section).

Thanks for the information. So I understand that this approach is the
preferred one to avoid any performance issues for normal operation.

I actually won't be able to fix this patch accordingly for some time. If
anyone else needs this earlier, feel free to implement it.

Best Regards,

Markus

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
@ 2016-04-15  9:35                 ` Markus Pargmann
  0 siblings, 0 replies; 32+ messages in thread
From: Markus Pargmann @ 2016-04-15  9:35 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Boris,

On Friday 15 April 2016 10:35:08 Boris Brezillon wrote:
> Hi Markus,
> 
> On Fri, 15 Apr 2016 09:55:45 +0200
> Markus Pargmann <mpa@pengutronix.de> wrote:
> 
> > On Wednesday 13 April 2016 00:51:55 Boris Brezillon wrote:
> > > On Tue, 12 Apr 2016 22:39:08 +0000
> > > Han Xu <han.xu@nxp.com> wrote:
> > > 
> > > > > Thanks for the feedback. Talking with a coworker about this we may have found a
> > > > > better approach to this that is less complicated to implement. The hardware
> > > > > unit allows us to set a bitflip threshold for erased pages. The ECC unit
> > > > > creates an ECC error only if the number of bitflips exceeds this threshold, but
> > > > > it does not correct these. So the idea is to change the patch so that we set
> > > > > pages, that are signaled by the ECC as erased, to 0xff completely without
> > > > > checking. So the ECC will do all the work and we completely trust in its
> > > > > abilities to do it correctly.
> > > > 
> > > > Sounds good.
> > > > 
> > > > some new platforms with new gpmi controller could check the count of 0 bits in page,
> > > > refer to my patch https://patchwork.ozlabs.org/patch/587124/
> > > > 
> > > > But for all legacy platforms, IMO, considering bitflip is rare case, set threshold to 0 and
> > > > only check the uncorrectable branch and then correct data sounds better. Setting threshold
> > > > and correcting all erased page may highly impact the performance.
> > > 
> > > Indeed, bitflips in erased pages is not so common, and penalizing the
> > > likely case (erased pages without any bitflips) doesn't look like a good
> > > idea in the end.
> > 
> > Are erased pages really read that often?
> 
> Yes, it's not unusual to have those "empty pages?" checks (added Artem
> and Richard to get a confirmation). AFAIR, UBIFS check for empty pages
> in its journal heads after an unclean unmount (which happens quite
> often) to make sure there's no corruption.
> 
> > I am not sure how UBI handles
> > this, does it read every page before writing?
> 
> Nope, or maybe it does when you activate some extra checks.
> 
> > 
> > > 
> > > You can still implement this check in software. You can have a look at
> > > nand_check_erased_ecc_chunk() [1] if you need an example, but you'll
> > > have to adapt it because your controller does not guarantees that ECC
> > > bits for a given chunk are byte aligned :-/
> > 
> > Yes I used this function in the patch. The issue is that I am not quite
> > sure yet where to find the raw ECC data (without rereading the page).
> > The reference manual is not extremely clear about that, ecc data may be
> > in the 'auxilliary data' but I am not sure that it really is available
> > somewhere.
> 
> AFAIR (and I'm not sure since it was a long time ago), you don't have
> direct access to ECC bytes with the GPMI engine. If that's the case,
> you'll have to read the ECC bytes manually (moving the page pointer
> using ->cmdfunc(NAND_CMD_RNDOUT, column, -1)), which is a pain with
> this engine, because ECC bytes are not guaranteed to be byte aligned
> (see gpmi ->read_page_raw() implementation).
> Once you've retrieved ECC bytes (or bits in this case), for each ECC
> chunk, you can use the nand_check_erased_ecc_chunk() function (just make
> sure you're padding the last ECC byte of each chunk with ones so that
> bitflips cannot be reported on this section).

Thanks for the information. So I understand that this approach is the
preferred one to avoid any performance issues for normal operation.

I actually won't be able to fix this patch accordingly for some time. If
anyone else needs this earlier, feel free to implement it.

Best Regards,

Markus

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20160415/aa2caa65/attachment.sig>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
  2016-04-15  9:35                 ` Markus Pargmann
@ 2016-04-15  9:39                   ` Boris Brezillon
  -1 siblings, 0 replies; 32+ messages in thread
From: Boris Brezillon @ 2016-04-15  9:39 UTC (permalink / raw)
  To: Markus Pargmann
  Cc: Han Xu, David Woodhouse, Fabio Estevam, linux-mtd, kernel,
	Huang Shijie, Brian Norris, linux-arm-kernel, Stefan Christ,
	Elie De Brauwer, Richard Weinberger, Artem Bityutskiy

Hi Markus,

On Fri, 15 Apr 2016 11:35:07 +0200
Markus Pargmann <mpa@pengutronix.de> wrote:

> Hi Boris,
> 
> On Friday 15 April 2016 10:35:08 Boris Brezillon wrote:
> > Hi Markus,
> > 
> > On Fri, 15 Apr 2016 09:55:45 +0200
> > Markus Pargmann <mpa@pengutronix.de> wrote:
> > 
> > > On Wednesday 13 April 2016 00:51:55 Boris Brezillon wrote:
> > > > On Tue, 12 Apr 2016 22:39:08 +0000
> > > > Han Xu <han.xu@nxp.com> wrote:
> > > > 
> > > > > > Thanks for the feedback. Talking with a coworker about this we may have found a
> > > > > > better approach to this that is less complicated to implement. The hardware
> > > > > > unit allows us to set a bitflip threshold for erased pages. The ECC unit
> > > > > > creates an ECC error only if the number of bitflips exceeds this threshold, but
> > > > > > it does not correct these. So the idea is to change the patch so that we set
> > > > > > pages, that are signaled by the ECC as erased, to 0xff completely without
> > > > > > checking. So the ECC will do all the work and we completely trust in its
> > > > > > abilities to do it correctly.
> > > > > 
> > > > > Sounds good.
> > > > > 
> > > > > some new platforms with new gpmi controller could check the count of 0 bits in page,
> > > > > refer to my patch https://patchwork.ozlabs.org/patch/587124/
> > > > > 
> > > > > But for all legacy platforms, IMO, considering bitflip is rare case, set threshold to 0 and
> > > > > only check the uncorrectable branch and then correct data sounds better. Setting threshold
> > > > > and correcting all erased page may highly impact the performance.
> > > > 
> > > > Indeed, bitflips in erased pages is not so common, and penalizing the
> > > > likely case (erased pages without any bitflips) doesn't look like a good
> > > > idea in the end.
> > > 
> > > Are erased pages really read that often?
> > 
> > Yes, it's not unusual to have those "empty pages?" checks (added Artem
> > and Richard to get a confirmation). AFAIR, UBIFS check for empty pages
> > in its journal heads after an unclean unmount (which happens quite
> > often) to make sure there's no corruption.
> > 
> > > I am not sure how UBI handles
> > > this, does it read every page before writing?
> > 
> > Nope, or maybe it does when you activate some extra checks.
> > 
> > > 
> > > > 
> > > > You can still implement this check in software. You can have a look at
> > > > nand_check_erased_ecc_chunk() [1] if you need an example, but you'll
> > > > have to adapt it because your controller does not guarantees that ECC
> > > > bits for a given chunk are byte aligned :-/
> > > 
> > > Yes I used this function in the patch. The issue is that I am not quite
> > > sure yet where to find the raw ECC data (without rereading the page).
> > > The reference manual is not extremely clear about that, ecc data may be
> > > in the 'auxilliary data' but I am not sure that it really is available
> > > somewhere.
> > 
> > AFAIR (and I'm not sure since it was a long time ago), you don't have
> > direct access to ECC bytes with the GPMI engine. If that's the case,
> > you'll have to read the ECC bytes manually (moving the page pointer
> > using ->cmdfunc(NAND_CMD_RNDOUT, column, -1)), which is a pain with
> > this engine, because ECC bytes are not guaranteed to be byte aligned
> > (see gpmi ->read_page_raw() implementation).
> > Once you've retrieved ECC bytes (or bits in this case), for each ECC
> > chunk, you can use the nand_check_erased_ecc_chunk() function (just make
> > sure you're padding the last ECC byte of each chunk with ones so that
> > bitflips cannot be reported on this section).
> 
> Thanks for the information. So I understand that this approach is the
> preferred one to avoid any performance issues for normal operation.
> 
> I actually won't be able to fix this patch accordingly for some time. If
> anyone else needs this earlier, feel free to implement it.

I just did [1] (it applies on top of your patch), but maybe you
can test it (I don't have any imx platforms right now) ;).

If these changes work, feel free to squash them into your previous
patch.

Thanks,

Boris

[1]http://code.bulix.org/bq6yyh-96549

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
@ 2016-04-15  9:39                   ` Boris Brezillon
  0 siblings, 0 replies; 32+ messages in thread
From: Boris Brezillon @ 2016-04-15  9:39 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Markus,

On Fri, 15 Apr 2016 11:35:07 +0200
Markus Pargmann <mpa@pengutronix.de> wrote:

> Hi Boris,
> 
> On Friday 15 April 2016 10:35:08 Boris Brezillon wrote:
> > Hi Markus,
> > 
> > On Fri, 15 Apr 2016 09:55:45 +0200
> > Markus Pargmann <mpa@pengutronix.de> wrote:
> > 
> > > On Wednesday 13 April 2016 00:51:55 Boris Brezillon wrote:
> > > > On Tue, 12 Apr 2016 22:39:08 +0000
> > > > Han Xu <han.xu@nxp.com> wrote:
> > > > 
> > > > > > Thanks for the feedback. Talking with a coworker about this we may have found a
> > > > > > better approach to this that is less complicated to implement. The hardware
> > > > > > unit allows us to set a bitflip threshold for erased pages. The ECC unit
> > > > > > creates an ECC error only if the number of bitflips exceeds this threshold, but
> > > > > > it does not correct these. So the idea is to change the patch so that we set
> > > > > > pages, that are signaled by the ECC as erased, to 0xff completely without
> > > > > > checking. So the ECC will do all the work and we completely trust in its
> > > > > > abilities to do it correctly.
> > > > > 
> > > > > Sounds good.
> > > > > 
> > > > > some new platforms with new gpmi controller could check the count of 0 bits in page,
> > > > > refer to my patch https://patchwork.ozlabs.org/patch/587124/
> > > > > 
> > > > > But for all legacy platforms, IMO, considering bitflip is rare case, set threshold to 0 and
> > > > > only check the uncorrectable branch and then correct data sounds better. Setting threshold
> > > > > and correcting all erased page may highly impact the performance.
> > > > 
> > > > Indeed, bitflips in erased pages is not so common, and penalizing the
> > > > likely case (erased pages without any bitflips) doesn't look like a good
> > > > idea in the end.
> > > 
> > > Are erased pages really read that often?
> > 
> > Yes, it's not unusual to have those "empty pages?" checks (added Artem
> > and Richard to get a confirmation). AFAIR, UBIFS check for empty pages
> > in its journal heads after an unclean unmount (which happens quite
> > often) to make sure there's no corruption.
> > 
> > > I am not sure how UBI handles
> > > this, does it read every page before writing?
> > 
> > Nope, or maybe it does when you activate some extra checks.
> > 
> > > 
> > > > 
> > > > You can still implement this check in software. You can have a look at
> > > > nand_check_erased_ecc_chunk() [1] if you need an example, but you'll
> > > > have to adapt it because your controller does not guarantees that ECC
> > > > bits for a given chunk are byte aligned :-/
> > > 
> > > Yes I used this function in the patch. The issue is that I am not quite
> > > sure yet where to find the raw ECC data (without rereading the page).
> > > The reference manual is not extremely clear about that, ecc data may be
> > > in the 'auxilliary data' but I am not sure that it really is available
> > > somewhere.
> > 
> > AFAIR (and I'm not sure since it was a long time ago), you don't have
> > direct access to ECC bytes with the GPMI engine. If that's the case,
> > you'll have to read the ECC bytes manually (moving the page pointer
> > using ->cmdfunc(NAND_CMD_RNDOUT, column, -1)), which is a pain with
> > this engine, because ECC bytes are not guaranteed to be byte aligned
> > (see gpmi ->read_page_raw() implementation).
> > Once you've retrieved ECC bytes (or bits in this case), for each ECC
> > chunk, you can use the nand_check_erased_ecc_chunk() function (just make
> > sure you're padding the last ECC byte of each chunk with ones so that
> > bitflips cannot be reported on this section).
> 
> Thanks for the information. So I understand that this approach is the
> preferred one to avoid any performance issues for normal operation.
> 
> I actually won't be able to fix this patch accordingly for some time. If
> anyone else needs this earlier, feel free to implement it.

I just did [1] (it applies on top of your patch), but maybe you
can test it (I don't have any imx platforms right now) ;).

If these changes work, feel free to squash them into your previous
patch.

Thanks,

Boris

[1]http://code.bulix.org/bq6yyh-96549

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
  2016-04-15  9:39                   ` Boris Brezillon
@ 2016-04-15 12:03                     ` Markus Pargmann
  -1 siblings, 0 replies; 32+ messages in thread
From: Markus Pargmann @ 2016-04-15 12:03 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Han Xu, David Woodhouse, Fabio Estevam, linux-mtd, kernel,
	Huang Shijie, Brian Norris, linux-arm-kernel, Stefan Christ,
	Elie De Brauwer, Richard Weinberger, Artem Bityutskiy

[-- Attachment #1: Type: text/plain, Size: 4997 bytes --]

Hi Boris,

On Friday 15 April 2016 11:39:06 Boris Brezillon wrote:
> Hi Markus,
> 
> On Fri, 15 Apr 2016 11:35:07 +0200
> Markus Pargmann <mpa@pengutronix.de> wrote:
> 
> > Hi Boris,
> > 
> > On Friday 15 April 2016 10:35:08 Boris Brezillon wrote:
> > > Hi Markus,
> > > 
> > > On Fri, 15 Apr 2016 09:55:45 +0200
> > > Markus Pargmann <mpa@pengutronix.de> wrote:
> > > 
> > > > On Wednesday 13 April 2016 00:51:55 Boris Brezillon wrote:
> > > > > On Tue, 12 Apr 2016 22:39:08 +0000
> > > > > Han Xu <han.xu@nxp.com> wrote:
> > > > > 
> > > > > > > Thanks for the feedback. Talking with a coworker about this we may have found a
> > > > > > > better approach to this that is less complicated to implement. The hardware
> > > > > > > unit allows us to set a bitflip threshold for erased pages. The ECC unit
> > > > > > > creates an ECC error only if the number of bitflips exceeds this threshold, but
> > > > > > > it does not correct these. So the idea is to change the patch so that we set
> > > > > > > pages, that are signaled by the ECC as erased, to 0xff completely without
> > > > > > > checking. So the ECC will do all the work and we completely trust in its
> > > > > > > abilities to do it correctly.
> > > > > > 
> > > > > > Sounds good.
> > > > > > 
> > > > > > some new platforms with new gpmi controller could check the count of 0 bits in page,
> > > > > > refer to my patch https://patchwork.ozlabs.org/patch/587124/
> > > > > > 
> > > > > > But for all legacy platforms, IMO, considering bitflip is rare case, set threshold to 0 and
> > > > > > only check the uncorrectable branch and then correct data sounds better. Setting threshold
> > > > > > and correcting all erased page may highly impact the performance.
> > > > > 
> > > > > Indeed, bitflips in erased pages is not so common, and penalizing the
> > > > > likely case (erased pages without any bitflips) doesn't look like a good
> > > > > idea in the end.
> > > > 
> > > > Are erased pages really read that often?
> > > 
> > > Yes, it's not unusual to have those "empty pages?" checks (added Artem
> > > and Richard to get a confirmation). AFAIR, UBIFS check for empty pages
> > > in its journal heads after an unclean unmount (which happens quite
> > > often) to make sure there's no corruption.
> > > 
> > > > I am not sure how UBI handles
> > > > this, does it read every page before writing?
> > > 
> > > Nope, or maybe it does when you activate some extra checks.
> > > 
> > > > 
> > > > > 
> > > > > You can still implement this check in software. You can have a look at
> > > > > nand_check_erased_ecc_chunk() [1] if you need an example, but you'll
> > > > > have to adapt it because your controller does not guarantees that ECC
> > > > > bits for a given chunk are byte aligned :-/
> > > > 
> > > > Yes I used this function in the patch. The issue is that I am not quite
> > > > sure yet where to find the raw ECC data (without rereading the page).
> > > > The reference manual is not extremely clear about that, ecc data may be
> > > > in the 'auxilliary data' but I am not sure that it really is available
> > > > somewhere.
> > > 
> > > AFAIR (and I'm not sure since it was a long time ago), you don't have
> > > direct access to ECC bytes with the GPMI engine. If that's the case,
> > > you'll have to read the ECC bytes manually (moving the page pointer
> > > using ->cmdfunc(NAND_CMD_RNDOUT, column, -1)), which is a pain with
> > > this engine, because ECC bytes are not guaranteed to be byte aligned
> > > (see gpmi ->read_page_raw() implementation).
> > > Once you've retrieved ECC bytes (or bits in this case), for each ECC
> > > chunk, you can use the nand_check_erased_ecc_chunk() function (just make
> > > sure you're padding the last ECC byte of each chunk with ones so that
> > > bitflips cannot be reported on this section).
> > 
> > Thanks for the information. So I understand that this approach is the
> > preferred one to avoid any performance issues for normal operation.
> > 
> > I actually won't be able to fix this patch accordingly for some time. If
> > anyone else needs this earlier, feel free to implement it.
> 
> I just did [1] (it applies on top of your patch), but maybe you
> can test it (I don't have any imx platforms right now) ;).

Great, thank you :). I just tested the patch and it works for me. The
erased page bitflips are still detected and fixed. I will send a new
version then.

Best Regards,

Markus

> 
> If these changes work, feel free to squash them into your previous
> patch.
> 
> Thanks,
> 
> Boris
> 
> [1]http://code.bulix.org/bq6yyh-96549
> 
> 

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
@ 2016-04-15 12:03                     ` Markus Pargmann
  0 siblings, 0 replies; 32+ messages in thread
From: Markus Pargmann @ 2016-04-15 12:03 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Boris,

On Friday 15 April 2016 11:39:06 Boris Brezillon wrote:
> Hi Markus,
> 
> On Fri, 15 Apr 2016 11:35:07 +0200
> Markus Pargmann <mpa@pengutronix.de> wrote:
> 
> > Hi Boris,
> > 
> > On Friday 15 April 2016 10:35:08 Boris Brezillon wrote:
> > > Hi Markus,
> > > 
> > > On Fri, 15 Apr 2016 09:55:45 +0200
> > > Markus Pargmann <mpa@pengutronix.de> wrote:
> > > 
> > > > On Wednesday 13 April 2016 00:51:55 Boris Brezillon wrote:
> > > > > On Tue, 12 Apr 2016 22:39:08 +0000
> > > > > Han Xu <han.xu@nxp.com> wrote:
> > > > > 
> > > > > > > Thanks for the feedback. Talking with a coworker about this we may have found a
> > > > > > > better approach to this that is less complicated to implement. The hardware
> > > > > > > unit allows us to set a bitflip threshold for erased pages. The ECC unit
> > > > > > > creates an ECC error only if the number of bitflips exceeds this threshold, but
> > > > > > > it does not correct these. So the idea is to change the patch so that we set
> > > > > > > pages, that are signaled by the ECC as erased, to 0xff completely without
> > > > > > > checking. So the ECC will do all the work and we completely trust in its
> > > > > > > abilities to do it correctly.
> > > > > > 
> > > > > > Sounds good.
> > > > > > 
> > > > > > some new platforms with new gpmi controller could check the count of 0 bits in page,
> > > > > > refer to my patch https://patchwork.ozlabs.org/patch/587124/
> > > > > > 
> > > > > > But for all legacy platforms, IMO, considering bitflip is rare case, set threshold to 0 and
> > > > > > only check the uncorrectable branch and then correct data sounds better. Setting threshold
> > > > > > and correcting all erased page may highly impact the performance.
> > > > > 
> > > > > Indeed, bitflips in erased pages is not so common, and penalizing the
> > > > > likely case (erased pages without any bitflips) doesn't look like a good
> > > > > idea in the end.
> > > > 
> > > > Are erased pages really read that often?
> > > 
> > > Yes, it's not unusual to have those "empty pages?" checks (added Artem
> > > and Richard to get a confirmation). AFAIR, UBIFS check for empty pages
> > > in its journal heads after an unclean unmount (which happens quite
> > > often) to make sure there's no corruption.
> > > 
> > > > I am not sure how UBI handles
> > > > this, does it read every page before writing?
> > > 
> > > Nope, or maybe it does when you activate some extra checks.
> > > 
> > > > 
> > > > > 
> > > > > You can still implement this check in software. You can have a look at
> > > > > nand_check_erased_ecc_chunk() [1] if you need an example, but you'll
> > > > > have to adapt it because your controller does not guarantees that ECC
> > > > > bits for a given chunk are byte aligned :-/
> > > > 
> > > > Yes I used this function in the patch. The issue is that I am not quite
> > > > sure yet where to find the raw ECC data (without rereading the page).
> > > > The reference manual is not extremely clear about that, ecc data may be
> > > > in the 'auxilliary data' but I am not sure that it really is available
> > > > somewhere.
> > > 
> > > AFAIR (and I'm not sure since it was a long time ago), you don't have
> > > direct access to ECC bytes with the GPMI engine. If that's the case,
> > > you'll have to read the ECC bytes manually (moving the page pointer
> > > using ->cmdfunc(NAND_CMD_RNDOUT, column, -1)), which is a pain with
> > > this engine, because ECC bytes are not guaranteed to be byte aligned
> > > (see gpmi ->read_page_raw() implementation).
> > > Once you've retrieved ECC bytes (or bits in this case), for each ECC
> > > chunk, you can use the nand_check_erased_ecc_chunk() function (just make
> > > sure you're padding the last ECC byte of each chunk with ones so that
> > > bitflips cannot be reported on this section).
> > 
> > Thanks for the information. So I understand that this approach is the
> > preferred one to avoid any performance issues for normal operation.
> > 
> > I actually won't be able to fix this patch accordingly for some time. If
> > anyone else needs this earlier, feel free to implement it.
> 
> I just did [1] (it applies on top of your patch), but maybe you
> can test it (I don't have any imx platforms right now) ;).

Great, thank you :). I just tested the patch and it works for me. The
erased page bitflips are still detected and fixed. I will send a new
version then.

Best Regards,

Markus

> 
> If these changes work, feel free to squash them into your previous
> patch.
> 
> Thanks,
> 
> Boris
> 
> [1]http://code.bulix.org/bq6yyh-96549
> 
> 

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20160415/1f739a1b/attachment.sig>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
  2016-04-15 12:03                     ` Markus Pargmann
@ 2016-04-15 15:33                       ` Han Xu
  -1 siblings, 0 replies; 32+ messages in thread
From: Han Xu @ 2016-04-15 15:33 UTC (permalink / raw)
  To: Markus Pargmann, Boris Brezillon
  Cc: David Woodhouse, Fabio Estevam, linux-mtd, kernel, Huang Shijie,
	Brian Norris, linux-arm-kernel, Stefan Christ, Elie De Brauwer,
	Richard Weinberger, Artem Bityutskiy



________________________________________
From: Markus Pargmann <mpa@pengutronix.de>
Sent: Friday, April 15, 2016 6:03 AM
To: Boris Brezillon
Cc: Han Xu; David Woodhouse; Fabio Estevam; linux-mtd@lists.infradead.org; kernel@pengutronix.de; Huang Shijie; Brian Norris; linux-arm-kernel@lists.infradead.org; Stefan Christ; Elie De Brauwer; Richard Weinberger; Artem Bityutskiy
Subject: Re: [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages

Hi Boris,

On Friday 15 April 2016 11:39:06 Boris Brezillon wrote:
> Hi Markus,
>
> On Fri, 15 Apr 2016 11:35:07 +0200
> Markus Pargmann <mpa@pengutronix.de> wrote:
>
> > Hi Boris,
> >
> > On Friday 15 April 2016 10:35:08 Boris Brezillon wrote:
> > > Hi Markus,
> > >
> > > On Fri, 15 Apr 2016 09:55:45 +0200
> > > Markus Pargmann <mpa@pengutronix.de> wrote:
> > >
> > > > On Wednesday 13 April 2016 00:51:55 Boris Brezillon wrote:
> > > > > On Tue, 12 Apr 2016 22:39:08 +0000
> > > > > Han Xu <han.xu@nxp.com> wrote:
> > > > >
> > > > > > > Thanks for the feedback. Talking with a coworker about this we may have found a
> > > > > > > better approach to this that is less complicated to implement. The hardware
> > > > > > > unit allows us to set a bitflip threshold for erased pages. The ECC unit
> > > > > > > creates an ECC error only if the number of bitflips exceeds this threshold, but
> > > > > > > it does not correct these. So the idea is to change the patch so that we set
> > > > > > > pages, that are signaled by the ECC as erased, to 0xff completely without
> > > > > > > checking. So the ECC will do all the work and we completely trust in its
> > > > > > > abilities to do it correctly.
> > > > > >
> > > > > > Sounds good.
> > > > > >
> > > > > > some new platforms with new gpmi controller could check the count of 0 bits in page,
> > > > > > refer to my patch https://patchwork.ozlabs.org/patch/587124/
> > > > > >
> > > > > > But for all legacy platforms, IMO, considering bitflip is rare case, set threshold to 0 and
> > > > > > only check the uncorrectable branch and then correct data sounds better. Setting threshold
> > > > > > and correcting all erased page may highly impact the performance.
> > > > >
> > > > > Indeed, bitflips in erased pages is not so common, and penalizing the
> > > > > likely case (erased pages without any bitflips) doesn't look like a good
> > > > > idea in the end.
> > > >
> > > > Are erased pages really read that often?
> > >
> > > Yes, it's not unusual to have those "empty pages?" checks (added Artem
> > > and Richard to get a confirmation). AFAIR, UBIFS check for empty pages
> > > in its journal heads after an unclean unmount (which happens quite
> > > often) to make sure there's no corruption.
> > >
> > > > I am not sure how UBI handles
> > > > this, does it read every page before writing?
> > >
> > > Nope, or maybe it does when you activate some extra checks.
> > >
> > > >
> > > > >
> > > > > You can still implement this check in software. You can have a look at
> > > > > nand_check_erased_ecc_chunk() [1] if you need an example, but you'll
> > > > > have to adapt it because your controller does not guarantees that ECC
> > > > > bits for a given chunk are byte aligned :-/
> > > >
> > > > Yes I used this function in the patch. The issue is that I am not quite
> > > > sure yet where to find the raw ECC data (without rereading the page).
> > > > The reference manual is not extremely clear about that, ecc data may be
> > > > in the 'auxilliary data' but I am not sure that it really is available
> > > > somewhere.
> > >
> > > AFAIR (and I'm not sure since it was a long time ago), you don't have
> > > direct access to ECC bytes with the GPMI engine. If that's the case,
> > > you'll have to read the ECC bytes manually (moving the page pointer
> > > using ->cmdfunc(NAND_CMD_RNDOUT, column, -1)), which is a pain with
> > > this engine, because ECC bytes are not guaranteed to be byte aligned
> > > (see gpmi ->read_page_raw() implementation).
> > > Once you've retrieved ECC bytes (or bits in this case), for each ECC
> > > chunk, you can use the nand_check_erased_ecc_chunk() function (just make
> > > sure you're padding the last ECC byte of each chunk with ones so that
> > > bitflips cannot be reported on this section).
> >
> > Thanks for the information. So I understand that this approach is the
> > preferred one to avoid any performance issues for normal operation.
> >
> > I actually won't be able to fix this patch accordingly for some time. If
> > anyone else needs this earlier, feel free to implement it.
>
> I just did [1] (it applies on top of your patch), but maybe you
> can test it (I don't have any imx platforms right now) ;).

Great, thank you :). I just tested the patch and it works for me. The
erased page bitflips are still detected and fixed. I will send a new
version then.

Hi Markus,

Could you please share how to verify the patch, in other words, how to reproduce the
UBIFS corruption issue consistently. Thanks.

Best Regards,

Markus

>
> If these changes work, feel free to squash them into your previous
> patch.
>
> Thanks,
>
> Boris
>
> [1]http://code.bulix.org/bq6yyh-96549
>
>

--
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
@ 2016-04-15 15:33                       ` Han Xu
  0 siblings, 0 replies; 32+ messages in thread
From: Han Xu @ 2016-04-15 15:33 UTC (permalink / raw)
  To: linux-arm-kernel



________________________________________
From: Markus Pargmann <mpa@pengutronix.de>
Sent: Friday, April 15, 2016 6:03 AM
To: Boris Brezillon
Cc: Han Xu; David Woodhouse; Fabio Estevam; linux-mtd at lists.infradead.org; kernel at pengutronix.de; Huang Shijie; Brian Norris; linux-arm-kernel at lists.infradead.org; Stefan Christ; Elie De Brauwer; Richard Weinberger; Artem Bityutskiy
Subject: Re: [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages

Hi Boris,

On Friday 15 April 2016 11:39:06 Boris Brezillon wrote:
> Hi Markus,
>
> On Fri, 15 Apr 2016 11:35:07 +0200
> Markus Pargmann <mpa@pengutronix.de> wrote:
>
> > Hi Boris,
> >
> > On Friday 15 April 2016 10:35:08 Boris Brezillon wrote:
> > > Hi Markus,
> > >
> > > On Fri, 15 Apr 2016 09:55:45 +0200
> > > Markus Pargmann <mpa@pengutronix.de> wrote:
> > >
> > > > On Wednesday 13 April 2016 00:51:55 Boris Brezillon wrote:
> > > > > On Tue, 12 Apr 2016 22:39:08 +0000
> > > > > Han Xu <han.xu@nxp.com> wrote:
> > > > >
> > > > > > > Thanks for the feedback. Talking with a coworker about this we may have found a
> > > > > > > better approach to this that is less complicated to implement. The hardware
> > > > > > > unit allows us to set a bitflip threshold for erased pages. The ECC unit
> > > > > > > creates an ECC error only if the number of bitflips exceeds this threshold, but
> > > > > > > it does not correct these. So the idea is to change the patch so that we set
> > > > > > > pages, that are signaled by the ECC as erased, to 0xff completely without
> > > > > > > checking. So the ECC will do all the work and we completely trust in its
> > > > > > > abilities to do it correctly.
> > > > > >
> > > > > > Sounds good.
> > > > > >
> > > > > > some new platforms with new gpmi controller could check the count of 0 bits in page,
> > > > > > refer to my patch https://patchwork.ozlabs.org/patch/587124/
> > > > > >
> > > > > > But for all legacy platforms, IMO, considering bitflip is rare case, set threshold to 0 and
> > > > > > only check the uncorrectable branch and then correct data sounds better. Setting threshold
> > > > > > and correcting all erased page may highly impact the performance.
> > > > >
> > > > > Indeed, bitflips in erased pages is not so common, and penalizing the
> > > > > likely case (erased pages without any bitflips) doesn't look like a good
> > > > > idea in the end.
> > > >
> > > > Are erased pages really read that often?
> > >
> > > Yes, it's not unusual to have those "empty pages?" checks (added Artem
> > > and Richard to get a confirmation). AFAIR, UBIFS check for empty pages
> > > in its journal heads after an unclean unmount (which happens quite
> > > often) to make sure there's no corruption.
> > >
> > > > I am not sure how UBI handles
> > > > this, does it read every page before writing?
> > >
> > > Nope, or maybe it does when you activate some extra checks.
> > >
> > > >
> > > > >
> > > > > You can still implement this check in software. You can have a look at
> > > > > nand_check_erased_ecc_chunk() [1] if you need an example, but you'll
> > > > > have to adapt it because your controller does not guarantees that ECC
> > > > > bits for a given chunk are byte aligned :-/
> > > >
> > > > Yes I used this function in the patch. The issue is that I am not quite
> > > > sure yet where to find the raw ECC data (without rereading the page).
> > > > The reference manual is not extremely clear about that, ecc data may be
> > > > in the 'auxilliary data' but I am not sure that it really is available
> > > > somewhere.
> > >
> > > AFAIR (and I'm not sure since it was a long time ago), you don't have
> > > direct access to ECC bytes with the GPMI engine. If that's the case,
> > > you'll have to read the ECC bytes manually (moving the page pointer
> > > using ->cmdfunc(NAND_CMD_RNDOUT, column, -1)), which is a pain with
> > > this engine, because ECC bytes are not guaranteed to be byte aligned
> > > (see gpmi ->read_page_raw() implementation).
> > > Once you've retrieved ECC bytes (or bits in this case), for each ECC
> > > chunk, you can use the nand_check_erased_ecc_chunk() function (just make
> > > sure you're padding the last ECC byte of each chunk with ones so that
> > > bitflips cannot be reported on this section).
> >
> > Thanks for the information. So I understand that this approach is the
> > preferred one to avoid any performance issues for normal operation.
> >
> > I actually won't be able to fix this patch accordingly for some time. If
> > anyone else needs this earlier, feel free to implement it.
>
> I just did [1] (it applies on top of your patch), but maybe you
> can test it (I don't have any imx platforms right now) ;).

Great, thank you :). I just tested the patch and it works for me. The
erased page bitflips are still detected and fixed. I will send a new
version then.

Hi Markus,

Could you please share how to verify the patch, in other words, how to reproduce the
UBIFS corruption issue consistently. Thanks.

Best Regards,

Markus

>
> If these changes work, feel free to squash them into your previous
> patch.
>
> Thanks,
>
> Boris
>
> [1]http://code.bulix.org/bq6yyh-96549
>
>

--
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
  2016-04-15 15:33                       ` Han Xu
@ 2016-04-15 15:40                         ` Boris Brezillon
  -1 siblings, 0 replies; 32+ messages in thread
From: Boris Brezillon @ 2016-04-15 15:40 UTC (permalink / raw)
  To: Han Xu
  Cc: Markus Pargmann, David Woodhouse, Fabio Estevam, linux-mtd,
	kernel, Huang Shijie, Brian Norris, linux-arm-kernel,
	Stefan Christ, Elie De Brauwer, Richard Weinberger,
	Artem Bityutskiy

On Fri, 15 Apr 2016 15:33:07 +0000
Han Xu <han.xu@nxp.com> wrote:

> >
> > I just did [1] (it applies on top of your patch), but maybe you
> > can test it (I don't have any imx platforms right now) ;).
> 
> Great, thank you :). I just tested the patch and it works for me. The
> erased page bitflips are still detected and fixed. I will send a new
> version then.
> 
> Hi Markus,
> 
> Could you please share how to verify the patch, in other words, how to reproduce the
> UBIFS corruption issue consistently. Thanks.

I should really post a new version of the nandflipbits tool [1].
That's clearly the easiest solution to verify that your bitflip
correction is reliable.


[1]http://lists.infradead.org/pipermail/linux-mtd/2014-November/056634.html

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
@ 2016-04-15 15:40                         ` Boris Brezillon
  0 siblings, 0 replies; 32+ messages in thread
From: Boris Brezillon @ 2016-04-15 15:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 15 Apr 2016 15:33:07 +0000
Han Xu <han.xu@nxp.com> wrote:

> >
> > I just did [1] (it applies on top of your patch), but maybe you
> > can test it (I don't have any imx platforms right now) ;).
> 
> Great, thank you :). I just tested the patch and it works for me. The
> erased page bitflips are still detected and fixed. I will send a new
> version then.
> 
> Hi Markus,
> 
> Could you please share how to verify the patch, in other words, how to reproduce the
> UBIFS corruption issue consistently. Thanks.

I should really post a new version of the nandflipbits tool [1].
That's clearly the easiest solution to verify that your bitflip
correction is reliable.


[1]http://lists.infradead.org/pipermail/linux-mtd/2014-November/056634.html

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
  2016-04-15 15:40                         ` Boris Brezillon
@ 2016-04-18 10:07                           ` Markus Pargmann
  -1 siblings, 0 replies; 32+ messages in thread
From: Markus Pargmann @ 2016-04-18 10:07 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Boris Brezillon, Han Xu, Fabio Estevam, Elie De Brauwer,
	Artem Bityutskiy, Stefan Christ, Huang Shijie, linux-mtd, kernel,
	Richard Weinberger, Brian Norris, David Woodhouse

[-- Attachment #1: Type: text/plain, Size: 1485 bytes --]

On Friday 15 April 2016 17:40:31 Boris Brezillon wrote:
> On Fri, 15 Apr 2016 15:33:07 +0000
> Han Xu <han.xu@nxp.com> wrote:
> 
> > >
> > > I just did [1] (it applies on top of your patch), but maybe you
> > > can test it (I don't have any imx platforms right now) ;).
> > 
> > Great, thank you :). I just tested the patch and it works for me. The
> > erased page bitflips are still detected and fixed. I will send a new
> > version then.
> > 
> > Hi Markus,
> > 
> > Could you please share how to verify the patch, in other words, how to reproduce the
> > UBIFS corruption issue consistently. Thanks.

I used a simple bashscript with 'nandwrite --noecc' that writes a number
of zeroes in every sub-page. For written pages the ECC will handle the
flips. For erased pages we can test the algorithm that handles the
erased page bitflips.

> 
> I should really post a new version of the nandflipbits tool [1].
> That's clearly the easiest solution to verify that your bitflip
> correction is reliable.

Seems really useful.

Best Regards,

Markus

> 
> 
> [1]http://lists.infradead.org/pipermail/linux-mtd/2014-November/056634.html
> 
> 

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
@ 2016-04-18 10:07                           ` Markus Pargmann
  0 siblings, 0 replies; 32+ messages in thread
From: Markus Pargmann @ 2016-04-18 10:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Friday 15 April 2016 17:40:31 Boris Brezillon wrote:
> On Fri, 15 Apr 2016 15:33:07 +0000
> Han Xu <han.xu@nxp.com> wrote:
> 
> > >
> > > I just did [1] (it applies on top of your patch), but maybe you
> > > can test it (I don't have any imx platforms right now) ;).
> > 
> > Great, thank you :). I just tested the patch and it works for me. The
> > erased page bitflips are still detected and fixed. I will send a new
> > version then.
> > 
> > Hi Markus,
> > 
> > Could you please share how to verify the patch, in other words, how to reproduce the
> > UBIFS corruption issue consistently. Thanks.

I used a simple bashscript with 'nandwrite --noecc' that writes a number
of zeroes in every sub-page. For written pages the ECC will handle the
flips. For erased pages we can test the algorithm that handles the
erased page bitflips.

> 
> I should really post a new version of the nandflipbits tool [1].
> That's clearly the easiest solution to verify that your bitflip
> correction is reliable.

Seems really useful.

Best Regards,

Markus

> 
> 
> [1]http://lists.infradead.org/pipermail/linux-mtd/2014-November/056634.html
> 
> 

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20160418/b0f1222c/attachment-0001.sig>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
  2016-04-15  9:39                   ` Boris Brezillon
@ 2016-04-18 14:47                     ` Stefan Christ
  -1 siblings, 0 replies; 32+ messages in thread
From: Stefan Christ @ 2016-04-18 14:47 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Markus Pargmann, Fabio Estevam, Elie De Brauwer,
	Artem Bityutskiy, linux-mtd, kernel, Richard Weinberger,
	Huang Shijie, Han Xu, Brian Norris, David Woodhouse,
	linux-arm-kernel

Hi Boris,

On Fri, Apr 15, 2016 at 11:39:06AM +0200, Boris Brezillon wrote:
> Hi Markus,
> 
> On Fri, 15 Apr 2016 11:35:07 +0200
> Markus Pargmann <mpa@pengutronix.de> wrote:
> 
> > Hi Boris,
> > 
> > On Friday 15 April 2016 10:35:08 Boris Brezillon wrote:
> > > Hi Markus,
> > > 
> > > On Fri, 15 Apr 2016 09:55:45 +0200
> > > Markus Pargmann <mpa@pengutronix.de> wrote:
> > > 
> > > > On Wednesday 13 April 2016 00:51:55 Boris Brezillon wrote:
> > > > > On Tue, 12 Apr 2016 22:39:08 +0000
> > > > > Han Xu <han.xu@nxp.com> wrote:
> > > > > 
> > > > > > > Thanks for the feedback. Talking with a coworker about this we may have found a
> > > > > > > better approach to this that is less complicated to implement. The hardware
> > > > > > > unit allows us to set a bitflip threshold for erased pages. The ECC unit
> > > > > > > creates an ECC error only if the number of bitflips exceeds this threshold, but
> > > > > > > it does not correct these. So the idea is to change the patch so that we set
> > > > > > > pages, that are signaled by the ECC as erased, to 0xff completely without
> > > > > > > checking. So the ECC will do all the work and we completely trust in its
> > > > > > > abilities to do it correctly.
> > > > > > 
> > > > > > Sounds good.
> > > > > > 
> > > > > > some new platforms with new gpmi controller could check the count of 0 bits in page,
> > > > > > refer to my patch https://patchwork.ozlabs.org/patch/587124/
> > > > > > 
> > > > > > But for all legacy platforms, IMO, considering bitflip is rare case, set threshold to 0 and
> > > > > > only check the uncorrectable branch and then correct data sounds better. Setting threshold
> > > > > > and correcting all erased page may highly impact the performance.
> > > > > 
> > > > > Indeed, bitflips in erased pages is not so common, and penalizing the
> > > > > likely case (erased pages without any bitflips) doesn't look like a good
> > > > > idea in the end.
> > > > 
> > > > Are erased pages really read that often?
> > > 
> > > Yes, it's not unusual to have those "empty pages?" checks (added Artem
> > > and Richard to get a confirmation). AFAIR, UBIFS check for empty pages
> > > in its journal heads after an unclean unmount (which happens quite
> > > often) to make sure there's no corruption.
> > > 
> > > > I am not sure how UBI handles
> > > > this, does it read every page before writing?
> > > 
> > > Nope, or maybe it does when you activate some extra checks.
> > > 
> > > > 
> > > > > 
> > > > > You can still implement this check in software. You can have a look at
> > > > > nand_check_erased_ecc_chunk() [1] if you need an example, but you'll
> > > > > have to adapt it because your controller does not guarantees that ECC
> > > > > bits for a given chunk are byte aligned :-/
> > > > 
> > > > Yes I used this function in the patch. The issue is that I am not quite
> > > > sure yet where to find the raw ECC data (without rereading the page).
> > > > The reference manual is not extremely clear about that, ecc data may be
> > > > in the 'auxilliary data' but I am not sure that it really is available
> > > > somewhere.
> > > 
> > > AFAIR (and I'm not sure since it was a long time ago), you don't have
> > > direct access to ECC bytes with the GPMI engine. If that's the case,
> > > you'll have to read the ECC bytes manually (moving the page pointer
> > > using ->cmdfunc(NAND_CMD_RNDOUT, column, -1)), which is a pain with
> > > this engine, because ECC bytes are not guaranteed to be byte aligned
> > > (see gpmi ->read_page_raw() implementation).
> > > Once you've retrieved ECC bytes (or bits in this case), for each ECC
> > > chunk, you can use the nand_check_erased_ecc_chunk() function (just make
> > > sure you're padding the last ECC byte of each chunk with ones so that
> > > bitflips cannot be reported on this section).
> > 
> > Thanks for the information. So I understand that this approach is the
> > preferred one to avoid any performance issues for normal operation.
> > 
> > I actually won't be able to fix this patch accordingly for some time. If
> > anyone else needs this earlier, feel free to implement it.
> 
> I just did [1] (it applies on top of your patch), but maybe you
> can test it (I don't have any imx platforms right now) ;).
> 
> If these changes work, feel free to squash them into your previous
> patch.

I've tested your diff onto Markus Pargmann's patch. It looks promising.

However I've noticed that the calculation of the ECC parity bits position is
wrong.  It doesn't consider the extra metadata bytes at the beginning of the
raw page and that the ECC parity bits are at the end of the ECC chunk. My test
platform is the i.MX6 with two NAND flashes

    nand: Samsung NAND 1GiB 3,3V 8-bit
    nand: 1024 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
    (-> 104 Bits ECC )

and
 
    nand: AMD/Spansion S34ML08G2
    nand: 1024 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 128
    (-> 234 Bits ECC )

I've also tested the bit alignment code. It works correctly for the Spansion
NAND, as the 234 Bits of ECC are 29.25 Bytes on the NAND flash. So there the
parity bits are not byte aligned.

Mit freundlichen Grüßen / Kind regards,
        Stefan Christ

The corrected ECC parity bit code is:

---->8----
diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
index 2f16d7f..ccae6e6 100644
--- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
@@ -1054,7 +1054,9 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 			int flips;
 
 			/* Read ECC bytes into our internal raw_buffer */
-			offset = ((8 * nfc_geo->ecc_chunk_size) + eccbits) * i;
+			offset = nfc_geo->metadata_size * 8;
+			offset += ((8 * nfc_geo->ecc_chunk_size) + eccbits) * (i + 1);
+			offset -= eccbits;
 			bitoffset = offset % 8;
 			eccbytes = DIV_ROUND_UP(offset + eccbits, 8);
 			offset /= 8;

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
@ 2016-04-18 14:47                     ` Stefan Christ
  0 siblings, 0 replies; 32+ messages in thread
From: Stefan Christ @ 2016-04-18 14:47 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Boris,

On Fri, Apr 15, 2016 at 11:39:06AM +0200, Boris Brezillon wrote:
> Hi Markus,
> 
> On Fri, 15 Apr 2016 11:35:07 +0200
> Markus Pargmann <mpa@pengutronix.de> wrote:
> 
> > Hi Boris,
> > 
> > On Friday 15 April 2016 10:35:08 Boris Brezillon wrote:
> > > Hi Markus,
> > > 
> > > On Fri, 15 Apr 2016 09:55:45 +0200
> > > Markus Pargmann <mpa@pengutronix.de> wrote:
> > > 
> > > > On Wednesday 13 April 2016 00:51:55 Boris Brezillon wrote:
> > > > > On Tue, 12 Apr 2016 22:39:08 +0000
> > > > > Han Xu <han.xu@nxp.com> wrote:
> > > > > 
> > > > > > > Thanks for the feedback. Talking with a coworker about this we may have found a
> > > > > > > better approach to this that is less complicated to implement. The hardware
> > > > > > > unit allows us to set a bitflip threshold for erased pages. The ECC unit
> > > > > > > creates an ECC error only if the number of bitflips exceeds this threshold, but
> > > > > > > it does not correct these. So the idea is to change the patch so that we set
> > > > > > > pages, that are signaled by the ECC as erased, to 0xff completely without
> > > > > > > checking. So the ECC will do all the work and we completely trust in its
> > > > > > > abilities to do it correctly.
> > > > > > 
> > > > > > Sounds good.
> > > > > > 
> > > > > > some new platforms with new gpmi controller could check the count of 0 bits in page,
> > > > > > refer to my patch https://patchwork.ozlabs.org/patch/587124/
> > > > > > 
> > > > > > But for all legacy platforms, IMO, considering bitflip is rare case, set threshold to 0 and
> > > > > > only check the uncorrectable branch and then correct data sounds better. Setting threshold
> > > > > > and correcting all erased page may highly impact the performance.
> > > > > 
> > > > > Indeed, bitflips in erased pages is not so common, and penalizing the
> > > > > likely case (erased pages without any bitflips) doesn't look like a good
> > > > > idea in the end.
> > > > 
> > > > Are erased pages really read that often?
> > > 
> > > Yes, it's not unusual to have those "empty pages?" checks (added Artem
> > > and Richard to get a confirmation). AFAIR, UBIFS check for empty pages
> > > in its journal heads after an unclean unmount (which happens quite
> > > often) to make sure there's no corruption.
> > > 
> > > > I am not sure how UBI handles
> > > > this, does it read every page before writing?
> > > 
> > > Nope, or maybe it does when you activate some extra checks.
> > > 
> > > > 
> > > > > 
> > > > > You can still implement this check in software. You can have a look at
> > > > > nand_check_erased_ecc_chunk() [1] if you need an example, but you'll
> > > > > have to adapt it because your controller does not guarantees that ECC
> > > > > bits for a given chunk are byte aligned :-/
> > > > 
> > > > Yes I used this function in the patch. The issue is that I am not quite
> > > > sure yet where to find the raw ECC data (without rereading the page).
> > > > The reference manual is not extremely clear about that, ecc data may be
> > > > in the 'auxilliary data' but I am not sure that it really is available
> > > > somewhere.
> > > 
> > > AFAIR (and I'm not sure since it was a long time ago), you don't have
> > > direct access to ECC bytes with the GPMI engine. If that's the case,
> > > you'll have to read the ECC bytes manually (moving the page pointer
> > > using ->cmdfunc(NAND_CMD_RNDOUT, column, -1)), which is a pain with
> > > this engine, because ECC bytes are not guaranteed to be byte aligned
> > > (see gpmi ->read_page_raw() implementation).
> > > Once you've retrieved ECC bytes (or bits in this case), for each ECC
> > > chunk, you can use the nand_check_erased_ecc_chunk() function (just make
> > > sure you're padding the last ECC byte of each chunk with ones so that
> > > bitflips cannot be reported on this section).
> > 
> > Thanks for the information. So I understand that this approach is the
> > preferred one to avoid any performance issues for normal operation.
> > 
> > I actually won't be able to fix this patch accordingly for some time. If
> > anyone else needs this earlier, feel free to implement it.
> 
> I just did [1] (it applies on top of your patch), but maybe you
> can test it (I don't have any imx platforms right now) ;).
> 
> If these changes work, feel free to squash them into your previous
> patch.

I've tested your diff onto Markus Pargmann's patch. It looks promising.

However I've noticed that the calculation of the ECC parity bits position is
wrong.  It doesn't consider the extra metadata bytes at the beginning of the
raw page and that the ECC parity bits are at the end of the ECC chunk. My test
platform is the i.MX6 with two NAND flashes

    nand: Samsung NAND 1GiB 3,3V 8-bit
    nand: 1024 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
    (-> 104 Bits ECC )

and
 
    nand: AMD/Spansion S34ML08G2
    nand: 1024 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 128
    (-> 234 Bits ECC )

I've also tested the bit alignment code. It works correctly for the Spansion
NAND, as the 234 Bits of ECC are 29.25 Bytes on the NAND flash. So there the
parity bits are not byte aligned.

Mit freundlichen Gr??en / Kind regards,
        Stefan Christ

The corrected ECC parity bit code is:

---->8----
diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
index 2f16d7f..ccae6e6 100644
--- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
@@ -1054,7 +1054,9 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 			int flips;
 
 			/* Read ECC bytes into our internal raw_buffer */
-			offset = ((8 * nfc_geo->ecc_chunk_size) + eccbits) * i;
+			offset = nfc_geo->metadata_size * 8;
+			offset += ((8 * nfc_geo->ecc_chunk_size) + eccbits) * (i + 1);
+			offset -= eccbits;
 			bitoffset = offset % 8;
 			eccbytes = DIV_ROUND_UP(offset + eccbits, 8);
 			offset /= 8;

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
  2016-04-18 14:47                     ` Stefan Christ
@ 2016-04-18 15:10                       ` Boris Brezillon
  -1 siblings, 0 replies; 32+ messages in thread
From: Boris Brezillon @ 2016-04-18 15:10 UTC (permalink / raw)
  To: Stefan Christ
  Cc: Markus Pargmann, Fabio Estevam, Elie De Brauwer,
	Artem Bityutskiy, linux-mtd, kernel, Richard Weinberger,
	Huang Shijie, Brian Norris, David Woodhouse, linux-arm-kernel

On Mon, 18 Apr 2016 16:47:20 +0200
Stefan Christ <s.christ@phytec.de> wrote:

> Hi Boris,
> 
> On Fri, Apr 15, 2016 at 11:39:06AM +0200, Boris Brezillon wrote:
> > Hi Markus,
> > 
> > On Fri, 15 Apr 2016 11:35:07 +0200
> > Markus Pargmann <mpa@pengutronix.de> wrote:
> > 
> > > Hi Boris,
> > > 
> > > On Friday 15 April 2016 10:35:08 Boris Brezillon wrote:
> > > > Hi Markus,
> > > > 
> > > > On Fri, 15 Apr 2016 09:55:45 +0200
> > > > Markus Pargmann <mpa@pengutronix.de> wrote:
> > > > 
> > > > > On Wednesday 13 April 2016 00:51:55 Boris Brezillon wrote:
> > > > > > On Tue, 12 Apr 2016 22:39:08 +0000
> > > > > > Han Xu <han.xu@nxp.com> wrote:
> > > > > > 
> > > > > > > > Thanks for the feedback. Talking with a coworker about this we may have found a
> > > > > > > > better approach to this that is less complicated to implement. The hardware
> > > > > > > > unit allows us to set a bitflip threshold for erased pages. The ECC unit
> > > > > > > > creates an ECC error only if the number of bitflips exceeds this threshold, but
> > > > > > > > it does not correct these. So the idea is to change the patch so that we set
> > > > > > > > pages, that are signaled by the ECC as erased, to 0xff completely without
> > > > > > > > checking. So the ECC will do all the work and we completely trust in its
> > > > > > > > abilities to do it correctly.
> > > > > > > 
> > > > > > > Sounds good.
> > > > > > > 
> > > > > > > some new platforms with new gpmi controller could check the count of 0 bits in page,
> > > > > > > refer to my patch https://patchwork.ozlabs.org/patch/587124/
> > > > > > > 
> > > > > > > But for all legacy platforms, IMO, considering bitflip is rare case, set threshold to 0 and
> > > > > > > only check the uncorrectable branch and then correct data sounds better. Setting threshold
> > > > > > > and correcting all erased page may highly impact the performance.
> > > > > > 
> > > > > > Indeed, bitflips in erased pages is not so common, and penalizing the
> > > > > > likely case (erased pages without any bitflips) doesn't look like a good
> > > > > > idea in the end.
> > > > > 
> > > > > Are erased pages really read that often?
> > > > 
> > > > Yes, it's not unusual to have those "empty pages?" checks (added Artem
> > > > and Richard to get a confirmation). AFAIR, UBIFS check for empty pages
> > > > in its journal heads after an unclean unmount (which happens quite
> > > > often) to make sure there's no corruption.
> > > > 
> > > > > I am not sure how UBI handles
> > > > > this, does it read every page before writing?
> > > > 
> > > > Nope, or maybe it does when you activate some extra checks.
> > > > 
> > > > > 
> > > > > > 
> > > > > > You can still implement this check in software. You can have a look at
> > > > > > nand_check_erased_ecc_chunk() [1] if you need an example, but you'll
> > > > > > have to adapt it because your controller does not guarantees that ECC
> > > > > > bits for a given chunk are byte aligned :-/
> > > > > 
> > > > > Yes I used this function in the patch. The issue is that I am not quite
> > > > > sure yet where to find the raw ECC data (without rereading the page).
> > > > > The reference manual is not extremely clear about that, ecc data may be
> > > > > in the 'auxilliary data' but I am not sure that it really is available
> > > > > somewhere.
> > > > 
> > > > AFAIR (and I'm not sure since it was a long time ago), you don't have
> > > > direct access to ECC bytes with the GPMI engine. If that's the case,
> > > > you'll have to read the ECC bytes manually (moving the page pointer
> > > > using ->cmdfunc(NAND_CMD_RNDOUT, column, -1)), which is a pain with
> > > > this engine, because ECC bytes are not guaranteed to be byte aligned
> > > > (see gpmi ->read_page_raw() implementation).
> > > > Once you've retrieved ECC bytes (or bits in this case), for each ECC
> > > > chunk, you can use the nand_check_erased_ecc_chunk() function (just make
> > > > sure you're padding the last ECC byte of each chunk with ones so that
> > > > bitflips cannot be reported on this section).
> > > 
> > > Thanks for the information. So I understand that this approach is the
> > > preferred one to avoid any performance issues for normal operation.
> > > 
> > > I actually won't be able to fix this patch accordingly for some time. If
> > > anyone else needs this earlier, feel free to implement it.
> > 
> > I just did [1] (it applies on top of your patch), but maybe you
> > can test it (I don't have any imx platforms right now) ;).
> > 
> > If these changes work, feel free to squash them into your previous
> > patch.
> 
> I've tested your diff onto Markus Pargmann's patch. It looks promising.
> 
> However I've noticed that the calculation of the ECC parity bits position is
> wrong.  It doesn't consider the extra metadata bytes at the beginning of the
> raw page and that the ECC parity bits are at the end of the ECC chunk.

Oh, you're right. Thanks for fixing that.

> My test
> platform is the i.MX6 with two NAND flashes
> 
>     nand: Samsung NAND 1GiB 3,3V 8-bit
>     nand: 1024 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
>     (-> 104 Bits ECC )
> 
> and
>  
>     nand: AMD/Spansion S34ML08G2
>     nand: 1024 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 128
>     (-> 234 Bits ECC )
> 
> I've also tested the bit alignment code. It works correctly for the Spansion
> NAND, as the 234 Bits of ECC are 29.25 Bytes on the NAND flash. So there the
> parity bits are not byte aligned.

And thanks for testing.

Markus, if you resubmit your patch, please take Stefan's changes.

> 
> Mit freundlichen Grüßen / Kind regards,
>         Stefan Christ
> 
> The corrected ECC parity bit code is:
> 
> ---->8----
> diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> index 2f16d7f..ccae6e6 100644
> --- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> +++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> @@ -1054,7 +1054,9 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
>  			int flips;
>  
>  			/* Read ECC bytes into our internal raw_buffer */
> -			offset = ((8 * nfc_geo->ecc_chunk_size) + eccbits) * i;
> +			offset = nfc_geo->metadata_size * 8;
> +			offset += ((8 * nfc_geo->ecc_chunk_size) + eccbits) * (i + 1);
> +			offset -= eccbits;
>  			bitoffset = offset % 8;
>  			eccbytes = DIV_ROUND_UP(offset + eccbits, 8);
>  			offset /= 8;



-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages
@ 2016-04-18 15:10                       ` Boris Brezillon
  0 siblings, 0 replies; 32+ messages in thread
From: Boris Brezillon @ 2016-04-18 15:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 18 Apr 2016 16:47:20 +0200
Stefan Christ <s.christ@phytec.de> wrote:

> Hi Boris,
> 
> On Fri, Apr 15, 2016 at 11:39:06AM +0200, Boris Brezillon wrote:
> > Hi Markus,
> > 
> > On Fri, 15 Apr 2016 11:35:07 +0200
> > Markus Pargmann <mpa@pengutronix.de> wrote:
> > 
> > > Hi Boris,
> > > 
> > > On Friday 15 April 2016 10:35:08 Boris Brezillon wrote:
> > > > Hi Markus,
> > > > 
> > > > On Fri, 15 Apr 2016 09:55:45 +0200
> > > > Markus Pargmann <mpa@pengutronix.de> wrote:
> > > > 
> > > > > On Wednesday 13 April 2016 00:51:55 Boris Brezillon wrote:
> > > > > > On Tue, 12 Apr 2016 22:39:08 +0000
> > > > > > Han Xu <han.xu@nxp.com> wrote:
> > > > > > 
> > > > > > > > Thanks for the feedback. Talking with a coworker about this we may have found a
> > > > > > > > better approach to this that is less complicated to implement. The hardware
> > > > > > > > unit allows us to set a bitflip threshold for erased pages. The ECC unit
> > > > > > > > creates an ECC error only if the number of bitflips exceeds this threshold, but
> > > > > > > > it does not correct these. So the idea is to change the patch so that we set
> > > > > > > > pages, that are signaled by the ECC as erased, to 0xff completely without
> > > > > > > > checking. So the ECC will do all the work and we completely trust in its
> > > > > > > > abilities to do it correctly.
> > > > > > > 
> > > > > > > Sounds good.
> > > > > > > 
> > > > > > > some new platforms with new gpmi controller could check the count of 0 bits in page,
> > > > > > > refer to my patch https://patchwork.ozlabs.org/patch/587124/
> > > > > > > 
> > > > > > > But for all legacy platforms, IMO, considering bitflip is rare case, set threshold to 0 and
> > > > > > > only check the uncorrectable branch and then correct data sounds better. Setting threshold
> > > > > > > and correcting all erased page may highly impact the performance.
> > > > > > 
> > > > > > Indeed, bitflips in erased pages is not so common, and penalizing the
> > > > > > likely case (erased pages without any bitflips) doesn't look like a good
> > > > > > idea in the end.
> > > > > 
> > > > > Are erased pages really read that often?
> > > > 
> > > > Yes, it's not unusual to have those "empty pages?" checks (added Artem
> > > > and Richard to get a confirmation). AFAIR, UBIFS check for empty pages
> > > > in its journal heads after an unclean unmount (which happens quite
> > > > often) to make sure there's no corruption.
> > > > 
> > > > > I am not sure how UBI handles
> > > > > this, does it read every page before writing?
> > > > 
> > > > Nope, or maybe it does when you activate some extra checks.
> > > > 
> > > > > 
> > > > > > 
> > > > > > You can still implement this check in software. You can have a look at
> > > > > > nand_check_erased_ecc_chunk() [1] if you need an example, but you'll
> > > > > > have to adapt it because your controller does not guarantees that ECC
> > > > > > bits for a given chunk are byte aligned :-/
> > > > > 
> > > > > Yes I used this function in the patch. The issue is that I am not quite
> > > > > sure yet where to find the raw ECC data (without rereading the page).
> > > > > The reference manual is not extremely clear about that, ecc data may be
> > > > > in the 'auxilliary data' but I am not sure that it really is available
> > > > > somewhere.
> > > > 
> > > > AFAIR (and I'm not sure since it was a long time ago), you don't have
> > > > direct access to ECC bytes with the GPMI engine. If that's the case,
> > > > you'll have to read the ECC bytes manually (moving the page pointer
> > > > using ->cmdfunc(NAND_CMD_RNDOUT, column, -1)), which is a pain with
> > > > this engine, because ECC bytes are not guaranteed to be byte aligned
> > > > (see gpmi ->read_page_raw() implementation).
> > > > Once you've retrieved ECC bytes (or bits in this case), for each ECC
> > > > chunk, you can use the nand_check_erased_ecc_chunk() function (just make
> > > > sure you're padding the last ECC byte of each chunk with ones so that
> > > > bitflips cannot be reported on this section).
> > > 
> > > Thanks for the information. So I understand that this approach is the
> > > preferred one to avoid any performance issues for normal operation.
> > > 
> > > I actually won't be able to fix this patch accordingly for some time. If
> > > anyone else needs this earlier, feel free to implement it.
> > 
> > I just did [1] (it applies on top of your patch), but maybe you
> > can test it (I don't have any imx platforms right now) ;).
> > 
> > If these changes work, feel free to squash them into your previous
> > patch.
> 
> I've tested your diff onto Markus Pargmann's patch. It looks promising.
> 
> However I've noticed that the calculation of the ECC parity bits position is
> wrong.  It doesn't consider the extra metadata bytes at the beginning of the
> raw page and that the ECC parity bits are at the end of the ECC chunk.

Oh, you're right. Thanks for fixing that.

> My test
> platform is the i.MX6 with two NAND flashes
> 
>     nand: Samsung NAND 1GiB 3,3V 8-bit
>     nand: 1024 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
>     (-> 104 Bits ECC )
> 
> and
>  
>     nand: AMD/Spansion S34ML08G2
>     nand: 1024 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 128
>     (-> 234 Bits ECC )
> 
> I've also tested the bit alignment code. It works correctly for the Spansion
> NAND, as the 234 Bits of ECC are 29.25 Bytes on the NAND flash. So there the
> parity bits are not byte aligned.

And thanks for testing.

Markus, if you resubmit your patch, please take Stefan's changes.

> 
> Mit freundlichen Gr??en / Kind regards,
>         Stefan Christ
> 
> The corrected ECC parity bit code is:
> 
> ---->8----
> diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> index 2f16d7f..ccae6e6 100644
> --- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> +++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> @@ -1054,7 +1054,9 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
>  			int flips;
>  
>  			/* Read ECC bytes into our internal raw_buffer */
> -			offset = ((8 * nfc_geo->ecc_chunk_size) + eccbits) * i;
> +			offset = nfc_geo->metadata_size * 8;
> +			offset += ((8 * nfc_geo->ecc_chunk_size) + eccbits) * (i + 1);
> +			offset -= eccbits;
>  			bitoffset = offset % 8;
>  			eccbytes = DIV_ROUND_UP(offset + eccbits, 8);
>  			offset /= 8;



-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2016-04-18 15:10 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-21 12:52 [PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages Markus Pargmann
2016-02-21 12:52 ` Markus Pargmann
2016-02-24 13:57 ` Boris Brezillon
2016-02-24 13:57   ` Boris Brezillon
2016-04-11  6:34   ` Markus Pargmann
2016-04-11  6:34     ` Markus Pargmann
2016-04-11  7:29     ` Boris Brezillon
2016-04-11  7:29       ` Boris Brezillon
2016-04-12 22:39       ` Han Xu
2016-04-12 22:39         ` Han Xu
2016-04-12 22:51         ` Boris Brezillon
2016-04-12 22:51           ` Boris Brezillon
2016-04-15  7:55           ` Markus Pargmann
2016-04-15  7:55             ` Markus Pargmann
2016-04-15  8:35             ` Boris Brezillon
2016-04-15  8:35               ` Boris Brezillon
2016-04-15  9:35               ` Markus Pargmann
2016-04-15  9:35                 ` Markus Pargmann
2016-04-15  9:39                 ` Boris Brezillon
2016-04-15  9:39                   ` Boris Brezillon
2016-04-15 12:03                   ` Markus Pargmann
2016-04-15 12:03                     ` Markus Pargmann
2016-04-15 15:33                     ` Han Xu
2016-04-15 15:33                       ` Han Xu
2016-04-15 15:40                       ` Boris Brezillon
2016-04-15 15:40                         ` Boris Brezillon
2016-04-18 10:07                         ` Markus Pargmann
2016-04-18 10:07                           ` Markus Pargmann
2016-04-18 14:47                   ` Stefan Christ
2016-04-18 14:47                     ` Stefan Christ
2016-04-18 15:10                     ` Boris Brezillon
2016-04-18 15:10                       ` Boris Brezillon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.