All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] mtd: gpmi: Deal with bitflips in erased regions
@ 2016-03-15  8:35 Stefan Christ
  2016-04-11 11:18 ` Stefan Christ
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Stefan Christ @ 2016-03-15  8:35 UTC (permalink / raw)
  To: linux-mtd; +Cc: eliedebrauwer, han.xu, shijie8

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=y, Size: 6015 bytes --]

From: Elie De Brauwer <eliedebrauwer@gmail.com>

The BCH block typically used with a GPMI block on an i.MX28/i.MX6 is only
able to correct bitflips on data actually streamed through the block.
When erasing a block the data does not stream through the BCH block
and therefore no ECC data is written to the NAND chip. This causes
gpmi_ecc_read_page to return failure as soon as a single non-1-bit is
found in an erased page. Typically causing problems at higher levels
(ubifs corrupted empty space warnings). This problem was also observed
when using SLC NAND devices.

This patch configures the BCH block to mark a block as 'erased' if
not too much bitflips are found (by setting the erase threshold). A
consequence of this is that whenever an erased page is read, the
number of bitflips will be counted and corrected in software,
allowing the upper layers to take proper actions.

Signed-off-by: Elie De Brauwer <eliedebrauwer@gmail.com>
Acked-by: Peter Korsgaard <peter@korsgaard.com>
Acked-by: Huang Shijie <b32955@freescale.com>
---
Hi,

I'm reposting this patch, because it's still not in the mainline kernel tree.
It was taken from

    http://lists.infradead.org/pipermail/linux-mtd/2014-January/051357.html
    [PATCH v7] mtd: gpmi: Deal with bitflips in erased regions

The patch fixes the UBI corrupted empty space problem:

    UBIFS error (pid 74): ubifs_scan: corrupt empty space at LEB 3059:72260
    UBIFS error (pid 74): ubifs_scanned_corruption: corruption at LEB 3059:72260
    UBIFS error (pid 74): ubifs_scanned_corruption: first 8192 bytes from LEB 3059:72260
    UBIFS error (pid 74): ubifs_scan: LEB 3059 scanning failed

which occurs in real systems sometimes.

I've tested the patch.  You can add my

    Tested-by: Stefan Christ <s.christ@phytec.de>

Mit freundlichen Grüßen / Kind regards,
        Stefan Christ

---
 drivers/mtd/nand/gpmi-nand/bch-regs.h  |  1 +
 drivers/mtd/nand/gpmi-nand/gpmi-lib.c  |  9 +++++++
 drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 43 +++++++++++++++++++++++++++++++---
 3 files changed, 50 insertions(+), 3 deletions(-)

diff --git a/drivers/mtd/nand/gpmi-nand/bch-regs.h b/drivers/mtd/nand/gpmi-nand/bch-regs.h
index 05bb91f..88cc89d 100644
--- a/drivers/mtd/nand/gpmi-nand/bch-regs.h
+++ b/drivers/mtd/nand/gpmi-nand/bch-regs.h
@@ -31,6 +31,7 @@
 
 #define HW_BCH_STATUS0				0x00000010
 #define HW_BCH_MODE				0x00000020
+#define BM_BCH_MODE_ERASE_THRESHOLD_MASK	0xff
 #define HW_BCH_ENCODEPTR			0x00000030
 #define HW_BCH_DATAPTR				0x00000040
 #define HW_BCH_METAPTR				0x00000050
diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-lib.c b/drivers/mtd/nand/gpmi-nand/gpmi-lib.c
index 0f68a99..a724dcb 100644
--- a/drivers/mtd/nand/gpmi-nand/gpmi-lib.c
+++ b/drivers/mtd/nand/gpmi-nand/gpmi-lib.c
@@ -256,6 +256,7 @@ int bch_set_geometry(struct gpmi_nand_data *this)
 	unsigned int ecc_strength;
 	unsigned int page_size;
 	unsigned int gf_len;
+	unsigned int erase_threshold;
 	int ret;
 
 	if (common_nfc_set_geometry(this))
@@ -298,6 +299,14 @@ int bch_set_geometry(struct gpmi_nand_data *this)
 			| BF_BCH_FLASH0LAYOUT1_DATAN_SIZE(block_size, this),
 			r->bch_regs + HW_BCH_FLASH0LAYOUT1);
 
+	/* Set the tolerance for bitflips when reading erased blocks. */
+	erase_threshold = gf_len / 2;
+	if (erase_threshold > bch_geo->ecc_strength)
+		erase_threshold = bch_geo->ecc_strength;
+
+	writel(erase_threshold & BM_BCH_MODE_ERASE_THRESHOLD_MASK,
+		r->bch_regs + HW_BCH_MODE);
+
 	/* Set *all* chip selects to use layout 0. */
 	writel(0, r->bch_regs + HW_BCH_LAYOUTSELECT);
 
diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
index 235ddcb..ba852e8 100644
--- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
@@ -991,6 +991,30 @@ static void block_mark_swapping(struct gpmi_nand_data *this,
 	p[1] = (p[1] & mask) | (from_oob >> (8 - bit));
 }
 
+/*
+ * Count the number of 0 bits in a supposed to be
+ * erased region and correct them. Return the number
+ * of bitflips or zero when the region was correct.
+ */
+static unsigned int erased_sector_bitflips(unsigned char *data,
+					unsigned int chunk,
+					struct bch_geometry *geo)
+{
+	unsigned int flip_bits = 0;
+	int i;
+	int base = geo->ecc_chunk_size * chunk;
+
+	/* Count bitflips */
+	for (i = 0; i < geo->ecc_chunk_size; i++)
+		flip_bits += hweight8(~data[base + i]);
+
+	/* Correct bitflips by 0xFF'ing this chunk. */
+	if (flip_bits)
+		memset(&data[base], 0xFF, geo->ecc_chunk_size);
+
+	return flip_bits;
+}
+
 static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 				uint8_t *buf, int oob_required, int page)
 {
@@ -1002,6 +1026,7 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 	dma_addr_t    auxiliary_phys;
 	unsigned int  i;
 	unsigned char *status;
+	unsigned int  flips;
 	unsigned int  max_bitflips = 0;
 	int           ret;
 
@@ -1036,15 +1061,27 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 	status = auxiliary_virt + nfc_geo->auxiliary_status_offset;
 
 	for (i = 0; i < nfc_geo->ecc_chunk_count; i++, status++) {
-		if ((*status == STATUS_GOOD) || (*status == STATUS_ERASED))
+		if (*status == STATUS_GOOD)
 			continue;
 
 		if (*status == STATUS_UNCORRECTABLE) {
 			mtd->ecc_stats.failed++;
 			continue;
 		}
-		mtd->ecc_stats.corrected += *status;
-		max_bitflips = max_t(unsigned int, max_bitflips, *status);
+
+		/*
+		 * The number of bitflips are either counted in software
+		 * in case of an erased chunk or otherwise reported by
+		 * the BCH block.
+		 */
+		if (*status == STATUS_ERASED)
+			flips = erased_sector_bitflips(payload_virt, i,
+							nfc_geo);
+		else
+			flips = *status;
+
+		mtd->ecc_stats.corrected += flips;
+		max_bitflips = max_t(unsigned int, max_bitflips, flips);
 	}
 
 	if (oob_required) {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] mtd: gpmi: Deal with bitflips in erased regions
  2016-03-15  8:35 [PATCH] mtd: gpmi: Deal with bitflips in erased regions Stefan Christ
@ 2016-04-11 11:18 ` Stefan Christ
  2016-04-11 11:54 ` Boris Brezillon
  2016-04-11 13:08 ` Marc Kleine-Budde
  2 siblings, 0 replies; 5+ messages in thread
From: Stefan Christ @ 2016-04-11 11:18 UTC (permalink / raw)
  To: linux-mtd
  Cc: han.xu, boris.brezillon, richard, dwmw2, computersforpeace,
	eliedebrauwer

Hi,

*ping for my last email. Again this is a real issue for deployed devices and
should be fixed in the mainline kernel.

I hope that the patch author maybe step up and vote that his patch gets
accepted into mainline, too.

The original patch disccusion is

   http://lists.infradead.org/pipermail/linux-mtd/2014-January/051356.html

There seems to be no issues left with this patch.

Mit freundlichen Grüßen / Kind regards,
	Stefan Christ

On Tue, Mar 15, 2016 at 09:35:37AM +0100, Stefan Christ wrote:
> From: Elie De Brauwer <eliedebrauwer@gmail.com>
> 
> The BCH block typically used with a GPMI block on an i.MX28/i.MX6 is only
> able to correct bitflips on data actually streamed through the block.
> When erasing a block the data does not stream through the BCH block
> and therefore no ECC data is written to the NAND chip. This causes
> gpmi_ecc_read_page to return failure as soon as a single non-1-bit is
> found in an erased page. Typically causing problems at higher levels
> (ubifs corrupted empty space warnings). This problem was also observed
> when using SLC NAND devices.
> 
> This patch configures the BCH block to mark a block as 'erased' if
> not too much bitflips are found (by setting the erase threshold). A
> consequence of this is that whenever an erased page is read, the
> number of bitflips will be counted and corrected in software,
> allowing the upper layers to take proper actions.
> 
> Signed-off-by: Elie De Brauwer <eliedebrauwer@gmail.com>
> Acked-by: Peter Korsgaard <peter@korsgaard.com>
> Acked-by: Huang Shijie <b32955@freescale.com>
> ---
> Hi,
> 
> I'm reposting this patch, because it's still not in the mainline kernel tree.
> It was taken from
> 
>     http://lists.infradead.org/pipermail/linux-mtd/2014-January/051357.html
>     [PATCH v7] mtd: gpmi: Deal with bitflips in erased regions
> 
> The patch fixes the UBI corrupted empty space problem:
> 
>     UBIFS error (pid 74): ubifs_scan: corrupt empty space at LEB 3059:72260
>     UBIFS error (pid 74): ubifs_scanned_corruption: corruption at LEB 3059:72260
>     UBIFS error (pid 74): ubifs_scanned_corruption: first 8192 bytes from LEB 3059:72260
>     UBIFS error (pid 74): ubifs_scan: LEB 3059 scanning failed
> 
> which occurs in real systems sometimes.
> 
> I've tested the patch.  You can add my
> 
>     Tested-by: Stefan Christ <s.christ@phytec.de>
> 
> Mit freundlichen Grüßen / Kind regards,
>         Stefan Christ
> 
> ---
>  drivers/mtd/nand/gpmi-nand/bch-regs.h  |  1 +
>  drivers/mtd/nand/gpmi-nand/gpmi-lib.c  |  9 +++++++
>  drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 43 +++++++++++++++++++++++++++++++---
>  3 files changed, 50 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/mtd/nand/gpmi-nand/bch-regs.h b/drivers/mtd/nand/gpmi-nand/bch-regs.h
> index 05bb91f..88cc89d 100644
> --- a/drivers/mtd/nand/gpmi-nand/bch-regs.h
> +++ b/drivers/mtd/nand/gpmi-nand/bch-regs.h
> @@ -31,6 +31,7 @@
>  
>  #define HW_BCH_STATUS0				0x00000010
>  #define HW_BCH_MODE				0x00000020
> +#define BM_BCH_MODE_ERASE_THRESHOLD_MASK	0xff
>  #define HW_BCH_ENCODEPTR			0x00000030
>  #define HW_BCH_DATAPTR				0x00000040
>  #define HW_BCH_METAPTR				0x00000050
> diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-lib.c b/drivers/mtd/nand/gpmi-nand/gpmi-lib.c
> index 0f68a99..a724dcb 100644
> --- a/drivers/mtd/nand/gpmi-nand/gpmi-lib.c
> +++ b/drivers/mtd/nand/gpmi-nand/gpmi-lib.c
> @@ -256,6 +256,7 @@ int bch_set_geometry(struct gpmi_nand_data *this)
>  	unsigned int ecc_strength;
>  	unsigned int page_size;
>  	unsigned int gf_len;
> +	unsigned int erase_threshold;
>  	int ret;
>  
>  	if (common_nfc_set_geometry(this))
> @@ -298,6 +299,14 @@ int bch_set_geometry(struct gpmi_nand_data *this)
>  			| BF_BCH_FLASH0LAYOUT1_DATAN_SIZE(block_size, this),
>  			r->bch_regs + HW_BCH_FLASH0LAYOUT1);
>  
> +	/* Set the tolerance for bitflips when reading erased blocks. */
> +	erase_threshold = gf_len / 2;
> +	if (erase_threshold > bch_geo->ecc_strength)
> +		erase_threshold = bch_geo->ecc_strength;
> +
> +	writel(erase_threshold & BM_BCH_MODE_ERASE_THRESHOLD_MASK,
> +		r->bch_regs + HW_BCH_MODE);
> +
>  	/* Set *all* chip selects to use layout 0. */
>  	writel(0, r->bch_regs + HW_BCH_LAYOUTSELECT);
>  
> diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> index 235ddcb..ba852e8 100644
> --- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> +++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> @@ -991,6 +991,30 @@ static void block_mark_swapping(struct gpmi_nand_data *this,
>  	p[1] = (p[1] & mask) | (from_oob >> (8 - bit));
>  }
>  
> +/*
> + * Count the number of 0 bits in a supposed to be
> + * erased region and correct them. Return the number
> + * of bitflips or zero when the region was correct.
> + */
> +static unsigned int erased_sector_bitflips(unsigned char *data,
> +					unsigned int chunk,
> +					struct bch_geometry *geo)
> +{
> +	unsigned int flip_bits = 0;
> +	int i;
> +	int base = geo->ecc_chunk_size * chunk;
> +
> +	/* Count bitflips */
> +	for (i = 0; i < geo->ecc_chunk_size; i++)
> +		flip_bits += hweight8(~data[base + i]);
> +
> +	/* Correct bitflips by 0xFF'ing this chunk. */
> +	if (flip_bits)
> +		memset(&data[base], 0xFF, geo->ecc_chunk_size);
> +
> +	return flip_bits;
> +}
> +
>  static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
>  				uint8_t *buf, int oob_required, int page)
>  {
> @@ -1002,6 +1026,7 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
>  	dma_addr_t    auxiliary_phys;
>  	unsigned int  i;
>  	unsigned char *status;
> +	unsigned int  flips;
>  	unsigned int  max_bitflips = 0;
>  	int           ret;
>  
> @@ -1036,15 +1061,27 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
>  	status = auxiliary_virt + nfc_geo->auxiliary_status_offset;
>  
>  	for (i = 0; i < nfc_geo->ecc_chunk_count; i++, status++) {
> -		if ((*status == STATUS_GOOD) || (*status == STATUS_ERASED))
> +		if (*status == STATUS_GOOD)
>  			continue;
>  
>  		if (*status == STATUS_UNCORRECTABLE) {
>  			mtd->ecc_stats.failed++;
>  			continue;
>  		}
> -		mtd->ecc_stats.corrected += *status;
> -		max_bitflips = max_t(unsigned int, max_bitflips, *status);
> +
> +		/*
> +		 * The number of bitflips are either counted in software
> +		 * in case of an erased chunk or otherwise reported by
> +		 * the BCH block.
> +		 */
> +		if (*status == STATUS_ERASED)
> +			flips = erased_sector_bitflips(payload_virt, i,
> +							nfc_geo);
> +		else
> +			flips = *status;
> +
> +		mtd->ecc_stats.corrected += flips;
> +		max_bitflips = max_t(unsigned int, max_bitflips, flips);
>  	}
>  
>  	if (oob_required) {
> -- 
> 1.9.1
> 
> 

> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] mtd: gpmi: Deal with bitflips in erased regions
  2016-03-15  8:35 [PATCH] mtd: gpmi: Deal with bitflips in erased regions Stefan Christ
  2016-04-11 11:18 ` Stefan Christ
@ 2016-04-11 11:54 ` Boris Brezillon
  2016-04-15  7:49   ` Markus Pargmann
  2016-04-11 13:08 ` Marc Kleine-Budde
  2 siblings, 1 reply; 5+ messages in thread
From: Boris Brezillon @ 2016-04-11 11:54 UTC (permalink / raw)
  To: Stefan Christ; +Cc: linux-mtd, han.xu, eliedebrauwer, shijie8

Hi Stefan,

Sorry for the late reply.

On Tue, 15 Mar 2016 09:35:37 +0100
Stefan Christ <s.christ@phytec.de> wrote:

> From: Elie De Brauwer <eliedebrauwer@gmail.com>
> 
> The BCH block typically used with a GPMI block on an i.MX28/i.MX6 is only
> able to correct bitflips on data actually streamed through the block.
> When erasing a block the data does not stream through the BCH block
> and therefore no ECC data is written to the NAND chip. This causes
> gpmi_ecc_read_page to return failure as soon as a single non-1-bit is
> found in an erased page. Typically causing problems at higher levels
> (ubifs corrupted empty space warnings). This problem was also observed
> when using SLC NAND devices.
> 
> This patch configures the BCH block to mark a block as 'erased' if
> not too much bitflips are found (by setting the erase threshold). A
> consequence of this is that whenever an erased page is read, the
> number of bitflips will be counted and corrected in software,
> allowing the upper layers to take proper actions.
> 
> Signed-off-by: Elie De Brauwer <eliedebrauwer@gmail.com>
> Acked-by: Peter Korsgaard <peter@korsgaard.com>
> Acked-by: Huang Shijie <b32955@freescale.com>
> ---
> Hi,
> 
> I'm reposting this patch, because it's still not in the mainline kernel tree.
> It was taken from
> 
>     http://lists.infradead.org/pipermail/linux-mtd/2014-January/051357.html
>     [PATCH v7] mtd: gpmi: Deal with bitflips in erased regions
> 
> The patch fixes the UBI corrupted empty space problem:
> 
>     UBIFS error (pid 74): ubifs_scan: corrupt empty space at LEB 3059:72260
>     UBIFS error (pid 74): ubifs_scanned_corruption: corruption at LEB 3059:72260
>     UBIFS error (pid 74): ubifs_scanned_corruption: first 8192 bytes from LEB 3059:72260
>     UBIFS error (pid 74): ubifs_scan: LEB 3059 scanning failed
> 
> which occurs in real systems sometimes.
> 
> I've tested the patch.  You can add my
> 
>     Tested-by: Stefan Christ <s.christ@phytec.de>
> 
> Mit freundlichen Grüßen / Kind regards,
>         Stefan Christ
> 
> ---
>  drivers/mtd/nand/gpmi-nand/bch-regs.h  |  1 +
>  drivers/mtd/nand/gpmi-nand/gpmi-lib.c  |  9 +++++++
>  drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 43 +++++++++++++++++++++++++++++++---
>  3 files changed, 50 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/mtd/nand/gpmi-nand/bch-regs.h b/drivers/mtd/nand/gpmi-nand/bch-regs.h
> index 05bb91f..88cc89d 100644
> --- a/drivers/mtd/nand/gpmi-nand/bch-regs.h
> +++ b/drivers/mtd/nand/gpmi-nand/bch-regs.h
> @@ -31,6 +31,7 @@
>  
>  #define HW_BCH_STATUS0				0x00000010
>  #define HW_BCH_MODE				0x00000020
> +#define BM_BCH_MODE_ERASE_THRESHOLD_MASK	0xff
>  #define HW_BCH_ENCODEPTR			0x00000030
>  #define HW_BCH_DATAPTR				0x00000040
>  #define HW_BCH_METAPTR				0x00000050
> diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-lib.c b/drivers/mtd/nand/gpmi-nand/gpmi-lib.c
> index 0f68a99..a724dcb 100644
> --- a/drivers/mtd/nand/gpmi-nand/gpmi-lib.c
> +++ b/drivers/mtd/nand/gpmi-nand/gpmi-lib.c
> @@ -256,6 +256,7 @@ int bch_set_geometry(struct gpmi_nand_data *this)
>  	unsigned int ecc_strength;
>  	unsigned int page_size;
>  	unsigned int gf_len;
> +	unsigned int erase_threshold;
>  	int ret;
>  
>  	if (common_nfc_set_geometry(this))
> @@ -298,6 +299,14 @@ int bch_set_geometry(struct gpmi_nand_data *this)
>  			| BF_BCH_FLASH0LAYOUT1_DATAN_SIZE(block_size, this),
>  			r->bch_regs + HW_BCH_FLASH0LAYOUT1);
>  
> +	/* Set the tolerance for bitflips when reading erased blocks. */
> +	erase_threshold = gf_len / 2;
> +	if (erase_threshold > bch_geo->ecc_strength)
> +		erase_threshold = bch_geo->ecc_strength;

Can we make it consistent with other drivers and always use ecc_strength
instead of gf_len / 2?

> +
> +	writel(erase_threshold & BM_BCH_MODE_ERASE_THRESHOLD_MASK,
> +		r->bch_regs + HW_BCH_MODE);
> +
>  	/* Set *all* chip selects to use layout 0. */
>  	writel(0, r->bch_regs + HW_BCH_LAYOUTSELECT);
>  
> diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> index 235ddcb..ba852e8 100644
> --- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> +++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> @@ -991,6 +991,30 @@ static void block_mark_swapping(struct gpmi_nand_data *this,
>  	p[1] = (p[1] & mask) | (from_oob >> (8 - bit));
>  }
>  
> +/*
> + * Count the number of 0 bits in a supposed to be
> + * erased region and correct them. Return the number
> + * of bitflips or zero when the region was correct.
> + */
> +static unsigned int erased_sector_bitflips(unsigned char *data,
> +					unsigned int chunk,
> +					struct bch_geometry *geo)
> +{
> +	unsigned int flip_bits = 0;
> +	int i;
> +	int base = geo->ecc_chunk_size * chunk;
> +
> +	/* Count bitflips */
> +	for (i = 0; i < geo->ecc_chunk_size; i++)
> +		flip_bits += hweight8(~data[base + i]);

Hm, the number of bitflips in this case is not always correct: you're
only counting bitflips in the data section and are completely ignoring
ECC bytes.

If you don't have direct access to ECC bytes, you should read them and
count the number of bitflips in there too.

> +
> +	/* Correct bitflips by 0xFF'ing this chunk. */
> +	if (flip_bits)
> +		memset(&data[base], 0xFF, geo->ecc_chunk_size);
> +
> +	return flip_bits;
> +}
> +

Best Regards,

Boris

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] mtd: gpmi: Deal with bitflips in erased regions
  2016-03-15  8:35 [PATCH] mtd: gpmi: Deal with bitflips in erased regions Stefan Christ
  2016-04-11 11:18 ` Stefan Christ
  2016-04-11 11:54 ` Boris Brezillon
@ 2016-04-11 13:08 ` Marc Kleine-Budde
  2 siblings, 0 replies; 5+ messages in thread
From: Marc Kleine-Budde @ 2016-04-11 13:08 UTC (permalink / raw)
  To: Stefan Christ, linux-mtd; +Cc: han.xu, eliedebrauwer, shijie8, mpa, ptx kernel


[-- Attachment #1.1: Type: text/plain, Size: 1513 bytes --]

On 03/15/2016 09:35 AM, Stefan Christ wrote:
> From: Elie De Brauwer <eliedebrauwer@gmail.com>
> 
> The BCH block typically used with a GPMI block on an i.MX28/i.MX6 is only
> able to correct bitflips on data actually streamed through the block.
> When erasing a block the data does not stream through the BCH block
> and therefore no ECC data is written to the NAND chip. This causes
> gpmi_ecc_read_page to return failure as soon as a single non-1-bit is
> found in an erased page. Typically causing problems at higher levels
> (ubifs corrupted empty space warnings). This problem was also observed
> when using SLC NAND devices.
> 
> This patch configures the BCH block to mark a block as 'erased' if
> not too much bitflips are found (by setting the erase threshold). A
> consequence of this is that whenever an erased page is read, the
> number of bitflips will be counted and corrected in software,
> allowing the upper layers to take proper actions.
> 
> Signed-off-by: Elie De Brauwer <eliedebrauwer@gmail.com>
> Acked-by: Peter Korsgaard <peter@korsgaard.com>
> Acked-by: Huang Shijie <b32955@freescale.com>

You moght want to coordinate with Markus (Cc'ed) as he tries to solve
the same issue.

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] mtd: gpmi: Deal with bitflips in erased regions
  2016-04-11 11:54 ` Boris Brezillon
@ 2016-04-15  7:49   ` Markus Pargmann
  0 siblings, 0 replies; 5+ messages in thread
From: Markus Pargmann @ 2016-04-15  7:49 UTC (permalink / raw)
  To: Boris Brezillon; +Cc: Stefan Christ, han.xu, eliedebrauwer, linux-mtd, shijie8

Hi,

On Mon, Apr 11, 2016 at 01:54:46PM +0200, Boris Brezillon wrote:
> Hi Stefan,
> 
> Sorry for the late reply.
> 
> On Tue, 15 Mar 2016 09:35:37 +0100
> Stefan Christ <s.christ@phytec.de> wrote:
> 
> > From: Elie De Brauwer <eliedebrauwer@gmail.com>
> > 
> > The BCH block typically used with a GPMI block on an i.MX28/i.MX6 is only
> > able to correct bitflips on data actually streamed through the block.
> > When erasing a block the data does not stream through the BCH block
> > and therefore no ECC data is written to the NAND chip. This causes
> > gpmi_ecc_read_page to return failure as soon as a single non-1-bit is
> > found in an erased page. Typically causing problems at higher levels
> > (ubifs corrupted empty space warnings). This problem was also observed
> > when using SLC NAND devices.
> > 
> > This patch configures the BCH block to mark a block as 'erased' if
> > not too much bitflips are found (by setting the erase threshold). A
> > consequence of this is that whenever an erased page is read, the
> > number of bitflips will be counted and corrected in software,
> > allowing the upper layers to take proper actions.
> > 
> > Signed-off-by: Elie De Brauwer <eliedebrauwer@gmail.com>
> > Acked-by: Peter Korsgaard <peter@korsgaard.com>
> > Acked-by: Huang Shijie <b32955@freescale.com>
> > ---
> > Hi,
> > 
> > I'm reposting this patch, because it's still not in the mainline kernel tree.
> > It was taken from
> > 
> >     http://lists.infradead.org/pipermail/linux-mtd/2014-January/051357.html
> >     [PATCH v7] mtd: gpmi: Deal with bitflips in erased regions
> > 
> > The patch fixes the UBI corrupted empty space problem:
> > 
> >     UBIFS error (pid 74): ubifs_scan: corrupt empty space at LEB 3059:72260
> >     UBIFS error (pid 74): ubifs_scanned_corruption: corruption at LEB 3059:72260
> >     UBIFS error (pid 74): ubifs_scanned_corruption: first 8192 bytes from LEB 3059:72260
> >     UBIFS error (pid 74): ubifs_scan: LEB 3059 scanning failed
> > 
> > which occurs in real systems sometimes.
> > 
> > I've tested the patch.  You can add my
> > 
> >     Tested-by: Stefan Christ <s.christ@phytec.de>
> > 
> > Mit freundlichen Grüßen / Kind regards,
> >         Stefan Christ
> > 
> > ---
> >  drivers/mtd/nand/gpmi-nand/bch-regs.h  |  1 +
> >  drivers/mtd/nand/gpmi-nand/gpmi-lib.c  |  9 +++++++
> >  drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 43 +++++++++++++++++++++++++++++++---
> >  3 files changed, 50 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/mtd/nand/gpmi-nand/bch-regs.h b/drivers/mtd/nand/gpmi-nand/bch-regs.h
> > index 05bb91f..88cc89d 100644
> > --- a/drivers/mtd/nand/gpmi-nand/bch-regs.h
> > +++ b/drivers/mtd/nand/gpmi-nand/bch-regs.h
> > @@ -31,6 +31,7 @@
> >  
> >  #define HW_BCH_STATUS0				0x00000010
> >  #define HW_BCH_MODE				0x00000020
> > +#define BM_BCH_MODE_ERASE_THRESHOLD_MASK	0xff
> >  #define HW_BCH_ENCODEPTR			0x00000030
> >  #define HW_BCH_DATAPTR				0x00000040
> >  #define HW_BCH_METAPTR				0x00000050
> > diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-lib.c b/drivers/mtd/nand/gpmi-nand/gpmi-lib.c
> > index 0f68a99..a724dcb 100644
> > --- a/drivers/mtd/nand/gpmi-nand/gpmi-lib.c
> > +++ b/drivers/mtd/nand/gpmi-nand/gpmi-lib.c
> > @@ -256,6 +256,7 @@ int bch_set_geometry(struct gpmi_nand_data *this)
> >  	unsigned int ecc_strength;
> >  	unsigned int page_size;
> >  	unsigned int gf_len;
> > +	unsigned int erase_threshold;
> >  	int ret;
> >  
> >  	if (common_nfc_set_geometry(this))
> > @@ -298,6 +299,14 @@ int bch_set_geometry(struct gpmi_nand_data *this)
> >  			| BF_BCH_FLASH0LAYOUT1_DATAN_SIZE(block_size, this),
> >  			r->bch_regs + HW_BCH_FLASH0LAYOUT1);
> >  
> > +	/* Set the tolerance for bitflips when reading erased blocks. */
> > +	erase_threshold = gf_len / 2;
> > +	if (erase_threshold > bch_geo->ecc_strength)
> > +		erase_threshold = bch_geo->ecc_strength;
> 
> Can we make it consistent with other drivers and always use ecc_strength
> instead of gf_len / 2?
> 
> > +
> > +	writel(erase_threshold & BM_BCH_MODE_ERASE_THRESHOLD_MASK,
> > +		r->bch_regs + HW_BCH_MODE);
> > +
> >  	/* Set *all* chip selects to use layout 0. */
> >  	writel(0, r->bch_regs + HW_BCH_LAYOUTSELECT);
> >  
> > diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > index 235ddcb..ba852e8 100644
> > --- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > +++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> > @@ -991,6 +991,30 @@ static void block_mark_swapping(struct gpmi_nand_data *this,
> >  	p[1] = (p[1] & mask) | (from_oob >> (8 - bit));
> >  }
> >  
> > +/*
> > + * Count the number of 0 bits in a supposed to be
> > + * erased region and correct them. Return the number
> > + * of bitflips or zero when the region was correct.
> > + */
> > +static unsigned int erased_sector_bitflips(unsigned char *data,
> > +					unsigned int chunk,
> > +					struct bch_geometry *geo)
> > +{
> > +	unsigned int flip_bits = 0;
> > +	int i;
> > +	int base = geo->ecc_chunk_size * chunk;
> > +
> > +	/* Count bitflips */
> > +	for (i = 0; i < geo->ecc_chunk_size; i++)
> > +		flip_bits += hweight8(~data[base + i]);
> 
> Hm, the number of bitflips in this case is not always correct: you're
> only counting bitflips in the data section and are completely ignoring
> ECC bytes.
> 
> If you don't have direct access to ECC bytes, you should read them and
> count the number of bitflips in there too.

This is the same approach I mentioned in my patch
"[PATCH RESEND] gpmi-nand: Handle ECC Errors in erased pages".

But there were some comments about the performance hit with reading
erased pages.

Best Regards,

Markus

> 
> > +
> > +	/* Correct bitflips by 0xFF'ing this chunk. */
> > +	if (flip_bits)
> > +		memset(&data[base], 0xFF, geo->ecc_chunk_size);
> > +
> > +	return flip_bits;
> > +}
> > +
> 
> Best Regards,
> 
> Boris
> 
> -- 
> Boris Brezillon, Free Electrons
> Embedded Linux and Kernel engineering
> http://free-electrons.com
> 
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-04-15  7:50 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-15  8:35 [PATCH] mtd: gpmi: Deal with bitflips in erased regions Stefan Christ
2016-04-11 11:18 ` Stefan Christ
2016-04-11 11:54 ` Boris Brezillon
2016-04-15  7:49   ` Markus Pargmann
2016-04-11 13:08 ` Marc Kleine-Budde

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.