linux-kernel.vger.kernel.org archive mirror
From: Masahiro Yamada <yamada.masahiro@socionext.com>
To: linux-mtd@lists.infradead.org
Cc: laurent.monat@idquantique.com,
	thorsten.christiansson@idquantique.com,
	Enrico Jorns <ejo@pengutronix.de>,
	Jason Roberts <jason.e.roberts@intel.com>,
	Artem Bityutskiy <artem.bityutskiy@linux.intel.com>,
	Dinh Nguyen <dinguyen@kernel.org>,
	Boris Brezillon <boris.brezillon@free-electrons.com>,
	Marek Vasut <marek.vasut@gmail.com>,
	Brian Norris <computersforpeace@gmail.com>,
	Graham Moore <grmoore@opensource.altera.com>,
	David Woodhouse <dwmw2@infradead.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Chuanxiao Dong <chuanxiao.dong@intel.com>,
	Jassi Brar <jaswinder.singh@linaro.org>,
	Masahiro Yamada <yamada.masahiro@socionext.com>,
	linux-kernel@vger.kernel.org, Richard Weinberger <richard@nod.at>,
	Cyrille Pitchen <cyrille.pitchen@atmel.com>
Subject: [PATCH v2 11/53] mtd: nand: denali: fix bitflips calculation in handle_ecc()
Date: Wed, 22 Mar 2017 23:07:18 +0900
Message-ID: <1490191680-14481-12-git-send-email-yamada.masahiro@socionext.com>
In-Reply-To: <1490191680-14481-1-git-send-email-yamada.masahiro@socionext.com>

This function is wrong in multiple ways:

[1] Counting corrected bytes instead of corrected bits.

The following code counts the number of corrected _bytes_:

    /* correct the ECC error */
    buf[offset] ^= err_cor_value;
    mtd->ecc_stats.corrected++;
    bitflips++;

The core framework expects the number of corrected _bits_.  The two
counts differ when multiple bitflips occur within one byte.
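
To illustrate the difference, here is a minimal userspace sketch.  It
is not driver code: popcount8() is a portable stand-in for the kernel's
hweight8(), apply_correction() is a hypothetical helper, and it assumes
the correction value is a plain XOR mask whose set bits mark the
flipped bits.

```c
#include <stdint.h>

/* Count the set bits in one byte (stand-in for the kernel's hweight8()). */
static unsigned int popcount8(uint8_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= (uint8_t)(v - 1);	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Hypothetical helper: apply an XOR correction mask to one byte and
 * return the number of bits flipped.  Assumes each set bit in "mask"
 * marks one flipped bit.
 */
static unsigned int apply_correction(uint8_t *byte, uint8_t mask)
{
	*byte ^= mask;
	return popcount8(mask);
}
```

With a mask of 0x05, one byte is corrected but two bits are flipped, so
a per-byte counter reports 1 where a per-bit counter reports 2.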

[2] Counting the total number of errors instead of the max of
    per-sector errors.

The core framework expects corrected errors to be counted per ECC
sector, with the maximum over all sectors returned.  The current code
simply iterates over the whole page, i.e. it counts the total number
of corrections in the page.  This means "too many bitflips" is
triggered earlier than it should be, i.e. the NAND device is treated
as worn out sooner than it really is.
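
The per-sector accounting can be sketched in plain C as follows.  This
is only an illustration under stated assumptions: struct ecc_fix and
max_bitflips_per_sector() are hypothetical names, and the sketch
assumes that all corrections for one sector are reported contiguously.

```c
#include <stddef.h>

/* Hypothetical record: one corrected byte and the sector it belongs to. */
struct ecc_fix {
	unsigned int sector;
	unsigned int flipped_bits;
};

/*
 * Return the largest per-sector bitflip count and store the page-wide
 * total in *total.  Assumes entries for the same sector are adjacent.
 */
static unsigned int max_bitflips_per_sector(const struct ecc_fix *fixes,
					    size_t n, unsigned int *total)
{
	unsigned int bitflips = 0, max_bitflips = 0, prev_sector = 0;
	size_t i;

	*total = 0;
	for (i = 0; i < n; i++) {
		/* reset the running count when crossing into a new sector */
		if (fixes[i].sector != prev_sector)
			bitflips = 0;

		bitflips += fixes[i].flipped_bits;
		*total += fixes[i].flipped_bits;

		if (bitflips > max_bitflips)
			max_bitflips = bitflips;

		prev_sector = fixes[i].sector;
	}
	return max_bitflips;
}
```

For corrections of 2 and 1 bits in sector 0, 1 bit in sector 1 and
2 bits in sector 3, the page total is 6 but the per-sector max is 3.
Only the latter should be compared against the bitflip threshold.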

Besides those bugs, this function is hard to read due to the deep
nesting.  Notice that the whole body of this function is wrapped in
if (irq_status & INTR__ECC_ERR), so this conditional can be moved
out to the caller.  Also, use shorter names for local variables.

Re-work the function to fix all the issues.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
---

Changes in v2:
  - Use shorter names for local variables.
  - Fix the bugs described in [1] and [2]

 drivers/mtd/nand/denali.c | 157 ++++++++++++++++++++++++----------------------
 1 file changed, 82 insertions(+), 75 deletions(-)

diff --git a/drivers/mtd/nand/denali.c b/drivers/mtd/nand/denali.c
index 86381ac..608fe6f 100644
--- a/drivers/mtd/nand/denali.c
+++ b/drivers/mtd/nand/denali.c
@@ -888,80 +888,87 @@ static void read_oob_data(struct mtd_info *mtd, uint8_t *buf, int page)
 #define ECC_SECTOR(x)	(((x) & ECC_ERROR_ADDRESS__SECTOR_NR) >> 12)
 #define ECC_BYTE(x)	(((x) & ECC_ERROR_ADDRESS__OFFSET))
 #define ECC_CORRECTION_VALUE(x) ((x) & ERR_CORRECTION_INFO__BYTEMASK)
-#define ECC_ERROR_CORRECTABLE(x) (!((x) & ERR_CORRECTION_INFO__ERROR_TYPE))
+#define ECC_ERROR_UNCORRECTABLE(x) ((x) & ERR_CORRECTION_INFO__ERROR_TYPE)
 #define ECC_ERR_DEVICE(x)	(((x) & ERR_CORRECTION_INFO__DEVICE_NR) >> 8)
 #define ECC_LAST_ERR(x)		((x) & ERR_CORRECTION_INFO__LAST_ERR_INFO)
 
-static bool handle_ecc(struct denali_nand_info *denali, uint8_t *buf,
-		       uint32_t irq_status, unsigned int *max_bitflips)
+static int handle_ecc(struct mtd_info *mtd,
+		      struct denali_nand_info *denali, uint8_t *buf)
 {
-	bool check_erased_page = false;
 	unsigned int bitflips = 0;
+	unsigned int max_bitflips = 0;
+	unsigned int total_bitflips = 0;
+	uint32_t err_addr, err_cor_info;
+	unsigned int err_byte, err_sector, err_device;
+	uint8_t err_cor_value;
+	unsigned int prev_sector = 0;
+	int ret = 0;
+
+	/* read the ECC errors. we'll ignore them for now */
+	denali_set_intr_modes(denali, false);
 
-	if (irq_status & INTR__ECC_ERR) {
-		/* read the ECC errors. we'll ignore them for now */
-		uint32_t err_address, err_correction_info, err_byte,
-			 err_sector, err_device, err_correction_value;
-		denali_set_intr_modes(denali, false);
-
-		do {
-			err_address = ioread32(denali->flash_reg +
-						ECC_ERROR_ADDRESS);
-			err_sector = ECC_SECTOR(err_address);
-			err_byte = ECC_BYTE(err_address);
-
-			err_correction_info = ioread32(denali->flash_reg +
-						ERR_CORRECTION_INFO);
-			err_correction_value =
-				ECC_CORRECTION_VALUE(err_correction_info);
-			err_device = ECC_ERR_DEVICE(err_correction_info);
-
-			if (ECC_ERROR_CORRECTABLE(err_correction_info)) {
-				/*
-				 * If err_byte is larger than ECC_SECTOR_SIZE,
-				 * means error happened in OOB, so we ignore
-				 * it. It's no need for us to correct it
-				 * err_device is represented the NAND error
-				 * bits are happened in if there are more
-				 * than one NAND connected.
-				 */
-				if (err_byte < ECC_SECTOR_SIZE) {
-					struct mtd_info *mtd =
-						nand_to_mtd(&denali->nand);
-					int offset;
-
-					offset = (err_sector *
-							ECC_SECTOR_SIZE +
-							err_byte) *
-							denali->devnum +
-							err_device;
-					/* correct the ECC error */
-					buf[offset] ^= err_correction_value;
-					mtd->ecc_stats.corrected++;
-					bitflips++;
-				}
-			} else {
-				/*
-				 * if the error is not correctable, need to
-				 * look at the page to see if it is an erased
-				 * page. if so, then it's not a real ECC error
-				 */
-				check_erased_page = true;
-			}
-		} while (!ECC_LAST_ERR(err_correction_info));
-		/*
-		 * Once handle all ecc errors, controller will triger
-		 * a ECC_TRANSACTION_DONE interrupt, so here just wait
-		 * for a while for this interrupt
-		 */
-		while (!(read_interrupt_status(denali) &
-				INTR__ECC_TRANSACTION_DONE))
-			cpu_relax();
-		clear_interrupts(denali);
-		denali_set_intr_modes(denali, true);
-	}
-	*max_bitflips = bitflips;
-	return check_erased_page;
+	do {
+		err_addr = ioread32(denali->flash_reg + ECC_ERROR_ADDRESS);
+		err_sector = ECC_SECTOR(err_addr);
+		err_byte = ECC_BYTE(err_addr);
+
+		err_cor_info = ioread32(denali->flash_reg + ERR_CORRECTION_INFO);
+		err_cor_value = ECC_CORRECTION_VALUE(err_cor_info);
+		err_device = ECC_ERR_DEVICE(err_cor_info);
+
+		/* reset the bitflip counter when crossing ECC sector */
+		if (err_sector != prev_sector)
+			bitflips = 0;
+
+		if (ECC_ERROR_UNCORRECTABLE(err_cor_info)) {
+			/*
+			 * if the error is not correctable, need to look at the
+			 * page to see if it is an erased page. if so, then
+			 * it's not a real ECC error
+			 */
+			ret = -EBADMSG;
+		} else if (err_byte < ECC_SECTOR_SIZE) {
+			/*
+			 * If err_byte is ECC_SECTOR_SIZE or larger, the error
+			 * happened in OOB, so we ignore it; there is no need
+			 * to correct it.  err_device indicates which NAND
+			 * device the error occurred in when more than one
+			 * NAND device is connected.
+			 */
+			int offset;
+			unsigned int flips_in_byte;
+
+			offset = (err_sector * ECC_SECTOR_SIZE + err_byte) *
+						denali->devnum + err_device;
+
+			/* correct the ECC error */
+			flips_in_byte = hweight8(buf[offset] ^ err_cor_value);
+			bitflips += flips_in_byte;
+			total_bitflips += flips_in_byte;
+			buf[offset] ^= err_cor_value;
+
+			max_bitflips = max(max_bitflips, bitflips);
+		}
+
+		prev_sector = err_sector;
+	} while (!ECC_LAST_ERR(err_cor_info));
+
+	/*
+	 * Once all ECC errors have been handled, the controller triggers
+	 * an ECC_TRANSACTION_DONE interrupt, so just busy-wait here
+	 * until that interrupt arrives.
+	 */
+	while (!(read_interrupt_status(denali) & INTR__ECC_TRANSACTION_DONE))
+		cpu_relax();
+	clear_interrupts(denali);
+	denali_set_intr_modes(denali, true);
+
+	if (ret)
+		return ret;
+
+	mtd->ecc_stats.corrected += total_bitflips;
+
+	return max_bitflips;
 }
 
 /* programs the controller to either enable/disable DMA transfers */
@@ -1097,7 +1104,6 @@ static int denali_read_oob(struct mtd_info *mtd, struct nand_chip *chip,
 static int denali_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 			    uint8_t *buf, int oob_required, int page)
 {
-	unsigned int max_bitflips;
 	struct denali_nand_info *denali = mtd_to_denali(mtd);
 
 	dma_addr_t addr = denali->buf.dma_buf;
@@ -1105,8 +1111,7 @@ static int denali_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 
 	uint32_t irq_status;
 	uint32_t irq_mask = INTR__ECC_TRANSACTION_DONE | INTR__ECC_ERR;
-	bool check_erased_page = false;
-	int stat;
+	int stat = 0;
 
 	if (page != denali->page) {
 		dev_err(denali->dev,
@@ -1130,10 +1135,11 @@ static int denali_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 
 	memcpy(buf, denali->buf.buf, mtd->writesize);
 
-	check_erased_page = handle_ecc(denali, buf, irq_status, &max_bitflips);
+	if (irq_status & INTR__ECC_ERR)
+		stat = handle_ecc(mtd, denali, buf);
 	denali_enable_dma(denali, false);
 
-	if (check_erased_page) {
+	if (stat == -EBADMSG) {
 		read_oob_data(mtd, chip->oob_poi, denali->page);
 
 		stat = nand_check_erased_ecc_chunk(
@@ -1144,10 +1150,11 @@ static int denali_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 		if (stat < 0) {
 			mtd->ecc_stats.failed++;
 			/* return 0 for uncorrectable bitflips */
-			max_bitflips = 0;
+			stat = 0;
 		}
 	}
-	return max_bitflips;
+
+	return stat;
 }
 
 static int denali_read_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
-- 
2.7.4
