All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] MTD: Change meaning of -EUCLEAN return code on reads
@ 2012-03-11 21:21 Mike Dunn
  2012-03-11 21:21 ` [PATCH 1/4] MTD: add ecc_strength fields to mtd structs Mike Dunn
                   ` (4 more replies)
  0 siblings, 5 replies; 31+ messages in thread
From: Mike Dunn @ 2012-03-11 21:21 UTC (permalink / raw)
  To: linux-mtd; +Cc: Ivan Djelic, Mike Dunn

Hi,

This patchset addresses the problem of insufficient information being returned
by mtd_read() and mtd_read_oob() with regard to bit error corrections.
Currently -EUCLEAN is returned if one or more bit errors were corrected during
the course of the read operation.  Higher layers like UBI use this return code
as an indication that the erase block may be degrading and should be considered
as a candidate for being marked as a bad block.  The problem is that high bit
error rates are common on more recent NAND flash devices, which these devices
compensate for by using strong ecc algorithms.  Frequent (but entirely normal)
bit error corrections on these devices result in blocks being incorrectly marked
as bad.  On some devices, ubi/ubifs can not be reliably used because of this.

This problem was discussed a while back [1][2][3], and a consensus of sorts was
reached for a solution, which these patches implement.  The recent addition of
the mtd api entry functions now make this solution practical (thanks Artem!).  A
quick description of each patch will provide a synopsis of the patchset:

(1) The element 'ecc_strength' is added to struct mtd_info, which will store the
    maximum number of bit errors that can be corrected in one writesize region.

(2) Drivers set ecc_strength during initialization.

(3) The element 'euclean_threshold' is added to struct mtd_info.  If the driver
    leaves this uninitialized, mtd sets it to ecc_strength when the device or
    partition is registered.  This element is also exposed for reading and
    writing from userspace through sysfs.

(4) The drivers' read methods, absent an error, return a non-negative integer
    indicating the maximum number of bit errors that were corrected in any one
    writesize region.  MTD returns -EUCLEAN if this is >= euclean_threshold, 0
    otherwise.

So basically, the meaning of -EUCLEAN is changed from "one or more bit errors
were corrected over the entire read operation", to "a dangerously high number of
bit errors were corrected on one or more writesize regions".  By default,
"dangerously high" is interpreted as the maximum number of correctible bit
errors per writesize.  Drivers can specify a different value, and the user can
override it if more or less caution regarding data integrity is desired.

Patch #2 touches a lot of files, but they are small changes in most cases.  If
you can verify the correctness of the device's ecc strength, an ACK would be
much appreciated!

And of course general reviews greatly appreciated.  Thanks!

[1] http://lists.infradead.org/pipermail/linux-mtd/2011-December/038755.html
[2] http://lists.infradead.org/pipermail/linux-mtd/2011-December/038688.html
[3] http://lists.infradead.org/pipermail/linux-mtd/2011-December/038828.html

Mike Dunn (4):
  MTD: add ecc_strength fields to mtd structs
  MTD: flash drivers set ecc strength
  MTD: euclean_threshold added to mtd_info and sysfs
  MTD: drivers return max_bitflips, mtd returns -EUCLEAN

 Documentation/ABI/testing/sysfs-class-mtd |   23 ++++++++++
 drivers/mtd/devices/doc2000.c             |    1 +
 drivers/mtd/devices/doc2001.c             |    1 +
 drivers/mtd/devices/doc2001plus.c         |    1 +
 drivers/mtd/devices/docg3.c               |    5 ++-
 drivers/mtd/mtdcore.c                     |   68 ++++++++++++++++++++++++++++-
 drivers/mtd/mtdpart.c                     |   26 ++++++-----
 drivers/mtd/nand/alauda.c                 |    5 +-
 drivers/mtd/nand/atmel_nand.c             |    1 +
 drivers/mtd/nand/bcm_umi_nand.c           |    8 +++
 drivers/mtd/nand/bf5xx_nand.c             |    2 +
 drivers/mtd/nand/cafe_nand.c              |    1 +
 drivers/mtd/nand/cs553x_nand.c            |    2 +
 drivers/mtd/nand/davinci_nand.c           |    1 +
 drivers/mtd/nand/denali.c                 |    3 +
 drivers/mtd/nand/diskonchip.c             |    1 +
 drivers/mtd/nand/docg4.c                  |    1 +
 drivers/mtd/nand/fsl_elbc_nand.c          |    6 +++
 drivers/mtd/nand/fsmc_nand.c              |    2 +
 drivers/mtd/nand/jz4740_nand.c            |    5 ++
 drivers/mtd/nand/mxc_nand.c               |    7 +++
 drivers/mtd/nand/nand_base.c              |   21 ++++++++-
 drivers/mtd/nand/ndfc.c                   |    1 +
 drivers/mtd/nand/omap2.c                  |    1 +
 drivers/mtd/nand/pxa3xx_nand.c            |    1 +
 drivers/mtd/nand/r852.c                   |    1 +
 drivers/mtd/nand/rtc_from4.c              |    1 +
 drivers/mtd/nand/s3c2410.c                |    1 +
 drivers/mtd/nand/sh_flctl.c               |    1 +
 drivers/mtd/nand/sharpsl.c                |    1 +
 drivers/mtd/nand/tmio_nand.c              |    1 +
 drivers/mtd/nand/txx9ndfmc.c              |    1 +
 drivers/mtd/onenand/onenand_base.c        |    7 ++-
 include/linux/mtd/mtd.h                   |   20 +++++----
 include/linux/mtd/nand.h                  |    2 +
 35 files changed, 200 insertions(+), 30 deletions(-)

-- 
1.7.3.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 1/4] MTD: add ecc_strength fields to mtd structs
  2012-03-11 21:21 [PATCH 0/4] MTD: Change meaning of -EUCLEAN return code on reads Mike Dunn
@ 2012-03-11 21:21 ` Mike Dunn
  2012-03-13 12:25   ` Artem Bityutskiy
  2012-03-11 21:21 ` [PATCH 2/4] MTD: flash drivers set ecc strength Mike Dunn
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 31+ messages in thread
From: Mike Dunn @ 2012-03-11 21:21 UTC (permalink / raw)
  To: linux-mtd; +Cc: Mike Dunn

This adds 'ecc_strength' to struct mtd_info.  This stores the maximum number of
bit errors that can be corrected in one writesize region.

For consistency with the nand code, 'strength' is similiarly added to struct
nand_ecc_ctrl.  This stores the maximum number of bit errors that can be
corrected in one ecc step.

Signed-off-by: Mike Dunn <mikedunn@newsguy.com>
---
 include/linux/mtd/mtd.h  |    3 +++
 include/linux/mtd/nand.h |    2 ++
 2 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/include/linux/mtd/mtd.h b/include/linux/mtd/mtd.h
index 726c2d1..cf5ea8c 100644
--- a/include/linux/mtd/mtd.h
+++ b/include/linux/mtd/mtd.h
@@ -164,6 +164,9 @@ struct mtd_info {
 	/* ECC layout structure pointer - read only! */
 	struct nand_ecclayout *ecclayout;
 
+	/* max number of correctible bit errors per writesize */
+	unsigned int ecc_strength;
+
 	/* Data for variable erase regions. If numeraseregions is zero,
 	 * it means that the whole device has erasesize as given above.
 	 */
diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
index 609868f..1482340 100644
--- a/include/linux/mtd/nand.h
+++ b/include/linux/mtd/nand.h
@@ -324,6 +324,7 @@ struct nand_hw_control {
  * @steps:	number of ECC steps per page
  * @size:	data bytes per ECC step
  * @bytes:	ECC bytes per step
+ * @strength:	max number of correctible bits per ECC step
  * @total:	total number of ECC bytes per page
  * @prepad:	padding information for syndrome based ECC generators
  * @postpad:	padding information for syndrome based ECC generators
@@ -351,6 +352,7 @@ struct nand_ecc_ctrl {
 	int size;
 	int bytes;
 	int total;
+	int strength;
 	int prepad;
 	int postpad;
 	struct nand_ecclayout	*layout;
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 2/4] MTD: flash drivers set ecc strength
  2012-03-11 21:21 [PATCH 0/4] MTD: Change meaning of -EUCLEAN return code on reads Mike Dunn
  2012-03-11 21:21 ` [PATCH 1/4] MTD: add ecc_strength fields to mtd structs Mike Dunn
@ 2012-03-11 21:21 ` Mike Dunn
  2012-03-13 12:27   ` Artem Bityutskiy
                     ` (2 more replies)
  2012-03-11 21:21 ` [PATCH 3/4] MTD: euclean_threshold added to mtd_info and sysfs Mike Dunn
                   ` (2 subsequent siblings)
  4 siblings, 3 replies; 31+ messages in thread
From: Mike Dunn @ 2012-03-11 21:21 UTC (permalink / raw)
  To: linux-mtd; +Cc: Ivan Djelic, Mike Dunn

Flash device drivers initialize 'ecc_strength' in struct mtd_info, which is the
maximum number of bit errors that can be corrected in one writesize region.

Drivers using the nand interface intitialize 'strength' in struct nand_ecc_ctrl,
which is the maximum number of bit errors that can be corrected in one ecc step.
Nand infrastructure code translates this to 'ecc_strength'.

Also for nand drivers, the nand infrastructure code sets ecc.strength for ecc
modes NAND_ECC_SOFT, NAND_ECC_SOFT_BCH, and NAND_ECC_NONE.  It is set in the
driver for all other modes.

Signed-off-by: Mike Dunn <mikedunn@newsguy.com>
---

With a couple exceptions, I'm fairly confident that these values are correct,
but if you can confirm, an ACK would be greatly appreciated!  Note that an
incorrect value will not cause a regression unless the value is overstated.

The exceptions are:

  bcm_umi_nand.c: implements bch in hw; Galois field order not indicated
  jz4740_nand.c: implements r-s in hw; generator polynomial order not indicated

These are given conservative guesses based on the information gleaned from the
code.

 drivers/mtd/devices/doc2000.c      |    1 +
 drivers/mtd/devices/doc2001.c      |    1 +
 drivers/mtd/devices/doc2001plus.c  |    1 +
 drivers/mtd/devices/docg3.c        |    1 +
 drivers/mtd/mtdpart.c              |    1 +
 drivers/mtd/nand/alauda.c          |    1 +
 drivers/mtd/nand/atmel_nand.c      |    1 +
 drivers/mtd/nand/bcm_umi_nand.c    |    8 ++++++++
 drivers/mtd/nand/bf5xx_nand.c      |    2 ++
 drivers/mtd/nand/cafe_nand.c       |    1 +
 drivers/mtd/nand/cs553x_nand.c     |    2 ++
 drivers/mtd/nand/davinci_nand.c    |    1 +
 drivers/mtd/nand/denali.c          |    3 +++
 drivers/mtd/nand/diskonchip.c      |    1 +
 drivers/mtd/nand/docg4.c           |    1 +
 drivers/mtd/nand/fsl_elbc_nand.c   |    6 ++++++
 drivers/mtd/nand/fsmc_nand.c       |    2 ++
 drivers/mtd/nand/jz4740_nand.c     |    5 +++++
 drivers/mtd/nand/mxc_nand.c        |    7 +++++++
 drivers/mtd/nand/nand_base.c       |    7 ++++++-
 drivers/mtd/nand/ndfc.c            |    1 +
 drivers/mtd/nand/omap2.c           |    1 +
 drivers/mtd/nand/pxa3xx_nand.c     |    1 +
 drivers/mtd/nand/r852.c            |    1 +
 drivers/mtd/nand/rtc_from4.c       |    1 +
 drivers/mtd/nand/s3c2410.c         |    1 +
 drivers/mtd/nand/sh_flctl.c        |    1 +
 drivers/mtd/nand/sharpsl.c         |    1 +
 drivers/mtd/nand/tmio_nand.c       |    1 +
 drivers/mtd/nand/txx9ndfmc.c       |    1 +
 drivers/mtd/onenand/onenand_base.c |    1 +
 31 files changed, 63 insertions(+), 1 deletions(-)

diff --git a/drivers/mtd/devices/doc2000.c b/drivers/mtd/devices/doc2000.c
index 7ad7b05..a4eb8b5 100644
--- a/drivers/mtd/devices/doc2000.c
+++ b/drivers/mtd/devices/doc2000.c
@@ -564,6 +564,7 @@ void DoC2k_init(struct mtd_info *mtd)
 	mtd->flags = MTD_CAP_NANDFLASH;
 	mtd->writebufsize = mtd->writesize = 512;
 	mtd->oobsize = 16;
+	mtd->ecc_strength = 2;
 	mtd->owner = THIS_MODULE;
 	mtd->_erase = doc_erase;
 	mtd->_read = doc_read;
diff --git a/drivers/mtd/devices/doc2001.c b/drivers/mtd/devices/doc2001.c
index 7bff54e..f692795 100644
--- a/drivers/mtd/devices/doc2001.c
+++ b/drivers/mtd/devices/doc2001.c
@@ -348,6 +348,7 @@ void DoCMil_init(struct mtd_info *mtd)
 	mtd->erasesize = 0x2000;
 	mtd->writebufsize = mtd->writesize = 512;
 	mtd->oobsize = 16;
+	mtd->ecc_strength = 2;
 	mtd->owner = THIS_MODULE;
 	mtd->_erase = doc_erase;
 	mtd->_read = doc_read;
diff --git a/drivers/mtd/devices/doc2001plus.c b/drivers/mtd/devices/doc2001plus.c
index 4a03d86..04eb2e4 100644
--- a/drivers/mtd/devices/doc2001plus.c
+++ b/drivers/mtd/devices/doc2001plus.c
@@ -469,6 +469,7 @@ void DoCMilPlus_init(struct mtd_info *mtd)
 	mtd->flags = MTD_CAP_NANDFLASH;
 	mtd->writebufsize = mtd->writesize = 512;
 	mtd->oobsize = 16;
+	mtd->ecc_strength = 2;
 	mtd->owner = THIS_MODULE;
 	mtd->_erase = doc_erase;
 	mtd->_read = doc_read;
diff --git a/drivers/mtd/devices/docg3.c b/drivers/mtd/devices/docg3.c
index 3eafef3..886ca0f 100644
--- a/drivers/mtd/devices/docg3.c
+++ b/drivers/mtd/devices/docg3.c
@@ -1827,6 +1827,7 @@ static void __init doc_set_driver_info(int chip_id, struct mtd_info *mtd)
 	mtd->_write_oob = doc_write_oob;
 	mtd->_block_isbad = doc_block_isbad;
 	mtd->ecclayout = &docg3_oobinfo;
+	mtd->ecc_strength = DOC_ECC_BCH_T;
 }
 
 /**
diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
index 226d28a..9651c06 100644
--- a/drivers/mtd/mtdpart.c
+++ b/drivers/mtd/mtdpart.c
@@ -516,6 +516,7 @@ static struct mtd_part *allocate_partition(struct mtd_info *master,
 	}
 
 	slave->mtd.ecclayout = master->ecclayout;
+	slave->mtd.ecc_strength = master->ecc_strength;
 	if (master->_block_isbad) {
 		uint64_t offs = 0;
 
diff --git a/drivers/mtd/nand/alauda.c b/drivers/mtd/nand/alauda.c
index ac38f73..4f20e1d 100644
--- a/drivers/mtd/nand/alauda.c
+++ b/drivers/mtd/nand/alauda.c
@@ -591,6 +591,7 @@ static int alauda_init_media(struct alauda *al)
 	mtd->_block_isbad = alauda_isbad;
 	mtd->priv = al;
 	mtd->owner = THIS_MODULE;
+	mtd->ecc_strength = 1;
 
 	err = mtd_device_register(mtd, NULL, 0);
 	if (err) {
diff --git a/drivers/mtd/nand/atmel_nand.c b/drivers/mtd/nand/atmel_nand.c
index 7769519..662abf0 100644
--- a/drivers/mtd/nand/atmel_nand.c
+++ b/drivers/mtd/nand/atmel_nand.c
@@ -554,6 +554,7 @@ static int __init atmel_nand_probe(struct platform_device *pdev)
 		nand_chip->ecc.hwctl = atmel_nand_hwctl;
 		nand_chip->ecc.read_page = atmel_nand_read_page;
 		nand_chip->ecc.bytes = 4;
+		nand_chip->ecc.strength = 1;
 	}
 
 	nand_chip->chip_delay = 20;		/* 20us command delay time */
diff --git a/drivers/mtd/nand/bcm_umi_nand.c b/drivers/mtd/nand/bcm_umi_nand.c
index ee81b63..fc60043 100644
--- a/drivers/mtd/nand/bcm_umi_nand.c
+++ b/drivers/mtd/nand/bcm_umi_nand.c
@@ -476,6 +476,14 @@ static int __devinit bcm_umi_nand_probe(struct platform_device *pdev)
 			largepage_bbt.options = NAND_BBT_SCAN2NDPAGE;
 		this->badblock_pattern = &largepage_bbt;
 	}
+
+	/*
+	 * FIXME: ecc strength value of 6 bits per 512 bytes of data is a
+	 * conservative guess, given 13 ecc bytes and using bch alg.
+	 * (Assume Galois field order m=15 to allow a margin of error.)
+	 */
+	this->ecc.strength = 6;
+
 #endif
 
 	/* Now finish off the scan, now that ecc.layout has been initialized. */
diff --git a/drivers/mtd/nand/bf5xx_nand.c b/drivers/mtd/nand/bf5xx_nand.c
index dd899cb..d7b86b9 100644
--- a/drivers/mtd/nand/bf5xx_nand.c
+++ b/drivers/mtd/nand/bf5xx_nand.c
@@ -702,9 +702,11 @@ static int bf5xx_nand_scan(struct mtd_info *mtd)
 		if (likely(mtd->writesize >= 512)) {
 			chip->ecc.size = 512;
 			chip->ecc.bytes = 6;
+			chip->ecc.strength = 2;
 		} else {
 			chip->ecc.size = 256;
 			chip->ecc.bytes = 3;
+			chip->ecc.strength = 1;
 			bfin_write_NFC_CTL(bfin_read_NFC_CTL() & ~(1 << NFC_PG_SIZE_OFFSET));
 			SSYNC();
 		}
diff --git a/drivers/mtd/nand/cafe_nand.c b/drivers/mtd/nand/cafe_nand.c
index c23c07c..2a96e1a 100644
--- a/drivers/mtd/nand/cafe_nand.c
+++ b/drivers/mtd/nand/cafe_nand.c
@@ -783,6 +783,7 @@ static int __devinit cafe_nand_probe(struct pci_dev *pdev,
 	cafe->nand.ecc.mode = NAND_ECC_HW_SYNDROME;
 	cafe->nand.ecc.size = mtd->writesize;
 	cafe->nand.ecc.bytes = 14;
+	cafe->nand.ecc.strength = 4;
 	cafe->nand.ecc.hwctl  = (void *)cafe_nand_bug;
 	cafe->nand.ecc.calculate = (void *)cafe_nand_bug;
 	cafe->nand.ecc.correct  = (void *)cafe_nand_bug;
diff --git a/drivers/mtd/nand/cs553x_nand.c b/drivers/mtd/nand/cs553x_nand.c
index e2b7c9e..821c34c 100644
--- a/drivers/mtd/nand/cs553x_nand.c
+++ b/drivers/mtd/nand/cs553x_nand.c
@@ -248,6 +248,8 @@ static int __init cs553x_init_one(int cs, int mmio, unsigned long adr)
 		goto out_ior;
 	}
 
+	this->ecc.strength = 1;
+
 	new_mtd->name = kasprintf(GFP_KERNEL, "cs553x_nand_cs%d", cs);
 
 	cs553x_mtd[cs] = new_mtd;
diff --git a/drivers/mtd/nand/davinci_nand.c b/drivers/mtd/nand/davinci_nand.c
index b81afc7..d94b03c 100644
--- a/drivers/mtd/nand/davinci_nand.c
+++ b/drivers/mtd/nand/davinci_nand.c
@@ -641,6 +641,7 @@ static int __init nand_davinci_probe(struct platform_device *pdev)
 			info->chip.ecc.bytes = 3;
 		}
 		info->chip.ecc.size = 512;
+		info->chip.ecc.strength = pdata->ecc_bits;
 		break;
 	default:
 		ret = -EINVAL;
diff --git a/drivers/mtd/nand/denali.c b/drivers/mtd/nand/denali.c
index 3984d48..a9e57d6 100644
--- a/drivers/mtd/nand/denali.c
+++ b/drivers/mtd/nand/denali.c
@@ -1590,6 +1590,7 @@ static int denali_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 			ECC_15BITS * (denali->mtd.writesize /
 			ECC_SECTOR_SIZE)))) {
 		/* if MLC OOB size is large enough, use 15bit ECC*/
+		denali->nand.ecc.strength = 15;
 		denali->nand.ecc.layout = &nand_15bit_oob;
 		denali->nand.ecc.bytes = ECC_15BITS;
 		iowrite32(15, denali->flash_reg + ECC_CORRECTION);
@@ -1600,12 +1601,14 @@ static int denali_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 				" contain 8bit ECC correction codes");
 		goto failed_req_irq;
 	} else {
+		denali->nand.ecc.strength = 8;
 		denali->nand.ecc.layout = &nand_8bit_oob;
 		denali->nand.ecc.bytes = ECC_8BITS;
 		iowrite32(8, denali->flash_reg + ECC_CORRECTION);
 	}
 
 	denali->nand.ecc.bytes *= denali->devnum;
+	denali->nand.ecc.strength *= denali->devnum;
 	denali->nand.ecc.layout->eccbytes *=
 		denali->mtd.writesize / ECC_SECTOR_SIZE;
 	denali->nand.ecc.layout->oobfree[0].offset =
diff --git a/drivers/mtd/nand/diskonchip.c b/drivers/mtd/nand/diskonchip.c
index df921e7..e2ca067 100644
--- a/drivers/mtd/nand/diskonchip.c
+++ b/drivers/mtd/nand/diskonchip.c
@@ -1653,6 +1653,7 @@ static int __init doc_probe(unsigned long physadr)
 	nand->ecc.mode		= NAND_ECC_HW_SYNDROME;
 	nand->ecc.size		= 512;
 	nand->ecc.bytes		= 6;
+	nand->ecc.strength	= 2;
 	nand->bbt_options	= NAND_BBT_USE_FLASH;
 
 	doc->physadr		= physadr;
diff --git a/drivers/mtd/nand/docg4.c b/drivers/mtd/nand/docg4.c
index 8a0d7f6..a970574 100644
--- a/drivers/mtd/nand/docg4.c
+++ b/drivers/mtd/nand/docg4.c
@@ -1193,6 +1193,7 @@ static void __init init_mtd_structs(struct mtd_info *mtd)
 	nand->ecc.size = DOCG4_PAGE_SIZE;
 	nand->ecc.prepad = 8;
 	nand->ecc.bytes	= 8;
+	nand->ecc.strength = DOCG4_T;
 	nand->options =
 		NAND_BUSWIDTH_16 | NAND_NO_SUBPAGE_WRITE | NAND_NO_AUTOINCR;
 	nand->IO_ADDR_R = nand->IO_ADDR_W = doc->virtadr + DOC_IOSPACE_DATA;
diff --git a/drivers/mtd/nand/fsl_elbc_nand.c b/drivers/mtd/nand/fsl_elbc_nand.c
index 7195ee6..80b5264 100644
--- a/drivers/mtd/nand/fsl_elbc_nand.c
+++ b/drivers/mtd/nand/fsl_elbc_nand.c
@@ -813,6 +813,12 @@ static int fsl_elbc_chip_init(struct fsl_elbc_mtd *priv)
 				&fsl_elbc_oob_sp_eccm1 : &fsl_elbc_oob_sp_eccm0;
 		chip->ecc.size = 512;
 		chip->ecc.bytes = 3;
+		chip->ecc.strength = 1;
+		/*
+		 * FIXME: can hardware ecc correct 4 bitflips if page size is
+		 * 2k?  Then does hardware report number of corrections for this
+		 * case?  If so, ecc_stats reporting needs to be fixed as well.
+		 */
 	} else {
 		/* otherwise fall back to default software ECC */
 		chip->ecc.mode = NAND_ECC_SOFT;
diff --git a/drivers/mtd/nand/fsmc_nand.c b/drivers/mtd/nand/fsmc_nand.c
index 341086c..588e373 100644
--- a/drivers/mtd/nand/fsmc_nand.c
+++ b/drivers/mtd/nand/fsmc_nand.c
@@ -863,10 +863,12 @@ static int __init fsmc_nand_probe(struct platform_device *pdev)
 		nand->ecc.calculate = fsmc_read_hwecc_ecc4;
 		nand->ecc.correct = fsmc_bch8_correct_data;
 		nand->ecc.bytes = 13;
+		nand->ecc.strength = 8;
 	} else {
 		nand->ecc.calculate = fsmc_read_hwecc_ecc1;
 		nand->ecc.correct = nand_correct_data;
 		nand->ecc.bytes = 3;
+		nand->ecc.strength = 1;
 	}
 
 	/*
diff --git a/drivers/mtd/nand/jz4740_nand.c b/drivers/mtd/nand/jz4740_nand.c
index cc50e35..e4147e8 100644
--- a/drivers/mtd/nand/jz4740_nand.c
+++ b/drivers/mtd/nand/jz4740_nand.c
@@ -332,6 +332,11 @@ static int __devinit jz_nand_probe(struct platform_device *pdev)
 	chip->ecc.mode		= NAND_ECC_HW_OOB_FIRST;
 	chip->ecc.size		= 512;
 	chip->ecc.bytes		= 9;
+	chip->ecc.strength	= 2;
+	/*
+	 * FIXME: ecc_strength value of 2 bits per 512 bytes of data is a
+	 * conservative guess, given 9 ecc bytes and reed-solomon alg.
+	 */
 
 	if (pdata)
 		chip->ecc.layout = pdata->ecc_layout;
diff --git a/drivers/mtd/nand/mxc_nand.c b/drivers/mtd/nand/mxc_nand.c
index 3c4c053..cc0678a 100644
--- a/drivers/mtd/nand/mxc_nand.c
+++ b/drivers/mtd/nand/mxc_nand.c
@@ -1225,6 +1225,13 @@ static int __init mxcnd_probe(struct platform_device *pdev)
 		goto escan;
 	}
 
+	if (this->ecc.mode == NAND_ECC_HW) {
+		if (nfc_is_v1())
+			this->ecc.strength = 1;
+		else
+			this->ecc.strength = (host->eccsize == 4) ? 4 : 8;
+	}
+
 	/* Register the partitions */
 	mtd_device_parse_register(mtd, part_probes, NULL, pdata->parts,
 				  pdata->nr_parts);
diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index 1e907dc..8008853 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -3350,6 +3350,7 @@ int nand_scan_tail(struct mtd_info *mtd)
 		if (!chip->ecc.size)
 			chip->ecc.size = 256;
 		chip->ecc.bytes = 3;
+		chip->ecc.strength = 1;
 		break;
 
 	case NAND_ECC_SOFT_BCH:
@@ -3384,6 +3385,8 @@ int nand_scan_tail(struct mtd_info *mtd)
 			pr_warn("BCH ECC initialization failed!\n");
 			BUG();
 		}
+		chip->ecc.strength =
+			chip->ecc.bytes*8 / fls(8*chip->ecc.size);
 		break;
 
 	case NAND_ECC_NONE:
@@ -3397,6 +3400,7 @@ int nand_scan_tail(struct mtd_info *mtd)
 		chip->ecc.write_oob = nand_write_oob_std;
 		chip->ecc.size = mtd->writesize;
 		chip->ecc.bytes = 0;
+		chip->ecc.strength = 0;
 		break;
 
 	default:
@@ -3478,8 +3482,9 @@ int nand_scan_tail(struct mtd_info *mtd)
 	mtd->_block_markbad = nand_block_markbad;
 	mtd->writebufsize = mtd->writesize;
 
-	/* propagate ecc.layout to mtd_info */
+	/* propagate ecc info to mtd_info */
 	mtd->ecclayout = chip->ecc.layout;
+	mtd->ecc_strength = chip->ecc.strength * chip->ecc.steps;
 
 	/* Check, if we should skip the bad block table scan */
 	if (chip->options & NAND_SKIP_BBTSCAN)
diff --git a/drivers/mtd/nand/ndfc.c b/drivers/mtd/nand/ndfc.c
index ec68854..2b6f632 100644
--- a/drivers/mtd/nand/ndfc.c
+++ b/drivers/mtd/nand/ndfc.c
@@ -179,6 +179,7 @@ static int ndfc_chip_init(struct ndfc_controller *ndfc,
 	chip->ecc.mode = NAND_ECC_HW;
 	chip->ecc.size = 256;
 	chip->ecc.bytes = 3;
+	chip->ecc.strength = 1;
 	chip->priv = ndfc;
 
 	ndfc->mtd.priv = chip;
diff --git a/drivers/mtd/nand/omap2.c b/drivers/mtd/nand/omap2.c
index d2e7a7d..c2b0bba 100644
--- a/drivers/mtd/nand/omap2.c
+++ b/drivers/mtd/nand/omap2.c
@@ -1058,6 +1058,7 @@ static int __devinit omap_nand_probe(struct platform_device *pdev)
 		(pdata->ecc_opt == OMAP_ECC_HAMMING_CODE_HW_ROMCODE)) {
 		info->nand.ecc.bytes            = 3;
 		info->nand.ecc.size             = 512;
+		info->nand.ecc.strength         = 1;
 		info->nand.ecc.calculate        = omap_calculate_ecc;
 		info->nand.ecc.hwctl            = omap_enable_hwecc;
 		info->nand.ecc.correct          = omap_correct_data;
diff --git a/drivers/mtd/nand/pxa3xx_nand.c b/drivers/mtd/nand/pxa3xx_nand.c
index d3bdc90..def50ca 100644
--- a/drivers/mtd/nand/pxa3xx_nand.c
+++ b/drivers/mtd/nand/pxa3xx_nand.c
@@ -1002,6 +1002,7 @@ static int pxa3xx_nand_scan(struct mtd_info *mtd)
 KEEP_CONFIG:
 	chip->ecc.mode = NAND_ECC_HW;
 	chip->ecc.size = host->page_size;
+	chip->ecc.strength = 1;
 
 	chip->options = NAND_NO_AUTOINCR;
 	chip->options |= NAND_NO_READRDY;
diff --git a/drivers/mtd/nand/r852.c b/drivers/mtd/nand/r852.c
index 769a4e0..c204018 100644
--- a/drivers/mtd/nand/r852.c
+++ b/drivers/mtd/nand/r852.c
@@ -891,6 +891,7 @@ int  r852_probe(struct pci_dev *pci_dev, const struct pci_device_id *id)
 	chip->ecc.mode = NAND_ECC_HW_SYNDROME;
 	chip->ecc.size = R852_DMA_LEN;
 	chip->ecc.bytes = SM_OOB_SIZE;
+	chip->ecc.strength = 2;
 	chip->ecc.hwctl = r852_ecc_hwctl;
 	chip->ecc.calculate = r852_ecc_calculate;
 	chip->ecc.correct = r852_ecc_correct;
diff --git a/drivers/mtd/nand/rtc_from4.c b/drivers/mtd/nand/rtc_from4.c
index f309add..e55b5cf 100644
--- a/drivers/mtd/nand/rtc_from4.c
+++ b/drivers/mtd/nand/rtc_from4.c
@@ -527,6 +527,7 @@ static int __init rtc_from4_init(void)
 	this->ecc.mode = NAND_ECC_HW_SYNDROME;
 	this->ecc.size = 512;
 	this->ecc.bytes = 8;
+	this->ecc.strength = 3;
 	/* return the status of extra status and ECC checks */
 	this->errstat = rtc_from4_errstat;
 	/* set the nand_oobinfo to support FPGA H/W error detection */
diff --git a/drivers/mtd/nand/s3c2410.c b/drivers/mtd/nand/s3c2410.c
index 97623be..91121f3 100644
--- a/drivers/mtd/nand/s3c2410.c
+++ b/drivers/mtd/nand/s3c2410.c
@@ -823,6 +823,7 @@ static void s3c2410_nand_init_chip(struct s3c2410_nand_info *info,
 		chip->ecc.calculate = s3c2410_nand_calculate_ecc;
 		chip->ecc.correct   = s3c2410_nand_correct_data;
 		chip->ecc.mode	    = NAND_ECC_HW;
+		chip->ecc.strength  = 1;
 
 		switch (info->cpu_type) {
 		case TYPE_S3C2410:
diff --git a/drivers/mtd/nand/sh_flctl.c b/drivers/mtd/nand/sh_flctl.c
index d0f1f3c..47154f5 100644
--- a/drivers/mtd/nand/sh_flctl.c
+++ b/drivers/mtd/nand/sh_flctl.c
@@ -798,6 +798,7 @@ static int flctl_chip_init_tail(struct mtd_info *mtd)
 
 		chip->ecc.size = 512;
 		chip->ecc.bytes = 10;
+		chip->ecc.strength = 4;
 		chip->ecc.read_page = flctl_read_page_hwecc;
 		chip->ecc.write_page = flctl_write_page_hwecc;
 		chip->ecc.mode = NAND_ECC_HW;
diff --git a/drivers/mtd/nand/sharpsl.c b/drivers/mtd/nand/sharpsl.c
index 2d269a5..3421e37 100644
--- a/drivers/mtd/nand/sharpsl.c
+++ b/drivers/mtd/nand/sharpsl.c
@@ -167,6 +167,7 @@ static int __devinit sharpsl_nand_probe(struct platform_device *pdev)
 	this->ecc.mode = NAND_ECC_HW;
 	this->ecc.size = 256;
 	this->ecc.bytes = 3;
+	this->ecc.strength = 1;
 	this->badblock_pattern = data->badblock_pattern;
 	this->ecc.layout = data->ecc_layout;
 	this->ecc.hwctl = sharpsl_nand_enable_hwecc;
diff --git a/drivers/mtd/nand/tmio_nand.c b/drivers/mtd/nand/tmio_nand.c
index 060848a..5aa5180 100644
--- a/drivers/mtd/nand/tmio_nand.c
+++ b/drivers/mtd/nand/tmio_nand.c
@@ -430,6 +430,7 @@ static int tmio_probe(struct platform_device *dev)
 	nand_chip->ecc.mode = NAND_ECC_HW;
 	nand_chip->ecc.size = 512;
 	nand_chip->ecc.bytes = 6;
+	nand_chip->ecc.strength = 2;
 	nand_chip->ecc.hwctl = tmio_nand_enable_hwecc;
 	nand_chip->ecc.calculate = tmio_nand_calculate_ecc;
 	nand_chip->ecc.correct = tmio_nand_correct_data;
diff --git a/drivers/mtd/nand/txx9ndfmc.c b/drivers/mtd/nand/txx9ndfmc.c
index 8db0acb..26398dc 100644
--- a/drivers/mtd/nand/txx9ndfmc.c
+++ b/drivers/mtd/nand/txx9ndfmc.c
@@ -356,6 +356,7 @@ static int __init txx9ndfmc_probe(struct platform_device *dev)
 		/* txx9ndfmc_nand_scan will overwrite ecc.size and ecc.bytes */
 		chip->ecc.size = 256;
 		chip->ecc.bytes = 3;
+		chip->ecc.strength = 1;
 		chip->chip_delay = 100;
 		chip->controller = &drvdata->hw_control;
 
diff --git a/drivers/mtd/onenand/onenand_base.c b/drivers/mtd/onenand/onenand_base.c
index a1592cf..3d781b8 100644
--- a/drivers/mtd/onenand/onenand_base.c
+++ b/drivers/mtd/onenand/onenand_base.c
@@ -4080,6 +4080,7 @@ int onenand_scan(struct mtd_info *mtd, int maxchips)
 	mtd->oobavail = this->ecclayout->oobavail;
 
 	mtd->ecclayout = this->ecclayout;
+	mtd->ecc_strength = 1;
 
 	/* Fill in remaining MTD driver data */
 	mtd->type = ONENAND_IS_MLC(this) ? MTD_MLCNANDFLASH : MTD_NANDFLASH;
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 3/4] MTD: euclean_threshold added to mtd_info and sysfs
  2012-03-11 21:21 [PATCH 0/4] MTD: Change meaning of -EUCLEAN return code on reads Mike Dunn
  2012-03-11 21:21 ` [PATCH 1/4] MTD: add ecc_strength fields to mtd structs Mike Dunn
  2012-03-11 21:21 ` [PATCH 2/4] MTD: flash drivers set ecc strength Mike Dunn
@ 2012-03-11 21:21 ` Mike Dunn
  2012-03-13 12:29   ` Artem Bityutskiy
  2012-03-11 21:21 ` [PATCH 4/4] MTD: drivers return max_bitflips, mtd returns -EUCLEAN Mike Dunn
  2012-03-13 12:03 ` [PATCH 0/4] MTD: Change meaning of -EUCLEAN return code on reads Artem Bityutskiy
  4 siblings, 1 reply; 31+ messages in thread
From: Mike Dunn @ 2012-03-11 21:21 UTC (permalink / raw)
  To: linux-mtd; +Cc: Mike Dunn

The element 'euclean_threshold' is added to struct mtd_info.  If the driver
leaves this uninitialized, mtd sets it to ecc_strength when the device or
partition is registered.  This element is also exposed for reading and writing
from userspace through sysfs.

Signed-off-by: Mike Dunn <mikedunn@newsguy.com>
---

An afterthought... ecc_strength should probably also be exposed through syfs as
a r/o value.  Ioctls for querying both these values could also be added to
mtdchar.

 Documentation/ABI/testing/sysfs-class-mtd |   23 ++++++++++++++++++++
 drivers/mtd/mtdcore.c                     |   33 +++++++++++++++++++++++++++++
 include/linux/mtd/mtd.h                   |    7 ++++++
 3 files changed, 63 insertions(+), 0 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-class-mtd b/Documentation/ABI/testing/sysfs-class-mtd
index 4d55a18..896a671 100644
--- a/Documentation/ABI/testing/sysfs-class-mtd
+++ b/Documentation/ABI/testing/sysfs-class-mtd
@@ -123,3 +123,26 @@ Description:
 		half page, or a quarter page).
 
 		In the case of ECC NOR, it is the ECC block size.
+
+What:		/sys/class/mtd/mtdX/euclean_threshold
+Date:		March 2012
+KernelVersion:	3.3.1
+Contact:	linux-mtd@lists.infradead.org
+Description:
+		This allows the user to examine and adjust the criteria by which
+		mtd returns -EUCLEAN from mtd_read() and mtd_read_oob().  If the
+		maximum number of bit errors corrected on any single writesize
+		during the read operation (as reported by the driver) equals or
+		exceeds this value, -EUCLEAN is returned.  Otherwise, absent an
+		error, 0 is returned.  Higher layers (e.g., UBI) use this return
+		code as an indication that an erase block may be degrading and
+		should be scrutinized as a candidate for being marked as bad.
+
+		The initial value may be set by the flash device driver.  If
+		not, then it is equal to the maximum number of bit errors that
+		the device's ECC facility is capable of correcting per
+		writesize.  Users who wish to be more paranoid about data
+		integrity can lower the value.
+
+		This is generally applicable only to NAND flash devices with
+		ECC capability.
diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c
index b274fdf..5bbd717 100644
--- a/drivers/mtd/mtdcore.c
+++ b/drivers/mtd/mtdcore.c
@@ -250,6 +250,34 @@ static ssize_t mtd_name_show(struct device *dev,
 }
 static DEVICE_ATTR(name, S_IRUGO, mtd_name_show, NULL);
 
+static ssize_t mtd_euclean_threshold_show(struct device *dev,
+					  struct device_attribute *attr,
+					  char *buf)
+{
+	struct mtd_info *mtd = dev_get_drvdata(dev);
+
+	return snprintf(buf, PAGE_SIZE, "%u\n", mtd->euclean_threshold);
+}
+
+static ssize_t mtd_euclean_threshold_store(struct device *dev,
+					   struct device_attribute *attr,
+					   const char *buf, size_t count)
+{
+	struct mtd_info *mtd = dev_get_drvdata(dev);
+	unsigned int euclean_threshold;
+	int retval;
+
+	retval = kstrtouint(buf, 0, &euclean_threshold);
+	if (retval)
+		return retval;
+
+	mtd->euclean_threshold = euclean_threshold;
+	return count;
+}
+static DEVICE_ATTR(euclean_threshold, S_IRUGO | S_IWUSR,
+		   mtd_euclean_threshold_show,
+		   mtd_euclean_threshold_store);
+
 static struct attribute *mtd_attrs[] = {
 	&dev_attr_type.attr,
 	&dev_attr_flags.attr,
@@ -260,6 +288,7 @@ static struct attribute *mtd_attrs[] = {
 	&dev_attr_oobsize.attr,
 	&dev_attr_numeraseregions.attr,
 	&dev_attr_name.attr,
+	&dev_attr_euclean_threshold.attr,
 	NULL,
 };
 
@@ -322,6 +351,10 @@ int add_mtd_device(struct mtd_info *mtd)
 	mtd->index = i;
 	mtd->usecount = 0;
 
+	/* default value if not set by driver */
+	if (mtd->euclean_threshold == 0)
+		mtd->euclean_threshold = mtd->ecc_strength;
+
 	if (is_power_of_2(mtd->erasesize))
 		mtd->erasesize_shift = ffs(mtd->erasesize) - 1;
 	else
diff --git a/include/linux/mtd/mtd.h b/include/linux/mtd/mtd.h
index cf5ea8c..20b5c40 100644
--- a/include/linux/mtd/mtd.h
+++ b/include/linux/mtd/mtd.h
@@ -157,6 +157,13 @@ struct mtd_info {
 	unsigned int erasesize_mask;
 	unsigned int writesize_mask;
 
+	/*
+	 * read ops return -EUCLEAN if max number of bitflips corrected on any
+	 * one writesize region equals or exceeds this value.  Settable by
+	 * driver, else defaults to ecc_strength.  User can override in sysfs.
+	 */
+	unsigned int euclean_threshold;
+
 	// Kernel-only stuff starts here.
 	const char *name;
 	int index;
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 4/4] MTD: drivers return max_bitflips, mtd returns -EUCLEAN
  2012-03-11 21:21 [PATCH 0/4] MTD: Change meaning of -EUCLEAN return code on reads Mike Dunn
                   ` (2 preceding siblings ...)
  2012-03-11 21:21 ` [PATCH 3/4] MTD: euclean_threshold added to mtd_info and sysfs Mike Dunn
@ 2012-03-11 21:21 ` Mike Dunn
  2012-03-14 11:05   ` Shmulik Ladkani
  2012-03-29 17:30   ` Brian Norris
  2012-03-13 12:03 ` [PATCH 0/4] MTD: Change meaning of -EUCLEAN return code on reads Artem Bityutskiy
  4 siblings, 2 replies; 31+ messages in thread
From: Mike Dunn @ 2012-03-11 21:21 UTC (permalink / raw)
  To: linux-mtd; +Cc: Mike Dunn

The drivers' read methods, absent an error, return a non-negative integer
indicating the maximum number of bit errors that were corrected in any one
writesize region.  MTD returns -EUCLEAN if this is >= euclean_threshold, 0
otherwise.  Note that this is a subtle change to the driver interface.

This and the preceeding patches in this set were tested ubi on top of the
nandsim and docg4 devices, running the ubi test io_basic from mtd-utils.

Signed-off-by: Mike Dunn <mikedunn@newsguy.com>
---

We may want to also change the name of the macro mtd_is_bitflip(), since this is
not really correctly named anymore.

 drivers/mtd/devices/docg3.c        |    4 +++-
 drivers/mtd/mtdcore.c              |   35 ++++++++++++++++++++++++++++++++++-
 drivers/mtd/mtdpart.c              |   25 +++++++++++++------------
 drivers/mtd/nand/alauda.c          |    4 ++--
 drivers/mtd/nand/nand_base.c       |   14 ++++++++++++--
 drivers/mtd/onenand/onenand_base.c |    6 ++++--
 include/linux/mtd/mtd.h            |   10 +---------
 7 files changed, 69 insertions(+), 29 deletions(-)

diff --git a/drivers/mtd/devices/docg3.c b/drivers/mtd/devices/docg3.c
index 886ca0f..95ea33e 100644
--- a/drivers/mtd/devices/docg3.c
+++ b/drivers/mtd/devices/docg3.c
@@ -854,6 +854,7 @@ static int doc_read_oob(struct mtd_info *mtd, loff_t from,
 	u8 *buf = ops->datbuf;
 	size_t len, ooblen, nbdata, nboob;
 	u8 hwecc[DOC_ECC_BCH_SIZE], eccconf1;
+	int max_bitflips = 0;
 
 	if (buf)
 		len = ops->len;
@@ -938,7 +939,8 @@ static int doc_read_oob(struct mtd_info *mtd, loff_t from,
 			}
 			if (ret > 0) {
 				mtd->ecc_stats.corrected += ret;
-				ret = -EUCLEAN;
+				max_bitflips = max(max_bitflips, ret);
+				ret = max_bitflips;
 			}
 		}
 
diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c
index 5bbd717..6871658 100644
--- a/drivers/mtd/mtdcore.c
+++ b/drivers/mtd/mtdcore.c
@@ -789,12 +789,24 @@ EXPORT_SYMBOL_GPL(mtd_get_unmapped_area);
 int mtd_read(struct mtd_info *mtd, loff_t from, size_t len, size_t *retlen,
 	     u_char *buf)
 {
+	int ret_code;
 	*retlen = 0;
 	if (from < 0 || from > mtd->size || len > mtd->size - from)
 		return -EINVAL;
 	if (!len)
 		return 0;
-	return mtd->_read(mtd, from, len, retlen, buf);
+
+	/*
+	 * In the absence of an error, drivers return a non-negative integer
+	 * representing the maximum number of bitflips that were corrected on
+	 * any one writesize region (typically a NAND page).
+	 */
+	ret_code = mtd->_read(mtd, from, len, retlen, buf);
+
+	if (unlikely(ret_code < 0))
+		return ret_code;
+
+	return ret_code >= mtd->euclean_threshold ? -EUCLEAN : 0;
 }
 EXPORT_SYMBOL_GPL(mtd_read);
 
@@ -835,6 +847,27 @@ int mtd_panic_write(struct mtd_info *mtd, loff_t to, size_t len, size_t *retlen,
 }
 EXPORT_SYMBOL_GPL(mtd_panic_write);
 
+int mtd_read_oob(struct mtd_info *mtd, loff_t from, struct mtd_oob_ops *ops)
+{
+	int ret_code;
+	ops->retlen = ops->oobretlen = 0;
+	if (!mtd->_read_oob)
+		return -EOPNOTSUPP;
+
+	/*
+	 * In the absence of an error, drivers return a non-negative integer
+	 * representing the maximum number of bitflips that were corrected on
+	 * any one writesize region (typically a NAND page).
+	 */
+	ret_code = mtd->_read_oob(mtd, from, ops);
+
+	if (unlikely(ret_code < 0))
+		return ret_code;
+
+	return ret_code >= mtd->euclean_threshold ? -EUCLEAN : 0;
+}
+EXPORT_SYMBOL_GPL(mtd_read_oob);
+
 /*
  * Method to access the protection register area, present in some flash
  * devices. The user data is one time programmable but the factory data is read
diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
index 9651c06..24022ce 100644
--- a/drivers/mtd/mtdpart.c
+++ b/drivers/mtd/mtdpart.c
@@ -67,12 +67,12 @@ static int part_read(struct mtd_info *mtd, loff_t from, size_t len,
 	stats = part->master->ecc_stats;
 	res = part->master->_read(part->master, from + part->offset, len,
 				  retlen, buf);
-	if (unlikely(res)) {
-		if (mtd_is_bitflip(res))
-			mtd->ecc_stats.corrected += part->master->ecc_stats.corrected - stats.corrected;
-		if (mtd_is_eccerr(res))
-			mtd->ecc_stats.failed += part->master->ecc_stats.failed - stats.failed;
-	}
+	if (unlikely(mtd_is_eccerr(res)))
+		mtd->ecc_stats.failed +=
+			part->master->ecc_stats.failed - stats.failed;
+	else
+		mtd->ecc_stats.corrected +=
+			part->master->ecc_stats.corrected - stats.corrected;
 	return res;
 }
 
@@ -108,6 +108,7 @@ static int part_read_oob(struct mtd_info *mtd, loff_t from,
 		struct mtd_oob_ops *ops)
 {
 	struct mtd_part *part = PART(mtd);
+	struct mtd_ecc_stats stats = part->master->ecc_stats;
 	int res;
 
 	if (from >= mtd->size)
@@ -133,12 +134,12 @@ static int part_read_oob(struct mtd_info *mtd, loff_t from,
 	}
 
 	res = part->master->_read_oob(part->master, from + part->offset, ops);
-	if (unlikely(res)) {
-		if (mtd_is_bitflip(res))
-			mtd->ecc_stats.corrected++;
-		if (mtd_is_eccerr(res))
-			mtd->ecc_stats.failed++;
-	}
+	if (unlikely(mtd_is_eccerr(res)))
+		mtd->ecc_stats.failed +=
+			part->master->ecc_stats.failed - stats.failed;
+	else
+		mtd->ecc_stats.corrected +=
+			part->master->ecc_stats.corrected - stats.corrected;
 	return res;
 }
 
diff --git a/drivers/mtd/nand/alauda.c b/drivers/mtd/nand/alauda.c
index 4f20e1d..7e59e83 100644
--- a/drivers/mtd/nand/alauda.c
+++ b/drivers/mtd/nand/alauda.c
@@ -414,7 +414,7 @@ static int alauda_bounce_read(struct mtd_info *mtd, loff_t from, size_t len,
 	}
 	err = 0;
 	if (corrected)
-		err = -EUCLEAN;
+		err = 1;	/* return max bitflips per page */
 	if (uncorrected)
 		err = -EBADMSG;
 out:
@@ -446,7 +446,7 @@ static int alauda_read(struct mtd_info *mtd, loff_t from, size_t len,
 	}
 	err = 0;
 	if (corrected)
-		err = -EUCLEAN;
+		err = 1;	/* return max bitflips per page */
 	if (uncorrected)
 		err = -EBADMSG;
 	return err;
diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index 8008853..e2bf003 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -1469,6 +1469,7 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from,
 	uint32_t oobreadlen = ops->ooblen;
 	uint32_t max_oobsize = ops->mode == MTD_OPS_AUTO_OOB ?
 		mtd->oobavail : mtd->oobsize;
+	unsigned int max_bitflips = 0;
 
 	uint8_t *bufpoi, *oob, *buf;
 
@@ -1491,6 +1492,8 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from,
 
 		/* Is the current page in the buffer? */
 		if (realpage != chip->pagebuf || oob) {
+			unsigned int prev_corrected = mtd->ecc_stats.corrected;
+
 			bufpoi = aligned ? buf : chip->buffers->databuf;
 
 			if (likely(sndcmd)) {
@@ -1553,6 +1556,9 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from,
 				else
 					nand_wait_ready(mtd);
 			}
+			max_bitflips = max(max_bitflips,
+					   mtd->ecc_stats.corrected -
+					   prev_corrected);
 		} else {
 			memcpy(buf, chip->buffers->databuf + col, bytes);
 			buf += bytes;
@@ -1594,7 +1600,7 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from,
 	if (mtd->ecc_stats.failed - stats.failed)
 		return -EBADMSG;
 
-	return  mtd->ecc_stats.corrected - stats.corrected ? -EUCLEAN : 0;
+	return max_bitflips;
 }
 
 /**
@@ -1782,6 +1788,7 @@ static int nand_do_read_oob(struct mtd_info *mtd, loff_t from,
 	int readlen = ops->ooblen;
 	int len;
 	uint8_t *buf = ops->oobbuf;
+	unsigned int max_bitflips = 0;
 
 	pr_debug("%s: from = 0x%08Lx, len = %i\n",
 			__func__, (unsigned long long)from, readlen);
@@ -1816,6 +1823,7 @@ static int nand_do_read_oob(struct mtd_info *mtd, loff_t from,
 	page = realpage & chip->pagemask;
 
 	while (1) {
+		unsigned int prev_corrected = mtd->ecc_stats.corrected;
 		if (ops->mode == MTD_OPS_RAW)
 			sndcmd = chip->ecc.read_oob_raw(mtd, chip, page, sndcmd);
 		else
@@ -1836,6 +1844,8 @@ static int nand_do_read_oob(struct mtd_info *mtd, loff_t from,
 			else
 				nand_wait_ready(mtd);
 		}
+		max_bitflips = max(max_bitflips,
+				   mtd->ecc_stats.corrected - prev_corrected);
 
 		readlen -= len;
 		if (!readlen)
@@ -1865,7 +1875,7 @@ static int nand_do_read_oob(struct mtd_info *mtd, loff_t from,
 	if (mtd->ecc_stats.failed - stats.failed)
 		return -EBADMSG;
 
-	return  mtd->ecc_stats.corrected - stats.corrected ? -EUCLEAN : 0;
+	return  max_bitflips;
 }
 
 /**
diff --git a/drivers/mtd/onenand/onenand_base.c b/drivers/mtd/onenand/onenand_base.c
index 3d781b8..177b0a5 100644
--- a/drivers/mtd/onenand/onenand_base.c
+++ b/drivers/mtd/onenand/onenand_base.c
@@ -1201,7 +1201,8 @@ static int onenand_mlc_read_ops_nolock(struct mtd_info *mtd, loff_t from,
 	if (mtd->ecc_stats.failed - stats.failed)
 		return -EBADMSG;
 
-	return mtd->ecc_stats.corrected - stats.corrected ? -EUCLEAN : 0;
+	/* return max bitflips per page; ONENANDs correct 1 bit only */
+	return mtd->ecc_stats.corrected != stats.corrected ? 1 : 0;
 }
 
 /**
@@ -1333,7 +1334,8 @@ static int onenand_read_ops_nolock(struct mtd_info *mtd, loff_t from,
 	if (mtd->ecc_stats.failed - stats.failed)
 		return -EBADMSG;
 
-	return mtd->ecc_stats.corrected - stats.corrected ? -EUCLEAN : 0;
+	/* return max bitflips per page; ONENANDs correct 1 bit only */
+	return mtd->ecc_stats.corrected != stats.corrected ? 1 : 0;
 }
 
 /**
diff --git a/include/linux/mtd/mtd.h b/include/linux/mtd/mtd.h
index 20b5c40..b0c78f4 100644
--- a/include/linux/mtd/mtd.h
+++ b/include/linux/mtd/mtd.h
@@ -262,15 +262,7 @@ int mtd_write(struct mtd_info *mtd, loff_t to, size_t len, size_t *retlen,
 	      const u_char *buf);
 int mtd_panic_write(struct mtd_info *mtd, loff_t to, size_t len, size_t *retlen,
 		    const u_char *buf);
-
-static inline int mtd_read_oob(struct mtd_info *mtd, loff_t from,
-			       struct mtd_oob_ops *ops)
-{
-	ops->retlen = ops->oobretlen = 0;
-	if (!mtd->_read_oob)
-		return -EOPNOTSUPP;
-	return mtd->_read_oob(mtd, from, ops);
-}
+int mtd_read_oob(struct mtd_info *mtd, loff_t from, struct mtd_oob_ops *ops);
 
 static inline int mtd_write_oob(struct mtd_info *mtd, loff_t to,
 				struct mtd_oob_ops *ops)
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/4] MTD: Change meaning of -EUCLEAN return code on reads
  2012-03-11 21:21 [PATCH 0/4] MTD: Change meaning of -EUCLEAN return code on reads Mike Dunn
                   ` (3 preceding siblings ...)
  2012-03-11 21:21 ` [PATCH 4/4] MTD: drivers return max_bitflips, mtd returns -EUCLEAN Mike Dunn
@ 2012-03-13 12:03 ` Artem Bityutskiy
  2012-03-13 17:46   ` Mike Dunn
  4 siblings, 1 reply; 31+ messages in thread
From: Artem Bityutskiy @ 2012-03-13 12:03 UTC (permalink / raw)
  To: Mike Dunn; +Cc: Ivan Djelic, linux-mtd

[-- Attachment #1: Type: text/plain, Size: 2527 bytes --]

Hi,

On Sun, 2012-03-11 at 14:21 -0700, Mike Dunn wrote:
> This problem was discussed a while back [1][2][3], and a consensus of sorts was
> reached for a solution, which these patches implement.  The recent addition of
> the mtd api entry functions now make this solution practical (thanks Artem!).  A
> quick description of each patch will provide a synopsis of the patchset:
> 
> (1) The element 'ecc_strength' is added to struct mtd_info, which will store the
>     maximum number of bit errors that can be corrected in one writesize region.

I think this attribute should be exported via sysfs as R/O.

> (2) Drivers set ecc_strength during initialization.
> 
> (3) The element 'euclean_threshold' is added to struct mtd_info.  If the driver
>     leaves this uninitialized, mtd sets it to ecc_strength when the device or
>     partition is registered.  This element is also exposed for reading and
>     writing from userspace through sysfs.

Would you please rename it to bitflip_threshold. It is bearable when it
is just a  'struct mtd_info' member, but you also export
'euclean_threshold' sysfs file which is really confusing from user
perspective, I think.

> (4) The drivers' read methods, absent an error, return a non-negative integer
>     indicating the maximum number of bit errors that were corrected in any one
>     writesize region.  MTD returns -EUCLEAN if this is >= euclean_threshold, 0
>     otherwise.
> 
> So basically, the meaning of -EUCLEAN is changed from "one or more bit errors
> were corrected over the entire read operation", to "a dangerously high number of
> bit errors were corrected on one or more writesize regions".  By default,
> "dangerously high" is interpreted as the maximum number of correctible bit
> errors per writesize.  Drivers can specify a different value, and the user can
> override it if more or less caution regarding data integrity is desired.

Please, make sure we have a good comment like this in the mtd.h file as
well. I think the one you added is not verbose enough.

Otherwise looks good!

> Patch #2 touches a lot of files, but they are small changes in most cases.  If
> you can verify the correctness of the device's ecc strength, an ACK would be
> much appreciated!

I'd be pro-active and just CC'ed maintainers/possibly relevant people.
scripts/get_maintainer.pl would give you the ones, I think. Just spend a
little time and come up with a list of people and CC them.

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/4] MTD: add ecc_strength fields to mtd structs
  2012-03-11 21:21 ` [PATCH 1/4] MTD: add ecc_strength fields to mtd structs Mike Dunn
@ 2012-03-13 12:25   ` Artem Bityutskiy
  0 siblings, 0 replies; 31+ messages in thread
From: Artem Bityutskiy @ 2012-03-13 12:25 UTC (permalink / raw)
  To: Mike Dunn; +Cc: linux-mtd

[-- Attachment #1: Type: text/plain, Size: 746 bytes --]

On Sun, 2012-03-11 at 14:21 -0700, Mike Dunn wrote:
> This adds 'ecc_strength' to struct mtd_info.  This stores the maximum number of
> bit errors that can be corrected in one writesize region.
> 
> For consistency with the nand code, 'strength' is similiarly added to struct
> nand_ecc_ctrl.  This stores the maximum number of bit errors that can be
> corrected in one ecc step.
> 
> Signed-off-by: Mike Dunn <mikedunn@newsguy.com>

If you expose 'bitflips_threshold' to user-space, I think you also
should expose the strength - otherwise how would users select the
correct value in a general manner? Would you please expose it in a
separate patch?

Pushed this one to l2-mtd.git, thanks!

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/4] MTD: flash drivers set ecc strength
  2012-03-11 21:21 ` [PATCH 2/4] MTD: flash drivers set ecc strength Mike Dunn
@ 2012-03-13 12:27   ` Artem Bityutskiy
  2012-03-16 14:13   ` Ivan Djelic
  2012-03-29 17:24   ` Brian Norris
  2 siblings, 0 replies; 31+ messages in thread
From: Artem Bityutskiy @ 2012-03-13 12:27 UTC (permalink / raw)
  To: Mike Dunn; +Cc: Ivan Djelic, linux-mtd

[-- Attachment #1: Type: text/plain, Size: 478 bytes --]

On Sun, 2012-03-11 at 14:21 -0700, Mike Dunn wrote:
> Flash device drivers initialize 'ecc_strength' in struct mtd_info, which is the
> maximum number of bit errors that can be corrected in one writesize region.

Looks ok, so pushed to l2-mtd.git. But please, try to find a list of
possibly relevant people and send them a "please, take a look" request
to make sure we give people a chance to fix possible errors early.

Thanks!

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/4] MTD: euclean_threshold added to mtd_info and sysfs
  2012-03-11 21:21 ` [PATCH 3/4] MTD: euclean_threshold added to mtd_info and sysfs Mike Dunn
@ 2012-03-13 12:29   ` Artem Bityutskiy
  0 siblings, 0 replies; 31+ messages in thread
From: Artem Bityutskiy @ 2012-03-13 12:29 UTC (permalink / raw)
  To: Mike Dunn; +Cc: linux-mtd

[-- Attachment #1: Type: text/plain, Size: 599 bytes --]

On Sun, 2012-03-11 at 14:21 -0700, Mike Dunn wrote:
> The element 'euclean_threshold' is added to struct mtd_info.  If the driver
> leaves this uninitialized, mtd sets it to ecc_strength when the device or
> partition is registered.  This element is also exposed for reading and writing
> from userspace through sysfs.

Would you please change the name?

> An afterthought... ecc_strength should probably also be exposed through syfs as
> a r/o value.  Ioctls for querying both these values could also be added to
> mtdchar.

Yeah, I believe so!

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/4] MTD: Change meaning of -EUCLEAN return code on reads
  2012-03-13 12:03 ` [PATCH 0/4] MTD: Change meaning of -EUCLEAN return code on reads Artem Bityutskiy
@ 2012-03-13 17:46   ` Mike Dunn
  2012-03-13 22:14     ` Mike Dunn
  2012-03-14 10:56     ` Artem Bityutskiy
  0 siblings, 2 replies; 31+ messages in thread
From: Mike Dunn @ 2012-03-13 17:46 UTC (permalink / raw)
  To: dedekind1; +Cc: Ivan Djelic, linux-mtd

Thanks for having a look Artem.

On 03/13/2012 05:03 AM, Artem Bityutskiy wrote:
>> (1) The element 'ecc_strength' is added to struct mtd_info, which will store the
>>     maximum number of bit errors that can be corrected in one writesize region.
> 
> I think this attribute should be exported via sysfs as R/O.


Do you think bitflip_threshold and/or ecc_strength should be accessible through
mtdchar ioctls as well?


>> (3) The element 'euclean_threshold' is added to struct mtd_info.  If the driver
>>     leaves this uninitialized, mtd sets it to ecc_strength when the device or
>>     partition is registered.  This element is also exposed for reading and
>>     writing from userspace through sysfs.
> 
> Would you please rename it to bitflip_threshold. It is bearable when it
> is just a  'struct mtd_info' member, but you also export
> 'euclean_threshold' sysfs file which is really confusing from user
> perspective, I think.


OK, maybe bitflip is the better choice.


>> So basically, the meaning of -EUCLEAN is changed from "one or more bit errors
>> were corrected over the entire read operation", to "a dangerously high number of
>> bit errors were corrected on one or more writesize regions".  By default,
>> "dangerously high" is interpreted as the maximum number of correctible bit
>> errors per writesize.  Drivers can specify a different value, and the user can
>> override it if more or less caution regarding data integrity is desired.
> 
> Please, make sure we have a good comment like this in the mtd.h file as
> well. I think the one you added is not verbose enough.


OK.  Usually I have the opposite problem - comments too verbose!


>> Patch #2 touches a lot of files, but they are small changes in most cases.  If
>> you can verify the correctness of the device's ecc strength, an ACK would be
>> much appreciated!
> 
> I'd be pro-active and just CC'ed maintainers/possibly relevant people.
> scripts/get_maintainer.pl would give you the ones, I think. Just spend a
> little time and come up with a list of people and CC them.


I actually did do this, and git-send-email kept rejecting it because of "too
many recipients" or some such.  After several iterations of trying to
intelligently whittle down the CC list and getting the same result, I got
frustrated and just CC'd Ivan.  (Hope you don't mind, Ivan :) Anyone know how
many recipients git allows?

I'll forward patch #2 along with a polite request for review to the original CC
list using my regular email client.  I'll also start putting together new
patches that address your comments.

Thanks again,
Mike

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/4] MTD: Change meaning of -EUCLEAN return code on reads
  2012-03-13 17:46   ` Mike Dunn
@ 2012-03-13 22:14     ` Mike Dunn
  2012-03-14 10:56     ` Artem Bityutskiy
  1 sibling, 0 replies; 31+ messages in thread
From: Mike Dunn @ 2012-03-13 22:14 UTC (permalink / raw)
  To: linux-mtd

On 03/13/2012 10:46 AM, Mike Dunn wrote:
>>
>> I'd be pro-active and just CC'ed maintainers/possibly relevant people.
>> scripts/get_maintainer.pl would give you the ones, I think. Just spend a
>> little time and come up with a list of people and CC them.
> 
> 
> I actually did do this, and git-send-email kept rejecting it because of "too
> many recipients" or some such.  After several iterations of trying to
> intelligently whittle down the CC list and getting the same result, I got
> frustrated and just CC'd Ivan.  (Hope you don't mind, Ivan :) Anyone know how
> many recipients git allows?


This is my smtp server rejecting it, not git.  Apparently it has an absurdly low
limit to the number of recipients.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/4] MTD: Change meaning of -EUCLEAN return code on reads
  2012-03-13 17:46   ` Mike Dunn
  2012-03-13 22:14     ` Mike Dunn
@ 2012-03-14 10:56     ` Artem Bityutskiy
  1 sibling, 0 replies; 31+ messages in thread
From: Artem Bityutskiy @ 2012-03-14 10:56 UTC (permalink / raw)
  To: Mike Dunn; +Cc: Ivan Djelic, linux-mtd

[-- Attachment #1: Type: text/plain, Size: 301 bytes --]

      * On Tue, 2012-03-13 at 10:46 -0700, Mike Dunn wrote:
> Do you think bitflip_threshold and/or ecc_strength should be accessible through
> mtdchar ioctls as well?

If there is a need - probably, but I feel that sysfs should be enough
for most cases.

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/4] MTD: drivers return max_bitflips, mtd returns -EUCLEAN
  2012-03-11 21:21 ` [PATCH 4/4] MTD: drivers return max_bitflips, mtd returns -EUCLEAN Mike Dunn
@ 2012-03-14 11:05   ` Shmulik Ladkani
  2012-03-14 11:45     ` Shmulik Ladkani
  2012-03-29 17:30   ` Brian Norris
  1 sibling, 1 reply; 31+ messages in thread
From: Shmulik Ladkani @ 2012-03-14 11:05 UTC (permalink / raw)
  To: Mike Dunn; +Cc: linux-mtd, Artem Bityutskiy

On Sun, 11 Mar 2012 14:21:13 -0700 Mike Dunn <mikedunn@newsguy.com> wrote:
> -	return mtd->_read(mtd, from, len, retlen, buf);
> +
> +	/*
> +	 * In the absence of an error, drivers return a non-negative integer
> +	 * representing the maximum number of bitflips that were corrected on
> +	 * any one writesize region (typically a NAND page).
> +	 */
> +	ret_code = mtd->_read(mtd, from, len, retlen, buf);

Maybe better for the remark to reside in mtd.h near '_read' definition?

> +	if (unlikely(ret_code < 0))
> +		return ret_code;
> +
> +	return ret_code >= mtd->euclean_threshold ? -EUCLEAN : 0;

I'm a bit confused here.
From former patches, you state:

> The element 'euclean_threshold' is added to struct mtd_info.  If the driver
> leaves this uninitialized, mtd sets it to ecc_strength when the device or
> partition is registered. 

and:

> Flash device drivers initialize 'ecc_strength' in struct mtd_info, which is the
> maximum number of bit errors that can be corrected in one writesize region.

So I conclude that by default 'euclean_threshold' is equivalent to
"maximum number of bit errors that can be corrected in one writesize".

Now, lets go back to the "ret_code >= mtd->euclean_threshold" condition.

I suspect that in the default case, we'll never reach that condition.

Suppose the read introduced 'euclean_threshold' (well, 'ecc_strength')
bit errors; according to defintion of 'ecc_strength', it actually means
an uncorrectable error.
And as such, it is reported by the driver by incrementing
'ecc_stats.failed', and by the nand infrastructure as -EBADMSG.
So we'll hit the "unlikely(ret_code < 0)" case, and not the
"ret_code >= mtd->euclean_threshold" condition.

Conclusion:
As long as 'euclean_threshold' is kept to its default and not tweaked by
the sysfs knob, MTD system no longer reports -EUCLEAN.

Was that the intention? Did I miss something here?

>  /*
>   * Method to access the protection register area, present in some flash
>   * devices. The user data is one time programmable but the factory data is read
> diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
> index 9651c06..24022ce 100644
> --- a/drivers/mtd/mtdpart.c
> +++ b/drivers/mtd/mtdpart.c
> @@ -67,12 +67,12 @@ static int part_read(struct mtd_info *mtd, loff_t from, size_t len,
>  	stats = part->master->ecc_stats;
>  	res = part->master->_read(part->master, from + part->offset, len,
>  				  retlen, buf);
> -	if (unlikely(res)) {
> -		if (mtd_is_bitflip(res))
> -			mtd->ecc_stats.corrected += part->master->ecc_stats.corrected - stats.corrected;
> -		if (mtd_is_eccerr(res))
> -			mtd->ecc_stats.failed += part->master->ecc_stats.failed - stats.failed;
> -	}
> +	if (unlikely(mtd_is_eccerr(res)))
> +		mtd->ecc_stats.failed +=
> +			part->master->ecc_stats.failed - stats.failed;
> +	else
> +		mtd->ecc_stats.corrected +=
> +			part->master->ecc_stats.corrected - stats.corrected;

Is there a reason to execute the "else" part ('corrected' increment) in
case 'res' is negative?

> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
> index 8008853..e2bf003 100644
> --- a/drivers/mtd/nand/nand_base.c
> +++ b/drivers/mtd/nand/nand_base.c
> @@ -1594,7 +1600,7 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from,
>  	if (mtd->ecc_stats.failed - stats.failed)
>  		return -EBADMSG;
>  
> -	return  mtd->ecc_stats.corrected - stats.corrected ? -EUCLEAN : 0;
> +	return max_bitflips;
>  }

Two lines eralier we have:
	if (ret)
		return ret;

where 'ret' is assigned to the return value of
read_page/read_page_raw/read_subpage calls.
Meaning, your new code rely on these calls to never return a positive
value, otherwise the 'read' wrapper might mistakenly identify these
as max bitflips.
This is a reasonable assumption. Would be nice if it can be enforced.
(BTW I looked on many implementations, all return 0...)

Regards,
Shmulik

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/4] MTD: drivers return max_bitflips, mtd returns -EUCLEAN
  2012-03-14 11:05   ` Shmulik Ladkani
@ 2012-03-14 11:45     ` Shmulik Ladkani
  0 siblings, 0 replies; 31+ messages in thread
From: Shmulik Ladkani @ 2012-03-14 11:45 UTC (permalink / raw)
  To: Mike Dunn; +Cc: linux-mtd, Artem Bityutskiy

On Wed, 14 Mar 2012 13:05:43 +0200 Shmulik Ladkani <shmulik.ladkani@gmail.com> wrote:
> 
> Conclusion:
> As long as 'euclean_threshold' is kept to its default and not tweaked by
> the sysfs knob, MTD system no longer reports -EUCLEAN.
> 
> Was that the intention? Did I miss something here?

Yes, I missed something and got it wrong.
Further meditation enlightened me.

No need to reply on this comment. Sorry.

Regards,
Shmulik

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/4] MTD: flash drivers set ecc strength
  2012-03-11 21:21 ` [PATCH 2/4] MTD: flash drivers set ecc strength Mike Dunn
  2012-03-13 12:27   ` Artem Bityutskiy
@ 2012-03-16 14:13   ` Ivan Djelic
  2012-03-16 20:02     ` Mike Dunn
  2012-03-29 17:24   ` Brian Norris
  2 siblings, 1 reply; 31+ messages in thread
From: Ivan Djelic @ 2012-03-16 14:13 UTC (permalink / raw)
  To: Mike Dunn; +Cc: linux-mtd

On Sun, Mar 11, 2012 at 09:21:11PM +0000, Mike Dunn wrote:
> Flash device drivers initialize 'ecc_strength' in struct mtd_info, which is the
> maximum number of bit errors that can be corrected in one writesize region.
> 
> Drivers using the nand interface intitialize 'strength' in struct nand_ecc_ctrl,
> which is the maximum number of bit errors that can be corrected in one ecc step.
> Nand infrastructure code translates this to 'ecc_strength'.

OK, in [1] I suggested to drop 'ecc_strength' and use ecc.strength, but
obviously it does not work for drivers not using the nand interface...

[1] http://lists.infradead.org/pipermail/linux-mtd/2012-March/040313.html

> With a couple exceptions, I'm fairly confident that these values are correct,
> but if you can confirm, an ACK would be greatly appreciated!  Note that an
> incorrect value will not cause a regression unless the value is overstated.
> 
> The exceptions are:
> 
>   bcm_umi_nand.c: implements bch in hw; Galois field order not indicated

The fixed ecc.size = 512 and number of ecc bytes (13) strongly suggest
m=13 and t=8; this is confirmed if you look at nand_bch_umi.h: 112:
	/* K parameter is used internally.  K = N - (T * 13) */

>   jz4740_nand.c: implements r-s in hw; generator polynomial order not indicated

The Reed-Solomon code used by this chip is described in [2], it is a RS(511,503)
code. It can correct up to 4 symbols (each symbol is 9 bits).
In the NAND context, it just means it can correct up to 4 bitflips per 512 bytes.

[2] http://www.amebasystems.com/downloads/hardware/datasheets/ben-nanonote/Ingenic-SOC-JZ4720/Jz4740-PM/jz4740_03_emc_spec.pdf

> index ee81b63..fc60043 100644
> --- a/drivers/mtd/nand/bcm_umi_nand.c
> +++ b/drivers/mtd/nand/bcm_umi_nand.c
> @@ -476,6 +476,14 @@ static int __devinit bcm_umi_nand_probe(struct platform_device *pdev)
>                         largepage_bbt.options = NAND_BBT_SCAN2NDPAGE;
>                 this->badblock_pattern = &largepage_bbt;
>         }
> +
> +       /*
> +        * FIXME: ecc strength value of 6 bits per 512 bytes of data is a
> +        * conservative guess, given 13 ecc bytes and using bch alg.
> +        * (Assume Galois field order m=15 to allow a margin of error.)
> +        */
> +       this->ecc.strength = 6;
> +
>  #endif

When NAND_ECC_BCH is disabled, is ecc.strength properly enabled ?

> index 7195ee6..80b5264 100644
> --- a/drivers/mtd/nand/fsl_elbc_nand.c
> +++ b/drivers/mtd/nand/fsl_elbc_nand.c
> @@ -813,6 +813,12 @@ static int fsl_elbc_chip_init(struct fsl_elbc_mtd *priv)
>                                 &fsl_elbc_oob_sp_eccm1 : &fsl_elbc_oob_sp_eccm0;
>                 chip->ecc.size = 512;
>                 chip->ecc.bytes = 3;
> +               chip->ecc.strength = 1;
> +               /*
> +                * FIXME: can hardware ecc correct 4 bitflips if page size is
> +                * 2k?  Then does hardware report number of corrections for this
> +                * case?  If so, ecc_stats reporting needs to be fixed as well.
> +                */

Here you simply have 4 ecc steps, each step only able to correct 1 bitflip in its
512 bytes subpage.

BR,
--
Ivan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/4] MTD: flash drivers set ecc strength
  2012-03-16 14:13   ` Ivan Djelic
@ 2012-03-16 20:02     ` Mike Dunn
  0 siblings, 0 replies; 31+ messages in thread
From: Mike Dunn @ 2012-03-16 20:02 UTC (permalink / raw)
  To: Ivan Djelic; +Cc: linux-mtd

Thanks again Ivan.  I was hoping you would take a look at this.

On 03/16/2012 07:13 AM, Ivan Djelic wrote:
> On Sun, Mar 11, 2012 at 09:21:11PM +0000, Mike Dunn wrote:
>> Flash device drivers initialize 'ecc_strength' in struct mtd_info, which is the
>> maximum number of bit errors that can be corrected in one writesize region.
>>
>> Drivers using the nand interface intitialize 'strength' in struct nand_ecc_ctrl,
>> which is the maximum number of bit errors that can be corrected in one ecc step.
>> Nand infrastructure code translates this to 'ecc_strength'.
> 
> OK, in [1] I suggested to drop 'ecc_strength' and use ecc.strength, but
> obviously it does not work for drivers not using the nand interface...
> 
> [1] http://lists.infradead.org/pipermail/linux-mtd/2012-March/040313.html
> 
>> With a couple exceptions, I'm fairly confident that these values are correct,
>> but if you can confirm, an ACK would be greatly appreciated!  Note that an
>> incorrect value will not cause a regression unless the value is overstated.
>>
>> The exceptions are:
>>
>>   bcm_umi_nand.c: implements bch in hw; Galois field order not indicated
> 
> The fixed ecc.size = 512 and number of ecc bytes (13) strongly suggest
> m=13 and t=8; this is confirmed if you look at nand_bch_umi.h: 112:
> 	/* K parameter is used internally.  K = N - (T * 13) */


Ah HA!  Thanks.  I didn't want to assume m=13, but I didn't see the clue you
found in the header file.


> 
>>   jz4740_nand.c: implements r-s in hw; generator polynomial order not indicated
> 
> The Reed-Solomon code used by this chip is described in [2], it is a RS(511,503)
> code. It can correct up to 4 symbols (each symbol is 9 bits).
> In the NAND context, it just means it can correct up to 4 bitflips per 512 bytes.
> 
> [2] http://www.amebasystems.com/downloads/hardware/datasheets/ben-nanonote/Ingenic-SOC-JZ4720/Jz4740-PM/jz4740_03_emc_spec.pdf


Thanks again!


> 
>> index ee81b63..fc60043 100644
>> --- a/drivers/mtd/nand/bcm_umi_nand.c
>> +++ b/drivers/mtd/nand/bcm_umi_nand.c
>> @@ -476,6 +476,14 @@ static int __devinit bcm_umi_nand_probe(struct platform_device *pdev)
>>                         largepage_bbt.options = NAND_BBT_SCAN2NDPAGE;
>>                 this->badblock_pattern = &largepage_bbt;
>>         }
>> +
>> +       /*
>> +        * FIXME: ecc strength value of 6 bits per 512 bytes of data is a
>> +        * conservative guess, given 13 ecc bytes and using bch alg.
>> +        * (Assume Galois field order m=15 to allow a margin of error.)
>> +        */
>> +       this->ecc.strength = 6;
>> +
>>  #endif
> 
> When NAND_ECC_BCH is disabled, is ecc.strength properly enabled ?


No, but this is actually broken.  If NAND_ECC_BDH == 0, it won't compile, if I'm
not mistaken.


> 
>> index 7195ee6..80b5264 100644
>> --- a/drivers/mtd/nand/fsl_elbc_nand.c
>> +++ b/drivers/mtd/nand/fsl_elbc_nand.c
>> @@ -813,6 +813,12 @@ static int fsl_elbc_chip_init(struct fsl_elbc_mtd *priv)
>>                                 &fsl_elbc_oob_sp_eccm1 : &fsl_elbc_oob_sp_eccm0;
>>                 chip->ecc.size = 512;
>>                 chip->ecc.bytes = 3;
>> +               chip->ecc.strength = 1;
>> +               /*
>> +                * FIXME: can hardware ecc correct 4 bitflips if page size is
>> +                * 2k?  Then does hardware report number of corrections for this
>> +                * case?  If so, ecc_stats reporting needs to be fixed as well.
>> +                */
> 
> Here you simply have 4 ecc steps, each step only able to correct 1 bitflip in its
> 512 bytes subpage.


I thought so, but it's all done in hw (detection, correction, reporting
results), so I wasn't sure.  And only one bit correction is reported in stats,
never four, regardless of page size.

Really appreciate the review!

Mike

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/4] MTD: flash drivers set ecc strength
  2012-03-11 21:21 ` [PATCH 2/4] MTD: flash drivers set ecc strength Mike Dunn
  2012-03-13 12:27   ` Artem Bityutskiy
  2012-03-16 14:13   ` Ivan Djelic
@ 2012-03-29 17:24   ` Brian Norris
  2012-03-31  0:05     ` Mike Dunn
  2012-04-02 17:34     ` Mike Dunn
  2 siblings, 2 replies; 31+ messages in thread
From: Brian Norris @ 2012-03-29 17:24 UTC (permalink / raw)
  To: Mike Dunn, Artem Bityutskiy, David Woodhouse; +Cc: Ivan Djelic, linux-mtd

Hi Mike, Artem, David,

Sorry for the late review here. It looks like this is already queued
up for merging soon by David...
(BTW, patch 1 and 2 seem to do nothing by themselves; should they
really be merged in the 3.4 window?)

On Sun, Mar 11, 2012 at 2:21 PM, Mike Dunn <mikedunn@newsguy.com> wrote:
> Flash device drivers initialize 'ecc_strength' in struct mtd_info, which is the
> maximum number of bit errors that can be corrected in one writesize region.
>
> Drivers using the nand interface intitialize 'strength' in struct nand_ecc_ctrl,
> which is the maximum number of bit errors that can be corrected in one ecc step.
> Nand infrastructure code translates this to 'ecc_strength'.
>
> Also for nand drivers, the nand infrastructure code sets ecc.strength for ecc
> modes NAND_ECC_SOFT, NAND_ECC_SOFT_BCH, and NAND_ECC_NONE.  It is set in the
> driver for all other modes.

I agree that the other modes should be set by the driver, but is there
any kind of sanity check? We've seen problems with the addition of the
mtd->writebufsize parameter that didn't get caught as uninitialized in
a large number of drivers. I don't think your later patches handle the
case that mtd->ecc_strength is not initialized or is 0.

> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
> index 1e907dc..8008853 100644
> --- a/drivers/mtd/nand/nand_base.c
> +++ b/drivers/mtd/nand/nand_base.c
> @@ -3350,6 +3350,7 @@ int nand_scan_tail(struct mtd_info *mtd)
>                if (!chip->ecc.size)
>                        chip->ecc.size = 256;
>                chip->ecc.bytes = 3;
> +               chip->ecc.strength = 1;
>                break;
>
>        case NAND_ECC_SOFT_BCH:
> @@ -3384,6 +3385,8 @@ int nand_scan_tail(struct mtd_info *mtd)
>                        pr_warn("BCH ECC initialization failed!\n");
>                        BUG();
>                }
> +               chip->ecc.strength =
> +                       chip->ecc.bytes*8 / fls(8*chip->ecc.size);

Isn't this spacing against coding style? I'd suggest spaces around the
'*'. Also, after a few minutes, I have no idea where this calculation
comes from. But I'm not familiar with SOFT_BCH. Maybe a comment is in
order?

> @@ -3397,6 +3400,7 @@ int nand_scan_tail(struct mtd_info *mtd)
>                chip->ecc.write_oob = nand_write_oob_std;
>                chip->ecc.size = mtd->writesize;
>                chip->ecc.bytes = 0;
> +               chip->ecc.strength = 0;
>                break;

This will cause problems with your patch 4, I think. I'll comment on
the other email.

Brian

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/4] MTD: drivers return max_bitflips, mtd returns -EUCLEAN
  2012-03-11 21:21 ` [PATCH 4/4] MTD: drivers return max_bitflips, mtd returns -EUCLEAN Mike Dunn
  2012-03-14 11:05   ` Shmulik Ladkani
@ 2012-03-29 17:30   ` Brian Norris
  2012-03-30 12:13     ` Artem Bityutskiy
  2012-03-31  1:05     ` Mike Dunn
  1 sibling, 2 replies; 31+ messages in thread
From: Brian Norris @ 2012-03-29 17:30 UTC (permalink / raw)
  To: Mike Dunn; +Cc: David Woodhouse, linux-mtd, Shmulik Ladkani, Artem Bityutskiy

On Sun, Mar 11, 2012 at 2:21 PM, Mike Dunn <mikedunn@newsguy.com> wrote:
> The drivers' read methods, absent an error, return a non-negative integer
> indicating the maximum number of bit errors that were corrected in any one
> writesize region.  MTD returns -EUCLEAN if this is >= euclean_threshold, 0
> otherwise.  Note that this is a subtle change to the driver interface.
>
> This and the preceeding patches in this set were tested ubi on top of the
> nandsim and docg4 devices, running the ubi test io_basic from mtd-utils.

Did you test any non-NAND devices? I'm getting the feeling that some
of this is misguided or at least needs some fixing. See below.

> We may want to also change the name of the macro mtd_is_bitflip(), since this is
> not really correctly named anymore.

What sort of name change? Shouldn't EUCLEAN still mean bitflip? We are
just masking the bitflips below a threshold. Anyway, if you have a
better name, bring it up! I think I changed the name from my original
choice based on Artem's suggestions, so I'm open to anything with more
merit :)

> diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c
> index 5bbd717..6871658 100644
> --- a/drivers/mtd/mtdcore.c
> +++ b/drivers/mtd/mtdcore.c
> @@ -789,12 +789,24 @@ EXPORT_SYMBOL_GPL(mtd_get_unmapped_area);
>  int mtd_read(struct mtd_info *mtd, loff_t from, size_t len, size_t *retlen,
>             u_char *buf)
>  {
> +       int ret_code;
>        *retlen = 0;
>        if (from < 0 || from > mtd->size || len > mtd->size - from)
>                return -EINVAL;
>        if (!len)
>                return 0;
> -       return mtd->_read(mtd, from, len, retlen, buf);
> +
> +       /*
> +        * In the absence of an error, drivers return a non-negative integer
> +        * representing the maximum number of bitflips that were corrected on
> +        * any one writesize region (typically a NAND page).
> +        */
> +       ret_code = mtd->_read(mtd, from, len, retlen, buf);
> +
> +       if (unlikely(ret_code < 0))
> +               return ret_code;
> +
> +       return ret_code >= mtd->euclean_threshold ? -EUCLEAN : 0;
>  }

This seems wrong in the case that mtd->euclean_threshold == 0. This
could be the case for any of the following:
(1) NAND that uses ECC_NONE
(2) NAND drivers with HW_ECC that don't initialize chip->ecc.strength
(3) MTD without ECC (e.g., NOR)
(4) Other drivers that might have missed initializing mtd->ecc_strength

Regarding (1): although ECC_NONE is not recommended, it's possible. I
don't think this deserves an EUCLEAN message.

Regarding (2): I notice that your nand_base.c code (patch 2) leaves
ecc.strength initialization up to the driver. I think a sanity check
could be provided in those cases. For instance, in the NAND_ECC_HW*
cases, you could warn or error out on 'chip->ecc.strength == 0'.

If I'm correct, (3) is quite significant, since non-ECC'd MTDs would
produce EUCLEAN statuses on every read.

Brian

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/4] MTD: drivers return max_bitflips, mtd returns -EUCLEAN
  2012-03-29 17:30   ` Brian Norris
@ 2012-03-30 12:13     ` Artem Bityutskiy
  2012-03-31  1:17       ` Mike Dunn
  2012-03-31  1:05     ` Mike Dunn
  1 sibling, 1 reply; 31+ messages in thread
From: Artem Bityutskiy @ 2012-03-30 12:13 UTC (permalink / raw)
  To: Brian Norris; +Cc: linux-mtd, Mike Dunn, Shmulik Ladkani, David Woodhouse

[-- Attachment #1: Type: text/plain, Size: 951 bytes --]

On Thu, 2012-03-29 at 10:30 -0700, Brian Norris wrote:
> On Sun, Mar 11, 2012 at 2:21 PM, Mike Dunn <mikedunn@newsguy.com> wrote:
> > The drivers' read methods, absent an error, return a non-negative integer
> > indicating the maximum number of bit errors that were corrected in any one
> > writesize region.  MTD returns -EUCLEAN if this is >= euclean_threshold, 0
> > otherwise.  Note that this is a subtle change to the driver interface.
> >
> > This and the preceeding patches in this set were tested ubi on top of the
> > nandsim and docg4 devices, running the ubi test io_basic from mtd-utils.
> 
> Did you test any non-NAND devices? I'm getting the feeling that some
> of this is misguided or at least needs some fixing. See below.

Brian,

you are right, these 2 should be reverted and Mike should come back with
a better patch-set. I was not careful enough, and thanks for the review.

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/4] MTD: flash drivers set ecc strength
  2012-03-29 17:24   ` Brian Norris
@ 2012-03-31  0:05     ` Mike Dunn
  2012-04-02 17:34     ` Mike Dunn
  1 sibling, 0 replies; 31+ messages in thread
From: Mike Dunn @ 2012-03-31  0:05 UTC (permalink / raw)
  To: Brian Norris; +Cc: Ivan Djelic, linux-mtd, David Woodhouse, Artem Bityutskiy

On 03/29/2012 10:24 AM, Brian Norris wrote:
> Hi Mike, Artem, David,
> 
> Sorry for the late review here. It looks like this is already queued
> up for merging soon by David...


Thanks for having a look Brian.


> (BTW, patch 1 and 2 seem to do nothing by themselves; should they
> really be merged in the 3.4 window?)


Correct; it's just cruft currently.  There is a flaw in the patchset that was
pointed out by Ivan.  Basically, the threshold should be determined based on the
ecc strength of each ecc step, not on the aggregate ecc strength of the entire
page.  Since no one is clamoring for this change, I figured I'd wait until this
merge window passed before following up.  Fixing this flaw will require patching
many individual drivers.  The scope of the original patches was limited to the
nand infrastructure code.


> 
> On Sun, Mar 11, 2012 at 2:21 PM, Mike Dunn <mikedunn@newsguy.com> wrote:
>> Flash device drivers initialize 'ecc_strength' in struct mtd_info, which is the
>> maximum number of bit errors that can be corrected in one writesize region.
>>
>> Drivers using the nand interface intitialize 'strength' in struct nand_ecc_ctrl,
>> which is the maximum number of bit errors that can be corrected in one ecc step.
>> Nand infrastructure code translates this to 'ecc_strength'.
>>
>> Also for nand drivers, the nand infrastructure code sets ecc.strength for ecc
>> modes NAND_ECC_SOFT, NAND_ECC_SOFT_BCH, and NAND_ECC_NONE.  It is set in the
>> driver for all other modes.
> 
> I agree that the other modes should be set by the driver, but is there
> any kind of sanity check? We've seen problems with the addition of the
> mtd->writebufsize parameter that didn't get caught as uninitialized in
> a large number of drivers. I don't think your later patches handle the
> case that mtd->ecc_strength is not initialized or is 0.


I'll look into adding sanity checks.


> 
>> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
>> index 1e907dc..8008853 100644
>> --- a/drivers/mtd/nand/nand_base.c
>> +++ b/drivers/mtd/nand/nand_base.c
>> @@ -3350,6 +3350,7 @@ int nand_scan_tail(struct mtd_info *mtd)
>>                if (!chip->ecc.size)
>>                        chip->ecc.size = 256;
>>                chip->ecc.bytes = 3;
>> +               chip->ecc.strength = 1;
>>                break;
>>
>>        case NAND_ECC_SOFT_BCH:
>> @@ -3384,6 +3385,8 @@ int nand_scan_tail(struct mtd_info *mtd)
>>                        pr_warn("BCH ECC initialization failed!\n");
>>                        BUG();
>>                }
>> +               chip->ecc.strength =
>> +                       chip->ecc.bytes*8 / fls(8*chip->ecc.size);
> 
> Isn't this spacing against coding style? I'd suggest spaces around the
> '*'. Also, after a few minutes, I have no idea where this calculation
> comes from. But I'm not familiar with SOFT_BCH. Maybe a comment is in
> order?
> 
>> @@ -3397,6 +3400,7 @@ int nand_scan_tail(struct mtd_info *mtd)
>>                chip->ecc.write_oob = nand_write_oob_std;
>>                chip->ecc.size = mtd->writesize;
>>                chip->ecc.bytes = 0;
>> +               chip->ecc.strength = 0;
>>                break;
> 
> This will cause problems with your patch 4, I think. I'll comment on
> the other email.
> 
> Brian
> 
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/4] MTD: drivers return max_bitflips, mtd returns -EUCLEAN
  2012-03-29 17:30   ` Brian Norris
  2012-03-30 12:13     ` Artem Bityutskiy
@ 2012-03-31  1:05     ` Mike Dunn
  2012-03-31  6:37       ` Shmulik Ladkani
  1 sibling, 1 reply; 31+ messages in thread
From: Mike Dunn @ 2012-03-31  1:05 UTC (permalink / raw)
  To: Brian Norris
  Cc: David Woodhouse, linux-mtd, Shmulik Ladkani, Artem Bityutskiy

On 03/29/2012 10:30 AM, Brian Norris wrote:
> On Sun, Mar 11, 2012 at 2:21 PM, Mike Dunn <mikedunn@newsguy.com> wrote:
>> The drivers' read methods, absent an error, return a non-negative integer
>> indicating the maximum number of bit errors that were corrected in any one
>> writesize region.  MTD returns -EUCLEAN if this is >= euclean_threshold, 0
>> otherwise.  Note that this is a subtle change to the driver interface.
>>
>> This and the preceeding patches in this set were tested ubi on top of the
>> nandsim and docg4 devices, running the ubi test io_basic from mtd-utils.
> 
> Did you test any non-NAND devices? I'm getting the feeling that some
> of this is misguided or at least needs some fixing. See below.


No, not tested on non-nand yet.


> 
>> We may want to also change the name of the macro mtd_is_bitflip(), since this is
>> not really correctly named anymore.
> 
> What sort of name change? Shouldn't EUCLEAN still mean bitflip? We are
> just masking the bitflips below a threshold. Anyway, if you have a
> better name, bring it up! I think I changed the name from my original
> choice based on Artem's suggestions, so I'm open to anything with more
> merit :)
> 
>> diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c
>> index 5bbd717..6871658 100644
>> --- a/drivers/mtd/mtdcore.c
>> +++ b/drivers/mtd/mtdcore.c
>> @@ -789,12 +789,24 @@ EXPORT_SYMBOL_GPL(mtd_get_unmapped_area);
>>  int mtd_read(struct mtd_info *mtd, loff_t from, size_t len, size_t *retlen,
>>             u_char *buf)
>>  {
>> +       int ret_code;
>>        *retlen = 0;
>>        if (from < 0 || from > mtd->size || len > mtd->size - from)
>>                return -EINVAL;
>>        if (!len)
>>                return 0;
>> -       return mtd->_read(mtd, from, len, retlen, buf);
>> +
>> +       /*
>> +        * In the absence of an error, drivers return a non-negative integer
>> +        * representing the maximum number of bitflips that were corrected on
>> +        * any one writesize region (typically a NAND page).
>> +        */
>> +       ret_code = mtd->_read(mtd, from, len, retlen, buf);
>> +
>> +       if (unlikely(ret_code < 0))
>> +               return ret_code;
>> +
>> +       return ret_code >= mtd->euclean_threshold ? -EUCLEAN : 0;
>>  }
> 
> This seems wrong in the case that mtd->euclean_threshold == 0. 


You are correct!  Thanks.  Devices with no ecc (for which euclean_threshold ==
0) will always return 0 (absent an error), which will result in mtd always
returning -EUCLEAN.  Oops! Thanks again.


> This could be the case for any of the following:
> (1) NAND that uses ECC_NONE
> (2) NAND drivers with HW_ECC that don't initialize chip->ecc.strength
> (3) MTD without ECC (e.g., NOR)
> (4) Other drivers that might have missed initializing mtd->ecc_strength
> 
> Regarding (1): although ECC_NONE is not recommended, it's possible. I
> don't think this deserves an EUCLEAN message.
> 
> Regarding (2): I notice that your nand_base.c code (patch 2) leaves
> ecc.strength initialization up to the driver. I think a sanity check
> could be provided in those cases. For instance, in the NAND_ECC_HW*
> cases, you could warn or error out on 'chip->ecc.strength == 0'.


Yes, I'll include sanity checks in next patches.


> If I'm correct, (3) is quite significant, since non-ECC'd MTDs would
> produce EUCLEAN statuses on every read.


Yup.  Again, oops!

Artem, what are your thoughts on fixing these patches, given that many nand
drivers will have to be touched?  I'm willing to see it through.

Thanks again Brian,
Mike

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/4] MTD: drivers return max_bitflips, mtd returns -EUCLEAN
  2012-03-30 12:13     ` Artem Bityutskiy
@ 2012-03-31  1:17       ` Mike Dunn
  2012-04-02  7:17         ` Artem Bityutskiy
  0 siblings, 1 reply; 31+ messages in thread
From: Mike Dunn @ 2012-03-31  1:17 UTC (permalink / raw)
  To: dedekind1; +Cc: Brian Norris, linux-mtd, Shmulik Ladkani, David Woodhouse

On 03/30/2012 05:13 AM, Artem Bityutskiy wrote:
> On Thu, 2012-03-29 at 10:30 -0700, Brian Norris wrote:
>> Did you test any non-NAND devices? I'm getting the feeling that some
>> of this is misguided or at least needs some fixing. See below.
> 
> you are right, these 2 should be reverted and Mike should come back with
> a better patch-set. I was not careful enough, and thanks for the review.


At least you were not the author.  Yes, thank you Brian.

Artem, I don't think you pushed the offending patches (3 and 4).  If you want to
leave the first two patches in, I'll be happy to fix them (they currently don't
do anything) when the whole efort is done.  Or else you can pull them and I'll
start from scratch.  Or slink away in shame :)  But I'd prefer to get this done
right.

Mike

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/4] MTD: drivers return max_bitflips, mtd returns -EUCLEAN
  2012-03-31  1:05     ` Mike Dunn
@ 2012-03-31  6:37       ` Shmulik Ladkani
  2012-04-02 16:40         ` Mike Dunn
  0 siblings, 1 reply; 31+ messages in thread
From: Shmulik Ladkani @ 2012-03-31  6:37 UTC (permalink / raw)
  To: Mike Dunn; +Cc: Brian Norris, linux-mtd, David Woodhouse, Artem Bityutskiy

Hi Mike,

On Fri, 30 Mar 2012 18:05:55 -0700 Mike Dunn <mikedunn@newsguy.com> wrote:
> > This seems wrong in the case that mtd->euclean_threshold == 0. 
> 
> > This could be the case for any of the following:
> > (1) NAND that uses ECC_NONE
> > (2) NAND drivers with HW_ECC that don't initialize chip->ecc.strength
> > (3) MTD without ECC (e.g., NOR)
> > (4) Other drivers that might have missed initializing mtd->ecc_strength
> > 
> 
> 
> > If I'm correct, (3) is quite significant, since non-ECC'd MTDs would
> > produce EUCLEAN statuses on every read.
> 
> 
> Yup.  Again, oops!
> 

Please re-consider having the 'euclean_threshold' comparison within the
NAND infrastructure (nand_base.c and clones), instead of within the
generic 'mtd_read()' wrapper, as discussed in [1].

This would inherently fix (3).

For code unification, we can have a small inlined function that does the
comparison, which may be used by nand_base.c and the clones.

[1]
http://lists.infradead.org/pipermail/linux-mtd/2012-March/040371.html
http://lists.infradead.org/pipermail/linux-mtd/2012-March/040343.html

Regards,
Shmulik

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/4] MTD: drivers return max_bitflips, mtd returns -EUCLEAN
  2012-03-31  1:17       ` Mike Dunn
@ 2012-04-02  7:17         ` Artem Bityutskiy
  2012-04-02 15:33           ` Mike Dunn
  0 siblings, 1 reply; 31+ messages in thread
From: Artem Bityutskiy @ 2012-04-02  7:17 UTC (permalink / raw)
  To: Mike Dunn; +Cc: Brian Norris, linux-mtd, Shmulik Ladkani, David Woodhouse

[-- Attachment #1: Type: text/plain, Size: 515 bytes --]

On Fri, 2012-03-30 at 18:17 -0700, Mike Dunn wrote:
> Artem, I don't think you pushed the offending patches (3 and 4).  If you want to
> leave the first two patches in, I'll be happy to fix them (they currently don't
> do anything) when the whole efort is done.  Or else you can pull them and I'll
> start from scratch.  Or slink away in shame :)  But I'd prefer to get this done
> right.

Yeah, I won't touch them so far and will just wait for incremental
patches.

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/4] MTD: drivers return max_bitflips, mtd returns -EUCLEAN
  2012-04-02  7:17         ` Artem Bityutskiy
@ 2012-04-02 15:33           ` Mike Dunn
  0 siblings, 0 replies; 31+ messages in thread
From: Mike Dunn @ 2012-04-02 15:33 UTC (permalink / raw)
  To: dedekind1; +Cc: Brian Norris, linux-mtd, Shmulik Ladkani, David Woodhouse

On 04/02/2012 12:17 AM, Artem Bityutskiy wrote:
> On Fri, 2012-03-30 at 18:17 -0700, Mike Dunn wrote:
>> Artem, I don't think you pushed the offending patches (3 and 4).  If you want to
>> leave the first two patches in, I'll be happy to fix them (they currently don't
>> do anything) when the whole efort is done.  Or else you can pull them and I'll
>> start from scratch.  Or slink away in shame :)  But I'd prefer to get this done
>> right.
> 
> Yeah, I won't touch them so far and will just wait for incremental
> patches.
> 

OK, thanks Artem.  I can start today and should have patches ready soon.

Mike

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/4] MTD: drivers return max_bitflips, mtd returns -EUCLEAN
  2012-03-31  6:37       ` Shmulik Ladkani
@ 2012-04-02 16:40         ` Mike Dunn
  2012-04-03  8:48           ` Shmulik Ladkani
  0 siblings, 1 reply; 31+ messages in thread
From: Mike Dunn @ 2012-04-02 16:40 UTC (permalink / raw)
  To: Shmulik Ladkani
  Cc: Brian Norris, linux-mtd, David Woodhouse, Artem Bityutskiy

Hi Shmulik,

On 03/30/2012 11:37 PM, Shmulik Ladkani wrote:
> Please re-consider having the 'euclean_threshold' comparison within the
> NAND infrastructure (nand_base.c and clones), instead of within the
> generic 'mtd_read()' wrapper, as discussed in [1].


This isn't possible given the intended functionality of the patches.  The plan
is to allow the user to adjust the bitflip_threshold through sysfs, and do it
uniquely for each partition.  Since the driver itself receives as its argument
the mtd_info for the master, not the partition, it can't compare the bitflip
count with the appropriate threshold to determine whether or not to return
-EUCLEAN.  Here, "driver" refers to the nand imfrastructure code or the drivers
for the two nand-with-ecc devices that don't use the nand interface
(nand/alauda.c and devices/docg3.c).

Thanks,
Mike

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/4] MTD: flash drivers set ecc strength
  2012-03-29 17:24   ` Brian Norris
  2012-03-31  0:05     ` Mike Dunn
@ 2012-04-02 17:34     ` Mike Dunn
  2012-04-03  8:03       ` Ivan Djelic
  1 sibling, 1 reply; 31+ messages in thread
From: Mike Dunn @ 2012-04-02 17:34 UTC (permalink / raw)
  To: Brian Norris; +Cc: Ivan Djelic, linux-mtd, David Woodhouse, Artem Bityutskiy

Brian,

I was just reviewing things and realized that I accidentally ignored one of your
comments...


On 03/29/2012 10:24 AM, Brian Norris wrote:
>>
>>        case NAND_ECC_SOFT_BCH:
>> @@ -3384,6 +3385,8 @@ int nand_scan_tail(struct mtd_info *mtd)
>>                        pr_warn("BCH ECC initialization failed!\n");
>>                        BUG();
>>                }
>> +               chip->ecc.strength =
>> +                       chip->ecc.bytes*8 / fls(8*chip->ecc.size);
> 
> Isn't this spacing against coding style? I'd suggest spaces around the
> '*'. Also, after a few minutes, I have no idea where this calculation
> comes from. But I'm not familiar with SOFT_BCH. Maybe a comment is in
> order?


I'm no bch expert myself (ping, Ivan), but I believe this is correct, and it is
consistant with the equations used to determine the 't' parameter (i.e.,
ecc_strength) from nand_bch_init() in drivers/mtd/nand/nand_bch.c.  (BTW,
currently only nandsim uses SOFT_BCH.)  As for coding style... honestly, I
mostly just rely on checkpatch.pl, but I'll check the style guide document.

Thanks,
Mike

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/4] MTD: flash drivers set ecc strength
  2012-04-02 17:34     ` Mike Dunn
@ 2012-04-03  8:03       ` Ivan Djelic
  0 siblings, 0 replies; 31+ messages in thread
From: Ivan Djelic @ 2012-04-03  8:03 UTC (permalink / raw)
  To: Mike Dunn; +Cc: Brian Norris, linux-mtd, David Woodhouse, Artem Bityutskiy

On Mon, Apr 02, 2012 at 06:34:12PM +0100, Mike Dunn wrote:
> On 03/29/2012 10:24 AM, Brian Norris wrote:
> >>
> >>        case NAND_ECC_SOFT_BCH:
> >> @@ -3384,6 +3385,8 @@ int nand_scan_tail(struct mtd_info *mtd)
> >>                        pr_warn("BCH ECC initialization failed!\n");
> >>                        BUG();
> >>                }
> >> +               chip->ecc.strength =
> >> +                       chip->ecc.bytes*8 / fls(8*chip->ecc.size);
> > 
> > Isn't this spacing against coding style? I'd suggest spaces around the
> > '*'. Also, after a few minutes, I have no idea where this calculation
> > comes from. But I'm not familiar with SOFT_BCH. Maybe a comment is in
> > order?
> 
> 
> I'm no bch expert myself (ping, Ivan), but I believe this is correct, and it is
> consistant with the equations used to determine the 't' parameter (i.e.,
> ecc_strength) from nand_bch_init() in drivers/mtd/nand/nand_bch.c.

Yes you are correct. Quoting nand_bch.c:

	m = fls(1+8*eccsize);
	t = (eccbytes*8)/m;

The first line computes the smallest 'm' such that
    2^m-1 > number_of_bits_in_ecc_block
hence
    m = fls(1+8*eccsize)

In practice, adding 1 to 8*eccsize has no effect on the fls() result (unless
eccsize is 0 :-), but I left it for consistency with nand_bch_init() doc.

Arguably it would have been less confusing to pass 'm' and 't' directly to nand_bch_init(),
but the present solution had the advantage of providing enough information, without
requiring any additional redundant field in mtd structures.

BR,
--
Ivan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/4] MTD: drivers return max_bitflips, mtd returns -EUCLEAN
  2012-04-02 16:40         ` Mike Dunn
@ 2012-04-03  8:48           ` Shmulik Ladkani
  2012-04-13 15:54             ` Artem Bityutskiy
  0 siblings, 1 reply; 31+ messages in thread
From: Shmulik Ladkani @ 2012-04-03  8:48 UTC (permalink / raw)
  To: Mike Dunn; +Cc: Brian Norris, linux-mtd, David Woodhouse, Artem Bityutskiy

On Mon, 02 Apr 2012 09:40:08 -0700 Mike Dunn <mikedunn@newsguy.com> wrote:
> Hi Shmulik,
> 
> On 03/30/2012 11:37 PM, Shmulik Ladkani wrote:
> > Please re-consider having the 'euclean_threshold' comparison within the
> > NAND infrastructure (nand_base.c and clones), instead of within the
> > generic 'mtd_read()' wrapper, as discussed in [1].
> 
> 
> This isn't possible given the intended functionality of the patches.  The plan
> is to allow the user to adjust the bitflip_threshold through sysfs, and do it
> uniquely for each partition.  Since the driver itself receives as its argument
> the mtd_info for the master, not the partition, it can't compare the bitflip
> count with the appropriate threshold to determine whether or not to return
> -EUCLEAN.  

Ok.

Now I understand why you submitted [l2-mtd ac5b1dd mtd: fix partition
wrapper functions]...
You rely on ac5b1dd in order for the comparison to be executed per
partition.
Reasnoable, however delicate. The reslationship between the
'euclean_threeshold' feature and this patch is implicit and not obvious.
It raises a concern that one might later change the behavior of
'part_read' to use the mtd_read wrapper for some reason, without
understanding the affects.

Anyway, looking on it in a broader perspective, can you please state
use-case examples where one might set different thresholds for different
partitions of same device?
I mean, do we really need that granularity?

Regards,
Shmulik

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/4] MTD: drivers return max_bitflips, mtd returns -EUCLEAN
  2012-04-03  8:48           ` Shmulik Ladkani
@ 2012-04-13 15:54             ` Artem Bityutskiy
  2012-04-13 18:18               ` Mike Dunn
  0 siblings, 1 reply; 31+ messages in thread
From: Artem Bityutskiy @ 2012-04-13 15:54 UTC (permalink / raw)
  To: Shmulik Ladkani; +Cc: linux-mtd, Brian Norris, Mike Dunn, David Woodhouse

[-- Attachment #1: Type: text/plain, Size: 456 bytes --]

On Tue, 2012-04-03 at 11:48 +0300, Shmulik Ladkani wrote:
> Anyway, looking on it in a broader perspective, can you please state
> use-case examples where one might set different thresholds for
> different
> partitions of same device?
> I mean, do we really need that granularity?

Current MTD partitions support is very simple and makes it difficult to
distinguished between partitions and not partitions.

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/4] MTD: drivers return max_bitflips, mtd returns -EUCLEAN
  2012-04-13 15:54             ` Artem Bityutskiy
@ 2012-04-13 18:18               ` Mike Dunn
  0 siblings, 0 replies; 31+ messages in thread
From: Mike Dunn @ 2012-04-13 18:18 UTC (permalink / raw)
  To: dedekind1; +Cc: Brian Norris, linux-mtd, Shmulik Ladkani, David Woodhouse

Sorry Shmulik, I didn't get your post again.  (Why?? I get a blizzard of emails
from various lists, but seem to lose the important ones!)

On 04/13/2012 08:54 AM, Artem Bityutskiy wrote:
> On Tue, 2012-04-03 at 11:48 +0300, Shmulik Ladkani wrote:
>> Anyway, looking on it in a broader perspective, can you please state
>> use-case examples where one might set different thresholds for
>> different
>> partitions of same device?
>> I mean, do we really need that granularity?
> 
> Current MTD partitions support is very simple and makes it difficult to
> distinguished between partitions and not partitions.

Yes, the way the mtd code is structured, I think going through the partitions is
the correct way for the sake of cleanliness alone.

BTW, I haven't forgotten this.  I hope to have patches ready in a couple days.

Thanks,
Mike

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2012-04-13 18:18 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-11 21:21 [PATCH 0/4] MTD: Change meaning of -EUCLEAN return code on reads Mike Dunn
2012-03-11 21:21 ` [PATCH 1/4] MTD: add ecc_strength fields to mtd structs Mike Dunn
2012-03-13 12:25   ` Artem Bityutskiy
2012-03-11 21:21 ` [PATCH 2/4] MTD: flash drivers set ecc strength Mike Dunn
2012-03-13 12:27   ` Artem Bityutskiy
2012-03-16 14:13   ` Ivan Djelic
2012-03-16 20:02     ` Mike Dunn
2012-03-29 17:24   ` Brian Norris
2012-03-31  0:05     ` Mike Dunn
2012-04-02 17:34     ` Mike Dunn
2012-04-03  8:03       ` Ivan Djelic
2012-03-11 21:21 ` [PATCH 3/4] MTD: euclean_threshold added to mtd_info and sysfs Mike Dunn
2012-03-13 12:29   ` Artem Bityutskiy
2012-03-11 21:21 ` [PATCH 4/4] MTD: drivers return max_bitflips, mtd returns -EUCLEAN Mike Dunn
2012-03-14 11:05   ` Shmulik Ladkani
2012-03-14 11:45     ` Shmulik Ladkani
2012-03-29 17:30   ` Brian Norris
2012-03-30 12:13     ` Artem Bityutskiy
2012-03-31  1:17       ` Mike Dunn
2012-04-02  7:17         ` Artem Bityutskiy
2012-04-02 15:33           ` Mike Dunn
2012-03-31  1:05     ` Mike Dunn
2012-03-31  6:37       ` Shmulik Ladkani
2012-04-02 16:40         ` Mike Dunn
2012-04-03  8:48           ` Shmulik Ladkani
2012-04-13 15:54             ` Artem Bityutskiy
2012-04-13 18:18               ` Mike Dunn
2012-03-13 12:03 ` [PATCH 0/4] MTD: Change meaning of -EUCLEAN return code on reads Artem Bityutskiy
2012-03-13 17:46   ` Mike Dunn
2012-03-13 22:14     ` Mike Dunn
2012-03-14 10:56     ` Artem Bityutskiy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.