* [PATCH v4 0/5] mtd: nand: properly handle bitflips in erased pages
@ 2015-12-30 19:32 Boris Brezillon
From: Boris Brezillon @ 2015-12-30 19:32 UTC
  To: David Woodhouse, Brian Norris, linux-mtd
  Cc: Franklin S Cooper Jr.,
	Maxim Levitsky, Nicolas Ferre, Jean-Christophe Plagniol-Villard,
	Alexandre Belloni, linux-kernel, Boris Brezillon

Hi,

This patch series aims at providing a common logic to check for bitflips
in erased pages.

Currently each driver is implementing its own logic to check for bitflips
in erased pages. Not only does this create code duplication, but most of
these implementations are incorrect.
Here are a few aspects that are often left aside in those implementations:
1/ they do not check OOB bytes when checking for the ff pattern, which
   means they can consider a page as empty while the MTD user actually
   wanted to write data that is almost all ff, with only a few bits set
   to zero
2/ they check for the ff pattern on the whole page, while ECC actually
   works on smaller chunks (usually 512- or 1024-byte chunks)
3/ they use arbitrary bitflip thresholds to decide whether a page/chunk is
   erased or not. IMO this threshold should be set to ECC strength (or
   at least something correlated to this parameter)

The approach taken in this series is to provide two helper functions to
check for bitflips in erased pages. Each driver that needs to check for
such cases can then call the nand_check_erased_ecc_chunk() function, and
rely on the common logic to decide whether a page is erased or not.
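
As a rough illustration of the kind of check such a helper performs, here
is a minimal userspace sketch. It is not the code from this series: the
names count_zero_bits() and check_erased_chunk() are hypothetical, and the
real helper also has to deal with extra OOB bytes. The idea is to count
zero bits over the data and ECC bytes of a single chunk, give up as soon
as the count exceeds the threshold, and otherwise restore the ff pattern.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Count zero bits in buf, stopping as soon as the count exceeds budget.
 * Returns the number of zero bits, or -EBADMSG if budget is exceeded.
 */
static int count_zero_bits(const uint8_t *buf, size_t len, int budget)
{
	int flips = 0;

	for (size_t i = 0; i < len; i++) {
		uint8_t inverted = (uint8_t)~buf[i];

		while (inverted) {
			/* clear the lowest set bit of the inverted byte */
			inverted &= (uint8_t)(inverted - 1);
			if (++flips > budget)
				return -EBADMSG;
		}
	}

	return flips;
}

/*
 * Hypothetical model of an "erased chunk" check: data and ECC bytes of
 * one ECC chunk share a single bitflip budget (threshold).
 * Returns the number of bitflips found (buffers are reset to 0xff), or
 * -EBADMSG if the chunk is not considered erased (buffers untouched).
 */
static int check_erased_chunk(uint8_t *data, size_t datalen,
			      uint8_t *ecc, size_t ecclen, int threshold)
{
	int data_flips, ecc_flips;

	assert(threshold >= 0);

	data_flips = count_zero_bits(data, datalen, threshold);
	if (data_flips < 0)
		return data_flips;

	ecc_flips = count_zero_bits(ecc, ecclen, threshold - data_flips);
	if (ecc_flips < 0)
		return ecc_flips;

	/* The chunk is treated as erased: hand back a clean ff pattern. */
	memset(data, 0xff, datalen);
	memset(ecc, 0xff, ecclen);

	return data_flips + ecc_flips;
}
```

In this sketch a driver would pass its ECC strength as the threshold and
report the returned value as the number of corrected bitflips.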

While Brian suggested a few times to make this detection automatic for
all drivers that set a specific flag (NAND_CHECK_ERASED_BITFLIPS?), here
are a few reasons why I think this is not such a good idea:
1/ some (a lot of) drivers do not properly implement the raw access
   functions, and since we need to check for raw data and OOB bytes this
   makes the automatic detection unusable for most drivers unless they
   decide to correctly implement those methods (which would be a good
   thing BTW).
2/ as I said earlier, this check should be made at the ECC chunk level
   and not at the page level. This raises two problems: some (a lot of)
   drivers do not properly specify the ecc layout information, and even
   if the ecc layout is correctly defined, there is no way to attach ECC
   bytes to a specific ECC chunk.
3/ the last aspect is the perf penalty incurred by this test. Automatically
   doing that at the NAND core level implies reading the whole page again
   in raw mode, while with the helper function approach, drivers supporting
   access at the ECC chunk level can read only the faulty chunk in raw
   mode.
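
To make the chunk-level argument more concrete, here is a simplified
userspace sketch (hypothetical names, toy chunk size, ECC/OOB bytes left
out). Only the chunks for which the ECC engine reported an uncorrectable
error are re-examined; the other chunks are never looked at again. A real
read path would typically also account for the failure in its statistics
rather than simply returning.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SIZE 4	/* toy value; real ECC chunks are 512 or 1024 bytes */

/* Number of zero bits in buf. */
static int zero_bits(const uint8_t *buf, size_t len)
{
	int n = 0;

	for (size_t i = 0; i < len; i++)
		for (int b = 0; b < 8; b++)
			n += !((buf[i] >> b) & 1);

	return n;
}

/*
 * ecc_status[i] models the result of the ECC correction for chunk i:
 * >= 0 is the number of corrected bitflips, -EBADMSG means uncorrectable.
 * Only uncorrectable chunks get the erased check. *rechecked is set to
 * the number of chunks that were examined for the erased pattern.
 * Returns the maximum number of bitflips seen in a chunk, or -EBADMSG.
 */
static int fixup_page(const uint8_t *raw, const int *ecc_status,
		      size_t nchunks, int strength, size_t *rechecked)
{
	int max_bitflips = 0;

	*rechecked = 0;

	for (size_t i = 0; i < nchunks; i++) {
		int stat = ecc_status[i];

		if (stat == -EBADMSG) {
			/* Only this chunk is examined for the erased pattern. */
			(*rechecked)++;
			stat = zero_bits(raw + i * CHUNK_SIZE, CHUNK_SIZE);
			if (stat > strength)
				return -EBADMSG;
		}

		if (stat > max_bitflips)
			max_bitflips = stat;
	}

	return max_bitflips;
}
```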

Regarding the bitflips threshold at which an erased page is considered as
faulty, I have assigned it to ECC strength. As mentioned by Andrea, using
ECC strength might cause some trouble, because if you already have some
bitflips in an erased page, programming it might generate even more of
them.
On the other hand, shouldn't that be checked after (or before) programming
a page? I mean, UBI is already capable of detecting pages which are over
the configured bitflips_threshold and moving data around when it detects
such pages.
If we check data after writing a page we wouldn't have to bother about
setting a weaker value for the "bitflips in erased page" case.
Another thing in favor of the ECC strength value for this "bitflips in
erased page" threshold value: if the ECC engine is generating 0xff ECC
bytes when the page is empty, then it will be able to fix ECC strength
bitflips without complaining, so why should we use a different value when
we detect bitflips using the pattern match approach?

Best Regards,

Boris

Changes since v3:
- drop already applied patches
- make the generic "bitflips in erased pages" check opt-in (via a flag)
- split driver changes to ease review
- addressed Brian's comments

Changes since v2:
- improve nand_check_erased_buf() implementation
- keep nand_check_erased_buf() private to nand_base.c
- patch existing ecc.correct() implementations to return consistent error
  codes
- make the 'erased check' optional
- remove some custom implementations of the 'erased check'

Changes since v1:
- fix the nand_check_erased_buf() function
- mark the bitflips > bitflips_threshold condition as unlikely
- add missing memsets in nand_check_erased_ecc_chunk()


Boris Brezillon (5):
  mtd: nand: return consistent error codes in ecc.correct()
    implementations
  mtd: nand: use nand_check_erased_ecc_chunk in default ECC read
    functions
  mtd: nand: davinci: remove custom 'erased check' implementation
  mtd: nand: diskonchip: remove custom 'erased check' implementation
  mtd: nand: jz4740: remove custom 'erased check' implementation

 drivers/mtd/nand/atmel_nand.c   |  2 +-
 drivers/mtd/nand/bf5xx_nand.c   | 20 +++++++++++-----
 drivers/mtd/nand/davinci_nand.c | 15 ++++--------
 drivers/mtd/nand/diskonchip.c   | 37 ++--------------------------
 drivers/mtd/nand/jz4740_nand.c  | 22 ++---------------
 drivers/mtd/nand/mxc_nand.c     |  4 ++--
 drivers/mtd/nand/nand_base.c    | 53 +++++++++++++++++++++++++++++++++++------
 drivers/mtd/nand/nand_bch.c     |  2 +-
 drivers/mtd/nand/nand_ecc.c     |  2 +-
 drivers/mtd/nand/omap2.c        |  6 ++---
 drivers/mtd/nand/r852.c         |  4 ++--
 include/linux/mtd/nand.h        | 18 +++++++++++++-
 include/linux/mtd/nand_bch.h    |  2 +-
 13 files changed, 96 insertions(+), 91 deletions(-)

-- 
2.1.4



end of thread, other threads:[~2016-01-07  3:02 UTC | newest]

Thread overview: 13+ messages
2015-12-30 19:32 [PATCH v4 0/5] mtd: nand: properly handle bitflips in erased pages Boris Brezillon
2015-12-30 19:32 ` [PATCH v4 1/5] mtd: nand: return consistent error codes in ecc.correct() implementations Boris Brezillon
2016-01-07  2:48   ` Brian Norris
2015-12-30 19:32 ` [PATCH v4 2/5] mtd: nand: use nand_check_erased_ecc_chunk in default ECC read functions Boris Brezillon
2016-01-07  2:50   ` Brian Norris
2015-12-30 19:32 ` [PATCH v4 3/5] mtd: nand: davinci: remove custom 'erased check' implementation Boris Brezillon
2016-01-07  2:55   ` Brian Norris
2015-12-30 19:32 ` [PATCH v4 4/5] mtd: nand: diskonchip: " Boris Brezillon
2015-12-30 19:39   ` [PATCH v5 " Boris Brezillon
2015-12-30 19:32 ` [PATCH v4 5/5] mtd: nand: jz4740: " Boris Brezillon
2015-12-30 19:41   ` [PATCH v5 " Boris Brezillon
2015-12-30 23:34 ` [PATCH v4 0/5] mtd: nand: properly handle bitflips in erased pages Franklin S Cooper Jr.
2016-01-07  3:02 ` Brian Norris
