* [PATCH V1] mtd: core: Micron SLC NAND filling block
@ 2018-12-19 11:03 Bean Huo (beanhuo)
  2018-12-19 13:17 ` Boris Brezillon
  2018-12-19 14:27 ` Richard Weinberger
  0 siblings, 2 replies; 13+ messages in thread
From: Bean Huo (beanhuo) @ 2018-12-19 11:03 UTC (permalink / raw)
  To: linux-mtd, Miquel Raynal, Richard Weinberger,
	Zoltan Szubbocsev (zszubbocsev),
	tglx
  Cc: Bean Huo (beanhuo)

On some legacy planar 2D Micron NAND devices, when a
block erase command is issued, the flash block may
occasionally not be completely erased, even though the
erase operation completes successfully and returns a
pass status. In very rare cases, subsequent operations
on such a block can result in subtle failures or corruption.
These extremely rare cases should nevertheless be considered.

These rare occurrences have been observed on partially written
blocks. Partially written blocks are not uncommon with UBI/UBIFS.

To avoid this rare occurrence, we make sure that at least 15 pages
have been programmed to a block before it is erased. If we find
that fewer than 15 pages have been programmed, additional pages
are programmed in the block. The observation is that additional
pages rarely need to be written, and most of the time UBI/UBIFS
erases blocks that contain more programmed pages.
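[Editor's note: as an illustration of the scan-and-fill decision described above, here is a small userspace C sketch. The scan order (page 0 first, then pages 13, 11, ..., 3) mirrors the patch below; `pages_to_fill()` and `LAST_CHECK_PAGE` are hypothetical names used only for this sketch, not part of the patch.]

```c
#include <assert.h>
#include <stdint.h>

#define LAST_CHECK_PAGE 13 /* hypothetical stand-in for the patch's LAST_CHECKPU_PAGE */

/*
 * Given a bitmask of already-programmed pages (bit n set == page n is
 * programmed), compute which pages the workaround would fill before the
 * erase. Page 0 is checked first; if it is programmed, pages 13, 11,
 * ..., 3 are checked next. Returns the mask of pages to program, or 0
 * when no filling is needed.
 */
static uint32_t pages_to_fill(uint32_t programmed_mask)
{
	uint32_t empty_mask = 0;
	int cur_page = 0, nextpage = LAST_CHECK_PAGE;

	for (;;) {
		if (!(programmed_mask & (1u << cur_page))) {
			if (cur_page == 0)
				return 0; /* block never written: nothing to do */
			empty_mask |= 1u << cur_page; /* fill this page later */
		} else if (nextpage != LAST_CHECK_PAGE) {
			break; /* found the last programmed page, stop scanning */
		}
		cur_page = nextpage;
		if (nextpage > 1)
			nextpage = (nextpage % 2) ? nextpage - 2 : nextpage - 1;
		else
			break;
	}

	if (!empty_mask)
		return 0;
	/* pages 0 and 1 (EC/VID headers) are always overwritten as well */
	return empty_mask | 0x3;
}
```

For a block with pages 0-8 programmed (mask 0x1FF), the sketch marks pages 13, 11 and 9 as empty and adds pages 0 and 1, i.e. the workaround programs five extra pages before the erase.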

Signed-off-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Zoltan Szubbocsev <zszubbocsev@micron.com>
---
 drivers/mtd/mtdcore.c | 125 ++++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/mtd/mtdcore.h |  16 +++++++
 2 files changed, 141 insertions(+)

diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c
index 8bbbb75..b3879b5 100644
--- a/drivers/mtd/mtdcore.c
+++ b/drivers/mtd/mtdcore.c
@@ -801,6 +801,127 @@ void __put_mtd_device(struct mtd_info *mtd)
 }
 EXPORT_SYMBOL_GPL(__put_mtd_device);
 
+static int mtd_check_pattern(const void *buf, uint8_t patt, int size)
+{
+	const uint8_t *p = buf;
+	int i;
+
+	for (i = 0; i < size; i++)
+		if (p[i] != patt)
+			return 0;
+	return 1;
+}
+
+static int checkup_partial_filling(struct mtd_info *mtd, struct erase_info *instr)
+{
+	size_t retlen;
+	int ret = 0;
+	u_char *data_buf, *oob_buf;
+	uint32_t empty_page_mask = 0;
+	uint32_t programmed_page = 0;
+	loff_t addr;
+	int nextpage = LAST_CHECKPU_PAGE; /* The highest page to check is page 13;
+					     the first page is page 0. */
+	int cur_page = 0;
+
+	if ((mtd->type != MTD_NANDFLASH) && (mtd->type != MTD_MLCNANDFLASH))
+		return 0; /* Only applies to NAND */
+
+	data_buf = kmalloc(mtd->writesize, GFP_KERNEL);
+	oob_buf = kmalloc(mtd->oobsize, GFP_KERNEL);
+	if (!data_buf || !oob_buf)
+		goto end; /* kfree(NULL) below is safe */
+
+	memset(data_buf, 0xff, mtd->writesize);
+	memset(oob_buf, 0x00, mtd->oobsize);
+	addr = instr->addr; /* this is the address of page 0 */
+
+	while (nextpage > 0) {
+		ret = mtd_read(mtd, addr, mtd->writesize, &retlen, data_buf);
+		if (ret || retlen != mtd->writesize)
+			goto next; /* Read error; for now, simply skip this page */
+
+		if (mtd_check_pattern(data_buf, 0xff, retlen)) {
+			if (cur_page == 0)
+				break; /* page 0 is empty, skip the block filling checkup */
+
+			/* Mark the page to be programmed later */
+			empty_page_mask |= (1 << cur_page);
+		} else {
+			programmed_page |= (1 << cur_page);
+			/*
+			 * Page 0 is checked first, so by now nextpage is
+			 * page 13. If page 0 is programmed, we go on to check
+			 * pages 13, 11, 9, ..., 3, in that order: only odd
+			 * pages are checked. So if nextpage is no longer
+			 * page 13 and cur_page is programmed, cur_page is the
+			 * last programmed page, since the pages of a PEB are
+			 * programmed sequentially.
+			 */
+			if (LAST_CHECKPU_PAGE != nextpage)
+				break;
+		}
+next:
+		addr = instr->addr + mtd->writesize * nextpage;
+		cur_page = nextpage;
+		if (nextpage > 1)
+			nextpage = ((nextpage % 2) ? (nextpage - 2) : (nextpage - 1));
+		else
+			break;
+	}
+
+	if (empty_page_mask == 0x00)
+		goto end;
+
+	int i;
+	struct ubifs_filling_head *filling = (struct ubifs_filling_head *)data_buf;
+	int pad = mtd->writesize - UBIFS_PEB_PAD_NODE_SZ;
+
+	filling->magic = 0x00000000;
+	filling->sqnum = 0;
+	filling->node_type = 0xCE;
+	filling->group_type = 0;
+	filling->padding[0] = filling->padding[1] = 0xAA;
+	filling->len = cpu_to_le32(UBIFS_PEB_PAD_NODE_SZ);
+	filling->pad_len = cpu_to_le32(pad);
+	filling->crc = 0x00000000;
+	memset(data_buf + UBIFS_PEB_PAD_NODE_SZ, 0xAA, pad);
+
+	struct mtd_oob_ops ops;
+
+	ops.mode = MTD_OPS_RAW;
+	ops.len = 0;
+	ops.ooblen = mtd->oobsize - 16;
+	ops.ooboffs = 16;
+	ops.datbuf = NULL;
+	ops.oobbuf = oob_buf;
+
+	/* Always overwrite the EC and VID pages with wrong EC/VID magic bytes. */
+	empty_page_mask |= 0x3;
+
+	/*
+	 * Program the filling data into the empty pages; apart from
+	 * pages 0 and 1, only odd pages are programmed.
+	 */
+	for (i = 0; i <= LAST_CHECKPU_PAGE; ) {
+		if (empty_page_mask & (1 << i)) {
+			addr = instr->addr + mtd->writesize * i;
+			ret = mtd_write(mtd, addr, mtd->writesize, &retlen, data_buf);
+			if (i == 0 || i == 1)
+				ret = mtd->_write_oob(mtd, addr, &ops); /* clear the OOB */
+		}
+		if (i == 0)
+			i = 1;
+		else
+			i += 2;
+	}
+end:
+	kfree(data_buf);
+	kfree(oob_buf);
+	return 0;
+}
 /*
  * Erase is an asynchronous operation.  Device drivers are supposed
  * to call instr->callback() whenever the operation completes, even
@@ -820,6 +941,10 @@ int mtd_erase(struct mtd_info *mtd, struct erase_info *instr)
 		mtd_erase_callback(instr);
 		return 0;
 	}
+
+#ifdef PARTIAL_FILLING_CHECKUP
+	checkup_partial_filling(mtd, instr);
+#endif
 	return mtd->_erase(mtd, instr);
 }
 EXPORT_SYMBOL_GPL(mtd_erase);
diff --git a/drivers/mtd/mtdcore.h b/drivers/mtd/mtdcore.h
index 7b03533..0e5ad67 100644
--- a/drivers/mtd/mtdcore.h
+++ b/drivers/mtd/mtdcore.h
@@ -5,6 +5,22 @@
 
 extern struct mutex mtd_table_mutex;
 
+#define PARTIAL_FILLING_CHECKUP 1 /* Turn on/off the partial filling block
+				     checkup before erasing the block. */
+#define LAST_CHECKPU_PAGE 13
+struct ubifs_filling_head {
+	__le32 magic;
+	__le32 crc;
+	__le64 sqnum;
+	__le32 len;
+	__u8 node_type;
+	__u8 group_type;
+	__u8 padding[2];
+	__le32 pad_len;
+} __packed;
+
+#define UBIFS_PEB_PAD_NODE_SZ  sizeof(struct ubifs_filling_head)
+
 struct mtd_info *__mtd_next_device(int i);
 int add_mtd_device(struct mtd_info *mtd);
 int del_mtd_device(struct mtd_info *mtd);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH V1] mtd: core: Micron SLC NAND filling block
  2018-12-19 11:03 [PATCH V1] mtd: core: Micron SLC NAND filling block Bean Huo (beanhuo)
@ 2018-12-19 13:17 ` Boris Brezillon
  2018-12-19 13:59   ` [EXT] " Zoltan Szubbocsev (zszubbocsev)
  2018-12-19 14:27 ` Richard Weinberger
  1 sibling, 1 reply; 13+ messages in thread
From: Boris Brezillon @ 2018-12-19 13:17 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: linux-mtd, Miquel Raynal, Richard Weinberger,
	Zoltan Szubbocsev (zszubbocsev),
	tglx

Hi Bean

On Wed, 19 Dec 2018 11:03:06 +0000
"Bean Huo (beanhuo)" <beanhuo@micron.com> wrote:

> On some legacy planar 2D Micron NAND devices when a
> block erase command is issued, occasionally even though
> a block erase operation successfully completes and returns
> a pass status, the flash block may not be completely erased.
> Subsequent operations to this block on very rare cases can
> result in subtle failures or corruption. These extremely rare
> cases should nevertheless be considered.
> 
> These rare occurrences have been observed on partially written
> blocks. Partially written blocks are not uncommon with UBI/UBIFS. 
> 
> To avoid this rare occurrence, we make sure that at least 15 pages
> have been programmed to a block before it is erased. In case we
> find that less than 15 pages have been programmed, additional
> pages are programmed in the block. The observation is that additional
> pages rarely need to be written and most of the time UBI/UBIFS erases
> blocks that contain more programmed pages. 

Okay, so 15 is the right number :-). Can you tell us which parts are
potentially impacted by this problem?

Regarding the patch itself, it breaks the layering we have in
MTD/UBI/UBIFS and does not address the non-UBI/UBIFS case, so I don't
think we'll go for this approach. But it's good to have Micron's
feedback on this issue.

Thanks,

Boris


* RE: [EXT] Re: [PATCH V1] mtd: core: Micron SLC NAND filling block
  2018-12-19 13:17 ` Boris Brezillon
@ 2018-12-19 13:59   ` Zoltan Szubbocsev (zszubbocsev)
  2018-12-19 14:17     ` Miquel Raynal
  0 siblings, 1 reply; 13+ messages in thread
From: Zoltan Szubbocsev (zszubbocsev) @ 2018-12-19 13:59 UTC (permalink / raw)
  To: Boris Brezillon, Bean Huo (beanhuo)
  Cc: linux-mtd, Miquel Raynal, Richard Weinberger, tglx

Boris,

Are there non-UBI/UBIFS cases?
Can you please be specific?
Zoltan


* Re: [EXT] Re: [PATCH V1] mtd: core: Micron SLC NAND filling block
  2018-12-19 13:59   ` [EXT] " Zoltan Szubbocsev (zszubbocsev)
@ 2018-12-19 14:17     ` Miquel Raynal
  2018-12-19 17:19       ` Bean Huo (beanhuo)
  0 siblings, 1 reply; 13+ messages in thread
From: Miquel Raynal @ 2018-12-19 14:17 UTC (permalink / raw)
  To: Zoltan Szubbocsev (zszubbocsev)
  Cc: Boris Brezillon, Bean Huo (beanhuo), linux-mtd, Richard Weinberger, tglx

Hi Zoltan,

"Zoltan Szubbocsev (zszubbocsev)" <zszubbocsev@micron.com> wrote on
Wed, 19 Dec 2018 13:59:49 +0000:

> Boris,
> 
> Are there  non UBI/UBIFS cases?
> Can you please be specific?

UBI is one user of MTD.

What about direct accesses to MTD devices? Squashfs?

In any case, I don't think it is exaggerated to say that Linux has been
built on "layering". This issue is NAND chip specific, hence must be
handled at the NAND core level. In the NAND erase path I would like to
see a call to a vendor specific hook that:
1/ checks if the part is affected by comparing its ID to a list,
2/ if yes, checks if the block must be filled before erasing.

The logic of 1/ and 2/ should probably reside in
drivers/mtd/nand/raw/nand_micron.c


Thanks,
Miquèl
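[Editor's note: Miquel's two-step proposal could be modeled, very roughly, as below. This is a userspace sketch only; every name in it — `fake_chip`, `pre_erase_fixup`, the affected-ID list — is hypothetical and merely illustrates the shape of a vendor hook, not the actual nand_micron.c API.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical chip model with an optional vendor pre-erase hook. */
struct fake_chip {
	unsigned char id[2]; /* manufacturer, device */
	bool (*pre_erase_fixup)(struct fake_chip *chip, unsigned int block);
};

/* 1/ Compare the part's ID against a (made-up) list of affected parts. */
static const unsigned char affected_ids[][2] = {
	{ 0x2c, 0xda }, /* example ID, for illustration only */
};

static bool chip_is_affected(const struct fake_chip *chip)
{
	size_t i;

	for (i = 0; i < sizeof(affected_ids) / sizeof(affected_ids[0]); i++)
		if (!memcmp(chip->id, affected_ids[i], 2))
			return true;
	return false;
}

/* 2/ Vendor-specific fixup, installed only for affected parts. */
static bool micron_fill_before_erase(struct fake_chip *chip, unsigned int block)
{
	(void)chip;
	(void)block;
	/* the real code would check the block and fill pages here */
	return true;
}

static void vendor_init(struct fake_chip *chip)
{
	if (chip_is_affected(chip))
		chip->pre_erase_fixup = micron_fill_before_erase;
}

static int erase_block(struct fake_chip *chip, unsigned int block)
{
	/* only affected parts pay the extra cost in the erase path */
	if (chip->pre_erase_fixup)
		chip->pre_erase_fixup(chip, block);
	/* ... issue the actual erase ... */
	return 0;
}
```

Unaffected chips never install the hook, so the erase path stays unchanged for them; that is the layering point being made here.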


* Re: [PATCH V1] mtd: core: Micron SLC NAND filling block
  2018-12-19 11:03 [PATCH V1] mtd: core: Micron SLC NAND filling block Bean Huo (beanhuo)
  2018-12-19 13:17 ` Boris Brezillon
@ 2018-12-19 14:27 ` Richard Weinberger
  2018-12-19 18:11   ` [EXT] " Bean Huo (beanhuo)
  1 sibling, 1 reply; 13+ messages in thread
From: Richard Weinberger @ 2018-12-19 14:27 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: linux-mtd, Miquel Raynal, Zoltan Szubbocsev (zszubbocsev), tglx

Am Mittwoch, 19. Dezember 2018, 12:03:06 CET schrieb Bean Huo (beanhuo):
> +#define PARTIAL_FILLING_CHECKUP 1 /* Used to turn on/off partial filling
> +		 block checkup before erasing this block.*/
> +#define LAST_CHECKPU_PAGE 13
> +struct ubifs_filling_head {
> +	__le32 magic;
> +	__le32 crc;
> +	__le64 sqnum;
> +	__le32 len;
> +	__u8 node_type;
> +	__u8 group_type;
> +	__u8 padding[2];
> +	__le32 pad_len;
> +} __packed;

I don't think this is needed anymore.

If you overwrite (sub)pages 0 and 1, the UBI EC and VID headers are gone.
In case of a power-cut before the erase operation, UBI will detect the
corrupted EC and VID headers and re-erase the block.
For a fastmap attach the story is different: fastmap does not track
unmap/erase operations, so it will not notice that the block was scheduled
for erase and got corrupted by overwriting the first pages.
Users on top of UBIFS will face either ECC errors or UBIFS complaints
about corrupted empty space.
This fastmap issue got fixed with:
781932375ffc ("ubi: fastmap: Correctly handle interrupted erasures in EBA")

So your ubifs_filling_head struct seems to work around an already fixed UBI
bug. :-)

Thanks,
//richard


* RE: [EXT] Re: [PATCH V1] mtd: core: Micron SLC NAND filling block
  2018-12-19 14:17     ` Miquel Raynal
@ 2018-12-19 17:19       ` Bean Huo (beanhuo)
  2018-12-19 18:31         ` Richard Weinberger
  2018-12-19 18:46         ` Boris Brezillon
  0 siblings, 2 replies; 13+ messages in thread
From: Bean Huo (beanhuo) @ 2018-12-19 17:19 UTC (permalink / raw)
  To: Miquel Raynal, Zoltan Szubbocsev (zszubbocsev), Boris Brezillon
  Cc: linux-mtd, Richard Weinberger, tglx

Hi Boris and Miquel,
Thanks for your email.
The patch is based on v4.2, which does not yet split vendor-specific NAND code
out into per-vendor driver files. I will rebase my patch onto the latest Linux
kernel and implement a vendor-specific hook function in nand_micron.c as you
suggested. Once it is well tested, I will send version 2.

//Bean Huo

>> Boris,
>>
>> Are there  non UBI/UBIFS cases?
>> Can you please be specific?
>
>UBI is one user of MTD.
>
>What about direct accesses to MTD devices? Squashfs?
>
>In any case, I don't think it is exaggerated to say that Linux has been built on
>"layering". This issue is NAND chip specific, hence must be handled at the
>NAND core level. In the NAND erase path I would like to see a call to a vendor
>specific hook that:
>1/ checks if the part is affected by comparing its ID to a list, 2/ if yes, checks if
>the block must be filled before erasing.
>
>The logic of 1/ and 2/ should probably reside in
>drivers/mtd/nand/raw/nand_micron.c
>
>
>Thanks,
>Miquèl


* RE: [EXT] Re: [PATCH V1] mtd: core: Micron SLC NAND filling block
  2018-12-19 14:27 ` Richard Weinberger
@ 2018-12-19 18:11   ` Bean Huo (beanhuo)
  2018-12-19 18:28     ` Richard Weinberger
                       ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Bean Huo (beanhuo) @ 2018-12-19 18:11 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: linux-mtd, Miquel Raynal, Zoltan Szubbocsev (zszubbocsev),
	tglx, Boris Brezillon

Hi, Richard
thanks for your review.

>Am Mittwoch, 19. Dezember 2018, 12:03:06 CET schrieb Bean Huo (beanhuo):
>> +#define PARTIAL_FILLING_CHECKUP 1 /* Used to turn on/off partial filling
>> +		 block checkup before erasing this block.*/ #define
>> +LAST_CHECKPU_PAGE 13 struct ubifs_filling_head {
>> +	__le32 magic;
>> +	__le32 crc;
>> +	__le64 sqnum;
>> +	__le32 len;
>> +	__u8 node_type;
>> +	__u8 group_type;
>> +	__u8 padding[2];
>> +	__le32 pad_len;
>> +} __packed;
>
>I don't think this is needed anymore.
>

Thanks, I agree with you, and will be removed in next version.

>If you overwrite (sub)pages 0 and 1, UBI EC and VID headers are gone.
>In case of a power-cut before the erase operation, UBI will detect the
>corrupted EC and VID headers and re-erase the block.

This is actually what I want: always overwrite the EC and VID headers so that
UBI re-erases this block simply because it contains my filling data (resulting
from a power-cut after the pages were filled). Otherwise, on the next boot,
during the attach stage, UBIFS would consider this a corrupted block.

Let me take an example: say the last programmed page is page 8. If a power-cut
hits after the filling pages but before mtd->_erase, the block will have the
data layout below:

Page0:  EC header
Page1:  VID header
Page2:  not empty, valid data
Page3:  not empty, valid data
Page4:  not empty, valid data
Page5:  not empty, valid data
Page6:  not empty, valid data
Page7:  not empty, valid data
Page8:  not empty, valid data

Page9:  filled with padding data by the patch
Page10: empty, kept all 0xff
Page11: filled with padding data by the patch
Page12: empty, kept all 0xff
Page13: filled with padding data by the patch

With an older UBIFS (e.g. v4.2), this results in an attach failure. With the
latest UBIFS, will this condition still happen if I don't overwrite the EC
and VID headers?
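[Editor's note: a toy model of why killing the EC/VID headers is enough. The magic values are UBI's on-flash header magics from drivers/mtd/ubi/ubi-media.h; the `classify()` logic is a drastic simplification of UBI's attach scan, used only to illustrate the three outcomes discussed here.]

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGES_PER_BLOCK 14
#define EC_MAGIC  0x55424923u /* "UBI#", UBI's on-flash EC header magic */
#define VID_MAGIC 0x55424921u /* "UBI!", UBI's on-flash VID header magic */

enum block_state { BLOCK_VALID, BLOCK_EMPTY, BLOCK_NEEDS_ERASE };

/* Toy block: only the magic word at the start of each page is modeled. */
struct toy_block {
	uint32_t page_magic[PAGES_PER_BLOCK]; /* 0xffffffff == erased page */
};

static enum block_state classify(const struct toy_block *b)
{
	if (b->page_magic[0] == 0xffffffffu)
		return BLOCK_EMPTY; /* never written: attach treats it as free */
	if (b->page_magic[0] != EC_MAGIC || b->page_magic[1] != VID_MAGIC)
		return BLOCK_NEEDS_ERASE; /* corrupted headers: just re-erase */
	return BLOCK_VALID;
}
```

In the power-cut layout above, the workaround has overwritten pages 0 and 1 with non-magic data, so the block falls into the "corrupted headers, re-erase" bucket rather than being mistaken for a valid but corrupted PEB.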

>For a fastmap attach the story is different, it does not track unmap/erase
>operations. Therefore it will not notice that the block was scheduled for erase
>and got corrupted by overwriting the first pages.
>Users on top of UBIFS will face either ECC errors or UBIFS complains about
>corrupted empty space.
>This fastmap issue got fixed with:
>781932375ffc ("ubi: fastmap: Correctly handle interrupted erasures in EBA")
>
>So your ubifs_filling_head struct seems to work around an already fixed UBI
>bug. :-)
>

Thanks for the info.
The "ubi: fastmap: Correctly handle interrupted erasures in EBA" patch was merged in v4.18.
So I still need to rebase my patch onto the latest kernel first; then let's have a look.

>Thanks,
>//richard
>

//Bean


* Re: [EXT] Re: [PATCH V1] mtd: core: Micron SLC NAND filling block
  2018-12-19 18:11   ` [EXT] " Bean Huo (beanhuo)
@ 2018-12-19 18:28     ` Richard Weinberger
  2018-12-19 18:44     ` Boris Brezillon
  2018-12-19 22:29     ` Thomas Gleixner
  2 siblings, 0 replies; 13+ messages in thread
From: Richard Weinberger @ 2018-12-19 18:28 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: linux-mtd, Miquel Raynal, Zoltan Szubbocsev (zszubbocsev),
	tglx, Boris Brezillon

Bean,

Am Mittwoch, 19. Dezember 2018, 19:11:25 CET schrieb Bean Huo (beanhuo):
> >If you overwrite (sub)pages 0 and 1, UBI EC and VID headers are gone.
> >In case of a power-cut before the erase operation, UBI will detect the
> >corrupted EC and VID headers and re-erase the block.
> 
> This is actually what I want. Means always overwrite EC and VID headers, in order to 
> let UBI to re-erase this block just because of this block contains my filling data (resulting from one power-cut after filling pages).
> Otherwise, for the next boot, during attach stage, UBIFS will consider this block to be a corrupted block.
> 
> let me take one example, eg, the last programmed page is page8, and after the filling pages, 
> before mtd->_erase, power-cut hits, so in this block, there will be below data layout, 
> 
> Page0 : EC header
> Page1: VID header
> Page2: not empty, valid data 
> Page3: not empty, valid data
> Page4: not empty, valid data
> Page5: not empty, valid data
> Page6: not empty, valid data
> Page7: not empty, valid data
> Page8: not empty, valid data
> 
> Page9: filled with padding data by the patch
> Page10: empty, keep all 0xff
> Page11: filled with padding data by the patch
> Page12: empty, keep all 0xff
> Page13: filled with padding data by the patch
> 
> As for the older UBIFS (eg, v4.2), there will result in attach failure, but for the latest
> UBIFS, if I don't overwrite EC and VID header, will this condition happen?

Sorry I didn't make myself clear. It is perfectly fine to kill both EC and VID headers.
This is actually what we want.

Thanks,
//richard


* Re: [EXT] Re: [PATCH V1] mtd: core: Micron SLC NAND filling block
  2018-12-19 17:19       ` Bean Huo (beanhuo)
@ 2018-12-19 18:31         ` Richard Weinberger
  2018-12-19 18:46         ` Boris Brezillon
  1 sibling, 0 replies; 13+ messages in thread
From: Richard Weinberger @ 2018-12-19 18:31 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: Miquel Raynal, Zoltan Szubbocsev (zszubbocsev),
	Boris Brezillon, linux-mtd, tglx

Bean,

On Wednesday, 19 December 2018 at 18:19:43 CET, Bean Huo (beanhuo) wrote:
> Hi Boris and Miquel,
> Thanks for your email.
> Since the patch is based on v4.2, which doesn't split vendor-specific NAND
> support into per-vendor driver files, I will rebase my patch onto the latest Linux kernel
> and implement a specific hook function in nand_micron.c as you suggested.
> Once it tests well, I will send version 2.

Please also check your patch using checkpatch.pl to make sure it matches the kernel coding style.
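A typical invocation, for reference (a sketch: the patch filename here is hypothetical, and the command assumes you are at the top of a kernel source tree):

```shell
# --strict enables the stricter style checks reviewers often expect.
# The patch filename is illustrative only.
./scripts/checkpatch.pl --strict 0001-mtd-core-Micron-SLC-NAND-filling-block.patch
```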

Can we also limit the workaround to a subset of Micron NANDs?
Enabling it for every kind of Micron NAND seems overkill to me.
Maybe you can make "legacy planar 2D Micron NAND devices" a bit more precise. :-)

Thanks,
//richard


* Re: [EXT] Re: [PATCH V1] mtd: core: Micron SLC NAND filling block
  2018-12-19 18:11   ` [EXT] " Bean Huo (beanhuo)
  2018-12-19 18:28     ` Richard Weinberger
@ 2018-12-19 18:44     ` Boris Brezillon
  2018-12-19 22:29     ` Thomas Gleixner
  2 siblings, 0 replies; 13+ messages in thread
From: Boris Brezillon @ 2018-12-19 18:44 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: Richard Weinberger, linux-mtd, Miquel Raynal,
	Zoltan Szubbocsev (zszubbocsev),
	tglx

Hi Bean,

On Wed, 19 Dec 2018 18:11:25 +0000
"Bean Huo (beanhuo)" <beanhuo@micron.com> wrote:

> The "ubi: fastmap: Correctly handle interrupted erasures in EBA" patch has been merged since v4.18.
> So I still need to rebase my patch onto the latest kernel first; then let's have a look.

Before you even consider porting this patch, can you please give Thomas'
patch a try (maybe after adjusting the number of blocks to overwrite and
the position of these blocks)? As Miquel said in his reply, we should
handle that at the rawnand level (nand_{base,micron}.c), not in
mtdcore.c.

Regards,

Boris


* Re: [EXT] Re: [PATCH V1] mtd: core: Micron SLC NAND filling block
  2018-12-19 17:19       ` Bean Huo (beanhuo)
  2018-12-19 18:31         ` Richard Weinberger
@ 2018-12-19 18:46         ` Boris Brezillon
  1 sibling, 0 replies; 13+ messages in thread
From: Boris Brezillon @ 2018-12-19 18:46 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: Miquel Raynal, Zoltan Szubbocsev (zszubbocsev),
	Richard Weinberger, tglx, linux-mtd

On Wed, 19 Dec 2018 17:19:43 +0000
"Bean Huo (beanhuo)" <beanhuo@micron.com> wrote:

> Hi Boris and Miquel,
> Thanks for your email.
> Since the patch is based on v4.2, which doesn't split vendor-specific NAND
> support into per-vendor driver files, I will rebase my patch onto the latest Linux kernel
> and implement a specific hook function in nand_micron.c as you suggested.
> Once it tests well, I will send version 2.
> 
> 

Oops, didn't see this reply. Please ignore my previous email.


* RE: [EXT] Re: [PATCH V1] mtd: core: Micron SLC NAND filling block
  2018-12-19 18:11   ` [EXT] " Bean Huo (beanhuo)
  2018-12-19 18:28     ` Richard Weinberger
  2018-12-19 18:44     ` Boris Brezillon
@ 2018-12-19 22:29     ` Thomas Gleixner
  2019-01-02 13:34       ` Bean Huo (beanhuo)
  2 siblings, 1 reply; 13+ messages in thread
From: Thomas Gleixner @ 2018-12-19 22:29 UTC (permalink / raw)
  To: Bean Huo (beanhuo)
  Cc: Richard Weinberger, linux-mtd, Miquel Raynal,
	Zoltan Szubbocsev (zszubbocsev),
	Boris Brezillon

Bean,

On Wed, 19 Dec 2018, Bean Huo (beanhuo) wrote:
>
> With an older UBIFS (e.g. on v4.2), this results in an attach failure;
> with the latest UBIFS, if I don't overwrite the EC and VID headers,
> can this condition still happen?

v4.2 is a dead kernel. Nobody cares about that.

The not so dead LTS kernels should have the UBIFS fixes backported because
this problem can happen independent of the Micron specific issue. If the
LTS kernels lack those backports, then they need to be done independent of
this anyway.

In general patches against the head of development are the proper way to
go. Backporting is a different problem and handled after the fixes have
been established.

Thanks,

	tglx


* RE: [EXT] Re: [PATCH V1] mtd: core: Micron SLC NAND filling block
  2018-12-19 22:29     ` Thomas Gleixner
@ 2019-01-02 13:34       ` Bean Huo (beanhuo)
  0 siblings, 0 replies; 13+ messages in thread
From: Bean Huo (beanhuo) @ 2019-01-02 13:34 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Richard Weinberger, linux-mtd, Miquel Raynal,
	Zoltan Szubbocsev (zszubbocsev),
	Boris Brezillon

Hi Thomas,
Thanks for the info. I am back from holiday and have been working on this since;
I will rebase my patch onto the latest Linux kernel version.

//bean huo

>On Wed, 19 Dec 2018, Bean Huo (beanhuo) wrote:
>>
>> With an older UBIFS (e.g. on v4.2), this results in an attach
>> failure; with the latest UBIFS, if I don't overwrite the EC and VID
>> headers, can this condition still happen?
>
>v4.2 is a dead kernel. Nobody cares about that.
>
>The not so dead LTS kernels should have the UBIFS fixes backported because
>this problem can happen independent of the Micron specific issue. If the LTS
>kernels lack those backports, then they need to be done independent of this
>anyway.
>
>In general patches against the head of development are the proper way to go.
>Backporting is a different problem and handled after the fixes have been
>established.
>
>Thanks,
>
>	tglx


end of thread, other threads:[~2019-01-02 13:35 UTC | newest]

Thread overview: 13+ messages
2018-12-19 11:03 [PATCH V1] mtd: core: Micron SLC NAND filling block Bean Huo (beanhuo)
2018-12-19 13:17 ` Boris Brezillon
2018-12-19 13:59   ` [EXT] " Zoltan Szubbocsev (zszubbocsev)
2018-12-19 14:17     ` Miquel Raynal
2018-12-19 17:19       ` Bean Huo (beanhuo)
2018-12-19 18:31         ` Richard Weinberger
2018-12-19 18:46         ` Boris Brezillon
2018-12-19 14:27 ` Richard Weinberger
2018-12-19 18:11   ` [EXT] " Bean Huo (beanhuo)
2018-12-19 18:28     ` Richard Weinberger
2018-12-19 18:44     ` Boris Brezillon
2018-12-19 22:29     ` Thomas Gleixner
2019-01-02 13:34       ` Bean Huo (beanhuo)
