All of lore.kernel.org
 help / color / mirror / Atom feed
* mxc nand i.mx35: no OOB with HW_ECC
@ 2009-11-03 15:25 John Ogness
  2009-11-11  7:22 ` Artem Bityutskiy
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: John Ogness @ 2009-11-03 15:25 UTC (permalink / raw)
  To: linux-mtd; +Cc: s.hauer

Hi,

I spent a while trying to figure out why JFFS2 was getting hardware ECC
errors on an i.MX35 NANDFC with 2K pages. The problem also existed when
using nandwrite together with the -n and -o options.

The problem is that for page sizes larger than 512 bytes, the i.MX35
NANDFC can only write the page and oob data together. When writing the
page and oob data, the ECC hardware also writes the ECC codes for
it. This is ok if the filesystem driver or userspace application
understands that only one write per page+oob is allowed.

But jffs2 and "nandwrite -n -o" assume that they can write the oob data
and the page data separately. With the recently posted mxc_nand.c
driver, this results in the NANDFC writing the ECC codes to flash
twice... with different values. This means the ECC codes that end up on
the flash chip are garbage.

When reading the page back, the ECC hardware uses the garbage ECC codes
to "fix" the data, which results in corrupted data.

ubifs does not touch the oob data, which is probably why nobody cares
about (or has noticed?) this issue.

As someone who is relatively new to the MTD layer, I don't have any
suggestions about how this should be fixed.

A quick tweak to fs/jffs2/wbuf.c:jffs2_nand_flash_setup() will allow
JFFS2 to run without available oob data. Maybe there should be an
official option to allow this.

John Ogness

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: mxc nand i.mx35: no OOB with HW_ECC
  2009-11-03 15:25 mxc nand i.mx35: no OOB with HW_ECC John Ogness
@ 2009-11-11  7:22 ` Artem Bityutskiy
  2009-11-11  9:04   ` John Ogness
  2009-11-12  9:35 ` Sascha Hauer
  2009-11-12 10:28 ` Sascha Hauer
  2 siblings, 1 reply; 5+ messages in thread
From: Artem Bityutskiy @ 2009-11-11  7:22 UTC (permalink / raw)
  To: John Ogness; +Cc: s.hauer, linux-mtd

On Tue, 2009-11-03 at 16:25 +0100, John Ogness wrote:
> Hi,
> 
> I spent a while trying to figure out why JFFS2 was getting hardware ECC
> errors on an i.MX35 NANDFC with 2K pages. The problem also existed when
> using nandwrite together with the -n and -o options.
> 
> The problem is that for page sizes larger than 512 bytes, the i.MX35
> NANDFC can only write the page and oob data together. When writing the
> page and oob data, the ECC hardware also writes the ECC codes for
> it. This is ok if the filesystem driver or userspace application
> understands that only one write per page+oob is allowed.
> 
> But jffs2 and "nandwrite -n -o" assume that they can write the oob data
> and the page data separately. With the recently posted mxc_nand.c
> driver, this results in the NANDFC writing the ECC codes to flash
> twice... with different values. This means the ECC codes that end up on
> the flash chip are garbage.

Err, the classical model is that you can write whole page (or
sub-pages), then the driver/HW also generates ECC and stores it in the
OOB. However, traditionally ECC codes do not occupy whole OOB, so there
is space left. And you can write separately to that space.

Your case is confusing. If your ECC occupies whole OOB, your driver
should simply prohibit writing to the OOB. If it does not, it is your
drivers' fault. Really, your driver should either correctly support the
classical model, or it should simply not support OOB writing at all. Not
something in the middle, right?

> When reading the page back, the ECC hardware uses the garbage ECC codes
> to "fix" the data, which results in corrupted data.
> 
> ubifs does not touch the oob data, which is probably why nobody cares
> about (or has noticed?) this issue.

True.

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: mxc nand i.mx35: no OOB with HW_ECC
  2009-11-11  7:22 ` Artem Bityutskiy
@ 2009-11-11  9:04   ` John Ogness
  0 siblings, 0 replies; 5+ messages in thread
From: John Ogness @ 2009-11-11  9:04 UTC (permalink / raw)
  To: dedekind1; +Cc: s.hauer, linux-mtd

On 2009-11-11, Artem Bityutskiy <dedekind1@gmail.com> wrote:
>> The problem is that for page sizes larger than 512 bytes, the i.MX35
>> NANDFC can only write the page and oob data together. When writing
>> the page and oob data, the ECC hardware also writes the ECC codes for
>> it. This is ok if the filesystem driver or userspace application
>> understands that only one write per page+oob is allowed.
>> 
>> But jffs2 and "nandwrite -n -o" assume that they can write the oob
>> data and the page data separately. With the recently posted
>> mxc_nand.c driver, this results in the NANDFC writing the ECC codes
>> to flash twice... with different values. This means the ECC codes
>> that end up on the flash chip are garbage.
>
> Err, the classical model is that you can write whole page (or
> sub-pages), then the driver/HW also generates ECC and stores it in the
> OOB. However, traditionally ECC codes do not occupy whole OOB, so
> there is space left. And you can write separately to that space.

That is the problem. With the i.MX35 you can _not_ write separately to
that leftover space (at some other time) because the hardware ECC codes
are calculated based on the values within that leftover space as
well. The OOB and page data must be written simultaneously, at which
point the ECC codes over the OOB and page data will also be
written. Afterwards, neither the OOB data nor the page data may be
written to again until the block has been erased.

> Your case is confusing. If your ECC occupies whole OOB, your driver
> should simply prohibit writing to the OOB. If it does not, it is your
> drivers' fault.

Perhaps it is the "fault" of the hardware ECC that is including the OOB
data when calculating the ECC codes. If leftover OOB space and page data
are written simultaneously, there is no problem. The problem arises when
they are written separately, which I suppose is the classical model.

> Really, your driver should either correctly support the classical
> model, or it should simply not support OOB writing at all. Not
> something in the middle, right?

Agreed. The i.MX35 NANDFC driver should not support OOB writing (if
hardware ECC is used) because its ECC hardware is not capable of
supporting the classical model.

John Ogness

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: mxc nand i.mx35: no OOB with HW_ECC
  2009-11-03 15:25 mxc nand i.mx35: no OOB with HW_ECC John Ogness
  2009-11-11  7:22 ` Artem Bityutskiy
@ 2009-11-12  9:35 ` Sascha Hauer
  2009-11-12 10:28 ` Sascha Hauer
  2 siblings, 0 replies; 5+ messages in thread
From: Sascha Hauer @ 2009-11-12  9:35 UTC (permalink / raw)
  To: John Ogness; +Cc: linux-mtd

Hi John,

On Tue, Nov 03, 2009 at 04:25:10PM +0100, John Ogness wrote:
> Hi,
> 
> I spent a while trying to figure out why JFFS2 was getting hardware ECC
> errors on an i.MX35 NANDFC with 2K pages. The problem also existed when
> using nandwrite together with the -n and -o options.
> 
> The problem is that for page sizes larger than 512 bytes, the i.MX35
> NANDFC can only write the page and oob data together. When writing the
> page and oob data, the ECC hardware also writes the ECC codes for
> it. This is ok if the filesystem driver or userspace application
> understands that only one write per page+oob is allowed.
> 
> But jffs2 and "nandwrite -n -o" assume that they can write the oob data
> and the page data separately. With the recently posted mxc_nand.c
> driver, this results in the NANDFC writing the ECC codes to flash
> twice... with different values. This means the ECC codes that end up on
> the flash chip are garbage.

I can confirm jffs2 errors on i.MX35 with the patches I posted. And yes
you're right, I mostly use UBIFS.

Freescale originally provided two drivers, one for the i.MX27/31 type
nand controller (v1) from which the in kernel driver is derived and one
for the remaining nand controllers (i.mx35/25) (v2). What I did is to
work on both drivers till they both looked (nearly) the same and posted
the patches for the v1 driver.

One remaining difference is in mxc_do_addr_cycle(). Here is the version
from the v2 driver which works with jffs2:

static void mxc_do_addr_cycle(struct mtd_info *mtd, int column, int
page_addr)
{
	struct nand_chip *nand_chip = mtd->priv;
	struct mxc_nand_host *host = nand_chip->priv;

	u32 page_mask = nand_chip->pagemask;

	if (column != -1) {
		send_addr(host, column & 0xFF, 0);
		if (mtd->writesize == 2048) {
			/* another col addr cycle for 2k page */
			send_addr(host, (column >> 8) & 0xF, 0);
		} else if (mtd->writesize == 4096) {
			/* another col addr cycle for 4k page */
			send_addr(host, (column >> 8) & 0x1F, 0);
		}
	}
	if (page_addr != -1) {
		do {
			send_addr(host, (page_addr & 0xff), 0);
			page_mask >>= 8;
			page_addr >>= 8;
		} while (page_mask != 0);
	}
}

Putting this function into the v1 driver makes the driver work for me on
i.MX35 with jffs2. Can you confirm this?

I haven't looked deeply into the differences and I suspect this version
of mxc_do_addr_cycle() won't work on the v1 type controllers, so this is
not a solution yet.

Sascha

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: mxc nand i.mx35: no OOB with HW_ECC
  2009-11-03 15:25 mxc nand i.mx35: no OOB with HW_ECC John Ogness
  2009-11-11  7:22 ` Artem Bityutskiy
  2009-11-12  9:35 ` Sascha Hauer
@ 2009-11-12 10:28 ` Sascha Hauer
  2 siblings, 0 replies; 5+ messages in thread
From: Sascha Hauer @ 2009-11-12 10:28 UTC (permalink / raw)
  To: John Ogness; +Cc: linux-mtd


For the reference, here is what I used to test the driver with jffs2 on
i.MX27 (smallpage/2k page) and on i.MX35 (2k page). Apart from what I
was saying I had another change in my tree in parsing the
NAND_CMD_SEQIN.

Sascha

commit ce388af8574a8af2e0b46e8b9b4bc8f358896a90
Author: Sascha Hauer <s.hauer@pengutronix.de>
Date:   Thu Nov 12 11:25:23 2009 +0100

    mxc-nand: use mxc_do_addr_cycle from v2 driver
    
    Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>

diff --git a/drivers/mtd/nand/mxc_nand.c b/drivers/mtd/nand/mxc_nand.c
index d5445cd..a80f900 100644
--- a/drivers/mtd/nand/mxc_nand.c
+++ b/drivers/mtd/nand/mxc_nand.c
@@ -501,44 +501,24 @@ static void mxc_do_addr_cycle(struct mtd_info *mtd, int column, int page_addr)
 	struct nand_chip *nand_chip = mtd->priv;
 	struct mxc_nand_host *host = nand_chip->priv;
 
-	/* Write out column address, if necessary */
+	u32 page_mask = nand_chip->pagemask;
+
 	if (column != -1) {
-		/*
-		 * MXC NANDFC can only perform full page+spare or
-		 * spare-only read/write.  When the upper layers
-		 * layers perform a read/write buf operation,
-		 * we will used the saved column adress to index into
-		 * the full page.
-		 */
-		send_addr(host, 0, page_addr == -1);
-		if (mtd->writesize > 512)
+		send_addr(host, column & 0xFF, 0);
+		if (mtd->writesize == 2048) {
 			/* another col addr cycle for 2k page */
-			send_addr(host, 0, false);
+			send_addr(host, (column >> 8) & 0xF, 0);
+		} else if (mtd->writesize == 4096) {
+			/* another col addr cycle for 4k page */
+			send_addr(host, (column >> 8) & 0x1F, 0);
+		}
 	}
-
-	/* Write out page address, if necessary */
 	if (page_addr != -1) {
-		/* paddr_0 - p_addr_7 */
-		send_addr(host, (page_addr & 0xff), false);
-
-		if (mtd->writesize > 512) {
-			if (mtd->size >= 0x10000000) {
-				/* paddr_8 - paddr_15 */
-				send_addr(host, (page_addr >> 8) & 0xff, false);
-				send_addr(host, (page_addr >> 16) & 0xff, true);
-			} else
-				/* paddr_8 - paddr_15 */
-				send_addr(host, (page_addr >> 8) & 0xff, true);
-		} else {
-			/* One more address cycle for higher density devices */
-			if (mtd->size >= 0x4000000) {
-				/* paddr_8 - paddr_15 */
-				send_addr(host, (page_addr >> 8) & 0xff, false);
-				send_addr(host, (page_addr >> 16) & 0xff, true);
-			} else
-				/* paddr_8 - paddr_15 */
-				send_addr(host, (page_addr >> 8) & 0xff, true);
-		}
+		do {
+			send_addr(host, (page_addr & 0xff), 0);
+			page_mask >>= 8;
+			page_addr >>= 8;
+		} while (page_mask != 0);
 	}
 }
 
@@ -591,31 +571,21 @@ static void mxc_nand_command(struct mtd_info *mtd, unsigned command,
 		break;
 
 	case NAND_CMD_SEQIN:
-		if (column >= mtd->writesize) {
-			/*
-			 * FIXME: before send SEQIN command for write OOB,
-			 * We must read one page out.
-			 * For K9F1GXX has no READ1 command to set current HW
-			 * pointer to spare area, we must write the whole page
-			 * including OOB together.
-			 */
-			if (mtd->writesize > 512)
-				/* call ourself to read a page */
-				mxc_nand_command(mtd, NAND_CMD_READ0, 0,
-						page_addr);
-
-			host->buf_start = column;
+		/* Before send SEQIN command for
+		 * partial write, we need to read one page out.
+		 * FSL NFC does not support partial write.
+		 * It always sends out 512 + ecc + 512 + ecc ...
+		 * for large page nand flash. But for small
+		 * page nand flash, it did support SPARE
+		 * ONLY operation. But to make driver
+		 * simple. We take the same as large page, read
+		 * whole page out and update.
+		 */
 
-			/* Set program pointer to spare region */
-			if (mtd->writesize == 512)
-				send_cmd(host, NAND_CMD_READOOB, false);
-		} else {
-			host->buf_start = column;
+		if (column)
+			mxc_nand_command(mtd, NAND_CMD_READ0, 0, page_addr);
 
-			/* Set program pointer to page start */
-			if (mtd->writesize == 512)
-				send_cmd(host, NAND_CMD_READ0, false);
-		}
+		host->buf_start = column;
 
 		send_cmd(host, command, false);
 		mxc_do_addr_cycle(mtd, column, page_addr);
-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2009-11-12 10:28 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-11-03 15:25 mxc nand i.mx35: no OOB with HW_ECC John Ogness
2009-11-11  7:22 ` Artem Bityutskiy
2009-11-11  9:04   ` John Ogness
2009-11-12  9:35 ` Sascha Hauer
2009-11-12 10:28 ` Sascha Hauer

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.