[U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
@ 2013-03-19  0:50 Paul B. Henson
  2013-03-19 23:23 ` Scott Wood
  2013-04-04 10:09 ` Trent Piepho
  0 siblings, 2 replies; 23+ messages in thread
From: Paul B. Henson @ 2013-03-19  0:50 UTC (permalink / raw)
  To: u-boot

I'm prototyping a project that's going to need to boot linux from NAND 
on a mx28evk board.

I was able to successfully use the u-boot mxsboot utility to generate a 
nand image and burn it, then boot from it. I noticed one anomaly though, 
when using mxsboot/u-boot to generate and burn the bootstream to NAND, 
when the linux kernel boots it finds bad blocks:

[    1.090000] NAND device: Manufacturer ID: 0x2c, Chip ID: 0xf1 (Micron 
MT29F14
[    1.100000] Scanning device for bad blocks
[    1.110000] Bad eraseblock 0 at 0x000000000000
[    1.110000] Bad eraseblock 1 at 0x000000020000
[    1.120000] Bad eraseblock 2 at 0x000000040000
[    1.120000] Bad eraseblock 3 at 0x000000060000

When I burn the exact same bootstream with kobs-ng, linux does not find 
any bad blocks, so it seems to be a byproduct of either the image 
generated by mxsboot or the u-boot burning.

I don't think this is having any functional impact, as the scrub 
component of burning a new nand image wipes out the bad blocks, and once 
linux is booted it really has no need to read the bootstream from the 
bootloader mtd partition.

However, it seems anomalous, and I was wondering if other people 
have seen it, and whether or not it is something that might be fixed.

Thanks much!

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-03-19  0:50 [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks Paul B. Henson
@ 2013-03-19 23:23 ` Scott Wood
  2013-03-20 21:20   ` Paul B. Henson
  2013-04-04 10:09 ` Trent Piepho
  1 sibling, 1 reply; 23+ messages in thread
From: Scott Wood @ 2013-03-19 23:23 UTC (permalink / raw)
  To: u-boot

On 03/18/2013 07:50:07 PM, Paul B. Henson wrote:
> I'm prototyping a project that's going to need to boot linux from  
> NAND on a mx28evk board.
> 
> I was able to successfully use the u-boot mxsboot utility to generate  
> a nand image and burn it, then boot from it. I noticed one anomaly  
> though, when using mxsboot/u-boot to generate and burn the bootstream  
> to NAND, when the linux kernel boots it finds bad blocks:
> 
> [    1.090000] NAND device: Manufacturer ID: 0x2c, Chip ID: 0xf1  
> (Micron MT29F14
> [    1.100000] Scanning device for bad blocks
> [    1.110000] Bad eraseblock 0 at 0x000000000000
> [    1.110000] Bad eraseblock 1 at 0x000000020000
> [    1.120000] Bad eraseblock 2 at 0x000000040000
> [    1.120000] Bad eraseblock 3 at 0x000000060000
> 
> When I burn the exact same bootstream with kobs-ng, linux does not  
> find any bad blocks, so it seems to be a byproduct of either the  
> image generated by mxsboot or the u-boot burning.
> 
> I don't think this is having any functional impact, as the scrub  
> component of burning a new nand image wipes out the bad blocks,

You should not be routinely scrubbing NAND!

The manufacturers put bad block information there for a reason.

-Scott

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-03-19 23:23 ` Scott Wood
@ 2013-03-20 21:20   ` Paul B. Henson
  2013-03-20 21:24     ` Scott Wood
  0 siblings, 1 reply; 23+ messages in thread
From: Paul B. Henson @ 2013-03-20 21:20 UTC (permalink / raw)
  To: u-boot

On Tue, Mar 19, 2013 at 06:23:27PM -0500, Scott Wood wrote:

> > I don't think this is having any functional impact, as the scrub  
> > component of burning a new nand image wipes out the bad blocks,
> 
> You should not be routinely scrubbing NAND!
> 
> The manufacturers put bad block information there for a reason.

Hmm, I was following the instructions in doc/README.mx28_common, which
says to use "run update_nand_full" to burn the NAND image, and one
component of that per include/configs/mx28evk.h is:

	nand scrub -y 0x0 ${filesize}

Are the instructions/env script incorrect?

I don't believe the bad blocks that linux finds are actual bad blocks,
and definitely not factory bad blocks. They seem to show up as a
byproduct of the way u-boot is burning the NAND image. They are always
at the same addresses (I tried two different NAND chips), and only
appear when u-boot is used to burn the bootstream, but not when kobs-ng
is used.

Thanks...

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-03-20 21:20   ` Paul B. Henson
@ 2013-03-20 21:24     ` Scott Wood
  0 siblings, 0 replies; 23+ messages in thread
From: Scott Wood @ 2013-03-20 21:24 UTC (permalink / raw)
  To: u-boot

On 03/20/2013 04:20:07 PM, Paul B. Henson wrote:
> On Tue, Mar 19, 2013 at 06:23:27PM -0500, Scott Wood wrote:
> 
> > > I don't think this is having any functional impact, as the scrub
> > > component of burning a new nand image wipes out the bad blocks,
> >
> > You should not be routinely scrubbing NAND!
> >
> > The manufacturers put bad block information there for a reason.
> 
> Hmm, I was following the instructions in doc/README.mx28_common, which
> says to use "run update_nand_full" to burn the NAND image, and one
> component of that per include/configs/mx28evk.h is:
> 
> 	nand scrub -y 0x0 ${filesize}
> 
> Are the instructions/env script incorrect?

The env script is incorrect.  Otavio, Marek, what's going on here?

> I don't believe the bad blocks that linux finds are actual bad blocks,
> and definitely not factory bad blocks. They seem to show up as a
> byproduct of the way u-boot is burning the NAND image. They are always
> at the same addresses (I tried two different NAND chips), and only
> appear when u-boot is used to burn the bootstream, but not when  
> kobs-ng
> is used.

My guess is there's some mismatch regarding NAND layout, but someone  
more familiar with mx28 will need to answer that...

-Scott

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-03-19  0:50 [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks Paul B. Henson
  2013-03-19 23:23 ` Scott Wood
@ 2013-04-04 10:09 ` Trent Piepho
  2013-04-06  4:28   ` Paul B. Henson
  1 sibling, 1 reply; 23+ messages in thread
From: Trent Piepho @ 2013-04-04 10:09 UTC (permalink / raw)
  To: u-boot

On Mon, Mar 18, 2013 at 5:50 PM, Paul B. Henson <henson@acm.org> wrote:
> I'm prototyping a project that's going to need to boot linux from NAND on a
> mx28evk board.
>
> I was able to successfully use the u-boot mxsboot utility to generate a nand
> image and burn it, then boot from it. I noticed one anomaly though, when
> using mxsboot/u-boot to generate and burn the bootstream to NAND, when the
> linux kernel boots it finds bad blocks:
>
> [    1.090000] NAND device: Manufacturer ID: 0x2c, Chip ID: 0xf1 (Micron
> MT29F14
> [    1.100000] Scanning device for bad blocks
> [    1.110000] Bad eraseblock 0 at 0x000000000000
> [    1.110000] Bad eraseblock 1 at 0x000000020000
> [    1.120000] Bad eraseblock 2 at 0x000000040000
> [    1.120000] Bad eraseblock 3 at 0x000000060000
>
> When I burn the exact same bootstream with kobs-ng, linux does not find any
> bad blocks, so it seems to be a byproduct of either the image generated by
> mxsboot or the u-boot burning.

I get the same problem, with an iMX28 based board and the same NAND
chip (correct name MT29F1G08ABADAWP, MT29F14 is from serial terminal
line wrapping).

It's something to do with the way u-boot writes to nand.  If I write
with nandwrite it doesn't happen, nandtest doesn't find any bad
blocks, the chip is supposed to guarantee that block 0 is not bad, and
those four blocks contain all copies of the FCB so if they were all
bad you couldn't boot.

A bad block on that chip is marked with a non-0xff as the first OOB
byte in the 1st page of a block.  So, my guess is that when u-boot
writes the FCB data it also writes something to the OOB data.  It's
not entirely clear to me if the ROM bootloader uses the OOB data in
the FCB blocks or not, which pages used by the ROM are BCH encoded,
and how that affects the OOB data.
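
The marker check described above can be sketched in Python. This is only an illustration, not code from U-Boot or the kernel: it assumes a raw dump in which every 2048-byte page is immediately followed by its 64 OOB bytes, with 64 pages per erase block, and the function name is made up for the example.

```python
PAGE_SIZE = 2048        # data bytes per page on this chip
OOB_SIZE = 64           # spare (OOB) bytes per page
PAGES_PER_BLOCK = 64    # 64 * 2048 bytes = one 128 KiB erase block


def bad_blocks_in_raw_dump(dump: bytes) -> list:
    """Return the block numbers whose first page has a non-0xff first OOB byte.

    Assumes (hypothetically) that `dump` is laid out as page data followed
    directly by that page's OOB bytes, block after block.
    """
    raw_page = PAGE_SIZE + OOB_SIZE
    raw_block = raw_page * PAGES_PER_BLOCK
    bad = []
    for block in range(len(dump) // raw_block):
        # First OOB byte of the first page of this block.
        marker = dump[block * raw_block + PAGE_SIZE]
        if marker != 0xFF:
            bad.append(block)
    return bad
```

A block whose first page carries all-zero OOB bytes would be reported by this check even if the flash cells themselves are perfectly healthy.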

You said you've booted from NAND.  Did you have to program any of the
OTP fuses to do this?  Try as I might, the ROM bootloader refuses to
accept anything I flash into NAND.  Of course the error code doesn't
tell you WHY it didn't like the image.  One thing I found was the
OCOTP fuse bits for NAND_ROW_ADDRESS_BYTES.  The default is for 3
bytes of row address.  Yet the MT29F1 has 2 bytes of row address.  Did
you program the fuse bits to change the number of row address bytes to
two?

> I don't think this is having any functional impact, as the scrub component
> of burning a new nand image wipes out the bad blocks, and once linux is
> booted it really has no need to read the bootstream from the bootloader mtd
> partition.

nandwrite didn't seem to want to program the blocks after they were
marked bad.  The only way to fix this seemed to be to scrub nand from
u-boot.  So it's a problem if you want to be able to flash the
bootloader from Linux, unless there is some way to get the blocks
written when they have been marked bad.

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-04-04 10:09 ` Trent Piepho
@ 2013-04-06  4:28   ` Paul B. Henson
  2013-04-06  7:18     ` Trent Piepho
  0 siblings, 1 reply; 23+ messages in thread
From: Paul B. Henson @ 2013-04-06  4:28 UTC (permalink / raw)
  To: u-boot

On 4/4/2013 3:09 AM, Trent Piepho wrote:

> It's something to do with the way u-boot writes to nand.  If I write
> with nandwrite it doesn't happen, nandtest doesn't find any bad

Hmm, I'm pretty sure I tested burning the u-boot generated nand image 
with nandwrite under Linux with exactly the same result, it seems to be 
inherent in the underlying data, not the burn method.

Did you use the --oob option to nandwrite? The u-boot generated image is 
actually written in two separate steps, the initial piece is written raw 
and includes oob data, the second piece is written normally and the 
ecc/oob is generated by the hardware. To burn it under linux, you need 
to split the u-boot nand image into those two pieces, and write the 
first with --oob, and the second normally.

> A bad block on that chip is marked with a non-0xff as the first OOB
> byte in the 1st page of a block.  So, my guess is that when u-boot
> writes the FCB data it also writes something to the OOB data.

Yes, as would linux if you used the --oob option to nandwrite.

> You said you've booted from NAND.  Did you have to program any of the
> OTP fuses to do this?

No. All I did was install the actual NAND chip and update the boot dip 
switches. Testing u-boot, I followed the script in the default 
environment other than updating it to load the firmware from SD rather 
than tftp. For testing under Linux, I used dd to split the u-boot nand 
image into two pieces, corresponding to the u-boot burn instructions.

> nandwrite didn't seem to want to program the blocks after they were
> marked bad.  The only way fix this seemed to be to scrub nand from
> u-boot.  So it's a problem if you want to be able to flash the
> bootloader from Linux, unless there is some way to get the blocks
> written when they have been marked bad.

No, from what I understand there is no way to clear bad block markers 
from within linux short of modifying the mtd driver.

I followed up with Otavio off list, he said he had ordered some nand 
chips for his board and would get back to me once he had received them 
and had a chance to replicate the issue.

Are you targeting burning the nand with u-boot or linux? If you are 
using an older kernel, the kobs-ng that comes with the mx28 BSP works 
fine. It does not work with newer kernels though, there is a newer 
version of kobs-ng that comes with a different chip BSP that I've heard 
will work correctly on current kernels, it is on my to do list to try it 
out.

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-04-06  4:28   ` Paul B. Henson
@ 2013-04-06  7:18     ` Trent Piepho
  2013-04-11  0:20       ` Paul B. Henson
  0 siblings, 1 reply; 23+ messages in thread
From: Trent Piepho @ 2013-04-06  7:18 UTC (permalink / raw)
  To: u-boot

On Apr 5, 2013 9:28 PM, "Paul B. Henson" <henson@acm.org> wrote:
>
> On 4/4/2013 3:09 AM, Trent Piepho wrote:
>
>> It's something to do with the way u-boot writes to nand.  If I write
>> with nandwrite it doesn't happen, nandtest doesn't find any bad
>
>
> Hmm, I'm pretty sure I tested burning the u-boot generated nand image
> with nandwrite under Linux with exactly the same result, it seems to be
> inherent in the underlying data, not the burn method.
>
> Did you use the --oob option to nandwrite? The u-boot generated image is
> actually written in two separate steps, the initial piece is written raw
> and includes oob data, the second piece is written normally and the
> ecc/oob is generated by the hardware. To burn it under linux, you need
> to split the u-boot nand image into those two pieces, and write the
> first with --oob, and the second normally.

Did you already have the bad sectors when you burnt under Linux?  I
hadn't used --oob under linux, which as you've said doesn't work.
I've now burnt with kobs-ng a working nand image and have no bad
sectors.

I looked into this more and haven't entirely figured it out.  It's
definitely something to do with the raw sectors vs BCH protected
sectors.

I don't think the image u-boot mxsboot generates includes any OOB
data.  For me, it made an image which is *exactly* 24 blocks of 128
kiB each.  If the FCB blocks had OOB data then there would need to be
some multiple of 64 OOB bytes in the image (16 kiB I would think).  I
think maybe this is the problem.  The update_nand_full script calls
"nand write.raw ${loadaddr} 0x0 ${fcb_sz}" and write.raw expects
loadaddr to contain $fcb_sz pages of (2048 + 64) bytes each.  But a
hexdump of the u-boot image:
00000000  00 00 00 00 00 00 00 00  00 00 00 00 d6 fc ff ff  |................|
00000010  46 43 42 20 00 00 00 01  50 3c 19 06 00 00 00 00  |FCB ....P<......|

00020000  00 00 00 00 00 00 00 00  00 00 00 00 d6 fc ff ff  |................|
00020010  46 43 42 20 00 00 00 01  50 3c 19 06 00 00 00 00  |FCB ....P<......|

There is the first FCB block at offset 0.  And the second FCB block at
offset 0x20000.  That's 64 * 2048 bytes, not 64 * 2112 bytes.  No OOB
data.  The next two FCBs are at 0x40000 and 0x60000, again not where
they should be if they contained the OOB data.
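
The offsets can be checked with a little arithmetic. The Python sketch below just multiplies out the geometry quoted in this thread (2048-byte pages, 64 OOB bytes, 64 pages per block, four FCB copies); the function name is invented for the example.

```python
PAGE_SIZE = 2048
OOB_SIZE = 64
PAGES_PER_BLOCK = 64
FCB_COPIES = 4


def fcb_offsets(with_oob: bool) -> list:
    """Byte offset of each FCB copy in an image, one copy per erase block."""
    page = PAGE_SIZE + (OOB_SIZE if with_oob else 0)
    return [copy * PAGES_PER_BLOCK * page for copy in range(FCB_COPIES)]


print([hex(o) for o in fcb_offsets(with_oob=False)])
print([hex(o) for o in fcb_offsets(with_oob=True)])
```

Without OOB data the copies sit at 0x0, 0x20000, 0x40000 and 0x60000, which is what the hexdump shows; an image carrying OOB data would have them at 0x0, 0x21000, 0x42000 and 0x63000.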

So I think the reason flashing with u-boot didn't work for me is that
the u-boot script vs mxsboot are broken.  The script expects OOB data
in the first 4 blocks while mxsboot doesn't put any there.  I wonder
if the way mxsboot or write.raw work has changed recently and one is
out of date?

Now the BCH error correction is interesting.  When it's used, the nand
page does not consist of 2048 bytes of data and 64 oob bytes of ecc.
Instead it's something like 10 bytes metadata, 512 bytes data, 13
bytes ecc, 512 bytes data, 13 ecc, etc.  The data and the ECC are
intermixed!  So if you write a page in BCH mode and then read it in
RAW mode, or vice versa, you get completely incorrect data back.
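
To make the interleaving concrete, here is a rough Python sketch that uses the approximate figures from the paragraph above (10 bytes of metadata, then 512 data bytes plus 13 ECC bytes, repeated four times). Those numbers are the "something like" values given here, not taken from the reference manual, and a real BCH layout need not be byte aligned, so treat this purely as an illustration.

```python
METADATA = 10   # assumed metadata bytes at the start of the page
CHUNK = 512     # data bytes per ECC chunk
ECC = 13        # assumed ECC bytes following each chunk
CHUNKS = 4      # 4 * 512 = 2048 data bytes per page


def bch_data_offsets() -> list:
    """Raw-page offset at which each 512-byte data chunk would start."""
    return [METADATA + n * (CHUNK + ECC) for n in range(CHUNKS)]


def bch_page_length() -> int:
    """Total raw bytes consumed by metadata, data and ECC together."""
    return METADATA + CHUNKS * (CHUNK + ECC)
```

With these figures the data chunks would start at raw offsets 10, 535, 1060 and 1585, and the whole layout would occupy 2110 of the 2112 raw bytes, so nothing lines up with a plain 2048 + 64 view of the page.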

This is why writing with nandwrite doesn't work.  The ROM bootloader
expects the FCB blocks, which contain the BCH parameters, to be in raw
mode and apparently expects the rest to be in BCH mode.  If you write
with nandwrite everything is in BCH mode and thus the FCBs will be in
the wrong format and not read correctly.  The FCB blocks actually only
use the first 1036 bytes so the OOB data doesn't matter to the
bootloader, but since it's written in BCH mode everything is in the
wrong location.

>> A bad block on that chip is marked with a non-0xff as the first OOB
>> byte in the 1st page of a block.  So, my guess is that when u-boot
>> writes the FCB data it also writes something to the OOB data.
>
>
> Yes, as would linux if you used the --oob option to nandwrite.

Does this work for you?  As I see it, one would need to first generate
an image with OOB data, i.e. 2112 bytes pages, for the FCB blocks and
mxsboot doesn't do that.  Then you would need to get this written in
RAW mode.  I don't think nandwrite will do this, even with the --oob
option.  It's still in ECC mode.  The GPMI driver does not support
writing OOB data (see gpmi_ecc_write_oob(), all it does is return
-EPERM).  It does have a RAW mode write option, which is what kobs-ng
uses to write the FCB blocks.

>> You said you've booted from NAND.  Did you have to program any of the
>> OTP fuses to do this?
>
>
> No. All I did was install the actual NAND chip and update the boot dip
> switches. Testing u-boot, I followed the script in the default
> environment other than updating it to load the firmware from SD rather
> than tftp. For testing under Linux, I used dd to split the u-boot nand
> image into two pieces, corresponding to the u-boot burn instructions.

What instructions are those?  I didn't see anything in the
README.mx28_common file about that.  I think that could work, if you
re-blocked the FCB pages with ibs=2048 obs=2112 count=256.

I think this explains why the pages become "bad".  If you did this
then the OOB data added by dd will be all zero.  That first byte not
being 0xff anymore would mark the page as bad.  The bootloader doesn't
care since it ignores those marks, but Linux doesn't ignore them.  The
solution would be for u-boot to not write into the OOB data when
writing the FCBs.  This is what the kobs-ng program does.  It puts the
nand driver into raw mode and then writes 2048 byte pages for the FCB,
leaving the OOB data untouched.  U-boot appears to write the OOB data
(with all zeros), causing the pages to become marked as bad.
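
As an illustration of re-blocking with blank rather than zeroed OOB bytes, here is a rough Python sketch. It assumes an input image made purely of 2048-byte pages and pads each page with 64 bytes of 0xff; NAND programming can only clear bits, so 0xff leaves those cells as they are. The function name is made up, and whether the result behaves well with write.raw and the ROM has not been tested here.

```python
def add_blank_oob(image: bytes, page_size: int = 2048, oob_size: int = 64) -> bytes:
    """Append oob_size bytes of 0xff after every page of a data-only image."""
    if len(image) % page_size:
        raise ValueError("image length is not a multiple of the page size")
    out = bytearray()
    for off in range(0, len(image), page_size):
        out += image[off:off + page_size]   # page data, unchanged
        out += b"\xff" * oob_size           # blank OOB, marker byte stays 0xff
    return bytes(out)
```

Padding with zeros instead, as a plain dd re-block would do, puts 0x00 in the first OOB byte of every page, which is exactly the value Linux reads as a bad block marker.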

I think the u-boot method is somewhat problematic with regards to how
the NAND and ROM bootloader work.  It's nice to have a "smart" image
generator that does everything to make a raw image that you then flash
with any "dumb" flasher.  That's how we used to do things with NOR.
So much easier.  But the hardware really doesn't lend itself to this
anymore.  A "smart" flasher like kobs-ng seems more in line with what
the hardware calls for.

1) To create a correct image one needs to have all these details of
the hardware.  Page size, block size, BCH mode, ROM bootloader stride
and count configuration for the FCB and for the DBBT, nand timing to
put in FCB, etc.  A smart flasher can just query the drivers and get
all these values directly.  The smart image generator needs to get
told all these by someone transcribing them from the system.  Some can
be calculated, but then the code for that duplicates the code in the
driver and must be kept up to date as hardware and drivers change.  It
also means the image generated by the smart image generator is
very hardware specific and can not cope with minor hardware changes.
E.g., the .sb file can be flashed onto any nand chip but the .nand
file can't be.

2) Some parts of the image needs to be written in raw mode, some in
ecc mode.  The image doesn't say.  So the "dumb" flasher still needs
to figure that out, by duplicating calculations from block size, ROM
bootloader stride and count fuse settings, etc. that the image
generator also did.

3) Lots of the image is actually blank.  There is a difference between
not programming a NAND page and programming a NAND page to all 1s.

4) This seems like the big problem to me.  The mxsboot system can't
cope with bad blocks.  It just creates a blank DBBT table.  What you
need to do is query the mtd driver and get the bad block list and then
use that to construct an accurate DBBT and then not use those bad
blocks.  kobs-ng can do this (not sure if it actually does).

> Are you targeting burning the nand with u-boot or linux? If you are
> using an older kernel, the kobs-ng that comes with the mx28 BSP works
> fine. It does not work with newer kernels though, there is a newer
> version of kobs-ng that comes with a different chip BSP that I've heard
> will work correctly on current kernels, it is on my to do list to try it
> out.

I'd like to burn from Linux.  It's easier to create an end user
firmware update system in linux than in u-boot.  The kobs-ng that
comes with the BSP does indeed not support the mainline kernel.  Why
Freescale can't update the BSP I don't know.  So many easy to fix
bugs.  I ported that kobs-ng to the current kernel, then discovered
there was the new version that already worked.  This does work to
flash u-boot or a kernel to nand from linux.  It doesn't write the OOB
data when it flashes the FCBs and so the blocks don't get marked bad.

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-04-06  7:18     ` Trent Piepho
@ 2013-04-11  0:20       ` Paul B. Henson
  2013-04-11 12:03         ` Trent Piepho
  2013-04-13 14:42         ` Marek Vasut
  0 siblings, 2 replies; 23+ messages in thread
From: Paul B. Henson @ 2013-04-11  0:20 UTC (permalink / raw)
  To: u-boot

Let me just preface this reply with the disclaimer that I'm fairly new
to embedded development, and it sounds like you know a lot more about
what you're talking about than I do ;).

On 4/6/2013 12:18 AM, Trent Piepho wrote:

> Did you already have the bad sectors when you burnt under Linux?  I
> hadn't used --oob under linux, which as you've said doesn't work.

I did not have any bad blocks before I tried to burn the mxsboot
generated image to nand from Linux using nandwrite. You misunderstood me
though, I was able to exactly replicate the outcome from burning the
image with u-boot using Linux nandwrite. The board successfully booted
from NAND after burning the image with nandwrite, but resulted in the
exact same bad blocks.

> I've now burnt with kobs-ng a working nand image and have no bad
> sectors.

Yes, when I burn the bootstream with kobs-ng, I also do not get any bad
blocks on the nand.

> I don't think the image u-boot mxsboot generates includes any OOB
> data.  For me, it made an image which is *exactly* 24 blocks of 128
> kiB each.  If the FCB blocks had OOB data then there would need to
> be some multiple of 64 OOB bytes in the image (16 kiB I would think).
> I think maybe this is the problem.  The update_nand_full script
> calls "nand write.raw ${loadaddr} 0x0 ${fcb_sz}" and write.raw
> expects loadaddr to contain $fcb_sz pages of (2048 + 64) bytes each.

I'm not sure what you mean. According to u-boot:

Device 0: nand0, sector size 128 KiB
   Page size      2048 b
   OOB size         64 b

The page size is 2048 bytes, with 64 bytes of oob data, for a total of
2112 bytes.

When I burn the first part of the image with u-boot:

MX28EVK U-Boot > nand write.raw ${loadaddr} 0x0 ${fcb_sz}

NAND write:  540672 bytes written: OK

It writes 540672 bytes, which is evenly divisible by 2112 (256).

If you look at the mxsboot source code:

    for (i = 0; i < STRIDE_PAGES * STRIDE_COUNT; i += STRIDE_PAGES) {
        offset = i * nand_writesize;
        memcpy(buf + offset, fcbblock, nand_writesize + nand_oobsize);
    }

It appears to be writing the FCB including oob data.

> I wonder if the way mxsboot or write.raw work has changed recently
> and one is out of date?

I used the latest git head of the ARM branch when I was testing a few
weeks ago.

> This is why writing with nandwrite doesn't work.  The ROM bootloader
> expects the FCB blocks, which contain the BCH parameters, to be in
> raw mode and apparently expects the rest to be in BCH mode.

Unless I am misunderstanding the u-boot instructions, the FCB blocks are
written in raw mode:

MX28EVK U-Boot > nand write.raw ${loadaddr} 0x0 ${fcb_sz}

NAND write:  540672 bytes written: OK

and the rest is not:

MX28EVK U-Boot > setexpr update_off ${loadaddr} + ${update_nand_fcb}
MX28EVK U-Boot > setexpr update_sz ${filesize} - ${update_nand_fcb}
MX28EVK U-Boot > nand write ${update_off} ${update_nand_fcb} ${update_sz}

NAND write: device 0 offset 0x80000, size 0xa80000
  11010048 bytes written: OK

>> rather than tftp. For testing under Linux, I used dd to split the
>> u-boot nand image into two pieces, corresponding to the u-boot burn
>> instructions.
>
> What instructions are those?  I didn't see anything in the
> README.mx28_common file about that.  I think that could work, if you
> re-blocked the FCB pages with ibs=2048 obs=2112 count=256.

There are no explicit instructions for nandwrite in u-boot, I simply 
split the mxsboot NAND image into two pieces to match the pieces that 
u-boot wrote:

dd if=test.nand bs=2112 count=256 of=test-head.nand
dd if=test.nand bs=1 skip=524288 of=test-tail.nand

And then wrote the first piece with nandwrite --oob at offset 0 and the 
second with regular nandwrite at offset 0x80000. This worked exactly the 
same as using u-boot, including the four blocks being marked as bad by 
Linux.

> U-boot appears to write the OOB data (with all zeros), causing the
> pages to become marked as bad.

If you look at the mxsboot source code, they appear to be trying to 
calculate the ecc and generate the oob data, but maybe they are doing it 
wrong?

> A "smart" flasher like kobs-ng seems more in line with what the
> hardware calls for.

Yes, but unfortunately I'm not sure that's something that could be 
implemented within a running u-boot?

> The smart image generator needs to get told all these by someone
> transcribing them from the system.

The current mxsboot is not that smart :), for the most part it has 
values hardcoded and you would need to recompile it if you wanted to 
change them. But it is just the first iteration while they are trying to 
get something working.

> cope with bad blocks.  It just creates a blank DBBT table.  What you
> need to do is query the mtd driver and get the bad block list and
> then use that to construct an accurate DBBT and then not use those
> bad blocks.  kobs-ng can do this (not sure if it actually does).

Does the ROM IPL pay any attention to bad blocks? I don't know exactly 
how it works, but from what I've heard it doesn't sound like it deals 
with bad blocks very well if at all.

> I'd like to burn from Linux.  It's easier to create an end user
> firmware update system in linux than in u-boot.

Yes, agreed. We've been evaluating our bootloader options for the 
project I'm working on. Initially we were going to have u-boot be the 
bootloader and directly load our production kernel. However, as you say, 
it is somewhat difficult to create a flexible update/recovery system 
within u-boot. Next we looked into using the freescale bootlets to 
directly load a bootloader/recovery linux kernel with a bundled 
initramfs that would be used for the update/recovery process, and then 
have it use kexec to load the production kernel. This worked out pretty 
well.

What I think we're going to go with is actually a hybrid of both u-boot 
and a stripped-down linux/initramfs bootloader/recovery kernel. The 
bootstream actually contains two copies of whatever object is going to 
be loaded (I'm guessing, but I assume that is because the ROM IPL 
doesn't handle bad blocks, so if the first copy can't be read it will 
just try the second). Our recovery kernel/initramfs is probably going to 
be about 5M, so it would take about 10M to be loaded directly. u-boot is 
less than 1M, so by including both, it actually uses *less* space 
(2*1+5=7, vs 2*5=10). In addition, the recommendation seems to be to 
have the IPL load the minimal amount possible, so with this 
implementation the IPL loads a tiny u-boot, which again is a lot smarter 
and more reliable about loading the larger recovery kernel, which can 
then either perform recovery options or kexec the production kernel.

> Why Freescale can't update the BSP I don't know.

Yes, it does seem in the embedded space chip manufacturers release 
something and then let it stagnate :(. I generally prefer to run more up 
to date stuff.

> I ported that kobs-ng to the current kernel, then discovered there
> was the new version that already worked.

I worked on porting it, and got to the point where it wanted to read a 
sysfs node to determine the NAND geometry which no longer existed. I was 
inquiring on the linux mtd mailing list about the possibility of getting 
that information back, when I was directed to a newer version of kobs-ng 
that comes in a different chip's BSP that's supposed to work with a 
current kernel.

> This does work to flash u-boot or a kernel to nand from linux.  It
> doesn't write the OOB data when it flashes the FCBs and so the
> blocks don't get marked bad.

I'm glad to hear the newer kobs-ng works with a current kernel, I had 
not yet had a chance to try it out. It would be nice if Freescale just 
had a link to the latest kobs-ng, rather than trying to figure out which 
chip BSP has the newest version, but I guess that's not the way they are 
bundling things or want people to work.

At this point, while I think it would be nice in general for the u-boot 
mxsboot utility to be fixed and work correctly, I don't think either of 
us is going to need that? If the latest kobs-ng worked correctly under a 
current kernel, we can just build the u-boot.sb file and use kobs-ng 
under Linux to burn it to NAND.

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-04-11  0:20       ` Paul B. Henson
@ 2013-04-11 12:03         ` Trent Piepho
  2013-04-11 18:33           ` Paul B. Henson
  2013-04-13 14:42         ` Marek Vasut
  1 sibling, 1 reply; 23+ messages in thread
From: Trent Piepho @ 2013-04-11 12:03 UTC (permalink / raw)
  To: u-boot

>> I don't think the image u-boot mxsboot generates includes any OOB
>> data.  For me, it made an image which is *exactly* 24 blocks of 128
>> kiB each.  If the FCB blocks had OOB data then there would need to
>> be some multiple of 64 OOB bytes in the image (16 kiB I would think).
>> I think maybe this is the problem.  The update_nand_full script
>> calls "nand write.raw ${loadaddr} 0x0 ${fcb_sz}" and write.raw
>> expects loadaddr to contain $fcb_sz pages of (2048 + 64) bytes each.
>
>
> I'm not sure what you mean. According to u-boot:
>
> Device 0: nand0, sector size 128 KiB
>   Page size      2048 b
>   OOB size         64 b
>
> The page size is 2048 bytes, with 64 bytes of oob data, for a total of
> 2112 bytes.
>
> When I burn the first part of the image with u-boot:
>
> MX28EVK U-Boot > nand write.raw ${loadaddr} 0x0 ${fcb_sz}
>
> NAND write:  540672 bytes written: OK

I'm talking about the image file as generated by mxsboot.  If I hex dump
that, it's clearly written entirely with 2048 byte pages.  If you hexdump
your image are the FCB blocks exactly 128k apart?  Or are they 64 * 2112 =
132k apart?  It should be the latter, as 132k * 4 = 540672 bytes.

> If you look at the mxsboot source code:
>
>   for (i = 0; i < STRIDE_PAGES * STRIDE_COUNT; i += STRIDE_PAGES) {
>         offset = i * nand_writesize;
>        memcpy(buf + offset, fcbblock, nand_writesize + nand_oobsize);
>    }
>
> It appears to be writing the FCB including oob data.

Looks wrong to me!  Notice that offset is equal to i * nand_writesize, not
i * (nand_writesize + nand_oobsize).  I think this only produces a bootable
image because:

The FCB data is only 1036 bytes in size.  The remaining 1012 bytes of data
and 64 oob bytes in the page aren't used.  And the 63 pages after the first
aren't used either.  So they can be full of garbage and it doesn't matter.
The image mxsboot creates is ok for the first 1036 bytes.  Everything
after that is wrong, but it doesn't matter.

There are four copies of the FCB blocks.  The ROM bootloader looks for the
first valid one and uses it.  The first one is ok in the mxsboot image.
All the rest are corrupted since they are written in the wrong location.
But since the first one was ok the bootloader never even looks at the bad
ones.  If that first NAND page ever goes bad, though, the whole point of
having redundant copies is defeated.
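
As a rough sketch of what a corrected placement loop could look like,
assuming the image is meant to carry OOB bytes for every page: the
function name place_fcb_copies() and its parameters are hypothetical and
this is not the actual mxsboot code, only an illustration of advancing
the offset by data+OOB instead of data alone.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy `count` FCB pages into `buf`, one at the start of each stride.
 * In a raw image every page occupies writesize + oobsize bytes, so the
 * offset advances by that amount per page, not by writesize alone.
 */
static void place_fcb_copies(uint8_t *buf, size_t buf_len,
                             const uint8_t *fcb_page,
                             unsigned count, unsigned stride_pages,
                             size_t writesize, size_t oobsize)
{
    size_t raw_page = writesize + oobsize;

    for (unsigned n = 0; n < count; n++) {
        size_t offset = (size_t)n * stride_pages * raw_page;

        /* Refuse to write past the end of the caller's buffer. */
        assert(offset + raw_page <= buf_len);
        memcpy(buf + offset, fcb_page, raw_page);
    }
}
```

With count = 4, stride_pages = 64, writesize = 2048 and oobsize = 64, the
copies land at 0, 135168, 270336 and 405504, i.e. 132k apart.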

Now, look at mx28_nand_fcb_block(), which generates the FCB block.  It
calls memset() to fill the entire 2112 bytes with zeros.  The mx28_nand_fcb
struct is 512 bytes, so copying the fcb struct to the buffer at offset 12
and then writing the fcb ecc at offset 512+12 only touches the first 1036
bytes.  The remaining bytes, including the OOB, will all be zero.  And a
zero in the first OOB byte marks the NAND block as bad.  So that's why
burning the mxsboot generated image with nand write.raw makes the blocks
bad.  kobs-ng doesn't write the OOB data, so it leaves any bad block
markers alone, which is better.  I guess this is not just a bug in mxsboot,
but also a deficiency in u-boot's nand support.  It allows one to write
2048 bytes in ECC mode or 2112 bytes in raw mode.  What one should actually
do to flash these blocks is write 2048 bytes in raw mode.
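
One possible way to keep a raw 2112-byte write from touching the marker,
sketched here in C under two assumptions: that the block is judged bad
when the first OOB byte of its first page is not 0xFF, and that
programming 0xFF into NAND leaves the existing cell contents unchanged.
init_fcb_page() and looks_bad() are made-up helper names, not mxsboot or
u-boot functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Prepare one raw page (data + OOB) that will hold an FCB.
 * The data area starts out zeroed, as described above for
 * mx28_nand_fcb_block(); the OOB is filled with 0xFF (the erased state)
 * instead of zero, so a raw write would not clear the marker byte.
 */
static void init_fcb_page(uint8_t *page, size_t writesize, size_t oobsize)
{
    assert(page != NULL);
    memset(page, 0x00, writesize);
    memset(page + writesize, 0xFF, oobsize);
}

/* Assumed convention: first OOB byte != 0xFF means "bad block". */
static int looks_bad(const uint8_t *page, size_t writesize)
{
    return page[writesize] != 0xFF;
}
```

An all-zero 2112-byte page reads as bad under this check, while a page
prepared with init_fcb_page() does not.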

>> This is why writing with nandwrite doesn't work.  The ROM bootloader
>> expects the FCB blocks, which contain the BCH parameters, to be in
>> raw mode and apparently expects the rest to be in BCH mode.
>
>
> Unless I am misunderstanding the u-boot instructions, the FCB blocks are
> written in raw mode:

It's not the flashing that is in the wrong mode, but the image mxsboot
generates that is wrong.

> There are no explicit instructions for nandwrite in u-boot, I simply
> split the mxsboot NAND image into two pieces to match the pieces that
> u-boot wrote:
>
> dd if=test.nand bs=2112 count=256 of=test-head.nand
> dd if=test.nand bs=1 skip=524288 of=test-tail.nand
>
> And then wrote the first piece with nandwrite -oob at offset 0 and the
> second with regular nandwrite at offset 0x80000. This worked exactly the
> same as using u-boot, including the four blocks being marked as bad by
> Linux.

If the four blocks were already marked as bad, then nandwrite will not
write them.  So maybe you only have a working image because it was already
working and wasn't modified?  Can you erase flash in u-boot, verify that
nand does not boot, and then make a working nand using just nandwrite --oob?
I think you will also need to use the option --noecc to write in raw mode.

>> U-boot appears to write the OOB data (with all zeros), causing the
>> pages to become marked as bad.
>
>
> If you look at the mxsboot source code, they appear to be trying to
> calculate the ecc and generate the oob data, but maybe they are doing it
> wrong?

Look closer, they don't.  The ECC they generate is just the 512 bytes of
ecc data for the 512 bytes of FCB data.  This is a special ecc just used
for the FCBs.  Nothing will actually write past the 1036th byte of the
block and so it will still be all zero past that including the oob data.

>> A "smart" flasher like kobs-ng seems more in line with what the
>> hardware calls for.
>
> Yes, but unfortunately I'm not sure that's something that could be
> implemented within a running u-boot?

It would be harder.  I think you'd need to write a u-boot app to do it.

>> cope with bad blocks.  It just creates a blank DBBT table.  What you
>> need to do is query the mtd driver and get the bad block list and
>> then use that to construct an accurate DBBT and then not use those
>> bad blocks.  kobs-ng can do this (not sure if it actually does).
>
>
> Does the ROM IPL pay any attention to bad blocks? I don't know exactly
> how it works, but from what I've heard it doesn't sound like it deals with
> bad blocks very well if at all.

It is supposed to use something called the DBBT to skip bad blocks.
mxsboot doesn't generate a real DBBT so bad blocks probably aren't handled.

>> I'd like to burn from Linux.  It's easier to create an end user
>> firmware update system in linux than in u-boot.
>
>
> Yes, agreed. We've been evaluating our bootloader options for the project
> I'm working on. Initially we were going to have u-boot be the bootloader
> and directly load our production kernel. However, as you say, it is
> somewhat difficult to create a flexible update/recovery system within
> u-boot. Next we looked into using the freescale bootlets to directly load a
> bootloader/recovery linux kernel with a bundled initramfs that would be
> used for the update/recovery process, and then have it use kexec to load
> the production kernel. This worked out pretty well.
>
> What I think we're going to go with is actually a hybrid of both u-boot
> and a stripped-down linux/initramfs bootloader/recovery kernel. The
> bootstream actually contains two copies of whatever object is going to be
> loaded (I'm guessing, but I assume that is because the ROM IPL doesn't
> handle bad blocks, so if the first copy can't be read it will just try the
> second). Our recovery kernel/initramfs is probably going to be about 5M, so
> it would take about 10M to be loaded directly. u-boot is less than 1M, so
> by including both, it actually uses *less* space (2*1+5=7, vs 2*5=10). In
> addition, the recommendation seems to be to have the IPL load the minimal
> amount possible, so with this implementation the IPL loads a tiny u-boot,
> which again is a lot smarter and more reliable about loading the larger
> recovery kernel, which can then either perform recovery options or kexec
> the production kernel.

I've done that before.  The u-boot env was written from Linux to tell
u-boot which kernel to boot, the firmware update kernel and rootfs or the
main kernel and rootfs.

Another way is to use initramfs for your main filesystem.  If you have no
filesystems mounted from flash, then you can just flash without rebooting.
You need a small filesystem for this of course.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-04-11 12:03         ` Trent Piepho
@ 2013-04-11 18:33           ` Paul B. Henson
  2013-04-11 23:25             ` Trent Piepho
  0 siblings, 1 reply; 23+ messages in thread
From: Paul B. Henson @ 2013-04-11 18:33 UTC (permalink / raw)
  To: u-boot

On 4/11/2013 5:03 AM, Trent Piepho wrote:

> I'm talking about the image file as generated by mxsboot.  If I hex
>  dump that, it's clearly written entirely with 2048 byte pages.  If
> you hexdump your image are the FCB blocks exactly 128k apart?

Hmm, I don't have one in front of me to conveniently look at, but as I
recall when I was working with it the FCB blocks did indeed appear to be
evenly spaced at locations divisible by 1k.

> Looks wrong to me!  Notice that offset is equal to i *
> nand_writesize, not i * (nand_writesize + nand_oobsize).

Ah, good eye. They are writing the correct amount of data, but in
the wrong places.

> All the rest are corrupted since they are written in the wrong
> location.  But since the first one was ok the bootloader never even
> looks at the bad ones.  Unless the NAND page goes bad, then the whole
>  point of having redundant copies will be defeated.

That sounds like a correct conclusion.

> What one should actually do to flash these blocks is write 2048 bytes
> in raw mode.

I guess that would only work if whatever reading the blocks also read 
them in raw mode, as otherwise the lack of ECC in the OOB area would 
fail the read?

> If the four blocks were already marked as bad, then nandwrite will
> not write them. So maybe you only have a working image because it
> was already working and wasn't modified?  Can you erase flash in
> u-boot, verify that nand does not boot, and then make a working nand
> using just nandwrite --oob?  I think you will also need to use the
> option --noecc to write in raw mode.

I did actually erase the NAND before testing the burn in Linux, so I can
confirm it does actually work -- the first time. After the first burn,
the next time Linux is booted, it detects the blocks as bad, and will
not overwrite them, even in raw mode. I unfortunately did not make good
notes, and don't recall the specific flags I used with nandwrite during
the test.

> I've done that before.  The u-boot env was written from Linux to tell
>  u-boot which kernel to boot, the firmware update kernel and rootfs
> or the main kernel and rootfs.

I think we're going to always have u-boot boot the recovery kernel and 
have that bootstrap the production kernel. We plan to have a physical 
reset button on the device, which if held down when powered on will 
reset the device to factory defaults. The recovery kernel will check if 
that button is pressed when it loads and rewrite the production area of 
the flash if so from a recovery partition, otherwise just load the 
production kernel.

Hopefully Otavio is watching this thread and can address the issues you 
found with mxsboot.

Thanks much...

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-04-11 18:33           ` Paul B. Henson
@ 2013-04-11 23:25             ` Trent Piepho
  2013-04-20  1:03               ` Paul B. Henson
  0 siblings, 1 reply; 23+ messages in thread
From: Trent Piepho @ 2013-04-11 23:25 UTC (permalink / raw)
  To: u-boot

On Thu, Apr 11, 2013 at 11:33 AM, Paul B. Henson <henson@acm.org> wrote:
> On 4/11/2013 5:03 AM, Trent Piepho wrote:
>> What one should actually do to flash these blocks is write 2048 bytes
>> in raw mode.
>
>
> I guess that would only work if whatever reading the blocks also read them
> in raw mode, as otherwise the lack of ECC in the OOB area would fail the
> read?

See my second message in the thread.  The FCBs are in raw mode, all
the rest are in BCH/ECC mode.  The FCBs have the BCH parameters, so
the IPL can't switch to ECC mode until it gets the parameters from
them.  Also, the BCH/ECC data is NOT in the OOB area.  In BCH/ECC
mode, the data and the ECC bytes are intermixed throughout the full
2112 byte page.  So one MUST write the FCBs in raw mode and everything
else in ECC mode for it to work, as the IPL will read the data in this
manner.  That's why the image from mxsboot needs to be split into two.
Maybe it would make more sense for mxsboot to write two files?  One
with the FCBs and one with everything else?

The FCBs are only 1036 bytes long.  The OOB isn't used by the FCB.  So
when writing the FCBs, the OOB should not be written and whatever
bad/good flag is in there left alone.  But u-boot flashes the entire
block with zeros (the first 2112-byte page and also the 63 unused pages
after it too).  So the OOB is also zeroed out, and that marks the
block as bad.

>> I've done that before.  The u-boot env was written from Linux to tell
>>  u-boot which kernel to boot, the firmware update kernel and rootfs
>> or the main kernel and rootfs.
>
>
> I think we're going to always have u-boot boot the recovery kernel and have
> that bootstrap the production kernel. We plan to have a physical reset
> button on the device, which if held down when powered on will reset the
> device to factory defaults. The recovery kernel will check if that button is
> pressed when it loads and rewrite the production area of the flash if so
> from a recovery partition, otherwise just load the production kernel.

You'll boot slower then, as you're basically booting twice.  Maybe
that doesn't matter for you.  I'm usually trying to get booted and
have system startup done in <500 ms and two boots make that a lot
harder.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-04-11  0:20       ` Paul B. Henson
  2013-04-11 12:03         ` Trent Piepho
@ 2013-04-13 14:42         ` Marek Vasut
  2013-04-13 16:31           ` Trent Piepho
  1 sibling, 1 reply; 23+ messages in thread
From: Marek Vasut @ 2013-04-13 14:42 UTC (permalink / raw)
  To: u-boot

Dear Paul B. Henson,

> Let me just preface this reply with the disclaimer that I'm fairly new
> to embedded development, and it sounds like you know a lot more about
> what you're talking about than I do ;).

[...]

I'm not reading the thread as it -- again -- contains loads of baseless "is 
broken" and "doesn't work" accusations left and right, sorry. I am CCing Fabio.

The issue with the bad sectors is known. This is because the kobs-ng scans the 
NAND and fills the DBBT. This is not something that can be done off-line, so the 
mxsboot can never generate such an image per-se. It can on the other hand
generate a bootable image. Note that the mxsboot is by default configured for 
2048+64 bps flashes.

Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-04-13 14:42         ` Marek Vasut
@ 2013-04-13 16:31           ` Trent Piepho
  2013-04-13 18:26             ` Marek Vasut
  0 siblings, 1 reply; 23+ messages in thread
From: Trent Piepho @ 2013-04-13 16:31 UTC (permalink / raw)
  To: u-boot

On Sat, Apr 13, 2013 at 7:42 AM, Marek Vasut <marex@denx.de> wrote:
> Dear Paul B. Henson,
>
>> Let me just preface this reply with the disclaimer that I'm fairly new
>> to embedded development, and it sounds like you know a lot more about
>> what you're talking about than I do ;).
>
> [...]
>
> I'm not reading the thread as it -- again -- contains loads of baseless "is
> broken" and "doesn't work" accusations left and right, sorry. I am CCing Fabio.

So why did you respond?  Just to tell us you don't care if there are
bugs in mxsboot?

> The issue with the bad sectors is known. This is because the kobs-ng scans the
> NAND and fills the DBBT. This is not something that can be done off-line, so the

You misunderstand, flashing with u-boot marks the sectors as bad.
It's not necessary to ever use kobs-ng to see the problem, so it can't
possibly be the cause.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-04-13 16:31           ` Trent Piepho
@ 2013-04-13 18:26             ` Marek Vasut
  0 siblings, 0 replies; 23+ messages in thread
From: Marek Vasut @ 2013-04-13 18:26 UTC (permalink / raw)
  To: u-boot

Dear Trent Piepho,

> On Sat, Apr 13, 2013 at 7:42 AM, Marek Vasut <marex@denx.de> wrote:
> > Dear Paul B. Henson,
> > 
> >> Let me just preface this reply with the disclaimer that I'm fairly new
> >> to embedded development, and it sounds like you know a lot more about
> >> what you're talking about than I do ;).
> > 
> > [...]
> > 
> > I'm not reading the thread as it -- again -- contains loads of baseless
> > "is broken" and "doesn't work" accusations left and right, sorry. I am
> > CCing Fabio.
> 
> So why did you respond?  Just to tell us you don't care if there are
> bugs in mxsboot?

To CC Fabio to let him handle this issue, we obviously cannot work together.

> > The issue with the bad sectors is known. This is because the kobs-ng
> > scans the NAND and fills the DBBT. This is not something that can be
> > done off-line, so the
> 
> You misunderstand, flashing with u-boot marks the sectors as bad.
> It's not necessary to ever use kobs-ng to see the problem, so it can't
> possibly be the cause.

Then blank DBBT is also a problem.

Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-04-11 23:25             ` Trent Piepho
@ 2013-04-20  1:03               ` Paul B. Henson
  2013-04-20  1:22                 ` Trent Piepho
  0 siblings, 1 reply; 23+ messages in thread
From: Paul B. Henson @ 2013-04-20  1:03 UTC (permalink / raw)
  To: u-boot

On 4/11/2013 4:25 PM, Trent Piepho wrote:
> Maybe it would make more sense for mxsboot to write two files?  One
> with the FCBs and one with everything else?

Hmm, possibly; I guess that would be conceptually simpler but require 
more commands to execute to get done.

> The FCBs are only 1036 bytes long.  The OOB isn't used by the FCB.  So
> when writing the FCBs, the OOB should not be written and whatever
> bad/good flag is in there left alone.  But u-boot flashes the entire
> block with zeros (the first 2112 page and also the 63 unused pages
> after it too).  So the OOB is also zeroed out, and that marks the
> block as bad.

I'm not that familiar with the intricacies of NAND, it sounds like 
you're saying each FCB should be written separately rather than in one 
fell swoop as it does currently?

There haven't been any responses or follow-ups to this thread, so I 
guess they either think it's working fine as is or aren't 
interested/don't have the time to follow up on the issue. I'm not 
accusing anything of being broken, just explaining what I'm seeing and 
offering to help :)...

>> I think we're going to always have u-boot boot the recovery kernel and have
>> that bootstrap the production kernel. We plan to have a physical reset
>
> You'll boot slower then, as you're basically booting twice.  Maybe
> that doesn't matter for you.

Boot time doesn't matter too much for our application, it shouldn't boot 
very often and if it does a couple extra seconds won't be a problem.

What is your recovery plan in the case of the production kernel/file 
system becoming corrupt and unbootable? u-boot, per the environment 
variable, will try to load the production kernel, which then can't boot 
far enough to reset the environment variable to load the recovery kernel?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-04-20  1:03               ` Paul B. Henson
@ 2013-04-20  1:22                 ` Trent Piepho
  2013-04-23  0:42                   ` Paul B. Henson
  0 siblings, 1 reply; 23+ messages in thread
From: Trent Piepho @ 2013-04-20  1:22 UTC (permalink / raw)
  To: u-boot

On Fri, Apr 19, 2013 at 6:03 PM, Paul B. Henson <henson@acm.org> wrote:
> On 4/11/2013 4:25 PM, Trent Piepho wrote:
>>
>> Maybe it would make more sense for mxsboot to write two files?  One
>> with the FCBs and one with everything else?
>
> Hmm, possibly; I guess that would be conceptually simpler but require more
> commands to execute to get done.

Don't see why.  If mxsboot wrote both files at once, there'd be the
same number of commands to generate them.  When flashing with
nandwrite, the commands to split the file would no longer be
necessary.  U-boot would have to load and flash two files, but it
could avoid having to calculate where to split the file like it does
now.

>> The FCBs are only 1036 bytes long.  The OOB isn't used by the FCB.  So
>> when writing the FCBs, the OOB should not be written and whatever
>> bad/good flag is in there left alone.  But u-boot flashes the entire
>> block with zeros (the first 2112 page and also the 63 unused pages
>> after it too).  So the OOB is also zeroed out, and that marks the
>> block as bad.
>
>
> I'm not that familiar with the intricacies of NAND, it sounds like you're
> saying each FCB should be written separately rather than in one fell swoop
> as it does currently?

Yes.  Or at least it should not write the OOB.  Basically you have a
132 KiB NAND block, including data+OOB.  The first 1036 bytes are FCB
data.  Bytes 2048 to 2111 contain the bad block marker.  The remaining
~130 KiB after byte 2111 are entirely unused.  kobs-ng writes only the
first 2048 bytes, and thus does not write bytes 2048-2111, which
contain the marker.  u-boot writes all 132 KiB, including the marker.
There are 4 FCB blocks like this.  While U-boot gets the first one
correct (other than the bad block marker), it doesn't write the 3
after it correctly.
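
The layout of one raw FCB block can be written down as a small C sketch.
The region names and classify() are invented for illustration; the sizes
assume the 2048+64 byte pages and 64 pages per block discussed in this
thread, so one raw block is 64 * 2112 = 135168 bytes.

```c
#include <assert.h>
#include <stddef.h>

enum region {
    REGION_FCB,          /* bytes 0..1035: padding, FCB struct, FCB ecc */
    REGION_PAGE0_UNUSED, /* rest of the first page's data area */
    REGION_PAGE0_OOB,    /* first page's OOB, holds the bad block marker */
    REGION_REST          /* the 63 pages after the first one */
};

enum {
    FCB_BYTES       = 1036,
    NAND_PAGE       = 2048,
    NAND_OOB        = 64,
    NAND_PAGES      = 64,
    RAW_BLOCK_BYTES = (NAND_PAGE + NAND_OOB) * NAND_PAGES
};

/* Say which part of a raw (data+OOB) FCB block a byte offset falls in. */
static enum region classify(size_t off)
{
    assert(off < RAW_BLOCK_BYTES);
    if (off < FCB_BYTES)
        return REGION_FCB;
    if (off < NAND_PAGE)
        return REGION_PAGE0_UNUSED;
    if (off < NAND_PAGE + NAND_OOB)
        return REGION_PAGE0_OOB;
    return REGION_REST;
}
```

In these terms a write that stops at byte 2047 never reaches
REGION_PAGE0_OOB, while a full raw write of the block does.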

>>> I think we're going to always have u-boot boot the recovery kernel and
>>> have
>>> that bootstrap the production kernel. We plan to have a physical reset
>>
>>
>> You'll boot slower then, as you're basically booting twice.  Maybe
>> that doesn't matter for you.
>
>
> Boot time doesn't matter too much for our application, it shouldn't boot
> very often and if it does a couple extra seconds won't be a problem.
>
> What is your recovery plan in the case of the production kernel/file system
> becoming corrupt and unbootable? u-boot, per the environment variable, will
> try to load the production kernel, which then can't boot far enough to reset
> the environment variable to load the recovery kernel?

The production rootfs and kernel are read-only, so shouldn't become
corrupted on SLC nand.  So there isn't anything to detect that and
switch to a backup.  If something does happen, then recovery would be
by microSD card boot/reflash.  The ROM bootloader supposedly supports
two firmware images to boot from.  That's one of the reasons why the
output of mxsboot is so big, as it contains two images.  It's not
clear to me if the bootloader supports switching between image in any
useful way.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-04-20  1:22                 ` Trent Piepho
@ 2013-04-23  0:42                   ` Paul B. Henson
  2013-04-26  1:13                     ` Marek Vasut
  0 siblings, 1 reply; 23+ messages in thread
From: Paul B. Henson @ 2013-04-23  0:42 UTC (permalink / raw)
  To: u-boot

On 4/19/2013 6:22 PM, Trent Piepho wrote:

>> Hmm, possibly; I guess that would be conceptually simpler but
>> require more commands to execute to get done.
>
> Don't see why.  If mxsboot wrote both files at once, there'd be the
> same number of commands to generate them.

Well, you'd have to copy two files instead of one to say SD, and run the
mmcload command twice instead of once, but now I'm just being pedantic :).

> While U-boot gets the first one correct (other than the bad block
> marker), it doesn't write the 3 after it correctly.

So if the first one were ever corrupted, the boot would fail. It seems 
like that would be worth fixing.

> supports two firmware images to boot from.  That's one of the reasons
> why the output of mxsboot is so big, as it contains two images.  It's
> not clear to me if the bootloader supports switching between image in
> any useful way.

From the limited understanding I have of it, no, the second image is
only loaded if the first one fails.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-04-23  0:42                   ` Paul B. Henson
@ 2013-04-26  1:13                     ` Marek Vasut
  2013-04-29 20:54                       ` Paul B. Henson
  0 siblings, 1 reply; 23+ messages in thread
From: Marek Vasut @ 2013-04-26  1:13 UTC (permalink / raw)
  To: u-boot

Dear Paul B. Henson,

> On 4/19/2013 6:22 PM, Trent Piepho wrote:
> >> Hmm, possibly; I guess that would be conceptually simpler but
> >> require more commands to execute to get done.
> > 
> > Don't see why.  If mxsboot wrote both files at once, there'd be the
> > same number of commands to generate them.
> 
> Well, you'd have to copy two files instead of one to say SD, and run the
> mmcload command twice instead of once, but now I'm just being pedantic :).
> 
> > While U-boot gets the first one correct (other than the bad block
> > marker), it doesn't write the 3 after it correctly.
> 
> So if the first one were ever corrupted, the boot would fail. It seems
> like that would be worth fixing.
> 
> > supports two firmware images to boot from.  That's one of the reasons
> > why the output of mxsboot is so big, as it contains two images.  It's
> > not clear to me if the bootloader supports switching between image in
> > any useful way.
> 
>  From the limited understanding I have of it, no, the second image is
> only loaded if the first one fails.

I didn't really track the thread and I'm plenty busy, besides I had quite a 
clash with Trent in another thread, sorry about me being plenty unpleasant. 
Anyway, can you please sum what is going on and what you came up with?

Please also always CC Fabio, he is of great help.

Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-04-26  1:13                     ` Marek Vasut
@ 2013-04-29 20:54                       ` Paul B. Henson
  2013-04-29 21:01                         ` Marek Vasut
  2013-05-04  0:08                         ` Marek Vasut
  0 siblings, 2 replies; 23+ messages in thread
From: Paul B. Henson @ 2013-04-29 20:54 UTC (permalink / raw)
  To: u-boot

On 4/25/2013 6:13 PM, Marek Vasut wrote:

> I didn't really track the thread and I'm plenty busy, besides I had quite a
> clash with Trent in another thread, sorry about me being plenty unpleasant.
> Anyway, can you please sum what is going on and what you came up with?

Most of the analysis came from Trent, but I can try to summarize the 
findings.

One problem is that the current mxsboot misaligns the FCB's:

    for (i = 0; i < STRIDE_PAGES * STRIDE_COUNT; i += STRIDE_PAGES) {
          offset = i * nand_writesize;
         memcpy(buf + offset, fcbblock, nand_writesize + nand_oobsize);
     }

The code writes out nand_writesize+nand_oobsize bytes, but updates the 
offset only by nand_writesize, so every FCB but the first one isn't in 
the right place:

hexdump of the u-boot image:
00000000  00 00 00 00 00 00 00 00  00 00 00 00 d6 fc ff ff  |................|
00000010  46 43 42 20 00 00 00 01  50 3c 19 06 00 00 00 00  |FCB ....P<......|

00020000  00 00 00 00 00 00 00 00  00 00 00 00 d6 fc ff ff  |................|
00020010  46 43 42 20 00 00 00 01  50 3c 19 06 00 00 00 00  |FCB ....P<......|

The first FCB block is at offset 0. The second FCB block is at
offset 0x20000, 64 * 2048 bytes, not 64 * 2112 bytes, no OOB
data. The next two FCBs are at 0x40000 and 0x60000, again not where
they should be if they contained the OOB data.

Another problem is that the OOB section gets zeroed out.

If you look at mx28_nand_fcb_block(), which generates the FCB block,
it calls memset() to fill the entire 2112 bytes with zero. The
mx28_nand_fcb struct is 512 bytes, so copying the fcb struct to the
buffer at offset 12 and then writing the fcb ecc at offset 512+12 only
touches the first 1036 bytes. The remaining bytes, including the OOB,
will all be zero. A zero in the first OOB byte marks the NAND block as
bad. Burning the mxsboot generated image with nand write.raw makes the
blocks bad because it fills the OOB section with zeros.

It seems possibly either the FCB's should each be written separately, 
not overwriting the OOB area, or the image containing them needs to be 
aligned correctly and have proper OOB data?

The TL;DR summary is simply that mxsboot generates the image with 
misaligned FCB's and invalid OOB data.

While we're on the subject of mx28evk, I posted a couple simple 
questions to the list that I didn't see responses to; perhaps one of you 
guys knows the answers off the top of your head?

First, I was wondering why the mx28evk board config doesn't define
CONFIG_FIT? It seemed like that was the new preferred image format as
opposed to the legacy image. When I added it, it seemed to work fine, so
I wasn't sure why it's not there by default.

Second, the config defines a load address for the kernel and device 
tree, but none for a ramdisk image. Is there any particular address that 
would be best for that that could perhaps be added to the default 
environment?

Thanks...

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-04-29 20:54                       ` Paul B. Henson
@ 2013-04-29 21:01                         ` Marek Vasut
  2013-05-04  0:08                         ` Marek Vasut
  1 sibling, 0 replies; 23+ messages in thread
From: Marek Vasut @ 2013-04-29 21:01 UTC (permalink / raw)
  To: u-boot

Dear Paul B. Henson,

> On 4/25/2013 6:13 PM, Marek Vasut wrote:
> > I didn't really track the thread and I'm plenty busy, besides I had quite
> > a clash with Trent in another thread, sorry about me being plenty
> > unpleasant. Anyway, can you please sum what is going on and what you
> > came up with?
> 
> Most of the analysis came from Trent, but I can try to summarize the
> findings.
> 
> One problem is that the current mxsboot misaligns the FCB's:
> 
>     for (i = 0; i < STRIDE_PAGES * STRIDE_COUNT; i += STRIDE_PAGES) {
>           offset = i * nand_writesize;
>          memcpy(buf + offset, fcbblock, nand_writesize + nand_oobsize);
>      }
> 
> The code writes out nand_writesize+nand_oobsize bytes, but updates the
> offset only by nand_writesize, so every FCB but the first one isn't in
> the right place:
> 
> hexdump of the u-boot image:
> 00000000  00 00 00 00 00 00 00 00  00 00 00 00 d6 fc ff ff  |................|
> 00000010  46 43 42 20 00 00 00 01  50 3c 19 06 00 00 00 00  |FCB ....P<......|
>
> 00020000  00 00 00 00 00 00 00 00  00 00 00 00 d6 fc ff ff  |................|
> 00020010  46 43 42 20 00 00 00 01  50 3c 19 06 00 00 00 00  |FCB ....P<......|
> 
> The first FCB block is at offset 0. The second FCB block is at
> offset 0x20000, 64 * 2048 bytes, not 64 * 2112 bytes, no OOB
> data. The next two FCBs are at 0x40000 and 0x60000, again not where
> they should be if they contained the OOB data.
> 
> Another problem is that the OOB section gets zeroed out.
> 
> If you look at the mx28_nand_fcb_block() that generates the FCB block,
> it calls memset() to fill the entire 2112 bytes with zero. The
> mx28_nand_fcb struct is 512 bytes, so the copy to copy the fcb struct to
> the buffer at offset 12, and then the code to write the fcb ecc at
> offset 512+12 only writes the first 1036 bytes. The remaining bytes,
> including the OOB, will all be zero. A zero byte in the first OOB byte
> makes the NAND block as bad. Burning the mxsboot generated image with
> nand write.raw makes the blocks bad because it fills the OOB section
> with all zero.
> 
> It seems possibly either the FCB's should each be written separately,
> not overwriting the OOB area, or the image containing them needs to be
> aligned correctly and have proper OOB data?

I'll take one more stab at reading this tomorrow.

> The TL;DR summary is simply that mxsboot generates the image with
> misaligned FCB's and invalid OOB data.
> 
> While we're on the subject of mx28evk, I posted a couple simple
> questions to the list that I didn't see responses to; perhaps one of you
> guys knows the answers off the top of your head?

CC me and Fabio, then you have good chance of having them answered.

> First, I was wondering why the mx28evk board config doesn't define
> CONFIG_FIT? It seemed like that was the new preferred image format as
> opposed to the legacy image, when I added it seems to work fine so I
> wasn't sure why it's not there by default.

It's just disabled as we use uImage on those boards. Sure, you can enable FIT 
image and yes, it's the new preferred format.

> Second, the config defines a load address for the kernel and device
> tree, but none for a ramdisk image. Is there any particular address that
> would be best for that that could perhaps be added to the default
> environment?

I don't know many people who still use ramdisk, but any address above kernel 
works.

Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-04-29 20:54                       ` Paul B. Henson
  2013-04-29 21:01                         ` Marek Vasut
@ 2013-05-04  0:08                         ` Marek Vasut
  2013-05-04  6:21                           ` Trent Piepho
  1 sibling, 1 reply; 23+ messages in thread
From: Marek Vasut @ 2013-05-04  0:08 UTC (permalink / raw)
  To: u-boot

Dear Paul B. Henson,

> On 4/25/2013 6:13 PM, Marek Vasut wrote:
> > I didn't really track the thread and I'm plenty busy, besides I had quite
> > a clash with Trent in another thread, sorry about me being plenty
> > unpleasant. Anyway, can you please sum what is going on and what you
> > came up with?
> 
> Most of the analysis came from Trent, but I can try to summarize the
> findings.
> 
> One problem is that the current mxsboot misaligns the FCB's:
> 
>     for (i = 0; i < STRIDE_PAGES * STRIDE_COUNT; i += STRIDE_PAGES) {
>         offset = i * nand_writesize;
>         memcpy(buf + offset, fcbblock, nand_writesize + nand_oobsize);
>     }
> 
> The code writes out nand_writesize+nand_oobsize bytes, but updates the
> offset only by nand_writesize, so every FCB but the first one isn't in
> the right place:
> 
> hexdump of the u-boot image:
> 00000000  00 00 00 00 00 00 00 00  00 00 00 00 d6 fc ff ff  |................|
> 00000010  46 43 42 20 00 00 00 01  50 3c 19 06 00 00 00 00  |FCB ....P<......|
> 
> 00020000  00 00 00 00 00 00 00 00  00 00 00 00 d6 fc ff ff  |................|
> 00020010  46 43 42 20 00 00 00 01  50 3c 19 06 00 00 00 00  |FCB ....P<......|
> 
> The first FCB block is at offset 0. The second FCB block is at offset
> 0x20000, which is 64 * 2048 bytes rather than 64 * 2112 bytes, so the
> image leaves no room for OOB data. The next two FCBs are at 0x40000 and
> 0x60000, again not where they would be if each page carried OOB data.
> 
> Another problem is that the OOB section gets zeroed out.
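
The FCB offsets quoted above can be checked with a little arithmetic. The
sketch below is not taken from mxsboot; the helper names are hypothetical
and the geometry (2048 byte pages, 64 byte OOB, 64 pages between FCB
copies) is assumed from the figures in the quoted text. It only shows where
each FCB copy lands when the image holds page data alone versus when every
page also carries its OOB area.

```c
#include <stddef.h>

/* Geometry assumed from the numbers quoted in the thread. */
enum {
	PAGE_DATA_SIZE   = 2048, /* main area of one NAND page */
	PAGE_OOB_SIZE    = 64,   /* out-of-band area of one NAND page */
	FCB_STRIDE_PAGES = 64,   /* pages between consecutive FCB copies */
};

/* Offset of the n-th FCB copy when the image stores page data only. */
static size_t fcb_offset_data_only(unsigned int n)
{
	return (size_t)n * FCB_STRIDE_PAGES * PAGE_DATA_SIZE;
}

/* Offset of the n-th FCB copy when every page in the image includes OOB. */
static size_t fcb_offset_with_oob(unsigned int n)
{
	return (size_t)n * FCB_STRIDE_PAGES * (PAGE_DATA_SIZE + PAGE_OOB_SIZE);
}
```

With these assumptions the second copy sits at 0x20000 in a data-only image
and at 0x21000 in an image with OOB, which matches the observation that the
hexdump offsets are multiples of 64 * 2048.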

Ok, I see the problem, but I don't see an easy solution. For some reason, the
BCH doesn't compute the same ECC as mx28_nand_parity_13_8() when writing
regular data. Do you know why?

Best regards,
Marek Vasut

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-05-04  0:08                         ` Marek Vasut
@ 2013-05-04  6:21                           ` Trent Piepho
  2013-05-04 13:20                             ` Marek Vasut
  0 siblings, 1 reply; 23+ messages in thread
From: Trent Piepho @ 2013-05-04  6:21 UTC (permalink / raw)
  To: u-boot

On Fri, May 3, 2013 at 5:08 PM, Marek Vasut <marex@denx.de> wrote:

> > On 4/25/2013 6:13 PM, Marek Vasut wrote:
> > > I didn't really track the thread and I'm plenty busy, besides I had
> > > quite a clash with Trent in another thread, sorry about me being plenty
> > > unpleasant. Anyway, can you please sum what is going on and what you
> > > came up with?
> >
> > Most of the analysis came from Trent, but I can try to summarize the
> > findings.
> >
> > One problem is that the current mxsboot misaligns the FCB's:
> >
> >     for (i = 0; i < STRIDE_PAGES * STRIDE_COUNT; i += STRIDE_PAGES) {
> >         offset = i * nand_writesize;
> >         memcpy(buf + offset, fcbblock, nand_writesize + nand_oobsize);
> >     }
> >
> > The code writes out nand_writesize+nand_oobsize bytes, but updates the
> > offset only by nand_writesize, so every FCB but the first one isn't in
> > the right place:
> >
> > hexdump of the u-boot image:
> > 00000000  00 00 00 00 00 00 00 00  00 00 00 00 d6 fc ff ff  |................|
> > 00000010  46 43 42 20 00 00 00 01  50 3c 19 06 00 00 00 00  |FCB ....P<......|
> >
> > 00020000  00 00 00 00 00 00 00 00  00 00 00 00 d6 fc ff ff  |................|
> > 00020010  46 43 42 20 00 00 00 01  50 3c 19 06 00 00 00 00  |FCB ....P<......|
> >
> > The first FCB block is at offset 0. The second FCB block is at offset
> > 0x20000, which is 64 * 2048 bytes rather than 64 * 2112 bytes, so the
> > image leaves no room for OOB data. The next two FCBs are at 0x40000 and
> > 0x60000, again not where they would be if each page carried OOB data.
> >
> > Another problem is that the OOB section gets zeroed out.
>
> Ok, I see the problem, but I don't see an easy solution. For some reason,
> the BCH doesn't compute the same ECC as mx28_nand_parity_13_8() when
> writing regular data. Do you know why?
>

Completely different algorithms.  The BCH ECC is computed on a 512 byte
block at a time using a vastly more complex algorithm.  The 13_8 parity is
only used by the ROM bootloader code for checking the FCB blocks (and maybe
some other boot blocks too?  Not sure about that off the top of my head).  It
produces one byte of parity (only 6 bits used I think) for each byte of
data.  This is because the FCB blocks are not in BCH format, as those
blocks contain the BCH parameters that are necessary to decode BCH encoded
blocks.  Thus the FCB blocks are raw so the BCH parameters can be read,
then the rest of the blocks can be in BCH mode.

I think what needs to be done is for u-boot to have a nand write.raw that
writes a block in non-BCH mode WITHOUT writing the OOB data.  Since the OOB
data in the FCB blocks should not actually be written, it's not necessarily
a problem that mxsboot fails to include it in the image.  The problem
really is that u-boot nand write expects it and the image does not have it.
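
As an illustration of the "one parity byte per data byte" layout described
above, here is a minimal sketch. It is not the boot ROM's algorithm:
toy_parity() is a hypothetical placeholder (a single even-parity bit), used
only so the layout can be shown without guessing at the real 13/8 code. The
12 byte metadata prefix and the 1036 byte total are assumptions taken from
the "512+12" and "1036 bytes" figures mentioned earlier in the thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
	FCB_META_SIZE = 12,  /* assumed metadata prefix ("512+12" in the thread) */
	FCB_DATA_SIZE = 512, /* FCB payload area */
	FCB_RAW_SIZE  = FCB_META_SIZE + 2 * FCB_DATA_SIZE, /* 1036 bytes */
};

/* Placeholder, NOT the ROM's 13/8 code: even-parity bit of one byte. */
static uint8_t toy_parity(uint8_t b)
{
	b ^= b >> 4;
	b ^= b >> 2;
	b ^= b >> 1;
	return b & 1;
}

/*
 * Lay out zeroed metadata, the data bytes, then one parity byte for each
 * data byte. 'out' must hold at least FCB_RAW_SIZE bytes.
 * Returns the number of bytes written.
 */
static size_t fcb_layout_raw(uint8_t *out, const uint8_t data[FCB_DATA_SIZE])
{
	size_t i;

	memset(out, 0, FCB_META_SIZE);
	memcpy(out + FCB_META_SIZE, data, FCB_DATA_SIZE);
	for (i = 0; i < FCB_DATA_SIZE; i++)
		out[FCB_META_SIZE + FCB_DATA_SIZE + i] = toy_parity(data[i]);

	return FCB_RAW_SIZE;
}
```

The point of the sketch is only that each parity byte depends on exactly one
data byte, whereas BCH computes its ECC over a whole 512 byte chunk, so the
two cannot be expected to produce matching output.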

* [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks
  2013-05-04  6:21                           ` Trent Piepho
@ 2013-05-04 13:20                             ` Marek Vasut
  0 siblings, 0 replies; 23+ messages in thread
From: Marek Vasut @ 2013-05-04 13:20 UTC (permalink / raw)
  To: u-boot

Dear Trent Piepho,

> On Fri, May 3, 2013 at 5:08 PM, Marek Vasut <marex@denx.de> wrote:
> > > On 4/25/2013 6:13 PM, Marek Vasut wrote:
> > > > I didn't really track the thread and I'm plenty busy, besides I had
> > > > quite a clash with Trent in another thread, sorry about me being
> > > > plenty unpleasant. Anyway, can you please sum what is going on and
> > > > what you came up with?
> > > 
> > > Most of the analysis came from Trent, but I can try to summarize the
> > > findings.
> > > 
> > > One problem is that the current mxsboot misaligns the FCB's:
> > > 
> > >     for (i = 0; i < STRIDE_PAGES * STRIDE_COUNT; i += STRIDE_PAGES) {
> > >         offset = i * nand_writesize;
> > >         memcpy(buf + offset, fcbblock, nand_writesize + nand_oobsize);
> > >     }
> > > 
> > > The code writes out nand_writesize+nand_oobsize bytes, but updates the
> > > offset only by nand_writesize, so every FCB but the first one isn't in
> > > the right place:
> > > 
> > > hexdump of the u-boot image:
> > > 00000000  00 00 00 00 00 00 00 00  00 00 00 00 d6 fc ff ff  |................|
> > > 00000010  46 43 42 20 00 00 00 01  50 3c 19 06 00 00 00 00  |FCB ....P<......|
> > > 
> > > 00020000  00 00 00 00 00 00 00 00  00 00 00 00 d6 fc ff ff  |................|
> > > 00020010  46 43 42 20 00 00 00 01  50 3c 19 06 00 00 00 00  |FCB ....P<......|
> > > 
> > > The first FCB block is at offset 0. The second FCB block is at offset
> > > 0x20000, which is 64 * 2048 bytes rather than 64 * 2112 bytes, so the
> > > image leaves no room for OOB data. The next two FCBs are at 0x40000 and
> > > 0x60000, again not where they would be if each page carried OOB data.
> > > 
> > > Another problem is that the OOB section gets zeroed out.
> > 
> > Ok, I see the problem, but I don't see an easy solution. For some reason,
> > the BCH doesn't compute the same ECC as mx28_nand_parity_13_8() when
> > writing regular data. Do you know why?
> 
> Completely different algorithms.  The BCH ECC is computed on a 512 byte
> block at a time using a vastly more complex algorithm.  The 13_8 parity is
> only used by the ROM bootloader code for checking the FCB blocks (and maybe
> some other boot blocks too?

It's only the first block, I checked the bootrom source.

> Not sure about that off the top of my head).  It
> produces one byte of parity (only 6 bits used I think) for each byte of
> data.  This is because the FCB blocks are not in BCH format, as those
> blocks contain the BCH parameters that are necessary to decode BCH encoded
> blocks.  Thus the FCB blocks are raw so the BCH parameters can be read,
> then the rest of the blocks can be in BCH mode.
> 
> I think what needs to be done is for u-boot to have a nand write.raw that
> writes a block in non-BCH mode WITHOUT writing the OOB data.  Since the OOB
> data in the FCB blocks should not actually be written, it's not necessarily
> a problem that mxsboot fails to include it in the image.  The problem
> really is that u-boot nand write expects it and the image does not have it.

What OOB data exactly are you talking about? The 10b metadata at the
beginning? Or the parity blocks? Or something else? The other option might be
to replace the zeroed bytes in the image with 0xff, but I didn't test that.
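
The 0xff idea can be sketched as follows. This is untested against real
hardware and is not mxsboot code; the function names are hypothetical. It
relies on two things: the statement earlier in the thread that a zero in
the first OOB byte marks a block bad, and the fact that programming NAND can
only clear bits, so 0xff bytes leave cells in their erased state.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Treat a block as marked bad when the first OOB byte is not 0xFF. */
static int oob_marks_block_bad(const uint8_t *oob)
{
	return oob[0] != 0xFF;
}

/*
 * Fill everything after the first 'used' bytes of a raw page buffer
 * (data + OOB, 'page_total' bytes long) with 0xFF instead of zero.
 */
static void pad_page_with_ff(uint8_t *page, size_t used, size_t page_total)
{
	if (used < page_total)
		memset(page + used, 0xFF, page_total - used);
}
```

For a 2048 + 64 byte page holding a 1036 byte FCB, calling
pad_page_with_ff(page, 1036, 2112) would leave the first OOB byte at 0xFF, so
oob_marks_block_bad(page + 2048) would no longer report the block as bad.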

Best regards,
Marek Vasut

Thread overview: 23+ messages
2013-03-19  0:50 [U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks Paul B. Henson
2013-03-19 23:23 ` Scott Wood
2013-03-20 21:20   ` Paul B. Henson
2013-03-20 21:24     ` Scott Wood
2013-04-04 10:09 ` Trent Piepho
2013-04-06  4:28   ` Paul B. Henson
2013-04-06  7:18     ` Trent Piepho
2013-04-11  0:20       ` Paul B. Henson
2013-04-11 12:03         ` Trent Piepho
2013-04-11 18:33           ` Paul B. Henson
2013-04-11 23:25             ` Trent Piepho
2013-04-20  1:03               ` Paul B. Henson
2013-04-20  1:22                 ` Trent Piepho
2013-04-23  0:42                   ` Paul B. Henson
2013-04-26  1:13                     ` Marek Vasut
2013-04-29 20:54                       ` Paul B. Henson
2013-04-29 21:01                         ` Marek Vasut
2013-05-04  0:08                         ` Marek Vasut
2013-05-04  6:21                           ` Trent Piepho
2013-05-04 13:20                             ` Marek Vasut
2013-04-13 14:42         ` Marek Vasut
2013-04-13 16:31           ` Trent Piepho
2013-04-13 18:26             ` Marek Vasut