* ARM: sunxi: Experiences NAND flash
@ 2015-08-11 12:16 ` Olliver Schinagl
  0 siblings, 0 replies; 13+ messages in thread
From: Olliver Schinagl @ 2015-08-11 12:16 UTC (permalink / raw)
  To: linux-arm-kernel

Hello everybody,

We are working with Boris and Roy's patch series to get the NAND
flash chip working on Olimex OLinuXino Lime2 boards. Initially
everything looked fine, but we noticed that occasionally (after a
power cycle or power cut) UBI fails to mount the partition. It is not
easy to reproduce, but it has failed on 5 of the 30 boards we have.

U-boot reports the following:
UBI: default fastmap pool size: 100
UBI: default fastmap WL pool size: 25
UBI: attaching mtd1 to ubi0
UBI: scanning is finished
UBI init error 22
Error reading superblock on volume 'ubi:boot' errno=-19!
ubifsmount - mount UBIFS volume

whereas the linux kernel booted from sd card gives:
ubiattach /dev/ubi_ctrl -m 0
[  100.560704] ubi0: default fastmap pool size: 8
[  100.565186] ubi0: default fastmap WL pool size: 4
[  100.570100] ubi0: attaching mtd0
[  100.590469] ubi0: scanning is finished
[  100.594732] ubi0 error: ubi_read_volume_table: the layout volume was not found
[  100.602675] ubi0 error: ubi_attach_mtd_dev: failed to attach mtd0, error -22
ubiattach: error!: cannot attach mtd0
            error 22 (Invalid argument)

The U-Boot version we are using is a few months out of date:
U-Boot 2015.07-rc2-g2540c39 (Aug 04 2015 - 16:09:02 +0200) Allwinner 
Technology
arm-none-eabi-gcc (4.8.4-1+11-1) 4.8.4 20141219 (release)
GNU ld (2.25-5+5+b1) 2.25

but the kernel is fairly up to date:
4.2.0-rc4-opinicus-g8ec3671


Now, I know that the MTD code is all very new and largely untested.
What I am curious about is: a) have other people actually tried the MTD
code on Allwinner hardware, and b) has anybody else encountered this
issue?

It is not easily reproducible (repeatedly toggling a machine on and off
has not triggered it yet), but it does happen.
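
When checking a batch of boards, the failure signature from the kernel log
above can be detected mechanically. The following is a minimal sketch,
assuming the kernel log is fed on standard input; the function name is
hypothetical and the pattern is taken from the messages quoted above.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the log on stdin contains the UBI attach
# failure shown above, fails otherwise.
ubi_attach_failed() {
    grep -Eq 'ubi[0-9]+ error: ubi_attach_mtd_dev: failed to attach mtd[0-9]+'
}

# Example use on a live system (needs permission to read the kernel log):
# dmesg | ubi_attach_failed && echo "UBI attach failed on this board"
```

The pattern stops at the mtd number, so it matches regardless of the error
code that follows.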

Olliver

* [U-Boot] [linux-sunxi] ARM: sunxi: Experiences NAND flash
  2015-08-11 12:16 ` [U-Boot] " Olliver Schinagl
  (?)
@ 2015-08-12 13:31 ` Olliver Schinagl
  2015-08-17  7:48     ` [U-Boot] " Boris Brezillon
  -1 siblings, 1 reply; 13+ messages in thread
From: Olliver Schinagl @ 2015-08-12 13:31 UTC (permalink / raw)
  To: u-boot

Hey Yassin,

I'm afraid. The strange thing, which seems closely related, is that
writing a file to the flash alternately fails and succeeds. It never
fails or succeeds twice in a row! And this happens on any board and any
partition.


root@system-020502824168:/boot# nandwrite -p /dev/mtd0 u-boot-sunxi-with-spl.bin
Writing data to block 0 at offset 0x0
libmtd: error!: cannot write 8192 bytes to mtd0 (eraseblock 0, offset 32768)
         error 5 (Input/output error)
Erasing failed write from 00000000 to 0x1fffff
nandwrite: error!: Data was only partially written due to error
            error 5 (Input/output error)
root@system-020502824168:/boot# nandwrite -p /dev/mtd0 u-boot-sunxi-with-spl.bin
Writing data to block 0 at offset 0x0
root@system-020502824168:/boot# nandwrite -p /dev/mtd0 u-boot-sunxi-with-spl.bin
Writing data to block 0 at offset 0x0
libmtd: error!: cannot write 8192 bytes to mtd0 (eraseblock 0, offset 32768)
         error 5 (Input/output error)
Erasing failed write from 00000000 to 0x1fffff
nandwrite: error!: Data was only partially written due to error
            error 5 (Input/output error)
root@system-020502824168:/boot# nandwrite -p /dev/mtd0 u-boot-sunxi-with-spl.bin
Writing data to block 0 at offset 0x0
root@system-020502824168:/boot# nandwrite -p /dev/mtd2 u-boot-sunxi-with-spl.bin
Writing data to block 0 at offset 0x0
root@system-020502824168:/boot# nandwrite -p /dev/mtd2 u-boot-sunxi-with-spl.bin
Writing data to block 0 at offset 0x0
libmtd: error!: cannot write 8192 bytes to mtd2 (eraseblock 0, offset 32768)
         error 5 (Input/output error)
Erasing failed write from 00000000 to 0x1fffff
Writing data to block 1 at offset 0x200000
root@system-020502824168:/boot# nandwrite -p /dev/mtd2 u-boot-sunxi-with-spl.bin
Writing data to block 0 at offset 0x0
root@system-020502824168:/boot# nandwrite -p /dev/mtd2 u-boot-sunxi-with-spl.bin
Writing data to block 0 at offset 0x0
libmtd: error!: cannot write 8192 bytes to mtd2 (eraseblock 0, offset 32768)
         error 5 (Input/output error)
Erasing failed write from 00000000 to 0x1fffff
Writing data to block 1 at offset 0x200000
libmtd: error!: cannot write 8192 bytes to mtd2 (eraseblock 1, offset 32768)
         error 5 (Input/output error)
Erasing failed write from 0x200000 to 0x3fffff
Writing data to block 2 at offset 0x400000
root@system-020502824168:/boot# nandwrite -p /dev/mtd2 u-boot-sunxi-with-spl.bin
Writing data to block 0 at offset 0x0
root@system-020502824168:/boot# nandwrite -p /dev/mtd2 u-boot-sunxi-with-spl.bin
Writing data to block 0 at offset 0x0
libmtd: error!: cannot write 8192 bytes to mtd2 (eraseblock 0, offset 32768)
         error 5 (Input/output error)
Erasing failed write from 00000000 to 0x1fffff
Writing data to block 1 at offset 0x200000
root@system-020502824168:/boot# nandwrite -p /dev/mtd2 u-boot-sunxi-with-spl.bin
Writing data to block 0 at offset 0x0
root@system-020502824168:/boot# nandwrite -p /dev/mtd2 u-boot-sunxi-with-spl.bin
Writing data to block 0 at offset 0x0
libmtd: error!: cannot write 8192 bytes to mtd2 (eraseblock 0, offset 32768)
         error 5 (Input/output error)
Erasing failed write from 00000000 to 0x1fffff
Writing data to block 1 at offset 0x200000
libmtd: error!: cannot write 8192 bytes to mtd2 (eraseblock 1, offset 32768)
         error 5 (Input/output error)
Erasing failed write from 0x200000 to 0x3fffff
Writing data to block 2 at offset 0x400000
libmtd: error!: cannot write 8192 bytes to mtd2 (eraseblock 2, offset 32768)
         error 5 (Input/output error)
Erasing failed write from 0x400000 to 0x5fffff
Writing data to block 3 at offset 0x600000


On 12-08-15 03:19, Yassin Jaffer wrote:
> Hi Oliver
> Did you try without fastmap enabled?
>
> On Tue, Aug 11, 2015 at 10:16 PM, Olliver Schinagl
> <oliver+list@schinagl.nl> wrote:
>
>     Hello everybody,
>
>     We are working with Boris and Roy's patch series on getting the
>     NAND flash chip working on Olimex OLinuXino Lime2 boards.
>     Initially, everything looks fine, but we noticed that occasionally
>     (after power/cycle or power cut) ubi fails to mount the partition.
>     It is not something easily enough to reproduce, but it has failed
>     on 5 boards out of 30 we have.
>
>     U-boot reports the following:
>     UBI: default fastmap pool size: 100
>     UBI: default fastmap WL pool size: 25
>     UBI: attaching mtd1 to ubi0
>     UBI: scanning is finished
>     UBI init error 22
>     Error reading superblock on volume 'ubi:boot' errno=-19!
>     ubifsmount - mount UBIFS volume
>
>     whereas the linux kernel booted from sd card gives:
>     ubiattach /dev/ubi_ctrl -m 0
>     [  100.560704] ubi0: default fastmap pool size: 8
>     [  100.565186] ubi0: default fastmap WL pool size: 4
>     [  100.570100] ubi0: attaching mtd0
>     [  100.590469] ubi0: scanning is finished
>     [  100.594732] ubi0 error: ubi_read_volume_table: the layout
>     volume was not found
>     [  100.602675] ubi0 error: ubi_attach_mtd_dev: failed to attach
>     mtd0, error -22
>     ubiattach: error!: cannot attach mtd0
>                error 22 (Invalid argument)
>
>     The u-boot version we are using is a few months out of date
>     U-Boot 2015.07-rc2-g2540c39 (Aug 04 2015 - 16:09:02 +0200)
>     Allwinner Technology
>     arm-none-eabi-gcc (4.8.4-1+11-1) 4.8.4 20141219 (release)
>     GNU ld (2.25-5+5+b1) 2.25
>
>     but the kernel is fairly up to date:
>     4.2.0-rc4-opinicus-g8ec3671
>
>
>     Now I know that the mtd stuff is all very new and all very
>     untested, what I am curious about is a) have other people actually
>     tried the mtd stuff on Allwinner hardware, and b) has anybody
>     encountered this issue as well?
>
>     It's not something very easily reproducible (toggling a machine
>     on/off repeatedly did not trigger it yet) but it does happen.
>
>     Olliver

* ARM: sunxi: Experiences NAND flash
  2015-08-11 12:16 ` [U-Boot] " Olliver Schinagl
@ 2015-08-17  7:34   ` Boris Brezillon
  -1 siblings, 0 replies; 13+ messages in thread
From: Boris Brezillon @ 2015-08-17  7:34 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Oliver,

Sorry for the late reply (I was on vacation for the last 2 weeks).

On Tue, 11 Aug 2015 14:16:52 +0200
Olliver Schinagl <oliver+list@schinagl.nl> wrote:

> Hello everybody,
> 
> We are working with Boris and Roy's patch series on getting the NAND 
> flash chip working on Olimex OLinuXino Lime2 boards. Initially, 
> everything looks fine, but we noticed that occasionally (after 
> power/cycle or power cut) ubi fails to mount the partition. It is not 
> something easily enough to reproduce, but it has failed on 5 boards out 
> of 30 we have.

I remember warning you about that problem before: MLC NANDs are not as
reliable as SLC ones (please read my presentation about MLC support in
Linux [1]). I also remember recommending using an SLC chip if you were
tight on time to avoid dealing with all these MLC related problems, but
you decided to go for the MLC solution.

Back to your problem now: what you're seeing here is probably caused by
interrupted PROGRAM operations on paired pages (see pages 17, 18 and 26
to 32 of my presentation for more information).

> 
> U-boot reports the following:
> UBI: default fastmap pool size: 100
> UBI: default fastmap WL pool size: 25
> UBI: attaching mtd1 to ubi0
> UBI: scanning is finished
> UBI init error 22
> Error reading superblock on volume 'ubi:boot' errno=-19!
> ubifsmount - mount UBIFS volume
> 
> whereas the linux kernel booted from sd card gives:
> ubiattach /dev/ubi_ctrl -m 0
> [  100.560704] ubi0: default fastmap pool size: 8
> [  100.565186] ubi0: default fastmap WL pool size: 4
> [  100.570100] ubi0: attaching mtd0
> [  100.590469] ubi0: scanning is finished
> [  100.594732] ubi0 error: ubi_read_volume_table: the layout volume was 
> not found
> [  100.602675] ubi0 error: ubi_attach_mtd_dev: failed to attach mtd0, 
> error -22
> ubiattach: error!: cannot attach mtd0
>             error 22 (Invalid argument)
> 
> The u-boot version we are using is a few months out of date
> U-Boot 2015.07-rc2-g2540c39 (Aug 04 2015 - 16:09:02 +0200) Allwinner 
> Technology
> arm-none-eabi-gcc (4.8.4-1+11-1) 4.8.4 20141219 (release)
> GNU ld (2.25-5+5+b1) 2.25
> 
> but the kernel is fairly up to date:
> 4.2.0-rc4-opinicus-g8ec3671
> 
> 
> Now I know that the mtd stuff is all very new and all very untested, 
> what I am curious about is a) have other people actually tried the mtd 
> stuff on Allwinner hardware, and b) has anybody encountered this issue 
> as well?

Yes, we did. So far we're using the NAND in SLC mode to address this
problem. It seems to work, but you also lose half the NAND capacity.

> 
> It's not something very easily reproducible (toggling a machine on/off 
> repeatedly did not trigger it yet) but it does happen.

I managed to reproduce it by faking a power cut directly in the NAND
core code (by sending a RESET command to the NAND chip in the middle of
a program operation), and I can confirm that SLC mode addresses the
problem.

Anyway, remember that MLC NANDs have other sources of unreliability
(e.g. the unstable bits problem).

Best Regards,

Boris


[1]http://events.linuxfoundation.org/sites/events/files/slides/brezillon-mlc-nand_0.pdf

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

* [linux-sunxi] ARM: sunxi: Experiences NAND flash
  2015-08-12 13:31 ` [U-Boot] [linux-sunxi] " Olliver Schinagl
@ 2015-08-17  7:48     ` Boris Brezillon
  0 siblings, 0 replies; 13+ messages in thread
From: Boris Brezillon @ 2015-08-17  7:48 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Oliver,

On Wed, 12 Aug 2015 15:31:15 +0200
Olliver Schinagl <oliver+list@schinagl.nl> wrote:

> Hey Yassin,
> 
> I'm affraid. The strange thing that seems very related here is that when 
> writing a file onto the flash, it fails and succeeds alternating. It 
> never fails or succeeds twice in a row! And this on any board and any 
> partition.

I don't know if you only pasted half your command sequence, but it
seems you are writing twice on the same memory region without erasing it,
and this is prohibited on NAND devices.

Try with:

# flash_erase /dev/mtd0 0 0 && nandwrite -p /dev/mtd0 u-boot-sunxi-with-spl.bin
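
The erase-before-write rule can be wrapped in a small helper. This is a
sketch, not a tested flashing procedure: the function name and the DRY_RUN
switch are made up for illustration, and it assumes the mtd-utils
flash_erase and nandwrite tools, where a start offset of 0 and a block
count of 0 tell flash_erase to erase the whole partition.

```shell
#!/bin/sh
# Erase an MTD partition, then write an image to it.
# With DRY_RUN set to a non-empty value the commands are only printed.
flash_image() {
    dev=$1
    img=$2
    run=${DRY_RUN:+echo}
    # Start offset 0, block count 0: erase the whole partition.
    $run flash_erase "$dev" 0 0 || return 1
    # -p pads the data to a full page.
    $run nandwrite -p "$dev" "$img"
}

# Print what would be run, without touching the flash:
DRY_RUN=1
flash_image /dev/mtd0 u-boot-sunxi-with-spl.bin
```

Unsetting DRY_RUN makes the helper run the two commands for real, and it
stops before writing if the erase step fails.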

Best Regards,

Boris

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

* [linux-sunxi] Re: ARM: sunxi: Experiences NAND flash
  2015-08-17  7:34   ` [U-Boot] " Boris Brezillon
@ 2015-08-17  7:51     ` Michal Suchanek
  -1 siblings, 0 replies; 13+ messages in thread
From: Michal Suchanek @ 2015-08-17  7:51 UTC (permalink / raw)
  To: linux-arm-kernel

Hello

On 17 August 2015 at 09:34, Boris Brezillon
<boris.brezillon@free-electrons.com> wrote:
> Hi Oliver,
>
> Sorry for the late reply (I was in vacation for the last 2 weeks)
>
> On Tue, 11 Aug 2015 14:16:52 +0200
> Olliver Schinagl <oliver+list@schinagl.nl> wrote:
>

>>
>> Now I know that the mtd stuff is all very new and all very untested,
>> what I am curious about is a) have other people actually tried the mtd
>> stuff on Allwinner hardware, and b) has anybody encountered this issue
>> as well?
>
> Yes we did. So far we're using the NAND in SLC mode to address this
> problem. It seems to work, but you also loose half the NAND capacity.
>


What is needed to use the NAND in SLC mode?

Presumably you need to know something about its organization?

Is this data available for chips commonly used on sunxi devices?

Thanks

Michal

* ARM: sunxi: Experiences NAND flash
  2015-08-17  7:34   ` [U-Boot] " Boris Brezillon
@ 2015-08-17  8:30     ` Roy Spliet
  -1 siblings, 0 replies; 13+ messages in thread
From: Roy Spliet @ 2015-08-17  8:30 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

Reply in-line

On 17-08-15 at 08:34, Boris Brezillon wrote:
> Hi Oliver,
>
> Sorry for the late reply (I was in vacation for the last 2 weeks)
>
> On Tue, 11 Aug 2015 14:16:52 +0200
> Olliver Schinagl <oliver+list@schinagl.nl> wrote:
>
>> Hello everybody,
>>
>> We are working with Boris and Roy's patch series on getting the NAND
>> flash chip working on Olimex OLinuXino Lime2 boards. Initially,
>> everything looks fine, but we noticed that occasionally (after
>> power/cycle or power cut) ubi fails to mount the partition. It is not
>> something easily enough to reproduce, but it has failed on 5 boards out
>> of 30 we have.
> I remember warning you about that problem before: MLC NANDs are not as
> reliable as SLC ones (please read my presentation about MLC support in
> Linux [1]). I also remember recommending using an SLC chip if you were
> tight on time to avoid dealing with all these MLC related problems, but
> you decided to go for the MLC solution.
>
> Back to your problem now, what you're seeing here is probably caused by
> interrupted PROGRAM operations on paired pages (page 17, 18 and 26 to 32
> of my presentation for more information).
In his defence: we looked at it, and from what we could tell it is not
possible to find an affordable SLC chip that the Allwinner A10/A20
BootROM would even boot from. In general, chips below 8K page size
require 64-bit ECC strength to operate, which in turn requires more OOB
area than any chip would provide. This limitation is, in my opinion, a
design fault on Allwinner's side, and I hope that their future SoCs can
boot with more relaxed ECC settings to accommodate cheap SLC chips,
but right now there is nothing we can do to change that situation.
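
For a rough sense of the numbers behind that claim, here is a
back-of-the-envelope sketch. It assumes a BCH code over 1024-byte chunks
with 14 parity bits per correctable bit (the usual upper bound for a
codeword of that length); the chunk size, the function name and the
2048 + 64 byte page geometry are illustrative assumptions, not figures
taken from the BootROM.

```shell
#!/bin/sh
# Parity bytes per ECC chunk for a BCH code: at most strength * m bits,
# where m is the Galois field order (assumed 14 for 1024-byte chunks).
bch_parity_bytes() {
    strength=$1
    m=$2
    echo $(( (strength * m + 7) / 8 ))
}

# Hypothetical small-page SLC geometry: 2048 data bytes + 64 OOB bytes.
per_chunk=$(bch_parity_bytes 64 14)
chunks=$(( 2048 / 1024 ))
echo "ECC bytes needed per page: $(( per_chunk * chunks )), OOB available: 64"
```

Under these assumptions a 2K page would need 224 bytes of ECC against 64
bytes of OOB, which is the mismatch described above.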
>> U-boot reports the following:
>> UBI: default fastmap pool size: 100
>> UBI: default fastmap WL pool size: 25
>> UBI: attaching mtd1 to ubi0
>> UBI: scanning is finished
>> UBI init error 22
>> Error reading superblock on volume 'ubi:boot' errno=-19!
>> ubifsmount - mount UBIFS volume
>>
>> whereas the linux kernel booted from sd card gives:
>> ubiattach /dev/ubi_ctrl -m 0
>> [  100.560704] ubi0: default fastmap pool size: 8
>> [  100.565186] ubi0: default fastmap WL pool size: 4
>> [  100.570100] ubi0: attaching mtd0
>> [  100.590469] ubi0: scanning is finished
>> [  100.594732] ubi0 error: ubi_read_volume_table: the layout volume was
>> not found
>> [  100.602675] ubi0 error: ubi_attach_mtd_dev: failed to attach mtd0,
>> error -22
>> ubiattach: error!: cannot attach mtd0
>>              error 22 (Invalid argument)
>>
>> The u-boot version we are using is a few months out of date
>> U-Boot 2015.07-rc2-g2540c39 (Aug 04 2015 - 16:09:02 +0200) Allwinner
>> Technology
>> arm-none-eabi-gcc (4.8.4-1+11-1) 4.8.4 20141219 (release)
>> GNU ld (2.25-5+5+b1) 2.25
>>
>> but the kernel is fairly up to date:
>> 4.2.0-rc4-opinicus-g8ec3671
>>
>>
>> Now I know that the mtd stuff is all very new and all very untested,
>> what I am curious about is a) have other people actually tried the mtd
>> stuff on Allwinner hardware, and b) has anybody encountered this issue
>> as well?
> Yes we did. So far we're using the NAND in SLC mode to address this
> problem. It seems to work, but you also loose half the NAND capacity.
So as requested by someone else: how exactly does that work? Can we just 
give your NAND driver a mapping between shared pages and instruct it to 
ignore half, or does the driver require some serious patchery?
Cheers,

Roy
>
>> It's not something very easily reproducible (toggling a machine on/off
>> repeatedly did not trigger it yet) but it does happen.
> I managed to reproduce it by faking a power cut directly in the NAND
> core code (by sending a RESET command to the NAND chip in the middle of
> a program operation), and I can confirm SLC mode address the problem.
>
> Anyway, remember that MLC NANDs have other sources of unreliability
> (e.g the unstable bits problem).
>
> Best Regards,
>
> Boris
>
>
> [1]http://events.linuxfoundation.org/sites/events/files/slides/brezillon-mlc-nand_0.pdf
>

* ARM: sunxi: Experiences NAND flash
  2015-08-17  8:30     ` [U-Boot] " Roy Spliet
@ 2015-08-17  9:03       ` Boris Brezillon
  -1 siblings, 0 replies; 13+ messages in thread
From: Boris Brezillon @ 2015-08-17  9:03 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Roy,

On Mon, 17 Aug 2015 09:30:38 +0100
Roy Spliet <seven@nimrod-online.com> wrote:

> Hello,
> 
> Reply in-line
> 
> On 17-08-15 at 08:34, Boris Brezillon wrote:
> > Hi Oliver,
> >
> > Sorry for the late reply (I was in vacation for the last 2 weeks)
> >
> > On Tue, 11 Aug 2015 14:16:52 +0200
> > Olliver Schinagl <oliver+list@schinagl.nl> wrote:
> >
> >> Hello everybody,
> >>
> >> We are working with Boris and Roy's patch series on getting the NAND
> >> flash chip working on Olimex OLinuXino Lime2 boards. Initially,
> >> everything looks fine, but we noticed that occasionally (after
> >> power cycle or power cut) ubi fails to mount the partition. It is not
> >> something that is easy to reproduce, but it has failed on 5 boards out
> >> of the 30 we have.
> > I remember warning you about that problem before: MLC NANDs are not as
> > reliable as SLC ones (please read my presentation about MLC support in
> > Linux [1]). I also remember recommending using an SLC chip if you were
> > tight on time to avoid dealing with all these MLC related problems, but
> > you decided to go for the MLC solution.
> >
> > Back to your problem now, what you're seeing here is probably caused by
> > interrupted PROGRAM operations on paired pages (see pages 17, 18 and 26
> > to 32 of my presentation for more information).
> In his defence: we looked at it, and from what we could tell it is not
> possible to find an affordable SLC chip that the Allwinner A10/A20
> BootROM would even boot from. In general, chips below 8K page size
> require 64-bit ECC strength to operate, which in turn requires more OOB
> area than any chip would provide. This limitation is in my opinion a
> design fault on Allwinner's side and I hope that their future SoCs can
> boot with more relaxed ECC settings to accommodate cheap SLC chips,
> but right now there is nothing we can do to change that situation.

Hm, according to this table [1], the BROM also tries the 64bit/512bytes
scheme, which should fit in most SLC NANDs (if you have a NAND with 2k
+ 64byte pages, and you only use 512 bytes per page it leaves 1600 bytes
for your ECC data).
This being said, supporting this kind of layout in Linux can be
complicated: I remember we (Roy and I) tried to patch the nand part code
to tweak the data/OOB split for this case, but we didn't manage
to get it to work.
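
To make the arithmetic above concrete, here is a rough sketch of the space
budget. It assumes a BCH code whose parity is bounded by m * t bits per
chunk (m being the Galois field order, t the correction strength) and a
512-byte ECC chunk; the function names are made up for illustration, and
the real Allwinner ECC engine may use different chunk and parity sizes.

```python
# Back-of-the-envelope space budget for the "64bit/512bytes" layout.
# Assumption (not from the thread): BCH parity is bounded by m * t bits,
# where m is the smallest field order whose codeword can hold data + parity.

def bch_parity_bytes(data_bytes: int, strength_bits: int) -> int:
    """Upper bound on the BCH parity size for one ECC chunk, in bytes."""
    data_bits = data_bytes * 8
    m = 1
    # A BCH codeword over GF(2^m) is at most 2^m - 1 bits long.
    while (1 << m) - 1 < data_bits + m * strength_bits:
        m += 1
    parity_bits = m * strength_bits
    return (parity_bits + 7) // 8  # round up to whole bytes


def spare_bytes(page_bytes: int, oob_bytes: int, used_data_bytes: int) -> int:
    """Bytes left in page + OOB once the payload has been placed."""
    return page_bytes + oob_bytes - used_data_bytes


PAGE, OOB, CHUNK, STRENGTH = 2048, 64, 512, 64

spare = spare_bytes(PAGE, OOB, CHUNK)
parity = bch_parity_bytes(CHUNK, STRENGTH)

print("spare bytes when storing one 512-byte chunk per page:", spare)
print("estimated parity bytes for that chunk:", parity)
print("fits:", parity <= spare)

# Using the whole 2k page instead would need parity for every chunk,
# which has to fit in the OOB area alone.
chunks = PAGE // CHUNK
print("full-page parity vs OOB:", chunks * parity, "vs", OOB)
```

Under these assumptions the single-chunk layout leaves 1600 spare bytes for
roughly 104 bytes of parity, while protecting the full 2k page would need
about 416 bytes of parity against only 64 bytes of OOB.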

> >> U-boot reports the following:
> >> UBI: default fastmap pool size: 100
> >> UBI: default fastmap WL pool size: 25
> >> UBI: attaching mtd1 to ubi0
> >> UBI: scanning is finished
> >> UBI init error 22
> >> Error reading superblock on volume 'ubi:boot' errno=-19!
> >> ubifsmount - mount UBIFS volume
> >>
> >> whereas the linux kernel booted from sd card gives:
> >> ubiattach /dev/ubi_ctrl -m 0
> >> [  100.560704] ubi0: default fastmap pool size: 8
> >> [  100.565186] ubi0: default fastmap WL pool size: 4
> >> [  100.570100] ubi0: attaching mtd0
> >> [  100.590469] ubi0: scanning is finished
> >> [  100.594732] ubi0 error: ubi_read_volume_table: the layout volume was
> >> not found
> >> [  100.602675] ubi0 error: ubi_attach_mtd_dev: failed to attach mtd0,
> >> error -22
> >> ubiattach: error!: cannot attach mtd0
> >>              error 22 (Invalid argument)
> >>
> >> The u-boot version we are using is a few months out of date
> >> U-Boot 2015.07-rc2-g2540c39 (Aug 04 2015 - 16:09:02 +0200) Allwinner
> >> Technology
> >> arm-none-eabi-gcc (4.8.4-1+11-1) 4.8.4 20141219 (release)
> >> GNU ld (2.25-5+5+b1) 2.25
> >>
> >> but the kernel is fairly up to date:
> >> 4.2.0-rc4-opinicus-g8ec3671
> >>
> >>
> >> Now I know that the mtd stuff is all very new and all very untested,
> >> what I am curious about is a) have other people actually tried the mtd
> >> stuff on Allwinner hardware, and b) has anybody encountered this issue
> >> as well?
> > Yes we did. So far we're using the NAND in SLC mode to address this
> > problem. It seems to work, but you also lose half the NAND capacity.
> So as requested by someone else: how exactly does that work? Can we just 
> give your NAND driver a mapping between shared pages and instruct it to 
> ignore half, or does the driver require some serious patchery?

I only have a prototype for this SLC mode, and the code is available
here [2].
In short, the NAND core layer checks for SLC mode activation, and if it
is activated it only exposes half the erase block capacity.
This also requires some chip specific code to enable/disable the SLC
mode and adjust the row/column addresses before passing them to the
controller driver.
Note that SLC mode can be enabled per partition, which lets us declare
the SPL partition in MLC mode so that BROM can still load the SPL.
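
As an illustration only, the idea of exposing half of each erase block and
translating page addresses could be sketched as follows. This is not the
prototype code from [2]: the names are hypothetical, the block geometry
(256 pages of 8 KiB) is an example value, and the "even pages are the safe
ones" pairing is a placeholder, since the real page pairing and the way SLC
mode is enabled are chip specific.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EraseBlock:
    pages_per_block: int  # physical pages per erase block in MLC mode
    page_size: int        # bytes per page


def exposed_pages(block: EraseBlock, slc_mode: bool) -> int:
    """Pages visible to the upper layers; SLC mode exposes only half."""
    return block.pages_per_block // 2 if slc_mode else block.pages_per_block


def exposed_block_size(block: EraseBlock, slc_mode: bool) -> int:
    """Erase block size in bytes as seen by the upper layers."""
    return exposed_pages(block, slc_mode) * block.page_size


def hypothetical_lower_pages(block: EraseBlock) -> List[int]:
    # Placeholder pairing: pretend every even physical page is the page of
    # a pair that is safe to use. Real chips define their own pairing.
    return list(range(0, block.pages_per_block, 2))


def to_physical_page(logical_page: int, lower_pages: List[int]) -> int:
    """Translate an SLC-mode page index into a physical page index."""
    if not 0 <= logical_page < len(lower_pages):
        raise ValueError("page outside the SLC-mode erase block")
    return lower_pages[logical_page]


block = EraseBlock(pages_per_block=256, page_size=8192)
table = hypothetical_lower_pages(block)

print("MLC block size:", exposed_block_size(block, slc_mode=False))
print("SLC block size:", exposed_block_size(block, slc_mode=True))
print("logical page 3 maps to physical page", to_physical_page(3, table))
```

With this placeholder geometry the exposed erase block shrinks from 2 MiB
to 1 MiB, which is the capacity loss mentioned earlier in the thread.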

Best Regards,

Boris

[1]http://linux-sunxi.org/NAND#More_information_on_BROM_NAND
[2]https://github.com/NextThingCo/CHIP-linux/tree/nextthing/4.2/chip-nand-slc-mode

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2015-08-17  9:03 UTC | newest]

Thread overview: 13+ messages
2015-08-11 12:16 ARM: sunxi: Experiences NAND flash Olliver Schinagl
2015-08-11 12:16 ` [U-Boot] " Olliver Schinagl
2015-08-12 13:31 ` [U-Boot] [linux-sunxi] " Olliver Schinagl
2015-08-17  7:48   ` Boris Brezillon
2015-08-17  7:48     ` [U-Boot] " Boris Brezillon
2015-08-17  7:34 ` Boris Brezillon
2015-08-17  7:34   ` [U-Boot] " Boris Brezillon
2015-08-17  7:51   ` [linux-sunxi] " Michal Suchanek
2015-08-17  7:51     ` [U-Boot] " Michal Suchanek
2015-08-17  8:30   ` Roy Spliet
2015-08-17  8:30     ` [U-Boot] " Roy Spliet
2015-08-17  9:03     ` Boris Brezillon
2015-08-17  9:03       ` [U-Boot] " Boris Brezillon