* UBI assert failed in ubi_wl_init
@ 2018-02-01  6:44 Timo Ketola
  2018-02-04 13:44 ` Richard Weinberger
  0 siblings, 1 reply; 17+ messages in thread
From: Timo Ketola @ 2018-02-01  6:44 UTC (permalink / raw)
  To: linux-mtd

Hi,

Could someone please give some ideas...

I'm trying to deploy UBI fastmap. Basically it seems to work, but there
are some messages I'm not comfortable with.

I made a UBI image with ubinize, with a fixed-size empty volume
(autoresize does not seem to work with fastmap at all), and flashed it
with U-Boot 'nand write.trimffs'. On the first boot, UBI complains that
it cannot write the fastmap:


> ubi1: default fastmap pool size: 256
> ubi1: default fastmap WL pool size: 128
> ubi1: attaching mtd1
> ubi1: scanning is finished
> ubi1: attached mtd1 (name "user", size 1984 MiB)
> ubi1: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
> ubi1: min./max. I/O unit sizes: 2048/2048, sub-page size 2048
> ubi1: VID header offset: 2048 (aligned 2048), data offset: 4096
> ubi1: good PEBs: 15864, bad PEBs: 8, corrupted PEBs: 0
> ubi1: user volume: 1, internal volumes: 1, max. volumes count: 128
> ubi1: max/mean erase counter: 0/0, WL threshold: 4096, image sequence number: 186152832
> ubi1: available PEBs: 2331, total reserved PEBs: 13533, PEBs reserved for bad PEB handling: 312
> ubi1: background thread "ubi_bgt1d" started, PID 98
...
> ubi1 error: ubi_update_fastmap: could not get any free erase block
> ubi1 warning: ubi_update_fastmap: Unable to write new fastmap, err=-28
> UBIFS (ubi1:0): background thread "ubifs_bgt1_0" started, PID 105
> UBIFS (ubi1:0): UBIFS: mounted UBI device 1, volume 0, name "user"
> UBIFS (ubi1:0): LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes
> UBIFS (ubi1:0): FS size: 1675702272 bytes (1598 MiB, 13197 LEBs), journal size 9023488 bytes (8 MiB, 72 LEBs)
> UBIFS (ubi1:0): reserved for root: 0 bytes (0 KiB)
> UBIFS (ubi1:0): media format: w4/r0 (latest is w5/r0), UUID C8D0E8D3-8394-445A-9089-95BD864D5949, big LPT model


However, the image still mounts and works, and on subsequent boots the
fastmap also seems to be functional. Scanning is fast. But now UBI
complains about a failing assert:


> ubi1: default fastmap pool size: 256
> ubi1: default fastmap WL pool size: 128
> ubi1: attaching mtd1
> ubi1: attached by fastmap
> ubi1: fastmap pool size: 256
> ubi1: fastmap WL pool size: 128
> UBI assert failed in ubi_wl_init at 1666 (pid 1)
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.1 #3
> Hardware name: Freescale i.MX6 SoloX (Device Tree)
> Backtrace: 
> [<c010a1b8>] (dump_backtrace) from [<c010a454>] (show_stack+0x18/0x1c)
>  r7:eee39b00 r6:c0912ac8 r5:60000013 r4:00000000
> [<c010a43c>] (show_stack) from [<c029edd0>] (dump_stack+0x80/0xa0)
> [<c029ed50>] (dump_stack) from [<c03d2394>] (ubi_wl_init+0x430/0x540)
>  r7:eee39b00 r6:eee39b04 r5:00003df7 r4:eee38040
> [<c03d1f64>] (ubi_wl_init) from [<c03d4c94>] (ubi_attach+0x28c/0x364)
>  r10:00000800 r9:eee37040 r8:00000040 r7:eee39c00 r6:eee39b00 r5:eee38040
>  r4:00000000
> [<c03d4a08>] (ubi_attach) from [<c03c9668>] (ubi_attach_mtd_dev+0x788/0xbc0)
>  r9:eee38080 r8:00000840 r7:ef349bc0 r6:00000001 r5:eee38040 r4:00000800
> [<c03c8ee0>] (ubi_attach_mtd_dev) from [<c0823d60>] (ubi_init+0x148/0x1e8)
>  r10:00000000 r9:c0933600 r8:00000000 r7:c0989024 r6:ef349bc0 r5:c0989074
>  r4:00000001
> [<c0823c18>] (ubi_init) from [<c0101780>] (do_one_initcall+0xb0/0x154)
>  r8:00000000 r7:c083edd0 r6:c0835838 r5:c0823c18 r4:00000007
> [<c01016d0>] (do_one_initcall) from [<c0800e30>] (kernel_init_freeable+0x130/0x1f8)
>  r8:c0933600 r7:c083edd0 r6:c0835838 r5:0000008d r4:00000007
> [<c0800d00>] (kernel_init_freeable) from [<c056ab6c>] (kernel_init+0x10/0x114)
>  r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c056ab5c r4:00000000
> [<c056ab5c>] (kernel_init) from [<c01070c8>] (ret_from_fork+0x14/0x2c)
>  r5:c056ab5c r4:00000000
> ubi1: attached mtd1 (name "user", size 1984 MiB)
> ubi1: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
> ubi1: min./max. I/O unit sizes: 2048/2048, sub-page size 2048
> ubi1: VID header offset: 2048 (aligned 2048), data offset: 4096
> ubi1: good PEBs: 15864, bad PEBs: 8, corrupted PEBs: 0
> ubi1: user volume: 1, internal volumes: 1, max. volumes count: 128
> ubi1: max/mean erase counter: 2/0, WL threshold: 4096, image sequence number: 186152832
> ubi1: available PEBs: 2331, total reserved PEBs: 13533, PEBs reserved for bad PEB handling: 312
> ubi1: background thread "ubi_bgt1d" started, PID 98
...
> UBIFS (ubi1:0): background thread "ubifs_bgt1_0" started, PID 105
> UBIFS (ubi1:0): UBIFS: mounted UBI device 1, volume 0, name "user"
> UBIFS (ubi1:0): LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes
> UBIFS (ubi1:0): FS size: 1675702272 bytes (1598 MiB, 13197 LEBs), journal size 9023488 bytes (8 MiB, 72 LEBs)
> UBIFS (ubi1:0): reserved for root: 0 bytes (0 KiB)
> UBIFS (ubi1:0): media format: w4/r0 (latest is w5/r0), UUID C8D0E8D3-8394-445A-9089-95BD864D5949, big LPT model


Line 1666 reads:

> 	ubi_assert(ubi->good_peb_count == found_pebs);


Why does the assert fail? Where should I look to fix this? Is this
something to worry about?
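For readers hitting the same assert: `good_peb_count` is essentially the partition's PEB count minus the bad blocks found during attach, and the assert fires when fastmap accounts for a different number of PEBs. A quick sanity check of the numbers reported above (a sketch; all values are taken from the boot log, nothing is read from real hardware):

```shell
# Recompute the PEB accounting printed in the attach log above.
mtd_size=$((1984 * 1024 * 1024))   # "attached mtd1 (... size 1984 MiB)"
peb_size=$((128 * 1024))           # "PEB size: 131072 bytes (128 KiB)"
bad_pebs=8                         # "bad PEBs: 8"
total_pebs=$((mtd_size / peb_size))
good_pebs=$((total_pebs - bad_pebs))
echo "total=$total_pebs good=$good_pebs"   # total=15872 good=15864
```

The `good=15864` result matches the "good PEBs: 15864" line in the log, so the partition-side accounting itself is consistent.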

I'm on mainline 4.12.1. ubinize version: 1.5.2

--

Timo


* Re: UBI assert failed in ubi_wl_init
  2018-02-01  6:44 UBI assert failed in ubi_wl_init Timo Ketola
@ 2018-02-04 13:44 ` Richard Weinberger
  2018-02-05  6:39   ` Timo Ketola
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Weinberger @ 2018-02-04 13:44 UTC (permalink / raw)
  To: Timo Ketola; +Cc: linux-mtd

Timo,

On Thu, Feb 1, 2018 at 7:44 AM, Timo Ketola <timo@exertus.fi> wrote:
> Hi,
>
> Could someone please give some ideas...
>
> I'm trying to deploy UBI fastmap. Basically it seems to work, but there
> are some messages I'm not comfortable with.
>
> I made a UBI image with ubinize, with a fixed-size empty volume
> (autoresize does not seem to work with fastmap at all), and flashed it
> with U-Boot 'nand write.trimffs'. On the first boot, UBI complains that
> it cannot write the fastmap:
>
>
>> ubi1: default fastmap pool size: 256
>> ubi1: default fastmap WL pool size: 128
>> ubi1: attaching mtd1
>> ubi1: scanning is finished
>> ubi1: attached mtd1 (name "user", size 1984 MiB)
>> ubi1: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
>> ubi1: min./max. I/O unit sizes: 2048/2048, sub-page size 2048
>> ubi1: VID header offset: 2048 (aligned 2048), data offset: 4096
>> ubi1: good PEBs: 15864, bad PEBs: 8, corrupted PEBs: 0
>> ubi1: user volume: 1, internal volumes: 1, max. volumes count: 128
>> ubi1: max/mean erase counter: 0/0, WL threshold: 4096, image sequence number: 186152832
>> ubi1: available PEBs: 2331, total reserved PEBs: 13533, PEBs reserved for bad PEB handling: 312
>> ubi1: background thread "ubi_bgt1d" started, PID 98
> ...
>> ubi1 error: ubi_update_fastmap: could not get any free erase block
>> ubi1 warning: ubi_update_fastmap: Unable to write new fastmap, err=-28

Most likely because you have absolutely no free erase blocks available,
and fastmap was also unable to move blocks around to produce free space
within the first 64 blocks.

>> UBIFS (ubi1:0): background thread "ubifs_bgt1_0" started, PID 105
>> UBIFS (ubi1:0): UBIFS: mounted UBI device 1, volume 0, name "user"
>> UBIFS (ubi1:0): LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes
>> UBIFS (ubi1:0): FS size: 1675702272 bytes (1598 MiB, 13197 LEBs), journal size 9023488 bytes (8 MiB, 72 LEBs)
>> UBIFS (ubi1:0): reserved for root: 0 bytes (0 KiB)
>> UBIFS (ubi1:0): media format: w4/r0 (latest is w5/r0), UUID C8D0E8D3-8394-445A-9089-95BD864D5949, big LPT model
>
>
> However, the image still mounts and works, and on subsequent boots the
> fastmap also seems to be functional. Scanning is fast. But now UBI
> complains about a failing assert:
>
>
>> ubi1: default fastmap pool size: 256
>> ubi1: default fastmap WL pool size: 128
>> ubi1: attaching mtd1
>> ubi1: attached by fastmap
>> ubi1: fastmap pool size: 256
>> ubi1: fastmap WL pool size: 128
>> UBI assert failed in ubi_wl_init at 1666 (pid 1)
>> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.1 #3
>> Hardware name: Freescale i.MX6 SoloX (Device Tree)
>> Backtrace:
>> [<c010a1b8>] (dump_backtrace) from [<c010a454>] (show_stack+0x18/0x1c)
>>  r7:eee39b00 r6:c0912ac8 r5:60000013 r4:00000000
>> [<c010a43c>] (show_stack) from [<c029edd0>] (dump_stack+0x80/0xa0)
>> [<c029ed50>] (dump_stack) from [<c03d2394>] (ubi_wl_init+0x430/0x540)
>>  r7:eee39b00 r6:eee39b04 r5:00003df7 r4:eee38040
>> [<c03d1f64>] (ubi_wl_init) from [<c03d4c94>] (ubi_attach+0x28c/0x364)
>>  r10:00000800 r9:eee37040 r8:00000040 r7:eee39c00 r6:eee39b00 r5:eee38040
>>  r4:00000000
>> [<c03d4a08>] (ubi_attach) from [<c03c9668>] (ubi_attach_mtd_dev+0x788/0xbc0)
>>  r9:eee38080 r8:00000840 r7:ef349bc0 r6:00000001 r5:eee38040 r4:00000800
>> [<c03c8ee0>] (ubi_attach_mtd_dev) from [<c0823d60>] (ubi_init+0x148/0x1e8)
>>  r10:00000000 r9:c0933600 r8:00000000 r7:c0989024 r6:ef349bc0 r5:c0989074
>>  r4:00000001
>> [<c0823c18>] (ubi_init) from [<c0101780>] (do_one_initcall+0xb0/0x154)
>>  r8:00000000 r7:c083edd0 r6:c0835838 r5:c0823c18 r4:00000007
>> [<c01016d0>] (do_one_initcall) from [<c0800e30>] (kernel_init_freeable+0x130/0x1f8)
>>  r8:c0933600 r7:c083edd0 r6:c0835838 r5:0000008d r4:00000007
>> [<c0800d00>] (kernel_init_freeable) from [<c056ab6c>] (kernel_init+0x10/0x114)
>>  r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c056ab5c r4:00000000
>> [<c056ab5c>] (kernel_init) from [<c01070c8>] (ret_from_fork+0x14/0x2c)
>>  r5:c056ab5c r4:00000000
>> ubi1: attached mtd1 (name "user", size 1984 MiB)
>> ubi1: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
>> ubi1: min./max. I/O unit sizes: 2048/2048, sub-page size 2048
>> ubi1: VID header offset: 2048 (aligned 2048), data offset: 4096
>> ubi1: good PEBs: 15864, bad PEBs: 8, corrupted PEBs: 0
>> ubi1: user volume: 1, internal volumes: 1, max. volumes count: 128
>> ubi1: max/mean erase counter: 2/0, WL threshold: 4096, image sequence number: 186152832
>> ubi1: available PEBs: 2331, total reserved PEBs: 13533, PEBs reserved for bad PEB handling: 312
>> ubi1: background thread "ubi_bgt1d" started, PID 98
> ...
>> UBIFS (ubi1:0): background thread "ubifs_bgt1_0" started, PID 105
>> UBIFS (ubi1:0): UBIFS: mounted UBI device 1, volume 0, name "user"
>> UBIFS (ubi1:0): LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes
>> UBIFS (ubi1:0): FS size: 1675702272 bytes (1598 MiB, 13197 LEBs), journal size 9023488 bytes (8 MiB, 72 LEBs)
>> UBIFS (ubi1:0): reserved for root: 0 bytes (0 KiB)
>> UBIFS (ubi1:0): media format: w4/r0 (latest is w5/r0), UUID C8D0E8D3-8394-445A-9089-95BD864D5949, big LPT model
>
>
> Line 1666 reads:
>
>>       ubi_assert(ubi->good_peb_count == found_pebs);
>
>
> Why does the assert fail? Where should I look to fix this? Is this
> something to worry about?

Now things get interesting. If UBI is unable to write a new fastmap, it
makes sure that the current one, which is now outdated and wrong, is
erased. This seems to have happened in your case, since I don't see an
"Unable to invalidate current fastmap!" message and UBI did not switch
to read-only mode.

As it looks, the fastmap invalidation code has a problem and you end up
with an outdated fastmap, which is obviously bad.

Can you share this UBI image with me so that I can have a quick look at
why the invalidation did not work?

-- 
Thanks,
//richard


* Re: UBI assert failed in ubi_wl_init
  2018-02-04 13:44 ` Richard Weinberger
@ 2018-02-05  6:39   ` Timo Ketola
  2018-02-12 22:49     ` Richard Weinberger
  0 siblings, 1 reply; 17+ messages in thread
From: Timo Ketola @ 2018-02-05  6:39 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd

On 04.02.2018 15:44, Richard Weinberger wrote:
> Can you share this UBI image with me so that I can have a quick look at
> why the invalidation did not work?

Thank you for looking at this!

I sent the image privately so as not to bother the list with a 3 MiB file.

It really is an empty image.


Build commands:

mkfs.ubifs -e 0x1f000 -c 2048 -m 0x800 -o userfs.ubifs

ubinize -o userfs.ubi -m 0x800 -p 0x20000 ubinize.cfg


ubinize.cfg:

[ubifs]
mode=ubi
vol_id=0
vol_type=dynamic
vol_name=user
vol_alignment=1
image=userfs.ubifs
vol_size=1800MiB
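As a rough cross-check (an illustrative sketch, not part of the original mail), the number of LEBs a fixed `vol_size` consumes can be derived from the LEB size passed to mkfs.ubifs:

```shell
leb_size=$((0x1f000))              # -e 0x1f000 from the mkfs.ubifs command (126976 bytes)
vol_size=$((1800 * 1024 * 1024))   # vol_size=1800MiB from ubinize.cfg
lebs=$(( (vol_size + leb_size - 1) / leb_size ))   # round up to whole LEBs
echo "$lebs LEBs needed"           # 14865 LEBs needed
```

Those LEBs all have to be backed by PEBs in the partition, on top of UBI's internal reservations, which is why a volume sized close to the partition leaves fastmap little room to work with.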


--

Timo


* Re: UBI assert failed in ubi_wl_init
  2018-02-05  6:39   ` Timo Ketola
@ 2018-02-12 22:49     ` Richard Weinberger
  2018-02-13 11:58       ` Timo Ketola
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Weinberger @ 2018-02-12 22:49 UTC (permalink / raw)
  To: Timo Ketola, linux-mtd

Timo,

Am Montag, 5. Februar 2018, 07:39:50 CET schrieb Timo Ketola:
> On 04.02.2018 15:44, Richard Weinberger wrote:
> > Can you share this UBI image with me so that I can have a quick look at
> > why the invalidation did not work?
> 
> Thank you for looking at this!
> 
> I sent the image privately to not bother the list with 3M file.
> 
> It really is an empty image.

I meant the image _after_ UBI failed to place the fastmap. :)

Thanks,
//richard

-- 
sigma star gmbh - Eduard-Bodem-Gasse 6 - 6020 Innsbruck - Austria
ATU66964118 - FN 374287y


* Re: UBI assert failed in ubi_wl_init
  2018-02-12 22:49     ` Richard Weinberger
@ 2018-02-13 11:58       ` Timo Ketola
  2018-02-13 12:42         ` Richard Weinberger
  0 siblings, 1 reply; 17+ messages in thread
From: Timo Ketola @ 2018-02-13 11:58 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd

On 13.02.2018 00:49, Richard Weinberger wrote:
> I meant the image _after_ UBI failed to place the fastmap. :)

Argh, sorry, shame on me.

https://drive.google.com/open?id=1t-auduKG2bwzpzZf09rV4O9F7GUN2viJ

https://drive.google.com/open?id=1ahKl8pf3svKQSvz6VQ-7xUQsyfkVuO7k

The first one is taken after the first boot and the second after the
second boot.

I took them with:

# dd if=/dev/mtd1 of=/media/sda1/ubi1.bin bs=1M

--

Timo


* Re: UBI assert failed in ubi_wl_init
  2018-02-13 11:58       ` Timo Ketola
@ 2018-02-13 12:42         ` Richard Weinberger
  2018-02-13 14:00           ` Timo Ketola
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Weinberger @ 2018-02-13 12:42 UTC (permalink / raw)
  To: Timo Ketola; +Cc: linux-mtd

Timo,

Am Dienstag, 13. Februar 2018, 12:58:50 CET schrieb Timo Ketola:
> On 13.02.2018 00:49, Richard Weinberger wrote:
> > I meant the image _after_ UBI failed to place the fastmap. :)
> 
> Argh, sorry, shame on me.
> 
> https://drive.google.com/open?id=1t-auduKG2bwzpzZf09rV4O9F7GUN2viJ

This image contains no fastmap at all.
 
> https://drive.google.com/open?id=1ahKl8pf3svKQSvz6VQ-7xUQsyfkVuO7k
> 
> The first one is taken after the first boot and the second after the
> second boot.

This one contains a good fastmap (a full scan comes to the same result).
The UBIFS seems good but contains no files/dirs.

So, this is not what you have described in your problem report.
Can you please double check?

Thanks,
//richard

-- 
sigma star gmbh - Eduard-Bodem-Gasse 6 - 6020 Innsbruck - Austria
ATU66964118 - FN 374287y


* Re: UBI assert failed in ubi_wl_init
  2018-02-13 12:42         ` Richard Weinberger
@ 2018-02-13 14:00           ` Timo Ketola
  2018-02-18 20:31             ` Richard Weinberger
  0 siblings, 1 reply; 17+ messages in thread
From: Timo Ketola @ 2018-02-13 14:00 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd

On 13.02.2018 14:42, Richard Weinberger wrote:
> So, this is not what you have described in your problem report.
> Can you please double check?

Hmm, how?

To get those images, I burned a fresh empty image again into the NAND.
On the first boot there was again in the dmesg this:


> [   18.804716] ubi1 error: ubi_update_fastmap: could not get any free erase block
> [   18.810745] ubi1 warning: ubi_update_fastmap: Unable to write new fastmap, err=-28

And then I took the first image. Then I rebooted and this time there was
in the dmesg this:

> [    4.231126] UBI assert failed in ubi_wl_init at 1666 (pid 1)

and backtrace. Then I took the second image.

Every boot after the first one generates the failed assertion into the
dmesg.

What should I do next?

--

Timo


* Re: UBI assert failed in ubi_wl_init
  2018-02-13 14:00           ` Timo Ketola
@ 2018-02-18 20:31             ` Richard Weinberger
  2018-03-16 13:34               ` Timo Ketola
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Weinberger @ 2018-02-18 20:31 UTC (permalink / raw)
  To: Timo Ketola; +Cc: Richard Weinberger, linux-mtd

On Tue, Feb 13, 2018 at 3:00 PM, Timo Ketola <timo@exertus.fi> wrote:
> On 13.02.2018 14:42, Richard Weinberger wrote:
>> So, this is not what you have described in your problem report.
>> Can you please double check?
>
> Hmm, how?
>
> To get those images, I burned a fresh empty image again into the NAND.
> On the first boot there was again in the dmesg this:
>
>
>> [   18.804716] ubi1 error: ubi_update_fastmap: could not get any free erase block
>> [   18.810745] ubi1 warning: ubi_update_fastmap: Unable to write new fastmap, err=-28
>
> And then I took the first image. Then I rebooted and this time there was
> in the dmesg this:
>
>> [    4.231126] UBI assert failed in ubi_wl_init at 1666 (pid 1)
>
> and backtrace. Then I took the second image.
>
> Every boot after the first one generates the failed assertion into the
> dmesg.
>
> What should I do next?

Okay. That's a bit odd. Maybe my analysis is wrong. Can you try the
following, replay the image to your NAND and attach again.
Then you should get the same UBI assert a second time.

-- 
Thanks,
//richard


* Re: UBI assert failed in ubi_wl_init
  2018-02-18 20:31             ` Richard Weinberger
@ 2018-03-16 13:34               ` Timo Ketola
  2018-03-19  7:13                 ` Timo Ketola
  2018-03-20 22:02                 ` Richard Weinberger
  0 siblings, 2 replies; 17+ messages in thread
From: Timo Ketola @ 2018-03-16 13:34 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: Richard Weinberger, linux-mtd

On 18.02.2018 22:31, Richard Weinberger wrote:
> Okay. That's a bit odd. Maybe my analysis is wrong. Can you try the
> following, replay the image to your NAND and attach again.
> Then you should get the same UBI assert a second time.

Of course I can, but what would be the difference from simply rebooting?
I get the assertion failure on every boot from the second one onwards.
Before I commit to that, let's look at what I found recently.

I already tried the replay but bumped into ECC errors. Due to that and
other things I switched the kernel to Freescale IMX 4.9.74 with Boundary
fixes. I reduced the size of the user partition to 1800M and got rid of
the 'could not get any free erase block' and 'Unable to write new
fastmap' errors. The assertion failures from the second boot onwards
remained.

I wondered why I'm not seeing the same issue with my smaller 64M system
mtd. Then I reduced the user partition to 48M and the mtd to 64M and the
issue vanished. Then I kept the partition at 48M but restored the mtd
size to 1800M and the issue returned. If I make the mtd size 0x4e680000
or smaller there is no issue. If I make it 0x4e6a0000 or larger the
issue is there. Every time I erased the NAND and burned a fresh image.
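Expressed in 128 KiB erase blocks (a quick sketch of the arithmetic; the two sizes are the ones quoted above), the borderline is a single PEB wide:

```shell
peb_size=$((0x20000))              # 128 KiB erase block, from the attach log
ok=$((0x4e680000 / peb_size))      # largest mtd size without the assert
bad=$((0x4e6a0000 / peb_size))     # smallest mtd size with the assert
echo "ok=$ok bad=$bad"             # ok=10036 bad=10037
```

So the issue appears to flip on or off depending on exactly one extra PEB in the partition.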

What do you want me to do next?

--

Timo


* Re: UBI assert failed in ubi_wl_init
  2018-03-16 13:34               ` Timo Ketola
@ 2018-03-19  7:13                 ` Timo Ketola
  2018-03-20 22:02                 ` Richard Weinberger
  1 sibling, 0 replies; 17+ messages in thread
From: Timo Ketola @ 2018-03-19  7:13 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: Richard Weinberger, linux-mtd

Hi again,

One further detail:

I tried to increase the UBI volume size from 48M to 64M. It didn't
affect the borderline, where the issue starts to manifest.

I already mixed up some terms and numbers, so let me wrap up everything
here:

           | UBI volume size MiB:
  mtd size |   48 |   64 | 1800
-----------+------+------+------
0x04000000 |      |  na  |  na
0x4E680000 |      |      |  na
0x4E6A0000 |   X  |   X  |  na
0x7C000000 |   X  |   X  |   X

X: The issue is there
na: obviously not possible

UBI volume size is what I put into the vol_size variable in ubinize.cfg.

By 'the issue' I mean specifically the assertion failure:

> UBI assert failed in ubi_wl_init at 1666 (pid 1)

--

Timo


* Re: UBI assert failed in ubi_wl_init
  2018-03-16 13:34               ` Timo Ketola
  2018-03-19  7:13                 ` Timo Ketola
@ 2018-03-20 22:02                 ` Richard Weinberger
  2018-03-22 13:57                   ` Timo Ketola
  1 sibling, 1 reply; 17+ messages in thread
From: Richard Weinberger @ 2018-03-20 22:02 UTC (permalink / raw)
  To: Timo Ketola; +Cc: Richard Weinberger, linux-mtd

Timo,

On Fri, Mar 16, 2018 at 2:34 PM, Timo Ketola <timo@exertus.fi> wrote:
> On 18.02.2018 22:31, Richard Weinberger wrote:
>> Okay. That's a bit odd. Maybe my analysis is wrong. Can you try the
>> following, replay the image to your NAND and attach again.
>> Then you should get the same UBI assert a second time.
>
> Of course I can but what would be the difference to simple rebooting? I
> get the assertion failure on every boot from the second one onwards.
> Before I commit to that one, let's look what I found recently.
>
> I already tried the replay but bumped into ECC errors. Due to that and
> other things I switched the kernel to Freescale IMX 4.9.74 with Boundary
> fixes. I reduced the size of the user partition to 1800M and got rid of
> the 'could not get any free erase block' and 'Unable to write new
> fastmap' errors. The assertion failures from the second boot onwards
> remained.
>
> I wondered why I'm not seeing the same issue with my smaller 64M system
> mtd. Then I reduced the user partition to 48M and the mtd to 64M and the
> issue vanished. Then I kept the partition at 48M but restored the mtd
> size to 1800M and the issue returned. If I make the mtd size 0x4e680000
> or smaller there is no issue. If I make it 0x4e6a0000 or larger the
> issue is there. Every time I erased the NAND and burned a fresh image.
>
> What do you want me to do next?

Did you verify the image or not? :)
I asked you to do so because the image you sent me does not make sense
and I'm not sure whether I can trust it. Hence I asked for a
double-check.

You seem to be running into some weird corner case, which is why I'd
like to inspect your fastmap.

-- 
Thanks,
//richard


* Re: UBI assert failed in ubi_wl_init
  2018-03-20 22:02                 ` Richard Weinberger
@ 2018-03-22 13:57                   ` Timo Ketola
  2018-03-22 15:05                     ` Richard Weinberger
  0 siblings, 1 reply; 17+ messages in thread
From: Timo Ketola @ 2018-03-22 13:57 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: Richard Weinberger, linux-mtd

On 21.03.2018 00:02, Richard Weinberger wrote:
> Timo,
> 
> On Fri, Mar 16, 2018 at 2:34 PM, Timo Ketola <timo@exertus.fi> wrote:
>> On 18.02.2018 22:31, Richard Weinberger wrote:
>>> Okay. That's a bit odd. Maybe my analysis is wrong. Can you try the
>>> following, replay the image to your NAND and attach again.
>>> Then you should get the same UBI assert a second time.
>>
>> Of course I can but what would be the difference to simple rebooting? I
>> get the assertion failure on every boot from the second one onwards.
>> Before I commit to that one, let's look what I found recently.
>>
>> I already tried the replay but bumped into ECC errors. Due to that and
>> other things I switched the kernel to Freescale IMX 4.9.74 with Boundary
>> fixes. I reduced the size of the user partition to 1800M and got rid of
>> the 'could not get any free erase block' and 'Unable to write new
>> fastmap' errors. The assertion failures from the second boot onwards
>> remained.
>>
>> I wondered why I'm not seeing the same issue with my smaller 64M system
>> mtd. Then I reduced the user partition to 48M and the mtd to 64M and the
>> issue vanished. Then I kept the partition at 48M but restored the mtd
>> size to 1800M and the issue returned. If I make the mtd size 0x4e680000
>> or smaller there is no issue. If I make it 0x4e6a0000 or larger the
>> issue is there. Every time I erased the NAND and burned a fresh image.
>>
>> What do you want me to do next?
> 
> Did you verify the image or not? :)
> I asked you to do so because the image you sent me does not make sense
> and I'm not sure whether I can trust it.
> Hence I asked for a double-check.

Ok, sorry.

Here is a dump after the second boot:

https://drive.google.com/open?id=1oa2lV_OB_tC-SX_c4jylnMXK6x1xhj2o

After I dumped it I erased the whole mtd, wrote the dump back there,
dumped another dump and verified that it was identical with the first
one. Then I rebooted and observed that the issue was still there just as
before.

--

Timo


* Re: UBI assert failed in ubi_wl_init
  2018-03-22 13:57                   ` Timo Ketola
@ 2018-03-22 15:05                     ` Richard Weinberger
  2018-03-22 15:40                       ` Timo Ketola
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Weinberger @ 2018-03-22 15:05 UTC (permalink / raw)
  To: Timo Ketola; +Cc: Richard Weinberger, linux-mtd

Timo,

Am Donnerstag, 22. März 2018, 14:57:57 CET schrieb Timo Ketola:
> On 21.03.2018 00:02, Richard Weinberger wrote:
> > Timo,
> > 
> > On Fri, Mar 16, 2018 at 2:34 PM, Timo Ketola <timo@exertus.fi> wrote:
> >> On 18.02.2018 22:31, Richard Weinberger wrote:
> >>> Okay. That's a bit odd. Maybe my analysis is wrong. Can you try the
> >>> following, replay the image to your NAND and attach again.
> >>> Then you should get the same UBI assert a second time.
> >> 
> >> Of course I can but what would be the difference to simple rebooting? I
> >> get the assertion failure on every boot from the second one onwards.
> >> Before I commit to that one, let's look what I found recently.
> >> 
> >> I already tried the replay but bumped into ECC errors. Due to that and
> >> other things I switched the kernel to Freescale IMX 4.9.74 with Boundary
> >> fixes. I reduced the size of the user partition to 1800M and got rid of
> >> the 'could not get any free erase block' and 'Unable to write new
> >> fastmap' errors. The assertion failures from the second boot onwards
> >> remained.
> >> 
> >> I wondered why I'm not seeing the same issue with my smaller 64M system
> >> mtd. Then I reduced the user partition to 48M and the mtd to 64M and the
> >> issue vanished. Then I kept the partition at 48M but restored the mtd
> >> size to 1800M and the issue returned. If I make the mtd size 0x4e680000
> >> or smaller there is no issue. If I make it 0x4e6a0000 or larger the
> >> issue is there. Every time I erased the NAND and burned a fresh image.
> >> 
> >> What do you want me to do next?
> > 
> > Did you verify the image or not? :)
> > I asked you to do so because the image you sent me does not make sense
> > and I'm not sure whether I can trust it.
> > Hence I asked for a double-check.
> 
> Ok, sorry.
> 
> Here is a dump after the second boot:
> 
> https://drive.google.com/open?id=1oa2lV_OB_tC-SX_c4jylnMXK6x1xhj2o
> 
> After I dumped it I erased the whole mtd, wrote the dump back there,
> dumped another dump and verified that it was identical with the first
> one. Then I rebooted and observed that the issue was still there just as
> before.

This image contains a Fastmap, but it is invalid.
The image has a length of 2079326208 bytes, your PEB size is 128KiB.
Therefore PEB count is 15864.
But the Fastmap references PEBs 15865, 15866 and 15867.

This explains the failing assert, Fastmap talks about PEBs which are unknown 
to UBI.
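That arithmetic can be checked directly (a sketch recomputing it from the numbers above):

```shell
image_len=2079326208               # length of the dump, as reported above
peb_size=$((128 * 1024))           # 128 KiB PEB size
peb_count=$((image_len / peb_size))
echo "$peb_count"                  # 15864, so valid PEB numbers are 0..15863
for ref in 15865 15866 15867; do   # PEBs the fastmap references
    [ "$ref" -lt "$peb_count" ] || echo "PEB $ref is out of range"
done
```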

Can it be that your mtd partition layout is bad/broken?

Thanks,
//richard

-- 
sigma star gmbh - Eduard-Bodem-Gasse 6 - 6020 Innsbruck - Austria
ATU66964118 - FN 374287y


* Re: UBI assert failed in ubi_wl_init
  2018-03-22 15:05                     ` Richard Weinberger
@ 2018-03-22 15:40                       ` Timo Ketola
  2018-03-22 15:48                         ` Richard Weinberger
  0 siblings, 1 reply; 17+ messages in thread
From: Timo Ketola @ 2018-03-22 15:40 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: Richard Weinberger, linux-mtd

On 22.03.2018 17:05, Richard Weinberger wrote:
> Timo,
> 
> Am Donnerstag, 22. März 2018, 14:57:57 CET schrieb Timo Ketola:
...
>> Here is a dump after the second boot:
>>
>> https://drive.google.com/open?id=1oa2lV_OB_tC-SX_c4jylnMXK6x1xhj2o
>>
>> After I dumped it I erased the whole mtd, wrote the dump back there,
>> dumped another dump and verified that it was identical with the first
>> one. Then I rebooted and observed that the issue was still there just as
>> before.
> 
> This image contains a Fastmap, but it is invalid.
> The image has a length of 2079326208 bytes, your PEB size is 128KiB.
> Therefore PEB count is 15864.
> But the Fastmap references PEBs 15865, 15866 and 15867.
> 
> This explains the failing assert, Fastmap talks about PEBs which are unknown 
> to UBI.
> 
> Can it be that your mtd partition layout is bad/broken?

The partition spans two chips, and both have four blocks of bad block
tables at the end of the chip. Would that explain it?

I'm slightly confused though, because flash_eraseall reports a total of
eight bad blocks:

# flash_eraseall /dev/mtd1
Erasing 128 Kibyte @ 3bf60000 - 48% complete.
Skipping bad block at 0x3bf80000
Skipping bad block at 0x3bfa0000
Skipping bad block at 0x3bfc0000
Skipping bad block at 0x3bfe0000
Erasing 128 Kibyte @ 7bf60000 - 99% complete.
Skipping bad block at 0x7bf80000
Skipping bad block at 0x7bfa0000
Skipping bad block at 0x7bfc0000
Skipping bad block at 0x7bfe0000
Erasing 128 Kibyte @ 7c000000 - 100% complete.


but the log only tells about four blocks:

[    1.549699] Bad block table found at page 524224, version 0x01
[    1.554241] Bad block table found at page 1048512, version 0x01
[    1.561314] Bad block table found at page 524160, version 0x01
[    1.565853] Bad block table found at page 1048448, version 0x01
[    1.573232] 2 ofpart partitions found on MTD device gpmi-nand
[    1.577688] Creating 2 MTD partitions on "gpmi-nand":
[    1.581499] 0x000000000000-0x000004000000 : "system"
[    1.595001] 0x000004000000-0x000080000000 : "user"


--

Timo


* Re: UBI assert failed in ubi_wl_init
  2018-03-22 15:40                       ` Timo Ketola
@ 2018-03-22 15:48                         ` Richard Weinberger
  2018-03-23  9:26                           ` Timo Ketola
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Weinberger @ 2018-03-22 15:48 UTC (permalink / raw)
  To: Timo Ketola, linux-mtd

Timo,

Am Donnerstag, 22. März 2018, 16:40:42 CET schrieb Timo Ketola:
> On 22.03.2018 17:05, Richard Weinberger wrote:
> > Timo,
> 
> > Am Donnerstag, 22. März 2018, 14:57:57 CET schrieb Timo Ketola:
> ...
> 
> >> Here is a dump after the second boot:
> >> 
> >> https://drive.google.com/open?id=1oa2lV_OB_tC-SX_c4jylnMXK6x1xhj2o
> >> 
> >> After I dumped it I erased the whole mtd, wrote the dump back there,
> >> dumped another dump and verified that it was identical with the first
> >> one. Then I rebooted and observed that the issue was still there just as
> >> before.
> > 
> > This image contains a Fastmap, but it is invalid.
> > The image has a length of 2079326208 bytes, your PEB size is 128KiB.
> > Therefore PEB count is 15864.
> > But the Fastmap references PEBs 15865, 15866 and 15867.
> > 
> > This explains the failing assert, Fastmap talks about PEBs which are
> > unknown to UBI.
> > 
> > Can it be that your mtd partition layout is bad/broken?
> 
> The partition spans two chips and they both have four blocks of bad
> block tables at the end of the chips. Would that explain?

If MTD handles this setup correctly, it should work.

> I'm slightly confused though, because flash_eraseall reports a total of
> eight bad blocks:
> 
> # flash_eraseall /dev/mtd1
> Erasing 128 Kibyte @ 3bf60000 - 48% complete.
> Skipping bad block at 0x3bf80000
> 
> Skipping bad block at 0x3bfa0000
> 
> Skipping bad block at 0x3bfc0000
> 
> Skipping bad block at 0x3bfe0000
> Erasing 128 Kibyte @ 7bf60000 - 99% complete.
> Skipping bad block at 0x7bf80000
> 
> Skipping bad block at 0x7bfa0000
> 
> Skipping bad block at 0x7bfc0000
> 
> Skipping bad block at 0x7bfe0000
> Erasing 128 Kibyte @ 7c000000 - 100% complete.
> 
> 
> but the kernel log tells about only four blocks:
> 
> [    1.549699] Bad block table found at page 524224, version 0x01
> [    1.554241] Bad block table found at page 1048512, version 0x01
> [    1.561314] Bad block table found at page 524160, version 0x01
> [    1.565853] Bad block table found at page 1048448, version 0x01
> [    1.573232] 2 ofpart partitions found on MTD device gpmi-nand
> [    1.577688] Creating 2 MTD partitions on "gpmi-nand":
> [    1.581499] 0x000000000000-0x000004000000 : "system"
> [    1.595001] 0x000004000000-0x000080000000 : "user"

Now things get interesting.
Can it be that the number of bad blocks is not stable?
IOW sometimes 4 and sometimes 8 are reported?
This would more or less explain what you see.

Fastmap does not care about bad blocks itself; all it cares about is the
total number of good blocks.
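
Tying this to the earlier finding (a fastmap referencing PEBs 15865-15867
on a device with 15864 PEBs), the failing assert is essentially a bounds
check. A rough illustration of the idea, not the kernel code; numbers are
from the dump discussed earlier in the thread:

```python
# Every PEB number recorded in the fastmap must be below the PEB count
# that UBI derives from the partition size; otherwise attach trips an
# assert. Illustrative sketch only.
def fastmap_refs_valid(referenced_pebs, mtd_size, peb_size):
    peb_count = mtd_size // peb_size   # 2079326208 // 131072 == 15864
    return all(p < peb_count for p in referenced_pebs)

print(fastmap_refs_valid([15865, 15866, 15867], 2079326208, 128 * 1024))
```

If the number of usable blocks shrinks between the boot that wrote the
fastmap and the boot that reads it, exactly this kind of out-of-range
reference appears.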

Thanks,
//richard

-- 
sigma star gmbh - Eduard-Bodem-Gasse 6 - 6020 Innsbruck - Austria
ATU66964118 - FN 374287y


* Re: UBI assert failed in ubi_wl_init
  2018-03-22 15:48                         ` Richard Weinberger
@ 2018-03-23  9:26                           ` Timo Ketola
  2018-03-23  9:35                             ` Richard Weinberger
  0 siblings, 1 reply; 17+ messages in thread
From: Timo Ketola @ 2018-03-23  9:26 UTC (permalink / raw)
  To: Richard Weinberger, linux-mtd

On 22.03.2018 17:48, Richard Weinberger wrote:
> Am Donnerstag, 22. März 2018, 16:40:42 CET schrieb Timo Ketola:
>> I'm slightly confused though, because flash_eraseall reports a total of
>> eight bad blocks:
>>
>> # flash_eraseall /dev/mtd1
>> Erasing 128 Kibyte @ 3bf60000 - 48% complete.
>> Skipping bad block at 0x3bf80000
>>
>> Skipping bad block at 0x3bfa0000
>>
>> Skipping bad block at 0x3bfc0000
>>
>> Skipping bad block at 0x3bfe0000
>> Erasing 128 Kibyte @ 7bf60000 - 99% complete.
>> Skipping bad block at 0x7bf80000
>>
>> Skipping bad block at 0x7bfa0000
>>
>> Skipping bad block at 0x7bfc0000
>>
>> Skipping bad block at 0x7bfe0000
>> Erasing 128 Kibyte @ 7c000000 - 100% complete.
>>
>>
>> but the kernel log tells about only four blocks:
>>
>> [    1.549699] Bad block table found at page 524224, version 0x01
>> [    1.554241] Bad block table found at page 1048512, version 0x01
>> [    1.561314] Bad block table found at page 524160, version 0x01
>> [    1.565853] Bad block table found at page 1048448, version 0x01
>> [    1.573232] 2 ofpart partitions found on MTD device gpmi-nand
>> [    1.577688] Creating 2 MTD partitions on "gpmi-nand":
>> [    1.581499] 0x000000000000-0x000004000000 : "system"
>> [    1.595001] 0x000004000000-0x000080000000 : "user"
> 
> Now things get interesting.
> Can it be that the number of bad blocks is not stable?
> IOW sometimes 4 and sometimes 8 are reported?
> This would more or less explain what you see.

Dumps show that two of the skipped blocks per chip really are bad block
tables. The other two are empty. Could it be that MTD reserves the other
two as BBT mirrors?

Anyway, I configured the BBT off and now there are the full 15872 good
PEBs, but the issue stays. Here is a dump taken after a full erase ('nand
scrub' from U-Boot), ubiformat, ubiattach, ubimkvol -m -N user and a
couple of reboots:

https://drive.google.com/open?id=1QGUIxarow3mGXbiJYgsLzc-a_Y5oN5XS

Are there any documents about the structure of the fastmap blocks? Google
was no help to me...

> Fastmap does not care about bad blocks itself; all it cares about is the
> total number of good blocks.

How does it record the block references, absolute or relative to good
blocks?

| a | b | bad | c |

Would it record c as block 3 (as absolute) or 2 (as third good one)?

--

Timo


* Re: UBI assert failed in ubi_wl_init
  2018-03-23  9:26                           ` Timo Ketola
@ 2018-03-23  9:35                             ` Richard Weinberger
  0 siblings, 0 replies; 17+ messages in thread
From: Richard Weinberger @ 2018-03-23  9:35 UTC (permalink / raw)
  To: Timo Ketola; +Cc: linux-mtd

Timo,

Am Freitag, 23. März 2018, 10:26:30 CET schrieb Timo Ketola:
On 22.03.2018 17:48, Richard Weinberger wrote:
> > Am Donnerstag, 22. März 2018, 16:40:42 CET schrieb Timo Ketola:
> >> I'm slightly confused though, because flash_eraseall reports a total of
> >> eight bad blocks:
> >> 
> >> # flash_eraseall /dev/mtd1
> >> Erasing 128 Kibyte @ 3bf60000 - 48% complete.
> >> Skipping bad block at 0x3bf80000
> >> 
> >> Skipping bad block at 0x3bfa0000
> >> 
> >> Skipping bad block at 0x3bfc0000
> >> 
> >> Skipping bad block at 0x3bfe0000
> >> Erasing 128 Kibyte @ 7bf60000 - 99% complete.
> >> Skipping bad block at 0x7bf80000
> >> 
> >> Skipping bad block at 0x7bfa0000
> >> 
> >> Skipping bad block at 0x7bfc0000
> >> 
> >> Skipping bad block at 0x7bfe0000
> >> Erasing 128 Kibyte @ 7c000000 - 100% complete.
> >> 
> >> 
> >> but the kernel log tells about only four blocks:
> >> 
> >> [    1.549699] Bad block table found at page 524224, version 0x01
> >> [    1.554241] Bad block table found at page 1048512, version 0x01
> >> [    1.561314] Bad block table found at page 524160, version 0x01
> >> [    1.565853] Bad block table found at page 1048448, version 0x01
> >> [    1.573232] 2 ofpart partitions found on MTD device gpmi-nand
> >> [    1.577688] Creating 2 MTD partitions on "gpmi-nand":
> >> [    1.581499] 0x000000000000-0x000004000000 : "system"
> >> [    1.595001] 0x000004000000-0x000080000000 : "user"
> > 
> > Now things get interesting.
> > Can it be that the number of bad blocks is not stable?
> > IOW sometimes 4 and sometimes 8 are reported?
> > This would more or less explain what you see.
> 
> Dumps show that the two blocks per chip are really bad block tables. The
> other two are empty. Could it be that the mtd reserves the other two for
> bbt mirrors?

This could be.

> Anyway, I configured the BBT off and now there are the full 15872 good PEBs,
> but the issue stays. Here is a dump after full erase ('nand scrub' from
> U-Boot), ubiformat, ubiattach, ubimkvol -m -N user and a couple of reboots:
> 
> https://drive.google.com/open?id=1QGUIxarow3mGXbiJYgsLzc-a_Y5oN5XS
> 
> Are there any documents about the structure of the fastmap blocks? Google
> was no help to me...

See ubi-media.h.
 
> > Fastmap does not care about bad blocks itself; all it cares about is the
> > total number of good blocks.
> 
> How does it record the block references, absolute or relative to good
> blocks?
> 
> | a | b | bad | c |
> 
> Would it record c as block 3 (as absolute) or 2 (as third good one)?

As I said, Fastmap does not care at all. It takes the PEB numbers from UBI.
And UBI just skips bad blocks. So c will be 3.

The big difference between Fastmap and non-Fastmap is that PEB numbers are 
stored on flash.
So maybe we have a bug in this area.

Thanks,
//richard

-- 
sigma star gmbh - Eduard-Bodem-Gasse 6 - 6020 Innsbruck - Austria
ATU66964118 - FN 374287y


Thread overview: 17+ messages
2018-02-01  6:44 UBI assert failed in ubi_wl_init Timo Ketola
2018-02-04 13:44 ` Richard Weinberger
2018-02-05  6:39   ` Timo Ketola
2018-02-12 22:49     ` Richard Weinberger
2018-02-13 11:58       ` Timo Ketola
2018-02-13 12:42         ` Richard Weinberger
2018-02-13 14:00           ` Timo Ketola
2018-02-18 20:31             ` Richard Weinberger
2018-03-16 13:34               ` Timo Ketola
2018-03-19  7:13                 ` Timo Ketola
2018-03-20 22:02                 ` Richard Weinberger
2018-03-22 13:57                   ` Timo Ketola
2018-03-22 15:05                     ` Richard Weinberger
2018-03-22 15:40                       ` Timo Ketola
2018-03-22 15:48                         ` Richard Weinberger
2018-03-23  9:26                           ` Timo Ketola
2018-03-23  9:35                             ` Richard Weinberger
