All of lore.kernel.org
 help / color / mirror / Atom feed
* UBIFS panic during brown-out
@ 2021-10-11 17:16 Kristof Havasi
  2021-10-11 19:02 ` Richard Weinberger
  0 siblings, 1 reply; 7+ messages in thread
From: Kristof Havasi @ 2021-10-11 17:16 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd

Dear Richard,

we have been in contact regarding another (similar) kernel panic a year ago:
http://lists.infradead.org/pipermail/linux-mtd/2020-September/082175.html

Now we are stress testing our board for brown-out stability, and we can
see reproducible file system corruptions.
The culprit was narrowed down to a pending SQLite operation during brown-out,
in the business-logic.

As far as I understood, UBIFS is extremely robust in such cases, so I would
expect a corrupted file as the worst case scenario, not an unbootable system.

Am I too "optimistic" about UBIFS's brown-out stability?

Does the Auth+Encryption combo increase the chances for corrupting the FS
during brown-out, due to the extra operations?

Would you suggest turning on any of the chk_* knobs in the debugfs?
I am not sure that they are helpful, or will just modify the behaviour of the
timings of the system too much.

Would it be a last resort in case the brown-out is detected to switch the FS
into ro mode? Is there any API/ABI apart from the debugfs's ro_error knob to
switch the FS into read-only mode and so trying to prevent file-system
corruption?

Any pointers are welcomed!

We are on Kernel v5.4.150
We use _both_ UBIFS Authentication and Encryption.
The board is a SAMA5D3 powered embedded device with 1GB NAND flash,
which is not even nearly full (10% used).

Kernel panic after the last brown-out:
ubi0: scanning is finished
ubi0: attached mtd1 (name "ubivols", size 1023 MiB)
ubi0: PEB size: 262144 bytes (256 KiB), LEB size: 253952 bytes
ubi0: min./max. I/O unit sizes: 4096/4096, sub-page size 4096
ubi0: VID header offset: 4096 (aligned 4096), data offset: 8192
ubi0: good PEBs: 4082, bad PEBs: 13, corrupted PEBs: 0
ubi0: user volume: 5, internal volumes: 1, max. volumes count: 128
ubi0: max/mean erase counter: 118/11, WL threshold: 4096, image
sequence number: 921361235
>ubi0: available PEBs: 0, total reserved PEBs: 4082, PEBs reserved for bad PEB handling: 67
ubi0: background thread "ubi_bgt0d" started, PID 617
UBIFS (ubi0:4): Mounting in authenticated mode
UBIFS (ubi0:4): recovery needed
>VFS: Cannot open root device "ubi0:rootfs" or ubi0:rootfs: error -1
Please append a correct "root=" boot option; here are the available partitions:
Kernel panic - not syncing: VFS: Unable to mount root fs on ubi0:rootfs
CPU: 0 PID: 1 Comm: swapper Not tainted 5.4.109-00033-ga88943c1e68 #1
Hardware name: Atmel SAMA5
[<c010babc>] (unwind_backtrace) from [<c010a8d0>] (show_stack+0x10/0x14)
[<c010a8d0>] (show_stack) from [<c0569970>] (panic+0xfc/0x2e8)
[<c0569970>] (panic) from [<c0801598>] (mount_block_root+0x268/0x2a4)
[<c0801598>] (mount_block_root) from [<c0801674>] (prepare_namespace+0x9c/0x184)
[<c0801674>] (prepare_namespace) from [<c056b774>] (kernel_init+0x8/0x108)
[<c056b774>] (kernel_init) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
Exception stack(0xc7829fb0 to 0xc7829ff8)
9fa0:                                     00000000 00000000 00000000 00000000
9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on
ubi0:rootfs ]---

Best regards,
Kristóf Havasi

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: UBIFS panic during brown-out
  2021-10-11 17:16 UBIFS panic during brown-out Kristof Havasi
@ 2021-10-11 19:02 ` Richard Weinberger
  2021-10-12 13:02   ` Kristof Havasi
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Weinberger @ 2021-10-11 19:02 UTC (permalink / raw)
  To: Kristof Havasi; +Cc: linux-mtd

Kristóf,

----- Ursprüngliche Mail -----
> Von: "Kristof Havasi" <havasiefr@gmail.com>
> An: "richard" <richard@nod.at>
> CC: "linux-mtd" <linux-mtd@lists.infradead.org>
> Gesendet: Montag, 11. Oktober 2021 19:16:01
> Betreff: UBIFS panic during brown-out

> Dear Richard,
> 
> we have been in contact regarding another (similar) kernel panic a year ago:
> http://lists.infradead.org/pipermail/linux-mtd/2020-September/082175.html
> 
> Now we are stress testing our board for brown-out stability, and we can
> see reproducible file system corruptions.
> The culprit was narrowed down to a pending SQLite operation during brown-out,
> in the business-logic.
> 
> As far as I understood, UBIFS is extremely robust in such cases, so I would
> expect a corrupted file as the worst case scenario, not an unbootable system.
> 
> Am I too "optimistic" about UBIFS's brown-out stability?

A brown-out is something very bad. Electronic components show undefined behavior
in this phase. UBIFS (and Linux) can nothing do there.
But both UBI and UBIFS try to be robust against sudden power loss (power-cut).
E.g. an interrupted erase or program operation. 

So, are you really talking about brown-out? If so, better talk to you hardware guys.

> Does the Auth+Encryption combo increase the chances for corrupting the FS
> during brown-out, due to the extra operations?

Assuming you actually meant power-cut.
Both features are rather new, so there can be still bugs.
 
> Would you suggest turning on any of the chk_* knobs in the debugfs?
> I am not sure that they are helpful, or will just modify the behaviour of the
> timings of the system too much.

Let's start with logs first.
 
> Would it be a last resort in case the brown-out is detected to switch the FS
> into ro mode? Is there any API/ABI apart from the debugfs's ro_error knob to
> switch the FS into read-only mode and so trying to prevent file-system
> corruption?

Not really. You need to make sure that the current NAND command is finished
and no new one will be scheduled. Depending on your hardware design this can
be a challenge.

> Any pointers are welcomed!
> 
> We are on Kernel v5.4.150
> We use _both_ UBIFS Authentication and Encryption.
> The board is a SAMA5D3 powered embedded device with 1GB NAND flash,
> which is not even nearly full (10% used).
> 
> Kernel panic after the last brown-out:
> ubi0: scanning is finished
> ubi0: attached mtd1 (name "ubivols", size 1023 MiB)
> ubi0: PEB size: 262144 bytes (256 KiB), LEB size: 253952 bytes
> ubi0: min./max. I/O unit sizes: 4096/4096, sub-page size 4096
> ubi0: VID header offset: 4096 (aligned 4096), data offset: 8192
> ubi0: good PEBs: 4082, bad PEBs: 13, corrupted PEBs: 0
> ubi0: user volume: 5, internal volumes: 1, max. volumes count: 128
> ubi0: max/mean erase counter: 118/11, WL threshold: 4096, image
> sequence number: 921361235
>>ubi0: available PEBs: 0, total reserved PEBs: 4082, PEBs reserved for bad PEB
>>handling: 67
> ubi0: background thread "ubi_bgt0d" started, PID 617
> UBIFS (ubi0:4): Mounting in authenticated mode
> UBIFS (ubi0:4): recovery needed
>>VFS: Cannot open root device "ubi0:rootfs" or ubi0:rootfs: error -1

There is a lot of context missing. Can you please share the full logs?
Usually UBIFS prints in detail what went wrong.

> Please append a correct "root=" boot option; here are the available partitions:
> Kernel panic - not syncing: VFS: Unable to mount root fs on ubi0:rootfs
> CPU: 0 PID: 1 Comm: swapper Not tainted 5.4.109-00033-ga88943c1e68 #1

5.4 is not fresh. Can you use a more recent kernel?

Thanks,
//richard

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: UBIFS panic during brown-out
  2021-10-11 19:02 ` Richard Weinberger
@ 2021-10-12 13:02   ` Kristof Havasi
  2021-10-12 13:36     ` Richard Weinberger
  0 siblings, 1 reply; 7+ messages in thread
From: Kristof Havasi @ 2021-10-12 13:02 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd

Hello Richard,

On Mon, 11 Oct 2021 at 21:02, Richard Weinberger <richard@nod.at> wrote:
>
> Kristóf,
>
> ----- Ursprüngliche Mail -----
> > Von: "Kristof Havasi" <havasiefr@gmail.com>
> > An: "richard" <richard@nod.at>
> > CC: "linux-mtd" <linux-mtd@lists.infradead.org>
> > Gesendet: Montag, 11. Oktober 2021 19:16:01
> > Betreff: UBIFS panic during brown-out
>
> > Dear Richard,
> >
> > we have been in contact regarding another (similar) kernel panic a year ago:
> > http://lists.infradead.org/pipermail/linux-mtd/2020-September/082175.html
> >
> > Now we are stress testing our board for brown-out stability, and we can
> > see reproducible file system corruptions.
> > The culprit was narrowed down to a pending SQLite operation during brown-out,
> > in the business-logic.
> >
> > As far as I understood, UBIFS is extremely robust in such cases, so I would
> > expect a corrupted file as the worst case scenario, not an unbootable system.
> >
> > Am I too "optimistic" about UBIFS's brown-out stability?
>
> A brown-out is something very bad. Electronic components show undefined behavior
> in this phase. UBIFS (and Linux) can nothing do there.
> But both UBI and UBIFS try to be robust against sudden power loss (power-cut).
> E.g. an interrupted erase or program operation.
>
> So, are you really talking about brown-out? If so, better talk to you hardware guys.

Sorry, s/brown-out/power-cut/.

>
> > Does the Auth+Encryption combo increase the chances for corrupting the FS
> > during brown-out, due to the extra operations?
>
> Assuming you actually meant power-cut.
> Both features are rather new, so there can be still bugs.

If the logs do not reveal any pointers, we could re-test with disabled enc/auth.

>
> > Would you suggest turning on any of the chk_* knobs in the debugfs?
> > I am not sure that they are helpful, or will just modify the behaviour of the
> > timings of the system too much.
>
> Let's start with logs first.
>
> > Would it be a last resort in case the brown-out is detected to switch the FS
> > into ro mode? Is there any API/ABI apart from the debugfs's ro_error knob to
> > switch the FS into read-only mode and so trying to prevent file-system
> > corruption?
>
> Not really. You need to make sure that the current NAND command is finished
> and no new one will be scheduled. Depending on your hardware design this can
> be a challenge.

So would a "successful `sync` and ro mode switch" be a safe bet, if
during the two
actions no new data is written to the FS?

>
> > Any pointers are welcomed!
> >
> > We are on Kernel v5.4.150
> > We use _both_ UBIFS Authentication and Encryption.
> > The board is a SAMA5D3 powered embedded device with 1GB NAND flash,
> > which is not even nearly full (10% used).
> >
> > Kernel panic after the last brown-out:
> > ubi0: scanning is finished
> > ubi0: attached mtd1 (name "ubivols", size 1023 MiB)
> > ubi0: PEB size: 262144 bytes (256 KiB), LEB size: 253952 bytes
> > ubi0: min./max. I/O unit sizes: 4096/4096, sub-page size 4096
> > ubi0: VID header offset: 4096 (aligned 4096), data offset: 8192
> > ubi0: good PEBs: 4082, bad PEBs: 13, corrupted PEBs: 0
> > ubi0: user volume: 5, internal volumes: 1, max. volumes count: 128
> > ubi0: max/mean erase counter: 118/11, WL threshold: 4096, image
> > sequence number: 921361235
> >>ubi0: available PEBs: 0, total reserved PEBs: 4082, PEBs reserved for bad PEB
> >>handling: 67
> > ubi0: background thread "ubi_bgt0d" started, PID 617
> > UBIFS (ubi0:4): Mounting in authenticated mode
> > UBIFS (ubi0:4): recovery needed
> >>VFS: Cannot open root device "ubi0:rootfs" or ubi0:rootfs: error -1
>
> There is a lot of context missing. Can you please share the full logs?
> Usually UBIFS prints in detail what went wrong.

I couldn't see any UBIFS logs over the serial (but the kernel log
level was also only 1)
and the application log was also redirected there...
But here is the full log from the application logs through the power-cut and the
boot-up

[Sat Oct 09 14:15:00 2021] pwr outage WARN: Power outage detected ...
[Sat Oct 09 14:15:00 2021] pwr outage INFO: Low Power mode activated
[Sat Oct 09 14:15:00 2021] event manager TRACE: Database
/data/logsystem.db opened successfully
[Sat Oct 09 14:15:00 2021] event manager TRACE: SQLite prepare :
SELECT * FROM log_sys WHERE  event_flag=2 OR event_flag=1 ORDER BY
timestamp ASC LIMIT 1;
[Sat Oct 09 14:15:00 2021] fnn logger TRACE: Database
/data/logcustomer.db opened successfully
[snip] some 10s of lines of application log related to SQL operations
[Sat Oct 09 14:15:00 2021] fnn logger TRACE: SQLite prepare : SELECT
MAX(repetition_counter), MAX(record_number), MAX(parent_record_number)
FROM log_customer WHERE  timestamp > ?1 AND event_id = ?2 AND
event_sub_id = ?3 AND user_identity = ?4 AND subject_identity = ?5 AND
message_extensions = ?6 AND peer_identity = ?7;
[Sat Oct 09 14:15:00 2021] fnn logger TRACE: Stepping:
SELECT MAX(repetition_counter), MAX(record_number),
MAX(parent_record_number) FROM log_customer WHERE  timestamp >
1633695300 AND event_id = 19005 AND event_sub_id = 1 AND user_identity
= '' AND subject_identity = '01005e3180a0.eefr2000003006.sm' AND
message_extensions = '' AND peer_identity = '';
[Sat Oct 09 14:15:00 2021] fnn logger TRACE: SQL-Step executed successfully
[Sat Oct 09 14:15:00 2021] fnn logger TRACE: SQLite prepare : UPDATE
log_customer SET [snip] WHERE record_number = ?21 AND
repetition_counter = ?22;
[Sat Oct 09 14:15:01 2021] fnn logger TRACE: Stepping:
UPDATE log_customer SET repetition_counter = 372, parent_record_number
= 3031, seconds_index = 246, timestamp = 1633781700, level = 2,
version = 1, length = 0, device_typ¦RomBOOT
RomBOOT

AT91Bootstrap for SMGW 3.8.12-00020-g0f08dbf (Wed Apr 14 18:46:12 CEST 2021)

NAND: ONFI flash detected
NAND: Manufacturer ID: 0x2c Chip ID: 0xd3
NAND: Page Bytes: 4096, Spare Bytes: 224
NAND: ECC Correctability Bits: 8, ECC Sector Bytes: 512
NAND: Disable On-Die ECC
NAND: Initialize PMECC params, cap: 8, sector: 512
NAND: Initialize UBI...
UBI: Loading volume-table 0 at PEB 202 with copy flag!
NAND: Loading UBI volume kernel kernel-spare to @0x22000000
(0xffffffff bytes length)...
UBI: Length of Vol-2 updated to 3128 KB
NAND: Loading UBI volume dtb dtb-spare to @0x21000000 (0xffffffff
bytes length)...
UBI: Length of Vol-0 updated to 33 KB

Booting zImage ......

Using device tree in place at 0x210001f4
DT: Injected MAC addresses
DT: Populated virtual-eprom

Starting linux kernel ..., machid: 0xffffffff

Booting Linux on physical CPU 0x0
Linux version 5.4.109-00033-ga88943c1e68
(root@runner-tzue45hw-project-205-concurrent-0) (gcc version 6.5.0
(Linaro GCC 6.5-2018.12)) #1 Wed Apr 14 18:46:17 CEST 2021
CPU: ARMv7 Processor [410fc051] revision 1 (ARMv7), cr=10c53c7d
CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
OF: fdt: Machine model: SMGW
Memory policy: Data cache writeback
CPU: All CPU(s) started in SVC mode.
Built 1 zonelists, mobility grouping on.  Total pages: 32511
Kernel command line: rootfstype=ubifs ubi.mtd=1 root=ubi0:rootfs
rootflags=auth_hash_name=sha256,auth_key=ubifsauth:0000aaaaffff0000
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes, linear)
mem auto-init: stack:off, heap alloc:on, heap free:on
mem auto-init: clearing system memory may take some time...
Memory: 121196K/131068K available (5120K kernel code, 197K rwdata,
956K rodata, 1024K init, 153K bss, 9872K reserved, 0K cma-reserved)
NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
random: get_random_bytes called from start_kernel+0x2ec/0x41c with crng_init=0
master clk is overclocked
clocksource: pit: mask: 0xfffffff max_cycles: 0xfffffff, max_idle_ns:
14479245754 ns
Console: colour dummy device 80x30
printk: console [tty0] enabled
sched_clock: 32 bits at 100 Hz, resolution 10000000ns, wraps every
21474836475000000ns
Calibrating delay loop... 351.43 BogoMIPS (lpj=1757184)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
CPU: Testing write buffer coherency: ok
Setting up static identity map for 0x20100000 - 0x2010003c
devtmpfs: initialized
VFP support v0.3: implementor 41 architecture 2 part 30 variant 5 rev 1
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff,
max_idle_ns: 19112604462750000 ns
futex hash table entries: 256 (order: -1, 3072 bytes, linear)
pinctrl core: initialized pinctrl subsystem
NET: Registered protocol family 16
DMA: preallocated 256 KiB pool for atomic coherent allocations
cpuidle: using governor ladder
gpio-at91 fffff200.gpio: at address (ptrval)
gpio-at91 fffff400.gpio: at address (ptrval)
gpio-at91 fffff600.gpio: at address (ptrval)
gpio-at91 fffff800.gpio: at address (ptrval)
gpio-at91 fffffa00.gpio: at address (ptrval)
pinctrl-at91 ahb:apb:pinctrl@fffff200: initialized AT91 pinctrl driver
atmel_tcb f0010000.timer: IRQ index 1 not found
atmel_tcb f0010000.timer: IRQ index 2 not found
atmel_tcb f8014000.timer: IRQ index 1 not found
atmel_tcb f8014000.timer: IRQ index 2 not found
at_hdmac ffffe600.dma-controller: Atmel AHB DMA Controller ( cpy set
slave ), 8 channels
at_hdmac ffffe800.dma-controller: Atmel AHB DMA Controller ( cpy set
slave ), 8 channels
AT91: Detected SoC family: sama5d3
AT91: Detected SoC: sama5d35, revision 2
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
at91_i2c f0014000.i2c: using dma0chan0 (tx) and dma0chan1 (rx) for DMA transfers
at91_i2c f0014000.i2c: AT91 i2c bus driver (hw version: 0x402).
at91_i2c f0018000.i2c: can't get DMA channel, continue without DMA support
at91_i2c f0018000.i2c: AT91 i2c bus driver (hw version: 0x402).
clocksource: Switched to clocksource pit
NET: Registered protocol family 2
tcp_listen_portaddr_hash hash table entries: 512 (order: 0, 4096 bytes, linear)
TCP established hash table entries: 1024 (order: 0, 4096 bytes, linear)
TCP bind hash table entries: 1024 (order: 0, 4096 bytes, linear)
TCP: Hash tables configured (established 1024 bind 1024)
UDP hash table entries: 256 (order: 0, 4096 bytes, linear)
UDP-Lite hash table entries: 256 (order: 0, 4096 bytes, linear)
NET: Registered protocol family 1
Initialise system trusted keyrings
workingset: timestamp_bits=30 max_order=15 bucket_order=0
NET: Registered protocol family 38
Key type asymmetric registered
Asymmetric key parser 'x509' registered
atmel_usart_serial.0.auto: ttyS1 at MMIO 0xf001c000 (irq = 21,
base_baud = 4125000) is a ATMEL_SERIAL
atmel_usart_serial.1.auto: ttyS2 at MMIO 0xf0020000 (irq = 22,
base_baud = 8250000) is a ATMEL_SERIAL
atmel_usart_serial.2.auto: ttyS3 at MMIO 0xf8020000 (irq = 24,
base_baud = 4125000) is a ATMEL_SERIAL
atmel_usart_serial.3.auto: ttyS0 at MMIO 0xffffee00 (irq = 32,
base_baud = 8250000) is a ATMEL_SERIAL
printk: console [ttyS0] enabled
atmel_spi f8008000.spi: DMA TX channel not available, SPI unable to use DMA
atmel_spi f8008000.spi: Atmel SPI Controller using PIO only
ksz8863@0 enforce active low on chipselect handle
atmel_spi f8008000.spi: Atmel SPI Controller version 0x213 at
0xf8008000 (irq 23)
libphy: Fixed MDIO Bus: probed
macb f0028000.ethernet: invalid hw address, using random
random: fast init done
libphy: MACB_mii_bus: probed
Generic PHY fixed-0:00: attached PHY driver [Generic PHY]
(mii_bus:phy_addr=fixed-0:00, irq=POLL)
macb f0028000.ethernet eth0: Cadence GEM rev 0x00020119 at 0xf0028000
irq 40 (12:71:da:58:0b:2d)
libphy: MACB_mii_bus: probed
Micrel KSZ8081 or KSZ8091 f802c000.ethernet-ffffffff:00: attached PHY
driver [Micrel KSZ8081 or KSZ8091]
(mii_bus:phy_addr=f802c000.ethernet-ffffffff:00, irq=45)
macb f802c000.ethernet eth1: Cadence MACB rev 0x0001010c at 0xf802c000
irq 41 (d8:c4:97:1c:9a:d7)
usbcore: registered new interface driver cdc_ether
usbcore: registered new interface driver cdc_ncm
usbcore: registered new interface driver cdc_mbim
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ehci-platform: EHCI generic platform driver
ehci-atmel: EHCI Atmel driver
atmel-ehci 700000.ehci: EHCI Host Controller
atmel-ehci 700000.ehci: new USB bus registered, assigned bus number 1
atmel-ehci 700000.ehci: irq 43, io mem 0x00700000
atmel-ehci 700000.ehci: USB 2.0 started, EHCI 1.00
usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.04
usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb1: Product: EHCI Host Controller
usb usb1: Manufacturer: Linux 5.4.109-00033-ga88943c1e68 ehci_hcd
usb usb1: SerialNumber: 700000.ehci
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 3 ports detected
ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
ohci-platform: OHCI generic platform driver
ohci-atmel: OHCI Atmel driver
at91_ohci 600000.ohci: USB Host Controller
at91_ohci 600000.ohci: new USB bus registered, assigned bus number 2
at91_ohci 600000.ohci: irq 43, io mem 0x00600000
usb usb2: New USB device found, idVendor=1d6b, idProduct=0001, bcdDevice= 5.04
usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb2: Product: USB Host Controller
usb usb2: Manufacturer: Linux 5.4.109-00033-ga88943c1e68 ohci_hcd
usb usb2: SerialNumber: at91
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 3 ports detected
usbcore: registered new interface driver cdc_acm
cdc_acm: USB Abstract Control Model driver for USB modems and ISDN adapters
usbcore: registered new interface driver cdc_wdm
usbcore: registered new interface driver usbserial_generic
usbserial: USB Serial support registered for generic
usbcore: registered new interface driver option
usbserial: USB Serial support registered for GSM modem (1-port)
usbcore: registered new interface driver usb_serial_simple
usbserial: USB Serial support registered for carelink
usbserial: USB Serial support registered for zio
usbserial: USB Serial support registered for funsoft
usbserial: USB Serial support registered for flashloader
usbserial: USB Serial support registered for google
usbserial: USB Serial support registered for libtransistor
usbserial: USB Serial support registered for vivopay
usbserial: USB Serial support registered for moto_modem
usbserial: USB Serial support registered for motorola_tetra
usbserial: USB Serial support registered for novatel_gps
usbserial: USB Serial support registered for hp4x
usbserial: USB Serial support registered for suunto
usbserial: USB Serial support registered for siemens_mpi
rtc-rv8803 1-0032: registered as rtc0
i2c /dev entries driver
at91-reset fffffe00.rstc: Starting after wakeup
at91sam9_wdt: enabled (heartbeat=15 sec, nowayout=1)
atmel_aes f8038000.aes: version: 0x135
atmel_aes f8038000.aes: Atmel AES - Using dma1chan0, dma1chan1 for DMA transfers
atmel_sha f8034000.sha: version: 0x410
atmel_sha f8034000.sha: using dma1chan2 for DMA transfers
atmel_sha f8034000.sha: Atmel SHA1/SHA256/SHA224/SHA384/SHA512
atmel_tdes f803c000.tdes: version: 0x701
atmel_tdes f803c000.tdes: using dma1chan3, dma1chan4 for DMA transfers
atmel_tdes f803c000.tdes: Atmel DES/TDES
usbcore: registered new interface driver usbhid
usbhid: USB HID core driver
nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xd3
nand: Micron MT29F8G08ABACAWP
nand: 1024 MiB, SLC, erase size: 256 KiB, page size: 4096, OOB size: 224
Bad block table found at page 262080, version 0x01
Bad block table found at page 262016, version 0x01
nand_read_bbt: bad block at 0x000000dc0000
nand_read_bbt: bad block at 0x00000e280000
nand_read_bbt: bad block at 0x000024f40000
nand_read_bbt: bad block at 0x000026ec0000
nand_read_bbt: bad block at 0x000027ec0000
nand_read_bbt: bad block at 0x00002f140000
nand_read_bbt: bad block at 0x000030ec0000
nand_read_bbt: bad block at 0x000033b40000
nand_read_bbt: bad block at 0x000038840000
2 fixed-partitions partitions found on MTD device atmel_nand
Creating 2 MTD partitions on "atmel_nand":
0x000000000000-0x000000040000 : "at91bootstrap"
0x000000040000-0x000040000000 : "ubivols"
NET: Registered protocol family 10
Segment Routing with IPv6
sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
NET: Registered protocol family 17
8021q: 802.1Q VLAN Support v1.8
Loading compiled-in X.509 certificates
Loaded X.509 cert 'EFR: efr.de: a54308f075327fb6c3a6fb687fdabc10ce4c51c6'
Key type ._fscrypt registered
Key type .fscrypt registered
ksz8795-switch spi0.0: active low/falling IRQ
libphy: dsa slave smi: probed
ksz8795-switch spi0.0 han (uninitialized): PHY [dsa-0.0:00] driver
[Micrel KSZ886X Switch]
ksz8795-switch spi0.0 cls (uninitialized): PHY [dsa-0.0:01] driver
[Micrel KSZ886X Switch]
ksz8795-switch spi0.0: Using legacy PHYLIB callbacks. Please migrate to PHYLINK!
DSA: tree 0 setup
ubi0: attaching mtd1
usb 2-3: new full-speed USB device number 2 using at91_ohci
usb 2-3: New USB device found, idVendor=08e6, idProduct=3437, bcdDevice= 2.01
usb 2-3: New USB device strings: Mfr=1, Product=2, SerialNumber=0
usb 2-3: Product: USB SmartCard Reader
usb 2-3: Manufacturer: Gemalto
random: crng init done
ubi0: scanning is finished
ubi0: attached mtd1 (name "ubivols", size 1023 MiB)
ubi0: PEB size: 262144 bytes (256 KiB), LEB size: 253952 bytes
ubi0: min./max. I/O unit sizes: 4096/4096, sub-page size 4096
ubi0: VID header offset: 4096 (aligned 4096), data offset: 8192
ubi0: good PEBs: 4082, bad PEBs: 13, corrupted PEBs: 0
ubi0: user volume: 5, internal volumes: 1, max. volumes count: 128
ubi0: max/mean erase counter: 118/11, WL threshold: 4096, image
sequence number: 921361235
ubi0: available PEBs: 0, total reserved PEBs: 4082, PEBs reserved for
bad PEB handling: 67
ubi0: background thread "ubi_bgt0d" started, PID 617
rtc-rv8803 1-0032: setting system clock to 2021-10-09T12:16:01 UTC (1633781761)
UBIFS (ubi0:4): Mounting in authenticated mode
UBIFS (ubi0:4): recovery needed
VFS: Cannot open root device "ubi0:rootfs" or ubi0:rootfs: error -1
Please append a correct "root=" boot option; here are the available partitions:
Kernel panic - not syncing: VFS: Unable to mount root fs on ubi0:rootfs
CPU: 0 PID: 1 Comm: swapper Not tainted 5.4.109-00033-ga88943c1e68 #1
Hardware name: Atmel SAMA5
[<c010babc>] (unwind_backtrace) from [<c010a8d0>] (show_stack+0x10/0x14)
[<c010a8d0>] (show_stack) from [<c0569970>] (panic+0xfc/0x2e8)
[<c0569970>] (panic) from [<c0801598>] (mount_block_root+0x268/0x2a4)
[<c0801598>] (mount_block_root) from [<c0801674>] (prepare_namespace+0x9c/0x184)
[<c0801674>] (prepare_namespace) from [<c056b774>] (kernel_init+0x8/0x108)
[<c056b774>] (kernel_init) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
Exception stack(0xc7829fb0 to 0xc7829ff8)
9fa0:                                     00000000 00000000 00000000 00000000
9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on
ubi0:rootfs ]---
RomBOOT
RomBOOT

>
> 5.4 is not fresh. Can you use a more recent kernel?

We are tracking the latest patches on 5.4, but will try whether a
rebase to 5.10 is straight-forward or not.
Or at least port the minimal changes which are needed to re-create the crash.

But first, I will try to reproduce the crash with a stripped down
program, because with the current tests,
it can take anywhere between 20-500 iterations of power-cuts with the
full business logic to trigger the crash.
(5m ON 30s OFF)

Best Regards,
Kristóf

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: UBIFS panic during brown-out
  2021-10-12 13:02   ` Kristof Havasi
@ 2021-10-12 13:36     ` Richard Weinberger
       [not found]       ` <CADBnMvidNZEiJfSTLe-rj+V0N=EtW8jTrYXzc2KRJKNS-zkCjg@mail.gmail.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Weinberger @ 2021-10-12 13:36 UTC (permalink / raw)
  To: Kristof Havasi; +Cc: linux-mtd

Kristóf,

----- Ursprüngliche Mail -----
> Von: "Kristof Havasi" <havasiefr@gmail.com>
>> So, are you really talking about brown-out? If so, better talk to you hardware
>> guys.
> 
> Sorry, s/brown-out/power-cut/.

:-)

>>
>> > Does the Auth+Encryption combo increase the chances for corrupting the FS
>> > during brown-out, due to the extra operations?
>>
>> Assuming you actually meant power-cut.
>> Both features are rather new, so there can be still bugs.
> 
> If the logs do not reveal any pointers, we could re-test with disabled enc/auth.
> 
>>
>> > Would you suggest turning on any of the chk_* knobs in the debugfs?
>> > I am not sure that they are helpful, or will just modify the behaviour of the
>> > timings of the system too much.
>>
>> Let's start with logs first.
>>
>> > Would it be a last resort in case the brown-out is detected to switch the FS
>> > into ro mode? Is there any API/ABI apart from the debugfs's ro_error knob to
>> > switch the FS into read-only mode and so trying to prevent file-system
>> > corruption?
>>
>> Not really. You need to make sure that the current NAND command is finished
>> and no new one will be scheduled. Depending on your hardware design this can
>> be a challenge.
> 
> So would a "successful `sync` and ro mode switch" be a safe bet, if
> during the two
> actions no new data is written to the FS?

Kind of. Actually UBIFS should handle this just fine.
Before we consider workarounds, let's give the problem more thought.

>>
>> > Any pointers are welcomed!
>> >
>> > We are on Kernel v5.4.150
>> > We use _both_ UBIFS Authentication and Encryption.
>> > The board is a SAMA5D3 powered embedded device with 1GB NAND flash,
>> > which is not even nearly full (10% used).
>> >
>> > Kernel panic after the last brown-out:
>> > ubi0: scanning is finished
>> > ubi0: attached mtd1 (name "ubivols", size 1023 MiB)
>> > ubi0: PEB size: 262144 bytes (256 KiB), LEB size: 253952 bytes
>> > ubi0: min./max. I/O unit sizes: 4096/4096, sub-page size 4096
>> > ubi0: VID header offset: 4096 (aligned 4096), data offset: 8192
>> > ubi0: good PEBs: 4082, bad PEBs: 13, corrupted PEBs: 0
>> > ubi0: user volume: 5, internal volumes: 1, max. volumes count: 128
>> > ubi0: max/mean erase counter: 118/11, WL threshold: 4096, image
>> > sequence number: 921361235
>> >>ubi0: available PEBs: 0, total reserved PEBs: 4082, PEBs reserved for bad PEB
>> >>handling: 67
>> > ubi0: background thread "ubi_bgt0d" started, PID 617
>> > UBIFS (ubi0:4): Mounting in authenticated mode
>> > UBIFS (ubi0:4): recovery needed
>> >>VFS: Cannot open root device "ubi0:rootfs" or ubi0:rootfs: error -1
>>
>> There is a lot of context missing. Can you please share the full logs?
>> Usually UBIFS prints in detail what went wrong.
> 
> I couldn't see any UBIFS logs over the serial (but the kernel log
> level was also only 1)
> and the application log was also redirected there...
> But here is the full log from the application logs through the power-cut and the
> boot-up
> 
> [Sat Oct 09 14:15:00 2021] pwr outage WARN: Power outage detected ...
> [Sat Oct 09 14:15:00 2021] pwr outage INFO: Low Power mode activated
> [Sat Oct 09 14:15:00 2021] event manager TRACE: Database
> /data/logsystem.db opened successfully
> [Sat Oct 09 14:15:00 2021] event manager TRACE: SQLite prepare :
> SELECT * FROM log_sys WHERE  event_flag=2 OR event_flag=1 ORDER BY
> timestamp ASC LIMIT 1;
> [Sat Oct 09 14:15:00 2021] fnn logger TRACE: Database
> /data/logcustomer.db opened successfully
> [snip] some 10s of lines of application log related to SQL operations
> [Sat Oct 09 14:15:00 2021] fnn logger TRACE: SQLite prepare : SELECT
> MAX(repetition_counter), MAX(record_number), MAX(parent_record_number)
> FROM log_customer WHERE  timestamp > ?1 AND event_id = ?2 AND
> event_sub_id = ?3 AND user_identity = ?4 AND subject_identity = ?5 AND
> message_extensions = ?6 AND peer_identity = ?7;
> [Sat Oct 09 14:15:00 2021] fnn logger TRACE: Stepping:
> SELECT MAX(repetition_counter), MAX(record_number),
> MAX(parent_record_number) FROM log_customer WHERE  timestamp >
> 1633695300 AND event_id = 19005 AND event_sub_id = 1 AND user_identity
> = '' AND subject_identity = '01005e3180a0.eefr2000003006.sm' AND
> message_extensions = '' AND peer_identity = '';
> [Sat Oct 09 14:15:00 2021] fnn logger TRACE: SQL-Step executed successfully
> [Sat Oct 09 14:15:00 2021] fnn logger TRACE: SQLite prepare : UPDATE
> log_customer SET [snip] WHERE record_number = ?21 AND
> repetition_counter = ?22;
> [Sat Oct 09 14:15:01 2021] fnn logger TRACE: Stepping:
> UPDATE log_customer SET repetition_counter = 372, parent_record_number
> = 3031, seconds_index = 246, timestamp = 1633781700, level = 2,
> version = 1, length = 0, device_typ¦RomBOOT
> RomBOOT
> 
> AT91Bootstrap for SMGW 3.8.12-00020-g0f08dbf (Wed Apr 14 18:46:12 CEST 2021)
> 
> NAND: ONFI flash detected
> NAND: Manufacturer ID: 0x2c Chip ID: 0xd3
> NAND: Page Bytes: 4096, Spare Bytes: 224
> NAND: ECC Correctability Bits: 8, ECC Sector Bytes: 512
> NAND: Disable On-Die ECC
> NAND: Initialize PMECC params, cap: 8, sector: 512
> NAND: Initialize UBI...
> UBI: Loading volume-table 0 at PEB 202 with copy flag!
> NAND: Loading UBI volume kernel kernel-spare to @0x22000000
> (0xffffffff bytes length)...
> UBI: Length of Vol-2 updated to 3128 KB
> NAND: Loading UBI volume dtb dtb-spare to @0x21000000 (0xffffffff
> bytes length)...
> UBI: Length of Vol-0 updated to 33 KB
> 
> Booting zImage ......
> 
> Using device tree in place at 0x210001f4
> DT: Injected MAC addresses
> DT: Populated virtual-eprom
> 
> Starting linux kernel ..., machid: 0xffffffff
> 
> Booting Linux on physical CPU 0x0
> Linux version 5.4.109-00033-ga88943c1e68
> (root@runner-tzue45hw-project-205-concurrent-0) (gcc version 6.5.0
> (Linaro GCC 6.5-2018.12)) #1 Wed Apr 14 18:46:17 CEST 2021
> CPU: ARMv7 Processor [410fc051] revision 1 (ARMv7), cr=10c53c7d
> CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
> OF: fdt: Machine model: SMGW
> Memory policy: Data cache writeback
> CPU: All CPU(s) started in SVC mode.
> Built 1 zonelists, mobility grouping on.  Total pages: 32511
> Kernel command line: rootfstype=ubifs ubi.mtd=1 root=ubi0:rootfs
> rootflags=auth_hash_name=sha256,auth_key=ubifsauth:0000aaaaffff0000
> Dentry cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
> Inode-cache hash table entries: 8192 (order: 3, 32768 bytes, linear)
> mem auto-init: stack:off, heap alloc:on, heap free:on
> mem auto-init: clearing system memory may take some time...
> Memory: 121196K/131068K available (5120K kernel code, 197K rwdata,
> 956K rodata, 1024K init, 153K bss, 9872K reserved, 0K cma-reserved)
> NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
> random: get_random_bytes called from start_kernel+0x2ec/0x41c with crng_init=0
> master clk is overclocked
> clocksource: pit: mask: 0xfffffff max_cycles: 0xfffffff, max_idle_ns:
> 14479245754 ns
> Console: colour dummy device 80x30
> printk: console [tty0] enabled
> sched_clock: 32 bits at 100 Hz, resolution 10000000ns, wraps every
> 21474836475000000ns
> Calibrating delay loop... 351.43 BogoMIPS (lpj=1757184)
> pid_max: default: 32768 minimum: 301
> Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
> Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
> CPU: Testing write buffer coherency: ok
> Setting up static identity map for 0x20100000 - 0x2010003c
> devtmpfs: initialized
> VFP support v0.3: implementor 41 architecture 2 part 30 variant 5 rev 1
> clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff,
> max_idle_ns: 19112604462750000 ns
> futex hash table entries: 256 (order: -1, 3072 bytes, linear)
> pinctrl core: initialized pinctrl subsystem
> NET: Registered protocol family 16
> DMA: preallocated 256 KiB pool for atomic coherent allocations
> cpuidle: using governor ladder
> gpio-at91 fffff200.gpio: at address (ptrval)
> gpio-at91 fffff400.gpio: at address (ptrval)
> gpio-at91 fffff600.gpio: at address (ptrval)
> gpio-at91 fffff800.gpio: at address (ptrval)
> gpio-at91 fffffa00.gpio: at address (ptrval)
> pinctrl-at91 ahb:apb:pinctrl@fffff200: initialized AT91 pinctrl driver
> atmel_tcb f0010000.timer: IRQ index 1 not found
> atmel_tcb f0010000.timer: IRQ index 2 not found
> atmel_tcb f8014000.timer: IRQ index 1 not found
> atmel_tcb f8014000.timer: IRQ index 2 not found
> at_hdmac ffffe600.dma-controller: Atmel AHB DMA Controller ( cpy set
> slave ), 8 channels
> at_hdmac ffffe800.dma-controller: Atmel AHB DMA Controller ( cpy set
> slave ), 8 channels
> AT91: Detected SoC family: sama5d3
> AT91: Detected SoC: sama5d35, revision 2
> usbcore: registered new interface driver usbfs
> usbcore: registered new interface driver hub
> usbcore: registered new device driver usb
> at91_i2c f0014000.i2c: using dma0chan0 (tx) and dma0chan1 (rx) for DMA transfers
> at91_i2c f0014000.i2c: AT91 i2c bus driver (hw version: 0x402).
> at91_i2c f0018000.i2c: can't get DMA channel, continue without DMA support
> at91_i2c f0018000.i2c: AT91 i2c bus driver (hw version: 0x402).
> clocksource: Switched to clocksource pit
> NET: Registered protocol family 2
> tcp_listen_portaddr_hash hash table entries: 512 (order: 0, 4096 bytes, linear)
> TCP established hash table entries: 1024 (order: 0, 4096 bytes, linear)
> TCP bind hash table entries: 1024 (order: 0, 4096 bytes, linear)
> TCP: Hash tables configured (established 1024 bind 1024)
> UDP hash table entries: 256 (order: 0, 4096 bytes, linear)
> UDP-Lite hash table entries: 256 (order: 0, 4096 bytes, linear)
> NET: Registered protocol family 1
> Initialise system trusted keyrings
> workingset: timestamp_bits=30 max_order=15 bucket_order=0
> NET: Registered protocol family 38
> Key type asymmetric registered
> Asymmetric key parser 'x509' registered
> atmel_usart_serial.0.auto: ttyS1 at MMIO 0xf001c000 (irq = 21,
> base_baud = 4125000) is a ATMEL_SERIAL
> atmel_usart_serial.1.auto: ttyS2 at MMIO 0xf0020000 (irq = 22,
> base_baud = 8250000) is a ATMEL_SERIAL
> atmel_usart_serial.2.auto: ttyS3 at MMIO 0xf8020000 (irq = 24,
> base_baud = 4125000) is a ATMEL_SERIAL
> atmel_usart_serial.3.auto: ttyS0 at MMIO 0xffffee00 (irq = 32,
> base_baud = 8250000) is a ATMEL_SERIAL
> printk: console [ttyS0] enabled
> atmel_spi f8008000.spi: DMA TX channel not available, SPI unable to use DMA
> atmel_spi f8008000.spi: Atmel SPI Controller using PIO only
> ksz8863@0 enforce active low on chipselect handle
> atmel_spi f8008000.spi: Atmel SPI Controller version 0x213 at
> 0xf8008000 (irq 23)
> libphy: Fixed MDIO Bus: probed
> macb f0028000.ethernet: invalid hw address, using random
> random: fast init done
> libphy: MACB_mii_bus: probed
> Generic PHY fixed-0:00: attached PHY driver [Generic PHY]
> (mii_bus:phy_addr=fixed-0:00, irq=POLL)
> macb f0028000.ethernet eth0: Cadence GEM rev 0x00020119 at 0xf0028000
> irq 40 (12:71:da:58:0b:2d)
> libphy: MACB_mii_bus: probed
> Micrel KSZ8081 or KSZ8091 f802c000.ethernet-ffffffff:00: attached PHY
> driver [Micrel KSZ8081 or KSZ8091]
> (mii_bus:phy_addr=f802c000.ethernet-ffffffff:00, irq=45)
> macb f802c000.ethernet eth1: Cadence MACB rev 0x0001010c at 0xf802c000
> irq 41 (d8:c4:97:1c:9a:d7)
> usbcore: registered new interface driver cdc_ether
> usbcore: registered new interface driver cdc_ncm
> usbcore: registered new interface driver cdc_mbim
> ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
> ehci-platform: EHCI generic platform driver
> ehci-atmel: EHCI Atmel driver
> atmel-ehci 700000.ehci: EHCI Host Controller
> atmel-ehci 700000.ehci: new USB bus registered, assigned bus number 1
> atmel-ehci 700000.ehci: irq 43, io mem 0x00700000
> atmel-ehci 700000.ehci: USB 2.0 started, EHCI 1.00
> usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.04
> usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
> usb usb1: Product: EHCI Host Controller
> usb usb1: Manufacturer: Linux 5.4.109-00033-ga88943c1e68 ehci_hcd
> usb usb1: SerialNumber: 700000.ehci
> hub 1-0:1.0: USB hub found
> hub 1-0:1.0: 3 ports detected
> ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> ohci-platform: OHCI generic platform driver
> ohci-atmel: OHCI Atmel driver
> at91_ohci 600000.ohci: USB Host Controller
> at91_ohci 600000.ohci: new USB bus registered, assigned bus number 2
> at91_ohci 600000.ohci: irq 43, io mem 0x00600000
> usb usb2: New USB device found, idVendor=1d6b, idProduct=0001, bcdDevice= 5.04
> usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
> usb usb2: Product: USB Host Controller
> usb usb2: Manufacturer: Linux 5.4.109-00033-ga88943c1e68 ohci_hcd
> usb usb2: SerialNumber: at91
> hub 2-0:1.0: USB hub found
> hub 2-0:1.0: 3 ports detected
> usbcore: registered new interface driver cdc_acm
> cdc_acm: USB Abstract Control Model driver for USB modems and ISDN adapters
> usbcore: registered new interface driver cdc_wdm
> usbcore: registered new interface driver usbserial_generic
> usbserial: USB Serial support registered for generic
> usbcore: registered new interface driver option
> usbserial: USB Serial support registered for GSM modem (1-port)
> usbcore: registered new interface driver usb_serial_simple
> usbserial: USB Serial support registered for carelink
> usbserial: USB Serial support registered for zio
> usbserial: USB Serial support registered for funsoft
> usbserial: USB Serial support registered for flashloader
> usbserial: USB Serial support registered for google
> usbserial: USB Serial support registered for libtransistor
> usbserial: USB Serial support registered for vivopay
> usbserial: USB Serial support registered for moto_modem
> usbserial: USB Serial support registered for motorola_tetra
> usbserial: USB Serial support registered for novatel_gps
> usbserial: USB Serial support registered for hp4x
> usbserial: USB Serial support registered for suunto
> usbserial: USB Serial support registered for siemens_mpi
> rtc-rv8803 1-0032: registered as rtc0
> i2c /dev entries driver
> at91-reset fffffe00.rstc: Starting after wakeup
> at91sam9_wdt: enabled (heartbeat=15 sec, nowayout=1)
> atmel_aes f8038000.aes: version: 0x135
> atmel_aes f8038000.aes: Atmel AES - Using dma1chan0, dma1chan1 for DMA transfers
> atmel_sha f8034000.sha: version: 0x410
> atmel_sha f8034000.sha: using dma1chan2 for DMA transfers
> atmel_sha f8034000.sha: Atmel SHA1/SHA256/SHA224/SHA384/SHA512
> atmel_tdes f803c000.tdes: version: 0x701
> atmel_tdes f803c000.tdes: using dma1chan3, dma1chan4 for DMA transfers
> atmel_tdes f803c000.tdes: Atmel DES/TDES
> usbcore: registered new interface driver usbhid
> usbhid: USB HID core driver
> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xd3
> nand: Micron MT29F8G08ABACAWP
> nand: 1024 MiB, SLC, erase size: 256 KiB, page size: 4096, OOB size: 224
> Bad block table found at page 262080, version 0x01
> Bad block table found at page 262016, version 0x01
> nand_read_bbt: bad block at 0x000000dc0000
> nand_read_bbt: bad block at 0x00000e280000
> nand_read_bbt: bad block at 0x000024f40000
> nand_read_bbt: bad block at 0x000026ec0000
> nand_read_bbt: bad block at 0x000027ec0000
> nand_read_bbt: bad block at 0x00002f140000
> nand_read_bbt: bad block at 0x000030ec0000
> nand_read_bbt: bad block at 0x000033b40000
> nand_read_bbt: bad block at 0x000038840000
> 2 fixed-partitions partitions found on MTD device atmel_nand
> Creating 2 MTD partitions on "atmel_nand":
> 0x000000000000-0x000000040000 : "at91bootstrap"
> 0x000000040000-0x000040000000 : "ubivols"
> NET: Registered protocol family 10
> Segment Routing with IPv6
> sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
> NET: Registered protocol family 17
> 8021q: 802.1Q VLAN Support v1.8
> Loading compiled-in X.509 certificates
> Loaded X.509 cert 'EFR: efr.de: a54308f075327fb6c3a6fb687fdabc10ce4c51c6'
> Key type ._fscrypt registered
> Key type .fscrypt registered
> ksz8795-switch spi0.0: active low/falling IRQ
> libphy: dsa slave smi: probed
> ksz8795-switch spi0.0 han (uninitialized): PHY [dsa-0.0:00] driver
> [Micrel KSZ886X Switch]
> ksz8795-switch spi0.0 cls (uninitialized): PHY [dsa-0.0:01] driver
> [Micrel KSZ886X Switch]
> ksz8795-switch spi0.0: Using legacy PHYLIB callbacks. Please migrate to PHYLINK!
> DSA: tree 0 setup
> ubi0: attaching mtd1
> usb 2-3: new full-speed USB device number 2 using at91_ohci
> usb 2-3: New USB device found, idVendor=08e6, idProduct=3437, bcdDevice= 2.01
> usb 2-3: New USB device strings: Mfr=1, Product=2, SerialNumber=0
> usb 2-3: Product: USB SmartCard Reader
> usb 2-3: Manufacturer: Gemalto
> random: crng init done
> ubi0: scanning is finished
> ubi0: attached mtd1 (name "ubivols", size 1023 MiB)
> ubi0: PEB size: 262144 bytes (256 KiB), LEB size: 253952 bytes
> ubi0: min./max. I/O unit sizes: 4096/4096, sub-page size 4096
> ubi0: VID header offset: 4096 (aligned 4096), data offset: 8192
> ubi0: good PEBs: 4082, bad PEBs: 13, corrupted PEBs: 0
> ubi0: user volume: 5, internal volumes: 1, max. volumes count: 128
> ubi0: max/mean erase counter: 118/11, WL threshold: 4096, image
> sequence number: 921361235
> ubi0: available PEBs: 0, total reserved PEBs: 4082, PEBs reserved for
> bad PEB handling: 67
> ubi0: background thread "ubi_bgt0d" started, PID 617
> rtc-rv8803 1-0032: setting system clock to 2021-10-09T12:16:01 UTC (1633781761)
> UBIFS (ubi0:4): Mounting in authenticated mode
> UBIFS (ubi0:4): recovery needed
> VFS: Cannot open root device "ubi0:rootfs" or ubi0:rootfs: error -1

Hmm. -1 is -EPERM.
So some authentification check fails.

Can you find out which -EPERM in fs/ubifs this is?
E.g. by adding a printk() before the return.
Maybe this gives us a clue.

> Please append a correct "root=" boot option; here are the available partitions:
> Kernel panic - not syncing: VFS: Unable to mount root fs on ubi0:rootfs
> CPU: 0 PID: 1 Comm: swapper Not tainted 5.4.109-00033-ga88943c1e68 #1
> Hardware name: Atmel SAMA5
> [<c010babc>] (unwind_backtrace) from [<c010a8d0>] (show_stack+0x10/0x14)
> [<c010a8d0>] (show_stack) from [<c0569970>] (panic+0xfc/0x2e8)
> [<c0569970>] (panic) from [<c0801598>] (mount_block_root+0x268/0x2a4)
> [<c0801598>] (mount_block_root) from [<c0801674>] (prepare_namespace+0x9c/0x184)
> [<c0801674>] (prepare_namespace) from [<c056b774>] (kernel_init+0x8/0x108)
> [<c056b774>] (kernel_init) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
> Exception stack(0xc7829fb0 to 0xc7829ff8)
> 9fa0:                                     00000000 00000000 00000000 00000000
> 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on
> ubi0:rootfs ]---
> RomBOOT
> RomBOOT
> 
>>
>> 5.4 is not fresh. Can you use a more recent kernel?
> 
> We are tracking the latest patches on 5.4, but will try whether a
> rebase to 5.10 is straight-forward or not.
> Or at least port the minimal changes which are needed to re-create the crash.
> 
> But first, I will try to reproduce the crash with a stripped down
> program, because with the current tests,
> it can take anywhere between 20-500 iterations of power-cuts with the
> full business logic to trigger the crash.
> (5m ON 30s OFF)

Thanks,
//richard

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: UBIFS panic during brown-out
       [not found]         ` <CADBnMvhCmNQsYRVAveuf=Ri820bLyLdcrA71Nywh0mVL+X8Fng@mail.gmail.com>
@ 2021-10-21 15:33           ` Kristof Havasi
  2021-10-21 17:44             ` Richard Weinberger
  0 siblings, 1 reply; 7+ messages in thread
From: Kristof Havasi @ 2021-10-21 15:33 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd

Dear Richard,

I noticed that I forgot to Cc linux-mtd from the previous emails.

Latest update: we could reproduce the bug with a v5.10 port.
The same EPERM error is hit.

Happy to hear any hints on how I can support in fixing this issue.

Best Regards,
Kristóf


On Tue, 19 Oct 2021 at 18:09, Kristof Havasi <havasiefr@gmail.com> wrote:
>
> Dear Richard,
>
> a quick update on the topic:
>
> I disabled UBIFS-Authentication (but left UBIFS-Encryption), and our
> stress-test has been running successfully without the mount-issue
> (+1150 power-cuts)
> for 3+ days.
>
> If I understand correctly, the hit EPERM error originates from finding
> an auth-snode,
> where the hmac over the previous data nodes doesn't match the calculated one.
> (A possible tampering case, I guess).
>
> If you could give some pointers where to maybe inject more analysis
> printk-s which
> would help further, I would happily re-do the test. (Wasn't yet
> successful to create
> a minimal program to reproduce the issue.)
>
> Is e.g. n_nodes of any interest here?
> Or something on the writer's side which would help further? I am not
> familiar with
> the ubifs source code, but if you point me to the place where the
> caches are written
> to the flash and the authentication node is inserted, I could add some
> debugging there.
>
> > > >> 5.4 is not fresh. Can you use a more recent kernel?
> Would a port to v5.10.y be fresh enough?
>
> Best regards,
> Kristóf
>
> On Thu, 14 Oct 2021 at 13:08, Kristof Havasi <havasiefr@gmail.com> wrote:
> >
> > Dear Richard,
> >
> > On Tue, 12 Oct 2021 at 15:36, Richard Weinberger <richard@nod.at> wrote:
> > >
> > > Kristóf,
> > >
> > > ----- Ursprüngliche Mail -----
> > > > Von: "Kristof Havasi" <havasiefr@gmail.com>
> > > >> So, are you really talking about brown-out? If so, better talk to you hardware
> > > >> guys.
> > > >
> > > > Sorry, s/brown-out/power-cut/.
> > >
> > > :-)
> > >
> > > >>
> > > >> > Does the Auth+Encryption combo increase the chances for corrupting the FS
> > > >> > during brown-out, due to the extra operations?
> > > >>
> > > >> Assuming you actually meant power-cut.
> > > >> Both features are rather new, so there can be still bugs.
> > > >
> > > > If the logs do not reveal any pointers, we could re-test with disabled enc/auth.
> > > >
> > > >>
> > > >> > Would you suggest turning on any of the chk_* knobs in the debugfs?
> > > >> > I am not sure that they are helpful, or will just modify the behaviour of the
> > > >> > timings of the system too much.
> > > >>
> > > >> Let's start with logs first.
> > > >>
> > > >> > Would it be a last resort in case the brown-out is detected to switch the FS
> > > >> > into ro mode? Is there any API/ABI apart from the debugfs's ro_error knob to
> > > >> > switch the FS into read-only mode and so trying to prevent file-system
> > > >> > corruption?
> > > >>
> > > >> Not really. You need to make sure that the current NAND command is finished
> > > >> and no new one will be scheduled. Depending on your hardware design this can
> > > >> be a challenge.
> > > >
> > > > So would a "successful `sync` and ro mode switch" be a safe bet, if
> > > > during the two
> > > > actions no new data is written to the FS?
> > >
> > > Kind of. Actually UBIFS should handle this just fine.
> > > Before we consider workarounds, let's give the problem more thought.
> >
> > I can see that in all pre-crash cases the recovery is successful:
> >
> > ubi0: scanning is finished
> > ubi0: attached mtd1 (name "ubivols", size 1023 MiB)
> > ubi0: PEB size: 262144 bytes (256 KiB), LEB size: 253952 bytes
> > ubi0: min./max. I/O unit sizes: 4096/4096, sub-page size 4096
> > ubi0: VID header offset: 4096 (aligned 4096), data offset: 8192
> > ubi0: good PEBs: 4063, bad PEBs: 32, corrupted PEBs: 0
> > ubi0: user volume: 5, internal volumes: 1, max. volumes count: 128
> > ubi0: max/mean erase counter: 3/1, WL threshold: 4096, image sequence
> > number: 1244242931
> > ubi0: available PEBs: 0, total reserved PEBs: 4063, PEBs reserved for
> > bad PEB handling: 48
> > ubi0: background thread "ubi_bgt0d" started, PID 622
> > UBIFS (ubi0:4): Mounting in authenticated mode
> > UBIFS (ubi0:4): recovery needed
> > UBIFS (ubi0:4): recovery deferred
> > UBIFS (ubi0:4): UBIFS: mounted UBI device 0, volume 4, name "rootfs", R/O mode
> > UBIFS (ubi0:4): LEB size: 253952 bytes (248 KiB), min./max. I/O unit
> > sizes: 4096 bytes/4096 bytes
> > UBIFS (ubi0:4): FS size: 1004888064 bytes (958 MiB, 3957 LEBs),
> > journal size 9404416 bytes (8 MiB, 38 LEBs)
> > UBIFS (ubi0:4): reserved for root: 0 bytes (0 KiB)
> > UBIFS (ubi0:4): media format: w5/r0 (latest is w5/r0), UUID
> > 33BCA4FE-C691-4F5D-ACD1-D2B9D6D6B201, small LPT model
> > VFS: Mounted root (ubifs filesystem) readonly on device 0:13.
> > fscrypt: AES-256-CTS-CBC using implementation "cts(atmel-cbc-aes)"
> > devtmpfs: mounted
> > Freeing unused kernel memory: 1024K
> > Run /sbin/init as init process
> > fscrypt: AES-256-XTS using implementation "xts(atmel-ecb-aes)"
> > UBIFS (ubi0:4): completing deferred recovery
> > UBIFS (ubi0:4): background thread "ubifs_bgt0_4" started, PID 632
> > UBIFS (ubi0:4): deferred recovery completed
> >
> > >
> > > >>
> > > >> > Any pointers are welcomed!
> > > >> >
> > > >> > We are on Kernel v5.4.150
> > > >> > We use _both_ UBIFS Authentication and Encryption.
> > > >> > The board is a SAMA5D3 powered embedded device with 1GB NAND flash,
> > > >> > which is not even nearly full (10% used).
> > > >> >
> > > >> > Kernel panic after the last brown-out:
> > > >> > ubi0: scanning is finished
> > > >> > ubi0: attached mtd1 (name "ubivols", size 1023 MiB)
> > > >> > ubi0: PEB size: 262144 bytes (256 KiB), LEB size: 253952 bytes
> > > >> > ubi0: min./max. I/O unit sizes: 4096/4096, sub-page size 4096
> > > >> > ubi0: VID header offset: 4096 (aligned 4096), data offset: 8192
> > > >> > ubi0: good PEBs: 4082, bad PEBs: 13, corrupted PEBs: 0
> > > >> > ubi0: user volume: 5, internal volumes: 1, max. volumes count: 128
> > > >> > ubi0: max/mean erase counter: 118/11, WL threshold: 4096, image
> > > >> > sequence number: 921361235
> > > >> >>ubi0: available PEBs: 0, total reserved PEBs: 4082, PEBs reserved for bad PEB
> > > >> >>handling: 67
> > > >> > ubi0: background thread "ubi_bgt0d" started, PID 617
> > > >> > UBIFS (ubi0:4): Mounting in authenticated mode
> > > >> > UBIFS (ubi0:4): recovery needed
> > > >> >>VFS: Cannot open root device "ubi0:rootfs" or ubi0:rootfs: error -1
> > > >>
> > > >> There is a lot of context missing. Can you please share the full logs?
> > > >> Usually UBIFS prints in detail what went wrong.
> > > >
> > > > I couldn't see any UBIFS logs over the serial (but the kernel log
> > > > level was also only 1)
> > > > and the application log was also redirected there...
> > > > But here is the full log from the application logs through the power-cut and the
> > > > boot-up
> > > >
> > > > [Sat Oct 09 14:15:00 2021] pwr outage WARN: Power outage detected ...
> > > > [Sat Oct 09 14:15:00 2021] pwr outage INFO: Low Power mode activated
> > > > [Sat Oct 09 14:15:00 2021] event manager TRACE: Database
> > > > /data/logsystem.db opened successfully
> > > > [Sat Oct 09 14:15:00 2021] event manager TRACE: SQLite prepare :
> > > > SELECT * FROM log_sys WHERE  event_flag=2 OR event_flag=1 ORDER BY
> > > > timestamp ASC LIMIT 1;
> > > > [Sat Oct 09 14:15:00 2021] fnn logger TRACE: Database
> > > > /data/logcustomer.db opened successfully
> > > > [snip] some 10s of lines of application log related to SQL operations
> > > > [Sat Oct 09 14:15:00 2021] fnn logger TRACE: SQLite prepare : SELECT
> > > > MAX(repetition_counter), MAX(record_number), MAX(parent_record_number)
> > > > FROM log_customer WHERE  timestamp > ?1 AND event_id = ?2 AND
> > > > event_sub_id = ?3 AND user_identity = ?4 AND subject_identity = ?5 AND
> > > > message_extensions = ?6 AND peer_identity = ?7;
> > > > [Sat Oct 09 14:15:00 2021] fnn logger TRACE: Stepping:
> > > > SELECT MAX(repetition_counter), MAX(record_number),
> > > > MAX(parent_record_number) FROM log_customer WHERE  timestamp >
> > > > 1633695300 AND event_id = 19005 AND event_sub_id = 1 AND user_identity
> > > > = '' AND subject_identity = '01005e3180a0.eefr2000003006.sm' AND
> > > > message_extensions = '' AND peer_identity = '';
> > > > [Sat Oct 09 14:15:00 2021] fnn logger TRACE: SQL-Step executed successfully
> > > > [Sat Oct 09 14:15:00 2021] fnn logger TRACE: SQLite prepare : UPDATE
> > > > log_customer SET [snip] WHERE record_number = ?21 AND
> > > > repetition_counter = ?22;
> > > > [Sat Oct 09 14:15:01 2021] fnn logger TRACE: Stepping:
> > > > UPDATE log_customer SET repetition_counter = 372, parent_record_number
> > > > = 3031, seconds_index = 246, timestamp = 1633781700, level = 2,
> > > > version = 1, length = 0, device_typ¦RomBOOT
> > > > RomBOOT
> > > >
> > > > AT91Bootstrap for SMGW 3.8.12-00020-g0f08dbf (Wed Apr 14 18:46:12 CEST 2021)
> > > >
> > > > NAND: ONFI flash detected
> > > > NAND: Manufacturer ID: 0x2c Chip ID: 0xd3
> > > > NAND: Page Bytes: 4096, Spare Bytes: 224
> > > > NAND: ECC Correctability Bits: 8, ECC Sector Bytes: 512
> > > > NAND: Disable On-Die ECC
> > > > NAND: Initialize PMECC params, cap: 8, sector: 512
> > > > NAND: Initialize UBI...
> > > > UBI: Loading volume-table 0 at PEB 202 with copy flag!
> > > > NAND: Loading UBI volume kernel kernel-spare to @0x22000000
> > > > (0xffffffff bytes length)...
> > > > UBI: Length of Vol-2 updated to 3128 KB
> > > > NAND: Loading UBI volume dtb dtb-spare to @0x21000000 (0xffffffff
> > > > bytes length)...
> > > > UBI: Length of Vol-0 updated to 33 KB
> > > >
> > > > Booting zImage ......
> > > >
> > > > Using device tree in place at 0x210001f4
> > > > DT: Injected MAC addresses
> > > > DT: Populated virtual-eprom
> > > >
> > > > Starting linux kernel ..., machid: 0xffffffff
> > > >
> > > > Booting Linux on physical CPU 0x0
> > > > Linux version 5.4.109-00033-ga88943c1e68
> > > > (root@runner-tzue45hw-project-205-concurrent-0) (gcc version 6.5.0
> > > > (Linaro GCC 6.5-2018.12)) #1 Wed Apr 14 18:46:17 CEST 2021
> > > > CPU: ARMv7 Processor [410fc051] revision 1 (ARMv7), cr=10c53c7d
> > > > CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
> > > > OF: fdt: Machine model: SMGW
> > > > Memory policy: Data cache writeback
> > > > CPU: All CPU(s) started in SVC mode.
> > > > Built 1 zonelists, mobility grouping on.  Total pages: 32511
> > > > Kernel command line: rootfstype=ubifs ubi.mtd=1 root=ubi0:rootfs
> > > > rootflags=auth_hash_name=sha256,auth_key=ubifsauth:0000aaaaffff0000
> > > > Dentry cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
> > > > Inode-cache hash table entries: 8192 (order: 3, 32768 bytes, linear)
> > > > mem auto-init: stack:off, heap alloc:on, heap free:on
> > > > mem auto-init: clearing system memory may take some time...
> > > > Memory: 121196K/131068K available (5120K kernel code, 197K rwdata,
> > > > 956K rodata, 1024K init, 153K bss, 9872K reserved, 0K cma-reserved)
> > > > NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
> > > > random: get_random_bytes called from start_kernel+0x2ec/0x41c with crng_init=0
> > > > master clk is overclocked
> > > > clocksource: pit: mask: 0xfffffff max_cycles: 0xfffffff, max_idle_ns:
> > > > 14479245754 ns
> > > > Console: colour dummy device 80x30
> > > > printk: console [tty0] enabled
> > > > sched_clock: 32 bits at 100 Hz, resolution 10000000ns, wraps every
> > > > 21474836475000000ns
> > > > Calibrating delay loop... 351.43 BogoMIPS (lpj=1757184)
> > > > pid_max: default: 32768 minimum: 301
> > > > Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
> > > > Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
> > > > CPU: Testing write buffer coherency: ok
> > > > Setting up static identity map for 0x20100000 - 0x2010003c
> > > > devtmpfs: initialized
> > > > VFP support v0.3: implementor 41 architecture 2 part 30 variant 5 rev 1
> > > > clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff,
> > > > max_idle_ns: 19112604462750000 ns
> > > > futex hash table entries: 256 (order: -1, 3072 bytes, linear)
> > > > pinctrl core: initialized pinctrl subsystem
> > > > NET: Registered protocol family 16
> > > > DMA: preallocated 256 KiB pool for atomic coherent allocations
> > > > cpuidle: using governor ladder
> > > > gpio-at91 fffff200.gpio: at address (ptrval)
> > > > gpio-at91 fffff400.gpio: at address (ptrval)
> > > > gpio-at91 fffff600.gpio: at address (ptrval)
> > > > gpio-at91 fffff800.gpio: at address (ptrval)
> > > > gpio-at91 fffffa00.gpio: at address (ptrval)
> > > > pinctrl-at91 ahb:apb:pinctrl@fffff200: initialized AT91 pinctrl driver
> > > > atmel_tcb f0010000.timer: IRQ index 1 not found
> > > > atmel_tcb f0010000.timer: IRQ index 2 not found
> > > > atmel_tcb f8014000.timer: IRQ index 1 not found
> > > > atmel_tcb f8014000.timer: IRQ index 2 not found
> > > > at_hdmac ffffe600.dma-controller: Atmel AHB DMA Controller ( cpy set
> > > > slave ), 8 channels
> > > > at_hdmac ffffe800.dma-controller: Atmel AHB DMA Controller ( cpy set
> > > > slave ), 8 channels
> > > > AT91: Detected SoC family: sama5d3
> > > > AT91: Detected SoC: sama5d35, revision 2
> > > > usbcore: registered new interface driver usbfs
> > > > usbcore: registered new interface driver hub
> > > > usbcore: registered new device driver usb
> > > > at91_i2c f0014000.i2c: using dma0chan0 (tx) and dma0chan1 (rx) for DMA transfers
> > > > at91_i2c f0014000.i2c: AT91 i2c bus driver (hw version: 0x402).
> > > > at91_i2c f0018000.i2c: can't get DMA channel, continue without DMA support
> > > > at91_i2c f0018000.i2c: AT91 i2c bus driver (hw version: 0x402).
> > > > clocksource: Switched to clocksource pit
> > > > NET: Registered protocol family 2
> > > > tcp_listen_portaddr_hash hash table entries: 512 (order: 0, 4096 bytes, linear)
> > > > TCP established hash table entries: 1024 (order: 0, 4096 bytes, linear)
> > > > TCP bind hash table entries: 1024 (order: 0, 4096 bytes, linear)
> > > > TCP: Hash tables configured (established 1024 bind 1024)
> > > > UDP hash table entries: 256 (order: 0, 4096 bytes, linear)
> > > > UDP-Lite hash table entries: 256 (order: 0, 4096 bytes, linear)
> > > > NET: Registered protocol family 1
> > > > Initialise system trusted keyrings
> > > > workingset: timestamp_bits=30 max_order=15 bucket_order=0
> > > > NET: Registered protocol family 38
> > > > Key type asymmetric registered
> > > > Asymmetric key parser 'x509' registered
> > > > atmel_usart_serial.0.auto: ttyS1 at MMIO 0xf001c000 (irq = 21,
> > > > base_baud = 4125000) is a ATMEL_SERIAL
> > > > atmel_usart_serial.1.auto: ttyS2 at MMIO 0xf0020000 (irq = 22,
> > > > base_baud = 8250000) is a ATMEL_SERIAL
> > > > atmel_usart_serial.2.auto: ttyS3 at MMIO 0xf8020000 (irq = 24,
> > > > base_baud = 4125000) is a ATMEL_SERIAL
> > > > atmel_usart_serial.3.auto: ttyS0 at MMIO 0xffffee00 (irq = 32,
> > > > base_baud = 8250000) is a ATMEL_SERIAL
> > > > printk: console [ttyS0] enabled
> > > > atmel_spi f8008000.spi: DMA TX channel not available, SPI unable to use DMA
> > > > atmel_spi f8008000.spi: Atmel SPI Controller using PIO only
> > > > ksz8863@0 enforce active low on chipselect handle
> > > > atmel_spi f8008000.spi: Atmel SPI Controller version 0x213 at
> > > > 0xf8008000 (irq 23)
> > > > libphy: Fixed MDIO Bus: probed
> > > > macb f0028000.ethernet: invalid hw address, using random
> > > > random: fast init done
> > > > libphy: MACB_mii_bus: probed
> > > > Generic PHY fixed-0:00: attached PHY driver [Generic PHY]
> > > > (mii_bus:phy_addr=fixed-0:00, irq=POLL)
> > > > macb f0028000.ethernet eth0: Cadence GEM rev 0x00020119 at 0xf0028000
> > > > irq 40 (12:71:da:58:0b:2d)
> > > > libphy: MACB_mii_bus: probed
> > > > Micrel KSZ8081 or KSZ8091 f802c000.ethernet-ffffffff:00: attached PHY
> > > > driver [Micrel KSZ8081 or KSZ8091]
> > > > (mii_bus:phy_addr=f802c000.ethernet-ffffffff:00, irq=45)
> > > > macb f802c000.ethernet eth1: Cadence MACB rev 0x0001010c at 0xf802c000
> > > > irq 41 (d8:c4:97:1c:9a:d7)
> > > > usbcore: registered new interface driver cdc_ether
> > > > usbcore: registered new interface driver cdc_ncm
> > > > usbcore: registered new interface driver cdc_mbim
> > > > ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
> > > > ehci-platform: EHCI generic platform driver
> > > > ehci-atmel: EHCI Atmel driver
> > > > atmel-ehci 700000.ehci: EHCI Host Controller
> > > > atmel-ehci 700000.ehci: new USB bus registered, assigned bus number 1
> > > > atmel-ehci 700000.ehci: irq 43, io mem 0x00700000
> > > > atmel-ehci 700000.ehci: USB 2.0 started, EHCI 1.00
> > > > usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.04
> > > > usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
> > > > usb usb1: Product: EHCI Host Controller
> > > > usb usb1: Manufacturer: Linux 5.4.109-00033-ga88943c1e68 ehci_hcd
> > > > usb usb1: SerialNumber: 700000.ehci
> > > > hub 1-0:1.0: USB hub found
> > > > hub 1-0:1.0: 3 ports detected
> > > > ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> > > > ohci-platform: OHCI generic platform driver
> > > > ohci-atmel: OHCI Atmel driver
> > > > at91_ohci 600000.ohci: USB Host Controller
> > > > at91_ohci 600000.ohci: new USB bus registered, assigned bus number 2
> > > > at91_ohci 600000.ohci: irq 43, io mem 0x00600000
> > > > usb usb2: New USB device found, idVendor=1d6b, idProduct=0001, bcdDevice= 5.04
> > > > usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
> > > > usb usb2: Product: USB Host Controller
> > > > usb usb2: Manufacturer: Linux 5.4.109-00033-ga88943c1e68 ohci_hcd
> > > > usb usb2: SerialNumber: at91
> > > > hub 2-0:1.0: USB hub found
> > > > hub 2-0:1.0: 3 ports detected
> > > > usbcore: registered new interface driver cdc_acm
> > > > cdc_acm: USB Abstract Control Model driver for USB modems and ISDN adapters
> > > > usbcore: registered new interface driver cdc_wdm
> > > > usbcore: registered new interface driver usbserial_generic
> > > > usbserial: USB Serial support registered for generic
> > > > usbcore: registered new interface driver option
> > > > usbserial: USB Serial support registered for GSM modem (1-port)
> > > > usbcore: registered new interface driver usb_serial_simple
> > > > usbserial: USB Serial support registered for carelink
> > > > usbserial: USB Serial support registered for zio
> > > > usbserial: USB Serial support registered for funsoft
> > > > usbserial: USB Serial support registered for flashloader
> > > > usbserial: USB Serial support registered for google
> > > > usbserial: USB Serial support registered for libtransistor
> > > > usbserial: USB Serial support registered for vivopay
> > > > usbserial: USB Serial support registered for moto_modem
> > > > usbserial: USB Serial support registered for motorola_tetra
> > > > usbserial: USB Serial support registered for novatel_gps
> > > > usbserial: USB Serial support registered for hp4x
> > > > usbserial: USB Serial support registered for suunto
> > > > usbserial: USB Serial support registered for siemens_mpi
> > > > rtc-rv8803 1-0032: registered as rtc0
> > > > i2c /dev entries driver
> > > > at91-reset fffffe00.rstc: Starting after wakeup
> > > > at91sam9_wdt: enabled (heartbeat=15 sec, nowayout=1)
> > > > atmel_aes f8038000.aes: version: 0x135
> > > > atmel_aes f8038000.aes: Atmel AES - Using dma1chan0, dma1chan1 for DMA transfers
> > > > atmel_sha f8034000.sha: version: 0x410
> > > > atmel_sha f8034000.sha: using dma1chan2 for DMA transfers
> > > > atmel_sha f8034000.sha: Atmel SHA1/SHA256/SHA224/SHA384/SHA512
> > > > atmel_tdes f803c000.tdes: version: 0x701
> > > > atmel_tdes f803c000.tdes: using dma1chan3, dma1chan4 for DMA transfers
> > > > atmel_tdes f803c000.tdes: Atmel DES/TDES
> > > > usbcore: registered new interface driver usbhid
> > > > usbhid: USB HID core driver
> > > > nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xd3
> > > > nand: Micron MT29F8G08ABACAWP
> > > > nand: 1024 MiB, SLC, erase size: 256 KiB, page size: 4096, OOB size: 224
> > > > Bad block table found at page 262080, version 0x01
> > > > Bad block table found at page 262016, version 0x01
> > > > nand_read_bbt: bad block at 0x000000dc0000
> > > > nand_read_bbt: bad block at 0x00000e280000
> > > > nand_read_bbt: bad block at 0x000024f40000
> > > > nand_read_bbt: bad block at 0x000026ec0000
> > > > nand_read_bbt: bad block at 0x000027ec0000
> > > > nand_read_bbt: bad block at 0x00002f140000
> > > > nand_read_bbt: bad block at 0x000030ec0000
> > > > nand_read_bbt: bad block at 0x000033b40000
> > > > nand_read_bbt: bad block at 0x000038840000
> > > > 2 fixed-partitions partitions found on MTD device atmel_nand
> > > > Creating 2 MTD partitions on "atmel_nand":
> > > > 0x000000000000-0x000000040000 : "at91bootstrap"
> > > > 0x000000040000-0x000040000000 : "ubivols"
> > > > NET: Registered protocol family 10
> > > > Segment Routing with IPv6
> > > > sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
> > > > NET: Registered protocol family 17
> > > > 8021q: 802.1Q VLAN Support v1.8
> > > > Loading compiled-in X.509 certificates
> > > > Loaded X.509 cert 'EFR: efr.de: a54308f075327fb6c3a6fb687fdabc10ce4c51c6'
> > > > Key type ._fscrypt registered
> > > > Key type .fscrypt registered
> > > > ksz8795-switch spi0.0: active low/falling IRQ
> > > > libphy: dsa slave smi: probed
> > > > ksz8795-switch spi0.0 han (uninitialized): PHY [dsa-0.0:00] driver
> > > > [Micrel KSZ886X Switch]
> > > > ksz8795-switch spi0.0 cls (uninitialized): PHY [dsa-0.0:01] driver
> > > > [Micrel KSZ886X Switch]
> > > > ksz8795-switch spi0.0: Using legacy PHYLIB callbacks. Please migrate to PHYLINK!
> > > > DSA: tree 0 setup
> > > > ubi0: attaching mtd1
> > > > usb 2-3: new full-speed USB device number 2 using at91_ohci
> > > > usb 2-3: New USB device found, idVendor=08e6, idProduct=3437, bcdDevice= 2.01
> > > > usb 2-3: New USB device strings: Mfr=1, Product=2, SerialNumber=0
> > > > usb 2-3: Product: USB SmartCard Reader
> > > > usb 2-3: Manufacturer: Gemalto
> > > > random: crng init done
> > > > ubi0: scanning is finished
> > > > ubi0: attached mtd1 (name "ubivols", size 1023 MiB)
> > > > ubi0: PEB size: 262144 bytes (256 KiB), LEB size: 253952 bytes
> > > > ubi0: min./max. I/O unit sizes: 4096/4096, sub-page size 4096
> > > > ubi0: VID header offset: 4096 (aligned 4096), data offset: 8192
> > > > ubi0: good PEBs: 4082, bad PEBs: 13, corrupted PEBs: 0
> > > > ubi0: user volume: 5, internal volumes: 1, max. volumes count: 128
> > > > ubi0: max/mean erase counter: 118/11, WL threshold: 4096, image
> > > > sequence number: 921361235
> > > > ubi0: available PEBs: 0, total reserved PEBs: 4082, PEBs reserved for
> > > > bad PEB handling: 67
> > > > ubi0: background thread "ubi_bgt0d" started, PID 617
> > > > rtc-rv8803 1-0032: setting system clock to 2021-10-09T12:16:01 UTC (1633781761)
> > > > UBIFS (ubi0:4): Mounting in authenticated mode
> > > > UBIFS (ubi0:4): recovery needed
> > > > VFS: Cannot open root device "ubi0:rootfs" or ubi0:rootfs: error -1
> > >
> > > Hmm. -1 is -EPERM.
> > > So some authentification check fails.
> > >
> > > Can you find out which -EPERM in fs/ubifs this is?
> > > E.g. by adding a printk() before the return.
> > > Maybe this gives us a clue.
> >
> > I wasn't able to reproduce the crash in a stripped down format, but re-started
> > the full-blown test with the printk()-s before the EPREM-s.
> > It failed again this morning and here is the result:
> >
> > The one yielding is from authenticate_sleb() from fs/ubifs/replay.c,
> > when ubifs_check_hmac() fails:
> >
> > err = ubifs_check_hmac(c, auth->hmac,
> > if (err) {
> >         err = -EPERM;
> >              goto out;
> > }
> >
> > >
> > > > Please append a correct "root=" boot option; here are the available partitions:
> > > > Kernel panic - not syncing: VFS: Unable to mount root fs on ubi0:rootfs
> > > > CPU: 0 PID: 1 Comm: swapper Not tainted 5.4.109-00033-ga88943c1e68 #1
> > > > Hardware name: Atmel SAMA5
> > > > [<c010babc>] (unwind_backtrace) from [<c010a8d0>] (show_stack+0x10/0x14)
> > > > [<c010a8d0>] (show_stack) from [<c0569970>] (panic+0xfc/0x2e8)
> > > > [<c0569970>] (panic) from [<c0801598>] (mount_block_root+0x268/0x2a4)
> > > > [<c0801598>] (mount_block_root) from [<c0801674>] (prepare_namespace+0x9c/0x184)
> > > > [<c0801674>] (prepare_namespace) from [<c056b774>] (kernel_init+0x8/0x108)
> > > > [<c056b774>] (kernel_init) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
> > > > Exception stack(0xc7829fb0 to 0xc7829ff8)
> > > > 9fa0:                                     00000000 00000000 00000000 00000000
> > > > 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > > > 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> > > > ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on
> > > > ubi0:rootfs ]---
> > > > RomBOOT
> > > > RomBOOT
> > > >
> > > >>
> > > >> 5.4 is not fresh. Can you use a more recent kernel?
> > > >
> > > > We are tracking the latest patches on 5.4, but will try whether a
> > > > rebase to 5.10 is straight-forward or not.
> > > > Or at least port the minimal changes which are needed to re-create the crash.
> > > >
> > > > But first, I will try to reproduce the crash with a stripped down
> > > > program, because with the current tests,
> > > > it can take anywhere between 20-500 iterations of power-cuts with the
> > > > full business logic to trigger the crash.
> > > > (5m ON 30s OFF)
> > >
> > > Thanks,
> > > //richard
> >
> > Best Regards,
> > Kristóf

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: UBIFS panic during brown-out
  2021-10-21 15:33           ` Kristof Havasi
@ 2021-10-21 17:44             ` Richard Weinberger
  2021-10-25  9:35               ` Kristof Havasi
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Weinberger @ 2021-10-21 17:44 UTC (permalink / raw)
  To: Kristof Havasi; +Cc: linux-mtd

Kristof,

----- Ursprüngliche Mail -----
> Von: "Kristof Havasi" <havasiefr@gmail.com>
> An: "richard" <richard@nod.at>
> CC: "linux-mtd" <linux-mtd@lists.infradead.org>
> Gesendet: Donnerstag, 21. Oktober 2021 17:33:15
> Betreff: Re: UBIFS panic during brown-out

> Dear Richard,
> 
> I noticed that I forgot to Cc linux-mtd from the previous emails.
> 
> Latest update: we could reproduce the bug with a v5.10 port.
> The same EPERM error is hit.
> 
> Happy to hear any hints on how I can support in fixing this issue.

so I fear you found a bug in the UBIFS auth code.
Is it possible to share the filesystem and they keymaterial with me?

Thanks,
//richard

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: UBIFS panic during brown-out
  2021-10-21 17:44             ` Richard Weinberger
@ 2021-10-25  9:35               ` Kristof Havasi
  0 siblings, 0 replies; 7+ messages in thread
From: Kristof Havasi @ 2021-10-25  9:35 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd

Dear Richard,

On Thu, 21 Oct 2021 at 19:44, Richard Weinberger <richard@nod.at> wrote:
>
> Kristof,
>
> ----- Ursprüngliche Mail -----
> > Von: "Kristof Havasi" <havasiefr@gmail.com>
> > An: "richard" <richard@nod.at>
> > CC: "linux-mtd" <linux-mtd@lists.infradead.org>
> > Gesendet: Donnerstag, 21. Oktober 2021 17:33:15
> > Betreff: Re: UBIFS panic during brown-out
>
> > Dear Richard,
> >
> > I noticed that I forgot to Cc linux-mtd from the previous emails.
> >
> > Latest update: we could reproduce the bug with a v5.10 port.
> > The same EPERM error is hit.
> >
> > Happy to hear any hints on how I can support in fixing this issue.
>
> so I fear you found a bug in the UBIFS auth code.
> Is it possible to share the filesystem and they keymaterial with me?

I used our build server to generate the test images, where we use one-time
throw-away keys/certs for each build. Anyway, I managed to retrieve
the key materials
for the failing v5.10-based test.

We have the following layout on the flash:
0000 - [XXX] partition for bootstrap (at91)
[XXX]-           partition for UBI image

where the UBI image has:
- vol-0 DTB0 [ro]
- vol-1 DTB1 [ro]
- vol-2 Kernel0 [ro]
- vol-3 Kernel1 [ro]
- vol-4 RootFS [dyn, autoresize]

I don't know how to extract only the rootFS from the "bricked" device.
But I could dump the flash of that device and create a link for you
with that and the keys/certs.
Would that help?

>
> Thanks,
> //richard

Best Regards,
Kristóf

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-10-25  9:36 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-11 17:16 UBIFS panic during brown-out Kristof Havasi
2021-10-11 19:02 ` Richard Weinberger
2021-10-12 13:02   ` Kristof Havasi
2021-10-12 13:36     ` Richard Weinberger
     [not found]       ` <CADBnMvidNZEiJfSTLe-rj+V0N=EtW8jTrYXzc2KRJKNS-zkCjg@mail.gmail.com>
     [not found]         ` <CADBnMvhCmNQsYRVAveuf=Ri820bLyLdcrA71Nywh0mVL+X8Fng@mail.gmail.com>
2021-10-21 15:33           ` Kristof Havasi
2021-10-21 17:44             ` Richard Weinberger
2021-10-25  9:35               ` Kristof Havasi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.