All of lore.kernel.org
 help / color / mirror / Atom feed
* [U-Boot] arm: socfpga gen5: warm reboot reliability
@ 2018-11-14 20:30 Simon Goldschmidt
  2018-11-14 20:51 ` Marek Vasut
  0 siblings, 1 reply; 4+ messages in thread
From: Simon Goldschmidt @ 2018-11-14 20:30 UTC (permalink / raw)
  To: u-boot

Hi,

[This whole description is not qspi specific but qspi happens to be my 
boot source. It should be the same when booting from mmc or nand.]

On my socfgpa gen5 board, I was a little surprised that a soft reboot 
did not start the SPL from qspi after updating it from a running Linux 
(via swupdate writing to qspi).

When debugging this, I found that this is because SPL and U-Boot set the 
magic value 0xae9efebc to the sysmgr_regs->romcodegrp_warmramgrp_enable 
register, which tells the boot ROM that it can jump to SPL in OnChip RAM 
on next warm reboot.

Now this is a problem for me specifically because I want to run the SPL 
stored in qspi after updating it. I don't know if I'm the only one 
wanting to do that, but I could implement this for my own board by just 
resetting the magic value in the warmramgrp_enable register.

What surprised me more is that the boot ROM provides the ability to 
check the CRC of SPL, but it doesn't check the CRC since the related 
registers tellt it to not do that (to enable this, SPL would have to 
write its length + CRC to sysmgr registers). I would have suspected this 
bad design since maybe we were rebooted by the watchdog because some 
driver wrote bad data somewhere, so it might well be the SPL in OnChip 
RAM is not intact any more.

Now should we:
a) calculate and set a CRC for SPL's text / rodata segments (what about 
initialized rwdata?) or
b) set the warmramgrp_enable register to always load SPL from boot source or
c) make it configurable?

Writing the code for any of the 3 choices is easy, so I wanted to get an 
opinion first before sending a patch. Personally, I'd favour always 
loading SPL from boot source, but the comment (in misc_gen5.c 
arch_early_init_r()) suggests I'd break ancient kernels with that (I 
assume the problem is the problem I have with 4.9: Linux sets qspi into 
4-byte mode and the boot rom can then not load SPL since it uses 3-byte 
mode without switching mode in the qspi chip).

Any thoughts?

Simon

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [U-Boot] arm: socfpga gen5: warm reboot reliability
  2018-11-14 20:30 [U-Boot] arm: socfpga gen5: warm reboot reliability Simon Goldschmidt
@ 2018-11-14 20:51 ` Marek Vasut
  2018-11-14 21:04   ` Simon Goldschmidt
  0 siblings, 1 reply; 4+ messages in thread
From: Marek Vasut @ 2018-11-14 20:51 UTC (permalink / raw)
  To: u-boot

On 11/14/2018 09:30 PM, Simon Goldschmidt wrote:
> Hi,

Hi,

> [This whole description is not qspi specific but qspi happens to be my
> boot source. It should be the same when booting from mmc or nand.]
> 
> On my socfgpa gen5 board, I was a little surprised that a soft reboot
> did not start the SPL from qspi after updating it from a running Linux
> (via swupdate writing to qspi).
> 
> When debugging this, I found that this is because SPL and U-Boot set the
> magic value 0xae9efebc to the sysmgr_regs->romcodegrp_warmramgrp_enable
> register, which tells the boot ROM that it can jump to SPL in OnChip RAM
> on next warm reboot.
> 
> Now this is a problem for me specifically because I want to run the SPL
> stored in qspi after updating it. I don't know if I'm the only one
> wanting to do that, but I could implement this for my own board by just
> resetting the magic value in the warmramgrp_enable register.

Yes, that's Alteras' workaround for broken systems which do not connect
SPI NOR reset and thus leave SPI NOR in undefined state (ie. 4B mode).
When the bootrom tries to load from it, it fails and the system hangs.

You can do cold reset instead of warm reset to force the bootrom to
reload the preloader from SPI NOR, it's another bit in the reset register.

> What surprised me more is that the boot ROM provides the ability to
> check the CRC of SPL, but it doesn't check the CRC since the related
> registers tellt it to not do that (to enable this, SPL would have to
> write its length + CRC to sysmgr registers). I would have suspected this
> bad design since maybe we were rebooted by the watchdog because some
> driver wrote bad data somewhere, so it might well be the SPL in OnChip
> RAM is not intact any more.

IIRC the BootROM only checks CRC of the header, no ?

> Now should we:
> a) calculate and set a CRC for SPL's text / rodata segments (what about
> initialized rwdata?) or
> b) set the warmramgrp_enable register to always load SPL from boot
> source or
> c) make it configurable?

a) would be ideal, if it really is possible and works :)

> Writing the code for any of the 3 choices is easy, so I wanted to get an
> opinion first before sending a patch. Personally, I'd favour always
> loading SPL from boot source, but the comment (in misc_gen5.c
> arch_early_init_r()) suggests I'd break ancient kernels with that (I
> assume the problem is the problem I have with 4.9: Linux sets qspi into
> 4-byte mode and the boot rom can then not load SPL since it uses 3-byte
> mode without switching mode in the qspi chip).

You'd break broken boards some more , that's all. It has nothing to do
with kernel version.

> Any thoughts?
> 
> Simon


-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [U-Boot] arm: socfpga gen5: warm reboot reliability
  2018-11-14 20:51 ` Marek Vasut
@ 2018-11-14 21:04   ` Simon Goldschmidt
  2018-11-14 23:06     ` Marek Vasut
  0 siblings, 1 reply; 4+ messages in thread
From: Simon Goldschmidt @ 2018-11-14 21:04 UTC (permalink / raw)
  To: u-boot

On 14.11.2018 21:51, Marek Vasut wrote:
> On 11/14/2018 09:30 PM, Simon Goldschmidt wrote:
>> Hi,
> Hi,
>
>> [This whole description is not qspi specific but qspi happens to be my
>> boot source. It should be the same when booting from mmc or nand.]
>>
>> On my socfgpa gen5 board, I was a little surprised that a soft reboot
>> did not start the SPL from qspi after updating it from a running Linux
>> (via swupdate writing to qspi).
>>
>> When debugging this, I found that this is because SPL and U-Boot set the
>> magic value 0xae9efebc to the sysmgr_regs->romcodegrp_warmramgrp_enable
>> register, which tells the boot ROM that it can jump to SPL in OnChip RAM
>> on next warm reboot.
>>
>> Now this is a problem for me specifically because I want to run the SPL
>> stored in qspi after updating it. I don't know if I'm the only one
>> wanting to do that, but I could implement this for my own board by just
>> resetting the magic value in the warmramgrp_enable register.
> Yes, that's Alteras' workaround for broken systems which do not connect
> SPI NOR reset and thus leave SPI NOR in undefined state (ie. 4B mode).

I remember being somewhat shocked to see some of the SPI NOR chips don't 
even have a reset input...

> When the bootrom tries to load from it, it fails and the system hangs.
>
> You can do cold reset instead of warm reset to force the bootrom to
> reload the preloader from SPI NOR, it's another bit in the reset register.
>
>> What surprised me more is that the boot ROM provides the ability to
>> check the CRC of SPL, but it doesn't check the CRC since the related
>> registers tellt it to not do that (to enable this, SPL would have to
>> write its length + CRC to sysmgr registers). I would have suspected this
>> bad design since maybe we were rebooted by the watchdog because some
>> driver wrote bad data somewhere, so it might well be the SPL in OnChip
>> RAM is not intact any more.
> IIRC the BootROM only checks CRC of the header, no ?

That would have been good, but from reading the documentation ([1]), it 
seems to be different: it's a offset/length range that is CRC checked. 
And if the 'length' register is 0, "the Boot ROM won't perform CRC 
calculation and CRC check to avoid overhead caused by CRC validation."

>
>> Now should we:
>> a) calculate and set a CRC for SPL's text / rodata segments (what about
>> initialized rwdata?) or
>> b) set the warmramgrp_enable register to always load SPL from boot
>> source or
>> c) make it configurable?
> a) would be ideal, if it really is possible and works :)

Well, for someone in the need of updating SPL and using the updated SPL, 
b) would be best, but you can always just reset the magic after such an 
update to ensure the updated SPL is loaded.

I guess I'll try the CRC mechanism.

>
>> Writing the code for any of the 3 choices is easy, so I wanted to get an
>> opinion first before sending a patch. Personally, I'd favour always
>> loading SPL from boot source, but the comment (in misc_gen5.c
>> arch_early_init_r()) suggests I'd break ancient kernels with that (I
>> assume the problem is the problem I have with 4.9: Linux sets qspi into
>> 4-byte mode and the boot rom can then not load SPL since it uses 3-byte
>> mode without switching mode in the qspi chip).
> You'd break broken boards some more , that's all. It has nothing to do
> with kernel version.

I thought newer kernels use 4-byte opcodes that don't set the chip into 
4-byte mode, thus not breaking the boot ROM. That would make it 
depending on the kernel version. But I haven't found the time to prove 
this theory, yet.


Simon


[1] 
https://www.intel.com/content/www/us/en/programmable/hps/cyclone-v/hps.html 
-> sysmgr -> Warm Boot from On-Chip RAM Group

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [U-Boot] arm: socfpga gen5: warm reboot reliability
  2018-11-14 21:04   ` Simon Goldschmidt
@ 2018-11-14 23:06     ` Marek Vasut
  0 siblings, 0 replies; 4+ messages in thread
From: Marek Vasut @ 2018-11-14 23:06 UTC (permalink / raw)
  To: u-boot

On 11/14/2018 10:04 PM, Simon Goldschmidt wrote:
> On 14.11.2018 21:51, Marek Vasut wrote:
>> On 11/14/2018 09:30 PM, Simon Goldschmidt wrote:
>>> Hi,
>> Hi,
>>
>>> [This whole description is not qspi specific but qspi happens to be my
>>> boot source. It should be the same when booting from mmc or nand.]
>>>
>>> On my socfgpa gen5 board, I was a little surprised that a soft reboot
>>> did not start the SPL from qspi after updating it from a running Linux
>>> (via swupdate writing to qspi).
>>>
>>> When debugging this, I found that this is because SPL and U-Boot set the
>>> magic value 0xae9efebc to the sysmgr_regs->romcodegrp_warmramgrp_enable
>>> register, which tells the boot ROM that it can jump to SPL in OnChip RAM
>>> on next warm reboot.
>>>
>>> Now this is a problem for me specifically because I want to run the SPL
>>> stored in qspi after updating it. I don't know if I'm the only one
>>> wanting to do that, but I could implement this for my own board by just
>>> resetting the magic value in the warmramgrp_enable register.
>> Yes, that's Alteras' workaround for broken systems which do not connect
>> SPI NOR reset and thus leave SPI NOR in undefined state (ie. 4B mode).
> 
> I remember being somewhat shocked to see some of the SPI NOR chips don't
> even have a reset input...

Yep, which is sad. Those probably need some sort of switch on the Vcc,
although I'd rather go for one which has a reset line.

>> When the bootrom tries to load from it, it fails and the system hangs.
>>
>> You can do cold reset instead of warm reset to force the bootrom to
>> reload the preloader from SPI NOR, it's another bit in the reset
>> register.
>>
>>> What surprised me more is that the boot ROM provides the ability to
>>> check the CRC of SPL, but it doesn't check the CRC since the related
>>> registers tellt it to not do that (to enable this, SPL would have to
>>> write its length + CRC to sysmgr registers). I would have suspected this
>>> bad design since maybe we were rebooted by the watchdog because some
>>> driver wrote bad data somewhere, so it might well be the SPL in OnChip
>>> RAM is not intact any more.
>> IIRC the BootROM only checks CRC of the header, no ?
> 
> That would have been good, but from reading the documentation ([1]), it
> seems to be different: it's a offset/length range that is CRC checked.
> And if the 'length' register is 0, "the Boot ROM won't perform CRC
> calculation and CRC check to avoid overhead caused by CRC validation."

We should definitely do the CRC check, unless it has side effects :-)

>>> Now should we:
>>> a) calculate and set a CRC for SPL's text / rodata segments (what about
>>> initialized rwdata?) or
>>> b) set the warmramgrp_enable register to always load SPL from boot
>>> source or
>>> c) make it configurable?
>> a) would be ideal, if it really is possible and works :)
> 
> Well, for someone in the need of updating SPL and using the updated SPL,
> b) would be best, but you can always just reset the magic after such an
> update to ensure the updated SPL is loaded.

Can't you invalidate the OCRAM (memset it to 0) and do cold-reset after
such an update ?

> I guess I'll try the CRC mechanism.
> 
>>
>>> Writing the code for any of the 3 choices is easy, so I wanted to get an
>>> opinion first before sending a patch. Personally, I'd favour always
>>> loading SPL from boot source, but the comment (in misc_gen5.c
>>> arch_early_init_r()) suggests I'd break ancient kernels with that (I
>>> assume the problem is the problem I have with 4.9: Linux sets qspi into
>>> 4-byte mode and the boot rom can then not load SPL since it uses 3-byte
>>> mode without switching mode in the qspi chip).
>> You'd break broken boards some more , that's all. It has nothing to do
>> with kernel version.
> 
> I thought newer kernels use 4-byte opcodes that don't set the chip into
> 4-byte mode, thus not breaking the boot ROM. That would make it
> depending on the kernel version. But I haven't found the time to prove
> this theory, yet.

It depends on the chip. Some chips don't have dedicated 4byte opcodes.
But even page program, which is interrupted in the middle of it, can
cause the bootrom to fail to boot, since the SPI NOR may not respond.
Worse yet, it may store some of the communication as actual data in the
SPI NOR, thus corrupting the content.

-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-11-14 23:06 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-14 20:30 [U-Boot] arm: socfpga gen5: warm reboot reliability Simon Goldschmidt
2018-11-14 20:51 ` Marek Vasut
2018-11-14 21:04   ` Simon Goldschmidt
2018-11-14 23:06     ` Marek Vasut

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.