linux-mips.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: MIPS (mt7688): EBase change in U-Boot breaks Linux
       [not found]         ` <CACUy__XtyDY08KTTMnKoXXKq4oUrNYdRXZOmtuXEmnfD7UveiA@mail.gmail.com>
@ 2018-12-13 19:47           ` Paul Burton
  2018-12-14  6:56             ` Stefan Roese
  0 siblings, 1 reply; 6+ messages in thread
From: Paul Burton @ 2018-12-13 19:47 UTC (permalink / raw)
  To: Daniel Schwierzeck, Stefan Roese
  Cc: U-Boot Mailing List, Horatiu Vultur, Linux-MIPS, linux-mips

Hello,

On Thu, Dec 13, 2018 at 03:23:39PM +0100, Daniel Schwierzeck wrote:
> > >>>> Finally I found that this line in U-Boot makes Linux break:
> > >>>>
> > >>>> arch/mips/lib/traps.c:
> > >>>>
> > >>>> void trap_init(ulong reloc_addr)
> > >>>>       unsigned long ebase = gd->irq_sp;
> > >>>>       ...
> > >>>>       write_c0_ebase(ebase);
> > >>>>
> > >>>> This sets EBase to something like 0x87e9b000 on my system (128MiB).
> > >>>> And Linux then re-uses this value and copies the exceptions handlers
> > >>>> to this address, overwriting random code and leading to an unstable
> > >>>> system.
> > >>>>
> > >>>> So my questions now is, how should this be handled on the MT7688
> > >>>> platform instead? One way would be to set EBase back to the
> > >>>> original value (0x80000000) before booting into Linux. Another
> > >>>> solution would be to add some Linux code like board_ebase_setup()
> > >>>> to the MT7688 Linux port.
>%
> > > I could also prepare a U-Boot patch to restore the original ebase value before
> > > handing the control over to the OS.
> >
> > I'm not so sure, if overwriting 0x80000000 (default value of EBase on
> > this SoC) with the exception handler is allowed. Is this address "zero"
> > handled somewhat specific in MIPS Linux? AFAICT, the complete DDR
> > area on my platform (0x8000.0000 - 0x87ff.ffff) is available for Linux.
> > So allocating some memory for this exception handler seems the right
> > way to go to me.
> 
> maybe that's why some platforms define a load address of 0x80002000 or similar
> to protect this area somehow.

Does this Linux patch help by any chance?

https://git.linux-mips.org/cgit/linux-mti.git/commit/?h=eng-v4.20&id=39e4d339a4540b66e9d9a8ea0da9ee41a21473b4

I'm not sure I remember why I didn't get that upstreamed yet, I probably
wanted to research what other systems were doing... Speaking for Malta,
the kernel's board support has reserved the start of kseg0 for longer
than I've been involved.

An alternative would be for Linux to allocate a page for use with the
exception vectors using memblock, and ignore the EBase value U-Boot left
us with. But just marking the area U-Boot used as reserved ought to do
the trick, and has the advantage of ensuring U-Boot's vectors don't get
overwritten before Linux sets up its own which sometimes allows U-Boot
to provide some useful output.

Thanks,
    Paul

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: MIPS (mt7688): EBase change in U-Boot breaks Linux
  2018-12-13 19:47           ` MIPS (mt7688): EBase change in U-Boot breaks Linux Paul Burton
@ 2018-12-14  6:56             ` Stefan Roese
  2018-12-14 21:28               ` Paul Burton
  2018-12-14 21:31               ` Paul Burton
  0 siblings, 2 replies; 6+ messages in thread
From: Stefan Roese @ 2018-12-14  6:56 UTC (permalink / raw)
  To: Paul Burton, Daniel Schwierzeck
  Cc: U-Boot Mailing List, Horatiu Vultur, Linux-MIPS, linux-mips

Hi Paul,

On 13.12.18 20:47, Paul Burton wrote:
> On Thu, Dec 13, 2018 at 03:23:39PM +0100, Daniel Schwierzeck wrote:
>>>>>>> Finally I found that this line in U-Boot makes Linux break:
>>>>>>>
>>>>>>> arch/mips/lib/traps.c:
>>>>>>>
>>>>>>> void trap_init(ulong reloc_addr)
>>>>>>>        unsigned long ebase = gd->irq_sp;
>>>>>>>        ...
>>>>>>>        write_c0_ebase(ebase);
>>>>>>>
>>>>>>> This sets EBase to something like 0x87e9b000 on my system (128MiB).
>>>>>>> And Linux then re-uses this value and copies the exceptions handlers
>>>>>>> to this address, overwriting random code and leading to an unstable
>>>>>>> system.
>>>>>>>
>>>>>>> So my questions now is, how should this be handled on the MT7688
>>>>>>> platform instead? One way would be to set EBase back to the
>>>>>>> original value (0x80000000) before booting into Linux. Another
>>>>>>> solution would be to add some Linux code like board_ebase_setup()
>>>>>>> to the MT7688 Linux port.
>> %
>>>> I could also prepare a U-Boot patch to restore the original ebase value before
>>>> handing the control over to the OS.
>>>
>>> I'm not so sure, if overwriting 0x80000000 (default value of EBase on
>>> this SoC) with the exception handler is allowed. Is this address "zero"
>>> handled somewhat specific in MIPS Linux? AFAICT, the complete DDR
>>> area on my platform (0x8000.0000 - 0x87ff.ffff) is available for Linux.
>>> So allocating some memory for this exception handler seems the right
>>> way to go to me.
>>
>> maybe that's why some platforms define a load address of 0x80002000 or similar
>> to protect this area somehow.
> 
> Does this Linux patch help by any chance?
> 
> https://git.linux-mips.org/cgit/linux-mti.git/commit/?h=eng-v4.20&id=39e4d339a4540b66e9d9a8ea0da9ee41a21473b4
> 
> I'm not sure I remember why I didn't get that upstreamed yet, I probably
> wanted to research what other systems were doing... Speaking for Malta,
> the kernel's board support has reserved the start of kseg0 for longer
> than I've been involved.

No, this patch does not solve this issue (bootup still hangs or crashes
while mounting the rootfs). I can only assume that its too late to try
to reserve this memory region as the memblock_reserve() call returns 0
(no error).
  
> An alternative would be for Linux to allocate a page for use with the
> exception vectors using memblock, and ignore the EBase value U-Boot left
> us with. But just marking the area U-Boot used as reserved ought to do
> the trick, and has the advantage of ensuring U-Boot's vectors don't get
> overwritten before Linux sets up its own which sometimes allows U-Boot
> to provide some useful output.

I agree that re-using the U-Boot value would be optimal for boot-time
error printing. But this does not seem to work on our platform AFAICT.
So how to proceed? Should I enable CONFIG_CPU_MIPSR2_IRQ_VI or #define
"cpu_has_veic" to 1 as Lantiq does?

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: MIPS (mt7688): EBase change in U-Boot breaks Linux
  2018-12-14  6:56             ` Stefan Roese
@ 2018-12-14 21:28               ` Paul Burton
  2018-12-17  8:55                 ` Stefan Roese
  2018-12-14 21:31               ` Paul Burton
  1 sibling, 1 reply; 6+ messages in thread
From: Paul Burton @ 2018-12-14 21:28 UTC (permalink / raw)
  To: Stefan Roese
  Cc: Daniel Schwierzeck, U-Boot Mailing List, Horatiu Vultur,
	Linux-MIPS, linux-mips

Hi Stefan,

On Fri, Dec 14, 2018 at 07:56:59AM +0100, Stefan Roese wrote:
> > Does this Linux patch help by any chance?
> > 
> > https://git.linux-mips.org/cgit/linux-mti.git/commit/?h=eng-v4.20&id=39e4d339a4540b66e9d9a8ea0da9ee41a21473b4
> > 
> > I'm not sure I remember why I didn't get that upstreamed yet, I probably
> > wanted to research what other systems were doing... Speaking for Malta,
> > the kernel's board support has reserved the start of kseg0 for longer
> > than I've been involved.
> 
> No, this patch does not solve this issue (bootup still hangs or crashes
> while mounting the rootfs). I can only assume that its too late to try
> to reserve this memory region as the memblock_reserve() call returns 0
> (no error).

Hmm, OK. Do you know what is getting overwritten? Is it part of the
kernel binary itself?

> > An alternative would be for Linux to allocate a page for use with the
> > exception vectors using memblock, and ignore the EBase value U-Boot left
> > us with. But just marking the area U-Boot used as reserved ought to do
> > the trick, and has the advantage of ensuring U-Boot's vectors don't get
> > overwritten before Linux sets up its own which sometimes allows U-Boot
> > to provide some useful output.
> 
> I agree that re-using the U-Boot value would be optimal for boot-time
> error printing. But this does not seem to work on our platform AFAICT.
> So how to proceed? Should I enable CONFIG_CPU_MIPSR2_IRQ_VI or #define
> "cpu_has_veic" to 1 as Lantiq does?

I think the answer to the question above will be helpful - if it's the
kernel binary itself getting overwritten then we have 2 options:

  1) Move the kernel, ie. change load-y in arch/mips/ralink/Platform.

  2) Have Linux recognize that the address in EBase is unsuitable &
     allocate a new page.

Or perhaps even both - having Linux recognize & avoid the problem seems
good for robustness, but if the kernel binary is overwriting the
exception vectors it might be useful to move the kernel anyway so that
we don't prevent U-Boot's vectors from working in between loading the
kernel & booting it.

If it's not the kernel binary overwriting the vectors & then being
overwritten, then I'd be interested in knowing what is in that memory.
We shouldn't have allocated much of anything this early, but a possible
fix might be to reserve the page EBase resides in from bootmem_init().

Thanks,
    Paul

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: MIPS (mt7688): EBase change in U-Boot breaks Linux
  2018-12-14  6:56             ` Stefan Roese
  2018-12-14 21:28               ` Paul Burton
@ 2018-12-14 21:31               ` Paul Burton
  2018-12-17  9:13                 ` Stefan Roese
  1 sibling, 1 reply; 6+ messages in thread
From: Paul Burton @ 2018-12-14 21:31 UTC (permalink / raw)
  To: Stefan Roese
  Cc: Daniel Schwierzeck, U-Boot Mailing List, Horatiu Vultur,
	Linux-MIPS, linux-mips

Hi Stefan,

On Fri, Dec 14, 2018 at 07:56:59AM +0100, Stefan Roese wrote:
> So how to proceed? Should I enable CONFIG_CPU_MIPSR2_IRQ_VI or #define
> "cpu_has_veic" to 1 as Lantiq does?

...and on that point in particular, it really depends on your hardware.

You shouldn't need to do either of those things just to avoid this bug,
but if your hardware actually supports VI or EIC then it may be
beneficial to enable them.

Thanks,
    Paul

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: MIPS (mt7688): EBase change in U-Boot breaks Linux
  2018-12-14 21:28               ` Paul Burton
@ 2018-12-17  8:55                 ` Stefan Roese
  0 siblings, 0 replies; 6+ messages in thread
From: Stefan Roese @ 2018-12-17  8:55 UTC (permalink / raw)
  To: Paul Burton
  Cc: Daniel Schwierzeck, U-Boot Mailing List, Horatiu Vultur,
	Linux-MIPS, linux-mips

Hi Paul.

On 14.12.18 22:28, Paul Burton wrote:
> On Fri, Dec 14, 2018 at 07:56:59AM +0100, Stefan Roese wrote:
>>> Does this Linux patch help by any chance?
>>>
>>> https://git.linux-mips.org/cgit/linux-mti.git/commit/?h=eng-v4.20&id=39e4d339a4540b66e9d9a8ea0da9ee41a21473b4
>>>
>>> I'm not sure I remember why I didn't get that upstreamed yet, I probably
>>> wanted to research what other systems were doing... Speaking for Malta,
>>> the kernel's board support has reserved the start of kseg0 for longer
>>> than I've been involved.
>>
>> No, this patch does not solve this issue (bootup still hangs or crashes
>> while mounting the rootfs). I can only assume that its too late to try
>> to reserve this memory region as the memblock_reserve() call returns 0
>> (no error).
> 
> Hmm, OK. Do you know what is getting overwritten? Is it part of the
> kernel binary itself?

Okay, I did a bit more research and debugging here. MIPS sets
CONFIG_ARCH_DISCARD_MEMBLOCK in general, which results in free'ing
the reserved memory region(s) via memblock_discard() at a later boot
stage.

I also changed arch/mips/Kconfig so that ARCH_DISCARD_MEMBLOCK is
not defined, but this did not solve the issue either. I'm not sure
why ARCH_DISCARD_MEMBLOCK is defined for MIPS. Here a log from
the system running with ARCH_DISCARD_MEMBLOCK disabled and EBase
set to some area where booting does work (for test purpose only):

root@mt7688:~# cat /sys/kernel/debug/memblock/
memory    reserved
root@mt7688:~# cat /sys/kernel/debug/memblock/memory
    0: 0x00000000..0x07ffffff
root@mt7688:~# cat /sys/kernel/debug/memblock/reserved
    0: 0x02220000..0x02220fff

memblock really only seems to be suitable for early memory handling.
Reserving memory for the complete OS lifetime does not work (AFAICT).
Perhaps moving to CMA would help here.

So back to your question: It's not kernel memory that is overwritten
at e.g. 0x06f5f000 (128MiB memory) but its some userspace memory allocated
dynamically when starting into the mount process of the rootfs. This
memory region is *not* revered at that stage any more. memblock does not
seem to be the correct way to reserve areas here.
  
>>> An alternative would be for Linux to allocate a page for use with the
>>> exception vectors using memblock, and ignore the EBase value U-Boot left
>>> us with. But just marking the area U-Boot used as reserved ought to do
>>> the trick, and has the advantage of ensuring U-Boot's vectors don't get
>>> overwritten before Linux sets up its own which sometimes allows U-Boot
>>> to provide some useful output.
>>
>> I agree that re-using the U-Boot value would be optimal for boot-time
>> error printing. But this does not seem to work on our platform AFAICT.
>> So how to proceed? Should I enable CONFIG_CPU_MIPSR2_IRQ_VI or #define
>> "cpu_has_veic" to 1 as Lantiq does?
> 
> I think the answer to the question above will be helpful - if it's the
> kernel binary itself getting overwritten then we have 2 options:
> 
>    1) Move the kernel, ie. change load-y in arch/mips/ralink/Platform.
> 
>    2) Have Linux recognize that the address in EBase is unsuitable &
>       allocate a new page.
> 
> Or perhaps even both - having Linux recognize & avoid the problem seems
> good for robustness, but if the kernel binary is overwriting the
> exception vectors it might be useful to move the kernel anyway so that
> we don't prevent U-Boot's vectors from working in between loading the
> kernel & booting it.
> 
> If it's not the kernel binary overwriting the vectors & then being
> overwritten, then I'd be interested in knowing what is in that memory.
> We shouldn't have allocated much of anything this early, but a possible
> fix might be to reserve the page EBase resides in from bootmem_init().

That does not help (see comments about memblock usage above). I could
add a check, if EBase resides in the system memory and if this is the
case, allocate a page and move EBase to this new location.

What do you think? Did I misinterpret this memblock usage on MIPS? Do
you have other ideas on how to solve this issue?

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: MIPS (mt7688): EBase change in U-Boot breaks Linux
  2018-12-14 21:31               ` Paul Burton
@ 2018-12-17  9:13                 ` Stefan Roese
  0 siblings, 0 replies; 6+ messages in thread
From: Stefan Roese @ 2018-12-17  9:13 UTC (permalink / raw)
  To: Paul Burton
  Cc: Daniel Schwierzeck, U-Boot Mailing List, Horatiu Vultur,
	Linux-MIPS, linux-mips

Hi Paul,

On 14.12.18 22:31, Paul Burton wrote:
> On Fri, Dec 14, 2018 at 07:56:59AM +0100, Stefan Roese wrote:
>> So how to proceed? Should I enable CONFIG_CPU_MIPSR2_IRQ_VI or #define
>> "cpu_has_veic" to 1 as Lantiq does?
> 
> ...and on that point in particular, it really depends on your hardware.
> 
> You shouldn't need to do either of those things just to avoid this bug,
> but if your hardware actually supports VI or EIC then it may be
> beneficial to enable them.

Checking again, the MT7688 supports VI. config3=00002420, so VInt (Bit 5)
is set. But without CONFIG_CPU_MIPSR2_IRQ_VI being set, cpu_has_vint will
stay set to zero. So it seems that I need set CONFIG_CPU_MIPSR2_IRQ_VI
at least for this SoC (CONFIG_SOC_MT7620) if not even for all Ralink
based SoC's.

If nobody objects, I'll submit a patch enabling CONFIG_CPU_MIPSR2_IRQ_VI
for CONFIG_SOC_MT7620.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-12-17  9:13 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <e4f0fff9-a3c5-85ce-c4be-6e0aa0f74592@denx.de>
     [not found] ` <d81ac18d-47ed-02ec-bc37-f5a7e0ab9223@gmail.com>
     [not found]   ` <543512d8-91ea-2a49-5423-680860c0ba9f@denx.de>
     [not found]     ` <CACUy__X434rmJnX96i057-ir8yiCBjMac_V41HJ+pyG0xLPcRg@mail.gmail.com>
     [not found]       ` <dad02a31-ed34-f99a-26c5-60e4a7209057@denx.de>
     [not found]         ` <CACUy__XtyDY08KTTMnKoXXKq4oUrNYdRXZOmtuXEmnfD7UveiA@mail.gmail.com>
2018-12-13 19:47           ` MIPS (mt7688): EBase change in U-Boot breaks Linux Paul Burton
2018-12-14  6:56             ` Stefan Roese
2018-12-14 21:28               ` Paul Burton
2018-12-17  8:55                 ` Stefan Roese
2018-12-14 21:31               ` Paul Burton
2018-12-17  9:13                 ` Stefan Roese

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).