linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Atish Kumar Patra <atishp@rivosinc.com>
To: Samuel Holland <samuel@sholland.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>,
	jszhang@kernel.org, Paul Walmsley <paul.walmsley@sifive.com>,
	aou@eecs.berkeley.edu, linux-riscv@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] riscv: head: use 0 as the default text_offset
Date: Tue, 29 Nov 2022 01:15:50 -0800	[thread overview]
Message-ID: <CAHBxVyEPvxm2KqCedkDTduVRq2mFCwuWE-NL9WO0_a7=+Y7upw@mail.gmail.com> (raw)
In-Reply-To: <d79d85c2-8457-9b0e-6460-089cf91dd425@sholland.org>

On Mon, Nov 28, 2022 at 10:55 PM Samuel Holland <samuel@sholland.org> wrote:
>
> On 11/29/22 00:19, Palmer Dabbelt wrote:
> > On Mon, 28 Nov 2022 21:04:48 PST (-0800), samuel@sholland.org wrote:
> >> On 11/28/22 14:11, Atish Kumar Patra wrote:
> >>> On Mon, Nov 28, 2022 at 7:34 AM Jisheng Zhang <jszhang@kernel.org>
> >>> wrote:
> >>>>
> >>>> Commit 0f327f2aaad6 ("RISC-V: Add an Image header that boot loader can
> >>>> parse.") adds an image header which "is based on ARM64 boot image
> >>>> header and provides an opportunity to combine both ARM64 & RISC-V
> >>>> image headers in future.". At that time, arm64's default text_offset
> >>>> is 0x80000, this is to give "512 KB of guaranteed BSS space to put
> >>>> the swapper page tables" as commit cfa7ede20f13 ("arm64: set
> >>>> TEXT_OFFSET
> >>>> to 0x0 in preparation for removing it entirely") pointed out, but
> >>>> riscv doesn't need the space, so use 0 as the default text_offset.
> >>>>
> >>>> Before this patch, booting linux kernel on Sipeed bl808 M1s Dock
> >>>> with u-boot booti cmd:
> >>>> [    0.000000] OF: fdt: Ignoring memory range 0x50000000 - 0x50200000
> >>>> ...
> >>>> [    0.000000]   DMA32    [mem 0x0000000050200000-0x0000000053ffffff]
> >>>> As can be seen, 2MB DDR(0x50000000 - 0x501fffff) can't be used by
> >>>> linux.
> >>>>
> >>>> After this patch, the 64MB DDR is fully usable by linux
> >>>> [    0.000000]   DMA32    [mem 0x0000000050000000-0x0000000053ffffff]
> >>>>
> >>>> Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> >>>> ---
> >>>>  arch/riscv/kernel/head.S | 12 +-----------
> >>>>  1 file changed, 1 insertion(+), 11 deletions(-)
> >>>>
> >>>> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
> >>>> index b865046e4dbb..ef95943f7a70 100644
> >>>> --- a/arch/riscv/kernel/head.S
> >>>> +++ b/arch/riscv/kernel/head.S
> >>>> @@ -38,18 +38,8 @@ ENTRY(_start)
> >>>>         .word 0
> >>>>  #endif
> >>>>         .balign 8
> >>>> -#ifdef CONFIG_RISCV_M_MODE
> >>>> -       /* Image load offset (0MB) from start of RAM for M-mode */
> >>>> +       /* Image load offset (0MB) from start of RAM */
> >>>>         .dword 0
> >>>> -#else
> >>>> -#if __riscv_xlen == 64
> >>>> -       /* Image load offset(2MB) from start of RAM */
> >>>> -       .dword 0x200000
> >>>> -#else
> >>>> -       /* Image load offset(4MB) from start of RAM */
> >>>> -       .dword 0x400000
> >>>> -#endif
> >>>> -#endif
> >>>
> >>> NACK.
> >>> RV64 needs to boot at a 2MB aligned address and RV32 needs to boot at
> >>> a 4MB aligned address.
> >>> The firmware is assumed to live at the start of DRAM for Linux running
> >>> in S-mode.
> >>
> >> What needs to happen so we can stop making this assumption? If the SBI
> >> implementation wants to reserve memory, it should use the devicetree to
> >> do so. OpenSBI already does this.
> >
> > IMO we've really screwed up the boot flow on RISC-V.  Having Linux
> > reserve space for the firmware is just all backwards, Linux can't know
> > how much memory the firmware needs (which manifests under large hart
> > counts in OpenSBI, for example).  Unfortunately there's no specification
> > that defines these platform-level details, so we're stuck depending on
> > unspecified behavior like this.
> >
> > I think we could fix this by either making Linux's early boot relocation
> > code work sanely (fix whatever bugs are there, document what can't be
> > fixed, and then add some sort of Image flag to tell firmware the kernel
> > can be relocated) or relying on relocatable firmware, but both of those
> > come with some costs ...
>
> It sounds like Alexandre's patch[1] lets us use memory below this
> offset, so we don't have to relocate the kernel to the beginning of RAM.
> In fact, we could even increase the offset if we are concerned about the
> kernel link address conflicting with the SBI implementation.
>
> [1]:
> https://patchwork.kernel.org/project/linux-riscv/patch/20221122084141.1849421-1-alexghiti@rivosinc.com/
>
> >> Throwing away 2 MiB of RAM is quite wasteful considering we have
> >> multiple SoCs (D1s, BL808) that are limited to 64 MiB of in-package RAM.
> >
> > ... and I'd argue that users on systems don't want to pay those costs.
>
> What does fixing the early relocation code cost? Just longer boot time?
> If the bootloader takes care of avoiding reserved-memory regions, and
> Linux can run from wherever it gets loaded, that would be ideal to me.
>
> > In fact, I'd argue that systems like that don't want resident firmware
> > at all.
>
> I would much rather pay 256 KiB for resident firmware than reimplement
> all of the power management and PMU logic in Linux. It's not as bad as
> losing 2 MiB when I know most of that is unused.
>

There are also debug triggers, AP-TEE which are SBI extensions.
In addition to that we have steal time accounting (STA) SBI extension
in virtualization
use cases as well.

Note: The PMU requirement will probably no longer be if supervisor
counter delegation
extension is approved. But it will take some time for hardware to
actually implement that.

> > So let's just add a CONFIG_SBI=n, and then just use direct drivers for
> > everything.  If the firmware doesn't need to be resident then it's
> > pretty straight-forward to support these 0 offsets, so we can just add
> > that as another Kconfig.  Sure this will trip up firmware that depends
> > on these fixed reservations, but saying "the resident firmware fits in 0
> > superpages" is just as much of a platform-specific dependency as saying
> > "the resident firmware fits in 1 superpage".  If firmware can't handle
> > this field in the Image format then we're going to end up with breakages
> > at some point, it might as well be now.
> >
> > If these systems don't have all the ISA bits necessary to avoid M-mode
> > entirely then we can just implement a tiny M-mode stub in Linux that
> > gets left around during early boot and then shims stuff to S-mode.
> > That'll be a bit of a headache and with some extensions it can be
> > avoided, the standard stuff won't allow for that until the latest round
> > of specs is done but if it's possible via whatever custom extensions are
> > in these things then that's probably the way to go.
>
> I don't think Linux has a choice here, when started in S-mode. And
> neither does the bootloader parsing the Image, because it most likely
> runs in S-mode as well.
>
> And when started in M-mode, we already don't use SBI.
>
> Regards,
> Samuel
>

  reply	other threads:[~2022-11-29  9:16 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-28 15:24 [PATCH] riscv: head: use 0 as the default text_offset Jisheng Zhang
2022-11-28 16:03 ` Andreas Schwab
2022-11-28 20:11 ` Atish Kumar Patra
2022-11-29  5:04   ` Samuel Holland
2022-11-29  5:33     ` Alexandre Ghiti
2022-11-29  6:19     ` Palmer Dabbelt
2022-11-29  6:55       ` Samuel Holland
2022-11-29  9:15         ` Atish Kumar Patra [this message]
2022-11-29 18:33         ` Palmer Dabbelt
2022-11-29  9:18       ` Anup Patel
2022-11-29 15:07         ` Jisheng Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHBxVyEPvxm2KqCedkDTduVRq2mFCwuWE-NL9WO0_a7=+Y7upw@mail.gmail.com' \
    --to=atishp@rivosinc.com \
    --cc=aou@eecs.berkeley.edu \
    --cc=jszhang@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=palmer@dabbelt.com \
    --cc=paul.walmsley@sifive.com \
    --cc=samuel@sholland.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).