* [PATCH v2 0/5] Boot RISC-V kernel from any 4KB aligned address
@ 2019-03-21  9:47 Anup Patel
  2019-03-21  9:47 ` [PATCH v2 1/5] RISC-V: Add separate defconfig for 32bit systems Anup Patel
  ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Anup Patel @ 2019-03-21  9:47 UTC (permalink / raw)
  To: Palmer Dabbelt, Albert Ou
  Cc: Anup Patel, linux-kernel, Mike Rapoport, Christoph Hellwig,
	Atish Patra, Paul Walmsley, linux-riscv

This patchset primarily extends the initial page table setup, using fixmap, to
boot the Linux RISC-V kernel (64bit and 32bit) from any 4KB aligned address. We
also add a 32bit defconfig to allow people to try the 32bit Linux RISC-V kernel
as well.

The patchset is based on Linux-5.1-rc1 and tested on the SiFive Unleashed board
and the QEMU virt machine. It can also be found in the riscv_setup_vm_v2 branch
of https://github.com/avpatel/linux.git

Changes since v1:
- Added kconfig option BOOT_PAGE_ALIGNED to enable 4KB aligned booting
- Improved initial page table setup code to select the best/biggest possible
  mapping size based on load address alignment
- Added PATCH4 to remove redundant trampoline page table
- Added PATCH5 to fix memory reservation in setup_bootmem()

Anup Patel (5):
  RISC-V: Add separate defconfig for 32bit systems
  RISC-V: Make setup_vm() independent of GCC code model
  RISC-V: Allow booting kernel from any 4KB aligned address
  RISC-V: Remove redundant trampoline page table
  RISC-V: Fix memory reservation in setup_bootmem()

 arch/riscv/Kconfig                  |  11 +
 arch/riscv/configs/rv32_defconfig   |  84 ++++++
 arch/riscv/include/asm/fixmap.h     |   5 +
 arch/riscv/include/asm/pgtable-64.h |   5 +
 arch/riscv/include/asm/pgtable.h    |   6 +-
 arch/riscv/kernel/head.S            |  15 +-
 arch/riscv/kernel/setup.c           |   4 +-
 arch/riscv/mm/init.c                | 391 +++++++++++++++++++++++-----
 8 files changed, 450 insertions(+), 71 deletions(-)
 create mode 100644 arch/riscv/configs/rv32_defconfig

--
2.17.1

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH v2 1/5] RISC-V: Add separate defconfig for 32bit systems 2019-03-21 9:47 [PATCH v2 0/5] Boot RISC-V kernel from any 4KB aligned address Anup Patel @ 2019-03-21 9:47 ` Anup Patel 2019-03-21 9:47 ` [PATCH v2 2/5] RISC-V: Make setup_vm() independent of GCC code model Anup Patel ` (3 subsequent siblings) 4 siblings, 0 replies; 16+ messages in thread From: Anup Patel @ 2019-03-21 9:47 UTC (permalink / raw) To: Palmer Dabbelt, Albert Ou Cc: Anup Patel, linux-kernel, Mike Rapoport, Christoph Hellwig, Atish Patra, Paul Walmsley, linux-riscv This patch adds rv32_defconfig for 32bit systems. The only difference between rv32_defconfig and defconfig is that rv32_defconfig has CONFIG_ARCH_RV32I=y. Signed-off-by: Anup Patel <anup.patel@wdc.com> --- arch/riscv/configs/rv32_defconfig | 84 +++++++++++++++++++++++++++++++ 1 file changed, 84 insertions(+) create mode 100644 arch/riscv/configs/rv32_defconfig diff --git a/arch/riscv/configs/rv32_defconfig b/arch/riscv/configs/rv32_defconfig new file mode 100644 index 000000000000..1a911ed8e772 --- /dev/null +++ b/arch/riscv/configs/rv32_defconfig @@ -0,0 +1,84 @@ +CONFIG_SYSVIPC=y +CONFIG_POSIX_MQUEUE=y +CONFIG_IKCONFIG=y +CONFIG_IKCONFIG_PROC=y +CONFIG_CGROUPS=y +CONFIG_CGROUP_SCHED=y +CONFIG_CFS_BANDWIDTH=y +CONFIG_CGROUP_BPF=y +CONFIG_NAMESPACES=y +CONFIG_USER_NS=y +CONFIG_CHECKPOINT_RESTORE=y +CONFIG_BLK_DEV_INITRD=y +CONFIG_EXPERT=y +CONFIG_BPF_SYSCALL=y +CONFIG_ARCH_RV32I=y +CONFIG_SMP=y +CONFIG_MODULES=y +CONFIG_MODULE_UNLOAD=y +CONFIG_NET=y +CONFIG_PACKET=y +CONFIG_UNIX=y +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +CONFIG_IP_ADVANCED_ROUTER=y +CONFIG_IP_PNP=y +CONFIG_IP_PNP_DHCP=y +CONFIG_IP_PNP_BOOTP=y +CONFIG_IP_PNP_RARP=y +CONFIG_NETLINK_DIAG=y +CONFIG_PCI=y +CONFIG_PCIEPORTBUS=y +CONFIG_PCI_HOST_GENERIC=y +CONFIG_PCIE_XILINX=y +CONFIG_DEVTMPFS=y +CONFIG_BLK_DEV_LOOP=y +CONFIG_VIRTIO_BLK=y +CONFIG_BLK_DEV_SD=y +CONFIG_BLK_DEV_SR=y +CONFIG_ATA=y +CONFIG_SATA_AHCI=y +CONFIG_SATA_AHCI_PLATFORM=y +CONFIG_NETDEVICES=y 
+CONFIG_VIRTIO_NET=y +CONFIG_MACB=y +CONFIG_E1000E=y +CONFIG_R8169=y +CONFIG_MICROSEMI_PHY=y +CONFIG_INPUT_MOUSEDEV=y +CONFIG_SERIAL_8250=y +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_SERIAL_OF_PLATFORM=y +CONFIG_SERIAL_EARLYCON_RISCV_SBI=y +CONFIG_HVC_RISCV_SBI=y +# CONFIG_PTP_1588_CLOCK is not set +CONFIG_DRM=y +CONFIG_DRM_RADEON=y +CONFIG_FRAMEBUFFER_CONSOLE=y +CONFIG_USB=y +CONFIG_USB_XHCI_HCD=y +CONFIG_USB_XHCI_PLATFORM=y +CONFIG_USB_EHCI_HCD=y +CONFIG_USB_EHCI_HCD_PLATFORM=y +CONFIG_USB_OHCI_HCD=y +CONFIG_USB_OHCI_HCD_PLATFORM=y +CONFIG_USB_STORAGE=y +CONFIG_USB_UAS=y +CONFIG_VIRTIO_MMIO=y +CONFIG_SIFIVE_PLIC=y +CONFIG_EXT4_FS=y +CONFIG_EXT4_FS_POSIX_ACL=y +CONFIG_AUTOFS4_FS=y +CONFIG_MSDOS_FS=y +CONFIG_VFAT_FS=y +CONFIG_TMPFS=y +CONFIG_TMPFS_POSIX_ACL=y +CONFIG_NFS_FS=y +CONFIG_NFS_V4=y +CONFIG_NFS_V4_1=y +CONFIG_NFS_V4_2=y +CONFIG_ROOT_NFS=y +CONFIG_CRYPTO_USER_API_HASH=y +CONFIG_CRYPTO_DEV_VIRTIO=y +CONFIG_PRINTK_TIME=y +# CONFIG_RCU_TRACE is not set -- 2.17.1 _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v2 2/5] RISC-V: Make setup_vm() independent of GCC code model
  2019-03-21  9:47 [PATCH v2 0/5] Boot RISC-V kernel from any 4KB aligned address Anup Patel
  2019-03-21  9:47 ` [PATCH v2 1/5] RISC-V: Add separate defconfig for 32bit systems Anup Patel
@ 2019-03-21  9:47 ` Anup Patel
  2019-03-23 15:45   ` Mike Rapoport
  2019-03-21  9:47 ` [PATCH v2 3/5] RISC-V: Allow booting kernel from any 4KB aligned address Anup Patel
  ` (2 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Anup Patel @ 2019-03-21  9:47 UTC (permalink / raw)
  To: Palmer Dabbelt, Albert Ou
  Cc: Anup Patel, linux-kernel, Mike Rapoport, Christoph Hellwig,
	Atish Patra, Paul Walmsley, linux-riscv

setup_vm() must access kernel symbols in a position-independent way because it
will be called from head.S with the MMU off.

If we compile the kernel with cmodel=medany then PC-relative addressing will be
used in setup_vm() to access kernel symbols, so it works perfectly fine.

However, if we compile the kernel with cmodel=medlow then either absolute
addressing or PC-relative addressing (whichever requires fewer instructions)
is used to access kernel symbols in setup_vm(). This can break setup_vm()
whenever any absolute addressing is used to access kernel symbols.

With the movement of setup_vm() from kernel/setup.c to mm/init.c, setup_vm()
is now broken for cmodel=medlow but works perfectly fine for cmodel=medany.

This patch fixes setup_vm() and makes it independent of the GCC code model by
accessing kernel symbols relative to the kernel load address instead of
assuming PC-relative addressing.
Fixes: 6f1e9e946f0b ("RISC-V: Move setup_vm() to mm/init.c") Signed-off-by: Anup Patel <anup.patel@wdc.com> --- arch/riscv/kernel/head.S | 1 + arch/riscv/mm/init.c | 73 ++++++++++++++++++++++++++-------------- 2 files changed, 49 insertions(+), 25 deletions(-) diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S index fe884cd69abd..7966262b4f9d 100644 --- a/arch/riscv/kernel/head.S +++ b/arch/riscv/kernel/head.S @@ -62,6 +62,7 @@ clear_bss_done: /* Initialize page tables and relocate to virtual addresses */ la sp, init_thread_union + THREAD_SIZE + la a0, _start call setup_vm call relocate diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index b379a75ac6a6..e38f8195e45b 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -172,55 +172,78 @@ void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot) } } -asmlinkage void __init setup_vm(void) +static inline void *__load_addr(void *ptr, uintptr_t load_pa) { extern char _start; + uintptr_t va = (uintptr_t)ptr; + uintptr_t sz = (uintptr_t)(&_end) - (uintptr_t)(&_start); + + if (va >= PAGE_OFFSET && va <= (PAGE_OFFSET + sz)) + return (void *)(load_pa + (va - PAGE_OFFSET)); + return (void *)va; +} + +#define __load_va(ptr, load_pa) __load_addr(ptr, load_pa) +#define __load_pa(ptr, load_pa) ((uintptr_t)__load_addr(ptr, load_pa)) + +asmlinkage void __init setup_vm(uintptr_t load_pa) +{ uintptr_t i; - uintptr_t pa = (uintptr_t) &_start; +#ifndef __PAGETABLE_PMD_FOLDED + pmd_t *pmdp; +#endif + pgd_t *pgdp; + phys_addr_t map_pa; + pgprot_t tableprot = __pgprot(_PAGE_TABLE); pgprot_t prot = __pgprot(pgprot_val(PAGE_KERNEL) | _PAGE_EXEC); - va_pa_offset = PAGE_OFFSET - pa; - pfn_base = PFN_DOWN(pa); + va_pa_offset = PAGE_OFFSET - load_pa; + pfn_base = PFN_DOWN(load_pa); /* Sanity check alignment and size */ BUG_ON((PAGE_OFFSET % PGDIR_SIZE) != 0); - BUG_ON((pa % (PAGE_SIZE * PTRS_PER_PTE)) != 0); + BUG_ON((load_pa % (PAGE_SIZE * PTRS_PER_PTE)) != 0); #ifndef __PAGETABLE_PMD_FOLDED 
- trampoline_pg_dir[(PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD] = - pfn_pgd(PFN_DOWN((uintptr_t)trampoline_pmd), - __pgprot(_PAGE_TABLE)); - trampoline_pmd[0] = pfn_pmd(PFN_DOWN(pa), prot); + pgdp = __load_va(trampoline_pg_dir, load_pa); + map_pa = __load_pa(trampoline_pmd, load_pa); + pgdp[(PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD] = + pfn_pgd(PFN_DOWN(map_pa), tableprot); + trampoline_pmd[0] = pfn_pmd(PFN_DOWN(load_pa), prot); + + pgdp = __load_va(swapper_pg_dir, load_pa); for (i = 0; i < (-PAGE_OFFSET)/PGDIR_SIZE; ++i) { size_t o = (PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD + i; - swapper_pg_dir[o] = - pfn_pgd(PFN_DOWN((uintptr_t)swapper_pmd) + i, - __pgprot(_PAGE_TABLE)); + map_pa = __load_pa(swapper_pmd, load_pa); + pgdp[o] = pfn_pgd(PFN_DOWN(map_pa) + i, tableprot); } + pmdp = __load_va(swapper_pmd, load_pa); for (i = 0; i < ARRAY_SIZE(swapper_pmd); i++) - swapper_pmd[i] = pfn_pmd(PFN_DOWN(pa + i * PMD_SIZE), prot); + pmdp[i] = pfn_pmd(PFN_DOWN(load_pa + i * PMD_SIZE), prot); - swapper_pg_dir[(FIXADDR_START >> PGDIR_SHIFT) % PTRS_PER_PGD] = - pfn_pgd(PFN_DOWN((uintptr_t)fixmap_pmd), - __pgprot(_PAGE_TABLE)); + map_pa = __load_pa(fixmap_pmd, load_pa); + pgdp[(FIXADDR_START >> PGDIR_SHIFT) % PTRS_PER_PGD] = + pfn_pgd(PFN_DOWN(map_pa), tableprot); + pmdp = __load_va(fixmap_pmd, load_pa); + map_pa = __load_pa(fixmap_pte, load_pa); fixmap_pmd[(FIXADDR_START >> PMD_SHIFT) % PTRS_PER_PMD] = - pfn_pmd(PFN_DOWN((uintptr_t)fixmap_pte), - __pgprot(_PAGE_TABLE)); + pfn_pmd(PFN_DOWN(map_pa), tableprot); #else - trampoline_pg_dir[(PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD] = - pfn_pgd(PFN_DOWN(pa), prot); + pgdp = __load_va(trampoline_pg_dir, load_pa); + pgdp[(PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD] = + pfn_pgd(PFN_DOWN(load_pa), prot); + pgdp = __load_va(swapper_pg_dir, load_pa); for (i = 0; i < (-PAGE_OFFSET)/PGDIR_SIZE; ++i) { size_t o = (PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD + i; - swapper_pg_dir[o] = - pfn_pgd(PFN_DOWN(pa + i * PGDIR_SIZE), prot); + 
pgdp[o] = pfn_pgd(PFN_DOWN(load_pa + i * PGDIR_SIZE), prot); } - swapper_pg_dir[(FIXADDR_START >> PGDIR_SHIFT) % PTRS_PER_PGD] = - pfn_pgd(PFN_DOWN((uintptr_t)fixmap_pte), - __pgprot(_PAGE_TABLE)); + map_pa = __load_pa(fixmap_pte, load_pa); + pgdp[(FIXADDR_START >> PGDIR_SHIFT) % PTRS_PER_PGD] = + pfn_pgd(PFN_DOWN(map_pa), tableprot); #endif } -- 2.17.1 _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH v2 2/5] RISC-V: Make setup_vm() independent of GCC code model 2019-03-21 9:47 ` [PATCH v2 2/5] RISC-V: Make setup_vm() independent of GCC code model Anup Patel @ 2019-03-23 15:45 ` Mike Rapoport 2019-03-25 4:19 ` Anup Patel 0 siblings, 1 reply; 16+ messages in thread From: Mike Rapoport @ 2019-03-23 15:45 UTC (permalink / raw) To: Anup Patel Cc: Albert Ou, Palmer Dabbelt, linux-kernel, Christoph Hellwig, Atish Patra, Paul Walmsley, linux-riscv On Thu, Mar 21, 2019 at 09:47:47AM +0000, Anup Patel wrote: > The setup_vm() must access kernel symbols in a position independent way > because it will be called from head.S with MMU off. > > If we compile kernel with cmodel=medany then PC-relative addressing will > be used in setup_vm() to access kernel symbols so it works perfectly fine. > > Although, if we compile kernel with cmodel=medlow then either absolute > addressing or PC-relative addressing (based on whichever requires fewer > instructions) is used to access kernel symbols in setup_vm(). This can > break setup_vm() whenever any absolute addressing is used to access > kernel symbols. > > With the movement of setup_vm() from kernel/setup.c to mm/init.c, the > setup_vm() is now broken for cmodel=medlow but it works perfectly fine > for cmodel=medany. > > This patch fixes setup_vm() and makes it independent of GCC code model > by accessing kernel symbols relative to kernel load address instead of > assuming PC-relative addressing. 
> > Fixes: 6f1e9e946f0b ("RISC-V: Move setup_vm() to mm/init.c") > Signed-off-by: Anup Patel <anup.patel@wdc.com> > --- > arch/riscv/kernel/head.S | 1 + > arch/riscv/mm/init.c | 73 ++++++++++++++++++++++++++-------------- > 2 files changed, 49 insertions(+), 25 deletions(-) > > diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S > index fe884cd69abd..7966262b4f9d 100644 > --- a/arch/riscv/kernel/head.S > +++ b/arch/riscv/kernel/head.S > @@ -62,6 +62,7 @@ clear_bss_done: > > /* Initialize page tables and relocate to virtual addresses */ > la sp, init_thread_union + THREAD_SIZE > + la a0, _start > call setup_vm > call relocate > > diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c > index b379a75ac6a6..e38f8195e45b 100644 > --- a/arch/riscv/mm/init.c > +++ b/arch/riscv/mm/init.c > @@ -172,55 +172,78 @@ void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot) > } > } > > -asmlinkage void __init setup_vm(void) > +static inline void *__load_addr(void *ptr, uintptr_t load_pa) > { > extern char _start; > + uintptr_t va = (uintptr_t)ptr; > + uintptr_t sz = (uintptr_t)(&_end) - (uintptr_t)(&_start); > + > + if (va >= PAGE_OFFSET && va <= (PAGE_OFFSET + sz)) > + return (void *)(load_pa + (va - PAGE_OFFSET)); > + return (void *)va; > +} > + > +#define __load_va(ptr, load_pa) __load_addr(ptr, load_pa) > +#define __load_pa(ptr, load_pa) ((uintptr_t)__load_addr(ptr, load_pa)) > + > +asmlinkage void __init setup_vm(uintptr_t load_pa) > +{ > uintptr_t i; > - uintptr_t pa = (uintptr_t) &_start; > +#ifndef __PAGETABLE_PMD_FOLDED > + pmd_t *pmdp; > +#endif > + pgd_t *pgdp; > + phys_addr_t map_pa; > + pgprot_t tableprot = __pgprot(_PAGE_TABLE); > pgprot_t prot = __pgprot(pgprot_val(PAGE_KERNEL) | _PAGE_EXEC); > > - va_pa_offset = PAGE_OFFSET - pa; > - pfn_base = PFN_DOWN(pa); > + va_pa_offset = PAGE_OFFSET - load_pa; > + pfn_base = PFN_DOWN(load_pa); > > /* Sanity check alignment and size */ > BUG_ON((PAGE_OFFSET % PGDIR_SIZE) != 0); > - 
BUG_ON((pa % (PAGE_SIZE * PTRS_PER_PTE)) != 0); > + BUG_ON((load_pa % (PAGE_SIZE * PTRS_PER_PTE)) != 0); > > #ifndef __PAGETABLE_PMD_FOLDED > - trampoline_pg_dir[(PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD] = > - pfn_pgd(PFN_DOWN((uintptr_t)trampoline_pmd), > - __pgprot(_PAGE_TABLE)); > - trampoline_pmd[0] = pfn_pmd(PFN_DOWN(pa), prot); > + pgdp = __load_va(trampoline_pg_dir, load_pa); > + map_pa = __load_pa(trampoline_pmd, load_pa); > + pgdp[(PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD] = Can we use pgd_index(PAGE_OFFSET) here as index to PGD? > + pfn_pgd(PFN_DOWN(map_pa), tableprot); It seems that __load_pa result is always used with PFN_DOWN(), it's worth adding __load_pfn(). Then the last two statements become map_pfn = __load_pfn(trampoline_pmd, load_pa); pgdp[pgd_index(PAGE_OFFSET)] = pfn_pgd(map_pfn, tableprot); This applies to most of the mappings below as well. > + trampoline_pmd[0] = pfn_pmd(PFN_DOWN(load_pa), prot); > + > + pgdp = __load_va(swapper_pg_dir, load_pa); > > for (i = 0; i < (-PAGE_OFFSET)/PGDIR_SIZE; ++i) { > size_t o = (PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD + i; > > - swapper_pg_dir[o] = > - pfn_pgd(PFN_DOWN((uintptr_t)swapper_pmd) + i, > - __pgprot(_PAGE_TABLE)); > + map_pa = __load_pa(swapper_pmd, load_pa); > + pgdp[o] = pfn_pgd(PFN_DOWN(map_pa) + i, tableprot); > } > + pmdp = __load_va(swapper_pmd, load_pa); > for (i = 0; i < ARRAY_SIZE(swapper_pmd); i++) > - swapper_pmd[i] = pfn_pmd(PFN_DOWN(pa + i * PMD_SIZE), prot); > + pmdp[i] = pfn_pmd(PFN_DOWN(load_pa + i * PMD_SIZE), prot); > > - swapper_pg_dir[(FIXADDR_START >> PGDIR_SHIFT) % PTRS_PER_PGD] = > - pfn_pgd(PFN_DOWN((uintptr_t)fixmap_pmd), > - __pgprot(_PAGE_TABLE)); > + map_pa = __load_pa(fixmap_pmd, load_pa); > + pgdp[(FIXADDR_START >> PGDIR_SHIFT) % PTRS_PER_PGD] = > + pfn_pgd(PFN_DOWN(map_pa), tableprot); > + pmdp = __load_va(fixmap_pmd, load_pa); > + map_pa = __load_pa(fixmap_pte, load_pa); > fixmap_pmd[(FIXADDR_START >> PMD_SHIFT) % PTRS_PER_PMD] = > - 
pfn_pmd(PFN_DOWN((uintptr_t)fixmap_pte), > - __pgprot(_PAGE_TABLE)); > + pfn_pmd(PFN_DOWN(map_pa), tableprot); > #else > - trampoline_pg_dir[(PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD] = > - pfn_pgd(PFN_DOWN(pa), prot); > + pgdp = __load_va(trampoline_pg_dir, load_pa); > + pgdp[(PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD] = > + pfn_pgd(PFN_DOWN(load_pa), prot); > > + pgdp = __load_va(swapper_pg_dir, load_pa); > for (i = 0; i < (-PAGE_OFFSET)/PGDIR_SIZE; ++i) { > size_t o = (PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD + i; > > - swapper_pg_dir[o] = > - pfn_pgd(PFN_DOWN(pa + i * PGDIR_SIZE), prot); > + pgdp[o] = pfn_pgd(PFN_DOWN(load_pa + i * PGDIR_SIZE), prot); > } > > - swapper_pg_dir[(FIXADDR_START >> PGDIR_SHIFT) % PTRS_PER_PGD] = > - pfn_pgd(PFN_DOWN((uintptr_t)fixmap_pte), > - __pgprot(_PAGE_TABLE)); > + map_pa = __load_pa(fixmap_pte, load_pa); > + pgdp[(FIXADDR_START >> PGDIR_SHIFT) % PTRS_PER_PGD] = > + pfn_pgd(PFN_DOWN(map_pa), tableprot); > #endif > } > -- > 2.17.1 > -- Sincerely yours, Mike. _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2 2/5] RISC-V: Make setup_vm() independent of GCC code model 2019-03-23 15:45 ` Mike Rapoport @ 2019-03-25 4:19 ` Anup Patel 0 siblings, 0 replies; 16+ messages in thread From: Anup Patel @ 2019-03-25 4:19 UTC (permalink / raw) To: Mike Rapoport Cc: Palmer Dabbelt, Anup Patel, linux-kernel, Christoph Hellwig, Atish Patra, Albert Ou, Paul Walmsley, linux-riscv On Sat, Mar 23, 2019 at 9:15 PM Mike Rapoport <rppt@linux.ibm.com> wrote: > > On Thu, Mar 21, 2019 at 09:47:47AM +0000, Anup Patel wrote: > > The setup_vm() must access kernel symbols in a position independent way > > because it will be called from head.S with MMU off. > > > > If we compile kernel with cmodel=medany then PC-relative addressing will > > be used in setup_vm() to access kernel symbols so it works perfectly fine. > > > > Although, if we compile kernel with cmodel=medlow then either absolute > > addressing or PC-relative addressing (based on whichever requires fewer > > instructions) is used to access kernel symbols in setup_vm(). This can > > break setup_vm() whenever any absolute addressing is used to access > > kernel symbols. > > > > With the movement of setup_vm() from kernel/setup.c to mm/init.c, the > > setup_vm() is now broken for cmodel=medlow but it works perfectly fine > > for cmodel=medany. > > > > This patch fixes setup_vm() and makes it independent of GCC code model > > by accessing kernel symbols relative to kernel load address instead of > > assuming PC-relative addressing. 
> > > > Fixes: 6f1e9e946f0b ("RISC-V: Move setup_vm() to mm/init.c") > > Signed-off-by: Anup Patel <anup.patel@wdc.com> > > --- > > arch/riscv/kernel/head.S | 1 + > > arch/riscv/mm/init.c | 73 ++++++++++++++++++++++++++-------------- > > 2 files changed, 49 insertions(+), 25 deletions(-) > > > > diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S > > index fe884cd69abd..7966262b4f9d 100644 > > --- a/arch/riscv/kernel/head.S > > +++ b/arch/riscv/kernel/head.S > > @@ -62,6 +62,7 @@ clear_bss_done: > > > > /* Initialize page tables and relocate to virtual addresses */ > > la sp, init_thread_union + THREAD_SIZE > > + la a0, _start > > call setup_vm > > call relocate > > > > diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c > > index b379a75ac6a6..e38f8195e45b 100644 > > --- a/arch/riscv/mm/init.c > > +++ b/arch/riscv/mm/init.c > > @@ -172,55 +172,78 @@ void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot) > > } > > } > > > > -asmlinkage void __init setup_vm(void) > > +static inline void *__load_addr(void *ptr, uintptr_t load_pa) > > { > > extern char _start; > > + uintptr_t va = (uintptr_t)ptr; > > + uintptr_t sz = (uintptr_t)(&_end) - (uintptr_t)(&_start); > > + > > + if (va >= PAGE_OFFSET && va <= (PAGE_OFFSET + sz)) > > + return (void *)(load_pa + (va - PAGE_OFFSET)); > > + return (void *)va; > > +} > > + > > +#define __load_va(ptr, load_pa) __load_addr(ptr, load_pa) > > +#define __load_pa(ptr, load_pa) ((uintptr_t)__load_addr(ptr, load_pa)) > > + > > +asmlinkage void __init setup_vm(uintptr_t load_pa) > > +{ > > uintptr_t i; > > - uintptr_t pa = (uintptr_t) &_start; > > +#ifndef __PAGETABLE_PMD_FOLDED > > + pmd_t *pmdp; > > +#endif > > + pgd_t *pgdp; > > + phys_addr_t map_pa; > > + pgprot_t tableprot = __pgprot(_PAGE_TABLE); > > pgprot_t prot = __pgprot(pgprot_val(PAGE_KERNEL) | _PAGE_EXEC); > > > > - va_pa_offset = PAGE_OFFSET - pa; > > - pfn_base = PFN_DOWN(pa); > > + va_pa_offset = PAGE_OFFSET - load_pa; > > + pfn_base 
= PFN_DOWN(load_pa); > > > > /* Sanity check alignment and size */ > > BUG_ON((PAGE_OFFSET % PGDIR_SIZE) != 0); > > - BUG_ON((pa % (PAGE_SIZE * PTRS_PER_PTE)) != 0); > > + BUG_ON((load_pa % (PAGE_SIZE * PTRS_PER_PTE)) != 0); > > > > #ifndef __PAGETABLE_PMD_FOLDED > > - trampoline_pg_dir[(PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD] = > > - pfn_pgd(PFN_DOWN((uintptr_t)trampoline_pmd), > > - __pgprot(_PAGE_TABLE)); > > - trampoline_pmd[0] = pfn_pmd(PFN_DOWN(pa), prot); > > + pgdp = __load_va(trampoline_pg_dir, load_pa); > > + map_pa = __load_pa(trampoline_pmd, load_pa); > > + pgdp[(PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD] = > > Can we use pgd_index(PAGE_OFFSET) here as index to PGD? > > > + pfn_pgd(PFN_DOWN(map_pa), tableprot); > > It seems that __load_pa result is always used with PFN_DOWN(), it's worth > adding __load_pfn(). Then the last two statements become > > map_pfn = __load_pfn(trampoline_pmd, load_pa); > pgdp[pgd_index(PAGE_OFFSET)] = pfn_pgd(map_pfn, tableprot); > > This applies to most of the mappings below as well. Thanks for the comments. I am going to drop this patch because we have other patch which uses "CFLAGS_init.o := -cmodel=medany" in mm/Makefile Regards, Anup _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 16+ messages in thread
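The alternative fix Anup refers to above, compiling just the file that contains setup_vm() with cmodel=medany, is a one-line per-file flag in kbuild. A sketch of what that mm/Makefile change would look like (the surrounding Makefile contents are assumed):

```make
# arch/riscv/mm/Makefile (sketch): force PC-relative addressing for the
# object containing setup_vm() so it is safe to execute with the MMU off,
# regardless of the code model used for the rest of the kernel.
CFLAGS_init.o := -cmodel=medany
```

This sidesteps the problem at build time instead of rewriting setup_vm() to translate symbol addresses by hand.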
* [PATCH v2 3/5] RISC-V: Allow booting kernel from any 4KB aligned address
  2019-03-21  9:47 [PATCH v2 0/5] Boot RISC-V kernel from any 4KB aligned address Anup Patel
  2019-03-21  9:47 ` [PATCH v2 1/5] RISC-V: Add separate defconfig for 32bit systems Anup Patel
  2019-03-21  9:47 ` [PATCH v2 2/5] RISC-V: Make setup_vm() independent of GCC code model Anup Patel
@ 2019-03-21  9:47 ` Anup Patel
  2019-03-23 15:40   ` Mike Rapoport
  2019-03-21  9:47 ` [PATCH v2 4/5] RISC-V: Remove redundant trampoline page table Anup Patel
  2019-03-21  9:47 ` [PATCH v2 5/5] RISC-V: Fix memory reservation in setup_bootmem() Anup Patel
  4 siblings, 1 reply; 16+ messages in thread
From: Anup Patel @ 2019-03-21  9:47 UTC (permalink / raw)
  To: Palmer Dabbelt, Albert Ou
  Cc: Anup Patel, linux-kernel, Mike Rapoport, Christoph Hellwig,
	Atish Patra, Paul Walmsley, linux-riscv

Currently, we have to boot the RISCV64 kernel from a 2MB aligned physical
address and the RISCV32 kernel from a 4MB aligned physical address. This
constraint exists because the initial pagetable setup (i.e. setup_vm()) maps
the entire RAM using hugepages (i.e. 2MB for a 3-level pagetable and 4MB for
a 2-level pagetable).

Further, the above booting constraint also results in memory wastage because
if we boot the kernel from some <xyz> address (which is not the same as the
RAM start address) then the RISCV kernel will map the PAGE_OFFSET virtual
address linearly to the <xyz> physical address, and memory between RAM start
and <xyz> will be reserved/unusable. For example, a RISCV64 kernel booted
from 0x80200000 will waste 2MB of RAM and a RISCV32 kernel booted from
0x80400000 will waste 4MB of RAM.

This patch re-writes the initial pagetable setup code to allow booting the
RISCV32 and RISCV64 kernel from any 4KB (i.e. PAGE_SIZE) aligned address.
To achieve this:
1. We add kconfig option BOOT_PAGE_ALIGNED. When it is enabled we use 4KB
   mappings in initial page table setup, otherwise we use 2MB/4MB mappings.
2. We map kernel and dtb (few MBs) in setup_vm() (called from head.S)
3. Once we reach paging_init() (called from setup_arch()) after memblock
   setup, we map all available memory banks.

With this patch in place, the booting constraint for the RISCV32 and RISCV64
kernel is much more relaxed when CONFIG_BOOT_PAGE_ALIGNED=y and we can now
boot the kernel very close to RAM start, thereby minimizing memory wastage.

Signed-off-by: Anup Patel <anup.patel@wdc.com>
---
 arch/riscv/Kconfig                  |  11 +
 arch/riscv/include/asm/fixmap.h     |   5 +
 arch/riscv/include/asm/pgtable-64.h |   5 +
 arch/riscv/include/asm/pgtable.h    |   6 +-
 arch/riscv/kernel/head.S            |   1 +
 arch/riscv/kernel/setup.c           |   4 +-
 arch/riscv/mm/init.c                | 402 ++++++++++++++++++++++++----
 7 files changed, 378 insertions(+), 56 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index eb56c82d8aa1..1b0c66f7aba3 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -172,6 +172,17 @@ config SMP

	  If you don't know what to do here, say N.

+config BOOT_PAGE_ALIGNED
+	bool "Allow booting from page aligned address"
+	help
+	  This enables support for booting the kernel from any page aligned
+	  address (i.e. 4KB aligned). This option is particularly useful
+	  on systems with very little RAM (a few MBs) because with it we
+	  can boot the kernel closer to RAM start, thereby reducing
+	  unusable RAM below the kernel.
+
+	  If you don't know what to do here, say N.
+ config NR_CPUS int "Maximum number of CPUs (2-32)" range 2 32 diff --git a/arch/riscv/include/asm/fixmap.h b/arch/riscv/include/asm/fixmap.h index 57afe604b495..5cf53dd882e5 100644 --- a/arch/riscv/include/asm/fixmap.h +++ b/arch/riscv/include/asm/fixmap.h @@ -21,6 +21,11 @@ */ enum fixed_addresses { FIX_HOLE, +#define FIX_FDT_SIZE SZ_1M + FIX_FDT_END, + FIX_FDT = FIX_FDT_END + FIX_FDT_SIZE / PAGE_SIZE - 1, + FIX_PTE, + FIX_PMD, FIX_EARLYCON_MEM_BASE, __end_of_fixed_addresses }; diff --git a/arch/riscv/include/asm/pgtable-64.h b/arch/riscv/include/asm/pgtable-64.h index 7aa0ea9bd8bb..56ecc3dc939d 100644 --- a/arch/riscv/include/asm/pgtable-64.h +++ b/arch/riscv/include/asm/pgtable-64.h @@ -78,6 +78,11 @@ static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot) return __pmd((pfn << _PAGE_PFN_SHIFT) | pgprot_val(prot)); } +static inline unsigned long _pmd_pfn(pmd_t pmd) +{ + return pmd_val(pmd) >> _PAGE_PFN_SHIFT; +} + #define pmd_ERROR(e) \ pr_err("%s:%d: bad pmd %016lx.\n", __FILE__, __LINE__, pmd_val(e)) diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 1141364d990e..05fa2115e736 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -121,12 +121,16 @@ static inline void pmd_clear(pmd_t *pmdp) set_pmd(pmdp, __pmd(0)); } - static inline pgd_t pfn_pgd(unsigned long pfn, pgprot_t prot) { return __pgd((pfn << _PAGE_PFN_SHIFT) | pgprot_val(prot)); } +static inline unsigned long _pgd_pfn(pgd_t pgd) +{ + return pgd_val(pgd) >> _PAGE_PFN_SHIFT; +} + #define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1)) /* Locate an entry in the page global directory */ diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S index 7966262b4f9d..12a3ec5eb8ab 100644 --- a/arch/riscv/kernel/head.S +++ b/arch/riscv/kernel/head.S @@ -63,6 +63,7 @@ clear_bss_done: /* Initialize page tables and relocate to virtual addresses */ la sp, init_thread_union + THREAD_SIZE la a0, _start + mv a1, s1 call 
setup_vm call relocate diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c index ecb654f6a79e..acdd0f74982b 100644 --- a/arch/riscv/kernel/setup.c +++ b/arch/riscv/kernel/setup.c @@ -30,6 +30,7 @@ #include <linux/sched/task.h> #include <linux/swiotlb.h> +#include <asm/fixmap.h> #include <asm/setup.h> #include <asm/sections.h> #include <asm/pgtable.h> @@ -62,7 +63,8 @@ unsigned long boot_cpu_hartid; void __init parse_dtb(unsigned int hartid, void *dtb) { - if (early_init_dt_scan(__va(dtb))) + dtb = (void *)fix_to_virt(FIX_FDT) + ((uintptr_t)dtb & ~PAGE_MASK); + if (early_init_dt_scan(dtb)) return; pr_err("No DTB passed to the kernel\n"); diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index e38f8195e45b..c389fbfeccd8 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -1,14 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ /* + * Copyright (C) 2019 Western Digital Corporation or its affiliates. * Copyright (C) 2012 Regents of the University of California - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public License - * as published by the Free Software Foundation, version 2. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. 
*/ #include <linux/init.h> @@ -43,13 +36,6 @@ void setup_zero_page(void) memset((void *)empty_zero_page, 0, PAGE_SIZE); } -void __init paging_init(void) -{ - setup_zero_page(); - local_flush_tlb_all(); - zone_sizes_init(); -} - void __init mem_init(void) { #ifdef CONFIG_FLATMEM @@ -143,18 +129,36 @@ void __init setup_bootmem(void) } } +#define MAX_EARLY_MAPPING_SIZE SZ_128M + pgd_t swapper_pg_dir[PTRS_PER_PGD] __page_aligned_bss; pgd_t trampoline_pg_dir[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE); #ifndef __PAGETABLE_PMD_FOLDED -#define NUM_SWAPPER_PMDS ((uintptr_t)-PAGE_OFFSET >> PGDIR_SHIFT) -pmd_t swapper_pmd[PTRS_PER_PMD*((-PAGE_OFFSET)/PGDIR_SIZE)] __page_aligned_bss; -pmd_t trampoline_pmd[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE); +#if MAX_EARLY_MAPPING_SIZE < PGDIR_SIZE +#define NUM_SWAPPER_PMDS 1UL +#else +#define NUM_SWAPPER_PMDS (MAX_EARLY_MAPPING_SIZE/PGDIR_SIZE) +#endif +#define NUM_TRAMPOLINE_PMDS 1UL +pmd_t swapper_pmd[PTRS_PER_PMD*NUM_SWAPPER_PMDS] __page_aligned_bss; +pmd_t trampoline_pmd[PTRS_PER_PMD*NUM_TRAMPOLINE_PMDS] + __initdata __aligned(PAGE_SIZE); pmd_t fixmap_pmd[PTRS_PER_PMD] __page_aligned_bss; +#define NUM_SWAPPER_PTES (MAX_EARLY_MAPPING_SIZE/PMD_SIZE) +#else +#define NUM_SWAPPER_PTES (MAX_EARLY_MAPPING_SIZE/PGDIR_SIZE) #endif +#define NUM_TRAMPOLINE_PTES 1UL + +pte_t swapper_pte[PTRS_PER_PTE*NUM_SWAPPER_PTES] __page_aligned_bss; +pte_t trampoline_pte[PTRS_PER_PTE*NUM_TRAMPOLINE_PTES] + __initdata __aligned(PAGE_SIZE); pte_t fixmap_pte[PTRS_PER_PTE] __page_aligned_bss; +uintptr_t map_size; + void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot) { unsigned long addr = __fix_to_virt(idx); @@ -172,6 +176,13 @@ void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot) } } +struct mapping_ops { + pte_t *(*get_pte_virt)(phys_addr_t pa); + phys_addr_t (*alloc_pte)(uintptr_t va, uintptr_t load_pa); + pmd_t *(*get_pmd_virt)(phys_addr_t pa); + phys_addr_t (*alloc_pmd)(uintptr_t va, uintptr_t 
load_pa); +}; + static inline void *__load_addr(void *ptr, uintptr_t load_pa) { extern char _start; @@ -186,64 +197,347 @@ static inline void *__load_addr(void *ptr, uintptr_t load_pa) #define __load_va(ptr, load_pa) __load_addr(ptr, load_pa) #define __load_pa(ptr, load_pa) ((uintptr_t)__load_addr(ptr, load_pa)) -asmlinkage void __init setup_vm(uintptr_t load_pa) +static phys_addr_t __init final_alloc_pgtable(void) +{ + return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE); +} + +static pte_t *__init early_get_pte_virt(phys_addr_t pa) +{ + return (pte_t *)((uintptr_t)pa); +} + +static pte_t *__init final_get_pte_virt(phys_addr_t pa) +{ + clear_fixmap(FIX_PTE); + + return (pte_t *)set_fixmap_offset(FIX_PTE, pa); +} + +static phys_addr_t __init early_alloc_trampoline_pte(uintptr_t va, + uintptr_t load_pa) +{ + pte_t *base = __load_va(trampoline_pte, load_pa); + uintptr_t pte_num = ((va - PAGE_OFFSET) >> PMD_SHIFT); + + BUG_ON(pte_num >= NUM_TRAMPOLINE_PTES); + + return (uintptr_t)&base[pte_num * PTRS_PER_PTE]; +} + +static phys_addr_t __init early_alloc_swapper_pte(uintptr_t va, + uintptr_t load_pa) +{ + pte_t *base = __load_va(swapper_pte, load_pa); + uintptr_t pte_num = ((va - PAGE_OFFSET) >> PMD_SHIFT); + + BUG_ON(pte_num >= NUM_SWAPPER_PTES); + + return (uintptr_t)&base[pte_num * PTRS_PER_PTE]; +} + +static phys_addr_t __init final_alloc_pte(uintptr_t va, uintptr_t load_pa) +{ + return final_alloc_pgtable(); +} + +static void __init create_pte_mapping(pte_t *ptep, + uintptr_t va, phys_addr_t pa, + phys_addr_t sz, pgprot_t prot) { - uintptr_t i; + uintptr_t pte_index = pte_index(va); + + BUG_ON(sz != PAGE_SIZE); + + if (pte_none(ptep[pte_index])) + ptep[pte_index] = pfn_pte(PFN_DOWN(pa), prot); +} + #ifndef __PAGETABLE_PMD_FOLDED +static pmd_t *__init early_get_pmd_virt(phys_addr_t pa) +{ + return (pmd_t *)((uintptr_t)pa); +} + +static pmd_t *__init final_get_pmd_virt(phys_addr_t pa) +{ + clear_fixmap(FIX_PMD); + + return (pmd_t *)set_fixmap_offset(FIX_PMD, pa); +} + 
+static phys_addr_t __init early_alloc_trampoline_pmd(uintptr_t va, + uintptr_t load_pa) +{ + pmd_t *base = __load_va(trampoline_pmd, load_pa); + uintptr_t pmd_num = (va - PAGE_OFFSET) >> PGDIR_SHIFT; + + BUG_ON(pmd_num >= NUM_TRAMPOLINE_PMDS); + + return (uintptr_t)&base[pmd_num * PTRS_PER_PMD]; +} + +static phys_addr_t __init early_alloc_swapper_pmd(uintptr_t va, + uintptr_t load_pa) +{ + pmd_t *base = __load_va(swapper_pmd, load_pa); + uintptr_t pmd_num = (va - PAGE_OFFSET) >> PGDIR_SHIFT; + + BUG_ON(pmd_num >= NUM_SWAPPER_PMDS); + + return (uintptr_t)&base[pmd_num * PTRS_PER_PMD]; +} + +static phys_addr_t __init final_alloc_pmd(uintptr_t va, uintptr_t load_pa) +{ + return final_alloc_pgtable(); +} + +static void __init create_pmd_mapping(pmd_t *pmdp, + uintptr_t va, phys_addr_t pa, + phys_addr_t sz, pgprot_t prot, + uintptr_t ops_load_pa, + struct mapping_ops *ops) +{ + pte_t *ptep; + phys_addr_t pte_phys; + uintptr_t pmd_index = pmd_index(va); + + if (sz == PMD_SIZE) { + if (pmd_none(pmdp[pmd_index])) + pmdp[pmd_index] = pfn_pmd(PFN_DOWN(pa), prot); + return; + } + + if (pmd_none(pmdp[pmd_index])) { + pte_phys = ops->alloc_pte(va, ops_load_pa); + pmdp[pmd_index] = pfn_pmd(PFN_DOWN(pte_phys), + __pgprot(_PAGE_TABLE)); + ptep = ops->get_pte_virt(pte_phys); + memset(ptep, 0, PAGE_SIZE); + } else { + pte_phys = PFN_PHYS(_pmd_pfn(pmdp[pmd_index])); + ptep = ops->get_pte_virt(pte_phys); + } + + create_pte_mapping(ptep, va, pa, sz, prot); +} + +static void __init create_pgd_mapping(pgd_t *pgdp, + uintptr_t va, phys_addr_t pa, + phys_addr_t sz, pgprot_t prot, + uintptr_t ops_load_pa, + struct mapping_ops *ops) +{ pmd_t *pmdp; + phys_addr_t pmd_phys; + uintptr_t pgd_index = pgd_index(va); + + if (sz == PGDIR_SIZE) { + if (pgd_val(pgdp[pgd_index]) == 0) + pgdp[pgd_index] = pfn_pgd(PFN_DOWN(pa), prot); + return; + } + + if (pgd_val(pgdp[pgd_index]) == 0) { + pmd_phys = ops->alloc_pmd(va, ops_load_pa); + pgdp[pgd_index] = pfn_pgd(PFN_DOWN(pmd_phys), + 
__pgprot(_PAGE_TABLE)); + pmdp = ops->get_pmd_virt(pmd_phys); + memset(pmdp, 0, PAGE_SIZE); + } else { + pmd_phys = PFN_PHYS(_pgd_pfn(pgdp[pgd_index])); + pmdp = ops->get_pmd_virt(pmd_phys); + } + + create_pmd_mapping(pmdp, va, pa, sz, prot, ops_load_pa, ops); +} +#else +static void __init create_pgd_mapping(pgd_t *pgdp, + uintptr_t va, phys_addr_t pa, + phys_addr_t sz, pgprot_t prot, + uintptr_t ops_load_pa, + struct mapping_ops *ops) +{ + pte_t *ptep; + phys_addr_t pte_phys; + uintptr_t pgd_index = pgd_index(va); + + if (sz == PGDIR_SIZE) { + if (pgd_val(pgdp[pgd_index]) == 0) + pgdp[pgd_index] = pfn_pgd(PFN_DOWN(pa), prot); + return; + } + + if (pgd_val(pgdp[pgd_index]) == 0) { + pte_phys = ops->alloc_pte(va, ops_load_pa); + pgdp[pgd_index] = pfn_pgd(PFN_DOWN(pte_phys), + __pgprot(_PAGE_TABLE)); + ptep = ops->get_pte_virt(pte_phys); + memset(ptep, 0, PAGE_SIZE); + } else { + pte_phys = PFN_PHYS(_pgd_pfn(pgdp[pgd_index])); + ptep = ops->get_pte_virt(pte_phys); + } + + create_pte_mapping(ptep, va, pa, sz, prot); +} +#endif + +static uintptr_t __init best_map_size(uintptr_t load_pa, phys_addr_t size) +{ +#ifdef CONFIG_BOOT_PAGE_ALIGNED + uintptr_t map_sz = PAGE_SIZE; +#else +#ifndef __PAGETABLE_PMD_FOLDED + uintptr_t map_sz = PMD_SIZE; +#else + uintptr_t map_sz = PGDIR_SIZE; +#endif #endif - pgd_t *pgdp; + +#ifndef __PAGETABLE_PMD_FOLDED + if (!(load_pa & (PMD_SIZE - 1)) && + (size >= PMD_SIZE) && + (map_sz < PMD_SIZE)) + map_sz = PMD_SIZE; +#endif + + if (!(load_pa & (PGDIR_SIZE - 1)) && + (size >= PGDIR_SIZE) && + (map_sz < PGDIR_SIZE)) + map_sz = PGDIR_SIZE; + + return map_sz; +} + +asmlinkage void __init setup_vm(uintptr_t load_pa, uintptr_t dtb_pa) +{ phys_addr_t map_pa; + uintptr_t va, end_va; + uintptr_t load_sz = __load_pa(&_end, load_pa) - load_pa; pgprot_t tableprot = __pgprot(_PAGE_TABLE); pgprot_t prot = __pgprot(pgprot_val(PAGE_KERNEL) | _PAGE_EXEC); + struct mapping_ops tramp_ops, swap_ops; va_pa_offset = PAGE_OFFSET - load_pa; pfn_base = 
PFN_DOWN(load_pa); + map_size = best_map_size(load_pa, PGDIR_SIZE); /* Sanity check alignment and size */ BUG_ON((PAGE_OFFSET % PGDIR_SIZE) != 0); - BUG_ON((load_pa % (PAGE_SIZE * PTRS_PER_PTE)) != 0); + BUG_ON((load_pa % map_size) != 0); + BUG_ON(load_sz > MAX_EARLY_MAPPING_SIZE); -#ifndef __PAGETABLE_PMD_FOLDED - pgdp = __load_va(trampoline_pg_dir, load_pa); - map_pa = __load_pa(trampoline_pmd, load_pa); - pgdp[(PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD] = - pfn_pgd(PFN_DOWN(map_pa), tableprot); - trampoline_pmd[0] = pfn_pmd(PFN_DOWN(load_pa), prot); + /* Setup trampoline mapping ops */ + tramp_ops.get_pte_virt = __load_va(early_get_pte_virt, load_pa); + tramp_ops.alloc_pte = __load_va(early_alloc_trampoline_pte, load_pa); + tramp_ops.get_pmd_virt = NULL; + tramp_ops.alloc_pmd = NULL; - pgdp = __load_va(swapper_pg_dir, load_pa); + /* Setup swapper mapping ops */ + swap_ops.get_pte_virt = __load_va(early_get_pte_virt, load_pa); + swap_ops.alloc_pte = __load_va(early_alloc_swapper_pte, load_pa); + swap_ops.get_pmd_virt = NULL; + swap_ops.alloc_pmd = NULL; - for (i = 0; i < (-PAGE_OFFSET)/PGDIR_SIZE; ++i) { - size_t o = (PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD + i; +#ifndef __PAGETABLE_PMD_FOLDED + /* Update trampoline mapping ops for PMD */ + tramp_ops.get_pmd_virt = __load_va(early_get_pmd_virt, load_pa); + tramp_ops.alloc_pmd = __load_va(early_alloc_trampoline_pmd, load_pa); - map_pa = __load_pa(swapper_pmd, load_pa); - pgdp[o] = pfn_pgd(PFN_DOWN(map_pa) + i, tableprot); - } - pmdp = __load_va(swapper_pmd, load_pa); - for (i = 0; i < ARRAY_SIZE(swapper_pmd); i++) - pmdp[i] = pfn_pmd(PFN_DOWN(load_pa + i * PMD_SIZE), prot); + /* Update swapper mapping ops for PMD */ + swap_ops.get_pmd_virt = __load_va(early_get_pmd_virt, load_pa); + swap_ops.alloc_pmd = __load_va(early_alloc_swapper_pmd, load_pa); + /* Setup swapper PGD and PMD for fixmap */ map_pa = __load_pa(fixmap_pmd, load_pa); - pgdp[(FIXADDR_START >> PGDIR_SHIFT) % PTRS_PER_PGD] = - 
pfn_pgd(PFN_DOWN(map_pa), tableprot); - pmdp = __load_va(fixmap_pmd, load_pa); + create_pgd_mapping(__load_va(swapper_pg_dir, load_pa), + FIXADDR_START, map_pa, PGDIR_SIZE, tableprot, + load_pa, &swap_ops); map_pa = __load_pa(fixmap_pte, load_pa); - fixmap_pmd[(FIXADDR_START >> PMD_SHIFT) % PTRS_PER_PMD] = - pfn_pmd(PFN_DOWN(map_pa), tableprot); + create_pmd_mapping(__load_va(fixmap_pmd, load_pa), + FIXADDR_START, map_pa, PMD_SIZE, tableprot, + load_pa, &swap_ops); #else - pgdp = __load_va(trampoline_pg_dir, load_pa); - pgdp[(PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD] = - pfn_pgd(PFN_DOWN(load_pa), prot); + /* Setup swapper PGD for fixmap */ + map_pa = __load_pa(fixmap_pte, load_pa); + create_pgd_mapping(__load_va(swapper_pg_dir, load_pa), + FIXADDR_START, map_pa, PGDIR_SIZE, tableprot, + load_pa, &swap_ops); +#endif - pgdp = __load_va(swapper_pg_dir, load_pa); - for (i = 0; i < (-PAGE_OFFSET)/PGDIR_SIZE; ++i) { - size_t o = (PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD + i; + /* Setup trampoline PGD covering first few MBs of kernel */ + end_va = PAGE_OFFSET + PAGE_SIZE*PTRS_PER_PTE; + for (va = PAGE_OFFSET; va < end_va; va += map_size) + create_pgd_mapping(__load_va(trampoline_pg_dir, load_pa), + va, load_pa + (va - PAGE_OFFSET), + map_size, prot, load_pa, &tramp_ops); + + /* + * Setup swapper PGD covering entire kernel which will allow + * us to reach paging_init(). We map all memory banks later in + * setup_vm_final() below.
+ */ + end_va = PAGE_OFFSET + load_sz; + for (va = PAGE_OFFSET; va < end_va; va += map_size) + create_pgd_mapping(__load_va(swapper_pg_dir, load_pa), + va, load_pa + (va - PAGE_OFFSET), + map_size, prot, load_pa, &swap_ops); + + /* Create fixed mapping for early parsing of FDT */ + end_va = __fix_to_virt(FIX_FDT) + FIX_FDT_SIZE; + for (va = __fix_to_virt(FIX_FDT); va < end_va; va += PAGE_SIZE) + create_pte_mapping(__load_va(fixmap_pte, load_pa), + va, dtb_pa + (va - __fix_to_virt(FIX_FDT)), + PAGE_SIZE, prot); +} - pgdp[o] = pfn_pgd(PFN_DOWN(load_pa + i * PGDIR_SIZE), prot); - } +static void __init setup_vm_final(void) +{ + phys_addr_t pa, start, end; + struct memblock_region *reg; + struct mapping_ops ops; + pgprot_t prot = __pgprot(pgprot_val(PAGE_KERNEL) | _PAGE_EXEC); - map_pa = __load_pa(fixmap_pte, load_pa); - pgdp[(FIXADDR_START >> PGDIR_SHIFT) % PTRS_PER_PGD] = - pfn_pgd(PFN_DOWN(map_pa), tableprot); + /* Setup mapping ops */ + ops.get_pte_virt = final_get_pte_virt; + ops.alloc_pte = final_alloc_pte; +#ifndef __PAGETABLE_PMD_FOLDED + ops.get_pmd_virt = final_get_pmd_virt; + ops.alloc_pmd = final_alloc_pmd; +#else + ops.get_pmd_virt = NULL; + ops.alloc_pmd = NULL; #endif + + /* Map all memory banks */ + for_each_memblock(memory, reg) { + start = reg->base; + end = start + reg->size; + + if (start >= end) + break; + if (memblock_is_nomap(reg)) + continue; + if (start <= __pa(PAGE_OFFSET) && + __pa(PAGE_OFFSET) < end) + start = __pa(PAGE_OFFSET); + + for (pa = start; pa < end; pa += map_size) + create_pgd_mapping(swapper_pg_dir, + (uintptr_t)__va(pa), pa, + map_size, prot, 0, &ops); + } + + clear_fixmap(FIX_PTE); + clear_fixmap(FIX_PMD); +} + +void __init paging_init(void) +{ + setup_vm_final(); + setup_zero_page(); + local_flush_tlb_all(); + zone_sizes_init(); } -- 2.17.1 _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply related 
[flat|nested] 16+ messages in thread
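[Editor's note] For readers following the table-walk code in the patch above: the pgd_index()/pmd_index()/pte_index() helpers that create_pgd_mapping(), create_pmd_mapping(), and create_pte_mapping() rely on reduce to simple shift-and-mask arithmetic. A minimal standalone sketch with Sv39 constants; the SK_* names are illustrative stand-ins, not kernel symbols:

```c
#include <stdint.h>

/* Illustrative Sv39 constants; the kernel derives these in pgtable.h. */
#define SK_PAGE_SHIFT  12     /* 4KB pages */
#define SK_PMD_SHIFT   21     /* 2MB PMD entries */
#define SK_PGDIR_SHIFT 30     /* 1GB PGD entries */
#define SK_PTRS        512UL  /* entries per table at each level */

/* Index of the entry covering va at each level of the page table. */
static uintptr_t sk_pgd_index(uintptr_t va)
{
	return (va >> SK_PGDIR_SHIFT) & (SK_PTRS - 1);
}

static uintptr_t sk_pmd_index(uintptr_t va)
{
	return (va >> SK_PMD_SHIFT) & (SK_PTRS - 1);
}

static uintptr_t sk_pte_index(uintptr_t va)
{
	return (va >> SK_PAGE_SHIFT) & (SK_PTRS - 1);
}
```

Each level consumes nine bits of the virtual address, which is why a PMD entry that is not a leaf simply points at a 512-entry PTE table covering the same 2MB span.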
* Re: [PATCH v2 3/5] RISC-V: Allow booting kernel from any 4KB aligned address 2019-03-21 9:47 ` [PATCH v2 3/5] RISC-V: Allow booting kernel from any 4KB aligned address Anup Patel @ 2019-03-23 15:40 ` Mike Rapoport 2019-03-23 17:24 ` Christoph Hellwig 2019-03-24 3:32 ` Anup Patel 0 siblings, 2 replies; 16+ messages in thread From: Mike Rapoport @ 2019-03-23 15:40 UTC (permalink / raw) To: Anup Patel Cc: Albert Ou, Palmer Dabbelt, linux-kernel, Christoph Hellwig, Atish Patra, Paul Walmsley, linux-riscv On Thu, Mar 21, 2019 at 09:47:51AM +0000, Anup Patel wrote: > Currently, we have to boot the RISCV64 kernel from a 2MB aligned physical > address and the RISCV32 kernel from a 4MB aligned physical address. This > constraint is because initial pagetable setup (i.e. setup_vm()) maps > entire RAM using hugepages (i.e. 2MB for 3-level pagetable and 4MB for > 2-level pagetable). > > Further, the above booting constraint also results in memory wastage > because if we boot kernel from some <xyz> address (which is not the same as > RAM start address) then RISCV kernel will map PAGE_OFFSET virtual address > linearly to <xyz> physical address and memory between RAM start and <xyz> > will be reserved/unusable. > > For example, RISCV64 kernel booted from 0x80200000 will waste 2MB of RAM > and RISCV32 kernel booted from 0x80400000 will waste 4MB of RAM. > > This patch re-writes the initial pagetable setup code to allow booting > RISCV32 and RISCV64 kernel from any 4KB (i.e. PAGE_SIZE) aligned address. > > To achieve this: > 1. We add kconfig option BOOT_PAGE_ALIGNED. When it is enabled we use > 4KB mappings in initial page table setup otherwise we use 2MB/4MB > mappings. > 2. We map kernel and dtb (few MBs) in setup_vm() (called from head.S) > 3. Once we reach paging_init() (called from setup_arch()) after > memblock setup, we map all available memory banks.
> > With this patch in place, the booting constraint for RISCV32 and RISCV64 > kernel is much more relaxed when CONFIG_BOOT_PAGE_ALIGNED=y and we can > now boot kernel very close to RAM start thereby minimizing memory wastage. I have no general objection, but I presume the patch will be significantly simplified if the addition of 4K page support follows the removal of the trampoline_pg_dir. That said, I didn't look into the details, since they will change substantially; only some comments on the Kconfig part. At a high level, have you considered using large pages in setup_vm() and then remapping everything with 4K pages in setup_vm_final()? This might save you the whole ops-> churn. > Signed-off-by: Anup Patel <anup.patel@wdc.com> > --- > arch/riscv/Kconfig | 11 + > arch/riscv/include/asm/fixmap.h | 5 + > arch/riscv/include/asm/pgtable-64.h | 5 + > arch/riscv/include/asm/pgtable.h | 6 +- > arch/riscv/kernel/head.S | 1 + > arch/riscv/kernel/setup.c | 4 +- > arch/riscv/mm/init.c | 402 ++++++++++++++++++++++++---- > 7 files changed, 378 insertions(+), 56 deletions(-) > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig > index eb56c82d8aa1..1b0c66f7aba3 100644 > --- a/arch/riscv/Kconfig > +++ b/arch/riscv/Kconfig > @@ -172,6 +172,17 @@ config SMP > > If you don't know what to do here, say N. > > +config BOOT_PAGE_ALIGNED > + bool "Allow booting from page aligned address" default no, please > + help > + This enables support for booting kernel from any page aligned > + address (i.e. 4KB aligned). This option is particularly useful > + on systems with very less RAM (few MBs) because using it we ^ small > + can boot kernel closer RAM start thereby reducing unusable RAM > + below kernel. > + > + If you don't know what to do here, say N.
> + > config NR_CPUS > int "Maximum number of CPUs (2-32)" > range 2 32 > diff --git a/arch/riscv/include/asm/fixmap.h b/arch/riscv/include/asm/fixmap.h > index 57afe604b495..5cf53dd882e5 100644 > --- a/arch/riscv/include/asm/fixmap.h > +++ b/arch/riscv/include/asm/fixmap.h > @@ -21,6 +21,11 @@ > */ > enum fixed_addresses { > FIX_HOLE, > +#define FIX_FDT_SIZE SZ_1M > + FIX_FDT_END, > + FIX_FDT = FIX_FDT_END + FIX_FDT_SIZE / PAGE_SIZE - 1, > + FIX_PTE, > + FIX_PMD, > FIX_EARLYCON_MEM_BASE, > __end_of_fixed_addresses > }; > diff --git a/arch/riscv/include/asm/pgtable-64.h b/arch/riscv/include/asm/pgtable-64.h > index 7aa0ea9bd8bb..56ecc3dc939d 100644 > --- a/arch/riscv/include/asm/pgtable-64.h > +++ b/arch/riscv/include/asm/pgtable-64.h > @@ -78,6 +78,11 @@ static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot) > return __pmd((pfn << _PAGE_PFN_SHIFT) | pgprot_val(prot)); > } > > +static inline unsigned long _pmd_pfn(pmd_t pmd) > +{ > + return pmd_val(pmd) >> _PAGE_PFN_SHIFT; > +} > + > #define pmd_ERROR(e) \ > pr_err("%s:%d: bad pmd %016lx.\n", __FILE__, __LINE__, pmd_val(e)) > > diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h > index 1141364d990e..05fa2115e736 100644 > --- a/arch/riscv/include/asm/pgtable.h > +++ b/arch/riscv/include/asm/pgtable.h > @@ -121,12 +121,16 @@ static inline void pmd_clear(pmd_t *pmdp) > set_pmd(pmdp, __pmd(0)); > } > > - > static inline pgd_t pfn_pgd(unsigned long pfn, pgprot_t prot) > { > return __pgd((pfn << _PAGE_PFN_SHIFT) | pgprot_val(prot)); > } > > +static inline unsigned long _pgd_pfn(pgd_t pgd) > +{ > + return pgd_val(pgd) >> _PAGE_PFN_SHIFT; > +} > + > #define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1)) > > /* Locate an entry in the page global directory */ > diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S > index 7966262b4f9d..12a3ec5eb8ab 100644 > --- a/arch/riscv/kernel/head.S > +++ b/arch/riscv/kernel/head.S > @@ -63,6 +63,7 @@ clear_bss_done: > /* 
Initialize page tables and relocate to virtual addresses */ > la sp, init_thread_union + THREAD_SIZE > la a0, _start > + mv a1, s1 > call setup_vm > call relocate > > diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c > index ecb654f6a79e..acdd0f74982b 100644 > --- a/arch/riscv/kernel/setup.c > +++ b/arch/riscv/kernel/setup.c > @@ -30,6 +30,7 @@ > #include <linux/sched/task.h> > #include <linux/swiotlb.h> > > +#include <asm/fixmap.h> > #include <asm/setup.h> > #include <asm/sections.h> > #include <asm/pgtable.h> > @@ -62,7 +63,8 @@ unsigned long boot_cpu_hartid; > > void __init parse_dtb(unsigned int hartid, void *dtb) > { > - if (early_init_dt_scan(__va(dtb))) > + dtb = (void *)fix_to_virt(FIX_FDT) + ((uintptr_t)dtb & ~PAGE_MASK); > + if (early_init_dt_scan(dtb)) > return; > > pr_err("No DTB passed to the kernel\n"); > diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c > index e38f8195e45b..c389fbfeccd8 100644 > --- a/arch/riscv/mm/init.c > +++ b/arch/riscv/mm/init.c > @@ -1,14 +1,7 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > /* > + * Copyright (C) 2019 Western Digital Corporation or its affiliates. > * Copyright (C) 2012 Regents of the University of California > - * > - * This program is free software; you can redistribute it and/or > - * modify it under the terms of the GNU General Public License > - * as published by the Free Software Foundation, version 2. > - * > - * This program is distributed in the hope that it will be useful, > - * but WITHOUT ANY WARRANTY; without even the implied warranty of > - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > - * GNU General Public License for more details. 
> */ > > #include <linux/init.h> > @@ -43,13 +36,6 @@ void setup_zero_page(void) > memset((void *)empty_zero_page, 0, PAGE_SIZE); > } > > -void __init paging_init(void) > -{ > - setup_zero_page(); > - local_flush_tlb_all(); > - zone_sizes_init(); > -} > - > void __init mem_init(void) > { > #ifdef CONFIG_FLATMEM > @@ -143,18 +129,36 @@ void __init setup_bootmem(void) > } > } > > +#define MAX_EARLY_MAPPING_SIZE SZ_128M > + > pgd_t swapper_pg_dir[PTRS_PER_PGD] __page_aligned_bss; > pgd_t trampoline_pg_dir[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE); > > #ifndef __PAGETABLE_PMD_FOLDED > -#define NUM_SWAPPER_PMDS ((uintptr_t)-PAGE_OFFSET >> PGDIR_SHIFT) > -pmd_t swapper_pmd[PTRS_PER_PMD*((-PAGE_OFFSET)/PGDIR_SIZE)] __page_aligned_bss; > -pmd_t trampoline_pmd[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE); > +#if MAX_EARLY_MAPPING_SIZE < PGDIR_SIZE > +#define NUM_SWAPPER_PMDS 1UL > +#else > +#define NUM_SWAPPER_PMDS (MAX_EARLY_MAPPING_SIZE/PGDIR_SIZE) > +#endif > +#define NUM_TRAMPOLINE_PMDS 1UL > +pmd_t swapper_pmd[PTRS_PER_PMD*NUM_SWAPPER_PMDS] __page_aligned_bss; > +pmd_t trampoline_pmd[PTRS_PER_PMD*NUM_TRAMPOLINE_PMDS] > + __initdata __aligned(PAGE_SIZE); > pmd_t fixmap_pmd[PTRS_PER_PMD] __page_aligned_bss; > +#define NUM_SWAPPER_PTES (MAX_EARLY_MAPPING_SIZE/PMD_SIZE) > +#else > +#define NUM_SWAPPER_PTES (MAX_EARLY_MAPPING_SIZE/PGDIR_SIZE) > #endif > > +#define NUM_TRAMPOLINE_PTES 1UL > + > +pte_t swapper_pte[PTRS_PER_PTE*NUM_SWAPPER_PTES] __page_aligned_bss; > +pte_t trampoline_pte[PTRS_PER_PTE*NUM_TRAMPOLINE_PTES] > + __initdata __aligned(PAGE_SIZE); > pte_t fixmap_pte[PTRS_PER_PTE] __page_aligned_bss; > > +uintptr_t map_size; > + > void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot) > { > unsigned long addr = __fix_to_virt(idx); > @@ -172,6 +176,13 @@ void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot) > } > } > > +struct mapping_ops { > + pte_t *(*get_pte_virt)(phys_addr_t pa); > + phys_addr_t 
(*alloc_pte)(uintptr_t va, uintptr_t load_pa); > + pmd_t *(*get_pmd_virt)(phys_addr_t pa); > + phys_addr_t (*alloc_pmd)(uintptr_t va, uintptr_t load_pa); > +}; > + > static inline void *__load_addr(void *ptr, uintptr_t load_pa) > { > extern char _start; > @@ -186,64 +197,347 @@ static inline void *__load_addr(void *ptr, uintptr_t load_pa) > #define __load_va(ptr, load_pa) __load_addr(ptr, load_pa) > #define __load_pa(ptr, load_pa) ((uintptr_t)__load_addr(ptr, load_pa)) > > -asmlinkage void __init setup_vm(uintptr_t load_pa) > +static phys_addr_t __init final_alloc_pgtable(void) > +{ > + return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE); > +} > + > +static pte_t *__init early_get_pte_virt(phys_addr_t pa) > +{ > + return (pte_t *)((uintptr_t)pa); > +} > + > +static pte_t *__init final_get_pte_virt(phys_addr_t pa) > +{ > + clear_fixmap(FIX_PTE); > + > + return (pte_t *)set_fixmap_offset(FIX_PTE, pa); > +} > + > +static phys_addr_t __init early_alloc_trampoline_pte(uintptr_t va, > + uintptr_t load_pa) > +{ > + pte_t *base = __load_va(trampoline_pte, load_pa); > + uintptr_t pte_num = ((va - PAGE_OFFSET) >> PMD_SHIFT); > + > + BUG_ON(pte_num >= NUM_TRAMPOLINE_PTES); > + > + return (uintptr_t)&base[pte_num * PTRS_PER_PTE]; > +} > + > +static phys_addr_t __init early_alloc_swapper_pte(uintptr_t va, > + uintptr_t load_pa) > +{ > + pte_t *base = __load_va(swapper_pte, load_pa); > + uintptr_t pte_num = ((va - PAGE_OFFSET) >> PMD_SHIFT); > + > + BUG_ON(pte_num >= NUM_SWAPPER_PTES); > + > + return (uintptr_t)&base[pte_num * PTRS_PER_PTE]; > +} > + > +static phys_addr_t __init final_alloc_pte(uintptr_t va, uintptr_t load_pa) > +{ > + return final_alloc_pgtable(); > +} > + > +static void __init create_pte_mapping(pte_t *ptep, > + uintptr_t va, phys_addr_t pa, > + phys_addr_t sz, pgprot_t prot) > { > - uintptr_t i; > + uintptr_t pte_index = pte_index(va); > + > + BUG_ON(sz != PAGE_SIZE); > + > + if (pte_none(ptep[pte_index])) > + ptep[pte_index] = pfn_pte(PFN_DOWN(pa), prot); > +} 
> + > #ifndef __PAGETABLE_PMD_FOLDED > +static pmd_t *__init early_get_pmd_virt(phys_addr_t pa) > +{ > + return (pmd_t *)((uintptr_t)pa); > +} > + > +static pmd_t *__init final_get_pmd_virt(phys_addr_t pa) > +{ > + clear_fixmap(FIX_PMD); > + > + return (pmd_t *)set_fixmap_offset(FIX_PMD, pa); > +} > + > +static phys_addr_t __init early_alloc_trampoline_pmd(uintptr_t va, > + uintptr_t load_pa) > +{ > + pmd_t *base = __load_va(trampoline_pmd, load_pa); > + uintptr_t pmd_num = (va - PAGE_OFFSET) >> PGDIR_SHIFT; > + > + BUG_ON(pmd_num >= NUM_TRAMPOLINE_PMDS); > + > + return (uintptr_t)&base[pmd_num * PTRS_PER_PMD]; > +} > + > +static phys_addr_t __init early_alloc_swapper_pmd(uintptr_t va, > + uintptr_t load_pa) > +{ > + pmd_t *base = __load_va(swapper_pmd, load_pa); > + uintptr_t pmd_num = (va - PAGE_OFFSET) >> PGDIR_SHIFT; > + > + BUG_ON(pmd_num >= NUM_SWAPPER_PMDS); > + > + return (uintptr_t)&base[pmd_num * PTRS_PER_PMD]; > +} > + > +static phys_addr_t __init final_alloc_pmd(uintptr_t va, uintptr_t load_pa) > +{ > + return final_alloc_pgtable(); > +} > + > +static void __init create_pmd_mapping(pmd_t *pmdp, > + uintptr_t va, phys_addr_t pa, > + phys_addr_t sz, pgprot_t prot, > + uintptr_t ops_load_pa, > + struct mapping_ops *ops) > +{ > + pte_t *ptep; > + phys_addr_t pte_phys; > + uintptr_t pmd_index = pmd_index(va); > + > + if (sz == PMD_SIZE) { > + if (pmd_none(pmdp[pmd_index])) > + pmdp[pmd_index] = pfn_pmd(PFN_DOWN(pa), prot); > + return; > + } > + > + if (pmd_none(pmdp[pmd_index])) { > + pte_phys = ops->alloc_pte(va, ops_load_pa); > + pmdp[pmd_index] = pfn_pmd(PFN_DOWN(pte_phys), > + __pgprot(_PAGE_TABLE)); > + ptep = ops->get_pte_virt(pte_phys); > + memset(ptep, 0, PAGE_SIZE); > + } else { > + pte_phys = PFN_PHYS(_pmd_pfn(pmdp[pmd_index])); > + ptep = ops->get_pte_virt(pte_phys); > + } > + > + create_pte_mapping(ptep, va, pa, sz, prot); > +} > + > +static void __init create_pgd_mapping(pgd_t *pgdp, > + uintptr_t va, phys_addr_t pa, > + phys_addr_t sz, pgprot_t 
prot, > + uintptr_t ops_load_pa, > + struct mapping_ops *ops) > +{ > pmd_t *pmdp; > + phys_addr_t pmd_phys; > + uintptr_t pgd_index = pgd_index(va); > + > + if (sz == PGDIR_SIZE) { > + if (pgd_val(pgdp[pgd_index]) == 0) > + pgdp[pgd_index] = pfn_pgd(PFN_DOWN(pa), prot); > + return; > + } > + > + if (pgd_val(pgdp[pgd_index]) == 0) { > + pmd_phys = ops->alloc_pmd(va, ops_load_pa); > + pgdp[pgd_index] = pfn_pgd(PFN_DOWN(pmd_phys), > + __pgprot(_PAGE_TABLE)); > + pmdp = ops->get_pmd_virt(pmd_phys); > + memset(pmdp, 0, PAGE_SIZE); > + } else { > + pmd_phys = PFN_PHYS(_pgd_pfn(pgdp[pgd_index])); > + pmdp = ops->get_pmd_virt(pmd_phys); > + } > + > + create_pmd_mapping(pmdp, va, pa, sz, prot, ops_load_pa, ops); > +} > +#else > +static void __init create_pgd_mapping(pgd_t *pgdp, > + uintptr_t va, phys_addr_t pa, > + phys_addr_t sz, pgprot_t prot, > + uintptr_t ops_load_pa, > + struct mapping_ops *ops) > +{ > + pte_t *ptep; > + phys_addr_t pte_phys; > + uintptr_t pgd_index = pgd_index(va); > + > + if (sz == PGDIR_SIZE) { > + if (pgd_val(pgdp[pgd_index]) == 0) > + pgdp[pgd_index] = pfn_pgd(PFN_DOWN(pa), prot); > + return; > + } > + > + if (pgd_val(pgdp[pgd_index]) == 0) { > + pte_phys = ops->alloc_pte(va, ops_load_pa); > + pgdp[pgd_index] = pfn_pgd(PFN_DOWN(pte_phys), > + __pgprot(_PAGE_TABLE)); > + ptep = ops->get_pte_virt(pte_phys); > + memset(ptep, 0, PAGE_SIZE); > + } else { > + pte_phys = PFN_PHYS(_pgd_pfn(pgdp[pgd_index])); > + ptep = ops->get_pte_virt(pte_phys); > + } > + > + create_pte_mapping(ptep, va, pa, sz, prot); > +} > +#endif > + > +static uintptr_t __init best_map_size(uintptr_t load_pa, phys_addr_t size) > +{ > +#ifdef CONFIG_BOOT_PAGE_ALIGNED > + uintptr_t map_sz = PAGE_SIZE; > +#else > +#ifndef __PAGETABLE_PMD_FOLDED > + uintptr_t map_sz = PMD_SIZE; > +#else > + uintptr_t map_sz = PGDIR_SIZE; > +#endif > #endif > - pgd_t *pgdp; > + > +#ifndef __PAGETABLE_PMD_FOLDED > + if (!(load_pa & (PMD_SIZE - 1)) && > + (size >= PMD_SIZE) && > + (map_sz < PMD_SIZE)) > + 
map_sz = PMD_SIZE; > +#endif > + > + if (!(load_pa & (PGDIR_SIZE - 1)) && > + (size >= PGDIR_SIZE) && > + (map_sz < PGDIR_SIZE)) > + map_sz = PGDIR_SIZE; > + > + return map_sz; > +} > + > +asmlinkage void __init setup_vm(uintptr_t load_pa, uintptr_t dtb_pa) > +{ > phys_addr_t map_pa; > + uintptr_t va, end_va; > + uintptr_t load_sz = __load_pa(&_end, load_pa) - load_pa; > pgprot_t tableprot = __pgprot(_PAGE_TABLE); > pgprot_t prot = __pgprot(pgprot_val(PAGE_KERNEL) | _PAGE_EXEC); > + struct mapping_ops tramp_ops, swap_ops; > > va_pa_offset = PAGE_OFFSET - load_pa; > pfn_base = PFN_DOWN(load_pa); > + map_size = best_map_size(load_pa, PGDIR_SIZE); > > /* Sanity check alignment and size */ > BUG_ON((PAGE_OFFSET % PGDIR_SIZE) != 0); > - BUG_ON((load_pa % (PAGE_SIZE * PTRS_PER_PTE)) != 0); > + BUG_ON((load_pa % map_size) != 0); > + BUG_ON(load_sz > MAX_EARLY_MAPPING_SIZE); > > -#ifndef __PAGETABLE_PMD_FOLDED > - pgdp = __load_va(trampoline_pg_dir, load_pa); > - map_pa = __load_pa(trampoline_pmd, load_pa); > - pgdp[(PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD] = > - pfn_pgd(PFN_DOWN(map_pa), tableprot); > - trampoline_pmd[0] = pfn_pmd(PFN_DOWN(load_pa), prot); > + /* Setup trampoline mapping ops */ > + tramp_ops.get_pte_virt = __load_va(early_get_pte_virt, load_pa); > + tramp_ops.alloc_pte = __load_va(early_alloc_trampoline_pte, load_pa); > + tramp_ops.get_pmd_virt = NULL; > + tramp_ops.alloc_pmd = NULL; > > - pgdp = __load_va(swapper_pg_dir, load_pa); > + /* Setup swapper mapping ops */ > + swap_ops.get_pte_virt = __load_va(early_get_pte_virt, load_pa); > + swap_ops.alloc_pte = __load_va(early_alloc_swapper_pte, load_pa); > + swap_ops.get_pmd_virt = NULL; > + swap_ops.alloc_pmd = NULL; > > - for (i = 0; i < (-PAGE_OFFSET)/PGDIR_SIZE; ++i) { > - size_t o = (PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD + i; > +#ifndef __PAGETABLE_PMD_FOLDED > + /* Update trampoline mapping ops for PMD */ > + tramp_ops.get_pmd_virt = __load_va(early_get_pmd_virt, load_pa); > + 
tramp_ops.alloc_pmd = __load_va(early_alloc_trampoline_pmd, load_pa); > > - map_pa = __load_pa(swapper_pmd, load_pa); > - pgdp[o] = pfn_pgd(PFN_DOWN(map_pa) + i, tableprot); > - } > - pmdp = __load_va(swapper_pmd, load_pa); > - for (i = 0; i < ARRAY_SIZE(swapper_pmd); i++) > - pmdp[i] = pfn_pmd(PFN_DOWN(load_pa + i * PMD_SIZE), prot); > + /* Update swapper mapping ops for PMD */ > + swap_ops.get_pmd_virt = __load_va(early_get_pmd_virt, load_pa); > + swap_ops.alloc_pmd = __load_va(early_alloc_swapper_pmd, load_pa); > > + /* Setup swapper PGD and PMD for fixmap */ > map_pa = __load_pa(fixmap_pmd, load_pa); > - pgdp[(FIXADDR_START >> PGDIR_SHIFT) % PTRS_PER_PGD] = > - pfn_pgd(PFN_DOWN(map_pa), tableprot); > - pmdp = __load_va(fixmap_pmd, load_pa); > + create_pgd_mapping(__load_va(swapper_pg_dir, load_pa), > + FIXADDR_START, map_pa, PGDIR_SIZE, tableprot, > + load_pa, &swap_ops); > map_pa = __load_pa(fixmap_pte, load_pa); > - fixmap_pmd[(FIXADDR_START >> PMD_SHIFT) % PTRS_PER_PMD] = > - pfn_pmd(PFN_DOWN(map_pa), tableprot); > + create_pmd_mapping(__load_va(fixmap_pmd, load_pa), > + FIXADDR_START, map_pa, PMD_SIZE, tableprot, > + load_pa, &swap_ops); > #else > - pgdp = __load_va(trampoline_pg_dir, load_pa); > - pgdp[(PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD] = > - pfn_pgd(PFN_DOWN(load_pa), prot); > + /* Setup swapper PGD for fixmap */ > + map_pa = __load_pa(fixmap_pte, load_pa); > + create_pgd_mapping(__load_va(swapper_pg_dir, load_pa), > + FIXADDR_START, map_pa, PGDIR_SIZE, tableprot, > + load_pa, &swap_ops); > +#endif > > - pgdp = __load_va(swapper_pg_dir, load_pa); > - for (i = 0; i < (-PAGE_OFFSET)/PGDIR_SIZE; ++i) { > - size_t o = (PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD + i; > + /* Setup trampoling PGD covering first few MBs of kernel */ > + end_va = PAGE_OFFSET + PAGE_SIZE*PTRS_PER_PTE; > + for (va = PAGE_OFFSET; va < end_va; va += map_size) > + create_pgd_mapping(__load_va(trampoline_pg_dir, load_pa), > + va, load_pa + (va - PAGE_OFFSET), > + map_size, 
prot, load_pa, &tramp_ops); > + > + /* > + * Setup swapper PGD covering entire kernel which will allows > + * us to reach paging_init(). We map all memory banks later in > + * setup_vm_final() below. > + */ > + end_va = PAGE_OFFSET + load_sz; > + for (va = PAGE_OFFSET; va < end_va; va += map_size) > + create_pgd_mapping(__load_va(swapper_pg_dir, load_pa), > + va, load_pa + (va - PAGE_OFFSET), > + map_size, prot, load_pa, &swap_ops); > + > + /* Create fixed mapping for early parsing of FDT */ > + end_va = __fix_to_virt(FIX_FDT) + FIX_FDT_SIZE; > + for (va = __fix_to_virt(FIX_FDT); va < end_va; va += PAGE_SIZE) > + create_pte_mapping(__load_va(fixmap_pte, load_pa), > + va, dtb_pa + (va - __fix_to_virt(FIX_FDT)), > + PAGE_SIZE, prot); > +} > > - pgdp[o] = pfn_pgd(PFN_DOWN(load_pa + i * PGDIR_SIZE), prot); > - } > +static void __init setup_vm_final(void) > +{ > + phys_addr_t pa, start, end; > + struct memblock_region *reg; > + struct mapping_ops ops; > + pgprot_t prot = __pgprot(pgprot_val(PAGE_KERNEL) | _PAGE_EXEC); > > - map_pa = __load_pa(fixmap_pte, load_pa); > - pgdp[(FIXADDR_START >> PGDIR_SHIFT) % PTRS_PER_PGD] = > - pfn_pgd(PFN_DOWN(map_pa), tableprot); > + /* Setup mapping ops */ > + ops.get_pte_virt = final_get_pte_virt; > + ops.alloc_pte = final_alloc_pte; > +#ifndef __PAGETABLE_PMD_FOLDED > + ops.get_pmd_virt = final_get_pmd_virt; > + ops.alloc_pmd = final_alloc_pmd; > +#else > + ops.get_pmd_virt = NULL; > + ops.alloc_pmd = NULL; > #endif > + > + /* Map all memory banks */ > + for_each_memblock(memory, reg) { > + start = reg->base; > + end = start + reg->size; > + > + if (start >= end) > + break; > + if (memblock_is_nomap(reg)) > + continue; > + if (start <= __pa(PAGE_OFFSET) && > + __pa(PAGE_OFFSET) < end) > + start = __pa(PAGE_OFFSET); > + > + for (pa = start; pa < end; pa += map_size) > + create_pgd_mapping(swapper_pg_dir, > + (uintptr_t)__va(pa), pa, > + map_size, prot, 0, &ops); > + } > + > + clear_fixmap(FIX_PTE); > + clear_fixmap(FIX_PMD); > +} > + > 
+void __init paging_init(void) > +{ > + setup_vm_final(); > + setup_zero_page(); > + local_flush_tlb_all(); > + zone_sizes_init(); > } > -- > 2.17.1 > -- Sincerely yours, Mike. ^ permalink raw reply [flat|nested] 16+ messages in thread
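[Editor's note] The memory wastage the commit message quantifies is just the gap between where usable RAM begins and the next load-address boundary the kernel's mapping size forces. A sketch of that arithmetic (the function name and the example firmware-reserved start address 0x80020000 are illustrative assumptions, not values from the patch):

```c
#include <stdint.h>

/*
 * RAM wasted below the kernel when the kernel image must be loaded at
 * the next align-boundary at or above ram_start. With 4KB-aligned
 * booting (align == PAGE_SIZE) this gap shrinks to at most a page.
 */
static uintptr_t wasted_below_kernel(uintptr_t ram_start, uintptr_t align)
{
	/* Round ram_start up to the next multiple of align (power of two). */
	uintptr_t load_pa = (ram_start + align - 1) & ~(align - 1);

	return load_pa - ram_start;
}
```

For example, if firmware leaves usable RAM starting at 0x80020000, a 2MB mapping size forces the kernel up to 0x80200000 and loses 0x1E0000 bytes, while a 4KB mapping size loses nothing.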
* Re: [PATCH v2 3/5] RISC-V: Allow booting kernel from any 4KB aligned address 2019-03-23 15:40 ` Mike Rapoport @ 2019-03-23 17:24 ` Christoph Hellwig 2019-03-24 4:16 ` Anup Patel 2019-03-24 3:32 ` Anup Patel 1 sibling, 1 reply; 16+ messages in thread From: Christoph Hellwig @ 2019-03-23 17:24 UTC (permalink / raw) To: Mike Rapoport Cc: Palmer Dabbelt, Anup Patel, linux-kernel, Christoph Hellwig, Atish Patra, Albert Ou, Paul Walmsley, linux-riscv On Sat, Mar 23, 2019 at 05:40:12PM +0200, Mike Rapoport wrote: > I have no general objection, but I presume the patch will be significantly > simplified if the addition of 4K page support follows the removal of > the trampoline_pg_dir. > > That said, I didn't look into the details, since they will change > substantially; only some comments on the Kconfig part. > > At a high level, have you considered using large pages in setup_vm() and > then remapping everything with 4K pages in setup_vm_final()? This might > save you the whole ops-> churn. That would be a great start. That being said, the current tiny-memory RISC-V devices don't even have an MMU, so the kernel pagetable mapping isn't even relevant for them. I'm just not sure adding more complexity in the early boot path for a borderline case (MMU and tiny memory with a tiny kernel image) is really worth all the complexity. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2 3/5] RISC-V: Allow booting kernel from any 4KB aligned address
  2019-03-23 17:24 ` Christoph Hellwig
@ 2019-03-24  4:16 ` Anup Patel
  0 siblings, 0 replies; 16+ messages in thread
From: Anup Patel @ 2019-03-24 4:16 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Palmer Dabbelt, Anup Patel, linux-kernel, Mike Rapoport, Atish Patra, Albert Ou, Paul Walmsley, linux-riscv

On Sat, Mar 23, 2019 at 10:54 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Sat, Mar 23, 2019 at 05:40:12PM +0200, Mike Rapoport wrote:
> > I have no general objection, but I presume the patch will be significantly
> > simplified if the addition of 4K pages support will follow the removal of
> > the trampoline_pg_dir.
> >
> > That said, I didn't look into the details, since they will change
> > substantially, only some comments on the Kconfig part.
> >
> > On the high level, have you considered using large pages in setup_vm() and
> > then remapping everything with 4K pages in setup_vm_final()? This might
> > save you the whole ops-> churn.
>
> That would be a great start. That being said, the current tiny-memory
> RISC-V devices don't even have an MMU, so the kernel page table mapping
> isn't even relevant for them. I'm just not sure adding more complexity
> in the early boot path for a borderline case (MMU and tiny memory
> with a tiny kernel image) is really worth it.

It's not just for addressing a borderline case (MMU and tiny memory with a
tiny kernel image). We are trying to address the following issues in the
current code:

1. The current setup_vm() maps all possible kernel virtual addresses (128GB
on a 64bit system and 1GB on a 32bit system). The amount of RAM present on
real systems might be much less, so we should not have kernel mappings for
non-existent RAM. Of course, we don't know the amount of RAM available in
setup_vm(), so we have to split page table setup into two parts and do only
the minimal required mapping in setup_vm().

2.
A NOMMU kernel requires a swapper_pg_dir with identity mapping (VA == PA),
and without it we get a boot-time crash, so we cannot skip it for the NOMMU
case. For NOMMU, PAGE_OFFSET will typically be 0x80020000 (or 0x80xxxxxx).
This means the swapper_pmd array (which uses -PAGE_OFFSET) will be
over-sized, causing compile errors.

3. For both NOMMU and MMU with tiny memory, the current setup_vm() does not
allow us to place the kernel at a non-2M (or non-4M) aligned address,
thereby causing memory below the kernel to be wasted.

4. For an MMU based kernel, the current setup_vm() is hard-wired for a fixed
2M mapping size. It will require more changes if we want to do 1G mappings.

The above issues motivated us to re-write setup_vm(). We are trying to make
the initial page table setup more flexible and robust so that:

1. We don't have any unwanted mappings pointing to non-existent RAM
2. We can have any value of PAGE_OFFSET for the NOMMU case without the page
   table arrays becoming oversized
3. We can create mappings of the best possible size to get good performance
4. We can boot from any 4K/2M/1G (or just 4K) aligned load address

Also, the end result of all this is much more readable page table setup
code, shared between setup_vm() and setup_vm_final(), where the differences
are abstracted via mapping ops.

Regards,
Anup
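The wastage in issue 3 above is simply the gap between the RAM start and the kernel load address, since with the old scheme everything below the kernel gets reserved. As rough arithmetic (the helper and the 0x80000000 RAM base are illustrative, not taken from the patches):

```c
#include <stdint.h>

/*
 * With hugepage-only early mappings, the kernel must be loaded at the next
 * 2M/4M aligned address above RAM start, and the old setup_bootmem()
 * reserved everything from RAM start to the kernel end -- so the gap
 * between the two is lost.
 */
uintptr_t wasted_bytes(uintptr_t ram_start, uintptr_t load_pa)
{
	return load_pa - ram_start;
}
```

For the examples in the patch description, this gives 0x200000 (2MB) for an RV64 kernel loaded at 0x80200000 and 0x400000 (4MB) for an RV32 kernel loaded at 0x80400000, assuming RAM starts at 0x80000000.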
* Re: [PATCH v2 3/5] RISC-V: Allow booting kernel from any 4KB aligned address
  2019-03-23 15:40 ` Mike Rapoport
  2019-03-23 17:24 ` Christoph Hellwig
@ 2019-03-24  3:32 ` Anup Patel
  1 sibling, 0 replies; 16+ messages in thread
From: Anup Patel @ 2019-03-24 3:32 UTC (permalink / raw)
To: Mike Rapoport
Cc: Palmer Dabbelt, Anup Patel, linux-kernel, Christoph Hellwig, Atish Patra, Albert Ou, Paul Walmsley, linux-riscv

On Sat, Mar 23, 2019 at 9:10 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
>
> On Thu, Mar 21, 2019 at 09:47:51AM +0000, Anup Patel wrote:
> > Currently, we have to boot the RISCV64 kernel from a 2MB aligned physical
> > address and the RISCV32 kernel from a 4MB aligned physical address. This
> > constraint is because the initial pagetable setup (i.e. setup_vm()) maps
> > the entire RAM using hugepages (i.e. 2MB for a 3-level pagetable and 4MB
> > for a 2-level pagetable).
> >
> > Further, the above booting constraint also results in memory wastage
> > because if we boot the kernel from some <xyz> address (which is not the
> > same as the RAM start address) then the RISCV kernel will map the
> > PAGE_OFFSET virtual address linearly to the <xyz> physical address, and
> > memory between the RAM start and <xyz> will be reserved/unusable.
> >
> > For example, a RISCV64 kernel booted from 0x80200000 will waste 2MB of RAM
> > and a RISCV32 kernel booted from 0x80400000 will waste 4MB of RAM.
> >
> > This patch re-writes the initial pagetable setup code to allow booting
> > RISCV32 and RISCV64 kernels from any 4KB (i.e. PAGE_SIZE) aligned address.
> >
> > To achieve this:
> > 1. We add kconfig option BOOT_PAGE_ALIGNED. When it is enabled we use
> >    4KB mappings in initial page table setup, otherwise we use 2MB/4MB
> >    mappings.
> > 2. We map kernel and dtb (few MBs) in setup_vm() (called from head.S)
> > 3. Once we reach paging_init() (called from setup_arch()) after
> >    memblock setup, we map all available memory banks.
> >
> > With this patch in-place, the booting constraint for RISCV32 and RISCV64
> > kernels is much more relaxed when CONFIG_BOOT_PAGE_ALIGNED=y and we can
> > now boot the kernel very close to the RAM start, thereby minimizing
> > memory wastage.
>
> I have no general objection, but I presume the patch will be significantly
> simplified if the addition of 4K pages support will follow the removal of
> the trampoline_pg_dir.
>
> That said, I didn't look into the details, since they will change
> substantially, only some comments on the Kconfig part.
>
> On the high level, have you considered using large pages in setup_vm() and
> then remapping everything with 4K pages in setup_vm_final()? This might
> save you the whole ops-> churn.

Yes, it can save the ops churn in setup_vm(), but we should let setup_vm()
choose the best possible mapping size based on load address alignment.
Hence, it's better to have common page table setup code, shared between
early and final page table setup, which can create mappings of any size.

For example, if we are booted from a 1G aligned load address then setup_vm()
should create 1G mappings, thereby getting better performance compared to
2M mappings.

Regards,
Anup
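The alignment-based selection Anup describes boils down to picking the largest mapping size whose natural alignment the load address satisfies. A minimal sketch (the constants assume Sv39 sizes — 2M PMD, 1G PGD — and best_map_size() is an illustrative helper, not the series' actual code):

```c
#include <stdint.h>

#define MAP_4K 0x1000UL       /* PAGE_SIZE */
#define MAP_2M 0x200000UL     /* PMD_SIZE with Sv39 */
#define MAP_1G 0x40000000UL   /* PGDIR_SIZE with Sv39 */

/* Largest mapping size whose natural alignment the load address satisfies. */
uintptr_t best_map_size(uintptr_t load_pa)
{
	if (!(load_pa & (MAP_1G - 1)))
		return MAP_1G;
	if (!(load_pa & (MAP_2M - 1)))
		return MAP_2M;
	return MAP_4K;
}
```

So a kernel loaded at 0x80000000 could be mapped with 1G pages, one loaded at 0x80200000 with 2M pages, and one loaded at 0x80020000 would fall back to 4K pages.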
* [PATCH v2 4/5] RISC-V: Remove redundant trampoline page table
  2019-03-21 9:47 [PATCH v2 0/5] Boot RISC-V kernel from any 4KB aligned address Anup Patel
  ` (2 preceding siblings ...)
  2019-03-21 9:47 ` [PATCH v2 3/5] RISC-V: Allow booting kernel from any 4KB aligned address Anup Patel
@ 2019-03-21 9:47 ` Anup Patel
  2019-03-22 13:33 ` Christoph Hellwig
  2019-03-21 9:47 ` [PATCH v2 5/5] RISC-V: Fix memory reservation in setup_bootmem() Anup Patel
  4 siblings, 1 reply; 16+ messages in thread
From: Anup Patel @ 2019-03-21 9:47 UTC (permalink / raw)
To: Palmer Dabbelt, Albert Ou
Cc: Anup Patel, linux-kernel, Mike Rapoport, Christoph Hellwig, Atish Patra, Paul Walmsley, linux-riscv

The trampoline page table is redundant because:

1. There is no mapping in the trampoline page table which is not covered by
the swapper page table.

2. The relocate() in head.S will first load the trampoline page table and
after that it will load the swapper page table. The same thing can be
achieved by loading the swapper page table straight away.

3. The trampoline page table is in the init section. relocate() will break
after the trampoline page table has been freed by the kernel. This also
means runtime HART hotplug will not work correctly due to broken relocate()
after the kernel has booted.

Due to the above, this patch removes the trampoline page table and related
code from kernel/head.S and mm/init.c.
Signed-off-by: Anup Patel <anup.patel@wdc.com> --- arch/riscv/kernel/head.S | 13 ++----- arch/riscv/mm/init.c | 79 ++++++++-------------------------------- 2 files changed, 19 insertions(+), 73 deletions(-) diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S index 12a3ec5eb8ab..94f424e2038d 100644 --- a/arch/riscv/kernel/head.S +++ b/arch/riscv/kernel/head.S @@ -93,21 +93,19 @@ relocate: add a0, a0, a1 csrw stvec, a0 - /* Compute satp for kernel page tables, but don't load it yet */ + /* Compute satp for kernel page directory, but don't load it yet */ la a2, swapper_pg_dir srl a2, a2, PAGE_SHIFT li a1, SATP_MODE or a2, a2, a1 /* - * Load trampoline page directory, which will cause us to trap to + * Load kernel page directory, which will cause us to trap to * stvec if VA != PA, or simply fall through if VA == PA */ - la a0, trampoline_pg_dir - srl a0, a0, PAGE_SHIFT - or a0, a0, a1 sfence.vma - csrw sptbr, a0 + csrw sptbr, a2 + .align 2 1: /* Set trap vector to spin forever to help debug */ @@ -120,9 +118,6 @@ relocate: la gp, __global_pointer$ .option pop - /* Switch to kernel page tables */ - csrw sptbr, a2 - ret .Lsecondary_start: diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index c389fbfeccd8..2e2f2567964c 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -132,7 +132,6 @@ void __init setup_bootmem(void) #define MAX_EARLY_MAPPING_SIZE SZ_128M pgd_t swapper_pg_dir[PTRS_PER_PGD] __page_aligned_bss; -pgd_t trampoline_pg_dir[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE); #ifndef __PAGETABLE_PMD_FOLDED #if MAX_EARLY_MAPPING_SIZE < PGDIR_SIZE @@ -140,21 +139,14 @@ pgd_t trampoline_pg_dir[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE); #else #define NUM_SWAPPER_PMDS (MAX_EARLY_MAPPING_SIZE/PGDIR_SIZE) #endif -#define NUM_TRAMPOLINE_PMDS 1UL pmd_t swapper_pmd[PTRS_PER_PMD*NUM_SWAPPER_PMDS] __page_aligned_bss; -pmd_t trampoline_pmd[PTRS_PER_PMD*NUM_TRAMPOLINE_PMDS] - __initdata __aligned(PAGE_SIZE); pmd_t fixmap_pmd[PTRS_PER_PMD] 
__page_aligned_bss; #define NUM_SWAPPER_PTES (MAX_EARLY_MAPPING_SIZE/PMD_SIZE) #else #define NUM_SWAPPER_PTES (MAX_EARLY_MAPPING_SIZE/PGDIR_SIZE) #endif -#define NUM_TRAMPOLINE_PTES 1UL - pte_t swapper_pte[PTRS_PER_PTE*NUM_SWAPPER_PTES] __page_aligned_bss; -pte_t trampoline_pte[PTRS_PER_PTE*NUM_TRAMPOLINE_PTES] - __initdata __aligned(PAGE_SIZE); pte_t fixmap_pte[PTRS_PER_PTE] __page_aligned_bss; uintptr_t map_size; @@ -214,19 +206,7 @@ static pte_t *__init final_get_pte_virt(phys_addr_t pa) return (pte_t *)set_fixmap_offset(FIX_PTE, pa); } -static phys_addr_t __init early_alloc_trampoline_pte(uintptr_t va, - uintptr_t load_pa) -{ - pte_t *base = __load_va(trampoline_pte, load_pa); - uintptr_t pte_num = ((va - PAGE_OFFSET) >> PMD_SHIFT); - - BUG_ON(pte_num >= NUM_TRAMPOLINE_PTES); - - return (uintptr_t)&base[pte_num * PTRS_PER_PTE]; -} - -static phys_addr_t __init early_alloc_swapper_pte(uintptr_t va, - uintptr_t load_pa) +static phys_addr_t __init early_alloc_pte(uintptr_t va, uintptr_t load_pa) { pte_t *base = __load_va(swapper_pte, load_pa); uintptr_t pte_num = ((va - PAGE_OFFSET) >> PMD_SHIFT); @@ -266,19 +246,7 @@ static pmd_t *__init final_get_pmd_virt(phys_addr_t pa) return (pmd_t *)set_fixmap_offset(FIX_PMD, pa); } -static phys_addr_t __init early_alloc_trampoline_pmd(uintptr_t va, - uintptr_t load_pa) -{ - pmd_t *base = __load_va(trampoline_pmd, load_pa); - uintptr_t pmd_num = (va - PAGE_OFFSET) >> PGDIR_SHIFT; - - BUG_ON(pmd_num >= NUM_TRAMPOLINE_PMDS); - - return (uintptr_t)&base[pmd_num * PTRS_PER_PMD]; -} - -static phys_addr_t __init early_alloc_swapper_pmd(uintptr_t va, - uintptr_t load_pa) +static phys_addr_t __init early_alloc_pmd(uintptr_t va, uintptr_t load_pa) { pmd_t *base = __load_va(swapper_pmd, load_pa); uintptr_t pmd_num = (va - PAGE_OFFSET) >> PGDIR_SHIFT; @@ -418,7 +386,7 @@ asmlinkage void __init setup_vm(uintptr_t load_pa, uintptr_t dtb_pa) uintptr_t load_sz = __load_pa(&_end, load_pa) - load_pa; pgprot_t tableprot = 
__pgprot(_PAGE_TABLE); pgprot_t prot = __pgprot(pgprot_val(PAGE_KERNEL) | _PAGE_EXEC); - struct mapping_ops tramp_ops, swap_ops; + struct mapping_ops ops; va_pa_offset = PAGE_OFFSET - load_pa; pfn_base = PFN_DOWN(load_pa); @@ -429,51 +397,34 @@ asmlinkage void __init setup_vm(uintptr_t load_pa, uintptr_t dtb_pa) BUG_ON((load_pa % map_size) != 0); BUG_ON(load_sz > MAX_EARLY_MAPPING_SIZE); - /* Setup trampoline mapping ops */ - tramp_ops.get_pte_virt = __load_va(early_get_pte_virt, load_pa); - tramp_ops.alloc_pte = __load_va(early_alloc_trampoline_pte, load_pa); - tramp_ops.get_pmd_virt = NULL; - tramp_ops.alloc_pmd = NULL; - - /* Setup swapper mapping ops */ - swap_ops.get_pte_virt = __load_va(early_get_pte_virt, load_pa); - swap_ops.alloc_pte = __load_va(early_alloc_swapper_pte, load_pa); - swap_ops.get_pmd_virt = NULL; - swap_ops.alloc_pmd = NULL; + /* Setup mapping ops */ + ops.get_pte_virt = __load_va(early_get_pte_virt, load_pa); + ops.alloc_pte = __load_va(early_alloc_pte, load_pa); + ops.get_pmd_virt = NULL; + ops.alloc_pmd = NULL; #ifndef __PAGETABLE_PMD_FOLDED - /* Update trampoline mapping ops for PMD */ - tramp_ops.get_pmd_virt = __load_va(early_get_pmd_virt, load_pa); - tramp_ops.alloc_pmd = __load_va(early_alloc_trampoline_pmd, load_pa); - - /* Update swapper mapping ops for PMD */ - swap_ops.get_pmd_virt = __load_va(early_get_pmd_virt, load_pa); - swap_ops.alloc_pmd = __load_va(early_alloc_swapper_pmd, load_pa); + /* Update mapping ops for PMD */ + ops.get_pmd_virt = __load_va(early_get_pmd_virt, load_pa); + ops.alloc_pmd = __load_va(early_alloc_pmd, load_pa); /* Setup swapper PGD and PMD for fixmap */ map_pa = __load_pa(fixmap_pmd, load_pa); create_pgd_mapping(__load_va(swapper_pg_dir, load_pa), FIXADDR_START, map_pa, PGDIR_SIZE, tableprot, - load_pa, &swap_ops); + load_pa, &ops); map_pa = __load_pa(fixmap_pte, load_pa); create_pmd_mapping(__load_va(fixmap_pmd, load_pa), FIXADDR_START, map_pa, PMD_SIZE, tableprot, - load_pa, &swap_ops); + load_pa, 
&ops); #else /* Setup swapper PGD for fixmap */ map_pa = __load_pa(fixmap_pte, load_pa); create_pgd_mapping(__load_va(swapper_pg_dir, load_pa), FIXADDR_START, map_pa, PGDIR_SIZE, tableprot, - load_pa, &swap_ops); + load_pa, &ops); #endif - /* Setup trampoling PGD covering first few MBs of kernel */ - end_va = PAGE_OFFSET + PAGE_SIZE*PTRS_PER_PTE; - for (va = PAGE_OFFSET; va < end_va; va += map_size) - create_pgd_mapping(__load_va(trampoline_pg_dir, load_pa), - va, load_pa + (va - PAGE_OFFSET), - map_size, prot, load_pa, &tramp_ops); - /* * Setup swapper PGD covering entire kernel which will allows * us to reach paging_init(). We map all memory banks later in @@ -483,7 +434,7 @@ asmlinkage void __init setup_vm(uintptr_t load_pa, uintptr_t dtb_pa) for (va = PAGE_OFFSET; va < end_va; va += map_size) create_pgd_mapping(__load_va(swapper_pg_dir, load_pa), va, load_pa + (va - PAGE_OFFSET), - map_size, prot, load_pa, &swap_ops); + map_size, prot, load_pa, &ops); /* Create fixed mapping for early parsing of FDT */ end_va = __fix_to_virt(FIX_FDT) + FIX_FDT_SIZE; -- 2.17.1 _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH v2 4/5] RISC-V: Remove redundant trampoline page table
  2019-03-21 9:47 ` [PATCH v2 4/5] RISC-V: Remove redundant trampoline page table Anup Patel
@ 2019-03-22 13:33 ` Christoph Hellwig
  2019-03-25  4:17 ` Anup Patel
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2019-03-22 13:33 UTC (permalink / raw)
To: Anup Patel
Cc: Albert Ou, Palmer Dabbelt, linux-kernel, Mike Rapoport, Christoph Hellwig, Atish Patra, Paul Walmsley, linux-riscv

> - /* Compute satp for kernel page tables, but don't load it yet */
> + /* Compute satp for kernel page directory, but don't load it yet */

> /*
> - * Load trampoline page directory, which will cause us to trap to
> + * Load kernel page directory, which will cause us to trap to
> * stvec if VA != PA, or simply fall through if VA == PA
> */

If we want to nitpick comments, I think this should talk about the
page table root or something like that.

Otherwise the idea looks good, but I really think we should do this
before all the changes to the setup_vm code.
* Re: [PATCH v2 4/5] RISC-V: Remove redundant trampoline page table
  2019-03-22 13:33 ` Christoph Hellwig
@ 2019-03-25  4:17 ` Anup Patel
  0 siblings, 0 replies; 16+ messages in thread
From: Anup Patel @ 2019-03-25 4:17 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Palmer Dabbelt, Anup Patel, linux-kernel, Mike Rapoport, Atish Patra, Albert Ou, Paul Walmsley, linux-riscv

On Fri, Mar 22, 2019 at 7:03 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> > - /* Compute satp for kernel page tables, but don't load it yet */
> > + /* Compute satp for kernel page directory, but don't load it yet */
> >
> > /*
> > - * Load trampoline page directory, which will cause us to trap to
> > + * Load kernel page directory, which will cause us to trap to
> > * stvec if VA != PA, or simply fall through if VA == PA
> > */
>
> If we want to nitpick comments, I think this should talk about the
> page table root or something like that.

Okay, I will update the comments.

> Otherwise the idea looks good, but I really think we should do this
> before all the changes to the setup_vm code.

Sure, I will move it before the setup_vm code so that the setup_vm code is
further simplified.

Regards,
Anup
* [PATCH v2 5/5] RISC-V: Fix memory reservation in setup_bootmem() 2019-03-21 9:47 [PATCH v2 0/5] Boot RISC-V kernel from any 4KB aligned address Anup Patel ` (3 preceding siblings ...) 2019-03-21 9:47 ` [PATCH v2 4/5] RISC-V: Remove redundant trampoline page table Anup Patel @ 2019-03-21 9:47 ` Anup Patel 2019-03-22 13:31 ` Christoph Hellwig 2019-03-23 15:44 ` Mike Rapoport 4 siblings, 2 replies; 16+ messages in thread From: Anup Patel @ 2019-03-21 9:47 UTC (permalink / raw) To: Palmer Dabbelt, Albert Ou Cc: Anup Patel, linux-kernel, Mike Rapoport, Christoph Hellwig, Atish Patra, Paul Walmsley, linux-riscv Currently, the setup_bootmem() reserves memory from RAM start to the kernel end. This prevents us from exploring ways to use the RAM below (or before) the kernel start hence this patch updates setup_bootmem() to only reserve memory from the kernel start to the kernel end. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Anup Patel <anup.patel@wdc.com> --- arch/riscv/mm/init.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index 2e2f2567964c..99b42380d17d 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -18,6 +18,8 @@ #include <asm/pgtable.h> #include <asm/io.h> +extern char _start[]; + static void __init zone_sizes_init(void) { unsigned long max_zone_pfns[MAX_NR_ZONES] = { 0, }; @@ -93,15 +95,17 @@ void __init setup_bootmem(void) /* Find the memory region containing the kernel */ for_each_memblock(memory, reg) { - phys_addr_t vmlinux_end = __pa(_end); + phys_addr_t vmlinux_end = __pa(&_end); + phys_addr_t vmlinux_start = __pa(&_start); phys_addr_t end = reg->base + reg->size; if (reg->base <= vmlinux_end && vmlinux_end <= end) { /* - * Reserve from the start of the region to the end of + * Reserve from the start of the kernel to the end of * the kernel */ - memblock_reserve(reg->base, vmlinux_end - reg->base); + memblock_reserve(vmlinux_start, + vmlinux_end 
- vmlinux_start); mem_size = min(reg->size, (phys_addr_t)-PAGE_OFFSET); } } @@ -177,7 +181,6 @@ struct mapping_ops { static inline void *__load_addr(void *ptr, uintptr_t load_pa) { - extern char _start; uintptr_t va = (uintptr_t)ptr; uintptr_t sz = (uintptr_t)(&_end) - (uintptr_t)(&_start); -- 2.17.1 _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH v2 5/5] RISC-V: Fix memory reservation in setup_bootmem()
  2019-03-21 9:47 ` [PATCH v2 5/5] RISC-V: Fix memory reservation in setup_bootmem() Anup Patel
@ 2019-03-22 13:31 ` Christoph Hellwig
  2019-03-23 15:44 ` Mike Rapoport
  1 sibling, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2019-03-22 13:31 UTC (permalink / raw)
To: Anup Patel
Cc: Albert Ou, Palmer Dabbelt, linux-kernel, Mike Rapoport, Christoph Hellwig, Atish Patra, Paul Walmsley, linux-riscv

Looks good. Please move it to the front of the series.

Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH v2 5/5] RISC-V: Fix memory reservation in setup_bootmem() 2019-03-21 9:47 ` [PATCH v2 5/5] RISC-V: Fix memory reservation in setup_bootmem() Anup Patel 2019-03-22 13:31 ` Christoph Hellwig @ 2019-03-23 15:44 ` Mike Rapoport 1 sibling, 0 replies; 16+ messages in thread From: Mike Rapoport @ 2019-03-23 15:44 UTC (permalink / raw) To: Anup Patel Cc: Albert Ou, Palmer Dabbelt, linux-kernel, Christoph Hellwig, Atish Patra, Paul Walmsley, linux-riscv On Thu, Mar 21, 2019 at 09:47:58AM +0000, Anup Patel wrote: > Currently, the setup_bootmem() reserves memory from RAM start to the > kernel end. This prevents us from exploring ways to use the RAM below > (or before) the kernel start hence this patch updates setup_bootmem() > to only reserve memory from the kernel start to the kernel end. > > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> > Signed-off-by: Anup Patel <anup.patel@wdc.com> > --- > arch/riscv/mm/init.c | 11 +++++++---- > 1 file changed, 7 insertions(+), 4 deletions(-) > > diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c > index 2e2f2567964c..99b42380d17d 100644 > --- a/arch/riscv/mm/init.c > +++ b/arch/riscv/mm/init.c > @@ -18,6 +18,8 @@ > #include <asm/pgtable.h> > #include <asm/io.h> > > +extern char _start[]; > + > static void __init zone_sizes_init(void) > { > unsigned long max_zone_pfns[MAX_NR_ZONES] = { 0, }; > @@ -93,15 +95,17 @@ void __init setup_bootmem(void) > > /* Find the memory region containing the kernel */ > for_each_memblock(memory, reg) { > - phys_addr_t vmlinux_end = __pa(_end); > + phys_addr_t vmlinux_end = __pa(&_end); > + phys_addr_t vmlinux_start = __pa(&_start); > phys_addr_t end = reg->base + reg->size; > > if (reg->base <= vmlinux_end && vmlinux_end <= end) { > /* > - * Reserve from the start of the region to the end of > + * Reserve from the start of the kernel to the end of > * the kernel > */ > - memblock_reserve(reg->base, vmlinux_end - reg->base); > + memblock_reserve(vmlinux_start, > + vmlinux_end - 
vmlinux_start); Sorry for misleading you here, but this can be done outside the loop as well as the calculation of vmlinux_{start,end}. > mem_size = min(reg->size, (phys_addr_t)-PAGE_OFFSET); > } > } > @@ -177,7 +181,6 @@ struct mapping_ops { > > static inline void *__load_addr(void *ptr, uintptr_t load_pa) > { > - extern char _start; > uintptr_t va = (uintptr_t)ptr; > uintptr_t sz = (uintptr_t)(&_end) - (uintptr_t)(&_start); > > -- > 2.17.1 > -- Sincerely yours, Mike. _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 16+ messages in thread
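Mike's suggestion — hoisting the kernel reservation out of the memblock loop — could look roughly like the sketch below. The region array and both helpers are hypothetical stand-ins (memblock regions modeled as plain structs, vmlinux_start/vmlinux_end standing in for __pa(&_start)/__pa(&_end)); this is not the actual follow-up patch:

```c
#include <stdint.h>
#include <stddef.h>

struct region { uintptr_t base, size; };

/*
 * The kernel image span does not depend on which memory region we are
 * looking at, so the reservation can be computed once, outside any loop.
 */
struct region kernel_reservation(uintptr_t vmlinux_start, uintptr_t vmlinux_end)
{
	struct region r = { vmlinux_start, vmlinux_end - vmlinux_start };
	return r;
}

/* The loop then only has to find the region containing the kernel. */
const struct region *find_kernel_region(const struct region *regs, size_t n,
					uintptr_t vmlinux_end)
{
	for (size_t i = 0; i < n; i++)
		if (regs[i].base <= vmlinux_end &&
		    vmlinux_end <= regs[i].base + regs[i].size)
			return &regs[i];
	return NULL;
}
```

This keeps the per-region work (finding mem_size) separate from the one-off reservation of [vmlinux_start, vmlinux_end), matching the intent of the review comment.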