linux-kernel.vger.kernel.org archive mirror
* [PATCH v5 0/3] Move kernel mapping outside the linear mapping
@ 2021-04-11 16:41 Alexandre Ghiti
  2021-04-11 16:41 ` [PATCH v5 1/3] riscv: Move kernel mapping outside of " Alexandre Ghiti
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Alexandre Ghiti @ 2021-04-11 16:41 UTC
  To: Jonathan Corbet, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Arnd Bergmann, Andrey Ryabinin, Alexander Potapenko,
	Dmitry Vyukov, linux-doc, linux-riscv, linux-kernel, kasan-dev,
	linux-arch, linux-mm
  Cc: Alexandre Ghiti

I decided to split the sv48 support into smaller series to ease review.

This patchset moves the kernel mapping (along with modules and BPF) to the
last 4GB of the 64-bit address space. This allows us to:
- implement a relocatable kernel (to come later in another patchset),
  which requires moving the kernel mapping out of the linear mapping to
  avoid copying the kernel to a different physical address.
- have a single non-relocatable kernel (which avoids the performance
  penalty imposed by a PIC kernel) for both sv39 and sv48.
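
To illustrate the "single kernel for both sv39 and sv48" point, here is a
minimal standalone sketch (not kernel code; is_canonical() and
LAST_4GB_START are hypothetical names used only for this example). It
checks whether a virtual address is valid for a given virtual address
width, i.e. whether all bits above the top implemented bit are copies of
that bit. The start of the last 4GB passes the check for both 39 and 48
bits, which is why a kernel linked there does not need to know the mode.

```c
#include <stdbool.h>
#include <stdint.h>

/* Start of the last 4GB of the 64-bit address space (illustrative name). */
#define LAST_4GB_START	0xffffffff00000000ULL

/*
 * Return true if bits 63..(va_bits - 1) of va are all equal, which is the
 * validity rule for a va_bits-wide virtual address (39 for sv39, 48 for
 * sv48). va_bits is assumed to be in the range 1..64.
 */
static bool is_canonical(uint64_t va, unsigned int va_bits)
{
	/* Keep bit (va_bits - 1) and everything above it. */
	uint64_t top = va >> (va_bits - 1);
	uint64_t all_ones = UINT64_MAX >> (va_bits - 1);

	return top == 0 || top == all_ones;
}
```

For example, is_canonical(LAST_4GB_START, 39) and
is_canonical(LAST_4GB_START, 48) both hold, while 0x0000004000000000 is
valid under 48 bits but falls into the hole under 39 bits.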

The first patch implements this behaviour, the second adds documentation
describing the virtual address space layout of the 64-bit kernel, and the
last is taken from my sv48 series, to which I simply added the dump of the
modules/kernel/BPF mapping.

I removed the Reviewed-by on the first patch since it changed enough from
last time and deserves a second look.

Changes in v5:
- Fix 32BIT build that failed because MODULES_VADDR does not exist, as
  modules lie in the vmalloc zone in 32BIT; reported by kernel test
  robot.

Changes in v4:
- Fix BUILTIN_DTB: we used __va to obtain the virtual address of the
  builtin DTB, which returns a linear mapping address, and then used
  this address before setup_vm_final installs the linear mapping. This
  is no longer possible since the kernel no longer lies inside the
  linear mapping.

Changes in v3:
- Fix broken nommu build as reported by kernel test robot by protecting
  the kernel mapping only in 64BIT and MMU configs, by reverting the
  introduction of load_sz_pmd and by not exporting load_sz/load_pa anymore
  since they were not initialized in nommu config. 

Changes in v2:
- Fix documentation about direct mapping size which is 124GB instead
  of 126GB.
- Fix SPDX missing header in documentation.
- Fix another checkpatch warning about EXPORT_SYMBOL which was not
  directly below variable declaration.
 
Alexandre Ghiti (3):
  riscv: Move kernel mapping outside of linear mapping
  Documentation: riscv: Add documentation that describes the VM layout
  riscv: Prepare ptdump for vm layout dynamic addresses

 Documentation/riscv/index.rst       |  1 +
 Documentation/riscv/vm-layout.rst   | 63 +++++++++++++++++++++
 arch/riscv/boot/loader.lds.S        |  3 +-
 arch/riscv/include/asm/page.h       | 17 +++++-
 arch/riscv/include/asm/pgtable.h    | 37 ++++++++----
 arch/riscv/include/asm/set_memory.h |  1 +
 arch/riscv/kernel/head.S            |  3 +-
 arch/riscv/kernel/module.c          |  6 +-
 arch/riscv/kernel/setup.c           |  5 ++
 arch/riscv/kernel/vmlinux.lds.S     |  3 +-
 arch/riscv/mm/fault.c               | 13 +++++
 arch/riscv/mm/init.c                | 87 ++++++++++++++++++++++-------
 arch/riscv/mm/kasan_init.c          |  9 +++
 arch/riscv/mm/physaddr.c            |  2 +-
 arch/riscv/mm/ptdump.c              | 73 ++++++++++++++++++++----
 15 files changed, 271 insertions(+), 52 deletions(-)
 create mode 100644 Documentation/riscv/vm-layout.rst

-- 
2.20.1



* [PATCH v5 1/3] riscv: Move kernel mapping outside of linear mapping
  2021-04-11 16:41 [PATCH v5 0/3] Move kernel mapping outside the linear mapping Alexandre Ghiti
@ 2021-04-11 16:41 ` Alexandre Ghiti
  2021-04-15  4:20   ` Palmer Dabbelt
  2021-04-11 16:41 ` [PATCH v5 2/3] Documentation: riscv: Add documentation that describes the VM layout Alexandre Ghiti
  2021-04-11 16:41 ` [PATCH v5 3/3] riscv: Prepare ptdump for vm layout dynamic addresses Alexandre Ghiti
  2 siblings, 1 reply; 17+ messages in thread
From: Alexandre Ghiti @ 2021-04-11 16:41 UTC
  To: Jonathan Corbet, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Arnd Bergmann, Andrey Ryabinin, Alexander Potapenko,
	Dmitry Vyukov, linux-doc, linux-riscv, linux-kernel, kasan-dev,
	linux-arch, linux-mm
  Cc: Alexandre Ghiti

This is a preparatory patch for relocatable kernel and sv48 support.

The kernel used to be linked at the PAGE_OFFSET address, so we could use
the linear mapping for the kernel mapping. But the relocated kernel base
address will differ from PAGE_OFFSET and, since two different virtual
addresses in the linear mapping cannot point to the same physical address,
the kernel mapping needs to lie outside the linear mapping so that we
don't have to copy the kernel to the same physical offset.
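
As a rough userspace sketch of the resulting address translation, modelled
on the __va_to_pa_nodebug() change in this patch: once the kernel has its
own mapping, a virtual address is translated with one of two offsets,
depending on whether it lies below or above the start of the kernel
mapping. The struct and function names below (vm_offsets, make_offsets,
va_to_pa) are hypothetical, and any addresses passed in are illustrative;
in the kernel the offsets are computed in setup_vm().

```c
#include <stdint.h>

/* Illustrative stand-ins for the kernel's global offsets. */
struct vm_offsets {
	uint64_t kernel_virt_addr;	/* start of the kernel mapping */
	uint64_t va_pa_offset;		/* linear mapping: VA minus PA */
	uint64_t va_kernel_pa_offset;	/* kernel mapping: VA minus PA */
};

/* Derive both offsets from the physical address the kernel is loaded at. */
static struct vm_offsets make_offsets(uint64_t page_offset,
				      uint64_t kernel_virt_addr,
				      uint64_t load_pa)
{
	struct vm_offsets o = {
		.kernel_virt_addr = kernel_virt_addr,
		.va_pa_offset = page_offset - load_pa,
		.va_kernel_pa_offset = kernel_virt_addr - load_pa,
	};

	return o;
}

/* Addresses below the kernel mapping belong to the linear mapping. */
static uint64_t va_to_pa(const struct vm_offsets *o, uint64_t va)
{
	if (va < o->kernel_virt_addr)
		return va - o->va_pa_offset;

	return va - o->va_kernel_pa_offset;
}
```

With such a scheme, the same physical page of the kernel image is
reachable through two virtual addresses, one in each mapping, and both
translate back to the same physical address.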

The kernel mapping is moved to the last 2GB of the address space. BPF
now always comes after the kernel and modules use the 2GB memory range
right before the kernel, so the BPF and modules regions do not overlap.
A KASLR implementation will simply have to move the kernel within the last
2GB range, taking care to leave enough space for BPF.
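
The layout described above can be sketched as plain arithmetic. This is a
standalone approximation of the KERNEL_LINK_ADDR, MODULES_VADDR,
MODULES_END and BPF_JIT_REGION_* definitions from this patch, not the
kernel code itself: the kernel_layout struct, compute_layout() and the
kernel_size parameter are hypothetical, and a 4KB page size is assumed.

```c
#include <stdint.h>

#define SZ_2G		0x80000000ULL
#define SZ_128M		0x08000000ULL
#define PAGE_SZ		4096ULL		/* assumed page size */

struct kernel_layout {
	uint64_t kernel_start;	/* link address of the kernel */
	uint64_t modules_start;	/* start of the 2GB modules window */
	uint64_t modules_end;	/* modules stop where the kernel starts */
	uint64_t bpf_start;	/* BPF region starts right after the kernel */
	uint64_t bpf_end;
};

static uint64_t page_align_up(uint64_t addr)
{
	return (addr + PAGE_SZ - 1) & ~(PAGE_SZ - 1);
}

/* kernel_size stands in for the distance between _start and _end. */
static struct kernel_layout compute_layout(uint64_t kernel_size)
{
	struct kernel_layout l;
	uint64_t kernel_end;

	/* The last 2GB of the address space. */
	l.kernel_start = UINT64_MAX - SZ_2G + 1;
	kernel_end = page_align_up(l.kernel_start + kernel_size);

	/* Modules: the 2GB ending at the kernel's end, capped at its start. */
	l.modules_start = kernel_end - SZ_2G;
	l.modules_end = l.kernel_start;

	/* BPF: 128MB right after the kernel. */
	l.bpf_start = kernel_end;
	l.bpf_end = kernel_end + SZ_128M;

	return l;
}
```

Since modules_end is never above bpf_start, the two regions cannot
overlap, whatever the kernel size.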

In addition, by moving the kernel to the end of the address space, both
sv39 and sv48 kernels will be exactly the same without needing to be
relocated at runtime.

Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
---
 arch/riscv/boot/loader.lds.S        |  3 +-
 arch/riscv/include/asm/page.h       | 17 +++++-
 arch/riscv/include/asm/pgtable.h    | 37 ++++++++----
 arch/riscv/include/asm/set_memory.h |  1 +
 arch/riscv/kernel/head.S            |  3 +-
 arch/riscv/kernel/module.c          |  6 +-
 arch/riscv/kernel/setup.c           |  5 ++
 arch/riscv/kernel/vmlinux.lds.S     |  3 +-
 arch/riscv/mm/fault.c               | 13 +++++
 arch/riscv/mm/init.c                | 87 ++++++++++++++++++++++-------
 arch/riscv/mm/kasan_init.c          |  9 +++
 arch/riscv/mm/physaddr.c            |  2 +-
 12 files changed, 146 insertions(+), 40 deletions(-)

diff --git a/arch/riscv/boot/loader.lds.S b/arch/riscv/boot/loader.lds.S
index 47a5003c2e28..62d94696a19c 100644
--- a/arch/riscv/boot/loader.lds.S
+++ b/arch/riscv/boot/loader.lds.S
@@ -1,13 +1,14 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 
 #include <asm/page.h>
+#include <asm/pgtable.h>
 
 OUTPUT_ARCH(riscv)
 ENTRY(_start)
 
 SECTIONS
 {
-	. = PAGE_OFFSET;
+	. = KERNEL_LINK_ADDR;
 
 	.payload : {
 		*(.payload)
diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h
index adc9d26f3d75..22cfb2be60dc 100644
--- a/arch/riscv/include/asm/page.h
+++ b/arch/riscv/include/asm/page.h
@@ -90,15 +90,28 @@ typedef struct page *pgtable_t;
 
 #ifdef CONFIG_MMU
 extern unsigned long va_pa_offset;
+extern unsigned long va_kernel_pa_offset;
 extern unsigned long pfn_base;
 #define ARCH_PFN_OFFSET		(pfn_base)
 #else
 #define va_pa_offset		0
+#define va_kernel_pa_offset	0
 #define ARCH_PFN_OFFSET		(PAGE_OFFSET >> PAGE_SHIFT)
 #endif /* CONFIG_MMU */
 
-#define __pa_to_va_nodebug(x)	((void *)((unsigned long) (x) + va_pa_offset))
-#define __va_to_pa_nodebug(x)	((unsigned long)(x) - va_pa_offset)
+extern unsigned long kernel_virt_addr;
+
+#define linear_mapping_pa_to_va(x)	((void *)((unsigned long)(x) + va_pa_offset))
+#define kernel_mapping_pa_to_va(x)	((void *)((unsigned long)(x) + va_kernel_pa_offset))
+#define __pa_to_va_nodebug(x)		linear_mapping_pa_to_va(x)
+
+#define linear_mapping_va_to_pa(x)	((unsigned long)(x) - va_pa_offset)
+#define kernel_mapping_va_to_pa(x)	((unsigned long)(x) - va_kernel_pa_offset)
+#define __va_to_pa_nodebug(x)	({						\
+	unsigned long _x = x;							\
+	(_x < kernel_virt_addr) ?						\
+		linear_mapping_va_to_pa(_x) : kernel_mapping_va_to_pa(_x);	\
+	})
 
 #ifdef CONFIG_DEBUG_VIRTUAL
 extern phys_addr_t __virt_to_phys(unsigned long x);
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index ebf817c1bdf4..80e63a93e903 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -11,23 +11,30 @@
 
 #include <asm/pgtable-bits.h>
 
-#ifndef __ASSEMBLY__
-
-/* Page Upper Directory not used in RISC-V */
-#include <asm-generic/pgtable-nopud.h>
-#include <asm/page.h>
-#include <asm/tlbflush.h>
-#include <linux/mm_types.h>
+#ifndef CONFIG_MMU
+#define KERNEL_LINK_ADDR	PAGE_OFFSET
+#else
 
-#ifdef CONFIG_MMU
+#define ADDRESS_SPACE_END	(UL(-1))
+/*
+ * Leave 2GB for kernel and BPF at the end of the address space
+ */
+#define KERNEL_LINK_ADDR	(ADDRESS_SPACE_END - SZ_2G + 1)
 
 #define VMALLOC_SIZE     (KERN_VIRT_SIZE >> 1)
 #define VMALLOC_END      (PAGE_OFFSET - 1)
 #define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)
 
+/* KASLR should leave at least 128MB for BPF after the kernel */
 #define BPF_JIT_REGION_SIZE	(SZ_128M)
-#define BPF_JIT_REGION_START	(PAGE_OFFSET - BPF_JIT_REGION_SIZE)
-#define BPF_JIT_REGION_END	(VMALLOC_END)
+#define BPF_JIT_REGION_START	PFN_ALIGN((unsigned long)&_end)
+#define BPF_JIT_REGION_END	(BPF_JIT_REGION_START + BPF_JIT_REGION_SIZE)
+
+/* Modules always live before the kernel */
+#ifdef CONFIG_64BIT
+#define MODULES_VADDR	(PFN_ALIGN((unsigned long)&_end) - SZ_2G)
+#define MODULES_END	(PFN_ALIGN((unsigned long)&_start))
+#endif
 
 /*
  * Roughly size the vmemmap space to be large enough to fit enough
@@ -57,9 +64,16 @@
 #define FIXADDR_SIZE     PGDIR_SIZE
 #endif
 #define FIXADDR_START    (FIXADDR_TOP - FIXADDR_SIZE)
-
 #endif
 
+#ifndef __ASSEMBLY__
+
+/* Page Upper Directory not used in RISC-V */
+#include <asm-generic/pgtable-nopud.h>
+#include <asm/page.h>
+#include <asm/tlbflush.h>
+#include <linux/mm_types.h>
+
 #ifdef CONFIG_64BIT
 #include <asm/pgtable-64.h>
 #else
@@ -484,6 +498,7 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
 
 #define kern_addr_valid(addr)   (1) /* FIXME */
 
+extern char _start[];
 extern void *dtb_early_va;
 extern uintptr_t dtb_early_pa;
 void setup_bootmem(void);
diff --git a/arch/riscv/include/asm/set_memory.h b/arch/riscv/include/asm/set_memory.h
index 6887b3d9f371..a9c56776fa0e 100644
--- a/arch/riscv/include/asm/set_memory.h
+++ b/arch/riscv/include/asm/set_memory.h
@@ -17,6 +17,7 @@ int set_memory_x(unsigned long addr, int numpages);
 int set_memory_nx(unsigned long addr, int numpages);
 int set_memory_rw_nx(unsigned long addr, int numpages);
 void protect_kernel_text_data(void);
+void protect_kernel_linear_mapping_text_rodata(void);
 #else
 static inline int set_memory_ro(unsigned long addr, int numpages) { return 0; }
 static inline int set_memory_rw(unsigned long addr, int numpages) { return 0; }
diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index f5a9bad86e58..6cb05f22e52a 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -69,7 +69,8 @@ pe_head_start:
 #ifdef CONFIG_MMU
 relocate:
 	/* Relocate return address */
-	li a1, PAGE_OFFSET
+	la a1, kernel_virt_addr
+	REG_L a1, 0(a1)
 	la a2, _start
 	sub a1, a1, a2
 	add ra, ra, a1
diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
index 104fba889cf7..ce153771e5e9 100644
--- a/arch/riscv/kernel/module.c
+++ b/arch/riscv/kernel/module.c
@@ -408,12 +408,10 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
 }
 
 #if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
-#define VMALLOC_MODULE_START \
-	 max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START)
 void *module_alloc(unsigned long size)
 {
-	return __vmalloc_node_range(size, 1, VMALLOC_MODULE_START,
-				    VMALLOC_END, GFP_KERNEL,
+	return __vmalloc_node_range(size, 1, MODULES_VADDR,
+				    MODULES_END, GFP_KERNEL,
 				    PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
 				    __builtin_return_address(0));
 }
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index e85bacff1b50..30e4af0fd50c 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -265,6 +265,11 @@ void __init setup_arch(char **cmdline_p)
 
 	if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
 		protect_kernel_text_data();
+
+#if defined(CONFIG_64BIT) && defined(CONFIG_MMU)
+	protect_kernel_linear_mapping_text_rodata();
+#endif
+
 #ifdef CONFIG_SWIOTLB
 	swiotlb_init(1);
 #endif
diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
index de03cb22d0e9..0726c05e0336 100644
--- a/arch/riscv/kernel/vmlinux.lds.S
+++ b/arch/riscv/kernel/vmlinux.lds.S
@@ -4,7 +4,8 @@
  * Copyright (C) 2017 SiFive
  */
 
-#define LOAD_OFFSET PAGE_OFFSET
+#include <asm/pgtable.h>
+#define LOAD_OFFSET KERNEL_LINK_ADDR
 #include <asm/vmlinux.lds.h>
 #include <asm/page.h>
 #include <asm/cache.h>
diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index 8f17519208c7..1b14d523a95c 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -231,6 +231,19 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
 		return;
 	}
 
+#ifdef CONFIG_64BIT
+	/*
+	 * Modules in 64bit kernels lie in their own virtual region which is not
+	 * in the vmalloc region, but dealing with page faults in this region
+	 * or the vmalloc region amounts to doing the same thing: checking that
+	 * the mapping exists in init_mm.pgd and updating user page table, so
+	 * just use vmalloc_fault.
+	 */
+	if (unlikely(addr >= MODULES_VADDR && addr < MODULES_END)) {
+		vmalloc_fault(regs, code, addr);
+		return;
+	}
+#endif
 	/* Enable interrupts if they were enabled in the parent context. */
 	if (likely(regs->status & SR_PIE))
 		local_irq_enable();
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 7f5036fbee8c..093f3a96ecfc 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -25,6 +25,9 @@
 
 #include "../kernel/head.h"
 
+unsigned long kernel_virt_addr = KERNEL_LINK_ADDR;
+EXPORT_SYMBOL(kernel_virt_addr);
+
 unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]
 							__page_aligned_bss;
 EXPORT_SYMBOL(empty_zero_page);
@@ -88,6 +91,8 @@ static void print_vm_layout(void)
 		  (unsigned long)VMALLOC_END);
 	print_mlm("lowmem", (unsigned long)PAGE_OFFSET,
 		  (unsigned long)high_memory);
+	print_mlm("kernel", (unsigned long)KERNEL_LINK_ADDR,
+		  (unsigned long)ADDRESS_SPACE_END);
 }
 #else
 static void print_vm_layout(void) { }
@@ -116,8 +121,13 @@ void __init setup_bootmem(void)
 	/* The maximal physical memory size is -PAGE_OFFSET. */
 	memblock_enforce_memory_limit(-PAGE_OFFSET);
 
-	/* Reserve from the start of the kernel to the end of the kernel */
-	memblock_reserve(vmlinux_start, vmlinux_end - vmlinux_start);
+	/*
+	 * Reserve from the start of the kernel to the end of the kernel
+	 * and make sure we align the reservation on PMD_SIZE since we will
+	 * map the kernel in the linear mapping as read-only: we do not want
+	 * any allocation to happen between _end and the next pmd aligned page.
+	 */
+	memblock_reserve(vmlinux_start, (vmlinux_end - vmlinux_start + PMD_SIZE - 1) & PMD_MASK);
 
 	/*
 	 * memblock allocator is not aware of the fact that last 4K bytes of
@@ -152,8 +162,12 @@ void __init setup_bootmem(void)
 #ifdef CONFIG_MMU
 static struct pt_alloc_ops pt_ops;
 
+/* Offset between linear mapping virtual address and kernel load address */
 unsigned long va_pa_offset;
 EXPORT_SYMBOL(va_pa_offset);
+/* Offset between kernel mapping virtual address and kernel load address */
+unsigned long va_kernel_pa_offset;
+EXPORT_SYMBOL(va_kernel_pa_offset);
 unsigned long pfn_base;
 EXPORT_SYMBOL(pfn_base);
 
@@ -257,7 +271,7 @@ static pmd_t *get_pmd_virt_late(phys_addr_t pa)
 
 static phys_addr_t __init alloc_pmd_early(uintptr_t va)
 {
-	BUG_ON((va - PAGE_OFFSET) >> PGDIR_SHIFT);
+	BUG_ON((va - kernel_virt_addr) >> PGDIR_SHIFT);
 
 	return (uintptr_t)early_pmd;
 }
@@ -372,17 +386,32 @@ static uintptr_t __init best_map_size(phys_addr_t base, phys_addr_t size)
 #error "setup_vm() is called from head.S before relocate so it should not use absolute addressing."
 #endif
 
+uintptr_t load_pa, load_sz;
+
+static void __init create_kernel_page_table(pgd_t *pgdir, uintptr_t map_size)
+{
+	uintptr_t va, end_va;
+
+	end_va = kernel_virt_addr + load_sz;
+	for (va = kernel_virt_addr; va < end_va; va += map_size)
+		create_pgd_mapping(pgdir, va,
+				   load_pa + (va - kernel_virt_addr),
+				   map_size, PAGE_KERNEL_EXEC);
+}
+
 asmlinkage void __init setup_vm(uintptr_t dtb_pa)
 {
-	uintptr_t va, pa, end_va;
-	uintptr_t load_pa = (uintptr_t)(&_start);
-	uintptr_t load_sz = (uintptr_t)(&_end) - load_pa;
+	uintptr_t pa;
 	uintptr_t map_size;
 #ifndef __PAGETABLE_PMD_FOLDED
 	pmd_t fix_bmap_spmd, fix_bmap_epmd;
 #endif
+	load_pa = (uintptr_t)(&_start);
+	load_sz = (uintptr_t)(&_end) - load_pa;
 
 	va_pa_offset = PAGE_OFFSET - load_pa;
+	va_kernel_pa_offset = kernel_virt_addr - load_pa;
+
 	pfn_base = PFN_DOWN(load_pa);
 
 	/*
@@ -410,26 +439,22 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
 	create_pmd_mapping(fixmap_pmd, FIXADDR_START,
 			   (uintptr_t)fixmap_pte, PMD_SIZE, PAGE_TABLE);
 	/* Setup trampoline PGD and PMD */
-	create_pgd_mapping(trampoline_pg_dir, PAGE_OFFSET,
+	create_pgd_mapping(trampoline_pg_dir, kernel_virt_addr,
 			   (uintptr_t)trampoline_pmd, PGDIR_SIZE, PAGE_TABLE);
-	create_pmd_mapping(trampoline_pmd, PAGE_OFFSET,
+	create_pmd_mapping(trampoline_pmd, kernel_virt_addr,
 			   load_pa, PMD_SIZE, PAGE_KERNEL_EXEC);
 #else
 	/* Setup trampoline PGD */
-	create_pgd_mapping(trampoline_pg_dir, PAGE_OFFSET,
+	create_pgd_mapping(trampoline_pg_dir, kernel_virt_addr,
 			   load_pa, PGDIR_SIZE, PAGE_KERNEL_EXEC);
 #endif
 
 	/*
-	 * Setup early PGD covering entire kernel which will allows
+	 * Setup early PGD covering entire kernel which will allow
 	 * us to reach paging_init(). We map all memory banks later
 	 * in setup_vm_final() below.
 	 */
-	end_va = PAGE_OFFSET + load_sz;
-	for (va = PAGE_OFFSET; va < end_va; va += map_size)
-		create_pgd_mapping(early_pg_dir, va,
-				   load_pa + (va - PAGE_OFFSET),
-				   map_size, PAGE_KERNEL_EXEC);
+	create_kernel_page_table(early_pg_dir, map_size);
 
 #ifndef __PAGETABLE_PMD_FOLDED
 	/* Setup early PMD for DTB */
@@ -444,7 +469,12 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
 			   pa + PMD_SIZE, PMD_SIZE, PAGE_KERNEL);
 	dtb_early_va = (void *)DTB_EARLY_BASE_VA + (dtb_pa & (PMD_SIZE - 1));
 #else /* CONFIG_BUILTIN_DTB */
-	dtb_early_va = __va(dtb_pa);
+	/*
+	 * __va can't be used since it would return a linear mapping address
+	 * whereas dtb_early_va will be used before setup_vm_final installs
+	 * the linear mapping.
+	 */
+	dtb_early_va = kernel_mapping_pa_to_va(dtb_pa);
 #endif /* CONFIG_BUILTIN_DTB */
 #else
 #ifndef CONFIG_BUILTIN_DTB
@@ -456,7 +486,7 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
 			   pa + PGDIR_SIZE, PGDIR_SIZE, PAGE_KERNEL);
 	dtb_early_va = (void *)DTB_EARLY_BASE_VA + (dtb_pa & (PGDIR_SIZE - 1));
 #else /* CONFIG_BUILTIN_DTB */
-	dtb_early_va = __va(dtb_pa);
+	dtb_early_va = kernel_mapping_pa_to_va(dtb_pa);
 #endif /* CONFIG_BUILTIN_DTB */
 #endif
 	dtb_early_pa = dtb_pa;
@@ -492,6 +522,22 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
 #endif
 }
 
+#ifdef CONFIG_64BIT
+void protect_kernel_linear_mapping_text_rodata(void)
+{
+	unsigned long text_start = (unsigned long)lm_alias(_start);
+	unsigned long init_text_start = (unsigned long)lm_alias(__init_text_begin);
+	unsigned long rodata_start = (unsigned long)lm_alias(__start_rodata);
+	unsigned long data_start = (unsigned long)lm_alias(_data);
+
+	set_memory_ro(text_start, (init_text_start - text_start) >> PAGE_SHIFT);
+	set_memory_nx(text_start, (init_text_start - text_start) >> PAGE_SHIFT);
+
+	set_memory_ro(rodata_start, (data_start - rodata_start) >> PAGE_SHIFT);
+	set_memory_nx(rodata_start, (data_start - rodata_start) >> PAGE_SHIFT);
+}
+#endif
+
 static void __init setup_vm_final(void)
 {
 	uintptr_t va, map_size;
@@ -513,7 +559,7 @@ static void __init setup_vm_final(void)
 			   __pa_symbol(fixmap_pgd_next),
 			   PGDIR_SIZE, PAGE_TABLE);
 
-	/* Map all memory banks */
+	/* Map all memory banks in the linear mapping */
 	for_each_mem_range(i, &start, &end) {
 		if (start >= end)
 			break;
@@ -525,10 +571,13 @@ static void __init setup_vm_final(void)
 		for (pa = start; pa < end; pa += map_size) {
 			va = (uintptr_t)__va(pa);
 			create_pgd_mapping(swapper_pg_dir, va, pa,
-					   map_size, PAGE_KERNEL_EXEC);
+					   map_size, PAGE_KERNEL);
 		}
 	}
 
+	/* Map the kernel */
+	create_kernel_page_table(swapper_pg_dir, PMD_SIZE);
+
 	/* Clear fixmap PTE and PMD mappings */
 	clear_fixmap(FIX_PTE);
 	clear_fixmap(FIX_PMD);
diff --git a/arch/riscv/mm/kasan_init.c b/arch/riscv/mm/kasan_init.c
index 2c39f0386673..28f4d52cf17e 100644
--- a/arch/riscv/mm/kasan_init.c
+++ b/arch/riscv/mm/kasan_init.c
@@ -171,6 +171,10 @@ void __init kasan_init(void)
 	phys_addr_t _start, _end;
 	u64 i;
 
+	/*
+	 * Populate all kernel virtual address space with kasan_early_shadow_page
+	 * except for the linear mapping and the modules/kernel/BPF mapping.
+	 */
 	kasan_populate_early_shadow((void *)KASAN_SHADOW_START,
 				    (void *)kasan_mem_to_shadow((void *)
 								VMEMMAP_END));
@@ -183,6 +187,7 @@ void __init kasan_init(void)
 			(void *)kasan_mem_to_shadow((void *)VMALLOC_START),
 			(void *)kasan_mem_to_shadow((void *)VMALLOC_END));
 
+	/* Populate the linear mapping */
 	for_each_mem_range(i, &_start, &_end) {
 		void *start = (void *)__va(_start);
 		void *end = (void *)__va(_end);
@@ -193,6 +198,10 @@ void __init kasan_init(void)
 		kasan_populate(kasan_mem_to_shadow(start), kasan_mem_to_shadow(end));
 	};
 
+	/* Populate kernel, BPF, modules mapping */
+	kasan_populate(kasan_mem_to_shadow((const void *)MODULES_VADDR),
+		       kasan_mem_to_shadow((const void *)BPF_JIT_REGION_END));
+
 	for (i = 0; i < PTRS_PER_PTE; i++)
 		set_pte(&kasan_early_shadow_pte[i],
 			mk_pte(virt_to_page(kasan_early_shadow_page),
diff --git a/arch/riscv/mm/physaddr.c b/arch/riscv/mm/physaddr.c
index e8e4dcd39fed..35703d5ef5fd 100644
--- a/arch/riscv/mm/physaddr.c
+++ b/arch/riscv/mm/physaddr.c
@@ -23,7 +23,7 @@ EXPORT_SYMBOL(__virt_to_phys);
 
 phys_addr_t __phys_addr_symbol(unsigned long x)
 {
-	unsigned long kernel_start = (unsigned long)PAGE_OFFSET;
+	unsigned long kernel_start = (unsigned long)kernel_virt_addr;
 	unsigned long kernel_end = (unsigned long)_end;
 
 	/*
-- 
2.20.1



* [PATCH v5 2/3] Documentation: riscv: Add documentation that describes the VM layout
  2021-04-11 16:41 [PATCH v5 0/3] Move kernel mapping outside the linear mapping Alexandre Ghiti
  2021-04-11 16:41 ` [PATCH v5 1/3] riscv: Move kernel mapping outside of " Alexandre Ghiti
@ 2021-04-11 16:41 ` Alexandre Ghiti
  2021-04-11 16:41 ` [PATCH v5 3/3] riscv: Prepare ptdump for vm layout dynamic addresses Alexandre Ghiti
  2 siblings, 0 replies; 17+ messages in thread
From: Alexandre Ghiti @ 2021-04-11 16:41 UTC
  To: Jonathan Corbet, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Arnd Bergmann, Andrey Ryabinin, Alexander Potapenko,
	Dmitry Vyukov, linux-doc, linux-riscv, linux-kernel, kasan-dev,
	linux-arch, linux-mm
  Cc: Alexandre Ghiti

This new document presents the RISC-V virtual memory layout and is based
on the x86 one: it describes the limits of the different regions of the
virtual address space.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
---
 Documentation/riscv/index.rst     |  1 +
 Documentation/riscv/vm-layout.rst | 63 +++++++++++++++++++++++++++++++
 2 files changed, 64 insertions(+)
 create mode 100644 Documentation/riscv/vm-layout.rst

diff --git a/Documentation/riscv/index.rst b/Documentation/riscv/index.rst
index 6e6e39482502..ea915c196048 100644
--- a/Documentation/riscv/index.rst
+++ b/Documentation/riscv/index.rst
@@ -6,6 +6,7 @@ RISC-V architecture
     :maxdepth: 1
 
     boot-image-header
+    vm-layout
     pmu
     patch-acceptance
 
diff --git a/Documentation/riscv/vm-layout.rst b/Documentation/riscv/vm-layout.rst
new file mode 100644
index 000000000000..329d32098af4
--- /dev/null
+++ b/Documentation/riscv/vm-layout.rst
@@ -0,0 +1,63 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================================
+Virtual Memory Layout on RISC-V Linux
+=====================================
+
+:Author: Alexandre Ghiti <alex@ghiti.fr>
+:Date: 12 February 2021
+
+This document describes the virtual memory layout used by the RISC-V Linux
+Kernel.
+
+RISC-V Linux Kernel 32bit
+=========================
+
+RISC-V Linux Kernel SV32
+------------------------
+
+TODO
+
+RISC-V Linux Kernel 64bit
+=========================
+
+The RISC-V privileged architecture document states that the 64bit addresses
+"must have bits 63–48 all equal to bit 47, or else a page-fault exception will
+occur.": that splits the virtual address space into 2 halves separated by a very
+big hole, the lower half is where the userspace resides, the upper half is where
+the RISC-V Linux Kernel resides.
+
+RISC-V Linux Kernel SV39
+------------------------
+
+::
+
+  ========================================================================================================================
+      Start addr    |   Offset   |     End addr     |  Size   | VM area description
+  ========================================================================================================================
+                    |            |                  |         |
+   0000000000000000 |    0       | 0000003fffffffff |  256 GB | user-space virtual memory, different per mm
+  __________________|____________|__________________|_________|___________________________________________________________
+                    |            |                  |         |
+   0000004000000000 | +256    GB | ffffffbfffffffff | ~16M TB | ... huge, almost 64 bits wide hole of non-canonical
+                    |            |                  |         |     virtual memory addresses up to the -256 GB
+                    |            |                  |         |     starting offset of kernel mappings.
+  __________________|____________|__________________|_________|___________________________________________________________
+                                                              |
+                                                              | Kernel-space virtual memory, shared between all processes:
+  ____________________________________________________________|___________________________________________________________
+                    |            |                  |         |
+   ffffffc000000000 | -256    GB | ffffffc7ffffffff |   32 GB | kasan
+   ffffffcefee00000 | -196    GB | ffffffcefeffffff |    2 MB | fixmap
+   ffffffceff000000 | -196    GB | ffffffceffffffff |   16 MB | PCI io
+   ffffffcf00000000 | -196    GB | ffffffcfffffffff |    4 GB | vmemmap
+   ffffffd000000000 | -192    GB | ffffffdfffffffff |   64 GB | vmalloc/ioremap space
+   ffffffe000000000 | -128    GB | fffffffeffffffff |  124 GB | direct mapping of all physical memory
+  __________________|____________|__________________|_________|____________________________________________________________
+                                                              |
+                                                              |
+  ____________________________________________________________|____________________________________________________________
+                    |            |                  |         |
+   ffffffff00000000 |   -4    GB | ffffffff7fffffff |    2 GB | modules
+   ffffffff80000000 |   -2    GB | ffffffffffffffff |    2 GB | kernel, BPF
+  __________________|____________|__________________|_________|____________________________________________________________
-- 
2.20.1



* [PATCH v5 3/3] riscv: Prepare ptdump for vm layout dynamic addresses
  2021-04-11 16:41 [PATCH v5 0/3] Move kernel mapping outside the linear mapping Alexandre Ghiti
  2021-04-11 16:41 ` [PATCH v5 1/3] riscv: Move kernel mapping outside of " Alexandre Ghiti
  2021-04-11 16:41 ` [PATCH v5 2/3] Documentation: riscv: Add documentation that describes the VM layout Alexandre Ghiti
@ 2021-04-11 16:41 ` Alexandre Ghiti
  2 siblings, 0 replies; 17+ messages in thread
From: Alexandre Ghiti @ 2021-04-11 16:41 UTC
  To: Jonathan Corbet, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Arnd Bergmann, Andrey Ryabinin, Alexander Potapenko,
	Dmitry Vyukov, linux-doc, linux-riscv, linux-kernel, kasan-dev,
	linux-arch, linux-mm
  Cc: Alexandre Ghiti, Anup Patel

This is a preparatory patch for sv48 support that will introduce
dynamic PAGE_OFFSET.

Dynamic PAGE_OFFSET implies that all zones (vmalloc, vmemmap, fixaddr...)
whose addresses depend on PAGE_OFFSET become dynamic and can't be used
to statically initialize the array used by ptdump to identify the
different zones of the vm layout, so fill in the marker addresses at
runtime in ptdump_init() instead.
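
A reduced, standalone sketch of the pattern, assuming a simplified marker
table with only two zones: the array keeps its names at build time but
gets its addresses from an init function, indexed through an enum that
mirrors the array order. The names marker_idx, markers and init_markers()
are hypothetical; the real table and indices are the ones in the diff
below.

```c
#include <stddef.h>
#include <stdint.h>

struct addr_marker {
	uint64_t start_address;
	const char *name;
};

/* Must be kept in the same order as the markers[] array below. */
enum marker_idx {
	VMALLOC_START_NR,
	PAGE_OFFSET_NR,
	END_OF_SPACE_NR
};

/* Addresses are left at 0 here since they are not build-time constants. */
static struct addr_marker markers[] = {
	{0, "vmalloc() area"},
	{0, "Linear mapping"},
	{UINT64_MAX, NULL},
};

/* Fill in the addresses once the runtime layout is known. */
static void init_markers(uint64_t page_offset, uint64_t vmalloc_size)
{
	markers[VMALLOC_START_NR].start_address = page_offset - vmalloc_size;
	markers[PAGE_OFFSET_NR].start_address = page_offset;
}
```

The enum is what keeps the runtime assignments in sync with the table:
adding a zone means adding one enum entry and one array entry at the same
position.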

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Anup Patel <anup@brainfault.org>
---
 arch/riscv/mm/ptdump.c | 73 +++++++++++++++++++++++++++++++++++-------
 1 file changed, 61 insertions(+), 12 deletions(-)

diff --git a/arch/riscv/mm/ptdump.c b/arch/riscv/mm/ptdump.c
index ace74dec7492..0aba4421115c 100644
--- a/arch/riscv/mm/ptdump.c
+++ b/arch/riscv/mm/ptdump.c
@@ -58,29 +58,56 @@ struct ptd_mm_info {
 	unsigned long end;
 };
 
+enum address_markers_idx {
+#ifdef CONFIG_KASAN
+	KASAN_SHADOW_START_NR,
+	KASAN_SHADOW_END_NR,
+#endif
+	FIXMAP_START_NR,
+	FIXMAP_END_NR,
+	PCI_IO_START_NR,
+	PCI_IO_END_NR,
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+	VMEMMAP_START_NR,
+	VMEMMAP_END_NR,
+#endif
+	VMALLOC_START_NR,
+	VMALLOC_END_NR,
+	PAGE_OFFSET_NR,
+#ifdef CONFIG_64BIT
+	MODULES_MAPPING_NR,
+#endif
+	KERNEL_MAPPING_NR,
+	END_OF_SPACE_NR
+};
+
 static struct addr_marker address_markers[] = {
 #ifdef CONFIG_KASAN
-	{KASAN_SHADOW_START,	"Kasan shadow start"},
-	{KASAN_SHADOW_END,	"Kasan shadow end"},
+	{0, "Kasan shadow start"},
+	{0, "Kasan shadow end"},
 #endif
-	{FIXADDR_START,		"Fixmap start"},
-	{FIXADDR_TOP,		"Fixmap end"},
-	{PCI_IO_START,		"PCI I/O start"},
-	{PCI_IO_END,		"PCI I/O end"},
+	{0, "Fixmap start"},
+	{0, "Fixmap end"},
+	{0, "PCI I/O start"},
+	{0, "PCI I/O end"},
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
-	{VMEMMAP_START,		"vmemmap start"},
-	{VMEMMAP_END,		"vmemmap end"},
+	{0, "vmemmap start"},
+	{0, "vmemmap end"},
+#endif
+	{0, "vmalloc() area"},
+	{0, "vmalloc() end"},
+	{0, "Linear mapping"},
+#ifdef CONFIG_64BIT
+	{0, "Modules mapping"},
 #endif
-	{VMALLOC_START,		"vmalloc() area"},
-	{VMALLOC_END,		"vmalloc() end"},
-	{PAGE_OFFSET,		"Linear mapping"},
+	{0, "Kernel mapping (kernel, BPF)"},
 	{-1, NULL},
 };
 
 static struct ptd_mm_info kernel_ptd_info = {
 	.mm		= &init_mm,
 	.markers	= address_markers,
-	.base_addr	= KERN_VIRT_START,
+	.base_addr	= 0,
 	.end		= ULONG_MAX,
 };
 
@@ -335,6 +362,28 @@ static int ptdump_init(void)
 {
 	unsigned int i, j;
 
+#ifdef CONFIG_KASAN
+	address_markers[KASAN_SHADOW_START_NR].start_address = KASAN_SHADOW_START;
+	address_markers[KASAN_SHADOW_END_NR].start_address = KASAN_SHADOW_END;
+#endif
+	address_markers[FIXMAP_START_NR].start_address = FIXADDR_START;
+	address_markers[FIXMAP_END_NR].start_address = FIXADDR_TOP;
+	address_markers[PCI_IO_START_NR].start_address = PCI_IO_START;
+	address_markers[PCI_IO_END_NR].start_address = PCI_IO_END;
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+	address_markers[VMEMMAP_START_NR].start_address = VMEMMAP_START;
+	address_markers[VMEMMAP_END_NR].start_address = VMEMMAP_END;
+#endif
+	address_markers[VMALLOC_START_NR].start_address = VMALLOC_START;
+	address_markers[VMALLOC_END_NR].start_address = VMALLOC_END;
+	address_markers[PAGE_OFFSET_NR].start_address = PAGE_OFFSET;
+#ifdef CONFIG_64BIT
+	address_markers[MODULES_MAPPING_NR].start_address = MODULES_VADDR;
+#endif
+	address_markers[KERNEL_MAPPING_NR].start_address = kernel_virt_addr;
+
+	kernel_ptd_info.base_addr = KERN_VIRT_START;
+
 	for (i = 0; i < ARRAY_SIZE(pg_level); i++)
 		for (j = 0; j < ARRAY_SIZE(pte_bits); j++)
 			pg_level[i].mask |= pte_bits[j].mask;
-- 
2.20.1
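
The hunk above zeroes the static address_markers[] entries and fills them
in from ptdump_init(), presumably because values such as kernel_virt_addr
are variables and so cannot appear in a static initializer. A minimal
user-space sketch of that pattern follows. The struct layout, the enum
names and the example addresses are assumptions made for illustration;
they are not taken from the kernel sources.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the ptdump marker type. */
struct addr_marker {
	uint64_t start_address;
	const char *name;
};

/* Hypothetical indices, modelled on the *_NR names used in the patch. */
enum marker_nr {
	LINEAR_MAPPING_NR,
	KERNEL_MAPPING_NR,
	END_OF_SPACE_NR,
};

/*
 * Stand-in for kernel_virt_addr. It is an ordinary variable, so its value
 * is not a constant expression and cannot initialize a static object in C.
 */
uint64_t kernel_virt_addr = UINT64_C(0xffffffff80000000);

/* Placeholders at build time; the real start addresses are set at init. */
static struct addr_marker address_markers[] = {
	[LINEAR_MAPPING_NR] = { 0, "Linear mapping" },
	[KERNEL_MAPPING_NR] = { 0, "Kernel mapping (kernel, BPF)" },
	[END_OF_SPACE_NR]   = { UINT64_MAX, NULL },
};

/* Fill in the values that are only known at run time. */
static void markers_init(void)
{
	/* Assumed example value for PAGE_OFFSET. */
	address_markers[LINEAR_MAPPING_NR].start_address =
		UINT64_C(0xffffffe000000000);
	address_markers[KERNEL_MAPPING_NR].start_address = kernel_virt_addr;
}
```

Calling markers_init() once before the table is walked mirrors what the
patch does at the top of ptdump_init().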



* Re: [PATCH v5 1/3] riscv: Move kernel mapping outside of linear mapping
  2021-04-11 16:41 ` [PATCH v5 1/3] riscv: Move kernel mapping outside of " Alexandre Ghiti
@ 2021-04-15  4:20   ` Palmer Dabbelt
  2021-04-15  4:54     ` Alex Ghiti
  0 siblings, 1 reply; 17+ messages in thread
From: Palmer Dabbelt @ 2021-04-15  4:20 UTC (permalink / raw)
  To: alex
  Cc: corbet, Paul Walmsley, aou, Arnd Bergmann, aryabinin, glider,
	dvyukov, linux-doc, linux-riscv, linux-kernel, kasan-dev,
	linux-arch, linux-mm, alex

On Sun, 11 Apr 2021 09:41:44 PDT (-0700), alex@ghiti.fr wrote:
> This is a preparatory patch for relocatable kernel and sv48 support.
>
> The kernel used to be linked at PAGE_OFFSET address therefore we could use
> the linear mapping for the kernel mapping. But the relocated kernel base
> address will be different from PAGE_OFFSET and since in the linear mapping,
> two different virtual addresses cannot point to the same physical address,
> the kernel mapping needs to lie outside the linear mapping so that we don't
> have to copy it at the same physical offset.
>
> The kernel mapping is moved to the last 2GB of the address space, BPF
> is now always after the kernel and modules use the 2GB memory range right
> before the kernel, so BPF and modules regions do not overlap. KASLR
> implementation will simply have to move the kernel in the last 2GB range
> and just take care of leaving enough space for BPF.
>
> In addition, by moving the kernel to the end of the address space, both
> sv39 and sv48 kernels will be exactly the same without needing to be
> relocated at runtime.
>
> Suggested-by: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
> ---
>  arch/riscv/boot/loader.lds.S        |  3 +-
>  arch/riscv/include/asm/page.h       | 17 +++++-
>  arch/riscv/include/asm/pgtable.h    | 37 ++++++++----
>  arch/riscv/include/asm/set_memory.h |  1 +
>  arch/riscv/kernel/head.S            |  3 +-
>  arch/riscv/kernel/module.c          |  6 +-
>  arch/riscv/kernel/setup.c           |  5 ++
>  arch/riscv/kernel/vmlinux.lds.S     |  3 +-
>  arch/riscv/mm/fault.c               | 13 +++++
>  arch/riscv/mm/init.c                | 87 ++++++++++++++++++++++-------
>  arch/riscv/mm/kasan_init.c          |  9 +++
>  arch/riscv/mm/physaddr.c            |  2 +-
>  12 files changed, 146 insertions(+), 40 deletions(-)
>
> diff --git a/arch/riscv/boot/loader.lds.S b/arch/riscv/boot/loader.lds.S
> index 47a5003c2e28..62d94696a19c 100644
> --- a/arch/riscv/boot/loader.lds.S
> +++ b/arch/riscv/boot/loader.lds.S
> @@ -1,13 +1,14 @@
>  /* SPDX-License-Identifier: GPL-2.0 */
>
>  #include <asm/page.h>
> +#include <asm/pgtable.h>
>
>  OUTPUT_ARCH(riscv)
>  ENTRY(_start)
>
>  SECTIONS
>  {
> -	. = PAGE_OFFSET;
> +	. = KERNEL_LINK_ADDR;
>
>  	.payload : {
>  		*(.payload)
> diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h
> index adc9d26f3d75..22cfb2be60dc 100644
> --- a/arch/riscv/include/asm/page.h
> +++ b/arch/riscv/include/asm/page.h
> @@ -90,15 +90,28 @@ typedef struct page *pgtable_t;
>
>  #ifdef CONFIG_MMU
>  extern unsigned long va_pa_offset;
> +extern unsigned long va_kernel_pa_offset;
>  extern unsigned long pfn_base;
>  #define ARCH_PFN_OFFSET		(pfn_base)
>  #else
>  #define va_pa_offset		0
> +#define va_kernel_pa_offset	0
>  #define ARCH_PFN_OFFSET		(PAGE_OFFSET >> PAGE_SHIFT)
>  #endif /* CONFIG_MMU */
>
> -#define __pa_to_va_nodebug(x)	((void *)((unsigned long) (x) + va_pa_offset))
> -#define __va_to_pa_nodebug(x)	((unsigned long)(x) - va_pa_offset)
> +extern unsigned long kernel_virt_addr;
> +
> +#define linear_mapping_pa_to_va(x)	((void *)((unsigned long)(x) + va_pa_offset))
> +#define kernel_mapping_pa_to_va(x)	((void *)((unsigned long)(x) + va_kernel_pa_offset))
> +#define __pa_to_va_nodebug(x)		linear_mapping_pa_to_va(x)
> +
> +#define linear_mapping_va_to_pa(x)	((unsigned long)(x) - va_pa_offset)
> +#define kernel_mapping_va_to_pa(x)	((unsigned long)(x) - va_kernel_pa_offset)
> +#define __va_to_pa_nodebug(x)	({						\
> +	unsigned long _x = x;							\
> +	(_x < kernel_virt_addr) ?						\
> +		linear_mapping_va_to_pa(_x) : kernel_mapping_va_to_pa(_x);	\
> +	})
>
>  #ifdef CONFIG_DEBUG_VIRTUAL
>  extern phys_addr_t __virt_to_phys(unsigned long x);
> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> index ebf817c1bdf4..80e63a93e903 100644
> --- a/arch/riscv/include/asm/pgtable.h
> +++ b/arch/riscv/include/asm/pgtable.h
> @@ -11,23 +11,30 @@
>
>  #include <asm/pgtable-bits.h>
>
> -#ifndef __ASSEMBLY__
> -
> -/* Page Upper Directory not used in RISC-V */
> -#include <asm-generic/pgtable-nopud.h>
> -#include <asm/page.h>
> -#include <asm/tlbflush.h>
> -#include <linux/mm_types.h>
> +#ifndef CONFIG_MMU
> +#define KERNEL_LINK_ADDR	PAGE_OFFSET
> +#else
>
> -#ifdef CONFIG_MMU
> +#define ADDRESS_SPACE_END	(UL(-1))
> +/*
> + * Leave 2GB for kernel and BPF at the end of the address space
> + */
> +#define KERNEL_LINK_ADDR	(ADDRESS_SPACE_END - SZ_2G + 1)
>
>  #define VMALLOC_SIZE     (KERN_VIRT_SIZE >> 1)
>  #define VMALLOC_END      (PAGE_OFFSET - 1)
>  #define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)
>
> +/* KASLR should leave at least 128MB for BPF after the kernel */
>  #define BPF_JIT_REGION_SIZE	(SZ_128M)
> -#define BPF_JIT_REGION_START	(PAGE_OFFSET - BPF_JIT_REGION_SIZE)
> -#define BPF_JIT_REGION_END	(VMALLOC_END)
> +#define BPF_JIT_REGION_START	PFN_ALIGN((unsigned long)&_end)
> +#define BPF_JIT_REGION_END	(BPF_JIT_REGION_START + BPF_JIT_REGION_SIZE)
> +
> +/* Modules always live before the kernel */
> +#ifdef CONFIG_64BIT
> +#define MODULES_VADDR	(PFN_ALIGN((unsigned long)&_end) - SZ_2G)
> +#define MODULES_END	(PFN_ALIGN((unsigned long)&_start))
> +#endif
>
>  /*
>   * Roughly size the vmemmap space to be large enough to fit enough
> @@ -57,9 +64,16 @@
>  #define FIXADDR_SIZE     PGDIR_SIZE
>  #endif
>  #define FIXADDR_START    (FIXADDR_TOP - FIXADDR_SIZE)
> -
>  #endif
>
> +#ifndef __ASSEMBLY__
> +
> +/* Page Upper Directory not used in RISC-V */
> +#include <asm-generic/pgtable-nopud.h>
> +#include <asm/page.h>
> +#include <asm/tlbflush.h>
> +#include <linux/mm_types.h>
> +
>  #ifdef CONFIG_64BIT
>  #include <asm/pgtable-64.h>
>  #else
> @@ -484,6 +498,7 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
>
>  #define kern_addr_valid(addr)   (1) /* FIXME */
>
> +extern char _start[];
>  extern void *dtb_early_va;
>  extern uintptr_t dtb_early_pa;
>  void setup_bootmem(void);
> diff --git a/arch/riscv/include/asm/set_memory.h b/arch/riscv/include/asm/set_memory.h
> index 6887b3d9f371..a9c56776fa0e 100644
> --- a/arch/riscv/include/asm/set_memory.h
> +++ b/arch/riscv/include/asm/set_memory.h
> @@ -17,6 +17,7 @@ int set_memory_x(unsigned long addr, int numpages);
>  int set_memory_nx(unsigned long addr, int numpages);
>  int set_memory_rw_nx(unsigned long addr, int numpages);
>  void protect_kernel_text_data(void);
> +void protect_kernel_linear_mapping_text_rodata(void);
>  #else
>  static inline int set_memory_ro(unsigned long addr, int numpages) { return 0; }
>  static inline int set_memory_rw(unsigned long addr, int numpages) { return 0; }
> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
> index f5a9bad86e58..6cb05f22e52a 100644
> --- a/arch/riscv/kernel/head.S
> +++ b/arch/riscv/kernel/head.S
> @@ -69,7 +69,8 @@ pe_head_start:
>  #ifdef CONFIG_MMU
>  relocate:
>  	/* Relocate return address */
> -	li a1, PAGE_OFFSET
> +	la a1, kernel_virt_addr
> +	REG_L a1, 0(a1)
>  	la a2, _start
>  	sub a1, a1, a2
>  	add ra, ra, a1
> diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
> index 104fba889cf7..ce153771e5e9 100644
> --- a/arch/riscv/kernel/module.c
> +++ b/arch/riscv/kernel/module.c
> @@ -408,12 +408,10 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
>  }
>
>  #if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
> -#define VMALLOC_MODULE_START \
> -	 max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START)
>  void *module_alloc(unsigned long size)
>  {
> -	return __vmalloc_node_range(size, 1, VMALLOC_MODULE_START,
> -				    VMALLOC_END, GFP_KERNEL,
> +	return __vmalloc_node_range(size, 1, MODULES_VADDR,
> +				    MODULES_END, GFP_KERNEL,
>  				    PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
>  				    __builtin_return_address(0));
>  }
> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> index e85bacff1b50..30e4af0fd50c 100644
> --- a/arch/riscv/kernel/setup.c
> +++ b/arch/riscv/kernel/setup.c
> @@ -265,6 +265,11 @@ void __init setup_arch(char **cmdline_p)
>
>  	if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
>  		protect_kernel_text_data();
> +
> +#if defined(CONFIG_64BIT) && defined(CONFIG_MMU)
> +	protect_kernel_linear_mapping_text_rodata();
> +#endif
> +
>  #ifdef CONFIG_SWIOTLB
>  	swiotlb_init(1);
>  #endif
> diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
> index de03cb22d0e9..0726c05e0336 100644
> --- a/arch/riscv/kernel/vmlinux.lds.S
> +++ b/arch/riscv/kernel/vmlinux.lds.S
> @@ -4,7 +4,8 @@
>   * Copyright (C) 2017 SiFive
>   */
>
> -#define LOAD_OFFSET PAGE_OFFSET
> +#include <asm/pgtable.h>
> +#define LOAD_OFFSET KERNEL_LINK_ADDR
>  #include <asm/vmlinux.lds.h>
>  #include <asm/page.h>
>  #include <asm/cache.h>
> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
> index 8f17519208c7..1b14d523a95c 100644
> --- a/arch/riscv/mm/fault.c
> +++ b/arch/riscv/mm/fault.c
> @@ -231,6 +231,19 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
>  		return;
>  	}
>
> +#ifdef CONFIG_64BIT
> +	/*
> +	 * Modules in 64bit kernels lie in their own virtual region which is not
> +	 * in the vmalloc region, but dealing with page faults in this region
> +	 * or the vmalloc region amounts to doing the same thing: checking that
> +	 * the mapping exists in init_mm.pgd and updating user page table, so
> +	 * just use vmalloc_fault.
> +	 */
> +	if (unlikely(addr >= MODULES_VADDR && addr < MODULES_END)) {
> +		vmalloc_fault(regs, code, addr);
> +		return;
> +	}
> +#endif
>  	/* Enable interrupts if they were enabled in the parent context. */
>  	if (likely(regs->status & SR_PIE))
>  		local_irq_enable();
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index 7f5036fbee8c..093f3a96ecfc 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -25,6 +25,9 @@
>
>  #include "../kernel/head.h"
>
> +unsigned long kernel_virt_addr = KERNEL_LINK_ADDR;
> +EXPORT_SYMBOL(kernel_virt_addr);
> +
>  unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]
>  							__page_aligned_bss;
>  EXPORT_SYMBOL(empty_zero_page);
> @@ -88,6 +91,8 @@ static void print_vm_layout(void)
>  		  (unsigned long)VMALLOC_END);
>  	print_mlm("lowmem", (unsigned long)PAGE_OFFSET,
>  		  (unsigned long)high_memory);
> +	print_mlm("kernel", (unsigned long)KERNEL_LINK_ADDR,
> +		  (unsigned long)ADDRESS_SPACE_END);
>  }
>  #else
>  static void print_vm_layout(void) { }
> @@ -116,8 +121,13 @@ void __init setup_bootmem(void)
>  	/* The maximal physical memory size is -PAGE_OFFSET. */
>  	memblock_enforce_memory_limit(-PAGE_OFFSET);
>
> -	/* Reserve from the start of the kernel to the end of the kernel */
> -	memblock_reserve(vmlinux_start, vmlinux_end - vmlinux_start);
> +	/*
> +	 * Reserve from the start of the kernel to the end of the kernel
> +	 * and make sure we align the reservation on PMD_SIZE since we will
> +	 * map the kernel in the linear mapping as read-only: we do not want
> +	 * any allocation to happen between _end and the next pmd aligned page.
> +	 */
> +	memblock_reserve(vmlinux_start, (vmlinux_end - vmlinux_start + PMD_SIZE - 1) & PMD_MASK);
>
>  	/*
>  	 * memblock allocator is not aware of the fact that last 4K bytes of
> @@ -152,8 +162,12 @@ void __init setup_bootmem(void)
>  #ifdef CONFIG_MMU
>  static struct pt_alloc_ops pt_ops;
>
> +/* Offset between linear mapping virtual address and kernel load address */
>  unsigned long va_pa_offset;
>  EXPORT_SYMBOL(va_pa_offset);
> +/* Offset between kernel mapping virtual address and kernel load address */
> +unsigned long va_kernel_pa_offset;
> +EXPORT_SYMBOL(va_kernel_pa_offset);
>  unsigned long pfn_base;
>  EXPORT_SYMBOL(pfn_base);
>
> @@ -257,7 +271,7 @@ static pmd_t *get_pmd_virt_late(phys_addr_t pa)
>
>  static phys_addr_t __init alloc_pmd_early(uintptr_t va)
>  {
> -	BUG_ON((va - PAGE_OFFSET) >> PGDIR_SHIFT);
> +	BUG_ON((va - kernel_virt_addr) >> PGDIR_SHIFT);
>
>  	return (uintptr_t)early_pmd;
>  }
> @@ -372,17 +386,32 @@ static uintptr_t __init best_map_size(phys_addr_t base, phys_addr_t size)
>  #error "setup_vm() is called from head.S before relocate so it should not use absolute addressing."
>  #endif
>
> +uintptr_t load_pa, load_sz;
> +
> +static void __init create_kernel_page_table(pgd_t *pgdir, uintptr_t map_size)
> +{
> +	uintptr_t va, end_va;
> +
> +	end_va = kernel_virt_addr + load_sz;
> +	for (va = kernel_virt_addr; va < end_va; va += map_size)
> +		create_pgd_mapping(pgdir, va,
> +				   load_pa + (va - kernel_virt_addr),
> +				   map_size, PAGE_KERNEL_EXEC);
> +}
> +
>  asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>  {
> -	uintptr_t va, pa, end_va;
> -	uintptr_t load_pa = (uintptr_t)(&_start);
> -	uintptr_t load_sz = (uintptr_t)(&_end) - load_pa;
> +	uintptr_t pa;
>  	uintptr_t map_size;
>  #ifndef __PAGETABLE_PMD_FOLDED
>  	pmd_t fix_bmap_spmd, fix_bmap_epmd;
>  #endif
> +	load_pa = (uintptr_t)(&_start);
> +	load_sz = (uintptr_t)(&_end) - load_pa;
>
>  	va_pa_offset = PAGE_OFFSET - load_pa;
> +	va_kernel_pa_offset = kernel_virt_addr - load_pa;
> +
>  	pfn_base = PFN_DOWN(load_pa);
>
>  	/*
> @@ -410,26 +439,22 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>  	create_pmd_mapping(fixmap_pmd, FIXADDR_START,
>  			   (uintptr_t)fixmap_pte, PMD_SIZE, PAGE_TABLE);
>  	/* Setup trampoline PGD and PMD */
> -	create_pgd_mapping(trampoline_pg_dir, PAGE_OFFSET,
> +	create_pgd_mapping(trampoline_pg_dir, kernel_virt_addr,
>  			   (uintptr_t)trampoline_pmd, PGDIR_SIZE, PAGE_TABLE);
> -	create_pmd_mapping(trampoline_pmd, PAGE_OFFSET,
> +	create_pmd_mapping(trampoline_pmd, kernel_virt_addr,
>  			   load_pa, PMD_SIZE, PAGE_KERNEL_EXEC);
>  #else
>  	/* Setup trampoline PGD */
> -	create_pgd_mapping(trampoline_pg_dir, PAGE_OFFSET,
> +	create_pgd_mapping(trampoline_pg_dir, kernel_virt_addr,
>  			   load_pa, PGDIR_SIZE, PAGE_KERNEL_EXEC);
>  #endif
>
>  	/*
> -	 * Setup early PGD covering entire kernel which will allows
> +	 * Setup early PGD covering entire kernel which will allow
>  	 * us to reach paging_init(). We map all memory banks later
>  	 * in setup_vm_final() below.
>  	 */
> -	end_va = PAGE_OFFSET + load_sz;
> -	for (va = PAGE_OFFSET; va < end_va; va += map_size)
> -		create_pgd_mapping(early_pg_dir, va,
> -				   load_pa + (va - PAGE_OFFSET),
> -				   map_size, PAGE_KERNEL_EXEC);
> +	create_kernel_page_table(early_pg_dir, map_size);
>
>  #ifndef __PAGETABLE_PMD_FOLDED
>  	/* Setup early PMD for DTB */
> @@ -444,7 +469,12 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>  			   pa + PMD_SIZE, PMD_SIZE, PAGE_KERNEL);
>  	dtb_early_va = (void *)DTB_EARLY_BASE_VA + (dtb_pa & (PMD_SIZE - 1));
>  #else /* CONFIG_BUILTIN_DTB */
> -	dtb_early_va = __va(dtb_pa);
> +	/*
> +	 * __va can't be used since it would return a linear mapping address
> +	 * whereas dtb_early_va will be used before setup_vm_final installs
> +	 * the linear mapping.
> +	 */
> +	dtb_early_va = kernel_mapping_pa_to_va(dtb_pa);
>  #endif /* CONFIG_BUILTIN_DTB */
>  #else
>  #ifndef CONFIG_BUILTIN_DTB
> @@ -456,7 +486,7 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>  			   pa + PGDIR_SIZE, PGDIR_SIZE, PAGE_KERNEL);
>  	dtb_early_va = (void *)DTB_EARLY_BASE_VA + (dtb_pa & (PGDIR_SIZE - 1));
>  #else /* CONFIG_BUILTIN_DTB */
> -	dtb_early_va = __va(dtb_pa);
> +	dtb_early_va = kernel_mapping_pa_to_va(dtb_pa);
>  #endif /* CONFIG_BUILTIN_DTB */
>  #endif
>  	dtb_early_pa = dtb_pa;
> @@ -492,6 +522,22 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>  #endif
>  }
>
> +#ifdef CONFIG_64BIT
> +void protect_kernel_linear_mapping_text_rodata(void)
> +{
> +	unsigned long text_start = (unsigned long)lm_alias(_start);
> +	unsigned long init_text_start = (unsigned long)lm_alias(__init_text_begin);
> +	unsigned long rodata_start = (unsigned long)lm_alias(__start_rodata);
> +	unsigned long data_start = (unsigned long)lm_alias(_data);
> +
> +	set_memory_ro(text_start, (init_text_start - text_start) >> PAGE_SHIFT);
> +	set_memory_nx(text_start, (init_text_start - text_start) >> PAGE_SHIFT);
> +
> +	set_memory_ro(rodata_start, (data_start - rodata_start) >> PAGE_SHIFT);
> +	set_memory_nx(rodata_start, (data_start - rodata_start) >> PAGE_SHIFT);
> +}
> +#endif
> +
>  static void __init setup_vm_final(void)
>  {
>  	uintptr_t va, map_size;
> @@ -513,7 +559,7 @@ static void __init setup_vm_final(void)
>  			   __pa_symbol(fixmap_pgd_next),
>  			   PGDIR_SIZE, PAGE_TABLE);
>
> -	/* Map all memory banks */
> +	/* Map all memory banks in the linear mapping */
>  	for_each_mem_range(i, &start, &end) {
>  		if (start >= end)
>  			break;
> @@ -525,10 +571,13 @@ static void __init setup_vm_final(void)
>  		for (pa = start; pa < end; pa += map_size) {
>  			va = (uintptr_t)__va(pa);
>  			create_pgd_mapping(swapper_pg_dir, va, pa,
> -					   map_size, PAGE_KERNEL_EXEC);
> +					   map_size, PAGE_KERNEL);
>  		}
>  	}
>
> +	/* Map the kernel */
> +	create_kernel_page_table(swapper_pg_dir, PMD_SIZE);
> +
>  	/* Clear fixmap PTE and PMD mappings */
>  	clear_fixmap(FIX_PTE);
>  	clear_fixmap(FIX_PMD);
> diff --git a/arch/riscv/mm/kasan_init.c b/arch/riscv/mm/kasan_init.c
> index 2c39f0386673..28f4d52cf17e 100644
> --- a/arch/riscv/mm/kasan_init.c
> +++ b/arch/riscv/mm/kasan_init.c
> @@ -171,6 +171,10 @@ void __init kasan_init(void)
>  	phys_addr_t _start, _end;
>  	u64 i;
>
> +	/*
> +	 * Populate all kernel virtual address space with kasan_early_shadow_page
> +	 * except for the linear mapping and the modules/kernel/BPF mapping.
> +	 */
>  	kasan_populate_early_shadow((void *)KASAN_SHADOW_START,
>  				    (void *)kasan_mem_to_shadow((void *)
>  								VMEMMAP_END));
> @@ -183,6 +187,7 @@ void __init kasan_init(void)
>  			(void *)kasan_mem_to_shadow((void *)VMALLOC_START),
>  			(void *)kasan_mem_to_shadow((void *)VMALLOC_END));
>
> +	/* Populate the linear mapping */
>  	for_each_mem_range(i, &_start, &_end) {
>  		void *start = (void *)__va(_start);
>  		void *end = (void *)__va(_end);
> @@ -193,6 +198,10 @@ void __init kasan_init(void)
>  		kasan_populate(kasan_mem_to_shadow(start), kasan_mem_to_shadow(end));
>  	};
>
> +	/* Populate kernel, BPF, modules mapping */
> +	kasan_populate(kasan_mem_to_shadow((const void *)MODULES_VADDR),
> +		       kasan_mem_to_shadow((const void *)BPF_JIT_REGION_END));
> +
>  	for (i = 0; i < PTRS_PER_PTE; i++)
>  		set_pte(&kasan_early_shadow_pte[i],
>  			mk_pte(virt_to_page(kasan_early_shadow_page),
> diff --git a/arch/riscv/mm/physaddr.c b/arch/riscv/mm/physaddr.c
> index e8e4dcd39fed..35703d5ef5fd 100644
> --- a/arch/riscv/mm/physaddr.c
> +++ b/arch/riscv/mm/physaddr.c
> @@ -23,7 +23,7 @@ EXPORT_SYMBOL(__virt_to_phys);
>
>  phys_addr_t __phys_addr_symbol(unsigned long x)
>  {
> -	unsigned long kernel_start = (unsigned long)PAGE_OFFSET;
> +	unsigned long kernel_start = (unsigned long)kernel_virt_addr;
>  	unsigned long kernel_end = (unsigned long)_end;
>
>  	/*

This is breaking boot for me with CONFIG_STRICT_KERNEL_RWX=n.  I'm not 
even really convinced that's a useful config to support, but it's 
currently optional and I'd prefer to avoid breaking it if possible.

I can't quite figure out what's going on here and I'm pretty much tired 
out for tonight.  Let me know if you don't have time to look at it and 
I'll try to give it another shot.
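
For reference, the two-offset address translation that the quoted page.h
and init.c hunks introduce can be modelled in user space roughly as
follows. This is only a sketch: EXAMPLE_PAGE_OFFSET, EXAMPLE_LOAD_PA and
the helper names setup_offsets() and va_to_pa() are assumptions chosen
for illustration, and plain functions replace the kernel macros.

```c
#include <stdint.h>

#define SZ_2G			UINT64_C(0x80000000)
#define ADDRESS_SPACE_END	UINT64_MAX
/* Last 2GB of the 64-bit address space, as in the quoted pgtable.h hunk. */
#define KERNEL_LINK_ADDR	(ADDRESS_SPACE_END - SZ_2G + 1)

/* Assumed example values, not taken from the patch. */
#define EXAMPLE_PAGE_OFFSET	UINT64_C(0xffffffe000000000)
#define EXAMPLE_LOAD_PA		UINT64_C(0x80200000)

static uint64_t kernel_virt_addr = KERNEL_LINK_ADDR;
static uint64_t va_pa_offset;		/* linear mapping VA - PA */
static uint64_t va_kernel_pa_offset;	/* kernel mapping VA - PA */

/* Models the two assignments made early in setup_vm(). */
static void setup_offsets(uint64_t load_pa)
{
	va_pa_offset = EXAMPLE_PAGE_OFFSET - load_pa;
	va_kernel_pa_offset = kernel_virt_addr - load_pa;
}

static uint64_t linear_mapping_pa_to_va(uint64_t pa)
{
	return pa + va_pa_offset;
}

static uint64_t kernel_mapping_pa_to_va(uint64_t pa)
{
	return pa + va_kernel_pa_offset;
}

/*
 * Models __va_to_pa_nodebug(): addresses below kernel_virt_addr are taken
 * to be linear mapping addresses, everything else kernel mapping addresses.
 */
static uint64_t va_to_pa(uint64_t va)
{
	return va < kernel_virt_addr ? va - va_pa_offset
				     : va - va_kernel_pa_offset;
}
```

After setup_offsets(EXAMPLE_LOAD_PA), a physical address inside the kernel
image has two virtual aliases, one in each mapping, and va_to_pa() maps
both of them back to the same physical address.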


* Re: [PATCH v5 1/3] riscv: Move kernel mapping outside of linear mapping
  2021-04-15  4:20   ` Palmer Dabbelt
@ 2021-04-15  4:54     ` Alex Ghiti
  2021-04-15 18:00       ` Alex Ghiti
  0 siblings, 1 reply; 17+ messages in thread
From: Alex Ghiti @ 2021-04-15  4:54 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: corbet, Paul Walmsley, aou, Arnd Bergmann, aryabinin, glider,
	dvyukov, linux-doc, linux-riscv, linux-kernel, kasan-dev,
	linux-arch, linux-mm

On 4/15/21 at 12:20 AM, Palmer Dabbelt wrote:
> On Sun, 11 Apr 2021 09:41:44 PDT (-0700), alex@ghiti.fr wrote:
>> This is a preparatory patch for relocatable kernel and sv48 support.
>>
>> The kernel used to be linked at PAGE_OFFSET address therefore we could 
>> use
>> the linear mapping for the kernel mapping. But the relocated kernel base
>> address will be different from PAGE_OFFSET and since in the linear 
>> mapping,
>> two different virtual addresses cannot point to the same physical 
>> address,
>> the kernel mapping needs to lie outside the linear mapping so that we 
>> don't
>> have to copy it at the same physical offset.
>>
>> The kernel mapping is moved to the last 2GB of the address space, BPF
>> is now always after the kernel and modules use the 2GB memory range right
>> before the kernel, so BPF and modules regions do not overlap. KASLR
>> implementation will simply have to move the kernel in the last 2GB range
>> and just take care of leaving enough space for BPF.
>>
>> In addition, by moving the kernel to the end of the address space, both
>> sv39 and sv48 kernels will be exactly the same without needing to be
>> relocated at runtime.
>>
>> Suggested-by: Arnd Bergmann <arnd@arndb.de>
>> Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
>> ---
>>  arch/riscv/boot/loader.lds.S        |  3 +-
>>  arch/riscv/include/asm/page.h       | 17 +++++-
>>  arch/riscv/include/asm/pgtable.h    | 37 ++++++++----
>>  arch/riscv/include/asm/set_memory.h |  1 +
>>  arch/riscv/kernel/head.S            |  3 +-
>>  arch/riscv/kernel/module.c          |  6 +-
>>  arch/riscv/kernel/setup.c           |  5 ++
>>  arch/riscv/kernel/vmlinux.lds.S     |  3 +-
>>  arch/riscv/mm/fault.c               | 13 +++++
>>  arch/riscv/mm/init.c                | 87 ++++++++++++++++++++++-------
>>  arch/riscv/mm/kasan_init.c          |  9 +++
>>  arch/riscv/mm/physaddr.c            |  2 +-
>>  12 files changed, 146 insertions(+), 40 deletions(-)
>>
>> diff --git a/arch/riscv/boot/loader.lds.S b/arch/riscv/boot/loader.lds.S
>> index 47a5003c2e28..62d94696a19c 100644
>> --- a/arch/riscv/boot/loader.lds.S
>> +++ b/arch/riscv/boot/loader.lds.S
>> @@ -1,13 +1,14 @@
>>  /* SPDX-License-Identifier: GPL-2.0 */
>>
>>  #include <asm/page.h>
>> +#include <asm/pgtable.h>
>>
>>  OUTPUT_ARCH(riscv)
>>  ENTRY(_start)
>>
>>  SECTIONS
>>  {
>> -    . = PAGE_OFFSET;
>> +    . = KERNEL_LINK_ADDR;
>>
>>      .payload : {
>>          *(.payload)
>> diff --git a/arch/riscv/include/asm/page.h 
>> b/arch/riscv/include/asm/page.h
>> index adc9d26f3d75..22cfb2be60dc 100644
>> --- a/arch/riscv/include/asm/page.h
>> +++ b/arch/riscv/include/asm/page.h
>> @@ -90,15 +90,28 @@ typedef struct page *pgtable_t;
>>
>>  #ifdef CONFIG_MMU
>>  extern unsigned long va_pa_offset;
>> +extern unsigned long va_kernel_pa_offset;
>>  extern unsigned long pfn_base;
>>  #define ARCH_PFN_OFFSET        (pfn_base)
>>  #else
>>  #define va_pa_offset        0
>> +#define va_kernel_pa_offset    0
>>  #define ARCH_PFN_OFFSET        (PAGE_OFFSET >> PAGE_SHIFT)
>>  #endif /* CONFIG_MMU */
>>
>> -#define __pa_to_va_nodebug(x)    ((void *)((unsigned long) (x) + 
>> va_pa_offset))
>> -#define __va_to_pa_nodebug(x)    ((unsigned long)(x) - va_pa_offset)
>> +extern unsigned long kernel_virt_addr;
>> +
>> +#define linear_mapping_pa_to_va(x)    ((void *)((unsigned long)(x) + 
>> va_pa_offset))
>> +#define kernel_mapping_pa_to_va(x)    ((void *)((unsigned long)(x) + 
>> va_kernel_pa_offset))
>> +#define __pa_to_va_nodebug(x)        linear_mapping_pa_to_va(x)
>> +
>> +#define linear_mapping_va_to_pa(x)    ((unsigned long)(x) - 
>> va_pa_offset)
>> +#define kernel_mapping_va_to_pa(x)    ((unsigned long)(x) - 
>> va_kernel_pa_offset)
>> +#define __va_to_pa_nodebug(x)    ({                        \
>> +    unsigned long _x = x;                            \
>> +    (_x < kernel_virt_addr) ?                        \
>> +        linear_mapping_va_to_pa(_x) : kernel_mapping_va_to_pa(_x);    \
>> +    })
>>
>>  #ifdef CONFIG_DEBUG_VIRTUAL
>>  extern phys_addr_t __virt_to_phys(unsigned long x);
>> diff --git a/arch/riscv/include/asm/pgtable.h 
>> b/arch/riscv/include/asm/pgtable.h
>> index ebf817c1bdf4..80e63a93e903 100644
>> --- a/arch/riscv/include/asm/pgtable.h
>> +++ b/arch/riscv/include/asm/pgtable.h
>> @@ -11,23 +11,30 @@
>>
>>  #include <asm/pgtable-bits.h>
>>
>> -#ifndef __ASSEMBLY__
>> -
>> -/* Page Upper Directory not used in RISC-V */
>> -#include <asm-generic/pgtable-nopud.h>
>> -#include <asm/page.h>
>> -#include <asm/tlbflush.h>
>> -#include <linux/mm_types.h>
>> +#ifndef CONFIG_MMU
>> +#define KERNEL_LINK_ADDR    PAGE_OFFSET
>> +#else
>>
>> -#ifdef CONFIG_MMU
>> +#define ADDRESS_SPACE_END    (UL(-1))
>> +/*
>> + * Leave 2GB for kernel and BPF at the end of the address space
>> + */
>> +#define KERNEL_LINK_ADDR    (ADDRESS_SPACE_END - SZ_2G + 1)
>>
>>  #define VMALLOC_SIZE     (KERN_VIRT_SIZE >> 1)
>>  #define VMALLOC_END      (PAGE_OFFSET - 1)
>>  #define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)
>>
>> +/* KASLR should leave at least 128MB for BPF after the kernel */
>>  #define BPF_JIT_REGION_SIZE    (SZ_128M)
>> -#define BPF_JIT_REGION_START    (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
>> -#define BPF_JIT_REGION_END    (VMALLOC_END)
>> +#define BPF_JIT_REGION_START    PFN_ALIGN((unsigned long)&_end)
>> +#define BPF_JIT_REGION_END    (BPF_JIT_REGION_START + 
>> BPF_JIT_REGION_SIZE)
>> +
>> +/* Modules always live before the kernel */
>> +#ifdef CONFIG_64BIT
>> +#define MODULES_VADDR    (PFN_ALIGN((unsigned long)&_end) - SZ_2G)
>> +#define MODULES_END    (PFN_ALIGN((unsigned long)&_start))
>> +#endif
>>
>>  /*
>>   * Roughly size the vmemmap space to be large enough to fit enough
>> @@ -57,9 +64,16 @@
>>  #define FIXADDR_SIZE     PGDIR_SIZE
>>  #endif
>>  #define FIXADDR_START    (FIXADDR_TOP - FIXADDR_SIZE)
>> -
>>  #endif
>>
>> +#ifndef __ASSEMBLY__
>> +
>> +/* Page Upper Directory not used in RISC-V */
>> +#include <asm-generic/pgtable-nopud.h>
>> +#include <asm/page.h>
>> +#include <asm/tlbflush.h>
>> +#include <linux/mm_types.h>
>> +
>>  #ifdef CONFIG_64BIT
>>  #include <asm/pgtable-64.h>
>>  #else
>> @@ -484,6 +498,7 @@ static inline int ptep_clear_flush_young(struct 
>> vm_area_struct *vma,
>>
>>  #define kern_addr_valid(addr)   (1) /* FIXME */
>>
>> +extern char _start[];
>>  extern void *dtb_early_va;
>>  extern uintptr_t dtb_early_pa;
>>  void setup_bootmem(void);
>> diff --git a/arch/riscv/include/asm/set_memory.h 
>> b/arch/riscv/include/asm/set_memory.h
>> index 6887b3d9f371..a9c56776fa0e 100644
>> --- a/arch/riscv/include/asm/set_memory.h
>> +++ b/arch/riscv/include/asm/set_memory.h
>> @@ -17,6 +17,7 @@ int set_memory_x(unsigned long addr, int numpages);
>>  int set_memory_nx(unsigned long addr, int numpages);
>>  int set_memory_rw_nx(unsigned long addr, int numpages);
>>  void protect_kernel_text_data(void);
>> +void protect_kernel_linear_mapping_text_rodata(void);
>>  #else
>>  static inline int set_memory_ro(unsigned long addr, int numpages) { 
>> return 0; }
>>  static inline int set_memory_rw(unsigned long addr, int numpages) { 
>> return 0; }
>> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
>> index f5a9bad86e58..6cb05f22e52a 100644
>> --- a/arch/riscv/kernel/head.S
>> +++ b/arch/riscv/kernel/head.S
>> @@ -69,7 +69,8 @@ pe_head_start:
>>  #ifdef CONFIG_MMU
>>  relocate:
>>      /* Relocate return address */
>> -    li a1, PAGE_OFFSET
>> +    la a1, kernel_virt_addr
>> +    REG_L a1, 0(a1)
>>      la a2, _start
>>      sub a1, a1, a2
>>      add ra, ra, a1
>> diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
>> index 104fba889cf7..ce153771e5e9 100644
>> --- a/arch/riscv/kernel/module.c
>> +++ b/arch/riscv/kernel/module.c
>> @@ -408,12 +408,10 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const 
>> char *strtab,
>>  }
>>
>>  #if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
>> -#define VMALLOC_MODULE_START \
>> -     max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START)
>>  void *module_alloc(unsigned long size)
>>  {
>> -    return __vmalloc_node_range(size, 1, VMALLOC_MODULE_START,
>> -                    VMALLOC_END, GFP_KERNEL,
>> +    return __vmalloc_node_range(size, 1, MODULES_VADDR,
>> +                    MODULES_END, GFP_KERNEL,
>>                      PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
>>                      __builtin_return_address(0));
>>  }
>> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
>> index e85bacff1b50..30e4af0fd50c 100644
>> --- a/arch/riscv/kernel/setup.c
>> +++ b/arch/riscv/kernel/setup.c
>> @@ -265,6 +265,11 @@ void __init setup_arch(char **cmdline_p)
>>
>>      if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
>>          protect_kernel_text_data();
>> +
>> +#if defined(CONFIG_64BIT) && defined(CONFIG_MMU)
>> +    protect_kernel_linear_mapping_text_rodata();
>> +#endif
>> +
>>  #ifdef CONFIG_SWIOTLB
>>      swiotlb_init(1);
>>  #endif
>> diff --git a/arch/riscv/kernel/vmlinux.lds.S 
>> b/arch/riscv/kernel/vmlinux.lds.S
>> index de03cb22d0e9..0726c05e0336 100644
>> --- a/arch/riscv/kernel/vmlinux.lds.S
>> +++ b/arch/riscv/kernel/vmlinux.lds.S
>> @@ -4,7 +4,8 @@
>>   * Copyright (C) 2017 SiFive
>>   */
>>
>> -#define LOAD_OFFSET PAGE_OFFSET
>> +#include <asm/pgtable.h>
>> +#define LOAD_OFFSET KERNEL_LINK_ADDR
>>  #include <asm/vmlinux.lds.h>
>>  #include <asm/page.h>
>>  #include <asm/cache.h>
>> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
>> index 8f17519208c7..1b14d523a95c 100644
>> --- a/arch/riscv/mm/fault.c
>> +++ b/arch/riscv/mm/fault.c
>> @@ -231,6 +231,19 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
>>          return;
>>      }
>>
>> +#ifdef CONFIG_64BIT
>> +    /*
>> +     * Modules in 64bit kernels lie in their own virtual region which 
>> is not
>> +     * in the vmalloc region, but dealing with page faults in this 
>> region
>> +     * or the vmalloc region amounts to doing the same thing: 
>> checking that
>> +     * the mapping exists in init_mm.pgd and updating user page 
>> table, so
>> +     * just use vmalloc_fault.
>> +     */
>> +    if (unlikely(addr >= MODULES_VADDR && addr < MODULES_END)) {
>> +        vmalloc_fault(regs, code, addr);
>> +        return;
>> +    }
>> +#endif
>>      /* Enable interrupts if they were enabled in the parent context. */
>>      if (likely(regs->status & SR_PIE))
>>          local_irq_enable();
>> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
>> index 7f5036fbee8c..093f3a96ecfc 100644
>> --- a/arch/riscv/mm/init.c
>> +++ b/arch/riscv/mm/init.c
>> @@ -25,6 +25,9 @@
>>
>>  #include "../kernel/head.h"
>>
>> +unsigned long kernel_virt_addr = KERNEL_LINK_ADDR;
>> +EXPORT_SYMBOL(kernel_virt_addr);
>> +
>>  unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]
>>                              __page_aligned_bss;
>>  EXPORT_SYMBOL(empty_zero_page);
>> @@ -88,6 +91,8 @@ static void print_vm_layout(void)
>>            (unsigned long)VMALLOC_END);
>>      print_mlm("lowmem", (unsigned long)PAGE_OFFSET,
>>            (unsigned long)high_memory);
>> +    print_mlm("kernel", (unsigned long)KERNEL_LINK_ADDR,
>> +          (unsigned long)ADDRESS_SPACE_END);
>>  }
>>  #else
>>  static void print_vm_layout(void) { }
>> @@ -116,8 +121,13 @@ void __init setup_bootmem(void)
>>      /* The maximal physical memory size is -PAGE_OFFSET. */
>>      memblock_enforce_memory_limit(-PAGE_OFFSET);
>>
>> -    /* Reserve from the start of the kernel to the end of the kernel */
>> -    memblock_reserve(vmlinux_start, vmlinux_end - vmlinux_start);
>> +    /*
>> +     * Reserve from the start of the kernel to the end of the kernel
>> +     * and make sure we align the reservation on PMD_SIZE since we will
>> +     * map the kernel in the linear mapping as read-only: we do not want
>> +     * any allocation to happen between _end and the next pmd aligned page.
>> +     */
>> +    memblock_reserve(vmlinux_start, (vmlinux_end - vmlinux_start + PMD_SIZE - 1) & PMD_MASK);
>>
>>      /*
>>       * memblock allocator is not aware of the fact that last 4K bytes of
>> @@ -152,8 +162,12 @@ void __init setup_bootmem(void)
>>  #ifdef CONFIG_MMU
>>  static struct pt_alloc_ops pt_ops;
>>
>> +/* Offset between linear mapping virtual address and kernel load address */
>>  unsigned long va_pa_offset;
>>  EXPORT_SYMBOL(va_pa_offset);
>> +/* Offset between kernel mapping virtual address and kernel load address */
>> +unsigned long va_kernel_pa_offset;
>> +EXPORT_SYMBOL(va_kernel_pa_offset);
>>  unsigned long pfn_base;
>>  EXPORT_SYMBOL(pfn_base);
>>
>> @@ -257,7 +271,7 @@ static pmd_t *get_pmd_virt_late(phys_addr_t pa)
>>
>>  static phys_addr_t __init alloc_pmd_early(uintptr_t va)
>>  {
>> -    BUG_ON((va - PAGE_OFFSET) >> PGDIR_SHIFT);
>> +    BUG_ON((va - kernel_virt_addr) >> PGDIR_SHIFT);
>>
>>      return (uintptr_t)early_pmd;
>>  }
>> @@ -372,17 +386,32 @@ static uintptr_t __init best_map_size(phys_addr_t base, phys_addr_t size)
>>  #error "setup_vm() is called from head.S before relocate so it should not use absolute addressing."
>>  #endif
>>
>> +uintptr_t load_pa, load_sz;
>> +
>> +static void __init create_kernel_page_table(pgd_t *pgdir, uintptr_t map_size)
>> +{
>> +    uintptr_t va, end_va;
>> +
>> +    end_va = kernel_virt_addr + load_sz;
>> +    for (va = kernel_virt_addr; va < end_va; va += map_size)
>> +        create_pgd_mapping(pgdir, va,
>> +                   load_pa + (va - kernel_virt_addr),
>> +                   map_size, PAGE_KERNEL_EXEC);
>> +}
>> +
>>  asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>  {
>> -    uintptr_t va, pa, end_va;
>> -    uintptr_t load_pa = (uintptr_t)(&_start);
>> -    uintptr_t load_sz = (uintptr_t)(&_end) - load_pa;
>> +    uintptr_t pa;
>>      uintptr_t map_size;
>>  #ifndef __PAGETABLE_PMD_FOLDED
>>      pmd_t fix_bmap_spmd, fix_bmap_epmd;
>>  #endif
>> +    load_pa = (uintptr_t)(&_start);
>> +    load_sz = (uintptr_t)(&_end) - load_pa;
>>
>>      va_pa_offset = PAGE_OFFSET - load_pa;
>> +    va_kernel_pa_offset = kernel_virt_addr - load_pa;
>> +
>>      pfn_base = PFN_DOWN(load_pa);
>>
>>      /*
>> @@ -410,26 +439,22 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>      create_pmd_mapping(fixmap_pmd, FIXADDR_START,
>>                 (uintptr_t)fixmap_pte, PMD_SIZE, PAGE_TABLE);
>>      /* Setup trampoline PGD and PMD */
>> -    create_pgd_mapping(trampoline_pg_dir, PAGE_OFFSET,
>> +    create_pgd_mapping(trampoline_pg_dir, kernel_virt_addr,
>>                 (uintptr_t)trampoline_pmd, PGDIR_SIZE, PAGE_TABLE);
>> -    create_pmd_mapping(trampoline_pmd, PAGE_OFFSET,
>> +    create_pmd_mapping(trampoline_pmd, kernel_virt_addr,
>>                 load_pa, PMD_SIZE, PAGE_KERNEL_EXEC);
>>  #else
>>      /* Setup trampoline PGD */
>> -    create_pgd_mapping(trampoline_pg_dir, PAGE_OFFSET,
>> +    create_pgd_mapping(trampoline_pg_dir, kernel_virt_addr,
>>                 load_pa, PGDIR_SIZE, PAGE_KERNEL_EXEC);
>>  #endif
>>
>>      /*
>> -     * Setup early PGD covering entire kernel which will allows
>> +     * Setup early PGD covering entire kernel which will allow
>>       * us to reach paging_init(). We map all memory banks later
>>       * in setup_vm_final() below.
>>       */
>> -    end_va = PAGE_OFFSET + load_sz;
>> -    for (va = PAGE_OFFSET; va < end_va; va += map_size)
>> -        create_pgd_mapping(early_pg_dir, va,
>> -                   load_pa + (va - PAGE_OFFSET),
>> -                   map_size, PAGE_KERNEL_EXEC);
>> +    create_kernel_page_table(early_pg_dir, map_size);
>>
>>  #ifndef __PAGETABLE_PMD_FOLDED
>>      /* Setup early PMD for DTB */
>> @@ -444,7 +469,12 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>                 pa + PMD_SIZE, PMD_SIZE, PAGE_KERNEL);
>>      dtb_early_va = (void *)DTB_EARLY_BASE_VA + (dtb_pa & (PMD_SIZE - 1));
>>  #else /* CONFIG_BUILTIN_DTB */
>> -    dtb_early_va = __va(dtb_pa);
>> +    /*
>> +     * __va can't be used since it would return a linear mapping address
>> +     * whereas dtb_early_va will be used before setup_vm_final installs
>> +     * the linear mapping.
>> +     */
>> +    dtb_early_va = kernel_mapping_pa_to_va(dtb_pa);
>>  #endif /* CONFIG_BUILTIN_DTB */
>>  #else
>>  #ifndef CONFIG_BUILTIN_DTB
>> @@ -456,7 +486,7 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>                 pa + PGDIR_SIZE, PGDIR_SIZE, PAGE_KERNEL);
>>      dtb_early_va = (void *)DTB_EARLY_BASE_VA + (dtb_pa & (PGDIR_SIZE - 1));
>>  #else /* CONFIG_BUILTIN_DTB */
>> -    dtb_early_va = __va(dtb_pa);
>> +    dtb_early_va = kernel_mapping_pa_to_va(dtb_pa);
>>  #endif /* CONFIG_BUILTIN_DTB */
>>  #endif
>>      dtb_early_pa = dtb_pa;
>> @@ -492,6 +522,22 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>  #endif
>>  }
>>
>> +#ifdef CONFIG_64BIT
>> +void protect_kernel_linear_mapping_text_rodata(void)
>> +{
>> +    unsigned long text_start = (unsigned long)lm_alias(_start);
>> +    unsigned long init_text_start = (unsigned long)lm_alias(__init_text_begin);
>> +    unsigned long rodata_start = (unsigned long)lm_alias(__start_rodata);
>> +    unsigned long data_start = (unsigned long)lm_alias(_data);
>> +
>> +    set_memory_ro(text_start, (init_text_start - text_start) >> PAGE_SHIFT);
>> +    set_memory_nx(text_start, (init_text_start - text_start) >> PAGE_SHIFT);
>> +
>> +    set_memory_ro(rodata_start, (data_start - rodata_start) >> PAGE_SHIFT);
>> +    set_memory_nx(rodata_start, (data_start - rodata_start) >> PAGE_SHIFT);
>> +}
>> +#endif
>> +
>>  static void __init setup_vm_final(void)
>>  {
>>      uintptr_t va, map_size;
>> @@ -513,7 +559,7 @@ static void __init setup_vm_final(void)
>>                 __pa_symbol(fixmap_pgd_next),
>>                 PGDIR_SIZE, PAGE_TABLE);
>>
>> -    /* Map all memory banks */
>> +    /* Map all memory banks in the linear mapping */
>>      for_each_mem_range(i, &start, &end) {
>>          if (start >= end)
>>              break;
>> @@ -525,10 +571,13 @@ static void __init setup_vm_final(void)
>>          for (pa = start; pa < end; pa += map_size) {
>>              va = (uintptr_t)__va(pa);
>>              create_pgd_mapping(swapper_pg_dir, va, pa,
>> -                       map_size, PAGE_KERNEL_EXEC);
>> +                       map_size, PAGE_KERNEL);
>>          }
>>      }
>>
>> +    /* Map the kernel */
>> +    create_kernel_page_table(swapper_pg_dir, PMD_SIZE);
>> +
>>      /* Clear fixmap PTE and PMD mappings */
>>      clear_fixmap(FIX_PTE);
>>      clear_fixmap(FIX_PMD);
>> diff --git a/arch/riscv/mm/kasan_init.c b/arch/riscv/mm/kasan_init.c
>> index 2c39f0386673..28f4d52cf17e 100644
>> --- a/arch/riscv/mm/kasan_init.c
>> +++ b/arch/riscv/mm/kasan_init.c
>> @@ -171,6 +171,10 @@ void __init kasan_init(void)
>>      phys_addr_t _start, _end;
>>      u64 i;
>>
>> +    /*
>> +     * Populate all kernel virtual address space with kasan_early_shadow_page
>> +     * except for the linear mapping and the modules/kernel/BPF mapping.
>> +     */
>>      kasan_populate_early_shadow((void *)KASAN_SHADOW_START,
>>                      (void *)kasan_mem_to_shadow((void *)
>>                                  VMEMMAP_END));
>> @@ -183,6 +187,7 @@ void __init kasan_init(void)
>>              (void *)kasan_mem_to_shadow((void *)VMALLOC_START),
>>              (void *)kasan_mem_to_shadow((void *)VMALLOC_END));
>>
>> +    /* Populate the linear mapping */
>>      for_each_mem_range(i, &_start, &_end) {
>>          void *start = (void *)__va(_start);
>>          void *end = (void *)__va(_end);
>> @@ -193,6 +198,10 @@ void __init kasan_init(void)
>>          kasan_populate(kasan_mem_to_shadow(start), kasan_mem_to_shadow(end));
>>      };
>>
>> +    /* Populate kernel, BPF, modules mapping */
>> +    kasan_populate(kasan_mem_to_shadow((const void *)MODULES_VADDR),
>> +               kasan_mem_to_shadow((const void *)BPF_JIT_REGION_END));
>> +
>>      for (i = 0; i < PTRS_PER_PTE; i++)
>>          set_pte(&kasan_early_shadow_pte[i],
>>              mk_pte(virt_to_page(kasan_early_shadow_page),
>> diff --git a/arch/riscv/mm/physaddr.c b/arch/riscv/mm/physaddr.c
>> index e8e4dcd39fed..35703d5ef5fd 100644
>> --- a/arch/riscv/mm/physaddr.c
>> +++ b/arch/riscv/mm/physaddr.c
>> @@ -23,7 +23,7 @@ EXPORT_SYMBOL(__virt_to_phys);
>>
>>  phys_addr_t __phys_addr_symbol(unsigned long x)
>>  {
>> -    unsigned long kernel_start = (unsigned long)PAGE_OFFSET;
>> +    unsigned long kernel_start = (unsigned long)kernel_virt_addr;
>>      unsigned long kernel_end = (unsigned long)_end;
>>
>>      /*
> 
> This is breaking boot for me with CONFIG_STRICT_KERNEL_RWX=n.  I'm not 
> even really convinced that's a useful config to support, but it's 
> currently optional and I'd prefer to avoid breaking it if possible.
> 
> I can't quite figure out what's going on here and I'm pretty much tired 
> out for tonight.  LMK if you don't have time to look at it and I'll try 
> to give it another shot.

I'm taking a look at that.
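
For anyone following the quoted hunks, here is a minimal user-space sketch of the two-offset translation that the patch introduces with va_pa_offset and va_kernel_pa_offset. It only mirrors the arithmetic of __va_to_pa_nodebug(); the ex_* names and the addresses (an sv39-style PAGE_OFFSET and a load address of 0x80200000) are assumptions picked for illustration, not values taken from the patch or from any particular board.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example values, chosen only to make the arithmetic concrete. */
#define EX_PAGE_OFFSET      0xffffffe000000000ULL /* assumed start of the linear mapping */
#define EX_KERNEL_LINK_ADDR 0xffffffff80000000ULL /* ADDRESS_SPACE_END - SZ_2G + 1 */
#define EX_LOAD_PA          0x0000000080200000ULL /* assumed physical load address */

/* Same roles as kernel_virt_addr, va_pa_offset and va_kernel_pa_offset. */
static const uint64_t ex_kernel_virt_addr = EX_KERNEL_LINK_ADDR;
static const uint64_t ex_va_pa_offset = EX_PAGE_OFFSET - EX_LOAD_PA;
static const uint64_t ex_va_kernel_pa_offset = EX_KERNEL_LINK_ADDR - EX_LOAD_PA;

/* Physical address -> virtual address in the linear mapping. */
static uint64_t ex_linear_pa_to_va(uint64_t pa)
{
    return pa + ex_va_pa_offset;
}

/* Physical address -> virtual address in the kernel mapping. */
static uint64_t ex_kernel_pa_to_va(uint64_t pa)
{
    return pa + ex_va_kernel_pa_offset;
}

/*
 * Virtual -> physical: addresses below the kernel mapping are treated as
 * linear mapping addresses, everything else as kernel mapping addresses.
 */
static uint64_t ex_va_to_pa(uint64_t va)
{
    return (va < ex_kernel_virt_addr) ? va - ex_va_pa_offset
                                      : va - ex_va_kernel_pa_offset;
}

/* Both aliases of the same physical page must translate back to it. */
static void ex_self_check(void)
{
    uint64_t pa = EX_LOAD_PA + 0x1000;

    assert(ex_va_to_pa(ex_linear_pa_to_va(pa)) == pa);
    assert(ex_va_to_pa(ex_kernel_pa_to_va(pa)) == pa);
}
```

The point of the sketch is that the same physical page now has two virtual aliases, and the single comparison against kernel_virt_addr is what picks the right offset on the way back.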

Thanks,

Alex


* Re: [PATCH v5 1/3] riscv: Move kernel mapping outside of linear mapping
  2021-04-15  4:54     ` Alex Ghiti
@ 2021-04-15 18:00       ` Alex Ghiti
  2021-04-18 11:38         ` Alex Ghiti
  0 siblings, 1 reply; 17+ messages in thread
From: Alex Ghiti @ 2021-04-15 18:00 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: corbet, Paul Walmsley, aou, Arnd Bergmann, aryabinin, glider,
	dvyukov, linux-doc, linux-riscv, linux-kernel, kasan-dev,
	linux-arch, linux-mm

On 4/15/21 at 12:54 AM, Alex Ghiti wrote:
> On 4/15/21 at 12:20 AM, Palmer Dabbelt wrote:
>> On Sun, 11 Apr 2021 09:41:44 PDT (-0700), alex@ghiti.fr wrote:
>>> This is a preparatory patch for relocatable kernel and sv48 support.
>>>
>>> The kernel used to be linked at PAGE_OFFSET address therefore we could use
>>> the linear mapping for the kernel mapping. But the relocated kernel base
>>> address will be different from PAGE_OFFSET and since in the linear mapping,
>>> two different virtual addresses cannot point to the same physical address,
>>> the kernel mapping needs to lie outside the linear mapping so that we don't
>>> have to copy it at the same physical offset.
>>>
>>> The kernel mapping is moved to the last 2GB of the address space, BPF
>>> is now always after the kernel and modules use the 2GB memory range right
>>> before the kernel, so BPF and modules regions do not overlap. KASLR
>>> implementation will simply have to move the kernel in the last 2GB range
>>> and just take care of leaving enough space for BPF.
>>>
>>> In addition, by moving the kernel to the end of the address space, both
>>> sv39 and sv48 kernels will be exactly the same without needing to be
>>> relocated at runtime.
>>>
>>> Suggested-by: Arnd Bergmann <arnd@arndb.de>
>>> Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
>>> ---
>>>  arch/riscv/boot/loader.lds.S        |  3 +-
>>>  arch/riscv/include/asm/page.h       | 17 +++++-
>>>  arch/riscv/include/asm/pgtable.h    | 37 ++++++++----
>>>  arch/riscv/include/asm/set_memory.h |  1 +
>>>  arch/riscv/kernel/head.S            |  3 +-
>>>  arch/riscv/kernel/module.c          |  6 +-
>>>  arch/riscv/kernel/setup.c           |  5 ++
>>>  arch/riscv/kernel/vmlinux.lds.S     |  3 +-
>>>  arch/riscv/mm/fault.c               | 13 +++++
>>>  arch/riscv/mm/init.c                | 87 ++++++++++++++++++++++-------
>>>  arch/riscv/mm/kasan_init.c          |  9 +++
>>>  arch/riscv/mm/physaddr.c            |  2 +-
>>>  12 files changed, 146 insertions(+), 40 deletions(-)
>>>
>>> diff --git a/arch/riscv/boot/loader.lds.S b/arch/riscv/boot/loader.lds.S
>>> index 47a5003c2e28..62d94696a19c 100644
>>> --- a/arch/riscv/boot/loader.lds.S
>>> +++ b/arch/riscv/boot/loader.lds.S
>>> @@ -1,13 +1,14 @@
>>>  /* SPDX-License-Identifier: GPL-2.0 */
>>>
>>>  #include <asm/page.h>
>>> +#include <asm/pgtable.h>
>>>
>>>  OUTPUT_ARCH(riscv)
>>>  ENTRY(_start)
>>>
>>>  SECTIONS
>>>  {
>>> -    . = PAGE_OFFSET;
>>> +    . = KERNEL_LINK_ADDR;
>>>
>>>      .payload : {
>>>          *(.payload)
>>> diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h
>>> index adc9d26f3d75..22cfb2be60dc 100644
>>> --- a/arch/riscv/include/asm/page.h
>>> +++ b/arch/riscv/include/asm/page.h
>>> @@ -90,15 +90,28 @@ typedef struct page *pgtable_t;
>>>
>>>  #ifdef CONFIG_MMU
>>>  extern unsigned long va_pa_offset;
>>> +extern unsigned long va_kernel_pa_offset;
>>>  extern unsigned long pfn_base;
>>>  #define ARCH_PFN_OFFSET        (pfn_base)
>>>  #else
>>>  #define va_pa_offset        0
>>> +#define va_kernel_pa_offset    0
>>>  #define ARCH_PFN_OFFSET        (PAGE_OFFSET >> PAGE_SHIFT)
>>>  #endif /* CONFIG_MMU */
>>>
>>> -#define __pa_to_va_nodebug(x)    ((void *)((unsigned long) (x) + va_pa_offset))
>>> -#define __va_to_pa_nodebug(x)    ((unsigned long)(x) - va_pa_offset)
>>> +extern unsigned long kernel_virt_addr;
>>> +
>>> +#define linear_mapping_pa_to_va(x)    ((void *)((unsigned long)(x) + va_pa_offset))
>>> +#define kernel_mapping_pa_to_va(x)    ((void *)((unsigned long)(x) + va_kernel_pa_offset))
>>> +#define __pa_to_va_nodebug(x)        linear_mapping_pa_to_va(x)
>>> +
>>> +#define linear_mapping_va_to_pa(x)    ((unsigned long)(x) - va_pa_offset)
>>> +#define kernel_mapping_va_to_pa(x)    ((unsigned long)(x) - va_kernel_pa_offset)
>>> +#define __va_to_pa_nodebug(x)    ({                        \
>>> +    unsigned long _x = x;                            \
>>> +    (_x < kernel_virt_addr) ?                        \
>>> +        linear_mapping_va_to_pa(_x) : kernel_mapping_va_to_pa(_x);    \
>>> +    })
>>>
>>>  #ifdef CONFIG_DEBUG_VIRTUAL
>>>  extern phys_addr_t __virt_to_phys(unsigned long x);
>>> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
>>> index ebf817c1bdf4..80e63a93e903 100644
>>> --- a/arch/riscv/include/asm/pgtable.h
>>> +++ b/arch/riscv/include/asm/pgtable.h
>>> @@ -11,23 +11,30 @@
>>>
>>>  #include <asm/pgtable-bits.h>
>>>
>>> -#ifndef __ASSEMBLY__
>>> -
>>> -/* Page Upper Directory not used in RISC-V */
>>> -#include <asm-generic/pgtable-nopud.h>
>>> -#include <asm/page.h>
>>> -#include <asm/tlbflush.h>
>>> -#include <linux/mm_types.h>
>>> +#ifndef CONFIG_MMU
>>> +#define KERNEL_LINK_ADDR    PAGE_OFFSET
>>> +#else
>>>
>>> -#ifdef CONFIG_MMU
>>> +#define ADDRESS_SPACE_END    (UL(-1))
>>> +/*
>>> + * Leave 2GB for kernel and BPF at the end of the address space
>>> + */
>>> +#define KERNEL_LINK_ADDR    (ADDRESS_SPACE_END - SZ_2G + 1)
>>>
>>>  #define VMALLOC_SIZE     (KERN_VIRT_SIZE >> 1)
>>>  #define VMALLOC_END      (PAGE_OFFSET - 1)
>>>  #define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)
>>>
>>> +/* KASLR should leave at least 128MB for BPF after the kernel */
>>>  #define BPF_JIT_REGION_SIZE    (SZ_128M)
>>> -#define BPF_JIT_REGION_START    (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
>>> -#define BPF_JIT_REGION_END    (VMALLOC_END)
>>> +#define BPF_JIT_REGION_START    PFN_ALIGN((unsigned long)&_end)
>>> +#define BPF_JIT_REGION_END    (BPF_JIT_REGION_START + BPF_JIT_REGION_SIZE)
>>> +
>>> +/* Modules always live before the kernel */
>>> +#ifdef CONFIG_64BIT
>>> +#define MODULES_VADDR    (PFN_ALIGN((unsigned long)&_end) - SZ_2G)
>>> +#define MODULES_END    (PFN_ALIGN((unsigned long)&_start))
>>> +#endif
>>>
>>>  /*
>>>   * Roughly size the vmemmap space to be large enough to fit enough
>>> @@ -57,9 +64,16 @@
>>>  #define FIXADDR_SIZE     PGDIR_SIZE
>>>  #endif
>>>  #define FIXADDR_START    (FIXADDR_TOP - FIXADDR_SIZE)
>>> -
>>>  #endif
>>>
>>> +#ifndef __ASSEMBLY__
>>> +
>>> +/* Page Upper Directory not used in RISC-V */
>>> +#include <asm-generic/pgtable-nopud.h>
>>> +#include <asm/page.h>
>>> +#include <asm/tlbflush.h>
>>> +#include <linux/mm_types.h>
>>> +
>>>  #ifdef CONFIG_64BIT
>>>  #include <asm/pgtable-64.h>
>>>  #else
>>> @@ -484,6 +498,7 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
>>>
>>>  #define kern_addr_valid(addr)   (1) /* FIXME */
>>>
>>> +extern char _start[];
>>>  extern void *dtb_early_va;
>>>  extern uintptr_t dtb_early_pa;
>>>  void setup_bootmem(void);
>>> diff --git a/arch/riscv/include/asm/set_memory.h b/arch/riscv/include/asm/set_memory.h
>>> index 6887b3d9f371..a9c56776fa0e 100644
>>> --- a/arch/riscv/include/asm/set_memory.h
>>> +++ b/arch/riscv/include/asm/set_memory.h
>>> @@ -17,6 +17,7 @@ int set_memory_x(unsigned long addr, int numpages);
>>>  int set_memory_nx(unsigned long addr, int numpages);
>>>  int set_memory_rw_nx(unsigned long addr, int numpages);
>>>  void protect_kernel_text_data(void);
>>> +void protect_kernel_linear_mapping_text_rodata(void);
>>>  #else
>>>  static inline int set_memory_ro(unsigned long addr, int numpages) { return 0; }
>>>  static inline int set_memory_rw(unsigned long addr, int numpages) { return 0; }
>>> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
>>> index f5a9bad86e58..6cb05f22e52a 100644
>>> --- a/arch/riscv/kernel/head.S
>>> +++ b/arch/riscv/kernel/head.S
>>> @@ -69,7 +69,8 @@ pe_head_start:
>>>  #ifdef CONFIG_MMU
>>>  relocate:
>>>      /* Relocate return address */
>>> -    li a1, PAGE_OFFSET
>>> +    la a1, kernel_virt_addr
>>> +    REG_L a1, 0(a1)
>>>      la a2, _start
>>>      sub a1, a1, a2
>>>      add ra, ra, a1
>>> diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
>>> index 104fba889cf7..ce153771e5e9 100644
>>> --- a/arch/riscv/kernel/module.c
>>> +++ b/arch/riscv/kernel/module.c
>>> @@ -408,12 +408,10 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
>>>  }
>>>
>>>  #if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
>>> -#define VMALLOC_MODULE_START \
>>> -     max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START)
>>>  void *module_alloc(unsigned long size)
>>>  {
>>> -    return __vmalloc_node_range(size, 1, VMALLOC_MODULE_START,
>>> -                    VMALLOC_END, GFP_KERNEL,
>>> +    return __vmalloc_node_range(size, 1, MODULES_VADDR,
>>> +                    MODULES_END, GFP_KERNEL,
>>>                      PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
>>>                      __builtin_return_address(0));
>>>  }
>>> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
>>> index e85bacff1b50..30e4af0fd50c 100644
>>> --- a/arch/riscv/kernel/setup.c
>>> +++ b/arch/riscv/kernel/setup.c
>>> @@ -265,6 +265,11 @@ void __init setup_arch(char **cmdline_p)
>>>
>>>      if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
>>>          protect_kernel_text_data();
>>> +
>>> +#if defined(CONFIG_64BIT) && defined(CONFIG_MMU)
>>> +    protect_kernel_linear_mapping_text_rodata();
>>> +#endif
>>> +
>>>  #ifdef CONFIG_SWIOTLB
>>>      swiotlb_init(1);
>>>  #endif
>>> diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
>>> index de03cb22d0e9..0726c05e0336 100644
>>> --- a/arch/riscv/kernel/vmlinux.lds.S
>>> +++ b/arch/riscv/kernel/vmlinux.lds.S
>>> @@ -4,7 +4,8 @@
>>>   * Copyright (C) 2017 SiFive
>>>   */
>>>
>>> -#define LOAD_OFFSET PAGE_OFFSET
>>> +#include <asm/pgtable.h>
>>> +#define LOAD_OFFSET KERNEL_LINK_ADDR
>>>  #include <asm/vmlinux.lds.h>
>>>  #include <asm/page.h>
>>>  #include <asm/cache.h>
>>> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
>>> index 8f17519208c7..1b14d523a95c 100644
>>> --- a/arch/riscv/mm/fault.c
>>> +++ b/arch/riscv/mm/fault.c
>>> @@ -231,6 +231,19 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
>>>          return;
>>>      }
>>>
>>> +#ifdef CONFIG_64BIT
>>> +    /*
>>> +     * Modules in 64bit kernels lie in their own virtual region which is not
>>> +     * in the vmalloc region, but dealing with page faults in this region
>>> +     * or the vmalloc region amounts to doing the same thing: checking that
>>> +     * the mapping exists in init_mm.pgd and updating user page table, so
>>> +     * just use vmalloc_fault.
>>> +     */
>>> +    if (unlikely(addr >= MODULES_VADDR && addr < MODULES_END)) {
>>> +        vmalloc_fault(regs, code, addr);
>>> +        return;
>>> +    }
>>> +#endif
>>>      /* Enable interrupts if they were enabled in the parent context. */
>>>      if (likely(regs->status & SR_PIE))
>>>          local_irq_enable();
>>> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
>>> index 7f5036fbee8c..093f3a96ecfc 100644
>>> --- a/arch/riscv/mm/init.c
>>> +++ b/arch/riscv/mm/init.c
>>> @@ -25,6 +25,9 @@
>>>
>>>  #include "../kernel/head.h"
>>>
>>> +unsigned long kernel_virt_addr = KERNEL_LINK_ADDR;
>>> +EXPORT_SYMBOL(kernel_virt_addr);
>>> +
>>>  unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]
>>>                              __page_aligned_bss;
>>>  EXPORT_SYMBOL(empty_zero_page);
>>> @@ -88,6 +91,8 @@ static void print_vm_layout(void)
>>>            (unsigned long)VMALLOC_END);
>>>      print_mlm("lowmem", (unsigned long)PAGE_OFFSET,
>>>            (unsigned long)high_memory);
>>> +    print_mlm("kernel", (unsigned long)KERNEL_LINK_ADDR,
>>> +          (unsigned long)ADDRESS_SPACE_END);
>>>  }
>>>  #else
>>>  static void print_vm_layout(void) { }
>>> @@ -116,8 +121,13 @@ void __init setup_bootmem(void)
>>>      /* The maximal physical memory size is -PAGE_OFFSET. */
>>>      memblock_enforce_memory_limit(-PAGE_OFFSET);
>>>
>>> -    /* Reserve from the start of the kernel to the end of the kernel */
>>> -    memblock_reserve(vmlinux_start, vmlinux_end - vmlinux_start);
>>> +    /*
>>> +     * Reserve from the start of the kernel to the end of the kernel
>>> +     * and make sure we align the reservation on PMD_SIZE since we will
>>> +     * map the kernel in the linear mapping as read-only: we do not want
>>> +     * any allocation to happen between _end and the next pmd aligned page.
>>> +     */
>>> +    memblock_reserve(vmlinux_start, (vmlinux_end - vmlinux_start + PMD_SIZE - 1) & PMD_MASK);
>>>
>>>      /*
>>>       * memblock allocator is not aware of the fact that last 4K bytes of
>>> @@ -152,8 +162,12 @@ void __init setup_bootmem(void)
>>>  #ifdef CONFIG_MMU
>>>  static struct pt_alloc_ops pt_ops;
>>>
>>> +/* Offset between linear mapping virtual address and kernel load address */
>>>  unsigned long va_pa_offset;
>>>  EXPORT_SYMBOL(va_pa_offset);
>>> +/* Offset between kernel mapping virtual address and kernel load address */
>>> +unsigned long va_kernel_pa_offset;
>>> +EXPORT_SYMBOL(va_kernel_pa_offset);
>>>  unsigned long pfn_base;
>>>  EXPORT_SYMBOL(pfn_base);
>>>
>>> @@ -257,7 +271,7 @@ static pmd_t *get_pmd_virt_late(phys_addr_t pa)
>>>
>>>  static phys_addr_t __init alloc_pmd_early(uintptr_t va)
>>>  {
>>> -    BUG_ON((va - PAGE_OFFSET) >> PGDIR_SHIFT);
>>> +    BUG_ON((va - kernel_virt_addr) >> PGDIR_SHIFT);
>>>
>>>      return (uintptr_t)early_pmd;
>>>  }
>>> @@ -372,17 +386,32 @@ static uintptr_t __init best_map_size(phys_addr_t base, phys_addr_t size)
>>>  #error "setup_vm() is called from head.S before relocate so it should not use absolute addressing."
>>>  #endif
>>>
>>> +uintptr_t load_pa, load_sz;
>>> +
>>> +static void __init create_kernel_page_table(pgd_t *pgdir, uintptr_t map_size)
>>> +{
>>> +    uintptr_t va, end_va;
>>> +
>>> +    end_va = kernel_virt_addr + load_sz;
>>> +    for (va = kernel_virt_addr; va < end_va; va += map_size)
>>> +        create_pgd_mapping(pgdir, va,
>>> +                   load_pa + (va - kernel_virt_addr),
>>> +                   map_size, PAGE_KERNEL_EXEC);
>>> +}
>>> +
>>>  asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>>  {
>>> -    uintptr_t va, pa, end_va;
>>> -    uintptr_t load_pa = (uintptr_t)(&_start);
>>> -    uintptr_t load_sz = (uintptr_t)(&_end) - load_pa;
>>> +    uintptr_t pa;
>>>      uintptr_t map_size;
>>>  #ifndef __PAGETABLE_PMD_FOLDED
>>>      pmd_t fix_bmap_spmd, fix_bmap_epmd;
>>>  #endif
>>> +    load_pa = (uintptr_t)(&_start);
>>> +    load_sz = (uintptr_t)(&_end) - load_pa;
>>>
>>>      va_pa_offset = PAGE_OFFSET - load_pa;
>>> +    va_kernel_pa_offset = kernel_virt_addr - load_pa;
>>> +
>>>      pfn_base = PFN_DOWN(load_pa);
>>>
>>>      /*
>>> @@ -410,26 +439,22 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>>      create_pmd_mapping(fixmap_pmd, FIXADDR_START,
>>>                 (uintptr_t)fixmap_pte, PMD_SIZE, PAGE_TABLE);
>>>      /* Setup trampoline PGD and PMD */
>>> -    create_pgd_mapping(trampoline_pg_dir, PAGE_OFFSET,
>>> +    create_pgd_mapping(trampoline_pg_dir, kernel_virt_addr,
>>>                 (uintptr_t)trampoline_pmd, PGDIR_SIZE, PAGE_TABLE);
>>> -    create_pmd_mapping(trampoline_pmd, PAGE_OFFSET,
>>> +    create_pmd_mapping(trampoline_pmd, kernel_virt_addr,
>>>                 load_pa, PMD_SIZE, PAGE_KERNEL_EXEC);
>>>  #else
>>>      /* Setup trampoline PGD */
>>> -    create_pgd_mapping(trampoline_pg_dir, PAGE_OFFSET,
>>> +    create_pgd_mapping(trampoline_pg_dir, kernel_virt_addr,
>>>                 load_pa, PGDIR_SIZE, PAGE_KERNEL_EXEC);
>>>  #endif
>>>
>>>      /*
>>> -     * Setup early PGD covering entire kernel which will allows
>>> +     * Setup early PGD covering entire kernel which will allow
>>>       * us to reach paging_init(). We map all memory banks later
>>>       * in setup_vm_final() below.
>>>       */
>>> -    end_va = PAGE_OFFSET + load_sz;
>>> -    for (va = PAGE_OFFSET; va < end_va; va += map_size)
>>> -        create_pgd_mapping(early_pg_dir, va,
>>> -                   load_pa + (va - PAGE_OFFSET),
>>> -                   map_size, PAGE_KERNEL_EXEC);
>>> +    create_kernel_page_table(early_pg_dir, map_size);
>>>
>>>  #ifndef __PAGETABLE_PMD_FOLDED
>>>      /* Setup early PMD for DTB */
>>> @@ -444,7 +469,12 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>>                 pa + PMD_SIZE, PMD_SIZE, PAGE_KERNEL);
>>>      dtb_early_va = (void *)DTB_EARLY_BASE_VA + (dtb_pa & (PMD_SIZE - 1));
>>>  #else /* CONFIG_BUILTIN_DTB */
>>> -    dtb_early_va = __va(dtb_pa);
>>> +    /*
>>> +     * __va can't be used since it would return a linear mapping address
>>> +     * whereas dtb_early_va will be used before setup_vm_final installs
>>> +     * the linear mapping.
>>> +     */
>>> +    dtb_early_va = kernel_mapping_pa_to_va(dtb_pa);
>>>  #endif /* CONFIG_BUILTIN_DTB */
>>>  #else
>>>  #ifndef CONFIG_BUILTIN_DTB
>>> @@ -456,7 +486,7 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>>                 pa + PGDIR_SIZE, PGDIR_SIZE, PAGE_KERNEL);
>>>      dtb_early_va = (void *)DTB_EARLY_BASE_VA + (dtb_pa & (PGDIR_SIZE - 1));
>>>  #else /* CONFIG_BUILTIN_DTB */
>>> -    dtb_early_va = __va(dtb_pa);
>>> +    dtb_early_va = kernel_mapping_pa_to_va(dtb_pa);
>>>  #endif /* CONFIG_BUILTIN_DTB */
>>>  #endif
>>>      dtb_early_pa = dtb_pa;
>>> @@ -492,6 +522,22 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>>  #endif
>>>  }
>>>
>>> +#ifdef CONFIG_64BIT
>>> +void protect_kernel_linear_mapping_text_rodata(void)
>>> +{
>>> +    unsigned long text_start = (unsigned long)lm_alias(_start);
>>> +    unsigned long init_text_start = (unsigned long)lm_alias(__init_text_begin);
>>> +    unsigned long rodata_start = (unsigned long)lm_alias(__start_rodata);
>>> +    unsigned long data_start = (unsigned long)lm_alias(_data);
>>> +
>>> +    set_memory_ro(text_start, (init_text_start - text_start) >> PAGE_SHIFT);
>>> +    set_memory_nx(text_start, (init_text_start - text_start) >> PAGE_SHIFT);
>>> +
>>> +    set_memory_ro(rodata_start, (data_start - rodata_start) >> PAGE_SHIFT);
>>> +    set_memory_nx(rodata_start, (data_start - rodata_start) >> PAGE_SHIFT);
>>> +}
>>> +#endif
>>> +
>>>  static void __init setup_vm_final(void)
>>>  {
>>>      uintptr_t va, map_size;
>>> @@ -513,7 +559,7 @@ static void __init setup_vm_final(void)
>>>                 __pa_symbol(fixmap_pgd_next),
>>>                 PGDIR_SIZE, PAGE_TABLE);
>>>
>>> -    /* Map all memory banks */
>>> +    /* Map all memory banks in the linear mapping */
>>>      for_each_mem_range(i, &start, &end) {
>>>          if (start >= end)
>>>              break;
>>> @@ -525,10 +571,13 @@ static void __init setup_vm_final(void)
>>>          for (pa = start; pa < end; pa += map_size) {
>>>              va = (uintptr_t)__va(pa);
>>>              create_pgd_mapping(swapper_pg_dir, va, pa,
>>> -                       map_size, PAGE_KERNEL_EXEC);
>>> +                       map_size, PAGE_KERNEL);
>>>          }
>>>      }
>>>
>>> +    /* Map the kernel */
>>> +    create_kernel_page_table(swapper_pg_dir, PMD_SIZE);
>>> +
>>>      /* Clear fixmap PTE and PMD mappings */
>>>      clear_fixmap(FIX_PTE);
>>>      clear_fixmap(FIX_PMD);
>>> diff --git a/arch/riscv/mm/kasan_init.c b/arch/riscv/mm/kasan_init.c
>>> index 2c39f0386673..28f4d52cf17e 100644
>>> --- a/arch/riscv/mm/kasan_init.c
>>> +++ b/arch/riscv/mm/kasan_init.c
>>> @@ -171,6 +171,10 @@ void __init kasan_init(void)
>>>      phys_addr_t _start, _end;
>>>      u64 i;
>>>
>>> +    /*
>>> +     * Populate all kernel virtual address space with kasan_early_shadow_page
>>> +     * except for the linear mapping and the modules/kernel/BPF mapping.
>>> +     */
>>>      kasan_populate_early_shadow((void *)KASAN_SHADOW_START,
>>>                      (void *)kasan_mem_to_shadow((void *)
>>>                                  VMEMMAP_END));
>>> @@ -183,6 +187,7 @@ void __init kasan_init(void)
>>>              (void *)kasan_mem_to_shadow((void *)VMALLOC_START),
>>>              (void *)kasan_mem_to_shadow((void *)VMALLOC_END));
>>>
>>> +    /* Populate the linear mapping */
>>>      for_each_mem_range(i, &_start, &_end) {
>>>          void *start = (void *)__va(_start);
>>>          void *end = (void *)__va(_end);
>>> @@ -193,6 +198,10 @@ void __init kasan_init(void)
>>>          kasan_populate(kasan_mem_to_shadow(start), kasan_mem_to_shadow(end));
>>>      };
>>>
>>> +    /* Populate kernel, BPF, modules mapping */
>>> +    kasan_populate(kasan_mem_to_shadow((const void *)MODULES_VADDR),
>>> +               kasan_mem_to_shadow((const void *)BPF_JIT_REGION_END));
>>> +
>>>      for (i = 0; i < PTRS_PER_PTE; i++)
>>>          set_pte(&kasan_early_shadow_pte[i],
>>>              mk_pte(virt_to_page(kasan_early_shadow_page),
>>> diff --git a/arch/riscv/mm/physaddr.c b/arch/riscv/mm/physaddr.c
>>> index e8e4dcd39fed..35703d5ef5fd 100644
>>> --- a/arch/riscv/mm/physaddr.c
>>> +++ b/arch/riscv/mm/physaddr.c
>>> @@ -23,7 +23,7 @@ EXPORT_SYMBOL(__virt_to_phys);
>>>
>>>  phys_addr_t __phys_addr_symbol(unsigned long x)
>>>  {
>>> -    unsigned long kernel_start = (unsigned long)PAGE_OFFSET;
>>> +    unsigned long kernel_start = (unsigned long)kernel_virt_addr;
>>>      unsigned long kernel_end = (unsigned long)_end;
>>>
>>>      /*
>>
>> This is breaking boot for me with CONFIG_STRICT_KERNEL_RWX=n.  I'm not 
>> even really convinced that's a useful config to support, but it's 
>> currently optional and I'd prefer to avoid breaking it if possible.
>>
>> I can't quite figure out what's going on here and I'm pretty much 
>> tired out for tonight.  LMK if you don't have time to look at it and 
>> I'll try to give it another shot.
> 
> I'm taking a look at that.

Just to make sure you don't miss it, I fixed this issue in 
https://patchwork.kernel.org/project/linux-riscv/patch/20210415110426.2238-1-alex@ghiti.fr/
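
Separately from that fix, and only as an aid for reading the quoted pgtable.h and setup_bootmem() hunks, here is a rough user-space sketch of how the module and BPF regions are derived from the kernel image bounds, and of the PMD_SIZE rounding applied to the memblock reservation. The ex_* names are hypothetical, and the 4K page size, 2MB PMD size and image size are assumptions for illustration, not measured values.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes for a 64-bit kernel with 4K pages. */
#define EX_PAGE_SIZE         0x1000ULL
#define EX_PMD_SIZE          0x200000ULL             /* 2MB */
#define EX_PMD_MASK          (~(EX_PMD_SIZE - 1))
#define EX_SZ_2G             0x80000000ULL
#define EX_SZ_128M           0x08000000ULL
#define EX_ADDRESS_SPACE_END UINT64_MAX              /* UL(-1) */
#define EX_KERNEL_LINK_ADDR  (EX_ADDRESS_SPACE_END - EX_SZ_2G + 1)

struct ex_layout {
    uint64_t modules_vaddr; /* MODULES_VADDR */
    uint64_t modules_end;   /* MODULES_END */
    uint64_t bpf_start;     /* BPF_JIT_REGION_START */
    uint64_t bpf_end;       /* BPF_JIT_REGION_END */
};

/* Same rounding as PFN_ALIGN(): round up to the next page boundary. */
static uint64_t ex_pfn_align(uint64_t addr)
{
    return (addr + EX_PAGE_SIZE - 1) & ~(EX_PAGE_SIZE - 1);
}

/* Derive the regions from the virtual addresses of _start and _end. */
static struct ex_layout ex_compute_layout(uint64_t start_va, uint64_t end_va)
{
    struct ex_layout l;

    l.bpf_start = ex_pfn_align(end_va);              /* BPF right after the kernel */
    l.bpf_end = l.bpf_start + EX_SZ_128M;
    l.modules_vaddr = ex_pfn_align(end_va) - EX_SZ_2G; /* 2GB window ending at _end */
    l.modules_end = ex_pfn_align(start_va);          /* modules stop where the kernel starts */
    return l;
}

/* Size passed to memblock_reserve(): image size rounded up to PMD_SIZE. */
static uint64_t ex_reserve_size(uint64_t start, uint64_t end)
{
    return (end - start + EX_PMD_SIZE - 1) & EX_PMD_MASK;
}
```

With a made-up image of 0x1234567 bytes linked at KERNEL_LINK_ADDR, modules end exactly where the kernel starts and BPF begins on the first page boundary after _end, so the two regions cannot overlap, which is the property the commit message describes.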

Thanks,

Alex

> 
> Thanks,
> 
> Alex
> 


* Re: [PATCH v5 1/3] riscv: Move kernel mapping outside of linear mapping
  2021-04-15 18:00       ` Alex Ghiti
@ 2021-04-18 11:38         ` Alex Ghiti
  2021-06-10 16:39           ` Andreas Schwab
  0 siblings, 1 reply; 17+ messages in thread
From: Alex Ghiti @ 2021-04-18 11:38 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: corbet, Paul Walmsley, aou, Arnd Bergmann, aryabinin, glider,
	dvyukov, linux-doc, linux-riscv, linux-kernel, kasan-dev,
	linux-arch, linux-mm, Guenter Roeck

Hi Palmer,

On 4/15/21 at 2:00 PM, Alex Ghiti wrote:
> On 4/15/21 at 12:54 AM, Alex Ghiti wrote:
>> On 4/15/21 at 12:20 AM, Palmer Dabbelt wrote:
>>> On Sun, 11 Apr 2021 09:41:44 PDT (-0700), alex@ghiti.fr wrote:
>>>> This is a preparatory patch for relocatable kernel and sv48 support.
>>>>
>>>> The kernel used to be linked at PAGE_OFFSET address therefore we 
>>>> could use
>>>> the linear mapping for the kernel mapping. But the relocated kernel 
>>>> base
>>>> address will be different from PAGE_OFFSET and since in the linear 
>>>> mapping,
>>>> two different virtual addresses cannot point to the same physical 
>>>> address,
>>>> the kernel mapping needs to lie outside the linear mapping so that 
>>>> we don't
>>>> have to copy it at the same physical offset.
>>>>
>>>> The kernel mapping is moved to the last 2GB of the address space, BPF
>>>> is now always after the kernel and modules use the 2GB memory range 
>>>> right
>>>> before the kernel, so BPF and modules regions do not overlap. KASLR
>>>> implementation will simply have to move the kernel in the last 2GB 
>>>> range
>>>> and just take care of leaving enough space for BPF.
>>>>
>>>> In addition, by moving the kernel to the end of the address space, both
>>>> sv39 and sv48 kernels will be exactly the same without needing to be
>>>> relocated at runtime.
>>>>
>>>> Suggested-by: Arnd Bergmann <arnd@arndb.de>
>>>> Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
>>>> ---
>>>>  arch/riscv/boot/loader.lds.S        |  3 +-
>>>>  arch/riscv/include/asm/page.h       | 17 +++++-
>>>>  arch/riscv/include/asm/pgtable.h    | 37 ++++++++----
>>>>  arch/riscv/include/asm/set_memory.h |  1 +
>>>>  arch/riscv/kernel/head.S            |  3 +-
>>>>  arch/riscv/kernel/module.c          |  6 +-
>>>>  arch/riscv/kernel/setup.c           |  5 ++
>>>>  arch/riscv/kernel/vmlinux.lds.S     |  3 +-
>>>>  arch/riscv/mm/fault.c               | 13 +++++
>>>>  arch/riscv/mm/init.c                | 87 ++++++++++++++++++++++-------
>>>>  arch/riscv/mm/kasan_init.c          |  9 +++
>>>>  arch/riscv/mm/physaddr.c            |  2 +-
>>>>  12 files changed, 146 insertions(+), 40 deletions(-)
>>>>
>>>> diff --git a/arch/riscv/boot/loader.lds.S 
>>>> b/arch/riscv/boot/loader.lds.S
>>>> index 47a5003c2e28..62d94696a19c 100644
>>>> --- a/arch/riscv/boot/loader.lds.S
>>>> +++ b/arch/riscv/boot/loader.lds.S
>>>> @@ -1,13 +1,14 @@
>>>>  /* SPDX-License-Identifier: GPL-2.0 */
>>>>
>>>>  #include <asm/page.h>
>>>> +#include <asm/pgtable.h>
>>>>
>>>>  OUTPUT_ARCH(riscv)
>>>>  ENTRY(_start)
>>>>
>>>>  SECTIONS
>>>>  {
>>>> -    . = PAGE_OFFSET;
>>>> +    . = KERNEL_LINK_ADDR;
>>>>
>>>>      .payload : {
>>>>          *(.payload)
>>>> diff --git a/arch/riscv/include/asm/page.h 
>>>> b/arch/riscv/include/asm/page.h
>>>> index adc9d26f3d75..22cfb2be60dc 100644
>>>> --- a/arch/riscv/include/asm/page.h
>>>> +++ b/arch/riscv/include/asm/page.h
>>>> @@ -90,15 +90,28 @@ typedef struct page *pgtable_t;
>>>>
>>>>  #ifdef CONFIG_MMU
>>>>  extern unsigned long va_pa_offset;
>>>> +extern unsigned long va_kernel_pa_offset;
>>>>  extern unsigned long pfn_base;
>>>>  #define ARCH_PFN_OFFSET       (pfn_base)
>>>>  #else
>>>>  #define va_pa_offset        0
>>>> +#define va_kernel_pa_offset    0
>>>>  #define ARCH_PFN_OFFSET       (PAGE_OFFSET >> PAGE_SHIFT)
>>>>  #endif /* CONFIG_MMU */
>>>>
>>>> -#define __pa_to_va_nodebug(x)    ((void *)((unsigned long) (x) + 
>>>> va_pa_offset))
>>>> -#define __va_to_pa_nodebug(x)    ((unsigned long)(x) - va_pa_offset)
>>>> +extern unsigned long kernel_virt_addr;
>>>> +
>>>> +#define linear_mapping_pa_to_va(x)    ((void *)((unsigned long)(x) 
>>>> + va_pa_offset))
>>>> +#define kernel_mapping_pa_to_va(x)    ((void *)((unsigned long)(x) 
>>>> + va_kernel_pa_offset))
>>>> +#define __pa_to_va_nodebug(x)        linear_mapping_pa_to_va(x)
>>>> +
>>>> +#define linear_mapping_va_to_pa(x)    ((unsigned long)(x) - 
>>>> va_pa_offset)
>>>> +#define kernel_mapping_va_to_pa(x)    ((unsigned long)(x) - 
>>>> va_kernel_pa_offset)
>>>> +#define __va_to_pa_nodebug(x)    ({                        \
>>>> +    unsigned long _x = x;                            \
>>>> +    (_x < kernel_virt_addr) ?                        \
>>>> +        linear_mapping_va_to_pa(_x) : 
>>>> kernel_mapping_va_to_pa(_x);    \
>>>> +    })
>>>>
>>>>  #ifdef CONFIG_DEBUG_VIRTUAL
>>>>  extern phys_addr_t __virt_to_phys(unsigned long x);
>>>> diff --git a/arch/riscv/include/asm/pgtable.h 
>>>> b/arch/riscv/include/asm/pgtable.h
>>>> index ebf817c1bdf4..80e63a93e903 100644
>>>> --- a/arch/riscv/include/asm/pgtable.h
>>>> +++ b/arch/riscv/include/asm/pgtable.h
>>>> @@ -11,23 +11,30 @@
>>>>
>>>>  #include <asm/pgtable-bits.h>
>>>>
>>>> -#ifndef __ASSEMBLY__
>>>> -
>>>> -/* Page Upper Directory not used in RISC-V */
>>>> -#include <asm-generic/pgtable-nopud.h>
>>>> -#include <asm/page.h>
>>>> -#include <asm/tlbflush.h>
>>>> -#include <linux/mm_types.h>
>>>> +#ifndef CONFIG_MMU
>>>> +#define KERNEL_LINK_ADDR    PAGE_OFFSET
>>>> +#else
>>>>
>>>> -#ifdef CONFIG_MMU
>>>> +#define ADDRESS_SPACE_END    (UL(-1))
>>>> +/*
>>>> + * Leave 2GB for kernel and BPF at the end of the address space
>>>> + */
>>>> +#define KERNEL_LINK_ADDR    (ADDRESS_SPACE_END - SZ_2G + 1)
>>>>
>>>>  #define VMALLOC_SIZE     (KERN_VIRT_SIZE >> 1)
>>>>  #define VMALLOC_END      (PAGE_OFFSET - 1)
>>>>  #define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)
>>>>
>>>> +/* KASLR should leave at least 128MB for BPF after the kernel */
>>>>  #define BPF_JIT_REGION_SIZE    (SZ_128M)
>>>> -#define BPF_JIT_REGION_START    (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
>>>> -#define BPF_JIT_REGION_END    (VMALLOC_END)
>>>> +#define BPF_JIT_REGION_START    PFN_ALIGN((unsigned long)&_end)
>>>> +#define BPF_JIT_REGION_END    (BPF_JIT_REGION_START + 
>>>> BPF_JIT_REGION_SIZE)
>>>> +
>>>> +/* Modules always live before the kernel */
>>>> +#ifdef CONFIG_64BIT
>>>> +#define MODULES_VADDR    (PFN_ALIGN((unsigned long)&_end) - SZ_2G)
>>>> +#define MODULES_END    (PFN_ALIGN((unsigned long)&_start))
>>>> +#endif
>>>>
>>>>  /*
>>>>   * Roughly size the vmemmap space to be large enough to fit enough
>>>> @@ -57,9 +64,16 @@
>>>>  #define FIXADDR_SIZE     PGDIR_SIZE
>>>>  #endif
>>>>  #define FIXADDR_START    (FIXADDR_TOP - FIXADDR_SIZE)
>>>> -
>>>>  #endif
>>>>
>>>> +#ifndef __ASSEMBLY__
>>>> +
>>>> +/* Page Upper Directory not used in RISC-V */
>>>> +#include <asm-generic/pgtable-nopud.h>
>>>> +#include <asm/page.h>
>>>> +#include <asm/tlbflush.h>
>>>> +#include <linux/mm_types.h>
>>>> +
>>>>  #ifdef CONFIG_64BIT
>>>>  #include <asm/pgtable-64.h>
>>>>  #else
>>>> @@ -484,6 +498,7 @@ static inline int ptep_clear_flush_young(struct 
>>>> vm_area_struct *vma,
>>>>
>>>>  #define kern_addr_valid(addr)   (1) /* FIXME */
>>>>
>>>> +extern char _start[];
>>>>  extern void *dtb_early_va;
>>>>  extern uintptr_t dtb_early_pa;
>>>>  void setup_bootmem(void);
>>>> diff --git a/arch/riscv/include/asm/set_memory.h 
>>>> b/arch/riscv/include/asm/set_memory.h
>>>> index 6887b3d9f371..a9c56776fa0e 100644
>>>> --- a/arch/riscv/include/asm/set_memory.h
>>>> +++ b/arch/riscv/include/asm/set_memory.h
>>>> @@ -17,6 +17,7 @@ int set_memory_x(unsigned long addr, int numpages);
>>>>  int set_memory_nx(unsigned long addr, int numpages);
>>>>  int set_memory_rw_nx(unsigned long addr, int numpages);
>>>>  void protect_kernel_text_data(void);
>>>> +void protect_kernel_linear_mapping_text_rodata(void);
>>>>  #else
>>>>  static inline int set_memory_ro(unsigned long addr, int numpages) { 
>>>> return 0; }
>>>>  static inline int set_memory_rw(unsigned long addr, int numpages) { 
>>>> return 0; }
>>>> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
>>>> index f5a9bad86e58..6cb05f22e52a 100644
>>>> --- a/arch/riscv/kernel/head.S
>>>> +++ b/arch/riscv/kernel/head.S
>>>> @@ -69,7 +69,8 @@ pe_head_start:
>>>>  #ifdef CONFIG_MMU
>>>>  relocate:
>>>>      /* Relocate return address */
>>>> -    li a1, PAGE_OFFSET
>>>> +    la a1, kernel_virt_addr
>>>> +    REG_L a1, 0(a1)
>>>>      la a2, _start
>>>>      sub a1, a1, a2
>>>>      add ra, ra, a1
>>>> diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
>>>> index 104fba889cf7..ce153771e5e9 100644
>>>> --- a/arch/riscv/kernel/module.c
>>>> +++ b/arch/riscv/kernel/module.c
>>>> @@ -408,12 +408,10 @@ int apply_relocate_add(Elf_Shdr *sechdrs, 
>>>> const char *strtab,
>>>>  }
>>>>
>>>>  #if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
>>>> -#define VMALLOC_MODULE_START \
>>>> -     max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START)
>>>>  void *module_alloc(unsigned long size)
>>>>  {
>>>> -    return __vmalloc_node_range(size, 1, VMALLOC_MODULE_START,
>>>> -                    VMALLOC_END, GFP_KERNEL,
>>>> +    return __vmalloc_node_range(size, 1, MODULES_VADDR,
>>>> +                    MODULES_END, GFP_KERNEL,
>>>>                      PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
>>>>                      __builtin_return_address(0));
>>>>  }
>>>> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
>>>> index e85bacff1b50..30e4af0fd50c 100644
>>>> --- a/arch/riscv/kernel/setup.c
>>>> +++ b/arch/riscv/kernel/setup.c
>>>> @@ -265,6 +265,11 @@ void __init setup_arch(char **cmdline_p)
>>>>
>>>>      if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
>>>>          protect_kernel_text_data();
>>>> +
>>>> +#if defined(CONFIG_64BIT) && defined(CONFIG_MMU)
>>>> +    protect_kernel_linear_mapping_text_rodata();
>>>> +#endif
>>>> +
>>>>  #ifdef CONFIG_SWIOTLB
>>>>      swiotlb_init(1);
>>>>  #endif
>>>> diff --git a/arch/riscv/kernel/vmlinux.lds.S 
>>>> b/arch/riscv/kernel/vmlinux.lds.S
>>>> index de03cb22d0e9..0726c05e0336 100644
>>>> --- a/arch/riscv/kernel/vmlinux.lds.S
>>>> +++ b/arch/riscv/kernel/vmlinux.lds.S
>>>> @@ -4,7 +4,8 @@
>>>>   * Copyright (C) 2017 SiFive
>>>>   */
>>>>
>>>> -#define LOAD_OFFSET PAGE_OFFSET
>>>> +#include <asm/pgtable.h>
>>>> +#define LOAD_OFFSET KERNEL_LINK_ADDR
>>>>  #include <asm/vmlinux.lds.h>
>>>>  #include <asm/page.h>
>>>>  #include <asm/cache.h>
>>>> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
>>>> index 8f17519208c7..1b14d523a95c 100644
>>>> --- a/arch/riscv/mm/fault.c
>>>> +++ b/arch/riscv/mm/fault.c
>>>> @@ -231,6 +231,19 @@ asmlinkage void do_page_fault(struct pt_regs 
>>>> *regs)
>>>>          return;
>>>>      }
>>>>
>>>> +#ifdef CONFIG_64BIT
>>>> +    /*
>>>> +     * Modules in 64bit kernels lie in their own virtual region 
>>>> which is not
>>>> +     * in the vmalloc region, but dealing with page faults in this 
>>>> region
>>>> +     * or the vmalloc region amounts to doing the same thing: 
>>>> checking that
>>>> +     * the mapping exists in init_mm.pgd and updating user page 
>>>> table, so
>>>> +     * just use vmalloc_fault.
>>>> +     */
>>>> +    if (unlikely(addr >= MODULES_VADDR && addr < MODULES_END)) {
>>>> +        vmalloc_fault(regs, code, addr);
>>>> +        return;
>>>> +    }
>>>> +#endif
>>>>      /* Enable interrupts if they were enabled in the parent context. */
>>>>      if (likely(regs->status & SR_PIE))
>>>>          local_irq_enable();
>>>> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
>>>> index 7f5036fbee8c..093f3a96ecfc 100644
>>>> --- a/arch/riscv/mm/init.c
>>>> +++ b/arch/riscv/mm/init.c
>>>> @@ -25,6 +25,9 @@
>>>>
>>>>  #include "../kernel/head.h"
>>>>
>>>> +unsigned long kernel_virt_addr = KERNEL_LINK_ADDR;
>>>> +EXPORT_SYMBOL(kernel_virt_addr);
>>>> +
>>>>  unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]
>>>>                              __page_aligned_bss;
>>>>  EXPORT_SYMBOL(empty_zero_page);
>>>> @@ -88,6 +91,8 @@ static void print_vm_layout(void)
>>>>            (unsigned long)VMALLOC_END);
>>>>      print_mlm("lowmem", (unsigned long)PAGE_OFFSET,
>>>>            (unsigned long)high_memory);
>>>> +    print_mlm("kernel", (unsigned long)KERNEL_LINK_ADDR,
>>>> +          (unsigned long)ADDRESS_SPACE_END);
>>>>  }
>>>>  #else
>>>>  static void print_vm_layout(void) { }
>>>> @@ -116,8 +121,13 @@ void __init setup_bootmem(void)
>>>>      /* The maximal physical memory size is -PAGE_OFFSET. */
>>>>      memblock_enforce_memory_limit(-PAGE_OFFSET);
>>>>
>>>> -    /* Reserve from the start of the kernel to the end of the 
>>>> kernel */
>>>> -    memblock_reserve(vmlinux_start, vmlinux_end - vmlinux_start);
>>>> +    /*
>>>> +     * Reserve from the start of the kernel to the end of the kernel
>>>> +     * and make sure we align the reservation on PMD_SIZE since we will
>>>> +     * map the kernel in the linear mapping as read-only: we do not 
>>>> want
>>>> +     * any allocation to happen between _end and the next pmd 
>>>> aligned page.
>>>> +     */
>>>> +    memblock_reserve(vmlinux_start, (vmlinux_end - vmlinux_start + 
>>>> PMD_SIZE - 1) & PMD_MASK);
>>>>
>>>>      /*
>>>>       * memblock allocator is not aware of the fact that last 4K 
>>>> bytes of
>>>> @@ -152,8 +162,12 @@ void __init setup_bootmem(void)
>>>>  #ifdef CONFIG_MMU
>>>>  static struct pt_alloc_ops pt_ops;
>>>>
>>>> +/* Offset between linear mapping virtual address and kernel load 
>>>> address */
>>>>  unsigned long va_pa_offset;
>>>>  EXPORT_SYMBOL(va_pa_offset);
>>>> +/* Offset between kernel mapping virtual address and kernel load 
>>>> address */
>>>> +unsigned long va_kernel_pa_offset;
>>>> +EXPORT_SYMBOL(va_kernel_pa_offset);
>>>>  unsigned long pfn_base;
>>>>  EXPORT_SYMBOL(pfn_base);
>>>>
>>>> @@ -257,7 +271,7 @@ static pmd_t *get_pmd_virt_late(phys_addr_t pa)
>>>>
>>>>  static phys_addr_t __init alloc_pmd_early(uintptr_t va)
>>>>  {
>>>> -    BUG_ON((va - PAGE_OFFSET) >> PGDIR_SHIFT);
>>>> +    BUG_ON((va - kernel_virt_addr) >> PGDIR_SHIFT);
>>>>
>>>>      return (uintptr_t)early_pmd;
>>>>  }
>>>> @@ -372,17 +386,32 @@ static uintptr_t __init 
>>>> best_map_size(phys_addr_t base, phys_addr_t size)
>>>>  #error "setup_vm() is called from head.S before relocate so it 
>>>> should not use absolute addressing."
>>>>  #endif
>>>>
>>>> +uintptr_t load_pa, load_sz;
>>>> +
>>>> +static void __init create_kernel_page_table(pgd_t *pgdir, uintptr_t 
>>>> map_size)
>>>> +{
>>>> +    uintptr_t va, end_va;
>>>> +
>>>> +    end_va = kernel_virt_addr + load_sz;
>>>> +    for (va = kernel_virt_addr; va < end_va; va += map_size)
>>>> +        create_pgd_mapping(pgdir, va,
>>>> +                   load_pa + (va - kernel_virt_addr),
>>>> +                   map_size, PAGE_KERNEL_EXEC);
>>>> +}
>>>> +
>>>>  asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>>>  {
>>>> -    uintptr_t va, pa, end_va;
>>>> -    uintptr_t load_pa = (uintptr_t)(&_start);
>>>> -    uintptr_t load_sz = (uintptr_t)(&_end) - load_pa;
>>>> +    uintptr_t pa;
>>>>      uintptr_t map_size;
>>>>  #ifndef __PAGETABLE_PMD_FOLDED
>>>>      pmd_t fix_bmap_spmd, fix_bmap_epmd;
>>>>  #endif
>>>> +    load_pa = (uintptr_t)(&_start);
>>>> +    load_sz = (uintptr_t)(&_end) - load_pa;
>>>>
>>>>      va_pa_offset = PAGE_OFFSET - load_pa;
>>>> +    va_kernel_pa_offset = kernel_virt_addr - load_pa;
>>>> +
>>>>      pfn_base = PFN_DOWN(load_pa);
>>>>
>>>>      /*
>>>> @@ -410,26 +439,22 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>>>      create_pmd_mapping(fixmap_pmd, FIXADDR_START,
>>>>                 (uintptr_t)fixmap_pte, PMD_SIZE, PAGE_TABLE);
>>>>      /* Setup trampoline PGD and PMD */
>>>> -    create_pgd_mapping(trampoline_pg_dir, PAGE_OFFSET,
>>>> +    create_pgd_mapping(trampoline_pg_dir, kernel_virt_addr,
>>>>                 (uintptr_t)trampoline_pmd, PGDIR_SIZE, PAGE_TABLE);
>>>> -    create_pmd_mapping(trampoline_pmd, PAGE_OFFSET,
>>>> +    create_pmd_mapping(trampoline_pmd, kernel_virt_addr,
>>>>                 load_pa, PMD_SIZE, PAGE_KERNEL_EXEC);
>>>>  #else
>>>>      /* Setup trampoline PGD */
>>>> -    create_pgd_mapping(trampoline_pg_dir, PAGE_OFFSET,
>>>> +    create_pgd_mapping(trampoline_pg_dir, kernel_virt_addr,
>>>>                 load_pa, PGDIR_SIZE, PAGE_KERNEL_EXEC);
>>>>  #endif
>>>>
>>>>      /*
>>>> -     * Setup early PGD covering entire kernel which will allows
>>>> +     * Setup early PGD covering entire kernel which will allow
>>>>       * us to reach paging_init(). We map all memory banks later
>>>>       * in setup_vm_final() below.
>>>>       */
>>>> -    end_va = PAGE_OFFSET + load_sz;
>>>> -    for (va = PAGE_OFFSET; va < end_va; va += map_size)
>>>> -        create_pgd_mapping(early_pg_dir, va,
>>>> -                   load_pa + (va - PAGE_OFFSET),
>>>> -                   map_size, PAGE_KERNEL_EXEC);
>>>> +    create_kernel_page_table(early_pg_dir, map_size);
>>>>
>>>>  #ifndef __PAGETABLE_PMD_FOLDED
>>>>      /* Setup early PMD for DTB */
>>>> @@ -444,7 +469,12 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>>>                 pa + PMD_SIZE, PMD_SIZE, PAGE_KERNEL);
>>>>      dtb_early_va = (void *)DTB_EARLY_BASE_VA + (dtb_pa & (PMD_SIZE - 
>>>> 1));
>>>>  #else /* CONFIG_BUILTIN_DTB */
>>>> -    dtb_early_va = __va(dtb_pa);
>>>> +    /*
>>>> +     * __va can't be used since it would return a linear mapping 
>>>> address
>>>> +     * whereas dtb_early_va will be used before setup_vm_final 
>>>> installs
>>>> +     * the linear mapping.
>>>> +     */
>>>> +    dtb_early_va = kernel_mapping_pa_to_va(dtb_pa);
>>>>  #endif /* CONFIG_BUILTIN_DTB */
>>>>  #else
>>>>  #ifndef CONFIG_BUILTIN_DTB
>>>> @@ -456,7 +486,7 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>>>                 pa + PGDIR_SIZE, PGDIR_SIZE, PAGE_KERNEL);
>>>>      dtb_early_va = (void *)DTB_EARLY_BASE_VA + (dtb_pa & (PGDIR_SIZE 
>>>> - 1));
>>>>  #else /* CONFIG_BUILTIN_DTB */
>>>> -    dtb_early_va = __va(dtb_pa);
>>>> +    dtb_early_va = kernel_mapping_pa_to_va(dtb_pa);
>>>>  #endif /* CONFIG_BUILTIN_DTB */
>>>>  #endif
>>>>      dtb_early_pa = dtb_pa;
>>>> @@ -492,6 +522,22 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>>>>  #endif
>>>>  }
>>>>
>>>> +#ifdef CONFIG_64BIT
>>>> +void protect_kernel_linear_mapping_text_rodata(void)
>>>> +{
>>>> +    unsigned long text_start = (unsigned long)lm_alias(_start);
>>>> +    unsigned long init_text_start = (unsigned 
>>>> long)lm_alias(__init_text_begin);
>>>> +    unsigned long rodata_start = (unsigned 
>>>> long)lm_alias(__start_rodata);
>>>> +    unsigned long data_start = (unsigned long)lm_alias(_data);
>>>> +
>>>> +    set_memory_ro(text_start, (init_text_start - text_start) >> 
>>>> PAGE_SHIFT);
>>>> +    set_memory_nx(text_start, (init_text_start - text_start) >> 
>>>> PAGE_SHIFT);
>>>> +
>>>> +    set_memory_ro(rodata_start, (data_start - rodata_start) >> 
>>>> PAGE_SHIFT);
>>>> +    set_memory_nx(rodata_start, (data_start - rodata_start) >> 
>>>> PAGE_SHIFT);
>>>> +}
>>>> +#endif
>>>> +
>>>>  static void __init setup_vm_final(void)
>>>>  {
>>>>      uintptr_t va, map_size;
>>>> @@ -513,7 +559,7 @@ static void __init setup_vm_final(void)
>>>>                 __pa_symbol(fixmap_pgd_next),
>>>>                 PGDIR_SIZE, PAGE_TABLE);
>>>>
>>>> -    /* Map all memory banks */
>>>> +    /* Map all memory banks in the linear mapping */
>>>>      for_each_mem_range(i, &start, &end) {
>>>>          if (start >= end)
>>>>              break;
>>>> @@ -525,10 +571,13 @@ static void __init setup_vm_final(void)
>>>>          for (pa = start; pa < end; pa += map_size) {
>>>>              va = (uintptr_t)__va(pa);
>>>>              create_pgd_mapping(swapper_pg_dir, va, pa,
>>>> -                       map_size, PAGE_KERNEL_EXEC);
>>>> +                       map_size, PAGE_KERNEL);
>>>>          }
>>>>      }
>>>>
>>>> +    /* Map the kernel */
>>>> +    create_kernel_page_table(swapper_pg_dir, PMD_SIZE);
>>>> +
>>>>      /* Clear fixmap PTE and PMD mappings */
>>>>      clear_fixmap(FIX_PTE);
>>>>      clear_fixmap(FIX_PMD);
>>>> diff --git a/arch/riscv/mm/kasan_init.c b/arch/riscv/mm/kasan_init.c
>>>> index 2c39f0386673..28f4d52cf17e 100644
>>>> --- a/arch/riscv/mm/kasan_init.c
>>>> +++ b/arch/riscv/mm/kasan_init.c
>>>> @@ -171,6 +171,10 @@ void __init kasan_init(void)
>>>>      phys_addr_t _start, _end;
>>>>      u64 i;
>>>>
>>>> +    /*
>>>> +     * Populate all kernel virtual address space with 
>>>> kasan_early_shadow_page
>>>> +     * except for the linear mapping and the modules/kernel/BPF 
>>>> mapping.
>>>> +     */
>>>>      kasan_populate_early_shadow((void *)KASAN_SHADOW_START,
>>>>                      (void *)kasan_mem_to_shadow((void *)
>>>>                                  VMEMMAP_END));
>>>> @@ -183,6 +187,7 @@ void __init kasan_init(void)
>>>>              (void *)kasan_mem_to_shadow((void *)VMALLOC_START),
>>>>              (void *)kasan_mem_to_shadow((void *)VMALLOC_END));
>>>>
>>>> +    /* Populate the linear mapping */
>>>>      for_each_mem_range(i, &_start, &_end) {
>>>>          void *start = (void *)__va(_start);
>>>>          void *end = (void *)__va(_end);
>>>> @@ -193,6 +198,10 @@ void __init kasan_init(void)
>>>>          kasan_populate(kasan_mem_to_shadow(start), 
>>>> kasan_mem_to_shadow(end));
>>>>      };
>>>>
>>>> +    /* Populate kernel, BPF, modules mapping */
>>>> +    kasan_populate(kasan_mem_to_shadow((const void *)MODULES_VADDR),
>>>> +               kasan_mem_to_shadow((const void *)BPF_JIT_REGION_END));
>>>> +
>>>>      for (i = 0; i < PTRS_PER_PTE; i++)
>>>>          set_pte(&kasan_early_shadow_pte[i],
>>>>              mk_pte(virt_to_page(kasan_early_shadow_page),
>>>> diff --git a/arch/riscv/mm/physaddr.c b/arch/riscv/mm/physaddr.c
>>>> index e8e4dcd39fed..35703d5ef5fd 100644
>>>> --- a/arch/riscv/mm/physaddr.c
>>>> +++ b/arch/riscv/mm/physaddr.c
>>>> @@ -23,7 +23,7 @@ EXPORT_SYMBOL(__virt_to_phys);
>>>>
>>>>  phys_addr_t __phys_addr_symbol(unsigned long x)
>>>>  {
>>>> -    unsigned long kernel_start = (unsigned long)PAGE_OFFSET;
>>>> +    unsigned long kernel_start = (unsigned long)kernel_virt_addr;
>>>>      unsigned long kernel_end = (unsigned long)_end;
>>>>
>>>>      /*
>>>
>>> This is breaking boot for me with CONFIG_STRICT_KERNEL_RWX=n.  I'm 
>>> not even really convinced that's a useful config to support, but it's 
>>> currently optional and I'd prefer to avoid breaking it if possible.
>>>
>>> I can't quite figure out what's going on here and I'm pretty much 
>>> tired out for tonight.  LMK if you don't have time to look at it and 
>>> I'll try to give it another shot.
>>
>> I'm taking a look at that.
> 
> Just to make sure you don't miss it, I fixed this issue in 
> https://patchwork.kernel.org/project/linux-riscv/patch/20210415110426.2238-1-alex@ghiti.fr/

Guenter reported that this patchset broke the 32-bit kernel build, which
I had neglected. So I pushed 2 more fixes for this, each with a Fixes
tag, which can be squashed into the series if you prefer.

To sum up, there are 3 patches that fix this series:

https://patchwork.kernel.org/project/linux-riscv/patch/20210415110426.2238-1-alex@ghiti.fr/

https://patchwork.kernel.org/project/linux-riscv/patch/20210417172159.32085-1-alex@ghiti.fr/

https://patchwork.kernel.org/project/linux-riscv/patch/20210418112856.15078-1-alex@ghiti.fr/

Sorry for that,

Thanks,

Alex

> 
> 
> Thanks,
> 
> Alex
> 
>>
>> Thanks,
>>
>> Alex
>>
>> _______________________________________________
>> linux-riscv mailing list
>> linux-riscv@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-riscv
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 1/3] riscv: Move kernel mapping outside of linear mapping
  2021-04-18 11:38         ` Alex Ghiti
@ 2021-06-10 16:39           ` Andreas Schwab
  2021-06-10 17:10             ` Guenter Roeck
  0 siblings, 1 reply; 17+ messages in thread
From: Andreas Schwab @ 2021-06-10 16:39 UTC (permalink / raw)
  To: Alex Ghiti
  Cc: Palmer Dabbelt, corbet, Paul Walmsley, aou, Arnd Bergmann,
	aryabinin, glider, dvyukov, linux-doc, linux-riscv, linux-kernel,
	kasan-dev, linux-arch, linux-mm, Guenter Roeck

On Apr 18 2021, Alex Ghiti wrote:

> To sum up, there are 3 patches that fix this series:
>
> https://patchwork.kernel.org/project/linux-riscv/patch/20210415110426.2238-1-alex@ghiti.fr/
>
> https://patchwork.kernel.org/project/linux-riscv/patch/20210417172159.32085-1-alex@ghiti.fr/
>
> https://patchwork.kernel.org/project/linux-riscv/patch/20210418112856.15078-1-alex@ghiti.fr/

Has this been fixed yet?  Booting is still broken here.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 1/3] riscv: Move kernel mapping outside of linear mapping
  2021-06-10 16:39           ` Andreas Schwab
@ 2021-06-10 17:10             ` Guenter Roeck
  2021-06-10 17:11               ` Andreas Schwab
  0 siblings, 1 reply; 17+ messages in thread
From: Guenter Roeck @ 2021-06-10 17:10 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Alex Ghiti, Palmer Dabbelt, corbet, Paul Walmsley, aou,
	Arnd Bergmann, aryabinin, glider, dvyukov, linux-doc,
	linux-riscv, linux-kernel, kasan-dev, linux-arch, linux-mm

On Thu, Jun 10, 2021 at 06:39:39PM +0200, Andreas Schwab wrote:
> On Apr 18 2021, Alex Ghiti wrote:
> 
> > To sum up, there are 3 patches that fix this series:
> >
> > https://patchwork.kernel.org/project/linux-riscv/patch/20210415110426.2238-1-alex@ghiti.fr/
> >
> > https://patchwork.kernel.org/project/linux-riscv/patch/20210417172159.32085-1-alex@ghiti.fr/
> >
> > https://patchwork.kernel.org/project/linux-riscv/patch/20210418112856.15078-1-alex@ghiti.fr/
> 
> Has this been fixed yet?  Booting is still broken here.
> 

In -next ? riscv32 doesn't even build for me there, and riscv64 images
generate warnings and/or don't boot (but that doesn't seem to be riscv
related, at least at first glance).

Guenter

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 1/3] riscv: Move kernel mapping outside of linear mapping
  2021-06-10 17:10             ` Guenter Roeck
@ 2021-06-10 17:11               ` Andreas Schwab
  2021-06-10 17:20                 ` Guenter Roeck
  0 siblings, 1 reply; 17+ messages in thread
From: Andreas Schwab @ 2021-06-10 17:11 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Alex Ghiti, Palmer Dabbelt, corbet, Paul Walmsley, aou,
	Arnd Bergmann, aryabinin, glider, dvyukov, linux-doc,
	linux-riscv, linux-kernel, kasan-dev, linux-arch, linux-mm

On Jun 10 2021, Guenter Roeck wrote:

> On Thu, Jun 10, 2021 at 06:39:39PM +0200, Andreas Schwab wrote:
>> On Apr 18 2021, Alex Ghiti wrote:
>> 
>> > To sum up, there are 3 patches that fix this series:
>> >
>> > https://patchwork.kernel.org/project/linux-riscv/patch/20210415110426.2238-1-alex@ghiti.fr/
>> >
>> > https://patchwork.kernel.org/project/linux-riscv/patch/20210417172159.32085-1-alex@ghiti.fr/
>> >
>> > https://patchwork.kernel.org/project/linux-riscv/patch/20210418112856.15078-1-alex@ghiti.fr/
>> 
>> Has this been fixed yet?  Booting is still broken here.
>> 
>
> In -next ?

No, -rc5.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 1/3] riscv: Move kernel mapping outside of linear mapping
  2021-06-10 17:11               ` Andreas Schwab
@ 2021-06-10 17:20                 ` Guenter Roeck
  2021-06-10 17:29                   ` Andreas Schwab
  0 siblings, 1 reply; 17+ messages in thread
From: Guenter Roeck @ 2021-06-10 17:20 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Alex Ghiti, Palmer Dabbelt, corbet, Paul Walmsley, aou,
	Arnd Bergmann, aryabinin, glider, dvyukov, linux-doc,
	linux-riscv, linux-kernel, kasan-dev, linux-arch, linux-mm

On Thu, Jun 10, 2021 at 07:11:38PM +0200, Andreas Schwab wrote:
> On Jun 10 2021, Guenter Roeck wrote:
> 
> > On Thu, Jun 10, 2021 at 06:39:39PM +0200, Andreas Schwab wrote:
> >> On Apr 18 2021, Alex Ghiti wrote:
> >> 
> >> > To sum up, there are 3 patches that fix this series:
> >> >
> >> > https://patchwork.kernel.org/project/linux-riscv/patch/20210415110426.2238-1-alex@ghiti.fr/
> >> >
> >> > https://patchwork.kernel.org/project/linux-riscv/patch/20210417172159.32085-1-alex@ghiti.fr/
> >> >
> >> > https://patchwork.kernel.org/project/linux-riscv/patch/20210418112856.15078-1-alex@ghiti.fr/
> >> 
> >> Has this been fixed yet?  Booting is still broken here.
> >> 
> >
> > In -next ?
> 
> No, -rc5.
> 
Booting v5.13-rc5 in qemu works for me for riscv32 and riscv64,
but of course that doesn't mean much. Just wondering, not knowing
the context - did you provide details ?

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 1/3] riscv: Move kernel mapping outside of linear mapping
  2021-06-10 17:20                 ` Guenter Roeck
@ 2021-06-10 17:29                   ` Andreas Schwab
  2021-06-11 11:00                     ` Guenter Roeck
  0 siblings, 1 reply; 17+ messages in thread
From: Andreas Schwab @ 2021-06-10 17:29 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Alex Ghiti, Palmer Dabbelt, corbet, Paul Walmsley, aou,
	Arnd Bergmann, aryabinin, glider, dvyukov, linux-doc,
	linux-riscv, linux-kernel, kasan-dev, linux-arch, linux-mm

On Jun 10 2021, Guenter Roeck wrote:

> On Thu, Jun 10, 2021 at 07:11:38PM +0200, Andreas Schwab wrote:
>> On Jun 10 2021, Guenter Roeck wrote:
>> 
>> > On Thu, Jun 10, 2021 at 06:39:39PM +0200, Andreas Schwab wrote:
>> >> On Apr 18 2021, Alex Ghiti wrote:
>> >> 
>> >> > To sum up, there are 3 patches that fix this series:
>> >> >
>> >> > https://patchwork.kernel.org/project/linux-riscv/patch/20210415110426.2238-1-alex@ghiti.fr/
>> >> >
>> >> > https://patchwork.kernel.org/project/linux-riscv/patch/20210417172159.32085-1-alex@ghiti.fr/
>> >> >
>> >> > https://patchwork.kernel.org/project/linux-riscv/patch/20210418112856.15078-1-alex@ghiti.fr/
>> >> 
>> >> Has this been fixed yet?  Booting is still broken here.
>> >> 
>> >
>> > In -next ?
>> 
>> No, -rc5.
>> 
> Booting v5.13-rc5 in qemu works for me for riscv32 and riscv64,
> but of course that doesn't mean much. Just wondering, not knowing
> the context - did you provide details ?

Does that work for you:

https://github.com/openSUSE/kernel-source/blob/master/config/riscv64/default

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 1/3] riscv: Move kernel mapping outside of linear mapping
  2021-06-10 17:29                   ` Andreas Schwab
@ 2021-06-11 11:00                     ` Guenter Roeck
  2021-06-17  2:58                       ` Palmer Dabbelt
  0 siblings, 1 reply; 17+ messages in thread
From: Guenter Roeck @ 2021-06-11 11:00 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Alex Ghiti, Palmer Dabbelt, corbet, Paul Walmsley, aou,
	Arnd Bergmann, aryabinin, glider, dvyukov, linux-doc,
	linux-riscv, linux-kernel, kasan-dev, linux-arch, linux-mm

On Thu, Jun 10, 2021 at 07:29:15PM +0200, Andreas Schwab wrote:
> On Jun 10 2021, Guenter Roeck wrote:
> 
> > On Thu, Jun 10, 2021 at 07:11:38PM +0200, Andreas Schwab wrote:
> >> On Jun 10 2021, Guenter Roeck wrote:
> >> 
> >> > On Thu, Jun 10, 2021 at 06:39:39PM +0200, Andreas Schwab wrote:
> >> >> On Apr 18 2021, Alex Ghiti wrote:
> >> >> 
> >> >> > To sum up, there are 3 patches that fix this series:
> >> >> >
> >> >> > https://patchwork.kernel.org/project/linux-riscv/patch/20210415110426.2238-1-alex@ghiti.fr/
> >> >> >
> >> >> > https://patchwork.kernel.org/project/linux-riscv/patch/20210417172159.32085-1-alex@ghiti.fr/
> >> >> >
> >> >> > https://patchwork.kernel.org/project/linux-riscv/patch/20210418112856.15078-1-alex@ghiti.fr/
> >> >> 
> >> >> Has this been fixed yet?  Booting is still broken here.
> >> >> 
> >> >
> >> > In -next ?
> >> 
> >> No, -rc5.
> >> 
> > Booting v5.13-rc5 in qemu works for me for riscv32 and riscv64,
> > but of course that doesn't mean much. Just wondering, not knowing
> > the context - did you provide details ?
> 
> Does that work for you:
> 
> https://github.com/openSUSE/kernel-source/blob/master/config/riscv64/default
> 

That isn't an upstream kernel configuration; it looks like it includes SUSE
patches. But, yes, it does crash almost immediately if I build an upstream
kernel based on it and try to run that kernel in qemu. I did not try to
track it down further; after all, it might just be that the configuration
is inappropriate for use with qemu. But the configuration isn't really
what I had asked.

Guenter

* Re: [PATCH v5 1/3] riscv: Move kernel mapping outside of linear mapping
  2021-06-11 11:00                     ` Guenter Roeck
@ 2021-06-17  2:58                       ` Palmer Dabbelt
  2021-06-17  9:14                         ` Andreas Schwab
  0 siblings, 1 reply; 17+ messages in thread
From: Palmer Dabbelt @ 2021-06-17  2:58 UTC (permalink / raw)
  To: schwab, linux
  Cc: schwab, alex, corbet, Paul Walmsley, aou, Arnd Bergmann,
	aryabinin, glider, dvyukov, linux-doc, linux-riscv, linux-kernel,
	kasan-dev, linux-arch, linux-mm

On Fri, 11 Jun 2021 04:00:19 PDT (-0700), linux@roeck-us.net wrote:
> On Thu, Jun 10, 2021 at 07:29:15PM +0200, Andreas Schwab wrote:
>> On Jun 10 2021, Guenter Roeck wrote:
>>
>> > On Thu, Jun 10, 2021 at 07:11:38PM +0200, Andreas Schwab wrote:
>> >> On Jun 10 2021, Guenter Roeck wrote:
>> >>
>> >> > On Thu, Jun 10, 2021 at 06:39:39PM +0200, Andreas Schwab wrote:
>> >> >> On Apr 18 2021, Alex Ghiti wrote:
>> >> >>
>> >> >> > To sum up, there are 3 patches that fix this series:
>> >> >> >
>> >> >> > https://patchwork.kernel.org/project/linux-riscv/patch/20210415110426.2238-1-alex@ghiti.fr/
>> >> >> >
>> >> >> > https://patchwork.kernel.org/project/linux-riscv/patch/20210417172159.32085-1-alex@ghiti.fr/
>> >> >> >
>> >> >> > https://patchwork.kernel.org/project/linux-riscv/patch/20210418112856.15078-1-alex@ghiti.fr/
>> >> >>
>> >> >> Has this been fixed yet?  Booting is still broken here.
>> >> >>
>> >> >
>> >> > In -next ?
>> >>
>> >> No, -rc5.
>> >>
>> > Booting v5.13-rc5 in qemu works for me for riscv32 and riscv64,
>> > but of course that doesn't mean much. Just wondering, not knowing
>> > the context - did you provide details ?
>>
>> Does that work for you:
>>
>> https://github.com/openSUSE/kernel-source/blob/master/config/riscv64/default
>>
>
> That isn't an upstream kernel configuration; it looks like it includes SUSE
> patches. But, yes, it does crash almost immediately if I build an upstream
> kernel based on it and try to run that kernel in qemu. I did not try to
> track it down further; after all, it might just be that the configuration
> is inappropriate for use with qemu. But the configuration isn't really
> what I had asked.

This seems a long way off from defconfig.  It's entirely possible I'm
missing something, but at least CONFIG_SOC_VIRT is jumping out as
something that's disabled in the SUSE config but enabled upstream.  That
alone shouldn't actually do anything, but it does ensure we have all the
drivers necessary to boot on QEMU.
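
For anyone who wants to see which options differ between two configs, here
is a minimal sketch in Python.  The function names and the sample fragments
are made up for illustration (they are not the real defconfig or SUSE
config), and options absent from a file are treated as "n", which is a
simplification.  The kernel tree also ships scripts/diffconfig for this job.

```python
import re

# Matches "CONFIG_FOO=value" and "# CONFIG_FOO is not set" lines.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return {option: value} for a kernel .config; unset options map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def diff_configs(a_text, b_text):
    """Return {option: (value_in_a, value_in_b)} for options that differ.

    Options missing from one file are assumed to be 'n' (a simplification).
    """
    a, b = parse_config(a_text), parse_config(b_text)
    return {
        name: (a.get(name, "n"), b.get(name, "n"))
        for name in sorted(set(a) | set(b))
        if a.get(name, "n") != b.get(name, "n")
    }


if __name__ == "__main__":
    # Hypothetical fragments, only to show the output format.
    upstream = "CONFIG_SOC_VIRT=y\nCONFIG_RTC_DRV_GOLDFISH=y\n"
    distro = "# CONFIG_SOC_VIRT is not set\nCONFIG_RTC_DRV_GOLDFISH=m\n"
    for name, (up, ds) in diff_configs(upstream, distro).items():
        print(f"{name}: upstream={up} distro={ds}")
```

Feeding it the two real files would list every differing option, which is
usually a quicker starting point than reading both configs side by side.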

It's entirely possible there's a real bug here, though, as I don't
really see what these relocatable patches would have to do with that.

* Re: [PATCH v5 1/3] riscv: Move kernel mapping outside of linear mapping
  2021-06-17  2:58                       ` Palmer Dabbelt
@ 2021-06-17  9:14                         ` Andreas Schwab
  2021-07-01  2:59                           ` Palmer Dabbelt
  0 siblings, 1 reply; 17+ messages in thread
From: Andreas Schwab @ 2021-06-17  9:14 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: linux, alex, corbet, Paul Walmsley, aou, Arnd Bergmann,
	aryabinin, glider, dvyukov, linux-doc, linux-riscv, linux-kernel,
	kasan-dev, linux-arch, linux-mm

On Jun 16 2021, Palmer Dabbelt wrote:

> This seems a long way off from defconfig.  It's entirely possible I'm
> missing something, but at least CONFIG_SOC_VIRT is jumping out as 
> something that's disabled in the SUSE config but enabled upstream.

None of the SOC configs are really needed, they are just convenience.
They can even be harmful, if they force a config to y if m is actually
wanted.  Which is what happens with SOC_VIRT, which forces
RTC_DRV_GOLDFISH to y.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

* Re: [PATCH v5 1/3] riscv: Move kernel mapping outside of linear mapping
  2021-06-17  9:14                         ` Andreas Schwab
@ 2021-07-01  2:59                           ` Palmer Dabbelt
  0 siblings, 0 replies; 17+ messages in thread
From: Palmer Dabbelt @ 2021-07-01  2:59 UTC (permalink / raw)
  To: schwab
  Cc: linux, alex, corbet, Paul Walmsley, aou, Arnd Bergmann,
	aryabinin, glider, dvyukov, linux-doc, linux-riscv, linux-kernel,
	kasan-dev, linux-arch, linux-mm

On Thu, 17 Jun 2021 02:14:48 PDT (-0700), schwab@linux-m68k.org wrote:
> On Jun 16 2021, Palmer Dabbelt wrote:
>
>> This seems a long way off from defconfig.  It's entirely possible I'm
>> missing something, but at least CONFIG_SOC_VIRT is jumping out as
>> something that's disabled in the SUSE config but enabled upstream.
>
> None of the SOC configs are really needed, they are just convenience.
> They can even be harmful, if they force a config to y if m is actually
> wanted.  Which is what happens with SOC_VIRT, which forces
> RTC_DRV_GOLDFISH to y.

Ya, in retrospect the SOC configs were really just a bad idea.  I think 
we've talked about removing them before as they break stuff, I just 
haven't gotten around to doing it.

Thread overview: 17+ messages
2021-04-11 16:41 [PATCH v5 0/3] Move kernel mapping outside the linear mapping Alexandre Ghiti
2021-04-11 16:41 ` [PATCH v5 1/3] riscv: Move kernel mapping outside of " Alexandre Ghiti
2021-04-15  4:20   ` Palmer Dabbelt
2021-04-15  4:54     ` Alex Ghiti
2021-04-15 18:00       ` Alex Ghiti
2021-04-18 11:38         ` Alex Ghiti
2021-06-10 16:39           ` Andreas Schwab
2021-06-10 17:10             ` Guenter Roeck
2021-06-10 17:11               ` Andreas Schwab
2021-06-10 17:20                 ` Guenter Roeck
2021-06-10 17:29                   ` Andreas Schwab
2021-06-11 11:00                     ` Guenter Roeck
2021-06-17  2:58                       ` Palmer Dabbelt
2021-06-17  9:14                         ` Andreas Schwab
2021-07-01  2:59                           ` Palmer Dabbelt
2021-04-11 16:41 ` [PATCH v5 2/3] Documentation: riscv: Add documentation that describes the VM layout Alexandre Ghiti
2021-04-11 16:41 ` [PATCH v5 3/3] riscv: Prepare ptdump for vm layout dynamic addresses Alexandre Ghiti