* [PATCH -fixes v2 0/4] Fixes KASAN and other along the way
@ 2022-02-21 16:12 ` Alexandre Ghiti
  0 siblings, 0 replies; 18+ messages in thread
From: Alexandre Ghiti @ 2022-02-21 16:12 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
	Alexandre Ghiti, Aleksandr Nogikh, Nick Hu, linux-riscv,
	linux-kernel, kasan-dev

As reported by Aleksandr, syzbot on riscv has been broken since commit
54c5639d8f50 ("riscv: Fix asan-stack clang build"). That commit actually
breaks KASAN_INLINE, which this series does not fix; a fix will follow
once the cause is found.

Nevertheless, this series fixes several small issues that made the syzbot
configuration + KASAN_OUTLINE fail to boot.

Note that even though the config at [1] boots fine with this series, I
was not able to boot the small config at [2], which fails because
kasan_poison receives a really weird address, 0x4075706301000000 (maybe a
KASAN person could provide some hint about what happens below in
do_ctors -> __asan_register_globals):

Thread 2 hit Breakpoint 1, kasan_poison (addr=<optimized out>, size=<optimized out>, value=<optimized out>, init=<optimized out>) at /home/alex/work/linux/mm/kasan/shadow.c:90
90		if (WARN_ON((unsigned long)addr & KASAN_GRANULE_MASK))
1: x/i $pc
=> 0xffffffff80261712 <kasan_poison>:	andi	a4,a0,7
5: /x $a0 = 0x4075706301000000

Thread 2 hit Breakpoint 2, handle_exception () at /home/alex/work/linux/arch/riscv/kernel/entry.S:27
27		csrrw tp, CSR_SCRATCH, tp
1: x/i $pc
=> 0xffffffff80004098 <handle_exception>:	csrrw	tp,sscratch,tp
5: /x $a0 = 0xe80eae0b60200000
(gdb) bt
#0  handle_exception () at /home/alex/work/linux/arch/riscv/kernel/entry.S:27
#1  0xffffffff80261746 in kasan_poison (addr=<optimized out>, size=<optimized out>, value=<optimized out>, init=<optimized out>)
    at /home/alex/work/linux/mm/kasan/shadow.c:98
#2  0xffffffff802618b4 in kasan_unpoison (addr=<optimized out>, size=<optimized out>, init=<optimized out>)
    at /home/alex/work/linux/mm/kasan/shadow.c:138
#3  0xffffffff80260876 in register_global (global=<optimized out>) at /home/alex/work/linux/mm/kasan/generic.c:214
#4  __asan_register_globals (globals=<optimized out>, size=<optimized out>) at /home/alex/work/linux/mm/kasan/generic.c:226
#5  0xffffffff8125efac in _sub_I_65535_1 ()
#6  0xffffffff81201b32 in do_ctors () at /home/alex/work/linux/init/main.c:1156
#7  do_basic_setup () at /home/alex/work/linux/init/main.c:1407
#8  kernel_init_freeable () at /home/alex/work/linux/init/main.c:1613
#9  0xffffffff81153ddc in kernel_init (unused=<optimized out>) at /home/alex/work/linux/init/main.c:1502
#10 0xffffffff800041c0 in handle_exception () at /home/alex/work/linux/arch/riscv/kernel/entry.S:231
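
For anyone reading the trace above: generic KASAN maps an address to its
shadow byte as (addr >> 3) + KASAN_SHADOW_OFFSET. The user-space sketch
below applies that formula to the value seen in kasan_poison. The offset
0xdfffffff00000000 is an assumption, not something stated in this thread;
it is used here because it makes the two a0 values printed by gdb agree
with each other. Under that assumption, 0x4075706301000000 passes the
granule alignment check at shadow.c:90, and its shadow address is
0xe80eae0b60200000, the a0 value at handle_exception.

```c
#include <assert.h>
#include <stdint.h>

/* Generic KASAN uses one shadow byte per 8-byte granule. */
#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_GRANULE_MASK ((1ULL << KASAN_SHADOW_SCALE_SHIFT) - 1)

/*
 * ASSUMPTION: not taken from the thread. Chosen because it makes the two
 * a0 values in the gdb output consistent with each other.
 */
#define ASSUMED_KASAN_SHADOW_OFFSET 0xdfffffff00000000ULL

/* Address of the shadow byte that describes addr. */
uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + ASSUMED_KASAN_SHADOW_OFFSET;
}

/* Mirrors the alignment test behind the WARN_ON in kasan_poison. */
int is_granule_aligned(uint64_t addr)
{
	return (addr & KASAN_GRANULE_MASK) == 0;
}

/* Checks the two a0 values from the gdb session against each other. */
void check_trace_values(void)
{
	const uint64_t weird = 0x4075706301000000ULL;

	assert(is_granule_aligned(weird)); /* low three bits are zero */
	assert(mem_to_shadow(weird) == 0xe80eae0b60200000ULL);
}
```

If the assumed offset is right, the exception is taken while touching the
shadow of the bogus address, so the open question remains where
register_global got that address from.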


Thanks again to Aleksandr for narrowing down the issues fixed here.


[1] https://gist.github.com/a-nogikh/279c85c2d24f47efcc3e865c08844138
[2] https://gist.github.com/AlexGhiti/a5a0cab0227e2bf38f9d12232591c0e4


Changes in v2:
- Fix kernel test robot failure regarding KERN_VIRT_SIZE that is
  undefined for nommu config

Alexandre Ghiti (4):
  riscv: Fix is_linear_mapping with recent move of KASAN region
  riscv: Fix config KASAN && SPARSEMEM && !SPARSE_VMEMMAP
  riscv: Fix DEBUG_VIRTUAL false warnings
  riscv: Fix config KASAN && DEBUG_VIRTUAL

 arch/riscv/include/asm/page.h    | 2 +-
 arch/riscv/include/asm/pgtable.h | 1 +
 arch/riscv/mm/Makefile           | 3 +++
 arch/riscv/mm/kasan_init.c       | 3 +--
 arch/riscv/mm/physaddr.c         | 4 +---
 5 files changed, 7 insertions(+), 6 deletions(-)

-- 
2.32.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH -fixes v2 1/4] riscv: Fix is_linear_mapping with recent move of KASAN region
  2022-02-21 16:12 ` Alexandre Ghiti
@ 2022-02-21 16:12   ` Alexandre Ghiti
  -1 siblings, 0 replies; 18+ messages in thread
From: Alexandre Ghiti @ 2022-02-21 16:12 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
	Alexandre Ghiti, Aleksandr Nogikh, Nick Hu, linux-riscv,
	linux-kernel, kasan-dev

The KASAN region was recently moved between the linear mapping and the
kernel mapping. is_linear_mapping used to check the validity of an
address against the start of the kernel mapping, which is now wrong.

Fix this by bounding the check with the maximum size of the physical
memory (KERN_VIRT_SIZE) instead.

Fixes: f7ae02333d13 ("riscv: Move KASAN mapping next to the kernel mapping")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
---
 arch/riscv/include/asm/page.h    | 2 +-
 arch/riscv/include/asm/pgtable.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h
index 160e3a1e8f8b..004372f8da54 100644
--- a/arch/riscv/include/asm/page.h
+++ b/arch/riscv/include/asm/page.h
@@ -119,7 +119,7 @@ extern phys_addr_t phys_ram_base;
 	((x) >= kernel_map.virt_addr && (x) < (kernel_map.virt_addr + kernel_map.size))
 
 #define is_linear_mapping(x)	\
-	((x) >= PAGE_OFFSET && (!IS_ENABLED(CONFIG_64BIT) || (x) < kernel_map.virt_addr))
+	((x) >= PAGE_OFFSET && (!IS_ENABLED(CONFIG_64BIT) || (x) < PAGE_OFFSET + KERN_VIRT_SIZE))
 
 #define linear_mapping_pa_to_va(x)	((void *)((unsigned long)(x) + kernel_map.va_pa_offset))
 #define kernel_mapping_pa_to_va(y)	({						\
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 7e949f25c933..e3549e50de95 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -13,6 +13,7 @@
 
 #ifndef CONFIG_MMU
 #define KERNEL_LINK_ADDR	PAGE_OFFSET
+#define KERN_VIRT_SIZE		(UL(-1))
 #else
 
 #define ADDRESS_SPACE_END	(UL(-1))
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread
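
The change to is_linear_mapping can be tried out in user space. The sketch
below covers only the 64-bit case and uses an illustrative layout: the
three constants are assumptions rather than values read from a real
configuration, and KERNEL_VIRT_ADDR merely stands in for
kernel_map.virt_addr. It shows that an address lying between the two
mappings, where the KASAN region now sits, is accepted by the old bound
and rejected by the new one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: illustrative 64-bit layout, not read from a real config. */
#define PAGE_OFFSET      0xffffaf8000000000ULL /* start of the linear mapping */
#define KERN_VIRT_SIZE   0x0000200000000000ULL /* max physical memory covered */
#define KERNEL_VIRT_ADDR 0xffffffff80000000ULL /* stands in for kernel_map.virt_addr */

/* Before the patch: the upper bound is the start of the kernel mapping. */
bool is_linear_mapping_old(uint64_t x)
{
	return x >= PAGE_OFFSET && x < KERNEL_VIRT_ADDR;
}

/* After the patch: the upper bound is the end of the linear mapping. */
bool is_linear_mapping_new(uint64_t x)
{
	return x >= PAGE_OFFSET && x < PAGE_OFFSET + KERN_VIRT_SIZE;
}

void check_is_linear_mapping(void)
{
	const uint64_t linear = PAGE_OFFSET + 0x1000;        /* inside the linear mapping */
	const uint64_t between = 0xfffff5eeffffc800ULL;      /* between the two mappings */
	const uint64_t kernel = KERNEL_VIRT_ADDR + 0x261712; /* inside the kernel mapping */

	assert(is_linear_mapping_old(linear) && is_linear_mapping_new(linear));
	assert(is_linear_mapping_old(between));  /* wrongly accepted before */
	assert(!is_linear_mapping_new(between)); /* rejected after the fix */
	assert(!is_linear_mapping_old(kernel) && !is_linear_mapping_new(kernel));
}
```

Calling check_is_linear_mapping() from a small main() is enough to run it.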

* [PATCH -fixes v2 2/4] riscv: Fix config KASAN && SPARSEMEM && !SPARSE_VMEMMAP
  2022-02-21 16:12 ` Alexandre Ghiti
@ 2022-02-21 16:12   ` Alexandre Ghiti
  -1 siblings, 0 replies; 18+ messages in thread
From: Alexandre Ghiti @ 2022-02-21 16:12 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
	Alexandre Ghiti, Aleksandr Nogikh, Nick Hu, linux-riscv,
	linux-kernel, kasan-dev

In order to get the pfn of a struct page * when sparsemem is enabled
without vmemmap, the mem_section structures need to be initialized, which
happens in sparse_init.

But kasan_early_init calls pfn_to_page (through virt_to_page) long before
sparse_init is called, which then tries to dereference a NULL mem_section
pointer.

Fix this by using pfn_pte(virt_to_pfn()) in kasan_early_init instead,
which does not need pfn_to_page.

Fixes: 8ad8b72721d0 ("riscv: Add KASAN support")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
---
 arch/riscv/mm/kasan_init.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/riscv/mm/kasan_init.c b/arch/riscv/mm/kasan_init.c
index f61f7ca6fe0f..85e849318389 100644
--- a/arch/riscv/mm/kasan_init.c
+++ b/arch/riscv/mm/kasan_init.c
@@ -202,8 +202,7 @@ asmlinkage void __init kasan_early_init(void)
 
 	for (i = 0; i < PTRS_PER_PTE; ++i)
 		set_pte(kasan_early_shadow_pte + i,
-			mk_pte(virt_to_page(kasan_early_shadow_page),
-			       PAGE_KERNEL));
+			pfn_pte(virt_to_pfn(kasan_early_shadow_page), PAGE_KERNEL));
 
 	for (i = 0; i < PTRS_PER_PMD; ++i)
 		set_pmd(kasan_early_shadow_pmd + i,
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread
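
The ordering problem described in this patch can be modelled with a toy
example. Everything below is hypothetical (the toy_ names, the single
section, the TOY_VA_BASE constant) and is far simpler than the kernel's
sparsemem code. It only contrasts a lookup that depends on a table
published by an init function with a conversion that is pure arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT   12
#define TOY_NR_PAGES 16
/* ASSUMPTION: hypothetical base of a linear mapping whose first page is pfn 0. */
#define TOY_VA_BASE  0xffffaf8000000000ULL

struct page { int unused; };
struct mem_section { struct page *mem_map; };

static struct page toy_mem_map[TOY_NR_PAGES];
static struct mem_section toy_section;
static struct mem_section *toy_sections; /* NULL until toy_sparse_init() */

/* Stands in for sparse_init(): publishes the section table. */
void toy_sparse_init(void)
{
	toy_section.mem_map = toy_mem_map;
	toy_sections = &toy_section;
}

/*
 * Table-based lookup. This toy returns NULL when the table is missing;
 * per the commit message, the kernel ends up dereferencing a NULL
 * mem_section pointer instead.
 */
struct page *toy_pfn_to_page(uint64_t pfn)
{
	if (toy_sections == NULL || pfn >= TOY_NR_PAGES)
		return NULL;
	return &toy_sections->mem_map[pfn];
}

/* Pure arithmetic: needs no table, so it works at any point during boot. */
uint64_t toy_virt_to_pfn(uint64_t va)
{
	return (va - TOY_VA_BASE) >> PAGE_SHIFT;
}

void check_early_lookup(void)
{
	const uint64_t va = TOY_VA_BASE + 3 * 4096;

	assert(toy_virt_to_pfn(va) == 3);   /* fine before init */
	assert(toy_pfn_to_page(3) == NULL); /* table not published yet */
	toy_sparse_init();
	assert(toy_pfn_to_page(3) == &toy_mem_map[3]);
}
```

The patch takes the second route: virt_to_pfn followed by pfn_pte never
consults the section table.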

* [PATCH -fixes v2 3/4] riscv: Fix DEBUG_VIRTUAL false warnings
  2022-02-21 16:12 ` Alexandre Ghiti
@ 2022-02-21 16:12   ` Alexandre Ghiti
  -1 siblings, 0 replies; 18+ messages in thread
From: Alexandre Ghiti @ 2022-02-21 16:12 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
	Alexandre Ghiti, Aleksandr Nogikh, Nick Hu, linux-riscv,
	linux-kernel, kasan-dev

KERN_VIRT_SIZE used to encompass the kernel mapping. When the KASAN
mapping was moved next to the kernel mapping, it was redefined to match
only the maximum amount of physical memory.

As a result, kernel mapping addresses that go through __virt_to_phys are
now flagged as wrong, which is not true: __virt_to_phys can legitimately
be used on such addresses.

Fix this by redefining the condition that matches wrong addresses.

Fixes: f7ae02333d13 ("riscv: Move KASAN mapping next to the kernel mapping")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
---
 arch/riscv/mm/physaddr.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/riscv/mm/physaddr.c b/arch/riscv/mm/physaddr.c
index e7fd0c253c7b..19cf25a74ee2 100644
--- a/arch/riscv/mm/physaddr.c
+++ b/arch/riscv/mm/physaddr.c
@@ -8,12 +8,10 @@
 
 phys_addr_t __virt_to_phys(unsigned long x)
 {
-	phys_addr_t y = x - PAGE_OFFSET;
-
 	/*
 	 * Boundary checking aginst the kernel linear mapping space.
 	 */
-	WARN(y >= KERN_VIRT_SIZE,
+	WARN(!is_linear_mapping(x) && !is_kernel_mapping(x),
 	     "virt_to_phys used for non-linear address: %pK (%pS)\n",
 	     (void *)x, (void *)x);
 
-- 
2.32.0


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 18+ messages in thread
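
To see the false warning in isolation, the two conditions can be compared
in user space. As before, the layout constants are assumptions made for
illustration; KERNEL_VIRT_ADDR and KERNEL_SIZE stand in for
kernel_map.virt_addr and kernel_map.size, and is_linear_mapping uses the
bound introduced in patch 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: illustrative 64-bit layout, not read from a real config. */
#define PAGE_OFFSET      0xffffaf8000000000ULL
#define KERN_VIRT_SIZE   0x0000200000000000ULL
#define KERNEL_VIRT_ADDR 0xffffffff80000000ULL /* stands in for kernel_map.virt_addr */
#define KERNEL_SIZE      0x0000000002000000ULL /* stands in for kernel_map.size */

static bool is_linear_mapping(uint64_t x)
{
	return x >= PAGE_OFFSET && x < PAGE_OFFSET + KERN_VIRT_SIZE;
}

static bool is_kernel_mapping(uint64_t x)
{
	return x >= KERNEL_VIRT_ADDR && x < KERNEL_VIRT_ADDR + KERNEL_SIZE;
}

/* Old condition: anything outside the linear mapping warns. */
bool warns_old(uint64_t x)
{
	uint64_t y = x - PAGE_OFFSET;

	return y >= KERN_VIRT_SIZE;
}

/* New condition: warn only when x belongs to neither mapping. */
bool warns_new(uint64_t x)
{
	return !is_linear_mapping(x) && !is_kernel_mapping(x);
}

void check_virt_to_phys_warning(void)
{
	const uint64_t linear = PAGE_OFFSET + 0x1000;
	const uint64_t kernel = KERNEL_VIRT_ADDR + 0x261712;
	const uint64_t neither = 0xfffff5eeffffc800ULL;

	assert(!warns_old(linear) && !warns_new(linear));
	assert(warns_old(kernel));  /* the false warning */
	assert(!warns_new(kernel)); /* silenced by the new condition */
	assert(warns_old(neither) && warns_new(neither));
}
```

Addresses in neither mapping still warn, so the check keeps its purpose.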

* [PATCH -fixes v2 4/4] riscv: Fix config KASAN && DEBUG_VIRTUAL
  2022-02-21 16:12 ` Alexandre Ghiti
@ 2022-02-21 16:12   ` Alexandre Ghiti
  -1 siblings, 0 replies; 18+ messages in thread
From: Alexandre Ghiti @ 2022-02-21 16:12 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
	Alexandre Ghiti, Aleksandr Nogikh, Nick Hu, linux-riscv,
	linux-kernel, kasan-dev

The __virt_to_phys function is called very early in the boot process (i.e.
from kasan_early_init), so it must not be instrumented by KASAN, otherwise
it breaks.

Fix this by declaring physaddr.c as non-KASAN instrumentable.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
---
 arch/riscv/mm/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
index 7ebaef10ea1b..ac7a25298a04 100644
--- a/arch/riscv/mm/Makefile
+++ b/arch/riscv/mm/Makefile
@@ -24,6 +24,9 @@ obj-$(CONFIG_KASAN)   += kasan_init.o
 ifdef CONFIG_KASAN
 KASAN_SANITIZE_kasan_init.o := n
 KASAN_SANITIZE_init.o := n
+ifdef CONFIG_DEBUG_VIRTUAL
+KASAN_SANITIZE_physaddr.o := n
+endif
 endif
 
 obj-$(CONFIG_DEBUG_VIRTUAL) += physaddr.o
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH -fixes v2 4/4] riscv: Fix config KASAN && DEBUG_VIRTUAL
  2022-02-21 16:12   ` Alexandre Ghiti
@ 2022-02-22 10:28     ` Aleksandr Nogikh
  -1 siblings, 0 replies; 18+ messages in thread
From: Aleksandr Nogikh @ 2022-02-22 10:28 UTC (permalink / raw)
  To: Alexandre Ghiti
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Nick Hu,
	linux-riscv, LKML, kasan-dev

Hi Alexandre,

Thanks for the series!

However, I still haven't managed to boot the kernel. What I did:
1) Checked out the riscv/fixes branch (this is the one we're using on
syzbot). The latest commit was
6df2a016c0c8a3d0933ef33dd192ea6606b115e3.
2) Applied all 4 patches.
3) Used the config from the cover letter:
https://gist.github.com/a-nogikh/279c85c2d24f47efcc3e865c08844138
4) Built with `make -j32 ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu-`
5) Ran with `qemu-system-riscv64 -m 2048 -smp 1 -nographic -no-reboot
-device virtio-rng-pci -machine virt -device
virtio-net-pci,netdev=net0 -netdev
user,id=net0,restrict=on,hostfwd=tcp:127.0.0.1:12529-:22 -device
virtio-blk-device,drive=hd0 -drive
file=~/kernel-image/riscv64,if=none,format=raw,id=hd0 -snapshot
-kernel ~/linux-riscv/arch/riscv/boot/Image -append "root=/dev/vda
console=ttyS0 earlyprintk=serial"` (this is similar to how syzkaller
runs qemu).

Can you please hint at what I'm doing differently?

A simple config with KASAN, KASAN_OUTLINE and DEBUG_VIRTUAL now indeed
leads to a booting kernel, which was not the case before.
make defconfig ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu-
./scripts/config -e KASAN -e KASAN_OUTLINE -e DEBUG_VIRTUAL
make olddefconfig ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu-

--
Best Regards,
Aleksandr

On Mon, Feb 21, 2022 at 5:17 PM Alexandre Ghiti
<alexandre.ghiti@canonical.com> wrote:
>
> __virt_to_phys function is called very early in the boot process (ie
> kasan_early_init) so it should not be instrumented by KASAN otherwise it
> bugs.
>
> Fix this by declaring phys_addr.c as non-kasan instrumentable.
>
> Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
> ---
>  arch/riscv/mm/Makefile | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
> index 7ebaef10ea1b..ac7a25298a04 100644
> --- a/arch/riscv/mm/Makefile
> +++ b/arch/riscv/mm/Makefile
> @@ -24,6 +24,9 @@ obj-$(CONFIG_KASAN)   += kasan_init.o
>  ifdef CONFIG_KASAN
>  KASAN_SANITIZE_kasan_init.o := n
>  KASAN_SANITIZE_init.o := n
> +ifdef CONFIG_DEBUG_VIRTUAL
> +KASAN_SANITIZE_physaddr.o := n
> +endif
>  endif
>
>  obj-$(CONFIG_DEBUG_VIRTUAL) += physaddr.o
> --
> 2.32.0
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH -fixes v2 4/4] riscv: Fix config KASAN && DEBUG_VIRTUAL
  2022-02-22 10:28     ` Aleksandr Nogikh
@ 2022-02-23 13:10       ` Alexandre Ghiti
  -1 siblings, 0 replies; 18+ messages in thread
From: Alexandre Ghiti @ 2022-02-23 13:10 UTC (permalink / raw)
  To: Aleksandr Nogikh
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Nick Hu,
	linux-riscv, LKML, kasan-dev

Hi Aleksandr,

On Tue, Feb 22, 2022 at 11:28 AM Aleksandr Nogikh <nogikh@google.com> wrote:
>
> Hi Alexandre,
>
> Thanks for the series!
>
> However, I still haven't managed to boot the kernel. What I did:
> 1) Checked out the riscv/fixes branch (this is the one we're using on
> syzbot). The latest commit was
> 6df2a016c0c8a3d0933ef33dd192ea6606b115e3.
> 2) Applied all 4 patches.
> 3) Used the config from the cover letter:
> https://gist.github.com/a-nogikh/279c85c2d24f47efcc3e865c08844138
> 4) Built with `make -j32 ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu-`
> 5) Ran with `qemu-system-riscv64 -m 2048 -smp 1 -nographic -no-reboot
> -device virtio-rng-pci -machine virt -device
> virtio-net-pci,netdev=net0 -netdev
> user,id=net0,restrict=on,hostfwd=tcp:127.0.0.1:12529-:22 -device
> virtio-blk-device,drive=hd0 -drive
> file=~/kernel-image/riscv64,if=none,format=raw,id=hd0 -snapshot
> -kernel ~/linux-riscv/arch/riscv/boot/Image -append "root=/dev/vda
> console=ttyS0 earlyprintk=serial"` (this is similar to how syzkaller
> runs qemu).
>
> Can you please hint at what I'm doing differently?

A short summary of what I found, to keep you updated:

I compared your command line with mine: the differences are that I use
"smp=4" and add "earlycon" to the kernel command line. Adding them to
your command line allows the kernel to boot. I understand why it helps
but I can't explain what's wrong... Anyway, I fixed a warning that I
had missed, and that allows me to drop "smp=4" and "earlycon".

But this is not over yet... Your command line still does not reach
userspace; it fails with the following stack trace:

[   11.537817][    T1] Unable to handle kernel paging request at virtual address fffff5eeffffc800
[   11.539450][    T1] Oops [#1]
[   11.539909][    T1] Modules linked in:
[   11.540451][    T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.17.0-rc1-00007-ga68b89289e26-dirty #28
[   11.541364][    T1] Hardware name: riscv-virtio,qemu (DT)
[   11.542032][    T1] epc : kasan_check_range+0x96/0x13e
[   11.542654][    T1]  ra : memset+0x1e/0x4c
[   11.543388][    T1] epc : ffffffff8046c312 ra : ffffffff8046ca16 sp : ffffaf8007337b70
[   11.544037][    T1]  gp : ffffffff85866c80 tp : ffffaf80073d8000 t0 : 0000000000046000
[   11.544637][    T1]  t1 : fffff5eeffffc9ff t2 : 0000000000000000 s0 : ffffaf8007337ba0
[   11.545409][    T1]  s1 : 0000000000001000 a0 : fffff5eeffffca00 a1 : 0000000000001000
[   11.546072][    T1]  a2 : 0000000000000001 a3 : ffffffff8039ef24 a4 : ffffaf7ffffe4000
[   11.546707][    T1]  a5 : fffff5eeffffc800 a6 : 0000004000000000 a7 : ffffaf7ffffe4fff
[   11.547541][    T1]  s2 : ffffaf7ffffe4000 s3 : 0000000000000000 s4 : ffffffff8467faa8
[   11.548277][    T1]  s5 : 0000000000000000 s6 : ffffffff85869840 s7 : 0000000000000000
[   11.548950][    T1]  s8 : 0000000000001000 s9 : ffffaf805a54a048 s10: ffffffff8588d420
[   11.549705][    T1]  s11: ffffaf7ffffe4000 t3 : 0000000000000000 t4 : 0000000000000040
[   11.550465][    T1]  t5 : fffff5eeffffca00 t6 : 0000000000000002
[   11.551131][    T1] status: 0000000000000120 badaddr: fffff5eeffffc800 cause: 000000000000000d
[   11.551961][    T1] [<ffffffff8039ef24>] pcpu_alloc+0x84a/0x125c
[   11.552928][    T1] [<ffffffff8039f994>] __alloc_percpu+0x28/0x34
[   11.553555][    T1] [<ffffffff83286954>] ip_rt_init+0x15a/0x35c
[   11.554128][    T1] [<ffffffff83286d24>] ip_init+0x18/0x30
[   11.554642][    T1] [<ffffffff8328844a>] inet_init+0x2a6/0x550
[   11.555428][    T1] [<ffffffff80003220>] do_one_initcall+0x132/0x7e4
[   11.556049][    T1] [<ffffffff83201f7a>] kernel_init_freeable+0x510/0x5b4
[   11.556771][    T1] [<ffffffff831424e4>] kernel_init+0x28/0x21c
[   11.557344][    T1] [<ffffffff800056a0>] ret_from_exception+0x0/0x14
[   11.585469][    T1] ---[ end trace 0000000000000000 ]---

0xfffff5eeffffc800 is a KASAN address that points to the very end of
the vmalloc address range, which is weird since KASAN_VMALLOC is not
enabled.

Moreover, my command line does not trigger the above bug, and I'm
trying to understand why:
/home/alex/work/qemu/build/riscv64-softmmu/qemu-system-riscv64 -M virt
-bios /home/alex/work/opensbi/build/platform/generic/firmware/fw_dynamic.bin
-kernel /home/alex/work/kernel-build/riscv_rv64_kernel/arch/riscv/boot/Image
-netdev user,id=net0 -device virtio-net-device,netdev=net0 -drive
file=/home/alex/work/kernel-build/rootfs.ext2,format=raw,id=hd0
-device virtio-blk-device,drive=hd0 -nographic -smp 4 -m 16G -s
-append "rootwait earlycon root=/dev/vda ro earlyprintk=serial"

I'm looking into all of this and will get back with a v3 soon :)

Thanks,

Alex






>
> A simple config with KASAN, KASAN_OUTLINE and DEBUG_VIRTUAL now indeed
> leads to a booting kernel, which was not the case before.
> make defconfig ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu-
> ./scripts/config -e KASAN -e KASAN_OUTLINE -e DEBUG_VIRTUAL
> make olddefconfig ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu-
>
> --
> Best Regards,
> Aleksandr
>
> On Mon, Feb 21, 2022 at 5:17 PM Alexandre Ghiti
> <alexandre.ghiti@canonical.com> wrote:
> >
> > __virt_to_phys is called very early in the boot process (i.e. from
> > kasan_early_init), so it must not be instrumented by KASAN, otherwise
> > it bugs.
> >
> > Fix this by declaring physaddr.c as non-kasan instrumentable.
> >
> > Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
> > ---
> >  arch/riscv/mm/Makefile | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
> > index 7ebaef10ea1b..ac7a25298a04 100644
> > --- a/arch/riscv/mm/Makefile
> > +++ b/arch/riscv/mm/Makefile
> > @@ -24,6 +24,9 @@ obj-$(CONFIG_KASAN)   += kasan_init.o
> >  ifdef CONFIG_KASAN
> >  KASAN_SANITIZE_kasan_init.o := n
> >  KASAN_SANITIZE_init.o := n
> > +ifdef CONFIG_DEBUG_VIRTUAL
> > +KASAN_SANITIZE_physaddr.o := n
> > +endif
> >  endif
> >
> >  obj-$(CONFIG_DEBUG_VIRTUAL) += physaddr.o
> > --
> > 2.32.0
> >

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH -fixes v2 4/4] riscv: Fix config KASAN && DEBUG_VIRTUAL
  2022-02-23 13:10       ` Alexandre Ghiti
@ 2022-02-23 17:17         ` Alexandre Ghiti
  -1 siblings, 0 replies; 18+ messages in thread
From: Alexandre Ghiti @ 2022-02-23 17:17 UTC (permalink / raw)
  To: Aleksandr Nogikh
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
	linux-riscv, LKML, kasan-dev

On Wed, Feb 23, 2022 at 2:10 PM Alexandre Ghiti
<alexandre.ghiti@canonical.com> wrote:
>
> Hi Aleksandr,
>
> On Tue, Feb 22, 2022 at 11:28 AM Aleksandr Nogikh <nogikh@google.com> wrote:
> >
> > Hi Alexandre,
> >
> > Thanks for the series!
> >
> > However, I still haven't managed to boot the kernel. What I did:
> > 1) Checked out the riscv/fixes branch (this is the one we're using on
> > syzbot). The latest commit was
> > 6df2a016c0c8a3d0933ef33dd192ea6606b115e3.
> > 2) Applied all 4 patches.
> > 3) Used the config from the cover letter:
> > https://gist.github.com/a-nogikh/279c85c2d24f47efcc3e865c08844138
> > 4) Built with `make -j32 ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu-`
> > 5) Ran with `qemu-system-riscv64 -m 2048 -smp 1 -nographic -no-reboot
> > -device virtio-rng-pci -machine virt -device
> > virtio-net-pci,netdev=net0 -netdev
> > user,id=net0,restrict=on,hostfwd=tcp:127.0.0.1:12529-:22 -device
> > virtio-blk-device,drive=hd0 -drive
> > file=~/kernel-image/riscv64,if=none,format=raw,id=hd0 -snapshot
> > -kernel ~/linux-riscv/arch/riscv/boot/Image -append "root=/dev/vda
> > console=ttyS0 earlyprintk=serial"` (this is similar to how syzkaller
> > runs qemu).
> >
> > Can you please hint at what I'm doing differently?
>
> A short summary of what I found to keep you updated:
>
> I compared your command line and mine, the differences are that I use
> "smp=4" and I add "earlycon" to the kernel command line. When added to
> your command line, that allows it to boot. I understand why it helps
> but I can't explain what's wrong...Anyway, I fixed a warning that I
> had missed and that allows me to remove the "smp=4" and "earlycon".
>
> But this is not over yet...Your command line still does not allow to
> reach userspace, it fails with the following stacktrace:
>
> [   11.537817][    T1] Unable to handle kernel paging request at virtual address fffff5eeffffc800
> [   11.539450][    T1] Oops [#1]
> [   11.539909][    T1] Modules linked in:
> [   11.540451][    T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.17.0-rc1-00007-ga68b89289e26-dirty #28
> [   11.541364][    T1] Hardware name: riscv-virtio,qemu (DT)
> [   11.542032][    T1] epc : kasan_check_range+0x96/0x13e
> [   11.542654][    T1]  ra : memset+0x1e/0x4c
> [   11.543388][    T1] epc : ffffffff8046c312 ra : ffffffff8046ca16 sp : ffffaf8007337b70
> [   11.544037][    T1]  gp : ffffffff85866c80 tp : ffffaf80073d8000 t0 : 0000000000046000
> [   11.544637][    T1]  t1 : fffff5eeffffc9ff t2 : 0000000000000000 s0 : ffffaf8007337ba0
> [   11.545409][    T1]  s1 : 0000000000001000 a0 : fffff5eeffffca00 a1 : 0000000000001000
> [   11.546072][    T1]  a2 : 0000000000000001 a3 : ffffffff8039ef24 a4 : ffffaf7ffffe4000
> [   11.546707][    T1]  a5 : fffff5eeffffc800 a6 : 0000004000000000 a7 : ffffaf7ffffe4fff
> [   11.547541][    T1]  s2 : ffffaf7ffffe4000 s3 : 0000000000000000 s4 : ffffffff8467faa8
> [   11.548277][    T1]  s5 : 0000000000000000 s6 : ffffffff85869840 s7 : 0000000000000000
> [   11.548950][    T1]  s8 : 0000000000001000 s9 : ffffaf805a54a048 s10: ffffffff8588d420
> [   11.549705][    T1]  s11: ffffaf7ffffe4000 t3 : 0000000000000000 t4 : 0000000000000040
> [   11.550465][    T1]  t5 : fffff5eeffffca00 t6 : 0000000000000002
> [   11.551131][    T1] status: 0000000000000120 badaddr: fffff5eeffffc800 cause: 000000000000000d
> [   11.551961][    T1] [<ffffffff8039ef24>] pcpu_alloc+0x84a/0x125c
> [   11.552928][    T1] [<ffffffff8039f994>] __alloc_percpu+0x28/0x34
> [   11.553555][    T1] [<ffffffff83286954>] ip_rt_init+0x15a/0x35c
> [   11.554128][    T1] [<ffffffff83286d24>] ip_init+0x18/0x30
> [   11.554642][    T1] [<ffffffff8328844a>] inet_init+0x2a6/0x550
> [   11.555428][    T1] [<ffffffff80003220>] do_one_initcall+0x132/0x7e4
> [   11.556049][    T1] [<ffffffff83201f7a>] kernel_init_freeable+0x510/0x5b4
> [   11.556771][    T1] [<ffffffff831424e4>] kernel_init+0x28/0x21c
> [   11.557344][    T1] [<ffffffff800056a0>] ret_from_exception+0x0/0x14
> [   11.585469][    T1] ---[ end trace 0000000000000000 ]---
>
> 0xfffff5eeffffc800 is a KASAN address that points to the very end of
> vmalloc address range, which is weird since KASAN_VMALLOC is not
> enabled.
> Moreover my command line does not trigger the above bug, and I'm
> trying to understand why:

Rereading this email, I noticed that I was not using the same qemu
version as you: mine is a locally built version that disables sv48, and
it is the one that works, so the problem comes from the sv48 support.

In a nutshell, the kasan inner regions are not aligned on PGDIR_SIZE
when sv48 (a 4-level page table) is on. As a result, populating the
kasan linear mapping region clears the kasan vmalloc region, which
lives in the same PGD entry. The fix is to copy the existing content of
that entry before initializing the linear mapping entries. This issue
only happens when KASAN_VMALLOC is disabled. I had already fixed this
for kasan_shallow_populate_pud, but missed kasan_populate_pud.
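
To make the alignment point concrete, here is a small userspace sketch
(not kernel code) that compares which top-level entry two addresses
fall into. It assumes the usual RISC-V values PGDIR_SHIFT = 30 for sv39
and 39 for sv48, with 512 entries per table. The first address is the
badaddr from the oops above; the second one is a hypothetical
neighbouring address picked only to sit just past the next 1 GiB
boundary. The helper names are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed RISC-V values: a top-level entry maps 1 GiB (sv39) or 512 GiB (sv48). */
#define SV39_PGDIR_SHIFT 30
#define SV48_PGDIR_SHIFT 39
#define PTRS_PER_PGD     512

/* Index of the top-level (PGD) entry that translates va. */
unsigned int pgd_index_for(uint64_t va, unsigned int pgdir_shift)
{
	return (unsigned int)((va >> pgdir_shift) & (PTRS_PER_PGD - 1));
}

/* True when both addresses are translated through the same PGD entry. */
bool share_pgd_entry(uint64_t a, uint64_t b, unsigned int pgdir_shift)
{
	return (a >> pgdir_shift) == (b >> pgdir_shift);
}

/* Self-check using the address from the oops and a hypothetical neighbour. */
void check_example(void)
{
	const uint64_t oops_badaddr = 0xfffff5eeffffc800ULL; /* from the log */
	const uint64_t neighbour    = 0xfffff5ef00000000ULL; /* hypothetical */

	/* sv39: the two addresses sit in different top-level entries. */
	assert(!share_pgd_entry(oops_badaddr, neighbour, SV39_PGDIR_SHIFT));

	/* sv48: both fall under one PGD entry, so resetting it hits both. */
	assert(share_pgd_entry(oops_badaddr, neighbour, SV48_PGDIR_SHIFT));
}
```

With sv39 the two regions have their own top-level entries, so
populating one cannot disturb the other; with sv48 they share one,
which is why the existing content has to be preserved.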

Tomorrow I'll push the v3. It still does not fix the issue I describe
in the cover letter though, so still more work to do. At least, I was
able to reach userspace with your *exact* qemu command :)

Alex


>
> /home/alex/work/qemu/build/riscv64-softmmu/qemu-system-riscv64 -M virt
> -bios /home/alex/work/opensbi/build/platform/generic/firmware/fw_dynamic.bin
> -kernel /home/alex/work/kernel-build/riscv_rv64_kernel/arch/riscv/boot/Image
> -netdev user,id=net0 -device virtio-net-device,netdev=net0 -drive
> file=/home/alex/work/kernel-build/rootfs.ext2,format=raw,id=hd0
> -device virtio-blk-device,drive=hd0 -nographic -smp 4 -m 16G -s
> -append "rootwait earlycon root=/dev/vda ro earlyprintk=serial"
>
> I'm looking into all of this and will get back with a v3 soon :)
>
> Thanks,
>
> Alex
>
>
>
>
>
>
> >
> > A simple config with KASAN, KASAN_OUTLINE and DEBUG_VIRTUAL now indeed
> > leads to a booting kernel, which was not the case before.
> > make defconfig ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu-
> > ./scripts/config -e KASAN -e KASAN_OUTLINE -e DEBUG_VIRTUAL
> > make olddefconfig ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu-
> >
> > --
> > Best Regards,
> > Aleksandr
> >
> > On Mon, Feb 21, 2022 at 5:17 PM Alexandre Ghiti
> > <alexandre.ghiti@canonical.com> wrote:
> > >
> > > __virt_to_phys is called very early in the boot process (i.e. from
> > > kasan_early_init), so it must not be instrumented by KASAN, otherwise
> > > it bugs.
> > >
> > > Fix this by declaring physaddr.c as non-kasan instrumentable.
> > >
> > > Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
> > > ---
> > >  arch/riscv/mm/Makefile | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
> > > index 7ebaef10ea1b..ac7a25298a04 100644
> > > --- a/arch/riscv/mm/Makefile
> > > +++ b/arch/riscv/mm/Makefile
> > > @@ -24,6 +24,9 @@ obj-$(CONFIG_KASAN)   += kasan_init.o
> > >  ifdef CONFIG_KASAN
> > >  KASAN_SANITIZE_kasan_init.o := n
> > >  KASAN_SANITIZE_init.o := n
> > > +ifdef CONFIG_DEBUG_VIRTUAL
> > > +KASAN_SANITIZE_physaddr.o := n
> > > +endif
> > >  endif
> > >
> > >  obj-$(CONFIG_DEBUG_VIRTUAL) += physaddr.o
> > > --
> > > 2.32.0
> > >

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH -fixes v2 4/4] riscv: Fix config KASAN && DEBUG_VIRTUAL
  2022-02-23 17:17         ` Alexandre Ghiti
@ 2022-02-25  3:57           ` Palmer Dabbelt
  -1 siblings, 0 replies; 18+ messages in thread
From: Palmer Dabbelt @ 2022-02-25  3:57 UTC (permalink / raw)
  To: alexandre.ghiti
  Cc: nogikh, Paul Walmsley, aou, ryabinin.a.a, glider, andreyknvl,
	dvyukov, linux-riscv, linux-kernel, kasan-dev

On Wed, 23 Feb 2022 09:17:16 PST (-0800), alexandre.ghiti@canonical.com wrote:
> On Wed, Feb 23, 2022 at 2:10 PM Alexandre Ghiti
> <alexandre.ghiti@canonical.com> wrote:
>>
>> Hi Aleksandr,
>>
>> On Tue, Feb 22, 2022 at 11:28 AM Aleksandr Nogikh <nogikh@google.com> wrote:
>> >
>> > Hi Alexandre,
>> >
>> > Thanks for the series!
>> >
>> > However, I still haven't managed to boot the kernel. What I did:
>> > 1) Checked out the riscv/fixes branch (this is the one we're using on
>> > syzbot). The latest commit was
>> > 6df2a016c0c8a3d0933ef33dd192ea6606b115e3.
>> > 2) Applied all 4 patches.
>> > 3) Used the config from the cover letter:
>> > https://gist.github.com/a-nogikh/279c85c2d24f47efcc3e865c08844138
>> > 4) Built with `make -j32 ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu-`
>> > 5) Ran with `qemu-system-riscv64 -m 2048 -smp 1 -nographic -no-reboot
>> > -device virtio-rng-pci -machine virt -device
>> > virtio-net-pci,netdev=net0 -netdev
>> > user,id=net0,restrict=on,hostfwd=tcp:127.0.0.1:12529-:22 -device
>> > virtio-blk-device,drive=hd0 -drive
>> > file=~/kernel-image/riscv64,if=none,format=raw,id=hd0 -snapshot
>> > -kernel ~/linux-riscv/arch/riscv/boot/Image -append "root=/dev/vda
>> > console=ttyS0 earlyprintk=serial"` (this is similar to how syzkaller
>> > runs qemu).
>> >
>> > Can you please hint at what I'm doing differently?
>>
>> A short summary of what I found to keep you updated:
>>
>> I compared your command line and mine, the differences are that I use
>> "smp=4" and I add "earlycon" to the kernel command line. When added to
>> your command line, that allows it to boot. I understand why it helps
>> but I can't explain what's wrong...Anyway, I fixed a warning that I
>> had missed and that allows me to remove the "smp=4" and "earlycon".
>>
>> But this is not over yet...Your command line still does not allow to
>> reach userspace, it fails with the following stacktrace:
>>
>> [   11.537817][    T1] Unable to handle kernel paging request at virtual address fffff5eeffffc800
>> [   11.539450][    T1] Oops [#1]
>> [   11.539909][    T1] Modules linked in:
>> [   11.540451][    T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.17.0-rc1-00007-ga68b89289e26-dirty #28
>> [   11.541364][    T1] Hardware name: riscv-virtio,qemu (DT)
>> [   11.542032][    T1] epc : kasan_check_range+0x96/0x13e
>> [   11.542654][    T1]  ra : memset+0x1e/0x4c
>> [   11.543388][    T1] epc : ffffffff8046c312 ra : ffffffff8046ca16 sp : ffffaf8007337b70
>> [   11.544037][    T1]  gp : ffffffff85866c80 tp : ffffaf80073d8000 t0 : 0000000000046000
>> [   11.544637][    T1]  t1 : fffff5eeffffc9ff t2 : 0000000000000000 s0 : ffffaf8007337ba0
>> [   11.545409][    T1]  s1 : 0000000000001000 a0 : fffff5eeffffca00 a1 : 0000000000001000
>> [   11.546072][    T1]  a2 : 0000000000000001 a3 : ffffffff8039ef24 a4 : ffffaf7ffffe4000
>> [   11.546707][    T1]  a5 : fffff5eeffffc800 a6 : 0000004000000000 a7 : ffffaf7ffffe4fff
>> [   11.547541][    T1]  s2 : ffffaf7ffffe4000 s3 : 0000000000000000 s4 : ffffffff8467faa8
>> [   11.548277][    T1]  s5 : 0000000000000000 s6 : ffffffff85869840 s7 : 0000000000000000
>> [   11.548950][    T1]  s8 : 0000000000001000 s9 : ffffaf805a54a048 s10: ffffffff8588d420
>> [   11.549705][    T1]  s11: ffffaf7ffffe4000 t3 : 0000000000000000 t4 : 0000000000000040
>> [   11.550465][    T1]  t5 : fffff5eeffffca00 t6 : 0000000000000002
>> [   11.551131][    T1] status: 0000000000000120 badaddr: fffff5eeffffc800 cause: 000000000000000d
>> [   11.551961][    T1] [<ffffffff8039ef24>] pcpu_alloc+0x84a/0x125c
>> [   11.552928][    T1] [<ffffffff8039f994>] __alloc_percpu+0x28/0x34
>> [   11.553555][    T1] [<ffffffff83286954>] ip_rt_init+0x15a/0x35c
>> [   11.554128][    T1] [<ffffffff83286d24>] ip_init+0x18/0x30
>> [   11.554642][    T1] [<ffffffff8328844a>] inet_init+0x2a6/0x550
>> [   11.555428][    T1] [<ffffffff80003220>] do_one_initcall+0x132/0x7e4
>> [   11.556049][    T1] [<ffffffff83201f7a>] kernel_init_freeable+0x510/0x5b4
>> [   11.556771][    T1] [<ffffffff831424e4>] kernel_init+0x28/0x21c
>> [   11.557344][    T1] [<ffffffff800056a0>] ret_from_exception+0x0/0x14
>> [   11.585469][    T1] ---[ end trace 0000000000000000 ]---
>>
>> 0xfffff5eeffffc800 is a KASAN address that points to the very end of
>> vmalloc address range, which is weird since KASAN_VMALLOC is not
>> enabled.
>> Moreover my command line does not trigger the above bug, and I'm
>> trying to understand why:
>
> When I read this email I saw that I did not use the same qemu version:
> I have a locally built version that disables sv48, which is the one
> that works so the problem came from the sv48 support.
>
> In a nutshell, the issue comes from the fact that kasan inner regions
> are not aligned on PGDIR_SIZE when sv48 (which is 4-level page table)
> is on, and then when populating the kasan linear mapping region, that
> clears the kasan vmalloc region which is in the same PGD: the fix is
> to copy its content before initializing the linear mapping entries.
> This issue only happens when KASAN_VMALLOC is disabled. I had fixed
> this already for kasan_shallow_populate_pud, but missed
> kasan_populate_pud.
>
> Tomorrow I'll push the v3. It still does not fix the issue I describe
> in the cover letter though, so still more work to do. At least, I was
> able to reach userspace with your *exact* qemu command :)

I can't find a v3.

>
> Alex
>
>
>>
>> /home/alex/work/qemu/build/riscv64-softmmu/qemu-system-riscv64 -M virt
>> -bios /home/alex/work/opensbi/build/platform/generic/firmware/fw_dynamic.bin
>> -kernel /home/alex/work/kernel-build/riscv_rv64_kernel/arch/riscv/boot/Image
>> -netdev user,id=net0 -device virtio-net-device,netdev=net0 -drive
>> file=/home/alex/work/kernel-build/rootfs.ext2,format=raw,id=hd0
>> -device virtio-blk-device,drive=hd0 -nographic -smp 4 -m 16G -s
>> -append "rootwait earlycon root=/dev/vda ro earlyprintk=serial"
>>
>> I'm looking into all of this and will get back with a v3 soon :)
>>
>> Thanks,
>>
>> Alex
>>
>>
>>
>>
>>
>>
>> >
>> > A simple config with KASAN, KASAN_OUTLINE and DEBUG_VIRTUAL now indeed
>> > leads to a booting kernel, which was not the case before.
>> > make defconfig ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu-
>> > ./scripts/config -e KASAN -e KASAN_OUTLINE -e DEBUG_VIRTUAL
>> > make olddefconfig ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu-
>> >
>> > --
>> > Best Regards,
>> > Aleksandr
>> >
>> > On Mon, Feb 21, 2022 at 5:17 PM Alexandre Ghiti
>> > <alexandre.ghiti@canonical.com> wrote:
>> > >
>> > > __virt_to_phys is called very early in the boot process (i.e. from
>> > > kasan_early_init), so it must not be instrumented by KASAN, otherwise
>> > > it bugs.
>> > >
>> > > Fix this by declaring physaddr.c as non-kasan instrumentable.
>> > >
>> > > Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
>> > > ---
>> > >  arch/riscv/mm/Makefile | 3 +++
>> > >  1 file changed, 3 insertions(+)
>> > >
>> > > diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
>> > > index 7ebaef10ea1b..ac7a25298a04 100644
>> > > --- a/arch/riscv/mm/Makefile
>> > > +++ b/arch/riscv/mm/Makefile
>> > > @@ -24,6 +24,9 @@ obj-$(CONFIG_KASAN)   += kasan_init.o
>> > >  ifdef CONFIG_KASAN
>> > >  KASAN_SANITIZE_kasan_init.o := n
>> > >  KASAN_SANITIZE_init.o := n
>> > > +ifdef CONFIG_DEBUG_VIRTUAL
>> > > +KASAN_SANITIZE_physaddr.o := n
>> > > +endif
>> > >  endif
>> > >
>> > >  obj-$(CONFIG_DEBUG_VIRTUAL) += physaddr.o
>> > > --
>> > > 2.32.0
>> > >

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH -fixes v2 4/4] riscv: Fix config KASAN && DEBUG_VIRTUAL
@ 2022-02-25  3:57           ` Palmer Dabbelt
  0 siblings, 0 replies; 18+ messages in thread
From: Palmer Dabbelt @ 2022-02-25  3:57 UTC (permalink / raw)
  To: alexandre.ghiti
  Cc: nogikh, Paul Walmsley, aou, ryabinin.a.a, glider, andreyknvl,
	dvyukov, linux-riscv, linux-kernel, kasan-dev

On Wed, 23 Feb 2022 09:17:16 PST (-0800), alexandre.ghiti@canonical.com wrote:
> On Wed, Feb 23, 2022 at 2:10 PM Alexandre Ghiti
> <alexandre.ghiti@canonical.com> wrote:
>>
>> Hi Aleksandr,
>>
>> On Tue, Feb 22, 2022 at 11:28 AM Aleksandr Nogikh <nogikh@google.com> wrote:
>> >
>> > Hi Alexandre,
>> >
>> > Thanks for the series!
>> >
>> > However, I still haven't managed to boot the kernel. What I did:
>> > 1) Checked out the riscv/fixes branch (this is the one we're using on
>> > syzbot). The latest commit was
>> > 6df2a016c0c8a3d0933ef33dd192ea6606b115e3.
>> > 2) Applied all 4 patches.
>> > 3) Used the config from the cover letter:
>> > https://gist.github.com/a-nogikh/279c85c2d24f47efcc3e865c08844138
>> > 4) Built with `make -j32 ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu-`
>> > 5) Ran with `qemu-system-riscv64 -m 2048 -smp 1 -nographic -no-reboot
>> > -device virtio-rng-pci -machine virt -device
>> > virtio-net-pci,netdev=net0 -netdev
>> > user,id=net0,restrict=on,hostfwd=tcp:127.0.0.1:12529-:22 -device
>> > virtio-blk-device,drive=hd0 -drive
>> > file=~/kernel-image/riscv64,if=none,format=raw,id=hd0 -snapshot
>> > -kernel ~/linux-riscv/arch/riscv/boot/Image -append "root=/dev/vda
>> > console=ttyS0 earlyprintk=serial"` (this is similar to how syzkaller
>> > runs qemu).
>> >
>> > Can you please hint at what I'm doing differently?
>>
>> A short summary of what I found to keep you updated:
>>
>> I compared your command line with mine: the differences are that I use
>> "-smp 4" and add "earlycon" to the kernel command line. Adding both to
>> your command line allows it to boot. I understand why it helps
>> but I can't explain what's wrong... Anyway, I fixed a warning that I
>> had missed, and that allows me to remove "-smp 4" and "earlycon".
>>
>> But this is not over yet... Your command line still does not reach
>> userspace; it fails with the following stack trace:
>>
>> [   11.537817][    T1] Unable to handle kernel paging request at virtual address fffff5eeffffc800
>> [   11.539450][    T1] Oops [#1]
>> [   11.539909][    T1] Modules linked in:
>> [   11.540451][    T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.17.0-rc1-00007-ga68b89289e26-dirty #28
>> [   11.541364][    T1] Hardware name: riscv-virtio,qemu (DT)
>> [   11.542032][    T1] epc : kasan_check_range+0x96/0x13e
>> [   11.542654][    T1]  ra : memset+0x1e/0x4c
>> [   11.543388][    T1] epc : ffffffff8046c312 ra : ffffffff8046ca16 sp : ffffaf8007337b70
>> [   11.544037][    T1]  gp : ffffffff85866c80 tp : ffffaf80073d8000 t0 : 0000000000046000
>> [   11.544637][    T1]  t1 : fffff5eeffffc9ff t2 : 0000000000000000 s0 : ffffaf8007337ba0
>> [   11.545409][    T1]  s1 : 0000000000001000 a0 : fffff5eeffffca00 a1 : 0000000000001000
>> [   11.546072][    T1]  a2 : 0000000000000001 a3 : ffffffff8039ef24 a4 : ffffaf7ffffe4000
>> [   11.546707][    T1]  a5 : fffff5eeffffc800 a6 : 0000004000000000 a7 : ffffaf7ffffe4fff
>> [   11.547541][    T1]  s2 : ffffaf7ffffe4000 s3 : 0000000000000000 s4 : ffffffff8467faa8
>> [   11.548277][    T1]  s5 : 0000000000000000 s6 : ffffffff85869840 s7 : 0000000000000000
>> [   11.548950][    T1]  s8 : 0000000000001000 s9 : ffffaf805a54a048 s10: ffffffff8588d420
>> [   11.549705][    T1]  s11: ffffaf7ffffe4000 t3 : 0000000000000000 t4 : 0000000000000040
>> [   11.550465][    T1]  t5 : fffff5eeffffca00 t6 : 0000000000000002
>> [   11.551131][    T1] status: 0000000000000120 badaddr: fffff5eeffffc800 cause: 000000000000000d
>> [   11.551961][    T1] [<ffffffff8039ef24>] pcpu_alloc+0x84a/0x125c
>> [   11.552928][    T1] [<ffffffff8039f994>] __alloc_percpu+0x28/0x34
>> [   11.553555][    T1] [<ffffffff83286954>] ip_rt_init+0x15a/0x35c
>> [   11.554128][    T1] [<ffffffff83286d24>] ip_init+0x18/0x30
>> [   11.554642][    T1] [<ffffffff8328844a>] inet_init+0x2a6/0x550
>> [   11.555428][    T1] [<ffffffff80003220>] do_one_initcall+0x132/0x7e4
>> [   11.556049][    T1] [<ffffffff83201f7a>] kernel_init_freeable+0x510/0x5b4
>> [   11.556771][    T1] [<ffffffff831424e4>] kernel_init+0x28/0x21c
>> [   11.557344][    T1] [<ffffffff800056a0>] ret_from_exception+0x0/0x14
>> [   11.585469][    T1] ---[ end trace 0000000000000000 ]---
>>
>> 0xfffff5eeffffc800 is a KASAN shadow address that corresponds to the
>> very end of the vmalloc address range, which is weird since
>> KASAN_VMALLOC is not enabled.
>> Moreover, my command line does not trigger the above bug, and I'm
>> trying to understand why:
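
The link between the faulting address and the vmalloc range can be checked
by hand. The sketch below applies the generic KASAN mapping, shadow =
(addr >> 3) + offset, to the values in the register dump. The offset
0xdfffffff00000000 is an assumption inferred from the a4/a5 register pair
in the oops, not taken from this thread, and the reading of a4/a7 as the
first and last byte of the region being memset is likewise a guess. The
function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Generic KASAN maps 8 bytes of memory to 1 shadow byte. */
#define KASAN_SHADOW_SCALE_SHIFT 3
/* Assumed value, inferred from the register dump (a4 -> a5). */
#define KASAN_SHADOW_OFFSET 0xdfffffff00000000ULL

/* Shadow address covering a kernel virtual address. */
uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* Cross-check the formula against the values printed in the oops. */
void check_oops_registers(void)
{
	/* a4 (presumably the start of the region) maps to badaddr / a5. */
	assert(mem_to_shadow(0xffffaf7ffffe4000ULL) == 0xfffff5eeffffc800ULL);
	/* a7 (presumably the last byte of the region) maps to t1. */
	assert(mem_to_shadow(0xffffaf7ffffe4fffULL) == 0xfffff5eeffffc9ffULL);
}
```

Under these assumptions, the accessed memory sits just below
0xffffaf8000000000, so the missing shadow page belongs to the top of the
vmalloc range rather than to the linear mapping.
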
>
> When I read this email, I saw that I did not use the same qemu version:
> I have a locally built version that disables sv48, and that is the one
> that works, so the problem comes from the sv48 support.
>
> In a nutshell, the KASAN inner regions are not aligned on PGDIR_SIZE
> when sv48 (4-level page tables) is enabled. Populating the KASAN linear
> mapping region then clears the KASAN vmalloc region, which is in the
> same PGD: the fix is to copy the PGD's content before initializing the
> linear mapping entries. This issue only happens when KASAN_VMALLOC is
> disabled. I had already fixed this for kasan_shallow_populate_pud, but
> missed kasan_populate_pud.
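
The alignment problem described above can be illustrated with two shadow
addresses. This is only a sketch under stated assumptions: sv48 is taken
to use a 39-bit PGDIR_SHIFT with 512 entries per PGD, 0xfffff5eeffffc800
is the faulting shadow address from the oops, and 0xfffff5ef00000000 is
assumed to be the shadow of the start of the linear mapping, computed as
(0xffffaf8000000000 >> 3) + 0xdfffffff00000000. Neither the linear mapping
base nor the shadow offset is stated in this thread. The helper names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sv48 layout: each PGD entry covers 512 GiB. */
#define PGDIR_SHIFT  39
#define PGDIR_SIZE   (1ULL << PGDIR_SHIFT)
#define PTRS_PER_PGD 512ULL

/* Index of the top-level page table entry that covers addr. */
uint64_t pgd_index(uint64_t addr)
{
	return (addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1);
}

/* Non-zero when addr starts exactly on a PGD entry boundary. */
int pgdir_aligned(uint64_t addr)
{
	return (addr & (PGDIR_SIZE - 1)) == 0;
}

void check_shared_pgd(void)
{
	uint64_t vmalloc_shadow = 0xfffff5eeffffc800ULL; /* badaddr in the oops */
	uint64_t linear_shadow  = 0xfffff5ef00000000ULL; /* assumed, see above */

	/* The linear mapping shadow does not start on a PGD boundary... */
	assert(!pgdir_aligned(linear_shadow));
	/* ...so both shadow regions fall under the same PGD entry. */
	assert(pgd_index(vmalloc_shadow) == pgd_index(linear_shadow));
}
```

With both regions under one PGD entry, installing a fresh lower-level
table for the linear mapping shadow without copying the old one would
drop whatever was already mapped for the vmalloc shadow, which matches
the symptom reported here.
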
>
> Tomorrow I'll push the v3. It still does not fix the issue I describe
> in the cover letter though, so still more work to do. At least, I was
> able to reach userspace with your *exact* qemu command :)

I can't find a v3.

>
> Alex
>
>
>>
>> /home/alex/work/qemu/build/riscv64-softmmu/qemu-system-riscv64 -M virt
>> -bios /home/alex/work/opensbi/build/platform/generic/firmware/fw_dynamic.bin
>> -kernel /home/alex/work/kernel-build/riscv_rv64_kernel/arch/riscv/boot/Image
>> -netdev user,id=net0 -device virtio-net-device,netdev=net0 -drive
>> file=/home/alex/work/kernel-build/rootfs.ext2,format=raw,id=hd0
>> -device virtio-blk-device,drive=hd0 -nographic -smp 4 -m 16G -s
>> -append "rootwait earlycon root=/dev/vda ro earlyprintk=serial"
>>
>> I'm looking into all of this and will get back with a v3 soon :)
>>
>> Thanks,
>>
>> Alex
>>
>>
>>
>>
>>
>>
>> >
>> > A simple config with KASAN, KASAN_OUTLINE and DEBUG_VIRTUAL now indeed
>> > leads to a booting kernel, which was not the case before.
>> > make defconfig ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu-
>> > ./scripts/config -e KASAN -e KASAN_OUTLINE -e DEBUG_VIRTUAL
>> > make olddefconfig ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu-
>> >
>> > --
>> > Best Regards,
>> > Aleksandr
>> >
>> > On Mon, Feb 21, 2022 at 5:17 PM Alexandre Ghiti
>> > <alexandre.ghiti@canonical.com> wrote:
>> > >
>> > > The __virt_to_phys function is called very early in the boot process
>> > > (i.e. in kasan_early_init), so it must not be instrumented by KASAN,
>> > > otherwise the kernel fails to boot.
>> > >
>> > > Fix this by declaring physaddr.c as non-KASAN-instrumentable.
>> > >
>> > > Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
>> > > ---
>> > >  arch/riscv/mm/Makefile | 3 +++
>> > >  1 file changed, 3 insertions(+)
>> > >
>> > > diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
>> > > index 7ebaef10ea1b..ac7a25298a04 100644
>> > > --- a/arch/riscv/mm/Makefile
>> > > +++ b/arch/riscv/mm/Makefile
>> > > @@ -24,6 +24,9 @@ obj-$(CONFIG_KASAN)   += kasan_init.o
>> > >  ifdef CONFIG_KASAN
>> > >  KASAN_SANITIZE_kasan_init.o := n
>> > >  KASAN_SANITIZE_init.o := n
>> > > +ifdef CONFIG_DEBUG_VIRTUAL
>> > > +KASAN_SANITIZE_physaddr.o := n
>> > > +endif
>> > >  endif
>> > >
>> > >  obj-$(CONFIG_DEBUG_VIRTUAL) += physaddr.o
>> > > --
>> > > 2.32.0
>> > >

Thread overview: 18+ messages
2022-02-21 16:12 [PATCH -fixes v2 0/4] Fixes KASAN and other along the way Alexandre Ghiti
2022-02-21 16:12 ` [PATCH -fixes v2 1/4] riscv: Fix is_linear_mapping with recent move of KASAN region Alexandre Ghiti
2022-02-21 16:12 ` [PATCH -fixes v2 2/4] riscv: Fix config KASAN && SPARSEMEM && !SPARSE_VMEMMAP Alexandre Ghiti
2022-02-21 16:12 ` [PATCH -fixes v2 3/4] riscv: Fix DEBUG_VIRTUAL false warnings Alexandre Ghiti
2022-02-21 16:12 ` [PATCH -fixes v2 4/4] riscv: Fix config KASAN && DEBUG_VIRTUAL Alexandre Ghiti
2022-02-22 10:28   ` Aleksandr Nogikh
2022-02-23 13:10     ` Alexandre Ghiti
2022-02-23 17:17       ` Alexandre Ghiti
2022-02-25  3:57         ` Palmer Dabbelt
