linux-kernel.vger.kernel.org archive mirror
* [PATCH -fixes v3 0/6] Fixes KASAN and other along the way
@ 2022-02-25 12:39 Alexandre Ghiti
  2022-02-25 12:39 ` [PATCH -fixes v3 1/6] riscv: Fix is_linear_mapping with recent move of KASAN region Alexandre Ghiti
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Alexandre Ghiti @ 2022-02-25 12:39 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
	Alexandre Ghiti, Aleksandr Nogikh, Nick Hu, linux-riscv,
	linux-kernel, kasan-dev

As reported by Aleksandr, syzbot riscv has been broken since commit
54c5639d8f50 ("riscv: Fix asan-stack clang build"). That commit actually
breaks KASAN_INLINE, which this series does not fix: a fix will come
later, once the cause is found.

Nevertheless, this series fixes small things that made the syzbot
configuration + KASAN_OUTLINE fail to boot.

Note that even though the config at [1] boots fine with this series, I
was not able to boot the small config at [2] which fails because
kasan_poison receives a really weird address 0x4075706301000000 (maybe a
kasan person could provide some hint about what happens below in
do_ctors -> __asan_register_globals):

Thread 2 hit Breakpoint 1, kasan_poison (addr=<optimized out>, size=<optimized out>, value=<optimized out>, init=<optimized out>) at /home/alex/work/linux/mm/kasan/shadow.c:90
90		if (WARN_ON((unsigned long)addr & KASAN_GRANULE_MASK))
1: x/i $pc
=> 0xffffffff80261712 <kasan_poison>:	andi	a4,a0,7
5: /x $a0 = 0x4075706301000000

Thread 2 hit Breakpoint 2, handle_exception () at /home/alex/work/linux/arch/riscv/kernel/entry.S:27
27		csrrw tp, CSR_SCRATCH, tp
1: x/i $pc
=> 0xffffffff80004098 <handle_exception>:	csrrw	tp,sscratch,tp
5: /x $a0 = 0xe80eae0b60200000
(gdb) bt
#0  handle_exception () at /home/alex/work/linux/arch/riscv/kernel/entry.S:27
#1  0xffffffff80261746 in kasan_poison (addr=<optimized out>, size=<optimized out>, value=<optimized out>, init=<optimized out>)
    at /home/alex/work/linux/mm/kasan/shadow.c:98
#2  0xffffffff802618b4 in kasan_unpoison (addr=<optimized out>, size=<optimized out>, init=<optimized out>)
    at /home/alex/work/linux/mm/kasan/shadow.c:138
#3  0xffffffff80260876 in register_global (global=<optimized out>) at /home/alex/work/linux/mm/kasan/generic.c:214
#4  __asan_register_globals (globals=<optimized out>, size=<optimized out>) at /home/alex/work/linux/mm/kasan/generic.c:226
#5  0xffffffff8125efac in _sub_I_65535_1 ()
#6  0xffffffff81201b32 in do_ctors () at /home/alex/work/linux/init/main.c:1156
#7  do_basic_setup () at /home/alex/work/linux/init/main.c:1407
#8  kernel_init_freeable () at /home/alex/work/linux/init/main.c:1613
#9  0xffffffff81153ddc in kernel_init (unused=<optimized out>) at /home/alex/work/linux/init/main.c:1502
#10 0xffffffff800041c0 in handle_exception () at /home/alex/work/linux/arch/riscv/kernel/entry.S:231
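
The two a0 values in the trace above are consistent with the generic KASAN shadow computation, shadow = (addr >> shift) + offset. The following is a minimal user-space sketch, not kernel code. It assumes a scale shift of 3 (which matches the `andi a4,a0,7` granule check above) and a shadow offset of 0xdfffffff00000000; that offset is an assumption inferred from the two register values, it is not stated anywhere in this thread. The `sketch_` names are hypothetical.

```c
#include <stdint.h>

/* Assumed values, see the note above; not taken from this thread. */
#define SKETCH_SHADOW_SCALE_SHIFT 3
#define SKETCH_SHADOW_OFFSET      0xdfffffff00000000ULL

/* Same shape as the generic "memory address to shadow address" helper. */
static uint64_t sketch_mem_to_shadow(uint64_t addr)
{
	return (addr >> SKETCH_SHADOW_SCALE_SHIFT) + SKETCH_SHADOW_OFFSET;
}

/* Mirrors the granule alignment test at shadow.c:90 in the trace. */
static int sketch_granule_aligned(uint64_t addr)
{
	return (addr & ((1ULL << SKETCH_SHADOW_SCALE_SHIFT) - 1)) == 0;
}
```

Under these assumptions, sketch_mem_to_shadow(0x4075706301000000) yields 0xe80eae0b60200000, the a0 value seen in handle_exception, and the address passes the alignment test. That would mean the WARN_ON at line 90 does not fire and the trap presumably comes from touching the bogus shadow address.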


Thanks again to Aleksandr for narrowing down the issues fixed here.


[1] https://gist.github.com/a-nogikh/279c85c2d24f47efcc3e865c08844138
[2] https://gist.github.com/AlexGhiti/a5a0cab0227e2bf38f9d12232591c0e4

Changes in v3:
- Add PATCH 5/6 and PATCH 6/6

Changes in v2:
- Fix kernel test robot failure regarding KERN_VIRT_SIZE that is
  undefined for nommu config

Alexandre Ghiti (6):
  riscv: Fix is_linear_mapping with recent move of KASAN region
  riscv: Fix config KASAN && SPARSEMEM && !SPARSE_VMEMMAP
  riscv: Fix DEBUG_VIRTUAL false warnings
  riscv: Fix config KASAN && DEBUG_VIRTUAL
  riscv: Move high_memory initialization to setup_bootmem
  riscv: Fix kasan pud population

 arch/riscv/include/asm/page.h    | 2 +-
 arch/riscv/include/asm/pgtable.h | 1 +
 arch/riscv/mm/Makefile           | 3 +++
 arch/riscv/mm/init.c             | 2 +-
 arch/riscv/mm/kasan_init.c       | 8 +++++---
 arch/riscv/mm/physaddr.c         | 4 +---
 6 files changed, 12 insertions(+), 8 deletions(-)

-- 
2.32.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH -fixes v3 1/6] riscv: Fix is_linear_mapping with recent move of KASAN region
  2022-02-25 12:39 [PATCH -fixes v3 0/6] Fixes KASAN and other along the way Alexandre Ghiti
@ 2022-02-25 12:39 ` Alexandre Ghiti
  2022-02-25 12:39 ` [PATCH -fixes v3 2/6] riscv: Fix config KASAN && SPARSEMEM && !SPARSE_VMEMMAP Alexandre Ghiti
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Alexandre Ghiti @ 2022-02-25 12:39 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
	Alexandre Ghiti, Aleksandr Nogikh, Nick Hu, linux-riscv,
	linux-kernel, kasan-dev

The KASAN region was recently moved between the linear mapping and the
kernel mapping. is_linear_mapping used to check the validity of an
address against the start of the kernel mapping, which is now wrong.

Fix this by bounding the check with the maximum size of the physical
memory (KERN_VIRT_SIZE) instead.

Fixes: f7ae02333d13 ("riscv: Move KASAN mapping next to the kernel mapping")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
---
 arch/riscv/include/asm/page.h    | 2 +-
 arch/riscv/include/asm/pgtable.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h
index 160e3a1e8f8b..004372f8da54 100644
--- a/arch/riscv/include/asm/page.h
+++ b/arch/riscv/include/asm/page.h
@@ -119,7 +119,7 @@ extern phys_addr_t phys_ram_base;
 	((x) >= kernel_map.virt_addr && (x) < (kernel_map.virt_addr + kernel_map.size))
 
 #define is_linear_mapping(x)	\
-	((x) >= PAGE_OFFSET && (!IS_ENABLED(CONFIG_64BIT) || (x) < kernel_map.virt_addr))
+	((x) >= PAGE_OFFSET && (!IS_ENABLED(CONFIG_64BIT) || (x) < PAGE_OFFSET + KERN_VIRT_SIZE))
 
 #define linear_mapping_pa_to_va(x)	((void *)((unsigned long)(x) + kernel_map.va_pa_offset))
 #define kernel_mapping_pa_to_va(y)	({						\
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 7e949f25c933..e3549e50de95 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -13,6 +13,7 @@
 
 #ifndef CONFIG_MMU
 #define KERNEL_LINK_ADDR	PAGE_OFFSET
+#define KERN_VIRT_SIZE		(UL(-1))
 #else
 
 #define ADDRESS_SPACE_END	(UL(-1))
-- 
2.32.0
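
The effect of the is_linear_mapping change in this patch can be sketched in user space as follows. This is an illustration only: the three layout constants are assumptions chosen to resemble an sv39 layout, the `sk_` names are hypothetical, and only the CONFIG_64BIT case is modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative layout; all three values are assumptions. */
#define SK_PAGE_OFFSET    0xffffffd800000000ULL /* start of the linear mapping */
#define SK_KERN_VIRT_SIZE 0x0000002000000000ULL /* max physical memory covered */
#define SK_KERNEL_VIRT    0xffffffff80000000ULL /* start of the kernel mapping */

/* Before the patch: anything below the kernel mapping counts as linear. */
static bool sk_is_linear_old(uint64_t x)
{
	return x >= SK_PAGE_OFFSET && x < SK_KERNEL_VIRT;
}

/* After the patch: bounded by the size of the linear mapping itself. */
static bool sk_is_linear_new(uint64_t x)
{
	return x >= SK_PAGE_OFFSET && x < SK_PAGE_OFFSET + SK_KERN_VIRT_SIZE;
}
```

With these values, an address such as 0xfffffff900000000, which lies above the linear mapping but below the kernel mapping, is accepted by the old check and rejected by the new one, while a genuine linear address is accepted by both.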


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH -fixes v3 2/6] riscv: Fix config KASAN && SPARSEMEM && !SPARSE_VMEMMAP
  2022-02-25 12:39 [PATCH -fixes v3 0/6] Fixes KASAN and other along the way Alexandre Ghiti
  2022-02-25 12:39 ` [PATCH -fixes v3 1/6] riscv: Fix is_linear_mapping with recent move of KASAN region Alexandre Ghiti
@ 2022-02-25 12:39 ` Alexandre Ghiti
  2022-02-25 12:39 ` [PATCH -fixes v3 3/6] riscv: Fix DEBUG_VIRTUAL false warnings Alexandre Ghiti
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Alexandre Ghiti @ 2022-02-25 12:39 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
	Alexandre Ghiti, Aleksandr Nogikh, Nick Hu, linux-riscv,
	linux-kernel, kasan-dev

When sparsemem is enabled without vmemmap, converting between a pfn and
a struct page * requires the mem_section structures to be initialized,
which happens in sparse_init.

But kasan_early_init calls pfn_to_page way before sparse_init is called,
which then dereferences a null mem_section pointer.

Fix this by not using pfn_to_page in kasan_early_init.

Fixes: 8ad8b72721d0 ("riscv: Add KASAN support")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
---
 arch/riscv/mm/kasan_init.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/riscv/mm/kasan_init.c b/arch/riscv/mm/kasan_init.c
index f61f7ca6fe0f..85e849318389 100644
--- a/arch/riscv/mm/kasan_init.c
+++ b/arch/riscv/mm/kasan_init.c
@@ -202,8 +202,7 @@ asmlinkage void __init kasan_early_init(void)
 
 	for (i = 0; i < PTRS_PER_PTE; ++i)
 		set_pte(kasan_early_shadow_pte + i,
-			mk_pte(virt_to_page(kasan_early_shadow_page),
-			       PAGE_KERNEL));
+			pfn_pte(virt_to_pfn(kasan_early_shadow_page), PAGE_KERNEL));
 
 	for (i = 0; i < PTRS_PER_PMD; ++i)
 		set_pmd(kasan_early_shadow_pmd + i,
-- 
2.32.0
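
The difference between the two ways of building the PTE can be modeled in user space. This is a toy sketch under stated assumptions: the `toy_` names, the section size and the position of the PFN field (10) are all hypothetical, and the NULL check stands in for what would be a null pointer dereference in the kernel.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PFN_SHIFT     10 /* assumed position of the PFN field in a PTE */
#define TOY_SECTION_SHIFT 15 /* hypothetical: pfns per section = 1 << 15 */

struct toy_page { int unused; };
struct toy_section { struct toy_page *map; };

/* NULL until a sparse_init-like step fills it in. */
static struct toy_section *toy_sections;

/* Models pfn_to_page() without vmemmap: it needs the section table. */
static struct toy_page *toy_pfn_to_page(uint64_t pfn)
{
	if (!toy_sections)
		return NULL; /* the kernel has no such check and would fault */
	return toy_sections[pfn >> TOY_SECTION_SHIFT].map +
	       (pfn & ((1ULL << TOY_SECTION_SHIFT) - 1));
}

/* Models pfn_pte(): pure arithmetic on the pfn, no table lookup. */
static uint64_t toy_pfn_pte(uint64_t pfn, uint64_t prot)
{
	return (pfn << TOY_PFN_SHIFT) | prot;
}
```

The point of the model is that the pfn-based path never consults the section table, so it is safe to use before that table exists.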


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH -fixes v3 3/6] riscv: Fix DEBUG_VIRTUAL false warnings
  2022-02-25 12:39 [PATCH -fixes v3 0/6] Fixes KASAN and other along the way Alexandre Ghiti
  2022-02-25 12:39 ` [PATCH -fixes v3 1/6] riscv: Fix is_linear_mapping with recent move of KASAN region Alexandre Ghiti
  2022-02-25 12:39 ` [PATCH -fixes v3 2/6] riscv: Fix config KASAN && SPARSEMEM && !SPARSE_VMEMMAP Alexandre Ghiti
@ 2022-02-25 12:39 ` Alexandre Ghiti
  2022-02-25 12:39 ` [PATCH -fixes v3 4/6] riscv: Fix config KASAN && DEBUG_VIRTUAL Alexandre Ghiti
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Alexandre Ghiti @ 2022-02-25 12:39 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
	Alexandre Ghiti, Aleksandr Nogikh, Nick Hu, linux-riscv,
	linux-kernel, kasan-dev

KERN_VIRT_SIZE used to encompass the kernel mapping. When the kasan
mapping was moved next to the kernel mapping, it was redefined to only
match the maximum amount of physical memory.

As a result, kernel mapping addresses that go through __virt_to_phys are
now reported as wrong, which is not true: __virt_to_phys can be used on
such addresses.

Fix this by redefining the condition that matches wrong addresses.

Fixes: f7ae02333d13 ("riscv: Move KASAN mapping next to the kernel mapping")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
---
 arch/riscv/mm/physaddr.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/riscv/mm/physaddr.c b/arch/riscv/mm/physaddr.c
index e7fd0c253c7b..19cf25a74ee2 100644
--- a/arch/riscv/mm/physaddr.c
+++ b/arch/riscv/mm/physaddr.c
@@ -8,12 +8,10 @@
 
 phys_addr_t __virt_to_phys(unsigned long x)
 {
-	phys_addr_t y = x - PAGE_OFFSET;
-
 	/*
 	 * Boundary checking aginst the kernel linear mapping space.
 	 */
-	WARN(y >= KERN_VIRT_SIZE,
+	WARN(!is_linear_mapping(x) && !is_kernel_mapping(x),
 	     "virt_to_phys used for non-linear address: %pK (%pS)\n",
 	     (void *)x, (void *)x);
 
-- 
2.32.0
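
The false warning and its fix can be reproduced with a small user-space model of the two conditions in the diff above. All constants are illustrative assumptions (an sv39-like layout and a made-up 128 MiB kernel image), and the `sk_` names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative layout; every value here is an assumption. */
#define SK_PAGE_OFFSET    0xffffffd800000000ULL
#define SK_KERN_VIRT_SIZE 0x0000002000000000ULL
#define SK_KERNEL_VIRT    0xffffffff80000000ULL
#define SK_KERNEL_SIZE    0x0000000008000000ULL /* hypothetical 128 MiB image */

static bool sk_is_linear_mapping(uint64_t x)
{
	return x >= SK_PAGE_OFFSET && x < SK_PAGE_OFFSET + SK_KERN_VIRT_SIZE;
}

static bool sk_is_kernel_mapping(uint64_t x)
{
	return x >= SK_KERNEL_VIRT && x < SK_KERNEL_VIRT + SK_KERNEL_SIZE;
}

/* Removed condition: offset from PAGE_OFFSET compared to KERN_VIRT_SIZE. */
static bool sk_warns_old(uint64_t x)
{
	uint64_t y = x - SK_PAGE_OFFSET;

	return y >= SK_KERN_VIRT_SIZE;
}

/* New condition: warn only when x belongs to neither mapping. */
static bool sk_warns_new(uint64_t x)
{
	return !sk_is_linear_mapping(x) && !sk_is_kernel_mapping(x);
}
```

With these values, a kernel text address such as 0xffffffff80014922 triggers the old condition but not the new one, while an address outside both mappings (for example all ones) is still reported by both.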


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH -fixes v3 4/6] riscv: Fix config KASAN && DEBUG_VIRTUAL
  2022-02-25 12:39 [PATCH -fixes v3 0/6] Fixes KASAN and other along the way Alexandre Ghiti
                   ` (2 preceding siblings ...)
  2022-02-25 12:39 ` [PATCH -fixes v3 3/6] riscv: Fix DEBUG_VIRTUAL false warnings Alexandre Ghiti
@ 2022-02-25 12:39 ` Alexandre Ghiti
  2022-02-25 12:39 ` [PATCH -fixes v3 5/6] riscv: Move high_memory initialization to setup_bootmem Alexandre Ghiti
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Alexandre Ghiti @ 2022-02-25 12:39 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
	Alexandre Ghiti, Aleksandr Nogikh, Nick Hu, linux-riscv,
	linux-kernel, kasan-dev

The __virt_to_phys function is called very early in the boot process
(i.e. from kasan_early_init), so it must not be instrumented by KASAN,
otherwise it fails.

Fix this by declaring physaddr.c as non-kasan instrumentable.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
---
 arch/riscv/mm/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
index 7ebaef10ea1b..ac7a25298a04 100644
--- a/arch/riscv/mm/Makefile
+++ b/arch/riscv/mm/Makefile
@@ -24,6 +24,9 @@ obj-$(CONFIG_KASAN)   += kasan_init.o
 ifdef CONFIG_KASAN
 KASAN_SANITIZE_kasan_init.o := n
 KASAN_SANITIZE_init.o := n
+ifdef CONFIG_DEBUG_VIRTUAL
+KASAN_SANITIZE_physaddr.o := n
+endif
 endif
 
 obj-$(CONFIG_DEBUG_VIRTUAL) += physaddr.o
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH -fixes v3 5/6] riscv: Move high_memory initialization to setup_bootmem
  2022-02-25 12:39 [PATCH -fixes v3 0/6] Fixes KASAN and other along the way Alexandre Ghiti
                   ` (3 preceding siblings ...)
  2022-02-25 12:39 ` [PATCH -fixes v3 4/6] riscv: Fix config KASAN && DEBUG_VIRTUAL Alexandre Ghiti
@ 2022-02-25 12:39 ` Alexandre Ghiti
  2022-02-25 12:39 ` [PATCH -fixes v3 6/6] riscv: Fix kasan pud population Alexandre Ghiti
  2022-02-25 13:05 ` [PATCH -fixes v3 0/6] Fixes KASAN and other along the way Marco Elver
  6 siblings, 0 replies; 17+ messages in thread
From: Alexandre Ghiti @ 2022-02-25 12:39 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
	Alexandre Ghiti, Aleksandr Nogikh, Nick Hu, linux-riscv,
	linux-kernel, kasan-dev

high_memory used to be initialized in mem_init, way after setup_bootmem.
But a call to dma_contiguous_reserve in setup_bootmem gives rise to the
below warning because high_memory is still equal to 0 when it is used at
the very beginning of cma_declare_contiguous_nid.

It went unnoticed until the move of the kasan region redefined
KERN_VIRT_SIZE so that it does not encompass -1 anymore.

Fix this by initializing high_memory in setup_bootmem.

------------[ cut here ]------------
virt_to_phys used for non-linear address: ffffffffffffffff (0xffffffffffffffff)
WARNING: CPU: 0 PID: 0 at arch/riscv/mm/physaddr.c:14 __virt_to_phys+0xac/0x1b8
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.17.0-rc1-00007-ga68b89289e26 #27
Hardware name: riscv-virtio,qemu (DT)
epc : __virt_to_phys+0xac/0x1b8
 ra : __virt_to_phys+0xac/0x1b8
epc : ffffffff80014922 ra : ffffffff80014922 sp : ffffffff84a03c30
 gp : ffffffff85866c80 tp : ffffffff84a3f180 t0 : ffffffff86bce657
 t1 : fffffffef09406e8 t2 : 0000000000000000 s0 : ffffffff84a03c70
 s1 : ffffffffffffffff a0 : 000000000000004f a1 : 00000000000f0000
 a2 : 0000000000000002 a3 : ffffffff8011f408 a4 : 0000000000000000
 a5 : 0000000000000000 a6 : 0000000000f00000 a7 : ffffffff84a03747
 s2 : ffffffd800000000 s3 : ffffffff86ef4000 s4 : ffffffff8467f828
 s5 : fffffff800000000 s6 : 8000000000006800 s7 : 0000000000000000
 s8 : 0000000480000000 s9 : 0000000080038ea0 s10: 0000000000000000
 s11: ffffffffffffffff t3 : ffffffff84a035c0 t4 : fffffffef09406e8
 t5 : fffffffef09406e9 t6 : ffffffff84a03758
status: 0000000000000100 badaddr: 0000000000000000 cause: 0000000000000003
[<ffffffff8322ef4c>] cma_declare_contiguous_nid+0xf2/0x64a
[<ffffffff83212a58>] dma_contiguous_reserve_area+0x46/0xb4
[<ffffffff83212c3a>] dma_contiguous_reserve+0x174/0x18e
[<ffffffff83208fc2>] paging_init+0x12c/0x35e
[<ffffffff83206bd2>] setup_arch+0x120/0x74e
[<ffffffff83201416>] start_kernel+0xce/0x68c
irq event stamp: 0
hardirqs last  enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<0000000000000000>] 0x0
softirqs last  enabled at (0): [<0000000000000000>] 0x0
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace 0000000000000000 ]---

Fixes: f7ae02333d13 ("riscv: Move KASAN mapping next to the kernel mapping")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
---
 arch/riscv/mm/init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index c27294128e18..0d588032d6e6 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -125,7 +125,6 @@ void __init mem_init(void)
 	else
 		swiotlb_force = SWIOTLB_NO_FORCE;
 #endif
-	high_memory = (void *)(__va(PFN_PHYS(max_low_pfn)));
 	memblock_free_all();
 
 	print_vm_layout();
@@ -195,6 +194,7 @@ static void __init setup_bootmem(void)
 
 	min_low_pfn = PFN_UP(phys_ram_base);
 	max_low_pfn = max_pfn = PFN_DOWN(phys_ram_end);
+	high_memory = (void *)(__va(PFN_PHYS(max_low_pfn)));
 
 	dma32_phys_limit = min(4UL * SZ_1G, (unsigned long)PFN_PHYS(max_low_pfn));
 	set_max_mapnr(max_low_pfn - ARCH_PFN_OFFSET);
-- 
2.32.0
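
The address in the warning above (all ones) is what one gets by subtracting one from a high_memory that is still 0. The sketch below models that wrap-around and why the bound check only started to complain after KERN_VIRT_SIZE shrank. It is a user-space illustration: the constants are assumptions (the old size is modeled as reaching the top of the address space), the `sk_` names are hypothetical, and the check has the shape of the condition that patch 3/6 removes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values; all three are assumptions. */
#define SK_PAGE_OFFSET        0xffffffd800000000ULL
#define SK_OLD_KERN_VIRT_SIZE (0ULL - SK_PAGE_OFFSET) /* up to the top of memory */
#define SK_NEW_KERN_VIRT_SIZE 0x0000002000000000ULL   /* max physical memory */

/* Address derived from high_memory: with high_memory == 0 this wraps. */
static uint64_t sk_last_lowmem_byte(uint64_t high_memory)
{
	return high_memory - 1;
}

/* Bound check with a given KERN_VIRT_SIZE. */
static bool sk_warns(uint64_t x, uint64_t kern_virt_size)
{
	return x - SK_PAGE_OFFSET >= kern_virt_size;
}
```

With high_memory equal to 0, sk_last_lowmem_byte returns 0xffffffffffffffff. Under the assumed old size that address slips through the check; under the assumed new size it is reported, which matches the commit message.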


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH -fixes v3 6/6] riscv: Fix kasan pud population
  2022-02-25 12:39 [PATCH -fixes v3 0/6] Fixes KASAN and other along the way Alexandre Ghiti
                   ` (4 preceding siblings ...)
  2022-02-25 12:39 ` [PATCH -fixes v3 5/6] riscv: Move high_memory initialization to setup_bootmem Alexandre Ghiti
@ 2022-02-25 12:39 ` Alexandre Ghiti
  2022-02-25 13:05 ` [PATCH -fixes v3 0/6] Fixes KASAN and other along the way Marco Elver
  6 siblings, 0 replies; 17+ messages in thread
From: Alexandre Ghiti @ 2022-02-25 12:39 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
	Alexandre Ghiti, Aleksandr Nogikh, Nick Hu, linux-riscv,
	linux-kernel, kasan-dev

In sv48, the kasan inner regions are not aligned on PGDIR_SIZE, so when
we populate the kasan linear mapping region, we clear the kasan vmalloc
region, which is in the same PGD.

Fix this by copying the content of the kasan early pud into the new pud
the first time one is allocated for a PGD entry.

Fixes: e8a62cc26ddf ("riscv: Implement sv48 support")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
---
 arch/riscv/mm/kasan_init.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/mm/kasan_init.c b/arch/riscv/mm/kasan_init.c
index 85e849318389..cd1a145257b7 100644
--- a/arch/riscv/mm/kasan_init.c
+++ b/arch/riscv/mm/kasan_init.c
@@ -113,8 +113,11 @@ static void __init kasan_populate_pud(pgd_t *pgd,
 		base_pud = pt_ops.get_pud_virt(pfn_to_phys(_pgd_pfn(*pgd)));
 	} else {
 		base_pud = (pud_t *)pgd_page_vaddr(*pgd);
-		if (base_pud == lm_alias(kasan_early_shadow_pud))
+		if (base_pud == lm_alias(kasan_early_shadow_pud)) {
 			base_pud = memblock_alloc(PTRS_PER_PUD * sizeof(pud_t), PAGE_SIZE);
+			memcpy(base_pud, (void *)kasan_early_shadow_pud,
+			       sizeof(pud_t) * PTRS_PER_PUD);
+		}
 	}
 
 	pudp = base_pud + pud_index(vaddr);
-- 
2.32.0
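
The reason the memcpy is needed can be shown with a toy model of a page table level. This is a user-space sketch, not the kernel code: `sk_pud_t`, `sk_privatize_pud` and the entry values are hypothetical, and calloc stands in for an allocator that returns zeroed memory.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SK_PTRS_PER_PUD 512

/* Stand-in for a PUD table: one 64-bit entry per slot. */
typedef uint64_t sk_pud_t;

/*
 * Replace a shared early table with a private one. With copy_early == 0
 * this models the behavior before the patch (fresh zeroed table); with
 * copy_early != 0 it models the fix (inherit the early entries first).
 */
static sk_pud_t *sk_privatize_pud(const sk_pud_t *early, int copy_early)
{
	sk_pud_t *base = calloc(SK_PTRS_PER_PUD, sizeof(*base));

	if (base && copy_early)
		memcpy(base, early, sizeof(*base) * SK_PTRS_PER_PUD);
	return base;
}
```

If the early table holds an entry for another region that shares the same PGD entry, the table returned with copy_early == 0 has lost it, whereas the copy keeps it until that region is populated in turn.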


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH -fixes v3 0/6] Fixes KASAN and other along the way
  2022-02-25 12:39 [PATCH -fixes v3 0/6] Fixes KASAN and other along the way Alexandre Ghiti
                   ` (5 preceding siblings ...)
  2022-02-25 12:39 ` [PATCH -fixes v3 6/6] riscv: Fix kasan pud population Alexandre Ghiti
@ 2022-02-25 13:05 ` Marco Elver
  2022-02-25 14:04   ` Alexandre Ghiti
  6 siblings, 1 reply; 17+ messages in thread
From: Marco Elver @ 2022-02-25 13:05 UTC (permalink / raw)
  To: Alexandre Ghiti
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
	Aleksandr Nogikh, Nick Hu, linux-riscv, linux-kernel, kasan-dev

On Fri, 25 Feb 2022 at 13:40, Alexandre Ghiti
<alexandre.ghiti@canonical.com> wrote:
>
> As reported by Aleksandr, syzbot riscv is broken since commit
> 54c5639d8f50 ("riscv: Fix asan-stack clang build"). This commit actually
> breaks KASAN_INLINE which is not fixed in this series, that will come later
> when found.
>
> Nevertheless, this series fixes small things that made the syzbot
> configuration + KASAN_OUTLINE fail to boot.
>
> Note that even though the config at [1] boots fine with this series, I
> was not able to boot the small config at [2] which fails because
> kasan_poison receives a really weird address 0x4075706301000000 (maybe a
> kasan person could provide some hint about what happens below in
> do_ctors -> __asan_register_globals):

__asan_register_globals is responsible for poisoning redzones around
globals. As hinted by 'do_ctors', it calls constructors, and in this
case a compiler-generated constructor that calls
__asan_register_globals with metadata generated by the compiler. That
metadata contains information about global variables. Note, these
constructors are called on initial boot, but also every time a kernel
module (that has globals) is loaded.
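
The mechanism described above can be modeled in user space. This is a simplified sketch, not the kernel's or the compiler's actual layout: `struct sk_global` keeps only three hypothetical fields, the 64-byte redzone size is made up, and `__attribute__((constructor))` (a GCC/Clang extension) stands in for the compiler-generated constructor that the kernel runs from do_ctors.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the compiler-emitted metadata. */
struct sk_global {
	const void *beg;          /* address of the global */
	size_t size;              /* size of the global itself */
	size_t size_with_redzone; /* size including the trailing redzone */
};

static size_t sk_poisoned_bytes;  /* redzone bytes accounted as poisoned */

/* Model of the runtime hook: walk the metadata, account for each redzone. */
static void sk_register_globals(const struct sk_global *globals, size_t n)
{
	for (size_t i = 0; i < n; i++)
		sk_poisoned_bytes += globals[i].size_with_redzone -
				     globals[i].size;
}

static int sk_counter;            /* an ordinary global variable */

static const struct sk_global sk_meta[] = {
	{ &sk_counter, sizeof(sk_counter), 64 },
};

/* Stand-in for the generated constructor: runs before main(). */
__attribute__((constructor))
static void sk_ctor(void)
{
	sk_register_globals(sk_meta, sizeof(sk_meta) / sizeof(sk_meta[0]));
}
```

In this model, corrupted metadata (a bad beg or size) would be passed straight to the poisoning step, which is one way a nonsensical address could reach kasan_poison.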

It may also be a toolchain issue, but it's hard to say. If you're
using GCC to test, try Clang (11 or later), and vice-versa.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH -fixes v3 0/6] Fixes KASAN and other along the way
  2022-02-25 13:05 ` [PATCH -fixes v3 0/6] Fixes KASAN and other along the way Marco Elver
@ 2022-02-25 14:04   ` Alexandre Ghiti
       [not found]     ` <CAG_fn=WYmkqPX_qCVmxv1dx87JkXHGF1-a6_8K0jwWuBWzRJfA@mail.gmail.com>
  0 siblings, 1 reply; 17+ messages in thread
From: Alexandre Ghiti @ 2022-02-25 14:04 UTC (permalink / raw)
  To: Marco Elver
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
	Aleksandr Nogikh, Nick Hu, linux-riscv, linux-kernel, kasan-dev

On Fri, Feb 25, 2022 at 2:06 PM Marco Elver <elver@google.com> wrote:
>
> On Fri, 25 Feb 2022 at 13:40, Alexandre Ghiti
> <alexandre.ghiti@canonical.com> wrote:
> >
> > As reported by Aleksandr, syzbot riscv is broken since commit
> > 54c5639d8f50 ("riscv: Fix asan-stack clang build"). This commit actually
> > breaks KASAN_INLINE which is not fixed in this series, that will come later
> > when found.
> >
> > Nevertheless, this series fixes small things that made the syzbot
> > configuration + KASAN_OUTLINE fail to boot.
> >
> > Note that even though the config at [1] boots fine with this series, I
> > was not able to boot the small config at [2] which fails because
> > kasan_poison receives a really weird address 0x4075706301000000 (maybe a
> > kasan person could provide some hint about what happens below in
> > do_ctors -> __asan_register_globals):
>
> asan_register_globals is responsible for poisoning redzones around
> globals. As hinted by 'do_ctors', it calls constructors, and in this
> case a compiler-generated constructor that calls
> __asan_register_globals with metadata generated by the compiler. That
> metadata contains information about global variables. Note, these
> constructors are called on initial boot, but also every time a kernel
> module (that has globals) is loaded.
>
> It may also be a toolchain issue, but it's hard to say. If you're
> using GCC to test, try Clang (11 or later), and vice-versa.

I tried 3 different gcc toolchains already, but that did not fix the
issue. The only thing that worked was setting asan-globals=0 in
scripts/Makefile.kasan, but ok, that's not a fix.
I tried to bisect this issue but our kasan implementation has been
broken quite a few times, so it failed.

I keep digging!

Thanks for the tips,

Alex

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH -fixes v3 0/6] Fixes KASAN and other along the way
       [not found]     ` <CAG_fn=WYmkqPX_qCVmxv1dx87JkXHGF1-a6_8K0jwWuBWzRJfA@mail.gmail.com>
@ 2022-02-25 14:15       ` Alexandre Ghiti
       [not found]         ` <CAG_fn=VZ3fS7ekmJknQ6sW5zC09iUT9mzWjEhyrn3NaAWfVP_Q@mail.gmail.com>
  0 siblings, 1 reply; 17+ messages in thread
From: Alexandre Ghiti @ 2022-02-25 14:15 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Marco Elver, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Andrey Ryabinin, Andrey Konovalov, Dmitry Vyukov,
	Aleksandr Nogikh, Nick Hu, linux-riscv, LKML, kasan-dev

On Fri, Feb 25, 2022 at 3:10 PM Alexander Potapenko <glider@google.com> wrote:
>
>
>
> On Fri, Feb 25, 2022 at 3:04 PM Alexandre Ghiti <alexandre.ghiti@canonical.com> wrote:
>>
>> On Fri, Feb 25, 2022 at 2:06 PM Marco Elver <elver@google.com> wrote:
>> >
>> > On Fri, 25 Feb 2022 at 13:40, Alexandre Ghiti
>> > <alexandre.ghiti@canonical.com> wrote:
>> > >
>> > > As reported by Aleksandr, syzbot riscv is broken since commit
>> > > 54c5639d8f50 ("riscv: Fix asan-stack clang build"). This commit actually
>> > > breaks KASAN_INLINE which is not fixed in this series, that will come later
>> > > when found.
>> > >
>> > > Nevertheless, this series fixes small things that made the syzbot
>> > > configuration + KASAN_OUTLINE fail to boot.
>> > >
>> > > Note that even though the config at [1] boots fine with this series, I
>> > > was not able to boot the small config at [2] which fails because
>> > > kasan_poison receives a really weird address 0x4075706301000000 (maybe a
>> > > kasan person could provide some hint about what happens below in
>> > > do_ctors -> __asan_register_globals):
>> >
>> > asan_register_globals is responsible for poisoning redzones around
>> > globals. As hinted by 'do_ctors', it calls constructors, and in this
>> > case a compiler-generated constructor that calls
>> > __asan_register_globals with metadata generated by the compiler. That
>> > metadata contains information about global variables. Note, these
>> > constructors are called on initial boot, but also every time a kernel
>> > module (that has globals) is loaded.
>> >
>> > It may also be a toolchain issue, but it's hard to say. If you're
>> > using GCC to test, try Clang (11 or later), and vice-versa.
>>
>> I tried 3 different gcc toolchains already, but that did not fix the
>> issue. The only thing that worked was setting asan-globals=0 in
>> scripts/Makefile.kasan, but ok, that's not a fix.
>> I tried to bisect this issue but our kasan implementation has been
>> broken quite a few times, so it failed.
>>
>> I keep digging!
>>
>
> The problem does not reproduce for me with GCC 11.2.0: kernels built with both [1] and [2] are bootable.

Do you mean you reach userspace? Because my image boots too, and fails
at some point:

[    0.000150] sched_clock: 64 bits at 10MHz, resolution 100ns, wraps every 4398046511100ns
[    0.015847] Console: colour dummy device 80x25
[    0.016899] printk: console [tty0] enabled
[    0.020326] printk: bootconsole [ns16550a0] disabled

It traps here.

> FWIW here is how I run them:
>
> qemu-system-riscv64 -m 2048 -smp 1 -nographic -no-reboot \
>   -device virtio-rng-pci -machine virt -device \
>   virtio-net-pci,netdev=net0 -netdev \
>   user,id=net0,restrict=on,hostfwd=tcp:127.0.0.1:12529-:22 -device \
>   virtio-blk-device,drive=hd0 -drive \
>   file=${IMAGE},if=none,format=raw,id=hd0 -snapshot \
>   -kernel ${KERNEL_SRC_DIR}/arch/riscv/boot/Image \
>   -append "root=/dev/vda console=ttyS0 earlyprintk=serial"
>
>
>>
>> Thanks for the tips,
>>
>> Alex

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH -fixes v3 0/6] Fixes KASAN and other along the way
       [not found]         ` <CAG_fn=VZ3fS7ekmJknQ6sW5zC09iUT9mzWjEhyrn3NaAWfVP_Q@mail.gmail.com>
@ 2022-02-25 14:46           ` Alexandre Ghiti
  0 siblings, 0 replies; 17+ messages in thread
From: Alexandre Ghiti @ 2022-02-25 14:46 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Marco Elver, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Andrey Ryabinin, Andrey Konovalov, Dmitry Vyukov,
	Aleksandr Nogikh, Nick Hu, linux-riscv, LKML, kasan-dev

On Fri, Feb 25, 2022 at 3:31 PM Alexander Potapenko <glider@google.com> wrote:
>
>
>
> On Fri, Feb 25, 2022 at 3:15 PM Alexandre Ghiti <alexandre.ghiti@canonical.com> wrote:
>>
>> On Fri, Feb 25, 2022 at 3:10 PM Alexander Potapenko <glider@google.com> wrote:
>> >
>> >
>> >
>> > On Fri, Feb 25, 2022 at 3:04 PM Alexandre Ghiti <alexandre.ghiti@canonical.com> wrote:
>> >>
>> >> On Fri, Feb 25, 2022 at 2:06 PM Marco Elver <elver@google.com> wrote:
>> >> >
>> >> > On Fri, 25 Feb 2022 at 13:40, Alexandre Ghiti
>> >> > <alexandre.ghiti@canonical.com> wrote:
>> >> > >
>> >> > > As reported by Aleksandr, syzbot riscv is broken since commit
>> >> > > 54c5639d8f50 ("riscv: Fix asan-stack clang build"). This commit actually
>> >> > > breaks KASAN_INLINE which is not fixed in this series, that will come later
>> >> > > when found.
>> >> > >
>> >> > > Nevertheless, this series fixes small things that made the syzbot
>> >> > > configuration + KASAN_OUTLINE fail to boot.
>> >> > >
>> >> > > Note that even though the config at [1] boots fine with this series, I
>> >> > > was not able to boot the small config at [2] which fails because
>> >> > > kasan_poison receives a really weird address 0x4075706301000000 (maybe a
>> >> > > kasan person could provide some hint about what happens below in
>> >> > > do_ctors -> __asan_register_globals):
>> >> >
>> >> > asan_register_globals is responsible for poisoning redzones around
>> >> > globals. As hinted by 'do_ctors', it calls constructors, and in this
>> >> > case a compiler-generated constructor that calls
>> >> > __asan_register_globals with metadata generated by the compiler. That
>> >> > metadata contains information about global variables. Note, these
>> >> > constructors are called on initial boot, but also every time a kernel
>> >> > module (that has globals) is loaded.
>> >> >
>> >> > It may also be a toolchain issue, but it's hard to say. If you're
>> >> > using GCC to test, try Clang (11 or later), and vice-versa.
>> >>
>> >> I tried 3 different gcc toolchains already, but that did not fix the
>> >> issue. The only thing that worked was setting asan-globals=0 in
>> >> scripts/Makefile.kasan, but ok, that's not a fix.
>> >> I tried to bisect this issue but our kasan implementation has been
>> >> broken quite a few times, so it failed.
>> >>
>> >> I keep digging!
>> >>
>> >
>> > The problem does not reproduce for me with GCC 11.2.0: kernels built with both [1] and [2] are bootable.
>>
>> Do you mean you reach userspace? Because my image boots too, and fails
>> at some point:
>>
>> [    0.000150] sched_clock: 64 bits at 10MHz, resolution 100ns, wraps every 4398046511100ns
>> [    0.015847] Console: colour dummy device 80x25
>> [    0.016899] printk: console [tty0] enabled
>> [    0.020326] printk: bootconsole [ns16550a0] disabled
>>
>
> In my case, QEMU successfully boots to the login prompt.
> I am running QEMU 6.2.0 (Debian 1:6.2+dfsg-2) and an image Aleksandr shared with me (guess it was built according to this instruction: https://github.com/google/syzkaller/blob/master/docs/linux/setup_linux-host_qemu-vm_riscv64-kernel.md)
>

Nice thanks guys! I always use the latest opensbi and not the one that
is embedded in qemu, which is the only difference between your command
line (which works) and mine (which does not work). So the issue is
probably there, I really need to investigate that now.

That means I only need to fix KASAN_INLINE and we're good.

I imagine Palmer can add your Tested-by on the series then?

Thanks again!

Alex

>>
>> It traps here.
>>
>> > FWIW here is how I run them:
>> >
>> > qemu-system-riscv64 -m 2048 -smp 1 -nographic -no-reboot \
>> >   -device virtio-rng-pci -machine virt -device \
>> >   virtio-net-pci,netdev=net0 -netdev \
>> >   user,id=net0,restrict=on,hostfwd=tcp:127.0.0.1:12529-:22 -device \
>> >   virtio-blk-device,drive=hd0 -drive \
>> >   file=${IMAGE},if=none,format=raw,id=hd0 -snapshot \
>> >   -kernel ${KERNEL_SRC_DIR}/arch/riscv/boot/Image \
>> >   -append "root=/dev/vda console=ttyS0 earlyprintk=serial"
>> >
>> >
>> >>
>> >> Thanks for the tips,
>> >>
>> >> Alex
>> >
>> >
>> >
>> > --
>> > Alexander Potapenko
>> > Software Engineer
>> >
>> > Google Germany GmbH
>> > Erika-Mann-Straße, 33
>> > 80636 München
>> >
>> > Geschäftsführer: Paul Manicle, Liana Sebastian
>> > Registergericht und -nummer: Hamburg, HRB 86891
>> > Sitz der Gesellschaft: Hamburg
>> >
>> > Diese E-Mail ist vertraulich. Falls Sie diese fälschlicherweise erhalten haben sollten, leiten Sie diese bitte nicht an jemand anderes weiter, löschen Sie alle Kopien und Anhänge davon und lassen Sie mich bitte wissen, dass die E-Mail an die falsche Person gesendet wurde.
>> >
>> >
>> >
>> > This e-mail is confidential. If you received this communication by mistake, please don't forward it to anyone else, please erase all copies and attachments, and please let me know that it has gone to the wrong person.
>>
>> --
>> You received this message because you are subscribed to the Google Groups "kasan-dev" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to kasan-dev+unsubscribe@googlegroups.com.
>> To view this discussion on the web visit https://groups.google.com/d/msgid/kasan-dev/CA%2BzEjCsQPVYSV7CdhKnvjujXkMXuRQd%3DVPok1awb20xifYmidw%40mail.gmail.com.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH -fixes v3 0/6] Fixes KASAN and other along the way
  2022-03-10  8:41         ` Alexandre Ghiti
@ 2022-03-24 16:53           ` Aleksandr Nogikh
  0 siblings, 0 replies; 17+ messages in thread
From: Aleksandr Nogikh @ 2022-03-24 16:53 UTC (permalink / raw)
  To: Alexandre Ghiti
  Cc: Dmitry Vyukov, Palmer Dabbelt, Alexander Potapenko, Marco Elver,
	Paul Walmsley, Albert Ou, Andrey Ryabinin, Andrey Konovalov,
	Nick Hu, linux-riscv, LKML, kasan-dev

On Thu, Mar 10, 2022 at 9:42 AM Alexandre Ghiti
<alexandre.ghiti@canonical.com> wrote:
>
> Hi,
>
> On Wed, Mar 9, 2022 at 11:52 AM Dmitry Vyukov <dvyukov@google.com> wrote:
> >
> > On Wed, 9 Mar 2022 at 11:45, Aleksandr Nogikh <nogikh@google.com> wrote:
> > >
> > > I switched the riscv syzbot instance to KASAN_OUTLINE and now it is
> > > finally being fuzzed again!
> > >
> > > Thank you very much for the series!
> >
> >
> > But all riscv crashes are still classified as "corrupted" and thrown
> > away (not reported):
> > https://syzkaller.appspot.com/bug?id=d5bc3e0c66d200d72216ab343a67c4327e4a3452
> >
> > The problem is that riscv oopses don't contain "Call Trace:" at the
> > beginning of stack traces, so it's hard to make sense of them.
> > arch/riscv seems to print "Call Trace:" in the wrong function, not where
> > all other arches print it.
> >
>
> Does the following diff fix this issue?
>
> diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c
> index 201ee206fb57..348ca19ccbf8 100644
> --- a/arch/riscv/kernel/stacktrace.c
> +++ b/arch/riscv/kernel/stacktrace.c
> @@ -109,12 +109,12 @@ static bool print_trace_address(void *arg, unsigned long pc)
>  noinline void dump_backtrace(struct pt_regs *regs, struct task_struct *task,
>                     const char *loglvl)
>  {
> +       pr_cont("%sCall Trace:\n", loglvl);
>         walk_stackframe(task, regs, print_trace_address, (void *)loglvl);
>  }
>
>  void show_stack(struct task_struct *task, unsigned long *sp, const char *loglvl)
>  {
> -       pr_cont("%sCall Trace:\n", loglvl);
>         dump_backtrace(NULL, task, loglvl);
>  }
>
> Thanks,
>
> Alex

I wouldn't say that all riscv crashes are ending up in the "corrupted
report" bucket, but for some classes of errors there are definitely
differences from other architectures and they prevent syzkaller from
making sense out of those reports. At the moment everything seems to
be working fine at least with "WARNING:", "KASAN:" and "kernel
panic:".

I've run syzkaller with and without the small patch. From what I
observed, it definitely helps with the "BUG: soft lockup in" class of
reports. Previously they were declared corrupted, now syzkaller parses
them normally.

There's still a problem with "INFO: rcu_preempt detected stalls on
CPUs/tasks", which might be a bit more complicated than just the Call
Trace printing location.

Here's an example of such a report from x86: https://pastebin.com/KMEE5YRf
It starts with a header titled "rcu: INFO: rcu_preempt detected stalls
on CPUs/tasks:"
(https://elixir.bootlin.com/linux/v5.17/source/kernel/rcu/tree_stall.h#L520),
followed by a backtrace for one CPU
(https://elixir.bootlin.com/linux/v5.17/source/kernel/rcu/tree_stall.h#L331),
then another error message about a starving kthread
(https://elixir.bootlin.com/linux/v5.17/source/kernel/rcu/tree_stall.h#L442),
and finally two kthread-related traces.

And here's a report from riscv: https://pastebin.com/pN4rUjSi
There is effectively no backtrace between "rcu: INFO: rcu_preempt detected
stalls on CPUs/tasks:" and "rcu: RCU grace-period kthread stack
dump:".
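
To make the difference concrete, here is a minimal userspace sketch of
the check described above: given a console log, it looks for a
"Call Trace:" line between the stall header and the kthread stack dump
header. The helper name has_trace_between() is hypothetical, and this is
not syzkaller's actual parser; it only illustrates why a report without
a trace in that window is hard to classify.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return true if `needle` occurs in `log` after the
 * first occurrence of `start` and before the next occurrence of `end`.
 * If `end` is missing, the search runs to the end of the log.
 */
static bool has_trace_between(const char *log, const char *start,
			      const char *end, const char *needle)
{
	const char *from = strstr(log, start);
	const char *to, *hit;

	if (!from)
		return false;
	from += strlen(start);

	to = strstr(from, end);		/* end of the window, may be NULL */
	hit = strstr(from, needle);	/* first needle after the header */

	if (!hit)
		return false;
	return !to || hit < to;
}
```

Called with the two RCU headers quoted above as `start` and `end` and
"Call Trace:" as `needle`, this should return true for a log shaped like
the x86 report and false for one shaped like the riscv report, assuming
both logs otherwise follow the layout described in this mail.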


>
> >
> >
> > > --
> > > Best Regards,
> > > Aleksandr
> > >
> > > On Fri, Mar 4, 2022 at 5:12 AM Palmer Dabbelt <palmer@dabbelt.com> wrote:
> > > >
> > > > On Tue, 01 Mar 2022 09:39:54 PST (-0800), Palmer Dabbelt wrote:
> > > > > On Fri, 25 Feb 2022 07:00:23 PST (-0800), glider@google.com wrote:
> > > > >> On Fri, Feb 25, 2022 at 3:47 PM Alexandre Ghiti <alexandre.ghiti@canonical.com> wrote:
> > > > >>
> > > > >>> On Fri, Feb 25, 2022 at 3:31 PM Alexander Potapenko <glider@google.com> wrote:
> > > > >>> >
> > > > >>> > On Fri, Feb 25, 2022 at 3:15 PM Alexandre Ghiti <alexandre.ghiti@canonical.com> wrote:
> > > > >>> >>
> > > > >>> >> On Fri, Feb 25, 2022 at 3:10 PM Alexander Potapenko <glider@google.com> wrote:
> > > > >>> >> >
> > > > >>> >> > On Fri, Feb 25, 2022 at 3:04 PM Alexandre Ghiti <alexandre.ghiti@canonical.com> wrote:
> > > > >>> >> >>
> > > > >>> >> >> On Fri, Feb 25, 2022 at 2:06 PM Marco Elver <elver@google.com> wrote:
> > > > >>> >> >> >
> > > > >>> >> >> > On Fri, 25 Feb 2022 at 13:40, Alexandre Ghiti
> > > > >>> >> >> > <alexandre.ghiti@canonical.com> wrote:
> > > > >>> >> >> > >
> > > > >>> >> >> > > As reported by Aleksandr, syzbot riscv is broken since commit
> > > > >>> >> >> > > 54c5639d8f50 ("riscv: Fix asan-stack clang build"). This commit actually
> > > > >>> >> >> > > breaks KASAN_INLINE which is not fixed in this series, that will come later
> > > > >>> >> >> > > when found.
> > > > >>> >> >> > >
> > > > >>> >> >> > > Nevertheless, this series fixes small things that made the syzbot
> > > > >>> >> >> > > configuration + KASAN_OUTLINE fail to boot.
> > > > >>> >> >> > >
> > > > >>> >> >> > > Note that even though the config at [1] boots fine with this series, I
> > > > >>> >> >> > > was not able to boot the small config at [2] which fails because
> > > > >>> >> >> > > kasan_poison receives a really weird address 0x4075706301000000 (maybe a
> > > > >>> >> >> > > kasan person could provide some hint about what happens below in
> > > > >>> >> >> > > do_ctors -> __asan_register_globals):
> > > > >>> >> >> >
> > > > >>> >> >> > asan_register_globals is responsible for poisoning redzones around
> > > > >>> >> >> > globals. As hinted by 'do_ctors', it calls constructors, and in this
> > > > >>> >> >> > case a compiler-generated constructor that calls
> > > > >>> >> >> > __asan_register_globals with metadata generated by the compiler. That
> > > > >>> >> >> > metadata contains information about global variables. Note, these
> > > > >>> >> >> > constructors are called on initial boot, but also every time a kernel
> > > > >>> >> >> > module (that has globals) is loaded.
> > > > >>> >> >> >
> > > > >>> >> >> > It may also be a toolchain issue, but it's hard to say. If you're
> > > > >>> >> >> > using GCC to test, try Clang (11 or later), and vice-versa.
> > > > >>> >> >>
> > > > >>> >> >> I tried 3 different gcc toolchains already, but that did not fix the
> > > > >>> >> >> issue. The only thing that worked was setting asan-globals=0 in
> > > > >>> >> >> scripts/Makefile.kasan, but ok, that's not a fix.
> > > > >>> >> >> I tried to bisect this issue but our kasan implementation has been
> > > > >>> >> >> broken quite a few times, so it failed.
> > > > >>> >> >>
> > > > >>> >> >> I keep digging!
> > > > >>> >> >>
> > > > >>> >> >
> > > > >>> >> > The problem does not reproduce for me with GCC 11.2.0: kernels built with both [1] and [2] are bootable.
> > > > >>> >>
> > > > >>> >> Do you mean you reach userspace? Because my image boots too, and fails
> > > > >>> >> at some point:
> > > > >>> >>
> > > > >>> >> [    0.000150] sched_clock: 64 bits at 10MHz, resolution 100ns, wraps every 4398046511100ns
> > > > >>> >> [    0.015847] Console: colour dummy device 80x25
> > > > >>> >> [    0.016899] printk: console [tty0] enabled
> > > > >>> >> [    0.020326] printk: bootconsole [ns16550a0] disabled
> > > > >>> >>
> > > > >>> >
> > > > >>> > In my case, QEMU successfully boots to the login prompt.
> > > > >>> > I am running QEMU 6.2.0 (Debian 1:6.2+dfsg-2) and an image Aleksandr shared with me (guess it was built according to this instruction: https://github.com/google/syzkaller/blob/master/docs/linux/setup_linux-host_qemu-vm_riscv64-kernel.md)
> > > > >>> >
> > > > >>>
> > > > >>> Nice thanks guys! I always use the latest opensbi and not the one that
> > > > >>> is embedded in qemu, which is the only difference between your command
> > > > >>> line (which works) and mine (which does not work). So the issue is
> > > > >>> probably there, I really need to investigate that now.
> > > > >>>
> > > > >>> Great to hear that!
> > > > >>
> > > > >>
> > > > >>> That means I only need to fix KASAN_INLINE and we're good.
> > > > >>>
> > > > >>> I imagine Palmer can add your Tested-by on the series then?
> > > > >>>
> > > > >> Sure :)
> > > > >
> > > > > > Do you mind actually posting that (i.e., the Tested-by tag)?  It's less
> > > > > likely to get lost that way.  I intend on taking this into fixes ASAP,
> > > > > my builds have blown up for some reason (I got bounced between machines,
> > > > > so I'm blaming that) so I need to fix that first.
> > > >
> > > > This is on fixes (with a "Tested-by: Alexander Potapenko
> > > > <glider@google.com>"), along with some trivial commit message fixes.
> > > >
> > > > Thanks!
> > > >
> > > > >
> > > > >>
> > > > >>>
> > > > >>> Thanks again!
> > > > >>>
> > > > >>> Alex
> > > > >>>
> > > > >>> >>
> > > > >>> >> It traps here.
> > > > >>> >>
> > > > >>> >> > FWIW here is how I run them:
> > > > >>> >> >
> > > > >>> >> > qemu-system-riscv64 -m 2048 -smp 1 -nographic -no-reboot \
> > > > >>> >> >   -device virtio-rng-pci -machine virt -device \
> > > > >>> >> >   virtio-net-pci,netdev=net0 -netdev \
> > > > >>> >> >   user,id=net0,restrict=on,hostfwd=tcp:127.0.0.1:12529-:22 -device \
> > > > >>> >> >   virtio-blk-device,drive=hd0 -drive \
> > > > >>> >> >   file=${IMAGE},if=none,format=raw,id=hd0 -snapshot \
> > > > >>> >> >   -kernel ${KERNEL_SRC_DIR}/arch/riscv/boot/Image \
> > > > >>> >> >   -append "root=/dev/vda console=ttyS0 earlyprintk=serial"
> > > > >>> >> >
> > > > >>> >> >
> > > > >>> >> >>
> > > > >>> >> >> Thanks for the tips,
> > > > >>> >> >>
> > > > >>> >> >> Alex

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH -fixes v3 0/6] Fixes KASAN and other along the way
  2022-03-09 10:52       ` Dmitry Vyukov
@ 2022-03-10  8:41         ` Alexandre Ghiti
  2022-03-24 16:53           ` Aleksandr Nogikh
  0 siblings, 1 reply; 17+ messages in thread
From: Alexandre Ghiti @ 2022-03-10  8:41 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Aleksandr Nogikh, Palmer Dabbelt, Alexander Potapenko,
	Marco Elver, Paul Walmsley, Albert Ou, Andrey Ryabinin,
	Andrey Konovalov, Nick Hu, linux-riscv, LKML, kasan-dev

Hi,

On Wed, Mar 9, 2022 at 11:52 AM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Wed, 9 Mar 2022 at 11:45, Aleksandr Nogikh <nogikh@google.com> wrote:
> >
> > I switched the riscv syzbot instance to KASAN_OUTLINE and now it is
> > finally being fuzzed again!
> >
> > Thank you very much for the series!
>
>
> But all riscv crashes are still classified as "corrupted" and thrown
> away (not reported):
> https://syzkaller.appspot.com/bug?id=d5bc3e0c66d200d72216ab343a67c4327e4a3452
>
> The problem is that riscv oopses don't contain "Call Trace:" at the
> beginning of stack traces, so it's hard to make sense of them.
> arch/riscv seems to print "Call Trace:" in the wrong function, not where
> all other arches print it.
>

Does the following diff fix this issue?

diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c
index 201ee206fb57..348ca19ccbf8 100644
--- a/arch/riscv/kernel/stacktrace.c
+++ b/arch/riscv/kernel/stacktrace.c
@@ -109,12 +109,12 @@ static bool print_trace_address(void *arg, unsigned long pc)
 noinline void dump_backtrace(struct pt_regs *regs, struct task_struct *task,
                    const char *loglvl)
 {
+       pr_cont("%sCall Trace:\n", loglvl);
        walk_stackframe(task, regs, print_trace_address, (void *)loglvl);
 }

 void show_stack(struct task_struct *task, unsigned long *sp, const char *loglvl)
 {
-       pr_cont("%sCall Trace:\n", loglvl);
        dump_backtrace(NULL, task, loglvl);
 }
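
For anyone who wants to see the intended effect without building a
kernel, here is a small userspace mock of the post-patch call structure.
It assumes, as the report suggests, that some paths reach
dump_backtrace() without going through show_stack(). All mock_* names,
the output buffer and the fake frame line are hypothetical stand-ins,
not kernel code.

```c
#include <string.h>

static char out[256];	/* collects what the kernel would print */

/* Append a string to the output buffer, truncating if it is full. */
static void emit(const char *s)
{
	strncat(out, s, sizeof(out) - strlen(out) - 1);
}

/* Stand-in for walk_stackframe(): prints one made-up frame. */
static void mock_walk_stackframe(void)
{
	emit("[<0000000000000000>] fake_frame+0x0/0x0\n");
}

/* After the diff: the header is printed here, so every caller gets it. */
static void mock_dump_backtrace(void)
{
	emit("Call Trace:\n");
	mock_walk_stackframe();
}

/* After the diff: show_stack() only delegates, so no duplicate header. */
static void mock_show_stack(void)
{
	mock_dump_backtrace();
}
```

With this layout, both mock_show_stack() and a direct call to
mock_dump_backtrace() produce exactly one "Call Trace:" line ahead of
the frames, which is the property the diff is meant to provide.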

Thanks,

Alex

>
>
> > --
> > Best Regards,
> > Aleksandr
> >
> > On Fri, Mar 4, 2022 at 5:12 AM Palmer Dabbelt <palmer@dabbelt.com> wrote:
> > >
> > > On Tue, 01 Mar 2022 09:39:54 PST (-0800), Palmer Dabbelt wrote:
> > > > On Fri, 25 Feb 2022 07:00:23 PST (-0800), glider@google.com wrote:
> > > >> On Fri, Feb 25, 2022 at 3:47 PM Alexandre Ghiti <alexandre.ghiti@canonical.com> wrote:
> > > >>
> > > >>> On Fri, Feb 25, 2022 at 3:31 PM Alexander Potapenko <glider@google.com> wrote:
> > > >>> >
> > > >>> > On Fri, Feb 25, 2022 at 3:15 PM Alexandre Ghiti <alexandre.ghiti@canonical.com> wrote:
> > > >>> >>
> > > >>> >> On Fri, Feb 25, 2022 at 3:10 PM Alexander Potapenko <glider@google.com> wrote:
> > > >>> >> >
> > > >>> >> > On Fri, Feb 25, 2022 at 3:04 PM Alexandre Ghiti <alexandre.ghiti@canonical.com> wrote:
> > > >>> >> >>
> > > >>> >> >> On Fri, Feb 25, 2022 at 2:06 PM Marco Elver <elver@google.com> wrote:
> > > >>> >> >> >
> > > >>> >> >> > On Fri, 25 Feb 2022 at 13:40, Alexandre Ghiti
> > > >>> >> >> > <alexandre.ghiti@canonical.com> wrote:
> > > >>> >> >> > >
> > > >>> >> >> > > As reported by Aleksandr, syzbot riscv is broken since commit
> > > >>> >> >> > > 54c5639d8f50 ("riscv: Fix asan-stack clang build"). This commit actually
> > > >>> >> >> > > breaks KASAN_INLINE which is not fixed in this series, that will come later
> > > >>> >> >> > > when found.
> > > >>> >> >> > >
> > > >>> >> >> > > Nevertheless, this series fixes small things that made the syzbot
> > > >>> >> >> > > configuration + KASAN_OUTLINE fail to boot.
> > > >>> >> >> > >
> > > >>> >> >> > > Note that even though the config at [1] boots fine with this series, I
> > > >>> >> >> > > was not able to boot the small config at [2] which fails because
> > > >>> >> >> > > kasan_poison receives a really weird address 0x4075706301000000 (maybe a
> > > >>> >> >> > > kasan person could provide some hint about what happens below in
> > > >>> >> >> > > do_ctors -> __asan_register_globals):
> > > >>> >> >> >
> > > >>> >> >> > asan_register_globals is responsible for poisoning redzones around
> > > >>> >> >> > globals. As hinted by 'do_ctors', it calls constructors, and in this
> > > >>> >> >> > case a compiler-generated constructor that calls
> > > >>> >> >> > __asan_register_globals with metadata generated by the compiler. That
> > > >>> >> >> > metadata contains information about global variables. Note, these
> > > >>> >> >> > constructors are called on initial boot, but also every time a kernel
> > > >>> >> >> > module (that has globals) is loaded.
> > > >>> >> >> >
> > > >>> >> >> > It may also be a toolchain issue, but it's hard to say. If you're
> > > >>> >> >> > using GCC to test, try Clang (11 or later), and vice-versa.
> > > >>> >> >>
> > > >>> >> >> I tried 3 different gcc toolchains already, but that did not fix the
> > > >>> >> >> issue. The only thing that worked was setting asan-globals=0 in
> > > >>> >> >> scripts/Makefile.kasan, but ok, that's not a fix.
> > > >>> >> >> I tried to bisect this issue but our kasan implementation has been
> > > >>> >> >> broken quite a few times, so it failed.
> > > >>> >> >>
> > > >>> >> >> I keep digging!
> > > >>> >> >>
> > > >>> >> >
> > > >>> >> > The problem does not reproduce for me with GCC 11.2.0: kernels built with both [1] and [2] are bootable.
> > > >>> >>
> > > >>> >> Do you mean you reach userspace? Because my image boots too, and fails
> > > >>> >> at some point:
> > > >>> >>
> > > >>> >> [    0.000150] sched_clock: 64 bits at 10MHz, resolution 100ns, wraps every 4398046511100ns
> > > >>> >> [    0.015847] Console: colour dummy device 80x25
> > > >>> >> [    0.016899] printk: console [tty0] enabled
> > > >>> >> [    0.020326] printk: bootconsole [ns16550a0] disabled
> > > >>> >>
> > > >>> >
> > > >>> > In my case, QEMU successfully boots to the login prompt.
> > > >>> > I am running QEMU 6.2.0 (Debian 1:6.2+dfsg-2) and an image Aleksandr shared with me (guess it was built according to this instruction: https://github.com/google/syzkaller/blob/master/docs/linux/setup_linux-host_qemu-vm_riscv64-kernel.md)
> > > >>> >
> > > >>>
> > > >>> Nice thanks guys! I always use the latest opensbi and not the one that
> > > >>> is embedded in qemu, which is the only difference between your command
> > > >>> line (which works) and mine (which does not work). So the issue is
> > > >>> probably there, I really need to investigate that now.
> > > >>>
> > > >>> Great to hear that!
> > > >>
> > > >>
> > > >>> That means I only need to fix KASAN_INLINE and we're good.
> > > >>>
> > > >>> I imagine Palmer can add your Tested-by on the series then?
> > > >>>
> > > >> Sure :)
> > > >
> > > > > Do you mind actually posting that (i.e., the Tested-by tag)?  It's less
> > > > likely to get lost that way.  I intend on taking this into fixes ASAP,
> > > > my builds have blown up for some reason (I got bounced between machines,
> > > > so I'm blaming that) so I need to fix that first.
> > >
> > > This is on fixes (with a "Tested-by: Alexander Potapenko
> > > <glider@google.com>"), along with some trivial commit message fixes.
> > >
> > > Thanks!
> > >
> > > >
> > > >>
> > > >>>
> > > >>> Thanks again!
> > > >>>
> > > >>> Alex
> > > >>>
> > > >>> >>
> > > >>> >> It traps here.
> > > >>> >>
> > > >>> >> > FWIW here is how I run them:
> > > >>> >> >
> > > >>> >> > qemu-system-riscv64 -m 2048 -smp 1 -nographic -no-reboot \
> > > >>> >> >   -device virtio-rng-pci -machine virt -device \
> > > >>> >> >   virtio-net-pci,netdev=net0 -netdev \
> > > >>> >> >   user,id=net0,restrict=on,hostfwd=tcp:127.0.0.1:12529-:22 -device \
> > > >>> >> >   virtio-blk-device,drive=hd0 -drive \
> > > >>> >> >   file=${IMAGE},if=none,format=raw,id=hd0 -snapshot \
> > > >>> >> >   -kernel ${KERNEL_SRC_DIR}/arch/riscv/boot/Image \
> > > >>> >> >   -append "root=/dev/vda console=ttyS0 earlyprintk=serial"
> > > >>> >> >
> > > >>> >> >
> > > >>> >> >>
> > > >>> >> >> Thanks for the tips,
> > > >>> >> >>
> > > >>> >> >> Alex

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH -fixes v3 0/6] Fixes KASAN and other along the way
  2022-03-09 10:45     ` Aleksandr Nogikh
@ 2022-03-09 10:52       ` Dmitry Vyukov
  2022-03-10  8:41         ` Alexandre Ghiti
  0 siblings, 1 reply; 17+ messages in thread
From: Dmitry Vyukov @ 2022-03-09 10:52 UTC (permalink / raw)
  To: Aleksandr Nogikh
  Cc: Palmer Dabbelt, Alexander Potapenko, Alexandre Ghiti,
	Marco Elver, Paul Walmsley, Albert Ou, Andrey Ryabinin,
	Andrey Konovalov, Nick Hu, linux-riscv, LKML, kasan-dev

On Wed, 9 Mar 2022 at 11:45, Aleksandr Nogikh <nogikh@google.com> wrote:
>
> I switched the riscv syzbot instance to KASAN_OUTLINE and now it is
> finally being fuzzed again!
>
> Thank you very much for the series!


But all riscv crashes are still classified as "corrupted" and thrown
away (not reported):
https://syzkaller.appspot.com/bug?id=d5bc3e0c66d200d72216ab343a67c4327e4a3452

The problem is that riscv oopses don't contain "Call Trace:" at the
beginning of stack traces, so it's hard to make sense of them.
arch/riscv seems to print "Call Trace:" in the wrong function, not where
all other arches print it.



> --
> Best Regards,
> Aleksandr
>
> On Fri, Mar 4, 2022 at 5:12 AM Palmer Dabbelt <palmer@dabbelt.com> wrote:
> >
> > On Tue, 01 Mar 2022 09:39:54 PST (-0800), Palmer Dabbelt wrote:
> > > On Fri, 25 Feb 2022 07:00:23 PST (-0800), glider@google.com wrote:
> > >> On Fri, Feb 25, 2022 at 3:47 PM Alexandre Ghiti <alexandre.ghiti@canonical.com> wrote:
> > >>
> > >>> On Fri, Feb 25, 2022 at 3:31 PM Alexander Potapenko <glider@google.com> wrote:
> > >>> >
> > >>> > On Fri, Feb 25, 2022 at 3:15 PM Alexandre Ghiti <alexandre.ghiti@canonical.com> wrote:
> > >>> >>
> > >>> >> On Fri, Feb 25, 2022 at 3:10 PM Alexander Potapenko <glider@google.com> wrote:
> > >>> >> >
> > >>> >> > On Fri, Feb 25, 2022 at 3:04 PM Alexandre Ghiti <alexandre.ghiti@canonical.com> wrote:
> > >>> >> >>
> > >>> >> >> On Fri, Feb 25, 2022 at 2:06 PM Marco Elver <elver@google.com> wrote:
> > >>> >> >> >
> > >>> >> >> > On Fri, 25 Feb 2022 at 13:40, Alexandre Ghiti
> > >>> >> >> > <alexandre.ghiti@canonical.com> wrote:
> > >>> >> >> > >
> > >>> >> >> > > As reported by Aleksandr, syzbot riscv is broken since commit
> > >>> >> >> > > 54c5639d8f50 ("riscv: Fix asan-stack clang build"). This commit actually
> > >>> >> >> > > breaks KASAN_INLINE which is not fixed in this series, that will come later
> > >>> >> >> > > when found.
> > >>> >> >> > >
> > >>> >> >> > > Nevertheless, this series fixes small things that made the syzbot
> > >>> >> >> > > configuration + KASAN_OUTLINE fail to boot.
> > >>> >> >> > >
> > >>> >> >> > > Note that even though the config at [1] boots fine with this series, I
> > >>> >> >> > > was not able to boot the small config at [2] which fails because
> > >>> >> >> > > kasan_poison receives a really weird address 0x4075706301000000 (maybe a
> > >>> >> >> > > kasan person could provide some hint about what happens below in
> > >>> >> >> > > do_ctors -> __asan_register_globals):
> > >>> >> >> >
> > >>> >> >> > asan_register_globals is responsible for poisoning redzones around
> > >>> >> >> > globals. As hinted by 'do_ctors', it calls constructors, and in this
> > >>> >> >> > case a compiler-generated constructor that calls
> > >>> >> >> > __asan_register_globals with metadata generated by the compiler. That
> > >>> >> >> > metadata contains information about global variables. Note, these
> > >>> >> >> > constructors are called on initial boot, but also every time a kernel
> > >>> >> >> > module (that has globals) is loaded.
> > >>> >> >> >
> > >>> >> >> > It may also be a toolchain issue, but it's hard to say. If you're
> > >>> >> >> > using GCC to test, try Clang (11 or later), and vice-versa.
> > >>> >> >>
> > >>> >> >> I tried 3 different gcc toolchains already, but that did not fix the
> > >>> >> >> issue. The only thing that worked was setting asan-globals=0 in
> > >>> >> >> scripts/Makefile.kasan, but ok, that's not a fix.
> > >>> >> >> I tried to bisect this issue but our kasan implementation has been
> > >>> >> >> broken quite a few times, so it failed.
> > >>> >> >>
> > >>> >> >> I keep digging!
> > >>> >> >>
> > >>> >> >
> > >>> >> > The problem does not reproduce for me with GCC 11.2.0: kernels built
> > >>> with both [1] and [2] are bootable.
> > >>> >>
> > >>> >> Do you mean you reach userspace? Because my image boots too, and fails
> > >>> >> at some point:
> > >>> >>
> > >>> >> [    0.000150] sched_clock: 64 bits at 10MHz, resolution 100ns, wraps
> > >>> >> every 4398046511100ns
> > >>> >> [    0.015847] Console: colour dummy device 80x25
> > >>> >> [    0.016899] printk: console [tty0] enabled
> > >>> >> [    0.020326] printk: bootconsole [ns16550a0] disabled
> > >>> >>
> > >>> >
> > >>> > In my case, QEMU successfully boots to the login prompt.
> > >>> > I am running QEMU 6.2.0 (Debian 1:6.2+dfsg-2) and an image Aleksandr
> > >>> shared with me (guess it was built according to this instruction:
> > >>> https://github.com/google/syzkaller/blob/master/docs/linux/setup_linux-host_qemu-vm_riscv64-kernel.md
> > >>> )
> > >>> >
> > >>>
> > >>> Nice thanks guys! I always use the latest opensbi and not the one that
> > >>> is embedded in qemu, which is the only difference between your command
> > >>> line (which works) and mine (which does not work). So the issue is
> > >>> probably there, I really need to investigate that now.
> > >>>
> > >>> Great to hear that!
> > >>
> > >>
> > >>> That means I only need to fix KASAN_INLINE and we're good.
> > >>>
> > >>> I imagine Palmer can add your Tested-by on the series then?
> > >>>
> > >> Sure :)
> > >
> > > Do you mind actually posting that (i.e., the Tested-by tag)?  It's less
> > > likely to get lost that way.  I intend on taking this into fixes ASAP,
> > > my builds have blown up for some reason (I got bounced between machines,
> > > so I'm blaming that) so I need to fix that first.
> >
> > This is on fixes (with a "Tested-by: Alexander Potapenko
> > <glider@google.com>"), along with some trivial commit message fixes.
> >
> > Thanks!
> >
> > >
> > >>
> > >>>
> > >>> Thanks again!
> > >>>
> > >>> Alex
> > >>>
> > >>> >>
> > >>> >> It traps here.
> > >>> >>
> > >>> >> > FWIW here is how I run them:
> > >>> >> >
> > >>> >> > qemu-system-riscv64 -m 2048 -smp 1 -nographic -no-reboot \
> > >>> >> >   -device virtio-rng-pci -machine virt -device \
> > >>> >> >   virtio-net-pci,netdev=net0 -netdev \
> > >>> >> >   user,id=net0,restrict=on,hostfwd=tcp:127.0.0.1:12529-:22 -device \
> > >>> >> >   virtio-blk-device,drive=hd0 -drive \
> > >>> >> >   file=${IMAGE},if=none,format=raw,id=hd0 -snapshot \
> > >>> >> >   -kernel ${KERNEL_SRC_DIR}/arch/riscv/boot/Image -append \
> > >>> >> >   "root=/dev/vda console=ttyS0 earlyprintk=serial"
> > >>> >> >
> > >>> >> >
> > >>> >> >>
> > >>> >> >> Thanks for the tips,
> > >>> >> >>
> > >>> >> >> Alex
> > >>> >> >
> > >>> >> >
> > >>> >> >
> > >>> >> > --
> > >>> >> > Alexander Potapenko
> > >>> >> > Software Engineer
> > >>> >> >
> > >>> >> > Google Germany GmbH
> > >>> >> > Erika-Mann-Straße, 33
> > >>> >> > 80636 München
> > >>> >> >
> > >>> >> > Geschäftsführer: Paul Manicle, Liana Sebastian
> > >>> >> > Registergericht und -nummer: Hamburg, HRB 86891
> > >>> >> > Sitz der Gesellschaft: Hamburg
> > >>> >> >
> > >>> >> > Diese E-Mail ist vertraulich. Falls Sie diese fälschlicherweise
> > >>> erhalten haben sollten, leiten Sie diese bitte nicht an jemand anderes
> > >>> weiter, löschen Sie alle Kopien und Anhänge davon und lassen Sie mich bitte
> > >>> wissen, dass die E-Mail an die falsche Person gesendet wurde.
> > >>> >> >
> > >>> >> >
> > >>> >> >
> > >>> >> > This e-mail is confidential. If you received this communication by
> > >>> mistake, please don't forward it to anyone else, please erase all copies
> > >>> and attachments, and please let me know that it has gone to the wrong
> > >>> person.
> > >>> >>
> > >>> >> --
> > >>> >> You received this message because you are subscribed to the Google
> > >>> Groups "kasan-dev" group.
> > >>> >> To unsubscribe from this group and stop receiving emails from it, send
> > >>> an email to kasan-dev+unsubscribe@googlegroups.com.
> > >>> >> To view this discussion on the web visit
> > >>> https://groups.google.com/d/msgid/kasan-dev/CA%2BzEjCsQPVYSV7CdhKnvjujXkMXuRQd%3DVPok1awb20xifYmidw%40mail.gmail.com
> > >>> .

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH -fixes v3 0/6] Fixes KASAN and other along the way
  2022-03-04  4:12   ` Palmer Dabbelt
@ 2022-03-09 10:45     ` Aleksandr Nogikh
  2022-03-09 10:52       ` Dmitry Vyukov
  0 siblings, 1 reply; 17+ messages in thread
From: Aleksandr Nogikh @ 2022-03-09 10:45 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: Alexander Potapenko, Alexandre Ghiti, Marco Elver, Paul Walmsley,
	Albert Ou, Andrey Ryabinin, Andrey Konovalov, Dmitry Vyukov,
	Nick Hu, linux-riscv, LKML, kasan-dev

I switched the riscv syzbot instance to KASAN_OUTLINE and now it is
finally being fuzzed again!

Thank you very much for the series!

--
Best Regards,
Aleksandr

On Fri, Mar 4, 2022 at 5:12 AM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>
> On Tue, 01 Mar 2022 09:39:54 PST (-0800), Palmer Dabbelt wrote:
> > On Fri, 25 Feb 2022 07:00:23 PST (-0800), glider@google.com wrote:
> >> On Fri, Feb 25, 2022 at 3:47 PM Alexandre Ghiti <
> >> alexandre.ghiti@canonical.com> wrote:
> >>
> >>> On Fri, Feb 25, 2022 at 3:31 PM Alexander Potapenko <glider@google.com>
> >>> wrote:
> >>> >
> >>> >
> >>> >
> >>> > On Fri, Feb 25, 2022 at 3:15 PM Alexandre Ghiti <
> >>> alexandre.ghiti@canonical.com> wrote:
> >>> >>
> >>> >> On Fri, Feb 25, 2022 at 3:10 PM Alexander Potapenko <glider@google.com>
> >>> wrote:
> >>> >> >
> >>> >> >
> >>> >> >
> >>> >> > On Fri, Feb 25, 2022 at 3:04 PM Alexandre Ghiti <
> >>> alexandre.ghiti@canonical.com> wrote:
> >>> >> >>
> >>> >> >> On Fri, Feb 25, 2022 at 2:06 PM Marco Elver <elver@google.com>
> >>> wrote:
> >>> >> >> >
> >>> >> >> > On Fri, 25 Feb 2022 at 13:40, Alexandre Ghiti
> >>> >> >> > <alexandre.ghiti@canonical.com> wrote:
> >>> >> >> > >
> >>> >> >> > > As reported by Aleksandr, syzbot riscv is broken since commit
> >>> >> >> > > 54c5639d8f50 ("riscv: Fix asan-stack clang build"). This commit
> >>> actually
> >>> >> >> > > breaks KASAN_INLINE which is not fixed in this series, that will
> >>> come later
> >>> >> >> > > when found.
> >>> >> >> > >
> >>> >> >> > > Nevertheless, this series fixes small things that made the syzbot
> >>> >> >> > > configuration + KASAN_OUTLINE fail to boot.
> >>> >> >> > >
> >>> >> >> > > Note that even though the config at [1] boots fine with this
> >>> series, I
> >>> >> >> > > was not able to boot the small config at [2] which fails because
> >>> >> >> > > kasan_poison receives a really weird address 0x4075706301000000
> >>> (maybe a
> >>> >> >> > > kasan person could provide some hint about what happens below in
> >>> >> >> > > do_ctors -> __asan_register_globals):
> >>> >> >> >
> >>> >> >> > asan_register_globals is responsible for poisoning redzones around
> >>> >> >> > globals. As hinted by 'do_ctors', it calls constructors, and in
> >>> this
> >>> >> >> > case a compiler-generated constructor that calls
> >>> >> >> > __asan_register_globals with metadata generated by the compiler.
> >>> That
> >>> >> >> > metadata contains information about global variables. Note, these
> >>> >> >> > constructors are called on initial boot, but also every time a
> >>> kernel
> >>> >> >> > module (that has globals) is loaded.
> >>> >> >> >
> >>> >> >> > It may also be a toolchain issue, but it's hard to say. If you're
> >>> >> >> > using GCC to test, try Clang (11 or later), and vice-versa.
> >>> >> >>
> >>> >> >> I tried 3 different gcc toolchains already, but that did not fix the
> >>> >> >> issue. The only thing that worked was setting asan-globals=0 in
> >>> >> >> scripts/Makefile.kasan, but ok, that's not a fix.
> >>> >> >> I tried to bisect this issue but our kasan implementation has been
> >>> >> >> broken quite a few times, so it failed.
> >>> >> >>
> >>> >> >> I keep digging!
> >>> >> >>
> >>> >> >
> >>> >> > The problem does not reproduce for me with GCC 11.2.0: kernels built
> >>> with both [1] and [2] are bootable.
> >>> >>
> >>> >> Do you mean you reach userspace? Because my image boots too, and fails
> >>> >> at some point:
> >>> >>
> >>> >> [    0.000150] sched_clock: 64 bits at 10MHz, resolution 100ns, wraps
> >>> >> every 4398046511100ns
> >>> >> [    0.015847] Console: colour dummy device 80x25
> >>> >> [    0.016899] printk: console [tty0] enabled
> >>> >> [    0.020326] printk: bootconsole [ns16550a0] disabled
> >>> >>
> >>> >
> >>> > In my case, QEMU successfully boots to the login prompt.
> >>> > I am running QEMU 6.2.0 (Debian 1:6.2+dfsg-2) and an image Aleksandr
> >>> shared with me (guess it was built according to this instruction:
> >>> https://github.com/google/syzkaller/blob/master/docs/linux/setup_linux-host_qemu-vm_riscv64-kernel.md
> >>> )
> >>> >
> >>>
> >>> Nice thanks guys! I always use the latest opensbi and not the one that
> >>> is embedded in qemu, which is the only difference between your command
> >>> line (which works) and mine (which does not work). So the issue is
> >>> probably there, I really need to investigate that now.
> >>>
> >>> Great to hear that!
> >>
> >>
> >>> That means I only need to fix KASAN_INLINE and we're good.
> >>>
> >>> I imagine Palmer can add your Tested-by on the series then?
> >>>
> >> Sure :)
> >
> > Do you mind actually posting that (i.e., the Tested-by tag)?  It's less
> > likely to get lost that way.  I intend on taking this into fixes ASAP,
> > my builds have blown up for some reason (I got bounced between machines,
> > so I'm blaming that) so I need to fix that first.
>
> This is on fixes (with a "Tested-by: Alexander Potapenko
> <glider@google.com>"), along with some trivial commit message fixes.
>
> Thanks!
>
> >
> >>
> >>>
> >>> Thanks again!
> >>>
> >>> Alex
> >>>
> >>> >>
> >>> >> It traps here.
> >>> >>
> >>> >> > FWIW here is how I run them:
> >>> >> >
> >>> >> > qemu-system-riscv64 -m 2048 -smp 1 -nographic -no-reboot \
> >>> >> >   -device virtio-rng-pci -machine virt -device \
> >>> >> >   virtio-net-pci,netdev=net0 -netdev \
> >>> >> >   user,id=net0,restrict=on,hostfwd=tcp:127.0.0.1:12529-:22 -device \
> >>> >> >   virtio-blk-device,drive=hd0 -drive \
> >>> >> >   file=${IMAGE},if=none,format=raw,id=hd0 -snapshot \
> >>> >> >   -kernel ${KERNEL_SRC_DIR}/arch/riscv/boot/Image -append \
> >>> >> >   "root=/dev/vda console=ttyS0 earlyprintk=serial"
> >>> >> >
> >>> >> >
> >>> >> >>
> >>> >> >> Thanks for the tips,
> >>> >> >>
> >>> >> >> Alex

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH -fixes v3 0/6] Fixes KASAN and other along the way
  2022-03-01 17:39 ` Palmer Dabbelt
@ 2022-03-04  4:12   ` Palmer Dabbelt
  2022-03-09 10:45     ` Aleksandr Nogikh
  0 siblings, 1 reply; 17+ messages in thread
From: Palmer Dabbelt @ 2022-03-04  4:12 UTC (permalink / raw)
  To: glider, alexandre.ghiti
  Cc: elver, Paul Walmsley, aou, ryabinin.a.a, andreyknvl, dvyukov,
	nogikh, nickhu, linux-riscv, linux-kernel, kasan-dev

On Tue, 01 Mar 2022 09:39:54 PST (-0800), Palmer Dabbelt wrote:
> On Fri, 25 Feb 2022 07:00:23 PST (-0800), glider@google.com wrote:
>> On Fri, Feb 25, 2022 at 3:47 PM Alexandre Ghiti <
>> alexandre.ghiti@canonical.com> wrote:
>>
>>> On Fri, Feb 25, 2022 at 3:31 PM Alexander Potapenko <glider@google.com>
>>> wrote:
>>> >
>>> >
>>> >
>>> > On Fri, Feb 25, 2022 at 3:15 PM Alexandre Ghiti <
>>> alexandre.ghiti@canonical.com> wrote:
>>> >>
>>> >> On Fri, Feb 25, 2022 at 3:10 PM Alexander Potapenko <glider@google.com>
>>> wrote:
>>> >> >
>>> >> >
>>> >> >
>>> >> > On Fri, Feb 25, 2022 at 3:04 PM Alexandre Ghiti <
>>> alexandre.ghiti@canonical.com> wrote:
>>> >> >>
>>> >> >> On Fri, Feb 25, 2022 at 2:06 PM Marco Elver <elver@google.com>
>>> wrote:
>>> >> >> >
>>> >> >> > On Fri, 25 Feb 2022 at 13:40, Alexandre Ghiti
>>> >> >> > <alexandre.ghiti@canonical.com> wrote:
>>> >> >> > >
>>> >> >> > > As reported by Aleksandr, syzbot riscv is broken since commit
>>> >> >> > > 54c5639d8f50 ("riscv: Fix asan-stack clang build"). This commit
>>> actually
>>> >> >> > > breaks KASAN_INLINE which is not fixed in this series, that will
>>> come later
>>> >> >> > > when found.
>>> >> >> > >
>>> >> >> > > Nevertheless, this series fixes small things that made the syzbot
>>> >> >> > > configuration + KASAN_OUTLINE fail to boot.
>>> >> >> > >
>>> >> >> > > Note that even though the config at [1] boots fine with this
>>> series, I
>>> >> >> > > was not able to boot the small config at [2] which fails because
>>> >> >> > > kasan_poison receives a really weird address 0x4075706301000000
>>> (maybe a
>>> >> >> > > kasan person could provide some hint about what happens below in
>>> >> >> > > do_ctors -> __asan_register_globals):
>>> >> >> >
>>> >> >> > asan_register_globals is responsible for poisoning redzones around
>>> >> >> > globals. As hinted by 'do_ctors', it calls constructors, and in
>>> this
>>> >> >> > case a compiler-generated constructor that calls
>>> >> >> > __asan_register_globals with metadata generated by the compiler.
>>> That
>>> >> >> > metadata contains information about global variables. Note, these
>>> >> >> > constructors are called on initial boot, but also every time a
>>> kernel
>>> >> >> > module (that has globals) is loaded.
>>> >> >> >
>>> >> >> > It may also be a toolchain issue, but it's hard to say. If you're
>>> >> >> > using GCC to test, try Clang (11 or later), and vice-versa.
>>> >> >>
>>> >> >> I tried 3 different gcc toolchains already, but that did not fix the
>>> >> >> issue. The only thing that worked was setting asan-globals=0 in
>>> >> >> scripts/Makefile.kasan, but ok, that's not a fix.
>>> >> >> I tried to bisect this issue but our kasan implementation has been
>>> >> >> broken quite a few times, so it failed.
>>> >> >>
>>> >> >> I keep digging!
>>> >> >>
>>> >> >
>>> >> > The problem does not reproduce for me with GCC 11.2.0: kernels built
>>> with both [1] and [2] are bootable.
>>> >>
>>> >> Do you mean you reach userspace? Because my image boots too, and fails
>>> >> at some point:
>>> >>
>>> >> [    0.000150] sched_clock: 64 bits at 10MHz, resolution 100ns, wraps
>>> >> every 4398046511100ns
>>> >> [    0.015847] Console: colour dummy device 80x25
>>> >> [    0.016899] printk: console [tty0] enabled
>>> >> [    0.020326] printk: bootconsole [ns16550a0] disabled
>>> >>
>>> >
>>> > In my case, QEMU successfully boots to the login prompt.
>>> > I am running QEMU 6.2.0 (Debian 1:6.2+dfsg-2) and an image Aleksandr
>>> shared with me (guess it was built according to this instruction:
>>> https://github.com/google/syzkaller/blob/master/docs/linux/setup_linux-host_qemu-vm_riscv64-kernel.md
>>> )
>>> >
>>>
>>> Nice thanks guys! I always use the latest opensbi and not the one that
>>> is embedded in qemu, which is the only difference between your command
>>> line (which works) and mine (which does not work). So the issue is
>>> probably there, I really need to investigate that now.
>>>
>>> Great to hear that!
>>
>>
>>> That means I only need to fix KASAN_INLINE and we're good.
>>>
>>> I imagine Palmer can add your Tested-by on the series then?
>>>
>> Sure :)
>
> Do you mind actually posting that (i.e., the Tested-by tag)?  It's less
> likely to get lost that way.  I intend on taking this into fixes ASAP,
> my builds have blown up for some reason (I got bounced between machines,
> so I'm blaming that) so I need to fix that first.

This is on fixes (with a "Tested-by: Alexander Potapenko 
<glider@google.com>"), along with some trivial commit message fixes.

Thanks!

>
>>
>>>
>>> Thanks again!
>>>
>>> Alex
>>>
>>> >>
>>> >> It traps here.
>>> >>
>>> >> > FWIW here is how I run them:
>>> >> >
>>> >> > qemu-system-riscv64 -m 2048 -smp 1 -nographic -no-reboot \
>>> >> >   -device virtio-rng-pci -machine virt -device \
>>> >> >   virtio-net-pci,netdev=net0 -netdev \
>>> >> >   user,id=net0,restrict=on,hostfwd=tcp:127.0.0.1:12529-:22 -device \
>>> >> >   virtio-blk-device,drive=hd0 -drive \
>>> >> >   file=${IMAGE},if=none,format=raw,id=hd0 -snapshot \
>>> >> >   -kernel ${KERNEL_SRC_DIR}/arch/riscv/boot/Image -append \
>>> >> >   "root=/dev/vda console=ttyS0 earlyprintk=serial"
>>> >> >
>>> >> >
>>> >> >>
>>> >> >> Thanks for the tips,
>>> >> >>
>>> >> >> Alex

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH -fixes v3 0/6] Fixes KASAN and other along the way
       [not found] <CAG_fn=WTJF24TH6ENGD-3S0B_AV4=-39=2ry-uDguZ8Q7f=z=Q@mail.gmail.com>
@ 2022-03-01 17:39 ` Palmer Dabbelt
  2022-03-04  4:12   ` Palmer Dabbelt
  0 siblings, 1 reply; 17+ messages in thread
From: Palmer Dabbelt @ 2022-03-01 17:39 UTC (permalink / raw)
  To: glider
  Cc: alexandre.ghiti, elver, Paul Walmsley, aou, ryabinin.a.a,
	andreyknvl, dvyukov, nogikh, nickhu, linux-riscv, linux-kernel,
	kasan-dev

On Fri, 25 Feb 2022 07:00:23 PST (-0800), glider@google.com wrote:
> On Fri, Feb 25, 2022 at 3:47 PM Alexandre Ghiti <
> alexandre.ghiti@canonical.com> wrote:
>
>> On Fri, Feb 25, 2022 at 3:31 PM Alexander Potapenko <glider@google.com>
>> wrote:
>> >
>> >
>> >
>> > On Fri, Feb 25, 2022 at 3:15 PM Alexandre Ghiti <
>> alexandre.ghiti@canonical.com> wrote:
>> >>
>> >> On Fri, Feb 25, 2022 at 3:10 PM Alexander Potapenko <glider@google.com>
>> wrote:
>> >> >
>> >> >
>> >> >
>> >> > On Fri, Feb 25, 2022 at 3:04 PM Alexandre Ghiti <
>> alexandre.ghiti@canonical.com> wrote:
>> >> >>
>> >> >> On Fri, Feb 25, 2022 at 2:06 PM Marco Elver <elver@google.com>
>> wrote:
>> >> >> >
>> >> >> > On Fri, 25 Feb 2022 at 13:40, Alexandre Ghiti
>> >> >> > <alexandre.ghiti@canonical.com> wrote:
>> >> >> > >
>> >> >> > > As reported by Aleksandr, syzbot riscv is broken since commit
>> >> >> > > 54c5639d8f50 ("riscv: Fix asan-stack clang build"). This commit
>> actually
>> >> >> > > breaks KASAN_INLINE which is not fixed in this series, that will
>> come later
>> >> >> > > when found.
>> >> >> > >
>> >> >> > > Nevertheless, this series fixes small things that made the syzbot
>> >> >> > > configuration + KASAN_OUTLINE fail to boot.
>> >> >> > >
>> >> >> > > Note that even though the config at [1] boots fine with this
>> series, I
>> >> >> > > was not able to boot the small config at [2] which fails because
>> >> >> > > kasan_poison receives a really weird address 0x4075706301000000
>> (maybe a
>> >> >> > > kasan person could provide some hint about what happens below in
>> >> >> > > do_ctors -> __asan_register_globals):
>> >> >> >
>> >> >> > asan_register_globals is responsible for poisoning redzones around
>> >> >> > globals. As hinted by 'do_ctors', it calls constructors, and in
>> this
>> >> >> > case a compiler-generated constructor that calls
>> >> >> > __asan_register_globals with metadata generated by the compiler.
>> That
>> >> >> > metadata contains information about global variables. Note, these
>> >> >> > constructors are called on initial boot, but also every time a
>> kernel
>> >> >> > module (that has globals) is loaded.
>> >> >> >
>> >> >> > It may also be a toolchain issue, but it's hard to say. If you're
>> >> >> > using GCC to test, try Clang (11 or later), and vice-versa.
>> >> >>
>> >> >> I tried 3 different gcc toolchains already, but that did not fix the
>> >> >> issue. The only thing that worked was setting asan-globals=0 in
>> >> >> scripts/Makefile.kasan, but ok, that's not a fix.
>> >> >> I tried to bisect this issue but our kasan implementation has been
>> >> >> broken quite a few times, so it failed.
>> >> >>
>> >> >> I keep digging!
>> >> >>
>> >> >
>> >> > The problem does not reproduce for me with GCC 11.2.0: kernels built
>> with both [1] and [2] are bootable.
>> >>
>> >> Do you mean you reach userspace? Because my image boots too, and fails
>> >> at some point:
>> >>
>> >> [    0.000150] sched_clock: 64 bits at 10MHz, resolution 100ns, wraps
>> >> every 4398046511100ns
>> >> [    0.015847] Console: colour dummy device 80x25
>> >> [    0.016899] printk: console [tty0] enabled
>> >> [    0.020326] printk: bootconsole [ns16550a0] disabled
>> >>
>> >
>> > In my case, QEMU successfully boots to the login prompt.
>> > I am running QEMU 6.2.0 (Debian 1:6.2+dfsg-2) and an image Aleksandr
>> shared with me (guess it was built according to this instruction:
>> https://github.com/google/syzkaller/blob/master/docs/linux/setup_linux-host_qemu-vm_riscv64-kernel.md
>> )
>> >
>>
>> Nice thanks guys! I always use the latest opensbi and not the one that
>> is embedded in qemu, which is the only difference between your command
>> line (which works) and mine (which does not work). So the issue is
>> probably there, I really need to investigate that now.
>>
>> Great to hear that!
>
>
>> That means I only need to fix KASAN_INLINE and we're good.
>>
>> I imagine Palmer can add your Tested-by on the series then?
>>
> Sure :)

Do you mind actually posting that (i.e., the Tested-by tag)?  It's less
likely to get lost that way.  I intend to take this into fixes ASAP, but
my builds have blown up for some reason (I got bounced between machines,
so I'm blaming that), so I need to fix that first.

>
>>
>> Thanks again!
>>
>> Alex
>>
>> >>
>> >> It traps here.
>> >>
>> >> > FWIW here is how I run them:
>> >> >
>> >> > qemu-system-riscv64 -m 2048 -smp 1 -nographic -no-reboot \
>> >> >   -device virtio-rng-pci -machine virt -device \
>> >> >   virtio-net-pci,netdev=net0 -netdev \
>> >> >   user,id=net0,restrict=on,hostfwd=tcp:127.0.0.1:12529-:22 -device \
>> >> >   virtio-blk-device,drive=hd0 -drive \
>> >> >   file=${IMAGE},if=none,format=raw,id=hd0 -snapshot \
>> >> >   -kernel ${KERNEL_SRC_DIR}/arch/riscv/boot/Image -append \
>> >> >   "root=/dev/vda console=ttyS0 earlyprintk=serial"
>> >> >
>> >> >
>> >> >>
>> >> >> Thanks for the tips,
>> >> >>
>> >> >> Alex
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > Alexander Potapenko
>> >> > Software Engineer
>> >> >
>> >> > Google Germany GmbH
>> >> > Erika-Mann-Straße, 33
>> >> > 80636 München
>> >> >
>> >> > Managing directors: Paul Manicle, Liana Sebastian
>> >> > Registration court and number: Hamburg, HRB 86891
>> >> > Registered office: Hamburg
>> >> >
>> >> > This e-mail is confidential. If you received this communication by
>> >> > mistake, please don't forward it to anyone else, please erase all copies
>> >> > and attachments, and please let me know that it has gone to the wrong
>> >> > person.
>> >>
>> >> --
>> >> You received this message because you are subscribed to the Google
>> Groups "kasan-dev" group.
>> >> To unsubscribe from this group and stop receiving emails from it, send
>> an email to kasan-dev+unsubscribe@googlegroups.com.
>> >> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/kasan-dev/CA%2BzEjCsQPVYSV7CdhKnvjujXkMXuRQd%3DVPok1awb20xifYmidw%40mail.gmail.com
>> .
>>
>>

Thread overview: 17+ messages
2022-02-25 12:39 [PATCH -fixes v3 0/6] Fixes KASAN and other along the way Alexandre Ghiti
2022-02-25 12:39 ` [PATCH -fixes v3 1/6] riscv: Fix is_linear_mapping with recent move of KASAN region Alexandre Ghiti
2022-02-25 12:39 ` [PATCH -fixes v3 2/6] riscv: Fix config KASAN && SPARSEMEM && !SPARSE_VMEMMAP Alexandre Ghiti
2022-02-25 12:39 ` [PATCH -fixes v3 3/6] riscv: Fix DEBUG_VIRTUAL false warnings Alexandre Ghiti
2022-02-25 12:39 ` [PATCH -fixes v3 4/6] riscv: Fix config KASAN && DEBUG_VIRTUAL Alexandre Ghiti
2022-02-25 12:39 ` [PATCH -fixes v3 5/6] riscv: Move high_memory initialization to setup_bootmem Alexandre Ghiti
2022-02-25 12:39 ` [PATCH -fixes v3 6/6] riscv: Fix kasan pud population Alexandre Ghiti
2022-02-25 13:05 ` [PATCH -fixes v3 0/6] Fixes KASAN and other along the way Marco Elver
2022-02-25 14:04   ` Alexandre Ghiti
     [not found]     ` <CAG_fn=WYmkqPX_qCVmxv1dx87JkXHGF1-a6_8K0jwWuBWzRJfA@mail.gmail.com>
2022-02-25 14:15       ` Alexandre Ghiti
     [not found]         ` <CAG_fn=VZ3fS7ekmJknQ6sW5zC09iUT9mzWjEhyrn3NaAWfVP_Q@mail.gmail.com>
2022-02-25 14:46           ` Alexandre Ghiti
     [not found] <CAG_fn=WTJF24TH6ENGD-3S0B_AV4=-39=2ry-uDguZ8Q7f=z=Q@mail.gmail.com>
2022-03-01 17:39 ` Palmer Dabbelt
2022-03-04  4:12   ` Palmer Dabbelt
2022-03-09 10:45     ` Aleksandr Nogikh
2022-03-09 10:52       ` Dmitry Vyukov
2022-03-10  8:41         ` Alexandre Ghiti
2022-03-24 16:53           ` Aleksandr Nogikh
