Linux-RISC-V Archive on lore.kernel.org
* syzkaller on risc-v
@ 2020-06-30 12:48 Dmitry Vyukov
  2020-06-30 12:57 ` Andreas Schwab
                   ` (3 more replies)
  0 siblings, 4 replies; 29+ messages in thread
From: Dmitry Vyukov @ 2020-06-30 12:48 UTC
  To: Tobias Klauser, Björn Töpel, Paul Walmsley,
	Palmer Dabbelt, Albert Ou
  Cc: linux-riscv, syzkaller

Hello risc-v maintainers,

A few days ago Tobias ported syzkaller (a kernel fuzzer) to the risc-v arch:
https://github.com/google/syzkaller/pull/1867
Tobias also provided nice instructions on how to run it using qemu+buildroot:
https://github.com/google/syzkaller/blob/master/docs/linux/setup_linux-host_qemu-vm_riscv64-kernel.md
I tried to run it and it works. I wanted to write down some findings
in a public place. Some may be known, some not; some may be easier to
address, some may be harder. For now my goal is just to document them.

1. KASAN does not seem to work.
I've tried both v5.8-rc2 and 1590a2e1c681b0991bd42c992cabfd380e0338f2,
with and without KASAN and KCOV, with both inline and outline
instrumentation, and all experiments point to broken KASAN. Boot gets
to the "INSTRUCTION SETS WANT TO BE FREE" banner and then hangs dead in
secondary_start_common; you can see some details here:
https://github.com/google/syzkaller/pull/1875#issuecomment-650545255
KASAN would be a prerequisite for testing risc-v on syzbot.
The recent KCOV patch works well, though.

2. I've also tried to convert our beefy syzbot config for x86_64, which
includes lots of both debug configs and subsystem configs:
https://github.com/google/syzkaller/blob/master/dashboard/config/upstream-kasan.config
I ran it through olddefconfig for risc-v, disabled KASAN, tried to
boot, and got a similar boot hang. I did not try to bisect the
config further.

3. Running with a small config (defconfig+KCOV), I initially got stack
overflows all over the place. Here are some samples:
https://gist.githubusercontent.com/dvyukov/0b6c7d93e2059f91241677a115c8e1ef/raw/947b7626f724262ba6fa3eb67b81f1a3f65cb419/gistfile1.txt
I ended up doing:

--- a/arch/riscv/include/asm/thread_info.h
+++ b/arch/riscv/include/asm/thread_info.h
-#define THREAD_SIZE_ORDER      (1)
+#define THREAD_SIZE_ORDER      (2)

This eliminated the stack overflows.
KCOV may increase stack usage a bit, but not radically like KASAN. So
I would assume some stack overflows can happen without KCOV as well.
So either we need this change, or we should at least bump the stack
size under KCOV.
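
For a sense of scale, here is a minimal sketch of what that change means
in bytes. It assumes the usual derivation THREAD_SIZE = PAGE_SIZE <<
THREAD_SIZE_ORDER and 4 KiB pages; the SKETCH_* macros and the function
thread_size_bytes are made-up names for illustration, not kernel code:

```c
#include <stddef.h>

/* Assumption: 4 KiB pages, as on typical riscv64 Linux configs. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical helper mirroring THREAD_SIZE = PAGE_SIZE << THREAD_SIZE_ORDER.
 * Returns the per-task kernel stack size in bytes for a given order.
 */
static size_t thread_size_bytes(unsigned int thread_size_order)
{
	return SKETCH_PAGE_SIZE << thread_size_order;
}
```

Under these assumptions, order 1 corresponds to 8 KiB and order 2 to
16 KiB, so the patch doubles every kernel stack.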

4. In lots of cases I did not get meaningful stack traces.
E.g. WARNINGs don't unwind past the exception, which makes the stack useless:
https://gist.githubusercontent.com/dvyukov/717c748dd5cc20f2214026331467cd9f/raw/dd5da078a0bc0210ecf00bdee1112d610305189c/gistfile1.txt
This also happened a dozen times for stack overflows:
https://gist.githubusercontent.com/dvyukov/6f58a866c8ba53343fd2142b1dfcfffa/raw/1ac463c5924fa53fbe99fd8a4e093af3e3429c0f/gistfile1.txt
Also, RCU stalls did not get stacks past the timer interrupt:
https://gist.githubusercontent.com/dvyukov/bbad28c67d55fb4e12936da13c533cf5/raw/fb41b4805238fed753b39641d6c7e496519f7f56/gistfile1.txt
and various kinds of exceptions did not get any meaningful stack traces:
https://gist.githubusercontent.com/dvyukov/59fa9ef0f8e1f780c75a2f561b1efd24/raw/91e1f60c23992e6985fc155c2cfb081a30da7662/gistfile1.txt
This makes debugging hard, and stack traces are also required for
proper bug bucketing by syzkaller.

5. Once we have proper stack traces, we will need to extend the syzkaller
test case base to include samples of risc-v crashes:
https://github.com/google/syzkaller/tree/master/pkg/report/testdata/linux/report
and the crash parsing code to properly understand and bucket these crashes:
https://github.com/google/syzkaller/blob/master/pkg/report/linux.go#L914-L1685

6. I observed lots of what looks like user-space process memory
corruption. This included thousands of panics in our Go programs
with things that I would consider "impossible"; at least they did not
come up before in our syzbot fuzzing. There were also some Go runtime
"impossible" crashes, e.g.:
https://gist.githubusercontent.com/dvyukov/fb489ed93f7180621c71714ee07e53dc/raw/a7d2e98a56da17af2aec79c164cd3a8e154ecf5c/gistfile1.txt
Maybe it's a known issue? Should we use tip instead of 1.14? Is it more stable?
Though it's not necessarily Go, because the kernel contains hundreds of
memory corruptions and we have observed the kernel corrupting user-space
processes routinely. This is especially true without KASAN because kernel
corruptions are not caught early. However, the ratio and nature of the
crashes make me suspect some issue in the Go risc-v runtime.

Thanks

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

* Re: syzkaller on risc-v
  2020-06-30 12:48 syzkaller on risc-v Dmitry Vyukov
@ 2020-06-30 12:57 ` Andreas Schwab
  2020-06-30 13:26   ` Dmitry Vyukov
  2020-06-30 13:03 ` Andreas Schwab
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 29+ messages in thread
From: Andreas Schwab @ 2020-06-30 12:57 UTC
  To: Dmitry Vyukov
  Cc: Albert Ou, Björn Töpel, syzkaller, Palmer Dabbelt,
	Paul Walmsley, Tobias Klauser, linux-riscv

On Jun 30 2020, Dmitry Vyukov wrote:

> KASAN would be a prerequisite for testing risc-v on syzbot.

You need to implement the GCC support first.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

* Re: syzkaller on risc-v
  2020-06-30 12:48 syzkaller on risc-v Dmitry Vyukov
  2020-06-30 12:57 ` Andreas Schwab
@ 2020-06-30 13:03 ` Andreas Schwab
  2020-06-30 13:26   ` David Abdurachmanov
  2020-06-30 13:07 ` Andreas Schwab
  2020-06-30 15:10 ` Tobias Klauser
  3 siblings, 1 reply; 29+ messages in thread
From: Andreas Schwab @ 2020-06-30 13:03 UTC
  To: Dmitry Vyukov
  Cc: Albert Ou, Björn Töpel, syzkaller, Palmer Dabbelt,
	Paul Walmsley, Tobias Klauser, linux-riscv

On Jun 30 2020, Dmitry Vyukov wrote:

> I would assume some stack overflows can happen without KCOV as well.

Yes, I see stack overflows quite a lot, like this:

[62192.908680] Kernel panic - not syncing: corrupted stack end detected inside scheduler
[62192.915752] CPU: 0 PID: 12347 Comm: ld Not tainted 5.7.5-221-default #1 openSUSE Tumbleweed (unreleased)
[62192.925204] Call Trace:
[62192.927646] [<ffffffe0002028ae>] walk_stackframe+0x0/0xaa
[62192.933030] [<ffffffe000202b76>] show_stack+0x2a/0x34
[62192.938066] [<ffffffe000557d44>] dump_stack+0x6e/0x88
[62192.943098] [<ffffffe00020c2d2>] panic+0xe8/0x26a
[62192.947785] [<ffffffe00085ab9c>] schedule+0x0/0xb2
[62192.952561] [<ffffffe00085af36>] _cond_resched+0x32/0x44
[62192.957859] [<ffffffe0002f18ea>] invalidate_mapping_pages+0xe0/0x1ce
[62192.964193] [<ffffffe000370aa4>] inode_lru_isolate+0x238/0x298
[62192.970012] [<ffffffe000308098>] __list_lru_walk_one+0x5e/0xf6
[62192.975826] [<ffffffe000308516>] list_lru_walk_one+0x42/0x98
[62192.981470] [<ffffffe0003717e8>] prune_icache_sb+0x32/0x72
[62192.986941] [<ffffffe000358366>] super_cache_scan+0xe4/0x13e
[62192.992586] [<ffffffe0002f1fac>] do_shrink_slab+0x10e/0x17e
[62192.998142] [<ffffffe0002f2126>] shrink_slab_memcg+0x10a/0x1de
[62193.003957] [<ffffffe0002f5314>] shrink_node_memcgs+0x12e/0x1a4
[62193.009861] [<ffffffe0002f5484>] shrink_node+0xfa/0x43c
[62193.015067] [<ffffffe0002f583e>] shrink_zones+0x78/0x18c
[62193.020365] [<ffffffe0002f59f0>] do_try_to_free_pages+0x9e/0x23e
[62193.026352] [<ffffffe0002f65ac>] try_to_free_pages+0xb2/0xf4
[62193.031991] [<ffffffe000322952>] __alloc_pages_slowpath.constprop.0+0x2d0/0x6c2
[62193.039284] [<ffffffe000322e9a>] __alloc_pages_nodemask+0x156/0x1b2
[62193.045535] [<ffffffe00030c730>] do_anonymous_page+0x58/0x41c
[62193.051266] [<ffffffe00030f50e>] handle_pte_fault+0x12e/0x156
[62193.056994] [<ffffffe000310444>] __handle_mm_fault+0xca/0x118
[62193.062725] [<ffffffe000310532>] handle_mm_fault+0xa0/0x152
[62193.068278] [<ffffffe0002055ba>] do_page_fault+0xd6/0x370
[62193.073666] [<ffffffe00020140a>] ret_from_exception+0x0/0xc
[62193.079222] [<ffffffe0004fc16a>] copy_page_to_iter_iovec+0x4c/0x154

or this:

[200460.114397] Kernel panic - not syncing: corrupted stack end detected inside scheduler
[200460.121553] CPU: 0 PID: 32619 Comm: sh Not tainted 5.7.5-221-default #1 openSUSE Tumbleweed (unreleased)
[200460.131090] Call Trace:
[200460.133623] [<ffffffe0002028ae>] walk_stackframe+0x0/0xaa
[200460.139091] [<ffffffe000202b76>] show_stack+0x2a/0x34
[200460.144212] [<ffffffe000557d44>] dump_stack+0x6e/0x88
[200460.149335] [<ffffffe00020c2d2>] panic+0xe8/0x26a
[200460.154109] [<ffffffe00085ab9c>] schedule+0x0/0xb2
[200460.158969] [<ffffffe00085af36>] _cond_resched+0x32/0x44
[200460.164348] [<ffffffe000498572>] aa_sk_perm+0x38/0x138
[200460.169559] [<ffffffe00048d4b4>] apparmor_socket_sendmsg+0x18/0x20
[200460.175817] [<ffffffe0004508e0>] security_socket_sendmsg+0x2a/0x42
[200460.182061] [<ffffffe0006f4c0a>] sock_sendmsg+0x1a/0x40
[200460.195979] [<ffffffdf817210cc>] xprt_sock_sendmsg+0xb2/0x2b6 [sunrpc]
[200460.210450] [<ffffffdf81723bde>] xs_tcp_send_request+0xc6/0x206 [sunrpc]
[200460.224930] [<ffffffdf8171f538>] xprt_request_transmit.constprop.0+0x88/0x218 [sunrpc]
[200460.240731] [<ffffffdf81720610>] xprt_transmit+0x9a/0x182 [sunrpc]
[200460.254858] [<ffffffdf8171a584>] call_transmit+0x68/0xb8 [sunrpc]
[200460.268817] [<ffffffdf81726660>] __rpc_execute+0x84/0x222 [sunrpc]
[200460.282787] [<ffffffdf81726cea>] rpc_execute+0xac/0xb8 [sunrpc]
[200460.296493] [<ffffffdf8171c5ca>] rpc_run_task+0x122/0x178 [sunrpc]
[200460.314422] [<ffffffdf82e1533a>] nfs4_do_call_sync+0x64/0x84 [nfsv4]
[200460.332514] [<ffffffdf82e1541c>] _nfs4_proc_getattr+0xc2/0xd4 [nfsv4]
[200460.350813] [<ffffffdf82e1cafc>] nfs4_proc_getattr+0x48/0x72 [nfsv4]
[200460.363307] [<ffffffdf8292c1f6>] __nfs_revalidate_inode+0x104/0x2c8 [nfs]
[200460.376204] [<ffffffdf82926d18>] nfs_access_get_cached+0x104/0x212 [nfs]
[200460.389112] [<ffffffdf82926f20>] nfs_do_access+0xfa/0x178 [nfs]
[200460.401176] [<ffffffdf82927070>] nfs_permission+0x8e/0x184 [nfs]
[200460.406497] [<ffffffe000361936>] inode_permission.part.0+0x78/0x118
[200460.412838] [<ffffffe0003638ea>] link_path_walk.part.0+0x1bc/0x212
[200460.419086] [<ffffffe000363c7e>] path_lookupat+0x34/0x172
[200460.424559] [<ffffffe0003653de>] filename_lookup+0x5c/0xf4
[200460.430114] [<ffffffe00036551e>] user_path_at_empty+0x3a/0x5e
[200460.435931] [<ffffffe00035b838>] vfs_statx+0x62/0xbc
[200460.440966] [<ffffffe00035b92a>] __do_sys_newfstatat+0x24/0x3a
[200460.446870] [<ffffffe00035bafa>] sys_newfstatat+0x10/0x18
[200460.452339] [<ffffffe0002013fc>] ret_from_syscall+0x0/0x2
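
For context, this panic message appears to come from the scheduler's
stack-end check (CONFIG_SCHED_STACK_END_CHECK), which compares a magic
word stored at the lowest address of the task stack against
STACK_END_MAGIC. The user-space sketch below only mimics that idea:
struct fake_stack and both helper functions are hypothetical names, and
the 8 KiB size assumes THREAD_SIZE_ORDER 1 with 4 KiB pages.

```c
#include <string.h>

#define STACK_END_MAGIC  0x57AC6E9DUL  /* value from include/uapi/linux/magic.h */
#define FAKE_STACK_SIZE  8192          /* assumption: 2 pages of 4 KiB */

/* Hypothetical stand-in for a task's kernel stack (grows downwards). */
struct fake_stack {
	unsigned char mem[FAKE_STACK_SIZE];
};

/* Plant the canary at the lowest address, i.e. the end of a downward stack. */
static void fake_stack_init(struct fake_stack *s)
{
	unsigned long magic = STACK_END_MAGIC;

	memset(s->mem, 0, sizeof(s->mem));
	memcpy(s->mem, &magic, sizeof(magic));
}

/* Returns 1 if the canary was overwritten, i.e. the stack overflowed. */
static int fake_stack_end_corrupted(const struct fake_stack *s)
{
	unsigned long magic;

	memcpy(&magic, s->mem, sizeof(magic));
	return magic != STACK_END_MAGIC;
}
```

A check of this kind only fires when the scheduler next runs, so the
trace shows where the corruption was noticed, which is not necessarily
the deepest call chain that caused it.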

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

* Re: syzkaller on risc-v
  2020-06-30 12:48 syzkaller on risc-v Dmitry Vyukov
  2020-06-30 12:57 ` Andreas Schwab
  2020-06-30 13:03 ` Andreas Schwab
@ 2020-06-30 13:07 ` Andreas Schwab
  2020-06-30 13:20   ` David Abdurachmanov
  2020-06-30 15:10 ` Tobias Klauser
  3 siblings, 1 reply; 29+ messages in thread
From: Andreas Schwab @ 2020-06-30 13:07 UTC
  To: Dmitry Vyukov
  Cc: Albert Ou, Björn Töpel, syzkaller, Palmer Dabbelt,
	Paul Walmsley, Tobias Klauser, linux-riscv

On Jun 30 2020, Dmitry Vyukov wrote:

> Maybe it's a known issue? Should we use tip instead of 1.14? Is it more stable?

Go is still broken, it doesn't even bootstrap.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

* Re: syzkaller on risc-v
  2020-06-30 13:07 ` Andreas Schwab
@ 2020-06-30 13:20   ` David Abdurachmanov
  2020-06-30 13:23     ` Dmitry Vyukov
  2020-06-30 13:30     ` Andreas Schwab
  0 siblings, 2 replies; 29+ messages in thread
From: David Abdurachmanov @ 2020-06-30 13:20 UTC
  To: Andreas Schwab
  Cc: Albert Ou, Björn Töpel, syzkaller, Palmer Dabbelt,
	Paul Walmsley, Tobias Klauser, linux-riscv, Dmitry Vyukov

On Tue, Jun 30, 2020 at 4:07 PM Andreas Schwab <schwab@suse.de> wrote:
>
> On Jun 30 2020, Dmitry Vyukov wrote:
>
> > Maybe it's a known issue? Should we use tip instead of 1.14? Is it more stable?
>
> Go is still broken, it doesn't even bootstrap.

Could you elaborate? I have Golang available in Fedora/RISCV and the
bootstrap process was fine (after the TLB bug fix was applied in
OpenSBI). The current stable release is missing cgo and buildmode pie
support, but otherwise seems to work fine. I was told that cgo and
buildmode pie are supported now, but I'm not sure if that will land in
the next release.

david

* Re: syzkaller on risc-v
  2020-06-30 13:20   ` David Abdurachmanov
@ 2020-06-30 13:23     ` Dmitry Vyukov
  2020-06-30 13:30     ` Andreas Schwab
  1 sibling, 0 replies; 29+ messages in thread
From: Dmitry Vyukov @ 2020-06-30 13:23 UTC
  To: David Abdurachmanov
  Cc: Albert Ou, Andreas Schwab, syzkaller, Palmer Dabbelt,
	Paul Walmsley, Tobias Klauser, linux-riscv,
	Björn Töpel

On Tue, Jun 30, 2020 at 3:21 PM David Abdurachmanov
<david.abdurachmanov@gmail.com> wrote:
>
> On Tue, Jun 30, 2020 at 4:07 PM Andreas Schwab <schwab@suse.de> wrote:
> >
> > On Jun 30 2020, Dmitry Vyukov wrote:
> >
> > > Maybe it's a known issue? Should we use tip instead of 1.14? Is it more stable?
> >
> > Go is still broken, it doesn't even bootstrap.
>
> Could you elaborate? I have Golang available in Fedora/RISCV and the
> bootstrap process was fine (after the TLB bug fix was applied in
> OpenSBI). The current stable release is missing cgo and buildmode pie
> support, but otherwise seems to work fine. I was told that cgo and
> buildmode pie is supported now, but not sure if that will land in the
> next release.

FWIW I took stock Go 1.14 and the whole large system mostly worked
without any problems (except for these episodic crashes).

* Re: syzkaller on risc-v
  2020-06-30 13:03 ` Andreas Schwab
@ 2020-06-30 13:26   ` David Abdurachmanov
  2020-06-30 13:37     ` Colin Ian King
  0 siblings, 1 reply; 29+ messages in thread
From: David Abdurachmanov @ 2020-06-30 13:26 UTC
  To: Andreas Schwab
  Cc: Albert Ou, Björn Töpel, syzkaller, Palmer Dabbelt,
	Paul Walmsley, Colin Ian King, Tobias Klauser, linux-riscv,
	Dmitry Vyukov

On Tue, Jun 30, 2020 at 4:04 PM Andreas Schwab <schwab@suse.de> wrote:
>
> On Jun 30 2020, Dmitry Vyukov wrote:
>
> > I would assume some stack overflows can happen without KCOV as well.
>
> Yes, I see stack overflows quite a lot, like this:
>
> [62192.908680] Kernel panic - not syncing: corrupted stack end detected inside scheduler
> [62192.915752] CPU: 0 PID: 12347 Comm: ld Not tainted 5.7.5-221-default #1 openSUSE Tumbleweed (unreleased)
> [62192.925204] Call Trace:
> [62192.927646] [<ffffffe0002028ae>] walk_stackframe+0x0/0xaa
> [62192.933030] [<ffffffe000202b76>] show_stack+0x2a/0x34
> [62192.938066] [<ffffffe000557d44>] dump_stack+0x6e/0x88
> [62192.943098] [<ffffffe00020c2d2>] panic+0xe8/0x26a
> [62192.947785] [<ffffffe00085ab9c>] schedule+0x0/0xb2
> [62192.952561] [<ffffffe00085af36>] _cond_resched+0x32/0x44
> [62192.957859] [<ffffffe0002f18ea>] invalidate_mapping_pages+0xe0/0x1ce
> [62192.964193] [<ffffffe000370aa4>] inode_lru_isolate+0x238/0x298
> [62192.970012] [<ffffffe000308098>] __list_lru_walk_one+0x5e/0xf6
> [62192.975826] [<ffffffe000308516>] list_lru_walk_one+0x42/0x98
> [62192.981470] [<ffffffe0003717e8>] prune_icache_sb+0x32/0x72
> [62192.986941] [<ffffffe000358366>] super_cache_scan+0xe4/0x13e
> [62192.992586] [<ffffffe0002f1fac>] do_shrink_slab+0x10e/0x17e
> [62192.998142] [<ffffffe0002f2126>] shrink_slab_memcg+0x10a/0x1de
> [62193.003957] [<ffffffe0002f5314>] shrink_node_memcgs+0x12e/0x1a4
> [62193.009861] [<ffffffe0002f5484>] shrink_node+0xfa/0x43c
> [62193.015067] [<ffffffe0002f583e>] shrink_zones+0x78/0x18c
> [62193.020365] [<ffffffe0002f59f0>] do_try_to_free_pages+0x9e/0x23e
> [62193.026352] [<ffffffe0002f65ac>] try_to_free_pages+0xb2/0xf4
> [62193.031991] [<ffffffe000322952>] __alloc_pages_slowpath.constprop.0+0x2d0/0x6c2
> [62193.039284] [<ffffffe000322e9a>] __alloc_pages_nodemask+0x156/0x1b2
> [62193.045535] [<ffffffe00030c730>] do_anonymous_page+0x58/0x41c
> [62193.051266] [<ffffffe00030f50e>] handle_pte_fault+0x12e/0x156
> [62193.056994] [<ffffffe000310444>] __handle_mm_fault+0xca/0x118
> [62193.062725] [<ffffffe000310532>] handle_mm_fault+0xa0/0x152
> [62193.068278] [<ffffffe0002055ba>] do_page_fault+0xd6/0x370
> [62193.073666] [<ffffffe00020140a>] ret_from_exception+0x0/0xc
> [62193.079222] [<ffffffe0004fc16a>] copy_page_to_iter_iovec+0x4c/0x154

There was a report from Canonical that enabling gcov causes similar issues.

linux: riscv: corrupted stack detected inside scheduler
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1877954

Adding Colin to CC. So far we couldn't reproduce this locally, I
guess, because we don't have the right config.

david


* Re: syzkaller on risc-v
  2020-06-30 12:57 ` Andreas Schwab
@ 2020-06-30 13:26   ` Dmitry Vyukov
  2020-06-30 13:33     ` Andreas Schwab
  2020-07-01 10:42     ` Björn Töpel
  0 siblings, 2 replies; 29+ messages in thread
From: Dmitry Vyukov @ 2020-06-30 13:26 UTC
  To: Andreas Schwab
  Cc: Albert Ou, Björn Töpel, syzkaller, Palmer Dabbelt,
	Paul Walmsley, Tobias Klauser, linux-riscv

On Tue, Jun 30, 2020 at 3:14 PM Andreas Schwab <schwab@suse.de> wrote:
>
> On Jun 30 2020, Dmitry Vyukov wrote:
>
> > KASAN would be a prerequisite for testing risc-v on syzbot.
>
> You need to implement the GCC support first.

Interesting. Björn claimed KASAN works already.  And there is:

commit 8ad8b72721d0f07fa02dbe71f901743f9c71c8e6
Author: Nick Hu
Date:   Mon Jan 6 10:38:32 2020 -0800
    riscv: Add KASAN support

Is there any known issue with gcc?
Did anyone try clang? The AddressSanitizer pass in clang is
arch-independent. I'm not sure about gcc, though it looked mostly
arch-independent.

* Re: syzkaller on risc-v
  2020-06-30 13:20   ` David Abdurachmanov
  2020-06-30 13:23     ` Dmitry Vyukov
@ 2020-06-30 13:30     ` Andreas Schwab
  2020-06-30 13:35       ` David Abdurachmanov
  1 sibling, 1 reply; 29+ messages in thread
From: Andreas Schwab @ 2020-06-30 13:30 UTC
  To: David Abdurachmanov
  Cc: Albert Ou, Björn Töpel, syzkaller, Palmer Dabbelt,
	Paul Walmsley, Tobias Klauser, linux-riscv, Dmitry Vyukov

On Jun 30 2020, David Abdurachmanov wrote:

> On Tue, Jun 30, 2020 at 4:07 PM Andreas Schwab <schwab@suse.de> wrote:
>>
>> On Jun 30 2020, Dmitry Vyukov wrote:
>>
>> > Maybe it's a known issue? Should we use tip instead of 1.14? Is it more stable?
>>
>> Go is still broken, it doesn't even bootstrap.
>
> Could you elaborate? 

https://build.opensuse.org/package/live_build_log/home:Andreas_Schwab:riscv:go/go1.14/r/riscv64

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

* Re: syzkaller on risc-v
  2020-06-30 13:26   ` Dmitry Vyukov
@ 2020-06-30 13:33     ` Andreas Schwab
  2020-06-30 13:40       ` Dmitry Vyukov
  2020-07-01 10:42     ` Björn Töpel
  1 sibling, 1 reply; 29+ messages in thread
From: Andreas Schwab @ 2020-06-30 13:33 UTC
  To: Dmitry Vyukov
  Cc: Albert Ou, Björn Töpel, syzkaller, Palmer Dabbelt,
	Paul Walmsley, Tobias Klauser, linux-riscv

On Jun 30 2020, Dmitry Vyukov wrote:

> On Tue, Jun 30, 2020 at 3:14 PM Andreas Schwab <schwab@suse.de> wrote:
>>
>> On Jun 30 2020, Dmitry Vyukov wrote:
>>
>> > KASAN would be a prerequisite for testing risc-v on syzbot.
>>
>> You need to implement the GCC support first.
>
> Interesting. Björn claimed KASAN works already.  And there is:
>
> commit 8ad8b72721d0f07fa02dbe71f901743f9c71c8e6
> Author: Nick Hu
> Date:   Mon Jan 6 10:38:32 2020 -0800
>     riscv: Add KASAN support
>
> Is there any known issue with gcc?

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91441

$ gcc -fsanitize=kernel-address -xc - </dev/null
cc1: warning: ‘-fsanitize=address’ and ‘-fsanitize=kernel-address’ are not supported for this target
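
A related check can be done from inside a translation unit. This is a
hedged sketch, assuming that GCC defines __SANITIZE_ADDRESS__ only when
address instrumentation is actually enabled and that clang exposes the
same information through __has_feature(address_sanitizer); the function
name asan_instrumented is made up for illustration.

```c
/* clang reports ASan via __has_feature; fall back to 0 where it is missing. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

/*
 * Hypothetical probe: returns 1 if this file was compiled with
 * AddressSanitizer instrumentation enabled, 0 otherwise.
 */
static int asan_instrumented(void)
{
#if defined(__SANITIZE_ADDRESS__) || __has_feature(address_sanitizer)
	return 1;
#else
	return 0;
#endif
}
```

On a toolchain that prints the warning above, one would expect this to
return 0 even when -fsanitize=kernel-address is passed, although that
has not been verified on risc-v here.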

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

* Re: syzkaller on risc-v
  2020-06-30 13:30     ` Andreas Schwab
@ 2020-06-30 13:35       ` David Abdurachmanov
  2020-06-30 13:43         ` Andreas Schwab
  0 siblings, 1 reply; 29+ messages in thread
From: David Abdurachmanov @ 2020-06-30 13:35 UTC
  To: Andreas Schwab
  Cc: Albert Ou, Björn Töpel, syzkaller, Palmer Dabbelt,
	Paul Walmsley, Tobias Klauser, linux-riscv, Dmitry Vyukov

On Tue, Jun 30, 2020 at 4:30 PM Andreas Schwab <schwab@suse.de> wrote:
>
> On Jun 30 2020, David Abdurachmanov wrote:
>
> > On Tue, Jun 30, 2020 at 4:07 PM Andreas Schwab <schwab@suse.de> wrote:
> >>
> >> On Jun 30 2020, Dmitry Vyukov wrote:
> >>
> >> > Maybe it's a known issue? Should we use tip instead of 1.14? Is it more stable?
> >>
> >> Go is still broken, it doesn't even bootstrap.
> >
> > Could you elaborate?
>
> https://build.opensuse.org/package/live_build_log/home:Andreas_Schwab:riscv:go/go1.14/r/riscv64

You seem to bootstrap with gccgo, which will not work (tried it). You
need to cross-compile golang and use that to bootstrap.

david

* Re: syzkaller on risc-v
  2020-06-30 13:26   ` David Abdurachmanov
@ 2020-06-30 13:37     ` Colin Ian King
  2020-06-30 13:57       ` David Abdurachmanov
  0 siblings, 1 reply; 29+ messages in thread
From: Colin Ian King @ 2020-06-30 13:37 UTC
  To: David Abdurachmanov, Andreas Schwab
  Cc: Albert Ou, Björn Töpel, syzkaller, Palmer Dabbelt,
	Paul Walmsley, Tobias Klauser, linux-riscv, Dmitry Vyukov

I believe I'm also seeing some potential stack smashing issues in the
Lua engine in ZFS on risc-v. It is taking me a while to debug, but I
don't see the failure on other arches.  Is there a way to bump the stack
size up temporarily to test with larger stacks on risc-v?

Colin



_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv


* Re: syzkaller on risc-v
  2020-06-30 13:33     ` Andreas Schwab
@ 2020-06-30 13:40       ` Dmitry Vyukov
  2020-06-30 13:45         ` Andreas Schwab
  0 siblings, 1 reply; 29+ messages in thread
From: Dmitry Vyukov @ 2020-06-30 13:40 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Albert Ou, Björn Töpel, syzkaller, Palmer Dabbelt,
	Paul Walmsley, Tobias Klauser, linux-riscv

On Tue, Jun 30, 2020 at 3:33 PM Andreas Schwab <schwab@suse.de> wrote:
>
> On Jun 30 2020, Dmitry Vyukov wrote:
>
> > On Tue, Jun 30, 2020 at 3:14 PM Andreas Schwab <schwab@suse.de> wrote:
> >>
> >> On Jun 30 2020, Dmitry Vyukov wrote:
> >>
> >> > KASAN would be a prerequisite for testing risc-v on syzbot.
> >>
> >> You need to implement the GCC support first.
> >
> > Interesting. Björn claimed KASAN works already.  And there is:
> >
> > commit 8ad8b72721d0f07fa02dbe71f901743f9c71c8e6
> > Author: Nick Hu
> > Date:   Mon Jan 6 10:38:32 2020 -0800
> >     riscv: Add KASAN support
> >
> > Is there any known issue with gcc?
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91441
>
> $ gcc -fsanitize=kernel-address -xc - </dev/null
> cc1: warning: ‘-fsanitize=address’ and ‘-fsanitize=kernel-address’ are not supported for this target

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91441#c3
Fixed ;)


* Re: syzkaller on risc-v
  2020-06-30 13:35       ` David Abdurachmanov
@ 2020-06-30 13:43         ` Andreas Schwab
  2020-07-02 22:00           ` Aurelien Jarno
  0 siblings, 1 reply; 29+ messages in thread
From: Andreas Schwab @ 2020-06-30 13:43 UTC (permalink / raw)
  To: David Abdurachmanov
  Cc: Albert Ou, Björn Töpel, syzkaller, Palmer Dabbelt,
	Paul Walmsley, Tobias Klauser, linux-riscv, Dmitry Vyukov

On Jun 30 2020, David Abdurachmanov wrote:

> You seem to bootstrap with gccgo, which will not work (tried it).

I consider that broken.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


* Re: syzkaller on risc-v
  2020-06-30 13:40       ` Dmitry Vyukov
@ 2020-06-30 13:45         ` Andreas Schwab
  2020-06-30 13:49           ` Dmitry Vyukov
  0 siblings, 1 reply; 29+ messages in thread
From: Andreas Schwab @ 2020-06-30 13:45 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Albert Ou, Björn Töpel, syzkaller, Palmer Dabbelt,
	Paul Walmsley, Tobias Klauser, linux-riscv

On Jun 30 2020, Dmitry Vyukov wrote:

> On Tue, Jun 30, 2020 at 3:33 PM Andreas Schwab <schwab@suse.de> wrote:
>>
>> On Jun 30 2020, Dmitry Vyukov wrote:
>>
>> > On Tue, Jun 30, 2020 at 3:14 PM Andreas Schwab <schwab@suse.de> wrote:
>> >>
>> >> On Jun 30 2020, Dmitry Vyukov wrote:
>> >>
>> >> > KASAN would be a prerequisite for testing risc-v on syzbot.
>> >>
>> >> You need to implement the GCC support first.
>> >
>> > Interesting. Björn claimed KASAN works already.  And there is:
>> >
>> > commit 8ad8b72721d0f07fa02dbe71f901743f9c71c8e6
>> > Author: Nick Hu
>> > Date:   Mon Jan 6 10:38:32 2020 -0800
>> >     riscv: Add KASAN support
>> >
>> > Is there any known issue with gcc?
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91441
>>
>> $ gcc -fsanitize=kernel-address -xc - </dev/null
>> cc1: warning: ‘-fsanitize=address’ and ‘-fsanitize=kernel-address’ are not supported for this target
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91441#c3
> Fixed ;)

Yes.  What's your point?

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


* Re: syzkaller on risc-v
  2020-06-30 13:45         ` Andreas Schwab
@ 2020-06-30 13:49           ` Dmitry Vyukov
  2020-06-30 13:52             ` Andreas Schwab
  0 siblings, 1 reply; 29+ messages in thread
From: Dmitry Vyukov @ 2020-06-30 13:49 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Albert Ou, Björn Töpel, syzkaller, Palmer Dabbelt,
	Paul Walmsley, Tobias Klauser, linux-riscv

On Tue, Jun 30, 2020 at 3:45 PM Andreas Schwab <schwab@suse.de> wrote:
> >> >> On Jun 30 2020, Dmitry Vyukov wrote:
> >> >>
> >> >> > KASAN would be a prerequisite for testing risc-v on syzbot.
> >> >>
> >> >> You need to implement the GCC support first.
> >> >
> >> > Interesting. Björn claimed KASAN works already.  And there is:
> >> >
> >> > commit 8ad8b72721d0f07fa02dbe71f901743f9c71c8e6
> >> > Author: Nick Hu
> >> > Date:   Mon Jan 6 10:38:32 2020 -0800
> >> >     riscv: Add KASAN support
> >> >
> >> > Is there any known issue with gcc?
> >>
> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91441
> >>
> >> $ gcc -fsanitize=kernel-address -xc - </dev/null
> >> cc1: warning: ‘-fsanitize=address’ and ‘-fsanitize=kernel-address’ are not supported for this target
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91441#c3
> > Fixed ;)
>
> Yes.  What's your point?

My point is that the GCC part seems to be implemented already, so anyone
who wants to use KASAN does not need to implement GCC support first.
It looks like something on the kernel side has broken since January.


* Re: syzkaller on risc-v
  2020-06-30 13:49           ` Dmitry Vyukov
@ 2020-06-30 13:52             ` Andreas Schwab
  0 siblings, 0 replies; 29+ messages in thread
From: Andreas Schwab @ 2020-06-30 13:52 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Albert Ou, Björn Töpel, syzkaller, Palmer Dabbelt,
	Paul Walmsley, Tobias Klauser, linux-riscv

On Jun 30 2020, Dmitry Vyukov wrote:

> On Tue, Jun 30, 2020 at 3:45 PM Andreas Schwab <schwab@suse.de> wrote:
>> >> >> On Jun 30 2020, Dmitry Vyukov wrote:
>> >> >>
>> >> >> > KASAN would be a prerequisite for testing risc-v on syzbot.
>> >> >>
>> >> >> You need to implement the GCC support first.
>> >> >
>> >> > Interesting. Björn claimed KASAN works already.  And there is:
>> >> >
>> >> > commit 8ad8b72721d0f07fa02dbe71f901743f9c71c8e6
>> >> > Author: Nick Hu
>> >> > Date:   Mon Jan 6 10:38:32 2020 -0800
>> >> >     riscv: Add KASAN support
>> >> >
>> >> > Is there any known issue with gcc?
>> >>
>> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91441
>> >>
>> >> $ gcc -fsanitize=kernel-address -xc - </dev/null
>> >> cc1: warning: ‘-fsanitize=address’ and ‘-fsanitize=kernel-address’ are not supported for this target
>> >
>> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91441#c3
>> > Fixed ;)
>>
>> Yes.  What's your point?
>
> My point is that the GCC part seems to be implemented already

??? No, of course not.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


* Re: syzkaller on risc-v
  2020-06-30 13:37     ` Colin Ian King
@ 2020-06-30 13:57       ` David Abdurachmanov
  2020-06-30 14:55         ` Andreas Schwab
  2020-07-06 10:12         ` Colin Ian King
  0 siblings, 2 replies; 29+ messages in thread
From: David Abdurachmanov @ 2020-06-30 13:57 UTC (permalink / raw)
  To: Colin Ian King
  Cc: Albert Ou, Andreas Schwab, Paul Walmsley, syzkaller,
	Palmer Dabbelt, Björn Töpel, Tobias Klauser,
	linux-riscv, Dmitry Vyukov

On Tue, Jun 30, 2020 at 4:38 PM Colin Ian King <colin.king@canonical.com> wrote:
>
> I believe I'm also seeing some potential stack smashing issues in the
> lua engine in ZFS on risc-v. It is taking a while for me to debug, but I
> don't see the failure on other arches.  Is there a way to bump the stack
> size up temporarily to test with larger stacks on risc-v?

Dmitry wrote in the original email that the following solves the stack
overflows seen with KCOV enabled:

--- a/arch/riscv/include/asm/thread_info.h
+++ b/arch/riscv/include/asm/thread_info.h
-#define THREAD_SIZE_ORDER      (1)
+#define THREAD_SIZE_ORDER      (2)

I see that MIPS has:

[..]
/* thread information allocation */
#if defined(CONFIG_PAGE_SIZE_4KB) && defined(CONFIG_32BIT)
#define THREAD_SIZE_ORDER (1)
#endif
#if defined(CONFIG_PAGE_SIZE_4KB) && defined(CONFIG_64BIT)
#define THREAD_SIZE_ORDER (2)
[..]
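
For reference, the effect of THREAD_SIZE_ORDER on the stack size can be
worked out with a small sketch. It assumes 4 KiB pages and that the stack
size is the page size shifted left by the order, as in the kernel's
THREAD_SIZE definition. The sketch_ and SKETCH_ names are made up for
illustration and are not kernel identifiers.

```c
#include <stddef.h>

/* Assumption: 4 KiB base pages, matching the CONFIG_PAGE_SIZE_4KB branches above. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Mirrors the relation THREAD_SIZE = PAGE_SIZE << THREAD_SIZE_ORDER.
 * Returns the kernel stack size in bytes for a given order.
 */
unsigned long sketch_thread_size(unsigned int order)
{
    return SKETCH_PAGE_SIZE << order;
}
```

Under these assumptions order 1 gives 8192 bytes (8 KiB) and order 2 gives
16384 bytes (16 KiB), so the one-line change doubles the per-task kernel
stack.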

david

>
> Colin
>
> On 30/06/2020 14:26, David Abdurachmanov wrote:
> > On Tue, Jun 30, 2020 at 4:04 PM Andreas Schwab <schwab@suse.de> wrote:
> >>
> >> On Jun 30 2020, Dmitry Vyukov wrote:
> >>
> >>> I would assume some stack overflows can happen without KCOV as well.
> >>
> >> Yes, I see stack overflows quite a lot, like this:
> >>
> >> [62192.908680] Kernel panic - not syncing: corrupted stack end detected inside scheduler
> >> [62192.915752] CPU: 0 PID: 12347 Comm: ld Not tainted 5.7.5-221-default #1 openSUSE Tumbleweed (unreleased)
> >> [62192.925204] Call Trace:
> >> [62192.927646] [<ffffffe0002028ae>] walk_stackframe+0x0/0xaa
> >> [62192.933030] [<ffffffe000202b76>] show_stack+0x2a/0x34
> >> [62192.938066] [<ffffffe000557d44>] dump_stack+0x6e/0x88
> >> [62192.943098] [<ffffffe00020c2d2>] panic+0xe8/0x26a
> >> [62192.947785] [<ffffffe00085ab9c>] schedule+0x0/0xb2
> >> [62192.952561] [<ffffffe00085af36>] _cond_resched+0x32/0x44
> >> [62192.957859] [<ffffffe0002f18ea>] invalidate_mapping_pages+0xe0/0x1ce
> >> [62192.964193] [<ffffffe000370aa4>] inode_lru_isolate+0x238/0x298
> >> [62192.970012] [<ffffffe000308098>] __list_lru_walk_one+0x5e/0xf6
> >> [62192.975826] [<ffffffe000308516>] list_lru_walk_one+0x42/0x98
> >> [62192.981470] [<ffffffe0003717e8>] prune_icache_sb+0x32/0x72
> >> [62192.986941] [<ffffffe000358366>] super_cache_scan+0xe4/0x13e
> >> [62192.992586] [<ffffffe0002f1fac>] do_shrink_slab+0x10e/0x17e
> >> [62192.998142] [<ffffffe0002f2126>] shrink_slab_memcg+0x10a/0x1de
> >> [62193.003957] [<ffffffe0002f5314>] shrink_node_memcgs+0x12e/0x1a4
> >> [62193.009861] [<ffffffe0002f5484>] shrink_node+0xfa/0x43c
> >> [62193.015067] [<ffffffe0002f583e>] shrink_zones+0x78/0x18c
> >> [62193.020365] [<ffffffe0002f59f0>] do_try_to_free_pages+0x9e/0x23e
> >> [62193.026352] [<ffffffe0002f65ac>] try_to_free_pages+0xb2/0xf4
> >> [62193.031991] [<ffffffe000322952>] __alloc_pages_slowpath.constprop.0+0x2d0/0x6c2
> >> [62193.039284] [<ffffffe000322e9a>] __alloc_pages_nodemask+0x156/0x1b2
> >> [62193.045535] [<ffffffe00030c730>] do_anonymous_page+0x58/0x41c
> >> [62193.051266] [<ffffffe00030f50e>] handle_pte_fault+0x12e/0x156
> >> [62193.056994] [<ffffffe000310444>] __handle_mm_fault+0xca/0x118
> >> [62193.062725] [<ffffffe000310532>] handle_mm_fault+0xa0/0x152
> >> [62193.068278] [<ffffffe0002055ba>] do_page_fault+0xd6/0x370
> >> [62193.073666] [<ffffffe00020140a>] ret_from_exception+0x0/0xc
> >> [62193.079222] [<ffffffe0004fc16a>] copy_page_to_iter_iovec+0x4c/0x154
> >
> > There was a report from Canonical that enabling gcov causes similar issues.
> >
> > linux: riscv: corrupted stack detected inside scheduler
> > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1877954
> >
> > Adding Colin to CC. So far we couldn't reproduce this locally, I
> > guess, because we don't have the right config.
> >
> > david
> >
> >
> >>
> >> or this:
> >>
> >> [200460.114397] Kernel panic - not syncing: corrupted stack end detected inside scheduler
> >> [200460.121553] CPU: 0 PID: 32619 Comm: sh Not tainted 5.7.5-221-default #1 openSUSE Tumbleweed (unreleased)
> >> [200460.131090] Call Trace:
> >> [200460.133623] [<ffffffe0002028ae>] walk_stackframe+0x0/0xaa
> >> [200460.139091] [<ffffffe000202b76>] show_stack+0x2a/0x34
> >> [200460.144212] [<ffffffe000557d44>] dump_stack+0x6e/0x88
> >> [200460.149335] [<ffffffe00020c2d2>] panic+0xe8/0x26a
> >> [200460.154109] [<ffffffe00085ab9c>] schedule+0x0/0xb2
> >> [200460.158969] [<ffffffe00085af36>] _cond_resched+0x32/0x44
> >> [200460.164348] [<ffffffe000498572>] aa_sk_perm+0x38/0x138
> >> [200460.169559] [<ffffffe00048d4b4>] apparmor_socket_sendmsg+0x18/0x20
> >> [200460.175817] [<ffffffe0004508e0>] security_socket_sendmsg+0x2a/0x42
> >> [200460.182061] [<ffffffe0006f4c0a>] sock_sendmsg+0x1a/0x40
> >> [200460.195979] [<ffffffdf817210cc>] xprt_sock_sendmsg+0xb2/0x2b6 [sunrpc]
> >> [200460.210450] [<ffffffdf81723bde>] xs_tcp_send_request+0xc6/0x206 [sunrpc]
> >> [200460.224930] [<ffffffdf8171f538>] xprt_request_transmit.constprop.0+0x88/0x218 [sunrpc]
> >> [200460.240731] [<ffffffdf81720610>] xprt_transmit+0x9a/0x182 [sunrpc]
> >> [200460.254858] [<ffffffdf8171a584>] call_transmit+0x68/0xb8 [sunrpc]
> >> [200460.268817] [<ffffffdf81726660>] __rpc_execute+0x84/0x222 [sunrpc]
> >> [200460.282787] [<ffffffdf81726cea>] rpc_execute+0xac/0xb8 [sunrpc]
> >> [200460.296493] [<ffffffdf8171c5ca>] rpc_run_task+0x122/0x178 [sunrpc]
> >> [200460.314422] [<ffffffdf82e1533a>] nfs4_do_call_sync+0x64/0x84 [nfsv4]
> >> [200460.332514] [<ffffffdf82e1541c>] _nfs4_proc_getattr+0xc2/0xd4 [nfsv4]
> >> [200460.350813] [<ffffffdf82e1cafc>] nfs4_proc_getattr+0x48/0x72 [nfsv4]
> >> [200460.363307] [<ffffffdf8292c1f6>] __nfs_revalidate_inode+0x104/0x2c8 [nfs]
> >> [200460.376204] [<ffffffdf82926d18>] nfs_access_get_cached+0x104/0x212 [nfs]
> >> [200460.389112] [<ffffffdf82926f20>] nfs_do_access+0xfa/0x178 [nfs]
> >> [200460.401176] [<ffffffdf82927070>] nfs_permission+0x8e/0x184 [nfs]
> >> [200460.406497] [<ffffffe000361936>] inode_permission.part.0+0x78/0x118
> >> [200460.412838] [<ffffffe0003638ea>] link_path_walk.part.0+0x1bc/0x212
> >> [200460.419086] [<ffffffe000363c7e>] path_lookupat+0x34/0x172
> >> [200460.424559] [<ffffffe0003653de>] filename_lookup+0x5c/0xf4
> >> [200460.430114] [<ffffffe00036551e>] user_path_at_empty+0x3a/0x5e
> >> [200460.435931] [<ffffffe00035b838>] vfs_statx+0x62/0xbc
> >> [200460.440966] [<ffffffe00035b92a>] __do_sys_newfstatat+0x24/0x3a
> >> [200460.446870] [<ffffffe00035bafa>] sys_newfstatat+0x10/0x18
> >> [200460.452339] [<ffffffe0002013fc>] ret_from_syscall+0x0/0x2
> >>
> >> Andreas.
> >>
> >> --
> >> Andreas Schwab, SUSE Labs, schwab@suse.de
> >> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
> >> "And now for something completely different."
> >>
>


* Re: syzkaller on risc-v
  2020-06-30 13:57       ` David Abdurachmanov
@ 2020-06-30 14:55         ` Andreas Schwab
  2020-07-06 10:12         ` Colin Ian King
  1 sibling, 0 replies; 29+ messages in thread
From: Andreas Schwab @ 2020-06-30 14:55 UTC (permalink / raw)
  To: David Abdurachmanov
  Cc: Albert Ou, Björn Töpel, syzkaller, Palmer Dabbelt,
	Paul Walmsley, Colin Ian King, Tobias Klauser, linux-riscv,
	Dmitry Vyukov

On Jun 30 2020, David Abdurachmanov wrote:

> On Tue, Jun 30, 2020 at 4:38 PM Colin Ian King <colin.king@canonical.com> wrote:
>>
>> I believe I'm also seeing some potential stack smashing issues in the
>> lua engine in ZFS on risc-v. It is taking a while for me to debug, but I
>> don't see the failure on other arches.  Is there a way to bump the stack
>> size up temporarily to test with larger stacks on risc-v?
>
> Dmitry wrote in the original email that the following solves issues with
> KCOV enabled:
>
> --- a/arch/riscv/include/asm/thread_info.h
> +++ b/arch/riscv/include/asm/thread_info.h
> -#define THREAD_SIZE_ORDER      (1)
> +#define THREAD_SIZE_ORDER      (2)

I think riscv should follow mips and use 16 KiB stacks for 64-bit.
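
Some background on why the stack size matters here: the "corrupted stack
end detected inside scheduler" panics quoted earlier in this thread are
raised when the scheduler finds that a marker word stored at the far end of
the task's kernel stack (STACK_END_MAGIC) has been overwritten. The sketch
below is a user-space simulation of that idea only, not kernel code. The
demo_ names, the 8 KiB size, and the fill pattern are assumptions chosen for
illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Same value as the kernel's STACK_END_MAGIC marker. */
#define STACK_END_MAGIC 0x57AC6E9DUL

/* Hypothetical stand-in for an 8 KiB kernel stack (4 KiB pages, order 1). */
#define DEMO_STACK_WORDS (8192 / sizeof(unsigned long))

struct demo_stack {
    unsigned long words[DEMO_STACK_WORDS]; /* words[0] is the lowest address */
};

/* The stack grows downwards, so the marker goes into the lowest word. */
void demo_stack_init(struct demo_stack *s)
{
    memset(s->words, 0, sizeof(s->words));
    s->words[0] = STACK_END_MAGIC;
}

/* Simulate a call chain that consumes depth_words words from the top down. */
void demo_stack_use(struct demo_stack *s, size_t depth_words)
{
    size_t i;

    for (i = 0; i < depth_words && i < DEMO_STACK_WORDS; i++)
        s->words[DEMO_STACK_WORDS - 1 - i] = 0xAAUL;
}

/* The check: has the marker at the end of the stack been clobbered? */
bool demo_stack_end_corrupted(const struct demo_stack *s)
{
    return s->words[0] != STACK_END_MAGIC;
}
```

In this simulation, using every word except the last one leaves the marker
intact, while using one word more overwrites it. A deeper call chain on a
small stack is what trips the check, which is why a larger stack makes the
panics go away.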

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


* Re: syzkaller on risc-v
  2020-06-30 12:48 syzkaller on risc-v Dmitry Vyukov
                   ` (2 preceding siblings ...)
  2020-06-30 13:07 ` Andreas Schwab
@ 2020-06-30 15:10 ` Tobias Klauser
  2020-07-01 10:03   ` Dmitry Vyukov
  3 siblings, 1 reply; 29+ messages in thread
From: Tobias Klauser @ 2020-06-30 15:10 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Albert Ou, Björn Töpel, syzkaller, Palmer Dabbelt,
	Paul Walmsley, linux-riscv

On 2020-06-30 at 14:48:31 +0200, Dmitry Vyukov <dvyukov@google.com> wrote:
[...]
> 6. I observed lots of what looks like user-space process memory
> corruptions. These included thousands of panics in our Go programs
> with things that I would consider "impossible", at least they did not
> come up before in our syzbot fuzzing. Also some Go runtime
> "impossible" crashes, e.g.:
> https://gist.githubusercontent.com/dvyukov/fb489ed93f7180621c71714ee07e53dc/raw/a7d2e98a56da17af2aec79c164cd3a8e154ecf5c/gistfile1.txt
> Maybe it's a known issue? Should we use tip instead of 1.14? Is it more stable?
> Though it's not necessarily Go, because the kernel contains hundreds of
> memory corruptions and we observed the kernel corrupting user-space
> processes routinely. This is especially true without KASAN because kernel
> corruptions are not caught early. However, the ratio and nature of
> crashes make me suspect some issue in the Go risc-v runtime.

I haven't seen any of these crashes myself when testing the syzkaller
port, but then again I only ran it for rather brief amounts of time
(~1h) on my laptop using the riscv defconfig and a few additional
configs enabled.

AFAIK Go tip has seen quite a few improvements to its RISC-V port, so it
might be worth giving it (or Go 1.15beta1) a try.


* Re: syzkaller on risc-v
  2020-06-30 15:10 ` Tobias Klauser
@ 2020-07-01 10:03   ` Dmitry Vyukov
  0 siblings, 0 replies; 29+ messages in thread
From: Dmitry Vyukov @ 2020-07-01 10:03 UTC (permalink / raw)
  To: Tobias Klauser
  Cc: Albert Ou, Björn Töpel, syzkaller, Palmer Dabbelt,
	Paul Walmsley, linux-riscv

On Tue, Jun 30, 2020 at 5:10 PM Tobias Klauser <tklauser@distanz.ch> wrote:
>
> On 2020-06-30 at 14:48:31 +0200, Dmitry Vyukov <dvyukov@google.com> wrote:
> [...]
> > 6. I observed lots of what looks like user-space process memory
> > corruptions. These included thousands of panics in our Go programs
> > with things that I would consider "impossible", at least they did not
> > come up before in our syzbot fuzzing. Also some Go runtime
> > "impossible" crashes, e.g.:
> > https://gist.githubusercontent.com/dvyukov/fb489ed93f7180621c71714ee07e53dc/raw/a7d2e98a56da17af2aec79c164cd3a8e154ecf5c/gistfile1.txt
> > Maybe it's a known issue? Should we use tip instead of 1.14? Is it more stable?
> > Though it's not necessarily Go, because the kernel contains hundreds of
> > memory corruptions and we observed the kernel corrupting user-space
> > processes routinely. This is especially true without KASAN because kernel
> > corruptions are not caught early. However, the ratio and nature of
> > crashes make me suspect some issue in the Go risc-v runtime.
>
> I haven't seen any of these crashes myself when testing the syzkaller
> port, but then again I only ran it for rather brief amounts of time
> (~1h) on my laptop using the riscv defconfig and a few additional
> configs enabled.
>
> AFAIK Go tip has seen quite a few improvements to its RISC-V port, so it
> might be worth giving it (or Go 1.15beta1) a try.


No luck. I tried:
go version devel +4b28f5ded3 Tue Jun 30 13:18:16 2020 +0000 linux/amd64
and the log is still full of these crashes, which we don't see on any other instance:

2020/07/01 11:48:09 vm-2: crash: panic: invalid argument to Intn
2020/07/01 11:48:09 vm-28: crash: panic: invalid argument to Intn
2020/07/01 11:48:10 vm-25: crash: panic: invalid argument to Intn
2020/07/01 11:48:11 vm-35: crash: panic: invalid argument to Intn
2020/07/01 11:48:15 VMs 13, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 391, repro 0
2020/07/01 11:48:16 vm-16: crash: panic: invalid argument to Intn
2020/07/01 11:48:25 vm-6: crash: panic: invalid argument to Intn
2020/07/01 11:48:25 VMs 11, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 393, repro 0
2020/07/01 11:48:29 vm-0: crash: panic: invalid argument to Intn
2020/07/01 11:48:35 VMs 14, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 394, repro 0
2020/07/01 11:48:45 VMs 17, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 394, repro 0
2020/07/01 11:48:55 VMs 19, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 394, repro 0
2020/07/01 11:49:05 VMs 22, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 394, repro 0
2020/07/01 11:49:15 VMs 32, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 394, repro 0
2020/07/01 11:49:25 VMs 33, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 394, repro 0
2020/07/01 11:49:35 VMs 33, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 394, repro 0
2020/07/01 11:49:44 vm-12: crash: panic: invalid argument to Intn
2020/07/01 11:49:45 VMs 35, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 395, repro 0
2020/07/01 11:49:49 vm-32: crash: panic: invalid argument to Intn
2020/07/01 11:49:51 vm-5: crash: panic: invalid argument to Intn
2020/07/01 11:49:52 vm-34: crash: panic: invalid argument to Intn
2020/07/01 11:49:52 vm-9: crash: panic: invalid argument to Intn
2020/07/01 11:49:54 vm-17: crash: panic: invalid argument to Intn
2020/07/01 11:49:55 VMs 32, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 400, repro 0
2020/07/01 11:49:59 vm-22: crash: panic: invalid argument to Intn
2020/07/01 11:50:05 VMs 33, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 401, repro 0
2020/07/01 11:50:15 VMs 33, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 401, repro 0
2020/07/01 11:50:25 VMs 33, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 401, repro 0
2020/07/01 11:50:30 vm-10: crash: panic: invalid argument to Intn
2020/07/01 11:50:35 VMs 32, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 402, repro 0
2020/07/01 11:50:45 VMs 33, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 402, repro 0
2020/07/01 11:50:50 vm-8: crash: panic: invalid argument to Intn
2020/07/01 11:50:54 vm-13: crash: panic: invalid argument to Intn
2020/07/01 11:50:54 vm-37: crash: panic: invalid argument to Intn
2020/07/01 11:50:55 VMs 30, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 405, repro 0
2020/07/01 11:51:05 VMs 30, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 405, repro 0
2020/07/01 11:51:15 VMs 30, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 405, repro 0
2020/07/01 11:51:25 VMs 35, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 405, repro 0
2020/07/01 11:51:26 vm-27: crash: panic: invalid argument to Intn
2020/07/01 11:51:29 vm-31: crash: panic: invalid argument to Intn
2020/07/01 11:51:35 VMs 34, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 407, repro 0
2020/07/01 11:51:36 vm-15: crash: panic: invalid argument to Intn
2020/07/01 11:51:36 vm-23: crash: panic: invalid argument to Intn
2020/07/01 11:51:40 vm-39: crash: panic: invalid argument to Intn
2020/07/01 11:51:42 vm-7: crash: panic: invalid argument to Intn
2020/07/01 11:51:45 vm-4: crash: panic: invalid argument to Intn
2020/07/01 11:51:45 VMs 29, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 412, repro 0
2020/07/01 11:51:52 vm-19: crash: panic: invalid argument to Intn
2020/07/01 11:51:54 vm-26: crash: panic: invalid argument to Intn
2020/07/01 11:51:55 VMs 27, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 414, repro 0
2020/07/01 11:52:03 vm-38: crash: panic: invalid argument to Intn
2020/07/01 11:52:03 vm-36: crash: panic: invalid argument to Intn
2020/07/01 11:52:05 VMs 26, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 416, repro 0
2020/07/01 11:52:07 vm-11: crash: panic: invalid argument to Intn
2020/07/01 11:52:12 vm-33: crash: panic: invalid argument to Intn
2020/07/01 11:52:13 vm-29: crash: panic: invalid argument to Intn
2020/07/01 11:52:15 vm-20: crash: panic: invalid argument to Intn
2020/07/01 11:52:15 VMs 22, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 420, repro 0
2020/07/01 11:52:16 vm-24: crash: panic: invalid argument to Intn
2020/07/01 11:52:17 vm-3: crash: panic: invalid argument to Intn
2020/07/01 11:52:17 vm-30: crash: panic: invalid argument to Intn
2020/07/01 11:52:18 vm-14: crash: panic: invalid argument to Intn
2020/07/01 11:52:20 vm-21: crash: panic: invalid argument to Intn
2020/07/01 11:52:20 vm-18: crash: panic: invalid argument to Intn
2020/07/01 11:52:22 vm-1: crash: panic: invalid argument to Intn
2020/07/01 11:52:25 VMs 18, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 427, repro 0
2020/07/01 11:52:35 VMs 18, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 427, repro 0
2020/07/01 11:52:43 vm-2: crash: panic: invalid argument to Intn
2020/07/01 11:52:44 vm-25: crash: panic: invalid argument to Intn
2020/07/01 11:52:44 vm-35: crash: panic: invalid argument to Intn
2020/07/01 11:52:45 VMs 15, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 430, repro 0
2020/07/01 11:52:46 vm-16: crash: panic: invalid argument to Intn
2020/07/01 11:52:47 vm-28: crash: panic: invalid argument to Intn
2020/07/01 11:52:51 vm-9: crash: panic: invalid argument to Intn
2020/07/01 11:52:54 vm-6: crash: panic: invalid argument to Intn
2020/07/01 11:52:55 VMs 12, executed 153462, corpus cover 79651,
corpus signal 174611, max signal 185505, crashes 434, repro 0
2020/07/01 11:53:00 vm-0: crash: panic: invalid argument to Intn


* Re: syzkaller on risc-v
  2020-06-30 13:26   ` Dmitry Vyukov
  2020-06-30 13:33     ` Andreas Schwab
@ 2020-07-01 10:42     ` Björn Töpel
  2020-07-01 10:43       ` Björn Töpel
  1 sibling, 1 reply; 29+ messages in thread
From: Björn Töpel @ 2020-07-01 10:42 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Albert Ou, Andreas Schwab, Atish Patra, syzkaller,
	Palmer Dabbelt, Paul Walmsley, Tobias Klauser, linux-riscv

On Tue, 30 Jun 2020 at 15:27, Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Tue, Jun 30, 2020 at 3:14 PM Andreas Schwab <schwab@suse.de> wrote:
> >
> > On Jun 30 2020, Dmitry Vyukov wrote:
> >
> > > KASAN would be a prerequisite for testing risc-v on syzbot.
> >
> > You need to implement the GCC support first.
>
> Interesting. Björn claimed KASAN works already.  And there is:
>
> commit 8ad8b72721d0f07fa02dbe71f901743f9c71c8e6
> Author: Nick Hu
> Date:   Mon Jan 6 10:38:32 2020 -0800
>     riscv: Add KASAN support
>
> Is there any known issue with gcc?
> Did anyone try clang? AddressSanitizer pass in clang is
> arch-independent. Not sure about gcc... it looked mostly
> arch-independent.

Weird. Did a quick bisect (just "does it boot with KASAN or not"
test), and this fell out:

--
efca13989250c3edebaf8fcaa8ca7c966739c65a is the first bad commit
commit efca13989250c3edebaf8fcaa8ca7c966739c65a
Author: Atish Patra <atish.patra@wdc.com>
Date:   Tue Mar 17 18:11:37 2020 -0700

    RISC-V: Introduce a new config for SBI v0.1

    We now have SBI v0.2 which is more scalable and extendable to handle
    future needs for RISC-V supervisor interfaces.

    Introduce a new config and move all SBI v0.1 code under that config.
    This allows to implement the new replacement SBI extensions cleanly
    and remove v0.1 extensions easily in future. Currently, the config
    is enabled by default. Once all M-mode software, with v0.1, is no
    longer in use, this config option and all relevant code can be easily
    removed.

    Signed-off-by: Atish Patra <atish.patra@wdc.com>
    Reviewed-by: Anup Patel <anup@brainfault.org>
    Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>

 arch/riscv/Kconfig           |   7 +++
 arch/riscv/include/asm/sbi.h |   2 +
 arch/riscv/kernel/sbi.c      | 132 +++++++++++++++++++++++++++++++++++--------
 3 files changed, 118 insertions(+), 23 deletions(-)
--

I'll dig a bit more.


Björn

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: syzkaller on risc-v
  2020-07-01 10:42     ` Björn Töpel
@ 2020-07-01 10:43       ` Björn Töpel
  2020-07-01 11:34         ` Dmitry Vyukov
  2020-07-01 13:52         ` Tobias Klauser
  0 siblings, 2 replies; 29+ messages in thread
From: Björn Töpel @ 2020-07-01 10:43 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Albert Ou, Andreas Schwab, Atish Patra, syzkaller,
	Palmer Dabbelt, Paul Walmsley, Tobias Klauser, linux-riscv

Oh, I forgot one thing: I'm booting the kernel with OpenSBI (OpenSBI
v0.8-3-gec3e5b14d52a) and not the Berkeley Boot Loader (BBL).

* Re: syzkaller on risc-v
  2020-07-01 10:43       ` Björn Töpel
@ 2020-07-01 11:34         ` Dmitry Vyukov
  2020-07-01 13:52         ` Tobias Klauser
  1 sibling, 0 replies; 29+ messages in thread
From: Dmitry Vyukov @ 2020-07-01 11:34 UTC (permalink / raw)
  To: Björn Töpel
  Cc: Albert Ou, Andreas Schwab, Atish Patra, syzkaller,
	Palmer Dabbelt, Paul Walmsley, Tobias Klauser, linux-riscv

On Wed, Jul 1, 2020 at 12:43 PM Björn Töpel <bjorn.topel@gmail.com> wrote:
> > I'll dig a bit more.

We may need more KASAN_SANITIZE := n in some Makefiles.

* Re: syzkaller on risc-v
  2020-07-01 10:43       ` Björn Töpel
  2020-07-01 11:34         ` Dmitry Vyukov
@ 2020-07-01 13:52         ` Tobias Klauser
  1 sibling, 0 replies; 29+ messages in thread
From: Tobias Klauser @ 2020-07-01 13:52 UTC (permalink / raw)
  To: Björn Töpel
  Cc: Albert Ou, Andreas Schwab, Atish Patra, syzkaller,
	Palmer Dabbelt, Paul Walmsley, linux-riscv, Dmitry Vyukov

On 2020-07-01 at 12:43:43 +0200, Björn Töpel <bjorn.topel@gmail.com> wrote:
> Oh, forgot one thing; I'm booting the kernel with OpenSBI (OpenSBI
> v0.8-3-gec3e5b14d52a) and not the Berkeley Boot Loader (BBL).

Thanks for the hint regarding OpenSBI. I just tried it for booting a
kernel built from the riscv for-next branch at commit a2693fe254e7
("RISC-V: Use a local variable instead of smp_processor_id()") with the
following two additional patches:

  https://lore.kernel.org/linux-riscv/20200626124056.29708-1-tklauser@distanz.ch/
  https://lore.kernel.org/linux-riscv/20200627105050.11088-1-tklauser@distanz.ch/

As soon as I enable KASAN (regardless of CONFIG_KCOV being set or not),
it seems to hang after the OpenSBI boot messages, same as when using BBL.

FWIW I sent an update to the syzkaller docs to use OpenSBI instead of
BBL, since that seems to be the recommended way to boot now:

  https://github.com/google/syzkaller/pull/1888

* Re: syzkaller on risc-v
  2020-06-30 13:43         ` Andreas Schwab
@ 2020-07-02 22:00           ` Aurelien Jarno
  2020-07-06  8:14             ` Andreas Schwab
  0 siblings, 1 reply; 29+ messages in thread
From: Aurelien Jarno @ 2020-07-02 22:00 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Albert Ou, David Abdurachmanov, Björn Töpel, syzkaller,
	Palmer Dabbelt, Paul Walmsley, Tobias Klauser, linux-riscv,
	Dmitry Vyukov

On 2020-06-30 15:43, Andreas Schwab wrote:
> On Jun 30 2020, David Abdurachmanov wrote:
> 
> > You seem to bootstrap with gccgo, which will not work (tried it).
> 
> I consider that broken.

I agree that something is broken. Here gccgo is broken, not Golang. 

Aurelien

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

* Re: syzkaller on risc-v
  2020-07-02 22:00           ` Aurelien Jarno
@ 2020-07-06  8:14             ` Andreas Schwab
  0 siblings, 0 replies; 29+ messages in thread
From: Andreas Schwab @ 2020-07-06  8:14 UTC (permalink / raw)
  To: Aurelien Jarno
  Cc: Albert Ou, David Abdurachmanov, Björn Töpel, syzkaller,
	Palmer Dabbelt, Paul Walmsley, Tobias Klauser, linux-riscv,
	Dmitry Vyukov

On Jul 03 2020, Aurelien Jarno wrote:

> On 2020-06-30 15:43, Andreas Schwab wrote:
>> On Jun 30 2020, David Abdurachmanov wrote:
>> 
>> > You seem to bootstrap with gccgo, which will not work (tried it).
>> 
>> I consider that broken.
>
> I agree that something is broken. Here gccgo is broken, not Golang. 

Agreed.  I was able to bootstrap go1.14 now with the other bootstrapping
method.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

* Re: syzkaller on risc-v
  2020-06-30 13:57       ` David Abdurachmanov
  2020-06-30 14:55         ` Andreas Schwab
@ 2020-07-06 10:12         ` Colin Ian King
  2020-07-14  1:21           ` Palmer Dabbelt
  1 sibling, 1 reply; 29+ messages in thread
From: Colin Ian King @ 2020-07-06 10:12 UTC (permalink / raw)
  To: David Abdurachmanov
  Cc: Albert Ou, Andreas Schwab, Paul Walmsley, syzkaller,
	Palmer Dabbelt, Björn Töpel, Tobias Klauser,
	linux-riscv, Dmitry Vyukov

FYI, increasing the THREAD_SIZE_ORDER to 2 fixes the gcov stack crashes
I'm seeing on a 5.4 kernel.
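
For context, the "corrupted stack end detected inside scheduler" panic
quoted below comes from a check that the magic word at the lowest address
of a task's stack is still intact. Here is a minimal user-space sketch of
that idea. The sketch_* names and the 16-word toy stack are hypothetical;
only the magic value matches the kernel's STACK_END_MAGIC, and the real
check runs inside the scheduler rather than in code like this:

```c
#include <stdbool.h>
#include <stddef.h>

/* Same value as the kernel's STACK_END_MAGIC; everything else here is a
 * hypothetical stand-in for illustration. */
#define SKETCH_STACK_END_MAGIC 0x57AC6E9DUL
#define SKETCH_STACK_WORDS     16

/* A toy stack that grows downwards: words[0] is the stack end. */
struct sketch_stack {
    unsigned long words[SKETCH_STACK_WORDS];
};

/* Place the magic word at the lowest address of the stack. */
void sketch_stack_init(struct sketch_stack *s)
{
    for (size_t i = 0; i < SKETCH_STACK_WORDS; i++)
        s->words[i] = 0;
    s->words[0] = SKETCH_STACK_END_MAGIC;
}

/* Simulate using nwords of stack, filling from the top downwards. */
void sketch_stack_use(struct sketch_stack *s, size_t nwords)
{
    if (nwords > SKETCH_STACK_WORDS)
        nwords = SKETCH_STACK_WORDS;
    for (size_t i = 0; i < nwords; i++)
        s->words[SKETCH_STACK_WORDS - 1 - i] = 0xAAUL;
}

/* True once the magic word has been overwritten, i.e. the stack overflowed. */
bool sketch_stack_end_corrupted(const struct sketch_stack *s)
{
    return s->words[0] != SKETCH_STACK_END_MAGIC;
}
```

In this sketch, using 15 of the 16 words leaves the magic word intact,
while using all 16 overwrites it and the check reports corruption. A
larger THREAD_SIZE_ORDER moves the stack end further away from the top.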

On 30/06/2020 14:57, David Abdurachmanov wrote:
> On Tue, Jun 30, 2020 at 4:38 PM Colin Ian King <colin.king@canonical.com> wrote:
>>
>> I believe I'm also seeing some potential stack smashing issues in the
>> lua engine in ZFS on risc-v. It is taking a while for me to debug, but I
>> don't see the failure on other arches.  Is there a way to bump the stack
>> size up temporarily to test with larger stacks on risc-v?
> 
> Dmitry wrote in the original email that the following solves issues with
> KCOV enabled:
> 
> --- a/arch/riscv/include/asm/thread_info.h
> +++ b/arch/riscv/include/asm/thread_info.h
> -#define THREAD_SIZE_ORDER      (1)
> +#define THREAD_SIZE_ORDER      (2)
> 
> I see MIPS has:
> 
> [..]
> /* thread information allocation */
> #if defined(CONFIG_PAGE_SIZE_4KB) && defined(CONFIG_32BIT)
> #define THREAD_SIZE_ORDER (1)
> #endif
> #if defined(CONFIG_PAGE_SIZE_4KB) && defined(CONFIG_64BIT)
> #define THREAD_SIZE_ORDER (2)
> [..]
> 
> david
> 
>>
>> Colin
>>
>> On 30/06/2020 14:26, David Abdurachmanov wrote:
>>> On Tue, Jun 30, 2020 at 4:04 PM Andreas Schwab <schwab@suse.de> wrote:
>>>>
>>>> On Jun 30 2020, Dmitry Vyukov wrote:
>>>>
>>>>> I would assume some stack overflows can happen without KCOV as well.
>>>>
>>>> Yes, I see stack overflows quite a lot, like this:
>>>>
>>>> [62192.908680] Kernel panic - not syncing: corrupted stack end detected inside scheduler
>>>> [62192.915752] CPU: 0 PID: 12347 Comm: ld Not tainted 5.7.5-221-default #1 openSUSE Tumbleweed (unreleased)
>>>> [62192.925204] Call Trace:
>>>> [62192.927646] [<ffffffe0002028ae>] walk_stackframe+0x0/0xaa
>>>> [62192.933030] [<ffffffe000202b76>] show_stack+0x2a/0x34
>>>> [62192.938066] [<ffffffe000557d44>] dump_stack+0x6e/0x88
>>>> [62192.943098] [<ffffffe00020c2d2>] panic+0xe8/0x26a
>>>> [62192.947785] [<ffffffe00085ab9c>] schedule+0x0/0xb2
>>>> [62192.952561] [<ffffffe00085af36>] _cond_resched+0x32/0x44
>>>> [62192.957859] [<ffffffe0002f18ea>] invalidate_mapping_pages+0xe0/0x1ce
>>>> [62192.964193] [<ffffffe000370aa4>] inode_lru_isolate+0x238/0x298
>>>> [62192.970012] [<ffffffe000308098>] __list_lru_walk_one+0x5e/0xf6
>>>> [62192.975826] [<ffffffe000308516>] list_lru_walk_one+0x42/0x98
>>>> [62192.981470] [<ffffffe0003717e8>] prune_icache_sb+0x32/0x72
>>>> [62192.986941] [<ffffffe000358366>] super_cache_scan+0xe4/0x13e
>>>> [62192.992586] [<ffffffe0002f1fac>] do_shrink_slab+0x10e/0x17e
>>>> [62192.998142] [<ffffffe0002f2126>] shrink_slab_memcg+0x10a/0x1de
>>>> [62193.003957] [<ffffffe0002f5314>] shrink_node_memcgs+0x12e/0x1a4
>>>> [62193.009861] [<ffffffe0002f5484>] shrink_node+0xfa/0x43c
>>>> [62193.015067] [<ffffffe0002f583e>] shrink_zones+0x78/0x18c
>>>> [62193.020365] [<ffffffe0002f59f0>] do_try_to_free_pages+0x9e/0x23e
>>>> [62193.026352] [<ffffffe0002f65ac>] try_to_free_pages+0xb2/0xf4
>>>> [62193.031991] [<ffffffe000322952>] __alloc_pages_slowpath.constprop.0+0x2d0/0x6c2
>>>> [62193.039284] [<ffffffe000322e9a>] __alloc_pages_nodemask+0x156/0x1b2
>>>> [62193.045535] [<ffffffe00030c730>] do_anonymous_page+0x58/0x41c
>>>> [62193.051266] [<ffffffe00030f50e>] handle_pte_fault+0x12e/0x156
>>>> [62193.056994] [<ffffffe000310444>] __handle_mm_fault+0xca/0x118
>>>> [62193.062725] [<ffffffe000310532>] handle_mm_fault+0xa0/0x152
>>>> [62193.068278] [<ffffffe0002055ba>] do_page_fault+0xd6/0x370
>>>> [62193.073666] [<ffffffe00020140a>] ret_from_exception+0x0/0xc
>>>> [62193.079222] [<ffffffe0004fc16a>] copy_page_to_iter_iovec+0x4c/0x154
>>>
>>> There was a report from Canonical that enabling gcov causes similar issues.
>>>
>>> linux: riscv: corrupted stack detected inside scheduler
>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1877954
>>>
>>> Adding Colin to CC. So far we couldn't reproduce this locally, I
>>> guess, because we don't have the right config.
>>>
>>> david
>>>
>>>
>>>>
>>>> or this:
>>>>
>>>> [200460.114397] Kernel panic - not syncing: corrupted stack end detected inside scheduler
>>>> [200460.121553] CPU: 0 PID: 32619 Comm: sh Not tainted 5.7.5-221-default #1 openSUSE Tumbleweed (unreleased)
>>>> [200460.131090] Call Trace:
>>>> [200460.133623] [<ffffffe0002028ae>] walk_stackframe+0x0/0xaa
>>>> [200460.139091] [<ffffffe000202b76>] show_stack+0x2a/0x34
>>>> [200460.144212] [<ffffffe000557d44>] dump_stack+0x6e/0x88
>>>> [200460.149335] [<ffffffe00020c2d2>] panic+0xe8/0x26a
>>>> [200460.154109] [<ffffffe00085ab9c>] schedule+0x0/0xb2
>>>> [200460.158969] [<ffffffe00085af36>] _cond_resched+0x32/0x44
>>>> [200460.164348] [<ffffffe000498572>] aa_sk_perm+0x38/0x138
>>>> [200460.169559] [<ffffffe00048d4b4>] apparmor_socket_sendmsg+0x18/0x20
>>>> [200460.175817] [<ffffffe0004508e0>] security_socket_sendmsg+0x2a/0x42
>>>> [200460.182061] [<ffffffe0006f4c0a>] sock_sendmsg+0x1a/0x40
>>>> [200460.195979] [<ffffffdf817210cc>] xprt_sock_sendmsg+0xb2/0x2b6 [sunrpc]
>>>> [200460.210450] [<ffffffdf81723bde>] xs_tcp_send_request+0xc6/0x206 [sunrpc]
>>>> [200460.224930] [<ffffffdf8171f538>] xprt_request_transmit.constprop.0+0x88/0x218 [sunrpc]
>>>> [200460.240731] [<ffffffdf81720610>] xprt_transmit+0x9a/0x182 [sunrpc]
>>>> [200460.254858] [<ffffffdf8171a584>] call_transmit+0x68/0xb8 [sunrpc]
>>>> [200460.268817] [<ffffffdf81726660>] __rpc_execute+0x84/0x222 [sunrpc]
>>>> [200460.282787] [<ffffffdf81726cea>] rpc_execute+0xac/0xb8 [sunrpc]
>>>> [200460.296493] [<ffffffdf8171c5ca>] rpc_run_task+0x122/0x178 [sunrpc]
>>>> [200460.314422] [<ffffffdf82e1533a>] nfs4_do_call_sync+0x64/0x84 [nfsv4]
>>>> [200460.332514] [<ffffffdf82e1541c>] _nfs4_proc_getattr+0xc2/0xd4 [nfsv4]
>>>> [200460.350813] [<ffffffdf82e1cafc>] nfs4_proc_getattr+0x48/0x72 [nfsv4]
>>>> [200460.363307] [<ffffffdf8292c1f6>] __nfs_revalidate_inode+0x104/0x2c8 [nfs]
>>>> [200460.376204] [<ffffffdf82926d18>] nfs_access_get_cached+0x104/0x212 [nfs]
>>>> [200460.389112] [<ffffffdf82926f20>] nfs_do_access+0xfa/0x178 [nfs]
>>>> [200460.401176] [<ffffffdf82927070>] nfs_permission+0x8e/0x184 [nfs]
>>>> [200460.406497] [<ffffffe000361936>] inode_permission.part.0+0x78/0x118
>>>> [200460.412838] [<ffffffe0003638ea>] link_path_walk.part.0+0x1bc/0x212
>>>> [200460.419086] [<ffffffe000363c7e>] path_lookupat+0x34/0x172
>>>> [200460.424559] [<ffffffe0003653de>] filename_lookup+0x5c/0xf4
>>>> [200460.430114] [<ffffffe00036551e>] user_path_at_empty+0x3a/0x5e
>>>> [200460.435931] [<ffffffe00035b838>] vfs_statx+0x62/0xbc
>>>> [200460.440966] [<ffffffe00035b92a>] __do_sys_newfstatat+0x24/0x3a
>>>> [200460.446870] [<ffffffe00035bafa>] sys_newfstatat+0x10/0x18
>>>> [200460.452339] [<ffffffe0002013fc>] ret_from_syscall+0x0/0x2
>>>>
>>>> Andreas.
>>>>
>>>> --
>>>> Andreas Schwab, SUSE Labs, schwab@suse.de
>>>> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
>>>> "And now for something completely different."
>>>>
>>>> _______________________________________________
>>>> linux-riscv mailing list
>>>> linux-riscv@lists.infradead.org
>>>> http://lists.infradead.org/mailman/listinfo/linux-riscv
>>


* Re: syzkaller on risc-v
  2020-07-06 10:12         ` Colin Ian King
@ 2020-07-14  1:21           ` Palmer Dabbelt
  0 siblings, 0 replies; 29+ messages in thread
From: Palmer Dabbelt @ 2020-07-14  1:21 UTC (permalink / raw)
  To: colin.king
  Cc: aou, david.abdurachmanov, schwab, Paul Walmsley, syzkaller,
	Bjorn Topel, tklauser, linux-riscv, dvyukov

On Mon, 06 Jul 2020 03:12:05 PDT (-0700), colin.king@canonical.com wrote:
> FYI, increasing the THREAD_SIZE_ORDER to 2 fixes the gcov stack crashes
> I'm seeing on a 5.4 kernel.

Sorry I'm a bit slow here, but setting THREAD_SIZE_ORDER to 2 on the
64-bit targets seems in line with what everyone else is doing.  IIRC this is
essentially the stack size, which does tend to be larger on 64-bit platforms,
so it seems reasonable.  I wouldn't be terribly surprised if we also have
larger stacks on rv32 than other platforms do, but I think it's best to avoid
increasing the size over there without at least seeing some failures.
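
As a rough sketch of the arithmetic, assuming 4 KiB pages and that the
stack size is derived as PAGE_SIZE << THREAD_SIZE_ORDER (the sketch_*
names are made up for illustration and are not kernel identifiers):

```c
/* Assumption: 4 KiB base pages. */
#define SKETCH_PAGE_SIZE 4096UL

/* Stack size in bytes for a given order: the page size shifted left by
 * the order, so each increment of the order doubles the stack. */
unsigned long sketch_thread_size(unsigned int order)
{
    return SKETCH_PAGE_SIZE << order;
}
```

Under those assumptions, order 1 gives 8 KiB and order 2 gives 16 KiB
per kernel stack.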

I've just sent out a patch, as I don't see one in my inbox.

Thanks!


Thread overview: 29+ messages
2020-06-30 12:48 syzkaller on risc-v Dmitry Vyukov
2020-06-30 12:57 ` Andreas Schwab
2020-06-30 13:26   ` Dmitry Vyukov
2020-06-30 13:33     ` Andreas Schwab
2020-06-30 13:40       ` Dmitry Vyukov
2020-06-30 13:45         ` Andreas Schwab
2020-06-30 13:49           ` Dmitry Vyukov
2020-06-30 13:52             ` Andreas Schwab
2020-07-01 10:42     ` Björn Töpel
2020-07-01 10:43       ` Björn Töpel
2020-07-01 11:34         ` Dmitry Vyukov
2020-07-01 13:52         ` Tobias Klauser
2020-06-30 13:03 ` Andreas Schwab
2020-06-30 13:26   ` David Abdurachmanov
2020-06-30 13:37     ` Colin Ian King
2020-06-30 13:57       ` David Abdurachmanov
2020-06-30 14:55         ` Andreas Schwab
2020-07-06 10:12         ` Colin Ian King
2020-07-14  1:21           ` Palmer Dabbelt
2020-06-30 13:07 ` Andreas Schwab
2020-06-30 13:20   ` David Abdurachmanov
2020-06-30 13:23     ` Dmitry Vyukov
2020-06-30 13:30     ` Andreas Schwab
2020-06-30 13:35       ` David Abdurachmanov
2020-06-30 13:43         ` Andreas Schwab
2020-07-02 22:00           ` Aurelien Jarno
2020-07-06  8:14             ` Andreas Schwab
2020-06-30 15:10 ` Tobias Klauser
2020-07-01 10:03   ` Dmitry Vyukov
