* Arm + KASAN + syzbot
@ 2021-01-18 16:31 Dmitry Vyukov
  2021-01-19  8:36 ` Krzysztof Kozlowski
                   ` (2 more replies)
  0 siblings, 3 replies; 47+ messages in thread
From: Dmitry Vyukov @ 2021-01-18 16:31 UTC (permalink / raw)
  To: Russell King - ARM Linux, Linux ARM, Linus Walleij, liu.hailong6,
	Arnd Bergmann, kasan-dev, syzkaller, Krzysztof Kozlowski

Hello Arm maintainers,

We are considering setting up an Arm 32-bit instance on syzbot for
continuous testing using qemu emulation and I have several questions
related to that.

1. Is there interest in this on your end? What git tree/branch should
be used for testing (contains latest development and is regularly
updated with fixes)?

2. I see KASAN has just become supported for Arm, which is very
useful, but I can't boot a kernel with KASAN enabled. I am using
v5.11-rc4 and this config without KASAN boots fine:
https://gist.githubusercontent.com/dvyukov/12de2905f9479ba2ebdcc603c2fec79b/raw/c8fd3f5e8328259fe760ce9a57f3e6c6f5a95c8f/gistfile1.txt
using the following qemu command line:
qemu-system-arm \
  -machine vexpress-a15 -cpu max -smp 2 -m 2G \
  -device virtio-blk-device,drive=hd0 \
  -drive if=none,format=raw,id=hd0,file=image-arm -snapshot \
  -kernel arch/arm/boot/zImage \
  -dtb arch/arm/boot/dts/vexpress-v2p-ca15-tc1.dtb \
  -nographic \
  -netdev user,host=10.0.2.10,hostfwd=tcp::10022-:22,id=net0 \
  -device virtio-net-device,netdev=net0 \
  -append "root=/dev/vda earlycon earlyprintk=serial console=ttyAMA0 oops=panic panic_on_warn=1 panic=86400 vmalloc=512M"

However, when I enable KASAN and get this config:
https://gist.githubusercontent.com/dvyukov/a7e3edd35cc39a1b69b11530c7d2e7ac/raw/7cbda88085d3ccd11227224a1c9964ccb8484d4e/gistfile1.txt

kernel does not boot, qemu only prints the following output and then silence:
pulseaudio: set_sink_input_volume() failed
pulseaudio: Reason: Invalid argument
pulseaudio: set_sink_input_mute() failed
pulseaudio: Reason: Invalid argument

What am I doing wrong?

3. CONFIG_KCOV does not seem to fully work.
It seems to work except for when the kernel crashes, and that's the
most interesting scenario for us. When the kernel crashes for other
reasons, crash handlers re-crash in KCOV making all crashes
unactionable and indistinguishable.
Here are some samples (search for __sanitizer_cov_trace):
https://gist.githubusercontent.com/dvyukov/c8a7ff1c00a5223c5143fd90073f5bc4/raw/c0f4ac7fd7faad7253843584fed8620ac6006338/gistfile1.txt
Perhaps some additional Makefiles in arch/arm need KCOV_INSTRUMENT :=
n to fix this.
And LKDTM can be used for testing:
https://www.kernel.org/doc/html/latest/fault-injection/provoke-crashes.html
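
A minimal sketch of provoking a crash through LKDTM, assuming
CONFIG_LKDTM is enabled and debugfs is mounted at /sys/kernel/debug in
the guest. LKDTM_FIRE is a hypothetical safety guard, not part of
LKDTM, so the script is a dry run unless it is set:

```shell
#!/bin/sh
# Assumptions: CONFIG_LKDTM=y, debugfs mounted at /sys/kernel/debug,
# run as root inside the test guest.
# LKDTM_FIRE is a made-up switch; without it nothing is written.
trigger=/sys/kernel/debug/provoke-crash/DIRECT
crash_type=EXCEPTION

if [ "${LKDTM_FIRE:-0}" = 1 ] && [ -w "$trigger" ]; then
    # Writing a crash type name to DIRECT triggers it immediately.
    echo "$crash_type" > "$trigger"
else
    echo "dry run: would write $crash_type to $trigger"
fi
```

With KCOV enabled, the resulting oops should show whether the crash
handler itself faults in a __sanitizer_cov_trace_* function.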

Thanks

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: Arm + KASAN + syzbot
  2021-01-18 16:31 Arm + KASAN + syzbot Dmitry Vyukov
@ 2021-01-19  8:36 ` Krzysztof Kozlowski
  2021-01-19  8:46   ` Linus Walleij
  2021-01-19 10:04   ` Dmitry Vyukov
  2021-01-19  8:41 ` Linus Walleij
  2021-01-19 10:03 ` Mark Rutland
  2 siblings, 2 replies; 47+ messages in thread
From: Krzysztof Kozlowski @ 2021-01-19  8:36 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Arnd Bergmann, Linus Walleij, Russell King - ARM Linux,
	kasan-dev, syzkaller, liu.hailong6, Linux ARM

On Mon, 18 Jan 2021 at 17:31, Dmitry Vyukov <dvyukov@google.com> wrote:
>
> Hello Arm maintainers,
>
> We are considering setting up an Arm 32-bit instance on syzbot for
> continuous testing using qemu emulation and I have several questions
> related to that.
>
> 1. Is there interest in this on your end?

Sure, the more, the better.

> What git tree/branch should
> be used for testing (contains latest development and is regularly
> updated with fixes)?

Depends on your testing capabilities, whether you can deal with every
sub-maintainer's tree. 0-day kernel robot tests everything possible
and this allows each sub-maintainer to receive early feedback about his
tree. It can be around 30 Git trees, though... If you want only a few, I
would start with:
 - https://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git/
 - linux-next
 - and Russell's for-next
(http://git.armlinux.org.uk/cgit/linux-arm.git/log/?h=for-next)

> 2. I see KASAN has just become supported for Arm, which is very
> useful, but I can't boot a kernel with KASAN enabled. I am using
> v5.11-rc4 and this config without KASAN boots fine:
> https://gist.githubusercontent.com/dvyukov/12de2905f9479ba2ebdcc603c2fec79b/raw/c8fd3f5e8328259fe760ce9a57f3e6c6f5a95c8f/gistfile1.txt

Maybe try first with a kernel based on vexpress defconfig. Yours looks
closer to multi_v7 which enables a lot of stuff also as modules and
this by itself brought up a few issues (mostly with order of probes).

You could also try another QEMU machine (I don't know many of them, some
time ago I was using exynos defconfig on smdkc210, but without KASAN).

> using the following qemu command line:
> qemu-system-arm \
>   -machine vexpress-a15 -cpu max -smp 2 -m 2G \
>   -device virtio-blk-device,drive=hd0 \
>   -drive if=none,format=raw,id=hd0,file=image-arm -snapshot \
>   -kernel arch/arm/boot/zImage \
>   -dtb arch/arm/boot/dts/vexpress-v2p-ca15-tc1.dtb \
>   -nographic \
>   -netdev user,host=10.0.2.10,hostfwd=tcp::10022-:22,id=net0 -device
> virtio-net-device,netdev=net0 \
>   -append "root=/dev/vda earlycon earlyprintk=serial console=ttyAMA0
> oops=panic panic_on_warn=1 panic=86400 vmalloc=512M"
>
> However, when I enable KASAN and get this config:
> https://gist.githubusercontent.com/dvyukov/a7e3edd35cc39a1b69b11530c7d2e7ac/raw/7cbda88085d3ccd11227224a1c9964ccb8484d4e/gistfile1.txt
>
> kernel does not boot, qemu only prints the following output and then silence:
> pulseaudio: set_sink_input_volume() failed
> pulseaudio: Reason: Invalid argument
> pulseaudio: set_sink_input_mute() failed
> pulseaudio: Reason: Invalid argument
>
> What am I doing wrong?

No clue but I just tried KASAN on my ARMv7 Exynos5422 board (real
hardware) and it works (although kernel log appeared with a bigger
delay):

[    0.000000] Booting Linux on physical CPU 0x100
[    0.000000] Linux version
5.11.0-rc3-next-20210115-00001-g77140600eeec (kozik@kozik-lap)
(arm-linux-gnueabi-gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, GNU ld
(GNU Binutils for Ubuntu) 2.34) #144 SMP PREEMPT Tue Jan 19 09:23:24
CET 2021
[    0.000000] CPU: ARMv7 Processor [410fc073] revision 3 (ARMv7), cr=10c5387d
...
[    0.000000] kasan: Truncating shadow for memory block at
0x40000000-0xbea00000 to lowmem region at 0x70000000
[    0.000000] kasan: Mapping kernel virtual memory block:
c0000000-f0000000 at shadow: b7000000-bd000000
[    0.000000] kasan: Mapping kernel virtual memory block:
bf000000-c0000000 at shadow: b6e00000-b7000000
[    0.000000] kasan: Kernel address sanitizer initialized

Best regards,
Krzysztof


* Re: Arm + KASAN + syzbot
  2021-01-18 16:31 Arm + KASAN + syzbot Dmitry Vyukov
  2021-01-19  8:36 ` Krzysztof Kozlowski
@ 2021-01-19  8:41 ` Linus Walleij
  2021-01-19  8:43   ` Linus Walleij
  2021-01-19 10:18   ` Dmitry Vyukov
  2021-01-19 10:03 ` Mark Rutland
  2 siblings, 2 replies; 47+ messages in thread
From: Linus Walleij @ 2021-01-19  8:41 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Arnd Bergmann, Hailong Liu, Russell King - ARM Linux, kasan-dev,
	syzkaller, Krzysztof Kozlowski, Linux ARM

On Mon, Jan 18, 2021 at 5:31 PM Dmitry Vyukov <dvyukov@google.com> wrote:

> We are considering setting up an Arm 32-bit instance on syzbot for
> continuous testing using qemu emulation and I have several questions
> related to that.

That's interesting. I don't know much about syzbot but it reminds me
of syzkaller.

> 1. Is there interest in this on your end? What git tree/branch should
> be used for testing (contains latest development and is regularly
> updated with fixes)?

The most important would be Russell's branch I think, that is where
the core architecture changes end up. They also land in linux-next.

I think for the core developers this is the interesting tree,
the corporate users mostly use KASAN for fuzzing their
out-of-tree codebase and that is not of our concern. There can
be some specific platforms we want to test but they mostly
require real hardware because the interesting bugs tend to be
in drivers and driver subsystems that only get exercised on
real hardware (not Qemu).

> 2. I see KASAN has just become supported for Arm, which is very
> useful, but I can't boot a kernel with KASAN enabled. I am using
> v5.11-rc4 and this config without KASAN boots fine:
> https://gist.githubusercontent.com/dvyukov/12de2905f9479ba2ebdcc603c2fec79b/raw/c8fd3f5e8328259fe760ce9a57f3e6c6f5a95c8f/gistfile1.txt
> using the following qemu command line:
> qemu-system-arm \
>   -machine vexpress-a15 -cpu max -smp 2 -m 2G \
>   -device virtio-blk-device,drive=hd0 \
>   -drive if=none,format=raw,id=hd0,file=image-arm -snapshot \
>   -kernel arch/arm/boot/zImage \
>   -dtb arch/arm/boot/dts/vexpress-v2p-ca15-tc1.dtb \
>   -nographic \
>   -netdev user,host=10.0.2.10,hostfwd=tcp::10022-:22,id=net0 -device
> virtio-net-device,netdev=net0 \
>   -append "root=/dev/vda earlycon earlyprintk=serial console=ttyAMA0
> oops=panic panic_on_warn=1 panic=86400 vmalloc=512M"
>
> However, when I enable KASAN and get this config:
> https://gist.githubusercontent.com/dvyukov/a7e3edd35cc39a1b69b11530c7d2e7ac/raw/7cbda88085d3ccd11227224a1c9964ccb8484d4e/gistfile1.txt
>
> kernel does not boot, qemu only prints the following output and then silence:
> pulseaudio: set_sink_input_volume() failed
> pulseaudio: Reason: Invalid argument
> pulseaudio: set_sink_input_mute() failed
> pulseaudio: Reason: Invalid argument
>
> What am I doing wrong?

I tried it with both KASAN_INLINE and KASAN_OUTLINE this
morning on Torvalds' tree and it works fine for me.
I brought it up with this and it booted (takes ~30 seconds to come up
on an i7).

Here is my config:
https://dflund.se/~triad/krad/vexpress_config.txt

> 3. CONFIG_KCOV does not seem to fully work.
> It seems to work except for when the kernel crashes, and that's the
> most interesting scenario for us. When the kernel crashes for other
> reasons, crash handlers re-crash in KCOV making all crashes
> unactionable and indistinguishable.
> Here are some samples (search for __sanitizer_cov_trace):
> https://gist.githubusercontent.com/dvyukov/c8a7ff1c00a5223c5143fd90073f5bc4/raw/c0f4ac7fd7faad7253843584fed8620ac6006338/gistfile1.txt
> Perhaps some additional Makefiles in arch/arm need KCOV_INSTRUMENT :=
> n to fix this.
> And LKDTM can be used for testing:
> https://www.kernel.org/doc/html/latest/fault-injection/provoke-crashes.html

I have never used CONFIG_KCOV really, it's yet another universe
that I haven't looked into.

Yours,
Linus Walleij


* Re: Arm + KASAN + syzbot
  2021-01-19  8:41 ` Linus Walleij
@ 2021-01-19  8:43   ` Linus Walleij
  2021-01-19 10:18   ` Dmitry Vyukov
  1 sibling, 0 replies; 47+ messages in thread
From: Linus Walleij @ 2021-01-19  8:43 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Arnd Bergmann, Hailong Liu, Russell King - ARM Linux, kasan-dev,
	syzkaller, Krzysztof Kozlowski, Linux ARM

On Tue, Jan 19, 2021 at 9:41 AM Linus Walleij <linus.walleij@linaro.org> wrote:
> On Mon, Jan 18, 2021 at 5:31 PM Dmitry Vyukov <dvyukov@google.com> wrote:

> I tried it with both KASAN_INLINE and KASAN_OUTLINE this
> morning on Torvald's tree and it works fine for me.
> I brought it up with this and it booted (takes ~30 seconds to come up
> on an i7).
>
> Here is my config:
> https://dflund.se/~triad/krad/vexpress_config.txt

BTW here is how I invoke QEMU on this:
qemu-system-arm -M vexpress-a15 -no-reboot -smp cpus=2 \
  -kernel ${HOME}/zImage -dtb ${HOME}/vexpress-v2p-ca15-tc1.dtb \
  -append "console=ttyAMA0" -serial stdio

Using the DTB from the same kernel build.

Yours,
Linus Walleij


* Re: Arm + KASAN + syzbot
  2021-01-19  8:36 ` Krzysztof Kozlowski
@ 2021-01-19  8:46   ` Linus Walleij
  2021-01-19 10:04   ` Dmitry Vyukov
  1 sibling, 0 replies; 47+ messages in thread
From: Linus Walleij @ 2021-01-19  8:46 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: Arnd Bergmann, Hailong Liu, Russell King - ARM Linux, kasan-dev,
	syzkaller, Dmitry Vyukov, Linux ARM

On Tue, Jan 19, 2021 at 9:37 AM Krzysztof Kozlowski <krzk@kernel.org> wrote:

> No clue but I just tried KASAN on my ARMv7 Exynos5422 board (real
> hardware) and it works (although kernel log appeared with a bigger
> delay):
>
> [    0.000000] Booting Linux on physical CPU 0x100
> [    0.000000] Linux version
> 5.11.0-rc3-next-20210115-00001-g77140600eeec (kozik@kozik-lap)
> (arm-linux-gnueabi-gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, GNU ld
> (GNU Binutils for Ubuntu) 2.34) #144 SMP PREEMPT Tue Jan 19 09:23:24
> CET 2021
> [    0.000000] CPU: ARMv7 Processor [410fc073] revision 3 (ARMv7), cr=10c5387d
> ...
> [    0.000000] kasan: Truncating shadow for memory block at
> 0x40000000-0xbea00000 to lowmem region at 0x70000000
> [    0.000000] kasan: Mapping kernel virtual memory block:
> c0000000-f0000000 at shadow: b7000000-bd000000
> [    0.000000] kasan: Mapping kernel virtual memory block:
> bf000000-c0000000 at shadow: b6e00000-b7000000
> [    0.000000] kasan: Kernel address sanitizer initialized

This looks right, I'm happy that it works on Exynos! :)

I recently summarized the stuff we had to fix up for getting
KASAN to work on ARM in a talkative blog post:
https://people.kernel.org/linusw/kasan-for-arm32-decompression-stop

Yours,
Linus Walleij


* Re: Arm + KASAN + syzbot
  2021-01-18 16:31 Arm + KASAN + syzbot Dmitry Vyukov
  2021-01-19  8:36 ` Krzysztof Kozlowski
  2021-01-19  8:41 ` Linus Walleij
@ 2021-01-19 10:03 ` Mark Rutland
  2021-01-19 10:34   ` Dmitry Vyukov
  2 siblings, 1 reply; 47+ messages in thread
From: Mark Rutland @ 2021-01-19 10:03 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Arnd Bergmann, Linus Walleij, Russell King - ARM Linux,
	kasan-dev, syzkaller, Krzysztof Kozlowski, liu.hailong6,
	Linux ARM

Hi Dmitry,

On Mon, Jan 18, 2021 at 05:31:36PM +0100, 'Dmitry Vyukov' via syzkaller wrote:
> 2. I see KASAN has just become supported for Arm, which is very
> useful, but I can't boot a kernel with KASAN enabled. I am using
> v5.11-rc4 and this config without KASAN boots fine:
> https://gist.githubusercontent.com/dvyukov/12de2905f9479ba2ebdcc603c2fec79b/raw/c8fd3f5e8328259fe760ce9a57f3e6c6f5a95c8f/gistfile1.txt
> using the following qemu command line:
> qemu-system-arm \
>   -machine vexpress-a15 -cpu max -smp 2 -m 2G \

It might be best to use `-machine virt` here instead; that way QEMU
won't need to emulate any of the real vexpress HW, and the kernel won't
need to waste any time poking it.

IIUC with that, you also wouldn't need to provide a DTB explicitly as
QEMU will generate one...

>   -device virtio-blk-device,drive=hd0 \
>   -drive if=none,format=raw,id=hd0,file=image-arm -snapshot \
>   -kernel arch/arm/boot/zImage \
>   -dtb arch/arm/boot/dts/vexpress-v2p-ca15-tc1.dtb \

... so this line could go, too.

>   -nographic \
>   -netdev user,host=10.0.2.10,hostfwd=tcp::10022-:22,id=net0 -device
> virtio-net-device,netdev=net0 \
>   -append "root=/dev/vda earlycon earlyprintk=serial console=ttyAMA0
> oops=panic panic_on_warn=1 panic=86400 vmalloc=512M"
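
Putting the two suggestions above together, the command might look
like the sketch below. It assumes the kernel config supports the virt
machine and that the earlycon/earlyprintk settings carried over from
the vexpress command still apply there; neither is confirmed in this
thread. RUN_QEMU is a hypothetical guard so the script only prints the
command by default:

```shell
#!/bin/sh
# Sketch only: the original command with "-machine virt" and no -dtb.
# Assumptions: the kernel config supports the virt machine, and the
# earlycon/earlyprintk settings from the vexpress setup still apply.
# RUN_QEMU is a made-up guard; by default the command is only printed.
set -- \
    -machine virt -cpu max -smp 2 -m 2G \
    -device virtio-blk-device,drive=hd0 \
    -drive if=none,format=raw,id=hd0,file=image-arm -snapshot \
    -kernel arch/arm/boot/zImage \
    -nographic \
    -netdev user,host=10.0.2.10,hostfwd=tcp::10022-:22,id=net0 \
    -device virtio-net-device,netdev=net0 \
    -append "root=/dev/vda earlycon earlyprintk=serial console=ttyAMA0 oops=panic panic_on_warn=1 panic=86400 vmalloc=512M"

# Flattened copy for display; quoting of -append is lost here on purpose.
qemu_cmd="qemu-system-arm $*"
echo "$qemu_cmd"

if [ "${RUN_QEMU:-0}" = 1 ]; then
    exec qemu-system-arm "$@"
fi
```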

[...]

> 3. CONFIG_KCOV does not seem to fully work.
> It seems to work except for when the kernel crashes, and that's the
> most interesting scenario for us. When the kernel crashes for other
> reasons, crash handlers re-crash in KCOV making all crashes
> unactionable and indistinguishable.
> Here are some samples (search for __sanitizer_cov_trace):
> https://gist.githubusercontent.com/dvyukov/c8a7ff1c00a5223c5143fd90073f5bc4/raw/c0f4ac7fd7faad7253843584fed8620ac6006338/gistfile1.txt

Most of those are small offsets from 0, which suggests an offset is
being added to a NULL pointer somewhere, which I suspect means
task_struct::kcov_area is NULL. We could hack in a check for that, and
see if that's the case (though I can't see how from a quick scan of the
kcov code).

Thanks,
Mark.


* Re: Arm + KASAN + syzbot
  2021-01-19  8:36 ` Krzysztof Kozlowski
  2021-01-19  8:46   ` Linus Walleij
@ 2021-01-19 10:04   ` Dmitry Vyukov
  2021-01-19 10:17     ` Linus Walleij
  1 sibling, 1 reply; 47+ messages in thread
From: Dmitry Vyukov @ 2021-01-19 10:04 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: Arnd Bergmann, Linus Walleij, Russell King - ARM Linux,
	kasan-dev, syzkaller, liu.hailong6, Linux ARM

On Tue, Jan 19, 2021 at 9:37 AM Krzysztof Kozlowski <krzk@kernel.org> wrote:
>
> On Mon, 18 Jan 2021 at 17:31, Dmitry Vyukov <dvyukov@google.com> wrote:
> >
> > Hello Arm maintainers,
> >
> > We are considering setting up an Arm 32-bit instance on syzbot for
> > continuous testing using qemu emulation and I have several questions
> > related to that.
> >
> > 1. Is there interest in this on your end?
>
> Sure, the more, the better.
>
> > What git tree/branch should
> > be used for testing (contains latest development and is regularly
> > updated with fixes)?
>
> Depends on your testing capabilities, whether you can deal with every
> sub-maintainer's tree. 0-day kernel robot tests everything possible
> and this allows each sub-maintainer to receive early feedback about his
> tree. It can be around 30 Git trees, though... If you want only a few, I
> would start with:
>  - https://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git/
>  - linux-next
>  - and Russell's for-next
> (http://git.armlinux.org.uk/cgit/linux-arm.git/log/?h=for-next)

Hi Krzysztof,

We need to start with just 1 tree. What syzbot is doing is slightly
different from 0-day. 0-day is unit testing, while syzbot is fuzzing.
One caveat is that the majority of bugs won't be arm-specific, hundreds
of bugs will be just generic kernel bugs, so the tested tree needs to be
regularly updated to pick up fixes for all these generic bugs.
Otherwise the instance will be just re-hitting these known and already
fixed bugs all the time without having time to discover any new
arm-specific bugs.
I see that the for-next branch of
git://git.armlinux.org.uk/~rmk/linux-arm.git was last updated on Dec
21, so it does not even include v5.11-rc1 created on Dec 27, and we
are now on rc4.
We could use linux-next, but sometimes it's broken or pulls in bugs
that cause crashes all the time. So it's not ideal either.
Maybe we should just use the upstream tree?



> > 2. I see KASAN has just become supported for Arm, which is very
> > useful, but I can't boot a kernel with KASAN enabled. I am using
> > v5.11-rc4 and this config without KASAN boots fine:
> > https://gist.githubusercontent.com/dvyukov/12de2905f9479ba2ebdcc603c2fec79b/raw/c8fd3f5e8328259fe760ce9a57f3e6c6f5a95c8f/gistfile1.txt
>
> Maybe try first with a kernel based on vexpress defconfig. Yours looks
> closer to multi_v7 which enables a lot of stuff also as modules and
> this by itself brought up few issues (mostly with order of probes).

The first config I provided above works fine, so there is no need to
reduce it. The problem is with KASAN.

syzbot also needs a number of debugging configs, a number of configs
that allow it to run in qemu, sandboxing/isolation configs, etc. Plus it
enables configs for tested subsystems. All syzbot configs:
https://github.com/google/syzkaller/tree/master/dashboard/config/linux
are produced from the same fragments:
https://github.com/google/syzkaller/tree/master/dashboard/config/linux/bits
That's the plan for Arm as well, we don't want to do 100% custom
things for each new tree/configuration. That's not
scalable/maintainable.


> You could also try other QEMU machine (I don't know many of them, some
> time ago I was using exynos defconfig on smdkc210, but without KASAN).

vexpress-a15 seems to be the most widely used and best maintained. It
works without KASAN. Is there a reason to switch to something else?

> > using the following qemu command line:
> > qemu-system-arm \
> >   -machine vexpress-a15 -cpu max -smp 2 -m 2G \
> >   -device virtio-blk-device,drive=hd0 \
> >   -drive if=none,format=raw,id=hd0,file=image-arm -snapshot \
> >   -kernel arch/arm/boot/zImage \
> >   -dtb arch/arm/boot/dts/vexpress-v2p-ca15-tc1.dtb \
> >   -nographic \
> >   -netdev user,host=10.0.2.10,hostfwd=tcp::10022-:22,id=net0 -device
> > virtio-net-device,netdev=net0 \
> >   -append "root=/dev/vda earlycon earlyprintk=serial console=ttyAMA0
> > oops=panic panic_on_warn=1 panic=86400 vmalloc=512M"
> >
> > However, when I enable KASAN and get this config:
> > https://gist.githubusercontent.com/dvyukov/a7e3edd35cc39a1b69b11530c7d2e7ac/raw/7cbda88085d3ccd11227224a1c9964ccb8484d4e/gistfile1.txt
> >
> > kernel does not boot, qemu only prints the following output and then silence:
> > pulseaudio: set_sink_input_volume() failed
> > pulseaudio: Reason: Invalid argument
> > pulseaudio: set_sink_input_mute() failed
> > pulseaudio: Reason: Invalid argument
> >
> > What am I doing wrong?
>
> No clue but I just tried KASAN on my ARMv7 Exynos5422 board (real
> hardware) and it works (although kernel log appeared with a bigger
> delay):
>
> [    0.000000] Booting Linux on physical CPU 0x100
> [    0.000000] Linux version
> 5.11.0-rc3-next-20210115-00001-g77140600eeec (kozik@kozik-lap)
> (arm-linux-gnueabi-gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, GNU ld
> (GNU Binutils for Ubuntu) 2.34) #144 SMP PREEMPT Tue Jan 19 09:23:24
> CET 2021
> [    0.000000] CPU: ARMv7 Processor [410fc073] revision 3 (ARMv7), cr=10c5387d
> ...
> [    0.000000] kasan: Truncating shadow for memory block at
> 0x40000000-0xbea00000 to lowmem region at 0x70000000
> [    0.000000] kasan: Mapping kernel virtual memory block:
> c0000000-f0000000 at shadow: b7000000-bd000000
> [    0.000000] kasan: Mapping kernel virtual memory block:
> bf000000-c0000000 at shadow: b6e00000-b7000000
> [    0.000000] kasan: Kernel address sanitizer initialized
>
> Best regards,
> Krzysztof


* Re: Arm + KASAN + syzbot
  2021-01-19 10:04   ` Dmitry Vyukov
@ 2021-01-19 10:17     ` Linus Walleij
  2021-01-19 10:23       ` Dmitry Vyukov
  0 siblings, 1 reply; 47+ messages in thread
From: Linus Walleij @ 2021-01-19 10:17 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Arnd Bergmann, Hailong Liu, Russell King - ARM Linux, kasan-dev,
	syzkaller, Krzysztof Kozlowski, Linux ARM

On Tue, Jan 19, 2021 at 11:04 AM Dmitry Vyukov <dvyukov@google.com> wrote:

> > You could also try other QEMU machine (I don't know many of them, some
> > time ago I was using exynos defconfig on smdkc210, but without KASAN).
>
> vexpress-a15 seems to be the most widely used and more maintained. It
> works without KASAN. Is there a reason to switch to something else?

Vexpress A15 is as good as any.

It can however be compiled in two different ways depending on whether
you use LPAE or not, and the defconfig does not use LPAE.
By setting CONFIG_ARM_LPAE you more or less activate a totally
different MMU on the same machine, and those are the two
MMUs used by ARM32 systems, so I would test these two.
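
A minimal sketch of how the two variants could be built side by side.
It only prints the commands. The build directory names and the default
cross-compiler prefix are illustrative assumptions, and it expects to
be run from the top of a kernel tree:

```shell
#!/bin/sh
# Dry run: prints the commands for one non-LPAE and one LPAE build.
# Assumptions: top of a kernel tree; the cross-compiler prefix and the
# build-* directory names are illustrative, not taken from this thread.
CROSS=${CROSS_COMPILE:-arm-linux-gnueabi-}

plan_variant() {
    # $1 = build directory, $2 = scripts/config action for ARM_LPAE
    echo "make ARCH=arm CROSS_COMPILE=$CROSS O=$1 vexpress_defconfig"
    echo "./scripts/config --file $1/.config $2 ARM_LPAE"
    echo "make ARCH=arm CROSS_COMPILE=$CROSS O=$1 olddefconfig"
    echo "make ARCH=arm CROSS_COMPILE=$CROSS O=$1 zImage dtbs"
}

plan=$(plan_variant build-nolpae --disable; plan_variant build-lpae --enable)
echo "$plan"
# To actually build, feed the plan to a shell: echo "$plan" | sh -e
```

Running olddefconfig after scripts/config lets Kconfig resolve any
options that depend on the LPAE setting.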

The other interesting Qemu target that is and was used a lot is
Versatile, versatile_defconfig. This is an older ARMv5 (ARM926EJ-S)
CPU core with less memory, but the MMU should be behaving the same
as vanilla Vexpress.

Yours,
Linus Walleij


* Re: Arm + KASAN + syzbot
  2021-01-19  8:41 ` Linus Walleij
  2021-01-19  8:43   ` Linus Walleij
@ 2021-01-19 10:18   ` Dmitry Vyukov
  2021-01-19 10:27     ` Linus Walleij
  1 sibling, 1 reply; 47+ messages in thread
From: Dmitry Vyukov @ 2021-01-19 10:18 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Arnd Bergmann, Hailong Liu, Russell King - ARM Linux, kasan-dev,
	syzkaller, Krzysztof Kozlowski, Linux ARM

On Tue, Jan 19, 2021 at 9:41 AM Linus Walleij <linus.walleij@linaro.org> wrote:
> > We are considering setting up an Arm 32-bit instance on syzbot for
> > continuous testing using qemu emulation and I have several questions
> > related to that.
>
> That's interesting. I don't know much about syzbot but it reminds me
> of syzkaller.

Hi Linus,

Yes, these are related. syzkaller is a fuzzer, while syzbot is a
continuous fuzzing and bug reporting system.
It's like clang and a CI that uses clang to do continuous builds.


> > 1. Is there interest in this on your end? What git tree/branch should
> > be used for testing (contains latest development and is regularly
> > updated with fixes)?
>
> The most important would be Russell's branch I think, that is where
> the core architecture changes end up. They also land in linux-next.
>
> I think for the core developers this is the interesting tree,
> the corporate users mostly use KASAN for fuzzing their
> out-of-tree codebase and that is not of our concern. There can
> be some specific platforms we want to test but they mostly
> require real hardware because the interesting bugs tend to be
> in drivers and driver subsystems that only gets exercised on
> real hardware (not Qemu).

See my previous reply to Krzysztof re freshness of the tree.
I don't know, maybe it's just due to the winter holidays and usually it's
updated more frequently?


> > 2. I see KASAN has just become supported for Arm, which is very
> > useful, but I can't boot a kernel with KASAN enabled. I am using
> > v5.11-rc4 and this config without KASAN boots fine:
> > https://gist.githubusercontent.com/dvyukov/12de2905f9479ba2ebdcc603c2fec79b/raw/c8fd3f5e8328259fe760ce9a57f3e6c6f5a95c8f/gistfile1.txt
> > using the following qemu command line:
> > qemu-system-arm \
> >   -machine vexpress-a15 -cpu max -smp 2 -m 2G \
> >   -device virtio-blk-device,drive=hd0 \
> >   -drive if=none,format=raw,id=hd0,file=image-arm -snapshot \
> >   -kernel arch/arm/boot/zImage \
> >   -dtb arch/arm/boot/dts/vexpress-v2p-ca15-tc1.dtb \
> >   -nographic \
> >   -netdev user,host=10.0.2.10,hostfwd=tcp::10022-:22,id=net0 -device
> > virtio-net-device,netdev=net0 \
> >   -append "root=/dev/vda earlycon earlyprintk=serial console=ttyAMA0
> > oops=panic panic_on_warn=1 panic=86400 vmalloc=512M"
> >
> > However, when I enable KASAN and get this config:
> > https://gist.githubusercontent.com/dvyukov/a7e3edd35cc39a1b69b11530c7d2e7ac/raw/7cbda88085d3ccd11227224a1c9964ccb8484d4e/gistfile1.txt
> >
> > kernel does not boot, qemu only prints the following output and then silence:
> > pulseaudio: set_sink_input_volume() failed
> > pulseaudio: Reason: Invalid argument
> > pulseaudio: set_sink_input_mute() failed
> > pulseaudio: Reason: Invalid argument
> >
> > What am I doing wrong?
>
> I tried it with both KASAN_INLINE and KASAN_OUTLINE this
> morning on Torvald's tree and it works fine for me.
> I brought it up with this and it booted (takes ~30 seconds to come up
> on an i7).
>
> Here is my config:
> https://dflund.se/~triad/krad/vexpress_config.txt


See my previous reply to Krzysztof re syzbot configs. syzbot can't use
random configs.


> > 3. CONFIG_KCOV does not seem to fully work.
> > It seems to work except for when the kernel crashes, and that's the
> > most interesting scenario for us. When the kernel crashes for other
> > reasons, crash handlers re-crash in KCOV making all crashes
> > unactionable and indistinguishable.
> > Here are some samples (search for __sanitizer_cov_trace):
> > https://gist.githubusercontent.com/dvyukov/c8a7ff1c00a5223c5143fd90073f5bc4/raw/c0f4ac7fd7faad7253843584fed8620ac6006338/gistfile1.txt
> > Perhaps some additional Makefiles in arch/arm need KCOV_INSTRUMENT :=
> > n to fix this.
> > And LKDTM can be used for testing:
> > https://www.kernel.org/doc/html/latest/fault-injection/provoke-crashes.html
>
> I have never use CONFIG_KCOV really, it's yet another universe
> that I haven't looked into.

I looked up who added KCOV/Arm support to CC... it turned out to be me...

I think we need to disable KCOV instrumentation for arch/arm/mm/fault.c:
[<802b36ac>] (__sanitizer_cov_trace_const_cmp4) from [<801228c8>]
(do_DataAbort+0x60/0xe8 arch/arm/mm/fault.c:522)
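
A minimal sketch of that change, assuming the usual per-object kbuild
form (KCOV_INSTRUMENT_<object>.o := n) and that fault.o is built from
arch/arm/mm/Makefile. KCOV_APPLY is a hypothetical guard; without it
the script only reports what it would do:

```shell
#!/bin/sh
# Assumption: run from the top of a kernel tree. KCOV_APPLY is a made-up
# guard; unless it is set to 1 the Makefile is never modified.
makefile=arch/arm/mm/Makefile
line='KCOV_INSTRUMENT_fault.o := n'

if [ ! -f "$makefile" ]; then
    echo "dry run: $makefile not found; would add: $line"
elif grep -qF "$line" "$makefile"; then
    echo "$makefile already contains: $line"
elif [ "${KCOV_APPLY:-0}" = 1 ]; then
    # Append the per-object opt-out so fault.c is built without KCOV.
    printf '%s\n' "$line" >> "$makefile"
else
    echo "dry run: would append to $makefile: $line"
fi
```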


But I am more concerned about the following stacks. These happen in
the generic kernel files. The most likely reason for crashes in KCOV
would be that current is not properly set up. So presumably these arm
thunks call into common kernel code without a proper current being
set up. Other arches (x86_64, arm64) should do it properly, so maybe
it's fixable for arm as well?

[<802b3ffc>] (__sanitizer_cov_trace_pc) from [<802e12cc>]
(trace_hardirqs_off+0x14/0x120 kernel/trace/trace_preemptirq.c:76)
[<802e12b8>] (trace_hardirqs_off) from [<80100a74>]
(__dabt_svc+0x54/0xa0 arch/arm/kernel/entry-armv.S:194)

[<802b3460>] (write_comp_data) from [<802b3728>]
(__sanitizer_cov_trace_const_cmp8+0x40/0x48 kernel/kcov.c:291)
 r9:00000000 r8:8a2f00ac r7:00000000 r6:dead4ead r5:00000000 r4:00000000
[<802b36e8>] (__sanitizer_cov_trace_const_cmp8) from [<801f6cb8>]
(vprintk_func+0xf0/0x2ac kernel/printk/printk_safe.c:385)
 r7:83f885a4 r6:dead4ead r5:00000000 r4:844f2694
[<801f6bc8>] (vprintk_func) from [<8367b7b0>] (printk+0x40/0x68
kernel/printk/printk.c:2076)
 r9:8a2f0000 r8:8456390c r7:8a2f00f0 r6:00000367 r5:00000001 r4:83f885a4
[<8367b770>] (printk) from [<801228ec>] (do_DataAbort+0x84/0xe8
arch/arm/mm/fault.c:525)
 r3:dead4ead r2:00ad0000 r1:ffffffff r0:83f885a4
 r4:0000001b
[<80122868>] (do_DataAbort) from [<80100a7c>] (__dabt_svc+0x5c/0xa0
arch/arm/kernel/entry-armv.S:196)


* Re: Arm + KASAN + syzbot
  2021-01-19 10:17     ` Linus Walleij
@ 2021-01-19 10:23       ` Dmitry Vyukov
  2021-01-19 10:28         ` Linus Walleij
  0 siblings, 1 reply; 47+ messages in thread
From: Dmitry Vyukov @ 2021-01-19 10:23 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Arnd Bergmann, Hailong Liu, Russell King - ARM Linux, kasan-dev,
	syzkaller, Krzysztof Kozlowski, Linux ARM

On Tue, Jan 19, 2021 at 11:17 AM Linus Walleij <linus.walleij@linaro.org> wrote:
> > > You could also try other QEMU machine (I don't know many of them, some
> > > time ago I was using exynos defconfig on smdkc210, but without KASAN).
> >
> > vexpress-a15 seems to be the most widely used and more maintained. It
> > works without KASAN. Is there a reason to switch to something else?
>
> Vexpress A15 is as good as any.
>
> It can however be compiled in two different ways depending on whether
> you use LPAE or not, and the defconfig does not use LPAE.
> By setting CONFIG_ARM_LPAE you more or less activate a totally
> different MMU on the same machine, and those are the two
> MMUs used by ARM32 systems, so I would test these two.
>
> The other interesting Qemu target that is and was used a lot is
> Versatile, versatile_defconfig. This is an older ARMv5 (ARM926EJ-S)
> CPU core with less memory, but the MMU should be behaving the same
> as vanilla Vexpress.

That's interesting. If we have more than 1 instance in future we could
vary different aspects between them to get more combined coverage.
E.g. one could use ARM_LPAE=y while another ARM_LPAE=n.

But let's start with 1 instance running first :)

* Re: Arm + KASAN + syzbot
  2021-01-19 10:18   ` Dmitry Vyukov
@ 2021-01-19 10:27     ` Linus Walleij
  2021-01-19 10:36       ` Dmitry Vyukov
  0 siblings, 1 reply; 47+ messages in thread
From: Linus Walleij @ 2021-01-19 10:27 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Arnd Bergmann, Hailong Liu, Russell King - ARM Linux, kasan-dev,
	syzkaller, Krzysztof Kozlowski, Linux ARM

On Tue, Jan 19, 2021 at 11:18 AM Dmitry Vyukov <dvyukov@google.com> wrote:

> > Here is my config:
> > https://dflund.se/~triad/krad/vexpress_config.txt
>
> See my previous reply to Krzysztof re syzbot configs. syzbot can't use
> random configs.

What I'm using is based on vexpress_defconfig with a bunch of
stuff added on top (like activating KASAN)...

I derive my .config from vexpress_defconfig using this
Makefile:
https://dflund.se/~triad/krad/makefiles/vexpress.mak

Yours,
Linus Walleij

* Re: Arm + KASAN + syzbot
  2021-01-19 10:23       ` Dmitry Vyukov
@ 2021-01-19 10:28         ` Linus Walleij
  2021-01-19 10:53           ` Dmitry Vyukov
  0 siblings, 1 reply; 47+ messages in thread
From: Linus Walleij @ 2021-01-19 10:28 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Arnd Bergmann, Hailong Liu, Russell King - ARM Linux, kasan-dev,
	syzkaller, Krzysztof Kozlowski, Linux ARM

On Tue, Jan 19, 2021 at 11:23 AM Dmitry Vyukov <dvyukov@google.com> wrote:
> On Tue, Jan 19, 2021 at 11:17 AM Linus Walleij <linus.walleij@linaro.org> wrote:
> > > > You could also try other QEMU machine (I don't know many of them, some
> > > > time ago I was using exynos defconfig on smdkc210, but without KASAN).
> > >
> > > vexpress-a15 seems to be the most widely used and more maintained. It
> > > works without KASAN. Is there a reason to switch to something else?
> >
> > Vexpress A15 is as good as any.
> >
> > It can however be compiled in two different ways depending on whether
> > you use LPAE or not, and the defconfig does not use LPAE.
> > By setting CONFIG_ARM_LPAE you more or less activate a totally
> > different MMU on the same machine, and those are the two
> > MMUs used by ARM32 systems, so I would test these two.
> >
> > The other interesting Qemu target that is and was used a lot is
> > Versatile, versatile_defconfig. This is an older ARMv5 (ARM926EJ-S)
> > CPU core with less memory, but the MMU should be behaving the same
> > as vanilla Vexpress.
>
> That's interesting. If we have more than 1 instance in future we could
> vary different aspects between them to get more combined coverage.
> E.g. one could use ARM_LPAE=y while another ARM_LPAE=n.
>
> But let's start with 1 instance running first :)

Hm, I noticed that I was running in LPAE mode by default on Vexpress,
so I am trying non-LPAE now. Let's see what happens...

Linus

* Re: Arm + KASAN + syzbot
  2021-01-19 10:03 ` Mark Rutland
@ 2021-01-19 10:34   ` Dmitry Vyukov
  2021-01-19 10:55     ` Russell King - ARM Linux admin
  2021-01-19 13:00     ` Mark Rutland
  0 siblings, 2 replies; 47+ messages in thread
From: Dmitry Vyukov @ 2021-01-19 10:34 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Arnd Bergmann, Linus Walleij, Russell King - ARM Linux,
	kasan-dev, syzkaller, Krzysztof Kozlowski, Hailong Liu,
	Linux ARM

On Tue, Jan 19, 2021 at 11:04 AM Mark Rutland <mark.rutland@arm.com> wrote:
>
> Hi Dmitry,
>
> On Mon, Jan 18, 2021 at 05:31:36PM +0100, 'Dmitry Vyukov' via syzkaller wrote:
> > 2. I see KASAN has just become supported for Arm, which is very
> > useful, but I can't boot a kernel with KASAN enabled. I am using
> > v5.11-rc4 and this config without KASAN boots fine:
> > https://gist.githubusercontent.com/dvyukov/12de2905f9479ba2ebdcc603c2fec79b/raw/c8fd3f5e8328259fe760ce9a57f3e6c6f5a95c8f/gistfile1.txt
> > using the following qemu command line:
> > qemu-system-arm \
> >   -machine vexpress-a15 -cpu max -smp 2 -m 2G \
>
> It might be best to use `-machine virt` here instead; that way QEMU
> won't need to emulate any of the real vexpress HW, and the kernel won't
> need to waste any time poking it.

Hi Mark,

The whole point of setting up an Arm instance is to get as much
coverage as possible that we can't get on x86_64 instances. The
instance will use qemu emulation (extremely slow) and have limited
capacity. I see drivers and the associated hardware support as one of
the main such areas. That's why I tried to use vexpress-a15. And it
boots without KASAN, so presumably it can be used in general.


> IIUC with that, you also wouldn't need to provide a DTB explicitly as
> QEMU will generate one...
>
> >   -device virtio-blk-device,drive=hd0 \
> >   -drive if=none,format=raw,id=hd0,file=image-arm -snapshot \
> >   -kernel arch/arm/boot/zImage \
> >   -dtb arch/arm/boot/dts/vexpress-v2p-ca15-tc1.dtb \
>
> ... so this line could go, too.
>
> >   -nographic \
> >   -netdev user,host=10.0.2.10,hostfwd=tcp::10022-:22,id=net0 -device
> > virtio-net-device,netdev=net0 \
> >   -append "root=/dev/vda earlycon earlyprintk=serial console=ttyAMA0
> > oops=panic panic_on_warn=1 panic=86400 vmalloc=512M"
>
> [...]
>
> > 3. CONFIG_KCOV does not seem to fully work.
> > It seems to work except for when the kernel crashes, and that's the
> > most interesting scenario for us. When the kernel crashes for other
> > reasons, crash handlers re-crash in KCOV making all crashes
> > unactionable and indistinguishable.
> > Here are some samples (search for __sanitizer_cov_trace):
> > https://gist.githubusercontent.com/dvyukov/c8a7ff1c00a5223c5143fd90073f5bc4/raw/c0f4ac7fd7faad7253843584fed8620ac6006338/gistfile1.txt
>
> Most of those are all small offsets from 0, which suggests an offset is
> being added to a NULL pointer somewhere, which I suspect means
> task_struct::kcov_area is NULL. We could hack-in a check for that, and
> see if that's the case (though I can't see how from a quick scan of the
> kcov code).

My first guess would be that current itself is NULL. Accesses to
current->kcov* are well tested on other arches, including using KCOV
in interrupts, etc.

* Re: Arm + KASAN + syzbot
  2021-01-19 10:27     ` Linus Walleij
@ 2021-01-19 10:36       ` Dmitry Vyukov
  0 siblings, 0 replies; 47+ messages in thread
From: Dmitry Vyukov @ 2021-01-19 10:36 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Arnd Bergmann, Hailong Liu, Russell King - ARM Linux, kasan-dev,
	syzkaller, Krzysztof Kozlowski, Linux ARM

On Tue, Jan 19, 2021 at 11:27 AM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> On Tue, Jan 19, 2021 at 11:18 AM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> > > Here is my config:
> > > https://dflund.se/~triad/krad/vexpress_config.txt
> >
> > See my previous reply to Krzysztof re syzbot configs. syzbot can't use
> > random configs.
>
> What I'm using is based on vexpress_defconfig with a bunch of
> stuff added on top (like activating KASAN)...
>
> I derive my .config from vexpress_defconfig using this
> Makefile:
> https://dflund.se/~triad/krad/makefiles/vexpress.mak

The syzbot config I referenced is also based on vexpress_defconfig
with a bunch of stuff added on top:
https://github.com/google/syzkaller/blob/master/dashboard/config/linux/bits/arm.yml#L10-L11
(though that single fragment file does not show everything).

* Re: Arm + KASAN + syzbot
  2021-01-19 10:28         ` Linus Walleij
@ 2021-01-19 10:53           ` Dmitry Vyukov
  2021-01-19 11:05             ` Dmitry Vyukov
  0 siblings, 1 reply; 47+ messages in thread
From: Dmitry Vyukov @ 2021-01-19 10:53 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Arnd Bergmann, Hailong Liu, Russell King - ARM Linux, kasan-dev,
	syzkaller, Krzysztof Kozlowski, Linux ARM

On Tue, Jan 19, 2021 at 11:28 AM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> On Tue, Jan 19, 2021 at 11:23 AM Dmitry Vyukov <dvyukov@google.com> wrote:
> > On Tue, Jan 19, 2021 at 11:17 AM Linus Walleij <linus.walleij@linaro.org> wrote:
> > > > > You could also try other QEMU machine (I don't know many of them, some
> > > > > time ago I was using exynos defconfig on smdkc210, but without KASAN).
> > > >
> > > > vexpress-a15 seems to be the most widely used and more maintained. It
> > > > works without KASAN. Is there a reason to switch to something else?
> > >
> > > Vexpress A15 is as good as any.
> > >
> > > It can however be compiled in two different ways depending on whether
> > > you use LPAE or not, and the defconfig does not use LPAE.
> > > By setting CONFIG_ARM_LPAE you more or less activate a totally
> > > different MMU on the same machine, and those are the two
> > > MMUs used by ARM32 systems, so I would test these two.
> > >
> > > The other interesting Qemu target that is and was used a lot is
> > > Versatile, versatile_defconfig. This is an older ARMv5 (ARM926EJ-S)
> > > CPU core with less memory, but the MMU should be behaving the same
> > > as vanilla Vexpress.
> >
> > That's interesting. If we have more than 1 instance in future we could
> > vary different aspects between them to get more combined coverage.
> > E.g. one could use ARM_LPAE=y while another ARM_LPAE=n.
> >
> > But let's start with 1 instance running first :)
>
> Hm I noticed that I was running in LPAE mode by default on Vexpress
> so I try non-LPAE now. Let's see what happens...

Good point. I've tried to enable CONFIG_ARM_LPAE=y in my config with
KASAN, and it did not help. No output after 8 minutes.

* Re: Arm + KASAN + syzbot
  2021-01-19 10:34   ` Dmitry Vyukov
@ 2021-01-19 10:55     ` Russell King - ARM Linux admin
  2021-01-19 13:00     ` Mark Rutland
  1 sibling, 0 replies; 47+ messages in thread
From: Russell King - ARM Linux admin @ 2021-01-19 10:55 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Mark Rutland, Arnd Bergmann, Linus Walleij, kasan-dev, syzkaller,
	Krzysztof Kozlowski, Hailong Liu, Linux ARM

On Tue, Jan 19, 2021 at 11:34:33AM +0100, Dmitry Vyukov wrote:
> My first guess would be that current itself is NULL. Accesses to
> current->kcov* are well tested on other arches, including using KCOV
> in interrupts, etc.

There is a window in dup_task_struct() where the new thread info has
a NULL ->task pointer, but this will never be the current thread,
and so would not affect current.

If we do have a NULL current, that would cause the kernel to explode,
since a context switch to or from such a case would dereference a NULL
pointer.

So, I think your theory is highly unlikely.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

* Re: Arm + KASAN + syzbot
  2021-01-19 10:53           ` Dmitry Vyukov
@ 2021-01-19 11:05             ` Dmitry Vyukov
  2021-01-19 11:13               ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 47+ messages in thread
From: Dmitry Vyukov @ 2021-01-19 11:05 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Arnd Bergmann, Hailong Liu, Russell King - ARM Linux, kasan-dev,
	syzkaller, Krzysztof Kozlowski, Linux ARM

On Tue, Jan 19, 2021 at 11:53 AM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Tue, Jan 19, 2021 at 11:28 AM Linus Walleij <linus.walleij@linaro.org> wrote:
> >
> > On Tue, Jan 19, 2021 at 11:23 AM Dmitry Vyukov <dvyukov@google.com> wrote:
> > > On Tue, Jan 19, 2021 at 11:17 AM Linus Walleij <linus.walleij@linaro.org> wrote:
> > > > > > You could also try other QEMU machine (I don't know many of them, some
> > > > > > time ago I was using exynos defconfig on smdkc210, but without KASAN).
> > > > >
> > > > > vexpress-a15 seems to be the most widely used and more maintained. It
> > > > > works without KASAN. Is there a reason to switch to something else?
> > > >
> > > > Vexpress A15 is as good as any.
> > > >
> > > > It can however be compiled in two different ways depending on whether
> > > > you use LPAE or not, and the defconfig does not use LPAE.
> > > > By setting CONFIG_ARM_LPAE you more or less activate a totally
> > > > different MMU on the same machine, and those are the two
> > > > MMUs used by ARM32 systems, so I would test these two.
> > > >
> > > > The other interesting Qemu target that is and was used a lot is
> > > > Versatile, versatile_defconfig. This is an older ARMv5 (ARM926EJ-S)
> > > > CPU core with less memory, but the MMU should be behaving the same
> > > > as vanilla Vexpress.
> > >
> > > That's interesting. If we have more than 1 instance in future we could
> > > vary different aspects between them to get more combined coverage.
> > > E.g. one could use ARM_LPAE=y while another ARM_LPAE=n.
> > >
> > > But let's start with 1 instance running first :)
> >
> > Hm I noticed that I was running in LPAE mode by default on Vexpress
> > so I try non-LPAE now. Let's see what happens...
>
> Good point. I've tried to enable CONFIG_ARM_LPAE=y in my config with
> KASAN, and it did not help. No output after 8 minutes.

But I also spied this in your makefile:

config-earlydebug: config-base
$(CURDIR)/scripts/config --file $(config_file) \
--enable DEBUG_LL \
--enable EARLY_PRINTK \
--enable DEBUG_VEXPRESS_UART0_RS1 \

With these configs, qemu prints something more useful:

pulseaudio: set_sink_input_volume() failed
pulseaudio: Reason: Invalid argument
pulseaudio: set_sink_input_mute() failed
pulseaudio: Reason: Invalid argument
Error: invalid dtb and unrecognized/unsupported machine ID
  r1=0x000008e0, r2=0x00000000
Available machine support:
ID (hex) NAME
ffffffff Generic DT based system
ffffffff Samsung Exynos (Flattened Device Tree)
ffffffff Hisilicon Hi3620 (Flattened Device Tree)
ffffffff ARM-Versatile Express
Please check your kernel config and/or bootloader.


What does this mean? And is this affected by KASAN?... I do specify
the ARM-Versatile Express machine...

Could it be that the kernel is too large, and that this is not
supported or not properly diagnosed by qemu or the kernel?

* Re: Arm + KASAN + syzbot
  2021-01-19 11:05             ` Dmitry Vyukov
@ 2021-01-19 11:13               ` Russell King - ARM Linux admin
  2021-01-19 11:17                 ` Dmitry Vyukov
  0 siblings, 1 reply; 47+ messages in thread
From: Russell King - ARM Linux admin @ 2021-01-19 11:13 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Arnd Bergmann, Linus Walleij, kasan-dev, syzkaller,
	Krzysztof Kozlowski, Hailong Liu, Linux ARM

On Tue, Jan 19, 2021 at 12:05:01PM +0100, Dmitry Vyukov wrote:
> But I also spied this in your makefile:
> 
> config-earlydebug: config-base
> $(CURDIR)/scripts/config --file $(config_file) \
> --enable DEBUG_LL \
> --enable EARLY_PRINTK \
> --enable DEBUG_VEXPRESS_UART0_RS1 \
> 
> With these configs, qemu prints something more useful:
> 
> pulseaudio: set_sink_input_volume() failed
> pulseaudio: Reason: Invalid argument
> pulseaudio: set_sink_input_mute() failed
> pulseaudio: Reason: Invalid argument
> Error: invalid dtb and unrecognized/unsupported machine ID
>   r1=0x000008e0, r2=0x00000000
> Available machine support:
> ID (hex) NAME
> ffffffff Generic DT based system
> ffffffff Samsung Exynos (Flattened Device Tree)
> ffffffff Hisilicon Hi3620 (Flattened Device Tree)
> ffffffff ARM-Versatile Express
> Please check your kernel config and/or bootloader.
> 
> 
> What does this mean? And is this affected by KASAN?... I do specify
> the ARM-Versatile Express machine...
> 
> Can it be too large kernel size which is not supported/properly
> diagnosed by qemu/kernel?

It means that your kernel only supports DT platforms, but there was
no DT passed to the kernel (r2 is the pointer to DT). Consequently
the kernel has no idea what hardware it is running on.

I don't use qemu very much, so I can't suggest anything.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

* Re: Arm + KASAN + syzbot
  2021-01-19 11:13               ` Russell King - ARM Linux admin
@ 2021-01-19 11:17                 ` Dmitry Vyukov
  2021-01-19 11:43                   ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 47+ messages in thread
From: Dmitry Vyukov @ 2021-01-19 11:17 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Arnd Bergmann, Linus Walleij, kasan-dev, syzkaller,
	Krzysztof Kozlowski, Hailong Liu, Linux ARM

On Tue, Jan 19, 2021 at 12:13 PM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> On Tue, Jan 19, 2021 at 12:05:01PM +0100, Dmitry Vyukov wrote:
> > But I also spied this in your makefile:
> >
> > config-earlydebug: config-base
> > $(CURDIR)/scripts/config --file $(config_file) \
> > --enable DEBUG_LL \
> > --enable EARLY_PRINTK \
> > --enable DEBUG_VEXPRESS_UART0_RS1 \
> >
> > With these configs, qemu prints something more useful:
> >
> > pulseaudio: set_sink_input_volume() failed
> > pulseaudio: Reason: Invalid argument
> > pulseaudio: set_sink_input_mute() failed
> > pulseaudio: Reason: Invalid argument
> > Error: invalid dtb and unrecognized/unsupported machine ID
> >   r1=0x000008e0, r2=0x00000000
> > Available machine support:
> > ID (hex) NAME
> > ffffffff Generic DT based system
> > ffffffff Samsung Exynos (Flattened Device Tree)
> > ffffffff Hisilicon Hi3620 (Flattened Device Tree)
> > ffffffff ARM-Versatile Express
> > Please check your kernel config and/or bootloader.
> >
> >
> > What does this mean? And is this affected by KASAN?... I do specify
> > the ARM-Versatile Express machine...
> >
> > Can it be too large kernel size which is not supported/properly
> > diagnosed by qemu/kernel?
>
> It means that your kernel only supports DT platforms, but there was
> no DT passed to the kernel (r2 is the pointer to DT). Consequently
> the kernel has no idea what hardware it is running on.
>
> I don't use qemu very much, so I can't suggest anything.

I do pass a DT and it boots fine without KASAN, so this seems to be a
poor diagnostic for something else.

It seems to be due to kernel size. I enabled CONFIG_KASAN_OUTLINE=y
and CONFIG_CC_OPTIMIZE_FOR_SIZE=y and now it boots...

Almost...
Now I got the following, which will prevent it from booting with
panic_on_warn that syzbot uses.


------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at kernel/printk/printk.c:2790
register_console+0x2f4/0x3c4 kernel/printk/printk.c:2790
console 'earlycon0' already registered
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.11.0-rc4-next-20210119 #27
Hardware name: ARM-Versatile Express
Backtrace:
[<82e981d0>] (dump_backtrace) from [<82e98430>] (show_stack+0x18/0x1c
arch/arm/kernel/traps.c:252)
 r7:00000080 r6:600001d3 r5:00000000 r4:84efddc0
[<82e98418>] (show_stack) from [<82ead110>] (__dump_stack
lib/dump_stack.c:79 [inline])
[<82e98418>] (show_stack) from [<82ead110>] (dump_stack+0x9c/0xc4
lib/dump_stack.c:120)
[<82ead074>] (dump_stack) from [<8024c6bc>] (__warn+0x12c/0x174
kernel/panic.c:609)
 r7:8303c220 r6:802e5554 r5:84a03c20 r4:8303c7e0
[<8024c590>] (__warn) from [<82e99040>] (warn_slowpath_fmt+0xb8/0x114
kernel/panic.c:635)
 r10:8303c7e0 r9:00000009 r8:00000ae6 r7:802e5554 r6:8303c220 r5:84a03c20
 r4:6f940780
[<82e98f8c>] (warn_slowpath_fmt) from [<802e5554>]
(register_console+0x2f4/0x3c4 kernel/printk/printk.c:2790)
 r10:848f747e r9:848f7472 r8:830000c0 r7:84a70a20 r6:85d00dc0 r5:84a70a20
 r4:84a70a20
[<802e5260>] (register_console) from [<84808424>]
(setup_early_printk+0x24/0x34 arch/arm/kernel/early_printk.c:43)
 r10:848f747e r9:848f7472 r8:830000c0 r7:849203d8 r6:848f747e r5:848f7472
 r4:85d018e0
[<84808400>] (setup_early_printk) from [<848004e4>]
(do_early_param+0x90/0xdc init/main.c:735)
 r5:848f7472 r4:8491fc04
[<84800454>] (do_early_param) from [<8028079c>] (parse_one
kernel/params.c:153 [inline])
[<84800454>] (do_early_param) from [<8028079c>]
(parse_args+0x37c/0x460 kernel/params.c:188)
 r9:848f7472 r8:83000a00 r7:00000000 r6:848f7485 r5:848f7000 r4:84a03de0
[<80280420>] (parse_args) from [<84800ddc>]
(parse_early_options+0x38/0x48 init/main.c:745)
 r10:856ed8c0 r9:80008000 r8:000002de r7:00000000 r6:848f7404 r5:848f7000
 r4:000002de
[<84800da4>] (parse_early_options) from [<84800e64>]
(parse_early_param+0x78/0x94 init/main.c:760)
[<84800dec>] (parse_early_param) from [<848057c8>]
(setup_arch+0x250/0xc5c arch/arm/kernel/setup.c:1129)
 r7:848f7a80 r6:84a6a200 r5:848f20f8 r4:84a03f80
[<84805578>] (setup_arch) from [<84800ff0>] (start_kernel+0x7c/0x3e4
init/main.c:873)
 r10:30c5387d r9:412fc0f1 r8:88000000 r7:000008e0 r6:ffffffff r5:84a50c40
 r4:856ed000
[<84800f74>] (start_kernel) from [<00000000>] (0x0)
 r6:30c0387d r5:00000000 r4:84800334
irq event stamp: 0
hardirqs last  enabled at (0): [<00000000>] 0x0
hardirqs last disabled at (0): [<00000000>] 0x0
softirqs last  enabled at (0): [<00000000>] 0x0
softirqs last disabled at (0): [<00000000>] 0x0
random: get_random_bytes called from init_oops_id kernel/panic.c:546
[inline] with crng_init=0
random: get_random_bytes called from init_oops_id+0x2c/0x4c
kernel/panic.c:543 with crng_init=0
---[ end trace 0000000000000000 ]---

* Re: Arm + KASAN + syzbot
  2021-01-19 11:17                 ` Dmitry Vyukov
@ 2021-01-19 11:43                   ` Russell King - ARM Linux admin
  2021-01-19 12:05                     ` Dmitry Vyukov
  0 siblings, 1 reply; 47+ messages in thread
From: Russell King - ARM Linux admin @ 2021-01-19 11:43 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Arnd Bergmann, Linus Walleij, Krzysztof Kozlowski, syzkaller,
	kasan-dev, Hailong Liu, Linux ARM

On Tue, Jan 19, 2021 at 12:17:37PM +0100, Dmitry Vyukov wrote:
> On Tue, Jan 19, 2021 at 12:13 PM Russell King - ARM Linux admin
> <linux@armlinux.org.uk> wrote:
> >
> > On Tue, Jan 19, 2021 at 12:05:01PM +0100, Dmitry Vyukov wrote:
> > > But I also spied this in your makefile:
> > >
> > > config-earlydebug: config-base
> > > $(CURDIR)/scripts/config --file $(config_file) \
> > > --enable DEBUG_LL \
> > > --enable EARLY_PRINTK \
> > > --enable DEBUG_VEXPRESS_UART0_RS1 \
> > >
> > > With these configs, qemu prints something more useful:
> > >
> > > pulseaudio: set_sink_input_volume() failed
> > > pulseaudio: Reason: Invalid argument
> > > pulseaudio: set_sink_input_mute() failed
> > > pulseaudio: Reason: Invalid argument
> > > Error: invalid dtb and unrecognized/unsupported machine ID
> > >   r1=0x000008e0, r2=0x00000000
> > > Available machine support:
> > > ID (hex) NAME
> > > ffffffff Generic DT based system
> > > ffffffff Samsung Exynos (Flattened Device Tree)
> > > ffffffff Hisilicon Hi3620 (Flattened Device Tree)
> > > ffffffff ARM-Versatile Express
> > > Please check your kernel config and/or bootloader.
> > >
> > >
> > > What does this mean? And is this affected by KASAN?... I do specify
> > > the ARM-Versatile Express machine...
> > >
> > > Can it be too large kernel size which is not supported/properly
> > > diagnosed by qemu/kernel?
> >
> > It means that your kernel only supports DT platforms, but there was
> > no DT passed to the kernel (r2 is the pointer to DT). Consequently
> > the kernel has no idea what hardware it is running on.
> >
> > I don't use qemu very much, so I can't suggest anything.
> 
> I do pass DT and it boots fine w/o KASAN, so it seems to be poor
> diagnostics of something else.

It is the best we can do at that time. Consider yourself lucky that you
can even get _that_ message since the kernel has no clue what hardware
is available, and there is no standardised hardware.

All that the kernel knows at this point is that (1) the machine ID in
r1 does not match anything the kernel knows about (which are all DT
platforms), and (2) r2 is NULL, meaning no DT was passed to the
decompressed kernel.

There is no further information that the kernel knows. I suppose we
could hexdump random bits of memory space through the serial port or
whatever, but that would be very random.

I'm not sure what else you think the kernel could do at this point.

> It seems to be due to kernel size. I enabled CONFIG_KASAN_OUTLINE=y
> and CONFIG_CC_OPTIMIZE_FOR_SIZE=y and now it boots...

So, likely the DT was obliterated. How are you passing the DT? If
you are passing it via qemu, then qemu's placement of DT is too close
to the kernel.
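
As a rough illustration of how a larger kernel could overwrite the DT,
the following Python sketch checks whether two memory ranges overlap.
The addresses and sizes are hypothetical values chosen for the example;
they are not taken from qemu or from this particular boot.

```python
MiB = 1024 * 1024

def ranges_overlap(start_a: int, size_a: int, start_b: int, size_b: int) -> bool:
    """Return True if [start_a, start_a + size_a) intersects [start_b, start_b + size_b)."""
    return start_a < start_b + size_b and start_b < start_a + size_a

# Hypothetical layout: every number below is an assumption for illustration.
KERNEL_START = 0x80008000   # assumed start of the decompressed kernel
DTB_START = 0x88000000      # assumed address where the loader placed the DTB
DTB_SIZE = 64 * 1024        # assumed DTB size

# A smaller kernel image ends before the DTB; a much larger one runs over it.
small_kernel = ranges_overlap(KERNEL_START, 64 * MiB, DTB_START, DTB_SIZE)
large_kernel = ranges_overlap(KERNEL_START, 160 * MiB, DTB_START, DTB_SIZE)
print(small_kernel, large_kernel)
```

With these assumed numbers only the larger image reaches the DTB, which
would be consistent with a size reduction making the boot work again.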

> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 0 at kernel/printk/printk.c:2790
> register_console+0x2f4/0x3c4 kernel/printk/printk.c:2790
> console 'earlycon0' already registered

Two "earlycons" or whatever the early console kernel parameter is?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

* Re: Arm + KASAN + syzbot
  2021-01-19 11:43                   ` Russell King - ARM Linux admin
@ 2021-01-19 12:05                     ` Dmitry Vyukov
  2021-01-19 12:36                       ` Russell King - ARM Linux admin
  2021-01-19 13:22                       ` Linus Walleij
  0 siblings, 2 replies; 47+ messages in thread
From: Dmitry Vyukov @ 2021-01-19 12:05 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Arnd Bergmann, Linus Walleij, Krzysztof Kozlowski, syzkaller,
	kasan-dev, Hailong Liu, Linux ARM

On Tue, Jan 19, 2021 at 12:43 PM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> On Tue, Jan 19, 2021 at 12:17:37PM +0100, Dmitry Vyukov wrote:
> > On Tue, Jan 19, 2021 at 12:13 PM Russell King - ARM Linux admin
> > <linux@armlinux.org.uk> wrote:
> > >
> > > On Tue, Jan 19, 2021 at 12:05:01PM +0100, Dmitry Vyukov wrote:
> > > > But I also spied this in your makefile:
> > > >
> > > > config-earlydebug: config-base
> > > > $(CURDIR)/scripts/config --file $(config_file) \
> > > > --enable DEBUG_LL \
> > > > --enable EARLY_PRINTK \
> > > > --enable DEBUG_VEXPRESS_UART0_RS1 \
> > > >
> > > > With these configs, qemu prints something more useful:
> > > >
> > > > pulseaudio: set_sink_input_volume() failed
> > > > pulseaudio: Reason: Invalid argument
> > > > pulseaudio: set_sink_input_mute() failed
> > > > pulseaudio: Reason: Invalid argument
> > > > Error: invalid dtb and unrecognized/unsupported machine ID
> > > >   r1=0x000008e0, r2=0x00000000
> > > > Available machine support:
> > > > ID (hex) NAME
> > > > ffffffff Generic DT based system
> > > > ffffffff Samsung Exynos (Flattened Device Tree)
> > > > ffffffff Hisilicon Hi3620 (Flattened Device Tree)
> > > > ffffffff ARM-Versatile Express
> > > > Please check your kernel config and/or bootloader.
> > > >
> > > >
> > > > What does this mean? And is this affected by KASAN?... I do specify
> > > > the ARM-Versatile Express machine...
> > > >
> > > > Can it be too large kernel size which is not supported/properly
> > > > diagnosed by qemu/kernel?
> > >
> > > It means that your kernel only supports DT platforms, but there was
> > > no DT passed to the kernel (r2 is the pointer to DT). Consequently
> > > the kernel has no idea what hardware it is running on.
> > >
> > > I don't use qemu very much, so I can't suggest anything.
> >
> > I do pass DT and it boots fine w/o KASAN, so it seems to be poor
> > diagnostics of something else.
>
> It is the best we can do at that time. Consider yourself lucky that you
> can even get _that_ message since the kernel has no clue what hardware
> is available, and there is no standardised hardware.
>
> All that the kernel knows at this point is that (1) the machine ID in
> r1 does not match anything the kernel knows about (which are all DT
> platforms), and r2 is NULL, meaning no DT was passed to the
> decompressed kernel.
>
> There is no further information that the kernel knows. I suppose we
> could hexdump random bits of memory space through the serial port or
> whatever, but that would be very random.
>
> I'm not sure what else you think the kernel could do at this point.
>
> > It seems to be due to kernel size. I enabled CONFIG_KASAN_OUTLINE=y
> > and CONFIG_CC_OPTIMIZE_FOR_SIZE=y and now it boots...
>
> So, likely the DT was obliterated. How are you passing the DT? If
> you are passing it via qemu, then qemu's placement of DT is too close
> to the kernel.

Yes, I used the qemu -dtb flag.

I tried to use CONFIG_ARM_APPENDED_DTB because it looks like a very
nice option. However, I couldn't make it work.
I enabled:
CONFIG_ARM_APPENDED_DTB=y
CONFIG_ARM_ATAG_DTB_COMPAT=y
# CONFIG_ARM_ATAG_DTB_COMPAT_CMDLINE_FROM_BOOTLOADER is not set
CONFIG_ARM_ATAG_DTB_COMPAT_CMDLINE_EXTEND=y
and removed the qemu -dtb flag, and now I see:

Error: invalid dtb and unrecognized/unsupported machine ID
  r1=0x000008e0, r2=0x80000100
  r2[]=05 00 00 00 01 00 41 54 01 00 00 00 00 10 00 00
Available machine support:

ID (hex) NAME
ffffffff Generic DT based system
ffffffff Samsung Exynos (Flattened Device Tree)
ffffffff Hisilicon Hi3620 (Flattened Device Tree)
ffffffff ARM-Versatile Express

Please check your kernel config and/or bootloader.

* Re: Arm + KASAN + syzbot
  2021-01-19 12:05                     ` Dmitry Vyukov
@ 2021-01-19 12:36                       ` Russell King - ARM Linux admin
  2021-01-19 18:57                         ` Dmitry Vyukov
  2021-01-19 13:22                       ` Linus Walleij
  1 sibling, 1 reply; 47+ messages in thread
From: Russell King - ARM Linux admin @ 2021-01-19 12:36 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Arnd Bergmann, Linus Walleij, kasan-dev, syzkaller,
	Krzysztof Kozlowski, Hailong Liu, Linux ARM

On Tue, Jan 19, 2021 at 01:05:11PM +0100, Dmitry Vyukov wrote:
> Yes, I used the qemu -dtb flag.
> 
> I tried to use CONFIG_ARM_APPENDED_DTB because it looks like a very
> nice option. However, I couldn't make it work.
> I enabled:
> CONFIG_ARM_APPENDED_DTB=y
> CONFIG_ARM_ATAG_DTB_COMPAT=y
> # CONFIG_ARM_ATAG_DTB_COMPAT_CMDLINE_FROM_BOOTLOADER is not set
> CONFIG_ARM_ATAG_DTB_COMPAT_CMDLINE_EXTEND=y
> and removed qemu -dtb flag and I see:
> 
> Error: invalid dtb and unrecognized/unsupported machine ID
>   r1=0x000008e0, r2=0x80000100
>   r2[]=05 00 00 00 01 00 41 54 01 00 00 00 00 10 00 00

Right, r2 now doesn't point at valid DT, but points to an ATAG list.

The decompressor should notice that, and fix up the appended DTB.

I assume you concatenated the zImage and the appropriate DTB and
passed _that_ as the kernel to qemu?
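
For reference, the r2[] dump above can be decoded by hand. A minimal sketch, assuming the dump is the start of a little-endian ATAG list (a header of size-in-words followed by a tag, where ATAG_CORE is 0x54410001); a flattened device tree would instead start with the bytes d0 0d fe ed. The helper name decode_r2 is made up for illustration:

```shell
# Classify the first eight bytes that r2 points at.
# $1..$8: the bytes from the r2[] dump, as lowercase two-digit hex values.
decode_r2() {
    # Reassemble two little-endian 32-bit words: ATAG header size and tag.
    size=$(( 0x$4$3$2$1 ))
    tag=$(( 0x$8$7$6$5 ))
    if [ "$1$2$3$4" = "d00dfeed" ]; then
        # FDT magic, stored big-endian at the start of every DTB
        echo "dtb"
    elif [ "$tag" -eq $(( 0x54410001 )) ]; then
        # ATAG_CORE, the mandatory first tag of an ATAG list
        echo "atag_core size=$size"
    else
        echo "unknown"
    fi
}

decode_r2 05 00 00 00 01 00 41 54
```

For the bytes in the report this prints "atag_core size=5", which is consistent with r2 pointing at an ATAG list rather than a DTB.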

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Arm + KASAN + syzbot
  2021-01-19 10:34   ` Dmitry Vyukov
  2021-01-19 10:55     ` Russell King - ARM Linux admin
@ 2021-01-19 13:00     ` Mark Rutland
  1 sibling, 0 replies; 47+ messages in thread
From: Mark Rutland @ 2021-01-19 13:00 UTC (permalink / raw)
  To: 'Dmitry Vyukov' via syzkaller
  Cc: Arnd Bergmann, Linus Walleij, Russell King - ARM Linux,
	kasan-dev, Krzysztof Kozlowski, Hailong Liu, Linux ARM

On Tue, Jan 19, 2021 at 11:34:33AM +0100, 'Dmitry Vyukov' via syzkaller wrote:
> On Tue, Jan 19, 2021 at 11:04 AM Mark Rutland <mark.rutland@arm.com> wrote:
> > On Mon, Jan 18, 2021 at 05:31:36PM +0100, 'Dmitry Vyukov' via syzkaller wrote:
> > It might be best to use `-machine virt` here instead; that way QEMU
> > won't need to emulate any of the real vexpress HW, and the kernel won't
> > need to waste any time poking it.
> 
> Hi Mark,
> 
> The whole point of setting up an Arm instance is getting as much
> coverage we can't get on x86_64 instances as possible. The instance
> will use qemu emulation (extremely slow) and limited capacity.
> I see some drivers and associated hardware support as one of the main
> such areas. That's why I tried to use vexpress-a15. And it boots
> without KASAN, so presumably it can be used in general.

Fair enough.

I had assumed that your first aim would be to cover the arch code shared
across all arm platforms, to flush out any big/common problems first,
for which the virt platform is a good start, and has worked quite well
for arm64.

[...]

> > > 3. CONFIG_KCOV does not seem to fully work.
> > > It seems to work except for when the kernel crashes, and that's the
> > > most interesting scenario for us. When the kernel crashes for other
> > > reasons, crash handlers re-crash in KCOV making all crashes
> > > unactionable and indistinguishable.
> > > Here are some samples (search for __sanitizer_cov_trace):
> > > https://gist.githubusercontent.com/dvyukov/c8a7ff1c00a5223c5143fd90073f5bc4/raw/c0f4ac7fd7faad7253843584fed8620ac6006338/gistfile1.txt
> >
> > Most of those are all small offsets from 0, which suggests an offset is
> > being added to a NULL pointer somewhere, which I suspect means
> > task_struct::kcov_area is NULL. We could hack-in a check for that, and
> > see if that's the case (though I can't see how from a quick scan of the
> > kcov code).
> 
> My first guess would be that current itself is NULL.

I think if that were to happen (which'd imply corruption of thread_info)
the fault handling and logging would also blow up, so I suspect this
isn't the case.

Do you have a relevant vmlinux to hand? With that we could figure out
which access is faulting, how the address is being generated, and where
the bogus address is coming from, without having to guess. :)

> Accesses to current->kcov* are well tested on other arches, including
> using KCOV in interrupts, etc.

While that's generally true, architectures differ in a number of ways
that can affect this (e.g. how the vmalloc area is faulted, what
precisely is preemptible/interruptible), and we had to make preparatory
changes to make KCOV work on arm even though it was working perfectly
fine on arm64 and x86_64, e.g.

* c9484b986ef03492 ("kcov: ensure irq code sees a valid area")
* dc55daff9040a90a ("kcov: prefault the kcov_area")
* 0ed557aa813922f6 ("sched/core / kcov: avoid kcov_area during task switch")

... so I don't think we can rule out the possibility of a latent issue
here, even if we haven't triggered it elsewhere.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Arm + KASAN + syzbot
  2021-01-19 12:05                     ` Dmitry Vyukov
  2021-01-19 12:36                       ` Russell King - ARM Linux admin
@ 2021-01-19 13:22                       ` Linus Walleij
  1 sibling, 0 replies; 47+ messages in thread
From: Linus Walleij @ 2021-01-19 13:22 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Arnd Bergmann, Hailong Liu, Russell King - ARM Linux admin,
	Krzysztof Kozlowski, syzkaller, kasan-dev, Linux ARM

Hi Dmitry,

I created a minimal diff to vexpress_defconfig and it boils down to
this:

+CONFIG_SLUB_DEBUG_ON=y
+CONFIG_KASAN=y

This is really all I do!

On Tue, Jan 19, 2021 at 1:05 PM Dmitry Vyukov <dvyukov@google.com> wrote:

> Yes, I used the qemu -dtb flag.

I'm using that too and WorksForMe :/

> Error: invalid dtb and unrecognized/unsupported machine ID
>   r1=0x000008e0, r2=0x80000100
>   r2[]=05 00 00 00 01 00 41 54 01 00 00 00 00 10 00 00
> Available machine support:
>
> ID (hex) NAME
> ffffffff Generic DT based system
> ffffffff Samsung Exynos (Flattened Device Tree)
> ffffffff Hisilicon Hi3620 (Flattened Device Tree)
> ffffffff ARM-Versatile Express
>
> Please check your kernel config and/or bootloader.

Appended DTB works fine for me too, just cat foo.dtb >> zImage

You have to use a compressed kernel for appended DTB to work
though.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Arm + KASAN + syzbot
  2021-01-19 12:36                       ` Russell King - ARM Linux admin
@ 2021-01-19 18:57                         ` Dmitry Vyukov
  2021-01-19 19:48                           ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 47+ messages in thread
From: Dmitry Vyukov @ 2021-01-19 18:57 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Arnd Bergmann, Linus Walleij, kasan-dev, syzkaller,
	Krzysztof Kozlowski, Hailong Liu, Linux ARM

On Tue, Jan 19, 2021 at 1:37 PM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> On Tue, Jan 19, 2021 at 01:05:11PM +0100, Dmitry Vyukov wrote:
> > Yes, I used the qemu -dtb flag.
> >
> > I tried to use CONFIG_ARM_APPENDED_DTB because it looks like a very
> > nice option. However, I couldn't make it work.
> > I enabled:
> > CONFIG_ARM_APPENDED_DTB=y
> > CONFIG_ARM_ATAG_DTB_COMPAT=y
> > # CONFIG_ARM_ATAG_DTB_COMPAT_CMDLINE_FROM_BOOTLOADER is not set
> > CONFIG_ARM_ATAG_DTB_COMPAT_CMDLINE_EXTEND=y
> > and removed qemu -dtb flag and I see:
> >
> > Error: invalid dtb and unrecognized/unsupported machine ID
> >   r1=0x000008e0, r2=0x80000100
> >   r2[]=05 00 00 00 01 00 41 54 01 00 00 00 00 10 00 00
>
> Right, r2 now doesn't point at valid DT, but points to an ATAG list.
>
> The decompressor should notice that, and fix up the appended DTB.
>
> I assume you concatenated the zImage and the appropriate DTB and
> passed _that_ as the kernel to qemu?

Mkay, I didn't. I assumed kbuild would do this for me.

Appending dtb works, but not completely. I did:

cp arch/arm/boot/zImage arch/arm/boot/zImage.dtb
cat arch/arm/boot/dts/vexpress-v2p-ca15-tc1.dtb >> arch/arm/boot/zImage.dtb

Now I have:
ls -l arch/arm/boot/zImage* arch/arm/boot/dts/vexpress-v2p-ca15-tc1.dtb
-rw-r----- 1 dvyukov primarygroup    13209 Jan 14 13:41 arch/arm/boot/dts/vexpress-v2p-ca15-tc1.dtb
-rwxr-x--- 1 dvyukov primarygroup 33712008 Jan 19 16:55 arch/arm/boot/zImage
-rwxr-x--- 1 dvyukov primarygroup 33725217 Jan 19 18:57 arch/arm/boot/zImage.dtb
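
The sizes in the listing add up (33712008 + 13209 = 33725217), so the DTB is appended in full. A further sanity check, sketched below under the assumption that a valid DTB starts with the magic bytes d0 0d fe ed, is to look at the first four bytes that follow the original zImage. The function name appended_dtb_magic is hypothetical:

```shell
# Print, as hex, the first four bytes that follow the original zImage
# inside the combined file.
# $1 = original zImage, $2 = zImage with the DTB appended
appended_dtb_magic() {
    zsize=$(wc -c < "$1")
    # tail -c +N starts output at byte N (1-based), i.e. right after zImage
    tail -c +"$(( zsize + 1 ))" "$2" | head -c 4 | od -An -tx1 | tr -d ' \n'
}
```

Called as "appended_dtb_magic arch/arm/boot/zImage arch/arm/boot/zImage.dtb", it should print d00dfeed if the append went as intended.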

Using "-kernel arch/arm/boot/zImage -dtb
arch/arm/boot/dts/vexpress-v2p-ca15-tc1.dtb" fully works.
Using just "-kernel arch/arm/boot/zImage" does not work: no output
from qemu whatsoever (expected).
But using just "-kernel arch/arm/boot/zImage.dtb" gives an interesting
effect. The kernel starts booting, I see console output up to late init
stages, but then it can't find the root device.
So appended dtb works... but only halfway. Are the names of block
devices something that's controlled by the dtb?

[   89.140285][    T1] VFS: Cannot open root device "vda" or unknown-block(0,0): error -6
[   89.144547][    T1] Please append a correct "root=" boot option; here are the available partitions:
[   89.146058][    T1] 0100            4096 ram0
[   89.146295][    T1]  (driver?)
[   89.147537][    T1] 0101            4096 ram1
[   89.147740][    T1]  (driver?)
[   89.148948][    T1] 0102            4096 ram2
[   89.149150][    T1]  (driver?)
[   89.150296][    T1] 0103            4096 ram3
[   89.150497][    T1]  (driver?)
[   89.152714][    T1] 0104            4096 ram4
[   89.152920][    T1]  (driver?)
[   89.154198][    T1] 0105            4096 ram5
[   89.154401][    T1]  (driver?)
[   89.155609][    T1] 0106            4096 ram6
[   89.155811][    T1]  (driver?)
[   89.157020][    T1] 0107            4096 ram7
[   89.157221][    T1]  (driver?)
[   89.158507][    T1] 0108            4096 ram8
[   89.158708][    T1]  (driver?)
[   89.159907][    T1] 0109            4096 ram9
[   89.160109][    T1]  (driver?)
[   89.163842][    T1] 010a            4096 ram10
[   89.164055][    T1]  (driver?)
[   89.165300][    T1] 010b            4096 ram11
[   89.165502][    T1]  (driver?)
[   89.166705][    T1] 010c            4096 ram12
[   89.166906][    T1]  (driver?)
[   89.168131][    T1] 010d            4096 ram13
[   89.168341][    T1]  (driver?)
[   89.169551][    T1] 010e            4096 ram14
[   89.169753][    T1]  (driver?)
[   89.170957][    T1] 010f            4096 ram15
[   89.172047][    T1]  (driver?)
[   89.175569][    T1] 1f00          131072 mtdblock0
[   89.175801][    T1]  (driver?)
[   89.177051][    T1] 1f01           32768 mtdblock1
[   89.177256][    T1]  (driver?)
[   89.178481][    T1] 1f02             128 mtdblock2
[   89.178685][    T1]  (driver?)


Just in case, that's v5.11-rc4 with this config:
https://gist.githubusercontent.com/dvyukov/aeb69235ff37a3d48c1a8a74c2fad162/raw/b37273ba14306d4ca2e2fffc07af41c759e092b7/gistfile1.txt
and this qemu command line:

qemu-system-arm \
  -machine vexpress-a15 -cpu max -smp 2 -m 2G \
  -device virtio-blk-device,drive=hd0 \
  -drive if=none,format=raw,id=hd0,file=image-arm -snapshot \
  -kernel arch/arm/boot/zImage.dtb \
  -nographic \
  -netdev user,host=10.0.2.10,hostfwd=tcp::10022-:22,id=net0 \
  -device virtio-net-device,netdev=net0 \
  -append "earlyprintk=serial oops=panic panic_on_warn=1 \
nmi_watchdog=panic panic=86400 net.ifnames=0 \
sysctl.kernel.hung_task_all_cpu_backtrace=1 ima_policy=tcb \
kvm-intel.nested=1 nf-conntrack-ftp.ports=20000 \
nf-conntrack-tftp.ports=20000 nf-conntrack-sip.ports=20000 \
nf-conntrack-irc.ports=20000 nf-conntrack-sane.ports=20000 \
vivid.n_devs=16 vivid.multiplanar=1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2 \
netrom.nr_ndevs=16 rose.rose_ndevs=16 spec_store_bypass_disable=prctl \
numa=fake=2 nopcid dummy_hcd.num=8 binder.debug_mask=0 \
rcupdate.rcu_expedited=1 root=/dev/vda console=ttyAMA0 vmalloc=512M \
watchdog_thresh=165 workqueue.watchdog_thresh=420"

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Arm + KASAN + syzbot
  2021-01-19 18:57                         ` Dmitry Vyukov
@ 2021-01-19 19:48                           ` Russell King - ARM Linux admin
  2021-01-21 13:14                             ` Russell King - ARM Linux admin
  2021-01-21 13:59                             ` Dmitry Vyukov
  0 siblings, 2 replies; 47+ messages in thread
From: Russell King - ARM Linux admin @ 2021-01-19 19:48 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Arnd Bergmann, Linus Walleij, kasan-dev, syzkaller,
	Krzysztof Kozlowski, Hailong Liu, Linux ARM

On Tue, Jan 19, 2021 at 07:57:16PM +0100, Dmitry Vyukov wrote:
> Using "-kernel arch/arm/boot/zImage -dtb
> arch/arm/boot/dts/vexpress-v2p-ca15-tc1.dtb" fully works.

Good.

> Using just "-kernel arch/arm/boot/zImage" does not work, not output
> from qemu whatsoever (expected).

Yep.

> But using just "-kernel arch/arm/boot/zImage.dtb" gives an interesting
> effect. Kernel starts booting, I see console output up to late init
> stages, but then it can't find the root device.
> So appended dtb works... but only in half. Is names of block devices
> something that's controlled by dtb?

My knowledge about this is limited to qemu being used for KVM.

Firstly, there are no block devices except for MTD, USB, or CF
based block devices in the Versatile Express hardware. So, the DTB
contains no block devices.

In your first case above, it is likely that QEMU modifies the passed
DTB to add PCIe devices to describe a virtio block device.

In this case, because QEMU has no visibility of the appended DTB, it
can't modify it, so the kernel only knows about devices found on the
real hardware. Hence, any of the "special" virtio devices that QEMU
uses likely won't be found.

I'm not sure how QEMU adds those. You're probably in a better position
than I am to check: boot using your first method, grab a copy of the DTB
that the booted kernel used from /sys/firmware/fdt, and use dtc to turn
it back into a dts and see what the changes are.

I suspect you'll find that a new PCIe controller has been added
by QEMU, behind which will be a load of virtio devices for things like
network and the "vda" block device.
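
The check suggested above could look roughly like the following. This is a sketch, assuming dtc is installed in the guest (or that /sys/firmware/fdt has been copied to a machine where it is); the helper name dump_virtio_nodes is made up for illustration:

```shell
# Decompile a device tree blob back to source and print only the lines
# that mention virtio.
# $1 = path to a DTB (on the booted guest: /sys/firmware/fdt)
dump_virtio_nodes() {
    # -I dtb: input is a binary blob; -O dts: emit device tree source
    dtc -I dtb -O dts "$1" 2>/dev/null | grep -i virtio
}
```

Run once against /sys/firmware/fdt from a kernel booted with -dtb and once against the unmodified vexpress-v2p-ca15-tc1.dtb, the difference in output should show which nodes QEMU added, and on which bus.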

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Arm + KASAN + syzbot
  2021-01-19 19:48                           ` Russell King - ARM Linux admin
@ 2021-01-21 13:14                             ` Russell King - ARM Linux admin
  2021-01-21 13:49                               ` Dmitry Vyukov
  2021-01-21 13:59                             ` Dmitry Vyukov
  1 sibling, 1 reply; 47+ messages in thread
From: Russell King - ARM Linux admin @ 2021-01-21 13:14 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Arnd Bergmann, Linus Walleij, Krzysztof Kozlowski, syzkaller,
	kasan-dev, Hailong Liu, Linux ARM

On Tue, Jan 19, 2021 at 07:48:27PM +0000, Russell King - ARM Linux admin wrote:
> My knowledge about this is limited to qemu being used for KVM.
> 
> Firstly, there are no block devices except for MTD, USB, or CF
> based block devices in the Versatile Express hardware. So, the DTB
> contains no block devices.
> 
> In your first case above, it is likely that QEMU modifies the passed
> DTB to add PCIe devices to describe a virtio block device.
> 
> In this case, because QEMU has no visibility of the appended DTB, it
> can't modify it, so the kernel only knows about devices found on the
> real hardware. Hence, any of the "special" virtio devices that QEMU
> use likely won't be found.
> 
> I'm not sure how QEMU adds those (you're probably in a better position
> than I to boot using your first method, grab a copy of the DTB that
> the booted kernel used from /sys/firmware/fdt, and use dtc to turn it
> back into a dts and see what the changes are.
> 
> I suspect you'll find that there's a new PCIe controller been added
> by QEMU, behind which will be a load of virtio devices for things like
> network and the "vda" block device.

It may also be of relevance that 5.9 + a revert of the font changes
boots for me under KVM, but 5.10 does not.

The font changes were:
6735b4632def Fonts: Support FONT_EXTRA_WORDS macros for built-in fonts

5.10-rc1 similarly does not, but bisecting that brings me to:
316cdaa1158a net: add option to not create fall-back tunnels in root-ns as well

which seems entirely unrelated, and looks like a false outcome.

I've tried going back to 5.10 and turning off CONFIG_STRICT_KERNEL_RWX.
Still doesn't boot.

I've tried reverting the changes to the decompressor between 5.9 and
5.10. Still doesn't boot.

Asking for a memory dump in ELF coredump format of the guest doesn't give
anything useful - I can see that the kernel has been decompressed, but
the BSS is completely uninitialised. It looks like the LPAE page tables
have been initialised.

The PC value in the ELF coredump seems to be spinning through a large
amount of memory (physical address) and the CPSR is 0x197, which
suggests it's taken an abort without any vectors set up.
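
For reference, a sketch of how 0x197 decodes, assuming the standard ARMv7 CPSR layout (mode in bits [4:0], T in bit 5, and the F, I and A mask bits in bits 6, 7 and 8). The helper name decode_cpsr is only for illustration:

```shell
# Decode the mode field and the A/I/F/T bits of an ARM CPSR value.
# $1 = CPSR value, e.g. 0x197
decode_cpsr() {
    cpsr=$(( $1 ))
    mode=$(( cpsr & 0x1f ))
    case $mode in
        16) name=usr ;;  # 0x10
        17) name=fiq ;;  # 0x11
        18) name=irq ;;  # 0x12
        19) name=svc ;;  # 0x13
        23) name=abt ;;  # 0x17
        27) name=und ;;  # 0x1b
        31) name=sys ;;  # 0x1f
        *)  name=other ;;
    esac
    printf 'mode=%s A=%d I=%d F=%d T=%d\n' "$name" \
        $(( (cpsr >> 8) & 1 )) $(( (cpsr >> 7) & 1 )) \
        $(( (cpsr >> 6) & 1 )) $(( (cpsr >> 5) & 1 ))
}

decode_cpsr 0x197
```

This prints "mode=abt A=1 I=1 F=0 T=0": abort mode, ARM state, with IRQs and asynchronous aborts masked.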

I'm currently struggling to find a way to debug what's going on.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Arm + KASAN + syzbot
  2021-01-21 13:14                             ` Russell King - ARM Linux admin
@ 2021-01-21 13:49                               ` Dmitry Vyukov
  2021-01-21 14:04                                 ` Arnd Bergmann
  0 siblings, 1 reply; 47+ messages in thread
From: Dmitry Vyukov @ 2021-01-21 13:49 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Arnd Bergmann, Linus Walleij, Krzysztof Kozlowski, syzkaller,
	kasan-dev, Hailong Liu, Linux ARM

On Thu, Jan 21, 2021 at 2:14 PM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> On Tue, Jan 19, 2021 at 07:48:27PM +0000, Russell King - ARM Linux admin wrote:
> > My knowledge about this is limited to qemu being used for KVM.
> >
> > Firstly, there are no block devices except for MTD, USB, or CF
> > based block devices in the Versatile Express hardware. So, the DTB
> > contains no block devices.
> >
> > In your first case above, it is likely that QEMU modifies the passed
> > DTB to add PCIe devices to describe a virtio block device.
> >
> > In this case, because QEMU has no visibility of the appended DTB, it
> > can't modify it, so the kernel only knows about devices found on the
> > real hardware. Hence, any of the "special" virtio devices that QEMU
> > use likely won't be found.
> >
> > I'm not sure how QEMU adds those (you're probably in a better position
> > than I to boot using your first method, grab a copy of the DTB that
> > the booted kernel used from /sys/firmware/fdt, and use dtc to turn it
> > back into a dts and see what the changes are.
> >
> > I suspect you'll find that there's a new PCIe controller been added
> > by QEMU, behind which will be a load of virtio devices for things like
> > network and the "vda" block device.
>
> It may also be of relevance that 5.9 + a revert of the font changes
> boots for me under KVM, but 5.10 does not.
>
> The font changes were:
> 6735b4632def Fonts: Support FONT_EXTRA_WORDS macros for built-in fonts
>
> 5.10-rc1 similarly does not, but bisecting that brings me to:
> 316cdaa1158a net: add option to not create fall-back tunnels in root-ns as well
>
> which seems entirely unrelated, and looks like a false outcome.
>
> I've tried going back to 5.10 and turning off CONFIG_STRICT_KERNEL_RWX.
> Still doesn't boot.
>
> I've tried reverting the changes to the decompressor between 5.9 and
> 5.10. Still doesn't boot.
>
> Asking for a memory dump in ELF coredump format of the guest doesn't give
> anything useful - I can see that the kernel has been decompressed, but
> the BSS is completely uninitialised. It looks like the LPAE page tables
> have been initialised.
>
> The PC value in the ELF coredump seems to be spinning through a large
> amount of memory (physical address) and the CPSR is 0x197, which
> suggests it's taken an abort without any vectors setup.
>
> I'm currently struggling to find a way to debug what's going on.

I wonder if qemu has some kind of tracing that may be useful in such cases.
Some googling turns up the following, which suggests qemu can produce a
trace of all PCs (a reasonable feature to have); that may show where
things go wrong:
https://rwmj.wordpress.com/2016/03/17/tracing-qemu-guest-execution/
https://github.com/qemu/qemu/blob/master/docs/devel/tracing.txt
But I have never used such heavy-weight artillery myself.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Arm + KASAN + syzbot
  2021-01-19 19:48                           ` Russell King - ARM Linux admin
  2021-01-21 13:14                             ` Russell King - ARM Linux admin
@ 2021-01-21 13:59                             ` Dmitry Vyukov
  2021-01-21 14:52                               ` Linus Walleij
  1 sibling, 1 reply; 47+ messages in thread
From: Dmitry Vyukov @ 2021-01-21 13:59 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Arnd Bergmann, Linus Walleij, kasan-dev, syzkaller,
	Krzysztof Kozlowski, Hailong Liu, Linux ARM

On Tue, Jan 19, 2021 at 8:48 PM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> On Tue, Jan 19, 2021 at 07:57:16PM +0100, Dmitry Vyukov wrote:
> > Using "-kernel arch/arm/boot/zImage -dtb
> > arch/arm/boot/dts/vexpress-v2p-ca15-tc1.dtb" fully works.
>
> Good.
>
> > Using just "-kernel arch/arm/boot/zImage" does not work, not output
> > from qemu whatsoever (expected).
>
> Yep.
>
> > But using just "-kernel arch/arm/boot/zImage.dtb" gives an interesting
> > effect. Kernel starts booting, I see console output up to late init
> > stages, but then it can't find the root device.
> > So appended dtb works... but only in half. Is names of block devices
> > something that's controlled by dtb?
>
> My knowledge about this is limited to qemu being used for KVM.
>
> Firstly, there are no block devices except for MTD, USB, or CF
> based block devices in the Versatile Express hardware. So, the DTB
> contains no block devices.
>
> In your first case above, it is likely that QEMU modifies the passed
> DTB to add PCIe devices to describe a virtio block device.
>
> In this case, because QEMU has no visibility of the appended DTB, it
> can't modify it, so the kernel only knows about devices found on the
> real hardware. Hence, any of the "special" virtio devices that QEMU
> use likely won't be found.
>
> I'm not sure how QEMU adds those (you're probably in a better position
> than I to boot using your first method, grab a copy of the DTB that
> the booted kernel used from /sys/firmware/fdt, and use dtc to turn it
> back into a dts and see what the changes are.
>
> I suspect you'll find that there's a new PCIe controller been added
> by QEMU, behind which will be a load of virtio devices for things like
> network and the "vda" block device.

Thanks, Russell. This makes perfect sense.

I think allowing qemu to modify dtb on the fly (rather than appending
it to the kernel) may be useful for testing purposes. In future we
will probably want to make qemu emulate as many devices as possible to
increase testing coverage. Passing dtb separately will allow qemu to
emulate all kinds of devices that are not originally on the board.

However, I hit the next problem.
If I build a kernel with KASAN, binaries built from Go sources don't
work. dhcpd/sshd/etc start fine, but any Go binary just consumes 100%
of CPU and does nothing. The process state is R, and it manages to create
2 child threads and mmap ~800MB of virtual memory, which I suspect may
be the root cause (though actual memory consumption is much smaller,
a dozen MB or so). The binary cannot be killed with kill -9. I tried
giving the VM 2GB and 8GB, so it should have plenty of RAM. These
binaries run fine on a non-KASAN kernel...

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Arm + KASAN + syzbot
  2021-01-21 13:49                               ` Dmitry Vyukov
@ 2021-01-21 14:04                                 ` Arnd Bergmann
  0 siblings, 0 replies; 47+ messages in thread
From: Arnd Bergmann @ 2021-01-21 14:04 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Arnd Bergmann, Linus Walleij, Russell King - ARM Linux admin,
	Krzysztof Kozlowski, syzkaller, kasan-dev, Hailong Liu,
	Linux ARM

On Thu, Jan 21, 2021 at 2:49 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> On Thu, Jan 21, 2021 at 2:14 PM Russell King - ARM Linux admin <linux@armlinux.org.uk> wrote:
> >
> > The PC value in the ELF coredump seems to be spinning through a large
> > amount of memory (physical address) and the CPSR is 0x197, which
> > suggests it's taken an abort without any vectors setup.
> >
> > I'm currently struggling to find a way to debug what's going on.
>
> I wonder if qemu has some kind of tracing that may be useful in such cases.
> Some googling shows this, which seems that it can give a trace of all
> PCs (which is a reasonable feature to have), it may show where things
> go wrong:
> https://rwmj.wordpress.com/2016/03/17/tracing-qemu-guest-execution/
> https://github.com/qemu/qemu/blob/master/docs/devel/tracing.txt
> But I never used such heavy-weight artillery myself.

I tend to attach gdb, in one of two ways:

- If the bug is in really early boot, I single-step the instructions to see when
  it goes wrong. Using 'stepi 30000' I see if it's still in a sane state 30000
  instructions into the boot, or if the registers are in an obviously broken
  state. From there, I can bisect the number of instructions after boot before
  it breaks, which usually doesn't take that long.

- If it crashes after setting up the virtual mapping, I use normal breakpoints
  to see how far it gets, and bisect init/main.c symbolically, starting with a
  breakpoint in start_kernel().

Of course, if it doesn't get into start_kernel, but there are too many
instructions before the crash, neither of the two works all that well.
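
A rough sketch of the first approach, assuming QEMU's gdb stub on its default TCP port 1234 (the -s option, with -S to hold the CPU until gdb resumes it). The file name early.gdb and the helper write_early_gdb are hypothetical, and 30000 is just the starting count mentioned above:

```shell
# Write a gdb command file that attaches to QEMU's gdb stub, runs a fixed
# number of instructions from reset, and dumps the registers.
# $1 = output path for the command file (e.g. early.gdb)
write_early_gdb() {
    cat > "$1" <<'EOF'
target remote :1234
stepi 30000
info registers
EOF
}
```

Add -s -S to the qemu-system-arm command line, call "write_early_gdb early.gdb", and then run an ARM-capable gdb with "-x early.gdb". Raising or lowering the stepi count between runs gives the bisection described above.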

      Arnd

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Arm + KASAN + syzbot
  2021-01-21 13:59                             ` Dmitry Vyukov
@ 2021-01-21 14:52                               ` Linus Walleij
  2021-01-26 21:24                                 ` Dmitry Vyukov
  0 siblings, 1 reply; 47+ messages in thread
From: Linus Walleij @ 2021-01-21 14:52 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Arnd Bergmann, Hailong Liu, Russell King - ARM Linux admin,
	kasan-dev, syzkaller, Krzysztof Kozlowski, Linux ARM

On Thu, Jan 21, 2021 at 2:59 PM Dmitry Vyukov <dvyukov@google.com> wrote:

> I think allowing qemu to modify dtb on the fly (rather than appending
> it to the kernel) may be useful for testing purposes.

Agree.

> In future we
> will probably want to make qemu emulate as many devices as possible to
> increase testing coverage. Passing dtb separately will allow qemu to
> emulate all kinds of devices that are not originally on the board.

At one point I even suggested we extend QEMU with some error injection
capabilities. For example PCI bridges can generate a lot of error states
but the emulated bridges are exposing kind of ideal behavior. It would
be an interesting testing vector to augment QEMU devices (I was thinking
of PCI hosts but also other things) to randomly misbehave and exercise
the error path of the drivers and frameworks.

> However, I hit the next problem.
> If I build a kernel with KASAN, binaries built from Go sources don't
> work. dhcpd/sshd/etc start fine, but any Go binaries just consume 100%
> of CPU and do nothing. The process state is R and it manages to create
> 2 child threads and mmap ~800MB of virtual memory, which I suspect may
> be the root cause (though, actual memory consumption is much smaller,
> dozen of MB or so). The binary cannot be killed with kill -9. I tried
> to give VM 2GB and 8GB, so it should have plenty of RAM. These
> binaries run fine on non-KASAN kernel...

It looks like Go uses a lot of memory, right?

Your .config says:

CONFIG_VMSPLIT_2G=y
# CONFIG_VMSPLIT_1G is not set
CONFIG_PAGE_OFFSET=0x80000000
CONFIG_KASAN_SHADOW_OFFSET=0x5f000000

This means that if your process, including its children, starts using
close to 2GB, give or take, it runs out of virtual memory and starts
thrashing.
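
To put rough numbers on this: a sketch, assuming the generic KASAN mapping shadow = (addr >> 3) + KASAN_SHADOW_OFFSET also describes arm32, with the values from the .config above. The helper name kasan_shadow is made up for illustration:

```shell
# Compute the KASAN shadow address for a given address, using the generic
# one-shadow-byte-per-eight-bytes mapping (an assumption for arm32).
# $1 = address, $2 = CONFIG_KASAN_SHADOW_OFFSET
kasan_shadow() {
    printf '0x%08x\n' $(( ($1 >> 3) + $2 ))
}

kasan_shadow 0x80000000 0x5f000000   # shadow of PAGE_OFFSET: 0x6f000000
kasan_shadow 0xffffffff 0x5f000000   # shadow of the last byte: 0x7effffff
```

Under that assumption the shadow for the kernel half (0x80000000 to 0xffffffff) occupies 0x6f000000 to 0x7effffff, about 256MB just below PAGE_OFFSET, so user space has noticeably less than 2GB of address space to work with.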

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: Arm + KASAN + syzbot
  2021-01-21 14:52                               ` Linus Walleij
@ 2021-01-26 21:24                                 ` Dmitry Vyukov
  2021-01-27  8:24                                   ` Linus Walleij
  0 siblings, 1 reply; 47+ messages in thread
From: Dmitry Vyukov @ 2021-01-26 21:24 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Arnd Bergmann, Hailong Liu, Russell King - ARM Linux admin,
	kasan-dev, syzkaller, Krzysztof Kozlowski, Linux ARM

On Thu, Jan 21, 2021 at 3:52 PM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> On Thu, Jan 21, 2021 at 2:59 PM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> > I think allowing qemu to modify dtb on the fly (rather than appending
> > it to the kernel) may be useful for testing purposes.
>
> Agree.
>
> > In future we
> > will probably want to make qemu emulate as many devices as possible to
> > increase testing coverage. Passing dtb separately will allow qemu to
> > emulate all kinds of devices that are not originally on the board.
>
> At one point I even suggested we extend QEMU with some error injection
> capabilities. For example PCI bridges can generate a lot of error states
> but the emulated bridges are exposing kind of ideal behavior. It would
> be an interesting testing vector to augment QEMU devices (I was thinking
> of PCI hosts but also other things) to randomly misbehave and exercise
> the error path of the drivers and frameworks.
>
> > However, I hit the next problem.
> > If I build a kernel with KASAN, binaries built from Go sources don't
> > work. dhcpd/sshd/etc start fine, but any Go binaries just consume 100%
> > of CPU and do nothing. The process state is R and it manages to create
> > 2 child threads and mmap ~800MB of virtual memory, which I suspect may
> > be the root cause (though, actual memory consumption is much smaller,
> > dozen of MB or so). The binary cannot be killed with kill -9. I tried
> > to give VM 2GB and 8GB, so it should have plenty of RAM. These
> > binaries run fine on non-KASAN kernel...
>
> It looks like Go uses a lot of memory right?
>
> Your .config says:
>
> CONFIG_VMSPLIT_2G=y
> # CONFIG_VMSPLIT_1G is not set
> CONFIG_PAGE_OFFSET=0x80000000
> CONFIG_KASAN_SHADOW_OFFSET=0x5f000000
>
> This means that if your process including children start using close
> to 2GB +/- it runs out of virtual memory and start thrashing.
>
> Yours,
> Linus Walleij


I've set up an arm32 instance (w/o KASAN for now), but kernel fails during boot:
https://groups.google.com/g/syzkaller-bugs/c/omh0Em-CPq0
So far arm32 testing does not progress beyond attempts to boot.


* Re: Arm + KASAN + syzbot
  2021-01-26 21:24                                 ` Dmitry Vyukov
@ 2021-01-27  8:24                                   ` Linus Walleij
  2021-01-27  9:39                                     ` Dmitry Vyukov
  2021-01-27 10:19                                     ` Russell King - ARM Linux admin
  0 siblings, 2 replies; 47+ messages in thread
From: Linus Walleij @ 2021-01-27  8:24 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Arnd Bergmann, Hailong Liu, Russell King - ARM Linux admin,
	kasan-dev, syzkaller, Krzysztof Kozlowski, Linux ARM

On Tue, Jan 26, 2021 at 10:24 PM Dmitry Vyukov <dvyukov@google.com> wrote:

> I've set up an arm32 instance (w/o KASAN for now), but kernel fails during boot:
> https://groups.google.com/g/syzkaller-bugs/c/omh0Em-CPq0
> So far arm32 testing does not progress beyond attempts to boot.

It is booting all right it seems.

Today it looks like Hillf Danton found the problem: if I understand correctly,
the code is executing arm32-on-arm64 (virtualized QEMU for ARM32
on ARM64?) and that was not working with the vexpress QEMU model
because it was not properly tested.

I don't know if I understand the problem right though :/

Yours,
Linus Walleij


* Re: Arm + KASAN + syzbot
  2021-01-27  8:24                                   ` Linus Walleij
@ 2021-01-27  9:39                                     ` Dmitry Vyukov
  2021-01-27  9:57                                       ` Linus Walleij
  2021-01-27 10:19                                     ` Russell King - ARM Linux admin
  1 sibling, 1 reply; 47+ messages in thread
From: Dmitry Vyukov @ 2021-01-27  9:39 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Arnd Bergmann, Hailong Liu, Russell King - ARM Linux admin,
	kasan-dev, syzkaller, Krzysztof Kozlowski, Linux ARM

On Wed, Jan 27, 2021 at 9:24 AM Linus Walleij <linus.walleij@linaro.org> wrote:
> > I've set up an arm32 instance (w/o KASAN for now), but kernel fails during boot:
> > https://groups.google.com/g/syzkaller-bugs/c/omh0Em-CPq0
> > So far arm32 testing does not progress beyond attempts to boot.
>
> It is booting all right it seems.

It depends on the definition of "all right". If you are looking for
bugs, and you have bugs during boot, then that's it  :)

> Today it looks like Hillf Danton found the problem:

Yes, it seems so.

> if I understand correctly
> the code is executing arm32-on-arm64 (virtualized QEMU for ARM32
> on ARM64?) and that was not working with the vexpress QEMU model
> because not properly tested.

It's qemu-system-arm running on x86_64.
But I don't think that bug is related, it seems to affect arm32 in general.



> I don't know if I understand the problem right though :/
>
> Yours,
> Linus Walleij


* Re: Arm + KASAN + syzbot
  2021-01-27  9:39                                     ` Dmitry Vyukov
@ 2021-01-27  9:57                                       ` Linus Walleij
  2021-01-27 10:12                                         ` Dmitry Vyukov
  0 siblings, 1 reply; 47+ messages in thread
From: Linus Walleij @ 2021-01-27  9:57 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Arnd Bergmann, Hailong Liu, Russell King - ARM Linux admin,
	kasan-dev, syzkaller, Krzysztof Kozlowski, Linux ARM

On Wed, Jan 27, 2021 at 10:39 AM Dmitry Vyukov <dvyukov@google.com> wrote:

> It's qemu-system-arm running on x86_64.
> But I don't think that bug is related, it seems to affect arm32 in general.

Yep. I am trying to reproduce with your defconfig.
It seems you are not using vexpress_defconfig:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/configs/vexpress_defconfig
?

Instead this looks like a modified multi_v7 config, right?
And a bunch of debugging options seem to have been turned on.

multi_v7 "should work" too but I haven't used that.

Yours,
Linus Walleij


* Re: Arm + KASAN + syzbot
  2021-01-27  9:57                                       ` Linus Walleij
@ 2021-01-27 10:12                                         ` Dmitry Vyukov
  0 siblings, 0 replies; 47+ messages in thread
From: Dmitry Vyukov @ 2021-01-27 10:12 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Arnd Bergmann, Hailong Liu, Russell King - ARM Linux admin,
	kasan-dev, syzkaller, Krzysztof Kozlowski, Linux ARM

On Wed, Jan 27, 2021 at 10:57 AM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> On Wed, Jan 27, 2021 at 10:39 AM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> > It's qemu-system-arm running on x86_64.
> > But I don't think that bug is related, it seems to affect arm32 in general.
>
> Yep. I am trying to reproduce with your defconfig.
> It seems you are not using vexpress_defconfig:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/configs/vexpress_defconfig
> ?
>
> Instead this looks like a modified multi_v7 config, right?
> Then a bunch of debugging options have been turned on as it
> seems.
>
> multi_v7 "should work" too but I haven't used that.

The config is based on vexpress_defconfig:
https://github.com/google/syzkaller/blob/master/dashboard/config/linux/bits/arm.yml#L5

With a bunch of debug configs on top (among other things):
https://github.com/google/syzkaller/blob/master/dashboard/config/linux/bits/debug.yml


* Re: Arm + KASAN + syzbot
  2021-01-27  8:24                                   ` Linus Walleij
  2021-01-27  9:39                                     ` Dmitry Vyukov
@ 2021-01-27 10:19                                     ` Russell King - ARM Linux admin
  2021-03-11 10:54                                       ` Dmitry Vyukov
  1 sibling, 1 reply; 47+ messages in thread
From: Russell King - ARM Linux admin @ 2021-01-27 10:19 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Linux ARM, Arnd Bergmann, Hailong Liu, kasan-dev, syzkaller,
	Krzysztof Kozlowski, Dmitry Vyukov

On Wed, Jan 27, 2021 at 09:24:06AM +0100, Linus Walleij wrote:
> On Tue, Jan 26, 2021 at 10:24 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> 
> > I've set up an arm32 instance (w/o KASAN for now), but kernel fails during boot:
> > https://groups.google.com/g/syzkaller-bugs/c/omh0Em-CPq0
> > So far arm32 testing does not progress beyond attempts to boot.
> 
> It is booting all right it seems.
> 
> Today it looks like Hillf Danton found the problem: if I understand correctly
> the code is executing arm32-on-arm64 (virtualized QEMU for ARM32
> on ARM64?) and that was not working with the vexpress QEMU model
> because not properly tested.
> 
> I don't know if I understand the problem right though :/

There is an issue with ARMv7 and the decompressor currently - see the
patch from Ard - it's 9052/1 in the patch system.

That's already known to stuff up my 32-bit ARM VMs under KVM - maybe
other QEMU models are also affected by it.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


* Re: Arm + KASAN + syzbot
  2021-01-27 10:19                                     ` Russell King - ARM Linux admin
@ 2021-03-11 10:54                                       ` Dmitry Vyukov
  2021-03-11 13:42                                         ` Russell King - ARM Linux admin
  2021-03-11 13:55                                         ` Linus Walleij
  0 siblings, 2 replies; 47+ messages in thread
From: Dmitry Vyukov @ 2021-03-11 10:54 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Linus Walleij, Arnd Bergmann, Krzysztof Kozlowski, syzkaller,
	kasan-dev, Hailong Liu, Linux ARM

On Wed, Jan 27, 2021 at 11:19 AM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> On Wed, Jan 27, 2021 at 09:24:06AM +0100, Linus Walleij wrote:
> > On Tue, Jan 26, 2021 at 10:24 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> >
> > > I've set up an arm32 instance (w/o KASAN for now), but kernel fails during boot:
> > > https://groups.google.com/g/syzkaller-bugs/c/omh0Em-CPq0
> > > So far arm32 testing does not progress beyond attempts to boot.
> >
> > It is booting all right it seems.
> >
> > Today it looks like Hillf Danton found the problem: if I understand correctly
> > the code is executing arm32-on-arm64 (virtualized QEMU for ARM32
> > on ARM64?) and that was not working with the vexpress QEMU model
> > because not properly tested.
> >
> > I don't know if I understand the problem right though :/
>
> There is an issue with ARMv7 and the decompressor currently - see the
> patch from Ard - it's 9052/1 in the patch system.
>
> That's already known to stuff up my 32-bit ARM VMs under KVM - maybe
> other QEMU models are also affected by it.

Status update on the arm syzbot instance:

The boot issue is finally fixed:
https://syzkaller.appspot.com/bug?id=a85a0181a55e02756ce5ffa43c71d74a4e309263

and the instance is up and running:
https://syzkaller.appspot.com/upstream?manager=ci-qemu2-arm32

The instance config:
https://github.com/google/syzkaller/blob/master/dashboard/config/linux/upstream-arm-kasan.config

The instance has KASAN disabled because Go binaries don't run on a KASAN kernel:
https://lore.kernel.org/linux-arm-kernel/CACT4Y+YdJoNTqnBSELcEbcbVsKBtJfYUc7_GSXbUQfAJN3JyRg@mail.gmail.com/

It also has KCOV disabled (so no coverage guidance and coverage
reports for now) because KCOV does not fully work on arm:
https://lore.kernel.org/linux-arm-kernel/20210119130010.GA2338@C02TD0UTHF1T.local/T/#m78fdfcc41ae831f91c93ad5dabe63f7ccfb482f0

But the instance seems to be efficient at finding 32-bit specific bugs.

The instance uses qemu TCG emulation (-machine vexpress-a15 -cpu max) and
lots of debug configs, so it's quite slow. It makes sense to target
it at arm-specific parts of the kernel as much as possible (rather
than stress generic subsystems that are already stressed on x86). So
the question is: what arm-specific parts are there that we can reach
in qemu?
Can you think of any qemu flags (cpu features, device emulation, etc)?
Any kernel subsystems with heavy arm-specific parts that we may be
missing?


* Re: Arm + KASAN + syzbot
  2021-03-11 10:54                                       ` Dmitry Vyukov
@ 2021-03-11 13:42                                         ` Russell King - ARM Linux admin
  2021-03-11 18:05                                           ` Dmitry Vyukov
  2021-03-11 13:55                                         ` Linus Walleij
  1 sibling, 1 reply; 47+ messages in thread
From: Russell King - ARM Linux admin @ 2021-03-11 13:42 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Linus Walleij, Arnd Bergmann, Krzysztof Kozlowski, syzkaller,
	kasan-dev, Hailong Liu, Linux ARM

On Thu, Mar 11, 2021 at 11:54:22AM +0100, Dmitry Vyukov wrote:
> The instance has KASAN disabled because Go binaries don't run on KASAN kernel:
> https://lore.kernel.org/linux-arm-kernel/CACT4Y+YdJoNTqnBSELcEbcbVsKBtJfYUc7_GSXbUQfAJN3JyRg@mail.gmail.com/

I suspect this is unlikely to change as it hasn't attracted any
interest. Someone using Go and KASAN needs to debug this... I suspect
it may be due to something being KASAN instrumented that shouldn't be.

> It also has KCOV disabled (so no coverage guidance and coverage
> reports for now) because KCOV does not fully work on arm:
> https://lore.kernel.org/linux-arm-kernel/20210119130010.GA2338@C02TD0UTHF1T.local/T/#m78fdfcc41ae831f91c93ad5dabe63f7ccfb482f0

Looking at those, they look a bit weird. First:

PC is at check_kcov_mode kernel/kcov.c:163 [inline]
PC is at __sanitizer_cov_trace_pc+0x40/0x78 kernel/kcov.c:197

Why is this duplicated?

Second:

sp : 8b4e6078  ip : 8b4e6088  fp : 8b4e6084
...
Process   (pid: 0, stack limit = 0x147f9c36)

The stack limit is definitely wrong, and it looks like the thread_info
is likely wrong too. Given the value of "sp" I wonder if the kernel
stack has overflowed and overwritten the thread_info structure at the
bottom of the kernel stack.

I've no idea what effect KCOV would have on the kernel - it's something
I've never looked at, so I don't know what changes it would impose.
At this point, as there's very little commercial interest in arm32,
there's probably little hope of getting this sorted. It may make sense
to force KCOV to be disabled for arm32.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


* Re: Arm + KASAN + syzbot
  2021-03-11 10:54                                       ` Dmitry Vyukov
  2021-03-11 13:42                                         ` Russell King - ARM Linux admin
@ 2021-03-11 13:55                                         ` Linus Walleij
  2021-03-11 14:09                                           ` Russell King - ARM Linux admin
  1 sibling, 1 reply; 47+ messages in thread
From: Linus Walleij @ 2021-03-11 13:55 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Russell King - ARM Linux admin, Arnd Bergmann,
	Krzysztof Kozlowski, syzkaller, kasan-dev, Hailong Liu,
	Linux ARM

On Thu, Mar 11, 2021 at 11:54 AM Dmitry Vyukov <dvyukov@google.com> wrote:

> The instance has KASAN disabled because Go binaries don't run on KASAN kernel:
> https://lore.kernel.org/linux-arm-kernel/CACT4Y+YdJoNTqnBSELcEbcbVsKBtJfYUc7_GSXbUQfAJN3JyRg@mail.gmail.com/

I am still puzzled by this, and I still have the open question of how much
memory the Go runtime really uses. I suspect quite a lot, and the
ARM32 instance isn't on par with any contemporary server or desktop
when it comes to memory: it has ~2GB for a userspace program, and after
that bad things will happen (the machine will start thrashing).

Do you have some idea about how much memory these Go binaries
use up at runtime on x86 or AArch64?

Yours,
Linus Walleij


* Re: Arm + KASAN + syzbot
  2021-03-11 13:55                                         ` Linus Walleij
@ 2021-03-11 14:09                                           ` Russell King - ARM Linux admin
  2021-03-11 14:37                                             ` Linus Walleij
  2021-03-11 14:55                                             ` Arnd Bergmann
  0 siblings, 2 replies; 47+ messages in thread
From: Russell King - ARM Linux admin @ 2021-03-11 14:09 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Dmitry Vyukov, Arnd Bergmann, Krzysztof Kozlowski, syzkaller,
	kasan-dev, Hailong Liu, Linux ARM

On Thu, Mar 11, 2021 at 02:55:54PM +0100, Linus Walleij wrote:
> On Thu, Mar 11, 2021 at 11:54 AM Dmitry Vyukov <dvyukov@google.com> wrote:
> > The instance has KASAN disabled because Go binaries don't run on KASAN kernel:
> > https://lore.kernel.org/linux-arm-kernel/CACT4Y+YdJoNTqnBSELcEbcbVsKBtJfYUc7_GSXbUQfAJN3JyRg@mail.gmail.com/
> 
> I am still puzzled by this, but I still have the open question about how much
> memory the Go runtime really use. I am suspecting quite a lot, and the
> ARM32 instance isn't on par with any contemporary server or desktop
> when it comes to memory, it has ~2GB for a userspace program, after
> that bad things will happen: the machine will start thrashing.

I believe grafana is a Go binary - I run this in a VM with only 1G
of memory and no swap along with apache. It's happy enough.

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
grafana   1122  0.0  5.9 920344 60484 ?        Ssl  Feb18  28:31 /usr/sbin/grafana-server --config=/etc/grafana/grafana.ini ...

So, I suspect it's basically KASAN upsetting Go somehow that then
causes the memory usage to spiral out of control.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


* Re: Arm + KASAN + syzbot
  2021-03-11 14:09                                           ` Russell King - ARM Linux admin
@ 2021-03-11 14:37                                             ` Linus Walleij
  2021-03-11 14:55                                             ` Arnd Bergmann
  1 sibling, 0 replies; 47+ messages in thread
From: Linus Walleij @ 2021-03-11 14:37 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Dmitry Vyukov, Arnd Bergmann, Krzysztof Kozlowski, syzkaller,
	kasan-dev, Hailong Liu, Linux ARM

On Thu, Mar 11, 2021 at 3:09 PM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:

> So, I suspect it's basically KASAN upsetting Go somehow that then
> causes the memory usage to spiral out of control.

That's annoying. I have admittedly used KASAN on quite light
distributions such as a minimal busybox or openwrt.

I will try to enable it on a more substantial userspace such
as Phosh or Plasma Mobile and see how it deals with that.

Yours,
Linus Walleij


* Re: Arm + KASAN + syzbot
  2021-03-11 14:09                                           ` Russell King - ARM Linux admin
  2021-03-11 14:37                                             ` Linus Walleij
@ 2021-03-11 14:55                                             ` Arnd Bergmann
  2021-03-11 18:08                                               ` Dmitry Vyukov
  2021-03-15 14:01                                               ` Linus Walleij
  1 sibling, 2 replies; 47+ messages in thread
From: Arnd Bergmann @ 2021-03-11 14:55 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Linus Walleij, Dmitry Vyukov, Krzysztof Kozlowski, syzkaller,
	kasan-dev, Hailong Liu, Linux ARM

On Thu, Mar 11, 2021 at 3:09 PM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
> On Thu, Mar 11, 2021 at 02:55:54PM +0100, Linus Walleij wrote:
> > On Thu, Mar 11, 2021 at 11:54 AM Dmitry Vyukov <dvyukov@google.com> wrote:
> > > The instance has KASAN disabled because Go binaries don't run on KASAN kernel:
> > > https://lore.kernel.org/linux-arm-kernel/CACT4Y+YdJoNTqnBSELcEbcbVsKBtJfYUc7_GSXbUQfAJN3JyRg@mail.gmail.com/
> >
> > I am still puzzled by this, but I still have the open question about how much
> > memory the Go runtime really use. I am suspecting quite a lot, and the
> > ARM32 instance isn't on par with any contemporary server or desktop
> > when it comes to memory, it has ~2GB for a userspace program, after
> > that bad things will happen: the machine will start thrashing.
>
> I believe grafana is a Go binary - I run this in a VM with only 1G
> of memory and no swap along with apache. It's happy enough.
>
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> grafana   1122  0.0  5.9 920344 60484 ?        Ssl  Feb18  28:31 /usr/sbin/grafana-server --config=/etc/grafana/grafana.ini ...
>
> So, I suspect it's basically KASAN upsetting Go somehow that then
> causes the memory usage to spiral out of control.

I found a bug report about someone complaining that Go reserves a lot of
virtual address space, and that this breaks an application that works
with VMSPLIT_3G when changing to VMSPLIT_2G:
https://github.com/golang/go/issues/35677

If KASAN limits the address space available to user space, there might be
a related issue, even when there is still physical memory available.
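
To put rough numbers on that, here is a small sketch using the values from
the .config quoted earlier in this thread (PAGE_OFFSET=0x80000000,
KASAN_SHADOW_OFFSET=0x5f000000). It uses the generic KASAN mapping
shadow = (addr >> 3) + offset. That the shadow region begins at the shadow
of a 16 MiB module area just below PAGE_OFFSET, and that user space ends
where the shadow begins, is my reading of the arm32 layout and should be
treated as an assumption. The helper names are made up for illustration.

```python
# Values taken from the .config quoted earlier in this thread.
PAGE_OFFSET = 0x80000000
KASAN_SHADOW_OFFSET = 0x5F000000

# Assumptions (not stated in this thread): 1/8 shadow scaling as in generic
# KASAN, and a 16 MiB module area directly below PAGE_OFFSET.
KASAN_SHADOW_SCALE_SHIFT = 3
MODULES_SIZE = 16 << 20
MIB = 1 << 20


def shadow_addr(addr):
    """Map a virtual address to its KASAN shadow address."""
    return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET


def user_space_limits():
    """Return (user limit without KASAN, user limit with KASAN) in bytes.

    Assumes user space ends at the module area without KASAN, and at the
    start of the shadow region with KASAN.
    """
    modules_start = PAGE_OFFSET - MODULES_SIZE
    shadow_start = shadow_addr(modules_start)
    return modules_start, shadow_start


plain, kasan = user_space_limits()
shadow_end = shadow_addr(1 << 32)  # shadow of the top of the address space

print(f"shadow region:       {kasan:#010x}..{shadow_end:#010x}")
print(f"user limit, no KASAN: {plain:#010x} ({plain // MIB} MiB)")
print(f"user limit, KASAN:    {kasan:#010x} ({kasan // MIB} MiB)")
print(f"lost to shadow:       {(plain - kasan) // MIB} MiB")
```

If those assumptions hold, the shadow sits at 0x6ee00000..0x7f000000 and user
space shrinks from about 2032 MiB to about 1774 MiB, which could matter for a
runtime that reserves large address ranges up front.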

       Arnd


* Re: Arm + KASAN + syzbot
  2021-03-11 13:42                                         ` Russell King - ARM Linux admin
@ 2021-03-11 18:05                                           ` Dmitry Vyukov
  0 siblings, 0 replies; 47+ messages in thread
From: Dmitry Vyukov @ 2021-03-11 18:05 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Linus Walleij, Arnd Bergmann, Krzysztof Kozlowski, syzkaller,
	kasan-dev, Hailong Liu, Linux ARM

On Thu, Mar 11, 2021 at 2:42 PM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> On Thu, Mar 11, 2021 at 11:54:22AM +0100, Dmitry Vyukov wrote:
> > The instance has KASAN disabled because Go binaries don't run on KASAN kernel:
> > https://lore.kernel.org/linux-arm-kernel/CACT4Y+YdJoNTqnBSELcEbcbVsKBtJfYUc7_GSXbUQfAJN3JyRg@mail.gmail.com/
>
> I suspect this is unlikely to change as it hasn't attracted any
> interest. Someone using Go and KASAN needs to debug this... I suspect
> it may be due to something being KASAN instrumented that shouldn't be.
>
> > It also has KCOV disabled (so no coverage guidance and coverage
> > reports for now) because KCOV does not fully work on arm:
> > https://lore.kernel.org/linux-arm-kernel/20210119130010.GA2338@C02TD0UTHF1T.local/T/#m78fdfcc41ae831f91c93ad5dabe63f7ccfb482f0
>
> Looking at those, they look a bit weird. First:
>
> PC is at check_kcov_mode kernel/kcov.c:163 [inline]
> PC is at __sanitizer_cov_trace_pc+0x40/0x78 kernel/kcov.c:197
>
> Why is this duplicated?

It's an artifact of the symbolization process: to add the [inline]
file:line, it duplicates the PC line.
I've posted 3 unaltered crashes at the bottom.


> Second:
>
> sp : 8b4e6078  ip : 8b4e6088  fp : 8b4e6084
> ...
> Process   (pid: 0, stack limit = 0x147f9c36)
>
> The stack limit is definitely wrong, and it looks like the thread_info
> is likely wrong too. Given the value of "sp" I wonder if the kernel
> stack has overflowed and overwritten the thread_info structure at the
> bottom of the kernel stack.

Humm... this is possible...
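
A quick sanity check of that theory, as a sketch only: assuming 8 KiB
kernel stacks (the THREAD_SIZE value below is an assumption about this
config, not something taken from the logs) with thread_info at the
8 KiB-aligned base of the stack, the distance from that base to sp can be
computed for the sp quoted above and for the three crash samples at the
bottom of this message. The helper name is hypothetical.

```python
# Assumption: 8 KiB kernel stacks on arm32 without KASAN, with thread_info
# stored at the aligned base of each stack.
THREAD_SIZE = 8192


def stack_base_and_offset(sp, thread_size=THREAD_SIZE):
    """Return (aligned stack base, bytes between that base and sp)."""
    base = sp & ~(thread_size - 1)
    return base, sp - base


# sp values copied from the oops quoted above and from the three crash
# samples at the bottom of this message.
SAMPLES = (0x8B4E6078, 0x8B614060, 0x8AD42010, 0x8B4C4020)

for sp in SAMPLES:
    base, off = stack_base_and_offset(sp)
    print(f"sp={sp:#010x} base={base:#010x} offset={off:#x} ({off} bytes)")
```

Under that assumption every one of these sp values is within 0x78 bytes of
the aligned base, i.e. in the area where thread_info would live, which looks
consistent with a stack overflow rather than with a random corruption.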

> I've no idea what effect KCOV would have on the kernel - it's something
> I've never looked at, so I don't know what changes it would impose.
> At this point, as there's very little commercial interest in arm32,
> there's probably little hope in getting this sorted. It may make sense
> to force KCOV to be disabled for arm32.

KCOV makes the compiler insert a __sanitizer_cov_trace_pc() function call
into every basic block. This increases code size and can also increase
stack usage because of more register spills. And other debug configs
increase stack usage even more.

Here are 3 random crash samples:

[ 2552.083059][ T5194] 8<--- cut here ---
[ 2552.084367][ T5194] Unhandled fault: page domain fault (0x01b) at 0x00000e30
[ 2552.085401][ T5194] pgd = c87495f5
[ 2552.086224][ T5194] [00000e30] *pgd=00000000
[ 2552.088694][ T5194] Internal error: : 1b [#1] PREEMPT SMP ARM
[ 2552.090195][ T5194] Dumping ftrace buffer:
[ 2552.091249][ T5194]    (ftrace buffer empty)
[ 2552.091895][ T5194] Modules linked in:
[ 2552.092768][ T5194] CPU: 1 PID: 5194 Comm: kworker/1:4 Not tainted 5.10.0-rc1+ #19
[ 2552.093459][ T5194] Hardware name: ARM-Versatile Express
[ 2552.094153][ T5126] ------------[ cut here ]------------
[ 2552.095215][ T5194] Workqueue:  0x0 (wg-crypt-wg0)
[ 2552.099654][ T5194] PC is at __sanitizer_cov_trace_pc+0x4c/0x78
[ 2552.100071][ T5126] WARNING: CPU: 0 PID: 5126 at net/core/skbuff.c:2206 skb_copy_bits+0x368/0x510
[ 2552.101457][ T5194] LR is at trace_hardirqs_off+0x14/0x120
[ 2552.102019][ T5194] pc : [<802b4048>]    lr : [<802e12cc>]    psr: 60000193
[ 2552.102782][ T5194] sp : 8b614060  ip : 8b614070  fp : 8b61406c
[ 2552.103590][ T5194] r10: 0000a300  r9 : 8b614000  r8 : 8b7bbe14
[ 2552.104357][ T5194] r7 : 80100a74  r6 : ffffffff  r5 : 60000193  r4 : 802b4048
[ 2552.105448][ T5194] r3 : 8b614000  r2 : 00000000  r1 : 00000000  r0 : 00000000
[ 2552.106549][ T5194] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment none
[ 2552.107905][ T5194] Control: 10c5387d  Table: 8acfc06a  DAC: 00000051
[ 2552.108580][ T5194] Process kworker/1:4 (pid: 5194, stack limit = 0xa47ae3aa)
[ 2552.110752][ T5194] ---[ end trace 4b8c0315965ef9d6 ]---
[ 2552.112816][ T5194] Kernel panic - not syncing: Fatal exception
[ 2552.114081][    C0] CPU0: stopping
[ 2552.115360][    C0] CPU: 0 PID: 5133 Comm: syz-executor.1 Tainted: G      D           5.10.0-rc1+ #19
[ 2552.116483][    C0] Hardware name: ARM-Versatile Express
[ 2552.117423][    C0] Backtrace:
[ 2552.118784][    C0] [<8367729c>] (dump_backtrace) from [<83677618>] (show_stack+0x28/0x2c)
[ 2552.120132][    C0]  r9:ffffffff r8:40000193 r7:00000080 r6:00000000 r5:841ff0ac r4:00000000
[ 2552.121629][    C0] [<836775f0>] (show_stack) from [<8368d44c>] (dump_stack+0x124/0x170)
[ 2552.122928][    C0]  r5:00000000 r4:847241a4
[ 2552.124044][    C0] [<8368d328>] (dump_stack) from [<80118d78>] (do_handle_IPI+0x5e4/0x618)
[ 2552.125536][    C0]  r10:8af89d68 r9:8af89dd8 r8:8af89d40 r7:814f5cc4 r6:00000014 r5:00000000
[ 2552.126748][    C0]  r4:00000002 r3:00000000
[ 2552.127815][    C0] [<80118794>] (do_handle_IPI) from [<80118dd4>] (ipi_handler+0x28/0x30)
[ 2552.129265][    C0]  r10:8af89d68 r9:8af89dd8 r8:8af89d40 r7:814f5cc4 r6:00000014 r5:8580cc40
[ 2552.130466][    C0]  r4:00000014 r3:8454ec60
[ 2552.131534][    C0] [<80118dac>] (ipi_handler) from [<802040e8>] (handle_percpu_devid_fasteoi_ipi+0xa8/0xbc)
[ 2552.132872][    C0]  r5:8580cc40 r4:858c8000
[ 2552.134078][    C0] [<80204040>] (handle_percpu_devid_fasteoi_ipi) from [<801f9bc4>] (__handle_domain_irq+0xec/0x168)
[ 2552.135731][    C0]  r9:8af89dd8 r8:0000003b r7:846355b4 r6:00000000 r5:844fd41c r4:00000000
[ 2552.137183][    C0] [<801f9ad8>] (__handle_domain_irq) from [<814f5bb8>] (gic_handle_irq+0xbc/0xe4)
[ 2552.138687][    C0]  r10:e000200c r9:00000000 r8:e0002000 r7:8af89dd8 r6:8454f53c r5:00000004
[ 2552.139827][    C0]  r4:00000404
[ 2552.140735][    C0] [<814f5afc>] (gic_handle_irq) from [<80100b30>] (__irq_svc+0x70/0xb0)
[ 2552.141936][    C0] Exception stack(0x8af89dd8 to 0x8af89e20)
[ 2560.845970][ T5194] SMP: failed to stop secondary CPUs
[ 2560.849196][ T5194] Dumping ftrace buffer:
[ 2560.849806][ T5194]    (ftrace buffer empty)
[ 2560.850981][ T5194] Rebooting in 86400 seconds..


[ 2818.793436][ T5710] 8<--- cut here ---
[ 2818.794918][ T5710] Unhandled fault: page domain fault (0x01b) at 0x00000e30
[ 2818.797895][ T5710] pgd = 24e3cd1d
[ 2818.798832][ T5710] [00000e30] *pgd=e3d98835
[ 2818.801168][    C0] 8<--- cut here ---
[ 2818.801661][ T5710] Internal error: : 1b [#1] PREEMPT SMP ARM
[ 2818.802585][    C0] Unhandled fault: page domain fault (0x01b) at 0x00000030
[ 2818.803226][ T5710] Dumping ftrace buffer:
[ 2818.803646][    C0] pgd = 8f5822fe
[ 2818.804367][    C0] [00000030] *pgd=00000000
[ 2818.804766][ T5710]    (ftrace buffer empty)
[ 2818.805361][    C0] Internal error: : 1b [#2] PREEMPT SMP ARM
[ 2818.806139][ T5710] Modules linked in:
[ 2818.806362][    C0] Dumping ftrace buffer:
[ 2818.806743][    C0]    (ftrace buffer empty)
[ 2818.807645][ T5710] CPU: 0 PID: 5710 Comm: syz-executor.1 Not tainted 5.10.0-rc1+ #19
[ 2818.807904][ T5710] Hardware name: ARM-Versatile Express
[ 2818.808299][    C0] Modules linked in:
[ 2818.810264][ T5710] PC is at __sanitizer_cov_trace_pc+0x4c/0x78
[ 2818.810676][ T5710] LR is at check_preemption_disabled+0x60/0x17c
[ 2818.811017][ T5710] pc : [<802b4048>]    lr : [<836bb728>]    psr: 60000193
[ 2818.811656][    C0]
[ 2818.812153][ T5710] sp : 8ad42010  ip : 8ad42020  fp : 8ad4201c
[ 2818.812954][    C0] CPU: 0 PID: 5112 Comm: kworker/u4:2 Not tainted 5.10.0-rc1+ #19
[ 2818.813291][    C0] Hardware name: ARM-Versatile Express
[ 2818.813808][ T5710] r10: 00000000  r9 : 8ad4205c  r8 : 841ca824
[ 2818.815046][    C0] Workqueue: bat_events batadv_iv_send_outstanding_bat_ogm_packet
[ 2818.815919][ T5710] r7 : 84089a40  r6 : 836bb890  r5 : ffffe000  r4 : 00000000
[ 2818.816847][    C0] PC is at rb_erase+0x148/0x374
[ 2818.817317][ T5710] r3 : 8ad42000  r2 : 00000000  r1 : 00000000  r0 : 00000000
[ 2818.818245][    C0] LR is at 0x0
[ 2818.818563][    C0] pc : [<814dd100>]    lr : [<00000000>]    psr: 60000193
[ 2818.819014][ T5710] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment none
[ 2818.819558][    C0] sp : 8acffab8  ip : 8ad41dc0  fp : 8acffacc
[ 2818.820250][ T5710] Control: 10c5387d  Table: 8aebc06a  DAC: 00000051
[ 2818.820790][    C0] r10: de5c82c0  r9 : 8acfe000  r8 : de5c8320
[ 2818.821351][ T5710] Process syz-executor.1 (pid: 5710, stack limit = 0xa8637c39)
[ 2818.822487][    C0] r7 : 00000000  r6 : 8ad41dc1  r5 : de5c834c  r4 : de5c8840
[ 2818.824313][ T5710] Stack: (0x8ad42010 to 0x8ad42000)
[ 2818.826259][    C0] r3 : 00000030  r2 : 00000000  r1 : de5c834c  r0 : de5c8840
[ 2818.827049][ T5710] Backtrace:
[ 2818.827567][    C0] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment none
[ 2818.827864][ T5710]
[ 2818.828367][    C0] Control: 10c5387d  Table: 8a15806a  DAC: 00000051
[ 2818.829428][ T5710] [<802b3ffc>] (__sanitizer_cov_trace_pc) from [<836bb728>] (check_preemption_disabled+0x60/0x17c)
[ 2818.830282][ T5710] [<836bb6c8>] (check_preemption_disabled) from [<836bb890>] (__this_cpu_preempt_check+0x24/0x28)
[ 2818.830931][ T5710]  r10:00000000 r9:8ad42000 r8:00000000 r7:80100a74 r6:ffffffff r5:60000193
[ 2818.831360][    C0] Process kworker/u4:2 (pid: 5112, stack limit = 0x209a2e04)
[ 2818.831960][ T5710]  r4:841ca824
[ 2818.832647][    C0] Stack: (0x8acffab8 to 0x8ad00000)
[ 2818.833429][ T5710] [<836bb86c>] (__this_cpu_preempt_check) from [<836ba86c>] (lockdep_hardirqs_off+0x54/0x174)
[ 2818.833711][ T5710]  r5:60000193 r4:80100a74
[ 2818.836100][ T5710] [<836ba818>] (lockdep_hardirqs_off) from [<802e12d4>] (trace_hardirqs_off+0x1c/0x120)
[ 2818.837807][ T5710]  r7:80100a74 r6:ffffffff r5:60000193 r4:802b405c
[ 2818.839303][ T5710] [<802e12b8>] (trace_hardirqs_off) from [<


[ 4902.579940][    C1] 8<--- cut here ---
[ 4902.580159][    C1] Unhandled fault: page domain fault (0x01b) at 0x00000e50
[ 4902.580388][    C1] pgd = bc232184
[ 4902.580542][    C1] [00000e50] *pgd=00000000
[ 4902.584882][    C1] Internal error: : 1b [#1] PREEMPT SMP ARM
[ 4902.585007][    C1] Dumping ftrace buffer:
[ 4902.585114][    C1]    (ftrace buffer empty)
[ 4902.585209][    C1] Modules linked in:
[ 4902.585674][    C1] CPU: 1 PID: 5928 Comm: kworker/1:7 Not tainted 5.10.0-rc1+ #19
[ 4902.585787][    C1] Hardware name: ARM-Versatile Express
[ 4902.589427][    C1] Workqueue:  0x0 (wg-crypt-wg1)
[ 4902.589785][    C1] PC is at __sanitizer_cov_trace_pc+0x40/0x78
[ 4902.589924][    C1] LR is at trace_hardirqs_off+0x14/0x120
[ 4902.590080][    C1] pc : [<802b403c>]    lr : [<802e12cc>]    psr: 00000193
[ 4902.590210][    C1] sp : 8b4c4020  ip : 8b4c4030  fp : 8b4c402c
[ 4902.590340][    C1] r10: 00000010  r9 : 8b4c4000  r8 : de5c7698
[ 4902.590496][    C1] r7 : 80100a74  r6 : ffffffff  r5 : 00000193  r4 : 802b403c
[ 4902.590652][    C1] r3 : 84262114  r2 : 00260100  r1 : 00000004  r0 : 84262114
[ 4902.590819][    C1] Flags: nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment none
[ 4902.590962][    C1] Control: 10c5387d  Table: 895ac06a  DAC: 00000051
[ 4902.591127][    C1] Process kworker/1:7 (pid: 5928, stack limit = 0x4e3e8f57)
[ 4902.591245][    C1] Stack: (0x8b4c4020 to 0x8b4c4000)
[ 4902.591324][    C1] Backtrace:
[ 4902.599980][    C1] [<802b3ffc>] (__sanitizer_cov_trace_pc) from [<802e12cc>] (trace_hardirqs_off+0x14/0x120)
[ 4902.600211][    C1] [<802e12b8>] (trace_hardirqs_off) from [<80100a74>] (__dabt_svc+0x54/0xa0)
[ 4902.600341][    C1] Exception stack(0x8b4c4058 to 0x8b4c40a0)
[ 4902.656393][ T5953] 8<--- cut here ---
[ 4902.657475][ T5953] Unhandled fault: page domain fault (0x01b) at 0x0000003c
[ 4902.658584][ T5953] pgd = bc232184
[ 4902.659363][ T5953] [0000003c] *pgd=00000000
[ 4902.660316][ T5953] Internal error: : 1b [#2] PREEMPT SMP ARM
[ 4902.661065][ T5953] Dumping ftrace buffer:
[ 4902.661594][ T5953]    (ftrace buffer empty)
[ 4902.662235][ T5953] Modules linked in:
[ 4902.663209][ T5953] CPU: 1 PID: 5953 Comm: kworker/u4:5 Not tainted 5.10.0-rc1+ #19
[ 4902.663783][ T5953] Hardware name: ARM-Versatile Express
[ 4902.664811][ T5953] Workqueue: bat_events batadv_iv_send_outstanding_bat_ogm_packet
[ 4902.666303][ T5953] PC is at batadv_iv_ogm_schedule_buff+0x540/0x8f4
[ 4902.666952][ T5953] LR is at batadv_iv_ogm_schedule_buff+0x540/0x8f4
[ 4902.667455][ T5953] pc : [<83588324>]    lr : [<83588324>]    psr: 800f0113
[ 4902.667987][ T5953] sp : 8a209e20  ip : 8a209e20  fp : 8a209e84
[ 4902.668495][ T5953] r10: 8b19be00  r9 : 8b16c7a0  r8 : 0000003c
[ 4902.669039][ T5953] r7 : 00000000  r6 : 00000001  r5 : 00000007  r4 : 8b1b0c18
[ 4902.669722][ T5953] r3 : 00000000  r2 : 00000000  r1 : 8b5b2dc0  r0 : 00000000
[ 4902.670686][ T5953] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[ 4902.671561][ T5953] Control: 10c5387d  Table: 8b6a406a  DAC: 00000051
[ 4902.672286][ T5953] Process kworker/u4:5 (pid: 5953, stack limit = 0x0cc057c1)
[ 4902.672870][ T5953] Stack: (0x8a209e20 to 0x8a20a000)

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

* Re: Arm + KASAN + syzbot
  2021-03-11 14:55                                             ` Arnd Bergmann
@ 2021-03-11 18:08                                               ` Dmitry Vyukov
  2021-03-15 14:01                                               ` Linus Walleij
  1 sibling, 0 replies; 47+ messages in thread
From: Dmitry Vyukov @ 2021-03-11 18:08 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Russell King - ARM Linux admin, Linus Walleij,
	Krzysztof Kozlowski, syzkaller, kasan-dev, Hailong Liu,
	Linux ARM

On Thu, Mar 11, 2021 at 3:55 PM Arnd Bergmann <arnd@arndb.de> wrote:
> On Thu, Mar 11, 2021 at 3:09 PM Russell King - ARM Linux admin
> <linux@armlinux.org.uk> wrote:
> > On Thu, Mar 11, 2021 at 02:55:54PM +0100, Linus Walleij wrote:
> > > On Thu, Mar 11, 2021 at 11:54 AM Dmitry Vyukov <dvyukov@google.com> wrote:
> > > > The instance has KASAN disabled because Go binaries don't run on KASAN kernel:
> > > > https://lore.kernel.org/linux-arm-kernel/CACT4Y+YdJoNTqnBSELcEbcbVsKBtJfYUc7_GSXbUQfAJN3JyRg@mail.gmail.com/
> > >
> > > I am still puzzled by this, but I still have the open question about how much
> > > memory the Go runtime really uses. I suspect quite a lot, and the
> > > ARM32 instance isn't on par with any contemporary server or desktop
> > > when it comes to memory: it has ~2GB for a userspace program, and after
> > > that bad things will happen: the machine will start thrashing.
> >
> > I believe grafana is a Go binary - I run this in a VM with only 1G
> > of memory and no swap along with apache. It's happy enough.
> >
> > USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> > grafana   1122  0.0  5.9 920344 60484 ?        Ssl  Feb18  28:31 /usr/sbin/grafana-server --config=/etc/grafana/grafana.ini ...
> >
> > So, I suspect it's basically KASAN upsetting Go somehow that then
> > causes the memory usage to spiral out of control.
>
> I found a bug report from someone complaining that Go reserves a lot of
> virtual address space, and that this breaks an application that works
> with VMSPLIT_3G when changing to VMSPLIT_2G:
>
> https://github.com/golang/go/issues/35677
>
> If KASAN limits the address space available to user space, there might be
> a related issue, even when there is still physical memory available.

An issue with virtual/physical memory is my current hypothesis as well
(though it is not well grounded). The Go binary is also quite beefy in
terms of code size and memory consumption, but it works fine without KASAN.
We have a long-term plan to move all Go binaries out of the target,
but there is no ETA; it is more of a wish.

* Re: Arm + KASAN + syzbot
  2021-03-11 14:55                                             ` Arnd Bergmann
  2021-03-11 18:08                                               ` Dmitry Vyukov
@ 2021-03-15 14:01                                               ` Linus Walleij
  2021-03-15 19:03                                                 ` Russell King - ARM Linux admin
  1 sibling, 1 reply; 47+ messages in thread
From: Linus Walleij @ 2021-03-15 14:01 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Russell King - ARM Linux admin, Dmitry Vyukov,
	Krzysztof Kozlowski, syzkaller, kasan-dev, Hailong Liu,
	Linux ARM

On Thu, Mar 11, 2021 at 3:55 PM Arnd Bergmann <arnd@arndb.de> wrote:

> If KASAN limits the address space available to user space, there might be
> a related issue, even when there is still physical memory available.

So in this case, with the 2/2 split, the userspace TASK_SIZE
(arch/arm/include/asm/memory.h) will be KASAN_SHADOW_START,
which here is 0x6ee00000.
Details in
commit c12366ba441da2f6f2b915410aca2b5b39c1651.

I'm just puzzled that OOM is not kicking in if the binary
runs out of virtual memory (hits 0x6ee00000).
It sure occurs when we run out of physical memory;
that has happened to me on 16MB systems.

What happens if we just use PAGE_OFFSET 0xC0000000
like most platforms? This frees up a whole bunch of virtual
memory for userspace (TASK_SIZE will be 0xb6e00000).
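
The two TASK_SIZE figures above can be reproduced with a little arithmetic.
The following is only a sketch under stated assumptions, not a copy of the
kernel macros: it assumes the shadow region sits directly below a 16 MiB
module area under PAGE_OFFSET, and that it covers everything from the start
of that module area to the top of the 32-bit address space at one shadow
byte per eight bytes. The constant and function names are made up for
illustration.

```python
# Assumed layout (illustrative, not taken from the kernel headers):
MODULES_AREA = 16 << 20        # 16 MiB module area just below PAGE_OFFSET
SHADOW_SCALE_SHIFT = 3         # one shadow byte tracks 8 bytes of memory
ADDRESS_SPACE_END = 1 << 32    # top of the 32-bit virtual address space


def kasan_shadow_start(page_offset: int) -> int:
    """Return the lowest shadow address, which caps userspace TASK_SIZE."""
    kernel_start = page_offset - MODULES_AREA
    shadow_size = (ADDRESS_SPACE_END - kernel_start) >> SHADOW_SCALE_SHIFT
    return kernel_start - shadow_size


for page_offset in (0x80000000, 0xC0000000):
    start = kasan_shadow_start(page_offset)
    print(f"PAGE_OFFSET={page_offset:#010x} -> shadow start {start:#010x}")
```

Under these assumptions the 2G/2G split gives 0x6ee00000 and the 3G/1G split
gives 0xb6e00000, i.e. about 1.15 GiB more virtual address space for
userspace in the latter case.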

Yours,
Linus Walleij

* Re: Arm + KASAN + syzbot
  2021-03-15 14:01                                               ` Linus Walleij
@ 2021-03-15 19:03                                                 ` Russell King - ARM Linux admin
  0 siblings, 0 replies; 47+ messages in thread
From: Russell King - ARM Linux admin @ 2021-03-15 19:03 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Arnd Bergmann, Dmitry Vyukov, Krzysztof Kozlowski, syzkaller,
	kasan-dev, Hailong Liu, Linux ARM

On Mon, Mar 15, 2021 at 03:01:32PM +0100, Linus Walleij wrote:
> On Thu, Mar 11, 2021 at 3:55 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > If KASAN limits the address space available to user space, there might be
> > a related issue, even when there is still physical memory available.
> 
> I'm just puzzled that OOM is not kicking in if the binary
> runs out of virtual memory (hits 0x6ee00000).

The OOM-killer has nothing to do with the virtual space for processes.
The OOM-killer is about physical page starvation in the kernel.

A process will instead find mmap() failing (returning MAP_FAILED with
errno set to ENOMEM) or attempts to increase the heap via brk() failing.

Neither of these events should result in any effect on the kernel;
the process on the other hand may make an illegal access and be
given a segfault.
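
The difference can be seen from userspace without a 32-bit machine. The
sketch below is an illustration only: it assumes a 64-bit Linux host and
uses RLIMIT_AS as a stand-in for a small TASK_SIZE. The helper name
mmap_under_cap is made up. The point is that the mapping request itself
fails with ENOMEM while plenty of physical memory is still free, so the
OOM-killer is never involved.

```python
import errno
import mmap
import resource


def mmap_under_cap(cap_bytes: int, request_bytes: int) -> int:
    """Try an anonymous mapping with the address space capped.

    Returns 0 on success, otherwise the errno reported by mmap().
    The previous soft limit is restored before returning.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    if hard != resource.RLIM_INFINITY:
        cap_bytes = min(cap_bytes, hard)  # the soft limit may not exceed the hard one
    resource.setrlimit(resource.RLIMIT_AS, (cap_bytes, hard))
    try:
        region = mmap.mmap(-1, request_bytes)  # anonymous mapping, nothing touched
    except OSError as exc:
        return exc.errno
    else:
        region.close()
        return 0
    finally:
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))


# Ask for 4 GiB of address space while capped at 1 GiB.
result = mmap_under_cap(1 << 30, 4 << 30)
print(errno.errorcode.get(result, result))
```

A program that ignores such a failure and uses the result anyway is what
ends up with the segfault mentioned above.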

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

end of thread, other threads:[~2021-03-15 19:05 UTC | newest]

Thread overview: 47+ messages
2021-01-18 16:31 Arm + KASAN + syzbot Dmitry Vyukov
2021-01-19  8:36 ` Krzysztof Kozlowski
2021-01-19  8:46   ` Linus Walleij
2021-01-19 10:04   ` Dmitry Vyukov
2021-01-19 10:17     ` Linus Walleij
2021-01-19 10:23       ` Dmitry Vyukov
2021-01-19 10:28         ` Linus Walleij
2021-01-19 10:53           ` Dmitry Vyukov
2021-01-19 11:05             ` Dmitry Vyukov
2021-01-19 11:13               ` Russell King - ARM Linux admin
2021-01-19 11:17                 ` Dmitry Vyukov
2021-01-19 11:43                   ` Russell King - ARM Linux admin
2021-01-19 12:05                     ` Dmitry Vyukov
2021-01-19 12:36                       ` Russell King - ARM Linux admin
2021-01-19 18:57                         ` Dmitry Vyukov
2021-01-19 19:48                           ` Russell King - ARM Linux admin
2021-01-21 13:14                             ` Russell King - ARM Linux admin
2021-01-21 13:49                               ` Dmitry Vyukov
2021-01-21 14:04                                 ` Arnd Bergmann
2021-01-21 13:59                             ` Dmitry Vyukov
2021-01-21 14:52                               ` Linus Walleij
2021-01-26 21:24                                 ` Dmitry Vyukov
2021-01-27  8:24                                   ` Linus Walleij
2021-01-27  9:39                                     ` Dmitry Vyukov
2021-01-27  9:57                                       ` Linus Walleij
2021-01-27 10:12                                         ` Dmitry Vyukov
2021-01-27 10:19                                     ` Russell King - ARM Linux admin
2021-03-11 10:54                                       ` Dmitry Vyukov
2021-03-11 13:42                                         ` Russell King - ARM Linux admin
2021-03-11 18:05                                           ` Dmitry Vyukov
2021-03-11 13:55                                         ` Linus Walleij
2021-03-11 14:09                                           ` Russell King - ARM Linux admin
2021-03-11 14:37                                             ` Linus Walleij
2021-03-11 14:55                                             ` Arnd Bergmann
2021-03-11 18:08                                               ` Dmitry Vyukov
2021-03-15 14:01                                               ` Linus Walleij
2021-03-15 19:03                                                 ` Russell King - ARM Linux admin
2021-01-19 13:22                       ` Linus Walleij
2021-01-19  8:41 ` Linus Walleij
2021-01-19  8:43   ` Linus Walleij
2021-01-19 10:18   ` Dmitry Vyukov
2021-01-19 10:27     ` Linus Walleij
2021-01-19 10:36       ` Dmitry Vyukov
2021-01-19 10:03 ` Mark Rutland
2021-01-19 10:34   ` Dmitry Vyukov
2021-01-19 10:55     ` Russell King - ARM Linux admin
2021-01-19 13:00     ` Mark Rutland
