linux-mm.kvack.org archive mirror
* BUG: KASAN: stack-out-of-bounds
@ 2019-02-27  8:25 Christophe Leroy
  2019-02-27  8:34 ` Dmitry Vyukov
  2019-02-27  9:19 ` Andrey Ryabinin
  0 siblings, 2 replies; 13+ messages in thread
From: Christophe Leroy @ 2019-02-27  8:25 UTC
  To: Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov
  Cc: Daniel Axtens, linux-mm, linuxppc-dev, kasan-dev

With version v8 of the series implementing KASAN on 32-bit powerpc 
(https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=94309), 
I'm now able to activate KASAN on a mac99 in QEMU.

Then I get the following reports at startup. Which of the two kinds of 
report I get seems to depend on the options used to build the kernel, 
but for a given kernel I always get the same report.

Is that a real bug, and if so, how could I track it down? Or is 
something wrong in my implementation of KASAN?

I checked that after kasan_init(), the entire shadow memory contains 
only zeroes.

I also tried building with the strong STACK_PROTECTOR option, but it 
made no difference and the stack protector detected nothing.

==================================================================
BUG: KASAN: stack-out-of-bounds in memchr+0x24/0x74
Read of size 1 at addr c0ecdd40 by task swapper/0

CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1133
Call Trace:
[c0e9dca0] [c01c42a0] print_address_description+0x64/0x2bc (unreliable)
[c0e9dcd0] [c01c4684] kasan_report+0xfc/0x180
[c0e9dd10] [c089579c] memchr+0x24/0x74
[c0e9dd30] [c00a9e38] msg_print_text+0x124/0x574
[c0e9dde0] [c00ab710] console_unlock+0x114/0x4f8
[c0e9de40] [c00adc60] vprintk_emit+0x188/0x1c4
--- interrupt: c0e9df00 at 0x400f330
     LR = init_stack+0x1f00/0x2000
[c0e9de80] [c00ae3c4] printk+0xa8/0xcc (unreliable)
[c0e9df20] [c0c28e44] early_irq_init+0x38/0x108
[c0e9df50] [c0c16434] start_kernel+0x310/0x488
[c0e9dff0] [00003484] 0x3484

The buggy address belongs to the variable:
  __log_buf+0xec0/0x4020
The buggy address belongs to the page:
page:c6eac9a0 count:1 mapcount:0 mapping:00000000 index:0x0
flags: 0x1000(reserved)
raw: 00001000 c6eac9a4 c6eac9a4 00000000 00000000 00000000 ffffffff 00000001
page dumped because: kasan: bad access detected

Memory state around the buggy address:
  c0ecdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  c0ecdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 >c0ecdd00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
                                    ^
  c0ecdd80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
  c0ecde00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================

==================================================================
BUG: KASAN: stack-out-of-bounds in pmac_nvram_init+0x1ec/0x600
Read of size 1 at addr f6f37de0 by task swapper/0

CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1134
Call Trace:
[c0ff7d60] [c01fe808] print_address_description+0x6c/0x2b0 (unreliable)
[c0ff7d90] [c01fe4fc] kasan_report+0x13c/0x1ac
[c0ff7dd0] [c0d34324] pmac_nvram_init+0x1ec/0x600
[c0ff7ef0] [c0d31148] pmac_setup_arch+0x280/0x308
[c0ff7f20] [c0d2c30c] setup_arch+0x250/0x280
[c0ff7f50] [c0d26354] start_kernel+0xb8/0x4d8
[c0ff7ff0] [00003484] 0x3484


Memory state around the buggy address:
  f6f37c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  f6f37d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 >f6f37d80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
                                                ^
  f6f37e00: 00 00 00 00 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2
  f6f37e80: 00 00 01 f2 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================

==================================================================
BUG: KASAN: stack-out-of-bounds in memchr+0xa0/0xac
Read of size 1 at addr c17cdd30 by task swapper/0

CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1135
Call Trace:
[c179dc90] [c032fe28] print_address_description+0x64/0x2bc (unreliable)
[c179dcc0] [c033020c] kasan_report+0xfc/0x180
[c179dd00] [c115ef50] memchr+0xa0/0xac
[c179dd20] [c01297f8] msg_print_text+0xc8/0x67c
[c179ddd0] [c012bc8c] console_unlock+0x17c/0x818
[c179de40] [c012f420] vprintk_emit+0x188/0x1c4
--- interrupt: c179df30 at 0x400def0
     LR = init_stack+0x1ef0/0x2000
[c179de80] [c012fff0] printk+0xa8/0xcc (unreliable)
[c179df20] [c150b4b8] early_irq_init+0x38/0x108
[c179df50] [c14ef7f8] start_kernel+0x30c/0x530
[c179dff0] [00003484] 0x3484

The buggy address belongs to the variable:
  __log_buf+0xeb0/0x4020
The buggy address belongs to the page:
page:c6ebe9a0 count:1 mapcount:0 mapping:00000000 index:0x0
flags: 0x1000(reserved)
raw: 00001000 c6ebe9a4 c6ebe9a4 00000000 00000000 00000000 ffffffff 00000001
page dumped because: kasan: bad access detected

Memory state around the buggy address:
  c17cdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  c17cdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 >c17cdd00: 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00 f3 f3
                              ^
  c17cdd80: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  c17cde00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================

==================================================================
BUG: KASAN: stack-out-of-bounds in pmac_nvram_init+0x228/0xae0
Read of size 1 at addr f6f37dd0 by task swapper/0

CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1136
Call Trace:
[c1c37d50] [c03f7e88] print_address_description+0x6c/0x2b0 (unreliable)
[c1c37d80] [c03f7bd4] kasan_report+0x10c/0x16c
[c1c37dc0] [c19879b4] pmac_nvram_init+0x228/0xae0
[c1c37ef0] [c19826bc] pmac_setup_arch+0x578/0x6a8
[c1c37f20] [c19792bc] setup_arch+0x5f4/0x620
[c1c37f50] [c196f898] start_kernel+0xb8/0x588
[c1c37ff0] [00003484] 0x3484


Memory state around the buggy address:
  f6f37c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  f6f37d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 >f6f37d80: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00
                                          ^
  f6f37e00: 01 f2 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2 00 00
  f6f37e80: 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00
==================================================================

==================================================================
BUG: KASAN: stack-out-of-bounds in pmac_nvram_init+0x1ec/0x5ec
Read of size 1 at addr f6f37de0 by task swapper/0

CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1137
Call Trace:
[c0fb7d60] [c01f8184] print_address_description+0x6c/0x2b0 (unreliable)
[c0fb7d90] [c01f7ed0] kasan_report+0x10c/0x16c
[c0fb7dd0] [c0d1dfe8] pmac_nvram_init+0x1ec/0x5ec
[c0fb7ef0] [c0d1ae90] pmac_setup_arch+0x280/0x308
[c0fb7f20] [c0d16138] setup_arch+0x250/0x280
[c0fb7f50] [c0d1032c] start_kernel+0xb8/0x4a4
[c0fb7ff0] [00003484] 0x3484


Memory state around the buggy address:
  f6f37c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  f6f37d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 >f6f37d80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
                                                ^
  f6f37e00: 00 00 01 f2 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2
  f6f37e80: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00
==================================================================

Thanks
Christophe
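
For readers decoding the "Memory state" dumps in the reports above, here is a minimal sketch of how the caret position relates to the faulting address. It assumes generic KASAN's usual scale of 8 bytes of memory per shadow byte and 16 shadow bytes per row, which matches the 0x80 step between row labels in these reports. The helper names are hypothetical.

```python
# Sketch: locate the shadow byte that a KASAN "Memory state" row shows for
# a given address. Assumes 8 bytes of memory per shadow byte and 16 shadow
# bytes per dump row, as the 0x80 spacing of the row labels suggests.
SHADOW_SCALE = 8        # bytes of memory covered by one shadow byte
BYTES_PER_ROW = 16      # shadow bytes printed per row


def parse_row(line):
    """Split a dump row such as ' >c0ecdd00: 00 00 ...' into (base, bytes)."""
    label, _, rest = line.strip().lstrip(">").partition(":")
    return int(label, 16), [int(b, 16) for b in rest.split()]


def shadow_byte_for(addr, row_line):
    """Return (index in row, shadow value) for addr, or raise if not covered."""
    base, shadow = parse_row(row_line)
    span = SHADOW_SCALE * BYTES_PER_ROW   # 128 bytes of memory per row
    if not base <= addr < base + span:
        raise ValueError("address is not covered by this row")
    index = (addr - base) // SHADOW_SCALE
    return index, shadow[index]


row = " >c0ecdd00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00"
index, value = shadow_byte_for(0xC0ECDD40, row)
print(index, hex(value))
```

For the first report this selects the ninth shadow byte of the marked row, the first f1, which is where the caret points.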



* Re: BUG: KASAN: stack-out-of-bounds
  2019-02-27  8:25 BUG: KASAN: stack-out-of-bounds Christophe Leroy
@ 2019-02-27  8:34 ` Dmitry Vyukov
  2019-02-27 12:35   ` Christophe Leroy
  2019-02-27  9:19 ` Andrey Ryabinin
  1 sibling, 1 reply; 13+ messages in thread
From: Dmitry Vyukov @ 2019-02-27  8:34 UTC
  To: Christophe Leroy
  Cc: Andrey Ryabinin, Alexander Potapenko, Daniel Axtens, Linux-MM,
	linuxppc-dev, kasan-dev

On Wed, Feb 27, 2019 at 9:25 AM Christophe Leroy
<christophe.leroy@c-s.fr> wrote:
>
> With version v8 of the series implementing KASAN on 32-bit powerpc
> (https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=94309),
> I'm now able to activate KASAN on a mac99 in QEMU.
>
> Then I get the following reports at startup. Which of the two reports I
> get seems to depend on the option used to build the kernel, but for a
> given kernel I always get the same report.
>
> Is that a real bug, in which case how could I spot it ? Or is it
> something wrong in my implementation of KASAN ?

What is the state of your source tree?
Please pass the output through a symbolization script; function offsets
alone are not very useful.
There was one in the scripts/ dir IIRC, but here is another one (though
never tested on powerpc):
https://github.com/google/sanitizers/blob/master/address-sanitizer/tools/kasan_symbolize.py
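
As a rough illustration of the first step a symbolizer has to perform, the sketch below splits a call-trace line into its fields so that the code address could be resolved against the matching vmlinux (for example with a tool such as addr2line). The regular expression is an assumption based on the powerpc trace format shown in this thread, and the function name is hypothetical; it is not how kasan_symbolize.py is implemented.

```python
import re

# Sketch: split a powerpc call-trace line into its fields. The pattern is
# an assumption derived from lines such as
#   [c0e9dd10] [c089579c] memchr+0x24/0x74
TRACE_RE = re.compile(
    r"\[(?P<sp>[0-9a-f]{8})\] \[(?P<pc>[0-9a-f]{8})\] "
    r"(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_trace_line(line):
    """Return a dict of trace fields, or None if the line has no symbol."""
    m = TRACE_RE.search(line)
    if m is None:
        return None          # e.g. the raw '0x3484' entry
    return {
        "sp": int(m["sp"], 16),
        "pc": int(m["pc"], 16),
        "symbol": m["sym"],
        "offset": int(m["off"], 16),
        "size": int(m["size"], 16),
    }


entry = parse_trace_line("[c0e9dd10] [c089579c] memchr+0x24/0x74")
# The function's start address is the code address minus the offset.
print(entry["symbol"], hex(entry["pc"] - entry["offset"]))
```

Lines carrying a trailing "(unreliable)" still match, because the pattern is searched for rather than anchored to the end of the line.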



> I checked that after kasan_init(), the entire shadow memory is full of 0
> only.
>
> I also made a try with the strong STACK_PROTECTOR compiled in, but no
> difference and nothing detected by the stack protector.
>
> ==================================================================
> BUG: KASAN: stack-out-of-bounds in memchr+0x24/0x74
> Read of size 1 at addr c0ecdd40 by task swapper/0
>
> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1133
> Call Trace:
> [c0e9dca0] [c01c42a0] print_address_description+0x64/0x2bc (unreliable)
> [c0e9dcd0] [c01c4684] kasan_report+0xfc/0x180
> [c0e9dd10] [c089579c] memchr+0x24/0x74
> [c0e9dd30] [c00a9e38] msg_print_text+0x124/0x574
> [c0e9dde0] [c00ab710] console_unlock+0x114/0x4f8
> [c0e9de40] [c00adc60] vprintk_emit+0x188/0x1c4
> --- interrupt: c0e9df00 at 0x400f330
>      LR = init_stack+0x1f00/0x2000
> [c0e9de80] [c00ae3c4] printk+0xa8/0xcc (unreliable)
> [c0e9df20] [c0c28e44] early_irq_init+0x38/0x108
> [c0e9df50] [c0c16434] start_kernel+0x310/0x488
> [c0e9dff0] [00003484] 0x3484
>
> The buggy address belongs to the variable:
>   __log_buf+0xec0/0x4020
> The buggy address belongs to the page:
> page:c6eac9a0 count:1 mapcount:0 mapping:00000000 index:0x0
> flags: 0x1000(reserved)
> raw: 00001000 c6eac9a4 c6eac9a4 00000000 00000000 00000000 ffffffff 00000001
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
>   c0ecdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   c0ecdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  >c0ecdd00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
>                                     ^
>   c0ecdd80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
>   c0ecde00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ==================================================================
>
> ==================================================================
> BUG: KASAN: stack-out-of-bounds in pmac_nvram_init+0x1ec/0x600
> Read of size 1 at addr f6f37de0 by task swapper/0
>
> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1134
> Call Trace:
> [c0ff7d60] [c01fe808] print_address_description+0x6c/0x2b0 (unreliable)
> [c0ff7d90] [c01fe4fc] kasan_report+0x13c/0x1ac
> [c0ff7dd0] [c0d34324] pmac_nvram_init+0x1ec/0x600
> [c0ff7ef0] [c0d31148] pmac_setup_arch+0x280/0x308
> [c0ff7f20] [c0d2c30c] setup_arch+0x250/0x280
> [c0ff7f50] [c0d26354] start_kernel+0xb8/0x4d8
> [c0ff7ff0] [00003484] 0x3484
>
>
> Memory state around the buggy address:
>   f6f37c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   f6f37d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  >f6f37d80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
>                                                 ^
>   f6f37e00: 00 00 00 00 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2
>   f6f37e80: 00 00 01 f2 00 00 00 00 00 00 00 00 00 00 00 00
> ==================================================================
>
> ==================================================================
> BUG: KASAN: stack-out-of-bounds in memchr+0xa0/0xac
> Read of size 1 at addr c17cdd30 by task swapper/0
>
> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1135
> Call Trace:
> [c179dc90] [c032fe28] print_address_description+0x64/0x2bc (unreliable)
> [c179dcc0] [c033020c] kasan_report+0xfc/0x180
> [c179dd00] [c115ef50] memchr+0xa0/0xac
> [c179dd20] [c01297f8] msg_print_text+0xc8/0x67c
> [c179ddd0] [c012bc8c] console_unlock+0x17c/0x818
> [c179de40] [c012f420] vprintk_emit+0x188/0x1c4
> --- interrupt: c179df30 at 0x400def0
>      LR = init_stack+0x1ef0/0x2000
> [c179de80] [c012fff0] printk+0xa8/0xcc (unreliable)
> [c179df20] [c150b4b8] early_irq_init+0x38/0x108
> [c179df50] [c14ef7f8] start_kernel+0x30c/0x530
> [c179dff0] [00003484] 0x3484
>
> The buggy address belongs to the variable:
>   __log_buf+0xeb0/0x4020
> The buggy address belongs to the page:
> page:c6ebe9a0 count:1 mapcount:0 mapping:00000000 index:0x0
> flags: 0x1000(reserved)
> raw: 00001000 c6ebe9a4 c6ebe9a4 00000000 00000000 00000000 ffffffff 00000001
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
>   c17cdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   c17cdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  >c17cdd00: 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00 f3 f3
>                               ^
>   c17cdd80: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   c17cde00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ==================================================================
>
> ==================================================================
> BUG: KASAN: stack-out-of-bounds in pmac_nvram_init+0x228/0xae0
> Read of size 1 at addr f6f37dd0 by task swapper/0
>
> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1136
> Call Trace:
> [c1c37d50] [c03f7e88] print_address_description+0x6c/0x2b0 (unreliable)
> [c1c37d80] [c03f7bd4] kasan_report+0x10c/0x16c
> [c1c37dc0] [c19879b4] pmac_nvram_init+0x228/0xae0
> [c1c37ef0] [c19826bc] pmac_setup_arch+0x578/0x6a8
> [c1c37f20] [c19792bc] setup_arch+0x5f4/0x620
> [c1c37f50] [c196f898] start_kernel+0xb8/0x588
> [c1c37ff0] [00003484] 0x3484
>
>
> Memory state around the buggy address:
>   f6f37c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   f6f37d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  >f6f37d80: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00
>                                           ^
>   f6f37e00: 01 f2 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2 00 00
>   f6f37e80: 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00
> ==================================================================
>
> ==================================================================
> BUG: KASAN: stack-out-of-bounds in pmac_nvram_init+0x1ec/0x5ec
> Read of size 1 at addr f6f37de0 by task swapper/0
>
> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1137
> Call Trace:
> [c0fb7d60] [c01f8184] print_address_description+0x6c/0x2b0 (unreliable)
> [c0fb7d90] [c01f7ed0] kasan_report+0x10c/0x16c
> [c0fb7dd0] [c0d1dfe8] pmac_nvram_init+0x1ec/0x5ec
> [c0fb7ef0] [c0d1ae90] pmac_setup_arch+0x280/0x308
> [c0fb7f20] [c0d16138] setup_arch+0x250/0x280
> [c0fb7f50] [c0d1032c] start_kernel+0xb8/0x4a4
> [c0fb7ff0] [00003484] 0x3484
>
>
> Memory state around the buggy address:
>   f6f37c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   f6f37d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  >f6f37d80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
>                                                 ^
>   f6f37e00: 00 00 01 f2 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2
>   f6f37e80: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00
> ==================================================================
>
> Thanks
> Christophe



* Re: BUG: KASAN: stack-out-of-bounds
  2019-02-27  8:25 BUG: KASAN: stack-out-of-bounds Christophe Leroy
  2019-02-27  8:34 ` Dmitry Vyukov
@ 2019-02-27  9:19 ` Andrey Ryabinin
  2019-02-27  9:25   ` Dmitry Vyukov
  2019-02-27 13:11   ` Christophe Leroy
  1 sibling, 2 replies; 13+ messages in thread
From: Andrey Ryabinin @ 2019-02-27  9:19 UTC
  To: Christophe Leroy, Alexander Potapenko, Dmitry Vyukov
  Cc: Daniel Axtens, linux-mm, linuxppc-dev, kasan-dev



On 2/27/19 11:25 AM, Christophe Leroy wrote:
> With version v8 of the series implementing KASAN on 32-bit powerpc (https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=94309), I'm now able to activate KASAN on a mac99 in QEMU.
> 
> Then I get the following reports at startup. Which of the two reports I get seems to depend on the option used to build the kernel, but for a given kernel I always get the same report.
> 
> Is that a real bug, in which case how could I spot it ? Or is it something wrong in my implementation of KASAN ?
> 
> I checked that after kasan_init(), the entire shadow memory is full of 0 only.
> 
> I also made a try with the strong STACK_PROTECTOR compiled in, but no difference and nothing detected by the stack protector.
> 
> ==================================================================
> BUG: KASAN: stack-out-of-bounds in memchr+0x24/0x74
> Read of size 1 at addr c0ecdd40 by task swapper/0
> 
> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1133
> Call Trace:
> [c0e9dca0] [c01c42a0] print_address_description+0x64/0x2bc (unreliable)
> [c0e9dcd0] [c01c4684] kasan_report+0xfc/0x180
> [c0e9dd10] [c089579c] memchr+0x24/0x74
> [c0e9dd30] [c00a9e38] msg_print_text+0x124/0x574
> [c0e9dde0] [c00ab710] console_unlock+0x114/0x4f8
> [c0e9de40] [c00adc60] vprintk_emit+0x188/0x1c4
> --- interrupt: c0e9df00 at 0x400f330
>     LR = init_stack+0x1f00/0x2000
> [c0e9de80] [c00ae3c4] printk+0xa8/0xcc (unreliable)
> [c0e9df20] [c0c28e44] early_irq_init+0x38/0x108
> [c0e9df50] [c0c16434] start_kernel+0x310/0x488
> [c0e9dff0] [00003484] 0x3484
> 
> The buggy address belongs to the variable:
>  __log_buf+0xec0/0x4020
> The buggy address belongs to the page:
> page:c6eac9a0 count:1 mapcount:0 mapping:00000000 index:0x0
> flags: 0x1000(reserved)
> raw: 00001000 c6eac9a4 c6eac9a4 00000000 00000000 00000000 ffffffff 00000001
> page dumped because: kasan: bad access detected
> 
> Memory state around the buggy address:
>  c0ecdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  c0ecdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>c0ecdd00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
>                                    ^
>  c0ecdd80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
>  c0ecde00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ==================================================================
> 

This one doesn't look good. Notice that it says stack-out-of-bounds, but at the same time it reports
	"The buggy address belongs to the variable:  __log_buf+0xec0/0x4020"
 which is printed by the following code:
	if (kernel_or_module_addr(addr) && !init_task_stack_addr(addr)) {
		pr_err("The buggy address belongs to the variable:\n");
		pr_err(" %pS\n", addr);
	}

So a stack-unrelated address got stack-related poisoning. This could be a stack overflow; did you increase THREAD_SHIFT?
KASAN with stack instrumentation significantly increases stack usage.
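
To make the "stack-related poisoning" point concrete, here is a minimal sketch that maps shadow bytes from the dump to their meaning. The stack and global redzone values are quoted from memory of mm/kasan/kasan.h and should be checked against the tree being debugged; the function and table names are hypothetical.

```python
# Sketch: describe generic-KASAN shadow bytes. Only a few poison values are
# listed; they are assumptions to verify against mm/kasan/kasan.h.
KASAN_SHADOW_SCALE_SIZE = 8
SHADOW_MEANING = {
    0xF1: "stack left redzone",
    0xF2: "stack mid redzone",
    0xF3: "stack right redzone",
    0xFA: "global redzone",
}


def describe_shadow(value):
    """Return a short description of one shadow byte."""
    if value == 0:
        return "fully accessible"
    if 0 < value < KASAN_SHADOW_SCALE_SIZE:
        return f"first {value} byte(s) accessible"
    return SHADOW_MEANING.get(value, "other poison value")


row = "00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00"
for text in sorted(set(row.split())):
    print(text, "->", describe_shadow(int(text, 16)))
```

The f1 byte under the caret is a stack redzone marker, which is why the report is typed stack-out-of-bounds even though the address resolves to __log_buf.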



* Re: BUG: KASAN: stack-out-of-bounds
  2019-02-27  9:19 ` Andrey Ryabinin
@ 2019-02-27  9:25   ` Dmitry Vyukov
  2019-02-27  9:33     ` Christophe Leroy
  2019-02-27 13:11   ` Christophe Leroy
  1 sibling, 1 reply; 13+ messages in thread
From: Dmitry Vyukov @ 2019-02-27  9:25 UTC
  To: Andrey Ryabinin
  Cc: Christophe Leroy, Alexander Potapenko, Daniel Axtens, Linux-MM,
	linuxppc-dev, kasan-dev

On Wed, Feb 27, 2019 at 10:18 AM Andrey Ryabinin
<aryabinin@virtuozzo.com> wrote:
> On 2/27/19 11:25 AM, Christophe Leroy wrote:
> > With version v8 of the series implementing KASAN on 32-bit powerpc (https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=94309), I'm now able to activate KASAN on a mac99 in QEMU.
> >
> > Then I get the following reports at startup. Which of the two reports I get seems to depend on the option used to build the kernel, but for a given kernel I always get the same report.
> >
> > Is that a real bug, in which case how could I spot it ? Or is it something wrong in my implementation of KASAN ?
> >
> > I checked that after kasan_init(), the entire shadow memory is full of 0 only.
> >
> > I also made a try with the strong STACK_PROTECTOR compiled in, but no difference and nothing detected by the stack protector.
> >
> > ==================================================================
> > BUG: KASAN: stack-out-of-bounds in memchr+0x24/0x74
> > Read of size 1 at addr c0ecdd40 by task swapper/0
> >
> > CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1133
> > Call Trace:
> > [c0e9dca0] [c01c42a0] print_address_description+0x64/0x2bc (unreliable)
> > [c0e9dcd0] [c01c4684] kasan_report+0xfc/0x180
> > [c0e9dd10] [c089579c] memchr+0x24/0x74
> > [c0e9dd30] [c00a9e38] msg_print_text+0x124/0x574
> > [c0e9dde0] [c00ab710] console_unlock+0x114/0x4f8
> > [c0e9de40] [c00adc60] vprintk_emit+0x188/0x1c4
> > --- interrupt: c0e9df00 at 0x400f330
> >     LR = init_stack+0x1f00/0x2000
> > [c0e9de80] [c00ae3c4] printk+0xa8/0xcc (unreliable)
> > [c0e9df20] [c0c28e44] early_irq_init+0x38/0x108
> > [c0e9df50] [c0c16434] start_kernel+0x310/0x488
> > [c0e9dff0] [00003484] 0x3484
> >
> > The buggy address belongs to the variable:
> >  __log_buf+0xec0/0x4020
> > The buggy address belongs to the page:
> > page:c6eac9a0 count:1 mapcount:0 mapping:00000000 index:0x0
> > flags: 0x1000(reserved)
> > raw: 00001000 c6eac9a4 c6eac9a4 00000000 00000000 00000000 ffffffff 00000001
> > page dumped because: kasan: bad access detected
> >
> > Memory state around the buggy address:
> >  c0ecdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >  c0ecdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>c0ecdd00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
> >                                    ^
> >  c0ecdd80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
> >  c0ecde00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > ==================================================================
> >
>
> This one doesn't look good. Notice that it says stack-out-of-bounds, but at the same time there is
>         "The buggy address belongs to the variable:  __log_buf+0xec0/0x4020"
>  which is printed by following code:
>         if (kernel_or_module_addr(addr) && !init_task_stack_addr(addr)) {
>                 pr_err("The buggy address belongs to the variable:\n");
>                 pr_err(" %pS\n", addr);
>         }
>
> So the stack unrelated address got stack-related poisoning. This could be a stack overflow, did you increase THREAD_SHIFT?
> KASAN with stack instrumentation significantly increases stack usage.

A straightforward explanation would be that this happens before the real
shadow is mapped and we don't turn off KASAN reports at that point. It
should be easy to check, so it is worth eliminating this possibility
before any other debugging.



* Re: BUG: KASAN: stack-out-of-bounds
  2019-02-27  9:25   ` Dmitry Vyukov
@ 2019-02-27  9:33     ` Christophe Leroy
  0 siblings, 0 replies; 13+ messages in thread
From: Christophe Leroy @ 2019-02-27  9:33 UTC
  To: Dmitry Vyukov, Andrey Ryabinin
  Cc: Alexander Potapenko, Daniel Axtens, Linux-MM, linuxppc-dev, kasan-dev



On 27/02/2019 at 10:25, Dmitry Vyukov wrote:
> On Wed, Feb 27, 2019 at 10:18 AM Andrey Ryabinin
> <aryabinin@virtuozzo.com> wrote:
>> On 2/27/19 11:25 AM, Christophe Leroy wrote:
>>> With version v8 of the series implementing KASAN on 32-bit powerpc (https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=94309), I'm now able to activate KASAN on a mac99 in QEMU.
>>>
>>> Then I get the following reports at startup. Which of the two reports I get seems to depend on the option used to build the kernel, but for a given kernel I always get the same report.
>>>
>>> Is that a real bug, in which case how could I spot it ? Or is it something wrong in my implementation of KASAN ?
>>>
>>> I checked that after kasan_init(), the entire shadow memory is full of 0 only.
>>>
>>> I also made a try with the strong STACK_PROTECTOR compiled in, but no difference and nothing detected by the stack protector.
>>>
>>> ==================================================================
>>> BUG: KASAN: stack-out-of-bounds in memchr+0x24/0x74
>>> Read of size 1 at addr c0ecdd40 by task swapper/0
>>>
>>> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1133
>>> Call Trace:
>>> [c0e9dca0] [c01c42a0] print_address_description+0x64/0x2bc (unreliable)
>>> [c0e9dcd0] [c01c4684] kasan_report+0xfc/0x180
>>> [c0e9dd10] [c089579c] memchr+0x24/0x74
>>> [c0e9dd30] [c00a9e38] msg_print_text+0x124/0x574
>>> [c0e9dde0] [c00ab710] console_unlock+0x114/0x4f8
>>> [c0e9de40] [c00adc60] vprintk_emit+0x188/0x1c4
>>> --- interrupt: c0e9df00 at 0x400f330
>>>      LR = init_stack+0x1f00/0x2000
>>> [c0e9de80] [c00ae3c4] printk+0xa8/0xcc (unreliable)
>>> [c0e9df20] [c0c28e44] early_irq_init+0x38/0x108
>>> [c0e9df50] [c0c16434] start_kernel+0x310/0x488
>>> [c0e9dff0] [00003484] 0x3484
>>>
>>> The buggy address belongs to the variable:
>>>   __log_buf+0xec0/0x4020
>>> The buggy address belongs to the page:
>>> page:c6eac9a0 count:1 mapcount:0 mapping:00000000 index:0x0
>>> flags: 0x1000(reserved)
>>> raw: 00001000 c6eac9a4 c6eac9a4 00000000 00000000 00000000 ffffffff 00000001
>>> page dumped because: kasan: bad access detected
>>>
>>> Memory state around the buggy address:
>>>   c0ecdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>   c0ecdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> c0ecdd00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
>>>                                     ^
>>>   c0ecdd80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
>>>   c0ecde00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> ==================================================================
>>>
>>
>> This one doesn't look good. Notice that it says stack-out-of-bounds, but at the same time there is
>>          "The buggy address belongs to the variable:  __log_buf+0xec0/0x4020"
>>   which is printed by following code:
>>          if (kernel_or_module_addr(addr) && !init_task_stack_addr(addr)) {
>>                  pr_err("The buggy address belongs to the variable:\n");
>>                  pr_err(" %pS\n", addr);
>>          }
>>
>> So the stack unrelated address got stack-related poisoning. This could be a stack overflow, did you increase THREAD_SHIFT?
>> KASAN with stack instrumentation significantly increases stack usage.
> 
> A straightforward explanation would be that this happens before real
> shadow is mapped and we don't turn off KASAN reports. Should be easy
> to check so worth eliminating this possibility before any other
> debugging.
> 

I confirm this happens _after_ the call to kasan_init(), which sets up 
the final shadow mapping. I can also confirm that the entire shadow area 
is zeroed after that call.

kasan_init() is called at the top of setup_arch(), which is called soon 
after the beginning of start_kernel() (see 'KASAN init done' below).

early_irq_init() is called long after that.

Booting Linux via __start() @ 0x01000000 ...
Hello World !
Total memory = 128MB; using 256kB for hash table (at (ptrval))
Linux version 5.0.0-rc7+ (root@po16846vm.idsi0.si.c-s.fr) (gcc version 5.4.0 (GCC)) #1133 Tue Feb 26 03:30:01 UTC 2019
KASAN init done
Found UniNorth memory controller & host bridge @ 0xf8000000 revision: 0x07
Mapped at 0xf77c0000
Found a Keylargo mac-io controller, rev: 0, mapped at 0x(ptrval)
PowerMac motherboard: PowerMac G4 AGP Graphics
boot stdout isn't a display !
Using PowerMac machine description
printk: bootconsole [udbg0] enabled
-----------------------------------------------------
Hash_size         = 0x40000
phys_mem_size     = 0x8000000
dcache_bsize      = 0x20
icache_bsize      = 0x20
cpu_features      = 0x000000000401a00a
   possible        = 0x000000002f7ff14b
   always          = 0x0000000000000000
cpu_user_features = 0x9c000001 0x00000000
mmu_features      = 0x00000001
Hash              = 0x(ptrval)
Hash_mask         = 0xfff
-----------------------------------------------------
Found UniNorth PCI host bridge at 0x00000000f2000000. Firmware bus number: 0->0
PCI host bridge /pci@f2000000 (primary) ranges:
   IO 0x00000000f2000000..0x00000000f27fffff -> 0x0000000000000000
  MEM 0x0000000080000000..0x000000008fffffff -> 0x0000000080000000
nvram: Checking bank 0...
Invalid signature
Invalid checksum
nvram: gen0=0, gen1=0
nvram: Active bank is: 0
nvram: OF partition at 0xffffffff
nvram: XP partition at 0xffffffff
nvram: NR partition at 0xffffffff
Zone ranges:
   Normal   [mem 0x0000000000000000-0x0000000007ffffff]
   HighMem  empty
Movable zone start for each node
Early memory node ranges
   node   0: [mem 0x0000000000000000-0x0000000007ffffff]
Initmem setup node 0 [mem 0x0000000000000000-0x0000000007ffffff]
Built 1 zonelists, mobility grouping on.  Total pages: 32512
Kernel command line: console=/dev/ttyS0
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
Memory: 93544K/131072K available (8868K kernel code, 1700K rwdata, 3484K rodata, 1004K init, 4434K bss, 37528K reserved, 0K cma-reserved, 0K highmem)
Kernel virtual memory layout:
   * 0xf8000000..0x00000000  : kasan shadow mem
   * 0xf7fd0000..0xf8000000  : fixmap
   * 0xf7800000..0xf7c00000  : highmem PTEs
   * 0xf6f36000..0xf7800000  : early ioremap
   * 0xc9000000..0xf6f36000  : vmalloc & ioremap
SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
mpic: Setting up MPIC " MPIC 1   " version 1.2 at 80040000, max 1 CPUs
mpic: ISU size: 64, shift: 6, mask: 3f
mpic: Initializing for 64 sources
GMT Delta read from XPRAM: 0 minutes, DST: on
clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x171024e7e0, max_idle_ns: 440795205315 ns
clocksource: timebase mult[a000000] shift[24] registered
==================================================================
BUG: KASAN: stack-out-of-bounds in memchr+0x24/0x74
Read of size 1 at addr c0ecdd40 by task swapper/0

...

Christophe
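
The ordering argument above can also be checked mechanically on a captured console log. The following is a minimal sketch that assumes the log contains the 'KASAN init done' marker printed by this particular build and at least one 'BUG: KASAN:' line; both marker strings and the function names are assumptions, and the sample log is abbreviated from the one above.

```python
# Sketch: check that the first KASAN report in a boot log appears after the
# line announcing that shadow setup finished. Marker strings are assumptions
# about this kernel build.
def first_line_index(lines, marker):
    """Index of the first line containing marker, or None."""
    for i, line in enumerate(lines):
        if marker in line:
            return i
    return None


def report_is_after_init(log_text, init_marker="KASAN init done",
                         report_marker="BUG: KASAN:"):
    """True/False for the ordering, None if either marker is missing."""
    lines = log_text.splitlines()
    init_at = first_line_index(lines, init_marker)
    report_at = first_line_index(lines, report_marker)
    if init_at is None or report_at is None:
        return None
    return report_at > init_at


boot_log = """\
Linux version 5.0.0-rc7+
KASAN init done
printk: bootconsole [udbg0] enabled
clocksource: timebase mult[a000000] shift[24] registered
BUG: KASAN: stack-out-of-bounds in memchr+0x24/0x74
"""
print(report_is_after_init(boot_log))
```

This only compares the order in which lines were printed, so it complements rather than replaces checking the shadow contents directly.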



* Re: BUG: KASAN: stack-out-of-bounds
  2019-02-27  8:34 ` Dmitry Vyukov
@ 2019-02-27 12:35   ` Christophe Leroy
  2019-02-27 13:07     ` Dmitry Vyukov
  0 siblings, 1 reply; 13+ messages in thread
From: Christophe Leroy @ 2019-02-27 12:35 UTC
  To: Dmitry Vyukov
  Cc: Andrey Ryabinin, Alexander Potapenko, Daniel Axtens, Linux-MM,
	linuxppc-dev, kasan-dev



On 02/27/2019 08:34 AM, Dmitry Vyukov wrote:
> On Wed, Feb 27, 2019 at 9:25 AM Christophe Leroy
> <christophe.leroy@c-s.fr> wrote:
>>
>> With version v8 of the series implementing KASAN on 32-bit powerpc
>> (https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=94309),
>> I'm now able to activate KASAN on a mac99 in QEMU.
>>
>> Then I get the following reports at startup. Which of the two reports I
>> get seems to depend on the option used to build the kernel, but for a
>> given kernel I always get the same report.
>>
>> Is that a real bug, in which case how could I spot it ? Or is it
>> something wrong in my implementation of KASAN ?
> 
> What is the state of your source tree?
> Please pass output through some symbolization script, function offsets
> are not too useful.
> There was some in scripts/ dir IIRC, but here is another one (though,
> never tested on powerpc):
> https://github.com/google/sanitizers/blob/master/address-sanitizer/tools/kasan_symbolize.py

I get the following. It doesn't seem very interesting, does it?

==================================================================
BUG: KASAN: stack-out-of-bounds in[<        none        >] memchr+0x24/0x74 lib/string.c:958
Read of size 1 at addr c0ecdd40 by task swapper/0

CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1142
Call Trace:
[c0e9dca0] [c01c42c0] print_address_description+0x64/0x2bc (unreliable)
[c0e9dcd0] [c01c46a4] kasan_report+0xfc/0x180
[c0e9dd10] [c0895150] memchr+0x24/0x74
[c0e9dd30] [c00a9e58] msg_print_text+0x124/0x574
[c0e9dde0] [c00ab730] console_unlock+0x114/0x4f8
[c0e9de40] [c00adc80] vprintk_emit+0x188/0x1c4
[c0e9de80] [c00ae3e4] printk+0xa8/0xcc
[c0e9df20] [c0c27e44] early_irq_init+0x38/0x108
[c0e9df50] [c0c15434] start_kernel+0x310/0x488
[c0e9dff0] [00003484] 0x3484

The buggy address belongs to the variable:
[<        none        >] __log_buf+0xec0/0x4020 
arch/powerpc/kernel/head_32.S:?
The buggy address belongs to the page:
page:c6eac9a0 count:1 mapcount:0 mapping:00000000 index:0x0
flags: 0x1000(reserved)
raw: 00001000 c6eac9a4 c6eac9a4 00000000 00000000 00000000 ffffffff 00000001
page dumped because: kasan: bad access detected

Memory state around the buggy address:
  c0ecdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  c0ecdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 >c0ecdd00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
                                    ^
  c0ecdd80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
  c0ecde00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
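For anyone reading the "Memory state" dump in the report above, here is a minimal Python sketch of how the faulting address lines up with a shadow byte. It assumes the usual generic KASAN layout (one shadow byte per 8 bytes of memory, 16 shadow bytes per dump row) and the stack redzone markers defined in mm/kasan/kasan.h. The helper names are purely illustrative and are not kernel APIs.

```python
# Illustrative helper, not kernel code: locate the shadow byte for an address
# inside one row of a KASAN "Memory state around the buggy address" dump.

SHADOW_SCALE_SHIFT = 3          # assumption: 1 shadow byte covers 8 bytes
BYTES_PER_ROW = 16              # shadow bytes printed per dump row

# Stack-related shadow markers (values as defined in mm/kasan/kasan.h).
STACK_MARKERS = {
    0xF1: "stack left redzone",
    0xF2: "stack mid redzone",
    0xF3: "stack right redzone",
}


def shadow_byte_for(addr, row_addr, row_bytes):
    """Return (index, value) of the shadow byte covering addr in a row."""
    span = BYTES_PER_ROW << SHADOW_SCALE_SHIFT   # memory covered by one row
    if not row_addr <= addr < row_addr + span:
        raise ValueError("address is not covered by this row")
    index = (addr - row_addr) >> SHADOW_SCALE_SHIFT
    return index, row_bytes[index]


def describe(value):
    """Give a human-readable meaning for a shadow byte value."""
    if value == 0:
        return "fully accessible"
    if 1 <= value <= 7:
        return "first %d bytes accessible" % value
    return STACK_MARKERS.get(value, "other poison value 0x%02x" % value)


# Row marked with '>' in the report above.
row = bytes.fromhex("00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00")
index, value = shadow_byte_for(0xC0ECDD40, 0xC0ECDD00, row)
print(index, hex(value), describe(value))
```

The byte found this way is the one the caret points at, which is why the report is classified as stack-out-of-bounds even though the address belongs to __log_buf.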


Christophe

> 
> 
> 
>> I checked that after kasan_init(), the entire shadow memory is full of 0
>> only.
>>
>> I also made a try with the strong STACK_PROTECTOR compiled in, but no
>> difference and nothing detected by the stack protector.
>>
>> ==================================================================
>> BUG: KASAN: stack-out-of-bounds in memchr+0x24/0x74
>> Read of size 1 at addr c0ecdd40 by task swapper/0
>>
>> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1133
>> Call Trace:
>> [c0e9dca0] [c01c42a0] print_address_description+0x64/0x2bc (unreliable)
>> [c0e9dcd0] [c01c4684] kasan_report+0xfc/0x180
>> [c0e9dd10] [c089579c] memchr+0x24/0x74
>> [c0e9dd30] [c00a9e38] msg_print_text+0x124/0x574
>> [c0e9dde0] [c00ab710] console_unlock+0x114/0x4f8
>> [c0e9de40] [c00adc60] vprintk_emit+0x188/0x1c4
>> --- interrupt: c0e9df00 at 0x400f330
>>       LR = init_stack+0x1f00/0x2000
>> [c0e9de80] [c00ae3c4] printk+0xa8/0xcc (unreliable)
>> [c0e9df20] [c0c28e44] early_irq_init+0x38/0x108
>> [c0e9df50] [c0c16434] start_kernel+0x310/0x488
>> [c0e9dff0] [00003484] 0x3484
>>
>> The buggy address belongs to the variable:
>>    __log_buf+0xec0/0x4020
>> The buggy address belongs to the page:
>> page:c6eac9a0 count:1 mapcount:0 mapping:00000000 index:0x0
>> flags: 0x1000(reserved)
>> raw: 00001000 c6eac9a4 c6eac9a4 00000000 00000000 00000000 ffffffff 00000001
>> page dumped because: kasan: bad access detected
>>
>> Memory state around the buggy address:
>>    c0ecdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>    c0ecdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>   >c0ecdd00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
>>                                      ^
>>    c0ecdd80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
>>    c0ecde00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> ==================================================================
>>
>> ==================================================================
>> BUG: KASAN: stack-out-of-bounds in pmac_nvram_init+0x1ec/0x600
>> Read of size 1 at addr f6f37de0 by task swapper/0
>>
>> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1134
>> Call Trace:
>> [c0ff7d60] [c01fe808] print_address_description+0x6c/0x2b0 (unreliable)
>> [c0ff7d90] [c01fe4fc] kasan_report+0x13c/0x1ac
>> [c0ff7dd0] [c0d34324] pmac_nvram_init+0x1ec/0x600
>> [c0ff7ef0] [c0d31148] pmac_setup_arch+0x280/0x308
>> [c0ff7f20] [c0d2c30c] setup_arch+0x250/0x280
>> [c0ff7f50] [c0d26354] start_kernel+0xb8/0x4d8
>> [c0ff7ff0] [00003484] 0x3484
>>
>>
>> Memory state around the buggy address:
>>    f6f37c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>    f6f37d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>   >f6f37d80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
>>                                                  ^
>>    f6f37e00: 00 00 00 00 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2
>>    f6f37e80: 00 00 01 f2 00 00 00 00 00 00 00 00 00 00 00 00
>> ==================================================================
>>
>> ==================================================================
>> BUG: KASAN: stack-out-of-bounds in memchr+0xa0/0xac
>> Read of size 1 at addr c17cdd30 by task swapper/0
>>
>> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1135
>> Call Trace:
>> [c179dc90] [c032fe28] print_address_description+0x64/0x2bc (unreliable)
>> [c179dcc0] [c033020c] kasan_report+0xfc/0x180
>> [c179dd00] [c115ef50] memchr+0xa0/0xac
>> [c179dd20] [c01297f8] msg_print_text+0xc8/0x67c
>> [c179ddd0] [c012bc8c] console_unlock+0x17c/0x818
>> [c179de40] [c012f420] vprintk_emit+0x188/0x1c4
>> --- interrupt: c179df30 at 0x400def0
>>       LR = init_stack+0x1ef0/0x2000
>> [c179de80] [c012fff0] printk+0xa8/0xcc (unreliable)
>> [c179df20] [c150b4b8] early_irq_init+0x38/0x108
>> [c179df50] [c14ef7f8] start_kernel+0x30c/0x530
>> [c179dff0] [00003484] 0x3484
>>
>> The buggy address belongs to the variable:
>>    __log_buf+0xeb0/0x4020
>> The buggy address belongs to the page:
>> page:c6ebe9a0 count:1 mapcount:0 mapping:00000000 index:0x0
>> flags: 0x1000(reserved)
>> raw: 00001000 c6ebe9a4 c6ebe9a4 00000000 00000000 00000000 ffffffff 00000001
>> page dumped because: kasan: bad access detected
>>
>> Memory state around the buggy address:
>>    c17cdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>    c17cdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>   >c17cdd00: 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00 f3 f3
>>                                ^
>>    c17cdd80: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>    c17cde00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> ==================================================================
>>
>> ==================================================================
>> BUG: KASAN: stack-out-of-bounds in pmac_nvram_init+0x228/0xae0
>> Read of size 1 at addr f6f37dd0 by task swapper/0
>>
>> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1136
>> Call Trace:
>> [c1c37d50] [c03f7e88] print_address_description+0x6c/0x2b0 (unreliable)
>> [c1c37d80] [c03f7bd4] kasan_report+0x10c/0x16c
>> [c1c37dc0] [c19879b4] pmac_nvram_init+0x228/0xae0
>> [c1c37ef0] [c19826bc] pmac_setup_arch+0x578/0x6a8
>> [c1c37f20] [c19792bc] setup_arch+0x5f4/0x620
>> [c1c37f50] [c196f898] start_kernel+0xb8/0x588
>> [c1c37ff0] [00003484] 0x3484
>>
>>
>> Memory state around the buggy address:
>>    f6f37c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>    f6f37d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>   >f6f37d80: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00
>>                                            ^
>>    f6f37e00: 01 f2 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2 00 00
>>    f6f37e80: 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00
>> ==================================================================
>>
>> ==================================================================
>> BUG: KASAN: stack-out-of-bounds in pmac_nvram_init+0x1ec/0x5ec
>> Read of size 1 at addr f6f37de0 by task swapper/0
>>
>> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1137
>> Call Trace:
>> [c0fb7d60] [c01f8184] print_address_description+0x6c/0x2b0 (unreliable)
>> [c0fb7d90] [c01f7ed0] kasan_report+0x10c/0x16c
>> [c0fb7dd0] [c0d1dfe8] pmac_nvram_init+0x1ec/0x5ec
>> [c0fb7ef0] [c0d1ae90] pmac_setup_arch+0x280/0x308
>> [c0fb7f20] [c0d16138] setup_arch+0x250/0x280
>> [c0fb7f50] [c0d1032c] start_kernel+0xb8/0x4a4
>> [c0fb7ff0] [00003484] 0x3484
>>
>>
>> Memory state around the buggy address:
>>    f6f37c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>    f6f37d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>   >f6f37d80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
>>                                                  ^
>>    f6f37e00: 00 00 01 f2 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2
>>    f6f37e80: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00
>> ==================================================================
>>
>> Thanks
>> Christophe



* Re: BUG: KASAN: stack-out-of-bounds
  2019-02-27 12:35   ` Christophe Leroy
@ 2019-02-27 13:07     ` Dmitry Vyukov
  0 siblings, 0 replies; 13+ messages in thread
From: Dmitry Vyukov @ 2019-02-27 13:07 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Andrey Ryabinin, Alexander Potapenko, Daniel Axtens, Linux-MM,
	linuxppc-dev, kasan-dev

On Wed, Feb 27, 2019 at 1:35 PM Christophe Leroy
<christophe.leroy@c-s.fr> wrote:
>
>
>
> On 02/27/2019 08:34 AM, Dmitry Vyukov wrote:
> > On Wed, Feb 27, 2019 at 9:25 AM Christophe Leroy
> > <christophe.leroy@c-s.fr> wrote:
> >>
> >> With version v8 of the series implementing KASAN on 32 bits powerpc
> >> (https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=94309),
> >> I'm now able to activate KASAN on a mac99 is QEMU.
> >>
> >> Then I get the following reports at startup. Which of the two reports I
> >> get seems to depend on the option used to build the kernel, but for a
> >> given kernel I always get the same report.
> >>
> >> Is that a real bug, in which case how could I spot it ? Or is it
> >> something wrong in my implementation of KASAN ?
> >
> > What is the state of your source tree?
> > Please pass output through some symbolization script, function offsets
> > are not too useful.
> > There was some in scripts/ dir IIRC, but here is another one (though,
> > never tested on powerpc):
> > https://github.com/google/sanitizers/blob/master/address-sanitizer/tools/kasan_symbolize.py
>
> I get the following. It doesn't seem much interesting, does it ?


Yes, it does not seem to work for powerpc32.
Then please pass addresses through addr2line -fi.
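A minimal sketch of that workflow, assuming a vmlinux built with debug info for the same kernel and the call-trace layout shown in the reports in this thread. The helper names, the vmlinux path and the tool name are placeholders, not something taken from this thread.

```python
import re
import subprocess

# Matches powerpc call-trace lines such as:
#   [c0e9dd10] [c0895150] memchr+0x24/0x74
# group 1 is the code address in the second pair of brackets.
FRAME_RE = re.compile(r"^\[[0-9a-f]{8}\] \[([0-9a-f]{8})\] \S+", re.MULTILINE)


def trace_addresses(report):
    """Extract code addresses from the 'Call Trace' lines of a report."""
    return ["0x" + m.group(1) for m in FRAME_RE.finditer(report)]


def addr2line_command(addresses, vmlinux="vmlinux", tool="addr2line"):
    """Build an addr2line command line: -f prints function names,
    -i expands inlined callers, -e selects the binary with debug info."""
    return [tool, "-f", "-i", "-e", vmlinux] + list(addresses)


def symbolize(report, vmlinux="vmlinux", tool="addr2line"):
    """Run addr2line on every frame of the report and return its output."""
    cmd = addr2line_command(trace_addresses(report), vmlinux, tool)
    return subprocess.run(cmd, check=True, capture_output=True,
                          text=True).stdout


sample = """\
[c0e9dd10] [c0895150] memchr+0x24/0x74
[c0e9dd30] [c00a9e58] msg_print_text+0x124/0x574
[c0e9dff0] [00003484] 0x3484
"""
print(addr2line_command(trace_addresses(sample)))
```

For a cross-compiled kernel, the `tool` argument would presumably need to name the addr2line binary of the cross toolchain; its exact name depends on how that toolchain was installed.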



> ==================================================================
> BUG: KASAN: stack-out-of-bounds in[<        none        >]
> memchr+0x24/0x74 lib/string.c:958
> Read of size 1 at addr c0ecdd40 by task swapper/0
>
> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1142
> Call Trace:
> [c0e9dca0] [c01c42c0] print_address_description+0x64/0x2bc (unreliable)
> [c0e9dcd0] [c01c46a4] kasan_report+0xfc/0x180
> [c0e9dd10] [c0895150] memchr+0x24/0x74
> [c0e9dd30] [c00a9e58] msg_print_text+0x124/0x574
> [c0e9dde0] [c00ab730] console_unlock+0x114/0x4f8
> [c0e9de40] [c00adc80] vprintk_emit+0x188/0x1c4
> [c0e9de80] [c00ae3e4] printk+0xa8/0xcc
> [c0e9df20] [c0c27e44] early_irq_init+0x38/0x108
> [c0e9df50] [c0c15434] start_kernel+0x310/0x488
> [c0e9dff0] [00003484] 0x3484
>
> The buggy address belongs to the variable:
> [<        none        >] __log_buf+0xec0/0x4020
> arch/powerpc/kernel/head_32.S:?
> The buggy address belongs to the page:
> page:c6eac9a0 count:1 mapcount:0 mapping:00000000 index:0x0
> flags: 0x1000(reserved)
> raw: 00001000 c6eac9a4 c6eac9a4 00000000 00000000 00000000 ffffffff 00000001
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
>   c0ecdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   c0ecdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  >c0ecdd00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
>                                     ^
>   c0ecdd80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
>   c0ecde00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ==================================================================
>
>
> Christophe
>
> >
> >
> >
> >> I checked that after kasan_init(), the entire shadow memory is full of 0
> >> only.
> >>
> >> I also made a try with the strong STACK_PROTECTOR compiled in, but no
> >> difference and nothing detected by the stack protector.
> >>
> >> ==================================================================
> >> BUG: KASAN: stack-out-of-bounds in memchr+0x24/0x74
> >> Read of size 1 at addr c0ecdd40 by task swapper/0
> >>
> >> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1133
> >> Call Trace:
> >> [c0e9dca0] [c01c42a0] print_address_description+0x64/0x2bc (unreliable)
> >> [c0e9dcd0] [c01c4684] kasan_report+0xfc/0x180
> >> [c0e9dd10] [c089579c] memchr+0x24/0x74
> >> [c0e9dd30] [c00a9e38] msg_print_text+0x124/0x574
> >> [c0e9dde0] [c00ab710] console_unlock+0x114/0x4f8
> >> [c0e9de40] [c00adc60] vprintk_emit+0x188/0x1c4
> >> --- interrupt: c0e9df00 at 0x400f330
> >>       LR = init_stack+0x1f00/0x2000
> >> [c0e9de80] [c00ae3c4] printk+0xa8/0xcc (unreliable)
> >> [c0e9df20] [c0c28e44] early_irq_init+0x38/0x108
> >> [c0e9df50] [c0c16434] start_kernel+0x310/0x488
> >> [c0e9dff0] [00003484] 0x3484
> >>
> >> The buggy address belongs to the variable:
> >>    __log_buf+0xec0/0x4020
> >> The buggy address belongs to the page:
> >> page:c6eac9a0 count:1 mapcount:0 mapping:00000000 index:0x0
> >> flags: 0x1000(reserved)
> >> raw: 00001000 c6eac9a4 c6eac9a4 00000000 00000000 00000000 ffffffff 00000001
> >> page dumped because: kasan: bad access detected
> >>
> >> Memory state around the buggy address:
> >>    c0ecdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>    c0ecdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>   >c0ecdd00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
> >>                                      ^
> >>    c0ecdd80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
> >>    c0ecde00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >> ==================================================================
> >>
> >> ==================================================================
> >> BUG: KASAN: stack-out-of-bounds in pmac_nvram_init+0x1ec/0x600
> >> Read of size 1 at addr f6f37de0 by task swapper/0
> >>
> >> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1134
> >> Call Trace:
> >> [c0ff7d60] [c01fe808] print_address_description+0x6c/0x2b0 (unreliable)
> >> [c0ff7d90] [c01fe4fc] kasan_report+0x13c/0x1ac
> >> [c0ff7dd0] [c0d34324] pmac_nvram_init+0x1ec/0x600
> >> [c0ff7ef0] [c0d31148] pmac_setup_arch+0x280/0x308
> >> [c0ff7f20] [c0d2c30c] setup_arch+0x250/0x280
> >> [c0ff7f50] [c0d26354] start_kernel+0xb8/0x4d8
> >> [c0ff7ff0] [00003484] 0x3484
> >>
> >>
> >> Memory state around the buggy address:
> >>    f6f37c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>    f6f37d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>   >f6f37d80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
> >>                                                  ^
> >>    f6f37e00: 00 00 00 00 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2
> >>    f6f37e80: 00 00 01 f2 00 00 00 00 00 00 00 00 00 00 00 00
> >> ==================================================================
> >>
> >> ==================================================================
> >> BUG: KASAN: stack-out-of-bounds in memchr+0xa0/0xac
> >> Read of size 1 at addr c17cdd30 by task swapper/0
> >>
> >> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1135
> >> Call Trace:
> >> [c179dc90] [c032fe28] print_address_description+0x64/0x2bc (unreliable)
> >> [c179dcc0] [c033020c] kasan_report+0xfc/0x180
> >> [c179dd00] [c115ef50] memchr+0xa0/0xac
> >> [c179dd20] [c01297f8] msg_print_text+0xc8/0x67c
> >> [c179ddd0] [c012bc8c] console_unlock+0x17c/0x818
> >> [c179de40] [c012f420] vprintk_emit+0x188/0x1c4
> >> --- interrupt: c179df30 at 0x400def0
> >>       LR = init_stack+0x1ef0/0x2000
> >> [c179de80] [c012fff0] printk+0xa8/0xcc (unreliable)
> >> [c179df20] [c150b4b8] early_irq_init+0x38/0x108
> >> [c179df50] [c14ef7f8] start_kernel+0x30c/0x530
> >> [c179dff0] [00003484] 0x3484
> >>
> >> The buggy address belongs to the variable:
> >>    __log_buf+0xeb0/0x4020
> >> The buggy address belongs to the page:
> >> page:c6ebe9a0 count:1 mapcount:0 mapping:00000000 index:0x0
> >> flags: 0x1000(reserved)
> >> raw: 00001000 c6ebe9a4 c6ebe9a4 00000000 00000000 00000000 ffffffff 00000001
> >> page dumped because: kasan: bad access detected
> >>
> >> Memory state around the buggy address:
> >>    c17cdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>    c17cdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>   >c17cdd00: 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00 f3 f3
> >>                                ^
> >>    c17cdd80: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>    c17cde00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >> ==================================================================
> >>
> >> ==================================================================
> >> BUG: KASAN: stack-out-of-bounds in pmac_nvram_init+0x228/0xae0
> >> Read of size 1 at addr f6f37dd0 by task swapper/0
> >>
> >> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1136
> >> Call Trace:
> >> [c1c37d50] [c03f7e88] print_address_description+0x6c/0x2b0 (unreliable)
> >> [c1c37d80] [c03f7bd4] kasan_report+0x10c/0x16c
> >> [c1c37dc0] [c19879b4] pmac_nvram_init+0x228/0xae0
> >> [c1c37ef0] [c19826bc] pmac_setup_arch+0x578/0x6a8
> >> [c1c37f20] [c19792bc] setup_arch+0x5f4/0x620
> >> [c1c37f50] [c196f898] start_kernel+0xb8/0x588
> >> [c1c37ff0] [00003484] 0x3484
> >>
> >>
> >> Memory state around the buggy address:
> >>    f6f37c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>    f6f37d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>   >f6f37d80: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00
> >>                                            ^
> >>    f6f37e00: 01 f2 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2 00 00
> >>    f6f37e80: 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00
> >> ==================================================================
> >>
> >> ==================================================================
> >> BUG: KASAN: stack-out-of-bounds in pmac_nvram_init+0x1ec/0x5ec
> >> Read of size 1 at addr f6f37de0 by task swapper/0
> >>
> >> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1137
> >> Call Trace:
> >> [c0fb7d60] [c01f8184] print_address_description+0x6c/0x2b0 (unreliable)
> >> [c0fb7d90] [c01f7ed0] kasan_report+0x10c/0x16c
> >> [c0fb7dd0] [c0d1dfe8] pmac_nvram_init+0x1ec/0x5ec
> >> [c0fb7ef0] [c0d1ae90] pmac_setup_arch+0x280/0x308
> >> [c0fb7f20] [c0d16138] setup_arch+0x250/0x280
> >> [c0fb7f50] [c0d1032c] start_kernel+0xb8/0x4a4
> >> [c0fb7ff0] [00003484] 0x3484
> >>
> >>
> >> Memory state around the buggy address:
> >>    f6f37c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>    f6f37d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>   >f6f37d80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
> >>                                                  ^
> >>    f6f37e00: 00 00 01 f2 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2
> >>    f6f37e80: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00
> >> ==================================================================
> >>
> >> Thanks
> >> Christophe



* Re: BUG: KASAN: stack-out-of-bounds
  2019-02-27  9:19 ` Andrey Ryabinin
  2019-02-27  9:25   ` Dmitry Vyukov
@ 2019-02-27 13:11   ` Christophe Leroy
  2019-02-28  9:22     ` Andrey Ryabinin
  1 sibling, 1 reply; 13+ messages in thread
From: Christophe Leroy @ 2019-02-27 13:11 UTC (permalink / raw)
  To: Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov
  Cc: Daniel Axtens, linux-mm, linuxppc-dev, kasan-dev



Le 27/02/2019 à 10:19, Andrey Ryabinin a écrit :
> 
> 
> On 2/27/19 11:25 AM, Christophe Leroy wrote:
>> With version v8 of the series implementing KASAN on 32 bits powerpc (https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=94309), I'm now able to activate KASAN on a mac99 is QEMU.
>>
>> Then I get the following reports at startup. Which of the two reports I get seems to depend on the option used to build the kernel, but for a given kernel I always get the same report.
>>
>> Is that a real bug, in which case how could I spot it ? Or is it something wrong in my implementation of KASAN ?
>>
>> I checked that after kasan_init(), the entire shadow memory is full of 0 only.
>>
>> I also made a try with the strong STACK_PROTECTOR compiled in, but no difference and nothing detected by the stack protector.
>>
>> ==================================================================
>> BUG: KASAN: stack-out-of-bounds in memchr+0x24/0x74
>> Read of size 1 at addr c0ecdd40 by task swapper/0
>>
>> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1133
>> Call Trace:
>> [c0e9dca0] [c01c42a0] print_address_description+0x64/0x2bc (unreliable)
>> [c0e9dcd0] [c01c4684] kasan_report+0xfc/0x180
>> [c0e9dd10] [c089579c] memchr+0x24/0x74
>> [c0e9dd30] [c00a9e38] msg_print_text+0x124/0x574
>> [c0e9dde0] [c00ab710] console_unlock+0x114/0x4f8
>> [c0e9de40] [c00adc60] vprintk_emit+0x188/0x1c4
>> --- interrupt: c0e9df00 at 0x400f330
>>      LR = init_stack+0x1f00/0x2000
>> [c0e9de80] [c00ae3c4] printk+0xa8/0xcc (unreliable)
>> [c0e9df20] [c0c28e44] early_irq_init+0x38/0x108
>> [c0e9df50] [c0c16434] start_kernel+0x310/0x488
>> [c0e9dff0] [00003484] 0x3484
>>
>> The buggy address belongs to the variable:
>>   __log_buf+0xec0/0x4020
>> The buggy address belongs to the page:
>> page:c6eac9a0 count:1 mapcount:0 mapping:00000000 index:0x0
>> flags: 0x1000(reserved)
>> raw: 00001000 c6eac9a4 c6eac9a4 00000000 00000000 00000000 ffffffff 00000001
>> page dumped because: kasan: bad access detected
>>
>> Memory state around the buggy address:
>>   c0ecdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>   c0ecdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> c0ecdd00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
>>                                     ^
>>   c0ecdd80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
>>   c0ecde00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> ==================================================================
>>
> 
> This one doesn't look good. Notice that it says stack-out-of-bounds, but at the same time there is
> 	"The buggy address belongs to the variable:  __log_buf+0xec0/0x4020"
>   which is printed by following code:
> 	if (kernel_or_module_addr(addr) && !init_task_stack_addr(addr)) {
> 		pr_err("The buggy address belongs to the variable:\n");
> 		pr_err(" %pS\n", addr);
> 	}
> 
> So the stack unrelated address got stack-related poisoning. This could be a stack overflow, did you increase THREAD_SHIFT?
> KASAN with stack instrumentation significantly increases stack usage.
> 

I get the above with THREAD_SHIFT set to 13 (the default value).
If I increase it to 14, I get the following instead. In that case the
problem shows up much earlier in the boot process (but still after the
final KASAN shadow setup).

==================================================================
BUG: KASAN: stack-out-of-bounds in pmac_nvram_init+0x1f8/0x5d0
Read of size 1 at addr f6f37de0 by task swapper/0

CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1143
Call Trace:
[c0e9fd60] [c01c43c0] print_address_description+0x164/0x2bc (unreliable)
[c0e9fd90] [c01c46a4] kasan_report+0xfc/0x180
[c0e9fdd0] [c0c226d4] pmac_nvram_init+0x1f8/0x5d0
[c0e9fef0] [c0c1f73c] pmac_setup_arch+0x298/0x314
[c0e9ff20] [c0c1ac40] setup_arch+0x250/0x268
[c0e9ff50] [c0c151dc] start_kernel+0xb8/0x488
[c0e9fff0] [00003484] 0x3484


Memory state around the buggy address:
  f6f37c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  f6f37d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 >f6f37d80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
                                                ^
  f6f37e00: 00 00 01 f4 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2
  f6f37e80: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00
==================================================================

Christophe



* Re: BUG: KASAN: stack-out-of-bounds
  2019-02-27 13:11   ` Christophe Leroy
@ 2019-02-28  9:22     ` Andrey Ryabinin
  2019-02-28  9:27       ` Dmitry Vyukov
  0 siblings, 1 reply; 13+ messages in thread
From: Andrey Ryabinin @ 2019-02-28  9:22 UTC (permalink / raw)
  To: Christophe Leroy, Alexander Potapenko, Dmitry Vyukov
  Cc: Daniel Axtens, linux-mm, linuxppc-dev, kasan-dev



On 2/27/19 4:11 PM, Christophe Leroy wrote:
> 
> 
> Le 27/02/2019 à 10:19, Andrey Ryabinin a écrit :
>>
>>
>> On 2/27/19 11:25 AM, Christophe Leroy wrote:
>>> With version v8 of the series implementing KASAN on 32 bits powerpc (https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=94309), I'm now able to activate KASAN on a mac99 is QEMU.
>>>
>>> Then I get the following reports at startup. Which of the two reports I get seems to depend on the option used to build the kernel, but for a given kernel I always get the same report.
>>>
>>> Is that a real bug, in which case how could I spot it ? Or is it something wrong in my implementation of KASAN ?
>>>
>>> I checked that after kasan_init(), the entire shadow memory is full of 0 only.
>>>
>>> I also made a try with the strong STACK_PROTECTOR compiled in, but no difference and nothing detected by the stack protector.
>>>
>>> ==================================================================
>>> BUG: KASAN: stack-out-of-bounds in memchr+0x24/0x74
>>> Read of size 1 at addr c0ecdd40 by task swapper/0
>>>
>>> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1133
>>> Call Trace:
>>> [c0e9dca0] [c01c42a0] print_address_description+0x64/0x2bc (unreliable)
>>> [c0e9dcd0] [c01c4684] kasan_report+0xfc/0x180
>>> [c0e9dd10] [c089579c] memchr+0x24/0x74
>>> [c0e9dd30] [c00a9e38] msg_print_text+0x124/0x574
>>> [c0e9dde0] [c00ab710] console_unlock+0x114/0x4f8
>>> [c0e9de40] [c00adc60] vprintk_emit+0x188/0x1c4
>>> --- interrupt: c0e9df00 at 0x400f330
>>>      LR = init_stack+0x1f00/0x2000
>>> [c0e9de80] [c00ae3c4] printk+0xa8/0xcc (unreliable)
>>> [c0e9df20] [c0c28e44] early_irq_init+0x38/0x108
>>> [c0e9df50] [c0c16434] start_kernel+0x310/0x488
>>> [c0e9dff0] [00003484] 0x3484
>>>
>>> The buggy address belongs to the variable:
>>>   __log_buf+0xec0/0x4020
>>> The buggy address belongs to the page:
>>> page:c6eac9a0 count:1 mapcount:0 mapping:00000000 index:0x0
>>> flags: 0x1000(reserved)
>>> raw: 00001000 c6eac9a4 c6eac9a4 00000000 00000000 00000000 ffffffff 00000001
>>> page dumped because: kasan: bad access detected
>>>
>>> Memory state around the buggy address:
>>>   c0ecdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>   c0ecdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> c0ecdd00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
>>>                                     ^
>>>   c0ecdd80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
>>>   c0ecde00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> ==================================================================
>>>
>>
>> This one doesn't look good. Notice that it says stack-out-of-bounds, but at the same time there is
>>     "The buggy address belongs to the variable:  __log_buf+0xec0/0x4020"
>>   which is printed by following code:
>>     if (kernel_or_module_addr(addr) && !init_task_stack_addr(addr)) {
>>         pr_err("The buggy address belongs to the variable:\n");
>>         pr_err(" %pS\n", addr);
>>     }
>>
>> So the stack unrelated address got stack-related poisoning. This could be a stack overflow, did you increase THREAD_SHIFT?
>> KASAN with stack instrumentation significantly increases stack usage.
>>
> 
> I get the above with THREAD_SHIFT set to 13 (default value).
> If increasing it to 14, I get the following instead. That means that in that case the problem arises a lot earlier in the boot process (but still after the final kasan shadow setup).
> 

We usually use 15 (with 4k pages), but I think 14 should be enough for a clean boot.
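To put those settings in numbers: the kernel stack size is 1 << THREAD_SHIFT, so 13 gives 8 KiB (matching the init_stack+0x1f00/0x2000 in the first report), 14 gives 16 KiB and 15 gives 32 KiB. The sketch below uses illustrative helper names and additionally assumes the init stack is aligned to its own size. Under that assumption it checks whether a reported address can lie on the stack at all, using a frame address and the faulting address from the THREAD_SHIFT=14 report quoted in this message.

```python
# Illustrative arithmetic only; the helper names are not kernel APIs.

def thread_size(thread_shift):
    """Kernel stack size in bytes for a given THREAD_SHIFT."""
    return 1 << thread_shift


def stack_bounds(stack_pointer, thread_shift):
    """Return (base, top) of the stack holding stack_pointer.

    Assumption: the stack is aligned to its own size, so masking the low
    bits of any pointer into it yields the base address.
    """
    size = thread_size(thread_shift)
    base = stack_pointer & ~(size - 1)
    return base, base + size


def on_stack(addr, stack_pointer, thread_shift):
    """True if addr lies within the stack that contains stack_pointer."""
    base, top = stack_bounds(stack_pointer, thread_shift)
    return base <= addr < top


for shift in (13, 14, 15):
    print("THREAD_SHIFT=%d -> %d KiB" % (shift, thread_size(shift) // 1024))

# Frame address and faulting address from the THREAD_SHIFT=14 report.
base, top = stack_bounds(0xC0E9FD60, 14)
print(hex(base), hex(top), on_stack(0xF6F37DE0, 0xC0E9FD60, 14))
```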

> ==================================================================
> BUG: KASAN: stack-out-of-bounds in pmac_nvram_init+0x1f8/0x5d0
> Read of size 1 at addr f6f37de0 by task swapper/0
> 
> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1143
> Call Trace:
> [c0e9fd60] [c01c43c0] print_address_description+0x164/0x2bc (unreliable)
> [c0e9fd90] [c01c46a4] kasan_report+0xfc/0x180
> [c0e9fdd0] [c0c226d4] pmac_nvram_init+0x1f8/0x5d0
> [c0e9fef0] [c0c1f73c] pmac_setup_arch+0x298/0x314
> [c0e9ff20] [c0c1ac40] setup_arch+0x250/0x268
> [c0e9ff50] [c0c151dc] start_kernel+0xb8/0x488
> [c0e9fff0] [00003484] 0x3484
> 
> 
> Memory state around the buggy address:
>  f6f37c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  f6f37d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>f6f37d80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
>                                                ^
>  f6f37e00: 00 00 01 f4 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2
>  f6f37e80: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00
> ==================================================================

Powerpc's show_stack() prints stack addresses, so we know that the stack is somewhere near address 0xc0e9f...
f6f37de0 is definitely not a stack address, and it is too far away to be explained by a stack overflow.
So it looks like the shadow for the stack, kasan_mem_to_shadow(0xc0e9f...), and the shadow for the address in the report, kasan_mem_to_shadow(0xf6f37de0),
point to the same physical page.
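A rough sketch of that reasoning, assuming the generic KASAN mapping shadow = (addr >> 3) + KASAN_SHADOW_OFFSET and 4 KiB pages. The offset value below is a made-up placeholder (the real one comes from the powerpc series), and 0xc0e9fdd0 is simply one of the stack frame addresses printed in the report. The point is that the two shadow addresses differ by a fixed amount whatever the offset is, so they never share a virtual shadow page. Stack poison showing up at the reported address would then be consistent with both shadow pages being backed by the same physical page.

```python
# Illustrative model of the generic KASAN address-to-shadow mapping.

KASAN_SHADOW_SCALE_SHIFT = 3      # one shadow byte per 8 bytes of memory
PAGE_SHIFT = 12                   # assumption: 4 KiB pages
HYPOTHETICAL_SHADOW_OFFSET = 0xE0000000  # placeholder, NOT the real value
MASK32 = 0xFFFFFFFF               # addresses are 32 bits wide here


def mem_to_shadow(addr, offset=HYPOTHETICAL_SHADOW_OFFSET):
    """Model of kasan_mem_to_shadow() for a 32-bit address space."""
    return ((addr >> KASAN_SHADOW_SCALE_SHIFT) + offset) & MASK32


def shadow_page(addr, offset=HYPOTHETICAL_SHADOW_OFFSET):
    """Virtual page number of the shadow byte covering addr."""
    return mem_to_shadow(addr, offset) >> PAGE_SHIFT


stack_addr = 0xC0E9FDD0     # a stack frame address from the report
report_addr = 0xF6F37DE0    # the address KASAN complained about

# The distance between the two shadow bytes does not depend on the offset.
delta = (report_addr >> KASAN_SHADOW_SCALE_SHIFT) - \
        (stack_addr >> KASAN_SHADOW_SCALE_SHIFT)
print("shadow addresses differ by 0x%x bytes" % delta)
print("same virtual shadow page:",
      shadow_page(stack_addr) == shadow_page(report_addr))
```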



* Re: BUG: KASAN: stack-out-of-bounds
  2019-02-28  9:22     ` Andrey Ryabinin
@ 2019-02-28  9:27       ` Dmitry Vyukov
  2019-02-28  9:47         ` Andrey Ryabinin
  0 siblings, 1 reply; 13+ messages in thread
From: Dmitry Vyukov @ 2019-02-28  9:27 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: Christophe Leroy, Alexander Potapenko, Daniel Axtens, Linux-MM,
	linuxppc-dev, kasan-dev

On Thu, Feb 28, 2019 at 10:22 AM Andrey Ryabinin
<aryabinin@virtuozzo.com> wrote:
>
>
>
> On 2/27/19 4:11 PM, Christophe Leroy wrote:
> >
> >
> > Le 27/02/2019 à 10:19, Andrey Ryabinin a écrit :
> >>
> >>
> >> On 2/27/19 11:25 AM, Christophe Leroy wrote:
> >>> With version v8 of the series implementing KASAN on 32 bits powerpc (https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=94309), I'm now able to activate KASAN on a mac99 is QEMU.
> >>>
> >>> Then I get the following reports at startup. Which of the two reports I get seems to depend on the option used to build the kernel, but for a given kernel I always get the same report.
> >>>
> >>> Is that a real bug, in which case how could I spot it ? Or is it something wrong in my implementation of KASAN ?
> >>>
> >>> I checked that after kasan_init(), the entire shadow memory is full of 0 only.
> >>>
> >>> I also made a try with the strong STACK_PROTECTOR compiled in, but no difference and nothing detected by the stack protector.
> >>>
> >>> ==================================================================
> >>> BUG: KASAN: stack-out-of-bounds in memchr+0x24/0x74
> >>> Read of size 1 at addr c0ecdd40 by task swapper/0
> >>>
> >>> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1133
> >>> Call Trace:
> >>> [c0e9dca0] [c01c42a0] print_address_description+0x64/0x2bc (unreliable)
> >>> [c0e9dcd0] [c01c4684] kasan_report+0xfc/0x180
> >>> [c0e9dd10] [c089579c] memchr+0x24/0x74
> >>> [c0e9dd30] [c00a9e38] msg_print_text+0x124/0x574
> >>> [c0e9dde0] [c00ab710] console_unlock+0x114/0x4f8
> >>> [c0e9de40] [c00adc60] vprintk_emit+0x188/0x1c4
> >>> --- interrupt: c0e9df00 at 0x400f330
> >>>      LR = init_stack+0x1f00/0x2000
> >>> [c0e9de80] [c00ae3c4] printk+0xa8/0xcc (unreliable)
> >>> [c0e9df20] [c0c28e44] early_irq_init+0x38/0x108
> >>> [c0e9df50] [c0c16434] start_kernel+0x310/0x488
> >>> [c0e9dff0] [00003484] 0x3484
> >>>
> >>> The buggy address belongs to the variable:
> >>>   __log_buf+0xec0/0x4020
> >>> The buggy address belongs to the page:
> >>> page:c6eac9a0 count:1 mapcount:0 mapping:00000000 index:0x0
> >>> flags: 0x1000(reserved)
> >>> raw: 00001000 c6eac9a4 c6eac9a4 00000000 00000000 00000000 ffffffff 00000001
> >>> page dumped because: kasan: bad access detected
> >>>
> >>> Memory state around the buggy address:
> >>>   c0ecdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>>   c0ecdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>>> c0ecdd00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
> >>>                                     ^
> >>>   c0ecdd80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
> >>>   c0ecde00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>> ==================================================================
> >>>
> >>
> >> This one doesn't look good. Notice that it says stack-out-of-bounds, but at the same time there is
> >>     "The buggy address belongs to the variable:  __log_buf+0xec0/0x4020"
> >>   which is printed by following code:
> >>     if (kernel_or_module_addr(addr) && !init_task_stack_addr(addr)) {
> >>         pr_err("The buggy address belongs to the variable:\n");
> >>         pr_err(" %pS\n", addr);
> >>     }
> >>
> >> So the stack unrelated address got stack-related poisoning. This could be a stack overflow, did you increase THREAD_SHIFT?
> >> KASAN with stack instrumentation significantly increases stack usage.
> >>
> >
> > I get the above with THREAD_SHIFT set to 13 (default value).
> > If increasing it to 14, I get the following instead. That means that in that case the problem arises a lot earlier in the boot process (but still after the final kasan shadow setup).
> >
>
> We usually use 15 (with 4k pages), but I think 14 should be enough for the clean boot.
>
> > ==================================================================
> > BUG: KASAN: stack-out-of-bounds in pmac_nvram_init+0x1f8/0x5d0
> > Read of size 1 at addr f6f37de0 by task swapper/0
> >
> > CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1143
> > Call Trace:
> > [c0e9fd60] [c01c43c0] print_address_description+0x164/0x2bc (unreliable)
> > [c0e9fd90] [c01c46a4] kasan_report+0xfc/0x180
> > [c0e9fdd0] [c0c226d4] pmac_nvram_init+0x1f8/0x5d0
> > [c0e9fef0] [c0c1f73c] pmac_setup_arch+0x298/0x314
> > [c0e9ff20] [c0c1ac40] setup_arch+0x250/0x268
> > [c0e9ff50] [c0c151dc] start_kernel+0xb8/0x488
> > [c0e9fff0] [00003484] 0x3484
> >
> >
> > Memory state around the buggy address:
> >  f6f37c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >  f6f37d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>f6f37d80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
> >                                                ^
> >  f6f37e00: 00 00 01 f4 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2
> >  f6f37e80: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00
> > ==================================================================
>
> Powerpc's show_stack() prints stack addresses, so we know that stack is something near 0xc0e9f... address.
> f6f37de0 is definitely not stack address and it's to far for the stack overflow.
> So it looks like shadow for stack  - kasan_mem_to_shadow(0xc0e9f...) and shadow for address in report - kasan_mem_to_shadow(0xf6f37de0)
> point to the same physical page.

Shouldn't shadow start at 0xf8 for powerpc32? I did some math
yesterday which I think led me to 0xf8.
This allows covering at most 1GB of memory. Do you have more by any chance?
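
The arithmetic behind the 0xf8 figure can be sketched as follows. This is
only an illustration, not code from the series: the helper names are made
up, and it assumes that the shadow region sits at the very top of the
32-bit address space and tracks the 1GB of kernel addresses starting at
0xc0000000. The 8-to-1 scale and the (addr >> 3) + offset formula are the
ones generic KASAN uses.

```c
#include <assert.h>
#include <stdint.h>

/* Generic KASAN tracks 8 bytes of memory with 1 shadow byte. */
#define KASAN_SHADOW_SCALE_SHIFT 3

/* Shadow bytes needed to track covered_bytes of memory. */
static uint32_t sketch_shadow_size(uint32_t covered_bytes)
{
	/* Only whole 8-byte granules make sense here. */
	assert((covered_bytes & ((1u << KASAN_SHADOW_SCALE_SHIFT) - 1)) == 0);
	return covered_bytes >> KASAN_SHADOW_SCALE_SHIFT;
}

/* Assumption: the shadow region ends at the top of the 32-bit space. */
static uint32_t sketch_shadow_start(uint32_t covered_bytes)
{
	return (uint32_t)0 - sketch_shadow_size(covered_bytes);
}

/* Offset chosen so that mem_start maps exactly to shadow_start. */
static uint32_t sketch_shadow_offset(uint32_t mem_start, uint32_t shadow_start)
{
	return shadow_start - (mem_start >> KASAN_SHADOW_SCALE_SHIFT);
}

/* Same formula as the generic kasan_mem_to_shadow(). */
static uint32_t sketch_mem_to_shadow(uint32_t addr, uint32_t offset)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + offset;
}
```

Under these assumptions, sketch_shadow_start(0x40000000) gives 0xf8000000
and the offset is 0xe0000000. The stack frame at 0xc0e9fd60 and the
accessed address 0xf6f37de0 then map to different shadow virtual
addresses (0xf81d3fac and 0xfede6fbc), so they can only share poison if
both shadow addresses are backed by the same physical page.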



* Re: BUG: KASAN: stack-out-of-bounds
  2019-02-28  9:27       ` Dmitry Vyukov
@ 2019-02-28  9:47         ` Andrey Ryabinin
  2019-02-28  9:54           ` Dmitry Vyukov
  2019-02-28 13:41           ` Christophe Leroy
  0 siblings, 2 replies; 13+ messages in thread
From: Andrey Ryabinin @ 2019-02-28  9:47 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Christophe Leroy, Alexander Potapenko, Daniel Axtens, Linux-MM,
	linuxppc-dev, kasan-dev



On 2/28/19 12:27 PM, Dmitry Vyukov wrote:
> On Thu, Feb 28, 2019 at 10:22 AM Andrey Ryabinin
> <aryabinin@virtuozzo.com> wrote:
>>
>>
>>
>> On 2/27/19 4:11 PM, Christophe Leroy wrote:
>>>
>>>
>>> Le 27/02/2019 à 10:19, Andrey Ryabinin a écrit :
>>>>
>>>>
>>>> On 2/27/19 11:25 AM, Christophe Leroy wrote:
>>>>> With version v8 of the series implementing KASAN on 32 bits powerpc (https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=94309), I'm now able to activate KASAN on a mac99 is QEMU.
>>>>>
>>>>> Then I get the following reports at startup. Which of the two reports I get seems to depend on the option used to build the kernel, but for a given kernel I always get the same report.
>>>>>
>>>>> Is that a real bug, in which case how could I spot it ? Or is it something wrong in my implementation of KASAN ?
>>>>>
>>>>> I checked that after kasan_init(), the entire shadow memory is full of 0 only.
>>>>>
>>>>> I also made a try with the strong STACK_PROTECTOR compiled in, but no difference and nothing detected by the stack protector.
>>>>>
>>>>> ==================================================================
>>>>> BUG: KASAN: stack-out-of-bounds in memchr+0x24/0x74
>>>>> Read of size 1 at addr c0ecdd40 by task swapper/0
>>>>>
>>>>> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1133
>>>>> Call Trace:
>>>>> [c0e9dca0] [c01c42a0] print_address_description+0x64/0x2bc (unreliable)
>>>>> [c0e9dcd0] [c01c4684] kasan_report+0xfc/0x180
>>>>> [c0e9dd10] [c089579c] memchr+0x24/0x74
>>>>> [c0e9dd30] [c00a9e38] msg_print_text+0x124/0x574
>>>>> [c0e9dde0] [c00ab710] console_unlock+0x114/0x4f8
>>>>> [c0e9de40] [c00adc60] vprintk_emit+0x188/0x1c4
>>>>> --- interrupt: c0e9df00 at 0x400f330
>>>>>      LR = init_stack+0x1f00/0x2000
>>>>> [c0e9de80] [c00ae3c4] printk+0xa8/0xcc (unreliable)
>>>>> [c0e9df20] [c0c28e44] early_irq_init+0x38/0x108
>>>>> [c0e9df50] [c0c16434] start_kernel+0x310/0x488
>>>>> [c0e9dff0] [00003484] 0x3484
>>>>>
>>>>> The buggy address belongs to the variable:
>>>>>   __log_buf+0xec0/0x4020
>>>>> The buggy address belongs to the page:
>>>>> page:c6eac9a0 count:1 mapcount:0 mapping:00000000 index:0x0
>>>>> flags: 0x1000(reserved)
>>>>> raw: 00001000 c6eac9a4 c6eac9a4 00000000 00000000 00000000 ffffffff 00000001
>>>>> page dumped because: kasan: bad access detected
>>>>>
>>>>> Memory state around the buggy address:
>>>>>   c0ecdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>>>   c0ecdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>>>> c0ecdd00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
>>>>>                                     ^
>>>>>   c0ecdd80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
>>>>>   c0ecde00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>>> ==================================================================
>>>>>
>>>>
>>>> This one doesn't look good. Notice that it says stack-out-of-bounds, but at the same time there is
>>>>     "The buggy address belongs to the variable:  __log_buf+0xec0/0x4020"
>>>>   which is printed by following code:
>>>>     if (kernel_or_module_addr(addr) && !init_task_stack_addr(addr)) {
>>>>         pr_err("The buggy address belongs to the variable:\n");
>>>>         pr_err(" %pS\n", addr);
>>>>     }
>>>>
>>>> So the stack unrelated address got stack-related poisoning. This could be a stack overflow, did you increase THREAD_SHIFT?
>>>> KASAN with stack instrumentation significantly increases stack usage.
>>>>
>>>
>>> I get the above with THREAD_SHIFT set to 13 (default value).
>>> If increasing it to 14, I get the following instead. That means that in that case the problem arises a lot earlier in the boot process (but still after the final kasan shadow setup).
>>>
>>
>> We usually use 15 (with 4k pages), but I think 14 should be enough for the clean boot.
>>
>>> ==================================================================
>>> BUG: KASAN: stack-out-of-bounds in pmac_nvram_init+0x1f8/0x5d0
>>> Read of size 1 at addr f6f37de0 by task swapper/0
>>>
>>> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1143
>>> Call Trace:
>>> [c0e9fd60] [c01c43c0] print_address_description+0x164/0x2bc (unreliable)
>>> [c0e9fd90] [c01c46a4] kasan_report+0xfc/0x180
>>> [c0e9fdd0] [c0c226d4] pmac_nvram_init+0x1f8/0x5d0
>>> [c0e9fef0] [c0c1f73c] pmac_setup_arch+0x298/0x314
>>> [c0e9ff20] [c0c1ac40] setup_arch+0x250/0x268
>>> [c0e9ff50] [c0c151dc] start_kernel+0xb8/0x488
>>> [c0e9fff0] [00003484] 0x3484
>>>
>>>
>>> Memory state around the buggy address:
>>>  f6f37c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>  f6f37d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> f6f37d80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
>>>                                                ^
>>>  f6f37e00: 00 00 01 f4 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2
>>>  f6f37e80: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00
>>> ==================================================================
>>
>> Powerpc's show_stack() prints stack addresses, so we know that stack is something near 0xc0e9f... address.
>> f6f37de0 is definitely not stack address and it's to far for the stack overflow.
>> So it looks like shadow for stack  - kasan_mem_to_shadow(0xc0e9f...) and shadow for address in report - kasan_mem_to_shadow(0xf6f37de0)
>> point to the same physical page.
> 
> Shouldn't shadow start at 0xf8 for powerpc32? I did some math
> yesterday which I think lead me to 0xf8.

Dunno, maybe. How is this relevant? In case you are referring to the 0xf6f* addresses in the report,
these are not shadow addresses, but the accessed addresses.

> This allows to cover at most 1GB of memory. Do you have more by any chance?
> 



* Re: BUG: KASAN: stack-out-of-bounds
  2019-02-28  9:47         ` Andrey Ryabinin
@ 2019-02-28  9:54           ` Dmitry Vyukov
  2019-02-28 13:41           ` Christophe Leroy
  1 sibling, 0 replies; 13+ messages in thread
From: Dmitry Vyukov @ 2019-02-28  9:54 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: Christophe Leroy, Alexander Potapenko, Daniel Axtens, Linux-MM,
	linuxppc-dev, kasan-dev

On Thu, Feb 28, 2019 at 10:46 AM Andrey Ryabinin
<aryabinin@virtuozzo.com> wrote:
>
>
>
> On 2/28/19 12:27 PM, Dmitry Vyukov wrote:
> > On Thu, Feb 28, 2019 at 10:22 AM Andrey Ryabinin
> > <aryabinin@virtuozzo.com> wrote:
> >>
> >>
> >>
> >> On 2/27/19 4:11 PM, Christophe Leroy wrote:
> >>>
> >>>
> >>> Le 27/02/2019 à 10:19, Andrey Ryabinin a écrit :
> >>>>
> >>>>
> >>>> On 2/27/19 11:25 AM, Christophe Leroy wrote:
> >>>>> With version v8 of the series implementing KASAN on 32 bits powerpc (https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=94309), I'm now able to activate KASAN on a mac99 is QEMU.
> >>>>>
> >>>>> Then I get the following reports at startup. Which of the two reports I get seems to depend on the option used to build the kernel, but for a given kernel I always get the same report.
> >>>>>
> >>>>> Is that a real bug, in which case how could I spot it ? Or is it something wrong in my implementation of KASAN ?
> >>>>>
> >>>>> I checked that after kasan_init(), the entire shadow memory is full of 0 only.
> >>>>>
> >>>>> I also made a try with the strong STACK_PROTECTOR compiled in, but no difference and nothing detected by the stack protector.
> >>>>>
> >>>>> ==================================================================
> >>>>> BUG: KASAN: stack-out-of-bounds in memchr+0x24/0x74
> >>>>> Read of size 1 at addr c0ecdd40 by task swapper/0
> >>>>>
> >>>>> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1133
> >>>>> Call Trace:
> >>>>> [c0e9dca0] [c01c42a0] print_address_description+0x64/0x2bc (unreliable)
> >>>>> [c0e9dcd0] [c01c4684] kasan_report+0xfc/0x180
> >>>>> [c0e9dd10] [c089579c] memchr+0x24/0x74
> >>>>> [c0e9dd30] [c00a9e38] msg_print_text+0x124/0x574
> >>>>> [c0e9dde0] [c00ab710] console_unlock+0x114/0x4f8
> >>>>> [c0e9de40] [c00adc60] vprintk_emit+0x188/0x1c4
> >>>>> --- interrupt: c0e9df00 at 0x400f330
> >>>>>      LR = init_stack+0x1f00/0x2000
> >>>>> [c0e9de80] [c00ae3c4] printk+0xa8/0xcc (unreliable)
> >>>>> [c0e9df20] [c0c28e44] early_irq_init+0x38/0x108
> >>>>> [c0e9df50] [c0c16434] start_kernel+0x310/0x488
> >>>>> [c0e9dff0] [00003484] 0x3484
> >>>>>
> >>>>> The buggy address belongs to the variable:
> >>>>>   __log_buf+0xec0/0x4020
> >>>>> The buggy address belongs to the page:
> >>>>> page:c6eac9a0 count:1 mapcount:0 mapping:00000000 index:0x0
> >>>>> flags: 0x1000(reserved)
> >>>>> raw: 00001000 c6eac9a4 c6eac9a4 00000000 00000000 00000000 ffffffff 00000001
> >>>>> page dumped because: kasan: bad access detected
> >>>>>
> >>>>> Memory state around the buggy address:
> >>>>>   c0ecdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>>>>   c0ecdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>>>>> c0ecdd00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
> >>>>>                                     ^
> >>>>>   c0ecdd80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
> >>>>>   c0ecde00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>>>> ==================================================================
> >>>>>
> >>>>
> >>>> This one doesn't look good. Notice that it says stack-out-of-bounds, but at the same time there is
> >>>>     "The buggy address belongs to the variable:  __log_buf+0xec0/0x4020"
> >>>>   which is printed by following code:
> >>>>     if (kernel_or_module_addr(addr) && !init_task_stack_addr(addr)) {
> >>>>         pr_err("The buggy address belongs to the variable:\n");
> >>>>         pr_err(" %pS\n", addr);
> >>>>     }
> >>>>
> >>>> So the stack unrelated address got stack-related poisoning. This could be a stack overflow, did you increase THREAD_SHIFT?
> >>>> KASAN with stack instrumentation significantly increases stack usage.
> >>>>
> >>>
> >>> I get the above with THREAD_SHIFT set to 13 (default value).
> >>> If increasing it to 14, I get the following instead. That means that in that case the problem arises a lot earlier in the boot process (but still after the final kasan shadow setup).
> >>>
> >>
> >> We usually use 15 (with 4k pages), but I think 14 should be enough for the clean boot.
> >>
> >>> ==================================================================
> >>> BUG: KASAN: stack-out-of-bounds in pmac_nvram_init+0x1f8/0x5d0
> >>> Read of size 1 at addr f6f37de0 by task swapper/0
> >>>
> >>> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1143
> >>> Call Trace:
> >>> [c0e9fd60] [c01c43c0] print_address_description+0x164/0x2bc (unreliable)
> >>> [c0e9fd90] [c01c46a4] kasan_report+0xfc/0x180
> >>> [c0e9fdd0] [c0c226d4] pmac_nvram_init+0x1f8/0x5d0
> >>> [c0e9fef0] [c0c1f73c] pmac_setup_arch+0x298/0x314
> >>> [c0e9ff20] [c0c1ac40] setup_arch+0x250/0x268
> >>> [c0e9ff50] [c0c151dc] start_kernel+0xb8/0x488
> >>> [c0e9fff0] [00003484] 0x3484
> >>>
> >>>
> >>> Memory state around the buggy address:
> >>>  f6f37c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>>  f6f37d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>>> f6f37d80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
> >>>                                                ^
> >>>  f6f37e00: 00 00 01 f4 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2
> >>>  f6f37e80: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00
> >>> ==================================================================
> >>
> >> Powerpc's show_stack() prints stack addresses, so we know that stack is something near 0xc0e9f... address.
> >> f6f37de0 is definitely not stack address and it's to far for the stack overflow.
> >> So it looks like shadow for stack  - kasan_mem_to_shadow(0xc0e9f...) and shadow for address in report - kasan_mem_to_shadow(0xf6f37de0)
> >> point to the same physical page.
> >
> > Shouldn't shadow start at 0xf8 for powerpc32? I did some math
> > yesterday which I think lead me to 0xf8.
>
> Dunno, maybe. How is this relevant? In case you referring to the 0xf6f* addresses in the report,
> these are not shadow, but accessed addresses.

Right. Then never mind.



* Re: BUG: KASAN: stack-out-of-bounds
  2019-02-28  9:47         ` Andrey Ryabinin
  2019-02-28  9:54           ` Dmitry Vyukov
@ 2019-02-28 13:41           ` Christophe Leroy
  1 sibling, 0 replies; 13+ messages in thread
From: Christophe Leroy @ 2019-02-28 13:41 UTC (permalink / raw)
  To: Andrey Ryabinin, Dmitry Vyukov
  Cc: Alexander Potapenko, Daniel Axtens, Linux-MM, linuxppc-dev, kasan-dev



Le 28/02/2019 à 10:47, Andrey Ryabinin a écrit :
> 
> 
> On 2/28/19 12:27 PM, Dmitry Vyukov wrote:
>> On Thu, Feb 28, 2019 at 10:22 AM Andrey Ryabinin
>> <aryabinin@virtuozzo.com> wrote:
>>>
>>>
>>>
>>> On 2/27/19 4:11 PM, Christophe Leroy wrote:
>>>>
>>>>
>>>> Le 27/02/2019 à 10:19, Andrey Ryabinin a écrit :
>>>>>
>>>>>
>>>>> On 2/27/19 11:25 AM, Christophe Leroy wrote:
>>>>>> With version v8 of the series implementing KASAN on 32 bits powerpc (https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=94309), I'm now able to activate KASAN on a mac99 is QEMU.
>>>>>>
>>>>>> Then I get the following reports at startup. Which of the two reports I get seems to depend on the option used to build the kernel, but for a given kernel I always get the same report.
>>>>>>
>>>>>> Is that a real bug, in which case how could I spot it ? Or is it something wrong in my implementation of KASAN ?
>>>>>>
>>>>>> I checked that after kasan_init(), the entire shadow memory is full of 0 only.
>>>>>>
>>>>>> I also made a try with the strong STACK_PROTECTOR compiled in, but no difference and nothing detected by the stack protector.
>>>>>>
>>>>>> ==================================================================
>>>>>> BUG: KASAN: stack-out-of-bounds in memchr+0x24/0x74
>>>>>> Read of size 1 at addr c0ecdd40 by task swapper/0
>>>>>>
>>>>>> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1133
>>>>>> Call Trace:
>>>>>> [c0e9dca0] [c01c42a0] print_address_description+0x64/0x2bc (unreliable)
>>>>>> [c0e9dcd0] [c01c4684] kasan_report+0xfc/0x180
>>>>>> [c0e9dd10] [c089579c] memchr+0x24/0x74
>>>>>> [c0e9dd30] [c00a9e38] msg_print_text+0x124/0x574
>>>>>> [c0e9dde0] [c00ab710] console_unlock+0x114/0x4f8
>>>>>> [c0e9de40] [c00adc60] vprintk_emit+0x188/0x1c4
>>>>>> --- interrupt: c0e9df00 at 0x400f330
>>>>>>       LR = init_stack+0x1f00/0x2000
>>>>>> [c0e9de80] [c00ae3c4] printk+0xa8/0xcc (unreliable)
>>>>>> [c0e9df20] [c0c28e44] early_irq_init+0x38/0x108
>>>>>> [c0e9df50] [c0c16434] start_kernel+0x310/0x488
>>>>>> [c0e9dff0] [00003484] 0x3484
>>>>>>
>>>>>> The buggy address belongs to the variable:
>>>>>>    __log_buf+0xec0/0x4020
>>>>>> The buggy address belongs to the page:
>>>>>> page:c6eac9a0 count:1 mapcount:0 mapping:00000000 index:0x0
>>>>>> flags: 0x1000(reserved)
>>>>>> raw: 00001000 c6eac9a4 c6eac9a4 00000000 00000000 00000000 ffffffff 00000001
>>>>>> page dumped because: kasan: bad access detected
>>>>>>
>>>>>> Memory state around the buggy address:
>>>>>>    c0ecdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>>>>    c0ecdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>>>>> c0ecdd00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
>>>>>>                                      ^
>>>>>>    c0ecdd80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
>>>>>>    c0ecde00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>>>> ==================================================================
>>>>>>
>>>>>
>>>>> This one doesn't look good. Notice that it says stack-out-of-bounds, but at the same time there is
>>>>>      "The buggy address belongs to the variable:  __log_buf+0xec0/0x4020"
>>>>>    which is printed by following code:
>>>>>      if (kernel_or_module_addr(addr) && !init_task_stack_addr(addr)) {
>>>>>          pr_err("The buggy address belongs to the variable:\n");
>>>>>          pr_err(" %pS\n", addr);
>>>>>      }
>>>>>
>>>>> So the stack unrelated address got stack-related poisoning. This could be a stack overflow, did you increase THREAD_SHIFT?
>>>>> KASAN with stack instrumentation significantly increases stack usage.
>>>>>
>>>>
>>>> I get the above with THREAD_SHIFT set to 13 (default value).
>>>> If increasing it to 14, I get the following instead. That means that in that case the problem arises a lot earlier in the boot process (but still after the final kasan shadow setup).
>>>>
>>>
>>> We usually use 15 (with 4k pages), but I think 14 should be enough for the clean boot.
>>>
>>>> ==================================================================
>>>> BUG: KASAN: stack-out-of-bounds in pmac_nvram_init+0x1f8/0x5d0
>>>> Read of size 1 at addr f6f37de0 by task swapper/0
>>>>
>>>> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1143
>>>> Call Trace:
>>>> [c0e9fd60] [c01c43c0] print_address_description+0x164/0x2bc (unreliable)
>>>> [c0e9fd90] [c01c46a4] kasan_report+0xfc/0x180
>>>> [c0e9fdd0] [c0c226d4] pmac_nvram_init+0x1f8/0x5d0
>>>> [c0e9fef0] [c0c1f73c] pmac_setup_arch+0x298/0x314
>>>> [c0e9ff20] [c0c1ac40] setup_arch+0x250/0x268
>>>> [c0e9ff50] [c0c151dc] start_kernel+0xb8/0x488
>>>> [c0e9fff0] [00003484] 0x3484
>>>>
>>>>
>>>> Memory state around the buggy address:
>>>>   f6f37c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>>   f6f37d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>>> f6f37d80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
>>>>                                                 ^
>>>>   f6f37e00: 00 00 01 f4 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2
>>>>   f6f37e80: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00
>>>> ==================================================================
>>>
>>> Powerpc's show_stack() prints stack addresses, so we know that stack is something near 0xc0e9f... address.
>>> f6f37de0 is definitely not stack address and it's to far for the stack overflow.
>>> So it looks like shadow for stack  - kasan_mem_to_shadow(0xc0e9f...) and shadow for address in report - kasan_mem_to_shadow(0xf6f37de0)
>>> point to the same physical page.
>>
>> Shouldn't shadow start at 0xf8 for powerpc32? I did some math
>> yesterday which I think lead me to 0xf8.
> 
> Dunno, maybe. How is this relevant? In case you referring to the 0xf6f* addresses in the report,
> these are not shadow, but accessed addresses.

Thanks for your help. Indeed, you made me realise that the access is to 
an IO mapping, which is therefore covered by the zero shadow page.

After some investigation I saw that the zero shadow page was being 
poisoned, although I confirmed it was mapped RO in every page table 
entry referencing it.
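
For reference, the f1/f2/f3 bytes in the "Memory state" dumps of both 
reports are KASAN's stack redzone markers, which is why poison in the 
shared zero shadow page shows up as stack-out-of-bounds. A minimal sketch 
of how to read such a dump follows. The helper names are hypothetical; 
the marker values are the ones defined in mm/kasan/kasan.h for kernels of 
this era, and each dump row holds 16 shadow bytes, i.e. 128 bytes of memory.

```c
#include <assert.h>
#include <stdint.h>

enum sketch_shadow_kind {
	SKETCH_ACCESSIBLE,	/* 0x00: all 8 bytes of the granule are valid */
	SKETCH_PARTIAL,		/* 0x01..0x07: only the first N bytes are valid */
	SKETCH_STACK_LEFT,	/* 0xF1: KASAN_STACK_LEFT */
	SKETCH_STACK_MID,	/* 0xF2: KASAN_STACK_MID */
	SKETCH_STACK_RIGHT,	/* 0xF3: KASAN_STACK_RIGHT */
	SKETCH_STACK_PARTIAL,	/* 0xF4: KASAN_STACK_PARTIAL */
	SKETCH_OTHER_POISON	/* any other value, not decoded here */
};

/* Classify one shadow byte taken from a report. */
static enum sketch_shadow_kind sketch_classify(uint8_t shadow)
{
	if (shadow == 0x00)
		return SKETCH_ACCESSIBLE;
	if (shadow < 8)
		return SKETCH_PARTIAL;
	switch (shadow) {
	case 0xF1: return SKETCH_STACK_LEFT;
	case 0xF2: return SKETCH_STACK_MID;
	case 0xF3: return SKETCH_STACK_RIGHT;
	case 0xF4: return SKETCH_STACK_PARTIAL;
	default:   return SKETCH_OTHER_POISON;
	}
}

/* Index of the shadow byte for addr within the dump row starting at row_base. */
static unsigned int sketch_row_index(uint32_t row_base, uint32_t addr)
{
	/* A row covers 16 shadow bytes * 8 bytes of memory each. */
	assert(addr >= row_base && addr - row_base < 16u * 8u);
	return (addr - row_base) >> 3;
}
```

With the first report, sketch_row_index(0xc0ecdd00, 0xc0ecdd40) is 8, and 
byte 8 of that row is f1. With the second, 
sketch_row_index(0xf6f37d80, 0xf6f37de0) is 12, again an f1 byte, so both 
accesses hit a left stack redzone marker.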

What I finally discovered is that the HW still had some of the early 
page table entries pointing to the zero page in RW.

The reason is that the PGD has multiple entries pointing to 
kasan_early_shadow_pte[]. In powerpc hash32, the _PAGE_HASHPTE flag is 
set in a PTE to tell that it has been given to the HW. Then when 
flush_tlb_kernel_range() is called, the kernel walks the page tables, 
only really flushes the pages having the _PAGE_HASHPTE flag, and then 
clears that flag. The consequence is that when the kernel walks that 
same PTE again through a different PGD entry, it is seen as not needing 
a flush anymore.
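
The effect described above can be reproduced with a toy model. This is 
not the powerpc code: every name below is made up for illustration, the 
table sizes are arbitrary, and the hardware hash table is reduced to one 
boolean per virtual page. It only shows how a "flushed" flag stored in a 
PTE table shared by several PGD entries leaves stale translations behind.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_HASHPTE	0x1u	/* stand-in for _PAGE_HASHPTE */
#define TOY_PTRS_PER_PGD	3
#define TOY_PTRS_PER_PTE	4

struct toy_mmu {
	/* One PTE table shared by every PGD entry, like kasan_early_shadow_pte[]. */
	unsigned int shared_pte[TOY_PTRS_PER_PTE];
	/* Translations the hardware currently holds, per virtual page. */
	bool hw_has_entry[TOY_PTRS_PER_PGD][TOY_PTRS_PER_PTE];
};

/* An access through a given PGD entry hands the PTE to the hardware. */
static void toy_touch(struct toy_mmu *m, size_t pgd, size_t pte)
{
	assert(pgd < TOY_PTRS_PER_PGD && pte < TOY_PTRS_PER_PTE);
	m->hw_has_entry[pgd][pte] = true;
	m->shared_pte[pte] |= TOY_PAGE_HASHPTE;
}

/* Walk all PGD entries; flush only PTEs flagged as known to the hardware. */
static void toy_flush_range(struct toy_mmu *m)
{
	for (size_t pgd = 0; pgd < TOY_PTRS_PER_PGD; pgd++) {
		for (size_t pte = 0; pte < TOY_PTRS_PER_PTE; pte++) {
			if (!(m->shared_pte[pte] & TOY_PAGE_HASHPTE))
				continue;	/* looks clean, skipped */
			m->hw_has_entry[pgd][pte] = false;
			/* Clearing the flag hides the PTE from later PGD entries. */
			m->shared_pte[pte] &= ~TOY_PAGE_HASHPTE;
		}
	}
}

/* Count translations the hardware still holds after a flush. */
static size_t toy_count_stale(const struct toy_mmu *m)
{
	size_t n = 0;

	for (size_t pgd = 0; pgd < TOY_PTRS_PER_PGD; pgd++)
		for (size_t pte = 0; pte < TOY_PTRS_PER_PTE; pte++)
			n += m->hw_has_entry[pgd][pte] ? 1 : 0;
	return n;
}
```

If the same shared PTE is touched through PGD entries 0 and 2 and 
toy_flush_range() is then called, the walk through entry 0 flushes and 
clears the flag, the walk through entry 2 skips the PTE, and one stale 
translation remains.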

So the conclusion, which I'm finalising at the moment, is to have the 
final shadow page table layout set up as soon as memblock is available 
and before switching from the early hash table to the final hash table.

Christophe

> 
>> This allows to cover at most 1GB of memory. Do you have more by any chance?
>>



Thread overview: 13+ messages
2019-02-27  8:25 BUG: KASAN: stack-out-of-bounds Christophe Leroy
2019-02-27  8:34 ` Dmitry Vyukov
2019-02-27 12:35   ` Christophe Leroy
2019-02-27 13:07     ` Dmitry Vyukov
2019-02-27  9:19 ` Andrey Ryabinin
2019-02-27  9:25   ` Dmitry Vyukov
2019-02-27  9:33     ` Christophe Leroy
2019-02-27 13:11   ` Christophe Leroy
2019-02-28  9:22     ` Andrey Ryabinin
2019-02-28  9:27       ` Dmitry Vyukov
2019-02-28  9:47         ` Andrey Ryabinin
2019-02-28  9:54           ` Dmitry Vyukov
2019-02-28 13:41           ` Christophe Leroy