All of lore.kernel.org
 help / color / mirror / Atom feed
* [syzbot] KASAN: stack-out-of-bounds Read in profile_pc
@ 2021-05-31  7:15 syzbot
  2021-06-02 23:00 ` Josh Poimboeuf
  2021-06-03  8:02 ` syzbot
  0 siblings, 2 replies; 12+ messages in thread
From: syzbot @ 2021-05-31  7:15 UTC (permalink / raw)
  To: ak, bp, hpa, inglorion, jpoimboe, linux-kernel, mingo,
	syzkaller-bugs, tglx, x86

Hello,

syzbot found the following issue on:

HEAD commit:    7ac3a1c1 Merge tag 'mtd/fixes-for-5.13-rc4' of git://git.k..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1246d43dd00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=f9f3fc7daa178986
dashboard link: https://syzkaller.appspot.com/bug?extid=84fe685c02cd112a2ac3
compiler:       Debian clang version 11.0.1-2

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+84fe685c02cd112a2ac3@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: stack-out-of-bounds in profile_pc+0xa4/0xe0 arch/x86/kernel/time.c:42
Read of size 8 at addr ffffc90001c0f7a0 by task systemd-udevd/12323

CPU: 1 PID: 12323 Comm: systemd-udevd Not tainted 5.13.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x202/0x31e lib/dump_stack.c:120
 print_address_description+0x5f/0x3b0 mm/kasan/report.c:233
 __kasan_report mm/kasan/report.c:419 [inline]
 kasan_report+0x15c/0x200 mm/kasan/report.c:436
 profile_pc+0xa4/0xe0 arch/x86/kernel/time.c:42
 profile_tick+0xcd/0x120 kernel/profile.c:408
 tick_sched_handle kernel/time/tick-sched.c:227 [inline]
 tick_sched_timer+0x287/0x420 kernel/time/tick-sched.c:1373
 __run_hrtimer kernel/time/hrtimer.c:1537 [inline]
 __hrtimer_run_queues+0x4cb/0xa60 kernel/time/hrtimer.c:1601
 hrtimer_interrupt+0x3b3/0x1040 kernel/time/hrtimer.c:1663
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1089 [inline]
 __sysvec_apic_timer_interrupt+0xf9/0x270 arch/x86/kernel/apic/apic.c:1106
 sysvec_apic_timer_interrupt+0x8c/0xb0 arch/x86/kernel/apic/apic.c:1100
 </IRQ>
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:647
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:161 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0xbc/0x120 kernel/locking/spinlock.c:191
Code: f0 48 c1 e8 03 42 80 3c 20 00 74 08 4c 89 f7 e8 5a 57 04 f8 f6 44 24 21 02 75 4e 41 f7 c7 00 02 00 00 74 01 fb bf 01 00 00 00 <e8> 0f 3f 94 f7 65 8b 05 50 54 3f 76 85 c0 74 3f 48 c7 04 24 0e 36
RSP: 0018:ffffc90001c0f7a0 EFLAGS: 00000206
RAX: 1ffff92000381ef8 RBX: ffff88802e529540 RCX: ffffffff90e84703
RDX: dffffc0000000000 RSI: 0000000000000001 RDI: 0000000000000001
RBP: ffffc90001c0f830 R08: ffffffff81855cb0 R09: ffffed1005ca52a9
R10: ffffed1005ca52a9 R11: 0000000000000000 R12: dffffc0000000000
R13: 1ffff92000381ef4 R14: ffffc90001c0f7c0 R15: 0000000000000a02
 spin_unlock_irqrestore include/linux/spinlock.h:409 [inline]
 __wake_up_common_lock kernel/sched/wait.c:140 [inline]
 __wake_up_sync_key+0x134/0x1f0 kernel/sched/wait.c:205
 sock_def_readable+0x141/0x210 net/core/sock.c:2910
 unix_dgram_sendmsg+0x1a6e/0x2aa0 net/unix/af_unix.c:1800
 sock_sendmsg_nosec net/socket.c:654 [inline]
 sock_sendmsg net/socket.c:674 [inline]
 sock_write_iter+0x398/0x520 net/socket.c:1001
 call_write_iter include/linux/fs.h:2114 [inline]
 new_sync_write fs/read_write.c:518 [inline]
 vfs_write+0xa39/0xc90 fs/read_write.c:605
 ksys_write+0x171/0x2a0 fs/read_write.c:658
 do_syscall_64+0x3f/0xb0 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f8354df21b0
Code: 2e 0f 1f 84 00 00 00 00 00 90 48 8b 05 19 7e 20 00 c3 0f 1f 84 00 00 00 00 00 83 3d 19 c2 20 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ae fc ff ff 48 89 04 24
RSP: 002b:00007fff1f53e558 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000055f37f7b13c0 RCX: 00007f8354df21b0
RDX: 0000000000000000 RSI: 00007fff1f53e610 RDI: 0000000000000008
RBP: 00007fff1f53e6d0 R08: 000055f37f7a0a04 R09: 0000000000000000
R10: 0000000000000004 R11: 0000000000000246 R12: 00007fff1f53e620
R13: 000055f37f78c880 R14: 0000000000000003 R15: 000000000000000e


addr ffffc90001c0f7a0 is located in stack of task systemd-udevd/12323 at offset 0 in frame:
 _raw_spin_unlock_irqrestore+0x0/0x120 kernel/locking/spinlock.c:184

this frame has 1 object:
 [32, 40) 'flags.i.i.i.i'

Memory state around the buggy address:
 ffffc90001c0f680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffffc90001c0f700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffffc90001c0f780: 00 00 00 00 f1 f1 f1 f1 00 f3 f3 f3 00 00 00 00
                               ^
 ffffc90001c0f800: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
 ffffc90001c0f880: 00 00 00 00 00 f3 f3 f3 f3 f3 f3 f3 00 00 00 00
==================================================================


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [syzbot] KASAN: stack-out-of-bounds Read in profile_pc
  2021-05-31  7:15 [syzbot] KASAN: stack-out-of-bounds Read in profile_pc syzbot
@ 2021-06-02 23:00 ` Josh Poimboeuf
  2021-06-02 23:35   ` Andi Kleen
  2021-06-03  8:02 ` syzbot
  1 sibling, 1 reply; 12+ messages in thread
From: Josh Poimboeuf @ 2021-06-02 23:00 UTC (permalink / raw)
  To: syzbot
  Cc: ak, bp, hpa, inglorion, linux-kernel, mingo, syzkaller-bugs,
	tglx, x86, Peter Zijlstra, Andy Lutomirski

On Mon, May 31, 2021 at 12:15:23AM -0700, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    7ac3a1c1 Merge tag 'mtd/fixes-for-5.13-rc4' of git://git.k..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1246d43dd00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=f9f3fc7daa178986
> dashboard link: https://syzkaller.appspot.com/bug?extid=84fe685c02cd112a2ac3
> compiler:       Debian clang version 11.0.1-2
> 
> Unfortunately, I don't have any reproducer for this issue yet.
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+84fe685c02cd112a2ac3@syzkaller.appspotmail.com
> 
> ==================================================================
> BUG: KASAN: stack-out-of-bounds in profile_pc+0xa4/0xe0 arch/x86/kernel/time.c:42
> Read of size 8 at addr ffffc90001c0f7a0 by task systemd-udevd/12323

This looks like a valid bug in profile_pc().  With !FRAME_POINTER, it
has an ancient (2006) hack for unwinding a single frame, for when
regs->ip is in a lock function.

I guess the point is to put lock functions' callees in the profile,
rather than the lock functions themselves.

profile_pc() assumes the return address is either directly at regs->sp,
or one word adjacent to it due to saved flags, both of which are just
completely wrong.  This code has probably never worked with ORC, and
nobody noticed apparently.

We could just use ORC to unwind to the next frame.  Though, isn't
/proc/profile redundant, compared to all the more sophisticated options
nowadays?  Is there still a distinct use case for it or can we just
remove it?

-- 
Josh


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [syzbot] KASAN: stack-out-of-bounds Read in profile_pc
  2021-06-02 23:00 ` Josh Poimboeuf
@ 2021-06-02 23:35   ` Andi Kleen
  2021-06-03 13:29     ` Josh Poimboeuf
  2021-06-03 13:30     ` Peter Zijlstra
  0 siblings, 2 replies; 12+ messages in thread
From: Andi Kleen @ 2021-06-02 23:35 UTC (permalink / raw)
  To: Josh Poimboeuf, syzbot
  Cc: bp, hpa, inglorion, linux-kernel, mingo, syzkaller-bugs, tglx,
	x86, Peter Zijlstra, Andy Lutomirski


> profile_pc() assumes the return address is either directly at regs->sp,
> or one word adjacent to it due to saved flags, both of which are just
> completely wrong.  This code has probably never worked with ORC, and
> nobody noticed apparently.

I presume it used to work because the lock functions were really simple, 
but that's not true anymore.

>
> We could just use ORC to unwind to the next frame.  Though, isn't
> /proc/profile redundant, compared to all the more sophisticated options
> nowadays?  Is there still a distinct use case for it or can we just
> remove it?

It's still needed for some special cases. For example there is no other 
viable way to profile early boot without a VM

I would just drop the hack to unwind, at least for the early boot 
profile use case locking profiling is usually not needed.

-Andi


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [syzbot] KASAN: stack-out-of-bounds Read in profile_pc
  2021-05-31  7:15 [syzbot] KASAN: stack-out-of-bounds Read in profile_pc syzbot
  2021-06-02 23:00 ` Josh Poimboeuf
@ 2021-06-03  8:02 ` syzbot
  1 sibling, 0 replies; 12+ messages in thread
From: syzbot @ 2021-06-03  8:02 UTC (permalink / raw)
  To: ak, bp, hpa, inglorion, jpoimboe, linux-kernel, luto, mingo,
	peterz, syzkaller-bugs, tglx, x86

syzbot has found a reproducer for the following issue on:

HEAD commit:    324c92e5 Merge tag 'efi-urgent-2021-06-02' of git://git.ke..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1653122fd00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=ad5040c83f09b8e4
dashboard link: https://syzkaller.appspot.com/bug?extid=84fe685c02cd112a2ac3
compiler:       Debian clang version 11.0.1-2
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=171e0683d00000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1723cc47d00000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+84fe685c02cd112a2ac3@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: stack-out-of-bounds in profile_pc+0xa4/0xe0 arch/x86/kernel/time.c:42
Read of size 8 at addr ffffc9000163f620 by task syz-executor815/8426

CPU: 1 PID: 8426 Comm: syz-executor815 Not tainted 5.13.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x202/0x31e lib/dump_stack.c:120
 print_address_description+0x5f/0x3b0 mm/kasan/report.c:233
 __kasan_report mm/kasan/report.c:419 [inline]
 kasan_report+0x15c/0x200 mm/kasan/report.c:436
 profile_pc+0xa4/0xe0 arch/x86/kernel/time.c:42
 profile_tick+0xcd/0x120 kernel/profile.c:408
 tick_sched_handle kernel/time/tick-sched.c:227 [inline]
 tick_sched_timer+0x287/0x420 kernel/time/tick-sched.c:1373
 __run_hrtimer kernel/time/hrtimer.c:1537 [inline]
 __hrtimer_run_queues+0x4cb/0xa60 kernel/time/hrtimer.c:1601
 hrtimer_interrupt+0x3b3/0x1040 kernel/time/hrtimer.c:1663
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1089 [inline]
 __sysvec_apic_timer_interrupt+0xf9/0x270 arch/x86/kernel/apic/apic.c:1106
 sysvec_apic_timer_interrupt+0x8c/0xb0 arch/x86/kernel/apic/apic.c:1100
 </IRQ>
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:647
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:161 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0xbc/0x120 kernel/locking/spinlock.c:191
Code: f0 48 c1 e8 03 42 80 3c 20 00 74 08 4c 89 f7 e8 ea e7 03 f8 f6 44 24 21 02 75 4e 41 f7 c7 00 02 00 00 74 01 fb bf 01 00 00 00 <e8> 1f b3 93 f7 65 8b 05 50 c4 3e 76 85 c0 74 3f 48 c7 04 24 0e 36
RSP: 0018:ffffc9000163f620 EFLAGS: 00000206
RAX: 1ffff920002c7ec8 RBX: ffffffff9117f258 RCX: ffffffff90e85703
RDX: dffffc0000000000 RSI: 0000000000000001 RDI: 0000000000000001
RBP: ffffc9000163f6b8 R08: ffffffff818560c0 R09: fffffbfff222fe4c
R10: fffffbfff222fe4c R11: 0000000000000000 R12: dffffc0000000000
R13: 1ffff920002c7ec4 R14: ffffc9000163f640 R15: 0000000000000a02
 __debug_check_no_obj_freed lib/debugobjects.c:997 [inline]
 debug_check_no_obj_freed+0x5a2/0x650 lib/debugobjects.c:1018
 free_pages_prepare mm/page_alloc.c:1303 [inline]
 __free_pages_ok+0x2f5/0x1180 mm/page_alloc.c:1572
 destroy_compound_page include/linux/mm.h:939 [inline]
 __put_compound_page mm/swap.c:111 [inline]
 release_pages+0x600/0x1b80 mm/swap.c:948
 tlb_batch_pages_flush mm/mmu_gather.c:49 [inline]
 tlb_flush_mmu_free mm/mmu_gather.c:242 [inline]
 tlb_flush_mmu+0x780/0x910 mm/mmu_gather.c:249
 tlb_finish_mmu+0xcb/0x200 mm/mmu_gather.c:340
 exit_mmap+0x2c6/0x5f0 mm/mmap.c:3210
 __mmput+0x111/0x370 kernel/fork.c:1096
 exit_mm+0x67e/0x7d0 kernel/exit.c:502
 do_exit+0x6b9/0x23d0 kernel/exit.c:813
 do_group_exit+0x168/0x2d0 kernel/exit.c:923
 __do_sys_exit_group+0x13/0x20 kernel/exit.c:934
 __se_sys_exit_group+0x10/0x10 kernel/exit.c:932
 __x64_sys_exit_group+0x37/0x40 kernel/exit.c:932
 do_syscall_64+0x3f/0xb0 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x446bc9
Code: Unable to access opcode bytes at RIP 0x446b9f.
RSP: 002b:00007ffdae409208 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00000000004b8390 RCX: 0000000000446bc9
RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000001
RBP: 0000000000000001 R08: ffffffffffffffc4 R09: 0000000000000004
R10: 00000000004004a0 R11: 0000000000000246 R12: 00000000004b8390
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000001


addr ffffc9000163f620 is located in stack of task syz-executor815/8426 at offset 0 in frame:
 _raw_spin_unlock_irqrestore+0x0/0x120 kernel/locking/spinlock.c:184

this frame has 1 object:
 [32, 40) 'flags.i.i.i.i'

Memory state around the buggy address:
 ffffc9000163f500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffffc9000163f580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffffc9000163f600: 00 00 00 00 f1 f1 f1 f1 00 f3 f3 f3 00 00 00 00
                               ^
 ffffc9000163f680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffffc9000163f700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [syzbot] KASAN: stack-out-of-bounds Read in profile_pc
  2021-06-02 23:35   ` Andi Kleen
@ 2021-06-03 13:29     ` Josh Poimboeuf
  2021-06-03 13:30     ` Peter Zijlstra
  1 sibling, 0 replies; 12+ messages in thread
From: Josh Poimboeuf @ 2021-06-03 13:29 UTC (permalink / raw)
  To: Andi Kleen
  Cc: syzbot, bp, hpa, inglorion, linux-kernel, mingo, syzkaller-bugs,
	tglx, x86, Peter Zijlstra, Andy Lutomirski

On Wed, Jun 02, 2021 at 04:35:11PM -0700, Andi Kleen wrote:
> 
> > profile_pc() assumes the return address is either directly at regs->sp,
> > or one word adjacent to it due to saved flags, both of which are just
> > completely wrong.  This code has probably never worked with ORC, and
> > nobody noticed apparently.
> 
> I presume it used to work because the lock functions were really simple, but
> that's not true anymore.

Yeah, I figured as much.

> > We could just use ORC to unwind to the next frame.  Though, isn't
> > /proc/profile redundant, compared to all the more sophisticated options
> > nowadays?  Is there still a distinct use case for it or can we just
> > remove it?
> 
> It's still needed for some special cases. For example there is no other
> viable way to profile early boot without a VM
> 
> I would just drop the hack to unwind, at least for the early boot profile
> use case locking profiling is usually not needed.

Ok, I'll just get rid of the hack then.

-- 
Josh


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [syzbot] KASAN: stack-out-of-bounds Read in profile_pc
  2021-06-02 23:35   ` Andi Kleen
  2021-06-03 13:29     ` Josh Poimboeuf
@ 2021-06-03 13:30     ` Peter Zijlstra
  2021-06-03 13:39       ` Josh Poimboeuf
  1 sibling, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2021-06-03 13:30 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Josh Poimboeuf, syzbot, bp, hpa, inglorion, linux-kernel, mingo,
	syzkaller-bugs, tglx, x86, Andy Lutomirski

On Wed, Jun 02, 2021 at 04:35:11PM -0700, Andi Kleen wrote:

> > We could just use ORC to unwind to the next frame.  Though, isn't
> > /proc/profile redundant, compared to all the more sophisticated options
> > nowadays?  Is there still a distinct use case for it or can we just
> > remove it?
> 
> It's still needed for some special cases. For example there is no other
> viable way to profile early boot without a VM
> 
> I would just drop the hack to unwind, at least for the early boot profile
> use case locking profiling is usually not needed.

Surely we can cook up something else there and delete this thing? ftrace
buffers are available really early, it shouldn't be hard to dump some
data in there during boot.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [syzbot] KASAN: stack-out-of-bounds Read in profile_pc
  2021-06-03 13:30     ` Peter Zijlstra
@ 2021-06-03 13:39       ` Josh Poimboeuf
  2021-06-03 13:52         ` Andi Kleen
  0 siblings, 1 reply; 12+ messages in thread
From: Josh Poimboeuf @ 2021-06-03 13:39 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andi Kleen, syzbot, bp, hpa, inglorion, linux-kernel, mingo,
	syzkaller-bugs, tglx, x86, Andy Lutomirski, Steven Rostedt

On Thu, Jun 03, 2021 at 03:30:10PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 02, 2021 at 04:35:11PM -0700, Andi Kleen wrote:
> 
> > > We could just use ORC to unwind to the next frame.  Though, isn't
> > > /proc/profile redundant, compared to all the more sophisticated options
> > > nowadays?  Is there still a distinct use case for it or can we just
> > > remove it?
> > 
> > It's still needed for some special cases. For example there is no other
> > viable way to profile early boot without a VM
> > 
> > I would just drop the hack to unwind, at least for the early boot profile
> > use case locking profiling is usually not needed.
> 
> Surely we can cook up something else there and delete this thing? ftrace
> buffers are available really early, it shouldn't be hard to dump some
> data in there during boot.

True, ftrace does have function profiling (function_profile_enabled).

Steve, is there a way to enable that on the kernel cmdline?

-- 
Josh


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [syzbot] KASAN: stack-out-of-bounds Read in profile_pc
  2021-06-03 13:39       ` Josh Poimboeuf
@ 2021-06-03 13:52         ` Andi Kleen
  2021-10-11 13:07           ` Lee Jones
  0 siblings, 1 reply; 12+ messages in thread
From: Andi Kleen @ 2021-06-03 13:52 UTC (permalink / raw)
  To: Josh Poimboeuf, Peter Zijlstra
  Cc: syzbot, bp, hpa, inglorion, linux-kernel, mingo, syzkaller-bugs,
	tglx, x86, Andy Lutomirski, Steven Rostedt


> True, ftrace does have function profiling (function_profile_enabled).
>
> Steve, is there a way to enable that on the kernel cmdline?

That's not really comparable. function profiling has a lot more 
overhead. Also there is various code which has ftrace instrumentation 
disabled.

I don't think why you want to kill the old profiler. It's rarely used, 
but when you need it usually works. It's always good to have simple fall 
backs. And it's not that it's a lot of difficult code.

-Andi


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [syzbot] KASAN: stack-out-of-bounds Read in profile_pc
  2021-06-03 13:52         ` Andi Kleen
@ 2021-10-11 13:07           ` Lee Jones
  2021-10-11 14:43             ` Steven Rostedt
  0 siblings, 1 reply; 12+ messages in thread
From: Lee Jones @ 2021-10-11 13:07 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Josh Poimboeuf, Peter Zijlstra, syzbot, bp, hpa, inglorion,
	linux-kernel, mingo, syzkaller-bugs, tglx, x86, Andy Lutomirski,
	Steven Rostedt, tkjos

On Thu, 03 Jun 2021, Andi Kleen wrote:

> 
> > True, ftrace does have function profiling (function_profile_enabled).
> > 
> > Steve, is there a way to enable that on the kernel cmdline?
> 
> That's not really comparable. function profiling has a lot more overhead.
> Also there is various code which has ftrace instrumentation disabled.
> 
> I don't think why you want to kill the old profiler. It's rarely used, but
> when you need it usually works. It's always good to have simple fall backs.
> And it's not that it's a lot of difficult code.

sysbot is still sending out reports on this:

  https://syzkaller.appspot.com/bug?id=00c965d957410afc0d40cac5343064e0a98b9ecd

Are you guys still planning on sending out a fix?

Is there anything I can do to help?

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [syzbot] KASAN: stack-out-of-bounds Read in profile_pc
  2021-10-11 13:07           ` Lee Jones
@ 2021-10-11 14:43             ` Steven Rostedt
  2021-10-11 17:10               ` Dmitry Vyukov
  2021-10-11 17:30               ` Josh Poimboeuf
  0 siblings, 2 replies; 12+ messages in thread
From: Steven Rostedt @ 2021-10-11 14:43 UTC (permalink / raw)
  To: Lee Jones
  Cc: Andi Kleen, Josh Poimboeuf, Peter Zijlstra, syzbot, bp, hpa,
	inglorion, linux-kernel, mingo, syzkaller-bugs, tglx, x86,
	Andy Lutomirski, tkjos

On Mon, 11 Oct 2021 14:07:15 +0100
Lee Jones <lee.jones@linaro.org> wrote:

> On Thu, 03 Jun 2021, Andi Kleen wrote:
> 
> >   
> > > True, ftrace does have function profiling (function_profile_enabled).
> > > 
> > > Steve, is there a way to enable that on the kernel cmdline?  
> > 
> > That's not really comparable. function profiling has a lot more overhead.
> > Also there is various code which has ftrace instrumentation disabled.
> > 
> > I don't think why you want to kill the old profiler. It's rarely used, but
> > when you need it usually works. It's always good to have simple fall backs.
> > And it's not that it's a lot of difficult code.  
> 
> sysbot is still sending out reports on this:
> 
>   https://syzkaller.appspot.com/bug?id=00c965d957410afc0d40cac5343064e0a98b9ecd
> 
> Are you guys still planning on sending out a fix?
> 
> Is there anything I can do to help?
> 

According to the above:

==================================================================
BUG: KASAN: stack-out-of-bounds in profile_pc+0xa4/0xe0 arch/x86/kernel/time.c:42
Read of size 8 at addr ffffc90001c0f7a0 by task systemd-udevd/12323

CPU: 1 PID: 12323 Comm: systemd-udevd Not tainted 5.13.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x202/0x31e lib/dump_stack.c:120
 print_address_description+0x5f/0x3b0 mm/kasan/report.c:233
 __kasan_report mm/kasan/report.c:419 [inline]
 kasan_report+0x15c/0x200 mm/kasan/report.c:436
 profile_pc+0xa4/0xe0 arch/x86/kernel/time.c:42
 profile_tick+0xcd/0x120 kernel/profile.c:408
 tick_sched_handle kernel/time/tick-sched.c:227 [inline]
 tick_sched_timer+0x287/0x420 kernel/time/tick-sched.c:1373
 __run_hrtimer kernel/time/hrtimer.c:1537 [inline]
 __hrtimer_run_queues+0x4cb/0xa60 kernel/time/hrtimer.c:1601
 hrtimer_interrupt+0x3b3/0x1040 kernel/time/hrtimer.c:1663
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1089 [inline]
 __sysvec_apic_timer_interrupt+0xf9/0x270 arch/x86/kernel/apic/apic.c:1106
 sysvec_apic_timer_interrupt+0x8c/0xb0 arch/x86/kernel/apic/apic.c:1100

And the code has:

 profile_pc+0xa4/0xe0 arch/x86/kernel/time.c:42

unsigned long profile_pc(struct pt_regs *regs)
{
	unsigned long pc = instruction_pointer(regs);

	if (!user_mode(regs) && in_lock_functions(pc)) {
#ifdef CONFIG_FRAME_POINTER
		return *(unsigned long *)(regs->bp + sizeof(long));
#else
		unsigned long *sp = (unsigned long *)regs->sp;
		/*
		 * Return address is either directly at stack pointer
		 * or above a saved flags. Eflags has bits 22-31 zero,
		 * kernel addresses don't.
		 */
		if (sp[0] >> 22)
			return sp[0];  <== line 42
		if (sp[1] >> 22)
			return sp[1];
#endif
	}
	return pc;
}
EXPORT_SYMBOL(profile_pc);


It looks to me that the profiler is doing a trick to read the contents of
the stack when the interrupt went off, but this triggers the KASAN
instrumentation to think it's a mistake when it's not. aka "false positive".

How does one tell KASAN that it wants to go outside the stack, because it
knows what its doing?

Should that just be converted to a "copy_from_kernel_nofault()"? That is,
does this fix it? (not even compiled tested)

-- Steve


diff --git a/arch/x86/kernel/time.c b/arch/x86/kernel/time.c
index e42faa792c07..cc6ec29aa14d 100644
--- a/arch/x86/kernel/time.c
+++ b/arch/x86/kernel/time.c
@@ -34,15 +34,18 @@ unsigned long profile_pc(struct pt_regs *regs)
 		return *(unsigned long *)(regs->bp + sizeof(long));
 #else
 		unsigned long *sp = (unsigned long *)regs->sp;
+		unsigned long retaddr;
 		/*
 		 * Return address is either directly at stack pointer
 		 * or above a saved flags. Eflags has bits 22-31 zero,
 		 * kernel addresses don't.
 		 */
-		if (sp[0] >> 22)
-			return sp[0];
-		if (sp[1] >> 22)
-			return sp[1];
+		if (!copy_from_kernel_nofault(&retaddr, sp, sizeof(long)) &&
+		    retaddr >> 22)
+			return retaddr;
+		if (!copy_from_kernel_nofault(&retaddr, sp + 1, sizeof(long)) &&
+		    retaddr >> 22)
+			return retaddr;
 #endif
 	}
 	return pc;

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [syzbot] KASAN: stack-out-of-bounds Read in profile_pc
  2021-10-11 14:43             ` Steven Rostedt
@ 2021-10-11 17:10               ` Dmitry Vyukov
  2021-10-11 17:30               ` Josh Poimboeuf
  1 sibling, 0 replies; 12+ messages in thread
From: Dmitry Vyukov @ 2021-10-11 17:10 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Lee Jones, Andi Kleen, Josh Poimboeuf, Peter Zijlstra, syzbot,
	bp, hpa, inglorion, linux-kernel, mingo, syzkaller-bugs, tglx,
	x86, Andy Lutomirski, tkjos

On Mon, 11 Oct 2021 at 16:43, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Mon, 11 Oct 2021 14:07:15 +0100
> Lee Jones <lee.jones@linaro.org> wrote:
>
> > On Thu, 03 Jun 2021, Andi Kleen wrote:
> >
> > >
> > > > True, ftrace does have function profiling (function_profile_enabled).
> > > >
> > > > Steve, is there a way to enable that on the kernel cmdline?
> > >
> > > That's not really comparable. function profiling has a lot more overhead.
> > > Also there is various code which has ftrace instrumentation disabled.
> > >
> > > I don't think why you want to kill the old profiler. It's rarely used, but
> > > when you need it usually works. It's always good to have simple fall backs.
> > > And it's not that it's a lot of difficult code.
> >
> > sysbot is still sending out reports on this:
> >
> >   https://syzkaller.appspot.com/bug?id=00c965d957410afc0d40cac5343064e0a98b9ecd
> >
> > Are you guys still planning on sending out a fix?
> >
> > Is there anything I can do to help?
> >
>
> According to the above:
>
> ==================================================================
> BUG: KASAN: stack-out-of-bounds in profile_pc+0xa4/0xe0 arch/x86/kernel/time.c:42
> Read of size 8 at addr ffffc90001c0f7a0 by task systemd-udevd/12323
>
> CPU: 1 PID: 12323 Comm: systemd-udevd Not tainted 5.13.0-rc3-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  <IRQ>
>  __dump_stack lib/dump_stack.c:79 [inline]
>  dump_stack+0x202/0x31e lib/dump_stack.c:120
>  print_address_description+0x5f/0x3b0 mm/kasan/report.c:233
>  __kasan_report mm/kasan/report.c:419 [inline]
>  kasan_report+0x15c/0x200 mm/kasan/report.c:436
>  profile_pc+0xa4/0xe0 arch/x86/kernel/time.c:42
>  profile_tick+0xcd/0x120 kernel/profile.c:408
>  tick_sched_handle kernel/time/tick-sched.c:227 [inline]
>  tick_sched_timer+0x287/0x420 kernel/time/tick-sched.c:1373
>  __run_hrtimer kernel/time/hrtimer.c:1537 [inline]
>  __hrtimer_run_queues+0x4cb/0xa60 kernel/time/hrtimer.c:1601
>  hrtimer_interrupt+0x3b3/0x1040 kernel/time/hrtimer.c:1663
>  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1089 [inline]
>  __sysvec_apic_timer_interrupt+0xf9/0x270 arch/x86/kernel/apic/apic.c:1106
>  sysvec_apic_timer_interrupt+0x8c/0xb0 arch/x86/kernel/apic/apic.c:1100
>
> And the code has:
>
>  profile_pc+0xa4/0xe0 arch/x86/kernel/time.c:42
>
> unsigned long profile_pc(struct pt_regs *regs)
> {
>         unsigned long pc = instruction_pointer(regs);
>
>         if (!user_mode(regs) && in_lock_functions(pc)) {
> #ifdef CONFIG_FRAME_POINTER
>                 return *(unsigned long *)(regs->bp + sizeof(long));
> #else
>                 unsigned long *sp = (unsigned long *)regs->sp;
>                 /*
>                  * Return address is either directly at stack pointer
>                  * or above a saved flags. Eflags has bits 22-31 zero,
>                  * kernel addresses don't.
>                  */
>                 if (sp[0] >> 22)
>                         return sp[0];  <== line 42
>                 if (sp[1] >> 22)
>                         return sp[1];
> #endif
>         }
>         return pc;
> }
> EXPORT_SYMBOL(profile_pc);
>
>
> It looks to me that the profiler is doing a trick to read the contents of
> the stack when the interrupt went off, but this triggers the KASAN
> instrumentation to think it's a mistake when it's not. aka "false positive".
>
> How does one tell KASAN that it wants to go outside the stack, because it
> knows what its doing?
>
> Should that just be converted to a "copy_from_kernel_nofault()"? That is,
> does this fix it? (not even compiled tested)
>
> -- Steve
>
>
> diff --git a/arch/x86/kernel/time.c b/arch/x86/kernel/time.c
> index e42faa792c07..cc6ec29aa14d 100644
> --- a/arch/x86/kernel/time.c
> +++ b/arch/x86/kernel/time.c
> @@ -34,15 +34,18 @@ unsigned long profile_pc(struct pt_regs *regs)
>                 return *(unsigned long *)(regs->bp + sizeof(long));
>  #else
>                 unsigned long *sp = (unsigned long *)regs->sp;
> +               unsigned long retaddr;
>                 /*
>                  * Return address is either directly at stack pointer
>                  * or above a saved flags. Eflags has bits 22-31 zero,
>                  * kernel addresses don't.
>                  */
> -               if (sp[0] >> 22)
> -                       return sp[0];
> -               if (sp[1] >> 22)
> -                       return sp[1];
> +               if (!copy_from_kernel_nofault(&retaddr, sp, sizeof(long)) &&
> +                   retaddr >> 22)
> +                       return retaddr;
> +               if (!copy_from_kernel_nofault(&retaddr, sp + 1, sizeof(long)) &&
> +                   retaddr >> 22)
> +                       return retaddr;
>  #endif
>         }
>         return pc;



This looks like a fit for READ_ONCE_NOCHECK(). It's already used in a
number of similar places like unwinders:
E.g.:
https://elixir.bootlin.com/linux/v5.15-rc4/source/arch/x86/kernel/process.c#L984
https://elixir.bootlin.com/linux/v5.15-rc4/A/ident/READ_ONCE_NOCHECK

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [syzbot] KASAN: stack-out-of-bounds Read in profile_pc
  2021-10-11 14:43             ` Steven Rostedt
  2021-10-11 17:10               ` Dmitry Vyukov
@ 2021-10-11 17:30               ` Josh Poimboeuf
  1 sibling, 0 replies; 12+ messages in thread
From: Josh Poimboeuf @ 2021-10-11 17:30 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Lee Jones, Andi Kleen, Peter Zijlstra, syzbot, bp, hpa,
	inglorion, linux-kernel, mingo, syzkaller-bugs, tglx, x86,
	Andy Lutomirski, tkjos

On Mon, Oct 11, 2021 at 10:43:19AM -0400, Steven Rostedt wrote:
> ==================================================================
> BUG: KASAN: stack-out-of-bounds in profile_pc+0xa4/0xe0 arch/x86/kernel/time.c:42
> Read of size 8 at addr ffffc90001c0f7a0 by task systemd-udevd/12323
> 
> CPU: 1 PID: 12323 Comm: systemd-udevd Not tainted 5.13.0-rc3-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  <IRQ>
>  __dump_stack lib/dump_stack.c:79 [inline]
>  dump_stack+0x202/0x31e lib/dump_stack.c:120
>  print_address_description+0x5f/0x3b0 mm/kasan/report.c:233
>  __kasan_report mm/kasan/report.c:419 [inline]
>  kasan_report+0x15c/0x200 mm/kasan/report.c:436
>  profile_pc+0xa4/0xe0 arch/x86/kernel/time.c:42
>  profile_tick+0xcd/0x120 kernel/profile.c:408
>  tick_sched_handle kernel/time/tick-sched.c:227 [inline]
>  tick_sched_timer+0x287/0x420 kernel/time/tick-sched.c:1373
>  __run_hrtimer kernel/time/hrtimer.c:1537 [inline]
>  __hrtimer_run_queues+0x4cb/0xa60 kernel/time/hrtimer.c:1601
>  hrtimer_interrupt+0x3b3/0x1040 kernel/time/hrtimer.c:1663
>  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1089 [inline]
>  __sysvec_apic_timer_interrupt+0xf9/0x270 arch/x86/kernel/apic/apic.c:1106
>  sysvec_apic_timer_interrupt+0x8c/0xb0 arch/x86/kernel/apic/apic.c:1100
> 
> And the code has:
> 
>  profile_pc+0xa4/0xe0 arch/x86/kernel/time.c:42
> 
> unsigned long profile_pc(struct pt_regs *regs)
> {
> 	unsigned long pc = instruction_pointer(regs);
> 
> 	if (!user_mode(regs) && in_lock_functions(pc)) {
> #ifdef CONFIG_FRAME_POINTER
> 		return *(unsigned long *)(regs->bp + sizeof(long));
> #else
> 		unsigned long *sp = (unsigned long *)regs->sp;
> 		/*
> 		 * Return address is either directly at stack pointer
> 		 * or above a saved flags. Eflags has bits 22-31 zero,
> 		 * kernel addresses don't.
> 		 */
> 		if (sp[0] >> 22)
> 			return sp[0];  <== line 42
> 		if (sp[1] >> 22)
> 			return sp[1];
> #endif
> 	}
> 	return pc;
> }
> EXPORT_SYMBOL(profile_pc);
> 
> 
> It looks to me that the profiler is doing a trick to read the contents of
> the stack when the interrupt went off, but this triggers the KASAN
> instrumentation to think it's a mistake when it's not. aka "false positive".
> 
> How does one tell KASAN that it wants to go outside the stack, because it
> knows what its doing?

*If* the code knew what it were doing, it could use READ_ONCE_NOCHECK()
to skip KASAN checking.  But this code is horribly broken and dangerous.

-- 
Josh


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-10-11 17:30 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-31  7:15 [syzbot] KASAN: stack-out-of-bounds Read in profile_pc syzbot
2021-06-02 23:00 ` Josh Poimboeuf
2021-06-02 23:35   ` Andi Kleen
2021-06-03 13:29     ` Josh Poimboeuf
2021-06-03 13:30     ` Peter Zijlstra
2021-06-03 13:39       ` Josh Poimboeuf
2021-06-03 13:52         ` Andi Kleen
2021-10-11 13:07           ` Lee Jones
2021-10-11 14:43             ` Steven Rostedt
2021-10-11 17:10               ` Dmitry Vyukov
2021-10-11 17:30               ` Josh Poimboeuf
2021-06-03  8:02 ` syzbot

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.