* BUG: unable to handle kernel NULL pointer dereference in __generic_file_write_iter @ 2019-02-21 14:52 syzbot 2019-02-21 19:36 ` Andrew Morton 0 siblings, 1 reply; 7+ messages in thread From: syzbot @ 2019-02-21 14:52 UTC (permalink / raw) To: akpm, amir73il, darrick.wong, david, hannes, hughd, jrdr.linux, linux-kernel, linux-mm, syzkaller-bugs, willy Hello, syzbot found the following crash on: HEAD commit: 4aa9fc2a435a Revert "mm, memory_hotplug: initialize struct.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=1101382f400000 kernel config: https://syzkaller.appspot.com/x/.config?x=4fceea9e2d99ac20 dashboard link: https://syzkaller.appspot.com/bug?extid=ca95b2b7aef9e7cbd6ab compiler: gcc (GCC) 9.0.0 20181231 (experimental) Unfortunately, I don't have any reproducer for this crash yet. IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [INSTR] PGD a7ea0067 P4D a7ea0067 PUD 81535067 PMD 0 Oops: 0010 [#1] PREEMPT SMP KASAN CPU: 0 PID: 15924 Comm: syz-executor0 Not tainted 5.0.0-rc4+ #50 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010: (null) Code: Bad RIP value. RSP: 0018:ffff88804c3d7858 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffffffff885aeb60 RCX: 000000000000005b RDX: 0000000000000000 RSI: ffff88807ec22930 RDI: ffff8880a59bdcc0 RBP: ffff88804c3d79b8 R08: 0000000000000000 R09: ffff88804c3d7910 R10: ffff8880835ca200 R11: 0000000000000000 R12: 000000000000005b R13: ffff88804c3d7c98 R14: dffffc0000000000 R15: 0000000000000000 FS: 00007f3456db4700(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 00000000814ac000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __generic_file_write_iter+0x25e/0x630 mm/filemap.c:3333 ext4_file_write_iter+0x37a/0x1410 fs/ext4/file.c:266 call_write_iter include/linux/fs.h:1862 [inline] new_sync_write fs/read_write.c:474 [inline] __vfs_write+0x764/0xb40 fs/read_write.c:487 vfs_write+0x20c/0x580 fs/read_write.c:549 ksys_write+0x105/0x260 fs/read_write.c:598 __do_sys_write fs/read_write.c:610 [inline] __se_sys_write fs/read_write.c:607 [inline] __x64_sys_write+0x73/0xb0 fs/read_write.c:607 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x458089 Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f3456db3c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458089 RDX: 000000000000005b RSI: 0000000020000240 RDI: 0000000000000003 RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f3456db46d4 R13: 00000000004c7450 R14: 00000000004dce68 R15: 00000000ffffffff Modules linked in: CR2: 0000000000000000 ---[ end trace 5cac9d2c75a59916 ]--- kobject: 'loop5' (000000004426a409): kobject_uevent_env RIP: 0010: (null) Code: Bad RIP value. RSP: 0018:ffff88804c3d7858 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffffffff885aeb60 RCX: 000000000000005b kobject: 'loop5' (000000004426a409): fill_kobj_path: path = '/devices/virtual/block/loop5' RDX: 0000000000000000 RSI: ffff88807ec22930 RDI: ffff8880a59bdcc0 kobject: 'loop2' (00000000b82e0c58): kobject_uevent_env kobject: 'loop2' (00000000b82e0c58): fill_kobj_path: path = '/devices/virtual/block/loop2' RBP: ffff88804c3d79b8 R08: 0000000000000000 R09: ffff88804c3d7910 R10: ffff8880835ca200 R11: 0000000000000000 R12: 000000000000005b R13: ffff88804c3d7c98 R14: dffffc0000000000 R15: 0000000000000000 kobject: 'loop5' (000000004426a409): kobject_uevent_env FS: 00007f3456db4700(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000 kobject: 'loop5' (000000004426a409): fill_kobj_path: path = '/devices/virtual/block/loop5' CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000022029a0 CR3: 00000000814ac000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 --- This bug is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@googlegroups.com. syzbot will keep track of this bug report. See: https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with syzbot. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: BUG: unable to handle kernel NULL pointer dereference in __generic_file_write_iter 2019-02-21 14:52 BUG: unable to handle kernel NULL pointer dereference in __generic_file_write_iter syzbot @ 2019-02-21 19:36 ` Andrew Morton 2019-02-27 23:13 ` missing stack trace entry on NULL pointer call [was: Re: BUG: unable to handle kernel NULL pointer dereference in __generic_file_write_iter] Jann Horn 2019-03-01 3:31 ` BUG: unable to handle kernel NULL pointer dereference in __generic_file_write_iter Josh Poimboeuf 0 siblings, 2 replies; 7+ messages in thread From: Andrew Morton @ 2019-02-21 19:36 UTC (permalink / raw) To: syzbot Cc: amir73il, darrick.wong, david, hannes, hughd, jrdr.linux, linux-kernel, linux-mm, syzkaller-bugs, willy, Jan Kara On Thu, 21 Feb 2019 06:52:04 -0800 syzbot <syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com> wrote: > Hello, > > syzbot found the following crash on: > > HEAD commit: 4aa9fc2a435a Revert "mm, memory_hotplug: initialize struct.. > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=1101382f400000 > kernel config: https://syzkaller.appspot.com/x/.config?x=4fceea9e2d99ac20 > dashboard link: https://syzkaller.appspot.com/bug?extid=ca95b2b7aef9e7cbd6ab > compiler: gcc (GCC) 9.0.0 20181231 (experimental) > > Unfortunately, I don't have any reproducer for this crash yet. Not understanding. That seems to be saying that we got a NULL pointer deref in __generic_file_write_iter() at written = generic_perform_write(file, from, iocb->ki_pos); which isn't possible. I'm not seeing recent changes in there which could have caused this. Help. > IMPORTANT: if you fix the bug, please add the following tag to the commit: > Reported-by: syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com > > BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 > #PF error: [INSTR] > PGD a7ea0067 P4D a7ea0067 PUD 81535067 PMD 0 > Oops: 0010 [#1] PREEMPT SMP KASAN > CPU: 0 PID: 15924 Comm: syz-executor0 Not tainted 5.0.0-rc4+ #50 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > Google 01/01/2011 > RIP: 0010: (null) > Code: Bad RIP value. > RSP: 0018:ffff88804c3d7858 EFLAGS: 00010246 > RAX: 0000000000000000 RBX: ffffffff885aeb60 RCX: 000000000000005b > RDX: 0000000000000000 RSI: ffff88807ec22930 RDI: ffff8880a59bdcc0 > RBP: ffff88804c3d79b8 R08: 0000000000000000 R09: ffff88804c3d7910 > R10: ffff8880835ca200 R11: 0000000000000000 R12: 000000000000005b > R13: ffff88804c3d7c98 R14: dffffc0000000000 R15: 0000000000000000 > FS: 00007f3456db4700(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: ffffffffffffffd6 CR3: 00000000814ac000 CR4: 00000000001406f0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > __generic_file_write_iter+0x25e/0x630 mm/filemap.c:3333 > ext4_file_write_iter+0x37a/0x1410 fs/ext4/file.c:266 > call_write_iter include/linux/fs.h:1862 [inline] > new_sync_write fs/read_write.c:474 [inline] > __vfs_write+0x764/0xb40 fs/read_write.c:487 > vfs_write+0x20c/0x580 fs/read_write.c:549 > ksys_write+0x105/0x260 fs/read_write.c:598 > __do_sys_write fs/read_write.c:610 [inline] > __se_sys_write fs/read_write.c:607 [inline] > __x64_sys_write+0x73/0xb0 fs/read_write.c:607 > do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 > entry_SYSCALL_64_after_hwframe+0x49/0xbe > RIP: 0033:0x458089 > Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 > 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff > ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 > RSP: 002b:00007f3456db3c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 > RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458089 > RDX: 000000000000005b RSI: 0000000020000240 RDI: 0000000000000003 > RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000246 R12: 00007f3456db46d4 > R13: 00000000004c7450 R14: 00000000004dce68 R15: 00000000ffffffff > Modules linked in: > CR2: 0000000000000000 > ---[ end trace 5cac9d2c75a59916 ]--- > kobject: 'loop5' (000000004426a409): kobject_uevent_env > RIP: 0010: (null) > Code: Bad RIP value. > RSP: 0018:ffff88804c3d7858 EFLAGS: 00010246 > RAX: 0000000000000000 RBX: ffffffff885aeb60 RCX: 000000000000005b > kobject: 'loop5' (000000004426a409): fill_kobj_path: path > = '/devices/virtual/block/loop5' > RDX: 0000000000000000 RSI: ffff88807ec22930 RDI: ffff8880a59bdcc0 > kobject: 'loop2' (00000000b82e0c58): kobject_uevent_env > kobject: 'loop2' (00000000b82e0c58): fill_kobj_path: path > = '/devices/virtual/block/loop2' > RBP: ffff88804c3d79b8 R08: 0000000000000000 R09: ffff88804c3d7910 > R10: ffff8880835ca200 R11: 0000000000000000 R12: 000000000000005b > R13: ffff88804c3d7c98 R14: dffffc0000000000 R15: 0000000000000000 > kobject: 'loop5' (000000004426a409): kobject_uevent_env > FS: 00007f3456db4700(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000 > kobject: 'loop5' (000000004426a409): fill_kobj_path: path > = '/devices/virtual/block/loop5' > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00000000022029a0 CR3: 00000000814ac000 CR4: 00000000001406f0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > > --- > This bug is generated by a bot. It may contain errors. > See https://goo.gl/tpsmEJ for more information about syzbot. > syzbot engineers can be reached at syzkaller@googlegroups.com. > > syzbot will keep track of this bug report. See: > https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with > syzbot. ^ permalink raw reply [flat|nested] 7+ messages in thread
* missing stack trace entry on NULL pointer call [was: Re: BUG: unable to handle kernel NULL pointer dereference in __generic_file_write_iter] 2019-02-21 19:36 ` Andrew Morton @ 2019-02-27 23:13 ` Jann Horn 2019-02-28 12:56 ` Thomas Gleixner 2019-03-01 3:31 ` BUG: unable to handle kernel NULL pointer dereference in __generic_file_write_iter Josh Poimboeuf 1 sibling, 1 reply; 7+ messages in thread From: Jann Horn @ 2019-02-27 23:13 UTC (permalink / raw) To: Andrew Morton, Josh Poimboeuf Cc: syzbot, amir73il, Darrick J. Wong, Dave Chinner, hannes, Hugh Dickins, jrdr.linux, kernel list, Linux-MM, syzkaller-bugs, Matthew Wilcox, Jan Kara, the arch/x86 maintainers +Josh for unwinding, +x86 folks On Wed, Feb 27, 2019 at 11:43 PM Andrew Morton <akpm@linux-foundation.org> wrote: > On Thu, 21 Feb 2019 06:52:04 -0800 syzbot <syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com> wrote: > > > Hello, > > > > syzbot found the following crash on: > > > > HEAD commit: 4aa9fc2a435a Revert "mm, memory_hotplug: initialize struct.. > > git tree: upstream > > console output: https://syzkaller.appspot.com/x/log.txt?x=1101382f400000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=4fceea9e2d99ac20 > > dashboard link: https://syzkaller.appspot.com/bug?extid=ca95b2b7aef9e7cbd6ab > > compiler: gcc (GCC) 9.0.0 20181231 (experimental) > > > > Unfortunately, I don't have any reproducer for this crash yet. > > Not understanding. That seems to be saying that we got a NULL pointer > deref in __generic_file_write_iter() at > > written = generic_perform_write(file, from, iocb->ki_pos); > > which isn't possible. > > I'm not seeing recent changes in there which could have caused this. Help. + Maybe the problem is that the frame pointer unwinder isn't designed to cope with NULL function pointers - or more generally, with an unwinding operation that starts before the function's frame pointer has been set up? Unwinding starts at show_trace_log_lvl(). That begins with unwind_start(), which calls __unwind_start(), which uses get_frame_pointer(), which just returns regs->bp. But that frame pointer points to the part of the stack that's storing the address of the caller of the function that called NULL; the caller of NULL is skipped, as far as I can tell. What's kind of annoying here is that we don't have a proper frame set up yet, we only have half a stack frame (saved RIP but no saved RBP). > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > > Reported-by: syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com > > > > BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 > > #PF error: [INSTR] > > PGD a7ea0067 P4D a7ea0067 PUD 81535067 PMD 0 > > Oops: 0010 [#1] PREEMPT SMP KASAN > > CPU: 0 PID: 15924 Comm: syz-executor0 Not tainted 5.0.0-rc4+ #50 > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > > Google 01/01/2011 > > RIP: 0010: (null) > > Code: Bad RIP value. > > RSP: 0018:ffff88804c3d7858 EFLAGS: 00010246 > > RAX: 0000000000000000 RBX: ffffffff885aeb60 RCX: 000000000000005b > > RDX: 0000000000000000 RSI: ffff88807ec22930 RDI: ffff8880a59bdcc0 > > RBP: ffff88804c3d79b8 R08: 0000000000000000 R09: ffff88804c3d7910 > > R10: ffff8880835ca200 R11: 0000000000000000 R12: 000000000000005b > > R13: ffff88804c3d7c98 R14: dffffc0000000000 R15: 0000000000000000 > > FS: 00007f3456db4700(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: ffffffffffffffd6 CR3: 00000000814ac000 CR4: 00000000001406f0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > Call Trace: > > __generic_file_write_iter+0x25e/0x630 mm/filemap.c:3333 > > ext4_file_write_iter+0x37a/0x1410 fs/ext4/file.c:266 > > call_write_iter include/linux/fs.h:1862 [inline] > > new_sync_write fs/read_write.c:474 [inline] > > __vfs_write+0x764/0xb40 fs/read_write.c:487 > > vfs_write+0x20c/0x580 fs/read_write.c:549 > > ksys_write+0x105/0x260 fs/read_write.c:598 > > __do_sys_write fs/read_write.c:610 [inline] > > __se_sys_write fs/read_write.c:607 [inline] > > __x64_sys_write+0x73/0xb0 fs/read_write.c:607 > > do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 > > entry_SYSCALL_64_after_hwframe+0x49/0xbe > > RIP: 0033:0x458089 > > Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 > > 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff > > ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 > > RSP: 002b:00007f3456db3c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 > > RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458089 > > RDX: 000000000000005b RSI: 0000000020000240 RDI: 0000000000000003 > > RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000 > > R10: 0000000000000000 R11: 0000000000000246 R12: 00007f3456db46d4 > > R13: 00000000004c7450 R14: 00000000004dce68 R15: 00000000ffffffff > > Modules linked in: > > CR2: 0000000000000000 > > ---[ end trace 5cac9d2c75a59916 ]--- > > kobject: 'loop5' (000000004426a409): kobject_uevent_env > > RIP: 0010: (null) > > Code: Bad RIP value. > > RSP: 0018:ffff88804c3d7858 EFLAGS: 00010246 > > RAX: 0000000000000000 RBX: ffffffff885aeb60 RCX: 000000000000005b > > kobject: 'loop5' (000000004426a409): fill_kobj_path: path > > = '/devices/virtual/block/loop5' > > RDX: 0000000000000000 RSI: ffff88807ec22930 RDI: ffff8880a59bdcc0 > > kobject: 'loop2' (00000000b82e0c58): kobject_uevent_env > > kobject: 'loop2' (00000000b82e0c58): fill_kobj_path: path > > = '/devices/virtual/block/loop2' > > RBP: ffff88804c3d79b8 R08: 0000000000000000 R09: ffff88804c3d7910 > > R10: ffff8880835ca200 R11: 0000000000000000 R12: 000000000000005b > > R13: ffff88804c3d7c98 R14: dffffc0000000000 R15: 0000000000000000 > > kobject: 'loop5' (000000004426a409): kobject_uevent_env > > FS: 00007f3456db4700(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000 > > kobject: 'loop5' (000000004426a409): fill_kobj_path: path > > = '/devices/virtual/block/loop5' > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 00000000022029a0 CR3: 00000000814ac000 CR4: 00000000001406f0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > > > > > --- > > This bug is generated by a bot. It may contain errors. > > See https://goo.gl/tpsmEJ for more information about syzbot. > > syzbot engineers can be reached at syzkaller@googlegroups.com. > > > > syzbot will keep track of this bug report. See: > > https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with > > syzbot. > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: missing stack trace entry on NULL pointer call [was: Re: BUG: unable to handle kernel NULL pointer dereference in __generic_file_write_iter] 2019-02-27 23:13 ` missing stack trace entry on NULL pointer call [was: Re: BUG: unable to handle kernel NULL pointer dereference in __generic_file_write_iter] Jann Horn @ 2019-02-28 12:56 ` Thomas Gleixner 2019-02-28 16:34 ` Jann Horn 0 siblings, 1 reply; 7+ messages in thread From: Thomas Gleixner @ 2019-02-28 12:56 UTC (permalink / raw) To: Jann Horn Cc: Andrew Morton, Josh Poimboeuf, syzbot, amir73il, Darrick J. Wong, Dave Chinner, hannes, Hugh Dickins, jrdr.linux, kernel list, Linux-MM, syzkaller-bugs, Matthew Wilcox, Jan Kara, the arch/x86 maintainers On Thu, 28 Feb 2019, Jann Horn wrote: > +Josh for unwinding, +x86 folks > On Wed, Feb 27, 2019 at 11:43 PM Andrew Morton > <akpm@linux-foundation.org> wrote: > > On Thu, 21 Feb 2019 06:52:04 -0800 syzbot <syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com> wrote: > > > > > Hello, > > > > > > syzbot found the following crash on: > > > > > > HEAD commit: 4aa9fc2a435a Revert "mm, memory_hotplug: initialize struct.. > > > git tree: upstream > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1101382f400000 > > > kernel config: https://syzkaller.appspot.com/x/.config?x=4fceea9e2d99ac20 > > > dashboard link: https://syzkaller.appspot.com/bug?extid=ca95b2b7aef9e7cbd6ab > > > compiler: gcc (GCC) 9.0.0 20181231 (experimental) > > > > > > Unfortunately, I don't have any reproducer for this crash yet. > > > > Not understanding. That seems to be saying that we got a NULL pointer > > deref in __generic_file_write_iter() at > > > > written = generic_perform_write(file, from, iocb->ki_pos); > > > > which isn't possible. > > > > I'm not seeing recent changes in there which could have caused this. Help. > > + > > Maybe the problem is that the frame pointer unwinder isn't designed to > cope with NULL function pointers - or more generally, with an > unwinding operation that starts before the function's frame pointer > has been set up? > > Unwinding starts at show_trace_log_lvl(). That begins with > unwind_start(), which calls __unwind_start(), which uses > get_frame_pointer(), which just returns regs->bp. But that frame > pointer points to the part of the stack that's storing the address of > the caller of the function that called NULL; the caller of NULL is > skipped, as far as I can tell. > > What's kind of annoying here is that we don't have a proper frame set > up yet, we only have half a stack frame (saved RIP but no saved RBP). That wreckage is related to the fact that the indirect calls are going through __x86_indirect_thunk_$REG. I just verified on a VM with some other callback NULL'ed that the resulting backtrace is not really helpful. So in that case generic_perform_write() has two indirect calls: mapping->a_ops->write_begin() and ->write_end() Thanks, tglx ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: missing stack trace entry on NULL pointer call [was: Re: BUG: unable to handle kernel NULL pointer dereference in __generic_file_write_iter] 2019-02-28 12:56 ` Thomas Gleixner @ 2019-02-28 16:34 ` Jann Horn 2019-03-01 0:54 ` Jann Horn 0 siblings, 1 reply; 7+ messages in thread From: Jann Horn @ 2019-02-28 16:34 UTC (permalink / raw) To: Thomas Gleixner Cc: Andrew Morton, Josh Poimboeuf, syzbot, amir73il, Darrick J. Wong, Dave Chinner, hannes, Hugh Dickins, jrdr.linux, kernel list, Linux-MM, syzkaller-bugs, Matthew Wilcox, Jan Kara, the arch/x86 maintainers On Thu, Feb 28, 2019 at 1:57 PM Thomas Gleixner <tglx@linutronix.de> wrote: > On Thu, 28 Feb 2019, Jann Horn wrote: > > +Josh for unwinding, +x86 folks > > On Wed, Feb 27, 2019 at 11:43 PM Andrew Morton > > <akpm@linux-foundation.org> wrote: > > > On Thu, 21 Feb 2019 06:52:04 -0800 syzbot <syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com> wrote: > > > > > > > Hello, > > > > > > > > syzbot found the following crash on: > > > > > > > > HEAD commit: 4aa9fc2a435a Revert "mm, memory_hotplug: initialize struct.. > > > > git tree: upstream > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1101382f400000 > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=4fceea9e2d99ac20 > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=ca95b2b7aef9e7cbd6ab > > > > compiler: gcc (GCC) 9.0.0 20181231 (experimental) > > > > > > > > Unfortunately, I don't have any reproducer for this crash yet. > > > > > > Not understanding. That seems to be saying that we got a NULL pointer > > > deref in __generic_file_write_iter() at > > > > > > written = generic_perform_write(file, from, iocb->ki_pos); > > > > > > which isn't possible. > > > > > > I'm not seeing recent changes in there which could have caused this. Help. > > > > + > > > > Maybe the problem is that the frame pointer unwinder isn't designed to > > cope with NULL function pointers - or more generally, with an > > unwinding operation that starts before the function's frame pointer > > has been set up? > > > > Unwinding starts at show_trace_log_lvl(). That begins with > > unwind_start(), which calls __unwind_start(), which uses > > get_frame_pointer(), which just returns regs->bp. But that frame > > pointer points to the part of the stack that's storing the address of > > the caller of the function that called NULL; the caller of NULL is > > skipped, as far as I can tell. > > > > What's kind of annoying here is that we don't have a proper frame set > > up yet, we only have half a stack frame (saved RIP but no saved RBP). > > That wreckage is related to the fact that the indirect calls are going > through __x86_indirect_thunk_$REG. I just verified on a VM with some other > callback NULL'ed that the resulting backtrace is not really helpful. > > So in that case generic_perform_write() has two indirect calls: > > mapping->a_ops->write_begin() and ->write_end() Does the indirect thunk thing really make any difference? When you arrive at RIP=NULL, RSP points to a saved instruction pointer, just like when indirect calls are compiled normally. I just compiled kernels with artificial calls to a NULL function pointer (in prctl_set_seccomp()), with retpoline disabled, with both unwinders. The ORC unwinder shows a call trace with "?" everywhere that doesn't show the caller: [ 228.219140] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 [ 228.223897] #PF error: [INSTR] [ 228.224562] PGD 0 P4D 0 [ 228.225119] Oops: 0010 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN [ 228.226319] CPU: 1 PID: 1099 Comm: artificial_null Not tainted 5.0.0-rc8+ #299 [ 228.227818] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 228.229542] RIP: 0010: (null) [ 228.230331] Code: Bad RIP value. [ 228.231011] RSP: 0018:ffff8881d798fe88 EFLAGS: 00010246 [ 228.232104] RAX: ffffffffffffffda RBX: 0000000000000016 RCX: ffffffffa0368205 [ 228.233599] RDX: dffffc0000000000 RSI: 00007ffde0d71168 RDI: 0000000000000042 [ 228.235077] RBP: 1ffff1103af31fd4 R08: 0000561b50807740 R09: 0000000000000016 [ 228.236557] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000042 [ 228.238039] R13: 0000000000000000 R14: 00007ffde0d71168 R15: 0000561b50807740 [ 228.239517] FS: 00007fe31f1cf700(0000) GS:ffff8881eb040000(0000) knlGS:0000000000000000 [ 228.241213] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 228.242411] CR2: ffffffffffffffd6 CR3: 00000001df8b8004 CR4: 0000000000360ee0 [ 228.243886] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 228.245364] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 228.246841] Call Trace: [ 228.247366] ? __x64_sys_prctl+0x402/0x680 [ 228.248224] ? __ia32_sys_prctl+0x6e0/0x6e0 [ 228.249106] ? __do_page_fault+0x457/0x620 [ 228.249969] ? do_syscall_64+0x6d/0x160 [ 228.250778] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 [...] whereas the FP unwinder shows this, listing prctl_set_seccomp only with a question mark: [ 47.469957] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 [ 47.476973] #PF error: [INSTR] [ 47.477742] PGD 0 P4D 0 [ 47.478341] Oops: 0010 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN [ 47.479703] CPU: 4 PID: 1079 Comm: artificial_null Not tainted 5.0.0-rc8+ #300 [ 47.481489] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 47.483442] RIP: 0010: (null) [ 47.484328] Code: Bad RIP value. [ 47.485085] RSP: 0018:ffff8881e01f7e70 EFLAGS: 00010246 [ 47.486358] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: ffffffffafbf007a [ 47.488090] RDX: dffffc0000000000 RSI: 00007ffe164b4f28 RDI: 0000000000000042 [ 47.489862] RBP: ffff8881e01f7e78 R08: 0000562942136740 R09: 0000000000000016 [ 47.491491] R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff1103c03efd3 [ 47.493144] R13: 0000000000000042 R14: 00007ffe164b4f28 R15: 0000000000000016 [ 47.494795] FS: 00007fa38b1d6700(0000) GS:ffff8881eb300000(0000) knlGS:0000000000000000 [ 47.496638] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 47.497981] CR2: ffffffffffffffd6 CR3: 00000001e0e4c006 CR4: 0000000000360ee0 [ 47.499623] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 47.501252] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 47.502927] Call Trace: [ 47.503501] ? prctl_set_seccomp+0x3a/0x50 [ 47.504450] __x64_sys_prctl+0x457/0x6f0 [ 47.505349] ? __ia32_sys_prctl+0x750/0x750 [ 47.506352] do_syscall_64+0x72/0x160 [ 47.507214] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [...] Looking back at the syzkaller report, the command line output (https://syzkaller.appspot.com/x/log.txt?x=1101382f400000) has this: [ 375.092788] Call Trace: [ 375.095378] ? generic_perform_write+0x2a0/0x6b0 [ 375.100150] ? add_page_wait_queue+0x480/0x480 [ 375.104744] ? current_time+0x1b0/0x1b0 [ 375.108727] ? generic_write_check_limits+0x380/0x380 [ 375.113942] ? ext4_file_write_iter+0x28b/0x1410 {some non-dmesg output here} [ 375.118711] __generic_file_write_iter+0x25e/0x630 [ 375.123714] ext4_file_write_iter+0x37a/0x1410 The first entry with a question mark is *RSP, the real caller; that's generic_perform_write(), as expected. The rest is probably just random garbage that happened to still be on the stack. It looks like syzkaller strips out trace elements with question marks in front. So I think this doesn't really have anything to do with __x86_indirect_thunk_$REG, and the best possible fix might be to teach the unwinders that RIP==NULL means "pretend that RIP is *real_RSP and that RSP is real_RSP+8, and report *real_RSP as the first element of the backtrace". ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: missing stack trace entry on NULL pointer call [was: Re: BUG: unable to handle kernel NULL pointer dereference in __generic_file_write_iter] 2019-02-28 16:34 ` Jann Horn @ 2019-03-01 0:54 ` Jann Horn 0 siblings, 0 replies; 7+ messages in thread From: Jann Horn @ 2019-03-01 0:54 UTC (permalink / raw) To: Thomas Gleixner Cc: Andrew Morton, Josh Poimboeuf, syzbot, Amir Goldstein, Darrick J. Wong, Dave Chinner, hannes, Hugh Dickins, Souptick Joarder, kernel list, Linux-MM, syzkaller-bugs, Matthew Wilcox, Jan Kara, the arch/x86 maintainers On Thu, Feb 28, 2019 at 5:34 PM Jann Horn <jannh@google.com> wrote: > > On Thu, Feb 28, 2019 at 1:57 PM Thomas Gleixner <tglx@linutronix.de> wrote: > > On Thu, 28 Feb 2019, Jann Horn wrote: > > > +Josh for unwinding, +x86 folks > > > On Wed, Feb 27, 2019 at 11:43 PM Andrew Morton > > > <akpm@linux-foundation.org> wrote: > > > > On Thu, 21 Feb 2019 06:52:04 -0800 syzbot <syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com> wrote: > > > > > > > > > Hello, > > > > > > > > > > syzbot found the following crash on: > > > > > > > > > > HEAD commit: 4aa9fc2a435a Revert "mm, memory_hotplug: initialize struct.. > > > > > git tree: upstream > > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1101382f400000 > > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=4fceea9e2d99ac20 > > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=ca95b2b7aef9e7cbd6ab > > > > > compiler: gcc (GCC) 9.0.0 20181231 (experimental) > > > > > > > > > > Unfortunately, I don't have any reproducer for this crash yet. > > > > > > > > Not understanding. That seems to be saying that we got a NULL pointer > > > > deref in __generic_file_write_iter() at > > > > > > > > written = generic_perform_write(file, from, iocb->ki_pos); > > > > > > > > which isn't possible. > > > > > > > > I'm not seeing recent changes in there which could have caused this. Help. > > > > > > + > > > > > > Maybe the problem is that the frame pointer unwinder isn't designed to > > > cope with NULL function pointers - or more generally, with an > > > unwinding operation that starts before the function's frame pointer > > > has been set up? > > > > > > Unwinding starts at show_trace_log_lvl(). That begins with > > > unwind_start(), which calls __unwind_start(), which uses > > > get_frame_pointer(), which just returns regs->bp. But that frame > > > pointer points to the part of the stack that's storing the address of > > > the caller of the function that called NULL; the caller of NULL is > > > skipped, as far as I can tell. > > > > > > What's kind of annoying here is that we don't have a proper frame set > > > up yet, we only have half a stack frame (saved RIP but no saved RBP). > > > > That wreckage is related to the fact that the indirect calls are going > > through __x86_indirect_thunk_$REG. I just verified on a VM with some other > > callback NULL'ed that the resulting backtrace is not really helpful. > > > > So in that case generic_perform_write() has two indirect calls: > > > > mapping->a_ops->write_begin() and ->write_end() > > Does the indirect thunk thing really make any difference? When you > arrive at RIP=NULL, RSP points to a saved instruction pointer, just > like when indirect calls are compiled normally. > > I just compiled kernels with artificial calls to a NULL function > pointer (in prctl_set_seccomp()), with retpoline disabled, with both > unwinders. The ORC unwinder shows a call trace with "?" everywhere > that doesn't show the caller: [...] > So I think this doesn't really have anything to do with > __x86_indirect_thunk_$REG, and the best possible fix might be to teach > the unwinders that RIP==NULL means "pretend that RIP is *real_RSP and > that RSP is real_RSP+8, and report *real_RSP as the first element of > the backtrace". Cooking up some patches now... ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: BUG: unable to handle kernel NULL pointer dereference in __generic_file_write_iter 2019-02-21 19:36 ` Andrew Morton 2019-02-27 23:13 ` missing stack trace entry on NULL pointer call [was: Re: BUG: unable to handle kernel NULL pointer dereference in __generic_file_write_iter] Jann Horn @ 2019-03-01 3:31 ` Josh Poimboeuf 1 sibling, 0 replies; 7+ messages in thread From: Josh Poimboeuf @ 2019-03-01 3:31 UTC (permalink / raw) To: Andrew Morton Cc: syzbot, amir73il, darrick.wong, david, hannes, hughd, jrdr.linux, linux-kernel, linux-mm, syzkaller-bugs, willy, Jan Kara On Thu, Feb 21, 2019 at 11:36:24AM -0800, Andrew Morton wrote: > On Thu, 21 Feb 2019 06:52:04 -0800 syzbot <syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com> wrote: > > > Hello, > > > > syzbot found the following crash on: > > > > HEAD commit: 4aa9fc2a435a Revert "mm, memory_hotplug: initialize struct.. > > git tree: upstream > > console output: https://syzkaller.appspot.com/x/log.txt?x=1101382f400000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=4fceea9e2d99ac20 > > dashboard link: https://syzkaller.appspot.com/bug?extid=ca95b2b7aef9e7cbd6ab > > compiler: gcc (GCC) 9.0.0 20181231 (experimental) > > > > Unfortunately, I don't have any reproducer for this crash yet. > > Not understanding. That seems to be saying that we got a NULL pointer > deref in __generic_file_write_iter() at > > written = generic_perform_write(file, from, iocb->ki_pos); > > which isn't possible. > > I'm not seeing recent changes in there which could have caused this. Help. FWIW, the panic happened in generic_perform_write() when it called a_ops->write_begin, which was NULL. I agree with Jann that the unwinders should handle this scenario better. -- Josh ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2019-03-01 3:31 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-02-21 14:52 BUG: unable to handle kernel NULL pointer dereference in __generic_file_write_iter syzbot 2019-02-21 19:36 ` Andrew Morton 2019-02-27 23:13 ` missing stack trace entry on NULL pointer call [was: Re: BUG: unable to handle kernel NULL pointer dereference in __generic_file_write_iter] Jann Horn 2019-02-28 12:56 ` Thomas Gleixner 2019-02-28 16:34 ` Jann Horn 2019-03-01 0:54 ` Jann Horn 2019-03-01 3:31 ` BUG: unable to handle kernel NULL pointer dereference in __generic_file_write_iter Josh Poimboeuf
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).