All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jann Horn <jannh@google.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	syzbot <syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com>,
	amir73il@gmail.com, "Darrick J. Wong" <darrick.wong@oracle.com>,
	Dave Chinner <david@fromorbit.com>,
	hannes@cmpxchg.org, Hugh Dickins <hughd@google.com>,
	jrdr.linux@gmail.com, kernel list <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	syzkaller-bugs <syzkaller-bugs@googlegroups.com>,
	Matthew Wilcox <willy@infradead.org>, Jan Kara <jack@suse.cz>,
	"the arch/x86 maintainers" <x86@kernel.org>
Subject: Re: missing stack trace entry on NULL pointer call [was: Re: BUG: unable to handle kernel NULL pointer dereference in __generic_file_write_iter]
Date: Thu, 28 Feb 2019 17:34:16 +0100	[thread overview]
Message-ID: <CAG48ez2huzOwKvH5qVGaGeWOKRDX8qr_9keHBcZCyBaw85ed-g@mail.gmail.com> (raw)
In-Reply-To: <alpine.DEB.2.21.1902281248400.1821@nanos.tec.linutronix.de>

On Thu, Feb 28, 2019 at 1:57 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> On Thu, 28 Feb 2019, Jann Horn wrote:
> > +Josh for unwinding, +x86 folks
> > On Wed, Feb 27, 2019 at 11:43 PM Andrew Morton
> > <akpm@linux-foundation.org> wrote:
> > > On Thu, 21 Feb 2019 06:52:04 -0800 syzbot <syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com> wrote:
> > >
> > > > Hello,
> > > >
> > > > syzbot found the following crash on:
> > > >
> > > > HEAD commit:    4aa9fc2a435a Revert "mm, memory_hotplug: initialize struct..
> > > > git tree:       upstream
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1101382f400000
> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=4fceea9e2d99ac20
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=ca95b2b7aef9e7cbd6ab
> > > > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > > >
> > > > Unfortunately, I don't have any reproducer for this crash yet.
> > >
> > > Not understanding.  That seems to be saying that we got a NULL pointer
> > > deref in __generic_file_write_iter() at
> > >
> > >                 written = generic_perform_write(file, from, iocb->ki_pos);
> > >
> > > which isn't possible.
> > >
> > > I'm not seeing recent changes in there which could have caused this.  Help.
> >
> > +
> >
> > Maybe the problem is that the frame pointer unwinder isn't designed to
> > cope with NULL function pointers - or more generally, with an
> > unwinding operation that starts before the function's frame pointer
> > has been set up?
> >
> > Unwinding starts at show_trace_log_lvl(). That begins with
> > unwind_start(), which calls __unwind_start(), which uses
> > get_frame_pointer(), which just returns regs->bp. But that frame
> > pointer points to the part of the stack that's storing the address of
> > the caller of the function that called NULL; the caller of NULL is
> > skipped, as far as I can tell.
> >
> > What's kind of annoying here is that we don't have a proper frame set
> > up yet, we only have half a stack frame (saved RIP but no saved RBP).
>
> That wreckage is related to the fact that the indirect calls are going
> through __x86_indirect_thunk_$REG. I just verified on a VM with some other
> callback NULL'ed that the resulting backtrace is not really helpful.
>
> So in that case generic_perform_write() has two indirect calls:
>
>   mapping->a_ops->write_begin() and ->write_end()

Does the indirect thunk thing really make any difference? When you
arrive at RIP=NULL, RSP points to a saved instruction pointer, just
like when indirect calls are compiled normally.

I just compiled kernels with artificial calls to a NULL function
pointer (in prctl_set_seccomp()), with retpoline disabled, with both
unwinders. The ORC unwinder shows a call trace with "?" everywhere
that doesn't show the caller:

[  228.219140] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000000
[  228.223897] #PF error: [INSTR]
[  228.224562] PGD 0 P4D 0
[  228.225119] Oops: 0010 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
[  228.226319] CPU: 1 PID: 1099 Comm: artificial_null Not tainted
5.0.0-rc8+ #299
[  228.227818] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.10.2-1 04/01/2014
[  228.229542] RIP: 0010:          (null)
[  228.230331] Code: Bad RIP value.
[  228.231011] RSP: 0018:ffff8881d798fe88 EFLAGS: 00010246
[  228.232104] RAX: ffffffffffffffda RBX: 0000000000000016 RCX: ffffffffa0368205
[  228.233599] RDX: dffffc0000000000 RSI: 00007ffde0d71168 RDI: 0000000000000042
[  228.235077] RBP: 1ffff1103af31fd4 R08: 0000561b50807740 R09: 0000000000000016
[  228.236557] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000042
[  228.238039] R13: 0000000000000000 R14: 00007ffde0d71168 R15: 0000561b50807740
[  228.239517] FS:  00007fe31f1cf700(0000) GS:ffff8881eb040000(0000)
knlGS:0000000000000000
[  228.241213] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  228.242411] CR2: ffffffffffffffd6 CR3: 00000001df8b8004 CR4: 0000000000360ee0
[  228.243886] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  228.245364] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  228.246841] Call Trace:
[  228.247366]  ? __x64_sys_prctl+0x402/0x680
[  228.248224]  ? __ia32_sys_prctl+0x6e0/0x6e0
[  228.249106]  ? __do_page_fault+0x457/0x620
[  228.249969]  ? do_syscall_64+0x6d/0x160
[  228.250778]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[...]

whereas the FP unwinder shows this, listing prctl_set_seccomp only
with a question mark:

[   47.469957] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000000
[   47.476973] #PF error: [INSTR]
[   47.477742] PGD 0 P4D 0
[   47.478341] Oops: 0010 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
[   47.479703] CPU: 4 PID: 1079 Comm: artificial_null Not tainted
5.0.0-rc8+ #300
[   47.481489] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.10.2-1 04/01/2014
[   47.483442] RIP: 0010:          (null)
[   47.484328] Code: Bad RIP value.
[   47.485085] RSP: 0018:ffff8881e01f7e70 EFLAGS: 00010246
[   47.486358] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: ffffffffafbf007a
[   47.488090] RDX: dffffc0000000000 RSI: 00007ffe164b4f28 RDI: 0000000000000042
[   47.489862] RBP: ffff8881e01f7e78 R08: 0000562942136740 R09: 0000000000000016
[   47.491491] R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff1103c03efd3
[   47.493144] R13: 0000000000000042 R14: 00007ffe164b4f28 R15: 0000000000000016
[   47.494795] FS:  00007fa38b1d6700(0000) GS:ffff8881eb300000(0000)
knlGS:0000000000000000
[   47.496638] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   47.497981] CR2: ffffffffffffffd6 CR3: 00000001e0e4c006 CR4: 0000000000360ee0
[   47.499623] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   47.501252] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   47.502927] Call Trace:
[   47.503501]  ? prctl_set_seccomp+0x3a/0x50
[   47.504450]  __x64_sys_prctl+0x457/0x6f0
[   47.505349]  ? __ia32_sys_prctl+0x750/0x750
[   47.506352]  do_syscall_64+0x72/0x160
[   47.507214]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[...]

Looking back at the syzkaller report, the command line output
(https://syzkaller.appspot.com/x/log.txt?x=1101382f400000) has this:

[  375.092788] Call Trace:
[  375.095378]  ? generic_perform_write+0x2a0/0x6b0
[  375.100150]  ? add_page_wait_queue+0x480/0x480
[  375.104744]  ? current_time+0x1b0/0x1b0
[  375.108727]  ? generic_write_check_limits+0x380/0x380
[  375.113942]  ? ext4_file_write_iter+0x28b/0x1410
{some non-dmesg output here}
[  375.118711]  __generic_file_write_iter+0x25e/0x630
[  375.123714]  ext4_file_write_iter+0x37a/0x1410

The first entry with a question mark is *RSP, the real caller; that's
generic_perform_write(), as expected. The rest is probably just random
garbage that happened to still be on the stack. It looks like
syzkaller strips out trace elements with question marks in front.


So I think this doesn't really have anything to do with
__x86_indirect_thunk_$REG, and the best possible fix might be to teach
the unwinders that RIP==NULL means "pretend that RIP is *real_RSP and
that RSP is real_RSP+8, and report *real_RSP as the first element of
the backtrace".

  reply	other threads:[~2019-02-28 16:34 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-21 14:52 BUG: unable to handle kernel NULL pointer dereference in __generic_file_write_iter syzbot
2019-02-21 14:52 ` syzbot
2019-02-21 19:36 ` Andrew Morton
2019-02-27 23:13   ` missing stack trace entry on NULL pointer call [was: Re: BUG: unable to handle kernel NULL pointer dereference in __generic_file_write_iter] Jann Horn
2019-02-27 23:13     ` Jann Horn
2019-02-28 12:56     ` Thomas Gleixner
2019-02-28 12:56       ` Thomas Gleixner
2019-02-28 16:34       ` Jann Horn [this message]
2019-02-28 16:34         ` Jann Horn
2019-03-01  0:54         ` Jann Horn
2019-03-01  0:54           ` Jann Horn
2019-03-01  3:31   ` BUG: unable to handle kernel NULL pointer dereference in __generic_file_write_iter Josh Poimboeuf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAG48ez2huzOwKvH5qVGaGeWOKRDX8qr_9keHBcZCyBaw85ed-g@mail.gmail.com \
    --to=jannh@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=amir73il@gmail.com \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=jpoimboe@redhat.com \
    --cc=jrdr.linux@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=tglx@linutronix.de \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.