From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753081AbcBZAiQ (ORCPT <rfc822;w@1wt.eu>);
	Thu, 25 Feb 2016 19:38:16 -0500
Received: from mail-io0-f174.google.com ([209.85.223.174]:33279 "EHLO
	mail-io0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752607AbcBZAiO (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Thu, 25 Feb 2016 19:38:14 -0500
MIME-Version: 1.0
In-Reply-To: <56CF72EA.9040009@suse.cz>
References: <20160217203730.GA14820@kroah.com>
	<56CED373.9060603@suse.cz>
	<56CF4A83.3040408@hurleysoftware.com>
	<CA+55aFyeVmnNuk5pPoH05uPKZRSXt1hv_0PWuvndptqBSfPrbA@mail.gmail.com>
	<56CF64C9.8050705@hurleysoftware.com>
	<CA+55aFx306jsTaUkm_c4nJtEo=A3vdDFYqqpLJj1zKUh=wLxog@mail.gmail.com>
	<56CF72EA.9040009@suse.cz>
Date: Thu, 25 Feb 2016 16:38:13 -0800
X-Google-Sender-Auth: U7HcAp9S7orhKqpUid4OkN5hIQQ
Message-ID: <CA+55aFzQzCKhX73bUYwNMK8Dd9ECtmav8u7ds2aXKjEo7DV3Gg@mail.gmail.com>
Subject: Re: BUG: unable to handle kernel paging request from pty_write [was:
 Linux 4.4.2]
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Jiri Slaby <jslaby@suse.cz>
Cc: Peter Hurley <peter@hurleysoftware.com>,
        Greg KH <gregkh@linuxfoundation.org>,
        Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
        Andrew Morton <akpm@linux-foundation.org>,
        stable <stable@vger.kernel.org>, lwn@lwn.net,
        Steven Rostedt <rostedt@goodmis.org>
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Feb 25, 2016 at 1:32 PM, Jiri Slaby <jslaby@suse.cz> wrote:
>
> Interestingly, RBP contains address inside try_to_wake_up --
> ffffffff810a535a (dunno why) which is:
> ffffffff810a5355:       e8 66 a0 ff ff          callq  ffffffff8109f3c0 <ttwu_stat>
> ffffffff810a535a:       e9 9d fe ff ff          jmpq   ffffffff810a51fc <try_to_wake_up+0x3c>
>
> ttwu_stat does in the beginning:
> mov    $0x16e80,%r14
>
> which is what we actually still have in r14 when it crashes. The first
> ttwu_stat's "if" has to go through the true branch (otherwise r14 would
> be overwritten).

Hmm. That does sound very much like it might be ttwu_stat() that has
gotten the stack frame wrong, and when it exits, it does

        popq    %rbp
        ret

but in fact the popq took the return address into %rbp, and the ret
then returned to a crazy address.

Which sounds like a corrupted stack pointer (not a corrupted stack).
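
As a rough illustration of that failure mode (a toy model in Python, not
kernel code): the return address below is the one from the report, while
the stack addresses, the saved %rbp value, and the caller's stack slot are
made-up values chosen only for the sketch. It assumes %rsp ends up 8 bytes
too high by the time the epilogue runs.

```python
# Toy model of "popq %rbp; ret" with a skewed stack pointer.
RETURN_ADDR = 0xffffffff810a535a  # return address into try_to_wake_up (from the report)
SAVED_RBP   = 0xffff880012345678  # hypothetical caller frame pointer
CALLER_SLOT = 0x0000000000016e80  # hypothetical data sitting in the caller's frame


def run_epilogue(rsp_skew):
    """Return (rbp, rip) after the epilogue, with %rsp off by rsp_skew bytes."""
    rsp = 0x1000                      # hypothetical stack pointer before the call
    memory = {rsp: CALLER_SLOT}       # whatever the caller had on top of its stack

    rsp -= 8                          # call: push the return address
    memory[rsp] = RETURN_ADDR
    rsp -= 8                          # pushq %rbp
    memory[rsp] = SAVED_RBP

    rsp += rsp_skew                   # something corrupts %rsp (not the stack contents)

    rbp = memory[rsp]                 # popq %rbp
    rsp += 8
    rip = memory[rsp]                 # ret
    rsp += 8
    return rbp, rip


if __name__ == "__main__":
    for skew in (0, 8):
        rbp, rip = run_epilogue(skew)
        print(f"skew={skew}: rbp={rbp:#x} rip={rip:#x}")
```

With no skew, %rbp and the return target come back as expected. With a skew
of 8, %rbp ends up holding the return address inside try_to_wake_up, and the
ret jumps to whatever happened to be in the caller's stack slot.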

Can you make just the "vmlinux" file available somewhere?

In my own private configuration, ttwu_stat() doesn't actually touch
the stack at all - no stack pointer action anywhere except for the
prologue and epilogue:

ttwu_stat:
1:      call    __fentry__
        pushq   %rbp
   ..
        movq    %rsp, %rbp      #,

 .....

        popq    %rbp
        ret

but yeah, as Peter says, maybe an exception screwed up %rsp somehow..

I really don't see how it would happen here - that code doesn't look
particularly odd.

And the fentry code used by the function tracer can certainly screw
things up, but even that would be hard-pressed to screw up %rbp, since
the saving of rbp comes *after* fentry. Old pre-__fentry__ gcc
versions had a much higher likelihood of doing that (the whole mcount
thing is a disaster, but I'm assuming you have a compiler that does
__fentry__ and have CC_USING_FENTRY set?)

               Linus