Date: Thu, 25 Feb 2016 16:38:13 -0800
Subject: Re: BUG: unable to handle kernel paging request from pty_write [was: Linux 4.4.2]
From: Linus Torvalds
To: Jiri Slaby
Cc: Peter Hurley, Greg KH, Linux Kernel Mailing List, Andrew Morton, stable, lwn@lwn.net, Steven Rostedt
In-Reply-To: <56CF72EA.9040009@suse.cz>

On Thu, Feb 25, 2016 at 1:32 PM, Jiri Slaby wrote:
>
> Interestingly, RBP contains address inside try_to_wake_up --
> ffffffff810a535a (dunno why) which is:
> ffffffff810a5355:       e8 66 a0 ff ff          callq  ffffffff8109f3c0
>
> ffffffff810a535a:       e9 9d fe ff ff          jmpq   ffffffff810a51fc
>
>
> ttwu_stat does in the beginning:
>    mov    $0x16e80,%r14
>
> which is what we actually still have in r14 when it crashes. The first
> ttwu_stat's "if" has to go through the true branch (otherwise r14 would
> be overwritten).

Hmm. That does sound very much like it might be ttwu_stat() that has
gotten the stack frame wrong, and when it exits, it does

        popq    %rbp
        ret

but in fact the popq popped the return address (into %rbp), and the ret
then returned to a crazy address.

Which sounds like a corrupted stack pointer (not a corrupted stack).

Can you make just the "vmlinux" file available somewhere?
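To make that failure mode concrete, here is a minimal sketch: a toy Python model of the stack, not kernel code. It uses the two instruction addresses and byte sequences from the disassembly quoted above. The saved %rbp value, the stray word above the return address, and the 8-byte skew in %rsp are all hypothetical values picked for illustration; nothing in the report says the skew was exactly one slot.

```python
# Toy model of an x86-64 stack: illustrative only, not kernel code.
# CALL_ADDR and the instruction bytes come from the quoted disassembly;
# CALLER_RBP, STRAY_WORD and the 8-byte skew are made-up assumptions.

CALL_ADDR = 0xffffffff810a5355        # address of the callq in try_to_wake_up
CALL_LEN = 5                          # opcode e8 plus a 32-bit displacement
RETURN_ADDR = CALL_ADDR + CALL_LEN    # the value callq pushes on the stack

CALLER_RBP = 0xffff88000000beef       # hypothetical caller frame pointer
STRAY_WORD = 0xdeadbeefdeadbeef       # hypothetical word above the return address


def rel32_target(insn_addr, insn_bytes):
    """Decode the target of a 5-byte call/jmp rel32 instruction."""
    rel = int.from_bytes(insn_bytes[1:5], "little", signed=True)
    return (insn_addr + len(insn_bytes) + rel) & 0xffffffffffffffff


def run_callee(rsp_skew):
    """Model 'pushq %rbp; ...; popq %rbp; ret' with %rsp off by rsp_skew at exit.

    Returns the pair (rbp_after, rip_after).
    """
    mem = {}
    rsp = 0x1000
    mem[rsp] = STRAY_WORD             # whatever the caller had on its stack

    rsp -= 8
    mem[rsp] = RETURN_ADDR            # callq pushes the return address

    rsp -= 8
    mem[rsp] = CALLER_RBP             # pushq %rbp

    rsp += rsp_skew                   # hypothetical corruption of %rsp

    rbp = mem[rsp]                    # popq %rbp
    rsp += 8

    rip = mem[rsp]                    # ret pops the next word into %rip
    rsp += 8
    return rbp, rip


if __name__ == "__main__":
    for skew in (0, 8):
        rbp, rip = run_callee(skew)
        print(f"skew={skew}: rbp={rbp:#x} rip={rip:#x}")
```

With a skew of 0 the callee restores the caller's %rbp and returns to the instruction after the callq. With a skew of 8 the popq loads the return address into %rbp, which matches the ffffffff810a535a value in the report (0xffffffff810a5355 + 5), and the ret jumps to whatever word happened to sit above it.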

In my own private configuration, ttwu_stat() doesn't actually touch
the stack at all - no stack pointer action anywhere except for the

   ttwu_stat:
   1:      call    __fentry__
           pushq   %rbp
           ..
           movq    %rsp, %rbp      #,
           .....
           popq    %rbp
           ret

but yeah, as Peter says, maybe an exception screwed up %rsp somehow..

I really don't see how it would happen here - that code doesn't look
particularly odd.

And the fentry code used by the function tracer can certainly screw
things up, but even that would be hard-pressed to screw up %rbp, since
the saving of %rbp comes *after* fentry. Old pre-__fentry__ gcc versions
had a much higher likelihood of that (the whole mcount thing is a
disaster, but I'm assuming you have a compiler that does __fentry__ and
have CC_USING_FENTRY set?)

               Linus