From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756406AbdAEREj (ORCPT );
	Thu, 5 Jan 2017 12:04:39 -0500
Received: from mx1.redhat.com ([209.132.183.28]:58222 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753237AbdAERDo (ORCPT );
	Thu, 5 Jan 2017 12:03:44 -0500
Date: Thu, 5 Jan 2017 11:02:14 -0600
From: Josh Poimboeuf
To: Dave Jones
Cc: Linux Kernel
Subject: Re: stack unwinder warning.
Message-ID: <20170105170214.sqf6rjywjgtxpbr2@treble>
References: <20161227190030.dlyn24um2mcpbuzu@codemonkey.org.uk>
 <20170105145249.v2q3r3i46vzkg4au@treble>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20170105145249.v2q3r3i46vzkg4au@treble>
User-Agent: Mutt/1.6.0.1 (2016-04-01)
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Thu, 05 Jan 2017 17:02:16 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jan 05, 2017 at 08:52:49AM -0600, Josh Poimboeuf wrote:
> On Tue, Dec 27, 2016 at 02:00:30PM -0500, Dave Jones wrote:
> > I'm not sure what to make of this. Josh ? (4.10-rc1)
> >
> > WARNING: kernel stack frame pointer at ffffc900003e7858 in trinity-c6:29122 has bad value ffffffff82103a80
> > unwind stack type:0 next_sp: (null) mask:2 graph_idx:0
> > ffffc900003e7808: ffffffff811a02e5 (ring_buffer_lock_reserve+0x1d5/0x580)
> > ffffc900003e7810: ffffffff8119adc3 (rb_commit+0x93/0x350)
> > ffffc900003e7818: ffffffff811b31d4 (function_trace_call+0x104/0x1f0)
> > ffffc900003e7820: ffff8804f10ec000 (0xffff8804f10ec000)
> > ffffc900003e7828: 0000000000000000 ...
> > ffffc900003e7830: ffffffff8119b3ae (ring_buffer_unlock_commit+0x8e/0x120)
> > ffffc900003e7838: 0000000000000001 (0x1)
> > ffffc900003e7840: ffffea0002854e00 (0xffffea0002854e00)
> > ffffc900003e7848: 000000000000000a (0xa)
> > ffffc900003e7850: ffffea0002854ec0 (0xffffea0002854ec0)
> > ffffc900003e7858: ffffea000287c480 (0xffffea000287c480)
>
> The value reported by the warning contradicts the value reported by the
> dump. So this seems to have been caused by dumping the stack of a task
> which is running on another CPU. There are still some places in the
> code where that's possible. So I'm going to need to remove these
> unwinder warnings for now.

I'll be submitting the following patch soon, which I think should
silence the warning. If the warning is recreatable, would you mind
testing it?

diff --git a/arch/x86/kernel/unwind_frame.c b/arch/x86/kernel/unwind_frame.c
index 4443e49..6fda186 100644
--- a/arch/x86/kernel/unwind_frame.c
+++ b/arch/x86/kernel/unwind_frame.c
@@ -207,6 +207,16 @@ bool unwind_next_frame(struct unwind_state *state)
 	return true;
 
 bad_address:
+	/*
+	 * When dumping a task other than current, the task might actually be
+	 * running on another CPU, in which case it could be modifying its
+	 * stack while we're reading it. This is generally not a problem and
+	 * can be ignored as long as the caller understands that unwinding
+	 * another task will not always succeed.
+	 */
+	if (state->task != current)
+		goto the_end;
+
 	if (state->regs) {
 		printk_deferred_once(KERN_WARNING
 			"WARNING: kernel stack regs at %p in %s:%d has bad 'bp' value %p\n",
-- 
2.7.4
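
For illustration only, and not part of the patch: a minimal userspace
sketch of the decision the hunk above adds at the bad_address label.
The types, the enum and the on_bad_address() helper are hypothetical
stand-ins, not the kernel's definitions, and the frame-pointer warning
branch is assumed from the warning text in the original report rather
than from the diff context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct task {
	int pid;
};

struct unwind_state {
	const struct task *task;	/* task whose stack is being walked */
	bool has_regs;			/* models state->regs != NULL */
};

enum bad_addr_action {
	STOP_QUIETLY,		/* end the unwind without printing anything */
	WARN_BAD_REGS,		/* "kernel stack regs ... has bad 'bp' value" */
	WARN_BAD_FRAME_PTR,	/* assumed: "kernel stack frame pointer ... has bad value" */
};

/*
 * Models what happens once a bad address has been found.
 * 'current_task' stands in for the kernel's 'current'.
 */
static enum bad_addr_action on_bad_address(const struct unwind_state *state,
					   const struct task *current_task)
{
	assert(state != NULL);

	/*
	 * A task other than current may be running on another CPU and
	 * rewriting its stack under us, so a bad value is not worth a warning.
	 */
	if (state->task != current_task)
		return STOP_QUIETLY;

	return state->has_regs ? WARN_BAD_REGS : WARN_BAD_FRAME_PTR;
}
```

With this model, unwinding any task other than the one passed as
current_task always yields STOP_QUIETLY, so a warning can only be
reported for the task doing the unwinding.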