linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <greg@kroah.com>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Michael Ellerman <michael@ellerman.id.au>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>
Subject: [ 25/26] powerpc: Fix stack overflow crash in resume_kernel when ftracing
Date: Tue, 18 Jun 2013 09:15:05 -0700	[thread overview]
Message-ID: <20130618161238.626277186@linuxfoundation.org> (raw)
In-Reply-To: <20130618161231.154881788@linuxfoundation.org>

From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

3.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Michael Ellerman <michael@ellerman.id.au>

commit 0e37739b1c96d65e6433998454985de994383019 upstream.

It's possible for us to crash when running with ftrace enabled, eg:

  Bad kernel stack pointer bffffd12 at c00000000000a454
  cpu 0x3: Vector: 300 (Data Access) at [c00000000ffe3d40]
      pc: c00000000000a454: resume_kernel+0x34/0x60
      lr: c00000000000335c: performance_monitor_common+0x15c/0x180
      sp: bffffd12
     msr: 8000000000001032
     dar: bffffd12
   dsisr: 42000000

If we look at current's stack (paca->__current->stack) we see it is
equal to c0000002ecab0000. Our stack is 16K, and comparing to
paca->kstack (c0000002ecab3e30) we can see that we have overflowed our
kernel stack. This leads to us writing over our struct thread_info, and
in this case we have corrupted thread_info->flags and set
_TIF_EMULATE_STACK_STORE.

Dumping the stack we see:

  3:mon> t c0000002ecab0000
  [c0000002ecab0000] c00000000002131c .performance_monitor_exception+0x5c/0x70
  [c0000002ecab0080] c00000000000335c performance_monitor_common+0x15c/0x180
  --- Exception: f01 (Performance Monitor) at c0000000000fb2ec .trace_hardirqs_off+0x1c/0x30
  [c0000002ecab0370] c00000000016fdb0 .trace_graph_entry+0xb0/0x280 (unreliable)
  [c0000002ecab0410] c00000000003d038 .prepare_ftrace_return+0x98/0x130
  [c0000002ecab04b0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
  [c0000002ecab0520] c0000000000d6b58 .idle_cpu+0x18/0x90
  [c0000002ecab05a0] c00000000000a934 .return_to_handler+0x0/0x34
  [c0000002ecab0620] c00000000001e660 .timer_interrupt+0x160/0x300
  [c0000002ecab06d0] c0000000000025dc decrementer_common+0x15c/0x180
  --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0
  [c0000002ecab09c0] c0000000000fe044 .trace_hardirqs_on+0x14/0x30 (unreliable)
  [c0000002ecab0fb0] c00000000016fe3c .trace_graph_entry+0x13c/0x280
  [c0000002ecab1050] c00000000003d038 .prepare_ftrace_return+0x98/0x130
  [c0000002ecab10f0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
  [c0000002ecab1160] c0000000000161f0 .__ppc64_runlatch_on+0x10/0x40
  [c0000002ecab11d0] c00000000000a934 .return_to_handler+0x0/0x34
  --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0

  ... and so on

__ppc64_runlatch_on() is called from RUNLATCH_ON in the exception entry
path. At that point the irq state is not consistent, ie. interrupts are
hard disabled (by the exception entry), but the paca soft-enabled flag
may be out of sync.

This leads to the local_irq_restore() in trace_graph_entry() actually
enabling interrupts, which we do not want. Because we have not yet
reprogrammed the decrementer we immediately take another decrementer
exception, and recurse.

The fix is twofold. Firstly make sure we call DISABLE_INTS before
calling RUNLATCH_ON. The badly named DISABLE_INTS actually reconciles
the irq state in the paca with the hardware, making it safe again to
call local_irq_save/restore().

Although that should be sufficient to fix the bug, we also mark the
runlatch routines as notrace. They are called very early in the
exception entry and we are asking for trouble tracing them. They are
also fairly uninteresting and tracing them just adds unnecessary
overhead.

[ This regression was introduced by fe1952fc0afb9a2e4c79f103c08aef5d13db1873
  "powerpc: Rework runlatch code" by myself --BenH
]

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/powerpc/include/asm/exception-64s.h |    2 +-
 arch/powerpc/kernel/process.c            |    4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -320,7 +320,7 @@ label##_common:							\
  */
 #define STD_EXCEPTION_COMMON_ASYNC(trap, label, hdlr)		  \
 	EXCEPTION_COMMON(trap, label, hdlr, ret_from_except_lite, \
-			 FINISH_NAP;RUNLATCH_ON;DISABLE_INTS)
+			 FINISH_NAP;DISABLE_INTS;RUNLATCH_ON)
 
 /*
  * When the idle code in power4_idle puts the CPU into NAP mode,
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1218,7 +1218,7 @@ EXPORT_SYMBOL(dump_stack);
 
 #ifdef CONFIG_PPC64
 /* Called with hard IRQs off */
-void __ppc64_runlatch_on(void)
+void notrace __ppc64_runlatch_on(void)
 {
 	struct thread_info *ti = current_thread_info();
 	unsigned long ctrl;
@@ -1231,7 +1231,7 @@ void __ppc64_runlatch_on(void)
 }
 
 /* Called with hard IRQs off */
-void __ppc64_runlatch_off(void)
+void notrace __ppc64_runlatch_off(void)
 {
 	struct thread_info *ti = current_thread_info();
 	unsigned long ctrl;



  parent reply	other threads:[~2013-06-18 16:16 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-18 16:14 [ 00/26] 3.4.50-stable review Greg Kroah-Hartman
2013-06-18 16:14 ` [ 01/26] b43: stop format string leaking into error msgs Greg Kroah-Hartman
2013-06-18 16:14 ` [ 02/26] libceph: must hold mutex for reset_changed_osds() Greg Kroah-Hartman
2013-06-18 16:14 ` [ 03/26] ceph: add cpu_to_le32() calls when encoding a reconnect capability Greg Kroah-Hartman
2013-06-18 16:14 ` [ 04/26] ceph: ceph_pagelist_append might sleep while atomic Greg Kroah-Hartman
2013-06-18 16:14 ` [ 05/26] drivers/rtc/rtc-twl.c: fix missing device_init_wakeup() when booted with device tree Greg Kroah-Hartman
2013-06-18 16:14 ` [ 06/26] drm/gma500/psb: Unpin framebuffer on crtc disable Greg Kroah-Hartman
2013-06-18 16:14 ` [ 07/26] drm/gma500/cdv: " Greg Kroah-Hartman
2013-06-18 16:14 ` [ 08/26] Bluetooth: Fix mgmt handling of power on failures Greg Kroah-Hartman
2013-06-18 16:14 ` [ 09/26] ath9k: Disable PowerSave by default Greg Kroah-Hartman
2013-06-18 16:14 ` [ 10/26] ath9k: Use minstrel rate control " Greg Kroah-Hartman
2013-06-18 16:14 ` [ 11/26] CPU hotplug: provide a generic helper to disable/enable CPU hotplug Greg Kroah-Hartman
2013-06-18 16:14 ` [ 12/26] reboot: rigrate shutdown/reboot to boot cpu Greg Kroah-Hartman
2013-06-18 16:14 ` [ 13/26] cciss: fix broken mutex usage in ioctl Greg Kroah-Hartman
2013-06-18 16:14 ` [ 14/26] drm/i915: prefer VBT modes for SVDO-LVDS over EDID Greg Kroah-Hartman
2013-06-18 16:14 ` [ 15/26] swap: avoid read_swap_cache_async() race to deadlock while waiting on discard I/O completion Greg Kroah-Hartman
2013-06-18 16:14 ` [ 16/26] md/raid1: consider WRITE as successful only if at least one non-Faulty and non-rebuilding drive completed it Greg Kroah-Hartman
2013-06-18 16:14 ` [ 17/26] mm: migration: add migrate_entry_wait_huge() Greg Kroah-Hartman
2013-06-18 16:14 ` [ 18/26] x86: Fix typo in kexec register clearing Greg Kroah-Hartman
2013-06-18 16:14 ` [ 19/26] libceph: clear messenger auth_retry flag when we authenticate Greg Kroah-Hartman
2013-06-18 16:15 ` [ 20/26] libceph: fix authorizer invalidation Greg Kroah-Hartman
2013-06-18 16:15 ` [ 21/26] libceph: add update_authorizer auth method Greg Kroah-Hartman
2013-06-18 16:15 ` [ 22/26] libceph: wrap auth ops in wrapper functions Greg Kroah-Hartman
2013-06-18 16:15 ` [ 23/26] libceph: wrap auth methods in a mutex Greg Kroah-Hartman
2013-06-18 16:15 ` [ 24/26] ceph: fix statvfs fr_size Greg Kroah-Hartman
2013-06-18 16:15 ` Greg Kroah-Hartman [this message]
2013-06-18 16:15 ` [ 26/26] powerpc: Fix missing/delayed calls to irq_work Greg Kroah-Hartman
     [not found] ` <CAKocOOOnfMbYymOamd8Woy=tmjnQ1i9xGLx-U2Unj8j7sk2RQA@mail.gmail.com>
2013-06-18 22:23   ` [ 00/26] 3.4.50-stable review Shuah Khan
2013-06-20 10:53 ` Satoru Takeuchi
2013-06-20 16:59   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130618161238.626277186@linuxfoundation.org \
    --to=greg@kroah.com \
    --cc=benh@kernel.crashing.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=michael@ellerman.id.au \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).