From: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
To: Nicholas Piggin <npiggin@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v3 0/4] Fixes for 3 separate NMI reentrancy bugs
Date: Tue, 26 Feb 2019 12:21:10 +0530
Message-ID: <20190226065110.GA13164@sathnaga86.in.ibm.com>
In-Reply-To: <20190226060901.18715-1-npiggin@gmail.com>

On Tue, Feb 26, 2019 at 04:08:57PM +1000, Nicholas Piggin wrote:
> This series fixes several similar but unrelated bugs with NMIs
> clobbering live registers without noticing it, because MSR[RI] is set.
> Pretty rare bugs, but serious silent corruption consequences.
> 
> For the most part these can be observed and tested quite easily
> with the mambo simulator, except that it does not seem to follow
> the architecture wrt leaving MSR[RI] unchanged for HV interrupts.
> Mambo clears MSR[RI], so you have to account for that manually.
> 
> Since v1:
> - Fixed several build bugs.
> 
> Since v2:
> - Improved changelog and comments.
> - Fixed the NIA test for virt mode interrupts.

I hit the crash below on a Power8 box with this series applied on top of the linuxppc merge branch, built with `ppc64le_defconfig`.

UnknownStateTransition: Something happened system state="8" and we transitioned to UNKNOWN state.  Review the following for more details
Message="OpTestSystem in run_IPLing and Exception="Kernel OOPS (machine in state '5'): Oops: Kernel access of bad area, sig: 11 [#1]
[    0.000000] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7-gf46b87021 #1
[    0.000000] NIP:  c000000000c1306c LR: c000000000c12f64 CTR: c00000000033d860
[    0.000000] REGS: c0000000014878b0 TRAP: 0380   Not tainted  (5.0.0-rc7-gf46b87021)
[    0.000000] MSR:  9000000000001033 <SF,HV,ME,IR,DR,RI,LE>  CR: 28002224  XER: 00000000
[    0.000000] CFAR: c000000000c12f7c IRQMASK: 1 
[    0.000000] GPR00: c000000000c12f64 c000000001487b40 c000000001488400 f000000000000000 
[    0.000000] GPR04: c000000001487b18 c000000001487b20 0000000000000000 c000000001388400 
[    0.000000] GPR08: f000000000000000 f000000000000008 0000000000000000 0000000800000000 
[    0.000000] GPR12: c0000000015e1ed0 c000000001670000 0000000000000000 0000000000000000 
[    0.000000] GPR16: 0000000000000000 0000000000000000 c0000000015e0d40 0000000000000001 
[    0.000000] GPR20: ffffffffffffffff ffffffffffffffff 0000000008000000 c000000001413b90 
[    0.000000] GPR24: c000000001413b98 007ffff000000000 0000000000080000 0000000000000000 
[    0.000000] GPR28: 0000000000000000 0000000000000000 007ffff000001000 0000000000000000 
[    0.000000] NIP [c000000000c1306c] memmap_init_zone+0x258/0x308
[    0.000000] LR [c000000000c12f64] memmap_init_zone+0x150/0x308
[    0.000000] Call Trace:
[    0.000000] [c000000001487b40] [c000000000c12f64] memmap_init_zone+0x150/0x308 (unreliable)
[    0.000000] [c000000001487be0] [c000000000f87acc] free_area_init_node+0x480/0x518
[    0.000000] [c000000001487cf0] [c000000000f88630] free_area_init_nodes+0x838/0x940
[    0.000000] [c000000001487e10] [c000000000f6340c] paging_init+0x8c/0xa8
[    0.000000] [c000000001487e80] [c000000000f5bc00] setup_arch+0x3b4/0x3f0
[    0.000000] [c000000001487ef0] [c000000000f53b68] start_kernel+0x94/0x630
[    0.000000] [c000000001487f90] [c00000000000b37c] start_here_common+0x1c/0x520
[    0.000000] Instruction dump:
[    0.000000] 71290002 41820014 ebea0008 7cc6fa14 78df8402 48000070 3d22000c 7bea3664 
[    0.000000] 39299d20 e9090000 7c685214 39230008 <fa290010> fa290018 fa290020 fa290030 
[    0.000000] random: get_random_bytes called from print_oops_end_marker+0x40/0x80 with crng_init=0
[    0.000000] ---[ end trace 0000000000000000 ]---
[    0.000000] 
[    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.000000] Rebooting in 10 seconds" caused the system to go to UNKNOWN_BAD and the system will be stopping."
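
For reference when reading the register dump above, the MSR value can be checked by hand against the flag list the kernel printed. The following is a minimal sketch, assuming the 64-bit Book3S MSR bit positions (as I understand the MSR_*_LG definitions in arch/powerpc/include/asm/reg.h); the table `MSR_BITS` and the helper `decode_msr` are hypothetical names for illustration, and only a subset of flags is listed.

```python
# Assumed bit positions for the 64-bit Book3S MSR; this is a subset,
# not the full register layout.
MSR_BITS = [
    (63, "SF"), (60, "HV"), (25, "VEC"), (23, "VSX"),
    (15, "EE"), (14, "PR"), (13, "FP"), (12, "ME"),
    (5, "IR"), (4, "DR"), (1, "RI"), (0, "LE"),
]


def decode_msr(value):
    """Return the names of the known MSR flags set in value, high bit first."""
    return [name for bit, name in MSR_BITS if value & (1 << bit)]


# MSR value taken from the oops above.
msr = 0x9000000000001033
flags = decode_msr(msr)
print(",".join(flags))  # prints "SF,HV,ME,IR,DR,RI,LE"
```

Under these assumptions the decoded list agrees with the `<SF,HV,ME,IR,DR,RI,LE>` string in the dump: MSR[RI] is set and MSR[EE] is clear at the point of the fault.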

Regards,
-Satheesh.
> 
> Nicholas Piggin (4):
>   powerpc/64s: Fix HV NMI vs HV interrupt recoverability test
>   powerpc/64s: system reset interrupt preserve HSRRs
>   powerpc/64s: Prepare to handle data interrupts vs d-side MCE
>     reentrancy
>   powerpc/64s: Fix data interrupts vs d-side MCE reentrancy
> 
>  arch/powerpc/include/asm/asm-prototypes.h |  8 ++
>  arch/powerpc/include/asm/nmi.h            |  2 +
>  arch/powerpc/kernel/exceptions-64s.S      | 92 +++++++++++++++++++----
>  arch/powerpc/kernel/mce.c                 |  3 +
>  arch/powerpc/kernel/traps.c               | 91 +++++++++++++++++++++-
>  5 files changed, 179 insertions(+), 17 deletions(-)
> 
> -- 
> 2.18.0
> 


Thread overview: 6+ messages
2019-02-26  6:08 [PATCH v3 0/4] Fixes for 3 separate NMI reentrancy bugs Nicholas Piggin
2019-02-26  6:08 ` [PATCH v3 1/4] powerpc/64s: Fix HV NMI vs HV interrupt recoverability test Nicholas Piggin
2019-02-26  6:08 ` [PATCH v3 2/4] powerpc/64s: system reset interrupt preserve HSRRs Nicholas Piggin
2019-02-26  6:09 ` [PATCH v3 3/4] powerpc/64s: Prepare to handle data interrupts vs d-side MCE reentrancy Nicholas Piggin
2019-02-26  6:09 ` [PATCH v3 4/4] powerpc/64s: Fix " Nicholas Piggin
2019-02-26  6:51 ` Satheesh Rajendran [this message]
