From: Christophe Leroy <christophe.leroy@c-s.fr>
To: Radu Rendec <radu.rendec@gmail.com>, linuxppc-dev@lists.ozlabs.org
Subject: Re: MCE handler gets NIP wrong on MPC8378
Date: Tue, 18 Feb 2020 19:08:37 +0100
Message-ID: <a0856192-804b-fe2a-ccb8-48b43b130696@c-s.fr>
In-Reply-To: <CAD5jUk_8DAvneGjkQ7JOOuNeXaKU1g9E09+H8M5Eo=ttgthdgg@mail.gmail.com>



On 18/02/2020 at 18:07, Radu Rendec wrote:
> Hi Everyone,
> 
> The saved NIP seems to be broken inside machine_check_exception() on
> MPC8378, running Linux 4.9.191. The value is 0x900 most of the time,
> but I have seen other weird values.
> 
> I've been able to track down the entry code to head_32.S (vector 0x200),
> but I'm not sure where/how the NIP value (where the exception occurred)
> is captured.

The NIP value is supposed to come from SRR0: it is loaded into r12 in
EXCEPTION_PROLOG_2 and saved to _NIP(r11) in transfer_to_handler in
entry_32.S.

Can something clobber r12 at some point?

Maybe add the following somewhere to trap when it happens:

tweqi r12, 0x900

If you put it just after the read of SRR0, and again just before the
write to _NIP(r11), you'll see whether it is wrong from the beginning or
overwritten later.
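
Roughly, the two placements could look like the sketch below. It assumes
the 4.9 layout, where EXCEPTION_PROLOG_2 in head_32.S reads SRR0 into r12
and transfer_to_handler in entry_32.S stores r12 into _NIP(r11). The
neighbouring instructions are written from memory and abbreviated, so
check them against your tree before patching:

```asm
/* head_32.S, inside EXCEPTION_PROLOG_2 (sketch, other lines omitted) */
	mfspr	r12,SPRN_SRR0;	\
	tweqi	r12,0x900;	/* traps if SRR0 itself already held 0x900 */ \
	mfspr	r9,SPRN_SRR1;	\

/* entry_32.S, transfer_to_handler (sketch, other lines omitted) */
	tweqi	r12,0x900	/* traps if r12 is 0x900 when it reaches here */
	stw	r12,_NIP(r11)
```

If the first trap fires, the value was already wrong in SRR0. If only the
second one fires, r12 was overwritten somewhere between the prologue and
transfer_to_handler.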

Christophe

> 
> I also noticed most of the code has moved to head_32.h in newer kernel
> versions, but EXCEPTION_PROLOG_1 and EXCEPTION_PROLOG_2 look pretty much
> the same. I guess the same thing happens on a recent kernel, even though
> I don't have an easy way to test it.
> 
> The original MCE that I see is triggered by a failed PCIe transaction,
> but I was able to reproduce it by just reading from a (physically)
> unmapped memory area. Sample code and kernel crash dump are included
> below.
> 
> Can anyone please provide any suggestion as to what to look at next?
> 
> Thanks,
> Radu
> 
> 8<--------------------------------------------------------------------
> 
> #include <linux/module.h>
> #include <linux/delay.h>
> #include <asm/io.h>
> 
> static void __iomem *bad_addr_base;
> 
> static int __init test_mce_init(void)
> {
>          unsigned int x;
> 
>          bad_addr_base = ioremap(0xf0000000, 0x100);
> 
>          if (bad_addr_base) {
>                  __asm__ __volatile__ ("isync");
>                  x = ioread32(bad_addr_base);
>                  pr_info("Test: %#0x\n", x);
>          } else {
>                  pr_err("Cannot map\n");
>          }
> 
>          return 0;
> }
> 
> static void __exit test_mce_exit(void)
> {
>          iounmap(bad_addr_base);
> }
> 
> module_init(test_mce_init);
> module_exit(test_mce_exit);
> 
> MODULE_LICENSE("GPL");
> 
> 8<--------------------------------------------------------------------
> 
> [   14.977053] mce: loading out-of-tree module taints kernel.
> [   15.004285] Disabling lock debugging due to kernel taint
> [   15.026151] Machine check in kernel mode.
> [   15.030153] Caused by (from SRR1=41000): [   15.033982] Transfer error ack signal
> [   15.037652] Oops: Machine check 1, sig: 7 [#1]
> [   15.042088] PREEMPT [   15.044091] MPC8378_CUSTOM
> [   15.046967] Modules linked in: mce(O+) iptable_filter ip_tables x_tables ipv6 mpc8xxx_wdt yaffs spidev spi_fsl_spi spi_fsl_lib spi_fsl_cpm fsl_mph_dr_of ehci_fsl ehci_hcd
> [   15.067486] CPU: 0 PID: 1213 Comm: insmod Tainted: G   M     C O 4.9.191-default-mpc8378-p3c692a64ae1d #31
> [   15.078175] task: 9e83e550 task.stack: 9ea2e000
> [   15.082699] NIP: 00000900 LR: b147e030 CTR: 80015d50
> [   15.087659] REGS: 9ea2fca0 TRAP: 0200   Tainted: G   M     C O (4.9.191-default-mpc8378-p3c692a64ae1d)
> [   15.098084] MSR: 00041000 <ME>[   15.100973]   CR: 42002228  XER: 00000000
> [   15.104976] DAR: 80017414 DSISR: 00000000
> GPR00: b147e030 9ea2fd50 9e83e550 00000000 b1480000 9c652200 9ea2fd18 00000000
> GPR08: 9c652200 00000000 b1480000 00001032 80015d50 100b93d0 b147e308 805eb3d8
> GPR16: 0000003a 00000550 b1473b5c b147c2a4 8048e444 80082b08 00000000 b147c0e8
> GPR24: 00000124 00000578 00000000 00000000 b147c0a0 b147e000 9eb7c280 b147c0a0
> NIP [00000900] 0x900
> [   15.139310] LR [b147e030] test_mce_init+0x30/0xa8 [mce]
> [   15.144528] Call Trace:
> [   15.146973] [9ea2fd50] [b147e000] test_mce_init+0x0/0xa8 [mce] (unreliable)
> [   15.153940] [9ea2fd60] [b147e030] test_mce_init+0x30/0xa8 [mce]
> [   15.159864] [9ea2fd70] [80003ac4] do_one_initcall+0xbc/0x184
> [   15.165527] [9ea2fde0] [804857e8] do_init_module+0x64/0x1e4
> [   15.171107] [9ea2fe00] [80086014] load_module+0x1c78/0x2268
> [   15.176680] [9ea2fec0] [80086780] SyS_init_module+0x17c/0x190
> [   15.182433] [9ea2ff40] [80010acc] ret_from_syscall+0x0/0x38
> [   15.188005] --- interrupt: c01 at 0xfdfdb64
> [   15.188005]     LR = 0x10013c64
> [   15.195309] Instruction dump:
> [   15.198274] 00000000 XXXXXXXX 00000000 XXXXXXXX 00000000 XXXXXXXX 00000000 XXXXXXXX
> [   15.206047] 00000000 XXXXXXXX 00000000 XXXXXXXX 7d5043a6 XXXXXXXX 7d400026 XXXXXXXX
> [   15.213824] ---[ end trace d38922938e009d45 ]---
> [   15.218434]
> [   16.219951] Kernel panic - not syncing: Fatal exception
> [   16.225174] Rebooting in 1 seconds..
> 
