* Machine Check in P2010(e500v2)
@ 2017-09-01 11:32 Joakim Tjernlund
  2017-09-05  8:40 ` Joakim Tjernlund
  2017-09-06 10:05 ` Laurentiu Tudor
  0 siblings, 2 replies; 21+ messages in thread
From: Joakim Tjernlund @ 2017-09-01 11:32 UTC (permalink / raw)
  To: linuxppc-dev

I am trying to debug a Machine Check for a P2010 (e500v2) CPU:

[   28.111816] Caused by (from MCSR=10008): Bus - Read Data Bus Error
[   28.117998] Oops: Machine check, sig: 7 [#1]
[   28.122263] P1010 RDB
[   28.124529] Modules linked in: linux_bcm_knet(PO) linux_user_bde(PO) linux_kernel_bde(PO)
[   28.132718] CPU: 0 PID: 470 Comm: emxp2_hw_bl Tainted: P           O    4.1.38+ #49
[   28.140376] task: db16cd10 ti: df128000 task.ti: df128000
[   28.145770] NIP: 00000000 LR: 10a4e404 CTR: 10046c38
[   28.150730] REGS: df129f10 TRAP: 0204   Tainted: P           O     (4.1.38+)
[   28.157776] MSR: 0002d000 <CE,EE,PR,ME>  CR: 44002428  XER: 00000000
[   28.164140] DEAR: b7187000 ESR: 00000000
GPR00: 10a4e404 bf86ea30 b7ca94a0 132f9fa8 07006000 07000000 00000000 132f9fd8
GPR08: b7149000 b7159000 0003e000 bf86ea20 24004424 11d6cf7c 00000000 00000000
GPR16: 10f6e29c 10f6c872 10f6db01 0000b541 0000b541 11d92fcc 00000011 00000001
GPR24: 01a4d12d 132ffbf0 11d60000 00000000 07006000 00000000 132f9fa8 00000000
[   28.196375] NIP [00000000]   (null)
[   28.199859] LR [10a4e404] 0x10a4e404
[   28.203426] Call Trace:
[   28.205866] ---[ end trace f456255ddf9bee83 ]---

I cannot figure out why NIP is NULL. It looks like NIP is set to
MCSRR0 early on, but maybe it is lost somehow?

Anyhow, looking at entry_32.S:
	.globl	mcheck_transfer_to_handler
mcheck_transfer_to_handler:
	mfspr	r0,SPRN_DSRR0
	stw	r0,_DSRR0(r11)
	mfspr	r0,SPRN_DSRR1
	stw	r0,_DSRR1(r11)
	/* fall through */

	.globl	debug_transfer_to_handler
debug_transfer_to_handler:
	mfspr	r0,SPRN_CSRR0
	stw	r0,_CSRR0(r11)
	mfspr	r0,SPRN_CSRR1
	stw	r0,_CSRR1(r11)
	/* fall through */

	.globl	crit_transfer_to_handler
crit_transfer_to_handler:

It looks odd that DSRRx is saved in mcheck and CSRRx in debug, while
crit saves neither. Should these assignments not be shifted down one level?
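
To make the fall-through chain easier to reason about, here is a minimal C
model of the three labels quoted from entry_32.S. The enum, flag names and
saved_pairs() are hypothetical and only mirror the snippet; the rest of
crit_transfer_to_handler is not shown and not modelled. One possible reading,
stated as an assumption, is that each entry point stores the save/restore
pairs of the lower levels it may have interrupted rather than its own pair.

```c
/* Hypothetical labels for the three entry points in the quoted assembly. */
enum entry_point { ENTRY_MCHECK, ENTRY_DEBUG, ENTRY_CRIT };

/* Bit flags for the save/restore register pairs that get stored. */
#define SAVED_DSRR 0x1u
#define SAVED_CSRR 0x2u

/*
 * Model of the fall-through chain: entering at a higher label also
 * executes everything below it, exactly as in the quoted snippet.
 */
static unsigned int saved_pairs(enum entry_point entry)
{
	unsigned int saved = 0;

	switch (entry) {
	case ENTRY_MCHECK:
		saved |= SAVED_DSRR;	/* mfspr/stw of DSRR0 and DSRR1 */
		/* fall through */
	case ENTRY_DEBUG:
		saved |= SAVED_CSRR;	/* mfspr/stw of CSRR0 and CSRR1 */
		/* fall through */
	case ENTRY_CRIT:
		break;			/* nothing saved in the quoted part */
	}
	return saved;
}
```

In this model a machine check entry stores both DSRRx and CSRRx, a debug
entry stores only CSRRx, and a critical entry stores neither.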

  Jocke


* Re: Machine Check in P2010(e500v2)
  2017-09-01 11:32 Machine Check in P2010(e500v2) Joakim Tjernlund
@ 2017-09-05  8:40 ` Joakim Tjernlund
  2017-09-06 15:38   ` York Sun
  2017-09-06 10:05 ` Laurentiu Tudor
  1 sibling, 1 reply; 21+ messages in thread
From: Joakim Tjernlund @ 2017-09-05  8:40 UTC (permalink / raw)
  To: linuxppc-dev, scottwood, york.sun

So after some debugging I found this bug:
@@ -996,7 +998,7 @@ int fsl_pci_mcheck_exception(struct pt_regs *regs)
        if (is_in_pci_mem_space(addr)) {
                if (user_mode(regs)) {
                        pagefault_disable();
-                       ret = get_user(regs->nip, &inst);
+                       ret = get_user(inst, (__u32 __user *)regs->nip);
                        pagefault_enable();
                } else {
                        ret = probe_kernel_address(regs->nip, inst);
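
A userspace sketch of why the swapped arguments matter. get_user(x, ptr) takes
the destination first and the user pointer second, so the old call wrote into
regs->nip instead of reading from it. Everything below (user_mem,
GET_USER_SKETCH, struct regs_sketch, the two fetch helpers) is a hypothetical
model, not kernel code; it assumes that a failed get_user leaves zero in the
destination, which would be one plausible explanation for NIP showing up as
00000000.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for user memory: two instruction words. */
static uint32_t user_mem[2] = { 0x60000000u, 0x81280000u };

/* Returns 1 if p points at a word inside user_mem, 0 otherwise. */
static int is_user_ptr(const void *p)
{
	uintptr_t a = (uintptr_t)p;
	uintptr_t lo = (uintptr_t)&user_mem[0];
	uintptr_t hi = (uintptr_t)&user_mem[2];

	return a >= lo && a < hi;
}

/*
 * Model of get_user(x, ptr): x is the destination, ptr the source.
 * On a bad pointer the destination is zeroed and -EFAULT is returned.
 */
#define GET_USER_SKETCH(x, ptr) \
	(is_user_ptr(ptr) ? ((x) = *(ptr), 0) : ((x) = 0, -EFAULT))

/* Only nip is modelled here; the kernel uses struct pt_regs. */
struct regs_sketch { uintptr_t nip; };

/* Old argument order: destination is regs->nip, source is &inst. */
static int fetch_inst_buggy(struct regs_sketch *regs, uint32_t *inst)
{
	return GET_USER_SKETCH(regs->nip, inst);
}

/* Fixed order: destination is *inst, source is the address held in nip. */
static int fetch_inst_fixed(struct regs_sketch *regs, uint32_t *inst)
{
	return GET_USER_SKETCH(*inst, (uint32_t *)regs->nip);
}
```

With the old order the model clobbers nip and never fetches the instruction;
with the fixed order nip is left alone and inst receives the word it points to.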

However, the kernel still locked up after fixing that.
Now I wonder why this fixup is there in the first place. The routine
does not really fix up the insn; it just returns 0xffffffff for the failing
read and then advances the process NIP.

Removing the fixup does not help either, kernel still locks up:
[   28.170532] Machine check in kernel mode.
[   28.174538] Caused by (from MCSR=10008):
[   28.182804] Bus - Read Data Bus Error: DAR:b7013000
[   28.197079] Oops: Machine check, sig: 7 [#1]
[   28.201343] P1010 RDB
[   28.203608] Modules linked in: linux_bcm_knet(PO) linux_user_bde(PO) linux_kernel_bde(PO)
[   28.211796] CPU: 0 PID: 470 Comm: emxp2_hw_bl Tainted: P           O    4.1.38+ #201
[   28.219540] task: db16ed10 ti: df122000 task.ti: df122000
[   28.224935] NIP: 10a4e2f4 LR: 10a4e404 CTR: 10046c38
[   28.229896] REGS: df123f10 TRAP: 0204   Tainted: P           O     (4.1.38+)
[   28.236942] MSR: 0002d000 <CE,EE,PR,ME>  CR: 44002428  XER: 00000000
[   28.243306] DEAR: b7013000 ESR: 00000000
GPR00: 10a4e404 bfab2730 b7b354a0 132f9fa8 07006000 07000000 00000000 132f9fd8
GPR08: b6fd5000 b6fe5000 0003e000 bfab2720 24004424 11d6cf7c 00000000 00000000
GPR16: 10f6e29c 10f6c872 10f6db01 0000b541 0000b541 11d92fcc 00000011 00000001
GPR24: 01a5bd3e 132ffbf0 11d60000 00000000 07006000 00000000 132f9fa8 00000000
[   28.275547] NIP [10a4e2f4] 0x10a4e2f4
[   28.279204] LR [10a4e404] 0x10a4e404
[   28.282772] Call Trace:
[   28.285213] ---[ end trace 9f8b64ab1e83f449 ]---
[   28.289825]


 Jocke


* Re: Machine Check in P2010(e500v2)
  2017-09-01 11:32 Machine Check in P2010(e500v2) Joakim Tjernlund
  2017-09-05  8:40 ` Joakim Tjernlund
@ 2017-09-06 10:05 ` Laurentiu Tudor
  2017-09-06 10:16   ` Joakim Tjernlund
  1 sibling, 1 reply; 21+ messages in thread
From: Laurentiu Tudor @ 2017-09-06 10:05 UTC (permalink / raw)
  To: Joakim Tjernlund, linuxppc-dev

Hi Jocke,

On 09/01/2017 02:32 PM, Joakim Tjernlund wrote:
> Anyhow, looking at entry_32.S:
> 	.globl	mcheck_transfer_to_handler
> mcheck_transfer_to_handler:
> 	mfspr	r0,SPRN_DSRR0
> 	stw	r0,_DSRR0(r11)
> 	mfspr	r0,SPRN_DSRR1
> 	stw	r0,_DSRR1(r11)
> 	/* fall through */
>
> 	.globl	debug_transfer_to_handler
> debug_transfer_to_handler:
> 	mfspr	r0,SPRN_CSRR0
> 	stw	r0,_CSRR0(r11)
> 	mfspr	r0,SPRN_CSRR1
> 	stw	r0,_CSRR1(r11)
> 	/* fall through */
>
> 	.globl	crit_transfer_to_handler
> crit_transfer_to_handler:
>
> It looks odd that DSRRx is assigned in mcheck and CSRRx in debug and
> crit has none. Should not this assignment be shifted down one level?
>

This does indeed look weird. Have you tried moving the SPRN_CSRR*
saving into the crit section? Any results?

---
Best Regards, Laurentiu


* Re: Machine Check in P2010(e500v2)
  2017-09-06 10:05 ` Laurentiu Tudor
@ 2017-09-06 10:16   ` Joakim Tjernlund
  2017-09-08  1:56     ` Scott Wood
  0 siblings, 1 reply; 21+ messages in thread
From: Joakim Tjernlund @ 2017-09-06 10:16 UTC (permalink / raw)
  To: linuxppc-dev, oss, laurentiu.tudor

On Wed, 2017-09-06 at 10:05 +0000, Laurentiu Tudor wrote:
> This does indeed look weird. Have you tried moving the SPRN_CSRR*
> saving into the crit section? Any results?

After looking at this somewhat, I think this is intentional and OK.
I sorted NIP == NULL too:
@@ -996,7 +998,7 @@ int fsl_pci_mcheck_exception(struct pt_regs *regs)
        if (is_in_pci_mem_space(addr)) {
                if (user_mode(regs)) {
                        pagefault_disable();
-                       ret = get_user(regs->nip, &inst);
+                       ret = get_user(inst, (__u32 __user *)regs->nip);
                        pagefault_enable();
                } else {
                        ret = probe_kernel_address(regs->nip, inst);

But after this, the CPU is still locked up after a Machine Check. Is this
to be expected? I figured the user space process would get a SIGBUS and the
kernel would resume normal operations.
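
For reference, the MSR value in the oops (0002d000, printed as <CE,EE,PR,ME>)
can be decoded with a few masks. A minimal sketch, assuming the usual Book-E
bit positions for these four flags and assuming that the user_mode(regs) test
boils down to checking PR; msr_user_mode() is a hypothetical helper, not a
kernel function.

```c
#include <stdint.h>

/* Book-E MSR bits named in the oops line "<CE,EE,PR,ME>". */
#define MSR_CE (1u << 17)	/* critical interrupts enabled */
#define MSR_EE (1u << 15)	/* external interrupts enabled */
#define MSR_PR (1u << 14)	/* problem state, i.e. user mode */
#define MSR_ME (1u << 12)	/* machine check enabled */

/* Hypothetical helper: nonzero if the saved MSR says "user mode". */
static int msr_user_mode(uint32_t msr)
{
	return (msr & MSR_PR) != 0;
}
```

The four masks OR together to 0x0002d000, matching the oops, and PR is set,
so the interrupted context in both traces appears to have been user space.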

Scott, maybe you have some idea?

 Jocke


* Re: Machine Check in P2010(e500v2)
  2017-09-05  8:40 ` Joakim Tjernlund
@ 2017-09-06 15:38   ` York Sun
  2017-09-06 19:31     ` Leo Li
  0 siblings, 1 reply; 21+ messages in thread
From: York Sun @ 2017-09-06 15:38 UTC (permalink / raw)
  To: Joakim Tjernlund, linuxppc-dev, Leo Li

Scott is no longer with Freescale/NXP. Adding Leo.

* RE: Machine Check in P2010(e500v2)
  2017-09-06 15:38   ` York Sun
@ 2017-09-06 19:31     ` Leo Li
  2017-09-06 20:17       ` Joakim Tjernlund
  0 siblings, 1 reply; 21+ messages in thread
From: Leo Li @ 2017-09-06 19:31 UTC (permalink / raw)
  To: York Sun, Joakim Tjernlund, linuxppc-dev



> -----Original Message-----
> From: York Sun
> Sent: Wednesday, September 06, 2017 10:38 AM
> To: Joakim Tjernlund <Joakim.Tjernlund@infinera.com>; linuxppc-
> dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>
> Subject: Re: Machine Check in P2010(e500v2)
>
> Scott is no longer with Freescale/NXP. Adding Leo.
>
> On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > So after some debugging I found this bug:
> > @@ -996,7 +998,7 @@ int fsl_pci_mcheck_exception(struct pt_regs *regs)
> >          if (is_in_pci_mem_space(addr)) {
> >                  if (user_mode(regs)) {
> >                          pagefault_disable();
> > -                       ret = get_user(regs->nip, &inst);
> > +                       ret = get_user(inst, (__u32 __user *)regs->nip);
> >                          pagefault_enable();
> >                  } else {
> >                          ret = probe_kernel_address(regs->nip, inst);
> >
> > However, the kernel still locked up after fixing that.
> > Now I wonder why this fixup is there in the first place? The routine
> > will not really fixup the insn, just return 0xffffffff for the failing
> > read and then advance the process NIP.

You are right.  The code here only gives 0xffffffff to the load
instructions and continues with the next instruction when a load
instruction is causing the machine check.  This prevents a system lockup
when reading from a PCI/RapidIO device whose link is down.
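
The behaviour described above can be sketched in plain C. This is a simplified
userspace model, not the code in fsl_pci.c: struct mc_regs_sketch and
fixup_failed_load() are hypothetical names, only three D-form loads are
modelled, and the all-ones widths for the byte and halfword cases are an
assumption.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down register file; the kernel uses struct pt_regs. */
struct mc_regs_sketch {
	uint32_t gpr[32];
	uint32_t nip;
};

/* PowerPC D-form primary opcodes for the loads modelled here. */
#define OP_LWZ 32u
#define OP_LBZ 34u
#define OP_LHZ 40u

/*
 * If inst is one of the modelled loads, fake an all-ones result in the
 * destination register and step past the instruction.
 * Returns 1 if handled, 0 otherwise (for example for a store).
 */
static int fixup_failed_load(struct mc_regs_sketch *regs, uint32_t inst)
{
	uint32_t op = inst >> 26;		/* primary opcode, top 6 bits */
	uint32_t rd = (inst >> 21) & 0x1fu;	/* destination register field */

	switch (op) {
	case OP_LWZ:
		regs->gpr[rd] = 0xffffffffu;
		break;
	case OP_LHZ:
		regs->gpr[rd] = 0xffffu;
		break;
	case OP_LBZ:
		regs->gpr[rd] = 0xffu;
		break;
	default:
		return 0;	/* not a modelled load: leave regs untouched */
	}
	regs->nip += 4;		/* resume at the next instruction */
	return 1;
}
```

For example, 0x81280000 encodes lwz r9,0(r8) and is handled, while 0x91280000
encodes stw r9,0(r8) and is not, so a faulting store would not be skipped by
this kind of fixup.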

I don't know what the actual problem is in your case.  Maybe it is a write
instruction instead of a read?  Or the code is in an infinite loop waiting
for a valid read result?  Are you able to do some further debugging with
the NIP correctly printed?

Regards,
Leo


* Re: Machine Check in P2010(e500v2)
  2017-09-06 19:31     ` Leo Li
@ 2017-09-06 20:17       ` Joakim Tjernlund
  2017-09-06 20:28         ` Leo Li
  0 siblings, 1 reply; 21+ messages in thread
From: Joakim Tjernlund @ 2017-09-06 20:17 UTC (permalink / raw)
  To: linuxppc-dev, leoyang.li, york.sun

On Wed, 2017-09-06 at 19:31 +0000, Leo Li wrote:
> You are right.  The code here only gives 0xffffffff to the load
> instructions and continues with the next instruction when a load
> instruction is causing the machine check.  This prevents a system lockup
> when reading from a PCI/RapidIO device whose link is down.
>
> I don't know what the actual problem is in your case.  Maybe it is a write
> instruction instead of a read?  Or the code is in an infinite loop waiting
> for a valid read result?  Are you able to do some further debugging with
> the NIP correctly printed?
>

According to the MC it is a Read, and the NIP also leads to a read in the
program.
ATM, I have disabled the fixup but I will enable it again.
Question: is it safe to add a small printk when this MC happens (after
fixing up)? I need to see that it has happened, as the error is somewhat
random.
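
A small sketch of the read-versus-write check on the reported MCSR value. The
two mask values are assumptions based on the e500 bus error bits as defined in
the kernel's reg_booke.h and should be checked against the tree in use; the
remaining set bit in 0x10008 is not decoded here, and mcsr_data_bus_error() is
a hypothetical helper.

```c
#include <stdint.h>

/* Assumed e500 MCSR data bus error bits (verify against reg_booke.h). */
#define MCSR_BUS_RBERR 0x00000008u	/* Read Data Bus Error */
#define MCSR_BUS_WBERR 0x00000004u	/* Write Data Bus Error */

/* Hypothetical helper: classify the data bus error reported in MCSR. */
static const char *mcsr_data_bus_error(uint32_t mcsr)
{
	if (mcsr & MCSR_BUS_RBERR)
		return "read";
	if (mcsr & MCSR_BUS_WBERR)
		return "write";
	return "none";
}
```

Under these assumptions MCSR=10008 classifies as a read, which agrees with the
"Bus - Read Data Bus Error" text in the oops.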

 Jocke


* RE: Machine Check in P2010(e500v2)
  2017-09-06 20:17       ` Joakim Tjernlund
@ 2017-09-06 20:28         ` Leo Li
  2017-09-06 20:53           ` Joakim Tjernlund
  0 siblings, 1 reply; 21+ messages in thread
From: Leo Li @ 2017-09-06 20:28 UTC (permalink / raw)
  To: Joakim Tjernlund, linuxppc-dev, York Sun



> -----Original Message-----
> From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> Sent: Wednesday, September 06, 2017 3:17 PM
> To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>; York Sun
> <york.sun@nxp.com>
> Subject: Re: Machine Check in P2010(e500v2)
>
> On Wed, 2017-09-06 at 19:31 +0000, Leo Li wrote:
> > > -----Original Message-----
> > > From: York Sun
> > > Sent: Wednesday, September 06, 2017 10:38 AM
> > > To: Joakim Tjernlund <Joakim.Tjernlund@infinera.com>; linuxppc-
> > > dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>
> > > Subject: Re: Machine Check in P2010(e500v2)
> > >
> > > Scott is no longer with Freescale/NXP. Adding Leo.
> > >
> > > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > > So after some debugging I found this bug:
> > > > @@ -996,7 +998,7 @@ int fsl_pci_mcheck_exception(struct pt_regs *re=
gs)
> > > >          if (is_in_pci_mem_space(addr)) {
> > > >                  if (user_mode(regs)) {
> > > >                          pagefault_disable();
> > > > -                       ret =3D get_user(regs->nip, &inst);
> > > > +                       ret =3D get_user(inst, (__u32 __user
> > > > + *)regs->nip);
> > > >                          pagefault_enable();
> > > >                  } else {
> > > >                          ret =3D probe_kernel_address(regs->nip,
> > > > inst);
> > > >
> > > > However, the kernel still locked up after fixing that.
> > > > Now I wonder why this fixup is there in the first place? The
> > > > routine will not really fixup the insn, just return 0xffffffff for
> > > > the failing read and then advance the process NIP.
> >
> > You are right.  The code here only gives 0xffffffff to the load
> > instructions and continue with the next instruction when the load
> > instruction is causing the machine check.  This will prevent a system
> > lockup when reading from PCI/RapidIO device which is link down.
> >
> > I don't know what is actual problem in your case.  Maybe it is a write
> > instruction instead of read?   Or the code is in a infinite loop waiting
> > for a valid read result?  Are you able to do some further debugging with
> > the NIP correctly printed?
> >
>
> According to the MC it is a Read and the NIP also leads to a read in the
> program.
> ATM, I have disabled the fixup but I will enable that again.
> Question, is it safe add a small printk when this MC happens(after fixing
> up)? I need to see that it has happened as the error is somewhat random.

I think it is safe to add a printk, as the current machine check handlers
also use printk.

>
>  Jocke
>
> > Regards,
> > Leo
> >
> > > >
> > > > Removing the fixup does not help either, kernel still locks up:
> > > > [   28.170532] Machine check in kernel mode.
> > > > [   28.174538] Caused by (from MCSR=3D10008):
> > > > [   28.182804] Bus - Read Data Bus Error: DAR:b7013000
> > > > [   28.197079] Oops: Machine check, sig: 7 [#1]
> > > > [   28.201343] P1010 RDB
> > > > [   28.203608] Modules linked in: linux_bcm_knet(PO) linux_user_bde=
(PO)
> > >
> > > linux_kernel_bde(PO)
> > > > [   28.211796] CPU: 0 PID: 470 Comm: emxp2_hw_bl Tainted: P        =
   O
> > >
> > > 4.1.38+ #201
> > > > [   28.219540] task: db16ed10 ti: df122000 task.ti: df122000
> > > > [   28.224935] NIP: 10a4e2f4 LR: 10a4e404 CTR: 10046c38
> > > > [   28.229896] REGS: df123f10 TRAP: 0204   Tainted: P           O  =
   (4.1.38+)
> > > > [   28.236942] MSR: 0002d000 <CE,EE,PR,ME>  CR: 44002428  XER:
> 00000000
> > > > [   28.243306] DEAR: b7013000 ESR: 00000000
> > > > GPR00: 10a4e404 bfab2730 b7b354a0 132f9fa8 07006000 07000000
> > >
> > > 00000000
> > > > 132f9fd8
> > > > GPR08: b6fd5000 b6fe5000 0003e000 bfab2720 24004424 11d6cf7c
> > > > 00000000
> > > > 00000000
> > > > GPR16: 10f6e29c 10f6c872 10f6db01 0000b541 0000b541 11d92fcc
> > > > 00000011
> > > > 00000001
> > > > GPR24: 01a5bd3e 132ffbf0 11d60000 00000000 07006000 00000000
> > > > 132f9fa8
> > >
> > > 00000000
> > > > [   28.275547] NIP [10a4e2f4] 0x10a4e2f4
> > > > [   28.279204] LR [10a4e404] 0x10a4e404
> > > > [   28.282772] Call Trace:
> > > > [   28.285213] ---[ end trace 9f8b64ab1e83f449 ]---
> > > > [   28.289825]
> > > >
> > > >
> > > >   Jocke
> > > >
> > > > On Fri, 2017-09-01 at 13:32 +0200, Joakim Tjernlund wrote:
> > > > > I am trying to debug a Machine Check for a P2010 (e500v2) CPU:
> > > > >
> > > > > [   28.111816] Caused by (from MCSR=3D10008): Bus - Read Data Bus=
 Error
> > > > > [   28.117998] Oops: Machine check, sig: 7 [#1]
> > > > > [   28.122263] P1010 RDB
> > > > > [   28.124529] Modules linked in: linux_bcm_knet(PO) linux_user_b=
de(PO)
> > >
> > > linux_kernel_bde(PO)
> > > > > [   28.132718] CPU: 0 PID: 470 Comm: emxp2_hw_bl Tainted: P      =
     O
> > >
> > > 4.1.38+ #49
> > > > > [   28.140376] task: db16cd10 ti: df128000 task.ti: df128000
> > > > > [   28.145770] NIP: 00000000 LR: 10a4e404 CTR: 10046c38
> > > > > [   28.150730] REGS: df129f10 TRAP: 0204   Tainted: P           O=
     (4.1.38+)
> > > > > [   28.157776] MSR: 0002d000 <CE,EE,PR,ME>  CR: 44002428  XER:
> 00000000
> > > > > [   28.164140] DEAR: b7187000 ESR: 00000000
> > > > > GPR00: 10a4e404 bf86ea30 b7ca94a0 132f9fa8 07006000 07000000
> > >
> > > 00000000
> > > > > 132f9fd8
> > > > > GPR08: b7149000 b7159000 0003e000 bf86ea20 24004424 11d6cf7c
> > >
> > > 00000000
> > > > > 00000000
> > > > > GPR16: 10f6e29c 10f6c872 10f6db01 0000b541 0000b541 11d92fcc
> > >
> > > 00000011
> > > > > 00000001
> > > > > GPR24: 01a4d12d 132ffbf0 11d60000 00000000 07006000 00000000
> > >
> > > 132f9fa8 00000000
> > > > > [   28.196375] NIP [00000000]   (null)
> > > > > [   28.199859] LR [10a4e404] 0x10a4e404
> > > > > [   28.203426] Call Trace:
> > > > > [   28.205866] ---[ end trace f456255ddf9bee83 ]---
> > > > >
> > > > > I cannot figure out why NIP is NULL ? It LOOKs like NIP is set
> > > > > to
> > > > > MCSRR0 early on but maybe it is lost somehow?
> > > > >
> > > > > Anyhow, looking at entry_32.S:
> > > > > 	.globl	mcheck_transfer_to_handler
> > > > > mcheck_transfer_to_handler:
> > > > > 	mfspr	r0,SPRN_DSRR0
> > > > > 	stw	r0,_DSRR0(r11)
> > > > > 	mfspr	r0,SPRN_DSRR1
> > > > > 	stw	r0,_DSRR1(r11)
> > > > > 	/* fall through */
> > > > >
> > > > > 	.globl	debug_transfer_to_handler
> > > > > debug_transfer_to_handler:
> > > > > 	mfspr	r0,SPRN_CSRR0
> > > > > 	stw	r0,_CSRR0(r11)
> > > > > 	mfspr	r0,SPRN_CSRR1
> > > > > 	stw	r0,_CSRR1(r11)
> > > > > 	/* fall through */
> > > > >
> > > > > 	.globl	crit_transfer_to_handler
> > > > > crit_transfer_to_handler:
> > > > >
> > > > > It looks odd that DSRRx is assigned in mcheck and CSRRx in debug
> > > > > and crit has none. Should not this assigment be shifted down one =
level?
> > > > >
> > > > >    Jocke
> >
> >

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Machine Check in P2010(e500v2)
  2017-09-06 20:28         ` Leo Li
@ 2017-09-06 20:53           ` Joakim Tjernlund
  2017-09-06 21:13             ` Leo Li
  0 siblings, 1 reply; 21+ messages in thread
From: Joakim Tjernlund @ 2017-09-06 20:53 UTC (permalink / raw)
  To: linuxppc-dev, leoyang.li, york.sun

On Wed, 2017-09-06 at 20:28 +0000, Leo Li wrote:
> > -----Original Message-----
> > From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> > Sent: Wednesday, September 06, 2017 3:17 PM
> > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>; York Su=
n
> > <york.sun@nxp.com>
> > Subject: Re: Machine Check in P2010(e500v2)
> >=20
> > On Wed, 2017-09-06 at 19:31 +0000, Leo Li wrote:
> > > > -----Original Message-----
> > > > From: York Sun
> > > > Sent: Wednesday, September 06, 2017 10:38 AM
> > > > To: Joakim Tjernlund <Joakim.Tjernlund@infinera.com>; linuxppc-
> > > > dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>
> > > > Subject: Re: Machine Check in P2010(e500v2)
> > > >=20
> > > > Scott is no longer with Freescale/NXP. Adding Leo.
> > > >=20
> > > > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > > > So after some debugging I found this bug:
> > > > > @@ -996,7 +998,7 @@ int fsl_pci_mcheck_exception(struct pt_regs *=
regs)
> > > > >          if (is_in_pci_mem_space(addr)) {
> > > > >                  if (user_mode(regs)) {
> > > > >                          pagefault_disable();
> > > > > -                       ret =3D get_user(regs->nip, &inst);
> > > > > +                       ret =3D get_user(inst, (__u32 __user
> > > > > + *)regs->nip);
> > > > >                          pagefault_enable();
> > > > >                  } else {
> > > > >                          ret =3D probe_kernel_address(regs->nip,
> > > > > inst);
> > > > >=20
> > > > > However, the kernel still locked up after fixing that.
> > > > > Now I wonder why this fixup is there in the first place? The
> > > > > routine will not really fixup the insn, just return 0xffffffff fo=
r
> > > > > the failing read and then advance the process NIP.
> > >=20
> > > You are right.  The code here only gives 0xffffffff to the load instr=
uctions and
> >=20
> > continue with the next instruction when the load instruction is causing=
 the
> > machine check.  This will prevent a system lockup when reading from
> > PCI/RapidIO device which is link down.
> > >=20
> > > I don't know what is actual problem in your case.  Maybe it is a writ=
e
> >=20
> > instruction instead of read?   Or the code is in a infinite loop waitin=
g for a valid
> > read result?  Are you able to do some further debugging with the NIP co=
rrectly
> > printed?
> > >=20
> >=20
> > According to the MC it is a Read and the NIP also leads to a read in th=
e program.
> > ATM, I have disabled the fixup but I will enable that again.
> > Question, is it safe add a small printk when this MC happens(after fixi=
ng up)? I
> > need to see that it has happened as the error is somewhat random.
>
> I think it is safe to add printk as the current machine check handlers are
> also using printk.

I hope so, but when the fixup fires there is no printk at all, so I was a
bit unsure.
I don't like this fixup though. Is there no better way than faking a read
to user space (or to the kernel, for that matter)?

 Jocke

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: Machine Check in P2010(e500v2)
  2017-09-06 20:53           ` Joakim Tjernlund
@ 2017-09-06 21:13             ` Leo Li
  2017-09-06 22:50               ` Joakim Tjernlund
  0 siblings, 1 reply; 21+ messages in thread
From: Leo Li @ 2017-09-06 21:13 UTC (permalink / raw)
  To: Joakim Tjernlund, linuxppc-dev, York Sun



> -----Original Message-----
> From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> Sent: Wednesday, September 06, 2017 3:54 PM
> To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>; York Sun
> <york.sun@nxp.com>
> Subject: Re: Machine Check in P2010(e500v2)
>=20
> On Wed, 2017-09-06 at 20:28 +0000, Leo Li wrote:
> > > -----Original Message-----
> > > From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> > > Sent: Wednesday, September 06, 2017 3:17 PM
> > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>; York
> > > Sun <york.sun@nxp.com>
> > > Subject: Re: Machine Check in P2010(e500v2)
> > >
> > > On Wed, 2017-09-06 at 19:31 +0000, Leo Li wrote:
> > > > > -----Original Message-----
> > > > > From: York Sun
> > > > > Sent: Wednesday, September 06, 2017 10:38 AM
> > > > > To: Joakim Tjernlund <Joakim.Tjernlund@infinera.com>; linuxppc-
> > > > > dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>
> > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > >
> > > > > Scott is no longer with Freescale/NXP. Adding Leo.
> > > > >
> > > > > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > > > > So after some debugging I found this bug:
> > > > > > @@ -996,7 +998,7 @@ int fsl_pci_mcheck_exception(struct pt_regs
> *regs)
> > > > > >          if (is_in_pci_mem_space(addr)) {
> > > > > >                  if (user_mode(regs)) {
> > > > > >                          pagefault_disable();
> > > > > > -                       ret =3D get_user(regs->nip, &inst);
> > > > > > +                       ret =3D get_user(inst, (__u32 __user
> > > > > > + *)regs->nip);
> > > > > >                          pagefault_enable();
> > > > > >                  } else {
> > > > > >                          ret =3D probe_kernel_address(regs->nip=
,
> > > > > > inst);
> > > > > >
> > > > > > However, the kernel still locked up after fixing that.
> > > > > > Now I wonder why this fixup is there in the first place? The
> > > > > > routine will not really fixup the insn, just return 0xffffffff
> > > > > > for the failing read and then advance the process NIP.
> > > >
> > > > You are right.  The code here only gives 0xffffffff to the load
> > > > instructions and
> > >
> > > continue with the next instruction when the load instruction is
> > > causing the machine check.  This will prevent a system lockup when
> > > reading from PCI/RapidIO device which is link down.
> > > >
> > > > I don't know what is actual problem in your case.  Maybe it is a
> > > > write
> > >
> > > instruction instead of read?   Or the code is in a infinite loop wait=
ing for a
> valid
> > > read result?  Are you able to do some further debugging with the NIP
> > > correctly printed?
> > > >
> > >
> > > According to the MC it is a Read and the NIP also leads to a read in =
the
> program.
> > > ATM, I have disabled the fixup but I will enable that again.
> > > Question, is it safe add a small printk when this MC happens(after
> > > fixing up)? I need to see that it has happened as the error is somewh=
at
> random.
> >
> > I think it is safe to add printk as the current machine check handlers =
are also
> using printk.
>
> I hope so, but if the fixup fires there is no printk at all so I was a bit
> unsure.
> Don't like this fixup though, is there not a better way than faking a read
> to user space(or kernel for that matter) ?

I don't have a better idea.  Without the fixup, the offending load
instruction will never finish if there is anything wrong with the backing
device, and the whole system freezes.  Do you have any suggestion in mind?
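
To make the behaviour discussed in this thread concrete, here is a minimal
user-space sketch of what such a load fixup does: decode the faulting load,
put all-ones in the destination register, and step NIP past the instruction.
The names fake_regs, insn_opcode, insn_rd and fake_load_fixup are hypothetical
stand-ins for illustration, not the kernel's own helpers, and only three
D-form loads (lwz, lhz, lbz) are modelled.

```c
#include <stdint.h>

/* Hypothetical stand-in for the few pt_regs fields the fixup touches. */
struct fake_regs {
	uint32_t gpr[32];
	uint32_t nip;
};

/* PowerPC D-form layout: primary opcode in the top 6 bits, rD in the next 5. */
static unsigned int insn_opcode(uint32_t inst) { return inst >> 26; }
static unsigned int insn_rd(uint32_t inst)     { return (inst >> 21) & 0x1f; }

#define OP_LWZ 32	/* lwz rD,d(rA) */
#define OP_LBZ 34	/* lbz rD,d(rA) */
#define OP_LHZ 40	/* lhz rD,d(rA) */

/*
 * Returns 1 if the instruction was a recognised load and has been skipped,
 * 0 if it was left alone (a real handler would then report the machine
 * check instead of hiding it).
 */
static int fake_load_fixup(struct fake_regs *regs, uint32_t inst)
{
	unsigned int rd = insn_rd(inst);

	switch (insn_opcode(inst)) {
	case OP_LWZ:
		regs->gpr[rd] = 0xffffffff;	/* pretend the bus returned all-ones */
		break;
	case OP_LHZ:
		regs->gpr[rd] = 0xffff;
		break;
	case OP_LBZ:
		regs->gpr[rd] = 0xff;
		break;
	default:
		return 0;			/* not a load this sketch knows about */
	}

	regs->nip += 4;				/* resume at the next instruction */
	return 1;
}
```

For example, feeding it 0x812a0000 (lwz r9,0(r10)) with nip = 0x10a4e2f4
leaves 0xffffffff in gpr[9] and nip at 0x10a4e2f8, while a store such as
0x912a0000 (stw r9,0(r10)) is left untouched. Note that the load is never
actually completed; the caller only sees a faked value.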

Regards,
Leo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Machine Check in P2010(e500v2)
  2017-09-06 21:13             ` Leo Li
@ 2017-09-06 22:50               ` Joakim Tjernlund
  2017-09-07  8:41                 ` Joakim Tjernlund
  0 siblings, 1 reply; 21+ messages in thread
From: Joakim Tjernlund @ 2017-09-06 22:50 UTC (permalink / raw)
  To: linuxppc-dev, leoyang.li, york.sun

On Wed, 2017-09-06 at 21:13 +0000, Leo Li wrote:
> > -----Original Message-----
> > From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> > Sent: Wednesday, September 06, 2017 3:54 PM
> > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>; York Su=
n
> > <york.sun@nxp.com>
> > Subject: Re: Machine Check in P2010(e500v2)
> >=20
> > On Wed, 2017-09-06 at 20:28 +0000, Leo Li wrote:
> > > > -----Original Message-----
> > > > From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> > > > Sent: Wednesday, September 06, 2017 3:17 PM
> > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>; Yor=
k
> > > > Sun <york.sun@nxp.com>
> > > > Subject: Re: Machine Check in P2010(e500v2)
> > > >=20
> > > > On Wed, 2017-09-06 at 19:31 +0000, Leo Li wrote:
> > > > > > -----Original Message-----
> > > > > > From: York Sun
> > > > > > Sent: Wednesday, September 06, 2017 10:38 AM
> > > > > > To: Joakim Tjernlund <Joakim.Tjernlund@infinera.com>; linuxppc-
> > > > > > dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>
> > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > >=20
> > > > > > Scott is no longer with Freescale/NXP. Adding Leo.
> > > > > >=20
> > > > > > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > > > > > So after some debugging I found this bug:
> > > > > > > @@ -996,7 +998,7 @@ int fsl_pci_mcheck_exception(struct pt_re=
gs
> >=20
> > *regs)
> > > > > > >          if (is_in_pci_mem_space(addr)) {
> > > > > > >                  if (user_mode(regs)) {
> > > > > > >                          pagefault_disable();
> > > > > > > -                       ret =3D get_user(regs->nip, &inst);
> > > > > > > +                       ret =3D get_user(inst, (__u32 __user
> > > > > > > + *)regs->nip);
> > > > > > >                          pagefault_enable();
> > > > > > >                  } else {
> > > > > > >                          ret =3D probe_kernel_address(regs->n=
ip,
> > > > > > > inst);
> > > > > > >=20
> > > > > > > However, the kernel still locked up after fixing that.
> > > > > > > Now I wonder why this fixup is there in the first place? The
> > > > > > > routine will not really fixup the insn, just return 0xfffffff=
f
> > > > > > > for the failing read and then advance the process NIP.
> > > > >=20
> > > > > You are right.  The code here only gives 0xffffffff to the load
> > > > > instructions and
> > > >=20
> > > > continue with the next instruction when the load instruction is
> > > > causing the machine check.  This will prevent a system lockup when
> > > > reading from PCI/RapidIO device which is link down.
> > > > >=20
> > > > > I don't know what is actual problem in your case.  Maybe it is a
> > > > > write
> > > >=20
> > > > instruction instead of read?   Or the code is in a infinite loop wa=
iting for a
> >=20
> > valid
> > > > read result?  Are you able to do some further debugging with the NI=
P
> > > > correctly printed?
> > > > >=20
> > > >=20
> > > > According to the MC it is a Read and the NIP also leads to a read i=
n the
> >=20
> > program.
> > > > ATM, I have disabled the fixup but I will enable that again.
> > > > Question, is it safe add a small printk when this MC happens(after
> > > > fixing up)? I need to see that it has happened as the error is some=
what
> >=20
> > random.
> > >=20
> > > I think it is safe to add printk as the current machine check handler=
s are also
> >=20
> > using printk.
> >=20
> > I hope so, but if the fixup fires there is no printk at all so I was a =
bit unsure.
> > Don't like this fixup though, is there not a better way than faking a r=
ead to user
> > space(or kernel for that matter) ?
>
> I don't have a better idea.  Without the fixup, the offending load
> instruction will never finish if there is anything wrong with the backing
> device and freeze the whole system.  Do you have any suggestion in mind?
>

But it never finishes the load, it just fakes a load of 0xffffffff. For user
space I would rather have it signal a SIGBUS, but that does not seem to work
either, at least not for us; that could be a bug in the general MC code,
though.
This fixup may only ever have been valid for the kernel, as it has never
worked for user space due to the bug I found.

Where can I read about this erratum?

 Jocke

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Machine Check in P2010(e500v2)
  2017-09-06 22:50               ` Joakim Tjernlund
@ 2017-09-07  8:41                 ` Joakim Tjernlund
  2017-09-07 18:54                   ` Leo Li
  0 siblings, 1 reply; 21+ messages in thread
From: Joakim Tjernlund @ 2017-09-07  8:41 UTC (permalink / raw)
  To: linuxppc-dev, leoyang.li, york.sun

On Thu, 2017-09-07 at 00:50 +0200, Joakim Tjernlund wrote:
> On Wed, 2017-09-06 at 21:13 +0000, Leo Li wrote:
> > > -----Original Message-----
> > > From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> > > Sent: Wednesday, September 06, 2017 3:54 PM
> > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>; York =
Sun
> > > <york.sun@nxp.com>
> > > Subject: Re: Machine Check in P2010(e500v2)
> > >=20
> > > On Wed, 2017-09-06 at 20:28 +0000, Leo Li wrote:
> > > > > -----Original Message-----
> > > > > From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> > > > > Sent: Wednesday, September 06, 2017 3:17 PM
> > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>; Y=
ork
> > > > > Sun <york.sun@nxp.com>
> > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > >=20
> > > > > On Wed, 2017-09-06 at 19:31 +0000, Leo Li wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: York Sun
> > > > > > > Sent: Wednesday, September 06, 2017 10:38 AM
> > > > > > > To: Joakim Tjernlund <Joakim.Tjernlund@infinera.com>; linuxpp=
c-
> > > > > > > dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>
> > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > >=20
> > > > > > > Scott is no longer with Freescale/NXP. Adding Leo.
> > > > > > >=20
> > > > > > > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > > > > > > So after some debugging I found this bug:
> > > > > > > > @@ -996,7 +998,7 @@ int fsl_pci_mcheck_exception(struct pt_=
regs
> > >=20
> > > *regs)
> > > > > > > >          if (is_in_pci_mem_space(addr)) {
> > > > > > > >                  if (user_mode(regs)) {
> > > > > > > >                          pagefault_disable();
> > > > > > > > -                       ret =3D get_user(regs->nip, &inst);
> > > > > > > > +                       ret =3D get_user(inst, (__u32 __use=
r
> > > > > > > > + *)regs->nip);
> > > > > > > >                          pagefault_enable();
> > > > > > > >                  } else {
> > > > > > > >                          ret =3D probe_kernel_address(regs-=
>nip,
> > > > > > > > inst);
> > > > > > > >=20
> > > > > > > > However, the kernel still locked up after fixing that.
> > > > > > > > Now I wonder why this fixup is there in the first place? Th=
e
> > > > > > > > routine will not really fixup the insn, just return 0xfffff=
fff
> > > > > > > > for the failing read and then advance the process NIP.
> > > > > >=20
> > > > > > You are right.  The code here only gives 0xffffffff to the load
> > > > > > instructions and
> > > > >=20
> > > > > continue with the next instruction when the load instruction is
> > > > > causing the machine check.  This will prevent a system lockup whe=
n
> > > > > reading from PCI/RapidIO device which is link down.
> > > > > >=20
> > > > > > I don't know what is actual problem in your case.  Maybe it is =
a
> > > > > > write
> > > > >=20
> > > > > instruction instead of read?   Or the code is in a infinite loop =
waiting for a
> > >=20
> > > valid
> > > > > read result?  Are you able to do some further debugging with the =
NIP
> > > > > correctly printed?
> > > > > >=20
> > > > >=20
> > > > > According to the MC it is a Read and the NIP also leads to a read=
 in the
> > >=20
> > > program.
> > > > > ATM, I have disabled the fixup but I will enable that again.
> > > > > Question, is it safe add a small printk when this MC happens(afte=
r
> > > > > fixing up)? I need to see that it has happened as the error is so=
mewhat
> > >=20
> > > random.
> > > >=20
> > > > I think it is safe to add printk as the current machine check handl=
ers are also
> > >=20
> > > using printk.
> > >=20
> > > I hope so, but if the fixup fires there is no printk at all so I was =
a bit unsure.
> > > Don't like this fixup though, is there not a better way than faking a=
 read to user
> > > space(or kernel for that matter) ?
> >=20
> > I don't have a better idea.  Without the fixup, the offending load inst=
ruction will never finish if there is anything wrong with the backing devic=
e and freeze the whole system.  Do you have any suggestion in mind?
> >=20
>
> But it never finishes the load, it just fakes a load of 0xfffffffff, for
> user space I rather have it signal a SIGBUS but that does not seem to work
> either, at least not for us but that could be a bug in general MC code
> maybe.
> This fixup might be valid for kernel only as it has never worked for user
> space due to the bug I found.
>
> Where can I read about this errata ?

I have looked high and low and cannot find an erratum that maps to this fixup.
The closest I get is A-005125, which seems to have a different workaround. I
cannot find any evidence that this workaround has been applied in Linux, can
you?

 Jocke

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: Machine Check in P2010(e500v2)
  2017-09-07  8:41                 ` Joakim Tjernlund
@ 2017-09-07 18:54                   ` Leo Li
  2017-09-08  9:54                     ` Joakim Tjernlund
  0 siblings, 1 reply; 21+ messages in thread
From: Leo Li @ 2017-09-07 18:54 UTC (permalink / raw)
  To: Joakim Tjernlund, linuxppc-dev, York Sun



> -----Original Message-----
> From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> Sent: Thursday, September 07, 2017 3:41 AM
> To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>; York Sun
> <york.sun@nxp.com>
> Subject: Re: Machine Check in P2010(e500v2)
>=20
> On Thu, 2017-09-07 at 00:50 +0200, Joakim Tjernlund wrote:
> > On Wed, 2017-09-06 at 21:13 +0000, Leo Li wrote:
> > > > -----Original Message-----
> > > > From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> > > > Sent: Wednesday, September 06, 2017 3:54 PM
> > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>;
> > > > York Sun <york.sun@nxp.com>
> > > > Subject: Re: Machine Check in P2010(e500v2)
> > > >
> > > > On Wed, 2017-09-06 at 20:28 +0000, Leo Li wrote:
> > > > > > -----Original Message-----
> > > > > > From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> > > > > > Sent: Wednesday, September 06, 2017 3:17 PM
> > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li
> > > > > > <leoyang.li@nxp.com>; York Sun <york.sun@nxp.com>
> > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > >
> > > > > > On Wed, 2017-09-06 at 19:31 +0000, Leo Li wrote:
> > > > > > > > -----Original Message-----
> > > > > > > > From: York Sun
> > > > > > > > Sent: Wednesday, September 06, 2017 10:38 AM
> > > > > > > > To: Joakim Tjernlund <Joakim.Tjernlund@infinera.com>;
> > > > > > > > linuxppc- dev@lists.ozlabs.org; Leo Li
> > > > > > > > <leoyang.li@nxp.com>
> > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > >
> > > > > > > > Scott is no longer with Freescale/NXP. Adding Leo.
> > > > > > > >
> > > > > > > > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > > > > > > > So after some debugging I found this bug:
> > > > > > > > > @@ -996,7 +998,7 @@ int fsl_pci_mcheck_exception(struct
> > > > > > > > > pt_regs
> > > >
> > > > *regs)
> > > > > > > > >          if (is_in_pci_mem_space(addr)) {
> > > > > > > > >                  if (user_mode(regs)) {
> > > > > > > > >                          pagefault_disable();
> > > > > > > > > -                       ret =3D get_user(regs->nip, &inst=
);
> > > > > > > > > +                       ret =3D get_user(inst, (__u32
> > > > > > > > > + __user *)regs->nip);
> > > > > > > > >                          pagefault_enable();
> > > > > > > > >                  } else {
> > > > > > > > >                          ret =3D
> > > > > > > > > probe_kernel_address(regs->nip, inst);
> > > > > > > > >
> > > > > > > > > However, the kernel still locked up after fixing that.
> > > > > > > > > Now I wonder why this fixup is there in the first place?
> > > > > > > > > The routine will not really fixup the insn, just return
> > > > > > > > > 0xffffffff for the failing read and then advance the proc=
ess NIP.
> > > > > > >
> > > > > > > You are right.  The code here only gives 0xffffffff to the
> > > > > > > load instructions and
> > > > > >
> > > > > > continue with the next instruction when the load instruction
> > > > > > is causing the machine check.  This will prevent a system
> > > > > > lockup when reading from PCI/RapidIO device which is link down.
> > > > > > >
> > > > > > > I don't know what is actual problem in your case.  Maybe it
> > > > > > > is a write
> > > > > >
> > > > > > instruction instead of read?   Or the code is in a infinite loo=
p waiting for
> a
> > > >
> > > > valid
> > > > > > read result?  Are you able to do some further debugging with
> > > > > > the NIP correctly printed?
> > > > > > >
> > > > > >
> > > > > > According to the MC it is a Read and the NIP also leads to a
> > > > > > read in the
> > > >
> > > > program.
> > > > > > ATM, I have disabled the fixup but I will enable that again.
> > > > > > Question, is it safe add a small printk when this MC
> > > > > > happens(after fixing up)? I need to see that it has happened
> > > > > > as the error is somewhat
> > > >
> > > > random.
> > > > >
> > > > > I think it is safe to add printk as the current machine check
> > > > > handlers are also
> > > >
> > > > using printk.
> > > >
> > > > I hope so, but if the fixup fires there is no printk at all so I wa=
s a bit unsure.
> > > > Don't like this fixup though, is there not a better way than
> > > > faking a read to user space(or kernel for that matter) ?
> > >
> > > I don't have a better idea.  Without the fixup, the offending load in=
struction
> will never finish if there is anything wrong with the backing device and =
freeze the
> whole system.  Do you have any suggestion in mind?
> > >
> >
> > But it never finishes the load, it just fakes a load of 0xfffffffff,
> > for user space I rather have it signal a SIGBUS but that does not seem
> > to work either, at least not for us but that could be a bug in general =
MC code
> maybe.
> > This fixup might be valid for kernel only as it has never worked for us=
er space
> due to the bug I found.
> >
> > Where can I read about this errata ?
>
> I have look high and low an cannot find an errata which maps to this fixup.
> The closest I get is A-005125 which seems to have another workaround, I
> cannot find any evidence that this workaround has been applied in Linux,
> can you?

This is not A-005125.  There was an erratum for this issue with older silicon
(e.g. erratum PCI-ex 3 for MPC8572):

" When its link goes down, the PCI Express controller clears all outstanding
transactions with an error indicator and sends a link down exception to the
interrupt controller if PEX_PME_MES_DISR[LDDD] = 0. If, however, any
transactions are sent to the controller after the link down event, they are
accepted by the controller and wait for the link to come back up before
starting any timeout counters (for example, completion timeout). There is no
mechanism to cancel the new transactions short of a device HRESET. "

But it was removed for newer silicon like P2020/P2010, probably because a
Machine Check is triggered in this situation to deal with the stalled
instruction, so it is no longer considered a hardware issue.

A-005125 is dealt with in u-boot:
https://lists.denx.de/pipermail/u-boot/2013-August/161185.html

Regards,
Leo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Machine Check in P2010(e500v2)
  2017-09-06 10:16   ` Joakim Tjernlund
@ 2017-09-08  1:56     ` Scott Wood
  0 siblings, 0 replies; 21+ messages in thread
From: Scott Wood @ 2017-09-08  1:56 UTC (permalink / raw)
  To: Joakim Tjernlund, linuxppc-dev, laurentiu.tudor

On Wed, 2017-09-06 at 10:16 +0000, Joakim Tjernlund wrote:
> On Wed, 2017-09-06 at 10:05 +0000, Laurentiu Tudor wrote:
> > Hi Jocke,
> > 
> > On 09/01/2017 02:32 PM, Joakim Tjernlund wrote:
> > > I am trying to debug a Machine Check for a P2010 (e500v2) CPU:
> > > 
> > > [   28.111816] Caused by (from MCSR=10008): Bus - Read Data Bus Error
> > > [   28.117998] Oops: Machine check, sig: 7 [#1]
> > > [   28.122263] P1010 RDB
> > > [   28.124529] Modules linked in: linux_bcm_knet(PO) linux_user_bde(PO)
> > > linux_kernel_bde(PO)
> > > [   28.132718] CPU: 0 PID: 470 Comm: emxp2_hw_bl Tainted:
> > > P           O    4.1.38+ #49
> > > [   28.140376] task: db16cd10 ti: df128000 task.ti: df128000
> > > [   28.145770] NIP: 00000000 LR: 10a4e404 CTR: 10046c38
> > > [   28.150730] REGS: df129f10 TRAP: 0204   Tainted:
> > > P           O     (4.1.38+)
> > > [   28.157776] MSR: 0002d000 <CE,EE,PR,ME>  CR: 44002428  XER: 00000000
> > > [   28.164140] DEAR: b7187000 ESR: 00000000
> > > GPR00: 10a4e404 bf86ea30 b7ca94a0 132f9fa8 07006000 07000000 00000000
> > > 132f9fd8
> > > GPR08: b7149000 b7159000 0003e000 bf86ea20 24004424 11d6cf7c 00000000
> > > 00000000
> > > GPR16: 10f6e29c 10f6c872 10f6db01 0000b541 0000b541 11d92fcc 00000011
> > > 00000001
> > > GPR24: 01a4d12d 132ffbf0 11d60000 00000000 07006000 00000000 132f9fa8
> > > 00000000
> > > [   28.196375] NIP [00000000]   (null)
> > > [   28.199859] LR [10a4e404] 0x10a4e404
> > > [   28.203426] Call Trace:
> > > [   28.205866] ---[ end trace f456255ddf9bee83 ]---
> > > 
> > > I cannot figure out why NIP is NULL ? It LOOKs like NIP is set to
> > > MCSRR0 early on but maybe it is lost somehow?
> > > 
> > > Anyhow, looking at entry_32.S:
> > > 	.globl	mcheck_transfer_to_handler
> > > mcheck_transfer_to_handler:
> > > 	mfspr	r0,SPRN_DSRR0
> > > 	stw	r0,_DSRR0(r11)
> > > 	mfspr	r0,SPRN_DSRR1
> > > 	stw	r0,_DSRR1(r11)
> > > 	/* fall through */
> > > 
> > > 	.globl	debug_transfer_to_handler
> > > debug_transfer_to_handler:
> > > 	mfspr	r0,SPRN_CSRR0
> > > 	stw	r0,_CSRR0(r11)
> > > 	mfspr	r0,SPRN_CSRR1
> > > 	stw	r0,_CSRR1(r11)
> > > 	/* fall through */
> > > 
> > > 	.globl	crit_transfer_to_handler
> > > crit_transfer_to_handler:
> > > 
> > > It looks odd that DSRRx is assigned in mcheck and CSRRx in debug and
> > > crit has none. Should not this assigment be shifted down one level?
> > > 
> > 
> > This does indeed looks weird. Have you tried moving the SPRN_CSRR* 
> > saving in the crit section? Any results?
> 
> After looking at this somwhat I think this is intentional and OK.
> I sorted NIP == NULL too:
> @@ -996,7 +998,7 @@ int fsl_pci_mcheck_exception(struct pt_regs *regs)
>         if (is_in_pci_mem_space(addr)) {
>                 if (user_mode(regs)) {
>                         pagefault_disable();
> -                       ret = get_user(regs->nip, &inst);
> +                       ret = get_user(inst, (__u32 __user *)regs->nip);
>                         pagefault_enable();
>                 } else {
>                         ret = probe_kernel_address(regs->nip, inst);

:-(
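
For context on why the swapped arguments matter: get_user(x, ptr) takes the
destination variable first and the user-space pointer second. The sketch
below is a user-space model only; mock_get_user(), MOCK_USER_BASE and the
other mock_* names are hypothetical and are not kernel APIs. It assumes that
a failed access check leaves zero in the destination, which would be
consistent with the NIP [00000000] in the first oops. That last link is an
inference, not something confirmed in this thread.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
#define MOCK_EFAULT    14
#define MOCK_USER_BASE 0x10a4e2f4u   /* pretend user text address */

struct mock_regs {
    uint32_t nip;                    /* stands in for pt_regs->nip */
};

/* Pretend user memory: one instruction word followed by padding. */
static const uint32_t mock_user_mem[4] = { 0x812a0000u };

/*
 * Model of get_user(dst, src): read a word from a "user" address.
 * Assumption: on a failed range check the destination is zeroed and
 * -EFAULT is returned.
 */
static int mock_get_user(uint32_t *dst, uint32_t src_addr)
{
    if (src_addr < MOCK_USER_BASE ||
        src_addr >= MOCK_USER_BASE + sizeof(mock_user_mem)) {
        *dst = 0;
        return -MOCK_EFAULT;
    }
    *dst = mock_user_mem[(src_addr - MOCK_USER_BASE) / 4];
    return 0;
}

/*
 * Buggy order: regs->nip is the destination and the (kernel) address of
 * the local variable "inst" is treated as the user pointer.
 */
static int fetch_insn_buggy(struct mock_regs *regs, uint32_t inst_kaddr)
{
    return mock_get_user(&regs->nip, inst_kaddr);
}

/* Fixed order: inst is the destination, regs->nip is the user pointer. */
static int fetch_insn_fixed(const struct mock_regs *regs, uint32_t *inst)
{
    return mock_get_user(inst, regs->nip);
}
```

In this model, calling fetch_insn_buggy() with a kernel-looking address such
as 0xdf129e00 fails the range check and overwrites regs->nip with 0, while
fetch_insn_fixed() leaves NIP alone and returns the instruction word.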

> 
> But after this, the CPU is still locked after a Machine Check. Is this
> to be expected? I figured the user space process would get a SIGBUS and
> the kernel would resume normal operations.
> 
> Scott, maybe you have some idea?

The userspace process should exit with SIGBUS (not quite the same as receiving
a SIGBUS that can be handled).  Maybe whatever is causing the machine check
ends up causing more problems that lead to the hang.

-Scott

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Machine Check in P2010(e500v2)
  2017-09-07 18:54                   ` Leo Li
@ 2017-09-08  9:54                     ` Joakim Tjernlund
  2017-09-08 12:50                       ` Joakim Tjernlund
  0 siblings, 1 reply; 21+ messages in thread
From: Joakim Tjernlund @ 2017-09-08  9:54 UTC (permalink / raw)
  To: linuxppc-dev, leoyang.li, york.sun

On Thu, 2017-09-07 at 18:54 +0000, Leo Li wrote:
> > -----Original Message-----
> > From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> > Sent: Thursday, September 07, 2017 3:41 AM
> > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>; York Su=
n
> > <york.sun@nxp.com>
> > Subject: Re: Machine Check in P2010(e500v2)
> >=20
> > On Thu, 2017-09-07 at 00:50 +0200, Joakim Tjernlund wrote:
> > > On Wed, 2017-09-06 at 21:13 +0000, Leo Li wrote:
> > > > > -----Original Message-----
> > > > > From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> > > > > Sent: Wednesday, September 06, 2017 3:54 PM
> > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>;
> > > > > York Sun <york.sun@nxp.com>
> > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > >=20
> > > > > On Wed, 2017-09-06 at 20:28 +0000, Leo Li wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> > > > > > > Sent: Wednesday, September 06, 2017 3:17 PM
> > > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li
> > > > > > > <leoyang.li@nxp.com>; York Sun <york.sun@nxp.com>
> > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > >=20
> > > > > > > On Wed, 2017-09-06 at 19:31 +0000, Leo Li wrote:
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: York Sun
> > > > > > > > > Sent: Wednesday, September 06, 2017 10:38 AM
> > > > > > > > > To: Joakim Tjernlund <Joakim.Tjernlund@infinera.com>;
> > > > > > > > > linuxppc- dev@lists.ozlabs.org; Leo Li
> > > > > > > > > <leoyang.li@nxp.com>
> > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > >=20
> > > > > > > > > Scott is no longer with Freescale/NXP. Adding Leo.
> > > > > > > > >=20
> > > > > > > > > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > > > > > > > > So after some debugging I found this bug:
> > > > > > > > > > @@ -996,7 +998,7 @@ int fsl_pci_mcheck_exception(struct
> > > > > > > > > > pt_regs
> > > > >=20
> > > > > *regs)
> > > > > > > > > >          if (is_in_pci_mem_space(addr)) {
> > > > > > > > > >                  if (user_mode(regs)) {
> > > > > > > > > >                          pagefault_disable();
> > > > > > > > > > -                       ret =3D get_user(regs->nip, &in=
st);
> > > > > > > > > > +                       ret =3D get_user(inst, (__u32
> > > > > > > > > > + __user *)regs->nip);
> > > > > > > > > >                          pagefault_enable();
> > > > > > > > > >                  } else {
> > > > > > > > > >                          ret =3D
> > > > > > > > > > probe_kernel_address(regs->nip, inst);
> > > > > > > > > >=20
> > > > > > > > > > However, the kernel still locked up after fixing that.
> > > > > > > > > > Now I wonder why this fixup is there in the first place=
?
> > > > > > > > > > The routine will not really fixup the insn, just return
> > > > > > > > > > 0xffffffff for the failing read and then advance the pr=
ocess NIP.
> > > > > > > >=20
> > > > > > > > You are right.  The code here only gives 0xffffffff to the
> > > > > > > > load instructions and
> > > > > > >=20
> > > > > > > continue with the next instruction when the load instruction
> > > > > > > is causing the machine check.  This will prevent a system
> > > > > > > lockup when reading from PCI/RapidIO device which is link dow=
n.
> > > > > > > >=20
> > > > > > > > I don't know what is actual problem in your case.  Maybe it
> > > > > > > > is a write
> > > > > > >=20
> > > > > > > instruction instead of read?   Or the code is in a infinite l=
oop waiting for
> >=20
> > a
> > > > >=20
> > > > > valid
> > > > > > > read result?  Are you able to do some further debugging with
> > > > > > > the NIP correctly printed?
> > > > > > > >=20
> > > > > > >=20
> > > > > > > According to the MC it is a Read and the NIP also leads to a
> > > > > > > read in the
> > > > >=20
> > > > > program.
> > > > > > > ATM, I have disabled the fixup but I will enable that again.
> > > > > > > Question, is it safe add a small printk when this MC
> > > > > > > happens(after fixing up)? I need to see that it has happened
> > > > > > > as the error is somewhat
> > > > >=20
> > > > > random.
> > > > > >=20
> > > > > > I think it is safe to add printk as the current machine check
> > > > > > handlers are also
> > > > >=20
> > > > > using printk.
> > > > >=20
> > > > > I hope so, but if the fixup fires there is no printk at all so I =
was a bit unsure.
> > > > > Don't like this fixup though, is there not a better way than
> > > > > faking a read to user space(or kernel for that matter) ?
> > > >=20
> > > > I don't have a better idea.  Without the fixup, the offending load =
instruction
> >=20
> > will never finish if there is anything wrong with the backing device an=
d freeze the
> > whole system.  Do you have any suggestion in mind?
> > > >=20
> > >=20
> > > But it never finishes the load, it just fakes a load of 0xfffffffff,
> > > for user space I rather have it signal a SIGBUS but that does not see=
m
> > > to work either, at least not for us but that could be a bug in genera=
l MC code
> >=20
> > maybe.
> > > This fixup might be valid for kernel only as it has never worked for =
user space
> >=20
> > due to the bug I found.
> > >=20
> > > Where can I read about this errata ?
> >=20
> > I have look high and low an cannot find an errata which maps to this fi=
xup.
> > The closest I get is A-005125 which seems to have another workaround, I=
 cannot
> > find any evidence that this workaround has been applied in Linux, can y=
ou?
>
> This is not A-005125.  There was an erratum for this issue with older silicons
> (e.g. erratum PCI-ex 3 for MPC8572).
> " When its link goes down, the PCI Express controller clears all outstanding
> transactions with an error indicator and sends a link down exception to the
> interrupt controller if PEX_PME_MES_DISR[LDDD] = 0. If, however, any
> transactions are sent to the controller after the link down event, they are
> accepted by the controller and wait for the link to come back up before
> starting any timeout counters (for example, completion timeout). There is no
> mechanism to cancel the new transactions short of a device HRESET. "
>
> But it was removed in newer silicon like P2020/P2010 probably because a
> Machine Check will be triggered in this situation to deal with the stalled
> instruction and no longer considered it as a hardware issue.
>

Maybe this fixup should be configurable then?

> The A-005125 is dealt with in u-boot.
> https://lists.denx.de/pipermail/u-boot/2013-August/161185.html

Yes, I found it eventually :)

However, I cannot return to normal execution. I can follow the code to
returning from machine_check_exception() and moving into the ASM handler
for returning from a ME, but then I am a bit lost. There does not seem to
be any problem executing; it feels more like a SW bug in dealing with
machine checks. I don't know how to diagnose this further and could use
some pointers.

 Jocke

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Machine Check in P2010(e500v2)
  2017-09-08  9:54                     ` Joakim Tjernlund
@ 2017-09-08 12:50                       ` Joakim Tjernlund
  2017-09-08 22:27                         ` Leo Li
  0 siblings, 1 reply; 21+ messages in thread
From: Joakim Tjernlund @ 2017-09-08 12:50 UTC (permalink / raw)
  To: linuxppc-dev, leoyang.li, york.sun

On Fri, 2017-09-08 at 11:54 +0200, Joakim Tjernlund wrote:
> On Thu, 2017-09-07 at 18:54 +0000, Leo Li wrote:
> > > -----Original Message-----
> > > From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> > > Sent: Thursday, September 07, 2017 3:41 AM
> > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>; York =
Sun
> > > <york.sun@nxp.com>
> > > Subject: Re: Machine Check in P2010(e500v2)
> > >=20
> > > On Thu, 2017-09-07 at 00:50 +0200, Joakim Tjernlund wrote:
> > > > On Wed, 2017-09-06 at 21:13 +0000, Leo Li wrote:
> > > > > > -----Original Message-----
> > > > > > From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> > > > > > Sent: Wednesday, September 06, 2017 3:54 PM
> > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>;
> > > > > > York Sun <york.sun@nxp.com>
> > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > >=20
> > > > > > On Wed, 2017-09-06 at 20:28 +0000, Leo Li wrote:
> > > > > > > > -----Original Message-----
> > > > > > > > From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.co=
m]
> > > > > > > > Sent: Wednesday, September 06, 2017 3:17 PM
> > > > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li
> > > > > > > > <leoyang.li@nxp.com>; York Sun <york.sun@nxp.com>
> > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > >=20
> > > > > > > > On Wed, 2017-09-06 at 19:31 +0000, Leo Li wrote:
> > > > > > > > > > -----Original Message-----
> > > > > > > > > > From: York Sun
> > > > > > > > > > Sent: Wednesday, September 06, 2017 10:38 AM
> > > > > > > > > > To: Joakim Tjernlund <Joakim.Tjernlund@infinera.com>;
> > > > > > > > > > linuxppc- dev@lists.ozlabs.org; Leo Li
> > > > > > > > > > <leoyang.li@nxp.com>
> > > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > > >=20
> > > > > > > > > > Scott is no longer with Freescale/NXP. Adding Leo.
> > > > > > > > > >=20
> > > > > > > > > > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > > > > > > > > > So after some debugging I found this bug:
> > > > > > > > > > > @@ -996,7 +998,7 @@ int fsl_pci_mcheck_exception(stru=
ct
> > > > > > > > > > > pt_regs
> > > > > >=20
> > > > > > *regs)
> > > > > > > > > > >          if (is_in_pci_mem_space(addr)) {
> > > > > > > > > > >                  if (user_mode(regs)) {
> > > > > > > > > > >                          pagefault_disable();
> > > > > > > > > > > -                       ret =3D get_user(regs->nip, &=
inst);
> > > > > > > > > > > +                       ret =3D get_user(inst, (__u32
> > > > > > > > > > > + __user *)regs->nip);
> > > > > > > > > > >                          pagefault_enable();
> > > > > > > > > > >                  } else {
> > > > > > > > > > >                          ret =3D
> > > > > > > > > > > probe_kernel_address(regs->nip, inst);
> > > > > > > > > > >=20
> > > > > > > > > > > However, the kernel still locked up after fixing that=
.
> > > > > > > > > > > Now I wonder why this fixup is there in the first pla=
ce?
> > > > > > > > > > > The routine will not really fixup the insn, just retu=
rn
> > > > > > > > > > > 0xffffffff for the failing read and then advance the =
process NIP.
> > > > > > > > >=20
> > > > > > > > > You are right.  The code here only gives 0xffffffff to th=
e
> > > > > > > > > load instructions and
> > > > > > > >=20
> > > > > > > > continue with the next instruction when the load instructio=
n
> > > > > > > > is causing the machine check.  This will prevent a system
> > > > > > > > lockup when reading from PCI/RapidIO device which is link d=
own.
> > > > > > > > >=20
> > > > > > > > > I don't know what is actual problem in your case.  Maybe =
it
> > > > > > > > > is a write
> > > > > > > >=20
> > > > > > > > instruction instead of read?   Or the code is in a infinite=
 loop waiting for
> > >=20
> > > a
> > > > > >=20
> > > > > > valid
> > > > > > > > read result?  Are you able to do some further debugging wit=
h
> > > > > > > > the NIP correctly printed?
> > > > > > > > >=20
> > > > > > > >=20
> > > > > > > > According to the MC it is a Read and the NIP also leads to =
a
> > > > > > > > read in the
> > > > > >=20
> > > > > > program.
> > > > > > > > ATM, I have disabled the fixup but I will enable that again=
.
> > > > > > > > Question, is it safe add a small printk when this MC
> > > > > > > > happens(after fixing up)? I need to see that it has happene=
d
> > > > > > > > as the error is somewhat
> > > > > >=20
> > > > > > random.
> > > > > > >=20
> > > > > > > I think it is safe to add printk as the current machine check
> > > > > > > handlers are also
> > > > > >=20
> > > > > > using printk.
> > > > > >=20
> > > > > > I hope so, but if the fixup fires there is no printk at all so =
I was a bit unsure.
> > > > > > Don't like this fixup though, is there not a better way than
> > > > > > faking a read to user space(or kernel for that matter) ?
> > > > >=20
> > > > > I don't have a better idea.  Without the fixup, the offending loa=
d instruction
> > >=20
> > > will never finish if there is anything wrong with the backing device =
and freeze the
> > > whole system.  Do you have any suggestion in mind?
> > > > >=20
> > > >=20
> > > > But it never finishes the load, it just fakes a load of 0xfffffffff=
,
> > > > for user space I rather have it signal a SIGBUS but that does not s=
eem
> > > > to work either, at least not for us but that could be a bug in gene=
ral MC code
> > >=20
> > > maybe.
> > > > This fixup might be valid for kernel only as it has never worked fo=
r user space
> > >=20
> > > due to the bug I found.
> > > >=20
> > > > Where can I read about this errata ?
> > >=20
> > > I have look high and low an cannot find an errata which maps to this =
fixup.
> > > The closest I get is A-005125 which seems to have another workaround,=
 I cannot
> > > find any evidence that this workaround has been applied in Linux, can=
 you?
> >=20
> > This is not A-005125.  There was an erratum for this issue with older s=
ilicons (e.g. erratum PCI-ex 3 for MPC8572). =20
> > " When its link goes down, the PCI Express controller clears all outsta=
nding transactions with an
> > error indicator and sends a link down exception to the interrupt contro=
ller if
> > PEX_PME_MES_DISR[LDDD] =3D 0. If, however, any transactions are sent to=
 the controller after
> > the link down event, they are accepted by the controller and wait for t=
he link to come back up
> > before starting any timeout counters (for example, completion timeout).=
 There is no mechanism to
> > cancel the new transactions short of a device HRESET. "
> >=20
> > But it was removed in newer silicon like P2020/P2010 probably because a=
 Machine Check will be triggered in this situation to deal with the stalled=
 instruction and no longer considered it as a hardware issue.
> >=20
>=20
> Maybe this fixup should be configurable then?
>=20
> > The A-005125 is dealt with in u-boot.   https://lists.denx.de/pipermail=
/u-boot/2013-August/161185.html
>=20
> Yes, I found it eventually :)
>=20
> However, I cannot return to normal execution. I can follow the code to re=
turning from
> machine_check_exception() and moving into ASM handler for returning from =
a ME but then I
> am a bit lost. It does not seem to be any problem executing, it feels mor=
e like a SW bug
> dealing with machine checks. Don't known how to diagnose this further and=
 could use some pointers.
>=20
>  Jocke

I note that MSR_RI is not set in MSR, can that be a clue?

[   28.118737] Machine check in kernel mode.
[   28.122751] Caused by (from MCSR=10008): Bus - Read Data Bus Error: DAR:b6f02000
[   28.133106] Oops: Machine check, sig: 7 [#1]
[   28.137370] P2010 RDB
[   28.139636] Modules linked in: linux_bcm_knet(PO) linux_user_bde(PO) linux_kernel_bde(PO)
[   28.147826] CPU: 0 PID: 470 Comm: emxp2_hw_bl Tainted: P           O    4.1.38+ #206
[   28.155570] task: db16cd10 ti: df12a000 task.ti: df12a000
[   28.160964] NIP: 10a4e2f4 LR: 10a4e404 CTR: 10046c38
[   28.165925] REGS: df12bf10 TRAP: 0204   Tainted: P           O     (4.1.38+)
[   28.172971] MSR: 0002d000 <CE,EE,PR,ME>  CR: 44002428  XER: 00000000
[   28.179336] DEAR: b6f02000 ESR: 00000000
GPR00: 10a4e404 bff8cc90 b7a244a0 132f9fa8 07006000 07000000 00000000 132f9fd8
GPR08: b6ec4000 b6ed4000 0003e000 bff8cc80 24004424 11d6cf7c 00000000 00000000
GPR16: 10f6e29c 10f6c872 10f6db01 0000b541 0000b541 11d92fcc 00000011 00000001
GPR24: 01a5048d 132ffbf0 11d60000 00000000 07006000 00000000 132f9fa8 00000000
[   28.211576] NIP [10a4e2f4] 0x10a4e2f4
[   28.215233] LR [10a4e404] 0x10a4e404
[   28.218802] Call Trace:
[   28.221243] ---[ end trace bc4afbb242721e8a ]---

Finally, I am on kernel 4.1.43
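
To make the MSR value in the dump easier to check by hand, here is a small
decoding sketch. The bit masks are the Book-E values as I recall them from
the Linux powerpc headers (CE, EE, PR, ME, DE), plus the generic MSR_RI
mask of 0x2; treat them as assumptions and verify against the headers of
the kernel in use. Whether RI is implemented on e500v2 at all is not
established in this thread. decode_msr() and the table are hypothetical
helper names, not kernel code.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed Book-E MSR bit masks; verify against the kernel headers. */
#define SKETCH_MSR_CE 0x00020000u  /* critical interrupt enable */
#define SKETCH_MSR_EE 0x00008000u  /* external interrupt enable */
#define SKETCH_MSR_PR 0x00004000u  /* problem state (user mode) */
#define SKETCH_MSR_ME 0x00001000u  /* machine check enable */
#define SKETCH_MSR_DE 0x00000200u  /* debug interrupt enable */
#define SKETCH_MSR_RI 0x00000002u  /* generic "recoverable" bit (assumption) */

struct msr_bit {
    uint32_t mask;
    const char *name;
};

static const struct msr_bit msr_bits[] = {
    { SKETCH_MSR_CE, "CE" },
    { SKETCH_MSR_EE, "EE" },
    { SKETCH_MSR_PR, "PR" },
    { SKETCH_MSR_ME, "ME" },
    { SKETCH_MSR_DE, "DE" },
    { SKETCH_MSR_RI, "RI" },
};

/* Write a comma-separated list of the set bits into out (NUL-terminated). */
static void decode_msr(uint32_t msr, char *out, size_t len)
{
    size_t i, used = 0;

    if (len == 0)
        return;
    out[0] = '\0';
    for (i = 0; i < sizeof(msr_bits) / sizeof(msr_bits[0]); i++) {
        if (!(msr & msr_bits[i].mask))
            continue;
        used += (size_t)snprintf(out + used, len - used, "%s%s",
                                 used ? "," : "", msr_bits[i].name);
        if (used >= len)
            break;  /* output truncated; stop appending */
    }
}
```

Feeding 0x0002d000 from the dump through decode_msr() gives the same
CE,EE,PR,ME list the kernel printed, and the 0x2 bit is clear.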

 Jocke

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: Machine Check in P2010(e500v2)
  2017-09-08 12:50                       ` Joakim Tjernlund
@ 2017-09-08 22:27                         ` Leo Li
  2017-09-09 12:45                           ` Joakim Tjernlund
  0 siblings, 1 reply; 21+ messages in thread
From: Leo Li @ 2017-09-08 22:27 UTC (permalink / raw)
  To: Joakim Tjernlund, linuxppc-dev, York Sun



> -----Original Message-----
> From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> Sent: Friday, September 08, 2017 7:51 AM
> To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>; York Sun
> <york.sun@nxp.com>
> Subject: Re: Machine Check in P2010(e500v2)
>=20
> On Fri, 2017-09-08 at 11:54 +0200, Joakim Tjernlund wrote:
> > On Thu, 2017-09-07 at 18:54 +0000, Leo Li wrote:
> > > > -----Original Message-----
> > > > From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> > > > Sent: Thursday, September 07, 2017 3:41 AM
> > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>;
> > > > York Sun <york.sun@nxp.com>
> > > > Subject: Re: Machine Check in P2010(e500v2)
> > > >
> > > > On Thu, 2017-09-07 at 00:50 +0200, Joakim Tjernlund wrote:
> > > > > On Wed, 2017-09-06 at 21:13 +0000, Leo Li wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: Joakim Tjernlund
> > > > > > > [mailto:Joakim.Tjernlund@infinera.com]
> > > > > > > Sent: Wednesday, September 06, 2017 3:54 PM
> > > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li
> > > > > > > <leoyang.li@nxp.com>; York Sun <york.sun@nxp.com>
> > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > >
> > > > > > > On Wed, 2017-09-06 at 20:28 +0000, Leo Li wrote:
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Joakim Tjernlund
> > > > > > > > > [mailto:Joakim.Tjernlund@infinera.com]
> > > > > > > > > Sent: Wednesday, September 06, 2017 3:17 PM
> > > > > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li
> > > > > > > > > <leoyang.li@nxp.com>; York Sun <york.sun@nxp.com>
> > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > >
> > > > > > > > > On Wed, 2017-09-06 at 19:31 +0000, Leo Li wrote:
> > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > From: York Sun
> > > > > > > > > > > Sent: Wednesday, September 06, 2017 10:38 AM
> > > > > > > > > > > To: Joakim Tjernlund
> > > > > > > > > > > <Joakim.Tjernlund@infinera.com>;
> > > > > > > > > > > linuxppc- dev@lists.ozlabs.org; Leo Li
> > > > > > > > > > > <leoyang.li@nxp.com>
> > > > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > > > >
> > > > > > > > > > > Scott is no longer with Freescale/NXP. Adding Leo.
> > > > > > > > > > >
> > > > > > > > > > > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > > > > > > > > > > So after some debugging I found this bug:
> > > > > > > > > > > > @@ -996,7 +998,7 @@ int
> > > > > > > > > > > > fsl_pci_mcheck_exception(struct pt_regs
> > > > > > >
> > > > > > > *regs)
> > > > > > > > > > > >          if (is_in_pci_mem_space(addr)) {
> > > > > > > > > > > >                  if (user_mode(regs)) {
> > > > > > > > > > > >                          pagefault_disable();
> > > > > > > > > > > > -                       ret =3D get_user(regs->nip,=
 &inst);
> > > > > > > > > > > > +                       ret =3D get_user(inst,
> > > > > > > > > > > > + (__u32 __user *)regs->nip);
> > > > > > > > > > > >                          pagefault_enable();
> > > > > > > > > > > >                  } else {
> > > > > > > > > > > >                          ret =3D
> > > > > > > > > > > > probe_kernel_address(regs->nip, inst);
> > > > > > > > > > > >
> > > > > > > > > > > > However, the kernel still locked up after fixing th=
at.
> > > > > > > > > > > > Now I wonder why this fixup is there in the first p=
lace?
> > > > > > > > > > > > The routine will not really fixup the insn, just
> > > > > > > > > > > > return 0xffffffff for the failing read and then adv=
ance the
> process NIP.
> > > > > > > > > >
> > > > > > > > > > You are right.  The code here only gives 0xffffffff to
> > > > > > > > > > the load instructions and
> > > > > > > > >
> > > > > > > > > continue with the next instruction when the load
> > > > > > > > > instruction is causing the machine check.  This will
> > > > > > > > > prevent a system lockup when reading from PCI/RapidIO dev=
ice
> which is link down.
> > > > > > > > > >
> > > > > > > > > > I don't know what is actual problem in your case.
> > > > > > > > > > Maybe it is a write
> > > > > > > > >
> > > > > > > > > instruction instead of read?   Or the code is in a infini=
te loop
> waiting for
> > > >
> > > > a
> > > > > > >
> > > > > > > valid
> > > > > > > > > read result?  Are you able to do some further debugging
> > > > > > > > > with the NIP correctly printed?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > According to the MC it is a Read and the NIP also leads
> > > > > > > > > to a read in the
> > > > > > >
> > > > > > > program.
> > > > > > > > > ATM, I have disabled the fixup but I will enable that aga=
in.
> > > > > > > > > Question, is it safe add a small printk when this MC
> > > > > > > > > happens(after fixing up)? I need to see that it has
> > > > > > > > > happened as the error is somewhat
> > > > > > >
> > > > > > > random.
> > > > > > > >
> > > > > > > > I think it is safe to add printk as the current machine
> > > > > > > > check handlers are also
> > > > > > >
> > > > > > > using printk.
> > > > > > >
> > > > > > > I hope so, but if the fixup fires there is no printk at all s=
o I was a bit
> unsure.
> > > > > > > Don't like this fixup though, is there not a better way than
> > > > > > > faking a read to user space(or kernel for that matter) ?
> > > > > >
> > > > > > I don't have a better idea.  Without the fixup, the offending
> > > > > > load instruction
> > > >
> > > > will never finish if there is anything wrong with the backing
> > > > device and freeze the whole system.  Do you have any suggestion in =
mind?
> > > > > >
> > > > >
> > > > > But it never finishes the load, it just fakes a load of
> > > > > 0xfffffffff, for user space I rather have it signal a SIGBUS but
> > > > > that does not seem to work either, at least not for us but that
> > > > > could be a bug in general MC code
> > > >
> > > > maybe.
> > > > > This fixup might be valid for kernel only as it has never worked
> > > > > for user space
> > > >
> > > > due to the bug I found.
> > > > >
> > > > > Where can I read about this errata ?
> > > >
> > > > I have look high and low an cannot find an errata which maps to thi=
s fixup.
> > > > The closest I get is A-005125 which seems to have another
> > > > workaround, I cannot find any evidence that this workaround has bee=
n
> applied in Linux, can you?
> > >
> > > This is not A-005125.  There was an erratum for this issue with older
> > > silicons (e.g. erratum PCI-ex 3 for MPC8572).
> > > " When its link goes down, the PCI Express controller clears all
> > > outstanding transactions with an error indicator and sends a link
> > > down exception to the interrupt controller if PEX_PME_MES_DISR[LDDD]
> > > = 0. If, however, any transactions are sent to the controller after
> > > the link down event, they are accepted by the controller and wait
> > > for the link to come back up before starting any timeout counters (for
> > > example, completion timeout). There is no mechanism to cancel the new
> > > transactions short of a device HRESET. "
> > >
> > > But it was removed in newer silicon like P2020/P2010 probably because a
> > > Machine Check will be triggered in this situation to deal with the stalled
> > > instruction and no longer considered it as a hardware issue.
> > >
> >
> > Maybe this fixup should be configurable then?

No.  My point is that the problem was no longer considered a hardware issue
because the machine check mechanism is in place to handle it.  If there is
no handling of this special case, we would still experience a system hang
if this situation really occurs.
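
As an illustration of the special-case handling discussed in this thread
(hand 0xffffffff to the faulting load, then continue with the next
instruction), here is a simplified sketch. It only models the D-form loads
lwz, lbz and lhz, and the struct and function names (sketch_regs,
sketch_handle_load) are hypothetical; it is not the code in fsl_pci.c and
leaves out the indexed and update forms a real handler would need.

```c
#include <stdint.h>

/* Hypothetical stand-in for the saved register frame. */
struct sketch_regs {
    uint32_t gpr[32];
    uint32_t nip;
};

/* PowerPC primary opcodes for the loads modelled here. */
#define OPCD_LWZ 32u
#define OPCD_LBZ 34u
#define OPCD_LHZ 40u

/* Primary opcode: top 6 bits of the instruction word. */
static uint32_t ppc_opcd(uint32_t inst)
{
    return inst >> 26;
}

/* Target register field RT: the next 5 bits. */
static uint32_t ppc_rt(uint32_t inst)
{
    return (inst >> 21) & 0x1fu;
}

/*
 * If inst is one of the modelled loads, fake an all-ones result of the
 * load's width in the target register and step past the instruction.
 * Returns 1 when the load was handled, 0 otherwise.
 */
static int sketch_handle_load(struct sketch_regs *regs, uint32_t inst)
{
    uint32_t rt = ppc_rt(inst);

    switch (ppc_opcd(inst)) {
    case OPCD_LWZ:
        regs->gpr[rt] = 0xffffffffu;
        break;
    case OPCD_LHZ:
        regs->gpr[rt] = 0xffffu;
        break;
    case OPCD_LBZ:
        regs->gpr[rt] = 0xffu;
        break;
    default:
        return 0;       /* not a load this sketch knows about */
    }
    regs->nip += 4;     /* resume at the instruction after the load */
    return 1;
}
```

For example, 0x812a0000 encodes "lwz r9,0(r10)": the sketch sets r9 to
0xffffffff and advances NIP by 4. A store such as 0x912a0000
("stw r9,0(r10)") is left alone and returns 0.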

> >
> > > The A-005125 is dealt with in u-boot.
> > > https://lists.denx.de/pipermail/u-boot/2013-August/161185.html
> >
> > Yes, I found it eventually :)
> >
> > However, I cannot return to normal execution. I can follow the code to
> > returning from
> > machine_check_exception() and moving into ASM handler for returning
> > from a ME but then I am a bit lost. It does not seem to be any problem
> > executing, it feels more like a SW bug dealing with machine checks. Don't
> > known how to diagnose this further and could use some pointers.

Is the execution returned to the user application?  I doubt the system hang
is caused by the machine check handling.  You can try commenting out the
machine check handling code to see whether anything improves and whether
the hang is related to the machine check handling at all.

A machine check is a serious situation and it is not always possible to
recover from it.  I would focus more on debugging why the machine check is
triggered by the user space application.  Can you locate what code is
causing this machine check from user space?  Is it accessing some hardware
related space which is not ready?  Or is it accessing an address that it
shouldn't have accessed?

Regards,
Leo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Machine Check in P2010(e500v2)
  2017-09-08 22:27                         ` Leo Li
@ 2017-09-09 12:45                           ` Joakim Tjernlund
       [not found]                             ` <1504961965.31322.72.camel@infinera.com>
  2017-09-20 16:45                             ` Joakim Tjernlund
  0 siblings, 2 replies; 21+ messages in thread
From: Joakim Tjernlund @ 2017-09-09 12:45 UTC (permalink / raw)
  To: linuxppc-dev, leoyang.li, york.sun

On Fri, 2017-09-08 at 22:27 +0000, Leo Li wrote:
> > -----Original Message-----
> > From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> > Sent: Friday, September 08, 2017 7:51 AM
> > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>; York Su=
n
> > <york.sun@nxp.com>
> > Subject: Re: Machine Check in P2010(e500v2)
> >=20
> > On Fri, 2017-09-08 at 11:54 +0200, Joakim Tjernlund wrote:
> > > On Thu, 2017-09-07 at 18:54 +0000, Leo Li wrote:
> > > > > -----Original Message-----
> > > > > From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> > > > > Sent: Thursday, September 07, 2017 3:41 AM
> > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>;
> > > > > York Sun <york.sun@nxp.com>
> > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > >=20
> > > > > On Thu, 2017-09-07 at 00:50 +0200, Joakim Tjernlund wrote:
> > > > > > On Wed, 2017-09-06 at 21:13 +0000, Leo Li wrote:
> > > > > > > > -----Original Message-----
> > > > > > > > From: Joakim Tjernlund
> > > > > > > > [mailto:Joakim.Tjernlund@infinera.com]
> > > > > > > > Sent: Wednesday, September 06, 2017 3:54 PM
> > > > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li
> > > > > > > > <leoyang.li@nxp.com>; York Sun <york.sun@nxp.com>
> > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > >=20
> > > > > > > > On Wed, 2017-09-06 at 20:28 +0000, Leo Li wrote:
> > > > > > > > > > -----Original Message-----
> > > > > > > > > > From: Joakim Tjernlund
> > > > > > > > > > [mailto:Joakim.Tjernlund@infinera.com]
> > > > > > > > > > Sent: Wednesday, September 06, 2017 3:17 PM
> > > > > > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li
> > > > > > > > > > <leoyang.li@nxp.com>; York Sun <york.sun@nxp.com>
> > > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > > >=20
> > > > > > > > > > On Wed, 2017-09-06 at 19:31 +0000, Leo Li wrote:
> > > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > > From: York Sun
> > > > > > > > > > > > Sent: Wednesday, September 06, 2017 10:38 AM
> > > > > > > > > > > > To: Joakim Tjernlund
> > > > > > > > > > > > <Joakim.Tjernlund@infinera.com>;
> > > > > > > > > > > > linuxppc- dev@lists.ozlabs.org; Leo Li
> > > > > > > > > > > > <leoyang.li@nxp.com>
> > > > > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > > > > >=20
> > > > > > > > > > > > Scott is no longer with Freescale/NXP. Adding Leo.
> > > > > > > > > > > >=20
> > > > > > > > > > > > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > > > > > > > > > > > So after some debugging I found this bug:
> > > > > > > > > > > > > @@ -996,7 +998,7 @@ int
> > > > > > > > > > > > > fsl_pci_mcheck_exception(struct pt_regs
> > > > > > > >=20
> > > > > > > > *regs)
> > > > > > > > > > > > >          if (is_in_pci_mem_space(addr)) {
> > > > > > > > > > > > >                  if (user_mode(regs)) {
> > > > > > > > > > > > >                          pagefault_disable();
> > > > > > > > > > > > > -                       ret =3D get_user(regs->ni=
p, &inst);
> > > > > > > > > > > > > +                       ret =3D get_user(inst,
> > > > > > > > > > > > > + (__u32 __user *)regs->nip);
> > > > > > > > > > > > >                          pagefault_enable();
> > > > > > > > > > > > >                  } else {
> > > > > > > > > > > > >                          ret =3D
> > > > > > > > > > > > > probe_kernel_address(regs->nip, inst);
> > > > > > > > > > > > >=20
> > > > > > > > > > > > > However, the kernel still locked up after fixing =
that.
> > > > > > > > > > > > > Now I wonder why this fixup is there in the first=
 place?
> > > > > > > > > > > > > The routine will not really fixup the insn, just
> > > > > > > > > > > > > return 0xffffffff for the failing read and then a=
dvance the
> >=20
> > process NIP.
> > > > > > > > > > >=20
> > > > > > > > > > > You are right.  The code here only gives 0xffffffff t=
o
> > > > > > > > > > > the load instructions and
> > > > > > > > > >=20
> > > > > > > > > > continue with the next instruction when the load
> > > > > > > > > > instruction is causing the machine check.  This will
> > > > > > > > > > prevent a system lockup when reading from PCI/RapidIO d=
evice
> >=20
> > which is link down.
> > > > > > > > > > >=20
> > > > > > > > > > > I don't know what is actual problem in your case.
> > > > > > > > > > > Maybe it is a write
> > > > > > > > > >=20
> > > > > > > > > > instruction instead of read?   Or the code is in a infi=
nite loop
> >=20
> > waiting for
> > > > >=20
> > > > > a
> > > > > > > >=20
> > > > > > > > valid
> > > > > > > > > > read result?  Are you able to do some further debugging
> > > > > > > > > > with the NIP correctly printed?
> > > > > > > > > > >=20
> > > > > > > > > >=20
> > > > > > > > > > According to the MC it is a Read and the NIP also leads
> > > > > > > > > > to a read in the
> > > > > > > >=20
> > > > > > > > program.
> > > > > > > > > > ATM, I have disabled the fixup but I will enable that a=
gain.
> > > > > > > > > > Question, is it safe add a small printk when this MC
> > > > > > > > > > happens(after fixing up)? I need to see that it has
> > > > > > > > > > happened as the error is somewhat
> > > > > > > >=20
> > > > > > > > random.
> > > > > > > > >=20
> > > > > > > > > I think it is safe to add printk as the current machine
> > > > > > > > > check handlers are also
> > > > > > > >=20
> > > > > > > > using printk.
> > > > > > > >=20
> > > > > > > > I hope so, but if the fixup fires there is no printk at all=
 so I was a bit
> >=20
> > unsure.
> > > > > > > > Don't like this fixup though, is there not a better way tha=
n
> > > > > > > > faking a read to user space(or kernel for that matter) ?
> > > > > > >=20
> > > > > > > I don't have a better idea.  Without the fixup, the offending
> > > > > > > load instruction
> > > > >=20
> > > > > will never finish if there is anything wrong with the backing
> > > > > device and freeze the whole system.  Do you have any suggestion i=
n mind?
> > > > > > >=20
> > > > > >=20
> > > > > > But it never finishes the load, it just fakes a load of
> > > > > > 0xfffffffff, for user space I rather have it signal a SIGBUS bu=
t
> > > > > > that does not seem to work either, at least not for us but that
> > > > > > could be a bug in general MC code
> > > > >=20
> > > > > maybe.
> > > > > > This fixup might be valid for kernel only as it has never worke=
d
> > > > > > for user space
> > > > >=20
> > > > > due to the bug I found.
> > > > > >=20
> > > > > > Where can I read about this errata ?
> > > > >=20
> > > > > I have look high and low an cannot find an errata which maps to t=
his fixup.
> > > > > The closest I get is A-005125 which seems to have another
> > > > > workaround, I cannot find any evidence that this workaround has b=
een
> >=20
> > applied in Linux, can you?
> > > >=20
> > > > This is not A-005125.  There was an erratum for this issue with old=
er silicons
> >=20
> > (e.g. erratum PCI-ex 3 for MPC8572).
> > > > " When its link goes down, the PCI Express controller clears all
> > > > outstanding transactions with an error indicator and sends a link
> > > > down exception to the interrupt controller if PEX_PME_MES_DISR[LDDD=
]
> > > > =3D 0. If, however, any transactions are sent to the controller aft=
er
> > > > the link down event, they are accepted by the controller and wait
> > > > for the link to come back up before starting any timeout counters (=
for
> >=20
> > example, completion timeout). There is no mechanism to cancel the new
> > transactions short of a device HRESET. "
> > > >=20
> > > > But it was removed in newer silicon like P2020/P2010 probably becau=
se a
> >=20
> > Machine Check will be triggered in this situation to deal with the stal=
led
> > instruction and no longer considered it as a hardware issue.
> > > >=20
> > >=20
> > > Maybe this fixup should be configurable then?
>
> No.  My point is that the problem was no longer considered a hardware issue because of the machine check mechanism is in place to handle it.  If there is no handling of this special case, we would still experience a system hang if this situation really occurs.
>
> > >
> > > > The A-005125 is dealt with in u-boot.
> >
> > https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.denx.de%2Fpipermail%2Fu-boot%2F2013-August%2F161185.html&data=01%7C01%7Cleoyang.li%40nxp.com%7Ccb8a93e0090e48eb53a008d4f6b84235%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0&sdata=8sR4yoXA4adqMHz6TY%2BvmYpfCBTcYEZHjPuANjz%2F1EQ%3D&reserved=0
> > >
> > > Yes, I found it eventually :)
> > >
> > > However, I cannot return to normal execution. I can follow the code to
> > > returning from
> > > machine_check_exception() and moving into ASM handler for returning
> > > from a ME but then I am a bit lost. It does not seem to be any problem
> > > executing, it feels more like a SW bug dealing with machine checks. Don't
> >
> > known how to diagnose this further and could use some pointers.
>
> Is the execution returned to the user application?  I doubt the system hang is caused by the machine check handling.
> You can try to comment out the machine check handling code and check if there is any improvement and see if
> this is related to the machine check handling.

It tries to return to the user app, but I cannot see what happens as the system locks up when the
MC returns.
What do you mean by commenting out the MC handling? The simplest path is the PCI fixup, which will
just do regs->nip += 4; and then return to user space. That still does not work:
as soon as MC handling returns, the system is locked up.
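
For readers following along, the effect of that fixup path can be modelled in
user space. The sketch below is not the kernel's code: struct fake_regs,
fake_load_fixup() and the choice of three D-form loads are made up for
illustration, and the narrower fill values for byte and halfword loads are an
assumption (the thread only mentions 0xffffffff). It decodes the primary opcode
and the RT field of a PowerPC load, fakes the result and steps NIP past the
instruction.

```c
#include <stdint.h>

/* Hypothetical stand-in for the parts of struct pt_regs used here. */
struct fake_regs {
    uint32_t gpr[32];
    uint32_t nip;
};

/* Primary opcodes of three D-form loads (lwz, lbz, lhz). */
enum { OP_LWZ = 32, OP_LBZ = 34, OP_LHZ = 40 };

/*
 * Pretend the load at regs->nip returned all ones and skip it.
 * Returns 1 if the instruction was handled, 0 if it was left alone.
 */
static int fake_load_fixup(struct fake_regs *regs, uint32_t inst)
{
    unsigned int op = inst >> 26;           /* bits 0..5: primary opcode */
    unsigned int rt = (inst >> 21) & 0x1f;  /* bits 6..10: target register */

    switch (op) {
    case OP_LWZ:
        regs->gpr[rt] = 0xffffffffu;
        break;
    case OP_LHZ:
        regs->gpr[rt] = 0xffffu;    /* assumption: fill only the loaded width */
        break;
    case OP_LBZ:
        regs->gpr[rt] = 0xffu;      /* assumption: fill only the loaded width */
        break;
    default:
        return 0;                   /* not a load this sketch knows about */
    }

    regs->nip += 4;                 /* continue with the next instruction */
    return 1;
}
```

Feeding it 0x812a0000 (lwz r9,0(r10)) sets gpr[9] to 0xffffffff and advances
nip by 4. Anything it does not recognise is left untouched, which in the real
handler would mean falling through to the normal machine check path.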

>
> Machine check is a serious situation and not always possible to be recovered from.

This one should at least not kill the whole system. It is a simple bus error in user space and
the app should get SIGBUS and the system should carry on.
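
If the kernel did deliver SIGBUS for this case, the application side could
contain the fault roughly as follows. This is only a sketch under that
assumption: guarded_read32(), plain_read32() and simulated_bus_error() are
hypothetical helpers, and the fault is simulated with raise(3) rather than a
real PCI access.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t (*read32_fn)(const volatile uint32_t *addr);

static sigjmp_buf bus_jmp;  /* not thread safe: one guarded access at a time */

static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(bus_jmp, 1);
}

/* Ordinary accessor: dereference the (possibly memory-mapped) address. */
static uint32_t plain_read32(const volatile uint32_t *addr)
{
    return *addr;
}

/* Stand-in for a faulting read: raises SIGBUS instead of touching hardware. */
static uint32_t simulated_bus_error(const volatile uint32_t *addr)
{
    (void)addr;
    raise(SIGBUS);
    return 0;
}

/* Returns 0 and stores the value in *out, or -1 if SIGBUS hit during the read. */
static int guarded_read32(read32_fn rd, const volatile uint32_t *addr,
                          uint32_t *out)
{
    struct sigaction sa, old;
    int rc = 0;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0)
        return -1;

    if (sigsetjmp(bus_jmp, 1) == 0)
        *out = rd(addr);            /* normal path */
    else
        rc = -1;                    /* arrived here via siglongjmp() */

    sigaction(SIGBUS, &old, NULL);  /* restore the previous disposition */
    return rc;
}
```

The point of the design is that a failed device read becomes an error return
in one process instead of something the rest of the system has to survive.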

> I would focus more on debugging why the machine check is triggered by the user space application.
> Can you locate what code is causing this machine check from user space?
> Is it accessing some hardware related space which is not ready?
> Or is it accessing address that it shouldn't have accessed?

Of course, this is ongoing and we are getting closer to a solution. The MC locking up the machine completely
does not make this any easier though.
These are 2 separate things: fixing the cause, and not having a simple bus error lock up the machine.
I am focusing on fixing the lockup.

I have been following the execution in the kernel and I always end up in the ASM returning
from the MC.
The other day we got a similar PCI MC (bus error) on a T1042 CPU (e5500/e500mc) and there
the system survived. The one thing I see different there is that MSR RI is set
when entering MC, why is that?
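
The flag list that the oops prints next to the MSR value ("MSR: 0002d000
<CE,EE,PR,ME>" in the first report) can be reproduced by hand, which makes it
easy to check whether RI is present in a given dump. A small sketch: the bit
positions (CE = bit 17, EE = 15, PR = 14, ME = 12, RI = 1, counted from the
LSB) are assumed to match the kernel headers and are not taken from this
thread, and msr_decode() is a made-up helper.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct msr_bit {
    uint32_t mask;
    const char *name;
};

/* Assumed bit positions, highest first so the order matches the oops. */
static const struct msr_bit msr_bits[] = {
    { 1u << 17, "CE" },
    { 1u << 15, "EE" },
    { 1u << 14, "PR" },
    { 1u << 12, "ME" },
    { 1u << 1,  "RI" },
};

/* Write a comma-separated list of the known flags set in msr into buf. */
static char *msr_decode(uint32_t msr, char *buf, size_t len)
{
    size_t i, used = 0;

    if (len == 0)
        return buf;
    buf[0] = '\0';

    for (i = 0; i < sizeof(msr_bits) / sizeof(msr_bits[0]); i++) {
        int n;

        if (!(msr & msr_bits[i].mask))
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? "," : "", msr_bits[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break;                  /* buffer too small: stop here */
        used += (size_t)n;
    }
    return buf;
}
```

With these assumed masks, 0x0002d000 decodes to the same four flags the oops
shows, and RI would only appear if bit 1 were set as well.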

 Jocke

* Re: Machine Check in P2010(e500v2)
       [not found]                             ` <1504961965.31322.72.camel@infinera.com>
@ 2017-09-14 16:55                               ` Joakim Tjernlund
  0 siblings, 0 replies; 21+ messages in thread
From: Joakim Tjernlund @ 2017-09-14 16:55 UTC (permalink / raw)
  To: linuxppc-dev, leoyang.li, york.sun

On Sat, 2017-09-09 at 14:59 +0200, Joakim Tjernlund wrote:
> On Sat, 2017-09-09 at 14:45 +0200, Joakim Tjernlund wrote:
> >
> > I have been following the execution in the kernel and I always end up in the ASM returning
> > from the MC.
> > The other day we got a similar PCI MC(bus error) on T1042 CPU(e5500/e500mc) and there
> > the system survived. The one thing I see different there is that MSR RI is set
> > when entering MC, why is that?
>
> Before you ask, I have tried to add MSR_RI to both msr and mcsrr1. Didn't help.

I managed to provoke another Machine Check, much earlier this time:
[   15.047108] Machine check in kernel mode.
[   15.051120] Caused by (from MCSR=10008): Bus - Read Data Bus Error
[   15.057302] Oops: Machine check, sig: 7 [#1]
[   15.061567] P1010 RDB
[   15.063832] Modules linked in: linux_bcm_knet(PO) linux_user_bde(PO) linux_kernel_bde(PO)
[   15.072022] CPU: 0 PID: 472 Comm: emxp2_hw_bl Tainted: P           O    4.1.43+ #52
[   15.079680] task: db1a7990 ti: df18c000 task.ti: df18c000
[   15.085075] NIP: 00000000 LR: 109e7648 CTR: 00000000
[   15.090036] REGS: df18df10 TRAP: 0204   Tainted: P           O     (4.1.43+)
[   15.097082] MSR: 0002d000 <CE,EE,PR,ME>  CR: 280004e8  XER: 20000000
[   15.103448] DEAR: b6e44140 ESR: 00000000
GPR00: 10ac1160 bfa44010 b79734a0 136eb4a0 bfa44030 01010101 bfa44038 00000020
GPR08: 00000000 b6e13000 063e521e 0f9ed9c4 22000422 11db7334 00000000 00000000
GPR16: 10f8b054 10f895e5 10f8a8bf 00031150 136eb4d0 00030000 00031140 00031140
GPR24: 00000000 00000000 136f10a0 00000000 00000000 00000000 00031140 136eb4a0
[   15.135690] NIP [00000000]   (null)
[   15.139174] LR [109e7648] 0x109e7648
[   15.142743] Call Trace:
[   15.145184] ---[ end trace c00af6117685cb6e ]---

The fun part is that now the OS did NOT lock up!
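
As an aside, the MCSR value in the log can be checked by hand. The sketch
below assumes the e500v1/v2 bus error bit layout and message strings as they
are believed to appear in the kernel sources; neither is confirmed in this
thread, and mcsr_bus_reason() and mcsr_unexplained() are made-up helpers. For
MCSR=10008 it yields the same "Read Data Bus Error" text as the log and leaves
0x10000 over, a bit this table does not name.

```c
#include <stddef.h>
#include <stdint.h>

struct mcsr_bit {
    uint32_t mask;
    const char *reason;
};

/* Assumed e500v1/v2 bus error bits; not taken from this thread. */
static const struct mcsr_bit mcsr_bus_bits[] = {
    { 0x00000080u, "Bus - Instruction Address Error" },
    { 0x00000040u, "Bus - Read Address Error" },
    { 0x00000020u, "Bus - Write Address Error" },
    { 0x00000010u, "Bus - Instruction Data Error" },
    { 0x00000008u, "Bus - Read Data Bus Error" },
    { 0x00000004u, "Bus - Write Data Bus Error" },
    { 0x00000002u, "Bus - Instruction Parity Error" },
    { 0x00000001u, "Bus - Read Parity Error" },
};

#define N_BUS_BITS (sizeof(mcsr_bus_bits) / sizeof(mcsr_bus_bits[0]))

/* First matching bus error reason, or NULL if no bus error bit is set. */
static const char *mcsr_bus_reason(uint32_t mcsr)
{
    size_t i;

    for (i = 0; i < N_BUS_BITS; i++)
        if (mcsr & mcsr_bus_bits[i].mask)
            return mcsr_bus_bits[i].reason;
    return NULL;
}

/* Bits of mcsr that the table above does not explain. */
static uint32_t mcsr_unexplained(uint32_t mcsr)
{
    size_t i;

    for (i = 0; i < N_BUS_BITS; i++)
        mcsr &= ~mcsr_bus_bits[i].mask;
    return mcsr;
}
```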

Looking at the faulting process, emxp2_hw_bl, I see it is in Zombie state (cd /proc/472):
cat status
Name:	emxp2_hw_bl
State:	Z (zombie)
Tgid:	472
Ngid:	0
Pid:	472
PPid:	468
TracerPid:	0
Uid:	0	0	0	0
Gid:	0	0	0	0
FDSize:	0
Groups:	
Threads:	8
SigQ:	0/3462
SigPnd:	0000000000000000
ShdPnd:	0000000000000000
SigBlk:	0000000000000000
SigIgn:	0000000000001000
SigCgt:	00000001c0000628
CapInh:	0000000000000000
CapPrm:	0000003fffffffff
CapEff:	0000003fffffffff
CapBnd:	0000003fffffffff
Cpus_allowed:	1
Cpus_allowed_list:	0
voluntary_ctxt_switches:	1126
nonvoluntary_ctxt_switches:	376

This is even after the parent process has called waitid(2) for emxp2_hw_bl.
If I now do a kill -s SIGBUS/TERM <pid of emxp2_hw_bl>, this
signal is propagated to the parent and emxp2_hw_bl goes away.

Stack:
cat stack
[<c0071c04>] do_futex+0x150/0x874
[<c0027670>] do_exit+0x4e8/0x7d0
[<c000a164>] die+0x178/0x1d8
[<c000a7c8>] machine_check_exception+0xcc/0x17c
[<c000dd94>] ret_from_mcheck_exc+0x0/0x144

So emxp2_hw_bl is stuck somewhere down in machine_check_exception().
This all looks like Linux bugs when asked to kill a user process
from a Machine Check.

I don't think I will get any further without some pointers now.

 Jocke

* Re: Machine Check in P2010(e500v2)
  2017-09-09 12:45                           ` Joakim Tjernlund
       [not found]                             ` <1504961965.31322.72.camel@infinera.com>
@ 2017-09-20 16:45                             ` Joakim Tjernlund
  2017-09-21 18:53                               ` Leo Li
  1 sibling, 1 reply; 21+ messages in thread
From: Joakim Tjernlund @ 2017-09-20 16:45 UTC (permalink / raw)
  To: linuxppc-dev, leoyang.li, york.sun

On Sat, 2017-09-09 at 14:45 +0200, Joakim Tjernlund wrote:
> On Fri, 2017-09-08 at 22:27 +0000, Leo Li wrote:
> > > -----Original Message-----
> > > From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> > > Sent: Friday, September 08, 2017 7:51 AM
> > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>; York =
Sun
> > > <york.sun@nxp.com>
> > > Subject: Re: Machine Check in P2010(e500v2)
> > >=20
> > > On Fri, 2017-09-08 at 11:54 +0200, Joakim Tjernlund wrote:
> > > > On Thu, 2017-09-07 at 18:54 +0000, Leo Li wrote:
> > > > > > -----Original Message-----
> > > > > > From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> > > > > > Sent: Thursday, September 07, 2017 3:41 AM
> > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>;
> > > > > > York Sun <york.sun@nxp.com>
> > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > >=20
> > > > > > On Thu, 2017-09-07 at 00:50 +0200, Joakim Tjernlund wrote:
> > > > > > > On Wed, 2017-09-06 at 21:13 +0000, Leo Li wrote:
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Joakim Tjernlund
> > > > > > > > > [mailto:Joakim.Tjernlund@infinera.com]
> > > > > > > > > Sent: Wednesday, September 06, 2017 3:54 PM
> > > > > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li
> > > > > > > > > <leoyang.li@nxp.com>; York Sun <york.sun@nxp.com>
> > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > >=20
> > > > > > > > > On Wed, 2017-09-06 at 20:28 +0000, Leo Li wrote:
> > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > From: Joakim Tjernlund
> > > > > > > > > > > [mailto:Joakim.Tjernlund@infinera.com]
> > > > > > > > > > > Sent: Wednesday, September 06, 2017 3:17 PM
> > > > > > > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li
> > > > > > > > > > > <leoyang.li@nxp.com>; York Sun <york.sun@nxp.com>
> > > > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > > > >=20
> > > > > > > > > > > On Wed, 2017-09-06 at 19:31 +0000, Leo Li wrote:
> > > > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > > > From: York Sun
> > > > > > > > > > > > > Sent: Wednesday, September 06, 2017 10:38 AM
> > > > > > > > > > > > > To: Joakim Tjernlund
> > > > > > > > > > > > > <Joakim.Tjernlund@infinera.com>;
> > > > > > > > > > > > > linuxppc- dev@lists.ozlabs.org; Leo Li
> > > > > > > > > > > > > <leoyang.li@nxp.com>
> > > > > > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > > > > > >=20
> > > > > > > > > > > > > Scott is no longer with Freescale/NXP. Adding Leo=
.
> > > > > > > > > > > > >=20
> > > > > > > > > > > > > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > > > > > > > > > > > > So after some debugging I found this bug:
> > > > > > > > > > > > > > @@ -996,7 +998,7 @@ int
> > > > > > > > > > > > > > fsl_pci_mcheck_exception(struct pt_regs
> > > > > > > > >=20
> > > > > > > > > *regs)
> > > > > > > > > > > > > >          if (is_in_pci_mem_space(addr)) {
> > > > > > > > > > > > > >                  if (user_mode(regs)) {
> > > > > > > > > > > > > >                          pagefault_disable();
> > > > > > > > > > > > > > -                       ret =3D get_user(regs->=
nip, &inst);
> > > > > > > > > > > > > > +                       ret =3D get_user(inst,
> > > > > > > > > > > > > > + (__u32 __user *)regs->nip);
> > > > > > > > > > > > > >                          pagefault_enable();
> > > > > > > > > > > > > >                  } else {
> > > > > > > > > > > > > >                          ret =3D
> > > > > > > > > > > > > > probe_kernel_address(regs->nip, inst);
> > > > > > > > > > > > > >=20
> > > > > > > > > > > > > > However, the kernel still locked up after fixin=
g that.
> > > > > > > > > > > > > > Now I wonder why this fixup is there in the fir=
st place?
> > > > > > > > > > > > > > The routine will not really fixup the insn, jus=
t
> > > > > > > > > > > > > > return 0xffffffff for the failing read and then=
 advance the
> > >=20
> > > process NIP.
> > > > > > > > > > > >=20
> > > > > > > > > > > > You are right.  The code here only gives 0xffffffff=
 to
> > > > > > > > > > > > the load instructions and
> > > > > > > > > > >=20
> > > > > > > > > > > continue with the next instruction when the load
> > > > > > > > > > > instruction is causing the machine check.  This will
> > > > > > > > > > > prevent a system lockup when reading from PCI/RapidIO=
 device
> > >=20
> > > which is link down.
> > > > > > > > > > > >=20
> > > > > > > > > > > > I don't know what is actual problem in your case.
> > > > > > > > > > > > Maybe it is a write
> > > > > > > > > > >=20
> > > > > > > > > > > instruction instead of read?   Or the code is in a in=
finite loop
> > >=20
> > > waiting for
> > > > > >=20
> > > > > > a
> > > > > > > > >=20
> > > > > > > > > valid
> > > > > > > > > > > read result?  Are you able to do some further debuggi=
ng
> > > > > > > > > > > with the NIP correctly printed?
> > > > > > > > > > > >=20
> > > > > > > > > > >=20
> > > > > > > > > > > According to the MC it is a Read and the NIP also lea=
ds
> > > > > > > > > > > to a read in the
> > > > > > > > >=20
> > > > > > > > > program.
> > > > > > > > > > > ATM, I have disabled the fixup but I will enable that=
 again.
> > > > > > > > > > > Question, is it safe add a small printk when this MC
> > > > > > > > > > > happens(after fixing up)? I need to see that it has
> > > > > > > > > > > happened as the error is somewhat
> > > > > > > > >=20
> > > > > > > > > random.
> > > > > > > > > >=20
> > > > > > > > > > I think it is safe to add printk as the current machine
> > > > > > > > > > check handlers are also
> > > > > > > > >=20
> > > > > > > > > using printk.
> > > > > > > > >=20
> > > > > > > > > I hope so, but if the fixup fires there is no printk at a=
ll so I was a bit
> > >=20
> > > unsure.
> > > > > > > > > Don't like this fixup though, is there not a better way t=
han
> > > > > > > > > faking a read to user space(or kernel for that matter) ?
> > > > > > > >=20
> > > > > > > > I don't have a better idea.  Without the fixup, the offendi=
ng
> > > > > > > > load instruction
> > > > > >=20
> > > > > > will never finish if there is anything wrong with the backing
> > > > > > device and freeze the whole system.  Do you have any suggestion=
 in mind?
> > > > > > > >=20
> > > > > > >=20
> > > > > > > But it never finishes the load, it just fakes a load of
> > > > > > > 0xfffffffff, for user space I rather have it signal a SIGBUS =
but
> > > > > > > that does not seem to work either, at least not for us but th=
at
> > > > > > > could be a bug in general MC code
> > > > > >=20
> > > > > > maybe.
> > > > > > > This fixup might be valid for kernel only as it has never wor=
ked
> > > > > > > for user space
> > > > > >=20
> > > > > > due to the bug I found.
> > > > > > >=20
> > > > > > > Where can I read about this errata ?
> > > > > >=20
> > > > > > I have look high and low an cannot find an errata which maps to=
 this fixup.
> > > > > > The closest I get is A-005125 which seems to have another
> > > > > > workaround, I cannot find any evidence that this workaround has=
 been
> > >=20
> > > applied in Linux, can you?
> > > > >=20
> > > > > This is not A-005125.  There was an erratum for this issue with o=
lder silicons
> > >=20
> > > (e.g. erratum PCI-ex 3 for MPC8572).
> > > > > " When its link goes down, the PCI Express controller clears all
> > > > > outstanding transactions with an error indicator and sends a link
> > > > > down exception to the interrupt controller if PEX_PME_MES_DISR[LD=
DD]
> > > > > =3D 0. If, however, any transactions are sent to the controller a=
fter
> > > > > the link down event, they are accepted by the controller and wait
> > > > > for the link to come back up before starting any timeout counters=
 (for
> > >=20
> > > example, completion timeout). There is no mechanism to cancel the new
> > > transactions short of a device HRESET. "
> > > > >=20
> > > > > But it was removed in newer silicon like P2020/P2010 probably bec=
ause a
> > >=20
> > > Machine Check will be triggered in this situation to deal with the st=
alled
> > > instruction and no longer considered it as a hardware issue.
> > > > >=20
> > > >=20
> > > > Maybe this fixup should be configurable then?
> >=20
> > No.  My point is that the problem was no longer considered a hardware i=
ssue because of the machine check mechanism is in place to handle it.  If t=
here is no handling of this special case, we would still experience a syste=
m hang if this situation really occurs.
> >=20
> > > >=20
> > > > > The A-005125 is dealt with in u-boot.
> > >=20
> > > https://emea01.safelinks.protection.outlook.com/?url=3Dhttps%3A%2F%2F=
lists.de
> > > nx.de%2Fpipermail%2Fu-boot%2F2013-
> > > August%2F161185.html&data=3D01%7C01%7Cleoyang.li%40nxp.com%7Ccb8a93e
> > > 0090e48eb53a008d4f6b84235%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0&
> > > sdata=3D8sR4yoXA4adqMHz6TY%2BvmYpfCBTcYEZHjPuANjz%2F1EQ%3D&reserve
> > > d=3D0
> > > >=20
> > > > Yes, I found it eventually :)
> > > >=20
> > > > However, I cannot return to normal execution. I can follow the code=
 to
> > > > returning from
> > > > machine_check_exception() and moving into ASM handler for returning
> > > > from a ME but then I am a bit lost. It does not seem to be any prob=
lem
> > > > executing, it feels more like a SW bug dealing with machine checks.=
 Don't
> > >=20
> > > known how to diagnose this further and could use some pointers.
> >=20
> > Is the execution returned to the user application?  I doubt the system =
hang is caused by the machine check handling.
> > You can try to comment out the machine check handling code and check if=
 there is any improvement and see if
> > this is related to the machine check handling.
>=20
> It tries to return to user app but I cannot see what happens as the syste=
m lock up when the
> MC returns.
> How do you mean comment out MC handling? The simplest path is the PCI fix=
up which will
> just do regs->nip +=3D 4; and then return to user space. That still does =
not work as
> as soon MC handling returns, the system is locked up.
>=20
> >=20
> > Machine check is a serious situation and not always possible to be reco=
vered from.=20
>=20
> This one should at least not kill the whole system. It is a simple bus er=
ror in user space and
> the app should get SIGBUS and the the system should carry on.=20
>
> > I would focus more on debugging why the machine check is triggered by the
> > user space application.
> > Can you locate what code is causing this machine check from user space?
> > Is it accessing some hardware related space which is not ready?
> > Or is it accessing address that it shouldn't have accessed?
>
> Of course, this is ongoing and getting closer to a solution. The MC locking
> up the machine completely does not make this any easier though.
> These are 2 separate things: fixing the cause, and not having a simple bus
> error lock up the machine.
> I am focusing on fixing the lockup.
>
> I have been following the execution in the kernel and I always end up in the
> ASM returning from the MC.
> The other day we got a similar PCI MC (bus error) on T1042 CPU (e5500/e500mc)
> and there the system survived. The one thing I see different there is that
> MSR RI is set when entering MC, why is that?
>
>  Jocke
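
For readers following along, the PCI fixup path discussed in the quoted text (hand back 0xffffffff for the failed load, then do regs->nip += 4) can be sketched in user-space C roughly as below. This is only an illustration and not the kernel's fsl_pci_mcheck_exception(): struct fake_regs, fake_load_fixup() and OP_LWZ are made-up names, and only the plain lwz form of a load is recognised here, on the assumption that the faulting instruction is a D-form load word.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct pt_regs; it carries only
 * the fields this sketch needs. */
struct fake_regs {
    uint32_t gpr[32];   /* general purpose registers r0..r31 */
    uint32_t nip;       /* next instruction pointer */
};

/* Primary opcode of the PowerPC D-form "lwz" (load word and zero). */
#define OP_LWZ 32u

/*
 * If inst is an lwz, pretend the load returned all ones, skip the
 * instruction and return 1. Otherwise leave regs untouched and return 0.
 */
static int fake_load_fixup(struct fake_regs *regs, uint32_t inst)
{
    uint32_t opcode = inst >> 26;         /* top 6 bits: primary opcode */
    uint32_t rd = (inst >> 21) & 0x1fu;   /* next 5 bits: target register */

    if (opcode != OP_LWZ)
        return 0;

    regs->gpr[rd] = 0xffffffffu;          /* fake the result of the read */
    regs->nip += 4;                       /* step over the faulting load */
    return 1;
}
```

With regs.nip set to 0x10a66954 (the NIP from the oops) and an invented instruction word 0x812a0000 (lwz r9,0(r10)), the helper returns 1, leaves 0xffffffff in gpr[9] and moves nip to 0x10a66958. The instruction actually at that NIP is not known from this thread.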

Got some more info now; I think this is a new erratum. Adding EDAC to the mix yields:
[   28.372574] LTSSM:16
[   28.377197] Machine check in kernel mode.
[   28.381201] Caused by (from MCSR=10008, MCAR:0x8003e000): Bus - Read Data Bus Error
[   28.388861] Oops: Machine check, sig: 7 [#1]
[   28.393125] P2010 E500v2
[   28.395651] Modules linked in: linux_bcm_knet(PO) linux_user_bde(PO) linux_kernel_bde(PO)
[   28.403842] CPU: 0 PID: 485 Comm: emxp2_hw_bl Tainted: P           O    4.1.43+ #19
[   28.411499] task: db13a0f0 ti: df17c000 task.ti: df17c000
[   28.416894] NIP: 10a66954 LR: 10a66a88 CTR: 0f9e7f44
[   28.421855] REGS: df17df10 TRAP: 0204   Tainted: P           O     (4.1.43+)
[   28.428901] MSR: 0002d000 <CE,EE,PR,ME>  CR: 44002428  XER: 20000000
[   28.435267] DEAR: b73cc000 ESR: 00000000
GPR00: 10a66a88 bfc21bc0 b7eee4a0 136eb4a0 00000000 00000000 00000000 00000000
GPR08: 0002d000 0003e000 b738e000 00000000 24002422 11db7334 00000000 00000000
GPR16: 10f8b054 10f895e5 10f8a8bf 0000b541 0000b541 11ddd380 00000011 00000001
GPR24: 01a9985e 136f1010 07000000 136eb4a0 00006000 07006000 00000000 00000000
[   28.467506] NIP [10a66954] 0x10a66954
[   28.471162] LR [10a66a88] 0x10a66a88
[   28.474730] Call Trace:
[   28.477170] ---[ end trace b25436dea505b49d ]---
[   28.481781]
[   28.483267] PCIe error(s) detected
[   28.486662] PCIe ERR_DR register: 0x00800000
[   28.490927] PCIe ERR_CAP_STAT register: 0x00000023
[   28.495713] PCIe ERR_CAP_R0 register: 0x00000000
[   28.500324] PCIe ERR_CAP_R1 register: 0x00000000
[   28.504936] PCIe ERR_CAP_R2 register: 0x00000000
[   28.509548] PCIe ERR_CAP_R3 register: 0x00000000

I logged LTSSM and it is 16 (link up), and the reference manual says this about ERR_DR = 0x00800000:

PCIe ERR_DR: PCT bit
PCI Express completion time-out. A completion time-out condition was detected for a non-posted,
outbound PCI Express transaction. An error response is sent back to the requestor. Note that a
completion timeout counter only starts when the non-posted request was able to send to the link partner.
-
A completion time-out on the PCI Express link was detected. Note that a completion timeout error is a
fatal error. If a completion timeout error is detected, the system has become unstable. Hot reset is
recommended to restore stability of the system.
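
As a rough illustration of the decoding above, the sketch below tests the PCT bit in an ERR_DR value. The mask 0x00800000 comes straight from the log and the reference manual text in this message. The macro and function names are invented for the example and are not taken from the kernel or the EDAC driver.

```c
#include <stdint.h>

/* PCT (PCI Express completion time-out) bit, as reported in this thread:
 * ERR_DR = 0x00800000. The macro name is an assumption. */
#define PEX_ERR_DR_PCT 0x00800000u

/* Returns 1 if the completion time-out bit is set in err_dr, else 0. */
static int pex_err_dr_has_pct(uint32_t err_dr)
{
    return (err_dr & PEX_ERR_DR_PCT) != 0;
}

/* Short description of an ERR_DR value; only PCT is decoded, any other
 * bit is reported generically because this thread does not describe it. */
static const char *pex_err_dr_describe(uint32_t err_dr)
{
    if (pex_err_dr_has_pct(err_dr))
        return "PCI Express completion time-out";
    return err_dr ? "other error bit(s) set" : "no error";
}
```

For the value logged above, pex_err_dr_has_pct(0x00800000) returns 1 and pex_err_dr_describe() reports the completion time-out.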

This error is not described in any errata I can find. How can I work around this?

   Jocke


* RE: Machine Check in P2010(e500v2)
  2017-09-20 16:45                             ` Joakim Tjernlund
@ 2017-09-21 18:53                               ` Leo Li
  0 siblings, 0 replies; 21+ messages in thread
From: Leo Li @ 2017-09-21 18:53 UTC (permalink / raw)
  To: Joakim Tjernlund, linuxppc-dev, York Sun, Mingkai Hu, Richard Nie



> -----Original Message-----
> From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> Sent: Wednesday, September 20, 2017 11:45 AM
> To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>; York Sun
> <york.sun@nxp.com>
> Subject: Re: Machine Check in P2010(e500v2)
>
> On Sat, 2017-09-09 at 14:45 +0200, Joakim Tjernlund wrote:
> > On Fri, 2017-09-08 at 22:27 +0000, Leo Li wrote:
> > > > -----Original Message-----
> > > > From: Joakim Tjernlund [mailto:Joakim.Tjernlund@infinera.com]
> > > > Sent: Friday, September 08, 2017 7:51 AM
> > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li <leoyang.li@nxp.com>;
> > > > York Sun <york.sun@nxp.com>
> > > > Subject: Re: Machine Check in P2010(e500v2)
> > > >
> > > > On Fri, 2017-09-08 at 11:54 +0200, Joakim Tjernlund wrote:
> > > > > On Thu, 2017-09-07 at 18:54 +0000, Leo Li wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: Joakim Tjernlund
> > > > > > > [mailto:Joakim.Tjernlund@infinera.com]
> > > > > > > Sent: Thursday, September 07, 2017 3:41 AM
> > > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li
> > > > > > > <leoyang.li@nxp.com>; York Sun <york.sun@nxp.com>
> > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > >
> > > > > > > On Thu, 2017-09-07 at 00:50 +0200, Joakim Tjernlund wrote:
> > > > > > > > On Wed, 2017-09-06 at 21:13 +0000, Leo Li wrote:
> > > > > > > > > > -----Original Message-----
> > > > > > > > > > From: Joakim Tjernlund
> > > > > > > > > > [mailto:Joakim.Tjernlund@infinera.com]
> > > > > > > > > > Sent: Wednesday, September 06, 2017 3:54 PM
> > > > > > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li
> > > > > > > > > > <leoyang.li@nxp.com>; York Sun <york.sun@nxp.com>
> > > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > > >
> > > > > > > > > > On Wed, 2017-09-06 at 20:28 +0000, Leo Li wrote:
> > > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > > From: Joakim Tjernlund
> > > > > > > > > > > > [mailto:Joakim.Tjernlund@infinera.com]
> > > > > > > > > > > > Sent: Wednesday, September 06, 2017 3:17 PM
> > > > > > > > > > > > To: linuxppc-dev@lists.ozlabs.org; Leo Li
> > > > > > > > > > > > <leoyang.li@nxp.com>; York Sun <york.sun@nxp.com>
> > > > > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, 2017-09-06 at 19:31 +0000, Leo Li wrote:
> > > > > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > > > > From: York Sun
> > > > > > > > > > > > > > Sent: Wednesday, September 06, 2017 10:38 AM
> > > > > > > > > > > > > > To: Joakim Tjernlund
> > > > > > > > > > > > > > <Joakim.Tjernlund@infinera.com>;
> > > > > > > > > > > > > > linuxppc- dev@lists.ozlabs.org; Leo Li
> > > > > > > > > > > > > > <leoyang.li@nxp.com>
> > > > > > > > > > > > > > Subject: Re: Machine Check in P2010(e500v2)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Scott is no longer with Freescale/NXP. Adding Leo.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On 09/05/2017 01:40 AM, Joakim Tjernlund wrote:
> > > > > > > > > > > > > > > So after some debugging I found this bug:
> > > > > > > > > > > > > > > @@ -996,7 +998,7 @@ int
> > > > > > > > > > > > > > > fsl_pci_mcheck_exception(struct pt_regs
> > > > > > > > > >
> > > > > > > > > > *regs)
> > > > > > > > > > > > > > >          if (is_in_pci_mem_space(addr)) {
> > > > > > > > > > > > > > >                  if (user_mode(regs)) {
> > > > > > > > > > > > > > >                          pagefault_disable();
> > > > > > > > > > > > > > > -                       ret =3D get_user(regs=
->nip, &inst);
> > > > > > > > > > > > > > > +                       ret =3D get_user(inst=
,
> > > > > > > > > > > > > > > + (__u32 __user *)regs->nip);
> > > > > > > > > > > > > > >                          pagefault_enable();
> > > > > > > > > > > > > > >                  } else {
> > > > > > > > > > > > > > > > >                          ret =
> > > > > > > > > > > > > > > probe_kernel_address(regs->nip, inst);
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > However, the kernel still locked up after fixing that.
> > > > > > > > > > > > > > > > > Now I wonder why this fixup is there in the first place?
> > > > > > > > > > > > > > > The routine will not really fixup the insn,
> > > > > > > > > > > > > > > just return 0xffffffff for the failing read
> > > > > > > > > > > > > > > and then advance the
> > > >
> > > > process NIP.
> > > > > > > > > > > > >
> > > > > > > > > > > > > You are right.  The code here only gives
> > > > > > > > > > > > > 0xffffffff to the load instructions and
> > > > > > > > > > > >
> > > > > > > > > > > > continue with the next instruction when the load
> > > > > > > > > > > > instruction is causing the machine check.  This
> > > > > > > > > > > > will prevent a system lockup when reading from
> > > > > > > > > > > > PCI/RapidIO device
> > > >
> > > > which is link down.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I don't know what is actual problem in your case.
> > > > > > > > > > > > > Maybe it is a write
> > > > > > > > > > > >
> > > > > > > > > > > > > instruction instead of read?   Or the code is in a infinite loop
> > > >
> > > > waiting for
> > > > > > >
> > > > > > > a
> > > > > > > > > >
> > > > > > > > > > valid
> > > > > > > > > > > > read result?  Are you able to do some further
> > > > > > > > > > > > debugging with the NIP correctly printed?
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > According to the MC it is a Read and the NIP also
> > > > > > > > > > > > leads to a read in the
> > > > > > > > > >
> > > > > > > > > > program.
> > > > > > > > > > > > > ATM, I have disabled the fixup but I will enable that again.
> > > > > > > > > > > > Question, is it safe add a small printk when this
> > > > > > > > > > > > MC happens(after fixing up)? I need to see that it
> > > > > > > > > > > > has happened as the error is somewhat
> > > > > > > > > >
> > > > > > > > > > random.
> > > > > > > > > > >
> > > > > > > > > > > I think it is safe to add printk as the current
> > > > > > > > > > > machine check handlers are also
> > > > > > > > > >
> > > > > > > > > > using printk.
> > > > > > > > > >
> > > > > > > > > > I hope so, but if the fixup fires there is no printk
> > > > > > > > > > at all so I was a bit
> > > >
> > > > unsure.
> > > > > > > > > > Don't like this fixup though, is there not a better
> > > > > > > > > > > way than faking a read to user space(or kernel for that matter) ?
> > > > > > > > >
> > > > > > > > > I don't have a better idea.  Without the fixup, the
> > > > > > > > > offending load instruction
> > > > > > >
> > > > > > > will never finish if there is anything wrong with the
> > > > > > > backing device and freeze the whole system.  Do you have any
> suggestion in mind?
> > > > > > > > >
> > > > > > > >
> > > > > > > > But it never finishes the load, it just fakes a load of
> > > > > > > > 0xfffffffff, for user space I rather have it signal a
> > > > > > > > SIGBUS but that does not seem to work either, at least not
> > > > > > > > for us but that could be a bug in general MC code
> > > > > > >
> > > > > > > maybe.
> > > > > > > > This fixup might be valid for kernel only as it has never
> > > > > > > > worked for user space
> > > > > > >
> > > > > > > due to the bug I found.
> > > > > > > >
> > > > > > > > Where can I read about this errata ?
> > > > > > >
> > > > > > > > I have look high and low an cannot find an errata which maps to this fixup.
> > > > > > > The closest I get is A-005125 which seems to have another
> > > > > > > workaround, I cannot find any evidence that this workaround
> > > > > > > has been
> > > >
> > > > applied in Linux, can you?
> > > > > >
> > > > > > This is not A-005125.  There was an erratum for this issue
> > > > > > with older silicons
> > > >
> > > > (e.g. erratum PCI-ex 3 for MPC8572).
> > > > > > " When its link goes down, the PCI Express controller clears
> > > > > > all outstanding transactions with an error indicator and sends
> > > > > > a link down exception to the interrupt controller if
> > > > > > > PEX_PME_MES_DISR[LDDD] = 0. If, however, any transactions are
> > > > > > sent to the controller after the link down event, they are
> > > > > > accepted by the controller and wait for the link to come back
> > > > > > up before starting any timeout counters (for
> > > >
> > > > example, completion timeout). There is no mechanism to cancel the
> > > > new transactions short of a device HRESET. "
> > > > > >
> > > > > > But it was removed in newer silicon like P2020/P2010 probably
> > > > > > because a
> > > >
> > > > Machine Check will be triggered in this situation to deal with the
> > > > stalled instruction and no longer considered it as a hardware issue.
> > > > > >
> > > > >
> > > > > Maybe this fixup should be configurable then?
> > >
> > > No.  My point is that the problem was no longer considered a hardware issue
> > > because of the machine check mechanism is in place to handle it.  If there is no
> > > handling of this special case, we would still experience a system hang if this
> > > situation really occurs.
> > >
> > > > >
> > > > > > The A-005125 is dealt with in u-boot.
> > > >
> > > > https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.denx.de%2Fpipermail%2Fu-boot%2F2013-August%2F161185.html&data=01%7C01%7Cleoyang.li%40nxp.com%7Ccb8a93e0090e48eb53a008d4f6b84235%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0&sdata=8sR4yoXA4adqMHz6TY%2BvmYpfCBTcYEZHjPuANjz%2F1EQ%3D&reserved=0
> > > > >
> > > > > Yes, I found it eventually :)
> > > > >
> > > > > However, I cannot return to normal execution. I can follow the
> > > > > code to returning from
> > > > > machine_check_exception() and moving into ASM handler for
> > > > > returning from a ME but then I am a bit lost. It does not seem
> > > > > to be any problem executing, it feels more like a SW bug dealing
> > > > > with machine checks. Don't
> > > >
> > > > known how to diagnose this further and could use some pointers.
> > >
> > > Is the execution returned to the user application?  I doubt the system hang is
> > > caused by the machine check handling.
> > > You can try to comment out the machine check handling code and check
> > > if there is any improvement and see if this is related to the machine check
> > > handling.
> >
> > It tries to return to user app but I cannot see what happens as the
> > system lock up when the MC returns.
> > How do you mean comment out MC handling? The simplest path is the PCI
> > fixup which will just do regs->nip += 4; and then return to user
> > space. That still does not work as as soon MC handling returns, the system is
> > locked up.
> >
> > >
> > > Machine check is a serious situation and not always possible to be recovered
> > > from.
> >
> > This one should at least not kill the whole system. It is a simple bus
> > error in user space and the app should get SIGBUS and the the system should
> > carry on.
> >
> > > I would focus more on debugging why the machine check is triggered by the
> > > user space application.
> > > Can you locate what code is causing this machine check from user space?
> > > Is it accessing some hardware related space which is not ready?
> > > Or is it accessing address that it shouldn't have accessed?
> >
> > of course, this is ongoing and getting closer a solution. The MC
> > looking the machine completely does not make this any easier though.
> > These are 2 separate things, fixing the cause and not having a simple bus error
> > lock up the machine.
> > I am focusing on fixing the lockup.
> >
> > I have been following the execution in the kernel and I always end up
> > in the ASM returning from the MC.
> > The other day we got a similar PCI MC(bus error) on T1042
> > CPU(e5500/e500mc) and there the system survived. The one thing I see
> > different there is that MSR RI is set when entering MC, why is that?
> >
> >  Jocke
>
> Got some more info now, this is a new errata I think, adding EDAC to the mix
> yields:
> [   28.372574] LTSSM:16
> [   28.377197] Machine check in kernel mode.
> [   28.381201] Caused by (from MCSR=10008, MCAR:0x8003e000): Bus - Read Data Bus Error
> [   28.388861] Oops: Machine check, sig: 7 [#1]
> [   28.393125] P2010 E500v2
> [   28.395651] Modules linked in: linux_bcm_knet(PO) linux_user_bde(PO) linux_kernel_bde(PO)
> [   28.403842] CPU: 0 PID: 485 Comm: emxp2_hw_bl Tainted: P           O    4.1.43+ #19
> [   28.411499] task: db13a0f0 ti: df17c000 task.ti: df17c000
> [   28.416894] NIP: 10a66954 LR: 10a66a88 CTR: 0f9e7f44
> [   28.421855] REGS: df17df10 TRAP: 0204   Tainted: P           O     (4.1.43+)
> [   28.428901] MSR: 0002d000 <CE,EE,PR,ME>  CR: 44002428  XER: 20000000
> [   28.435267] DEAR: b73cc000 ESR: 00000000
> GPR00: 10a66a88 bfc21bc0 b7eee4a0 136eb4a0 00000000 00000000 00000000 00000000
> GPR08: 0002d000 0003e000 b738e000 00000000 24002422 11db7334 00000000 00000000
> GPR16: 10f8b054 10f895e5 10f8a8bf 0000b541 0000b541 11ddd380 00000011 00000001
> GPR24: 01a9985e 136f1010 07000000 136eb4a0 00006000 07006000 00000000 00000000
> [   28.467506] NIP [10a66954] 0x10a66954
> [   28.471162] LR [10a66a88] 0x10a66a88
> [   28.474730] Call Trace:
> [   28.477170] ---[ end trace b25436dea505b49d ]---
> [   28.481781]
> [   28.483267] PCIe error(s) detected
> [   28.486662] PCIe ERR_DR register: 0x00800000
> [   28.490927] PCIe ERR_CAP_STAT register: 0x00000023
> [   28.495713] PCIe ERR_CAP_R0 register: 0x00000000
> [   28.500324] PCIe ERR_CAP_R1 register: 0x00000000
> [   28.504936] PCIe ERR_CAP_R2 register: 0x00000000
> [   28.509548] PCIe ERR_CAP_R3 register: 0x00000000
>
> I logged LTSSM and it is 16(link up) and Ref. manual says this about ERR_DR =
> 0x00800000:
>
> PCIe ERR_DR: PCT bit
> PCI Express completion time-out. A completion time-out condition was detected
> for a non-posted, outbound PCI Express transaction. An error response is sent
> back to the requestor. Note that a completion timeout counter only starts when
> the non-posted request was able to send to the link partner.
> -
> A completion time-out on the PCI Express link was detected. Note that a
> completion timeout error is a fatal error. If a completion timeout error is
> detected, the system has become unstable. Hot reset is recommended to
> restore stability of the system.
>
> This error is not described in any errata I can find, how to workaround this?

Adding some PCIe experts to the loop.

Regards,
Leo


Thread overview: 21+ messages
2017-09-01 11:32 Machine Check in P2010(e500v2) Joakim Tjernlund
2017-09-05  8:40 ` Joakim Tjernlund
2017-09-06 15:38   ` York Sun
2017-09-06 19:31     ` Leo Li
2017-09-06 20:17       ` Joakim Tjernlund
2017-09-06 20:28         ` Leo Li
2017-09-06 20:53           ` Joakim Tjernlund
2017-09-06 21:13             ` Leo Li
2017-09-06 22:50               ` Joakim Tjernlund
2017-09-07  8:41                 ` Joakim Tjernlund
2017-09-07 18:54                   ` Leo Li
2017-09-08  9:54                     ` Joakim Tjernlund
2017-09-08 12:50                       ` Joakim Tjernlund
2017-09-08 22:27                         ` Leo Li
2017-09-09 12:45                           ` Joakim Tjernlund
     [not found]                             ` <1504961965.31322.72.camel@infinera.com>
2017-09-14 16:55                               ` Joakim Tjernlund
2017-09-20 16:45                             ` Joakim Tjernlund
2017-09-21 18:53                               ` Leo Li
2017-09-06 10:05 ` Laurentiu Tudor
2017-09-06 10:16   ` Joakim Tjernlund
2017-09-08  1:56     ` Scott Wood