All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] powerpc: Send SIGBUS from machine_check
@ 2020-10-01 17:05 Joakim Tjernlund
  2020-10-22 11:19 ` Joakim Tjernlund
  2020-10-23  0:57 ` Michael Ellerman
  0 siblings, 2 replies; 4+ messages in thread
From: Joakim Tjernlund @ 2020-10-01 17:05 UTC (permalink / raw)
  To: linuxppc-dev

Embedded PPC CPU should send SIGBUS to user space when applicable.

Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com>
---
 arch/powerpc/kernel/traps.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 0381242920d9..12715d24141c 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -621,6 +621,11 @@ int machine_check_e500mc(struct pt_regs *regs)
 		       reason & MCSR_MEA ? "Effective" : "Physical", addr);
 	}
 
+	if ((user_mode(regs))) {
+		_exception(SIGBUS, regs, reason, regs->nip);
+		recoverable = 1;
+	}
+
 silent_out:
 	mtspr(SPRN_MCSR, mcsr);
 	return mfspr(SPRN_MCSR) == 0 && recoverable;
@@ -665,6 +670,10 @@ int machine_check_e500(struct pt_regs *regs)
 	if (reason & MCSR_BUS_RPERR)
 		printk("Bus - Read Parity Error\n");
 
+	if ((user_mode(regs))) {
+		_exception(SIGBUS, regs, reason, regs->nip);
+		return 1;
+	}
 	return 0;
 }
 
@@ -695,6 +704,10 @@ int machine_check_e200(struct pt_regs *regs)
 	if (reason & MCSR_BUS_WRERR)
 		printk("Bus - Write Bus Error on buffered store or cache line push\n");
 
+	if ((user_mode(regs))) {
+		_exception(SIGBUS, regs, reason, regs->nip);
+		return 1;
+	}
 	return 0;
 }
 #elif defined(CONFIG_PPC32)
@@ -731,6 +744,10 @@ int machine_check_generic(struct pt_regs *regs)
 	default:
 		printk("Unknown values in msr\n");
 	}
+	if ((user_mode(regs))) {
+		_exception(SIGBUS, regs, reason, regs->nip);
+		return 1;
+	}
 	return 0;
 }
 #endif /* everything else */
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] powerpc: Send SIGBUS from machine_check
  2020-10-01 17:05 [PATCH] powerpc: Send SIGBUS from machine_check Joakim Tjernlund
@ 2020-10-22 11:19 ` Joakim Tjernlund
  2020-10-23  0:57 ` Michael Ellerman
  1 sibling, 0 replies; 4+ messages in thread
From: Joakim Tjernlund @ 2020-10-22 11:19 UTC (permalink / raw)
  To: linuxppc-dev

ping

Also Cc: stable@vger.kernel.org

On Thu, 2020-10-01 at 19:05 +0200, Joakim Tjernlund wrote:
> Embedded PPC CPU should send SIGBUS to user space when applicable.
> 
> Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com>
> ---
>  arch/powerpc/kernel/traps.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index 0381242920d9..12715d24141c 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -621,6 +621,11 @@ int machine_check_e500mc(struct pt_regs *regs)
>  		       reason & MCSR_MEA ? "Effective" : "Physical", addr);
>  	}
>  
> 
> +	if ((user_mode(regs))) {
> +		_exception(SIGBUS, regs, reason, regs->nip);
> +		recoverable = 1;
> +	}
> +
>  silent_out:
>  	mtspr(SPRN_MCSR, mcsr);
>  	return mfspr(SPRN_MCSR) == 0 && recoverable;
> @@ -665,6 +670,10 @@ int machine_check_e500(struct pt_regs *regs)
>  	if (reason & MCSR_BUS_RPERR)
>  		printk("Bus - Read Parity Error\n");
>  
> 
> +	if ((user_mode(regs))) {
> +		_exception(SIGBUS, regs, reason, regs->nip);
> +		return 1;
> +	}
>  	return 0;
>  }
>  
> 
> @@ -695,6 +704,10 @@ int machine_check_e200(struct pt_regs *regs)
>  	if (reason & MCSR_BUS_WRERR)
>  		printk("Bus - Write Bus Error on buffered store or cache line push\n");
>  
> 
> +	if ((user_mode(regs))) {
> +		_exception(SIGBUS, regs, reason, regs->nip);
> +		return 1;
> +	}
>  	return 0;
>  }
>  #elif defined(CONFIG_PPC32)
> @@ -731,6 +744,10 @@ int machine_check_generic(struct pt_regs *regs)
>  	default:
>  		printk("Unknown values in msr\n");
>  	}
> +	if ((user_mode(regs))) {
> +		_exception(SIGBUS, regs, reason, regs->nip);
> +		return 1;
> +	}
>  	return 0;
>  }
>  #endif /* everything else */


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] powerpc: Send SIGBUS from machine_check
  2020-10-01 17:05 [PATCH] powerpc: Send SIGBUS from machine_check Joakim Tjernlund
  2020-10-22 11:19 ` Joakim Tjernlund
@ 2020-10-23  0:57 ` Michael Ellerman
  2020-10-23  9:23   ` Joakim Tjernlund
  1 sibling, 1 reply; 4+ messages in thread
From: Michael Ellerman @ 2020-10-23  0:57 UTC (permalink / raw)
  To: Joakim Tjernlund, linuxppc-dev

Joakim Tjernlund <joakim.tjernlund@infinera.com> writes:
> Embedded PPC CPU should send SIGBUS to user space when applicable.

Yeah, but it's not clear that it's applicable in all cases.

At least I need some reasoning for why it's safe in all cases below to
just send a SIGBUS and take no other action.

Is there a particular CPU you're working on? Can we start with that and
look at all the machine check causes and which can be safely handled.

Some comments below ...


> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index 0381242920d9..12715d24141c 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -621,6 +621,11 @@ int machine_check_e500mc(struct pt_regs *regs)

At the beginning of the function we have:

	printk("Machine check in kernel mode.\n");

Which should be updated.

>  		       reason & MCSR_MEA ? "Effective" : "Physical", addr);
>  	}
>  
> +	if ((user_mode(regs))) {
> +		_exception(SIGBUS, regs, reason, regs->nip);
> +		recoverable = 1;
> +	}

For most of the error causes we take no action and set recoverable = 0.

Then you just declare that it is recoverable because it hit in
userspace. Depending on the cause that might be OK, but it's not
obviously correct in all cases.


> +
>  silent_out:
>  	mtspr(SPRN_MCSR, mcsr);
>  	return mfspr(SPRN_MCSR) == 0 && recoverable;
> @@ -665,6 +670,10 @@ int machine_check_e500(struct pt_regs *regs)

Same comment about the printk().

>  	if (reason & MCSR_BUS_RPERR)
>  		printk("Bus - Read Parity Error\n");
>  
> +	if ((user_mode(regs))) {
> +		_exception(SIGBUS, regs, reason, regs->nip);
> +		return 1;
> +	}

And same comment more or less.

Other than the MCSR_BUS_RBERR cases that are explicitly checked, the
function does nothing to clear the cause of the machine check.

>  	return 0;
>  }
>  
> @@ -695,6 +704,10 @@ int machine_check_e200(struct pt_regs *regs)
>  	if (reason & MCSR_BUS_WRERR)
>  		printk("Bus - Write Bus Error on buffered store or cache line push\n");
>  
> +	if ((user_mode(regs))) {
> +		_exception(SIGBUS, regs, reason, regs->nip);
> +		return 1;
> +	}

Same.

>  	return 0;
>  }
>  #elif defined(CONFIG_PPC32)
> @@ -731,6 +744,10 @@ int machine_check_generic(struct pt_regs *regs)
>  	default:
>  		printk("Unknown values in msr\n");
>  	}
> +	if ((user_mode(regs))) {
> +		_exception(SIGBUS, regs, reason, regs->nip);
> +		return 1;
> +	}

Same.

>  	return 0;
>  }
>  #endif /* everything else */
> -- 
> 2.26.2


cheers

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] powerpc: Send SIGBUS from machine_check
  2020-10-23  0:57 ` Michael Ellerman
@ 2020-10-23  9:23   ` Joakim Tjernlund
  0 siblings, 0 replies; 4+ messages in thread
From: Joakim Tjernlund @ 2020-10-23  9:23 UTC (permalink / raw)
  To: linuxppc-dev, mpe

On Fri, 2020-10-23 at 11:57 +1100, Michael Ellerman wrote:
> 
> 
> Joakim Tjernlund <joakim.tjernlund@infinera.com> writes:
> > Embedded PPC CPU should send SIGBUS to user space when applicable.
> 
> Yeah, but it's not clear that it's applicable in all cases.
> 
> At least I need some reasoning for why it's safe in all cases below to
> just send a SIGBUS and take no other action.

For me this came from an User SDK accessing a PCI device(also using PCI IRQs) and this
SDK did some strange stuff during shutdown which disabled the device before SW was done.
This caused PCI accesses, both from User Space and kernel PCI IRQs access) to the device
which caused an Machine Check(PCI transfer failed). Without this patch, the kernel
would just OOPS and hang/do strange things even for an access made by User space.
Now the User app just gets a SIGBUS and the kernel still works as it should.

Perhaps a SIGBUS and recover isn't right in all cases but without it there will be a
system break down.


> Is there a particular CPU you're working on? Can we start with that and
> look at all the machine check causes and which can be safely handled.

This was a T1042(e5500) but we have e500 and mpc832x as well.

> 
> Some comments below ...
> 
> 
> > diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> > index 0381242920d9..12715d24141c 100644
> > --- a/arch/powerpc/kernel/traps.c
> > +++ b/arch/powerpc/kernel/traps.c
> > @@ -621,6 +621,11 @@ int machine_check_e500mc(struct pt_regs *regs)
> 
> At the beginning of the function we have:
> 
>         printk("Machine check in kernel mode.\n");
> 
> Which should be updated.

Sure, just remove the "in kernel mode" perhaps?

> 
> >                      reason & MCSR_MEA ? "Effective" : "Physical", addr);
> >       }
> > 
> > +     if ((user_mode(regs))) {
> > +             _exception(SIGBUS, regs, reason, regs->nip);
> > +             recoverable = 1;
> > +     }
> 
> For most of the error causes we take no action and set recoverable = 0.
> 
> Then you just declare that it is recoverable because it hit in
> userspace. Depending on the cause that might be OK, but it's not
> obviously correct in all cases.

Not so familiar with PPC that I can make out what is OK or not.
I do think you stand a better chance now that before though.  

> 
> 
> > +
> >  silent_out:
> >       mtspr(SPRN_MCSR, mcsr);
> >       return mfspr(SPRN_MCSR) == 0 && recoverable;
> > @@ -665,6 +670,10 @@ int machine_check_e500(struct pt_regs *regs)
> 
> Same comment about the printk().
> 
> >       if (reason & MCSR_BUS_RPERR)
> >               printk("Bus - Read Parity Error\n");
> > 
> > +     if ((user_mode(regs))) {
> > +             _exception(SIGBUS, regs, reason, regs->nip);
> > +             return 1;
> > +     }
> 
> And same comment more or less.
> 
> Other than the MCSR_BUS_RBERR cases that are explicitly checked, the
> function does nothing to clear the cause of the machine check.
> 
> >       return 0;
> >  }
> > 
> > @@ -695,6 +704,10 @@ int machine_check_e200(struct pt_regs *regs)
> >       if (reason & MCSR_BUS_WRERR)
> >               printk("Bus - Write Bus Error on buffered store or cache line push\n");
> > 
> > +     if ((user_mode(regs))) {
> > +             _exception(SIGBUS, regs, reason, regs->nip);
> > +             return 1;
> > +     }
> 
> Same.
> 
> >       return 0;
> >  }
> >  #elif defined(CONFIG_PPC32)
> > @@ -731,6 +744,10 @@ int machine_check_generic(struct pt_regs *regs)
> >       default:
> >               printk("Unknown values in msr\n");
> >       }
> > +     if ((user_mode(regs))) {
> > +             _exception(SIGBUS, regs, reason, regs->nip);
> > +             return 1;
> > +     }
> 
> Same.
> 
> >       return 0;
> >  }
> >  #endif /* everything else */
> > --
> > 2.26.2
> 
> 
> cheers


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-10-23  9:26 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-01 17:05 [PATCH] powerpc: Send SIGBUS from machine_check Joakim Tjernlund
2020-10-22 11:19 ` Joakim Tjernlund
2020-10-23  0:57 ` Michael Ellerman
2020-10-23  9:23   ` Joakim Tjernlund

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.