linuxppc-dev.lists.ozlabs.org archive mirror
* powerpc: Don't silently handle machine checks from userspace
@ 2012-11-02 11:48 Martijn de Gouw
  2012-11-02 16:36 ` Scott Wood
  0 siblings, 1 reply; 6+ messages in thread
From: Martijn de Gouw @ 2012-11-02 11:48 UTC (permalink / raw)
  To: Anton Blanchard, Benjamin Herrenschmidt, linuxppc-dev; +Cc: Micha Nelissen

Hi,

The following commit:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=commit;h=e49b1fae0ba4d06b29bd753a961abb447566bf4a

causes confusion, because it prints "Machine check in kernel mode" even when the bus error actually occurred in user space. When a RapidIO device accessed through a memory mapping is removed or powered off, a bus error is generated. This is on a Freescale MPC8548 PowerPC. Because the user_mode check was removed, the kernel calls "die", which kills the process with a bus error regardless of whether it has a SIGBUS handler.
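
For reference, the user-space side of this might look roughly like the sketch below: a process installs a SIGBUS handler with sigaction() and expects it to run when an access fails. This is only an illustration, not code from the application in question; the names on_sigbus, install_sigbus_handler and sigbus_demo are made up here, and raise() stands in for the kernel delivering the signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the handler; sig_atomic_t is safe to write from a signal handler. */
static volatile sig_atomic_t got_sigbus = 0;

/* Hypothetical handler: only records that SIGBUS was delivered. */
static void on_sigbus(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)info;
    (void)ctx;
    got_sigbus = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_sigbus_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_sigbus;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}

/* Usage: simulate delivery with raise() and check that the handler ran. */
void sigbus_demo(void)
{
    assert(install_sigbus_handler() == 0);
    raise(SIGBUS);
    assert(got_sigbus == 1);
}
```

A handler for a real faulting access would usually not just return, since the access would typically fault again; jumping out with siglongjmp() is one common approach.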

I therefore request that this check be put back, and even that the removed code be placed at the top of the machine check handler, because there is no point in trying to recover from a user-space bus error anyway.

Best regards,

-- 
Martijn de Gouw
Engineer
Prodrive B.V.
Mobile: +31 63 17 76 161
Phone:  +31 40 26 76 200


* Re: powerpc: Don't silently handle machine checks from userspace
  2012-11-02 11:48 powerpc: Don't silently handle machine checks from userspace Martijn de Gouw
@ 2012-11-02 16:36 ` Scott Wood
  2012-11-06  9:21   ` Micha Nelissen
  0 siblings, 1 reply; 6+ messages in thread
From: Scott Wood @ 2012-11-02 16:36 UTC (permalink / raw)
  To: Martijn de Gouw; +Cc: Micha Nelissen, linuxppc-dev, Anton Blanchard

On 11/02/2012 06:48:40 AM, Martijn de Gouw wrote:
> Hi,
>
> The following commit:
>
> http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=commit;h=e49b1fae0ba4d06b29bd753a961abb447566bf4a
>
> causes confusion, because it prints "Machine check in kernel mode"
> also when the bus error is actually in user space. When using RapidIO
> memory mapped access, and the device is removed or powered off, then
> a bus error is generated. This is on a freescale mpc8548 powerpc. Due
> to removing the user_mode check, the kernel calls "die" which causes
> the process to die with a BUS error, regardless of having a SIGBUS
> handler or not.
>
> Therefore I request to put this check back, and even to put the
> removed code at the top of the machine check handler because there is
> no point in trying to recover from a user space bus error anyway.

Why is there no point trying to recover?  For example, see MCSR_ICPERR
and MCSR_DCPERR_MC in machine_check_e500mc.  The machine check is just
letting us know that there was an error and the read-only cache got
dumped (i.e. it was a correctable error).

-Scott


* RE: powerpc: Don't silently handle machine checks from userspace
  2012-11-02 16:36 ` Scott Wood
@ 2012-11-06  9:21   ` Micha Nelissen
  2012-11-06 16:34     ` Scott Wood
  0 siblings, 1 reply; 6+ messages in thread
From: Micha Nelissen @ 2012-11-06  9:21 UTC (permalink / raw)
  To: 'Scott Wood', Martijn de Gouw; +Cc: linuxppc-dev, Anton Blanchard

From: Scott Wood [mailto:scottwood@freescale.com]
> > Therefore I request to put this check back, and even to put the
> > removed code at the top of the machine check handler because there is
> > no point in trying to recover from a user space bus error anyway.
>
> Why is there no point trying to recover?  For example, see MCSR_ICPERR
> and MCSR_DCPERR_MC in machine_check_e500mc.  The machine check is just
> letting us know that there was an error and the read-only cache got
> dumped (i.e. it was a correctable error).

Oh I overlooked those cases; those correctable errors shouldn't be bus errors for the user space process?

Hmm I guess there is no simple solution then, since the "recover" function also prints the kernel messages about the machine check being in kernel mode without having checked whether it really was in kernel mode. In the past the user mode check was in between.

Micha


* Re: powerpc: Don't silently handle machine checks from userspace
  2012-11-06  9:21   ` Micha Nelissen
@ 2012-11-06 16:34     ` Scott Wood
  2012-11-06 16:43       ` Micha Nelissen
  0 siblings, 1 reply; 6+ messages in thread
From: Scott Wood @ 2012-11-06 16:34 UTC (permalink / raw)
  To: Micha Nelissen; +Cc: Martijn de Gouw, linuxppc-dev, Anton Blanchard

On 11/06/2012 03:21:37 AM, Micha Nelissen wrote:
> From: Scott Wood [mailto:scottwood@freescale.com]
> > > Therefore I request to put this check back, and even to put the
> > > removed code at the top of the machine check handler because
> > > there is no point in trying to recover from a user space bus
> > > error anyway.
> >
> > Why is there no point trying to recover?  For example, see
> > MCSR_ICPERR and MCSR_DCPERR_MC in machine_check_e500mc.  The
> > machine check is just letting us know that there was an error and
> > the read-only cache got dumped (i.e. it was a correctable error).
>
> Oh I overlooked those cases; those correctable errors shouldn't be
> bus errors for the user space process?
>
> Hmm I guess there is no simple solution then, since the "recover"
> function also prints the kernel messages about the machine check
> being in kernel mode without having checked whether it really was in
> kernel mode. In the past the user mode check was in between.

It shouldn't be that difficult to make it say "in user mode" or "in
kernel mode" depending on which it was... or just remove that phrase
altogether and let the following output indicate whether it was in
kernel mode.
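
A rough user-space model of the first option (picking the phrase based on the mode the fault came from) could look like this. It is a sketch, not the kernel's code: struct sketch_regs, sketch_user_mode() and format_mc_report() are hypothetical stand-ins, and SKETCH_MSR_PR assumes the MSR problem-state bit is 0x4000 as in the kernel's MSR_PR definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumption: MSR[PR] (problem state, i.e. user mode) is bit 0x4000. */
#define SKETCH_MSR_PR 0x4000UL

/* Hypothetical stand-in for the kernel's struct pt_regs; only msr is modeled. */
struct sketch_regs {
    unsigned long msr;
};

/* Models the kernel's user_mode(regs) test: nonzero if the fault was in user mode. */
static int sketch_user_mode(const struct sketch_regs *regs)
{
    assert(regs != NULL);
    return (regs->msr & SKETCH_MSR_PR) != 0;
}

/* Write the report line into buf; returns what snprintf() returns. */
int format_mc_report(char *buf, size_t len, const struct sketch_regs *regs)
{
    return snprintf(buf, len, "Machine check in %s mode.",
                    sketch_user_mode(regs) ? "user" : "kernel");
}
```

In the kernel the same choice would presumably sit at the existing message site in the machine check path, using the real user_mode(regs).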

-Scott


* RE: powerpc: Don't silently handle machine checks from userspace
  2012-11-06 16:34     ` Scott Wood
@ 2012-11-06 16:43       ` Micha Nelissen
  2012-11-06 20:13         ` Scott Wood
  0 siblings, 1 reply; 6+ messages in thread
From: Micha Nelissen @ 2012-11-06 16:43 UTC (permalink / raw)
  To: 'Scott Wood'; +Cc: Martijn de Gouw, linuxppc-dev, Anton Blanchard

From: Scott Wood [mailto:scottwood@freescale.com]
>> Hmm I guess there is no simple solution then, since the "recover"
>> function also prints the kernel messages about the machine check
>> being in kernel mode without having checked whether it really was in
>> kernel mode. In the past the user mode check was in between.
>
> It shouldn't be that difficult to make it say "in user mode" or "in
> kernel mode" depending on which it was... or just remove that phrase
> altogether and let the following output indicate whether it was in
> kernel mode.

Well printing the correct message is only part of it: do "we" want to pass guarded load errors (RapidIO bus errors) to user space or not? And if we pass them, should we print a kernel message at all?

Micha


* Re: powerpc: Don't silently handle machine checks from userspace
  2012-11-06 16:43       ` Micha Nelissen
@ 2012-11-06 20:13         ` Scott Wood
  0 siblings, 0 replies; 6+ messages in thread
From: Scott Wood @ 2012-11-06 20:13 UTC (permalink / raw)
  To: Micha Nelissen; +Cc: Martijn de Gouw, linuxppc-dev, Anton Blanchard

On 11/06/2012 10:43:19 AM, Micha Nelissen wrote:
> From: Scott Wood [mailto:scottwood@freescale.com]
> >> Hmm I guess there is no simple solution then, since the "recover"
> >> function also prints the kernel messages about the machine check
> >> being in kernel mode without having checked whether it really was
> >> in kernel mode. In the past the user mode check was in between.
> >
> > It shouldn't be that difficult to make it say "in user mode" or "in
> > kernel mode" depending on which it was... or just remove that phrase
> > altogether and let the following output indicate whether it was in
> > kernel mode.
>
> Well printing the correct message is only part of it: do "we" want to
> pass guarded load errors (RapidIO bus errors) to user space or not?

Yes.

> And if we pass them, should we print a kernel message at all?

Yes (but maybe ratelimited).
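
To illustrate what "ratelimited" means here: at most a fixed burst of messages per time window, with the rest suppressed. The kernel has printk_ratelimited() for this purpose; the code below is only a simplified user-space model of the idea, with a hypothetical struct sketch_ratelimit and the time passed in explicitly by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a rate limiter; not the kernel's implementation. */
struct sketch_ratelimit {
    long interval;  /* window length, in caller-defined ticks */
    int burst;      /* messages allowed per window */
    long begin;     /* start of the current window */
    int printed;    /* messages emitted in the current window */
    int missed;     /* messages suppressed so far */
};

/* Returns 1 if a message may be printed at time `now`, 0 if it is suppressed. */
int sketch_ratelimit_ok(struct sketch_ratelimit *rl, long now)
{
    assert(rl != NULL);

    /* Start a new window once the interval has elapsed. */
    if (now - rl->begin >= rl->interval) {
        rl->begin = now;
        rl->printed = 0;
    }

    if (rl->printed < rl->burst) {
        rl->printed++;
        return 1;
    }

    rl->missed++;
    return 0;
}
```

With this shape, a user-space process that keeps hitting a dead RapidIO mapping would still get its SIGBUS every time, while the kernel log would only see a bounded number of messages per window.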

-Scott

