* [PATCH 1/2] Introduce add_taint variant that does not print warning.
@ 2017-02-21  4:21 Mahesh J Salgaonkar
  2017-02-21  4:21 ` [PATCH 2/2] powerpc/book3s: mce: Use add_taint_no_warn() in machine_check_early() Mahesh J Salgaonkar
  0 siblings, 1 reply; 5+ messages in thread
From: Mahesh J Salgaonkar @ 2017-02-21  4:21 UTC (permalink / raw)
  To: linuxppc-dev, Linux Kernel, Michael Ellerman

From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

Some interrupt handlers cannot safely call printk; doing so can lead to
unexpected behavior or a kernel panic, e.g. machine_check_early() called
from the MCE handler on an OPAL-based system. Introduce an add_taint
variant that does not call printk.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---
 include/linux/kernel.h |    1 +
 kernel/panic.c         |    6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index cb09238..799943e 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -480,6 +480,7 @@ enum lockdep_ok {
 	LOCKDEP_NOW_UNRELIABLE
 };
 extern void add_taint(unsigned flag, enum lockdep_ok);
+extern void add_taint_no_warn(unsigned flag, enum lockdep_ok);
 extern int test_taint(unsigned flag);
 extern unsigned long get_taint(void);
 extern int root_mountflags;
diff --git a/kernel/panic.c b/kernel/panic.c
index 08aa88d..d344ea3 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -392,6 +392,12 @@ void add_taint(unsigned flag, enum lockdep_ok lockdep_ok)
 }
 EXPORT_SYMBOL(add_taint);
 
+void add_taint_no_warn(unsigned flag, enum lockdep_ok lockdep_ok)
+{
+	set_bit(flag, &tainted_mask);
+}
+EXPORT_SYMBOL(add_taint_no_warn);
+
 static void spin_msec(int msecs)
 {
 	int i;
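
For readers following the thread, here is a minimal userspace sketch of
the difference between the two entry points. It is a model under stated
assumptions, not the kernel implementation: every model_* name, the
MODEL_TAINT_MACHINE_CHECK bit number and the debug_locks stand-in are
hypothetical, and fprintf() stands in for the printk path. Note that the
variant in the patch above ignores its lockdep_ok argument, which the
model mirrors.

```c
#include <stdio.h>

enum lockdep_ok { LOCKDEP_STILL_OK, LOCKDEP_NOW_UNRELIABLE };

/* Illustrative bit number for the model only. */
#define MODEL_TAINT_MACHINE_CHECK 4

static unsigned long tainted_mask;  /* stand-in for the kernel's taint mask */
static int debug_locks = 1;         /* stand-in for "lock debugging still on" */
static int warnings_printed;        /* counts trips through the console path */

/* Stand-in for the printk path that is unsafe in real mode. */
static void model_console_warn(const char *msg)
{
	warnings_printed++;
	fprintf(stderr, "%s\n", msg);
}

/* Warning variant: prints once, the first time lock debugging is disabled. */
void model_add_taint(unsigned flag, enum lockdep_ok lockdep_ok)
{
	if (lockdep_ok == LOCKDEP_NOW_UNRELIABLE && debug_locks) {
		debug_locks = 0;
		model_console_warn("Disabling lock debugging due to kernel taint");
	}
	tainted_mask |= 1UL << flag;
}

/* No-warn variant: only sets the bit, never touches the console path. */
void model_add_taint_no_warn(unsigned flag, enum lockdep_ok lockdep_ok)
{
	(void)lockdep_ok;  /* ignored, as in the patch */
	tainted_mask |= 1UL << flag;
}

int model_test_taint(unsigned flag)
{
	return (tainted_mask & (1UL << flag)) != 0;
}
```

In this model the no-warn variant never reaches the console path, but it
also leaves the lock-debugging state untouched, so the two calls are not
equivalent apart from the message.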


* [PATCH 2/2] powerpc/book3s: mce: Use add_taint_no_warn() in machine_check_early().
  2017-02-21  4:21 [PATCH 1/2] Introduce add_taint variant that does not print warning Mahesh J Salgaonkar
@ 2017-02-21  4:21 ` Mahesh J Salgaonkar
  2017-04-11 10:35   ` Michael Ellerman
  2017-04-17 10:39   ` Daniel Axtens
  0 siblings, 2 replies; 5+ messages in thread
From: Mahesh J Salgaonkar @ 2017-02-21  4:21 UTC (permalink / raw)
  To: linuxppc-dev, Linux Kernel, Michael Ellerman

From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

machine_check_early() gets called in real mode. The very first time
add_taint() is called, it prints a warning, which ends up making an OPAL
call (via the OPAL_CALL wrapper) to write it to the console. If we take
that very first machine check while we are in OPAL, we are doomed:
OPAL_CALL overwrites the PACASAVEDMSR in r13, so when we are done with MCE
handling the original OPAL call will use this new MSR on its way back to
opal_return. This usually leads to unexpected behaviour or a kernel panic.
Instead, use add_taint_no_warn(), which does not call printk.

This is broken with the current FW level. We have been lucky so far not to
take the very first MCE while in OPAL, but it is easily reproducible on
Mambo. This should go to stable as well, along with patch 1/2.

Fixes: 27ea2c420cad powerpc: Set the correct kernel taint on machine check errors.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/traps.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 62b587f..4a048dc 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -306,7 +306,7 @@ long machine_check_early(struct pt_regs *regs)
 
 	__this_cpu_inc(irq_stat.mce_exceptions);
 
-	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
+	add_taint_no_warn(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
 
 	/*
 	 * See if platform is capable of handling machine check. (e.g. PowerNV


* Re: [PATCH 2/2] powerpc/book3s: mce: Use add_taint_no_warn() in machine_check_early().
  2017-02-21  4:21 ` [PATCH 2/2] powerpc/book3s: mce: Use add_taint_no_warn() in machine_check_early() Mahesh J Salgaonkar
@ 2017-04-11 10:35   ` Michael Ellerman
  2017-04-17 10:39   ` Daniel Axtens
  1 sibling, 0 replies; 5+ messages in thread
From: Michael Ellerman @ 2017-04-11 10:35 UTC (permalink / raw)
  To: Mahesh J Salgaonkar, linuxppc-dev, Linux Kernel

Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> writes:

> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>
> machine_check_early() gets called in real mode. The very first time
> add_taint() is called, it prints a warning, which ends up making an OPAL
> call (via the OPAL_CALL wrapper) to write it to the console. If we take
> that very first machine check while we are in OPAL, we are doomed:
> OPAL_CALL overwrites the PACASAVEDMSR in r13, so when we are done with MCE
> handling the original OPAL call will use this new MSR on its way back to
> opal_return. This usually leads to unexpected behaviour or a kernel panic.
> Instead, use add_taint_no_warn(), which does not call printk.
>
> This is broken with the current FW level. We have been lucky so far not to
> take the very first MCE while in OPAL, but it is easily reproducible on
> Mambo. This should go to stable as well, along with patch 1/2.

This is not a good way to fix a bug that needs to go back to stable.
Changing generic code means I need to sync up with the right maintainer,
get acks, etc. And then convince people that it should go to stable also.

So can you please fix this a different way for stable?

Can we just do the tainting later, once we're in virtual mode?

cheers



* Re: [PATCH 2/2] powerpc/book3s: mce: Use add_taint_no_warn() in machine_check_early().
  2017-02-21  4:21 ` [PATCH 2/2] powerpc/book3s: mce: Use add_taint_no_warn() in machine_check_early() Mahesh J Salgaonkar
  2017-04-11 10:35   ` Michael Ellerman
@ 2017-04-17 10:39   ` Daniel Axtens
  2017-04-17 11:19     ` Mahesh Jagannath Salgaonkar
  1 sibling, 1 reply; 5+ messages in thread
From: Daniel Axtens @ 2017-04-17 10:39 UTC (permalink / raw)
  To: Mahesh J Salgaonkar, linuxppc-dev, Linux Kernel, Michael Ellerman

Hi Mahesh,

> Fixes: 27ea2c420cad powerpc: Set the correct kernel taint on machine check errors.

I notice this Fixes a commit I introduced. Please could you cc me when
you do this? I am likely to miss it otherwise, especially since I have
now left IBM.

Being cced allows me to provide an Ack or a review. And getting feedback
on my changes is very helpful in becoming a better programmer.

In this case, as per Michael's comment, why don't we just move the
add_taint from machine_check_early to
machine_check_process_queued_event - the other side of the work queue.

The work queue system is supposed to provide us with a safe place to do
printing, etc., so it's an appropriate place. Also, we already do
machine_check_print_event_info there, and adding the taint doesn't need
to be done synchronously.

Regards,
Daniel
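
The deferral suggested above can be sketched as a small userspace model.
It is not the powerpc code: the queue, its length, the taint bit and all
model_* names are hypothetical, and printf() stands in for
machine_check_print_event_info(). The point is only that the early side
records the event and nothing else, while printing and tainting both
happen on the deferred side.

```c
#include <stdbool.h>
#include <stdio.h>

#define MODEL_QUEUE_LEN 8
#define MODEL_TAINT_MACHINE_CHECK 4  /* illustrative bit number */

static int queue[MODEL_QUEUE_LEN];
static unsigned head, tail;          /* free-running indices into queue[] */
static unsigned long tainted_mask;
static int events_printed;

/* Early side: restricted context, so only record the event. */
bool model_mce_early(int event_id)
{
	if (tail - head == MODEL_QUEUE_LEN)
		return false;        /* queue full: dropped in this model */
	queue[tail++ % MODEL_QUEUE_LEN] = event_id;
	return true;
}

/* Deferred side: safe context, so print and taint here. */
int model_process_queued_events(void)
{
	int handled = 0;

	while (head != tail) {
		int ev = queue[head++ % MODEL_QUEUE_LEN];

		printf("machine check event %d\n", ev);
		events_printed++;
		tainted_mask |= 1UL << MODEL_TAINT_MACHINE_CHECK;
		handled++;
	}
	return handled;
}
```

One consequence visible in the model is that the taint bit is not set
until the queue is drained, which matches the remark that tainting does
not need to happen synchronously.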



* Re: [PATCH 2/2] powerpc/book3s: mce: Use add_taint_no_warn() in machine_check_early().
  2017-04-17 10:39   ` Daniel Axtens
@ 2017-04-17 11:19     ` Mahesh Jagannath Salgaonkar
  0 siblings, 0 replies; 5+ messages in thread
From: Mahesh Jagannath Salgaonkar @ 2017-04-17 11:19 UTC (permalink / raw)
  To: Daniel Axtens, linuxppc-dev, Linux Kernel, Michael Ellerman

On 04/17/2017 04:09 PM, Daniel Axtens wrote:
> Hi Mahesh,
> 
>> Fixes: 27ea2c420cad powerpc: Set the correct kernel taint on machine check errors.
> 
> I notice this Fixes a commit I introduced. Please could you cc me when
> you do this? I am likely to miss it otherwise, especially since I have
> now left IBM.

Sure will do. :-)

> 
> Being cced allows me to provide an Ack or a review. And getting feedback
> on my changes is very helpful in becoming a better programmer.
> 
> In this case, as per Michael's comment, why don't we just move the
> add_taint from machine_check_early to
> machine_check_process_queued_event - the other side of the work queue.

Yes, that is my plan. That is not the only place, though: add_taint()
needs to be called from machine_check_exception() as well, so it will be
called from two places.

Thanks,
-Mahesh.

