All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] arm64: traps: add dump instr before BUG in kernel
@ 2021-09-29 13:29 ` Chen Lin
  0 siblings, 0 replies; 10+ messages in thread
From: Chen Lin @ 2021-09-29 13:29 UTC (permalink / raw)
  To: catalin.marinas
  Cc: will, mark.rutland, joey.gouly, maz, linux-arm-kernel,
	linux-kernel, Chen Lin

From: Chen Lin <chen.lin5@zte.com.cn>

we should dump the real instructions before BUG in kernel mode, and
compare this to the instructions from objdump.

Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
---
 arch/arm64/kernel/traps.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index b03e383..621a9dd 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -495,7 +495,12 @@ void do_undefinstr(struct pt_regs *regs)
 	if (call_undef_hook(regs) == 0)
 		return;
 
-	BUG_ON(!user_mode(regs));
+	if (!user_mode(regs)) {
+		pr_emerg("Undef instruction in kernel, dump instr:");
+		dump_kernel_instr(KERN_EMERG, regs);
+		BUG();
+	}
+
 	force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc, 0);
 }
 NOKPROBE_SYMBOL(do_undefinstr);
-- 
1.7.9.5


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH] arm64: traps: add dump instr before BUG in kernel
@ 2021-09-29 13:29 ` Chen Lin
  0 siblings, 0 replies; 10+ messages in thread
From: Chen Lin @ 2021-09-29 13:29 UTC (permalink / raw)
  To: catalin.marinas
  Cc: will, mark.rutland, joey.gouly, maz, linux-arm-kernel,
	linux-kernel, Chen Lin

From: Chen Lin <chen.lin5@zte.com.cn>

we should dump the real instructions before BUG in kernel mode, and
compare this to the instructions from objdump.

Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
---
 arch/arm64/kernel/traps.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index b03e383..621a9dd 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -495,7 +495,12 @@ void do_undefinstr(struct pt_regs *regs)
 	if (call_undef_hook(regs) == 0)
 		return;
 
-	BUG_ON(!user_mode(regs));
+	if (!user_mode(regs)) {
+		pr_emerg("Undef instruction in kernel, dump instr:");
+		dump_kernel_instr(KERN_EMERG, regs);
+		BUG();
+	}
+
 	force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc, 0);
 }
 NOKPROBE_SYMBOL(do_undefinstr);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] arm64: traps: add dump instr before BUG in kernel
  2021-09-29 13:29 ` Chen Lin
@ 2021-09-30  8:42   ` Will Deacon
  -1 siblings, 0 replies; 10+ messages in thread
From: Will Deacon @ 2021-09-30  8:42 UTC (permalink / raw)
  To: Chen Lin
  Cc: catalin.marinas, mark.rutland, joey.gouly, maz, linux-arm-kernel,
	linux-kernel, Chen Lin

On Wed, Sep 29, 2021 at 09:29:46PM +0800, Chen Lin wrote:
> From: Chen Lin <chen.lin5@zte.com.cn>
> 
> we should dump the real instructions before BUG in kernel mode, and
> compare this to the instructions from objdump.
> 
> Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
> ---
>  arch/arm64/kernel/traps.c |    7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index b03e383..621a9dd 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -495,7 +495,12 @@ void do_undefinstr(struct pt_regs *regs)
>  	if (call_undef_hook(regs) == 0)
>  		return;
>  
> -	BUG_ON(!user_mode(regs));
> +	if (!user_mode(regs)) {
> +		pr_emerg("Undef instruction in kernel, dump instr:");
> +		dump_kernel_instr(KERN_EMERG, regs);
> +		BUG();
> +	}

Hmm, I'm not completely convinced about this as the instruction in the
i-cache could be completely different. I think the PC value (for addr2line)
is a lot more useful, and we should be printing that already.

Maybe you can elaborate on a situation where this information was helpful?

Thanks,

Will

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] arm64: traps: add dump instr before BUG in kernel
@ 2021-09-30  8:42   ` Will Deacon
  0 siblings, 0 replies; 10+ messages in thread
From: Will Deacon @ 2021-09-30  8:42 UTC (permalink / raw)
  To: Chen Lin
  Cc: catalin.marinas, mark.rutland, joey.gouly, maz, linux-arm-kernel,
	linux-kernel, Chen Lin

On Wed, Sep 29, 2021 at 09:29:46PM +0800, Chen Lin wrote:
> From: Chen Lin <chen.lin5@zte.com.cn>
> 
> we should dump the real instructions before BUG in kernel mode, and
> compare this to the instructions from objdump.
> 
> Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
> ---
>  arch/arm64/kernel/traps.c |    7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index b03e383..621a9dd 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -495,7 +495,12 @@ void do_undefinstr(struct pt_regs *regs)
>  	if (call_undef_hook(regs) == 0)
>  		return;
>  
> -	BUG_ON(!user_mode(regs));
> +	if (!user_mode(regs)) {
> +		pr_emerg("Undef instruction in kernel, dump instr:");
> +		dump_kernel_instr(KERN_EMERG, regs);
> +		BUG();
> +	}

Hmm, I'm not completely convinced about this as the instruction in the
i-cache could be completely different. I think the PC value (for addr2line)
is a lot more useful, and we should be printing that already.

Maybe you can elaborate on a situation where this information was helpful?

Thanks,

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re:Re: [PATCH] arm64: traps: add dump instr before BUG in kernel
  2021-09-30  8:42   ` Will Deacon
@ 2021-09-30 14:41     ` Chen Lin
  -1 siblings, 0 replies; 10+ messages in thread
From: Chen Lin @ 2021-09-30 14:41 UTC (permalink / raw)
  To: will
  Cc: catalin.marinas, mark.rutland, joey.gouly, maz, linux-arm-kernel,
	linux-kernel, chen.lin5, Chen Lin

At 2021-09-30 15:42:47, "Will Deacon" <will@kernel.org> wrote:

>On Wed, Sep 29, 2021 at 09:29:46PM +0800, Chen Lin wrote:
>> From: Chen Lin <chen.lin5@zte.com.cn>
>> 
>> we should dump the real instructions before BUG in kernel mode, and
>> compare this to the instructions from objdump.
>> 
>> Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
>> ---
>>  arch/arm64/kernel/traps.c |    7 ++++++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>> 
>> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
>> index b03e383..621a9dd 100644
>> --- a/arch/arm64/kernel/traps.c
>> +++ b/arch/arm64/kernel/traps.c
>> @@ -495,7 +495,12 @@ void do_undefinstr(struct pt_regs *regs)
>>  	if (call_undef_hook(regs) == 0)
>>  		return;
>>  
>> -	BUG_ON(!user_mode(regs));
>> +	if (!user_mode(regs)) {
>> +		pr_emerg("Undef instruction in kernel, dump instr:");
>> +		dump_kernel_instr(KERN_EMERG, regs);
>> +		BUG();
>> +	}
>
>Hmm, I'm not completely convinced about this as the instruction in the
>i-cache could be completely different. I think the PC value (for addr2line)
>is a lot more useful, and we should be printing that already.
>
>Maybe you can elaborate on a situation where this information was helpful?
>
>Thanks,
>
>Will

Undef instruction occurs in some cases

1. CPU do not have the permission to execute the instruction or the current CPU
 version does not support the instruction. For example, execute 
 'mrs x0, tcr_el3' under el1.

2. The instruction is a normal instruction, but it is changed during board 
running in some abnormal situation. eg: DDR bit flip, the normal instruction 
will become an undefined one. By printing the instruction, we can see the 
accurate instruction code and compare it with the instruction code from objdump
to determine that it is a DDR issue.

3. It is rare that the instructions seen through the CPU are inconsistent with 
the instructions in the actual DDR.You can also compare the printed instructions
with the instructions in memory(may through kdump) to determine that it is the
CPU cache or some other issue.

However, now the instruction code causing the sync 'undef instr exception' cannot be
seen. The second and third type problem above cannot be determined.

Before the commit 8a60419d36762a1 "arm64: force_signal_inject: WARN if called 
from kernel context", the instructions can be printed when the CPU encounters an
undef instruction in kernel mode. 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re:Re: [PATCH] arm64: traps: add dump instr before BUG in kernel
@ 2021-09-30 14:41     ` Chen Lin
  0 siblings, 0 replies; 10+ messages in thread
From: Chen Lin @ 2021-09-30 14:41 UTC (permalink / raw)
  To: will
  Cc: catalin.marinas, mark.rutland, joey.gouly, maz, linux-arm-kernel,
	linux-kernel, chen.lin5, Chen Lin

At 2021-09-30 15:42:47, "Will Deacon" <will@kernel.org> wrote:

>On Wed, Sep 29, 2021 at 09:29:46PM +0800, Chen Lin wrote:
>> From: Chen Lin <chen.lin5@zte.com.cn>
>> 
>> we should dump the real instructions before BUG in kernel mode, and
>> compare this to the instructions from objdump.
>> 
>> Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
>> ---
>>  arch/arm64/kernel/traps.c |    7 ++++++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>> 
>> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
>> index b03e383..621a9dd 100644
>> --- a/arch/arm64/kernel/traps.c
>> +++ b/arch/arm64/kernel/traps.c
>> @@ -495,7 +495,12 @@ void do_undefinstr(struct pt_regs *regs)
>>  	if (call_undef_hook(regs) == 0)
>>  		return;
>>  
>> -	BUG_ON(!user_mode(regs));
>> +	if (!user_mode(regs)) {
>> +		pr_emerg("Undef instruction in kernel, dump instr:");
>> +		dump_kernel_instr(KERN_EMERG, regs);
>> +		BUG();
>> +	}
>
>Hmm, I'm not completely convinced about this as the instruction in the
>i-cache could be completely different. I think the PC value (for addr2line)
>is a lot more useful, and we should be printing that already.
>
>Maybe you can elaborate on a situation where this information was helpful?
>
>Thanks,
>
>Will

Undef instruction occurs in some cases

1. CPU do not have the permission to execute the instruction or the current CPU
 version does not support the instruction. For example, execute 
 'mrs x0, tcr_el3' under el1.

2. The instruction is a normal instruction, but it is changed during board 
running in some abnormal situation. eg: DDR bit flip, the normal instruction 
will become an undefined one. By printing the instruction, we can see the 
accurate instruction code and compare it with the instruction code from objdump
to determine that it is a DDR issue.

3. It is rare that the instructions seen through the CPU are inconsistent with 
the instructions in the actual DDR.You can also compare the printed instructions
with the instructions in memory(may through kdump) to determine that it is the
CPU cache or some other issue.

However, now the instruction code causing the sync 'undef instr exception' cannot be
seen. The second and third type problem above cannot be determined.

Before the commit 8a60419d36762a1 "arm64: force_signal_inject: WARN if called 
from kernel context", the instructions can be printed when the CPU encounters an
undef instruction in kernel mode. 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Re: [PATCH] arm64: traps: add dump instr before BUG in kernel
  2021-09-30 14:41     ` Chen Lin
@ 2021-10-11 10:06       ` Will Deacon
  -1 siblings, 0 replies; 10+ messages in thread
From: Will Deacon @ 2021-10-11 10:06 UTC (permalink / raw)
  To: Chen Lin
  Cc: catalin.marinas, mark.rutland, joey.gouly, maz, linux-arm-kernel,
	linux-kernel, chen.lin5

On Thu, Sep 30, 2021 at 10:41:30PM +0800, Chen Lin wrote:
> At 2021-09-30 15:42:47, "Will Deacon" <will@kernel.org> wrote:
> 
> >On Wed, Sep 29, 2021 at 09:29:46PM +0800, Chen Lin wrote:
> >> From: Chen Lin <chen.lin5@zte.com.cn>
> >> 
> >> we should dump the real instructions before BUG in kernel mode, and
> >> compare this to the instructions from objdump.
> >> 
> >> Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
> >> ---
> >>  arch/arm64/kernel/traps.c |    7 ++++++-
> >>  1 file changed, 6 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> >> index b03e383..621a9dd 100644
> >> --- a/arch/arm64/kernel/traps.c
> >> +++ b/arch/arm64/kernel/traps.c
> >> @@ -495,7 +495,12 @@ void do_undefinstr(struct pt_regs *regs)
> >>  	if (call_undef_hook(regs) == 0)
> >>  		return;
> >>  
> >> -	BUG_ON(!user_mode(regs));
> >> +	if (!user_mode(regs)) {
> >> +		pr_emerg("Undef instruction in kernel, dump instr:");
> >> +		dump_kernel_instr(KERN_EMERG, regs);
> >> +		BUG();
> >> +	}
> >
> >Hmm, I'm not completely convinced about this as the instruction in the
> >i-cache could be completely different. I think the PC value (for addr2line)
> >is a lot more useful, and we should be printing that already.
> >
> >Maybe you can elaborate on a situation where this information was helpful?
> >
> >Thanks,
> >
> >Will
> 
> Undef instruction occurs in some cases
> 
> 1. CPU do not have the permission to execute the instruction or the current CPU
>  version does not support the instruction. For example, execute 
>  'mrs x0, tcr_el3' under el1.

This really shouldn't happen, but if it did, the PC would surely be enough
to debug the problem?

> 2. The instruction is a normal instruction, but it is changed during board 
> running in some abnormal situation. eg: DDR bit flip, the normal instruction 
> will become an undefined one. By printing the instruction, we can see the 
> accurate instruction code and compare it with the instruction code from objdump
> to determine that it is a DDR issue.

Is this really something we should be designing our exception handlers for?
If we're getting DDR bit flips for kernel .text, then it sounds like we need
ECC and/or RAS features to deal with them.

So I'm not really sold on this change.

Will

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Re: [PATCH] arm64: traps: add dump instr before BUG in kernel
@ 2021-10-11 10:06       ` Will Deacon
  0 siblings, 0 replies; 10+ messages in thread
From: Will Deacon @ 2021-10-11 10:06 UTC (permalink / raw)
  To: Chen Lin
  Cc: catalin.marinas, mark.rutland, joey.gouly, maz, linux-arm-kernel,
	linux-kernel, chen.lin5

On Thu, Sep 30, 2021 at 10:41:30PM +0800, Chen Lin wrote:
> At 2021-09-30 15:42:47, "Will Deacon" <will@kernel.org> wrote:
> 
> >On Wed, Sep 29, 2021 at 09:29:46PM +0800, Chen Lin wrote:
> >> From: Chen Lin <chen.lin5@zte.com.cn>
> >> 
> >> we should dump the real instructions before BUG in kernel mode, and
> >> compare this to the instructions from objdump.
> >> 
> >> Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
> >> ---
> >>  arch/arm64/kernel/traps.c |    7 ++++++-
> >>  1 file changed, 6 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> >> index b03e383..621a9dd 100644
> >> --- a/arch/arm64/kernel/traps.c
> >> +++ b/arch/arm64/kernel/traps.c
> >> @@ -495,7 +495,12 @@ void do_undefinstr(struct pt_regs *regs)
> >>  	if (call_undef_hook(regs) == 0)
> >>  		return;
> >>  
> >> -	BUG_ON(!user_mode(regs));
> >> +	if (!user_mode(regs)) {
> >> +		pr_emerg("Undef instruction in kernel, dump instr:");
> >> +		dump_kernel_instr(KERN_EMERG, regs);
> >> +		BUG();
> >> +	}
> >
> >Hmm, I'm not completely convinced about this as the instruction in the
> >i-cache could be completely different. I think the PC value (for addr2line)
> >is a lot more useful, and we should be printing that already.
> >
> >Maybe you can elaborate on a situation where this information was helpful?
> >
> >Thanks,
> >
> >Will
> 
> Undef instruction occurs in some cases
> 
> 1. CPU do not have the permission to execute the instruction or the current CPU
>  version does not support the instruction. For example, execute 
>  'mrs x0, tcr_el3' under el1.

This really shouldn't happen, but if it did, the PC would surely be enough
to debug the problem?

> 2. The instruction is a normal instruction, but it is changed during board 
> running in some abnormal situation. eg: DDR bit flip, the normal instruction 
> will become an undefined one. By printing the instruction, we can see the 
> accurate instruction code and compare it with the instruction code from objdump
> to determine that it is a DDR issue.

Is this really something we should be designing our exception handlers for?
If we're getting DDR bit flips for kernel .text, then it sounds like we need
ECC and/or RAS features to deal with them.

So I'm not really sold on this change.

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re:Re: [PATCH] arm64: traps: add dump instr before BUG in kernel
  2021-10-11 10:06       ` Will Deacon
@ 2021-10-13 13:08         ` Chen Lin
  -1 siblings, 0 replies; 10+ messages in thread
From: Chen Lin @ 2021-10-13 13:08 UTC (permalink / raw)
  To: will
  Cc: catalin.marinas, mark.rutland, joey.gouly, maz, linux-arm-kernel,
	linux-kernel, chen.lin5, Chen Lin

At 2021-10-11 17:06:50, "Will Deacon" <will@kernel.org> wrote:
>On Thu, Sep 30, 2021 at 10:41:30PM +0800, Chen Lin wrote:
>> At 2021-09-30 15:42:47, "Will Deacon" <will@kernel.org> wrote:
>> 
>> >On Wed, Sep 29, 2021 at 09:29:46PM +0800, Chen Lin wrote:
>> >> From: Chen Lin <chen.lin5@zte.com.cn>
>> >> 
>> >> we should dump the real instructions before BUG in kernel mode, and
>> >> compare this to the instructions from objdump.
>> >> 
>> >> Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
>> >> ---
>> >>  arch/arm64/kernel/traps.c |    7 ++++++-
>> >>  1 file changed, 6 insertions(+), 1 deletion(-)
>> >> 
>> >> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
>> >> index b03e383..621a9dd 100644
>> >> --- a/arch/arm64/kernel/traps.c
>> >> +++ b/arch/arm64/kernel/traps.c
>> >> @@ -495,7 +495,12 @@ void do_undefinstr(struct pt_regs *regs)
>> >>  	if (call_undef_hook(regs) == 0)
>> >>  		return;
>> >>  
>> >> -	BUG_ON(!user_mode(regs));
>> >> +	if (!user_mode(regs)) {
>> >> +		pr_emerg("Undef instruction in kernel, dump instr:");
>> >> +		dump_kernel_instr(KERN_EMERG, regs);
>> >> +		BUG();
>> >> +	}
>> >
>> >Hmm, I'm not completely convinced about this as the instruction in the
>> >i-cache could be completely different. I think the PC value (for addr2line)
>> >is a lot more useful, and we should be printing that already.
>> >
>> >Maybe you can elaborate on a situation where this information was helpful?
>> >
>> >Thanks,
>> >
>> >Will
>> 
>> Undef instruction occurs in some cases
>> 
>> 1. CPU do not have the permission to execute the instruction or the current CPU
>>  version does not support the instruction. For example, execute 
>>  'mrs x0, tcr_el3' under el1.
>
>This really shouldn't happen, but if it did, the PC would surely be enough
>to debug the problem?
>

yes, PC is enough in this situation.

>> 2. The instruction is a normal instruction, but it is changed during board 
>> running in some abnormal situation. eg: DDR bit flip, the normal instruction 
>> will become an undefined one. By printing the instruction, we can see the 
>> accurate instruction code and compare it with the instruction code from objdump
>> to determine that it is a DDR issue.
>
>Is this really something we should be designing our exception handlers for?
>If we're getting DDR bit flips for kernel .text, then it sounds like we need
>ECC and/or RAS features to deal with them.
>
>So I'm not really sold on this change.

1. About the DDR bit flip, YES, the instruction code information is really useless 
in the ideal state. The ideal state includes the following conditions like: the DDR 
controller do supports ECC, and also is configured correctly, such as ECC enabled 
and ECC fixing enabled. There may exist various old boards or abnormal DDR 
configurations in embedded systems.


2. DDR flip is just one example. Other examples include text segment being overwritten 
by DMA and other accidental writes, we want more info about what it writes. 
Another example, some instructions may change at runtime by ALTERNATIVE(oldinstr, newinstr, feature) 
or live patch, we want both the pc and the instruction code to double check what 
happened if illegal instruction exception happen at such points.


Personally, I think adding more information can make it easier to locate the above problems. 
Anyway, Thank you for your patience in reading and replying.
>
>Will


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re:Re: [PATCH] arm64: traps: add dump instr before BUG in kernel
@ 2021-10-13 13:08         ` Chen Lin
  0 siblings, 0 replies; 10+ messages in thread
From: Chen Lin @ 2021-10-13 13:08 UTC (permalink / raw)
  To: will
  Cc: catalin.marinas, mark.rutland, joey.gouly, maz, linux-arm-kernel,
	linux-kernel, chen.lin5, Chen Lin

At 2021-10-11 17:06:50, "Will Deacon" <will@kernel.org> wrote:
>On Thu, Sep 30, 2021 at 10:41:30PM +0800, Chen Lin wrote:
>> At 2021-09-30 15:42:47, "Will Deacon" <will@kernel.org> wrote:
>> 
>> >On Wed, Sep 29, 2021 at 09:29:46PM +0800, Chen Lin wrote:
>> >> From: Chen Lin <chen.lin5@zte.com.cn>
>> >> 
>> >> we should dump the real instructions before BUG in kernel mode, and
>> >> compare this to the instructions from objdump.
>> >> 
>> >> Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
>> >> ---
>> >>  arch/arm64/kernel/traps.c |    7 ++++++-
>> >>  1 file changed, 6 insertions(+), 1 deletion(-)
>> >> 
>> >> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
>> >> index b03e383..621a9dd 100644
>> >> --- a/arch/arm64/kernel/traps.c
>> >> +++ b/arch/arm64/kernel/traps.c
>> >> @@ -495,7 +495,12 @@ void do_undefinstr(struct pt_regs *regs)
>> >>  	if (call_undef_hook(regs) == 0)
>> >>  		return;
>> >>  
>> >> -	BUG_ON(!user_mode(regs));
>> >> +	if (!user_mode(regs)) {
>> >> +		pr_emerg("Undef instruction in kernel, dump instr:");
>> >> +		dump_kernel_instr(KERN_EMERG, regs);
>> >> +		BUG();
>> >> +	}
>> >
>> >Hmm, I'm not completely convinced about this as the instruction in the
>> >i-cache could be completely different. I think the PC value (for addr2line)
>> >is a lot more useful, and we should be printing that already.
>> >
>> >Maybe you can elaborate on a situation where this information was helpful?
>> >
>> >Thanks,
>> >
>> >Will
>> 
>> Undef instruction occurs in some cases
>> 
>> 1. CPU do not have the permission to execute the instruction or the current CPU
>>  version does not support the instruction. For example, execute 
>>  'mrs x0, tcr_el3' under el1.
>
>This really shouldn't happen, but if it did, the PC would surely be enough
>to debug the problem?
>

yes, PC is enough in this situation.

>> 2. The instruction is a normal instruction, but it is changed during board 
>> running in some abnormal situation. eg: DDR bit flip, the normal instruction 
>> will become an undefined one. By printing the instruction, we can see the 
>> accurate instruction code and compare it with the instruction code from objdump
>> to determine that it is a DDR issue.
>
>Is this really something we should be designing our exception handlers for?
>If we're getting DDR bit flips for kernel .text, then it sounds like we need
>ECC and/or RAS features to deal with them.
>
>So I'm not really sold on this change.

1. About the DDR bit flip, YES, the instruction code information is really useless 
in the ideal state. The ideal state includes the following conditions like: the DDR 
controller do supports ECC, and also is configured correctly, such as ECC enabled 
and ECC fixing enabled. There may exist various old boards or abnormal DDR 
configurations in embedded systems.


2. DDR flip is just one example. Other examples include text segment being overwritten 
by DMA and other accidental writes, we want more info about what it writes. 
Another example, some instructions may change at runtime by ALTERNATIVE(oldinstr, newinstr, feature) 
or live patch, we want both the pc and the instruction code to double check what 
happened if illegal instruction exception happen at such points.


Personally, I think adding more information can make it easier to locate the above problems. 
Anyway, Thank you for your patience in reading and replying.
>
>Will


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-10-13 13:31 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-29 13:29 [PATCH] arm64: traps: add dump instr before BUG in kernel Chen Lin
2021-09-29 13:29 ` Chen Lin
2021-09-30  8:42 ` Will Deacon
2021-09-30  8:42   ` Will Deacon
2021-09-30 14:41   ` Chen Lin
2021-09-30 14:41     ` Chen Lin
2021-10-11 10:06     ` Will Deacon
2021-10-11 10:06       ` Will Deacon
2021-10-13 13:08       ` Chen Lin
2021-10-13 13:08         ` Chen Lin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.