All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] ARM: kexec: Use the right ISA for relocate_new_kernel
@ 2013-11-08 12:24 Dave Martin
  2013-11-08 13:34 ` Will Deacon
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Martin @ 2013-11-08 12:24 UTC (permalink / raw)
  To: linux-arm-kernel

Copying a function with memcpy() and then trying to execute the
result isn't trivially portable to Thumb.

This patch modifies the kexec soft restart code to copy its
assembler trampoline relocate_new_kernel() using fncpy() instead,
so that relocate_new_kernel can be in the same ISA as the rest of
the kernel without problems.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
Changes since v1:

  * Move ENDPROC() after relocate_new_kernel's literals, to be
    consistent the location of relocate_new_kernel_end and with the way
    GCC sets ELF symbol sizes for functions.  This is just a tidyup,
    with no functional impact.

 arch/arm/kernel/machine_kexec.c   |   17 ++++++++++-------
 arch/arm/kernel/relocate_kernel.S |    8 ++++++--
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
index 57221e3..f0d180d 100644
--- a/arch/arm/kernel/machine_kexec.c
+++ b/arch/arm/kernel/machine_kexec.c
@@ -14,11 +14,12 @@
 #include <asm/pgalloc.h>
 #include <asm/mmu_context.h>
 #include <asm/cacheflush.h>
+#include <asm/fncpy.h>
 #include <asm/mach-types.h>
 #include <asm/smp_plat.h>
 #include <asm/system_misc.h>
 
-extern const unsigned char relocate_new_kernel[];
+extern void relocate_new_kernel(void);
 extern const unsigned int relocate_new_kernel_size;
 
 extern unsigned long kexec_start_address;
@@ -142,6 +143,8 @@ void machine_kexec(struct kimage *image)
 {
 	unsigned long page_list;
 	unsigned long reboot_code_buffer_phys;
+	unsigned long reboot_entry = (unsigned long)relocate_new_kernel;
+	unsigned long reboot_entry_phys;
 	void *reboot_code_buffer;
 
 	/*
@@ -168,16 +171,16 @@ void machine_kexec(struct kimage *image)
 
 
 	/* copy our kernel relocation code to the control code page */
-	memcpy(reboot_code_buffer,
-	       relocate_new_kernel, relocate_new_kernel_size);
+	reboot_entry = fncpy(reboot_code_buffer,
+			     reboot_entry,
+			     relocate_new_kernel_size);
+	reboot_entry_phys = (unsigned long)reboot_entry +
+		(reboot_code_buffer_phys - (unsigned long)reboot_code_buffer);
 
-
-	flush_icache_range((unsigned long) reboot_code_buffer,
-			   (unsigned long) reboot_code_buffer + KEXEC_CONTROL_PAGE_SIZE);
 	printk(KERN_INFO "Bye!\n");
 
 	if (kexec_reinit)
 		kexec_reinit();
 
-	soft_restart(reboot_code_buffer_phys);
+	soft_restart(reboot_entry_phys);
 }
diff --git a/arch/arm/kernel/relocate_kernel.S b/arch/arm/kernel/relocate_kernel.S
index d0cdedf..9585896 100644
--- a/arch/arm/kernel/relocate_kernel.S
+++ b/arch/arm/kernel/relocate_kernel.S
@@ -2,10 +2,12 @@
  * relocate_kernel.S - put the kernel image in place to boot
  */
 
+#include <linux/linkage.h>
 #include <asm/kexec.h>
 
-	.globl relocate_new_kernel
-relocate_new_kernel:
+	.align	3	/* not needed for this code, but keeps fncpy() happy */
+
+ENTRY(relocate_new_kernel)
 
 	ldr	r0,kexec_indirection_page
 	ldr	r1,kexec_start_address
@@ -79,6 +81,8 @@ kexec_mach_type:
 kexec_boot_atags:
 	.long	0x0
 
+ENDPROC(relocate_new_kernel)
+
 relocate_new_kernel_end:
 
 	.globl relocate_new_kernel_size
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2] ARM: kexec: Use the right ISA for relocate_new_kernel
  2013-11-08 12:24 [PATCH v2] ARM: kexec: Use the right ISA for relocate_new_kernel Dave Martin
@ 2013-11-08 13:34 ` Will Deacon
  2013-11-08 18:46   ` Dave Martin
  0 siblings, 1 reply; 8+ messages in thread
From: Will Deacon @ 2013-11-08 13:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Nov 08, 2013 at 12:24:04PM +0000, Dave Martin wrote:
> Copying a function with memcpy() and then trying to execute the
> result isn't trivially portable to Thumb.
> 
> This patch modifies the kexec soft restart code to copy its
> assembler trampoline relocate_new_kernel() using fncpy() instead,
> so that relocate_new_kernel can be in the same ISA as the rest of
> the kernel without problems.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
> Changes since v1:
> 
>   * Move ENDPROC() after relocate_new_kernel's literals, to be
>     consistent the location of relocate_new_kernel_end and with the way
>     GCC sets ELF symbol sizes for functions.  This is just a tidyup,
>     with no functional impact.

Cheers Dave:

  Acked-by: Will Deacon <will.deacon@arm.com>

Will

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2] ARM: kexec: Use the right ISA for relocate_new_kernel
  2013-11-08 13:34 ` Will Deacon
@ 2013-11-08 18:46   ` Dave Martin
  2013-11-12 19:29     ` Taras Kondratiuk
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Martin @ 2013-11-08 18:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Nov 08, 2013 at 01:34:27PM +0000, Will Deacon wrote:
> On Fri, Nov 08, 2013 at 12:24:04PM +0000, Dave Martin wrote:
> > Copying a function with memcpy() and then trying to execute the
> > result isn't trivially portable to Thumb.
> > 
> > This patch modifies the kexec soft restart code to copy its
> > assembler trampoline relocate_new_kernel() using fncpy() instead,
> > so that relocate_new_kernel can be in the same ISA as the rest of
> > the kernel without problems.
> > 
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> > Changes since v1:
> > 
> >   * Move ENDPROC() after relocate_new_kernel's literals, to be
> >     consistent the location of relocate_new_kernel_end and with the way
> >     GCC sets ELF symbol sizes for functions.  This is just a tidyup,
> >     with no functional impact.
> 
> Cheers Dave:
> 
>   Acked-by: Will Deacon <will.deacon@arm.com>

Thanks


Are you still in a position to test this, Taras?

Cheers
---Dave

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2] ARM: kexec: Use the right ISA for relocate_new_kernel
  2013-11-08 18:46   ` Dave Martin
@ 2013-11-12 19:29     ` Taras Kondratiuk
  2013-11-15 11:28       ` Taras Kondratiuk
  0 siblings, 1 reply; 8+ messages in thread
From: Taras Kondratiuk @ 2013-11-12 19:29 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Dave

Yes. I've tested it on Pandaboard and results are quite weird.
ARM->ARM, Thumb->Thumb and Thumb->ARM kernel transition works fine
for both kexec and kdump ways. But ARM->Thumb works for only kdump
via kernel panic. In case of "kexec -e" the second Thumb kernel
doesn't come up.

I don't have JTAG now. I will check this tomorrow morning.

On 8 November 2013 20:46, Dave Martin <Dave.Martin@arm.com> wrote:
> On Fri, Nov 08, 2013 at 01:34:27PM +0000, Will Deacon wrote:
>> On Fri, Nov 08, 2013 at 12:24:04PM +0000, Dave Martin wrote:
>> > Copying a function with memcpy() and then trying to execute the
>> > result isn't trivially portable to Thumb.
>> >
>> > This patch modifies the kexec soft restart code to copy its
>> > assembler trampoline relocate_new_kernel() using fncpy() instead,
>> > so that relocate_new_kernel can be in the same ISA as the rest of
>> > the kernel without problems.
>> >
>> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
>> > ---
>> > Changes since v1:
>> >
>> >   * Move ENDPROC() after relocate_new_kernel's literals, to be
>> >     consistent the location of relocate_new_kernel_end and with the way
>> >     GCC sets ELF symbol sizes for functions.  This is just a tidyup,
>> >     with no functional impact.
>>
>> Cheers Dave:
>>
>>   Acked-by: Will Deacon <will.deacon@arm.com>
>
> Thanks
>
>
> Are you still in a position to test this, Taras?
>
> Cheers
> ---Dave



-- 
Regards,
Taras Kondratiuk

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2] ARM: kexec: Use the right ISA for relocate_new_kernel
  2013-11-12 19:29     ` Taras Kondratiuk
@ 2013-11-15 11:28       ` Taras Kondratiuk
  2013-11-15 17:38         ` Dave Martin
  0 siblings, 1 reply; 8+ messages in thread
From: Taras Kondratiuk @ 2013-11-15 11:28 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/12/2013 09:29 PM, Taras Kondratiuk wrote:
> Hi Dave
> 
> Yes. I've tested it on Pandaboard and results are quite weird.
> ARM->ARM, Thumb->Thumb and Thumb->ARM kernel transition works fine
> for both kexec and kdump ways. But ARM->Thumb works for only kdump
> via kernel panic. In case of "kexec -e" the second Thumb kernel
> doesn't come up.
> 
> I don't have JTAG now. I will check this tomorrow morning.

Hi Dave, Will

The issue I observed is not caused by this patch.
I was able to reproduce it with my initial simple patch.

So for this one:
Reported-and-Tested-by: Taras Kondratiuk <taras.kondratiuk@linaro.org>

And the issue I'm frequently facing in reloaded kernel (Thumb from ARM)
is random crashes caused by undefined instructions.

My observation summary:
- Before starting a second kernel I'm dumping loaded zImage and then
  unpacked Image at final location and they are correct, so no issue
  with loading.
- I observe two types of crash:
  1) Undefined instruction in the middle of kernel code. After a crash
     I check failing address and there is always a *valid* Thumb
     instruction (CPU is in Thumb mode).
  2) Jump to a wrong address which consequently causes undefined
     instruction exception. A trace of one example of a wrong jump is
     captured in [1]. Instead of jumping to 0xC049097C code gets
     executed at 0xED85E008. BTW the wrong address suspiciously looks
     like an ARM instruction.
- If second kernel is placed at different address (like in kdump case),
  then it boots fine and I don't observe any crashes.
- If I check failing address in the first kernel (ARM) the code there
  is really undefined instruction if executed as Thumb.
- Looks like pieces of old ARM kernel gets executed instead of new
  Thumb kernel. But as I've mentioned I'm reading physical memory via
  JTAG before starting second kernel and memory is matching a compiled
  Thumb 'Image'. Icache also gets cleaned...
- Once when stopped on breakpoint I've seen a piece of ARM code in
  Thumb kernel. Interesting that I was looking at the same memory
  location via physical and virtual addresses simultaneously and only
  virtual address showed an old code. After a few memory browsing
  operations, data at both addresses got synced to correct Thumb code.
  Sure it could be a debugger lag, but it fits nicely with other
  observations.

Do you have some ideas what could cause such behavior?

Unfortunately I don't have more time now to debug it further,
but I will try to return to this later.

[1]
https://drive.google.com/file/d/0ByfnRzd5ZYtdQWJKc1k0VmxrZlE/edit?usp=sharing

-- 
Taras Kondratiuk

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2] ARM: kexec: Use the right ISA for relocate_new_kernel
  2013-11-15 11:28       ` Taras Kondratiuk
@ 2013-11-15 17:38         ` Dave Martin
  2013-11-15 18:11           ` Taras Kondratiuk
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Martin @ 2013-11-15 17:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Nov 15, 2013 at 01:28:21PM +0200, Taras Kondratiuk wrote:
> On 11/12/2013 09:29 PM, Taras Kondratiuk wrote:
> > Hi Dave
> > 
> > Yes. I've tested it on Pandaboard and results are quite weird.
> > ARM->ARM, Thumb->Thumb and Thumb->ARM kernel transition works fine
> > for both kexec and kdump ways. But ARM->Thumb works for only kdump
> > via kernel panic. In case of "kexec -e" the second Thumb kernel
> > doesn't come up.
> > 
> > I don't have JTAG now. I will check this tomorrow morning.
> 
> Hi Dave, Will
> 
> The issue I observed is not caused by this patch.
> I was able to reproduce it with my initial simple patch.
> 
> So for this one:
> Reported-and-Tested-by: Taras Kondratiuk <taras.kondratiuk@linaro.org>

Thanks for that.

> 
> And the issue I'm frequently facing in reloaded kernel (Thumb from ARM)
> is random crashes caused by undefined instructions.
> 
> My observation summary:
> - Before starting a second kernel I'm dumping loaded zImage and then
>   unpacked Image at final location and they are correct, so no issue
>   with loading.
> - I observe two types of crash:
>   1) Undefined instruction in the middle of kernel code. After a crash
>      I check failing address and there is always a *valid* Thumb
>      instruction (CPU is in Thumb mode).
>   2) Jump to a wrong address which consequently causes undefined
>      instruction exception. A trace of one example of a wrong jump is
>      captured in [1]. Instead of jumping to 0xC049097C code gets
>      executed at 0xED85E008. BTW the wrong address suspiciously looks
>      like an ARM instruction.

That jump to 0xED85E008 certainly looks strange ... I wonder whether
there could be some instructions missing from the trace.


How early do these crashes happen?

Is this happening on SMP, and if so, what is the state of secondary
CPUs across kexec?

If secondary CPUs are not safely parked, or their caches are not drained
before the kexec occurs, this can cause corruption of the new kernel
or unpredictable behaviour of the secondary CPUs.

> - If second kernel is placed at different address (like in kdump case),
>   then it boots fine and I don't observe any crashes.
> - If I check failing address in the first kernel (ARM) the code there
>   is really undefined instruction if executed as Thumb.
> - Looks like pieces of old ARM kernel gets executed instead of new
>   Thumb kernel. But as I've mentioned I'm reading physical memory via
>   JTAG before starting second kernel and memory is matching a compiled
>   Thumb 'Image'. Icache also gets cleaned...
> - Once when stopped on breakpoint I've seen a piece of ARM code in
>   Thumb kernel. Interesting that I was looking at the same memory

Thumb kernels do contain a small amount of ARM code, in the vectors
page for example.  But it's possible you were also looking at stale
data.

>   location via physical and virtual addresses simultaneously and only
>   virtual address showed an old code. After a few memory browsing

It's possible that those views could be inconsistent either due to
the behaviour of the debugger, or because inconsistent memory types
are used to construct the two views.

>   operations, data at both addresses got synced to correct Thumb code.
>   Sure it could be a debugger lag, but it fits nicely with other
>   observations.
> 
> Do you have some ideas what could cause such behavior?

Not really, apart from the above ideas.

> 
> Unfortunately I don't have more time now to debug it further,
> but I will try to return to this later.

OK ... let me know if you see this again or get any more clues.

Cheers
---Dave

> 
> [1]
> https://drive.google.com/file/d/0ByfnRzd5ZYtdQWJKc1k0VmxrZlE/edit?usp=sharing
> 
> -- 
> Taras Kondratiuk
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2] ARM: kexec: Use the right ISA for relocate_new_kernel
  2013-11-15 17:38         ` Dave Martin
@ 2013-11-15 18:11           ` Taras Kondratiuk
  2013-11-15 18:33             ` Dave P Martin
  0 siblings, 1 reply; 8+ messages in thread
From: Taras Kondratiuk @ 2013-11-15 18:11 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/15/2013 07:38 PM, Dave Martin wrote:
> On Fri, Nov 15, 2013 at 01:28:21PM +0200, Taras Kondratiuk wrote:
>> And the issue I'm frequently facing in reloaded kernel (Thumb from ARM)
>> is random crashes caused by undefined instructions.
>>
>> My observation summary:
>> - Before starting a second kernel I'm dumping loaded zImage and then
>>   unpacked Image at final location and they are correct, so no issue
>>   with loading.
>> - I observe two types of crash:
>>   1) Undefined instruction in the middle of kernel code. After a crash
>>      I check failing address and there is always a *valid* Thumb
>>      instruction (CPU is in Thumb mode).
>>   2) Jump to a wrong address which consequently causes undefined
>>      instruction exception. A trace of one example of a wrong jump is
>>      captured in [1]. Instead of jumping to 0xC049097C code gets
>>      executed at 0xED85E008. BTW the wrong address suspiciously looks
>>      like an ARM instruction.
> 
> That jump to 0xED85E008 certainly looks strange ... I wonder whether
> there could be some instructions missing from the trace.
> 
> 
> How early do these crashes happen?

At very early stages starting from setup_arch() up to early initcalls.

> Is this happening on SMP, and if so, what is the state of secondary
> CPUs across kexec?

I have disabled CONFIG_SMP. Second CPU is busy-looping in ROM code and
shouldn't cause any issues.

> If secondary CPUs are not safely parked, or their caches are not drained
> before the kexec occurs, this can cause corruption of the new kernel
> or unpredictable behaviour of the secondary CPUs.
> 
>> - If second kernel is placed at different address (like in kdump case),
>>   then it boots fine and I don't observe any crashes.
>> - If I check failing address in the first kernel (ARM) the code there
>>   is really undefined instruction if executed as Thumb.
>> - Looks like pieces of old ARM kernel gets executed instead of new
>>   Thumb kernel. But as I've mentioned I'm reading physical memory via
>>   JTAG before starting second kernel and memory is matching a compiled
>>   Thumb 'Image'. Icache also gets cleaned...
>> - Once when stopped on breakpoint I've seen a piece of ARM code in
>>   Thumb kernel. Interesting that I was looking at the same memory
> 
> Thumb kernels do contain a small amount of ARM code, in the vectors
> page for example.  But it's possible you were also looking at stale
> data.

Right, but I mean there was an ARM code in place where definitely a
Thumb code should be.

> 
>>   location via physical and virtual addresses simultaneously and only
>>   virtual address showed an old code. After a few memory browsing
> 
> It's possible that those views could be inconsistent either due to
> the behaviour of the debugger, or because inconsistent memory types
> are used to construct the two views.
> 
>>   operations, data at both addresses got synced to correct Thumb code.
>>   Sure it could be a debugger lag, but it fits nicely with other
>>   observations.
>>
>> Do you have some ideas what could cause such behavior?
> 
> Not really, apart from the above ideas.
> 
>>
>> Unfortunately I don't have more time now to debug it further,
>> but I will try to return to this later.
> 
> OK ... let me know if you see this again or get any more clues.
> 
> Cheers
> ---Dave
> 
>>
>> [1]
>> https://drive.google.com/file/d/0ByfnRzd5ZYtdQWJKc1k0VmxrZlE/edit?usp=sharing
>>
>> -- 
>> Taras Kondratiuk

-- 
Taras Kondratiuk

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2] ARM: kexec: Use the right ISA for relocate_new_kernel
  2013-11-15 18:11           ` Taras Kondratiuk
@ 2013-11-15 18:33             ` Dave P Martin
  0 siblings, 0 replies; 8+ messages in thread
From: Dave P Martin @ 2013-11-15 18:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Nov 15, 2013 at 06:11:43PM +0000, Taras Kondratiuk wrote:
> On 11/15/2013 07:38 PM, Dave Martin wrote:
> > On Fri, Nov 15, 2013 at 01:28:21PM +0200, Taras Kondratiuk wrote:
> >> And the issue I'm frequently facing in reloaded kernel (Thumb from ARM)
> >> is random crashes caused by undefined instructions.
> >>
> >> My observation summary:
> >> - Before starting a second kernel I'm dumping loaded zImage and then
> >>   unpacked Image at final location and they are correct, so no issue
> >>   with loading.
> >> - I observe two types of crash:
> >>   1) Undefined instruction in the middle of kernel code. After a crash
> >>      I check failing address and there is always a *valid* Thumb
> >>      instruction (CPU is in Thumb mode).
> >>   2) Jump to a wrong address which consequently causes undefined
> >>      instruction exception. A trace of one example of a wrong jump is
> >>      captured in [1]. Instead of jumping to 0xC049097C code gets
> >>      executed at 0xED85E008. BTW the wrong address suspiciously looks
> >>      like an ARM instruction.
> > 
> > That jump to 0xED85E008 certainly looks strange ... I wonder whether
> > there could be some instructions missing from the trace.
> > 
> > 
> > How early do these crashes happen?
> 
> At very early stages starting from setup_arch() up to early initcalls.
> 
> > Is this happening on SMP, and if so, what is the state of secondary
> > CPUs across kexec?
> 
> I have disabled CONFIG_SMP. Second CPU is busy-looping in ROM code and
> shouldn't cause any issues.

OK, that sounds reasonable.

> > If secondary CPUs are not safely parked, or their caches are not drained
> > before the kexec occurs, this can cause corruption of the new kernel
> > or unpredictable behaviour of the secondary CPUs.
> > 
> >> - If second kernel is placed at different address (like in kdump case),
> >>   then it boots fine and I don't observe any crashes.
> >> - If I check failing address in the first kernel (ARM) the code there
> >>   is really undefined instruction if executed as Thumb.
> >> - Looks like pieces of old ARM kernel gets executed instead of new
> >>   Thumb kernel. But as I've mentioned I'm reading physical memory via
> >>   JTAG before starting second kernel and memory is matching a compiled
> >>   Thumb 'Image'. Icache also gets cleaned...
> >> - Once when stopped on breakpoint I've seen a piece of ARM code in
> >>   Thumb kernel. Interesting that I was looking at the same memory
> > 
> > Thumb kernels do contain a small amount of ARM code, in the vectors
> > page for example.  But it's possible you were also looking at stale
> > data.
> 
> Right, but I mean there was an ARM code in place where definitely a
> Thumb code should be.

Sure.  Well, I guess this remains unexplained for now, but keep me
posted.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2013-11-15 18:33 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-08 12:24 [PATCH v2] ARM: kexec: Use the right ISA for relocate_new_kernel Dave Martin
2013-11-08 13:34 ` Will Deacon
2013-11-08 18:46   ` Dave Martin
2013-11-12 19:29     ` Taras Kondratiuk
2013-11-15 11:28       ` Taras Kondratiuk
2013-11-15 17:38         ` Dave Martin
2013-11-15 18:11           ` Taras Kondratiuk
2013-11-15 18:33             ` Dave P Martin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.