* [PATCH] arm64: Minor refactoring of cpu_switch_to() to fix build breakage
@ 2015-07-20 2:29 Olof Johansson
2015-07-20 7:36 ` Ingo Molnar
From: Olof Johansson @ 2015-07-20 2:29 UTC
To: will.deacon, catalin.marinas
Cc: linux-kernel, linux-arm-kernel, Ingo Molnar, Olof Johansson, Dave Hansen
Commit 0c8c0f03e3a2 ("x86/fpu, sched: Dynamically allocate 'struct fpu'")
moved the thread_struct to the bottom of task_struct. As a result, the
offset is now too large to be used in an immediate add on arm64 with
some kernel configs:
arch/arm64/kernel/entry.S: Assembler messages:
arch/arm64/kernel/entry.S:588: Error: immediate out of range
arch/arm64/kernel/entry.S:597: Error: immediate out of range
There's really no reason for cpu_switch_to to take a task_struct pointer
in the first place, since all it does is access the thread.cpu_context
member. So, just pass that in directly.
Fixes: 0c8c0f03e3a2 ("x86/fpu, sched: Dynamically allocate 'struct fpu'")
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
---
arch/arm64/include/asm/processor.h | 4 ++--
arch/arm64/kernel/asm-offsets.c | 2 --
arch/arm64/kernel/entry.S | 34 ++++++++++++++++------------------
arch/arm64/kernel/process.c | 3 ++-
4 files changed, 20 insertions(+), 23 deletions(-)
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index e4c893e..ba90764 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -152,8 +152,8 @@ static inline void cpu_relax(void)
#define cpu_relax_lowlatency() cpu_relax()
/* Thread switching */
-extern struct task_struct *cpu_switch_to(struct task_struct *prev,
- struct task_struct *next);
+extern struct task_struct *cpu_switch_to(struct cpu_context *prev,
+ struct cpu_context *next);
#define task_pt_regs(p) \
((struct pt_regs *)(THREAD_START_SP + task_stack_page(p)) - 1)
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index c99701a..c9e13f6 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -39,8 +39,6 @@ int main(void)
DEFINE(TI_TASK, offsetof(struct thread_info, task));
DEFINE(TI_CPU, offsetof(struct thread_info, cpu));
BLANK();
- DEFINE(THREAD_CPU_CONTEXT, offsetof(struct task_struct, thread.cpu_context));
- BLANK();
DEFINE(S_X0, offsetof(struct pt_regs, regs[0]));
DEFINE(S_X1, offsetof(struct pt_regs, regs[1]));
DEFINE(S_X2, offsetof(struct pt_regs, regs[2]));
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index f860bfd..2216326 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -579,29 +579,27 @@ ENDPROC(el0_irq)
/*
* Register switch for AArch64. The callee-saved registers need to be saved
* and restored. On entry:
- * x0 = previous task_struct (must be preserved across the switch)
- * x1 = next task_struct
+ * x0 = previous cpu_context (must be preserved across the switch)
+ * x1 = next cpu_context
* Previous and next are guaranteed not to be the same.
*
*/
ENTRY(cpu_switch_to)
- add x8, x0, #THREAD_CPU_CONTEXT
mov x9, sp
- stp x19, x20, [x8], #16 // store callee-saved registers
- stp x21, x22, [x8], #16
- stp x23, x24, [x8], #16
- stp x25, x26, [x8], #16
- stp x27, x28, [x8], #16
- stp x29, x9, [x8], #16
- str lr, [x8]
- add x8, x1, #THREAD_CPU_CONTEXT
- ldp x19, x20, [x8], #16 // restore callee-saved registers
- ldp x21, x22, [x8], #16
- ldp x23, x24, [x8], #16
- ldp x25, x26, [x8], #16
- ldp x27, x28, [x8], #16
- ldp x29, x9, [x8], #16
- ldr lr, [x8]
+ stp x19, x20, [x0], #16 // store callee-saved registers
+ stp x21, x22, [x0], #16
+ stp x23, x24, [x0], #16
+ stp x25, x26, [x0], #16
+ stp x27, x28, [x0], #16
+ stp x29, x9, [x0], #16
+ str lr, [x0]
+ ldp x19, x20, [x1], #16 // restore callee-saved registers
+ ldp x21, x22, [x1], #16
+ ldp x23, x24, [x1], #16
+ ldp x25, x26, [x1], #16
+ ldp x27, x28, [x1], #16
+ ldp x29, x9, [x1], #16
+ ldr lr, [x1]
mov sp, x9
ret
ENDPROC(cpu_switch_to)
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 223b093..6b9a09c 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -325,7 +325,8 @@ struct task_struct *__switch_to(struct task_struct *prev,
dsb(ish);
/* the actual thread switch */
- last = cpu_switch_to(prev, next);
+ last = cpu_switch_to(&prev->thread.cpu_context,
+ &next->thread.cpu_context);
return last;
}
--
1.7.10.4
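A note on the assembler error quoted in the commit message: the AArch64 ADD (immediate) encoding carries a 12-bit unsigned value, optionally shifted left by 12 bits, so an arbitrary struct offset above 0xfff generally cannot be encoded. The following is only a minimal sketch of that range check; the function name and the sample values are hypothetical and are not taken from the kernel or from the actual arm64 offset, which the thread does not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if v can be used as the immediate of a
 * single AArch64 "add Xd, Xn, #imm" instruction, i.e. it is a 12-bit
 * unsigned value, either unshifted or shifted left by 12.
 */
static bool fits_add_imm(uint64_t v)
{
	if (v <= 0xfff)
		return true;			/* imm12, LSL #0 */
	return (v & 0xfff) == 0 && (v >> 12) <= 0xfff;	/* imm12, LSL #12 */
}

/* Illustrative values only; none of these is the real THREAD_CPU_CONTEXT. */
static void fits_add_imm_examples(void)
{
	assert(fits_add_imm(0x3c0));	/* small offset: encodable */
	assert(fits_add_imm(0x1000));	/* exact multiple of 4096: encodable */
	assert(!fits_add_imm(0x1008));	/* needs bits in both halves: rejected */
}
```

Under this reading, once the offset of thread.cpu_context grows past 0xfff and is not a clean multiple of 4096, a single immediate add can no longer express it.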
* Re: [PATCH] arm64: Minor refactoring of cpu_switch_to() to fix build breakage
2015-07-20 2:29 [PATCH] arm64: Minor refactoring of cpu_switch_to() to fix build breakage Olof Johansson
@ 2015-07-20 7:36 ` Ingo Molnar
2015-07-20 10:53 ` Will Deacon
From: Ingo Molnar @ 2015-07-20 7:36 UTC
To: Olof Johansson
Cc: will.deacon, catalin.marinas, linux-kernel, linux-arm-kernel,
Dave Hansen
* Olof Johansson <olof@lixom.net> wrote:
> Commit 0c8c0f03e3a2 ("x86/fpu, sched: Dynamically allocate 'struct fpu'")
> moved the thread_struct to the bottom of task_struct. As a result, the
> offset is now too large to be used in an immediate add on arm64 with
> some kernel configs:
>
> arch/arm64/kernel/entry.S: Assembler messages:
> arch/arm64/kernel/entry.S:588: Error: immediate out of range
> arch/arm64/kernel/entry.S:597: Error: immediate out of range
>
> There's really no reason for cpu_switch_to to take a task_struct pointer
> in the first place, since all it does is access the thread.cpu_context
> member. So, just pass that in directly.
>
> Fixes: 0c8c0f03e3a2 ("x86/fpu, sched: Dynamically allocate 'struct fpu'")
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Signed-off-by: Olof Johansson <olof@lixom.net>
> ---
> arch/arm64/include/asm/processor.h | 4 ++--
> arch/arm64/kernel/asm-offsets.c | 2 --
> arch/arm64/kernel/entry.S | 34 ++++++++++++++++------------------
> arch/arm64/kernel/process.c | 3 ++-
> 4 files changed, 20 insertions(+), 23 deletions(-)
So why not pass in 'thread_struct' as the patch below does - it looks much simpler
to me. This way the assembly doesn't have to be changed at all.
Thanks,
Ingo
=====================================>
* Guenter <linux@roeck-us.net> wrote:
> On Sat, Jul 18, 2015 at 04:27:17PM -0700, Guenter wrote:
> > Hi,
> >
> > Commit 0c8c0f03e3a2 ("x86/fpu, sched: Dynamically allocate 'struct fpu'")
> > causes s390 builds in mainline to fail as follows.
> >
> > arch/s390/kernel/traps.c: Assembler messages:
> > arch/s390/kernel/traps.c:262: Error: operand out of range
> > (0x00000000000023e8 is not between 0x0000000000000000 and 0x0000000000000fff)
> > arch/s390/kernel/traps.c:300: Error: operand out of range
> > (0x00000000000023e8 is not between 0x0000000000000000 and 0x0000000000000fff)
> >
>
> Also:
>
> arm64:allmodconfig:
>
> arch/arm64/kernel/entry.S: Assembler messages:
> arch/arm64/kernel/entry.S:588: Error: immediate out of range
> arch/arm64/kernel/entry.S:597: Error: immediate out of range
> make[1]: *** [arch/arm64/kernel/entry.o] Error 1
>
> I didn't bisect that one, but it looks like the cause is the same.
Hm, it looks like the new, increased offset of 'thread_struct' within
'task_struct' goes over a limit that these instructions are able to support on
arm64:
arch/arm64/kernel/asm-offsets.c: DEFINE(THREAD_CPU_CONTEXT, offsetof(struct task_struct, thread.cpu_context));
arch/arm64/kernel/entry.S: add x8, x0, #THREAD_CPU_CONTEXT
arch/arm64/kernel/entry.S: add x8, x1, #THREAD_CPU_CONTEXT
If there's no instruction that can support such offset sizes then I suspect the
straightforward fix would be to pass in thread_struct instead - like the patch
below. That's a tiny bit cleaner for type encapsulation anyway.
Warning: it's not even build tested, but in case it works:
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Thanks,
Ingo
================
arch/arm64/include/asm/processor.h | 4 ++--
arch/arm64/kernel/asm-offsets.c | 2 +-
arch/arm64/kernel/process.c | 2 +-
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index e4c893e54f01..890f84bb3b8c 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -152,8 +152,8 @@ static inline void cpu_relax(void)
#define cpu_relax_lowlatency() cpu_relax()
/* Thread switching */
-extern struct task_struct *cpu_switch_to(struct task_struct *prev,
- struct task_struct *next);
+extern struct task_struct *cpu_switch_to(struct thread_struct *prev,
+ struct thread_struct *next);
#define task_pt_regs(p) \
((struct pt_regs *)(THREAD_START_SP + task_stack_page(p)) - 1)
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index c99701a34d7b..3785373c2369 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -39,7 +39,7 @@ int main(void)
DEFINE(TI_TASK, offsetof(struct thread_info, task));
DEFINE(TI_CPU, offsetof(struct thread_info, cpu));
BLANK();
- DEFINE(THREAD_CPU_CONTEXT, offsetof(struct task_struct, thread.cpu_context));
+ DEFINE(THREAD_CPU_CONTEXT, offsetof(struct thread_struct, cpu_context));
BLANK();
DEFINE(S_X0, offsetof(struct pt_regs, regs[0]));
DEFINE(S_X1, offsetof(struct pt_regs, regs[1]));
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 223b093c9440..436e95bda1b2 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -325,7 +325,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
dsb(ish);
/* the actual thread switch */
- last = cpu_switch_to(prev, next);
+ last = cpu_switch_to(&prev.thread, &next.thread);
return last;
}
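The idea behind this variant is that an offset measured from the start of thread_struct stays small no matter where thread_struct sits inside task_struct. The sketch below illustrates this with stand-in structures; the layouts, the padding size and the helper names are assumptions made for the example and do not reproduce the kernel definitions. It also shows the call-site form a task_struct pointer would need, since prev and next are pointers in __switch_to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, for illustration only. */
struct cpu_context {
	unsigned long callee_saved[10];	/* x19..x28 */
	unsigned long fp, sp, pc;
};

struct thread_struct {
	struct cpu_context cpu_context;
	unsigned long tp_value;
};

struct task_struct {
	char earlier_members[0x23e8];	/* arbitrary padding, not the real size */
	struct thread_struct thread;	/* placed last, as after 0c8c0f03e3a2 */
};

/* Offset as the original THREAD_CPU_CONTEXT definition computes it. */
static size_t ctx_offset_in_task(void)
{
	return offsetof(struct task_struct, thread.cpu_context);
}

/* Offset as the proposed THREAD_CPU_CONTEXT definition computes it. */
static size_t ctx_offset_in_thread(void)
{
	return offsetof(struct thread_struct, cpu_context);
}

/* With a task_struct pointer, the member is reached with "->". */
static struct thread_struct *task_thread(struct task_struct *tsk)
{
	return &tsk->thread;
}
```

In this mock layout the large part of the offset is absorbed by taking &tsk->thread in C, leaving the assembly with only the small in-struct offset to add.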
* Re: [PATCH] arm64: Minor refactoring of cpu_switch_to() to fix build breakage
2015-07-20 7:36 ` Ingo Molnar
@ 2015-07-20 10:53 ` Will Deacon
2015-07-20 14:20 ` Guenter Roeck
From: Will Deacon @ 2015-07-20 10:53 UTC
To: Ingo Molnar
Cc: Olof Johansson, Catalin Marinas, linux-kernel, linux-arm-kernel,
Dave Hansen
On Mon, Jul 20, 2015 at 08:36:47AM +0100, Ingo Molnar wrote:
> * Olof Johansson <olof@lixom.net> wrote:
>
> > Commit 0c8c0f03e3a2 ("x86/fpu, sched: Dynamically allocate 'struct fpu'")
> > moved the thread_struct to the bottom of task_struct. As a result, the
> > offset is now too large to be used in an immediate add on arm64 with
> > some kernel configs:
> >
> > arch/arm64/kernel/entry.S: Assembler messages:
> > arch/arm64/kernel/entry.S:588: Error: immediate out of range
> > arch/arm64/kernel/entry.S:597: Error: immediate out of range
> >
> > There's really no reason for cpu_switch_to to take a task_struct pointer
> > in the first place, since all it does is access the thread.cpu_context
> > member. So, just pass that in directly.
> >
> > Fixes: 0c8c0f03e3a2 ("x86/fpu, sched: Dynamically allocate 'struct fpu'")
> > Cc: Dave Hansen <dave.hansen@linux.intel.com>
> > Signed-off-by: Olof Johansson <olof@lixom.net>
> > ---
> > arch/arm64/include/asm/processor.h | 4 ++--
> > arch/arm64/kernel/asm-offsets.c | 2 --
> > arch/arm64/kernel/entry.S | 34 ++++++++++++++++------------------
> > arch/arm64/kernel/process.c | 3 ++-
> > 4 files changed, 20 insertions(+), 23 deletions(-)
>
> So why not pass in 'thread_struct' as the patch below does - it looks much
> simpler to me. This way the assembly doesn't have to be changed at all.
Unfortunately, neither of these approaches really works:
- We need to return last from __switch_to, which means not corrupting
x0 in cpu_switch_to and then having an ugly container_of to get back
at the task_struct
- ret_from_fork needs to pass the task_struct of prev to schedule_tail,
so we have the same issue there
Patch below fixes things, but it's a shame we have to use an extra register
like this.
Will
--->8
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index f860bfda454a..e16351819fed 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -585,7 +585,8 @@ ENDPROC(el0_irq)
*
*/
ENTRY(cpu_switch_to)
- add x8, x0, #THREAD_CPU_CONTEXT
+ mov x10, #THREAD_CPU_CONTEXT
+ add x8, x0, x10
mov x9, sp
stp x19, x20, [x8], #16 // store callee-saved registers
stp x21, x22, [x8], #16
@@ -594,7 +595,7 @@ ENTRY(cpu_switch_to)
stp x27, x28, [x8], #16
stp x29, x9, [x8], #16
str lr, [x8]
- add x8, x1, #THREAD_CPU_CONTEXT
+ add x8, x1, x10
ldp x19, x20, [x8], #16 // restore callee-saved registers
ldp x21, x22, [x8], #16
ldp x23, x24, [x8], #16
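For readers unfamiliar with the container_of step mentioned above: if cpu_switch_to were handed a pointer to an embedded member, recovering the enclosing task_struct would mean subtracting the member's offset again. The sketch below shows that pattern with stand-in structures; the macro is a simplified version without the kernel's type checking, and the struct layouts and the helper name are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of; the kernel's version adds type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for the kernel structures. */
struct cpu_context {
	unsigned long callee_saved[10];	/* x19..x28 */
	unsigned long fp, sp, pc;
};

struct thread_struct {
	struct cpu_context cpu_context;
};

struct task_struct {
	int pid;
	struct thread_struct thread;
};

/* Map a pointer to the embedded cpu_context back to its task_struct. */
static struct task_struct *task_of_context(struct cpu_context *ctx)
{
	return container_of(ctx, struct task_struct, thread.cpu_context);
}
```

This only works if the pointer still refers to the start of the member; a pointer that has been advanced by post-indexed stores, as x0 is in the first patch, would no longer map back to the right task_struct.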
* Re: [PATCH] arm64: Minor refactoring of cpu_switch_to() to fix build breakage
2015-07-20 10:53 ` Will Deacon
@ 2015-07-20 14:20 ` Guenter Roeck
2015-07-20 16:33 ` Olof Johansson
From: Guenter Roeck @ 2015-07-20 14:20 UTC
To: Will Deacon
Cc: Ingo Molnar, Olof Johansson, Catalin Marinas, linux-kernel,
linux-arm-kernel
On Mon, Jul 20, 2015 at 11:53:45AM +0100, Will Deacon wrote:
> On Mon, Jul 20, 2015 at 08:36:47AM +0100, Ingo Molnar wrote:
> > * Olof Johansson <olof@lixom.net> wrote:
> >
> > > Commit 0c8c0f03e3a2 ("x86/fpu, sched: Dynamically allocate 'struct fpu'")
> > > moved the thread_struct to the bottom of task_struct. As a result, the
> > > offset is now too large to be used in an immediate add on arm64 with
> > > some kernel configs:
> > >
> > > arch/arm64/kernel/entry.S: Assembler messages:
> > > arch/arm64/kernel/entry.S:588: Error: immediate out of range
> > > arch/arm64/kernel/entry.S:597: Error: immediate out of range
> > >
> > > There's really no reason for cpu_switch_to to take a task_struct pointer
> > > in the first place, since all it does is access the thread.cpu_context
> > > member. So, just pass that in directly.
> > >
> > > Fixes: 0c8c0f03e3a2 ("x86/fpu, sched: Dynamically allocate 'struct fpu'")
> > > Cc: Dave Hansen <dave.hansen@linux.intel.com>
> > > Signed-off-by: Olof Johansson <olof@lixom.net>
> > > ---
> > > arch/arm64/include/asm/processor.h | 4 ++--
> > > arch/arm64/kernel/asm-offsets.c | 2 --
> > > arch/arm64/kernel/entry.S | 34 ++++++++++++++++------------------
> > > arch/arm64/kernel/process.c | 3 ++-
> > > 4 files changed, 20 insertions(+), 23 deletions(-)
> >
> > So why not pass in 'thread_struct' as the patch below does - it looks much
> > simpler to me. This way the assembly doesn't have to be changed at all.
>
> Unfortunately, neither of these approaches really works:
>
> - We need to return last from __switch_to, which means not corrupting
> x0 in cpu_switch_to and then having an ugly container_of to get back
> at the task_struct
>
> - ret_from_fork needs to pass the task_struct of prev to schedule_tail,
> so we have the same issue there
>
Confirmed; both Ingo's patch (after fixing it up) and Olof's patch
fail my qemu tests (qemu hangs with both patches and does not produce
any console output).
> Patch below fixes things, but it's a shame we have to use an extra register
> like this.
>
Yes, your patch works, at least with my qemu tests, and the allmodconfig build
no longer fails.
Tested-by: Guenter Roeck <linux@roeck-us.net>
> Will
>
> --->8
>
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index f860bfda454a..e16351819fed 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -585,7 +585,8 @@ ENDPROC(el0_irq)
> *
> */
> ENTRY(cpu_switch_to)
> - add x8, x0, #THREAD_CPU_CONTEXT
> + mov x10, #THREAD_CPU_CONTEXT
> + add x8, x0, x10
> mov x9, sp
> stp x19, x20, [x8], #16 // store callee-saved registers
> stp x21, x22, [x8], #16
> @@ -594,7 +595,7 @@ ENTRY(cpu_switch_to)
> stp x27, x28, [x8], #16
> stp x29, x9, [x8], #16
> str lr, [x8]
> - add x8, x1, #THREAD_CPU_CONTEXT
> + add x8, x1, x10
> ldp x19, x20, [x8], #16 // restore callee-saved registers
> ldp x21, x22, [x8], #16
> ldp x23, x24, [x8], #16
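One way to read why the mov/add pair in Will's patch assembles where the immediate add did not: a single MOVZ instruction can carry a 16-bit value (optionally shifted by 16, 32 or 48 bits), which is wider than the 12 bits available to an immediate add, and the register form of add has no immediate limit at all. The check below is a minimal sketch of the MOVZ case only; the assembler's mov alias also accepts other encodings that this sketch ignores, and the function name and sample values are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if v can be loaded by one MOVZ, i.e. all
 * set bits fall inside a single 16-bit field at bit position 0, 16, 32 or 48.
 */
static bool fits_movz(uint64_t v)
{
	for (unsigned int shift = 0; shift < 64; shift += 16) {
		if ((v & ~(UINT64_C(0xffff) << shift)) == 0)
			return true;
	}
	return false;
}

/* Illustrative values only; none of these is the real THREAD_CPU_CONTEXT. */
static void fits_movz_examples(void)
{
	assert(fits_movz(0x1008));	/* too wide for add #imm12, fine for MOVZ */
	assert(fits_movz(0xffff));	/* largest unshifted value */
	assert(!fits_movz(0x10001));	/* spans two 16-bit fields */
}
```

The cost, as noted in the thread, is one extra scratch register (x10) holding the offset for both add instructions.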
* Re: [PATCH] arm64: Minor refactoring of cpu_switch_to() to fix build breakage
2015-07-20 14:20 ` Guenter Roeck
@ 2015-07-20 16:33 ` Olof Johansson
From: Olof Johansson @ 2015-07-20 16:33 UTC
To: Guenter Roeck
Cc: Will Deacon, Ingo Molnar, Catalin Marinas, linux-kernel,
linux-arm-kernel
On Mon, Jul 20, 2015 at 7:20 AM, Guenter Roeck <linux@roeck-us.net> wrote:
> On Mon, Jul 20, 2015 at 11:53:45AM +0100, Will Deacon wrote:
>> On Mon, Jul 20, 2015 at 08:36:47AM +0100, Ingo Molnar wrote:
>> > * Olof Johansson <olof@lixom.net> wrote:
>> >
>> > > Commit 0c8c0f03e3a2 ("x86/fpu, sched: Dynamically allocate 'struct fpu'")
>> > > moved the thread_struct to the bottom of task_struct. As a result, the
>> > > offset is now too large to be used in an immediate add on arm64 with
>> > > some kernel configs:
>> > >
>> > > arch/arm64/kernel/entry.S: Assembler messages:
>> > > arch/arm64/kernel/entry.S:588: Error: immediate out of range
>> > > arch/arm64/kernel/entry.S:597: Error: immediate out of range
>> > >
>> > > There's really no reason for cpu_switch_to to take a task_struct pointer
>> > > in the first place, since all it does is access the thread.cpu_context
>> > > member. So, just pass that in directly.
>> > >
>> > > Fixes: 0c8c0f03e3a2 ("x86/fpu, sched: Dynamically allocate 'struct fpu'")
>> > > Cc: Dave Hansen <dave.hansen@linux.intel.com>
>> > > Signed-off-by: Olof Johansson <olof@lixom.net>
>> > > ---
>> > > arch/arm64/include/asm/processor.h | 4 ++--
>> > > arch/arm64/kernel/asm-offsets.c | 2 --
>> > > arch/arm64/kernel/entry.S | 34 ++++++++++++++++------------------
>> > > arch/arm64/kernel/process.c | 3 ++-
>> > > 4 files changed, 20 insertions(+), 23 deletions(-)
>> >
>> > So why not pass in 'thread_struct' as the patch below does - it looks much
>> > simpler to me. This way the assembly doesn't have to be changed at all.
>>
>> Unfortunately, neither of these approaches really works:
>>
>> - We need to return last from __switch_to, which means not corrupting
>> x0 in cpu_switch_to and then having an ugly container_of to get back
>> at the task_struct
>>
>> - ret_from_fork needs to pass the task_struct of prev to schedule_tail,
>> so we have the same issue there
>>
> Confirmed; both Ingo's patch (after fixing it up) and Olof's patch
> fail my qemu tests (qemu hangs with both patches and does not produce
> any console output).
>
>> Patch below fixes things, but it's a shame we have to use an extra register
>> like this.
>>
> Yes, your patch works, at least with my qemu tests, and the allmodconfig build
> no longer fails.
>
> Tested-by: Guenter Roeck <linux@roeck-us.net>
Yep, my bad for not looking harder (and resurrecting my only arm64
test system that's currently down).
Thanks all. :)
-Olof