* [PATCH/RFC] arm64/kernel: Fix return value when cpu_online() fails in __cpu_up()
@ 2020-05-27 23:34 Nobuhiro Iwamatsu
2020-05-28 8:12 ` Will Deacon
2020-05-28 11:53 ` Catalin Marinas
0 siblings, 2 replies; 4+ messages in thread
From: Nobuhiro Iwamatsu @ 2020-05-27 23:34 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Mark Rutland, catalin.marinas, Nobuhiro Iwamatsu,
Signed-off-by : Gavin Shan, Yuji Ishikawa
If boot_secondary() was successful, and cpu_online() was an error in
__cpu_up(), -EIO was returned, but 0 is returned by commit d22b115cbfbb7
("arm64/kernel: Simplify __cpu_up() by bailing out early").
Therefore, bringup_wait_for_ap() causes the primary core to wait for a
long time, which may cause boot failure.
This commit sets -EIO to return code under the same conditions.
Fixes: d22b115cbfbb7 ("arm64/kernel: Simplify __cpu_up() by bailing out early")
CC: Signed-off-by: Gavin Shan <gshan@redhat.com>
CC: Catalin Marinas <catalin.marinas@arm.com>
CC: Mark Rutland <mark.rutland@arm.com>
Tested-by: Yuji Ishikawa <yuji2.ishikawa@toshiba.co.jp>
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
---
arch/arm64/kernel/smp.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 061f60fe452f7..ea677680e4277 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -137,6 +137,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
if (cpu_online(cpu))
return 0;
+ ret = -EIO;
pr_crit("CPU%u: failed to come online\n", cpu);
secondary_data.task = NULL;
secondary_data.stack = NULL;
--
2.27.0.rc0
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH/RFC] arm64/kernel: Fix return value when cpu_online() fails in __cpu_up()
2020-05-27 23:34 [PATCH/RFC] arm64/kernel: Fix return value when cpu_online() fails in __cpu_up() Nobuhiro Iwamatsu
@ 2020-05-28 8:12 ` Will Deacon
2020-05-28 23:31 ` nobuhiro1.iwamatsu
2020-05-28 11:53 ` Catalin Marinas
1 sibling, 1 reply; 4+ messages in thread
From: Will Deacon @ 2020-05-28 8:12 UTC (permalink / raw)
To: Nobuhiro Iwamatsu
Cc: Mark Rutland, catalin.marinas, Signed-off-by : Gavin Shan,
Yuji Ishikawa, linux-arm-kernel
On Thu, May 28, 2020 at 08:34:57AM +0900, Nobuhiro Iwamatsu wrote:
> If boot_secondary() was successful, and cpu_online() was an error in
> __cpu_up(), -EIO was returned, but 0 is returned by commit d22b115cbfbb7
> ("arm64/kernel: Simplify __cpu_up() by bailing out early").
> Therefore, bringup_wait_for_ap() causes the primary core to wait for a
> long time, which may cause boot failure.
> This commit sets -EIO to return code under the same conditions.
>
> Fixes: d22b115cbfbb7 ("arm64/kernel: Simplify __cpu_up() by bailing out early")
> CC: Signed-off-by: Gavin Shan <gshan@redhat.com>
> CC: Catalin Marinas <catalin.marinas@arm.com>
> CC: Mark Rutland <mark.rutland@arm.com>
> Tested-by: Yuji Ishikawa <yuji2.ishikawa@toshiba.co.jp>
> Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
> ---
> arch/arm64/kernel/smp.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 061f60fe452f7..ea677680e4277 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -137,6 +137,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
> if (cpu_online(cpu))
> return 0;
>
> + ret = -EIO;
> pr_crit("CPU%u: failed to come online\n", cpu);
> secondary_data.task = NULL;
> secondary_data.stack = NULL;
This appears to restore the old behaviour, so looks good to me. I'd probably
just replace the final 'return ret' with 'return -EIO' since it's never
going to be anything else.
Also -- aren't you in a pretty serious mess if the CPU starts but doesn't
come online? I think the patch is fine, but this really shouldn't happen,
right? Could you share your dmesg?
Either way:
Acked-by: Will Deacon <will@kernel.org>
Catalin -- do you want to take this (the problematic change was introduced
during the last merge window afaict)?
Will
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH/RFC] arm64/kernel: Fix return value when cpu_online() fails in __cpu_up()
2020-05-27 23:34 [PATCH/RFC] arm64/kernel: Fix return value when cpu_online() fails in __cpu_up() Nobuhiro Iwamatsu
2020-05-28 8:12 ` Will Deacon
@ 2020-05-28 11:53 ` Catalin Marinas
1 sibling, 0 replies; 4+ messages in thread
From: Catalin Marinas @ 2020-05-28 11:53 UTC (permalink / raw)
To: Nobuhiro Iwamatsu, linux-arm-kernel
Cc: Mark Rutland, Gavin Shan, Will Deacon, Yuji Ishikawa
On Thu, 28 May 2020 08:34:57 +0900, Nobuhiro Iwamatsu wrote:
> If boot_secondary() was successful, and cpu_online() was an error in
> __cpu_up(), -EIO was returned, but 0 is returned by commit d22b115cbfbb7
> ("arm64/kernel: Simplify __cpu_up() by bailing out early").
> Therefore, bringup_wait_for_ap() causes the primary core to wait for a
> long time, which may cause boot failure.
> This commit sets -EIO to return code under the same conditions.
Applied to arm64 (for-next/fixes), thanks!
[1/1] arm64/kernel: Fix return value when cpu_online() fails in __cpu_up()
https://git.kernel.org/arm64/c/ba051f097fc3
--
Catalin
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 4+ messages in thread
* RE: [PATCH/RFC] arm64/kernel: Fix return value when cpu_online() fails in __cpu_up()
2020-05-28 8:12 ` Will Deacon
@ 2020-05-28 23:31 ` nobuhiro1.iwamatsu
0 siblings, 0 replies; 4+ messages in thread
From: nobuhiro1.iwamatsu @ 2020-05-28 23:31 UTC (permalink / raw)
To: will
Cc: mark.rutland, catalin.marinas, gshan, yuji2.ishikawa, linux-arm-kernel
Hi,
Thanks for your review.
The patch has already been applied...
> -----Original Message-----
> From: Will Deacon [mailto:will@kernel.org]
> Sent: Thursday, May 28, 2020 5:12 PM
> To: iwamatsu nobuhiro(岩松 信洋 □SWC◯ACT) <nobuhiro1.iwamatsu@toshiba.co.jp>
> Cc: linux-arm-kernel@lists.infradead.org; Mark Rutland <mark.rutland@arm.com>; catalin.marinas@arm.com;
> Signed-off-by : Gavin Shan <gshan@redhat.com>; ishikawa yuji(石川 悠司 TDSC ○DSRC□ET開○ET3)
> <yuji2.ishikawa@toshiba.co.jp>
> Subject: Re: [PATCH/RFC] arm64/kernel: Fix return value when cpu_online() fails in __cpu_up()
>
> On Thu, May 28, 2020 at 08:34:57AM +0900, Nobuhiro Iwamatsu wrote:
> > If boot_secondary() was successful, and cpu_online() was an error in
> > __cpu_up(), -EIO was returned, but 0 is returned by commit d22b115cbfbb7
> > ("arm64/kernel: Simplify __cpu_up() by bailing out early").
> > Therefore, bringup_wait_for_ap() causes the primary core to wait for a
> > long time, which may cause boot failure.
> > This commit sets -EIO to return code under the same conditions.
> >
> > Fixes: d22b115cbfbb7 ("arm64/kernel: Simplify __cpu_up() by bailing out early")
> > CC: Signed-off-by: Gavin Shan <gshan@redhat.com>
> > CC: Catalin Marinas <catalin.marinas@arm.com>
> > CC: Mark Rutland <mark.rutland@arm.com>
> > Tested-by: Yuji Ishikawa <yuji2.ishikawa@toshiba.co.jp>
> > Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
> > ---
> > arch/arm64/kernel/smp.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> > index 061f60fe452f7..ea677680e4277 100644
> > --- a/arch/arm64/kernel/smp.c
> > +++ b/arch/arm64/kernel/smp.c
> > @@ -137,6 +137,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
> > if (cpu_online(cpu))
> > return 0;
> >
> > + ret = -EIO;
> > pr_crit("CPU%u: failed to come online\n", cpu);
> > secondary_data.task = NULL;
> > secondary_data.stack = NULL;
>
> This appears to restore the old behaviour, so looks good to me. I'd probably
> just replace the final 'return ret' with 'return -EIO' since it's never
> going to be anything else.
Yeah, your suggestion is better.
>
> Also -- aren't you in a pretty serious mess if the CPU starts but doesn't
> come online? I think the patch is fine, but this really shouldn't happen,
> right? Could you share your dmesg?
>
I paste the dmesg below. It stops with "printk: bootconsole [pl11] disabled" in my environment.
After checking the bisect and the code, I realized that there was a problem with the target commit.
And SoC enable-method that causes the problem uses spin-table instead of PCSI.
----
Starting kernel ...
Booting Linux on physical CPU 0x0000000000 [0x410fd034]
Linux version 5.7.0-rc7-00032-g565e4e976d6e (iwamatsu@ryzen7) (gcc version 8.3.0 (Debian 8.3.0-21), GNU ld (GNU Binutils for Debian) 2.34) #8 SMP PREEMPT Thu May 28 20:48:58 JST 2020
Machine model: Toshiba TMPV7700 EVB
earlycon: pl11 at MMIO 0x0000000028200000 (options '')
printk: bootconsole [pl11] enabled
cma: Reserved 16 MiB at 0x00000000b7000000
NUMA: No NUMA configuration found
NUMA: Faking a node at [mem 0x0000000080000000-0x00000000b7ffffff]
NUMA: NODE_DATA [mem 0xb6e34100-0xb6e35fff]
Zone ranges:
DMA [mem 0x0000000080000000-0x00000000b7ffffff]
DMA32 empty
Normal empty
Movable zone start for each node
Early memory node ranges
node 0: [mem 0x0000000080000000-0x00000000b7ffffff]
Initmem setup node 0 [mem 0x0000000080000000-0x00000000b7ffffff]
percpu: Embedded 21 pages/cpu s46168 r8192 d31656 u86016
Detected VIPT I-cache on CPU0
CPU features: detected: ARM erratum 845719
Built 1 zonelists, mobility grouping on. Total pages: 225792
Policy zone: DMA
Kernel command line: earlycon=pl011,0x28200000
Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
Inode-cache hash table entries: 65536 (order: 7, 524288 bytes, linear)
mem auto-init: stack:off, heap alloc:off, heap free:off
Memory: 765600K/917504K available (4670K kernel code, 318K rwdata, 1236K rodata, 320K init, 327K bss, 135520K reserved, 16384K cma-reserved)
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=8, Nodes=1
rcu: Preemptible hierarchical RCU implementation.
rcu: RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=8.
Tasks RCU enabled.
rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=8
NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
GIC: Using split EOI/Deactivate mode
random: get_random_bytes called from start_kernel팝ﴱ with crng_init=0
arch_timer: cp15 timer(s) running at 150.00MHz (phys).
clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0x2298375bd0, max_idle_ns: 440795208267 ns
sched_clock: 56 bits at 150MHz, resolution 6ns, wraps every 2199023255551ns
Console: colour dummy device 80x25
printk: console [tty0] enabled
printk: bootconsole [pl11] disabled
---
> Either way:
>
> Acked-by: Will Deacon <will@kernel.org>
>
Thanks!
> Catalin -- do you want to take this (the problematic change was introduced
> during the last merge window afaict)?
>
> Will
Best regards,
Nobuhiro
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2020-05-28 23:32 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-27 23:34 [PATCH/RFC] arm64/kernel: Fix return value when cpu_online() fails in __cpu_up() Nobuhiro Iwamatsu
2020-05-28 8:12 ` Will Deacon
2020-05-28 23:31 ` nobuhiro1.iwamatsu
2020-05-28 11:53 ` Catalin Marinas
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.