* Re: [PATCH v2 2/2] powerpc:85xx: fix timebase sync issue when CONFIG_HOTPLUG_CPU=n
From: Martin Kennedy @ 2021-11-25 4:20 UTC
To: nixiaoming
Cc: Yuantian.Tang, benh, chenhui.zhao, chenjianguo3, gregkh, linux-kernel, linuxppc-dev, liuwenliang, mpe, oss, paul.gortmaker, paulus, stable, wangle6, Christian Lamparter

Hi there,

I have bisected OpenWrt master, and then the Linux kernel down to this
change, to confirm that this change causes a kernel panic on my
P1020RDB-based, dual-core Aerohive HiveAP 370, at initialization of
the second CPU:

:
[ 0.000000] Linux version 5.10.80 (labby@lobon) (powerpc-openwrt-linux-musl-gcc (OpenWrt GCC 11.2.0 r18111+1-ebb6f9287e) 11.2.0, GNU ld (GNU Binutils) 2.37) #0 SMP Thu Nov 25 02:49:35 2021
[ 0.000000] Using P1020 RDB machine description
:
[ 0.627233] smp: Bringing up secondary CPUs ...
[ 0.681659] kernel tried to execute user page (0) - exploit attempt? (uid: 0)
[ 0.766618] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
[ 0.848899] Faulting instruction address: 0x00000000
[ 0.908273] Oops: Kernel access of bad area, sig: 11 [#1]
[ 0.972851] BE PAGE_SIZE=4K SMP NR_CPUS=2 P1020 RDB
[ 1.031179] Modules linked in:
[ 1.067640] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.80 #0
[ 1.139507] NIP: 00000000 LR: c0021d2c CTR: 00000000
[ 1.199921] REGS: c1051cf0 TRAP: 0400 Not tainted (5.10.80)
[ 1.269705] MSR: 00021000 <CE,ME> CR: 84020202 XER: 00000000
[ 1.340534]
[ 1.340534] GPR00: c0021cb8 c1051da8 c1048000 00000001 00029000 00000000 00000001 00000000
[ 1.340534] GPR08: 00000001 00000000 c08b0000 00000040 22000208 00000000 c00032c4 00000000
[ 1.340534] GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00029000 00000001
[ 1.340534] GPR24: 1ffff240 20000000 dffff240 c080a1f4 00000001 c08ae0a8 00000001 dffff240
[ 1.758220] NIP [00000000] 0x0
[ 1.794688] LR [c0021d2c] smp_85xx_kick_cpu+0xe8/0x568
[ 1.856126] Call Trace:
[ 1.885295] [c1051da8] [c0021cb8] smp_85xx_kick_cpu+0x74/0x568 (unreliable)
[ 1.968633] [c1051de8] [c0011460] __cpu_up+0xc0/0x228
[ 2.029038] [c1051e18] [c0031bbc] bringup_cpu+0x30/0x224
[ 2.092572] [c1051e48] [c0031f3c] cpu_up.constprop.0+0x180/0x33c
[ 2.164443] [c1051e88] [c00322e8] bringup_nonboot_cpus+0x88/0xc8
[ 2.236326] [c1051eb8] [c07e67bc] smp_init+0x30/0x78
[ 2.295698] [c1051ed8] [c07d9e28] kernel_init_freeable+0x118/0x2a8
[ 2.369641] [c1051f18] [c00032d8] kernel_init+0x14/0x124
[ 2.433176] [c1051f38] [c0010278] ret_from_kernel_thread+0x14/0x1c
[ 2.507125] Instruction dump:
[ 2.542541] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 2.635242] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 2.727952] ---[ end trace 9b796a4bafb6bc14 ]---
[ 2.783149]
[ 3.800879] Kernel panic - not syncing: Fatal exception
[ 3.862353] Rebooting in 1 seconds..
[ 5.905097] System Halted, OK to turn off power

Without this patch, the kernel no longer panics:

[ 0.627232] smp: Bringing up secondary CPUs ...
[ 0.681857] smp: Brought up 1 node, 2 CPUs

Here is the kernel configuration for this built kernel:
https://git.openwrt.org/?p=openwrt/openwrt.git;a=blob_plain;f=target/linux/mpc85xx/config-5.10;hb=HEAD

In case a force-push is needed for the source repository
(https://github.com/Hurricos/openwrt/commit/ad19bdfc77d60ee1c52b41bb4345fdd02284c4cf),
here is the device tree for this board:
https://paste.c-net.org/TrousersSliced

Martin
* Re: [PATCH v2 2/2] powerpc:85xx: fix timebase sync issue when CONFIG_HOTPLUG_CPU=n
From: Xiaoming Ni @ 2021-11-25 7:23 UTC
To: Martin Kennedy
Cc: Yuantian.Tang, benh, chenhui.zhao, chenjianguo3, gregkh, linux-kernel, linuxppc-dev, liuwenliang, mpe, oss, paul.gortmaker, paulus, stable, wangle6, Christian Lamparter

On 2021/11/25 12:20, Martin Kennedy wrote:
> Hi there,
>
> I have bisected OpenWrt master, and then the Linux kernel down to this
> change, to confirm that this change causes a kernel panic on my
> P1020RDB-based, dual-core Aerohive HiveAP 370, at initialization of
> the second CPU:
>
> :
> [ 0.000000] Linux version 5.10.80 (labby@lobon)
> (powerpc-openwrt-linux-musl-gcc (OpenWrt GCC 11.2.0
> r18111+1-ebb6f9287e) 11.2.0, GNU ld (GNU Binutils) 2.37) #0 SMP Thu
> Nov 25 02:49:35 2021
> [ 0.000000] Using P1020 RDB machine description
> :
> [ 0.627233] smp: Bringing up secondary CPUs ...
> [ 0.681659] kernel tried to execute user page (0) - exploit attempt? (uid: 0)
> [ 0.766618] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
> [ 0.848899] Faulting instruction address: 0x00000000
> [ 0.908273] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 0.972851] BE PAGE_SIZE=4K SMP NR_CPUS=2 P1020 RDB
> [ 1.031179] Modules linked in:
> [ 1.067640] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.80 #0
> [ 1.139507] NIP: 00000000 LR: c0021d2c CTR: 00000000
> [ 1.199921] REGS: c1051cf0 TRAP: 0400 Not tainted (5.10.80)
> [ 1.269705] MSR: 00021000 <CE,ME> CR: 84020202 XER: 00000000
> [ 1.340534]
> [ 1.340534] GPR00: c0021cb8 c1051da8 c1048000 00000001 00029000
> 00000000 00000001 00000000
> [ 1.340534] GPR08: 00000001 00000000 c08b0000 00000040 22000208
> 00000000 c00032c4 00000000
> [ 1.340534] GPR16: 00000000 00000000 00000000 00000000 00000000
> 00000000 00029000 00000001
> [ 1.340534] GPR24: 1ffff240 20000000 dffff240 c080a1f4 00000001
> c08ae0a8 00000001 dffff240
> [ 1.758220] NIP [00000000] 0x0
> [ 1.794688] LR [c0021d2c] smp_85xx_kick_cpu+0xe8/0x568
> [ 1.856126] Call Trace:
> [ 1.885295] [c1051da8] [c0021cb8] smp_85xx_kick_cpu+0x74/0x568 (unreliable)
> [ 1.968633] [c1051de8] [c0011460] __cpu_up+0xc0/0x228
> [ 2.029038] [c1051e18] [c0031bbc] bringup_cpu+0x30/0x224
> [ 2.092572] [c1051e48] [c0031f3c] cpu_up.constprop.0+0x180/0x33c
> [ 2.164443] [c1051e88] [c00322e8] bringup_nonboot_cpus+0x88/0xc8
> [ 2.236326] [c1051eb8] [c07e67bc] smp_init+0x30/0x78
> [ 2.295698] [c1051ed8] [c07d9e28] kernel_init_freeable+0x118/0x2a8
> [ 2.369641] [c1051f18] [c00032d8] kernel_init+0x14/0x124
> [ 2.433176] [c1051f38] [c0010278] ret_from_kernel_thread+0x14/0x1c
> [ 2.507125] Instruction dump:
> [ 2.542541] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> XXXXXXXX XXXXXXXX
> [ 2.635242] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> XXXXXXXX XXXXXXXX
> [ 2.727952] ---[ end trace 9b796a4bafb6bc14 ]---
> [ 2.783149]
> [ 3.800879] Kernel panic - not syncing: Fatal exception
> [ 3.862353] Rebooting in 1 seconds..
> [ 5.905097] System Halted, OK to turn off power
>
> Without this patch, the kernel no longer panics:
>
> [ 0.627232] smp: Bringing up secondary CPUs ...
> [ 0.681857] smp: Brought up 1 node, 2 CPUs
>
> Here is the kernel configuration for this built kernel:
> https://git.openwrt.org/?p=openwrt/openwrt.git;a=blob_plain;f=target/linux/mpc85xx/config-5.10;hb=HEAD
>
> In case a force-push is needed for the source repository
> (https://github.com/Hurricos/openwrt/commit/ad19bdfc77d60ee1c52b41bb4345fdd02284c4cf),
> here is the device tree for this board:
> https://paste.c-net.org/TrousersSliced
>
> Martin
> .
>
When CONFIG_FSL_PMC is set to n, cpu_up_prepare is not assigned to
mpc85xx_pm_ops. I suspect that this is the cause of the current null
pointer access.
I do not have the corresponding board test environment. Can you help me
to test whether the following patch solves the problem?

diff --git a/arch/powerpc/platforms/85xx/smp.c b/arch/powerpc/platforms/85xx/smp.c
index 83f4a6389a28..d7081e9af65c 100644
--- a/arch/powerpc/platforms/85xx/smp.c
+++ b/arch/powerpc/platforms/85xx/smp.c
@@ -220,7 +220,7 @@ static int smp_85xx_start_cpu(int cpu)
 	local_irq_save(flags);
 	hard_irq_disable();
 
-	if (qoriq_pm_ops)
+	if (qoriq_pm_ops && qoriq_pm_ops->cpu_up_prepare)
 		qoriq_pm_ops->cpu_up_prepare(cpu);
 
 	/* if cpu is not spinning, reset it */
@@ -292,7 +292,7 @@ static int smp_85xx_kick_cpu(int nr)
 	booting_thread_hwid = cpu_thread_in_core(nr);
 	primary = cpu_first_thread_sibling(nr);
 
-	if (qoriq_pm_ops)
+	if (qoriq_pm_ops && qoriq_pm_ops->cpu_up_prepare)
 		qoriq_pm_ops->cpu_up_prepare(nr);
 
 	/*
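
The guard proposed in the patch can be illustrated outside the kernel with
a minimal user-space C sketch. It assumes a simplified ops table with a
single optional callback; the names pm_ops_sketch, record_prepare and
kick_cpu_sketch are hypothetical and are not the kernel's definitions.
Checking only the table pointer would still call through a NULL member,
which would be consistent with the NIP of 0 reported in the oops; checking
the member as well skips the call.

```c
#include <stddef.h>

/* Hypothetical, simplified ops table: one optional callback. */
struct pm_ops_sketch {
	void (*cpu_up_prepare)(int cpu);
};

/* Records the last CPU passed to the callback, for demonstration only. */
static int last_prepared_cpu = -1;

static void record_prepare(int cpu)
{
	last_prepared_cpu = cpu;
}

/*
 * Same shape as the guarded call in the patch: invoke the callback only
 * when both the table pointer and the member are non-NULL.
 * Returns 1 if the callback ran, 0 if it was skipped.
 */
static int kick_cpu_sketch(const struct pm_ops_sketch *ops, int cpu)
{
	if (ops && ops->cpu_up_prepare) {
		ops->cpu_up_prepare(cpu);
		return 1;
	}
	return 0;
}
```

With a table whose cpu_up_prepare member is NULL, kick_cpu_sketch returns 0
instead of branching to address 0; with record_prepare installed it returns
1 and the CPU number is recorded.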
* Re: [PATCH v2 2/2] powerpc:85xx: fix timebase sync issue when CONFIG_HOTPLUG_CPU=n
From: Martin Kennedy @ 2021-11-25 14:34 UTC
To: Xiaoming Ni
Cc: chenjianguo3, wangle6, chenhui.zhao, Christian Lamparter, oss, linux-kernel, stable, Yuantian.Tang, paul.gortmaker, paulus, gregkh, linuxppc-dev, liuwenliang

Hi there,

Yes, I can test this patch.

I have added it to my tree and removed the reversion, and can confirm
that the second CPU comes up correctly now.

Martin

On Thu, Nov 25, 2021 at 2:23 AM Xiaoming Ni <nixiaoming@huawei.com> wrote:
>
> On 2021/11/25 12:20, Martin Kennedy wrote:
> > Hi there,
> >
> > I have bisected OpenWrt master, and then the Linux kernel down to this
> > change, to confirm that this change causes a kernel panic on my
> > P1020RDB-based, dual-core Aerohive HiveAP 370, at initialization of
> > the second CPU:
> >
> > :
> > [ 0.000000] Linux version 5.10.80 (labby@lobon)
> > (powerpc-openwrt-linux-musl-gcc (OpenWrt GCC 11.2.0
> > r18111+1-ebb6f9287e) 11.2.0, GNU ld (GNU Binutils) 2.37) #0 SMP Thu
> > Nov 25 02:49:35 2021
> > [ 0.000000] Using P1020 RDB machine description
> > :
> > [ 0.627233] smp: Bringing up secondary CPUs ...
> > [ 0.681659] kernel tried to execute user page (0) - exploit attempt? (uid: 0)
> > [ 0.766618] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
> > [ 0.848899] Faulting instruction address: 0x00000000
> > [ 0.908273] Oops: Kernel access of bad area, sig: 11 [#1]
> > [ 0.972851] BE PAGE_SIZE=4K SMP NR_CPUS=2 P1020 RDB
> > [ 1.031179] Modules linked in:
> > [ 1.067640] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.80 #0
> > [ 1.139507] NIP: 00000000 LR: c0021d2c CTR: 00000000
> > [ 1.199921] REGS: c1051cf0 TRAP: 0400 Not tainted (5.10.80)
> > [ 1.269705] MSR: 00021000 <CE,ME> CR: 84020202 XER: 00000000
> > [ 1.340534]
> > [ 1.340534] GPR00: c0021cb8 c1051da8 c1048000 00000001 00029000
> > 00000000 00000001 00000000
> > [ 1.340534] GPR08: 00000001 00000000 c08b0000 00000040 22000208
> > 00000000 c00032c4 00000000
> > [ 1.340534] GPR16: 00000000 00000000 00000000 00000000 00000000
> > 00000000 00029000 00000001
> > [ 1.340534] GPR24: 1ffff240 20000000 dffff240 c080a1f4 00000001
> > c08ae0a8 00000001 dffff240
> > [ 1.758220] NIP [00000000] 0x0
> > [ 1.794688] LR [c0021d2c] smp_85xx_kick_cpu+0xe8/0x568
> > [ 1.856126] Call Trace:
> > [ 1.885295] [c1051da8] [c0021cb8] smp_85xx_kick_cpu+0x74/0x568 (unreliable)
> > [ 1.968633] [c1051de8] [c0011460] __cpu_up+0xc0/0x228
> > [ 2.029038] [c1051e18] [c0031bbc] bringup_cpu+0x30/0x224
> > [ 2.092572] [c1051e48] [c0031f3c] cpu_up.constprop.0+0x180/0x33c
> > [ 2.164443] [c1051e88] [c00322e8] bringup_nonboot_cpus+0x88/0xc8
> > [ 2.236326] [c1051eb8] [c07e67bc] smp_init+0x30/0x78
> > [ 2.295698] [c1051ed8] [c07d9e28] kernel_init_freeable+0x118/0x2a8
> > [ 2.369641] [c1051f18] [c00032d8] kernel_init+0x14/0x124
> > [ 2.433176] [c1051f38] [c0010278] ret_from_kernel_thread+0x14/0x1c
> > [ 2.507125] Instruction dump:
> > [ 2.542541] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> > XXXXXXXX XXXXXXXX
> > [ 2.635242] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> > XXXXXXXX XXXXXXXX
> > [ 2.727952] ---[ end trace 9b796a4bafb6bc14 ]---
> > [ 2.783149]
> > [ 3.800879] Kernel panic - not syncing: Fatal exception
> > [ 3.862353] Rebooting in 1 seconds..
> > [ 5.905097] System Halted, OK to turn off power
> >
> > Without this patch, the kernel no longer panics:
> >
> > [ 0.627232] smp: Bringing up secondary CPUs ...
> > [ 0.681857] smp: Brought up 1 node, 2 CPUs
> >
> > Here is the kernel configuration for this built kernel:
> > https://git.openwrt.org/?p=openwrt/openwrt.git;a=blob_plain;f=target/linux/mpc85xx/config-5.10;hb=HEAD
> >
> > In case a force-push is needed for the source repository
> > (https://github.com/Hurricos/openwrt/commit/ad19bdfc77d60ee1c52b41bb4345fdd02284c4cf),
> > here is the device tree for this board:
> > https://paste.c-net.org/TrousersSliced
> >
> > Martin
> > .
> >
> When CONFIG_FSL_PMC is set to n, cpu_up_prepare is not assigned to
> mpc85xx_pm_ops. I suspect that this is the cause of the current null
> pointer access.
> I do not have the corresponding board test environment. Can you help me
> to test whether the following patch solves the problem?
>
> diff --git a/arch/powerpc/platforms/85xx/smp.c
> b/arch/powerpc/platforms/85xx/smp.c
> index 83f4a6389a28..d7081e9af65c 100644
> --- a/arch/powerpc/platforms/85xx/smp.c
> +++ b/arch/powerpc/platforms/85xx/smp.c
> @@ -220,7 +220,7 @@ static int smp_85xx_start_cpu(int cpu)
>  	local_irq_save(flags);
>  	hard_irq_disable();
>
> -	if (qoriq_pm_ops)
> +	if (qoriq_pm_ops && qoriq_pm_ops->cpu_up_prepare)
>  		qoriq_pm_ops->cpu_up_prepare(cpu);
>
>  	/* if cpu is not spinning, reset it */
> @@ -292,7 +292,7 @@ static int smp_85xx_kick_cpu(int nr)
>  	booting_thread_hwid = cpu_thread_in_core(nr);
>  	primary = cpu_first_thread_sibling(nr);
>
> -	if (qoriq_pm_ops)
> +	if (qoriq_pm_ops && qoriq_pm_ops->cpu_up_prepare)
>  		qoriq_pm_ops->cpu_up_prepare(nr);
>
>  	/*
>
* Re: [PATCH v2 2/2] powerpc:85xx: fix timebase sync issue when CONFIG_HOTPLUG_CPU=n
From: Xiaoming Ni @ 2021-11-26 1:22 UTC
To: Martin Kennedy
Cc: chenjianguo3, wangle6, chenhui.zhao, Christian Lamparter, oss, linux-kernel, stable, Yuantian.Tang, paul.gortmaker, paulus, gregkh, linuxppc-dev, liuwenliang

On 2021/11/25 22:34, Martin Kennedy wrote:
> Hi there,
>
> Yes, I can test this patch.
>
> I have added it to my tree and removed the reversion, and can confirm
> that the second CPU comes up correctly now.
>
> Martin
>
Thank you very much for your report and testing, I'll send a patch later

Thanks
Xiaoming Ni

> On Thu, Nov 25, 2021 at 2:23 AM Xiaoming Ni <nixiaoming@huawei.com> wrote:
>>
>> On 2021/11/25 12:20, Martin Kennedy wrote:
>>> Hi there,
>>>
>>> I have bisected OpenWrt master, and then the Linux kernel down to this
>>> change, to confirm that this change causes a kernel panic on my
>>> P1020RDB-based, dual-core Aerohive HiveAP 370, at initialization of
>>> the second CPU:
>>>
>>> :
>>> [ 0.000000] Linux version 5.10.80 (labby@lobon)
>>> (powerpc-openwrt-linux-musl-gcc (OpenWrt GCC 11.2.0
>>> r18111+1-ebb6f9287e) 11.2.0, GNU ld (GNU Binutils) 2.37) #0 SMP Thu
>>> Nov 25 02:49:35 2021
>>> [ 0.000000] Using P1020 RDB machine description
>>> :
>>> [ 0.627233] smp: Bringing up secondary CPUs ...
>>> [ 0.681659] kernel tried to execute user page (0) - exploit attempt? (uid: 0)
>>> [ 0.766618] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
>>> [ 0.848899] Faulting instruction address: 0x00000000
>>> [ 0.908273] Oops: Kernel access of bad area, sig: 11 [#1]
>>> [ 0.972851] BE PAGE_SIZE=4K SMP NR_CPUS=2 P1020 RDB
>>> [ 1.031179] Modules linked in:
>>> [ 1.067640] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.80 #0
>>> [ 1.139507] NIP: 00000000 LR: c0021d2c CTR: 00000000
>>> [ 1.199921] REGS: c1051cf0 TRAP: 0400 Not tainted (5.10.80)
>>> [ 1.269705] MSR: 00021000 <CE,ME> CR: 84020202 XER: 00000000
>>> [ 1.340534]
>>> [ 1.340534] GPR00: c0021cb8 c1051da8 c1048000 00000001 00029000
>>> 00000000 00000001 00000000
>>> [ 1.340534] GPR08: 00000001 00000000 c08b0000 00000040 22000208
>>> 00000000 c00032c4 00000000
>>> [ 1.340534] GPR16: 00000000 00000000 00000000 00000000 00000000
>>> 00000000 00029000 00000001
>>> [ 1.340534] GPR24: 1ffff240 20000000 dffff240 c080a1f4 00000001
>>> c08ae0a8 00000001 dffff240
>>> [ 1.758220] NIP [00000000] 0x0
>>> [ 1.794688] LR [c0021d2c] smp_85xx_kick_cpu+0xe8/0x568
>>> [ 1.856126] Call Trace:
>>> [ 1.885295] [c1051da8] [c0021cb8] smp_85xx_kick_cpu+0x74/0x568 (unreliable)
>>> [ 1.968633] [c1051de8] [c0011460] __cpu_up+0xc0/0x228
>>> [ 2.029038] [c1051e18] [c0031bbc] bringup_cpu+0x30/0x224
>>> [ 2.092572] [c1051e48] [c0031f3c] cpu_up.constprop.0+0x180/0x33c
>>> [ 2.164443] [c1051e88] [c00322e8] bringup_nonboot_cpus+0x88/0xc8
>>> [ 2.236326] [c1051eb8] [c07e67bc] smp_init+0x30/0x78
>>> [ 2.295698] [c1051ed8] [c07d9e28] kernel_init_freeable+0x118/0x2a8
>>> [ 2.369641] [c1051f18] [c00032d8] kernel_init+0x14/0x124
>>> [ 2.433176] [c1051f38] [c0010278] ret_from_kernel_thread+0x14/0x1c
>>> [ 2.507125] Instruction dump:
>>> [ 2.542541] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>>> XXXXXXXX XXXXXXXX
>>> [ 2.635242] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>>> XXXXXXXX XXXXXXXX
>>> [ 2.727952] ---[ end trace 9b796a4bafb6bc14 ]---
>>> [ 2.783149]
>>> [ 3.800879] Kernel panic - not syncing: Fatal exception
>>> [ 3.862353] Rebooting in 1 seconds..
>>> [ 5.905097] System Halted, OK to turn off power
>>>
>>> Without this patch, the kernel no longer panics:
>>>
>>> [ 0.627232] smp: Bringing up secondary CPUs ...
>>> [ 0.681857] smp: Brought up 1 node, 2 CPUs
>>>
>>> Here is the kernel configuration for this built kernel:
>>> https://git.openwrt.org/?p=openwrt/openwrt.git;a=blob_plain;f=target/linux/mpc85xx/config-5.10;hb=HEAD
>>>
>>> In case a force-push is needed for the source repository
>>> (https://github.com/Hurricos/openwrt/commit/ad19bdfc77d60ee1c52b41bb4345fdd02284c4cf),
>>> here is the device tree for this board:
>>> https://paste.c-net.org/TrousersSliced
>>>
>>> Martin
>>> .
>>>
>> When CONFIG_FSL_PMC is set to n, cpu_up_prepare is not assigned to
>> mpc85xx_pm_ops. I suspect that this is the cause of the current null
>> pointer access.
>> I do not have the corresponding board test environment. Can you help me
>> to test whether the following patch solves the problem?
>>
>> diff --git a/arch/powerpc/platforms/85xx/smp.c
>> b/arch/powerpc/platforms/85xx/smp.c
>> index 83f4a6389a28..d7081e9af65c 100644
>> --- a/arch/powerpc/platforms/85xx/smp.c
>> +++ b/arch/powerpc/platforms/85xx/smp.c
>> @@ -220,7 +220,7 @@ static int smp_85xx_start_cpu(int cpu)
>>  	local_irq_save(flags);
>>  	hard_irq_disable();
>>
>> -	if (qoriq_pm_ops)
>> +	if (qoriq_pm_ops && qoriq_pm_ops->cpu_up_prepare)
>>  		qoriq_pm_ops->cpu_up_prepare(cpu);
>>
>>  	/* if cpu is not spinning, reset it */
>> @@ -292,7 +292,7 @@ static int smp_85xx_kick_cpu(int nr)
>>  	booting_thread_hwid = cpu_thread_in_core(nr);
>>  	primary = cpu_first_thread_sibling(nr);
>>
>> -	if (qoriq_pm_ops)
>> +	if (qoriq_pm_ops && qoriq_pm_ops->cpu_up_prepare)
>>  		qoriq_pm_ops->cpu_up_prepare(nr);
>>
>>  	/*
>>
>>
> .
>
* [PATCH] powerpc/85xx: fix oops when CONFIG_FSL_PMC=n
  2021-11-26  1:22 ` Xiaoming Ni
@ 2021-11-26  4:11 ` Xiaoming Ni
  -1 siblings, 0 replies; 11+ messages in thread
From: Xiaoming Ni @ 2021-11-26  4:11 UTC (permalink / raw)
  To: hurricos, Yuantian.Tang, benh, chenhui.zhao, chenjianguo3, gregkh,
	linux-kernel, linuxppc-dev, liuwenliang, mpe, oss, paul.gortmaker,
	paulus, stable, wangle6, chunkeey
  Cc: nixiaoming

When CONFIG_FSL_PMC is set to n, no value is assigned to cpu_up_prepare
in the mpc85xx_pm_ops structure. As a result, oops is triggered in
smp_85xx_start_cpu().

[ 0.627233] smp: Bringing up secondary CPUs ...
[ 0.681659] kernel tried to execute user page (0) - exploit attempt? (uid: 0)
[ 0.766618] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
[ 0.848899] Faulting instruction address: 0x00000000
[ 0.908273] Oops: Kernel access of bad area, sig: 11 [#1]
...
[ 1.758220] NIP [00000000] 0x0
[ 1.794688] LR [c0021d2c] smp_85xx_kick_cpu+0xe8/0x568
[ 1.856126] Call Trace:
[ 1.885295] [c1051da8] [c0021cb8] smp_85xx_kick_cpu+0x74/0x568 (unreliable)
[ 1.968633] [c1051de8] [c0011460] __cpu_up+0xc0/0x228
[ 2.029038] [c1051e18] [c0031bbc] bringup_cpu+0x30/0x224
[ 2.092572] [c1051e48] [c0031f3c] cpu_up.constprop.0+0x180/0x33c
[ 2.164443] [c1051e88] [c00322e8] bringup_nonboot_cpus+0x88/0xc8
[ 2.236326] [c1051eb8] [c07e67bc] smp_init+0x30/0x78
[ 2.295698] [c1051ed8] [c07d9e28] kernel_init_freeable+0x118/0x2a8
[ 2.369641] [c1051f18] [c00032d8] kernel_init+0x14/0x124
[ 2.433176] [c1051f38] [c0010278] ret_from_kernel_thread+0x14/0x1c

Fixes: c45361abb9185b ("powerpc/85xx: fix timebase sync issue when CONFIG_HOTPLUG_CPU=n")
Link: https://lore.kernel.org/lkml/CANA18Uyba4kMJQrbCSZVTFep2Exe5izE45whNJgwwUvNSEcNLg@mail.gmail.com/
Reported-by: Martin Kennedy <hurricos@gmail.com>
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Tested-by: Martin Kennedy <hurricos@gmail.com>
Cc: stable@vger.kernel.org
---
 arch/powerpc/platforms/85xx/smp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/85xx/smp.c b/arch/powerpc/platforms/85xx/smp.c
index 83f4a6389a28..d7081e9af65c 100644
--- a/arch/powerpc/platforms/85xx/smp.c
+++ b/arch/powerpc/platforms/85xx/smp.c
@@ -220,7 +220,7 @@ static int smp_85xx_start_cpu(int cpu)
 	local_irq_save(flags);
 	hard_irq_disable();
 
-	if (qoriq_pm_ops)
+	if (qoriq_pm_ops && qoriq_pm_ops->cpu_up_prepare)
 		qoriq_pm_ops->cpu_up_prepare(cpu);
 
 	/* if cpu is not spinning, reset it */
@@ -292,7 +292,7 @@ static int smp_85xx_kick_cpu(int nr)
 	booting_thread_hwid = cpu_thread_in_core(nr);
 	primary = cpu_first_thread_sibling(nr);
 
-	if (qoriq_pm_ops)
+	if (qoriq_pm_ops && qoriq_pm_ops->cpu_up_prepare)
 		qoriq_pm_ops->cpu_up_prepare(nr);
 
 	/*
-- 
2.27.0

^ permalink raw reply related	[flat|nested] 11+ messages in thread
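For readers outside the kernel tree, the failure mode described in the patch above can be reproduced in miniature. The following is only a user-space sketch, not the kernel code: struct demo_pm_ops, demo_cpu_up_prepare() and demo_kick_cpu() are hypothetical names invented for illustration, and the table is reduced to the single callback discussed in this thread. It assumes, as the commit message states, that the ops table itself exists while one of its callbacks is left unset, so a check on the table pointer alone is not enough.

```c
#include <stddef.h>

/* Hypothetical stand-in for a PM ops table; only one callback is modelled. */
struct demo_pm_ops {
	void (*cpu_up_prepare)(int cpu);
};

/* Records the last CPU the hook was called for, so callers can observe it. */
static int prepared_cpu = -1;

static void demo_cpu_up_prepare(int cpu)
{
	prepared_cpu = cpu;
}

/* Table with the callback filled in. */
static const struct demo_pm_ops ops_with_hook = {
	.cpu_up_prepare = demo_cpu_up_prepare,
};

/* Table that exists but leaves the callback unset, i.e. NULL. */
static const struct demo_pm_ops ops_without_hook = {
	.cpu_up_prepare = NULL,
};

/*
 * Call the optional hook only when both the table and the callback are set.
 * Checking only "ops" would jump to address 0 for ops_without_hook.
 * Returns 1 if the hook ran, 0 if it was skipped.
 */
static int demo_kick_cpu(const struct demo_pm_ops *ops, int cpu)
{
	if (ops && ops->cpu_up_prepare) {
		ops->cpu_up_prepare(cpu);
		return 1;
	}
	return 0;
}
```

Calling demo_kick_cpu(&ops_without_hook, 1) skips the hook and returns 0, whereas a version that tested only the table pointer would make an indirect call through a NULL function pointer, which is consistent with the NIP of 0 in the oops quoted above.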
* Re: [PATCH] powerpc/85xx: fix oops when CONFIG_FSL_PMC=n
  2021-11-26  4:11 ` Xiaoming Ni
  (?)
@ 2021-12-03 11:53 ` Michael Ellerman
  -1 siblings, 0 replies; 11+ messages in thread
From: Michael Ellerman @ 2021-12-03 11:53 UTC (permalink / raw)
  To: benh, chunkeey, gregkh, hurricos, chenhui.zhao, stable, Xiaoming Ni,
	wangle6, linuxppc-dev, mpe, oss, paulus, chenjianguo3, linux-kernel,
	liuwenliang, Yuantian.Tang, paul.gortmaker

On Fri, 26 Nov 2021 12:11:53 +0800, Xiaoming Ni wrote:
> When CONFIG_FSL_PMC is set to n, no value is assigned to cpu_up_prepare
> in the mpc85xx_pm_ops structure. As a result, oops is triggered in
> smp_85xx_start_cpu().
>
> [ 0.627233] smp: Bringing up secondary CPUs ...
> [ 0.681659] kernel tried to execute user page (0) - exploit attempt? (uid: 0)
> [ 0.766618] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
> [ 0.848899] Faulting instruction address: 0x00000000
> [ 0.908273] Oops: Kernel access of bad area, sig: 11 [#1]
> ...
> [ 1.758220] NIP [00000000] 0x0
> [ 1.794688] LR [c0021d2c] smp_85xx_kick_cpu+0xe8/0x568
> [ 1.856126] Call Trace:
> [ 1.885295] [c1051da8] [c0021cb8] smp_85xx_kick_cpu+0x74/0x568 (unreliable)
> [ 1.968633] [c1051de8] [c0011460] __cpu_up+0xc0/0x228
> [ 2.029038] [c1051e18] [c0031bbc] bringup_cpu+0x30/0x224
> [ 2.092572] [c1051e48] [c0031f3c] cpu_up.constprop.0+0x180/0x33c
> [ 2.164443] [c1051e88] [c00322e8] bringup_nonboot_cpus+0x88/0xc8
> [ 2.236326] [c1051eb8] [c07e67bc] smp_init+0x30/0x78
> [ 2.295698] [c1051ed8] [c07d9e28] kernel_init_freeable+0x118/0x2a8
> [ 2.369641] [c1051f18] [c00032d8] kernel_init+0x14/0x124
> [ 2.433176] [c1051f38] [c0010278] ret_from_kernel_thread+0x14/0x1c
>
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/85xx: fix oops when CONFIG_FSL_PMC=n
      https://git.kernel.org/powerpc/c/3dc709e518b47386e6af937eaec37bb36539edfd

cheers

^ permalink raw reply	[flat|nested] 11+ messages in thread
end of thread, other threads:[~2021-12-03 11:53 UTC | newest]

Thread overview: 11+ messages
2021-11-25  4:20 [PATCH v2 2/2] powerpc:85xx: fix timebase sync issue when CONFIG_HOTPLUG_CPU=n Martin Kennedy
2021-11-25  4:20 ` Martin Kennedy
2021-11-25  7:23 ` Xiaoming Ni
2021-11-25  7:23   ` Xiaoming Ni
2021-11-25 14:34   ` Martin Kennedy
2021-11-25 14:34     ` Martin Kennedy
2021-11-26  1:22     ` Xiaoming Ni
2021-11-26  1:22       ` Xiaoming Ni
2021-11-26  4:11       ` [PATCH] powerpc/85xx: fix oops when CONFIG_FSL_PMC=n Xiaoming Ni
2021-11-26  4:11         ` Xiaoming Ni
2021-12-03 11:53         ` Michael Ellerman