* [PATCH 0/2] Fix hibernate on SMP spin-table systems @ 2016-06-17 9:34 James Morse 2016-06-17 9:34 ` [PATCH 1/2] arm64: smp: Add function to determine if cpus are stuck in the kernel James Morse 2016-06-17 9:34 ` [PATCH 2/2] arm64: hibernate: Don't hibernate on systems with stuck CPUs James Morse 0 siblings, 2 replies; 11+ messages in thread From: James Morse @ 2016-06-17 9:34 UTC (permalink / raw) To: linux-arm-kernel Hi all, These two patches prevent hibernate on systems that use spin-tables and have multiple CPUs. It also wires up 'cpus_stuck_in_kernel' which was added for v4.7. Prior to 44dbcc93ab67 ("arm64: Fix behavior of maxcpus=N"), we would bring all the CPUs we would ever have up during boot. On a system with spin tables and multiple CPUS the core hibernate code would prevent hibernation because it can't disable secondary CPUs. After 44dbcc93ab67, when we boot with 'maxcpus=1', we no longer bring all the CPUs up, but we do move them into the secondary_holding_pen. Resuming from hibernate will overwrite the secondary_holding_pen, potentially releasing the secondary CPUs. If the kernel has been loaded at a different physical address over hibernate and resume the secondary_holding_pen may be at a different location after resume. The core code can't help us with this, because these CPUs don't show up in 'num_online_cpus()' These two patches fix the problem by detecting multiple 'possible cpus' that we have no mechanism to take offline and preventing hibernate [2]. This only happens for spin-table systems with multiple CPUs. Kexec needs the same checks, so the 'or spin-tables' logic[0] got added to the helper function. (This problem was spotted on another thread [1]) [0] http://www.spinics.net/lists/arm-kernel/msg510097.html [1] http://www.spinics.net/lists/arm-kernel/msg511880.html [2] Failing to hibernate an SMP spin-tables system booted with maxcpus=1 ---------------------%<--------------------- root@localhost:~# echo disk > /sys/power/state [12248.197718] PM: Syncing filesystems ... done. [12248.197727] Freezing user space processes ... (elapsed 0.001 seconds) done. [12248.203197] PM: Preallocating image memory... done (allocated 50769 pages) [12261.838699] PM: Allocated 203076 kbytes in 13.63 seconds (14.89 MB/s) [12261.838760] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [12261.840732] Suspending console(s) (use no_console_suspend to debug) [12261.842540] PM: freeze of devices complete after 1.732 msecs [12261.843897] PM: late freeze of devices complete after 1.333 msecs [12261.845191] PM: noirq freeze of devices complete after 1.272 msecs [12261.845201] Disabling non-boot CPUs ... [12261.845206] hibernate: Can't hibernate: no mechanism to offline secondary CPUs. [12261.845206] PM: Error -16 creating hibernation image [12261.846140] PM: noirq recover of devices complete after 0.908 msecs [12261.847160] PM: early recover of devices complete after 0.940 msecs [12262.765886] PM: recover of devices complete after 1.452 msecs [12262.769191] Restarting tasks ... done. -bash: echo: write error: Device or resource busy root at localhost:~# ---------------------%<--------------------- James Morse (2): arm64: smp: Add function to determine if cpus are stuck in the kernel arm64: hibernate: Don't hibernate on systems with stuck CPUs arch/arm64/include/asm/smp.h | 20 ++++++++++++++++++++ arch/arm64/kernel/hibernate.c | 6 ++++++ arch/arm64/kernel/smp.c | 13 +++++++++++++ 3 files changed, 39 insertions(+) -- 2.8.0.rc3 ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/2] arm64: smp: Add function to determine if cpus are stuck in the kernel 2016-06-17 9:34 [PATCH 0/2] Fix hibernate on SMP spin-table systems James Morse @ 2016-06-17 9:34 ` James Morse 2016-06-17 10:27 ` Mark Rutland 2016-06-17 16:37 ` Geoff Levand 2016-06-17 9:34 ` [PATCH 2/2] arm64: hibernate: Don't hibernate on systems with stuck CPUs James Morse 1 sibling, 2 replies; 11+ messages in thread From: James Morse @ 2016-06-17 9:34 UTC (permalink / raw) To: linux-arm-kernel kernel/smp.c has a fancy counter that keeps track of the number of CPUs it marked as not-present and left in cpu_park_loop(). If there are any CPUs spinning in here, features like kexec or hibernate may release them by overwriting this memory. This problem also occurs on machines using spin-tables to release secondary cores. After commit 44dbcc93ab67 ("arm64: Fix behavior of maxcpus=N") we bring all known cpus into the secondary holding pen, but may not bring them up depending on 'maxcpus'. This memory can't be re-used by kexec or hibernate. Add a function cpus_are_stuck_in_kernel() to determine if either of these cases have occurred. Signed-off-by: James Morse <james.morse@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> --- arch/arm64/include/asm/smp.h | 20 ++++++++++++++++++++ arch/arm64/kernel/smp.c | 13 +++++++++++++ 2 files changed, 33 insertions(+) diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h index 433e50405274..4be755bcc07a 100644 --- a/arch/arm64/include/asm/smp.h +++ b/arch/arm64/include/asm/smp.h @@ -124,6 +124,26 @@ static inline void cpu_panic_kernel(void) cpu_park_loop(); } +/* + * Kernel features such as hibernate and kexec depend on cpu hotplug to know + * they can replace any kernel memory they are not using themselves. + * + * There are two corner cases: + * If a secondary CPU fails to come online, (e.g. due to mismatched features), + * it will try to call cpu_die(). If this fails, it increases the counter + * cpus_stuck_in_kernel and sits in cpu_park_loop(). The memory containing + * this function must not be re-used for anything else as the 'stuck' core + * is executing it. + * + * CPUs are also considered stuck in the kernel if we have multiple CPUs + * and no way to offline secondary CPUs. This happens when secondaries + * are released via spin-table, these CPUs are moved into the kernel's + * secondary_holding_pen, which must not be overwritten. + * + * This function is used to inhibit features like kexec and hibernate. + */ +bool cpus_are_stuck_in_kernel(void); + #endif /* ifndef __ASSEMBLY__ */ #endif /* ifndef __ASM_SMP_H */ diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c index 678e0842cb3b..e197502f94fd 100644 --- a/arch/arm64/kernel/smp.c +++ b/arch/arm64/kernel/smp.c @@ -909,3 +909,16 @@ int setup_profiling_timer(unsigned int multiplier) { return -EINVAL; } + +bool cpus_are_stuck_in_kernel(void) +{ + bool ret = !!cpus_stuck_in_kernel; +#ifdef CONFIG_HOTPLUG_CPU + int any_cpu = raw_smp_processor_id(); + + if (num_possible_cpus() > 1 && !cpu_ops[any_cpu]->cpu_die) + ret = true; +#endif + + return ret; +} -- 2.8.0.rc3 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 1/2] arm64: smp: Add function to determine if cpus are stuck in the kernel 2016-06-17 9:34 ` [PATCH 1/2] arm64: smp: Add function to determine if cpus are stuck in the kernel James Morse @ 2016-06-17 10:27 ` Mark Rutland 2016-06-17 11:16 ` Suzuki K Poulose 2016-06-17 16:37 ` Geoff Levand 1 sibling, 1 reply; 11+ messages in thread From: Mark Rutland @ 2016-06-17 10:27 UTC (permalink / raw) To: linux-arm-kernel On Fri, Jun 17, 2016 at 10:34:56AM +0100, James Morse wrote: > kernel/smp.c has a fancy counter that keeps track of the number of CPUs > it marked as not-present and left in cpu_park_loop(). If there are any > CPUs spinning in here, features like kexec or hibernate may release them > by overwriting this memory. > > This problem also occurs on machines using spin-tables to release > secondary cores. > After commit 44dbcc93ab67 ("arm64: Fix behavior of maxcpus=N") > we bring all known cpus into the secondary holding pen, but may not bring > them up depending on 'maxcpus'. This memory can't be re-used by kexec > or hibernate. > > Add a function cpus_are_stuck_in_kernel() to determine if either of these > cases have occurred. > > Signed-off-by: James Morse <james.morse@arm.com> > Cc: Suzuki K Poulose <suzuki.poulose@arm.com> > --- > arch/arm64/include/asm/smp.h | 20 ++++++++++++++++++++ > arch/arm64/kernel/smp.c | 13 +++++++++++++ > 2 files changed, 33 insertions(+) > > diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h > index 433e50405274..4be755bcc07a 100644 > --- a/arch/arm64/include/asm/smp.h > +++ b/arch/arm64/include/asm/smp.h > @@ -124,6 +124,26 @@ static inline void cpu_panic_kernel(void) > cpu_park_loop(); > } > > +/* > + * Kernel features such as hibernate and kexec depend on cpu hotplug to know > + * they can replace any kernel memory they are not using themselves. > + * > + * There are two corner cases: > + * If a secondary CPU fails to come online, (e.g. due to mismatched features), > + * it will try to call cpu_die(). If this fails, it increases the counter > + * cpus_stuck_in_kernel and sits in cpu_park_loop(). The memory containing > + * this function must not be re-used for anything else as the 'stuck' core > + * is executing it. It might also be stuck in __no_granule_support, if it never made it to C code. In that case, the CPU in charge of bringing up that new CPU will increment the counter in __cpu_up. There might be other reasons we do something like that in future, so it might be better to be a little less specific and say something like: If a secondary CPU enters the kernel but fails to come online, (e.g. due to mismatched features), and cannot exit the kernel, we increment cpus_stuck_in_kernel and leave the CPU in a quiesecent loop within the kernel text. The memory containing this loop must not be re-used for anything else as the 'stuck' core is executing it. Otherwise, this looks good. FWIW, either way: Acked-by: Mark Rutland <mark.rutland@arm.com> Thanks, Mark. > + * > + * CPUs are also considered stuck in the kernel if we have multiple CPUs > + * and no way to offline secondary CPUs. This happens when secondaries > + * are released via spin-table, these CPUs are moved into the kernel's > + * secondary_holding_pen, which must not be overwritten. > + * > + * This function is used to inhibit features like kexec and hibernate. > + */ > +bool cpus_are_stuck_in_kernel(void); > + > #endif /* ifndef __ASSEMBLY__ */ > > #endif /* ifndef __ASM_SMP_H */ > diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c > index 678e0842cb3b..e197502f94fd 100644 > --- a/arch/arm64/kernel/smp.c > +++ b/arch/arm64/kernel/smp.c > @@ -909,3 +909,16 @@ int setup_profiling_timer(unsigned int multiplier) > { > return -EINVAL; > } > + > +bool cpus_are_stuck_in_kernel(void) > +{ > + bool ret = !!cpus_stuck_in_kernel; > +#ifdef CONFIG_HOTPLUG_CPU > + int any_cpu = raw_smp_processor_id(); > + > + if (num_possible_cpus() > 1 && !cpu_ops[any_cpu]->cpu_die) > + ret = true; > +#endif > + > + return ret; > +} > -- > 2.8.0.rc3 > ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/2] arm64: smp: Add function to determine if cpus are stuck in the kernel 2016-06-17 10:27 ` Mark Rutland @ 2016-06-17 11:16 ` Suzuki K Poulose 2016-06-17 16:48 ` James Morse 0 siblings, 1 reply; 11+ messages in thread From: Suzuki K Poulose @ 2016-06-17 11:16 UTC (permalink / raw) To: linux-arm-kernel On 17/06/16 11:27, Mark Rutland wrote: > On Fri, Jun 17, 2016 at 10:34:56AM +0100, James Morse wrote: >> kernel/smp.c has a fancy counter that keeps track of the number of CPUs >> it marked as not-present and left in cpu_park_loop(). If there are any >> CPUs spinning in here, features like kexec or hibernate may release them >> by overwriting this memory. >> >> This problem also occurs on machines using spin-tables to release >> secondary cores. >> After commit 44dbcc93ab67 ("arm64: Fix behavior of maxcpus=N") >> we bring all known cpus into the secondary holding pen, but may not bring >> them up depending on 'maxcpus'. This memory can't be re-used by kexec >> or hibernate. >> >> Add a function cpus_are_stuck_in_kernel() to determine if either of these >> cases have occurred. >> >> Signed-off-by: James Morse <james.morse@arm.com> >> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> >> --- >> arch/arm64/include/asm/smp.h | 20 ++++++++++++++++++++ >> arch/arm64/kernel/smp.c | 13 +++++++++++++ >> 2 files changed, 33 insertions(+) >> >> diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h >> index 433e50405274..4be755bcc07a 100644 >> --- a/arch/arm64/include/asm/smp.h >> +++ b/arch/arm64/include/asm/smp.h >> @@ -124,6 +124,26 @@ static inline void cpu_panic_kernel(void) >> cpu_park_loop(); >> } >> >> +/* >> + * Kernel features such as hibernate and kexec depend on cpu hotplug to know >> + * they can replace any kernel memory they are not using themselves. >> + * >> + * There are two corner cases: >> + * If a secondary CPU fails to come online, (e.g. due to mismatched features), >> + * it will try to call cpu_die(). If this fails, it increases the counter >> + * cpus_stuck_in_kernel and sits in cpu_park_loop(). The memory containing >> + * this function must not be re-used for anything else as the 'stuck' core >> + * is executing it. > > It might also be stuck in __no_granule_support, if it never made it to C > code. In that case, the CPU in charge of bringing up that new CPU will > increment the counter in __cpu_up. Just to clarify, *in all the cases*, the CPU in charge of bringing up updates the cpus_stuck_in_kernel. > > There might be other reasons we do something like that in future, so it > might be better to be a little less specific and say something like: > > If a secondary CPU enters the kernel but fails to come online, > (e.g. due to mismatched features), and cannot exit the kernel, > we increment cpus_stuck_in_kernel and leave the CPU in a > quiesecent loop within the kernel text. The memory containing > this loop must not be re-used for anything else as the 'stuck' > core is executing it. Agree. >> +bool cpus_are_stuck_in_kernel(void); >> + >> #endif /* ifndef __ASSEMBLY__ */ >> >> #endif /* ifndef __ASM_SMP_H */ >> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c >> index 678e0842cb3b..e197502f94fd 100644 >> --- a/arch/arm64/kernel/smp.c >> +++ b/arch/arm64/kernel/smp.c >> @@ -909,3 +909,16 @@ int setup_profiling_timer(unsigned int multiplier) >> { >> return -EINVAL; >> } >> + >> +bool cpus_are_stuck_in_kernel(void) >> +{ >> + bool ret = !!cpus_stuck_in_kernel; >> +#ifdef CONFIG_HOTPLUG_CPU >> + int any_cpu = raw_smp_processor_id(); >> + >> + if (num_possible_cpus() > 1 && !cpu_ops[any_cpu]->cpu_die) >> + ret = true; >> +#endif Minor nit: Moving the cpu_die check to a static inline function with an obvious name might make the code look better. return !!cpus_stuck_in_kernel || !have_cpu_die() ? Eitherway, Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Cheers Suzuki >> + >> + return ret; >> +} >> -- >> 2.8.0.rc3 >> > ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/2] arm64: smp: Add function to determine if cpus are stuck in the kernel 2016-06-17 11:16 ` Suzuki K Poulose @ 2016-06-17 16:48 ` James Morse 2016-06-21 18:07 ` Will Deacon 0 siblings, 1 reply; 11+ messages in thread From: James Morse @ 2016-06-17 16:48 UTC (permalink / raw) To: linux-arm-kernel Hi Suzuki, On 17/06/16 12:16, Suzuki K Poulose wrote: > On 17/06/16 11:27, Mark Rutland wrote: >> On Fri, Jun 17, 2016 at 10:34:56AM +0100, James Morse wrote: >>> kernel/smp.c has a fancy counter that keeps track of the number of CPUs >>> it marked as not-present and left in cpu_park_loop(). If there are any >>> CPUs spinning in here, features like kexec or hibernate may release them >>> by overwriting this memory. >>> >>> This problem also occurs on machines using spin-tables to release >>> secondary cores. >>> After commit 44dbcc93ab67 ("arm64: Fix behavior of maxcpus=N") >>> we bring all known cpus into the secondary holding pen, but may not bring >>> them up depending on 'maxcpus'. This memory can't be re-used by kexec >>> or hibernate. >>> >>> Add a function cpus_are_stuck_in_kernel() to determine if either of these >>> cases have occurred. >> It might also be stuck in __no_granule_support, if it never made it to C >> code. In that case, the CPU in charge of bringing up that new CPU will >> increment the counter in __cpu_up. > > Just to clarify, *in all the cases*, the CPU in charge of bringing up updates > the cpus_stuck_in_kernel. Ah, my mistake. I will switch it for Mark's suggestion. >>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c >>> index 678e0842cb3b..e197502f94fd 100644 >>> --- a/arch/arm64/kernel/smp.c >>> +++ b/arch/arm64/kernel/smp.c >>> @@ -909,3 +909,16 @@ int setup_profiling_timer(unsigned int multiplier) >>> { >>> return -EINVAL; >>> } >>> + >>> +bool cpus_are_stuck_in_kernel(void) >>> +{ >>> + bool ret = !!cpus_stuck_in_kernel; >>> +#ifdef CONFIG_HOTPLUG_CPU >>> + int any_cpu = raw_smp_processor_id(); >>> + >>> + if (num_possible_cpus() > 1 && !cpu_ops[any_cpu]->cpu_die) >>> + ret = true; >>> +#endif > > Minor nit: Moving the cpu_die check to a static inline function with > an obvious name might make the code look better. > > return !!cpus_stuck_in_kernel || !have_cpu_die() ? > That would be better! > Eitherway, > > Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Thanks, James ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/2] arm64: smp: Add function to determine if cpus are stuck in the kernel 2016-06-17 16:48 ` James Morse @ 2016-06-21 18:07 ` Will Deacon 0 siblings, 0 replies; 11+ messages in thread From: Will Deacon @ 2016-06-21 18:07 UTC (permalink / raw) To: linux-arm-kernel HI James, On Fri, Jun 17, 2016 at 05:48:35PM +0100, James Morse wrote: > On 17/06/16 12:16, Suzuki K Poulose wrote: > > On 17/06/16 11:27, Mark Rutland wrote: > >> On Fri, Jun 17, 2016 at 10:34:56AM +0100, James Morse wrote: > >>> kernel/smp.c has a fancy counter that keeps track of the number of CPUs > >>> it marked as not-present and left in cpu_park_loop(). If there are any > >>> CPUs spinning in here, features like kexec or hibernate may release them > >>> by overwriting this memory. > >>> > >>> This problem also occurs on machines using spin-tables to release > >>> secondary cores. > >>> After commit 44dbcc93ab67 ("arm64: Fix behavior of maxcpus=N") > >>> we bring all known cpus into the secondary holding pen, but may not bring > >>> them up depending on 'maxcpus'. This memory can't be re-used by kexec > >>> or hibernate. > >>> > >>> Add a function cpus_are_stuck_in_kernel() to determine if either of these > >>> cases have occurred. > > >> It might also be stuck in __no_granule_support, if it never made it to C > >> code. In that case, the CPU in charge of bringing up that new CPU will > >> increment the counter in __cpu_up. > > > > Just to clarify, *in all the cases*, the CPU in charge of bringing up updates > > the cpus_stuck_in_kernel. > > Ah, my mistake. I will switch it for Mark's suggestion. > > >>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c > >>> index 678e0842cb3b..e197502f94fd 100644 > >>> --- a/arch/arm64/kernel/smp.c > >>> +++ b/arch/arm64/kernel/smp.c > >>> @@ -909,3 +909,16 @@ int setup_profiling_timer(unsigned int multiplier) > >>> { > >>> return -EINVAL; > >>> } > >>> + > >>> +bool cpus_are_stuck_in_kernel(void) > >>> +{ > >>> + bool ret = !!cpus_stuck_in_kernel; > >>> +#ifdef CONFIG_HOTPLUG_CPU > >>> + int any_cpu = raw_smp_processor_id(); > >>> + > >>> + if (num_possible_cpus() > 1 && !cpu_ops[any_cpu]->cpu_die) > >>> + ret = true; > >>> +#endif > > > > Minor nit: Moving the cpu_die check to a static inline function with > > an obvious name might make the code look better. > > > > return !!cpus_stuck_in_kernel || !have_cpu_die() ? > > > > That would be better! Can you post a new version of this, please? Will ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/2] arm64: smp: Add function to determine if cpus are stuck in the kernel 2016-06-17 9:34 ` [PATCH 1/2] arm64: smp: Add function to determine if cpus are stuck in the kernel James Morse 2016-06-17 10:27 ` Mark Rutland @ 2016-06-17 16:37 ` Geoff Levand 2016-06-17 16:39 ` Suzuki K Poulose 2016-06-17 16:42 ` Mark Rutland 1 sibling, 2 replies; 11+ messages in thread From: Geoff Levand @ 2016-06-17 16:37 UTC (permalink / raw) To: linux-arm-kernel On Fri, 2016-06-17 at 10:34 +0100, James Morse wrote: > +bool cpus_are_stuck_in_kernel(void) > +{ > +> > bool ret = !!cpus_stuck_in_kernel; > +#ifdef CONFIG_HOTPLUG_CPU How about using 'if (IS_ENABLED(CONFIG_HOTPLUG_CPU))' here? > + int any_cpu = raw_smp_processor_id(); > + > +> > if (num_possible_cpus() > 1 && !cpu_ops[any_cpu]->cpu_die) > +> > > ret = true; > +#endif -Geoff ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/2] arm64: smp: Add function to determine if cpus are stuck in the kernel 2016-06-17 16:37 ` Geoff Levand @ 2016-06-17 16:39 ` Suzuki K Poulose 2016-06-17 16:42 ` Mark Rutland 1 sibling, 0 replies; 11+ messages in thread From: Suzuki K Poulose @ 2016-06-17 16:39 UTC (permalink / raw) To: linux-arm-kernel On 17/06/16 17:37, Geoff Levand wrote: > On Fri, 2016-06-17 at 10:34 +0100, James Morse wrote: >> +bool cpus_are_stuck_in_kernel(void) >> +{ >> +> > bool ret = !!cpus_stuck_in_kernel; >> +#ifdef CONFIG_HOTPLUG_CPU > > How about using 'if (IS_ENABLED(CONFIG_HOTPLUG_CPU))' here? > >> + int any_cpu = raw_smp_processor_id(); >> + >> +> > if (num_possible_cpus() > 1 && !cpu_ops[any_cpu]->cpu_die) >> +> > > ret = true; >> +#endif > That won't work, as the cpu_ops->cpu_die is defined only if CONFIG_HOTPLUG_CPU. May be we should fix that. Suzuki ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/2] arm64: smp: Add function to determine if cpus are stuck in the kernel 2016-06-17 16:37 ` Geoff Levand 2016-06-17 16:39 ` Suzuki K Poulose @ 2016-06-17 16:42 ` Mark Rutland 1 sibling, 0 replies; 11+ messages in thread From: Mark Rutland @ 2016-06-17 16:42 UTC (permalink / raw) To: linux-arm-kernel On Fri, Jun 17, 2016 at 09:37:12AM -0700, Geoff Levand wrote: > On Fri, 2016-06-17 at 10:34 +0100, James Morse wrote: > > +bool cpus_are_stuck_in_kernel(void) > > +{ > > +> > bool ret = !!cpus_stuck_in_kernel; > > +#ifdef CONFIG_HOTPLUG_CPU > > How about using 'if (IS_ENABLED(CONFIG_HOTPLUG_CPU))' here? I wanted to suggest that too, but ... > > + int any_cpu = raw_smp_processor_id(); > > + > > +> > if (num_possible_cpus() > 1 && !cpu_ops[any_cpu]->cpu_die) > > +> > > ret = true; ... here we'd blow up as cpu_ops::cpu_die is only defined when CONFIG_HOTPLUG_CPU is selected. So we'll have to use an ifdef, but we could instead ifdef a whole stub function (e.g. have_cpu_die()), as Suzuki suggested. Thanks, Mark. ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/2] arm64: hibernate: Don't hibernate on systems with stuck CPUs 2016-06-17 9:34 [PATCH 0/2] Fix hibernate on SMP spin-table systems James Morse 2016-06-17 9:34 ` [PATCH 1/2] arm64: smp: Add function to determine if cpus are stuck in the kernel James Morse @ 2016-06-17 9:34 ` James Morse 2016-06-17 10:28 ` Mark Rutland 1 sibling, 1 reply; 11+ messages in thread From: James Morse @ 2016-06-17 9:34 UTC (permalink / raw) To: linux-arm-kernel Hibernate relies on cpu hotplug to prevent secondary cores executing the kernel text while it is being restored. Add a call to cpus_are_stuck_in_kernel() to determine if there are CPUs not counted by 'num_online_cpus()', and prevent hibernate in this case. Fixes: 82869ac57b5 ("arm64: kernel: Add support for hibernate/suspend-to-disk") Signed-off-by: James Morse <james.morse@arm.com> --- arch/arm64/kernel/hibernate.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c index f8df75d740f4..21ab5df9fa76 100644 --- a/arch/arm64/kernel/hibernate.c +++ b/arch/arm64/kernel/hibernate.c @@ -33,6 +33,7 @@ #include <asm/pgtable.h> #include <asm/pgtable-hwdef.h> #include <asm/sections.h> +#include <asm/smp.h> #include <asm/suspend.h> #include <asm/virt.h> @@ -236,6 +237,11 @@ int swsusp_arch_suspend(void) unsigned long flags; struct sleep_stack_data state; + if (cpus_are_stuck_in_kernel()) { + pr_err("Can't hibernate: no mechanism to offline secondary CPUs.\n"); + return -EBUSY; + } + local_dbg_save(flags); if (__cpu_suspend_enter(&state)) { -- 2.8.0.rc3 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 2/2] arm64: hibernate: Don't hibernate on systems with stuck CPUs 2016-06-17 9:34 ` [PATCH 2/2] arm64: hibernate: Don't hibernate on systems with stuck CPUs James Morse @ 2016-06-17 10:28 ` Mark Rutland 0 siblings, 0 replies; 11+ messages in thread From: Mark Rutland @ 2016-06-17 10:28 UTC (permalink / raw) To: linux-arm-kernel On Fri, Jun 17, 2016 at 10:34:57AM +0100, James Morse wrote: > Hibernate relies on cpu hotplug to prevent secondary cores executing > the kernel text while it is being restored. > > Add a call to cpus_are_stuck_in_kernel() to determine if there are > CPUs not counted by 'num_online_cpus()', and prevent hibernate in this > case. > > Fixes: 82869ac57b5 ("arm64: kernel: Add support for hibernate/suspend-to-disk") > Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Mark. > --- > arch/arm64/kernel/hibernate.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c > index f8df75d740f4..21ab5df9fa76 100644 > --- a/arch/arm64/kernel/hibernate.c > +++ b/arch/arm64/kernel/hibernate.c > @@ -33,6 +33,7 @@ > #include <asm/pgtable.h> > #include <asm/pgtable-hwdef.h> > #include <asm/sections.h> > +#include <asm/smp.h> > #include <asm/suspend.h> > #include <asm/virt.h> > > @@ -236,6 +237,11 @@ int swsusp_arch_suspend(void) > unsigned long flags; > struct sleep_stack_data state; > > + if (cpus_are_stuck_in_kernel()) { > + pr_err("Can't hibernate: no mechanism to offline secondary CPUs.\n"); > + return -EBUSY; > + } > + > local_dbg_save(flags); > > if (__cpu_suspend_enter(&state)) { > -- > 2.8.0.rc3 > ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2016-06-21 18:07 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-06-17 9:34 [PATCH 0/2] Fix hibernate on SMP spin-table systems James Morse 2016-06-17 9:34 ` [PATCH 1/2] arm64: smp: Add function to determine if cpus are stuck in the kernel James Morse 2016-06-17 10:27 ` Mark Rutland 2016-06-17 11:16 ` Suzuki K Poulose 2016-06-17 16:48 ` James Morse 2016-06-21 18:07 ` Will Deacon 2016-06-17 16:37 ` Geoff Levand 2016-06-17 16:39 ` Suzuki K Poulose 2016-06-17 16:42 ` Mark Rutland 2016-06-17 9:34 ` [PATCH 2/2] arm64: hibernate: Don't hibernate on systems with stuck CPUs James Morse 2016-06-17 10:28 ` Mark Rutland
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.