linux-kernel.vger.kernel.org archive mirror
* [RESEND PATCH] ARM: kexec: Fix validating CPU hotplug support
@ 2014-11-04  9:40 HuKeping
  2014-11-04 10:55 ` Russell King - ARM Linux
  0 siblings, 1 reply; 4+ messages in thread
From: HuKeping @ 2014-11-04  9:40 UTC (permalink / raw)
  To: swarren, ebiederm, linux, rmk+kernel
  Cc: linux-arm-kernel, linux-kernel, sdu.liu, wangnan0, peifeiyue

Commit 2103f6cba61a8b8bea3fc1b63661d830a2125e76 added a CPU hotplug check to
machine_kexec_prepare(), but in some cases the check causes loading of the
crash kernel to fail.

The kexec utility can load a kernel in two ways:
1. kexec -l kernel-image
2. kexec -p kernel-image

In case #1 (rapid reboot), the hotplug check is correct: when "kexec -e" is
later issued, the other CPUs are shut down via _cpu_down(), and that path
requires CPU hotplug support.

In case #2 (use on panic), the same check is unnecessary: the panic path never
shuts down CPUs, so checking for CPU hotplug support only causes the kernel
load to fail.

Prior to this patch, if the first kernel does not support CPU hotplug, the
crash kernel cannot be loaded, so kexec will not work when a crash occurs.

Signed-off-by: Hu Keping <hukeping@huawei.com>
---
 arch/arm/kernel/machine_kexec.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
index 8cf0996..7706c67 100644
--- a/arch/arm/kernel/machine_kexec.c
+++ b/arch/arm/kernel/machine_kexec.c
@@ -41,11 +41,16 @@ int machine_kexec_prepare(struct kimage *image)
 	int i, err;
 
 	/*
+	 * For rapid reboot:
 	 * Validate that if the current HW supports SMP, then the SW supports
 	 * and implements CPU hotplug for the current HW. If not, we won't be
 	 * able to kexec reliably, so fail the prepare operation.
+	 *
+	 * For use on panic:
+	 * It is unnecessary to check for CPU hotplug support.
 	 */
-	if (num_possible_cpus() > 1 && !platform_can_cpu_hotplug())
+	if (image->type != KEXEC_TYPE_CRASH &&
+		(num_possible_cpus() > 1 && !platform_can_cpu_hotplug()))
 		return -EINVAL;
 
 	/*
-- 
1.8.5.5
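
The condition changed by the patch above can be modeled in user space. This is only a sketch for reasoning about the truth table, not kernel code: struct platform_model and kexec_prepare_check() are hypothetical names, the two fields stand in for num_possible_cpus() and platform_can_cpu_hotplug(), and the numeric values of the KEXEC_TYPE_* constants are assumed to match include/linux/kexec.h.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumed to mirror KEXEC_TYPE_DEFAULT / KEXEC_TYPE_CRASH in the kernel. */
enum kexec_type { KEXEC_TYPE_DEFAULT = 0, KEXEC_TYPE_CRASH = 1 };

/* Hypothetical stand-in for the platform state the real check queries. */
struct platform_model {
	unsigned int possible_cpus;	/* models num_possible_cpus() */
	bool can_cpu_hotplug;		/* models platform_can_cpu_hotplug() */
};

/*
 * Models the patched condition: only a non-crash ("kexec -l") image is
 * rejected on an SMP platform that lacks CPU hotplug support.
 */
static int kexec_prepare_check(enum kexec_type type,
			       const struct platform_model *p)
{
	if (type != KEXEC_TYPE_CRASH &&
	    p->possible_cpus > 1 && !p->can_cpu_hotplug)
		return -EINVAL;

	return 0;
}
```

With this model, an SMP platform without hotplug still rejects a "kexec -l" image but accepts a "kexec -p" image, which is exactly the behaviour change being discussed in the replies.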



* Re: [RESEND PATCH] ARM: kexec: Fix validating CPU hotplug support
  2014-11-04  9:40 [RESEND PATCH] ARM: kexec: Fix validating CPU hotplug support HuKeping
@ 2014-11-04 10:55 ` Russell King - ARM Linux
  2014-11-05 10:57   ` Hu Keping
  0 siblings, 1 reply; 4+ messages in thread
From: Russell King - ARM Linux @ 2014-11-04 10:55 UTC (permalink / raw)
  To: HuKeping
  Cc: swarren, ebiederm, linux-arm-kernel, linux-kernel, sdu.liu,
	wangnan0, peifeiyue

On Tue, Nov 04, 2014 at 05:40:25PM +0800, HuKeping wrote:
> Commit 2103f6cba61a8b8bea3fc1b63661d830a2125e76 added a CPU hotplug check to
> machine_kexec_prepare(), but in some cases the check causes loading of the
> crash kernel to fail.
> 
> The kexec utility can load a kernel in two ways:
> 1. kexec -l kernel-image
> 2. kexec -p kernel-image
> 
> In case #1 (rapid reboot), the hotplug check is correct: when "kexec -e" is
> later issued, the other CPUs are shut down via _cpu_down(), and that path
> requires CPU hotplug support.
> 
> In case #2 (use on panic), the same check is unnecessary: the panic path never
> shuts down CPUs, so checking for CPU hotplug support only causes the kernel
> load to fail.

So what happens to the other CPUs when you kexec into the new kernel,
possibly overwriting the instructions which those CPUs are executing?

-- 
FTTC broadband for 0.8mile line: currently at 9.5Mbps down 400kbps up
according to speedtest.net.


* Re: [RESEND PATCH] ARM: kexec: Fix validating CPU hotplug support
  2014-11-04 10:55 ` Russell King - ARM Linux
@ 2014-11-05 10:57   ` Hu Keping
  2014-11-05 11:08     ` Russell King - ARM Linux
  0 siblings, 1 reply; 4+ messages in thread
From: Hu Keping @ 2014-11-05 10:57 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: swarren, ebiederm, linux-arm-kernel, linux-kernel, sdu.liu,
	wangnan0, peifeiyue



On 2014/11/4 18:55, Russell King - ARM Linux wrote:
> On Tue, Nov 04, 2014 at 05:40:25PM +0800, HuKeping wrote:
>> Commit 2103f6cba61a8b8bea3fc1b63661d830a2125e76 added a CPU hotplug check to
>> machine_kexec_prepare(), but in some cases the check causes loading of the
>> crash kernel to fail.
>>
>> The kexec utility can load a kernel in two ways:
>> 1. kexec -l kernel-image
>> 2. kexec -p kernel-image
>>
>> In case #1 (rapid reboot), the hotplug check is correct: when "kexec -e" is
>> later issued, the other CPUs are shut down via _cpu_down(), and that path
>> requires CPU hotplug support.
>>
>> In case #2 (use on panic), the same check is unnecessary: the panic path never
>> shuts down CPUs, so checking for CPU hotplug support only causes the kernel
>> load to fail.
>
> So what happens to the other CPUs when you kexec into the new kernel,
> possibly overwriting the instructions which those CPUs are executing?
>

Actually, I do think there is something wrong in the panic routine:
when a panic occurs, we clear the cpu_online_bits of the other CPUs and
leave them spinning in cpu_relax(). That is why I posted this patch: we
do not really shut down the CPUs.

But as you mentioned, there is another problem:
what is in the PC register of each CPU is unknown after the MMU has been
shut down.

On x86, there is a halt() before the cpu_relax(), so do you think we
need to call wfi() before cpu_relax() to keep the other CPUs in the
WFI state on ARM?



* Re: [RESEND PATCH] ARM: kexec: Fix validating CPU hotplug support
  2014-11-05 10:57   ` Hu Keping
@ 2014-11-05 11:08     ` Russell King - ARM Linux
  0 siblings, 0 replies; 4+ messages in thread
From: Russell King - ARM Linux @ 2014-11-05 11:08 UTC (permalink / raw)
  To: Hu Keping
  Cc: swarren, ebiederm, linux-arm-kernel, linux-kernel, sdu.liu,
	wangnan0, peifeiyue

On Wed, Nov 05, 2014 at 06:57:06PM +0800, Hu Keping wrote:
> Actually, I do think there is something wrong in the panic routine:
> when a panic occurs, we clear the cpu_online_bits of the other CPUs and
> leave them spinning in cpu_relax(). That is why I posted this patch: we
> do not really shut down the CPUs.
> 
> But as you mentioned, there is another problem:
> what is in the PC register of each CPU is unknown after the MMU has been
> shut down.

Correct.

> On x86, there is a halt() before the cpu_relax(), so do you think we
> need to call wfi() before cpu_relax() to keep the other CPUs in the
> WFI state on ARM?

X86 benefits from the fact that it is a known architecture, and there are
ways to ensure that the other CPUs are held in reset or whatever, so the
system is recoverable from such a situation.

That is far from true on ARM: on ARM, everyone does their own thing, which
leads to situations where we can't reset other CPUs (eg, because the
hardware isn't implemented, or the secure firmware doesn't support being
called by non-boot CPUs, etc.)

So, while adding a wfi() call in machine_crash_nonpanic_core() will stop
the CPU executing instructions, the kernel being kexec'd will not see
the CPUs it expects.  Also, I worry whether a wfi() is sufficient - what
if an interrupt does get delivered to that CPU (eg, as part of the kexec'd
kernel trying to bring the CPU online) or a device raises its interrupt
and the interrupt has been routed to that CPU?
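
The worry about a single wfi() can be illustrated with a small user-space model. It is only a sketch under stated assumptions: model_wfi(), park_once_escapes() and park_in_loop() are hypothetical names, and the event array stands in for interrupts that become pending on a parked CPU. It relies on one architectural fact, that WFI resumes execution once an interrupt is pending, so a lone wfi() does not hold a CPU indefinitely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event kinds that can end a wait-for-interrupt state. */
enum wake_event { WAKE_DEVICE_IRQ, WAKE_IPI_FROM_NEW_KERNEL };

/*
 * Models one WFI: consumes the next pending event. Returns false when no
 * event remains, meaning the modeled CPU would simply stay asleep.
 */
static bool model_wfi(const enum wake_event *events, size_t n, size_t *pos)
{
	(void)events;		/* the kind of event does not matter here */
	if (*pos >= n)
		return false;
	(*pos)++;
	return true;		/* execution resumes after the WFI */
}

/* A single wait: the CPU leaves the parked state on the first event. */
static bool park_once_escapes(const enum wake_event *events, size_t n)
{
	size_t pos = 0;

	return model_wfi(events, n, &pos);
}

/* Waiting in a loop: every wake-up is absorbed and the CPU parks again. */
static size_t park_in_loop(const enum wake_event *events, size_t n)
{
	size_t pos = 0, absorbed = 0;

	while (model_wfi(events, n, &pos))
		absorbed++;

	return absorbed;
}
```

Even the looping variant only keeps the CPU parked; it does not address the other point raised here, that the kexec'd kernel will not see the CPUs it expects.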

I think this is the reason why we went for the simple option here: we
know that all the conditions are not correct for being able to safely
kexec() in SMP mode, especially in a panic scenario.

-- 
FTTC broadband for 0.8mile line: currently at 9.5Mbps down 400kbps up
according to speedtest.net.

