From: Balbir Singh
Date: Thu, 14 Dec 2017 23:16:26 +1100
Subject: Re: [PATCH 2/2] powernv/kdump: Fix cases where the kdump kernel can get HMI's
To: Nicholas Piggin
Cc: "open list:LINUX FOR POWERPC (32-BIT AND 64-BIT)", Michael Ellerman
In-Reply-To: <20171214115137.5603f77d@roar.ozlabs.ibm.com>
List-Id: Linux on PowerPC Developers Mail List

On Thu, Dec 14, 2017 at 12:51 PM, Nicholas Piggin wrote:
> On Thu, 14 Dec 2017 11:12:13 +1100
> Balbir Singh wrote:
>
>> On Wed, 13 Dec 2017 20:51:01 +1000
>> Nicholas Piggin wrote:
>>
>> > This is looking pretty nice now...
>> >
>> > On Wed, 13 Dec 2017 19:08:28 +1100
>> > Balbir Singh wrote:
>> >
>> > > @@ -543,7 +543,25 @@ void smp_send_debugger_break(void)
>> > >  #ifdef CONFIG_KEXEC_CORE
>> > >  void crash_send_ipi(void (*crash_ipi_callback)(struct pt_regs *))
>> > >  {
>> > > +	int cpu;
>> > > +
>> > >  	smp_send_nmi_ipi(NMI_IPI_ALL_OTHERS, crash_ipi_callback, 1000000);
>> > > +	if (kdump_in_progress() && crash_wake_offline) {
>> > > +		for_each_present_cpu(cpu) {
>> > > +			if (cpu_online(cpu))
>> > > +				continue;
>> > > +			/*
>> > > +			 * crash_ipi_callback will wait for
>> > > +			 * all cpus, including offline CPUs.
>> > > +			 * We don't care about nmi_ipi_function.
>> > > +			 * Offline cpus will jump straight into
>> > > +			 * crash_ipi_callback, we can skip the
>> > > +			 * entire NMI dance and waiting for
>> > > +			 * cpus to clear pending mask, etc.
>> > > +			 */
>> > > +			do_smp_send_nmi_ipi(cpu);
>> >
>> > Still a little bit concerned about using NMI IPI for this.
>> >
>>
>> OK -- for offline CPUs you mean?
>
> Yes.
>
>> > If you take an NMI IPI from stop, the idle code should do the
>> > right thing and we would just return the system reset wakeup
>> > reason in SRR1 here (which does not need to be cleared).
>> >
>> > If you take the system reset anywhere else in the loop, it's
>> > going to go out via system_reset_exception. I guess that
>> > would end up doing the right thing, it probably gets to
>> > crash_ipi_callback from crash_kexec_secondary?
>>
>> You mean like if we are online at the time of NMI'ing? If so,
>> the original loop will NMI us back into crash_ipi_callback
>> anyway. We don't expect this to occur for offline CPUs.
>
> No, if the offline CPU is executing any instruction except for
> stop when the crash occurs.
>

OK, yeah.

>> >
>> > It's just going to be a very untested code path :( What we
>> > gain I suppose is better ability to handle a CPU that's locked
>> > up somewhere in the cpu offline path. Assuming the uncommon
>> > case works...
>> >
>> > Actually, if you *always* go via the system reset exception
>> > handler, then the code paths will be shared. That might be the
>> > way to go. So I would check for the system reset wakeup SRR1
>> > reason and call replay_system_reset() for it. What do you think?
>> >
>>
>> We could do that, but that would call pnv_system_reset_exception
>> and try to call the NMI function. Since we've not used that path
>> to initiate the NMI, it would call the stale nmi_ipi_function,
>> which is crash_ipi_callback, and not go via the crash_kexec path.
>
> It shouldn't: if the CPU is not set in the NMI bitmask, I think
> it should fall back out and do the rest of the
> system_reset_exception handler.
>
> Anyway, we have to get this case right, because it can already hit,
> as I said, if the offline CPU takes the NMI when it is not stopped.
> This is why I want to try to use a unified code path.
>

OK.

>> I can't call smp_send_nmi_ipi due to the nmi_ipi_busy_count, and
>> I'm worried about calling a stale nmi_ipi_function via the
>> system_reset_exception path. If we are OK with it, I can revisit
>> the code path.
>
> You shouldn't get a stale one; that would also be a bug -- we
> have to cope with NMIs coming in at any time that are triggered
> externally (not by smp_send_nmi_ipi), so if you see any bugs
> there, those need to be fixed separately.
>

Yes, I think it's a bug: nothing clears nmi_ipi_function (from what
I can see), so when the next NMI comes in and goes into
pnv_system_reset_exception, it will execute the stale handler. I'll
respin things based on the suggestion above and deal with any bugs
as well.

Balbir Singh.