From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <mahesh@linux.vnet.ibm.com>
Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com
 [148.163.156.1])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by lists.ozlabs.org (Postfix) with ESMTPS id 3xCjQn439DzDrDD
 for <linuxppc-dev@lists.ozlabs.org>; Thu, 20 Jul 2017 15:40:01 +1000 (AEST)
Received: from pps.filterd (m0098396.ppops.net [127.0.0.1])
 by mx0a-001b2d01.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id
 v6K5dog7091788
 for <linuxppc-dev@lists.ozlabs.org>; Thu, 20 Jul 2017 01:39:59 -0400
Received: from e23smtp04.au.ibm.com (e23smtp04.au.ibm.com [202.81.31.146])
 by mx0a-001b2d01.pphosted.com with ESMTP id 2btp4jrwtd-1
 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT)
 for <linuxppc-dev@lists.ozlabs.org>; Thu, 20 Jul 2017 01:39:59 -0400
Received: from localhost
 by e23smtp04.au.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only!
 Violators will be prosecuted
 for <linuxppc-dev@lists.ozlabs.org> from <mahesh@linux.vnet.ibm.com>;
 Thu, 20 Jul 2017 15:39:56 +1000
Received: from d23av04.au.ibm.com (d23av04.au.ibm.com [9.190.235.139])
 by d23relay10.au.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id
 v6K5dshA27721884
 for <linuxppc-dev@lists.ozlabs.org>; Thu, 20 Jul 2017 15:39:54 +1000
Received: from d23av04.au.ibm.com (localhost [127.0.0.1])
 by d23av04.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id
 v6K5dqBI009594
 for <linuxppc-dev@lists.ozlabs.org>; Thu, 20 Jul 2017 15:39:53 +1000
Subject: Re: [PATCH v2 1/3] powerpc/powernv: handle the platform error reboot
 in ppc_md.restart
To: Nicholas Piggin <npiggin@gmail.com>, linuxppc-dev@lists.ozlabs.org
References: <20170719065912.19183-1-npiggin@gmail.com>
 <20170719065912.19183-2-npiggin@gmail.com>
From: Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com>
Date: Thu, 20 Jul 2017 11:09:52 +0530
MIME-Version: 1.0
In-Reply-To: <20170719065912.19183-2-npiggin@gmail.com>
Content-Type: text/plain; charset=utf-8
Message-Id: <7494dafd-a967-ee17-82df-393fb9095fad@linux.vnet.ibm.com>
List-Id: Linux on PowerPC Developers Mail List <linuxppc-dev.lists.ozlabs.org>
List-Unsubscribe: <https://lists.ozlabs.org/options/linuxppc-dev>,
 <mailto:linuxppc-dev-request@lists.ozlabs.org?subject=unsubscribe>
List-Archive: <http://lists.ozlabs.org/pipermail/linuxppc-dev/>
List-Post: <mailto:linuxppc-dev@lists.ozlabs.org>
List-Help: <mailto:linuxppc-dev-request@lists.ozlabs.org?subject=help>
List-Subscribe: <https://lists.ozlabs.org/listinfo/linuxppc-dev>,
 <mailto:linuxppc-dev-request@lists.ozlabs.org?subject=subscribe>

On 07/19/2017 12:29 PM, Nicholas Piggin wrote:
> Unrecovered MCE and HMI errors are sent through a special restart OPAL
> call to log the platform error. The downside is that they don't go
> through normal Linux crash paths, so they don't give much information
> to the Linux console.
> 
> Change this by providing a special crash function which does some of
> the console flushing from the panic() path before calling firmware to
> reboot.
> 
> The downside of this is a little more code to execute before reaching
> the firmware reboot. However, in practice it's critical to get the
> Linux console messages output in order to debug a problem. So this is
> a desirable tradeoff.
> 
> Note on the implementation: It is difficult to plumb a custom reboot
> handler into the panic path, because panic does a little bit too much
> work. For example, it will try to delay with the timebase, but that
> may be corrupted in some cases resulting in a hang without reaching
> the platform reboot. Another problem is that panic can invoke the
> crash dump code which is not what we want in the case of a hardware
> platform error. Long-term, the best solution will be to rework the
> panic path so it can be suitable for this kind of panic, but for now
> we just duplicate a bit of the code.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
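
For anyone following the ordering described above (get the console
output out first, then ask firmware for the platform error reboot, then
fall back to the normal restart), here is a rough userspace model of
that sequence. It is only a sketch under stated assumptions: every
model_* name and the two result codes are hypothetical stand-ins made up
for illustration, not kernel or OPAL interfaces, and the "firmware took
over" case is modelled as an early return because the real call may
simply not return.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result codes for this model only; not the real OPAL values. */
enum model_rc {
	MODEL_FW_TOOK_OVER = 0,	/* the real firmware call would not return */
	MODEL_UNSUPPORTED  = 1,	/* firmware lacks this reboot type */
};

/* Records the order of steps so the sequence can be inspected. */
static char model_trace[64];

static void model_step(const char *name)
{
	strncat(model_trace, name,
		sizeof(model_trace) - strlen(model_trace) - 1);
}

/* Stand-in for the console_verbose() ... console_flush_on_panic() part. */
static void model_flush_console(const char *msg)
{
	printf("Hardware platform error: %s\n", msg);
	model_step("flush;");
}

/* Stand-in for opal_cec_reboot2(OPAL_REBOOT_PLATFORM_ERROR, msg). */
static enum model_rc model_fw_error_reboot(int fw_supported)
{
	model_step("fw_reboot;");
	return fw_supported ? MODEL_FW_TOOK_OVER : MODEL_UNSUPPORTED;
}

/* Stand-in for ppc_md.restart(NULL). */
static void model_restart(void)
{
	model_step("restart;");
}

/*
 * Returns the recorded trace so callers can check the ordering.
 * The kernel function in the patch never returns.
 */
static const char *model_platform_error_reboot(const char *msg,
						int fw_supported)
{
	assert(msg != NULL);
	model_trace[0] = '\0';

	/* Console messages go out before anything else is attempted. */
	model_flush_console(msg);

	/* Models the firmware call not returning. */
	if (model_fw_error_reboot(fw_supported) == MODEL_FW_TOOK_OVER)
		return model_trace;

	/* Firmware could not do it: report and use the normal restart. */
	printf("Reboot type not supported for %s\n", msg);
	model_restart();
	return model_trace;
}
```

Calling model_platform_error_reboot("Unrecoverable HMI exception", 0)
walks the fallback branch, and passing 1 stops after the firmware step.
The model has only two outcomes, whereas the patch reaches
ppc_md.restart() for any return value from opal_cec_reboot2().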

Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

Thanks,
-Mahesh.

> ---
>  arch/powerpc/include/asm/opal.h           |  2 +-
>  arch/powerpc/platforms/powernv/opal-hmi.c | 22 ++------
>  arch/powerpc/platforms/powernv/opal.c     | 89 ++++++++++++++++++-------------
>  arch/powerpc/platforms/powernv/powernv.h  |  2 +
>  4 files changed, 57 insertions(+), 58 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
> index 588fb1c23af9..182dab435aad 100644
> --- a/arch/powerpc/include/asm/opal.h
> +++ b/arch/powerpc/include/asm/opal.h
> @@ -50,7 +50,7 @@ int64_t opal_tpo_write(uint64_t token, uint32_t year_mon_day,
>  		       uint32_t hour_min);
>  int64_t opal_cec_power_down(uint64_t request);
>  int64_t opal_cec_reboot(void);
> -int64_t opal_cec_reboot2(uint32_t reboot_type, char *diag);
> +int64_t opal_cec_reboot2(uint32_t reboot_type, const char *diag);
>  int64_t opal_read_nvram(uint64_t buffer, uint64_t size, uint64_t offset);
>  int64_t opal_write_nvram(uint64_t buffer, uint64_t size, uint64_t offset);
>  int64_t opal_handle_interrupt(uint64_t isn, __be64 *outstanding_event_mask);
> diff --git a/arch/powerpc/platforms/powernv/opal-hmi.c b/arch/powerpc/platforms/powernv/opal-hmi.c
> index 88f3c61eec95..d78fed728cdf 100644
> --- a/arch/powerpc/platforms/powernv/opal-hmi.c
> +++ b/arch/powerpc/platforms/powernv/opal-hmi.c
> @@ -30,6 +30,8 @@
>  #include <asm/cputable.h>
>  #include <asm/machdep.h>
> 
> +#include "powernv.h"
> +
>  static int opal_hmi_handler_nb_init;
>  struct OpalHmiEvtNode {
>  	struct list_head list;
> @@ -267,8 +269,6 @@ static void hmi_event_handler(struct work_struct *work)
>  	spin_unlock_irqrestore(&opal_hmi_evt_lock, flags);
> 
>  	if (unrecoverable) {
> -		int ret;
> -
>  		/* Pull all HMI events from OPAL before we panic. */
>  		while (opal_get_msg(__pa(&msg), sizeof(msg)) == OPAL_SUCCESS) {
>  			u32 type;
> @@ -284,23 +284,7 @@ static void hmi_event_handler(struct work_struct *work)
>  			print_hmi_event_info(hmi_evt);
>  		}
> 
> -		/*
> -		 * Unrecoverable HMI exception. We need to inform BMC/OCC
> -		 * about this error so that it can collect relevant data
> -		 * for error analysis before rebooting.
> -		 */
> -		ret = opal_cec_reboot2(OPAL_REBOOT_PLATFORM_ERROR,
> -			"Unrecoverable HMI exception");
> -		if (ret == OPAL_UNSUPPORTED) {
> -			pr_emerg("Reboot type %d not supported\n",
> -						OPAL_REBOOT_PLATFORM_ERROR);
> -		}
> -
> -		/*
> -		 * Fall through and panic if opal_cec_reboot2() returns
> -		 * OPAL_UNSUPPORTED.
> -		 */
> -		panic("Unrecoverable HMI exception");
> +		pnv_platform_error_reboot(NULL, "Unrecoverable HMI exception");
>  	}
>  }
> 
> diff --git a/arch/powerpc/platforms/powernv/opal.c b/arch/powerpc/platforms/powernv/opal.c
> index 9b87abb178f0..96436d129684 100644
> --- a/arch/powerpc/platforms/powernv/opal.c
> +++ b/arch/powerpc/platforms/powernv/opal.c
> @@ -25,6 +25,10 @@
>  #include <linux/memblock.h>
>  #include <linux/kthread.h>
>  #include <linux/freezer.h>
> +#include <linux/printk.h>
> +#include <linux/kmsg_dump.h>
> +#include <linux/console.h>
> +#include <linux/sched/debug.h>
> 
>  #include <asm/machdep.h>
>  #include <asm/opal.h>
> @@ -436,10 +440,55 @@ static int opal_recover_mce(struct pt_regs *regs,
>  	return recovered;
>  }
> 
> +void pnv_platform_error_reboot(struct pt_regs *regs, const char *msg)
> +{
> +	/*
> +	 * This is mostly taken from kernel/panic.c, but tries to do
> +	 * relatively minimal work. Don't use delay functions (TB may
> +	 * be broken), don't crash dump (need to set a firmware log),
> +	 * don't run notifiers. We do want to get some information to
> +	 * Linux console.
> +	 */
> +	console_verbose();
> +	bust_spinlocks(1);
> +	pr_emerg("Hardware platform error: %s\n", msg);
> +	if (regs)
> +		show_regs(regs);
> +	smp_send_stop();
> +	printk_safe_flush_on_panic();
> +	kmsg_dump(KMSG_DUMP_PANIC);
> +	bust_spinlocks(0);
> +	debug_locks_off();
> +	console_flush_on_panic();
> +
> +	/*
> +	 * Don't bother to shut things down because this will
> +	 * xstop the system.
> +	 */
> +	if (opal_cec_reboot2(OPAL_REBOOT_PLATFORM_ERROR, msg)
> +						== OPAL_UNSUPPORTED) {
> +		pr_emerg("Reboot type %d not supported for %s\n",
> +				OPAL_REBOOT_PLATFORM_ERROR, msg);
> +	}
> +
> +	/*
> +	 * We reached here. There can be three possibilities:
> +	 * 1. We are running on a firmware level that does not support
> +	 *    opal_cec_reboot2()
> +	 * 2. We are running on a firmware level that does not support
> +	 *    OPAL_REBOOT_PLATFORM_ERROR reboot type.
> +	 * 3. We are running on FSP based system that does not need
> +	 *    opal to trigger checkstop explicitly for error analysis.
> +	 *    The FSP PRD component would have already got notified
> +	 *    about this error through other channels.
> +	 */
> +
> +	ppc_md.restart(NULL);
> +}
> +
>  int opal_machine_check(struct pt_regs *regs)
>  {
>  	struct machine_check_event evt;
> -	int ret;
> 
>  	if (!get_mce_event(&evt, MCE_EVENT_RELEASE))
>  		return 0;
> @@ -455,43 +504,7 @@ int opal_machine_check(struct pt_regs *regs)
>  	if (opal_recover_mce(regs, &evt))
>  		return 1;
> 
> -	/*
> -	 * Unrecovered machine check, we are heading to panic path.
> -	 *
> -	 * We may have hit this MCE in very early stage of kernel
> -	 * initialization even before opal-prd has started running. If
> -	 * this is the case then this MCE error may go un-noticed or
> -	 * un-analyzed if we go down panic path. We need to inform
> -	 * BMC/OCC about this error so that they can collect relevant
> -	 * data for error analysis before rebooting.
> -	 * Use opal_cec_reboot2(OPAL_REBOOT_PLATFORM_ERROR) to do so.
> -	 * This function may not return on BMC based system.
> -	 */
> -	ret = opal_cec_reboot2(OPAL_REBOOT_PLATFORM_ERROR,
> -			"Unrecoverable Machine Check exception");
> -	if (ret == OPAL_UNSUPPORTED) {
> -		pr_emerg("Reboot type %d not supported\n",
> -					OPAL_REBOOT_PLATFORM_ERROR);
> -	}
> -
> -	/*
> -	 * We reached here. There can be three possibilities:
> -	 * 1. We are running on a firmware level that do not support
> -	 *    opal_cec_reboot2()
> -	 * 2. We are running on a firmware level that do not support
> -	 *    OPAL_REBOOT_PLATFORM_ERROR reboot type.
> -	 * 3. We are running on FSP based system that does not need opal
> -	 *    to trigger checkstop explicitly for error analysis. The FSP
> -	 *    PRD component would have already got notified about this
> -	 *    error through other channels.
> -	 *
> -	 * If hardware marked this as an unrecoverable MCE, we are
> -	 * going to panic anyway. Even if it didn't, it's not safe to
> -	 * continue at this point, so we should explicitly panic.
> -	 */
> -
> -	panic("PowerNV Unrecovered Machine Check");
> -	return 0;
> +	pnv_platform_error_reboot(regs, "Unrecoverable Machine Check exception");
>  }
> 
>  /* Early hmi handler called in real mode. */
> diff --git a/arch/powerpc/platforms/powernv/powernv.h b/arch/powerpc/platforms/powernv/powernv.h
> index 6dbc0a1da1f6..a159d48573d7 100644
> --- a/arch/powerpc/platforms/powernv/powernv.h
> +++ b/arch/powerpc/platforms/powernv/powernv.h
> @@ -7,6 +7,8 @@ extern void pnv_smp_init(void);
>  static inline void pnv_smp_init(void) { }
>  #endif
> 
> +extern void pnv_platform_error_reboot(struct pt_regs *regs, const char *msg) __noreturn;
> +
>  struct pci_dev;
> 
>  #ifdef CONFIG_PCI
>