linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Anshuman Khandual <khandual@linux.vnet.ibm.com>
To: Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com>
Cc: linuxppc-dev <linuxppc-dev@ozlabs.org>,
	Paul Mackerras <paulus@samba.org>,
	Jeremy Kerr <jeremy.kerr@au1.ibm.com>,
	Anton Blanchard <anton@samba.org>
Subject: Re: [RFC PATCH 2/9] powerpc: handle machine check in Linux host.
Date: Thu, 08 Aug 2013 10:31:00 +0530	[thread overview]
Message-ID: <5203260C.4000705@linux.vnet.ibm.com> (raw)
In-Reply-To: <20130807093815.5389.7668.stgit@mars.in.ibm.com>

On 08/07/2013 03:08 PM, Mahesh J Salgaonkar wrote:
> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> 
> Move machine check entry point into Linux. So far we were dependent on
> firmware to decode MCE error details and handover the high level info to OS.
> 
> This patch introduces early machine check routine that saves the MCE
> information (srr1, srr0, dar and dsisr) to the emergency stack. We allocate
> stack frame on emergency stack and set the r1 accordingly. This allows us
> to be prepared to take another exception without loosing context. One thing
> to note here that, if we get another machine check while ME bit is off then
> we risk a checkstop. Hence we restrict ourselves to save only MCE information
> and turn the ME bit on.
> 
> This is the code flow:
> 
> 		Machine Check Interrupt
> 			|
> 			V
> 		   0x200 vector				  ME=0, IR=0, DR=0
> 			|
> 			V
> 	+-----------------------------------------------+
> 	|machine_check_pSeries_early:			| ME=0, IR=0, DR=0
> 	|	Alloc frame on emergency stack		|
> 	|	Save srr1, srr0, dar and dsisr on stack |
> 	+-----------------------------------------------+
> 			|
> 		(ME=1, IR=0, DR=0, RFID)
> 			|
> 			V
> 		machine_check_handle_early		  ME=1, IR=0, DR=0
> 			|
> 			V
> 	+-----------------------------------------------+
> 	|	machine_check_early (r3=pt_regs)	| ME=1, IR=0, DR=0
> 	|	Things to do: (in next patches)		|
> 	|		Flush SLB for SLB errors	|
> 	|		Flush TLB for TLB errors	|
> 	|		Decode and save MCE info	|
> 	+-----------------------------------------------+
> 			|
> 	(Fall through existing exception handler routine.)
> 			|
> 			V
> 		machine_check_pSerie			  ME=1, IR=0, DR=0
> 			|
> 		(ME=1, IR=1, DR=1, RFID)
> 			|
> 			V
> 		machine_check_common			  ME=1, IR=1, DR=1
> 			.
> 			.
> 			.
> 
> 
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/exception-64s.h |   43 ++++++++++++++++++++++++++
>  arch/powerpc/kernel/exceptions-64s.S     |   50 +++++++++++++++++++++++++++++-
>  arch/powerpc/kernel/traps.c              |   12 +++++++
>  3 files changed, 104 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
> index 2386d40..c5d2cbc 100644
> --- a/arch/powerpc/include/asm/exception-64s.h
> +++ b/arch/powerpc/include/asm/exception-64s.h
> @@ -174,6 +174,49 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
>  #define EXCEPTION_PROLOG_1(area, extra, vec)				\
>  	__EXCEPTION_PROLOG_1(area, extra, vec)
> 
> +/*
> + * Register contents:
> + * R12		= interrupt vector
> + * R13		= PACA
> + * R9		= CR
> + * R11 & R12 is saved on PACA_EXMC
> + *
> + * Swicth to emergency stack and handle re-entrancy (though we currently
> + * don't test for overflow). Save MCE registers srr1, srr0, dar and
> + * dsisr and then turn the ME bit on.
> + */
> +#define __EARLY_MACHINE_CHECK_HANDLER(area, label)			\
> +	/* Check if we are laready using emergency stack. */		\
> +	ld	r10,PACAEMERGSP(r13);					\
> +	subi	r10,r10,THREAD_SIZE;					\
> +	rldicr	r10,r10,0,(63 - THREAD_SHIFT);				\
> +	rldicr	r11,r1,0,(63 - THREAD_SHIFT);				\
> +	cmpd	r10,r11;	/* Are we using emergency stack? */	\
> +	mr	r11,r1;			/* Save current stack pointer */\
> +	beq	0f;							\
> +	ld	r1,PACAEMERGSP(r13);	/* Use emergency stack */	\
> +0:	subi	r1,r1,INT_FRAME_SIZE;	/* alloc stack frame */		\
> +	std	r11,GPR1(r1);						\
> +	std	r11,0(r1);		/* make stack chain pointer */	\
> +	mfspr	r11,SPRN_SRR0;		/* Save SRR0 */			\
> +	std	r11,_NIP(r1);						\
> +	mfspr	r11,SPRN_SRR1;		/* Save SRR1 */			\
> +	std	r11,_MSR(r1);						\
> +	mfspr	r11,SPRN_DAR;		/* Save DAR */			\
> +	std 	r11,_DAR(r1);						\
> +	mfspr	r11,SPRN_DSISR;		/* Save DSISR */		\
> +	std	r11,_DSISR(r1);						\
> +	mfmsr	r11;			/* get MSR value */		\
> +	ori	r11,r11,MSR_ME;		/* turn on ME bit */		\

You need to mention here the fact that we are vulnerable to a core check
stop possibility if we get another machine check exception till we set
the ME bit ON (from the occurrence of the interrupt).

  parent reply	other threads:[~2013-08-08  5:01 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-07  9:37 [RFC PATCH 0/9] Machine check handling in linux host Mahesh J Salgaonkar
2013-08-07  9:38 ` [RFC PATCH 1/9] powerpc: Split the common exception prolog logic into two section Mahesh J Salgaonkar
2013-08-08  4:10   ` Anshuman Khandual
2013-08-08  4:16     ` Benjamin Herrenschmidt
2013-08-07  9:38 ` [RFC PATCH 2/9] powerpc: handle machine check in Linux host Mahesh J Salgaonkar
2013-08-08  4:51   ` Paul Mackerras
2013-08-08 13:19     ` Mahesh Jagannath Salgaonkar
2013-08-08 13:33       ` Benjamin Herrenschmidt
2013-08-08  5:01   ` Anshuman Khandual [this message]
2013-08-07  9:38 ` [RFC PATCH 3/9] powerpc: Introduce a early machine check hook in cpu_spec Mahesh J Salgaonkar
2013-08-07  9:38 ` [RFC PATCH 4/9] powerpc: Add flush_tlb operation " Mahesh J Salgaonkar
2013-08-07  9:38 ` [RFC PATCH 5/9] powerpc: Flush SLB/TLBs if we get SLB/TLB machine check errors on power7 Mahesh J Salgaonkar
2013-08-08  4:58   ` Paul Mackerras
2013-08-07  9:39 ` [RFC PATCH 6/9] powerpc: Flush SLB/TLBs if we get SLB/TLB machine check errors on power8 Mahesh J Salgaonkar
2013-08-07  9:39 ` [RFC PATCH 7/9] powerpc: Decode and save machine check event Mahesh J Salgaonkar
2013-08-07 18:41   ` Scott Wood
2013-08-08  3:40     ` Mahesh Jagannath Salgaonkar
2013-08-08  5:14   ` Paul Mackerras
2013-08-08 13:19     ` Mahesh Jagannath Salgaonkar
2013-08-08 13:33       ` Benjamin Herrenschmidt
2013-08-07  9:39 ` [RFC PATCH 8/9] powerpc/powernv: Remove machine check handling in OPAL Mahesh J Salgaonkar
2013-08-07  9:39 ` [RFC PATCH 9/9] powerpc/powernv: Machine check exception handling Mahesh J Salgaonkar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5203260C.4000705@linux.vnet.ibm.com \
    --to=khandual@linux.vnet.ibm.com \
    --cc=anton@samba.org \
    --cc=jeremy.kerr@au1.ibm.com \
    --cc=linuxppc-dev@ozlabs.org \
    --cc=mahesh@linux.vnet.ibm.com \
    --cc=paulus@samba.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).