From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.1 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 58581C3A5A0 for ; Sat, 18 Apr 2020 14:54:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 318FB21D79 for ; Sat, 18 Apr 2020 14:54:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1587221682; bh=2M5vHab+njp3Ao4dy6+B1X01krlwjCcGCpWvxZxw13Y=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=UEdLgwasfM6OyqYzh+0WqhNyt0pjZS5Fd9sutdGuZBpvYshqWwz/oFbC3WQQXMClC 9wJx7KlNI6x7wqNFje99rrbt/a77z11DTr12qofsrIk7FVGMHLL5Azb9YfyypH1nyK +B1zAInpa8Rav7fgiRdVUH8hjgkK9AlCJvrNGnA0= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728831AbgDROyl (ORCPT ); Sat, 18 Apr 2020 10:54:41 -0400 Received: from mail.kernel.org ([198.145.29.99]:50766 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726296AbgDROle (ORCPT ); Sat, 18 Apr 2020 10:41:34 -0400 Received: from sasha-vm.mshome.net (c-73-47-72-35.hsd1.nh.comcast.net [73.47.72.35]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 4106722245; Sat, 18 Apr 2020 14:41:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1587220894; bh=2M5vHab+njp3Ao4dy6+B1X01krlwjCcGCpWvxZxw13Y=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=CnoEhgzEAnr0SMe7sM82bMauxW6JI0sewCCK4GDng6zNLU9R+7M4BD23iAKgHXNUf zOCaQT3MBX3P053g4Y7pmc8JrIGQ6xrPGjGPk0Xznv/5w8+nI1Do3IonvCA3+o1TKY boDfNLHwxQLZD97S3ESAuqK4s+aIeelVCgr/NOnY= From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Ganesh Goudar , Mahesh Salgaonkar , Nicholas Piggin , Michael Ellerman , Sasha Levin , linuxppc-dev@lists.ozlabs.org Subject: [PATCH AUTOSEL 5.4 36/78] powerpc/pseries: Fix MCE handling on pseries Date: Sat, 18 Apr 2020 10:40:05 -0400 Message-Id: <20200418144047.9013-36-sashal@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200418144047.9013-1-sashal@kernel.org> References: <20200418144047.9013-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Ganesh Goudar [ Upstream commit a95a0a1654f16366360399574e10efd87e867b39 ] MCE handling on pSeries platform fails as recent rework to use common code for pSeries and PowerNV in machine check error handling tries to access per-cpu variables in realmode. The per-cpu variables may be outside the RMO region on pSeries platform and needs translation to be enabled for access. Just moving these per-cpu variable into RMO region did'nt help because we queue some work to workqueues in real mode, which again tries to touch per-cpu variables. Also fwnmi_release_errinfo() cannot be called when translation is not enabled. This patch fixes this by enabling translation in the exception handler when all required real mode handling is done. This change only affects the pSeries platform. Without this fix below kernel crash is seen on injecting SLB multihit: BUG: Unable to handle kernel data access on read at 0xc00000027b205950 Faulting instruction address: 0xc00000000003b7e0 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: mcetest_slb(OE+) af_packet(E) xt_tcpudp(E) ip6t_rpfilter(E) ip6t_REJECT(E) ipt_REJECT(E) xt_conntrack(E) ip_set(E) nfnetlink(E) ebtable_nat(E) ebtable_broute(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ebtable_filter(E) ebtables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) ip_tables(E) x_tables(E) xfs(E) ibmveth(E) vmx_crypto(E) gf128mul(E) uio_pdrv_genirq(E) uio(E) crct10dif_vpmsum(E) rtc_generic(E) btrfs(E) libcrc32c(E) xor(E) zstd_decompress(E) zstd_compress(E) raid6_pq(E) sr_mod(E) sd_mod(E) cdrom(E) ibmvscsi(E) scsi_transport_srp(E) crc32c_vpmsum(E) dm_mod(E) sg(E) scsi_mod(E) CPU: 34 PID: 8154 Comm: insmod Kdump: loaded Tainted: G OE 5.5.0-mahesh #1 NIP: c00000000003b7e0 LR: c0000000000f2218 CTR: 0000000000000000 REGS: c000000007dcb960 TRAP: 0300 Tainted: G OE (5.5.0-mahesh) MSR: 8000000000001003 CR: 28002428 XER: 20040000 CFAR: c0000000000f2214 DAR: c00000027b205950 DSISR: 40000000 IRQMASK: 0 GPR00: c0000000000f2218 c000000007dcbbf0 c000000001544800 c000000007dcbd70 GPR04: 0000000000000001 c000000007dcbc98 c008000000d00258 c0080000011c0000 GPR08: 0000000000000000 0000000300000003 c000000001035950 0000000003000048 GPR12: 000000027a1d0000 c000000007f9c000 0000000000000558 0000000000000000 GPR16: 0000000000000540 c008000001110000 c008000001110540 0000000000000000 GPR20: c00000000022af10 c00000025480fd70 c008000001280000 c00000004bfbb300 GPR24: c000000001442330 c00800000800000d c008000008000000 4009287a77000510 GPR28: 0000000000000000 0000000000000002 c000000001033d30 0000000000000001 NIP [c00000000003b7e0] save_mce_event+0x30/0x240 LR [c0000000000f2218] pseries_machine_check_realmode+0x2c8/0x4f0 Call Trace: Instruction dump: 3c4c0151 38429050 7c0802a6 60000000 fbc1fff0 fbe1fff8 f821ffd1 3d42ffaf 3fc2ffaf e98d0030 394a1150 3bdef530 <7d6a62aa> 1d2b0048 2f8b0063 380b0001 ---[ end trace 46fd63f36bbdd940 ]--- Fixes: 9ca766f9891d ("powerpc/64s/pseries: machine check convert to use common event code") Reviewed-by: Mahesh Salgaonkar Reviewed-by: Nicholas Piggin Signed-off-by: Ganesh Goudar Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20200320110119.10207-1-ganeshgr@linux.ibm.com Signed-off-by: Sasha Levin --- arch/powerpc/platforms/pseries/ras.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c index 3acdcc3bb908c..753adeb624f23 100644 --- a/arch/powerpc/platforms/pseries/ras.c +++ b/arch/powerpc/platforms/pseries/ras.c @@ -683,6 +683,17 @@ static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log *errp) #endif out: + /* + * Enable translation as we will be accessing per-cpu variables + * in save_mce_event() which may fall outside RMO region, also + * leave it enabled because subsequently we will be queuing work + * to workqueues where again per-cpu variables accessed, besides + * fwnmi_release_errinfo() crashes when called in realmode on + * pseries. + * Note: All the realmode handling like flushing SLB entries for + * SLB multihit is done by now. + */ + mtmsr(mfmsr() | MSR_IR | MSR_DR); save_mce_event(regs, disposition == RTAS_DISP_FULLY_RECOVERED, &mce_err, regs->nip, eaddr, paddr); -- 2.20.1