Date: Thu, 15 Apr 2021 12:15:09 +0200
From: Sascha Hauer
To: Marc Zyngier
Cc: linux-edac@vger.kernel.org, Borislav Petkov, Mauro Carvalho Chehab,
	Tony Luck, James Morse, Robert Richter, York Sun,
	kernel@pengutronix.de, linux-arm-kernel@lists.infradead.org,
	Rob Herring, Mark Rutland
Subject: Re: [PATCH 1/2] drivers/edac: Add L1 and L2 error detection for A53 and A57
Message-ID: <20210415101509.GD19819@pengutronix.de>
References: <20210401110615.15326-1-s.hauer@pengutronix.de>
	<20210401110615.15326-2-s.hauer@pengutronix.de>
	<87czvdf1a7.wl-maz@kernel.org>
In-Reply-To: <87czvdf1a7.wl-maz@kernel.org>

Hi Marc,

Thanks for the input.

On Fri, Apr 02, 2021 at 11:06:56AM +0100, Marc Zyngier wrote:
> > +config EDAC_CORTEX_ARM64_L1_L2
> > +	tristate "ARM Cortex A57/A53"
> > +	depends on ARM64
> > +	help
> > +	  Support for L1/L2 cache error detection on ARM Cortex A57 and A53.
>
> I went through the TRMs for a few other Cortex-A cores, and this
> feature looks more common than this comment suggests. At least A35 and
> A72 implement something similar (if not strictly identical), probably
> owing to their ancestry.

Ok, I'll add these to the description.

> > +		}
> > +
> > +		snprintf(msg, MESSAGE_SIZE, "%s %s error(s) on CPU %d",
> > +			 str, fatal ? "fatal" : "correctable", cpu);
> > +
> > +		if (fatal)
> > +			edac_device_handle_ue(edac_ctl, cpu, 0, msg);
> > +		else
> > +			edac_device_handle_ce(edac_ctl, cpu, 0, msg);
> > +	}
> > +
> > +	if (l2merr & L2MERRSR_EL1_VALID) {
> > +		bool fatal = l2merr & L2MERRSR_EL1_FATAL;
> > +
> > +		snprintf(msg, MESSAGE_SIZE, "L2 %s error(s) on CPU %d",
> > +			 fatal ? "fatal" : "correctable", cpu);
>
> The shared nature of the L2 makes the CPU it has been detected on
> pretty much irrelevant. What you really want here is the CPUID+Way
> that is in the register data.

You are right. For the next round I added some more code to decode the
CPUID/Way field. What's still missing then is information which L2 cache
has errors in case there is more than one. I wonder if we should add
get_cpu_cacheinfo(cpu)->id to the message or if there's more to it.

>
> > +		if (fatal)
> > +			edac_device_handle_ue(edac_ctl, cpu, 1, msg);
> > +		else
> > +			edac_device_handle_ce(edac_ctl, cpu, 1, msg);
> > +	}
> > +}
> > +
> > +static void read_errors(void *data)
> > +{
> > +	struct merrsr *merrsr = data;
> > +
> > +	merrsr->cpumerr = read_sysreg_s(SYS_CPUMERRSR_EL1);
> > +	write_sysreg_s(0, SYS_CPUMERRSR_EL1);
> > +	merrsr->l2merr = read_sysreg_s(SYS_L2MERRSR_EL1);
> > +	write_sysreg_s(0, SYS_L2MERRSR_EL1);
>
> If an error happens between read and write, you lose it. That's not
> great. You could improve things by only writing 0 if you have found an
> error. You probably also need an isb after the write if you want it to
> take effect in a timely manner.

Ok, will change.

>
> I'm also not sure of how valuable it is to probe for L2 errors on each
> CPU, given that it is shared with up to 3 other cores. You probably
> want to use the cache topology information for this.

I have no idea how l2merr is implemented.
When there is only one register for all CPUs sharing the same L2 cache
then it shouldn't do any harm to read it more than once. The expensive
part is probably to schedule a function on all CPUs, and we have to do
that anyway to read the L1 cache errors.

Sascha

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
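
The two review points discussed in this mail (write 0 only after an
error was actually found, and report the CPUID/Way field instead of the
polling CPU) can be sketched in user space roughly as follows. This is
only an illustration, not the driver code: the register is replaced by
a plain variable, the helper names (mock_l2merrsr,
read_and_clear_if_valid, format_l2_error) are hypothetical, and the bit
positions are assumptions that would need to be checked against the TRM
of the core in question. In the kernel the accessors would be
read_sysreg_s()/write_sysreg_s(), with an isb() after the write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed bit layout, for illustration only; verify against the TRM. */
#define L2MERRSR_FATAL        (UINT64_C(1) << 63)
#define L2MERRSR_VALID        (UINT64_C(1) << 31)
#define L2MERRSR_CPUID_WAY(v) ((unsigned int)(((v) >> 18) & 0xf))

/* Stand-in for the system register so the sketch runs in user space. */
static uint64_t mock_l2merrsr;

static uint64_t mock_read(void)
{
	return mock_l2merrsr;
}

static void mock_write(uint64_t val)
{
	mock_l2merrsr = val;
	/* Kernel code would follow the real register write with isb(). */
}

/*
 * Read the register and clear it only when an error was latched.
 * This narrows the window in which a new error can be overwritten;
 * it does not close it completely.
 */
static uint64_t read_and_clear_if_valid(void)
{
	uint64_t val = mock_read();

	if (val & L2MERRSR_VALID)
		mock_write(0);

	return val;
}

/* Format a report from a snapshot; returns false if nothing was latched. */
static bool format_l2_error(uint64_t val, char *msg, size_t len)
{
	assert(msg != NULL && len > 0);

	if (!(val & L2MERRSR_VALID))
		return false;

	snprintf(msg, len, "L2 %s error(s), CPUID/Way %u",
		 (val & L2MERRSR_FATAL) ? "fatal" : "correctable",
		 L2MERRSR_CPUID_WAY(val));
	return true;
}
```

A caller would take the snapshot on the target CPU with
read_and_clear_if_valid() and pass it to format_l2_error() afterwards,
so that the message formatting stays out of the cross-CPU call.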