Date: Tue, 13 Jul 2021 16:38:56 +0100
Message-ID: <87o8b66x67.wl-maz@kernel.org>
From: Marc Zyngier
To: Bharat Bhushan
Cc: Mark Rutland,
	"catalin.marinas@arm.com",
	"will@kernel.org",
	"daniel.lezcano@linaro.org",
	"konrad.dybcio@somainline.org",
	"saiprakash.ranjan@codeaurora.org",
	"robh@kernel.org",
	"marcan@marcan.st",
	"suzuki.poulose@arm.com",
	"broonie@kernel.org",
	"linux-arm-kernel@lists.infradead.org",
	"linux-kernel@vger.kernel.org",
	Linu Cherian,
	Sunil Kovvuri Goutham
Subject: Re: [EXT] Re: [PATCH] clocksource: Add Marvell Errata-38627 workaround
References: <20210705060843.3150-1-bbhushan2@marvell.com>
	<20210705090753.GD38629@C02TD0UTHF1T.local>
	<20210708114157.GC24650@C02TD0UTHF1T.local>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Bharat,

On Tue, 13 Jul 2021 03:40:22 +0100,
Bharat Bhushan wrote:
>
> Hi Mark,
>
> -----Original Message-----
> From: Mark Rutland

[...]

> > From your description so far, this doesn't sound like it is
> > specific to the timer interrupt. Is it possible for a different
> > interrupt to trigger this, e.g:
> >
> > * Can the same happen with another PPI, e.g. the PMU interrupt if that
> >   gets de-asserted, or there's a race with DAIF.I?
> >
> > * Can the same happen with an SGI, e.g. if one CPU asserts then
> >   de-asserts an SGI targetting another CPU, or there's a race with
> >   DAIF.I?
> >
> > * Can the same happen with an SPI, e.g. if a device asserts then
> >   de-asserts its IRQ line, or there's a race with DAIF.I?
>
> No issue with edge triggered, but this can happen with any level
> sensitive interrupt.

So let's say CPU0 is targeted by a level-triggered SPI, and right when
the interrupt is reaching the CPU interface, CPU1 disables this
interrupt, which gets recalled, and CPU0 never takes the interrupt.
Bug hits again. Drivers do that.

I actually suspect that an edge-triggered interrupt would result in
the same issue, unless your GIC implementation isn't able to perform a
recall on edge interrupts.

I don't understand why you are only considering the timer here. Any
interrupt can trigger this, and if there is going to be a workaround,
it will need to be robust against all interrupts being retired, no
matter what device triggers it.

And given that the OoO nature of the machine leaks non-architectural
state, potentially belonging to a different security context, this
isn't something that should be taken lightly.

	M.

-- 
Without deviation from the norm, progress is not possible.