Subject: Re: [PATCH v2] PCI: rockchip: Avoid accessing PCIe registers with clocks gated
From: Robin Murphy
To: Bjorn Helgaas
Cc: Javier Martinez Canillas, linux-kernel@vger.kernel.org,
 Peter Robinson, Shawn Lin, Bjorn Helgaas, Heiko Stuebner,
 Lorenzo Pieralisi, Rob Herring, linux-arm-kernel@lists.infradead.org,
 linux-pci@vger.kernel.org, linux-rockchip@lists.infradead.org
Date: Fri, 25 Jun 2021 00:51:16 +0100
Message-ID: <5356a01c-5aab-fbff-b0a9-157b961c66ee@arm.com>
References: <20210624232841.GA3579021@bjorn-Precision-5520>
In-Reply-To: <20210624232841.GA3579021@bjorn-Precision-5520>

On 2021-06-25 00:28, Bjorn Helgaas wrote:
> On Fri, Jun 25, 2021 at 12:18:48AM +0100, Robin Murphy wrote:
>> On 2021-06-24 22:57, Bjorn Helgaas wrote:
>>> On Tue, Jun 08, 2021 at 10:04:09AM +0200, Javier Martinez Canillas wrote:
>>>> IRQ handlers that are registered for shared interrupts can be called at
>>>> any time after have been registered using the request_irq() function.
>>>>
>>>> It's up to drivers to ensure that's always safe for these to be called.
>>>>
>>>> Both the "pcie-sys" and "pcie-client" interrupts are shared, but since
>>>> their handlers are registered very early in the probe function, an error
>>>> later can lead to these handlers being executed before all the required
>>>> resources have been properly setup.
>>>>
>>>> For example, the rockchip_pcie_read() function used by these IRQ handlers
>>>> expects that some PCIe clocks will already be enabled, otherwise trying
>>>> to access the PCIe registers causes the read to hang and never return.
>>>
>>> The read *never* completes?  That might be a bit problematic because
>>> it implies that we may not be able to recover from PCIe errors.  Most
>>> controllers will timeout eventually, log an error, and either
>>> fabricate some data (typically ~0) to complete the CPU's read or cause
>>> some kind of abort or machine check.
>>>
>>> Just asking in case there's some controller configuration that should
>>> be tweaked.
>>
>> If I'm following correctly, that'll be a read transaction to the native side
>> of the controller itself; it can't complete that read, or do anything else
>> either, because it's clock-gated, and thus completely oblivious (it might be
>> that if another CPU was able to enable the clocks then everything would
>> carry on as normal, or it might end up totally deadlocking the SoC
>> interconnect). I think it's safe to assume that in that state nothing of
>> importance would be happening on the PCIe side, and even if it was we'd
>> never get to know about it.
>
> Oh, right, that makes sense.  I was thinking about the PCIe side, but
> if the controller itself isn't working, of course we wouldn't get that
> far.
>
> I would expect that the CPU itself would have some kind of timeout for
> the read, but that's far outside of the PCI world.

Nah, in AMBA I'm not sure if it's even legal to abandon a transaction
without waiting for the handshake to complete. If you're lucky the
interconnect might have a clock/power domain bridge which can reply with
an error when it knows its other side isn't running, otherwise the
initiator will just happily sit there waiting for a response to come
back "in a timely manner" :)

Robin.
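
For anyone skimming the thread, the ordering hazard described in the
commit message can be sketched as a toy userspace model. This is only an
illustration, not the rockchip driver or the patch itself: struct
fake_pcie, fake_read(), fire_shared_irq() and both probe_*() functions
are hypothetical names, and the "hang" is modelled as a return code
because a real gated read would simply never come back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a controller; all names here are illustrative. */
struct fake_pcie {
	bool clocks_enabled;	/* models the PCIe clocks being ungated */
	bool irq_registered;	/* models a successful request_irq() */
	uint32_t status_reg;	/* models one controller register */
};

enum read_result { READ_OK, READ_WOULD_HANG };

/*
 * Models a register read. On real hardware a read with the clocks gated
 * may never complete; the model reports that case instead of hanging.
 */
static enum read_result fake_read(const struct fake_pcie *p, uint32_t *val)
{
	assert(val != NULL);
	if (!p->clocks_enabled)
		return READ_WOULD_HANG;
	*val = p->status_reg;
	return READ_OK;
}

/* Models a shared IRQ handler, which reads a register to check status. */
static enum read_result fake_irq_handler(struct fake_pcie *p)
{
	uint32_t val;

	return fake_read(p, &val);
}

/*
 * Models another device on the shared line raising an interrupt: every
 * registered handler gets called, whether or not its device is ready.
 */
static enum read_result fire_shared_irq(struct fake_pcie *p)
{
	if (!p->irq_registered)
		return READ_OK;	/* our handler is not called yet */
	return fake_irq_handler(p);
}

/* Risky ordering: handler registered before the clocks are enabled. */
static enum read_result probe_irq_first(struct fake_pcie *p)
{
	p->irq_registered = true;
	enum read_result r = fire_shared_irq(p);	/* IRQ lands in the window */
	p->clocks_enabled = true;
	return r;
}

/* Safer ordering: clocks enabled before the handler can ever run. */
static enum read_result probe_clocks_first(struct fake_pcie *p)
{
	p->clocks_enabled = true;
	p->irq_registered = true;
	return fire_shared_irq(p);
}
```

In this model probe_irq_first() reports READ_WOULD_HANG when an
interrupt arrives between registration and clock enable, while
probe_clocks_first() reads the register normally. The same window also
exists on an error path that gates the clocks again while the handler is
still registered.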