From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F1F87C433EF for ; Fri, 14 Jan 2022 18:38:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242960AbiANSic (ORCPT ); Fri, 14 Jan 2022 13:38:32 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52002 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236181AbiANSic (ORCPT ); Fri, 14 Jan 2022 13:38:32 -0500 Received: from sin.source.kernel.org (sin.source.kernel.org [IPv6:2604:1380:40e1:4800::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5DC2DC061574 for ; Fri, 14 Jan 2022 10:38:31 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 1E37ACE23DB for ; Fri, 14 Jan 2022 18:38:29 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E5530C36AE5; Fri, 14 Jan 2022 18:38:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1642185507; bh=FlVKy93IqZkG1wUcb0eyrljpp2NO5o8WOguTB4nQWlc=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=l5nEXEbLlK/r2Ra9eUZGmu9QB1+gK2pKbgPO4WsW3PBF7LPi1rUatA+EIg0Slfog2 Bl26gNOLH2auqTENtDWo/+EcQbEoODqRTJF+FjgGX0RJlwvNFd5PT7a+kpWqov1xxs bJCcTRLb0TKqEX0SoS3d/rrZ/QzuK+ciLou4Vtx71yxrWQ0koU2OnaXIpSQmGvdN6Z NNdZY1TOV47Ew2wYTtR7iU7ZaNMIRlm8uQ767Nx0Yd7IglZWqH9H0jhWazi5h//7FM GINwow2Xh/YttkzDnitBP4qsJt6VgUJN+A50DF3UaMRFKyA6fKNbzBQoMwZ/VySqLX 7FSwb536YB3Gg== Received: by pali.im (Postfix) id 8897A7D1; Fri, 14 Jan 2022 19:38:24 +0100 (CET) Date: Fri, 14 Jan 2022 19:38:24 +0100 From: Pali =?utf-8?B?Um9ow6Fy?= To: Bjorn Helgaas Cc: Stefan Roese , linux-pci@vger.kernel.org, Bjorn Helgaas , Keith Busch , Bharat Kumar Gogada , "Rafael J. Wysocki" Subject: Re: PCIe AER generates no interrupts on host (ZynqMP) Message-ID: <20220114183824.cw5y5cxt2gkyyxup@pali> References: <20220114183028.GA561657@bhelgaas> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20220114183028.GA561657@bhelgaas> User-Agent: NeoMutt/20180716 Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org On Friday 14 January 2022 12:30:28 Bjorn Helgaas wrote: > On Fri, Jan 14, 2022 at 07:25:29AM +0100, Stefan Roese wrote: > > On 1/13/22 22:32, Bjorn Helgaas wrote: > > > On Thu, Jan 13, 2022 at 08:13:55AM +0100, Stefan Roese wrote: > > > > On 1/12/22 18:49, Bjorn Helgaas wrote: > > > > > > > > Ah. I assume you have: > > > > > > > > > > 00:00.0 Root Port to [bus 01-??] > > > > > 01:00.0 Switch Upstream Port to [bus 02-??] > > > > > 02:0?.0 Switch Downstream Port to [bus 04-??] > > > > > > > > This is correct, yes. > > > > > > > > > pcie_portdrv_probe() claims 00:00.0 and clears CERE NFERE FERE URRE. > > > > > > > > > > aer_probe() claims 00:00.0 and enables CERE NFERE FERE URRE for all > > > > > downstream devices, including 01:00.0. > > > > > > > > > > pcie_portdrv_probe() claims 01:00.0 and clears CERE NFERE FERE URRE > > > > > again. > > > > > > > > > > aer_probe() declines to claim 01:00.0 because it's not a Root Port, so > > > > > CERE NFERE FERE URRE remain cleared. > > > > > > > I'm baffled a bit, that this problem of AER reporting being disabled in > > > > the DevCtl regs of PCIe ports (all non root ports) was not noticed for > > > > this long time. As AER is practically disabled in such setups. > > > > > > The more common configuration is a Root Port leading directly to an > > > Endpoint. In that case, there would be no pcie_portdrv_probe() in the > > > middle to disable reporting after aer_probe() has enabled it. > > > > > > The issue you're seeing happens because of the switch in the middle, > > > which is becoming more common recently with Thunderbolt. > > > > > > I poked around on my laptop (Dell 5520 running v5.4): > > > > > > 00:01.0 Root Port to [bus 01] CorrErr- > > > 01:00.0 NVIDIA GPU CorrErr- > > > > > > 00:1c.0 Root Port to [bus 02] AER CorrErr+ > > > 02:00.0 Intel NIC AER CorrErr- <-- iwlwifi > > > > > > 00:1c.1 Root Port to [bus 03] AER CorrErr+ > > > 03:00.0 Realtek card reader AER CorrErr- <-- rtsx_pci > > > > > > 00:1d.0 Root Port to [bus 04] AER CorrErr+ > > > 04:00.0 NVMe AER CorrErr+ > > > > > > 00:1d.6 Root Port to [bus 06-3e] AER CorrErr+ > > > 06:00.0 Thunderbolt USP to [bus 07-3e] AER CorrErr- > > > 07:00.0 Thunderbolt DSP to [bus 08] AER CorrErr- > > > ... CorrErr- > > > > > > Everything in the Thunderbolt hierarchy has reporting disabled, > > > probably because of the issue you are pointing out. > > > > > > I can't explain the iwlwifi and rtsx_pci cases. Both devices have AER > > > and are directly below a Root Port that also has AER, so I would think > > > reporting should be enabled. > > > > This is because AER is enabled for the complete bus via > > set_downstream_devices_error_reporting(), which calls > > set_device_error_reporting(): > > > > static int set_device_error_reporting(struct pci_dev *dev, void *data) > > ... > > if ((type == PCI_EXP_TYPE_ROOT_PORT) || > > (type == PCI_EXP_TYPE_RC_EC) || > > (type == PCI_EXP_TYPE_UPSTREAM) || > > (type == PCI_EXP_TYPE_DOWNSTREAM)) { > > if (enable) > > pci_enable_pcie_error_reporting(dev); > > else > > pci_disable_pcie_error_reporting(dev); > > } > > > > if (enable) > > pcie_set_ecrc_checking(dev); > > > > Not sure, why error reporting is not enabled for "normal" PCIe devices > > but only pcie_set_ecrc_checking(). This behavior was integrated with the > > intial AER support in commit 6c2b374d7485 ("PCI-Express AER > > implemetation: AER core and aerdriver") in 2006. > > > > This explains, why you have AER disabled in the DevCtl registers of your > > iwlwifi and rtsx_pci devices. > > > > Now you might be asking, why you have AER enabled in the NVMe drive. > > This is because the NVMe driver specifically enables AER: > > > > drivers/nvme/host/pci.c: > > static int nvme_pci_enable(struct nvme_dev *dev) > > { > > ... > > pci_enable_pcie_error_reporting(pdev); > > > > There are other device drivers, especially ethernet, SCSI etc, which > > also specifically call pci_enable_pcie_error_reporting() for their > > PCIe devices. > > Thanks for that investigation! > > I think the PCI core should take care this and drivers should not be > in the business of enabling/disabling error reporting unless they have > defects and don't work according to spec. I agree too! > > So how to continue with this? Are we "brave enough" to enable AER > > for normal PCIe devices as well in set_device_error_reporting()? > > This would be a quite big change, as currently all PCIe devices have > > AER disabled per default. > > Good question. We have "pci=noaer" as a fallback, but it's a poor > experience if users have to discover and use it. I do think that when > CONFIG_PCIEAER=y, we really should enable AER by default on every > device that supports it. What about enable it globally for all devices if CONFIG_PCIEAER=y and pci=noaer is not specified for upcoming -next? It would be enabled in -next for two release cycles which could be enough to test if there is any problematic device. And then decide if it goes to -master or not. pci=noaer is still there... > > And we still have the problem that AER is disabled in all PCIe devices > > except for the Root Port. This can be fixed by removing the AER > > disabling from get_port_device_capability() as done in the patch I've > > sent yesterday to the list. > > > > Another idea that comes to my mind is, to change > > pci_enable_pcie_error_reporting() to walk the PCI bus upstream > > while enabling the device and enable AER in the DevCtl register in all > > upstream PCIe devices (e.g. PCIe switch etc) found on this way up to > > the PCIe Root Port. This way, AER will work on PCIe device, where it > > is specifically enabled in the device driver. But only there - at > > least in cases, where PCIe switches etc are involved. > > When we have a choice, I want to get away from pci_walk_bus() (walking > the downstream devices) and also from walking links upstream. > pci_walk_bus() is ugly and needs more care to make sure that whatever > we're doing also happens for future hot-added devices, and walking > upstream is problematic in terms of locking. > > Ultimately I think error configuration should be done during the > normal enumeration flow, e.g., somewhere like pci_init_capabilities(). > > Bjorn I see it in the same way. Function pci_enable_pcie_error_reporting() probably should not be available for PCIe consumer drivers... At last because it is not driver related, but rather PCI core related, like quirks.