Date: Wed, 28 Apr 2021 16:40:41 +0200
From: Lukas Wunner
To: Yicong Yang
Cc: Bjorn Helgaas, Sathyanarayanan Kuppuswamy, Dan Williams,
	Ethan Zhao, Sinan Kaya, Ashok Raj, Keith Busch,
	linux-pci@vger.kernel.org, Russell Currey, Oliver O'Halloran,
	Stuart Hayes, Mika Westerberg, Linuxarm
Subject: Re: [PATCH] PCI: pciehp: Ignore Link Down/Up caused by DPC
Message-ID: <20210428144041.GA27967@wunner.de>
References: <4177f0be-5859-9a71-da06-2e67641568d7@hisilicon.com>
In-Reply-To: <4177f0be-5859-9a71-da06-2e67641568d7@hisilicon.com>
User-Agent: Mutt/1.10.1 (2018-07-13)
X-Mailing-List: linux-pci@vger.kernel.org

On Wed, Apr 28, 2021 at 06:08:02PM +0800, Yicong Yang wrote:
> I've tested the patch on our board, but the hotplug will still be
> triggered sometimes.
> seems the hotplug doesn't find the link down event is caused by dpc.
> Any further test I can do?
>
> mestuary:/$ [12508.408576] pcieport 0000:00:10.0: DPC: containment event, status:0x1f21 source:0x0000
> [12508.423016] pcieport 0000:00:10.0: DPC: unmasked uncorrectable error detected
> [12508.434277] pcieport 0000:00:10.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Completer ID)
> [12508.447651] pcieport 0000:00:10.0: device [19e5:a130] error status/mask=00008000/04400000
> [12508.458279] pcieport 0000:00:10.0: [15] CmpltAbrt (First)
> [12508.467094] pcieport 0000:00:10.0: AER: TLP Header: 00000000 00000000 00000000 00000000
> [12511.152329] pcieport 0000:00:10.0: pciehp: Slot(0): Link Down

Note that about 3 seconds pass between DPC trigger and hotplug
link down (12508 -> 12511).  That's most likely the 3 second timeout
in my patch:

+	/*
+	 * Need a timeout in case DPC never completes due to failure of
+	 * dpc_wait_rp_inactive().
+	 */
+	wait_event_timeout(dpc_completed_waitqueue, dpc_completed(pdev),
+			   msecs_to_jiffies(3000));

If DPC doesn't recover within 3 seconds, pciehp will consider the
error unrecoverable and bring down the slot, no matter what.

I can't tell you why DPC is unable to recover.  Does it help if you
raise the timeout to, say, 5000 msec?

Thanks,

Lukas
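
The timing argument above (DPC trigger at 12508, Link Down at 12511) can
be checked mechanically when comparing runs with different timeouts.
What follows is a minimal userspace C sketch and not part of the patch.
The helper names dmesg_timestamp() and dmesg_elapsed() are hypothetical,
and the sketch assumes the first '[' on each line opens the usual
"[seconds.microseconds]" dmesg timestamp.

```c
#include <stdio.h>
#include <string.h>

/*
 * Extract the "[seconds.microseconds]" timestamp from one dmesg line.
 * Anything before the first '[' (e.g. a shell prompt) is skipped.
 * Returns 0 on success, -1 if no timestamp could be parsed.
 */
static int dmesg_timestamp(const char *line, double *out)
{
	const char *p = strchr(line, '[');

	if (!p || sscanf(p, "[%lf]", out) != 1)
		return -1;
	return 0;
}

/*
 * Compute the seconds elapsed between two dmesg lines.
 * Returns 0 on success, -1 if either line lacks a timestamp.
 */
static int dmesg_elapsed(const char *first, const char *second,
			 double *elapsed)
{
	double t0, t1;

	if (dmesg_timestamp(first, &t0) || dmesg_timestamp(second, &t1))
		return -1;

	*elapsed = t1 - t0;
	return 0;
}
```

Fed the "DPC: containment event" line and the "pciehp: Slot(0): Link Down"
line quoted above, dmesg_elapsed() yields roughly 2.74 seconds, i.e. the
"about 3 seconds" noted earlier.  With the timeout raised to 5000 msec, a
gap that grows accordingly would suggest the timeout is still what ends
the wait.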